Forward metrics to OpenShift

OpenShift comes with built-in Prometheus instances that you can use to monitor metrics for your workloads. Instead of using the built-in Prometheus instance that Gloo Mesh Gateway provides, you might want to forward the metrics from the Gloo telemetry gateway and collector agents to the OpenShift Prometheus so that you have a single observability layer for all of the workloads in your cluster.

  1. Get the current values of the Helm release for your Gloo Mesh Gateway installation. Note that your Helm release might have a different name.

    helm get values gloo-platform -n gloo-mesh -o yaml > gloo-gateway-single.yaml
    open gloo-gateway-single.yaml  
    
  2. In your Helm values file, expose the otlp-metrics and metrics ports on the Gloo collector agent. The otlp-metrics port is used to expose the metrics that were collected by the telemetry collector agent from other workloads in the cluster. The metrics port exposes metrics for the Gloo telemetry collector agents themselves.

    telemetryCollector:
      enabled: true
      ports:
        otlp-metrics:
          enabled: true
          containerPort: 9091
          servicePort: 9091
          protocol: TCP
        metrics:
          enabled: true
          containerPort: 8888
          servicePort: 8888
          protocol: TCP
    
  3. Upgrade your Helm release. Change the release name as needed.

    helm upgrade gloo-platform gloo-platform/gloo-platform \
      --namespace gloo-mesh \
      -f gloo-gateway-single.yaml \
      --version $UPGRADE_VERSION
    
  4. Verify that the Gloo telemetry collector deploys successfully.

    kubectl get pods -n gloo-mesh | grep telemetry
    
  5. Verify that the ports are exposed on the telemetry collector service.

    kubectl get services -n gloo-mesh | grep telemetry
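
    If you prefer a more targeted check, you can print only the port names and numbers on the collector service. The following command is a minimal sketch that assumes the default service name gloo-telemetry-collector; adjust the name if your Helm release uses a different one.

    kubectl get service gloo-telemetry-collector -n gloo-mesh \
      -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.port}{"\n"}{end}'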
    
  6. Create a service monitor resource to instruct the OpenShift Prometheus to scrape metrics from the Gloo telemetry collector agent. The service monitor scrapes metrics from the otlp-metrics and metrics ports that you exposed earlier.

    kubectl apply -f- <<EOF
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: gloo-telemetry-collector-sm
      namespace: gloo-mesh
    spec:
      endpoints:
      - interval: 30s
        port: otlp-metrics
        scheme: http
      - interval: 30s
        port: metrics
        scheme: http
      selector:
        matchLabels:
          app.kubernetes.io/name: telemetryCollector
    EOF
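
    To confirm that the resource was created, you can list the service monitors in the gloo-mesh namespace.

    kubectl get servicemonitors -n gloo-mesh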
    
  7. Create a configmap to enable workload monitoring in the cluster.

    kubectl apply -f - <<EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        enableUserWorkload: true
    EOF
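
    Enabling user workload monitoring causes OpenShift to start a separate monitoring stack. As a quick check, you can verify that its pods are running in the openshift-user-workload-monitoring namespace before you continue.

    kubectl get pods -n openshift-user-workload-monitoring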
    
  8. Open the OpenShift web console and select the Administrator view.

  9. Navigate to Observe > Metrics to open the built-in Prometheus expression browser.

  10. Verify that you can see metrics for the telemetrycollector container. For example, you can enter otelcol_exporter_sent_metric_points in the expression browser and verify that these metrics were sent. For an overview of metrics that are exposed, see Default metrics in the pipeline.
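
    The following expressions are a small sketch of what you might enter: the first shows the total number of metric data points that the collector's exporters reported as sent, and the second shows the send rate over the last five minutes. The exporter label is part of the collector's standard self-monitoring metrics, but treat the exact label set as an assumption and adjust it to what you see in your environment.

    # Total data points sent, broken down by exporter.
    sum by (exporter) (otelcol_exporter_sent_metric_points)

    # Send rate over the last 5 minutes.
    rate(otelcol_exporter_sent_metric_points[5m])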

  11. Optional: You can update the Gloo UI to read metrics from the OpenShift Prometheus instance to populate the Gloo UI graph and other metrics. This way, you can remove the built-in Prometheus instance that Gloo Mesh Gateway provides. For more information, see Connect the Gloo UI to OpenShift Prometheus.

  1. Get the current values of the Helm release for the management cluster. Note that your Helm release might have a different name.

    helm get values gloo-platform -n gloo-mesh -o yaml --kube-context $MGMT_CONTEXT > mgmt-server.yaml
    open mgmt-server.yaml
    
  2. In your Helm values file for the management cluster, expose the otlp-metrics and metrics ports on the Gloo telemetry gateway and the metrics port of the Gloo telemetry collector agent. The otlp-metrics port is used to expose the metrics that were collected by the telemetry collector agents across workload clusters and sent to the telemetry gateway. The metrics port exposes metrics for the Gloo telemetry gateway and collector agents themselves.

    telemetryGateway:
      enabled: true
      service:
        type: LoadBalancer
      ports:
        otlp-metrics:
          enabled: true
          containerPort: 9091
          servicePort: 9091
          protocol: TCP
        metrics:
          enabled: true
          containerPort: 8888
          servicePort: 8888
          protocol: TCP
    telemetryCollector:
      enabled: true
      ports:
        metrics:
          enabled: true
          containerPort: 8888
          servicePort: 8888
          protocol: TCP
    
  3. Upgrade your Helm release in the management cluster. Change the release name as needed.

    helm upgrade gloo-platform gloo-platform/gloo-platform \
     --kube-context $MGMT_CONTEXT \
     --namespace gloo-mesh \
     -f mgmt-server.yaml \
     --version $UPGRADE_VERSION
    
  4. Verify that the Gloo telemetry gateway and collector agents deploy successfully.

    kubectl get pods --context $MGMT_CONTEXT -n gloo-mesh | grep telemetry
    
  5. Verify that the ports are exposed on the telemetry collector and gateway services.

    kubectl get services --context $MGMT_CONTEXT -n gloo-mesh | grep telemetry
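
    To print only the port names and numbers for both services, you can loop over them as in the following sketch. The service names gloo-telemetry-gateway and gloo-telemetry-collector are assumed defaults; adjust them if your release uses different names.

    for svc in gloo-telemetry-gateway gloo-telemetry-collector; do
      echo "${svc}:"
      kubectl get service ${svc} --context $MGMT_CONTEXT -n gloo-mesh \
        -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.port}{"\n"}{end}'
    done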
    
  6. Create a service monitor resource to instruct the OpenShift Prometheus to scrape metrics from the Gloo telemetry gateway. The service monitor scrapes metrics from the otlp-metrics and metrics ports that you exposed earlier.

    kubectl --context ${MGMT_CONTEXT} apply -f- <<EOF
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: gloo-telemetry-gateway-sm
      namespace: gloo-mesh
    spec:
      endpoints:
      - interval: 30s
        port: otlp-metrics
        scheme: http
      - interval: 30s
        port: metrics
        scheme: http
      selector:
        matchLabels:
          app.kubernetes.io/name: telemetryGateway
    EOF
    
  7. Create another service monitor to scrape metrics from the Gloo telemetry collector agent in the management cluster.

    kubectl --context ${MGMT_CONTEXT} apply -f- <<EOF
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: gloo-telemetry-collector-sm
      namespace: gloo-mesh
    spec:
      endpoints:
      - interval: 30s
        port: metrics
        scheme: http
      selector:
        matchLabels:
          app.kubernetes.io/name: telemetryCollector
    EOF
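
    You can verify that both service monitors now exist in the management cluster.

    kubectl get servicemonitors --context $MGMT_CONTEXT -n gloo-mesh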
    
  8. Create a configmap to enable workload monitoring for the management cluster.

    kubectl --context $MGMT_CONTEXT apply -f - <<EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        enableUserWorkload: true
    EOF
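
    You can verify that the user workload monitoring pods come up in the openshift-user-workload-monitoring namespace of the management cluster.

    kubectl get pods --context $MGMT_CONTEXT -n openshift-user-workload-monitoring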
    
  9. Open the OpenShift web console for the management cluster and select the Administrator view.

  10. Navigate to Observe > Metrics to open the built-in Prometheus expression browser.

  11. Verify that you can see metrics for the telemetrygateway and telemetrycollector containers. For example, you can enter otelcol_exporter_sent_metric_points in the expression browser and verify that these metrics were sent from both containers. For an overview of metrics that these two components expose, see Default metrics in the pipeline.
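
    Because both components expose the same self-monitoring metric names, it can help to split the query by scrape job. With the service monitors that you created earlier, the job label typically corresponds to the service name, but treat that mapping as an assumption and check the labels that appear in your environment.

    # Data points sent, split by scrape job and exporter.
    sum by (job, exporter) (otelcol_exporter_sent_metric_points)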

  12. Get the current values of the Helm release for the workload cluster. Note that your Helm release might have a different name.

    helm get values gloo-platform -n gloo-mesh -o yaml --kube-context $REMOTE_CONTEXT > agent.yaml
    open agent.yaml
    
  13. In your Helm values file for the workload cluster, expose the metrics port on the Gloo telemetry collector agent. The metrics port exposes metrics for the Gloo telemetry collector agents, such as otelcol_exporter_enqueue_failed_metric_points, that you can use to determine whether the connection between the collector agents and the telemetry gateway in the management cluster is healthy. A sample query for this metric is included in the verification step later in this section.

    telemetryCollector:
      enabled: true
      ports:
        metrics: 
          enabled: true
          containerPort: 8888
          servicePort: 8888
          protocol: TCP
    
  14. Upgrade your Helm release in each workload cluster. Change the release name as needed, and update the cluster context each time that you repeat this command for another workload cluster.

    helm upgrade gloo-platform gloo-platform/gloo-platform \
      --kube-context $REMOTE_CONTEXT \
      --namespace gloo-mesh \
      -f agent.yaml \
      --version $UPGRADE_VERSION
    
  15. Verify that the Gloo telemetry collector agents deploy successfully.

    kubectl get pods --context $REMOTE_CONTEXT -n gloo-mesh | grep telemetry
    
  16. Verify that the port is exposed on the telemetry collector service.

    kubectl get services --context $REMOTE_CONTEXT -n gloo-mesh | grep telemetry
    
  17. Create a service monitor to scrape metrics from the Gloo telemetry collector agent in the workload cluster.

    kubectl --context ${REMOTE_CONTEXT} apply -f- <<EOF
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: gloo-telemetry-collector-sm
      namespace: gloo-mesh
    spec:
      endpoints:
      - interval: 30s
        port: metrics
        scheme: http
      selector:
        matchLabels:
          app.kubernetes.io/name: telemetryCollector
    EOF
    
  18. Create a configmap to enable workload monitoring for the workload cluster.

    kubectl --context $REMOTE_CONTEXT apply -f - <<EOF
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        enableUserWorkload: true
    EOF
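
    You can verify that the user workload monitoring pods come up in the openshift-user-workload-monitoring namespace of the workload cluster.

    kubectl get pods --context $REMOTE_CONTEXT -n openshift-user-workload-monitoring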
    
  19. Open the OpenShift web console for the workload cluster and select the Administrator view.

  20. Navigate to Observe > Metrics to open the built-in Prometheus expression browser.

  21. Verify that you can see metrics for the telemetrycollector containers. For example, you can enter otelcol_exporter_sent_metric_points in the expression browser. For an overview of metrics that are exposed, see Default metrics in the pipeline.
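
    To check the health of the connection from the collector agents to the telemetry gateway in the management cluster, you can also query the enqueue failure counter that is mentioned in the earlier Helm values step. Persistently increasing values suggest that the agents cannot deliver metrics to the gateway. Treat the exact label set as an assumption and adjust the query to the labels in your environment.

    # Failed enqueue attempts per exporter over the last 5 minutes.
    sum by (exporter) (rate(otelcol_exporter_enqueue_failed_metric_points[5m]))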

  22. Optional: Repeat these steps for each workload cluster.

  23. Optional: You can update the Gloo UI to read metrics from the OpenShift Prometheus instance to populate the Gloo UI graph and other metrics. This way, you can remove the built-in Prometheus instance that Gloo Mesh Gateway provides. For more information, see Connect the Gloo UI to OpenShift Prometheus.