Set up the pipeline
Set up the Gloo OpenTelemetry (OTel) pipeline in a new or an existing Gloo Mesh installation.
Before you begin
- Review the default pipelines that are available in the Gloo OTel pipeline and decide on the pipelines that you want to enable. Pipelines can be enabled in either the Gloo telemetry collector agent or the gateway, as shown in the following tables. By default, the Gloo OTel pipeline is set up with the `metrics/ui` and `metrics/prometheus` pipelines.

  Gloo telemetry collector agent pipelines:

  | Pipeline | Description |
  |----------|-------------|
  | `metrics/ui` | Collects the metrics that are required for the Gloo UI graph. This pipeline is enabled by default. To view the metrics that are included with this pipeline, see View default metrics. |
  | `metrics/cilium` | Collects extra Cilium metrics to feed the Cilium dashboard in Grafana. |
  | `logs/istio_access_logs` | Collects Istio access logs from Istio-enabled workloads. For more information, see access logs for the ingress gateway and workloads in a service mesh. |
  | `logs/cilium_flows` | Collects network flows for Cilium-enabled cluster workloads so that you can use the `meshctl hubble observe` command. For more information, see Network flow logs. |
  | `traces/istio` | A pre-defined pipeline that collects traces to observe and monitor requests, and pushes them to the built-in Jaeger platform or a custom Jaeger instance. For more information, see request tracing for the ingress gateway or workloads in a service mesh. |

  Gloo telemetry gateway pipelines:

  | Pipeline | Description |
  |----------|-------------|
  | `logs/clickhouse` | Forwards the Istio access logs that the collector agents receive to ClickHouse. |
  | `metrics/prometheus` | Collects metrics from various sources, such as the Gloo management server, Gloo Platform, Istio, Cilium, and the Gloo OTel pipeline, and makes this data available to the built-in Prometheus server. This pipeline is enabled by default. |
  | `traces/jaeger` | Receives traces from the Gloo telemetry collector agents and forwards them to the built-in or custom Jaeger tracing platform. |
- Choose how to secure the communication between the telemetry gateway in the management cluster and the collector agents in the workload clusters.
  - Testing or demo setups: To use the default certificate that the telemetry gateway is automatically created with, see Set up OTel with the default certificate.
  - POC or production setups: To bring your own certificate to secure the connection, see Set up OTel with a custom certificate.
Set up OTel with the default certificate
Enable the OTel telemetry pipeline by using the default certificate that the telemetry gateway is automatically created with.
- Enable the Gloo telemetry gateway in your management cluster.
  - Get your current installation Helm values, and save them in a file. Note that if you migrated from the legacy charts, your release might have a different name.

    ```sh
    helm get values gloo-platform -n gloo-mesh --kube-context $MGMT_CONTEXT > mgmt-server.yaml
    open mgmt-server.yaml
    ```

  - Add or update the following sections in your Helm values file.

    ```yaml
    legacyMetricsPipeline:
      enabled: false
    telemetryGateway:
      enabled: true
      resources:
        limits:
          cpu: 600m
          memory: 2Gi
        requests:
          cpu: 300m
          memory: 1Gi
    ```

  - Upgrade your installation by using your updated values file.

    ```sh
    helm upgrade gloo-platform gloo-platform/gloo-platform \
      --namespace gloo-mesh \
      --kube-context $MGMT_CONTEXT \
      --values mgmt-server.yaml
    ```
- Verify that the deployments in the `gloo-mesh` namespace are up and running, and that you see a `gloo-telemetry-gateway` deployment.

  ```sh
  kubectl get deployments -n gloo-mesh --context $MGMT_CONTEXT
  ```
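  If you prefer a check that waits until the gateway is fully rolled out, a sketch like the following can help. It assumes the deployment is named `gloo-telemetry-gateway`, matching the prefix shown above; adjust the name if your listing differs.

  ```sh
  # Wait up to two minutes for the telemetry gateway deployment to become available
  kubectl rollout status deployment/gloo-telemetry-gateway -n gloo-mesh --context $MGMT_CONTEXT --timeout=120s
  ```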
- Verify that the default certificate secret for the telemetry gateway is created in the management cluster.

  ```sh
  kubectl get secrets -n gloo-mesh --context $MGMT_CONTEXT
  ```

  Example output:

  ```
  NAME                                 TYPE                 DATA   AGE
  dashboard                            Opaque               0      3d20h
  gloo-telemetry-gateway-tls-secret    kubernetes.io/tls    3      3d20h
  ...
  ```
- Get the external IP address of the load balancer service that was created for the Gloo telemetry gateway.

  ```sh
  export TELEMETRY_GATEWAY_IP=$(kubectl get svc -n gloo-mesh gloo-telemetry-gateway --context $MGMT_CONTEXT -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  export TELEMETRY_GATEWAY_PORT=$(kubectl -n gloo-mesh get service gloo-telemetry-gateway --context $MGMT_CONTEXT -o jsonpath='{.spec.ports[?(@.name=="otlp")].port}')
  export TELEMETRY_GATEWAY_ADDRESS=${TELEMETRY_GATEWAY_IP}:${TELEMETRY_GATEWAY_PORT}
  echo $TELEMETRY_GATEWAY_ADDRESS
  ```
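  Depending on your cloud provider, the load balancer might publish a hostname instead of an IP address. In that case, the following variant, which reads the `hostname` field from the same service status, is a sketch you can adapt; the variable names mirror the ones used above.

  ```sh
  # Variant for load balancers that expose a hostname rather than an IP address
  export TELEMETRY_GATEWAY_HOSTNAME=$(kubectl get svc -n gloo-mesh gloo-telemetry-gateway --context $MGMT_CONTEXT -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
  export TELEMETRY_GATEWAY_PORT=$(kubectl -n gloo-mesh get service gloo-telemetry-gateway --context $MGMT_CONTEXT -o jsonpath='{.spec.ports[?(@.name=="otlp")].port}')
  export TELEMETRY_GATEWAY_ADDRESS=${TELEMETRY_GATEWAY_HOSTNAME}:${TELEMETRY_GATEWAY_PORT}
  echo $TELEMETRY_GATEWAY_ADDRESS
  ```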
- Enable the Gloo telemetry collector agents in your workload clusters.
  - Get your current installation Helm values, and save them in a file. Note that if you migrated from the legacy charts, your release might have a different name.

    ```sh
    helm get values gloo-platform -n gloo-mesh --kube-context $REMOTE_CONTEXT > agent.yaml
    open agent.yaml
    ```

  - Add or update the following sections in your Helm values file.

    ```yaml
    telemetryCollector:
      enabled: true
      resources:
        limits:
          cpu: 2
          memory: 2Gi
        requests:
          cpu: 500m
          memory: 1Gi
    ```

  - Optional: Enable additional default pipelines. The following example shows how to enable the `traces/istio` and `traces/jaeger` pipelines in the Gloo telemetry gateway and collector agents.

    ```yaml
    telemetryCollectorCustomization:
      pipelines:
        traces/istio:
          enabled: true
    telemetryGatewayCustomization:
      pipelines:
        traces/jaeger:
          enabled: true
    ```

  - Upgrade your installation by using your updated values file. Include the telemetry gateway's address in a `--set` flag.

    ```sh
    helm upgrade gloo-platform gloo-platform/gloo-platform \
      --namespace gloo-mesh \
      --kube-context $REMOTE_CONTEXT \
      --values agent.yaml \
      --set telemetryCollector.config.exporters.otlp.endpoint=$TELEMETRY_GATEWAY_ADDRESS
    ```

  - Repeat these steps for each workload cluster.
- Verify that the Gloo telemetry collector agents are deployed in your workload clusters. Because the agents are deployed as a daemon set, the number of telemetry collector agent pods equals the number of worker nodes in your cluster.

  ```sh
  kubectl get pods -n gloo-mesh --context $REMOTE_CONTEXT
  ```

  Example output:

  ```
  NAME                                   READY   STATUS    RESTARTS   AGE
  gloo-mesh-agent-d89944685-mmgtt        1/1     Running   0          83m
  gloo-telemetry-collector-agent-7rzfb   1/1     Running   0          107s
  gloo-telemetry-collector-agent-dgs87   1/1     Running   0          107s
  gloo-telemetry-collector-agent-nbmr6   1/1     Running   0          107s
  ```
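  If the agent pods are running but no data shows up later, checking the collector agent logs for export errors is a quick sanity check. This is a sketch that uses the `gloo-telemetry-collector-agent` daemon set name shown above; the exact log wording can vary by version.

  ```sh
  # Scan recent logs from one collector agent pod for export or connection errors
  kubectl logs daemonset/gloo-telemetry-collector-agent -n gloo-mesh --context $REMOTE_CONTEXT --tail=100 | grep -iE 'error|failed' || echo "No export errors in recent logs"
  ```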
Set up OTel with a custom certificate
Enable the OTel telemetry pipeline by using a custom certificate to secure the connection between the telemetry gateway and collector agents.
- Decide on the root CA that you want to use to sign the certificate for the telemetry gateway. The recommended approach is to derive the telemetry gateway certificate from the same root CA that you used to sign the server and client TLS certificates for your relay connection. However, you can also use a custom root CA for your telemetry gateway certificate.
- Choose the domain name that you want to use for your telemetry gateway. In the following steps, the example domain `gloo-telemetry-gateway.apps.cluster1.mydomain.net` is used.
- Use your preferred certificate issuer to create a server certificate and key for the telemetry gateway's domain, and store that information in a secret named `gloo-telemetry-gateway-tls-secret` in the `gloo-mesh` namespace. You might follow steps similar to the management server certificate generation to generate your telemetry gateway certificate. For example, you might use the following YAML file with a `cert-manager` instance to create the certificate and a key for the `gloo-telemetry-gateway.apps.cluster1.mydomain.net` domain in a Vault instance. This example assumes that the root CA certificate and key are stored and managed in Vault so that Vault can derive the telemetry gateway certificate from the same root. After the telemetry gateway certificate and key are created, the information is stored in the `gloo-telemetry-gateway-tls-secret` secret in the `gloo-mesh` namespace. This file is provided only as an example; your certificate and key generation might be different, depending on your certificate setup.

  ```yaml
  kind: Certificate
  apiVersion: cert-manager.io/v1
  metadata:
    name: gloo-telemetry-gateway
    namespace: gloo-mesh
  spec:
    secretName: gloo-telemetry-gateway-tls-secret
    duration: 8760h   # 365 days
    renewBefore: 360h # 15 days
    # Issuer for certs
    issuerRef:
      kind: ClusterIssuer
      name: vault-issuer-gloo
    commonName: gloo-telemetry-gateway
    dnsNames:
    # Domain for gateway's DNS entry
    - gloo-telemetry-gateway.apps.cluster1.mydomain.net
    usages:
    - server auth
    - client auth
    - digital signature
    - key encipherment
    privateKey:
      algorithm: "RSA"
      size: 2048
  ```
- Verify that the `gloo-telemetry-gateway-tls-secret` secret is created. This secret name is referenced by default in the `telemetryGateway.extraVolumes` field of your Helm values file, which ensures that the telemetry gateway can access and use the certificate information.

  ```sh
  kubectl get secret gloo-telemetry-gateway-tls-secret -n gloo-mesh -o yaml --context $MGMT_CONTEXT
  ```

  Example output:

  ```yaml
  apiVersion: v1
  data:
    ca.crt: [ca.crt content]
    tls.crt: [tls.crt content]
    tls.key: [tls.key content]
  kind: Secret
  metadata:
    annotations:
      cert-manager.io/alt-names: gloo-telemetry-gateway.apps.cluster1.mydomain.net
      cert-manager.io/certificate-name: gloo-telemetry-gateway
      cert-manager.io/common-name: gloo-telemetry-gateway
      cert-manager.io/ip-sans: ""
      cert-manager.io/issuer-group: ""
      cert-manager.io/issuer-kind: ClusterIssuer
      cert-manager.io/issuer-name: vault-issuer-gloo
      cert-manager.io/uri-sans: ""
    creationTimestamp: "2023-02-17T00:57:39Z"
    labels:
      controller.cert-manager.io/fao: "true"
    name: gloo-telemetry-gateway-tls-secret
    namespace: gloo-mesh
    resourceVersion: "11625264"
    uid: 31c794da-2359-43e6-ae02-6575968a0814
  type: kubernetes.io/tls
  ```
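  Optionally, before you continue, you can confirm that the server certificate in the secret chains back to your root CA. The following sketch assumes the secret layout shown in the example output.

  ```sh
  # Extract the CA and server certificates from the secret and verify the chain locally
  kubectl get secret gloo-telemetry-gateway-tls-secret -n gloo-mesh --context $MGMT_CONTEXT -o jsonpath='{.data.ca\.crt}' | base64 -d > telemetry-ca.crt
  kubectl get secret gloo-telemetry-gateway-tls-secret -n gloo-mesh --context $MGMT_CONTEXT -o jsonpath='{.data.tls\.crt}' | base64 -d > telemetry-tls.crt
  openssl verify -CAfile telemetry-ca.crt telemetry-tls.crt
  ```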
- Enable the Gloo telemetry gateway in your management cluster.
  - Get your current installation Helm values, and save them in a file. Note that if you migrated from the legacy charts, your release might have a different name.

    ```sh
    helm get values gloo-platform -n gloo-mesh --kube-context $MGMT_CONTEXT > mgmt-server.yaml
    open mgmt-server.yaml
    ```

  - Add or update the following sections in your Helm values file.

    ```yaml
    legacyMetricsPipeline:
      enabled: false
    telemetryGateway:
      enabled: true
      resources:
        limits:
          cpu: 600m
          memory: 2Gi
        requests:
          cpu: 300m
          memory: 1Gi
    telemetryGatewayCustomization:
      disableCertGeneration: true
    ```

  - Upgrade your installation by using your updated values file.

    ```sh
    helm upgrade gloo-platform gloo-platform/gloo-platform \
      --namespace gloo-mesh \
      --kube-context $MGMT_CONTEXT \
      --values mgmt-server.yaml
    ```
- Verify that the deployments in the `gloo-mesh` namespace are up and running, and that you see a `gloo-telemetry-gateway` deployment.

  ```sh
  kubectl get deployments -n gloo-mesh --context $MGMT_CONTEXT
  ```
- Get the external IP address of the load balancer service that was created for the Gloo telemetry gateway.

  ```sh
  export TELEMETRY_GATEWAY_IP=$(kubectl get svc -n gloo-mesh gloo-telemetry-gateway --context $MGMT_CONTEXT -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  export TELEMETRY_GATEWAY_PORT=$(kubectl -n gloo-mesh get service gloo-telemetry-gateway --context $MGMT_CONTEXT -o jsonpath='{.spec.ports[?(@.name=="otlp")].port}')
  export TELEMETRY_GATEWAY_ADDRESS=${TELEMETRY_GATEWAY_IP}:${TELEMETRY_GATEWAY_PORT}
  echo $TELEMETRY_GATEWAY_ADDRESS
  ```
- Use your cloud or DNS provider to create a DNS entry in your domain for the telemetry gateway's IP address.
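  After the DNS record propagates, you can confirm that the domain resolves to the load balancer address. This sketch uses the example domain from this guide; substitute your own domain.

  ```sh
  # Check that the telemetry gateway domain resolves, and compare it to the load balancer IP captured earlier
  dig +short gloo-telemetry-gateway.apps.cluster1.mydomain.net
  echo $TELEMETRY_GATEWAY_IP
  ```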
- Prepare the Gloo telemetry collector agent installation. To successfully connect from a collector agent in the workload cluster to the telemetry gateway in the management cluster, the root CA certificate must be stored in a Kubernetes secret on the workload cluster. By default, the collector agents are configured to look up the root CA certificate from the `relay-root-tls-secret` Kubernetes secret in the `gloo-mesh` namespace. This secret might already exist in your workload cluster if you implemented Option 2 or Option 3 of the relay certificate setup options. Review the following options to decide whether you can use this Kubernetes secret or need to create a new one.

  If you implemented Option 2 or Option 3 of the relay certificate setup options and you used the same root CA certificate to create the certificate for the telemetry gateway, you can use the `relay-root-tls-secret` Kubernetes secret for the collector agents.
  - Check whether the `relay-root-tls-secret` secret exists on workload clusters.

    ```sh
    kubectl get secret relay-root-tls-secret -n gloo-mesh --context $REMOTE_CONTEXT
    ```

  - If the secret exists, no further action is required. If the secret does not exist, copy the root CA certificate from the management cluster to each workload cluster. To confirm that both clusters end up with the same root CA certificate, you can optionally run the fingerprint check shown at the end of this step.

    ```sh
    kubectl get secret relay-root-tls-secret -n gloo-mesh --context $MGMT_CONTEXT -o jsonpath='{.data.ca\.crt}' | base64 -d > ca.crt
    kubectl create secret generic relay-root-tls-secret -n gloo-mesh --context $REMOTE_CONTEXT --from-file ca.crt=ca.crt
    ```

  If you implemented Option 1 or Option 4 of the relay setup options, or if you decided to use a different root CA certificate for the telemetry gateway certificate, store the root CA certificate in a Kubernetes secret on the workload cluster.
  - Store the root CA certificate that you want to use for the OTel pipeline in a secret.

    ```sh
    kubectl create secret generic telemetry-root-secret -n gloo-mesh --context $REMOTE_CONTEXT --from-file ca.crt=<root_ca_cert>.crt
    ```

  - Verify that the secret is created.

    ```sh
    kubectl get secret telemetry-root-secret -n gloo-mesh --context $REMOTE_CONTEXT
    ```
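  As referenced above, a quick way to confirm that the management and workload clusters hold the same root CA is to compare certificate fingerprints. This sketch uses the secret names from this guide; if you created `telemetry-root-secret` instead, adjust the secret name for the workload cluster.

  ```sh
  # Print the SHA-256 fingerprint of the root CA in each cluster; the two values should match
  kubectl get secret relay-root-tls-secret -n gloo-mesh --context $MGMT_CONTEXT -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -fingerprint -sha256
  kubectl get secret relay-root-tls-secret -n gloo-mesh --context $REMOTE_CONTEXT -o jsonpath='{.data.ca\.crt}' | base64 -d | openssl x509 -noout -fingerprint -sha256
  ```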
- Enable the Gloo telemetry collector agents in each workload cluster.
  - Get your updated installation Helm values again, and save them in a file. Note that if you migrated from the legacy charts, your release might have a different name.

    ```sh
    helm get values gloo-platform -n gloo-mesh --kube-context $REMOTE_CONTEXT > agent.yaml
    open agent.yaml
    ```

  - Add or update the following sections in your Helm values file. Replace the `serverName` value with the domain for your telemetry gateway's DNS entry. If you created a custom root CA certificate secret named `telemetry-root-secret`, include that secret name in the `extraVolumes` section. If you decided to use the root CA certificate in the `relay-root-tls-secret` Kubernetes secret, you can remove the `secretName: telemetry-root-secret` line from the Helm values file.

    ```yaml
    telemetryCollector:
      config:
        exporters:
          otlp:
            # Domain for gateway's DNS entry
            # The default port is 4317.
            # If you set up an external load balancer between the telemetry gateway and collector agents,
            # and you configured TLS passthrough to forward data to the telemetry gateway on port 4317,
            # use port 443 instead.
            endpoint: [domain]:4317
            tls:
              ca_file: /etc/otel-certs/ca.crt
      enabled: true
      resources:
        limits:
          cpu: 2
          memory: 2Gi
        requests:
          cpu: 500m
          memory: 1Gi
      # Include this section if you created a custom root CA cert secret
      extraVolumes:
        - name: root-ca
          secret:
            defaultMode: 420
            # Add your root CA cert secret name
            secretName: telemetry-root-secret
    telemetryCollectorCustomization:
      # Domain for gateway's DNS entry
      serverName: [domain]
    ```

  - Optional: Enable additional default pipelines. The following example shows how to enable the `traces/istio` and `traces/jaeger` pipelines in the Gloo telemetry gateway and collector agents.

    ```yaml
    telemetryCollectorCustomization:
      pipelines:
        traces/istio:
          enabled: true
    telemetryGatewayCustomization:
      pipelines:
        traces/jaeger:
          enabled: true
    ```

  - Upgrade each workload cluster by using your updated values file. Include the telemetry gateway's address in a `--set` flag.

    ```sh
    helm upgrade gloo-platform gloo-platform/gloo-platform \
      --namespace gloo-mesh \
      --kube-context $REMOTE_CONTEXT \
      --values agent.yaml \
      --set telemetryCollector.config.exporters.otlp.endpoint=$TELEMETRY_GATEWAY_ADDRESS
    ```
- Verify that the Gloo telemetry collector agents are deployed in your workload clusters. Because the agents are deployed as a daemon set, the number of telemetry collector agent pods equals the number of worker nodes in your cluster.

  ```sh
  kubectl get pods -n gloo-mesh --context $REMOTE_CONTEXT
  ```

  Example output:

  ```
  NAME                                   READY   STATUS    RESTARTS   AGE
  gloo-mesh-agent-d89944685-mmgtt        1/1     Running   0          83m
  gloo-telemetry-collector-agent-7rzfb   1/1     Running   0          107s
  gloo-telemetry-collector-agent-dgs87   1/1     Running   0          107s
  gloo-telemetry-collector-agent-nbmr6   1/1     Running   0          107s
  ```
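  If the collector agents cannot connect to the telemetry gateway, a TLS handshake check from a machine that can reach the gateway's load balancer often pinpoints certificate problems. The following is a sketch that assumes the example domain, the default port 4317, and a local copy of the root CA (for example, the `telemetry-ca.crt` file extracted in the earlier verification sketch); adjust the names to your setup.

  ```sh
  # Attempt a TLS handshake against the telemetry gateway and verify the presented certificate against the root CA
  openssl s_client -connect gloo-telemetry-gateway.apps.cluster1.mydomain.net:4317 \
    -servername gloo-telemetry-gateway.apps.cluster1.mydomain.net \
    -CAfile telemetry-ca.crt </dev/null | grep -E 'subject=|Verify return code'
  ```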
Verify metrics collection
- Generate traffic for the apps in your cluster. For example, if you set up the Bookinfo app as part of the getting started guide, you can open the product page app in your browser to generate traffic.
  - Open a port on your local machine for the product page app.

    ```sh
    kubectl port-forward deploy/productpage-v1 -n bookinfo --context $REMOTE_CONTEXT 9080
    ```

  - Open the product page in your browser.

    ```sh
    open http://localhost:9080/productpage?u=normal
    ```

  - Refresh the page a couple of times to generate traffic.
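  Instead of refreshing manually, you can send a burst of requests from a second terminal while the port-forward from the first step is still running. A minimal sketch:

  ```sh
  # Send 20 requests to the Bookinfo product page through the local port-forward
  for i in $(seq 1 20); do
    curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:9080/productpage?u=normal"
  done
  ```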
- Open the Gloo UI.

  ```sh
  meshctl dashboard --kubecontext=$MGMT_CONTEXT
  ```
- Verify that metrics were populated for your workloads by looking at the UI Graph.
- You can optionally review the raw metrics by opening the Prometheus UI and entering `istio_requests_total` in the expression search bar.
Collect compute instance metadata
Allow OTel collector agents to gather metadata about the compute instances that the workload cluster is deployed to, and add the metadata as labels on the metrics that they scrape. This compute instance metadata helps you better visualize your Gloo Mesh setup across your cloud provider infrastructure network. For example, if you deploy workload clusters across multiple cloud providers, or add a virtual machine to your Gloo Mesh setup, you can more easily see how your Gloo resources are deployed across your compute instances in the Gloo UI.
- Enable the required infrastructure settings in your cloud provider.
- Enable Workload Identity for the workload cluster. Workload Identity allows the Kubernetes service account for the OTel collector to act as a GCP IAM service account, which you assign the necessary permissions to.
- Save your GCP project ID in an environment variable.

  ```sh
  export PROJECT=<gcp_project_id>
  ```
- Create an IAM service account in GCP for the OTel collector in the workload cluster, and grant IAM permissions so that the collector can access metadata about the compute instances that the workload cluster is deployed to.
  - Create an IAM service account in GCP named `OTelCollector`.

    ```sh
    gcloud iam service-accounts create OTelCollector --project $PROJECT
    ```

  - Create an IAM role that gives the permission to describe the VM instances in your project.

    ```sh
    gcloud iam roles create OTelComputeViewer \
      --project $PROJECT \
      --title "OTel compute viewer" \
      --permissions compute.instances.get,iam.serviceAccounts.getAccessToken
    ```

  - Bind the role to the OTel GCP IAM service account.

    ```sh
    gcloud iam service-accounts add-iam-policy-binding OTelCollector@$PROJECT.iam.gserviceaccount.com \
      --project $PROJECT \
      --role "projects/$PROJECT/roles/OTelComputeViewer" \
      --member "serviceAccount:$PROJECT.svc.id.goog[gloo-mesh/gloo-telemetry-collector]"
    ```
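  Optionally, you can confirm that the role binding is in place before you continue. This sketch lists the IAM policy on the new service account; the role and member from the previous commands should appear in the output.

  ```sh
  # Review the IAM policy bindings on the OTel collector service account
  gcloud iam service-accounts get-iam-policy OTelCollector@$PROJECT.iam.gserviceaccount.com --project $PROJECT
  ```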
- Annotate the Kubernetes service account for the OTel collector with its GCP IAM permissions.

  ```sh
  kubectl annotate serviceaccount gloo-telemetry-collector \
    --context $REMOTE_CONTEXT -n gloo-mesh \
    iam.gke.io/gcp-service-account=OTelCollector@$PROJECT.iam.gserviceaccount.com
  ```
- Restart the OTel collector daemonset to apply the change.

  ```sh
  kubectl --context $REMOTE_CONTEXT rollout restart daemonset/gloo-telemetry-collector-agent -n gloo-mesh
  ```

  Note: For workload clusters that use an IAM OIDC provider, such as clusters on Amazon EKS, ensure that the workload cluster is associated with an IAM OIDC provider. No other permissions are required.
- Visualize your setup by launching the Gloo UI.
  - Access the Gloo UI.

    ```sh
    meshctl dashboard --kubecontext $MGMT_CONTEXT
    ```

  - Click the Graph tab to open the network visualization graph for your Gloo Mesh setup.
  - From the footer toolbar, click Layout Settings.
  - Toggle Group By to `INFRA` to review the clusters, virtual machines, and Kubernetes namespaces that your app nodes are organized in. This view also shows details for the cloud provider infrastructure, such as the VPCs and subnets that your resources are deployed to. You can see more compute network details by clicking on resource icons, which opens the resource's details pane.
- Access the Gloo UI.