Review options to customize the default Prometheus setup.
The built-in Prometheus server is the recommended approach to feed metrics to the Gloo UI graph to visualize workload communication. However, the pod is not set up with persistent storage and metrics are lost when the pod restarts or when the deployment is scaled down. Additionally, you might want to replace the built-in Prometheus server and use your organization’s own Prometheus-compatible solution or time series database that is hardened for production and integrates with other applications that might exist outside the cluster where your API Gateway runs.
Replace the built-in Prometheus server with your own instance
In this setup, you configure Gloo Mesh Core to disable the built-in Prometheus instance and to use your production Prometheus instance instead. This setup is a reasonable approach if you want to scrape raw Istio metrics to collect them in your production Prometheus instance. However, you cannot control the number of metrics that you collect, or federate and aggregate the metrics before you scrape them with your production Prometheus.
To query the metrics and compute results, you use the compute resources of the cluster where your production Prometheus instance runs. Note that depending on the number and complexity of the queries that you plan to run in your production Prometheus instance, especially if you use the instance to consolidate metrics of other apps as well, your production instance might get overloaded or start to respond more slowly.
To have more granular control over the metrics that you want to collect, it is recommended to set up additional receivers, processors, and exporters in the Gloo telemetry pipeline to make these metrics available to the Gloo telemetry gateway. Then, forward these metrics to the third-party solution or time series database of your choice, such as your production Prometheus or Datadog instance. For more information, see the Prometheus receiver and Prometheus exporter OpenTelemetry documentation.
Get your current installation Helm values, and save them in a file.
helm get values gloo-mesh-core -n gloo-mesh > gloo-mesh-core-single.yaml
open gloo-mesh-core-single.yaml
Delete the first line, which contains USER-SUPPLIED VALUES:, and save the file.
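If you prefer to script the header removal rather than edit the file by hand, the following is a minimal sketch. It assumes the header is the first line of the file; the function name strip_helm_header is hypothetical and not part of Helm or Gloo.

```shell
# Hypothetical helper: print a Helm values file without the
# "USER-SUPPLIED VALUES:" header that `helm get values` writes.
strip_helm_header() {
  # Delete line 1 only if it starts with the header text;
  # every other line is passed through unchanged.
  sed '1{/^USER-SUPPLIED VALUES:/d;}' "$1"
}
```

For example, you might run strip_helm_header gloo-mesh-core-single.yaml > values.tmp && mv values.tmp gloo-mesh-core-single.yaml to rewrite the file in place.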
In your Helm values file, disable the default Prometheus instance and instead enter the details of your custom Prometheus server. Make sure that the instance runs Prometheus version 2.16.0 or later. In the prometheusUrl field, enter the Prometheus URL that your instance is exposed on, such as http://kube-prometheus-stack-prometheus.monitoring:9090. You can get this value from the --web.external-url field in your Prometheus Helm values file or by selecting Status > Command-Line-Flags from the Prometheus UI. Do not use the FQDN for the Prometheus URL.
prometheus:
  enabled: false
common:
  prometheusUrl: <Prometheus_server_URL_and_port>
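To check the 2.16.0 minimum before you upgrade, you could compare the version that your Prometheus server reports against the requirement. The following is a minimal sketch; the function name version_at_least is hypothetical, and it assumes plain MAJOR.MINOR.PATCH version strings without a leading "v".

```shell
# Hypothetical helper: succeed if <actual> is at least <minimum>.
# Usage: version_at_least <actual> <minimum>
version_at_least() {
  awk -v a="$1" -v m="$2" 'BEGIN {
    split(a, x, "."); split(m, y, ".")
    # Compare major, minor, and patch numerically, most significant first.
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0  # All three components are equal.
  }'
}
```

One way to obtain the actual version is the Prometheus HTTP API, for example curl -s "$PROMETHEUS_URL/api/v1/status/buildinfo" | jq -r .data.version, where PROMETHEUS_URL is an assumed variable that holds your server URL. You can then run version_at_least "$version" 2.16.0.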
Upgrade your installation by using your updated Helm values file.
helm upgrade gloo-mesh-core gloo-mesh-core/gloo-mesh-core \
  --namespace gloo-mesh \
  --values gloo-mesh-core-single.yaml
Remove high cardinality labels at creation time
To reduce the amount of data that is collected, you can customize the Envoy filter in the Istio proxy deployment to modify how Istio metrics are recorded at creation time. With this setup, you can remove unwanted high cardinality labels before the metrics are scraped by the built-in Prometheus server or your own custom Prometheus server.
The following approach requires Istio version 1.17.x or lower. If you run Istio version 1.18.x or later, use the Istio Telemetry API to customize how metrics are recorded.
Make sure to only remove labels that you do not need in any of your production queries, alerts, or dashboards. After you apply the Envoy filter, high cardinality labels are permanently removed and cannot be recovered later.
Decide which context of the Istio Envoy filter you want to modify. Each Istio release includes an Envoy filter that is named stats-filter-<istio_version> and that defines how metrics are collected for a workload. Depending on whether you modify the Envoy filter directly or use the Istio Helm chart to configure the filter, you can choose between the following contexts:
inboundSidecar: Used to collect metrics for traffic that is sent to a destination (reporter=destination).
outboundSidecar: Used to collect metrics for traffic that leaves a microservice (reporter=source).
gateway: Used to collect metrics for traffic that passes through the ingress gateway.
Decide on the metric labels you want to remove with your custom Envoy filter. To find an overview of metrics that are collected by default, see the Istio documentation. For an overview of labels that are collected, see Labels.
You can start by looking at Istio histogram metrics, also referred to as distribution metrics. Histograms show the frequency distribution of data in a certain timeframe. While these metrics provide great insights and detail, they often come with lots of labels that lead to high cardinality. Removing labels from histograms can significantly reduce cardinality and the amount of data that you collect. For example, you might want to keep all the labels, including the high cardinality labels, of the istio_request_duration_milliseconds metric to monitor request latency for your workloads. However, collecting the same high cardinality labels in histograms such as istio_response_bytes_bucket might not be important for your environment.
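To see which Istio metrics contribute the most time series before you decide what to trim, you could run a series-count query against Prometheus. The following is a minimal sketch that only builds the PromQL string; the function name top_series_query is hypothetical.

```shell
# Hypothetical helper: print a PromQL query that lists the N metrics with the
# most time series among metrics whose names start with the given prefix.
# Usage: top_series_query <prefix> [limit]   (limit defaults to 10)
top_series_query() {
  prefix="$1"
  limit="${2:-10}"
  printf 'topk(%s, count by (__name__) ({__name__=~"%s.+"}))\n' "$limit" "$prefix"
}
```

You might then send the query to the Prometheus HTTP API, for example curl -sG --data-urlencode "query=$(top_series_query istio_)" "$PROMETHEUS_URL/api/v1/query", where PROMETHEUS_URL is an assumed variable that holds your server URL.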
Configure your Envoy filter to remove specific labels. To apply the same configuration across all of your Istio microservices, modify the filter in the Istio Helm chart. If you want to update the configuration for a particular workload only, you can patch the Envoy filter instead.
To find the name of the metric that you need to use in your filter configuration, see Metrics. Note that you must remove the istio_ prefix from the metric name before you add it to your filter configuration. For example, if you want to customize the request size metric, use request_bytes. To find an overview of available labels that you can remove, see Labels. Note that this page lists the labels with their actual names and not as the value that you need to provide in the Envoy filter or Helm chart. To find the corresponding label name value, refer to the Istio bootstrap config for your release.
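As an illustration of the Helm chart approach, the following sketch writes a values override that drops two labels from the response size histogram for inbound sidecar traffic. It assumes that your Istio chart (1.17.x or lower) exposes the stats filter settings under telemetry.v2.prometheus.configOverride. The file name istio-metrics-override.yaml and the two label names are illustrative only; verify the key path and label values against the bootstrap config for your release.

```shell
# Write an illustrative Helm values override (hypothetical file name).
cat > istio-metrics-override.yaml <<'EOF'
telemetry:
  v2:
    prometheus:
      configOverride:
        # Context: metrics for traffic sent to a destination (reporter=destination)
        inboundSidecar:
          metrics:
            # Metric name without the istio_ prefix
            - name: response_bytes
              # Example labels only; choose labels that no query, alert, or dashboard needs
              tags_to_remove:
                - source_principal
                - destination_principal
EOF
```

You could then pass this file with an additional --values flag when you upgrade your Istio control plane chart. Removed labels cannot be recovered later, so consider testing the override in a non-production cluster first.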