Production Deployments

This document shows some of the Production options that may be useful. We will continue to add to this document and welcome users of Gloo Edge to send PRs to this as well.

Dropping capabilities

One of the more important (and unique) things about Gloo Edge is the ability to significantly lock down the edge proxy. Other proxies require privileges to write to disk or access the Kubernetes API, while Gloo Edge splits those responsibilities between control plane and data plane. The data plane can be locked down with zero privileges while separating out the need to secure the control plane differently.

For example, Gloo Edge’s data plane (the gateway-proxy pod) has ReadOnly file system. Additionally it doesn’t require any additional tokens mounted in or OS-level privileges. By default some of these options are enabled to simplify developer experience, but if your use case doesn’t need them, you should lock them down.

Enable replacing invalid routes

Enable health checks

Liveness/readiness probes on Envoy are disabled by default. This is because Envoy’s behavior can be surprising: When there are no routes configured, Envoy reports itself as un-ready. As it becomes configured with a nonzero number of routes, it will start to report itself as ready.

Upstream health checks

In addition to defining health checks for Envoy, you should strongly consider defining health checks for your Upstreams. These health checks are used by Envoy to determine the health of the various upstream hosts in an upstream cluster, for example checking the health of the various pods that make up a Kubernetes Service. This is known as “active health checking” and can be configured on the Upstream resource directly. See the documentation for additional info.

Additionally, “outlier detection” can be configured which allows Envoy to passively check the health of upstream hosts. A helpful overview of this feature is available in Envoy’s documentation. This can be configured via the outlierDetection field on the Upstream resource. See the API reference for more detail .

Envoy performance

Configure appropriate resource usage

Metrics and monitoring

When running Gloo Edge (or any application for that matter) in a production environment, it is important to have a monitoring solution in place. Gloo Edge Enterprise provides a simple deployment of Prometheus and Grafana to assist with this necessity. However, depending on the requirements on your organization you may require a more robust solution, in which case you should make sure the metrics from the Gloo Edge components (especially Envoy) are available in whatever solution you are using. The general documentation for monitoring/observability has more info.

Some metrics that may be useful to monitor (listed in Prometheus format):

Access Logging

Envoy provides a powerful access logging mechanism which enables users and operators to understand the various traffic flowing through the proxy. Before deploying Gloo Edge in production, consider enabling access logging to help with monitoring traffic as well as to provide helpful information for troubleshooting. The access logging documentation should be consulted for more details.

Other Envoy-specific guidance