Prepare to upgrade
Before you upgrade Gloo Gateway, complete the following preparatory steps:
- Prepare your environment, such as upgrading your current version to the latest patch and upgrading any dependencies to the required supported versions.
- Review important changes made to Gloo Gateway in version 1.17, including CRD, Helm, CLI, and feature changes.
- Review frequently-asked questions about the upgrade process.
Prepare your environment
Review the following preparatory steps that might be required for your environment.
Upgrade your current minor version to the latest patch
Before you upgrade your minor version, first upgrade your current version to the latest patch. For example, if you currently run Gloo Gateway Enterprise version 1.16.13, first upgrade your installation to version 1.16.14. This ensures that your current environment is up-to-date with any bug fixes or security patches before you begin the minor version upgrade process.
- Find the latest patch of your minor version by checking the Open Source changelog or Enterprise changelog.
- Go to the documentation set for your current minor version. For example, if you currently run Gloo Gateway Enterprise version 1.16.13, use the drop-down menu in the header of this page to select v1.16.x.
- Follow the upgrade guide, using the latest patch for your minor version.
If required, perform incremental minor version updates
If you plan to upgrade to a version that is more than one minor version greater than your current version, such as to version 1.17 from 1.15 or older, you must upgrade incrementally. For example, you must first use the upgrade guide in the v1.16.x documentation set to upgrade from 1.15 to 1.16, and then follow the upgrade guide in the v1.17.x documentation set to upgrade from 1.16 to 1.17.
Upgrade dependencies
Check that your underlying infrastructure platform, such as Kubernetes, and other dependencies run a version that is supported for 1.17.
- Review the supported versions for dependencies such as Kubernetes, Istio, Helm, and more.
- Compare the supported versions against the versions you currently use.
- If necessary, upgrade your dependencies. For example, you might consult your cluster infrastructure provider to upgrade the version of Kubernetes that your cluster runs.
Consider settings to avoid downtime
You might deploy Gloo Gateway in Kubernetes environments that use the Kubernetes load balancer, or in non-Kubernetes environments. Depending on your setup, you can take additional steps to avoid downtime during the upgrade process.
- Kubernetes: Enable Envoy readiness and liveness probes during the upgrade. When these probes are set, Kubernetes sends requests only to healthy Envoy proxies during the upgrade process, which helps to prevent potential downtime. The probes are not enabled in default installations because they can lead to timeouts or other issues that make for a poor getting-started experience.
- Non-Kubernetes: Configure health checks on Envoy. Then, configure your load balancer to leverage these health checks, so that requests stop going to Envoy when it begins draining connections.
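As a minimal sketch, the Kubernetes probes might be enabled through Helm values like the following. The field names `probes` and `livenessProbeEnabled` and the proxy name `gatewayProxy` are assumptions that do not come from this page, so check them against the Helm reference for your version. In Enterprise installations, open source values are typically nested under a top-level `gloo` key.

```yaml
# Hypothetical Helm values sketch: enable Envoy readiness and liveness probes
# on the default gateway proxy. Verify every field name before use.
gatewayProxies:
  gatewayProxy:                  # assumed name of the default proxy
    podTemplate:
      probes: true               # readiness probe (assumed field name)
      livenessProbeEnabled: true # liveness probe (assumed field name)
```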
Review version 1.17 changes
Review the following changes made to Gloo Gateway in version 1.17. For some changes, you might be required to complete additional steps during the upgrade process.
Kubernetes Gateway API support
Gloo Gateway is now a fully conformant Kubernetes Gateway API implementation. The existing Gloo Edge APIs were not changed and continue to be fully supported. To deploy a gateway proxy that is based on the Kubernetes Gateway API, see the docs.
Breaking changes
The Gloo Gateway extProc filter implementation was changed to comply with the latest extProc implementation in Envoy. Previously, request and response attributes were included only in a header processing request, and were therefore sent to the extProc server only when request header processing messages were configured to be sent. Starting in Gloo Gateway version 1.17.0, the Gloo extProc filter sends request and response attributes as part of the top level processing request. That way, attributes can be processed on the first processing request regardless of its type.
If you implemented your extProc server to expect request and response attributes as part of the HTTP header processing request, you must change this implementation to read attributes from the top-level processing request instead.
For more information, see the extProc proto definition in Envoy.
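To illustrate the difference, the following sketch renders the two request shapes as YAML. The field names are based on the Envoy `ProcessingRequest` message as understood here; the attribute key `envoy.filters.http.ext_proc` and the `request.path` value are illustrative assumptions, so confirm the exact shape against the proto definition.

```yaml
# Before 1.17 (sketch): attributes arrive nested inside the header processing message.
request_headers:
  headers: {}                      # header map elided
  attributes:
    envoy.filters.http.ext_proc:   # assumed map key
      request.path: /example       # illustrative attribute
---
# 1.17 and later (sketch): attributes arrive at the top level of the processing request.
attributes:
  envoy.filters.http.ext_proc:     # assumed map key
    request.path: /example         # illustrative attribute
request_headers:
  headers: {}                      # header map elided
```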
Envoy version 1.29 upgrade
The Envoy dependency in Gloo Gateway 1.17 was upgraded from 1.27.x to 1.29.x. This upgrade includes the following changes. For more information about these changes, see the Envoy changelog documentation.
- ExtProc attribute processing: For more information, see ExtProc attribute processing.
- JWT tokens: The behavior for extracting JWT tokens changed. Previously, the JWT token was cut off at the first non-base64 character. Now, the entire JWT token is passed for validation. You can temporarily revert this change by setting the envoy.reloadable_features.token_passed_entirely runtime flag to false.
- HTTP2 host header: The HTTP2 host header is discarded if the :authority header is received. This change makes Envoy compliant with the HTTP2 request pseudo-header field implementation. For more information, see the HTTP2 reference. You can temporarily revert this change by setting the envoy.reloadable_features.http2_discard_host_header runtime flag to false.
- Transfer encoding header: The transfer encoding header is removed from downstream request headers. You can temporarily revert this change by setting the envoy.reloadable_features.sanitize_te runtime flag to false.
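In plain Envoy, runtime flags such as these can be set in a static layer of the bootstrap configuration. The following fragment is a sketch of that mechanism only. How a custom runtime layer reaches the Envoy bootstrap that Gloo Gateway manages depends on your installation and is not covered here, and the layer name `static_layer` is an arbitrary choice.

```yaml
# Envoy bootstrap fragment (sketch): temporarily revert the 1.29 behavior changes.
# Set only the flags that you actually need to revert.
layered_runtime:
  layers:
    - name: static_layer   # arbitrary layer name
      static_layer:
        envoy.reloadable_features.token_passed_entirely: false
        envoy.reloadable_features.http2_discard_host_header: false
        envoy.reloadable_features.sanitize_te: false
```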
Kubernetes Ingress API deprecation
As of version 1.17, the Kubernetes Ingress API is deprecated in Gloo Gateway. Instead, you can use the Gloo Gateway (Edge API) Gateway custom resource. Alternatively, to use the Kubernetes Gateway API for Gateway custom resources, you can check out the Gloo Gateway (Kubernetes Gateway API) docs.
OTel service name change
Previously, when using the Envoy OpenTelemetry configuration with Gloo Gateway, the service_name field was set to an empty string, which resulted in a display name of unknown_service:envoy. Now, the service_name is set to the name that you define in the Gateway resource.
Changelogs
Check the changelogs for the type of Gloo Gateway deployment that you have. Focus especially on any Breaking Changes that might require a different upgrade procedure. For Gloo Gateway Enterprise, you might also review the open source changelogs because most of the proto definitions are open source.
- Open Source changelogs
- Enterprise changelogs: Keep in mind that Gloo Gateway Enterprise pulls in Gloo Gateway Open Source as a dependency. Although the major and minor version numbers are the same for open source and enterprise, their patch versions often differ. For example, open source might use version x.y.a but enterprise uses version x.y.b. If you are unfamiliar with these versioning concepts, see Semantic versioning. Because of the differing patch versions, you might notice different output when checking your version with glooctl version. For example, your API server might run Gloo Gateway Enterprise version 1.17.1, which pulls in Gloo Gateway Open Source version 1.17.7 as a dependency.
  ~ > glooctl version
  Client: {"version":"1.17.7"}
  Server: {"type":"Gateway","enterprise":true,"kubernetes":...,{"Tag":"1.17.1","Name":"grpcserver-ee","Registry":"quay.io/solo-io"},...,{"Tag":"1.17.7","Name":"discovery","Registry":"quay.io/solo-io"},...}
You can use the changelogs' built-in comparison tool to compare between your current version and the version that you want to upgrade to.
Feature changes
Review the following summary of important new, deprecated, or removed features.
The following lists consist of the changes that were initially introduced with the 1.17.0 release. These changes might be backported to earlier versions of Gloo Gateway. Additionally, there might be other changes that are introduced in later 1.17 patch releases. For patch release changes, check the changelogs.
New or improved features:
- New auto-mTLS feature for the Istio integration: A new auto-mTLS feature is introduced that simplifies the integration with Istio service meshes. The auto-mTLS feature automatically injects mTLS configuration into all Upstream resources in your cluster. Without auto-mTLS, every Upstream must be updated manually to add the mTLS configuration. For more information, see Gloo Gateway and Istio.
- Image support:
  - When you install Gloo Gateway, distroless images for all Gloo components are now supported. For example, you can specify an image and add the -distroless tag to install the distroless version of that component.
  - Enterprise only: FIPS-compliant images for the SDS container are now supported. For example, you can specify the sds-ee-fips image.
- Secret deletion: Secrets can now be deleted even when warnings or errors are present. When the deletion of a secret is validated, the validating admission webhook removes the secret from the current snapshot, runs translations, and looks for errors. Previously, the validating webhook did not delete a secret if errors were present, or if warnings were present and the ignore_warnings setting was set to false. The old behavior could cause issues when attempting to delete secrets that were unrelated to the warnings or errors. Now, the validating webhook collects all the artifacts of the validation process, re-runs validation against the original snapshot, and compares the artifacts from that process to the artifacts previously collected. If the artifacts are the same, the secret is deleted. If the artifacts are different, the secret is not deleted and errors are returned. Because this process might slightly degrade performance, you can disable this feature by setting the gloo.gloo.deployment.customEnv.DISABLE_VALIDATION_AGAINST_PREVIOUS_STATE environment variable to true in your Gloo Gateway deployment.
- Access log updates:
  - You can now use %METADATA()% command directives in the Envoy access log format configuration. In future releases, these command directives will replace the %DYNAMIC_METADATA()%, %UPSTREAM_METADATA()%, and %CLUSTER_METADATA()% command directives.
  - A new listener-level access log option is added, which configures HTTP listener logs. The listener access logs complement HTTP request access logging, and can be enabled separately and independently from filter access logs.
- Kubernetes API server unavailability: The new MAX_RECOVERY_DURATION_WITHOUT_KUBE_API_SERVER environment variable defines the maximum duration that the Gloo pod can run and attempt to reconnect to the Kubernetes API server if it is unreachable. If the duration is exceeded, the Gloo pod quits. This means that when leader election is enabled, the Gloo pod falls back to a follower. Previously, the Gloo pod crashed in this situation, which could cause an outage. To set this environment variable, you can either update the Gloo deployment or update the Helm values by specifying the gloo.deployment.customEnv[0].Name=MAX_RECOVERY_DURATION_WITHOUT_KUBE_API_SERVER and gloo.deployment.customEnv[0].Value=60s values.
- Extended HTTP methods for Envoy: Envoy can now accept requests with extended HTTP methods, such as LABEL or UPDATE. Previously, requests with these methods returned an HTTP 400 response. Note that currently, this functionality is supported for HTTP/1 only.
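As a sketch, the Kubernetes API server recovery setting described above might look like this in a Helm values file. The text above writes the keys as Name and Value; this example assumes that the chart expects the lowercase name and value fields of a Kubernetes environment variable entry, so verify the casing against the Helm reference for your version.

```yaml
# Helm values sketch: let the Gloo pod keep retrying the Kubernetes API server
# for up to 60 seconds before it quits.
gloo:
  deployment:
    customEnv:
      - name: MAX_RECOVERY_DURATION_WITHOUT_KUBE_API_SERVER  # key casing is an assumption
        value: 60s
```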
Helm changes
Review the following summary of important new, deprecated, or removed Helm fields. For full details, see the changelogs.
New Helm fields:
- global.image.variant: Specify the image variant for all Gloo Gateway components. Supported values include standard, fips, distroless, and fips-distroless. The default value is standard. Note that the fips and fips-distroless image variants are supported for Enterprise only. Additionally, the global.image.fips setting is now deprecated.
- global.additionalLabels: Specify additional labels to add to Gloo resources.
- containerSecurityContext: Specify values for Pod Security Standards in each component. For example, Helm fields such as settings.integrations.knative.proxy.containerSecurityContext or global.extensions.extAuth.deployment.extAuthContainerSecurityContext now exist to allow you to specify container security context settings.
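For illustration, the new global fields might be combined in a Helm values file as follows. The label key and value are hypothetical, and this sketch assumes that global.additionalLabels takes a map of label keys to values.

```yaml
# Helm values sketch: select the distroless image variant and add a custom label.
global:
  image:
    variant: distroless     # one of: standard, fips, distroless, fips-distroless
  additionalLabels:
    team: platform          # hypothetical label, assuming a key/value map
```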
Updated Helm fields:
- deployment.runAsUser: The discovery and ingress-proxy deployments now respect the deployment.runAsUser value.
- kubeGateway.enabled: The Kubernetes Gateway API integration is now disabled by default. To use the Kubernetes Gateway API, you can set this field to true, or check out the Gloo Gateway with Kubernetes Gateway API docs.
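A minimal Helm values sketch for opting in to the Kubernetes Gateway API integration follows. It assumes that kubeGateway is a top-level key in an open source installation; in Enterprise installations, the key might be nested differently.

```yaml
# Helm values sketch: enable the Kubernetes Gateway API integration,
# which is disabled by default.
kubeGateway:
  enabled: true
```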
Deprecated Helm fields:
- global.image.fips: This setting is now deprecated. Use the global.image.variant=fips setting instead.
- global.istioIntegration: The following Istio integration Helm settings that rely on a double proxy setup are now deprecated:
  - global.istioIntegration.labelInstallNamespace
  - global.istioIntegration.whitelistDiscovery
  - global.istioIntegration.enableIstioSidecarOnGateway
  - global.istioIntegration.istioSidecarRevTag
  - global.istioIntegration.appendXForwardedHost
CRD changes
New CRDs are automatically applied to your cluster when performing a helm install operation, but are not applied when performing a helm upgrade operation. This is a deliberate design choice on the part of the Helm maintainers, given the risk associated with changing CRDs. Given this limitation, you must apply new CRDs to the cluster before upgrading.
Review the following summary of important new, deprecated, or removed CRD updates. For full details, see the changelogs.
New and updated CRDs:
- ExtAuth: A new retryPolicy section is added to the PassThroughGrpc settings in the ExtAuth CRD. You can use this option to configure retries for gRPC passthrough authentication in the case that the service is unavailable.
- RateLimit: A new grpcService setting is added to the RateLimit CRD to configure the authority header for the rate limit gRPC call.
- Settings (Enterprise-only):
  - A new observabilityOptions.grafanaIntegration.dashboardPrefix setting allows you to specify the prefix for the title and UID of Grafana dashboards that Gloo Gateway generates. This prefix can be useful when you aggregate data in a central Grafana instance to prevent conflicts across multiple Gloo environments. Note that if you set this field, you must manually remove any dashboard created without a prefix or with a different prefix.
  - A new observabilityOptions.grafanaIntegration.extraMetricQueryParameters setting allows you to specify additional query parameters to add to all metric query definitions in the Grafana dashboards that Gloo Gateway generates. This string value can consist of multiple query parameters separated by a comma, such as cluster="some-cluster",gateway_proxy_id="proxy-2".
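The two Grafana settings could be sketched in a Settings resource like the following. The resource name default, the namespace gloo-system, and the prefix value are assumptions for illustration, and the placement of the fields under spec follows the setting paths named above.

```yaml
# Settings sketch (Enterprise only): prefix generated Grafana dashboards and
# add extra query parameters to their metric queries.
apiVersion: gloo.solo.io/v1
kind: Settings
metadata:
  name: default            # assumed resource name
  namespace: gloo-system   # assumed installation namespace
spec:
  observabilityOptions:
    grafanaIntegration:
      dashboardPrefix: prod-east   # hypothetical prefix
      extraMetricQueryParameters: 'cluster="some-cluster",gateway_proxy_id="proxy-2"'
```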
CLI changes
You must upgrade glooctl before you upgrade Gloo Gateway. Because glooctl can create resources in your cluster, such as with glooctl add route, you might have errors in Gloo Gateway if you create resources with an older version of glooctl.
As part of the 1.17.7 release, no CLI changes were introduced.
Frequently-asked questions
Review the following frequently-asked questions about the upgrade process. If you still aren’t sure about the version upgrade impact, or if your use case doesn’t quite fit the standard upgrade path, feel free to post in the #gloo or #gloo-enterprise channels of our public Slack.
How do I upgrade Gloo Gateway in testing or sandbox environments?
If downtime is not a concern for your use case, you can follow the Quick upgrade guide to update your Gloo Gateway installation.
Note that for sandbox or exploratory environments, the easiest way to upgrade is to uninstall Gloo Gateway by running glooctl uninstall --all. Then, re-install Gloo Gateway at the desired version by following one of the installation guides.
How do I upgrade Gloo Gateway in a production environment, where downtime is unacceptable?
The basic helm upgrade process is not suitable for environments in which downtime is unacceptable. Instead, you can follow the Canary upgrade guide to deploy multiple versions of Gloo Gateway to your cluster, and test the upgrade version before uninstalling the existing version.
Additionally, you might need to take steps to account for other factors such as Gloo Gateway version changes, probe configurations, and external infrastructure like the load balancer that Gloo Gateway uses. Consider setting up liveness probes and health checks in your environment.
What happens to my Gloo Gateway CRs during an upgrade? How do I handle breaking changes?
A typical upgrade of Gloo Gateway across minor versions should not cause disruptions to the existing Gloo Gateway state. In the case of a breaking change, Solo will communicate through the upgrade guides, changelogs, or other channels if you must make a specific adjustment to perform the upgrade. Note that you can always use the glooctl debug yaml command to download the current Gloo Gateway state to one large YAML manifest.
Is the upgrade procedure different if I am not a cluster administrator?
If you are not an administrator of your cluster, you might be unable to create custom resource definitions (CRDs) and other cluster-scoped resources, such as cluster roles and cluster role bindings. If you encounter an error related to these resources, you can disable their creation by including the following setting in your Helm values:
global:
  glooRbac:
    create: false
Otherwise, you can try performing an installation of Gloo Gateway that is scoped to a single namespace by including the following setting in your Helm values:
global:
  glooRbac:
    namespaced: true
Why do I get an error about re-creating CRDs when upgrading using helm install or helm upgrade?
Helm v2 does not manage CRDs well, and is not supported in Gloo Gateway. Upgrade to Helm v3, delete the CRDs, and try again.
Why do I get an error about a gateway-certgen job?
The upgrade creates a Kubernetes Job named gateway-certgen to generate a certificate for the validation webhook. The job contains the ttlSecondsAfterFinished value so that the cluster cleans up the job automatically, but because this setting is still in Alpha, your cluster might ignore this value. In this case, you might have an issue while upgrading in which the upgrade attempts to change the gateway-certgen job, but the change fails because the job is immutable. To fix this issue, you can delete the job, which already completed, and re-apply the upgrade.