0.18+ Upgrade Notice

Please use Open Source Gloo Edge 0.18.1+ or Enterprise Gloo Edge 0.18.0+.

Upgrading to 0.18 on Kubernetes

In Open Source Gloo Edge 0.18.1, we updated the Gateway API to support managing Envoy TCP (L4) routing configuration in addition to the existing HTTP/S (L7) routing. This required a backwards-incompatible change to the Gateway CRD in Gloo Edge.

Due to limited support for CRD versioning in older versions of Kubernetes (we cannot assume our customers run Kubernetes 1.13+), we implemented this change with a new Gateway v2 CRD. Open Source Gloo Edge 0.18.1+ will no longer install the old Gateway CRD or controller.

We recommend that you update your load balancer to handle routing traffic to two services: gateway-proxy and the new gateway-proxy-v2. If that is not possible in your environment, you can follow the steps below. However, installation for subsequent versions of Gloo Edge may assume that you are running gateway-proxy-v2 exclusively.

Upgrade Steps

This guide documents a process for safely upgrading Open Source Gloo Edge to 0.18.1+, or Enterprise Gloo Edge to 0.18.0+, with an emphasis on minimizing manual work and avoiding downtime while shifting traffic to the new Gateway implementation.

1. Prepare your existing resources for upgrade

To facilitate future updates, we’ve introduced live labels to our gateway-proxy deployments and services. You can take advantage of this feature during the migration to v0.18.1 by updating your existing resources. Run kubectl edit deployment -n gloo-system gateway-proxy and add gateway-proxy: live to the labels map under spec.template.metadata. Similarly, run kubectl edit service -n gloo-system gateway-proxy, add gateway-proxy: live to the selector map under spec, and remove all other selectors. Your gateway should function as before.

Adding this label and selector will make it possible to route traffic through gateway-proxy-v2 from the gateway-proxy service, minimizing downtime.
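If you prefer not to edit the resources interactively, the same two changes can be sketched with kubectl patch. This is a minimal sketch, not the documented procedure: the function names are hypothetical, and it assumes the resources live in gloo-system under the names used above.

```shell
NAMESPACE="gloo-system"

# Add gateway-proxy: live to the pod template labels of the existing
# gateway-proxy deployment (spec.template.metadata.labels).
add_live_label() {
  kubectl patch deployment gateway-proxy -n "$NAMESPACE" --type merge \
    -p '{"spec":{"template":{"metadata":{"labels":{"gateway-proxy":"live"}}}}}'
}

# Replace the whole selector map of the gateway-proxy service, so that
# gateway-proxy: live is the only selector left.
select_live_pods() {
  kubectl patch service gateway-proxy -n "$NAMESPACE" --type json \
    -p '[{"op":"replace","path":"/spec/selector","value":{"gateway-proxy":"live"}}]'
}

# Suggested order (run manually):
#   add_live_label
#   kubectl rollout status deployment/gateway-proxy -n gloo-system
#   select_live_pods
```

Labeling the deployment first and waiting for the rollout before changing the selector avoids a window in which the service selects no pods.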

2. Install Gloo Edge 0.18 with the upgrade flag

If installing with glooctl, provide the --upgrade flag. If installing with Helm, provide the upgrade flag via Helm values.
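For the glooctl path, the invocation might look like the sketch below. The --upgrade flag is the one named above; the install gateway subcommand is an assumption about how you normally install Open Source Gloo Edge, and the wrapper function name is hypothetical. The Helm value name is not given in this guide, so it is not shown here.

```shell
# Install Gloo Edge 0.18 and ask the installer to include the upgrade Job.
# ASSUMPTION: the open source gateway is installed with `glooctl install gateway`.
install_with_upgrade() {
  glooctl install gateway --upgrade
}

# Run manually:
#   install_with_upgrade
```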

The Gloo Edge 0.18 installation manifest creates two new deployments in Kubernetes, gateway-v2 and gateway-proxy-v2.

By providing the upgrade flag, the manifest includes a Kubernetes Job that will automatically create new Gateway v2 CRDs based on the contents of the Gateway v1 CRDs. This ensures the new gateway-v2 pod will maintain the same configuration as the original gateway, and the Envoy instance running in gateway-proxy-v2 will be configured correctly.

This does not modify or delete the existing Gateway v1 CRD(s), nor the deployments for gateway and gateway-proxy. In a later step, after traffic has been shifted to gateway-proxy-v2, these can be safely deleted.

3. Verify Gateway v2 is healthy

First, make sure that the gateway-v2 and gateway-proxy-v2 pods are ready.

Next, test the gateway-proxy-v2 configuration by sending requests directly to gateway-proxy-v2, before any live traffic is shifted to it.
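One way to carry out these checks from the command line is sketched below. The function names are hypothetical, and the probe assumes that the Envoy instance in gateway-proxy-v2 listens for HTTP on container port 8080; adjust the port and request path to match your own configuration.

```shell
NAMESPACE="gloo-system"

# Block until both new deployments report a completed rollout.
wait_for_gateway_v2() {
  kubectl rollout status deployment/gateway-v2 -n "$NAMESPACE" &&
    kubectl rollout status deployment/gateway-proxy-v2 -n "$NAMESPACE"
}

# Send one request straight to gateway-proxy-v2 and print the HTTP status.
# $1: request path for one of your routes (hypothetical, e.g. /my/route)
# ASSUMPTION: the proxy serves HTTP on container port 8080.
probe_gateway_proxy_v2() {
  kubectl port-forward -n "$NAMESPACE" deployment/gateway-proxy-v2 18080:8080 >/dev/null &
  pf_pid=$!
  sleep 2   # give the port-forward a moment to start
  curl -s -o /dev/null -w '%{http_code}\n' "http://localhost:18080$1"
  kill "$pf_pid" 2>/dev/null || true
}

# Run manually:
#   wait_for_gateway_v2 && probe_gateway_proxy_v2 /my/route
```

A status code matching what the same route returns through the original gateway-proxy suggests that the migrated Gateway v2 configuration was picked up.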

4. Migrate traffic to Gateway v2

Prior to upgrading, all traffic was being routed through the Envoy instance inside gateway-proxy. Now we can migrate traffic from gateway-proxy to gateway-proxy-v2.

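To start shifting traffic to the new gateway-proxy-v2, add the gateway-proxy: live label to the gateway-proxy-v2 deployment. The gateway-proxy service then selects the pods of both deployments, which gives a roughly 50/50 split when both deployments run the same number of replicas.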

Run kubectl edit deployment -n gloo-system gateway-proxy-v2, and edit the template spec to include the label gateway-proxy: live.
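As a non-interactive alternative, the label can be added with kubectl patch, and the service endpoints can be listed to confirm that pods from both deployments are now selected. This is a sketch under the naming used in this guide; the function names are hypothetical.

```shell
NAMESPACE="gloo-system"

# Add gateway-proxy: live to the gateway-proxy-v2 pod template labels.
add_live_label_v2() {
  kubectl patch deployment gateway-proxy-v2 -n "$NAMESPACE" --type merge \
    -p '{"spec":{"template":{"metadata":{"labels":{"gateway-proxy":"live"}}}}}'
}

# List the pod addresses currently behind the gateway-proxy service.
show_service_endpoints() {
  kubectl get endpoints gateway-proxy -n "$NAMESPACE" -o wide
}

# Run manually:
#   add_live_label_v2
#   kubectl rollout status deployment/gateway-proxy-v2 -n gloo-system
#   show_service_endpoints
```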

To verify that traffic is being split across each gateway-proxy, run kubectl port-forward -n gloo-system deployment/gateway-proxy 19001:19000 and kubectl port-forward -n gloo-system deployment/gateway-proxy-v2 19002:19000 to expose the Envoy admin pages for the v1 and v2 proxies, respectively. Open localhost:19001/stats and localhost:19002/stats in a web browser and compare the upstream_rq_completed stat for the same service on each proxy. As requests come in, these counts should increase at roughly the same rate to indicate a 50/50 traffic split.
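With both port-forwards running, the comparison can also be scripted instead of read off the browser. The sketch below pulls a single counter from each Envoy admin endpoint; the function name and the cluster name passed to it are hypothetical, and it assumes the stat appears as cluster.<name>.upstream_rq_completed in the /stats output.

```shell
# Print the upstream_rq_completed counter for one Envoy cluster.
# $1: local port of the forwarded Envoy admin page (19001 or 19002 above)
# $2: Envoy cluster name as it appears in /stats (hypothetical example below)
completed_requests() {
  curl -s "http://localhost:$1/stats" |
    awk -F': ' -v stat="cluster.$2.upstream_rq_completed" '$1 == stat { print $2 }'
}

# Run manually, a few times while requests are coming in:
#   echo "v1: $(completed_requests 19001 my-upstream-cluster)"
#   echo "v2: $(completed_requests 19002 my-upstream-cluster)"
```

Matching the full stat name keeps related counters for the same cluster, which share the upstream_rq_completed suffix, out of the result.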

Once you are satisfied with traffic on Gateway v2, you can remove the gateway-proxy: live label from the gateway-proxy deployment.

Verify that all traffic is successfully routed to gateway-proxy-v2.
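The label removal and a basic check of which pods remain selected might look like the following sketch. The function names are hypothetical; the JSON patch removes only the gateway-proxy key from the pod template labels and leaves any other labels in place.

```shell
NAMESPACE="gloo-system"

# Remove the gateway-proxy label from the original deployment's pod template.
remove_live_label_v1() {
  kubectl patch deployment gateway-proxy -n "$NAMESPACE" --type json \
    -p '[{"op":"remove","path":"/spec/template/metadata/labels/gateway-proxy"}]'
}

# List the pods that still carry gateway-proxy: live.
show_live_pods() {
  kubectl get pods -n "$NAMESPACE" -l gateway-proxy=live -o wide
}

# Run manually:
#   remove_live_label_v1
#   kubectl rollout status deployment/gateway-proxy -n gloo-system
#   show_live_pods
```

Once the rollout has finished, only gateway-proxy-v2 pods should be listed, and the upstream_rq_completed counters on the v1 proxy should stop increasing.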

5. Clean up

Once all of the traffic is being routed through gateway-proxy-v2, the Gateway v1 resources and upgrade job can be safely deleted.
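A cautious cleanup could be sketched as follows. The deployment names come from the steps above; the name of the upgrade job is not stated in this guide, so the sketch lists the jobs first and takes the job name as an argument. The function names are hypothetical, and the gateway-proxy service is deliberately left in place because it may still be the service that routes traffic to gateway-proxy-v2.

```shell
NAMESPACE="gloo-system"

# Delete the Gateway v1 controller and proxy deployments.
delete_v1_deployments() {
  kubectl delete deployment gateway gateway-proxy -n "$NAMESPACE"
}

# List jobs so that the upgrade job can be identified by name.
list_jobs() {
  kubectl get jobs -n "$NAMESPACE"
}

# Delete one job. $1: job name as shown by list_jobs.
delete_job() {
  kubectl delete job "$1" -n "$NAMESPACE"
}

# Run manually:
#   delete_v1_deployments
#   list_jobs
#   delete_job <name-of-upgrade-job>
```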

Notes

Upgrading Gateway CRDs in Git

If you follow a GitOps workflow and keep Gateway v1 CRDs in a Git repository, those will need to be updated. Once you have upgraded to 0.18.1, the Gateway v2 CRDs can be saved to the Git repository, replacing the Gateway v1 CRDs. Alternatively, the Gateway CRDs in Git can be fixed manually and the group updated. Please contact us in Slack if you’d like help with this.
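Saving the generated Gateway v2 resources to a file could be sketched like this. The exact resource name of the v2 Gateway is not stated in this guide, so the sketch takes it as an argument; one way to discover it is kubectl api-resources. The function name and the file name are hypothetical.

```shell
NAMESPACE="gloo-system"

# Dump all resources of one kind from the Gloo Edge namespace to a YAML file.
# $1: resource name of the Gateway v2 CRD, as reported by `kubectl api-resources`
# $2: output file inside your Git working tree
export_gateways() {
  kubectl get "$1" -n "$NAMESPACE" -o yaml > "$2"
}

# Run manually:
#   kubectl api-resources | grep -i gateway
#   export_gateways <gateway-v2-resource-name> gateways-v2.yaml
```

The exported YAML contains cluster-managed fields such as resourceVersion, uid, and status, which you will probably want to strip before committing.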

Kubernetes CRD upgrade support

In the future, we intend to utilize Kubernetes schemas and conversion webhooks, first introduced in Kubernetes 1.13, to facilitate better support around CRD versioning. However, at this time we must support older versions of Kubernetes, and so our upgrade process does not rely on those features.

More sophisticated traffic shifting

Shifting traffic in the coarse steps 0 -> 50/50 -> 100% may not be desirable; a more gradual approach may be preferred. We don’t currently support this as part of this upgrade, but it is an area we’d like to improve. Please contact us in Slack if you’d like to explore it.