Upgrading Istio

Upgrade the minor version of your production Istio in workload clusters with zero downtime.

Istio recommends upgrading your Istio deployment by only one minor version at a time. If you want to upgrade multiple minor versions, such as from 1.13 to 1.15, upgrade incrementally from one version to the next minor version, such as first from 1.13 to 1.14, and then from 1.14 to 1.15.
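The incremental path can be sketched as a small shell helper. This is only an illustration: upgrade_path is a hypothetical function, not an istioctl or Gloo Mesh command, and it assumes that both versions share the same major version.

```shell
# upgrade_path FROM TO (hypothetical helper, not part of istioctl)
# Prints one line per single-minor-version hop between two Istio versions.
# Assumes both versions share the same major version; patch numbers are ignored.
upgrade_path() {
  major=${1%%.*}
  from_rest=${1#*.}; from_minor=${from_rest%%.*}
  to_rest=${2#*.};   to_minor=${to_rest%%.*}
  while [ "$from_minor" -lt "$to_minor" ]; do
    next=$((from_minor + 1))
    echo "$major.$from_minor -> $major.$next"
    from_minor=$next
  done
}

# Lists the two hops for the 1.13 to 1.15 example.
upgrade_path 1.13 1.15
```

Each printed hop corresponds to one full pass through the steps in this guide.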

Upgrade overview

This upgrade process uses a blue/green deployment model in which you run two versions of Istio simultaneously, as shown in this figure.

Figure of a production-level blue/green deployment model in which multiple versions of Istio are installed in one workload cluster

The Istio upgrade process follows these general steps:

  1. Deploy an Istio operator that runs the targeted upgrade version.
  2. Deploy a new Istio control plane via the IstioOperator configuration.
  3. Label service namespaces with the new Istio revision label, and upgrade service workloads with new Istio proxies.
  4. Upgrade the Istio ingress and east-west gateways in-place, and verify traffic.
  5. Clean up the old Istio resources.

The following steps upgrade the Istio architecture outlined in the Deploy Istio in production guide. This installation profile uses Istio revisions to facilitate the upgrade process, and deploys a gateway load balancer service that is not managed by Istio so that you can run multiple versions of one Istio gateway in a blue/green deployment model. For more information about the example resource files that are provided in the following steps, see the GitHub repository for Gloo Mesh Use Cases.

Step 1: Gather mesh details and prepare revisions

Before you begin the upgrade process, take inventory of the Istio service mesh and prepare for the new Istio revision.

  1. Verify that the version you want to upgrade to is tested and supported by Gloo Istio. If not, you might be able to upgrade Gloo Mesh, and then upgrade Istio.

  2. Check the Istio release notes for the upgrade version to prepare for any breaking changes.

  3. Note any resources that point to a specific Istio revision.

    • Custom Envoy filters: As a recommended practice, Envoy filters often target a specific Istio revision. When you create the Istio revision for the upgrade process, you might need to create filters that target the new Istio revision.
    • Namespaces: Note the Istio revisions that your service namespaces currently target.
      kubectl get namespace -L istio.io/rev
      

      Example output:

      NAME              STATUS   AGE   REV
      kube-system       Active   54m
      default           Active   54m   1-15-4
      bookinfo          Active   14s   1-15-4
      
  4. Save the existing and new Istio versions as environment variables. Versions use a format such as 1.16.2-solo, and revisions use a format such as 1-16-2.

    • For REPO, use a Gloo Istio repo key for the new image that you can get by logging in to the Support Center and reviewing the Istio images built by Solo.io support article.
    • For ISTIO_IMAGE, save the new version that you want to install, such as 1.16.2, and append the solo tag. You can optionally append other Gloo Istio tags, as described in About Gloo Istio. If you downloaded a different version than the following, make sure to specify that version instead.
    • For REVISION, take the new Istio version number and replace the periods with hyphens, such as 1.16.2 to 1-16-2.
    export ISTIO_OLD_VERSION=<existing_version>
    export ISTIO_OLD_REVISION=<existing_revision>
    export REPO=<repo-key>
    export ISTIO_IMAGE=<upgrade_version>
    export REVISION=<upgrade_revision>
    
  5. Save the name of the workload cluster in which you want to upgrade Istio as the CLUSTER_NAME environment variable, so that you can reuse the variable when you repeat these steps for each of your workload clusters.

    export CLUSTER_NAME=$REMOTE_CLUSTER1
    
  6. If you did not deploy Bookinfo, deploy the sample application with the old Istio version to keep track of changes during the upgrade.

    1. Navigate to the directory for the old Istio version.
      cd istio-$ISTIO_OLD_VERSION
      
    2. Create a bookinfo namespace and label it so that the Istio sidecars in this namespace target the old revision.
      kubectl create namespace bookinfo
      kubectl label ns bookinfo istio.io/rev=$ISTIO_OLD_REVISION
      
    3. Install Bookinfo in the bookinfo namespace.
      kubectl apply -n bookinfo -f samples/bookinfo/platform/kube/bookinfo.yaml
      kubectl apply -n bookinfo -f samples/bookinfo/networking/bookinfo-gateway.yaml
      
    4. Verify that the pods are running.
      kubectl get pods -n bookinfo
      
    5. Scale Bookinfo to 2 replicas.
      kubectl scale -n bookinfo --replicas=2 deployment/details-v1 deployment/ratings-v1 deployment/productpage-v1 deployment/reviews-v1 deployment/reviews-v2 deployment/reviews-v3
      
  7. In a separate terminal, generate some traffic to the Bookinfo app to verify that access to the app is uninterrupted and zero downtime occurs during the upgrade.

    1. Install the vegeta HTTP load testing utility.
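    2. Get the address of the Istio ingress gateway. Use the first command if your load balancer assigns an IP address, or the second command if it assigns a hostname.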
      
         export INGRESS_GW_IP=$(kubectl get svc -n istio-ingress istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
         echo http://$INGRESS_GW_IP/productpage
         
      
         export INGRESS_GW_IP=$(kubectl get svc -n istio-ingress istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
         echo http://$INGRESS_GW_IP/productpage
         
    3. Navigate to http://$INGRESS_GW_IP/productpage in a web browser to verify that the productpage for Bookinfo is reachable.
      open http://$INGRESS_GW_IP/productpage
      
    4. Run the 15-minute load test that sends 10 requests per second to http://$INGRESS_GW_IP/productpage.
      RUN_TIME_SECONDS=900
      
      echo "GET http://$INGRESS_GW_IP/productpage" | vegeta attack -rate 10/1s -duration=${RUN_TIME_SECONDS}s | vegeta encode > stats.json
      
      vegeta report stats.json
      
      vegeta plot stats.json > plot.html
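The version and revision formats from step 4 can be derived from a single version number instead of being typed by hand. The following is a minimal sketch under the assumption that the new version is 1.16.2, as in this guide's example; ISTIO_VERSION_NUMBER is a hypothetical helper variable that is not used elsewhere in this guide.

```shell
# Assumption: the new upstream version number is 1.16.2.
# ISTIO_VERSION_NUMBER is a hypothetical helper variable for this sketch only.
ISTIO_VERSION_NUMBER=1.16.2

# Image tag: the version number with the solo tag appended.
export ISTIO_IMAGE="${ISTIO_VERSION_NUMBER}-solo"

# Revision: the version number with periods replaced by hyphens.
export REVISION="$(printf '%s' "$ISTIO_VERSION_NUMBER" | tr '.' '-')"

echo "ISTIO_IMAGE=$ISTIO_IMAGE REVISION=$REVISION"
```

The same tr substitution can be applied to the existing version to produce ISTIO_OLD_REVISION.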
      

Step 2: Deploy the new Istio operator and control plane

Deploy the Istio operator and control plane for the new version that you want to upgrade to.

  1. Download the Istio version that you want to upgrade to. The latest version supported by Solo, 1.16.2, is provided as an example.

    1. Download the new version.

      curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.16.2 sh -
      
    2. Navigate to the directory for the new version.

      cd ./istio-1.16.2
      
    3. Add the istioctl client to your path.

      export PATH=$PWD/bin:$PATH
      
  2. Create a Helm template with the following settings, and save the template as operator.yaml. Direct Helm chart installation cannot currently be used due to a namespace ownership bug.

    TEMPLATE=$(helm template istio-operator-$REVISION manifests/charts/istio-operator \
      --set operatorNamespace=istio-operator \
      --set watchedNamespaces="istio-system\,istio-ingress\,istio-eastwest" \
      --set global.hub="docker.io/istio" \
      --set global.tag="$ISTIO_IMAGE" \
      --set revision="$REVISION")
    
    echo "$TEMPLATE" > operator.yaml
    
  3. Optional: View the operator resource configurations.

    cat operator.yaml
    
  4. Create the new Istio operator in your cluster.

    kubectl apply -f operator.yaml
    
  5. Verify that the operator resources are deployed.

    kubectl get all -n istio-operator | grep $REVISION
    

    Example output:

    pod/istio-operator-1-16-2-5448478484-l45kg   1/1     Running   0          51s
    service/istio-operator-1-16-2   ClusterIP   10.27.246.10    <none>        8383/TCP   51s
    deployment.apps/istio-operator-1-16-2   1/1     1            1           51s
    replicaset.apps/istio-operator-1-16-2-5448478484   1         1         1       51s
    
  6. Prepare an IstioOperator resource file to create the Istio control plane that runs the new version. This sample command downloads an example file, istiod-kubernetes.yaml, and replaces the environment variable references in the file with the values that you previously set. You can further edit the file to provide your own details for production-level settings.

    curl -L https://raw.githubusercontent.com/solo-io/gloo-mesh-use-cases/main/gloo-mesh/istio-install/1.16/istiod-kubernetes.yaml > istiod-kubernetes.yaml
    envsubst < istiod-kubernetes.yaml > istiod-kubernetes-values.yaml
    
  7. Deploy the control plane that runs the new version to your cluster.

    kubectl apply -f istiod-kubernetes-values.yaml
    
  8. After the installation is complete, verify that the Istio control plane pods for the new revision are now running alongside the existing pods for the old revision.

    kubectl get pods -n istio-system
    

    Example output:

    NAME                             READY   STATUS    RESTARTS   AGE
    istiod-1-15-4-668dd8cc4c-6d49g   1/1     Running   0          42m
    istiod-1-15-4-668dd8cc4c-btx8d   1/1     Running   0          42m
    istiod-1-16-2-76fbc7b85c-7hh7f   1/1     Running   0          42s
    istiod-1-16-2-76fbc7b85c-m5mlc   1/1     Running   0          41s
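
The check in step 8 can also be scripted. The following sketch counts the running istiod pods for each revision. The running_istiod_count function is a hypothetical helper, and the sample text is copied from the example output above so that the sketch runs without a cluster.

```shell
# running_istiod_count REVISION (hypothetical helper)
# Reads the output of `kubectl get pods -n istio-system` on stdin and prints
# how many istiod pods for the given revision are in the Running state.
running_istiod_count() {
  awk -v prefix="istiod-$1-" \
    'index($1, prefix) == 1 && $3 == "Running" { n++ } END { print n + 0 }'
}

# Sample input copied from the example output in this guide.
sample='NAME                             READY   STATUS    RESTARTS   AGE
istiod-1-15-4-668dd8cc4c-6d49g   1/1     Running   0          42m
istiod-1-15-4-668dd8cc4c-btx8d   1/1     Running   0          42m
istiod-1-16-2-76fbc7b85c-7hh7f   1/1     Running   0          42s
istiod-1-16-2-76fbc7b85c-m5mlc   1/1     Running   0          41s'

echo "old revision: $(printf '%s\n' "$sample" | running_istiod_count 1-15-4)"
echo "new revision: $(printf '%s\n' "$sample" | running_istiod_count 1-16-2)"
```

Against a live cluster, you might pipe the real command into the function instead, such as kubectl get pods -n istio-system | running_istiod_count "$REVISION".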
    

Step 3: Update apps and Istio gateways

Now that all the components for the new Istio version are deployed, you can upgrade your apps’ Istio sidecars, and perform an in-place upgrade for the Istio ingress and east-west gateways.

  1. Change the label on the bookinfo namespace to use the new revision.

    kubectl label ns bookinfo istio.io/rev=$REVISION --overwrite
    

    If you did not previously use revision labels for your apps, you can upgrade your application's sidecars by removing the istio-injection label and adding the revision label. Run 'kubectl label ns bookinfo istio-injection-' and then 'kubectl label ns bookinfo istio.io/rev=$REVISION'.

  2. Update the Bookinfo sample app by rolling out restarts to each of the microservices. The Istio sidecars for each microservice are updated to use the new Istio version. Make sure that you only restart one microservice at a time. For example, in the following commands, 20 seconds elapse between each restart to ensure that the pods have time to start running.

    kubectl rollout restart deployment -n bookinfo details-v1
    sleep 20s
    kubectl rollout restart deployment -n bookinfo ratings-v1
    sleep 20s
    kubectl rollout restart deployment -n bookinfo productpage-v1
    sleep 20s
    kubectl rollout restart deployment -n bookinfo reviews-v1
    sleep 20s
    kubectl rollout restart deployment -n bookinfo reviews-v2
    sleep 20s
    kubectl rollout restart deployment -n bookinfo reviews-v3
    
  3. Update the IstioOperator resource for the ingress gateway. Change the values of spec.tag, spec.revision, and spec.components.ingressGateways.label.version to use the new revision. Note that this in-place upgrade is safe for gateway deployments.

    kubectl edit IstioOperator ingress-gateway -n istio-system
    
  4. If you created an Istio east-west gateway, update the IstioOperator resource for the gateway. Change the values of spec.tag, spec.revision, and spec.components.ingressGateways.label.version to use the new revision. Note that this in-place upgrade is safe for gateway deployments.

    kubectl edit IstioOperator eastwest-gateway -n istio-system
    
  5. Verify that the productpage for Bookinfo is still reachable after the upgrade.

    open http://$INGRESS_GW_IP/productpage
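
As an alternative to the fixed 20-second pauses in step 2, each restart can wait until the rollout reports completion before the next one starts. The following is a sketch, not the method that this guide prescribes. The restart_in_order function, the KUBECTL override, and the 120-second timeout are assumptions for illustration, and the unquoted $KUBECTL expansion assumes a bash or sh shell.

```shell
# KUBECTL can be overridden to preview commands without touching a cluster.
# This override is an assumption of this sketch, not a kubectl feature.
KUBECTL=${KUBECTL:-kubectl}

# restart_in_order NAMESPACE DEPLOYMENT... (hypothetical helper)
# Restarts each deployment and waits for its rollout to finish before
# moving on to the next one. The 120s timeout is an assumed value.
restart_in_order() {
  ns=$1
  shift
  for deploy in "$@"; do
    $KUBECTL rollout restart deployment -n "$ns" "$deploy" || return 1
    $KUBECTL rollout status deployment -n "$ns" "$deploy" --timeout=120s || return 1
  done
}

# Preview the commands for two of the Bookinfo deployments in a subshell,
# so that the override does not leak into the rest of the session.
(KUBECTL="echo kubectl"; restart_in_order bookinfo details-v1 ratings-v1)
```

To run the restarts for real, call the function without the override and pass all six Bookinfo deployments in the order shown in step 2.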
    

Step 4: Validate traffic

Check the results of the load test to ensure traffic was uninterrupted throughout the upgrade process.

  1. After the 15-minute load test completes in your other terminal, check the results of the traffic requests that were sent to Bookinfo during the upgrade. The following example output shows that every request received a 200 response code and that no errors occurred.

    Requests      [total, rate, throughput]         6000, 10.00, 10.00
    Duration      [total, attack, wait]             15m0s, 15m0s, 26.776ms
    Latencies     [min, mean, 50, 90, 95, 99, max]  15.344ms, 29.06ms, 25.727ms, 33.811ms, 41.936ms, 85.286ms, 1.212s
    Bytes In      [total, mean]                     29091004, 4848.50
    Bytes Out     [total, mean]                     0, 0.00
    Success       [ratio]                           100.00%
    Status Codes  [code:count]                      200:6000  
    Error Set:
    
  2. You can also check the graph of the results that was generated.

    open ./plot.html
    

In the following example graph, a spike in latency occurred when the Bookinfo application sidecars were updating. You might reduce this latency by adjusting the application scaling properties.

Example graph of the load test results
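
The report check can also be automated. The following sketch reads the text output of vegeta report and succeeds only if the success ratio is 100% and every listed status code is 200. The report_is_clean function is a hypothetical helper, and the sample lines are copied from the example output in this step.

```shell
# report_is_clean (hypothetical helper)
# Reads the text output of `vegeta report` on stdin. Exits 0 only if the
# success ratio is 100.00% and every status code listed is 200.
report_is_clean() {
  awk '
    $1 == "Success" { ratio = $NF }
    $1 == "Status" && $2 == "Codes" {
      for (i = 4; i <= NF; i++) if ($i !~ /^200:/) bad = 1
    }
    END { if (ratio == "100.00%" && !bad) exit 0; exit 1 }
  '
}

# Sample lines copied from the example output in this guide.
sample='Requests      [total, rate, throughput]         6000, 10.00, 10.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:6000'

if printf '%s\n' "$sample" | report_is_clean; then
  echo "no failed requests"
else
  echo "some requests failed"
fi
```

With real results, you might run vegeta report stats.json | report_is_clean and inspect the exit code.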

Step 5: Clean up previous resources

After you validate your upgrade, clean up the Istio resources that run the previous version.

  1. Delete the IstioOperator resource for the old ingress gateway.

    kubectl delete IstioOperator ingress-gateway-$ISTIO_OLD_REVISION -n istio-ingress
    
  2. Verify that only the new gateway is running.

    kubectl get pods -n istio-ingress
    
  3. Uninstall the previous control plane.

    istioctl uninstall --revision $ISTIO_OLD_REVISION
    
  4. Delete the operator for the old Istio version and the operator's ClusterIP service.

    kubectl delete deploy istio-operator-$ISTIO_OLD_REVISION -n istio-operator
    kubectl delete svc istio-operator-$ISTIO_OLD_REVISION -n istio-operator
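
Before you uninstall the previous control plane, it can help to confirm that no namespace still targets the old revision. The following sketch filters the output of the namespace listing from Step 1. The namespaces_on_revision function is a hypothetical helper, and the sample text is the example output from Step 1, captured before the bookinfo namespace was relabeled.

```shell
# namespaces_on_revision REVISION (hypothetical helper)
# Reads the output of `kubectl get namespace -L istio.io/rev` on stdin and
# prints the namespaces whose REV column matches the given revision.
namespaces_on_revision() {
  awk -v rev="$1" 'NR > 1 && NF == 4 && $4 == rev { print $1 }'
}

# Sample input copied from the example output in Step 1 of this guide.
sample='NAME              STATUS   AGE   REV
kube-system       Active   54m
default           Active   54m   1-15-4
bookinfo          Active   14s   1-15-4'

# Lists the namespaces that still target the old revision in the sample.
printf '%s\n' "$sample" | namespaces_on_revision 1-15-4
```

Against a live cluster, an empty result from kubectl get namespace -L istio.io/rev | namespaces_on_revision "$ISTIO_OLD_REVISION" suggests that no namespace label still points at the old revision.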