Debugging Istio service mesh

Try using the Gloo Mesh UI and other observability tools to help you debug your service mesh and apps.

Istio debugging

If you still seem to have service mesh configuration issues after debugging your Gloo Mesh components, you might need to debug Istio. For help using the Istio diagnostic tools, see the Istio documentation.

Debugging the proxy? The configuration file for the Envoy proxy container in each of your Istio pods can be hundreds of lines long. Follow along with the Solo blog, Navigating Istio Config: A look into Istio's toolkit, to learn how to use istioctl to focus on the most common configuration areas.
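
As a rough sketch of that kind of workflow, the following shell function wraps a few istioctl subcommands that narrow the proxy configuration down to one area at a time. The function name proxy_summary and its pod and namespace arguments are placeholders made up for illustration; they do not come from the Solo blog.

```shell
# Sketch: summarize the Envoy configuration of one Istio-injected pod.
# proxy_summary is a hypothetical helper name; the pod and namespace are placeholders.
proxy_summary() {
  pod="$1"
  namespace="${2:-default}"

  # Show whether each proxy in the mesh is in sync with istiod.
  istioctl proxy-status

  # Look at one configuration area at a time instead of the full config dump.
  for area in listener route cluster endpoint; do
    echo "== ${area} =="
    istioctl proxy-config "${area}" "${pod}" -n "${namespace}"
  done
}
```

For example, you might run proxy_summary with the name of one of your own pods and its namespace as the two arguments.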

If you use Grafana to monitor Istio performance, check out the Grafana performance monitoring dashboard in the Solo Communities of Practice (COP) repository.

Note that COP tools are provided as helpful starting resources that are maintained by the community. These tools are not guaranteed to work in your environment, and are not part of product SLAs.

Review community resources for debugging your service mesh, such as this Istio debugging video or the Istio diagnostic tools documentation.

Knative and Istio performance

If you use Knative with Istio at large scale (thousands of services), consider the following performance factors.

Benchmark and tune Knative resources: Review tuning resources online, such as the Kperf benchmarking tool and the slides from this Istiocon talk.

Specify ports in the Istio sidecar: In your Istio sidecar configurations, specify the ports that the Istio proxy needs to listen on, as in the following example file. If you leave the ports empty, the Istio proxy might try to keep up with all of the Knative ports that are opened, which can slow down performance.

apiVersion: networking.istio.io/v1alpha3
kind: Sidecar
metadata:
  name: default
spec:
  egress:
  - hosts:
    - "./*"
    - "istio-system/*"
    port:
      number: 8080
      protocol: HTTP
      name: egresshttp

Separate istiod deployments: Consider using a separate istiod revision for the Knative activator proxy. Sometimes, Knative resources such as the webhook depend on information from the Istio proxy, such as when scaling up Knative services. By having a separate istiod revision for Knative, you can reduce the Istio proxy push time and speed up Knative connections.

EKS load balancer port

If you use EKS clusters and cannot connect to your apps, the istio-ingressgateway load balancer in your workload cluster might not be using the required port 15443. By default, EKS load balancer health checks use the first port in the load balancer's port list. In some cases, this causes istio-ingressgateway to listen on port 80 instead of 15443 because 80 is listed first, as in this example load balancer YAML file:

...
spec:
  clusterIP: 10.100.108.166
  externalTrafficPolicy: Cluster
  ports:
  - name: http2
    nodePort: 31143
    port: 80
    protocol: TCP
    targetPort: 8080
  - name: https
    nodePort: 30131
    port: 443
    protocol: TCP
    targetPort: 8443
  - name: tls
    nodePort: 32287
    port: 15443
    protocol: TCP
    targetPort: 15443
  selector:
    app: istio-ingressgateway
    istio: ingressgateway
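
To check which port is listed first on your own load balancer service, a jsonpath query along the following lines can help. This is a sketch only: the function names first_service_port and check_first_port are made up for illustration, and the code assumes that the REMOTE_CONTEXT1 environment variable holds the kubeconfig context of your workload cluster.

```shell
# Sketch: print the first port in a service's port list.
# first_service_port and check_first_port are hypothetical helper names.
first_service_port() {
  svc="${1:-istio-ingressgateway}"
  namespace="${2:-istio-system}"
  kubectl get svc "${svc}" -n "${namespace}" --context "${REMOTE_CONTEXT1:-}" \
    -o jsonpath='{.spec.ports[0].port}'
}

# Report whether port 15443 is the first port in the list.
check_first_port() {
  first="$(first_service_port "$@")"
  if [ "${first}" = "15443" ]; then
    echo "ok: port 15443 is listed first"
  else
    echo "warning: port ${first} is listed first, not 15443"
  fi
}
```

Running check_first_port without arguments checks the istio-ingressgateway service in the istio-system namespace.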

To redeploy the istio-ingressgateway load balancer with port 15443 instead, edit the istio-ingressgateway load balancer service in cluster-1 by running kubectl edit svc istio-ingressgateway -n istio-system --context $REMOTE_CONTEXT1. Then move the tls entry for port 15443 to the top of the ports list, as in the following example:

...
spec:
  clusterIP: 10.100.108.166
  externalTrafficPolicy: Cluster
  ports:
  - name: tls
    nodePort: 32287
    port: 15443
    protocol: TCP
    targetPort: 15443
  - name: http2
    nodePort: 31143
    port: 80
    protocol: TCP
    targetPort: 8080
  - name: https
    nodePort: 30131
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    app: istio-ingressgateway
    istio: ingressgateway

Istio gateway installation fails with timeout

When you install Istio, you might notice an error similar to the following.

✘ Ingress gateways encountered an error: failed to wait for resource: resources not ready after 5m0s: timed out waiting for the condition
  Deployment/istio-system/istio-eastwestgateway (container failed to start: CrashLoopBackOff: back-off 2m40s restarting failed container=istio-proxy pod=istio-eastwestgateway-56f99b9f8c-hv844_istio-system(35406c23-a05c-4640-99c4-d935d0b4b203))
  Deployment/istio-system/istio-ingressgateway (container failed to start: CrashLoopBackOff: back-off 2m40s restarting failed container=istio-proxy pod=istio-ingressgateway-7966bb6b69-5xg9g_istio-system(a22a578e-33ec-4579-94b5-3867d8cb3082))
- Pruning removed resources                               Error: failed to install manifests: errors occurred during operation

Your istioctl version might not match the Istio version that you are installing.

  1. Check the version that you use for the Istio installation, such as by checking the ISTIO_IMAGE environment variable.

    echo $ISTIO_IMAGE
    
  2. Check the istioctl version on your workstation.

    istioctl version
    
  3. If these versions do not match, install the istioctl version that matches the Istio version that you want to install.

  4. If the versions match, check the logs of the istiod and any gateway pods that might be in a crash loop state for error messages, such as the following.

    • no space left on device: If you set up Istio on a local device for testing purposes using kind or k3d, you might run out of space. Try cleaning up your system, such as with docker system prune.
  5. Try to install Istio again.
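
The version comparison in steps 1 to 3 can be sketched as a small shell check. The helper names semver_of and versions_match are made up for illustration, and the code assumes that ISTIO_IMAGE holds a value that starts with a MAJOR.MINOR.PATCH version, such as 1.20.2-solo. The commented download command uses the install script that the Istio project documents; your distribution might provide istioctl in a different way.

```shell
# Sketch: compare the Istio version that you install with your local istioctl version.
# semver_of and versions_match are hypothetical helper names.

# Extract the first MAJOR.MINOR.PATCH number from any string,
# for example "1.20.2-solo" or "client version: 1.20.2".
semver_of() {
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Succeed only when both strings contain the same MAJOR.MINOR.PATCH version.
versions_match() {
  want="$(semver_of "$1")"
  have="$(semver_of "$2")"
  [ -n "${want}" ] && [ "${want}" = "${have}" ]
}

# Example use (requires istioctl on your PATH):
#   if versions_match "$ISTIO_IMAGE" "$(istioctl version --remote=false)"; then
#     echo "versions match"
#   else
#     # Assumption: the upstream Istio download script fits your setup.
#     curl -L https://istio.io/downloadIstio | ISTIO_VERSION="$(semver_of "$ISTIO_IMAGE")" sh -
#   fi
```

The check compares only the numeric version, so a suffix such as -solo does not cause a false mismatch.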

Bookinfo apps are stuck in pending

If you install the Bookinfo sample app, but the deployment is stuck in a pending state, you might see the following errors.

admission webhook "sidecar-injector.istio.io" denied the request: template:
      inject:1: function "Template_Version_And_Istio_Version_Mismatched_Check_Installation"
      not defined

Error creating: Internal error occurred: failed calling webhook "sidecar-injector.istio.io": Post "https://istiod.istio-system.svc:443/inject?timeout=30s": x509: certificate signed by unknown authority

Your istioctl version does not match the IstioOperator version that was used during Istio installation.

  1. Ensure that you download the version of istioctl that matches the Istio version that you plan to install in your workload clusters.
  2. Uninstall your current Istio installation, such as by running istioctl x uninstall --purge.
  3. Reinstall Istio.
  4. Try to install the Bookinfo sample app again.
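
Before you uninstall, it can help to confirm which Istio version is actually running. The following sketch reads the image tag of the istiod deployment so that you can compare it with the output of istioctl version. The function name istiod_image_tag is made up for illustration, and the deployment name istiod is an assumption that does not hold if you installed Istio with revisions.

```shell
# Sketch: read the image tag of the running istiod deployment.
# istiod_image_tag is a hypothetical helper name. The default deployment
# name "istiod" is an assumption; revisioned installations use other names.
istiod_image_tag() {
  deploy="${1:-istiod}"
  image="$(kubectl get deploy "${deploy}" -n istio-system \
    -o jsonpath='{.spec.template.spec.containers[0].image}')"
  # Keep only the text after the last colon, which is the image tag.
  printf '%s\n' "${image##*:}"
}
```

If the printed tag and your istioctl client version differ, the mismatch described in this section is the likely cause.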