The Gloo agent reports the state of Istio and other resources in the workload cluster to the management server. It also applies configuration updates from the management server.

  1. Verify that the Gloo agent pod is running.

      kubectl get pods -n gloo-mesh --context ${REMOTE_CONTEXT}

    If not, describe the pod and look for error messages.

      kubectl describe pod -n gloo-mesh -l app=gloo-mesh-agent --context ${REMOTE_CONTEXT}
  2. Check the logs of the Gloo agent in your workload cluster. To view only the logs from a recent time window, such as the last 5s, 2m, or 3h, add the --since DURATION flag.

      meshctl logs agent -l error --kubecontext ${REMOTE_CONTEXT} [--since DURATION]

    Optionally, you can format the output with jq or save it in a local file so that you can read and analyze the output more easily.

      meshctl logs agent -l error --kubecontext ${REMOTE_CONTEXT} | jq . > agent-logs.json
  3. In the logs, look for "err", Err:, or Error messages. For example, you might see a message similar to the following.

    | Message | Description | Steps to resolve |
    | --- | --- | --- |
    | Err: connection error: desc = \"transport: Error while dialing dial tcp: missing address\" | The Gloo agent does not have the correct address set for the management server. | In your Helm settings file for the agent Helm chart, compare the value for the serverAddress setting with the IP address and port of the management server. If necessary, upgrade your agent installation with the correct address, such as helm upgrade gloo-mesh-agent gloo-mesh-agent/gloo-mesh-agent --namespace gloo-mesh --kube-context=${REMOTE_CONTEXT} --version ${GLOO_VERSION} --set serverAddress=<mgmt_server_address>. |
    | "err": " \"istio-ingressgateway\" not found", | Gloo expected to find a resource such as a Gateway Lifecycle Manager named istio-ingressgateway. You can check the resource field to see which namespace the resource was expected in. | If you recently deleted the resource, wait to see if the error resolves itself. If not, try debugging the resource. |
    | "err": "Operation cannot be fulfilled on \"istio-ingressgateway\": the object has been modified; please apply your changes to the latest version and try again | Gloo is trying to reconcile your changes to the resource, such as updating a Gateway Lifecycle Manager to add a workload cluster. | If you recently updated the resource, wait to see if the error resolves itself. If not, try debugging the resource. |
    | Waited for <time> due to client-side throttling, not priority and fairness, request | Gloo experienced a timeout when sending a request to the Kubernetes API server. For example, the Kubernetes etcd might be overloaded by the number of resources in the cluster. | Wait to see if the error resolves as your Kubernetes cluster load reduces. |
    | Error: getting initial relay connection: context deadline exceeded | The Gloo management server cannot set up a relay connection with the agent. The connection can fail for several reasons, such as pods or the service mesh in an unhealthy state. | Try debugging the management server and relay connection. |
    | transport: authentication handshake failed: x509: certificate signed by unknown authority | Your Gloo Mesh Core installation might have multiple certificates with different CAs. For example, you might have performed a Helm upgrade while using OpenTelemetry without disabling token and certificate regeneration. | Review the Solo Support Center article (requires login). |
  4. You can also check the logs for all other log levels, such as warn, debug, or info.

      meshctl logs agent --kubecontext ${REMOTE_CONTEXT} [--since DURATION]
  5. If you continue to see error messages that indicate state reconciliation issues, try debugging Gloo resources.
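
If the agent logs are long, it can help to tally which of the known messages from step 3 appear most often. The following is a minimal sketch, not part of Gloo or meshctl: the function name triage_agent_log and the short labels it prints are hypothetical, and the substring patterns are assumptions based only on the example messages in the table above, so they might also match unrelated log lines.

```shell
# Hypothetical helper: read agent log lines on stdin and print a short label
# plus a hint for each line that matches a known message from step 3.
# Lines that match none of the patterns are skipped.
triage_agent_log() {
  while IFS= read -r line; do
    case "$line" in
      *"missing address"*)
        echo "server-address: compare the agent serverAddress setting with the management server address" ;;
      *"the object has been modified"*)
        echo "conflict: wait for reconciliation, then debug the resource if the error persists" ;;
      *"not found"*)
        echo "missing-resource: check the resource field for the expected namespace" ;;
      *"client-side throttling"*)
        echo "throttling: wait for the Kubernetes API server load to reduce" ;;
      *"getting initial relay connection"*)
        echo "relay: debug the management server and relay connection" ;;
      *"certificate signed by unknown authority"*)
        echo "certificates: check for multiple certificates signed by different CAs" ;;
    esac
  done
}

# Example usage (assumes meshctl is installed and REMOTE_CONTEXT is set):
#   meshctl logs agent -l error --kubecontext ${REMOTE_CONTEXT} | triage_agent_log | sort | uniq -c
```

Piping the result through sort and uniq -c gives a count per label, which shows at a glance whether one issue, such as throttling, dominates the log.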