Management server
Debug the Gloo Mesh Core management server.
The Gloo management server configures resources such as the Gloo agent and istio-controller to maintain the desired state of your Istio environment and generate insights into it.
Debug the management server
Debug the Gloo management server.
- Verify that the Gloo management server pod is running. If not, describe the pod and look for error messages. If you have multiple replicas, check each pod.
kubectl get pods -n gloo-mesh -l app=gloo-mesh-mgmt-server --context ${MGMT_CONTEXT}
kubectl describe pod -n gloo-mesh -l app=gloo-mesh-mgmt-server --context ${MGMT_CONTEXT}
- Optional: To increase the verbosity of the logs, you can patch the management server deployment.
kubectl patch deploy -n gloo-mesh gloo-mesh-mgmt-server --context $MGMT_CONTEXT --type "json" -p '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--verbose=true"}]'
- Check the logs of the Gloo management server pod. To view logs recorded since a relative duration such as 5s, 2m, or 3h, you can specify the `--since <duration>` flag.
meshctl logs mgmt -l error --kubecontext $MGMT_CONTEXT [--since DURATION]
Optionally, you can format the output with `jq` or save it in a local file so that you can read and analyze the output more easily.
meshctl logs mgmt -l error --kubecontext $MGMT_CONTEXT | jq > mgmt-server-logs.json
- In the logs, look for error messages. For example, you might see a message similar to the following.
| Message | Description | Steps to resolve |
| --- | --- | --- |
| `json: cannot unmarshal array into Go struct field` | The Gloo configuration of the resource does not match the expected configuration in the Gloo custom resource definition. Gloo cannot translate the resource, and dependent resources such as policies do not work. | Review the configuration of the resource against the API reference, and try debugging the resource. For example, a field might be missing or have an incorrect value such as the wrong cluster name. If you recently upgraded the management server version, make sure that you reapply the CRDs. |
| `License is invalid or expired, crashing - license expired` | The Gloo license is expired. Your Gloo management server is in a crash loop, and no Gloo resources can be modified until you update the license. | See Updating your Gloo license. |
| `conflicting IOPs have been created from a different parent Istio Lifecycle Manager` | You might have a conflicting Istio Lifecycle Manager. For example, you might have uninstalled a previous Istio Lifecycle Manager that did not completely delete. | Try debugging the Istio Lifecycle Manager. |
- You can also check the logs for all other log levels, such as `warn`, `debug`, or `info`.
meshctl logs mgmt --kubecontext $MGMT_CONTEXT [--since DURATION]
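When you have saved the logs to a local file, a plain `grep` is often enough to surface the error-level entries for review. A minimal sketch, assuming the logs were saved as JSON lines (one object per line); the sample entries below are hypothetical stand-ins for real management server log output:

```shell
# Sample stand-in for mgmt-server-logs.json saved by the meshctl command above.
cat > mgmt-server-logs.json <<'EOF'
{"level":"info","msg":"translation succeeded"}
{"level":"error","msg":"json: cannot unmarshal array into Go struct field"}
{"level":"debug","msg":"pull-resource-deltas"}
EOF

# Print only the error-level entries so they are easy to review.
grep '"level":"error"' mgmt-server-logs.json
```

Against a real log file, the same `grep` filter works unchanged because each log entry carries a `"level"` field.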
Debug the relay connection
Verify the relay connection between the Gloo management server and agent.
Verify that the Gloo management server and agent pods are running. If not, try troubleshooting the management server or agent.
kubectl get pods -n gloo-mesh --context ${MGMT_CONTEXT}
kubectl get pods -n gloo-mesh --context ${REMOTE_CONTEXT}
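To spot an unhealthy pod quickly across many replicas, you can filter the `kubectl get pods` output for anything that is not in the `Running` state. A minimal sketch; `pods.txt` is a hypothetical sample standing in for the real command output:

```shell
# Sample stand-in for the output of:
#   kubectl get pods -n gloo-mesh --context ${MGMT_CONTEXT}
cat > pods.txt <<'EOF'
NAME                                     READY   STATUS             RESTARTS   AGE
gloo-mesh-mgmt-server-676f4b9945-2pngd   1/1     Running            0          2d
gloo-mesh-mgmt-server-676f4b9945-8xk2q   0/1     CrashLoopBackOff   5          2d
EOF

# Skip the header row and report any pod whose STATUS column is not Running.
awk 'NR > 1 && $3 != "Running" { print "NOT RUNNING:", $1, "(" $3 ")" }' pods.txt
```

In a live cluster, pipe the `kubectl get pods` command directly into the same `awk` filter instead of using a sample file.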
Verify that the workload clusters are successfully identified by the management plane. This check might take a few seconds to ensure that the expected relay agent is now running and is connected to the relay server in the management cluster.
meshctl check --kubecontext $MGMT_CONTEXT
Example output:
...
🟢 Mgmt server connectivity to workload agents
   Cluster  | Registered | Connected Pod
   cluster1 | true       | gloo-mesh/gloo-mesh-mgmt-server-676f4b9945-2pngd
   cluster2 | true       | gloo-mesh/gloo-mesh-mgmt-server-676f4b9945-2pngd
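If you script this check across many clusters, you can scan the `Registered` column of the `meshctl check` table for any cluster that is not registered. A minimal sketch; `check.txt` is a hypothetical sample of the table rows, including a deliberately unregistered cluster:

```shell
# Sample stand-in for the cluster rows of the meshctl check output.
cat > check.txt <<'EOF'
cluster1 | true | gloo-mesh/gloo-mesh-mgmt-server-676f4b9945-2pngd
cluster2 | false |
EOF

# Report any cluster whose Registered column is not "true".
awk -F'|' '{ c=$1; r=$2; gsub(/ /,"",c); gsub(/ /,"",r);
             if (r != "true") print "NOT REGISTERED: " c }' check.txt
# Prints: NOT REGISTERED: cluster2
```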
Check that the relay connection between the management server and workload agents is healthy.
- Forward port 9091 of the `gloo-mesh-mgmt-server` pod to your localhost.
kubectl port-forward -n gloo-mesh --context $MGMT_CONTEXT deploy/gloo-mesh-mgmt-server 9091
- In your browser, connect to http://localhost:9091/metrics.
- In the metrics UI, look for the following lines. If the values are `1`, the agents in the workload clusters are successfully registered with the management server. If the values are `0`, the agents are not successfully connected. A successful `warmed` value indicates that the management server can push configuration to the agents.
relay_pull_clients_connected{cluster="cluster1"} 1
relay_pull_clients_connected{cluster="cluster2"} 1
relay_push_clients_connected{cluster="cluster1"} 1
relay_push_clients_connected{cluster="cluster2"} 1
relay_push_clients_warmed{cluster="cluster1"} 1
relay_push_clients_warmed{cluster="cluster2"} 1
- Take snapshots in case you want to refer to the logs later, such as to open a Support issue.
curl localhost:9091/snapshots/input -o input_snapshot.json
curl localhost:9091/snapshots/output -o output_snapshot.json
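Rather than eyeballing the metrics page, you can scan it for any relay metric that reports `0`, which indicates an agent that is not connected or not warmed. A minimal sketch; `metrics.txt` is a hypothetical sample standing in for the response from `http://localhost:9091/metrics`:

```shell
# Sample stand-in for the relay metrics served at localhost:9091/metrics,
# with one deliberately unhealthy value for illustration.
cat > metrics.txt <<'EOF'
relay_pull_clients_connected{cluster="cluster1"} 1
relay_pull_clients_connected{cluster="cluster2"} 0
relay_push_clients_connected{cluster="cluster1"} 1
relay_push_clients_warmed{cluster="cluster1"} 1
EOF

# Any relay_* metric whose value is 0 indicates an unhealthy connection.
awk '/^relay_/ && $2 == 0 { print "UNHEALTHY:", $1 }' metrics.txt
```

In a live setup, you can feed `curl -s localhost:9091/metrics` into the same `awk` filter while the port-forward from the earlier step is active.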
Check that the Gloo management services are running.
Send a gRPC request to the Gloo management server.
kubectl get secret --context $MGMT_CONTEXT -n gloo-mesh relay-root-tls-secret -o json | jq -r '.data["ca.crt"]' | base64 -d > ca.crt
grpcurl -authority gloo-mesh-mgmt-server.gloo-mesh --cacert=./ca.crt $MGMT_SERVER_NETWORKING_ADDRESS list
Verify that the following services are listed.
envoy.service.accesslog.v3.AccessLogService
envoy.service.metrics.v2.MetricsService
envoy.service.metrics.v3.MetricsService
grpc.reflection.v1alpha.ServerReflection
relay.multicluster.skv2.solo.io.RelayCertificateService
relay.multicluster.skv2.solo.io.RelayPullServer
relay.multicluster.skv2.solo.io.RelayPushServer
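To automate this verification, you can check the `grpcurl` listing for each required relay service. A minimal sketch; `services.txt` is a hypothetical sample standing in for the saved `grpcurl ... list` output:

```shell
# Sample stand-in for the output of the grpcurl list command above.
cat > services.txt <<'EOF'
envoy.service.accesslog.v3.AccessLogService
relay.multicluster.skv2.solo.io.RelayCertificateService
relay.multicluster.skv2.solo.io.RelayPullServer
relay.multicluster.skv2.solo.io.RelayPushServer
EOF

# Report whether each required relay service is present in the listing.
for svc in RelayCertificateService RelayPullServer RelayPushServer; do
  grep -q "relay.multicluster.skv2.solo.io.$svc" services.txt \
    && echo "$svc: present" || echo "$svc: MISSING"
done
```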
Check the logs on the `gloo-mesh-mgmt-server` pod on the management cluster for communication from the workload cluster.
meshctl logs mgmt -l error --kubecontext $MGMT_CONTEXT | grep $REMOTE_CLUSTER
Example output:
{"level":"debug","ts":1616160185.5505846,"logger":"pull-resource-deltas","msg":"recieved request for delta: response_nonce:\"1\"","metadata":{":authority":["gloo-mesh-mgmt-server.gloo-mesh.svc.cluster.local:11100"],"content-type":["application/grpc"],"user-agent":["grpc-go/1.34.0"],"x-cluster-id":["remote.cluster"]},"peer":"10.244.0.17:40074"}
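The `x-cluster-id` metadata field in such a log entry identifies which workload cluster the request came from. A minimal sketch that extracts it with `sed`; the `line` variable below holds a simplified, hypothetical version of the log entry above:

```shell
# Simplified sample of a management server log entry; the real entry
# carries additional metadata fields.
line='{"level":"debug","logger":"pull-resource-deltas","metadata":{"x-cluster-id":["remote.cluster"]},"peer":"10.244.0.17:40074"}'

# Pull the value of x-cluster-id out of the JSON metadata.
echo "$line" | sed -n 's/.*"x-cluster-id":\["\([^"]*\)"\].*/\1/p'
# Prints: remote.cluster
```

This is handy when correlating entries across multiple workload clusters in one combined log stream.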