Generate relay certificates with Vault
To secure communication between the Gloo Mesh management and data planes, set up the relay root and intermediate certificate authorities (CAs) to generate the relay server certificate and relay agent client certificates.
The following steps contain example configurations to generate each relay certificate manually. You can use these example configurations as a starting point to create CAs in your own public key infrastructure (PKI).
Although this Vault setup is more secure than using the self-signing default setup, the certificates are still stored within the management cluster. You might restrict access to the management cluster, or you might need to use a different setup to meet your production security requirements.
Overview
The following figure depicts an example architecture for using cert-manager and Vault to set up the relay certificates for multiple clusters.
Figure: Example Vault certificate setup for multiple clusters.
- After installing cert-manager and Vault in your management cluster, you set up a root of trust for the CA chain.
- Next, you create an intermediate CA that is used to sign the relay server and client certificates.
- After you create the relay server and client certificates in your clusters, the Gloo Mesh gloo-mesh-mgmt-server and gloo-mesh-agent deployments use the signed certificates to secure their gRPC communication with mutual TLS (mTLS).
Before you begin
- Install cert-manager in the management cluster and in each workload cluster, such as in the following examples.
kubectl --context ${MGMT_CONTEXT} apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.yaml
kubectl --context ${REMOTE_CONTEXT} apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.yaml
- Save the kubeconfig contexts for your clusters. Run kubectl config get-contexts, look for your cluster in the CLUSTER column, and get the context name in the NAME column. Note: Do not use context names with underscores. The context name is used as a SAN specification in the generated certificate that connects workload clusters to the management cluster, and underscores in SANs are not FQDN compliant. You can rename a context by running kubectl config rename-context "<oldcontext>" <newcontext>.
export MGMT_CLUSTER=<mgmt-cluster-name>
export REMOTE_CLUSTER=<remote-cluster-name>
export MGMT_CONTEXT=<management-cluster-context>
export REMOTE_CONTEXT=<remote-cluster-context>
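The underscore rule in the note above can be checked before you continue. The following is a minimal sketch, not part of Gloo Mesh: the function names is_valid_context_name and check_contexts are hypothetical, and only the underscore rule from this guide is checked.

```shell
# Return 0 if the context name contains no underscore, 1 otherwise.
# Only the underscore rule from this guide is checked (assumption: no other
# SAN rules are validated here).
is_valid_context_name() {
  case "$1" in
    "" | *_*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: print a reminder for each context name that must be renamed.
check_contexts() {
  for ctx in "$@"; do
    if ! is_valid_context_name "$ctx"; then
      echo "rename context: $ctx"
    fi
  done
}

# Example with made-up context names.
check_contexts "mgmt-cluster" "gke_project_zone_remote"
```

To check your own values, you might run check_contexts "$MGMT_CONTEXT" "$REMOTE_CONTEXT" after the exports.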
Generate the relay root CA
Create and securely store the relay root CA in HashiCorp Vault. Although this Vault setup is more secure than using the self-signing default setup, you might need to use a different setup to meet your production security requirements.
- If it doesn't already exist, create the gloo-mesh namespace.
kubectl create namespace gloo-mesh --context $MGMT_CONTEXT
- If you have not already done so, add and update the HashiCorp Helm repository.
helm repo add hashicorp https://helm.releases.hashicorp.com
helm repo update
- Install Vault in your management cluster.
helm install vault hashicorp/vault -n vault \
--kube-context ${MGMT_CONTEXT} \
--set "injector.enabled=false" \
--set "server.dev.enabled=true" \
--set "server.service.type=LoadBalancer" \
--create-namespace
- Enable Vault for root CA certificates along the pki path.
kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault secrets enable pki'
- Set up the root of trust. The following example uses solo.io, but replace these values with your own CA provider.
kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write -format=json pki/root/generate/internal \
common_name="Solo.io Root CA" organization="solo.io" ttl=187600h'
- Create an intermediate CA that is used for the relay server operations, along the pki_relay path. The key is kept internally, and a certificate signing request (CSR) is created.
kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault secrets enable -path pki_relay pki'
CSR_INPUT=$(kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write \
-format=json pki_relay/intermediate/generate/internal \
common_name="gloo-mesh-mgmt-server-ca" organization="mesh.solo.io" ttl=43800h')
CSR=$(echo $CSR_INPUT | tr '\r\n' ' ' | jq ' .data.csr' | sed 's/ /\\n/g' | sed 's/BEGIN\\nCERTIFICATE\\nREQUEST/BEGIN CERTIFICATE REQUEST/g' | sed 's/END\\nCERTIFICATE\\nREQUEST/END CERTIFICATE REQUEST/g')
- Copy the CSR value, including the double quotes. You use this value later to sign and generate the certificate.
echo $CSR
Example output:
"-----BEGIN CERTIFICATE REQUEST----- MIICtTCCAZ0CAQAwOjEVMBMGA1UEChMMbWVzaC5zb2xvLmlvMSEwHwYDVQQDExhl bnRlcnByaXNlLW5ldHdvcmtpbmctY2EwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAw ggEKAoIBAQDA7o31x9auoUNqz2LpB1GkIkHD/VT09tnenkT9ldph56ysWLL681XU KqUKkTTZxof9Evi5DrpnetXH7WQYwLXDixcm78qGXEzYi2bnAHAYuoJWdnbWSNZW FJ5VOlYJm51zTMsxk5bQ5UrkjJvX3inbYASNBUMrlRgLWsLYe0avTc/EwpDMZ9XK gHMnJb/VyFg4mHrEwTLxKVtWmBxC9AflEcg6Zm5KZPkJX2v3iF+XOQw/63RIMwAG 9MNU+pDkasOKdqtdX1HWURLf8vnHVpWvFWCxNCa6OojTpntBNH8wrpLhbnqoeKPL xyAXOdnaBTt+5DdAg4k4+1lSrUM8+bERAgMBAAGgNjA0BgkqhkiG9w0BCQ4xJzAl MCMGA1UdEQQcMBqCGGVudGVycHJpc2UtbmV0d29ya2luZy1jYTANBgkqhkiG9w0B AQsFAAOCAQEAMhpY0vrihMIpYxaFBuJFX6FRDIhoiPiYgklwOwAijfrrC/68DlRl KOG+1RsK03tCjFHNkvTpAHZ2UbOfkd54SIEbRjadroN5SubG2XQb9pg73gk7XqOP g3Koss8SEdF3RU4swWKSNCV370mpJPY8QNvjpj+nbT2W9LzmnXpU26LtTUrOfGJj wf89VlquVRRgi6KF5ewQu/c2Ov1iN0SZOSBBELqi8dIaT8ZaWgXcwtgTueLHfvQf mYrjJAoy0FmdFhMN6zYs9EfacjRXuoHoTzqSad3i5A6ofmAnGuLwlZVmSC85xUpm VQW87mr/cWLm6se36ZPmQDzGUn6yVHOWIg== -----END CERTIFICATE REQUEST-----"
- Sign and generate the certificate. Replace $CSR with the value that you copied in the previous step.
CERT_INPUT=$(kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write -format=json pki/root/sign-intermediate \
csr=$CSR format=pem_bundle ttl=43800h')
CERT=$(echo $CERT_INPUT | tr '\r\n' ' ' | jq ' .data.certificate' | sed 's/ /\\n/g' | sed 's/BEGIN\\nCERTIFICATE/BEGIN CERTIFICATE/g' | sed 's/END\\nCERTIFICATE/END CERTIFICATE/g')
- Copy the CERT value, including the double quotes. You use this value later to set the signed certificate.
echo $CERT
Example output:
"-----BEGIN CERTIFICATE----- MIIDZzCCAk+gAwIBAgIULGGnrdUSautQsOPDSYiPwdm3uUowDQYJKoZIhvcNAQEL BQAwLDEQMA4GA1UEChMHc29sby5pbzEYMBYGA1UEAxMPU29sby5pbyBSb290IENB MB4XDTIxMTEwMTE1MjQxOFoXDTIxMTIwMzE1MjQ0OFowIzEhMB8GA1UEAxMYZW50 ZXJwcmlzZS1uZXR3b3JraW5nLWNhMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIB CgKCAQEAwO6N9cfWrqFDas9i6QdRpCJBw/1U9PbZ3p5E/ZXaYeesrFiy+vNV1Cql CpE02caH/RL4uQ66Z3rVx+1kGMC1w4sXJu/KhlxM2Itm5wBwGLqCVnZ21kjWVhSe VTpWCZudc0zLMZOW0OVK5Iyb194p22AEjQVDK5UYC1rC2HtGr03PxMKQzGfVyoBz JyW/1chYOJh6xMEy8SlbVpgcQvQH5RHIOmZuSmT5CV9r94hflzkMP+t0SDMABvTD VPqQ5GrDinarXV9R1lES3/L5x1aVrxVgsTQmujqI06Z7QTR/MK6S4W56qHijy8cg FznZ2gU7fuQ3QIOJOPtZUq1DPPmxEQIDAQABo4GJMIGGMA4GA1UdDwEB/wQEAwIB BjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBR9rVq8QaXUIrRVecwMeuAbCdMa wjAfBgNVHSMEGDAWgBTn8wKJK3TBLAyJ7V29jou0CCcx0jAjBgNVHREEHDAaghhl bnRlcnByaXNlLW5ldHdvcmtpbmctY2EwDQYJKoZIhvcNAQELBQADggEBAAWuzxz8 wlvXIbGft8GX9pt/FXOZfedm1AJGP2zQELuWUk7J6p2QfEqQPKntvykCP3xlfgAH BVqWrNSv1DfV6+QTACzGG83muUtGuX0/Av6VtjHfwoFgC0y8A3XH6P3vrwHx6heI 7tkT2GbCXq5Br0Mne6uTvYskMskuwAuZyglz8XK7bKerfD8Z4w2O7Fu41t9Mlirx LireTJEKR4ggVfPITyECkBJ9TFaj0T83qupFwrw4K60xLmHs2akKxm69tofcH1ZQ Z9a96zY0x3GPVuAU1WApf64roaofH4Vbk/gzbChKhsV6vX16jzwwWrjFQvS0Rn59 hUNB9r37cXzuaJM= -----END CERTIFICATE-----"
- Set the signed certificate value. Replace $CERT with the value that you copied in the previous step.
kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write -format=json pki_relay/intermediate/set-signed certificate=$CERT'
kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write pki_relay/roles/gloo-mesh-mgmt-server-ca allow_any_name=true max_ttl="720h"'
- Get the external IP address of the LoadBalancer service for Vault.
VAULT_IP=$(kubectl get svc -n vault vault --context $MGMT_CONTEXT \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $VAULT_IP
- Create a cert-manager issuer for the CA, replacing $VAULT_IP with the external IP address that you previously retrieved.
kubectl --context ${MGMT_CONTEXT} create secret generic vault-token --from-literal=token=root -n gloo-mesh
kubectl --context ${MGMT_CONTEXT} apply -f- <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: vault-issuer
  namespace: gloo-mesh
spec:
  vault:
    path: pki_relay/sign/gloo-mesh-mgmt-server-ca
    server: http://$VAULT_IP:8200
    auth:
      tokenSecretRef:
        name: vault-token
        key: token
EOF
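Some load balancers report a hostname instead of an IP address, in which case the ip lookup for the Vault service returns an empty string. The following is a minimal sketch of a fallback, assuming the same vault service, the http scheme, and port 8200 that are used in the issuer. The function names vault_url and lookup_vault_url are hypothetical.

```shell
# Build the Vault server URL from whichever LoadBalancer field is populated.
# $1 = ingress IP (may be empty), $2 = ingress hostname (may be empty).
vault_url() {
  addr="${1:-$2}"
  if [ -z "$addr" ]; then
    echo "no LoadBalancer address assigned yet" >&2
    return 1
  fi
  echo "http://${addr}:8200"
}

# Hypothetical wrapper: needs kubectl and access to the management cluster.
# It is defined here but not called.
lookup_vault_url() {
  ip=$(kubectl get svc -n vault vault --context "$MGMT_CONTEXT" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  host=$(kubectl get svc -n vault vault --context "$MGMT_CONTEXT" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
  vault_url "$ip" "$host"
}

# Example with a documentation-range IP address and no hostname.
vault_url "203.0.113.10" ""
```

If lookup_vault_url prints a URL, that value could be used for the server field of the issuer in place of http://$VAULT_IP:8200.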
Create the management server certificates
Generate the gloo-mesh-mgmt-server certificates.
kubectl --context ${MGMT_CONTEXT} apply -f- <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: relay-server-tls
  namespace: gloo-mesh
spec:
  commonName: "gloo-mesh-mgmt-server-ca"
  dnsNames:
    - "gloo-mesh-mgmt-server-ca"
    - "gloo-mesh-mgmt-server-ca.gloo-mesh"
    - "gloo-mesh-mgmt-server-ca.gloo-mesh.svc"
    - "*.gloo-mesh"
  secretName: relay-server-tls-secret
  duration: 24h
  renewBefore: 30m
  privateKey:
    rotationPolicy: Always
    algorithm: RSA
    size: 2048
  usages:
    - digital signature
    - key encipherment
    - server auth
    - client auth
  issuerRef:
    name: vault-issuer
    kind: Issuer
    group: cert-manager.io
EOF
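The duration and renewBefore fields together determine when cert-manager requests a replacement certificate: renewal is attempted once the remaining lifetime drops to the renewBefore window. The following worked example is a sketch of that arithmetic for the values used above. The function name renew_after_minutes is hypothetical, and the result assumes that Vault issues the certificate with the full requested duration.

```shell
# Minutes after issuance at which renewal is attempted:
# the certificate duration minus the renewBefore window.
# $1 = duration in hours, $2 = renewBefore in minutes.
renew_after_minutes() {
  echo $(( $1 * 60 - $2 ))
}

# Values from the Certificate above: duration 24h, renewBefore 30m.
mins=$(renew_after_minutes 24 30)
echo "renewal starts $(( mins / 60 ))h$(( mins % 60 ))m after issuance"
```

With these values, the script prints a renewal point of 23h30m, which leaves a 30-minute window for the new certificate to be issued before the old one expires.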
Create the agent certificates
Generate a gloo-mesh-agent client certificate for each workload cluster. Repeat these steps for each workload cluster that you plan to register with Gloo Mesh.
- Configure the cert-manager installation on the workload cluster to authenticate with the Vault installation on the management cluster. The secret contains the Vault token to use for authentication.
kubectl --context ${REMOTE_CONTEXT} create namespace gloo-mesh
kubectl --context ${REMOTE_CONTEXT} create secret generic vault-token --from-literal=token=root -n gloo-mesh
kubectl --context ${REMOTE_CONTEXT} apply -f- <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: vault-issuer
  namespace: gloo-mesh
spec:
  vault:
    path: pki_relay/sign/gloo-mesh-mgmt-server-ca
    server: http://$VAULT_IP:8200
    auth:
      tokenSecretRef:
        name: vault-token
        key: token
EOF
- Create a cert-manager certificate that refers to the issuer that you set up in the previous step.
kubectl apply --context ${REMOTE_CONTEXT} -f- <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: relay-client-tls
  namespace: gloo-mesh
spec:
  commonName: "gloo-mesh-mgmt-server-ca"
  dnsNames:
    - "$REMOTE_CLUSTER"
  secretName: relay-client-tls-secret
  duration: 24h
  renewBefore: 30m
  privateKey:
    rotationPolicy: Always
    algorithm: RSA
    size: 2048
  issuerRef:
    name: vault-issuer
    kind: Issuer
    group: cert-manager.io
EOF
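Because the client certificate differs between workload clusters only in its DNS name, the manifest can be generated from the cluster name when you repeat the step for several clusters. The following is a minimal sketch that reuses the field values from the certificate step above. The function name render_client_cert and the cluster name cluster2 are hypothetical.

```shell
# Print the relay client Certificate manifest for one workload cluster.
# $1 = cluster name, used as the DNS SAN as in the step above.
render_client_cert() {
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: relay-client-tls
  namespace: gloo-mesh
spec:
  commonName: "gloo-mesh-mgmt-server-ca"
  dnsNames:
    - "$1"
  secretName: relay-client-tls-secret
  duration: 24h
  renewBefore: 30m
  privateKey:
    rotationPolicy: Always
    algorithm: RSA
    size: 2048
  issuerRef:
    name: vault-issuer
    kind: Issuer
    group: cert-manager.io
EOF
}

# Preview the manifest for a hypothetical cluster named "cluster2".
render_client_cert cluster2
```

To apply the result instead of previewing it, you might pipe it to kubectl, such as render_client_cert "$REMOTE_CLUSTER" | kubectl apply --context "$REMOTE_CONTEXT" -f -.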
Verify the cert-manager resources
For clusters that have cert-manager installed, verify that your cert-manager issuer and certificate resources are ready. If the READY column shows False for any of the following resources, describe the resource for more details and resolve the issue before continuing.
kubectl get issuer -n gloo-mesh --context $MGMT_CONTEXT
kubectl get certificates -n gloo-mesh --context $MGMT_CONTEXT
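The READY check can also be scripted, which helps when you verify both the management and workload clusters. The following is a minimal sketch that filters the table output of the kubectl get commands. The function names not_ready and check_cert_manager are hypothetical, and the sketch assumes the default table layout with NAME in the first column and READY in the second.

```shell
# Print the names of resources whose READY column is not "True".
# Reads `kubectl get issuer` or `kubectl get certificates` table output on stdin.
not_ready() {
  awk 'NF >= 2 && $1 != "NAME" && $2 != "True" { print $1 }'
}

# Hypothetical wrapper: needs kubectl and access to the given context.
# $1 = kubeconfig context name. It is defined here but not called.
check_cert_manager() {
  kubectl get issuer -n gloo-mesh --context "$1" | not_ready
  kubectl get certificates -n gloo-mesh --context "$1" | not_ready
}
```

Running check_cert_manager "$MGMT_CONTEXT" and then check_cert_manager "$REMOTE_CONTEXT" prints nothing when every resource is ready.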
Now that your custom certificates are created, continue to the next section to modify your Gloo Mesh deployment to use these certificates.
Modifying the Gloo Mesh installation Helm charts
To use the custom CAs and certificates that you create, you must modify the Gloo Mesh installation and registration Helm charts to use these values instead of the default values, such as the default certificates that are generated and managed by Gloo Mesh.
If you already installed Gloo Mesh via Helm, you can upgrade the Helm installation to use these Helm values instead.
gloo-mesh-mgmt-server Helm chart
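Install the gloo-mesh-mgmt-server Helm chart with your updated values, or upgrade an existing installation. Of the two commands that follow, the first is for a new installation and the second is for an upgrade. For more information, see the Modifying Helm chart values guide.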
helm install gloo-mgmt gloo-mesh-enterprise/gloo-mesh-enterprise \
--kube-context ${MGMT_CONTEXT} \
--namespace gloo-mesh \
--set glooMeshLicenseKey=${GLOO_MESH_LICENSE_KEY} \
--version ${GLOO_VERSION} \
--set glooMeshMgmtServer.relay.tlsSecret.name=relay-server-tls-secret \
--set glooMeshMgmtServer.relay.disableCaCertGeneration=true \
--set glooMeshMgmtServer.relay.disableCa=true
helm upgrade gloo-mgmt gloo-mesh-enterprise/gloo-mesh-enterprise \
--kube-context ${MGMT_CONTEXT} \
--namespace gloo-mesh \
--set glooMeshLicenseKey=${GLOO_MESH_LICENSE_KEY} \
--version ${GLOO_VERSION} \
--set glooMeshMgmtServer.relay.tlsSecret.name=relay-server-tls-secret \
--set glooMeshMgmtServer.relay.disableCaCertGeneration=true \
--set glooMeshMgmtServer.relay.disableCa=true
gloo-mesh-agent Helm chart
Install the gloo-mesh-agent Helm chart in each workload cluster. For more information, see the Modifying Helm chart values guide.
- Get the external IP address and gRPC port of the gloo-mesh-mgmt-server service in your management cluster.
MGMT_INGRESS_ADDRESS=$(kubectl get svc -n gloo-mesh gloo-mesh-mgmt-server --context ${MGMT_CONTEXT} -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
MGMT_INGRESS_PORT=$(kubectl -n gloo-mesh get service gloo-mesh-mgmt-server --context ${MGMT_CONTEXT} -o jsonpath='{.spec.ports[?(@.name=="grpc")].port}')
MGMT_SERVER_NETWORKING_ADDRESS=${MGMT_INGRESS_ADDRESS}:${MGMT_INGRESS_PORT}
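- Install the gloo-mesh-agent Helm chart with your updated values, or upgrade an existing installation. Of the two commands that follow, the first is for a new installation and the second is for an upgrade.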
helm install gloo-agent gloo-mesh-agent/gloo-mesh-agent \
--kube-context=${REMOTE_CONTEXT} \
--namespace gloo-mesh \
--set relay.serverAddress=${MGMT_SERVER_NETWORKING_ADDRESS} \
--version ${GLOO_VERSION}
helm upgrade gloo-agent gloo-mesh-agent/gloo-mesh-agent \
--kube-context=${REMOTE_CONTEXT} \
--namespace gloo-mesh \
--set relay.serverAddress=${MGMT_SERVER_NETWORKING_ADDRESS} \
--version ${GLOO_VERSION}
- Allow the Gloo Mesh management plane to use the relay certificates to connect to the agents. The steps vary depending on whether you are installing Gloo Mesh for the first time, or upgrading an existing installation.
First-time installation: Create a KubernetesCluster object in the management cluster for your workload cluster.
kubectl apply --context $MGMT_CONTEXT -f- <<EOF
apiVersion: admin.gloo.solo.io/v2
kind: KubernetesCluster
metadata:
  name: ${REMOTE_CLUSTER}
  namespace: gloo-mesh
spec:
  clusterDomain: cluster.local
EOF
Existing installation: Reload the gloo-mesh-mgmt-server and gloo-mesh-agent pods to pick up the new certificates. Restarting the Gloo Mesh pods does not impact your running apps. However, you cannot change the configuration of Gloo Mesh resources, such as to control traffic policies, until the pods are healthy again.
  - Get the name of the gloo-mesh-mgmt-server pod in your management cluster.
kubectl get pods -n gloo-mesh --context $MGMT_CONTEXT
  - Restart the gloo-mesh-mgmt-server pod.
kubectl delete pod -n gloo-mesh --context $MGMT_CONTEXT <gloo-mesh-mgmt-server-pod>
  - Get the name of the gloo-mesh-agent pod in your workload cluster.
kubectl get pods -n gloo-mesh --context $REMOTE_CONTEXT
  - Restart the gloo-mesh-agent pod.
kubectl delete pod -n gloo-mesh --context $REMOTE_CONTEXT <gloo-mesh-agent-pod>
- Repeat these steps for each workload cluster.
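The value that you pass to relay.serverAddress must be a single host and port. An address with a doubled port makes the agent fail to dial the management server. The following is a minimal sketch of a check that you could run on MGMT_SERVER_NETWORKING_ADDRESS before the Helm installation. The function name is_host_port is hypothetical, and IPv6 literals are not handled.

```shell
# Return 0 if $1 looks like host:port with exactly one colon and a numeric port.
# Assumption: IPv4 addresses or hostnames only; IPv6 literals are not handled.
is_host_port() {
  host="${1%%:*}"   # everything before the first colon
  port="${1#*:}"    # everything after the first colon
  [ "$host" != "$1" ] || return 1   # no colon at all
  [ -n "$host" ] || return 1        # empty host
  case "$port" in
    "" | *[!0-9]*) return 1 ;;      # empty, non-numeric, or contains another colon
    *) return 0 ;;
  esac
}

# Example values: a well-formed address, then one with a doubled port.
is_host_port "34.145.184.106:9900" && echo "address looks valid"
is_host_port "34.145.184.106:9900:9900" || echo "address is malformed"
```

In practice, you might guard the installation with is_host_port "$MGMT_SERVER_NETWORKING_ADDRESS" and stop if it fails.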
Verifying your relay certificate setup
- Check that the relay connection between the management server and workload agents is healthy.
  - Forward port 9091 of the gloo-mesh-mgmt-server pod to your localhost.
kubectl port-forward -n gloo-mesh --context $MGMT_CONTEXT deploy/gloo-mesh-mgmt-server 9091
  - In your browser, connect to http://localhost:9091/metrics.
  - In the metrics UI, look for the following lines. If the values are 1, the agents in the workload clusters are successfully registered with the management server. If the values are 0, the agents are not successfully connected.
relay_pull_clients_connected{cluster="cluster1"} 1
relay_pull_clients_connected{cluster="cluster2"} 1
# HELP relay_push_clients_connected Current number of connected Relay push clients (Relay Agents).
# TYPE relay_push_clients_connected gauge
relay_push_clients_connected{cluster="cluster1"} 1
relay_push_clients_connected{cluster="cluster2"} 1
- Review the Gloo Mesh UI. Check that the Overall Mesh Status is healthy and that your remote clusters are registered without any configuration issues.
meshctl dashboard --kubecontext $MGMT_CONTEXT
- If the setup is unsuccessful, continue to Troubleshooting.
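Instead of reading the metrics page in a browser, you can filter the same output on the command line. The following is a minimal sketch that lists the clusters whose relay_push_clients_connected value is not 1. The function name disconnected_agents is hypothetical, and the sketch assumes the metric lines have the format shown in the metrics example above.

```shell
# Print the cluster label of every relay_push_clients_connected sample
# whose value is not 1. Reads Prometheus text-format metrics on stdin.
disconnected_agents() {
  awk '/^relay_push_clients_connected[{]/ && $NF != 1 {
    match($0, /cluster="[^"]*"/)
    # Skip the 9 characters of cluster=" and drop the closing quote.
    print substr($0, RSTART + 9, RLENGTH - 10)
  }'
}

# Example with made-up metric lines: cluster2 is reported as disconnected.
printf '%s\n' \
  'relay_push_clients_connected{cluster="cluster1"} 1' \
  'relay_push_clients_connected{cluster="cluster2"} 0' \
  | disconnected_agents
```

With the port-forward from the earlier step still running, you might run curl -s http://localhost:9091/metrics | disconnected_agents. No output means that every listed agent reports as connected.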
Troubleshooting relay certificates
- Review the health of your Gloo Mesh pods in the management and remote clusters.
  - Check that the gloo-mesh-mgmt-server and gloo-mesh-agent pods are running.
kubectl get pods -n gloo-mesh --context ${MGMT_CONTEXT}
kubectl get pods -n gloo-mesh --context ${REMOTE_CONTEXT}
  - If the pods are not running, describe the pods and check the State and Last State sections for error messages and reasons why the pod might not be healthy. For example, the following error messages in the gloo-mesh-mgmt-server and gloo-mesh-agent pods indicate that the secret is misnamed or missing. Check the secrets and names, upgrade your Helm installation, and try again.
    - Example error message for gloo-mesh-mgmt-server pod:
Message: 3 errors occurred:
* no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
* could not find forwarding server token: no token secret found: Timeout: failed waiting for *v1.Secret Informer to sync
* no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
    - Example error message for gloo-mesh-agent pod:
Events:
Type     Reason       Age                   From     Message
----     ------       ----                  ----     -------
Normal   Created      84m (x25 over 3h28m)  kubelet  Created container gloo-mesh-agent
Normal   Pulled       84m (x24 over 3h28m)  kubelet  Container image "gcr.io/gloo-mesh/gloo-mesh-agent:1.2.3" already present on machine
Normal   Started      84m (x25 over 3h28m)  kubelet  Started container gloo-mesh-agent
Warning  FailedMount  84m                   kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:47314->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
Warning  FailedMount  84m                   kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:57262->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
Warning  FailedMount  83m                   kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": http2: client connection force closed via ClientConn.Close, failed to sync configmap cache: timed out waiting for the condition]
Warning  BackOff      72s (x522 over 3h28m) kubelet  Back-off restarting failed container
- Check the Kubernetes logs for the gloo-mesh-mgmt-server and gloo-mesh-agent pods in each cluster for errors. Look for errors during the grpc connection.
  - For example, the following error message indicates that the gloo-mesh-mgmt-server load balancer IP address was set incorrectly for the agent during the Helm installation.
{"level":"warn","ts":"2021-11-02T19:56:42.197Z","caller":"zap/grpclogger.go:85","msg":"[core]grpcaddrConn.createTransport failed to connect to {34.145.184.106:9900:9900 gloo-mesh-mgmt-server.gloo-mesh <nil> <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp: address 34.145.184.106:9900:9900 too many colons in address\".
  - The following gloo-mesh-agent pod error indicates that the relay-root-tls-secret secret is missing in the remote cluster. Follow the steps in the ca.crt section under Known issues.
{"level":"fatal","ts":1640102555.6522746,"msg":"secrets \"relay-root-tls-secret\" not found","version":"1.3.0-beta6","stacktrace":"runtime.main\n\t/usr/local/go/src/runtime/proc.go:255"}
- The following errors indicate that the server or client TLS certificate is expired. Regenerate the certificate, restart the pods, and try again.
{"level":"error","ts":1650047047.6682806,"logger":"translator.reconcile-42","caller":"translator/reconciler.go:195","msg":"translation for parent object failed","parent":"istio-ingressgateway-istio-system-cluster1~gloo-mesh~cluster1~internal.gloo.solo.io/v2, Kind=DiscoveredGateway","err":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.","errVerbose":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.\n\ttranslator.(*translator).TranslateOutputs.func1:/src/pkg/translator/translator.go:163\n\ttranslator.(*translator).translateParallel:/src/pkg/translator/translator.go:189\n\tsets.(*discoveredGatewaySet).UnsortedList:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:999\n\tsets.(*resourceSet).UnsortedList:/go/pkg/mod/github.com/solo-io/skv2@v0.22.11/contrib/pkg/sets/sets.go:118\n\tsets.(*discoveredGatewaySet).UnsortedList.func1:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:994\n\ttranslator.(*translator).translateParallel.func1:/src/pkg/translator/translator.go:191\n\ttranslator.getValidEastWestIngressGateway:/src/pkg/translator/translator.go:426","stacktrace":"github.com/solo-io/gloo-mesh-enterprise/pkg/translator.(*reconciler).reconcilePrimary.func1\n\t/src/pkg/translator/reconciler.go:195\ngithub.com/solo-io/gloo-mesh-enterprise/pkg/utils/syncutils.(*workQueue).Execute.func1\n\t/src/pkg/utils/syncutils/parallel.go:52"}
{"level":"info","ts":1650046690.815508,"caller":"grpclog/grpclog.go:37","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00111c9d0, {TRANSIENT_FAILURE connection error: desc = \"transport: authentication handshake failed: x509: certificate has expired or is not yet valid: current time 2022-04-15T18:18:10Z is after 2022-04-15T14:28:30Z\"}","system":"grpc","grpc_log":true}
- For gloo-mesh-agent pods, make sure that the cluster name matches the registered cluster name.
  - Check the KubernetesCluster resources in the management cluster to get registered cluster names.
kubectl get kubernetesclusters --context $MGMT_CONTEXT
- Check that the registered cluster name matches the name in the client certificate that is issued by the root CA, specifically the DNS SAN extension.
- If the cluster names do not match, update the KubernetesCluster to have the same name, or re-issue the client certificate with the same name.
- If you still have issues, review the Known issues.
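The comparison between the registered cluster name and the DNS SAN of the client certificate can be scripted. The following is a minimal sketch. The function names san_has_dns and client_cert_sans are hypothetical, the wrapper assumes that openssl and a base64 command that accepts -d are available locally, and the secret name relay-client-tls-secret is the one used in the agent certificate step.

```shell
# Return 0 if the DNS SAN list contains the cluster name as an exact entry.
# $1 = SAN line as printed by openssl, for example "DNS:cluster1, DNS:other"
# $2 = registered cluster name
san_has_dns() {
  echo "$1" | tr ',' '\n' | sed 's/^ *//; s/ *$//' | grep -qxF "DNS:$2"
}

# Hypothetical wrapper: print the SAN line of the issued client certificate.
# Needs kubectl, base64, and openssl. It is defined here but not called.
client_cert_sans() {
  kubectl get secret relay-client-tls-secret -n gloo-mesh --context "$REMOTE_CONTEXT" \
    -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -text \
    | grep -A1 'Subject Alternative Name' | tail -n 1
}

# Example with a made-up SAN line.
san_has_dns "DNS:cluster1, DNS:other" "cluster1" && echo "SAN matches cluster name"
```

Against a live cluster, you might run san_has_dns "$(client_cert_sans)" "$REMOTE_CLUSTER" and re-issue the certificate or rename the KubernetesCluster if the check fails.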
Known issues
ca.crt
Although the ca.crt is included in the gloo-mesh-agent certificate secret, the gloo-mesh-agent still expects it to exist separately in the remote cluster as the relay-root-tls-secret secret. The following commands read the agent certificate secret, remove the key and certificate so that only ca.crt remains, and apply the result as relay-root-tls-secret in the same remote cluster. Make sure to set CLUSTER_NAME and CLUSTER_CONTEXT to your remote cluster name and context.
CLUSTER_NAME=$REMOTE_CLUSTER
CLUSTER_CONTEXT=$REMOTE_CONTEXT
kubectl get secret gloo-mesh-agent-$CLUSTER_NAME-tls-cert \
--namespace gloo-mesh \
--output json \
--context $CLUSTER_CONTEXT \
| jq 'del(.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid,.data."tls.key",.data."tls.crt",.metadata.annotations)' \
| sed "s/gloo-mesh-agent-$CLUSTER_NAME-tls-cert/relay-root-tls-secret/" \
| kubectl apply --context $CLUSTER_CONTEXT -f -