About this approach

Vault is a popular open source secret management tool that you can use to set up a secure public key infrastructure (PKI) and manage TLS certificates. In this setup, you install a Vault instance in the Gloo management cluster and use that instance to generate the root and intermediate CA certificates. The intermediate CA certificate is used to sign and issue the server and client TLS certificates for the management server and agents. To manage the lifecycle of the server and client certificates, you also install cert-manager, a Kubernetes controller that automates obtaining and renewing certificates from various PKI providers, such as AWS Private CA, Google Cloud CA, or Vault.

With this approach, you get the following benefits:

  • Secure storage of root and intermediate CA certificates and keys.
  • Automatic issuance and renewal of server and client TLS certificates with cert-manager.
  • The option to reuse this architecture for other certificates, such as the Istio root and intermediate CA certificates.

Architecture overview

The following figure depicts an example architecture for using cert-manager and Vault to set up the relay certificates for multiple clusters.

Figure: Example Vault certificate setup for multiple clusters.
  1. After installing cert-manager and Vault in your management cluster, you set up a root of trust for the CA chain.
  2. Next, you create an intermediate CA that is used to sign the relay server and client certificates.
  3. After you create the relay server and client certificates in your clusters, the gloo-mesh-mgmt-server and gloo-mesh-agent deployments use the signed certificates to secure their gRPC communication with mutual TLS (mTLS).

Before you begin

Save the kubeconfig contexts for your clusters. Run kubectl config get-contexts, look for your cluster in the CLUSTER column, and get the context name in the NAME column. Note: Do not use context names with underscores. The generated certificate that connects workload clusters to the management cluster uses the context name as a SAN specification, and underscores in SAN are not FQDN compliant. You can rename a context by running kubectl config rename-context "<oldcontext>" <newcontext>.
  export MGMT_CLUSTER=<mgmt-cluster-name>
export REMOTE_CLUSTER=<remote-cluster-name>
export MGMT_CONTEXT=<management-cluster-context>
export REMOTE_CONTEXT=<remote-cluster-context>
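
The underscore restriction described above can be checked before you export the values. The following is a minimal sketch, not part of kubectl or Gloo; the function name context_name_ok is hypothetical.

```shell
# Hypothetical helper: succeeds only when the context name contains no underscore.
context_name_ok() {
  case "$1" in
    *_*) return 1 ;;
    *)   return 0 ;;
  esac
}

# Example usage (assumes kubectl is configured on your workstation):
# for ctx in $(kubectl config get-contexts -o name); do
#   context_name_ok "$ctx" || echo "rename this context: $ctx"
# done
```

Any context that the loop reports can be renamed with kubectl config rename-context before you continue.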
  

Step 1: Install cert-manager

  1. In your management cluster, install cert-manager. For more information about installation options and versions, see the cert-manager documentation.

    • kubectl installation:
        kubectl apply --context $MGMT_CONTEXT -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.yaml
        
    • Helm installation:
        helm repo add jetstack https://charts.jetstack.io
      helm repo update
      helm install \
        cert-manager jetstack/cert-manager \
        --namespace cert-manager \
        --create-namespace \
        --version v1.5.4 \
        --set installCRDs=true
        
  2. Verify that cert-manager was successfully installed.

      kubectl get pod -n cert-manager --context $MGMT_CONTEXT
      

    Example output:

      NAME                                       READY   STATUS    RESTARTS   AGE
    cert-manager-7c6f78c46d-247br              1/1     Running   0          17s
    cert-manager-cainjector-668d9c86df-7cqb8   1/1     Running   0          17s
    cert-manager-webhook-764b556954-2m4zf      1/1     Running   0          17s
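
Instead of polling the pod list by hand, you can wait for the rollouts to finish. This is a sketch under assumptions: the deployment names are inferred from the pod names in the example output, the function name wait_for_cert_manager is hypothetical, and the 120-second timeout is arbitrary.

```shell
# Wait for each cert-manager deployment to finish rolling out.
# Deployment names are inferred from the pod names in the example output.
wait_for_cert_manager() {
  ctx="$1"
  for deploy in cert-manager cert-manager-cainjector cert-manager-webhook; do
    kubectl rollout status "deployment/$deploy" \
      -n cert-manager --context "$ctx" --timeout=120s || return 1
  done
}

# Example usage:
# wait_for_cert_manager "$MGMT_CONTEXT"
```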
      

Step 2: Set up Vault and generate the root and intermediate CAs

Create and securely store the relay root CA in HashiCorp Vault. Although this Vault setup is more secure than the default self-signed setup, the example runs Vault in dev mode (server.dev.enabled=true), so you might need a different setup to meet your production security requirements.

  1. If it doesn’t already exist, create the gloo-mesh namespace.
      kubectl create namespace gloo-mesh --context $MGMT_CONTEXT
      
  2. If not added already, add and update the HashiCorp Helm repository in your management cluster.
      helm repo add hashicorp https://helm.releases.hashicorp.com
    helm repo update
      
  3. Install Vault in your management cluster.
      helm install vault hashicorp/vault -n vault \
    --kube-context ${MGMT_CONTEXT} \
    --set "injector.enabled=false" \
    --set "server.dev.enabled=true" \
    --set "server.service.type=LoadBalancer" \
    --create-namespace
      
  4. Enable the PKI secrets engine for the root CA at the pki path.
      kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault secrets enable pki'
      
  5. Set up the root of trust. The following example uses solo.io, but replace the common name and organization with your own values.
      kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write -format=json pki/root/generate/internal \
    common_name="Solo.io Root CA" organization="solo.io"  ttl=187600h'
      
  6. Create an intermediate CA that is used for the relay server operations, along the pki_relay path. The key is kept internally, and a certificate signing request is created.
      kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault secrets enable -path pki_relay pki'
    
    CSR_INPUT=$(kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write \
    -format=json pki_relay/intermediate/generate/internal \
    common_name="gloo-mesh-mgmt-server-ca" organization="mesh.solo.io" ttl=43800h')
    
    CSR=$(echo $CSR_INPUT | tr '\r\n' ' ' | jq ' .data.csr' | sed 's/ /\\n/g' | sed 's/BEGIN\\nCERTIFICATE\\nREQUEST/BEGIN CERTIFICATE REQUEST/g' |sed 's/END\\nCERTIFICATE\\nREQUEST/END CERTIFICATE REQUEST/g')
      
  7. Copy the CSR value, including the double quotes. You use this value later to sign and generate the certificate.
      echo $CSR
      
    Example output:
      "-----BEGIN CERTIFICATE REQUEST-----                  
    MIICtTCCAZ0CAQAwOjEVMBMGA1UEChMMbWVzaC5zb2xvLmlvMSEwHwYDVQQDExhl
    bnRlcnByaXNlLW5ldHdvcmtpbmctY2EwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAw
    ggEKAoIBAQDA7o31x9auoUNqz2LpB1GkIkHD/VT09tnenkT9ldph56ysWLL681XU
    KqUKkTTZxof9Evi5DrpnetXH7WQYwLXDixcm78qGXEzYi2bnAHAYuoJWdnbWSNZW
    FJ5VOlYJm51zTMsxk5bQ5UrkjJvX3inbYASNBUMrlRgLWsLYe0avTc/EwpDMZ9XK
    gHMnJb/VyFg4mHrEwTLxKVtWmBxC9AflEcg6Zm5KZPkJX2v3iF+XOQw/63RIMwAG
    9MNU+pDkasOKdqtdX1HWURLf8vnHVpWvFWCxNCa6OojTpntBNH8wrpLhbnqoeKPL
    xyAXOdnaBTt+5DdAg4k4+1lSrUM8+bERAgMBAAGgNjA0BgkqhkiG9w0BCQ4xJzAl
    MCMGA1UdEQQcMBqCGGVudGVycHJpc2UtbmV0d29ya2luZy1jYTANBgkqhkiG9w0B
    AQsFAAOCAQEAMhpY0vrihMIpYxaFBuJFX6FRDIhoiPiYgklwOwAijfrrC/68DlRl
    KOG+1RsK03tCjFHNkvTpAHZ2UbOfkd54SIEbRjadroN5SubG2XQb9pg73gk7XqOP
    g3Koss8SEdF3RU4swWKSNCV370mpJPY8QNvjpj+nbT2W9LzmnXpU26LtTUrOfGJj
    wf89VlquVRRgi6KF5ewQu/c2Ov1iN0SZOSBBELqi8dIaT8ZaWgXcwtgTueLHfvQf
    mYrjJAoy0FmdFhMN6zYs9EfacjRXuoHoTzqSad3i5A6ofmAnGuLwlZVmSC85xUpm
    VQW87mr/cWLm6se36ZPmQDzGUn6yVHOWIg==                            
    -----END CERTIFICATE REQUEST-----"
      
  8. Sign and generate the certificate. Replace $CSR with the value that you copied in the previous step.
      CERT_INPUT=$(kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write -format=json pki/root/sign-intermediate \
    csr=$CSR format=pem_bundle ttl=43800h')
    
    CERT=$(echo $CERT_INPUT | tr '\r\n' ' ' | jq ' .data.certificate' | sed 's/ /\\n/g' | sed 's/BEGIN\\nCERTIFICATE/BEGIN CERTIFICATE/g' |sed 's/END\\nCERTIFICATE/END CERTIFICATE/g')
      
  9. Copy the CERT value, including the double quotes. You use this value later to set the signed certificate.
      echo $CERT
      
    Example output:
      "-----BEGIN CERTIFICATE-----
    MIIDZzCCAk+gAwIBAgIULGGnrdUSautQsOPDSYiPwdm3uUowDQYJKoZIhvcNAQEL
    BQAwLDEQMA4GA1UEChMHc29sby5pbzEYMBYGA1UEAxMPU29sby5pbyBSb290IENB
    MB4XDTIxMTEwMTE1MjQxOFoXDTIxMTIwMzE1MjQ0OFowIzEhMB8GA1UEAxMYZW50
    ZXJwcmlzZS1uZXR3b3JraW5nLWNhMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIB
    CgKCAQEAwO6N9cfWrqFDas9i6QdRpCJBw/1U9PbZ3p5E/ZXaYeesrFiy+vNV1Cql
    CpE02caH/RL4uQ66Z3rVx+1kGMC1w4sXJu/KhlxM2Itm5wBwGLqCVnZ21kjWVhSe
    VTpWCZudc0zLMZOW0OVK5Iyb194p22AEjQVDK5UYC1rC2HtGr03PxMKQzGfVyoBz
    JyW/1chYOJh6xMEy8SlbVpgcQvQH5RHIOmZuSmT5CV9r94hflzkMP+t0SDMABvTD
    VPqQ5GrDinarXV9R1lES3/L5x1aVrxVgsTQmujqI06Z7QTR/MK6S4W56qHijy8cg
    FznZ2gU7fuQ3QIOJOPtZUq1DPPmxEQIDAQABo4GJMIGGMA4GA1UdDwEB/wQEAwIB
    BjAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBR9rVq8QaXUIrRVecwMeuAbCdMa
    wjAfBgNVHSMEGDAWgBTn8wKJK3TBLAyJ7V29jou0CCcx0jAjBgNVHREEHDAaghhl
    bnRlcnByaXNlLW5ldHdvcmtpbmctY2EwDQYJKoZIhvcNAQELBQADggEBAAWuzxz8
    wlvXIbGft8GX9pt/FXOZfedm1AJGP2zQELuWUk7J6p2QfEqQPKntvykCP3xlfgAH
    BVqWrNSv1DfV6+QTACzGG83muUtGuX0/Av6VtjHfwoFgC0y8A3XH6P3vrwHx6heI
    7tkT2GbCXq5Br0Mne6uTvYskMskuwAuZyglz8XK7bKerfD8Z4w2O7Fu41t9Mlirx
    LireTJEKR4ggVfPITyECkBJ9TFaj0T83qupFwrw4K60xLmHs2akKxm69tofcH1ZQ
    Z9a96zY0x3GPVuAU1WApf64roaofH4Vbk/gzbChKhsV6vX16jzwwWrjFQvS0Rn59
    hUNB9r37cXzuaJM=
    -----END CERTIFICATE-----"
      
  10. Set the signed certificate value. Replace $CERT with the value that you copied in the previous step.
      kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write -format=json pki_relay/intermediate/set-signed certificate=$CERT'
    
    kubectl --context ${MGMT_CONTEXT} exec -n vault vault-0 -- /bin/sh -c 'vault write pki_relay/roles/gloo-mesh-mgmt-server-ca allow_any_name=true max_ttl="720h"'
      
  11. Get the external IP address or hostname of the LoadBalancer service for Vault.
      VAULT_IP=$(kubectl get svc -n vault vault --context $MGMT_CONTEXT \
    -o jsonpath="{.status.loadBalancer.ingress[0]['hostname','ip']}")
    echo $VAULT_IP
      
  12. Create a cert-manager issuer for the CA, replacing $VAULT_IP with the external IP address that you previously retrieved.
      kubectl --context ${MGMT_CONTEXT} create secret generic vault-token --from-literal=token=root -n gloo-mesh
    
    kubectl --context ${MGMT_CONTEXT} apply -f- <<EOF
    apiVersion: cert-manager.io/v1
    kind: Issuer
    metadata:
      name: vault-issuer
      namespace: gloo-mesh
    spec:
      vault:
        path: pki_relay/sign/gloo-mesh-mgmt-server-ca
        server: http://$VAULT_IP:8200
        auth:
          tokenSecretRef:
            name: vault-token
            key: token
    EOF
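
The issuers in this guide reach Vault at http://$VAULT_IP:8200, so it can help to confirm that the address responds before cert-manager tries to use it. The sketch below relies on Vault's /v1/sys/health HTTP endpoint; the function names vault_health_url and check_vault are hypothetical, and the check assumes that curl is installed on your workstation.

```shell
# Build the health-check URL for a Vault server that listens on port 8200.
vault_health_url() {
  printf 'http://%s:8200/v1/sys/health\n' "$1"
}

# Query the health endpoint and print the JSON response, if any.
check_vault() {
  curl -s --max-time 5 "$(vault_health_url "$1")"
}

# Example usage:
# check_vault "$VAULT_IP"
```

An empty response or a timeout suggests that the LoadBalancer address is not reachable from where you ran the command.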
      

Step 3: Create the server TLS certificate for the management server

Generate the server TLS certificate that the Gloo management server uses for mutual TLS connections with Gloo agents.

  kubectl --context ${MGMT_CONTEXT} apply -f- <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: relay-server-tls
  namespace: gloo-mesh
spec:
  commonName: "gloo-mesh-mgmt-server-ca"
  dnsNames:
    - "gloo-mesh-mgmt-server-ca"
    - "gloo-mesh-mgmt-server-ca.gloo-mesh"
    - "gloo-mesh-mgmt-server-ca.gloo-mesh.svc"
    - "*.gloo-mesh"
  secretName: relay-server-tls-secret
  duration: 24h
  renewBefore: 30m
  privateKey:
    rotationPolicy: Always
    algorithm: RSA
    size: 2048
  usages:
    - digital signature
    - key encipherment
    - server auth
    - client auth
  issuerRef:
    name: vault-issuer
    kind: Issuer
    group: cert-manager.io
EOF
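
After cert-manager issues the certificate, you can inspect what was written to the secret. This is a sketch that assumes openssl and base64 are available locally; the function name show_cert is hypothetical.

```shell
# Print the subject, issuer, and validity dates of the certificate in a TLS secret.
show_cert() {
  ctx="$1"
  secret="$2"
  kubectl get secret "$secret" -n gloo-mesh --context "$ctx" \
    -o jsonpath='{.data.tls\.crt}' \
    | base64 --decode \
    | openssl x509 -noout -subject -issuer -dates
}

# Example usage:
# show_cert "$MGMT_CONTEXT" relay-server-tls-secret
```

With the duration of 24h in the Certificate resource, the notAfter date should be roughly one day after the notBefore date, and the issuer should be the gloo-mesh-mgmt-server-ca intermediate CA.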
  

Step 4: Create the client TLS certificate for the Gloo agent

In each workload cluster, generate a client TLS certificate for the Gloo agent.

  1. Configure the cert-manager installation on the workload cluster to authenticate with the Vault installation on the management cluster. The secret contains the Vault token to use for authentication.

      kubectl --context ${REMOTE_CONTEXT} create namespace gloo-mesh
    
    kubectl --context ${REMOTE_CONTEXT} create secret generic vault-token --from-literal=token=root -n gloo-mesh
    
    kubectl --context ${REMOTE_CONTEXT} apply -f- <<EOF
    apiVersion: cert-manager.io/v1
    kind: Issuer
    metadata:
      name: vault-issuer
      namespace: gloo-mesh
    spec:
      vault:
        path: pki_relay/sign/gloo-mesh-mgmt-server-ca
        server: http://$VAULT_IP:8200
        auth:
          tokenSecretRef:
            name: vault-token
            key: token
    EOF
      
  2. Create a cert-manager certificate that refers to the issuer that you set up in the previous step.

      kubectl apply --context ${REMOTE_CONTEXT} -f- <<EOF
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: relay-client-tls
      namespace: gloo-mesh
    spec:
      commonName: "gloo-mesh-mgmt-server-ca"
      dnsNames:
        - "$REMOTE_CLUSTER"
      secretName: relay-client-tls-secret
      duration: 24h
      renewBefore: 30m
      privateKey:
        rotationPolicy: Always
        algorithm: RSA
        size: 2048
      issuerRef:
        name: vault-issuer
        kind: Issuer
        group: cert-manager.io
    EOF
      

Verify the cert-manager resources

For clusters that have cert-manager installed, verify that your cert-manager issuer and certificate resources are ready. If the READY column says False for any of the following resources, describe the resource for more details and resolve the issue before continuing.

  kubectl get issuer -n gloo-mesh --context $MGMT_CONTEXT
kubectl get certificates -n gloo-mesh --context $MGMT_CONTEXT
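
The commands above query only the management cluster. As a sketch, the same readiness check can be scripted for any cluster with kubectl wait, which blocks until the Certificate reports the Ready condition. The function name wait_for_certificate is hypothetical and the timeout is arbitrary.

```shell
# Block until the named cert-manager Certificate in gloo-mesh is Ready.
wait_for_certificate() {
  ctx="$1"
  name="$2"
  kubectl wait --for=condition=Ready "certificate/$name" \
    -n gloo-mesh --context "$ctx" --timeout=120s
}

# Example usage:
# wait_for_certificate "$MGMT_CONTEXT" relay-server-tls
# wait_for_certificate "$REMOTE_CONTEXT" relay-client-tls
```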
  

Now that your custom certificates are created, continue to the next section to modify your Gloo Mesh deployment to use these certificates.

Step 5: Install the Gloo management server and agent

Set up Gloo Mesh Core to use the client and server TLS certificates that you created earlier.

  1. Prepare the Helm installation settings for the Gloo management server.

    glooMgmtServer:
      relay:
        disableCa: true
        disableCaCertGeneration: true
        disableTokenGeneration: true
        # Secret containing server TLS certs used to secure the management server.
        tlsSecret:
          name: relay-server-tls-secret
      
  2. Install a new or upgrade an existing Gloo management server with the Helm settings from the previous step.

  3. Prepare the Helm installation settings for the Gloo agent.

      
    glooAgent:
      relay:
        # gloo-mesh-mgmt-server IP address
        serverAddress: $MGMT_SERVER_NETWORKING_ADDRESS
        # Custom certs: Secret containing client TLS certs used to identify the Gloo agent to the management server. If you do not specify a clientTlsSecret, you must specify a tokenSecret and a rootTlsSecret.
        clientTlsSecret:
          name: relay-client-tls-secret
        tokenSecret:
          # Key value of the data within the Kubernetes secret.
          key: token
          # Name of the Kubernetes secret.
          name: relay-identity-token-secret
          # Namespace of the Kubernetes secret.
          namespace: ""
      
  4. Register the workload cluster or upgrade an existing Gloo agent with the Helm settings from the previous step.
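
The serverAddress value must contain the port exactly once. The troubleshooting section shows the "too many colons" error that results when the port is appended twice. The following sketch guards against that mistake. It assumes port 9900, as seen in the example log in the troubleshooting section, and it does not handle IPv6 literals. The function name relay_address is hypothetical.

```shell
# Append the relay port only when the input does not already include a port.
# Assumes port 9900 and IPv4 addresses or hostnames.
relay_address() {
  case "$1" in
    *:*) printf '%s\n' "$1" ;;
    *)   printf '%s:9900\n' "$1" ;;
  esac
}

# Example usage:
# MGMT_SERVER_NETWORKING_ADDRESS=$(relay_address "<management-server-address>")
```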

Verifying your relay certificate setup

  1. Check that the relay connection between the management server and workload agents is healthy.
    1. Forward port 9091 of the gloo-mesh-mgmt-server pod to your localhost.
        kubectl port-forward -n gloo-mesh --context $MGMT_CONTEXT deploy/gloo-mesh-mgmt-server 9091
        
    2. In your browser, connect to http://localhost:9091/metrics.
    3. In the metrics UI, look for the following lines. If the values are 1, the agents in the workload clusters are successfully registered with the management server. If the values are 0, the agents are not successfully connected.
        relay_pull_clients_connected{cluster="cluster1"} 1
      relay_pull_clients_connected{cluster="cluster2"} 1
      # HELP relay_push_clients_connected Current number of connected Relay push clients (Relay Agents).
      # TYPE relay_push_clients_connected gauge
      relay_push_clients_connected{cluster="cluster1"} 1
      relay_push_clients_connected{cluster="cluster2"} 1
        
  2. Review the Gloo UI. Check that the Overall Mesh Status is healthy and that your remote clusters are registered without any configuration issues.
      meshctl dashboard --kubecontext $MGMT_CONTEXT
      
  3. If the setup is unsuccessful, continue to Troubleshooting.
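
If you prefer the command line to a browser, the metrics check can be scripted. This sketch reads the metrics text on standard input and prints each cluster whose relay_pull_clients_connected or relay_push_clients_connected value is not 1. The function name disconnected_clusters is hypothetical.

```shell
# Read Prometheus text-format metrics on stdin and print the clusters whose
# relay pull or push client gauge is not 1.
disconnected_clusters() {
  awk '
    /^relay_(pull|push)_clients_connected\{/ && $NF != 1 {
      match($0, /cluster="[^"]*"/)
      # Skip the 9 characters of cluster=" and drop the closing quote.
      print substr($0, RSTART + 9, RLENGTH - 10)
    }
  ' | sort -u
}

# Example usage, with the port-forward from the first step still running:
# curl -s http://localhost:9091/metrics | disconnected_clusters
```

No output means that every listed cluster reports a connected pull and push client.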

Troubleshooting relay certificates

  1. Review the health of your Gloo pods in the management and remote clusters.

    1. Check that the gloo-mesh-mgmt-server and gloo-mesh-agent pods are running.

        kubectl get pods -n gloo-mesh --context ${MGMT_CONTEXT}
      kubectl get pods -n gloo-mesh --context ${REMOTE_CONTEXT}
        
    2. If the pods are not running, describe the pods and check the State and Last State sections for error messages and reasons why the pod might not be healthy. For example, the following error messages in the gloo-mesh-mgmt-server and gloo-mesh-agent pods indicate that the secret is misnamed or missing. Check the secrets and names, upgrade your Helm installation, and try again.

      • Example error message for gloo-mesh-mgmt-server pod:
        Message:   3 errors occurred:
          * no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
          * could not find forwarding server token: no token secret found: Timeout: failed waiting for *v1.Secret Informer to sync
          * no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
        
      • Example error message for gloo-mesh-agent pod:
        Events:
        Type     Reason       Age                    From     Message
        ----     ------       ----                   ----     -------
        Normal   Created      84m (x25 over 3h28m)   kubelet  Created container gloo-mesh-agent
        Normal   Pulled       84m (x24 over 3h28m)   kubelet  Container image "gcr.io/gloo-mesh/gloo-mesh-agent:1.2.3" already present on machine
        Normal   Started      84m (x25 over 3h28m)   kubelet  Started container gloo-mesh-agent
        Warning  FailedMount  84m                    kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:47314->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
        Warning  FailedMount  84m                    kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:57262->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
        Warning  FailedMount  83m                    kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": http2: client connection force closed via ClientConn.Close, failed to sync configmap cache: timed out waiting for the condition]
        Warning  BackOff      72s (x522 over 3h28m)  kubelet  Back-off restarting failed container
        
  2. Check the Kubernetes logs for the gloo-mesh-mgmt-server and gloo-mesh-agent pods in each cluster for errors. Look for errors during the gRPC connection.

    • For example, the following error message indicates that the gloo-mesh-mgmt-server load balancer IP address was set incorrectly for the agent during the Helm installation.
      {"level":"warn","ts":"2021-11-02T19:56:42.197Z","caller":"zap/
    grpclogger.go:85","msg":"[core]grpcaddrConn.createTransport
    failed to connect to {34.145.184.106:9900:9900 
    gloo-mesh-mgmt-server.gloo-mesh <nil> <nil>}. Err: connection 
    error: desc = \"transport: Error while dialing dial tcp: address 
    34.145.184.106:9900:9900 too many colons in address\".
      
    • The following gloo-mesh-agent pod error indicates that the relay-root-tls-secret secret is missing in the remote cluster. Follow the steps in the ca.crt known issue.
      {"level":"fatal","ts":1640102555.6522746,"msg":"secrets \"relay-root-tls-secret\" not found","version":"1.3.0-beta6","stacktrace":"runtime.main\n\t/usr/local/go/src/runtime/proc.go:255"}
      
    • The following errors indicate that the server or client TLS certificate is expired. Regenerate the certificate, restart the pods, and try again.
      {"level":"error","ts":1650047047.6682806,"logger":"translator.reconcile-42","caller":"translator/reconciler.go:195","msg":"translation for parent object failed","parent":"istio-ingressgateway-istio-system-cluster1~gloo-mesh~cluster1~internal.gloo.solo.io/v2, Kind=DiscoveredGateway","err":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.","errVerbose":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.\n\ttranslator.(*translator).TranslateOutputs.func1:/src/pkg/translator/translator.go:163\n\ttranslator.(*translator).translateParallel:/src/pkg/translator/translator.go:189\n\tsets.(*discoveredGatewaySet).UnsortedList:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:999\n\tsets.(*resourceSet).UnsortedList:/go/pkg/mod/github.com/solo-io/skv2@v0.22.11/contrib/pkg/sets/sets.go:118\n\tsets.(*discoveredGatewaySet).UnsortedList.func1:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:994\n\ttranslator.(*translator).translateParallel.func1:/src/pkg/translator/translator.go:191\n\ttranslator.getValidEastWestIngressGateway:/src/pkg/translator/translator.go:426","stacktrace":"github.com/solo-io/gloo-mesh-enterprise/pkg/translator.(*reconciler).reconcilePrimary.func1\n\t/src/pkg/translator/reconciler.go:195\ngithub.com/solo-io/gloo-mesh-enterprise/pkg/utils/syncutils.(*workQueue).Execute.func1\n\t/src/pkg/utils/syncutils/parallel.go:52"}
      
      {"level":"info","ts":1650046690.815508,"caller":"grpclog/grpclog.go:37","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00111c9d0, {TRANSIENT_FAILURE connection error: desc = \"transport: authentication handshake failed: x509: certificate has expired or is not yet valid: current time 2022-04-15T18:18:10Z is after 2022-04-15T14:28:30Z\"}","system":"grpc","grpc_log":true}
      
  3. For gloo-mesh-agent pods, make sure that the cluster name matches the registered cluster name.

    1. Check the KubernetesCluster resources in the management cluster to get registered cluster names.
        kubectl get kubernetesclusters --context $MGMT_CONTEXT
        
    2. Check that the registered cluster name matches the name in the DNS SAN extension of the client certificate that the relay intermediate CA issued.
    3. If the cluster names do not match, update the KubernetesCluster to have the same name, or re-issue the client certificate with the same name.
  4. If you still have issues, review the Known issues.
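
To compare the registered cluster name with the client certificate, you can print the DNS SAN entries from the secret in the workload cluster. This is a sketch that assumes the secret name relay-client-tls-secret from the earlier step and a local openssl installation; the function name client_cert_sans is hypothetical.

```shell
# Print the Subject Alternative Name section of the agent's client certificate.
client_cert_sans() {
  ctx="$1"
  kubectl get secret relay-client-tls-secret -n gloo-mesh --context "$ctx" \
    -o jsonpath='{.data.tls\.crt}' \
    | base64 --decode \
    | openssl x509 -noout -text \
    | grep -A1 'Subject Alternative Name'
}

# Example usage:
# client_cert_sans "$REMOTE_CONTEXT"
```

The DNS entry in the output should match a name that kubectl get kubernetesclusters returns in the management cluster.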

Known issues

ca.crt

Although the ca.crt is included in the gloo-mesh-agent certificate secret, the gloo-mesh-agent still expects it to exist separately in the remote cluster. To copy it from the management cluster into the remote clusters, you can run the following command. Make sure to update $CLUSTER_NAME with your remote cluster name.

  CLUSTER_NAME=$REMOTE_CLUSTER
CLUSTER_CONTEXT=$REMOTE_CONTEXT

kubectl get secret gloo-mesh-agent-$CLUSTER_NAME-tls-cert \
  --namespace gloo-mesh \
  --output json \
  --context $CLUSTER_CONTEXT \
  | jq 'del(.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid,.data."tls.key",.data."tls.crt",.metadata.annotations)' \
  | sed "s/gloo-mesh-agent-${CLUSTER_NAME}-tls-cert/relay-root-tls-secret/" \
  | kubectl apply --context $CLUSTER_CONTEXT -f -