Use custom CAs

Instead of using Gloo Mesh self-signed certificates for the root CA certificate, you can generate your own relay root CA certificate and key with the certificate management tool of your choice. You then use these credentials to create an intermediate CA certificate and key that can be used by Gloo Mesh to automatically sign and issue client TLS certificates for the workload clusters.

For more information about this approach, see Option 3: Custom CAs with automatic client TLS certificate rotation.

Step 1: Create your own root CA certificate and key

To generate and store your own root CA certificate and key, you typically use your preferred PKI provider, such as Vault, Google Cloud CA, or AWS Private CA. If you do not have a PKI provider, you can use a tool such as OpenSSL to generate the certificate and key for the root CA, as described in this guide.

  1. Make sure that your openssl binary is OpenSSL, not LibreSSL. The OpenSSL version must be at least 1.1.

    1. Check the openssl version that is installed. If you see LibreSSL in the output, continue to the next step.
      openssl version
      
    2. Install the OpenSSL version (not LibreSSL). For example, you might use Homebrew.
      brew install openssl
      
    3. Review the output of the OpenSSL installation for the path of the binary file. You can either add the directory of the binary to your PATH, or call the binary by its full path whenever the following steps use an openssl command.
      • For example, openssl might be installed along the following path: /usr/local/opt/openssl@3/bin/
      • To run commands, you can prefix them with the path so that your terminal uses this installed version of OpenSSL, and not the default LibreSSL. For example:
        /usr/local/opt/openssl@3/bin/openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650...

  2. Create the configuration for the root CA.

    cat > "root-ca.conf" <<EOF
    [ v3_ca ]
    basicConstraints = critical,CA:TRUE
    subjectKeyIdentifier = hash
    authorityKeyIdentifier = keyid:always,issuer:always
    keyUsage = digitalSignature, keyEncipherment, keyCertSign
    extendedKeyUsage = clientAuth, serverAuth
    EOF
    
  3. Create a self-signed root CA certificate and key.

    openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650 -nodes -out relay-root-ca.crt -keyout relay-root-ca.key -subj "/CN=relay-root-ca" -config "root-ca.conf" -extensions v3_ca
    
  4. If it doesn't already exist, create the gloo-mesh namespace in each cluster.

    kubectl create namespace gloo-mesh --context $MGMT_CONTEXT
    kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT1
    kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT2
    
  5. Store the root CA certificate in the management cluster.

    kubectl create secret generic relay-root-tls-secret -n gloo-mesh \
      --context $MGMT_CONTEXT \
      --from-file=tls.crt=relay-root-ca.crt \
      --from-file=ca.crt=relay-root-ca.crt
    
  6. Copy the root CA certificate to each workload cluster.

    kubectl create secret generic relay-root-tls-secret -n gloo-mesh --context $REMOTE_CONTEXT1 --from-file ca.crt=relay-root-ca.crt
    kubectl create secret generic relay-root-tls-secret -n gloo-mesh --context $REMOTE_CONTEXT2 --from-file ca.crt=relay-root-ca.crt
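
Before you distribute the root CA certificate, it can help to confirm that the file is actually marked as a CA and to note when it expires. The following is a minimal sketch that assumes the relay-root-ca.crt file from this step. The helper names is_ca_cert and show_cert_summary are hypothetical and are not part of Gloo Mesh or OpenSSL.

```shell
# Hypothetical helper: succeed only if the PEM certificate at $1 is marked
# as a CA in its basicConstraints extension.
is_ca_cert() {
  openssl x509 -in "$1" -noout -text | grep -q 'CA:TRUE'
}

# Hypothetical helper: print the subject and expiry date of the certificate at $1.
show_cert_summary() {
  openssl x509 -in "$1" -noout -subject -enddate
}

# Example usage with the file that this step creates:
#   is_ca_cert relay-root-ca.crt && show_cert_summary relay-root-ca.crt
```

If is_ca_cert fails, review the v3_ca section of root-ca.conf and regenerate the certificate before you create the secrets.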
    

Step 2: Create an intermediate CA certificate and key

Use the root CA key to generate an intermediate CA certificate and key. These credentials are later used to sign client TLS certificates for the Gloo agents on each workload cluster.

  1. Create the configuration for the intermediate CA.

    cat > "relay-intermediate-ca.conf" <<EOF
    [req]
    req_extensions = req_ext
    distinguished_name = req_distinguished_name
    [req_distinguished_name]
    [req_ext]
    basicConstraints = CA:TRUE
    subjectKeyIdentifier = hash
    
    [v3_ca]
    basicConstraints = CA:TRUE
    subjectKeyIdentifier = hash
    authorityKeyIdentifier = keyid:always,issuer:always
    keyUsage = digitalSignature, keyEncipherment, keyCertSign
    extendedKeyUsage = clientAuth, serverAuth
    subjectAltName = @alt_names
    [alt_names]
    DNS = ${DNS_NAME}
    EOF
    
  2. Generate the intermediate-ca.key.

    openssl genrsa -out "intermediate-ca.key" 2048
    
  3. Generate the certificate signing request (CSR).

    openssl req -new -key "intermediate-ca.key" -out "intermediate-ca.csr" -subj "/CN=gloo-mesh-mgmt-server" -config relay-intermediate-ca.conf -extensions req_ext  
    
  4. Sign the CSR with the root CA key.

    openssl x509 -req -in "intermediate-ca.csr" -CA "relay-root-ca.crt" -CAkey "relay-root-ca.key" -CAcreateserial -out "intermediate-ca.crt" -days 365 -extensions v3_ca -extfile relay-intermediate-ca.conf
    
  5. Save the intermediate CA certificate and key in the relay-tls-signing-secret Kubernetes secret on the management cluster.

    kubectl create secret generic relay-tls-signing-secret \
      --from-file=tls.key=intermediate-ca.key \
      --from-file=tls.crt=intermediate-ca.crt \
      --from-file=ca.crt=relay-root-ca.crt \
      --context ${MGMT_CONTEXT} \
      --namespace gloo-mesh
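
Because the intermediate CA later signs every agent certificate, you might want to check that it chains to the root CA and that the key and certificate that you stored belong together. The following sketch assumes the file names from this step. The helper names verify_issued_by and key_matches_cert are hypothetical.

```shell
# Hypothetical helper: succeed if the certificate at $2 chains to the CA at $1.
verify_issued_by() {
  openssl verify -CAfile "$1" "$2"
}

# Hypothetical helper: succeed if the private key at $1 and the certificate
# at $2 contain the same public key.
key_matches_cert() {
  cert_pub="$(openssl x509 -in "$2" -noout -pubkey)" || return 1
  key_pub="$(openssl pkey -in "$1" -pubout)" || return 1
  [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example usage with the files that this step creates:
#   verify_issued_by relay-root-ca.crt intermediate-ca.crt
#   key_matches_cert intermediate-ca.key intermediate-ca.crt
```

Comparing public keys avoids printing any private key material to the terminal.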
    

Step 3: Create the server TLS certificate

Use the root CA credentials that you created earlier to derive the server TLS certificate that the Gloo management server uses for mutual TLS connections with the Gloo agents.

  1. If it doesn't already exist, create the gloo-mesh namespace in each cluster.

    kubectl create namespace gloo-mesh --context $MGMT_CONTEXT
    kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT1
    kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT2
    
  2. Create the configuration for the server TLS certificate.

    # Server certificate configuration
    cat > "gloo-mesh-mgmt-server.conf" <<EOF
    [req]
    req_extensions = v3_req
    distinguished_name = req_distinguished_name
    [req_distinguished_name]
    [ v3_req ]
    basicConstraints = CA:FALSE
    keyUsage = digitalSignature, keyEncipherment
    extendedKeyUsage = clientAuth, serverAuth
    subjectAltName = @alt_names
    [alt_names]
    DNS = *.gloo-mesh
    EOF
    
  3. Generate the private key and certificate signing request (CSR).

    # Generate gloo-mesh-mgmt-server private key
    openssl genrsa -out "gloo-mesh-mgmt-server.key" 2048
    # Generate gloo-mesh-mgmt-server CSR
    openssl req -new -key "gloo-mesh-mgmt-server.key" -out gloo-mesh-mgmt-server.csr -subj "/CN=gloo-mesh-mgmt-server" -config "gloo-mesh-mgmt-server.conf"
    
  4. Sign the CSR with the root CA key.

    # Sign certificate with local relay-root-ca
    openssl x509 -req \
      -days 3650 \
      -CA relay-root-ca.crt -CAkey relay-root-ca.key \
      -set_serial 0 \
      -in gloo-mesh-mgmt-server.csr -out gloo-mesh-mgmt-server.crt \
      -extensions v3_req -extfile "gloo-mesh-mgmt-server.conf"
    
  5. Save the server TLS certificate in the relay-server-tls-secret Kubernetes secret on the management cluster.

    kubectl create secret generic relay-server-tls-secret \
      --from-file=tls.key=gloo-mesh-mgmt-server.key \
      --from-file=tls.crt=gloo-mesh-mgmt-server.crt \
      --from-file=ca.crt=relay-root-ca.crt \
      --context ${MGMT_CONTEXT} \
      --namespace gloo-mesh
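
The agents validate the server certificate against the name that they connect to, so it is worth inspecting the subject alternative names (SANs) and the extended key usage of the certificate that you signed. The following sketch assumes the gloo-mesh-mgmt-server.crt file from this step. The helper names cert_sans and has_server_auth are hypothetical.

```shell
# Hypothetical helper: print the SAN entries of the certificate at $1.
# The SAN values are on the line that follows the extension name in the
# text output of openssl.
cert_sans() {
  openssl x509 -in "$1" -noout -text \
    | sed -n '/Subject Alternative Name/{n;s/^[[:space:]]*//;p;}'
}

# Hypothetical helper: succeed if the certificate at $1 allows server authentication.
has_server_auth() {
  openssl x509 -in "$1" -noout -text | grep -q 'TLS Web Server Authentication'
}

# Example usage with the file that this step creates:
#   cert_sans gloo-mesh-mgmt-server.crt
#   has_server_auth gloo-mesh-mgmt-server.crt
```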
    

Step 4: Set up initial trust with identity tokens

Because no client TLS certificate exists on the workload cluster yet, you must use identity tokens to establish initial trust between the Gloo agent and the management server. For more information about this process, see Initial proof of trust.

You have the option to automatically generate identity tokens or to provide your own identity tokens before you register the workload clusters with the management server.

  • Automatically generated identity tokens: No steps are required before you install the Gloo management server and agent. You can skip to Step 5: Install the Gloo management server and agent.
  • Custom identity tokens: Complete the following steps.
  1. Create an environment variable with your identity token. The token can be any string value.

    export TOKEN="<identity_token>"
    
  2. Store the token in the relay-identity-token-secret Kubernetes secret on the management cluster.

    kubectl create secret generic relay-identity-token-secret -n gloo-mesh --context $MGMT_CONTEXT --from-literal token=$TOKEN
    
  3. Copy the identity token to each workload cluster that you want to register.

    kubectl create secret generic relay-identity-token-secret -n gloo-mesh --context $REMOTE_CONTEXT1 --from-literal token=$TOKEN
    kubectl create secret generic relay-identity-token-secret -n gloo-mesh --context $REMOTE_CONTEXT2 --from-literal token=$TOKEN
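
Because the token can be any string value, one possible way to choose a value for the TOKEN variable is to draw it from the system random source rather than typing it by hand. The following sketch is only one approach. The generate_token helper is hypothetical, and the 32-byte length is an arbitrary choice, not a Gloo Mesh requirement.

```shell
# Hypothetical helper: print 32 random bytes as a 64-character hex string.
generate_token() {
  dd if=/dev/urandom bs=32 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Store the value in the TOKEN variable that the steps in this section use.
TOKEN="$(generate_token)"
export TOKEN
echo "Generated a token of ${#TOKEN} characters"
```

A hex string avoids characters that would need quoting when the variable is passed to kubectl.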
    

Step 5: Install the Gloo management server and agent

Set up Gloo Mesh to use your own intermediate CA credentials to issue and sign client TLS certificates. The setup varies depending on how you chose to create the identity tokens.

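  1. Prepare the Helm installation settings for the Gloo management server. Use the settings that match how you chose to create the identity tokens.

    • Automatically generated identity tokens:
    glooMgmtServer: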
        relay:
            disableCa: true
            disableCaCertGeneration: true
            disableTokenGeneration: false
            # Push RBAC resources to the management server. Required for multicluster RBAC in the Gloo UI.
            pushRbac: true
            # Secret containing TLS certs used to sign CSRs created by workload agents.
            signingTlsSecret:
                name: relay-tls-signing-secret
            # Secret containing server TLS certs used to secure the management server.
            tlsSecret:
                name: relay-server-tls-secret
            # Secret containing a shared token for authenticating Gloo agents when they first communicate with the management server.
            tokenSecret:
                # Key value of the data within the Kubernetes secret.
                key: token
                # Name of the Kubernetes secret.
                name: relay-identity-token-secret
                # Namespace of the Kubernetes secret.
                namespace: ""

    • Custom identity tokens:
    glooMgmtServer:
        relay:
            disableCa: true
            disableCaCertGeneration: true
            disableTokenGeneration: true
            # Push RBAC resources to the management server. Required for multicluster RBAC in the Gloo UI.
            pushRbac: true
            # Secret containing TLS certs used to sign CSRs created by workload agents.
            signingTlsSecret:
                name: relay-tls-signing-secret
            # Secret containing server TLS certs used to secure the management server.
            tlsSecret:
                name: relay-server-tls-secret
            tokenSecret:
                # Key value of the data within the Kubernetes secret.
                key: token
                # Name of the Kubernetes secret.
                name: relay-identity-token-secret
                # Namespace of the Kubernetes secret.
                namespace: "gloo-mesh"

  2. Install a new or upgrade an existing Gloo management server with the Helm settings from the previous step.

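  3. Prepare the Helm installation settings for the Gloo agent. Use the settings that match how you chose to create the identity tokens.

    • Automatically generated identity tokens:
    glooAgent: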
        relay:
            # SNI name in the authority/host header used to connect to relay forwarding server. Must match server certificate CommonName. Do not change the default value.
            authority: gloo-mesh-mgmt-server.gloo-mesh
            # Custom certs: Secret containing client TLS certs used to identify the Gloo agent to the management server. If you do not specify a clientTlsSecret, you must specify a tokenSecret and a rootTlsSecret.
            clientTlsSecret:
                name: relay-client-tls-secret
            # The ratio of the client TLS certificate lifetime to when the management server starts the certificate rotation process.
            clientTlsSecretRotationGracePeriodRatio: ""
            # Secret containing a root TLS cert used to verify the management server cert. The secret can also optionally specify a 'tls.key', which is used to generate the agent client cert.
            rootTlsSecret:
                name: relay-root-tls-secret
            # Address and port by which gloo-mesh-mgmt-server in the Gloo control plane can be accessed by the Gloo workload agents.
            serverAddress: $MGMT_SERVER_NETWORKING_ADDRESS
            # Secret containing a shared token for authenticating Gloo agents when they first communicate with the management server. A token secret is not needed with ACM certs.
            tokenSecret:
                # Key value of the data within the Kubernetes secret.
                key: token
                # Name of the Kubernetes secret.
                name: relay-identity-token-secret
                # Namespace of the Kubernetes secret.
                namespace: ""

    • Custom identity tokens:
    glooAgent:
        relay:
            # SNI name in the authority/host header used to connect to relay forwarding server. Must match server certificate CommonName. Do not change the default value.
            authority: gloo-mesh-mgmt-server.gloo-mesh
            # Custom certs: Secret containing client TLS certs used to identify the Gloo agent to the management server. If you do not specify a clientTlsSecret, you must specify a tokenSecret and a rootTlsSecret.
            clientTlsSecret:
                name: relay-client-tls-secret
            # The ratio of the client TLS certificate lifetime to when the management server starts the certificate rotation process.
            clientTlsSecretRotationGracePeriodRatio: ""
            # Secret containing a root TLS cert used to verify the management server cert. The secret can also optionally specify a 'tls.key', which is used to generate the agent client cert.
            rootTlsSecret:
                name: relay-root-tls-secret
            # Address and port by which gloo-mesh-mgmt-server in the Gloo control plane can be accessed by the Gloo workload agents.
            serverAddress: $MGMT_SERVER_NETWORKING_ADDRESS
            # Secret containing a shared token for authenticating Gloo agents when they first communicate with the management server. A token secret is not needed with ACM certs.
            tokenSecret:
                # Key value of the data within the Kubernetes secret.
                key: token
                # Name of the Kubernetes secret.
                name: relay-identity-token-secret
                # Namespace of the Kubernetes secret.
                namespace: "gloo-mesh"

  4. Register the workload cluster or upgrade an existing Gloo agent with the Helm settings from the previous step.
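
The Helm settings in this step refer to secrets by name, and a misnamed or missing secret prevents the pods from starting. Before you run the Helm installation, you might check that each referenced secret exists in the gloo-mesh namespace. The following sketch uses only kubectl get secret. The check_secrets helper is hypothetical.

```shell
# Hypothetical helper: report which of the named secrets are missing from the
# gloo-mesh namespace of a cluster. Succeeds only if all of them exist.
# Usage: check_secrets <kube-context> <secret-name>...
check_secrets() {
  ctx="$1"
  shift
  missing=0
  for name in "$@"; do
    if ! kubectl get secret "$name" -n gloo-mesh --context "$ctx" >/dev/null 2>&1; then
      echo "missing in $ctx: $name"
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}

# Example usage with the secret names from this guide:
#   check_secrets "$MGMT_CONTEXT" relay-root-tls-secret relay-tls-signing-secret relay-server-tls-secret
#   check_secrets "$REMOTE_CONTEXT1" relay-root-tls-secret
```

If you provide your own identity tokens, you can add relay-identity-token-secret to the list for each cluster.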

Verifying your relay certificate setup

  1. Check that the relay connection between the management server and workload agents is healthy.
    1. Forward port 9091 of the gloo-mesh-mgmt-server pod to your localhost.
      kubectl port-forward -n gloo-mesh --context $MGMT_CONTEXT deploy/gloo-mesh-mgmt-server 9091
      
    2. In your browser, connect to http://localhost:9091/metrics.
    3. In the metrics UI, look for the following lines. If the values are 1, the agents in the workload clusters are successfully registered with the management server. If the values are 0, the agents are not successfully connected.
      relay_pull_clients_connected{cluster="cluster1"} 1
      relay_pull_clients_connected{cluster="cluster2"} 1
      # HELP relay_push_clients_connected Current number of connected Relay push clients (Relay Agents).
      # TYPE relay_push_clients_connected gauge
      relay_push_clients_connected{cluster="cluster1"} 1
      relay_push_clients_connected{cluster="cluster2"} 1
      
  2. Review the Gloo Mesh UI. Check that the Overall Mesh Status is healthy and that your remote clusters are registered without any configuration issues.
    meshctl dashboard --kubecontext $MGMT_CONTEXT
    
  3. If the setup is unsuccessful, continue to Troubleshooting.
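
If you prefer the command line to the browser, the same metrics check can be scripted. The following sketch filters the metrics text for relay client series with a value of 0. It assumes that the port-forward from the first step is still running. The disconnected_relay_clients helper is hypothetical.

```shell
# Hypothetical helper: read Prometheus-format metrics on stdin and print the
# relay_push_clients_connected and relay_pull_clients_connected series whose
# value is 0. Prints nothing when every agent is connected.
disconnected_relay_clients() {
  awk '/^relay_(push|pull)_clients_connected[{]/ && $NF == 0 { print $1 }'
}

# Example usage while the port-forward is running:
#   curl -s http://localhost:9091/metrics | disconnected_relay_clients
```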

Troubleshooting relay certificates

  1. Review the health of your Gloo Mesh pods in the management and remote clusters.

    1. Check that the gloo-mesh-mgmt-server and gloo-mesh-agent pods are running.

      kubectl get pods -n gloo-mesh --context ${MGMT_CONTEXT}
      kubectl get pods -n gloo-mesh --context ${REMOTE_CONTEXT}
      
    2. If the pods are not running, describe the pods and check the State and Last State sections for error messages and reasons why the pod might not be healthy. For example, the following error messages in the gloo-mesh-mgmt-server and gloo-mesh-agent pods indicate that the secret is misnamed or missing. Check the secrets and names, upgrade your Helm installation, and try again.

      • Example error message for gloo-mesh-mgmt-server pod:
      Message:   3 errors occurred:
          * no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
          * could not find forwarding server token: no token secret found: Timeout: failed waiting for *v1.Secret Informer to sync
          * no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
      
      • Example error message for gloo-mesh-agent pod:
      Events:
        Type     Reason       Age                    From     Message
        ----     ------       ----                   ----     -------
        Normal   Created      84m (x25 over 3h28m)   kubelet  Created container gloo-mesh-agent
        Normal   Pulled       84m (x24 over 3h28m)   kubelet  Container image "gcr.io/gloo-mesh/gloo-mesh-agent:1.2.3" already present on machine
        Normal   Started      84m (x25 over 3h28m)   kubelet  Started container gloo-mesh-agent
        Warning  FailedMount  84m                    kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:47314->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
        Warning  FailedMount  84m                    kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:57262->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
        Warning  FailedMount  83m                    kubelet  MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": http2: client connection force closed via ClientConn.Close, failed to sync configmap cache: timed out waiting for the condition]
        Warning  BackOff      72s (x522 over 3h28m)  kubelet  Back-off restarting failed container
      
  2. Check the Kubernetes logs for the gloo-mesh-mgmt-server and gloo-mesh-agent pods in each cluster for errors. Look for errors during the grpc connection.

    • For example, the following error message indicates that the gloo-mesh-mgmt-server load balancer IP address was set incorrectly for the agent during the Helm installation.
    {"level":"warn","ts":"2021-11-02T19:56:42.197Z","caller":"zap/
    grpclogger.go:85","msg":"[core]grpcaddrConn.createTransport
    failed to connect to {34.145.184.106:9900:9900
    gloo-mesh-mgmt-server.gloo-mesh <nil> <nil>}. Err: connection
    error: desc = \"transport: Error while dialing dial tcp: address
    34.145.184.106:9900:9900 too many colons in address\".
    
    • The following gloo-mesh-agent pod error indicates that the relay-root-tls-secret is missing in the remote cluster. Follow the steps in the ca.crt section of the Known issues.
    {"level":"fatal","ts":1640102555.6522746,"msg":"secrets \"relay-root-tls-secret\" not found","version":"1.3.0-beta6","stacktrace":"runtime.main\n\t/usr/local/go/src/runtime/proc.go:255"}
    
    • The following errors indicate that the server or client TLS certificate is expired. Regenerate the certificate, restart the pods, and try again.
    {"level":"error","ts":1650047047.6682806,"logger":"translator.reconcile-42","caller":"translator/reconciler.go:195","msg":"translation for parent object failed","parent":"istio-ingressgateway-istio-system-cluster1~gloo-mesh~cluster1~internal.gloo.solo.io/v2, Kind=DiscoveredGateway","err":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.","errVerbose":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.\n\ttranslator.(*translator).TranslateOutputs.func1:/src/pkg/translator/translator.go:163\n\ttranslator.(*translator).translateParallel:/src/pkg/translator/translator.go:189\n\tsets.(*discoveredGatewaySet).UnsortedList:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:999\n\tsets.(*resourceSet).UnsortedList:/go/pkg/mod/github.com/solo-io/skv2@v0.22.11/contrib/pkg/sets/sets.go:118\n\tsets.(*discoveredGatewaySet).UnsortedList.func1:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:994\n\ttranslator.(*translator).translateParallel.func1:/src/pkg/translator/translator.go:191\n\ttranslator.getValidEastWestIngressGateway:/src/pkg/translator/translator.go:426","stacktrace":"github.com/solo-io/gloo-mesh-enterprise/v2/pkg/translator.(*reconciler).reconcilePrimary.func1\n\t/src/pkg/translator/reconciler.go:195\ngithub.com/solo-io/gloo-mesh-enterprise/v2/pkg/utils/syncutils.(*workQueue).Execute.func1\n\t/src/pkg/utils/syncutils/parallel.go:52"}
    
    {"level":"info","ts":1650046690.815508,"caller":"grpclog/grpclog.go:37","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00111c9d0, {TRANSIENT_FAILURE connection error: desc = \"transport: authentication handshake failed: x509: certificate has expired or is not yet valid: current time 2022-04-15T18:18:10Z is after 2022-04-15T14:28:30Z\"}","system":"grpc","grpc_log":true}
    
  3. For gloo-mesh-agent pods, make sure that the cluster name matches the registered cluster name.

    1. Check the KubernetesCluster resources in the management cluster to get registered cluster names.
      kubectl get kubernetesclusters --context $MGMT_CONTEXT
      
    2. Check that the registered cluster name matches the name in the client certificate that is issued by the root CA, specifically the DNS SAN extension.
    3. If the cluster names do not match, update the KubernetesCluster to have the same name, or re-issue the client certificate with the same name.
  4. If you still have issues, review the Known issues.
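
To confirm whether an expired certificate is the cause, you can read the certificate out of the secret and ask OpenSSL whether it is still valid. The following sketch assumes that the secret stores the certificate under the tls.crt key, as the secrets in this guide do. The secret_cert_valid helper is hypothetical.

```shell
# Hypothetical helper: succeed if the tls.crt in a secret stays valid for at
# least $4 more seconds (default 0, which means "not expired right now").
# Usage: secret_cert_valid <kube-context> <namespace> <secret-name> [seconds]
secret_cert_valid() {
  kubectl get secret "$3" -n "$2" --context "$1" -o jsonpath='{.data.tls\.crt}' \
    | base64 --decode \
    | openssl x509 -noout -checkend "${4:-0}" >/dev/null
}

# Example usage with the secret names from this guide:
#   secret_cert_valid "$MGMT_CONTEXT" gloo-mesh relay-server-tls-secret || echo "server certificate expired"
#   secret_cert_valid "$REMOTE_CONTEXT1" gloo-mesh relay-client-tls-secret 86400 || echo "client certificate expires within a day"
```

The helper also fails when the secret cannot be read, so check that the secret exists before you conclude that the certificate is expired.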

Known issues

ca.crt

Although the ca.crt is included in the gloo-mesh-agent certificate secret, the gloo-mesh-agent still expects it to exist separately in the remote cluster. To copy it from the management cluster into the remote clusters, you can run the following command. Make sure to update $CLUSTER_NAME with your remote cluster name.

CLUSTER_NAME=$REMOTE_CLUSTER
CLUSTER_CONTEXT=$REMOTE_CONTEXT

kubectl get secret gloo-mesh-agent-$CLUSTER_NAME-tls-cert \
  --namespace gloo-mesh \
  --output json \
  --context $CLUSTER_CONTEXT \
  | jq 'del(.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid,.data."tls.key",.data."tls.crt",.metadata.annotations)' \
  | sed "s/gloo-mesh-agent-${CLUSTER_NAME}-tls-cert/relay-root-tls-secret/" \
  | kubectl apply --context $CLUSTER_CONTEXT -f -