Use custom CAs
Instead of using Gloo Mesh self-signed certificates for the root CA certificate, you can generate your own relay root CA certificate and key with the certificate management tool of your choice. You then use these credentials to create an intermediate CA certificate and key that can be used by Gloo Mesh to automatically sign and issue client TLS certificates for the workload clusters.
For more information about this approach, see Option 3: Custom CAs with automatic client TLS certificate rotation.
Step 1: Create your own root CA certificate and key
To generate and store your own root CA certificate and key, you typically use your preferred PKI provider, such as Vault, Google Cloud CA, or AWS Private CA. If you do not have a PKI provider, you can use a tool such as OpenSSL to generate the certificate and key for the root CA, as described in this guide.
-
Make sure that you have the OpenSSL version of openssl, not LibreSSL. The openssl version must be at least 1.1.
- Check the openssl version that is installed. If you see LibreSSL in the output, continue to the next step.
openssl version
- Install the OpenSSL version (not LibreSSL). For example, you might use Homebrew.
brew install openssl
- Review the output of the OpenSSL installation for the path of the binary file. You can choose to export the binary to your path, or call the entire path whenever the following steps use an openssl command.
  - For example, openssl might be installed along the following path: /usr/local/opt/openssl@3/bin/
  - To run commands, you can prefix each command with the path so that your terminal uses this installed version of OpenSSL, and not the default LibreSSL.
/usr/local/opt/openssl@3/bin/openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650...
-
Create the configuration for the root CA.
cat > "root-ca.conf" <<EOF
[ v3_ca ]
basicConstraints = critical,CA:TRUE
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer:always
keyUsage = digitalSignature, keyEncipherment, keyCertSign
extendedKeyUsage = clientAuth, serverAuth
EOF
-
Create a self-signed root CA certificate and key.
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650 -nodes -out relay-root-ca.crt -keyout relay-root-ca.key -subj "/CN=relay-root-ca" -config "root-ca.conf" -extensions v3_ca
-
If it doesn't already exist, create the gloo-mesh namespace in each cluster.
kubectl create namespace gloo-mesh --context $MGMT_CONTEXT
kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT1
kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT2
-
Store the root CA certificate in the management cluster.
kubectl create secret generic relay-root-tls-secret -n gloo-mesh \
  --from-file=tls.crt=relay-root-ca.crt \
  --from-file=ca.crt=relay-root-ca.crt
-
Copy the root CA certificate to each workload cluster.
kubectl create secret generic relay-root-tls-secret -n gloo-mesh --context $REMOTE_CONTEXT1 --from-file ca.crt=relay-root-ca.crt
kubectl create secret generic relay-root-tls-secret -n gloo-mesh --context $REMOTE_CONTEXT2 --from-file ca.crt=relay-root-ca.crt
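The OpenSSL version requirement from the first item of this step can also be checked in a script. The following is a minimal sketch, not part of Gloo Mesh: `is_supported_openssl` is a hypothetical helper name, and it assumes that the output of `openssl version` starts with the product name followed by the version number.

```shell
# is_supported_openssl VERSION_LINE
# Hypothetical helper: exits 0 when VERSION_LINE (the output of
# `openssl version`) describes OpenSSL 1.1 or later, and 1 for LibreSSL,
# older OpenSSL releases, or unrecognized output.
is_supported_openssl() {
  version_line="$1"
  case "$version_line" in
    OpenSSL\ *) ;;
    *) return 1 ;;
  esac
  # The second field is the version number, for example 3.0.2 or 1.1.1w.
  version="$(printf '%s\n' "$version_line" | awk '{print $2}')"
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 1 ]; }
}

# Check the openssl binary that is first on the PATH.
is_supported_openssl "$(openssl version 2>/dev/null)" \
  || echo "Install OpenSSL 1.1 or later before you continue."
```

The version string is passed in as an argument so that the same helper can be pointed at a specific binary, such as the Homebrew path shown earlier.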
Step 2: Create an intermediate CA certificate and key
Use the root CA key to generate an intermediate CA certificate and key. These credentials are later used to sign client TLS certificates for the Gloo agents on each workload cluster.
-
Create the configuration for the intermediate CA.
cat > "relay-intermediate-ca.conf" <<EOF
[req]
req_extensions = req_ext
distinguished_name = req_distinguished_name
[req_distinguished_name]
[req_ext]
basicConstraints = CA:TRUE
subjectKeyIdentifier = hash
[v3_ca]
basicConstraints = CA:TRUE
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer:always
EOF
-
Generate the intermediate-ca.key.
openssl genrsa -out "intermediate-ca.key" 2048
-
Generate the certificate signing request (CSR).
openssl req -new -key "intermediate-ca.key" -out "intermediate-ca.csr" -subj "/CN=gloo-mesh-mgmt-server" -config relay-intermediate-ca.conf -extensions req_ext
-
Sign the CSR with the root CA key.
openssl x509 -req -in "intermediate-ca.csr" -CA "relay-root-ca.crt" -CAkey "relay-root-ca.key" -CAcreateserial -out "intermediate-ca.crt" -days 365 -extensions v3_ca -extfile relay-intermediate-ca.conf
-
Save the intermediate CA certificate and key in the relay-tls-signing-secret Kubernetes secret on the management cluster.
kubectl create secret generic relay-tls-signing-secret \
  --from-file=tls.key=intermediate-ca.key \
  --from-file=tls.crt=intermediate-ca.crt \
  --from-file=ca.crt=relay-root-ca.crt \
  --context ${MGMT_CONTEXT} \
  --namespace gloo-mesh
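Before you rely on the intermediate CA, you might want to confirm that its certificate chains back to the root CA. The following sketch wraps the standard `openssl verify -CAfile` command; `verify_chain` is a hypothetical helper name, and the file names in the usage comment are the ones created in this guide.

```shell
# verify_chain ROOT_CERT CERT
# Hypothetical helper: exits 0 only when `openssl verify` accepts CERT as
# being issued under ROOT_CERT. Any error, including a missing file,
# results in a non-zero exit code.
verify_chain() {
  root_cert="$1"
  cert="$2"
  openssl verify -CAfile "$root_cert" "$cert" >/dev/null 2>&1
}

# Example usage with the files from this step:
#   verify_chain relay-root-ca.crt intermediate-ca.crt && echo "chain OK"
```

Output is suppressed so that the helper can be used in conditions. To see the reason for a failure, run the `openssl verify` command directly.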
Step 3: Create the server TLS certificate
Use the root CA credentials that you created earlier to derive the server TLS certificate that the Gloo management server uses for mutual TLS connections with the Gloo agents.
-
If it doesn't already exist, create the gloo-mesh namespace in each cluster.
kubectl create namespace gloo-mesh --context $MGMT_CONTEXT
kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT1
kubectl create namespace gloo-mesh --context $REMOTE_CONTEXT2
-
Create the configuration for the server TLS certificate.
# Server certificate configuration
cat > "gloo-mesh-mgmt-server.conf" <<EOF
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth, serverAuth
subjectAltName = @alt_names
[alt_names]
DNS = *.gloo-mesh
EOF
-
Generate the private key and certificate signing request (CSR).
# Generate gloo-mesh-mgmt-server private key
openssl genrsa -out "gloo-mesh-mgmt-server.key" 2048
# Generate gloo-mesh-mgmt-server CSR
openssl req -new -key "gloo-mesh-mgmt-server.key" -out gloo-mesh-mgmt-server.csr -subj "/CN=gloo-mesh-mgmt-server" -config "gloo-mesh-mgmt-server.conf"
-
Sign the CSR with the root CA key.
# Sign certificate with local relay-root-ca
openssl x509 -req \
  -days 3650 \
  -CA relay-root-ca.crt -CAkey relay-root-ca.key \
  -set_serial 0 \
  -in gloo-mesh-mgmt-server.csr -out gloo-mesh-mgmt-server.crt \
  -extensions v3_req -extfile "gloo-mesh-mgmt-server.conf"
-
Save the server TLS certificate in the relay-server-tls-secret Kubernetes secret on the management cluster.
kubectl create secret generic relay-server-tls-secret \
  --from-file=tls.key=gloo-mesh-mgmt-server.key \
  --from-file=tls.crt=gloo-mesh-mgmt-server.crt \
  --from-file=ca.crt=relay-root-ca.crt \
  --context ${MGMT_CONTEXT} \
  --namespace gloo-mesh
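The certificates in this guide are signed with fixed lifetimes (`-days 3650` for the root CA and server certificate, `-days 365` for the intermediate CA), so it can help to check how much validity remains. The following sketch uses the `-checkend` option of `openssl x509`; `cert_valid_for_days` is a hypothetical helper name.

```shell
# cert_valid_for_days CERT DAYS
# Hypothetical helper: exits 0 when CERT is still valid DAYS days from now,
# and non-zero when it expires sooner or cannot be read.
cert_valid_for_days() {
  cert="$1"
  days="$2"
  seconds=$((days * 86400))
  openssl x509 -in "$cert" -noout -checkend "$seconds" >/dev/null 2>&1
}

# Example usage with the server certificate from this step:
#   cert_valid_for_days gloo-mesh-mgmt-server.crt 30 || echo "renew soon"
```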
Step 4: Set up initial trust with identity tokens
Because no client TLS certificate exists on the workload cluster yet, you must use identity tokens to establish initial trust between the Gloo agent and the management server. For more information about this process, see Initial proof of trust.
You have the option to automatically generate identity tokens or to provide your own identity tokens before you register the workload clusters with the management server.
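If you provide your own token, any string works, but a random value is harder to guess. As an illustrative sketch only (the guide does not prescribe a token format), the following builds a 64-character hexadecimal string from 32 random bytes and stores it in the `TOKEN` variable that the steps below use.

```shell
# Read 32 random bytes, print them as hex, and strip spaces and newlines.
TOKEN="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
export TOKEN
```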
-
Create an environment variable with your identity token. The token can be any string value.
export TOKEN="<identity_token>"
-
Store the token in the relay-identity-token-secret Kubernetes secret on the management cluster. Because the token is a string value and not a file, pass it with --from-literal.
kubectl create secret generic relay-identity-token-secret -n gloo-mesh --context $MGMT_CONTEXT --from-literal token="$TOKEN"
-
Copy the identity token to each workload cluster that you want to register.
kubectl create secret generic relay-identity-token-secret -n gloo-mesh --context $REMOTE_CONTEXT1 --from-literal token="$TOKEN"
kubectl create secret generic relay-identity-token-secret -n gloo-mesh --context $REMOTE_CONTEXT2 --from-literal token="$TOKEN"
Step 5: Install the Gloo management server and agent
Set up Gloo Mesh to use your own intermediate CA credentials to issue and sign client TLS certificates. The setup varies depending on how you chose to create the identity tokens.
-
Prepare the Helm installation settings for the Gloo management server.
Settings with automatic identity token generation (disableTokenGeneration: false):
glooMgmtServer:
  relay:
    disableCa: true
    disableCaCertGeneration: true
    disableTokenGeneration: false
    # Push RBAC resources to the management server. Required for multicluster RBAC in the Gloo UI.
    pushRbac: true
    # Secret containing TLS certs used to sign CSRs created by workload agents.
    signingTlsSecret:
      name: relay-tls-signing-secret
    # Secret containing server TLS certs used to secure the management server.
    tlsSecret:
      name: relay-server-tls-secret
    # Secret containing a shared token for authenticating Gloo agents when they first communicate with the management server.
    tokenSecret:
      # Key value of the data within the Kubernetes secret.
      key: token
      # Name of the Kubernetes secret.
      name: relay-identity-token-secret
      # Namespace of the Kubernetes secret.
      namespace: ""
Settings with your own identity token (disableTokenGeneration: true):
glooMgmtServer:
  relay:
    disableCa: true
    disableCaCertGeneration: true
    disableTokenGeneration: true
    # Push RBAC resources to the management server. Required for multicluster RBAC in the Gloo UI.
    pushRbac: true
    # Secret containing TLS certs used to sign CSRs created by workload agents.
    signingTlsSecret:
      name: relay-tls-signing-secret
    # Secret containing server TLS certs used to secure the management server.
    tlsSecret:
      name: relay-server-tls-secret
    tokenSecret:
      # Key value of the data within the Kubernetes secret.
      key: token
      # Name of the Kubernetes secret.
      name: relay-identity-token-secret
      # Namespace of the Kubernetes secret.
      namespace: "gloo-mesh"
-
Install a new or upgrade an existing Gloo management server with the Helm settings from the previous step.
-
Prepare the Helm installation settings for the Gloo agent.
Settings with automatic identity token generation (token secret namespace left empty):
glooAgent:
  relay:
    # SNI name in the authority/host header used to connect to relay forwarding server. Must match server certificate CommonName. Do not change the default value.
    authority: gloo-mesh-mgmt-server.gloo-mesh
    # Custom certs: Secret containing client TLS certs used to identify the Gloo agent to the management server. If you do not specify a clientTlsSecret, you must specify a tokenSecret and a rootTlsSecret.
    clientTlsSecret:
      name: relay-client-tls-secret
    # The ratio of the client TLS certificate lifetime to when the management server starts the certificate rotation process.
    clientTlsSecretRotationGracePeriodRatio: ""
    # Secret containing a root TLS cert used to verify the management server cert. The secret can also optionally specify a 'tls.key', which is used to generate the agent client cert.
    rootTlsSecret:
      name: relay-root-tls-secret
    # Address and port by which gloo-mesh-mgmt-server in the Gloo control plane can be accessed by the Gloo workload agents.
    serverAddress: $MGMT_SERVER_NETWORKING_ADDRESS
    # Secret containing a shared token for authenticating Gloo agents when they first communicate with the management server. A token secret is not needed with ACM certs.
    tokenSecret:
      # Key value of the data within the Kubernetes secret.
      key: token
      # Name of the Kubernetes secret.
      name: relay-identity-token-secret
      # Namespace of the Kubernetes secret.
      namespace: ""
Settings with your own identity token (token secret in the gloo-mesh namespace):
glooAgent:
  relay:
    # SNI name in the authority/host header used to connect to relay forwarding server. Must match server certificate CommonName. Do not change the default value.
    authority: gloo-mesh-mgmt-server.gloo-mesh
    # Custom certs: Secret containing client TLS certs used to identify the Gloo agent to the management server. If you do not specify a clientTlsSecret, you must specify a tokenSecret and a rootTlsSecret.
    clientTlsSecret:
      name: relay-client-tls-secret
    # The ratio of the client TLS certificate lifetime to when the management server starts the certificate rotation process.
    clientTlsSecretRotationGracePeriodRatio: ""
    # Secret containing a root TLS cert used to verify the management server cert. The secret can also optionally specify a 'tls.key', which is used to generate the agent client cert.
    rootTlsSecret:
      name: relay-root-tls-secret
    # Address and port by which gloo-mesh-mgmt-server in the Gloo control plane can be accessed by the Gloo workload agents.
    serverAddress: $MGMT_SERVER_NETWORKING_ADDRESS
    # Secret containing a shared token for authenticating Gloo agents when they first communicate with the management server. A token secret is not needed with ACM certs.
    tokenSecret:
      # Key value of the data within the Kubernetes secret.
      key: token
      # Name of the Kubernetes secret.
      name: relay-identity-token-secret
      # Namespace of the Kubernetes secret.
      namespace: "gloo-mesh"
-
Register the workload cluster or upgrade an existing Gloo agent with the Helm settings from the previous step.
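The agent settings reference `$MGMT_SERVER_NETWORKING_ADDRESS`, which is not expanded if you save the YAML to a file as-is. One way to fill it in is to write the file from a shell heredoc. The following is a minimal sketch under that assumption: `write_agent_values` and the output file name are hypothetical, and the values are a trimmed-down copy of the custom-token agent settings above.

```shell
# write_agent_values OUT_FILE
# Hypothetical helper: writes a minimal Gloo agent values file to OUT_FILE
# and expands MGMT_SERVER_NETWORKING_ADDRESS from the environment.
write_agent_values() {
  out_file="$1"
  if [ -z "${MGMT_SERVER_NETWORKING_ADDRESS:-}" ]; then
    echo "MGMT_SERVER_NETWORKING_ADDRESS is not set" >&2
    return 1
  fi
  cat > "$out_file" <<EOF
glooAgent:
  relay:
    authority: gloo-mesh-mgmt-server.gloo-mesh
    clientTlsSecret:
      name: relay-client-tls-secret
    rootTlsSecret:
      name: relay-root-tls-secret
    serverAddress: ${MGMT_SERVER_NETWORKING_ADDRESS}
    tokenSecret:
      key: token
      name: relay-identity-token-secret
      namespace: "gloo-mesh"
EOF
}

# Example usage (agent-values.yaml is an arbitrary file name):
#   write_agent_values agent-values.yaml
```

The helper returns an error instead of writing an empty serverAddress, because an unset address would otherwise only surface later as a connection failure in the agent logs.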
Verifying your relay certificate setup
- Check that the relay connection between the management server and workload agents is healthy.
  - Forward port 9091 of the gloo-mesh-mgmt-server pod to your localhost.
kubectl port-forward -n gloo-mesh --context $MGMT_CONTEXT deploy/gloo-mesh-mgmt-server 9091
  - In your browser, connect to http://localhost:9091/metrics.
  - In the metrics UI, look for the following lines. If the values are 1, the agents in the workload clusters are successfully registered with the management server. If the values are 0, the agents are not successfully connected.
relay_pull_clients_connected{cluster="cluster1"} 1
relay_pull_clients_connected{cluster="cluster2"} 1
# HELP relay_push_clients_connected Current number of connected Relay push clients (Relay Agents).
# TYPE relay_push_clients_connected gauge
relay_push_clients_connected{cluster="cluster1"} 1
relay_push_clients_connected{cluster="cluster2"} 1
- Review the Gloo Mesh UI. Check that the Overall Mesh Status is healthy and that your remote clusters are registered without any configuration issues.
meshctl dashboard --kubecontext $MGMT_CONTEXT
- If the setup is unsuccessful, continue to Troubleshooting.
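The metrics check above can also be done from the command line instead of the browser. The following sketch filters the metrics text for relay client gauges with a value of 0 and prints the affected cluster names. `disconnected_clusters` is a hypothetical helper name, and the metric format is assumed to match the sample lines shown above.

```shell
# disconnected_clusters
# Hypothetical helper: reads Prometheus-style metrics on stdin and prints the
# cluster label of every relay pull or push client gauge whose value is 0.
disconnected_clusters() {
  awk '
    /^relay_(pull|push)_clients_connected[{]/ && $NF == 0 {
      if (match($0, /cluster="[^"]*"/)) {
        # Skip the 9 characters of the label prefix and drop the closing quote.
        print substr($0, RSTART + 9, RLENGTH - 10)
      }
    }
  ' | sort -u
}

# Example usage while the port-forward from the previous step is running:
#   curl -s http://localhost:9091/metrics | disconnected_clusters
```

Empty output means that no sampled gauge reported 0. It does not prove that every expected cluster is present, so compare the result against your list of registered clusters.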
Troubleshooting relay certificates
-
Review the health of your Gloo Mesh pods in the management and remote clusters.
- Check that the gloo-mesh-mgmt-server and gloo-mesh-agent pods are running.
kubectl get pods -n gloo-mesh --context ${MGMT_CONTEXT}
kubectl get pods -n gloo-mesh --context ${REMOTE_CONTEXT}
- If the pods are not running, describe the pods and check the State and Last State sections for error messages and reasons why the pod might not be healthy. For example, the following error messages in the gloo-mesh-mgmt-server and gloo-mesh-agent pods indicate that the secret is misnamed or missing. Check the secrets and names, upgrade your Helm installation, and try again.
  - Example error message for gloo-mesh-mgmt-server pod:
Message: 3 errors occurred:
* no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
* could not find forwarding server token: no token secret found: Timeout: failed waiting for *v1.Secret Informer to sync
* no tls secret found for grpc server: Secret "relay-server-tls-secret" not found
  - Example error message for gloo-mesh-agent pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 84m (x25 over 3h28m) kubelet Created container gloo-mesh-agent
Normal Pulled 84m (x24 over 3h28m) kubelet Container image "gcr.io/gloo-mesh/gloo-mesh-agent:1.2.3" already present on machine
Normal Started 84m (x25 over 3h28m) kubelet Started container gloo-mesh-agent
Warning FailedMount 84m kubelet MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:47314->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
Warning FailedMount 84m kubelet MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": read tcp 172.18.0.5:57262->172.18.0.5:6443: use of closed network connection, failed to sync configmap cache: timed out waiting for the condition]
Warning FailedMount 83m kubelet MountVolume.SetUp failed for volume "kube-api-access-zlr9b" : [failed to fetch token: Post "https://kind2-control-plane:6443/api/v1/namespaces/gloo-mesh/serviceaccounts/gloo-mesh-agent/token": http2: client connection force closed via ClientConn.Close, failed to sync configmap cache: timed out waiting for the condition]
Warning BackOff 72s (x522 over 3h28m) kubelet Back-off restarting failed container
-
Check the Kubernetes logs for the gloo-mesh-mgmt-server and gloo-mesh-agent pods in each cluster for errors. Look for errors during the grpc connection.
- For example, the following error message indicates that the gloo-mesh-mgmt-server load balancer IP address was set incorrectly for the agent during the Helm installation.
{"level":"warn","ts":"2021-11-02T19:56:42.197Z","caller":"zap/grpclogger.go:85","msg":"[core]grpcaddrConn.createTransport failed to connect to {34.145.184.106:9900:9900 gloo-mesh-mgmt-server.gloo-mesh <nil> <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp: address 34.145.184.106:9900:9900 too many colons in address\".
- The following gloo-mesh-agent pod error indicates that the relay-root-tls-secret secret is missing in the remote cluster. Follow the steps in the ca.crt section of Known issues.
{"level":"fatal","ts":1640102555.6522746,"msg":"secrets \"relay-root-tls-secret\" not found","version":"1.3.0-beta6","stacktrace":"runtime.main\n\t/usr/local/go/src/runtime/proc.go:255"}
- The following errors indicate that the server or client TLS certificate is expired. Regenerate the certificate, restart the pods, and try again.
{"level":"error","ts":1650047047.6682806,"logger":"translator.reconcile-42","caller":"translator/reconciler.go:195","msg":"translation for parent object failed","parent":"istio-ingressgateway-istio-system-cluster1~gloo-mesh~cluster1~internal.gloo.solo.io/v2, Kind=DiscoveredGateway","err":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.","errVerbose":"Gateway istio-ingressgateway.istio-system in cluster cluster1 not found in snapshot.\n\ttranslator.(*translator).TranslateOutputs.func1:/src/pkg/translator/translator.go:163\n\ttranslator.(*translator).translateParallel:/src/pkg/translator/translator.go:189\n\tsets.(*discoveredGatewaySet).UnsortedList:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:999\n\tsets.(*resourceSet).UnsortedList:/go/pkg/mod/github.com/solo-io/skv2@v0.22.11/contrib/pkg/sets/sets.go:118\n\tsets.(*discoveredGatewaySet).UnsortedList.func1:/src/pkg/api/internal.gloo.solo.io/v2/sets/sets.go:994\n\ttranslator.(*translator).translateParallel.func1:/src/pkg/translator/translator.go:191\n\ttranslator.getValidEastWestIngressGateway:/src/pkg/translator/translator.go:426","stacktrace":"github.com/solo-io/gloo-mesh-enterprise/pkg/translator.(*reconciler).reconcilePrimary.func1\n\t/src/pkg/translator/reconciler.go:195\ngithub.com/solo-io/gloo-mesh-enterprise/pkg/utils/syncutils.(*workQueue).Execute.func1\n\t/src/pkg/utils/syncutils/parallel.go:52"}
{"level":"info","ts":1650046690.815508,"caller":"grpclog/grpclog.go:37","msg":"[core]pickfirstBalancer: UpdateSubConnState: 0xc00111c9d0, {TRANSIENT_FAILURE connection error: desc = \"transport: authentication handshake failed: x509: certificate has expired or is not yet valid: current time 2022-04-15T18:18:10Z is after 2022-04-15T14:28:30Z\"}","system":"grpc","grpc_log":true}
-
For gloo-mesh-agent pods, make sure that the cluster name matches the registered cluster name.
- Check the KubernetesCluster resources in the management cluster to get registered cluster names.
kubectl get kubernetesclusters --context $MGMT_CONTEXT
- Check that the registered cluster name matches the name in the client certificate that is issued by the root CA, specifically the DNS SAN extension.
- If the cluster names do not match, update the KubernetesCluster to have the same name, or re-issue the client certificate with the same name.
-
If you still have issues, review the Known issues.
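The "too many colons in address" log entry above comes from a server address that includes the port twice. A quick sanity check before you install the agent can catch this. The following is an illustrative sketch: `valid_server_address` is a hypothetical helper, and it assumes a hostname or IPv4 address followed by a single numeric port, so it rejects IPv6 literals.

```shell
# valid_server_address ADDRESS
# Hypothetical helper: exits 0 when ADDRESS has the form host:port with
# exactly one colon and a numeric port, and 1 otherwise.
valid_server_address() {
  address="$1"
  case "$address" in
    *:*) ;;
    *) return 1 ;;  # no port separator
  esac
  host="${address%%:*}"
  port="${address#*:}"
  [ -n "$host" ] || return 1
  case "$port" in
    ''|*[!0-9]*) return 1 ;;  # empty port, a second colon, or non-digits
  esac
  return 0
}

# Example usage with the variable from the agent Helm settings:
#   valid_server_address "$MGMT_SERVER_NETWORKING_ADDRESS" || echo "check the address"
```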
Known issues
ca.crt
Although the ca.crt is included in the gloo-mesh-agent certificate secret, the gloo-mesh-agent still expects it to exist separately in the remote cluster. To copy it from the management cluster into the remote clusters, you can run the following command. Make sure to update $CLUSTER_NAME with your remote cluster name.
CLUSTER_NAME=$REMOTE_CLUSTER
CLUSTER_CONTEXT=$REMOTE_CONTEXT
kubectl get secret gloo-mesh-agent-$CLUSTER_NAME-tls-cert \
--namespace gloo-mesh \
--output json \
--context $CLUSTER_CONTEXT \
| jq 'del(.metadata.creationTimestamp,.metadata.resourceVersion,.metadata.uid,.data."tls.key",.data."tls.crt",.metadata.annotations)' \
| sed "s/gloo-mesh-agent-${CLUSTER_NAME}-tls-cert/relay-root-tls-secret/" \
| kubectl apply --context $CLUSTER_CONTEXT -f -