Vault
Use Vault to generate the root and intermediate CA certificates, and use cert-manager
to automatically generate client and server TLS certificates.
About this approach
Vault is a popular open source secret management tool that you can use to set up a secure, private public key infrastructure (PKI) and manage TLS certificates. In this setup, you install a Vault instance in the Gloo management cluster and use that instance to generate root and intermediate CA certificates. The intermediate CA certificate is used to sign and issue the server and client TLS certificates for the management server and agents. To manage the lifecycle of the server and client certificates, you also install cert-manager. Cert-manager is a Kubernetes controller that helps you automate the process of obtaining and renewing certificates from various PKI providers, such as AWS Private CA, Google Cloud CA, or Vault.
With this approach, you get the following benefits:
- Secure storage of root and intermediate CA certificates and keys.
- Automatic issuance and renewal of server and client TLS certificates with cert-manager.
Although this Vault setup is more secure than using the self-signing default setup, the certificates are still stored within the management cluster. You might restrict access to the management cluster, or you might need to use a different setup to meet your production security requirements.
Architecture overview
The following figure depicts an example architecture for using cert-manager and Vault to set up the relay certificates for multiple clusters.
- After installing cert-manager and Vault in your management cluster, you set up a root of trust for the CA chain.
- Next, you create an intermediate CA that is used to sign the relay server and client certificates.
- After creating the relay server and client certificates in your clusters, the Gloo gloo-mesh-mgmt-server and gloo-mesh-agent deployments use the signed certificates to secure their gRPC communication with mutual TLS (mTLS).
Before you begin
Save the kubeconfig contexts for your clusters. Run kubectl config get-contexts, look for your cluster in the CLUSTER column, and get the context name in the NAME column. Note: Do not use context names with underscores. The generated certificate that connects workload clusters to the management cluster uses the context name as a SAN specification, and underscores in a SAN are not FQDN compliant. To rename a context, run kubectl config rename-context "<oldcontext>" <newcontext>.
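The underscore rule can be checked before you go any further. The following is a minimal sketch, not part of the original guide: the function names is_valid_context_name and list_invalid_contexts are hypothetical helpers, and the second one assumes that kubectl is on your PATH.

```shell
# Return 0 if the context name contains no underscores, 1 otherwise.
is_valid_context_name() {
  case "$1" in
    *_*) return 1 ;;
    *)   return 0 ;;
  esac
}

# Print every kubeconfig context whose name contains an underscore.
# Assumes kubectl is on the PATH.
list_invalid_contexts() {
  kubectl config get-contexts -o name | while IFS= read -r ctx; do
    is_valid_context_name "$ctx" || echo "$ctx"
  done
}
```

Running list_invalid_contexts prints only the contexts that need to be renamed. Empty output means that all context names are safe to use as SAN values.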
Step 1: Install cert-manager
In your management cluster, install cert-manager with either kubectl or Helm. For more information about installation options and versions, see the cert-manager documentation.
Verify that cert-manager was successfully installed.
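As an illustration of the Helm path, the installation and verification might look like the following sketch. It is not the original command from this guide. The function names are hypothetical, the chart version is left unpinned, and the installCRDs value is an assumption that depends on your cert-manager version (newer releases document a different CRD setting), so check the cert-manager documentation before you use it.

```shell
# Install cert-manager with Helm into the cluster behind the given context.
# Assumes helm is on the PATH. The chart version is not pinned here.
install_cert_manager() {
  ctx="$1"
  helm repo add jetstack https://charts.jetstack.io
  helm repo update
  helm install cert-manager jetstack/cert-manager \
    --kube-context "$ctx" \
    --namespace cert-manager \
    --create-namespace \
    --set installCRDs=true   # assumption: CRDs are installed by the chart
}

# List the cert-manager pods so that you can confirm they are Running.
verify_cert_manager() {
  kubectl get pods --namespace cert-manager --context "$1"
}
```

For example, install_cert_manager "$MGMT_CONTEXT" followed by verify_cert_manager "$MGMT_CONTEXT", where MGMT_CONTEXT is a placeholder for the context name that you saved earlier.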
Step 2: Set up Vault and generate the root and intermediate CAs
Create and securely store the relay root CA in HashiCorp Vault. Although this Vault setup is more secure than using the self-signing default setup, you might need to use a different setup to meet your production security requirements.
- If it doesn’t already exist, create the gloo-mesh namespace.
- If not added already, add and update the HashiCorp Helm repository in your management cluster.
- Install Vault in your management cluster.
- Enable Vault for root CA certificates along the pki path.
- Set up the root of trust. The following example uses solo.io, but replace these values with your own CA provider.
- Create an intermediate CA that is used for the relay server operations, along the pki_relay path. The key is kept internally, and a certificate signing request is created.
- Copy the CSR value, including the double quotes. You use this value later to sign and generate the certificate.
- Sign and generate the certificate. Replace $CSR with the value that you copied in the previous step.
- Copy the CERT value, including the double quotes. You use this value later to set the signed certificate.
- Set the signed certificate value. Replace $CERT with the value that you copied in the previous step.
- Get the external IP address of the LoadBalancer service for Vault.
- Create a cert-manager issuer for the CA, replacing $VAULT_IP with the external IP address that you previously retrieved.
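The PKI steps in this list (enable the pki path, generate the root, create the pki_relay intermediate, sign it, and set the signed certificate) can be sketched with the Vault CLI as follows. This is an illustration under stated assumptions, not the guide's original commands: it assumes that the vault CLI can reach your Vault instance (VAULT_ADDR and VAULT_TOKEN are set), the function name setup_relay_pki and its arguments are hypothetical, and it uses the -field flag to capture the CSR and certificate directly instead of copying the quoted values by hand.

```shell
# Assumes the vault CLI is on the PATH and VAULT_ADDR / VAULT_TOKEN are set.
# Args: root CA common name, intermediate CA common name, optional TTL.
setup_relay_pki() {
  root_cn="$1"
  relay_cn="$2"
  ttl="${3:-8760h}"   # assumption: pick a TTL that fits your policy

  # Root CA on the pki path. Enabling fails if the path is already mounted.
  vault secrets enable pki
  vault write pki/root/generate/internal common_name="$root_cn" ttl="$ttl"

  # Intermediate CA on the pki_relay path. The key stays inside Vault,
  # and only the certificate signing request is returned.
  vault secrets enable -path=pki_relay pki
  csr=$(vault write -field=csr pki_relay/intermediate/generate/internal \
    common_name="$relay_cn")

  # Sign the CSR with the root CA.
  cert=$(vault write -field=certificate pki/root/sign-intermediate \
    csr="$csr" format=pem_bundle ttl="$ttl")

  # Store the signed certificate on the intermediate path.
  vault write pki_relay/intermediate/set-signed certificate="$cert"
}
```

Capturing the values in shell variables avoids quoting mistakes when the CSR and certificate are passed between steps. If Vault runs only inside the cluster, you would run the same commands from inside the Vault pod instead.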
Step 3: Create the server TLS certificate for the management server
Generate the server TLS certificate that the Gloo management server uses for mutual TLS connections with Gloo agents.
Step 4: Create the client TLS certificate for the Gloo agent
In each workload cluster, generate a client TLS certificate for the Gloo agent.
- Configure the cert-manager installation on the workload cluster to authenticate with the Vault installation on the management cluster. The secret contains the Vault token to use for authentication.
- Create a cert-manager certificate that refers to the issuer that you set up in the previous step.
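To make the two resources concrete, the following sketch applies a Vault-backed cert-manager Issuer and a client Certificate. The spec fields (vault.server, vault.path, vault.auth.tokenSecretRef, issuerRef, dnsNames, usages) are standard cert-manager fields, but everything else is an assumption for illustration: the resource names vault-issuer and gloo-agent-client-cert, the token key, the Vault signing path, and the target secret name are hypothetical and must match your own Vault and Helm settings. The token secret must already exist in the gloo-mesh namespace.

```shell
# Create a Vault-backed cert-manager Issuer in the gloo-mesh namespace.
# Args: kube context, Vault address, Vault signing path, token secret name.
create_vault_issuer() {
  kubectl apply --context "$1" -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: vault-issuer
  namespace: gloo-mesh
spec:
  vault:
    server: $2
    path: $3
    auth:
      tokenSecretRef:
        name: $4
        key: token
EOF
}

# Request a client certificate whose DNS SAN is the registered cluster name.
# Args: kube context, cluster name, target secret name.
create_agent_certificate() {
  kubectl apply --context "$1" -f - <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: gloo-agent-client-cert
  namespace: gloo-mesh
spec:
  secretName: $3
  commonName: $2
  dnsNames:
    - $2
  usages:
    - client auth
    - digital signature
    - key encipherment
  issuerRef:
    name: vault-issuer
    kind: Issuer
EOF
}
```

The DNS name is set to the cluster name because the management server compares the registered cluster name with the SAN in the client certificate.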
Verify the cert-manager resources
For clusters that have cert-manager installed, verify that your cert-manager issuer and certificate resources are ready. If the READY column says False for any of the following resources, describe the resource for more details and resolve the issue before continuing.
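One way to run this check is sketched below. The function names are hypothetical, and the sketch assumes that the resources live in the gloo-mesh namespace. The second helper filters on the Ready condition so that only certificates that need attention are printed.

```shell
# List Issuer and Certificate resources with their READY status.
check_cert_resources() {
  kubectl get issuer,certificate --namespace gloo-mesh --context "$1"
}

# Print the names of Certificates whose Ready condition is not True.
not_ready_certificates() {
  kubectl get certificate --namespace gloo-mesh --context "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' |
    awk '$2 != "True" { print $1 }'
}
```

For any name that not_ready_certificates prints, kubectl describe certificate with that name shows the events and conditions that explain why issuance failed.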
Now that your custom certificates are created, continue to the next section to modify your Gloo Mesh deployment to use these certificates.
Step 5: Install the Gloo management server and agent
Set up Gloo Network to use the client and server TLS certificates that you created earlier.
Prepare the Helm installation settings for the Gloo management server.
Install a new Gloo management server, or upgrade an existing one, with the Helm settings from the previous step.
Prepare the Helm installation settings for the Gloo agent.
Register the workload cluster or upgrade an existing Gloo agent with the Helm settings from the previous step.
Verifying your relay certificate setup
- Check that the relay connection between the management server and workload agents is healthy.
  - Forward port 9091 of the gloo-mesh-mgmt-server pod to your localhost.
  - In your browser, connect to http://localhost:9091/metrics.
  - In the metrics UI, look for the following lines. If the values are 1, the agents in the workload clusters are successfully registered with the management server. If the values are 0, the agents are not successfully connected.
- Review the Gloo UI. Check that the Overall Mesh Status is healthy and that your remote clusters are registered without any configuration issues.
- If the setup is unsuccessful, continue to Troubleshooting.
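The metrics check can also be done from a terminal instead of a browser. The following sketch is an assumption-laden illustration: the function name fetch_mgmt_metrics is hypothetical, it assumes that gloo-mesh-mgmt-server is a deployment in the gloo-mesh namespace, and the default filter pattern relay is a guess at the relevant metric names, so pass your own pattern if the lines you need are named differently.

```shell
# Port-forward the management server, print matching metrics, then clean up.
# Args: kube context, optional case-insensitive filter pattern.
fetch_mgmt_metrics() {
  kubectl port-forward deploy/gloo-mesh-mgmt-server 9091:9091 \
    --namespace gloo-mesh --context "$1" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2   # give the port-forward a moment to start

  rc=0
  curl -s http://localhost:9091/metrics | grep -i "${2:-relay}" || rc=$?

  # Stop the background port-forward.
  kill "$pf_pid" 2>/dev/null || true
  return "$rc"
}
```

A non-zero return status means that no metric line matched the pattern, which is itself a hint that the relay connection was never established.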
Troubleshooting relay certificates
Review the health of your Gloo pods in the management and remote clusters.
Check that the gloo-mesh-mgmt-server and gloo-mesh-agent pods are running. If the pods are not running, describe the pods and check the State and Last State sections for error messages and reasons why the pod might not be healthy. For example, the following error messages in the gloo-mesh-mgmt-server and gloo-mesh-agent pods indicate that the secret is misnamed or missing. Check the secrets and names, upgrade your Helm installation, and try again.
- Example error message for gloo-mesh-mgmt-server pod:
- Example error message for gloo-mesh-agent pod:
Check the Kubernetes logs for the gloo-mesh-mgmt-server and gloo-mesh-agent pods in each cluster for errors. Look for errors during the grpc connection.
- For example, the following error message indicates that the gloo-mesh-mgmt-server load balancer IP address was set incorrectly for the agent during the Helm installation.
- The following gloo-mesh-agent pod error indicates that you need to follow the steps in ca.crt.
- The following errors indicate that the server or client TLS certificate is expired. Regenerate the certificate, restart the pods, and try again.
For gloo-mesh-agent pods, make sure that the cluster name matches the registered cluster name.
- Check the KubernetesCluster resources in the management cluster to get registered cluster names.
- Check that the registered cluster name matches the name in the client certificate that is issued by the root CA, specifically the DNS SAN extension.
- If the cluster names do not match, update the KubernetesCluster to have the same name, or re-issue the client certificate with the same name.
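To compare the registered cluster names with the certificate, you can print both. This sketch rests on a few assumptions: the function names are hypothetical, the secret name is a parameter because it depends on your Helm settings, the KubernetesCluster resources are assumed to be in the gloo-mesh namespace, and the -ext option requires OpenSSL 1.1.1 or later.

```shell
# List the cluster names that are registered with the management server.
list_registered_clusters() {
  kubectl get kubernetescluster --namespace gloo-mesh --context "$1"
}

# Print the expiry date and SANs of the certificate stored in a TLS secret.
# Args: kube context, name of the secret that holds tls.crt.
inspect_relay_cert() {
  kubectl get secret "$2" --namespace gloo-mesh --context "$1" \
    -o jsonpath='{.data.tls\.crt}' |
    base64 -d |
    openssl x509 -noout -enddate -ext subjectAltName
}
```

The notAfter line also covers the expired-certificate case: if the date is in the past, regenerate the certificate and restart the pods.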
If you still have issues, review the Known issues.
Known issues
ca.crt
Although the ca.crt is included in the gloo-mesh-agent certificate secret, the gloo-mesh-agent still expects it to exist separately in the remote cluster. To copy it from the management cluster into the remote clusters, you can run the following command. Make sure to update $CLUSTER_NAME with your remote cluster name.
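A copy of this kind could look like the following sketch. It is not the original command from this guide: the function name copy_ca_crt is hypothetical, and the source and destination secret names are passed as parameters because the names that the agent expects are not confirmed here. Check your Helm settings for the actual names.

```shell
# Copy ca.crt from a secret in the management cluster into a new secret
# in a remote cluster.
# Args: management context, remote context, source secret, destination secret.
copy_ca_crt() {
  mgmt_ctx="$1"; remote_ctx="$2"; src_secret="$3"; dst_secret="$4"
  tmp=$(mktemp)

  # Extract and decode the CA certificate from the management cluster.
  kubectl get secret "$src_secret" --namespace gloo-mesh --context "$mgmt_ctx" \
    -o jsonpath='{.data.ca\.crt}' | base64 -d > "$tmp"

  # Create the standalone secret in the remote cluster.
  kubectl create secret generic "$dst_secret" --namespace gloo-mesh \
    --context "$remote_ctx" --from-file=ca.crt="$tmp"

  rm -f "$tmp"
}
```

Run the function once per remote cluster, passing that cluster's context as the second argument.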