Manage Istio certificates with Vault
Vault is a popular open source secret management tool that also supports public key infrastructure (PKI). With Vault, you can securely store your private keys and issue new intermediate or leaf certificates.
For multicluster traffic, you can establish a shared root of trust by using a single root CA and, for each cluster, an intermediate CA that is signed by that root CA. This guide shows you how to configure Istio and Gloo Mesh so that Vault stores the root CA and generates the intermediate CA that Istio uses on each cluster to sign its workload certificates, as shown in the following figure.
Figure: Using Gloo Mesh to configure Istio to use Vault for the intermediate CA across clusters.
In addition to using Vault for the intermediate CA, you can use Gloo Mesh Enterprise to get added security benefits. The Gloo Mesh Enterprise integration with Vault uses the istiod-agent, which runs as a sidecar to the istiod pod and communicates with Vault to request private keys and to sign certificates. In this setup, Gloo Mesh loads the private key directly into the pod filesystem, which adds a layer of security because the key is never saved to etcd or any other permanent storage. When the pod is deleted, the private key is deleted with it.
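For example, after you complete this guide, you can confirm that the key material stays in memory by checking that the cacerts volume of the istiod deployment is an in-memory emptyDir rather than a secret. This check is a sketch: it assumes that your istiod deployment is named istiod in the istio-system namespace, and that ${REMOTE_CONTEXT} (set later in this guide) points to a workload cluster.
kubectl get deployment istiod -n istio-system --context ${REMOTE_CONTEXT} \
  -o jsonpath='{.spec.template.spec.volumes[?(@.name=="cacerts")]}'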
Note that while this guide describes how to use Vault for the Istio intermediate CA, you can also use Vault to generate and manage the CAs for the Gloo Mesh Enterprise relay server and agent certificates. For example, you can follow the example in the Generating relay certificates guide to set up a relay intermediate CA.
Before you begin
- Complete the multicluster getting started guide to set up the following testing environment.
- Three clusters along with environment variables for the clusters and their Kubernetes contexts.
- The Gloo Platform CLI, meshctl, along with other CLI tools such as kubectl and istioctl.
- The Gloo management server in the management cluster, and the Gloo agents in the workload clusters.
- Istio installed in the workload clusters.
- A simple Gloo workspace setup.
- Install Bookinfo and other sample apps.
- The default openssl version that is included in macOS is LibreSSL, which does not work with these instructions. Make sure that you have the OpenSSL version of openssl, not LibreSSL. The openssl version must be at least 1.1.
  - Check the openssl version that is installed. If you see LibreSSL in the output, continue to the next step.
    openssl version
  - Install the OpenSSL version (not LibreSSL). For example, you might use Homebrew.
    brew install openssl
  - Review the output of the OpenSSL installation for the path of the binary file. You can choose to export the binary to your path, or call the entire path whenever the following steps use an openssl command.
    - For example, openssl might be installed along the following path: /usr/local/opt/openssl@3/bin/
    - To run commands, you can prepend the path so that your terminal uses this installed version of OpenSSL, and not the default LibreSSL.
      /usr/local/opt/openssl@3/bin/openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650...
- Save the kubeconfig contexts for your clusters. Run kubectl config get-contexts, look for your cluster in the CLUSTER column, and get the context name in the NAME column. Note: Do not use context names with underscores. The context name is used as a SAN specification in the generated certificate that connects workload clusters to the management cluster, and underscores in SANs are not FQDN compliant. You can rename a context by running kubectl config rename-context "<oldcontext>" <newcontext>.
export MGMT_CLUSTER=<mgmt-cluster-name>
export REMOTE_CLUSTER=<remote-cluster-name>
export MGMT_CONTEXT=<management-cluster-context>
export REMOTE_CONTEXT=<remote-cluster-context>
Install Vault
- If not added already, add the HashiCorp Helm repository to your management cluster.
helm repo add hashicorp https://helm.releases.hashicorp.com --kube-context ${MGMT_CONTEXT}
helm repo update
- Generate a root CA certificate and key for Vault. You can update the -subj field to your domain.
openssl req -new -newkey rsa:4096 -x509 -sha256 \
  -days 3650 -nodes -out root-cert.pem -keyout root-key.pem \
  -subj "/O=solo.io"
- In the management cluster, install Vault in dev mode and enable debugging logs. For more information about setting up Vault in Kubernetes, see the Vault docs.
helm install -n vault vault hashicorp/vault --set "injector.enabled=false" --set "server.logLevel=debug" --set "server.dev.enabled=true" --set "server.service.type=LoadBalancer" --kube-context="${MGMT_CONTEXT}" --create-namespace
kubectl --context="${MGMT_CONTEXT}" wait --for=condition=Ready -n vault pod/vault-0
Example output:
pod/vault-0 condition met
- Enable the Vault userpass auth method and create an admin user.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault auth enable userpass'
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault write auth/userpass/users/admin password=admin policies=admins'
Example output:
Success! Enabled userpass auth method at: userpass/
Success! Data written to: auth/userpass/users/admin
- Enable the Kubernetes auth method in Vault along a path for the workload cluster. Replace ${REMOTE_CLUSTER} with your cluster's name.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault auth enable -path=kube-${REMOTE_CLUSTER}-mesh-auth kubernetes'
Example output:
Success! Enabled kubernetes auth method at: kube-${REMOTE_CLUSTER}-mesh-auth/
- Set up environment variables with the workload cluster's service account token details and cluster address, which you use in the next step to configure Kubernetes authentication for Vault.
VAULT_SA_NAME_C1=$(kubectl --context $REMOTE_CONTEXT get sa istiod-service-account -n istio-system -o jsonpath="{.secrets[*]['name']}")
SA_TOKEN_C1=$(kubectl --context $REMOTE_CONTEXT get secret $VAULT_SA_NAME_C1 -n istio-system -o 'go-template={{ .data.token }}' | base64 --decode)
SA_CA_CRT_C1=$(kubectl config view --raw -o json | jq -r --arg wc $REMOTE_CONTEXT '. as $c | $c.contexts[] | select(.name == $wc) as $context | $c.clusters[] | select(.name == $context.context.cluster) | .cluster."certificate-authority-data"' | base64 -d)
K8S_ADDR_C1=$(kubectl config view -o json | jq -r --arg wc $REMOTE_CONTEXT '. as $c | $c.contexts[] | select(.name == $wc) as $context | $c.clusters[] | select(.name == $context.context.cluster) | .cluster.server')
echo $VAULT_SA_NAME_C1
echo $SA_TOKEN_C1
echo $SA_CA_CRT_C1
echo $K8S_ADDR_C1
Example output:
eyJhbG...
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
https://34.xxx.xxx.xxx
- Set the Kubernetes auth config for Vault to the mounted service account token.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c "vault write auth/kube-${REMOTE_CLUSTER}-mesh-auth/config \
  token_reviewer_jwt="$SA_TOKEN_C1" \
  kubernetes_host="$K8S_ADDR_C1" \
  kubernetes_ca_cert='$SA_CA_CRT_C1' \
  disable_local_ca_jwt="true" \
  issuer='https://kubernetes.default.svc.cluster.local'"
Example output:
Success! Data written to: auth/kube-${REMOTE_CLUSTER}-mesh-auth/config
- Bind the istiod service account to the Vault PKI policy. Replace ${REMOTE_CLUSTER} with your cluster's name.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c "vault write \
  auth/kube-${REMOTE_CLUSTER}-mesh-auth/role/gen-int-ca-istio-${REMOTE_CLUSTER}-mesh \
  bound_service_account_names=istiod-service-account \
  bound_service_account_namespaces=istio-system \
  policies=gen-int-ca-istio-${REMOTE_CLUSTER}-mesh \
  ttl=720h"
Example output:
Success! Data written to: auth/kube-${REMOTE_CLUSTER}-mesh-auth/role/gen-int-ca-istio-${REMOTE_CLUSTER}-mesh
- Initialize the Vault PKI secrets engine.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault secrets enable pki'
Example output:
Success! Enabled the pki secrets engine at: pki/
- Set the Vault CA to the pem_bundle of the root certificate and key that you generated earlier.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c "vault write -format=json pki/config/ca pem_bundle=\"$(cat root-key.pem root-cert.pem)\""
Example output:
{ "request_id": "2aa29fd6-9fa3-3edd-2f8b-2a0e4c007e8c", "lease_id": "", "lease_duration": 0, "renewable": false, "data": { "imported_issuers": null, "imported_keys": null, "mapping": { "aa877391-b4f2-045d-63da-33521c91dc68": "8257875c-4016-f28e-288b-ecca33065097" } }, "warnings": null }
- Enable the Vault intermediate cert path. Replace ${REMOTE_CLUSTER} with your cluster's name.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault secrets enable -path=pki_int_${REMOTE_CLUSTER} pki'
Example output:
Success! Enabled the pki secrets engine at: pki_int_${REMOTE_CLUSTER}/
- Set the policy for the intermediate cert path. Replace ${REMOTE_CLUSTER} with your cluster's name.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault policy write gen-int-ca-istio-${REMOTE_CLUSTER}-mesh - <<EOF
path "pki_int_${REMOTE_CLUSTER}/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}
path "pki/cert/ca" {
  capabilities = ["read"]
}
path "pki/root/sign-intermediate" {
  capabilities = ["create", "read", "update", "list"]
}
EOF'
Example output:
Success! Uploaded policy: gen-int-ca-istio-${REMOTE_CLUSTER}-mesh
- Repeat the cluster-specific steps in this section, from enabling Vault authentication along a path for the workload cluster through setting the policy for the intermediate cert path, for each workload cluster.
Now that Vault is set up in your clusters, you can use Vault as an intermediate CA provider. If you see any errors, review the troubleshooting section.
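To double-check the Vault configuration before you continue, you can optionally list the auth methods and secrets engines that you enabled. The output should include the userpass and kube-<cluster>-mesh-auth auth methods, and the pki and pki_int_<cluster> secrets engines for each workload cluster.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault auth list'
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c 'vault secrets list'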
Update Gloo RBAC
The istiod-agent sidecar in each cluster needs to read and modify Gloo resources. To enable the necessary RBAC permissions, update the gloo-agent Helm release. You can update the Helm release by adding the following snippet to the YAML configuration file in your GitOps pipeline, or directly with the helm upgrade command.
istiodSidecar:
  createRoleBinding: true
- Set the Gloo version as an environment variable.
export GLOO_VERSION=2.2.4
- Make sure that you have the Helm repo for the Gloo agent. Note that you might have a different name for the Helm repo, such as gloo-mesh-agent.
helm repo add gloo-agent https://storage.googleapis.com/gloo-mesh-enterprise/gloo-mesh-agent --kube-context ${REMOTE_CONTEXT}
helm repo update --kube-context ${REMOTE_CONTEXT}
- Upgrade the Helm chart with the required RBAC permission. Note that you might have a different name for the Helm repo or release, such as gloo-mesh-agent.
helm get values -n gloo-mesh gloo-agent --kube-context=${REMOTE_CONTEXT} > ${REMOTE_CLUSTER}-values.yaml
echo "istiodSidecar:" >> ${REMOTE_CLUSTER}-values.yaml
echo "  createRoleBinding: true" >> ${REMOTE_CLUSTER}-values.yaml
echo "  istiodServiceAccount:" >> ${REMOTE_CLUSTER}-values.yaml
echo "    name: istiod-service-account" >> ${REMOTE_CLUSTER}-values.yaml
echo "    namespace: istio-system" >> ${REMOTE_CLUSTER}-values.yaml
helm upgrade -n gloo-mesh gloo-agent gloo-agent/gloo-mesh-agent --kube-context="${REMOTE_CONTEXT}" --version=$GLOO_VERSION -f ${REMOTE_CLUSTER}-values.yaml
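To confirm that the upgrade took effect, you can check that a cluster role binding for the istiod service account now exists in the workload cluster. The exact binding name that the Helm chart creates is not shown in this guide, so this sketch simply filters for istiod.
kubectl get clusterrolebindings --context ${REMOTE_CONTEXT} | grep -i istiod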
Modify istiod
So far, you set up the Gloo Mesh agent on each cluster to use Vault for the intermediate CA. Now, you can modify your Istio installation to support fetching and dynamically reloading the intermediate CA from Vault.
- Get the Gloo management plane version that runs in your management cluster.
export MGMT_PLANE_VERSION=$(meshctl version --kubecontext $MGMT_CONTEXT | jq '.server[].components[] | select(.componentName == "gloo-mesh-mgmt-server") | .images[] | select(.name == "gloo-mesh-mgmt-server") | .version')
echo $MGMT_PLANE_VERSION
Example output:
"2.2.4"
- Get your istiod deployment. Choose from the following options.
- If you did not deploy Istio yet, you can use an example Istio operator deployment as described in Deploy Istio in production.
- If you already deployed Istio, get your current deployment configuration. For example, you might check your GitOps configuration or run the following command to review the Istio operator configuration.
kubectl get istiooperator -n istio-system --context $REMOTE_CONTEXT -o yaml > istio-operator.yaml
- Update the istiod deployment with the gloo-mesh-istiod-agent sidecar to load and store the Vault certificates. For most installations, you use an Istio operator to manage the istiod deployment, and you can add an overlay section to the Istio operator configuration. If you did not use an Istio operator to manage istiod, such as for quick testing in local Kind clusters, you can patch the deployment instead, as shown after the operator steps.
- In the spec.components.pilot.k8s section of your Istio operator configuration file, add the following overlay. Replace $MGMT_PLANE_VERSION with the version that you got in the previous step.
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: production-istio
  namespace: istio-system
spec:
  components:
    pilot:
      k8s:
        overlays:
        - apiVersion: apps/v1
          kind: Deployment
          name: istiod
          patches:
          # override istiod cacerts volume
          - path: spec.template.spec.volumes[name:cacerts]
            value:
              name: cacerts
              secret: null
              emptyDir:
                medium: Memory
          # override istiod istiod-agent container to use Solo.io istiod-agent build
          - path: spec.template.spec.containers[1]
            value:
              name: istiod-agent
              image: gcr.io/gloo-mesh/gloo-mesh-istiod-agent:$MGMT_PLANE_VERSION
              imagePullPolicy: IfNotPresent
              volumeMounts:
              - mountPath: /etc/cacerts
                name: cacerts
              args:
              - sidecar
              env:
              - name: PILOT_CERT_PROVIDER
                value: istiod
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.name
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
              - name: SERVICE_ACCOUNT
                valueFrom:
                  fieldRef:
                    apiVersion: v1
                    fieldPath: spec.serviceAccountName
          # override istiod istiod-agent-init init-container to use Solo.io istiod-agent-init build
          - path: spec.template.spec.initContainers
            value:
            - name: istiod-agent-init
              image: gcr.io/gloo-mesh/gloo-mesh-istiod-agent:$MGMT_PLANE_VERSION
              imagePullPolicy: IfNotPresent
              volumeMounts:
              - mountPath: /etc/cacerts
                name: cacerts
              args:
              - init-container
              env:
              - name: PILOT_CERT_PROVIDER
                value: istiod
              - name: POD_NAME
                valueFrom:
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.name
              - name: POD_NAMESPACE
                valueFrom:
                  fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
              - name: SERVICE_ACCOUNT
                valueFrom:
                  fieldRef:
                    apiVersion: v1
                    fieldPath: spec.serviceAccountName
- Apply the updated Istio operator configuration to your cluster. Your GitOps workflow might apply changes automatically.
kubectl apply --context ${REMOTE_CONTEXT} -f istio-operator.yaml
- If you did not use an Istio operator to manage istiod, patch the istiod deployment directly instead. Replace $MGMT_PLANE_VERSION in the patch with the version that you got earlier.
kubectl patch -n istio-system deploy/istiod --patch '{ "spec": { "template": { "spec": { "initContainers": [ { "args": [ "init-container" ], "env": [ { "name": "PILOT_CERT_PROVIDER", "value": "istiod" }, { "name": "POD_NAME", "valueFrom": { "fieldRef": { "apiVersion": "v1", "fieldPath": "metadata.name" } } }, { "name": "POD_NAMESPACE", "valueFrom": { "fieldRef": { "apiVersion": "v1", "fieldPath": "metadata.namespace" } } }, { "name": "SERVICE_ACCOUNT", "valueFrom": { "fieldRef": { "apiVersion": "v1", "fieldPath": "spec.serviceAccountName" } } } ], "volumeMounts": [ { "mountPath": "/etc/cacerts", "name": "cacerts" } ], "imagePullPolicy": "IfNotPresent", "image": "gcr.io/gloo-mesh/gloo-mesh-istiod-agent:$MGMT_PLANE_VERSION", "name": "istiod-agent-init" } ], "containers": [ { "args": [ "sidecar" ], "env": [ { "name": "PILOT_CERT_PROVIDER", "value": "istiod" }, { "name": "POD_NAME", "valueFrom": { "fieldRef": { "apiVersion": "v1", "fieldPath": "metadata.name" } } }, { "name": "POD_NAMESPACE", "valueFrom": { "fieldRef": { "apiVersion": "v1", "fieldPath": "metadata.namespace" } } }, { "name": "SERVICE_ACCOUNT", "valueFrom": { "fieldRef": { "apiVersion": "v1", "fieldPath": "spec.serviceAccountName" } } } ], "volumeMounts": [ { "mountPath": "/etc/cacerts", "name": "cacerts" } ], "imagePullPolicy": "IfNotPresent", "image": "gcr.io/gloo-mesh/gloo-mesh-istiod-agent:$MGMT_PLANE_VERSION", "name": "istiod-agent" } ], "volumes": [ { "name": "cacerts", "secret": null, "emptyDir": { "medium": "Memory" } } ] } } } }'
- Repeat the previous step for each workload cluster with Istio installed.
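To verify that the overlay or patch was applied, you can check that the istiod deployment template now includes the istiod-agent sidecar and the istiod-agent-init init container. This sketch assumes that your istiod deployment is named istiod.
kubectl get deployment istiod -n istio-system --context ${REMOTE_CONTEXT} \
  -o jsonpath='{.spec.template.spec.containers[*].name}{"\n"}{.spec.template.spec.initContainers[*].name}{"\n"}'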
Enable Vault as an intermediate CA provider
Now, federate the two meshes together by using Gloo with Vault to establish trusted communication across the service meshes.
- Get the endpoint for the Vault service in the management cluster.
export VAULT_ENDPOINT="http://$(kubectl get svc/vault -n vault --context $MGMT_CONTEXT -o jsonpath='{.status.loadBalancer.ingress[0].*}')"
echo $VAULT_ENDPOINT
Example output:
http://35.xxx.xxx.xxx
- Get the name of your Istio mesh.
export MESH=$(kubectl get meshes -n gloo-mesh --context $REMOTE_CONTEXT -o jsonpath='{.items[*].metadata.name}')
echo $MESH
- Create a root trust policy for the workload cluster so that the istiod agent on the workload cluster knows how to communicate with Vault on the management cluster. For more information about root trust policies, see the API docs.
kubectl apply --context ${REMOTE_CONTEXT} -f - << EOF
apiVersion: admin.gloo.solo.io/v2
kind: RootTrustPolicy
metadata:
  name: ${REMOTE_CLUSTER}
  namespace: gloo-mesh
spec:
  applyToMeshes:
  - istio:
      clusterSelector:
        mesh: ${MESH}
        namespace: istio-system
      selector:
        app: istiod
        vault: ${REMOTE_CLUSTER}
  config:
    agentCa:
      vault:
        caPath: pki/root/sign-intermediate
        csrPath: pki_int_${REMOTE_CLUSTER}/intermediate/generate/exported
        server: $VAULT_ENDPOINT:8200
        kubernetesAuth:
          mountPath: /v1/auth/kube-${REMOTE_CLUSTER}-mesh-auth
          role: gen-int-ca-istio-${REMOTE_CLUSTER}-mesh
EOF
- Restart the istiod deployment. Note that you cannot update Istio resources until istiod is running again.
kubectl rollout restart deployment -l app=istiod -n istio-system --context ${REMOTE_CONTEXT}
- Repeat the previous steps for each workload cluster with Istio.
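After istiod restarts, the istiod-agent requests an intermediate CA from Vault. As a quick check before the full verification in the next section, you can list the issued certificates that Gloo tracks in each workload cluster (the same resources that the next section describes in more detail).
kubectl get issuedcertificates -n istio-system --context ${REMOTE_CONTEXT}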
Verify traffic uses the root CA
Now that the Istio control plane is patched with the gloo-mesh-istiod-agent sidecar, you can verify that all of the service mesh traffic is secured by using the root CA that you generated for Vault in the previous section.
To verify, you can check the root-cert.pem in the istio-ca-root-cert config map that Istio propagates for the initial TLS connection. The following example checks the propagated root-cert.pem against the local certificate that you supplied to Vault in the previous section.
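Optionally, before you compare the certificates, you can inspect the local root certificate to confirm its subject and validity period. This assumes that the root-cert.pem file from the Install Vault section is in your current directory.
openssl x509 -in root-cert.pem -noout -subject -issuer -dates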
- Check the Vault version that the management cluster runs.
kubectl --context="${MGMT_CONTEXT}" exec -n vault vault-0 -- /bin/sh -c "vault version"
Example output:
Vault v1.11.3 (17250b25303c6418c283c95b1d5a9c9f16174fe8), built 2022-08-26T10:27:10Z
- Check the root trust policy for errors.
kubectl describe RootTrustPolicy ${REMOTE_CLUSTER} -n gloo-mesh --context ${REMOTE_CONTEXT}
- Check the mesh for errors.
kubectl describe mesh ${MESH} -n gloo-mesh --context ${REMOTE_CONTEXT}
- From your terminal, navigate to the same directory as the root-cert.pem file that you previously created. Or, if you are using an existing Vault deployment, save the root certificate as root-cert.pem.
- Check the difference between the root certificate that istiod uses and the Vault root certificate. If installed correctly, the files are the same.
kubectl --context=$REMOTE_CONTEXT get cm -n bookinfo istio-ca-root-cert -ojson | jq -r '.data["root-cert.pem"]' | diff -q root-cert.pem -
- If you see that the files differ, check the istiod logs.
kubectl logs -n istio-system --context ${REMOTE_CONTEXT} $(kubectl get pods -n istio-system -l app=istiod --context ${REMOTE_CONTEXT} | cut -d" " -f1 | tail -1) > istiod-logs.txt
- Check the issued certificates for errors.
kubectl describe issuedcertificates -n istio-system --context ${REMOTE_CONTEXT}
For more troubleshooting steps, see Troubleshoot errors with the Vault setup or Debug Istio.
Rotate certificates for Istio workloads
When certificates are issued, pods that are managed by Istio must be restarted to ensure they pick up the new certificates. The certificate issuer creates a PodBounceDirective, which contains the namespaces and labels of the pods that must be restarted. For more information about how certificate rotation works in Istio, review the video series in this blog post.
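To see which namespaces and pod labels an upcoming restart targets, you can list the PodBounceDirective resources. The resource name and namespace in this sketch are assumptions for a typical setup where the directives are created in the istio-system namespace of each workload cluster; adjust them for your environment.
kubectl get podbouncedirectives -n istio-system --context ${REMOTE_CONTEXT} -o yaml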
Note: To avoid potential downtime for your apps in production, disable the PodBounceDirective feature by setting autoRestartPods to false. Then, control pod restarts in another way, such as a rolling update.
- Get your root trust policies.
kubectl get roottrustpolicy --context ${MGMT_CONTEXT} -A
- In the root trust policy, remove the autoRestartPods field or set it to false.
kubectl edit roottrustpolicy --context ${MGMT_CONTEXT} -n <namespace> <root-trust-policy>
apiVersion: admin.gloo.solo.io/v2
kind: RootTrustPolicy
metadata:
  name: istio-ingressgateway
  namespace: gloo-mesh
spec:
  config:
    autoRestartPods: false
    ...
- To ensure pods pick up the new certificates, restart the istiod pod in each remote cluster.
kubectl --context ${REMOTE_CONTEXT} -n istio-system patch deployment istiod \
  -p "{\"spec\":{\"template\":{\"metadata\":{\"labels\":{\"date\":\"`date +'%s'`\"}}}}}"
- Restart your app pods that are managed by Istio, such as by using a rolling update strategy.
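For example, to trigger a rolling restart of the Bookinfo sample workloads from the prerequisites, you might run a command similar to the following. This is a sketch that assumes the apps run in the bookinfo namespace; adjust the namespace and resources for your environment.
kubectl rollout restart deployment -n bookinfo --context ${REMOTE_CONTEXT}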
Troubleshoot errors with the Vault setup
If you have errors with the steps to install Vault, review the following table.
Error | Description |
---|---|
Error from server (NotFound): pods “vault-0” not found. Error from server (BadRequest): pod vault-0 does not have a host assigned. | The Vault pod might not be running. Check the pod status, troubleshoot any issues, wait for the pod to start, and try again. |
* path is already in use | You have already set up that path. If you already ran the script, you can ignore this message. |
Error writing data to pki/config/ca: Error making API request. Code: 400. Errors: * the given certificate is not marked for CA use and cannot be used with this backend command terminated with exit code 2 | If you are using macOS, you might have the default LibreSSL version. Set up OpenSSL instead. For more information, see Before you begin. |
Example script
You can review or adapt the following example script for your own use.
Environment details:
- 3 cluster setup: 1 management cluster and 2 workload clusters
- Gloo installed on all clusters
- Istio installed on the workload clusters, including the httpbin sample app
The script organizes the functions into the following commands that you can run.
- Copy the GitHub Gist, also rendered after these steps.
- Make sure to update the environment variables at the beginning of the script for the Gloo version, management, and workload cluster contexts that you want to use.
- Read and execute the Vault script.
source ~/Downloads/lib.sh
- Execute the Vault functions in order. If you notice errors, try running them one at a time, or refer to the troubleshooting section.
  - Run all functions at once:
    vault-install-all
  - Run the functions separately, one at a time:
    - Install Vault on the management cluster.
      vault-install
    - Enable Vault authentication for Kubernetes.
      vault-enable-kube-auth
    - Set up the CA in Vault.
      vault-setup-ca
- Verify Vault. Note that this verification assumes you have httpbin on each workload cluster in the httpbin namespace.
vault-verify