Architecture

A Gloo Mesh setup consists of one management cluster that the Gloo Mesh Enterprise management components are installed in, and one or more remote clusters that run service meshes, which are registered with and managed by the management cluster. The management cluster serves as the management plane, and the remote clusters serve as the data plane.

Gloo Mesh Architecture

You can think of Gloo Mesh as a management plane for multiple service mesh control planes. When a remote cluster is registered with Gloo Mesh, the management plane can begin managing that cluster by discovering workloads, pushing out configurations, unifying the trust model, scraping metrics, and more.

Components

Relay server on the management cluster

When you install Gloo Mesh Enterprise in the management cluster, a deployment named enterprise-networking runs the relay server. The relay server is exposed by the enterprise-networking service on a default port of 9900/TCP. The management cluster requires a configured ingress point that allows remote clusters to communicate with the relay server. You can provide this ingress point through an ingress gateway, such as an Istio or Gloo Edge gateway, or by setting the enterprise-networking service type to LoadBalancer.
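
For example, the following is a minimal sketch of exposing the relay server through a LoadBalancer service. The service name and port match the defaults described above; the gloo-mesh namespace and the selector labels are assumptions to verify against your installation.

```yaml
# Sketch only: expose the relay server with a LoadBalancer service.
# The service name and port 9900/TCP are the defaults described above;
# the gloo-mesh namespace and the selector labels are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: enterprise-networking
  namespace: gloo-mesh
spec:
  type: LoadBalancer
  selector:
    app: enterprise-networking
  ports:
    - name: grpc
      port: 9900
      targetPort: 9900
      protocol: TCP
```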

Relay agents on remote clusters

When you register remote clusters to be managed by Gloo Mesh Enterprise, a deployment named enterprise-agent is created on each cluster to run the relay agent. The relay agent is exposed by the enterprise-agent service on the default ports of 9988 and 9977. Because all communication is initiated outbound from the remote clusters to the management cluster, no ingress point needs to be configured for the relay agent.
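
For reference, the in-cluster service for the relay agent might look similar to the following sketch. The service name and ports are the defaults described above; the namespace, port names, and selector labels are assumptions.

```yaml
# Sketch only: the in-cluster service that exposes the relay agent.
# The service name and ports 9988 and 9977 are the defaults described
# above; the namespace, port names, and selector labels are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: enterprise-agent
  namespace: gloo-mesh
spec:
  selector:
    app: enterprise-agent
  ports:
    - name: grpc
      port: 9988
      targetPort: 9988
    - name: http
      port: 9977
      targetPort: 9977
```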

Agent-server communication

Communication between the management and data planes is initiated by the relay agents in the remote clusters, which connect to the relay server in the management cluster. The following steps outline the general flow of how the relay agents and server communicate to keep your multimesh environment up to date:

  1. A remote cluster is registered with the Gloo Mesh management plane (see the registration sketch after this list). The relay agent in the remote cluster establishes an mTLS-secured gRPC connection to the relay server in the management cluster.
  2. The relay agent in the remote cluster sends a snapshot of its state to the relay server. The mesh discovery components of the management plane translate the snapshot into custom resources, which form a complete view of the meshes and mesh entities across all remote clusters.
  3. You create Gloo Mesh resources to configure multimesh settings. The mesh networking components of the management plane translate these resources into mesh-level and service-level configuration updates for the meshes and mesh entities in remote clusters.
  4. The relay agent in the remote cluster pulls the updates from the relay server and applies the updates to the mesh control plane in the remote cluster.
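
As a sketch of step 1, registration results in a resource on the management cluster that represents the remote cluster. The following assumes the multicluster.solo.io/v1alpha1 KubernetesCluster API and a hypothetical cluster name; verify both against your Gloo Mesh release.

```yaml
# Sketch only: a KubernetesCluster resource that represents a registered
# remote cluster on the management cluster. The cluster name is
# hypothetical; verify the API version against your Gloo Mesh release.
apiVersion: multicluster.solo.io/v1alpha1
kind: KubernetesCluster
metadata:
  name: cluster-1
  namespace: gloo-mesh
spec:
  clusterDomain: cluster.local
```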

Secure communication

To validate authenticity when a remote cluster is registered, the relay agent sends a token value, which is defined in the relay-identity-token-secret secret on the remote cluster, to the relay server. The token must match the value that is stored in the relay-identity-token-secret secret on the management cluster, which is created during deployment of the relay server.
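
For illustration, the token might be stored as a standard Kubernetes secret like the following sketch; the namespace and the key name are assumptions to verify against your installation.

```yaml
# Sketch only: the shared relay identity token as a Kubernetes secret.
# The secret name comes from the text above; the namespace and the
# "token" key name are assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: relay-identity-token-secret
  namespace: gloo-mesh
type: Opaque
data:
  token: <base64-encoded-shared-token>
```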

After the token is validated, the relay server generates a TLS certificate for the relay agent. All future communication from relay agents to the server, which uses the gRPC protocol, is secured by mTLS with this certificate. Note that you can use self-signed certificates or certificates from your own PKI to secure server-agent communication.

Mesh discovery

Each relay agent performs mesh discovery for the cluster that it is deployed to. The relay agent constructs a snapshot of the actual state of discovered entities, such as service meshes and services, in the remote cluster. Currently, Gloo Mesh discovers and manages Istio meshes.

The agent then pushes this snapshot of the remote cluster state to the relay server. When the relay server receives a snapshot, the mesh discovery components of the management plane translate the discovered entities in the snapshot into custom resources.

These custom resources provide the relay server with the complete state of the service meshes, workloads, and destinations across the multicluster, multimesh environment. When you make cross-mesh configuration changes, the relay server uses this state to create configuration updates for the remote clusters.
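
For example, a discovered Kubernetes service might be represented by a Destination resource similar to the following. Discovered resources are written by Gloo Mesh itself, not by users; the API version, names, and fields here are assumptions for illustration only.

```yaml
# Sketch only: a Destination resource that Gloo Mesh discovery might write
# for a Kubernetes service in a remote cluster. All names are hypothetical.
apiVersion: discovery.mesh.gloo.solo.io/v1
kind: Destination
metadata:
  name: reviews-bookinfo-cluster-1
  namespace: gloo-mesh
spec:
  kubeService:
    ref:
      name: reviews
      namespace: bookinfo
      clusterName: cluster-1
```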

Mesh networking

While the mesh discovery components discover the state of resources in remote clusters, the mesh networking components federate individual service meshes across remote clusters. The VirtualMesh custom resource federates multiple service meshes into a single managed construct.
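
For example, a VirtualMesh that federates the Istio meshes in two registered clusters might look similar to the following sketch. The mesh names follow the istiod-<namespace>-<cluster> pattern commonly produced by mesh discovery, but are assumptions here.

```yaml
# Sketch only: federate two discovered Istio meshes into one virtual mesh
# with a shared root of trust. Mesh names are hypothetical and depend on
# what mesh discovery reports in your environment.
apiVersion: networking.mesh.gloo.solo.io/v1
kind: VirtualMesh
metadata:
  name: virtual-mesh
  namespace: gloo-mesh
spec:
  mtlsConfig:
    autoRestartPods: true
    shared:
      rootCertificateAuthority:
        generated: {}
  federation:
    selectors:
    - {}
  meshes:
  - name: istiod-istio-system-cluster-1
    namespace: gloo-mesh
  - name: istiod-istio-system-cluster-2
    namespace: gloo-mesh
```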

These mesh networking components configure mesh-level and service-level settings across multiple service meshes.

Configuration updates and state reconciliation

The relay server watches for user-provided configuration updates in the management cluster. For example, you might create a TrafficPolicy or AccessPolicy for Gloo Mesh. Using the information that the mesh discovery components gathered about the individual mesh control planes, workloads, and destinations across remote clusters, the mesh networking components automatically translate your Gloo Mesh resources into resources that are specific to each remote mesh. For example, if a relay agent reports that its cluster runs an Istio service mesh, the relay server translates your Gloo Mesh resources into Istio VirtualService, DestinationRule, and AuthorizationPolicy resources.
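
For instance, a TrafficPolicy that sets a request timeout for a service in a remote cluster might look like the following sketch; the service, namespace, and cluster names are hypothetical.

```yaml
# Sketch only: a TrafficPolicy that applies a 5s request timeout to a
# hypothetical "reviews" service in a registered cluster. The relay server
# would translate this into mesh-specific resources for that cluster.
apiVersion: networking.mesh.gloo.solo.io/v1
kind: TrafficPolicy
metadata:
  name: reviews-timeout
  namespace: gloo-mesh
spec:
  destinationSelector:
  - kubeServiceRefs:
      services:
      - name: reviews
        namespace: bookinfo
        clusterName: cluster-1
  policy:
    requestTimeout: 5s
```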

The relay server then reconciles this declared state with the actual state of the remote clusters and creates configuration updates. The relay agents in the remote clusters pull these updates in real time and apply them to the control plane of each service mesh. Note that many service mesh proxies, such as Envoy, rely on a polling mechanism between the control plane and the proxy instances. Therefore, changes that Gloo Mesh pushes take effect only after the proxy instances complete their next polling cycle within the remote cluster's service mesh.