Redis

Redis®* is an open source, in-memory data structure store. For more information, see the Redis docs.

Gloo usage of Redis as backing storage

Gloo Mesh Enterprise uses Redis as the backing database for four main use cases, as described in the following table. Each use case name corresponds to the redisStore section in the Helm values file where you configure the details for that use case.

You can use the same Redis instance for all use cases, share an instance across some use cases, or have a separate instance for each use case. For more information, see How many Redis instances you need.

For each Redis instance, you can choose to use the default built-in Redis instance, deploy your own local Redis instance, or configure your own external Redis instance in a cloud provider, such as AWS ElastiCache. For more information, see Redis deployment options.

| Usage | Produced by | Consumed by | Required? | Description |
|---|---|---|---|---|
| snapshot | Management server | Management server, insights engine, UI | Required | The management server maintains a snapshot that includes the configuration details and status of all the resources in your Gloo Mesh Enterprise environment. The insights engine uses this information to generate insights for your Istio setup. The UI displays this information in a web app that you can launch with the meshctl dashboard command. For more information, see Data stored in snapshots. |
| insights | Insights engine | Insights engine, UI | Optional | The insights engine analyzes your Istio setup for health issues and recommends steps to harden your environment. For more information, see Data stored for insights. |
| extAuthService | External auth service, portal server | External auth service, portal server | Optional | The external auth service stores data related to authenticated requests, such as session data, returned OIDC tokens, and API keys. If you use Gloo Portal, the portal server can also store and read API keys to authenticate users to the developer portal. For more information, see Data stored for the developer portal. |
| rateLimiter | Rate limiter | Rate limiter | Optional | The rate limiter stores rate limit counters to rate limit requests to your APIs. For more information, see Data stored for the rate limiter. |

Separate or single Redis instances

As described in Gloo usage of Redis as backing storage, many Gloo Mesh Enterprise components read from or write data to Redis. Review the following options to help you decide how many instances to use.

Preferred: Separate control plane and data plane

The preferred setup option is to create separate Redis instances for the control plane (snapshot and insights) versus the data plane (extAuthService and rateLimiter). Although you now have to manage separate instances, you get more flexibility in setup. For example, you might have stricter HA/DR requirements for the data plane information, so that in case of failure, API keys or rate limits persist. You can also set up different types of Redis, such as built-in for the control plane vs. an external cloud-provided Redis instance for a multicluster data plane.

Single instance

A common setup is to use a single instance, which can simplify management and speed up setup. This configuration sets up a single, shared Redis instance for all redisStore usages (snapshot, insights, extAuthService, and rateLimiter). This way, all the data for your Gloo Mesh Enterprise environment is kept in the same place. Any settings that you apply to this instance, such as security or sizing configuration, are used consistently across all usages.

Separate instances for all usages

This option is similar to having separate instances for the control plane vs. data plane. By separating the instances for each usage, you can adjust the size, security, and other features per instance.

Separate instances vs. separate databases

Another common Redis deployment pattern is to create different databases within a single Redis instance. Then, you configure different Gloo Mesh Enterprise components to store their data in these databases.

Although this setup avoids the complexity of managing separate Redis instances, you might want to test performance for your use case. Redis instances are typically single-threaded, so each database shares the same thread, which impacts performance. Especially for data plane usages such as rate limiting, this setup increases latency and can result in slower response rates.

If you still want to configure different Gloo Mesh Enterprise components to use different databases in the same Redis instance, you can configure the client.db Helm value for the component.
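As a sketch of this setup, the following Helm values assign a different logical database to two components through the client.db value described above. The overall layout under redisStore and the database numbers are assumptions for illustration; sibling keys such as the Redis address are omitted.

```yaml
# Hedged sketch: separate logical databases in one shared Redis instance.
# Only the client.db key is taken from the docs above; the surrounding
# redisStore layout and database numbers are illustrative assumptions.
redisStore:
  extAuthService:
    client:
      db: 1   # external auth session data and API keys in database 1
  rateLimiter:
    client:
      db: 2   # rate limit counters in database 2
```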

Other factors to help you decide

When deciding how many instances to use, consider the impact of a Redis instance going down, such as a restart during an upgrade, by use case.

  • Control plane snapshot and insights: If Redis goes down, you temporarily cannot do common administrative tasks such as updating the configuration of your Gloo Mesh Enterprise resources or reviewing status information in the Gloo UI. To help mitigate the impact of potential translation errors, safe mode is enabled by default.
  • External auth service: If you store only caching data for OAuth2 sessions in the Redis instance, a pod restart might cause protected requests to temporarily fail. The requests eventually succeed when the Redis pod becomes healthy again. However, if you store API keys in the Redis instance, a pod restart can delete these API keys, which can severely impact your users. As such, you might want a separate Redis instance for the external auth service that uses an external Redis with HA configured, or at the very least persistent storage.
  • Rate limiter: During a pod restart, the limits are reset, and requests that might otherwise be throttled can temporarily succeed. As such, you might want a separate Redis instance for the rate limiter that uses an external Redis with HA configured.

    Redis deployment options

    When you install Gloo Mesh Enterprise, you can choose how to back different components with Redis:

    Overview of deployment options

    For a quick red-yellow-green scale of each option by common requirement, review the following table.

    | Requirement | Built-in Redis | Your own local Redis | Your own external Redis |
    |---|---|---|---|
    | Ease of setup | ✅ | 🟡 | 🟡 |
    | Management burden | ✅ | 🟡 | ✅ |
    | Password auth and TLS connection | ❌ | ✅ | ✅ |
    | Highly available (HA) data | 🟡 | 🟡 | ✅ |
    | Disaster recovery (DR) for data | 🟡 | 🟡 | ✅ |
    | Multicluster | ❌ | ❌ | ✅ |
    | Performance at scale | ✅ | ✅ | 🟡 |

    About the requirements:

    • Ease of setup: The built-in option is the fastest, as you do not need to preconfigure your Redis instance. To bring your own, you must set up a Redis instance yourself, and then configure Gloo Mesh Enterprise to use that instance during a Helm installation or upgrade.
    • Management burden: Bringing your own local Redis has the highest management burden, because you are entirely responsible for configuring it. Additionally, the resources might impact your cluster performance because the instance is deployed in the same cluster.
    • Password auth and TLS connection: Currently, the built-in option does not let you configure auth or TLS encryption. However, traffic is partly secured already, because the traffic stays within your cluster environment. You can use password auth and TLS encryption if you bring your own instance.
    • Highly available data: Your HA options vary with the deployment option.
      • Built-in Redis: You can optionally set up persistent storage to increase the availability of data in case of a pod restart. Note that this option does not protect against zonal or regional failures.
      • Local Redis Enterprise: You are responsible for setting up data persistence. You can also configure single-region, multi-zone HA with automatic failover by using a single, sharded Redis Enterprise instance with one primary and up to five replicas. For more information, see the Redis docs.
      • External Redis: In your cloud provider, you can configure HA through replication and DR through failover in various multizone or multiregion active-passive scenarios. For example, you can use the following options with Amazon ElastiCache Redis OSS (clustered mode disabled).
        • Single-region, multi-zone HA with automatic failover by using a single, sharded Redis instance with one primary and up to five replicas. For more information, see the Amazon ElastiCache docs.
        • Multi-cluster, multi-region active/passive HA with one active management cluster and one passive management cluster in different regions. For more information, see the Solo blog and the Amazon ElastiCache Redis OSS with Global Datastores docs.
        • Note: Redis clustered mode HA for multiple, active shards is not currently supported.
    • Disaster recovery (DR) for data: Similar to HA data, a cloud-provided external Redis option typically offers the most redundant recovery options by automatically replicating data across zones or regions. For built-in or local Redis instances, recovery options are typically provided only by the cluster nodes that the Redis instance is deployed to.
    • Multicluster: Cloud instances are especially well-suited to multicluster environments, where every cluster can point to the same backing cloud instance. For data plane workloads like external auth and rate limiting, multicluster availability is especially important. By default, the built-in Redis is not multicluster. Redis Enterprise also does not support cross-cluster operations as described in the Redis Enterprise docs.
    • Performance at scale: For the built-in and local options, the Redis instances are deployed to the same cluster as the workloads that use them, so performance typically scales well. Cloud performance can vary depending on the strength of the network connectivity. Therefore, create the external Redis instance as near to your cluster infrastructure as possible, and in the same network. Be sure to review the sizing guidance for each deployment option.

    Built-in local Redis

    By default, Solo provides the option to deploy several built-in, local Redis instances for various components when you install Gloo Mesh Enterprise.

    This option is convenient for quick setups, demonstrations, and small testing or staging environments. To make the local Redis more highly available, you can configure persistent storage.

    One current limitation is that you cannot configure auth or TLS encryption for this built-in option. However, traffic is partly secured already, because the traffic stays within your cluster environment. This option can also be difficult to scale as your environment grows larger, or in multicluster use cases, particularly for the external auth service, portal server, and rate limiter.

    For setup steps, see Built-in Redis.

    For sizing guidance, review the following suggestions based on sample environments and data plane traffic. Use this information as a starting point to validate your Redis needs against your environment, and contact your Solo Account Representative for further guidance.

    | Environment size | Data plane traffic | Redis size | Persistent storage |
    |---|---|---|---|
    | Staging, demos, or small environments, such as:<br>• 1 cluster, or 1 management and 2 workload clusters<br>• A few apps, such as Bookinfo or a developer portal frontend<br>• < 1,000 services | None to minimal external auth or rate limiting requests (< 100 requests/second) | The default size of the built-in Redis:<br>• CPU 125m request<br>• Memory 256Mi request | Optional backing PV with 1Gi for persistent data |
    | Medium production environments, such as:<br>• 1 management cluster<br>• 5-10 workload clusters<br>• < 4,000 services | Moderate external auth or rate limiting requests (< 1,000 requests/second) | Optional separate Redis instances for:<br>• Control plane snapshot and insights<br>• Data plane extAuthService and rateLimiter<br>The default size of the built-in Redis:<br>• CPU 125m request<br>• Memory 256Mi request | Backing PV with 2-5Gi for persistent data |
    | Large production environments, such as:<br>• > 1 management cluster<br>• > 10 workload clusters<br>• > 5,000 services | Heavy external auth or rate limiting requests (> 1,000 requests/second) | Separate Redis instances for:<br>• Control plane snapshot and insights<br>• Data plane extAuthService and rateLimiter<br>• Or, a separate instance per usage<br>Adjust the default size of the built-in Redis resource requests as needed | Backing PV with 10Gi for persistent data |

    Bring your own local Redis

    Instead of using the Solo-provided built-in local Redis instance, you can deploy your own. This option can be particularly well-suited for an existing Redis deployment that you manage, such as a local instance of Redis Enterprise.

    By using Redis Enterprise, you can set up replicated nodes in each workload cluster where the component runs, such as the gateway, external auth service, portal server, and rate limiter. This way, performance increases because the data is typically stored and accessed on the same network as the requesting client.

    For setup steps, see Local Redis.

    For sizing guidance, review the following suggestions based on sample environments and data plane traffic. Use this information as a starting point to validate your Redis needs against your environment, and contact your Solo Account Representative for further guidance.

    | Environment size | Data plane traffic | Redis size | Persistent storage |
    |---|---|---|---|
    | Staging, demos, or small environments, such as:<br>• 1 cluster, or 1 management and 2 workload clusters<br>• A few apps, such as Bookinfo or a developer portal frontend<br>• < 1,000 services | None to minimal external auth or rate limiting requests (< 100 requests/second) | At least:<br>• CPU 125m request<br>• Memory 256Mi request | Optional backing PV with 1Gi for persistent data |
    | Medium production environments, such as:<br>• 1 management cluster<br>• 5-10 workload clusters<br>• < 4,000 services | Moderate external auth or rate limiting requests (< 1,000 requests/second) | Optional separate Redis instances for:<br>• Control plane snapshot and insights<br>• Data plane extAuthService and rateLimiter<br>At least:<br>• CPU 125m request<br>• Memory 256Mi request | Backing PV with 2-5Gi for persistent data |
    | Large production environments, such as:<br>• > 1 management cluster<br>• > 10 workload clusters<br>• > 5,000 services | Heavy external auth or rate limiting requests (> 1,000 requests/second) | Separate Redis instances for:<br>• Control plane snapshot and insights<br>• Data plane extAuthService and rateLimiter<br>• Or, a separate instance per usage<br>Adjust the Redis resource requests as needed, with at least:<br>• CPU 125m request<br>• Memory 256Mi request | Backing PV with 10Gi for persistent data* |

    Footnotes for the table:

    • * Persistent data for large environments: For large environments with thousands of custom resources, enabling persistence can have an impact on performance. For example, copies of the snapshot and insights data must be written to the persistent volume, which might initially take a large amount of time.

    Bring your own external Redis instance

    Instead of using the Solo-provided built-in local Redis instance, you can set up Gloo Mesh Enterprise to use your own external Redis instance in a cloud provider, such as AWS ElastiCache or Google Cloud Memorystore. This option is ideal for when you want to use features of a cloud-managed Redis service, such as for performance, compliance, or security reasons.

    An external Redis instance offers advantages in scalability, management, and security. You might be able to scale more responsively to loads by adding more resources to your cloud instance. Cloud providers often help with routine maintenance tasks such as backups, patching, and monitoring. You can also use security features such as creating the instance in the same VPC, enabling encryption at rest and TLS in transit, and requiring password authentication.

    Tradeoffs include cost, complexity, and latency. Especially at scale and in multicluster environments, a cloud instance might be more expensive than the self-managed option. You also must be aware of potentially complex cloud provider networking and security settings. Finally, the network latency might not offer the performance that you need, especially if your apps are in a different region or network than the Redis instance.

    For setup steps, see External Redis.

    For sizing guidance, review the following suggestions based on sample environments and data plane traffic. Use this information as a starting point to validate your Redis needs against your environment, and contact your Solo Account Representative for further guidance.

    | Environment size | Data plane traffic | Redis size* |
    |---|---|---|
    | Staging, demos, or small environments, such as:<br>• 1 cluster, or 1 management and 2 workload clusters<br>• A few apps, such as Bookinfo or a developer portal frontend<br>• < 1,000 services | None to minimal external auth or rate limiting requests (< 100 requests/second) | General: cache.m7g.large (2 vCPU and 8 GiB memory) |
    | Medium production environments, such as:<br>• 1 management cluster<br>• 5-10 workload clusters<br>• < 4,000 services | Moderate external auth or rate limiting requests (< 1,000 requests/second) | Optional separate Redis instances for:<br>• Control plane snapshot and insights<br>• Data plane extAuthService and rateLimiter<br>Control plane instances, where memory is more important to store snapshots and insights: cache.m7g.2xlarge (4 vCPU and 32 GiB memory)<br>Data plane instances, where vCPU is more important to process many simultaneous requests: cache.m7g.2xlarge (8 vCPU and 16 GiB memory) |
    | Large production environments, such as:<br>• > 1 management cluster<br>• > 10 workload clusters<br>• > 5,000 services | Heavy external auth or rate limiting requests (> 1,000 requests/second) | Separate Redis instances for:<br>• Control plane snapshot and insights<br>• Data plane extAuthService and rateLimiter<br>• Or, a separate instance per usage<br>Control plane instances, where memory is more important to store snapshots and insights: cache.m7g.4xlarge (8 vCPU and 64 GiB memory)<br>Data plane instances, where vCPU is more important to process many simultaneous requests: cache.m7g.4xlarge (16 vCPU and 16 GiB memory) |

    Footnotes for the table:

    • * Redis size: How large the Redis instance should be, based on the size and deployment method for your environment. The sizing suggestions in the table are based on AWS ElastiCache node types and their corresponding AWS EC2 instance types. If you use a different Redis-compatible provider, try to use a comparable instance size.

    Persistent storage

    When you use the built-in Redis instance, the data is not persistent. This means that if Redis restarts, such as due to an upgrade, the data is no longer available. The related Gloo components, such as the management server and agent relay process, must repopulate the data in Redis. This relay process can take time, particularly in large environments. Additionally, you might have safety features such as safe mode or safe start windows to prevent translation errors until the data becomes available. In such cases, you might want your local Redis data to persist to speed up the time until the data is available.

    How persistent storage works

    You can set up persistent storage for the local Redis instances that back your Gloo components by using Kubernetes storage. To set up Kubernetes storage, you configure three main storage resources: storage classes, persistent volumes (PVs), and persistent volume claims (PVCs). You configure storage classes and PVs on your own. For example, your cloud provider might have pre-configured storage classes that you can choose from to create your PVs. Then during a Gloo installation or upgrade, you use the Gloo Helm chart to configure a PVC (or use an existing one) for the local Redis instance.

    Persistent volumes (PVs) are a cluster-wide resource where the persisted data is stored outside the lifecycle of the Redis pod. You can provision the PV yourself or dynamically by using a PVC. Note that your infrastructure provider might limit your options for how to provision the PV.

    Persistent volume claims (PVCs) request storage for the local Redis instance that mounts the PVC as a volume. PVCs specify size, access modes, and specific storage class requirements. When a PVC is created, Kubernetes finds a PV that matches the claim’s requirements. If a suitable PV is available, the PV is bound to the PVC. If no suitable PV is found but a storage class is defined, Kubernetes can dynamically provision a new PV according to the specifications of the storage class and bind it to the PVC.

    Storage classes define different types of storage that are offered by your cluster’s cloud provider or on-premise storage system. You might have different storage classes for different storage performance levels or backup policies that customize settings such as replication or IOPS. You set the storage class that the PVC requests in your Helm settings. If you do not define a storage class, the default storage class is used.
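    For context, the storage resources described above are standard Kubernetes objects. The following generic PVC manifest is a sketch only; the claim name and storage class are hypothetical examples, not values that Gloo requires.

    ```yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: gloo-redis-data          # hypothetical claim name
    spec:
      accessModes:
        - ReadWriteOnce              # one node mounts the volume read-write
      storageClassName: standard     # example class; pick one your cluster offers
      resources:
        requests:
          storage: 2Gi               # requested volume size
    ```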

    You configure persistent storage options in the redisStore.<component>.deployment.persistence section of the Helm chart for the cluster that you deploy the Redis instance to. The component can be the snapshot, insights, extAuthService, or rateLimiter that you configure to use the Redis instance. Therefore, the snapshot and insights control plane components are updated in the Helm chart for the management cluster. The extAuthService and rateLimiter data plane components might be in the management cluster or workload cluster Helm chart.

    To statically provision the PV and matching PVC, set the redisStore.<component>.deployment.persistence.persistentVolume.existingClaim value to true. Then, put the name of the PVC that you want the Redis deployment to mount in the redisStore.<component>.deployment.persistence.persistentVolume.name field.
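    As a sketch, static provisioning for the snapshot component might look like the following Helm values, using only the two fields named above. The PVC name is a hypothetical example.

    ```yaml
    # Static provisioning sketch for the snapshot component, based on the
    # two Helm values named above. The PVC name is a hypothetical example.
    redisStore:
      snapshot:
        deployment:
          persistence:
            persistentVolume:
              existingClaim: true
              name: my-snapshot-pvc   # pre-created PVC that Redis mounts
    ```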

    To use dynamic provisioning, you can set up the details of the PVC in the redisStore.<component>.persistence.persistentVolume section of the Gloo Helm chart during installation or upgrade, as described in the setup guide.
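    For dynamic provisioning, a hedged sketch of the PVC details might look like the following. The location of the persistentVolume section follows the path above, but the subkeys shown are assumptions for illustration; check the setup guide for the actual field names.

    ```yaml
    # Dynamic provisioning sketch. The persistentVolume section location
    # follows the docs above; the subkeys below are assumed for illustration.
    redisStore:
      snapshot:
        persistence:
          persistentVolume:
            storageClass: gp3   # assumed key: storage class for the new PV
            size: 2Gi           # assumed key: requested volume size
    ```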

    Redis options for RDB and AOF

    When you enable persistence for local Redis, you must also set up the persistence mode. You can choose between two main persistence modes: Redis Database (RDB) or Append Only File (AOF). You can also use the modes together, but this use case is rare.

    • RDB provides a point-in-time snapshot for a disaster recovery backup, but trades off data completeness compared to AOF, because writes made since the last snapshot can be lost.
    • AOF logs every write operation for a more complete recovery option, but uses more resources than RDB, which can impact performance.

    For more information, see the Redis docs. For more details of how to configure these persistent modes, see the setup guide.
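    Outside of Gloo, these modes map to standard redis.conf directives. The following sketch shows the upstream Redis settings for each mode; how the Gloo Helm chart exposes them is covered in the setup guide, not here.

    ```conf
    # RDB: write a snapshot if at least 1 change occurs in 1 hour, 100 changes
    # in 5 minutes, or 10,000 changes in 1 minute (the Redis defaults).
    save 3600 1 300 100 60 10000

    # AOF: log every write operation, fsynced to disk once per second.
    appendonly yes
    appendfsync everysec
    ```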

    More details on stored data

    Review the following sections to learn more about what data is stored for the different uses of Redis to back Gloo Mesh Enterprise components.

    Data stored in snapshots

    The Gloo snapshot that the agent sends to the management server is stored in the management server’s backing storage and includes the following resources. For an example of how to check the data that gets stored, see Review data in Redis.

    • Discovered Kubernetes resources, such as Kubernetes services, deployments, replicasets, daemonsets, and statefulsets. The management server translates discovered resources into Istio resources and displays them in the Gloo UI. Note that you can use Istio discovery selectors to ignore certain Kubernetes resources. Ignored resources are not included in the snapshot that is sent from the agent to the management server.
    • Gloo custom resources that you create. The management server translates Gloo resources and displays them in the Gloo UI.
    • Istio resources, including:
      • Istio resources that, after initial server-agent setup, the management server automatically translates from your Gloo resources and writes back to the workload cluster. These resources are included in the snapshot to avoid accidentally deleting them from the workload cluster if an agent disconnects and reconnects, and to display them in the Gloo UI.
      • Any Istio resources that you manually created, so that they can be displayed in the Gloo UI.
    • Internal resources that are computed in memory by each agent and pushed to the management server without being persisted in the workload cluster. Internal resources include:
      • Gateway resources, which contain information about gateway endpoints in the cluster.
      • IssuedCertificate and CertificateRequest resources, which are used in internal multi-step workflows that involve both the agent and the management server.
    • External authentication information for the UI, when enabled. This data includes user session information such as access and ID tokens.

    Data stored for insights

    If you enable the insights engine that is available in version 2.6 or later, Gloo Mesh Enterprise automatically analyzes your Istio setup to check for health issues. This insight information is stored in Redis for the Gloo UI to read and display to you in the dashboard.

    For more information, see Insights.

    Data stored for the external auth service

    The external auth service stores data to use to authenticate to destinations and routes that are protected by an external auth policy and enforced by an external auth server.

    Such data can include the following, depending on how you configure external auth.

    • API keys that you create (such as through the developer portal), either as plain text or hashed if you configure a secret key to hash the values.
    • Session data such as cookies.
    • OIDC distributed claims.

    For more information, see External authentication and authorization.

    Data stored for the developer portal

    The portal server stores the following data in the backing database. For an example of how to check the data that gets stored, see Review data in Redis.

    • API keys that end users can use to authenticate to destinations and routes that are protected by an external auth policy that the external auth server enforces.

    To use Gloo Portal, you must configure both the portal server and external auth service to use the same backing storage. Review the following diagram for more information.

    Portal and external auth shared backing database
    1. As the portal admin, you configure the backing storage database for both the external auth and portal servers.
    2. As an end user, you use the developer portal to generate an API key. Then, the Gloo portal server writes this API key to the shared backing storage database, such as Redis.
    3. As an end user, you include the API key in subsequent requests to the API products that you want to access. The Gloo ingress gateway receives your requests and checks the API’s route for external auth policies. The Gloo external auth server reads the API key value from the shared backing storage database, such as Redis. If the API key matches a valid value, then the Gloo gateway sends the request to the API product and you get back a successful response.

    For more information, see Verify that external auth works with API keys in a backing storage database in the Gloo Mesh Gateway Portal docs.

    Data stored for the rate limiter

    The rate limiter stores information for limiting requests in the backing Redis database, including the following information.

    • The generated unique key for the request.
    • The associated count against the rate limit.

    For more information, see Rate limiting.


    * Redis is a registered trademark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by Solo.io, Inc. is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and Solo.io.