Cache queries

Improve network performance by caching GraphQL queries.

Stitching together your APIs into a single GraphQL schema offers many benefits. However, the resulting schema can be very large. In that case, the query strings that clients send to the GraphQL server also grow, which increases request sizes and network latency.

To improve network performance for large query strings, the GraphQL filter supports Automatic Persisted Queries (APQ). A persisted query consists of the query string and the query's SHA-256 hash that are cached on the GraphQL server side. When you enable query caching, a client can then send the query hash instead of the full query string, which reduces request sizes. Persisted queries are especially effective when clients send queries as GET requests, because clients can take advantage of the browser cache and integrate with an intermediate Content Delivery Network (CDN).
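
To make the persisted query format concrete, the following minimal Python sketch builds the `extensions` object that a client sends in place of the full query string. The function name `persisted_query_extension` is hypothetical; the `persistedQuery`, `version`, and `sha256Hash` keys match the cURL requests later in this guide.

```python
import hashlib
import json


def persisted_query_extension(query: str) -> dict:
    """Build the APQ `extensions` object for a GraphQL query string."""
    # The hash is the hex-encoded SHA-256 digest of the exact query text (UTF-8).
    sha256_hash = hashlib.sha256(query.encode("utf-8")).hexdigest()
    return {"persistedQuery": {"version": 1, "sha256Hash": sha256_hash}}


query = "query MyProductsForHome { productsForHome { title } }"
extension = persisted_query_extension(query)
# The hash-only payload has the same size no matter how long the query is.
print(json.dumps(extension))
```

Because the digest is always 64 hexadecimal characters, the payload size stays constant, which is where the savings for large query strings come from.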

Depending on your client and edge caching architecture, the performance benefits of query caching can also extend to responses. Although this policy does not cache responses itself, the query hash gives clients a stable key for caching responses in a client-local cache, and it lets the CDN layer cache responses too. Query caching does not change the content of the responses.

You can also prevent malicious requests to your GraphQL servers by specifying a list of allowed query hashes in a GraphQLAllowedQueryPolicy.
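
As a conceptual illustration only, an allowlist check can be sketched in Python as follows. This is not the GraphQLAllowedQueryPolicy implementation or its configuration format; the names `ALLOWED_QUERIES`, `ALLOWED_HASHES`, and `is_allowed` are hypothetical.

```python
import hashlib

# Hypothetical allowlist: the only queries that the server should execute.
ALLOWED_QUERIES = ["query MyProductsForHome { productsForHome { title } }"]
ALLOWED_HASHES = {
    hashlib.sha256(q.encode("utf-8")).hexdigest() for q in ALLOWED_QUERIES
}


def is_allowed(sha256_hash: str) -> bool:
    """Return True only if the query hash is on the allowlist."""
    return sha256_hash.lower() in ALLOWED_HASHES


known = hashlib.sha256(ALLOWED_QUERIES[0].encode("utf-8")).hexdigest()
print(is_allowed(known))     # True
print(is_allowed("0" * 64))  # False
```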

Before you begin

This guide assumes that you use the same names for components like clusters, workspaces, and namespaces as in the getting started guide, and that your Kubernetes context is set to the cluster where you store your Gloo config (typically the management cluster). If you use different names, make sure to update the sample configuration files in this guide.
  1. Set up Gloo Gateway in a single cluster.
  2. Deploy sample apps.
  3. Configure an HTTP listener on your gateway. You can skip the RouteTable from that guide, because you create a GraphQL-specific route table instead.
  4. Follow the Get started guide to define example GraphQL schema and resolvers and configure routing.

Configure GraphQL query caching

You can apply a GraphQLPersistedQueryCachePolicy at the route level. For more information, see Applying policies.

Review the following sample configuration file.

apiVersion: resilience.policy.gloo.solo.io/v2
kind: GraphQLPersistedQueryCachePolicy
metadata:
  name: bookinfo-query-cache
  namespace: bookinfo
spec:
  applyToRoutes:
  - route:
      labels:
        route: graphql-bookinfo
  config:
    cacheSize: 1000

Review the following table to understand this configuration. For more information, see the API docs.

Setting | Description
--- | ---
spec.applyToRoutes | Use labels to configure which GraphQL routes to apply the policy to. This example label matches the route from the example route table that you previously applied in the GraphQL getting started guide. If omitted or empty, the policy applies to all routes in the workspace. If more than one GraphQLPersistedQueryCachePolicy applies to a GraphQL route, the oldest policy is applied.
spec.config.cacheSize | The number of queries to store in the persisted query cache. Defaults to 1000.

Specify cache control directives

By default, the GraphQL server adds the Cache-Control header to each response that it returns, which describes the response's cache policy. The cache control policy signals to CDNs to cache the HTTP GET responses, and also describes how the response should be cached.

In the GraphQL schema that you define in your ApiDoc Gloo CR, you can add the @cacheControl directive to types and fields, and set arguments such as maxAge.

In the following example, the @cacheControl(maxAge: 60) directive for the top-level query type ensures that responses to queries are fresh in the cache for a maximum of 60 seconds. Additionally, the @cacheControl(maxAge: 30) directive for the fullName field indicates that this value is cached for a maximum of only 30 seconds, instead of 60 seconds. Field-specific directives override type-level directives.

apiVersion: apimanagement.gloo.solo.io/v2
kind: ApiDoc
metadata:
  name: bookinfo-rest-apidoc
  namespace: bookinfo
spec:
  graphql:
    schemaDefinition: |-
      type Query @cacheControl(maxAge: 60) {
        GetAllAccounts: [Account]
        GetAccount(account_id: Int): Account
      }
      type Mutation {
        CreateAccount(account: Account): Account
      }
      type Account {
        fullName: String @cacheControl(maxAge: 30)
        email: String
        phoneNumber: String
        address: String
      }
      ...
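
The override rule from this example can be sketched in a few lines of Python. The helper names are hypothetical. The rule that a response is only as cacheable as its shortest-lived field is an assumption borrowed from common GraphQL server conventions, not something this guide confirms for the Gloo GraphQL filter.

```python
from typing import List, Optional


def effective_max_age(
    type_max_age: Optional[int], field_max_age: Optional[int]
) -> Optional[int]:
    """A field-level maxAge, when present, overrides the type-level maxAge."""
    return field_max_age if field_max_age is not None else type_max_age


def cache_control_header(max_ages: List[Optional[int]]) -> Optional[str]:
    """Assumption: the response uses the lowest maxAge of all selected fields."""
    known = [age for age in max_ages if age is not None]
    return f"max-age={min(known)}" if known else None


# Values from the example schema: 60 seconds by default, 30 seconds for fullName.
default_age = effective_max_age(type_max_age=60, field_max_age=None)
full_name_age = effective_max_age(type_max_age=60, field_max_age=30)
print(default_age, full_name_age)                          # 60 30
print(cache_control_header([default_age, full_name_age]))  # max-age=30
```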

Verify GraphQL query caching

You typically send queries with a GraphQL client. Because most GraphQL clients natively support persisted queries, they calculate the query hash and register the query for you automatically. To see how query caching works, the following steps use cURL instead. You first send only the SHA-256 hash of a query, then register the query by sending the query string together with its hash, and finally send only the hash again. If the query that matches the hash is cached on the GraphQL server, the query is executed and the result is returned to the client. If no query with that hash is cached, an error is returned instead.
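
The lookup behavior described here can be modeled with a small in-memory cache. This is a toy sketch, not the GraphQL filter's implementation: the class and exception names are hypothetical, and both the hash verification on registration and the least-recently-used eviction are assumptions that this guide does not confirm.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class PersistedQueryNotFound(Exception):
    """Raised when a hash-only request misses the cache."""


class PersistedQueryCache:
    """Toy in-memory model of a server-side persisted query cache."""

    def __init__(self, cache_size: int = 1000) -> None:
        self.cache_size = cache_size
        self._queries: "OrderedDict[str, str]" = OrderedDict()

    def resolve(self, sha256_hash: str, query: Optional[str] = None) -> str:
        """Return the query text to execute, registering it if it was sent."""
        if query is not None:
            # Assumption: register only when the hash matches the query text.
            actual = hashlib.sha256(query.encode("utf-8")).hexdigest()
            if actual != sha256_hash:
                raise ValueError("provided sha256 does not match query")
            self._queries[sha256_hash] = query
            self._queries.move_to_end(sha256_hash)
            # Assumption: evict the least recently used entry when full.
            while len(self._queries) > self.cache_size:
                self._queries.popitem(last=False)
            return query
        if sha256_hash not in self._queries:
            raise PersistedQueryNotFound(
                f"persisted query not found: sha256 {sha256_hash}"
            )
        self._queries.move_to_end(sha256_hash)
        return self._queries[sha256_hash]


cache = PersistedQueryCache(cache_size=1000)
query = "query MyProductsForHome { productsForHome { title } }"
digest = hashlib.sha256(query.encode("utf-8")).hexdigest()

try:
    cache.resolve(digest)         # hash only, nothing cached yet
except PersistedQueryNotFound as err:
    print(err)
cache.resolve(digest, query)      # query and hash: registers the query
print(cache.resolve(digest))      # hash only: returns the cached query
```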

  1. Apply the example policy in your cluster. This policy enables query caching for the graphql-bookinfo route that you created in the Get started guide, and configures the cache size as 1000 queries.

    kubectl apply -f - << EOF
    apiVersion: resilience.policy.gloo.solo.io/v2
    kind: GraphQLPersistedQueryCachePolicy
    metadata:
      name: bookinfo-query-cache
      namespace: bookinfo
    spec:
      applyToRoutes:
      - route:
          labels:
            route: graphql-bookinfo
      config:
        cacheSize: 1000
    EOF
    
  2. Save a simple GraphQL query string, which requests the title of the book, in an environment variable.

    export QUERY="query MyProductsForHome { productsForHome { title } }"
    
  3. Compute the SHA-256 hash for this query by using the shasum command, and save the hash in an environment variable. shasum prints both the hash and the name of the input file, but because the SHA-256 hash is always 64 characters long, the head command takes only the first 64 characters of the output.

    export QUERY_SHA256=$(echo -n "$QUERY" | shasum -a 256 | head -c 64)
    echo $QUERY_SHA256
    

    Example hash:

    ba7faf706579b441e281376ba5a87d5047e79eb52dcf6ac0eb34eb85ed53b053
    
  4. Send a GET request with the SHA-256 hash of the query.

    # HTTP
    curl --get --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/graphql \
      --data-urlencode "extensions={\"persistedQuery\":{\"version\":1,\"sha256Hash\":\"$QUERY_SHA256\"}}"
    
    # HTTPS
    curl --get --resolve www.example.com:443:${INGRESS_GW_IP} https://www.example.com:443/graphql \
      --data-urlencode "extensions={\"persistedQuery\":{\"version\":1,\"sha256Hash\":\"$QUERY_SHA256\"}}"
    
    Because the query is not cached yet by the GraphQL server, the server returns an error:

    {"errors":[{"message":"persisted query not found: sha256 ba7faf706579b441e281376ba5a87d5047e79eb52dcf6ac0eb34eb85ed53b053"}]}
    
  5. To send a GraphQL query string for the server to cache, encode the query in a URL-encoded format by using a simple Python script, and store the result in an environment variable.

    export QUERY_URL_FORMAT=$(python3 -c "import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))" "$QUERY")
    echo $QUERY_URL_FORMAT
    

    Example output:

    query%20MyProductsForHome%20%7B%20productsForHome%20%7B%20title%20%7D%20%7D
    
  6. Send the query string and the hash in a request to the GraphQL server. The server executes the query and caches it under the hash.

    # HTTP
    curl --get --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/graphql \
      -d "query=$QUERY_URL_FORMAT" \
      --data-urlencode "extensions={\"persistedQuery\":{\"version\":1,\"sha256Hash\":\"$QUERY_SHA256\"}}"
    
    # HTTPS
    curl --get --resolve www.example.com:443:${INGRESS_GW_IP} https://www.example.com:443/graphql \
      -d "query=$QUERY_URL_FORMAT" \
      --data-urlencode "extensions={\"persistedQuery\":{\"version\":1,\"sha256Hash\":\"$QUERY_SHA256\"}}"
    
    Example output:

    {"data":{"productsForHome":[{"title":"The Comedy of Errors"}]}}
    
  7. Repeat the request that you sent in step 4 by sending only the SHA-256 hash of the query.

    # HTTP
    curl --get --resolve www.example.com:80:${INGRESS_GW_IP} http://www.example.com:80/graphql \
      --data-urlencode "extensions={\"persistedQuery\":{\"version\":1,\"sha256Hash\":\"$QUERY_SHA256\"}}"
    
    # HTTPS
    curl --get --resolve www.example.com:443:${INGRESS_GW_IP} https://www.example.com:443/graphql \
      --data-urlencode "extensions={\"persistedQuery\":{\"version\":1,\"sha256Hash\":\"$QUERY_SHA256\"}}"
    
    Because the query is cached by the GraphQL server, the server now returns the expected response:

    {"data":{"productsForHome":[{"title":"The Comedy of Errors"}]}}
    
  8. Optional: Clean up the policy.

    kubectl delete GraphQLPersistedQueryCachePolicy bookinfo-query-cache -n bookinfo
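
The manual steps above mirror what a GraphQL client with persisted query support might do automatically: try the hash first, and send the full query only if the server reports a miss. The following Python sketch illustrates that flow under stated assumptions. The function names `apq_url`, `send_with_fallback`, and `make_stub_gateway` are hypothetical, and the stub stands in for the gateway so that the sketch runs without a cluster. A real client would pass a function that performs the HTTP GET request.

```python
import hashlib
import json
from typing import Callable
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def apq_url(base_url: str, query: str, include_query: bool) -> str:
    """Build the GET URL for a hash-only or a query-plus-hash request."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    extensions = json.dumps(
        {"persistedQuery": {"version": 1, "sha256Hash": digest}},
        separators=(",", ":"),
    )
    params = {"query": query} if include_query else {}
    params["extensions"] = extensions
    # quote_via=quote encodes spaces as %20, like the URL-encoding step above.
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


def send_with_fallback(
    base_url: str, query: str, http_get: Callable[[str], dict]
) -> dict:
    """Send the hash first; resend with the full query if it is not cached."""
    response = http_get(apq_url(base_url, query, include_query=False))
    errors = response.get("errors", [])
    if any("persisted query not found" in e.get("message", "") for e in errors):
        response = http_get(apq_url(base_url, query, include_query=True))
    return response


def make_stub_gateway() -> Callable[[str], dict]:
    """Stand-in for the gateway; it only remembers which hashes it has seen."""
    cached = set()

    def http_get(url: str) -> dict:
        params = parse_qs(urlsplit(url).query)
        extensions = json.loads(params["extensions"][0])
        digest = extensions["persistedQuery"]["sha256Hash"]
        if "query" in params:
            cached.add(digest)
        if digest not in cached:
            message = f"persisted query not found: sha256 {digest}"
            return {"errors": [{"message": message}]}
        return {"data": {"productsForHome": [{"title": "The Comedy of Errors"}]}}

    return http_get


http_get = make_stub_gateway()
query = "query MyProductsForHome { productsForHome { title } }"
print(send_with_fallback("http://www.example.com/graphql", query, http_get))
```

In this sketch, the first call makes two requests (a miss, then the registration), and later calls for the same query need only the hash-only request.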