About JWT tokens for access control

With Gloo AI Gateway, you can apply authentication policies, such as JSON Web Token (JWT) validation, to ensure that only authenticated users can access your LLM API. In addition, you can extract claims from the JWT to enforce fine-grained access control to particular APIs. For example, you can restrict access to certain APIs based on the user’s role, group, organization, or any other claim within the JWT.

In the following example, you create a JWT provider with self-signed certificates and configure access to your LLM API by using the claims in the JWT. To verify that authentication works, you use JWTs for two different users, Alice and Bob. Alice’s JWT contains the claims that are required to authenticate and authorize with Gloo AI Gateway. Bob’s JWT, however, lists only a model from a different AI provider. Both tokens pass authentication, but after you enforce claim-based authorization, Bob’s access to the OpenAI API is denied.

Before you begin

Complete the Authenticate with API keys tutorial.

Authenticate users with JWTs

Enforce JWT authentication on your AI Gateway routes.

  1. Create a VirtualHostOption resource to define an inline JWT provider. The JWT provider is used to validate the JWTs that are sent as part of the requests to the Gloo AI Gateway. In the following example, the JWT provider validates the JWT by using the public key that you add to the VirtualHostOption resource.

      kubectl apply -f- <<EOF
    apiVersion: gateway.solo.io/v1
    kind: VirtualHostOption
    metadata:
      name: jwt-provider
      namespace: gloo-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: Gateway
        name: ai-gateway
      options:
        jwt:
          providers:
            selfminted:
              issuer: solo.io
              jwks:
                local:
                  key: |
                    -----BEGIN PUBLIC KEY-----
                    MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAskFAGESgB22iOsGk/UgX
                    BXTmMtd8R0vphvZ4RkXySOIra/vsg1UKay6aESBoZzeLX3MbBp5laQenjaYJ3U8P
                    QLCcellbaiyUuE6+obPQVIa9GEJl37GQmZIMQj4y68KHZ4m2WbQVlZVIw/Uw52cw
                    eGtitLMztiTnsve0xtgdUzV0TaynaQrRW7REF+PtLWitnvp9evweOrzHhQiPLcdm
                    fxfxCbEJHa0LRyyYatCZETOeZgkOHlYSU0ziyMhHBqpDH1vzXrM573MQ5MtrKkWR
                    T4ZQKuEe0Acyd2GhRg9ZAxNqs/gbb8bukDPXv4JnFLtWZ/7EooKbUC/QBKhQYAsK
                    bQIDAQAB
                    -----END PUBLIC KEY-----
    EOF
      
  2. Send a request to the AI API without a JWT. The request returns a 401 HTTP response code, because the gateway now requires a valid JWT on every request.

      curl -v "$INGRESS_GW_ADDRESS:8080/openai" -H content-type:application/json -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
          {
            "role": "system",
            "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
          },
          {
            "role": "user",
            "content": "Compose a poem that explains the concept of recursion in programming."
          }
        ]
      }'
      

    Example output:

       Connected to 172.18.0.2 (172.18.0.2) port 8080 (#0)
    > POST /openai HTTP/1.1
    > Host: 172.18.0.2:8080
    > User-Agent: curl/7.81.0
    > Accept: */*
    > Content-Length: 354
    > Content-Type: application/x-www-form-urlencoded
    > 
    * Mark bundle as not supporting multiuse
    < HTTP/1.1 401 Unauthorized
    < www-authenticate: Bearer realm="http://172.18.0.2:8080/openai"
    < content-type: text/plain
    < date: Tue, 18 Jun 2024 02:57:41 GMT
    < server: envoy
    < connection: close
    < transfer-encoding: chunked
    <    
    * Closing connection 0
    Jwt is missing
      
  3. Create environment variables to store the JWTs for the users Alice and Bob.

    1. Save the JWT token for Alice. Alice works in the dev team and she has access to the gpt-3.5-turbo model of the OpenAI API:

        {
          "iss": "solo.io",
          "org": "solo.io",
          "sub": "alice",
          "team": "dev",
          "llms": {
            "openai": [
              "gpt-3.5-turbo"
            ]
          }
        }
        

      Store the token in an environment variable:

        export ALICE_TOKEN=eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyAiaXNzIjogInNvbG8uaW8iLCAib3JnIjogInNvbG8uaW8iLCAic3ViIjogImFsaWNlIiwgInRlYW0iOiAiZGV2IiwgImxsbXMiOiB7ICJvcGVuYWkiOiBbICJncHQtMy41LXR1cmJvIiBdIH0gfQ.I7whTti0aDKxlILc5uLK9oo6TljGS6JUrjPVd6z1PxzucUa_cnuKkY0qj_wrkzyVN5djy4t2ggE1uBO8Llpwi-Ygru9hM84-1m53aO07JYFya1VTDsI25tCRG8rYhShDdAP5L935SIARta2QtHhrVcd1Ae7yfTDZ8G1DXLtjR2QelszCd2R8PioCQmqJ8PeKg4sURhu05GlBCZoXES9-rtPVbe6j3YLBTodJAvLHhyy3LgV_QbN7IiZ5qEywdKHoEF4D4aCUf_LqPp4NoqHXnGT4jLzWJEtZXHQ4sgRy_5T93NOLzWLdIjgMjGO_F0aVLwBzU-phykOVfcBPaMvetg
        
    2. Save the JWT token for Bob. Bob works in the ops team and does not have access to any OpenAI LLM. Instead, he has access to an LLM (mistral-large-latest) from a different AI provider (Mistral AI):

        {
          "iss": "solo.io",
          "org": "solo.io",
          "sub": "bob",
          "team": "ops",
          "llms": {
            "mistralai": [
              "mistral-large-latest"
            ]
          }
        }
        

      Store the token in an environment variable:

        export BOB_TOKEN=eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyAiaXNzIjogInNvbG8uaW8iLCAib3JnIjogInNvbG8uaW8iLCAic3ViIjogImJvYiIsICJ0ZWFtIjogIm9wcyIsICJsbG1zIjogeyAibWlzdHJhbGFpIjogWyAibWlzdHJhbC1sYXJnZS1sYXRlc3QiIF0gfSB9.p7J2UFwnUJ6C7eXsFCSKb5b7ecWZ75JO4TUJHafjLv8jJ7GzKfJVk7ney19PYUrWrO4ntwnnK5_sY7yaLUBCJ3fv9pcoKyRtJTw1VMMTQsKkWFgvy-jEwc9M-D5lrUfR1HXGEUm6NBaj_Ja78XScPZb_-APPqMIvzDZU04vd6hna3UMc4DZE0wcnTjOqoND0GllHLupYTfgX0v9_AYJiKRAcJvol1W14dI7szpY5GFZtPqq0kl1g0sJPg-HQKwf7Cfvr_JLjkepNJ6A1lsrG8QbuUvMUAdaHzwLvF3L_G6VRjEte6okZpaq0g2urWpZgdNmPVN71Q_0WhyrJTr6SyQ
        
  4. Repeat the request and include the JWT token for Alice in the Authorization header. Because Alice’s JWT is successfully validated, access to the AI API is granted. Verify that the request succeeds with a 200 HTTP response code.

      curl -v "$INGRESS_GW_ADDRESS:8080/openai" --header "Authorization: Bearer $ALICE_TOKEN" -H content-type:application/json -d '{
     "model": "gpt-3.5-turbo",
     "messages": [
       {
         "role": "system",
         "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
       },
       {
         "role": "user",
         "content": "Compose a poem that explains the concept of recursion in programming."
       }
     ]
    }'
      

    Example output:

      {
      "id": "chatcmpl-9Z4YueDx9k1eks1SgMyAV7bewQj62",
      "object": "chat.completion",
      "created": 1718146044,
      "model": "gpt-3.5-turbo-0125",
      "choices": [
       {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "In the realm of coding's elegant art,\nA concept dwells, tearing logic apart.\nRecursion, a loop of mystical grace,  \nUnfolding like a never-ending chase.\n\nA function calls itself, a loop unbroken,\nThrough the echoes of code, words are spoken.\nLike a mirror reflecting its own reflection,\nRecursive calls dance with perfection.\n\nWith each iteration, new paths unfold,\nA journey deep into the code's stronghold.\nLike a Russian doll of infinite size,\nRecursion reaches for the code's highs.\n\nYet tread with care in this looping reel,\nFor endless calls can break the seal.\nBase cases anchor the recursive flight,\nGuiding it towards the end of night.\n\nSo embrace the recursive pattern's flow,\nIn the mystical dance of code to show.\nA loop within a loop, a cycle profound,\nIn the kingdom of coding, recursion is crowned."
          },
          "logprobs": null,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 39,
         "completion_tokens": 174,
        "total_tokens": 213
      },
      "system_fingerprint": null
    }
      
  5. Repeat the request with Bob’s JWT token and verify that the request succeeds, too.

      curl -v "$INGRESS_GW_ADDRESS:8080/openai" --header "Authorization: Bearer $BOB_TOKEN" -H content-type:application/json -d '{
     "model": "gpt-3.5-turbo",
     "messages": [
       {
         "role": "system",
         "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
       },
       {
         "role": "user",
         "content": "Compose a poem that explains the concept of recursion in programming."
       }
     ]
    }'
      

    Example output:

      {
      "id": "chatcmpl-9Z4YueDx9k1eks1SgMyAV7bewQj62",
      "object": "chat.completion",
      "created": 1718146044,
      "model": "gpt-3.5-turbo-0125",
      "choices": [
       {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "In the realm of coding's elegant art,\nA concept dwells, tearing logic apart.\nRecursion, a loop of mystical grace,  \nUnfolding like a never-ending chase.\n\nA function calls itself, a loop unbroken,\nThrough the echoes of code, words are spoken.\nLike a mirror reflecting its own reflection,\nRecursive calls dance with perfection.\n\nWith each iteration, new paths unfold,\nA journey deep into the code's stronghold.\nLike a Russian doll of infinite size,\nRecursion reaches for the code's highs.\n\nYet tread with care in this looping reel,\nFor endless calls can break the seal.\nBase cases anchor the recursive flight,\nGuiding it towards the end of night.\n\nSo embrace the recursive pattern's flow,\nIn the mystical dance of code to show.\nA loop within a loop, a cycle profound,\nIn the kingdom of coding, recursion is crowned."
          },
          "logprobs": null,
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "prompt_tokens": 39,
         "completion_tokens": 174,
        "total_tokens": 213
      },
      "system_fingerprint": null
    }
      

Requests from Alice and Bob succeed because both tokens can be validated with the public key of the local JWT provider in the JWT policy. To authorize access based on claims in the token, continue to the next section.
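The public key in the VirtualHostOption above is a sample key. As a rough sketch, assuming that OpenSSL is installed, you might generate a key pair of your own as follows. The file names `jwt-private.pem` and `jwt-public.pem` are hypothetical and not part of the original guide.

```shell
# Generate a 2048-bit RSA private key (file names are illustrative only).
openssl genrsa -out jwt-private.pem 2048

# Derive the matching public key in PEM format.
openssl rsa -in jwt-private.pem -pubout -out jwt-public.pem

# Print the public key so that it can be copied into the VirtualHostOption.
cat jwt-public.pem
```

If you replace the key under `jwks.local.key` with your own public key, the sample tokens in this guide no longer validate. You would need to sign new tokens with the matching private key, for example with RS256, which is the algorithm named in the header of the sample tokens.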

Authorize access based on claims

Recall that the tokens for Alice and Bob have claims for teams and LLM model types. You can use the claims in the token to restrict access beyond basic authentication.
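To look at these claims yourself, you can decode the payload segment of a token locally. The helper below is a minimal sketch under a few assumptions: `jwt_payload` is a hypothetical function name, the `base64` command is assumed to accept the `-d` flag (as in GNU coreutils), and the signature is not verified.

```shell
# Decode the payload (second dot-separated segment) of a JWT.
# This only inspects the claims; it does NOT verify the signature.
jwt_payload() {
  # Extract the payload and map the base64url alphabet to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding, so restore it before decoding.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do
    seg="${seg}="
  done
  printf '%s' "$seg" | base64 -d
}
```

With the variables from the previous section, `jwt_payload "$ALICE_TOKEN"` should print the payload that contains the `llms.openai` list, and `jwt_payload "$BOB_TOKEN"` should print the payload that contains `llms.mistralai` instead.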

  1. Create a RouteOption resource that checks the llms.openai claim from the JWT. This claim lists the OpenAI models that the user has access to. The nestedClaimDelimiter setting lets you address the nested claim as llms.openai. With the LIST_CONTAINS matcher, the following example allows access to the OpenAI API only if the llms.openai list in the JWT contains gpt-3.5-turbo.

      kubectl apply -f- <<EOF
    apiVersion: gateway.solo.io/v1
    kind: RouteOption
    metadata:
      name: openai-opt
      namespace: gloo-system
    spec:
      targetRefs:
      - group: gateway.networking.k8s.io
        kind: HTTPRoute
        name: openai
      options:
        rbac:
          policies:
            viewer:
              nestedClaimDelimiter: .
              principals:
              - jwtPrincipal:
                  claims:
                    "llms.openai": "gpt-3.5-turbo"
                  matcher: LIST_CONTAINS
    EOF
      
  2. Send another request to the OpenAI API with the JWT token for Alice. Because the JWT matcher is set to LIST_CONTAINS, the request only succeeds if the gpt-3.5-turbo model is part of the claims that are extracted from the JWT token. Because Alice’s JWT token includes that claim, the request succeeds.

      curl -v "$INGRESS_GW_ADDRESS:8080/openai" --header "Authorization: Bearer $ALICE_TOKEN" -H content-type:application/json -d '{
     "messages": [
       {
         "role": "system",
         "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
       },
       {
         "role": "user",
         "content": "Compose a poem that explains the concept of recursion in programming."
       }
     ]
    }'
      

    Example output:

      {
      "id": "chatcmpl-9bJj0qqDOmxM9zRFC06ahDHp1Slqz",
      "object": "chat.completion",
      "created": 1718680986,
      "model": "gpt-3.5-turbo-0125",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "In the kingdom of code, where logic reigns supreme,\nLies a mystical practice, like a never-ending dream.\nRecursion its name, a concept so profound,\nIn the art of programming, it's widely renowned.\n\nLike a mirror reflecting an image so clear,\nRecursion calls on itself, without any fear.\nA function that calls itself, again and again,\nUntil a base case is met, breaking the chain.\n\nThrough loops and iterations, it takes a different route,\nA repetitive journey with"
          },
          "logprobs": null,
          "finish_reason": "length"
        }
      ],
      "usage": {
        "prompt_tokens": 39,
        "completion_tokens": 100,
        "total_tokens": 139
      },
      "system_fingerprint": null
    }
      
  3. Send another request. This time, include the JWT token for Bob. Because Bob’s JWT has no llms.openai claim that contains gpt-3.5-turbo, the request fails and a 403 HTTP response code is returned.

      curl -v "$INGRESS_GW_ADDRESS:8080/openai" --header "Authorization: Bearer $BOB_TOKEN" -H content-type:application/json -d '{
     "messages": [
       {
         "role": "system",
         "content": "You are a poetic assistant, skilled in explaining complex programming concepts with creative flair."
       },
       {
         "role": "user",
         "content": "Compose a poem that explains the concept of recursion in programming."
       }
     ]
    }'
      

    Example output:

      * Mark bundle as not supporting multiuse
    < HTTP/1.1 403 Forbidden
    < content-type: text/plain
    < date: Tue, 18 Jun 2024 03:21:26 GMT
    < server: envoy
    < connection: close
    < transfer-encoding: chunked   
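To compare the outcome for both users without reading the verbose output, you could print only the HTTP status code. The following helper is a sketch, not part of the original guide: `openai_status` is a hypothetical function name, the short `ping` prompt is a placeholder, and `INGRESS_GW_ADDRESS` is assumed to be set as in the earlier steps.

```shell
# Print only the HTTP status code that the gateway returns for a chat
# request sent with the given JWT (first argument).
openai_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    "$INGRESS_GW_ADDRESS:8080/openai" \
    -H "Authorization: Bearer $1" \
    -H content-type:application/json \
    -d '{"messages":[{"role":"user","content":"ping"}]}'
}
```

With the RouteOption from this section in place, `openai_status "$ALICE_TOKEN"` is expected to print 200 and `openai_status "$BOB_TOKEN"` is expected to print 403, which matches the responses shown in the steps above.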
      

Next

Great job! You learned how to use JWTs to authenticate and authorize users to your AI API. You can now explore how to effectively manage your LLM prompts.