Ai
Package: ai.options.gloo.solo.io
Types:
- SingleAuthToken
  - Passthrough
- UpstreamSpec
  - CustomHost
  - OpenAI
  - AzureOpenAI
  - Gemini
  - VertexAI
    - Publisher
  - Mistral
  - Anthropic
  - MultiPool
    - Backend
    - Priority
- RouteSettings
  - RouteType
- FieldDefault
- Postgres
- Embedding
  - OpenAI
  - AzureOpenAI
- SemanticCache
  - Redis
  - Weaviate
  - DataStore
  - Mode
- RAG
  - DataStore
- AIPromptEnrichment
  - Message
- AIPromptGuard
  - Regex
    - RegexMatch
    - BuiltIn
    - Action
  - Webhook
    - HeaderMatch
      - MatchType
  - Moderation
    - OpenAI
  - Request
    - CustomResponse
  - Response
Source File: github.com/solo-io/gloo/projects/gloo/api/v1/enterprise/options/ai/ai.proto
SingleAuthToken
The authorization token that the AI gateway uses to access the LLM provider API. This token is automatically sent in a request header, depending on the LLM provider.
"inline": string
"secretRef": .core.solo.io.ResourceRef
"passthrough": .ai.options.gloo.solo.io.SingleAuthToken.Passthrough
Field | Type | Description |
---|---|---|
inline | string | Provide the token directly in the configuration for the Upstream. This option is the least secure. Only use this option for quick tests such as trying out AI Gateway. Only one of `inline`, `secretRef`, or `passthrough` can be set. |
secretRef | .core.solo.io.ResourceRef | Store the API key in a Kubernetes secret in the same namespace as the Upstream. Then, refer to the secret in the Upstream configuration. This option is more secure than an inline token, because the API key is encoded and you can restrict access to secrets through RBAC rules. You might use this option in proofs of concept, controlled development and staging environments, or well-controlled prod environments that use secrets. Only one of `secretRef`, `inline`, or `passthrough` can be set. |
passthrough | .ai.options.gloo.solo.io.SingleAuthToken.Passthrough | Pass through the existing token. This token can either come directly from the client or be generated by an OIDC flow early in the request lifecycle. This option is useful for backends that have federated identity set up and can reuse the token from the client. Currently, this token must exist in the `Authorization` header. Only one of `passthrough`, `inline`, or `secretRef` can be set. |
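For illustration, a minimal sketch of the three token styles as they might appear under a provider block in an Upstream's `spec.ai` section. The inline value and secret name are placeholders:

```yaml
# Option 1: inline token (least secure; quick tests only)
authToken:
  inline: "sk-placeholder-token"   # placeholder value, not a real key
---
# Option 2: reference a Kubernetes secret in the same namespace as the Upstream
authToken:
  secretRef:
    name: openai-secret            # hypothetical secret name
    namespace: gloo-system
---
# Option 3: pass through the token that the client already sends
authToken:
  passthrough: {}
```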
Passthrough
Configuration for passthrough of the existing token. Currently, specifying an empty object (`passthrough: {}`) indicates that passthrough will be used for auth.
Field | Type | Description |
---|---|---|
UpstreamSpec
When you deploy the Gloo AI Gateway, you can use the `spec.ai` section of the Upstream resource to represent a backend for a logical Large Language Model (LLM) provider. This section configures the LLM provider that the AI Gateway routes requests to, and how the gateway should authenticate with the provider.

Note that other Gloo AI Gateway LLM features, such as prompt guards and prompt enrichment, are configured at the route level in the `spec.options.ai` section of the RouteOptions resource.
To get started, see About Gloo AI Gateway. For more information about the Upstream resource, see the API reference.
AI Gateway is an Enterprise-only feature that requires a Gloo Gateway Enterprise license with an AI Gateway add-on.
"openai": .ai.options.gloo.solo.io.UpstreamSpec.OpenAI
"mistral": .ai.options.gloo.solo.io.UpstreamSpec.Mistral
"anthropic": .ai.options.gloo.solo.io.UpstreamSpec.Anthropic
"azureOpenai": .ai.options.gloo.solo.io.UpstreamSpec.AzureOpenAI
"multi": .ai.options.gloo.solo.io.UpstreamSpec.MultiPool
"gemini": .ai.options.gloo.solo.io.UpstreamSpec.Gemini
"vertexAi": .ai.options.gloo.solo.io.UpstreamSpec.VertexAI
Field | Type | Description |
---|---|---|
openai | .ai.options.gloo.solo.io.UpstreamSpec.OpenAI | Configure an OpenAI backend. Only one of `openai`, `mistral`, `anthropic`, `azureOpenai`, `multi`, `gemini`, or `vertexAi` can be set. |
mistral | .ai.options.gloo.solo.io.UpstreamSpec.Mistral | Configure a Mistral AI backend. Only one of `mistral`, `openai`, `anthropic`, `azureOpenai`, `multi`, `gemini`, or `vertexAi` can be set. |
anthropic | .ai.options.gloo.solo.io.UpstreamSpec.Anthropic | Configure an Anthropic backend. Only one of `anthropic`, `openai`, `mistral`, `azureOpenai`, `multi`, `gemini`, or `vertexAi` can be set. |
azureOpenai | .ai.options.gloo.solo.io.UpstreamSpec.AzureOpenAI | Configure an Azure OpenAI backend. Only one of `azureOpenai`, `openai`, `mistral`, `anthropic`, `multi`, `gemini`, or `vertexAi` can be set. |
multi | .ai.options.gloo.solo.io.UpstreamSpec.MultiPool | Configure backends for multiple LLM providers in one logical endpoint. Only one of `multi`, `openai`, `mistral`, `anthropic`, `azureOpenai`, `gemini`, or `vertexAi` can be set. |
gemini | .ai.options.gloo.solo.io.UpstreamSpec.Gemini | Configure a Gemini backend. Only one of `gemini`, `openai`, `mistral`, `anthropic`, `azureOpenai`, `multi`, or `vertexAi` can be set. |
vertexAi | .ai.options.gloo.solo.io.UpstreamSpec.VertexAI | Configure a Vertex AI backend. Only one of `vertexAi`, `openai`, `mistral`, `anthropic`, `azureOpenai`, `multi`, or `gemini` can be set. |
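As a minimal sketch, an Upstream for an OpenAI backend might look like the following; the resource name and the `openai-secret` secret are placeholders:

```yaml
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: openai                  # hypothetical resource name
  namespace: gloo-system
spec:
  ai:
    openai:
      authToken:
        secretRef:
          name: openai-secret   # hypothetical secret that holds the API key
          namespace: gloo-system
      model: gpt-4o-mini        # optional model override
```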
CustomHost
Send requests to a custom host and port, such as to proxy the request, or to use a different backend that is API-compliant with the upstream version.
"host": string
"port": int
Field | Type | Description |
---|---|---|
host | string | Custom host to send the traffic requests to. |
port | int | Custom port to send the traffic requests to. |
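For example, a sketch that points an OpenAI-compatible backend at a hypothetical in-cluster proxy instead of the public API; the host, port, and secret name are placeholders:

```yaml
openai:
  authToken:
    secretRef:
      name: openai-secret                           # hypothetical secret
      namespace: gloo-system
  customHost:
    host: llm-proxy.gloo-system.svc.cluster.local   # hypothetical proxy host
    port: 8080                                      # hypothetical port
```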
OpenAI
Settings for the OpenAI LLM provider.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
"customHost": .ai.options.gloo.solo.io.UpstreamSpec.CustomHost
"model": string
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the OpenAI API. This token is automatically sent in the `Authorization` header of the request and prefixed with `Bearer`. |
customHost | .ai.options.gloo.solo.io.UpstreamSpec.CustomHost | Optional: Send requests to a custom host and port, such as to proxy the request, or to use a different backend that is API-compliant with the upstream version. |
model | string | Optional: Override the model name, such as `gpt-4o-mini`. If unset, the model name is taken from the request. This setting can be useful when setting up model failover within the same LLM provider. |
AzureOpenAI
Settings for the Azure OpenAI LLM provider.
To find the values for the endpoint, deployment name, and API version, you can check the fields of an API request, such as `https://{endpoint}/openai/deployments/{deployment_name}/chat/completions?api-version={api_version}`.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
"endpoint": string
"deploymentName": string
"apiVersion": string
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the Azure OpenAI API. This token is automatically sent in the `api-key` header of the request. |
endpoint | string | The endpoint for the Azure OpenAI API to use, such as `my-endpoint.openai.azure.com`. If the scheme is included, it is stripped. |
deploymentName | string | The name of the Azure OpenAI model deployment to use. For more information, see the Azure OpenAI model docs. |
apiVersion | string | The version of the Azure OpenAI API to use. For more information, see the Azure OpenAI API version reference. |
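A sketch of an Azure OpenAI provider block, reusing the placeholder endpoint, deployment, and secret values from the MultiPool example later on this page:

```yaml
azureOpenai:
  endpoint: ai-gateway.openai.azure.com   # placeholder endpoint
  deploymentName: gpt-4o-mini             # placeholder deployment name
  apiVersion: 2024-02-15-preview
  authToken:
    secretRef:
      name: azure-secret                  # hypothetical secret
      namespace: gloo-system
```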
Gemini
Settings for the Gemini LLM provider.
To find the values for the model and API version, you can check the fields of an API request, such as `https://generativelanguage.googleapis.com/{version}/models/{model}:generateContent?key={api_key}`.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
"model": string
"apiVersion": string
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the Gemini API. This token is automatically sent in the `key` query parameter of the request. |
model | string | The Gemini model to use. For more information, see the Gemini models docs. |
apiVersion | string | The version of the Gemini API to use. For more information, see the Gemini API version docs. |
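A sketch of a Gemini provider block; the model, API version, and secret name are placeholders that you replace with values from the Gemini docs:

```yaml
gemini:
  model: gemini-1.5-flash    # placeholder model name
  apiVersion: v1beta         # placeholder API version
  authToken:
    secretRef:
      name: gemini-secret    # hypothetical secret that holds the API key
      namespace: gloo-system
```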
VertexAI
Settings for the Vertex AI LLM provider.
To find the values for the project ID, project location, and publisher, you can check the fields of an API request, such as `https://{LOCATION}-aiplatform.googleapis.com/{VERSION}/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/{PROVIDER}/<model-path>`.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
"model": string
"apiVersion": string
"projectId": string
"location": string
"modelPath": string
"publisher": .ai.options.gloo.solo.io.UpstreamSpec.VertexAI.Publisher
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the Vertex AI API. This token is automatically sent in the `key` header of the request. |
model | string | The Vertex AI model to use. For more information, see the Vertex AI model docs. |
apiVersion | string | The version of the Vertex AI API to use. For more information, see the Vertex AI API reference. |
projectId | string | The ID of the Google Cloud project that you use for Vertex AI. |
location | string | The location of the Google Cloud project that you use for Vertex AI. |
modelPath | string | Optional: The model path to route to. Defaults to the Gemini model path, `generateContent`. |
publisher | .ai.options.gloo.solo.io.UpstreamSpec.VertexAI.Publisher | The type of publisher model to use. Currently, only Google is supported. |
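A sketch of a Vertex AI provider block; the project, location, model, and secret values are placeholders:

```yaml
vertexAi:
  model: gemini-1.5-flash      # placeholder model name
  apiVersion: v1               # placeholder API version
  projectId: my-gcp-project    # placeholder Google Cloud project ID
  location: us-central1        # placeholder project location
  publisher: GOOGLE            # only Google is currently supported
  authToken:
    secretRef:
      name: vertex-secret      # hypothetical secret
      namespace: gloo-system
```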
Publisher
The type of publisher model to use. Currently, only Google is supported.
Name | Description |
---|---|
GOOGLE | |
Mistral
Settings for the Mistral AI LLM provider.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
"customHost": .ai.options.gloo.solo.io.UpstreamSpec.CustomHost
"model": string
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the Mistral API. This token is automatically sent in the `Authorization` header of the request and prefixed with `Bearer`. |
customHost | .ai.options.gloo.solo.io.UpstreamSpec.CustomHost | Optional: Send requests to a custom host and port, such as to proxy the request, or to use a different backend that is API-compliant with the upstream version. |
model | string | Optional: Override the model name. If unset, the model name is taken from the request. This setting can be useful when testing model failover scenarios. |
Anthropic
Settings for the Anthropic LLM provider.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
"customHost": .ai.options.gloo.solo.io.UpstreamSpec.CustomHost
"version": string
"model": string
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the Anthropic API. This token is automatically sent in the `x-api-key` header of the request. |
customHost | .ai.options.gloo.solo.io.UpstreamSpec.CustomHost | Optional: Send requests to a custom host and port, such as to proxy the request, or to use a different backend that is API-compliant with the upstream version. |
version | string | Optional: A version header to pass to the Anthropic API. For more information, see the Anthropic API versioning docs. |
model | string | Optional: Override the model name. If unset, the model name is taken from the request. This setting can be useful when testing model failover scenarios. |
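A sketch of an Anthropic provider block; the version header, model override, and secret name are placeholders:

```yaml
anthropic:
  version: "2023-06-01"              # placeholder version header
  model: claude-3-5-sonnet-20240620  # placeholder model override
  authToken:
    secretRef:
      name: anthropic-secret         # hypothetical secret
      namespace: gloo-system
```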
MultiPool
Configure backends for multiple hosts or models from the same provider in one Upstream resource. This method can be useful for creating one logical endpoint that is backed by multiple hosts or models.
In the `priorities` section, the order of `pool` entries defines the priority of the backend endpoints. The `pool` entries can either define a list of backends or a single backend.

Note: Only two levels of nesting are permitted. Any nested entries after the second level are ignored.
```yaml
multi:
  priorities:
  - pool:
    - azureOpenai:
        deploymentName: gpt-4o-mini
        apiVersion: 2024-02-15-preview
        endpoint: ai-gateway.openai.azure.com
        authToken:
          secretRef:
            name: azure-secret
            namespace: gloo-system
  - pool:
    - azureOpenai:
        deploymentName: gpt-4o-mini-2
        apiVersion: 2024-02-15-preview
        endpoint: ai-gateway-2.openai.azure.com
        authToken:
          secretRef:
            name: azure-secret-2
            namespace: gloo-system
```
"priorities": []ai.options.gloo.solo.io.UpstreamSpec.MultiPool.Priority
Field | Type | Description |
---|---|---|
priorities | []ai.options.gloo.solo.io.UpstreamSpec.MultiPool.Priority | The order of `pool` entries within this section defines the priority of the backend endpoints. |
Backend
An entry representing an LLM provider backend that the AI Gateway routes requests to.
"openai": .ai.options.gloo.solo.io.UpstreamSpec.OpenAI
"mistral": .ai.options.gloo.solo.io.UpstreamSpec.Mistral
"anthropic": .ai.options.gloo.solo.io.UpstreamSpec.Anthropic
"azureOpenai": .ai.options.gloo.solo.io.UpstreamSpec.AzureOpenAI
"gemini": .ai.options.gloo.solo.io.UpstreamSpec.Gemini
"vertexAi": .ai.options.gloo.solo.io.UpstreamSpec.VertexAI
Field | Type | Description |
---|---|---|
openai | .ai.options.gloo.solo.io.UpstreamSpec.OpenAI | Configure an OpenAI backend. Only one of `openai`, `mistral`, `anthropic`, `azureOpenai`, `gemini`, or `vertexAi` can be set. |
mistral | .ai.options.gloo.solo.io.UpstreamSpec.Mistral | Configure a Mistral AI backend. Only one of `mistral`, `openai`, `anthropic`, `azureOpenai`, `gemini`, or `vertexAi` can be set. |
anthropic | .ai.options.gloo.solo.io.UpstreamSpec.Anthropic | Configure an Anthropic backend. Only one of `anthropic`, `openai`, `mistral`, `azureOpenai`, `gemini`, or `vertexAi` can be set. |
azureOpenai | .ai.options.gloo.solo.io.UpstreamSpec.AzureOpenAI | Configure an Azure OpenAI backend. Only one of `azureOpenai`, `openai`, `mistral`, `anthropic`, `gemini`, or `vertexAi` can be set. |
gemini | .ai.options.gloo.solo.io.UpstreamSpec.Gemini | Configure a Gemini backend. Only one of `gemini`, `openai`, `mistral`, `anthropic`, `azureOpenai`, or `vertexAi` can be set. |
vertexAi | .ai.options.gloo.solo.io.UpstreamSpec.VertexAI | Configure a Vertex AI backend. Only one of `vertexAi`, `openai`, `mistral`, `anthropic`, `azureOpenai`, or `gemini` can be set. |
Priority
The order of `pool` entries within this section defines the priority of the backend endpoints.
"pool": []ai.options.gloo.solo.io.UpstreamSpec.MultiPool.Backend
Field | Type | Description |
---|---|---|
pool | []ai.options.gloo.solo.io.UpstreamSpec.MultiPool.Backend | A list of LLM provider backends within a single endpoint pool entry. |
RouteSettings
When you deploy the Gloo AI Gateway, you can use the `spec.options.ai` section of the RouteOptions resource to configure the behavior of the LLM provider on the level of individual routes. These route settings, such as prompt enrichment, retrieval augmented generation (RAG), and semantic caching, are applicable only for routes that send requests to an LLM provider backend.
For more information about the RouteOptions resource, see the API reference.
"promptEnrichment": .ai.options.gloo.solo.io.AIPromptEnrichment
"promptGuard": .ai.options.gloo.solo.io.AIPromptGuard
"rag": .ai.options.gloo.solo.io.RAG
"semanticCache": .ai.options.gloo.solo.io.SemanticCache
"defaults": []ai.options.gloo.solo.io.FieldDefault
"routeType": .ai.options.gloo.solo.io.RouteSettings.RouteType
Field | Type | Description |
---|---|---|
promptEnrichment | .ai.options.gloo.solo.io.AIPromptEnrichment | Enrich requests sent to the LLM provider by appending and prepending system prompts. This can be configured only for LLM providers that use the `CHAT` API route type. |
promptGuard | .ai.options.gloo.solo.io.AIPromptGuard | Set up prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response. |
rag | .ai.options.gloo.solo.io.RAG | Retrieval augmented generation (RAG) is a technique of providing relevant context by retrieving relevant data from one or more context datasets and augmenting the prompt with the retrieved information. This can be used to improve the quality of the generated text. |
semanticCache | .ai.options.gloo.solo.io.SemanticCache | Cache previous model responses to provide faster responses to similar requests in the future. Results might vary depending on the embedding mechanism used, as well as the similarity threshold set. |
defaults | []ai.options.gloo.solo.io.FieldDefault | Provide defaults to merge with user input fields. Defaults do not override the user input fields, unless you explicitly set `override` to `true`. |
routeType | .ai.options.gloo.solo.io.RouteSettings.RouteType | The type of route to the LLM provider API. Currently, `CHAT` and `CHAT_STREAMING` are supported. |
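A sketch of a RouteOptions resource that combines a few of these settings; the resource name and HTTPRoute name are placeholders, and the `defaults` entry follows the FieldDefault example later on this page:

```yaml
apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: openai-opt            # hypothetical name
  namespace: gloo-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai              # hypothetical HTTPRoute
  options:
    ai:
      routeType: CHAT
      defaults:
      - field: "temperature"
        value: 0.5
```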
RouteType
The type of route to the LLM provider API.
Name | Description |
---|---|
CHAT | The LLM generates the full response before responding to a client. |
CHAT_STREAMING | Stream responses to a client, which allows the LLM to stream out tokens as they are generated. |
FieldDefault
Provide defaults to merge with user input fields. Defaults do not override the user input fields, unless you explicitly set `override` to `true`.
Example overriding the system field for Anthropic:

```yaml
# Anthropic doesn't support a system chat type
defaults:
- field: "system"
  value: "answer all questions in french"
```

Example setting the temperature and overriding `max_tokens`:

```yaml
defaults:
- field: "temperature"
  value: 0.5
- field: "max_tokens"
  value: 100
```
"field": string
"value": .google.protobuf.Value
"override": bool
Field | Type | Description |
---|---|---|
field | string | The name of the field. |
value | .google.protobuf.Value | The field default value, which can be any JSON data type. |
override | bool | Whether to override the field's value if it already exists. Defaults to `false`. |
Postgres
Configuration settings for a Postgres datastore.
"connectionString": string
"collectionName": string
Field | Type | Description |
---|---|---|
connectionString | string | Connection string to the Postgres database. For example, to use a vector database deployed to your cluster, your connection string might look similar to `postgresql+psycopg://gloo:gloo@vector-db.default.svc.cluster.local:5432/gloo`. |
collectionName | string | Name of the collection table to use. |
Embedding
Configuration of the API used to generate the embedding.
"openai": .ai.options.gloo.solo.io.Embedding.OpenAI
"azureOpenai": .ai.options.gloo.solo.io.Embedding.AzureOpenAI
Field | Type | Description |
---|---|---|
openai | .ai.options.gloo.solo.io.Embedding.OpenAI | Embedding settings for the OpenAI provider. Only one of `openai` or `azureOpenai` can be set. |
azureOpenai | .ai.options.gloo.solo.io.Embedding.AzureOpenAI | Embedding settings for the Azure OpenAI provider. Only one of `azureOpenai` or `openai` can be set. |
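For example, a sketch of an Azure OpenAI embedding block; the endpoint, deployment name, API version, and secret are placeholders:

```yaml
embedding:
  azureOpenai:
    endpoint: my-endpoint.openai.azure.com  # placeholder endpoint
    deploymentName: text-embedding-ada-002  # placeholder embedding deployment
    apiVersion: 2024-02-15-preview          # placeholder API version
    authToken:
      secretRef:
        name: azure-secret                  # hypothetical secret
        namespace: gloo-system
```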
OpenAI
Embedding settings for the OpenAI provider.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the OpenAI API. This token is automatically sent in the `Authorization` header of the request and prefixed with `Bearer`. |
AzureOpenAI
Embedding settings for the Azure OpenAI provider.
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
"apiVersion": string
"endpoint": string
"deploymentName": string
Field | Type | Description |
---|---|---|
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the Azure OpenAI API. This token is automatically sent in the `api-key` header of the request. |
apiVersion | string | The version of the Azure OpenAI API to use. For more information, see the Azure OpenAI API version reference. |
endpoint | string | The endpoint for the Azure OpenAI API to use, such as `my-endpoint.openai.azure.com`. If the scheme is not included, it is added. |
deploymentName | string | The name of the Azure OpenAI model deployment to use. For more information, see the Azure OpenAI model docs. |
SemanticCache
Cache previous model responses to provide faster responses to similar requests in the future. Results might vary depending on the embedding mechanism used, as well as the similarity threshold set. Semantic caching reduces the number of requests to the LLM provider, improves the response time, and reduces costs.
Example configuring a route to use a `redis` datastore and OpenAI for the embedding:

```yaml
semanticCache:
  datastore:
    redis:
      connectionString: redis://172.17.0.1:6379
  embedding:
    openai:
      authToken:
        secretRef:
          name: openai-secret
          namespace: gloo-system
```
"datastore": .ai.options.gloo.solo.io.SemanticCache.DataStore
"embedding": .ai.options.gloo.solo.io.Embedding
"ttl": int
"mode": .ai.options.gloo.solo.io.SemanticCache.Mode
Field | Type | Description |
---|---|---|
datastore | .ai.options.gloo.solo.io.SemanticCache.DataStore | Data store in which to cache the request and response pairs. |
embedding | .ai.options.gloo.solo.io.Embedding | Model to use to generate the embedding. |
ttl | int | Time before data in the cache is considered expired. |
mode | .ai.options.gloo.solo.io.SemanticCache.Mode | The caching mode to use for the request and response lifecycle. Supported values are `READ_WRITE` or `READ_ONLY`. |
Redis
Settings for a Redis database.
"connectionString": string
"scoreThreshold": float
Field | Type | Description |
---|---|---|
connectionString | string | Connection string to the Redis database, such as `redis://172.17.0.1:6379`. |
scoreThreshold | float | Similarity score threshold value between 0.0 and 1.0 that determines how similar two queries must be in order to return a cached result. The lower the number, the more similar the queries must be for a cache hit. |
Weaviate
Settings for a Weaviate database.
"host": string
"httpPort": int
"grpcPort": int
"insecure": bool
Field | Type | Description |
---|---|---|
host | string | Connection string to the Weaviate database. Do not include the scheme. For example, the format `weaviate.my-ns.svc.cluster.local` is correct. The format `http://weaviate.my-ns.svc.cluster.local`, which includes the scheme, is incorrect. |
httpPort | int | HTTP port to use. If unset, defaults to `8080`. |
grpcPort | int | GRPC port to use. If unset, defaults to `50051`. |
insecure | bool | Whether to use a secure connection. Defaults to `true`. |
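For example, a sketch of a semantic cache datastore that uses Weaviate instead of Redis, with the host format and default ports from the table above:

```yaml
semanticCache:
  datastore:
    weaviate:
      host: weaviate.my-ns.svc.cluster.local  # no scheme, per the host field docs
      httpPort: 8080    # default HTTP port
      grpcPort: 50051   # default GRPC port
```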
DataStore
Data store from which to cache the request and response pairs.
"redis": .ai.options.gloo.solo.io.SemanticCache.Redis
"weaviate": .ai.options.gloo.solo.io.SemanticCache.Weaviate
Field | Type | Description |
---|---|---|
redis | .ai.options.gloo.solo.io.SemanticCache.Redis | Settings for a Redis database. Only one of `redis` or `weaviate` can be set. |
weaviate | .ai.options.gloo.solo.io.SemanticCache.Weaviate | Settings for a Weaviate database. Only one of `weaviate` or `redis` can be set. |
Mode
The caching mode to use for the request and response lifecycle.
Name | Description |
---|---|
READ_WRITE | Read and write to the cache as a part of the request and response lifecycle. |
READ_ONLY | Only read from the cache, and do not write to it. Data is written to the cache outside of the request and response cycle. |
RAG
Retrieval augmented generation (RAG) is a technique of providing relevant context by retrieving relevant data from one or more context datasets and augmenting the prompt with the retrieved information. This can be used to improve the quality of the generated text.
The same embedding mechanism that was used for the initial creation of the context datasets must be used for the prompt.
Example configuring a route to use a `postgres` datastore and OpenAI for RAG:

```yaml
rag:
  datastore:
    postgres:
      connectionString: postgresql+psycopg://gloo:gloo@172.17.0.1:6024/gloo
      collectionName: default
  embedding:
    openai:
      authToken:
        secretRef:
          name: openai-secret
          namespace: gloo-system
```
For an extended example that includes deploying a vector database with a context dataset, check out the Retrieval augmented generation (RAG) tutorial.
"datastore": .ai.options.gloo.solo.io.RAG.DataStore
"embedding": .ai.options.gloo.solo.io.Embedding
"promptTemplate": string
Field | Type | Description |
---|---|---|
datastore | .ai.options.gloo.solo.io.RAG.DataStore | Data store from which to fetch the context embeddings. |
embedding | .ai.options.gloo.solo.io.Embedding | Model to use to retrieve the context embeddings. |
promptTemplate | string | Template to use to embed the returned context. |
DataStore
"postgres": .ai.options.gloo.solo.io.Postgres
Field | Type | Description |
---|---|---|
postgres | .ai.options.gloo.solo.io.Postgres | Configuration settings for a Postgres datastore. |
AIPromptEnrichment
Enrich requests sent to the LLM provider by appending and prepending system prompts. This can be configured only for LLM providers that use the CHAT API type.
Prompt enrichment allows you to add additional context to the prompt before sending it to the model. Unlike RAG or other dynamic context methods, prompt enrichment is static and is applied to every request.
Note: Some providers, including Anthropic, do not support `SYSTEM` role messages, and instead have a dedicated system field in the input JSON. In this case, use the `defaults` setting to set the system field.

The following example prepends a system prompt of "Answer all questions in French." and appends "Describe the painting as if you were a famous art critic from the 17th century." to each request that is sent to the `openai` HTTPRoute.
```yaml
apiVersion: gateway.solo.io/v1
kind: RouteOption
metadata:
  name: openai-opt
  namespace: gloo-system
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: openai
  options:
    ai:
      promptEnrichment:
        prepend:
        - role: SYSTEM
          content: "Answer all questions in French."
        append:
        - role: USER
          content: "Describe the painting as if you were a famous art critic from the 17th century."
```
"prepend": []ai.options.gloo.solo.io.AIPromptEnrichment.Message
"append": []ai.options.gloo.solo.io.AIPromptEnrichment.Message
Field | Type | Description |
---|---|---|
prepend | []ai.options.gloo.solo.io.AIPromptEnrichment.Message | A list of messages to be prepended to the prompt sent by the client. |
append | []ai.options.gloo.solo.io.AIPromptEnrichment.Message | A list of messages to be appended to the prompt sent by the client. |
Message
An entry for a message to prepend or append to each prompt.
"role": string
"content": string
Field | Type | Description |
---|---|---|
role | string | Role of the message. The available roles depend on the backend LLM provider model, such as `SYSTEM` or `USER` in the OpenAI API. |
content | string | String content of the message. |
AIPromptGuard
Set up prompt guards to block unwanted requests to the LLM provider and mask sensitive data. Prompt guards can be used to reject requests based on the content of the prompt, as well as mask responses based on the content of the response.
This example rejects any request prompts that contain the string "credit card", and masks any credit card numbers in the response.

```yaml
promptGuard:
  request:
    customResponse:
      message: "Rejected due to inappropriate content"
    regex:
      action: REJECT
      matches:
      - pattern: "credit card"
        name: "CC"
  response:
    regex:
      builtins:
      - CREDIT_CARD
      action: MASK
```
"request": .ai.options.gloo.solo.io.AIPromptGuard.Request
"response": .ai.options.gloo.solo.io.AIPromptGuard.Response
Field | Type | Description |
---|---|---|
request | .ai.options.gloo.solo.io.AIPromptGuard.Request | Prompt guards to apply to requests sent by the client. |
response | .ai.options.gloo.solo.io.AIPromptGuard.Response | Prompt guards to apply to responses returned by the LLM provider. |
Regex
Regular expression (regex) matching for prompt guards and data masking.
"matches": []ai.options.gloo.solo.io.AIPromptGuard.Regex.RegexMatch
"builtins": []ai.options.gloo.solo.io.AIPromptGuard.Regex.BuiltIn
"action": .ai.options.gloo.solo.io.AIPromptGuard.Regex.Action
Field | Type | Description |
---|---|---|
matches | []ai.options.gloo.solo.io.AIPromptGuard.Regex.RegexMatch | A list of regex patterns to match against the request or response. Matches and built-ins are additive. |
builtins | []ai.options.gloo.solo.io.AIPromptGuard.Regex.BuiltIn | A list of built-in regex patterns to match against the request or response. Matches and built-ins are additive. |
action | .ai.options.gloo.solo.io.AIPromptGuard.Regex.Action | The action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. Response matches are always masked by default. |
RegexMatch
Regular expression (regex) matching for prompt guards and data masking.
"pattern": string
"name": string
Field | Type | Description |
---|---|---|
pattern | string | The regex pattern to match against the request or response. |
name | string | An optional name for this match, which can be used for debugging purposes. |
BuiltIn
Built-in regex patterns for specific types of strings in prompts. For example, if you specify `CREDIT_CARD`, any credit card numbers in the request or response are matched.
Name | Description |
---|---|
SSN | Default regex matching for Social Security numbers. |
CREDIT_CARD | Default regex matching for credit card numbers. |
PHONE_NUMBER | Default regex matching for phone numbers. |
EMAIL | Default regex matching for email addresses. |
Action
The action to take if a regex pattern is matched in a request or response. This setting applies only to request matches. Response matches are always masked by default.
Name | Description |
---|---|
MASK | Mask the matched data in the request. |
REJECT | Reject the request if the regex matches content in the request. |
Webhook
Configure a webhook to forward requests or responses to for prompt guarding.
"host": string
"port": int
"forwardHeaders": []ai.options.gloo.solo.io.AIPromptGuard.Webhook.HeaderMatch
Field | Type | Description |
---|---|---|
host | string | Host to send the traffic to. |
port | int | Port to send the traffic to. |
forwardHeaders | []ai.options.gloo.solo.io.AIPromptGuard.Webhook.HeaderMatch | Headers to forward with the request to the webhook. |
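A sketch of a request prompt guard that forwards prompts to a hypothetical in-cluster webhook service and passes along one header; the host, port, and header key are placeholders:

```yaml
promptGuard:
  request:
    webhook:
      host: guard-webhook.gloo-system.svc.cluster.local  # hypothetical service
      port: 8000                                         # hypothetical port
      forwardHeaders:
      - key: x-request-id    # hypothetical header to forward
        matchType: EXACT
```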
HeaderMatch
Describes how to match a given string in HTTP headers. Match is case-sensitive.
"key": string
"matchType": .ai.options.gloo.solo.io.AIPromptGuard.Webhook.HeaderMatch.MatchType
Field | Type | Description |
---|---|---|
key | string | The header key string to match against. |
matchType | .ai.options.gloo.solo.io.AIPromptGuard.Webhook.HeaderMatch.MatchType | The type of match to use. |
MatchType
The header string match type.
Name | Description |
---|---|
EXACT | The string must match exactly the specified string. |
PREFIX | The string must have the specified prefix. |
SUFFIX | The string must have the specified suffix. |
CONTAINS | The header string must contain the specified string. |
REGEX | The string must match the specified RE2-style regular expression pattern. |
regex | Do not use. Use `REGEX` (fully capitalized) instead. |
Moderation
Pass prompt data through an external moderation model endpoint, which compares the request prompt input to predefined content rules. Any requests that are routed through Gloo AI Gateway pass through the moderation model that you specify. If the content is identified as harmful according to the model’s content rules, the request is automatically rejected.
You can configure a moderation endpoint either as a standalone prompt guard setting or in addition to other request and response guard settings.
"openai": .ai.options.gloo.solo.io.AIPromptGuard.Moderation.OpenAI
Field | Type | Description |
---|---|---|
openai | .ai.options.gloo.solo.io.AIPromptGuard.Moderation.OpenAI | Configure an OpenAI moderation endpoint. |
OpenAI
Configure an OpenAI moderation endpoint.
"model": string
"authToken": .ai.options.gloo.solo.io.SingleAuthToken
Field | Type | Description |
---|---|---|
model | string | The name of the OpenAI moderation model to use. Defaults to `omni-moderation-latest`. |
authToken | .ai.options.gloo.solo.io.SingleAuthToken | The authorization token that the AI gateway uses to access the OpenAI moderation model. |
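A sketch of a request prompt guard that uses OpenAI moderation; the secret name is a placeholder and the model matches the documented default:

```yaml
promptGuard:
  request:
    moderation:
      openai:
        model: omni-moderation-latest   # documented default
        authToken:
          secretRef:
            name: openai-secret         # hypothetical secret
            namespace: gloo-system
```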
Request
Prompt guards to apply to requests sent by the client.
"customResponse": .ai.options.gloo.solo.io.AIPromptGuard.Request.CustomResponse
"regex": .ai.options.gloo.solo.io.AIPromptGuard.Regex
"webhook": .ai.options.gloo.solo.io.AIPromptGuard.Webhook
"moderation": .ai.options.gloo.solo.io.AIPromptGuard.Moderation
Field | Type | Description |
---|---|---|
customResponse | .ai.options.gloo.solo.io.AIPromptGuard.Request.CustomResponse | A custom response message to return to the client. If not specified, defaults to "The request was rejected due to inappropriate content". |
regex | .ai.options.gloo.solo.io.AIPromptGuard.Regex | Regular expression (regex) matching for prompt guards and data masking. |
webhook | .ai.options.gloo.solo.io.AIPromptGuard.Webhook | Configure a webhook to forward requests to for prompt guarding. |
moderation | .ai.options.gloo.solo.io.AIPromptGuard.Moderation | Pass prompt data through an external moderation model endpoint, which compares the request prompt input to predefined content rules. |
CustomResponse
A custom response to return to the client if request content is matched against a regex pattern and the action is `REJECT`.
"message": string
"statusCode": int
Field | Type | Description |
---|---|---|
message | string | A custom response message to return to the client. If not specified, defaults to "The request was rejected due to inappropriate content". |
statusCode | int | The status code to return to the client. |
Response
Prompt guards to apply to responses returned by the LLM provider.
"regex": .ai.options.gloo.solo.io.AIPromptGuard.Regex
"webhook": .ai.options.gloo.solo.io.AIPromptGuard.Webhook
Field | Type | Description |
---|---|---|
regex | .ai.options.gloo.solo.io.AIPromptGuard.Regex | Regular expression (regex) matching for prompt guards and data masking. |
webhook | .ai.options.gloo.solo.io.AIPromptGuard.Webhook | Configure a webhook to forward responses to for prompt guarding. |