Data Loss Prevention
Understanding DLP
Data Loss Prevention (DLP) is a method of ensuring that sensitive data isn’t logged or leaked. This is done by doing a series of regex replacements on the response body and content that is logged by Envoy (see Access Logging).
Valid regex patterns are those in the RE2 syntax. Note that some features such as lookaheads are not supported by RE2.
For example, we can use Gloo Edge to transform this response:
{
"fakevisa": "4397945340344828",
"ssn": "123-45-6789"
}
into this response:
{
"fakevisa": "XXXXXXXXXXXX4828",
"ssn": "XXX-XX-X789"
}
DLP is configured as an ordered list of Action
s on an HTTP listener, virtual service, or route. If
configured on the listener, an additional matcher is paired with a list of Action
s, and the first DLP rule that
matches a request will be applied.
The DLP filter will be run by Envoy after any other filters which might add data to be masked into the dynamic metadata. Gloo Edge’s current filter order follows:
- Fault Stage (Fault injection)
- DLP
- CORS
- Rest of the filters … (not all in the same stage)
DLP for access logs
By default, DLP will only run regex replacements on the response body. If
access logging is configured, the DLP actions
can also be applied to the headers and dynamic metadata that is logged by the configured access loggers. To do so, the enabledFor
DLP configuration option must be set to ACCESS_LOGS
or ALL
(to mask access logs AND the response bodies).
WAF access logs will only be masked when logged to Dynamic metadata. WAF logs written to Filter State will not be masked.
Prerequisites
Install Gloo Edge Enterprise.
Simple Example
In this example we will demonstrate masking responses using one of the predefined DLP Actions, rather than providing a custom regex.
First let’s begin by configuring a simple static upstream to an echo site.
kubectl apply -f - <<EOF
apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
name: json-upstream
namespace: gloo-system
spec:
static:
hosts:
- addr: echo.jsontest.com
port: 80
EOF
glooctl create upstream static --static-hosts echo.jsontest.com:80 --name json-upstream
Now let’s configure a simple virtual service to send requests to the upstream.
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
name: vs
namespace: gloo-system
spec:
virtualHost:
domains:
- '*'
routes:
- routeAction:
single:
upstream:
name: json-upstream
namespace: gloo-system
options:
autoHostRewrite: true
Run the following curl
to get the unmasked response:
curl $(glooctl proxy url)/ssn/123-45-6789/fakevisa/4397945340344828
The curl
should return:
{
"fakevisa": "4397945340344828",
"ssn": "123-45-6789"
}
Now let’s mask the SSN and credit card, apply the following virtual service:
kubectl apply -f - <<EOF
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
name: vs
namespace: gloo-system
spec:
virtualHost:
domains:
- '*'
routes:
- routeAction:
single:
upstream:
name: json-upstream
namespace: gloo-system
options:
autoHostRewrite: true
options:
dlp:
actions:
- actionType: SSN
- actionType: ALL_CREDIT_CARDS
EOF
Run the same curl
as before:
curl $(glooctl proxy url)/ssn/123-45-6789/fakevisa/4397945340344828
This time it will return a masked response:
{
"fakevisa": "XXXXXXXXXXXX4828",
"ssn": "XXX-XX-X789"
}
As noted above, DLP can also mask access logs by using a configuration like:
options:
dlp:
enabledFor: ALL
actions:
- actionType: SSN
- actionType: ALL_CREDIT_CARDS
Custom Example
In this example we will demonstrate defining our own custom DLP Action, rather than leveraging one of the predefined regular expressions.
Let’s start by creating our typical petstore microservice:
kubectl apply -f https://raw.githubusercontent.com/solo-io/gloo/v1.14.x/example/petstore/petstore.yaml
Apply the following virtual service to route to the Gloo Edge discovered petstore upstream:
kubectl apply -f - <<EOF
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
name: vs
namespace: gloo-system
spec:
virtualHost:
domains:
- '*'
routes:
- routeAction:
single:
upstream:
name: default-petstore-8080
namespace: gloo-system
EOF
Query the petstore microservice for a list of pets:
curl $(glooctl proxy url)/api/pets
You should obtain the following response:
[{"id":1,"name":"Dog","status":"available"},{"id":2,"name":"Cat","status":"pending"}]
Names are often used as personally identifying information, or PII. Let’s write our own regex to mask the names returned by the petstore service:
kubectl apply -f - <<EOF
apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
name: vs
namespace: gloo-system
spec:
virtualHost:
domains:
- '*'
routes:
- routeAction:
single:
upstream:
name: default-petstore-8080
namespace: gloo-system
options:
dlp:
actions:
- customAction:
maskChar: "X"
name: test # only used for logging
percent:
value: 60 # % of regex match to mask
regexActions:
- regex: '"name":[^"]*"([^"]*)"'
subgroup: 1
EOF
Query for pets again:
curl $(glooctl proxy url)/api/pets
You should get a masked response:
[{"id":1,"name":"XXg","status":"available"},{"id":2,"name":"XXt","status":"pending"}]
Key/value (header masking) example
In this example, you define a key/value DLP action, which you can use to mask the value associated with a specified request header.
Predefined and Custom actions will only match based on header value in access logs. To match against a header name, use a key/value action.
-
Get started by creating the petstore microservice.
kubectl apply -f https://raw.githubusercontent.com/solo-io/gloo/v1.14.x/example/petstore/petstore.yaml
-
Apply the following virtual service to route to the Gloo Edge discovered upstream for petstore.
apiVersion: gateway.solo.io/v1 kind: VirtualService metadata: name: vs namespace: gloo-system spec: virtualHost: domains: - '*' routes: - routeAction: single: upstream: name: default-petstore-8080 namespace: gloo-system
-
Apply the following gateway definition, which configures the
gateway-proxy
deployment to log the value of thetest-header
request header.apiVersion: gateway.solo.io/v1 kind: Gateway metadata: name: gateway-proxy namespace: gloo-system spec: bindAddress: '::' bindPort: 8080 proxyNames: - gateway-proxy useProxyProto: false options: accessLoggingService: accessLog: - fileSink: stringFormat: "test-header: %REQ(test-header)%\n" path: /dev/stdout
-
Query the petstore microservice. The
test-value
value is specified for thetest-header
request header.curl $(glooctl proxy url)/api/pets -H test-header:test-value
-
Get the access logs from the gateway-proxy deployment.
kubectl logs deployment/gateway-proxy -n gloo-system
Verify that you see the following log entry:
test-header: test-value
-
To mask the value of the
test-header
request header, update the virtual service that you created in step 2 to use a DLP key/value matcher.apiVersion: gateway.solo.io/v1 kind: VirtualService metadata: name: vs namespace: gloo-system spec: virtualHost: domains: - '*' routes: - routeAction: single: upstream: name: default-petstore-8080 namespace: gloo-system options: dlp: enabledFor: ALL actions: - keyValueAction: maskChar: "*" name: test-header-mask # only used for logging keyToMask: test-header percent: value: 100 # % of regex match to mask actionType: KEYVALUE
-
Send another request to the petstore service.
curl $(glooctl proxy url)/api/pets -H test-header:test-value
-
Check the gateway-proxy access logs again.
kubectl logs deployment/gateway-proxy -n gloo-system
Verify that you see the following log entry, in which the value is masked:
test-header: ****-*****
Some notes on key/value actions:
- You cannot use key/value actions to mask pseudo headers.
- Key/value actions do not mask data in response bodies. They mask only the value of request headers, response headers, and dynamic metadata in access logs.
- To apply key/value actions, you must set
dlp.enabledFor
toALL
orACCESS_LOGS
.
Summary
In this tutorial we installed Gloo Edge Enterprise and demonstrated rewriting responses from upstreams with both the provided default regex patterns as well as the custom regex config.
Cleanup
kubectl delete vs vs -n gloo-system
kubectl delete us json-upstream -n gloo-system