Data Loss Prevention

DLP is a feature of Gloo Enterprise v1.0.0+. Gloo Enterprise release candidate v1.0.0-rc1 was the first version to support this feature. v1.0.0-rc2 contained some minor fixes to the Gloo-provided regular expressions. This guide is written for v1.0.0-rc2+.

Understanding DLP

Data Loss Prevention (DLP) is a method of ensuring that sensitive data isn’t logged or leaked. Gloo does this by applying a series of regex replacements to the response body.

For example, we can use Gloo to transform this response:

{
   "fakevisa": "4397945340344828",
   "ssn": "123-45-6789"
}

into this response:

{
   "fakevisa": "XXXXXXXXXXXX4828",
   "ssn": "XXX-XX-X789"
}
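
The masking shown above can be approximated in a few lines of Python. This is only a sketch, not Gloo's implementation: the two patterns are simplified stand-ins for the Gloo-provided regular expressions, and the rule of masking the leading 75% of each match while leaving punctuation untouched is an assumption inferred from the example output.

```python
import re

# Simplified stand-ins for the Gloo-provided patterns (assumptions, not the real regexes).
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD = re.compile(r"\b\d{16}\b")


def mask_match(text, percent=75, mask_char="X"):
    """Mask the leading `percent` of characters, keeping non-alphanumerics as-is."""
    cutoff = int(len(text) * percent / 100 + 0.5)  # round half up (assumed)
    head = "".join(mask_char if c.isalnum() else c for c in text[:cutoff])
    return head + text[cutoff:]


def apply_dlp(body, patterns=(SSN, CARD)):
    """Run each pattern over the body in order, masking every match."""
    for pattern in patterns:
        body = pattern.sub(lambda m: mask_match(m.group(0)), body)
    return body


body = '{"fakevisa": "4397945340344828", "ssn": "123-45-6789"}'
print(apply_dlp(body))
```

Under these assumptions the script prints the same masked values as the response above.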

DLP is configured as a list of Actions, applied in order, on an HTTP listener, virtual service, or route. When DLP is configured on the listener, each list of Actions is paired with a matcher to form a DLP rule, and only the first rule whose matcher matches the request is applied.
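
The listener-level behaviour described above can be modelled as follows. The sketch assumes prefix matchers on the request path and represents each Action as a plain Python function; `DlpRule`, `select_actions`, `apply_actions`, and the example paths are hypothetical names used only for illustration, not part of Gloo's API.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional

Action = Callable[[str], str]  # an Action rewrites a response body


@dataclass
class DlpRule:
    prefix: str            # hypothetical prefix matcher
    actions: List[Action]  # applied in order


def select_actions(rules: List[DlpRule], path: str) -> Optional[List[Action]]:
    """Return the actions of the first rule whose matcher matches the request path."""
    for rule in rules:
        if path.startswith(rule.prefix):
            return rule.actions
    return None


def apply_actions(actions: List[Action], body: str) -> str:
    for action in actions:  # order matters: each action sees the previous result
        body = action(body)
    return body


def hide_digits(body: str) -> str:
    return re.sub(r"\d", "X", body)


def hide_letters(body: str) -> str:
    return re.sub(r"[a-z]", "Y", body)


rules = [
    DlpRule(prefix="/accounts", actions=[hide_digits, hide_letters]),
    DlpRule(prefix="/", actions=[hide_digits]),
]

# "/accounts/1" matches both rules, but only the first one is used.
print(apply_actions(select_actions(rules, "/accounts/1"), "id 42"))  # YY XX
print(apply_actions(select_actions(rules, "/pets"), "id 42"))        # id XX
```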

DLP is one of the first filters run by Envoy. Gloo’s current filter order is as follows:

  1. Fault Stage (Fault injection)
  2. CORS/DLP Stage (the relative order of these two filters is not guaranteed)
  3. WAF Stage
  4. Rest of the filters … (not all in the same stage)

Prerequisites

Install Gloo Enterprise.

Simple Example

In this example we will demonstrate masking responses using one of the predefined DLP Actions, rather than providing a custom regex.

First, let’s configure a simple static upstream that points to an echo site.


apiVersion: gloo.solo.io/v1
kind: Upstream
metadata:
  name: json-upstream
  namespace: gloo-system
spec:
  static:
    hosts:
      - addr: echo.jsontest.com
        port: 80

Alternatively, create the same upstream with glooctl:

glooctl create upstream static --static-hosts echo.jsontest.com:80 --name json-upstream

Now let’s configure a simple virtual service to send requests to the upstream.

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
      - '*'
    routes:
      - routeAction:
          single:
            upstream:
              name: json-upstream
              namespace: gloo-system
        routePlugins:
          autoHostRewrite: true

Run the following curl to get the unmasked response:

curl $(glooctl proxy url)/ssn/123-45-6789/fakevisa/4397945340344828

The curl should return:

{
   "fakevisa": "4397945340344828",
   "ssn": "123-45-6789"
}

Now let’s mask the SSN and credit card number by applying the following virtual service:

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
    - '*'
    routes:
    - routeAction:
        single:
          upstream:
            name: json-upstream
            namespace: gloo-system
      routePlugins:
        autoHostRewrite: true
    virtualHostPlugins:
      dlp:
        actions:
        - actionType: SSN
        - actionType: ALL_CREDIT_CARDS

Run the same curl as before:

curl $(glooctl proxy url)/ssn/123-45-6789/fakevisa/4397945340344828

This time it will return a masked response:

{
   "fakevisa": "XXXXXXXXXXXX4828",
   "ssn": "XXX-XX-X789"
}
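
When scripting this check, it can help to confirm that no unmasked values remain in a response. The following sketch assumes simplified patterns for SSNs and 16-digit card numbers; `find_unmasked` is a hypothetical helper, not part of Gloo or glooctl.

```python
import json
import re

# Simplified detection patterns (assumptions for illustration only).
SENSITIVE = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d{16}\b"),
}


def find_unmasked(body: str) -> dict:
    """Return {pattern name: [matches]} for every sensitive value still present."""
    hits = {name: rx.findall(body) for name, rx in SENSITIVE.items()}
    return {name: found for name, found in hits.items() if found}


masked = json.dumps({"fakevisa": "XXXXXXXXXXXX4828", "ssn": "XXX-XX-X789"})
unmasked = json.dumps({"fakevisa": "4397945340344828", "ssn": "123-45-6789"})

print(find_unmasked(masked))    # {}
print(find_unmasked(unmasked))  # lists the SSN and the card number
```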

Custom Example

In this example we will demonstrate defining our own custom DLP Action, rather than leveraging one of the predefined regular expressions.

Let’s start by creating our typical petstore microservice:

kubectl apply -f https://raw.githubusercontent.com/sololabs/demos2/master/resources/petstore.yaml

Apply the following virtual service to route to the Gloo-discovered petstore upstream:

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
    - '*'
    routes:
    - routeAction:
        single:
          upstream:
            name: default-petstore-8080
            namespace: gloo-system

Query the petstore microservice for a list of pets:

curl $(glooctl proxy url)/api/pets

You should obtain the following response:

[{"id":1,"name":"Dog","status":"available"},{"id":2,"name":"Cat","status":"pending"}]

Names are often treated as personally identifiable information (PII). Let’s write our own regex to mask the names returned by the petstore service:

apiVersion: gateway.solo.io/v1
kind: VirtualService
metadata:
  name: vs
  namespace: gloo-system
spec:
  virtualHost:
    domains:
    - '*'
    routes:
    - routeAction:
        single:
          upstream:
            name: default-petstore-8080
            namespace: gloo-system
    virtualHostPlugins:
      dlp:
        actions:
        - customAction:
            maskChar: "Y"
            name: test   # only used for logging
            percent:
              value: 60  # % of regex match to mask
            regex:
            - '(?!"name"[\s]*:[\s]*")[^"]+(?="[\s]*,|"[\s]})'

Query for pets again:

curl $(glooctl proxy url)/api/pets

You should get a masked response:

[{"id":1,"name":"YYg","status":"available"},{"id":2,"name":"YYt","status":"pending"}]
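
To see why only the names change, the custom action can be reproduced outside Gloo. This Python sketch reuses the regex from the virtual service above; the masking rule (mask the leading 60% of each match, rounded to the nearest character) is an assumption that agrees with the response shown, and Python's `re` module may behave differently from the regex engine Envoy uses.

```python
import re

# Regex copied from the customAction above.
NAME = re.compile(r'(?!"name"[\s]*:[\s]*")[^"]+(?="[\s]*,|"[\s]})')


def mask(match, percent=60, mask_char="Y"):
    """Replace the leading `percent` of the matched text with `mask_char`."""
    text = match.group(0)
    cutoff = int(len(text) * percent / 100 + 0.5)  # assumed rounding
    return mask_char * cutoff + text[cutoff:]


body = ('[{"id":1,"name":"Dog","status":"available"},'
        '{"id":2,"name":"Cat","status":"pending"}]')

print(NAME.findall(body))    # ['Dog', 'Cat']
print(NAME.sub(mask, body))
```

In this response the pattern matches only text that is immediately followed by `",`, which is why the `status` values, followed by `"}`, are left untouched.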

Summary

In this tutorial we installed Gloo Enterprise and demonstrated rewriting responses from upstreams with both the predefined regex patterns and a custom regex configuration.

Cleanup

kubectl delete vs vs -n gloo-system
kubectl delete us json-upstream -n gloo-system
kubectl delete -f https://raw.githubusercontent.com/sololabs/demos2/master/resources/petstore.yaml