Envoy rate limiting is a fairly complex system built from multiple components. While there are many articles on the Internet explaining the basic setup and how each component works in isolation, we weren’t able to find anything that explains how the components work together end-to-end in simple terms.
That’s why we’ve created this blog post, covering both the envoy and the rate limit service configurations.
To learn how to integrate the configuration into Istio resources, feel free to scroll to the bottom of the page for the complete example.
This document assumes that the reader has previous experience operating Kubernetes and Envoy. The target audience for this document is mostly platform engineers and Kubernetes or Istio operators.
Wayfair and Envoy Rate Limits
At Wayfair, development teams are actively using Kubernetes - with over 1500 applications currently running in our clusters. These applications are written in multiple languages and use a large number of frameworks.
We wanted to avoid the need to build a set of libraries for each language to manage things like request metrics, identity and access control, or rate limiting. A service mesh provides these and many more features at the infrastructure level, making them language and framework agnostic.
At Wayfair, we decided to implement our service mesh based on Istio, which in turn uses the envoy proxy at its core.
One of the features envoy and Istio provide is rate limiting. Rate limiting ensures that your application doesn’t receive more than a specified number of requests over a period of time. This effectively helps to avoid overloading your application by a bad actor or a misconfigured client. While there are other mechanisms that help achieve similar results (e.g. circuit breakers), rate limiting is the simplest one to implement as a first line of defense.
We will use the following formatting conventions throughout the document:
- “Resource names” will be quoted and italicized;
- Code snippets will use monospaced font;
Introduction
Rate limits are configured in two places: the first is the envoy “rate_limits filters” and the second is Lyft’s “ratelimit service” configuration. Envoy’s filters contain “actions”, which result in a “descriptor”. This “descriptor” is sent to the ratelimit service, which uses it to make a decision on a specific limit.
Envoy rate_limits filters
A single “rate_limits filter” contains a list of “actions”, each of which in turn contains a list of “rate limit actions”. Envoy will attempt to match every request going through the “rate_limits filter” against each “action”. A request matches an “action” only if it matches every “rate limit action” inside.
Example
Let’s assume we have a single “rate_limits filter” with two “actions” (for the sake of simplicity, only the relevant part of the configuration displayed):
rate_limits:
- actions:
  - header_value_match:
      descriptor_value: path
      headers:
      - name: :path
        prefix_match: /middleware
- actions:
  - header_value_match:
      descriptor_value: get
      headers:
      - name: :method
        prefix_match: GET
  - request_headers:
      descriptor_key: service
      header_name: x-service
The first “actions” block reads as follows: match any request with a path starting with /middleware.
The second “actions” block reads as follows: match any request with method GET and header x-service (we will explain the descriptor_key field in a bit).
A request may match zero or more “actions”, which will result in a corresponding number of “descriptor entries” sent to the “ratelimit service”. Let’s take a look at a few example requests and how they match the “actions” above. For the sake of simplicity, we will assume that the first action results in “descriptor entry” 1 and the second action results in “descriptor entry” 2.
- GET /middleware with the x-service: boo header matches both “actions”, so “descriptor entries” 1 and 2 are sent;
- GET /middleware without the x-service header matches only the first “action” (the request_headers “rate limit action” fails because the header is missing), so only “descriptor entry” 1 is sent;
- POST /middleware matches only the first “action”, so only “descriptor entry” 1 is sent;
- GET /other with the x-service: boo header matches only the second “action”, so only “descriptor entry” 2 is sent;
- POST /other matches neither “action”, so no “descriptor entries” are sent.
As you can see, “rate limit actions” inside of an “action” are joined by the logical AND. At the same time, “actions” are independent from each other. One action not resulting in a “descriptor entry” does not prevent others from resulting in a “descriptor entry”.
Rate limit actions and descriptors
A “descriptor” is a set of “descriptor entries” corresponding to an “action”. Each “descriptor entry” is a key-value tuple and generally a “descriptor” will look like this:
(“descriptor-key-1”, “descriptor-value-1”)
(“descriptor-key-2”, “descriptor-value-2”)
Envoy provides a pre-set number of “rate limit actions”. Each “rate limit action” generates a “descriptor entry” with either a pre-set or a configurable key and value. Values can be either static or dynamic, based on one or more request fields. A full list of rate limit actions provided by envoy is available here.
Keep in mind that Istio does not always ship the latest envoy. It’s usually a good idea to check which envoy version is shipped with your Istio and read the documentation for the corresponding version.
For example, the generic_key “rate limit action” has a pre-set key generic_key and allows configuring a static value.
Another example is the request_headers “rate limit action”, which allows configuring a descriptor key and gets its value from one of the request headers.
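As an illustration, here is a hypothetical snippet combining both of these “rate limit actions” (the values some-endpoint, client and x-client are made up for this example and are not part of the setup below):

```yaml
rate_limits:
- actions:
  # generic_key: pre-set key "generic_key", static value from descriptor_value
  - generic_key:
      descriptor_value: some-endpoint
  # request_headers: configurable key, value taken from the x-client request header
  - request_headers:
      descriptor_key: client
      header_name: x-client
```

A request carrying the header x-client: foo would produce the “descriptor” (“generic_key”, “some-endpoint”)(“client”, “foo”); a request without that header would produce no “descriptor” from this “action” at all.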
Example
Let’s use our example configuration above and the following request, which matches both “actions”:
GET /middleware
x-service: boo
As this request matches both “actions”, envoy will generate two “descriptors”, one per “action”.
First descriptor
First descriptor will be generated from this “action”:
- header_value_match:
    descriptor_value: path
    headers:
    - name: :path
      prefix_match: /middleware
It is using the header_value_match “rate limit action”. This action has a pre-set key header_match and will pull the value from the descriptor_value field of the envoy configuration. The resulting “descriptor” will look as follows:
(“header_match”, “path”)
Second descriptor
Second descriptor will be generated from this “action”:
- header_value_match:
    descriptor_value: get
    headers:
    - name: :method
      prefix_match: GET
- request_headers:
    descriptor_key: service
    header_name: x-service
It is using two “rate limit actions”: header_value_match and request_headers. We already know how header_value_match works, so let’s take a look at request_headers. This “rate limit action” allows us to configure the descriptor key using the descriptor_key field and will set the descriptor value using the value from the header configured in the header_name field. As we have two “rate limit actions” in this “action”, the resulting “descriptor” will have two “descriptor entries”:
(“header_match”, “get”)(“service”, “boo”)
Ratelimit service configuration
At the top level, the ratelimit service configuration operates with a set of descriptors inside of a “domain”. A “domain” must be unique per ratelimit service.
“domains” in the service configuration provide isolation for “rate limit configurations”. For the sake of simplicity, in this document we will assume that the service configuration has a single “domain”.
“rate limit configurations” are stored in the descriptors dictionary of the config file. They are used to classify requests and provide each class with a certain number of requests. We will call these classes “buckets” in this document.
“rate limit configurations” are stored as a list. Each configuration must have a key and may or may not have a value. These keys and values are matched against the keys and values from the “descriptors”. If a value for a configuration is not specified, each unique key/value combination matching this configuration will get its own bucket. A combination of a key and a value (or a key without a value) must be unique inside of the domain.
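For example, the following hypothetical configuration (the domain name is made up; the service key mirrors our examples) specifies a key without a value:

```yaml
domain: per-value-example
descriptors:
# No "value" field: every unique value of the "service" key
# gets its own bucket.
- key: service
  rate_limit:
    requests_per_unit: 10
    unit: second
```

With this configuration, the descriptors (“service”, “boo”) and (“service”, “bar”) are tracked independently: each gets its own 10 rps bucket rather than sharing one.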
Ratelimit service matches on a full “descriptor”, not on individual “descriptor entries”. In order to match a “descriptor” with multiple “descriptor entries”, a nested “descriptor configuration” must be used. In this case, nested “descriptor configurations” are joined by a logical AND.
In case multiple “descriptor configurations” match a “descriptor”, the most specific one will be applied. In case no “descriptor configurations” match a “descriptor”, no limit is applied, effectively allowing the descriptor an unlimited number of requests.
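To illustrate specificity, here is a hypothetical configuration (again with a made-up domain name) where a “descriptor configuration” carries both its own limit and a nested, more specific one:

```yaml
domain: specificity-example
descriptors:
- key: header_match
  value: get
  # Applies to the one-entry descriptor ("header_match", "get")
  rate_limit:
    requests_per_unit: 100
    unit: second
  descriptors:
  # Applies to the two-entry descriptor
  # ("header_match", "get")("service", "boo")
  - key: service
    value: boo
    rate_limit:
      requests_per_unit: 3
      unit: second
```

A descriptor (“header_match”, “get”) gets the 100 rps bucket, while (“header_match”, “get”)(“service”, “boo”) matches the nested, more specific configuration and gets 3 rps.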
Example
Let’s take a look at the following ratelimit service configuration:
domain: example-ratelimit
descriptors:
- key: service
  rate_limit:
    requests_per_unit: 10
    unit: second
- key: header_match
  value: path
  rate_limit:
    requests_per_unit: 20
    unit: second
- key: header_match
  value: get
  descriptors:
  - key: service
    value: boo
    rate_limit:
      requests_per_unit: 3
      unit: second
The first “rate limit configuration” reads as follows: match any “descriptor” with the key service and provide a bucket with 10 requests per second (rps).
The second “rate limit configuration” will match only “descriptors” with key header_match and value path. It will provide a limit of 20 rps to this “descriptor”. Effectively only this will match:
(“header_match”, “path”)
The third “rate limit configuration” has a nested “rate limit configuration”. This means it will only match “descriptors” with key header_match and value get AND key service and value boo. Effectively this “descriptor” will match:
(“header_match”, “get”)(“service”, “boo”)
Putting it all together
The following diagram shows the relationship between Envoy filters and ratelimit service configurations in general.
Istio configuration
Now let’s take a look at what a complete Istio-based configuration looks like for our rate limit setup. The ratelimit service uses Redis as a key-value storage. For the sake of simplicity, we will create a Redis Deployment with a single replica and expose it with a Service:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ratelimit-demo-redis
    component: redis
  name: ratelimit-demo-redis-dep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratelimit-demo-redis
      component: redis
  template:
    metadata:
      labels:
        app: ratelimit-demo-redis
        component: redis
    spec:
      containers:
      - image: redis:6.0.6
        imagePullPolicy: IfNotPresent
        name: istio-ratelimit
        resources:
          limits:
            cpu: 1500m
            memory: 512Mi
          requests:
            cpu: 200m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ratelimit-demo-redis
  name: redis
spec:
  ports:
  - name: redis
    port: 6379
    protocol: TCP
    targetPort: 6379
  selector:
    app: ratelimit-demo-redis
    component: redis
Next we will configure our ratelimit service. It will be a single-pod Deployment and will also be exposed with a Service. The YAML configuration for the service will be provided via a ConfigMap:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ratelimit-demo-rl
    component: istio-ratelimit
  name: ratelimit-demo-rl-dep
spec:
  selector:
    matchLabels:
      app: ratelimit-demo-rl
      component: istio-ratelimit
  template:
    metadata:
      labels:
        app: ratelimit-demo-rl
        component: istio-ratelimit
    spec:
      containers:
      - command:
        - /bin/ratelimit
        env:
        - name: REDIS_SOCKET_TYPE
          value: tcp
        - name: REDIS_URL
          value: redis:6379 # points to the Service we've created above
        - name: RUNTIME_ROOT
          value: /data
        - name: RUNTIME_SUBDIRECTORY
          value: ratelimit
        - name: RUNTIME_IGNOREDOTFILES
          value: "true"
        - name: RUNTIME_WATCH_ROOT
          value: "false"
        image: envoyproxy/ratelimit:ef13143c # latest build from master at the time of writing of this document
        name: istio-ratelimit
        resources:
          limits:
            cpu: 1500m
            memory: 512Mi
          requests:
            cpu: 200m
            memory: 256Mi
        volumeMounts:
        - mountPath: /data/ratelimit/config
          name: config-volume
      volumes:
      - configMap:
          defaultMode: 420
          name: ratelimit-demo-rl-cm
        name: config-volume
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ratelimit-demo-rl
  name: ratelimit-demo-ratelimit
spec:
  ports:
  - name: ratelimit-app
    port: 42080
    protocol: TCP
    targetPort: 8080
  - name: ratelimit-app-grpc
    port: 42081
    protocol: TCP
    targetPort: 8081
  selector:
    app: ratelimit-demo-rl
    component: istio-ratelimit
---
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: ratelimit-demo-rl
  name: ratelimit-demo-rl-cm
data:
  config.yaml: |
    domain: example-ratelimit
    descriptors:
    - key: service
      rate_limit:
        requests_per_unit: 10
        unit: second
    - key: header_match
      value: path
      rate_limit:
        requests_per_unit: 20
        unit: second
    - key: header_match
      value: get
      descriptors:
      - key: service
        value: boo
        rate_limit:
          requests_per_unit: 3
          unit: second
The next step is to configure the Istio sidecar to use envoy’s ratelimit plugin and point it to our service. This is done using an EnvoyFilter resource:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  labels:
    app: ratelimit-demo
  name: filter-ratelimit
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
            subFilter:
              name: envoy.router
    patch:
      operation: INSERT_BEFORE
      value:
        config:
          domain: example-ratelimit # must match the domain in the ratelimit ConfigMap
          failure_mode_deny: false # run the plugin in fail-open mode: no limiting happens if ratelimit is unavailable
          rate_limit_service:
            grpc_service:
              envoy_grpc:
                cluster_name: rate_limit_service
              timeout: 0.25s
        name: envoy.rate_limit
  - applyTo: CLUSTER
    match:
      cluster:
        service: ratelimit-demo-ratelimit
    patch:
      operation: ADD
      value:
        connect_timeout: 0.25s
        hosts:
        - socket_address:
            address: ratelimit-demo-ratelimit # ratelimit Service name
            port_value: 42081 # and the gRPC port exposed by the Service
        http2_protocol_options: {}
        lb_policy: ROUND_ROBIN
        name: rate_limit_service
        type: STRICT_DNS
  workloadSelector:
    labels:
      app: ratelimit-demo
And finally, we will need another EnvoyFilter to supply the envoy “rate_limits filters”:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  labels:
    app: ratelimit-demo
  name: filter-ratelimit-actions
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_INBOUND
      routeConfiguration:
        vhost:
          name: inbound|http|80 # the port must be a port your Service is listening on
    patch:
      operation: MERGE
      value:
        rate_limits:
        - actions:
          - header_value_match:
              descriptor_value: path
              headers:
              - name: :path
                prefix_match: /middleware
        - actions:
          - header_value_match:
              descriptor_value: get
              headers:
              - name: :method
                prefix_match: GET
          - request_headers:
              descriptor_key: service
              header_name: x-service
  workloadSelector:
    labels:
      app: ratelimit-demo-app # label used to identify pods running your applications
Now deploy an application into your namespace, expose it with a Service listening on port 80, and try sending multiple requests. You should start getting HTTP 429 responses if you send more than 3 GET requests per second with the x-service: boo header, or more than 20 requests per second to the /middleware path of the Service.
That's Envoy Rate Limits, courtesy of Wayfair!
Interested in joining our Engineering team? Explore open roles on our Careers Page.