Envoy rate limiting is a fairly complex system built from multiple components. While there are many articles on the Internet explaining the basic setup and how each component works in isolation, we weren’t able to find anything that explains how the components work together end-to-end in simple terms.
That’s why we’ve created this blog post, covering both the envoy and the rate limit service configurations.
To learn how to integrate the configuration into Istio resources, feel free to scroll to the bottom of the page for the complete example.
This document assumes that the reader has previous experience operating Kubernetes and Envoy. The target audience for this document is mostly platform engineers and Kubernetes or Istio operators.
Wayfair and Envoy Rate Limits
At Wayfair, development teams are actively using Kubernetes - with over 1500 applications currently running in our clusters. These applications are written in multiple languages and use a large number of frameworks.
We wanted to avoid the need to build a set of libraries for each language to manage things like request metrics, identity and access control, or rate limiting. A service mesh provides these and many more features at the infrastructure level, making them language and framework agnostic.
At Wayfair, we decided to implement our service mesh based on Istio, which in turn uses the envoy proxy at its core.
One of the features envoy and Istio provide is rate limiting. Rate limiting ensures that your application doesn’t receive more than a specified number of requests over a period of time. This effectively helps to avoid overloading your application by a bad actor or a misconfigured client. While there are other mechanisms that help achieve similar results (e.g. circuit breakers), rate limiting is the simplest one to implement as a first line of defense.
We will use the following formatting conventions throughout the document:
- “Resource names” will be quoted and italicized;
- Code snippets will use monospaced font;
Introduction
Rate limits are configured in two places: the first is the envoy “rate_limits filters” and the second is Lyft’s “ratelimit service” configuration. Envoy’s filters contain “actions”, which result in a “descriptor”. This “descriptor” is sent to the ratelimit service, which uses it to make a decision on a specific limit.
Envoy rate_limits filters
A single “rate_limits filter” contains a list of “actions”, each of which in turn contains a list of “rate limit actions”. Envoy will attempt to match every request going through the “rate_limits filter” against each “action”. A request matches an “action” only if it matches every “rate limit action” inside.
Example
Let’s assume we have a single “rate_limits filter” with two “actions” (for the sake of simplicity, only the relevant part of the configuration displayed):
rate_limits:
- actions:
  - header_value_match:
      descriptor_value: path
      headers:
      - name: :path
        prefix_match: /middleware
- actions:
  - header_value_match:
      descriptor_value: get
      headers:
      - name: :method
        prefix_match: GET
  - request_headers:
      descriptor_key: service
      header_name: x-service
The first “actions” block reads as follows: match any request with a path starting with /middleware.
The second “actions” block reads as follows: match any request with method GET and header x-service (we will explain the descriptor_key field in a bit).
A request may match zero or more “actions”, which will result in a corresponding number of “descriptor entries” sent to the “ratelimit service”. Let’s take a look at a few example requests and how they match the “actions” above. For the sake of simplicity, we will assume that the first action results in “descriptor entry” 1 and the second action results in “descriptor entry” 2.
- GET /middleware with the x-service: boo header matches both “actions”, so “descriptor entries” 1 and 2 are sent;
- GET /middleware without the x-service header matches only the first “action” (the request_headers “rate limit action” fails because the header is missing), so only “descriptor entry” 1 is sent;
- POST /middleware matches only the first “action”, so only “descriptor entry” 1 is sent;
- GET /other with the x-service: boo header matches only the second “action”, so only “descriptor entry” 2 is sent;
- POST /other matches neither “action”, so no “descriptor entries” are sent.
As you can see, “rate limit actions” inside of an “action” are joined by the logical AND. At the same time, “actions” are independent from each other. One action not resulting in a “descriptor entry” does not prevent others from resulting in a “descriptor entry”.
Rate limit actions and descriptors
A “descriptor” is a set of “descriptor entries” corresponding to an “action”. Each “descriptor entry” is a key-value tuple and generally a “descriptor” will look like this:
(“descriptor-key-1”, “descriptor-value-1”)
(“descriptor-key-2”, “descriptor-value-2”)
Envoy provides a pre-set number of “rate limit actions”. Each “rate limit action” generates a “descriptor entry” with either a pre-set or a configurable key and value. Values can be either static or dynamic, based on one or more request fields. A full list of rate limit actions provided by envoy is available here.
Keep in mind that Istio does not always ship the latest envoy. It’s usually a good idea to check which envoy version is shipped with your Istio and read the documentation for the corresponding version.
For example, the generic_key “rate limit action” has a pre-set key generic_key and allows configuring a static value.
Another example is the request_headers “rate limit action”, which allows configuring a descriptor key and gets its value from one of the request headers.
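As an illustration, here is a hypothetical snippet combining both of these “rate limit actions” (the values some-endpoint, client and x-client are made up for this example and are not part of the setup below):

```yaml
rate_limits:
- actions:
  # generic_key: pre-set key "generic_key", static value from descriptor_value
  - generic_key:
      descriptor_value: some-endpoint
  # request_headers: configurable key, value taken from the x-client request header
  - request_headers:
      descriptor_key: client
      header_name: x-client
```

A request carrying the header x-client: foo would produce the “descriptor” (“generic_key”, “some-endpoint”)(“client”, “foo”); a request without that header would produce no “descriptor” from this “action” at all.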
Example
Let’s use our example configuration above and the following request, which matches both “actions”:
GET /middleware
x-service: boo
As this request matches both “actions”, envoy will generate two “descriptors”, one per “action”.
First descriptor
First descriptor will be generated from this “action”:
- header_value_match:
    descriptor_value: path
    headers:
    - name: :path
      prefix_match: /middleware
It is using the header_value_match “rate limit action”. This action has a pre-set key header_match and will pull the value from the descriptor_value field of the envoy configuration. The resulting “descriptor” will look as follows:
(“header_match”, “path”)
Second descriptor
Second descriptor will be generated from this “action”:
- header_value_match:
    descriptor_value: get
    headers:
    - name: :method
      prefix_match: GET
- request_headers:
    descriptor_key: service
    header_name: x-service
It is using two “rate limit actions”: header_value_match and request_headers. We already know how header_value_match works, so let’s take a look at request_headers. This “rate limit action” allows us to configure the descriptor key using the descriptor_key field and will set the descriptor value using the value from the header configured in the header_name field. As we have two “rate limit actions” in this “action”, the resulting “descriptor” will have two “descriptor entries”:
(“header_match”, “get”)(“service”, “boo”)
Ratelimit service configuration
At the top level, the ratelimit service configuration operates with a set of descriptors inside of a “domain”. A “domain” must be unique per ratelimit service.
“domains” in the service configuration provide isolation for “rate limit configurations”. For the sake of simplicity, in this document we will assume that the service configuration has a single “domain”.
“rate limit configurations” are stored in the descriptors dictionary of the config file. They are used to classify requests and provide each class with a certain number of requests. We will call these classes “buckets” in this document.
“rate limit configurations” are stored as a list. Each configuration must have a key and may or may not have a value. These keys and values are matched against the keys and values from the “descriptors”. If a value for a configuration is not specified, each unique key/value combination matching this configuration will get its own bucket. A combination of a key and a value (or a key without a value) must be unique inside of the domain.
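For example, the following hypothetical configuration (the domain name is made up; the service key mirrors our examples) specifies a key without a value:

```yaml
domain: per-value-example
descriptors:
# No "value" field: every unique value of the "service" key
# gets its own bucket.
- key: service
  rate_limit:
    requests_per_unit: 10
    unit: second
```

With this configuration, the descriptors (“service”, “boo”) and (“service”, “bar”) are tracked independently: each gets its own 10 rps bucket rather than sharing one.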
Ratelimit service matches on a full “descriptor”, not on individual “descriptor entries”. In order to match a “descriptor” with multiple “descriptor entries”, a nested “descriptor configuration” must be used. In this case, nested “descriptor configurations” are joined by a logical AND.
In case multiple “descriptor configurations” match a “descriptor”, the most specific one will be applied. In case no “descriptor configurations” match a “descriptor”, no limit is applied, effectively allowing the descriptor an unlimited number of requests.
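To illustrate specificity, here is a hypothetical configuration (again with a made-up domain name) where a “descriptor configuration” carries both its own limit and a nested, more specific one:

```yaml
domain: specificity-example
descriptors:
- key: header_match
  value: get
  # Applies to the one-entry descriptor ("header_match", "get")
  rate_limit:
    requests_per_unit: 100
    unit: second
  descriptors:
  # Applies to the two-entry descriptor
  # ("header_match", "get")("service", "boo")
  - key: service
    value: boo
    rate_limit:
      requests_per_unit: 3
      unit: second
```

A descriptor (“header_match”, “get”) gets the 100 rps bucket, while (“header_match”, “get”)(“service”, “boo”) matches the nested, more specific configuration and gets 3 rps.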
Example
Let’s take a look at the following ratelimit service configuration:
domain: example-ratelimit
descriptors:
- key: service
  rate_limit:
    requests_per_unit: 10
    unit: second
- key: header_match
  value: path
  rate_limit:
    requests_per_unit: 20
    unit: second
- key: header_match
  value: get
  descriptors:
  - key: service
    value: boo
    rate_limit:
      requests_per_unit: 3
      unit: second
The first “rate limit configuration” reads as follows: match any “descriptor” with the key service and provide a bucket with 10 requests per second (rps).
The second “rate limit configuration” will match only “descriptors” with key header_match and value path. It will provide a limit of 20 rps to this “descriptor”. Effectively only this will match:
(“header_match”, “path”)
The third “rate limit configuration” has a nested “rate limit configuration”. This means it will only match “descriptors” with key header_match and value get AND key service and value boo. Effectively this “descriptor” will match:
(“header_match”, “get”)(“service”, “boo”)
Putting it all together
The following diagram shows the relationship between Envoy filters and ratelimit service configurations in general.
Istio configuration
Now let’s take a look at what a complete Istio-based configuration looks like for our rate limit setup. The ratelimit service uses Redis as a key-value storage. For the sake of simplicity, we will create a Redis Deployment with a single replica and expose it with a Service:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ratelimit-demo-redis
    component: redis
  name: ratelimit-demo-redis-dep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ratelimit-demo-redis
      component: redis
  template:
    metadata:
      labels:
        app: ratelimit-demo-redis
        component: redis
    spec:
      containers:
      - image: redis:6.0.6
        imagePullPolicy: IfNotPresent
        name: istio-ratelimit
        resources:
          limits:
            cpu: 1500m
            memory: 512Mi
          requests:
            cpu: 200m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ratelimit-demo-redis
  name: redis
spec:
  ports:
  - name: redis
    port: 6379
    protocol: TCP
    targetPort: 6379
  selector:
    app: ratelimit-demo-redis
    component: redis
Next we will configure our ratelimit service. It will be a single-pod Deployment and will also be exposed with a Service. The YAML configuration for the service will be provided via a ConfigMap:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: ratelimit-demo-rl
    component: istio-ratelimit
  name: ratelimit-demo-rl-dep
spec:
  selector:
    matchLabels:
      app: ratelimit-demo-rl
      component: istio-ratelimit
  template:
    metadata:
      labels:
        app: ratelimit-demo-rl
        component: istio-ratelimit
    spec:
      containers:
      - command:
        - /bin/ratelimit
        env:
        - name: REDIS_SOCKET_TYPE
          value: tcp
        - name: REDIS_URL
          value: redis:6379 # points to the Service we've created above
        - name: RUNTIME_ROOT
          value: /data
        - name: RUNTIME_SUBDIRECTORY
          value: ratelimit
        - name: RUNTIME_IGNOREDOTFILES
          value: "true"
        - name: RUNTIME_WATCH_ROOT
          value: "false"
        image: envoyproxy/ratelimit:ef13143c # latest build from master at the time of writing of this document
        name: istio-ratelimit
        resources:
          limits:
            cpu: 1500m
            memory: 512Mi
          requests:
            cpu: 200m
            memory: 256Mi
        volumeMounts:
        - mountPath: /data/ratelimit/config
          name: config-volume
      volumes:
      - configMap:
          defaultMode: 420
          name: ratelimit-demo-rl-cm
        name: config-volume
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: ratelimit-demo-rl
  name: ratelimit-demo-ratelimit
spec:
  ports:
  - name: ratelimit-app
    port: 42080
    protocol: TCP
    targetPort: 8080
  - name: ratelimit-app-grpc
    port: 42081
    protocol: TCP
    targetPort: 8081
  selector:
    app: ratelimit-demo-rl
    component: istio-ratelimit
---
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: ratelimit-demo-rl
  name: ratelimit-demo-rl-cm
data:
  config.yaml: |
    domain: example-ratelimit
    descriptors:
    - key: service
      rate_limit:
        requests_per_unit: 10
        unit: second
    - key: header_match
      value: path
      rate_limit:
        requests_per_unit: 20
        unit: second
    - key: header_match
      value: get
      descriptors:
      - key: service
        value: boo
        rate_limit:
          requests_per_unit: 3
          unit: second
The next step is to configure the Istio sidecar to use envoy’s ratelimit plugin and point it to our service. This is done using an EnvoyFilter resource:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  labels:
    app: ratelimit-demo
  name: filter-ratelimit
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
            subFilter:
              name: envoy.router
    patch:
      operation: INSERT_BEFORE
      value:
        config:
          domain: example-ratelimit # must match the domain in the ratelimit ConfigMap
          failure_mode_deny: false # run the plugin in fail-open mode: no limiting happens if ratelimit is unavailable
          rate_limit_service:
            grpc_service:
              envoy_grpc:
                cluster_name: rate_limit_service
              timeout: 0.25s
        name: envoy.rate_limit
  - applyTo: CLUSTER
    match:
      cluster:
        service: ratelimit-demo-ratelimit
    patch:
      operation: ADD
      value:
        connect_timeout: 0.25s
        hosts:
        - socket_address:
            address: ratelimit-demo-ratelimit # ratelimit Service name
            port_value: 42081 # and the gRPC port exposed by the Service
        http2_protocol_options: {}
        lb_policy: ROUND_ROBIN
        name: rate_limit_service
        type: STRICT_DNS
  workloadSelector:
    labels:
      app: ratelimit-demo
And finally, we will need another EnvoyFilter to supply the envoy “rate_limits filters”:
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  labels:
    app: ratelimit-demo
  name: filter-ratelimit-actions
spec:
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_INBOUND
      routeConfiguration:
        vhost:
          name: inbound|http|80 # the port must be a port your Service is listening on
    patch:
      operation: MERGE
      value:
        rate_limits:
        - actions:
          - header_value_match:
              descriptor_value: path
              headers:
              - name: :path
                prefix_match: /middleware
        - actions:
          - header_value_match:
              descriptor_value: get
              headers:
              - name: :method
                prefix_match: GET
          - request_headers:
              descriptor_key: service
              header_name: x-service
  workloadSelector:
    labels:
      app: ratelimit-demo-app # label used to identify pods running your applications
Now deploy an application into your namespace, expose it with a Service listening on port 80, and try sending multiple requests. You should start getting HTTP 429 responses if you send more than 3 GET requests per second with the x-service: boo header, or more than 20 requests per second to the /middleware path of the Service.
That's Envoy Rate Limits, courtesy of Wayfair!
Interested in joining our Engineering team? Explore open roles on our Careers Page.