API Gateway and a Service Mesh working together
In a recent presentation about API Gateways, I got a question about service mesh that I neither fully understood nor answered well. So here is an attempt to understand these concepts and see how they fit together in a typical container-based microservice deployment. At the end of the day, it is all about routing and shaping traffic.
API Gateways have been around for a while and are an essential component of “Full Lifecycle API Management” products, as Gartner puts it. In this lifecycle, the API Gateway is typically an L7 reverse proxy, with HTTP (REST, GraphQL, etc.) being the most popular protocol. The traffic could be north-south or east-west. The most important capabilities include authentication, authorization, rate limiting, throttling, quotas, request/response transformation, load balancing and observability. Some examples are Kong, Apigee and Spring Cloud Gateway.
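To make a few of these capabilities concrete, here is a minimal sketch of what a gateway does on each request: authenticate the caller, apply a rate limit, then pick an upstream. It is not how Kong, Apigee or Spring Cloud Gateway are implemented. The API key, the route table, the upstream URLs and the class names are all hypothetical, and the token bucket is just one common way to rate-limit.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        # Refill according to the time elapsed since the last call.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical data, for illustration only.
API_KEYS = {"demo-key": "team-a"}
ROUTES = {
    "/orders": "http://orders.internal:8080",
    "/users": "http://users.internal:8080",
}


class Gateway:
    def __init__(self, rate=1.0, capacity=2, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}  # one bucket per authenticated consumer

    def handle(self, api_key, path):
        """Return (status, upstream); upstream is None when rejected."""
        # 1. Authentication: unknown keys are rejected first.
        consumer = API_KEYS.get(api_key)
        if consumer is None:
            return 401, None
        # 2. Rate limiting: each consumer gets its own bucket.
        if consumer not in self.buckets:
            self.buckets[consumer] = TokenBucket(
                self.rate, self.capacity, self.clock
            )
        if not self.buckets[consumer].allow():
            return 429, None
        # 3. Routing: match the path prefix against the route table.
        #    (A real gateway would now proxy the request; this sketch
        #    only reports which upstream it would pick.)
        for prefix, upstream in ROUTES.items():
            if path == prefix or path.startswith(prefix + "/"):
                return 200, upstream
        return 404, None
```

The clock is injectable only so that the behaviour can be exercised without sleeping. Calling `Gateway().handle("demo-key", "/orders/1")` returns the status and the upstream the request would be forwarded to.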
Service Mesh is a relatively new pattern to optimize L4 to L7 communication, typically between microservices, and is usually deployed as a sidecar reverse proxy. The most important capabilities include service discovery, mutual TLS (mTLS), rate limiting, throttling, quotas, load balancing and observability. This generally implies east-west traffic, but it could be north-south if the ingress happens to be in the same cluster. Note that mTLS requires all participants to trust a common certificate authority (a shared root of trust). Istio and Linkerd are a couple of examples.
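The mTLS requirement can be sketched with Python's standard `ssl` module. This is only an illustration of the idea that both sides present a certificate and both sides verify the peer against a CA they trust. It is not how Istio or Linkerd configure their sidecars, and the function names and file paths are hypothetical. The paths are optional here only so the sketch can run without real certificate files.

```python
import ssl


def sidecar_server_context(ca_file=None, cert_file=None, key_file=None):
    """TLS context for the receiving side: the client MUST present a cert."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # This line is what makes the TLS "mutual": reject clients without
    # a certificate that chains to a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # The CA that both sides trust (hypothetical path supplied by caller).
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # This workload's own identity.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def sidecar_client_context(ca_file=None, cert_file=None, key_file=None):
    """TLS context for the calling side: verify the server, present our cert."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

In a mesh, the sidecars do this on behalf of the application and the mesh rotates the certificates, so the service code itself usually stays plain HTTP.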
Clearly there is some overlap in capabilities, and the title suggests they work together, so where does each fit? Let us follow the flow of traffic for an HTTP request. Typically, requests come into your app (one of the service meshes, as shown below) via a CDN/Firewall/Load Balancer (not shown) and end up at an External API Gateway (ingress + API gateway, or the newer gateway-api ingress). The gateway then routes the request as configured. Further, there may be Internal API Gateways between isolated applications. The overlap is therefore not in the same traffic path: each component serves a different layer of the routing and has a specific purpose.
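The traffic paths described above can be written down as a small model. This is a sketch under stated assumptions: each application is treated as its own mesh, every service has a sidecar, and the service and application names are made up for illustration.

```python
# Hypothetical layout: service name -> application (each application
# is assumed to be its own service mesh).
APP_OF = {"orders": "shop", "cart": "shop", "invoices": "billing"}


def hops(source, target):
    """Return the ordered components a request passes through.

    `source` is a service name, or None for a client outside the cluster.
    """
    path = []
    if source is None:
        # North-south: outside traffic enters through the external gateway.
        path.append("external-api-gateway")
    else:
        # East-west: the call leaves through the caller's sidecar.
        path.append(f"sidecar:{source}")
        if APP_OF[source] != APP_OF[target]:
            # Crossing between isolated applications goes via the
            # internal gateway.
            path.append("internal-api-gateway")
    path.append(f"sidecar:{target}")
    path.append(target)
    return path


print(hops(None, "orders"))        # outside client to a service
print(hops("orders", "cart"))      # two services in the same application
print(hops("orders", "invoices"))  # services in different applications
```

No single request is handled by the mesh and a gateway at the same hop, which is why the overlapping features rarely collide in practice.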
Finally, there is no free lunch, so we must consider trade-offs. Every new component adds complexity and new emergent failure modes to consider. It is one more thing to understand and explain, implement, patch for security issues, upgrade, collect logs from, monitor, and so on. As with any topic, the answer is always “it depends” on your use cases and requirements!
If these topics interest you, reach out to me; I would appreciate any feedback. If you would like to work on such problems, you will generally find open roles as well! Please refer to LinkedIn.