
Avoiding overload in distributed systems by putting the smaller service in control

The Amazon Builders' Library is a great set of deep-dive papers on the challenges of building modern systems. This post looks at how Amazon balances system stability between control and data plane requests.

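"Avoiding overload in distributed systems by putting the smaller service in control" looks at a pattern where the smaller control plane fleet, rather than the much larger data plane, is responsible for making sure the service doesn't get overloaded.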

I call out a few more details in the Twitter thread below…

Tweet 1/10

wrapping up my Amazon Builders' Library week, I'm looking at "Avoiding overload in distributed systems by putting the smaller service in control" today. this paper is by @_joemag_ from @awscloud 🧵☁️ #cloud #devops

Tweet 2/10

you can view this thread unrolled at https://t.co/b3S6hIfSAB, and yesterday's thread on "Automating safe, hands-off deployments" by @clare_liguori is up at https://markn.ca/2021/automating-safe-hands-off-deployments/ 🧵☁️ #cloud #devops

Tweet 3/10

this is a shorter, straight to the ➡ paper. it discusses an uncommon pattern between the two planes of most services:
- data plane, "responsible for executing customer requests"
- control plane, "responsible for managing and vending customer configuration"
🧵☁️ #cloud #devops

Tweet 4/10

the paper mentions a number of interactions but focuses on the pattern where the smaller control plane fleet is in charge of making sure the service doesn't get overloaded (which is the opposite of most designs) 🧵☁️ #cloud #devops

Tweet 5/10

the data plane is typically 100x (or more) the size of the control plane. that makes sense given that it's doing most of the work. the author uses EC2 as an example. a lot more systems run compute/storage vs. vending those configurations 🧵☁️ #cloud #devops

Tweet 6/10

it's natural to assume that it would be best for the bigger service to manage the overall health. after all, it's already doing way more work to keep itself running. the 📑 provides more details here 🧵☁️ #cloud #devops

Tweet 7/10

in some cases, when the volume of control plane requests is high & their predictability is low, the author makes the case that it's more sensible for the small plane (control) to be in the driver's seat 🧵☁️ #cloud #devops

Tweet 8/10

having the control plane responsible for stability needs some specific plumbing and design patterns in place, but the overall benefit is worth it (when merited). this is a really fascinating write-up on a pattern that probably comes up more than we realize 🧵☁️ #cloud #devops

Tweet 9/10

the paper doesn't provide a complete roadmap here, but it does a good job of providing the key points that signal this reverse pattern could be the solution you're looking for. this paper is exactly what the Builders' Library was made for. 🧵☁️ #cloud #devops

Tweet 10/10

it helps others learn from some hard-won lessons @awscloud. take a few minutes to read this one all the way through, it's excellent. thanks for sharing this @_joemag_ /🧵☁️ #cloud #devops
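
To make the idea from the thread a bit more concrete, here's a toy sketch in Python. It is not code from the paper, and every name in it (`DataPlaneServer`, `ControlPlane`, `capacity`, the "tick" model) is an assumption made up for illustration. It compares what happens when a large data plane fleet all asks the small control plane for configuration at once with what happens when the control plane walks the fleet at a pace it picks itself.

```python
from dataclasses import dataclass


@dataclass
class DataPlaneServer:
    """Hypothetical data plane server that holds a config version."""
    name: str
    config_version: int = 0


@dataclass
class ControlPlane:
    """Hypothetical small fleet that can handle `capacity` updates per tick."""
    capacity: int
    latest_version: int = 1

    def peak_load_when_pulled(self, fleet):
        # Pull model: every data plane server asks at the same time
        # (say, after a mass restart), so the big fleet sets the load.
        return len(fleet)

    def push_updates(self, fleet):
        # Push model: the control plane decides the pace and never
        # touches more than `capacity` servers in a single tick.
        loads = []
        stale = [s for s in fleet if s.config_version < self.latest_version]
        while stale:
            batch, stale = stale[:self.capacity], stale[self.capacity:]
            for server in batch:
                server.config_version = self.latest_version
            loads.append(len(batch))
        return loads


# A data plane 100x larger than what the control plane can serve per tick.
fleet = [DataPlaneServer(f"dp-{i}") for i in range(1000)]
control = ControlPlane(capacity=10)

pulled_peak = control.peak_load_when_pulled(fleet)
push_loads = control.push_updates(fleet)

print(pulled_peak)      # 1000 simultaneous requests against a capacity of 10
print(max(push_loads))  # 10, the load never exceeds what the control plane chose
print(len(push_loads))  # 100 ticks to update the whole fleet
```

The trade-off in this toy model is that the push approach takes longer to reach every server, but the peak load on the small fleet is bounded by a number the small fleet picked rather than by the size of the big fleet.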
