
Can We Improve How Lyft Handled Service Discovery on AWS In 2016?

In late 2016, Lyft demonstrated the service discovery engine they built on AWS. Five years later, how well does that design hold up? What could we improve given the services and features available today?


In late 2016, Lyft did an AWS “This is My Architecture” video. It was one of the first. In the video they explained how they tackled the problem of discovering what microservices were available and healthy in their environment.

Now, a few years later, I react to that video to see what’s stood the test of time and what could be done more simply with today’s technology, and I critique the design against the AWS Well-Architected Framework.

The AWS Well-Architected Framework

The AWS Well-Architected Framework is designed to help you and your team make informed trade-offs while building in the AWS Cloud. It’s built on five pillars: operational excellence, security, reliability, performance efficiency, and cost optimization.

These pillars cover the primary concerns of building and running any solution. And as much as we’d all love to have everything, that’s just not possible.

…enter the framework.

It’ll help you strike the right balance for your goals to make sure that your build is the best it can be now and moving forward.

Why Architecture?

I often get asked why I talk about building in the cloud and architectural choices so often…aren’t I a security person?

Yes, I do focus on security, and architecture is a critical part of that.

There are really two types of security design work. The first is when you’re handed something and need to make sure the risks of that technology match the risk appetite of the users.

The second type is when you’re building the technology. This is where making choices informed by security early in the process can have profound effects. You’re no longer bolting security on but building it in by design.

That’s why I talk about architecture and building so much. It’s where we all can have the largest possible security impact!

This video—and the ones that will come after—looks at a specific set of design decisions and how they balance the concerns of the AWS Well-Architected Framework…where security is one of the five pillars.

Lyft’s Design

Lyft’s system runs on a service-based design. They have a service for matching riders to drivers, one for logging in, one for processing transactions, and so on.

This architectural style makes it easier for teams to work independently on different parts of the same system. It’s not a new concept by any means but lately it’s getting more attention because of the way the cloud enables teams to work in this manner.

In this video, Lyft explains their challenge when it comes to “discovery.” Basically, making sure that each service knows that the others are online and healthy.

This is a common challenge because these services are so independent. A discovery engine (or service) addresses this problem by providing a directory of available services and how to communicate with them.
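To make the idea of a directory concrete, here’s a minimal sketch in Python. It is not Lyft’s implementation. The `ServiceDirectory` class, the heartbeat-with-TTL approach, the service name, and the addresses are all assumptions I’ve made up for illustration. Each instance reports in periodically, and a lookup returns only the instances that have reported recently.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ServiceDirectory:
    """Hypothetical in-memory directory: service name -> {address: last heartbeat}."""

    ttl_seconds: float = 30.0  # assumed freshness window, not a value from the video
    _entries: dict = field(default_factory=dict)

    def heartbeat(self, service, address, now=None):
        """Record that an instance of `service` at `address` is alive."""
        now = time.time() if now is None else now
        self._entries.setdefault(service, {})[address] = now

    def lookup(self, service, now=None):
        """Return the addresses that sent a heartbeat within the TTL."""
        now = time.time() if now is None else now
        instances = self._entries.get(service, {})
        return sorted(
            address
            for address, last_seen in instances.items()
            if now - last_seen <= self.ttl_seconds
        )


# Fixed timestamps keep the example deterministic.
directory = ServiceDirectory(ttl_seconds=30)
directory.heartbeat("matching", "10.0.0.5:8080", now=0)
directory.heartbeat("matching", "10.0.0.6:8080", now=20)

print(directory.lookup("matching", now=25))  # ['10.0.0.5:8080', '10.0.0.6:8080']
print(directory.lookup("matching", now=40))  # ['10.0.0.6:8080']
```

The second lookup drops the first instance because its last heartbeat is older than the TTL. That is one simple way to treat "hasn’t checked in lately" as "unhealthy."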

Because of the nature of their service (matching riders to drivers in the real world), Lyft understandably has reliability as a top priority. They don’t want a driver or rider to ever be without service.

They’ve designed their solution with this in mind. Any component can be offline and the system will still function. It might not be working at peak performance, but at least the job still gets done.
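One common way to get that kind of graceful degradation is for each caller to remember the last answer it got from the discovery service. Here’s a small Python sketch of that idea. I’m not claiming this is how Lyft built it. The `CachingDiscoveryClient` class and the `fetch` callable are hypothetical names for illustration.

```python
class CachingDiscoveryClient:
    """Hypothetical client that falls back to its last known answer."""

    def __init__(self, fetch):
        # `fetch` is an assumed callable: fetch(service) -> list of addresses.
        # It is expected to raise ConnectionError when discovery is unreachable.
        self._fetch = fetch
        self._cache = {}

    def resolve(self, service):
        try:
            addresses = list(self._fetch(service))
        except ConnectionError:
            # Discovery is down: serve possibly stale data rather than failing.
            return list(self._cache.get(service, []))
        self._cache[service] = addresses
        return list(addresses)


# Simulate the discovery service going offline between two calls.
state = {"online": True}


def fake_fetch(service):
    if not state["online"]:
        raise ConnectionError("discovery service unreachable")
    return ["10.0.0.5:8080"]


client = CachingDiscoveryClient(fake_fetch)
print(client.resolve("matching"))  # ['10.0.0.5:8080']

state["online"] = False
print(client.resolve("matching"))  # ['10.0.0.5:8080'] (served from the cache)
```

The trade-off is staleness. While discovery is down, callers may keep trying an instance that has since gone away, so this pattern works best alongside retries or health checks on the caller’s side.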

It’s a fascinating design that still holds up years later. Watch the video 👆 for more of the details!

Btw, I’ve updated my course, “Mastering The AWS Well-Architected Framework” on A Cloud Guru. If you want a solid walk through of the ideas behind the framework and how to apply it to your work in the AWS Cloud, check it out!
