Road to re:Invent - AWS Databases

AWS offers a wide range of databases but making sense of these services can be tricky. In this live stream, we explore these data services and why you pick one over another.

Here are the slides that I used during the live stream.

Reasonably Accurate 🤖🧠 Transcript

All right. Good morning, everybody. How are you doing? Thanks for jumping on the stream. It was quiet last week on the streams, but not quiet work-wise. I changed positions, which is fantastic, but that kept me slightly busy last week, so I wasn’t able to jump on the streams. Let me just double-check the quality.

It’s looking good. We are five by five. Awesome. We’re live on Periscope through Twitter as well. This is the easiest place for comments, so I’ve got it up and I can see it as we’re going along and I’m showing you other things. So if you have some questions as I’m going along, that’s absolutely the best time to ask them. I will cover them after the fact as well.

I’m kind of keen to see if other people add traditional comments over time, but definitely ask them here in the stream. I will do my best to field them from Twitter as well; Twitter should be a little bit better. But of course I turned off notifications because I didn’t want the stream to get disturbed.

It’s one of those sort of catch-22s. So, a picturesque fall graphic for you, because I certainly don’t have any leaves left on the trees around my neck of the woods. It was a beautiful fall, but it is quickly switching into winter, which is sort of a Halloween tradition up here where I live in Canada: every kid comes dressed as the same thing, something underneath a parka, because it’s cold, with sleet, icy rain, or snow starting off the day. So what we’re going to talk about is AWS database service offerings.

So, let’s dive right in. Let me share my screen. We are going to go to Google Chrome. Let’s make that a little bit bigger so you guys can see it. So there is actually a databases page for AWS now; they make this a little bit easier. Databases.

Here’s what we’ve got to offer you. There’s the countdown clock too, because this is a key part of the AWS offerings. And in fact, what we’ll see is Werner will get up on stage and talk about how they are expanding their database offerings and all of these different services that give you purpose-built databases, which is relatively true; there’s a little bit of stretching in there.

But they have a wide variety, far more than most other cloud providers, and they are all really high-end, which is great. But they do have some differences, and that is absolutely important to understand, so they have a breakdown on this web page. So this is, if we look, aws.amazon.com/products/databases.

Right, so that’s the official databases page. It actually has a whole bunch of different use cases positioned on this page, which is great, but they all boil down to storing data and being able to query it. We already saw a very unique database structure when we did the streaming episode, and when we talked about S3 we talked about data as well, and treated those as really unique applications.

But today we’re talking about sort of more traditional, old-school database concepts, and to that end, let’s start looking at this. So this is a set of slides I worked out for today; I wanted to walk you guys through this. There are four major, well, three major offerings and a tool under the category of relational databases. These are what you would traditionally think of as a database: they store data in tables and you can query it out.

So in this case, we have Amazon Aurora, which is a truly cloud-native, scalable, ridiculously low-cost, really, really high-end relational database service. By default you should just be on Aurora. There are Postgres and MySQL interfaces into Aurora, and there is even a serverless version/feature in Aurora to make it easier to access from AWS Lambda and serverless designs. But realistically, if you’re doing a traditional database structure, Aurora is where you want to start. That is the most logical place.

It hits sort of that best balance: I’m comfortable dealing with the traditional approach, I get at it through two very popular, standardized open-source interfaces, Postgres and MySQL, and it really takes advantage of the fault tolerance in the cloud. It is a managed service, which is great. Amazon Redshift is their data warehouse.

So for sort of old-school data warehouse applications, where you’re doing batch database processing, it’s a great call. And RDS is sort of the catch-all for everything else. If you need an MS SQL database, if you need an Oracle database, traditional Postgres, traditional MySQL, or MariaDB, this is where you get it.

And so the tool is the Database Migration Service, which will let you move between any of these three other services but also, importantly, on-premises stuff. So if you have a database on-premises that’s running, like, Microsoft SQL, you can actually sync your data, your triggers, your functions, all that stuff over into, like, Aurora, and it’ll do the conversions as necessary.

It can also be used to move data from production to test, or from test to production. You really want to be careful about production data ending up in test, but it’s very useful for migrating, and the source can pretty much be anywhere. So that’s traditional relational databases. Next is the document category: Amazon DocumentDB is document-based and is fully MongoDB compatible.

Essentially, you’re storing documents and querying based on any property in those documents. A document in this case is a collection of key-value pairs. So you can have, you know, title equals “databases”, author equals, content equals. It’s a style of database where it’s very much up to you as the user to define what goes in there. Traditionally, they don’t have schemas, which are data definitions, and you can query the data as is. Very handy, very useful in a lot of scenarios.
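To make the document idea concrete, here is a minimal sketch in plain Python rather than DocumentDB or MongoDB itself. The documents, field values, and the `find` helper are all hypothetical; the point is only that documents need not share a schema and that any property can be used as a filter.

```python
# Hypothetical documents; note they do not all carry the same properties.
documents = [
    {"title": "AWS Databases", "author": "Mark", "content": "Relational, graph, key-value"},
    {"title": "AWS Storage", "author": "Mark", "tags": ["s3"]},
    {"title": "Graph Basics", "author": "Jen", "content": "Nodes and edges"},
]


def find(docs, **criteria):
    """Return every document whose properties match all given key/value pairs."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Query on any property, whether or not every document has it.
print([d["title"] for d in find(documents, author="Mark")])
print([d["title"] for d in find(documents, tags=["s3"])])
```

A real document database adds indexes and a richer query language on top, but the mental model of "filter on whatever properties the document happens to have" is the same.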

Next is graph. A graph database is really good for tracking relationships. Now, I know what you’re going to say: didn’t we just do a relational category? Yeah, we did. Graphs are actually far better at tracking relationships and the properties of those relationships, and I’ll show you why in detail in a second. Amazon Neptune is the offering to get us there.

I was a little bit disappointed in that it’s a traditional sort of instance-based offering, so you still have to stand up an instance, same as DocumentDB. I would like to see these moved more into the serverless world, but, you know, baby steps. We’re getting there. The last of the stuff in this column is a ledger database.

Please don’t even get me started on this. I guess I have to define it. Fine. It’s a blockchain-based ledger database. It’s called the Amazon Quantum Ledger Database. It’s best if we just move on. It is extremely limited in its use cases. If you have one of these use cases, it’s extremely useful. It is an immutable ledger, so you can’t change things, sort of like double-entry accounting.

You can’t remove something once you put it in there; you need to make a counter-entry. I have a lot of thoughts on where it doesn’t belong, but we’ll talk about that another time. Time series: new at last year’s re:Invent, along with Neptune, was Amazon Timestream. That is an extremely powerful database if your data is based on a time series.

So you’re tracking things like events. If everything has a timestamp on it, time series might be where you want to go, depending on what you want to do with that data. Finally, we have key-value: Amazon DynamoDB. This is going to be the number one workhorse outside of relational databases for most of us, and we’ll actually dive into that console in a minute.

Although it’s been a long time since I’ve played with it, so bear with me. But yeah, we’ll get some hands-on experience with this. And then in-memory: very, very strong, fast, very, very powerful if you want an in-memory database. So that’s the breakdown of the database offerings. Which one do I use? The number one thing you need to think about, well, there are two stages. Basically, think about what you’re going to be storing, and then, based on that, the next question: how are you pulling data out? Like, what do you want to do with this data? And that’s absolutely critical. Take DocumentDB, for example.

A lot of the time, for mobile applications, we need to send a chunk of related information to a mobile client with one query. You’re trying to reduce network back-and-forth, so you say, hey, give me today’s episode metadata, and you get it all in one query with one quick lookup. The relational database has been around for 40-plus years.

It’s well known as far as its edge cases and how best to design and set things up. But yeah, taking your data out is absolutely the first, or the most critical, thing to think about, and that really will tell you what you’re going to be picking.

So I would highly recommend spending more time on writing data, on the write actions on your side. Put more work on yourself to write things in a data structure that makes them easier to pull out, because the majority of what most applications do is actually querying the data and pulling it out.

So let me walk you through a very simple example. Let’s look at a traditional relational database setup. So here we have names, and there are a bunch of fake names in there. Then we have buses. These would be bus routes: bus route one goes from the airport to the hotel, and bus route two goes from the hotel to the office, which is great.

And then we have a table that helps connect these, to say who has ridden on what bus. So if we look at my ID, my name ID is one. If we look at the buses, bus two, so I went from the hotel to the office. Name ID one, that’s me again, rode bus one.

I went from the airport to the hotel, right? So that’s how we connect these two pieces of information. In the names table I’m tracking all the people, so I know about all my bus riders. In the buses table I’m tracking all my bus routes. And then I created this new table called names to buses that links these two, to make it possible to query and say which buses did Mark take, or which riders were on this bus. You can go either way.

And what ends up happening is you write a query that looks kind of like this. Right, so this is a standard relational database query written in SQL. Let me just double-check the wording as well. If we have our standard relational database query, it does break down into something like English. It’s not that hard to understand.

I think, when you break it down. So, SELECT: give me. Star is everything. FROM names. Okay, that makes sense: give me everything from names, and I’m going to call names “n” just to make it easier, so instead of typing out names I just use the letter n. INNER JOIN, which means connect two tables: names to buses, and I’ll call that n2b, ON n.id, so the name ID, equals the names to buses name ID. Right, does that make sense?

I’m looking up the same thing. So I’m saying: in table names there’s a field called ID; in names to buses there is a field called name ID. When they’re the same, connect those two tables, all the values in those two tables. Furthermore, here’s another inner join: furthermore, connect the table buses on the bus ID.

That’s when it’s equal to the names to buses bus ID. So match the name ID in the names to buses table, and then also grab me the buses information when the bus ID equals the same bus that’s in names to buses. What that effectively does is connect these three tables. So for Mark Nunnikhoven, I have an ID of one here; in names to buses I have a name ID of one and then a bus ID of two, so grab that, right? And then for the second row, I’m going to get another entry where it’s Mark Nunnikhoven, 1, 1, airport, hotel.
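The query walked through above can be tried end to end with an in-memory SQLite database. This is a sketch under assumptions: the exact table and column names on the slide are not in the transcript, so `names_to_buses`, `name_id`, `bus_id`, `origin`, `destination`, and the sample rows are stand-ins, and the `WHERE` and `ORDER BY` clauses are added here only to keep the output small and predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (id INTEGER PRIMARY KEY, first TEXT, last TEXT);
    CREATE TABLE buses (id INTEGER PRIMARY KEY, origin TEXT, destination TEXT);
    CREATE TABLE names_to_buses (name_id INTEGER, bus_id INTEGER);

    -- Hypothetical sample data mirroring the shape of the example.
    INSERT INTO names VALUES (1, 'Mark', 'Example'), (2, 'Jen', 'Sample');
    INSERT INTO buses VALUES (1, 'airport', 'hotel'), (2, 'hotel', 'office');
    INSERT INTO names_to_buses VALUES (1, 2), (1, 1), (2, 1);
""")

# Two inner joins: names -> connection table -> buses.
query = """
    SELECT n.first, n.last, b.id, b.origin, b.destination
    FROM names AS n
    INNER JOIN names_to_buses AS n2b ON n.id = n2b.name_id
    INNER JOIN buses AS b ON b.id = n2b.bus_id
    WHERE n.id = ?
    ORDER BY b.id
"""
rows = conn.execute(query, (1,)).fetchall()
for row in rows:
    print(row)
```

Rider 1 comes back once per ride recorded in the connection table, which is the "one row per bus ride" behaviour described in the walkthrough.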

So there will be two rows in the result of this, one for each of the bus rides I took. That is almost standard, literally textbook SQL. It’s clunky, but it’s all good; it works. But if you start to expand beyond these relationships, and this is a second-order, or rather a first-order, relationship, where we have names to buses and that’s what we’re connecting, if you start to do second- and third-order relationships, or nth-order relationships, it gets really bad really fast.

You get these monster queries, and you make more and more of these names-to-buses kind of tables, these odd connection tables. And yes, it is a relational database, but I think if you’ve been building anything for any length of time, you know that database technology like this gets messy. So let’s take this concept, this exact same concept.

We’re going to look at how a graph database would store it, so in this case, Amazon Neptune. We’ve got the exact same information here, but I think you, as a user, understand this far better, right? We have our riders on the outside in the blue circles, and in the green circles we’ve got our buses.

So the blue circles are riders, the green circles are our buses, and then we have relationships. Now, all these lines, I just colored them differently to make it easier to see them. But you see we have a relationship, a thin line because I’ve only done it once or twice, and then Mark rides the number two bus, but it’s a big thick line because I ride that bus far more often. Same with Jen in the opposite corner.

She rides the first bus quite a bit more than she rides the second one. Arlene here only rides the one bus, and Walter has moderate rides. Now, with this as a graph database, it stores it in a whole bunch of really cool data structures in the back end that you don’t have to worry about.

Conceptually, you just need to think about this kind of graph structure that you see on the screen right now on this slide. Now, when we want to query it, the syntax is going to change depending on the graph database that you’re working with, but the concept is all the same. So you see on the left-hand side here, under the buses, this document structure.

That is literally a query in GraphQL. So I want bus, and I want the names of all the riders related to that bus, first and last. That’s it. That’s a far simpler relationship pull than you’d write in a relational database, right? And graphs can get really complicated really, really quickly and stay really understandable. A fellow community hero, Gemma Eileen Smith, gave a great talk at the Public Sector Summit in DC on how she had helped organize in New York, using open data from the city to make relationship connections between various actions.

So when there was, like, a spike in parking tickets, they connected that to 311 complaints and construction in the area, and this is far simpler to store in a graph database, right? Because graph databases are all about relationships. And it seems very backwards compared to a relational database, where you need tables to connect to other tables to make multiple relationships between them.

Whereas the graph database, and forgive the rapid scrolling here, the graph database structure is how you would draw it out. If I said, draw this out on a whiteboard, tell me who takes what bus, you would draw something far more similar to this rather than table after table after table of information.

So this is just a quick illustration of why you would pick one database over the other. If we’re trying to figure out bus ridership in an area, a graph database makes way more sense, especially if we start to add in additional data points like places. So if we start to add in places, where did they get off on these bus routes, right? We could add in a place circle for the office, and then we can quickly see how many people are getting off there and which bus routes are giving more service to each of these places when there are multiple routes in place. It’s just a far simpler way to understand.
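The rider and bus graph can be sketched in a few lines of plain Python. This is not Neptune and not a graph query language; it only illustrates the shape of the data. The ride counts stand in for line thickness and are made up, as are the `bus-1`/`bus-2` labels and which bus Walter rides.

```python
from collections import defaultdict

# Edges are (rider, bus, ride_count). Counts are hypothetical; the slide only
# shows thicker lines for more frequent rides.
rides = [
    ("Mark", "bus-1", 2),
    ("Mark", "bus-2", 10),
    ("Jen", "bus-1", 8),
    ("Jen", "bus-2", 3),
    ("Walter", "bus-1", 5),
]

# Index the same edges from both ends so the graph can be walked either way.
riders_by_bus = defaultdict(dict)
buses_by_rider = defaultdict(dict)
for rider, bus, count in rides:
    riders_by_bus[bus][rider] = count
    buses_by_rider[rider][bus] = count


def riders_on(bus):
    """Names of every rider connected to a bus, most frequent rider first."""
    edges = riders_by_bus[bus]
    return sorted(edges, key=edges.get, reverse=True)


print(riders_on("bus-1"))
print(buses_by_rider["Mark"])
```

Adding places would just mean another kind of node and another list of edges, with no connection tables to design.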

So that’s Amazon Neptune, which I wish would come out with a serverless version, because I don’t want to manage instances. But that is a core difference between those types of databases, and you have to keep asking those similar questions when you’re looking, sorry, here we go, when you’re looking across these different options.

Which one do you pick? It really depends on how you want to pull data out. What do you want to do with that data, right? You’re never going to pick something like Amazon ElastiCache for long-term storage. It’s what you’re going to pick to speed things up; you shove things into it and it sits in front. These in-memory databases tend to sit in front of another database. A lot of the time, key-value is what you actually want when you’re reaching for a relational database, and Amazon DynamoDB makes it really easy to do.
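The idea of an in-memory store sitting in front of another database is often implemented as a cache-aside read path. Here is a minimal sketch with a dictionary standing in for the cache and a small class standing in for the slower database of record; both classes and the sample keys are hypothetical, and no ElastiCache API is used.

```python
class SlowDatabase:
    """Stand-in for the database of record; counts how often it is actually read."""

    def __init__(self, rows):
        self._rows = rows
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._rows.get(key)


class CachedReader:
    """Cache-aside: check memory first, fall back to the database, then remember."""

    def __init__(self, database):
        self._db = database
        self._cache = {}

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._db.get(key)
        if value is not None:
            self._cache[key] = value  # only cache values that exist
        return value


db = SlowDatabase({"bus-1": "airport -> hotel", "bus-2": "hotel -> office"})
reader = CachedReader(db)
reader.get("bus-1")  # first read goes to the database
reader.get("bus-1")  # second read is served from memory
print(db.reads)
```

The cache holds nothing that the database does not also hold, which is why it is a poor choice for long-term storage and a good one for speeding up repeated reads.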

Back to my screen. So if I go to DynamoDB, I’ve got myself in the Management Console. It’s a flexible NoSQL database service. You create tables, and it’s very straightforward. So we’re going to create a table. We’re going to call this, let’s go with the same data, buses. The primary key is bus ID, and it’s going to be a number, right? We’re also going to add a sort key, which is going to be place.

So we have a table named buses with a bus ID; that’s our primary key. We are adding a sort key, and our sort key is going to be place, so we’re going to sort by that. And then we’re going to use the default settings. Now, I don’t have auto scaling enabled by default.

That’s a relatively new DynamoDB service, or rather feature, and you really should enable it. We don’t have time to go through the documentation right now. Basically, what it allows you to do is let DynamoDB adjust capacity for you automatically. That’s fine. We’re going to just click the blue Create button, and this is going to create our table. So it’s being created right now.
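For anyone who prefers code to console clicks, the same table can be described as a request for the DynamoDB `CreateTable` API. The sketch below only builds the request as a dictionary so it runs without AWS access; the attribute names `bus_id` and `place` and the capacity numbers are assumptions, not values read off the stream.

```python
def buses_table_definition(read_units=5, write_units=5):
    """Build a CreateTable request for a 'buses' table keyed on bus_id, sorted by place."""
    return {
        "TableName": "buses",
        "AttributeDefinitions": [
            {"AttributeName": "bus_id", "AttributeType": "N"},  # number
            {"AttributeName": "place", "AttributeType": "S"},   # string
        ],
        "KeySchema": [
            {"AttributeName": "bus_id", "KeyType": "HASH"},   # partition key
            {"AttributeName": "place", "KeyType": "RANGE"},   # sort key
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


params = buses_table_definition()
# With boto3 installed and credentials configured, this could be sent with:
#   boto3.client("dynamodb").create_table(**params)
print(params["KeySchema"])
```

Only the key attributes are declared up front; every other attribute on an item is free-form, which is what makes the later "just add a second stop" step possible.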

Which is great. So the idea here is that you don’t want to actually end up connecting tables like we did in the SQL. So if we look at our overview, the table has been created. We don’t have any streams. That’s fine. We have no items, so we can start to add items.

So here we’ve got a bus ID and we’ve got a place. We can add: our bus ID is 1, place is office. We can actually add another attribute, second stop, and its value is going to be hotel. So we can start creating our items like this, in a standard sort of relational database way, but you can start to get really complicated and actually add connections. So you can set it up like we did with the traditional database, and I can create a second table.

I can create my names table. My primary key is going to be first, and I’ll add a sort key of last, so that’s going to create a second one. And what you can do here, because of the way this works... now, forgive me if I am breaking a lot of Dynamo rules.

I probably am. I am not as up to speed on my Dynamo-optimized structures as I should be. But if I take this, I can add a binary, let’s add a binary set, and I’m going to add rides, and in this case what I’m going to do... There we go.

And then I add a second ride. You’re probably wondering what I’m doing here. Null or empty values: one or more parameter values are invalid. The perils of live streams, right? And then within that record directly, I’m going to get that relationship of how many bus rides are taken.

So I’m going to keep a binary set... no, I don’t want binary, that’s the problem. What I want is a number set, not a binary set. Or a map; let’s see if I can get a map. Let me kill this. Remove. Append. Number set. My apologies there.

Two. Perfect. So save that. That’s down by a table, to optimize this overall. So instead of creating multiple connecting tables, I have buses, so I’ve got a bus ID, a place, and a stop, but also in my rider record I am keeping track of rides. And that is because most of the time I want to know what my riders are up to: are multiple riders taking similar buses, so I can combine routes maybe later. But it’s also possible to query based on the ride values to then line up and see who’s taking the bus. It’s far more performant this way.
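The two items built in the console can be written out in DynamoDB's typed attribute format, where numbers are `N`, strings are `S`, and a number set is `NS`. The helper below is a simplified, hypothetical converter covering only those types; the attribute names (`second_stop`, `rides`) and the rider's last name are stand-ins for what was typed on screen.

```python
def to_attribute(value):
    """Convert a Python value into DynamoDB's typed attribute-value format."""
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (set, frozenset)) and value and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in value
    ):
        return {"NS": [str(v) for v in sorted(value)]}  # number set, not binary set
    raise TypeError(f"unsupported value: {value!r}")


def to_item(record):
    """Convert a flat dict into an item suitable for a PutItem request."""
    return {name: to_attribute(value) for name, value in record.items()}


bus_item = to_item({"bus_id": 1, "place": "office", "second_stop": "hotel"})
rider_item = to_item({"first": "Mark", "last": "Example", "rides": {1, 2}})
# With boto3 and credentials, an item could be written with:
#   boto3.client("dynamodb").put_item(TableName="names", Item=rider_item)
print(bus_item)
print(rider_item["rides"])
```

Keeping `rides` on the rider item is what removes the need for a separate connection table in this design.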

Again, there are probably six different ways to design this in a different structure. But if I want to query this, I can do the same thing. I can write a simple query that goes across these two tables and essentially say, you know what, for rider Mark, I want to see what the bus rides are.

What I would do when I’m building an application is I would actually just query the buses, because I have a finite number, far fewer bus routes than riders, keep that in memory, and then just say: Mark, okay, you rode bus two, what’s that one? Bus one? Okay, great, what’s that one? And then, because this is a set, you could also walk over the set and say, how many times have I taken bus number two? So if I change this record again, if I go into my rides and add another ride on two, now I’ve taken multiple buses. Of course, duplicates are a problem because I’m not setting this up correctly, but that kind of concept works similarly, right? So, a very rough example of how to use Dynamo.
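One detail behind the duplicates problem: a set holds each value at most once, so it can say which buses a rider took but not how many times. A rough sketch of the difference, using a hypothetical ride log and a small in-memory route lookup like the one described above:

```python
from collections import Counter

# Hypothetical data: the order in which one rider took buses, by bus id.
ride_log = [2, 1, 2, 2]

# The small, finite list of routes kept in application memory.
routes = {1: "airport -> hotel", 2: "hotel -> office"}

as_set = set(ride_log)         # which buses were taken; repeats collapse
as_counts = Counter(ride_log)  # bus id -> number of rides

print([routes[bus] for bus in sorted(as_set)])
print(as_counts[2])
```

If ride counts matter, a map from bus ID to count (or a list of individual rides) is one alternative to a number set; which is best depends on how the data will be read back.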

It’s rough simply because I’m out of practice, which I apologize for, but you get the concept, right? We walked through, let me share that again real fast, we walked through some of the... and of course I closed it. We walked through some of the different types of database offerings.

So if you’re looking for relational, so if you’re setting things up in tables, you want to be sitting in Aurora. For document, DocumentDB. Then there’s graph: if you’re thinking of things like this, a graph database, Amazon Neptune is the way to go. Timestream is really intriguing to me, actually. I don’t think it gets enough play. Granted, it’s still in its early stages, but a lot of stuff that gets shoved into relational databases is really time series data.

So if you have something that’s explicitly time series and you’re thinking about it in the context of time, take a look at Timestream. A lot of cloud-native designs actually end up in DynamoDB. Do not let my fumbling around in it dissuade you. It is the best database service out there, I think, especially in the AWS cloud.

It is quite powerful. There are some phenomenal courses out there on DynamoDB, some great information, and some people much better than I am at it. And it’s a great reminder that I just need to get my hands dirty a little bit more. I need to dive in. So that’s a quick overview of the database services.

I appreciate you joining me here on LinkedIn. I appreciate you joining me on Twitter. Hopefully, this has been useful. Please let me know if you have any questions or what you’d like to see in the next stream. It’s probably going to be next week; I’ve got a bunch of stuff to do for re:Invent this week.

So I’m trying to get that out. I always appreciate your time, and let me know what you think. We’ll talk to you soon. Thanks for joining.

