
Road to re:Invent - AWS Machine Learning

AWS has recently exploded with great machine learning services, from the core building blocks, to services designed to teach you techniques, to simple transactional services that just get the job done. This stream provides an overview of these services and when you might want to use them.


Here are the key slides from the live stream:

A list of AWS machine learning services in three columns, divided into four groups: transactional, building blocks, partially managed, and tutorial

Reasonably Accurate Transcript

[00:00:00] Mark: All right, everybody. There we go. How's it going today? Thank you for joining. Um, my name is Mark Donovan. Uh, you can hit me up on social, uh, @marknca. Obviously you've already figured that out 'cause you're here on the stream on LinkedIn, uh, or on, uh, Twitter. This is the final one in a series of streams that I've been doing leading up to AWS re:Invent, the final one simply because AWS re:Invent is four days away.

[00:00:29] Four. That's crazy. Uh, my schedule has been all over the map, um, uh, as it gets busier and busier and busier leading up to re:Invent. Um, so today we're going to cover the basics of AWS machine learning. We're going to look at a few of the services, uh, we're going to see how the services break down, um, and then, uh, we are going to, um, see a couple of examples and then I'll give you a few tips for re:Invent if you're coming.

[00:00:57] Um, and a few tips if you're not coming, actually, [00:01:00] because there's a lot you can do even if you aren't in Vegas next week for, um, AWS re:Invent. So let's dive in right away. Let me share out, uh, Chrome, this one. All right. So we've done a bunch of these already. If you want, I popped the link already in the LinkedIn Live comments.

[00:01:21] Um, basically on my site, markn.ca, uh, the main page's pinned post is this one on re:Invent and you can see all the previous streams. So if you scroll down, you're going to see, uh, my guide to re:Invent. You're going to see, um, the, uh, practical security session guide I did, also the talks I'm giving at re:Invent, and then, um, all the different streams.

[00:01:40] So you can see the last one we did, uh, was on compute. So you can go there. You can also see the transcript, and actually, the reason why I wanted to show this is that the transcript is something where you're going to see the service that we can use to do that, um, with audio or video, today. So I think that makes a lot of sense.

[00:01:57] Um, hopefully it does. So, bottom line, check out that [00:02:00] link if you want to see the previous stuff. Today we're going to tackle machine learning, which, um, I think easily sums up, um, with this slide. So if you have any questions as we're going through, uh, hit me up here on LinkedIn or on Twitter.

[00:02:16] Um, and I will respond in kind. I find it's always best to answer questions as we're going as opposed to queuing them up at the end. So, um, just, just stick it in the chat there, um, and, and we'll dive in. Um, you know, I've already seen some, some great questions, uh, from folks already this morning. I'm trying to answer as we go, uh, or as we were leading up.

[00:02:33] But, um, here we are. We're looking at, uh, all of the services that AWS labels as machine learning services. Now there's one that's not on here that's actually been retired. Unless you already have an account and are using it, you can no longer use it, and that was actually called, wait for it, AWS Machine Learning.

[00:02:53] Horrible name. I don't know why they picked it in the beginning. I mean, I know why they picked it because it made perfect sense, but that service is actually being [00:03:00] retired. Um, I, if you have used it on an account already, you can still access it, but new accounts can't. So I left it off there because as you can see, there's already a crazy amount of machine learning services.

[00:03:12] So we're going to start from the transactional services. Now, that's where most of these services are. Now, these are just my categories because I find, uh, this is the best way to think about machine learning on AWS. Now, if you want to know my credentials, um, I am a part of the IEEE, um, the Institute of Electrical and Electronics Engineers.

[00:03:32] As part of that, I've been, um, part of the, um, community for computational intelligence the last 12 or 13 years or so. I've really enjoyed, uh, machine learning, um, AI subsets, you know, um, all these different areas. Uh, it's been a passion and a hobby of mine for a while. So I love that it's gaining, um, really big, uh, sort of mainstream attention now.

[00:03:53] So, um, I know a lot about what's gone into a lot of this stuff. Um, and today we're just going to kind of go over it and show you how you can get [00:04:00] started. So most of the time you're going to get started with these transactional services. And the reason why I call them transactional is because you don't need to build up any models.

[00:04:08] You don't need to, um, have any of the underlying stuff that makes any of this work. It's simply: you give the service something, the service returns results to you. So we've got Transcribe, um, and, uh, Transcribe right off the bat. Just go one by one. It's easier. So Transcribe takes audio, gives you back text.

[00:04:27] Translate takes one language, gives you back another. Comprehend does natural language processing, which is a subset of machine learning where it looks at how language is structured. So you give Amazon Comprehend a set of text and it's going to give you back the structure of that text. It's going to identify verbs, nouns, um, it's going to give you the semantics of that text.
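As a rough illustration of the "give the service something, get results back" pattern, here is a minimal Python sketch for Translate and Comprehend. It assumes the boto3 SDK and working AWS credentials, neither of which is mentioned in the stream; the helper names `translate_text` and `parts_of_speech` are made up for this example. The client is passed in so the helpers can also be exercised without an AWS account.

```python
def translate_text(client, text, source_lang, target_lang):
    """Give Amazon Translate text in one language, get back another."""
    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,
        TargetLanguageCode=target_lang,
    )
    return response["TranslatedText"]


def parts_of_speech(client, text, language_code="en"):
    """Give Amazon Comprehend text, get back (word, part-of-speech tag) pairs."""
    response = client.detect_syntax(Text=text, LanguageCode=language_code)
    return [
        (token["Text"], token["PartOfSpeech"]["Tag"])
        for token in response["SyntaxTokens"]
    ]


# Usage against the real services (needs boto3, credentials, and a region):
#   import boto3
#   print(translate_text(boto3.client("translate"), "Hello, everybody", "en", "fr"))
#   print(parts_of_speech(boto3.client("comprehend"), "Transcribe takes audio."))
```

Nothing here trains or hosts a model: each helper is one request and one response, which is the sense in which these services are transactional.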

[00:04:51] It's going to give you topic modeling. Um, I actually did a course on this service for Pluralsight. Um, it's really, really interesting, but if you're not a language nerd, or if you're not trying to break down what [00:05:00] text, uh, means, then it's not super useful. Um, but it's still very, very interesting. Amazon Textract.

[00:05:07] It's actually a good name. Um, it's a, it's a portmanteau of text and extract. So you give this an image or a PDF and it's going to actually pull out text. Um, very, very cool. Um, and Lee, I see your comment there on pen testing, uh, machine learning for automation. Um, there is actually some really cool stuff you can do, um, tied to machine learning and some of these services may pay off for you.

[00:05:30] Um, but specifically, uh, for pen testing, one of the absolute best areas of study, Lee, is, um, fuzzy logic. So fuzzy logic, you're going to see that in fuzzers, which basically goes, Hey, algorithm, figure out a whole bunch of weird inputs and throw them against something and see what happens. Um, and that's actually a subset of machine learning, um, which is under the bigger umbrella of AI.

[00:05:52] Uh, so Textract, like I said, give it an image, give it a PDF, it'll give you back, um, some, uh, some, some text information. Um, I [00:06:00] did a video when that launched, actually, I'll link that, um, in the chat. Because that was, um, super useful, uh, and actually very, very cool, uh, Textract. I cannot spell Textract to save my life, which is ridiculous, um, but, uh, here you go.
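For readers who want to try the "give it an image, get back text" flow, here is a small Python sketch. It assumes the boto3 SDK and AWS credentials; `extract_lines` and the file name in the usage comment are hypothetical. It uses the synchronous `detect_document_text` call, which is aimed at single images or pages; longer documents would likely need Textract's asynchronous API instead.

```python
def extract_lines(client, document_bytes):
    """Send one image or page to Amazon Textract and return its lines of text."""
    response = client.detect_document_text(Document={"Bytes": document_bytes})
    # The response is a flat list of blocks (PAGE, LINE, WORD); keep only lines.
    return [
        block["Text"]
        for block in response["Blocks"]
        if block["BlockType"] == "LINE"
    ]


# Usage against the real service (needs boto3, credentials, and a region):
#   import boto3
#   with open("receipt.png", "rb") as f:   # hypothetical file
#       for line in extract_lines(boto3.client("textract"), f.read()):
#           print(line)
```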

[00:06:16] So this is that video for Textract. Amazon Textract. Uh, while we're doing that, I just wanted to, um, call out something that was really interesting. And I misspelled Textract in the thing, but you get the deal. Um, if you're wondering why some of these are Amazon, some of them are AWS, Corey Quinn actually called this out the other day, because not a lot of people know it.

[00:06:35] Amazon services tend to be able to be used on their own. AWS-prefixed services mean that you need their building block, right? So you can see all these transactional ones are Amazon. So Textract: give it an image, give it a PDF, get back text. Forecast: you give it a bunch of data and it's going to do predictions.

[00:06:51] It's going to do extrapolations out. Um, so this is great. If you have a bunch of like sales information, um, you can do it and say, [00:07:00] Hey, here's all the sales transactions for 2019. What does 2020 look like? Um, same with, um, actually interesting, uh, back to Lee's, uh, comment, um, around pen testing and machine learning.

[00:07:10] Forecast is really cool if you shove a whole bunch of, um, network transactional data into it, as far as source and destination, to see where it's going to forecast network patterns in the future. Um, very cool. Could probably do a whole nother stream on that. Lex is actually the basis for Alexa. Um, it is a chatbot service.

[00:07:28] Thanks. Rekognition is the one that you've seen, uh, unfortunately in the news quite a bit. It's the facial recognition. It's not just facial recognition. It's actually, um, image processing and image recognition. We're going to dive into that a little bit. Amazon Personalize is a personalization recommendation engine, so you can give it a whole bunch of buying habits, and it's going to say, people who bought this also bought that.

[00:07:52] So it powers, um, Amazon.com's recommendation system, Amazon Personalize, um, which is kind of cool. And then Amazon Polly is the last one. Think [00:08:00] Polly want a cracker. You give it, um, uh, text, it gives you voice, uh, and you can pick which voice you want, which is kind of cool. Um, so, uh, those are the transactional services.
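The Polly flow (text in, voice out, with a choice of voice) can be sketched the same way. This assumes boto3 and AWS credentials; `text_to_speech` and the output file name are hypothetical, and "Joanna" is just one of the voices Polly offers.

```python
def text_to_speech(client, text, voice_id="Joanna", output_format="mp3"):
    """Give Amazon Polly text and a voice, get back the audio as bytes."""
    response = client.synthesize_speech(
        Text=text,
        VoiceId=voice_id,
        OutputFormat=output_format,
    )
    # AudioStream is a file-like object; read it fully into memory.
    return response["AudioStream"].read()


# Usage against the real service (needs boto3, credentials, and a region):
#   import boto3
#   audio = text_to_speech(boto3.client("polly"), "Polly want a cracker")
#   with open("polly.mp3", "wb") as f:   # hypothetical output file
#       f.write(audio)
```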

[00:08:12] We're going to dive into a couple of those in a second. Um, because they're really great for, uh, demos, um, and we'll, we'll take up most of the time today, I think, doing those demos, but then I wanted to call out some of the ones that you're probably not ready for yet. And I think it's good just to know that they exist. Most of the time,

[00:08:28] if you're just starting with machine learning, you are going to be diving into these transactional services, but there are these building block services. So there is an AMI for EC2. That's an Amazon Machine Image. Okay. Um, and that is preloaded with a bunch of software that helps you run machine learning models.

[00:08:45] So if you're going to try to run your own models and build out your own machine learning system, start with the AWS Deep Learning AMI that is managed by AWS. They keep the versions current. And basically once you deploy that server, it's preinstalled with a whole bunch [00:09:00] of stuff, um, which is great. Uh, one of those things that it's preinstalled with is, uh, Apache MXNet, which is a machine learning library that lets you do a whole bunch of cool stuff.

[00:09:09] Uh, Amazon or AWS actually maintains, um, that, uh, they're the leaders on that project. Um, and then they also keep a fork of TensorFlow as well. These two things, if you don't know what they are, don't worry about it. They are the libraries that power some of those transactional services. But if you want to go down a level and get into the nuts and bolts, like I said, these are building blocks.

[00:09:27] Um, Amazon Elastic Inference is a GPU acceleration service. Um, now what that means, a GPU is a graphics processing unit. Your computer that you're working on right now has a CPU and a GPU. The CPU is the central processing unit. It's a generic thinking machine, right? It's a generic processor, where it'll just go through functions and process computations.

[00:09:54] The GPU is designed around graphics, but it turns out while building all these graphics processors over the years, [00:10:00] we actually optimized them for a different type of math. That math lines up very, very nicely with machine learning, and Amazon Elastic Inference lets you take a graphics processing unit and attach it to your, um, normal servers, not to run graphics, not to run games or things like that, but to do the type of math that we need to do in machine learning.
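The stream doesn't name the math in question. A reasonable assumption is that it means the matrix arithmetic at the heart of neural networks, where a single layer is a matrix of weights multiplied by a vector of inputs. The plain-Python sketch below (illustrative numbers, hypothetical `dense_layer` helper) shows why that suits a GPU: each output is an independent sum of products, so thousands of them can be computed at once.

```python
def dense_layer(weights, inputs, biases):
    """One neural-network layer: outputs[i] = sum_j weights[i][j] * inputs[j] + biases[i]."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Three outputs computed from two inputs; no output depends on another,
# so on a GPU every row could be handled by a separate core in parallel.
weights = [[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]]
inputs = [2.0, 1.0]
biases = [0.0, 1.0, -1.0]
outputs = dense_layer(weights, inputs, biases)
print(outputs)  # [4.0, 1.0, 2.0]
```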

[00:10:20] So it basically lowers the cost of your machine learning and makes it faster. Um, so building blocks, really cool if you need to build them out. But where, if you want to go deeper than the transactional, you go into these partially managed, which is essentially one service, uh, Amazon SageMaker. Very, very cool.

[00:10:36] It's a, um, set of, uh, containers where you set up a Jupyter notebook, which is a data science notebook. It's basically a mix of Python code and data and, uh, some other code can be in there too, but it normally defaults to Python. And then SageMaker takes care of all the infrastructure to spin up a container fleet to run through all of this [00:11:00] stuff and give you these models.

[00:11:02] So if you have something where you're doing design work and you go like, I don't know what the best shape of a phone is, Um, it's this by the way, um, though, the new folding phone, super cool. Um, you can run it through SageMaker and run like a thousand concurrent models and it will come back and say, Hey, here's the crazy new shape that the computer came up with.

[00:11:19] So very cool. If you're kind of getting past the transactional services and want to do more, um, SageMaker Ground Truth is actually a labeling service where it uses both machine learning and humans to label data. So it's setting a ground truth. And the interesting thing is anytime you're using machine learning, if you're, uh, in the building blocks or partially managed,

[00:11:39] you have to create a data set to train the machine to learn on basically, right? You're creating a curriculum for the learning. So you need a whole bunch of images that are labeled in order to make an image processing service or a transactional image service, right? So SageMaker Ground Truth helps you do that, um, farm that work out and say, Hey, what are all these [00:12:00] pictures of?

[00:12:00] Or can you make sure that all these numbers or credit card numbers are formatted the correct way? Don't use credit card numbers, but that was just the first thing. Security guy. First thing that came to my mind. Um, if you go, okay, I want all these phone numbers, uh, or it's formatted the same way, or I want all these, um, you know, whatever cleaned up and, um, smoothed out.

[00:12:17] Ground Truth is the way to go. And then finally we have two tutorial services. Now these are really, really clever services. AWS built these to teach you different machine learning techniques. And the, uh, the way they did that was they attached it to hardware. So, um, this is actually a DeepLens camera. So you can check that out.

[00:12:37] Um, essentially a normal webcam with a little computer attached to it, right? So it's super clunky. But the interesting thing is that this actually runs, uh, a machine learning model on it. So we'll see that in a second when we do a demo of Rekognition. Um, but this is a way where you could set this up and it will actually recognize things and label parts of the image and say, like, this is Mark, [00:13:00] this is Mark's head, you know, this is a human's head.

[00:13:02] Right. This is a webcam, that kind of stuff. So they did this and they also did DeepRacer, which is a car. It's a remote control car, um, where you don't remotely control it. It controls itself. Now, um, this was a simple, uh, training model, uh, that they built to show you how to build models for image recognition.

[00:13:22] Um, DeepRacer is all about, um, oh geez, I can't, I'm blanking on the term, but it's another way of teaching you a different type of, um, machine learning. Of course, I'm going to, I gotta, I gotta Google that. It is, come on. Yes. Reinforcement learning. Sorry. Reinforcement learning. So that basically tells you, you give the DeepRacer model a set of parameters and say, I want you to value turning, you know, being faster to the next waypoint versus being more accurate. [00:14:00]
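In DeepRacer, that trade-off is expressed as a reward function: a Python function that receives a dictionary of measurements about the car and returns a number, with higher meaning better. The sketch below is one hypothetical way to weight speed against accuracy. The weights and `MAX_SPEED` are assumptions chosen for illustration, not values from the stream, and the dictionary keys are the ones DeepRacer is understood to pass in.

```python
SPEED_WEIGHT = 0.7     # hypothetical: how much to value going fast
ACCURACY_WEIGHT = 0.3  # hypothetical: how much to value hugging the center line
MAX_SPEED = 4.0        # assumed top speed in metres per second


def reward_function(params):
    """Score one moment of driving; reinforcement learning maximizes this over time."""
    if not params["all_wheels_on_track"]:
        return 1e-3  # near-zero reward for leaving the track

    # Accuracy: 1.0 on the center line, falling to 0.0 at the track edge.
    half_width = params["track_width"] / 2.0
    accuracy = max(0.0, 1.0 - params["distance_from_center"] / half_width)

    # Speed: scaled to the 0.0-1.0 range.
    speed = min(params["speed"] / MAX_SPEED, 1.0)

    return float(SPEED_WEIGHT * speed + ACCURACY_WEIGHT * accuracy)
```

Swapping the two weights would tell the model to prefer a careful line over a fast one; the training process then works out how to drive to earn the most reward.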

[00:14:00] And again, it's a fun way to do it. In fact, it's so fun that there is the AWS DeepRacer League finals coming up next week in Las Vegas. Um, very, very cool. So that's an overview of how the machine learning services break down. For the vast majority of us, we are going to stick in the transactional services.

[00:14:17] And the reason for that is that we don't have to worry about the model. We don't have to worry about the training data. Um, we are just getting results. So let's flip over and see how that works right now. So we are in, uh, Amazon Rekognition. Now this is the one, like I said, where, uh, this is actually capable of doing facial recognition.

[00:14:37] So it's capable of saying, I see Mark in this photo, right? I see, uh, Lee in this photo. I see, um, you know, Mina in this photo. It's capable of doing that if you give it the appropriate photos with the labels. But what you can see right out of the gate is with their sample image. Um, you can see the results on the right hand side here.

[00:14:57] It's 98.8 [00:15:00] percent confident that it's found an automobile, that there's transportation here. That's the car. If you actually go on the little boxes, it will tell you where the AI thought it found something. So here it thinks it found a skateboard. And of course it did. But that's just a default training image or default demo image.
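The console demo is a front end for Rekognition's label detection API. Here is a sketch of the same request in Python, assuming boto3 and AWS credentials; the helper names are hypothetical. Each label comes back with a confidence score and, for some labels, bounding boxes expressed as fractions of the image size, which is what the "little boxes" are drawn from.

```python
def detect_labels(client, image_bytes, max_labels=10):
    """Send an image to Amazon Rekognition and return the labels it found."""
    response = client.detect_labels(Image={"Bytes": image_bytes}, MaxLabels=max_labels)
    return response["Labels"]


def box_to_pixels(box, image_width, image_height):
    """Convert a Rekognition bounding box (ratios of image size) to pixels."""
    return {
        "left": round(box["Left"] * image_width),
        "top": round(box["Top"] * image_height),
        "width": round(box["Width"] * image_width),
        "height": round(box["Height"] * image_height),
    }


def describe(labels, image_width, image_height):
    """Build one text line per label, plus one per box the service drew."""
    lines = []
    for label in labels:
        lines.append(f"{label['Name']}: {label['Confidence']:.1f}%")
        for instance in label.get("Instances", []):
            px = box_to_pixels(instance["BoundingBox"], image_width, image_height)
            lines.append(
                f"  box at ({px['left']}, {px['top']}), {px['width']}x{px['height']} px"
            )
    return lines


# Usage against the real service (needs boto3, credentials, and a region):
#   import boto3
#   with open("photo.jpg", "rb") as f:   # hypothetical file
#       labels = detect_labels(boto3.client("rekognition"), f.read())
#   print("\n".join(describe(labels, 800, 600)))
```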

[00:15:18] We're actually going to give it some images from, uh, the, um, from Unsplash. So I found a couple of these images. So we're going to use this one. Uh, this is a great photo from Andrew, um, Petrushev, um, where it's got a singer, um, out into an audience and a few other things. So we're going to take this photo and drop it into the demo.

[00:15:39] So if we drag this in here, it's going to take a second. And let's see what it comes up with. So now it's processed these results and it gives us our results on the side here as well as a confidence rating, right? So now the confidence rating, the 99.6 percent, is how confident [00:16:00] the service is that this is what's actually in the image.

[00:16:02] You'll notice it's not 100 percent because it's not sure, but based on the modeling, this is a really high confidence level and that's actually part of the, um, challenges around the facial recognition and why this service, um, there was a lot of, and don't get me wrong, but facial recognition is an absolutely critical thing that we need to discuss about its use, um, by government, by law enforcement.

[00:16:22] Um, I've actually covered that. If you go to my website, markn.ca, my regular tech column, uh, which I do here in Canada for the, uh, I have a local, um, radio program that I work with, as well as the local news outlets. We've covered it a ton. It's absolutely important. But one of the critical things that's missed in most of these tests is this confidence level.

[00:16:39] You absolutely need to understand your sort of risk tolerance or your error tolerance here. If we just want to figure out like, hey, what's kind of in this image? Um, this is great. This is fine. Anything over, uh, you know, 90 something or 85 something is normally good enough. If you're trying to actually recognize somebody and then, you know, take legal action, [00:17:00] this should be 99 or higher.
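One way to make that error tolerance explicit is to pick the confidence threshold per use case instead of accepting every result. The sketch below filters a response of the shape Rekognition returns; the `THRESHOLDS` table, the use-case names, and the sample labels are hypothetical illustrations built from the rough numbers mentioned above, not official guidance.

```python
# Hypothetical thresholds based on the numbers mentioned in the stream.
THRESHOLDS = {
    "browse": 85.0,    # "what's kind of in this image?"
    "identify": 99.0,  # recognizing a person with real consequences
}


def confident_labels(labels, use_case):
    """Keep only the label names that meet the threshold for this use case."""
    threshold = THRESHOLDS[use_case]
    return [label["Name"] for label in labels if label["Confidence"] >= threshold]


# Illustrative sample only; not the actual results shown in the demo.
sample = [
    {"Name": "Person", "Confidence": 99.6},
    {"Name": "Crowd", "Confidence": 93.2},
    {"Name": "Building", "Confidence": 86.5},
]
print(confident_labels(sample, "browse"))    # ['Person', 'Crowd', 'Building']
print(confident_labels(sample, "identify"))  # ['Person']
```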

[00:17:01] There's actually guidelines published by, um, Rekognition on it. Um, but back to the, back to the issue at hand here. So this has found a person. Great. Um, this has found a crowd. So if we look, it's, uh, identified, uh, didn't box it, but it's identified the background as a crowd. Then all these little people here, here, here, here, here.

[00:17:20] Um, this is pretty accurate, right? It thinks there's a building in here, which we know there is on the edge and in the background. Um, skin is kind of creepy, but it's found skin. Um, and if we go down, you'll see, actually, this is where it gets really interesting.
