marknca

Stopping mistakes & misconfigurations

Road to AWS re:Invent 2019


Full machine-generated transcript follows

All right. Good morning, everybody. How are you doing today? Let me flip this over; I've just got to double-check that we are live on Twitter as well. Yeah, we are. Perfect, so that's working. Awesome. It's always a good thing when the stream works first thing, which is nice. So I'm taking comments on LinkedIn.

We're taking comments on Periscope as well. I think I've got this set up, so it shouldn't be a problem. If you want to catch up on previous streams, we've got it all set up on my website. I dropped the link in both streams here.

It's going out in both, so you can catch up there. We're going to be talking today about misconfigurations and mistakes. There's a reason why I wanted to bring this up today. I think it's a really important topic, and I think it's a place where a lot of people fall down.

Last week, as you know, I was at ServerlessConf in NYC. It was really fantastic: great crew, awesome community. It's amazing how the serverless community has exploded over the last few years. I can't believe it has already been three or four years.

Yeah. We were talking about a bunch of stuff, and I gave a talk there on security, as I normally do, and I got some interesting feedback. I'm just adjusting my windows so that I can see all of your wonderful comments. So again, whether you comment on LinkedIn or on Periscope, I should be able to manage those live.

I think I've got this down after the sort of semi-catastrophic upgrade to Catalina. Of interesting note: I complained online about some issues with macOS, and Apple support actually pinged me right away. They said, "We want to call and talk to you because you found a pain point with a bad bug." But here's what I'm going to do: I'm going to flip over to Google Chrome.

I'm going to show you something. This is the page; I've put the link in the description already. You can look at all the previous streams here, and also the talks. I'm giving two talks. SEC204 is sort of my annual thought leadership talk, which I hate as a term, but it's basically what I do: every year I try to give you something new.

This year I'm kind of putting all the pieces together for security. I'm also hosting a customer talk with a great customer, Pivot. Jason Credits is a fantastic guy and a great speaker, and he'll talk about his journey. We're doing this stream right now, but you can see the previous streams.

So we did the live one on reserved seating, which you can watch, and you can watch the serverless one, where we talked about the basics of that. But as a result of going to ServerlessConf, I got fired up, wrote a post, and shared it on LinkedIn. I'm going to share it on Twitter too, but I'll do that soon. It covers what's happened over the last few years in serverless security.

I started researching this really heavily at the beginning of 2016, and here we are three years later. I originally proposed these areas of serverless security: the flow of data, choosing services and APIs, code quality, and monitoring production.

Fast forward, and here in 2019 we've adjusted, based on a few years of seeing things play out in the market. We've still got service selection: if you're handling healthcare information, for example, you need to make sure your services can handle HIPAA data, things like that. Then there's quality in your functions, data flow between the services that you're stitching together, and configuration validation. Configuration validation ties into the problem that we're talking about today, which is misconfiguration.

So if you look at how I wrote up the talk, I'm just going to scroll to the bottom here. Please bear with me as I scroll. I thought the talk went really well. I tried to make it a little more personal.

What I want to talk about here is that the number one threat in serverless, and really the number one threat in cloud today, is misconfigurations. People are making simple mistakes and exposing data needlessly, which is unfortunate. So if we take the risk of going to Google, all you need to do is type in "S3 data breach" and look what happens: you get millions of results.

Now, realistically, there is a finite number of these things; there are not millions of S3 data breaches. But we've gotten so many at this point. Here's a "leaky buckets" piece on Business Insights with the 10 worst Amazon breaches. If you've got to the point where misconfigurations are such a problem that you have top 10 lists, this is something worth diving into, and that's what we're going to talk about today.

So the first thing we're going to do is dive into the console. I'm logged into my console, I've gone into Services, and I've typed "config". This is Config.

It is probably the most ignored service, if we can say that, outside of SimpleDB, which is a hidden service. This is the first set of pages. Essentially, Config is a compliance tool. You can customize Config to pull in third-party information, but by default it uses AWS information.

What it's doing is looking at the CloudTrail logs to give you the state and time of anything that's happened in your AWS account, and that can be really, really useful. We have not set it up, so you get this initial page. What are we doing? We're going to record all resources in this region, including the global ones.

We are going to create a bucket, and we're just going to call it the default. That's fine; it's "AWS config blah blah blah". Are we going to stream these configurations? No. You could automatically stream these to an SNS topic, which gives you the ability to automate things even further, which is really cool.

I like that, but we're not going to do that today. We're going to let Config set up the role that it requires, and then we're just going to hit Next. Now we have the option to add rules if we want to; we're going to skip this.

We just hit Confirm, and now AWS Config is set up. So it's now tracking CloudTrail as changes are made. This is a demo account, so not much actually happens here. We're going to fire up something so that we can see what Config actually does. So now Config is set up.

It's unable to assume the role. That's always challenging, but it's been redesigned. Let's flip over to the new-look console, and we're launching an instance. This is very basic, very simple. We're just going to launch an EC2 instance, only so that we have something for Config to do.

I don't even care what it is. This is locked down with the security groups; it's not doing anything. It's just going to spin up a server. So now this is launching, which is great.

Now the challenge here is that because Config works off of CloudTrail data, it has to have that data in CloudTrail in order to actually do anything. And the real problem is that CloudTrail has no defined SLA for when it will drop information into your S3 bucket saying, "Hey, you launched a new instance." Based on experience, depending on the volume of transactions in your account and the size of your account, this could take two to four minutes.

So while this is going, I'm going to flip over to some slides and show you part of the talk that I have been giving at the Summits this year. Thankfully, Summit season is over, so I don't have to run around like mad anymore.

This is "Advanced Security Automation Made Simple", and the thing I wanted to call out is that when you're doing this, which is what we're going to do to make sure that we're catching our mistakes, you need to realize there's a fast lane and a slow lane.

Now, the slow lane is through CloudTrail, which is what Config uses, and CloudWatch Events is the fast lane. You may ask, why not go fast for everything? Really, it's about keeping track: you're going to hit some Lambda limits if you try to do everything there, and you're going to have overhead, whereas keeping some things in AWS Config makes sense, and you'll see that in a second.

So the idea here, and the reason why we started with this, is simple: you're going to make mistakes; you're going to have misconfigurations. What you can do is prevent people from taking actions in the AWS cloud, and that's what we use Identity and Access Management for. That's a whole other topic, and thankfully AWS just hired its first developer advocate for IAM, which is an amazing move on their part.

But what we have is the ability to prevent people from doing anything. So if Mark does not have the permission to spin up an instance, that's a really simple way to stop a misconfiguration. That's the absolute first thing you should be doing: applying the principle of least privilege. I have a slide for it, excuse the scrolling: grant only those privileges which are essential to perform the intended function.

So if I have a job that's simply reading one file out of S3, the account I'm using should only have the ability to read a file out of S3. I shouldn't be able to list files. I shouldn't be able to write them. I shouldn't have access to EC2 or DynamoDB or any other AWS service.
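To make the "read one file out of S3" example concrete, here is a minimal sketch of what such a policy document could look like, built as a Python dictionary. The bucket name and object key are hypothetical placeholders, not anything from the stream.

```python
import json

# Hypothetical bucket and key, used only for illustration.
BUCKET = "example-reports-bucket"
KEY = "daily/report.csv"


def read_one_object_policy(bucket: str, key: str) -> dict:
    """Build an IAM policy document that allows reading exactly one S3 object."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only GetObject: no listing, no writing, no other services.
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{key}"],
            }
        ],
    }


policy = read_one_object_policy(BUCKET, KEY)
print(json.dumps(policy, indent=2))
```

Anything not explicitly allowed is denied by default, so this job cannot list the bucket, write to it, or touch EC2 or DynamoDB.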

I should have just the permissions I need. Now, a lot of the time people don't do that. People apply much broader permissions, and trust me, I totally, one hundred percent understand why you do that. So this comes back to what we are talking about today, which is misconfigurations, right? Mistakes and misconfigurations.

So your ideal approach here is stopping people from doing stuff in the first place: prevent them from doing anything bad by simply locking those permissions down. That's the first thing that will prevent mistakes, right? The easiest mistake to fix is the one that you don't make. And that's what most of these S3 breaches are: they were mistakes, because by default, when you create a bucket in S3, it's locked down.

It's completely locked down, and only the person who created it can do anything with it. You have to explicitly add permissions, which means every time you've seen a big headline about another 10 million records exposed in S3, someone made a mistake; someone made a misconfiguration. So what we are going to do is set up some automations to catch those misconfigurations. We wanted to start with AWS Config because that covers the slow lane: things where it's not urgent.

If it happens, we're okay for a few minutes; if it's under 10 minutes, we can catch that and fix it. That's great. So those are the "we'd prefer" items. If you're using policy-based language, that would be "you should not do this", right? So you have the option to follow up with, "Mark, you did that. Are you really sure you were supposed to?"

Then there's "you must not do this", right? There you have the ability to intercept very, very quickly, in near real time. So it's seconds on the top and minutes on the bottom, and you act accordingly. Now if we flip back into our browser here, let's see if this has done anything. This is the sort of challenge we have: now our instance is up and running, which is great.

I don't really care about it; it's literally just there to show us something. We're going to refresh Config to see if Config has found it yet. Again, Config works off of the data that's sitting in CloudTrail. So you can see here in Config, it's already found some security groups that exist, but it doesn't have our EC2 instance yet.

So, let's see if we can find it. We'll see if it's there, and there it is. That should be our EC2 instance. Let's just double-check: E36. You can see here I'm looking at the instance ID, and the last three characters are E36. So this tells us everything we need to know about our EC2 instance, right? This is a really handy way to visualize what's going on in our environment. But if we go back to the dashboard, you'll see that we have the ability to add compliance checks, with the ability to add custom rules.

For example: this EC2 instance should always have a tag, right? So I can create a rule that says any time you see an instance that doesn't have tags, fire and say that it's non-compliant. Because Config is a compliance tool, that's always the framing: are you compliant, yes or no? It's looking for yes or no answers.

You can also create custom rules. We did this as a Trend Micro project originally when Config launched: we had a rule that said if Deep Security, which is our security product, wasn't running, send up non-compliant. Or say you have a certain log configuration: you know what, I want all my logs going to Splunk or going out to Sumo, something like that.

You can have a Config rule that says every time you see a new instance, run this rule and check for this piece of data, for this flag, and if it isn't there, flag this back as non-compliant. Now, Config used to have a really beautiful visualizer, and I think it's gone, which is unfortunate. There we go; now we've got some resources populating, so you can see these resources. And if we click on the VPC, this might be where they hid it. If they got rid of the visualizer, that really sucks, because the visualizer was amazing.
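The "instances should always have a tag" rule described above could be sketched as a custom Config rule like the one below. This is a minimal illustration under stated assumptions, not the rule from the stream: it assumes the rule is triggered by configuration changes, that the configuration item is delivered inline in `invokingEvent`, and that tags arrive as a key/value mapping. The `config_client` parameter is an addition for this sketch so the logic can be exercised without AWS credentials.

```python
import json


def evaluate_tags(configuration_item: dict) -> str:
    """Return a Config compliance type for one configuration item."""
    if configuration_item.get("resourceType") != "AWS::EC2::Instance":
        return "NOT_APPLICABLE"
    tags = configuration_item.get("tags") or {}
    # Config wants a yes/no answer: any tag at all counts as compliant here.
    return "COMPLIANT" if tags else "NON_COMPLIANT"


def lambda_handler(event, context, config_client=None):
    """Entry point for the custom rule; reports the result back to Config."""
    invoking_event = json.loads(event["invokingEvent"])
    item = invoking_event["configurationItem"]
    evaluation = {
        "ComplianceResourceType": item["resourceType"],
        "ComplianceResourceId": item["resourceId"],
        "ComplianceType": evaluate_tags(item),
        "OrderingTimestamp": item["configurationItemCaptureTime"],
    }
    if config_client is None:
        import boto3  # imported lazily so the pure logic above has no dependency

        config_client = boto3.client("config")
    config_client.put_evaluations(
        Evaluations=[evaluation],
        ResultToken=event["resultToken"],
    )
    return evaluation
```

A check for a running security agent or a required log destination would follow the same shape: inspect the configuration item, return compliant or non-compliant.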

So this gives us the ability to look at what's going on and query against our data. Let me just do a quick Google for a second for that visualizer, the Config view. Okay, I don't know how to get to this right now. I can't remember this configuration.

I think it has to happen after some time. But essentially you get this really great view of, okay, on the 17th, here's what happened: there's one change and one event. Earlier in the day, here's what happened. This is really important, and it's why I wanted to show you Config. These are the challenges of live streaming when you don't prepare enough ahead of time, because there's something else I want to show you as well, so I looked into that instead of this.

This gives you the ability to check how things have changed over time. It gives you this nice timeline, and you can click on each of these time events and say, okay, at 1:20, what happened? It will give you a list of exactly everything that happened at that time, and you can actually click on the compliance timeline to see, yes or no, whether these things were in line with what you expected. So that's one way that you can catch misconfigurations or mistakes: through compliance.

However, remember this (and I'm flipping back to this a lot): fast lane, slow lane. Everything takes time in Config because it's based off CloudTrail. When that time is a challenge, what we're going to do is hook a CloudWatch event.

We're going to set up an event and trigger a Lambda to do something. We're going to fire when instances are shut down. So instead of going into AWS Config, we're going to go to CloudWatch. Now remember, CloudWatch, unfortunately, is actually three services.

There's CloudWatch metrics, which you're used to; CloudWatch Logs, which is a really fantastic logs tool; and CloudWatch Events, which is what we're going to use right now. So we're going to go to Events. We're going to hit Get Started, because I've never set it up in here. You can match on a schedule; a scheduled CloudWatch event is like a cron job on Unix or Linux.

It's basically every X number of minutes, every hour, or whatever. What we're going to do here is actually look for an event pattern. We're going to say EC2, and then for the event type we're going to pick State-change Notification. So any time the state changes for an EC2 instance, we want to get notified.

In this case, it's going to send this detail type from the event pattern to us. If we look at a sample event and scroll down here, you can actually see this. It has the ID, the detail type "EC2 Instance State-change Notification", the account, and the resource, which is our actual EC2 instance ARN.

So that tells us exactly which instance, and then what the state change was. Now that's really cool. So we're going to hook into that to prevent something from happening. There's a fantastic question over on LinkedIn about multi-account environments. Multi-account environments are really, really tricky, because AWS has a very simple principle: your data needs to stay in the region and in the account that it was created in.

Then it gives you the ability to view it, sometimes, from other areas, depending on the service. That's called the master-member pattern, where every member has its own data. So if you have Mark's account and you are looking at CloudTrail sitting in the one area, the question is, can you pull data from other areas into Config? Well, Config mirrors the CloudTrail data.

I believe that in Config you can check within that same account. The problem is going across accounts. You need cross-account roles that you can assume; it is possible to set it up to look at different accounts from Config. But this is also why Config has the SNS topic output.

If you set up Config and say, any time there's a configuration change, send this to an SNS topic, that topic can be in your master account, and then you can subscribe to do things in that master account, because the Config rules are just Lambda functions. So you can have that. And thank you, Steve.

Excellent point: I did not switch back to Chrome. So when you have that multi-account setup, you get into a lot of this plumbing for cross-account roles. There's some documentation that I'll fire into the comments afterwards that will help explain that.

Since I had this hidden, I'm going to explain this again, really quickly. Thanks, Steve, very much appreciated. Managing multiple windows is sometimes less easy than it needs to be. So what we've done in CloudWatch Events is set up a rule, and we said "event pattern" because we want something to happen, and when that happens, we want it to match.

In this case we selected a service name; we said EC2. We don't want all events; we just want when an instance changes state. What we're going to trap here specifically: if someone stops our server or if someone has terminated it, we want to know. So we're going to say any state, for any instance.

If you scroll down, you get a preview of what that looks like, and you actually get sample events. Most of this stuff you can just ignore. The detail type is kind of handy, because we can filter on that in our Lambda function, but the resource is the important thing; that's the specific identity of the instance that's been changed. And then you get that again in the detail, along with the state, which in this case is going to be pending.
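As a rough illustration of the pieces just described, the event pattern and a sample event might look like the following. The account number, region, timestamp, and instance ID are placeholders, and `matches` is a deliberately simplified stand-in for the service's own matching logic, included only to show how a pattern selects events.

```python
# Pattern: any state change, for any EC2 instance.
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
}

# Sample event in the shape described above; all identifiers are placeholders.
SAMPLE_EVENT = {
    "version": "0",
    "id": "00000000-0000-0000-0000-000000000000",
    "detail-type": "EC2 Instance State-change Notification",
    "source": "aws.ec2",
    "account": "111122223333",
    "time": "2019-01-01T00:00:00Z",
    "region": "us-east-1",
    "resources": [
        "arn:aws:ec2:us-east-1:111122223333:instance/i-0123456789abcdef0"
    ],
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "pending"},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: each pattern field must list the event's value.

    Nested dictionaries (such as "detail") are matched recursively.
    The real service supports more than this sketch does.
    """
    for field, expected in pattern.items():
        actual = event.get(field)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


print(matches(EVENT_PATTERN, SAMPLE_EVENT))
```

To trap only stops and terminations, the pattern could be narrowed with a nested filter such as `"detail": {"state": ["stopped", "terminated"]}`.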

What we're going to do is copy this over, and we're going to go to the right-hand side. We're going to add a target. In this case, we're going to add a Lambda function, and we're going to change this to the "reinvent sample" function, which is what we want.

Under "Configure input", the input is "Matched event", which is perfect. What this says is, essentially, any time we match an event on the left-hand side, so any time EC2 sees a state change, we want to trigger "road to reinvent sample". Number one, very simple. So we add that target, scroll down to the bottom, and confirm.

So this is our rule: "EC2 instance changing". We're going to enable this, so we create the rule, and now any time an EC2 instance changes, it's going to trigger our Lambda. So let's flip over to Lambda real quick. We're going to go to our function. This function is what we used earlier, and you can see now it's wired up to CloudWatch Events.
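The same wiring can be done through the API instead of the console. The sketch below assumes boto3-style clients are passed in; the rule name and statement ID are hypothetical. One difference worth noting: when the rule is created through the API, the Lambda function needs an explicit resource-based permission that lets CloudWatch Events invoke it, which is the `add_permission` call here.

```python
import json

EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
}

RULE_NAME = "ec2-instance-state-change"  # hypothetical rule name


def wire_rule(events_client, lambda_client, function_arn, rule_name=RULE_NAME):
    """Create the rule, allow it to invoke the function, and add the target."""
    rule = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )
    # Without this permission the rule matches but the invocation is refused.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": function_arn}],
    )
    return rule["RuleArn"]


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   wire_rule(boto3.client("events"), boto3.client("lambda"), function_arn)
```

Passing the clients in as arguments keeps the function easy to test with stand-ins and makes the region and credentials an explicit choice of the caller.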

That's the step we just took: a CloudWatch event is going to trigger this sample, and it has access to Amazon CloudWatch Logs. We're going to change our code, because we had this code running earlier. What we're simply going to do is dump out the event.

I'm going to return nothing beyond a 200. So basically, this is our Lambda function that says, when we're triggered, so any time an EC2 instance changes, we're going to dump that entire event document out to the log. We're just going to print out what it was sent, to show that it's triggered.

Now, what you would do here normally is take some sort of action, because somebody's doing something you don't want them to, right? They're making a mistake or they're making a misconfiguration. Let's just save this real quick and run a quick test, and you'll see it just dumps out whatever was sent to it.

In this case, we had an S3 example. Let's add our new test event, "CloudWatch sample events EC2", and we're just going to paste in what we copied earlier. So this is the sample event from CloudWatch Events.

And now we're going to test it again. This is just us manually testing it, and all it does is return 200, saying, "Yeah, I got it, don't worry about it." But in this case, it actually dumps out what we are seeing. So what we could actually do is make this a little bit cleaner.

So let's make this cleaner: we can print the event's detail type and print the event's detail. Now let's save this and test it. Instead of dumping the whole event, I've just pulled specific attributes out of that document. So you see now we have... oops, errors. Mark can't type properly: "details" does not exist; it's "detail".

Typing is not my strong point this morning. Not good. So in this case, what we've got now is the detail type, which is an EC2 Instance State-change Notification, and it's telling us the instance and the state. Seems like that works perfectly.
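A handler along the lines of the one described above might look like this. It is a sketch rather than the exact code from the stream: it pulls out the detail type, instance ID, and state instead of dumping the whole event, and uses `.get` so that a mistyped or missing key yields `None` rather than an exception.

```python
def lambda_handler(event, context):
    """Log the interesting parts of an EC2 state-change event."""
    detail = event.get("detail", {})  # note: the key is "detail", not "details"
    summary = {
        "detail_type": event.get("detail-type"),
        "instance_id": detail.get("instance-id"),
        "state": detail.get("state"),
    }
    # Anything printed here ends up in the function's CloudWatch Logs stream.
    print(summary["detail_type"])
    print(f"{summary['instance_id']} is now {summary['state']}")
    return {"statusCode": 200, **summary}


# Manual test with a trimmed sample event (placeholder instance ID).
result = lambda_handler(
    {
        "detail-type": "EC2 Instance State-change Notification",
        "detail": {"instance-id": "i-0123456789abcdef0", "state": "pending"},
    },
    None,
)
```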

So let's trigger it; let's make something happen. Just to review real quick, since we've got a few minutes left in the stream: what we're doing here is trying to catch people making mistakes, and we know that the best way to do that is to prevent people from taking actions that they shouldn't be able to take in the first place.

So in this case, I wouldn't give myself permissions to mess around in EC2 if I didn't need them. If I wasn't supposed to be stopping or terminating instances, I should not have permission to do that. That's the easiest way to stop a mistake or misconfiguration from happening.

Now, understanding that shit happens, at the end of the day people make mistakes. It's going to be a problem at some point. So we have that slow lane, fast lane idea. We could use AWS Config for stuff where you shouldn't be doing it, but if you did it, it's okay.

We can remediate, or we can correct it slowly. We don't need it right away. What we've done now is something that's far more important. A state change normally means you're taking a server, an instance, offline, right? Or you're spinning one up, which costs money. That could potentially impact production.

So what we've done here is wire up a CloudWatch event. CloudWatch Events looks at every action taken in your AWS account, and we added a filter. Let me just review this with you. What we did here is we went into Events.

We looked at a rule, and we said, you know what, every time somebody calls this API in EC2, in this case changing the state of an EC2 instance, trigger this function. So trigger this Lambda function called "road to reinvent sample". And now all our Lambda function does is print out to tell us this was done, because we don't have time to wire this up. What you could put in this code is to spin a new instance up, or to raise a security incident, or to take some other sort of remediation action that helps you respond to the incident. A really common one that I've helped organizations set up is, when something like this happens, to actually pull up the information of who made the request, to say: Mark made this request, and he's not supposed to be able to. Check his permissions, right?

Here are the permissions he's currently assigned. Here's the one he just took advantage of that he shouldn't have. Please fix this, right? So these are really great ways to do that. What we're going to do right now, back in EC2, is take an action. We're going to go to Instance State and we're going to reboot this system. The system is doing nothing.
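The remediation ideas mentioned above (bring an instance back, or raise an incident) could be sketched as follows. This is an illustration under assumptions, not the demo code: `PROTECTED_INSTANCE_IDS` is a hypothetical allow-list, the "incident" is just a log line, and the `ec2_client` parameter is added so the decision logic can be exercised without AWS access.

```python
# Hypothetical set of instances that should never be stopped or terminated.
PROTECTED_INSTANCE_IDS = {"i-0123456789abcdef0"}


def plan_remediation(event: dict, protected=PROTECTED_INSTANCE_IDS) -> str:
    """Decide what to do about a state-change event."""
    detail = event.get("detail", {})
    if detail.get("instance-id") not in protected:
        return "ignore"
    state = detail.get("state")
    if state == "stopped":
        return "restart"  # a stopped instance can simply be started again
    if state in ("shutting-down", "terminated"):
        return "raise-incident"  # termination cannot be undone automatically
    return "ignore"


def lambda_handler(event, context, ec2_client=None):
    """Apply the planned remediation for a state-change event."""
    action = plan_remediation(event)
    instance_id = event.get("detail", {}).get("instance-id")
    if action == "restart":
        if ec2_client is None:
            import boto3  # imported lazily; only needed for real remediation

            ec2_client = boto3.client("ec2")
        ec2_client.start_instances(InstanceIds=[instance_id])
    elif action == "raise-incident":
        print(f"SECURITY INCIDENT: protected instance {instance_id} is being terminated")
    return {"statusCode": 200, "action": action}
```

Looking up who made the request and reviewing their permissions would be a further step on top of this, since the state-change event itself does not carry the caller's identity.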

This is a demo account. This is a low-impact change, but we're going to reboot this. Yes. Because we're rebooting, that should trigger a notification; it just takes a few seconds. So if we come back into here, we are going to look at Monitoring, and you can see the invocations.

I think that's an extra invocation. This is the challenge with really low-volume Lambdas. But we click on the logs in CloudWatch, and that's going to give us a better idea. CloudWatch, unfortunately, again has those three different areas: alarms, events, and logs. Now, this is not the latest log; you can see the timestamp is :25. Logs sometimes take a couple of seconds to kick in.

So we'll see if it's actually triggered, or if the reboot might not be a trigger for the API call. That's a good thing to check. So we'll just check in here. Yeah, this is just the standard demo that we had. If we come back... all right, we're not getting that here. The challenge of live streaming.

What we're going to do is go back to EC2, and we're actually just going to stop this. We're going to say Instance State, Stop. Yes, we understand this is going to kill any data that's on the ephemeral disk. That is okay; this is just literally a random instance. So this is stopping.

This will definitely trigger the event, and then we'll make sure to see that this worked all the way through. Fingers crossed. So you can see that this flow, this pattern, works pretty well. Now the challenge here, and we'll flip back to this slide again: the fast lane and slow lane really, really quickly expand out to cover a ton of different things. A lot of this is just plumbing code. Is this really something that you want to build yourself?

This is where there's a whole tier of tools called cloud management tools. I don't come up with the names; I don't come up with the categories. But this is the whole goal of those cloud management tools. We're not talking about things like Chef and Puppet.

Those are orchestration tools that help you set things up. There's a whole category of tools, including third-party ones, that set up this kind of framework automatically for you so that you don't have to do the plumbing. Today, we're in the weeds; we're doing a lot of the plumbing ourselves. Getting this as a tool is great, because it's a lot of groundwork.

I mean, you don't necessarily need to do the grunt work; you want to do the value work, right? The value work here is figuring out what is important to you. Flagging these things is not the important part; what you do after you flag them is the important part. We're still waiting for this to come through, and I have a feeling that I have a permissions error in here, and that might be a problem. Permissions errors are challenging, mainly because... oh, there we go.

It actually invoked at 13:25-26. I'm pretty sure that invoked; we have a new point on the graph, and that's always a good thing. I don't believe it actually triggered from the CloudWatch event, but that's going to be because we didn't set up the permissions on the Lambda function properly.

We can correct that later on, but you get the idea here, right? If we run this test event again, you're going to see in the logs that it will have the proper thing, with the state-change notification... oh no, there it is. It just went past. It worked: our state change of "stopping" actually triggered, right? I like it when it works, especially when I've already backtracked and said it didn't work. And we caught somebody doing something they shouldn't, right?

So again, the first case, the ideal case, is you don't give people permissions: you implement the principle of least privilege, and they don't have the permissions to make these mistakes. But the reality is we're all human, and people are going to make mistakes and misconfigurations. So the idea is the slow lane and the fast lane. For what you can tolerate, you can put it through AWS Config, which bases off CloudTrail and is a really nice visual way to give you that timeline of configuration changes, including third-party stuff.

You can include third-party things in there with custom rules, and rules are just Lambda functions. And now we did the fast lane too. We did the near-real-time path, where we triggered from CloudWatch Events: a CloudWatch event fires off, we trigger this Lambda, and it takes an action.

In this case, all we did was notify. We just put out a log: this instance has been stopped. Right, if we go back to EC2, you'll see instance E36 is stopped. If I fire this up and start it again, you're going to see another notification. We didn't take the additional remediation step.

We're just showing you the hook, right? So that's the gist. That's the core. There's not a lot to this stuff; the concept is pretty straightforward. But you can imagine really quickly having a host of these rules and trying to manage them, and that's where there's third-party tooling available that can help with that overhead.

But that tooling relies on the basic concepts that we saw today, which is really just figuring out what should happen, what should not happen, and what must not happen, making sure you put each in the appropriate lane, slow or fast, and setting up the remediations. It's the same concept as with AWS in general: if you can push the grunt work to something like a third-party tool, you're better off, because the value you can provide your business, the value you have for your org, is really in the remediation.

What do I want to do if Mark does something dumb, like setting a bucket to public or stopping a production server? Now, ideally you stop that with permissions and you don't let me do that in the first place. But mistakes happen, and these are a couple of different ways that you can actually correct those mistakes. It's nice to build this safety net.

So as you can see from this stream, and from other streams, Lambda is really critical to working in the AWS cloud. Even if you're not building production stuff in Lambda, you can build a lot of production operations tooling in Lambda yourself. Then again, there's a really robust partner network out there, plus open source projects and the Serverless Application Repository from AWS, that let you get up and running with this stuff really quickly.

That's it for today. Hopefully that was useful. Hopefully that showed you some of the concepts of how to remediate and how to catch some of these mistakes and misconfigs. Let me know, and keep sending comments on LinkedIn and here on Twitter. I'm setting up the next streams after next week. As always, we're doing this a couple of times a week leading up to re:Invent. There are 45 days left.

As a reminder, now is a great time, if you're going to run that charity run and have never run before, to start your Couch to 5K, and you'll be all set for the charity run the week of re:Invent. I believe it's Tuesday morning. If not, just donate directly to the charities; they're really fantastic ones. You can find that on the re:Invent site. So let me know what other topics you want to see in this lead-up.

I really appreciate your time. Thanks for joining and I will see you on the next stream.