AWS re:Inforce 2019: Build Security in CI/CD Pipelines for Effective Security Automation (SDD351-S)

Captions
Good afternoon. Welcome, everyone, to "Build Security in CI/CD Pipelines for Effective Security Automation" on AWS. That must be a catchy title; this is a big crowd. Very happy and glad you could all make it. So before we begin, I just wanted to see a quick show of hands: how many of you are using containers? Wow, a lot of you. That's great. How many are using ECS? EKS? OK, that's good. Fargate? There are a few. All right, this will be an interesting session.

So the agenda for today's session: we'll start with the need for bringing security and compliance checks early into the development cycle. Then we'll quickly turn it over to Kevin, who will walk you through the approach taken by Flexport; Kevin is the CISO there. Then we'll walk you through the security checks that you can build into each stage of the build, deploy, and run phases. We don't want to do death by PowerPoint at this late hour in the afternoon, so we have a couple of demos peppered in there, and then we'll wrap it up with Q&A.

All right. I don't know why marketing does it this way, but our intros come later. I'll go first. I'm Ram Boreda, director of products at Palo Alto Networks, Prisma Public Cloud. I'm responsible for driving product strategy and roadmap for our cloud security products. Prior to Palo Alto Networks, I was at Amazon AWS, responsible for the VPN service and the Transit Gateway service, which hopefully many of you are using or considering using. Before that I was at Verisign and a CASB company called CipherCloud. Over to you, Kevin.

Thanks. My name's Kevin Paige. I'm currently the CISO at Flexport. Prior to Flexport, I was the CISO at a technology company called MuleSoft; it focused on APIs and integration technology. Prior to that, I spent 20 other years in various technology and security roles, working for large companies like Salesforce, small startups, as well
as for the government.

Thank you, Kevin. All right, so let's set the stage. I'm assuming there are a lot of security practitioners and security analysts here. So what's the number one security analyst dilemma? What is a SOC analyst really dealing with? Right: alert fatigue is a major issue for a lot of SOC teams and security analysts. As you can see from the statistics there, every week about 175,000 alerts are generated in a typical enterprise, and only 7% of them are reviewed, let alone fixed or remediated. And if you look at some advanced attacks, the mean time to identify is about 197 days, and to contain is another 69 days. So the gap between the generation of an alert and full remediation can be really long if you just follow the traditional security practices of today. So, as you all know, security should start very early, in the build stage, and issues need to be remediated there.

Some more stats here. One in two developers working on containers do not necessarily test their images for security; that's 50% of developers who don't test the images they are building their apps on. If you look at the top ten downloads from the Docker registry, almost all of them have roughly 30 vulnerabilities on average. On the flip side, four in ten of those images can be fixed by updating the image tag, so simply flipping a bit will take care of forty percent of those vulnerabilities in Docker images. And we live in a world of CloudFormation templates and automation. Another statistic, based on a research study that Palo Alto Networks did: the number one reason for the majority of alerts, 62 percent of them, is a common misconfiguration in templates. The problem only explodes when there is a common misconfiguration like this. So what I'm going to do next is
turn it over to Kevin. Oh, actually, there's one more. So I think the theme here is pretty apparent, right? We really want to build security into the earlier stages of your DevOps cycle; that's what is leading up to this DevSecOps momentum. Obviously, bolt-on security, where you're only performing security and compliance checks at runtime, doesn't really cut it, and we need a high degree of automation so that we don't slow down the developer. Security needs to be built into every stage of the dev, build, deploy, and run phases. So with that, I'll hand it over to Kevin.

Great, thanks, Ram. Like I mentioned before, I've been doing security for a long time, so I've been fighting the battles and putting out the fires for a long time. A long time ago I realized that there's got to be a better way to do things than just putting fires out. We came up with the term DevSecOps. It's probably good to talk about that word really quickly before I talk about my journey. I look at DevSecOps as just a standard way to make your product or service not suck: make it reliable, make it available, make sure you have security inside your processes. Everybody should be using the same tools, the same mindsets, the same processes. Security shouldn't be an afterthought; security should just be part of the process. It should just be how it is. It's easier said than done. So how do you get there, and what are some of the lessons I've learned and thought about as I've made this journey?

I've been lucky enough that the last four companies I was at were hyper-growth. I guess the definition of a hyper-growth company is 25% growth in revenue and people; the companies I've worked at were more than double that, so more than 50%. When you're growing that fast, and your engineering teams are growing that fast, and your IT teams are growing that fast, it's a big risk, and how do you deal
with it? Bolting security on at the end while these people are coming in, trying to figure stuff out, doesn't work. At the same time you're bringing all these people on board, the business wants to move faster: hey, we're doing great, sales is great, everything's going great, we need more features to make all of our customers super happy. And while we're trying to do all this, internally there's no alignment between the security, engineering, and IT teams; our dependencies aren't quite there. So with all of these challenges, from a security perspective, an engineering perspective, and across other business units, everybody's not aligned. So how do we get to a spot where we can get everybody aligned and figure this DevSecOps thing out? Figure out how to do ops the right way is probably a better way to put it.

One of the first steps we had to take was to realize that this isn't a technology problem by itself; it's also a problem of alignment and influence. All those pieces that were broken, the alignment, the teams that aren't communicating well together: how do we get them working together? Security can't be a bolt-on. If security is not going to be a bolt-on, security's got to get involved with the engineering teams and the IT teams, understand what their problems are, and be part of the solution to help them solve their problems, while you bring security in as part of the standard process. We need to do that through influence. Influence isn't titles; it's not ways to push or use a hammer to hammer things down. It's really just being thoughtful, understanding, and helpful.

And then there is the visibility part. In order to fix things and make people want to know
what's going on, you have to make things visible. How are you going to make those things visible? Through monitoring, availability, SLAs; there are lots of interesting ways to do it. I've done it in a couple of interesting ways, and I'll show you an example after this slide of how we took a look at some of the things that drive certain parts of the business and brought them together so that the visibility was there, and that visibility drove interesting behaviors.

One of the behaviors it drove was accountability. A lot of times we don't really hold people accountable, and the ability to figure out how to hold people accountable is really important. It helps drive behavior changes and brings teams together.

The other big thing was identity and access control. Getting that right was really important, because in today's age everybody is virtualized, they're containerized, they're going in lots of different directions. There's no such thing as the old-fashioned asset inventory, where you've got a spreadsheet of assets; that doesn't matter anymore. We need to make sure that every asset has a tag, that everything has an owner, and that there are associated responsibilities for those owners.

What we did with a lot of these stories is figure out what drove people. We created a dashboard that identified lots of different things. So we took the developers: what do developers care about? They care about how many pull requests are open. We took a look at other things, like customer-reported issues and incidents, and put them on the same dashboard, which drove lots of different people to want to take a look at it. So the security team wanted to look at incidents. We put customer-reported
incidents on here; the CEO, who was customer obsessed, wanted to take a look at what incidents were going on and understand what the problems were as well. But we also baked in here, at the top (don't worry too much about the names of the tools on there; those evolved over time, and this is an old version of the dashboard), our static analysis tools. So in the same place where you're looking at incidents and customer-reported issues, we're taking a look at static analysis, at dynamic analysis, at what issues were there based on vulnerabilities, at lots of different mechanisms that we could pull via APIs into a single dashboard.

What you can't see, because I cut it out of this, is that each one of our products had an associated owner. The CEO, the CPO, and the CTO would all take a look at these dashboards for the particular item they cared about, but they would always see if there was a high-risk vulnerability in one of our product areas, and they would instantly ask questions. This held our developers and our product managers accountable for making sure that we had very few vulnerabilities inside the product, because they don't want the CEO, the CPO, or the CTO to ask: hey, why do you have three critical vulnerabilities inside your product? It's on this dashboard here. What do you need to do to get rid of those vulnerabilities? Do you need more resources? Do you need more people? What's it going to take?

And that built the culture. It was a strong way to build the culture, to make sure that everybody was taking a look at the same set of data. We pulled in, across the lifecycle of the business, some of the most important key assets. We pulled them all in through
APIs, and we showed the most important things that people wanted to see. This drove a lot of behavior change, and instead of me fighting fires and chasing everybody down, people were coming to me and saying: hey, I see there are five vulnerabilities in this product over here, what's going on? We have SLAs on them, so if it's yellow, that means it's reaching the end of the SLA; green, of course, means good; and red means that the SLA had expired. And what do we need in order to get to a good spot where we can keep things as green as possible?

So using APIs, using automation, and bringing everything together; really understanding your business users, what drives them, what motivates them; what does the CEO want to see, what does the CTO want to see, what do different parts of the business want to see; and being able to bring that into a centralized dashboard using automation and APIs really helped bring in this idea of DevSecOps, this idea of operations and automation. It got everybody aligned in a good way and really started to drive good things.

It drove a big culture shift. Security was just a thing that people did, and that was just the way it was. Everybody saw the SLAs and the vulnerability reports, and it led to good behavior. The accountability drove behavior changes too, so there was less finger-pointing, less "oh, that's not my fault, that's their fault" or "it's not my responsibility, it's somebody else's." It was very clear: with all of our assets tagged, every tag had an owner, every product had an owner, everybody knew who was responsible, and everybody knew that there was lots of visibility on those who were responsible. It really helped drive that behavior change.

And then building security in increased velocity. So people wanted the
features faster; people wanted to build new products faster. Putting that framework in place, using those APIs, providing the visibility, and making sure that the most important data elements were visible really increased velocity. As a security organization, we were able to help a lot of pieces of the organization come together and really enable through security. So it's a pretty good story.

There are lots of different vendors bringing new types of APIs to the market; Palo Alto's got a couple. In the next portion of this discussion, Ram's going to be discussing some of those APIs for security that you could bring to a dashboard and build into the CI/CD pipelines to increase security and provide that visibility.

All right, I want to pick it up from here. So how do we help Kevin and his team, and customers such as Kevin, really achieve these outcomes? Kevin, kudos to you; you've done a tremendous job of effecting all of these positive outcomes. Culture shift, accountability, change in behavior: these are not easy to come by. You've accomplished an increase in feature velocity. How do we continue to maintain that? While Kevin and his team have done a phenomenal job of working on these, we at Palo Alto Networks listened to customers like Kevin, and we focused a lot on building a high degree of automation and on the tools side of things to help Kevin and his team. So how did we go about this? Now the clicker should work. OK.

All right, so we quickly realized that we need to start security right from the build phase, and we need to make it developer friendly. As developers are building their applications, we need to provide them with easy, friction-free ways to perform security checks, and this needs to continue throughout the
deploy and run stages. So there are a number of security challenges that you need to address, especially for containers, during each of these phases. We've gone about it very systematically and built tools and, as Kevin mentioned, APIs, so that developers can just hook into our services to perform their security and compliance checks.

So let's start with the build stage, where you need to be able to perform vulnerability scans on the images that you are downloading from open source or the Docker registry. We talked about the fact that 30%, 40% of the top 10 downloads have 30 or 40 vulnerabilities, so we need to be able to scan those quickly. Then, as you are building your images and beginning to deploy them, we need to scan those images, and we need to scan all the configuration settings associated with your containerized apps.

And then runtime security: there are a lot of things that could go wrong even when you deploy a golden, certified image into your registry and your orchestration tool is spinning up containers using auto scaling and other features. So there are a lot of runtime configuration checks that we need to do. We need to detect any drift or any new zero-day threats for the golden images that you have, and we need to detect and respond to any potential attacks. So we set out to build a solution, or a set of solutions, that covers the entire lifecycle of a containerized app, starting from the build stage through deployment and through the run stage. So how does it all work? The next few slides and demos will walk you through that process, if I can get the clicker to work.

All right. So just to take a step back, this is probably the scenario for, if not a lot of you, at least some of your DevOps settings, right? This is where developers are building their applications, where security is not really looking into it,
meaning they are added only, or they become aware of the applications you built only, after those are deployed in the cloud environment. They are not really part of the daily builds that happen, and this creates a lot of resistance and a lot of frustration, both for the developers and for security teams like the ones Kevin runs. So we want to go from here, the current scenario, to an end state where everything's friction free, everything's smooth, because security is involved at every stage. With guardrails defined by the security team or SecOps, we are running continuous security and compliance checks right from the dev stage, where you are just identifying your applications, through build and deploy, and obviously you're running them continuously during runtime.

So let's take a look at how we architected the solution. Let's assume you are a developer working on your desktop. You are exploring images from a Docker registry; you just pull those down, and you are playing with them, looking for a good set of base images and open source libraries that you got from Docker Hub. You need to be able to quickly scan those and make sure there are no vulnerabilities. So you can submit those to the recently released Prisma Public Cloud vuln scan API. We recently released two services; they are open source and freely available, so you can start playing with them and utilizing them today. A developer can simply send the package they've downloaded to Prisma Public Cloud, and we will scan it against a list of known vulnerabilities. Within a second or so (that's not an SLA; that's the average response time that we see, because that's the response time and performance requirement Kevin's team imposed on us), so within a second or so, we send back
either a list of vulnerabilities that we found in that package, or we give a green signal. Once you get past that stage, you start layering in your business logic, your business application, and you've built an entire image. Again, you can send it to the same Prisma vulnerability scanning API service, and you can set up your CI/CD pipeline to send it automatically to our cloud service. It can be baked into your daily build routine, so as soon as a build is done, it will send the image to our API, and we again send the response back: a list of vulnerabilities that we found in the entire image, and, for each vulnerability, the recommendations we have or the patches you can apply to resolve it.

Let's say you as a developer and the DevOps team went through the list of recommendations and fixed all of those. Now your quote-unquote golden image is pretty clean, and you're beginning to configure it so that it can be registered with the container registry you may be using. I think a lot of you said you're using ECS or EKS; you may want to let that orchestration tool spin up your images. But before that, there are a number of configuration settings, deployment settings, YAML files that you need to configure, and a lot of times, if they are not configured properly, that opens up a huge attack surface. So we have a config scan service, again part of Prisma, an infrastructure-as-code scanning service. You can submit your files there as part of your CI/CD cycle: you can configure your AWS CodePipeline, or you can use tools such as Jenkins or any other CI/CD tool. Just make a few lines of code changes in your build script, and every time a build is run, the configuration files (these are infrastructure-as-code files, right; this could be CloudFormation templates, this could be Terraform templates) get sent
to our service, and again we will scan them and make sure they adhere to all of the security best practices that the corporate security team has laid out, and to a number of industry benchmarks: AWS CIS benchmarks, NIST, HIPAA, and so on and so forth. We'll run your infrastructure-as-code templates through all of those security checks and then either pass the build or fail the build, depending on what we find. And then you are ready to run your containerized apps from these images.

So what I'm going to do next is walk you through a couple of demos. Let me set it up. The clicker doesn't... OK, there you go. I'm just looking at the time; I had all four demos lined up, but I may do two or three depending on the time we have. I'll start with the demo where we show how we can invoke the Prisma Public Cloud vuln scan API from the Jenkins build tool and scan your container image. I'll actually set it up right there. Sorry about that. Let me swap displays here.

So this is a typical Jenkins project. You have set up the Jenkins project to point to a GitHub repository; that's where your source code is, and that's where your build scripts are, and a build is triggered every time there is a change to one of these files. I'm going through and highlighting the call that you need to make; this is the simple API call that you need to add to your build script. And then let's look at the file itself. We are going to scan this Dockerfile that I just pulled down from Docker Hub. I'm going to change my build script to include the file that I just downloaded. I'm looking at the curl package, and I'm issuing a pull request, because I just changed the curl package. This will kick off the build process, and behind the scenes there's an API call being made to the Prisma Public Cloud scanning API, which you can use today, and
pretty soon you'll see the build results. I promised about a one-second response time; we built in some pause so that we could get some commentary going. This is where you'll see the build results. I'm going to open up the console now to look at them. So there you go: that build failed. You see the big red mark, and we will show you why the build failed. That curl package had a number of CVEs. We failed the build based on thresholds that you may have defined, meaning it should not have critical CVEs. So based on those thresholds, we failed the build. We can take a quick look at what the CVEs are and how we can address them. Those are the recommendations we provide. That's the CVE, and it just says that if you update the version of curl, you should be OK. So we are going to update to the curl version recommended there. Here is another CVE. We are going to fix both of these issues by upgrading to this later version of curl. So I will go ahead and get that seven-point-whatever that number is; I can't read it from here. We'll go ahead and change the build script to include the newer version of curl, just running an update on the package. Anytime an update is made, there's a pull request, the build kicks off, and it again sends the same package to the Prisma Public Cloud vuln scan API. The reason why we made it very simple (there is no authentication required; you can just make it part of your build script) is that we want to minimize friction for a developer. This needs to happen in the background; once it is set up, it's done. You'll see the build will automatically pass, and we'll be ready to deploy this image. All right, there you go: build 103 passed successfully. So I'm going to flip back to the slides and talk about the next scenario.

OK, so we just did that demo. We'll save some time at the end for Q&A; you can ask us questions, and we'll stick around for a while. Let's get to the
next phase. This is where, just like we did with your container images, we can also scan your Terraform templates or CloudFormation templates, anything that you use for expressing your infrastructure as code. Those templates can be scanned by the Prisma config scan API; we'll run them through a number of security and compliance checks and again provide recommendations if we fail the build. I'm looking at the clock, so in the interest of time I won't run the second demo; it's very similar to the first one. I'm going to move to the runtime security aspects. Sometimes this clicker is playing games with me. All right, so we are just going to skip that demo and talk about runtime.

So you've got your golden image running, your orchestration tool has taken over, and your containerized apps are functioning well. But you also need continuous security monitoring and continuous compliance checks of your environment. We have this recommendation for customers where we say: apply the 80/10/10 formula. 80% of the alerts should be handled by the automation and DevOps tools that we just saw. The next 10% is in your production environments, where you know exactly what kind of security guardrails and configuration settings you have; if you find a deviation or a misconfiguration there, that 10% should be automatically remediated by your auto-remediation scripts. And only about 10% should really go to a SOC team or SOC analysts. We talked about alert fatigue; we want to minimize that by making the DevOps teams responsible for their own cloud misconfigurations and their own security alerts, so we gave them a bunch of security tools to take care of that. The last 10% we really want to reserve for the most advanced types of threats, which can be reviewed by human security analysts. And even there we want a lot of the work to be automated: anything that can be
automated should be automated, and the information should be presented to a SOC analyst. So what we are going to see now is a demo of a product called Demisto that we have. Demisto works really well with a number of AWS security products as well as Prisma Cloud monitoring. What you'll see is a demo of alerts being raised by Prisma Cloud and sent to Demisto. Demisto has a number of common playbooks that have been automated; they will perform about 90% of the work and leave just about 10% of the work for a security analyst. So let's run through a quick demo of that.

Once the integration is complete, the alerts from RedLock are ingested into Demisto and are visible to the analyst from the incidents page, where the analyst can view the incident ID, severity, status, labels, and so forth. All these fields can be selected from the settings page to suit your preferences. As we can see here, there is an alert from RedLock on an EC2 instance communicating with well-known ports known to mine Bitcoin, but we need more details around this alert. To know more about this alert, the analyst needs to go to the investigation page by simply clicking on the alert. That takes you to the investigation view, with its six sections listed on the left-hand side of the page. On the summary page, the analyst can view more details around the alert. In this case, we know that unless this traffic is part of authorized applications and processes, your instances may have been compromised. The analyst can also view the alert severity, the recommendation, and the alert time. The analyst can also assign an owner for this incident right from the summary page. On the war room page, the analyst can view all the artifacts and take actions such as marking one as evidence or as a note, viewing the artifact in a new tab, or attaching a tag. The analyst can also add other team members to help
with the alert investigation, as well as run commands from the built-in CLI located at the bottom of the page. Now let's take a look at the response actions the playbook has accomplished. The playbook has already run in the background. Each rectangular box that you see on the screen is a task that can take a human analyst several hours. Demisto provides built-in, out-of-the-box playbooks as well as the flexibility for analysts to create their own playbooks from the playbooks page. In this case, a playbook for responding to the cryptocurrency mining alert from RedLock is available, which is a task-based workflow to respond to a potentially compromised instance.

So let's see what the playbook has done. The playbook marks the alert as evidence, which can be viewed by clicking on this rectangular box. It has opened a ticket in Jira. It has extracted indicators as well as enriched those indicators of compromise, and if any malicious indicators are found during this check, they are blocked at the firewall. Similarly, we need to quarantine the instances, and the playbook has already done that, so let's see what it has done. It has captured the instance details; it gets the security group details; it takes a volume snapshot, creates a tag for the EC2 instance, and changes the network interface's security group to essentially quarantine the instance. Once the instance has been quarantined, after taking a backup and creating a tag, the playbook has also initiated a scan on the EC2 instance, captured running processes, performed memory capture actions, and then sent an email to the analyst for a response. Once the analyst has responded, it has updated the ticket in the ticketing system and automatically generated an investigation summary report, and this report can be automatically emailed to all stakeholders. Once all these actions are completed, it sends an alert notification to the security team for the final review and then
potentially close. Now, all these actions typically take several hours or days today. The Demisto playbook can accomplish the same actions within a few minutes. On the evidence page, what we see here is that the playbook has already recorded evidence for you for retrospective review purposes. Let's go to the related incidents page. Yep, as you see, there are multiple alerts in action.

And I'm going to pause that and flip back to the slides. All right, so hopefully you saw and understand how you can use SOAR tools. Demisto happens to be a SOAR tool, so you can use a tool like Demisto to really make the life of a SOC analyst easier: automate anything that can be automated and present that information so the human analyst is just making decisions, as opposed to spending hours collecting information, going through ten different systems or tools that they have. So that's how Demisto makes the life of a SOC analyst extremely easy.

With that, I want to leave some time for Q&A, so let's quickly wrap up here. What we are doing and what we've accomplished so far is being able to automate a number of security and compliance controls throughout the lifecycle of your typical applications, especially container applications, from right when a developer is building these applications all the way through runtime, and then we offer continuous monitoring for this. With that, you can certainly access the services that we mentioned here from the URL here, developers.paloaltonetworks.com/prisma. We'll open it up for questions. Actually, before I do that, any closing thoughts on what we just saw?

Just, I guess, a reminder: DevSecOps should be just ops, and we should be automating as much as we can. The technologies are there, the capabilities are there. A lot of what you see a lot of times with automation is that we're streamlining things. We've got a good process, we've got good procedures that are repeatable,
and this helps enable the business to move faster, whether it's your security team or other parts of the business.

Yeah, thank you, Kevin. So we'll open it up for Q&A now.

All right. That is correct, that's a great question. I think for the kind of data sharing agreements we need, we need to take it offline. By the way, just so that we are all clear, we are not receiving your actual containers. Mostly it's manifests of images that you are pulling down from Docker Hub. It is right now unconstrained, meaning you can submit any number of files. There are no limits. Even in the free version there are no limits, at least as of now. We want adoption. We want developers to start using this right away, and we want developers to use this a lot.

Question there. It is comparing to known Docker images, and then we have a research team that is continuously going through open-source places and pulling down images, and we are building a library. So it's both. But as far as the runtime is concerned, where you are actually submitting the Docker images, you are submitting a manifest, and we are just going through the known vulnerabilities based on what we've scanned before, just like CVEs and other vulnerabilities.

There's a question back here. So I couldn't hear you properly, but I will try to rephrase the question: for infrastructure as code scanning, is it just static analysis or, I think you meant, dynamic analysis or human? Yeah, so we do support both CloudFormation templates and Terraform templates today. We are working on other IaC templates in future. It is both static analysis and, as well, we do variable interpolation: we will actually explore what the JSON or the Terraform template will look like in your cloud environment.

Question there. Yeah, so the vulnerability scanning service is primarily for containers. The infrastructure as code service is for your entire cloud environment.
You could have S3 buckets being created, you could have RDS instances being encrypted or non-encrypted, or security groups open. We will look for all of those checks. Everything that Prisma Cloud does in runtime, we are checking for in your infrastructure as code templates.

All right, I don't see any other questions. Thank you all, and don't forget to submit your feedback. [Applause]
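Editor's sketch: the closing answers describe infrastructure as code scanning as static analysis plus variable interpolation, checking templates for things like unencrypted RDS instances, open security groups, and S3 buckets. A minimal illustration of that idea in Python might look like the following. This is not the Prisma Cloud implementation; the inline template, the `interpolate` and `scan` helpers, and the three rules are all hypothetical and only assume standard CloudFormation property names (`StorageEncrypted`, `SecurityGroupIngress`/`CidrIp`, `AccessControl`).

```python
import json

# Hypothetical CloudFormation-style template, inlined so the sketch is self-contained.
TEMPLATE = json.loads("""
{
  "Parameters": {
    "EncryptDb": {"Type": "String", "Default": "false"},
    "AdminCidr": {"Type": "String", "Default": "0.0.0.0/0"}
  },
  "Resources": {
    "Db": {
      "Type": "AWS::RDS::DBInstance",
      "Properties": {"StorageEncrypted": {"Ref": "EncryptDb"}}
    },
    "Sg": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "SecurityGroupIngress": [
          {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
           "CidrIp": {"Ref": "AdminCidr"}}
        ]
      }
    },
    "Logs": {
      "Type": "AWS::S3::Bucket",
      "Properties": {"AccessControl": "Private"}
    }
  }
}
""")


def interpolate(node, params):
    """Recursively replace {"Ref": name} with that parameter's default value.

    Refs that do not name a template parameter are left untouched.
    """
    if isinstance(node, dict):
        if set(node) == {"Ref"} and node["Ref"] in params:
            return params[node["Ref"]]
        return {key: interpolate(value, params) for key, value in node.items()}
    if isinstance(node, list):
        return [interpolate(value, params) for value in node]
    return node


def scan(template):
    """Return (resource_name, message) findings for three illustrative rules."""
    params = {
        name: spec.get("Default")
        for name, spec in template.get("Parameters", {}).items()
    }
    findings = []
    for name, resource in template.get("Resources", {}).items():
        # Resolve variables first, so the rules see the values that would be deployed.
        props = interpolate(resource.get("Properties", {}), params)
        rtype = resource.get("Type")
        if rtype == "AWS::RDS::DBInstance":
            if str(props.get("StorageEncrypted", "false")).lower() != "true":
                findings.append((name, "RDS instance is not encrypted"))
        elif rtype == "AWS::EC2::SecurityGroup":
            for rule in props.get("SecurityGroupIngress", []):
                if rule.get("CidrIp") == "0.0.0.0/0":
                    findings.append((name, "security group open to 0.0.0.0/0"))
        elif rtype == "AWS::S3::Bucket":
            if props.get("AccessControl") in ("PublicRead", "PublicReadWrite"):
                findings.append((name, "S3 bucket has a public ACL"))
    return findings


if __name__ == "__main__":
    for resource_name, message in scan(TEMPLATE):
        print(f"{resource_name}: {message}")
```

The point of resolving `Ref` values before applying the rules is that a purely textual check would miss both findings here, since the risky values only appear in the parameter defaults. A real scanner would also need to handle other intrinsic functions and Terraform variables, which this sketch leaves out.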
Info
Channel: Amazon Web Services
Views: 2,905
Keywords: AWS, Amazon Web Services, Cloud, cloud computing, AWS Cloud, AWS re:Inforce, AWS re:Inforce 2019, security, identity, compliance, cloud security, AWS security, cloud security community, learning conference, Detective Controls, Infrastructure Security, Data Protection, Incident Response, Governance, Risk, Compliance, security best practices, Security Deep Dive, AWS re:Inforce 2019 Sessions, Session, SDD351-S, 300 - Advanced
Id: IkQK2epK19E
Length: 46min 33sec (2793 seconds)
Published: Wed Jun 26 2019