Why you should be using Multi-Stage Docker Builds in 2019

Hello everyone, and thank you for joining us for today's webinar, "Why you should be using multi-stage Docker builds in 2019." Today's session is brought to you by Codefresh Live, and today we'll see how you use multi-stage Docker builds and the best practices around them, even taking a Docker image from 700 megabytes down to 20 megabytes, which is a big difference in the context of CI/CD. Our presenter today is Guy Salton, Solutions Architect at Codefresh. As we go through the presentation you may enter questions; please remember to submit them in the Q&A box on your Zoom toolbar so that we can better keep track of them, and we'll address them at the end of the presentation. This session is being recorded, and a link to the recording will be sent to you later this week. Please remember to reference codefresh.io/events for our upcoming webinars, as we have fresh, informative webinars for you every month. So that is all for the housekeeping items. Guy, if you want to come online, the floor is all yours.

Thank you very much, Terry, and welcome again everybody, and thank you for joining. As Tara mentioned, today's webinar will talk about multi-stage Docker builds and why you should use them in 2019. Just to introduce myself: my name is Guy, I'm a Solutions Architect here at Codefresh, helping with technical POCs, reviews, and installations, and providing our customers with technical service. You have my email here, so feel free to send me an email if you have any questions during or after the webinar. I'd love to talk.

So let's get started. First I want to start with why we are talking about multi-stage Docker builds only now, which is also the topic of the webinar: why you should be using them in 2019. As some of you may know, multi-stage Docker builds are available starting from Docker version 17.05, which was released in 2017, so a while ago. So why are we talking about this
now? The thing is that at Codefresh we see a lot of our users' usage and how they use Docker. Some of them are just moving to Docker, some of them have been using Docker for a while, and it seems like a lot of our users, and also other people that we meet, are still not using multi-stage Docker builds. That's a shame, and you will see why during this webinar. Here on the screen you see a screenshot from Reddit with a recent post, from just about a week ago, which talked about using multi-stage Docker builds to reduce container size. It got a lot of traction and a lot of reviews, and you can see some of the comments that people had here, like "wow, that's amazing, I was looking for a solution to reduce my image sizes, and that's so cool." So it seems like people are not yet fully familiar with and aware of multi-stage Docker builds, and that's why we decided to have this webinar: to help everybody out and utilize this great feature of Docker.

So let's start by quickly going over what a Dockerfile and a Docker build are. I guess that most of you who joined are familiar with Docker images, but just to quickly go over this: with a Dockerfile, Docker provides us with an imperative DSL as text, so we can create a list of build commands. Each of these build commands will generate one layer, and from the whole Dockerfile we usually create one Docker image. This works great, and this is the basis of Docker. So again, one Dockerfile will create one image.

Let's look at an example. Here we can see an example of a Dockerfile that is building an image for Go. Of course, to run my application I first need to build it and compile it. Every Dockerfile starts with the FROM command, which says which image it will use to initialize and start with. This can be any Docker image; it can be a custom one or an official one. Here I'm using golang,
version 1.7.1, and the first thing that I will do is copy all of my source directory into my golang container and then run go build to build and compile my Go application. After the build is done, I'm exposing this container on port 8080 and finally running the binary to run my Echo web server.

So let's see how we can build such a Dockerfile and run it, and we'll go into our first demo. By the way, you can see this public GitHub repository that you can use; you can even access it now and follow along with our demo. It contains the Dockerfile that I just showed you, which is what we're going to use now. So this is again our Dockerfile. I have it saved in one of my folders, and I'm going to build it now in the terminal. It's simply called Dockerfile; I can see it here. Now, to build it, I run docker build -f to specify the Dockerfile name, which is simply Dockerfile, then I add -t to provide the name of my image, and I'll simply call it my-go-app, and press Enter. This will build my Docker image, and I see that it was built. It used some of the layers from the cache because I built it before. Now I can run docker run to run my application. I can simply run a container with the port that we exposed, which is 8080, so docker
run -p, exposing my port and specifying the image that we just created, and I see that my web server is starting on the port that I configured. So we can now open a browser and go to localhost at this port, and I see that my Go application is running. So that was an example of looking at a classic Dockerfile, building it, and running it, and I see that everything looks OK. I can now kill this container, and if I go back to my browser and refresh, I see that my server is now down. So we can go back to the slides and continue.

So everything worked fine; what is the problem? There is a problem, and the problem is this: the image that we wanted for running our application should have contained only the binary of my application and the runtime and configuration that are necessary for running my web server. But what we got is an image that contains not only my application and runtime but also a compiler and the build tools, and in some cases it can contain things like debuggers, test frameworks, Git, and other things that should not be part of my production image. It also makes my image much, much larger. If we go back to the terminal and look at the image that we created, so run docker images and grep for my image name, I see that the size of this image is six hundred and seventy-eight megabytes. So it's pretty large, because it contains all these build tools. By the way, let's just delete the second image, as I want to build it live with you, so I just run docker rmi and delete this one. Great.

So again, we saw that my image is almost 700 megabytes, and that's a problem. All these additional libraries and tools that are part of my image can also expose it to security vulnerabilities, which of course I don't want to happen on my production image. So what can be the solution for this problem? A basic solution can be to separate our Dockerfile and create two Dockerfiles: one that will build my
image and contains all my build tools, compilers, debuggers, test frameworks, etc., and one which will contain only the runtime. But first of all, this makes me manage and maintain two Dockerfiles for the same application, and I'll also have to figure out the orchestration between these images, using Bash or a Makefile or something like that. So it's not a very good solution. The good news is that I don't have to use this solution, because Docker released multi-stage builds.

So what is a multi-stage build, for those who don't know? It's a feature that Docker released, as we mentioned, in 2017, which lets me create multiple images from a single Dockerfile. So I only have to have one Dockerfile, with the same syntax; they just expanded the syntax a little bit, as we'll see in a minute. I'm running the same docker build command as before, but it will create multiple stages. The main concept is that we start with a first stage that contains all of our build tools, debuggers, compilers, test frameworks, etc. This image will probably be big, like we saw in the demo; it can be 700 megabytes or even more. But the final image, which is what we will use in production, will be a very minimal image that contains only the runtime and application binaries and nothing else: no Git, no test frameworks, no compilers, no build tools, none of that stuff. So it will be lightweight, it will be fast to build, it will be fast to deploy, and it won't contain any unnecessary security vulnerabilities.

So how will this look? Let's look at an example of a multi-stage Dockerfile. The first thing that maybe catches your eye is that you now have multiple FROM commands. Previously we only had one FROM command; now we have multiple ones. That's one of the features of Docker multi-stage builds: you can have multiple FROM commands, each of these FROM commands will create another stage, and this stage could then be exported into a different image.
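
The slide itself is not captured in the captions. As a rough sketch only, a multi-stage Dockerfile along the lines of what the speaker describes might look like the following. The stage names base, second, and third, the Alpine 3.5 base with curl, and the Debian second stage come from the narration; the base image of the third stage, the file paths, the echoed text, and the label key are assumptions made purely for illustration.

```dockerfile
# Stage "base": a small Alpine image with curl added on top.
FROM alpine:3.5 AS base
RUN apk add --no-cache curl

# Stage "second": writes some text to a file.
# The path /second.txt and the label key are hypothetical.
FROM debian AS second
LABEL description="this is the second image"
RUN echo "hello" > /second.txt

# Stage "third": writes some other text.
# The narration does not name this stage's base image; debian is assumed.
FROM debian AS third
LABEL description="this is the third image"
RUN echo "world" > /third.txt

# Final stage: starts from "base" and copies in only the files
# produced by the earlier stages.
FROM base
COPY --from=second /second.txt /second.txt
COPY --from=third /third.txt /third.txt
```

With a file like this, a plain docker build tags only the last stage as the resulting image, while passing --target second to docker build would stop at the stage named second and tag that one instead.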
So in this example we're starting from an Alpine image, and we're tagging it as base. This is the extension of the syntax that Docker added: to each FROM command you can add the AS syntax to tag it with a name, and you can refer to this name in other FROM commands, as we'll see in a second. So I'm doing FROM alpine, then I'm adding curl to this image, so this will be base. Then I'm creating another image, which I'm marking as second, and I'm running some command here to write some text to some file. Then I have a third FROM statement, which I mark as third, and here I'm writing some other text. Finally I have my final image, and here you see that the FROM is from base. What is base? Base is this image that we created internally, so base is Alpine 3.5 plus curl. Now for the second extension that they made to the syntax: the first one was adding AS to the FROM command, and the second one is that you can now copy from a previous image or a previous stage. So I can use FROM base, but now I'm copying some files that were created in the second image and other files that were created in the third image, and this will result in my final image.

Now let's see a real example of a Docker multi-stage build, an example that makes sense. If we go back to our terminal and look at this Dockerfile, this Dockerfile will again run the exact same application as before, so let's see what's different. It starts with the same golang image, same version, but this time I'm tagging it as build environment. So this will not be my final production image; this will be my build environment image. Again I will copy all of my source code into the container and build my Go application, so so far it's very similar to the previous Dockerfile. But now you see that I have another FROM command here, and I'm starting from scratch. For those of you who don't know scratch: scratch is the smallest possible image that you can start from, and it basically doesn't
contain anything; it will only contain what you copy into it. So in the next line I'm going to copy from my build environment image, from the previous stage where we actually built and compiled our application. I'm going to copy the binaries and the runtime configuration, finishing with exposing this image on port 8080 and running the actual binary. So let's see how this will look.

I see some questions. So scratch is an image; you can use it as an image, but again, it doesn't contain anything. You can start from any image here, and we'll see other examples, but the idea is to start from a very, very lightweight image that doesn't contain any unnecessary tools, so it will be fast and secure.

Now let's see how we can build this multi-stage Docker image. Again, I'm in my folder, and I previously built this Dockerfile, but now I'm going to build this one, Dockerfile.multistage. To build it I'm again going to run docker build -f to specify the name of my Dockerfile, but this time I'm going to specify Dockerfile.multistage, and again add -t to specify a name for my image, so I'll call this image my-go-app-multistage. I see that my image is now built; it was built and tagged successfully. If we now look at our images by running docker images and grepping for the my-go-app name, I see that this new multi-stage image is only 5.6 megabytes. So we went from 678 megabytes to only 5.6 megabytes, and that's because we don't have all of our build tools and compilers in our production image; we only have what's really necessary, which is the runtime and the actual application binaries.

Now let's see if this even works. We can run docker run, again exposing my port and this time specifying the new image that we built, and I see that the web server is now starting. I can go again to my browser, go to localhost at my port, and I see that my application is running: the same functionality, but a
much, much smaller image, so it's faster, more secure, and takes much less storage.

So now we can go back to our documentation and our slides. It's important to note that maybe some of you are not using Go; you're using Java or Node or PHP. You can enjoy multi-stage builds with any programming language, not only Go, and we actually have some pretty cool examples for that in our documentation. You can click on Java, for example, and see how it will look in Java. It's exactly the same concept: to build and compile the application, you start from an image that has the build tools and the compiler. So here we start from a Maven image that contains JDK version 8, and we mark this as Maven toolchain. Then we can copy the source and build our application using Maven. But for the final production image we will not use this Maven image with the JDK, as we don't need all this tooling. We will use a very lightweight Java image that contains only the JRE, based on Alpine. So it will be much lighter, it will be built faster, and it won't contain any unnecessary tools and test frameworks that may be exposing us to security vulnerabilities. So that's the main concept.

One more thing that I really encourage you to check out is this blog post that we created about Docker anti-patterns. We've been working here with Docker and containers for a long time, and we sometimes see users who started using Docker because everybody is doing it, or because they heard of the advantages that it has, but who are maybe not using it correctly. So there are a lot of dos and don'ts in this article, and one of the points here is exactly what we talked about. I think it's point number four or five: confusing images that are used for development with those that are used for deployment. The actual deployment image should contain only the compiled form plus runtime
dependencies and nothing else, really nothing else. It shouldn't contain any test frameworks, compilers, or anything like that that you use for development. So I really encourage you to check that out on the Codefresh blog; just go and Google "Docker anti-patterns."

So that's great. We talked about a Docker build and the problem with it, and how to solve it with multi-stage Docker builds. Now I want to introduce Codefresh and talk a little bit about how multi-stage Docker builds can help us as part of CI and CD processes. For those who don't know Codefresh: Codefresh is the first container-native CI/CD platform for microservices, designed for Kubernetes. What does container-native mean? If you look at this diagram, for example, it shows you a classic CI/CD flow where you start from maybe a commit and then go through a multi-stage build, you can build or install your Helm chart, then run some integration, performance, and security testing in parallel, then push your artifacts or images into some registry, and finally deploy them to production. In Codefresh, each of these steps will run inside a Docker image, so as a Docker container. So you get a segregated environment for every step, and you don't have to maintain build slaves and pre-install all of your tools on the slave. It really simplifies the CI/CD process, and it's also a very intuitive and friendly SaaS platform.

So we can go and see a demo in Codefresh of how to utilize multi-stage builds. Let's open our browser. Maybe some of you have actually seen this before: this is the Codefresh UI. Again, it's a SaaS platform; you can go and sign up for free and you get a free account. What you see first is your project view. You can create multiple projects, and each project can then contain multiple pipelines. I created this project, which is linked to the same Git repo that contains the project that we built and ran
locally on my terminal. So we can start with this pipeline, and this is how a pipeline looks in Codefresh. You see that it's built using a declarative YAML file. I can either build it here inline or import it from my Git repository. This simple pipeline contains only two steps. The first one is git clone; it will clone my Git repository, so I'm specifying the repo that I would like to clone and the branch. The second one is going to build our image. We created a library, a marketplace of steps that you can use, and a lot of them are around using Docker containers and things like that. So to build a Docker image you don't have to run those docker commands explicitly like I did locally on my terminal. I can just set the type of the step to build, choose the image name that I would like to get, and point to the Dockerfile that I have in my repository.

Once I have run this pipeline, I can look at the builds, which are the executions of this pipeline. We can look at one of them, for example, and see that the two steps were executed successfully, so I was able to clone my repo and then build my image. Here I can see the logs of my Docker build, and you can see that all of the layers were actually pulled from the cache, as Codefresh has a pretty unique caching mechanism that can really make your builds faster. The way that it works is that every pipeline gets a persistent volume, so once you run your pipeline for the first time, all of your project files will be automatically placed on this volume. It also utilizes the Docker cache, and it has out-of-the-box caching of the intermediate layers of the Docker image. So as long as I don't change my Dockerfile, it will just reuse all of my layers from the cache, and if I make only a small change at the end of the Dockerfile or something like that, it will still take all the existing layers from the cache and will only build the new ones.

Now if we go back to our project, we can
look at the second pipeline that I created here, and this is for the multi-stage build. You see that Codefresh has out-of-the-box support for multi-stage builds. All I had to do is simply specify the name of my Dockerfile, which is a multi-stage Dockerfile here, and other than that it's exactly the same build. If we look at this build, for example, we see that the whole execution was only 18 seconds, so it was much faster, and I can see here the logs of my multi-stage build, again utilizing the cache.

Now the cool thing about this is that you also get a free private Docker registry with every account, even with the free account. So once your images are built as part of your CI flow, they are also automatically pushed to your private Docker registry, and you can go to the images view and see your images here. Again I see that the my-go-app image, which is not utilizing multi-stage, is 646 megabytes, while the multi-stage one is only 5.37 megabytes. This is a full v2 registry, and I can even click on each of these images. The really cool part here is that I see all kinds of information about this image, like when it was created and whether it was promoted to another registry, and I can even go and push it from the UI to other registries that I have connected here to Codefresh. I can also see the Dockerfile itself, which, if you remember, is the first Dockerfile that we built locally, not utilizing multi-stage. I can see the logs, which show me the pipeline that built this Docker image, so I can see when it ran and the duration of this execution, and I can go to the actual pipeline and configuration for this image. Finally, I can even see the layers of this image. It very nicely gives me a visual representation of all of the layers that were created, and you can look at all of these layers, see the size of each of the layers, and you can expand
each of the statements here to see exactly what happened in each of your layers and understand how your image got this big. This image, if you remember, is 646 megabytes. Now if I go to the multi-stage image, I see the same information. I see the Dockerfile that is using the multi-stage build, so I have two FROM statements, and the actual image that was built, which is only five-something megabytes, started from scratch and only copied the binaries and runtime that were needed to run this application. If I look at the layers here, there are only three layers, and that's why it's so lightweight, so fast, and so much more secure.

About all of those labels, by the way, for those who are not familiar with Docker labels: a label is something that you can add to your Dockerfile, and it simply adds metadata to your image. Codefresh by default also adds things like the branch, so for this specific image, as it was built as part of a CI flow, I can know exactly which branch it came from, what the repo name is, who the owner of the repo is, what the commit SHA is, etc. And if you remember the Dockerfile that I showed you here, you can add your own labels. Here I added this label saying that this image is the second image, and that this image is the third image. So you can add your own labels, and then you'll see them here as well inside Codefresh.

So you saw these differences in the sizes in your CI and CD as well, and we saw the difference in the build sizes. Of course, if you then decide to deploy your image to Kubernetes, for example, or ECS or Fargate or any service like that, deploying a lighter image will be much faster, and it will be more secure, as it won't contain any unnecessary build tools, compilers, debuggers, test frameworks, etc. So that was a little bit about running multi-stage builds and the advantages of multi-stage builds as part of CI and CD, and one more thing
that I want to show you is utilizing the target parameter. What does this mean? There are some cases where you'll have a multi-stage build and you won't necessarily want to use only the final image. Think about this example: we started from this golang SDK image, which is big and contains all of my build tools, and the final image that we got was only this scratch one that only has the runtime and application. But what if I still want to use the image that was created here? Or what if I have more than two FROM statements, like here, and I decide I want to use this image? Docker also provided a solution for that: you can pass something called a target to your docker build command and then specify which stage you want to use. By default it will use the last stage, which is the final image. But let's say I want to get the second image. Then in Codefresh, you see that in the build Docker image step I can add a target parameter and specify the value of this target parameter. If we look again at my Dockerfile, I see that I named this second image second, so this would be my target.

So I can now go and say: OK, I want to clone my repo and build my image, but as a result of this Docker build I want to get the second stage and not the final stage. If I now look at the execution of this build, I see that the clone happened successfully. If I look at the logs of my Docker build, I see that it started from Alpine version 3.5 as base; that was the first command. Now it should add curl to this base image, so I see that it's adding curl. Now it's continuing, and it's running the second FROM statement, which is from Debian as the second image, and you see the image is up to date. It's running the echo hello, as my second image does, and finishes by adding the label, so it doesn't
go on and continue. The rest is just the label, so it doesn't go and build the other stages, since I stopped it at the second stage. I can also see a representation of that if I go to the images view and look at the multi-stage example. I see that this image is considerably big, a hundred and eight megabytes, as it was not the final, light image, and if I look at the layers I see that it has the second-image label, as this is the second image, the image that was produced from the second stage. So this is how you use that.

You can see a lot of information about all the options of the Docker build inside Codefresh, and again, examples for different programming languages, right here. If you look at our pipeline steps in the YAML, you can look at our build step, which is our native Docker build step, and it shows you all the usage: how to specify a target, how to add build arguments, and of course you can add any logical conditions for when you want to run this step, etc. You have the list of all the possible parameters here.

So about this target parameter: I guess in some cases people have their original image, let's say golang or Maven, etc., and they say, OK, I already have my build tools there, and maybe they added some test frameworks and debuggers and things like that, so maybe they would want to reuse this image for running tests or something like that. So they can run a Docker build, specifying the base image as the target, and then run their test frameworks, etc. You don't have to do that. I always believe in having exactly what you need for your specific action and nothing else. Usually you will have images that you use for CI, like images that are made for running tests, images that are made for security scans, and images that help you with
deployment, but these are not relevant to the production image. The production image should contain, again, only the application, the runtime, and the configuration that will let your actual application run.

So we saw the demo of multi-stage in CI/CD. Just to summarize before we go into the Q&A: we saw that using one Docker image for both the build and production results in slow deployments and a lot of security vulnerabilities, so this is definitely something that is not recommended. Again, we talked about the Docker anti-patterns blog post that gives you all the dos and don'ts. A multi-stage build produces a lean, secure, production-ready Docker image that is much more lightweight, much faster to build, much faster to deploy, and much more secure. And with Codefresh you get speedier builds thanks to the caching across all images and layers, so again, the basic Docker cache but also caching of the intermediate layers of the Docker image, with native support for multi-stage.

So thank you again very much. As I mentioned, you can sign up for a free account, and you literally get unlimited builds with the free account, so we don't limit you on the number of builds whatsoever. You can go and sign up for free and start running your pipelines. I think we'll go into the Q&A session right now.

Yeah, thank you so much, Guy. We do have a couple of questions, and just a reminder: if you have any questions that you haven't entered yet, please put them in the Q&A box, or if they were in the chat and haven't been answered yet, I want to make sure they don't get lost, so please add them using the Q&A button on your Zoom toolbar. So first of all we have a question from Jagr: we have a Docker
WebLogic image that is 1.5 gigabytes. Do multi-stage builds work for this?

So it can work; it definitely depends on what exactly the image does. If it's built similarly to what we've seen in the webinar, where you're starting from an image that has build tools, compilers, debuggers, test frameworks, and all this stuff, and then you're building your app and only at the end exposing its port and running it, then of course multi-stage can help you there, because you can just add another stage that will copy only what's really necessary from the build into a very lean, lightweight image like we've seen here, and that is what will run in production. This can definitely reduce the image size that you have in production.

Great, thank you. And then our next question is: is there a possibility to run similar Codefresh environments on premise?

Yes, that's a very good question. As we mentioned, Codefresh is a SaaS platform. We also have a full on-premise offering where you can install the whole Codefresh application in your own environment, on a Kubernetes cluster that can be behind the firewall and completely disconnected from the internet. But there is also a third solution, which is becoming very popular, which is the Codefresh Runner. Let's look at one of our pipelines that I showed you earlier: here in the settings you decide where this pipeline will run. Once you sign up to Codefresh, by default all of your pipelines and all of your builds will run in a cloud-hosted environment that we at Codefresh host, and it's a Linux environment. But if you have some internal assets that you need to access behind the firewall as part of your build, let's say you have an Artifactory registry behind the firewall or a Bitbucket server behind the firewall, then running those builds on a cloud-hosted environment and
outside of your network simply won't work, because they won't be able to connect to your internal assets. So you can install a Codefresh agent on your own cluster behind the firewall, like you see that I did here, and it will simply add another option here in the runtime environment. It will even let you manually decide what the resource allocation for your builds will be, because the builds will run in your environment. So this can be a good solution.

Awesome, thank you. And our next question: how can I do pod auto-scaling when its CPU is above 50%?

So this is a little bit off topic, but with Codefresh, by the way, you can look into this: we have a native deploy step. If you look at all of these steps here, there is also a deploy step, and if you're looking at auto-scaling and deploying to Kubernetes, this is also something that we support natively. If you have high CPU, that's something that we would need to look into in detail, but this is a little bit off topic, more about Kubernetes scaling.

OK, sounds good, thank you. So let's see our next question: while defining a standard base Docker image for an organization with all basic required artifacts, do you think we can leverage Docker multi-stage build features? Can you give us a use case on how this could be used when it is published for all IT teams to use?

Can you read the first part of the question again?

Sure: while defining a standard base Docker image for an organization with all basic required artifacts, do you think we can leverage the Docker multi-stage build feature?

So yeah, it really depends on what you want your end image to contain. If you have some image that you need to use across different IT teams, and you need this image to contain build tools, compilers, test frameworks, and so on, then obviously you need this stuff, and this stuff can be pretty heavy, and there is no escape from that. But if you're
only, if you currently have all those tools in your image but you don't really need them on the image, like if you use this image to deploy to production and it's just a web server or some application that runs and doesn't need all that extra tooling inside, then definitely you can use a multi-stage build. And of course, again, if you have some specific example that you want to show me, you can send me an email and we can look into exactly that example.

All right, thank you. And then I have one more question, so just a reminder: if you have any last questions, enter them into the Q&A box here. What limitations are part of the free version? So we do have unlimited builds available, but I don't know if you want to pull up our different pricing structures there real quickly, Guy.

Sure, that's a good question. As mentioned, with the free plan you get unlimited builds, and we actually changed our pricing pretty recently to make it very, very simple. So you see that every account gets unlimited builds, you get unlimited private repositories, and you get one small runtime machine. So basically there are two main factors that affect the price: the size of the machine that you're going to run on and the number of concurrent builds. We actually have those calculators here, so you can just play around with them and understand what your end price would be. But you get a small machine for free, which means that you can run builds on an environment that has one gigabyte of RAM and two cores of CPU. If you see that you need to run multiple builds in parallel, let's say you have a development team of 10 developers and you want to use git triggers so that every time a developer commits something to his repo or opens a pull request a build will run and trigger automatically, then you'll probably need more than one concurrent build, and this of course can affect the price, and also the size of the machine. So
let's say you want to run, and I can show you an example of that, multiple steps in parallel inside the same pipeline. Let's say you have an application that contains multiple microservices. By the way, Codefresh is also built on a microservice architecture and contains about 20 microservices. So let's say you want to build all of them in parallel; you can add a parallel step to your pipeline, and, just showing you a relevant example, for that you'll probably need a stronger machine. So here I'm running four Docker builds in parallel, right, and I want them to run fast, then I'm running a security scan on each of these images in parallel, then deploying to staging and finally to production. So for multiple steps in parallel you would probably need a stronger machine, and it even has some recommendation here: for up to four concurrent steps you can use the medium machine, and if you need more than that you can go with the large. We mentioned the hybrid solution earlier, which is installing this runner on your own environment behind the firewall, and that, along with other enterprise features, will be part of our enterprise packages. So if you need features like SSO or RBAC or SAML, behind-the-firewall support, on-premise git providers, stuff like that, or you need to use Codefresh on-premise, then this would be part of enterprise.

Awesome, well, thank you so much, Guy, and thank you everybody for attending today. You can look for our future webinars at codefresh.io/events, and we hope to see you at our future Codefresh Live events. Have a great day, everybody. Thank you. [Music]
Info
Channel: Codefresh
Views: 1,725
Rating: 5 out of 5
Keywords: Docker, DevOps, Webinar, Containers, CICD, Codefresh
Id: iUTTPfsclZU
Length: 51min 31sec (3091 seconds)
Published: Mon Sep 09 2019