Kubernetes hands on introduction (2023) | Amazon EKS Workshop

Captions
Here we go. All right, hello everyone and welcome to another episode of Containers from the Couch. Today we've got a special episode for you. If you'll recall, when we first started creating videos on Containers from the Couch, we created webinars for a workshop asset, eksworkshop.com, and today we're unveiling a completely redesigned version of that workshop. Justin and I are going to dive into it, go through some key design choices, things we've changed, and things that are brand new. And Justin, it's your first time looking at this workshop, right? Welcome.

Yeah, thanks. I'm very excited for this, because Containers from the Couch was originally built around the EKS Workshop. We were making video content specifically to show off the workshop, how it worked, and get people hands-on with EKS and Kubernetes: this is how we get started, this is the path we think you should take to learn this stuff. Shout out to Brent and Adam, who did pretty much all of those introduction videos three years ago now and maintained the workshop for so long. Sai has been working really hard with a whole group of people at AWS to build the new version of the workshop, which is better in many, many ways. I'm excited to jump into it, because I've specifically tried really hard not to look at any of the content. I wanted to come at this with fresh eyes and ask: what does this actually look like for someone who's brand new to Kubernetes, or brand new to this content, the workshop itself? Obviously I've been working with Kubernetes, but I don't know what's going on in the workshop, so I'm excited to dive into it with you. We're going to do pair programming today, where I'm the person running through the workshop but Sai is hands-on-keyboard actually running the commands, so I hope we can go back and forth between explaining what's in there and how it actually works when you apply it and run the commands.

Yep, absolutely. That's exactly what we're working with today: an all-new workshop. It's an evolution of a workshop we've had previously, but redesigned and rebuilt from the ground up. Before we get into the workshop itself and some of the key decisions we made in developing it, for folks tuning in, whether you're watching live or months after the fact, since this is the first workshop stream we're doing, it's worth talking a little bit about what Kubernetes is, what Amazon EKS is, and why we built an entire workshop experience around this service on AWS. Justin, I want to lean on you for this. I know it's a loaded question, but why, and how, do Kubernetes and Amazon EKS work?

Yeah, Kubernetes is very complex, and depending on who you ask and how they use it, they're going to give different answers. It's the same as asking "what is a car?" A car is a complex thing with a lot of different use cases: some people use it for transportation, some people use it to make money and transport goods, and there are all kinds of reasons you might use a car differently. To me, Kubernetes fundamentally boils down to a stateful REST API that is built for automation. The thing you want to do with this API is automation, and it comes with some container workload primitives built in.
It's extendable, and there are some things it isn't and was never intended to be, but Kubernetes boils down to a way that we can automate infrastructure and containerized workloads. It's been extended to do more, but the focus of Kubernetes was really around how we automate these containers. EKS, the Amazon-hosted Kubernetes service, is really about making managing Kubernetes easier: how do we take away a lot of the stuff that is hard in Kubernetes? Statefulness is really hard. There's a database behind it, that database is distributed, it needs high availability, it needs backups, it needs all of these things to scale automatically. EKS does that for you; that's a big goal of EKS. Sai and I are both on the EKS team, we work really closely with the PMs on the features coming out for EKS, and we love making that better for customers, because it takes away some of the challenges of running Kubernetes on your own. It's a fantastic, very well-designed API that's kind of hard to manage sometimes, and EKS is designed to take that management burden away from you: "I need a database expert, I need someone who knows how to scale an API, I need someone who does all these things." Instead, let us just let you use the API to deploy your workloads and make that side of it easier. Kubernetes comes with container primitives built in, it's extensible and was made to be extended in various ways, so it can be a lot of different things for people. EKS, running in Amazon, helps you run containers inside of AWS using AWS primitives such as load balancers, EC2 for compute, and EBS for storage. All of these things come together to make an actually useful product for people who just want to run a workload.

Perfect, and I honestly couldn't explain it any better than that. Justin, you talked a little bit about what Kubernetes is not, and that reminds me I should explain a little bit about what this new EKS Workshop isn't. I'll start off by saying it's not meant to be a basic tutorial on what containers and Kubernetes are. Don't get me wrong: if you have a fundamental understanding of the technology, this workshop is the perfect place to start, and, to borrow a term Justin recently coined, even if you're an AWS EKS expert this workshop is a perfect place to look into day-two operations: complex networking scenarios, IP exhaustion problems, security groups, really complicated use cases, alongside some fundamentals. The key is that we're really focusing on Amazon EKS and its ecosystem of open source tooling, capabilities, and all the integrations you can set up with it, not diving into what a container is or the fundamentals of Kubernetes. For that we have plenty of other resources through AWS training and curriculum, and the open source community has a lot of great content; we didn't feel the need to reinvent the wheel. With that, I'd love to share my screen, unveil this new EKS Workshop, and start actually going through the first module. You can see we've got a brand new UI. For folks looking for the old one, you can still find it at archive.eksworkshop.com.
Keep in mind, though, that the older workshop is deprecated as of today. Like we mentioned, it's basically Justin's first time looking at this content, and I really want to see how he experiences the workshop and the kinds of questions he has, because there are a lot of key choices we made throughout it that I'm curious to get his thoughts on. For example, we decided to use a tool called Kustomize to apply YAML, and we decided to use the same sample app throughout the entire workshop so users get consistency: they know the networking module where they set up custom networking is working on the same app as the Fargate profile that's used to deploy one portion of the application. By the way, that app is also open source and a great way to start learning how to use containers on AWS, whether it's EKS, App Runner, or ECS; there's a lot of different functionality there. But hey, Justin, what do you think? Let's go ahead and get started.

Yeah, let me post this in chat so people can see the example app. I already posted a link for the announcement blog; I didn't post a link yet for the actual workshop, and we should probably do that, since it's the easiest one for anyone getting started.

So like I said, folks, it is live; you can access it right now. One of the really cool things we did is probably one of the first things we'll talk about here. Once I dive into the workshop, the first thing people are going to want to understand is: how do I run it?

Make your screen bigger, I can't read it from your side. I have it up over here, but I want to see it larger on your side too.

Here we go; hopefully that's big enough. Okay, so there are probably two ways you're ever going to run the EKS Workshop. The first is at an AWS event, and if you do it that way it's really streamlined: you don't need your own AWS account, we use pre-baked accounts and load up the infrastructure for you. But for most people who find this digitally and want to go through it themselves, we have instructions to help you set it up in your own account. Justin, I want you to go through this first and tell me what you think about the decision we made here, because I know in the old workshop this was a major pain point for customers wanting to set up their own workshop cluster environment.

Yeah, and I think some of that frustration came from throwing a lot of new terms and tools at people right off the bat. They'd say "I want to come learn Kubernetes" and we'd say "hold on, you've got to learn this other stuff first," and sometimes that was difficult because we were using tools that weren't familiar to them. We'd say "here's a brand new tool like eksctl"; some people knew about it, some people didn't. We got a lot of feedback over the years that more and more people liked using Terraform. It was more familiar to them, the workflow was familiar, so they knew Git and they knew Terraform, but beyond that they were asking how to do these other things. The original workshop had a really good way to give people clean environments with Cloud9, giving them an IDE and a shell that let them just run with it, because when we originally had the workshop we had you run all the commands on your own computer, and there were just a lot of variables that couldn't be accounted for.
Everyone has a different computer, a different operating system, different commands installed, all that stuff. Giving you a clean environment in Cloud9 was a way to say: here's a shell that we know will work over and over again. Can you run this on your local machine? Probably, but we want to make sure we have a great experience for people getting started. So right off the bat, you should have an AWS account, you should be familiar with Git, and you should have Terraform installed, and I think that's a great place to start because it lowers some of the barrier of new tools. Hopefully you're familiar with at least a few of those things, and if not, we try to hide some of the details: you don't need crazy complex shell commands, you don't need to know everything about Terraform, and we're not going to ask you to modify the infrastructure. And the Terraform we're using is building on top of EKS Blueprints for Terraform, right, Sai?

That's right. The underlying automation isn't custom-baked; we're leveraging EKS Blueprints for Terraform, which is a great project that abstracts away the complexity of all the different dependencies an artifact might have. For example, when deploying an EKS cluster you of course have to consider things like VPCs and subnets; that's the obvious piece. But what about when you're trying to set up additional open source tooling, like installing Karpenter, an open source cluster autoscaler, into the cluster, and maybe the load balancer controller, some other EKS add-ons, and all the other things you'll need to create a production-ready environment? With EKS Blueprints for Terraform it's honestly as easy as setting a flag to true, and you'll see that if you go to the open source repository for the EKS Workshop. It might be worth pulling that up right now. If we take a look at the workshop repo (let me make it bigger for you), we'll see a terraform folder, and within it you basically run that one command, terraform apply with auto-approve, and it kicks off the Terraform and starts spinning up all of the dependencies. One thing I want to point out: if we go into the cluster module and open add-ons.tf, here's what I was talking about with that boolean. This EKS cluster comes with Karpenter, the Node Termination Handler for Spot, the load balancer controller, cluster autoscaler, Kubecost, and all of these other add-ons. Justin, I know you've had this experience: if you had to install these manually outside of Terraform, whether through ClickOps or whatever other automation tool you might use, it really doesn't get this easy, right?

Each of those is multiple Terraform files, for sure. You've got variables, you have to wire them together, you have all this stuff, and Blueprints gives you opinions on how some of that should be deployed based on what we're seeing from customers and how we engineered some of the tools to be used. So with something like ten lines of "true" you get a whole bunch of tools that are super valuable to have in any cluster, and getting all of that out of the box from one terraform apply is awesome.

Yeah, so here's the very first step, folks. Obviously I'm not going to do this live for you; it does take some time to spin up all the infrastructure, though I wouldn't say longer than an hour.
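For anyone following along at home, the self-guided provisioning flow described above boils down to roughly the steps below. This is a hedged sketch: the repository URL, folder names, and flags are my assumptions, so follow the workshop's own setup instructions for the authoritative commands.

```bash
# Rough sketch of the self-guided setup flow (paths and repo URL are assumptions).
# Remember: this provisions billable AWS resources in your account.
git clone https://github.com/aws-samples/eks-workshop-v2.git
cd eks-workshop-v2/terraform

terraform init
terraform apply -auto-approve    # spins up the VPC, EKS cluster, add-ons, and the Cloud9 IDE

# When you're done, destroy from the same directory so the Terraform state
# file (which is critical for cleanup) can tear down what it created.
terraform destroy -auto-approve
```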
I've already spun up that environment. One key thing this Terraform does for you: not only does it deploy the cluster, it also spins up that IDE. You mentioned this was something we'd added to the previous workshop as well; it's a Cloud9 IDE with all the tools pre-installed, so you don't have to worry about the variables of different versions, different binaries, or a different operating system. You get that dedicated Cloud9 environment. By the way, folks, two things I want to caution: this is going to cost money, so make sure you're managing your resources, and the Terraform state file is critical for the cleanup process. We give you instructions for cleaning it up, and then we point you to the next step, which is accessing the IDE.

Okay, let's take a quick look at what that IDE looks like. Here's the Cloud9 environment. Very quickly I'm going to run a few commands; that's way too small, so let me make it bigger. Let's run kubectl get pods across all namespaces. We can see we've got a number of pods up and running already. The cluster environment comes pre-provisioned with all of the add-ons we mentioned, so you're ready to hit the ground running, and that's what I want to do with Justin here today. If we make this a little bigger and bring up the side nav bar, I'd say it's probably a good time to jump into the fundamentals module.

Sounds good, let's do it.

Okay, so the way we've structured the workshop, we've got a number of high-level use cases which you can see along the top: fundamentals, autoscaling, observability, and so on, and each lab within a module is tagged with the word "Lab." One critical thing we've done with this workshop is that you can start at any lab at any point. You could spin up an environment and immediately jump to a random lab in the networking or security section. The reason we've done it this way is because, let's be honest, getting through this entire workshop will take you the better part of a day, and chances are folks don't have that much consecutive time to dedicate, so we want folks to be able to jump between different modules. Today we're going to start pretty much consecutively from the beginning, and Justin, since it's your first time going through this content, let's start with the managed node groups lab.

It's a great point about the labs, because we found the original workshop was designed to have people go top to bottom through everything to learn the entire set of what's going on, and the labs always built on top of each other. This workshop does build on itself, but it's meant to be a little more bite-sized: if you have the cluster running, you can focus on just load balancing, or just storage; you don't have to start from scratch every time. People come in wanting to learn different things at different times, and this workshop is designed around that: here's the piece we want you to learn in this module, let's start here and get running with it.

Correct. And to support that, you'll notice there's a command right at the start of each section.
It says "Before you start: prepare your environment for this section," and tells you to run a command, reset-environment. By the way, anytime a command like this pops up, you can click anywhere in the box and it copies the command to your clipboard, and you can paste it directly into Cloud9 to execute it. I'm not going to run it right now because the environment is already reset, and it does take a minute or two for that command to finish. Justin, let's start going through this first lab.

Okay, so we're starting with compute and managed node groups. We want to start with something every cluster needs: some way to run those workloads. Can you go back to your terminal and list your nodes? Nodes in Kubernetes are the computers where the workloads actually run, the things that actually run pods, orchestrated by Kubernetes on top of a cloud provider, or bare metal, wherever you're running this stuff. With EKS Anywhere, which isn't the focus of this workshop, you can even have nodes running in your own data center and Kubernetes can orchestrate those too. So we have three nodes here. I don't know how they got there, and I don't really care; what we're going to focus on is how to provision more compute with managed node groups. Managed node groups are the part of EKS that lets you get nodes without having to worry about them too much, without manual work for upgrades or for connecting them back to the cluster. They automatically give those nodes some variables when they start up: here's your cluster, here's your VPC, here's your region, go connect to the API server and get ready to run workloads. Back in the day, when you did this manually, it was a lot of steps; I wrote bash scripts for this, people wrote Ansible playbooks, all sorts of things to say "I need that node to talk to that cluster." Managed node groups take care of provisioning, and also de-provisioning and upgrades, which is a huge benefit. If you're just thinking "I need some compute, how do I get that into my cluster?" that's what we're focused on in this managed node group section.

That's right. In this very first module we want to get folks acquainted with how the environment works. One thing you'll notice is that we use different CLI tools throughout the workshop; for this particular command we're using eksctl, and the lab asks us to inspect the default managed node group that was provisioned for you. By the way, these three nodes are the result of two different managed node groups: two nodes in the first one and one in the second. So let's run that get nodegroup command. For the default managed node group the output is a little long, but we've got two nodes, and we can see the instance type and a few other details about the auto scaling group that was provisioned as part of that managed node group.

So that gives me compute that's already attached to the cluster, and I saw you had a minimum, or desired, of two, right?

Yep, min and desired are two, and the max size is set to six.
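For reference, the inspection step just described looks roughly like the commands below. The cluster and node group names are assumptions, so list the groups first and substitute whatever your environment actually uses.

```bash
# Hedged sketch: inspecting the default managed node group with eksctl.
# "eks-workshop" and "default" are placeholders, not the workshop's exact values.
eksctl get nodegroup --cluster eks-workshop

# Show full details (instance type, min/max/desired sizes, ASG) for one group
eksctl get nodegroup --cluster eks-workshop --name default -o yaml
```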
And all that's saying, for anyone not familiar with node groups, is that this is a managed auto scaling group inside of AWS that gets some settings applied to it. Auto scaling groups define a range of how many machines you want provisioned the same way. In this case we're saying we want a minimum of two, so never go below two, a maximum of six, and we want two right now. You can set those sliders any way you want for your cluster: you could have one and a hundred, whatever you want. A lot of the time it comes down to availability and budgeting: if I drop below two instances I don't have any availability for my workloads and my cluster might start failing; if I go over six I may be spending too much money or just wasting compute I don't actually need. So this managed node group is kind of a manual slider where you say "this is how much I think I need for my requirements." Some people scale to zero and say "I don't need any nodes when nothing's running, and I want to scale up to a thousand when I run jobs"; it depends on your workloads. In this case, with eksctl we're just looking at the node group and seeing what's in it. It's a managed node group, so if we upgrade the version of EKS we can also upgrade the managed node group and it automatically rolls out a new version to the cluster.

Got it. We got a pretty good question in the chat: do you have cluster autoscaler built in with Blueprints? I'm guessing you mean with the Terraform EKS Blueprints we're using, and yes, we've got cluster autoscaler installed on this cluster. Justin, I have a feeling I know why they're asking: since you're talking about auto scaling groups, why is cluster autoscaler important?

With managed node groups we tell it our minimum and our maximum, but there's nothing that actually changes how many nodes are in that managed node group. The cluster autoscaler is responsible for doing that. When a workload comes in, when a pod says "hey, I can't be scheduled anywhere because there's not enough compute, or nothing meets my requirements," the cluster autoscaler looks at that and says "we need to add another node to this node group," and it just increases the desired count: we go from two to three, in this case. So if we needed to scale up, the cluster autoscaler would automatically add a node, we'd have three nodes available, the workload gets scheduled, and everything's happy again. Cluster autoscaler scales up, but it also scales down: if your workloads scale down, it can look at the nodes and say "this isn't being used as much as I expected, I'm going to remove a node," so your costs stay closer to what you actually need to run in the cluster.

Yep. So, let me clear this out. What we're going to do in this very first lab is fairly straightforward: we want to show how you can add nodes to a managed node group. For the most part in this workshop we focus on the CLI commands, but at certain points we do want to call out the UI experience.
If you're working with the cluster for the first time and going through the ClickOps experience of the AWS console, we give you a link here where you can go and bump up the number of nodes using the UI, but for the most part we want to make sure you get the experience of doing the same thing on the CLI. So here we see a couple of commands that let you modify the scale of the node group. Let's run the first one, which just gets the node group again like we did before, and then the second command scales the managed node group: we're passing in nodes 3, minimum 3, and maximum 6, so we're bumping the minimum and the desired up to three and keeping the maximum at six.

In this case we're using the Sai autoscaler: you are the autoscaler. This is how we used to do it back in the day, right? If I had a workload and I needed a new server, or I needed to add a VM to something, someone would manually go and say "okay, I created a new machine, now we can deploy the application." The cluster autoscaler will do that for you automatically; in this case I ran the command manually, but it's the same APIs the cluster autoscaler hits.

Exactly. And that was pretty quick: we got a message back saying the node group has been successfully scaled. Something that's pretty common in this workshop: anytime you do something, we make sure you confirm it actually happened. This tends to take some time, but if we get the node group we now see it updated to 3, 6, 3 for min, max, and desired, up from 2, 6, 2, and then we run the actual get nodes command against Kubernetes. This will take a little while; I've found the node doesn't pop up immediately. Whoops, didn't mean to run that.

It's not going to do anything, right?

Right, I just re-ran the command to scale up the node group, which is harmless, but given two to three minutes we'll see the node get registered into the Kubernetes cluster, and it will become ready a few seconds after that.

A lot of this provisioning time depends on what's in your OS image and how fast it can come up. If you're using one of the default images it shouldn't take more than a few minutes. We've seen plenty of people, and I've done it myself in the past, heavily customize their OS image to do a million things like yum updates, and if you're running yum update when you're provisioning an instance, that means you're pulling down all the packages, figuring out what needs to be installed, and updating it all before the node is ready to be used inside the cluster. So keep that provisioning step as small as possible to make this happen as fast as possible.

That's right. We've got a question in chat: is the minimum one or zero?

Good question. You can actually scale managed node groups down to zero. That's a newer feature for cluster autoscaler, so you need a newer version of EKS and a newer version of cluster autoscaler; I think 1.24 and above is where that landed. But yeah, if you delete all of your workloads, cluster autoscaler should scale you down to zero, and when you deploy new workloads it scales the managed node group back up.
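The scale-up and verification steps from this lab boil down to a handful of commands, roughly like the following sketch. The cluster name, node group name, and label selector are assumptions standing in for the workshop's exact values, so copy the commands from the workshop page rather than these.

```bash
# Hedged sketch of the manual scale-up flow; names and labels are placeholders.
eksctl scale nodegroup --cluster eks-workshop --name default \
  --nodes 3 --nodes-min 3 --nodes-max 6

# Confirm the node group now reports min/max/desired of 3/6/3
eksctl get nodegroup --cluster eks-workshop --name default

# Watch for the new node to register, then wait for it to become Ready
kubectl get nodes -l eks.amazonaws.com/nodegroup=default --watch
kubectl wait --for=condition=Ready nodes -l eks.amazonaws.com/nodegroup=default --timeout=300s
```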
That's right. I've been running the get nodes command with --watch, and we can see a NotReady node has come up, so let's run get nodes again. By the way, we're passing a label selector to make sure we're only getting the nodes in the default managed node group, the one we edited. We can see the new node is in a NotReady state and came up 37 seconds ago. We also give you a command if you want to wait for it: you can run this kubectl wait command, and as soon as it comes up... actually, it just came back, so boom, the worker node is up and Ready. Okay, Justin, I think that about covers this piece of the managed node group labs.

Quick clarification for a question in chat: kubectl versus eksctl. eksctl is designed to manage the infrastructure for your EKS cluster, and kubectl is designed to manage the workloads in the cluster. So eksctl manages the EKS service, node groups, that sort of stuff, and kubectl manages workloads and things inside the cluster.

You got it. Next, I want to go into the pod affinity and anti-affinity module. I think this is a really interesting one that shows how you can use affinity rules to define how pods are deployed across managed node groups, and it's a particularly interesting example because sometimes you want workloads to be co-located with one another. I think this is also a good way for us to get into the sample application we've got here. Very quickly, I want to bring up the GitHub repo for the sample app, which I think we shared a link for somewhere back in the chat. The reason I want to bring it up is to show folks the architecture of this application: it's a microservices-architected retail store, and there are a number of components behind the scenes that power it. We use the same sample app across the entire workshop. In the affinity module we're going to be playing with the checkout service, and the checkout service uses Redis behind the scenes. Critically, for high availability we want one instance of checkout on each of the nodes, and we want Redis deployed right next to the checkout pod. So what we want is one checkout pod per node and one Redis pod per node; that way each checkout has its Redis instance right next to it on the same node, which reduces overall latency, and availability improves because we have multiple instances of checkout. I think it's a pretty cool use case, and that's what we're doing with this first lab to demonstrate these affinity and anti-affinity rules.

I'll also point out that affinity and anti-affinity aren't just at the node level; they also work per availability zone, per node group, anything that can be matched. If a label designates a set of nodes, I can say I don't ever want any other workloads to run on those, or I don't ever want to run on this type of machine, or I do want to run on this type of machine. There are lots of things you can do with these rules; this is one way of showing that. And I like that you show Kustomize in here. I think I might have jumped ahead, though; let's go ahead and start where you were at.

Yep, let's jump through the instructions step by step. We'll clear the terminal and get the pods in the checkout namespace.
By the way, if you do a get namespaces you'll see we've got a different namespace for each of those components: carts, catalog, checkout, and obviously a lot of other namespaces for the other things we've installed, but the components are chunked out fairly nicely. Right now we've only got one instance each of the checkout pod and the checkout-redis pod. If we run this command, which uses a pretty cool JSONPath expression to manipulate the kubectl output, it shows that the checkout and Redis instances are actually running on different nodes, and that's not ideal. Now, they very likely could have landed on the same node; I got lucky today for our demo and they're on different ones, which proves my point a little better. But I don't like the randomness, and we want to use affinity and anti-affinity rules to set some structure for how these pieces get deployed.

And this brings us to what I think you were about to bring up, Justin: our first YAML configuration. Of course, everything in Kubernetes is managed by declarative YAML, which is notoriously error-prone in workshop environments when folks are editing things and might be editing the wrong place, so we use Kustomize to help streamline what these changes should look like. Here in the very first Kustomize patch (maybe I made it a little too big), the YAML is pretty long, and I just want to know exactly what we're changing, so you can go to the diff tab and it tells you: what we're adding to the checkout deployment is an affinity rule and an anti-affinity rule. To break it down quickly, we're giving the checkout pod an affinity with a match expression on the key/value pair for Redis, and an anti-affinity on the key/value pair for the checkout service. Essentially this says: don't deploy checkout anywhere the checkout service has already been deployed (that's an anti-affinity on itself, and if you go to the checkout deployment you can see that label), and only deploy checkout where Redis is running. If the checkout pod already exists on a node, another one won't be scheduled there; if it doesn't exist on that node, it can be. It's a fairly straightforward YAML change and we'll apply it; the -k flag is Kustomize, so kubectl apply -k applies this overlay on top of the original configuration.

The thing I also think is really cool is that the affinity rule uses "required during scheduling," so if it cannot be fulfilled, say there's no node that has Redis running and doesn't already have the service running, the pod will not schedule. The scheduler looks at the rule and says "I can't provide both of the things you want." That's where something like cluster autoscaler comes in: it sees the unschedulable pod and says "I'm going to create a new node and see if I can land that pod there and meet the requirement," and it might need to reschedule things to satisfy it. So cluster autoscaler would add a new node in this case, if the pod says "I can't be scheduled anywhere, the compute doesn't exist for me." So there are situations you can get into where you say "I made these rules, but nothing's happening, my workload isn't deploying," and you have to check whether you actually have the capability of scheduling that pod somewhere.
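For readers who want to picture what a patch along these lines looks like, here is a minimal sketch. The label keys, values, and names are assumptions for illustration only; they won't match the workshop's actual overlay, which you should use instead.

```bash
# Illustrative only: an affinity/anti-affinity patch in the spirit of this lab.
# All names and label keys/values below are placeholders.
cat <<'EOF' > checkout-affinity.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
  namespace: checkout
spec:
  template:
    spec:
      affinity:
        podAffinity:                    # only schedule where a redis pod already runs
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/component: redis
              topologyKey: kubernetes.io/hostname
        podAntiAffinity:                # never co-locate two checkout service pods
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/component: service
              topologyKey: kubernetes.io/hostname
EOF

# The workshop applies its version of this change as a Kustomize overlay:
#   kubectl apply -k <path-to-overlay>
```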
Absolutely. And like Justin mentioned, Kustomize is actually built into the kubectl CLI. It's an open source tool; I think it started separately but was eventually integrated into kubectl itself, so everyone has access to these Kustomize overlays. So we applied that YAML change, and we'll see that everything else in checkout handled by that overlay was unchanged; we only configured the checkout deployment. In the next step we scale checkout out to two replicas, and then, well, let's just take a look, because we want two pods but the second checkout pod has not been deployed yet. That's because it has an affinity rule to only deploy once Redis has been deployed on that node. This makes sense, because traditionally you don't want the app to come up before its database; it makes more sense to start the database and, once it's up and running, have the app pod start alongside it. So that checkout pod is currently pending; let's make it a little clearer and do the standard get pods, and there we go, it's been pending for 47 seconds. Let me make sure I didn't forget any commands.

All right, in the next YAML customization I'm going to create an anti-affinity rule on the Redis pod that matches itself, so it simply says "don't deploy two instances of Redis on the same node." We'll apply that, which should be fairly straightforward since it's not actually changing anything running yet, and we can check the rollout status. Then we scale Redis out to two instances, and this kicks off a cascading effect: the second Redis pod gets started on a different node, and then the checkout pod that was pending says "hey, I can actually be scheduled now, the Redis pod came up." So if we run get pods, boom, everything got started, and I think that's the end of this particular lab. Essentially, if we run this command, we now see one checkout instance running on one node, another on the second node, and a Redis instance alongside each of those checkout pods. In fact, since we have three nodes, because we bumped up the node count in the previous lab, we can scale it out to three.

With the anti-affinity for the service in place, right?

Yes, so the third one will be pending, and to fix that we also need to scale up the checkout-redis deployment.
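The scale-out and placement check from this lab look roughly like the commands below. The deployment names, namespace, and JSONPath expression are assumptions standing in for the workshop's exact commands.

```bash
# Hedged sketch: names and the JSONPath are illustrative, not the workshop's exact ones.
kubectl scale deployment/checkout -n checkout --replicas=2
kubectl scale deployment/checkout-redis -n checkout --replicas=2

# Show which node each pod landed on; with the affinity rules in place,
# each checkout pod should share a node with exactly one redis pod.
kubectl get pods -n checkout -o wide
kubectl get pods -n checkout \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.nodeName}{"\n"}{end}'
```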
I'm going off the workshop script here, but I was curious: the service, the workload we don't want checkout running alongside, is that on a different node, a different node group, or on one of those three nodes inside the managed node group?

They're all going to be in the same managed node group. I think the anti-affinity you're talking about is just that we don't want to spin up a checkout pod on a node that doesn't have the analogous Redis pod also running; as soon as the Redis pod comes up on a node, we get the checkout pod there as well. That's what we saw here: it was pending, and as soon as we scaled the Redis pods to three instances and the third node got its Redis instance, the checkout pod got scheduled.

Actually, a great question here in chat: would it be better to run Redis as an init container for checkout, so the pod has two containers and Redis runs with it, if we wanted them co-located like that every time?

Yeah, that's another great approach for basically solving the same thing. In this particular lab we wanted to demonstrate the anti-affinity and affinity rules, but that's a great point.

One of the downsides of that approach is that it ties the lifecycle of both containers together: if I need to deploy a new Redis container or a new checkout container, I have to deploy both. Whereas Redis is stateful, depending on how I'm using it; I might have it as a cache and want to keep that cache, so I'd want to be able to deploy only checkout or only Redis depending on which one I'm actually updating. So there are pros and cons, and doing it with an affinity rule might be the better solution just because of that coupling of things always shipping together and always starting from a clean state. And a correction: I misspoke earlier, you can order init containers, they do run in order; it's sidecars you can't order, because with multiple long-running containers in a pod I can't sequence them all together.

That's right, and as Rafa in chat points out, the other issue is that you can't start them in parallel; init containers run one after another.

That's right. Okay, so I think this showcases why we chose Kustomize, and I'm glad we went through it, because it shows exactly where Kustomize is useful. You have to admit, when you start making these deep changes in YAML, nesting changes multiple levels down, YAML is notoriously strict, which is a good thing when working with configuration but a bad thing when you're in a workshop environment and you just want to update a YAML file. By the way, if you want to make that change manually, we don't stop you; you're free to navigate to that folder, change the YAML yourself, and apply it. We just give you Kustomize as a streamlined option.

And Kustomize is great at repeatable manipulation. If I need to do something over and over again, or I want to share this configuration, I can say "here's the patch you need" instead of "go edit that file," because you might end up with spacing issues, and we don't want you troubleshooting editor problems right now; we want you learning what's actually happening in the cluster. That's why Kustomize is a great fit here: here's the spec we need you to patch, here's the deployment that already exists in a generic sense, and here's how we want you to override it, one step at a time, repeatable over and over.
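To make that concrete, a Kustomize overlay in this spirit is just a small kustomization file that points at the base manifests and at a patch like the one sketched earlier. The paths and file names here are illustrative assumptions, not the workshop's actual layout.

```bash
# Illustrative overlay layout; paths and file names are assumptions.
cat <<'EOF' > kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                       # the unmodified application manifests
patches:
  - path: checkout-affinity.yaml     # only the change this lab cares about
EOF

# Kustomize is built into kubectl, so applying the overlay is one command:
kubectl apply -k .
```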
And you don't have to worry about how to exit Vim; that's not a problem we're dealing with right now. It's just: here's your patch, and here's the YAML.

Yep. All right folks, I think that's about as much time as I want to spend on managed node groups. We could also go into taints, Spot instances, and upgrading AMIs, but we'll leave that as an exercise for you. Next I want to get into Fargate. But look, we got halfway through a module, and we don't necessarily want to walk back through the steps to clean up everything we did, so here's a perfect case for why we need something like reset-environment. Say I got bored of the managed node group module and want to go somewhere else: I can simply run reset-environment, and in about a minute or two it resets my cluster environment as if I had just started the workshop. This is one of my favorite things about the workshop: you don't have to go through it consecutively, and you can easily get a fresh environment to work with.

And while that's running, reset-environment is doing the cluster too; it's resetting all of the workloads and managed node groups to how they were initially provisioned.

Exactly. The node group goes back to its original size, and the configuration changes get reverted. By the way, this is another cool thing about using Kustomize: we made all these changes, but to get back to the original configuration we just apply the original Kustomize folder of YAML and all the changes we made are erased. You can imagine in production this is a great paradigm to have access to, whether you're deploying to dev, test, and prod environments, rolling back, or rolling out a new version; having that easy ability to reset config is honestly really awesome. We do expect people to use the EKS Workshop repo to help lay out their own production-grade workloads. Of course, the workshop out of the box is not production-ready, but we do follow best practices and we expect folks to be able to use it as an example. That being said, we can see it's asking the EKS node group to go back to its initial sizing; it's going to go through all those steps to reset our environment. But hey, Justin, we're on to the next piece here. You gave us great explanations of Kubernetes, EKS, and managed node groups; I'm going to lean on you again: what's Fargate?

Go down to that diagram, because that's a great one, and we can start with it. We already know that inside our cluster we need some compute available to run the pods; there has to be somewhere these things run. We had managed node groups that auto scale: "I need four identical nodes that look this way, they're going to run pods, and I can match workloads onto them based on affinity rules," all that sort of stuff. Those are just standard VMs, and in Kubernetes they typically run multiple workloads per VM, where a VM is large enough to handle maybe five, six, or a hundred containers, and we say "just run them all together," because I can have fewer VMs to manage with more workloads on them. That can be optimal if you're running the same type of workload and want to scale that way, and people have been doing it for a long time, even before Kubernetes; auto scaling groups have existed for years to let people scale their workloads in this cookie-cutter, one-looks-the-same-as-another way.
Fargate is essentially a serverless container VM: a managed compute layer that lets you run workloads without needing to manage the OS or its updates. What you get out of Fargate is a one-to-one matching of a pod to a VM, which lets you scale horizontally: the number of pods you have is the number of VMs you get, if you're running them all on Fargate, and you don't worry about anything under the covers below your workload; that's all managed for you. You just say "I need this to run somewhere, create it for me, I don't really care, just give me a computer and this will run." That's where Fargate really shines. When we look at serverless for functions with Lambda, you don't care about anything below your code: you deploy a Lambda function and say "run this whenever I get a request." That's event-based architecture. Fargate is container-based serverless infrastructure where the container is continually running, not event-based like Lambda; the container is always running, so it's always ready to receive requests, it's always available inside the cluster, I can query it, I can get its logs. It's one instance of the container per Fargate node, versus Lambda where a hundred requests can mean a hundred separate Lambda invocations. Both are great options for running workloads, but for containerized workloads with Kubernetes and EKS, Fargate is the serverless option. A lot of people think EKS and Fargate are different things; that's not the case. EKS can use Fargate or EC2 to run the workloads, and the biggest difference is how much of it you need to own: how much tuning and configuration you want determines whether you should be using Fargate or EC2.

We got a question in the chat: does Fargate automatically scale without the need to define auto scaling groups? Yes, absolutely, and I think that's one of the big advantages for certain workloads. Like Justin was saying, you don't really care about the underlying details of exactly how the instance is set up, whether it's GPU-enabled or whatever it might be, or the auto scaling group specifics. There are certain workloads where you just say: manage the scale of this for me, manage the deployment, manage the instances, and if traffic exceeds my limits, scale it up and scale it back down automatically when I don't need it anymore. Of course these are all things you can configure when working with managed node groups; Fargate simply streamlines some of those decisions, lowering your overall cognitive load when you're managing a Kubernetes cluster. And there are some workloads that are really great for Fargate. For example, when you're working with Karpenter there's a controller that manages the scale of your cluster: it looks at all of your other nodes, all your unschedulable pods, and what your cluster layout looks like, so you never want that controller to go down or disappear, whether because you accidentally removed a managed node group or you're doing some sort of AMI update.
Whatever it might be, you want that controller to always be running, and that's a great example of when you would want to use Fargate.

Sorry, go ahead. I'm positive we'll get into how we're going to provision that, but there are some downsides to Fargate that I want to point out right now. Things like DaemonSets aren't supported, because it's a one-to-one pod matching: every pod gets a Fargate node. You don't have to worry about auto scaling groups or the cluster autoscaler, because when you scale up your pods you automatically get Fargate nodes, but that means if I need to run a DaemonSet on everything, the DaemonSet isn't going to be triggered for those Fargate nodes, because they're only for running your workload pods. Fargate also doesn't have as much variety as EC2 out of the box: EC2 has literally hundreds of instance types you can pick and choose from. Do you want Graviton Arm processors, do you want GPUs, do you want both, do you want EBS optimization? Fargate simplifies a lot of that decision process, but it also limits some of the things you can run. So there are some great workloads that run on Fargate, and it lowers that barrier, that cognitive load, for you, but when you say "I need to tune these things, I need to make these changes, I need to own part of this infrastructure," that's when you should probably switch to EC2 and say "this is the thing I know I need for my workload."

Yep. So the workshop goes through the process of enabling Fargate and then doing some activities with it, and I think it's worth going through the first one. I do want to point out how Fargate works: you create a Fargate profile, and within it you create a selector, and that selector tells it which pods it should consider for running on Fargate. I'm going to sit down for Fargate, because now I'm just getting too excited and I need to sit down. Now, very recently we released a new feature for Fargate: we now let you select namespaces with a wildcard. Before, every Fargate profile had to select a specific namespace, and with only five selectors per profile some customers ended up having to create way too many Fargate profiles, because some companies use namespaces in a different way, creating them on the fly or dynamically. So we now support wildcards in the namespace selector. Actually, I notice we don't have that documented in the workshop; I'll open an issue and we'll get it added. That reminds me, by the way: for folks watching right now who want to contribute to the workshop, we've established contributing guidelines and we're going to have a monthly community meeting. We really want to accept contributions from the community, from partners, and from the open source ecosystem. I will say we're going to restrict certain submissions; we want to fit with our open source guidelines and stick with partners that have an open source component in their solutions and tooling. But we're happy to accept and evaluate PRs, so if you're interested in contributing modules to the workshop, it's a great place to get started.
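Going back to the Fargate profile and its selector for a moment, here is a hedged sketch of what creating one with eksctl can look like. The cluster, profile, namespace, and label values are all placeholders, and the workshop's own lab may provision its profile a different way.

```bash
# Hedged sketch: a Fargate profile whose selector matches pods labeled
# fargate=yes in the checkout namespace. All names and labels are placeholders.
eksctl create fargateprofile \
  --cluster eks-workshop \
  --name checkout-profile \
  --namespace checkout \
  --labels fargate=yes

# List profiles and their selectors to confirm
eksctl get fargateprofile --cluster eks-workshop -o yaml
```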
Fargate profiles, when I was getting started, were confusing to me because I didn't know how they were implemented. Now I understand that Fargate profiles work through an admission webhook: there's a webhook inside of Kubernetes, it's not an abstract thing or magic behind the scenes. The webhook looks at the workload when you submit it and checks what you have defined in your Fargate profiles. We have these matchers that say "if it's in this namespace, or if it has this label, whatever it is, we want to match that," and then the webhook applies, essentially, a node selector that says "you need to schedule on a Fargate node, and we're going to provision that for you." That makes sure the scheduler doesn't just send the pod over to the existing nodes, because if I send a new workload into the cluster, the scheduler would say "oh, I have availability, let's put it on this node." With Fargate profiles we make sure it doesn't land on non-Fargate nodes. So it's just a webhook inside the cluster, and that was really confusing to me when it felt like magic; I didn't know how my workload ended up on that node and not somewhere else. But if we look at how Kubernetes works with webhooks and with applying label selectors and taints, we can see how that logically progresses into "I can make any workload that comes in with this namespace, or this label match, land on a specific node." We just covered that, and you can do it with Fargate as well; Fargate just happens to be fully managed AWS infrastructure.

That's right. And we can see that the profile we've created here has a label selector, so any pods with the label fargate: yes are going to get picked to run on Fargate. Now let's quickly check: why isn't the checkout service already running on Fargate? We'll see that it doesn't have the fargate: yes label, so we're going to use our favorite, Kustomize, to add it. By the way, I love this so much: you can see the full view of the file, and you can also just click the diff and see that, look, we're only adding this one line. We'll apply it and check the rollout status. This should be fairly quick; we're just waiting for the deployment to restart with the label. It's pending termination, so I guess not instant.

And what's happening right now is that we applied the patch, so Kubernetes sees a new object, and that object goes through the whole workflow of webhooks and evaluation of where it should go, because we've modified it and sent it back to the API. In this case the webhook says "we have a new label, that label matches, we want Fargate," so it's going to provision those nodes. The Fargate nodes are actually coming up to run the workload based on that new label match. So it goes through the entire workflow of simply running kubectl apply, but in this case there's a specific node selector that says we also need a Fargate node, so one gets created in the process.

That's right. I'm not sure why it's getting hung up on terminating; let's run that one more time and see what's going on. Okay, get pods in checkout: the container is creating. Let's check the events: no errors yet, so I think it's just waiting for Fargate to spin up the node.
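A quick way to see whether that Fargate capacity has shown up is to look for the Fargate-flavored nodes and check which node the pod landed on. This is a hedged sketch: the namespace is a placeholder, and the compute-type label shown is the one EKS Fargate nodes typically carry.

```bash
# Hedged sketch: verify Fargate capacity and pod placement.
kubectl get nodes -l eks.amazonaws.com/compute-type=fargate   # Fargate nodes appear here
kubectl get pods -n checkout -o wide                          # NODE column should show a fargate-ip-... node
```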
Nodes — sorry, what was that? Check the nodes: do we have a Fargate node? We do. I don't have that alias — oh, we need to add that alias; that's a PR if someone wants to open one against our IDE environment, there needs to be a k alias. Yeah, it gets me every time. Okay, so the Fargate node does exist. Let's see... all right, it just took a little while, but we've successfully rolled it out. So let's run the get pods again, and let's actually describe this pod.

Did your checkout service have a PodDisruptionBudget? That's probably why it would have taken a long time. I'm fairly certain we don't have one of those. So, scrolling up here, I just want to see what node it's running on... okay, we do know it's been triggered by the fargate-scheduler, so I can assume — oh, here we go, fargate-ip-..., and that's how we basically confirm that the checkout pod is running on Fargate now. A couple more things we do in this particular lab are related to understanding the labels that define a Fargate workload and resource allocation — we go through how to change what the resource limits for these pods should be — and then how you actually scale a workload on Fargate. We've got about 30 minutes left here, Justin, so it's probably worth jumping to the next part of the fundamentals module — I do want to try to make it through at least all of the high-level labs in here. Yeah, we need to expose some of these services. Let's do it.

So right now we're not exposing the UI application, and we can see that if we do kubectl get pods in the ui namespace and then describe the pod that's running — actually, it probably makes more sense to describe the service. Yeah, look at the service. Taking a look at the service, we can see that it's of type ClusterIP. Essentially, that means it only has an IP address for internal access: if you're within the Kubernetes network you can curl that IP and reach the UI, but of course that's not good for our users — our users need to be able to access the application. And believe it or not, there are quite a few ways to do this. In this workshop we go through what we believe are the best practices when working within an AWS environment: you're running in a VPC with these subnets, so what's the best way to expose our application to the outside world? That begins with the AWS Load Balancer Controller. Justin, I'll lean on you again here.
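As a quick sketch of the inspection just described — assuming the service and namespace are both named ui and the Service listens on port 80 — you can confirm it is internal-only before wiring up a load balancer:

```bash
# The UI service only has a ClusterIP, so it is reachable inside the cluster
# but has no external address:
kubectl get service ui -n ui
kubectl describe service ui -n ui | grep -E 'Type|IP'

# From inside the cluster a throwaway pod can reach it; from the internet the
# ClusterIP is simply unreachable:
kubectl run curl-test -n ui --rm -i --restart=Never \
  --image=curlimages/curl --command -- \
  curl -s -o /dev/null -w '%{http_code}\n' http://ui
```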
I was replying to some of our awesome questions in chat. So, the Load Balancer Controller. By default, Kubernetes has Services, and Services are the only thing you actually expose in Kubernetes — you don't expose pods directly. You have to have a Service, which is an abstraction around multiple pods that get matched to that Service, and then we have all these endpoints. Think of a typical load balancer: there's one load balancer and a bunch of endpoints underneath it — different servers, different port numbers, different somethings behind the load balancer — and it balances across the instances behind it. In the case of pods, we have multiple pods, and a load balancer just balances that load between all those endpoints. The AWS Load Balancer Controller gives you AWS load balancers — the managed load balancers in AWS — that align to your pods or Services and do the same thing. Services in Kubernetes kind of act as a load balancer, but it's just a software load balancer in DNS; it's not actually a physical thing that exposes anything. With the ClusterIP that PSY pointed out before, the service gets an IP, and inside the cluster things can talk to each other. From outside the cluster I need something that says hey, I'm exposing this DNS endpoint and I will balance load between all these pods behind the scenes. So we need a Service and we need a load balancer to actually do that exposing.

Now, Kubernetes does a lot of this for us — when we set it up it'll do everything for us — but the idea is we've got a lot of different nodes, and somewhere on all those nodes our service is running. The service has its own port that it's listening on, the node has its own IP address, and now what we're doing is setting up that other load balancer outside all of this to route to it. Kubernetes really does help us out here: the concept of a Service wraps all of the pods that the service is responsible for and does roughly round-robin load balancing — any request that comes in from that ELB hits our Kubernetes service, which then distributes it to one of the pods representing that workload. In this case we only have one instance of the UI running. It's got a ClusterIP, which is great, but I can't curl it from here and it does not have an external IP, so that's what we want to fix right now. Let's go ahead and get into it — I'm not going to run the reset-environment step; I don't think there's anything critical that we've changed in this namespace, so it shouldn't be a problem.

Let's take a look. What we're saying here is that we want to use the type LoadBalancer to expose this application — I see a typo, actually, but it's not in the Kustomize config, so we should be okay. We've already installed the Load Balancer Controller add-on, so let's take a look at our microservices. The first command asks us to look at all of the services we have and make sure they're not externally accessible — and they're not, but they are listening on ports internally — and notably we want to make the UI component accessible. Scrolling down here, we already did this: describing the UI service, we can see it's a ClusterIP running on IPv4. Let's go ahead and create the load balancer now. The first thing we'll do is apply this nlb.yaml — this Service — which is essentially going to expose the UI through a load balancer instead of just a ClusterIP. So we'll run that Kustomize apply and then run that exact same describe on the UI service again — sorry, let's see... here we go — sorry, we just created the NLB service; we haven't modified the existing UI service at all. With this ui-nlb, instead of changing that same service, we created a second one, and now we can see that we've got an external IP.
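Here is a hedged sketch of the kind of Service the workshop's kustomization creates at this point: a second Service for the UI, of type LoadBalancer, annotated so the AWS Load Balancer Controller provisions an internet-facing NLB in instance mode. The service name, namespace, ports, and selector labels are illustrative stand-ins, not the workshop's exact manifest.

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: ui-nlb
  namespace: ui
  annotations:
    # These annotations hand the Service to the AWS Load Balancer Controller
    # and ask for a public NLB that targets the nodes (instance mode).
    service.beta.kubernetes.io/aws-load-balancer-type: external
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: instance
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app.kubernetes.io/name: ui
EOF

# Once the controller reconciles it, the Service gains an external DNS name:
kubectl get service ui-nlb -n ui
```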
The Load Balancer Controller knows the difference between the different types of load balancers: there are load balancers that just handle network connections — the NLB, the Network Load Balancer — and there are load balancers that do application-layer load balancing, which gives you benefits such as header injection and routing on your HTTP endpoints — the ALB, the Application Load Balancer — versus a classic ELB. The Load Balancer Controller can figure out which one you actually want, and in this case we're looking at an NLB. The default way of exposing a Service in Kubernetes doesn't know about those distinctions — it only creates classic ELBs — and it doesn't give you some of the finer-grained controls you might want for those different types of load balancing. In this case we wanted an NLB: layer-4 network load balancing, without HTTP header injection, header routing, endpoint rules, and all that sort of stuff.

And what's cool about the flow in the workshop is that we'll actually point you towards the load balancer that got created. By running this command — and you'll notice it's not kubectl, it's not eksctl, it's an AWS CLI command using the elbv2 set of commands — we'll see that a load balancer actually got created for us. Here's its ARN. You can go into the AWS EC2 console and actually see this load balancer being created and the different AZs it's registered to. I think this piece is pretty cool; it's something we wanted to add to the workshop so you know what's actually happening behind the scenes in the AWS infrastructure, because a lot of the time that's a bit nebulous when you're working within the Kubernetes context. So what does this tell us? It tells us that this NLB is accessible over the public internet, and it's using the public subnets in our VPC. A couple of other things we ask you to do here: first, we set a couple of environment variables for the load balancer ARN and the target group ARN, and then we run describe-target-health. One of the things that kind of bugged me every time I did this was sitting there constantly hitting refresh, waiting for the UI to become accessible — it takes some time — but by running describe-target-health we can see that it's still setting up the load balancer for us; it's doing initial health checks right now, a couple of things it needs to finish before it shows as healthy.

So why are there three endpoints there? I have one pod running — why do I get three endpoints? Well, that's going to be for each of the — the subnets, the target groups. And hitting it now: boom, there we go, we can see the retail store application. In fact, if you want to hit this at home you can as well — I'm not going to paste it into chat, but it is publicly accessible — and boom, we've exposed our service on AWS through the load balancer. By the way, the point of this exercise is to showcase how the Load Balancer Controller add-on is basically translating the requests coming in through the Kubernetes API into actual resources in the AWS backend. All of these things the Kubernetes controllers are doing are APIs that you could run manually — you could set up this infrastructure with Terraform or with the AWS CLI and issue the same calls. Kubernetes just knows how to connect those things together: I have nodes that are exposing a port and that have a pod running on them, and I need to connect that to a load balancer as an endpoint. It's automatically registering that. This is just data that Kubernetes has: this is where it exists, here's how I need to connect it, here are the APIs I need to call to actually make this thing happen.
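A minimal sketch of that AWS-side inspection, assuming a single NLB in the account/region and using placeholder variable names rather than the workshop's exact ones:

```bash
# See the load balancer the controller created on the AWS side:
aws elbv2 describe-load-balancers

# Capture its ARN and the ARN of its target group, then watch the targets
# pass their initial health checks before hitting the public endpoint:
NLB_ARN=$(aws elbv2 describe-load-balancers \
  --query 'LoadBalancers[?Type==`network`] | [0].LoadBalancerArn' --output text)
TG_ARN=$(aws elbv2 describe-target-groups --load-balancer-arn "$NLB_ARN" \
  --query 'TargetGroups[0].TargetGroupArn' --output text)
aws elbv2 describe-target-health --target-group-arn "$TG_ARN"
```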
And sorry — to answer your question from earlier, I said those were the different subnets, but they're actually pointing to the different nodes, I believe in the node group, and we can kind of see that over here as well. Yeah — when you go back and describe that service, you'll see that it's exposing port 31898, and that's the NodePort that gets exposed on every node in the cluster; it gets reserved for the service, so that port is going to be available — yeah, there it is at the end there, 31898 — and it gets exposed everywhere, and then every one of those nodes becomes a backend for the load balancer. But the UI is only running on one of them, so when traffic comes in, the load balancer doesn't really know — it says all of them are healthy, but only one of them actually has the pod running. The way that gets routed is: the load balancer sends traffic to a node; if that node is running the pod, great, it just serves the traffic; if that node isn't running the pod, kube-proxy, which runs on the node, sends it to the node that is actually running the application, which then returns the traffic. So you see every node as available on the load balancer, because it's a lot more expensive to attach and detach nodes as load balancer endpoints than it is to change the routing rules on the node and say oh, send traffic over here, I'll just reroute you to where that's running. If you're running a DaemonSet that exposes the same port, then traffic always lands on a node that runs the application. But it's a small distinction that was sometimes confusing to me — even if I have one pod running, I get three or four endpoints, or however many nodes there are — and understanding that kube-proxy takes that traffic and sends it to the right place inside the cluster, based on that internal cluster IP, clears it up. Got it.
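A quick way to see the mechanics just described, assuming the ui-nlb Service and ui namespace from earlier: the Service reserves a NodePort on every node, but only the node actually running a ui pod serves the traffic itself — kube-proxy forwards the rest.

```bash
# The NodePort reserved for the Service on every node in the cluster:
kubectl get service ui-nlb -n ui -o jsonpath='{.spec.ports[0].nodePort}{"\n"}'

# The in-cluster endpoints (the single ui pod) behind that Service:
kubectl get endpoints ui-nlb -n ui

# Compare the pod's node with the full node list registered on the NLB:
kubectl get pods -n ui -o wide
kubectl get nodes
```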
So how is this different from Ingress? We're already getting questions about it, and I've only got about 15 minutes left, so very quickly: in the load balancer approach there's also a way to run this in IP mode, which is just another way of exposing the application — it gives each pod a dedicated IP address on the VPC, which is a slightly more efficient network path for inbound connections. If you want more detail on IP mode, you can read into it. But to answer the questions in chat about how this is different from Ingress: we have a separate module that goes into the Ingress approach, which is a slightly different approach to load balancing using Application Load Balancers. As Justin mentioned earlier, that gives you more control over layer-7 routing — header checks and a few other features you might want. For example, you might want to route one application to the /dogs endpoint of your domain and another application to the /cats endpoint; that's a simple way to serve multiple pods on the same domain and use Kubernetes routing mechanisms to do it. Ingress is pretty good at letting you express that, and the add-on translates those rules into the actual routing. Path-based routing is very common, and it doesn't always mean the traffic goes to the same server or the same pod — it just means that different paths in the URL need to go to different places, and Ingress, or anything at layer 7, is what handles that. Ingress is only for layer 7 — you can't do raw TCP sockets with Ingress — but the AWS Load Balancer Controller can also handle Ingress rules and create Application Load Balancers for you instead of NLBs. So you can run different models: with Ingress and the Load Balancer Controller you can say give me an ALB for every Ingress rule, or you can share an ALB across some of those rules. A common practice, at least historically with Kubernetes, was to run nginx inside your cluster as a reverse proxy: you'd have an NLB fronting nginx, and nginx would figure out all of those rules — nginx would be your load balancer, your Ingress controller, internally. In this case, with the Load Balancer Controller, we can use ALBs directly. We could also still run something internally, but using ALBs gives you all the flexibility and scale of an ALB external to the cluster, where you don't have to manage it — you don't have to manage that lifecycle inside the cluster itself. It's another case of saying I just want to use what AWS provides, and then route the rules how I want based on HTTP requests, headers, paths, and all that other stuff.

You got it — and actually we have an example, the multiple Ingress pattern, that goes into exactly that. We won't do it right now — I think we've got about ten minutes and I want to get to stateful sets — but I do want to show that in this module we cover how you can route the / path, the home page, to the main UI service, and route a different application, the catalog app, to the /catalog endpoint. It's a very simple example of how you can use Ingress rules to host both of those applications, and you'll see that when you do, they're both served from the same endpoint but at different paths: the UI at the base path and the catalog at /catalog.
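A hedged sketch of that path-routing idea with the AWS Load Balancer Controller: one internet-facing ALB serving the UI at / and the catalog at /catalog. The annotations are real controller annotations, but the service names, ports, path, and group name are assumptions standing in for the workshop's manifests (the workshop's multiple-Ingress pattern uses separate Ingress objects sharing an ALB via group.name).

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: retail-store
  namespace: ui
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/group.name: retail-app
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          # Home page goes to the UI service...
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ui
                port:
                  number: 80
          # ...while /catalog is routed to the catalog service.
          - path: /catalog
            pathType: Prefix
            backend:
              service:
                name: catalog
                port:
                  number: 80
EOF
```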
I think this really covers the basics of the Load Balancer Controller — probably 90% of the use cases for most of our customers out there. We do have more advanced networking modules if you get into the networking section, where we talk specifically about things like security groups for pods and, probably more relevantly, custom networking — I know a common problem our customers run into is IP exhaustion, and custom networking can help you get access to more IPs in your cluster — plus a couple of other common use cases we cover in the networking module. But really, this module is all about the fundamentals, the basics, and we geared it so you can be completely new to Amazon EKS and still get through it. Lastly — well, before we jump to the last piece here — Justin, anything else you wanted to add specifically on the exposing-applications piece? I see we've got a flurry of questions in chat.

Yeah, I'm trying to keep up with some of the questions, because this is a really difficult part of Kubernetes, and I would love to see people run through the workshop and give feedback on things that still weren't clear. If there was something in here you ran through and said I still didn't understand why this happened, or what was working or not working here, please open an issue, please open a PR, because we want to make the workshop as clear as possible for anyone starting here. That side of things — once you hit traffic and someone says my load balancer isn't working or I can't get to my application — often gets confusing about what layer you should start troubleshooting at, so we want as much clarity in the workshop as possible: everything works by default, but we can also help you understand with more resources, whether that's something directly in the workshop, another website, the Kubernetes documentation, or more videos we need to make. There are a lot of options there to make this as clear as possible, so please open those issues and PRs — that's where we want to get started.

Absolutely. All right, I want to talk about the storage module, but we're coming close on time, so broadly I'll give a high-level overview — and Justin, keep me honest here. Storage is a critical concern when you're working with containers and Kubernetes, because at the end of the day, if a container disappears and you were using local storage — the container's filesystem — that data is gone. So when you're running stateful workloads in a Kubernetes environment, you want to make sure you have some way to persist state. That can come in a number of ways: you can attach drives to the nodes powering your Kubernetes cluster, or you can take advantage of database services like Aurora — say, for a MySQL-compatible database — or RDS. But critically, when working with Kubernetes, we find that the use of persistent volumes and persistent volume claims makes managing stateful workloads that much easier, and we have two tutorials here for working with EBS and EFS — FSx is on the way; we have a contributor working on an FSx module right now. In this particular module we go through working with StatefulSets backed by EBS and EFS. Anything specific you want to add there, Justin?

A big thing with StatefulSets is that data is hard to manage, and it often requires some consistency in how it's managed. When a StatefulSet is provisioned — in this case a MySQL database — the replicas not only assume they have some storage they can rely on, but if I have multiple of them, they need to be named consistently. That seemed weird to me when StatefulSets were first being created in Kubernetes, because we kept saying that all pods should be disposable, they get random names, they're part of a Deployment, all that stuff. But with StatefulSets we want them named one, two, three in a consistent manner, and when we need to de-provision them or scale down, we scale down from three to one — we can't just say pick one and get rid of it, because a stateful workload might depend on that naming, on how the data was provisioned, on how it was divided up between the other containers.
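A small illustration of those ordering guarantees, with illustrative names (the StatefulSet, namespace, and label here are assumptions): StatefulSet pods are numbered, and scaling down removes the highest ordinal first rather than an arbitrary pod.

```bash
# List the numbered replicas of the StatefulSet:
kubectl get pods -n catalog -l app=mysql
# NAME      READY   STATUS
# mysql-0   1/1     Running
# mysql-1   1/1     Running
# mysql-2   1/1     Running

# Scaling down is ordered: mysql-2 terminates first, then mysql-1;
# mysql-0 is always the one that remains.
kubectl scale statefulset mysql -n catalog --replicas=1
```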
So all of those things are really hard to do on your own, and things like EBS and EFS make it a lot easier, because you're not relying on direct-attached storage on the node and then always provisioning the workload back to the same node and saying I need you to mount this folder. You can say EBS is a service that exists; I can attach an EBS volume externally, and it gets provisioned as block storage underneath the container at provisioning time. There are plenty of other storage services, too — you don't have to store things this way. Please use RDS if you're able to; please use a hosted database service if that's an option for you. It'll be much easier, much simpler, to just say here's an external thing that runs this. But we also understand that stateful workloads inside of Kubernetes are very important for a lot of reasons, and that's why StatefulSets were created — they exist inside the cluster. This module gives you a couple of different options, using EBS or EFS — and, like you said, FSx — to get some sort of storage, whether block or file or something else, underneath the container, mounted consistently without hard-coding it to a specific node or a specific IP address or anything like that.

Yeah, and we tend to see this a lot with customers migrating legacy workloads to Kubernetes and container-based deployments, because a lot of the time they don't have the flexibility to immediately refactor their applications to use managed cloud services for storage, that kind of thing. In this case we have a catalog service that uses a MySQL database we're going to run inside the EKS cluster, and so we have a StatefulSet. Again, I don't want to go too far into the weeds, but let's describe that StatefulSet for MySQL: we can see it exists, and there's only one replica running right now. Really, what we want is for it to look like this, where we have each pod — and, like Justin said, they're labeled one, two, and three — made from the same spec but not interchangeable. These pods are not interchangeable, and that's the critical thing with StatefulSets. Throughout this lab, what we do is create a test.txt file, make sure it actually exists, then delete the pod — oops, a little quick there, should be fine — and wait for the pod to come back up. As soon as it does, we run the command again and see that the text file is gone: no such file or directory. Essentially we're not persisting state right now — as soon as we delete that pod, its local filesystem doesn't exist anymore. So what we want to do is mount a volume for that pod, so that when the pod disappears the volume still exists and the data still exists. To do that we'll use the EBS CSI driver. Justin, do you want to quickly talk about what the CSI driver is doing for users here? I think you're on mute.

I did it again — I was trying to unmute myself as I was typing, because I have a mechanical keyboard... anyway. A CSI driver is an interface for containers to get storage. It's a consistent way for any storage behind the scenes to be provisioned the same way, so the container runtime can say I need storage — whether it's block or file or something else — and it's pluggable on the back end. If you're on-prem you can use NFS and get file storage; if you're on physical servers and need a hard drive mounted, there are plugins that let you pull that in. The EBS CSI driver will provision EBS volumes and make sure they're mounted for your workloads as needed: when the application says I'm starting up and I need this volume, the EBS driver will provision an EBS volume if you need a new one, or remount it if it already exists. That driver is all about making sure EBS works seamlessly inside your applications — when your application says it needs storage, it gets storage.
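A minimal sketch of the pattern being described, under stated assumptions: a StorageClass backed by the EBS CSI driver, and a StatefulSet whose volumeClaimTemplates request one volume per replica, so mysql-0, mysql-1, ... each keep their own EBS volume across pod restarts. The image, credentials handling, names, and sizes are illustrative only, not the workshop's manifests.

```bash
cat <<'EOF' | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3
provisioner: ebs.csi.aws.com          # the EBS CSI driver discussed above
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
  namespace: catalog
spec:
  serviceName: mysql                  # assumes a matching headless Service (omitted for brevity)
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: public.ecr.aws/docker/library/mysql:8.0
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: not-a-real-password   # use a Secret in anything real
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    # Each replica gets its own PersistentVolumeClaim, which the EBS CSI
    # driver satisfies by creating and attaching an EBS volume on demand.
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ebs-gp3
        resources:
          requests:
            storage: 30Gi
EOF
```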
There are also things like storage classes and all these other advanced options — I need high-speed storage, or I-don't-really-care storage, whatever the case may be. EBS has a lot of options, and the EBS driver allows you to marry those options to your workloads inside of Kubernetes, because Kubernetes is generic and doesn't know about every option available on every storage platform. The driver that plugs that storage into the workload is responsible for saying what's possible and how it gets mounted into the workload.

Right — and I think I missed a step here, I was running into an error, but regardless, what we end up doing in this lab is updating the database endpoint that the application actually uses to point at the MySQL instance backed by EBS. We show how, using that EBS CSI driver Justin talked about, we're able to automatically provision a volume by creating what's called a PersistentVolumeClaim, so our pods actually ask for that persistent volume. Then this goes through accessing the PVC, seeing the actual persistent volume — the EBS volume — in the AWS console, and then we run that same test again to show that we can still access the test.txt file after deleting the pod — and now we can.
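Re-running that persistence check looks roughly like the following sketch; the pod name, namespace, and mount path are assumptions matching the StatefulSet example above rather than the workshop's exact commands.

```bash
# Write a file onto the mounted EBS volume, delete the pod, and confirm the
# file survives when the StatefulSet recreates mysql-0.
kubectl exec mysql-0 -n catalog -- sh -c 'echo persisted > /var/lib/mysql/test.txt'
kubectl delete pod mysql-0 -n catalog

# The replacement pod keeps the same name and reattaches the same volume.
# (If the wait races the pod's recreation, just rerun it.)
kubectl wait --for=condition=Ready pod/mysql-0 -n catalog --timeout=180s
kubectl exec mysql-0 -n catalog -- cat /var/lib/mysql/test.txt
```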
I wish I had more time today to go through the full workshop, but the benefit is, folks, you can start doing this yourself at home today: go to eksworkshop.com. In the first module we have a setup phase where you can provision the entire workshop cluster environment with a single Terraform command. Of course, there are a few requirements — you've got to have enough VPCs available in your account — but the great thing is that the Terraform output will walk you through whatever issues you might run into and help you work through and debug anything you hit. If you have any issues, by the way, feel free to open an issue on the repo — we're actively monitoring it. We just launched today, so we've tested on our machines and to the best of our ability, but we can't cover every use case and every user machine, so if you do run into an issue, please let us know and we'll get it rectified as soon as we can. By running that one Terraform command you get the cluster environment, and you can work through the workshop at your own pace. And if you're a customer watching this and you want an AWS-hosted version of this event, where we host it on our infrastructure, you can reach out to your account rep — we're pushing this out internally, so our field team and Solutions Architects are able to deliver this workshop as well. Lastly, Justin, anything you want to say before we close out?

Thank you, everyone, for asking questions and joining, because you made all the content better just by adding more clarity. I saw we already got one issue opened, which is awesome to see — keep them coming. We want to keep improving this over time; we're going to keep it updated with new versions of Kubernetes and EKS as they come out, and with new features in the controllers and the Terraform blueprints. This is just starting today — we want you to get started and give it a try — but this isn't the end of the workshop. We've been maintaining a workshop for years now, and people love it; we love that so many people have been learning from it, and we want to keep improving it over time, so please keep those PRs and issues coming so we can help improve it for the next generation of people trying to learn.

Absolutely — and be sure to subscribe to Containers from the Couch on YouTube. We're going to do more of these episodes covering each of the different modules here — networking, security, automation, all of them — in more livestream episodes, so again, be sure to subscribe to Containers from the Couch. Thank you so much for joining us and for all the great questions in chat, and we'll catch you next time.
Info
Channel: Containers from the Couch
Views: 32,599
Id: _TFk5jQr2lk
Length: 87min 22sec (5242 seconds)
Published: Wed Feb 08 2023