Kubernetes and Cisco ACI with Andres Vega

Video Statistics and Information

Captions
Once again, welcome. Great to have you here today at Cisco. My name is Andres Vega. I am a senior product manager in the engineering team that brought you Cisco Application Centric Infrastructure and the Nexus 9000 platform of switches. In my role I am responsible for open source initiatives out of the group. I am very passionate about complex distributed-system problems and about building products, not problems, around them: taking them to market, making them easier for our customers, and turning leading-edge technology from web scale and open source into consumable enterprise products.

Today, over the next hour, we're going to talk about Kubernetes and ACI, as well as ACI Multi-Site. These are two of the major features that came out with the latest release of ACI, ACI 3.0. Before I kick it off: anyone who has not heard of Kubernetes? How well do you understand Kubernetes? "I've heard of Kubernetes." OK. "I'm a storage guy, so I'm a little slow." OK, so let's talk a little bit about Kubernetes to start out.

Kubernetes is based off the operating system that runs Google's data centers. Google in the early 2000s had the problem that they wanted to complete a query over an entire terabyte of data in less than 10 seconds, so they had to figure out the most optimal placement of the containers that would serve that query across their entire data center. As much as they loved solving theoretical computer science problems, they got into the public cloud market and had to take the infrastructure resources they had been dedicating to internal engineering and share them outside. With that, they had to bin-pack all those workloads to maximize the efficiency of their compute resources and energy utilization across the world. They open-sourced the system they had developed to manage all the network and compute and do the most optimal placement of containers: basically managing containers at very large scale, on the order of thousands and millions of containers at any given point in time.

Kubernetes is great at many things. The open-source community loves it, developers love it, and it's great for scheduling and managing the lifecycle of containers. But Kubernetes is very hard to bootstrap. If you want the experience of running Kubernetes in the public cloud in your own private data center, there is a lot of work in setting up the required infrastructure to do so. The networking requirements for Kubernetes to work are quite complex, and people who have gone down that path have walked a path of untold misery. With ACI we want to simplify it and make sure you obtain the same experience running Kubernetes in the private data center that you get in the public cloud.

Another set of challenges: containers are great, but no data center environment is legacy-free. In reality there are physical databases, and new systems that are brought up will still have dependencies on those legacy databases, to reach the data that is already stored and available there. Interconnecting containers to those physical databases, or to anything running in virtual machines, is hard, and there is a different set of tools for it. No matter how much automation you come up with, it is still quite a cumbersome task to interconnect containers to anything else; it's not really simple. Making sure that not just the connectivity requirements but also the security policy is consistent across the different layers of the stack is a challenge as well. And another problem is inherent to most deployment platforms such as Kubernetes.
They grew up as homebrew environments that didn't really have a requirement for governance or isolation. They came up in development teams that didn't have any level of compliance, didn't have auditors coming in, and didn't have to abide by a security policy, so they were never developed around concepts of multi-tenancy for isolating different organizational groups or different clients consuming them. Naturally, as Google got into the public cloud, they built multi-tenancy constructs around Google Cloud, but those never made it into what they open-sourced as Kubernetes.

So how do we fill those gaps when integrating Kubernetes on top of ACI, if you're bringing Kubernetes into your private data center? And we are proposing that the best infrastructure to do so is ACI. Why is that? When we developed ACI we were looking at a number of issues to address. One of them was virtual machine mobility: things move around in the data center, and we wanted to make sure that no matter where a workload came up (we couldn't deterministically predict where it would), the moment it did, we would instantiate the connectivity requirements and the security policy right there, at that point in time. We do that very well. We knew containers were coming, and that was a harder problem. Why? Because containers have a much higher rate of fluctuation and refactoring than virtual machines. They may take a fraction of a second to come up; they may run for only a few seconds if someone runs a container to do four pings. If we had to wait for the first packet to figure out what that workload is and where it is, it would already be too late: if we ate that first packet, we just compromised 25% of the availability of that application. So we had to come up with something faster for containers, and we did, using the network automation APIs we built. We know even before a container comes up: the moment the scheduler, the control loop, says a container is to be initialized on this particular server that is part of your Kubernetes cluster, ACI is notified. We do the plumbing both in the infrastructure and at the virtual switch level, we create all the connectivity requirements, we do the IP address allocation, and we know, even before it is ready to start talking, where it is and what its requirements are; we plumb those across. It becomes very easy, because we knew it was coming.

In ACI, every single network endpoint is an object. We built the system from the ground up API-based, and everything is entirely modeled in a directed graph. To us, the form factor of a network endpoint makes no difference; it's all endpoints, it's an endpoint object. So for us it is very easy to interconnect a container object to a database that is physical, or to a virtual machine. It is very easy to build out that application profile, as we call it, and design it in a declarative fashion: an application architect may say, this is my app, maybe a three-tier app, and regardless of whether it is containers or not, simply say this needs to connect to that, and that cannot connect to this, and we build it out. We simplify the routing and switching of interconnecting containers to everything else.

Now, container environments have a very heavy reliance on load balancers. Why is that? If you're accessing a web server on Amazon, you do not care whether there are a thousand containers serving that web request for that particular service; you care about a virtual IP.
As the provider scales those thousand containers up or down to any other number, that is transparent to you. But in the containerized world the IP address allocation is not deterministic: containers are ephemeral, they come and go, and they get different IP addresses from a pool. I believe most of you in the audience have infrastructure backgrounds, so you know the problem: how do I configure the load balancer if I don't know what IP addresses are going to be behind the VIP? How do I do that? It needs to be entirely automated and transparent to the user. Again, having built a proxy and a broker between Kubernetes and ACI, we know, as containers come and go, what addresses get created, and we automatically program the fabric. We use our embedded capabilities for load balancing; we do load balancing in hardware. Whenever a packet comes north-south, from the outside into a service, we use what's called policy-based redirect. We do that in hardware, utilizing equal-cost multipathing, and we use the cross-sectional bandwidth that is inherent in a spine-leaf data center fabric.

Now, as for security, this is particularly hard. You have development environments where the expectation is: everything connects to everyone in my team, and for anything I bring up I shouldn't have to open a network ticket. As things move to production that becomes a challenge, because how do you map an environment that was wild west onto something that is tightly controlled? Engineers know a shadow-IT model, but infrastructure, InfoSec, and networking know something else. So how do we retain what developers get in the public cloud while bringing governance and control back to the infrastructure team in-house? We built a model where we honor the Kubernetes network policy API, which is the Kubernetes security specification for traffic within the cluster. Then, if you want the enterprise model, you can build ACI contracts and ACI policy around the entire cluster, or around the different levels of the hierarchical object model within Kubernetes. From the cluster you go down to a namespace, which is an organizational construct, basically a tag for different groups, used to assign resources to different teams. You can apply policy at a deployment, which is a set of replicated containers, or you can go down to the lowest level, the individual container object. So if you want to restrict this one container, we make sure we program and enforce rules both in the virtual switch on the Linux server where it runs and on the switch, if the traffic has to traverse the wire; we restrict and enforce that using ACLs and contracts the way we do in ACI.

Monitoring is also challenging. There are tools for the different layers of the stack: you may have a monitoring tool for Kubernetes, ACI has its own set of monitoring capabilities, and there is everything else in between. But how do you correlate? How do you map what something in the Kubernetes world means in the infrastructure? This often turns into a lot of time spent in war rooms chasing down what is what and where it is, trying to resolve incidents. By exposing every container as an object, and everything else within the Kubernetes cluster in ACI (the namespace, the deployment, a container, the cluster nodes themselves), we represent the object and all its metadata, we visualize it, you get the stats, you get the logs, and in addition we give you the correlation of what in the container world maps to what in the physical world:
what container, on what host, on what virtual port; the far end of that virtual port; what physical interface on the switch. If you want to see one service talking to another service east-west, we show you where those services reside and where the traffic traverses the fabric, down to the actual ports it goes through, ultimately reducing the mean time to troubleshoot and resolve an issue. I spoke about multi-tenancy: ACI is multi-tenant, and through its multi-tenant facilities and provisions we map every single Kubernetes cluster deployment to an ACI tenant, providing hardware-level isolation between the resources allocated to one cluster and another. Questions so far? Great.

So let's take a deeper look into how it all works. You have your ACI fabric, and you have Kubernetes running on top. Kubernetes has a master that makes all the scheduling decisions about where containers are going to run, and it manages servers by way of control loops. It is declarative, as ACI is: if I launch an application in Kubernetes and say this application has this set of deployments, and each deployment has this number of containers, Kubernetes says, well, if I'm running service web, it has to run three containers at any given point in time, exposed on this port. It will query the nodes that are part of the cluster: how many containers are you running? The server says, I'm running zero. Well, you're supposed to be running one, and at that point it starts one. That way it is able to self-heal if it is out of compliance with the spec, the manifest for that application: if the server goes down, or the container dies, it will discover that through the feedback mechanism and relaunch it.

We have the fabric in between, and Kubernetes has a network policy specification that allows you to isolate within the cluster: turn it on and nothing is able to talk to anything else unless specified in the YAML file (as a format note, Kubernetes uses YAML for all these descriptor files). We take that network policy; within Kubernetes it is just a specification, not an enforcement. It doesn't guarantee or ensure that things will not talk to each other. You need an enforcement mechanism, be it iptables, be it Open vSwitch. In ACI we ingest that network policy, we render it, and we push it to the Linux server, and we do the enforcement by means of OpFlex, our protocol for policy exchange, with the implementation itself done as OpenFlow rules in Open vSwitch. Now, go ahead.

Around the control plane, is the workflow that I create the policy, the intent that I want, in ACI; that policy is then pushed to Kubernetes to create the profile; and then it is enforced by ACI directly? Do I ever have to actually see a YAML file in order to create a policy that ACI will enforce? You do not. If you're the ACI administrator, you go about your job the same way you do today: you create policy directly in ACI, and it will be pushed and enforced on those hosts. The DevOps team that doesn't care about ACI, and should not need to be bothered with learning ACI, writes their YAML file the same way they would for Google Cloud or AWS. If they are migrating the application in-house, they use the exact same file and we do the enforcement without them needing to care how it is done; they just have the assurance and guarantee that it is.
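For reference, a minimal sketch of what such a network policy specification looks like. The namespace, labels, and port are illustrative, patterned on the guestbook-style app used later in the demo rather than taken from the presenter's files; within Kubernetes this is only a declaration, and with the ACI plug-in it is rendered and pushed to the hosts for enforcement in Open vSwitch.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: redis-allow-frontend
      namespace: demo
    spec:
      podSelector:
        matchLabels:
          app: redis            # the policy applies to the Redis pods
      policyTypes:
      - Ingress                 # selecting pods for ingress isolates them:
                                # anything not explicitly allowed is dropped
      ingress:
      - from:
        - podSelector:
            matchLabels:
              app: frontend     # only the web front end may connect in,
        ports:
        - protocol: TCP
          port: 6379            # and only on the Redis port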
So ACI is ingesting both from the APIC API and from Kubernetes; so from that, I can go to Kubernetes, ingest those YAML policies from my interface into ACI, and visually they will be represented, from an ACI administrator's perspective, as ACI intent-driven policy? Correct. We call what we ingest and render from Kubernetes an APIC host protection profile; that is what gets pushed to the server. We visualize it as an object in ACI and give you the stats, and the ACI administrator can interpret it without having to translate verbally with the DevOps person about what it means; they see it in their own language. OK, I get this.

Yes? How would I implement this if I didn't have ACI? Because clearly the point is how much easier this is, but I don't know what hard is yet. You know, if I were still running 6509s, how would I make that work with Kubernetes? There are open source projects that do implementations in iptables: they take the network policy API and translate it into iptables. One of those is Calico, a company you might have heard of. So Calico will take the YAML and translate it into an iptables spec? Correct. There are other open source alternatives, or you can build your own, which is quite a cumbersome task. We're not building our own; we're smarter than that. This is what the network automation APIs were built for. We want to make the job easier: people from infrastructure shouldn't have to become developers to use this. The machine should figure it out, which is what we're doing here. We are giving you time back to make more strategic decisions and spend less time dealing with it.

With that in mind, to retain the experience, we needed to make sure that the exact same YAML file that creates the application with any public cloud provider could be taken, without modification, dropped onto Kubernetes running on top of ACI, and be guaranteed to work. If you run in AWS or Google Cloud, you get the load balancing services they provide, and you pay extra for that. In-house, you would need to either stand up HAProxy or put in whatever load balancing you have and figure out how you're going to automate it, which is quite a cumbersome task. In ACI we made it completely transparent, and we're going to see it well illustrated in the demo. The moment you create a set of replicated containers that are exposed externally, to be accessible from the internet, ACI takes that, creates the policy-based redirect, and creates the load-balancing provision. And microservices are not only load balanced north-south; they are also load balanced east-west, between application tiers, because services have to communicate with one another. We do that through our own extensions to Open vSwitch: Open vSwitch load balancing for east-west traffic, which is layer 4 load balancing.

For those familiar with ACI, you may know the concept of virtual machine manager domains: how we do integrations with vCenter, with Azure, with OpenStack. We have extended that to include Kubernetes, so from the APIC GUI you can visualize the entire Kubernetes cluster, all its objects, statistics and health for each of the different objects (namespaces, deployments, services, and pods), and the correlation of those objects to the physical components of the system.
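To make the north-south load-balancing point concrete, here is a minimal sketch of the kind of Service that requests an externally reachable virtual IP; the names are illustrative. In a public cloud, type LoadBalancer provisions the provider's load balancer; with the ACI plug-in the same unmodified manifest is satisfied by the fabric itself through policy-based redirect, which is the portability point being made above.

    apiVersion: v1
    kind: Service
    metadata:
      name: frontend
      namespace: demo
    spec:
      type: LoadBalancer        # request an externally reachable virtual IP
      selector:
        app: frontend           # the pods behind the VIP, wherever they land
      ports:
      - port: 80                # VIP port
        targetPort: 80          # container port on each replica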
Now let's take a look, and back to your earlier question, at where the separation of responsibilities is drawn and what the delegation of tasks is. There are two personas that make use of the system. On the far right you have the network administrator. The network administrator remains very much in charge of racking the system and bringing it up, making sure there is a converged data center fabric. At that point they define what set of network resources are to be carved out and presented to the Kubernetes cluster: what VRF, what external interface to connect to the internet, what set of VLANs. Optionally, they can create endpoint groups and contracts so that anything in and out of the cluster abides by the set of security requirements. And from a day-to-day perspective they go about monitoring and observing telemetry the way they do in ACI today; they have all the troubleshooting capabilities and can apply them to Kubernetes.

Now, the DevOps team, on the left: they install Kubernetes and apply the ACI plug-in using a single one-line bash script. What gets applied is called the CNI plugin, Container Network Interface, one of the proposed standards for container networking; we took the work we had done in OpenStack for ML2 and ported it over to CNI. The moment they have a freshly installed system, they are ready to use the network policy API if so desired, building the definitions and the network policy, and they are ready to deploy and scale containers. Optionally, if they want to isolate their deployments, they can build EPGs and contracts from Kubernetes and push them into the system. Additionally, they are able to annotate: in Kubernetes there is the concept of an annotation, think of it as a tag or label applied to objects, saying this will be placed in a specific EPG. So think of it: I built an application, I run it in my development environment, and I'm going to migrate this application to production. At runtime I can apply the tag, and by applying that tag, matching an EPG name, the workload is automatically mapped to the corresponding EPG in ACI, which has predefined policy. So migrating from dev to prod is pretty straightforward.

So I can also apply the tags there. We can get lost in the taxonomy a little bit; can you reinforce the concept of EPG, what does the acronym stand for again? Endpoint group. Endpoint group. And then, from a governance perspective, there are going to be people running in two different modes: the DevOps team that manages security, in theory, correct, and then the ACI team, or some other group, that manages security. Can the taxonomy and contracts be in conflict? The language that DevOps talks and the language that traditional networking teams talk can be different, so there will be instances where we need to troubleshoot between the two teams. So talk about when that contract is broken: when DevOps tries to do something that, from a security perspective, ACI doesn't allow, or when the ACI team does something the DevOps team isn't going to like. How does that conversation go? Totally, and that takes us to the next slide. Before you go, my question sits right before that one. On the network admin side, we create the Kubernetes system resources in ACI; you said VLANs and VRFs and that sort of thing. Then on the container team side, in the second step, they're building service definitions and defining network policy. What's the difference?
Network policy in Kubernetes is irrespective of what exists on the underlying infrastructure. Network policy in Kubernetes is this (and I can show you a YAML file for how the spec is written in Kubernetes): take the example of a three-tier app, web, app, and DB. Within Kubernetes you are saying web opens port 80, and the database is open on 6379. That does not concern what physical interface is involved or what is being trunked; it does not care about VLANs. It is very high-level application communication. And in ACI, the system resources are which interfaces and which VLANs I'm going to present to the Kubernetes cluster. But the moment you've handed that over, they are in a sandbox; they don't care about what is being used underneath as long as things connect. So it's layer 2 stuff versus layer 4 and up? Correct, that is very much the boundary. OK.

Back to your question: DevOps teams are not necessarily concerned about security. Say you're a developer, you work in an engineering team, and you want to get work done. You consult your peer: I just built this other service. Oh, that sounds perfect, I'm going to bring it into my application; I don't need to write it myself, just give me the service name, I'll look it up through service discovery, and we'll interconnect. You don't care about the network semantics. And InfoSec and infrastructure are concerned. So within your world, within the Kubernetes cluster, by default everything that you launch is placed by ACI into a default EPG; we're going to get to this, and I want to transition with that in mind.

With the use case I'm thinking of, and I like how you set that up: what happens when I create a service that is in conflict with my security policy? I'm thinking, as a DevOps guy I'll create a helpful service that basically creates a proxy, and that breaks the security policy. Like, whoa, no; but proxies happen at the application level versus what happens at the network layer, right? Here is the beauty of it. By default, anything you create in Kubernetes running on top of ACI is placed into a catch-all default EPG (EPG, once again, stands for endpoint group in ACI). What does that mean? It is a set of applications, a set of network endpoints, that share the same connectivity requirements and can talk to each other without having to go through policy. We provide out-of-the-box cluster isolation: everything within the cluster can talk to each other, and there is no need for policy internally. If they want to talk to the outside, to any other business organization, any other business function, or to the internet, policy has to exist that allows for it. So, to the point of your question: if you create a proxy at this level, you're only proxying internally. You cannot proxy to the outside, because at the outside boundary the network administrator is holding you and saying no, you cannot go through this. I'm giving you a free-for-all within your deployment platform, but you're not going over me.

Now, this is not entirely prescriptive; all the models are flexible and may be combined with each other. But you can take subsets of a cluster, called namespaces, and say: for each team in engineering, for each of their namespaces, I'm going to map that namespace to a separate EPG, and contracts will control inter-EPG traffic.
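A minimal sketch of what that per-namespace mapping can look like. The annotation key shown (opflex.cisco.com/endpoint-group) is, to the best of my recollection, the one the ACI CNI plug-in watches for; the tenant, application profile, and EPG names are illustrative, so check the exact syntax against the plug-in documentation for your release.

    apiVersion: v1
    kind: Namespace
    metadata:
      name: team-a
      annotations:
        # map every workload in this namespace to a pre-existing ACI EPG
        opflex.cisco.com/endpoint-group: '{"tenant": "kube", "app-profile": "kubernetes", "name": "team-a"}'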
There's nothing to stop that, though, because logically this has to happen at the DevOps level alone, from a policy perspective? Correct. But the EPGs are transient, meaning I have one EPG, two different sets of services in EPGs, and a DevOps team creates a connection between EPGs. From my perspective it's allowed, but from a data security perspective that translation may not be; ACI doesn't have insight into the work they're performing inside it, and that has to be handled at a different level. So policy is never in conflict; it is always synchronized and consistent. I want to use this last one to illustrate how. Just look at the key at the top right-hand corner: the black rectangle is ACI policy, the yellow rectangle is DevOps within Kubernetes, and one is always encompassed within the other. I want to get to deployment isolation: if we're launching things into prod, I want to make sure that every single tier of the application, everything described as a service in Kubernetes, maps to its own endpoint group, and that contracts tightly control that communication.

Now, what this describes is the free-for-all we have in the cluster. But let's role-play again. You're the developer, and you want to isolate your app from all your peers; you want no one to access what you're developing. So you turn network isolation on within the network policy API, and that restricts it: you say no one can access me except on port 80. That is the DevOps specification, and it is implemented internally to the EPG and enforced on the Linux server. The contract, the black rectangle, is what gets applied in ACI.

In the interest of time, let me pick up the pace and show a quick demo. I am going to launch, on a freshly installed system, a two-tier guestbook app. It has a web front end and a Redis back end: one container for the Redis master, two containers for the Redis slave, and three containers for the front end. I'm going to launch that application into a single EPG, the default catch-all EPG. You're going to see that I just do the deployment and it gets placed in its own EPG; then we'll apply labels to the different tiers to move them to the corresponding EPGs, we'll showcase how the load balancing happens, and we'll take a quick look at the VMM portion.

Let me dismiss all of this. Down here I have my freshly installed fabric; by the way, the installer script will have created the tenant for the cluster and set up the VMM domain for it. I'm going to go to the terminal where I'm logged into the Kubernetes master. Is that large enough for everyone to see? Standard Kubernetes syntax: I want to show you what is running in the system right after installation, the standard Kubernetes components that would be part of any Kubernetes system. You have the Kubernetes database, which is etcd; an API server, which brokers all communication in the system; a controller manager; a DNS for service discovery, so that containers coming up know where services reside; a proxy on every single node in the system (and this cluster has three nodes, three hosts); and the scheduler, which makes placement decisions. Abstractions exist as services, which are again sets of replicated containers, for the DNS and for the Kubernetes master.

So here I am going to create the guestbook app. Let me show you what that YAML file looks like: I have a file where I say I need a service redis-master, exposed on 6379, to be one replica; I need redis-slave, also exposed on 6379, to have two replicas; and I need a service frontend, exposed on port 80, to have three replicas.
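A condensed sketch of what a manifest like that contains, showing just the redis-master deployment and the Service in front of it; redis-slave (replicas: 2) and frontend (replicas: 3, port 80) follow the same pattern. Names, labels, and the image are illustrative, patterned on the well-known Kubernetes guestbook example rather than copied from the demo's actual file.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: redis-master
      namespace: demo
    spec:
      replicas: 1                  # the control loop keeps exactly one running
      selector:
        matchLabels:
          app: redis
          role: master
      template:
        metadata:
          labels:
            app: redis
            role: master
        spec:
          containers:
          - name: redis-master
            image: redis:3.2       # illustrative image tag
            ports:
            - containerPort: 6379
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: redis-master
      namespace: demo
    spec:
      selector:
        app: redis
        role: master
      ports:
      - port: 6379                 # cluster IP the slaves and front end use
        targetPort: 6379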
I can create from that file by doing kubectl, in the namespace demo, create with the file I just showed you. Hit enter, and it tells me the following services and deployments have been created. I can verify that by doing kubectl, in the namespace demo, get all the running containers: we see that for 12 seconds three frontend containers have been running, along with one redis-master and two redis-slaves. If we look at the service abstractions, service frontend, on that cluster IP address, is going to communicate with redis-slave on that other cluster IP address, which in turn communicates with redis-master's cluster IP address. The front end has also been given an external IP address, which I have port-forwarded on a router connected directly into the ACI fabric, so I should be able to go to the IP address I port-forwarded.

As the containers come up, I'll show you the EPG where they have been placed by default, kube-default. If we look at the operational tab, we see those six running containers with their network identity: what MAC address, what IP, what they are running on, what the reporting controller is, what the interfaces are, both physical and virtual, what encapsulation they are listed with, VXLAN or VLAN, and what those IDs are. So I have my guestbook app, "Welcome to Tech Field Day 15"; let's see if I can access my app, and I can write from the front end to the back end. That's all great; they can communicate with each other.

What if I want to move those into production? What if I want to apply labels to those services so they are placed into the corresponding EPGs? Using the standard Kubernetes format for annotations, I can do a kubectl annotate deployment. I'm going to annotate redis-master; you see the label at the bottom of the screen, a key-value pair naming demo guestbook-backend. Hit enter; it tells me it's annotated. I do the exact same for the slave, and the exact same for the front end; there we go, frontend annotated. I would expect these containers to age out from this screen, and you see that happening, and in the separate EPGs, in the separate application profile, you see guestbook-frontend, where the three containers have moved, and guestbook-backend with the three database containers. I must have a typo here; annotate deployment redis-master... yes, I typed "backed" instead of "backend". If I refresh here, I can see the containers start popping up in the demo backend EPG. I do the same for the slave; redis-slave annotated, and if I refresh here, those three containers move automatically. If I refresh my application, it continues to work; I can keep on writing to it.

For it to work after I moved it to the corresponding EPGs, there are contracts in here that were defined by either InfoSec or the network team. If I go in and remove the contract that allows Redis communication on port 6379, deleting it from ACI, I would expect the application to break, and the front end, as you can see, can no longer read from the database; when I try to write to it, it's broken. It is very easy to quarantine, and you see the control the network infrastructure suddenly gains. If I re-add the contract for Redis and refresh the page, connectivity is restored. Every time I refresh the application, the policy-based redirect, to whichever node has a running container, is reported into an automatically created service graph with the IP addresses of each of the nodes.
And if you look at the service graphs (I'm wrapping up in two minutes, I know I'm a little over time), if you look at the deployed graph instance, you will see that we automatically created, for the front end, this load-balancer insertion; we're doing this in the fabric. If I go into Kubernetes and say, well, I'm running three containers for the front end but in reality I only need one, and I scale the application down, you would expect the load balancer here on the screen to go down to the single node instance that is now running a container for it. If we go back into the EPG and scale the application up, saying, you know what, one is too little, I need 50 containers for the front end, you can see, as I refresh here, that as containers are scheduled and initialized we do the IP address assignment even before the container is ready to start communicating on the network; we have already discovered it. So we provide a guarantee that by the time it starts, both the network connectivity requirements across the entire system and the security policy have already been created, even before the container is running.

Is there any monitoring with this to know whether or not you have enough replicas available and whether it needs to scale up, maybe even do that automatically? That is done within Kubernetes: you define the specs for your application, what the SLAs are, and Kubernetes makes sure they are met at all times. For the monitoring we present in ACI, we look at the replica sets, and for the front end (my VPN is being a bit laggy over here), if I look at the service frontend, or the deployment frontend, we visualize the number of replicas. If I do the get pods here, once again we see all the running containers, and here, as the page refreshes, we see all the replicas for the back end. If the number of replicas is not being met, across however many hosts you have in the system, Kubernetes will ensure it is, much as DRS does in vCenter: making sure those requirements are met, rescheduling for high availability, and keeping things bin-packed.
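A sketch of the scale-out step just described, assuming the guestbook-style frontend deployment used above. Declaratively, only spec.replicas changes and the file is re-applied; the imperative equivalent would be kubectl scale deployment frontend --replicas=50 --namespace demo. Either way the scheduler places the new pods, and the ACI plug-in learns about each one as it is scheduled, so addressing and policy exist before the containers start.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: frontend
      namespace: demo
    spec:
      replicas: 50                 # was 3; the control loop converges to 50
      selector:
        matchLabels:
          app: frontend
      template:
        metadata:
          labels:
            app: frontend
        spec:
          containers:
          - name: frontend
            # illustrative image: the public guestbook front-end sample
            image: gcr.io/google-samples/gb-frontend:v4
            ports:
            - containerPort: 80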
It's just, I think, my lack of experience with the Kubernetes side and the EPGs and policies you've created in ACI, but when you were giving us the overview of what each side would do, there was an optional "create EPGs and policies" on the container side of things. So they can't actually just randomly create containers and policies, can they? They can, but it can be restricted and constrained so that they don't. Can we define the pronoun "they" in this conversation? The container side, the DevOps team, right? Because the network team, or the security team, would be building your EPGs and your policies: what you can and can't do on the network, who can talk to what, that kind of thing. But if you're on the DevOps team and you're the one deploying Kubernetes and scaling it and all of that, you can't just say, oh, by the way, I'm going to create new EPGs and new policies that let me do whatever I want on the network, because that would basically just be overriding it. Right; they would end up creating a thousand EPGs that mean nothing and consume a lot of system resources, so we want to provide control and restrict those rights to the top-level Kubernetes administrator.

When the system is launched, if you look at the Kubernetes application profile I'm pointing to on the screen, the EPGs you see there (kube-default, kube-nodes, kube-system) are everything required by the Kubernetes infrastructure, and they have been created from Kubernetes. We evaluated whether we want teams to be able to push EPGs, but we want to restrict that to the top-level administrator. Everything you see in the demo application profile, guestbook-backend and guestbook-frontend, was predefined; if a label or annotation applied to a container does not match either of those two, it will be placed in the default. OK, and predefined by whom? Predefined by the network administrator. OK, that's what I thought. Right, so in Kubernetes I can build a YAML file that selects an existing policy? Exactly.
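To close the loop on that last exchange, a sketch of "a YAML file that selects an existing policy": the same annotation mechanism shown earlier, carried in the deployment manifest itself, so the workload lands in the predefined guestbook-backend EPG the moment it is created. The demo did the equivalent imperatively with kubectl annotate deployment; the annotation key, tenant, and application profile names here are illustrative and as recalled, so verify them against the ACI CNI plug-in documentation.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: redis-master
      namespace: demo
      annotations:
        # select the pre-existing ACI EPG for this workload
        opflex.cisco.com/endpoint-group: '{"tenant": "demo", "app-profile": "demo", "name": "guestbook-backend"}'
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: redis
          role: master
      template:
        metadata:
          labels:
            app: redis
            role: master
        spec:
          containers:
          - name: redis-master
            image: redis:3.2       # illustrative image tag
            ports:
            - containerPort: 6379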
Info
Channel: Tech Field Day
Views: 5,993
Rating: 4.7837839 out of 5
Keywords: Tech Field Day, TFD, Tech Field Day 15, TFD15, Cisco, Cisco ACI, Andres Vega, Kubernetes, containers
Id: TMCqSKtAlik
Length: 46min 29sec (2789 seconds)
Published: Thu Sep 28 2017