Juniper Networks Cloud Network Automation using Contrail

Video Statistics and Information

Captions

My name is Pratik Roychowdhury and I'm one of the product managers at Juniper, in the Contrail team. The topic of discussion today is cloud network automation using Contrail. The reason I did not want to choose something like SDN is that I wanted to talk more about the customer problems that we are trying to solve using Contrail: solving the networking challenges that exist in cloud environments, and enabling that through APIs, through automation, and so on.

Before I start: the product is completely open source. It's available under the Apache v2 license, and you might have already visited our website, opencontrail.org. There is tons of information there: blogs and videos, as well as a Day One book about the architecture itself. So please feel free to go to that website at some point.

What I'm going to do is talk about the product overview, and then, I've got 30 minutes, I'm going to talk about some of the customer use cases as well. Let me tee it up with the requirements that we see from our customers in terms of cloud network automation. All our customers across the board, whether enterprises, service providers, or cloud and SaaS companies and emerging companies, have multiple heterogeneous environments that they want to interconnect. They have the traditional environment, which is mostly DMV based; there's a VLAN architecture. Then they are building out their next-generation spine-and-leaf, Clos-based data center, and they are building multiple of those. These data centers will have virtual machines and containers, and containers are becoming very popular with financial services for performance reasons. They will also have bare-metal servers and storage; again, if you want to run Hadoop clusters and so on, you need bare metal. And then there will be other service appliances, physical service
appliances, where there are physical load balancers, physical firewalls, and virtualized instances of those. All of these are in their multiple distributed data centers. There will be public clouds, Amazon or Google or Azure, as well as service providers themselves building public clouds. And there is the customer branch as well, the enterprise branch. All of these need to interconnect with each other.

So the requirements that we see from customers are, first of all, legacy interconnect: how do you connect your traditional data center with your modern next-generation data centers? How do you do P+V interconnect: how do you connect your virtual machines and containers with bare-metal servers within the same network? How do you do multi-DC distributed cloud, essentially stretching a network from one data center to another, creating a network where VMs in two data centers can seamlessly talk to each other? Then service chaining, which is very important. Service chaining is an important concept in the telco world, where you want to create a chain of services, whether load balancers or firewalls, so that when traffic goes from one network to another it goes through that sequence. And how do you have your physical network functions, like physical firewalls, as well as virtual firewalls in the same service chain? Then there is hybrid cloud, of course; some 70% of enterprises are planning to use some sort of hybrid cloud over the next few years. And branch networking: essentially, how do you connect all of these with your CPE environment? That's where Contrail comes in and brings in cloud network automation. And of course you have to have an automated environment on top of that to manage and orchestrate it.

With that background, let me start talking about the product itself. Here is a high-level architecture of Contrail. It's a very standard architecture: there is a logically centralized
controller, which is actually physically distributed. It can consist of multiple nodes; as you can see, there are control nodes, config nodes, and analytics nodes. The control nodes talk BGP east-west; that's how they federate among themselves, and that's how you get control-plane scalability. There is a data-plane component called the vRouter, a kernel loadable module that sits within each host, so every host has a vRouter in it. The other set of elements that the Contrail controller talks to is the top-of-rack switch. In many scenarios you want to have a bare-metal server as part of a virtual network, so in that context we talk to the top-of-rack switch using the OVSDB protocol, because we want to make it multi-vendor; the QFX5100 supports OVSDB. That's how you can have a virtual network, let's say a blue virtual network or a red virtual network, within which you can have bare-metal servers as well as VMs as well as containers.

And physical services can be service chained as well?

Yeah, that's right. We do physical service chaining using the MX.

OK. Are you using a standard hardware VTEP schema for OVSDB?

Yes. And then the other thing that the Contrail controller peers with is a gateway. It's the MX gateway in our case, but that's where all the tunnels terminate, and you can go out to the internet or a WAN environment. Now, of course, we support multiple overlay protocols, GRE, UDP, VXLAN, but the important thing to note here is that, again, we do it in a very multi-vendor fashion because we want to give customers freedom of choice. So whether we are talking about different Linux distributions, Ubuntu, CentOS, Red Hat, what have you; or hypervisors, where we integrate with KVM and ESXi, for example; or different kinds of x86 servers; or different types of gateways, MX in
our case, but again it's talking standard protocols, so you can have other gateways; or top-of-rack switches, where again we talk OVSDB, a standard protocol; or orchestrators, so we talk to multiple environments. In fact, you can also have different kinds of VNFs, virtual network functions, as part of a service chain, not just the ones provided by Juniper but also by third-party vendors. So the whole environment is made very multi-vendor.

Now, one of the parallels I will draw to help you understand the architecture is its likeness to a router. If you look at the control nodes, they are like the routing engine of a router: they talk east-west, they talk BGP. The compute nodes can be thought of as line cards. And when we talk about the QFX5100, there's one use case I'm going to talk about, you can think of the QFX5100 as another line card in this framework, and think of this whole thing as a giant router chassis. You've got config nodes; in a router there are the CLI or other methods, and that's how you do the configuration, and you can have other orchestration systems. So if you want to draw a mental picture of the architecture, you can think of a router.

Now, the features of Contrail. There are lots of features; I've just highlighted a summary of all of them, and I'm going to talk about a few in a little more detail because I have very limited time. Of course there is routing and switching; there are VRFs and VPN instances that are created within the vRouter. We support IPAM, DNS, DHCP, source NAT, and floating IP, which is equivalent to Elastic IP in the Amazon Web Services case. IPv6 is something that we support in the overlay as well, and also quality of service. We do load balancing within the vRouter itself, because there are multiple paths that traffic can take, and we do ECMP load balancing there. Within the vRouter itself we do some form of
distributed firewall. We don't have ALGs and all that, but creating stateful policies is something that we do within the vRouter itself. So you can, for example, say allow or deny HTTP traffic between these two virtual networks, and that is all done in a very distributed fashion. In fact, all of these are done in a distributed fashion, because they are done within the vRouter, which is itself distributed.

And it is stateful?

Yes, it's stateful.

And the performance?

So, performance: we have actually tested 9.1 gig on a 10-gig interface, so it is definitely performing.

Third-party network services: I'm going to talk about that in a little bit; it's one use case I'm going to cover in the context of the customer use cases. But on to the next few things. Analytics is one of the key differentiators of the product. We have rich analytics, where everything that is done within the vRouter is sent back to the analytics node, and it is presented as REST APIs to any kind of visualization tool, or even to the Contrail web UI, which shows everything. So, just a few snapshots of the product; again, you can go to our website and get more information about it. You can get more detailed information about any of the different nodes, whether it is the compute node, the control node, and so on. You can in fact look at live traffic between two virtual networks by spinning up a Wireshark instance: you just do port mirroring, send it to a Wireshark instance, and look at the real traffic that is going between two virtual networks. You can also query the system for historical flows: essentially, between two virtual networks, what have the flows been at different points in time? Those are all capabilities available in the web UI itself. Of course there are lots of APIs available, and you can query the system to get a lot of
information about the product.

Now, one of the other interesting things that we are doing is overlay-underlay correlation. Whenever we talk about this to customers they get excited, because it helps them do troubleshooting. Essentially, you can do topology discovery using LLDP and SNMP queries. Once you do the topology discovery (again, this is a very simplified Clos fabric that we are showing, an oversimplified one), you can visually look at active flows as well as historical flows, and see which underlay paths a particular overlay flow has taken. All of that can be done using IPFIX and sFlow information from the different routers.

In real time as well?

Yes, you can basically do active flows, and you can also search for previous flows.

And you can export the pcap?

Exactly, you can actually do that. So all of this functionality is part of the analytics feature, which differentiates us from several of the other vendors.

The other thing that I wanted to quickly talk about is service chaining, and how the way we do service chaining makes it very multi-vendor. In the logical world, you basically define a policy. Let's say I have two virtual networks, red and green, and I want to service chain them with, say, two instances, a firewall and a DPI. In the actual physical policy enforcement phase, there are all these VRFs that are created within the vRouter on all the hosts. For simplicity I've assumed that all of these are running on different hosts, but they can run in any combination. If you look at the way we do service chaining: if R1 in the red virtual network is trying to talk to G1 in the green virtual network, it sends out a packet, and once it reaches the vRouter, the vRouter looks up this table
and says, OK, if I need to send to G1, my next hop is going to be server S2 with the label L3. So the IP fabric forwards it to server S2; every interface has a label, and so it gets forwarded to that label. Again, when the packet comes out on the other side, it looks up: the destination is still G1, my next hop is going to be server S3 with label L5. And that's how it ends up at server S3, on the left interface of the other service.

Now, if you look at the way we have done service chaining here, we have not modified anything in the service itself. There is no unique packet going inside; we strip off everything when we send the traffic. So we do service chaining using routing, and that is what makes it very multi-vendor in nature, because if you don't like the service from a particular vendor, you can yank it out and put something else in. That's one thing. The other thing is that these services can also scale out horizontally, so you can have multiple instances of them, and that's when the vRouter's ECMP load balancing also kicks in.

You mentioned this earlier: this is for virtual devices, so what if you have physical?

Yes, let me actually answer that. That is something that we can do manually; what we are doing now is automating that using NETCONF to the MX. The way we do physical service chaining is: imagine this was a physical firewall, and imagine that instead of a vRouter it's an MX. We just stick it to the MX, and that's the only difference. Everything else, the constructs, whatever we do within a service chain, remains exactly the same. So this vRouter is replaced with an MX, and that's how you can seamlessly do service chaining.

You can do it with some virtual, some physical devices?

Yes, exactly, that's right.

They're probably just hardware-based VTEPs, I'm assuming, on the MX?

So on the MX, yeah, you create these
VRFs exactly as you do in the case of a virtual instance. You create these VRFs, and that's how traffic goes from one interface to the other. So visually, if you replace this thing with an MX, where you tie in the physical instance, that's how you can do it.

And going back to what you said: the way that you service chain is based on routes, which is why physical is theoretically no different than virtual?

That's right, that's exactly it.

So you're just changing a route. If I put another firewall in, or some type of inspection that traffic needs to go through on the way out or in, I can manipulate that service chain because it's just a route. You're not tagging traffic or anything.

That's right, that's absolutely right.

OK, so let's move on to use cases. I have bucketized the use cases into three broad segments. One is cloud services: SaaS companies, cloud emerging companies, as well as many of the enterprises that have cloud teams focusing on this. These are some of the customers; in fact, these are some that we have been public with, and we have done some joint sessions at some of the summits as well. There are a bunch of other companies that we have not disclosed yet. For example, a large industrial internet enterprise: they are essentially doing vCenter integration, so they have a vCenter environment and they want to integrate Contrail with vCenter. The whole idea of cloud services is that you can launch VMs and containers, you can create security policies between those virtual networks, you can use service chaining if needed, and you can do automation using Heat, which is OpenStack based, or different kinds of orchestration systems. In fact, an LA-based gaming company has Docker containers and their own orchestration system, and we have integrated with them as well. We have done some
basic integration, and you will find this video on opencontrail.org as well: we've done some proof-of-concept integration with Kubernetes.

I was just going to ask about that, actually, because it imposes a new model with the pods concept and whatnot. How do you interoperate with that? Do you override any of that functionality?

What we have done is create a plugin for the API server, so that it goes and talks to the Contrail controller. And the kubelets as well: we have made some modifications so that instead of using iptables they go and talk to the vRouter.

OK, so you actually are owning that part of the pod model?

That's right. Each pod can be thought of as a virtual network in our case.

Sweet, that's cool.

That particular demo video is available on opencontrail.org, and one of our architects, Pedro Marques, has written a blog detailing that implementation.

Very cool.

One of our customers wanted this, so we just did a proof of concept with them.

I keep going back to this, but the IPAM, DNS, DHCP: is this your version, or can I plug in whatever flavor of IPAM solution I want, or is it either/or?

In the absence of, let's say, an external DHCP server, you can use the vRouter, which acts as a proxy.

Sure. Honestly, I keep going back to the IPAM part. You can, for example, integrate with an external IPAM server; do you have your own as well that I could use as part of this?

Yes.

OK.

So the next set of use cases is bare metal as a service, very applicable to hosting companies. A large APAC-based telco is using this; essentially they are offering bare metal as a service, and in that context we are working very closely with the QFX to deliver the solution to the end customer. Again, there
are bare-metal servers; you want to connect them as part of different virtual networks, and you want to interconnect them using policies. Again, overlay-underlay correlation also falls in that category.

The third is mostly focused on the telco side, whether it's service chaining, creating a telco cloud, or creating SD-WAN, which some people call vCPE: creating that SD-WAN environment where you do customer branch networking in a proper way. Again, there also we have a bunch of customers; with one of them, NTT, we have done a joint session. We have worked with them and co-created a model where you can do the call home, you can do the initial provisioning of the lightweight CPE device, and you can also do service chaining where services are running on the CPE device itself. So all of that falls under the third category.

Now let me give you one example from cloud services, and then I'm going to talk about what we are doing on the bare-metal hosting side. This is basically derived from a customer cloud. What they have is an infrastructure which they have created, where there are compute, network, storage, and some of the AAA and metering kinds of things. They have created this using Contrail and OpenStack, and they have essentially exposed APIs to this. Of course, all the components of Contrail have APIs, and those APIs are being used by their back-end servers as well as by their applications and front-end systems. So what we have essentially done, and this is where automation and APIs come into the picture, is let our customers expose APIs to their systems using this particular infrastructure. Everything is, as you can see, API driven. So this is basically one of our customers who wants to offer
infrastructure services to their end customers.

The second example is where we do bare metal as a service, with VM and ToR integration. You remember the architecture of Contrail, where there is a logically centralized, physically distributed controller with different nodes, compute nodes, config nodes. We have added another node there, which we are calling a ToR service node. Logically it is part of the controller, but it runs on a separate x86 server which has a vRouter in it. What it does is take XMPP commands and convert them to OVSDB, and that's how it talks to the top-of-rack switches. And then there is EVPN: it peers with the router talking EVPN. So you can have VXLAN tunnels, which, in cases where there are VLANs, terminate on the top of rack; in cases where there are vRouters, they terminate on the vRouter itself. There's one that terminates here and handles mostly BUM traffic, and one that terminates on the MX.

Now, what this enables our customers to do is what you see on the right-hand side. If you follow one of the lines, let's say the red line, going from one VM to another VM, Contrail takes care of it through the vRouter. But when you want to go from, let's say, a VM to a bare-metal server in the same network, it goes through the top-of-rack switch; there is a bridge domain, and as was mentioned before, the VXLAN tunnel ends there. So you can do a switch. Then, when a VM needs to go to a bare-metal server on the blue network, so you're traversing from the green network to the blue network as well as from a VM to a bare-metal environment, that is when it goes through an L3 gateway. So this is basically what we have enabled for our customers. And as I was mentioning before, if you think of it again in terms of a router, the QFX5100 is another line card there, with
bare-metal servers hanging from there.

OK, and then there is another use case. This is modeled after what we have done for NTT, and it was again a co-creation model, where they developed a bunch of pieces, like billing and charging and some of the centralized portal. They built those out, because our focus is on the networking part of it. Again, there are multiple requirements here. First of all, you need to make sure that this CPE device, which is a lightweight x86 server, can be provisioned, can be instantiated, and so on. Then you've got services; in the case of NTT it was an Atom-based lightweight x86 server. There are VMs running here within the CPE device, and there are also services running within the data center. The way we look at this whole thing is that, for us, this is nothing but another compute node on an x86 server. So all we need to do is figure out how to connect this to the data center or CO or what have you, and once that's done, it's just another x86 server running a vRouter. It sends all the analytics information, and you can do service chaining where services can reside on the CPE device as well as in the data center. So all of those are available: analytics, and service chaining as I was talking about. One other important thing that they wanted us to do was an internet breakout, which we did within the vRouter itself, because customers can go out from the CPE device. So this, as I mentioned, was modeled after what we did with NTT i3. We did a presentation on that, and there's a video on this available as well.

And then finally, I know I'm running close to the top of my hour: this is basically a list of all the videos that we have. A bunch of them are on opencontrail.org already, and you can
take a look at them. There are lots of use cases and lots of demos that are part of those, so please feel free to take a look.

A quick question on the virtual layer, in terms of how the vRouter is built: are you reusing any components, kernel or otherwise, from OVS, or is that totally separate?

Yes, it's totally separate. It's a component that we have developed. Our concept was L3 first; that's what we started off with, and then we added L2 features on top of it. So it's not reusing any of the OVS features; it's a new component.

You're not even reusing the OVS kernel module for the datapath or anything like that?

No.

So what are you doing instead? Sorry if that's too harsh, but what a lot of people do is say, all the user-space processes, we're writing those ourselves, but we'll reuse the kernel module, because of course that's totally different. So you haven't done that?

So the vRouter itself is a kernel module that we put in. What we have done is implement the full L3VPN forwarding model in the kernel, which did not exist. It's pretty much like what happens in the MX; we have it in a kernel module. That's why we needed to do so. Today, if you see the implementation that people are doing for DVR, for example, there is OVS that does L2, then the Linux host does the layer 3, and there are a lot of context switches back and forth, and then there are iptables rules doing filtering, and that causes a lot of latency and performance issues. Since we came from a networking DNA, we wanted to build something that is like integrated routing and switching, with ACLs integrated into one pipeline. So that's why we rebuilt the whole L3VPN datapath. It does a flow table lookup just for the ACL part, and it does VRF and route lookup with the next-hop architecture. That doesn't exist in the kernel, so
we built it completely ourselves. Another reason we built it ourselves is that we wanted to support multiple OSes. Right now most of our traction is in Linux, but we also have a FreeBSD port, and we also wanted to take the same model to Hyper-V at some point. So that's why we built our own module.

OK. Did you try to get it upstream?

I actually don't know the history behind that in upstream, but it is all open source.

OK, that was going to be my question. That's the main reason why people reuse OVS, because, I mean, obviously you take on your own work, but it's just so hard to get that done, for obvious reasons. So that's why I wanted to ask. Thanks for the clarification.
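
The routing-based service chaining described in the talk can be illustrated with a toy model. This is only a sketch under stated assumptions, not Contrail code: the per-VRF tables, the VRF names (`red`, `fw-right`, `dpi-right`), the source host `S1`, and the final hop `S4` with label `L7` are all hypothetical. Only the two lookups "G1 via S2 with label L3" and "G1 via S3 with label L5" come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NextHop:
    server: str  # host the IP fabric tunnels the packet to
    label: str   # label that selects the receiving interface on that host


# Per-VRF routing tables: (server, vrf) -> {destination: next hop}.
# S2/L3 and S3/L5 follow the talk; S1, S4, L7 and VRF names are made up.
ROUTES = {
    ("S1", "red"): {"G1": NextHop("S2", "L3")},
    ("S2", "fw-right"): {"G1": NextHop("S3", "L5")},
    ("S3", "dpi-right"): {"G1": NextHop("S4", "L7")},
}

# What sits behind each labelled interface:
# (server, label) -> (kind, name, VRF of the egress side or None).
INTERFACES = {
    ("S2", "L3"): ("service", "firewall", "fw-right"),
    ("S3", "L5"): ("service", "dpi", "dpi-right"),
    ("S4", "L7"): ("endpoint", "G1", None),
}


def forward(server, vrf, dst, routes=ROUTES, interfaces=INTERFACES):
    """Follow route lookups hop by hop until the packet reaches dst."""
    path = []
    while True:
        hop = routes[(server, vrf)][dst]  # destination never changes
        kind, name, egress_vrf = interfaces[(hop.server, hop.label)]
        path.append((hop.server, hop.label, name))
        if kind == "endpoint":
            return path
        # The service itself is untouched; the packet re-enters routing
        # in the VRF attached to the service's other interface.
        server, vrf = hop.server, egress_vrf


if __name__ == "__main__":
    for server, label, name in forward("S1", "red", "G1"):
        print(f"{server} {label} -> {name}")
```

In this model, removing the DPI from the chain only means pointing the `fw-right` route for `G1` at `NextHop("S4", "L7")`; nothing in the firewall entry changes, which mirrors the "it's just a route" point made in the discussion.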

Info

Channel: Tech Field Day

Views: 9,284

Rating: 4.8048782 out of 5

Keywords: Tech Field Day, Networking Field Day, Networking Field Day 10, NFD10, Juniper Networks, Pratik Roychowdhury

Id: -_oEuV4WXJ0

Length: 28min 33sec (1713 seconds)

Published: Fri Aug 21 2015