Juniper Networks Engineering Simplicity for SDN with Bikash Koley

Captions
How's everybody doing? I know you started pretty early, and hopefully you are fully awake by now. My name is Bikash Koley. I'm the CTO of Juniper, and I'm coming up on my sixth month here. I spent about ten years at Google before that, where my team used to design, build, architect, and operate Google's production network. I have run, built, and designed very large software-defined infrastructure for a large chunk of my career.

One of the things that I wanted to share this morning is the lessons learned when you run an infrastructure where your data center has 10,000 switches, moving to 200,000 switches, and the network has tens of thousands of routers: the principles that you learn in both designing and building an infrastructure at that scale, and how you apply them as generic principles to other networks that may or may not be of that scale, and that may or may not be solving the same set of problems. Are there principles that are actually applicable to those?

One of the things that I learned, in many ways the hard way, is that when you have a large infrastructure you cannot treat your networking elements as pets; you have to treat them as cattle. You cannot go and tend to individual elements as if they were the most precious thing in your fleet. You basically have to treat them as a fleet, find out the characteristics that the fleet has, and then work with those characteristics.

Why is this important? It's important for a couple of reasons. When you run a really large infrastructure, there are a few things that you start learning. Ultimately, the beauty of an infrastructure is when it's invisible. It may not be invisible to you as an operator, but you absolutely want it to be invisible to your customers. The customers could be applications, customers could be users, customers could be others who run services on the infrastructure. And there are some fundamental
properties of an invisible infrastructure. The interesting part is that if you look around and think of the infrastructure that we use every single day (electricity, water, the internet to some extent), it all shares some common principles.

It starts with fungibility. You have an infrastructure that allows you to use exactly as much as you want, no more and no less, and you pay for exactly that amount of infrastructure. In the case of a network, again, it's not the number of switches and routers that you have; it could very well be "I have this bandwidth, with this service level assurance, available for this much time." So it must be fungible.

Reliable. Reliability is often very hard to achieve, because it starts with how you define it. Is it packet drop on a switch? Is it completion of an RPC? Is it uptime of my fabric? What is the definition of reliability? But at the end of the day, the services that run on top of an infrastructure expect to get the service level assurance that they need. So it has to be reliable.

Secure. This is especially true with today's ways of building infrastructure, where the old definitions of segmentation and perimeter, and the boundaries between privately owned and publicly owned infrastructure, are basically disappearing. It is true for users: if you are a large enterprise, your users are working from Starbucks on open Wi-Fi. It is true if you're an application owner whose applications might have been written for an on-premises data center and you want to move them to AWS or Azure or GCP based on economics, ease of migration, and many other things.

And finally, ubiquity: you want your infrastructure to be available everywhere. In some ways this has always been the promise of cloud. The promise of cloud has been: I write it once and I have it available from
wherever I am, and I have it available from whatever endpoint I'm accessing my infrastructure from, be it storage, be it compute, be it application.

Interestingly, you can apply all of these things to running water. It's fungible: you pay for as much as you use. It's reliable: you open your tap and you expect water of a certain quality. It's secure, and it's ubiquitous. Great infrastructure always follows these four principles, and the promise of cloud is effectively to be that infrastructure.

The interesting part is that if you break cloud down into what it ultimately is, it's compute, storage, and network, and then the application on top. For this promise to work, all of these fundamental parts of cloud (compute, storage, and network) have to follow the same principles. So it's critically important that the network being built underneath ultimately follows the principles of fungibility, reliability, security, and ubiquity.

I don't have to spend a whole lot of time here; maybe I'll just flash this for a second. This is showing an example of an enterprise, but you can take the same picture and turn it into that of a telco or a cable provider or even a hyperscaler. Conceptually they're about the same. You have a mix of infrastructure where you started out with a lot of things being on-premises. You possibly have a set of colocation facilities: if you're a telco, those are your edge PoPs; if you're an enterprise, those are often the places where you apply security and things like that. And you want connectivity to public cloud. Of course, the reality is that most infrastructure today is a subset of this. Not everybody has all the pieces, but it's a pretty generic picture that shows what any modern infrastructure looks like.

Now, if you try to apply the principles that I talked about before just on
network (forget about the storage, compute, and applications that run on top), there are a couple of things that you need. One, you want to treat this infrastructure as cattle, not as a pet. What I mean by that is: if I have a data center that is on-prem, and I'm extending that to, let's say, an Azure VPC or an AWS VPC, where I effectively have a virtual data center because I run a bunch of VMs on a bunch of networking nodes that forward packets and apply policy, there is no reason that should look different. Ultimately, what you're trying to do is take a bunch of applications or users that sometimes run on-prem, dynamically move off-prem to public cloud, and ideally move over the public internet, and extend all of the security posture and policy along with them.

So our goal, in my opinion, is to build a network infrastructure where you don't have to worry about the boundaries between public and private, the boundaries between physical and virtual, or whether it is a router or a virtual switch that runs on a server. At the end of the day your applications don't care. Your applications don't care whether an HTTP session or a TCP session led to a packet that was traversing silicon on a Juniper router or was getting routed by an x86 server. For the applications it is exactly the same thing.

So the vision that we're pushing very hard towards at Juniper, and I'm going to share some of the technical details of how we're thinking about this, is: how do we build a network infrastructure where you don't have to predetermine what is physical or what is virtual, what is managed on-prem or what is managed off-prem? You can dynamically move resources on-prem and off-prem without having to rethink what network you have, what policy you have, or what security you have. That's in many ways the nirvana of a fungible, reliable,
ubiquitous, and always-available network.

So how do we get there? We get there by effectively applying the same principles that many of the hyperscalers have applied for a long time in building very large-scale infrastructure. Let's start with some of the fundamentals. What does a network do? It connects. It sends packets a short distance, it sends packets a long distance, it switches packets, and it blocks based on policy if necessary.

The main thing that has happened over the last 10 years, in my opinion, is that people have figured out different economics of how to forward these packets. There are a lot of discussions on SDN and white box and NFV, et cetera. If you ultimately peel it all down and look at the principles, what you are really after is: what's the best economics for me to forward or switch packets? It just comes down to economics. It's not philosophy, it's not religion, it's not even principles, even though a lot of people make it that. Ask anyone who has worked on really large SDN projects. I was part of the team that built Jupiter at Google, and B4 at Google, and Espresso at Google. It is never about religion; it's almost always about economics: what's the cheapest possible way for me to forward packets?

And if you look at it, it ultimately comes down to what applications are driving those packets. There are applications that require a very high rate of lookup along with speeds and feeds. What you find is that if you try to do that on an x86 processor, of course you can do it, but it's going to be very expensive. The equivalent is: can a CPU render your latest and greatest video game on a 4K screen? It can, if you have lots of CPUs, lots of bandwidth connecting the CPUs together, and you're willing to spend a lot of money. Instead you build GPUs, because they are optimized in hardware to render a screen at a very high rate at the lowest
possible cost. The separation between CPU and GPU, again, is not religion; it's economics. It's the exact same thing for the network.

But there are applications where you might need a massive amount of programmability in terms of how you look at a packet and how you apply policy. You want to make the policy very rich, because your policy might be: I have this application tier, let's call it the web tier for a second. For these applications, for this user group, I don't want to allow these applications to talk to the storage tier of this user group that happens to be on public cloud. That's my security policy. I'm making this up, right? A policy this complex is often hard to encode in a pre-built ASIC, because you may not have enough table size or enough abstractions to do it. Well, x86 or ARM processors work great for doing that. So those are the scenarios where you say: I need a table size that is dynamic; the table lookup changes based on the applications that I run. Whether I look at merchant silicon or the greatest NPU that Juniper builds, I still have a fixed table size. It might be a couple of million in one, it might be a couple of hundred thousand in another. It's not enough, because I'm building richer and richer policy. So L3 cache on x86 is your friend, or RAM on an x86 server is your friend, because you have gigabytes of table size. That's when you use virtual.

The thing that we want to completely remove from this (and this has been the mission of this company since well before I joined, and I feel very passionate about it) is the concern about how your packet gets forwarded underneath: your application should see the same thing on top. Which basically means you want the same abstraction for your forwarding plane, you want the same abstraction for your control plane, you want the same processes to run, and you want the same abstraction for your management plane,
and this is the goal of Junos. What we are doing with Junos is: irrespective of the way that you forward packets, we expose the forwarding plane with a common abstraction. We started out with something called AFI, and we're basically adopting P4 across our fleet. P4 is our mechanism to expose the abstraction that you need for the underlying switching. And by switching I mean the abstraction to forward or switch packets; it could be silicon, it could be x86, it could be something else.

We're doing the same thing with the control plane. You may have seen an announcement from Facebook where Juniper is supporting their Open/R agent on our control plane using our standard API, called the JET API. Again, the idea is that the control plane state is mutated by Juniper applications, like Juniper's BGP or Juniper's RSVP-TE, in the exact same way as by your own applications that you might be writing on-box or off-box. We give you the exact same API to go and mutate the state that you have on the box.

The third is the management plane. You might have an application where it's not worth your time to write a control plane protocol. BGP works great, or EVPN-VXLAN works great for your use case, but you want to manage the infrastructure that you have underneath as cattle, not as pets. Let's use another example. I started out doing EVPN-VXLAN on physical hardware in my data center, and I had all these policies that I wrote that allow me to do segmentation of my applications. Maybe I did micro-segmentation by throwing in a set of physical firewalls. And I see the economics of public cloud: now I want to move this infrastructure to public cloud without affecting how my applications are written. The way we offer a solution to that is: Contrail gives you an EVPN-VXLAN overlay that runs
on x86. It is the same micro-segmentation policies that used to be on, possibly, SRX, but you also have the ability to take SRX with you to a public cloud and run it as vSRX. And you have an exact replica of the infrastructure that you had in your private data center: same operations, same policy, same ways of managing it. You have magically taken your infrastructure that is on-prem and moved it off-prem.

It all starts with building the right abstractions and the right operating system on top. It starts with, as I mentioned, Junos, which gives you the abstraction for the forwarding, control, and management planes, and it finishes off with Contrail providing the common orchestration capability, irrespective of whether you are orchestrating a physical or a virtual network. This, by the way, is a significant enhancement to what Contrail originally did. Contrail still happens to be the number one SDN for OpenStack. It is the most deployed production-grade SDN. AT&T has announced this in the public domain, so I can mention that AT&T's NFV infrastructure runs on Contrail, one of the most demanding environments, where people have run very demanding applications. It has been used and proven as an orchestration system that does scale-out management, orchestration, provisioning, and policy enforcement for large infrastructure. We're taking the exact same concept and extending it to the physical layer, so now you have the ability to use Contrail to manage not only your overlay but also your underlay.

And to be very clear, frankly, I don't care whether it is called underlay or overlay. It doesn't really matter. Whether you route packets on silicon or route packets on x86, it's the same thing, and we want to make it the same thing, where people don't have to think about whether it's overlay or underlay. People don't have to make the choice of "I'm doing SDN" versus "I'm doing a physical network." It's the same. It
doesn't matter. I have the same ability to go and orchestrate and manage and apply policy and do security.

So, connect: we're giving you the ability to connect endpoints in an infrastructure in a way that spans the physical and the virtual boundary. Pretty much every product that we have as a physical appliance actually has a virtual counterpart. MX has vMX; it runs with almost the same performance as a physical MX, modulo the limitations of the ports that you have in a server. Same thing with SRX: the virtual version has about the same performance, again modulo the limitations of the ports that you have on a server. We're creating a CPE device where, again, the goal is that you do service chaining and you can bring VNFs up and down, and you have the same capabilities on the VNFs too.

So: connect, orchestrate. Contrail is our mechanism for doing this orchestration. I'm going to go into a little more detail on how this orchestration platform works, but I'm setting this up as to what we're solving. Once you have done that, the next bit is see. I'll not spend much time there, because you heard a lot about what AppFormix is doing this morning. With see, again, our goal follows the same philosophy as before: when you have an infrastructure and you want to get visibility into it, you shouldn't have to distinguish between whether it's physical or virtual, whether it's in public cloud or private cloud or even bare metal. You should get the same single-pane-of-glass visibility for compute, storage, and networking, irrespective of physical and virtual. And you should get the capabilities to not only see but do other things with it. The important parts: operational analytics, stitched with orchestration. If my infrastructure is going sideways, I have the ability to fix it, because I have smarts in my visibility system, which are the machine learning examples that you heard
before, and do, stitched with orchestration. Rule-based monitoring and alarms: I want to know when things are going bad based on some thresholds that I put in. And capacity planning, which is critically important. That last bit is often overlooked. Let me give you a concrete example. I am a service provider, and my business is selling VPNs at the edge. I have this edge router that I bought (and by the way, this is a control-plane-heavy application, not necessarily a data-plane-heavy application), and I actually don't know how many sessions it supports, because I have no mechanism to know how my routers behave as I keep adding sessions to them. Our goal here is again to give you a common mechanism to collect data, process data, and store data in a way that the data store itself is not different for actions that are taken in seconds, stitched with orchestration, or actions that are taken over weeks or months. You don't have to go and find two different data sets and apply two different mechanisms to do planning. It's the same common data store and the same behavior.

Let me just read back my understanding of what you're talking about here, because I think there are a few things going on; there's a whole bunch of gears spinning. I think what you're saying is that Juniper is moving to embrace both physical and virtual devices; that the control plane for those will be driven by a P4 abstraction over time, so you use P4 as a language to define packet forwarding and to some extent the application rules. And then you're talking, obviously, about the SDN type of environment where it's applications at the edge and then packets in the core. Is that approximately true?

So what we're saying is that we are giving you the capabilities to control Juniper infrastructure at any layer that makes sense for you. If for you it makes sense to control the forwarding plane,
you can, with P4. For many that will not make sense; many will want to control it at the control plane.

Then you move further up the stack?

That's right.

Then it becomes Contrail?

That's right.

To do the configuration management capability. And then on top of that, in the last slide, you talked about AppFormix becoming a combination: an analytics engine giving us data about the combined state of the devices, the configured state, the operational state of the network platform. But you're also saying that this becomes an automation platform, then, for DevOps operations?

Yes.

So again, this is getting away from the network as devices to the network as a platform, so that you can pull a lever and the gears all clunk and then it all trickles down. That becomes, in the very loose parlance of the current state of words, intent slash policy slash application centric, depending on which flavor of sauce you wish to apply on top of your thing. You know: I want to make my network go faster here, click, faster please. And the engine, AppFormix, will say: yes, there's a performance problem here, I can measure this; then I will send some configuration commands to Contrail; Contrail will work out how to act at this point in the system; and then it will go make the appropriate changes, and so on and so forth. That's sort of the vision that you're laying out?

I couldn't have done the summary any better. Yes. What I was going to do is skip this slide and go into the details of how we're doing it.

Okay, sorry, I just wanted to read it back.

Yes, exactly, you described it perfectly. So the vision is exactly that. The interesting thing that I wanted to share with you is that it's not a vision; it's actually very close to reality. The reason is that we started out with a large number of these building blocks already there. Junos has always had this capability of programming at
AFI; we added P4. So there was AFI, and we added P4, so it's not really new; it was there. Contrail had a very strong intent and policy language from the beginning, and it always had the ability to go and orchestrate thousands of endpoints; they just happened to be virtual when it started out. AppFormix started out its life, again, managing applications, and it can handle hundreds or thousands of endpoints, collect telemetry, and make sense of it, as you saw in some of the demos. We're really putting these functions together.

The real change (I don't call it change; evolution is a better word), the real evolution that is happening is that what we are really building is a common platform as a service. It has an internal name; externally you will see it referred to by the applications that run on top, and we have been calling them bots. You might have seen or heard names like PeerBot or TestBot or HealthBot. The basic idea is that we're building this as a real platform as a service where you have clearly defined northbound APIs, and these are REST. So there will be either Juniper applications that drive them, or somebody else can write applications, because the APIs are clearly defined and they have clearly defined data models. Intent is used loosely in various different ways, but ultimately it comes down to: is there a clearly defined data model that allows me to talk to the network, not the switches or routers?

Yeah, intent is a very complicated issue.

Yes. I think ultimately intent is: I don't want to know how it's done, I basically want it done.

Yes. And manage the cattle and not the pets.

Yeah, or even manage the pets, but only tell me when it's broken.

Yeah, sure. You know, we're seeing some intent-based systems say, in the data center, "these cables are wrong." Well, my intent is actually to have a working network. If the cables are
wrong, then I can say from the policy that you should go and fix those. That's the simplest reduction of the logic.

Absolutely.

Whereas a more complicated one is: I've got an MPLS backbone with four carriers at the back of it, and I've got my analytics visibility platform (in the case of Juniper, it's AppFormix) telling me that there's some sort of brownout. Bandwidth ran out; I'm not getting enough. So I either have to go and kick the telcos to provide a decent level of service, or just go buy more bandwidth, or reroute around that path, that LSP.

That's right.

I'll just focus on the networking here, but the demo from AppFormix was also talking about scaling up capacity in the cloud: I have a service brownout, and it's actually due to a lack of web servers or VMs. And then I can start to send commands into whatever the backend system is and say, scale up the number of instances.

Yes. And you can extend that to SD-WAN. From Juniper's perspective, SD-WAN is a feature, not a product. It's a feature in that it's another way of connecting from A to Z. So it's the same intent that can say: I was using this VPLS network from somebody and I see a brownout there; do something. And the "do something" might be using the internet, or moving to public cloud. And it's not just the network; it could be moving the workload to public cloud.

Right.

So yes, your description is exactly right, and the approach that we're taking is exactly how hyperscalers have built this infrastructure. If you look at this picture, you'll find lots of similarities with Kubernetes. This is built using almost the same principles that Kubernetes uses for orchestrating services; this is for the network.

You're leaning toward a microservices model?

It is absolutely microservices. It uses the exact same principle: just as Kubernetes has a manifest, this has an intent, and
it is pushed down over a clearly defined REST API. By the way, this part of the picture is completely leveraging Contrail, with evolution. This part of the picture is completely leveraging AppFormix, with its evolution. What it's going towards is that you have a clearly defined state database for the network. We call it the intent DB and config DB; it has two levels of abstraction that are captured. Similarly, on the AppFormix side, there is a telemetry DB that keeps the telemetry for the whole infrastructure. At the bottom of this you have plugins. The plugins are both Juniper-specific and non-Juniper, and these plugins are what do the translation. Loosely it's a template, but it's not really a template; these are actual software plugins.

So you'd have plugins for each model of the physical and virtual infrastructure that underlies it?

Exactly.

And that would be customer-led?

That's correct. For Juniper, all these plugins already exist and we use them; Contrail has always had plugins for all the Juniper stuff.

And you'll extend that, if you're going to build a platform that's viable?

So long as the underlying platforms allow real configuration push. We will not do CLI hacking. I mean, somebody can write a plugin that does CLI hacking; they're welcome to do it. We will not do it. But as long as somebody supports modern ways of pushing config, generally NETCONF and YANG, or gRPC and OpenConfig, we'll support that. Same thing with telemetry. We already have plugins for [unclear], Kubernetes, and of course all the cloud stuff for storage and applications, but also SNMP, sFlow, NetFlow, UDP streaming, gRPC streaming: standard stuff. And there are a bunch of service primitives that we are writing. Again, this is very similar to Kubernetes: there are some applications that are
embedded; kube-proxy is an application that is already embedded. So we have a bunch of applications embedded in the system that are commonly used by almost any application; an inventory database is an example. Then we give you the ability to write applications on top, and these are the things that are called bots. We're already releasing a set of Juniper bots. Ultimately, a large chunk of this (not all of it, a large chunk) remains in Contrail and therefore will remain open source. So all these APIs are going to be published openly, and this is usable for others to go and build an ecosystem. The thing that I want you to leave with today is that the ability to manage a physical and a virtual network as a system, with clear definitions of APIs and data models, is not a dream. It's actually very real.

If you look at this and don't go "the CLI is just not a part of the long-term future," you've got something really wrong with your head.

Yes.

Because this is all software layered on top of something, whether it's a Juniper MX or a white box, or running Junos on top of a white box switch, or a virtual server at the edge, or AWS or Google. There's no CLI here. And I've been banging on about analytics and visibility for the better part of 18 months, but this is the integration of the analytics engine with the configuration engine, and that's the feedback loop closing up.

And then the applications close the feedback loop, right. Some of those applications would be ours: HealthBot is a good example, or AppFormix has a set of applications that close the loop. You will be writing applications, for sure; people will write applications, and they'll close the loop in the way that makes sense for them.

A challenge here is security, because of firewalls and IDSs and IPSs, and that's the gap that's missing. That's not being
addressed here.

Yeah, actually...

But you'll be evolving into that?

It is being addressed, so let me talk about that. I know I'm a little bit over time; I'll spend 30 seconds, maybe, to talk about security. We're going after security in three different ways. One is, as I said, our firewalls are virtualized. They're actually containerized now. cSRX has a tiny footprint, about 25 megabytes; it's a very small footprint. It already runs in AWS; it'll run on GCP and Azure. Actually, I think the Azure integration is done. So you can take what you have done with a physical firewall and plug that into their infrastructure. Contrail announced a thing called Contrail Security, where the idea is that the Contrail vRouter already has a stateful L3/L4 firewall. With that, if you add a cSRX on the side, you have an L3/L4/L7 stateful firewall, and this is fully distributed, so it's not perimeter security at all. And by the way, this is one of the reasons that a lot of our enterprise customers use Contrail: because it gives them micro-segmentation.

You're then starting to get into the sidecar proxy model, where you're running a cSRX on the side of every container to give us visibility and control.

Absolutely, yes. And by the way, this infrastructure is being built absolutely to support that, so the policies are not just network policies; the policies are security policies as well. So you go and push the security policy based on whether it goes into a physical firewall or a virtual one.

We certainly have some security, because all of a sudden we've got visibility too, and we've got configuration as well, but it's just those legacy appliances: you have to go in the middle of those flows, because legacy is really important, apparently.

Absolutely. Like stupidly important. I mean, the reality is that I have never seen any of these changes work when you have
assumed that legacy has disappeared. You have to assume that legacy is going to be around for a long time. You could still take it with you, and again, that's the other approach that we're taking. So thank you very much. I know I've taken more time than I had, but I'll open it up for questions if you have any.

Just a quick one on security. So the end vision is that you have the complete transport in one, let's say, pane of glass, and security is just one part of that, an additional part that you can just add, and it is distributed, and you see that also in the whole package? It's just one feature?

My view of security (actually, Juniper's view of security; you may have heard the term SDSN, Software-Defined Secure Network) is this: just as here we're saying you don't go and configure individual routers, you don't go and write security policies that say "block this on this firewall and block that on that vRouter." You say: my policy is at my application level or at my user level; these are the things that are allowed. Then the system goes and processes that and figures out: okay, I need to now follow this application, because this application is moving into the cloud, so the same micro-segmentation policy that I had on-prem now moves to the cloud. And that is done by software; a human is not going and figuring out "now I need to go and plug a bunch of firewalls into AWS." So that's the vision. We're not there yet, but we're getting pretty close, because cSRX and Contrail Security get us really close to that vision. We'll be there by the end of this year, I would say.

Okay. Thank you very much for your time; we really appreciate you guys coming over. Thank you.
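
The architecture described in the session (an application-level intent stored centrally, per-platform plugins that translate it for physical or virtual devices, and a telemetry-driven loop that reacts to a brownout) can be sketched roughly as follows. This is only a minimal illustration in Python under stated assumptions: it is not Contrail, AppFormix, or any Juniper API, and every class, function, field name, output string, and threshold below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Intent:
    """Hypothetical application-level policy: may src_tier talk to dst_tier?"""
    src_tier: str
    dst_tier: str
    allow: bool


class Plugin(Protocol):
    """Hypothetical plugin interface: translate an intent for one platform."""
    def render(self, intent: Intent) -> str: ...


class PhysicalFirewallPlugin:
    def render(self, intent: Intent) -> str:
        action = "permit" if intent.allow else "deny"
        return f"physical-fw: {action} {intent.src_tier} -> {intent.dst_tier}"


class VirtualRouterPlugin:
    def render(self, intent: Intent) -> str:
        action = "permit" if intent.allow else "deny"
        return f"vrouter: {action} {intent.src_tier} -> {intent.dst_tier}"


@dataclass
class Orchestrator:
    """Keeps intents in one store and fans them out through plugins."""
    plugins: Dict[str, Plugin]
    intent_db: List[Intent] = field(default_factory=list)

    def apply(self, intent: Intent, placements: List[str]) -> List[str]:
        # The same intent is rendered for every place the workload runs,
        # so the policy follows the application on-prem or off-prem.
        self.intent_db.append(intent)
        return [self.plugins[name].render(intent) for name in placements]


def in_brownout(utilization: List[float], threshold: float = 0.9) -> bool:
    """Illustrative rule: mean link utilization above an assumed threshold."""
    if not utilization:
        return False
    return sum(utilization) / len(utilization) > threshold


def choose_path(utilization: List[float], current: str, backup: str) -> str:
    """Close the loop: telemetry in, a path decision out."""
    return backup if in_brownout(utilization) else current
```

A caller might register one plugin per placement, for example `Orchestrator({"on_prem": PhysicalFirewallPlugin(), "cloud": VirtualRouterPlugin()})`, apply a single `Intent("web", "storage", False)` to both placements, and feed recent utilization samples to `choose_path` to decide whether to stay on the current path or move to a backup one. The point of the design is that the application states only the intent; the placement-specific translation stays inside the plugins.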
Info
Channel: Tech Field Day
Views: 7,738
Rating: 4.9365077 out of 5
Keywords: Tech Field Day, TFD, Networking Field Day, NFD, Networking Field Day 17, NFD17, Juniper Networks, Bikash Koley, big data, sdn
Id: BxC6uluk6k4
Length: 34min 37sec (2077 seconds)
Published: Wed Jan 31 2018