The State of Open Source Routers

Captions
So next I'd like to introduce Russ White, who's going to talk to us about open source routers.

There we go. All right. Hey, it says Q instead of clue. I thought somebody left me a clue up here; I need one of those. So this is about the state of the open source and disaggregated ecosystem. Parts of this can be open source, parts of this are going to be commercial stuff, and that's okay. I'm Russ White; I work for LinkedIn. This first slide is disaggregated chocolate chip cookies, in case you're wondering, because I think we should all do everything disaggregated.

So first of all, let's start with an overview of the models that I'm talking about here, because some people may not be familiar with them. In SDN, of course, you have a controller and you have a bunch of network devices, and essentially you have this API that connects the controller to the network devices. In an open standard control plane, you have the network device, and then you have an open standard control plane like BGP or OSPF or IS-IS that is running on all those devices and resolving your connections, and you rely on something like the IETF to build those things as an open standard. In the disaggregated model, you actually have the network device, you have some sort of open source or commercial NOS that you're using on that network device, but it's separate from your control plane. So your open standard control plane may or may not be a separate application running on top of the open source or commercial NOS, but it allows you to make your control plane decision separate from your NOS decision. It's this third model I want to talk about, because that's the area I think is pretty interesting right now: what the sources for those NOSes and control planes are, and for those white boxes or brite boxes or whatever you want to call them.

This is the chart comparing the three. SDN potentially has lower cost. The big
point, from my perspective, is that this is not about saving money. White box is not about saving money. Disaggregating is not about saving money. It's more about aligning your technology with your business, because the network can be considered either a strategic advantage or a tactical advantage, or it can be considered a cost center. If you consider your network a cost center, then you probably don't want to disaggregate; you probably want to focus on driving your vendor costs down. If you're more concerned about figuring out how to shape your network to your application (and of course the hyperscalers and web-scalers are leading the way in this realm), then you really want to think more about white box, SDN, and disaggregation.

So let's talk a little bit about the disaggregated model and think about some of the different components you would need to build or purchase and put together, like a bunch of little toy blocks, into an entire system. You start up here at the top. You have a routing stack; of course you're going to need some sort of control plane that's going to contain BGP and IS-IS. It's going to contain a RIB to manage the routes and to do arbitration between the various routing protocols. It's going to have to contain a hardware abstraction layer, which is actually going to connect the RIB to a forwarding ASIC. Then you have a kernel, and of course the kernel is going to run memory management and things like that. One thing you have to think about in this area is that when you do a ping or a traceroute or whatever, those are kernel applications, so the kernel needs to know about the routes coming out of the RIB in some way so that you can actually make those applications work. Then you're going to have what I call a platform abstraction layer. This is going to spin the fans, turn the LEDs on, turn the LEDs off, change the color of the LEDs, do all that cool stuff, so when you bring visitors into your data center you have a way of making your
LEDs flash green or red or whatever you want as they're walking through, make them all blue or whatever. Then you have a forwarding ASIC, and we're going to talk about several providers of those. Then you have a PHY, which is going to be your physical interface to the box. And down there at the bottom I have ONIE. ONIE is basically a bootloader, an open-source project that's being worked on currently, that allows you to boot boxes and do all the platform abstraction layer stuff pretty easily.

So there are a couple of different options here when you're looking at doing disaggregation in this sphere, or in routers. Of course you can always do what I call the appliance model, which is you use an appliance vendor's hardware combined with their software. You're basically buying and racking and stacking a chassis or a 1RU unit, and you're buying everything from a single vendor. This is what people like to call the one neck to choke model, and I'll talk about that in a minute and whether or not I believe it really works. Another option is to run open source control planes, or open source software, on appliance vendor hardware. There are a number of appliance vendors, people I consider big vendors, working in this space right now; I have a slide on that later. Or you can use vendor software on top of white box or brite box hardware. You can actually buy vendor software today from some of the major vendors that will run on white boxes, and again I'll give you some names a little bit later. Or you can do open source software on top of white box or brite box hardware. So these are the different models you can get into when you think about disaggregating.

I would say the biggest issue in this space is to decide which pieces you want to own internally, which pieces you want to outsource, and what you actually want to work on. So for instance you may say: I'm not a hardware company, I
don't want to roll my own hardware, but I can see the value of owning my own control plane, or using an open-source control plane that I can tweak. In that case you might go to an appliance vendor, a standard vendor, and buy hardware with support, and then rely on open source control planes to solve the rest of the problem. On the other hand you may say: my real concern is saving money on the hardware side, on the capex side, so I'm going to go to a vendor and buy my software, but I'm going to buy my hardware from whoever I might be able to buy it from. We can think of different models in the PC market that do the same sort of thing. I can buy a Dell or a Lenovo and run Windows on it, which is buying vendor software with, theoretically, white box or brite box hardware. Or I can go the other way: I can buy a Dell and consider Dell my vendor (or Lenovo or HP or whoever else, it doesn't matter) and run Linux on top of it. So those are the two types of models we're talking about there. There are all sorts of different ways of doing this. The main thing is deciding what you want to own and where you want to own it.

So let's talk about white box. With white box, you have to start with chipsets, and there are a number of companies making chipsets out there right now. Broadcom, Barefoot, Cavium, and Mellanox are the four that I chose for the slide. There are others; I have some others talking to me right now, and I'm trying to figure out more about what they're doing. These all supply hardware chipsets not only to the major vendors but to white box people as well, like Delta, Alpha Networks, [unclear], and Celestica. Believe it or not, you can buy what is effectively white box from Cisco, Dell, and Juniper as well. If you know the right people to ask and the right way to go about doing it, you can buy what is effectively white box or brite box from them, and they'll even provide you a TAC for
it, and whatever else you need.

So let's talk about some vendor software stacks. There are a number of these out there as well. Of course there's 6WIND, which is producing a version of Linux that has its own routing stack on it. I'll talk about routing stacks separately, because there are a lot of interesting developments in the routing stack area that I want to spend a little more time on. There's always Cumulus Linux, which is an interesting choice if you're trying to disaggregate and buy your hardware from one company and your software from someplace else. Juniper has a version of Junos that will run on other people's stuff. There's LabN. Cisco actually has some projects underway, or perhaps already announced (I haven't really kept up very well), running versions of XR and some other things on white box. RtBrick is doing some interesting stuff in the space. Dell has OS10, which came out of the Force10 acquisition; they are working with SnapRoute and several others (SnapRoute is on there as well) to build an open source network operating system around OpenSwitch, the old OpenSwitch community, so SnapRoute is in that community too. And then Big Switch of course has their own OS that they're working with right now. There are others out there; again, I'm just trying to give you a sample of what's out there. If you have specific questions about a specific vendor, or somebody you've not heard of, feel free to email me and I'll tell you what I know about them.

The platform abstraction layer is actually one of the hardest components to source. A lot of times it's just like drivers on your PC or on your Linux box: you actually need to get this from the hardware vendor, and they're going to grab pieces from the vendors of their chipsets and try to pull it all together for you into a single installable application you can use. So you have to somehow figure out how to connect your
hardware platform with your OS. This includes, again, things like talking to your fans, talking to your PHY chips, talking to your LEDs, running your fans, things like this that need to be done to make the box run. Again, these are normally provided by a hardware vendor or a software vendor. Consulting companies will actually write these: if you go to LabN or somebody like that and you say, I really want a particular chipset in my white box to work with a particular Linux release or this particular network OS, you can generally write them a check and they will make it happen.

The ASIC hardware abstraction is a very complex area right now. There's an awful lot of stuff going on in this space, so you really have to think through exactly what you're doing. The hardware abstraction effectively takes your RIB routes and installs them into the FIB, or into the ASIC, so that you can do forwarding.

There are things like SAI, the Switch Abstraction Interface, which is supported by pretty much all chip vendors. It's a pluggable architecture; in fact OpenNSL is the way SAI supports Broadcom, in that it basically uses an OpenNSL plugin in the SAI architecture. So this is going to pull your RIB routes in some way and actually install them in the FIB. OpenNSL is a Broadcom-only hardware abstraction layer. What OpenNSL basically is, is a gloss or a thunk layer sitting on top of Broadcom's API, and if people ask Broadcom for more features, they decide whether they're going to include them in OpenNSL or not. P4 is slightly different. It's Barefoot Networks' version of an ASIC or hardware abstraction layer. What's interesting about P4 is that it's actually a programming language, so you run an API and an entire development environment that can program the chipset. asicd is SnapRoute's interface to a wide variety of ASICs; this is proprietary to SnapRoute, and you have to buy it from them. switchd is Cumulus's very similar product. FD.io
is based on DPDK, and this is another open source project that I think started out at Cisco and became a general project that you can use. It's largely focused on NICs, however, rather than network switching hardware. So this gives you a slight rundown of some of the projects that are out there. In fact these are all the ones that I know of; there may be others that are built into things like IP Infusion and other projects.

So let's talk about routing stacks a little bit. There are three basic ways you can garner a routing stack. The first is you can go totally open source, which is represented by something like BIRD, which is a fairly complete routing stack (it has BGP and OSPF at least), or GoBGP, which is a Go version of BGP; I think it was originated by Google and is open source. Then you have this blend of open source and commercial. One of the places you can go there is SnapRoute: you can go to the GitHub repository, download the SnapRoute code, compile it, and run it, but if you want to pay SnapRoute to support that code, or if you want SnapRoute to do something to that code like add features, you can write a contract with them. Cumulus is much the same way with Free Range Routing, which I'll talk about in just a moment; that has its own set of slides. Then there are commercial versions, like IP Infusion, Cisco, and Juniper. IP Infusion will sell you a complete routing stack included in their NOS; Cisco will, and so will Juniper.

Free Range Routing is a newish project. It was forked off of Quagga, so this is something that is six months old, seven months old. Free Range Routing was actually brought into the Linux Foundation in a way where it cannot be controlled by a single company. It's being supported right now by Cumulus, Big Switch, Volta, 6WIND; you can read the list. It's a huge list of companies that are actually supporting Free Range Routing. All of these companies are not only
supporting Free Range Routing commercially; a lot of these companies actually have coders working on features, and I'll give you a sense of some of the features in here. The big complaint I've heard from people I've talked to about Free Range Routing is: there are so many features going in and so many commits that we can't actually keep up with the code changes. There's a lot going on. IS-IS is being rebuilt, there are a lot of performance improvements in BGP, there's a lot of stuff going on in Free Range Routing right now.

So for instance, in stable 2.0, these are some of the changes that were made. Performance and scale fixes: for instance, the entire BGP scanner was pulled out of the BGP implementation in Free Range Routing, and it was moved to an event-driven mechanism for walking the table when necessary. Add-path support is in there, remote-as, BGP hostname support, update groups (which is for your update packing in BGP), next-hop tracking (which is part of the business of taking out the BGP table-walking processes), 32-bit route tags. There's a ton of stuff going on in here. There's a label manager going in and all sorts of new features going in, and like I said, IS-IS is being rebuilt. Of interest as well, for anybody who really cares, EIGRP is currently being built for Free Range Routing. Babel is actually in and being worked on. This is a very broad set of features.

Okay, Free Range Routing's next version is 3.0. We're actually in discussion on when 3.0 gets released and what is included in 3.0, but there's a lot more stuff in that: large communities, partial EVPN (there's a lot of EVPN work going on right now in BGP). Like I said before, IS-IS is pretty much being rebuilt currently. The OSPF backup stuff is being added, authentication is being added to OSPFv3, there's a whole parser rewrite going on, and there's actually an entire API rewrite going on between zebra and the routing
protocols. So there's an awful lot going on right now in Free Range Routing, and I would suggest that if you are looking in the space of open source routing stacks, Free Range Routing should be on your short list of things to look at, not only to play with but to actually implement. We're talking to OpenBMP and to various other people as well about getting new features and new stuff pushed in here.

What's different about Free Range Routing is that there is actually a methodical vetting of submissions. The way the process works is that the maintainer who commits the code from a pull request (this is all Git based) cannot be from the same organization that made the pull request or wrote the code. There are a lot of rules around making sure different companies and different organizations are working in it. So there's a maintainer and steering committee structure for Free Range Routing, and the common assets, like the logos and things like that, are held in trust by the Linux Foundation as well.

You can get it as a binary package. There's a snap package available now: 2.0 is available in the stable channel and 3.0 is available in the beta channel. For Debian and Ubuntu we have packages coming soon; we're working on packaging right now and on how to manage future packaging requirements. And there are some pointers to the source if you are interested.

All right, so at this point I want to turn my attention a little bit to some architectures. I've given you an overview of the space, what's out there, and the components you need to make this work, so now let's look at some of the architectures that are out there and try to get a sense of how some of them, open source and commercial, are actually solving these problems and supplying these things.

Open Network Linux is one of the network operating systems that is completely open source. You can just go to GitHub,
download the repository, make modifications, hack on it, compile it, and deploy it. Open Network Linux tends to be more of a framework than a complete application or a complete NOS that you can download and use; you have to combine it with something like Free Range Routing or Quagga or BIRD or GoBGP to give you the routing stack. The major components of Open Network Linux are the Open Network Platform APIs and the Open Route Cache. The Open Route Cache is the piece that mediates between the kernel calls that do route installs and OpenNSL, SAI, and the other things in there that you can use as your ASIC interface. So Open Network Linux is one of those options that is out there. I don't see a lot of people using ONL right now. It's a pretty neat platform; I've played with it a little bit, and it is usable.

Cumulus Linux of course is a commercial version; it's commercial slash open source. Almost everything in Cumulus Linux is downloadable off of GitHub except switchd, so you have to buy switchd from Cumulus; pretty much everything else is out there. They have a slightly different architecture in that they're very focused on orchestration and automation, so they have an entire piece around orchestration and automation that integrates VMware NSX, OpenStack, Chef, Puppet, collectd, and all these other ways of getting into the system. As a matter of fact, the CLI itself in Cumulus Linux is kind of an add-on; that's not what they're really focused on. They're really focused on the automation side of things. They're also very focused on BGP. Cumulus Linux is currently using Quagga; they're switching to Free Range Routing in the very near future, so when you download or install Cumulus Linux in the near future, Free Range Routing will come as the default mechanism for running the control plane. Like I said, switchd is their thing. switchd actually talks directly to the Broadcom API and other things, so that it will run on top of a number of different ASICs.

SnapRoute's architecture is a
little bit more complex. They are focused more on having a single database through which you can get to everything, and they're using Thrift RPC as their primary connection point to get between things. They have a REST API, or a REST config module, that talks to the CLI, NETCONF, Ansible, Chef, YANG, all these other things, and pushes it through a process called configd. configd connects through Thrift RPC to talk to each of the protocols, so the SnapRoute version of BGP actually uses the Thrift RPC interface to get its configuration out of configd. So when you have a CLI, it talks to the REST config, which talks to configd, through Thrift RPC, to get to everything else. And again you have another Thrift RPC that connects everything that generates information the ASIC is going to care about for the forwarding table down to asicd. Then asicd can talk to OpenNSL, a soft switch, vendor plug-ins, SAI, or whatever, to talk to the ASIC.

SONiC is Microsoft's open source NOS. Again, you can download it off of GitHub, compile it, and run it. It uses a bit more complex modular layout. It is primarily focused around the SONiC base at the bottom, a database, and what's called SwSS; SwSS is kind of the centralized database for a lot of different things in SONiC. So this is an overview of the architecture of SONiC. You have network applications; they talk to app DB. App DB and SAI DB actually work within an object library. SONiC is SAI native, so it uses the SAI interface to talk to the ASICs, which means you can plug in OpenNSL or P4 or whatever you want to, but it actually uses persistent SAI objects to do stuff. There's an orchestration layer that sits in there that allows you to translate between the apps and the SAI objects, and this all lives within the switch services (SwSS) module. syncd actually polls or checks SAI DB for route updates and pushes them through SAI down to
the ASIC, or through whatever switching module or ASIC abstraction you happen to be using. So that's the SONiC layout.

So here are some challenges. First of all, in the disaggregated world you don't get one neck to choke. Now, I don't know how many of you know my history very well, but I worked in Cisco TAC and escalation for many years, and I can tell you that, in my opinion anyway, the one neck to choke idea is actually not what you think it is. It's also known as a single point of failure. It's also known as my vendor makes all my architectural decisions for me. It's also known as the neck may sometimes disappear and not do what you want it to do, because the neck has other customers besides you. So honestly I don't think the one neck to choke model works in large providers, whether it's hyperscale, web scale, or a transport model. I don't buy the one neck to choke argument.

Market challenges: this is an immature market. You can see there's a lot of stuff going on. The slides are beautiful, but you can't compile the slides and run them, right? So your mileage may vary when you actually download this software and try to compile it. It may take a lot of work to get involved in this. There are a lot of vendors and a lot of projects, and everything is in flux. Many of these companies are smaller; there are financial issues in some cases, and maybe not in others. Projects are based on small communities. Routing is a very small community overall, and it's very hard to get people involved in an open source routing project to participate.

The skill set is often a unicorn situation. How many people do you know who know C, can run Git, can actually compile the code and install it, and can configure your network, or run Ansible or Puppet or Chef or whatever to run your network? It's a very unique skill set right now. I think this will change over time; personally I think the skill
set is growing rather than dying. But you've got to be a UNIX head, you've got to be a coder, and you have to be a network engineer, so it's a hard skill set to fill. You can't just be a CLI jockey or a vendor jockey; you have to be full stack. Again, going back to my original argument: for me the whole idea is that I'm very concerned about integrating the business with the network in a way that makes the network something more than a cost center. But this means that, as an engineer, you have to understand the applications and the business in a way that allows you to do that. Again, a unicorn skill set.

Hardware challenges: route count is low on a lot of these hardware platforms right now. Queue depth and buffering are variable. Label imposition depth is very small on a lot of these things. If you're planning on deploying segment routing on top of these, you have to be very careful about your label stacks and what kind of support you're going to get out of vendors in these areas. You need to do a lot of research on cost versus benefit, on what you get for your money, and in the silicon support area before making a decision.

Project and vendor overlap: most ASICs are supported by almost every option presented here. You can use SAI and pretty much be certain that your ASIC is going to be supported. Fans, LEDs, and system chips are very different, so be very careful about asking about the hardware abstraction layer and platform support when you're playing with this.

We've got no features. Okay, the features are building, but my argument is that part of the problem in the network engineering world today is that we love nerd knobs. They're like Linus's comfort blanket. We love our nerd knobs, and we will throw nerd knobs at every problem in the world rather than actually doing real engineering and fixing the real problem at hand. So I actually consider the lack of features a positive thing for making engineers grow up and do the right thing and actually do network
design and the hard work of really running a network.

There's no tech support unless you buy it. You've got to be an educated consumer if you're going to play this game. You need to participate in open standards, and you need to pay attention to what's going on in the projects you're relying on. Many of the APIs are under NDA. And again, if you're going to play in the open source world, I would encourage you heavily to be a part of the community.

So that is the entire presentation, and we're back to disaggregated chocolate chip cookies. If anybody would like a copy of the presentation or would like to chat about it, I'm very easy to find. I'm not going to give you my email address, but feel free to email me. If you have any questions, jump up to the mics and yell at me or whatever. Feel free to contact me offline or to find me; I'll be hanging out around NANOG for the next couple of days. I left you two minutes early, look at that. That's because I talk too fast. My record is 95 slides in 15 minutes.
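
Editor's sketch, not from the talk: the RIB's role described above (arbitrating between routing protocols for the same prefix, then handing the winning route to a hardware abstraction layer that programs the FIB) can be illustrated roughly as follows. The preference values, the class names Route, Rib, and FakeHal, and the install/remove methods are all assumptions made for illustration; they do not correspond to any real routing stack, SAI, or OpenNSL API.

```python
from dataclasses import dataclass

# Hypothetical preference values (lower wins); the talk gives none.
PREFERENCE = {"connected": 0, "static": 1, "bgp": 20, "isis": 115}


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    protocol: str


class FakeHal:
    """Stand-in for a hardware abstraction layer; records what would be programmed."""

    def __init__(self):
        self.fib = {}  # prefix -> next hop, as the ASIC would see it

    def install(self, route):
        self.fib[route.prefix] = route.next_hop

    def remove(self, prefix):
        self.fib.pop(prefix, None)


class Rib:
    """Keeps one candidate route per protocol and pushes the best one to the HAL."""

    def __init__(self, hal):
        self.hal = hal
        self.candidates = {}  # prefix -> {protocol: Route}

    def _best(self, prefix):
        routes = self.candidates.get(prefix, {})
        if not routes:
            return None
        return min(routes.values(), key=lambda r: PREFERENCE[r.protocol])

    def add(self, route):
        self.candidates.setdefault(route.prefix, {})[route.protocol] = route
        self.hal.install(self._best(route.prefix))

    def withdraw(self, prefix, protocol):
        self.candidates.get(prefix, {}).pop(protocol, None)
        best = self._best(prefix)
        if best is None:
            self.hal.remove(prefix)
        else:
            self.hal.install(best)


hal = FakeHal()
rib = Rib(hal)
rib.add(Route("10.0.0.0/24", "192.0.2.1", "isis"))
rib.add(Route("10.0.0.0/24", "192.0.2.2", "bgp"))
print(hal.fib)  # the BGP route wins under the assumed preferences
rib.withdraw("10.0.0.0/24", "bgp")
print(hal.fib)  # falls back to the IS-IS route
```

In a real stack the same winning route would also have to reach the kernel so that ping and traceroute work, as the talk points out; that path is left out of this sketch.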
Info
Channel: NANOG
Views: 5,759
Rating: 4.8823528 out of 5
Keywords: NANOG 70, Kaskadian
Id: JTQqmnVRToI
Length: 28min 19sec (1699 seconds)
Published: Wed Jun 07 2017