Barefoot Networks Software, Architecture, and Strategy

Captions
Great. Good afternoon, everybody. My name is Arkady; I'm a manager here at Barefoot, responsible for core software and network operating systems integration. Ed really set the stage on the importance of software toward achieving Barefoot's vision, so today I'm going to talk about Barefoot's software architecture and strategy.

When we started in that little garage in Palo Alto that some of you have been to, one of the problems we recognized is that network systems are really built from the bottom up. It's the fixed-function ASIC that tells us, "This is how I process packets." It's fixed, you can't change it, and there's not a whole lot you can do. While things like SDN have tackled the control plane and made things a little more top-down driven and a little more open, and things like disaggregation with bare-metal switches added a little more flexibility, the last thing that really remained closed was the data plane. Everything was opening up, everything was getting disaggregated, but the data planes remained closed.

So we set out to fix that. We had a vision that network systems will be programmed top-down, which means that the control plane will really tell the switching ASIC how to process packets. As a result, the function of processing packets becomes user driven rather than network driven. Now, a user could be one of you, or it could be your switch system vendor, or a partner, an integrator, or anyone in between.

Now, we're all a bunch of geeks here, like many of you, but we didn't do this because it's cool. I know it sounds cool, but we didn't do it just because of that. We did it because we wanted to build a differentiated solution for a customer. As a result, our software strategy is really about enabling those differentiated solutions using our P4-programmable ASICs, Tofino and Tofino 2, like I talked about. Or, another way to put it: if the ASIC is your canvas and P4 is
your paint, we just want you to be able to draw the picture.

Now, in the world of programmability, and particularly data plane programmability, pretty much everybody now says that their data planes are programmable. If you go back a few years to SDN, I'm sure many of you have listened to presentations where pretty much everybody would say they had an SDN solution, whereas in reality maybe only a few vendors had a true SDN solution. So what I want to do is level-set with you on what makes a data plane truly programmable. We believe you need to meet these five criteria. Number one, having an open programming language with a community behind it, which is P4. Number two, having a compiler and an ASIC simulation model. Number three, having visualization tools; we will talk about something called P4 Insight. Number four, having flexible and auto-generated APIs, because guess what: if the chip is programmable, if the chip is a blank canvas and I can do whatever I want, my APIs had better be flexible and auto-generated, not fixed. Those things have to go hand in hand. And finally, of course, what everything rests upon is a fully programmable ASIC. And all of this has to be end-user accessible, not always dependent on Barefoot doing this part.

So let's dive into each one of these things. First, P4. As I mentioned, the key thing about P4 is not just that it's a language, not just that it's open, but the fact that it has a community around it. And the community is so diverse that sometimes it's mind-boggling what kinds of people and organizations are part of it. These are not just vendors, not just researchers, but also end users: enterprises, cloud customers, telecom operators. If you had asked me five years ago whether a telecom operator would be writing their own data plane, I would have said you're crazy, but we see this happening more and more today.
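The fourth criterion above, flexible and auto-generated APIs, can be illustrated with a small sketch. The Python below is purely hypothetical: the table description format, the `generate_table_api` helper, and the `ipv4_route` and `vlan_member` tables are invented for illustration and are not Barefoot's or P4Runtime's actual interfaces. The point is only that when the set of tables comes from the user's program, the functions for adding entries can be derived from that program instead of being hard-coded.

```python
from typing import Callable, Dict, List

# Hypothetical description of match-action tables, as a compiler might
# emit it for a user's program (the format is invented for this sketch).
TABLES = [
    {"name": "ipv4_route", "keys": ["dst_prefix"], "action_params": ["next_hop"]},
    {"name": "vlan_member", "keys": ["port", "vlan_id"], "action_params": []},
]


def generate_table_api(tables: List[dict], store: Dict[str, list]) -> Dict[str, Callable]:
    """Build one add-entry function per table from its description."""
    api: Dict[str, Callable] = {}
    for table in tables:
        fields = table["keys"] + table["action_params"]

        # Default arguments bind this table's fields and name to the closure.
        def add_entry(_fields=fields, _name=table["name"], **kwargs):
            missing = [f for f in _fields if f not in kwargs]
            unknown = [k for k in kwargs if k not in _fields]
            if missing or unknown:
                raise TypeError(f"{_name}: missing={missing} unknown={unknown}")
            store.setdefault(_name, []).append(dict(kwargs))

        api[f"{table['name']}_add"] = add_entry
    return api


entries: Dict[str, list] = {}
api = generate_table_api(TABLES, entries)
api["ipv4_route_add"](dst_prefix="10.0.0.0/24", next_hop="port-3")
api["vlan_member_add"](port=1, vlan_id=100)
print(sorted(api))
print(entries["ipv4_route"])
```

Adding a table to `TABLES` yields a matching `<table>_add` function with no change to the generator, which is the property the talk asks of an API for a programmable chip.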
There's one example at the end where I can show you how that happens. So that's the open source community around P4. To dive deeper into this and be clear about the difference between P4 and some of the alternatives for programming networking ASICs that you may have heard about, the difference, I think, couldn't be clearer. P4 has a large open community and ecosystem. With the alternatives, it depends, but there is probably little to no community; there may be a language, but there may not be a community around it. P4 is target and architecture independent, whereas other solutions may be tied to a specific ASIC. P4 is a high-level language: it has a set of generic constructs that can fairly easily be used across different architectures, whereas the alternatives may not quite be in that frame of mind. P4 was designed for network pipeline programming. It's a language designed from the ground up to program the network, not to program x86 devices or any other kind of target. It's a domain-specific language for networking, which is very important. Sometimes people say, "Why do I need another language? Why can't I use what was there before?" Because what was there before does not allow us to achieve the goal. And finally, with the ecosystem and the community around P4, and with multiple vendors embedding P4-based data planes into their switches, by now we can say that P4 is proven in the industry.

Who do you think your target market is? Are you marketing it toward network operating system vendors, or high-end networking users, or...?

OK, so the question is about who the target user really is. It is clear that we don't expect every single person out there, every single network operator, to be writing P4 right away. And while we do have lots of cloud customers, telecom operators, and enterprises who are doing P4, for the
people who would say, "You know what, I really want to do this, but I don't want to start doing P4 by myself right away," we have some ready-made P4 programs available to get them started. Does that answer your question?

Yes, thank you.

OK, so to move on, the question is how the architecture of our software development environment, and the way we integrate with the control plane, answers all of these programmability requirements. Remember, I mentioned five things that are required to call a data plane programmable. Starting with P4: we would have a P4 application, and we would use our compiler (I'll talk about the compiler on the next slide) to take that application, interpret it, and compile it into a format that the silicon can understand. The compiler also helps load it onto the silicon, and it produces a set of information that our visualization tool, P4 Insight, uses to tell you how the program is actually utilizing the resources on the chip.

Now, once I have the program loaded onto the chip, or into the ASIC simulation model (because guess what, to write P4 programs you don't always need to have the hardware; our customers can do a lot of things before they get the chips, if they're a vendor, or the complete systems in their hands), of course you need to integrate the control plane with that P4 program, and you would use a set of APIs for that. Now, which APIs? There are three different types. The lowest level is the program-level API, what we call the Barefoot Runtime API, or you may have heard of something called P4Runtime. These are the ones that give you the most direct access to the hardware. But some folks who are perhaps used to a traditional fixed-function ASIC may not be quite ready to talk to the ASIC at the lowest level, using the P4 tables and P4 constructs. They may want to talk to
the chip using constructs like routes and VLANs, because in P4, doing something like programming a route may take a couple of different operations; it all depends on how flexible you want to be. So they may use the switch application API that comes with our production-ready P4 program to program the chip. The chip then acts, to you, like fixed-function silicon, and many of the vendors we have worked with so far have done exactly that. And finally, there are those who are used to the Switch Abstraction Interface. You may have heard of the Switch Abstraction Interface, SAI, as this common layer that somebody's control plane could talk to without change across different ASICs. So whether I'm talking to Barefoot or to somebody else, I can use the same layer, the same API. We support SAI as well. But the cool part is that we don't really have a religion as to which API to use. There are folks in the industry who sometimes have a religion, and that may work for their particular use case, but because the chip is programmable and can fit many different use cases, sometimes using SAI is actually really good, and sometimes one of our users may use any of the other layers.

So I mentioned the compiler. The compiler is super critical to achieving this vision of data plane programmability and to meeting the requirements for calling yourself truly programmable. We've been doing compilers for quite a while, and doing a P4 compiler is not a simple task. We're now on the second generation of the compiler, which supports both versions of the language and both chips, Tofino and Tofino 2. We improved the efficiency quite a bit: how quickly we can compile the program and how efficiently we can utilize the resources of the chip. Because guess what: if I use the resources of the chip more efficiently, if I can pack things in more efficiently, I can build a more scalable solution. I can put in more routes, or more tunnels, or more ACLs, or anything of that sort. And the most critical thing is that it's
available to end users, the users who may want to do P4 programming, with an open source front end. The compiler has these two parts: there's a front end and there's a back end. The front end is the one that does interpretation and validation of your P4 syntax, much like a compiler for C or C++ or any other language.

A question around that, about the compiling. I remember in previous discussions with you guys, you talked about the idea that if it compiles, it forwards, and it can forward at line rate. Previously we talked about the security applications of that, things like IPsec, and putting massive workloads for security into the forwarding plane. Is that something you have seen develop as you're growing the next generation of chips, the security applications?

Sure, yeah. In the case of security, for example, doing things like DDoS mitigation, doing simple ACLs, or doing ACLs in a high-scale environment, absolutely.

OK. What about MACsec, IPsec, anything in those realms?

So MACsec requires the encryption part, which is not part of the chip today. There are customers that ship platforms that have MACsec, but today MACsec is done externally to the actual switch chip. You can definitely use Barefoot with that, and there are customers that have built platforms that support it. But as Arkady was saying, there are a lot of security applications. I would say we've started to get used in a lot of high-scale stateful firewalls; think of people who need terabit bandwidth for those kinds of applications. Hopefully that gives you a good sense.

OK, so as I was saying, the compiler has an open source front end. Those who want to build a compiler for their P4 program for another target, let's say somebody else wants to build a NIC or a switch or switching silicon, can take that open source front end, perhaps
modify it, add some additional tools on top of it. And of course we have to build a back end that actually translates it into a language that the chip can understand.

Now, when we built the compiler, especially when we started working on our second-generation compiler, we realized the importance of a visualization tool, a graphical user interface that actually tells you how well you're utilizing the resources on the chip. If you are not utilizing enough, it can maybe give you some ideas as to what you can do to utilize more. We've built this tool called P4 Insight, which essentially shows you, in a set of dashboards, how your program is doing: how many SRAM blocks it's taking, how many TCAM blocks it's taking, how much of the parser resources you're using. It allows you to do some fun things, like compare program version 1 to program version 2 and see whether you actually did better or worse. Did you make a change so that all of a sudden you end up taking more stages, or maybe you're doing better? Because with a lot of the customers, we see that they're doing this kind of iterative P4 development.

Are you providing customers with these utilization recommendations before configuration, or is it all just after?

Yes, the utilization recommendations are definitely provided when they build the P4 program.

All right, so it's not like it's too late.

Yeah, it's definitely never too late. Because, as I said, it's iterative development. Somebody builds it, they see the result, and they say, "OK, looks like I have some space over here; I can increase table two to a bigger size," and you keep working on it. Of course, you may have a timeline, if you're a vendor or you yourself are building a solution, and at some point you say, "OK, this is good enough," like in any engineering project. But we've seen customers who have deployed a P4 program, a P4 data plane, and then they start seeing, oh, looks like I need to add another feature, looks like I need to add another
security feature, because I started seeing some traffic that I want to be able to handle in a very specific way.

That's exactly what these folks are doing. I was actually working with one big customer this week in that specific area.

OK, so this is completely part of our software development environment. When our customers get it, it's all built in; they start using it, and essentially what it does is save people time. Think of it like this: if you have two P4 developers, now you sort of have a third one that helps you out, makes things go a little faster, and allows you to accelerate delivery of your solution.

So, as I said, not every single person is going to write a P4 program from scratch. For that reason, we supply a set of P4 applications, which we call switch.p4, which are ready-made. They answer several different use cases, from enterprise to top-of-rack to spine to things like segment routing, some of the new applications. They have the data plane telemetry built in, which Roberto will talk about later. They're pretty flexible: as in any software development scenario, people want to start with something and then add on top of it instead of doing things from scratch. So what we've seen customers do is take that P4 program and modify it. They remove some features they don't want, because they want to keep things simple (Ed talked about simplicity before), and they may want to add some more features. And with that, remember, I talked about this abstraction API that allows the customer to talk to the ASIC using common networking constructs. The APIs come with that; they're also auto-generated, they're also flexible. So if you add a feature to our P4 program, you can have an API that goes along with it.

Speaking of segment routing, since you brought it up: I'm assuming you're working with software vendors to do SR-MPLS?

Yeah, right now.

Are you doing
SRv6 as well?

Yeah, SRv6, yes. And guess what, I heard that the community is thinking of some other version called SRv6+.

Yeah, there's a new version of it.

All right, so the beautiful part of this is that the chip is programmable. You want to do something else, you can do whatever you want. You want to do something else, go ahead. It's really up to either us to provide a reference, or an end user or vendor to build that solution using P4.

Now, kind of an interesting question. I'm thinking of some cool use cases for it, going down those lines too, and I hear you saying that not everybody's going to program their switch, which of course always scares the hell out of me, the thought of everybody having a switch and being able to program it. So let's assume that I decide to go make a cool application, and I really don't do it very well, and it completely hoses stuff on my data plane. How difficult would it then be, from your compiler's standpoint, to get information from it, for it to tell me what went wrong, for it to go, "Dude, you suck at this," something along those lines? Yes, I want to have the right people do this, but you're making a platform and a whole environment where anybody technically can do it. Where's that balance?

No, it's a great question. Obviously it's not necessarily for everybody. An enthusiast could certainly go do this, but they're not going to have that responsibility; they're presumably not going to have access to your wiring closet to go do that. I think that's the beauty and the power: with great power comes great responsibility. That's why we have focused so much on the software side of things, the tools in the programming environment and the validation environment. P4 Studio and P4 Insight really allow you to look at what you
changed from one program to another, so that you know exactly what you changed and can validate it. You can even send sample test packets through it, get a little trace, and see: did you actually exercise that code? Did the test packet actually follow that path? So there are a lot of things there that really ensure and validate that these things are going to happen. And obviously the entire partner ecosystem, whether it's the white-box NOS vendors, is going to validate that it's configured correctly; it's going to go through their verification environment. The large OEMs that are shipping solutions do the same thing. Just as if they were to ship you a new hardware platform based on a new ASIC, they would go validate that all of those things are there. Certainly, in a playground environment you may see some interesting things happen, but that's still the frontier; not everybody's doing it.

Well, yeah, I figured we didn't want to start with the Edison idea of "I figured out 10,000 ways to not do it."

True, yes.

So I've actually got a follow-on question to that, because I'm curious where P4 comes in. You mentioned that if you're putting this onto a white-box, ONIE-enabled switch, and then you're going to pick your favorite NOS vendor to put on top of it, where does P4 come into that? Is the NOS vendor going to work with you to leverage P4 to put features into the OS and push them down into hardware? And is that available to the end user? Where does it fit in that lifecycle?

I think that's a great question. I think that's literally coming up very soon, and Arkady will touch upon it, because there are many different consumption models, which is what I think you're touching upon.

Yep. OK, so a quick note on the APIs, specifically SAI. We've been doing SAI for quite a while. We're
actually the first folks to contribute an early implementation and a Packet Test Framework test suite. So, going back to what I mentioned about a set of tests to validate whether you did things correctly, whether the switch is processing the packets correctly: pretty much since 2016 we've been supporting new SAI versions, supporting features that are specified in SAI, but also adding some stuff of our own, like data plane telemetry. Again, the SAI specification is not the only way to talk to the ASIC, but it's one of the ways that's becoming quite popular, and we're right there with it.

So how does this all come together? Who did we actually enable? Starting off with our programmable silicon, Tofino and Tofino 2, we worked with several different ODMs. On the disaggregated side, you'll see folks like Edgecore, Inventec, and Celestica; all of these have Tofino-based systems. But then you've got the network operating system or fabric solution that goes on top of that, and many of these will be familiar to you: things like SONiC and Stratum on the open source side (I'll talk about both of them), and also other vendors like Kaloom, IP Infusion, NoviFlow, and a new company called Stateless, which just launched last week with a specific fabric solution for colocation environments, built completely with P4 on our programmable silicon.

Now, on the OEM side, among the publicly known ones, Arista has a set of platforms based on Tofino, where they supply multiple different P4 programs that answer various use cases, things like high-scale NAT or high-scale tunneling. But they're also now allowing end users who want to do so to put in the P4 program themselves, while Arista takes care of things like optics and platform components so that you don't have to deal with them. Because if you've ever heard a presentation from the folks at ONL, one of the most annoying things is
dealing with optics. So they would take care of that, but the end user may put in their own entire P4 program and replace Arista's data plane. They demonstrated that at a P4 workshop a couple of months ago. It's similar with Cisco, where they have built a Tofino-based system. They put in several different P4 data planes to answer the use cases they feel need to be answered, while also allowing the customer, the end user, to do some of their own programmability. Something they have talked about, called the pipe, is exactly the solution for that.

Now, to your question about how these things actually come together and who's going to be doing the P4. In the case of vertically integrated systems and OEMs, things are pretty clear: typically the OEM vendor supplies the P4 program. As I mentioned, there are some scenarios now where they're opening things up and letting end users do it, but the most common case is that they supply the P4 program, the program gets loaded onto the chip, and their CLI or API presents a set of P4 programs, a set of use cases that the chip can answer.

And the OEM vendor in this case is the hardware manufacturer, or the NOS vendor, or both?

In the case of a vertically integrated system, it's both. Somebody like Cisco, they own the OS, their own NOS, and they own the system. But in the case of a disaggregated system (and by the way, there's some gray area in between now, because the OEM vendors are starting to do things like supporting SONiC, but let's take the traditional approach to the disaggregated system), things become a little more interesting. You have a NOS vendor and you have an ODM vendor, but the P4 can be developed by us, could be provided by the NOS vendor, or could actually be done by the end user, specifically in the case of an open source NOS. Because if you have the code, you can change the data plane, you can integrate the data plane with the control plane right away, and
do things to your heart's content. And of course, if it's a commercial NOS, then they would have to open things up, much like Arista has done, to be able to load an end-user data plane. Some of them are a little further along on that, and some are just starting, but that's a question for those vendors, whether they see the value in doing that. We certainly do. OK, does that answer your question?

Yeah, it does. Yeah, absolutely.

Good, excellent. So, to talk a little more about disaggregated network operating systems: hopefully most of you have heard of SONiC. SONiC has been getting more popular; there is more investment in SONiC from the community, from vendors, and from end users trying to add more features and make it more mature so that it gets more widely adopted. We've been working with SONiC for a couple of years now. Actually, one of the first things I started at Barefoot was driving our SONiC strategy and making sure that we're a full participant in the SONiC community. By now we have several folks who have deployed SONiC on our programmable silicon and a programmable data plane. But we didn't want to just do what everybody else does; we wanted to enable multiple different data planes to run on SONiC. These types of systems have traditionally been designed to work with fixed-function silicon, so they would only work with one data plane. We actually embedded multiple different data planes into SONiC, in exactly this way. This is an example with SONiC, where there's a file that you edit to select which data plane you're using, and when that is done, the right P4 program gets loaded onto the ASIC. This is with our upstream package in SONiC, so end users can do that. The two profiles that are there answer the two most common use cases: one is the top-of-rack switch that's doing data plane telemetry, and the other is the top-of-rack switch that's doing very high-scale
tunneling, VXLAN tunneling. And again, the OEMs and the commercial operating system vendors do something similar: they either have a file, or a CLI, or an API that allows the user to switch between the different P4 programs they supply.

Another operating system on the open source side that we've been working with, and which was actually announced last month (today is October 1st, so it's last month that it was open sourced), is something called Stratum. Stratum has been led by the Open Networking Foundation, with the initial seed code from Google. But Stratum is a little different. If you're familiar with things like SDN and OpenFlow, where you had a controller that held all of the control plane and a thin-stack OS on the switch, Stratum is the next generation of that. Instead of having the full control plane on the switch, like SONiC or IP Infusion, the control plane sits on the controller, and the controller uses a protocol called P4Runtime to push the P4 program down onto the switch. Now, in the case of programmable silicon, this means that P4 actually defines the data plane, and because we natively understand P4, we don't have to do a heck of a lot: we just take the program and load it, as long as it has been compiled and validated for our silicon, of course. We've demonstrated a couple of different data planes together with our ODMs and ONF, with a couple of user scenarios, and within the controller. ONF has a controller called ONOS; they embed these applications so that when they're selected, those data planes get pushed down. So that's Stratum.

Now, going to the benefits of P4-programmable switches: my colleagues will talk about a lot of this and the use cases associated with it, but what I want to focus on for the remainder of my portion is the benefit of reduced complexity, and how the software architecture actually enables that, with
several different use cases.

So let's take a customer example. We have a customer called UCloud. They're an up-and-coming cloud provider in China. Some of you may have heard of folks like Alibaba; UCloud is growing quite fast, and they'll be a household name pretty soon, if not already. The problem they set out to solve is that they had a set of gateways. As they were growing and consolidating operations into several data centers, they had a set of gateways that needed to do some sort of translation; in this case it was an IPv6 gateway doing IPv6-to-IPv4 translation. The networking products they were using, typical traditional networking products using fixed-function silicon, just weren't keeping up, because they weren't designed for this high-scale, cloud-environment purpose. What they ended up doing was taking everything to the server, which is great; we're Intel now, we love servers. But the customer also realized that they wanted to accelerate these workloads further. So what they've done is take these workloads from a DPDK-based server and move them back to the network, into silicon. They wrote their own P4 program, they took Open Network Linux, they put FRR on it, so they built their own control plane stack, and as a result they have a high-performance solution. It's simple because it's only doing that one function. It's not doing a bunch of other features that a typical top-of-rack switch or a spine switch or router would. They just focused on doing this function and doing it very well, and they also achieved much lower latency, lower power, and more control of their network as a result.

Another example of that is what we've done together with Microsoft and Arista, where we showed a top-of-rack switch doing high-scale VXLAN tunneling. We really focused on this bare-metal deployment, where you're not connecting virtualized servers in the
cloud, but you're connecting bare-metal workloads that need to talk to the virtualized servers. In those cases you need not just a couple of thousand tunnels but hundreds of thousands of tunnels. So this is one scenario that's similar, in this tunneling use case: high-scale tunneling and flexible tunneling.

The last use case I want to talk about is a network packet broker. Now, guess what: you can build a network packet broker with a fixed-function chip, without using P4. Many people do. But how good a solution will that be, and how clean? That's the question you have to ask yourself. And that's the reason why all of these folks, both vendors and end users, ended up doing their own packet broker. They saw that with a fully programmable chip, I don't have to take what's there, which was designed for a top-of-rack switch, and try to shoehorn it into a network packet broker use case. I just make it do only network packet broker functions. So my data plane is simplified, and I only have the APIs that are needed to program network packet broker functions, things like header stripping and packet slicing. Many of these actually cannot be done on a fixed-function chip, but even where they can be done, you could do them in a much simpler way.

How does this compare, from a cost and performance perspective, to something like the merchant silicon that Broadcom has out there? I'm just curious to see where you might position this. It seems like there's a spectrum. For a long time we've had merchant silicon on one side, which is very fixed in performance; you really can't add more capability to it once that chipset is defined, but it's low cost and easily accessible for switch vendors to create low-cost access-layer type devices. On the other end of the spectrum you've got the custom silicon, things like the Q5 chips
that are much more programmable. So I'm trying to figure out where this fits in that spectrum, and how competitive it would be from a pricing perspective compared to the merchant silicon that's out there today.

No, it's a great question. It's a common one that we get. I think you can always hear from one vendor versus another. I think right now what's fantastic is you can go out there and buy switches based on our technology and on the standard legacy fixed-function merchant silicon, and you can see, from power, performance, features, scalability, price, pretty much every dimension, that in many ways this is the future of merchant silicon. I mean, we are in many ways just another merchant provider, but now we have really shown that you're going to get everything that you got before plus what I like to call the power of programmability. Maybe it's a little bit of investment protection: it allows you to not have to go rip and replace that switch just because something new came along or you want to add something new. And what's interesting: I think in one of the following presentations you'll actually see a comparison we did, taking two data sheets from the exact same vendor, because sometimes one OEM versus another may not measure or compare things the exact same way. We did that versus the merchant one, and you can see that on every dimension it is better. So now you can actually go look at it. What we're trying to do, and our objective, is to make it a non-consideration. From a price perspective, from a cost perspective, from power efficiency, performance, every dimension out there, you can see that this is simply better than what people have associated with the low-cost approaches that are out there. It is the low-cost approach, but now with programmability.

To that point, how much does the flexibility weigh in for your customers? Because the use cases that I see you putting
out there, of large-scale NAT, which is a huge problem to solve; capacity of CGNAT at large scale, which is a massive problem to solve if you're in the 5G carrier world; packet scrubbing at scale, those kinds of things, and even just core packet forwarding: if I can take the same switch and use it instead of a custom appliance to do those things, how big of a win is that for your customers? Because that's where I see you being able to grow.

I mean, I think it's a fair question. Our current focus has really been on the hyperscalers, and for that customer group it's really been fundamental. You can't build a large-scale cloud if you can't NAT to every customer that's out there, and you can't do it for every top-of-rack switch and everything else if you need to do the scrubbing. So for them it's inherently technology that's just integrated into their base infrastructure. Maybe they're the ones that are a little bit more out there and a little bit more nimble, and you can see how quickly that cloud infrastructure ecosystem is really evolving. I think what you're going to have is that technology slowly coming to the rest of the standard enterprise data center network technology, and it'll start to become available. You've already started to see it from some of the OEMs, right? Arkadiy showed one of the slides from one of our OEM partners that is using this. They're already offering Layer 4 load balancing integrated into that same switch, and that's pretty much a basic feature for any large-scale cloud: you have to be able to distribute load equally over every single server in that rack. It goes hand in hand with scale-out, from that side of things. So certainly those are new. The one thing that I mentioned that we do feel is ubiquitous to everybody that is really pushing this is the need for visibility and instrumentation: that idea of INT, in-band telemetry, and having that in all of your infrastructure.
That is key, and that is the one thing today that everybody is really trying to get more of, and you need it in just that regular switch that you have today. So that's the compelling thing that is available today: you don't have to change your infrastructure, and you don't have to change where you're doing that or anything else. Hopefully that gives you a view of where we think it's going.
Info
Channel: Tech Field Day
Views: 1,386
Rating: 5 out of 5
Keywords: Tech Field Day, Networking Field Day, Networking Field Day 21, NFD21, Barefoot, Barefoot Networks, Tofino, Intel, Arkadiy Shapiro, P4
Id: y2LRYKsdsJ0
Length: 38min 33sec (2313 seconds)
Published: Wed Oct 02 2019