Help! My big expensive router is really expensive!

Captions
All right, welcome. What I want to do is frame up a problem. I've gotten up here a few times now and spoken about Netflix and what we've done over the past couple of years deploying infrastructure; what we haven't really talked about is how this takes us into the next decade, both from a technology-scale perspective and from a technology-cost perspective. Let me introduce our panelists who are going to speak about this. (Wait, this is an old revision of the slide, I think... no, okay, we're fine.) In the last version I removed Mark Berly from Arista; he was going to come in yesterday morning, but because of travel his flight was proactively canceled. I'm Dave Temkin, I work for Netflix. Up on the stage we have Craig Pierantozzi from Microsoft; the original version of the talk that was posted had Vijay Gill, yet another travel casualty, but we're happy to have Craig. We have Mr. Richard "Turk-bergen" or "Steen-bergen," depending on the poll results that will be announced later on, from GTT. We have Kevin Wollenweber from Cisco, and a last-minute addition: you just heard Russ speak, and he might have some good insight for the panel as well, kind of standing in for Mark. It's that kind of morning.

I want to be clear about one thing. There's a reason we have Cisco up here; we were originally going to have Arista up here, and I invited Juniper to speak as well. This is not an attack on vendors. We're not saying your stuff is too expensive, your margins are too high, how dare you charge as much as you do, and so on. This is a technology problem, and we think vendors can help us solve it one way or another. What we're really focused on here is figuring out the long-term answer, whatever buzzwords we want to apply to it. It is not in any way specific to any vendor, and I really do thank Cisco, and now Ericsson, for coming up here.

From a background perspective, we began the project where we installed our big expensive routers approximately two years ago. We've got roughly a hundred percent of our traffic, or just below that, on our own platform at this point, and we've rolled out about 18 terabits of network and server capacity around the world; here's a quick map of where we are. This is a scale drawing of one of my colos. You may have heard of the internet, and you may have heard of House of Cards, and then there's also the content that you'll complain to me about at some point today. There's not actually a lot of complexity here; in reality, that's what it looks like: a big expensive router in the middle of some transit, some peering, and some servers. There are multiples of these, and the scale is really just how many of them we deploy. That's where the cost driver is: power, space, and the actual hardware are what drive the cost and the complexity.

Since the last time I presented in front of the NANOG crowd, we introduced an even more dense platform for delivering bits, which means I can deliver nearly 40 gigabits of traffic from 1U of rack space. If you think about that, at roughly 150 watts apiece I can fill a rack at about 8 kW and end up with over a terabit out of a rack, which is pretty crazy. So the scale itself is unprecedented: one of these clusters needs to be able to deliver 640 routed 10GE ports.
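As a rough sanity check on the density figures quoted above, here is the arithmetic; the per-appliance numbers are the round figures from the talk, and the 40-slot rack is an assumption for illustration, not an actual deployment spec.

```python
# Back-of-the-envelope check of the rack density figures quoted in the talk.
# The per-server numbers (~40 Gbps and ~150 W per 1U appliance) are the round
# figures quoted on stage, not exact hardware specs; the 40-slot rack is assumed.

SERVER_GBPS = 40              # ~40 Gbps served from one 1U appliance
SERVER_WATTS = 150            # quoted power per appliance
RACK_UNITS = 40               # assumed usable 1U slots in a rack
RACK_POWER_BUDGET_W = 8000    # the ~8 kW rack quoted in the talk

servers_by_space = RACK_UNITS
servers_by_power = RACK_POWER_BUDGET_W // SERVER_WATTS
servers = min(servers_by_space, servers_by_power)

rack_tbps = servers * SERVER_GBPS / 1000
print(f"{servers} servers -> ~{rack_tbps:.1f} Tbps per rack")   # comfortably over 1 Tbps

# One delivery cluster is quoted at 640 routed 10GE ports of router capacity:
cluster_tbps = 640 * 10 / 1000
print(f"cluster router capacity: {cluster_tbps:.1f} Tbps")      # 6.4 Tbps of routed 10GE
```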
And this is, again, the big-expensive-router construct. It's not like I've come up with some awesome little cheap box where I can only put certain routes into the FIB, play games with policy, and live with an RSP that's way too slow to deal with updates in a meaningful amount of time. I want a performant network, and if you think about the current products on the market from the major vendors, 640 10-gig ports is at the upper limit of what any of them can fit into a single chassis before you start talking about chassis that are entire cabinets that then connect to each other. So this is suboptimal in numerous ways.

Routers are really expensive. We're talking about ten to fifteen thousand dollars per 10GE at list, and we can argue about who pays list for what, but the overall point is that if you take that price and multiply it by 640 ports, that's an awfully expensive rack. Of course, these boxes do lots and lots of things I don't need them to do. I don't need MPLS in my case; I don't really need Layer 2 VPN. There are certain use cases, and I've got an awesome network architecture team that can come up with all sorts of great ways to turn on knobs if we really wanted to do something cool, but in reality we don't need it. There are things they absolutely need to do: they need to route packets, they need to route IPv4, and sometimes they might need to route IPv6. I don't need to run these like a data center, either; I'm almost one-to-one. My server can put out almost 40 gigs, so I don't have the opportunity to build a big aggregation layer on top of that. My application is really smart; it can make a lot of decisions about how and where to serve traffic from, to the point where our application takes in a BGP feed today to make a lot of the decisions it makes. If I've got that, I don't really need all sorts of added intelligence at the edge.

Now, that server I just showed you, and our other servers we've shown off before: I can build 90% of that from off-the-shelf parts at Fry's. I can walk into a Fry's and buy a bunch of hard drives, a motherboard, and a few SATA controllers, and effectively build what we call an Open Connect Appliance. I can't do that with a router. It would be really impressive (someone should Photoshop a bunch of Linksys boxes into a 48U rack), but in reality I just can't do that with a router. So we recognized that, just from a cost perspective, you're never going to have true parity, nor are you ever going to follow the Moore's Law curve in routers the way you do in servers. On top of that, these boxes are getting harder and harder to cool. We've gone from a 5 kW rack being more than enough to deal with whatever router you could throw at it to the 10, 12, 15 kW range, and people are talking about water-cooled routers, which to me is insane. If any of you were in the data center track the other day, the question came up regarding water-cooled server racks and the added complexity there. Could you imagine if, any time you needed to do anything to your router, you had a chilled water loop running to it, and your data center had to be involved in every piece of router maintenance because of that loop? I think I'll pass on that.
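To put rough numbers on the list-price point above: the $10,000 to $15,000 per routed 10GE port is the ballpark quoted on stage, not any particular vendor's pricing, so treat this as illustrative.

```python
# Rough cost of routing one delivery cluster at the list prices quoted above.
# $10k-$15k per routed 10GE port is the ballpark from the talk, not a quote.

PORTS = 640
for per_port in (10_000, 15_000):
    total = PORTS * per_port
    print(f"${per_port:,}/port x {PORTS} ports = ${total:,}")
# -> roughly $6.4M to $9.6M of router list price in front of about a rack of servers
```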
So how do I bring this all together? If you think about a router, it's actually really terrible at making decisions about routing traffic; it's not really a good routing engine. It takes a fixed set of inputs, for things that change drastically over time, and applies the same calculation no matter what happens. That is not the way forward. We've added extensions on top, and we've started talking about segment routing; segment routing is a good concept, it helps us, it takes some complexity out of the network and lets us build maps of what the network should look like offline, and that's great. But we're still just putting a band-aid on what already exists; we're not really seeing the innovation at the edge we'd like to see. When you think about what a router is doing, none of what it looks at has anything to do with the actual performance of a path, which is what most of us, at least in the content world, are concerned with. So how do we get that performance information into these big shiny boxes, or how do we get away from the big shiny boxes entirely? If you think about it, my network is way more than 50% of my server cost. I don't need MPLS, and we carry Ethernet underneath. IPv6, anyone? I don't need Layer 3 VPN, and I don't even really think I need a full-scale FIB. So I'll pose that question, but I'm actually going to wait, let the other panelists present first, and then hopefully have an interactive discussion with the audience about it. I'm going to ask you not to use buzzwords like SDN, big data, the cloud, or big data in the cloud, so if you could avoid those in all of your responses, that would be fantastic. Thanks.

How's everyone doing? We're not allowed to use those words either, so I had to redo my slides, because they all said SDN and cloud and everything. I'm Craig Pierantozzi, director of network engineering at Microsoft, and a poor Vijay Gill substitute, but they did want me to channel Vijay a little bit and start off with: our COGS is too damn high. And it's all Kevin's fault. Kevin will be up here, and I don't envy his job at all; we're all telling him to build different things, strip this out, put this in, make it bigger and smaller and faster and slower, so his head is probably spinning by the time he actually gets a PRD over to his engineering group.

Okay, the Microsoft network. 8075 is the public-facing network, delivering bits to end users; the general function is to deliver services and terminate them closer and closer down to the consumer. Today that's pretty much what it looks like: a global network, n-by-10G, and we are moving to 100G. The public-facing network is pushing probably about 5 T right now. It is a feature-rich network: an IP/MPLS network with multiple MPLS meshes on top of it, a best-effort mesh and a priority mesh; we're using auto-bandwidth and forwarding adjacencies. It's a pretty complex network when you actually dig into the configuration itself. One of the interesting things about it is that each of those dots obviously represents a market or a city, but within each of those markets we can have as many as five, six, sometimes even more facilities in that market or region. So the complexity from a scaling perspective is not only the wide area but also intra-market: we're delivering terabits of capacity between facilities, whether it's inter-campus or between data centers, and those can be anywhere from across a parking lot or across the street from one another to on the other side of a city or a market.
Some of our larger properties' traffic profiles, especially on the cloud services (sorry, I said "cloud")... we get requirements from our product and services teams that say machine-to-machine can't be more than 1.8 milliseconds, or machine-to-machine can't be more than 0.7 milliseconds, and that's how we're designing each of these markets as we scale different generations of data centers into them. We do have another network, 8074; it's private, and it's our inter-data-center fabric, our inter-data-center network. That's just big fat pipes between the major data centers for things like moving around the Bing search corpus, BCDR data replication, Exchange Online and SharePoint moving around big chunks of data, mailbox replication, things like that.

Okay, so the topology inside the data centers themselves is actually a three-stage folded Clos, a fat tree, and the advantage we get from this is horizontal scaling. A lot of the applications demand non-blocking east-west bandwidth, because a lot of the application components are spread out among a large number of servers; not only that, but, as I was saying before, in a particular market the components of an application might be distributed among locations, so the machine-to-machine traffic might be talking across facilities within a campus, a region, or a market. We're able to do this with dense commodity hardware: we're trying to build larger networks out of smaller components, simplify, reduce variability in the network, get the economies of scale and the stat-muxing that the Clos matrix and the aggregation give us with the oversubscription, and thereby reduce costs.

From a requirements perspective, one of the complexities we have is over two hundred actual online services using the network and the data centers, so from an application standpoint we have to take into account a pretty large set of requirements: the social media properties; Bing, with the web index and moving around the search corpus; targeted advertising; Bing imagery, so moving the maps and imagery around; the cloud computing stuff, both the public and private Azure cloud services; and CDN. That gives us a lot of elastic compute and storage requirements and traffic profiles to deal with, and then of course there are the real-time analytics going on at all times. So, from a requirements perspective, there's low-latency computing that leverages distributed memory across the nodes. In general we see a lot of ephemeral traffic flows across the network and a heavily east-west traffic profile, and that's what's really driving the need for the large bisectional bandwidth we have to deal with.
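To make the folded-Clos scaling argument concrete, here is a small sketch of how much server-facing capacity you get by building a bigger fabric out of small commodity switches; the 64-port radix, 10GE server ports, and oversubscription ratios are illustrative assumptions, not Microsoft's actual design.

```python
# Sketch of a three-stage folded Clos (leaf/spine) built from commodity switches.
# Port counts and oversubscription are illustrative, not Microsoft's actual design.

def clos_capacity(ports_per_switch: int, oversubscription: float = 1.0):
    """Folded three-stage Clos: each leaf splits its ports between servers (down)
    and spine uplinks (up); at 1:1 oversubscription the fabric is non-blocking."""
    down = int(ports_per_switch / (1 + 1 / oversubscription))  # server-facing ports per leaf
    up = ports_per_switch - down                                # spine-facing ports per leaf
    spines = up                                                 # one uplink from each leaf to each spine
    leaves = ports_per_switch                                   # each spine port feeds one leaf
    server_ports = leaves * down
    return leaves, spines, server_ports

for oversub in (1.0, 3.0):
    leaves, spines, server_ports = clos_capacity(64, oversub)
    gbps = server_ports * 10  # assume 10GE server ports
    print(f"{oversub}:1 oversub -> {leaves} leaves, {spines} spines, "
          f"{server_ports} x 10GE server ports (~{gbps/1000:.1f} Tbps server-facing)")
```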
Okay, so what are we seeing in the trends? I think Kevin is going to come up here and, yeah, he almost has the same graph we do, which I guess is a good thing, that we actually see the same thing. When we map the costs of the network inside our data centers, we're seeing those costs shift over time from the silicon to the optics and the cabling infrastructure we have to put together inside and between all of these facilities. The switching costs themselves, in the fabric layer, continue to decline with silicon economics, but we're really not seeing that reflected, or at least the curve being as steep, from a unit-cost-compression standpoint with large, feature-rich core routing nodes. And, like Dave said, the power and cooling on these large network elements is a huge problem we're facing: even as the power per bit goes down, stacking chips on boards and building larger and larger routing nodes means the total power is going up, and when you get past 10 or 12 kW it really causes a lot of challenges, not only in our own facilities but in the third-party facilities where we're trying to drop these nodes.

So what are some of the things we're thinking about as we deal with these challenges, not only on the wide area, moving bits closer and closer to the end users and consumers through geo-expansion, but inside the data center and between the data centers? First, we're looking at any optimizations around service resiliency itself: decoupling service availability from network availability, and from engineering availability, essentially from the get-go. If you start with the premise that equipment failure is a constant operating condition, and you look at it as a software opportunity, you can really start to do some interesting things with migrating workloads around under failure and maintenance conditions. Abstract the service from the hardware and the protocol stack, and abstract it from any manual processes you might have: your NOC sitting there waiting for an alert, a red flashing light goes off, now someone has to go fix something. That's adding time to detect, time to mitigate, time to restore, which from a user-experience and service-availability perspective is not where you want to be.

The interesting thing about service resiliency in the software, decoupled from network resiliency, is that I then get to tell Kevin and others that I can actually start stripping down the requirements list. I don't need it to boot really, really fast; I don't need things like graceful restart when the node goes offline, because my workloads migrated and traffic engineering is happening at the service level. You can boot in 10 minutes, 12 minutes, partial boot, whatever you want. It really simplifies what Kevin and the vendors have to do and think about when they're engineering a box. The trends keep swinging back and forth in terms of what we want and what we're utilizing in the networks; even a couple of years ago people wanted boxes to boot as fast as possible, please get it back in 30 seconds, and that puts a lot of constraints and pressure on a box; a 5, 6, 8, 10 T router spins all its fans up at the same time and uses up all that peak power.

The second thing, when we're looking at scale and lowering cost, is that if I can actually do the first one, I can then do the second, which is increasing scale at lower cost through cheap label switching.
I also need lower-cost integrated WDM solutions, not only for the wide area, where the integration can help with unit cost on the optical or packet-optical side, but also in the metros and campus environments, where at the border leaf layer of my network I'm connecting at huge speeds and huge capacities. I really want that type of low-cost integrated DWDM solution to be on practically the same depreciation curve as servers, so I can crop-rotate not only my servers and my colos but also my WAN and LAN connectivity.

Where we're heading is a future where there are far fewer, if not no, protocols actually running on these network elements, and all of the path computation is done offline, either in some type of path computation engine or even in a services engine, if you will, and the FIBs are programmed down into the network. Then you can just dumb down your entire network: you have the same consistent forwarding paradigm through the whole network with some type of quote-unquote label switching. I'm not saying it has to be MPLS or this or that; it just needs to be a cheap label-switching solution. And then lower power consumption: we talked about power a little bit, and I was talking to somebody in the back; 10 to 12 kW seems to be the sweet spot, so from our perspective we'll actually trade off density and capacity for lower power consumption and then use these smaller widgets to scale the network horizontally. That also gives us hardware ubiquity, essentially: I can use similar platforms and the same chipsets across all layers of the network. So what we're asking for is smaller, stupider, and cheaper, but not too small and not too stupid. I think that's all I have. Thanks.
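A minimal sketch of the "no protocols on the box, compute paths offline, program labels down" model described above, assuming a made-up controller that runs plain Dijkstra over a topology snapshot and emits a per-path label stack; the topology, link costs, and label numbering are entirely hypothetical.

```python
import heapq

# Toy offline path-computation engine in the spirit of the model above: the boxes
# run no IGP/RSVP; a controller computes paths over a topology snapshot and
# programs a label (or segment) stack down to the ingress node.
# Topology, link costs, and per-node labels below are entirely hypothetical.

TOPOLOGY = {                     # node -> {neighbor: link_cost}
    "sea": {"chi": 20, "sjc": 8},
    "sjc": {"sea": 8, "dal": 15, "chi": 25},
    "chi": {"sea": 20, "sjc": 25, "nyc": 9, "dal": 10},
    "dal": {"sjc": 15, "chi": 10, "nyc": 14},
    "nyc": {"chi": 9, "dal": 14},
}
NODE_LABEL = {n: 16000 + i for i, n in enumerate(sorted(TOPOLOGY))}  # per-node labels

def shortest_path(topo, src, dst):
    """Plain Dijkstra over the snapshot; returns the node sequence src..dst."""
    dist, prev, seen = {src: 0}, {}, set()
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, cost in topo[u].items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

path = shortest_path(TOPOLOGY, "sea", "nyc")
stack = [NODE_LABEL[n] for n in path[1:]]          # one label per hop past ingress
print(" -> ".join(path))                            # sea -> chi -> nyc
print("label stack programmed at ingress:", stack)
```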
Thanks. So what's up next... Kevin, go, and then we'll go into Q&A. Oh wait, no, sorry, Richard first. My bad.

Hello, I am Richard Steenbergen, from tw telecom... no, just kidding, GTT, and I'm here to talk about how to make big expensive routers less expensive, but probably still big. The reality is that Ethernet can be really freaking cheap if you want it to be. There's a whole bunch of simple hardware out there that's really, really cheap, and the simplest example is all of the Broadcom Trident-driven boxes that are out there. Essentially every vendor on the planet is making one of these, rebadging them, selling them with their own little twist, but it's essentially the same hardware and the same design: the Juniper QFX, the Cisco Nexus 3Ks, the Force10s, some newer Arista boxes, HP, Alcatel-Lucent, IBM, and literally dozens and hundreds more, including every random Chinese vendor known to man. The thing all of these have in common is a very simple Broadcom chip that now delivers massive amounts of bandwidth for ridiculously little money.

So the question becomes: why is my expensive router still expensive? Why does my MX960 blade cost a couple of orders of magnitude more than my 1U HP switch? To be fair, it's not all that your router vendors are trying to bilk you; a lot of these are very complex technical problems, and especially when you start building these very large chassis and designing in all of the failure-mode protections and redundancies, you end up with something that's harder to keep stable and very complex to engineer. So in fairness, these are very complex technical problems. But then again, sometimes the technical problems shouldn't have needed to exist in the first place, and sometimes we're beating our heads against problems that don't necessarily need to exist. The answer to why you still pay for your big expensive router is that you probably want it to do some fancy things your small cheap router can't do; the real question is whether that's because the box can't, or simply because it doesn't.

Here's an example of something that's missing in the product space: the case of core MPLS switching. There are a lot of networks out there that would benefit greatly from a dedicated core that really was dumb: it doesn't need to carry a big FIB, it doesn't need to do deep packet processing or anything; all it needs to do is MPLS switching. In terms of hardware that's actually ridiculously easy; in fact, that was one of the reasons MPLS was designed in the first place. You're doing simple exact-match lookups, you have very little state, there are very simple headers to parse, you need a very small FIB; there are all kinds of reasons why the hardware should be able to do it better, and in fact a lot of the commodity hardware out there today actually can do it. So the question is: where's my cheap, MPLS-only core platform? If you look at a platform like Juniper's (since they're not here to defend themselves), the PTX is really barely any better than the MX in terms of density or price; it's not an order-of-magnitude savings, not anything that really motivates you. So why do I need to buy this million-dollar core box if I could do something like that with a thousand-dollar cheap box, and the hardware can support it?

The answer, in this particular case, is that all of these other cheap boxes don't have the software. It turns out that doing MPLS is really simple in hardware and actually pretty complex in software. Think about all the work that goes into signaling, bandwidth reservations, fast reroute, all these different mechanisms; really only the incumbent router vendors actually understand it, and that's because they're the ones that wrote it. You see other vendors try to come along (Brocade and the old Foundry boxes, and others), and they do it to some extent, but not to the full carrier-grade suite of a Cisco or a Juniper, who actually wrote the protocol. That's really the reason you don't see that type of box in this space. And why would Cisco or Juniper make a cheap, dense, 1U MPLS core box with a 64-by-10-gig solution? They would just be cannibalizing their own carrier business, and there's no competition in the market. Like Dave said, we're not here to bash these guys, but we are here to look at where there are areas for improved competition, where there are ways to solve the problem in non-traditional ways instead of just throwing more power, rack space, and cooling at it.
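Backing up to the point that the forwarding half of a dumb MPLS core is "ridiculously easy" in hardware, here is a toy label-swap lookup; the labels, next hops, and interface names are made up, and all the hard software Richard describes (signaling, reservations, fast reroute) is deliberately absent.

```python
# Toy MPLS label-swap path: the data-plane half of a "dumb core" really is just
# an exact-match table. Labels, next hops, and interface names are made up.

LFIB = {
    # in_label: (out_label, out_interface)   -> "swap" entries
    100: (201, "et-0/0/1"),
    101: (305, "et-0/0/2"),
    # out_label None means pop (penultimate-hop style) and forward the payload
    102: (None, "et-0/0/3"),
}

def forward(in_label: int, payload: bytes):
    out_label, out_if = LFIB[in_label]          # single exact-match lookup
    if out_label is None:
        return out_if, payload                  # pop: hand the IP packet onward
    return out_if, out_label.to_bytes(3, "big") + payload  # swap: rewrite label

print(forward(100, b"ip-packet"))   # ('et-0/0/1', b'\x00\x00\xc9ip-packet')
print(forward(102, b"ip-packet"))   # ('et-0/0/3', b'ip-packet')
```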
So: SDN, big data, cloud... sorry, to the cloud. Before we fire up the hype machine, what actually is SDN, software-defined networking? It's one of those things where no one actually has a real grasp of what they're talking about, or claiming to be talking about. The reality is that I've been using software to define my network for quite a while, and if you haven't, you've probably got a lot of unmanaged switches and probably not a very interesting network. So is SDN just another fad, or, as Avi Freedman would say, a Funding Augmentation Device, which in a lot of cases it seems to be, or is it actually something that could accomplish something?

But first, I think you want to talk about what you actually spent your money on when you bought your big expensive router. The reality is there is no big expensive router. If you have hardware that can forward all of these packets, and do it so cheaply and in such a commodity way, and all of the features are there, then what you're actually buying when you buy these big expensive routers is software. It turns out software is actually pretty hard: all the complexity that goes into all of the routing protocols (and it's not just one simple routing protocol; on some networks it's all of them), all of the design that goes into routing protocols built to scale to thousands and tens of thousands of nodes and not fall over, all the work that goes into producing a CLI that works, not just one that doesn't crash (some people have problems with that) but one that gives you the data you want as a network operator, all the years of experience in developing that, and all the network management platforms. And then just think about the features: have you ever read the release notes from an IOS or Junos update? Look at all the features in there that someone paid good money for. That's really what you're buying when you buy the big expensive router. The hardware: think of it more as a delivery vehicle, so you don't feel so bad that you just spent millions of dollars and got nothing tangible. It gives you something to hold, but it's not what you're actually buying. Or, to use the example from Atlanta, it's like grits: a delivery vehicle for butter and salt.

So why do we actually care about SDN? It turns out that some people are good at some things and not so good at other things. Try not to act shocked, but if you look at the commodity silicon manufacturers out there, it's interesting that we've hit a state where people can produce ASICs for literally hundreds of dollars that can do hundreds of millions of packet lookups per second, but they can't write a good CLI, they can't write a routing protocol, and they can't even make the box stay up without crashing. Meanwhile, you have a different set of people with a different set of expertise: people who know how to write software and routing protocols but couldn't fab an ASIC to save their lives. The interesting thing about the incumbent router vendors that most people buy from is that those are the people who have managed to master both; they've hired the right people, acquired the right people, and made sure they offer both, and that's what you're paying for: a reliable router that gives you the features you want. So SDN, to me, isn't about centralizing the intelligence or any one particular method of implementing it; it's really about the threat, the threat of getting these two groups together in a way that isn't under the incumbent router vendor. Is there an actual SDN product out there that will revolutionize anything? If it exists, I haven't seen it. People throw SDN around a lot, and I always take it with an extreme grain of caution, but I want people to see that as a concept there's actually a lot of merit, because what you're really trying to achieve is that merging of the two groups.
You're trying to break down the wall between the people who make the hardware and the people who make the software, and come up with a reliable product that competes in a way that's outside the traditional space. So maybe someday soon you'll actually get a slightly less expensive router; probably still big.

All right, I'll just jump right in. My name is Kevin Wollenweber; I'm the director of product management for the high-end routing products at Cisco, so I've been building the big expensive routers that everybody's complaining about for the last 15 years. I'm not going to try to justify why they're expensive or why they're big; there are a lot of technology reasons that I'll get into. What I wanted to talk about is some of the challenges we see in the high-end routing space and some of the ways we're attacking them and trying to make products cheaper. One thing: we actually are going to build a commodity 1RU XR-based switch as an aggregation device, so those worlds are merging. And I really liked your presentation, because one of the things in there that I find really interesting is that people would feel better if they were paying a million dollars for the software and the hardware were free, because that is where the bulk of our engineers and the bulk of our investment is. There's a little bit of fudging between hardware and software resources, but yes, there's a lot of investment in software, and for some reason people are comfortable paying for hardware but not for software, even though the real resource and the real intelligence in those devices is in the software. What I'm going to talk about today, though, is all hardware.

I'm a product manager; I've been doing this for 17 years. One of the biggest problems we have is looking around this room, taking input from every single person in it, consolidating it down to a common set of features that makes sense to put into a routing device, and delivering that to the market. You're probably mostly, if not all, engineers, and engineers are really smart; at least my engineers can build anything. If I tell them to build a box that looks like this, that's this big, that takes this much power, they can build it. The challenge I have is taking all these requests, consolidating them, mixing that with the technology pieces I have to work with, and delivering something to market. So that's one thing you can help with, and the presentations from Dave and from Craig and everybody else help me a lot, in that I understand what the requirements are.

One of the things we have to look at is technology optimization points. The products you're deploying today are the products we defined four years ago; the products you're going to deploy tomorrow are the products we defined two years ago; and the products we're defining today are the products you're going to be deploying in 2015 or 2016. So meetings like this (you bashing me after this meeting while I'm stuck here in the snow for two days) are where we get these requirements and where we define the products that are going to be delivered over the next couple of years. I agree that cores are leaning out. I actually don't like the definitions of core and edge and aggregation and peering that we came up with twenty-five years ago when we started building routers; these are all shifting, the architectures are shifting,
and the needs of those devices are changing. So good, solid input from you, and me actually taking that down and working it into product requirements, makes a lot of sense. But the technology optimization point is actually slightly different: the boxes I build today, I'm optimizing to leave in the network for 15 years and to take three or four generations of fabric upgrades. So I'm planning for 28-gig and even 50-gig PAM-4 SerDes technologies that aren't even available yet and will be delivered three or four years from now; that's built into the devices you're deploying today. If we're in a model where, like Craig was talking about, you're going to put something into the network and rip it out three years later because something newer, bigger, and better is there, I can define the box differently and I can design the box differently. I won't design it for an enormous amount of cooling headroom and the ability to take multi-terabit Ethernet over time; I'll design it around the 12.5-gig SerDes that's exactly what I'm deploying today, knowing that three years later I'm going to have a better and more efficient way to deliver that, maybe in a slightly different form factor that's not backwards compatible with what we have. So that's part of getting product requirements in and choosing technology optimization points.

Power and cooling is one that, ten years ago, when we were working on GSRs and early CRSes, we didn't really worry about as much; now it's top of mind for me and something I'm focusing a lot on. I'll talk more about it in later slides, but power and cooling is about more than just 10 kilowatts or 12 kilowatts; it's what you're going to do with these devices and what environments they're going into. Going back to requirements: the boxes were designed to go into NEBS-type environments, survive a chiller failure, and run at 50 to 55 C. Those are requirements, and if that changes, if I can design a box that never has to operate above 40 C (or that doesn't guarantee operation above that, or shuts down), I can design the box differently. I'll talk about that a little later when I get to the power and cooling pieces.

ASICs are interesting. I actually wish Mark were here, because I know Mark was going to talk about Moore's law a bit, so I ripped all that out of my presentation. Moore's law is cool, everybody understands it: we can build bigger, faster ASICs, we get double the number of transistors. But what most people don't realize is that Moore's law has been broken for the last ten years in the routing space. Yes, we get more transistors; yes, we build denser chips; but we don't get twice the number of transistors at half the power anymore. So when I deliver a 100-gig ASIC, and then a 200-gig ASIC, and then a 400-gig ASIC, the cost per bit is lower and the power per bit is lower, but the aggregate power of those chips and of these devices is slowly creeping up. To use round numbers: if a five-terabit box is ten kilowatts, the ten-terabit box is probably going to be 12 or 13. So power per bit is going down, cost is going down, efficiency is going up, all good things, but the total keeps creeping up.
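Working through Kevin's round numbers shows both trends at once; these are the illustrative figures from the talk, not product specifications.

```python
# Kevin's round numbers: power per bit falls while total power still rises.
# These are the illustrative figures from the talk, not product specs.

generations = [
    ("5 Tbps box", 5_000, 10_000),    # (name, capacity in Gbps, power in W)
    ("10 Tbps box", 10_000, 12_500),  # "probably 12 or 13 kW" -> call it 12.5
]
for name, gbps, watts in generations:
    print(f"{name}: {watts/1000:.1f} kW total, {watts/gbps:.2f} W per Gbps")
# 5 Tbps box:  10.0 kW total, 2.00 W per Gbps
# 10 Tbps box: 12.5 kW total, 1.25 W per Gbps  -> ~40% better per bit,
#                                                  but 25% more heat in the rack
```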
So one of the things I've been wondering about, and we were talking about it a little last night, is whether it's better for me to define a power envelope, where you say "I'm going to deploy this in a 12-to-15-kilowatt power envelope" and I cram as much into that envelope as I can, versus what we do today, where I ask what I can possibly cram into a rack, and that may turn into a 15-kilowatt rack or a 20-kilowatt rack or a 40-kilowatt rack or wherever we're going. I think the design methodology has to change, but that's based on input from the architects building the actual networks.

When it comes to the ASICs, it's funny: the pendulum comment in the previous presentation really resonated with me, because that's what we go through. Go back to the GSR (and I speak historically because that's where I came from): we had core engines and we had edge engines, and the core engines were bigger and faster and cheaper than the edge engines; you had almost four times the capacity at roughly the same cost when you were doing core features and functions versus edge functions. Then ASICs got going, we got some really big, fast ASICs, and the pendulum swung the other direction: everybody said "I want a box that does everything," and that's what we designed in the previous-generation and even current-generation platforms. It can do core, or edge, or peering; maybe there are some software licenses and some things we can trade off, but they're generally about the same size, power, and density. Now we're shifting back toward a model of, I think, needing very dense (I won't call it core, but) IP-transport-type devices that may have limited FIB, limited features, limited functions.

The things listed here are some of the trade-offs we're looking at in our ASIC families. Bandwidth versus PPS: I can build really big-bandwidth systems, but if you want them all to run at 64 bytes, that's either going to constrain the bandwidth or force me to build a bigger chip. FIB scale: that's really easy if I can fit the FIB on-chip; as soon as I go off-chip, say beyond 256,000 entries, there's not a whole lot of difference on that board between 256,001 and eight million FIB entries; once I put the external memory interfaces and the external memory in, it's basically a fixed cost on that device. Things like queuing I'll talk about later, but we have already started to strip queuing out of the core-focused devices; there's no need for 256,000 queues on a 100GE that goes from point A to point B and has eight queues on it, and you'll see more of that in our next-gen offerings. And then buffering: we were talking about what you need in a cheap commodity device versus a larger device, and one of the reasons the small, cheap commodity device can be cheap is that some of the other functions are done elsewhere. One of the conversations I want to have, one thing I want to understand, is what buffering buys you. It probably doesn't have to be 80 milliseconds of RTT anymore on a big 8-terabit router, but maybe there's some medium between the minuscule nanoseconds of buffering you have in a ToR switch and what you need on the routing side, and that has direct implications for fabrics and SerDes.
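To see why the buffering question matters so much for cost, here is the classic bandwidth-delay sizing arithmetic at the capacities being discussed; the 80 ms and 8 Tbps figures are the ones quoted above, and the ToR line is just an illustrative comparison point.

```python
# Why "how much buffer?" is a big hardware question: buffer = bandwidth x delay.
# The 80 ms / 8 Tbps figures are the ones quoted in the talk; the ToR line is
# just an illustrative comparison point.

def buffer_bytes(gbps: float, delay_s: float) -> float:
    return gbps * 1e9 * delay_s / 8   # bits/s * s -> bytes

cases = [
    ("8 Tbps router, 80 ms RTT rule of thumb", 8_000, 80e-3),
    ("8 Tbps router, 5 ms",                    8_000, 5e-3),
    ("Shallow-buffer ToR, ~10 us",                640, 10e-6),
]
for name, gbps, delay in cases:
    gb = buffer_bytes(gbps, delay) / 1e9
    print(f"{name}: ~{gb:.3f} GB of packet buffer")
# 80 ms at 8 Tbps is ~80 GB of expensive, power-hungry, off-chip buffer memory,
# versus well under a gigabyte for a shallow-buffered ToR.
```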
And then the last thing, the one that nobody wants to talk about but that I was glad to see in Craig's presentation, is the optics side of the equation. I could drive the layer-3 portion effectively to zero, zero cost in layer 3, and you wouldn't see a huge and dramatic drop from where we are today, because the size and power are being driven by the optics. So a lot of the investments we're making today are in the non-sexy things; it's not the bigger, badder ASIC we keep talking about, it's the power, the cooling, and the optics technologies, driving smaller, lower-power optics so you can build these 8-terabit, 10-terabit, 50-terabit systems, have enough plug holes on the front, and actually drive the optics off the front of these devices. The electronics are in the ASIC domain, the silicon domain, where we can get Moore's law, leverage it, and drive that curve down; optics take step functions down, but they're inherently different technologies. So a lot of the investments we're making in silicon photonics (I'll show you a slide later) allow us to take some of those optical components into the silicon world and leverage that three-hundred-billion-dollar silicon industry to drive Moore's law, drive the optical components to better sizes, and allow us to build these dense devices.

A couple more slides on the hot topics and where we're going, and then I'll take all the tomatoes and knives from the back and everything else. Silicon keeps growing; I kind of lied to you before, Moore's law does help us. We have a 200-gig NPU now, we're building a 400-gig NPU, and we can continue to see that scaling to 14 nanometer and 10 nanometer and beyond. But it's creating this paradigm where I can build this chip, pack a bunch of stuff into it, and the chip is so low-power for its density that I stack a bunch of those chips onto a board. So the unit power is decreasing, the power per bit is decreasing, but the total power of these devices is continuing to go up, because everybody says "I want a smaller router, I want lower power, I want lower cost, but actually I still kind of want it to be really big and really dense," and what we do is pack a bunch of silicon onto that device. The current shipping product that I manage, called the NCS, has a terabit line card; it's an 8-terabit system. I could have built a four-terabit system at half the power, but instead I wanted to cram as much capacity on there as possible, because we're in this bandwidth war between my friends at Juniper and Alcatel and everybody else; we're building the biggest, densest routers we can build, because that's the direction we've been given by the field and by the customers. If that changes, we're more than happy to oblige and move things around, but that's one of the constraints we're working within.

This next one is interesting to me because it really focuses on the environmental part of things. This is not actually a router problem; this is a noise, power, cooling, and colo problem. This is a router (it's actually irrelevant which router it is), and it shows what the router does at 25 C and what it does at 50 C. This goes back to the comment I made before: if I need to design a device that goes up to 50 C, handles every chiller failure possible, and runs through a nuclear war, the chart on the left shows the power draw of the line cards and the fan trays and the switch fabrics and power supplies
and route processors and everything else, and you see that the bulk of the power in these devices is the silicon, the line cards; it's that 78% number there. That's actually where we want it, because then my focus can be on optimizing the power of my ASIC: I can build these leaner architectures we've been talking about, shrink the power of the ASIC, and either build a 10-kilowatt box that's twice as dense or build a lower-power device. The problem I have is what happens when we move over to the 50 C case: the fan trays in the first example drew about seven percent, and the fan trays in the second example drew about eighteen percent. This actually gets significantly worse depending on the architecture of the device itself; I've seen some devices where upwards of twenty-five to thirty percent of the power draw of the device is just spinning the fans faster to cool that device. If we can solve that problem, then again we can either build lower-power devices, or, in that same power envelope, we can build really dense devices, and that's something we've been trying to solve for a while now.

And then this is just a historical view of that, looking back at the CRS-1 (and remember there are now three generations of CRS: the CRS-1, the CRS-3, and the CRS-X). This is the box that actually shipped in 2004, and you can see that only half the power was used to drive the line cards; the rest was cooling the device (and this is at 50 C) and the power efficiency of the power supplies and the fabrics. We've done a lot to improve this in the current-generation devices, still looking at the 50 C numbers: we've gone up about twelve and a half times in capacity over the last ten years, and now over 70% of that power is driving the line cards. The cooling as a percentage is roughly the same, but we've been able to use higher-efficiency power supplies and shrink the fabric piece down, so we're optimizing the pieces we can. This also gives you a good sense that it doesn't make sense for me to build an RP with a smaller CPU or less memory, because it really doesn't affect the overall number; the focus really has to be on cooling the device and the line cards themselves.
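A quick illustration of why that fan number matters when the power envelope is fixed; the 7% and 18% fan shares are the figures quoted above, and the 10 kW envelope is an assumed budget for illustration.

```python
# If the rack/power envelope is fixed, every watt of fan overhead is a watt of
# forwarding capacity you don't get. The 7% vs 18% fan shares are the figures
# quoted above; the 10 kW envelope is an assumed budget for illustration.

ENVELOPE_W = 10_000
for label, fan_share in [("25C ambient", 0.07), ("50C ambient", 0.18)]:
    fans = ENVELOPE_W * fan_share
    left_for_linecards = ENVELOPE_W - fans
    print(f"{label}: {fans:.0f} W to fans, {left_for_linecards:.0f} W left for line cards")
# 25C: 700 W to fans, 9300 W for line cards
# 50C: 1800 W to fans, 8200 W for line cards  -> ~12% less capacity per chassis
```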
This next one is more of a marketing slide, and I apologize for bringing it in, but the point is that what we're starting to build now on the ASIC front, the stuff we started defining two years ago, are families of ASICs. So although we call this a chip, it's actually a family of network processors: we have two versions of this 200-gig chip, one that has 256,000 queues and does full H-QoS and edge queuing and everything else, and one that has 8,000. We saw this trend three or four years ago when we started defining these chips; we wanted to build leaner silicon for the core. It's definitely not as lean as what you've been talking about and what we see happening over the next couple of years, but we pulled out things like some of the queuing structures. So you're starting to see these families of chips where we can do lower-functionality, core-focused functions and higher-functionality, edge-focused functions, but with a common packet-processing engine in the middle, so I can write the code once, get commonality and consistency across aggregation, edge, core, peering, and everything else, but deploy the right set of technology in the right place in the network. You'll see this evolve over the next couple of years in the chipsets we're defining and building today, which you'll see in products in the next two or three years: you'll see leaner variants, you'll see more full-featured variants, you'll see label-switching variants with very small internal FIBs, but all leveraging some common internal building blocks. That's what we're using Moore's law for: not necessarily just to build bigger, badder chips, but to integrate more things into these devices.

And then I wanted to finish on the optics piece, and then we'll obviously take any questions you have. This slide is from some of the folks building our silicon photonics, and there's zero correlation to real numbers; it's really about trends. We understand what's happening to ICs and electronics under Moore's law: every 18 months we get a new silicon process and we can drop the cost per gig of the technology down. Optics are on a completely different slope, so regardless of where you put that line, the slopes are different and the optics are never going to catch up. The reason we acquired Lightwire, and the reason we're investing in these technologies, is that we see optics as tens of years behind electronics. By taking some of those discrete optical components that are manually put onto boards with coax connectors attached, and dropping them into the silicon die itself, I can shrink the size, I can shrink the power, and I can build much, much denser devices. These are the technologies that are going to enable us to do pluggable coherent DWDM, or shrink the optics package down to the size of an SFP+ at a hundred gig, at very low cost and power. So we're investing heavily in these things, but it's going to drive more than just the optics technologies themselves.

This next view is actually something we're focused on for the next couple of years, and I apologize for all the slide builds, but we think we can integrate optics technologies into the ASIC packages themselves, so that instead of electrical SerDes connecting devices, things move into the optical domain. For those of you following what's going on with SerDes
in these devices: we have on the order of 10-gig SerDes today (12.5 on some of our backplane links, but call it 10 gig), and the GSR in 1997 had 1.25-gig SerDes. So we've gone up about 10x in SerDes speed, but the devices have increased hundreds of times in capacity, so things like the SerDes are starting to become the bottleneck. One of the things we can do is move from the electrical domain to the optical domain. And everybody wonders: okay, now that I have electrical and optical and can connect them together, are my current-generation routing devices going to change? One of the things Dave asked me to talk about was how to make sure you're investing in a technology that's going to be around for the next 10 years. We can utilize these electrical-optical integrations for more than just replacing backplanes or replacing optics, so I wanted to end on a couple of things we're focused on, just to show how these technologies can integrate into even your current-generation devices.

This is what we call a slice in our data path. The one-terabit card I was referring to before has five of these slices: some type of NPU and some type of fabric interface make up a slice on a line card. Today it's a 200-gig slice, and we build these one-terabit cards by putting five of these slices down on the board. The first and easiest thing we're looking at now is just pure silicon integration: move to the next ASIC node, take that NPU, take that fabric interface, lump them together. That's going to allow us to build much lower-power and lower-cost devices; it's already underway, and you'll start to see it as almost a system-on-a-chip, a router-on-a-chip, in these next-gen designs. But everything here is still all electrical. As we start to look at electrical-optical integration, there's a bunch of different things we can do. I've heard complaints from probably half the people in this room about optics prices, optics power, and optics density; we can start to eliminate some of the optics completely, or develop short-reach, very low-cost technologies to come off the NPUs and drive external shelves or passive optical devices, to get very high-density, very low-cost optics off of routers. The interesting one that I think nobody really thinks about is that I can get rid of my 25-gig SerDes bottleneck here and connect devices together, whether it's two small dies on the same package that are optically connected or two completely separate ASICs with optical interconnects; I can do 50- or 100-gig SerDes, no problem, in the optical space, where in the electrical domain we're at 12.5 today, moving to 25 in the next couple of years, and probably moving to 50 and beyond with some advanced modulation. And then the last one, which we'll see over the next, say, 10 years, is potentially being able to use optics as fabric interconnects, to either allow you to build massive multi-shelf systems out of smaller building blocks or eliminate the fabric completely from some of these devices. That one will change the architectures of some of the elements, but the other three you can work within even our existing architectures. So I crammed a lot in; I see about six minutes left, so I'll pause now and take questions, but I think most of us will be stuck here for at least the next two days, so find me in the hotel or
in the bar or something, and feel free to ask me any questions you want.

Thanks very much, Kevin. So what I'd like now is some audience Q&A; we want to hear what everyone thinks, not only about what Kevin just said but about what Craig and Richard have said, and obviously what I said. I guess the first question I have is for Kevin: are we being penny-wise and pound-foolish in wanting a ten-year roadmap for a device we deploy, versus just being willing to rip it out after, say, five years and completely replace it?

So, a couple of things. We've been looking at that for a while, and remember, we're building routing devices for everything from the Netflix architecture you're talking about to large service-provider architectures that don't change for 10 or 15 years. So I think we need to build devices that last at least a couple of generations no matter what, because we have some people who are going to want to put all three generations in, or, let's say Netflix misses the first generation of a box, you're going to catch that second generation. So I think we have to have some flexibility, but there is some merit to asking how long these devices need to live; maybe instead of 20 years, and exotic technologies that are five years down the road that we don't understand, we look at shortening that a little bit and lowering the cost of the device. If the cost of the device is lower, I think that plan works better; if the cost of the device is higher, you want to sit it in the network and leave it there for as long as you can. So I think there's some of both, and you'll see both in different types of providers.

I watched someone rip a GSR out in Ashburn about three months ago now, and it was actually slightly sad but gratifying at the same time. I also still see 7500s in some of the POPs I visit. I believe it.
So, Richard (can I call you Richard, Mr. Tuerck-Burton?), okay: if we think about OpenFlow, OpenDaylight, open big data in the cloud, open whatever, do we need another standards org to attack routing for what you just spoke about? How do we actually move that forward?

Well, I'm not sure that you need a standards org, but it's really unclear where you're going to go with SDN right now. It's the promise, the possibility; there's no clear direction as to who is going to come up and say "I know how to write this routing protocol, I know how to write this CLI, I know how to give you the features that you want and then talk to the hardware." That's where all the effort needs to go: figuring out the business case for that.

Okay. And Craig, the big question that's on my mind is: what's it like to work for Vijay? Okay, you can have his question instead. Sure. So, whether we need a new protocol or not, I think we need to look at SDN and some of the use cases for the data center, or software orchestration, or whatever we're calling it, and if we just look at it as centralized packet-forwarding control or service control, I think we can actually accomplish a lot of our goals and objectives, hopefully without inventing something new, because there are plenty of technologies and methods we can leverage today to get us where we want to go, if we're just thinking about it as centralized control of packet forwarding in general.

You mentioned in your presentation that you're starting to move the 8075 backbone to 100-gig. What's the biggest blocker there from your perspective: power and space? Yeah, it all comes down to power and space. When we're planning and evolving and trying to implement the roadmap across the multiple generations of the network and the topologies, it's being able to put the devices in, whether in our own data centers, like I was saying earlier, or in third-party colos. The biggest challenge is taking a big routing node that's now pushing anywhere between 18 and 20-something kilowatts, going to an Equinix or a Level 3, and saying "here, can you put this in my cage?" and they're like "well, no, but you can put it over here in the middle of the floor where there's nothing around it, so we can actually move the air."

Okay, I've got a couple more questions, but we're actually running out of time, so I want to go to the audience. We'll start in the middle. Thank you; Anton Kapela, 5Nines. Quick question for the panel here, maybe not too quick unfortunately, but I'll try my best. The topologies we're describing, especially the n-tiered or n-layered Clos or fabric-type things, really seem to borrow a lot from what I would call destination-based routing. Of course this is an internet-friendly thing, we want to get prefixes places, but maybe there are alternatives that make sense in this space. One thing that comes to my mind is Myrinet, where the hosts, the most numerous thing in the network, define what they're going to propagate through, and you need almost no gates to move ten, forty, a hundred gigs of traffic. Of course that's not useful for the internet, but I'm curious, in this space, especially if we want to consider SDN or centralized control planes and so forth, what you think might happen if someone could afford to break from
OK, I've got a couple more questions, but we're actually running out of time, so I want to go to the audience. We'll start in the middle.

Thank you. Anton Kapela, 5Nines. Quick question for the panel here; maybe not too quick, unfortunately, but I'll try my best. The topologies we're describing, especially the n-tiered or n-layered Clos or fabric type things, really seem to borrow a lot from what I would call destination-based routing. Of course this is an Internet-friendly thing, we want to get prefixes places, but maybe there are alternatives that make sense in this space. For example, one thing that comes to my mind is Myrinet, where the hosts, the most numerous thing in the network, define what they're going to propagate through, and you need almost no gates to move 10, 40, 100 gigs of traffic. Of course that's not useful for the Internet. So I'm curious, in this space, especially if we want to consider SDN or centralized control planes and so forth, what you think might happen there: if someone could afford to break from the concept of Ethernet and IP destination-based lookups and push that into the edge nodes, would that change anyone's landscape sufficiently or interestingly, or has it even entered the conversation in anything you've worked on so far?

Let's go to Cisco. Yeah, how does that feel? No, actually, I think there's a lot of interesting stuff going on there. If you look at network function virtualization and some of those things, as you push devices out, push intelligence out and let them make decisions, it will lower the requirements of either the edge or the core devices in that space. And so a lot of the things that would be asked for by Dave or by Craig, you know, smaller FIBs or lower-functionality devices, or even stuff like segment routing, where you're able to define paths; that's what segment routing is about, being able to do sort of source-based routing through the network. That will allow us to lower the cost and the functionality of these devices. So I do see that as interesting.

Do you think it will intersect with the rest of the world developing the edge stuff on a timeline you could even entertain? I think we will still be building edge routers for a significant period of time, and I think it will take time for those technologies to evolve and allow you to deploy them ubiquitously throughout the network. But I'm starting to see pockets of interest there now, which means 12 to 18 months of playing with it, looking at it, and figuring out how to deploy it, and probably in the next three to five years I do see large architectures potentially moving in that direction. Thank you.
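To make the destination-based versus source-routed contrast in this exchange a bit more concrete, here is a toy Python sketch. The topology, node names, and prefixes are made up, and it is only a conceptual illustration of the idea behind segment routing (carrying the path in the packet so transit nodes hold less per-prefix state); it is not any vendor's implementation.

```python
# Destination-based forwarding: every node holds a FIB entry for every prefix.
# Source routing: the sender stamps the hop list into the packet, so transit
# nodes only need to know how to reach their directly attached neighbors.
# Topology and prefixes are invented for illustration.

fib = {
    "A": {"10.0.0.0/24": "B", "10.0.1.0/24": "B"},
    "B": {"10.0.0.0/24": "C", "10.0.1.0/24": "D"},
    "C": {"10.0.0.0/24": "local"},
    "D": {"10.0.1.0/24": "local"},
}

def forward_by_destination(node, prefix):
    """Hop-by-hop lookup: each node consults its own FIB for the prefix."""
    path = [node]
    while fib[node][prefix] != "local":
        node = fib[node][prefix]
        path.append(node)
    return path

def forward_by_source_route(hops):
    """Source routing: the path travels in the packet; nodes just pop the next hop."""
    packet = {"path": list(hops), "payload": "data"}
    visited = []
    while packet["path"]:
        visited.append(packet["path"].pop(0))
    return visited

print(forward_by_destination("A", "10.0.0.0/24"))  # ['A', 'B', 'C']
print(forward_by_source_route(["A", "B", "C"]))    # ['A', 'B', 'C']
```

In the destination-based version, adding a prefix means touching every node's table; in the source-routed version only the sender needs to know the full path, which is the sense in which the transit devices can get simpler and cheaper.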
Last question, over there. One of the things I didn't see the panel touch on was the size of the market, and I want to couch that by directing it away from the hardware, because I think we all understand R&D cost amortized over so many boxes. But on the software side, and I'll pick up on something Dave said just because it was in the presentation, he mentioned that, for instance, Netflix does not need IPv6 right now in their box. I'm going to assume at some point in the future he does; I don't know if that's tomorrow or five years from now, but eventually he will want a box that does it. And that means the vendor, from a software perspective, needs to have relatively speaking bug-free software to do that, tested on those high-end boxes, of which there are very few deployed, and very few people in this room are willing to run the latest and greatest whatever. So whether you need the latest MPLS feature or the latest IPv6 feature, it seems to me we have a software development and testing problem; for years IPv6 was not feature-complete on many boxes from many vendors. So how much does the size of the market constrain what you can do in the software in this space?

I had a feeling that, being the one vendor up here, a lot of these were going to come to me. So, two things. One is that although we're talking about LSR-based devices, and Dave says he doesn't need v6, we have to focus on v6 in these devices because they're deployed in a lot of other places, so we are heavily investing in the v6 protocols and the completion of those v6 stacks. But you're right: I think over the last couple of years there has been so much effort in extending v4, all the CGN work we have to do, and trying to figure out how to stretch the v4 investment, that maybe we haven't seen that many pure v6 deployments in networks. But we know it's going to happen at some point. There are going to be people, or there are people today, asking us: what can I do in the control plane, whether it's LDPv6 or other things, so that I can deploy a pure v6 control plane? So we're investing in those features and we have them. But I agree with you: a lack of deployments means we haven't potentially found some of the issues we're going to see over the next couple of years. So we do need to see people either driving in that direction, or we're going to focus the dollars and investment where people are asking for functions.

And to be clear, that bullet point of mine was a bit of a troll to see if people were paying attention; we actually have a nice percentage of v6 traffic on our network.

I think the point is, and you kind of got there, that people like Dave, or maybe not like him, who don't have IPv6 deployed on some level, have to fund the development of that so it's there when they need it, and they fund it by buying boxes today. That goes back to your R&D point, right?

But we do have enough deployments of v6, and enough people asking for everything in v6, that in the high-end devices you're going to see a heavy dose of v6 support across the board.

The one other thing I would say is that Craig's number one issue is power and density and space; my number one issue is probably software stability, with power and density being number two. So, to Leo's point, when all of us are out there asking for features, asking for weird, bizarre features that apply only to us and make the software ridiculously complex, I think that's probably the biggest driver of us not getting what we want in a timely fashion.

Absolutely. The funny thing about that is, I would almost say that's good: it drives vendor differentiation, it allows us to do things that you potentially don't have in other devices. But the simpler I can make these devices, the more solid I can make them, with higher uptime, better resiliency, not going down, and those are all good things for us, because then my support load on the back end goes down and I can spend more engineering resources on features, functions, or stability and infrastructure. So I'm fundamentally completely OK with simplifying these pieces of the network, because it allows me to invest in other areas or in hardening the infrastructure and the devices we have.

So it makes sense; it gives less work to us. Yeah. The last thing I'll say about that is that the point about the size of the market and where the vendors are investing is a good one, because one of the things I was saying is that it's not that these features, or the functionality itself, go away. It's just that we're saying: if you can fund a big development shop and you have a bunch of software developers, you can move that into a software development shop, do the ROI and the cost-benefit analysis, and say, I can do it at the application or software layer better, faster, cheaper, and iterate on that. And then that gives me the ability to dumb down the network device. Kevin might have a long tail in this room of people that still need him to develop those features and functionality, so it's a trade-off.

Well, thank you very much, panelists; you did a fantastic job, and I really appreciate the candidness from everyone. Again, sorry that Cisco ended up being the target; it was never the intention. But thank you very much, and thanks,
Richard.
Info
Channel: NANOG
Views: 6,770
Rating: 4.891892 out of 5
Keywords: Verilan, Atlanta, routers, Internet Service Provider Industry, NANOG, Network Operators, NANOG 60, network, expensive, SDN
Id: -05xWeYGn4A
Length: 67min 26sec (4046 seconds)
Published: Thu Feb 13 2014