Juniper Networks EVPN - VXLAN Architecture

Captions
So my name is Doug Hanks. I'm the director of architecture here at Juniper. My team basically builds the hardware and the software for our switching products here at Juniper. Now what we want to do is go a little bit deeper on this EVPN-VXLAN use case and give you an idea of what that looks like and how it works.

At a high level I want to focus on two use cases, the first one being DCI. When you take a look at DCI, you want to do a couple of different things. Obviously you want to exchange data between data centers, but you want to provide both Layer 2 and Layer 3 transport between different data centers, not just two but multiple. On top of that you need data separation, isolation, and obviously redundancy when it comes to your links and your nodes. So, pretty simple.

Now how does EVPN apply to this? We have a lot of different options, so I'll go through the most common. First of all, when we start taking a look at this, you typically already have a WAN in place. It may be a WAN based on MPLS, running a Layer 3 VPN at your enterprise site. What I'm showing here is a simplified DCI segment wrapped around upon itself. So basically this box over here on the left is your first data center, and over here is your second data center, and these MXs here are the edge routers. It's shown looped back on itself, and I'll be showing different options here.

For the first option, we assume there is an existing MPLS network back here that you guys own and manage, or you buy it from someone else such as AT&T or Verizon or what have you. So this already exists, and this is going to be a Layer 3 VPN that you buy. Now the first option is: if I want to use EVPN and VXLAN to do a Layer 2 DCI across the data centers, how do we do that? Well, if we make the assumption that a QFX10000 is in your data center somewhere, regardless of whether it's IP fabric, Fusion, or whatever, it doesn't matter, what we can do is basically start and stop a VXLAN
tunnel on the QFX10000, and because we encapsulate that traffic, we can essentially send it over the top of that Layer 3 VPN that you guys already have in your network. So we call this over-the-top DCI, because it's going over the top of the Layer 3 VPN. It's pretty easy to implement because it requires no fundamental changes to your WAN. Does that make sense to you guys? OK.

The second one is a different flavor of this. If you want to go in there and modify your WAN and say, "I'm not happy with just a Layer 3 VPN, I actually want to change my WAN architecture to natively support EVPN," you can do so. What happens then is that you have an MPLS data plane with the EVPN control plane, so now you can natively do Layer 2 and Layer 3 traffic on your WAN. Connecting into your data center, we use a different data plane encapsulation, so in this case it would be VXLAN going back into the data center, but still using the same fundamental control plane, EVPN. So we have the stitching of VXLAN and MPLS happening on the edge router. This is more of an MPLS Option A, if you're coming from the SP world. The benefit is that it's EVPN everywhere, and the downside is that it's going to require more planning and basically changes to your network. The upside is that you get all the benefits of MPLS, so if you want fast reroute, traffic engineering, yada yada yada, you can do so with EVPN now in your WAN. So we've got two different models here: the easy way, and the other way, which requires more planning and more work on the WAN itself but brings more benefit. Do those two options make sense for the EVPN DCI? OK.

The other option is a bit more interesting. This might be more of a branch model where you might not have a traditional WAN; maybe it's just going across the internet or an IPsec tunnel. Basically what I'm showing here is that we're getting away from MPLS on the WAN side and we're going right to VXLAN, which can ride on top of IP. So if you have two branch locations going across the
internet, you can basically just tunnel this across the internet, for example. This is VXLAN through and through, so from data center to your edge router, and edge router to edge router, would be VXLAN, and again obviously EVPN all the way through as well. So you get both L2 and L3 between these two locations.

The last option is probably the simplest: it's just a back-to-back connection. If you have an edge router or a peering router, how can I do it? You just directly connect them. Very easy to do, there's no MPLS. You typically need dark fiber for this, but again it would be EVPN with VXLAN natively. So again, four options for DCI when it comes to VXLAN and EVPN, and we can do so on our QFX10000 here.

Our next use case is the EVPN-VXLAN fabric, which is focusing more internally in the data center. If you want to zoom in and take a look at how this works, basically what you see is that inside the data center you have, let's say, bare-metal servers, or they could be virtual servers, you've got your storage, and so on and so forth, and then everything will be connected through this EVPN-VXLAN fabric. I'm going to give you some more details of what this actually looks like. And then again, we provision, orchestrate, and manage this day to day with OpenClos.

So I'll start peeling back the covers here. We see a spine-and-leaf type of topology. Obviously we have our bare-metal servers down here, BMS, and what's going to happen is that we'll create a full mesh between the spine and leaf. The next step is that we want redundancy to the servers, whether it's dual-homed, which I'm showing here, or multi-homed, say six-way, over there. So we've got two different variations of the access layer.

Now this is where it gets a bit interesting. Now we start talking about the VXLAN components, and here we'll do something called VXLAN L2 gateway. Whenever we receive an Ethernet or an IP packet from our bare-metal servers, we'll encapsulate that into
a VXLAN header, and if we want to switch between two of these guys, obviously we can send it from any L2 gateway to any L2 gateway. The question that comes up is: what if you want to do VXLAN routing, and how does that happen? Because generally these types of access switches can only do L2 gateway and not the Layer 3 gateway. So what we've done here on the spine, and I've made the assumption that this is a QFX10000, what you see over here is the chassis, is that we can do both L2 gateway as well as Layer 3 gateway. It's kind of like a universal gateway, you can think of it that way. So it'll do MPLS, IP, Ethernet, VXLAN, whatever, it really doesn't matter, both L2 and L3. So now what happens is we create a bunch of tunnels going from any of our access switches back into the spine, so wherever we want Layer 2 or Layer 3 routing, we can do so. And the benefit is that the underlay itself is built upon an IP fabric using BGP, standard protocols, extremely resilient. If any of these nodes goes down, it's just a switch going down that happens to run BGP; it's not going to impact anyone else. Again, just using standard, open protocols.

The next piece is tenant separation: how does that generally happen? If I zoom in on one of these switches, whether you want to have application separation or tenant separation, it's the same fundamental tools. What we have here is we start carving out VRFs. I call this one T1, for tenant one, and we'll do the exact same thing for tenant number two. This is going to be a Layer 3 VRF as well as a Layer 2 VRF. In this we start carving out bridge domains, or VLANs, and we give them a VNI, which ties back to the VXLAN identifier, and we also give it a mapping to a VLAN ID. That way we have the mapping from Ethernet to VXLAN through this mapping of IDs, and obviously translation between the two, so if I need to do a migration or whatever, you can terminate it. Yeah, and we can do some really funky stuff
too. So the traffic comes in here on VLAN ID 1, and we can send it back out over here on 55 or whatever you guys want. VLAN normalization, I guess you would call it. We do the exact same thing in VRF number 2. I'm not showing sequential VNIs on purpose, because the namespace for VXLAN is global. It's basically 1 to 16 million, and you can't overlap those; that's why I showed that as global. However, we can have overlapping VLAN IDs, because we make a Layer 2 VRF. So for example, I put here, for bridge domain 1, VLAN 1. I could have this same VLAN ID 1 in tenant 2 as well as tenant 1, so in that case that's going to be a local namespace where we have overlapping VLAN IDs.

The next question is how you route between these guys. There's going to be an IRB interface, or an SVI, and we connect these up into the IRB, and we basically do this in the control plane for policy enforcement. So if you want to say that tenant one can only route and switch within this one VRF, you can do so, and it happens at line rate in the data plane. Now if you want to say that tenant one can also talk to tenant number two, we'll make a control plane adjustment in BGP, and it'll say: for these particular NLRIs, whether it's an IP/mask or an IP/MAC address, we'll allow that traffic to propagate between these two different VRFs, and therefore these tenants can talk to each other. The exact same thing can happen going out to the internet or the WAN: we just explicitly allow or deny what traffic can go in or out of that network.

Now, have you guys done any kind of distributed firewalling across the access layer?

We have ACL support, but we don't have stateful firewalls, really, northbound. What we could do is take the virtual SRX and put it in the compute clusters and then enforce it like that. It's not, like, perfect.

Yeah, just not Hyper-V.

Yes. Good. So does this make sense to you guys? I'll pause here.

Now, back to the virtual SRX: is that a
kernel module, or is that an actual VM that's going to run?

It'll be a virtual instance. I will let my friend speak to that; I'm not well versed in the world of virtual SRX.

Yeah, OK, but it's not a kernel module? So it could be vMotioned between... OK, maybe.

Yeah, so it's not a kernel module, and I will talk more about it. It's actually a VM that we instantiate for a virtual firewall.

OK, cool. I'll speed up a little bit. There are two options when building out these VXLAN fabrics: you can do it in a three-stage topology, or if you're going to grow beyond the capacity of the spine, we can go to five-stage as well. There are a lot of different design options, but we do support both, depending on your scale, your needs, and the number of servers that you need.

On the next bit I do want to give you a bit more of the fundamentals. We already talked about the different VRFs, so I'm not going to repeat that. What I'm actually showing here with the dotted line is that Layer 2 VRF, and then we have our bridge domains, and I talked about the IRB already. But the piece I want to show you is the interaction between the spine and the leaf. There's an assumption here: the assumption is that this is going to be a top-of-rack switch based on the Broadcom T2 chipset, which only does VXLAN L2 gateway. What that means is that from an overlay perspective it's Layer 2 only. The underlay is Layer 3, but when it comes to EVPN and the overlay, it can only do L2 functions. So you're not going to see a repeat of your VRFs down here; you'll see a global default virtual switch instead. What happens is that we'll migrate the L2 components that are up here into the switch, because again, from the overlay perspective it's only Layer 2, so we just replicate those bridge domains. And now what happens is obviously we'll put those VTEPs into the network itself, and then we'll extend the VNIs to match those bridge domains. So what happens is, whether it's L2 or L3
traffic coming in from your servers, we'll encapsulate that, it will get tagged into the correct VNI, and it will basically get routed up here, or we can natively switch in the switch as well. When it comes to the routing side, you may ask, "Well, where's my default gateway for my server?" It actually exists up here as the IRB. If you imagine you have multiple spines, whether it's 2, 3, 4, 16, 32, whatever it is, this is actually an anycast gateway, and by using EVPN we don't need to have that .2, .3 kind of scenario. It's all just .1 for your default gateway, and we keep track of the MAC addresses, the synchronization, and MAC learning through EVPN.

Yeah, I know. So I'm thinking, I'll go back to a virtual environment, but if I'm in one rack and I hit my ToR, my leaf, and I'm on the same host and I'm going to a different subnet, going to get into another network, am I going to be able to go straight up to my leaf and then back down, is my gateway there because it's anycast, or am I going to go back up to my spine?

Right, so what happens is, if you've got a server down here, like you said, plugged into this leaf, and you want to route to a different bridge domain, what's going to happen? You're going to ARP for your default gateway, right, which exists up here, and it's going to say, "Oh, your default gateway exists on this VNI." It's going to travel up here, it's going to respond to the ARP request, it'll get the packet, it will route it up here, and send it back down to the leaf in that case. And that's because this particular leaf is based on the Broadcom Trident II chipset.

And then when I get into the virtual side, and you're here to go over this as well, does that change that scenario? When I'm doing the virtual, will it actually, at the host layer, say I'm going from this VM to this VM, even though they're on the same host, am I still going to go back up to my spine and back down, or with the virtual edition will I stay on that host and route east-west?

So,
again, if it's within the same host and you have virtual SRXs instantiated, and if it's seeing the traffic, then it would provide all the firewalling right in the host itself.

Cool. You brought up a little earlier about developing your own chipset. Obviously, I guess this came out of the fact that you can't do Layer 3 VXLAN-type routing with the Trident II, right? So are you moving to the Jericho chipset or your own chipset in the future? And where are you going to put it, because obviously you're using the default gateway extended community through EVPN, but when you move to your own functionality, are you going to bring your default gateway down to Layer 3 on the leaf switches at that point?

Yes, so we have a pretty extensive roadmap when it comes to the access switches. I think it's a few slides ahead. So to answer the question: on the leaf, on access switches, we tend to go with merchant silicon, just because it's good. When it comes to the spine, we had some challenges where we wanted to solve some of the hosting problems, and the current chipsets, when we started developing the system, couldn't do it. So we made our own in-house silicon, and that's the Juniper Q5 chipset. That was where we could route and switch anything, whether it's VXLAN, GRE, NVGRE, or MPLS; we could do so in that chipset. We keep that chipset in the spine of the network and the merchant silicon in the access part of the network. But obviously Broadcom is going to have a T2+ chipset very soon that will do VXLAN routing in the access layer, so then we can modify the same architecture but do the local routing on the access switch as well. So we've got a lot of different options.

So, assuming there are a lot of address families in here, a lot of VNIs and VTEPs, do you do, like, zero-touch provisioning? Because when you scale this thing out really wide, it's a lot of configuration.
Yeah, so you have a ZTP that you use, and it will grab all the address families, everything?

Yeah, I can show you how that works. Basically, as you just described, you have different topologies and you have a lot of interface assignments, all the control plane that you mentioned, as well as the VXLAN configuration bits, and basically, how do you automate this? Our answer is OpenClos. It's always nice in a presentation when you ask a question. OK, good, nice.

So yeah, we have tools to completely automate this for you, and I agree, it's extremely difficult to do it by hand. Most people are like, "Oh yeah, I'll get an Excel spreadsheet and go do this," or, "If I'm savvy I'll get a little Perl script and use that output to make a spreadsheet." It's like, no, let's just make a real program to do this. Most shops may not have the resources to do that, and we do, and we wanted to give that to you guys for free. So that's why I made it open source and freely available.

Now, do you guys integrate, and this may or may not be valuable, with any IPAM, like Infoblox, anything like that, as far as being able to automate some of that? Does anything like that work with this? Is there a need for it? I don't know.

Honestly, this will do it for you. You've got to tell it your starting blocks and what your namespace is, and it'll extrapolate that. But if I wanted to tie that in and automate it at a higher level, maybe we don't have that today.

Yeah, sorry, I am curious, and I might have asked this the last time I heard about this, but you guys obviously are not opposed to working with Ansible?

No.

And I know a lot of us have explored using Ansible for doing these kinds of physical and virtual provisioning tasks, right? So I guess from a strategic perspective, this is a lot of investment from an engineering perspective to do this dedicated library. Why not just leverage existing tools? Why write your
own tool, that kind of thing?

Well, a lot of the functionality doesn't exist in Ansible. With a lot of the technology that we had to build, we actually thought at first, "Oh, this would be trivial, let's go write some Python libraries, do some calculations." But as we got into the weeds of it, we learned that the challenge was actually quite complex. We actually have, like, two patents on how to do a lot of the calculations for building IP fabrics, VXLAN fabrics, and so on and so forth, and that just didn't exist in a framework like Ansible.

For sure. What kind of calculations are you talking about?

Well, we take namespaces, whether it's BGP namespace, IP namespace, route target namespace, whatever, and that has a lot of interdependencies, and the outcome has to be absolutely perfect for this to work. The interaction of those namespaces and the carving out of tenant namespaces is quite complex. So that was the logic that we had, and we want to provide that value as opposed to saying, "Ansible, go do this for us." But we want to use Ansible as the go-do-it tool, and we actually did a demo the other week with Ansible plus OpenClos to go build this out as well. So we've got a lot of different options on how to do this.

Cool, yeah.
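
Editor's sketch (not part of the talk): the tenant-separation segment describes two namespaces, a fabric-wide VNI space that must never overlap and per-tenant VLAN IDs that may. The short Python model below is one way to picture that rule, along with the 8-byte VXLAN header an L2 gateway would prepend. The class and function names (`Fabric`, `Tenant`, `add_bridge_domain`, `vxlan_header`) and the example VNI values are hypothetical and chosen only for illustration; they are not Junos or OpenClos APIs. Treating 1 as the lowest usable VNI follows the "1 to 16 million" remark and is an assumption. The header layout follows RFC 7348.

```python
import struct
from dataclasses import dataclass, field

VLAN_MIN, VLAN_MAX = 1, 4094        # usable 802.1Q VLAN IDs
VNI_MIN, VNI_MAX = 1, 2**24 - 1     # 24-bit VNI; starting at 1 is an assumption


@dataclass
class Tenant:
    """Hypothetical tenant with its own Layer 2 VRF: VLAN IDs are local to it."""
    name: str
    vlan_to_vni: dict = field(default_factory=dict)


class Fabric:
    """Hypothetical model: VNIs are unique fabric-wide, VLAN IDs only per tenant."""

    def __init__(self):
        self.tenants = {}
        self.vni_owner = {}  # VNI -> (tenant name, VLAN ID)

    def add_bridge_domain(self, tenant_name, vlan_id, vni):
        if not VLAN_MIN <= vlan_id <= VLAN_MAX:
            raise ValueError(f"VLAN ID {vlan_id} is out of range")
        if not VNI_MIN <= vni <= VNI_MAX:
            raise ValueError(f"VNI {vni} is out of range")
        # The VNI namespace is global, so reject reuse across any tenant.
        if vni in self.vni_owner:
            raise ValueError(f"VNI {vni} already used by {self.vni_owner[vni]}")
        tenant = self.tenants.setdefault(tenant_name, Tenant(tenant_name))
        # The VLAN namespace is local, so only check within this tenant.
        if vlan_id in tenant.vlan_to_vni:
            raise ValueError(f"VLAN {vlan_id} already mapped in {tenant_name}")
        tenant.vlan_to_vni[vlan_id] = vni
        self.vni_owner[vni] = (tenant_name, vlan_id)


def vxlan_header(vni):
    """Build the 8-byte VXLAN header: flags with the I bit set, then the VNI."""
    if not 0 <= vni <= VNI_MAX:
        raise ValueError(f"VNI {vni} does not fit in 24 bits")
    # First word: flags byte 0x08 plus 24 reserved bits.
    # Second word: 24-bit VNI plus 8 reserved bits.
    return struct.pack("!II", 0x08 << 24, vni << 8)


fabric = Fabric()
fabric.add_bridge_domain("t1", vlan_id=1, vni=5001)
fabric.add_bridge_domain("t2", vlan_id=1, vni=7310)  # same VLAN ID, other tenant
print(vxlan_header(5001).hex())  # 0800000000138900
```

In this sketch, adding VLAN 1 to both tenants succeeds because each tenant keeps its own VLAN table, while reusing VNI 5001 anywhere in the fabric raises an error. That mirrors the global-versus-local distinction drawn in the talk, nothing more.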
Info
Channel: Tech Field Day
Views: 48,530
Keywords: Tech Field Day, Networking Field Day, Networking Field Day 10, NFD10, Juniper Networks, Doug Hanks
Id: EBjPve8AmR4
Length: 21min 1sec (1261 seconds)
Published: Fri Aug 21 2015