Designing Modern Data Center IP Fabric Networks

Captions
Hi, I'm Reggie Warren, a senior principal systems engineer at Extreme Networks, and for the next few minutes I'm going to be talking about designing a modern data center using IP fabrics. We all know the reasons why new data centers are being built and the strategies they need to be built for; I'm going to focus on the how: the design and transition of the data center. How do you build out this new data center, how do you tie the old to the new, and how do you add the automation you require to meet the expected outcomes of this new data center? With that said, let's get started.

From the enterprise perspective, looking at the data center at the macro level, we see that the data center touches many different networks. It reaches out to the campus network, it extends into the cloud and the internet, it connects to other legacy infrastructures you may have, and it may need to communicate with other data centers. It's important that we take all of this into consideration when building out our data center.

With that, we have to build a design checklist, and there are four areas on that checklist. The first area is connectivity: the total number of racks you need, the number of physical hosts that will go inside those racks, the number of hosts per rack, the type of network interface cards in those hosts, and whether or not you are using virtualized servers. From the architecture perspective, once you know the total number of hosts going in, you have to determine the topology that will host this new data center, whether it's a small data center design or one that requires a three-stage or five-stage Clos architecture; you also have to look at the oversubscription ratio and whether you will manage the network in-band or out-of-band. Then there is the logical design: basically how many VLANs, VRFs, MAC addresses, and ARP entries this design requires, and, if you plan on using network virtualization such as BGP EVPN, how many VXLAN tunnels will be needed. Lastly, you have to look at automation: is automation required to deliver day-0 services for the IP fabric deployment, day-1 services for adding tenants, and day-2 services for any adds, moves, and changes? What about full life-cycle management support, or integrating with other ecosystem partners inside your data center? For that, you have to consider programmatic interfaces into those ecosystem partners.

All of this is a design checklist you should consider when building out your data center. The key point here is plan, plan, plan. It's very important that you spend the time working through this checklist and looking at all aspects of how this data center connects to other infrastructures as you transition toward the new data center.
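To make that checklist concrete, here is a minimal sketch of how the four areas might be captured as a planning data structure. It is purely illustrative: the class and field names are my own and are not part of any Extreme tool.

```python
from dataclasses import dataclass, field

@dataclass
class Connectivity:
    racks: int
    hosts_per_rack: int
    nic_speeds_gbps: list[int] = field(default_factory=list)  # e.g. [10, 25]
    virtualized: bool = True

@dataclass
class Architecture:
    topology: str                       # "small", "3-stage-clos", or "5-stage-clos"
    oversubscription_target: float = 3.0
    management: str = "out-of-band"

@dataclass
class LogicalDesign:
    vlans: int
    vrfs: int
    macs: int
    arp_entries: int
    vxlan_tunnels: int = 0              # only if BGP EVPN network virtualization is used

@dataclass
class Automation:
    day0_fabric: bool = True            # fabric bring-up
    day1_tenants: bool = True           # adding tenants
    day2_changes: bool = True           # adds, moves, changes
    ecosystem_integrations: list[str] = field(default_factory=list)  # e.g. ["vmware"]

@dataclass
class DesignChecklist:
    connectivity: Connectivity
    architecture: Architecture
    logical: LogicalDesign
    automation: Automation
```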
So let's look at the physical design. First I'd like to show you the Extreme SLX portfolio of switches that we primarily use in the data center space. The switches are categorized by where they are placed inside the data center: leaf switches, spine switches, core and border routers, and border leaf switches. They are further categorized into shallow-buffer and deep-buffer switches; that becomes important later in our discussion about connecting interfaces of different speeds and why shallow versus deep buffering matters at that point.

With that, let's look at the physical topology. As I mentioned earlier, you have to look at the physical racks needed to hold your new servers inside the data center. Here you have racks 1 through 11 as an example, and each rack typically has a top-of-rack switch, shown here as a pair of switches. In racks 1 through 3 you would place all the hosts that have similar network interface cards of the same speeds, which gives you higher port-utilization efficiency on the top-of-rack switches. For example, you have 30 hosts with 1GBase-T copper ports and 50 hosts with 10GBase-T copper ports, and you house them together in those racks. In racks 4 through 10 you have 150 hosts with 10G optical connections into the network, as well as hosts with 25G optical connections into the top-of-rack switches. In the final rack you have your border leaf and any other services in your data center, such as DHCP, DNS, or firewalls, that reside in the network.

All of these racks have to communicate with each other, and they do so through another tier called the spine layer: every leaf (top-of-rack) switch connects to every spine. We'll go into more detail about what the spine does, but those spines can also sit within rack 11.

Now that we have the racks and switches laid out, let's look at another requirement: oversubscription. To compute your network oversubscription you look at your compute links as well as your leaf uplinks. It's a simple calculation; let's work it with 10G compute links and 40G uplinks on the network switches. First, add up all of your compute links: we have 96 ports of 10G at the hosts, which gives 960G of total bandwidth at the compute layer. Next, add up the bandwidth of your uplinks: we have four uplinks going from the leaf to the spines at 40G each, for a total of 160G. Divide the total compute bandwidth by the total leaf uplink bandwidth and you get an oversubscription ratio of 6:1. For most modern deployments you normally see an oversubscription ratio anywhere from 3:1 to 6:1, although some designs require a much lower ratio.

Now let's see what happens when you use 100G uplinks: the same math drops the ratio from 6:1 down to 2.4:1, so 100G uplinks give you a better oversubscription ratio. However, many modern servers today ship with 25G links. With 25G compute links and 40G uplinks, the oversubscription ratio shoots up to 15:1. That's a huge oversubscription ratio, which may have an impact on the network performance of your data center. To resolve that, it may be best to move to 100G leaf uplinks, which brings the ratio back down to 6:1. So as you move to 25G compute interfaces, it's probably best to move to 100G uplinks as part of your infrastructure. You can also improve your oversubscription ratio by adding more spines, which adds more leaf uplinks into those spines.
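The oversubscription arithmetic above is easy to script. Here is a minimal sketch using the port counts from the example (96 compute ports and 4 uplinks per leaf); it simply reproduces the 6:1, 2.4:1, and 15:1 figures.

```python
def oversubscription(compute_ports: int, compute_gbps: int,
                     uplinks: int, uplink_gbps: int) -> float:
    """Total compute bandwidth divided by total leaf uplink bandwidth."""
    compute_bw = compute_ports * compute_gbps
    uplink_bw = uplinks * uplink_gbps
    return compute_bw / uplink_bw

# Figures from the example above: 96 compute ports per leaf, 4 uplinks per leaf.
print(oversubscription(96, 10, 4, 40))    # 6.0  -> 6:1   (10G hosts, 40G uplinks)
print(oversubscription(96, 10, 4, 100))   # 2.4  -> 2.4:1 (10G hosts, 100G uplinks)
print(oversubscription(96, 25, 4, 40))    # 15.0 -> 15:1  (25G hosts, 40G uplinks)
print(oversubscription(96, 25, 4, 100))   # 6.0  -> 6:1   (25G hosts, 100G uplinks)
```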
So now let's look at the underlay. There are building blocks to an IP fabric. First, the topology: the typical topology you see is a three-stage Clos or a five-stage Clos, and which one you use depends on whether you need a medium-to-large data center deployment or a very large one. They are called three-stage and five-stage because of the number of switches a host's traffic has to traverse before it reaches its destination. For customers that need a small data center topology, Extreme offers that as well, and it is not a standard Clos topology.

No matter which topology you use, an IP underlay of BGP is required, with no additional interior gateway protocols on the network; BGP is the only protocol required in the underlay across all of these Clos topologies. On top of that, across these topologies we also support an overlay network using a BGP EVPN control plane with VXLAN data-plane encapsulation, which allows you to extend Layer 2 and Layer 3 across the entire fabric.

Now let's look at this from a Layer 1 perspective. As I mentioned, you have a spine layer and a leaf layer. The spine layer is used purely as transport for the leaf switches, so it's important that all the leaf switches attach to all the spines in the network. In addition, the leaf switches may have connections between each other for clustering; we'll go a little further into that in a minute. From a Layer 2 perspective, we do auto-discovery between the switches in the IP fabric using LLDP, which discovers what spines and leaves exist in the network. Also at Layer 2, we use clustering at the leaf layer, called MCT clustering, which allows a host to have dual connectivity into the fabric.

From a Layer 3 perspective, all inter-switch links use /31 IP addresses, or they can be unnumbered. The entire fabric uses BGP and no other protocol is required. As part of BGP we use private autonomous system numbers: each rack, whether it consists of one top-of-rack switch or a paired top-of-rack cluster, has its own unique autonomous system number, and all the spines belong to one autonomous system number. Host routes are installed after auto-learning is done.
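As a rough illustration of that numbering scheme (not EFA's actual algorithm), the sketch below hands out one private ASN per leaf rack, a shared ASN for all spines, and a /31 point-to-point subnet for each leaf-to-spine link. The ASN values and address pool are assumptions chosen for the example.

```python
import ipaddress
from itertools import count

SPINE_ASN = 64512                                 # assumption: all spines share one private ASN
LEAF_ASN_BASE = 64513                             # assumption: leaf rack ASNs allocated from here
P2P_POOL = ipaddress.ip_network("10.0.0.0/24")    # assumption: pool carved into /31 links

def underlay_plan(spines: list[str], leaf_racks: list[str]):
    """Yield (leaf rack, leaf ASN, spine, spine ASN, /31 subnet) for every leaf-spine link."""
    subnets = P2P_POOL.subnets(new_prefix=31)
    leaf_asns = count(LEAF_ASN_BASE)
    for rack, asn in zip(leaf_racks, leaf_asns):
        for spine in spines:
            yield rack, asn, spine, SPINE_ASN, next(subnets)

for link in underlay_plan(["spine1", "spine2"], ["rack1", "rack2", "rack3"]):
    print(link)
# ('rack1', 64513, 'spine1', 64512, IPv4Network('10.0.0.0/31')) ...
```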
Now let's look at host connectivity; we touched on this already. You may have a requirement for a host to be single-attached to a leaf, and that's fine. But what about hosts that require redundancy and high availability? You need to be able to attach two links into your top-of-rack switches, with those switches configured as a logical virtual chassis (the MCT cluster). From the host perspective we can use static LAGs or dynamic LAGs using LACP for that connectivity, and this gives you high availability for the host's connection into the IP fabric.

From a VLAN and VRF perspective, it's important that we separate the traffic. The underlay should go into its own default VRF, and once we start adding user VLANs and VE interfaces, those should go into their own separate VRFs, so that traffic on the user VLANs doesn't impact traffic on the underlay network. For connectivity between different racks we use L2 VNIs or L3 VNIs. In addition, a static anycast gateway is required at the VLAN level. The static anycast gateway is defined at your top-of-rack switch, and there is no need for any other protocol such as VRRP. The static anycast gateway is the same per VLAN and VRF across your entire network, and there is a default MAC address, consistent across the network, that can be configured. That static anycast gateway is the entry point into your IP fabric network and is a requirement.

Now let's take a look at the overlay network and the considerations there. Our IP fabric is based on BGP EVPN as the control plane, and on the data plane we use VXLAN encapsulation. To bring VLANs into the network virtualization, we use VLAN-to-VNI (VXLAN network identifier) mappings. Those VNIs then have to be mapped to an EVPN instance in order to get into the IP fabric, and once they are there, we can extend L2 VNIs or L3 VNIs across the network. All MAC and IP learning is done via the control plane. Since the overlay is based on BGP, very minimal EVPN configuration is required to get the overlay up and going.

Let's look at how Layer 2 extension occurs across this IP fabric. In this example, traffic flows from left to right between two leaf switches in different racks that need to communicate with each other. Each of these leaf switches has an EVPN instance already configured. To add VLANs into the network virtualization, I map those VLANs to a VNI, which then gets mapped into my EVPN instance (EVI) across the network. Traffic is sent to the local leaf, where it hits the virtual tunnel endpoint (VTEP), gets encapsulated, and is sent across a VXLAN tunnel to the destination leaf switch, where the VTEP decapsulates the traffic and sends it on to its VLAN. Traffic coming back the other way goes through the same process.

We talked about the ease of configuring the overlay; let's take a look at it from the CLI perspective on leaf 1. First, look at how the VTEP is defined: most of the time the VTEP is defined as loopback 2 on the switch. Next, VLANs are mapped to VNIs under the VXLAN configuration of the switch. Here you see "map vni auto", which takes care of the mapping automatically, although you have the option to map VLANs to VNIs manually. You also need your BGP underlay and BGP overlay configuration on the switch: the underlay is defined under address-family IPv4 unicast, and the overlay under address-family L2VPN EVPN. Lastly, you need your EVPN instance; inside the EVPN instance is where you add your VLANs to the EVI.
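The "map vni auto" behavior described above can be modeled very simply: each VLAN gets a VNI derived from its VLAN ID unless a manual mapping overrides it. This is a hedged illustration of the concept, not SLX-OS code; the identity mapping (VLAN 30 to VNI 30) matches the examples later in the talk, and the manual override value is hypothetical.

```python
class OverlayMapper:
    """Toy model of VLAN -> VNI -> EVPN instance mapping: auto-mapping with
    optional manual overrides. Illustrative only, not real SLX-OS behavior."""

    def __init__(self):
        self.manual: dict[int, int] = {}     # vlan -> vni manual overrides
        self.evi_members: set[int] = set()   # VNIs added to the EVPN instance

    def map_vni(self, vlan: int) -> int:
        # "map vni auto": VLAN N maps to VNI N unless a manual mapping exists,
        # consistent with the VLAN 30 -> VNI 30 and VLAN 100 -> VNI 100 examples.
        return self.manual.get(vlan, vlan)

    def add_vlan_to_evi(self, vlan: int) -> int:
        vni = self.map_vni(vlan)
        self.evi_members.add(vni)
        return vni

m = OverlayMapper()
print(m.add_vlan_to_evi(30))    # 30
m.manual[200] = 20200           # a manual override, purely illustrative
print(m.add_vlan_to_evi(200))   # 20200
```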
At this point you may see some transient tunnels because VTEP discovery is happening; however, if you look at the switch itself, you may not see any tunnels up and operational yet. That's because there is no requirement yet: no hosts need to communicate on that VLAN. So now let's add a host inside VLAN 30. Once we do that, you will see a tunnel automatically come up to every leaf that has hosts interested in that VLAN and needing to communicate. You can display those tunnels from the CLI with "show tunnel brief", and here you see two tunnels that are up, reaching the destination leaves. If you do "show vlan brief", you can see that VLAN 30 is mapped to VNI 30 using two tunnel instances, tunnel 61441 and tunnel 61442.

Automatic VTEP discovery works a little differently than in a Layer 2 network, so let's see how it happens. The VLAN is mapped to a VNI and added to the EVPN instance (EVI) on the leaf switch. Once that occurs, a BGP update containing an inclusive multicast route (IMR) with VNI 30, and with the next hop set to the local VTEP IP address, is sent from the leaf switch up to the spine switch. The spine switch then sends that IMR route down to the other leaf switches in the network. When a leaf switch receives these IMR routes, it checks whether the route matches one of its own VNIs; if it does, the leaf imports the route and sets the BGP next hop as the remote VTEP. At that point the VXLAN tunnel is established between those leaf switches.

MAC and IP learning uses a slightly different method. Say user 1 would like to communicate with user 5. Normal data-plane learning happens at the top-of-rack switch, where that MAC is learned on port 1. Once that traffic hits the IP fabric, the leaf captures, for that particular MAC, the IP address, the VNI it belongs to, and the local VTEP, and sends a BGP update with this information up to the spine switch. The spine switch then sends it out all of its interfaces back down to the leaf switches, and the information is stored in their local MAC table entries. The same learning occurs in the other direction. So instead of a broadcast-and-flood mechanism, these MAC and IP addresses are learned from a BGP update message, specifically a route type 2 update, across this network.
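Here is a minimal sketch of the inclusive multicast (Type 3) route handling just described: a leaf advertises (VNI, local VTEP), the spines reflect the route, and a receiving leaf imports it and brings up a VXLAN tunnel only if it has that VNI locally. It illustrates the logic only; the names and addresses are hypothetical, not vendor code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IMRRoute:            # EVPN Type 3: inclusive multicast route
    vni: int
    next_hop_vtep: str     # advertising leaf's VTEP (loopback) address

class Leaf:
    def __init__(self, name: str, vtep: str, local_vnis: set[int]):
        self.name, self.vtep, self.local_vnis = name, vtep, local_vnis
        self.tunnels: set[str] = set()          # remote VTEPs we have a tunnel to

    def advertise(self) -> list[IMRRoute]:
        return [IMRRoute(vni, self.vtep) for vni in self.local_vnis]

    def receive(self, route: IMRRoute) -> None:
        # Import only if the VNI matches one of our own; next hop = remote VTEP.
        if route.vni in self.local_vnis and route.next_hop_vtep != self.vtep:
            self.tunnels.add(route.next_hop_vtep)

leaf1 = Leaf("leaf1", "10.1.1.1", {30})
leaf2 = Leaf("leaf2", "10.1.1.2", {30})
leaf3 = Leaf("leaf3", "10.1.1.3", {40})        # no VLAN 30 hosts -> no tunnel

# The spines simply reflect the routes to every other leaf.
for rt in leaf1.advertise():
    for leaf in (leaf2, leaf3):
        leaf.receive(rt)

print(leaf2.tunnels)   # {'10.1.1.1'}  tunnel established
print(leaf3.tunnels)   # set()         VNI 30 not local, route not imported
```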
Now let's look at VXLAN traffic distribution. You have many links going from your top-of-rack switch up to your spines; which link does the traffic choose? Does it use all of them? Is there some sort of load sharing or load balancing across these links? Before I answer that, consider a pair of top-of-rack switches configured as an MCT cluster. Each switch in that cluster physically has its own virtual tunnel endpoint, but since we treat the cluster as one logical switch, we also combine the VTEPs into one logical VTEP: there is only one VTEP for a pair of top-of-rack switches.

The leaf looks at the number of links available to the spines and uses equal-cost multipath across them, varying the traffic over those equal-cost links by varying the source UDP port of the VXLAN encapsulation. Let's see how this occurs. An encapsulated packet is sent from the virtual tunnel endpoint; looking at the outer destination MAC, it points to spine 1. How did it choose spine 1? It chose spine 1 based on the source-UDP-port hashing result used to distribute traffic across these links. Once it chooses spine 1, it forwards the traffic up to spine 1, and spine 1 then uses its own hashing algorithm to determine which link to use toward the destination VTEP; in this case it has two possible choices, picks one based on the hash, and could just as well pick the other. The logical source VTEP also has the option of sending traffic up toward spine 2, and spine 2, based on its hashing result, determines which link it sends down toward the destination VTEP. So all links across an IP fabric get used, and traffic is distributed based on equal-cost multipath.

One thing to note about routing across an IP fabric: to make sure routing happens efficiently, the switches must support routing in and out of tunnels (RIOT). Some switches do this in a single pass; others need two passes to route in and out of a tunnel. You have to make sure you choose the right switch; in Extreme's case, all our SLX switches support single-pass RIOT at the ASIC level.
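A hedged sketch of the load-distribution idea described above: the VXLAN source UDP port is derived from a hash of the inner flow, so different flows land on different equal-cost uplinks while packets of a single flow stay in order. The hash function and port range here are illustrative, not the SLX ASIC's actual algorithm, and the flow tuples are hypothetical.

```python
import zlib

VXLAN_DST_PORT = 4789                      # standard VXLAN destination port
SRC_PORT_RANGE = range(49152, 65536)       # illustrative ephemeral source-port range

def vxlan_source_port(inner_flow: tuple) -> int:
    """Derive the outer source UDP port from a hash of the inner flow 5-tuple."""
    key = "|".join(map(str, inner_flow)).encode()
    return SRC_PORT_RANGE.start + zlib.crc32(key) % len(SRC_PORT_RANGE)

def pick_uplink(src_port: int, uplinks: list[str]) -> str:
    """ECMP: the outer header (including the varied source port) selects the uplink."""
    return uplinks[zlib.crc32(str(src_port).encode()) % len(uplinks)]

uplinks = ["to-spine1-a", "to-spine1-b", "to-spine2-a", "to-spine2-b"]
flows = [("10.0.30.1", "10.0.30.5", 6, 51000 + i, 80) for i in range(4)]
for flow in flows:
    sport = vxlan_source_port(flow)
    print(flow[3], "->", sport, "->", pick_uplink(sport, uplinks))
# Different inner flows hash to different source ports and so to different uplinks.
```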
Once you've chosen the right switch, there are two types of integrated routing and bridging (IRB): asymmetric and symmetric. With asymmetric IRB you have to have common VLANs across your VRF stack. For user 1 on VLAN 30 to communicate with user 2 on VLAN 50, I not only have to have VLAN 50 in the user VRF on the ingress top-of-rack switch, I also have to have VLAN 30 there, and the same is true at the destination VTEP: common VLANs across the VRF stack. Once that's in place, a packet arrives on VLAN 30, a lookup is done on the destination IP, the stack routes that traffic to VLAN 50, and VLAN 50, using its L2 VNI, bridges that traffic across the VXLAN tunnel to the destination VTEP, where it is forwarded out onto the VLAN. It's called asymmetric integrated routing and bridging because ingress performs a routing and a bridging function while egress performs only a bridging function. This may be fine for small deployments, but for very large deployments, having all the VLANs present across all your VRF stacks may cause scaling issues.

So we also offer symmetric integrated routing and bridging. With symmetric IRB you don't need any common VLANs across your VRF stack; instead you have a dedicated L3 VNI that handles the VXLAN routing function. In this case that dedicated Layer 3 VNI is VNI 99. Here's how it operates: user 1 on VLAN 70 sends traffic; on VLAN 70, inside your VRF stack, a destination IP lookup finds that the destination exists behind VNI 99, so the traffic is routed from VLAN 70 to VNI 99. It is then routed across the L3 VNI to the destination VTEP, which does another lookup, sees that the MAC exists on VLAN 90, routes from VNI 99 to VLAN 90, decapsulates the packet, and sends it on to VLAN 90. It's called symmetric because a routing function occurs at both ingress and egress, and it is used for large deployments where you need a common VNI that handles the routing function on a per-VRF basis.

Looking at this from the CLI perspective, the integrated routing and bridging interface VE 99 is configured under the VRF context. That means each VRF has its own dedicated L3 VNI defined for routing between the VE instances within that VRF.

Now let's look at the border leaf. The border leaf is basically the gateway in and out of the data center fabric, and it may connect to a number of different networks in your infrastructure: existing campus networks, the internet, network services such as firewalls or load balancers, and other networks. It's therefore very important that the router or switch you choose for your border leaf has enough buffering to ensure that, across all the different interface and link types connected to it, there is no impact on traffic performance. How does the border leaf communicate with the IP fabric? The leaf network has to advertise all of its networks, or VLANs, to the border leaf so that the border leaf can reach external networks outside of the IP fabric. This is a simple configuration done at the border leaf, where you redistribute all your connected networks into BGP.

Now that we've looked at the underlay and overlay of the network, let's also look at automating the network, and we're going to do this with EFA. EFA stands for Extreme Fabric Automation, an automation application that Extreme builds to automate the orchestration of fabrics and tenant networks within your IP fabric. It's used as a single point of configuration for the entire fabric and is based on microservices. There are foundational microservices that look after assets, the fabric infrastructure, and tenant services, so EFA is fully aware of the fabric, how it's configured, and the tenant services that exist on it. This makes for simple and easy integration with other ecosystem partners you may require in your IP fabric infrastructure, such as OpenStack, VMware, and Microsoft Hyper-V; adding that integration is just a matter of adding another microservice to the EFA application, and that's only a sample of the integrations, since additional microservices can be deployed on EFA. The application can run on the SLX switches themselves, inside their integrated application-hosting environment, but for very large deployments you may need to install EFA on a standalone server. In addition, Extreme Fabric Automation is treated as a feature of Extreme's IP fabric deployments, so there is no extra cost to using it.

Now I'm going to transition to a demo showing how easy Extreme Fabric Automation is to use and how quickly an IP fabric, as well as tenant services, can be set up across the network.
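Before moving to the demo, here is a small conceptual model contrasting the two IRB modes described above: asymmetric IRB routes into the destination VLAN's L2 VNI at ingress and only bridges at egress, while symmetric IRB routes into a per-VRF L3 VNI at ingress and routes again at egress. The VLAN and VNI numbers come from the examples in the talk; the IP addresses are hypothetical.

```python
# Conceptual ARP/route table on the ingress VTEP: destination IP -> destination VLAN.
ARP_VLAN = {"10.0.50.2": 50, "10.0.90.2": 90}   # hypothetical addresses

def asymmetric_irb(dst_ip: str) -> list[str]:
    """Ingress routes into the destination VLAN and bridges over its L2 VNI;
    egress only bridges. Needs every VLAN present on every VTEP in the VRF."""
    vlan = ARP_VLAN[dst_ip]
    l2_vni = vlan                       # VLAN N -> L2 VNI N, as in the examples
    return [f"ingress: route -> VLAN {vlan}",
            f"fabric:  bridge over L2 VNI {l2_vni}",
            f"egress:  bridge onto VLAN {vlan}"]

def symmetric_irb(dst_ip: str, l3_vni: int = 99) -> list[str]:
    """Both ends route; traffic crosses a dedicated per-VRF L3 VNI (VNI 99 here)."""
    vlan = ARP_VLAN[dst_ip]             # only the egress VTEP needs this VLAN
    return [f"ingress: route -> L3 VNI {l3_vni}",
            f"fabric:  carry over L3 VNI {l3_vni}",
            f"egress:  route L3 VNI {l3_vni} -> VLAN {vlan} and decapsulate"]

print(asymmetric_irb("10.0.50.2"))     # VLAN 30 host talking to a VLAN 50 host
print(symmetric_irb("10.0.90.2"))      # VLAN 70 host talking to a VLAN 90 host
```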
Here is an example of the topology we will be using. It's a three-stage Clos architecture with four switches: two switches will be deployed as spines and the other two as leaves. Each of these switches has a configured management IP address, and the switches are racked, stacked, and cabled in the manner you see. There are two configured hosts, one attached to each leaf, and the EFA application is already installed on a server. In this example we'll demonstrate the automatic provisioning of the IP fabric, both underlay and overlay, in addition to adding a tenant service that will allow host 1 to communicate with host 2 using network virtualization, specifically BGP EVPN.

Let's get started. I have a terminal session into each of the devices on the network for visibility. At this point there is no configuration of the fabric itself, so if I go to each of the switches I shouldn't expect to see any IP fabric configuration at all. Extreme's IP fabric is based on the BGP protocol, so let's go to leaf 1 to see whether any BGP services are configured, and there's nothing there; just as confirmation, there is no BGP enabled. We should expect to see the same on leaf 2, spine 1, and spine 2, and checking each, there are no BGP services across the fabric. Therefore I should not be able to ping from host 1 to host 2, and indeed the pings fail in both directions. Lastly, let's go to the EFA application to see whether anything is configured there, and as you can see, there is no fabric configured.

So, using EFA, let's automate the provisioning of this fabric. First, create a three-stage Clos fabric called fabric 1. Next, add our leaves and spines to this fabric; at this point the fabric is pre-provisioned. Then commit the configuration of the fabric to the underlying infrastructure. At this point the underlay and overlay of the fabric are provisioned, and we can verify that: we have BGP neighborships from leaf 1 to its spines, the same is true of leaf 2, and the spines, as you would expect, have BGP neighborships to the leaves. Looking at the BGP overlay, we have neighborships from leaf 1 to spine 1 and spine 2, and the overlay is established for leaf 2 as well. However, since we have no tenant services deployed yet, we shouldn't have any tunnels across the fabric, and indeed there are no tunnels on leaf 1 or leaf 2. With no tunnels, the hosts still cannot communicate with each other.

So now let's go back to EFA and add our tenant services. First, create a tenant; then assign this tenant to a VRF context; lastly, commit the configuration of the tenant. The tenant is now fully deployed, so let's check. Go to leaf 1 and look for tunnels: we do have a tunnel that's up. Go to leaf 2, and its tunnel is up as well. Let's also check the VLAN-to-virtual-network-identifier (VNI) mapping, which allows the VLAN to be virtualized across the network: VLAN 100 is mapped to VNI 100, and the same is true on leaf 2, where VLAN 100 is mapped to VNI 100.
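The demo workflow follows a simple pattern: define the fabric, add devices (pre-provision), commit, then layer tenant services on top. The sketch below models that flow in plain Python; it is not the EFA CLI or API, and every class, method name, and management address here is invented for illustration.

```python
class FabricModel:
    """Toy model of the provision-then-commit workflow shown in the demo.
    Not the real EFA interface; names are illustrative only."""

    def __init__(self, name: str, topology: str = "3-stage-clos"):
        self.name, self.topology = name, topology
        self.spines: list[str] = []
        self.leaves: list[str] = []
        self.tenants: list[dict] = []
        self.committed = False

    def add_device(self, mgmt_ip: str, role: str) -> None:
        (self.spines if role == "spine" else self.leaves).append(mgmt_ip)

    def commit(self) -> str:
        # In the demo this is the step where the BGP underlay and EVPN overlay are pushed.
        self.committed = True
        return f"underlay + overlay provisioned on {len(self.spines + self.leaves)} switches"

    def add_tenant(self, name: str, vrf: str, vlan: int) -> dict:
        tenant = {"name": name, "vrf": vrf, "vlan": vlan, "vni": vlan}  # VLAN 100 -> VNI 100
        self.tenants.append(tenant)
        return tenant

fabric = FabricModel("fabric1")
for ip, role in [("10.20.0.1", "spine"), ("10.20.0.2", "spine"),
                 ("10.20.0.3", "leaf"), ("10.20.0.4", "leaf")]:   # hypothetical mgmt IPs
    fabric.add_device(ip, role)
print(fabric.commit())
print(fabric.add_tenant("tenant1", vrf="tenant1-vrf", vlan=100))
```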
Now let's go to the hosts to see whether we are finally able to ping between them. As you can see, host 1 is able to ping host 2, and host 2 is also able to successfully ping host 1. The entire IP fabric was configured in a matter of seconds using the Extreme Fabric Automation application.

Let's talk now about a transition strategy. I'd like to state again that it's imperative that you plan, plan, plan any IP fabric deployment and transition strategy. The key is to minimize downtime and service outages. Once the new data center fabric is deployed, we can start standing up new services and then migrate existing services onto the new infrastructure. There are a number of different approaches to this; I'll go through one phased approach.

Take an existing data center network. We deploy a new Extreme IP fabric, hopefully using Extreme Fabric Automation, in which case it was done in minutes if not seconds. Once we have that new data center fabric, we connect the existing network to the new fabric at the border layer. With that connection into the existing infrastructure in place, we can start adding new servers and services onto the new infrastructure. Once we finish deploying the new services, we start prepping the existing network as well as the new data center to transition all services onto the new infrastructure. That is first done by configuring, on the new infrastructure, the VLANs where the old services exist. Once that configuration is complete, we can transition the old services onto the new infrastructure, anywhere on the IP fabric. After that transition, we take the links connected to your core routers and swing them over from the existing data center to the new infrastructure. Your existing data center can remain in place, and the remaining older services can continue to be transitioned onto the new data center infrastructure at the customer's pace. Once that's complete, the customer can decommission the existing data center and move forward with the new IP fabric data center.

In conclusion, we talked about modernizing your data center using IP fabric networks, and Extreme gives you that choice. There are several things to keep in mind: your physical topology, your underlay, your overlay, which automation application you're going to use, and your transition strategy. Extreme has a solid solution to help you migrate your data centers into the modern era using IP fabric technology. Thank you for watching.
Info
Channel: Extreme Networks
Views: 1,826
Rating: 5 out of 5
Keywords: extremenetworks, technology, it solutions, networks, data center, fabric networking, IP fabric, reggie warren, white boarding, real-world examples, automation, speed, ecosystem integrations
Id: obEJZNCV8JQ
Length: 45min 13sec (2713 seconds)
Published: Fri Sep 25 2020