Cisco QoS: Design and Best Practices for Enterprise Networks

Captions
Patrick: Hello everyone, and welcome to today's webinar: Cisco QoS Design and Best Practices for Enterprise Networks. I'm Patrick Hubbard, Head Geek here at SolarWinds, and with me on the call is Ken from Cisco.

Ken: Hi Patrick, hi guys.

Patrick: Ken is actually the guy who wrote the book on implementing QoS best practices for enterprise networks. Tell us a little about what you do as a technical lead at Cisco.

Ken: Sure, Patrick. I've been at Cisco for about 13 years. I started out in LAN switching, in the 3K, 4K, and 6K business units, mainly focusing on quality of service, and then moved into Advanced Services doing end-to-end designs for a lot of Cisco's customers. Right now I'm in the network operating systems technology group, which owns the operating systems, and quality of service, across basically all of Cisco's platforms.

Patrick: That's fantastic. I love these webinars where we get real geeks on board, and your presentation has some slides I really enjoyed while we were prepping, so let's dive in. The body of what we'll cover today is QoS design and best practices, and toward the end we'll talk a little about CBQoS reporting using NTA and some of the new features in 4.0. Feel free to ask questions at any time using the question panel; this is completely interactive. We'll try to keep some time at the end, and any questions we don't get to we'll answer by email afterward. Just a quick overview of SolarWinds, then I'll hand it over to Ken, who will talk for quite a bit; I'll jump in with questions along the way, and we'll wrap up with Q&A. I'm sure all of you are familiar with SolarWinds: we've been delivering IT management software for over a decade now, software that's easy to set up and use and gets more powerful every year, particularly with the latest release of NTA 4.0 and its new flow storage, all available to download from the website. The customer count on this slide is actually out of date; it's more than 100,000 customers in more than 170 countries now, and I can't think of many folks who haven't used at least some of our free tools, at a minimum things like the TFTP server. So let's dive into Ken's part of the presentation and talk about QoS design.

Ken: All right, guys. I've already introduced myself, so let's jump to the next slide. We're going to focus on end-to-end QoS design and strategy. We'll start with a basic overview, and then we'll try to convince you that you need to deploy campus QoS, because that always seems to be the sticking point; it's usually not where customers go first when they think about QoS. So the first section will be campus, then we'll jump into the WAN, then a quick summary, questions toward the end if you have them, and a couple of references. You'll see a slide like this in every QoS presentation, and if you flip through the whole deck you'll see the trends all going up: more traffic, growing eightfold, and you're going to see all sorts of numbers like these:
wireless traffic exceeding wired, far more video on the network, and tons of non-PC traffic, meaning smartphones, tablets, and all that good stuff. One of the big growth areas is BYOD, which is really prevalent, and we're seeing a lot of video going downstream to those devices. All of these trends are typical; we've been watching them for the last ten years. So the first slide you always see in any QoS presentation is "beware, the traffic is changing." There are going to be a lot of traffic types out there you don't know about, and today the big one is video; HD video specifically is something that can really dig into your network, so you want to know a bit more about it, and we'll cover what to do with HD video in a bit.

The next piece you always talk about is the trend toward voice, video, and data media applications. We have more collaborative applications, like Lync and Jabber internally at Cisco, and all sorts of collaborative media, which raises the question of whether we should separate the voice and the video traffic. What about the unmanaged traffic? What do we do with the data applications? Any collaborative application has multiple facets, and those individual streams and flows all have different requirements. So how do we handle them? Do we segment them? Do we mark them differently? In most cases it really depends on your business needs, but recognizing that things are becoming more complex, and that we need better classification using NBAR or metadata rather than just a typical ACL, is one of the improvements we've made to quality of service over time, and it's something we really need to look at.

That collaborative media leads us to classification, and this RFC (RFC 4594) is basically your best friend when it comes to quality of service, because it provides a really good guideline for the application classes, how you sort applications by their technical requirements, and the per-hop behavior, the DSCP value, you associate with each traffic class. The idea is to give you at least a reasonable idea of how to group applications and how they should be marked, with some examples on the right. Now, Cisco differs a bit on one of these classes: call signaling, which the RFC puts at CS5. Cisco believes the priority queue is extremely important, and honestly, on the WAN the priority queue costs you money: if you buy a provider's gold class you pay for your priority queue, and the more bandwidth you request for that queue, the more you pay. So the less you put in there, the better for the bottom line, and pulling call signaling out of the priority queue by marking it CS3 is, in most instances, simply the better choice. CS3 and CS5 (again, this is an informational RFC) have basically just been swapped, so call signaling on all Cisco devices is set to CS3 by default, and we keep that signaling traffic out of the priority queue;
it simply doesn't have the same requirements as voice bearer traffic.

Patrick: And I can't believe you'd throw Xbox Live in the scavenger category.

Ken: Well, that depends on your business requirements, I guess. Now, this slide is not pretty, and it's not meant to be. This is RFC 5865, the next-generation RFC that incorporates the additional classes; we went from a 12-class model to a 15-class one that incorporates admitted and non-admitted traffic classes. You'll notice the admitted and non-admitted classes up at the top, along with the video references; this is for when you have something like Call Manager that can mark admitted and non-admitted flows differently and actually differentiate between the two. When you look at it, it appears very complex, and sadly it really is. Because this RFC is informational, it gives you a lot more detail on what applications are out there and how to group them, and it tries to help you separate different video types: if you have an admitted call versus a non-admitted call, you may want to mark them differently.

But if your network isn't that complex and you don't have all these individual applications, what do you do? We start looking at different models: 5-class, 8-class, and 12-class. Most customers will start with either a 5-class or an 8-class model, and some, if they really want to endure it, will move to a 12-class model. The reality is you only have a few markings: in a 5-class model you literally have five markings, so realistically you won't see AF3x in a 5-class model. You want to keep your classification and your marking simple; if you need a bit more granularity, you move from five classes to eight. The idea is to keep QoS simple, and you're going to hear me repeat that over and over, because if you make it overly complex you have to manage all of the classification tied to the policy. So for an overall end-to-end design strategy, you can start with a 5-class model and leave all the excess traffic in your best-effort queue; then, if there are video types that need to be pulled out of that queue and prioritized, you move to an 8-class model. That's really the idea: grow your QoS policy with your network, and keep it as simple as you possibly can. Here's a quick reference slide (these slides will be handed out to you); it's on cisco.com, or CCO as we call it internally, and it walks through the whole picture: the application explosion, collaboration, what you should do, what the different RFCs are, and the 5-, 8-, and 12-class models. It's a really handy reference and I've spent a lot of time on that site, so take a look.
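To make the class-model discussion a little more concrete, here is a minimal sketch of what a small 5-class MQC marking policy might look like. The class names, ACL names, and RTP port range are illustrative assumptions for the example, not configuration from the presentation:

    ! Identify voice bearer traffic by its RTP port range (Cisco phones use 16384-32767)
    ip access-list extended VOICE-RTP
     permit udp any any range 16384 32767
    !
    class-map match-any VOICE
     match access-group name VOICE-RTP
    class-map match-any SIGNALING
     match access-group name CALL-SIGNALING
    class-map match-any CRITICAL-DATA
     match access-group name CRITICAL-APPS
    class-map match-any SCAVENGER
     match access-group name RECREATIONAL
    !
    ! Five classes, five markings: EF, CS3, AF21, CS1, and default
    policy-map MARK-5-CLASS
     class VOICE
      set dscp ef
     class SIGNALING
      set dscp cs3
     class CRITICAL-DATA
      set dscp af21
     class SCAVENGER
      set dscp cs1
     class class-default
      set dscp default

Growing to an 8-class model is then mostly a matter of adding class-maps (for example, separate video and network-control classes) rather than redesigning the policy.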
The first thing we have to do is convince you that campus QoS is important, because most people, when they look at the campus, think: I have 10-gig or 1-gig channels, how could I possibly drop any traffic? And I'm not worried about latency there, am I? The reality is, no, you're not worried about latency; you're worried about loss. That's the big concern. Latency and jitter really aren't a big deal when you're dealing with 1- and 10-gig links in the campus, but as you'll see in the next couple of slides, it only takes a few milliseconds to congest a link, and that's what we care about. We want to schedule traffic appropriately: if HD video, such as TelePresence, is important to you, you want to schedule it accordingly and give it the right bandwidth so you don't lose packets, because as you'll see, the loss can be bad.

As mentioned here, losing one packet in every 10,000 in an HD flow can easily cause visible user disruption. We do the math on the right side: a single 1080p screen, 1920 by 1080, is about 1.5 gigabits per second uncompressed. With the H.264 codec we compress that down to three to five megabits, but there's still a 1.5-gig uncompressed stream behind that flow, which means that when you drop a single packet you're really dropping a lot more data than you think. A user is on the order of a hundred times more sensitive to packet loss with HD video than with voice. So it's extremely important to recognize that loss is what you're protecting yourself from in the campus; you don't want to drop packets if HD video matters to you.

Here's a comparison: we usually compare voice to HD video, and it's a bit of a joke. Voice was important back in the day and still is, but it's incredibly deterministic and very easy on the network: a G.711 or G.722 call is 20-millisecond audio samples and very small amounts of data. Video depends entirely on what the camera on the other side is seeing; it could be massive, like the middle frame on the slide, or fairly light, like the first frame. The bandwidth HD video requires is completely different from voice, and that's something we have to take into account when we classify and apply QoS, because a lot of the traffic out there these days is HD video and it may not be business-centric.

The next piece: it's fairly easy to oversubscribe buffers on line cards. The 6148 line card shown here is a very standard card and provides a gig of throughput just fine, but if I throw HD video at it, it only takes about 11 milliseconds to fill its buffer. That's not because the buffer is undersized; it's because of the massive amount of traffic and burstiness that video produces. On the 10-gig line card shown, it's about nine milliseconds. That burstiness is extremely important to take into account, because if we don't provision properly and don't configure quality of service in the campus, we're going to see packet loss on potentially very important video calls that your manager or your boss could be on.
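For reference, the roughly 1.5 Gb/s figure quoted a moment ago for one uncompressed 1080p stream works out as follows, assuming 24-bit color and 30 frames per second (assumptions the slide itself doesn't spell out):

    1920 x 1080 pixels/frame x 24 bits/pixel ≈ 49.8 Mb per frame
    49.8 Mb/frame x 30 frames/s ≈ 1.49 Gb/s uncompressed
    H.264 brings this down to the 3-5 Mb/s range, a compression ratio on the order of 300:1 to 500:1

That ratio is why a single lost packet hurts so much: each packet carries far more picture information than its size suggests.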
All right, so the basic design principles for quality of service, as always: perform QoS in hardware rather than software. Most of the time, when customers first look at QoS, they think about the WAN specifically, but as we just showed, it's easy to oversubscribe the campus, so you definitely need to configure QoS at the access edge. Number one, you're performing QoS in hardware, which offloads the CPU and offloads whatever you can from the WAN routers, because they may be doing processor-intensive things like DPI. Classification and marking are really important at the edge because that's where we establish something called a trust boundary. As traffic comes in, it's like somebody walking into a building: you make sure they badge in before they're allowed onto the network. The idea of the trust boundary is that as an application or flow enters your switch, you don't have to trust its marking; you can mark it whatever you want, so it's better to classify it, based on an ACL or something similar that identifies the flow, so that it gets queued properly throughout the network. Policing unwanted traffic flows is just common sense: if you limit traffic at the access edge, you realistically don't have to worry about excess traffic in your distribution, your core, or on your WAN, and if there's traffic you really don't want hitting the WAN you can drop it right there at the access edge. Don't let it onto the network if it doesn't need to be there. And then queuing: queuing is what guarantees QoS. If you don't have queuing in your core or on your WAN, there is no guarantee. What queuing does is make sure that, based on your markings and all the classification and policing work you did at the edge, each egress hop in the access, distribution, core, and WAN services its queues with a specific bandwidth guarantee; a 30% allocation for a particular queue is a firm commitment, and you adhere to that 30%.

Some of the QoS tools and options we'll talk about: the global trust states, the conditional trust we use with Cisco IP phones and TelePresence, per-port QoS versus per-VLAN versus per-port per-VLAN (another option on the Catalyst 4K and 3K), the standard QoS models, and EtherChannel QoS. One thing I want to mention: in this campus section we're generally talking about MLS QoS, which usually means the 3750, the classic 4K, and in some cases the 6500; these are all MLS-based switches. The next-gen line, the 3850 that just came out, the 4K with Sup 6, 7, and 8, and the 6K Sup 2T, are all MQC-based; they behave more like a router and don't use the MLS command set, so you'll see some differences there, but in this section we're focused on the MLS-based switches.

So, in the campus, on all of these MLS-based switches you have trust states: untrusted, trust CoS, and trust DSCP. By default everything is untrusted on MLS switches; one difference you'll notice is that the MQC-based platforms trust everything by default, in line with the router-based platforms. Untrusted means the switch assigns an internal DSCP value and rewrites the incoming CoS and DSCP to zero. If you trust CoS, the switch looks at the incoming CoS value, say CoS 5, derives the internal DSCP from the CoS-to-DSCP mapping table, carries that internal DSCP through the switch, and performs the rewrite on egress. Trust DSCP works the same way: we trust the incoming DSCP, and on egress the CoS value is rewritten based on the internal DSCP. That's extremely important to note.
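As a rough illustration of those trust states on an MLS-based access switch, here is a hedged sketch; the interface numbers and the modified CoS-to-DSCP map are assumptions for the example:

    ! Enable QoS globally (everything is untrusted until you say otherwise)
    mls qos
    !
    ! Trust DSCP on an uplink toward the distribution layer
    interface TenGigabitEthernet1/0/1
     mls qos trust dscp
    !
    ! Adjust how trusted CoS maps to the internal DSCP (e.g., CoS 5 -> DSCP 46/EF)
    mls qos map cos-dscp 0 8 16 24 32 46 48 56
    interface GigabitEthernet1/0/10
     mls qos trust cos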
The next piece is the trust, or conditional trust, operation we use with Cisco IP phones and Cisco TelePresence units. With conditional trust, trust can be extended to the IP phone itself, so the phone re-marks traffic from the PC that plugs into it before it goes further into the network. CoS or DSCP values come in on the PC port on the back of the phone; because the phone has been validated via CDP as a Cisco phone, it re-marks the PC traffic to CoS 0, and the phone itself sends CoS 3 for signaling along with a DSCP of 46 (EF) for voice bearer traffic, or AF41 if it's sending video, into the switch. The switch doing the conditional trust then trusts the CoS and DSCP values it receives. Generally we say trust CoS for phones and trust DSCP for Cisco TelePresence: with TelePresence the port behind the codec should actually be shut down, so trusting DSCP is fine there. For Cisco phones, since we've extended conditional trust to the phone and the phone re-marks CoS, not DSCP, we want to trust CoS on the physical switch port; otherwise the PC behind the phone could send a DSCP value straight through the phone and into the switch. So any time you do that, make absolutely sure you trust CoS on a switch port attached to a Cisco phone.

The standard trust boundary, again, is something we establish; it's the key-card effect. We want the flows and applications entering the network to be tied directly to the business requirements. There are three options. First, conditionally trust the phone: the phone acts as the boundary, re-marking the PC traffic to zero (or something else if you like, but normally zero) and sending its own CoS values of 3 and 5 to the switch, which remaps them from CoS to DSCP as we discussed on the previous slide; the point is we've established a boundary. Second, the secured endpoint: we can trust an endpoint because it has some sort of application or supplicant on it. Third, the unsecured endpoint, a standard PC on a switch port; we don't want to trust that device at all, because with end users, on purpose or by accident, who knows what they'll send.
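A minimal sketch of an access port set up for a Cisco IP phone with conditional trust; the interface and VLAN numbers are assumptions for illustration:

    interface GigabitEthernet1/0/5
     switchport access vlan 10
     switchport voice vlan 110
     ! Extend trust only if CDP identifies a Cisco IP phone on the port
     mls qos trust device cisco-phone
     ! Trust the CoS the phone sends; PC traffic behind the phone is re-marked to CoS 0
     mls qos trust cos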
The next piece is per-port versus per-VLAN quality of service, and there's a lot of discussion around this; it really depends on how many policies you want to deploy, how large the policies are, and what you want to do from a QoS standpoint. With a per-port policy you apply the policy to a port, and everything on that physical port has to pass through it; if you mark or police, it's all based on that port. A VLAN-based policy is an aggregate rather than an individual port: everybody in that VLAN is subject to the policy. It can be applied to an SVI or to a layer-2 VLAN, but either way it's an aggregate, so any time you put a policer on a VLAN policy it's really a best guess: if you have ten phones, you have to size the VLAN policer at ten times 128K, whereas per-port is easier because it's more granular. One step further is per-port, per-VLAN policies: we look at the VLAN on the individual port, and those policies can differ from port to port on a per-VLAN basis. The 4500 does a really good job of this: on the physical port you configure a VLAN range and tie a service policy directly to it, so if VLAN 110 is your voice VLAN you can attach a voice-VLAN policy to it, and do the same with the data VLAN. Separate policies, more granularity, and a lot of customers like that flexibility.

For the ingress QoS model, you're honestly looking at the standard untrusted versus trust-CoS or trust-DSCP states we just talked about, plus conditional trust; it depends on whether the port is untrusted or trusted, and in all cases we may want to attach an ingress QoS policy to the port. The standard recommendation for an ingress policy is an eight-class model that takes into consideration voice, video, multimedia, and transactional classes and marks them according to the RFC. Policing is optional; realistically, policers add a lot of overhead and management, so the best-practice recommendation is to police voice, signaling, and other fairly deterministic classes, and video-based applications where you at least know the maximum codec rate, so you can attach a policer to that class and make sure a virus doesn't take it over. But use policers conservatively, because if you update your systems you may end up revisiting your QoS policies over and over when a policer was set too low and has to keep being bumped up. And then of course we have the voice VLAN and the native data VLAN, which in this case carry per-port, per-VLAN policies.
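As a sketch of the "police conservatively, and only the deterministic classes" idea, here is what a per-port ingress marking policy with a policer on the voice class might look like; the rate, burst, class names, and interface are assumptions (the 128K-per-phone figure simply echoes the planning number used in the talk):

    policy-map INGRESS-MARK-POLICE
     class VOICE
      set dscp ef
      ! Size the policer from the expected call count; one call budgeted at ~128 kbps here
      police 128000 8000 conform-action transmit exceed-action drop
     class SIGNALING
      set dscp cs3
     class class-default
      set dscp default
    !
    interface GigabitEthernet1/0/5
     service-policy input INGRESS-MARK-POLICE

The class-maps referenced here are the kind sketched earlier; on a per-VLAN deployment the same policy would be attached to the VLAN rather than the physical port.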
That leads us to ingress queuing. This is the typical campus queuing model: in general, Cisco bases it on the 1PxQyT expression that's been around for some time. You have one priority queue, in this example three normal queues, which may be SRR, WRR, class-based weighted-fair, or FIFO queues depending on the platform, and in this example eight thresholds per queue. Queuing does vary based on hardware, and there's a lot of discussion about that; people would like one queuing model across all platforms, but every time we come out with new hardware we want to extend the capabilities available to customers, which is why you keep seeing more queues, more thresholds, or more priority queues, and customers genuinely do use those capabilities. So you'll see a plethora of queuing types and structures, but they all follow the same overall 1PxQyT expression.

The general best-practice recommendations: don't oversubscribe your priority queue; keep it at about one third of the overall link speed. You can definitely go above that, but the reason we say one third is so the scheduler isn't monopolized by priority traffic. When a packet arrives in the priority queue we schedule it first and foremost, which means we may subject the other traffic to loss, latency, or jitter, and we don't want that; we want everything to work in accordance with the standards and keep all the applications usable. So the idea behind the recommendation is that real-time traffic shouldn't monopolize the link. The 25% recommendation for the best-effort queue is really there to remind you to provision bandwidth for everything else: the priority queue handles the important low-latency traffic, and best effort handles, realistically, everything else. As for scavenger and bulk: scavenger is the disruptive traffic on the network, and bulk is usually some sort of backup that hopefully runs at night and doesn't disrupt the day. If it does fire off during the day, you want to make sure that bulk traffic is not sitting in your best-effort queue, because most of your traffic won't be classified, either because you can't classify it or because you don't want to classify the 30,000 applications on your network today. Pulling the bulk traffic out of the default class really protects the standard applications that live there, because the backup can no longer crush them.
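A hedged MQC-style sketch of those queuing recommendations; the percentages apply the one-third and 25% guidance from the talk to illustrative class names, and MLS platforms would express the same intent with srr-queue/priority-queue commands instead:

    policy-map EGRESS-QUEUING
     class VOICE
      ! Keep the strict-priority queue at or below roughly one third of the link
      priority percent 30
     class SIGNALING
      bandwidth percent 5
     class CRITICAL-DATA
      bandwidth percent 25
     class SCAVENGER-BULK
      ! Constrain bulk/scavenger so a midday backup cannot crush class-default
      bandwidth percent 5
     class class-default
      ! Leave at least ~25% for everything that is not explicitly classified
      bandwidth percent 25
      fair-queue
    !
    interface GigabitEthernet0/1
     service-policy output EGRESS-QUEUING
     ! Older IOS releases may require raising max-reserved-bandwidth for totals above 75%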
Now, the next piece is WAN QoS design considerations. When it comes to the WAN we don't really have to sell the idea of QoS, because everybody understands that a T1 on one side of the router and 100 Meg on the other side is probably going to cause some problems. But today's environment really requires more developed classification techniques, like NBAR or metadata, to dive into individual applications, especially over HTTP, and either classify that traffic as business-critical or classify it as bit-bucket, CS1. On the ingress here, at the LAN edge of the WAN router, there's an optional DSCP-to-CoS mapping for legacy switches that don't support DSCP; I don't see that much anymore, but it's left there for those of you who do have fairly old switches. On the egress side toward the LAN we really don't need queuing in most instances, so you usually won't see a queuing policy there; you may see classification and marking based on NBAR, and those policies are getting more granular these days when you're dealing with cloud or standard internet-based traffic.

On the WAN edge we do the typical class-based weighted fair queuing on the routers, along with shaping and dropping. One thing we didn't mention in the switching section is that a lot of switches honestly don't do shaping; they do policing, and the difference between the two is huge. Shaping lets us back traffic up into the queues above for some of the traffic; policing just drops at a hard CIR, which means if voice, video, and data are all traveling through a policer, anything above that rate gets dropped regardless of type. So on the WAN edge we want to provide back pressure with a shaper so we can fill the individual queues, send the priority-queue traffic, the voice and the video, first and foremost, and then service the other classes with class-based weighted fair queuing. There are a plethora of connection types out there, DMVPN, internet-based connections, standard VPNs, FlexVPN, and we don't cover them in detail here, but there are several SRNDs available, the QoS SRND version 4 being the latest, plus VPN and WAN SRNDs you may want to check out, because we don't have time to cover all of that in this section.

Again, on the WAN side, at the point in the network marked with the green dot on the switches, we want to trust the traffic as it comes in; we trust DSCP because the LAN devices have already marked or re-marked it, and trust is an ingress function: as traffic arrives we look at the DSCP values and let them through, and then we do our queuing on the switch ports as we normally would, because we want queuing end to end. On the router itself you don't have to worry about trust, and on a lot of the next-gen switches you don't have to configure trust either, because trusting by default is the IOS way; that's how we do it with MQC. Optionally you'll have LLQ and class-based weighted fair queuing, depending on your policy, on the WAN aggregation box. Usually your uplink is a T1 or a DS3, much slower than a gig link, so you don't have to worry much about traffic going from right to left, out of the WAN aggregation box toward the LAN side of the house. The uplink is really the most important piece. This is where you need some synchronicity with your service provider, or where you need to decide what you want to send over the internet. If you've purchased a VPN-based service from your provider, as we'll see in the next couple of slides, you want to line this up directly with their offering, based on your internal model, your 12-class model or whatever it happens to be. You'll generally do some re-marking at this edge toward the provider, because you'll typically be dealing with a four- or six-class provider model, though some go up to twelve classes in certain instances.
The other thing to consider is that even when you're sending traffic out to the internet, where there's no quality of service once it leaves you, if you're dealing with public cloud or cloud-based services it's still important to look at that traffic and define granularly what you want to send from your router out to the internet. If voice is most important to you, send voice first; allocate bandwidth to video; be very granular with the policy on that uplink so voice goes out before video, before your transactional applications. You can even throttle applications like Hulu or Netflix on ingress, coming back into the network. The idea is to make the best use of your bandwidth even when it's internet bandwidth.

With an MPLS VPN design there's a lot going on. You'll typically have a hierarchical QoS policy with a parent shaper; that shape rate is the rate you've purchased from the provider, ten meg, a hundred meg, whatever it happens to be. Under that you'll have class-based weighted fair queuing, CBWFQ, and your priority queue will be a low-latency queue; that's where essentially all of your voice, and possibly video, goes, and you'll pay a bit extra for it. Your standard queues, as we said before, are sized based on the amount of traffic you want to send. It's very difficult to tell any one customer how much AF21 or transactional traffic they'll have; it depends on your business requirements, and this policy is going to take some internal discussion: what applications do you have, is HD video critical to you, does it go in the priority queue or in a standard CBWFQ queue, and how much are you willing to pay for the priority queue, the gold class as we called it before. On the service provider side, their router simply enforces their policy, so your objective is to fold your values, if you have to, into the service provider's policy and make sure you adhere to their markings, and that can be a task in itself. Working with the provider one on one is extremely important, because you want to know, for each of their classes, whether they're looking at CS5 or CS6, whether they use AF21 or AF31. There should be no assumptions; they should outline that for you, provide the policy, and you pay for a certain percentage for each of those classes. So really two steps: verify the SP policy so you know exactly what you're getting from the provider, and configure your egress queuing to match.
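A rough sketch of that hierarchical structure, a parent shaper with a child LLQ/CBWFQ policy, on a WAN edge router; the 10 Mb/s rate, percentages, and class names are placeholders for whatever your provider contract actually specifies:

    policy-map WAN-CHILD
     class VOICE
      priority percent 30
     class CRITICAL-DATA
      bandwidth percent 25
     class class-default
      fair-queue
    !
    policy-map WAN-PARENT
     class class-default
      ! Shape to the contracted sub-line rate so your queues, not the provider's policer, absorb bursts
      shape average 10000000
      service-policy WAN-CHILD
    !
    interface GigabitEthernet0/0
     service-policy output WAN-PARENT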
Here's an example: we go from our standard model on the left to the service provider's offering on the right. We have what looks like a 12-class model on the left and we have to fold some of those values into what the provider offers; look at CS5, AF4x, and CS4, and you can see we're re-marking them to different values to fit the provider's classes. On the other side of the house, where this traffic comes back out of the provider's pipe, we'll have to classify it again, based on an ACL or NBAR, to remove the CS2 value and re-mark it CS5 so it gets the treatment it needs inside our enterprise. These marks and matches line up with the four queues this particular provider offers, and we're doing the best we can to look at the applications on the left and group the ones with similar requirements together on the right; that's why you see TCP applications going into SP-Critical-1 and UDP into SP-Critical-2, for instance.

The next model is one of the best generally offered by providers, and QoS folks love working with it because it's flexible and gives you just about everything you need: a six-class service provider model with EF, CS5, the assured-forwarding values, all the way down to best effort. It's really easy to map your values into that one; since we're taking a fairly extensive 12-class model into six classes, we again condense like classes: bulk data and scavenger go into a single scavenger class at 5%, and management, media streaming, and typical video share a queue. What we're trying to do is build critical-1, critical-2, and critical-3 classes whose traffic types are similar to one another, so that when they flow through the provider they don't interact poorly with each other.

The last piece is Metro, or sub-line-rate, access. Please recognize that switches prior to the 3850, the next-generation box that just got released, generally do not support shaping, so a switch really shouldn't be the Metro handoff unless it's a 3750 Metro or some other Metro-class switch. The idea here is to use a WAN card that supports the sub-line-rate handoff and gives you the ability to do hierarchical QoS and shaping, a SIP/SPA for example. In this case we trust DSCP at the orange dots, we do queuing from the distribution or core or access toward this box, and we trust DSCP on the uplinks as traffic flows through. The H-QoS-capable Metro router can look at the VLANs and all the detail there, but realistically, from the access edge all we do is trust the traffic coming in, queue it according to our LAN policy, and then, as we hand it off, shape the traffic down on that Metro router or Metro switch to the contracted line rate. Again, if your provider sells you a 10-meg rate and you have a gig connection, and you send a gig into that 10 meg, it's a free-for-all: traffic is dropped randomly and you don't know whether your voice, your video, or your data is getting dropped. You have to provide a shaper to create back pressure and give class-based weighted fair queuing the chance to schedule the traffic out at the shape rate of the port and into the cloud. You're grooming the traffic to make sure that what leaves your network is what you actually want to leave, and that's the extremely important piece of quality of service.
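To illustrate the pattern of restoring a marking on traffic that comes back from the provider with a collapsed value, here is a hypothetical sketch; the ACL, port range, class name, and the CS2-to-CS5 restore are only an example of the idea described above, not a recommended mapping:

    ip access-list extended BROADCAST-VIDEO
     permit udp any any range 5000 5099
    !
    class-map match-all RESTORE-CS5
     match access-group name BROADCAST-VIDEO
    !
    policy-map WAN-INGRESS-RESTORE
     class RESTORE-CS5
      ! Provider hands this class back marked CS2; restore the enterprise marking
      set dscp cs5
    !
    interface GigabitEthernet0/0
     service-policy input WAN-INGRESS-RESTORE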
All right, guys, that's the WAN QoS design guidance. I know it's very high level, but if you dig into the design guide, all the details are there; it's around five to six hundred pages if you really enjoy reading, and you can start from there and send us questions if you have any. There's also a new book that just came out, the second edition of the End-to-End QoS Network Design book, and it covers all the next-generation quality of service: the 3850, the 5760 from a wireless perspective, the Sup 8, the next-generation switches as well as the current ones. It's the follow-on to the standard edition that's been out for quite some time, so if you do have questions, grab it and take a look; I'm sure it will answer them.

Patrick: Thanks, Ken, I appreciate that; that was a fantastic presentation. I don't get to dive into QoS as much as I'd like; a lot of times that process ends up being hours spent looking at everything from ToS to the QoS flags to building out CBQoS class maps, and boy, it can be a challenge to get it all working right in practice. So I wanted to talk briefly about monitoring QoS and the product we offer to make that a little easier, especially with some of the new features in 4.0. Ideal monitoring should provide statistics on both pre- and post-policy traffic per class map, plus traffic drops; I'll get into drops in a minute, but to reiterate the earlier portion of this, drops versus queuing is really, really important, and pre- and post-policy class-map statistics are really the only way to see how that's working. Policy validation matters too: are the right conversations getting the priority you expect, especially with the increasing reliance on highly compressed data like video and ever more critical telepresence and voice. Make sure your tools have access to all of the Cisco MIBs needed to collect the CBQoS and QoS policy data, and fortunately NetFlow also carries a lot of QoS information per conversation.

The way we approach this is with our SolarWinds NetFlow Traffic Analyzer product. For those of you unfamiliar with it, it operates on two levels. One is taking advantage of Cisco's NetFlow protocol: it has a collector that receives just about any version of flow you can think of and is really easily configurable. In addition, it uses SNMP polling to collect CBQoS data from the CBQoS MIB; if you've ever tried to collect CBQoS data manually at a command line, it can be a real pain, so automating that is important. Finally, tying all of that together into a single view and correlating those two data sets is the key to the web interface. Main features: it's an add-on to our Network Performance Monitor product, and while most of our customers are primarily Cisco shops, we also support other standards out of the box, like sFlow, IPFIX, and J-Flow, to gather all the different traffic metrics. Another recent addition is sampled NetFlow, so on 10G networks and above, where you don't want to firehose your collector with extremely high flow volumes, you can still do, say, 95th-percentile reporting really easily.
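For context, a minimal sketch of what enabling traditional NetFlow export and read-only SNMP access on a Cisco IOS router might look like, so that a collector such as NTA can receive flows and poll the CBQoS MIB; the collector address, port, community string, and interface are placeholders:

    ! Export NetFlow v9 records to the collector
    ip flow-export version 9
    ip flow-export destination 192.0.2.10 2055
    !
    interface GigabitEthernet0/1
     ip flow ingress
    !
    ! Read-only SNMP access so the collector can poll CISCO-CLASS-BASED-QOS-MIB
    snmp-server community example-ro RO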
And of course it uses SNMP as well, not only for performance reporting on the CBQoS policies but also for all the performance data on the devices supporting those QoS policies. If you're doing a lot of queuing, for example, it has an impact on CPU and memory, and it's nice to monitor all of that in one place.

The big news with NTA 4.0 is that we've completely overhauled the way we store metrics. Before, there was a semi-hard limit of around 10,000 flows per second per collector, because we were using a relational database on the back end to store NetFlow data, and we realized that customers who were getting pretty big wanted to support a whole lot more flows than that. Beyond that, keeping one-minute granularity for as long as you're willing to provision storage is really important, especially when you're trying to correlate performance metrics: say the NetFlow data shows something really spiky that happened three days ago and matches a helpdesk ticket, and you want to line that up against the interface traffic metrics for the same link without the data being averaged out into hourly or larger chunks. So out of the box it's delivering about 5x more processing and much better performance in the web UI, and it also speeds up your other Orion platform applications, because the NetFlow history data has been moved out of SQL Server, which also means you don't need a license and dedicated SQL hardware to store it. In many cases traffic data represents as much as 75 to 80 percent of a typical user's database, so that's a huge upgrade in the latest version, and there are a bunch of videos online I definitely recommend you check out.

The QoS reporting is really handy: in addition to pulling the basic policies applied to an interface, we now support nested policy views and direction, so you can look at both inbound and outbound, and as I mentioned you get pre- and post-policy class-map tracking; I'll show you that in a chart in a second. Seeing that on a class-by-class basis for each policy, including nested policies, means the time you spend creating these really nuanced CBQoS class-map policies pays off: you can confirm they are in fact working the way you expect. You also get great detail on dropped traffic, per interface and per QoS class, and again the main goal is to validate that your QoS strategy is actually working the way you expect. I think most of you have probably already seen the interface; you can also see this live at our demo at oriondemo.solarwinds.com, or just visit the NTA page on solarwinds.com, where there's a link that will take you to it. In this view what I wanted to demonstrate is a single policy with four classes: streaming, low-delay, class-default, and best effort. On the left-hand side you can see the pre-policy and on the right-hand side the post-policy class-map traffic moving through that interface based on those policies. And then there's our CBQoS drop report, which is really
handy: if you suddenly discover you're dropping all of your executives' WAN telepresence traffic with hard drops instead of queuing it, that can keep you out of an awful lot of trouble. And then finally, reporting within NTA: we support Flexible NetFlow as well as v5, which includes the ToS information for each IP conversation, and using that along with the DSCP field reporting can really help identify mis-marked IP traffic. I think that's something Ken was talking about earlier; it's really important to make sure traffic is marked correctly, and this is a great way to figure out where the mismatches are actually occurring in the field. And of course you can report on the essentials, protocols, applications, endpoints, and conversations, relative to each ToS.

Oh, a great question just popped in: does NTA 4.0 cost more if we're already licensed? The answer, typical for SolarWinds, is absolutely not; as long as you're on maintenance you get all of that capability for free, and it will actually decrease your overall operating cost for NetFlow, because, for example, instead of storing 15 minutes of detail out of the box you can store 30 days of detailed data out of the box. So definitely go download the free, fully functional 30-day trial if you don't already have it, at solarwinds.com/nta, or just google SolarWinds NTA and you'll get right to it. There are a number of videos, and the links included here will be in the attached slide deck so you can go right to them, and if you're on the website looking at QoS, there's a whole page describing how NTA supports QoS in the product. And of course, if you're not already a member of our community of more than 150,000 folks on thwack, I'd definitely recommend it; I spend a lot of time on our NTA forum, and even if you're just getting started, there are folks out there who have become real gurus with NetFlow and with QoS strategy on real live networks day to day, who can help you with questions and who have a lot of interesting stories you can read. If you have any questions, don't hesitate to hit us up on the Head Geeks Twitter account, and of course you can always tweet @solarwinds.

OK, we're going to go over some questions; Ken, I'm going to take them off the list here. One of them was: the RFC states that videoconferencing should be in a non-EF queue, so what about the audio in that videoconference?

Ken: OK, I can see the note. So the idea is that the RFC is really trying to clean up the EF queue, but if you want to segment the audio and video pieces you can; in Call Manager, for instance, you can mark the voice piece of the video call EF and the standard video AF41, or mark both the same as AF41, which is the default. A lot of customers do separate them, because if you have a video call and the video gets a little jaggy but the voice is solid, the call is usually still acceptable: as long as the voice is protected, you can deal with a bit of pixelation. So it's OK to do that, but again, it depends;
it really does depend on the business requirement, and we've had customers do it both ways. By default, though, Cisco marks AF41 for general video calls so the audio and video are aligned, in the same queue and the same class, and you don't run into lip-sync issues and the other things you may see if you separate the two streams.

Patrick: OK, there was another one here that I liked: what happens if you don't have CDP coming out of the device; how do you implement a trust boundary? The example they gave was a Polycom phone.

Ken: Right. In the absence of CDP, and CDP is a pretty lightweight protocol, not incredibly secure as we all know, the reality is that if the device doesn't support CDP you just apply a standard QoS policy on the physical port, and that's pretty simple to do; it's the way we did it before conditional trust existed. Apply a simple quality-of-service policy on the physical port and you're pretty much good to go: you can match the traffic based on an ACL, a port range, or DSCP values, depending on what the end device can actually send.

Patrick: OK, next: EIGRP neighbor traffic.

Ken: In general, on a router, we have something called pak_priority, and locally generated routing-protocol and control traffic, EIGRP included, is usually marked CS6 internally. If you can classify on CS6, classify it and allocate it to your network control queue. You can also classify EIGRP itself with a standard ACL. Because that traffic is locally generated it should already be protected by pak_priority, depending on your router platform, but if you're seeing loss, you should classify it and allocate it to its own network control queue, and that's what a lot of customers do, especially with BGP, where you do have to classify the traffic specifically.
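As a small illustration of protecting routing-protocol traffic in its own network-control class, here is a sketch; the CS6 match and the 5% figure are illustrative choices, not values prescribed in the answer:

    class-map match-all NETWORK-CONTROL
     match dscp cs6
    !
    policy-map WAN-EDGE-QUEUING
     class NETWORK-CONTROL
      ! Reserve a small slice so EIGRP/BGP hellos and keepalives survive congestion
      bandwidth percent 5
     class class-default
      fair-queue
    !
    interface Serial0/0/0
     service-policy output WAN-EDGE-QUEUING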
Patrick: There was a question here that I want to expand on and make a little more general. A lot of times we rely on AutoQoS, but we know that customizing a QoS strategy can pay real benefits. So what's a typical return on the investment in QoS? It obviously takes a while to implement, it takes discipline, the gear has to be compatible, and you need to know what you're doing, so you're going to put time into it; what's the comparative benefit once you've taken that time?

Ken: Well, do you trust your police department? It's hard to justify with a number, to be honest. What I'm trying to say is that with quality of service you have a guarantee: at the end of the day, when you call 911, the police show up. With quality of service you have a reasonable guarantee that your applications will get through. It doesn't mean you can't shoot yourself in the foot, but it means that if a virus breaks out, or somebody kicks off a backup in the middle of the day, then because voice is in the priority queue and that other traffic landed in best effort, even under congestion your voice calls are still going to go through. If you don't enable quality of service, there is no guarantee; that's the whole point of quality of service. So as far as ROI, it's hard to attach a number, but if you don't implement it, it will bite you. With voice and video these days, all these immersive applications, plus Netflix and all sorts of sharing programs, it's very difficult to determine what's using your bandwidth without QoS, and very difficult to separate your business traffic from your recreational traffic. I'd say it's imperative, but it's hard to give a number for the ROI.

Patrick: Right, and it requires a lot of discipline. One thing we find with our customers is that it always seems to be Friday at about three o'clock in the afternoon, when something really critical is going on, that you hit the tipping point: you've been watching the instrumentation for a long time, thinking there are things you could do to really improve service on the network and you just need to make time to sit down and take a proactive approach to QoS management, and then you roll past that point and all of a sudden you have a lot of conflicts. It usually doesn't get a little bit worse in one area first; it gets bad all at once. There's a question about Metro Ethernet with a sub-rate handoff: if you're policing to limit traffic to 20 Meg so the carrier doesn't drop packets, will the QoS policies be based on a percentage of the maximum bandwidth allowed by the policer?

Ken: I'm not sure I'm reading it exactly right, but: if you have a Metro link and you police, you won't be able to do class-based weighted fair queuing directly under the policer; in other words, you can't allocate percentages for voice, video, and so on under that police rate. If instead you have a shaper at the parent, the shaper buffers, which means you're allowing queuing, which means in the child of that policy you have class-based weighted fair queuing, and you can say a percentage goes to voice and a percentage to video. That's definitely the way you want to do it. From the question, they're policing to limit the rate, but even if you police the traffic as it leaves your box, it's dropped indiscriminately above the CIR: anything over that 20 meg gets dropped whether it's voice, video, or data; it doesn't differentiate. You definitely want a shaper there.

Patrick: OK, let's do one more question, wrap this up, and get to the drawing. This is a general question building on something one of the attendees asked; the specific question was, do multiple versions of QoS work with one another, but I think the bigger question is that QoS complexity has grown over the last few years, and there seems to be a lot of work toward bringing it all together and simplifying QoS management across the enterprise network.

Ken: Yeah, definitely. In recent years we've been leading efforts to really align quality of service
in general, because, as I said, we've essentially done away with MLS. We recognized that MLS, on a per-platform basis, really exposed the hardware to the customer, which made it more difficult to understand QoS end to end: on the 3750 we have SRR, on the 4K we use DBL instead of weighted RED, and so on. The reason was that we wanted to expose the best hardware we could at the time, but in doing so, QoS became increasingly complex because it was so tied to the hardware. If you upgrade from one version of IOS to the next, you'll generally keep the same QoS feature set and maybe pick up some new features, like any other capability, but if you move from platform to platform it may be different. So in recent years we've really tried to do away with that. We have MQC, the class-map and policy-map structure, a unified provisioning language now provided across all the platforms at Cisco: the 3850, the 4K with Sup 6, the Sup 2T on the 6K, Nexus, even XR platforms like the CRS all use MQC. At least the provisioning is the same, but because QoS is still tied to hardware you're going to have different capabilities, and the reason is that you genuinely need different capabilities on a CRS than you do on a 3750 at the access edge. So MQC is there to help people understand how to provision quality of service end to end, and it helps because at least the structure is consistent, but the features and functions will still vary by platform, because different places in the network require different things. That's a really good question.

Patrick: That's awesome. Well, that's going to wrap us up. Thanks again for attending; I'm Patrick Hubbard, and have a great afternoon.
Info
Channel: solarwindsinc
Views: 74,396
Keywords: Cisco QoS, SolarWinds (Organization), NTA, networking, best practices, QoS, Quality of Service, Cisco QOS, NetFlow, bandwidth monitoring, network engineer, NetFlow Traffic Analyzer, enterprise network
Id: xePZcobaJUY
Length: 62min 45sec (3765 seconds)
Published: Mon Dec 16 2013