Tutorial: Troubleshooting with Traceroute

So, good morning everyone. Who else is as hungover as I am? My name is Richard Steenbergen, and I'm doing a tutorial, a practical guide to how to correctly troubleshoot with traceroute.

I think most people know what traceroute is: it's the number one go-to tool for doing all kinds of diagnostics. Every OS out there comes with some form of traceroute tool, there are thousands of websites where you can run a traceroute, there are looking glasses, visual traceroutes, things that will try to make pretty little maps out of it, commercial versions, free versions. It seems like it would be a simple tool to use: you put in an IP address, you see a path, and somehow that helps you with your diagnostics. You look at the path, you see where the traceroute stops or where the latency jumps, and that must be the problem, right? Most people think that's how this works and that it will immediately help them figure things out, and the reality is it almost never works that way in practice.

So the question is, what's wrong with traceroute? Well, it turns out most modern networks are actually run pretty decently, so the really simple issues, congestion, routing loops, the things you would think are most obvious (you run a traceroute, you see the latency go up, and you say "that's where it is") tend not to be the real issues anymore. What you mostly see are the remaining issues, the things that are more complex, and there a very naive reading of traceroute turns out mostly to be wrong. The problem is that very few people out there are actually terribly skilled at interpreting traceroute. It looks relatively straightforward and gives the impression that you know how to read it, but I have yet to find any major ISP NOC that is terribly good at it, and I still to this day encounter many engineering departments, many people who are in theory very senior engineers, who aren't able to read it well. So the purpose of this tutorial is to go through some of the realities behind it: how it actually works, how you can read it better, and how you can troubleshoot networks with it. One of the downsides of people not being able to read traceroute well is that most of the time, if you go to a NOC and open a traceroute ticket, most people don't believe you, because there are so many false positive reports; any way to improve that is better.

Here's traceroute at its most basic. All the implementations vary a little, but it mostly looks something like this: you get a bunch of lines of output, and on each line you get the hop number, the hop DNS (in theory, what should be a router along the way; really you get the IP address and then do reverse DNS on it to get the name), and some type of latency measurement. Typically, most standard traceroutes do three independent probes and come back with three independent results. So here's an example where, for hop one, this IP resolves to this DNS name and here are the three independent probes. But what are you really getting when you do that?

Let's look at how traceroute actually works at the packet level. The very first step is to launch a packet towards the destination with a TTL value of one. Every time a router receives a packet and forwards it, it decrements the TTL value.
Once the TTL gets to zero, the router drops the packet and returns an ICMP message; that's step two. Step three: the TTL exceeded message goes back to the original sender, the source. The source receives that ICMP and it knows, "I sent this thing out 50 milliseconds ago and I just got my ICMP back," so it now has a measurement of 50 milliseconds: the round-trip time for how long it took to send the probe out plus how long it took to get the ICMP message back. Then it goes all the way back to step one, but now it starts with a TTL of two, so the packet makes it further into the network; it makes it to hop two before it gets dropped. Rinse, repeat. There's a little diagram that shows basically that: you're sending out one more TTL at a time, the packet is making it one step further into the network before being dropped, and you're getting an ICMP message back, seeing that response, and using it to display your result.

Like I said before, traceroute does multiple probes per hop, and for the vast majority of implementations it's three. That means three independent probes: it sends packet one, which goes all the way out, has its TTL decremented to zero, gets dropped, and returns a result; then it sends another packet, and it does this three times before it increments the TTL. Each probe packet uses a unique key to distinguish itself. How do you know the difference between probes one, two, and three? If you look at most traceroute implementations (the UNIX systems, Junipers, anything based off those) it's a UDP packet using the destination port to distinguish itself: the UDP port goes up by one for each probe, and that's how it knows "this is probe two, this is probe three" and how to display the results correctly. There are other implementations out there that use other things: Windows, famously, uses ICMP, and there are a lot of tools that use TCP, not so much because those are better, but sometimes they work around firewalls or people who just randomly block UDP. Ultimately, the thing to remember is that each probe packet is a completely independent trial. It has nothing to do with the probe packet that came before it, the results are different, the path it took over the internet can be very different, and what you're seeing for each probe is only the last hop where that packet got dropped. I'll get into that in more detail later on.
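To make the mechanics concrete, here is a minimal sketch of that probe loop in Python, using raw sockets (so it needs root), with one UDP probe per TTL and the RTT taken as the difference between the send timestamp and the arrival of the ICMP reply. The destination name and base port are placeholders; a real traceroute also varies the destination port per probe, sends multiple probes per hop, and handles errors far more carefully.

```python
# Minimal traceroute sketch: UDP probes with increasing TTL, ICMP replies read
# from a raw socket. Illustrative only; assumes IPv4 and root privileges.
import socket
import time

def traceroute(dest_name, max_hops=30, base_port=33434, timeout=2.0):
    dest_addr = socket.gethostbyname(dest_name)
    for ttl in range(1, max_hops + 1):
        # Raw socket to catch the ICMP "time exceeded" / "port unreachable" reply.
        recv_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                                  socket.getprotobyname("icmp"))
        recv_sock.settimeout(timeout)
        recv_sock.bind(("", base_port))
        # UDP socket for the probe itself, with the TTL set in the IP header.
        send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        send_sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)

        hop_addr = None
        sent_at = time.monotonic()
        send_sock.sendto(b"", (dest_addr, base_port))        # launch the probe
        try:
            _, (hop_addr, _) = recv_sock.recvfrom(512)       # ICMP comes back
            rtt_ms = (time.monotonic() - sent_at) * 1000.0   # full round trip
            print(f"{ttl:2d}  {hop_addr:15s}  {rtt_ms:.1f} ms")
        except socket.timeout:
            print(f"{ttl:2d}  *")
        finally:
            send_sock.close()
            recv_sock.close()
        if hop_addr == dest_addr:                            # reached the end
            break

if __name__ == "__main__":
    traceroute("example.com")   # placeholder destination
```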
So, remember, how do you actually see traceroute latency? This is really important when you go to figure out the source of latency. The latency calculation is nothing more than this: you take a timestamp when you launch the probe, you take a timestamp when you get the ICMP back, and you take the difference. That's your time. Something people get confused by is thinking that the routers along the path are doing some kind of timing or processing. They're not putting a timestamp in the packet, they're not doing anything special; they're just dropping the packet, a normal function of the router when the TTL is exceeded, and sending back an ICMP message. So there are three things that factor into traceroute latency, and the number you see is always the sum of them: the time it takes for the forward probe to get to the hop where it gets dropped, the time it takes for the router to generate the ICMP (which turns out to be measurable), and the time it takes for that ICMP packet to get all the way back to the sender. It's a full round trip, and that's very important later on.

Here are some details on the hops you're seeing. One important thing to remember about traceroute is that it shows you the ingress hop. Here's an example where the first packet goes to router one, which has an ingress interface, say 172.16.2.1. When it sends its traceroute result back, it shows you that ingress interface: the packet came in over this. When router one drops it, it does not show you the path it took to return the ICMP TTL exceeded, which is in no way guaranteed to be the same path the probe came in on, and it does not show you the egress path. Then when router two gets it, it's receiving the packet over its own interface, so it shows you 10.3.2.2, and it's up to you to figure out the egress interface from router one. Everything you're seeing is each router's ingress interface: how the probe got to that router before it returned an ICMP. An interesting factoid: that's actually not what the RFC standard says to do. RFC 1812, which defines this, says the ICMP must be sourced from the egress interface, and if that were followed, traceroute wouldn't work; it would be completely unintelligible for troubleshooting. So that's one of those cases where people disobey the standard, everyone continues to do it, and it has obvious benefits.

Now we get into one of the first things you can do to really figure out traceroute: how to interpret the DNS data that's been put in there. This is how you do better troubleshooting as you read it. One does not read traceroute by IP alone. Almost every operator, for their own sanity as well as for people on the internet, is putting DNS in there, and the more you can interpret out of it, the better diagnostics you can do. Typically you're going to find some type of geographic location: what POP it is, what city it's in, what region, things along those lines. In a lot of cases you can pull out the router types and capabilities: what type of box it is, whether it's a core router or a really low-end edge router. In a lot of networks with different hierarchies you can figure out the type and role; there will be border routers, core routers, and those have different properties. But the most important thing you're looking for when working with traceroute is identifying the network boundaries, and I'll get into that more later. Any time the packet leaves network A and goes to network B, that is (a) a point where problems potentially occur, and (b) a point where traceroute interpretation problems potentially occur.

Interpreting locations: it helps if you know the geographical location, because that's the first step to identifying suboptimal routing. If you see a packet go from San Jose to Baltimore and back to San Jose, that's probably wrong, and if you see a latency of 100 milliseconds, that might explain it; whereas if you saw a latency of 100 milliseconds and it went San Jose to San Jose to San Jose, there might be something else you're looking for. That's why you really want to be able to interpret location.
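Since all of that interpretation hangs off the DNS names, the first mechanical step is simply doing the PTR lookups on the hop addresses, the same thing a traceroute display does for you. A small sketch; the addresses in the list are placeholders, not from the talk:

```python
# Annotate a list of hop IPs with their reverse DNS (PTR) names, the way a
# traceroute display would. The addresses below are placeholders.
import socket

def reverse_dns(ip):
    try:
        name, _, _ = socket.gethostbyaddr(ip)
        return name
    except socket.herror:
        return ip  # no PTR record; fall back to the bare address

hops = ["4.69.134.149", "144.232.25.77"]   # hypothetical hop IPs
for n, ip in enumerate(hops, start=1):
    print(f"{n:2d}  {reverse_dns(ip)}  ({ip})")
```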
The most commonly used location identifiers: a lot of people do it with airport codes, the IATA and ICAO airport codes, basically a nice, globally unique list of codes that mostly identify major metropolitan regions. It gets a little confusing once you start talking about cities with multiple airports, or smaller cities that don't have airports, but as a way to identify a major region that matters, one that's likely to be a hub of internet connectivity, it tends to work relatively well. The other major identifier is called a CLLI code; I'll get into that in more detail. And sometimes people will just make up something they think sounds good, they'll say "WASH" for Washington, or just take a guess.

The airport codes, like I said, have good international coverage for almost every major large city, and you'll find lots of good examples out there; I'd say maybe 75% of networks do something like this, give or take. Examples: SDQ, Santo Domingo; SJC, San Jose, California. You can go look these all up. Sometimes there are pseudo airport codes, and this is where it starts getting tricky, for things like New York City, which is served by JFK, LaGuardia, and Newark. Some people will actually use those to mean distinct things ("this is our Newark POP versus our New York City POP"), and some people will just use a code that represents all of them, like NYC. At any rate, you understand that when you see NYC, it's New York City. Northern Virginia is a similar thing: IAD, also served by DCA, and some people just do WDC or WAS, things like that.

The CLLI codes look like this: they're full codes that identify very specific telco things. The actual full code looks something like this example, and it tells you down to a facility and a switch in a rack. It's mostly used in voice networks, it's maintained and sold by Telcordia, and you'll see a lot of ILECs and people like that use it in a telco role. Most people just strip off everything else and look only at the first six characters, the geographical part, which is basically just the city and the state. Again, you can Google all these and figure them out: HSTNTX, Houston, Texas; it's pretty obvious. This is a US standard, so you've got perfect coverage for every US city that matters, including many that don't.
You'll typically see it in a US network with a lot of cities, where you need actual distinction between Baltimore and, say, the five suburbs of Baltimore. But it's not an actual standard outside of North America, so in some cases people just fudge it: they'll make up what would be a CLLI code, but it's not a standard you can reliably Google; with six letters you're still kind of able to figure out Amsterdam, Netherlands, things like that. Then there are arbitrary values. Sometimes people just make stuff up, and this is probably a huge percentage. Take Chicago: the airport codes are O'Hare, ORD, and Midway, MDW, and some people say "I don't know what that means, I'm just going to call it CHI." Toronto: a lot of people don't know YYZ is its airport code (or YTZ, the other airport there), so they make up something like TOR. They have the best intentions when they make these things up; they try to pick things that look right, make sense in English, and that most people should be able to interpret, and for the most part you can, but it also gets into situations where there's no actual standard and someone else uses TRN or something like that. So if you're naming your routers, I prefer to stick with something that has an actual convention you've mapped to.

I include a table of some of the most common major US cities, as far as the internet goes, and the airport codes that represent them (you'll see a lot of those), the CLLI codes, which some of the more telco-like folks are big users of, and some of the other codes. Between all of these, they're very representative of the types of things you'll see. Once you get into San Jose, some people will start to do things like SV, Silicon Valley, to represent San Jose, Palo Alto, Santa Clara, etc., lots of little cities within one region. The international version looks like this: the most important cities as far as the internet goes, their representative airport codes, the CLLI codes, and some of the other codes people commonly use to represent them. Good to refer back to if you're looking at stuff.

The next thing you want to get out of DNS is the interface types. The way most networks do this is they put their full interface info into DNS, again mostly to help themselves troubleshoot, not so much for people on the internet, but if you're smart and know how to interpret it, it can help you know who to complain to correctly, and if you direct your complaint to the right place first, you'll be a lot better off. You can pull a lot of stuff out of it. Here's an example, a Level 3 hop: xe-11-1-0.edge1.NewYork1. Obviously it's in New York City, and I've learned the role of the router: it's an edge box, which is where they terminate some of their peering circuits. You can also start to look at the naming format. "xe-" is the Juniper convention for 10GE, and (this changes over time as new line card models come out, but at the time this was written) you could say it's the eleventh slot counting from zero, so actually the twelfth slot, in a Juniper device doing 10GE, and you can tell from the numbering that it has at least that many ports. You could actually deduce that, back at the time, this was a Juniper DPC with a 4-port 10GE configuration; no other device could fit that naming profile.
So I include a table of the most common interface types and how they typically get represented. For good networks, a lot of automation comes into this: SNMP shortens things to the short name, it turns GigabitEthernet into "Gi", and you'll be able to see those patterns. If you see a "Gi", then depending on the number of fields you can tell whether it's IOS or IOS XR; if it's "ge-", that's the Juniper GigE. So you can refer to a table like this and really get an understanding of what the device type is, and later on, what the behavior is for that model, especially as it affects traceroute, because, as I'll get into in more detail, every router behaves a little differently when it comes to how it drops the packet and sends back the ICMP.

The other thing I said you can get out of DNS is the router types and roles. Every network has a different standard for what's a core, what's an edge, what's a peering box, what's a customer aggregation box, but most people mostly mean the same things, and you can get an understanding of what those devices are and, again, how they're going to behave. Your typical core routers will be named things like CR, core, GBR, backbone, CCR, EBR; those are examples from prominent networks out there, and you can get the idea that it's a core device. A lot of networks do their peering on dedicated edge devices: BR, border routers, borders, edges, IR, interconnection routers, IGR, gateway routers. People come up with names for all these things, but you can look at them and figure out "this is a peering router." The other one to look for is a customer router: an AR, an aggregation box, an agg, a customer box, a CAR (customer aggregation router), HSA (high-speed aggregation). Everyone comes up with their own little thing, but those are the most common ones for large networks.

The most important thing you can get out of this is identifying the network boundary. The reason it's important is, first, that's where the routing policy changes occur: if it's on my network and I prefer Level 3, and I hand it off to this other guy who prefers Sprint for the return path, then the return path changes the instant it leaves my network. By knowing where the network boundary is, you know how the return path was affected, how the routing was affected, and how that's going to affect the traceroute. It also tends to be where capacity and routing are difficult: one of us may be paying, one of us may be waiting on the other to upgrade a peering circuit, and it's typically easier to fix yourself-to-yourself than yourself-to-some-third-party. Then there's identifying the relationship: you want to figure out whether it's a transit relationship, where one party is a provider to the other; a peer (it doesn't matter whether it's settlement free or not, but whether it's routed as a peer, where only customer routes are advertised to each other); or a customer. Especially on the customer side, a lot of networks are very clear about where their customers are: they'll put things like "customer" or "gw", or they'll put an AS number in the boundary, and that's really helpful. I'll go into the naming in a bit more detail.
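Here is a sketch of the kind of name-reading just described: pulling an interface type, a rough role, and a location token out of a router's DNS name, and flagging the point where the domain changes as a network boundary. The lookup tables are tiny illustrative samples (real conventions vary per network), and the second hostname is made up:

```python
# Sketch of "reading" a router DNS name: interface type, role, location, and a
# boundary flag when the owning domain changes. Tables are illustrative only.
import re

IFACE_TYPES = {"xe": "Juniper 10GE", "ge": "Juniper GigE", "et": "Juniper 40/100GE",
               "ae": "Juniper aggregated Ethernet", "gi": "Cisco GigE", "te": "Cisco 10GE"}
ROLES = {"cr": "core", "br": "border/peering", "edge": "edge/peering",
         "ar": "aggregation/customer", "gw": "customer gateway"}
LOCATIONS = {"newyork": "New York", "nyc": "New York", "iad": "Washington DC area",
             "ord": "Chicago", "sjc": "San Jose", "hstntx": "Houston TX (CLLI)"}

def read_hop(hostname):
    labels = hostname.lower().split(".")
    domain = ".".join(labels[-2:])                      # whose network this router is in
    iface = next((v for k, v in IFACE_TYPES.items()
                  if re.match(rf"{k}-\d", labels[0])), "unknown interface")
    role = next((v for k, v in ROLES.items()
                 if any(l.startswith(k) for l in labels[1:3])), "unknown role")
    loc = next((v for k, v in LOCATIONS.items()
                if any(k in l for l in labels)), "unknown location")
    return domain, iface, role, loc

prev_domain = None
for host in ["xe-11-1-0.edge1.newyork1.level3.net",    # example hop from the talk
             "ae-2.r21.nycmny01.us.bb.example.net"]:   # hypothetical next hop
    domain, iface, role, loc = read_hop(host)
    note = "  <-- network boundary" if prev_domain and domain != prev_domain else ""
    print(f"{host}: {iface}, {role}, {loc}{note}")
    prev_domain = domain
```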
That's really helpful, because most customer interfaces are numbered out of the IP space of their provider, and probably 80% of the bogus traceroute complaints I see are people sending a report to someone who can do nothing about it: the packet has already left their network and gone to someone else, but the IP space came out of their network, and that's how they got found.

Here's an example where it's really easy to spot the DNS change (and yes, I'm still using old slides that still have Global Crossing on them). You can see this hop was on device ar3.dca3 on Global Crossing, and then the next hop is on Sprint. The other thing you can do is look for the remote party's name. In the next example you see "cogent" on a Global Crossing peer, but it still says gblx.net. What happened there is that when the two parties got together to set up an interconnection, one of them did the /30, in this case Global Crossing, and they didn't take the time to ask, "hey, Cogent, what do you want your full DNS to look like for your router?"; they just threw "cogent" into their own naming scheme. So look for that third-party name on the /30; that's how you can tell. The other thing you want to do is look up the other side of the /30. You do a traceroute, you see hop five is .90, you do simple bit math and know that's in .88/30, so the other side is .89. Do a lookup on .89 and you can see the other side, the Global Crossing interface on ar5, which matches up with the ingress interface you see on hop 4.

Now we get to the most fun part of traceroute, in my opinion: understanding network latency. There are three main causes that contribute to network-induced latency, the things you actually see performance-wise on a packet, that will potentially impact performance and will definitely impact traceroute. The first is serialization delay: the delay caused by the packet moving across the network in packet-sized chunks as you encode it, transmit it across the wire, and pull it back out. The second is queuing delay: the delay from the packet sitting on a router waiting for an opportunity to be transmitted. And the third is propagation delay, which is purely the speed of light: how long it takes the signal to flow through the wire and come out the other end.

Serialization delay, like I said, is the process of encoding the data into chunks. If I send a 1,500-byte packet, you can't start transmitting the packet's first byte until you've received its last byte, so there will be a delay as you chunk it, and the way the math works, the faster the interface, the quicker the process occurs; it's basically nothing more than converting to bits and dividing, or converting to bytes and dividing. If you have a one-megabit link, that's actually 125 kilobytes per second, and you're sending a 1,500-byte packet: do the math and it's 0.012 seconds, or twelve milliseconds, of serialization delay across that one-megabit link. Like I said, the packet moves as an atomic unit, and mostly this matters on low-speed devices. Here's a table that shows serialization delay at various speeds. Fortunately, the internet essentially transmits almost all 1,500-byte packets; that was the max payload size for Ethernet, and no one has really gotten above it. There are jumbo frames out there, but they're not widely used and definitely not widely supported. So what you've seen is exponential increases in speed with the packet size staying the same, which means the serialization delay for modern high-speed networks has gone down to 0.000-doesn't-matter milliseconds. But if you're transmitting this over someone's one-megabit DSL and you're wondering what the 12-millisecond delay is, that's an important thing to keep in mind. Remember also that traceroute uses small packets, something like 64 bytes, give or take, depending on the implementation, so you may see different levels of performance, different results.
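A quick worked version of that serialization math; the interface speeds in the list are just representative figures:

```python
# Serialization delay: time to clock a packet's bits onto the wire.
# Reproduces the worked example: 1,500 bytes over a 1 Mbit/s link ~= 12 ms.
def serialization_delay_ms(packet_bytes, link_bits_per_sec):
    return packet_bytes * 8 / link_bits_per_sec * 1000.0

for rate_name, bps in [("1 Mbps DSL", 1e6), ("100 Mbps FE", 1e8),
                       ("1 GE", 1e9), ("10 GE", 1e10)]:
    full = serialization_delay_ms(1500, bps)   # full-size packet
    probe = serialization_delay_ms(64, bps)    # small traceroute probe
    print(f"{rate_name:12s} 1500B: {full:8.4f} ms   64B probe: {probe:8.4f} ms")
```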
The next big one is queuing delay. That's any time the router or switch is holding the packet in its memory, waiting for an opportunity to transmit it, and it's an important one to point out because I think a lot of people don't understand how queuing works and why, and it's very important when you start dealing with congested links and interpreting traceroute. First, a quick word about utilization: when someone says "I'm doing five gigs on a 10-gig port, I'm 50 percent utilized," that's not actually true. What they're saying is that it's 50 percent utilized over some period of time, over one second or five minutes or whatever the average on their counter is. At any given moment, an interface is either transmitting (100% utilized) or not transmitting (not utilized). As a packet comes in and wants to go out an interface, the question is whether that interface is in use or not: if it's not in use, the packet can be sent immediately, no queuing needed; if it is in use, if it's still transmitting the previous packet, then the router has to hold on to it. That's just a basic function of routers, so some queuing is going to be necessary all the time, for every purpose. Sometimes it gets to be a little excessive, but I'll get into that in more detail later.

So when is queuing a good thing? When you have mismatched interface speeds. If you have, for example, a 10GE customer feeding data up to a one-gig transit link or peer, then even if you're only transmitting 500 megabits, even if by all means the data should fit in the pipe, what actually happens is that because the 10-gig link is faster, the data serializes faster, the packets arrive faster than they can go out the one-gig link, and so they must be buffered. And that's the purpose of queuing: queuing technically always increases throughput, because the longer you hold on to a packet, the more opportunity you have to transmit it. What happens as a link gets closer and closer to full, say 85 or 90 percent full on a one-second average, is that the amount of time the interface spends transmitting becomes very, very high, so packets have to get queued a lot. You might have a packet that's perfectly capable of being transmitted, it just has to be held for an extra 5 milliseconds, and that gets you from 60 to 85 percent; now maybe you have to hold it for an extra 500 milliseconds to get from 90 to 95 percent. The question is whether that's worth it. Maybe, maybe not, but you need to know when it's occurring on the internet in order to troubleshoot it in a traceroute.
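To see why that "hold it a little longer" trade-off blows up as a link fills, here is a toy calculation using a textbook M/M/1 queue. This is emphatically not how any real router or real internet traffic behaves (real traffic is burstier and real buffers are deeper, which is how you get to hundreds or thousands of milliseconds); it only shows the shape of the curve as average utilization creeps toward 100%:

```python
# Toy single-queue (M/M/1) model: mean queueing delay vs. average utilization.
# Only meant to show the shape of the curve, not real router behavior.
def mean_queueing_delay_ms(utilization, packet_bytes=1500, link_bps=1e9):
    service_ms = packet_bytes * 8 / link_bps * 1000.0       # time to transmit one packet
    return utilization / (1.0 - utilization) * service_ms   # M/M/1 mean wait in queue

for rho in (0.50, 0.85, 0.90, 0.95, 0.98, 0.99):
    print(f"{rho:5.0%} utilized: ~{mean_queueing_delay_ms(rho):6.3f} ms average queueing delay")
```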
When is it a bad thing? As the interface becomes more and more full, you spend more and more time holding on to the packet. You hold it for 500 milliseconds looking for a chance to transmit it, and you very well might succeed; you might not see packet loss, but that's where you start to see every packet, or almost every packet, going through the box pick up a noticeable queuing delay. Beyond a certain point, did you really want to hold the packet for an extra 5,000 milliseconds just to get it through, to go from 98% to 99%? Probably not; you've probably impacted the application. Most people find that with mixed internet traffic, the reality is around 95 percent is the most they can get through before it starts to be really, really bad. There have been some presentations on bufferbloat, and what tends to happen is that a lot of routers, especially carrier-class routers, routers meant to be bought by carriers, have been built with very large buffers, way beyond and far in excess of what could possibly be needed in reality. So a lot of routers, by default, are more than happy to hold on to your packet for five thousand milliseconds to try and get it through, and unless you go configure that down, configure your queuing differently, you'll still see that behavior a lot on the internet.

The third form of latency is propagation delay, the time spent on the wire. The math here is pretty straightforward: it's the speed of light. If you want to know how the math works for fiber: fiber is made of glass (I have a tutorial on that too), actually two forms of glass, a core and a cladding, with different refraction indexes; the light bounces off the cladding, stays in the core, and propagates along. It's propagating through glass with a refractive index of around 1.48, which means it's traveling at about 0.67c, about two-thirds the speed of light, so roughly 200,000 kilometers per second is how fast it moves through the fiber. You can turn that into 200 kilometers, or about 125 miles, per millisecond with simple math, and then divide by two to account for the round trip; remember, traceroute is showing you forward and backward. As an example, a round trip around the equator on a perfectly straight fiber (well, the point being it would go through a lot of ocean) would take about 400 milliseconds just from the speed-of-light delay. Sometimes gamers complain and say "I want it to be faster," and you say, take it up with God.

So the question becomes: how do you identify what latency is affecting you? You start by looking at the location identifiers, you look to see whether the routing is as expected, and whether it fits within the propagation delay. Here's an example of something going from NYC, New York, to LHR, London: there's a difference of 67.6 milliseconds, and that happens to be about a 4,200-mile route, so it sounds about right. Here's another example where you see the latency shoot up: you've got WDC, that's IAD, on hop 7, and you're still in IAD, also DC, on hop 9, and yet the latency has shot up 220 milliseconds. That's probably not speed-of-light related. The other thing to remember is that no one lays their fiber out in a straight line: people tend to run their fiber along paths that go through major cities and population centers, so it will never be as good as the perfect scenario; you've got repeaters, regens, slack loops, the fiber goes through all these different places. But this at least gives you a ballpark for doing the math and seeing whether the latency you're looking at makes sense.
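The propagation math as a quick sanity-check function; the distances below are rough, illustrative figures:

```python
# Propagation delay sketch: light in fiber travels at roughly 0.67c, i.e.
# about 200 km (125 miles) per millisecond one way. Distances are rough.
SPEED_IN_FIBER_KM_PER_MS = 300_000 / 1.48 / 1000   # ~c divided by refractive index ~1.48

def min_rtt_ms(fiber_km):
    one_way = fiber_km / SPEED_IN_FIBER_KM_PER_MS
    return 2 * one_way                              # traceroute shows round trips

for label, km in [("New York to London (~5,600 km great circle)", 5600),
                  ("Around the equator (~40,000 km)", 40000)]:
    print(f"{label}: at least ~{min_rtt_ms(km):.0f} ms RTT from propagation alone")
```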
So the next big thing we get into when interpreting traceroute results is prioritization and rate limiting. Remember I said that traceroute latency is the sum of the time it takes for the probe packet to go out, the time required for the router to drop that packet and send back its ICMP, and the time for that ICMP to get all the way back to the source. Numbers 1 and 3 are real network characteristics that affect real packets, including packets that are not traceroute probes. Number 2, on the other hand, affects only traceroute packets, and it's a behavior the router typically turns out not to be very good at; I'll get into that in more detail. If there's any kind of delay in the router's ability to drop the packet, generate an ICMP, and send it out, that will make the traceroute look bad even though network performance is fine. There are a lot of conditions that cause this: you can hit rate limits on the generation of the ICMP message, causing artificial loss (you'll see loss in the middle of a traceroute and think there's a problem when there might not be), and the router can be slow at the generation, which causes artificial latency.

To understand that, we have to talk about the architecture of a router. Modern routers have distinct forwarding paths, different ways packets get handled as they go through the box. The data plane is what it's called when packets are going through the router: the router is doing its job, routing millions of packets per second from point A to point B, shipping them via the data plane. And even inside the data plane, even inside a modern, fully ASIC-enabled box doing millions and millions of packets per second, there's a fast path and a slow path. The fast path is anything that can be handled well in hardware, the ordinary shipping of packets. The slow path is anything that's an exception packet. Examples of things that cause exception packets: IP options on the packet, like trying to do source routing via IP options; anything that requires ICMP generation, which is the important one for traceroute; having the box configured to do logging; things like that. What tends to happen is the packet gets punted to a CPU to handle this stuff; it's relatively complicated and isn't as simple as "look up an IP address in a table, figure out a next-hop interface, and ship it," so even on a modern box, all of these very high-end line cards have a little CPU sitting on them doing this type of work. Then there's the control plane: the packets that are going to the router itself. These are the control protocols: your BGP, your IS-IS, your OSPF, your SNMP packets, your CLI access when you're logged onto the box, all your SSH and telnet traffic, any time you ping the router directly, and any time the router is doing ARP. Every default gateway, every .1, every time you type "ip address" whatever, that's an entry point to the control plane on the router. And a lot of router CPUs out there are horribly, horribly underpowered. It's not uncommon, to this day, to see very high-end, multi-hundred-gigabit devices with an old 600 MHz MIPS or PowerPC CPU, and typically, even on the best, highest-end box, the newest routing engine is going to be three to four years behind whatever's available for PCs, because people are building these as industrial applications, hardened against temperature, and they're slow to ship new ones.
And remember, ICMP generation is not really a priority for the router. It's a nice-to-have, but ICMP generation doesn't make the packets flow. On a lot of boxes out there, like old Cisco IOS boxes, there's an infamous process called BGP scanner: the way IOS works, every 60 seconds a little process walks down the routing table, checks that all the routes are still valid, and pulls the ones that aren't. The symptom here is that on some platforms, the slow-path data plane, the part doing the exception handling for "I just dropped the packet, now I need to send an ICMP," is shared with the control plane. So you tend to see things like: someone turns up BGP and all of a sudden you get a traceroute spike. It can even be something as simple as someone from your NOC logging into a box and typing a big, expensive command; the box is busy calculating it, and traceroute latency spikes. That's typically not the case on the higher-end boxes, which have a dedicated control plane and slow-path data plane, but even on things like 6500s, which are still hideously common, it's the same resource, the same CPU doing both, and if it's busy doing one thing, it's not doing the other. The most famous example of this is the old Cisco IOS BGP scanner: every 60 seconds you see it spike the traceroute, and you have no idea why.

The other big one is rate-limited ICMP generation. Even within slow-path data-plane operation, there are many reasons to generate ICMP besides TTL exceeded: everything from destination unreachable because the route didn't exist, to going to a host where ARP hasn't been resolved yet, to packet too big. There are dozens and dozens of reasons why a packet would get dropped or an ICMP message would need to be generated, and there's no such thing as a traceroute ASIC. It could be nice, but no one's done it; no one has put any effort into making it really fast. So you've got a general-purpose CPU generating ICMP as a non-critical function, and routers basically have two choices: either process all the ICMP as best they can, until it takes down other services, which is bad; or rate-limit it, try not to take down those services, and affect traceroute. Remember, it's actually relatively easy to cause a TTL-zero denial of service. Something as simple as a routing loop: you turn up a customer, you assign them a /24 and route it to them, they haven't actually installed it on their router yet, and they have a default pointing back to you. You've just created a routing loop. Some worm on the internet sends one packet, it loops between the two of you until the TTL runs out, and every time that TTL hits zero it's generating ICMP. If there weren't a rate limit, this would be a big denial-of-service vector for routers. So most routers put an artificial rate limit on this, and the reality is the limits vary wildly: not only by vendor but by platform, by software revision, by individual line cards, by line card model. In a lot of cases they're not easily configured or logged; you have no idea if you're hitting them, and you have no idea if you've hit them on FPC 7 while FPC 8 is fine. And all it takes is one of the common tools that runs repetitive traceroutes over and over, MTR, and somebody who's left it running for six hours, sitting there sending traceroute packets over and over. You get a couple of users doing that and all of a sudden you start hitting some of these baked-in limits.

So you need to figure out how to tell the difference between cosmetic loss and latency, the things that are purely traceroute-specific, and actual forwarding issues. The secret is this: if it's an actual forwarding issue, the loss or the latency will persist across all future hops. Here's an example: at hop two you see 18 milliseconds, 80 milliseconds, 60 milliseconds. Oh no, the world's ending! Well, actually no, because it goes away at hop three. That means it was purely a function of the ICMP generation. If it were affecting all packets, you would see at least 60 milliseconds or more persist all the way through the following hops.
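That "does it persist downstream?" rule is easy to mechanize. Here is a rough sketch; the per-hop numbers and the 30 ms threshold are made up for illustration, and it only flags candidates rather than proving anything:

```python
# Sketch of the rule: a latency spike (or loss) at hop N is only "real" if it
# persists at the hops beyond it. The RTT list below is made up.
def classify_hops(rtts_by_hop, spike_ms=30.0):
    """rtts_by_hop: per-hop best RTTs in ms (None = no response at that hop)."""
    findings = []
    for i, rtt in enumerate(rtts_by_hop[:-1]):
        later = [r for r in rtts_by_hop[i + 1:] if r is not None]
        if not later:
            continue                       # nothing downstream responded; can't judge
        if rtt is None:
            findings.append((i + 1, "no response, but later hops answer: "
                                    "probably just ICMP rate limiting (cosmetic)"))
        elif rtt > min(later) + spike_ms:
            findings.append((i + 1, "spike not carried by later hops: likely ICMP "
                                    "generation delay or an asymmetric return path"))
    return findings

example = [1.2, 80.0, 3.1, None, 4.0, 4.2, 4.5]   # hypothetical per-hop RTTs
for hop, verdict in classify_hops(example):
    print(f"hop {hop}: {verdict}")
```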
Sometimes you need more probes to figure that out for sure, but if you see a latency spike in the middle of a traceroute, it means absolutely nothing; if you see loss in the middle of a traceroute, it usually means absolutely nothing. Those are, again, a huge source of people opening traceroute-based tickets: "I saw a drop somewhere in the middle" that didn't mean anything. At worst it could be the result of an asymmetric path (it just took a different path to come back, and it doesn't really matter), but more often than that it's just an indication of artificial rate limiting, or the box being busy on the control plane at that particular time. A good way to test, if you're very concerned, if you think maybe there's something wrong with this interface, is to try a non-TTL-expiring method. Again, this hits the control plane, so it's going to have different performance, but try just pinging the interface and see whether the loss persists or whether it's just something happening with ICMP generation.

The next thing we get into with traceroute is how you troubleshoot all these asymmetric paths. Remember, traceroute is only showing you the forward path, and as I showed, it only shows the ingress interface of how the packet got there. It shows you absolutely nothing about the return path, but the return path is half of the latency value, at the very least, and it's completely invisible. You have no idea, and the only way you will ever know about it is to contact the person on the other side and say, "can you please do a traceroute back to me?" Even that is a guarantee of absolutely nothing, but it's at least a place to start. That's the real hazard of traceroute: the return path is completely hidden, it can be completely different at every hop, and as one person, without any cooperation from the other side, if you don't have a looking glass on that side, if you don't have a customer who can run a traceroute back at you, you have half the information you need and you're trying to make a determination. The only way you can confidently analyze a traceroute is to have it in both directions, and a lot of NOCs that get traceroute tickets do always ask for that, source and destination, forward and reverse, and even then, like I said, you can't catch all the potential issues. But now let's talk about some of the details.

Asymmetric paths, as I said before, start at network boundaries, because that's where the administrative policies change.
Here's an example of a traceroute going from ar3.dca3 on Global Crossing, which we know is the DC area, to Ashburn on Sprintlink, also the DC area; both of those are basically Equinix Ashburn. As it goes from this one particular router on Global Crossing to this one particular router on Sprint, you see a 100-millisecond spike in latency. So now you ask yourself what could be wrong here. It could be congestion on that path: that particular link could be full, and it could be exactly 100 milliseconds' worth of queuing happening, things like that. It could also be an asymmetric reverse path. Once the packet crosses the boundary, once it hits Sprint, Sprint is now in control of how it sends the return traffic back: it has its own routing table, its own policy, its own peering policies, its own local-pref, all of the above, and that's going to vastly affect what path gets used. A good clue that this is what's happening is that it's consistently 100 milliseconds, precisely, across everything. Typically if it's congestion you'll see some variance as TCP backs off and things like that, so when you see exactly a hundred milliseconds, you start thinking, hmm, maybe it's the reverse path.

So how do you work around asymmetric paths like that? The most powerful thing you can do as an individual user sitting there working with traceroute, without magically being able to get the other guy on the phone to run a bunch of traceroutes for you, is to control your source address. Here's an example. Say you're multihomed: you're an ISP buying transit, connected to two networks, in this case Global Crossing and Level 3, which are now the same company. You know that Global Crossing reaches you via Global Crossing, and it so happens that Sprint, because of their different policies, reaches you via Level 3. So you start to suspect that the reverse path, the one coming back over Level 3, is the issue. How can you prove that without having access to a looking glass and all the routers and all the endpoints? One thing you can do is run a traceroute from your router sourced from your /30. The way almost every ISP works is that when you connect as a customer, they give you the /30, the interface IP, out of their IP space: it's owned by them, routed by them, part of their supernet, and it's going to be routed that way. So you do a traceroute, and in your traceroute command you manually set the source address to be that Global Crossing /30. You're forcing the return path, even once it hits Sprint, to come back via Global Crossing. Then you can look and see if that affected your results. If it did, try it on your Level 3 path; oh look, you see 100 milliseconds on everything, so now you know the issue in the previous example is a return path coming back in over Level 3.
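A sketch of that source-address trick in code form: bind the probe socket to the address you want the replies to be routed back toward, for example the /30 out of one upstream's space. The addresses are placeholders; on the command line, most UNIX traceroutes expose the same thing as a source-address option (commonly `-s`), and router traceroute commands typically have an equivalent source knob.

```python
# Sketch: force the probe's source address (e.g. the /30 assigned by one of
# your upstreams) so the ICMP replies are routed back toward that provider's
# address space. Addresses below are placeholders, not real assignments.
import socket

def send_probe(dest_ip, ttl, source_ip, dport=33434):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    s.bind((source_ip, 0))                 # pin the source address of the probe
    s.sendto(b"", (dest_ip, dport))
    s.close()

# Compare the same hop probed from each upstream's /30 (hypothetical addresses):
# send_probe("192.0.2.10", ttl=8, source_ip="198.51.100.2")   # provider A /30
# send_probe("192.0.2.10", ttl=8, source_ip="203.0.113.2")    # provider B /30
```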
Now we get into some more complicated stuff: asymmetric paths with multiple exits. Remember, asymmetric paths can happen anywhere, and they do; in fact, in modern networks they happen pretty much everywhere. It turns out there's no terabit Ethernet yet (sorry, Randy), so people tend to splay things out: it's not uncommon to see 64-by-10-gig paths, many, many parallel paths, and all these asymmetric routes are happening all the time as traffic is load-balanced and networks interconnect with each other at multiple points. Here's a very simplified example of a major network with three locations, San Jose, Chicago, and Washington DC, that interconnect with each other at all three points. Remember that in almost all examples, peering is done via hot-potato routing, closest-exit routing: you want to hand the packet to the other network as quickly as possible, because they know more about routing it than you do. At hop one, the outbound packet goes out the Washington DC interface and comes back in that same interface; very simple. At hop two, we've now incremented the TTL by one: we send the packet through Washington DC (the little red line), it goes out to Chicago, Chicago drops it and sends its ICMP back, but the return path Chicago uses is via the Chicago interconnection. So if, for example, there was congestion there, or something about that path was broken, you would see the latency spike for that hop. But by hop three, the third packet (the green line) has made it three routers deep, to San Jose; San Jose drops it and sends it back, and it has a different path. So every point along the way can potentially have a completely different asymmetric return path with all these different exits. Sometimes you'll see a latency spike in the middle that's caused by that, and not by the cosmetic stuff, and that's where you have to ping all the interfaces manually, one thing at a time, and really deduce one thing after another to figure out what the real issue is.

Then the question becomes some more advanced troubleshooting with source addresses: what happens if it's a peer address, or I'm the transit provider and the /30 is numbered out of my space? You can still get some benefit by playing around with it; a lot of times you'll see different results just from different hashing. For example, if you have a router and it's .1, and you run a traceroute sourced from that, the hash calculated from that address sends it down one path; do it from a different interface on your router, with a different hash calculated differently, and it's going to expose different paths you weren't seeing before. There are a lot of cases where people report intermittent loss and you won't be able to figure out why, and it's because your particular traceroute just happens to be sourced such that the hash sends it over the clean path, not the broken path. The more things you can try, the more likely you are to uncover those types of issues. Another thing to remember when working with traceroute is where your source address is coming from. Most routers (Cisco IOS, IOS XR, most classic boxes out there) set the source address to the egress interface that was used. So if I'm sitting on a router and I do "traceroute 1.2.3.4" and it goes out my Level 3 interface, it's going to set my source address to that Level 3 interface. Juniper does that by default too, but they have a command, system default-address-selection, that forces it to use the router loopback, which in a lot of cases might be more what you're looking for. For example, if you did that traceroute out Level 3 and it was sourced from Level 3 IP space, it might not be exposing all the different return paths that would apply to your real IP space, to a real customer who's really multihomed with you. So again, try all those different things and take that into account when you're looking at your paths.

The next thing about traceroute is multiple paths and load balancing.
Remember that I said every probe is an independent trial. UDP and TCP traceroute probes, like I said, use a different port every time, and the ICMP-based ones may use some other method to encode the probe key. What's happening is that whenever there are multiple parallel paths on a link, say you've got 4 x 10 gig and they're not done as a layer 2 bundle, not an LACP bundle, but as four independent links with layer 3 hashing across them, that's called ECMP, equal-cost multipath, and you tend to see multiple hops show up at a given point. Here's an example of a link that's clearly multipath, going through Telia, where for hop six two of the probes landed on bb2 and one of the probes landed on bb1: routers that are probably sitting right next to each other, with traffic being evenly distributed across them. What traceroute does when it encounters a different return address than it got in its previous result is just display it like this, "here's another result that we got," and people have to know how to interpret that. You see it a lot on the internet: two of the three probes go over one path, one goes over the other. Very simple.

Now we get into some more complex examples. Here's an example of a packet going through what used to be Verizon: you see the first hop is in Ashburn, and then at hop five you see some very different load balancing, you see New York and Chicago. A naive person might look at that and think, "oh my god, my packet just went in weird ways, the wrong direction, we have no idea what's going on." What's actually happening is it's being load-balanced between two paths, one that goes New York to Seattle and one that goes Chicago to Seattle. Again, completely harmless, but if you're a naive user looking at the traceroute and you don't know how to interpret it, you're going to come to the wrong conclusion. Now the really painful stuff: ECMP across paths with unequal hop counts. That's a case where it's load-balanced across two equal-cost paths, but one happens to have more hops than the other, and it makes the output look like this: an extra hop gets inserted on one of the paths, and you get traceroute results where hop three has hop two's result in it. People get really confused by this one; they go, "it's going back and forth, I don't know what's going on." It's out there, you'll encounter it, and when you see that same hop show up again, that tends to be the indication.

When in doubt, you can eliminate the whole thing by just doing a single probe: on standard UNIX traceroute implementations, the command is -q 1, so instead of sending three probes for every hop, it sends one. If you're confused and you don't know what's going on, send one probe, look at that path, then send it again and see if it comes up with something different. But again, remember every probe is still an independent trial, so just because in this one example you see a particular result, that in no way means that's the path your flow went over, or anything like that. When you're working with traceroute, you're looking at many different variants and interpreting all these results to build a complete picture. Like I said, one way to try out different paths is just to increment the destination IP or the source IP by one. If you're tracing to a customer at .123 and everything looks fine, trace to .124, see if it changes the path along the way, and see if it changes it in a way that breaks something in the middle, because that's a common thing you'll see as different IPs get hashed differently.
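A toy illustration of why nudging the source or destination (address or port) moves probes onto different parallel links: routers pick an ECMP member by hashing the flow's 5-tuple. Real hash functions are vendor-specific; this just mimics the effect with an arbitrary hash and made-up addresses:

```python
# Toy ECMP illustration: pick an outgoing member link by hashing the 5-tuple.
# Not any vendor's real hash; it only shows why changing the destination IP
# or port can land your probes on a different parallel path.
import hashlib

def pick_link(src, dst, sport, dport, proto="udp", n_links=4):
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    return int(hashlib.md5(key).hexdigest(), 16) % n_links

src = "198.51.100.2"                        # placeholder addresses
for dst in ("192.0.2.123", "192.0.2.124"):  # ".123" vs ".124", as in the text
    for dport in (33434, 33435, 33436):     # per-probe destination ports
        print(f"dst {dst} dport {dport} -> member link {pick_link(src, dst, 33000, dport)}")
```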
Now, MPLS and traceroute: how traceroute behaves when you put it on an MPLS network. There are a lot of large networks out there that operate an MPLS-based core, in fact most large ones at this point, and a lot of them run it in such a way that their core label-switching devices don't even carry an IP routing table; in fact, the Juniper PTX isn't even capable of it, it's a dedicated MPLS-only box. That's fine for delivering an MPLS packet, but then the question becomes, if you want to show that hop, you have two choices. You can completely hide the hop: MPLS gives you the capability to turn off TTL decrement, so you can magically make the packet go from point A to point B and hide all of your network in the middle. Some people do that just to make the traceroute look shorter or to not confuse people, and some people don't, because they're actually trying to figure out what's going on inside their own network. But when you've got an MPLS-only box and you are trying to show that hop, you need to figure out where to send that ICMP back, and if you don't have a routing table and can't do that lookup, you've got a problem. One common solution to this is something called ICMP tunneling. What happens there is that if you generate an ICMP on the inside of an LSP, as the packet is being forwarded through the network, then rather than have that router immediately do a lookup and send it back to the source, the router puts the ICMP back into the same LSP it came from, and the packet continues all the way to the end, pops out there, and then routes back. It makes it work, but it makes traceroute look really weird.

Here are the details on that. Throw out the previous example of how traceroute works, where the replies come back as you go through hops 1, 2, 3, 4; what happens now is that router 1 sends its ICMP message all the way down to router 4 before it gets returned to the source. Here's what it looks like, an example of AT&T doing it. You see a packet sitting in Global Crossing Palo Alto, it hands off to AT&T in San Francisco, and you say, "oh no, the latency just went up 72 milliseconds, San Francisco is a congested link." No: what's happening is that the San Francisco packet, and then the Chicago packet, and then the New York packet, everything there, is going all the way down to that very last router in New York before the return message can come back to you. That's why you see every hop along the way showing 74 milliseconds. That's usually the indication that this is happening: if you see the latency stay precisely the same, or as close to precisely the same as you can get, through all these different hops all the way to the exit point, you're inside an LSP and you're seeing ICMP tunneling. So try not to freak out over that.
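One rough heuristic for spotting that signature, a run of consecutive hops all reporting nearly the same elevated RTT, is sketched below with made-up numbers. A genuinely flat, short path would trip it too, so treat it as a hint, not a verdict:

```python
# Heuristic sketch: inside an LSP doing ICMP tunneling, every hop's RTT is
# roughly the RTT to the far end of the tunnel, so a run of hops with nearly
# identical latency is a hint. The numbers below are made up.
def looks_like_icmp_tunneling(rtts, tolerance_ms=3.0, min_run=3):
    run = 1
    for prev, cur in zip(rtts, rtts[1:]):
        run = run + 1 if abs(cur - prev) <= tolerance_ms else 1
        if run >= min_run:
            return True
    return False

example = [2.1, 3.0, 74.2, 74.5, 73.9, 74.1, 74.3]   # hypothetical per-hop RTTs
print("possible ICMP tunneling inside an LSP:", looks_like_icmp_tunneling(example))
```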
Really, final thoughts on traceroute. Before beginning a traceroute analysis of any serious kind, always ask for the forward and reverse paths, source and destination. If someone sends you a traceroute and they take out three lines in the middle and send it to you, they've done almost nothing for you. Make sure your NOC knows, make sure anyone you talk to knows, and make sure when you send a traceroute complaint yourself, you don't get snippy and say "I've already done all the work, here are the three lines that matter"; send the source and destination IPs so people can actually troubleshoot it. Beware of snippets of traceroute and missing information. Also remember that a user running a traceroute towards an IP they saw in another traceroute is where things tend to go south. A user might traceroute www.yahoo.com, get three-quarters of the way there, find a Yahoo router, and say "I'm going to traceroute to that Yahoo router." That's a completely different forwarding characteristic than the end host, so you really need the real destination IP; you can't just pick IPs out of the middle and try to make your determinations from those. And that is my tutorial on traceroute. Questions, anybody? There's a mic, so it gets on the recording.

Q: (off-mic, from the floor) I traceroute to my host in Seattle, and from that host I traceroute back here. Am I seeing the full return path, the one that more or less corresponds to the way things would come back from there? What am I learning, and how much is it worth doing?

A: You're not guaranteed that you're seeing the full reverse path, but you're much more likely to be assured you're seeing something much more similar to what that path actually looks like. He's saying: I do a traceroute to him in Seattle, he does a traceroute back to me, back to my actual end-user host that I'm communicating from. That is the best way to eliminate all the different issues and really get, "this is how the packet got to me, this is how my packet got to him, this is how his packet got to me," with all the disclaimers about multipath hashing and all those things; maybe the traceroute packet went down the clean path and the real flow that's broken went down the broken path. But you've eliminated most of the issues you see around the policy change, where the IP address crosses into Sprint and suddenly their routing policy changes, their reverse path changes, and you can't see it. If you can get the guy on the other end to do that traceroute and send it back to you, it's a much better place to start from; it eliminates eighty-five, ninety percent of the issues you'll see right off the bat.

Q: One question about the current slide, how to read it. Does it mean that, starting with hop number three, we have this MPLS tunneling, and can I say that all the devices, maybe up to hop number nine, are MPLS tunneling devices?

A: Yeah. MPLS is used by large carrier networks to move the packet from point A to point B, and what's happening here is there's an LSP taking the packet from this router in San Francisco to that router in New York, but because they haven't turned off TTL decrement, you can still see all the hops along the way. So what's happening is, at hop three, after the probe expires and the router says "I'm dropping this packet here in San Francisco, I need to return my ICMP," it's not doing a routing lookup, finding the source address, and sending it back; it's putting the ICMP into that same LSP, going all the way to the original destination. So yes, you're seeing all the router hops along the way to that final destination, and once it pops out at the end it routes all the way back, but that explains the latency.

Q: So this is the same for hop number four and so on?

A: Yes, hop four is the same thing, and so on.

Q: Thank you.
Q (Matthias, DE-CIX): Some newer traceroute tools have an additional option called Paris. I was thinking it has to do with changing port numbers and so on, so that they force different paths. Do you have any details on that?

A: Yeah, I should have mentioned that. Like I said, the more you can change the port numbers, the more you can vary the probe, the more likely you are to encounter all the different paths out there. There are many different traceroute implementations; Paris traceroute is obviously one that tries to build that in as it does multiple probes. To some degree you can get this out of MTR too. If you leave MTR running (it's basically "top" for your traceroute: it sits there and repeats over and over), every run launches a new probe and potentially exposes a new path, so if you leave MTR running you'll tend to see all of those paths. And there are other tools out there, though I'm not familiar with all of them, that go out and find all those different paths. So yes: change the destination ports, the source ports, everything.

Q: A second question. Sometimes, even in your presentation, some traceroutes give back MPLS labels. Do they help with any debugging?

A: I meant to throw that in. When MPLS came out there was a big question of how to communicate MPLS information back in the traceroute, and so there's an extension. Basically, you can encode some amount of information in the ICMP message as you send it back: when you send the returned packet, you include part of the original packet. Back to the example: I launch a probe packet, it's 64 bytes long, I stick some piece of information in the header, and I know the source, the destination, all of that. When the router drops that packet and sends its ICMP back, it copies the first part of that message, and in doing that copy it can also include MPLS information. It's not terribly helpful for anyone outside the ISP's network, and it's not even that helpful for the ISP itself: it's basically telling you the label number, which you're not really going to do anything with unless you're troubleshooting an implementation that's broken. You pretty much have to be the carrier looking at your own network for it to matter. Some traceroute implementations will interpret that and display the actual label number as the probe goes through, or show the stack depth, whether the packet is inside two LSPs or three. For someone on the internet who isn't on that network, it's essentially pointless.

Q (Randy Bush, IIJ): Regarding exploring different hashing: if anybody's interested, write to me and I'll point you to our IMC (Internet Measurement Conference) paper from a year ago, where we looked very heavily at using ping with ID manipulation, much as Paris traceroute does, to explore paths and find out how radically, crazily different they can be. What's worse is that this happens even on the same physical fiber path: because you can't get a hundred-gig interface from Ashburn to Dallas, you take ten ten-gig circuits and LAG them, and you see radically different times if you explore the different fibers in the LAG. So be careful. For the kind of stuff we as operators do, it's close enough for government work; for the kind of things researchers do, you had better know what you're doing with ping. We saw something like 30 percent differences on the same path, on the same fiber.

A: I'd hope not to see that on the same fiber, but a lot of times what will happen is that someone will take ten ten-gig circuits between Ashburn and Dallas, get them from three different carriers over three different fiber paths, and treat them all the same internally, load-balancing across all of them, and one will have much higher latency than the others. It's really rare to see that big a variance on the same physical path. You'll see some, and that's why you have to hash; you can't just spray every packet with a perfectly even load balancer or you'll get reordering from different fiber lengths, different queue depths on every interface, things like that. But a lot of carriers out there are mixing and matching: you go out and buy whatever the cheapest path is at the time, and in some ways you end up load-balancing a longer path against a shorter path.
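To illustrate the path-exploration idea behind Paris-style probing and the ping experiments mentioned above, here is a sketch using scapy. It is not what any of those tools actually do internally, it needs root, and the TTL, port range, destination, and helper name routers_at_hop are illustrative assumptions. Holding the TTL fixed and varying only the UDP source port changes the flow hash, so different probes can land on different ECMP or LAG members and different routers answer at the same hop.

    # Enumerate the routers that answer at one hop as the flow hash changes.
    from scapy.all import IP, UDP, sr1   # pip install scapy; run as root

    def routers_at_hop(dest_ip, ttl, sports=range(33000, 33032), dport=33434):
        seen = set()
        for sport in sports:
            probe = IP(dst=dest_ip, ttl=ttl) / UDP(sport=sport, dport=dport)
            reply = sr1(probe, timeout=1, verbose=0)
            if reply is not None:
                # The ICMP time-exceeded comes from whichever router this
                # particular five-tuple hashed onto.
                seen.add(reply.src)
        return seen

    if __name__ == "__main__":
        print(routers_at_hop("192.0.2.1", ttl=5))

Seeing more than one IP come back for the same TTL is exactly the multipath behaviour being discussed: a classic traceroute that varies ports per probe wanders across those paths, while a Paris-style tool keeps the flow fields stable within a run so each run stays on one of them.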
Q (Joe, InfoRelay): Our help desk is really smart, so we ask customers for forward and reverse MTRs instead of traceroutes, and we trust that they can get them. But you made an offhand comment about banging on the CPU of every intermediate router in between. Are we being bad net citizens when we do that?

A: Maybe. I'll give you an example (I thought I had more detail in the slides on some of the different hard-coded limits), and again this depends greatly on the router vendor and software, but there are a lot of boxes out there, a lot of boxes still deployed, where there's a limit of something like a hundred packets per second per line card for generating these responses, even on an 80-gig or 40-gig line card. So it doesn't take much. It takes ten gamers and three guys in the NOC running MTR, and all of a sudden you've got enough packets coming through that the router is now dropping TTL-exceeded responses. And what tends to happen is that the more people do that and the worse it gets, the more people fire up traceroute to figure out what's wrong. So yes, you can very easily get a hundred gamers, all pissed off that their server rebooted, all firing up traceroute, and now you've just broken traceroute for even more people. Just make sure they keep in mind not to leave it running forever.

Really, the bigger fix is to talk to your router vendors and ask for better handling, and we've started to see more and more of that over time. One of the things I didn't get to put in here: Juniper, as of I think Junos 11.4, pulled all of its ICMP-generation rate limiting out into a standard structure called DDoS protection. It's misnamed, but you can actually configure it and you can log it, so you can see "on MPC 7 I'm hitting this limit, and that's why I'm not returning traceroute packets," and you can bump the limit up. But for a lot of boxes, a lot of legacy deployed systems out there, it's hard-coded, it's per line card, it's per version of the OS, and you have no way to change it even if the box is perfectly capable of doing more than that.

Anybody else, behind any pillars? All right, well, thank you all very much.
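As a practical footnote to the rate-limiting discussion above, here is a sketch of keeping measurement polite and of reading the loss column with those limits in mind. It assumes mtr is installed (--report, -n, and --report-cycles are standard mtr options); the cycle count, the helper names, and the loss-reading rule of thumb are illustrative assumptions, not anything prescribed in the talk.

    # Run mtr once with a bounded number of cycles instead of leaving it
    # looping against every router's ICMP-generation limits, and apply a rough
    # rule of thumb to the per-hop loss column.
    import subprocess

    def bounded_mtr(dest, cycles=10):
        out = subprocess.run(
            ["mtr", "--report", "-n", "--report-cycles", str(cycles), dest],
            capture_output=True, text=True)
        return out.stdout

    def loss_is_probably_rate_limiting(per_hop_loss_pct):
        """Loss at an intermediate hop that does NOT carry through to the final
        hop usually means that router is rate-limiting its ICMP replies (the
        per-line-card limits discussed above), not dropping transit traffic;
        real forwarding loss shows up at the destination too."""
        if not per_hop_loss_pct:
            return False
        return max(per_hop_loss_pct[:-1], default=0) > 0 and per_hop_loss_pct[-1] == 0

    if __name__ == "__main__":
        print(bounded_mtr("192.0.2.1"))
        print(loss_is_probably_rate_limiting([0, 40, 40, 0, 0]))   # -> True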