Is It The Client, Network, or Server? - Packet Analysis with Wireshark - Sharkfest Talks

and what we look for specifically in the packet traces. One of the reasons we named this session the way we did is that there are so many times people run into issues where the network gets blamed. I mean, come on, that's where people go right away. Actually, can I just get a show of hands: are you from the network side, the network engineering side? Okay. Server support, application? Okay. So most of us are network people, right? So in this room we're probably used to getting blamed.

Now, people blame the network with or without cause, with or without any type of data to back that up. I always think it's interesting when I'm calling in to make a change to a flight, or to talk to my bank about something with my account, and the person on the other side of the phone tells me that they're having network problems. "So, how's the weather where you are?" You know, they do that stall thing. Have you ever had that happen to you? It happened to me not long ago. I was making a quick change to my flight, and I could tell the person in the call center was stalling. They were doing something, and again, they were blaming the network. I always find that interesting, because they just work in a call center. How do they know what the real root problem is? All of us can think of a hundred things that we could go try to troubleshoot, but even a non-technical person is blaming the network.

So what's happened culturally is that it's put us in this defense mode, where now we're trying to get the blame off the network. That's a nice thing to do, to be able to get the blame off the network and show them: "Look, this is an application call that's taking 60 seconds. See, it's not my network." But as network engineers, where we need to get to is not just saying "okay, it's not my network," but also to show them specifically what the problem is and, even better, help
them, or, if it is the network, get to root cause and make it go away.

Now, one of the reasons we're getting blamed, just culturally, is this: as network engineers, we're responsible for the lower layers of the OSI model. If you go to HR and pull up our job description, we're responsible for the physical layer, we're responsible for data link, we're responsible for routing. You could also put a toe up into the transport layer and say we're responsible for load balancers, firewalls, and things like that. But as far as owning and embracing things above those layers, a lot of the time that's just not our problem. It's not what we're responsible for resolving.

So when a problem strikes, what do we do as network engineers? Typically we make sure that our connections are okay, that we're not blasting traffic anywhere, that capacity isn't a problem, and that we don't have a ton of errors on ports along the way. If all those things check out, then hey, thumbs up, the network's fine. Especially if we can run a throughput test from one point to another and things look great, data's passing okay: show me why it's the network. The network looks good.

Then there's the server or application guy, and he's responsible for the upper layers, the actual service that's supporting the application. He's responsible for making sure that thing is up and cranking away. If there's a problem, he's going to check his system with whatever monitoring tools he might be using, check server resources, and make sure the service is running in the first place. If he doesn't see some major indicators, he's just going to blame the network. They're going to blame the network anyway, but hopefully they'll also do a little bit of due diligence on their own stuff.

But notice there's a layer that got jumped over on this slide. This just shows culturally what I see all the time as an independent third-party analyst: I just see people in boxing matches over the transport layer.
I'm not saying that all problems are in the transport layer, but the transport layer is often an unknown. It's not an owned thing. People don't own it and embrace it and say, "Yep, that's me, I'm responsible, I'll take ownership of that." So today what I'm going to show you is how to do that with packets: start at the transport layer, take a look at how things are performing at that layer, and that will help us saw the OSI model in half and go either up the stack or down the stack depending on what we find at that layer. So from this point forward, for the rest of your careers, you're going to be owning, embracing, and taking responsibility for the transport layer, which Wireshark is going to help you do.

All right, let's talk about some specific things we're going to look for at the transport layer that indicate a network problem. Just to tell you the order we're going to do this in: first I'm going to show you some examples of actual network problems and the indicators we look for at the packet level. Then I'm going to show you an application problem, something that was actually due to the application not responding very well. And then I'm going to show you what I call TCP weirdness: some things that can happen at the transport layer with the TCP protocol that can be tough to find, and that aren't necessarily network and aren't necessarily application, but are in that middle space.

First of all, before we look at a single packet, there are a few things we want to do before we get in there and get lost in the weeds. First, we want to define what it is we're troubleshooting. Is it performance, or is it "I can't connect at all"? See the word "slow": when people are complaining about slow, that's a whole lot different from "I can't connect at all." Right away, in your mind, you should be thinking: if they can't connect at all, if it's not working at all, I'm going to look
for different symptoms in my trace file than I will if it's just slow from time to time. Those are two different problems, and we want to define that well. Next is "it's always slow" versus "it's sometimes slow." Those are also two different types of problems to dig for. And the third one: it only affects one user versus it affects everyone. If you don't have a clear picture on all three of those before you start getting into your packets, just stop, back up, and ask questions. Ask the users, get input from them, and really make sure you have a good picture of the behavior of the problem before you go to the packet level, because as you know, it's easy to get lost in the weeds.

Then also make sure that we capture the packets when the problem is happening. I'm sure you've experienced this as well: looking through a trace file and finding that everything's working just fine, and then finding out that they were actually mistaken about when the problem occurred. That happened to me last week. Someone sent me a trace file: "Here's a problem we were having." Okay, when did you actually experience the problem? "Last week, or the week before." But this trace file is from today. Well, we've got to capture it when it happens. Otherwise we're just looking at what could be perfectly normal performance. Which, by the way, is nice to have: when we're looking for what's broken, it's also nice to have a trace of things that do work, because then you can look at the calls and the conversations when they perform normally.

All right, so let's take a look at some things that would be more network-related issues. These are the big ones. We want to look for things like retransmissions, out-of-orders, duplicate acknowledgments, and high latency. I'm going to bring up a trace file and show you how to look for these indicators and what they mean, but
typically these are the types of things we want to watch for if we're looking for a network issue. If we see these in our trace file, that's when our mind can begin to go in that direction. So again: retransmissions, out-of-orders, dup ACKs, high latency. Now, these are for TCP-based protocols. If the application is using UDP, we can talk about that in just a minute, but first I'm going to show you some examples of what those network problems look like when we're using TCP.

All right, let's bring up a trace file. This one was a slow download. This was just a user going out and downloading a file, and it was slow. It was to a server that was not on their local network.

Right away, from the beginning of this trace file, let me explain the Wireshark profile that I have here. First we have our packet number. Next to that we have a running total of time, so I have my Time column set to count from zero at the start of the capture to the end of the trace file. That's the one I like to have there. Next to that is Delta time. You're going to see a Delta time column on just about every instructor's screen here at Sharkfest, so it's something you absolutely want to add. Delta time is the amount of time between packets, from the end of one packet to the end of the next one. For me, that's set to delta time displayed, so whatever is actually shown here, that's the time between those frames. If this isn't something you've added yet to your Wireshark profile, it's absolutely something you want to make sure is there.

How do you do that? Betty just showed us in her session earlier this morning. On my YouTube channel I also have a quick way to do that: you can go out to Packet Pioneer on YouTube, and there's a little video of how to add that quickly. You can just come down
to the Frame section and find "Time delta from previous displayed frame." This is where we can add it as a column, which I've already done.

Let me come over here. Okay, delta in the TCP conversation: that's my third time column. For me this is useful because it shows me where this packet falls in its TCP context. If I have a trace file with a lot of different TCP connections happening at once, which is the case in a lot of scenarios, right? We don't just establish one TCP connection to a server and do everything over that one connection all the time. Especially if we're going out to a website or pulling from a lot of different connections, this is a useful timer to have, because as soon as I start to capture, I'm going to see packets from several different connections interleaved. The packet above mine doesn't necessarily relate to the conversation I'm on. So to save myself from filtering on every single connection, sorting on the delta time, and seeing where those delays are, I can just add that time column, and it shows me, in context, where this packet fell in its TCP conversation. That's a really useful one for me, and one I'd recommend you add as well.

I'm going to show you real quickly how to add it. I'm going to minimize my Ethernet and IP there and go to TCP. What I did previously is I right-clicked TCP, went into Protocol Preferences, and selected "Calculate conversation timestamps." To add the field I'm about to add, you have to make sure you do this as well. That's an important one. Once you do that, you'll notice that at the bottom of the TCP header you have Timestamps. If you don't have that checkbox checked, this will not be there. Once you have Timestamps, you can see "Time since first frame in this TCP stream" and "Time since previous frame in this TCP stream."
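To make the difference between the two delta columns concrete, here is a minimal Python sketch of the per-stream calculation. It is only an illustration: the `Packet` tuple, the `per_stream_deltas` helper, and the sample capture are all made up for this example, and Wireshark computes the real values itself (the per-stream one is exposed as the `tcp.time_delta` field).

```python
from collections import namedtuple

# Hypothetical minimal packet record: frame number, capture time (s), TCP stream index.
Packet = namedtuple("Packet", ["number", "time", "stream"])

def per_stream_deltas(packets):
    """Return {frame number: seconds since the previous frame in the same stream}."""
    last_seen = {}  # stream index -> timestamp of the latest frame seen in that stream
    deltas = {}
    for pkt in packets:
        previous = last_seen.get(pkt.stream)
        # The first frame of a stream has no predecessor, so its delta is 0.
        deltas[pkt.number] = 0.0 if previous is None else pkt.time - previous
        last_seen[pkt.stream] = pkt.time
    return deltas

# Made-up capture: stream 0 chats steadily while stream 1 stalls for 3 seconds.
capture = [
    Packet(1, 0.0, 0),
    Packet(2, 0.1, 1),
    Packet(3, 1.0, 0),
    Packet(4, 2.0, 0),
    Packet(5, 3.0, 0),
    Packet(6, 3.1, 1),  # only 0.1 s after frame 5, but 3.0 s after frame 2
]

deltas = per_stream_deltas(capture)
slowest = max(deltas, key=deltas.get)
print(slowest, round(deltas[slowest], 3))
```

In this sample, the plain frame-to-frame delta for frame 6 is a tenth of a second, so sorting on it would hide the stall. The per-stream delta puts frame 6 at the top, which is the reason for adding the third time column.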
The "time since previous frame in this TCP stream" field is the one you can add as a column. I'm showing you this so you know what that column is, and I'm going to show you a couple of trace files where it really made a difference in finding the problem. For me, this is the way I like to set up my timers. It's useful for me to see where the delays are. We didn't have Hansang on this morning, but usually when he does his presentation he just says, "Look, it's my way or the highway." So this is the way I like to do it. You're going to find your own Wireshark style, the different columns that are useful to you, and that really can help you get to root cause. This is just mine.

All right, let's take a look at what's going on here. First of all, you might notice that I like to color my SYNs bright green, the TCP SYNs. I went into my coloring rules and said: anything with the SYN bit set, go ahead and make it green. For me that's helpful. Here the client is going out to the server. I see the SYN, and over here I see my SYN/ACK come back, and right away I see that it took 167 milliseconds for the SYN to get there and for the SYN/ACK to come back.

This is a very important number to know. Anytime you see a TCP handshake, look at the total amount of time between the SYN, the SYN/ACK, and the ACK. That's your basic network round-trip time. For this trace I was capturing client side, so the delay shows up between the SYN and the SYN/ACK. If I'm capturing server side, I'm going to see that delay between the SYN/ACK and the ACK. If I'm capturing somewhere in the middle, I can add those two deltas together to get the full round-trip time. But here, client side, I see my round-trip delay is 167 milliseconds.

Now, we're network people, right? 167 milliseconds doesn't seem too good. It seems pretty slow. Most people would say, "Well, of course it's slow, look at your round-trip time." But we want to be careful
about saying things like that. They're a bit too sweeping a statement. Yes, this is not fast, but is it my root cause? No. No human being is going to punch their screen after 167 milliseconds. Their complaint was "slow": they went to get a file and it was taking a minute to download. So I haven't found my root problem, but I have found that my network round-trip time, client to server, is about 167 milliseconds. Note to self, write that on the board. That may or may not be related.

Then the handshake completes, we see our GET, and then we see an ACK come back from the server 155 milliseconds later.

Now let me talk a little more about this 167. One of the things I like to do is look at the TTL in the IP header to see how many hops I have between client and server. If this server were one hop away, if I just had to go through one router and back and that took 167 milliseconds, that would be a big problem for me. But if I have to go through several routers, that's a different story. So how can I tell from the packets? If I take a look at that SYN/ACK, this is the one coming back in, and I come down to the IP header, this is where I find the TTL, time to live. That field is very useful for figuring out where we are in the capture path and in the packet's path. Every time this packet crossed a router, that number was decremented by one. If that number ever goes to zero, the router that decremented it will not forward it. Instead I'll get an ICMP message back saying, "Hey, your packet died, too many hops." So eventually this packet is going to die, after 55 more hops. Thankfully I received it before that happened. About the TTL field: these usually start at bit-boundary numbers, so things like 64, 128, 255. In my case, the TTL on my SYN was 64.
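The decrement-per-hop rule just described can be turned into a quick estimate. The following is a minimal Python sketch, assuming the sender started from one of the common initial TTL values 64, 128, or 255. That list is an assumption (a host can be configured with a different value), and the function names are hypothetical.

```python
# Assumed common starting TTLs; a host can be configured to use something else.
COMMON_INITIAL_TTLS = (64, 128, 255)

def hop_candidates(observed_ttl):
    """Map each plausible starting TTL to the number of routers crossed."""
    if not 0 <= observed_ttl <= 255:
        raise ValueError("the IP TTL is an 8-bit field (0 to 255)")
    return {start: start - observed_ttl
            for start in COMMON_INITIAL_TTLS
            if start >= observed_ttl}

def likely_hops(observed_ttl):
    """Educated guess: the smallest starting TTL that can explain the observed value."""
    candidates = hop_candidates(observed_ttl)
    start = min(candidates)
    return start, candidates[start]

print(hop_candidates(55))  # {64: 9, 128: 73, 255: 200}
print(likely_hops(55))     # (64, 9)
```

For a received TTL of 55, the candidates are 9, 73, or 200 routers, and the smallest one is the educated guess. It stays a guess until there is a capture on the sending side to confirm the starting value.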
The client sent 64, so every router along the way decremented that number until it was received by the server. The server turns around and fills in its own TTL, whatever number it's using, and as the packet crosses routers coming back to me, it's decremented on the path back. So, 55: let's just use some logic. The received TTL on this packet is 55. Now, what's more logical? Either this packet started at 255 and came through 200 routers on its way to me, and there are literally 200 routers between me and that server. That's one possibility. Or it started at 128 and came through, what, 70-something routers on its way to me. Or it started at 64 and went through nine routers on its way to me, unless there was something on that side tweaking that number. So what's the more likely scenario? I can make a pretty educated guess that this one started at 64. Am I absolutely sure of that? Not without capturing on the server side. But I can make a pretty educated guess that the packet went through nine routers on its way to me.

So now I can take that assumption, which is still an assumption, though you could say it's a good one, and look at that 167 milliseconds and say: okay, nine routers. How many of them do I actually own in my enterprise? One? Two? Okay. If I own the first two, the other seven are in my ISP or wherever else. So we can start to figure out that the delay isn't necessarily all my fault or all in my system. It's going off into a network that I can't control. That's a good number to keep in mind with the TTL.

All right, so here again we have the ACK for the GET, and then the 200 OK. If I look at when I actually begin to receive data, at my 200 OK, I'm still only a couple hundred milliseconds in, 155 milliseconds and change on top of the handshake. From the capture start to when I'm actually receiving data from that server is only 327 milliseconds. The question is: is
the end user punching their screen just yet? At 300 milliseconds, as a human, not quite yet. But I can see the connection's established, and it's not being blocked anywhere. We've looked at the round-trip time, so let's set that aside for now.

To save ourselves from scrolling, what I want to do is come over here to the delta time. Actually, either of those columns works, since this is just one connection. I'm just going to sort on this one and go to the top, and here I see a delta time of 16 seconds. What kind of packet is that? It's a TCP keep-alive. So at some point in this conversation I had a 16-second stall. That's more like something someone would complain about, right? That's screen-punching time. Interestingly enough, the packet below it is half that time, eight seconds. The packets below that are at four, two, and one, and then I get into the hundreds of milliseconds. So this is something I'm going to want to start investigating. It at least gives me an idea of where I want to start digging. Maybe these delays happen at the end of the connection and they're unrelated, but we don't know that yet.

All right, let's take a look at our 16. With that packet selected, I'm going to go ahead and re-sort by packet number. We've got a bunch of ugly stuff in here before I get to this point. If I come up above the point I was on, I've got dup ACKs, I've got previous segment not captured, I've got retransmissions, I've got out-of-orders. In fact, I can come up here to my "bad TCP" filter button. That's tcp.analysis.flags, and it's absolutely something you want to add in Wireshark, because it shows what I call the TCP ugliness. It will show you retransmissions, out-of-orders, dup ACKs, all the TCP ugliness, and it will put it in black and red. This basically shows you that you have some packet loss: there's some retransmission happening, so packet loss is occurring. So right now we would start to
head toward that network problem. We'd want to start investigating where this packet loss could be happening. So we do see some bad TCP: we see some retransmissions, out-of-orders, and other ugliness. But take a look at the timing. I'm just going to undo my little filter and go back to that 16 seconds. If I look at the total amount of time over here in the column counting from the beginning of the trace file, and I scroll just a little bit, not too much, I can see that just a second or two in is where I start to hit some ugliness. So we do have some issues: we get dup ACKs, we have some packet loss and retransmission, and some things aren't looking too good. But as I scroll and come down on my little intelligent scrollbar, there's a big thick black part here, which I noted when I was filtering on it. That's where that 16-second delay is, down here. Just above that, up until the point where I see that ugliness, I'm still only 3.76 seconds into the trace file.

So I saw the connection, I saw the request, I saw the response begin and the data start streaming, and I'm only about three seconds in. Between the capture beginning and three seconds I did see some retransmissions and some ugliness, but is the person going to pick up the phone and call me three seconds into a download and tell me that things are slow? Are they punching their screen at three seconds? They can probably cope with that, right? I do see retransmissions, but is that what the person is calling me about? Probably not.

So let's keep going. As I come down through here, we've got data movement, and things are cranking along. I see big packets from the server: big packet, acknowledgment, big packet, big packet. Everything's looking good. Now all of a sudden I come down to this packet 363, which is from the
server. Let me zoom in so you can see that a little bit better. Okay: big stuff coming over, the client ACKs, I get a big packet, and then I hit this one from the server. This is a window full. We'll talk about what TCP Window Full means in a second. After that I get a TCP ZeroWindow from the client. After that I see a keep-alive. This is the server checking in 667 milliseconds later, saying, "Hey, are you still there? Are we keeping this connection open? Talk to me, just tell me something, because I haven't heard from you in 667 milliseconds." The client comes back 36 microseconds later and says, "Yes, I'm here. By the way, my TCP window size is zero."

Let's talk about that for a minute. A TCP window is basically a TCP receive buffer. I'm telling the server, "This is how much space I have in my TCP receive buffer, or window, that you can send to me unacknowledged." That's a number we're going to see in the TCP header. I'm going to expand this so you can take a look at the TCP window size value. We're actually going to work with the calculated window size. It would be important to talk about how that's set in the handshake, but this is the real number we'll work with for our TCP receive window.

Now, a little bit above this, what I want to figure out is: is the client digesting data out of that TCP receive buffer as fast as it's coming in? The idea behind the TCP receive buffer is that I can receive data and tell you how much space I have left to receive more incoming data. If my TCP receive buffer ever fills, I'm going to tell you zero window: don't send anything else. That zero window is a way for a client to say, "Look, buddy, you're sending a lot of data and I'm not processing it as fast as it's coming in. I'm stuck in some way. You've got to slow down." So for the client here, let's take a look at these big
packets coming from the server: 1514, 1514. Now take a look at the acknowledgment. This is above the point of the zero window. Here I've got a calculated window size of 23,360. I could add that as a column, but I don't want to clutter things up. All right, 23,360. The server sends two more packets, and look at this number here: 20,440. That tells me how much space I have left, how much remains in my TCP receive buffer on the client side. Whoever is sending the packet is the one advertising that number. Then we have two more big packets come from the server, and we see 17,520. See what's happening? This number is going down, which means the buffer is filling. The server sends two more packets, and the client says, "All right, 14,600. 14,600 bytes is all I've got left to receive." The server sends two more, and the client says 11,680. See what's happening? The number is dropping. Two more, and the client says 8,760. Two more: 5,840, "I'm almost full." Two more from the server, and the client says 2,920. Then the server sends two more big packets.

If we do the math on that, the server sent exactly the amount of space I had left in my receive window. That's why Wireshark flags this for us as TCP Window Full. It's not the fault of the server. The server did send enough traffic to fill the window, but that's okay, there was space there. It's not that the server is the one that's broken here. It just filled the client's window.

After those two packets, the client comes back and says, "You know what, my window size is zero. Stop." Now the server can't send any more data until this number goes back up. So look at this 667: this is the server basically probing the client, "Hey, what's going on, man? Are you still there? Are we still talking? It's been over half a second and I haven't heard anything from you. What's going on on your side of the connection?" The client comes back with a window size of zero: I'm still full, I
haven't digested my dinner yet. I've got a full stomach, I can't take anything else, so just chill out. Then 1.133 seconds later, roughly double the wait, the server comes back: "All right, are you there? Can we talk? Can I send some stuff?" The client says zero window, still full: "Man, I'm full. I'm doing other stuff here, I've got other processes I'm working on, I'm not worried about this TCP buffer, and I'm not emptying it." The server doubles again, two seconds: "Hey, how are we doing? What's going on?" Client: zero window. Four seconds after that: "Hey, what's up? Can I start sending anything else yet?" Still full. Eight seconds later. See how we're starting to get into human time? This is human-being time. This is screen-punching kind of time. And all of these deltas add up over here in my time column. Finally, 16, and that's not milliseconds, that's seconds: the server sends this keep-alive, the client comes back and says, "Look, I'm still zero," and then shortly after that the client's TCP buffer clears. I get a window update saying, "Hey, I'm starting to free something up," and then finally it opens all the way to 256K.

So this pain point here, that's where the screen punching was. Above this point I have retransmissions and some stuff happening on the wire that I don't really like to look at, but TCP is recovering up there. Down below this point I also have some retransmissions and some ugliness going on. But this period of time is likely what the end user is complaining about. So while we do see some network stuff, if my customer brought this to me, I could say, "Yeah, we are getting some network drops. Honestly, network-wise we do see some packet loss, we do see some ugliness happening. But this client, he sucks. He's not processing the data as fast as it's coming in."

Now, to help us visualize something like this, another thing you want to make sure you use, especially when you're looking at large transfers of
data, if you find that those large transfers are going slow, is a graph you want to be sure to learn and use. You can find it under Statistics, then TCP Stream Graphs. Let's go to the Stevens graph. What this shows me, graphed over time, is our sequence numbers going up. So, TCP sequence numbers: the amount of data that's going from one side to the other. We saw there was this initial burst of data, and then I got that long pause. What I want to see is a nice straight line, boom, going straight up. I want to see this go straight up to where it ends. Even then my download would take 20 seconds or so. But here I saw a big period of time where there was nothing going across the wire. It just got stuck. That's the kind of delay I want to watch for.

So this is just an example. I like to show this one because when we first start looking at it, it's easy to say, "Oh, we've got retransmissions, it's the network." There is an element of truth to that. We do see some retransmissions. But the thing that was really causing the delay, the thing that was the issue, was that stuck TCP window on the client side. The TCP receive buffer was filling, and the client was not processing data out as fast as it was coming in. That is what led to the real delay the user was complaining about.

So what kinds of things cause that? A lot of the time the client is busy doing something else. It could be stuck on some other process. It could be doing some kind of backup, or the antivirus is doing something on that side. Think in terms of why resources would not be sufficient to keep up with that incoming data stream. There's a handful of things there, but this was an example of that.

All right, so again, I wanted to show you this one because we have retransmissions, those are ugly, and it looks like network, but there are also some other things at play here. Adding that delta time and also the TCP delta
time will help you sort for those, and then you can look for those delays. So that's just one example. Let me move on to another one. I've got a few to work through here.

Now, if we do see retransmissions, out-of-orders, dup ACKs, that kind of stuff, what do we want to look for? We absolutely want to comb as much of the network as we own. Look for layer 2 events: look for discards, look for CRC errors, look for links that are half duplex still hanging out with collisions on them, and so on, and try to clean up that packet loss in our environment.

Now here's a different one I want to show you. I'm going to show you an example where retransmissions didn't actually indicate packet loss. Most of the time they do: TCP sent something, it didn't get a response, so it sent that thing again. That's the textbook retransmission. But in this case we started to see retransmissions, and when we got down to root cause, we found that it actually was not due to packet loss. This also shows you the need to capture from multiple locations. Even if you can't always capture simultaneously at multiple locations, meaning the same event in five different places, at least move that point of analysis and get as much information as you can. For me, I really like working with client-side and server-side traces simultaneously. That's like gold to me, and one in the middle as well. In this case we couldn't get all of those at the same time, but we were able to move our capture point.

So let me show you this. From the client side, the symptom was slowness, random slowness. Sometimes the user would go out to some website, sometimes to Gmail, sometimes to United, and sometimes it would work and sometimes it wouldn't. Sometimes it would be a flat-out "cannot connect," and sometimes it would just be dog slow. So internet slowness was the complaint. Starting at capture point one, we started client side. Let's take a look
at what that trace looks like. This is filtered, just because of time [inaudible]. All right, so here's our SYN. [inaudible] There's our SYN, and we wait three seconds. If you lose your first packet in a TCP conversation, guess what: three seconds. That's human time, right? We don't like that kind of time. We wait three seconds and retransmit; it didn't get a SYN-ACK back. Six seconds later we retransmit again. So right now, combined total, we're nine seconds in. And then the connection establishes, we get our connect, and then it takes off and we're off to the races. Everything was fine after that. So I lost the worst two packets I can possibly lose in terms of packet loss: I lost the first SYN, the second SYN, and then the third one works. So I had a whopping total of nine seconds of delay just waiting for this thing to connect.

All right, so this is what we saw from the client side. Now, if we move our point of capture... I mean, right away, different things can start coming to your mind. OK, maybe it's packet loss on the network? Maybe. I mean, it's kind of weird to lose the first two SYNs like that just from raw packet loss, just from congestion or something like that. It's possible, but [inaudible]. Or, going through a firewall, something that was even mentioned this morning when she was looking at a similar type of trace: maybe the firewall was already busy handling so many other connections that it couldn't get around to my connection and let it through. Or, I worked with a firewall a couple of weeks ago where it had a certain number of connections allowed per user, and it was a "sorry buddy, you've been too busy, I can't help you anymore" kind of thing. It was set to something like a hundred connections per user, and your first hundred would work, and then after that it would say, forget it, I'm just dropping your stuff. So for me, that was actually my first assumption. I thought, wait, we're going outbound through a firewall; let's
talk to your firewall, because it's probably that. Or, you know, being careful about assumptions, I was like, let's just take a look at that firewall and see: does it have some type of ceiling per user on the number of outbound connections? Let's check that. Well, we went in and we found that number was very high, so that was good. So what we did is we moved our capture point from the client side; we moved it outside the firewall. What I wanted to see was: are those SYNs making it through? The first two SYNs that you saw drop, what happens to those? So we just took our capture point and moved it right outside. Let's take a look at those packets.

All right, now this is not the exact same connection, but it was happening so much, and it was the same behavior for all the connections that were breaking. All right, this is just outside the firewall. Here we have a SYN for 443 being sent out. Do we get a SYN-ACK? We did. You saw the SYN go out to whatever server, and we see a SYN-ACK come back; it was being received. Well, the next packet after that, packet three, that's from the client going out: it's a spurious retransmission. Well, what the heck is a spurious retransmission? We see retransmissions pretty often, but what on earth is a spurious one? With a spurious retransmission, Wireshark is telling us: I'm seeing a retransmission for something that I have already seen an acknowledgment for. It's a retransmission for something we've already seen an ACK for. So it's kind of weird; it's a well-named retransmission.

All right, so the client sends another SYN. The client never got a SYN-ACK, so from the client perspective it looked like outbound SYN packet loss, right? That's what it looked like: SYN lost, SYN lost, SYN worked. OK, here the SYN made it, the server sent the SYN-ACK, the client never got the memo, and it retransmitted. The next packet down, this is from the server this time. If I take a look down in our packet details, I can see this guy is from the server [inaudible]; this is a SYN-ACK. So the server is saying: look, here's a
retransmission, here's the SYN-ACK again; you didn't listen to me the first time, here's my second one. The server waits just a little bit, and it sends another retransmission of that SYN-ACK. The server waits again, sends it again. The client turns around six seconds after its initial one; there is another spurious retransmission. So these two are not connecting, right? The client is sending out SYNs and they're making it to the server; the server is coming back, but those SYN-ACKs are not making it back to the client. All right, so where does our head start to go? I mean, firewall. But the firewall, if it let it out, it should let the response come back in, right? [inaudible] OK. So the reason why we moved the point of capture outside the firewall is because we wanted to see: are these packets getting out? Are those SYNs getting out, or are they being stopped right there? We just proved that they are getting out. Stuff is getting out of that firewall; it did have enough connections available for this user, right?

OK, so the next thing we did [inaudible] was go ahead and move this guy inside, here, to this third capture point. There was only a switch here. Now it's funny, because we did call the firewall people. There were some funny things in the firewall log, and we were like, you know, should we just check on the inside, capture there? And they're like, nah, what can happen on the inside, right? It's just one switch you're going through. Well, stuff happens, and in this case stuff did happen. [inaudible]

OK, now this is inside the firewall. SYNs are going out, SYN-ACKs aren't coming back in. The switch stopping them is a possibility [inaudible]. Now again, we're inside, right? So the SYN went out to the server and the SYN-ACK came back in; we're capturing it just as it's coming in the front door. [inaudible] We see that retransmission from the server,
the client sends that spurious retransmission. So [inaudible] between these two, all that's in between is this switch and then the client. Is the switch not passing the SYN-ACK to the client? That's one possibility. Another is [inaudible]. Sorry guys, I have filtered this [inaudible]. But something we want to look at is our layer 2 stuff. [inaudible] The sending MAC address, that's the client, and it's going to the firewall. All right, coming in: [inaudible] is sending it to a Cisco device. So going out, the client is sending it out this way, but when it comes back in, with just a switch between us, the firewall is sending it over to a Cisco device over here, some third device. OK, that's strange. Let's take a look at [inaudible] packet from the client: [inaudible] source MAC address, right, destination MAC address. But the stuff coming back in here, the firewall is sending to the Cisco device [inaudible]. What was going on?

So someone just said ARP. I'm going to show you exactly the weird ARP stuff [inaudible]. OK, in this environment [inaudible]; just because of the time, I had to kind of filter and make this look nice and concise for you, but this shows the behavior. We had a device, in this case let's just say the firewall, broadcasting. It's saying: OK, 10.1 [inaudible], who has that IP? Because I've got a packet for you, and I need your MAC address before I can send it to you. So just hook me up with your MAC address, I'll put it in the destination MAC, and we're off to the races. Well, check out what happens, what kind of responses we get. Does that look OK to you? [inaudible] So: who has 10.1 [inaudible]? And it gets three responses: me, me, me. And guess who wins: whoever was last to respond. So in the first case we see a response from a Cisco device, last two digits of the MAC address [inaudible]: "I'm 10.1 [inaudible], send your stuff to me." Right after that we see another
response from another system [inaudible]: "I'm 10.1 [inaudible], send your stuff to me." Finally the real guy, an Apple device, responds: "I'm 10.1 [inaudible], send your stuff to me." When we saw that behavior, when the host was slower than those first few responses, that last piece of information is what the firewall used, and it got to the right host. However, here's the funkiness: in the next block of ARPs, [inaudible] the real user replied first, "that's me", and then the Cisco devices said "no, that's me". So in that case the most recent information was put into the ARP table, and then the firewall sends that user's stuff over to the Cisco device.

Those Ciscos were doing proxy ARP. On those interfaces they had activated proxy ARP. They were actually VPN interfaces for vendors that this client was using to go talk directly to the vendors. They shouldn't have been using proxy ARP; for what they needed, it was overkill. That router was basically saying: I'm the whole subnet, talk to me; anything on [inaudible], send it my way, I'm your guy. Problem is, it was even saying that to the firewall, right? And it wasn't passing things on; it was killing the connection at that point. So there was no real purpose in having that enabled. Those two routers were pretty old dogs; they were like Cisco [inaudible] or something, they're old dogs. We went and disabled proxy ARP, and this whole problem went away.

So those SYNs: the TCP SYN went out, the SYN-ACK made it to the firewall, made it inside the firewall. Because the firewall had the wrong ARP information, it was sending it to a Cisco router rather than the end user. And after a certain number of seconds that firewall would say, hey, where's 10.1-dot-whatever? And that would fix its ARP table, because the end user would be the last one to respond: it would get Cisco, Cisco, then the actual end user, and then everything would work. And these guys were having a
problem with their printer, they were having problems with everything. [inaudible] It's a fundamental thing, layer 2. If ARPs aren't working, if I don't have good MAC address information, if things aren't flowing through the switches, if this type of thing isn't working well, nothing above it is going to work either, right? So don't forget layer 2; that's the takeaway lesson from this one. It looked like packet loss. In fact, we had our sledgehammer right on our shoulder [inaudible]. If we had only captured client-side, we wouldn't have found this. These ARPs wouldn't have been sent to us [inaudible]. If I'm client-side, I only would have seen the firewall's ARP and my response. I wouldn't have seen those other routers responding as well, right? Those would have been out of the picture of my capture.

So, a takeaway lesson: if you have retransmissions, fully vet them. Make sure: do you see those packets coming in at a certain point and not leaving a certain point? Multi-point capture; make sure we're not just relying on one perspective. And also, don't forget about layer 2. A lot of us do. I mean, we saw a transport layer problem, right? We saw retransmissions at the transport layer, and that led us to look down the stack. Now we want to go and make sure: how's layer 2, how's layer 3? Do not make assumptions. Don't assume that because all that's between the firewall and the client is a switch, nothing can happen there. [inaudible] It wasn't the switch's fault, it wasn't the firewall's fault. In this case it was routers that were completely out of the path that were doing proxy ARP, which we disabled. OK, that was kind of a weird one.

How am I doing on time? I've got 20 minutes, OK. Next, then. So we talked about network issues and what we want to watch for: watch for retransmissions and for dup ACKs, things like that. Also, when we do find retransmissions, we want to vet them out, make sure that we
really understand what's happening at layer 2 and layer 3.

Now let's take a look at an application problem. In this case, this was an application that was running slow. It had [inaudible], several servers on the web front end, interacting with a couple of SQL servers on the back end. And in this case we were able to look at transactions [inaudible], and we saw some of them were slow. Not all, of course; it's always the random ones that are the hardest. So we have the trace from the back end, and I'm going to add my delta, right? This one I have filtered on this conversation. Let's sort on delta, and I'm going to go down to the bottom, and here we have our 30-second keep-alives. We saw that before. Now, keep-alive, literally as the name implies, that's just TCP trying to keep the connection alive. It's not time to kill it, it's not time to FIN it. Maybe it's busy doing something else, maybe it's not; we don't yet know. But we did see some keep-alives. OK, but then we see a weird 13 seconds. This is a response from, and by the way, whenever I'm in a trace file [inaudible], the client is [inaudible] and the server is [inaudible], just so you know; it's my way or the highway, right? So I get this weird 13-second response.

Now, whenever you're sorting on deltas like this, these are the types of things that you'll see. You might see a long TCP keep-alive thing, like 30 seconds. Now, that might be related to what you're troubleshooting, or it might not. If it's at the end of a conversation, let's just say you and I have a conversation: I've asked for something, you've given me that something, and then I've just waited for a while to tear down the connection, to hang up the phone, right? It could be those are just at the end of the connection, before everything was FINed or reset. We'll find out. But something you want to watch for when you're sorting on delta is just the weird numbers that really don't seem to make sense. 13 seconds on a response? That's not cool. That's
people time, that's screen-punching time, right? 13 seconds, and we're impatient little beings. That's one that I'm probably going to look at. So I've found that, and I'm going to un-sort, put that back in context, and check it out.

All right. So the client sends a remote procedure call [inaudible]. Here's the request that gets sent to the server. 208 milliseconds later, the server responds and says thanks. This is an ACK [inaudible], but this is not an actual response. This is TCP, layer 4, saying: great, I got your stuff, hang on, Mr. Layer 7 up there is working on it. OK, this is not an application response, this isn't layer 7; this is just the transport layer saying, hey, he's working on it, chill. So 30 seconds later the server responds; there's one of my keep-alives that we saw when we sorted on that delta time column. [inaudible] Let's go ahead and keep this connection alive. This isn't a response yet; this is just layer 4. We don't want to hang up the phone; Mr. Layer 7 up there on the server, he's still working, so let's keep this connection open. [inaudible] So we exchange a couple of keep-alives. Thirty seconds later, we do it again. Thirty seconds later, we do it again; we're now a minute and a half in. Thirty seconds later... In fact, let's come up here to our remote procedure call. I'm just going to come up here and set a time reference; I just started a stopwatch on that request. All right, so with my time reference, that clears it off to zero, so now I can see over here the total amount of time. After all of those keep-alives, the server finally, thirteen seconds after the final keep-alive, actually gives me a response. But how much time
did I wait? 193 whole seconds. OK, now, I don't remember if it was exactly this one, but this type of symptom, I've had this sent to me with: "I have a network problem, look at all these TCP keep-alives, this is a network problem." Ugly black lines don't always mean retransmission, right? Ugly black lines mean, OK, you probably want to check this out, but it's not necessarily a problem with TCP. TCP is just keeping the connection open. But it's nice that the ugly black draws our attention there, right? It tells us, hey, we've got something going on here, something's stuck. Now, those keep-alives are happening between a request and a response, so that connection is staying open. If we had these keep-alives somewhere else, let's just imagine, the request goes, the response comes back quick, and then: blackie, blackie, blackie, keep-alives, and then another request and response, we wouldn't necessarily be waiting on anything. Those keep-alives aren't signaling a problem; that just means we're keeping the connection open, right? So keep-alives don't always indicate a problem; it really depends on where they are.

In this case, we waited 193 full seconds before we heard a response from that SQL box. That's an application issue. That's where you say: look, I don't have any retransmissions, check; no dup ACKs; no other TCP ugliness; there are no window issues here, right? We can take a look at the receiver, and we can dig in there and look at the window size and all that stuff; that wasn't the issue. Instead, we were just waiting for the server to respond. It is probably one of the longest application responses I've seen. But what happened was, one of the servers in that server array was still configured to use an old crusty SQL box they had, where everyone else was using the big hotshot new one. So the old crusty dog was still working on a bunch of calls, and it was just bogged down with life, and this call came in, and it really slowed down that end user. So every now and then you
would get hit with this type of delay. Now for me, I was impressed that they even [inaudible]; I mean, we don't know [inaudible] on the client end, but I'm sure by then the client had already given up and closed the window. [inaudible] So anyway, that's what we're going to look for when we're digging for application problems.

Now for me, something that I have in my Wireshark, it's a handy little filter. If you're doing web transactions and you want to find slow web responses quickly, there's one filter that you can use. I just call it [inaudible] HTTP [inaudible], but you can use it: it's called http.time. That filter is just going to be on port 80, and it shows me the response time between the request and the response, the amount of time between those two actions. The filter name is http.time, and what I say is just: show me all web responses that were slower than one second; that's what the 1 is. If that's too fast for you, you can up that number to two seconds, five seconds, whatever you want. Or if you want to be really picky: show me all the transactions that were slower than 100 milliseconds, or 500 milliseconds, whatever you want it to be. For me, I just want greater than one second. This is a button; it's an expression that I can click, and if I'm looking for slow application responses on web servers specifically, I just click that guy and it shows me the slowest responses. So it's a quick way to filter for something like that.

You'll also notice among my buttons I have one called "no broadcast chatter". This is one that's [inaudible] for me. As I'm working through a trace file, if I see something that I'm pretty sure is not related, or it's just busy on the screen, what I like to do is just get it out of there. You don't always know specifically what you're looking for [inaudible], but
sometimes there's just chatter on the screen and you just want to get that out of there: spanning-tree updates, or sometimes ARP, if ARP's not related, or other protocols that you're not really interested in seeing. [inaudible] Then start making your list. You want to say "not ARP", or "not [inaudible]", whatever; absolutely adjust it for whatever you're troubleshooting. I can say, OK, I don't know what it is, but I know those guys aren't related; let's get those guys out of there. Or, you know, we can add to that, like layer 2 broadcasts, if I don't want to see layer 2 broadcasts, if I'm not worried about focusing in on those. What I like about that type of filter is you can quickly remove it if you suspect that ARP really is the issue, or you do need to bring those broadcasts back. [inaudible]

All right, last one: small delays. So you saw a big application delay; that was a many-minute SQL response. But in this trace file, if I come up here, this is just small delays. Sometimes an application can do what I call nickel-and-diming you to death. Sometimes it's not a massive 45-second delay that shows up real easy with filters; it's a bunch of small ones, right? So here it is. [inaudible] I'm going to sort on delta and go down to the bottom. Now usually, just so you know, instead of sorting on the actual delta, if I had just started with the trace file, I'd be sorting on the TCP conversation delta. Because remember, that regular delta, the plain time between packets, is going to show you everything as it relates to the packet before it, right? Not necessarily the previous packet in that connection. That's what matters, right? Those are the times that we want to see. So the usual MO: sort on the TCP delta column. In this case it's just this one connection, so we're all right. Already here we can see server to client, right, [inaudible]. These are [inaudible] packets; I come down to
the TCP header; there's no data in here, right? It's just simply a TCP ACK. And check out the number that I see [inaudible]. Now, the number 200 should be a little warning bell for you. 200 milliseconds: in this case, this server is doing delayed acknowledgment. I'm going to put it in context and you'll see what that means. A delayed ACK means that it's waiting for more stuff to come in, and no more came in, so it held the ACK back waiting for more stuff to come in. All right, let's put this in context. [inaudible] All right, so: client request, server response; client request, server response, fast, right? But then we get this remote procedure call from the client, then we see that 200-millisecond response, and then after that, 190 milliseconds later, we actually see the real data, the real response from the SQL server. OK, this doesn't seem like a whole lot, 300 milliseconds. One instance of 300 milliseconds is not a big deal. A thousand instances of 300 milliseconds [inaudible] adds up, right? So if we take a look at a bunch of smaller delays here, you can see the SQL server [inaudible]: we see a delayed ACK, then we see an actual response with real stuff. Up above you can see the same thing: you can see the request, you see the delayed acknowledgment, then you see an actual response with real stuff. This pattern just kept happening.

Well, the 200 milliseconds: at first you might think, oh man, why are we waiting 200 milliseconds to respond? But in fact, in this case the application wasn't ready to respond anyway. The time from the request to the actual data coming in was a total of 326 milliseconds from request to response. In the middle of that, the server hadn't responded [inaudible]; it wasn't getting that response back anyway. So it delayed the request; well, I'm sorry, it delayed the ACK [inaudible]. It could have been that more was coming in from the client, especially if we had a small packet. If we have something a lot smaller than the 1500 bytes, or the 1460 to be exact, then it could be that more is coming. So [inaudible] let's delay this, because my server's not responding anyway; so we're going to delay this acknowledgment. It waited 200 milliseconds [inaudible], sent the delayed ACK, and then after that the server response comes. So for me, I'm not super worried about that delayed ACK, because if you look at the rest of them, they happen pretty quickly. Instead, what we did is, I cut the data out, but we found the call being requested, and it consistently would be slow on that specific call. So [inaudible] made some little tweaks, and this delay went away.

But in this case, what I'd like you to take away from this trace is that little teeny delays can add up. So it's not always going to be this big honking 60-second thing that's just clear as day to spot, right? Sometimes it's going to be down to these smaller delays that persistently happen, and those are the ones that you want to take a look at. And again, something like this, [inaudible] time reference. This is interesting: [inaudible] the Stevens graph as well [inaudible]. I just click on a packet going in that direction, from server to client, and we're going to see this graph, and you see all these little delays. So basically those are sequence numbers, so that's data coming to me, and then a wait, an application delay. So those are also some things you can find with applications. And any time you can, get in and take a look at what exact call is going on, and then work with your DBA and figure out what exactly is causing the [inaudible] to be held up.

All right guys, I do have another example to show you; it's one of those TCP weirdness ones that I found. It slows things down, and it gets down to the nitty-gritty of TCP. I've
run out of time in this session. I'm going to be repeating this session tomorrow afternoon, but instead of going over the SQL stuff I might just go straight into that TCP weirdness, that's what I like to call it. So anyway, I appreciate you guys coming, and enjoy the rest of your Sharkfest today. [Applause]
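
The per-connection delta idea from the talk (sort on the time since the previous packet in the same TCP conversation, rather than the previous packet in the whole capture) and the point that many small delays add up can be sketched in a few lines of Python. This is only an illustration and was not shown in the session: the function names, the (timestamp_ms, stream_id) tuple layout, and the sample numbers are all assumptions made up for the example, and a real analysis would start from an actual capture rather than a hand-written list.

```python
def per_stream_deltas(packets):
    """Return (delta_ms, timestamp_ms, stream_id) for each packet.

    packets: iterable of (timestamp_ms, stream_id) tuples in capture order
    (a hypothetical layout chosen for this sketch).
    delta_ms is the time since the previous packet of the SAME stream,
    and 0 for the first packet seen on a stream.
    """
    last_seen = {}
    rows = []
    for timestamp_ms, stream_id in packets:
        previous = last_seen.get(stream_id)
        delta_ms = 0 if previous is None else timestamp_ms - previous
        rows.append((delta_ms, timestamp_ms, stream_id))
        last_seen[stream_id] = timestamp_ms
    return rows


def slowest(packets, top=3):
    """Largest per-stream gaps first, like sorting on a delta column."""
    return sorted(per_stream_deltas(packets), reverse=True)[:top]


def total_delay_ms(packets, threshold_ms):
    """Sum every per-stream gap at or above threshold_ms.

    Illustrates how many small delays can add up to a large total.
    """
    return sum(d for d, _, _ in per_stream_deltas(packets) if d >= threshold_ms)


if __name__ == "__main__":
    # Two interleaved conversations, "A" and "B" (made-up sample data).
    sample = [(0, "A"), (100, "B"), (200, "A"), (3200, "A"), (3300, "B")]
    for row in slowest(sample, top=2):
        print(row)
    print(total_delay_ms(sample, threshold_ms=200))
```

In the sample, stream B waits 3200 ms between its two packets, but that gap would be hidden if you only looked at time since the previous packet in the capture, where the gaps are 100, 100, 3000 and 100 ms. Tracking the last timestamp per stream is the design choice that surfaces it.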
Info
Channel: Chris Greer
Views: 15,268
Keywords: Wireshark, Sharkfest, TCP analysis, tcp/ip, wireshark troubleshooting, wireshark tutorial 2020, wireshark tutorial, wireshark training, wireshark analysis, tcp analysis, slow network, free wireshark training, free wireshark tutorial, how tcp works
Id: XbvZePFcTME
Length: 74min 7sec (4447 seconds)
Published: Mon Jul 17 2017