Netdev 0x13 - QUIC Tutorial

Captions
I will get started now. Welcome, and thank you for showing up. I will be talking about QUIC, and if you were at last year's Netdev, some of that material is rehashed here, but hopefully not too much. The plan is a general introduction to QUIC, an overview of what QUIC is, in this part before lunch. After lunch I'll dive into some of the details of the protocol, dive into an open source implementation, and talk about how you might write a sample application. If I have enough time, I'll do a code read-through of a sample application, show you how I run it, and show you some packet traces from that run. So this first part is basically me talking through things, and the second part will be more hands-on if you're interested.

This tutorial was supposed to be me along with Ian Swett from Google, but he decided to pass on this and stay home with his family, though he is going to be here next week for the IETF. I'm kidding about that. Unfortunately you're stuck listening to my voice for an hour and a half; hopefully it's not too boring, and I'll try to keep it quick and moving fast. I work at fastly, and my colleague Kazuho, who also works at fastly, helped me with a bunch of the material here, so much thanks to him.

How many of you are familiar with QUIC? How many of you have never heard of QUIC? All right, good enough. If you haven't, don't be ashamed; you were right to raise your hand. Google it, look it up.

I'll start with a very quick history. QUIC is basically a protocol for HTTP transport, and it was deployed at Google starting in 2014. Full disclosure: I used to work at Google, I moved to fastly about a year ago, and some of this is data that was generated while I was at Google. The protocol was deployed between Google services and Chrome and mobile apps: Google had control over all the servers, and also over Chrome and over first-party mobile apps like the YouTube app and the Google Search app on Android. QUIC was deployed as an application-level transport sitting on top of UDP on both sides. The performance benefit we started to see was so good that we turned it on for basically 100% of what we could, which meant that more than 35 percent of Google's egress traffic in bytes was QUIC. These numbers are from two years ago, but yes, it improved application performance, specifically YouTube video rebuffers and Google Search latency, by quite a bit, as you can see. These numbers are substantial partly because they are not microbenchmarks; they're end-to-end measures. Changing a transport protocol changed the end-to-end, application-level quality-of-experience metrics significantly. And Google Search latency: it's no small task to change one of the most optimized applications on the planet, and being able to move it by that amount was quite significant. Since then we formed an IETF working group to standardize QUIC, and I'll talk a little more about that later, but that became the goal.

So what are we talking about? Let's go back to basics for a moment.
That's our traditional, well-known, well-understood HTTPS stack, the one that half of this room tries to optimize TCP for, because let's face it, what is TCP being used for if not HTTP? Practically nothing else. That's the stack that we know and love and use the most, and what QUIC does is effectively replace a big chunk of it. It replaces TCP entirely. It doesn't replace TLS, but it encapsulates it in an interesting way, and it replaces a whole bunch of what HTTP/2 provided for HTTP, so it subsumes a lot of that functionality. Running HTTP on top of QUIC is now HTTP/3. The version of HTTP that runs on top of QUIC is quite different from HTTP/2, because, as I said, much of that functionality has been subsumed into QUIC. So the mapping of HTTP onto QUIC is HTTP/3, and TLS 1.3, as you can see here, is effectively embedded in QUIC. I can talk more about this mapping, but I'll move on; if you have questions, please feel free to raise your hand. I have a mic here that feels very lonely.

These are the IETF internet drafts. You're welcome to go read them if you feel like you're unable to fall asleep, or if you enjoy hurting yourself badly.

I want to very briefly start with what I call the HTTP story, just looking back a bit to place all of this in context. Where is QUIC coming from, and why are we working on it? HTTP goes back further, of course, but HTTP/1.0 and 1.1 happened in the late 90s, and then there was the HTTP-NG effort: a working group and a big push in the late 90s and early 2000s. Anybody remember the HTTP-NG effort? Anybody here a part of that effort at all? No? There was a pretty big push at the time to try to figure out how to solve the problems that were being seen in HTTP, and I won't list them all here. Eventually, much later, HTTP/2 showed up a few years ago as a standard, and HTTP/2 introduced notions such as streams and multiplexing of objects.

Now, all of you are familiar with the TCP problem of head-of-line blocking. TCP serializes everything that is sent down a connection, and if there's a loss, the receiver holds on to packets that are received later and cannot deliver them up to the application until the loss is repaired. In HTTP you typically send a whole bunch of different objects down the same connection, and even if the lost packet does not belong to the same object as subsequent packets, those subsequent packets are still held in the TCP buffers. This is the head-of-line blocking that TCP introduces. HTTP-NG was trying to solve these problems, and some of them got solved with HTTP/2 a few years ago: streams and multiplexing allow multiple objects to be sent on the same connection, interleaved, with flow control per stream and priorities across multiple objects. The idea in HTTP/2 was to use one connection more efficiently for multiple objects. With HTTP/1.1, objects are sent back to back: a whole object is sent, and subsequently the next object is requested and sent behind it. HTTP/2 interleaves multiple objects, so you can serve multiple requests, multiple objects, at the same time, and by "at the same time" I mean multiplexed on the same connection.

But there's a big gap right there, from the 90s
to a few years ago. A massive gap; not a lot really happened in the HTTP community in all that time.

Now, you folks might be more familiar with this timeline. Anybody remember T/TCP? Oh come on. I'm seeing some nods; that's good enough for me. In the transport world we were trying to solve similar problems. T/TCP was one of the things proposed, back in 1994. Can anybody tell me what T/TCP solved? Yep: T/TCP basically eliminates the handshake latency. It was going after a particular type of traffic, small objects, because that's where handshake latency matters the most. And guess what the web is made of: a lot of small objects. So that handshake latency really hurts for web traffic. HTTP/2 went after the same problem in a different way, by putting more objects into the same connection, the same thing HTTP/1.1 did. Anyway, T/TCP was trying to eliminate handshake latency, and that idea effectively jumped ahead and showed up again in 2009 as TCP Fast Open, which you may be familiar with; it's basically doing the same thing, it has just become more important to us now. There were other proposals as well that were trying to make TCP better, not even focused only on web traffic, but effectively all of these efforts were trying to make web traffic work much better with TCP.

Then there's a big gap again, some proposals, and TCP Fast Open happens in 2009. Can anybody tell me how widely TCP Fast Open is deployed? Actually, how many people here know TCP Fast Open? I should ask that question first. Okay, good. How widely is it deployed? I know Praveen can answer this question. Not very widely, you say? Praveen agrees: it's not very widely deployed. 2009, and it's now 2019, ten years later, and it's not very widely deployed. What happened?

And what happened in that earlier gap? We had SCTP just before it. Anybody here familiar with SCTP? A little bit; enough. How much SCTP do we have today? Actually a lot, if you count all the bits, because if you're making a phone call right now, you're using SCTP in the telephony signaling path. But on the internet, you're not using SCTP. What happened in all of these spaces? What kept TCP Fast Open from getting deployed? One word. "Middleboxes," says somebody, and yes, you get lunch; everybody else has to stay while I walk you through what happened.

Middleboxes happened. Well, they didn't quite happen during that time; they've always been around. It's just that they became a real problem during that time. What are these things? They're basically intermediary devices that look past the IP header; that's a very basic definition. And they are pervasive. They're everywhere. You're using one right now and you don't even know it. Home routers, firewalls: every packet that goes over the internet today most likely touches at least one middlebox.
These things basically destroyed, in some ways, our ability to deploy new transports and new changes, and we didn't have a good way of handling that problem. We tried to deploy changes, and we still try; TCP Fast Open is a great example. We want to deploy these changes, but it's really hard to deploy new changes on an internet that is fundamentally ossified by middleboxes. Middleboxes expect a certain type of behavior from various protocols, and if those protocols change, they break in unknown ways, and that's a huge problem.

With that, I'll switch to what we wanted QUIC to do when we started building it. We wanted QUIC to be deployable today. We didn't want to wait for NATs to come on board; we didn't want to wait for middleboxes to come on board; we wanted to deploy it today, and we wanted it to be evolvable as well. So we put it in user space on top of UDP, so we could deploy it right away. With a new kernel transport like SCTP, we could wait another 20 years and then observe that it's still not deployed; or we could deploy on top of UDP, basically giving up the battle of deploying new things directly on top of IP, and actually have something. So we tried, and we did, and we got something. We deployed it on top of UDP, and that was a very deliberate design choice. It's in user space, which allowed us to iterate much more rapidly at the endpoints, and that was also a very deliberate design choice. And we wanted it to be evolvable, so we encrypted and authenticated the headers.

Why are evolvability and encrypted-and-authenticated headers in the same bullet point, in the same conversation? You don't get very many points for this, but you'll get some if you answer. "Because middleboxes can't read them." You get another lunch, and you're running out of your quota, by the way. Exactly.

I'll explain this with a story. What we now call gQUIC is Google's QUIC. Google's original version of QUIC is now called gQUIC because the IETF is standardizing QUIC, and the IETF version is actually quite different from the original. We didn't want to call the IETF version "iQUIC" alongside "gQUIC," because the IETF version is the thing we want to stay forever; so we call the IETF one QUIC, and Google's gQUIC. If I haven't confused you enough, that's all right.

The first byte of every gQUIC packet was a byte called flags. You know what flags are used for: they're basically booleans, and you ought to be able to flip them back and forth; that's the goal. This byte was unencrypted, because the receiver had to be able to see it to figure out how to parse the rest of the packet. And it had been 7 for a little while, because we hadn't flipped any bits; by a little while I mean about nine months to a year in which it had not changed very much. So we did what you do with flags: we flipped a bit. And then everything went to hell. We got a P0 call saying that with the new Chrome version that had just launched, users couldn't reach the internet over Chrome, couldn't reach any Google property over Chrome. We were able to quickly bisect it down.
We figured out this was because of QUIC, and by the way, the users were saying "we can't reach Google over Chrome, but we can reach it just fine using Firefox." That's when it really hurts, because Firefox wasn't doing QUIC and we were. Long story short, it took us a while, but with the help of a bunch of really helpful users we narrowed it down and figured out that the common thing causing this breakage in various places was the presence of a particular vendor's firewall.

So what was this firewall doing? After much digging, we found that it was allowing the first packet of an unknown protocol through in both directions, and then it would decide "this is not something I want to allow" and drop all subsequent packets. This is particularly problematic for QUIC because of the way Chrome detects whether QUIC works: Chrome tries a QUIC handshake, and if the handshake succeeds it says "okay, I'm going through with QUIC"; if the handshake fails, it falls back to TCP. Of course, this firewall allowed the QUIC handshake through, so Chrome decided QUIC worked, and then the firewall blackholed everything. This was the worst of all worlds for us, and a big problem.

Why did the firewall do this? Because the packets it was now seeing did not fall into its category of what QUIC was. So how did it classify QUIC? The vendor opened up Wireshark (our docs weren't public at this time), looked at a trace for a little while, and said: guess what, the first byte is always the same. Guess what the classifier looked like. This was verified, more or less. So this is what we had to deal with: one byte that was open, not encrypted, and we had this massive issue, with a protocol that was entirely proprietary, deployed only within Google, and not carrying very widespread traffic at the time, since we hadn't yet ramped up traffic to the extent that we did afterwards. But that's what we dealt with, and that's one of the times we realized we had made the right decision. The only way you protect a protocol, and protect the evolvability of any protocol and of the internet, is by using encryption.

This is not to say that middleboxes aren't doing useful things (although if you catch me off in the corridors, I will tell you why they are not doing useful things), but for the purposes of this conversation, the point is that there are unintended consequences to what they do, and they don't necessarily take those into account. Ossification is one of those unintended consequences: they've stepped on things to the extent that we cannot evolve protocols in the ways that we want. That is why encryption figured as the top item in our design maxims, so to speak.
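As an aside, the classifier in that story was reportedly nothing more than a match on the one stable cleartext byte. Here is a sketch of that kind of logic; the byte value 0x07 echoes the "it had been 7" detail above, but treat it as illustrative rather than the verified gQUIC wire value:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A sketch of the kind of ad hoc classifier the firewall vendor
 * reportedly built: it keys on the one unencrypted byte that had
 * happened to stay constant for months. Illustrative only. */
static bool looks_like_gquic(const uint8_t *udp_payload, size_t len)
{
    /* "The first byte is always the same"... until it isn't. */
    return len > 0 && udp_payload[0] == 0x07;
}
```

The moment the flags byte changed, packets stopped matching, fell into the "unknown protocol" bucket, and were blackholed after the first packet; exactly the failure mode described above.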
Then, of course, performance was important. Somebody's got to pay for you to build this stuff, and reducing latency, as it turns out, pays for building this stuff. We wanted mostly 0-RTT, and sometimes 1-RTT, connection establishment. If you understand TCP Fast Open and TLS 1.3, this is roughly the same idea. Of course, as I said, TCP Fast Open has deployment issues; TLS 1.3 came afterwards, and that's something we've since incorporated into the IETF work on QUIC. So this was an important characteristic. And as I noted earlier, some of the features of HTTP/2 that were necessary and very useful for web traffic, streams and multiplexing, became part of QUIC.

We also wanted to do better with loss and congestion control. Or at least, we didn't think we could necessarily do substantially better on congestion control itself, but we could make it possible and easier for people to play with congestion controllers, and we built the stack in such a way as to allow that. You'll see a bit more of this later, but QUIC's packet design allows much more efficient loss recovery, and it gives the congestion controller a lot more information about how to assess network usage.

Additionally, again because of NATs, and because the network doesn't like UDP very much, we have a connection ID that's part of the transport header. A point of clarification here: how many people think UDP is a transport? Tom, don't raise your hand. You can raise your hand, it's okay. It's not; sorry, I fooled you there. UDP is not a transport; it's a way to build one. And Tom is shaking his head going "RFC 1122 says it is," says Tom, and its four pages say, says me, that it's not a full transport. Anyway, we can have that debate later. The point is that UDP is simply a way to build something on top of it that I would call a full-fledged transport. UDP has port numbers, and there are issues with NATs and rebinding. So we have a connection ID in the QUIC protocol itself that lets us not care as much about the four-tuple; the connection ID identifies the connection. It's part of the packet header and it's authenticated, so nobody can mess with it. It was primarily meant for NAT rebinding: NATs understand TCP, and they hold on to state for a TCP connection, but they don't understand QUIC or UDP, and there's no notion of a UDP connection, so they drop state whenever they decide to, and we have to be willing to deal with NAT rebindings. That also gives us mobility, basically: the holy grail of things we've been trying to do for a long time, MPTCP for example. Not quite for free, but you get it occasionally.

I'm not going to talk more about the internals of QUIC here; I'll talk about packetization details and so on after the break. But I want to show you roughly why we care about this. We've talked about head-of-line blocking, and those latencies might seem small, and it might seem like, who are we really solving these problems for? This should be instructive. This shows QUIC's performance improvement by geo, by
country, and this is for YouTube video: the reduction in rebuffer rate. For South Korea, which has a min RTT of about 38 milliseconds and a TCP retransmit rate of about 1%, the reduction in rebuffer rate is modest: modest on mobile, actually, and basically non-existent on desktop. If you've seen bandwidths in Korea in general, you'll understand why it's hard to do anything to improve performance there; it's super, super good. If you get to a place like the US, which is a little crappier than South Korea, you start to see more improvements. And when you get to a place like India, which has really sucky connectivity, that's where QUIC shines. The value of QUIC is in making connectivity much better where it really, really sucks.

So how have we been working on this over the past couple of years? We've been working on it at the IETF, with a large number of folks across the board, and it's coming to a stable point; I'll say that unofficially, informally, on tape. It has come along quite a bit since we started. The focus on security and privacy has gotten even stronger over the past couple of years, and that's rubbed a lot of network operators the wrong way. I shouldn't say rubbed them the wrong way; we've had a lot of contention within the working group, because network operators are used to seeing a certain amount of transport information, and QUIC basically hides everything. So there's been contention, discussion, debate, but it's all been healthy, I'd say. And there's been a very strong focus on avoiding ossification, through encryption and through greasing, which is a mechanism I won't get into here.

What's the world like right now in terms of implementations? There is a large number of implementations working on building and interoperating with QUIC, and we're going to have a number of them at the interop this weekend at the IETF. Apple has a couple of implementations, as it turns out; one of them is for ATS, the Apache Traffic Server, and the other is the proprietary one. fastly has our own implementation, called quicly, because we are cute with names like that. Facebook has its implementation, Firefox has an implementation, F5 has one, Google's is Chromium, which is open source, Microsoft has theirs, which is not open source, there's LiteSpeed, and there's quic-go, written in Go, for the Caddy web server. So there are a number of implementations out there, all working on interoperating with each other, and all spending a lot of effort and energy building up to speed.

We will be looking at quicly here. I won't go into the details of how quicly implements QUIC, but I'll talk about the surface of quicly, how an application might use it. It's going to be a little tenuous, but we'll see how it goes. As it turns out, because all of this is being built in user space, as libraries, there's no single socket-like API; each implementation has its own. One of the things the organizers asked me when arranging this talk was: can you do a simple thing where people write a hello-world program, like an intro to sockets? And I'm like: there are no sockets here, in the sense that there is no unified API. But the one I'm going to show you is quicly. It's available open source,
and you can play with it afterwards if you'd like. A number of implementations, Chromium for example, are built for their own HTTP stacks, so the QUIC part is not really separable from the rest of the HTTP stack; the surface it exposes sits on top of HTTP. You do a get-URL request on top of the Chrome networking stack, but you can't really pull QUIC out and say "I want an API that simply uses QUIC, so I can build my own application on top of it." That's much harder to do.

That's about it for our pre-lunch time. It's 12:33, sorry, three minutes past when I said I'd end. I'll see you back here at, what time is it, 1:20? Anybody? Okay, at 1:20 we'll pick it back up from here and get into more interesting stuff. Thank you all.

All right, I think we have quorum here; shall we start? I hope you've had your coffee. Don't blame me later if you fall asleep. I want to pick up where we left off. Any thoughts or questions about what we've discussed so far? Please feel free to raise them. Honestly, I love it when we're just having a conversation, because ultimately, to me, the value of being in the same room rather than watching a YouTube video is that you can actually throw peanuts at me, and I have the mic, so I get to say things back. So please, ask questions; throw peanuts.

I'm realizing there's a very wide mix of people in this room, so I'm going to change the plan slightly. I'm not going to do a full code read-through; I'll show the code, and I'm happy to discuss the details. I'm going to go over the API, and I'll go over the code, but not in substantial detail, because I think other discussion elements might show up later. I'm going to show you some tooling, and I'm happy to engage with all of that, plus any questions about the IETF, in Q&A later. This session might also run a few minutes past the end time, since we started a little late, but I should be done by 2:30 at the latest.

So, this is where we left off. We were talking about QUIC, and about the fact that there are different implementations, each with its own API for using QUIC; some of them don't even have a well-defined API at the moment. This is all evolving, though. As these implementations mature, there's been talk, and we've talked about this in the past, of having some standardized notion of an API. I don't think it'll be as rigid as the POSIX API, but it could certainly be a broader, abstract API that all implementations implement a version of. So we might go there; we're certainly not there yet. If you'd like to play with QUIC now, just expect the road to be slightly rough.

I'll give a brief introduction to quicly itself. quicly is the fastly implementation of QUIC, and it was written for the h2o web server. How many of you are familiar with the h2o web server? It's basically a super-fast HTTP/2 server, released in 2014, and it's primarily optimized for HTTP/2. By that I mean it tunes the TCP send buffer to be minimal, to minimize latency, and it has fine-grained HTTP/2 prioritization
and implements some upcoming features in HTTP/2, so it's pretty cutting edge with HTTP/2 and really focused on HTTP/2 performance. I should plug this; there you go, thank you.

As I pointed out earlier, QUIC is encrypted by default; it's always encrypted, and quicly uses a TLS library to encrypt, as you'll see in the slides in a little while.

(Question about ossifying UDP.) There isn't very much to ossify in UDP in terms of features or extensions to UDP itself; the extensions we want to make are to QUIC, and we're keeping QUIC from getting ossified by encrypting everything. The hope is that going forward QUIC will remain un-ossified, although I have no doubt that middlebox vendors will find clever ways of ossifying something that is fundamentally opaque to them. It's possible; we'll have to see.

(Question: what level is encryption happening at?) Encryption happens at the QUIC level and up. All of the QUIC packet headers are encrypted, and the data inside QUIC is encrypted; the UDP header is not encrypted, but everything inside UDP basically is. (Question: is encryption done on a per-packet basis?) Yes, it's per packet. The reason you need it per packet is that you have to be able to handle packets received out of order; otherwise you reintroduce the head-of-line blocking problem.

Moving along: quicly uses picotls as its library for doing SSL, for doing TLS, I'm sorry, I'm showing my age, and TLS 1.3. picotls supports various new features, such as encrypted SNI, which is going to be a lot of fun when it hits the wires, by the way, and certificate compression, and it supports QUIC. It has a very interesting API, different from OpenSSL's, in that it specifically supports a codec-style API: you give it plaintext, and it spits back the encrypted text. This is different from OpenSSL, where you simply write, and OpenSSL then writes into the socket below. This is quite useful because it's modular: picotls is modular, and in QUIC that's very useful, because quicly will hand data to picotls, get the encrypted text back, and then dump it down into the UDP socket. We'll get into that in just a moment.

quicly is written in C, and it was written for h2o, as I said; I've already covered these points. The quicly API does minimal buffering, and it supports a codec-style API as well: both picotls and quicly have this codec-style API, and that means something very particular. It's also bufferless; I'll talk about that in a moment. (Question: what's a codec-style API?) A codec-style API is one where you give it raw data and it gives you back the encoded data, as opposed to a layered API, where you write to it and it writes to the layer below. It's a more modular API.
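To make the contrast concrete, here's a minimal sketch of the two API shapes. The picotls calls are written from memory and simplified, so treat the exact signatures as illustrative rather than authoritative:

```c
#include <unistd.h>
#include <openssl/ssl.h>
#include "picotls.h"

/* Layered API (OpenSSL-style): the library pushes the record down
 * into its BIO/socket for you. */
int send_layered(SSL *ssl, const void *data, size_t len)
{
    return SSL_write(ssl, data, (int)len); /* I/O happens inside the library */
}

/* Codec-style API (picotls-style): plaintext in, ciphertext out into
 * a caller-owned buffer; the caller does the I/O itself. */
int send_codec(ptls_t *tls, int fd, const void *data, size_t len)
{
    ptls_buffer_t sendbuf;
    ptls_buffer_init(&sendbuf, "", 0);
    int ret = ptls_send(tls, &sendbuf, data, len); /* encrypt only, no I/O */
    if (ret == 0)
        (void)write(fd, sendbuf.base, sendbuf.off); /* caller owns the socket */
    ptls_buffer_dispose(&sendbuf);
    return ret;
}
```

This shape is why quicly can sit in the middle: it asks picotls for ciphertext, wraps it into QUIC packets, and hands those back up to the application, which owns the UDP socket.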
(Question: Google controlled both ends of the connection, so it could deploy QUIC on both ends; how does that work for fastly?) That's interesting. It's not a quicly problem as much as a fastly problem, because fastly owns only the servers. We're basically the CDN; we're only on the serving side, and we don't own the client side. This is why a place like the IETF is really useful: you get clients to implement and deploy QUIC as well. So we're working with clients who are deploying IETF QUIC, and we're deploying IETF QUIC at the servers. Yes, we rely on clients implementing it. quicly itself contains both a client and a server, for testing and other purposes, but it's fundamentally being built for the server.

(Question: if it's bufferless, how does it handle retransmissions?) Excellent question; I'll show that in a second. The buffer is effectively held by the application, and QUIC goes all the way up the stack to pull data. The value of being bufferless is a much smaller memory footprint. If you're a server serving a very common object, for example, then instead of scaling that memory with the number of connections, you keep just one copy of the object. It also allows better HTTP header compression, because if you serialize at the last moment, your QPACK dictionary state is the most up to date, so you get more efficient compression. And it allows better stream prioritization, because you're not committing data before it's literally time to put it on the wire, so you can choose whichever stream is the highest priority at that moment. That's very useful for HTTP priorities.

The key is that the application defines the send and receive buffers, and QUIC simply uses the application's buffers and calls into the application. I don't want to say "upcalls," because it's not strictly layered in that sense, but it's basically an upcall into the application to get data when QUIC needs to send. This works well because it's all in user space; there's no real context switch here.

Briefly, here's how h2o typically sends data in each case. When doing HTTP/2, h2o pushes data into the TCP send buffers, across the kernel boundary, and the TCP sender eventually fetches data from the send buffer when it's able to send, packetizes it, and sends it. With QUIC it's quite different. The architecture lets the application maintain state per connection, and when it's time to send data, be it a new transmission or a retransmission, it goes back into the file server, or the file system, or the file cache, grabs that chunk of data, literally packetizes it, encrypts it, ships it down the wire, and it's gone. There is no intermediate buffering from the source of the data all the way down; there really isn't any place the data is buffered. It's one stream through. And the fetch up into the file cache happens only when there's permission to send at the QUIC level, meaning QUIC has congestion window available and this is the stream we want to send from; then it grabs the data, packetizes it, encrypts it, sends it.

(Question: is the codec-style API what makes this possible?) Sure; I can't speak to the full design rationale, but I wouldn't say the codec style is fundamental to this. You're totally right about that.
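Here's a small, self-contained sketch of that pull model on the send side. The callback names echo quicly's stream callbacks (on_send_emit, on_send_shift) as I recall them, but the shapes here are illustrative, not the exact quicly prototypes:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One shared copy of a cached object; no per-connection send buffer. */
struct file_cache_entry {
    const uint8_t *bytes;
    size_t len;
};

/* Transport -> application: "give me up to *len bytes of this stream,
 * starting at offset off." Data is copied straight into the packet
 * being built. Assumes off <= e->len. */
static void on_send_emit(struct file_cache_entry *e, size_t off,
                         void *dst, size_t *len, int *wrote_all)
{
    if (*len > e->len - off)
        *len = e->len - off;
    memcpy(dst, e->bytes + off, *len);
    *wrote_all = (off + *len == e->len);
}

/* Transport -> application: "bytes up to delta have been acknowledged;
 * you may release them." With a shared cache there is nothing to free
 * per connection; just advance a cursor. */
static void on_send_shift(size_t *unacked_off, size_t delta)
{
    *unacked_off += delta;
}
```

A retransmission is then just another on_send_emit call for an old offset: the application still has the object, so nothing needed to be buffered inside the transport.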
The other piece the quicly API deliberately doesn't do is send packets on UDP; the application has to do that. This is part of the same codec-style API: the application owns the buffers, and the application owns the sockets. It grabs data and ships it to quicly, quicly hands back packets, and the application takes those and dumps them into the socket. quicly provides some helper functions to accomplish these ends.

This is roughly how h2o itself looks. There's h2o in the middle; the left side of the picture is the parts used for HTTP/1 and HTTP/2 serving, and the right side is for HTTP/3 and QUIC. As you can see, h2o, the application, owns the sockets: it talks to picotls on the side and dumps things down into the socket, and it does the same for QUIC and HTTP/3.

(Question: is this model unique to quicly, or are other QUIC implementations using the same architecture?) I don't think it's unique, but I also don't think it's super common, partly because the API actually comes from using TLS while trying to minimize latency for HTTPS, and QUIC just fits into that same model. The codec idea I was describing, where the web server owns the socket underneath, as you can see on the left side of that picture, is true for HTTP/2 over TLS as well. That's the h2o architecture, and it exists to minimize latency in the stack below. With OpenSSL, the application doesn't have as much control: OpenSSL controls record boundaries and how long to wait before dumping a packet down. So it's not fundamentally architecturally unique; you can get those latency benefits from other architectures too. It's just not the case right now. h2o has been trying to optimize for latency, and this architecture allows for that.

Now I'll talk very briefly about what you'd expect to see in the quicly API; we'll see the actual functions in a moment. At the client, you want to be able to connect to the server; at the server, you want to be able to accept an incoming connection from the client; and at both ends you want to be able to create streams. To be clear, for those of you who may not have gathered this yet: streams are abstractions, and the way to think about them from an API point of view is that an endpoint creates a connection with another endpoint, and then within that connection the endpoint creates streams. I can open a stream to the peer within the connection. Streams are ephemeral things, and you use a stream to write: you can't write without opening a stream first, and any data an application writes has to be written on a stream.

(Question: are streams unidirectional or bidirectional?) QUIC has both kinds: bidirectional and unidirectional streams. (Question: are streams negotiated, or does the client just write data?) Streams are not negotiated.
But the maximum allowable number of streams is something that both endpoints advertise, and there's flow control for streams: there's a way to say "okay, I'm going to allow more streams now," so both endpoints can keep pushing that limit higher, and under that limit an application is free to open as many streams as it wants.

(Question: what about congestion control; is that stream-based or connection-based?) Congestion control and the entire loss-recovery machinery are connection-level constructs, and we use them for connection-level things, not for stream flow control. Flow control, however, operates at the stream level, and there is also a connection-wide flow control. So there's connection-wide flow control, and there's per-stream flow control.

So this is roughly, this is exactly, sorry, the API that quicly offers, and I'll go over it briefly; I have links, and you can go look at the code examples as well. This is the stream-level API first; I'll show the connection-level API after. You use quicly_open_stream to open a stream, and there's a registered callback that gets called when a stream is opened by the other side. Similarly, there's a stream-level callback that says "I now have space to send; fill data in here." This is the bufferless API I was talking about: the application registers a callback that quicly calls into when there's space available for new data. And there are callbacks around acknowledgment. Remember that the application is the one maintaining all the buffers; quicly doesn't have any. So there's a callback to say "get rid of this data, it's been acknowledged by the peer, you don't need to maintain it anymore." Applications can also call into quicly and say "you asked for data earlier and I didn't have any, but there's more data now." All of these operate at the stream level, because the expectation is that ultimately a consumer and a producer of data operate at the stream level. That can be one application, but the expectation is that you register one callback per stream.

On the receive side, there's a callback to receive data: when data arrives on a stream, that callback delivers the data. And there's a call from the application to say "I've removed this data from the receive buffer," so you can move your flow-control window forward; that's a signal down from the application saying "I've finished consuming this data." Closing streams is straightforward: there's a call from the application to close the stream, and a callback that says "yes, I've cleared out all the stream-related state."
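Here's a toy, self-contained sketch of that receive side: the delivery callback, plus the "I've finished consuming this data" signal that lets the flow-control window slide forward. The names and shapes are mine, for illustration; the real hooks live in quicly.h:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A toy receive stream: a fixed window of reassembly space. */
struct recv_stream {
    uint8_t buf[65536];
    size_t consumed;  /* bytes the application has released so far */
    size_t max_data;  /* flow-control limit we currently advertise */
};

/* Transport -> application: data arrived at byte offset off; it may
 * be out of order. Assumes the data fits inside the advertised
 * window. */
static void on_receive(struct recv_stream *s, size_t off,
                       const void *src, size_t len)
{
    memcpy(s->buf + (off - s->consumed), src, len);
}

/* Application -> transport: "I've finished consuming nbytes."
 * Returns the new limit to advertise in a MAX_STREAM_DATA frame,
 * which is what moves the peer's window forward. */
static size_t stream_consumed(struct recv_stream *s, size_t nbytes)
{
    s->consumed += nbytes;
    s->max_data = s->consumed + sizeof(s->buf);
    return s->max_data;
}
```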
Now we move on to the connection-level stuff. This is where quicly operates not at the stream level but at the packet level. First there's quicly_send; we're in the process right now of figuring out whether we should rename some of these functions. quicly_send is called to say: crank the wheel. See if there's any data to send, and send it; or if there are retransmissions to send because a timer has fired, and so on. This is what the event loop calls into. How often should the application call quicly_send? Whenever it has data: it has put data on a stream and can call quicly_send to say "I have some data to send; go ahead and send it." Or when quicly's timer fires: the application is expected to call quicly_get_first_timeout to ask "when should I call quicly_send again?", and quicly will return a time based on retransmission logic, delayed-acknowledgment logic, and so on. quicly maintains the state for what to do when the timer fires, but the timer itself lives in the application, and the application is the one driving the event loop forward. That's what those two functions are for. On receiving packets, quicly offers quicly_decode_packet, so you grab a packet off the socket and decode it, and quicly_receive, which takes a decoded packet and processes it.

(Question about link-aware optimization: the CDN controls the server and knows the link characteristics of end users; is that handled on the server side or the client side?) I should probably make one thing clear, because I've heard this twice now. The goal here, with the send side and the receive side, so to speak, is not to have quicly on both ends of a connection. quicly is just one implementation. Just as a TCP implementation contains both server and client code, we have everything in here, and send and receive are functions available at both endpoints. But the client side of a real connection is unlikely to be our server implementation; commonly we expect browsers to be the client-side implementation. The optimization you're describing would happen at a layer above this API; that's a protocol-level question, and we can talk about it more later. To be clear, everything I'm showing here is a full quicly implementation, and what we're working on is interoperating with other implementations, so that our server works with other clients and other servers work with our test client.
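Putting those four connection-level calls together (quicly_send, quicly_get_first_timeout, quicly_decode_packet, quicly_receive), the application's event loop looks schematically like this. These functions do exist in quicly, but their real prototypes take more arguments and have changed across versions; the declarations below are deliberately simplified stand-ins, a sketch rather than the actual quicly API:

```c
#include <stdint.h>
#include <sys/socket.h>

typedef struct quicly_conn quicly_conn_t;

/* SIMPLIFIED stand-in prototypes: the real ones in quicly.h differ. */
int64_t quicly_get_first_timeout(quicly_conn_t *conn);
int quicly_send(quicly_conn_t *conn, void *dgrams_out, size_t *num_dgrams);
int quicly_decode_packet(void *decoded_out, const uint8_t *buf, size_t len);
int quicly_receive(quicly_conn_t *conn, void *decoded);

void event_loop_iteration(quicly_conn_t *conn, int fd)
{
    /* 1. Crank the wheel: let quicly emit whatever it wants to send
     *    (new data pulled via the stream callbacks, retransmissions,
     *    ACKs). The application, not quicly, does the socket I/O. */
    uint8_t dgrams[10 * 1500];
    size_t num = 10;
    if (quicly_send(conn, dgrams, &num) == 0) {
        /* ... sendto(fd, ...) for each returned datagram ... */
    }

    /* 2. Ask when to call quicly_send again even if nothing arrives;
     *    loss and delayed-ACK timers are computed inside quicly, but
     *    the timer itself is the application's. */
    int64_t timeout_at = quicly_get_first_timeout(conn);
    (void)timeout_at; /* arm poll/epoll with this deadline */

    /* 3. When the socket is readable, decode and feed packets in. */
    uint8_t buf[1500], decoded[2048];
    ssize_t n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
    if (n > 0 && quicly_decode_packet(decoded, buf, (size_t)n) == 0)
        quicly_receive(conn, decoded);
}
```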
I'm not going to do the full code read-through, because I have 35 minutes and I think the subsequent things will be more interesting. Go find that code and try it out. Actually, you know what, I'll show you how this works; I'll run it. That's basically just the echo server; I'm going to run the echo server over there. Can everybody see that? It's kind of tiny. You don't have to sit and memorize this right now, it's quite fine. The point is that running the echo server requires you to supply a server key and a server certificate, because, as I said, it's always encrypted, and it needs these to set up the TLS server and do the right things. So now the server's running on that side, I run the client on this side, it connects to the server, and there we go. Yay, the first bits of QUIC have flown.

So that's the echo server. You can play with it; it's a very simple implementation that Kazuho put together especially for this. Go take a look; it's a good way to get started if you're trying to write a simple client and server using QUIC. I'm going to move past this and go to tooling, because I know there are thoughts about tooling.

(Question: how does the first packet from the client get encrypted; what's the keying material?) The first packet from the client, on a fresh connection where you've never spoken to the server before, is encrypted, effectively obfuscated, with a static key that's published in the draft. That's not something we would call real encryption. On a subsequent connection, however, and this is the more important case, where we expect zero round-trip time, we encrypt it using the 0-RTT keys we got the last time.

So, tooling. This has been a burning question for a lot of people about QUIC. The first thing I'll say is: there's a Wireshark dissector, so go bonkers; there is tooling available. Isn't Wireshark enough? Not for everybody. It's encrypted, so how is there a Wireshark dissector? There is one, and you're right, the traffic is encrypted, so you need to feed the dissector the keys for it to open up the packets. Without the keys it can still read a few bytes, but there isn't very much in the trace that makes it interesting.

So there's a bit of a problem here: if you want to grab traces, you have to grab the keys as well. And what does it mean, when the payload and the headers and everything else are encrypted, to record traffic? If you're going to use Wireshark, you'll have to record the entire payload of the connection, not just the headers, and then go decrypt it. That has all sorts of issues, especially if you're logging and storing it somewhere: all sorts of PII-type issues and privacy issues. So one of the things we realized is that this is not adequate, and we need something better. We, and by we I mean the broader community, not just me, have been working on endpoint-based packet tracing. The point is that the server and the client are best positioned to describe what's going on in the connection; they have the keys, and they're looking at the packets anyway. So as long as you're an endpoint, you ought to be able to grab traces at the endpoints themselves. This has been gaining a lot of traction, and it's something we have been working on as well. I'll show you some of the tooling being built in the community right now. The idea is that the QUIC server and the QUIC client log packet-level events and various other events, and those logs turn into various
visualizations and other things. There are two tools I'll show you, one in more detail than the other: there's quic-trace and there's qvis, and both implementers will be at the upcoming IETF, so if you're interested, you can talk to them there.

The first one is quic-trace, which comes out of Google. It was written by Victor Vasiliev and others, it's open source, and it's available. You feed it a protobuf, or JSON if you want; there's a converter that turns the JSON into protobuf and feeds it into the tool. It's an awesome tool: it gives you a packet trace that looks very much like the TCP traces you might have seen, and I'll show it in a moment. It is generated entirely by the server process, from logs the QUIC server emits; this is not from a pcap.

The second tool is qvis, written by Robin Marx and others at Hasselt University, and they've done some extraordinary work. It takes its input as JSON and produces the state transitions at the transport and HTTP levels, what's going on on different streams at different times, and a full timing diagram. They can work with partial traces, meaning only one side, or take traces from both the client and the server and pull them together. It's some incredible work, along with various other graphs you might be familiar with from the TCP world. I can't speak to how qvis does with large traces; quic-trace is actually super good with large traces, while qvis I haven't really exercised myself, but there's active development. Pretty much everything you'll find in this world right now is rough-edged, because all of this is work in progress; we're only just stabilizing the protocol format. So things have rough edges, but these are tools we can make work.

Before going into that, I want to talk about exactly what a QUIC packet looks like, so you know what it is we're tracing. This is also a bit of protocol detail that should help you understand how QUIC packets look, how the encapsulation looks, and how packetization works. QUIC packets come in two varieties, because one would be boring, so we have two: there's the long header format and the short header format. The long header is used only during the handshake; it establishes the various things we don't yet know about the peer. I won't go into all the details here, but as you can see, it has a version number, and it has connection IDs, as I mentioned earlier. Once we've negotiated the version and finished the handshake, we don't need a number of these fields, so we can use the shorter header, which includes only a handful of things: a packet number and a destination connection ID, and notably no version. Now, not all of these fields are visible on the wire, to be clear, and not all of them are fixed in time.
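Concretely, the fields an outside observer can parse are few, and they are pinned down by the invariants draft I'm about to mention. A minimal, self-contained parser for just those version-independent fields might look like this:

```c
#include <stddef.h>
#include <stdint.h>

struct quic_invariant_hdr {
    int is_long;            /* first bit: 1 = long header, 0 = short */
    uint32_t version;       /* long header only */
    const uint8_t *dcid;    /* destination connection ID */
    size_t dcid_len;
};

/* Parses only the version-independent fields. For short headers the
 * DCID length is known from context, because each endpoint chose its
 * own connection IDs; an on-path box does not get that luxury. */
static int parse_invariants(const uint8_t *p, size_t len,
                            size_t short_dcid_len,
                            struct quic_invariant_hdr *h)
{
    if (len < 1)
        return -1;
    h->is_long = (p[0] & 0x80) != 0;
    if (h->is_long) {
        if (len < 6)
            return -1;
        h->version = (uint32_t)p[1] << 24 | (uint32_t)p[2] << 16 |
                     (uint32_t)p[3] << 8 | p[4];
        h->dcid_len = p[5];
        if (len < 6 + h->dcid_len)
            return -1;
        h->dcid = p + 6; /* source connection ID length and SCID follow */
    } else {
        if (len < 1 + short_dcid_len)
            return -1;
        h->dcid = p + 1;
        h->dcid_len = short_dcid_len;
    }
    return 0;
}
```

Everything beyond these fields is version-specific, which is exactly the point.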
Our expectation is that as QUIC versions happen, many of these things will change. Specifically, there's a draft called the QUIC invariants draft, which spells out which parts of the header, encrypted or not, will remain the same across QUIC versions, and which parts are not guaranteed to. This was basically our way of telling middlebox vendors, or any vendor, what they can go and ossify and what they cannot. If you look at the long header, those flags are off-limits; they can keep changing across versions, though the first bit being 1 will remain the same across all versions of QUIC. For the short header, the connection ID is visible, and that's basically it, plus the 0 at the beginning. So commonly, through the lifetime of a connection, the 0 at the start is visible and the connection ID is visible, and that's all that's really usable. And bear in mind that the connection ID itself can be renegotiated under cover of the crypto, so the connection ID can change within a connection; if you want to latch onto a connection ID and think "this is the connection," be careful. (Question about the long header and connection IDs.) The long header does have connection IDs. So you're asking whether you can create a connection without ever using a long header? No, that's not doable right now, not with the way the handshake mechanisms work. It's technically possible in a different protocol, but that's not our goal.

So that's what the packet headers look like. What's inside packets, beyond the packet header? QUIC frames. Every frame looks like this: there's a frame type, and there are type-dependent fields. A QUIC packet has a packet header and then a bunch of frames, each with a type followed by type-dependent fields. These are all the frame types currently defined in the draft. The functions you might be familiar with, I'll point at: there's the STREAM frame and there's the ACK frame. The STREAM frame carries data; the ACK frame carries acknowledgments. Those are not part of the packet header; they're contained inside the packet as frames. You can see various other things, such as MAX_DATA, MAX_STREAM_DATA, and MAX_STREAMS; pretty much all of these are control frames, and all the control signaling between the endpoints happens with frames. There are more, like the PING frame, which is basically a one-byte frame: just a frame type, no data in it. All of these frames are used for control signaling between the two endpoints.

Let's look at one of these in a little detail, to understand how data flows in a connection. A STREAM frame, which carries application data, has these fields. It has a stream ID, indicating the stream this data belongs to. It has an offset, which is optional: a byte offset within the stream. A stream is roughly semantically equivalent to a TCP connection, in that every stream is a byte stream, and you can have multiple of these
Let's look at one of these in a little bit of detail, so we can understand how data flows in a connection. A STREAM frame, which carries application data, has these fields: it has a stream ID, which indicates the stream number this data belongs to; it has an optional offset, which is a byte offset within the stream (one stream is roughly, semantically, equivalent to a TCP connection, in that every stream is a byte stream, and you can have multiple of these within a connection); the length tells you how big this particular frame is; and then the stream data, which is the actual data. That's what a STREAM frame contains.

So let's look at how a QUIC packet would look with various things in it. Here's a QUIC short-header packet. There's the header bit, the one that indicates it's a short-header packet. There's the spin bit, which I will not go into, but that is basically a bit that is supposed to be used by operators for doing passive measurements of RTT specifically. There's the destination connection ID, there's a key phase, and there's a packet number, and the greyed-out fields are encrypted. Even the packet number, by the way, is encrypted, so a middlebox can't even see what the packet number of a particular packet is. And a QUIC packet, as I mentioned, has frames; in this particular example there are two STREAM frames and an ACK frame.

All right, let's look at one example. This is packet number 56, and it is carrying data from stream ID 5 at offset 1123, it has a length of 500 bytes, and it says that this is not the last frame in this particular stream, which means the FIN bit is set to false, and then it carries application data. Everything good so far? There's going to be a quiz, so pay attention; if you're falling asleep, good time to wake up.

All right, that's in that STREAM frame. This is another STREAM frame; this one carries data from another stream, stream ID 8. It has a length of 300, the FIN is false, but the offset is not there. What does that mean when the offset is not there? Zero, thank you, yes, the obvious conclusion. The offset is optional; if it's not there, it's 0. There's a bit in the flags that indicates whether the offset is present or not, and the rest is application data, of course. So this is fine. And in this continued example we've removed the length. What does that mean? No data? No, it doesn't mean there's no data, although you can send an empty STREAM frame; it means the length runs up to the end of the packet, which means this particular packetization as drawn is incorrect. What it should really look like is this: everything from here up to the end of the packet is this stream's data. So the length is optional as well. In the common case, when every packet is just data from one stream, the bulk transfer case, you have the offset but you remove the length. That's the general idea. So that's what a STREAM frame looks like.
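As a sketch of those optional fields, here is roughly what decoding a STREAM frame could look like. The type bits (0x04 for offset-present, 0x02 for length-present, 0x01 for FIN) follow the draft's 0x08 to 0x0f STREAM type space as I understand it, but the fixed 32-bit integers are my own simplification; real QUIC uses variable-length integers throughout.

```python
import struct

def decode_stream_frame(buf, pos=0):
    ftype = buf[pos]; pos += 1
    assert 0x08 <= ftype <= 0x0f, "not a STREAM frame"
    fin     = bool(ftype & 0x01)
    has_len = bool(ftype & 0x02)
    has_off = bool(ftype & 0x04)

    stream_id, = struct.unpack_from(">I", buf, pos); pos += 4
    offset = 0                        # offset absent: data starts at byte 0
    if has_off:
        offset, = struct.unpack_from(">I", buf, pos); pos += 4
    if has_len:
        length, = struct.unpack_from(">I", buf, pos); pos += 4
    else:
        length = len(buf) - pos       # length absent: data runs to packet end
    return stream_id, offset, fin, buf[pos:pos + length]

# A frame like the second example: stream 8, no offset field, explicit length.
frame = bytes([0x0a]) + struct.pack(">II", 8, 3) + b"abc"
assert decode_stream_frame(frame) == (8, 0, False, b"abc")
```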
All right, going back to the ACK frame now. The ACK frame has a number of fields in it, and if you're familiar with TCP I'll try to draw parallels. How many of you are familiar with the TCP packet format? Seriously, like four people in this room? How many of you are sleeping right now? Yeah, I was expecting that response. Wake up, wake up; like I said, there's a quiz, and you won't be allowed to go out and get coffee after this if you don't answer the questions. Not just the carrot, there's a stick too: I have a man who's going to hold the gate.

So the ACK frame itself carries a largest acknowledged. What's the equivalent, or rather, what does the TCP packet carry? It carries a sequence number on the send side, and on the receive side, meaning on the ACK side, it's got a cumulative ACK. The cumulative ACK covers everything received in order so far; this one is the largest acknowledged, it's the other end. So in TCP you might have the cumulative ACK and then SACK blocks after that; in QUIC it's the largest acknowledged that's sent in the ACK, and the other things are also sent, as we'll look at.

Next is the ACK delay. If some of you were at the TCP analytics session this morning, you might have heard this thing about how the amount of time the receiver was sitting on an ACK gets captured as part of the round-trip time, sadly, in TCP: receiver delays pollute the RTT estimate. Here the receiver explicitly encodes the ACK delay, meaning how long it sat on a received packet before sending the ACK back. So if I receive a packet and I'm waiting to send an ACK because I'm going to delay my acknowledgment for a little while, then when I send the ACK back I'm going to encode that time inside the ACK and ship it back, so that the sender has a better estimate of the round-trip time and can subtract out the receiver's ACK delay.

Then there's an ACK range count, and these are the ranges that follow: there's the first ACK range, which is basically the contiguous range down from the largest acknowledged (we'll look at an example and talk about this in a moment), and then the subsequent ACK ranges. The ECN counts I'm not even going to go into.

So let's look at an example; this is probably best done with an example. Let's say that the packets received are 1 through 125, and let's say that the time since the largest packet was received at the receiver is 25 milliseconds when the receiver generates an acknowledgment. This is represented in a particular way: it's a shifted value, to allow for a larger range of values, and the shift is negotiated but it defaults to 3. This is very much like your window scaling, basically. The value that's encoded in the ACK is in microseconds, and in this case it would be 3125, because it's 25 milliseconds (25,000 microseconds) shifted right by 3. The receiver will encode 3125 microseconds inside the ACK, and that's what's going to be sent. Does that make sense? Yes? No? Okay, I'll go over this again; I'll keep going over it until it makes sense. The point is that the receiver is waiting for 25 milliseconds before it sends an acknowledgment, and it encodes that delay in a particular way. You don't need to know exactly what the encoding is, because you're not going to write that encoder tomorrow; all you need to know is that 3125 here means 25 milliseconds.
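Here's a tiny sketch just to pin down that arithmetic; the function names are mine, but the shift-by-exponent microsecond encoding is what the talk describes.

```python
DEFAULT_ACK_DELAY_EXPONENT = 3     # the negotiable shift, defaulting to 3

def encode_ack_delay(delay_ms, exponent=DEFAULT_ACK_DELAY_EXPONENT):
    # The ACK carries microseconds, right-shifted by the exponent.
    return (delay_ms * 1000) >> exponent

def decode_ack_delay(encoded, exponent=DEFAULT_ACK_DELAY_EXPONENT):
    return (encoded << exponent) / 1000.0    # back to milliseconds

assert encode_ack_delay(25) == 3125          # the 25 ms example above
assert decode_ack_delay(3125) == 25.0
```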
[Audience question: when is the ACK sent?] The same as TCP: it's a delayed ACK. There's a delay timer, you can piggyback it on data that's going out, or send it on a second packet received, like TCP does after two segments. It is right now recommended, I can't remember the exact language in the draft, but that is one area where we have flexibility: you can wait for a larger number of packets to be received before sending an acknowledgment, and there's actually a draft that some of us are going to write together about that. At the moment, though, it simply behaves like TCP; the current answer is that those triggers are going to be roughly the same as TCP, but that's negotiable, and there's the possibility to extend it to non-TCP-like behaviors. For now it mostly models the TCP state machine. [Audience: TCP assumes you're acknowledging contiguous blocks; this means you could leave holes behind?] Yes, you can. That's an excellent question; do you mind holding onto it and asking me again in a bit? I do want to answer it.

So the time-since-largest-received is not the most important thing; if you didn't get it, that's fine, we can move on, the other stuff is independent. The ACK fields are basically: largest received so far, which in this case is 125, because we've received everything from 1 to 125. The first ACK range says how many packets below the largest are acknowledged contiguously, and in this case that's 124, because 1 through 124 have been received. So I'm saying the largest acknowledged is 125, and I'm also ACKing 124 packets below it. The ACK range count of 0 means there are no discontiguous blocks below that, or at least I'm not reporting anything else below that. So that's what the ACK frame looks like.

[Audience: you're ACKing packets, but the stream data is in offsets and stream IDs?] What the bleep? Can I say that on camera? It's not CBS, right? You're exactly right, and I'll get to that in a second; that's actually the whole feature.

Now let's add a little twist. We receive 130. Why? Because the internet sucks and drops packets. So we have 130, and now things change a little bit. The largest received is bumped up to 130, because that is in fact the largest we've seen. The first ACK range: how many packets contiguously below 130 am I reporting as acknowledged? None, because 129 has not been received, so the first ACK range becomes size zero. And now I'm reporting a gap: a gap that goes from 129 down to 126, and then I'm acknowledging everything below that, 125 down to 1, again. The way these are encoded, and again you don't need to memorize this, you can think about it later, is that the gap is encoded as 129 minus 126, which is 3, and the ACK range is 125 minus 1, which is 124. That's how the encoding works, and that's what the ACK frame for that looks like.

Final twist, I promise this is the final twist: let's add 129 to this. Now we receive 129. Why? Again, because the internet sucks. Now the largest packet received is still 130, but the first ACK range has grown by 1, because now I'm acknowledging one packet contiguously below 130. Notice that the goal is that in the common case you're going to see packets in order, which means the largest acknowledged and the first ACK range convey all your information; but in this case your gap has reduced in size. I hope this is more or less clear, and if it's not, that's okay, you can catch up with this later; you don't have to understand all the details right now. So that, I claim, is what that looks like.
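To tie the three examples together, here is a small sketch that derives the ACK fields from a set of received packet numbers. The "count minus one" encodings match the examples above and, as far as I can tell, the draft's gap/range arithmetic, but the function itself is illustrative, not anyone's real encoder.

```python
def build_ack_ranges(received):
    """Turn a set of received packet numbers into ACK frame fields:
    (largest acknowledged, first ACK range, [(gap, ack range), ...])."""
    pns = sorted(received, reverse=True)
    runs = [[pns[0], pns[0]]]            # contiguous runs as [high, low]
    for pn in pns[1:]:
        if pn == runs[-1][1] - 1:
            runs[-1][1] = pn             # extends the current run downward
        else:
            runs.append([pn, pn])        # a hole: start a new run
    first_range = runs[0][0] - runs[0][1]
    ranges = []
    for prev, cur in zip(runs, runs[1:]):
        gap = prev[1] - cur[0] - 2       # packets missing in between, minus 1
        ranges.append((gap, cur[0] - cur[1]))
    return pns[0], first_range, ranges

# First twist: 1..125 plus 130 -> largest 130, empty first range, gap 3.
assert build_ack_ranges(set(range(1, 126)) | {130}) == (130, 0, [(3, 124)])
# Final twist: 129 arrives -> first range grows to 1, the gap shrinks.
assert build_ack_ranges(set(range(1, 126)) | {129, 130}) == (130, 1, [(2, 124)])
```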
So this is absolutely the last example, I promise, the last twist. Let's say packet 56 was dropped. This is packet 56 right here (how do I know it's packet 56? because this packet number field here says 56), and let's say this packet, which carries that STREAM frame, that other STREAM frame, and that ACK frame, is actually dropped by the network. Eventually, because QUIC is reliable and QUIC has loss detection machinery, it detects this packet as lost. And let's also assume that stream 8 was reset; remember stream 8, that's the middle STREAM frame there. When I say reset, this is something an application can do in QUIC: an application can say, I don't care about this stream anymore, let's not send on this stream anymore, cancel the stream, I'm done with it. A stream is just an abstraction within a connection; that doesn't mean the connection is closed, just the stream.

So at this point, after all of this has happened, QUIC detects packet 56 as lost, and let's say the last packet I had sent was packet 74. What should happen next? QUIC needs to retransmit the data that was in 56, right? So first question: what will the packet number be? 56 seems obvious; how many people say 56? Oh come on, raise your hands. How many people say 75? It's 75. It's not 56, because I'm actually not retransmitting packet 56; I don't care to retransmit packet 56. One more second: what should I retransmit in there? What frames should I send again? Should I send stream ID 5? Yes. Why? Somebody said no. Would the offset be different? That packet was lost, those bytes were dropped, but this offset is only within the stream. So for stream ID 5, that piece of data did not reach the receiver, and we need to retransmit those bytes; the offset within the stream remains the same, because that's delivery order within the stream, and that hasn't changed.

What about stream ID 8? It's closed. Do we need to retransmit it? Somebody says yes; why would we need it? You're right: at the time packet 56 was constructed, stream 8 was alive, but we no longer need to deliver that data, because the stream has been reset, and the receiver does not care about receiving data on a reset stream. That's the semantics, QUIC's semantics, and that's what I want to point out here: the separation between streams and the connection is very, very strong in QUIC.

[Audience question] But that implies the other side is effectively left dangling, right? Is the receiver going to do a timeout disconnect because that stream got no signal that said go cancel yourself? There is a signal: it's a reset-stream control signal, and that's going to be in a frame that might have been in, say, packet 74. Exactly, and if that's dropped, it is retransmitted as well; the cancel itself is reliable. So the reset of a stream means don't care about the stream anymore; the FIN of a stream, very much like TCP's FIN, means I'm still going to send you data even though I've sent you the FIN, because it all needs to be delivered in order. So in this case we don't care about the stream 8 frame.
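Here's a sketch of that decision logic, with made-up Frame and Packet types; the real bookkeeping in any implementation is richer than this, but the decisions are the ones just described.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class Frame:
    kind: str                  # "STREAM", "ACK", ...
    stream_id: int = 0
    offset: int = 0
    data: bytes = b""

@dataclass
class Packet:
    pn: int
    frames: List[Frame]

def on_packet_lost(lost: Packet, reset_streams: Set[int],
                   next_pn: int) -> Optional[Packet]:
    """Decide what, if anything, from a lost packet goes out again."""
    keep = []
    for f in lost.frames:
        if f.kind == "STREAM" and f.stream_id not in reset_streams:
            # Same stream ID, same offset: delivery order within the
            # stream hasn't changed. Only the packet number is new.
            keep.append(f)
        # STREAM frames for reset streams are dropped; other frame kinds
        # (the ACK frame here) are treated separately, which is the next
        # question in the talk.
    return Packet(next_pn, keep) if keep else None

# Packet 56 from the example: stream 5 survives, reset stream 8 does not,
# and the data goes out again under a NEW packet number, 75.
pkt56 = Packet(56, [Frame("STREAM", 5, 1123, b"x" * 500),
                    Frame("STREAM", 8, 0, b"y" * 300),
                    Frame("ACK")])
rtx = on_packet_lost(pkt56, reset_streams={8}, next_pn=75)
assert rtx.pn == 75 and len(rtx.frames) == 1 and rtx.frames[0].offset == 1123
```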
Oh, I should have asked you this question before showing you the answer; forget you saw that. Should the ACK frame be retransmitted? Thank you for playing along, you get points for playing along. See, now I'm going to hold you to it: he said yes, and then when I pinned him down he said no, and you didn't catch that because he wasn't on the mic at the time. The ACK frame does not need to be retransmitted, even though it's carrying state information, because acknowledgments in QUIC, very much like in TCP, are cumulative in the sense that a later acknowledgment carries the information of a previous acknowledgment. So we don't need to resend it; we could send a new acknowledgment. And in place of these two frames we could send other stuff: new stream data, other control messages, whatever we want. But we do not have to resend these two bits of information, and we won't.

[Audience question] So the question is: does the QUIC library maintain information about packet sizes, offsets, and various things, packetization information basically, so that when an acknowledgment comes it does the right thing, pulling information from the right streams and so on, and if it doesn't come it marks those as lost? Yes, in both cases it has to maintain all of that information. When a packet is acknowledged, it goes up and tells the stream, hey, this data has been delivered; and if a packet is lost, it says, I lost this stream data and I can retransmit it if you'd like, fetching data from the application again.

[Audience question] So the question is: we can remove stuff we don't want to send from the packet, but can we add new stuff? The short answer is yes; that's sort of the whole point. Packets are basically containers in QUIC. They carry a packet number that is monotonically increasing no matter what you put in them; the only thing the packet number encodes is transmission order. So an earlier question was about how the packet numbers are over here and the stream offsets are over there: the acknowledgments are for the containers. They say, I have received this packet. What's in the packet? Well, you ought to know, you're the one who sent me the packet. That's basically what it means: I maintain state about what I put inside a packet, I give it a packet number, I ship it; an acknowledgment is received, it says I've received this packet, and that's enough. And if the packet is treated as lost, again, I know what was in the packet, so I can decide at the time of retransmission what to retransmit and what not to.

The value of separating these two things is that loss detection is separate from loss recovery. I can detect a loss and in fact choose not to retransmit any of the things that were in that packet; that's perfectly fine. In the example I just showed you, if stream 5 had also been cancelled by the time I detected the loss of packet 56, I would not have to send packet 75 at all, because there's nothing in that packet that I need to retransmit anymore. So loss detection is separate from actually retransmitting the bits that were lost, which is hugely useful. The key distinction here is: packet numbers are used for transmission order, and stream IDs and offsets are used for delivery order. This is the separation that we're able to achieve in QUIC that TCP doesn't have. So packets end up being containers.
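Sketching the container bookkeeping that question is about, reusing the Frame type from the previous sketch; the class, its methods, and the streams object with its mark_delivered call are all invented for illustration.

```python
class SentPacketMap:
    """Remember what went into each packet number, so a bare packet-number
    ACK (or loss) can be mapped back onto per-stream delivery state."""
    def __init__(self):
        self._next_pn = 0
        self._in_flight = {}             # packet number -> list of Frames

    def send(self, frames):
        pn = self._next_pn
        self._next_pn += 1               # monotonically increasing, never reused
        self._in_flight[pn] = frames     # packet number == transmission order
        return pn

    def on_ack(self, pn, streams):
        # "I received this packet": we are the ones who know what was in it.
        for f in self._in_flight.pop(pn, []):
            if f.kind == "STREAM":       # delivery order lives in (id, offset)
                streams[f.stream_id].mark_delivered(f.offset, len(f.data))

    def on_loss(self, pn):
        # Loss *detection* only: hand the contents back and let the caller
        # decide what, if anything, goes into a fresh packet (see above).
        return self._in_flight.pop(pn, [])
```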
[Audience question] So the question is: this isn't like UDP, but is this where you can drop certain frames, for encoded video streaming for example, where you can say, I don't care about these streams anymore? That's exactly what it is, and there's actually work on trying to make this work for video. Can you allow loss within a stream? At the moment that abstraction is not available in QUIC, so in short, no, not at the moment; but that's an extension that people have been talking about. The working group has been trying to keep the scope tight and not expand it to include everything, but that's coming very soon.

[Audience question] If there are a lot of gaps, yes, the ACK will get bigger and bigger, much like SACK ranges. And somebody else asked this question earlier, so I'll answer it now: when do you stop reporting something? Because in this world you're never going to fill a gap, right, so at some point you decide to stop reporting, and that is actually left to the implementation to some extent. We've left it to the implementation in the spec, but there are some recommendations about what a good time is. Specifically, because every QUIC packet carries a packet number, funnily enough, ACKs get acknowledged: remember that ACKs are frames carried in packets, and those packets also get acknowledged. Think about that, because it sounds like it could lead to an infinite loop, a ping-pong; there are safeguards in place to not let that happen. But what it means is that I can know when a peer has seen my acknowledgment. When I know the other side has seen this information, I can stop reporting it; I only need to report it as long as the peer has not seen it. As soon as the peer has seen it, meaning I receive an ACK for the packet that carried my ACK frame, I can retire, so to speak, that ACK information and not report it again. So that's one trick you can use; or you could use some sort of timing scheme, where you report this for one or two round-trip times and after that you don't report it anymore.

Another huge value is that retransmissions are not automatically high priority: stream priorities allow you to determine whether you want to send a retransmission now or not. So QUIC separates what is sent from when it is sent.
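Here's a sketch of that ACK retirement trick: once the peer acknowledges a packet that carried one of my ACK frames, I can stop reporting everything that ACK covered. All the names are invented for illustration.

```python
class AckTracker:
    def __init__(self):
        self.to_report = set()      # received packet numbers still worth reporting
        self.ack_sent_in = {}       # my packet number -> what that ACK reported

    def on_packet_received(self, pn):
        self.to_report.add(pn)

    def build_ack(self, my_pn):
        # Remember what this outgoing ACK frame reports, keyed by the
        # packet number it rides in.
        reported = frozenset(self.to_report)
        self.ack_sent_in[my_pn] = reported
        return reported             # would be encoded as largest/gaps/ranges

    def on_ack_of_ack(self, my_pn):
        # The peer saw everything my earlier ACK reported: retire it all.
        self.to_report -= self.ack_sent_in.pop(my_pn, frozenset())

tracker = AckTracker()
for pn in (1, 2, 3):
    tracker.on_packet_received(pn)
tracker.build_ack(my_pn=10)         # my ACK for 1..3 rides in my packet 10
tracker.on_ack_of_ack(my_pn=10)     # peer acknowledges packet 10
assert tracker.to_report == set()   # nothing left that I must keep reporting
```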
I'm basically out of time at this point; it's 2:32 and I don't want to run into your break or the next session, but I will show you one very quick thing. Here's a QUIC server I'm running, and I'm running a UDP forwarder that will introduce a drop at some point in the connection; it introduces loss and delay and things like that, a simulator that also forwards packets along. This is all available in the quicly repository, by the way. And here's the client, and the client talked to the server and sent a bunch of stuff, and now I have traces. This is the JSON that the server spits out as the connection proceeds, and I have a converter that basically takes this, converts it into a protobuf, and plots it. And there you go: a nice little packet trace. If you're not familiar with packet traces, it's time on the x-axis and packet data on the y-axis, and what you see here in blue is packets sent, and the green is acknowledgments received for those packets. For every packet there's various information, including the frame list, and the frame list in this case has stream ID 0, and you can see it has certain offsets in it. The acknowledgments are for packets, and along with the acknowledgments I can also show various transport state, because the server logs that in the JSON as it receives each acknowledgment.

I just want to show you one quick thing, which is here. The reds here are packets that are lost. This says lost packet 9, which means the server said, at this moment in time, I'm logging packet 9 as lost. Let's go back and see what was in packet 9: packet 9 carried stream 0 data at offset 3713, and that was dropped in the network, and of course we're going to retransmit it. As you can see, the retransmission happens here, where it is sent in packet 38, but it carries the same stream ID and the same stream offsets. So if you're used to seeing TCP traces, this looks slightly weird, because, as was pointed out earlier, gaps are not filled: retransmissions don't show up along the same horizontal line, they show up later. You're going to have to reorient yourself a little bit to looking at these traces, because they are different; but at the same time these traces can be substantially richer, because we can record a lot of detail about what's going on at the server. And with that I will end. Sorry that there are only five minutes left for your break, but thank you for being patient and not sleeping through this presentation right after lunch. Thank you again, and I'll be around.
[Applause]
Info
Channel: netdevconf
Views: 3,362
Keywords: netdev, netdevconf, netdev 0x13, QUIC, IETF
Id: CtsBawwGwns
Length: 99min 22sec (5962 seconds)
Published: Mon May 20 2019