Pinterest moves to HTTP/3

Captions
Pinterest has migrated their client applications and their front-end edge CDNs to HTTP/3. They wrote a blog about it, and there are interesting lessons learned that I want to share with you guys. Let's jump into it.

This comes from the Pinterest Engineering Medium blog. It's quite an interesting write-up. It's missing some pieces; I don't mind, but I wish they had added them, and I'm going to talk about what those missing pieces are. There is probably a good reason for that. The article is written by Liang Ma, Scott Beardsley, and Haowei Yuan.

"Now Pinterest operates on HTTP/3. We have enabled HTTP/3 for major Pinterest production domains on our multi-CDN edge network." This is a very subtle sentence: domains. You enable HTTP/3 on a domain, so be careful with that statement, because certain domains are enabled and certain domains are not. You can have a domain point to a CDN, and that domain will map to an IP address, which then tells you, after you connect to it and send the first request, "Hey, I do support HTTP/3. If you're interested, go ahead and connect to me through HTTP/3."
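As a rough illustration of the advertisement just described: the "I do support HTTP/3" hint normally arrives as an Alt-Svc response header (RFC 7838), for example `h3=":443"; ma=86400`. The Python sketch below parses such a value. The function names and the sample header are made up for illustration, and splitting on commas is a simplification that ignores commas inside quoted strings.

```python
DEFAULT_MAX_AGE = 86400  # RFC 7838: an entry without "ma" is fresh for 24 hours


def parse_alt_svc(header_value):
    """Parse an Alt-Svc value into (protocol, authority, max_age_seconds) tuples."""
    if header_value.strip() == "clear":
        # "clear" tells the client to forget every cached alternative.
        return []
    services = []
    for entry in header_value.split(","):
        parts = [part.strip() for part in entry.split(";")]
        protocol, _, authority = parts[0].partition("=")
        max_age = DEFAULT_MAX_AGE
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "ma":
                max_age = int(value)
        services.append((protocol.strip(), authority.strip().strip('"'), max_age))
    return services


def supports_h3(header_value):
    """True if any advertised alternative uses the "h3" protocol ID."""
    return any(protocol == "h3" for protocol, _, _ in parse_alt_svc(header_value))


if __name__ == "__main__":
    # Hypothetical header value, not taken from Pinterest's servers.
    print(parse_alt_svc('h3=":443"; ma=3600, h2=":443"'))
```

The `ma` value is how long the client may remember the alternative, which matters later when the cached hint and the DNS answer can disagree.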
The default is almost always a TCP connection when you go to port 443, so how does the client know that this endpoint supports HTTP/3? This is something called Alt-Svc, and I had a whole podcast discussing just that. Today we only know after the fact. In HTTP/1 we know through a response header called Alt-Svc, an actual response header, so we have to connect, establish TLS, send the request, and only then learn that, by the way, there's a QUIC (HTTP/3) endpoint. The faster way is through an HTTP/2 frame, so you don't really need to send a request: once you negotiate through ALPN in TLS that you support HTTP/2, the server will push a frame to you that says, "Hey, alt service here; you can just discard this connection and connect to me through QUIC." The fastest approach, which not every client supports, is the SVCB DNS record. As you do the DNS query, you say: give me the A record, the AAAA record for IPv6, the TXT record, and also the SVCB record, which tells you during DNS that there's an alternative HTTP/3 service. You can't get any faster than that. That last part is not really supported yet; it's a new thing. Only Cloudflare, I think, supports it, and clients are lagging behind. Anything with DNS just takes ages to implement. So that's a critical thing here.

So they have a multi-CDN edge network. If you don't know what a CDN is, a CDN is just a reverse proxy. You as a client connect to that reverse proxy, the CDN, and the CDN turns around and connects to the actual origin backend to fetch the actual data. The CDN can't serve you anything on its own; it has to fetch it from the backend. The beauty of CDNs is that they are a huge caching layer that's close to the clients. If you're coming from India, you resolve the DNS and you're going to get an IP address closest to you. If you're
from California in the US, you're going to get an IP address closest to you, although you're connected to technically the same domain. Magic, I know. So you can have multiple CDNs, and the CDN will take care of this stuff.

"And we've upgraded our clients' network stack to support the new protocol. This allows us to catch up with industry trends." I hope this is not just a trend and that you really know why you're moving, which they do. "Most importantly, faster and more reliable networking improves Pinners' experience." Is that what you call Pinterest users, Pinners? Like Snapchat users, are they called Snappies? TikTokers, YouTubers. I'm a YouTuber, I suppose.

Context: "Network performance, such as latency and throughput, is critical to Pinners' experience. In 2021, a group of client networking enthusiasts at Pinterest started thinking about adopting HTTP/3 (or QUIC) for Pinterest, from traffic/CDN to client apps." That's it. They don't talk about their origin backend servers; they didn't touch their backend. I read this blog twice, and they never mention their servers. They changed the CDN; technically they just enabled HTTP/3 there, because they're cautious, and I don't blame them for that. They want to try this out with the CDN and their clients. So they upgraded their client apps and enabled HTTP/3 on some of the CDNs. They worked through 2022 and achieved their goal, blah blah blah, and they're still working on it in 2023.

They have bullet points on why HTTP/3 is good. If you're not interested in this part you can skip it, but I'm going to take each point and elaborate on why. The first point is: no TCP head-of-line blocking problem, in comparison to HTTP/2.
To explain what head-of-line blocking is, we need to understand what HTTP/2 does compared to HTTP/1. HTTP/1 is a simple request-response protocol. You establish a connection and you send a request, and, ignoring pipelining, you can only send one request on that connection; you can't send another request on it until you receive a response. The connection is marked as busy. I've talked about the reasons many times; pipelining through proxies is just very hard to do. So HTTP/1 is one connection, one request at a time. That's why Chrome and other browsers open up to six connections per domain in HTTP/1.1. That's also why server-sent events with HTTP/1.1 are really not a good idea if you open many tabs on the same domain, because a server-sent event stream looks like one request with an infinite response. To the browser, that connection is in use, so it needs to open another one, and another one, up to six, and then you're done. The seventh will be stalled. You can't even send that seventh request; you can't browse anything or send anything to the same domain.

With HTTP/2 they solved this problem. They said: let's add some metadata to the HTTP protocol at the lower layer, so that we know this is request number one, request number two, request number three, and so on. They added the idea of streams. The client creates odd-numbered streams, if I'm not mistaken, so one, three, five, seven, and so on, and you can send requests concurrently. Beautiful. Say you send seven requests, and the seventh request is very quick to serve, a tiny text file like robots.txt or a favicon, while the other six requests are taking a long
time. The server can immediately respond to the seventh request on stream number, whatever, 13 (do the math), and continue processing the others. That was the state of the art in the mid-2010s, around 2014, 2015. It was a big thing. Good job, HTTP/2.

The problem with HTTP/2 is TCP head-of-line blocking. Because HTTP/2 is built on TCP, these seven requests that you send at the same time are sent as one TCP stream. And what is the property of TCP? It's ordered. If you send request number one, then request number two, three, four, concurrently, they are all labeled with a segment sequence number (the sequence number plus the length of the TCP data), so they are ordered. So what happens if you send all of these but the server receives only segments two to seven? Assume one segment per request, with a 1500-byte MTU. Segment one was dropped, lost in one of the routers. The server, specifically the kernel TCP/IP code, will not deliver the data to the receive buffer for the application to read, because as far as TCP is concerned you have missing data. Segment number one is missing, so it can't deliver anything after it. That's called TCP head-of-line blocking. The server technically got requests two, three, four, five, six, and seven and could process them, but unfortunately it can't do anything with them.

QUIC solves that by ditching TCP altogether and building everything on top of UDP, with streams that are logically grouped and sequenced per stream. So now we're sequencing on the streams, not at the connection level. Does that make sense? If you're sending requests, you're only going to hit head-of-line blocking at the stream level, which is fine.
I could send several requests on the same stream, but just don't do that; open a new stream and send your stuff concurrently. That's the state of the art.

Plus, QUIC (effectively HTTP/3) also combines connection establishment and TLS encryption, which is another thing you had to do, into one round trip. Instead of doing the TCP three-way handshake and then TLS, 1.2 or 1.3, you do it all in one step.

Next: connection migration across IP addresses, which is good for mobile use. Because UDP is connectionless and QUIC built the idea of a logical connection, we can enforce the idea of a connection at the endpoints. The network doesn't really care whether you have a connection or not, because it's UDP; there is no connection. So we can do tricks like this: you can technically change your IP address, because the IP address is no longer what identifies the connection. It is a header, but it's not really part of the protocol. You can change the IP address, but you tag every datagram you send with a connection ID that uniquely identifies the connection. So if you move from your Wi-Fi to 5G, you left home and all of a sudden your IP address changed, QUIC can migrate your connection, because you're still sending the same connection ID even though your IP address changed. The server looks it up (of course there's authentication and so on to migrate the connection), and once that's done it says, "I trust you; you have the same connection ID but you came from a different IP address. Looks good." So we don't have to re-establish the connection. Pretty cool stuff.

Next: the ability to tune loss detection and congestion control. Congestion control is a feature of TCP at the connection level; QUIC pushed it to the stream level, so you can control the traffic per stream. Say this stream is a video or images: let's be a little bit
relaxed; it's okay if something is dropped. Sure, one frame or another will not be as good, but we can be a little relaxed. Whereas this other stream is, I don't know, actual source code being delivered, so we need it to be more reliable.

And of course, reduced connection time: one round trip instead of however many with TCP plus TLS; it depends on the version of TLS. There's also the ability to do 0-RTT, literally zero round trips, so you can send data not with the first request but with the first initiation of the connection. It's challenging, it's not really easy to do, but you can send 0-RTT data. If you enable pre-shared keys and session resumption, you can actually do that. That's really powerful, but even one RTT is awesome.

And it's more efficient for large payloads, as we talked about for images. So these are the advantages. They like this stuff, they want more of it, let's do it.

Strategy. What's their strategy? "Safety and metrics came first. Though Pinterest is focused on executing with velocity, it was critical that we took a thoughtful approach to adopting HTTP/3. First, we upgraded the client network stacks and created an end-to-end A/B test for each traffic type (image, video). Then we ran extensive experimentation before enabling HTTP/3 on both CDNs and clients." So only the clients and the CDNs; those are the only things that changed.

What are the challenges? To talk about the challenges, let's read through this. "For the web app, some browsers already have HTTP/3 (or QUIC) support." Almost all browsers support HTTP/3 now, I think. So they would favor it, and they know how to read this Alt-Svc, as I talked about, and move to HTTP/3 accordingly. So Pinterest has to be careful with this. They can't just light up QUIC on the server and all of a sudden the browsers start using QUIC and they
have bugs in their code and the apps are bricked. They have to control that. They talk here about the OS version and also Android. But here's the best part: "Our CDN vendors are at different phases of HTTP/3 support." Some CDN vendors support it and some don't, and since all of them serve almost the same domain, they just flip between CDNs behind that domain.

Here's the interesting thing. You do a DNS resolution, get an IP address, and connect, and that IP address belongs to a CDN that supports HTTP/3. So it will tell you, whether through an HTTP/2 frame, a response header, or, if it's sophisticated, through DNS, "I support HTTP/3; turn around and connect to me through a QUIC connection." You have to destroy your connection because it's a different protocol; you can't just keep using TCP. So the client turns around and makes the QUIC connection to establish HTTP/3.

And here's the thing: the DNS TTL, the time to live, is short. So what happens if you now turn around and do another DNS query, because the TTL has expired, and get another IP address? Now you're pointing to another CDN that happens to not support QUIC. But guess what, you're still using the same domain. What is happening is that the client has cached the knowledge that this domain supports QUIC and HTTP/3, but then the domain pointed to a different IP that doesn't, and all of a sudden things break. That is the most interesting thing I've seen here; I never thought about this before. Apparently there is a discussion going on about this.

So how do they solve this? They solve it by making the DNS TTL very close to the Alt-Svc TTL, because the Alt-Svc advertisement also has a lifetime: trust this QUIC endpoint for this amount of time. If you make them the same, they expire at the same time, so the client will learn to ask again, effectively, whether you
support it or not. That's basically the most interesting thing here, I think. They conducted a lot of A/B testing to do that, and they added a feature flag on the client side to enable this stuff.

Current state, what they have right now: they have enabled HTTP/3 on critical traffic types and upgraded the mobile clients' network stacks for HTTP/3 traffic. "HTTP/3 is enabled on major Pinterest production domains on our multi-CDN edge network." Major, not all of them. Web gets it for free: if it's a web client you get it for free, because it's the browser. On iOS, image and API traffic will be served through Cronet plus HTTP/3, and video uses the iOS native stack, AVPlayer. Android uses ExoPlayer with Cronet.

Here's a showcase with some numbers. I didn't really understand these diagrams, so I'm not going to go through them, but the interesting thing is right here: the network request round-trip latency before and after HTTP/3. The round-trip latency is measured from the client side, from request sent to response received, based on one week of network-layer data collected in Q3 with Apple networking on HTTP/2 and in Q1 2023 with Cronet when HTTP/3 was enabled. For those listening, we're looking at the number on the left-hand side. What is 2,000? Is it milliseconds? That doesn't make much sense; 2,000 milliseconds is really slow. I guess they're measuring the whole request, but that's still damn slow. Five seconds? I have no idea what this metric is. Five seconds is really slow, and going down to three seconds doesn't make it much better, unless they're including something else. Am I loading an image? Am I loading a large video? It makes sense if it's a video.

So, as you can see, guys, here is the future work. Let's discuss
this a little bit. "We will continue to invest in HTTP/3 for sustained impact," including increasing HTTP/3 coverage and exploring other network stacks, so they're still exploring and playing with things; further improving the HTTP/3 adoption rate; and experimenting with various congestion control algorithms, because there is so much to configure here, especially at the stream level. With QUIC you can do so many things; I'm still learning so much about this stuff.

And exploring 0-RTT connections, because they don't have that right now. 0-RTT is dangerous to enable, and I talked about that in one of my blogs: replay attacks are a real thing here. If we're replaying an image request, who cares, you just re-read the whole image. But if you're replaying an API call that changes something and you sent it through 0-RTT data, that is dangerous, especially if it's not idempotent. That's a different story. But 0-RTT is an interesting thing, especially to send the first request to get something in the same breath as you establish the connection.

So, to summarize: they moved to HTTP/3 only in their CDNs and their clients. They didn't mention anything about their origin backend application at all, not even in the future plans. They didn't say "we're not going to touch this", but it seems like they're staying on HTTP/2 on their backend; they're not touching that, as far as I know. As for the reasons, I think they're just playing it safe, to be honest. It's easier to move the CDNs, because as you cache stuff your clients will just hit the CDN. So what is the value of moving your backend application so that your CDN can talk HTTP/3 with it? What is the added value there? It's not like the CDN is going to change its IP address. It's not like it's going to walk around; it doesn't have legs, walking through
multiple places and changing its address, so connection migration doesn't have any value there. And the network between the CDN and the backend application is probably fast and reliable enough that packet drops are lower. I can't guarantee that, but it could be. If that's the case, then HTTP/2 is fine. Even when I talk about TCP head-of-line blocking, you have to remember that it only happens when your network is weak and you experience frequent drops, like on a flaky connection: Wi-Fi, 3G, 5G, or moving from one tower to another, where the signal will drop things. But on a high-bandwidth network, what are the chances that this is going to happen? Very low. So I'm not sure about that. Even moving from HTTP/2 is something you can question, and that's what I like about these things: you have to be a little pragmatic. Why should I move? They probably asked the same question: why should I upgrade my origin backend? If I don't have to, I'm not going to do that. HTTP/2 is good enough. Plus, the CDN is going to create one connection to the backend and probably stream everything on that connection.

That's it for me today. I thought I'd share this news with you guys: Pinterest moved to HTTP/3. See you in the next one. Goodbye.
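The head-of-line blocking argument in the talk can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how a kernel or a QUIC library is implemented: each request fits in one unit, TCP numbers units 1 to 7 on a single ordered stream, and in the QUIC case each request has its own stream whose first unit is numbered 0. All function names are hypothetical.

```python
def contiguous_prefix(received, first):
    """Return the in-order items starting at `first`, stopping at the first gap."""
    ready = []
    expected = first
    for item in sorted(set(received)):
        if item != expected:
            break
        ready.append(item)
        expected += 1
    return ready


def tcp_deliverable(received_segments):
    """One ordered stream: nothing past a missing segment reaches the application."""
    return contiguous_prefix(received_segments, first=1)


def quic_deliverable(received_by_stream):
    """Independent streams: a gap only holds back the stream it belongs to."""
    return {stream: contiguous_prefix(chunks, first=0)
            for stream, chunks in received_by_stream.items()}


if __name__ == "__main__":
    # Segment 1 is lost; segments 2..7 arrived but cannot be delivered.
    print(tcp_deliverable([2, 3, 4, 5, 6, 7]))

    # Same loss with one stream per request: only request 1 waits.
    streams = {request: [0] for request in range(2, 8)}
    streams[1] = []
    print(quic_deliverable(streams))
```

In the TCP case the list of deliverable segments is empty until segment 1 is retransmitted, while in the per-stream case requests 2 through 7 are ready immediately. On a link that rarely drops packets the two models behave the same, which is the point made about the CDN-to-origin hop.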
Info
Channel: Hussein Nasser
Views: 35,025
Keywords: hussein nasser, backend engineering, http/3
Id: 19Et2M3amA4
Length: 25min 5sec (1505 seconds)
Published: Thu Mar 16 2023