Long Polling and how it differs from Push, Poll and SSE - The Backend Engineering Show

Video Statistics and Information

Captions
In this episode of the Backend Engineering Show we're going to discuss long polling. That's the center of the discussion, and because we're going to talk about long polling as a method of communication between the client and the server, we really need to talk about regular polling and the opposite of that, which is pushing. So: poll, push, long polling. And since we're in that realm, we really need to give an honorable mention to server-sent events, because they fit nicely into that pot.

At the end of the day, once you understand what this thing really is, it's not magic. In software engineering we just like to invent new words to sound smart. If I tell you, "Hey, I'm using long polling on my back end," I'm smarter than you, right? Because I just used a term that you don't understand. That's why we like to confuse you. In the end we're just playing with packets and requests and responses; nothing really magic there. That's what we'll try to demystify in this episode. If you're interested in this kind of stuff, stay tuned.

Welcome to the Backend Engineering Show with your host, Hussein Nasser. Long polling. Before we discuss long polling: what polling are we really talking about? What do we mean by polling here?

For non-native English speakers, by the way, there is a difference between the word "poll" and the word "pull". The polling we'll talk about has an "o" in it: p-o-l-l. "Pull", on the other hand, is p-u-l-l. They are completely different words. You native English speakers might say, "What are you talking about? Of course they're two different words." Yeah, because you guys are native English speakers. I didn't know that three years ago. I kept searching "long pulling" on Google and nothing showed up; marketing terms like "push and pull" started showing up, and I was like, what? I'm talking about software engineering. So yeah, pulling with a "u" has nothing to do with this. We're not pulling anything, really,
there's no rope to pull on. Polling is where you ask and you get a response, like when you create a poll on Twitter or on YouTube and ask people, "Hey, what do you think?" So the idea of polling is that you ask the back end whether it has certain information, and the back end is supposed to respond with that piece of information.

That looks very similar to our request-response, right? It is: you make a request, the back end processes that request and then sends you back the response. That works when the back end actually has the response, or can respond to you after it calculates something.

But take this example. Let's say you're building an email client, like Outlook, and you want to check your email. Building a system like that on plain polling is extremely expensive, because the question "do I have messages?" is a very interesting one. You ask, "Hey, do I have a message?" and you want an answer there and then. The server checks your inbox and says, "Nah, you don't have a message." Then you are responsible for asking again: "Yo, do I have a message?" The server checks again and says, "Nope, you don't have a message." You can do this polling as frequently as you like, and people still prefer this approach. Nothing wrong with that.

There are disadvantages, though, because a lot of bandwidth is consumed on literally nothing: you ask a question and get nothing in response, which is kind of a waste of bandwidth. You might say, "Hussein, it's an empty request, who cares?" Well, you don't care if you're a desktop client and you have one back end. But imagine, I don't know, one million web browsers; your application is being consumed by about three million users, and those three million users are sending empty requests to your back end. That doesn't sound too good, does it? It just puts load on your back end; it
puts an overhead on your back end that you don't really need. So that's polling.

The good thing about it is that it's simple to implement. Polling is very simple: you do a loop, or you have a button. For the longest time Outlook had a Send/Receive button; that's basically what polling is. It checks: "Hey, do I have a message?" Click, refresh.

Now imagine you're building a chat app and you have to poll the back end to see if someone sent you a message. First of all, there's the bandwidth issue that we talked about. Second of all, you're going to see delay. You're not going to get your beautiful message on time; it depends on your polling cadence. How often are you polling? Every 30 seconds? Every minute? Some people say, "Yeah, I'm fine, I don't want your stinking message anyway. It's okay if I see it after a minute, Paul. I don't care what you send me." But other people said, "Nah, we like real time." Real-time stuff is good. I want my message to come to me as fast as possible, and I don't really want to make any requests. As a client: just tell me.

That's why we invented the push model, which almost nobody gets right. Look at YouTube. If you subscribe to a channel, all the YouTubers say, "Hey, hit that bell so you get notified of every upload I make." If you ring that bell, you're telling YouTube to notify you of every upload. And I have a strong feeling that YouTube implemented that bell, and doesn't make it the default, because they don't know how to do it right. Technically it is very hard to build a push notification system for billions of users (does YouTube have a billion users?). It's very hard to implement. You might say, "Hussein, what are you talking
about? It's just a connection, and every time the server has something on the back end it just pushes it to the socket. That's what pushing is, right?" Yes, but it has to be stateful, unlike polling.

Polling can be scaled. With polling, your request can be directed to any server, and that server checks the poll request for you. (Not a pull request; that's a different thing, a GitHub thing.) The poll request that you make can go to any server and get checked for you. With pushing, you have to be tethered to a server, because the server needs to know where the client is, and that knowledge of where the client is is state that the server needs to store and keep handy on the back end. That means the connection has to stay open. That's why WebSockets is stateful; that's why TCP is stateful.

Then, when a notification arrives from YouTube at this back end and you want to push it, the back end has to ask: which connection is Jake's? Jake has subscribed to all notifications for this YouTuber, so I need to notify Jake. So we find the TCP connection for Jake and push the 3 a.m. witch dark-web content to him, because that's what Jake likes. If Jake has disconnected, we can't push to Jake. If Jake decided to close the connection, or to disable notifications, we can't push to Jake. The back end has to do all this garbage in order to push.

I know I'm saying bad things about push, but that's because it's hard. It's just hard to do. We're just not smart enough to implement push correctly, so we often get it wrong. Pushing notifications, pushing anything: if you have a few hundred users, maybe you can do it. WhatsApp is a push system, definitely. You're connected to a... well, at
the end of the day you have a user, and the WhatsApp server is not sending a message to 100 million users at the same time. It's a group. That's why they limit groups. They can't allow you to have, say, 300,000 members in a group; they physically can't do it correctly.

Just imagine a YouTuber with a million subscribers. Before this bell icon thing, YouTube said, "Okay, let's just notify everyone," and then, "Well, we can't really do it, because we have to loop through one million sockets." And again, guys, YouTube doesn't work exactly this way with notifications. The YouTube back end communicates with the Android or Apple cloud notification system to do the pushing, so maybe there are fewer connections between YouTube and Apple. It's probably multiplexed and all that jazz; I have no idea what's going on there. Apple then talks to its own cloud of users. So there is an extra layer, and it's there for a good reason.

But it's hard to push one million notifications. Very hard. First of all, there's just the loop. Loop through one million sockets, and let's assume it's synchronous: man, it's going to take so much time, because at the end of the day the millionth subscriber is going to wait 999,999 times x, where x is the time to push one notification. So it's slow. That's why people get the notification after a minute or an hour, sometimes three hours, sometimes seven hours, and sometimes they don't get it at all because something went wrong. Tough luck.

We can't get YouTube notifications right. They can't. That's why they say, "You know what? We're going to occasionally notify you." Occasionally. Why? Because they can't do it right, guys. That's the reason. And even if you select "All"... that's what YouTube didn't anticipate: YouTubers started telling people to hit that bell, and then all of a
sudden everybody has the bell notification on. Well, not everybody; maybe ten percent, maybe one percent, will hit the bell. Some people don't know how. YouTubers teach their audience as much as possible to hit the bell, but at the end of the day YouTube is winning, because there are fewer people to notify, and if there are fewer people to notify, it can notify them better.

Even so, take PewDiePie with a hundred million subscribers: if ten percent of them hit the bell, YouTube cannot possibly notify all of them. It's just very hard. You can do it; it's just that YouTube is saying it's not worth the resources. I'm not saying they can't. They can do all sorts of threading, they can dedicate a fleet to notifications. It's just not worth it. It's okay if you don't watch that 3 a.m. video. That's what YouTube says: okay, let's just pass. They can fix it if they want to; they just need more resources, a bigger fleet of machines. They can do it in parallel, all of that.

Anyway, push. RabbitMQ uses push to notify the subscribers of what it calls a queue. If you subscribe to a queue and someone publishes something to that queue, RabbitMQ uses a push model to push it to the client. RabbitMQ is saying, "Okay, how many clients do I have? That's probably fine." That's why you use RabbitMQ when you actually know how many clients you have and whether a push model works for you.

For a lot of people it did not work. Kafka is an example: Kafka decided against the push model for that particular reason, because push is hard to scale. It's stateful, and because it's stateful you have to be aware of the client, and this awareness adds overhead to the back end, because now you have to ask: are you okay? Are you healthy? You gotta,
you know, what do you call it, tap on the client: "Are you okay, client? Can you handle this load that I'm about to send you, or not?" So you start caring. But back ends don't care about clients. If you care about your clients, you're going to suffer, because you have to care about this client, and this client, and this one. It's like opening a school and having a hundred students in a single class: the teacher cannot care about all the students, and eventually won't. There is a lot of care involved, and it's too much for the back end. So if you can avoid caring, that's the best approach: do not care.

This knowledge, this state information about the client, costs the back end more resources and more load. Plus, sometimes even if the socket is open and I'm about to push something, the client can't handle it, because the client is processing other stuff. How do you know the client is ready to receive your notification? You need an acknowledgement from the client. Well, what if the client doesn't acknowledge? Do you wait for the acknowledgement or don't you? Maybe for a notification you don't care: if you get it, you get it; if you don't, I don't care.

But if you're building a queue like RabbitMQ or Kafka, you really need to care. I sent you a message, and I need to know whether you got it, because if I send it again and you already got it, you might process it twice. That's where all these guarantees come from: exactly-once, at-least-once, at-most-once. They are very, very difficult. They can be achieved with idempotency and all that stuff, but it's complication, and we don't like complex stuff. We are lazy; we want things as simple as possible.

So, push: Kafka took a look at this and said, "Nah, we're not going to do push." And then there's this concept of long polling.
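The duplicate-delivery problem described above (a message is resent because its acknowledgement never arrived, and the client must not process it twice) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how RabbitMQ or Kafka implement their guarantees: `IdempotentConsumer`, `deliver_at_least_once`, the message ID `"msg-1"`, and the simulated lost acknowledgement are all hypothetical.

```python
class IdempotentConsumer:
    """Hypothetical client that tolerates duplicate deliveries."""

    def __init__(self):
        self.seen_ids = set()   # IDs of messages already processed
        self.processed = []     # payloads that were actually handled

    def handle(self, message_id, payload):
        """Process a message at most once, but always acknowledge it."""
        if message_id not in self.seen_ids:
            self.seen_ids.add(message_id)
            self.processed.append(payload)
        return True  # the acknowledgement


def deliver_at_least_once(consumer, message_id, payload, lose_first_ack=True):
    """Hypothetical sender: keep resending until an acknowledgement arrives."""
    attempts = 0
    while True:
        attempts += 1
        acked = consumer.handle(message_id, payload)
        if lose_first_ack and attempts == 1:
            acked = False  # simulate the first acknowledgement being lost
        if acked:
            return attempts


consumer = IdempotentConsumer()
attempts = deliver_at_least_once(consumer, "msg-1", "hello from Paul")
print(attempts, consumer.processed)  # 2 ['hello from Paul']
```

The sender delivers twice, which is the at-least-once part; the consumer remembers message IDs, so the second delivery is acknowledged without being processed again. Keeping that set of seen IDs is exactly the kind of extra state the speaker calls complication.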
I'm going to go back to polling, but I can't use plain polling as the method, because polling is resource-consuming and I don't want to saturate the network, especially with a lot of clients all requesting stuff. It just doesn't work. I don't know who invented the concept of long polling, but Kafka is the first application I've seen use it, in my experience. Let me know if you find others that use the concept.

Here's what long polling is. Long polling is just polling. You make a request to the back end and say, "Hey, do you have that message? Do you have that notification for me? Do you have that chat from Paul? Did Paul send me something tonight? Do I have that 3 a.m. video, the witch video from the dark web?" The back end checks, and say it doesn't find that message. With regular polling, the back end immediately responds to the client: "Hey client, there is nothing for you." With long polling, the back end checks, doesn't find anything, and just doesn't respond. It says: let the client wait. Let's trick the client into thinking I'm taking my sweet-ass time checking. So the client keeps waiting.

You might say, "Hussein, isn't that bad? The client is waiting." Well, the client is not doing anything, and the whole thing is asynchronous. That's the beauty of asynchronicity: clients aren't blocking anymore. If you send a request, an HTTP request or any other type of request, you can do other stuff while you're waiting. We figured this out; we fixed it. Async/await and promises in JavaScript, async in C#, and so on. Once you tell your function, "Hey, this is an asynchronous request," the application can do other things while the request just sits there waiting. So the client is not overwhelmed or anything. So the back end here can go and, I don't know,
every three seconds check again. The back end is doing the checking now; the back end is doing the polling. And that's okay, because it's localized on the back end. Assuming you're local, that is, and you'd better be local on the back end. If your back end is in one cloud region and the database it checks is in another region, you're in the same boat as before, so be careful of that. If it's local, then it's okay: I'm going to hit that database every three seconds or whatever.

Or you consider a model where you have push on the back end as well, so that piece of metadata is pushed to your back end once it's ready. When that information becomes ready, you ship the message with its content to the front end. Very genius design.

Kafka has all sorts of beautiful timeouts and parameters that you can set for long-polling operations. It has this concept of a number of bytes: send the response once you have X bytes. If you get, like, 300 bytes, send it. It feels like something for, I don't know, video, a live stream or live audio; you can do it this way: let's wait for a bit and then send it.

So that's long polling. It's the same thing; the whole thing is built on the request-response system. It's just a trick on the back end: the back end waits a little longer and doesn't respond immediately. So if you have Fiddler open, or Charles, or some other man-in-the-middle proxy that inspects your requests, it will say, "Oh, this request is so slow, it took three minutes, or seven minutes, or eight minutes." But really, hardly any resources are being spent on it. So maybe you want to filter those out of your 99th-percentile latency so clients don't yell at you. Actual people clients.
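The difference between regular polling and long polling can be sketched without any network at all. The following is a minimal in-process simulation using Python's `asyncio`, under the assumption that a "request" is just a method call; the `Mailbox` class, its method names, and the timings are hypothetical. It uses the push-on-the-back-end variant mentioned above (an event wakes the held request) rather than re-checking every three seconds.

```python
import asyncio


class Mailbox:
    """Hypothetical back end holding messages for one client."""

    def __init__(self):
        self._messages = []
        self._arrived = asyncio.Event()  # set when a message is delivered

    def deliver(self, message):
        self._messages.append(message)
        self._arrived.set()

    def _drain(self):
        messages, self._messages = self._messages, []
        self._arrived.clear()
        return messages

    def short_poll(self):
        """Regular polling: answer immediately, even with nothing."""
        return self._drain()

    async def long_poll(self, timeout):
        """Long polling: hold the request until a message or the timeout."""
        if not self._messages:
            try:
                await asyncio.wait_for(self._arrived.wait(), timeout)
            except asyncio.TimeoutError:
                return []  # nothing arrived; the client must ask again
        return self._drain()


async def demo():
    mailbox = Mailbox()
    # Three regular polls before any message exists: three empty responses.
    empty_responses = sum(1 for _ in range(3) if not mailbox.short_poll())
    # One long poll: the message shows up 50 ms after the request starts.
    asyncio.get_running_loop().call_later(0.05, mailbox.deliver, "hello from Paul")
    messages = await mailbox.long_poll(timeout=1.0)
    return empty_responses, messages


async def demo_timeout():
    # A long poll with no message ends empty once the timeout expires.
    return await Mailbox().long_poll(timeout=0.01)


if __name__ == "__main__":
    print(asyncio.run(demo()))
    print(asyncio.run(demo_timeout()))
```

The regular polls each cost a round trip and return nothing, while the single long poll is simply held open and completes as soon as the message is delivered. The timeout case shows why the client still has to send another request afterwards.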
Long polling, guys: that's the summary of long polling. So we've talked about polling, pushing, and long polling, and it's good to mention server-sent events. Server-sent events are a glorified long polling, in my opinion; that's how I see it. With server-sent events, the client sends one request and the server keeps sending an unbounded stream of responses, events. It just doesn't end. That's the difference: there is no end to the response. The server doesn't send something that says "this is the end"; it just goes on until the connection is closed.

Long polling is when you respond after a long wait. You found the result and you respond, and then the client is responsible for sending a request again. You have to send it again because that interaction is done. A server-sent event stream is a longer wait. That's why I say server-sent events are a special case of long polling: just a longer long polling. A very long long polling. The longest polling. Okay, I'll stop; you get the idea.

Server-sent events, SSE: I've talked about it many times on this channel, and I made a course that covers server-sent events. Beautiful, beautiful tech, very simple. Everything that builds on simple stuff, I love. I know this video is long, but you get the idea: things you can actually explain with basic terminology that everybody understands, instead of using things that just make you sound smarter.

All right guys, that was the long polling episode. I hope you enjoyed the show. If you like this kind of content, you might enjoy this video where HAProxy reaches 2 million requests per second; see how they did it. I'll see you on the other side. Goodbye.
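To make the "one request, many responses" idea concrete, here is a small sketch of what a server writes on a server-sent events connection. SSE responses use the `text/event-stream` content type, and each event is a few `field: value` lines terminated by a blank line. The helper names `format_sse` and `event_stream` are hypothetical, and the sketch only builds the text; it does not open a real HTTP connection.

```python
def format_sse(data, event=None, event_id=None):
    """Encode one event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads become one "data:" line per line of text.
    for line in str(data).split("\n"):
        lines.append(f"data: {line}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"


def event_stream(messages):
    """Yield events one by one; a real server would keep the response open."""
    for number, message in enumerate(messages, start=1):
        yield format_sse(message, event="message", event_id=number)


if __name__ == "__main__":
    for chunk in event_stream(["hello from Paul", "3 a.m. video uploaded"]):
        print(chunk, end="")
```

Unlike the long-poll case, nothing here marks the end of the exchange: the generator stands in for a response body that the server keeps appending to until the connection is closed.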
Info
Channel: Hussein Nasser
Views: 9,415
Keywords: hussein nasser, backend engineering, push vs poll, long polling, push vs long polling, push notifications, notifcation system, rabbitmq, kafka, apache kafka, kafka exactly once, kafka long polling, rabbitmq push, rabbitmq vs kafka
Id: J0okraIFPJ0
Length: 25min 55sec (1555 seconds)
Published: Wed May 26 2021