Scaling Push Messaging for Millions of Devices @Netflix

Captions
Thank you for being here. Imagine it's Friday night of this week. The conference is over, you're back home, sitting on your favorite couch, finally ready to relax, and you start Netflix. I hope this is the first screen that you see when you launch Netflix. The interesting thing about this screen is that it's not static or universal; it's customized to your taste. There are 125 million versions of this screen, one for each of our 125 million customers. This one is personalized to my taste, which I just realized is heavy on crime shows; let's not read anything specific into that.

Moving on: please raise your hand if you actually start watching something, anything, within say a minute or two after you land on this screen. Yeah, not me either. Most of us spend a considerable amount of time browsing this screen, scrolling around trying to pick something to watch, and that behavior is actually relevant to our discussion today. Let's say it's 10 or 20 minutes later and you're still browsing the same screen. Meanwhile, our personalization algorithms are continuously running in the cloud, so we could have generated new, better recommendations for you in those 20 minutes. And if that does happen, how do we get that new list in front of you as soon as it's ready? How do we tell our application, how do we let it know, that there is a better list of recommendations ready for it to download from the cloud?

Push messaging is a perfect solution for situations like this. Earlier, our application used to poll our servers periodically for new recommendations, which kind of worked, but it's not great: it's wasteful, and it's not great latency-wise either. What's worse is that the twin goals of UI freshness and server efficiency are in direct contradiction with each other with polling. If you increase the polling frequency to get the best possible UI freshness, you overload your servers, and if you decrease it to give your servers breathing room, your UI freshness is going to suffer. With push, instead, our server just sends a push message to our client as soon as it generates a new list for that client. Just as one stat: we reduced requests to our website cluster by 12% when we moved our in-browser player from polling to push. At 1 million requests per second, those 12% add up really fast.

So please ignore all push messages on your smartphones for the next 40 minutes, because we're going to talk about push messaging. Push notifications are terrible for conference speakers like us, but background push messages to applications are awesome. Specifically, we are going to talk about what push is, how you can build it, how you can operate it in production, and what you can do with it. My name is Susheel Aroskar, and I'm a software engineer on the Cloud Gateway team at Netflix. I have been at Netflix for 8 years, have worked in three different teams in those eight years, and somehow it still feels like I'm just browsing the list and the real show is about to start.

So let's start with defining push. How is it different from the normal request-response paradigm that we all know and love? Believe it or not, this definition is actually from a motivational poster at my local gym (that's why I don't go there anymore), but it is a surprisingly accurate definition for our purpose today. Push is really different in two ways: (a) there is a persistent connection between a client and a server for the entirety of the client's lifetime, and (b) it's the server that initiates the data transfer. Something happens on the server, and then the server pushes the data to the client, instead of the client asking for it, which would be the normal request-response way.
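To make those two properties concrete, here is a minimal sketch of a push client using the standard Java 11 WebSocket API: one long-lived connection, with data arriving only when the server decides to send it. This is not Netflix's client code, and the endpoint URL is made up.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

public class PushClientSketch {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        // (a) One persistent connection, opened once for the client's lifetime.
        WebSocket ws = http.newWebSocketBuilder()
                .buildAsync(URI.create("wss://push.example.com/ws"), new WebSocket.Listener() {
                    @Override
                    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
                        // (b) Server-initiated: this fires whenever the server pushes,
                        // without the client ever having asked for this message.
                        System.out.println("pushed: " + data);
                        webSocket.request(1); // ask the transport for the next frame
                        return null;
                    }
                })
                .join();
        Thread.sleep(60_000); // keep the connection open; a real client holds it for its whole lifetime
        ws.sendClose(WebSocket.NORMAL_CLOSURE, "bye").join();
    }
}
```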
We built our push messaging service, named Zuul Push, to send these push messages from our servers to our applications. Zuul Push messages are very similar to the push messages that you get on your mobile, except they work across a wide variety of devices. They work anywhere the Netflix application runs, which means they work on laptops, on game consoles, on smart TVs, and on mobiles. To get this cross-platform capability, Zuul Push uses standard open web protocols: WebSockets and Server-Sent Events (SSE). The Zuul Push server itself is open source too, and is available on GitHub today.

So let's get a little bit into the detail of the Zuul Push architecture. Zuul Push is not a single server or a single service; it's a complete push messaging infrastructure made up of multiple components. There are the Zuul Push servers to start with. They sit on the network edge and accept incoming client connections. Clients connect to these Zuul Push servers over the internet using either WebSockets or SSE as the protocol, and once a client is connected to a particular Zuul Push server, it keeps that connection open for the entirety of its lifetime. So these are persistent connections; that distinction is important. Now, because we have multiple Zuul Push servers and multiple clients connected to those servers, we need to keep track of which client is connected to which push server, and that's the job of the push registry. On the back end are the push message senders, which are typically our back-end microservices. They need a simple, robust, but high-throughput way to send push messages to our clients. But those push message senders don't really know about all the infrastructural details that I'm explaining to you now. What they ideally want is a single one-liner call that lets them push a message to a client, given a client ID. Our Zuul Push library provides them that interface by hiding all our infrastructure details behind a single asynchronous sendMessage call. Behind the scenes, the sendMessage call takes the push message from the sender and drops it in the push message queue. By introducing message queues between our senders and our receivers, we effectively decouple them, making it easier for us to run them independently. The message queues also let us withstand wide variations in the number of incoming messages; they act as a buffer, absorbing big spikes of incoming push messages. Finally, the message processor is the component that ties all these other components together to do the actual push message delivery. It reads messages off the message queue. Each push message is addressed to a particular client by client ID or device ID. The processor then looks up that client in the push registry to figure out which server that client is connected to. If it finds a push server for that client in the push registry, it will directly connect to that push server and hand over the push message for delivery to that client. If, on the other hand, it doesn't find a record for that client in the push registry, it means the client is not connected at this time, it's not online, and in that case it just drops the message on the floor.
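The talk describes the sender-facing surface only as a single asynchronous sendMessage call, so here is a minimal sketch of what a fire-and-forget sender might look like. The PushMessageSender interface and the message payload are hypothetical; the real Zuul Push library API may differ.

```java
// Hypothetical sender-side sketch; the real Zuul Push library API may differ.
interface PushMessageSender {
    // Fire-and-forget: enqueue and return immediately.
    void sendMessageAsync(String clientId, String payloadJson);
}

public class RecommendationNotifier {
    private final PushMessageSender push;

    public RecommendationNotifier(PushMessageSender push) {
        this.push = push;
    }

    public void onNewRecommendations(String customerId) {
        // One line for the sender; the library hides queuing, registry lookup,
        // and routing. Behind the scenes this drops the message on the push
        // message queue, and the message processor delivers it later.
        push.sendMessageAsync(customerId, "{\"type\":\"NEW_RECOMMENDATIONS\"}");
        // The sender carries on with its work; delivery happens asynchronously.
    }
}
```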
Now that we have seen the different components that make up the Zuul Push infrastructure and how they operate together, we can dig a little deeper into some of them. The Zuul Push server is probably the biggest piece of the whole infrastructure. Our Zuul Push cluster, which comprises multiple Zuul Push servers, in aggregate handles 10 million persistent, always-on, concurrent connections today at peak, and it's rapidly growing. The Zuul Push server is based on the Zuul Cloud Gateway, and that's why it shares its name. Zuul Cloud Gateway is the API gateway that my team owns and operates, and it fronts all the HTTP traffic that comes into the Netflix ecosystem; at peak, Zuul handles more than 1 million requests per second. It was recently rewritten to use non-blocking async I/O, so it provided a perfect foundation on which to build our massively scalable Zuul Push server.

But you may ask, why do you need non-blocking async I/O in this case? Many of you are probably familiar with the C10K challenge. The term C10K was first coined in 1999, I believe, and the challenge simply states: you have to support 10,000 concurrent connections on a single server. We have, by the way, long since blown past that initial 10,000 number, but the name kind of stuck. This capability to support thousands and thousands of open connections on a single server is fundamental to a server like Zuul Push, which as an aggregate cluster has to handle millions of always-on, open, concurrent connections, which are mostly idle, by the way. The traditional way of network programming cannot easily be scaled to meet the C10K challenge. The traditional way is to spawn a new thread per incoming connection and then let that thread do blocking reads and writes on that connection. This doesn't scale, mainly because you will quickly exhaust your server's memory allocating 10,000 stacks for those 10,000 threads. You'll also most probably pin down your server's CPUs with constant context switches between those 10,000 threads. So it's not an efficient way to support a large number of open connections.

Async I/O follows a different programming model. It uses operating-system-provided multiplexing I/O primitives, like kqueue or epoll, or I/O completion ports if you are on Windows, to register read and write callbacks for all those open connections on a single thread. From then on, when any of those connections is ready to do a read or write operation, its corresponding callback gets invoked on that same thread. So you no longer need as many threads as you have open connections, and that way it scales much better. The trade-off is that your application is somewhat more complicated now, because you as the developer are responsible for keeping track of the state of all those connections inside your code. You cannot rely on the thread stack to do so, because the thread is shared between all those connections; you typically do it by using some kind of event loop or state machine inside your code.

We use Netty to do this asynchronous, non-blocking I/O. Netty is a great open source library written in Java, and it's used by many popular open source Java projects like Cassandra and Hadoop, so it's very well tested and battle-proven. We're not going to go into the details of Netty programming in this talk, it's a subject in itself, but just to give you an idea from ten thousand feet of how an abstract Netty program is structured: the channel inbound and outbound handlers that you see here are analogous to the read and write callbacks that we just discussed a slide ago. And this is a simplified depiction of how the Zuul Push Netty pipeline looks. There are a lot of things going on in here, but I really want to draw your attention to just two of the highlighted methods, getPushAuthHandler and getPushRegistrationHandler. You can override these methods to plug your own custom authentication and push registration mechanisms into Zuul Push. The rest of the stuff that you see here, like the HTTP server codec and the WebSocket server protocol handler, are standard protocol parsers provided by Netty off the shelf, which is great, because it means Netty is doing most of the heavy lifting here, like parsing the low-level HTTP and WebSocket protocols.
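For readers who haven't seen Netty before, here is a minimal sketch of a WebSocket server pipeline in the same shape as the one described: off-the-shelf protocol handlers first, custom handlers last. This is not the actual Zuul Push pipeline; the port, the "/push" path, and the echo handler are illustrative.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;
import io.netty.handler.codec.http.HttpObjectAggregator;
import io.netty.handler.codec.http.HttpServerCodec;
import io.netty.handler.codec.http.websocketx.TextWebSocketFrame;
import io.netty.handler.codec.http.websocketx.WebSocketServerProtocolHandler;

public class MiniPushServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);   // accepts connections
        EventLoopGroup workers = new NioEventLoopGroup(); // a few threads serve many mostly-idle sockets
        try {
            ServerBootstrap b = new ServerBootstrap()
                .group(boss, workers)
                .channel(NioServerSocketChannel.class)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) {
                        // Standard parsers Netty provides off the shelf, as in the talk:
                        ch.pipeline().addLast(new HttpServerCodec());
                        ch.pipeline().addLast(new HttpObjectAggregator(64 * 1024));
                        ch.pipeline().addLast(new WebSocketServerProtocolHandler("/push"));
                        // Your custom handlers (auth, registration, delivery) go at the end:
                        ch.pipeline().addLast(new SimpleChannelInboundHandler<TextWebSocketFrame>() {
                            @Override
                            protected void channelRead0(ChannelHandlerContext ctx, TextWebSocketFrame frame) {
                                // Inbound callback, invoked on the event loop,
                                // never on a dedicated per-connection thread.
                                ctx.writeAndFlush(new TextWebSocketFrame("ack: " + frame.text()));
                            }
                        });
                    }
                });
            b.bind(7777).sync().channel().closeFuture().sync();
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}
```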
Each client that connects to a Zuul Push server for the first time has to authenticate and identify itself before it can start receiving push messages from the server. As I said, you can plug in your own custom authentication, and the way to do that is to extend the class we provide, PushAuthHandler, and override its doAuth method. The doAuth method gets the original WebSocket connect request passed in as a parameter, so inside that method you have full access to its request body, headers, and cookies, which you can then use to implement your custom authentication.

Moving on, the push registry, as we saw, is the component that keeps the mapping from each push client to the server that client is connected to. Just like custom authentication, we let you plug your own custom push registration mechanism into the Zuul Push server. The way to do that is, again, to extend our PushRegistrationHandler class and override its registerClient method. The example here stores that mapping in a Redis store; registerClient is the method you implement to serialize that mapping into your push registry whichever way you see fit.

So you can use any data store of your choice as a push registry, but for the best results that data store should have the following characteristics. It should have low read latency. This is important because you write a record into the push registry only once for every client, when it first connects, but you read it multiple times: every single time someone tries to send a message to that client. So low read latency is important; you can somewhat compromise on write latency if you have to. Your data store should also support per-record expiry, a TTL (time to live) of some sort. This is necessary because when a client disconnects cleanly from the push server at the end of its life cycle, which will happen 99% of the time or more, the push server will take care of cleaning up its record from the push registry, so it's no longer found. But in real life you cannot rely on every single client disconnecting cleanly every single time. Sometimes servers crash, sometimes clients crash, and any of this will result in what we call leaving behind a phantom registration record in your push registry: a record that says this client is connected to this particular server, which is no longer accurate, because the server has gone away or the client has gone away. Zuul Push relies on the TTL to clean up such phantom registration records after a certain timeout. Besides these two desirable features, there is the usual laundry list of suspects, like sharding for high availability and replication for fault tolerance. Given these features, any of several data stores would be a great choice for your push registry. What we use internally is Dynomite. It's yet another open source project from Netflix; it takes Redis and wraps and augments it with features like auto-sharding, cross-region replication, and read/write quorums. It's another great choice for your push registry.
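Here is a sketch of what a Redis-backed registry with the TTL behavior described above might look like. The method names mirror the ones mentioned in the talk, but the signatures are simplified and hypothetical (consult the Zuul repo for the real PushRegistrationHandler API), and the Jedis client, host, and TTL value are my assumptions.

```java
import redis.clients.jedis.Jedis;

// Sketch of a Redis-backed push registry, in the spirit of the registerClient
// override described in the talk. Simplified and hypothetical, not the real API.
public class RedisPushRegistry {
    private static final int TTL_SECONDS = 30 * 60; // outlives the ~30 min connection lifetime

    private final Jedis redis = new Jedis("registry.example.internal", 6379); // hypothetical host

    // Called once, when a client completes auth and registers.
    public void registerClient(String clientId, String pushServerAddress) {
        // SETEX writes the record with a TTL, so a crashed client or server
        // leaves at worst a short-lived phantom record that expires on its own.
        redis.setex(clientId, TTL_SECONDS, pushServerAddress);
    }

    // Called many times: every message send does this low-latency read.
    public String lookupClient(String clientId) {
        return redis.get(clientId); // null => client not connected; drop the message
    }

    // Called on a clean disconnect (the ~99% case).
    public void deregisterClient(String clientId) {
        redis.del(clientId);
    }
}
```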
Finally, the message processor is the component that does the message queuing, routing, and delivery on behalf of our senders. We use Kafka for our messaging infrastructure; those queues decouple our senders and receivers. Most of our push message senders take a fire-and-forget approach to message delivery: they use the push library to drop a push message in the queue and carry on with their work. A few of them might care about the actual end status of the push message delivery, and those can get to the final status either by subscribing to the push delivery status queue or by reading it from a Hive table in batch mode, where we log every push message delivery.

Netflix runs in three different regions of the Amazon cloud. A back-end push message sender trying to send a message to a particular client typically has no idea which region that client might be connected to, so our push messaging infrastructure takes care of routing the message to the correct region on behalf of our senders. At the base level, we rely on Kafka's message queue replication to replicate these messages across all three regions, so that we can deliver them in whichever region the client is connected.

In practice, we have found we can use a single push message queue to deliver all sorts of push messages and still stay below our delivery latency SLA, but if you are worried about something called priority inversion, our design allows you to use different message queues for different priorities. Priority inversion happens when a message of higher priority is made to wait behind a bunch of messages of lower priority because you are using a single queue for them all. Having different message queues for different priorities guarantees priority inversion will never happen.

We run multiple instances of our message processors to scale up our message processing throughput. The message processors are based on Mantis, our internal scalable stream processing engine, somewhat similar to Apache Flink. It uses the Mesos container management system, which makes it easy for us to quickly spin up a bunch of message processor instances if we are falling behind on our message backlog. Critically, it comes with out-of-the-box support for scaling the number of message processor instances based on how many messages are waiting in the push message queue. This feature alone makes it very easy for us to meet our delivery latency SLA under a wide variety of loads while still staying resource-efficient.
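Since the talk only says that senders drop messages on a Kafka queue, here is a minimal sketch of that fire-and-forget producer side. The broker address, topic name, key, and payload are all made up for illustration.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class PushQueueProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka.example.internal:9092"); // hypothetical brokers
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by client ID keeps one client's messages ordered within a partition.
            ProducerRecord<String, String> record = new ProducerRecord<>(
                    "push-messages",                       // hypothetical topic, i.e. the push message queue
                    "customer-12345",                      // client ID the message is addressed to
                    "{\"type\":\"NEW_RECOMMENDATIONS\"}"); // payload

            // Fire-and-forget: send() is asynchronous; we neither block on the
            // future nor register a callback, mirroring most senders in the talk.
            producer.send(record);
        } // close() flushes buffered messages before the process exits
    }
}
```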
At this point I'd like to switch gears a little and go over some of the operational lessons and tactics that we learned when we first started operating Zuul Push in production. Zuul Push is somewhat different from the normal stateless REST services that we were used to until then, so it required a little TLC, tender loving care, when we first started operating it in production. The biggest difference is the long-lived, stable connections; they make Zuul Push servers somewhat stateful. Long-lived, stable connections are great from the client's point of view, because they improve the client's efficiency dramatically: clients no longer have to continuously make and break connections like they would with plain HTTP. That's why we all rejoiced when WebSockets were finally widely supported and replaced hacks like Comet and long polling.

But they're quite terrible from the point of view of somebody who's trying to operate the servers, mostly because they make quick deploys and quick rollbacks problematic. Let me give an example. Let's say you have to push a certain urgent fix, so you deploy a new build. Now you have a new cluster with the new build in production, but all your clients are still happily connected to your old cluster, because remember, they open their connection only once, when they start up, and they hang on to that connection for the entirety of their lifetime. They are not automatically migrated to the new cluster just because you deployed it. You would have to forcefully migrate them by killing the old cluster, but if you do that, they are all going to swamp your new cluster at the exact same time, giving rise to something we call a thundering herd. So this is a lose-lose scenario. A thundering herd is basically a large number of clients all trying to connect to the same service at the same time. It causes a big spike in connections or traffic, orders of magnitude higher than your normal steady-state traffic, and it's one of the things you have to watch out for when you're trying to build a resilient and robust system.

The way we found out of this pickle was to limit the client connection lifetime. All our clients are coded in such a way that they know to try to reconnect whenever they lose their connection to the server. We took advantage of that fact, and we auto-close connections from the server side after a certain period. When a client loses its connection, it's going to try to reconnect, and when it does so, it will most likely land on some different server, because of the way load balancers work. This takes care of a single client's stickiness to a single push server, which is at the root of all our deployment and rollback woes. We have tuned this connection lifetime carefully to strike a good balance between the client efficiency that we desire and the client stickiness that we are trying to avoid.

Not only do we limit a single client's connection lifetime, we also randomize it from connection to connection, and this is important to dampen, as time progresses, any thundering herd that you may still get in spite of the best of your designs and intentions. I'll give you an example. Let's say there was some network blip, many of your clients dropped their connections, and they are all going to reconnect. So you're going to get a thundering herd even though you accounted for it. The problem is they all connect at around the same time, t = 0, let's say. Now, if all of them get the exact same connection lifetime, say 30 minutes, then they are all going to disconnect again right at the 30-minute boundary, and they are all going to reconnect at that same boundary, and this will go on in perpetuity. Any blip will cause this, and the only thing that's worse than a thundering herd is a recurring thundering herd. But consider now that instead of giving each one of them exactly 30 minutes, you randomize the connection lifetime by plus or minus two minutes, so each gets a connection lifetime of something between 28 and 32 minutes. When that happens, on the next reconnect attempt they get dispersed a little on the timeline: some of them will drop their connection at 28 minutes, some at 29, 30, 31, and 32, and all the seconds in between. This has the effect of spreading that initial peak out over four minutes, and when they reconnect, they again get another randomized period, so as time progresses the spike automatically dampens, like the curve on this slide shows. It's a very simple trick, but it's very valuable for correctly taming any thundering herd that you will eventually get, one day or another.
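Here is a small sketch of that randomized recycling on the server side, using the 30 ± 2 minute numbers from the talk; the scheduling code itself is illustrative, not Zuul Push's implementation.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import io.netty.channel.Channel;

// Sketch of server-side randomized connection recycling: 30 minutes +/- 2 minutes,
// drawn fresh for every connection so reconnect waves spread out and dampen.
public final class ConnectionRecycler {
    private static final long BASE_MINUTES = 30;
    private static final long JITTER_MINUTES = 2;

    public static void scheduleRecycle(Channel clientChannel) {
        // Uniformly random lifetime in [28, 32] minutes, re-drawn per connection.
        long lifetimeSeconds = TimeUnit.MINUTES.toSeconds(BASE_MINUTES - JITTER_MINUTES)
                + ThreadLocalRandom.current().nextLong(
                        TimeUnit.MINUTES.toSeconds(2 * JITTER_MINUTES) + 1);

        // Runs on the channel's own event loop; no extra threads needed.
        clientChannel.eventLoop().schedule(
                () -> clientChannel.close(), // or: ask the client to close (see next section)
                lifetimeSeconds, TimeUnit.SECONDS);
    }
}
```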
There is a nice extra optimization on top of this. I know just a couple of slides ago I said we auto-close the connection from the server side, but that's no longer entirely accurate. It used to be the case, but we flipped it around in our latest version, so that now our server sends a message to the client asking the client to close its connection. I know it sounds like a roundabout way of doing the same thing, but we do it because of how TCP works. According to the TCP spec, the party that initiates the close of a connection ends up in the TIME_WAIT state, the last state in the TCP teardown flow. The problem with TIME_WAIT is that on Linux it can tie up that connection's file descriptor for up to two minutes. Our server is the one handling thousands and thousands of concurrent connections, so our server's file descriptors are much more valuable than our clients'. By having the client close the connection in this roundabout way, we make sure the server's file descriptors are conserved. There is a flip side to this optimization, though: you now have to be prepared to handle misbehaving clients that won't follow the server's lead and close the connection when they are told to. To handle such clients, we start a timer when we send the close-connection message downstream, and then forcefully close the connection from the server side if the client doesn't comply within a set time limit.
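A sketch of that "ask the client to close, but keep a timer" tactic follows. The "__CLOSE__" control message and the 30-second grace period are made up; the talk describes the mechanism, not the wire format or the timeout value.

```java
import java.util.concurrent.TimeUnit;
import io.netty.channel.Channel;
import io.netty.handler.codec.http.websocketx.TextWebSocketFrame;

public final class GracefulCloser {
    private static final long GRACE_SECONDS = 30; // assumed value

    public static void requestClientClose(Channel clientChannel) {
        // Ask the well-behaved client to initiate the TCP close, so the client,
        // not the server, ends up holding TIME_WAIT (and the file descriptor cost).
        clientChannel.writeAndFlush(new TextWebSocketFrame("__CLOSE__"));

        // Misbehaving clients won't comply, so schedule a forced server-side
        // close; it's a no-op if the client already closed the connection.
        clientChannel.eventLoop().schedule(() -> {
            if (clientChannel.isActive()) {
                clientChannel.close();
            }
        }, GRACE_SECONDS, TimeUnit.SECONDS);
    }
}
```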
With all these tweaks we more or less took care of the sticky, stateful connection problem, and next we focused our attention on optimizing our push cluster size, on reducing the number of push servers we need to support our traffic. Our big epiphany here was that most of our connections were idle most of the time, which meant that even with a large number of connections open to a single server, the CPU and memory on that server were not under any particular pressure. Armed with this knowledge, we chose a really big, meaty Amazon instance type for our push servers, carefully tuned the TCP kernel parameters, the JVM startup options, and things like that, and crammed in as many connections as we possibly could. Then one of those servers crashed on us in production, and of course we got a visit from our dear old friend, the thundering herd: all those thousands and thousands of connections that we had crammed so efficiently onto a single server came roaring right back with reconnects. You know you have a problem when a single server going down in production can start a stampede in your system. So we licked our wounds, learned from our mistake, and for the second round we went with a Goldilocks strategy: you don't want to run your servers either too hot or too cold. We found the Amazon instance type that was just right for us, which happens to be m4.large in Amazon terminology, basically a server with 8 GB of RAM and two virtual CPUs. With our load testing and squeeze testing we figured out that on that server configuration we can comfortably handle up to 84,000 concurrent connections at a time, and if a server of that instance type goes down, or even a couple of them go down, we are comfortable handling those reconnects; that's the size of thundering herd we can handle given our traffic. The real lesson here is that you should optimize for the total cost of operating your server farm, not for a low server count. I know when it's stated like that it sounds obvious, and it should have been obvious, but initially it wasn't for us, mainly because I think we conflated efficient operation with a low number of servers, when in reality a larger number of cheaper servers is preferable to a smaller number of large servers, as long as your cost stays the same.

The next problem we had to address was how to autoscale, how to increase and decrease the number of our push servers as our traffic rises and falls. The two most common go-to strategies for autoscaling are requests per second (RPS) and average CPU load. Both of them are ineffective for a push cluster: there is no RPS, because these are long-lived persistent connections, there are no continuous requests, and CPU stays low, as we said, even with a large number of open connections. So how do you autoscale? The only limiting factor for a push server is the number of open connections it is handling at any instant, so it makes perfect sense to autoscale the push cluster on the average number of open connections. Thankfully, Amazon makes it very easy to autoscale on any metric you want, as long as you can export it as a custom CloudWatch metric, and that's what we ended up doing: we export the number of open connections from the server process, and we hook our autoscaling policies up to that metric.
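Here is a sketch of exporting an open-connection count as a custom CloudWatch metric that an autoscaling policy can then target, using the AWS SDK for Java (v1). The namespace and metric name are made up; this is not Netflix's exporter.

```java
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.MetricDatum;
import com.amazonaws.services.cloudwatch.model.PutMetricDataRequest;
import com.amazonaws.services.cloudwatch.model.StandardUnit;
import java.util.concurrent.atomic.AtomicLong;

public class OpenConnectionsReporter {
    private final AtomicLong openConnections = new AtomicLong(); // maintained by the push server
    private final AmazonCloudWatch cloudWatch = AmazonCloudWatchClientBuilder.defaultClient();

    // Call periodically (e.g., once a minute) from the push server process.
    public void report() {
        MetricDatum datum = new MetricDatum()
                .withMetricName("OpenConnections")       // hypothetical metric name
                .withUnit(StandardUnit.Count)
                .withValue((double) openConnections.get());
        cloudWatch.putMetricData(new PutMetricDataRequest()
                .withNamespace("Custom/PushServer")      // hypothetical namespace
                .withMetricData(datum));
    }
}
```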
The last hurdle we had to clear was that Amazon Elastic Load Balancers cannot proxy WebSockets. Our cluster is in the Amazon cloud, of course, and the servers sit behind these Amazon Elastic Load Balancers, or ELBs for short. Unfortunately, ELBs do not understand the initial request a WebSocket client makes, which is called a WebSocket upgrade request. It's an HTTP request, but a special one, and because they do not understand it, they handle it like any other HTTP request, which means as soon as the server returns a response, the connection is broken. That is not what we want; we want a persistent connection, but you cannot have a persistent WebSocket connection through an ELB run this way. By the way, this is not specific to Amazon ELBs: any load balancer, hardware or software, that doesn't understand the WebSocket protocol is going to run into the same issue; that includes older versions of HAProxy and NGINX, before they started supporting the WebSocket protocol. The way we got around this problem was to make the Amazon ELBs run as TCP load balancers. By default, Amazon ELBs run as HTTP load balancers, doing load balancing at layer 7, but there is a configuration switch you can flip that makes them run as TCP load balancers, doing load balancing at layer 4. At that point they just proxy the TCP packets back and forth, without trying to parse or interpret any of the layer-7 application protocol, which would be HTTP in our case. This keeps them from mangling the WebSocket upgrade request they do not understand. I do want to note, however, that Amazon has since come out with a new load balancing offering called ALB, the Application Load Balancer, which is supposed to support WebSockets natively. Unfortunately, it came too late for us; by that time we had already figured out all these tweaks to make ELBs do what we want. But if you are starting out today, you may want to give ALB a try.
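For illustration, here is what that "flip the ELB to layer 4" setup can look like when expressed with the classic ELB API in the AWS SDK for Java (v1). The load balancer name, ports, and subnet are placeholders; the point is Protocol=TCP rather than HTTP, so the WebSocket upgrade passes through unparsed.

```java
import com.amazonaws.services.elasticloadbalancing.AmazonElasticLoadBalancing;
import com.amazonaws.services.elasticloadbalancing.AmazonElasticLoadBalancingClientBuilder;
import com.amazonaws.services.elasticloadbalancing.model.CreateLoadBalancerRequest;
import com.amazonaws.services.elasticloadbalancing.model.Listener;

public class TcpModeElb {
    public static void main(String[] args) {
        AmazonElasticLoadBalancing elb = AmazonElasticLoadBalancingClientBuilder.defaultClient();
        elb.createLoadBalancer(new CreateLoadBalancerRequest()
                .withLoadBalancerName("push-elb")               // hypothetical name
                // Layer-4 proxying: client port 443 -> instance port 7777, no HTTP parsing.
                .withListeners(new Listener("TCP", 443, 7777))
                .withSubnets("subnet-0123456789abcdef0"));      // hypothetical subnet
    }
}
```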
So let's do a quick recap of how you effectively manage a push cluster in production. You want to recycle connections after tens of minutes, ideally between 25 and 30 minutes; that's the sweet spot we found. You want to randomize each connection's lifetime to tame the thundering herd over time. You should prefer a larger number of smaller servers over fewer but bigger, more expensive servers; that will withstand a thundering herd scenario much better. You should autoscale on the number of open connections rather than CPU or RPS, which are the more commonplace choices. And finally, if you are going to put your push cluster behind a load balancer, either use a WebSocket-aware load balancer or run your load balancer in TCP mode; either will work.

So let's say you did all that, and now you have a push cluster in production running flawlessly. What can you do with it? Now that we have our push hammer at Netflix, we are seeing a lot of nails. We plan to use it for on-demand diagnostics: we can detect misbehaving devices or clients in the field, ones that are generating lots of errors in their telemetry, and then send a special push message to those devices asking them to upload their state and any other relevant diagnostic details to the cloud, so that we can triage and debug why they are producing these errors. And if all of that extra debug data still doesn't tell us, we can always reach for the most trusted tool in any software developer's toolbox and restart the application, which we can now do remotely. What could go wrong? But if something does go wrong, now we can send you a message saying we are sorry, because we have push messaging capability. So hopefully you have some good ideas about where you can use push messages too.

I've been talking for more than 40 minutes now, pleading the case for push, and at this point I have only one last request to make of all of you. Everything we discussed so far is open source; you can find it inside the Zuul project in our Netflix OSS repo on GitHub. It even comes with a toy sample Zuul Push server that you can fire up immediately and start playing with. So give it a spin, file bugs, and if you would be so kind, maybe send us a wonderful pull request or two. In conclusion: push can make you rich, thin, and happy. Thank you. I'd be glad to take any questions about Zuul Push architecture or operations at this point.

Q: Netflix has, what, 100 million customers? (125 million.) So how do you test something like this? It goes to mobile and many different devices, so how do you QA these messages?
A: These are different from normal push messages; since they are background push messages, the main use case is not to show something to the user. Take the use case I just explained, where you have a new list in the cloud and you send a push message for the client to download that list: you can track whether, for the push message I sent, that client actually came back and downloaded the list. All these background push messages have some action coupled with them that the client takes, and we track those actions. And the second part of the answer is that when rolling out, Netflix lives and dies by A/B tests. So we put a small percentage of clients into an A/B test cell that was enabled for push messaging, tested with them, and then rolled it out to 100 percent of the users.

Q: How does this architecture compare with Apple's APNs notification service?
A: That's a great question. Apple being Apple, I don't have a lot of insight into their architecture. The last I heard, they use some variant of XMPP, so APNs is really XMPP, and I think they use Erlang. But at a conceptual level they are very similar: clients open persistent connections, and then you have to do all this same stuff; I'm sure they do it in some other flavor or version. As for the push messaging protocol, we went with WebSockets and SSE because these are open web protocols, and anybody can build clients for them.

Q: Regarding the protocol, do you use JSON, or a binary protocol for communicating?
A: Thanks for your question. All of the current use cases use JSON, but there is nothing in our push message architecture that mandates JSON. We establish a WebSocket connection, and we support both text WebSocket frames and binary WebSocket frames, so tomorrow anyone could send something like protobuf as a binary message. Having said that, yes, today most of our messages are JSON.

Q: Two questions. One: how many connections per server, with how much memory? And two: is this only used for push, or can the client communicate back, like requesting information from you?
A: Good questions. Our current server has 8 GB of RAM, and we are comfortable pushing that server to 84,000 concurrent connections, though we operate at a level below that to give us some headroom, so mostly around 72K connections. On the second question: yes, in theory the client can send something to the server, and there are a few corner cases where clients do that, but as I just went over, this WebSocket connection is sticky, so it has all these problems. We try to keep all the upstream API traffic on HTTP, because that comes with all the stateless benefits. So mostly this is used for communication from server to client, but there is nothing in the design that precludes sending something up from client to server; we just try not to, for architectural reasons.

Q: You mentioned that you disconnect the clients, or the clients disconnect themselves. How do you cope with the fact that some messages are going to be lost because the client is not connected at that exact moment, in between connections?
A: That really depends on the client. For most clients, the perfect example is again the recommendations screen: whenever the client starts up, it fetches the first page of recommendations anyway. So if your client was offline when I sent the message saying a new recommendation is ready, and it did not receive that message, we can safely ignore that, because on startup the client fetches that first page regardless; in those cases it's not a problem. In other use cases, where you need a guarantee that you never lose a message, what we advise clients to do is something we call hand-over-hand transfer of WebSocket connections: if they have a WebSocket connection and they get a close-connection message, we ask them to open another WebSocket connection first, so for a moment they have two WebSocket connections to our push servers, and only then close the old one. The way our push registry is structured, the last write wins, so when you open a new push WebSocket connection, that record supersedes your old record, and then you can safely shut down your old connection. That way you always have one connection open to a push server, and you will not lose any messages. But again, having said that, it's just like push notifications: they work 99.99 percent of the time on mobile, but if you read the APNs documentation, they say it's best-effort delivery, not guaranteed delivery. The same is true of Zuul Push messages.
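Here is a sketch of that hand-over-hand reconnect on the client side, again using the Java 11 WebSocket API: connect the replacement first, then close the old connection, so there is never a gap with zero connections. The URL and the "__CLOSE__" trigger message are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

public class HandOverHandClient {
    private final HttpClient http = HttpClient.newHttpClient();
    private final URI pushUri = URI.create("wss://push.example.com/ws"); // hypothetical endpoint
    private volatile WebSocket current;

    void onServerRequestedClose() {
        WebSocket old = current;
        // 1. Connect the replacement first; the registry's last-write-wins
        //    semantics make the new record supersede the old one.
        http.newWebSocketBuilder()
                .buildAsync(pushUri, new Listener())
                .thenAccept(fresh -> {
                    current = fresh;
                    // 2. Only now drop the old connection; no message falls in the gap.
                    if (old != null) {
                        old.sendClose(WebSocket.NORMAL_CLOSURE, "recycled");
                    }
                });
    }

    private class Listener implements WebSocket.Listener {
        @Override
        public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
            if ("__CLOSE__".contentEquals(data)) {
                onServerRequestedClose(); // server asked us to recycle the connection
            }
            ws.request(1);
            return null;
        }
    }
}
```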
Q: What kind of information do you save, what is the state you're storing on the back end? And if you have an upgrade where you need to change the format of that state, how do you handle the switch to the new format?
A: You mean the state we save in the push registry? Simplified, it's just two pieces of information: your client is identified either by a customer ID or a device ID, or a combination of the two, and that is the key; the value is the internal TCP/IP address of the Zuul Push server that client is connected to. We don't store any other state, and if you store just that state, it's practically immutable. The only thing that can change it is your client dropping the connection, or crashing and just going away. In the first case, the client terminates cleanly, which clears the record; in the second case, the TTL takes care of removing the record. So for some amount of time that record may be inaccurate, but that's not a problem, because the worst that can happen is that you look up the record, see that this server supposedly has this client's connection, connect to that server, and ask it to send the push message downstream, and the server responds that it doesn't have that client. The server itself maintains all the client connections against their client IDs in its memory, and the server is always more authoritative, so it will say this is no longer correct, and then the message processor will clear the registry record.

Q: Is all of the deduplication handled on the client, or is something done on the servers as well?
A: You mean deduplication of messages? Yes, it's mostly on the client. We could do it on the server, but then you get into application-specific logic, because you have to understand what counts as a duplicate message, and some of our messages are actually batched: if I send a message and it doesn't get there, the second message may contain that message plus another part. So deduplication on the server side is not that easy, you have to understand the application semantics, so we let clients handle it. We do give each message a unique GUID and an application message type, and that makes it easy for clients to tell whether they have seen it before.

I think we are being asked to wrap up; we are running out of time. Thank you, thank you so much. I want you to remember: push is green and easy. Thank you.
Info
Channel: InfoQ
Views: 40,999
Rating: 4.9377432 out of 5
Keywords: Software Architecture, Performance, Zuul, Zuul Push, SOA, Enterprise Architecture, Message Passing, Scalability, Messaging, Netflix, InfoQ, QCon, QCon New York
Id: 6w6E_B55p0E
Length: 49min 9sec (2949 seconds)
Published: Fri Nov 09 2018