AWS re:Invent 2018: Choosing the Right Messaging Service for Your Distributed App (API305)

Captions
Welcome. My name is Sid, I'm a Solutions Architect with AWS — I've been with AWS for around four and a half years now — and I'm joined by my colleague. — Hi, I'm Kuba Boychuk. I'm a Principal Engineer in AWS, where I work with Simple Queue Service, Simple Notification Service, Amazon MQ, and also Step Functions and Simple Workflow, and I've been with AWS for three and a half years now. Today we're going to talk about messaging: the importance of messaging in your software solutions, different use cases and how they map to the different messaging services we have in the cloud, and we're also going to see some demos of how those messaging services work. Let's start with a bold statement that my colleague Tim Bray made. From our experience, these are the three things that move you towards messaging: being cloud native, being large-scale, and being distributed — and not having messaging in these cases seems like a bug in your architecture. It makes sense when you think about it: when you used to have a big monolithic application, all the communication happened internally, and you didn't need an extra component for that. As you move towards microservices, there are new communication paths, which messaging can fill very well, solving the problem of communication between components for you. So we like to think about messaging as the fourth pillar on which you build your modern applications. Let's talk about some basics. In messaging we deal with passing messages around: the systems that send messages will typically be called message producers, the systems that consume these messages and work on them are message consumers, and what is passed around are messages. So what are messages? Well, it's whatever you want to send between two systems. Here's an example of a message describing a hotel booking in a hotel-booking system, where the developers chose JSON as the format for messages. You can pass the same information around using whatever format you would like — here's the same message described in XML.
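The slide's actual payload isn't captured in the captions, so here's a hypothetical hotel-booking message rendered both ways; the field names and values are invented for illustration:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical booking message — fields are illustrative, not from the slide.
booking = {"bookingId": "b-1234", "hotel": "Grand Plaza",
           "checkIn": "2018-11-26", "nights": 3}

# The same information as JSON...
as_json = json.dumps(booking)

# ...and as XML.
root = ET.Element("booking")
for key, value in booking.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")
```

As the speakers say, the format is purely a contract between producer and consumer — the service carries either one as an opaque payload.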
When you send messages around, it's up to the producer and consumer services to establish what the format will be and what the contents of the messages are. But you can also send attributes along with your messages, and attributes are just key-values of whatever you want to attach to your message — they can have a business meaning, they can have some technical meaning. So you may ask yourself, for this particular piece of data, where should I put it? Should it go into the payload, or should I attach it as an attribute to the message? And the answer is: the payload can get encrypted; the attributes are never encrypted. Attributes can be used for filtering messages out and for routing them between your services. So looking at a particular piece of information, that is the distinction you can make between putting it into the attributes or into the message payload. And in AWS we have a bunch of systems and services providing messaging to you: we've got SQS for queues, we've got SNS for topics, we've got Kinesis for data streams, and last year at re:Invent we launched Amazon MQ. Amazon MQ is a hosted ActiveMQ broker, and we're not going to talk about Amazon MQ in this session, because the purpose of Amazon MQ is to enable you to move your existing software into the cloud — to take your existing software that talks to some on-premises broker and basically point it towards a hosted ActiveMQ instance in the cloud. In this talk we're going to focus on the messaging services provided by SQS, SNS, and Kinesis, and I'm going to hand it over to Sid to talk about customer use cases for messaging. — Thanks, Kuba. So we'll talk about four customers. We'll look at the use cases — why they needed a messaging platform to begin with and what the requirements were — then look at which AWS service fits their requirements, and finally bring it all together by looking at their actual architecture. These are real customer use cases and real architectures. Who here has shopped groceries from Amazon Fresh?
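The payload-versus-attributes distinction shows up directly in the SQS SendMessage call. Here's a sketch of the parameter shape (the `MessageAttributes` structure with `DataType`/`StringValue` is the real SQS API shape; the queue URL, fields, and values are hypothetical):

```python
import json

# Business data travels in the body; routing/filtering hints travel as
# message attributes, which consumers can inspect without parsing the body.
params = {
    "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/bookings",
    "MessageBody": json.dumps({"bookingId": "b-1234", "nights": 3}),
    "MessageAttributes": {
        "eventType": {"DataType": "String", "StringValue": "BookingCreated"},
        "region": {"DataType": "String", "StringValue": "us-east-1"},
    },
}
# With boto3 you would call: boto3.client("sqs").send_message(**params)
```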
A raise of hands — oh, that's less than what I expected. For those of you who don't know, Amazon Fresh is a grocery delivery service in select cities in the United States. Now if you go and look at the Amazon Fresh website, you'll realize that the products vary based on which city you are in, and that's because customers like locally sourced, fresh products and products which are in season — which means the Amazon Fresh team has to continuously update the catalog based on customer feedback. So for that they built a product selection portal, and the idea is that a selection manager browses through this portal on a daily basis and makes changes. Now, from an architecture standpoint, any change made to their portal has to be automatically pushed to a retail and purchasing system, which would restock inventory. And what better way to decouple, or connect, two loosely coupled systems than to use a messaging platform? Let's look at what requirements they had. They wanted a durable, persistent, and highly available system. The system should scale — whether you are sending one message or one million messages, you shouldn't have to care about that; let the messaging platform handle it. Lastly, it should be easy to manage: all of these features should not come at the cost of complexity. So I'll let Kuba tell us what the right messaging platform is in this case. — Thanks. Let's talk about Amazon SQS standard queues. They've been available in AWS since nearly its launch. In the middle you can see how you should think about an SQS standard queue: it's a collection of messages, kind of ordered but not really. On the left-hand side you see producers that will send messages into the queue, and on the right-hand side you see consumers consuming from this queue. Let's walk through how a standard SQS queue behaves when you start sending messages to it and receiving from it, to understand how it behaves for your service. So let's say the first producer attempts to send message A into the queue.
The producer calls SendMessage on SQS, and SQS stores the message durably, in multiple copies that you don't really see when you later receive the message. When you see an OK response from SQS, you can be sure that the message is not lost — it's stored in multiple copies across availability zones. What happens when another producer sends message B into the queue and calls SendMessage? In this scenario, let's imagine there was some kind of networking problem while the second producer was calling SendMessage: SQS saw the call and stored the message durably, but the producer saw some kind of socket timeout — it doesn't know whether SQS got the message or not. So what the producer will do is retry the send, and because of that retry, SQS standard cannot detect that this is a duplicate message; it will store another copy for you. That's one of the reasons why you can sometimes see duplicates in a standard SQS queue: it's enough that the sender retries because it saw some connectivity issue. Now let's see how a standard queue behaves when you want to consume messages from it. One of the most important features of SQS is the first S in the name: simple. When you want to consume messages, the only thing you need to do is call ReceiveMessage and provide the queue URL. You don't get to tell SQS which message you want to receive — it's the responsibility of SQS to select the next best message to give out to you. So the first consumer calls ReceiveMessage, SQS selects a message to return and hands it back to the consumer, and the consumer can start working on it. Now notice that the message is still in the queue — it's not immediately removed, it's invisible. You can control the invisibility timeout, and this invisibility timeout makes sure that if another consumer wants to fetch another message, SQS won't give out this particular message, because someone is already working on it. So the second consumer also calls ReceiveMessage.
For the second one, SQS decides to give out this other message, and again starts the invisibility timeout. What happens when you successfully process the message? The first consumer is done with the message; it's now supposed to call DeleteMessage with the message handle that it got, which actually causes the removal of the message from the queue. So only when the consumer acknowledges that it successfully processed the message and calls DeleteMessage is the message gone. This guarantees that you will process the message at least once — you will always get it. So what happens when a consumer has a problem consuming the message — it cannot understand the message, there's a bad code path, something is wrong? The easiest solution for the consumer is just to forget about the message and do nothing; it doesn't even need to tell SQS anything. It can be just an exception in your code, or maybe the machine just fried and stopped processing anything. What happens next is that the invisibility timeout on the message it was working on expires, and the message is available for consumption again. Let me illustrate this behavior using a pre-recorded demo that I've prepared. Just so it's visible what's happening in the queue, the demo has a lot of Thread.sleep in it, so it's running way slower than the actual service does. In the middle we have the queue; we start a program to query the contents of the queue, and we can see that currently the queue has no messages in it. On the right-hand side we have the first consumer started, and since there are no messages, he's not receiving any messages from the queue yet. We start another consumer, so we have both consumers polling for the next message from the queue. Let's start the producer — this one will send messages labeled A, sending sequential messages — and on the right-hand side you can see how both consumers in parallel are just grabbing the next available message, with no coordination.
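The receive/process/delete cycle just described can be sketched as a minimal boto3-style consumer iteration. This is an illustrative sketch: `receive_message` and `delete_message` (with `QueueUrl`, `VisibilityTimeout`, `ReceiptHandle`) are the real boto3 SQS calls, while the handler and the idea of returning a count are mine; `sqs` is assumed to be a boto3 SQS client or anything with the same two methods:

```python
def consume_batch(sqs, queue_url, handle):
    """One iteration of a standard-queue consumer: receive up to 10 messages,
    process each, delete on success. On failure we simply don't delete, and
    the visibility timeout makes the message receivable again later."""
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,      # long polling
        VisibilityTimeout=30,    # seconds the message stays hidden from others
    )
    processed = 0
    for msg in resp.get("Messages", []):
        try:
            handle(msg["Body"])
        except Exception:
            continue             # skip the delete -> message comes back later
        sqs.delete_message(QueueUrl=queue_url,
                           ReceiptHandle=msg["ReceiptHandle"])
        processed += 1
    return processed
```

Note that "doing nothing" on failure really is the whole error-handling story here — the invisibility timeout does the retry for you.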
The processing happens out of order. Let's start sending messages labeled B. Now you see how both consumers just compete over any available message from the backlog — there is no ordering happening, the numbers go up and down. It's a bag of messages; that's an SQS standard queue. What's nice about it is that when you start a third consumer, or any number of consumers, you just increase the consuming throughput. It's super elastic. So now we have three consumers competing over the available backlog. Let's throw a third producer into the mix, and immediately you see how all three consumers are processing through the backlog. It's important to note here that the number of producers you can have on an SQS standard queue, the number of consumers you can have, and the number of messages you can send through a queue are all close to unlimited. You don't have to provision anything — anything you throw at an SQS standard queue, the queue will gladly accept and serve. So it's very elastic and allows you to scale well. That comes at a specific cost, and the cost is: you will sometimes get messages out of order, and you will sometimes get duplicate messages. So let's hear from Sid how the customer used an SQS standard queue and how they solved the problem of duplicates and message reordering. — So Amazon Fresh decided to use an SQS standard queue. Now, one key thing to understand is that every message is independent and self-contained: it contains a product ID, in this case milk, and how much inventory has to be stocked up. Let's come to the two characteristics Kuba mentioned. First, out-of-order messages: let's say there was a second message for restocking some candies, and the order got flipped, so you got the milk message first. That's not a problem, because each message is self-contained — if we stock up the milk before the candies, I don't think anybody complains, except maybe the kids. Now the second attribute, duplicates: let's say we got the candy message twice. That's not a problem — 12,000 candies
is anyway better than 6,000 candies, problem solved! I'm just kidding. The way the consumer tackles this problem is by keeping a record of the last-seen timestamp for a given product ID, and discarding any message it receives whose timestamp is a duplicate or older. Now let's look at the complete architecture for the product selection portal. They went completely serverless: they hosted their website on S3 using S3 static website hosting, and used client-side scripting — JavaScript — to make API calls to API Gateway, which would in turn launch Lambda functions and dynamically update their website. They were storing their product catalog on RDS, and any change made to RDS would trigger a Lambda function, which would then write the message to the SQS queue. Let's talk about a second use case: Comcast. Comcast is a global telecommunications corporation; many of us have Comcast Xfinity devices which provide internet and television to our homes. Their use case involved a CRM data management system handling a bunch of customer changes — for example, a customer requests a new internet connection, they buy a new internet modem, and a work order is generated for an engineer to go and install the internet service. All of those changes are absorbed by Comcast and then have to be processed by a target CRM system, but they wanted to store all these pending messages, which still had to be processed, in a messaging platform — in a message queue. So let's look at some of the requirements. They required in-order processing for a given customer ID, so we had to make sure that a customer's messages were processed in order. Secondly, for distributed processing, multiple customers' messages can be processed in parallel — they can have thousands of customer messages coming in for different customer IDs, all processed by parallel threads. And they also required once-only processing of each change.
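The consumer-side guard Sid described for the Fresh portal — keep the last-seen timestamp per product ID and discard anything that isn't strictly newer — can be sketched in a few lines (names are hypothetical; this one check absorbs both duplicates and out-of-order deliveries):

```python
last_seen = {}  # product ID -> newest timestamp applied so far

def should_apply(product_id, timestamp):
    """Return True only for a strictly newer update for this product."""
    if timestamp <= last_seen.get(product_id, float("-inf")):
        return False   # duplicate or stale update: discard it
    last_seen[product_id] = timestamp
    return True
```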
Now, all of that translates to these requirements on a messaging platform: durable, scalable, persistent — that's a given and expected — but in addition we needed first-in-first-out functionality, and we cannot have message duplicates. Now I'll hand it over to Kuba to explain which AWS service they went with. — Thanks. So here we're going to talk about a new type of queue that was introduced in SQS two years ago: SQS FIFO queues. Immediately, looking at the image, you can see the difference: the messages are ordered, but it's not just one sequence of ordered messages — in this image we see three. And it's exactly to solve this particular use case, where you don't really need strict ordering of everything in your system, because that would limit your scalability: to process things in a single sequence, you can have just one consumer working on them. Typically what you need is to process things in sequence for a specific subgroup of messages, like a customer account ID, while working on multiple accounts in parallel. So let's see how SQS FIFO does it, and let's again start with producers. The first producer wants to send a message to the FIFO queue, and the producer needs to tell us which message group this message belongs to. What's nice about it is that it's just a string tag you put on a message — you don't have to pre-create the group, you can use as many groups as you like, there's no limitation, and you don't need to explicitly create them. So the producer calls SendMessage on SQS FIFO, and SQS FIFO stores the message, in an ordered way, in that particular group. Again, messages are stored durably, with multiple copies across AZs — you don't see that, but once you've seen an OK from us, the message is persisted durably. What happens when the second producer sends a message, this time to group g3, and a similar case occurs — the network broke, and we have a socket timeout on the producer? SQS saw the send
call and appended the message to the appropriate group, but the producer doesn't know that, so it retries. Now, the thing with SQS FIFO queues is that they keep track of the identifiers of all the messages you sent in the last five minutes, even if they were consumed already, so SQS is able to detect that this is a retried send of the same message B that already went into the queue. So in the SQS FIFO case no duplicate is introduced: we just return an OK to the producer, because we already have that message. As long as you retry within five minutes you're good — no duplicate appears in the queue. OK, so now we've seen how messages are sent to FIFO; let's see how consumers use a FIFO queue. Like with standard queues, it's very simple: you just call ReceiveMessage on the SQS FIFO queue. You don't get to tell which message you want to receive — you cannot even say which group you want to receive from; it's the responsibility of the SQS FIFO queue to pick the next best message for you to work on. So the first consumer calls receive, FIFO decides to hand out the first message from group g1 for him to consume, and he starts working on it. Like in a standard queue, the message is still there in the queue — it's invisible — but there's one extra change: while this message from group g1 is being worked on, no other consumer can receive any other messages from g1. This is how we preserve the ordering of messages within a group — the entire group is kind of locked for you until the first consumer is done with the first message. So when another consumer calls ReceiveMessage on the SQS FIFO queue, SQS FIFO can decide: I'm going to give you the first message from group g3 to work on, and he can start working on it. What happens when you successfully process a message? Like in SQS standard, you're supposed to call DeleteMessage to acknowledge that you're done with it, which removes the message from the queue — and at that point it unblocks further processing of group g1, and another consumer can receive the next message from that group.
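On the sending side, the FIFO behavior just described maps onto two real SendMessage parameters: `MessageGroupId` (the ordering unit) and `MessageDeduplicationId` (what lets SQS recognize a retried send within the five-minute window). A sketch, with a hypothetical queue URL and values:

```python
import json

# FIFO queue names must end in ".fifo". The group ID preserves order within
# one customer account; the dedup ID makes the send retry-safe.
params = {
    "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/crm-changes.fifo",
    "MessageBody": json.dumps({"account": "42", "change": "new-modem"}),
    "MessageGroupId": "account-42",             # ordering unit
    "MessageDeduplicationId": "change-000123",  # stable ID reused on retry
}
# With boto3: boto3.client("sqs").send_message(**params)
```

Retrying the exact same call within five minutes returns OK without enqueueing a second copy, which is the behavior Kuba walked through above.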
So if the third consumer calls ReceiveMessage, he may be the one getting the second message from group g1. It's important to note that you're still processing messages strictly in order — there's no out-of-order processing happening — but you don't have any sort of guarantee about who's going to get the next message for a particular group: all of your consumers can get a message from any available group, so you don't have any kind of consumer affinity. What happens when you fail to process a message? Similar to SQS standard, you can just forget about the message — don't keep track of it — and when the invisibility timeout on the message expires, the group is available for consumption again. So let's see how this all works in a demo, again with pauses introduced so we can see what's happening. Like previously, in the middle we have the queue itself, and it's empty. Let's start the first consumer — the only thing we need to tell the consumer is the identifier of the queue — and start another consumer. The queue is empty, so obviously they don't get any messages yet, but now the interesting part starts. Let's start the producer sending messages labeled A into the queue. Both consumers are always polling for messages, but as you can see, only one of them at a time is working on the next available message from the queue. When we start another producer sending messages labeled B, we now have a situation where both consumers are able to work on something, because there are two groups that can be worked on — and who works on which group can actually change dynamically. If we start a third consumer, he would do some useful work, but because we only have two message groups in the queue, only two out of three are actually doing anything, in order to preserve ordering. So if you start a third producer sending messages with the label C, we now have three groups, so three consumers can do useful work — but again there's no affinity, and you can see how the ownership of who's working on what changes dynamically.
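The group-locking behavior from the walkthrough — an ordered backlog per group, and a whole group held back while one of its messages is in flight — can be illustrated with a toy in-memory model. This is not the service, just a simulation of the receive semantics described above:

```python
from collections import OrderedDict, deque

class TinyFifoQueue:
    """Toy model of FIFO receive semantics: one ordered backlog per message
    group; a group is locked while one of its messages is in flight, so no
    consumer can jump ahead within that group."""
    def __init__(self):
        self.groups = OrderedDict()   # group id -> deque of message bodies
        self.locked = set()           # groups with a message in flight

    def send(self, group, body):
        self.groups.setdefault(group, deque()).append(body)

    def receive(self):
        # Pick the head of any unlocked, non-empty group.
        for group, msgs in self.groups.items():
            if msgs and group not in self.locked:
                self.locked.add(group)
                return group, msgs[0]   # stays in the queue, just invisible
        return None                     # everything is locked or empty

    def delete(self, group):
        self.groups[group].popleft()
        self.locked.discard(group)      # unblocks the group
```

Running it with two groups reproduces the demo: two consumers make progress, a third gets nothing until a group is freed.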
What this means is that it's very elastic when it comes to adding more consumers: in a typical use case where the messages describe changes to a customer account, you have preserved order, but you can work on millions of different accounts and keep throwing more consumers at the FIFO queue. There is a limitation, though: a single FIFO queue has limited throughput. The max it can get to is three hundred each of sends, receives, and deletes per second, so with batching you can get up to three thousand messages per second. Let's hear from Sid how the customer utilized message groups and how they worked around this throughput limitation of a single FIFO queue. — All right, so let's plug SQS FIFO into the architecture. What Comcast did was send all messages for a given customer account number to a particular message group inside an SQS FIFO queue. Additionally, they were able to use multiple SQS FIFO queues to get very high throughput, and they did this by sharding customer account numbers across message groups spread over different SQS FIFO queues, keeping a mapping — of which account number belongs to which message group in which SQS FIFO queue — in a DynamoDB database. Let's talk about our next use case. Okta is an integrated identity and mobility management service: it lets users access their applications from anywhere, anyplace, anytime, securely, using services like single sign-on. Okta captures a bunch of events — user events like authentication, single sign-on, and so on — and they want to use these to perform streaming analysis, to create real-time dashboards, as well as batch analysis. They wanted a messaging platform which would store all of this event data, and then the Apache Storm computational platform would read from this messaging queue and perform sliding-window analysis. From a requirements standpoint: again durable, scalable, persistent, but there's a key difference here — every message is not independent. They wanted
the ability to go back and look at a bunch of messages, to do analysis, detect anomalies, and create alerts. So they wanted some kind of message replay — to go back in time, look at the messages, and do streaming analysis. So I'll let Kuba talk about the AWS service for this use case. — Here we can see a Kinesis data stream, and at first glance it looks almost exactly like an SQS FIFO queue, but we'll see the differences as we go through how you send to and receive from a Kinesis data stream. Let me take a step back and say that for data streaming, a different vocabulary is typically used: when you talk about sending things into a data stream, you typically talk about putting records into the data stream, and you talk about reading the stream back. What we see here is two shards, where each shard has provisioned throughput — you can expect a certain number of megabytes per second and records per second from a shard — and you can create as many shards as you need when you work with a Kinesis data stream. But shards need to be pre-created; they are not created for you dynamically, and you basically have to say: I want ten shards, twelve shards. So let's see how this works, again going through the cases of sending and consuming. The first producer wants to send record A, and in order for the Kinesis data stream to select which shard should be used, you tag your record with something called a partition key, which is very similar to the message group identifier in FIFO — it's just a string. The partition key is not something you need to pre-create; it's just a string you tag your record with. So the producer calls PutRecord, and now Kinesis looks at what was supplied and needs to decide which of these shards this new record should be appended to. The way it's done is through hashing: imagine a hashing algorithm which gives you a number as output. Each shard owns a subset of those hash keys, so the first shard has the
first half of the hash space, and the second shard has the second half. What Kinesis does is calculate the hash key for your particular partition key — so P2 maps to this value, which clearly means this record should be appended to the first shard. Again, when it's appended, it's stored in multiple copies, durably, across availability zones; when you see an OK, the record is not going to get lost, and at that point we can return OK back to the producer. Like in previous examples, let's see what happens when producer B sends a record into the Kinesis data stream. We go through the same steps and calculate the hash value; this time the partition key maps to a different value, because it's a different partition key, and the record ends up going to the second shard — again stored durably. And like in previous examples, what happens if there was a networking problem between the producer and the Kinesis data stream when the call was made? The second producer will retry the call, and in the case of Kinesis data streams, the result of the hashing function is going to be the same value. So yes, there will be a duplicate, but it's always going to go to the same shard as the previous copy. When you start thinking about consuming data from a stream, it's easier to reason about how you can detect duplicates when reading the data back, because a duplicate always lands in the same shard. I mentioned how you can have as many shards as you want, and each shard has limited throughput. What happens when you need more and you already have data in your Kinesis data stream? What you can do is called a shard split: you're basically selecting one of the shards and making two out of it, dividing the existing hash space for that shard into two shards, each with a subset of the keys. That's how you can scale the throughput of a Kinesis data stream, again almost infinitely, by adding as many shards as you would like.
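The hash-based routing just described is concrete enough to sketch: Kinesis takes an MD5 hash of the partition key, treats it as a 128-bit number, and appends the record to the shard whose hash-key range contains it. Here's a simplified model of that mapping for equal-width shard ranges (real shards can have arbitrary ranges after splits):

```python
import hashlib

def shard_for(partition_key, num_shards=2):
    """Map a partition key to a shard index by slicing the 128-bit MD5
    hash space into equal ranges, one per shard."""
    space = 2 ** 128
    width = space // num_shards
    h = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    return min(h // width, num_shards - 1)  # clamp for uneven division

# A retried send of the same partition key hashes to the same value,
# which is why a duplicate always lands next to the original record.
assert shard_for("customer-42") == shard_for("customer-42")
```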
Resharding doesn't influence existing data: the moment you do the resharding, new shards appear and new data gets appended to the new shards. And there is a new, easy-to-work-with API now, where you can just say: for this particular Kinesis data stream, I want that many shards — you don't have to pick specifically which shard to split and how. So let's talk about consuming from a Kinesis data stream. The key difference here is — remember how in SQS the only thing you needed to do was call ReceiveMessage and provide the queue URL, and the rest was on SQS — that with Kinesis data streams, the responsibility of selecting which shard you're reading from, and which record you're reading, is on the consumers. This is your code that needs to do this. In order for the consumer to start reading, it first needs to query Kinesis for the available shards, pick a shard, and start reading from it using an iterator — something that resembles what you see here. You immediately see another key difference: you get consumer affinity. All the sequential entries in a shard end up always being read by the consumer that is reading from that shard, so it's easy to perform analysis of consecutive entries. You can also see that when you consume from a Kinesis data stream, you're not deleting anything, which means you can have multiple different applications consuming from the same stream, in a fan-out pattern, doing all sorts of different types of analysis on the data. But again, as the number of shards grows and you perform these shard splits, the complexity of the code on the consumer increases, because it's on the client-side code to keep track of who's reading from which shard, how far along it is, and to checkpoint the progress of reading somewhere. Because of that, instead of using the Kinesis APIs directly for consuming, I would recommend everyone to just use an existing client-side library called the KCL, the Kinesis Client Library.
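The raw shard-reading pattern the speaker describes (and that the KCL hides from you) looks roughly like this. `get_shard_iterator` and `get_records` are the real boto3 Kinesis calls; the looping function, stream name, and handler are illustrative, and `kinesis` is assumed to be a boto3 Kinesis client or anything with the same two methods:

```python
def read_shard(kinesis, stream, shard_id, handle, batches=3):
    """Page through one shard: get an iterator, then follow the chain of
    NextShardIterator values. Nothing is deleted; you just advance."""
    it = kinesis.get_shard_iterator(
        StreamName=stream, ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",   # start from the oldest record
    )["ShardIterator"]
    seen = 0
    for _ in range(batches):
        resp = kinesis.get_records(ShardIterator=it, Limit=100)
        for rec in resp["Records"]:
            handle(rec["Data"])             # records arrive in shard order
            seen += 1
        it = resp["NextShardIterator"]
    return seen
```

Multiply this by shard discovery, shard splits, consumer election, and checkpointing, and you can see why the recommendation is to let the KCL do it.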
It does all the complex management of shards — electing who's reading from which shard, tracking the progress of your reads — and you only focus on the actual code that processes the records from the stream. OK, let's see how this behaves in the demo. In the middle we see a Kinesis data stream, and it has two shards. We start the first consumer, and — because we can have multiple consumers reading from the same set of data — we identify consumers: this is application one that will now be consuming. As the consumers start up, through the Kinesis Client Library they elect who's going to own which shard for consumption, so each has exclusive ownership of a particular shard in the stream, and we can see how both consumers end up owning a unique shard from the data stream. Let's start the first producer, and we can see how partition key A gets mapped through the hashing function to shard 0, and it's only the first consumer that is seeing these records. Starting the second producer: B hashes to shard 1, so the second consumer sees these entries. What that also means is that, because we have two shards, if we start a third consumer he's not going to do anything — both shards already have exclusive owners. If we start a third producer — again, we have two shards, so the third consumer still does nothing — the records labeled C get appended to shard 0, and because we process the shard in sequence, we will only start seeing records labeled C when we reach them, around entry 150. So let's stop the third consumer, who's not doing any valuable work, and start two consumers identified as a separate application — this is application 2. They also elect — oh, and now we see it, right, we started seeing records labeled C in the first consumer — application 2 elected who owns which shard and consumes the same stream from the beginning, all the records since the beginning. So we can see how it's doing a fan-out: one application does one thing on the stream, the other one does something
else, and each processes things in sequence. But in order to scale consumers, you have to pre-shard things: if data is already in the stream and the stream was started with five shards, that data can only be consumed by five consumers — there's no way to speed it up at that point; it's already too late, the data is assigned to shards already. So let's hear from Sid how the customer used Kinesis for their analysis. — All right, so let's plug Kinesis streams into the architecture. All the Okta events were spread across multiple shards in a Kinesis stream. For the batch analysis they were actually using a different flavor of Kinesis called Kinesis Data Firehose, which lets you push data in near real time to a storage service like S3. Once data made it to S3, it was loaded into a Redshift cluster to do advanced SQL analysis. For the alerts, they were triggering a Lambda function when the data was written into S3, and this Lambda function was checking for specific attributes in the message in S3 and sending notifications to end users if it was identified as an alert. Now let's talk about a final use case. Edmunds is a shopping website making car buying and selling extremely simple. They have a huge number of used and new car listings by different car dealers and OEM franchisees, and when these entities make any changes or add new listings, that has to be updated on the Edmunds website — so they built a platform to make this happen. Let's look at their architecture. These vendors were writing all their changes to source systems, and they wanted to decouple their source systems from their target systems using a messaging platform; these target systems were the back end of their website. Now let's look at the requirements. One: this entire pipeline, this entire platform, has to be event-driven — that helps them save costs and adds efficiency. Then the regular requirements for a messaging platform: durable, scalable, highly available. But in addition, they wanted
So multiple consumers should receive the same message from the messaging platform. In addition they also wanted message filtering, so that based on some message attributes they can say "only send this message to two systems" or "send this message to all five systems"; they needed that flexibility. And all of this has to be done with minimal latency. So I'll let Kuba talk to us about the messaging platform.

Thanks. What's key here is that in their architecture they wanted to send something once and deliver it to multiple destinations. This is typically called a pub/sub model, where you publish something and you have multiple subscribers, and the way you achieve that with AWS native messaging services is through SNS topics. So here's how we're going to think about an SNS topic. Notice that there are no consumers on this image, and it's going to be clear in a second why. This image represents a single SNS topic, and what you can do with the topic is publish things to it and configure subscriptions that say where the messages need to be delivered from the topic.

So what kind of destinations can you configure? One type of destination is an Amazon SQS queue; today SNS topics support SQS standard queues, they don't support FIFO queues yet, so you use this type of destination for integrating systems together. The next one is Lambda: you can easily hook up a Lambda function as a destination of your SNS topic, which basically means that when you publish something, it's going to invoke a Lambda function for you. The third one is an HTTP endpoint; this is how you can achieve a push-based model for messaging. You implement the service, you expose an API, and you can configure SNS to invoke your API when a delivery needs to be made. And what's really interesting here is that you can control the rate of retries, basically because SNS is very scalable, to prevent an outage of your system if there's a huge spike of invocations going through the HTTP delivery channel.
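That retry and rate control for HTTP/S endpoints is configured through a delivery policy on the subscription. A sketch of what such a policy can look like; the numeric values are illustrative, not recommendations:

```json
{
  "healthyRetryPolicy": {
    "minDelayTarget": 20,
    "maxDelayTarget": 600,
    "numRetries": 50,
    "backoffFunction": "exponential"
  },
  "throttlePolicy": {
    "maxReceivesPerSecond": 10
  }
}
```

The throttle section caps how fast SNS pushes to the endpoint, which is the mechanism the speaker alludes to for protecting your web server from invocation spikes.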
We also have different types of destinations, typically geared more towards end users than system-to-system integration: mobile application push notifications, SMS text messages, and email as a delivery target. What's new in Amazon SNS topics is that you can also configure filters on each destination. It's a simple function that can look at the attributes of your message and decide whether the message is allowed to be delivered to the destination or not.

So let's see how it works. When you publish things to an SNS topic, the producer calls publish, sending payload A, and immediately the message is stored durably in SNS across multiple Availability Zones, and immediately the producer is acknowledged that we got the message, before we even attempt to start deliveries to all destinations. What this means is you will see the same low latency of publish invocations whether you have one destination or a million destinations on your SNS topic; the latency you observe as a publisher is going to be the same. What happens next, behind the scenes: you already saw one OK, so from your point of view the message is published, but now internally inside SNS we perform the fan-out. For each subscribed destination we will end up sending a copy; in this example we see how the second filtering function actually prevented the message from going out to its destination. You can think about this stage as multiple internal queues inside SNS, which you don't even see, that keep track of each individual destination for you. So the fan-out happened, and now we need to deliver five copies of this message to those different destinations, and behind the scenes SNS attempts the delivery. What happens if one of those channels fails for whatever reason? Let's say it was an HTTP endpoint and your web server was not running. What's going to happen?
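A minimal sketch of that filtering step, assuming the simplest form of an SNS filter policy: exact-value matching on string attributes. Real filter policies also support prefix matching, numeric ranges, and anything-but operators, which are omitted here.

```python
def matches_filter(policy: dict, attributes: dict) -> bool:
    """Return True if every attribute named in the policy is present
    on the message with one of the allowed values."""
    return all(attributes.get(name) in allowed for name, allowed in policy.items())

# Hypothetical policy on one subscription: only deliver listing events
# intended for these two target systems.
policy = {"target_system": ["inventory", "pricing"]}

print(matches_filter(policy, {"target_system": "pricing"}))  # allowed through
print(matches_filter(policy, {"target_system": "reviews"}))  # filtered out
```

The useful property is that the publisher stays unchanged: it publishes once with attributes, and each subscription's policy decides independently whether its copy goes out.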
We will remember that the deliveries that succeeded are already good, but we still keep track of the one that needs to be delivered to the endpoint that failed, and we will keep retrying. How many times depends on the destination. For all intents and purposes, when you subscribe an SQS queue or a Lambda function to an SNS topic, you can think of those retries as happening forever; it's going to get delivered. For your own HTTP endpoints, you're in control of how many times the retry needs to be made, what the backoff and retry timing is, and basically at what rate the retries happen and for how long.

So when the second producer sends a message to the topic, again SNS sees the send attempt, stores it durably across multiple AZs, and confirms to the producer immediately that we got it, and the same situation happens again with the fan-out. In this particular case the first filtering function didn't allow the message through, the second allowed it through, and again a behind-the-scenes delivery attempt is made, and now one of these destinations actually has two messages to go through. So what's key here is that it's very elastic, similar to SQS standard queues: it doesn't matter what your send rate is, it can be 10 TPS or 10,000 sends per second, the SNS topic will handle it. And the key here is you will always observe the same low publish latency, because you're not waiting for the actual deliveries to happen. So it's a very nice integration platform where you just call publish once and all the different destinations configured on the topic get a copy of the message delivered. So let's hear from Sid how it solved the customer's problem.

So Edmunds decided to use SNS with a fan-out to multiple SQS queues for the architecture. Now we still have a server which is going to read from the SQS queues and then write to the target systems, and we don't like servers, do we? So let's bring in our serverless champion, AWS Lambda. Now SQS has native integration with AWS Lambda.
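When SNS fans out to an SQS queue like this, each queue receives the payload wrapped in an SNS notification envelope roughly like the following; this is abridged, and all values (topic name, attribute names, payload fields) are placeholders:

```json
{
  "Type": "Notification",
  "MessageId": "00000000-0000-0000-0000-000000000000",
  "TopicArn": "arn:aws:sns:us-east-1:123456789012:listing-updates",
  "Message": "{\"listingId\": \"abc-123\", \"change\": \"price\"}",
  "Timestamp": "2018-11-29T00:00:00.000Z",
  "MessageAttributes": {
    "target_system": {"Type": "String", "Value": "pricing"}
  }
}
```

The original published payload sits in the `Message` field as a string, so queue consumers unwrap the envelope before processing.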
What that means is that the AWS Lambda service does all the underlying heavy lifting: it polls the SQS queue and invokes a Lambda function when it sees a message in the queue. Additionally it also takes care of scaling, so if the number of messages in the SQS queue increases, it will automatically increase the number of simultaneous Lambda invocations, up to a thousand simultaneous invocations. So now for this architecture we can have different Lambda functions get triggered for different SQS queues, and these Lambda functions write data to different target systems. So let's plug this into our architecture. Perfect. Now one requirement still remains: this entire architecture has to be event-driven. For that, what Edmunds did was launch a Lambda function every time something was written to the source systems. This was a synchronous invocation which basically went and wrote data to an Amazon S3 bucket, and the S3 bucket was configured in such a way that a put event notification would automatically send a message to SNS. So we have that native integration between Amazon S3 and SNS.

So let's recap. We saw Amazon SQS standard queues, where the focus was simplicity: the producer writes a message to the queue, the receiver reads it, processes it, and then deletes it from the queue, and the queue by itself is literally infinitely scalable; one message or a million messages, the queue adapts to the workload. But there are two things to consider: one, messages can be out of order, and two, you can have duplicate messages. Now if your application cannot adapt to these characteristics, you can use SQS FIFO queues, which guarantee in-order delivery of messages as well as single delivery, so no duplicates, and you also have the flexibility of spreading your messages across multiple message groups; within a given message group you are guaranteed in-order processing of messages. We also have Kinesis, which is for streaming analysis of data, so you can go back in time and look at a bunch of messages together.
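One of those queue-draining Lambda functions from the Edmunds architecture might look like the sketch below, assuming the documented SQS event shape (a `Records` list where each record carries a `body` string) and that the body holds an SNS notification envelope whose `Message` field is the actual payload. The `listingId` field and the target-system write are hypothetical stand-ins:

```python
import json

def handler(event, context):
    """Sketch of an SQS-triggered Lambda: unwrap the SNS envelope in each
    record's body and hand the inner payload to this queue's target system."""
    processed = []
    for record in event["Records"]:
        envelope = json.loads(record["body"])
        payload = json.loads(envelope["Message"])
        # A real function would write `payload` to its target system here.
        processed.append(payload["listingId"])  # hypothetical payload field
    return processed

# Hand-built sample event in the shape Lambda receives from SQS.
sample_event = {"Records": [{"body": json.dumps(
    {"Type": "Notification", "Message": json.dumps({"listingId": "abc-123"})})}]}
print(handler(sample_event, None))  # ['abc-123']
```

Because the event source mapping does the polling and scaling, this function only contains the per-message business logic; nothing in it manages queue consumption.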
You have the shard iterator, through which you can control how far back in time you go: you can look from the latest records or go from the first record, as we saw in the demo. With Kinesis, the messages don't get deleted once they're processed, so they are there for the entire retention time. Additionally, in Kinesis you can have multiple consumers read the same message, as well as consumer affinity: you can have shards, and you can have one consumer read all the messages of a shard, hence giving you that consumer affinity. Finally, we saw Amazon SNS, which is a fan-out kind of architecture where you have one message which can be distributed and delivered to multiple endpoints, and we also have filtering built in, so you don't have to send the same message to everyone; you can filter and put some kind of logic in when sending these messages. Thank you, and we'll take any questions you have. [Applause]
Info
Channel: Amazon Web Services
Views: 17,033
Rating: 4.9606557 out of 5
Keywords: re:Invent 2018, Amazon, AWS re:Invent, Application Integration, API305, Amazon Kinesis
Id: 4-JmX6MIDDI
Length: 52min 41sec (3161 seconds)
Published: Thu Nov 29 2018