Data Design and Modeling for Microservices

Captions
Guten Morgen — my name is Darin Briskman — and that's about the limit of my German, so I will be doing the rest of this in English. Thank you for joining us on our database track. I have the easiest job at Amazon Web Services, because I am the chief evangelist for databases, analytics, and machine learning, which means my job is to make you excited about data. This is not hard, because if you're here, you're already excited about data, right?

Every so often today I'm going to ask you a yes-or-no question: for yes, do this; for no, do this. So let's try that: is everybody here excited about data? OK, about three-quarters of you did this and about a quarter of you did this. I'm not sure what that means — I live in Oregon, and in Oregon it means you're using certain herbs that I don't think are legal in Germany for personal use. Not my problem.

Before we go into the formal presentation, I want to talk a little bit about what we do in databases and analytics at AWS, because we divide it up into a couple of large categories. One category is relational databases — we'll talk more about each of these later — and the whole idea there is to have managed services. We've discovered that managed services are the key to doing a lot of interesting things, like being able to operate at large scale and being able to do microservices. Now, if you're old like I am, you remember that not that long ago, maybe ten years ago, all the database questions were settled: everything was relational, and the only question was whether you should use Oracle or DB2 or Microsoft. But we're now in a world where we're seeing a whole revolution of non-relational things coming up, so we'll talk a little later today about what we're doing with DynamoDB, ElastiCache, and Neptune, as well as other things going on in this non-relational revolution. Then of course there's the world of analytics — the many things we can do to understand data and see what's happening with it. The challenging thing is to keep all of these different databases and analytics working together so that we have some way to get answers.

The other thing to be aware of is that today we are being broadcast live on Twitch. Please feel free to ask questions, but normally I'd just ask you to shout them out; because we're being broadcast today, I need you to raise your hand so we can get a microphone to you. Here you're looking at a picture of me talking to you while I'm talking to you on Twitch — it's a little bit behind, isn't it? Could we mute the Twitch broadcast? I'll just kill it. There we go. This also means that anything you'd like to hear again, from this or any of the other tracks, you can listen to at your leisure on the Twitch broadcast. And remember: if you do ask a question, it will be on the internet forever, so make sure it's a good question.

So we're going to start today by talking about microservices. One of the things I find amusing is that I gave nice short titles to our German team and they made all the titles longer, so this one went from "Microservices" to a slightly longer title about data design and modeling for microservices. We're going to talk a bit about microservices and how that fits in.
So I'm going to ask you just a few questions. How many people here consider yourselves developers? Good — this is Dev Day, so that's why you should be here. Some of you did not raise your hand: does anyone here think of yourself mostly as a data professional — a database administrator, a data engineer, a data scientist, something like that? OK, nobody? Sort of? A little bit. Anybody here think of yourself as an architect? A few of those. I don't really know what "architect" means. I spent 14 years of my career at IBM, and I was involved in a two-year-long project there to define "architect," and the only answer we came up with was: we think you're smart and we don't want you to manage anybody. I hope that fits the architects here. My own definition of an architect is someone who can answer any question with "it depends."

So what does it depend on? Well, it partly depends on how you're going to put your services together, and microservices are one of the big buzzwords — the idea has been around for ten years or so. It comes from a very simple need, the need to work at internet scale. What drove microservices was that we were moving from a world where an enterprise application meant a big SAP installation with a few hundred users to an internet application with millions of users, and microservices are the only way that has yet been shown to build large applications at that scale, as well as to do things quicker and more effectively, developing new features that meet the needs of your organization. My last hand-raising question for now: how many people here are currently working on some sort of microservices project? Good — then you're in the right session. For the rest of you, try to stay awake.

At Amazon we started in the traditional way. In 1995, when the first amazon.com was built, it was a monolithic application built on Oracle. It took about four months to realize we couldn't afford to use Oracle the way we hoped to grow, so we went to the 1.0 version of a brand-new database called MySQL — MySQL had come out not long before — and we built the business on MySQL plus memcached. It went really well: Amazon went public in 1997, it was the first company to sell a million dollars online, and not long after it was the first company to sell a billion dollars online — an American billion, which is still a lot of money. But very quickly, in the early 2000s, it became clear that we could not keep up with what we needed on a monolithic application, and we started splitting it into small, independent services. Today, if you go to amazon.com, you're looking at roughly 210 to 230 microservices coming into play.

I asked the architects over at Amazon: could you give me a map of what all of our microservices look like? They said, "It depends." I told you they're architects. I said, "Well, what does it depend on?" They said, "It depends on whether you care if anybody can read it." I said, "Just give me the map anyway," and they said, "OK, but nobody will be able to make any sense of it" — and they were right. However, everyone who registered will get links to all of the slides and all of the tracks next week, and if you zoom in on this map you can see that each of those little blocks is indeed a well-defined microservice — at least the name of the microservice — and the lines show how they connect to each other.
When we first started this, it went down the path of service-oriented architecture, with each service being single-purpose. I'm going to share with you some of the things we've learned about making it work. One critical one: services connect only through APIs — no back doors. The only way to talk to a microservice is through its API, and the only way for a microservice to connect to anything else is through an API. Initially we did it all over HTTP and HTTPS; sometimes today we use MQTT or other protocols, but it's all standard network connections.

And this is not easy. Distributed computing is hard; this is more difficult to write than traditional computing — it's just more work. You have multiple databases across multiple services. You have to find ways to live with eventual consistency, which is very disturbing to a lot of developers who've never dealt with it before — the idea that I can ask the same question twice in a row and get different answers, because I'm not always going to have all of my data consistent. Consistency is very expensive in a microservice environment, so you only use it where it's really necessary. There are lots of moving parts: you have to discover services, coordinate them, and route between them. Ultimately there are a few things you need. Every microservice has to have some sort of registration — there has to be a repository that explains what the services are. There has to be a method of discovery — services need a way to find what other services are available and where they are; you can't rely on hard-coded IP addresses, because that will fail. That's part of the wiring, the way these services connect, and you have to think about how you administer them.

You also have to deal with state. A microservice must be stateless — otherwise it's not a microservice — but you're still going to be storing some state, so that state has to live in some sort of data engine. That could be a flat file, a high-speed database, or an in-memory structure, but I have to have some sort of state management, I need metadata, and I've got to have versioning. I think the number one mistake I see when people first start using microservices is that they fail to think about versioning. You build your first microservice, then your second, then your tenth, and by the time you've got ten microservices you've probably got a matrix of about fifty services and versions going on — so how do you deal with that effectively? Then you need to deal with caching. It's almost always necessary for at least some of your microservices to have a caching layer in order to accelerate their performance, because each microservice is fast, and each microservice can talk to other microservices in only a few milliseconds — but if I've got 200 microservices talking to each other, and each hop takes 50 milliseconds, that adds up to a long time. So I'm going to need caching to get to sub-millisecond times. And then we have to find ways to make deployment low-friction and automate the management and monitoring. It's all a lot of work.
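To make that caching point concrete, here is a minimal sketch of a read-through cache in front of one service's own data store — not how Amazon necessarily builds it, just one common way to get sub-millisecond reads. The Redis endpoint and table name are hypothetical.

    import json
    import boto3
    import redis

    # Hypothetical endpoints/names: a cache in front of the service's own DynamoDB table
    cache = redis.Redis(host="my-cache.abc123.euc1.cache.amazonaws.com", port=6379)
    table = boto3.resource("dynamodb").Table("product-catalog")

    def get_product(product_id):
        """Serve from Redis in sub-millisecond time when possible; fall back to DynamoDB."""
        cached = cache.get(f"product:{product_id}")
        if cached:
            return json.loads(cached)
        item = table.get_item(Key={"id": product_id}).get("Item")
        if item:
            # populate the cache with a short TTL so stale entries age out on their own
            cache.setex(f"product:{product_id}", 300, json.dumps(item, default=str))
        return item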
So what exactly do we mean when we do this? Here are some of the other things we found. One: make sure you are not committed to any technology stack. When you write a microservice, when you put one together, it's got to be flexible. Today at Amazon, when we write things, our usual database choice is DynamoDB. Is that what we're going to want in five or ten years? Ten years ago our choice was MySQL. What will our choice be ten years from now? I don't know — "it depends," thank you, that's the correct answer — it depends on what happens in technology. But we need to make sure we're not stuck with a stack, not trapped with any language, not trapped with any technology, because these need to be able to change over time. You end up with a polyglot ecosystem. It used to be common belief in IT that the most important thing you could do was centralize and standardize and get everybody using the same tools. We're now a hundred and eighty degrees away from that — probably more accurately 540 degrees away, because as an industry we kind of all turned around in a circle and ended up pointing the other way. We have to live with the fact that we're going to have different tools, different databases, and different languages all living and working together.

And polyglot persistence: not one big database, but lots and lots of little databases — generally one database per microservice pattern. Now, that might not mean a physical database instance; it could be a DynamoDB table per pattern, for example, but I'm still going to have many different databases, with data stored in many places. This is necessary for scalability. It also leads to a problem, because whenever you have the same data in two different places, over time they're going to suffer entropy — they're going to go out of sync. So part of what we have to do is write some extra microservices that, every so often, go make sure those databases get resynchronized. Again, a lot of work, but it's necessary.

Now we can start talking about what we get from doing all of this work, and there are some big benefits. One of them is being able to do low-friction deployment, and the two key methods for this are canary and blue/green. If you've never heard of these: "canary" comes from coal miners, who would take a canary down into the mine. There's a good reason for this — canaries are very susceptible to poison gas, and also there's no music in a coal mine, they didn't have iPods back then — so if the canary stops singing, run. In microservices, a canary deployment means you have most of the users on the old version and a few users on the new version, and you see how that works; if it works well, you put a few more on the new version, and if that works well, you roll everyone onto the new version. For example, within the amazon.com world we recently rolled out a new search engine. Most people wouldn't notice — you just see the search bar — but initially we set it up so about 5% of users would get the new engine and 95% would get the old one. It worked pretty well, so the next day we changed it to 10%. We found a few bugs, rolled it back to 5%; it worked, we went back to 10%; it worked, we went up to 25%; it worked, and then everybody had it. That gives you a low-friction, non-disruptive deployment.
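One way (among several) to implement that kind of percentage split is weighted DNS. A minimal sketch with Route 53, assuming a hosted zone and two target endpoints — all names and IDs below are hypothetical, not the actual amazon.com setup:

    import boto3

    route53 = boto3.client("route53")

    def set_canary_weights(old_weight, new_weight):
        """Shift traffic between the old and new engines by adjusting weighted DNS records."""
        def record(identifier, target, weight):
            return {"Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "search.example.com",        # hypothetical record name
                        "Type": "CNAME",
                        "SetIdentifier": identifier,
                        "Weight": weight,
                        "TTL": 60,
                        "ResourceRecords": [{"Value": target}]}}
        route53.change_resource_record_sets(
            HostedZoneId="ZEXAMPLE12345",                    # hypothetical hosted zone
            ChangeBatch={"Changes": [
                record("old-engine", "search-old.example.com", old_weight),
                record("new-engine", "search-new.example.com", new_weight),
            ]},
        )

    set_canary_weights(95, 5)   # start the canary at 5%
    # later: (90, 10), (75, 25), ... and finally (0, 100); a blue/green flip is the same call in one step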
Sometimes you can't do that — for example, when you're rolling out a new database or a new database version, you probably can't have some people on the old one and some on the new one. In that case you do blue/green: I have my blue version that's running, and I have my green version, which is the new one, and when I'm ready I just change the DNS entry and flip it over, and now everyone is on the new version. I'm probably going to be synchronizing data, and once I flip over, I synchronize data back the other way so I can flip back if I have a problem. So I flip it over, run it, and if everything's great, great; if not, I flip back, and repeat as necessary. If you do this right, you can do it with an outage of less than a second, so it's not really disruptive to users. At Amazon, in most places, that's called blue/green; at Netflix they call it red/black. I was always curious about that, so I asked the head architect at Netflix why they call it red/black, and he said it's because the first architect at Netflix was color-blind. OK — that's a good reason. So these are low-friction deployments.

All right, if you're doing this, let me give you some of the lessons we had to learn the hard way over ten years of building this out. One: each service must be elastic. That means the service can grow or shrink independently of the other services, and this is absolutely critical to scalability. A good example: did anybody buy anything during Prime Day last year? A few nods. On a normal Tuesday in the summer, the DynamoDB sitting behind amazon.com does three to four billion queries a day; on Prime Day it did about 90 billion. On the other hand, the system that tracks registered users got a few new users, but it only went up a few percent over a regular day. So we want to make sure the things that get heavier use can scale up and down independently of the other pieces — that's critical to microservices. The second key piece is resilience: you have fault-isolation boundaries, and you design so that when a service fails — not if, when; all software will fail eventually — you limit the blast radius, you limit how many other things it affects, and you think about how to recover. Make it composable with APIs. The next two are really hard: make it both minimal and complete. Minimal means keep the service as simple as you can; complete means make sure the service does everything it needs to do. Those two are in tension with each other — they compete.

But the last one is perhaps the most important, and the biggest mistake I see made: loose coupling is critical. Tight coupling is when one service directly calls another service. When one service calls another service directly, they're not really two services anymore — you've just made them a single service, because if I change anything in one service I have to test across both, which means any change becomes a large regression, and I've really just built another monolithic application that I'm calling microservices. To make microservices work, they must be loosely coupled. Loosely coupled means services never talk directly to other services: services write data into queues, and other services pull data from the queues. Now, let me be clear — "queue" here does not necessarily mean IBM MQ. It could, or Amazon MQ, or RabbitMQ, or it could be a file that one service writes and another reads, or a database entry, or a Redis entry. It's some place where the data lives — the point is that the services are loosely coupled. Services write into queues, and then the target service — the B service in this case — reads the queue, or the queue itself pushes something out, or things happen on a schedule. There are many ways this can work, but the key point is loose coupling: every microservice reads and writes only through some sort of queue. This allows separation; this allows elasticity.
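A minimal sketch of that loose-coupling pattern, assuming SQS as the queue (the queue URL and handler are hypothetical). Service A only ever writes to the queue; service B only ever polls it:

    import json
    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.eu-central-1.amazonaws.com/123456789012/orders"   # hypothetical queue

    def service_a_publish(order):
        """Service A never calls service B directly; it just drops an event on the queue."""
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(order))

    def service_b_poll(handle_order):
        """Service B pulls from the queue at its own pace, on its own schedule."""
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20)
        for msg in resp.get("Messages", []):
            handle_order(json.loads(msg["Body"]))
            # delete only after successful processing, so a failure is simply retried later
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])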
It also means that the development team working on service A never really needs to talk to the team working on service B — they just need to know what the APIs are, and they can develop and work independently. And those are the reasons you do all of this: I get faster development, I get more rapid development, I get the ability to do things much sooner. At Amazon, before we started down the path of microservices, we had a quarterly release train — once every three months we could put new things onto the website. Today, across all of Amazon globally, we have one new release of some significant feature approximately every 1,500 milliseconds. Every one and a half seconds, someplace in some Amazon service, something new is being developed and deployed, and we're able to do that with low friction because it's all microservices.

It also lets you do parallel development. There's an old bad joke in computer science that nine women cannot make a baby in one month — but they can make a microservice in one month, because they can each be working on different parts in parallel; they don't have to do it together. Babies are monolithic — don't try to change that, please — but microservices are not, and they allow parallelism to actually work. You've all seen this: you're on a team, the team is behind schedule, and your manager says, "No problem, we'll put extra developers on the team." Does that make it faster? No — in the long term it might, but in the short term everyone who was developing is now helping the new people understand what they're doing. But if I can isolate things into separate microservices, then I can split those functions out and say: this team works on the display engine, this team works on the search engine, and we do those things in parallel. This is why it's worth doing all this hard work — I can do things fast and in parallel, and I can do true DevOps, or as we now call it, DevSecOps.

All right, another hand question: how many people here believe you work in a DevOps environment? OK — but is it true DevOps? Do you own it? Do you build it? Do you deploy it? Do you secure it? Do you own the roadmap? Do you control the tools? Because that's what real DevOps means: the team has complete authority and accountability for what happens. Doing all this work also gives me more scalability — it lets you build things like Scout24 and Amazon and Airbnb and Google and other really giant applications that have hundreds of millions or billions of concurrent users. I get higher availability, better fault tolerance, and — the official line — something "more closely aligned to the business domain," which is a business-speak way of saying that when the business people want something, they can actually get it in a few weeks, not in a few years the way it has traditionally worked.

Now, at Amazon we do this with something called the two-pizza team. Each microservice is owned by a small team — a team that, when they stay late, two pizzas is enough to feed everybody. When I started at Amazon, my wife heard about this and said, "Good news — you're a two-pizza team all by yourself." Well, that's more of a personal flaw than a design characteristic. I would also point out that in many of our two-pizza teams there are fierce arguments about what kind of pizzas to get. But the point is: how many people do two pizzas feed? Somewhere between six and ten, usually.
And that team has complete ownership. That means the team picks what to do. We call them service teams; they own the primitives. The service team is responsible for what the service does, for publishing the APIs, for determining the roadmap of the service, and for supporting the service — when calls come in, when there's a problem and it doesn't work, you own it, you run it, you build it. It's up to the service team to decide what tools they use. Our default choices at Amazon, for example, are Java and DynamoDB — or, if it's customer-facing, Node.js — but if your team says, "I'm doing something that needs a lot of geographic information, I'd rather use Postgres," you can use Postgres. It will be a little more work for you to support it, because you won't have all the supporting teams and tooling, but that's up to you; that's your choice. And if you want to go way out and say, "No, we have to write our own database for this, because our needs are so unusual" — OK, good luck with that, but if that's what you choose to do, that's what you choose to do. Each team has complete choice of what tools to use and how to deploy them. The only places where we force things are security, network connectivity, and other things that have to be universal.

Not all of the teams are writing services a customer would see; some are writing services for other service teams. We have thousands and thousands of service teams at Amazon, and organizationally they roll up to various units. The biggest two units at Amazon are Web Services and Commerce, and then there are many smaller units — a good example of a small unit is Twitch, the people who are broadcasting us right now, which is itself a collection of service teams. We have some very small services that are just one team; we have big ones, like Commerce or Web Services, that are thousands of teams. But this lets us build in a way that enables teams to do things quickly and effectively.

So, since this is the database track: now that we've talked about microservices, how does the data work underneath them? What's the data architecture? The traditional way to do this was a monolithic data store: put all the data in one place, all together, one big database serving all of your services. That was the best way to do things ten or fifteen years ago, for a very simple reason: database licenses are really expensive, and a lot of choices were made to minimize the license expense rather than to do what's best for the business or for innovation. We believe this is an anti-pattern: with one big database it's too hard to change, it's too hard to advance, and your costs are high. So what we end up with in the microservice world is polyglot persistence — decentralized databases. Break the big database into many smaller databases, some of which will be redundant with each other, and let each service choose its own data store technology. I might have one service that uses DynamoDB, another service that wants to run on MySQL — which in the Amazon world would be Relational Database Service, RDS — and yet another service that says, "I need high-speed caching but I also need persistence," so it chooses ElastiCache, which is Redis, plus RDS. Each team makes that decision. Data is gated through the service API, and each service's data store is independently scalable.
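In practice, "data gated through the service API" just means no other team reads this table directly. A minimal sketch of such a gate as a Lambda handler behind an API Gateway route — the table name and route are hypothetical:

    import json
    import boto3

    # This table belongs to the catalog service alone; no other service reads it directly.
    table = boto3.resource("dynamodb").Table("catalog-items")      # hypothetical table name

    def handler(event, context):
        """Lambda behind GET /items/{id} -- the only supported way to read this service's data."""
        item_id = event["pathParameters"]["id"]
        resp = table.get_item(Key={"id": item_id})
        if "Item" not in resp:
            return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
        return {"statusCode": 200, "body": json.dumps(resp["Item"], default=str)}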
That leaves some challenges at the data layer, like: how do I do transactional integrity? With polyglot persistence I'm always going to have eventual consistency. Generally I'm going to make asynchronous calls, which are non-blocking, but I have to handle them — so I'm going to do things like stage a commit and roll back on failure, and the way you do that is with what's called a correlation ID. As data passes from one service to another to another, you make sure there's some sort of ID that identifies the transaction. Using an e-commerce example again: I'm going to have a UI and a catalog and a checkout that goes to payment and shipping, and I'm going to make sure that for a given transaction there's always a common ID — in this case, ID 123 — because if I have a failure, I want to be able to roll back everything associated with ID 123. That failure could happen for two reasons: a business failure or a technical failure. The best practice is to have a rollback microservice — another microservice whose whole job is to roll things back when they don't work right.

So what do I mean by the two types of failure? More on this later, but a technical failure is when something doesn't work: a Lambda function failed, a database crashed — which doesn't happen much with managed services, but it can still happen. A business failure is when the technology works but there's something wrong with the data. In our e-commerce case, what happens if we get all the way to the end and the credit card is declined? Well, then I have to roll back the transaction and go back to the user and say, "Hmm, there's a problem with your credit card." We handle this by having functions in the microservices, where some of those functions are a rollback function and, optionally, a commit function, so that nothing is actually committed until the commit function is called. These are APIs you build into your microservice.

Now, one of the things that makes this easy is DynamoDB — and this is why we normally use DynamoDB at Amazon for running services — because whenever you change data in DynamoDB, the change is exposed as a stream, and the streams are ordered and persistent for 24 hours. The data stays around and is accessible, and the stream itself can drive Lambda functions or drive other things to happen. So you can attach yourself to the data change of interest and keep retrying until whatever you want to do actually happens. But when I get those errors, I need some way to do the rollback, so what we usually do is create a transaction manager microservice that can tell all the other microservices it's time to roll something back. Often DynamoDB will be the trigger that makes it run, keyed by the correlation ID: when we have a failure, we have an error table in DynamoDB, and into that error table we write the correlation ID. That write creates a stream record, and the stream kicks off the functions that say, "Go roll all of this back." Make sense? I know it sounds a little complicated, but it's easier than it sounds. So we end up with something like this: I'm rolling along and hit an error; the error is written to the error table (if you like, you could also put it into an error stream or a queue); that kicks off my transaction manager function, which then performs rollbacks against each of the places that had a problem.
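A minimal sketch of what such a transaction manager might look like as a Lambda triggered by the DynamoDB stream on the error table. The function names, table layout, and attribute names are hypothetical, not the exact scheme described in the talk:

    import json
    import boto3

    lambda_client = boto3.client("lambda")

    # Hypothetical rollback functions exposed by each participating service
    ROLLBACK_FUNCTIONS = ["payment-rollback", "checkout-rollback", "catalog-rollback"]

    def handler(event, context):
        """Transaction manager, triggered by the DynamoDB stream on the error table.
        Each new error record carries the correlation ID of the failed transaction."""
        for record in event["Records"]:
            if record["eventName"] != "INSERT":
                continue
            correlation_id = record["dynamodb"]["NewImage"]["correlation_id"]["S"]
            for fn in ROLLBACK_FUNCTIONS:
                # tell every participating service to undo its work for this transaction
                lambda_client.invoke(
                    FunctionName=fn,
                    InvocationType="Event",      # asynchronous: fire the rollback and move on
                    Payload=json.dumps({"correlation_id": correlation_id}).encode(),
                )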
Now, what if I have bad code or a physical failure? Let me put it this way: Lambdas are serverless, right? I'm assuming everyone here knows what a Lambda is — once again, three-quarters of you did this and one quarter did that. When I run a Lambda, it actually runs in a container. It's serverless in the sense that you don't have to care about the server, but it's not magic — it still runs on a server someplace, and containers fail sometimes. Right now the service level agreement for EC2 is 99.95%, and we're actually running at about 99.98%, but even at 99.98% that means roughly one instance in every few hundred will fail each year. So once in a while I have a failure. In AWS that will automatically get fixed and replaced, but it means something failed, and it might need to be rolled back and restarted. You find this in AWS in CloudWatch Logs: whenever there's any sort of physical failure or code failure, it shows up in CloudWatch. So I need to set up metric filters so that CloudWatch alarms can also kick off my transaction manager and roll things back.

A little warning here if you're using streams — another mistake I commonly see: make sure the Lambdas that pick things off the stream acknowledge that they've taken the record, because the stream keeps presenting it until it's picked up. I've seen code written that says, "If there's a problem, I'll do something," but it never acknowledges the record, so the head of the stream is still that same problem, and any future problems back up behind it. You've got to take it out of the stream when you're done, so the next problem can come through. Just a little piece of design to remember. In any case, we'll watch the CloudWatch metrics and — in this case — also look for a metric where the same correlation ID keeps coming up, which means something is not acknowledging the stream.

OK, last trick in here: keeping the data consistent. This is master data management. You'll probably need to write yet another microservice whose whole job is to look across your different data stores and make sure the data is consistent — make sure that the data coming in and the data coming out actually match, and that where the same data lives in different stores it's kept in consistent form. If you're doing equity trading, make sure that all of the systems trading equities refer to them the same way, so you don't have one system that says "Apple" and another that says "AAPL" — those things get regularized, and that's what master data management does. It's yet another microservice you write to watch the other services, and usually it's something you want to run on a schedule. The trick to running functions on a schedule is to set up a CloudWatch Events rule and have it kick off once an hour, once every ten minutes, however often you want to check.
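A minimal sketch of such a scheduled master-data-management checker, assuming two hypothetical DynamoDB tables that should hold the same reference data; the Lambda would be wired to a CloudWatch Events schedule such as rate(1 hour):

    import boto3

    dynamodb = boto3.resource("dynamodb")
    source = dynamodb.Table("trading-symbols")          # hypothetical source of truth
    mirror = dynamodb.Table("pricing-symbols-copy")     # hypothetical redundant copy

    def handler(event, context):
        """Runs on a schedule; resynchronizes copies that have drifted apart.
        Pagination is omitted for brevity -- this is a sketch, not production code."""
        truth = {i["symbol"]: i for i in source.scan()["Items"]}
        copy = {i["symbol"]: i for i in mirror.scan()["Items"]}
        drifted = 0
        for symbol, item in truth.items():
            if copy.get(symbol) != item:
                mirror.put_item(Item=item)    # entropy detected: restore from the source of truth
                drifted += 1
        return {"checked": len(truth), "resynchronized": drifted}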
All right, so what database do I use under all of this? The good news is that at AWS we have lots of managed databases; the bad news is that you have lots of managed databases, and the truth is that almost anything you want to do, you can probably do with any of them. I could use a relational database; I can use ElastiCache, which is Redis, if I want something really fast; I can use DynamoDB if I want something really scalable; I can use RDS if I want relational technology. Redshift we don't normally use for microservices except for analytics, because Redshift is a data warehouse — it reads very fast but writes a little slowly, so it's usually not on the right time scale for this. But I might want to store things in S3, or in Elasticsearch — some people actually use Elasticsearch not just as a search engine but as the queue to move data back and forth — or use Kinesis for streaming, or some combination of these. Notice that the slide doesn't say "microservice," it says "service," because a service is a whole bunch of microservices that does something. Again using the Amazon example, the Amazon homepage is a service made up of a few hundred microservices.

So how do I decide which data store to use? Usually we say that if there's a functional reason, that overrides everything else. One example of a functional reason is that I need to be under a millisecond — if anything has to be under a millisecond, I pretty much have to use Redis, the in-memory data store, which lets us do things in microseconds. Another functional example is the one I mentioned earlier, geographic information, because Postgres is the only database today — arguably also Oracle — that really has strong GIS (geographic information system) capabilities. So if, for example, I want the database layer to draw a bounding shape and determine whether a given latitude/longitude is inside it — and usually that shape is not a square, it's something like the borders of Bavaria, a very complex shape — Postgres does that really well.
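For a sense of what that geographic capability looks like in practice, here is a minimal sketch of a point-in-region check against a PostgreSQL instance with the PostGIS extension — the endpoint, schema, and table are hypothetical:

    import psycopg2

    # Hypothetical RDS/Aurora PostgreSQL endpoint with the PostGIS extension installed
    conn = psycopg2.connect(host="geo.cluster-xyz.eu-central-1.rds.amazonaws.com",
                            dbname="geo", user="app", password="change-me")

    def in_region(region_name, lat, lon):
        """Ask the database whether a point falls inside a stored (complex) boundary polygon."""
        with conn.cursor() as cur:
            cur.execute(
                "SELECT ST_Contains(boundary, ST_SetSRID(ST_MakePoint(%s, %s), 4326)) "
                "FROM regions WHERE name = %s",
                (lon, lat, region_name),          # note: PostGIS points are (longitude, latitude)
            )
            row = cur.fetchone()
            return bool(row and row[0])

    print(in_region("Bavaria", 48.137, 11.575))   # Munich should be inside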
But if you don't have that kind of very specific functional need, then we recommend you decide based on non-functional requirements. What you end up doing is creating a chart like this one, specific to your organization. You might make, as in this case, four different categories, and you determine for your organization what those categories mean. This organization has decided that "high latency" means things that can take more than a second, and their idea of "very low latency" is less than 20 milliseconds; for your organization, low latency might mean one millisecond — it depends on what you're doing. Similarly, you define levels of durability, levels of what you consider big or small for scaling, how available things need to be, how public the data is, and how quickly you need to be able to recover — the highest level here is data that must be recoverable within five minutes of an outage — and finally what skills you have in the organization. You use this to determine the right database when each service team is figuring out what to do.

The numbers on this chart are not service level agreements; they're rough numbers of what we see from the different services. For example, with latency, my relational databases will usually be under 10 milliseconds and always under 100; but using ElastiCache, or DAX with DynamoDB, I can get that down to about one millisecond. S3, on the other hand, has effectively infinite size — OK, not really infinite, but really, really big. One well-known customer, Snapchat, moved over four exabytes of data into S3 when they joined AWS, so we have really big sizes on S3. Then again, you usually don't need that much; if you need exabytes for one microservice, you might be keeping a little too much data in it. Other services have their own limits. Most of our services are highly available, some more so than others, with various recoverability options. If you're not familiar with "M-AZ," it stands for multiple availability zones — it's just asking whether you've checked the box to spread the service across availability zones for high availability. Some services, like DynamoDB, you can't turn that off; for other services you have a choice. And then you finalize the choices you've made: you look across these, and after mapping them together you end up with a short list, and then you decide which ones to use. That gives you a choice of data store for each project — and often you'll use several in one project.

Let me walk through this with a real example. I'm currently working with one of our customers that does settlement of bond trades — usually this is government-to-government trading, or government to large banks. The average transaction is 50 million dollars, and if a transaction gets delayed, the settlement company has to pay the interest for the delay time. You might say it's only a few hours, but when we're talking about hundreds of millions of dollars, that adds up very quickly. They settle a total of about three trillion dollars of bond sales on a typical working day — three tera-dollars, a lot of money. So what are their requirements? One requirement is that trades ingested from the banks must be captured absolutely reliably, within 30 seconds, every time. The data then passes through various services that process it, and at the end of the process it must be written into something persistent and long-lasting that can be accessed with SQL, because there are standard queries that people in the industry run. That gives us some definitions. For ingestion, the volume is very variable and it always has to complete within 30 seconds — what do you think is the right choice there? Say that louder — even louder, my ears are stuffed up. OK, this is a tough room. DynamoDB is what they ended up choosing. We could have done it with DynamoDB or with relational, but one of the features of DynamoDB — which we'll talk about later today — is that you get the same performance no matter how busy it is, and they wanted that certainty, because in bond trading most days have a certain amount of traffic, but some days something happens and you're much, much busier, and they always want to hit that 30-second mark. The other trick they use is writing into two different regions — two places in the world — just in case some disaster happens, so they don't lose any transactions. Through the rest of the process they have lots of little DynamoDB state stores just holding data, or sometimes Redis as queues to move data from one service to the next. And at the end, they need something highly reliable that can answer a query in SQL — DynamoDB doesn't support SQL — so what would you use there? Aurora, right. They looked at MySQL and Postgres and decided they liked Postgres better; that was a skills issue, because before they moved to the cloud they were mostly an Oracle shop, and Postgres is much closer to Oracle in syntax and in the way it operates. That's how you walk through these non-functional requirements to determine the data stores to use. And of course, with Aurora they're also writing into two regions, so there are two copies, and if there's some sort of disaster they don't lose the data.
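At the SQL end of that pipeline, the kind of standard industry query he mentions might look like this sketch run against the Aurora PostgreSQL endpoint — the schema, endpoint, and credentials here are all hypothetical:

    import os
    import psycopg2

    # Hypothetical Aurora PostgreSQL endpoint and schema for the settled-trade store
    conn = psycopg2.connect(host="settlement.cluster-xyz.eu-central-1.rds.amazonaws.com",
                            dbname="settlement", user="report",
                            password=os.environ["DB_PASSWORD"])

    def settled_total(trade_date):
        """A typical standing query: count and total value of trades settled on a given day."""
        with conn.cursor() as cur:
            cur.execute(
                "SELECT COUNT(*), COALESCE(SUM(amount), 0) "
                "FROM settlements WHERE settled_on = %s",
                (trade_date,),
            )
            return cur.fetchone()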
So you finalize that choice. In most organizations you don't force the service teams on what to use, but you might have a service team whose job is to help the other service teams by putting together packages that make deployment easy. Within Amazon we have teams that make it easy for people to use DynamoDB, teams that make it easy to use Aurora, teams that make it easy to use ElastiCache, pre-configured to our security standards and other requirements.

The last piece is: how do I report on how things are working? You have to do some sort of consolidation and aggregation. There are four patterns for this, and three of them are good. The pattern you should never use is the one in the lower right, the composite, where one data service tries to do live reads from the different services. It doesn't scale, and again it's tightly coupled — you're reading directly — so don't do that one. Each of the other three works. The most common one I see is the one on the top left, the pull model, where the data gets copied, or pulled, into the aggregate reporting service. The one I usually prefer is the one on the upper right, the push model, which is a fancy way of saying: have every service emit a log. Every time something important happens in a service, have it write a log entry, and then the reporting layer is just log analytics — you can use Elasticsearch, or an ELK stack, or Splunk, or any of a whole bunch of tools that make it easy to analyze logs. The other one I've seen people use, especially when things are widely distributed across regions, is publish-and-subscribe, where each service is a publisher and the data aggregation service is the subscriber. All of these work for getting the data in. Back to my description of the bond settlement system: whenever there's an ingest, it emits a log entry, and when the final version is written, that emits a log entry, and every so often we compare the logs — because if something went in and didn't reach the other end within a minute or two, something funny is going on and we need to find out what happened to that transaction. This is, again, one of our ways to deal with entropy: there will always be entropy, don't pretend it doesn't exist, find ways to deal with it.
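A minimal sketch of that push model — every service emits a structured log line, which the log-analytics layer then aggregates. The service names, fields, and correlation ID below are hypothetical:

    import json
    import time

    def emit_event(service, event_type, **fields):
        """Push model: every service emits a structured log line for anything important.
        On Lambda or in a container, stdout lands in CloudWatch Logs, where the reporting
        layer (Elasticsearch, an ELK stack, Splunk, ...) aggregates it."""
        print(json.dumps({"service": service, "event": event_type, "ts": time.time(), **fields}))

    # e.g. the bond-settlement services log ingest and final write with the same correlation ID,
    # so a reconciliation job can flag anything that went in but never came out
    emit_event("ingest", "trade_received", correlation_id="abc-123", amount=50_000_000)
    emit_event("persistence", "trade_settled", correlation_id="abc-123")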
So, final thoughts. Use non-functional requirements to identify the data stores for each service. Use polyglot persistence, so you don't have bottlenecks and each piece can scale and move independently. Embrace eventual consistency — learn to love it. Think about which things you're doing that don't need to be perfectly consistent: when you ask "how much inventory do I have?", does that need to be an exact number? Usually not. When you ask "what's the temperature outside?", does it need to be accurate to the millisecond? If it's ten minutes out of date, it's probably close enough. Other things do need to be exactly accurate — understand which ones, and only pay for that work where it's needed. And think about your analytics requirements from the beginning. If you put all of these tools and techniques together, you too can develop microservices that let you create applications that are flexible, scalable, and changeable, and that let you deal with the new world.

So, does anybody have any questions? Oh my — am I that boring that nobody can come up with even one question? Are you so frightened by the warning up there? There we go. All right, Grant, you get to run over with the microphone — and don't let the fact that you're live on the internet scare you in any way.

Q: This is basically about the blue/green deployment — how is it done? Do you really duplicate the whole infrastructure, or only at the service level — you duplicate the services and the underlying databases stay the same? Because it comes down to cost in the end.

A: Normally when I want to do blue/green, I only want to do one service at a time. I don't want to duplicate the entire environment unless I actually have a change that affects the entire environment, so usually my blue/green switch is swapping out one specific service. On the other hand, if I really do need to do multiple services at once, this is what's great about the cloud: if I need extra capacity, I use that extra capacity, and when I don't need it anymore I turn it off — I can do those things temporarily. The other thing I like about the cloud is the ability to do full performance testing, which I always recommend: have test copies of these microservices and test them against the full load you think you'll need, because that's the only way to discover whether a high workload is going to break your service. I'm sorry — there is one other way to discover it, which is to not test and find out in production. I don't recommend that method. All right, any other questions? Yes, sir — let's get you the microphone. Come on, Grant, step it up.

Q: I have a question regarding loosely and tightly coupled services. You mentioned service discovery. If I have a microservice and I'm using some kind of service discovery — Consul, or Eureka from Netflix — is that tight or loose coupling? I'm not using any IP address, I just know the name of the service, but I'm still using the REST interface of the service I need the data from.

A: In that case, if you look at how Eureka works, what's actually happening is that the service information moves through an API call into the service registry. That is loose coupling, because the registry isn't really calling the service — it's going through the API into a data store. So if you think about the way Eureka is defined, it really is a loosely coupled environment. Sometimes the discovery service will directly call a service in order to fill its data store, but even then it can be eventually consistent, so it's not tight coupling. And I will say one thing: don't let the computer-science definitions become the enemy of making working services. Everything I just said about "no tight coupling" — well, I've been working in some environments where people say, "I've got to get all of this done in 50 milliseconds." If you have a 50-millisecond budget, you can't do a lot of hops, so in that world sometimes you have to break the rules and actually do a tight coupling. Just understand what you're getting and what your costs are when you break the rules. OK, someone else had a question back here.

Q: It's actually a similar question — I'm just trying to grasp the concept. You said that you should loosely couple the microservices, but when you're calling another service through its API, are you actually then pushing the response to queues? Is the idea that this decouples the services more?
A: Generally, when one service is calling another service's API, what you really want to be happening is that it's calling upon something that is queued — data that is already sitting there. When I need one service to call another: if it's a push model, the queue itself should be kicking something off, and this is where I'd use streaming or a Lambda function. If it's a pull model, where this service is pulling from the other service, it can call that service's API — that's OK, as long as what it's actually pulling from behind that API is some sort of queue. I want the queue exposed through an API, so that the pulling function does not require tight integration with the rest of the service. As you've probably noticed, what I'm doing right now is a one-hour version of what's really a three-day course on designing microservices, and if you take the full version — whether from us or from others — there are a lot of examples of how to make this work. But again, what we're really trying to avoid is creating a world where a change to one microservice requires regression testing across multiple microservices. We're trying to keep each microservice isolated, so that all you need to know about it is its APIs and everything else is a black box to every other service.

Q: My question is: as we saw from your diagrams, for microservice APIs you mostly use HTTP. I tried to Google "microservices HTTP versus messaging," and the first result on Stack Overflow starts with "Well, I heard Amazon uses HTTP for its microservices." So what is the main reason? Is it only HTTP, or mostly HTTP and in some cases messaging?

A: When you say messaging, what do you mean? (Grant already took your microphone away.) If by messaging you mean a lighter-weight protocol like MQTT — yes, sometimes we use that today. If by messaging you mean passing a message in the sense of a message queue, well, that's usually travelling over HTTP anyway. Within Amazon we started enforcing the rule about three years ago that all HTTP had to be HTTPS — everything should be secured — and indeed I think in today's world there's no reason to ever use cleartext HTTP; that just opens a security risk there's no reason to open. Beyond that, the main reason we stress HTTP is to make it clear that we're using standard network topologies, and that we're not writing special networking pieces for our microservices. For those not familiar with MQTT, it's a very lightweight protocol designed for the Internet of Things, but sometimes it makes sense when you need high volumes and you have an application layer that takes care of consistency, because it's essentially a high-speed, UDP-like way of moving data. There are times it makes sense if you need to move data in a very lightweight way. All right, one more question.

Q: Hi, my question is regarding loosely coupled microservices. As you said, every time, a microservice has to write to a message queue. So if you have transactions that are very time-constrained, doesn't every publish and subscribe on those message topics increase the latency?

A: The short answer is yes, it will increase the latency. So how do you deal with that? One way, if you care deeply about latency, is to use very low-latency queues — usually an in-memory store like Redis.
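A minimal sketch of using Redis (ElastiCache) as that kind of low-latency queue, with a list as the buffer between two services — the endpoint is hypothetical:

    import json
    import redis

    r = redis.Redis(host="my-cache.abc123.euc1.cache.amazonaws.com", port=6379)   # hypothetical endpoint

    def enqueue(queue, payload):
        """Producer side: push onto a Redis list -- typically a few hundred microseconds."""
        r.lpush(queue, json.dumps(payload))

    def dequeue(queue, timeout=1):
        """Consumer side: blocking pop, so the consumer wakes as soon as data arrives."""
        item = r.brpop(queue, timeout=timeout)
        return json.loads(item[1]) if item else None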
The other piece is that loose coupling sets you free: it gives you parallelism and lets you grow. But if you've got a hard business requirement — I've got to get these ten services done in 50 milliseconds — you probably can't run them all through queues; you're going to need some faster paths. Having said that, in Redis — in ElastiCache, as we call the hosted Redis at Amazon — I can read or write a 1 KB object in about 200 microseconds, so at that speed I can usually do many hops and still stay within my budget. Again, this is one of those requirements you have to care about.

All right — if you have more questions, I'm going to be hanging out here, so come up and talk to me. Otherwise, I have two requests of you: one, go talk to our partners — these are the fine people who paid for the breakfast this morning and everything else — and two, please go out there and do something really cool with microservices. Thank you for your time.

[Applause]
Info
Channel: Amazon Web Services
Views: 22,271
Rating: 4.9187818 out of 5
Keywords: AWS, Amazon Web Services, Cloud, cloud computing, AWS Cloud
Id: KPtLbSEFe6c
Length: 55min 47sec (3347 seconds)
Published: Wed Apr 25 2018