ElixirConf 2021 - Tyler Hawkins - Scaling Ecto and Postgres

Captions
Sweet. My name's Tyler Hawkins, and I'll be talking about scaling Ecto and Postgres today; hopefully that's why all of you are here.

A little bit about me: I work at a company called Divvy. It was started about five years ago, and it set out to put an end to expense reports by pre-approving fund requests for all employees of a company. It has since expanded from there into an entire financial platform. I joined about three and a half years ago, when the product launched, and I've watched Divvy grow a lot: from around 50 employees to more than 500, and from processing thousands of transactions a month to thousands an hour. As a result, we've really had to think about scaling out the database to match our growing needs and leveraging the best tools Ecto has to offer. With that as background, that's what I'm here to talk about.

Today we're going to imagine that we're starting a company that processes credit card transactions. I don't know why I would have picked that, but this is what I picked. First of all, we're going to need a database table that stores transactions and how they relate to cards. Here's our definition of the transactions table using an Ecto migration. It's a pretty standard migration: we've got a UUID primary key, the amount in cents, the merchant name, a reference to the card that made the transaction, and the generated inserted_at and updated_at timestamps, just so we know what's going on and when things happen.

The structure of the cards table it references doesn't really matter for today's purposes. We'll just suppose that cards are one-to-many with transactions and that the cards table isn't anywhere near as large as the transactions table has grown; obviously you can swipe a card many more times than once. So today we'll be talking primarily about transactions. After we run this migration we get a nice, simple, clean table, and it should hold up for quite some time.
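The migration slide itself isn't captured in the captions, but a minimal sketch of a migration like the one described might look like this (the table and column names here are assumptions based on the description above):

```elixir
defmodule MyApp.Repo.Migrations.CreateTransactions do
  use Ecto.Migration

  def change do
    create table(:transactions, primary_key: false) do
      add :id, :binary_id, primary_key: true
      # Amount is stored in cents to avoid floating-point rounding issues.
      add :amount, :integer, null: false
      add :merchant_name, :string
      # Reference to the card that made the transaction.
      add :card_id, references(:cards, type: :binary_id), null: false

      # Generated inserted_at / updated_at columns.
      timestamps()
    end
  end
end
```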
So first of all, why do we need to worry about scaling? Let's imagine that we're now struggling with a transactions table that has grown to millions of records. It's slow to query, disproportionately balanced with regard to the number of transactions per card, and only growing bigger every day. Users who log in to see their transactions used to get a list in less than a second; now, for our bigger customers, it's taking up to 30 seconds to return a list of transactions, which is a really poor user experience. And if this database is managed by us, as we're supposing, what do we do when we run out of disk space? Are we using the cloud, or do we have an on-prem solution? Sometimes it's as easy as clicking a button to upgrade your RDS instance size, or whatever the case may be, but how long can we keep bumping up the instance size in the cloud? And how do we handle the throughput when there are lots of users connecting all at once? That's what we're going to talk about today.

Two of the biggest scaling concerns for a database are storage and speed. Obviously every row takes up disk space. Using a cloud-provided database means we can upgrade storage pretty easily, and honestly, getting more hardware is pretty cheap, so why is a big database a problem? Well, if you're replicating data, which is a good recommendation, replication is going to be slow; if you're backing up the data, which you also should do, that will take time too. The problem there is that if, say, our production database crashes and we need to restore it, how long is that going to take? Is that disaster recovery a matter of minutes, hours, days? Hopefully not days. And the more rows there are, the more work has to be done to return query results. Slow database queries lead to slow API responses, and that leaves a poor impression on our customers.

Adding column indexes can help, but they also increase storage, so let's talk about that. Ecto makes adding an index really easy. In this case we're doing it with the concurrently flag, because Postgres supports building indexes without locking out writes. That way we don't have to lock the whole table in order to add the index. For large tables especially this matters: if we block traffic to our transactions table while adding an index in the middle of the day, as we engineers tend to do, it's going to be really difficult for people to see their results. The trade-off is that while a concurrent build doesn't lock the table, it takes longer to finish, because it has to scan the table twice to make sure it covers everything. It's also not flawless: the resources needed to build the index can affect the speed of your other queries, especially if the column you're indexing holds a lot of data.

There's also a problem with adding an index to every column, though. In this example we're naively throwing an index on every column, so we're trading really quick reads for really slow writes: whenever we write to the table, we now have to update the row and all five of these B-tree indexes. It's also not very efficient. How often are we going to search on all of these fields? How often are we actually going to ask when a row was last updated? In our case, if transactions never change, we might not even need an index on updated_at; it just depends on what we need. Instead we can use compound indexes that more directly match the queries we actually run on this table, so we save on storage and don't sacrifice our write times. Since we know we almost always query by either a date range or a specific card ID, we can build smarter indexes. Note that the column order in the index does matter: if we only query by amount, for example, we won't get as much benefit from an index on (card_id, amount). Going from five indexes to two might not seem like much, but imagine if we had 20 or 30 columns on this table, every single one was indexed, and we had millions of rows; then it could take seconds, maybe even minutes, to update rows, which is pretty bad.
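As a rough sketch of that kind of migration (the exact column pairs are assumptions based on the "date range or specific card" queries mentioned above):

```elixir
defmodule MyApp.Repo.Migrations.AddTransactionIndexes do
  use Ecto.Migration

  # Concurrent index builds cannot run inside a transaction, so the
  # migration's DDL transaction and migration lock must both be disabled.
  @disable_ddl_transaction true
  @disable_migration_lock true

  def change do
    # Matches "all transactions for this card, optionally by amount".
    create index(:transactions, [:card_id, :amount], concurrently: true)

    # Matches "all transactions in this date range".
    create index(:transactions, [:inserted_at], concurrently: true)
  end
end
```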
Another common issue with databases is load, even if the tables aren't that large. If we have too many connections hitting our database at once, then even if most of our queries are fast, they'll have to wait for a connection to be free. What's worse, any queries that are slow will hog the connection pool.

Read replicas can be a way to split up high query load on a database. Each replica matches the main database in structure and data: as changes are made to the main database, whether updates, deletes, or anything else, they're pushed out to the replicas. Because of this there can be a slight lag, which usually isn't noticed unless we immediately try to read a row that has just been written; I don't recommend that, because you won't get the data you're looking for.

Ecto makes read replicas really easy to work with. You basically define a set of replica repos and have one picked from a random pool, like this example does, where it just returns one of them at random; as long as we spread the load, I don't really care which one. You could also take a round-robin approach or whatever you want to use, but this is a nice, simple example. We can then use the read replica any time we don't need to do any writes. This helps distribute the load across the database instances automatically and prevents us from using up too many connections on a single database. The code to call the read repo is easy: instead of calling Repo.all, you call Repo.replica().all, and you could name that function whatever you want. As mentioned, we have to be careful about reads right after writes, like that second example where we update the merchant name and then immediately try to read it back: we'll probably end up with the old merchant name, depending on how much replica lag there is in the system at that moment. And thanks to frameworks like Phoenix and the underlying structure of OTP, Elixir makes concurrency very easy to use, so read replicas let us take advantage of that concurrency by spreading reads all over the place.
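A minimal sketch of the replica setup described above, following the pattern from Ecto's own replica guide (replica module names and count are assumptions; each replica also needs its own entry in config pointing at the replica host):

```elixir
defmodule MyApp.Repo do
  use Ecto.Repo, otp_app: :my_app, adapter: Ecto.Adapters.Postgres

  @replicas [MyApp.Repo.Replica1, MyApp.Repo.Replica2, MyApp.Repo.Replica3]

  # Pick any replica at random; a round-robin scheme would work just as well.
  def replica, do: Enum.random(@replicas)

  # Each replica is its own read-only repo.
  for repo <- @replicas do
    defmodule repo do
      use Ecto.Repo, otp_app: :my_app, adapter: Ecto.Adapters.Postgres, read_only: true
    end
  end
end

# Reads that can tolerate a little replication lag go to a replica:
transactions = MyApp.Repo.replica().all(MyApp.Transaction)

# But be careful reading your own writes: right after an update,
# a replica may still return the old merchant name.
```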
So, in summary: databases are constrained by storage, speed, and load. Indexes give us read speed at the cost of write speed and storage space, and we can match our indexes to our top queries instead of indexing everything. And we can spread load across replicas where possible.

Now let's talk a bit about Ecto and some of the tools it gives us. What do we have to help with our transactions table? Right now we'll assume it's not too big, our indexes are keeping things quick, and our replicas are helping to balance the load, so let's look at some of the Ecto features that can improve our code. Ecto has a lot of useful features and tools; today we'll focus on just three: batching queries together to make fewer calls to the database, the Multi data structure for data consistency, and streaming queries for resource constraints.

First, batching. Why does batching a query matter? What does it matter if you just hit the database with everything you need, all the time? Well, the more queries we make, the more connections we use and the more we hammer the database with requests. N+1 queries are especially challenging: they're really easy to write, and it's hard to see the problem until things get out of hand. At first it seems fine; we pass in a few card IDs and make two database calls for each one, no big deal. But what happens when we start passing in 100 card IDs and each has thousands of transactions? We're now making thousands of queries to the database just to get one piece of information. It's much better to batch our queries and get all the cards at once and all the transactions at once; if we can manage it all in a single query, all the better. One or two queries are better than thousands, because that's what the database is optimized for, especially with regard to the indexes we added. We've made it so that our reads are quick, but we don't want to do a bunch of unnecessary ones.

Multis won't actually optimize your queries for you, but they do provide data consistency. If you've never used them before, they're a lot like the pipe operator, in that you can pass information from one step to the next. Because a Multi runs in a database transaction, it will automatically roll back if an error, generally an error tuple, is returned. For example, say we have a Multi that runs several steps involving database writes: we don't want to update some data, hit an error, and then regret the update. A Multi handles undoing the updates when there's an error, so it's all or nothing, which is really good for data consistency. Say you updated a merchant name and then immediately tried to get some statistics, like how many transactions we have for that merchant name, and that second query fails: you should probably roll the whole thing back, because you didn't actually want the change to stand on its own. And because it's like the pipe operator, it's really convenient to chain steps together and use the results of one step in the next.
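A minimal sketch of that merchant-rename scenario with Ecto.Multi (the context module and a MyApp.Transaction schema backed by the transactions table are assumptions):

```elixir
defmodule MyApp.Merchants do
  import Ecto.Query
  alias Ecto.Multi
  alias MyApp.{Repo, Transaction}

  def rename_merchant(old_name, new_name) do
    Multi.new()
    # Step 1: rename the merchant on every matching transaction.
    |> Multi.update_all(:rename, by_merchant(old_name), set: [merchant_name: new_name])
    # Step 2: recount transactions under the new name. Returning {:error, _}
    # here rolls back the update from step 1 as well.
    |> Multi.run(:stats, fn repo, %{rename: {renamed, _}} ->
      {:ok, %{renamed: renamed, total: repo.aggregate(by_merchant(new_name), :count)}}
    end)
    |> Repo.transaction()
  end

  defp by_merchant(name), do: from(t in Transaction, where: t.merchant_name == ^name)
end
```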
Finally, another tool Ecto offers is streaming. This can be very useful if you need to pull n rows at a time; the default is 500 rows, but you can also specify max_rows. We actually use this at Divvy for CSV exports, because building a massive CSV export in memory is not great. If we needed to build a CSV export of our transaction data, especially if we wanted to include more data than is shown here, like card or user information, we could use streaming to send chunks to the web client a piece at a time, and the client will probably thank us for not sending it a ginormous file all at once. So instead of trying to cram a hundred thousand rows into a CSV built in memory, we can send it over 500 rows at a time. There are libraries like NimbleCSV from Dashbit that handle streams just fine, and actually prefer a stream (I think they might even require one), so if you have a stream, you might as well use it.

There are some other tools Ecto provides that I'm not going to go into in detail today. Window functions can make aggregate queries easier and make things like sums, averages, and group-bys easier to manage. Dynamic queries and fragments are great if you have conditional where clauses, like searches or filters that might not always be there and depend on the user; it's really easy to put those together and make your query life easy. And Ecto enums are a great way to get more readable code while optimizing storage space in the database: Ecto does the integer-to-atom translation for you, and it's also easier to sort by numbers, especially if product says they want these statuses sorted and they're all just random strings.

Now let's talk about going beyond basic optimization. What do we do when all of this isn't enough, when we have millions and millions of rows and start approaching the billion-row mark on transactions? There are a couple of solutions: partitioning tables and database shards. We'll talk about partitioning first.

Postgres has a feature called table partitions, which is kind of like having a router table that points to the right table to query. It's very good at breaking up disproportionately large tables in a database. In our case, the transactions table is getting very large while, as we said earlier, the cards table is still pretty slim, so we can use partitioning to break apart the transactions table. Partition creation isn't directly supported by Ecto, but since an Ecto migration allows arbitrary Elixir code, it's not difficult to do. Notice that we had to create a new table for our partitions. This is probably the worst part about Postgres partitioning: you cannot partition an existing table, at least not in any Postgres release so far, so you'll need a migration path to move that data over. There is a process for that, which you can look up if you need it; it usually involves a trigger of sorts if you want to migrate while the system is running. If you can take the database down for a bit, it's pretty easy: make a new table like the old one and migrate the data over. Doing it while it's running is a bit trickier.

Here you can see a pretty simple partitioning scheme on card_id, where we create a single partition for every hundred card IDs. Ideally you'd create partitions based on your actual needs. For example, maybe we have customers that use ad spend and have cards that get charged all the time (certain companies out there like to charge based on usage; I probably shouldn't say their names in this talk), and we could create a single partition just for that card ID. It can be helpful to run some counts and group-bys to find good partition boundaries: if most of our data is on these ten card IDs, we should make partitions there. Cards with only a few transactions can just stay in the default partition. What's nice about partitions is that when you query one partition, you're only dealing with the load on that partition, so a customer on their own partition would be the only one to feel their own slowness, not everybody else. It is important to note that the more partitions you have, the more overhead there is, so you don't want lots of arbitrary partitions with very little data in them; we obviously don't want to create one partition per card ID, for example.
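The migration on the slide isn't in the captions; here's a hedged sketch of the general idea as an Ecto migration using raw SQL. The talk's example used ranges of card IDs, while this version shows the closely related idea of a list partition for one hypothetical busy card plus a default; table names, the placeholder UUID, and the omitted foreign keys and indexes are all assumptions:

```elixir
defmodule MyApp.Repo.Migrations.CreatePartitionedTransactions do
  use Ecto.Migration

  def up do
    # Postgres cannot partition an existing table, so we create a new,
    # partitioned table and migrate the old data into it separately.
    # Note: the primary key must include the partition key column.
    execute """
    CREATE TABLE transactions_partitioned (
      id uuid NOT NULL,
      amount integer NOT NULL,
      merchant_name text,
      card_id uuid NOT NULL,
      inserted_at timestamp NOT NULL,
      updated_at timestamp NOT NULL,
      PRIMARY KEY (id, card_id)
    ) PARTITION BY LIST (card_id)
    """

    # A dedicated partition for one unusually busy card...
    execute """
    CREATE TABLE transactions_busy_card PARTITION OF transactions_partitioned
      FOR VALUES IN ('00000000-0000-0000-0000-000000000001')
    """

    # ...and a default partition that catches everything else.
    execute "CREATE TABLE transactions_default PARTITION OF transactions_partitioned DEFAULT"
  end

  def down do
    execute "DROP TABLE transactions_partitioned"
  end
end
```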
Another way to partition is by time instead of by card ID. This would be useful if we told our customers "we only show transactions from the last 30 days; anything older you'll have to get a different way," maybe with a cold-storage kind of approach where anything older than 30 days takes a little longer to retrieve. The biggest downside of a time-based partition is that Postgres will not automatically create new partitions for us. That can be handled either via a database trigger or by manual maintenance that you run once a month, or whatever the case may be.

So let's look at an EXPLAIN with partitions. Here I loaded the database with about 100,000 transactions spread across the eleven partitions, including the default one. Since this first query isn't using the partition key, it's actually worse than not partitioning the table at all: it has to go through all the partitions and union the results together, and the planner estimates a cost of 1232 for about 4,500 rows. If we use the partition key in the query, the cost drops to 251 and we only scan 989 rows of data. That's much more optimized, and honestly the plan looks a lot better too. The more transactions we add, the more benefit we'll see from partitioning, as long as we use the partition key: once we get up to millions and even billions of rows, this greatly reduces the amount of scanning, because the query can go to a subset of the data. If we don't use the partition key, it's absolutely terrible, as you might expect.

The main advantage is that once the data has been partitioned, you can query it as normal with Ecto. You get a lot of speed benefit without needing to change any code: queries hit the main table, and Postgres figures out which partition to use.
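One way to sanity-check that partition pruning from IEx is Repo.explain/3, available in recent ecto_sql releases. This sketch assumes the Transaction schema is now backed by the partitioned table and uses a placeholder card ID:

```elixir
import Ecto.Query
alias MyApp.{Repo, Transaction}

# Any real card id will do; this one is a placeholder.
card_id = "00000000-0000-0000-0000-000000000001"
since = ~N[2021-09-01 00:00:00]

query =
  from t in Transaction,
    where: t.card_id == ^card_id and t.inserted_at >= ^since

# With the partition key (card_id here) in the WHERE clause, the plan should
# show only the matching partition being scanned; drop that filter and every
# partition appears, appended together, at a much higher estimated cost.
Repo.explain(:all, query, analyze: true) |> IO.puts()
```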
Like I said before, there are some pretty major caveats. Creating partitions is not automatic; the default partition is really the only automatic part, and it can help you get by as you add more partitions, unless you create a database trigger that creates them for you. You also cannot partition an existing table; you have to create a similar, partitioned table and migrate the data into those partitions. And you can lose query speed if you don't use your partition key, because then it has to union across all partitions. Because of this, it's not ideal to partition all the tables in a database. If you need to do that, the next best strategy is to shard the database, which is what we're talking about next.

Database sharding is a lot like partitioning, except that instead of splitting up a single table, we're now splitting up the whole database. There's generally a core or main database that acts kind of like a router to get queries to the right database location. It's not automatic like a partitioned table, though, and it adds a great deal of complexity to using a database, so there are some very important questions to ask before you shard.

Really the best question to ask is: where is the best boundary to split shards across? Just like with partitions, where you want to make sure your partition key is always used, if you're going to split the database you want to know where that boundary should be. In our case we could split by card_id, since that's what's on the table, but a better boundary would probably be something like a user ID or an organization ID, something we don't often query across. The other questions to ask: what is the least amount of information the core database needs to know? That database is essentially the one that doesn't scale, since it has to act as the router, so you want as little information there as possible. Does data ever have to be queried across shards, and how often? If it's really often, you should probably go back and move the boundary line, because too much cross-shard interaction is not going to be easy to work with. The same goes for migrating data across shards: is that something that needs to be done often? Occasionally we'll have to, for example if we're using a hash to figure out which shard to go to and we add a new shard, so it's good to have a solid migration process and to double-check your boundaries, especially if you have to do it often.

So how does sharding work? The first piece is some sort of core database or router: how do we know where our data lives? That's the primary question you always have to answer when querying a sharded database, and the purpose of the core database is to direct queries to the right shard. It doesn't even have to be a Postgres database; it could be a cache like Redis or Nebulex, whatever helps you know where to go. As mentioned earlier, Ecto makes working with multiple database instances pretty easy. Just like with the read replicas I showed earlier, we can do something very similar for shards: instead of replicas, we say "here are all our different shards" and query them as needed. You could say "this is on shard two" and call that repo directly, which might be fine for a few things, but it gets tedious, because all of your code needs to know which shard to go to, and it's challenging to have that logic replicated everywhere.

However, Ecto provides a much easier way. Ecto has a feature called put_dynamic_repo that lets you change what the default repo is for an entire process. That means we can add some middleware so that whenever a user hits our API, we determine which shard they should be on based on their user, say "use that shard for the rest of this process," and every query in that API process just hits the correct shard. Your code doesn't have to know which shard it's on; it just works, which is really nice.
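A minimal sketch of that approach, under some assumptions: the same Repo module is started once per shard under different names, the hostnames and database names are made up, and MyApp.Sharding.shard_for_user/1 is a hypothetical lookup against the core/router database or a cache:

```elixir
# In MyApp.Application: one instance of the same repo per shard.
children = [
  MyApp.Repo,  # the default/primary instance
  {MyApp.Repo, name: :shard_0, hostname: "shard0.internal", database: "myapp_shard_0"},
  {MyApp.Repo, name: :shard_1, hostname: "shard1.internal", database: "myapp_shard_1"}
]

defmodule MyAppWeb.Plugs.PutShard do
  @behaviour Plug

  @impl true
  def init(opts), do: opts

  @impl true
  def call(conn, _opts) do
    # Hypothetical lookup: the core/router database (or a cache in front of it)
    # tells us which shard this user's data lives on, e.g. :shard_1.
    shard = MyApp.Sharding.shard_for_user(conn.assigns.current_user)

    # From here on, every MyApp.Repo call made by this request's process
    # goes to that shard's instance.
    MyApp.Repo.put_dynamic_repo(shard)
    conn
  end
end
```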
There are other options that are commonly used. You can encode the shard in your IDs, for example a UUID prefix like "s1" for the shard ID, or something along those lines. That works pretty well, although it still has the nuance that your code needs to know your ID structure and parse it to figure out where to go. You can also run a hashing function to get the shard. That's great for evenly distributing data, since it takes a round-robin-ish approach, but it doesn't really allow for easy data migration, and if you add another shard you now have a lot to deal with, because your hashing algorithm just changed. Another common approach is range sharding: if your primary keys are just serial numbers, you can say IDs 0 through 9,999 live on shard zero, and so on. That's one strategy, but it doesn't really work with UUIDs, and it doesn't isolate data very well, since a single customer's data will likely end up spread over multiple, if not all, of the shards. So if a customer says "we need all of our information isolated," that's not great, because their data isn't isolated at all; in fact, you probably can't even find out where all of it is.

Sharding a database lets you continue to scale your database horizontally and spread the load across different databases. Once you've set up the code changes to allow for more than one shard, adding an additional shard is straightforward, and if a customer needs their data isolated, like I said, it's pretty easy to give them their own shard and be good to go. It does take a lot of complexity to get started and get those shards put together, but once you have the infrastructure in place at the code level, adding additional shards is pretty easy. That said, a poor boundary line will cause more headache than help, and you'll find you'll probably hate whoever put shards into your infrastructure. Another caveat is that a dynamic repo only lasts for that one process, so if you spin off another process, you'll need to set the repo again or you'll get some weird results.

So, in summary: there are many ways to scale Ecto and Postgres. Indexes are a great first step to improve query performance. After that, read replicas can help handle load from high requests per second. Ecto provides lots of great tools to improve our querying and provide data consistency. Table partitions are great if you have large tables that have outgrown the rest of the database. And when you have no other options and need to split everything up, that's when you should use database shards.

Any questions?

Q: You mentioned that with partitions you don't want too many of them. Can you talk more about that?
A: It really depends on your database structure. The biggest downside to creating lots of new partitions comes back to disk space: it depends on where they're stored, and if they can't all be stored next to each other, Postgres might have to go look at a different disk. Really, the biggest trouble is when you're querying across more than one partition. You could easily have thousands of partitions, depending on how many rows you have, but having a single partition for just a few rows of data is not really helpful.

Q: On time-based partitioning: do you create partitions dynamically, or do you set up a partition for every day?
A: You can set up a date range, basically a from and a to. You could use intervals and just create a database trigger or a cron job that runs once a month and says "create a partition covering the next 30 days." There are ways to do it; it really depends on how consistent your time-based partitioning is. If it's every month, you can even set partitions up in advance, so you have some that are empty but ready to go for next month, because if a partition doesn't exist, the row goes into the default partition, and I believe it will migrate afterwards if you create a new partition.
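As a rough illustration of the kind of scheduled maintenance described in that answer, assuming the time-based variant where the transactions table is partitioned by range on inserted_at, and with hypothetical module and table names, a monthly job run from a cron-style scheduler might look something like this:

```elixir
defmodule MyApp.PartitionMaintenance do
  # Creates next month's transactions partition ahead of time.
  # Intended to be called from a scheduled job (cron, Quantum, Oban, etc.).
  alias MyApp.Repo

  def create_next_month_partition(today \\ Date.utc_today()) do
    # First day of next month, and first day of the month after that.
    from = today |> Date.beginning_of_month() |> Date.add(32) |> Date.beginning_of_month()
    to = from |> Date.add(32) |> Date.beginning_of_month()
    name = "transactions_#{from.year}_#{String.pad_leading("#{from.month}", 2, "0")}"

    # Safe to interpolate here: nothing user-supplied goes into this DDL.
    Repo.query!("""
    CREATE TABLE IF NOT EXISTS #{name}
      PARTITION OF transactions_partitioned
      FOR VALUES FROM ('#{from}') TO ('#{to}')
    """)
  end
end
```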
Q: (inaudible question about what the partitions actually are under the hood)
A: Yeah, each one underneath is really its own table. Let me see if I can pull up that slide; you can see here that you've got a sequential scan on transactions_1, transactions_2, and so on, all the way through transactions_10. It really does just create new database tables; you could go in and query transactions_7 directly if you wanted to. But the partitioned transactions table at the top acts as that router and lets you query all of them as if they're one table. They all have their own indexes and their own data, and they all hang off the parent partitioned transactions table.

Q: I noticed you were saying on the next slide that you need to include that card ID. But if you didn't, say you wanted to query across a group of different customers with different cards and get generalized analytics out of that, is it just a horrible idea to do that with partitions, because it has to union everything?
A: You can still do it, and depending on how much you've narrowed down your where clauses it'll be fine. In this case, if you knew a very specific amount, say we want everything across all transactions for exactly 75 dollars on the dot, it wouldn't be too bad to run those queries, but yes, it would have to union all the results, which goes back to how many partitions is too many partitions.
Q (follow-up): This would be more beneficial for client-facing stuff, but if you want analytics...
A: Yep, that's what Snowflake is good for, doing stuff just like that.

Q: (inaudible)
A: Yep, there's definitely a good use case for that; [tool name not captured] is one that's actually become popular and useful lately. I try not to dig on it too much, but it's improved quite a bit. And there are other things like Cassandra and NoSQL solutions; there are great use cases for those. Postgres and Ecto aren't the solution for everything, but they go pretty far. Any other questions?

Q: Speaking of going forward, how far can these strategies go, in terms of order of magnitude, as it grows to billions, trillions?
A: Generally it's around millions of rows, especially if you have a really wide table, that you start to see slowness, especially if you don't have indexes. Once you hit billions, you need some really good optimizations to make it work. But once you start the sharding strategy, you can go for a long, long time, because you can just say, okay, we've got a bunch of customers, spin up a new shard, throw them over there, and move on. Basically every shard is as good as your biggest single database would have been before, so if you could handle millions with no trouble, you just add a new shard when you get to millions again. At the point where you're getting into hundreds and hundreds of shards you're going to run into trouble, but by then you have a lot of money coming in and can figure it out.

Q: Do you have any thoughts on how the shards you showed earlier would translate across actual cloud hardware, like different EC2 instances or database instances?
A: That really depends on your router database, like I said. Depending on what that is, you could have it be wherever you want; if, as I mentioned, you used something like Redis, a cache of sorts, then it really doesn't matter. The nice thing about what I showed on this slide is that you can configure every single repo individually in Ecto, so you could have different credentials for every shard. As long as your code knows where to go, you're fine.

Q: (inaudible question about figuring out which queries to index or tune)
A: EXPLAIN is really good for that, as it gives you a cost analysis, and beyond that it's really trial and error when you're looking at a single query. We've used tools in the past like VividCortex, I guess it's SolarWinds now, that can give a good breakdown of indexes and of which queries take the longest or are hit the most frequently. That can tell you, okay, go index this, because we run this query more than anything else. There are a lot of tools out there; I know Datadog has some good ones around database monitoring and APM and stuff like that. You can also use Ecto telemetry to get data about your queries, things like that.
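A minimal sketch of that last point, using the standard telemetry event Ecto emits (the event name assumes the default telemetry prefix for a repo named MyApp.Repo, and the 500 ms threshold is arbitrary):

```elixir
defmodule MyApp.QueryMonitor do
  require Logger

  # MyApp.Repo's default query event is [:my_app, :repo, :query];
  # it's configurable via the repo's :telemetry_prefix option.
  def attach do
    :telemetry.attach("log-slow-queries", [:my_app, :repo, :query], &__MODULE__.handle_event/4, nil)
  end

  def handle_event(_event, measurements, metadata, _config) do
    total_ms = System.convert_time_unit(measurements.total_time, :native, :millisecond)

    if total_ms > 500 do
      Logger.warning("Slow query (#{total_ms}ms): #{metadata.query}")
    end
  end
end
```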
Q: Are you always choosing between sharding and partitioning, or would you ever be in a situation where you'd do both?
A: You could end up in that situation. For example, if we did our shards by organization ID, so the bigger customers get their own shards and so on, they might still have so many transactions that we'd need to partition the transactions table anyway. So you can do both; really, a shard is just its own database instance that has a slight relationship to another sharded instance. I thought I saw another question over here.

Q: So with the partitions, you're not concerned with knowing which partition the data is in?
A: Correct. Postgres takes care of all the partitioning logic and routing for you; you just hit the main partitioned table and it figures it out, hopefully, like I said, with the partition key, because then it knows exactly where to go; without it, it has to work that out the hard way. You define the partitioning when you set up the table. You can see right here that in this case we're partitioning by range on inserted_at, and the example before this one used ranges of card IDs. It's part of the table definition, which is partly why you can't partition an existing table: you have to create a new one, because the partition key is pivotal. And there are a number of different options and strategies; I just showed range, but you can also do list partitioning, which uses specific values, and there are a few other options you can use.

Q: These different approaches you've shared seem like a lot of hands-on work. Have you evaluated doing any of this with some of the open-source solutions out there that do it out of the box, like Vitess or Citus?
A: Yeah, there are a number of solutions out there; Aurora, for example, does this for AWS services, doing some horizontal scaling for you and creating replicas. What I've shown helps you do it your own way, which I mentioned at one point alongside some of the other strategies you can use. You can use third-party tools for a sharding strategy; what I wanted here is something very customized to what we'd want, whether that's isolation needs or things like that, whereas third-party tools don't usually give you as much flexibility. On the plus side, they take care of it for you: you don't have to worry about it, you can just keep inserting rows and they'll manage it. But if you do need control over it, that's where this approach comes in.

I think we have time for one more question, in the back.

Q: (inaudible)
A: Yeah, as long as it knows what the partition key is, it's good to go; it basically has its own index of which partition to go to, so it's pretty quick. As for re-sharding, that depends, once again, on what your sharding strategy is. If you're doing it yourself, then yes, you could re-shard pretty easily as long as you know what your strategy is; if you're using something like a hashing algorithm, you might run into some trouble, because it's still going to automatically assign data to the original shard. So it just depends on your strategy.
All right, I think that's all the time I have. Thanks, everyone.
Info
Channel: ElixirConf
Views: 1,857
Keywords: elixir
Id: SHjAMcBwY_o
Length: 39min 50sec (2390 seconds)
Published: Fri Oct 22 2021