Adding a cache is not as simple as it may seem...

Captions
The hardest thing to scale in a web application is the database, so in order to improve performance, a cache layer is often added to the stack. This can be pretty effective, improving performance while reducing the number of database queries. However, when it comes to software development, nothing ever comes for free, and caching has its own problems. The biggest one is stale data, which is when the data in your cache no longer matches what's in your database. This is solved through cache invalidation, but knowing when to flush your cache can be rather difficult, often being dictated by the type of data you're caching and the pattern you're using.

To understand what those problems are and how to solve them, let's look at implementing the most common caching pattern there is: cache-aside, also known as lazy loading. In this strategy, your application requests data initially from the cache and, if there's a hit, returns that data to the client. If the cache misses, however, then the database is instead queried for the data, which is then both stored in the cache and returned to the caller. It's a pretty naive strategy, but it works rather effectively. To see this in action, let's go ahead and implement it.

To go along with this video, I have a base project that I've created in Rust, which is a simple CRUD app for managing spells inside of a wizard's spell book. The tech stack of this project is Rust and axum for the HTTP server; for the database we're using Postgres with SQLx, and for the cache I'm choosing to use Redis. The actual tech stack itself doesn't matter too much; you could easily replace Redis with an in-memory cache if you wanted to. The benefit of using Redis, however, is that we're able to view the data that it contains using the CLI. In order to implement the cache-aside strategy, first clone down the project files using git. Once completed, you can open up the project in your favorite text editor and then navigate over to the handlers/read.rs file.
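Before touching the project itself, the cache-aside flow described above can be sketched in plain Rust. Everything here is an illustrative stand-in (a `HashMap` for both Redis and Postgres, and a simplified `Spell` type), not the project's actual code:

```rust
use std::collections::HashMap;

// A simplified stand-in for the spell record in the project.
#[derive(Clone, Debug, PartialEq)]
struct Spell { id: u32, name: String, damage: u32 }

// In-memory stand-ins for Redis and Postgres, just to show the flow.
struct Cache { entries: HashMap<u32, Spell> }
struct Db { rows: HashMap<u32, Spell> }

impl Db {
    // Simulates something like `SELECT * FROM spell WHERE id = $1`.
    fn query(&self, id: u32) -> Option<Spell> {
        self.rows.get(&id).cloned()
    }
}

// Cache-aside (lazy loading): check the cache first, fall back to the
// database on a miss, then populate the cache with the result.
fn find_by_id(cache: &mut Cache, db: &Db, id: u32) -> Option<Spell> {
    if let Some(spell) = cache.entries.get(&id) {
        return Some(spell.clone()); // cache hit: return straight away
    }
    let spell = db.query(id)?; // cache miss: ask the database
    cache.entries.insert(id, spell.clone()); // store for next time
    Some(spell)
}
```

The first call for a given ID goes to the "database" and warms the cache; every call after that is served from the cache.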
Inside of this is a find_by_id function, which is used by our HTTP handler to pull out spells from the database by their ID. Before we add in our cache, let's go ahead and see this endpoint in action. In order to run this, we're going to need an instance of Postgres. You can run an instance of this locally if you want to, but in order to give this a more real-world feel, we're going to use the sponsor of today's video, Aiven, which will allow us to deploy both a Postgres and a Redis instance for free.

To deploy our instance, first head on over to the Aiven website at go.aiven.io/dreamsofcode, or you can click the link in the description down below. Then you'll want to create a free Aiven account using your preferred method. Once signed up, you'll then want to create a new project for the application; for me, I'm going to call this "spellwork". With our project created, we're then prompted to create a new service. For this we want a Postgres database, so go ahead and click the service button. Upon doing so, make sure to select the free plan and then choose the location closest to you. Lastly, give your service a name and click the big blue create button. You'll then be shown your connection details; here, copy the service URI to your clipboard and then head back on over to your project files. Inside this project you'll find a .env file with a couple of environment variables left empty. You'll want to paste this service URI into the DATABASE_URL env var.

Afterwards, we can then run our server using the cargo run command. This will automatically perform a database migration for us and insert eight rows into the spell table as well. If we send off a curl request to the localhost:3000/spells endpoint, we can retrieve all of the rows inside of our table. The handler that we looked at, which we're about to add caching to, is the /spells/:id endpoint, which allows us to pull out individual spells by their ID. We can measure the amount of time that request takes by heading on over to the main.rs file and enabling the debug logging level.
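For reference, the finished .env might end up looking something like this once both services are deployed. The variable names follow the ones mentioned in the walkthrough; the values are placeholders for the service URIs copied from the Aiven console:

```
# .env (values are placeholders, not real credentials)
DATABASE_URL=postgres://USER:PASSWORD@HOST:PORT/defaultdb?sslmode=require
REDIS_URL=rediss://USER:PASSWORD@HOST:PORT
```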
Then, if we rerun our server and send off two find_by_id requests, we can see that it takes around 100 milliseconds for the request to process. You'll notice the first request was a little bit longer; that's because it's creating a prepared statement that is then reused by Postgres. Let's take a mental note of how long this took so we can compare it to the cache-aside version that we're about to implement.

To do so, navigate back to your editor and head on over to the handlers/read.rs file. Inside the find_by_id function is where we're going to implement this strategy. The first step is to check the cache with the ID of the spell that we're looking for, to see if it contains an entry. In our case, the Redis connection, or cache, is stored inside of the app state, so we can use the get method, passing in the ID of the spell. Because this method can fail, we don't want it to break the execution flow of the function; therefore, we'll use the unwrap_or method to turn an error into a None. On success, this method returns an optional spell, representing whether or not there was an entry inside of the cache, so we can check if the spell exists and, if so, return it back to the client. If the result is None, then we continue with the existing workflow.

This is the first half of cache-aside; the second half is storing the result from the database inside of the cache. First we'll check to see if we received a result from the database. If so, we can then write it to the cache using the set method, passing in the ID and the spell. This method also expects a couple more parameters; we'll set these to None and false for the meantime. With that, we're writing to Redis as if we were running the equivalent SET command in the Redis CLI. We now have a basic implementation of cache-aside; however, before we can run this code, we need an instance of Redis that we can
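In Redis CLI terms, the two cache operations just described map onto a GET (the lookup) and a SET (the write-back on a miss). The key shape and JSON value here are illustrative assumptions, not necessarily how the project serializes its entries:

```
127.0.0.1:6379> GET spell:1
(nil)
127.0.0.1:6379> SET spell:1 '{"id":1,"name":"fireball","damage":42}'
OK
127.0.0.1:6379> GET spell:1
"{\"id\":1,\"name\":\"fireball\",\"damage\":42}"
```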
connect to. Fortunately, Aiven has us covered again, as they provide a free version of Redis that we can use. To do so, head back on over to the Aiven dashboard and create a new Redis service. Again, be sure to select the free version, and make sure it's in a location close to you, preferably in the same place as Postgres. Once deployed, go ahead and copy the connection URI, then head back on over to the .env file and add it to the REDIS_URL environment variable.

With that, we're ready to run our code again using the cargo run command. When we send our first request to the /spells/1 endpoint, you'll notice in the logs that we're retrieving the version in the database; however, if we make the same request again, this time it's coming from the cache. If we re-enable the debug logging in order to measure the time this request takes, we can see that the response that pulls from the database takes about 250 milliseconds, with the cached response taking around 50. Therefore, by using this cache, we've reduced the amount of time it takes to pull out a record by 50% in the best case. In the worst case, however, we've actually increased it by two-thirds. That's because instead of performing a single request to the database, we're now performing three requests in total, with the additional two going towards Redis.

Fortunately, there is a way to improve this performance: by concurrently writing to the cache while sending the result back to the client. We can do this in our code by performing our write to the cache in a spawned Tokio task. Because this task takes place concurrently, the return of the result doesn't have to wait for the write to finish. Now, when we rerun our code, we can see that our first request, which was the worst case, only takes an extra 50 milliseconds when compared to before. So whilst there is still a performance hit, we've managed to reduce its impact.

By the way, you may have noticed that every time we restart the server, it flushes all of the keys inside of the cache.
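That concurrent write has a fire-and-forget shape. The project does it with tokio::spawn; in this sketch a plain thread stands in for the Tokio task (so it runs without an async runtime), and the shared map stands in for Redis:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Debug, PartialEq)]
struct Spell { id: u32, damage: u32 }

// A shared in-memory stand-in for the Redis connection.
type SharedCache = Arc<Mutex<HashMap<u32, Spell>>>;

// Fire-and-forget cache write: the handler returns the spell immediately
// while the write happens in the background. With Tokio this would be
// `tokio::spawn(async move { cache.set(...).await })` instead of a thread.
fn return_and_cache(cache: &SharedCache, spell: Spell) -> (Spell, thread::JoinHandle<()>) {
    let cache = Arc::clone(cache);
    let to_store = spell.clone();
    let handle = thread::spawn(move || {
        cache.lock().unwrap().insert(to_store.id, to_store);
    });
    (spell, handle) // the response does not wait for the write to land
}
```

The trade-off is that a request racing the background write may still miss the cache; for cache-aside that's harmless, since it just falls through to the database.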
This was done intentionally, in order to simplify development during this first section, but you likely wouldn't want to do this in production, especially if you have multiple instances reading from the same cache. However, removing this presents a problem with our implementation: cache invalidation, or the lack thereof. Basically, our keys will remain in the cache for as long as it exists, so in order to prevent our cache from consuming all of the available memory it has, we need to find a way to invalidate our keys. Fortunately, we have a couple of options.

The first is to use an eviction policy, which tells Redis how to remove keys from the cache when the memory pressure climbs too high. If we head on back over to our Redis instance in Aiven, we can select one of the policies we want to use. Let's take a look at what each one of these does. The first of these policies is "evict all keys, least recently used first", or LRU. This policy frees up memory by removing keys from the cache, starting with those that were least recently used, with the definition of "used" being either written to or read from. The next policy is "evict only keys with an expire set, least recently used first". This is similar to the above, but will only evict keys if they have an expiration set; we'll talk more about what that is shortly. Next we have "evict all keys in random order", which is basically a form of chaos. The next one is similar, but will only evict keys if they have an expiration set. Underneath this we have "evict only keys with an expire set, with the shorter-TTL keys being evicted first". This policy is good for prioritizing the eviction of keys that are likely going to be removed soon anyway. Next we have "evict using approximated LFU among the keys with an expiry set". LFU stands for least frequently used, which differs slightly from LRU: LFU will prioritize keeping keys that have a lot of activity on them, instead preferring to evict those that haven't been used as much. The last policy also uses an LFU, but will evict any key, regardless of whether it has an expiration set.
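The options in the console correspond to Redis's maxmemory-policy setting, which you could equally set yourself in redis.conf (or with CONFIG SET) on a self-hosted instance:

```
# Redis maxmemory-policy values matching the options described above
maxmemory-policy allkeys-lru   # evict any key, least recently used first
# volatile-lru    evict only keys with an expire set, LRU first
# allkeys-random  evict any key, in random order
# volatile-random evict only keys with an expire set, in random order
# volatile-ttl    evict keys with an expire set, shortest TTL first
# volatile-lfu    approximated LFU among keys with an expire set
# allkeys-lfu     approximated LFU across all keys
```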
Each of these policies has its own use case, depending on what data you're storing inside of the cache. In our case, the best policy is likely going to be either "evict all keys, least recently used first" or "evict any key using approximated LFU". Here, I'm going to go with the least-recently-used-first policy. By setting this, we've now prevented our instance from running out of memory.

As well as setting an eviction policy, we can also tell our keys to expire after a certain period of time by setting an expiration on them. We can add this to our caching logic with a small change to the set call, which will cause our key to expire after 60 seconds. If we go ahead and run our code, then send off a couple of curl requests, we can see that our workflow is the same as before: first pulling from the database, and then pulling from the cache. If I wait 60 seconds and send off another request, you'll see that this one is also loaded from the database, showing that the key had expired and been removed. Setting an expiration is really useful if you want your keys to be removed after a period of time. Some common use cases for this are session tokens with authentication, or tracking usage for something such as rate limiting. Other times, they can be used to force the periodic refreshing of data.

However, this can cause a common problem when it comes to caching. Let me show you what I mean. Here, I've changed the expiration to take place after 15 minutes. If I send off a GET request, we can see that our data has been cached as expected. However, if I send off a request to update the damage value of the spell, and then send a subsequent GET request, I receive the old value. This is because the data in the cache is now stale and no longer matches what's found in the database. In many applications this is unacceptable and therefore has to be resolved. Unfortunately, using cache-aside alone doesn't solve this issue; instead, we need to use another pattern, known
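In miniature, write-through looks like the following. As before, these are in-memory stand-ins rather than the project's real Postgres and Redis calls; the point is the ordering of the two writes:

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct Spell { id: u32, damage: u32 }

// In-memory stand-ins for Postgres and Redis.
struct Db { rows: HashMap<u32, Spell> }
struct Cache { entries: HashMap<u32, Spell> }

// Write-through: the database write happens first, and only on success is
// the cached copy refreshed, so subsequent reads never observe stale data.
fn update(db: &mut Db, cache: &mut Cache, spell: Spell) {
    db.rows.insert(spell.id, spell.clone()); // 1. write to the database
    cache.entries.insert(spell.id, spell);   // 2. mirror the write to the cache
}
```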
as write-through caching. With this pattern, when a write occurs in your database, such as updating a record, a write also occurs to the cache at the same time. Let's go ahead and implement this inside of our update handler. You can navigate there by heading over to the handlers/update.rs file inside of the project. Inside of this file we have our handler function, called update, which accepts an ID of the spell we want to update and an update request body, which contains our damage value. This function is where we want to apply write-through caching. To do so is rather simple: all we have to do is write to the cache once we've successfully written to the database. We can do this again by using the trusty set method on our cache instance, again making sure to set the expires field.

However, there's another improvement we can make here. In a write-heavy workload, this would end up creating a lot of keys that may never be read, and if we have an eviction policy that uses an LRU, then this can be problematic. Instead, we only want to write this key if it already exists in the cache. We can do that in Redis by using the XX option of the SET command, which will only write the value if the key already exists, and we can do the same in our update function.

Now let's go ahead and test it out. If I send a PUT request to a spell that has been cached, the next time I request it we'll see the updated damage value. We can also see this updated value inside of the Redis instance. However, if I send a PUT request to a spell that hasn't been cached, and we check our Redis instance, we can see that no key exists, but the update did apply to the model inside of Postgres. This shows that our write-through caching logic is working as expected. However, we are suffering a performance hit due to the fact that we're writing to the cache as well. We can improve this by writing to the cache concurrently, inside of a Tokio task, similar to how we did it in our cache-aside logic. This gives us our caching
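The XX behavior can be sketched like so; set_if_exists here is a hypothetical in-memory mirror of what SET with the XX option does on the Redis side:

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct Spell { id: u32, damage: u32 }

// In-memory stand-in for the Redis connection.
struct Cache { entries: HashMap<u32, Spell> }

impl Cache {
    // Mirrors Redis's `SET ... XX`: only write when the key is already
    // present, so write-heavy traffic doesn't fill the cache with keys
    // that may never be read.
    fn set_if_exists(&mut self, spell: Spell) -> bool {
        if self.entries.contains_key(&spell.id) {
            self.entries.insert(spell.id, spell);
            true
        } else {
            false
        }
    }
}
```

Updates to cached spells refresh the cached copy; updates to uncached spells leave the cache untouched, exactly the behavior observed in the test above with the PUT requests.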
implementation without causing a performance hit to our users. With that, we've managed to solve the stale data issue when it comes to our update handler. However, we have one last handler that can still be affected: the delete handler, which is used to remove a spell from the database. In order to prevent stale data, we need to delete the entry from the cache as well. You can do this using the del method of the cache instance; however, rather than showing you how to do this, I'm going to let you implement it yourselves. Let me know how you get on in the comments down below.

With that, we've managed to add caching into our application stack, whilst also addressing some of the problems that can occur when doing so. As a final note, caching isn't always the silver bullet it's claimed to be; most of the time, you should look at adding indexing to your database tables as a first step. If you'd like to know more about when to cache and when not to, then let me know in the comments down below and I'll do another video on it. I want to give a big thank you to Aiven for sponsoring this video; they have a really cool data platform with support for a lot of services, and without them this video wouldn't have been as good, so please do check them out using the referral link down below. Otherwise, I want to give a big thank you for watching, and I'll see you in the next one.
Info
Channel: Dreams of Code
Views: 105,308
Keywords: redis, caching, application, stack, tech stack, rust, postgresql, axum
Id: bFf-A27Rc9s
Length: 13min 29sec (809 seconds)
Published: Sat Mar 16 2024