Background Processing with Rails, Redis and Sidekiq

Captions
Situation: you're working on an e-commerce site and you need to generate a sales report. Now, this report is going to query a lot of data across a lot of different date ranges, map over it, aggregate it all together in different ways, and then generate a nice visual report. So let's click the button and see what happens. Hmm, looks like this request is taking a long time to process. Let's see how long it takes. OK, well, finally the request came back and we got a report with a lot of different metrics, and everything is good. So we deployed this app to production.

The problem is, when we run the exact same code in production and try to generate the sales report, it takes a while again, but instead of returning the report like it did locally, it timed out with an application error. It turns out that long-running synchronous web requests in a production application are a really bad idea, and in order to get this code to work we need to offload that long-running request into a background process that we can run asynchronously. But how do we do that? Well, you're about to learn on today's Always Be Coding screencast.

Before we write any code, let's analyze what's going on here. When we click on the generate report button, we're making an HTTP request to our web server, which in this case is deployed on Heroku. The web server will route the request to our application code, which in this case is a Rails app. So it goes through the middleware to the routes file, gets routed to the appropriate controller, which calls out to the model, which will fetch data from the database and return it to the controller. The controller will organize it in the proper format and send it back to the middleware, back through the web server, and finally back to our computer as an HTTP response. This entire process is what we would call a synchronous request-response cycle: we make a request, and from a UX perspective we block our application from doing anything else until we get a result. But also, from an infrastructure perspective, the process on Heroku that handles web requests is blocked from handling any other web requests until the current one is finished.

On Heroku itself, one synchronous web request can last for a maximum of 30 seconds before it will time out and return an error. So if you have a long-running function, you are not going to be able to use it in a synchronous web request. Now, there's a very good reason why Heroku does this, by the way. On Heroku, what you're actually paying for is the number of application processes. These are also called dynos. There are two different classes of dynos: web dynos and worker dynos. At a high level, a dyno is just a process that runs your Rails application code, but the web dynos are a special subset that are specifically made for handling web traffic. Heroku puts all the web dynos under what they call a routing mesh, which is similar to a load balancer, and it will evenly distribute traffic across all your running web processes. This router will also queue up all incoming requests and choose which web dyno to send that traffic to. If the process that's handling a request is taking too long, you're going to queue up all the other requests behind it, and nobody is ever going to get a response from the server, ever. Your application would effectively be broken.

So we need to introduce the concept of background processing. Instead of taking 30 seconds to answer the request, we can add the request to a to-do list of sorts, then return immediately to the user with the message, "Hey, we added your request to a to-do list and we'll get back to you later." The way this works in practice is with a queue. A queue is a data store similar to a database, but much simpler, because it's just a key-value store without the need for data modeling the way you have in a SQL database. So instead of calling to the model to generate the report, we'll just put a message on a queue saying that we need a report generated, then return immediately to the user with a message that we're generating their report and we'll get back to them later.

Then we're going to need a worker. A worker is just a Ruby object that is deployed on another process, separate from the process that is handling the web request. In Heroku speak this would be a worker dyno, and it will read the job off the queue and generate the report. Now, this worker process is not restricted by the 30-second timeout, so it can take as long as it wants. Once it's done, we can perform some kind of action with the data: emailing the report to the user, or uploading it to an S3 bucket, or adding it to a cache so the user can view it on the web without having to recalculate it. Anything we want, really.

So let's actually go through the code of what this would look like. We're going to need two things: Redis and Sidekiq. Redis is a database that we're going to use as the queue our application will write to. Sidekiq is a Ruby library for managing worker processes in Rails applications, and it will read off the queue to perform the jobs we put on it.

So Redis is a database. It's a key-value store. You can learn more about Redis at redis.io, but the good news is you don't really need to know that much about it to get started here. To install Redis, all you need to do is brew install redis, and to run Redis you just need to run the command redis-server. It'll start the server on port 6379, which is the default port Redis starts on. If you want a command-line interface into Redis, you can run the command redis-cli, and this gives you basically a REPL into your Redis queue. You could get, for example, every key that's on the queue right now. These are some artifacts that have been created from running Sidekiq, which I'll show you in a sec. But that's pretty much it. You don't really need to know much else.

So now we need to add the Sidekiq library. You can do that by going to your Gemfile and just adding the sidekiq gem. One thing to note, though, is that Sidekiq comes prepackaged with its own web UI. The web UI is written in Sinatra, so you need to add the sinatra gem as well. One other thing: if you're using Rails 5, you might want to bundle the Sinatra gem from source, because there are some dependency issues with using older versions. Now we just need to bundle and start the web server. I'm going to start the web server on port 3003. To use the web UI, we need to mount the Sidekiq app in our routes file. So we can go to routes, require sidekiq/web, and then mount the web module to any route we want. I'm going to mount it to /sidekiq. Now if I go to localhost:3003/sidekiq, I should see this dashboard that shows me how many jobs are on the Redis queue, how many have been processed, how many failed, how many are currently busy, the retries, the scheduled, and the dead. This is really useful.

Now all we need to do is put a job on the queue, build a worker that can actually process the job, and then monitor them from this UI. So let's start by building the Sidekiq worker. We're going to go to our app folder and create a new folder called workers, and I'm going to make something called report_worker.rb. This is just a Ruby class, and I'm going to include the module to make this a Sidekiq worker. You can also pass it some options through sidekiq_options. The one I like to always pass is retry: false. By default a Sidekiq worker will retry over and over again if it fails, but I find that for figuring out what has failed and what hasn't, it's easier if jobs don't automatically retry and you have to manually retry them. And then every Sidekiq worker has a method called perform. This is what's run when it reads a message off the queue. You can put messages on the queue with parameters and read them off with parameters. So, for example, we could have a start date and end date for the report that it's trying to generate. And let's just puts to the console whenever this is run: "Sidekiq worker generating a report from start date to end date." And end.

Now, in order to put a message on the queue, we need to do that from the controller. So this controller is what's generating the report right now. All it does is sleep for 30 seconds to mimic a long request. But instead of doing that, let's enqueue a message to generate a report. We can do that by calling ReportWorker and calling the perform_async function. That will tell Sidekiq to put this onto the queue. Let's pass it a mock start date of July 1st, 2016 and a mock end date of August 1st, 2016, and then we'll render text that just says "request to generate a report added to the queue."

Now if we go to /sales to generate the sales report and click generate report, you'll see "request to generate a report added to the queue." And if we go to our Sidekiq queues and refresh, we should see that we have one message enqueued. That's the message that we just put on. If we go back to the UI to create a sales report and click it again, another message has been added to the queue, and now we can see that we have two messages enqueued.

So basically what we're doing now is, when we make a request to the server, this controller right here will put the request to generate the report for these dates onto the queue and return immediately, as opposed to taking 30 seconds to actually generate the report. Now this worker will read it off the queue and just write to the console. So how do we run this? Well, all you need to do is go to your terminal and run sidekiq, and you can pass it this option, -c, for the number of threads you want to run. I'm just going to run one thread to make it easier. So I'm going to start up a worker by doing sidekiq -c 1, and you'll see it started processing these: "Sidekiq worker generating a report from 7/01/2016." And then this is it reading the second message. And now it waits idly for another message to be put on the queue. Nothing's being put on the queue, so nothing is being written. If I go back to our UI and click generate report, we'll see that this just ran and it generated a report.

You might be wondering how Sidekiq knew to be reading off that Redis queue, because we never really told it to do that. It turns out that's just the default behavior of Sidekiq: it's going to look for a Redis queue running on port 6379, and if it's there, it's going to read off of that by default.

So now I want to show you how to take the infrastructure we just built and deploy it to Heroku. The first thing you need to do is go to your Procfile and define a new process type. This is going to be a worker process that you can scale up and down independently of your web process. I'm just going to call it report_worker. We want this to run the exact same code we're running locally, which is just to run the Sidekiq worker we defined. So I'm just going to call bundle exec sidekiq -c 1. Now let's commit this code and push to Heroku. If we go to our Heroku app, we should see a new type of worker deployed, this report_worker, that we can scale up independently of the web process.

We're also going to need to add Redis to our Heroku deployment. Luckily, Heroku makes that about as easy as possible. We can just search for Redis in the add-ons and see that they offer a variety of different Redis services. Redis To Go is probably the easiest one to use. They have a free tier. Just hit provision, and Heroku will go ahead and set up a Redis queue on your server. Now there's one thing you're going to need to do, though, which is set an environment variable telling Sidekiq where to look: set REDIS_PROVIDER equal to the REDISTOGO_URL config variable. Now we should have Redis up and running.

The only thing we need to do now is start this report worker. You can either start it by clicking this button in the Heroku dashboard, or you can start it from the console. You can say heroku ps:scale report_worker=1, which will tell Heroku to scale your dyno configuration so that you have one process of the report worker running. We should be able to see that change reflected in the sandbox. If we refresh it, you can see it right here. This was turned off before.

Now if we do heroku logs -t, that'll give us the logs of the server running on Heroku. You should now see two things. You should see this web.1, that's the web dyno (anything in brackets is the name of the dyno), and report_worker.1 is the name of the report worker. So anything that's being logged by this is being logged by the worker process, and anything that's being logged by this is being logged by the web process. There are no jobs on the queue right now, so nothing's coming out. But if we go to [app URL unclear in captions]/sales, we can generate a report in production. We'll get that same "request to generate a report added to the queue," and we can see that the worker actually processed it. The web request came through, and then the report worker processed the request, "generating report from" these dates. So our distributed system worked properly.

So yeah, background processing is a really important part of web development. I hope that's a good high-level overview of how it works in the real world. You can see that we could have easily taken that worker process and uploaded a report to an AWS S3 bucket, which would have completed the idea I was going for there.

In other words, I am on Twitter as Always Be Coding. You can hit me up in the comment section of this YouTube channel as well. This YouTube channel is about to blow up. I'm going to start making a lot more code-related content over the next couple of months, and it really helps to hear what I'm doing right and what I'm doing wrong. I love hearing from people. I'm going to make a screencast a day for the next two months, so you're going to want to get on it. It's going to be really good stuff. [sign-off unclear], guys.
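The to-do-list idea from the video (the web side records a job and returns at once, a separate worker does the slow part later) can be sketched with nothing but Ruby's standard library. This is only an illustration of the pattern, not how Sidekiq is implemented: Thread::Queue stands in for the Redis queue, a thread stands in for the worker dyno, and the method names enqueue_report and start_worker are made up for this sketch.

```ruby
# Stdlib-only sketch of "enqueue now, work later". No Redis, no Sidekiq.
# Thread::Queue plays the role of the Redis queue; a thread plays the worker.

# "Web" side: put a job on the queue and return a message immediately.
def enqueue_report(jobs, start_date, end_date)
  jobs << { start_date: start_date, end_date: end_date }
  "Request to generate a report added to the queue"
end

# "Worker" side: keep reading jobs until the queue is closed and drained.
def start_worker(jobs, results)
  Thread.new do
    # Queue#pop blocks while the queue is empty and returns nil once the
    # queue has been closed and emptied, which ends the loop.
    while (job = jobs.pop)
      sleep 0.1 # stand-in for the slow report query
      results << "Report from #{job[:start_date]} to #{job[:end_date]}"
    end
  end
end

jobs = Thread::Queue.new
results = Thread::Queue.new
worker = start_worker(jobs, results)

puts enqueue_report(jobs, "2016-07-01", "2016-08-01") # returns right away
jobs.close   # this demo has no more jobs to add
worker.join  # wait for the worker to finish what is queued
puts results.pop
```

In the real setup the queue lives in Redis rather than in memory, which is what lets the web process and the worker run as separate processes on separate dynos.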
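For reference, the Gemfile and routes changes mentioned in the video could look like the fragment below. It shows two separate files in one listing and is a configuration sketch, not something to run on its own. Whether the sinatra gem is needed depends on the Sidekiq version in use (the video's setup required it), so treat that line as an assumption about the version.

```ruby
# --- Gemfile ---
gem "sidekiq"
gem "sinatra" # the video's Sidekiq web UI depended on Sinatra; may not apply to newer versions

# --- config/routes.rb ---
require "sidekiq/web"

Rails.application.routes.draw do
  # Serve the Sidekiq dashboard at /sidekiq
  mount Sidekiq::Web => "/sidekiq"
end
```

The Procfile entry described in the video would be a single line, report_worker: bundle exec sidekiq -c 1, and the worker dyno is then started with heroku ps:scale report_worker=1.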
Info
Channel: Decypher Media
Views: 42,002
Rating: 4.9600616 out of 5
Keywords: background processing, heroku, rails, redis, sidekiq, queue, background process, delayed job, resque, background jobs
Id: GBEDvF1_8B8
Length: 15min 0sec (900 seconds)
Published: Fri Jul 15 2016