RailsConf 2018: Inside Active Storage: a code review of Rails' new framework by Claudio Baccigalupo

Captions
(Techno music) - Hello, thanks for coming. This is awesome; I see a very full room. If you have an empty seat next to you, can you raise your hand? Okay, guys around, if you want to sit down, there's still like 20 empty seats. Well, thanks for coming to this talk. Thanks for coming to RailsConf. It's been amazing so far, a lot of great talks, and I had a lot of fun. I hope you had a lot of fun as well. And we're also here to celebrate that Rails 5.2 was released just a week ago. If you have any contribution to the Rails codebase, can you raise your hand? Can everybody else give a clap to all these people who made it possible? (applause) I'm not gonna go through everything new in Rails 5.2. I'm just gonna go through one thing, but it's a really exciting and pretty interesting thing, and it's called Active Storage. And because this is RailsConf, I'm not just gonna tell you what it is, I'm also going to go deep inside the code, without scaring you too much. And so, you can also learn how things work, and it's a perfect segue from the previous talk, so then I can invite you all to become contributors to the Rails codebase, because once you go past the stage of "This is just magic", and you look at the code, you might realize that it's Ruby. And it's classes and objects, and you can also contribute to that. As a matter of fact, Active Storage is the newest framework, so it might still be missing some documentation, or maybe you can find some issues. So it's the perfect place for everybody to start diving in and looking at the Rails codebase. My name is Claudio, and English is not my first language, so you can just go and download the slides if you want. They are already available there. It might help you follow, speakerdeck.com/claudiob. So before I get started, I just want to comment that this talk is about Active Storage. It turns out I work for a storage company, but it's not cloud storage.
Actually, we come to your place and we take the stuff that you don't need. We wrap it up, take pictures, barcode, take everything to the warehouse, and then you can ask for your items back with a Rails app, of course, and it's called Clutter. You can check it out, and we're based in LA, and we're hiring; that's my pitch. But we're gonna talk about Active Storage instead. Clearly, I can't talk about everything, but there's a great place to start, even after this talk, and that is the official Rails Guides, guides.rubyonrails.org. There's a section about Active Storage, and it goes through all this. So it's actually really well-written, so I invite you to just go there. And, by the way, all the URLs that you see are on my Speaker Deck, so you don't have to write them down. And this is what we're going to talk about today. First of all, what is Active Storage? Maybe you've heard about it, maybe you haven't, you're just curious. I'm gonna give you an introduction on what it is and how to use it. And you're gonna see that it's really really easy to use. So then you might wonder, "How does it work? What is it made of?" And that's gonna be the second part of the talk. I'm just gonna go through the main classes of this library. And, finally, how does it all work together? And I think one of the reasons why I'm giving this talk is because, when I looked at the code of Active Storage, it's really elegant, it's a really good codebase. And sometimes, people wonder, "What can I look at? What is a library that is a good starting point?" So this is my advice, and you can look at that. And so enough with the introduction, I think I'm gonna get started. Okay, so what is Active Storage, and how to use it? Active Storage is a library to upload files. If you have a web application, and you want users to be able to upload files through their browsers, this can be a good option for you.
It's not the only one; in the past, there have been other third-party libraries, like Paperclip or CarrierWave, but Active Storage ships inside Rails by default, so you already have it there. So you might want to give it a try. And so how do you use it? How do you let users upload files? I'm gonna show this in a brand-new app, so there are no dependencies. Just gonna create the scaffold, brand-new scaffold, Rails 5.2 app, and then have an upload. So this is how you just create a brand-new app in Rails. You do your rails new, I call this app "catalog", and I generate a model called "cat", of course, 'cause it's RailsConf, and I haven't seen too many cats in this conference yet, so this is my part for that. So every cat has a name, and then we create the database, we migrate, and we run the server, and if you've ever used the scaffold in Rails, you know that it generates forms like this one. So you have a form where you can add a new cat to your application. So this is the baseline. We already have a form. Now we want to add a field for people to upload an image. So there are three steps. The first step is actually to just add a field to the HTML. And this is pretty much HTML, we are adding a file field, I call this "picture"; you can give it whatever name you want, "photo", "cute little picture", "cat cute picture". And then a label, that's really it. There's a new field called "picture". Now, in Rails, whenever we have a field in a form that we want to submit to a controller, we need to tell the controller about this field. We need to whitelist this new parameter, because of the strong parameters in Rails. So what that means, in your controller, you already have this params.require(:cat).permit(:name), these are all the params that are accepted from the form, so all you have to do is to add :picture as another parameter that the controller's gonna accept. And then, the last step is in the model; in the cat model, we have to add a single line of code.
We just have to say that a cat has_one_attached :picture. And that's it, that's literally all you have to do. If you do all of this, then you have this form. People can upload the picture, the picture gets uploaded and gets attached to the cat, and, as a matter of fact, if you have a show page, you can just do image_tag @cat.picture, and that's gonna display the picture right there. Do you think it's awesome? (laughter) Okay, a round of applause for Active Storage. (applause) I think it's pretty awesome, also for the cat. So this is the basic usage, but it doesn't stop here. There's a lot more that you can do with Active Storage. Here are some of the other things I'm not gonna really talk about here, but you can still do. Imagine you have this @cat.picture, you can actually display a variant, like a black and white, 90 degrees flipped image. You only need a library like MiniMagick in your Gemfile, and it does this, it creates another file like that, black and white. It's not only for pictures, I mean, for images. You can also have, like, has_one_attached :document, like a PDF file. And, as a matter of fact, if you have a PDF, you can even display a preview, like an image, that renders the first page of the PDF. And you can resize that, 100 by 100, for instance. And, of course, if you have a cat, you're going to have many videos of your cat, and so it's not just has_one_attached, you also have has_many_attached that works in a similar way. And, for each video, you can extract metadata, the size, the angle, the duration of the video. So there is more about how to use Active Storage. There are variants, previewers, and analyzers. And so read the guides, and also, if you just Google "Active Storage", "How to use Active Storage", probably the first result is gonna be a blog post by Vladimir, who's right here, and by Alex, from Evil Martians, and they have this very comprehensive blog post about how to set up all these features.
So if you want to start using it, it's pretty easy to start. And if you want to do all these variants and previewers, please feel free to do this. Just try in a brand-new app, and then see if it works in your production app. Okay, so far so good? 'Cause this was the easy part. Okay, cool, so now, how does it work? What's inside the code? Like what we were talking about in the previous talk, you know, you use the routes, but then you want to see how they work. So you open the codebase. So here, we're gonna go through a similar journey. And we're gonna look at the classes that are inside Active Storage. Well, to look at the source code, how do you do that? Rails is open source, so the entire source code is on GitHub. So you can just go to github.com/rails/rails, there is an activestorage folder, and the entire code is right there. Another way to open the source code, if you have the Active Storage library in your app, is that you can just do bundle open activestorage, and it's gonna open the source code in your editor. You can actually do this with any library that's in your app as a gem. Okay, so you open the source code, and there's a bunch of files. Where do you even start? So this is what you should remember about the important classes of Active Storage. There are three main classes. There's ActiveStorage::Service, ActiveStorage::Blob, and ActiveStorage::Attachment. And I'm gonna explain what they are so next time you hear about a blob, you know exactly what we're talking about. So let's start with ActiveStorage::Service. The service is the part that deals with just moving bytes. You have an attachment, you have a file, just moving the bytes from memory, from your browser, to disk. That's all the service does. So I tried to make a picture of this.
When you upload the cat picture from the browser, so an HTTP-uploaded file, there is a component that takes those bytes and stores them, for instance, on your local hard disk if you're in development, in a certain folder, /storage. So that's the service, that's what the service does. It moves bytes from memory to disk. ActiveStorage::Service is a real class, and the code looks like this. It has other methods too; it has upload, download, delete, and so on. These are the most important ones. And, funnily enough, this is the implementation. So if you call ActiveStorage::Service upload, it raises an error. This is actually a pattern. What this is is an interface. This is basically saying, "There is not just one service, there are many subclasses." And you can implement whatever service you want, but they have to follow this pattern. So you can't call ActiveStorage::Service directly, you can call one of its subclasses. And all the subclasses, they implement an upload method, a download method, because they can be implemented in a different way. So this is just a pattern you might have seen, even in other Ruby codebases, where there is a class that doesn't actually implement the methods. It just describes what methods are there. But now, let's look at one of the subclasses. This is called DiskService, and it's the one that's used by default. So in the examples I showed before, I was running an app locally, and DiskService was the default configuration. So DiskService, the upload method, it takes this IO, this file, and really, all it does is call the Ruby method IO.copy_stream, that just copies bytes, because this is the service that stores a file on your own disk. So it's not doing anything else than taking the stream of bytes and copying it to a certain location. The location is this make_path_for(key), which, by default, is inside this /storage folder in the same Rails app.
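The interface-plus-subclass pattern described here can be sketched in plain Ruby. This is a simplified illustration, not the Rails source: the class names SketchService and SketchDiskService, the root: argument, and the path_for helper are all made up for the example. Only IO.copy_stream and the idea of raising NotImplementedError in the base class come from the talk.

```ruby
require "fileutils"
require "stringio"
require "tmpdir"

# Hypothetical base class: it only describes the interface.
# Calling any method on it directly raises an error.
class SketchService
  def upload(key, io)
    raise NotImplementedError
  end

  def download(key)
    raise NotImplementedError
  end

  def delete(key)
    raise NotImplementedError
  end
end

# Hypothetical disk-backed subclass: moves bytes between an IO and a folder.
class SketchDiskService < SketchService
  attr_reader :root

  def initialize(root:)
    @root = root
  end

  # Copy the stream of bytes to a file named after the key.
  def upload(key, io)
    path = path_for(key)
    FileUtils.mkdir_p(File.dirname(path))
    IO.copy_stream(io, path)
  end

  # Read the bytes back from the same path.
  def download(key)
    File.binread(path_for(key))
  end

  # Remove the file; a missing file is treated as already deleted.
  def delete(key)
    File.delete(path_for(key))
  rescue Errno::ENOENT
    nil
  end

  private

  # Assumption of this sketch: one flat folder, file name equals key.
  def path_for(key)
    File.join(root, key)
  end
end

# Usage: store and fetch a fake upload in a temporary folder.
Dir.mktmpdir do |dir|
  service = SketchDiskService.new(root: dir)
  service.upload("abc123", StringIO.new("meow"))
  service.download("abc123")
  service.delete("abc123")
end
```

A service for another backend would subclass SketchService in the same way and only change how the three methods move the bytes.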
The download method is also using Ruby libraries, so it's reading the content of the file from the same path. So this is a pretty straightforward implementation. It takes the bytes and moves them from memory to disk, and the other way around. So this is great for development. It's gonna store the image on your computer, but for production, you probably don't want to store the image locally, you want to store it on a cloud solution, like, for instance, Amazon S3. And the good thing about Active Storage is that it already ships with a service for S3, so you don't have to build it. This is another class, it's the S3Service. You see that there is still an upload method and a download method, they're just implemented differently. This object_for(key).put, those are all methods that come from a gem called aws-sdk-s3, that is included. So it's just using the methods to move bytes to and from S3. So if you want to use Active Storage with S3, all you have to do is, you go to the configuration file, it's called storage.yml. By default, it's gonna say service: Disk, that's the default. You just change that to service: S3, and then make sure you have the credentials for a bucket in S3, and then it just works out of the box. And so, this is pretty convenient, and Active Storage actually also has support for Microsoft Azure and Google Cloud Storage, all inside. And if you have your own cloud solution, and you want to build a service for that, just make sure you follow this pattern, this interface, and then you can build it. Maybe you can build your own gem, or maybe you can do a pull request to integrate it in the codebase. How are we doing so far? Good, good, thank you. Oh, you can applaud if you want, I'm not going to- (applause) Okay, so the service moves bytes. So in the case of the cat picture, we have the bytes, but we don't know what those bytes are. We have no idea that it's even an image. It's just bytes. So we need someplace to store the information.
Where is that thing, what is it? And this is what the blob does. Same example, when you upload this file, the blob is the part that's actually storing the key, so you know where the file is, the original file name, which might be useful, the content type, it's a .jpg, and the byte size. So this is the part that stores a reference to the file. And it's actually storing it in the database. active_storage_blobs is a table in the database. So this is the schema of the table, and it reflects what I was just describing before. You have the key, file name, content type, metadata, byte size, and the checksum, to ensure that the image is not corrupted. Now, you might be wondering, where does this table come from? We did not create any table. So this is one of the things that Rails helps you do, because the first time you ever use Active Storage in your app, and you try to upload the file, you will see this page. And it says, "Could not find table active_storage_blobs". So Active Storage needs that table to work. Also, it tells you what to do, "To resolve this issue run bin/rails active_storage:install". And this was actually my own small contribution to Active Storage. But this is really the only thing you have to do, and only once. So you need these tables, the first time you try locally, you might see this page, then you run this command. This command adds the migrations to your app. You run the migrations, and then the tables are there. And then you don't need to do any more. When you deploy to production, the migrations are gonna be there, you run them, and that's it. So only once, you need to make sure that your app has these tables, and then that's it. And, just as an approach, this is a little different from other file upload libraries that, instead, require you to add fields to every single model. So let's say your cat has a picture, they say, "Well, you need to add a field called 'picture' to the cat", then you have the dog model, it says you need to add it there.
Active Storage is a little different, because everything is here, is inside these tables. So you only have to do that once. So I think that's pretty convenient. So I'm just gonna show you one method of ActiveStorage::Blob, and this is the upload one. So what it's doing, as I said before, it's extracting all this information, so it calculates the checksum from the bytes, and then it extracts the content type, either from the extension or the MIME type or just the first bytes, and then it extracts the size. It stores all this information, and then it calls the service to actually store the file. And then, here's the last class. So we have the bytes, we have the blob, that's telling us, "This is an image of this certain size", but we're still missing a part. We're missing the fact that this image is the picture of this cat. We're missing the association. So that's the last part that's missing, to associate the blob with what it belongs to. And this is what ActiveStorage::Attachment does. So this is the last part, when you upload a picture of a cat, apart from the service and the blob, there is gonna be a new attachment that says, "This is a picture, it belongs to the record cat number four", for instance, "and this is the blob number two." So it's really just building this association. So then, when you call cat.picture, it knows where to go. And ActiveStorage::Attachment is also a table, and it's the other table that the migrations add to your app. And this is the entire schema. So it has a name, a polymorphic record, and then a blob, and this index. So what this index is saying is if you have cat number four, it can only have one picture for this blob. And that's basically all there is. Even the code for attachment, it's really just describing what I just mentioned. It's backed by this table, it belongs to a record, polymorphic, so maybe the cat, and then it belongs to the blob. It creates the association. And so, summing up, these are the main classes.
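To make the blob and attachment roles concrete, here is a small plain-Ruby sketch. It is not the Rails code: SketchBlob, SketchAttachment, and build_blob are hypothetical names, the two Structs only mirror the columns mentioned in the talk, and the content type is passed in by the caller rather than detected. The MD5, base64-encoded checksum is an assumption of this sketch.

```ruby
require "digest"
require "stringio"

# Hypothetical stand-in for a row of the active_storage_blobs table.
SketchBlob = Struct.new(:key, :filename, :content_type, :byte_size, :checksum,
                        keyword_init: true)

# Hypothetical stand-in for a row of the active_storage_attachments table:
# a name, a polymorphic record (type + id), and the blob it points to.
SketchAttachment = Struct.new(:name, :record_type, :record_id, :blob,
                              keyword_init: true)

# Extract the metadata the blob keeps about the bytes.
# Assumption: the checksum is an MD5 digest, base64-encoded.
def build_blob(io, key:, filename:, content_type:)
  data = io.read
  SketchBlob.new(
    key: key,
    filename: filename,
    content_type: content_type,
    byte_size: data.bytesize,
    checksum: Digest::MD5.base64digest(data)
  )
end

# Usage: "this is a picture, it belongs to cat number four".
blob = build_blob(StringIO.new("meow"), key: "abc123",
                  filename: "cat.jpg", content_type: "image/jpeg")
attachment = SketchAttachment.new(name: "picture", record_type: "Cat",
                                  record_id: 4, blob: blob)
attachment.blob.key
```

The point of the split is that the blob knows about the file and nothing about cats, while the attachment knows about the cat and nothing about bytes.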
You have the service, that moves the bytes, you have the blob, that has a reference to where the bytes were stored, so that you can fetch that, and then you have the attachment, that connects this blob to the original model. Okay, so this is the third act, where we put everything together. How does it all fit together? At the beginning, I showed you Active Storage is pretty easy to use, you just add a few lines of code. Then I mentioned the main classes, but how does it all work together? There's still some magic there, that's not very clear. And so that's what I'm gonna show you now, and this code uses some techniques that are actually all over the Rails codebase, and so if you learn some of these things, they might even be useful if you look at other parts of Rails, or even your own codebase. So I think the perfect starting point is when I said, "Well, all you need is to add this line of code", so how does this single line of code add this whole behavior to your cat? It's pretty powerful. And even before that, where does this method even come from? Because here, we are inside ActiveRecord::Base. Active Record doesn't have has_one_attached as a method. So how did this method end up there? Well, the answer is that Active Storage is an engine, and James Adam yesterday had a talk called "Here's to the Crazy Ones", where he talked about engines for the entire talk, so it's a really good talk to watch on YouTube if you have the chance. Active Storage is an engine, so as an engine, it has initializers. And this is how this code works. This is inside Active Storage. What this is saying is whenever a Rails app is loaded, is initialized, ActiveSupport.on_load(:active_record), what that line is saying is, whenever you're loading a Rails app, when you're loading Active Record, stop there for a second, and extend Active Record with ActiveStorage::Attached::Macros.
So it's loading Active Record, with all the Active Record methods, where, find, and so on, and then, once it's loaded, extend it with ActiveStorage::Attached::Macros, which is a module inside Active Storage, and one of the methods is has_one_attached. And so that's where that method comes from. That's how Rails is able to extend, and in this case, we're extending Active Record with Active Storage. But this pattern is really used all over. Okay, so that's how the method ended up there. Now what is this method? This method is actually pretty long, and I didn't put the whole method here, 'cause it can be scary, I just stopped at the first five lines. But even just the first line is like class_eval <<-CODE, what is this? Maybe you don't write this type of code normally in your applications. So let me explain what this is. When you say a cat has_one_attached :picture, suddenly your cat has a method called "picture". This is pretty similar to when you say in Active Record, "User has many roles", and then you have user.roles. So you have this new method just because you called something else. Of course, Rails cannot know at the very beginning, has_one_attached what? It has to take that name that you've given it and then define methods with that name. So this is what's happening here. This class_eval is taking that name that you passed to has_one_attached, and it's using that to define a bunch of methods with that name. For instance, if name is "picture", this code becomes a little easier to understand, because you said "picture", now you have a method called picture=, and another method called picture, and so on. So this is really meta-programming in Rails, and it also helps you when you debug, to go to the correct line, but this is really what it's doing. It's defining new methods based on the name that you've given. Now this picture= is the method that is called when you set a picture, so when you attach a picture to a cat, and it's calling picture.attach.
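The two techniques from this part, extending a base class with a macros module and using class_eval with a heredoc to define methods from a name, can be tried out without Rails. The sketch below is an illustration under assumptions: SketchMacros, has_one_attached_sketch, SketchAttachedOne, SketchRecord, and SketchCat are invented names, and the attach method only remembers its argument instead of creating blobs and attachments.

```ruby
# Hypothetical macros module, playing the role of the module that
# gets mixed into the base class once it has been loaded.
module SketchMacros
  def has_one_attached_sketch(name)
    # The heredoc is evaluated as Ruby source inside the calling class.
    # Passing __FILE__ and __LINE__ + 1 makes backtraces point at the
    # right line of this file when something goes wrong.
    class_eval <<-CODE, __FILE__, __LINE__ + 1
      def #{name}
        @sketch_attached_#{name} ||= SketchAttachedOne.new("#{name}", self)
      end

      def #{name}=(attachable)
        #{name}.attach(attachable)
      end
    CODE
  end
end

# Hypothetical small class returned by the generated reader method.
class SketchAttachedOne
  attr_reader :name, :record, :attachable

  def initialize(name, record)
    @name = name
    @record = record
  end

  # The real library would create a blob and an attachment here;
  # this sketch only stores what was passed in.
  def attach(attachable)
    @attachable = attachable
  end
end

# Stand-in for the base class that gets extended at load time.
class SketchRecord
  extend SketchMacros
end

class SketchCat < SketchRecord
  has_one_attached_sketch :picture
end

# Usage: the setter goes through picture.attach.
cat = SketchCat.new
cat.picture = "cat.jpg"
cat.picture.attachable
```

Because the macro receives the name as an argument, calling it with :document instead of :picture would define document and document= in exactly the same way.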
Picture itself is another method defined by has_one_attached. So this is the picture=. This is the picture method. Picture is an instance of an internal class called Attached::One. This is what picture is. The good thing about this is you can go and look at the code of Attached::One, but it just gives you an idea of how really object-oriented this Active Storage is. It's not injecting, loading modules. It's really based on a bunch of classes, and these classes are pretty small. So if you go inside, you're gonna see that they're not too hard to understand. So this is an instance of this Attached::One, and this is the class that has a method called attach, and this method, it's basically doing two things. First, it's creating a blob from your attachment, and the blob then calls the service to move the bytes, and then it's creating the Active Storage attachment. So we go back to where we were before. Because you are attaching a picture to a cat, first it's creating a blob, and then it's storing the bytes with the key, and then it's creating the attachment. So it really goes back to where we were. And I hope that with this talk, I'm really letting you know that you can just go and look at the codebase, and both learn and possibly contribute. As I was saying before, this library is pretty new, and it's open to suggestions. Maybe there are some issues. Something else you can do, if you go to the GitHub page of Rails, there's the Issues tab. All the issues are normally tagged, so if you see the tag Active Storage, you might want to explore and see what happens in the code. The codebase is not really that big. We heard about Journey before, that's a little bigger as a codebase, but especially if you're new to Rails, I think Active Storage is a good place to start. Active Support is another good place to start, Active Job as well, they're all pretty small. So just give it a try.
I just want to give another round to explain again all this process of attachments, blobs, and services. This is another method that Active Storage adds. When we have has_one_attached :picture, there's this after_destroy_commit. So think about it, if you destroy a cat, which you should never do, (laughter) but if you decide to destroy an instance of a cat from your database, and that cat had a picture, you probably want to delete that picture. So if you had a picture on S3, it doesn't need to be there any more, or maybe it shouldn't. A cat picture, maybe it can stay there, but if it's an attachment of a document and you delete it, you don't want the attachment to stay there. So this is also something that Active Storage implements with this after_destroy_commit { picture.purge_later }. after_destroy_commit is a callback of Active Record, and it's saying "Do something after the instance was destroyed". It's after_destroy_commit, so it doesn't run inside the transaction of the deletion. If it were just after_destroy, it might be blocking your database. So let's see what happens when you destroy an instance of a cat, it's calling picture.purge_later. purge_later, it's saying if there is an attachment, call attachment.purge_later. Now attachment, you remember, was in the database, it's destroyed, but it's also calling blob.purge_later. The blob is actually calling a job, it's saying, well, I don't really need to delete this file in real time, I can just let an Active Job do this five seconds from now. When the job runs, it's calling blob.purge; blob.purge, it's destroying the blob, and it's calling delete. Delete is calling the service, and saying, can you actually delete this file? And then the service, for instance, the DiskService, it's calling File.delete. So this is really the entire flow. And I just wanted to show this again, just to demonstrate some properties of Active Storage. For instance, in this case, really separation of concerns.
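The step-by-step delegation in this flow can be mimicked with a few small plain-Ruby objects. Everything here is hypothetical scaffolding for illustration: SketchStore stands in for a service, an Array of lambdas stands in for the background job queue, and the class names are invented. The sketch also leaves out the database rows that the real attachment and blob would destroy.

```ruby
# Hypothetical service: the only object that knows where the bytes live.
class SketchStore
  def initialize(files)
    @files = files
  end

  def delete(key)
    @files.delete(key)
  end
end

# Hypothetical blob: knows its key and which service to ask.
class SketchPurgeableBlob
  def initialize(key:, service:, queue:)
    @key = key
    @service = service
    @queue = queue
  end

  # Do not delete right away; enqueue the work for later.
  def purge_later
    @queue << -> { purge }
  end

  # Runs when the queued job is executed.
  def purge
    @service.delete(@key)
  end
end

# Hypothetical attachment: only forwards the request to its blob.
class SketchPurgeableAttachment
  def initialize(blob:)
    @blob = blob
  end

  def purge_later
    @blob.purge_later
  end
end

# Usage: the file survives purge_later and disappears when the job runs.
files = { "abc123" => "meow" }
queue = []
blob = SketchPurgeableBlob.new(key: "abc123",
                               service: SketchStore.new(files), queue: queue)
SketchPurgeableAttachment.new(blob: blob).purge_later
queue.each(&:call)
```

Each object only talks to the next one in the chain, which is why swapping the store for a cloud-backed one would not change the attachment or the model.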
The cat doesn't know that the file is even on S3, it's just going step by step. So if I haven't said it yet, I think this library is really awesome, and the code is really elegant, so check it out. To conclude, there is more, and I didn't have time to go through that, but here are some of the other things that you might want to explore. First of all, Active Storage is not just a Ruby gem, it's also a JavaScript library. The code is in the same place, and the reason why there is a JavaScript library is because sometimes you don't want to upload your files to the Rails app, you just want to upload them directly from your browser to S3, for instance, especially maybe if you use Heroku, you don't want to use the storage of Heroku, you just want to upload them to S3. So Active Storage includes a JavaScript library to do that. You include the JavaScript file in your view, and it takes care of creating a blob, so you have a key, then, with this key, the file is uploaded to S3, and then this is the key that you can then use to retrieve this image. The only thing that changes is the order in which the blob is created and the image is stored. Something else that I found useful is every time you use Active Storage, in your Rails log, you're gonna see that information, "file was uploaded" and so on. The way in which this is done is using a library called ActiveSupport::Notifications. You can publish a message and then subscribe to it. It's used all over Rails, so if you find this and you don't know what it is, just check it out, it's pretty good. And then the last thing is Active Storage has a file called routes.rb, and I like that we tie back in with the previous talk about routes. It has routes because, imagine if you have a file, then you want to display that file in the browser. So it needs a URL. So this URL for a file needs to be generated.
The nice thing about this file, about routes.rb, is that it's using a couple of methods in the router that were just added in Rails 5.1, the methods direct and resolve. So in your router, you can have resources, get, post, but you can also use these two methods. And maybe you have never seen these methods used in a real app, so if you want to check them out, they are in Active Storage. And this concludes my talk. All the slides are available there. And really, I don't know, give it a try and let us know what you think. I am a member of the Rails Issues Team, which means if you open an issue in Rails, I can look at it and check it, and maybe merge or comment. And even if you want to mention me on GitHub, I am claudiob on GitHub, so you can do that if you open a PR, it will make me happy. And that's it. I'm gonna be here for the rest of RailsConf, so if you have any questions, just come find me. Thank you. (applause)
Info
Channel: Confreaks
Views: 2,956
Id: -_w4uqoVSpw
Length: 33min 48sec (2028 seconds)
Published: Fri May 18 2018