Deploy ML models with FastAPI, Docker, and Heroku | Tutorial

Captions
Hi everyone, I'm Patrick, and in this tutorial we learn how to deploy machine learning models with FastAPI and Docker and end up with a production-ready app. You can use this template to deploy the container anywhere you want; in this video we deploy to Heroku because there's a free tier, so you can follow along. This approach works the same way for any machine learning or deep learning framework, so whether it's scikit-learn, TensorFlow, or PyTorch, the approach is the same, and it's also not too difficult. So let's get started.

Let's quickly test the final app. In this video we build a language detection model, and as you can see we have the live URL at herokuapp.com, /predict. We send a POST request with this text, and if we click on Send we get this response with language "English". If I switch this with the German version, for example, and click on Send, then the language should be German, and this is correct. Fun fact on the side: at AssemblyAI we also have an automatic language detection feature in our API, so this is actually a real-world project. Let's see if we can also develop an accurate model and deploy it.

First, let's build and train the model in a notebook. Here I'm in a Google Colab, and the focus of the video is not how we develop this specific model, but rather how we go from notebook to production app, so let's go over it quickly. The dataset is available on Kaggle, and this notebook and all the rest of the code are available on GitHub; I put the link in the description. Here we have our imports, then we load the dataset and analyze it: we have lots of different texts and the corresponding language, and these are our X and y. The first step we apply is a label encoder, which transforms the labels y. Basically, for each of these classes it assigns a number starting from zero. Whenever we have a preprocessing step that transforms our data, we have to remember it and later apply it in our code as well. This is another preprocessing step, for example: here we apply regular expressions to remove special characters, so we also have to use this in the code later. Then we have the typical train/test split, and now we build our model. It consists of two parts in this case: the first is a CountVectorizer, which we fit and transform with, and the second step is a Naive Bayes model. Then we can call model.predict with the test data, and here we have some metrics, for example printing the accuracy. We can see it's 98%, so pretty good.

Now, one best practice we can apply here, if possible, is to combine all steps into a single step. This is much less error-prone, and then we only have to save one model instead of two. With scikit-learn we can do this with a Pipeline; with TensorFlow, for example, we often have a Sequential model where we can put in all the layers. So with the Pipeline we combine the vectorizer and the Naive Bayes, fit the pipeline, and use it to predict on the test set. Now we only have to apply this one step, and if we compare the accuracy we see it's the exact same result.
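As a rough sketch of what the notebook does, assuming the Kaggle "Language Detection" dataset with Text and Language columns (the file name, column names, and the exact regex are assumptions, not taken from the video):

```python
import pickle
import re

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder

# Load the Kaggle language detection dataset (file/column names assumed)
df = pd.read_csv("Language Detection.csv")
X = df["Text"]
y = df["Language"]

# Encode the language names as integers starting from zero
le = LabelEncoder()
y = le.fit_transform(y)
print(list(le.classes_))  # keep these; the API needs them in this exact order

# Remove special characters; remember to apply the same step at inference time
X = X.apply(lambda t: re.sub(r'[!@#$(),\n"%^*?:;~`0-9\[\]]', " ", t).lower())

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Combine vectorizer and classifier into one step
pipe = Pipeline([("vectorizer", CountVectorizer()), ("classifier", MultinomialNB())])
pipe.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, pipe.predict(X_test)))

# Save the whole pipeline as a single versioned pickle file
__version__ = "0.1.0"
with open(f"trained_pipeline-{__version__}.pkl", "wb") as f:
    pickle.dump(pipe, f)
```

Saving the fitted pipeline as one versioned file is what keeps the deployment simple later on: the app only has to load and call a single object.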
Now that we're done training the model, we have to save and download it. With scikit-learn we can do this with pickle.dump; TensorFlow and PyTorch also have very simple APIs to save your model. One thing I recommend is to also save a model version so you can keep track of the current version. We use the major.minor.patch syntax, so this is our first minor model version. One other thing I want to mention: in this case this will be a single pickle file, so we can go ahead and click on Download. But if you use TensorFlow or PyTorch, the model is often saved as a whole directory, and we cannot download a whole directory; there's no download button. One trick you can apply is to zip the folder with the command !zip, followed by the name of the zip file and the name of the folder, and then download the zip file.

So now we have this, and finally, let's test it one more time. Here we have a text, and if we run this we see it's Italian. We can also see that y is only the number 8, so we apply the label encoder classes to get the actual language, Italian. I printed these at the top: here are the label encoder classes, which we also have to copy into our code. And now we're done, so we can use this model and build our app.

Now let's create the FastAPI app. In the root directory, let's create an inner directory app, and in here we'll put all the code. We have one main.py file, which is where we'll put the FastAPI endpoints; I'll go over it in a moment. Then we have an inner model directory, where I store the downloaded pickle file, the trained model, and another file that I called model.py. Basically, this is a helper file that does the model prediction. Here I hard-coded the model version, and I also use Path and the path of the current file to get the base directory. This is because the folder structure inside the Docker container can be a little different, and I want to make sure we can find the pickle file. Then we can open it, making sure the version corresponds to the file name, and call pickle.load to load our model. Then we have the classes, taken from the label encoder in the same order. We only need one helper function in this case, predict_pipeline, which gets the text as a string. In it we do the same preprocessing steps that we did in the notebook, and then we call model.predict, because we now have only one step with the pipeline object. We get the prediction, which is a number, so we use it as an index into the classes and return the language. So that's model.py.
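A minimal sketch of what app/model/model.py could look like under the layout described above. The pickle file name matches the hypothetical save step from the notebook sketch, and the class list is illustrative; both the order and the spellings have to match whatever your label encoder actually produced:

```python
# app/model/model.py
import pickle
import re
from pathlib import Path

__version__ = "0.1.0"

# Resolve paths relative to this file, so the pickle is found
# both locally and inside the Docker container
BASE_DIR = Path(__file__).resolve(strict=True).parent

with open(f"{BASE_DIR}/trained_pipeline-{__version__}.pkl", "rb") as f:
    model = pickle.load(f)

# Label encoder classes, in the same (alphabetical) order as in the notebook
classes = [
    "Arabic", "Danish", "Dutch", "English", "French", "German",
    "Greek", "Hindi", "Italian", "Kannada", "Malayalam", "Portugeese",
    "Russian", "Spanish", "Sweedish", "Tamil", "Turkish",
]

def predict_pipeline(text: str) -> str:
    # Same preprocessing as in the notebook: strip special characters, lowercase
    text = re.sub(r'[!@#$(),\n"%^*?:;~`0-9\[\]]', " ", text).lower()
    pred = model.predict([text])  # one pipeline step: vectorize + classify
    return classes[pred[0]]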
Now let's go to main.py. Here we import FastAPI, and BaseModel from Pydantic; I'll show you what that does in a moment. We also import predict_pipeline and the version as model_version, and be careful to start your import path with app, so app.model.model for this file; then it is also found inside the Docker container. Then we create the app, and in FastAPI it's super simple to create your endpoints; it's very similar to Flask. We define a function and decorate it with app.get or app.post. I often like to have an endpoint for a health check, so here we simply return health_check: OK, along with the model version for additional information; you could also return the API version here, for example. Then we have one predict endpoint. This is a POST endpoint, and in it we simply call predict_pipeline with the text and return the language as a dictionary.

Now, to make sure the correct data types are passed to this API: the input should be a string, and FastAPI works with type hints. So we can define a class TextIn that inherits from BaseModel, with only one field, text, which should be a string. Now, when someone sends data that is not a string, FastAPI detects this automatically and shows an error in the API. This is a super cool feature of FastAPI, and it's why we use BaseModel. We do the same for the output: for the response we define PredictionOut, which inherits from BaseModel, with one field, language, also a string. Then in the code we can access payload.text, and we have to make sure we return the one field, language. This is basically all we need for this simple API.
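Put together, a sketch of app/main.py as described; the names follow the tutorial, but treat it as a sketch rather than the exact file:

```python
# app/main.py
from fastapi import FastAPI
from pydantic import BaseModel

# Start the import path with "app" so it also resolves inside the container
from app.model.model import predict_pipeline
from app.model.model import __version__ as model_version

app = FastAPI()

class TextIn(BaseModel):
    text: str  # request must contain a string field "text"

class PredictionOut(BaseModel):
    language: str  # response contains one string field "language"

@app.get("/")
def home():
    # Health check plus model version for quick sanity checks
    return {"health_check": "OK", "model_version": model_version}

@app.post("/predict", response_model=PredictionOut)
def predict(payload: TextIn):
    language = predict_pipeline(payload.text)
    return {"language": language}
```

The two Pydantic models are what give you request validation and the auto-generated schema in the /docs page for free.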
Now we can dockerize it. To dockerize this, you of course need Docker installed on your machine. Then, in the root directory, we create a file called Dockerfile, and it's also good practice to add a .dockerignore, which works like a .gitignore; I simply copy-pasted it from GitHub (you'll find it in the description), and it keeps certain files out of the container. For the Dockerfile we can go to the official FastAPI docs; there's a section called "FastAPI in Containers", and I recommend reading through it because there are different ways of doing this. One way is to use the official Docker image with Gunicorn and Uvicorn, which you can also find on GitHub: "Docker image with Uvicorn managed by Gunicorn for high-performance FastAPI web apps". To use it, we copy the example into our Dockerfile, and that's basically all we need to do. It uses this base image, copies requirements.txt into the container, runs pip install on the requirements, and then copies the app directory into the container as well; that's why the folder has to be named app. Then we need a file called requirements.txt, where we put all the dependencies. In our case, since we use the base image, FastAPI, Uvicorn, and Gunicorn are already included, so the only missing dependency in our project is scikit-learn. If you use TensorFlow, you would put tensorflow here instead, or pytorch for PyTorch. It's also worth mentioning that you can often use tensorflow-cpu, because you don't need the full version, and then your container will be much smaller. It's also good practice to pin the version. For this we can go back to the notebook and run import sklearn followed by sklearn.__version__, which gives us the current version: 1.0.2, following the same major.minor.patch pattern. So back in requirements.txt we write scikit-learn==1.0.2, and that's all we need; now we can build the image.

To build the Docker image, we open a terminal. I actually noticed one small change we have to make first: as you can see in the Dockerfile, we have /app/requirements.txt, so the /app directory is the new root directory, and we have an inner app directory; therefore the copy instruction has to be COPY ./app /app/app. Save this, and now we run docker build -t with a name (I call it language-detection-app) and a dot for the current folder. Hit Enter, and this builds the Docker image. Mine was already cached, but if you do this for the first time it might take a few seconds or minutes. Now we run the container with docker run, mapping port 80 of the container to port 80 on our machine, followed by the name language-detection-app. The container starts, and you should see the Uvicorn workers starting and listening on 0.0.0.0 port 80, which is our localhost on port 80.

Now we can open the base route in the browser; this is a GET request, and it's working: health_check OK and model_version 0.1.0. We could use Postman to send a POST request, but what's really cool with FastAPI is that we get automatic documentation at /docs. Here we can see all the endpoints, the home route and /predict, and we can try them out. As you can see, it knows we need this schema with a text field containing a string; that's because we defined the TextIn base model with a text field of type string. We click "Try it out", enter "Hello, how are you?", hit Execute, and we get the response directly here: language English. You also get the curl command if you want to try it from curl. Let's try a German sentence, "Hallo, wie geht es dir?", and execute it: language German. So this is working.
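For reference, a sketch of the Dockerfile and requirements.txt described above. This follows the pattern from the FastAPI docs for the tiangolo/uvicorn-gunicorn-fastapi base image (the Python version tag is an assumption), with the COPY line already adjusted for the inner app directory:

```Dockerfile
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.9

COPY ./requirements.txt /app/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /app/requirements.txt

# The image's working root is /app, so our package lands at /app/app
COPY ./app /app/app
```

requirements.txt only needs the one pinned dependency, since the base image already ships FastAPI, Uvicorn, and Gunicorn:

```
scikit-learn==1.0.2
```

And the build, run, and test commands, mirroring the ones used in the video:

```sh
docker build -t language-detection-app .
docker run -p 80:80 language-detection-app
curl -X POST http://localhost/predict \
     -H "Content-Type: application/json" \
     -d '{"text": "Hello, how are you?"}'
```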
As the last step, let's deploy this to Heroku. You can deploy the container anywhere you want; in our case we do it on Heroku. Let's stop our local container and clear the terminal. The first thing we want is a Git repository, so we run git init (of course, you need Git installed on your machine), which initializes an empty repository. I also add a .gitignore, where we exclude files and folders we don't need; I only ignore the virtual environment and this one file. You could continue in this terminal, but I want to switch to my normal one. Now we run git add . to stage everything, check with git status that all the files have been added, and run git commit with a message, say "initial commit".

Now we can create our Heroku project. For this we of course need a Heroku account and the Heroku command-line interface installed, and then we can do everything from the command line. First we run heroku login; this opens the browser, where you can enter your credentials. I already did this and it's stored, so going back we can see we're now logged in. Before we continue, we need one more file in the root directory, called heroku.yml. For this I can recommend the page on the official Heroku Dev Center, "Building Docker Images with heroku.yml". We basically only need this one part, so let's copy and paste it in. Of course we have to add it to Git again, so we run git add heroku.yml and git commit with "add heroku.yml" as the message. Now we create a new Heroku project with heroku create and a name. Let's use language-detection-app; it needs a unique name that's still available, so let's append "12" and see if that works. All right, it worked, and this will be the URL we use to access our API.

Next we run heroku git:remote to set the remote for this project, which was called language-detection-app-12; this creates the remote. Then we run heroku stack:set container, because this app uses a Docker container, and now we only have to push with Git. We try git push heroku main, but we notice we're still on the master branch, so we rename the branch to main with git branch -m main. Now we're on main, we run git push heroku main, and everything is pushed to Heroku and the app starts; this might take a while. All right, this worked, so the deployment is done.

Now we grab the URL we saw above; this is where our API is live. This time I want to test it with Postman. We check the home URL with a GET request and send it, and we see health_check OK and the model version. Then we send a POST request with this data to /predict, and we see the language is German. Let's check a different text, "ciao", send it, and we get Italian. So this works; our app is deployed at this endpoint.

And this is all I wanted to show you. I hope you really enjoyed this tutorial. If you have any questions, let me know in the comments below, and I hope to see you in the next video. Bye!
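To recap the Heroku part in one place: a sketch of the heroku.yml (this is the minimal container-build stanza from the Heroku Dev Center docs) and the commands from the tutorial in order. The app name language-detection-app-12 is my reading of the name chosen in the video:

```yaml
# heroku.yml: tell Heroku to build the Dockerfile as the web process
build:
  docker:
    web: Dockerfile
```

```sh
git init
git add .
git commit -m "initial commit"
heroku login
git add heroku.yml
git commit -m "add heroku.yml"
heroku create language-detection-app-12
heroku git:remote -a language-detection-app-12
heroku stack:set container
git branch -m main
git push heroku main
```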
Info
Channel: AssemblyAI
Views: 58,233
Id: h5wLuVDr0oc
Length: 18min 45sec (1125 seconds)
Published: Sat Jul 30 2022