Python for AI #4: Model Hubs & HuggingFace Tutorial

Captions
Hi everyone, I'm Patrick from AssemblyAI, and welcome back to our Python for AI development course. This is lesson number four. In the last two lessons you learned how to prepare your data and how to build your own models. Now we step up the pace a little and talk about model hubs. With model hubs you can access and use pre-trained models right out of the box, which means you don't have to deal with training a model or with getting and preparing data; you can simply use those models right away. In this video I show you three important model hubs, and then we take a closer look at the most popular one: I show you a few examples of how we can use different models for different use cases. What's also really cool is that even though these models are already pre-trained, you can still adjust them to your own use case if needed. This is called fine-tuning, and at the end of the video I quickly cover how fine-tuning works, so stick around until the end. Now let's get started.

There are three important model hubs you can use: the Hugging Face Hub, the TensorFlow Hub, and the PyTorch Hub. As you might know, PyTorch and TensorFlow are the most important deep learning libraries, and Hugging Face provides another library, called Transformers, that uses both TensorFlow and PyTorch under the hood, so you can choose which framework you want to use. I will quickly touch on the Transformers library so that we understand how this model hub works. The Hugging Face Hub is by far the most popular and the biggest one: right now it hosts almost 150,000 models that we can all use for free. It is also community driven, so everyone can upload models, but big companies use it too; for example, you can see models uploaded by OpenAI, Stanford, Microsoft, Google, and Facebook Research. Pretty much all the big companies upload their models here, and we can access them for free, which is awesome. You can also browse by use case: there are multimodal models and models for computer vision, natural language processing, and even audio.

I quickly want to introduce you to the Transformers library now, and then we look at three cool examples from different areas: a text summarization example, a computer vision example, and lastly a multimodal example where we generate images from text, like the popular DALL-E and Stable Diffusion models.

Let's quickly talk about the Transformers library and how we can use it. By the way, you can find the Transformers library on GitHub, where it currently has 84,000 stars, so it's pretty popular. For this video we use a Google Colab, because there it's super simple to set everything up: go to colab.research.google.com and create a new notebook. Before we do anything else, switch the runtime type to a GPU, so we get a GPU for free; select this, click Save, and then restart the Colab. Now the first thing to do is to install Transformers, which we can do with pip install transformers. To run a terminal command inside a Colab cell, we have to prefix it with an exclamation mark, so we write !pip install transformers, then select the cell and press Shift+Enter to run it (you can of course also click the run button). Now the Transformers library is installing.
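For reference, the install cell from this step looks like this:

```python
# Run inside a Colab code cell: the leading "!" tells Colab to
# execute the line as a shell command instead of Python.
!pip install transformers
```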
While this is being installed, let's quickly have a look at the website. As I said, under the hood Transformers uses PyTorch and TensorFlow, so you also have to install one of those if you're running this on your own machine; in a Colab, both torch and tensorflow are already available, so we only have to run pip install transformers and we can start.

The simplest way to work with the Transformers library is through a so-called pipeline. A pipeline abstracts away all the steps that happen under the hood, for example preprocessing the data, applying the model, and presenting the result in the correct output format. There are pipelines available for different use cases; a quick look at the docs shows pipelines for sentiment analysis, text generation, question answering, summarization, translation, image classification, and more. The way we apply them is to say from transformers import pipeline, give the pipeline the task name (here, sentiment analysis), call the result our classifier, and then simply apply it by passing input to the classifier. In this case we pass in some text, and by the way, we can also pass a list of texts if we want to classify more than one. Then we print the result, run the cell, and see what happens. In the output we get one label, POSITIVE, with a score of about 0.99, so the model is pretty confident that this sentence is positive.

Now let's also have a look at the other output we are seeing here. First it said "No model was supplied", and then it defaulted to a long model name. You might be wondering how it knows which model to use if you simply tell it to do sentiment analysis. Usually you also pass the model parameter and specify a name that you find on the model hub, for example bert-base-uncased; you simply copy the name and pass it in. If you don't specify one, each pipeline has a default model, and that is what happened here. The first time you run this code, the model is downloaded; if you run it locally it is stored on your machine, and here it is stored on the Linux machine that Google provides for us. If we run the cell again, we only see the last output with the label, and no more downloading. Also keep in mind that under the hood a pipeline applies both a model and a so-called tokenizer; the tokenizer does the preprocessing of the text. The simplest approach is to choose a model name and a tokenizer name and then run the pipeline, so it only takes a few lines of code to do sentiment analysis.

Of course, the library gives you a little more flexibility as well. For example, you can set up the model and the tokenizer on your own. For this you say from transformers import AutoTokenizer, AutoModelForSequenceClassification; these are very general classes that you can use with any suitable checkpoint. Again you need a model name that you find on the model hub, and then you instantiate the model by saying AutoModelForSequenceClassification.from_pretrained with the model name, and the same for the tokenizer; you will see this .from_pretrained function a lot. Then you have the model and the tokenizer as objects, and instead of the names you can pass the objects to the pipeline, and it all works the same. Here we are now using a multilingual model and try sentiment analysis on a French text. Let's run this cell again with Shift+Enter; as you see, the first time we do this the model is downloaded, and then we get our output, a label of "5 stars" and a score. So this works as well.
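As a reference, here is a minimal sketch of the basic pipeline cell described above (the example sentence is just an illustration):

```python
from transformers import pipeline

# Without a model argument the pipeline prints a "No model was supplied"
# warning and falls back to a default checkpoint, which is downloaded
# and cached the first time the cell runs.
classifier = pipeline("sentiment-analysis")

# A single string works; a list of strings works too.
result = classifier("We are very happy to show you the Transformers library.")
print(result)
# e.g. [{'label': 'POSITIVE', 'score': 0.9998}]
```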
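And here is a sketch of the variant where we build the model and tokenizer ourselves; the checkpoint name is an assumption, chosen because this nlptown model produces the "5 stars" labels seen in the video:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Assumed multilingual sentiment checkpoint from the Hub.
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Pass the objects to the pipeline instead of the names.
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

# An illustrative French example sentence.
print(classifier("Nous sommes très heureux de vous présenter la bibliothèque Transformers."))
# e.g. [{'label': '5 stars', 'score': 0.77}]
```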
And that is how we work with the Transformers library and the model hub. The general approach is to select a pipeline and then either simply pass the model name and the tokenizer name, or create the model and the tokenizer (or the preprocessor) ourselves. This is what we will see now as we look at three examples you can try on your own.

As a first example, let's look at text summarization. To do this, go to huggingface.co and click on Models. This is the starting point for the model hub: here you find all the available models, and you can filter them, for example by task, where you can select Multimodal, Computer Vision, Natural Language Processing, and so on. In our case we want summarization, so let's click on that and filter for it. Now you see all the matching models along with who uploaded them; for example, this one is by Facebook, and that is the one we want to try (of course, feel free to explore and select a different one, for example this one by Google). When we select the model, we find a so-called model card. This is like a README that explains what the model does and how it was trained; often it also provides code examples to make it super simple to get started, and sometimes you find information about the dataset that was used to train it. Hugging Face provides not only models but also datasets, and you can use those as well.

Now, to get started, if you write the pipeline code on your own, you simply click on the copy icon to copy the model name, and then in the pipeline you say model= and pass it in; of course, we also have to change the task to summarization. In this case, though, we can simply grab the example code that we find on the model card. Let me copy and paste it, and let's go over it: we insert a new code cell and paste the code in. We set up our summarizer, which is again a pipeline, now with the summarization task name and this model. Then we create the article we want to summarize, and then we simply call the summarizer. We can also apply different parameters if we want; have a look at the documentation to see what they do. I simply press Shift+Enter and run this, and let's see if it works. Again, the model is downloaded the first time, and then we get our output, the summary text, which condenses the article into a few short sentences. So this worked as well. This is how we can access the model hub: it only takes a few lines of code and a few minutes to set up, and then we can access powerful pre-trained models.
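For reference, the summarization cell looks roughly like this; facebook/bart-large-cnn is an assumption for the Facebook checkpoint selected in the video, and the article text is a placeholder:

```python
from transformers import pipeline

# Model name copied from the model card on the Hub (assumed checkpoint).
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Placeholder article; any longer text works here.
article = (
    "New York City is the most populous city in the United States. "
    "Spread across five boroughs, it is a global center of finance, "
    "culture, and media, and it is home to more than eight million "
    "people, with tens of millions of visitors every year."
)

# Optional generation parameters; see the pipeline docs for the full list.
print(summarizer(article, max_length=60, min_length=10, do_sample=False))
```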
Now let's look at a second example, from the computer vision field. Let's go back to the model hub and click on Models, filter for Computer Vision, and select Image Classification, for example. Again you can explore different models; in our case I want to try out the first one, from Google. You will again find a model card that describes what the model does. On the right side you also see some examples: if you put in this image, for example, you get this output. Further down we also find a small code snippet, and this is all it takes to use the model, so let me copy and paste it, and then we go over it.

We insert a new code cell. In this case we say from transformers import ViTImageProcessor and ViTForImageClassification. This is similar to what we have seen above with AutoTokenizer and AutoModelForSequenceClassification, only these are more specific classes that do the image preprocessing and the classification. Then we have some more helper modules, the Pillow library and requests, to download an image. We simply download an image, and let me actually put this part in another cell, so that if we run only this, we see what the image looks like; while this is running, I can already insert a new cell and copy the rest. Here you can see the image shows some cats. Now we use the .from_pretrained method that I showed you and pass in the name that we are seeing here on the model card, and we set up the processor and the model.

In this case we don't apply a pipeline; like I said, a pipeline only abstracts away the different steps, and in this case we do the steps ourselves. This means we first apply the processor, the preprocessing step, which gives us the inputs for our model. Then we pass the inputs to the model and get the outputs. From the outputs we access the logits, and to keep only the class with the highest predicted probability we apply the argmax function, and then we print the result. Let's run the code and see how this looks: we get a predicted class, and it says this is an Egyptian cat. So this worked as well. Keep in mind that we cannot only use the pipeline in one line; we can also set up the preprocessor and the model separately and apply them one by one. The approach is very similar: in this case we need the .from_pretrained method and again the name from the model hub, and then we can run the code.
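Here is a sketch of that snippet, assuming the Google checkpoint is google/vit-base-patch16-224 (the image URL is the cats example from the model card):

```python
from transformers import ViTImageProcessor, ViTForImageClassification
from PIL import Image
import requests

# Download the example image of two cats.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Assumed checkpoint name for the Google ViT model shown in the video.
processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

# Preprocessing: turn the image into the PyTorch tensors the model expects.
inputs = processor(images=image, return_tensors="pt")

# Forward pass, then keep the class with the highest logit.
outputs = model(**inputs)
predicted_class_idx = outputs.logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])
# e.g. "Egyptian cat"
```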
And now let's look at one last example, so you are familiar with working with the Hub. The last example I want to show you is a text-to-image model. Let's click on Multimodal and then Text-to-Image; here we can generate images from a prompt, and you find different models for text-to-image generation. You might have heard about the Stable Diffusion model, and here you find different variants of it, so you can try out different ones. I want to click on the one by Runway ML, Stable Diffusion version 1.5. Again you find a model card, and on it a small code snippet; this is actually all it takes to run this, but in this case it uses another third-party library, called diffusers. So let's go to the Colab, insert a new code cell, and paste the code in. Before we run it, we have to install the diffusers library. Let me quickly open it on GitHub: here you find information about the diffusers library, which is also provided and implemented by Hugging Face and, of course, the awesome open source community. You also find the install command there: we have to say pip install --upgrade diffusers, then also transformers (we already did this here), and then the accelerate package. So let's run this.

Now that this is installed, we can quickly go over the code; it's very simple here as well. We import the pipeline, and the difference here is that this pipeline is not implemented in the Transformers library itself; for this one we need the third-party diffusers library. Then again we set the model name that we get from the Hub and apply the .from_pretrained method with the name; here we also specify the data type we want to use. Then we move the pipeline to the CUDA device, the GPU, so like I said, make sure to use the GPU runtime, since we get the GPU for free. Then we simply write our prompt; here we want a photo of an astronaut riding a horse on Mars. We pass this to the pipeline, access the images, and then we can save the result; we can also simply write image at the end of the cell so the notebook displays it. Now let's press Shift+Enter and run this. We get an output, and indeed we get an image of an astronaut on a horse on Mars. So this is working as well, super cool. (A sketch of this snippet follows after these captions for reference.)

Now you should be familiar with the model hub and how you can use it on your own with different pre-trained models for different use cases. And like I said, you can also adjust those pre-trained models to your own use case if needed. This is called fine-tuning, and I want to briefly go over how the approach works. I won't go into detail here; we have another tutorial on our channel that I can link, so have a look at that if you want to learn more. I can also point you to the awesome documentation, where there is even a Colab that walks you through it step by step. The general approach is that first you prepare a dataset; you can use one from the datasets module or, of course, prepare your very own. Then you set up the pre-trained model from the Hub; here, again, you use .from_pretrained and the model name. Then, somewhat later, you train the model: there are some more preprocessing steps, and at one point you use the TrainingArguments and then the Trainer class and simply train your model. So the general approach is: prepare your data, select a pre-trained model, and then train the whole model. And again, check out the Colab that walks you through it step by step. (A rough sketch of this workflow is also included after these captions.) This is what fine-tuning does; it is a very powerful technique in AI, used quite a lot, where you take a pre-trained model, adjust it to your own use case, and train it again with another dataset.

So now you are familiar with the Hugging Face model hub. I hope you enjoyed this tutorial! In the next lesson we take another, different approach: if you don't want to build your own models and also don't want to use a model hub, a second approach where you don't have to run your own models is to use an API. We will have a look at different AI APIs and how to work with them. See you in the next lesson!
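For reference, here is a sketch of the text-to-image cell walked through above, following the runwayml/stable-diffusion-v1-5 model card:

```python
# Install first in Colab: !pip install --upgrade diffusers transformers accelerate
import torch
from diffusers import StableDiffusionPipeline

# Checkpoint named in the video: Runway ML's Stable Diffusion v1.5.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")  # move the pipeline to the free Colab GPU

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
image  # as the last line of a notebook cell, this displays the image
```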
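And here is a rough sketch of the fine-tuning workflow described above. The dataset and checkpoint names are illustrative placeholders loosely following the Hugging Face fine-tuning tutorial, not the exact Colab shown in the video; it also needs the datasets package (pip install datasets):

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# 1) Prepare a dataset (placeholder: any text classification set works).
dataset = load_dataset("yelp_review_full")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize(batch):
    # Preprocessing step: turn raw text into token IDs.
    return tokenizer(batch["text"], padding="max_length", truncation=True)

tokenized = dataset.map(tokenize, batched=True)

# 2) Set up a pre-trained model from the Hub, again via .from_pretrained.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=5
)

# 3) Train with TrainingArguments and the Trainer class
#    (a small subset keeps the sketch fast).
training_args = TrainingArguments(output_dir="finetune_output")
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"].shuffle(seed=42).select(range(1000)),
)
trainer.train()
```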
Info
Channel: AssemblyAI
Views: 11,162
Id: gjEaz5FMokI
Length: 20min 41sec (1241 seconds)
Published: Sat Mar 11 2023