GPT-Neo Made Easy. Run and Train a GPT-3 Like Model

Video Statistics and Information


Here's the corresponding blog post for those who prefer reading: https://www.vennify.ai/gpt-neo-made-easy/

Captions
Today we'll be covering how to implement and train GPT-Neo with just a few lines of code. For those of you who don't know, GPT-Neo is an open-source replica of GPT-3. The largest model currently available has 2.7 billion parameters, which makes it close to double the size of the largest GPT-2 model. It is also about the same size as the smallest GPT-3 model offered through OpenAI's API, which is currently in beta. I want to note that even the largest GPT-Neo model available is still significantly smaller than the largest GPT-3 model, which has 175 billion parameters. But even if they did release that model, you would need very expensive hardware to run it.

First off, we have to change the runtime type. If we head over here, we can do "Change runtime type". If you are using a free Google Colab instance, I recommend you use the standard GPU version, and then you'll only be able to run the smallest GPT-Neo model that's currently available. But if you're using an upgraded Pro account like mine, then you can use the high-RAM version, which will allow you to run a larger model.

Now we're going to pip install Happy Transformer with the following command. After it has finished installing, we can import a class called HappyGeneration: from happytransformer import HappyGeneration, and hit run. Now we'll create an object using our newly imported class. We'll name this object happy_gen. This class requires two positional inputs for us to begin using GPT-Neo. The first one is the model type, and for this we'll put "GPT-NEO". The second one is the model name, and for this we'll head over to huggingface.co and type "gpt neo" into the search bar. Here are various models with the GPT-Neo tag, and we'll see that there are three main ones. If you are using a free Google Colab instance, I recommend you use the smallest model to avoid a crash. If you are using a Pro instance, then you can use the second-largest one or second-smallest,
depending on whether you're feeling optimistic or pessimistic. We can press this button right here to copy the name, paste it in, hit run, and wait.

We can now begin generating text using our happy_gen object. We'll save the result into an object called result, and we'll call a method called generate_text. From here we only have to provide it with a single input, which is a string. This string could be a phrase, a sentence, or an entire paragraph; whatever text input we give it, it will attempt to continue. Let's give it something interesting, maybe "To solve world hunger we must invest in". We'll print the result. The result is a GenerationResult data class with a single variable called text, so we can print just the text variable like so. The output is "the future of agriculture. We must invest in the future of agriculture", and it just keeps on repeating this sentence.

We can now modify the generation algorithm we use. To do so, from Happy Transformer we will import a class called GENSettings, and we will use this to create an object called args. From here we can easily modify the default settings. For this, we will increase no_repeat_ngram_size from 0 to 2.
Going forward, we can copy this text here, and under the args parameter we can include the args we just instantiated. Hit run, and we'll print the result: "the future of agriculture. We must also invest more in the future of food and agriculture in general. We must invest more to improve the quality of the food we produce." I would say that is a lot better than what we generated above.

We can now change the text generation algorithm we use. We'll head over to happytransformer.com/text-generation/settings and scroll to the bottom. Here we'll see the settings for different algorithms. By default, an algorithm called greedy is used, which is very simple: it just predicts the most likely word to follow the input, one word after another. There are more sophisticated algorithms you could use, which often give better results, especially if you're looking for creative text. In this tutorial we will cover how to implement top-k sampling. To do so, just copy this line here and bring it back to our Colab. Notice how the max length is only 10.
Let's just use the default max length so we don't have to worry about that. Hit run, copy this, change the args to this args, add a new code block, and we'll print the result: "the technology to solve hunger itself. What we eat, where we eat, how we eat: that's what we need to do to eradicate hunger", and then it continues. One cool thing about top-k sampling is that it is non-deterministic, so each time we run it, it will give a different result. Let's see if this one's better: "our poor. This is a message of the United Nations' first-ever global hunger report. It is also the basis for the world food security plan launched on 11 December 2008." We'll give it one more run: "our planet is currently starving. We have created artificial solutions and we will continue to do so. We must invest in renewable energy and other solutions such as reforestation, solar power, and farming." I would say this one was the best, other than whatever it outputted here. But let's move on to training.

Training a GPT-Neo model is incredibly simple using Happy Transformer. All we need is a text file that contains nothing but the text we wish to train the model with. Then we drag it into the file structure like so, and now we can access this training data from our Google Colab instance. From here we type happy_gen.train and then the name of the file, which in this case is train.txt. But before we run it, we will have to downgrade the size of our model from the 1.3-billion-parameter model to the 125-million-parameter model. Then, after it has successfully downloaded, we can fine-tune the model and run it as we were before. happy_gen now represents a smaller GPT-Neo model, so if we go back to this cell right here, we can run it without it resulting in a crash.

We can also import a data class called GENTrainArgs like so, then instantiate it like this and modify the various training parameters. Here's a list of them, and you can get a more detailed list by going to this URL under the
happytransformer.com website. One common thing we may want to adjust is the number of training epochs. If we go back here, we can paste that in, and instead of using three, which is the default, maybe we only want to use one. There we go; we can hit run. Of course, like I said, there are other learning parameters you can modify. When we go back here, we can create a new code block and this time set args equal to args. Hit run, and there we go.

Thanks for watching! There's a link down below to this Google Colab, along with an article published on my website that covers all of the same content, but in written format.
Info
Channel: Vennify AI
Views: 7,052
Rating: 4.8775511 out of 5
Keywords: GPT-3, GPT Neo, NLP, Natural Language Processing, GPT 2, Transformers, Happy Transformer, Eleuther, Python, artificial intelligence, AI, How to, Tutorial, Train
Id: GzHJ3NUVtV4
Length: 10min 22sec (622 seconds)
Published: Sat May 08 2021