I tested a STACK of FREE Large Language Models...here's how it went.

Captions
This large language model app can answer questions, complete tasks, summarize text, and write and run Python code. And the real kicker? It's free. It's not using the OpenAI API; it is completely open source. In this video we're going to be speed-building the app and stacking a 7 billion parameter model, a 13 billion parameter model, and a bunch more up against OpenAI to see if they can even get close.

One of the easiest ways to get access to open source large language models is through GPT4All. Through it you can use a chat-like interface whilst running the models locally on your machine. The upside to using the GUI is that it'll actually download the model weights to your machine, which means that you can then use them inside of LangChain. There are GUI-based installers available via the GPT4All website. I've also included detailed steps on how to run the code if you want to get it up and running yourself; I'll get to that a little bit later.

So first up, I downloaded and installed the GPT4All GUI to my Windows machine. The first time I did this I wasn't paying attention, so I had no idea where the models would be downloaded to. Pay special attention to the default download folder for the weights. If you breeze through it like I did, you can bring it back up by checking the download path; in my case they were saved here. I downloaded the LLaMA 13 billion Snoozy model and a couple of others to put them to the ultimate test: would these actually stack up against the big guy?

Now that I've got some model weights downloaded, I'm going to whip up an app using Streamlit. This will allow you to interact with the different LLMs through a simple user interface. We're going to blast through this pretty quickly because it's pretty similar to what we did inside of the LangChain crash course. In the LangChain crash course we started out by creating a file called app.py. This is where our LLM application is going to live. Now, the first thing that we need to do is import streamlit as st. This
is going to give us our app development framework. Let's add a comment, "app dev framework". Then what we want to do is have a title, so we're going to say title... actually, we don't need to set it to a variable, so st.title. I'm going to plug in an emoji because we've got to make it look sweet, and we're going to title it "GPT For Y'all". Y'all... we should probably escape that. Perfect.

All right, cool, so this is the title. Let's start up our app just to make sure this is working; I can't imagine this is broken right now. We can run streamlit run app.py, and this will open it up in a new browser window. Perfect, so we've got GPT For Y'all happening right now. That is our title. If you wanted to change that, you could change it to whatever you wanted, something a little bit more appropriate for a business application.

The next thing that we want to do is include a place where we can actually pass through a prompt. So we're going to create a variable called prompt; this is the prompt text box. We're going to set that equal to st.text_input. If you had bigger text inputs you might choose to go with text_area; I found text_input works pretty well. We're going to include the label for this (let me make sure my head is not covering that; now we're good), so we're going to say "plug in your prompt here". If we save and refresh our app, take a look: we've now got somewhere that we can type in a prompt. So I could say, "hey, how's it going?" Eventually, when we hit Enter on our keyboard, we want to do something. Right now it's not doing anything: if I change the text and hit Enter, you can see it running up there really briefly, but we actually need to activate a trigger once we hit Enter. That's relatively easy: if we hit Enter, do this. So we're going to say if prompt, and then
all we need to do, for right now (we don't actually want to go and import anything yet, we'll get to that), is write st.write, and we're just going to write out our prompt so we can at least see that we're doing something and getting a little bit of feedback. So we're saying: if we hit Enter, then we do whatever's here. That is our baseline app. Let's go and test it out. If we refresh, you can see it's already printing out our prompt there. If I change it to "yo yo yo", you can see that we are outputting that there. That at least gives us the shell for our GPT4All app. So we've got an app that we can interact with, but we haven't actually done any LLM stuff yet. Time to bring the thunder.

First up, importing LangChain dependencies. We've got the shell up and running, but right now we haven't actually done anything with LangChain. This is where that changes. We're going to jump back into our app and first up import a couple of dependencies. We're first going to import the GPT4All class: from langchain.llms we're going to import GPT4All. This is going to allow us to actually leverage our GPT4All weights. We also want to import the prompt template, so from langchain... actually, we don't need to go to a submodule, we're just going to import PromptTemplate (did we spell that right?) from langchain. There we go, PromptTemplate, and we're going to import LLMChain. The prompt template is going to be used for prompt formatting, and the LangChain LLMChain is going to give us a chain that we can execute. Perfect.

The last thing that we want to do is set up a variable to hold the path to our weights. We're going to create a variable called PATH and set that equal to the folder that you've set up your GPT4All GUI to download weights to. In my particular case
it's inside of Users, user, AppData, Local, nomic.ai, and GPT4All. If I go to that path, I'm going to grab a specific set of weights. Let me show you: you can see I've got a couple downloaded already. If you haven't downloaded any, you're probably not going to see any .bin files there, but you can choose which model you want to use. Let's say, for example, I wanted to use LLaMA 13 billion Snoozy: I could copy that entire file name and then paste it into our path at the end. This gives us the full file path to our weights.

Now remember that in the documents video we used the OpenAI LLM module. The core change here is that we're now going to use the GPT4All class. LangChain supports a range of different LLM sources, which makes it pretty straightforward to swap them out as needed. I tried doing this initially with Hugging Face Hub but noticed that inference was taking quite some time. If you'd like to see me do a video with Hugging Face Hub, let me know in the comments below. For now we're going to stick with GPT4All, though, because that's what I promised in this video.

So the first thing that we need to do is create an instance of our LLM. We're going to say llm is equal to GPT4All, and we need to specify the model; to that we pass through our path. We're also going to set verbose equal to True. There are a bunch of other parameters that you can set; this is the absolute minimum that you need to write in order to get GPT4All up and running.

What we then want to do is create a prompt template. I went into a bit more detail on prompt templates in the crash course as well as the documents videos. So for now we're going to say prompt is equal to PromptTemplate, and we want our input variables; we're just going to set it to question,
and then our actual template; we're going to set that equal to a multi-line string. We're going to say "Question:", and then we can include our question there, and then we'll say "Answer: Let's think step by step." You could tweak this template with a little bit of prompt engineering. Cool, that's our template set up. Again, you could play around with this depending on how you wanted to structure your prompts. I've kept it reasonably simple: we've just said question is equal to the question that we're going to be passing through to our prompt template, and then the answer is going to start out by saying "Let's think step by step."

Then all we really need to do is set up our LLM chain, and this is going to bring our LLM, which is our GPT4All model, and our prompt together. So let's go and do that. We're going to say our chain is equal to LLMChain; our prompt is going to be equal to our prompt template, which we just created over here, and our llm is going to be equal to our GPT4All LLM, so we can paste that in there. Beautiful, and that is our LLM chain now done.

The baseline LLM chain was now set up. All we needed to do was pass through the prompt using the run method. Stack them up.

So we need to put some of this into action, right? We've defined all the stuff, but right now we're not actually doing anything with that prompt. We need to send it to the chain that we've got over here. Super simple, because we've already set up the structures; all we need to do is update our st.write area. Well, first up we need to pass the prompt to the LLM chain, and the way that we're going to do that is we are going to run chain.run, and we are going to be
passing through the prompt that we've got over here to that. That is going to return a response, so we're going to store it inside of a variable called response, and we're going to set that equal to whatever we get back from our chain. Then we just need to make sure that we write it back to our screen, so if I grab that response and pass it through to st.write here, that should effectively give us exactly what we need.

Let's quickly walk through the flow. We've got our LLM and we've got our prompt template; both of those are sent through to our LLM chain. Then, if our user types in a prompt and hits Enter, we trigger anything that happens down here. What happens down here is we pass the prompt to our LLM chain, which we set up over there, and then we write out the response that we get back from our chain to the screen.

The model weights that we defined in our PATH variable pointed to the Nomic AI Snoozy 13 billion parameter model. If we prompt our app, let's say asking about the fastest car in the world, we should get back a response, and what do you know: it's identified the Bugatti Chiron Super Sport 300+ as the fastest car in the world. Just a tad faster than my 20-year-old RAV4.

Now, rather than just looking at one model in isolation, I wanted to see how some of these open source GGML models stack up against each other and OpenAI. I designed six tests to put our models through. The first one: basic chat Q&A. Side note, I'll share the code for the comparison app a little later. Let's establish a baseline with OpenAI. If we ask about the difference between nuclear fusion and fission (shout out to James Spriggs for the info), we get a pretty coherent response. OpenAI's text-davinci-003 calls out fusion as combining two or more atomic nuclei and fission as splitting. Any nuclear scientists out there, let me know whether or not this actually checks out. Now, what if we ask MosaicML's 7 billion parameter,
commercially licensable MPT Instruct model? To be perfectly honest, it doesn't look too bad. It still calls out the main idea around splitting and fusing, albeit with a focus on astrophysics. The thing is, though, this was an absolute pain in the, ah, buttocks to get up and running. You see, when I used the base prompt template that I wrote, I got responses that were about as good as me giving a speech after a multi-pint pub session: gibberish. After banging my head against my desk for two and a half hours, I went for a run and figured maybe I could reverse engineer how the GPT4All GUI app worked. Looking at the underlying GPT4All library, I noticed that the chat completion method used a prompt template which clearly delineated instructions, prompts, and responses. So I updated the prompt template to do the same, and, well, I'll show MPT's new sunflower poem in a sec.

Asking the same question to the Nomic AI trained, 13 billion parameter, non-commercially licensed, LLaMA-inspired model called Snoozy (oh my God, that's quite a mouthful), we got great responses as well: short, simple, and to the point, and very similar to what text-davinci-003 generated.

What if we wanted to write an email telling our customers about a sale, though? OpenAI's text-davinci breezed through pretty quickly, even mentioning a 25% discount. MPT did pretty well here as well, creating a practical and well-structured template with placeholders for the customer's name, your name, and company name. There was a weird set of brackets and a dollar sign that it generated at the start of every response, so if you used this model in isolation this would be pretty easy to strip out with some string slicing. Snoozy did even better at this and even offered up a discount code to our customers.

And speaking of discounts, you can get 50% off my full stack machine learning course, Courses From Nick, for the next week. In it you'll learn how to build production-grade machine learning models from the ground up, and I'm about to add a LangChain project to the
course next week. Grab the course now and you'll get access to the videos as I release them. If you're already a student, you'll get immediate access. And if you buy it and you don't like it, don't stress: just ping an email to Nick (coursesfromnick.com) within 30 days and I'll give you a complete refund, no questions asked, as soon as you send that through. If you've got any questions, shoot me an email at that same location and I'll get back to you.

Now, how about a poem? Text-davinci is a modern-day Shakespeare, so asking it to write a poem about sunflowers was a sure thing. It goes into a literary exposition of those towering golden Helianthus. MPT, however, began hallucinating and mistook sunflowers for sunnies. Maybe there's a metaphor somewhere there. I'm not so sure about those results, but it is an interesting read; you let me know. Snoozy comes through with the goods, though, even rhyming throughout. And boom, we now have an LLM app using open source models. We're going to call it NopenAI, t-shirts coming soon.

Interestingly, when you download the models through the GPT4All GUI, there's information about whether the models are commercially licensable, which guides you as to whether or not you can bake these into your startup or business app.

I wish I could stop while I'm ahead, but I can't, so we're going to take this further. I originally planned on building a trading integration with an algo trading platform, but travel has been kicking my butt. Let me know what other LLM ideas you've got. So I'm going to refactor the app to use the Python toolchain with an open AI app. All we really have to do is swap out the LLM chain with a Python agent. Before we do any swapping out, we need to import a couple of additional things. So we're going to go from langchain.agents.agent_toolkits, and we are going to import the create_python_agent function (where is it... create_python_agent, perfect). Then we also want to import the Python toolchain, so let's say Python
toolchain imports. So we're going to go from langchain.tools.python.tool, and we're going to import the PythonREPLTool. Perfect, so that will give us the two main dependencies that we need to get our Python app up and running. While we're at it, we can get rid of the PromptTemplate and LLMChain imports because we're not going to use those anymore, so I'm just going to delete those. We could also comment them out if we didn't want to delete them completely. And we can get rid of our LLM chain and our prompt template. Keep in mind that this is kind of optional; you don't need to go down this route, but I do like the agent executor type toolchain from LangChain.

So, create a Python agent: we're going to say python_agent is equal to create_python_agent. All we need to do is pass through our llm, and we also need to set the tool, and we are going to set that equal to our PythonREPLTool. This means that you can not only get it to write Python but actually trigger Python via this LangChain agent. I think it's personally absolutely amazing. We're also going to set verbose equal to True. Then all we really need to do is swap out our chain down here with our Python agent. Now we should be able to use our Python agent as opposed to our basic LLM chain.

Before dumping the open source LLMs, I had two more tests. Dropping a block of text on Formula One over to OpenAI got us a pretty neat summary of the motorsport racing event. MPT Instruct seemed to have reached its limit here, although I'm not completely sure whether this was due to an incompatibility with the GPT4All class or something else. I'm sure with a little fine-tuning it could probably work, but I did try varying hyperparameters, adjusting temperature and sampling settings; it just seemed a bridge too far. It was generating Arabic characters and started hallucinating, referring to a death in the sport. Was this it for open source
LLMs? Nope. Snoozy came through with the goods, generating an abstractive and coherent summary of the Wikipedia extract. At the same time I tested out another LLaMA 13 billion parameter derivative, Vicuna, this time trained by some prominent US universities, again generating a pretty short and sweet summary. It was killing me that MPT wasn't quite working. Along the way I did try the base and chat models with the GPT4All class. The chat model didn't seem to return results (this is probably a work in progress), and to be honest they seemed to have the same issues as Instruct, going a little bit wild.

One of the things I've been testing out at work has been few-shot prompting, so I figured, hey, let's give it a crack here. The test case was to look at a sequence of numbers and evaluate a condition. Here the prompt outlines the numbers 15, 32, 5, 13, 82, 7, and 1 and notes that the odd numbers in the group add up to an even number. Ideally we want our model to respond true or false. 15 plus 5 plus 13 plus 7 plus 1 adds up to 41, which would render this condition false. And boom, text-davinci comes through with the goods. Snoozy, however, got caught snoozing here and unfortunately couldn't replicate the results using the same prompt. Admittedly, this chain-of-thought reasoning is quite complicated, even for most modern LLMs. Vicuna, however, managed to get the sum right, which to be honest is kind of amazing. Unfortunately, its chain of thought led it to the wrong conclusion. Still, not a bad effort.

This brings us to the final test: using the LLMs inside of a Python agent with self-debugging. I wanted to see if OpenAI could calculate the 12th number in a standard Fibonacci sequence. Now, this one's up for debate: the 12th number in a Fibonacci sequence would be 89 if you started from zero, which from my two-minute Googling seems correct, but it would be 144 if you started from one. Without clarifying, text-davinci comes up with 144. Right? You let me know. The big Snooze dog, however, after 17
minutes of setting my CPU on fire, came through with 89. I know, crazy. Mind you, even though it got the answer right, it took so long running on CPU that I managed to make lunch while it was recording this demo snippet. I did see there's a way to run GPT4All on GPU, so maybe next video.

All the code to get this up and running is available via my GitHub account in the description below. I've included really detailed instructions inside the README as well as the requirements.txt file, so you know exactly what libraries and versions I used. I'd love to know if you're building any interesting LLM apps. If you do manage to build them and create a video, make sure to tag me on Twitter and/or LinkedIn; I'll include my handles there. I'd love to see what you're getting up to. And if you're keen to continue your LangChain journey, check out the LangChain crash course that we did up here.
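For readers following along without the video, here is a rough sketch of what the question-and-answer prompt described in the walkthrough looks like once it is filled in. It deliberately uses plain `str.format` as a stand-in for LangChain's `PromptTemplate`, so it runs without any third-party packages. The exact whitespace of the template and the helper name `render_prompt` are assumptions for illustration, not taken from the video's code.

```python
# Stand-in for the prompt template described in the walkthrough.
# The wording follows the video; the exact line breaks are an assumption.
TEMPLATE = """Question: {question}

Answer: Let's think step by step."""


def render_prompt(question: str) -> str:
    """Fill the single {question} slot (hypothetical helper, not a LangChain call)."""
    return TEMPLATE.format(question=question)


if __name__ == "__main__":
    # Shows the full text that would be handed to the model for one prompt.
    print(render_prompt("What is the fastest car in the world?"))
```

In the app itself this formatting step is handled by the prompt template and the chain, so the user only ever types the question.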
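The two numeric tests in the comparison (the odd-number sum and the 12th Fibonacci number) are easy to check by hand. The sketch below does so in plain Python; the function names are hypothetical and exist only for this illustration, and the "position" convention (counting the first term as position 1) is an assumption that matches how the video counts.

```python
def odd_sum_is_even(numbers):
    """Return the sum of the odd numbers and whether that sum is even."""
    odd_total = sum(n for n in numbers if n % 2 == 1)
    return odd_total, odd_total % 2 == 0


def fibonacci_term(position, first=0, second=1):
    """Return the term at a 1-indexed position, given the two starting terms."""
    a, b = first, second
    for _ in range(position - 1):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    # Few-shot test: 15 + 5 + 13 + 7 + 1 = 41, which is odd, so the claim is false.
    print(odd_sum_is_even([15, 32, 5, 13, 82, 7, 1]))  # (41, False)
    # Python agent test: the answer depends on where the sequence starts.
    print(fibonacci_term(12, first=0, second=1))  # 89
    print(fibonacci_term(12, first=1, second=1))  # 144
```

Both 89 and 144 are defensible answers to the Fibonacci question, which is why the prompt would need to state the starting terms to have a single correct result.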
Info
Channel: Nicholas Renotte
Views: 54,137
Keywords: machine learning, python, ai
Id: 5JpPo-NOq9s
Length: 17min 27sec (1047 seconds)
Published: Fri Jun 02 2023