Installing Ollama to Customize My Own LLM

Video Statistics and Information

Captions
I'm a professional software engineer with almost a decade of experience working at companies like Uber, Flexport, and in, and today I'm going to show you how to use a tool called Ollama to get almost any large language model up and running on your own machine in under 5 minutes. You don't need to worry about paying for an OpenAI subscription, nor do you have to worry about paying for any sort of cloud hosting. And if you stick around until the very end, I'll show you how to use this tool to customize your very own model. So let's get to it.

Hi, my name is David, and welcome to Decoder, a brand new channel all about machine learning, large language models, and more. Ollama is an open source tool created by an ex-Twitter, Docker, and Google engineer named Jeffrey Morgan. It allows us to easily download and run models, chat with them in our command line, serve them over an API, and even create and customize models with our own system prompts and attributes.

Installation is really easy: you just go to ollama.ai and click download. Then once you run the installer, you should see the llama icon in your menu bar, right up here. From there, we're now able to use the ollama command in our terminals.

Okay, so now that we have Ollama installed, let's explore it a bit. Try typing ollama into our terminal. So this shows us basically what the help command is. It shows us different commands that we're able to run, including help itself. Let's start off by just listing the models that we currently have installed, and it looks like there's nothing here. So I guess a good place to get started is by downloading our first model.

Let's pull a new model from the library on the website. So to do that we just navigate back to the website, go to models, and here you should see a bunch of models that may sound familiar to you. Mistral has been pretty popular, Mixtral is another one, but I'm going to use a model that I like called phi, by Microsoft. I like phi because it's really small but really capable, so regardless of what your system is, it should
probably be able to run this. As we scroll through this page, we see it gives us some example prompts, how to use it in the API, and some other different use cases like code completion or text completion. So I'm just going to copy this command right here that they gave us and paste it in. So the first thing that it's going to do is look up the manifest and actually download the file. This one's pretty small, but some models can get up to even like 50 gigabytes, so depending on what you're trying to do, be prepared to wait for a while for that.

Okay, and so now we are chatting with our new model that we just downloaded. I'm going to ask it, what is water made of? And wow, that response was super fast. So it told us water is composed of two hydrogen atoms bonded with one oxygen atom, and that is correct. That was really good, but one thing to call out is that smaller models like this one can run really, really quickly; however, they may struggle with some even basic questions and may get off topic or even hallucinate. But now you know how to explore new models, download them, and run them all on your own, so it's a huge first step for us.

In addition to chatting with our model over the command line like we were just doing, Ollama also exposes an API so we can interact with our model programmatically as well. I'm going to use a tool called curl that allows me to make HTTP requests to arbitrary endpoints. So to break it down: here's the tool, curl; here's the endpoint that we're calling, this is the one that's exposed by Ollama; and then this flag, -d, here just means that we're passing data, and the data that we're passing are just arguments for Ollama. So we're telling it which model we want to interact with, which is phi; our prompt, what is water made out of; and stream false just means that instead of streaming the response back to us one word at a time, we just want everything all at once. And then jq just helps me to format it better. And here's our response: we got a bunch
of stuff back, but the main thing that we care about is this actual response element that says water in its most common form on Earth is made up of two hydrogen atoms bonded with one oxygen atom. So again, that's correct. But if we look at some of this other stuff here, it's not useful to us now, but if we were running our own web app I think this stuff could be very handy for us. But we'll keep that in mind for the future.

So leaving curl and going back to our chat, let's exit the chat and try some different things. So you can just exit the chat with Ctrl+D, and now we're in a clean session here. So let's ask Ollama again what we can do with it: ollama help. And in this case I am curious about learning more about the model that we just downloaded, so let's do ollama help show. So this tells us we can run show with a model name and then some flags. Now these flags are interesting, but the main one that we care about is the Modelfile, because the Modelfile itself actually contains the parameters, the system message, and the template. The Modelfile essentially defines what a model is, how it works, and where it lives. So let's do that now: ollama show phi --modelfile.

All right, and so this is what phi's Modelfile looks like. Let's take this line by line, because there's a bunch of stuff here. On the top, this is commented out; if you're familiar with Python, that's kind of the syntax that we're using. And so it says this Modelfile was generated, and if we want to build a new Modelfile, which is a very interesting thing, we can basically just replace the FROM line with this line that it gives us right here. And then everything beneath that is what we're going to be looking at. So we see a couple major section headers here: we see FROM, TEMPLATE, SYSTEM, and PARAMETER. FROM seems to just give us the sort of like location of the memory on our computer where the blob that we downloaded, that 1.6 gigabyte thing, is, just where that lives on our machine. And then for parameters, we're going to ignore that, 'cause
that's a bit more advanced, but what we really care about here is the template and the system prompt. If you're familiar with prompting, basically there's a couple different inputs that you can have. You can have the user prompt, which is like the question that I just asked, but you can also have the system prompt, and that system prompt kind of sets the scene. It defines to the model what it is and how it's supposed to interact. So if we look at our system prompt here, it says: a chat between a curious user and an artificial intelligence assistant; the assistant gives helpful answers to the user's questions.

And then in our template, the template composes different types of prompts together into one sort of like message that it starts the chat off with. So here we have this if block. It says if there's a system prompt, then cue up the system and then input whatever that prompt is. Next it says here's the user speaking, the user says whatever the user prompt is. And then finally it says now it's the assistant's turn: okay, phi model, you can take over and speak as the assistant.

So I think it'd be really interesting if we took that Modelfile and customized it the way that we wanted to. So what I'm going to do is start by copying that Modelfile that we just looked at into our own file. So here we go: show phi --modelfile, which is exactly what we were just doing, and then this little caret means that we copy into a new file, and I'm going to name our file our model file, because what I want us to do is to make a custom model that talks like a pirate, because I think pirates are more fun than robots, don't you?

Okay, and so now if we look at our system, we see that we have this new model called our model file. I'm going to open that up in my editor, which is right up here, so let's hide this. So now here we are in our text editor and we just have exactly the same content as we had before, which is what we expect. But what we want to do is to change the system prompt to not just be an artificial
intelligent assistant, but to be an artificial intelligent assistant that speaks like a pirate. Okay, so let's save that, and then we can actually just exit out of it and return to our nice little terminal session here.

Okay, and so now for the fun part. Let's go back to our terminal, let's use ollama help again, and what we're looking for here is create. So ollama help create, and it looks like the command is ollama create, the model name, and then the flags. And the only flag that we really care about here is this file, and we're going to be passing in the file that we just created. Okay, so we're going to paste in our command, and it's going to do its process: it's going to transfer the model data, it's going to see what it can get rid of, what it can reuse, and here we go, our command ends in success. So that should mean that we have a new model. So if we do ollama list, then this is perfect. So we see we have our original phi model that we downloaded, but we also have this new model that we just defined, RFI. So let's see if it works.

Okay, and so now we're going to run our model file and ask it the same question: ollama run RI, what is water made of? "Ahoy there, matey! Water, as you know, is one of the three stages of matter." It does actually define it as two hydrogen atoms and one oxygen atom. So there we go: we have a model that both answers us correctly and also does what we tell it to and talks like a pirate.

So now that you know how to use Ollama, I'm really curious to see what you do with it. What are your favorite models to play around with? What sort of prompts have you found most effective? As you've gotten into it, are there any sort of like settings or configuration that you think that I should talk about in upcoming videos? In future videos I'm going to build on this Ollama knowledge, both to dig deeper into how Ollama works behind the scenes and how we can customize it, but also how we can use Ollama to build up and power things like document chatting and things like that. Thanks for joining me for my
first ever video. To see what else I'm working on, please be sure to like and subscribe.
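
For anyone following along in code rather than at the terminal, here is a minimal Python sketch of two steps from the walkthrough: the curl call to the API and the system prompt edit in the Modelfile. It assumes Ollama is serving on its default local address, http://localhost:11434, and that the Modelfile has a SYSTEM instruction written in uppercase the way ollama show prints it. The function names are made up for illustration; they are not part of Ollama.

```python
import json
import re
import urllib.request

# Assumption: Ollama's default local address. Change it if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Matches a SYSTEM instruction, either triple-quoted (possibly multi-line) or a single line.
_SYSTEM_RE = re.compile(r'^SYSTEM\s+(?:""".*?"""|.*?)$', re.MULTILINE | re.DOTALL)


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build the POST request that the curl command in the video sends."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request to a running Ollama server and return its "response" field."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as reply:
        return json.loads(reply.read())["response"]


def with_system_prompt(modelfile_text: str, new_prompt: str) -> str:
    """Return Modelfile text with its SYSTEM instruction replaced (or appended if absent)."""
    replacement = f'SYSTEM """{new_prompt}"""'
    # A function is used as the replacement so backslashes in the prompt are kept literally.
    updated, count = _SYSTEM_RE.subn(lambda _match: replacement, modelfile_text, count=1)
    if count == 0:
        updated = modelfile_text.rstrip("\n") + "\n" + replacement + "\n"
    return updated
```

With the server running, generate("phi", "What is water made of?") should return the text of the response field. The text returned by with_system_prompt can be saved to a file and passed to ollama create with the file flag, as in the video.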
Info
Channel: Decoder
Views: 11,345
Id: xa8pTD16SnM
Length: 9min 19sec (559 seconds)
Published: Tue Jan 16 2024