Hello everyone. Today I want to show you how you can run these large language models locally by downloading them onto your local machine and using your own hardware. This method works with almost all of the open-source models, such as Vicuna, Alpaca, LLaMA, Dolly, WizardLM, StableLM, you name it.

The tool we are going to be using is called text generation web UI, from oobabooga. It's an amazing tool which has support for a number of different models: you can simply add a model to it and use the same web UI. Their goal is to become the AUTOMATIC1111 of text generation. If you have played around with Stable Diffusion, there is an amazing tool called AUTOMATIC1111 that lets you use multiple Stable Diffusion models in a single web UI.

This web UI has some really amazing features. As I said, it has a graphical user interface, and you can set it up so that it looks like OpenAI's Playground. There are also role-playing capabilities, so you can set up different characters and do role-playing, which is pretty neat. As I alluded to before, it supports instruct mode for the different models that you see. Another great feature, which I'm going to show you in a future video, is that you can integrate this with Stable Diffusion, so you can send images, and you can generate audio responses from your text generation using the ElevenLabs API or any of the other text-to-speech converters.

The great thing about this tool is that you can run it in both CPU and GPU mode. If you have a powerful GPU, that's amazing, but if you want to run it on a CPU you can still do that. It also supports 4-bit GPTQ mode, which is helpful if you have weaker hardware.

In this video I'm going to walk you through the installation process of the text generation web UI, and then I will show you how you can download and use models
locally. For this specific example we are going to be looking at the newly released Wizard-Vicuna LM. For this model, they combined the Vicuna model, which was at around 90% of ChatGPT quality, with the newer WizardLM, which introduced a new instruction fine-tuning technique. Even though WizardLM is a very small model, it shows amazing results and performance. I have separate videos on both Vicuna and WizardLM, and I would recommend you check those out.

Before looking at the installation of the text generation web UI, let's quickly go over what this Wizard-Vicuna LM is. This is a model which claims to improve on Vicuna's performance by around seven percent, so they now claim almost 97 percent of ChatGPT's performance, which I think is a big claim. To achieve that, they combined the approaches of WizardLM and Vicuna with a new approach for instruction following. In this case, you use a large language model to create more complex instructions for training your model.

Here is an example. Let's say GPT-4 was provided with an example conversation between a human and ChatGPT, and then it was asked to create 10 more conversations where a human asks questions and ChatGPT answers. Using this as a template, GPT-4 created a whole conversation where the human asks a question, ChatGPT responds, then the human asks another question and ChatGPT responds again. This whole conversation between a human and ChatGPT is created by GPT-4.

They then took a dataset like that and fine-tuned the model using the method that was proposed for Vicuna version 1.1. Version 1.1 is actually the second version of Vicuna. After training on these conversations, they went ahead and compared the performance of ChatGPT, Wizard-Vicuna, and
the original WizardLM, using GPT-4 as the judge. Let's look at just one example. The way it was done is that GPT-4 was given a question and the responses from these different assistants. Assistant 1 is, I think, ChatGPT, then there is a response from Assistant 2, and so forth. GPT-4 was basically asked to judge each response and score it out of 100. Here are the scores coming out of GPT-4; for example, in this case, 91, 98, 85. It's not really a comprehensive scientific evaluation, but it's a starting point where GPT-4 is used as a judge.

Based on these results, the authors showed that ChatGPT scored on average 90-something, whereas Wizard-Vicuna scored 88, which is not very far behind. The original Vicuna scored 82, and WizardLM, which is only a 7-billion-parameter model, scored [unclear]. That's still a respectable score, but Wizard-Vicuna is very close to ChatGPT, at least on this evaluation dataset. This gives you a rough idea of what type of model we are dealing with here. I might create a separate video on Wizard-Vicuna LM and compare it with Vicuna, but right now let's see how we can run it.

OK, so let's go back to the text generation web UI. If you scroll down, there is an installation section where they provide one-click installers for Windows, Linux, and macOS. Depending on your operating system, simply choose the one that works for you. If you want to install it manually using conda, there are detailed step-by-step instructions as well, so you can go there. To keep things simple, I am going to be using the one-click installer. I'll show you the installation instructions for Windows, but I am running this on a Linux machine because that's where I have a GPU. To install it on Windows, simply download the zip file. The instructions remain pretty consistent across the
operating systems, so if I show you one, you will be able to follow it on another.

OK, so it downloaded the zip file. Next we need to extract it. I'm just going to save it as a second copy because I already have one. After extracting it, go to the folder. You will see there are a total of five files. One of them is a text file, which gives you detailed instructions on how to install it. You need to run the script whose name starts with "start", so this is going to be start_windows, start_linux, or start_macos. First we need to run this. It gives a warning, so it's up to you whether you want to run it or not. I'm going to click on "Run anyway".

This step is going to take some time because it has to download a few files, so be patient. During installation it creates another folder in the same base folder, where it downloads the installation files. Next it asks you what type of hardware you have. If you have an NVIDIA GPU, simply type A; for AMD, type B; and if you're on Apple silicon, type C. For this specific computer I don't have a GPU, so I'm going to type D, but if you have a GPU, just select the corresponding option.

Once the installation is complete, you will see a new folder called text-generation-webui. You can go in there and find a folder called models. You can download models, copy them there, and it will work, or there are a couple of other ways you can do it. Let's say you simply click on start_windows again. Now you're presented with a whole bunch of options. These are pre-populated models; you can select one of them and it will download it. In this case we're going to manually download a model from Hugging Face. There are two ways you can download models from Hugging Face. The first is the one listed here: you simply select L, which means manually specify a Hugging Face model, and it has to be in
the format of the username and then the model name. In this video we want to download the Wizard-Vicuna model. If you search for that model, you will see this second link, which is TheBloke's Wizard-Vicuna 13B GGML. You want to look for a model that has GGML in its name, and then you copy the link. Now we go back to our command window, paste in that link, and press Enter, and it will start downloading the model. Once the download is complete, you can go to the next step.

OK, once the installation is complete, simply run the start script again and it will show you something like this. As I said, I'm on my other machine right now. What it's doing here is loading a model from the models folder, and the model name is TheBloke's Wizard-Vicuna 13-billion-parameter model. There's a .bin file which it's loading. Once the model is loaded, you are presented with this IP address, which is your localhost, so you can go to this address and start interacting with your text generation web UI. Here's the IP address; I added this extra text to show it in dark theme.

If you haven't used the oobabooga text generation web UI before, I'm going to simply walk you through the different features in this introductory video. We are going to be making extensive use of this web UI in upcoming videos.

First, you have the Text generation tab, where you simply type in your text, and whatever model you chose will start generating text. It's very simple. There are different modes: there's an instruct mode, a chat mode, and a cai-chat mode as well. I'm not sure what that last one does, but I usually use chat mode. Then there are the normal options. You can start the generation. You can click Continue, similar to ChatGPT, if it doesn't complete the response. You can regenerate the response, and
then you can even impersonate somebody.

The next tab is Character. Here you can personalize your character or chat assistant. We're going to be looking at this in more detail in upcoming videos.

The next tab is Parameters, where you can set the different parameters of your text generation, and it's very comprehensive. For example, you can set the temperature, which controls the creativity of the model, then the repetition penalty, the top-k setting, and so on and so forth.

Next you have the Model tab, and I think this is the most interesting part. If you have multiple models downloaded, they're going to be listed here and you can choose the one you want. For example, in this case I have a Facebook model and the Wizard-Vicuna model that we just downloaded. Now let me show you how you can add a new model. Rather than going to the terminal or the command-line interface, you can copy the same path, the username and the model name, the way we did for our Wizard-Vicuna model, click Download, and it will start downloading the model.

Let me show you how to download another model using the web UI. I saw there is a new version of WizardLM which is supposedly uncensored, so I'm going to go here and copy this. The description says this is WizardLM trained with a subset of the dataset: responses that contained alignment or moralizing were removed. The intent is to train a WizardLM that doesn't have alignment built in, so that alignment of any sort can be added separately. I'm going to download this just as an example, but we're going to be playing around with the original Wizard-Vicuna for the time being. I might make another video on this one. So simply type in the path and click Download, and it starts downloading your model.

You can even train or fine-tune models here. You can train your own LoRA, and this
is a very powerful approach, so I might show you in an upcoming video how you can fine-tune your own models using this web UI.

The next tab is Interface mode. The text generation web UI takes a lot of inspiration from AUTOMATIC1111, and that's why you will probably see a lot of similarities. You can control how the interface looks here.

OK, so now let's play with the model itself. The first prompt is: "Draft an apology email to a customer who experienced a delay in their order and provide reassurance that the issue has been resolved." You simply type in your prompt, select chat, click Generate, and let's see what the model comes up with. I have an older GPU, I think a 1080 Ti, so that's why it's going to be pretty slow. Depending on your hardware, you might get a different speed.

OK, so here's the response. It says: "We are writing this email to apologize for the delay in processing your recent order. We understand how frustrating it can be when expected delivery times are not met, and we sincerely apologize for the inconvenience caused." It's the kind of normal email you would expect from a customer support team, and I think it did a decent job with it.

Next we're going to try this prompt. I saw somebody else use it to generate a graphical calculator: "Write Python code to create a basic calculator application with addition, subtraction, multiplication, and division functions." Let's see if it can actually do it and whether the code works. I've seen other LLMs have trouble with this. OK, it came up with the code, so let's run it. It came up with a pretty simple, basic interface. Let's see if it actually works. I'm going to enter, let's say, three, then add... oh nice. Again you get the result if you subtract seven, and multiplication and division also work. This is pretty neat; I wasn't expecting that to work.

OK, so the next
prompt is: "Create a list of three startup ideas in enterprise B2B SaaS. The startup ideas should have a strong and compelling mission and also use AI in some way. Avoid cryptocurrency or blockchain. The startup ideas should have a cool and interesting name." It has started creating the response, and I am running it in chat mode because I want to ask it further questions.

The first one is MindMeld AI: this platform would use AI to help businesses better understand their employees' needs and preferences in order to create a more engaging and productive workforce. The mission of MindMeld AI is to make the office a happier, healthier place for everyone involved. Interesting, that's the first time I have seen that. SmartSales AI: this SaaS platform would use AI to help sales teams better understand their customers' needs and preferences and target these more effectively. And BrainBridge AI: this platform would use AI to help companies bridge language barriers by providing real-time translation services for meetings and other communication channels. This is interesting. This is nice.

When you are in chat mode, you can create different characters, assign them personalities, and even assign them different images. I'm going to show you how to create characters in my upcoming video, so stay tuned for that.

The goal of this video was to show you how to install and run the text generation web UI, but we looked at the Wizard-Vicuna model as well. I'm going to create a detailed comparison video on it. This text generation web UI is an excellent tool for running your models locally, so I would recommend you play around with it if you have the hardware for it.

As always, if you have any questions or comments, put them in the comment section below. If there are any specific topics you guys want me to cover, put those in the comment section below as well. I hope you found
this video useful. Thanks for watching, and see you in the next one.
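
As a footnote to the calculator prompt tried in the video: if you want something to compare your own model's answer against, here is a minimal sketch of the four operations that prompt asks for. This is not the code the model generated on screen. The function name `calculate`, the `OPERATIONS` table, and the operator symbols are assumptions made for illustration, and the graphical interface is left out entirely.

```python
import operator

# Hypothetical lookup table (not taken from the video): maps each
# operator symbol to the standard-library function that implements it.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(left, symbol, right):
    """Apply one of the four basic arithmetic operations to two numbers."""
    if symbol not in OPERATIONS:
        raise ValueError(f"Unsupported operation: {symbol}")
    if symbol == "/" and right == 0:
        # Fail with a clear message instead of a bare division error.
        raise ZeroDivisionError("Cannot divide by zero")
    return OPERATIONS[symbol](left, right)


if __name__ == "__main__":
    # Try each operation once, in the same spirit as the on-screen test.
    for symbol in ("+", "-", "*", "/"):
        print(f"3 {symbol} 7 =", calculate(3, symbol, 7))
```

A generated answer that wraps logic like this in a small window with buttons would match what the prompt is asking for; checking that each of the four operations returns the right number is a quick way to judge the output.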
