Want to bypass the annoying censorship of
ChatGPT and ask the AI anything without your data being collected? Don't want to pay for GPT-4 to surf the internet and chat with your AI without restrictions? Then you are in the right place! In this ultimate tutorial, I will show you
how to easily run various AIs on the Text Generation Web UI locally on your PC, with
many cool features like web search, characters, and even how to give your AI a voice, and
much more. I will walk you through everything step by
step and explain how to set up and download everything. So, watch the video until the end to not miss
any important details. With that said, let's get started right away! First, we need the Text Generation WebUI, the interface for running the language models. You should expect to need about 50 to 250 gigabytes of free disk space, depending on which models you want to load onto it. Before downloading, create a folder on your
desktop, just like I did. If you've done that or have another location
for installation, return to the GitHub page and download the Text Generation WebUI by
clicking on 'Code' and then 'Download ZIP'. Drag and unzip the downloaded ZIP file into
the folder you created. You can do this by right-clicking on the ZIP
file, selecting 'extract here', and using WinRAR to unzip. If you don't have WinRAR on your PC, you can
find and download it from the link in my video description. After unzipping, open the folder where you
will see many files and folders, which we will not focus on right now. I will explain later which files and folders
are important. Next, go to the 'start_windows.bat' file and run it. Once you run the file, you will see many processes
happening in the background, but don't worry about those for now. Just wait until this process finishes, and
then there are two more steps to complete the installation. When everything is loaded, you can choose
to run the large language models on your Nvidia graphics card, AMD, Apple M Series, Intel
Arc, or just on the CPU. Your choice depends on your operating system,
device, or hardware. For example, I have a good graphics card and
CPU, so I could use either. However, the language models run faster on
an Nvidia card, but since I'm recording while running these models, it takes up a lot of
graphics power, so I prefer running them on the CPU, even if it's a bit slower. Most of you will choose Nvidia, so press 'A',
and those without a good graphics card should press 'N'. But remember to check your Task Manager to
see which is stronger on your PC, or if you need one of the other options. For me, it's 'N' for the CPU. Press Enter, and additional extensions and
packages required for the interface will be downloaded. This will take a few minutes, so sit back
and relax, and check back once it's installed. After installation, copy the local URL shown in the console (usually something like http://127.0.0.1:7860) and paste it into your browser: select the URL, press Ctrl+C, then paste it into the address bar with Ctrl+V and press Enter. The interface will then load, allowing you
to write with the language models. But before we get to that, I need to show
you two files. Go back to the TextGeneration WebUI folder,
where you will find two files. One is the 'Update Windows bat' file, which
you use to update the interface. Run this file whenever there are new updates,
which you can check on the TextGeneration WebUI GitHub page under 'Releases' and 'Snapshots'. Check the date and if there's a newer version,
run the 'Update Windows bat' file, and it will install automatically. If you have a different operating system,
use the respective file. The other file you need to know is 'Start
Windows bat'. Use this to run the interface and access the
language models, so make sure to create a shortcut to it on your desktop, like I have
here, and you can rename it as you wish. I named mine 'Text Generation WebUI'. You might want to move the folder we created to a different location to keep your desktop tidy. Before we dive into the interface, let's discuss
an important topic: troubleshooting for the Text Generation WebUI Interface. If you encountered any issues during installation,
such as error messages, you can easily search for these errors on Reddit or even create
a post. The community there is very helpful, friendly,
and professional. If you cannot find a solution on Reddit, or
if the community could not assist you (which I doubt), another option is to check under 'Issues' on the Text Generation WebUI GitHub page. I have linked both of these resources in my
video description for your convenience. The most common issues usually arise when
certain applications are missing or not correctly installed by the One-Click Installer. Therefore, I have linked all the applications
that might be helpful for your specific error. You can find these under 'Artificial Intelligence
Sources' on my Discord. I have also posted all other important links
there. Since the video description has limited space,
not all links could be included, so it is worth checking out my Discord. The link to it is in the description. Now, let's truly dive into the interface. I know it has been a long journey, but you
needed all this information about troubleshooting and understanding the essential files. But now, let's really get started. The interface itself is not the AI; it's like
the ChatGPT page, but the AI is still missing. First, you need to download models, and I
will show you exactly how to do that. The best source for models is Hugging Face,
specifically from the user TheBloke. He's the best source for models because he posts them regularly, keeps them well organized, and always includes descriptions that explain how to download them. First, let me enlarge this browser window
and explain what you need to know about the model formats. There are two main types you need to be familiar
with: AWQ models for graphics cards and GGUF models for your CPU. Depending on whether you've decided to run
your models on the graphics card or CPU, choose one of these formats. Remember, AWQ for graphics cards, GGUF for
the CPU. I've simplified this explanation, but keep
that in mind. Now, regarding the parameters, you might have
noticed model names followed by 7B, 13B, 30B, or 70B. This indicates the model's size: the number of parameters, the learned weights it uses to process your input. For example, this model utilizes 7 billion parameters. You might wonder what this means and how it
makes a difference. Simply put, the more parameters a model has,
the more it can process and provide better or more complex responses. However, a model with more parameters isn't
necessarily better than a smaller one; it's just capable of creating more complex patterns. The training and other factors play a more
significant role in determining whether a model is better. Understand that a model with more parameters
also consumes more resources. This is crucial when choosing your model,
as you need to consider what size your hardware can support. I will show you approximate model sizes your
hardware might support based on RAM and VRAM. Remember, VRAM and RAM aren't the only factors
in processing model parameters; the GPU and CPU also play a role. If you're using a CPU to run models, you can
assume that 7 billion parameters require 5 to 10 gigabytes of RAM, 13 billion parameters
need 8 to 17 gigabytes, 30 billion parameters need 16 to 38 gigabytes, and 70 billion parameters
require 32 to 74 gigabytes. These are approximate figures. For most, only 7 and 13 billion parameter
models are relevant due to CPU limitations. For example, I have a 24-core, 4 GHz CPU and
can use up to 30 billion parameter models. But this doesn't mean 30 billion parameter
models are optimal for me. Models with 13 or 7 billion parameters are
preferable as they're faster and more satisfactory. It's crucial to test different models to see
what works best for you, what's faster or slower, and then choose the right model size. If you opt for a GPU, for 7 billion parameter
models, you'll need 6 to 8 gigabytes of VRAM, for 13 billion parameters, 8 to 12 gigabytes
of VRAM, for 30 billion, 16 to 24 gigabytes, and for 70 billion, 40 to 48 gigabytes of
VRAM. You might have noticed that the GPU figures are lower than the CPU estimates; GPUs are simply more efficient at running these models.
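These ballpark figures can be folded into a tiny lookup helper. This is only a sketch built from the rough estimates above; they are approximations, not hard requirements:

```python
# Rough memory needs (GB) for running local models, taken from the
# ballpark figures above. Keys are model sizes in billions of parameters;
# values are (low, high) estimates.
RAM_GB = {7: (5, 10), 13: (8, 17), 30: (16, 38), 70: (32, 74)}   # CPU / system RAM
VRAM_GB = {7: (6, 8), 13: (8, 12), 30: (16, 24), 70: (40, 48)}   # GPU / video RAM

def may_fit(model_size_b: int, available_gb: float, device: str = "cpu") -> bool:
    """True if the available memory reaches at least the low end of the estimate."""
    low, _high = (RAM_GB if device == "cpu" else VRAM_GB)[model_size_b]
    return available_gb >= low

# A 10 GB card like my RTX 3080 just reaches the 8-12 GB range of a 13B model,
# but a 30B model (16 GB minimum) is out of reach:
print(may_fit(13, 10, "gpu"))  # True
print(may_fit(30, 10, "gpu"))  # False
```

The example matches what I describe below: on a 10-gigabyte card, 13 billion parameters is the practical ceiling.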
card, which handles larger models more efficiently. With a GPU, I could run up to 13 billion parameter
models on my PC with a GeForce RTX 3080 with 10 gigabytes VRAM. But even though my card supports 13 billion
parameters, I prefer 7 billion for faster performance. Again, please experiment a bit, try out different
models, and see which one best fits you and what your hardware supports. Remember, these figures are estimates, and
other factors also play a role, including the model type, which prescribes the necessary
VRAM. We'll get back to this shortly. Now that you have learned all this and know
what to look for in a model and how to choose the best one, let us move on to model searching. There are two methods for this. The first option is to join my Discord, where
I have linked the models I recommend. You can find these under AI Sources, or you
can search for a model by TheBloke. The best way to do this is by first checking
the leaderboard, which I have linked in the video description, and then searching for
your model there. The advantage is that you can decide which
model is better or worse based on the ranking. Plus, you are always up-to-date and have many
other pieces of information that might be important for your decision-making process. However, you might have noticed models like
GPT-3.5 or GPT-4 listed there. These cannot be run on your interface but
can be used for comparison with your model. To find out which models run on the interface,
it is best to look for models where the parameters are listed in the name and then check with
TheBloke if they are available. My favorite model, which I always use, is
WizardLM. If I wanted to download it, I would copy its
name, go to TheBloke's page, and search for it there. Then you decide whether to run it on the GPU or the CPU. As you can see, most people download it for
the GPU, but I will run it today on the CPU. Here you can see the model's description,
which most models on TheBloke have. I recommend reading it, but the most important
thing you need to consider is this part here. This section lists the model types and how much RAM they consume. You should find this table for almost every
model. If it is not there, it is not a big deal because
you might find somewhere else in the description how much RAM or VRAM the model needs. Otherwise, you can go by my estimates and
see if that fits or not. In most cases, the models work according to
my estimates. If there are multiple types of the model,
go to 'Get File List'. Once it is loaded, you will see all the model
types here. To ultimately decide on one, go back to the
website, see how much RAM it consumes, and what is recommended or not. For example, I choose this model here. Then go back, find it in the list, copy it,
and paste it here. After that, just press 'Download', and the
model will start downloading. It will take about 5-10 minutes, as the model
is approximately 8 gigabytes in size. While it is downloading, I will show and explain
a few more things to you. You need to know that most models are in English,
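As an aside, the same download can also be scripted with the huggingface_hub package instead of the interface's Download button. This is a minimal sketch; the repo and file names in the example are placeholders, so substitute the exact names you copied from TheBloke's file list:

```python
# Sketch: fetch one model file from Hugging Face into the webui's models folder.
def download_model(repo_id: str, filename: str, dest: str = "models") -> str:
    """Download a single model file and return its local path."""
    # Imported lazily so the helper can be defined without the package present
    # (install it first with: pip install huggingface_hub).
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=filename, local_dir=dest)

# Example call (placeholder names, roughly an 8 GB download):
# path = download_model("TheBloke/SomeModel-GGUF", "somemodel.Q4_K_M.gguf")
```

Either way, the file ends up in the models folder, where the interface will find it.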
including the one we've downloaded. If you want it in a different language, you
need to visit TheBloke's page and then enter your desired language in the search bar to
find models available in that language. You'll then be presented with many options;
for example, 'Vicuna' is a very good model. Of course, you can also search for other languages,
like Spanish, and see what's available. For instance, right now, there are only two
models for Spanish. However, even if your model is only available
in English, the interface includes a translator, so don't worry about that. Now, let us return to the interface. As you can see, the model has been downloaded. You can load the model here by refreshing
this area and selecting your model. All the settings will be automatically configured
for you, which you can edit, but I will leave them as they are for now. Usually, in TheBloke's descriptions, you will
find all the necessary settings, including the model loader. You do not have to adjust these manually as
they are set automatically, but if something is not configured correctly, you can refer
back to the description for guidance. I forgot to press 'Load', which you should
do to ensure the model is correctly loaded. You will see a confirmation that it has been
successfully loaded. You also have the option to unload and reload
the model here and save any settings you have edited. If you have made it this far, congratulations,
you have successfully run your model on your personal computer. I know it has been a lot to take in, but I
needed to explain these basics. Now, let us move on to the most interesting
part. Now that the model is loaded, you can start
chatting with it, either on the Notebook, Default, or Chat interfaces. Let us begin with the Notebook. Here, you can directly see a textbox with
a prompt, which you can select from the top. You can also create, save, or delete new prompts. Below, you have options to generate messages,
stop them, undo, or regenerate. I will demonstrate using this prompt above. For example, I will ask the AI about the benefits
of a local AI and then press 'Generate'. As you can see, the AI immediately starts
responding in the Notebook. It has provided a very good response, which
I will not read out fully now, but you can pause the video and read it. It is impressive to see the quality of the
response from an AI running on your personal computer. Remember, we are talking about an AI that
runs on your personal computer, free of charge, without the infrastructure of OpenAI. It is amazing to think that we can now run
an AI on our own personal computers - it is truly revolutionary. Imagine a future where everyone has their
own AI, and with better hardware, anyone can run language models like ChatGPT on their
personal computers. Of course, this AI is not quite on the same
level as ChatGPT, but it is very close. But let us get back to the main topic. You have different formats to display the
text, like HTML, and other functions and options to explore. Let us go back to our prompt. As I mentioned earlier, you can select other
prompts, like this one, which has a similar function. You can experiment and even create and save
your prompts. Now that you know everything about the Notebook,
let us move to Default. The Default view is similar to the Notebook,
except here you have input and output sections. Let me show you how it works. Enter your prompt, like Q&A, ask a question
as before, then press 'Generate', and the AI will respond on this side, unlike in the
Notebook. Now, let us check out the Chat view, which
most will use. It is similar to ChatGPT but a bit better. You can track, delete, or rename your last
chats here and open a new chat to write with the AI. There are various ways to interact with the
AI. For instance, you can send your messages to
Notebook or Default. You can copy, replace, delete, continue, or
regenerate messages. So, there are many possibilities. Try them out, as you can manipulate the AI
to give you the responses you want. Let me demonstrate a chat. I mostly use this AI for improving my texts,
so I will set that task for the AI. But before we press Enter, let us go down
to 'Character Gallery', open it, and select 'Assistant'. We will get to characters later, but for now,
you can start writing to the AI and press Enter. The AI will then provide an answer, similar
to the other text views. You can have a conversation with the AI, like
on ChatGPT. For example, ask 'what is 2 plus 4'. There are many other options, as I showed
earlier. Let me demonstrate one. You can press 'Regenerate' to generate the
message anew. I will show you more, but first, let us go
down here. You have the option to choose the chat style,
like 'Messenger', which will display the chat like a messenger application. You also have the option to switch modes. I will not go into detail as it would exceed
the video's scope. But I see I have incorrectly set it here. For example, you should choose 'Chat Instruct'
or 'Instruct' for the AI. You can see which mode you need to use when
you load this model. It is stated here, for instance, 'Instruct'
or 'Chat Instruct'. Then go back to the chat and select the mode
the AI uses. Here, you can specify how the AI should respond,
like typing 'sure', and the AI will respond with 'sure'. This means you can manipulate the models and
get them to answer your unethical questions, which I will cover in another video. You can also bypass censorship by downloading
models from my Discord, like the Wizard models, which are all uncensored. I will now show you a few interesting things
about the interface, starting with the parameters. With the parameters, you can influence the
AI's response. For example, you can set the maximum number of tokens it can generate, which caps the length of the message, and you can also adjust the temperature, which controls how creative it is, among many other things. You can learn all about this under 'Learn
More'. I will not go into too much detail, as it
is really for the pros. I usually leave it as it is, but you can try
out other presets and see how the AI responds. And now, let us move directly to the characters. You can think of characters as similar to
GPTs. For instance, you can choose how you want
to be addressed, determine the character's name, give the character a context, and add
a greeting to start the chat with. You also have the option to upload a profile
picture for the character. You can delete characters, then create and
save new ones, and you can switch between characters. There is an example included here, but I will
not go into great detail now, as I will cover that in a separate video. Just understand that characters are like GPTs,
and depending on the context you give them, they will provide you with specific responses. Now, let us move on to the Instruction Template. I will not go into great detail here, maybe
in another video of mine. What you need to know is that things are automatically
loaded here. You can see that the template currently loaded is Vicuna 1.1. If it is not loaded automatically, you can
check the template again in the description of the respective model and then insert it
here. But usually, they are loaded automatically,
so you do not have to worry about that. Let us head to the Chat History. Here, you can save your current chat history,
and you can also upload histories. Here you can upload characters if you have
them in JSON format. That is all for the parameters for now; let
us move directly to the next topic. For this, we need to go to the Session section,
where you will find all extensions. I will show you the most important ones that
you must have. Let us start with Google Translate. As the name suggests, it will translate the AI's responses into the language you have selected. If you want to activate it, you need to check
this box and press 'Apply'. This will reload the interface and start the
extension. As you can see, the extension is loaded. Strangely, it is set to Japanese, so it is
translating into Japanese right away. Below, you can see it is activated, and here
you can also select the language you want. And that is pretty much it for Google Translate. Now let us move on to the next extension. The next extension is Silero Text-to-Speech,
which allows your AI to speak using the voice you select. Let us activate it and press 'Apply'. Once the interface has reloaded, you will
immediately see the audio options here, which we can play right now. I do not find this voice the best, so let
us scroll down and select the voice we want. I recommend 'en_0', as it is the best in my opinion. Additionally, you can set the language, pitch,
and speed. I will not adjust these now. Do not forget to check this box so that the
text is displayed under the audio. Let us test it now. Let us ask the AI how it is doing. The previous audio will play again, but now
we have to wait for the result to load. You will see that we have already set a preset
for the response to start with 'Sure'. That was really cool. Just imagine the possibilities you have with
this. You can conduct complete conversations, and
with funny characters, it gets even better. So, wait until the end of the video; I will
show you many more cool things. Now, let us move on to the next extension. Unfortunately, you cannot see it here as it
has been removed for some reason, so we will install it manually. This is the Elevenlabs Text-to-Speech Extension,
which allows you to assign a voice to your artificial intelligence. You need to download it from the Snapshots. For that, go to this website here, the link
is in the description or on Discord, and download the ZIP file. Once you have done that, open the file, go
to the Text Generation WebUI folder, and into the Extensions. Copy the Elevenlabs Text-to-Speech file and
paste it into the Extensions folder of your Text Generation WebUI folder. Then go back to the main folder and open 'cmd_windows.bat'. Also, remember that you need Python to install
the missing requirements. You can find the link on my Discord, and the
command as well, which is here at the bottom. You can copy it and paste it here at the top
to execute it. Once the requirements are downloaded, we can
proceed. When the Text Generation WebUI is restarted,
go back to Session, check Elevenlabs Text-to-Speech, and press 'Apply'. Before we can start the extension, we need the API key from our Elevenlabs account. Go to the website, open your profile, and copy the API key. Then paste it here and press 'Refresh'. Once it is loaded, you can adjust the settings. The most important thing I will do now is
to play the audios automatically and choose a voice. I will take Santa Claus for now. If I had Santa as a character, it would be
the perfect conversation, but let us just do it. The first thing I will ask is what gifts I
will get for Christmas and then press 'Generate'. Well, I will not play it completely, but you
see it works. Of course, the response I got is not the best,
but that is because it is not the Santa character, it is the AI itself. So, if you play around and want Santa as a
character, you will need to create one. Of course, you can also use the other free
voices, but I just find the ones from Elevenlabs better because there is more variation. Also, you might have noticed that it didn't
write to the end. That's because you need to assign more tokens
in the AI's parameters so it can generate more text. That is it for this topic. I hope you have understood the principle of
how everything works. Now, let us move on to the next extension in the Session. We need to use the installer here to download
it. Let us first go to the extension list, which
is linked in the description and on Discord. As you can see, there are many other extensions
available, many of which are quite interesting. You can check them out on your own, as I will
not go through all of them. However, the one I will show you is called
'Integrate TavernUI Characters'. This adds a library of characters to the Text
Generation WebUI interface. To download it, click on this link, and you
will be taken to the extension's page. There is a description you should read, as
it contains all the important information. You will also see a preview of the extension. To download it, copy the link and paste it
here, then press Enter. This will start the download. Once the download is complete, restart the
Text Generation WebUI. If you have done everything correctly, you
will see the extension at the bottom. Select it and press 'Apply'. You will then see a new area where you can
choose characters. Be warned, many things here are Not Safe for
Work, but you can turn that off in 'Featured', which I will not do right now. The first thing we will do is search for my
favorite character by entering their name and then searching. Click on 'Page 1', and there is the character
we are looking for. To download it, click on it and then 'Download'. Once the download is complete, go to 'Parameters',
'Character', and select the character. You will see everything is there: context,
greeting, the name, and my name. This character's function is to generate prompts
for Stable Diffusion, so there are not only Role Play characters but also productive ones. I will show you with another character how
it will look. As you can see, I have chosen Batman, a very
funny character with a lot of context, which I will spare you from reading. If you want to use this character, go back
to 'Chat', scroll down to 'Character Gallery', select it, and press 'Refresh'. Once selected, you will see the greeting. Let us start roleplaying by sending a message. As you can see, I have put the text between
two markup symbols. It is the action, and what I actually say
is, 'Batman, can you take off the mask for me?'. Now you have got the principle, so let us
press 'Generate'. What you can see is the text in grey is the
action, and the spoken part is in white. You can see Batman has replied, and I would
say it is a fitting response for Batman. You can have good roleplay and fun conversations. If you are interested, the next video will
be about characters. If you have enjoyed this content so far, consider
subscribing to stay updated on AI, and any support is greatly appreciated as it helps
me produce more videos. Let us return to the extensions. The next extension I want to show you is Lucid
Web Search, which allows you to surf the internet with the artificial intelligence. This time, I will manually install it to demonstrate
what to do if downloading it via the installer does not work. First, click on 'Code' and then 'Download
ZIP'. Once downloaded, click 'Open File', which
opens this folder here. Drag and drop this folder into the Extensions
folder here. Next, go into the Extensions folder and rename the folder to 'LucidWebSearch'. If the original name remains, it will not
work correctly. Then, go back and open the 'cmd_windows.bat' file and execute this command here. Once it is done, there is just one more thing
to do. First, close the window. Oh, and one more thing, you need the Chrome
browser. If you do not have it on your personal computer,
install it. After that, you need to open CMD on your personal
computer. Go to search and type CMD, then open this
window here and enter this command. This opens Chrome, which must run in parallel with the Text Generation WebUI for the extension to work. After restarting the interface, you will see
the extension here. Activate it and press 'Apply'. Once activated, you can adjust settings below. I will leave them as they are, but honestly,
you can do more than just surf the internet; you can also read PDFs. If you want to know more, go to the GitHub
page, where everything is explained, including even cooler functions. So, I always recommend reading through it
before starting to use an extension. Also, there is a description you can read
to understand exactly how everything works. To trigger the extension, you always need
to write 'Search' before your text so it will search as I do here, for example, asking the
artificial intelligence to 'search for the weather in Frankfurt'. Then press Enter. Now you can see that the AI has given me an
answer using Google Search. It is a very cool feature you can do with
this extension because there are really no limits. And remember, this extension is also free,
just like the AI. It is almost like having ChatGPT running on
your personal computer for free, uncensored, and privately. Now the only things missing are image recognition
and image generation, which we will get to now. Let us go back to Sessions. You can do image generation with this extension
here, but I will not show it in this video as it would exceed its scope. Instead, I will demonstrate how to perform
image recognition within the interface. For that, we need to activate the Multimodal
Extension here. But before we press 'Apply', let us save the
settings. By doing so, the interface will always start
with the extensions you have activated here. Once that is done, we have another step here. We need to edit the CMD Flags. Double-click here and insert this command,
which you can find on my Discord, and save the file. Then close it. Afterward, you will need a model that supports
image recognition. I have chosen this one. Once you have downloaded it, we have to make
some settings here. Select 'AutoGPTQ', enter '4' for wbits, and '128' for groupsize. If you have done that, then you can load it. But be aware, you need at least 16 gigabytes
of VRAM. I do not have that, so I cannot load it, but
for those who can, there is one more thing to do. Go to Chat and set 'Instruct' here below. Then you can start writing to the AI directly. But do not forget, you can also send images
here, as you can see. Just click here to insert an image and then
write something. For example, you can ask, 'describe this image,'
and then the AI will describe the image here. So, that brings us to the end of the video. For all my Patreon supporters, I have made
a mind map with all my notes and information from this video, including some extra stuff. With that said, the last sentence goes to
the AI.