Want to bypass the annoying censorship of
ChatGPT and ask the AI anything without your data being collected? Don't want to pay for GPT-4 to surf the internet and chat with your AI without restrictions? Then you are in the right place! In this ultimate tutorial, I will show you
how to easily run various AIs on the Text Generation Web UI locally on your PC, with
many cool features like web search, characters, and even how to give your AI a voice, and
much more. I will walk you through everything step by
step and explain how to set up and download everything. So, watch the video until the end to not miss
any important details. With that said, let's get started right away! First, we need the Text Generation WebUI, the interface for running the language models. You should expect to need about 50 to 250 gigabytes of free disk space, depending on which models you want to load onto it. Before downloading, create a folder on your
desktop, just like I did. If you've done that or have another location
for installation, return to the GitHub page and download the Text Generation WebUI by
clicking on 'Code' and then 'Download ZIP'. Drag and unzip the downloaded ZIP file into
the folder you created. You can do this by right-clicking on the ZIP
file, selecting 'extract here', and using WinRAR to unzip. If you don't have WinRAR on your PC, you can
find and download it from the link in my video description. After unzipping, open the folder where you
will see many files and folders, which we will not focus on right now. I will explain later which files and folders
are important. Next, go to the 'start_windows.bat' file and run it. Once you run the file, you will see many processes
happening in the background, but don't worry about those for now. Just wait until this process finishes, and
then there are two more steps to complete the installation. When everything is loaded, you can choose
to run the large language models on your Nvidia graphics card, AMD, Apple M Series, Intel
Arc, or just on the CPU. Your choice depends on your operating system,
device, or hardware. For example, I have a good graphics card and
CPU, so I could use either. However, the language models run faster on
an Nvidia card, but since I'm recording while running these models, it takes up a lot of
graphics power, so I prefer running them on the CPU, even if it's a bit slower. Most of you will choose Nvidia, so press 'A',
and those without a good graphics card should press 'N'. But remember to check your Task Manager to
see which is stronger on your PC, or if you need one of the other options. For me, it's 'N' for the CPU. Press Enter, and additional extensions and
packages required for the interface will be downloaded. This will take a few minutes, so sit back
and relax, and check back once it's installed. After installation, copy the local URL shown in the console (usually something like http://127.0.0.1:7860) and paste it into your browser: select the URL, press Ctrl+C, then paste it into the address bar with Ctrl+V and press Enter. The interface will then load, allowing you
to write with the language models. But before we get to that, I need to show
you two files. Go back to the TextGeneration WebUI folder,
where you will find two files. One is the 'Update Windows bat' file, which
you use to update the interface. Run this file whenever there are new updates,
which you can check on the TextGeneration WebUI GitHub page under 'Releases' and 'Snapshots'. Check the date and if there's a newer version,
run the 'Update Windows bat' file, and it will install automatically. If you have a different operating system,
use the respective file. The other file you need to know is 'Start
Windows bat'. Use this to run the interface and access the
language models, so make sure to create a shortcut to it on your desktop, like I have
here, and you can rename it as you wish. I named mine 'Text Generation WebUI'. You might want to move the folder we created to a different location to keep your desktop tidy. Before we dive into the interface, let's discuss
an important topic: troubleshooting for the Text Generation WebUI Interface. If you encountered any issues during installation,
such as error messages, you can easily search for these errors on Reddit or even create
a post. The community there is very helpful, friendly,
and professional. If you cannot find a solution on Reddit, or
if the community could not assist you (which I doubt), another option is to check under 'Issues' on the Text Generation WebUI GitHub page. I have linked both of these resources in my
video description for your convenience. The most common issues usually arise when
certain applications are missing or not correctly installed by the One-Click Installer. Therefore, I have linked all the applications
that might be helpful for your specific error. You can find these under 'Artificial Intelligence
Sources' on my Discord. I have also posted all other important links
there. Since the video description has limited space,
not all links could be included, so it is worth checking out my Discord. The link to it is in the description. Now, let's truly dive into the interface. I know it has been a long journey, but you
needed all this information about troubleshooting and understanding the essential files. But now, let's really get started. The interface itself is not the AI; it's like
the ChatGPT page, but the AI is still missing. First, you need to download models, and I
will show you exactly how to do that. The best source for models is Hugging Face,
specifically from the user TheBloke. He's the best source for models because he posts them regularly, keeps them well organized, and always includes descriptions that explain how to download them. First, let me enlarge this browser window
and explain what you need to know about the model formats. There are two main types you need to be familiar
with: AWQ models for graphics cards and GGUF models for your CPU. Depending on whether you've decided to run
your models on the graphics card or CPU, choose one of these formats. Remember, AWQ for graphics cards, GGUF for
the CPU. I've simplified this explanation, but keep
that in mind. Now, regarding the parameters, you might have
noticed model names followed by 7B, 13B, 30B, or 70B. This indicates the model's size: the number of parameters, the learned weights it uses to process your input. For example, this model utilizes 7 billion parameters. You might wonder what this means and how it
makes a difference. Simply put, the more parameters a model has,
the more it can process and provide better or more complex responses. However, a model with more parameters isn't
necessarily better than a smaller one; it's just capable of creating more complex patterns. The training and other factors play a more
significant role in determining whether a model is better. Understand that a model with more parameters
also consumes more resources. This is crucial when choosing your model,
as you need to consider what size your hardware can support. I will show you approximate model sizes your
hardware might support based on RAM and VRAM. Remember, VRAM and RAM aren't the only factors
in processing model parameters; the GPU and CPU also play a role. If you're using a CPU to run models, you can
assume that 7 billion parameters require 5 to 10 gigabytes of RAM, 13 billion parameters
need 8 to 17 gigabytes, 30 billion parameters need 16 to 38 gigabytes, and 70 billion parameters
require 32 to 74 gigabytes. These are approximate figures. For most, only 7 and 13 billion parameter
models are relevant due to CPU limitations. For example, I have a 24-core, 4 GHz CPU and
can use up to 30 billion parameter models. But this doesn't mean 30 billion parameter
models are optimal for me. Models with 13 or 7 billion parameters are
preferable as they're faster and more satisfactory. It's crucial to test different models to see
what works best for you, what's faster or slower, and then choose the right model size. If you opt for a GPU, for 7 billion parameter
models, you'll need 6 to 8 gigabytes of VRAM, for 13 billion parameters, 8 to 12 gigabytes
of VRAM, for 30 billion, 16 to 24 gigabytes, and for 70 billion, 40 to 48 gigabytes of
VRAM. You might have noticed that the GPU figures are lower than the CPU estimates; GPUs are simply more efficient at running these models.
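These ballpark figures can be folded into a tiny lookup helper. This is only a sketch built from the rough estimates above; they are approximations, not hard requirements:

```python
# Rough memory needs (GB) for running local models, taken from the
# ballpark figures above. Keys are model sizes in billions of parameters;
# values are (low, high) estimates.
RAM_GB = {7: (5, 10), 13: (8, 17), 30: (16, 38), 70: (32, 74)}   # CPU / system RAM
VRAM_GB = {7: (6, 8), 13: (8, 12), 30: (16, 24), 70: (40, 48)}   # GPU / video RAM

def may_fit(model_size_b: int, available_gb: float, device: str = "cpu") -> bool:
    """True if the available memory reaches at least the low end of the estimate."""
    low, _high = (RAM_GB if device == "cpu" else VRAM_GB)[model_size_b]
    return available_gb >= low

# A 10 GB card like my RTX 3080 just reaches the 8-12 GB range of a 13B model,
# but a 30B model (16 GB minimum) is out of reach:
print(may_fit(13, 10, "gpu"))  # True
print(may_fit(30, 10, "gpu"))  # False
```

The example matches what I describe below: on a 10-gigabyte card, 13 billion parameters is the practical ceiling.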
card, which handles larger models more efficiently. With a GPU, I could run up to 13 billion parameter
models on my PC with a GeForce RTX 3080 with 10 gigabytes VRAM. But even though my card supports 13 billion
parameters, I prefer 7 billion for faster performance. Again, please experiment a bit, try out different
models, and see which one best fits you and what your hardware supports. Remember, these figures are estimates, and
other factors also play a role, including the model type, which prescribes the necessary
VRAM. We'll get back to this shortly. Now that you have learned all this and know
what to look for in a model and how to choose the best one, let us move on to model searching. There are two methods for this. The first option is to join my Discord, where
I have linked the models I recommend. You can find these under AI Sources, or you
can search for a model by TheBloke. The best way to do this is by first checking
the leaderboard, which I have linked in the video description, and then searching for
your model there. The advantage is that you can decide which
model is better or worse based on the ranking. Plus, you are always up-to-date and have many
other pieces of information that might be important for your decision-making process. However, you might have noticed models like
GPT-3.5 or GPT-4 listed there. These cannot be run on your interface but
can be used for comparison with your model. To find out which models run on the interface,
it is best to look for models where the parameters are listed in the name and then check with
TheBloke if they are available. My favorite model, which I always use, is
WizardLM. If I wanted to download it, I would copy its
name, go to TheBloke's page, and search for it there. Then you decide whether to run it on the GPU or the CPU. As you can see, most people download it for
the GPU, but I will run it today on the CPU. Here you can see the model's description,
which most models on TheBloke have. I recommend reading it, but the most important
thing you need to consider is this part here. This section lists the model types and how much RAM they consume. You should find this table for almost every
model. If it is not there, it is not a big deal because
you might find somewhere else in the description how much RAM or VRAM the model needs. Otherwise, you can go by my estimates and
see if that fits or not. In most cases, the models work according to
my estimates. If there are multiple types of the model,
go to 'Get File List'. Once it is loaded, you will see all the model
types here. To ultimately decide on one, go back to the
website, see how much RAM it consumes, and what is recommended or not. For example, I choose this model here. Then go back, find it in the list, copy it,
and paste it here. After that, just press 'Download', and the
model will start downloading. It will take about 5-10 minutes, as the model
is approximately 8 gigabytes in size. While it is downloading, I will show and explain
a few more things to you. You need to know that most models are in English,
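As an aside, the same download can also be scripted with the huggingface_hub package instead of the interface's Download button. This is a minimal sketch; the repo and file names in the example are placeholders, so substitute the exact names you copied from TheBloke's file list:

```python
# Sketch: fetch one model file from Hugging Face into the webui's models folder.
def download_model(repo_id: str, filename: str, dest: str = "models") -> str:
    """Download a single model file and return its local path."""
    # Imported lazily so the helper can be defined without the package present
    # (install it first with: pip install huggingface_hub).
    from huggingface_hub import hf_hub_download
    return hf_hub_download(repo_id=repo_id, filename=filename, local_dir=dest)

# Example call (placeholder names, roughly an 8 GB download):
# path = download_model("TheBloke/SomeModel-GGUF", "somemodel.Q4_K_M.gguf")
```

Either way, the file ends up in the models folder, where the interface will find it.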
including the one we've downloaded. If you want it in a different language, you
need to visit TheBloke's page and then enter your desired language in the search bar to
find models available in that language. You'll then be presented with many options;
for example, 'Vicuna' is a very good model. Of course, you can also search for other languages,
like Spanish, and see what's available. For instance, right now, there are only two
models for Spanish. However, even if your model is only available
in English, the interface includes a translator, so don't worry about that. Now, let us return to the interface. As you can see, the model has been downloaded. You can load the model here by refreshing
this area and selecting your model. All the settings will be automatically configured
for you, which you can edit, but I will leave them as they are for now. Usually, in TheBloke's descriptions, you will
find all the necessary settings, including the model loader. You do not have to adjust these manually as
they are set automatically, but if something is not configured correctly, you can refer
back to the description for guidance. I forgot to press 'Load', which you should
do to ensure the model is correctly loaded. You will see a confirmation that it has been
successfully loaded. You also have the option to unload and reload
the model here and save any settings you have edited. If you have made it this far, congratulations,
you have successfully run your model on your personal computer. I know it has been a lot to take in, but I
needed to explain these basics. Now, let us move on to the most interesting
part. Now that the model is loaded, you can start
chatting with it, either on the Notebook, Default, or Chat interfaces. Let us begin with the Notebook. Here, you can directly see a textbox with
a prompt, which you can select from the top. You can also create, save, or delete new prompts. Below, you have options to generate messages,
stop them, undo, or regenerate. I will demonstrate using this prompt above. For example, I will ask the AI about the benefits
of a local AI and then press 'Generate'. As you can see, the AI immediately starts
responding in the Notebook. It has provided a very good response, which
I will not read out fully now, but you can pause the video and read it. It is impressive to see the quality of the
response from an AI running on your personal computer. Remember, we are talking about an AI that
runs on your personal computer, free of charge, without the infrastructure of OpenAI. It is amazing to think that we can now run
an AI on our own personal computers - it is truly revolutionary. Imagine a future where everyone has their
own AI, and with better hardware, anyone can run language models like ChatGPT on their
personal computers. Of course, this AI is not quite on the same
level as ChatGPT, but it is very close. But let us get back to the main topic. You have different formats to display the
text, like HTML, and other functions and options to explore. Let us go back to our prompt. As I mentioned earlier, you can select other
prompts, like this one, which has a similar function. You can experiment and even create and save
your prompts. Now that you know everything about the Notebook,
let us move to Default. The Default view is similar to the Notebook,
except here you have input and output sections. Let me show you how it works. Enter your prompt, like Q&A, ask a question
as before, then press 'Generate', and the AI will respond on this side, unlike in the
Notebook. Now, let us check out the Chat view, which
most will use. It is similar to ChatGPT but a bit better. You can track, delete, or rename your last
chats here and open a new chat to write with the AI. There are various ways to interact with the
AI. For instance, you can send your messages to
Notebook or Default. You can copy, replace, delete, continue, or
regenerate messages. So, there are many possibilities. Try them out, as you can manipulate the AI
to give you the responses you want. Let me demonstrate a chat. I mostly use this AI for improving my texts,
so I will set that task for the AI. But before we press Enter, let us go down
to 'Character Gallery', open it, and select 'Assistant'. We will get to characters later, but for now,
you can start writing to the AI and press Enter. The AI will then provide an answer, similar
to the other text views. You can have a conversation with the AI, like
on ChatGPT. For example, ask 'what is 2 plus 4'. There are many other options, as I showed
earlier. Let me demonstrate one. You can press 'Regenerate' to generate the
message anew. I will show you more, but first, let us go
down here. You have the option to choose the chat style,
like 'Messenger', which will display the chat like a messenger application. You also have the option to switch modes. I will not go into detail as it would exceed
the video's scope. But I see I have incorrectly set it here. For example, you should choose 'Chat Instruct'
or 'Instruct' for the AI. You can see which mode you need to use when
you load this model. It is stated here, for instance, 'Instruct'
or 'Chat Instruct'. Then go back to the chat and select the mode
the AI uses. Here, you can specify how the AI should respond,
like typing 'sure', and the AI will respond with 'sure'. This means you can manipulate the models and
get them to answer your unethical questions, which I will cover in another video. You can also bypass censorship by downloading
models from my Discord, like the Wizard models, which are all uncensored. I will now show you a few interesting things
about the interface, starting with the parameters. With the parameters, you can influence the
AI's response. For example, you can set the maximum number of tokens it can generate, which caps the length of the message, and you can also adjust the temperature, which controls how creative it is, among many other things. You can learn all about this under 'Learn
More'. I will not go into too much detail, as it
is really for the pros. I usually leave it as it is, but you can try
out other presets and see how the AI responds. And now, let us move directly to the characters. You can think of characters as similar to
GPTs. For instance, you can choose how you want
to be addressed, determine the character's name, give the character a context, and add
a greeting to start the chat with. You also have the option to upload a profile
picture for the character. You can delete characters, then create and
save new ones, and you can switch between characters. There is an example included here, but I will
not go into great detail now, as I will cover that in a separate video. Just understand that characters are like GPTs,
and depending on the context you give them, they will provide you with specific responses. Now, let us move on to the Instruction Template. I will not go into great detail here, maybe
in another video of mine. What you need to know is that things are automatically
loaded here. You can see that the template currently loaded is Vicuna 1.1. If it is not loaded automatically, you can
check the template again in the description of the respective model and then insert it
here. But usually, they are loaded automatically,
so you do not have to worry about that. Let us head to the Chat History. Here, you can save your current chat history,
and you can also upload histories. Here you can upload characters if you have
them in JSON format. That is all for the parameters for now; let
us move directly to the next topic. For this, we need to go to the Session section,
where you will find all extensions. I will show you the most important ones that
you must have. Let us start with Google Translate. As the name suggests, it will translate the AI's responses into the language you have selected. If you want to activate it, you need to check
this box and press 'Apply'. This will reload the interface and start the
extension. As you can see, the extension is loaded. Strangely, it is set to Japanese, so it is
translating into Japanese right away. Below, you can see it is activated, and here
you can also select the language you want. And that is pretty much it for Google Translate. Now let us move on to the next extension. The next extension is Silero Text-to-Speech,
which allows your AI to speak using the voice you select. Let us activate it and press 'Apply'. Once the interface has reloaded, you will
immediately see the audio options here, which we can play right now. I do not find this voice the best, so let
us scroll down and select the voice we want. I recommend 'en_0', as it is the best in my opinion. Additionally, you can set the language, pitch,
and speed. I will not adjust these now. Do not forget to check this box so that the
text is displayed under the audio. Let us test it now. Let us ask the AI how it is doing. The previous audio will play again, but now
we have to wait for the result to load. You will see that we have already set a preset
for the response to start with 'Sure'. That was really cool. Just imagine the possibilities you have with
this. You can conduct complete conversations, and
with funny characters, it gets even better. So, wait until the end of the video; I will
show you many more cool things. Now, let us move on to the next extension. Unfortunately, you cannot see it here as it
has been removed for some reason, so we will install it manually. This is the Elevenlabs Text-to-Speech Extension,
which allows you to assign a voice to your artificial intelligence. You need to download it from the Snapshots. For that, go to this website here, the link
is in the description or on Discord, and download the ZIP file. Once you have done that, open the file, go
to the Text Generation WebUI folder, and into the Extensions. Copy the Elevenlabs Text-to-Speech file and
paste it into the Extensions folder of your Text Generation WebUI folder. Then go back to the main folder and open 'cmd_windows.bat'. Also, remember that you need Python to install
the missing requirements. You can find the link on my Discord, and the
command as well, which is here at the bottom. You can copy it and paste it here at the top
to execute it. Once the requirements are downloaded, we can
proceed. When the Text Generation WebUI is restarted,
go back to Session, check Elevenlabs Text-to-Speech, and press 'Apply'. Before we can start the extension, we need the API key from our Elevenlabs account. Go to the website, open your profile, and copy the API key. Then paste it here and press 'Refresh'. Once it is loaded, you can adjust the settings. The most important thing I will do now is
to play the audios automatically and choose a voice. I will take Santa Claus for now. If I had Santa as a character, it would be
the perfect conversation, but let us just do it. The first thing I will ask is what gifts I
will get for Christmas and then press 'Generate'. Well, I will not play it completely, but you
see it works. Of course, the response I got is not the best,
but that is because it is not the Santa character, it is the AI itself. So, if you play around and want Santa as a
character, you will need to create one. Of course, you can also use the other free
voices, but I just find the ones from Elevenlabs better because there is more variation. Also, you might have noticed that it didn't
write to the end. That's because you need to assign more tokens
in the AI's parameters so it can generate more text. That is it for this topic. I hope you have understood the principle of
how everything works. Now, let us move on to the next extension in the Session. We need to use the installer here to download
it. Let us first go to the extension list, which
is linked in the description and on Discord. As you can see, there are many other extensions
available, many of which are quite interesting. You can check them out on your own, as I will
not go through all of them. However, the one I will show you is called
'Integrate TavernUI Characters'. This adds a library of characters to the Text
Generation WebUI interface. To download it, click on this link, and you
will be taken to the extension's page. There is a description you should read, as
it contains all the important information. You will also see a preview of the extension. To download it, copy the link and paste it
here, then press Enter. This will start the download. Once the download is complete, restart the
Text Generation WebUI. If you have done everything correctly, you
will see the extension at the bottom. Select it and press 'Apply'. You will then see a new area where you can
choose characters. Be warned, many things here are Not Safe for
Work, but you can turn that off in 'Featured', which I will not do right now. The first thing we will do is search for my
favorite character by entering their name and then searching. Click on 'Page 1', and there is the character
we are looking for. To download it, click on it and then 'Download'. Once the download is complete, go to 'Parameters',
'Character', and select the character. You will see everything is there: context,
greeting, the name, and my name. This character's function is to generate prompts
for Stable Diffusion, so there are not only Role Play characters but also productive ones. I will show you with another character how
it will look. As you can see, I have chosen Batman, a very
funny character with a lot of context, which I will spare you from reading. If you want to use this character, go back
to 'Chat', scroll down to 'Character Gallery', select it, and press 'Refresh'. Once selected, you will see the greeting. Let us start roleplaying by sending a message. As you can see, I have put the text between
two markup symbols. It is the action, and what I actually say
is, 'Batman, can you take off the mask for me?'. Now you have got the principle, so let us
press 'Generate'. What you can see is the text in grey is the
action, and the spoken part is in white. You can see Batman has replied, and I would
say it is a fitting response for Batman. You can have good roleplay and fun conversations. If you are interested, the next video will
be about characters. If you have enjoyed this content so far, consider
subscribing to stay updated on AI, and any support is greatly appreciated as it helps
me produce more videos. Let us return to the extensions. The next extension I want to show you is Lucid
Web Search, which allows you to surf the internet with the artificial intelligence. This time, I will manually install it to demonstrate
what to do if downloading it via the installer does not work. First, click on 'Code' and then 'Download
ZIP'. Once downloaded, click 'Open File', which
opens this folder here. Drag and drop this folder into the Extensions
folder here. Next, go into the Extensions folder and rename the folder to 'LucidWebSearch'. If the original name remains, it will not
work correctly. Then, go back and open the 'cmd_windows.bat' file and execute this command here. Once it is done, there is just one more thing
to do. First, close the window. Oh, and one more thing, you need the Chrome
browser. If you do not have it on your personal computer,
install it. After that, you need to open CMD on your personal
computer. Go to search and type CMD, then open this
window here and enter this command. This opens Chrome, which must run in parallel with the Text Generation WebUI for the extension to work. After restarting the interface, you will see
the extension here. Activate it and press 'Apply'. Once activated, you can adjust settings below. I will leave them as they are, but honestly,
you can do more than just surf the internet; you can also read PDFs. If you want to know more, go to the GitHub
page, where everything is explained, including even cooler functions. So, I always recommend reading through it
before starting to use an extension. Also, there is a description you can read
to understand exactly how everything works. To trigger the extension, you always need
to write 'Search' before your text so it will search as I do here, for example, asking the
artificial intelligence to 'search for the weather in Frankfurt'. Then press Enter. Now you can see that the AI has given me an
answer using Google Search. It is a very cool feature you can do with
this extension because there are really no limits. And remember, this extension is also free,
just like the AI. It is almost like having ChatGPT running on
your personal computer for free, uncensored, and privately. Now the only things missing are image recognition
and image generation, which we will get to now. Let us go back to Sessions. You can do image generation with this extension
here, but I will not show it in this video as it would exceed its scope. Instead, I will demonstrate how to perform
image recognition within the interface. For that, we need to activate the Multimodal
Extension here. But before we press 'Apply', let us save the
settings. By doing so, the interface will always start
with the extensions you have activated here. Once that is done, we have another step here. We need to edit the CMD Flags. Double-click here and insert this command,
which you can find on my Discord, and save the file. Then close it. Afterward, you will need a model that supports
image recognition. I have chosen this one. Once you have downloaded it, we have to make
some settings here. Select 'AutoGPTQ', enter '4' for wbits, and '128' for groupsize. If you have done that, then you can load it. But be aware, you need at least 16 gigabytes
of VRAM. I do not have that, so I cannot load it, but
for those who can, there is one more thing to do. Go to Chat and set 'Instruct' here below. Then you can start writing to the AI directly. But do not forget, you can also send images
here, as you can see. Just click here to insert an image and then
write something. For example, you can ask, 'describe this image,'
and then the AI will describe the image here. So, that brings us to the end of the video. For all my Patreon supporters, I have made
a mind map with all my notes and information from this video, including some extra stuff. With that said, the last sentence goes to
the AI.