Hello, everyone. Welcome to the most in-depth tutorial ever
made for Stable Diffusion training. I have been doing Stable Diffusion training
since 2022, so this tutorial is the cumulative result of more than 16 months of experience. As a result, I can confidently say that this tutorial is like an entire course that you would normally purchase for hundreds of dollars. In this tutorial, I am going to show you how to install OneTrainer from scratch on your computer and train Stable Diffusion SDXL and SD 1.5 based models locally. I will show, without any paywall, the very best configuration parameters that I have found after doing more than 200 empirical research trainings, including masked training and the proper setup of training concepts. Moreover, I am going to show you how to do training on Massed Compute cloud virtual machines, with perfect privacy and at amazingly discounted prices, if your computer is not powerful enough. The discounted price is 31 cents per hour for an A6000 GPU machine, which, as a comparison, costs more than 70 cents on RunPod. The virtual machine we prepared on Massed Compute has a desktop interface with a full operating system, so it will be as easy as using your own computer. Furthermore, I will show you how to utilize
more than one GPU at the same time for different tasks on Massed Compute such as doing two
separate trainings. The same strategy applies if you have more
than one GPU on your computer as well. In addition, I will show you how to caption
your training datasets properly. Also, I will explain why your training may be extremely slow due to shared VRAM issues. Moreover, I will show how to use the very latest version of Automatic1111 SD Web UI on Windows, Linux, and Massed Compute, with amazing extensions such as After Detailer and ControlNet. Additionally, I will show how to generate amazing pictures from your trained model, with accurate After Detailer settings that improve faces in distant shots. This will help you generate high-quality images
with ease. Finally, I will show how to upload and save
generated model checkpoints and literally anything onto Hugging Face with a very easy
Jupyter Lab notebook. Before we start this massive tutorial, I would
like to ask you one thing: go to the Stable Diffusion link here and star our repository,
fork it, watch it, and sponsor it if you like. I appreciate that very much. I also give one-to-one private lectures if you are interested; you can join our Discord and message me there. Also, we give huge support to everyone who joins our Discord channel. You can join from this link. Moreover, all the images I have shown during this introduction are shared at this link. Go to this link and you will be able to look at every image's details with its PNG info data. Just click an image and you will see the prompt and every detail that you need. So, let's begin. As usual, I have prepared an amazing GitHub
readme file for this tutorial. This file will have all the links and instructions
that you are going to need to follow this tutorial. The link to this file will be in the description of the video and also in the pinned comment. Moreover, since the video will be very long, I am going to put sections into the description so you can jump to any section you are looking for. Furthermore, I am going to add fully manually written English captions, and captions in other languages as well. So, if you have a problem understanding my English, please watch the video with captions on. I am going to show you how to install on Windows
and also on Massed Compute step by step. First of all, I am going to begin with registering
into Massed Compute and starting our virtual machine. Then, I will start installing the OneTrainer
on my Windows 10. It doesn't matter if you have Windows 10 or
Windows 11. On both of them, it works perfectly fine and
the same. If your GPU is sufficient, you don't need
Massed Compute or cloud computing. You can just use the OneTrainer on your computer
with your GPU. So, please register for Massed Compute with this link, because we are going to use a coupon. I am not sure whether you actually need to register through this link for the coupon to apply, but it is my referral link. Once you have registered, you need to enter
your billing information here. You can add your payment method, charge it,
and delete it if you want. Then, we are going to do deployment. This part is extremely important. Our coupon is only valid for A6000 GPUs and
A6000 GPUs are more than sufficient for efficient training. For this tutorial, I am going to start with A6000 GPUs and show you how you can do a different task on each one of them. You also need to choose our virtual machine template: select the creator from here, select SECourses, and you will see my image here. By the way, that image is also a DreamBooth result, a pretty accurate one. Then, you type our coupon like this. You can also copy the coupon, SECourses, from here,
then put it here and click verify. It will show you the new price of the GPU. You see, currently 4 GPUs are not available, so let's reduce it to 2; that is not available either, so let's try 1. Okay, when I set it to 1, I can get 2 instances, but when I set it to 2, I can't get any. The Massed Compute team keeps adding more GPUs, so even though 4x A6000 is not currently available, it is fine. So, after you put in your coupon code like this and click verify, you are going to get a huge price reduction. You see, currently it is 0.62 per hour; when I click verify, it becomes 0.31 per hour, with 48 gigabytes of RAM, 256 gigabytes of storage, and six virtual CPUs. If we compare the price with RunPod, you see
the RunPod price is 69 cents per hour. This is the community cloud price but on Massed
Compute, we are going to get the same GPU for only 31 cents per hour. It is less than half the RunPod price. Therefore, Massed Compute is giving us an amazing price for an amazing GPU. So, what I am going to do for this tutorial is pick L40 GPUs, but you don't need them, and you don't need more than 1 GPU; I am picking them just to show you how you can utilize more than 1 GPU. I am going to get 4 GPUs, but as I said, you just need a single A6000 GPU to follow this tutorial and do your training. Okay, so let's select the creator, let's select
SECourses. Unfortunately, the coupon will not work on
other GPUs, so click deploy. Okay, it says that I have reached the limit of running instances. You will get this message when there isn't a sufficient number of GPUs available, so I have to reduce it. Unfortunately, these GPUs are also in high demand, so let's see. Yes, I can get 4 H100 GPUs, but that would be 10 dollars per hour; it is just too expensive, so I don't want to risk it. Okay, let's go with only 2 L40 GPUs. You will still understand the logic of using more than 1 GPU. You will get to this screen, and it will start initializing the virtual machine. Now, it is time to install OneTrainer on our
Windows computer. To be able to follow this tutorial and install
OneTrainer on your computer, you need to have Python and Git installed; C++ tools and FFmpeg are optional, but it is better if you also install them. So, if you don't know how to install Python,
please watch this amazing tutorial. In this tutorial, I have explained everything
about how to install Python, how to set its virtual environment, how to set its path,
and everything else that you are going to need, and also how to install Git. So, how can you verify your Python installation? Open a CMD (command line interface) like this, type python, and you should see 3.10.11. By the way, 3.10.11 works with Automatic1111 Web UI, Kohya, and any other major AI application that you are going to use. It is the most compatible and the best Python version that you can use; it is the version that I use for all of my tutorials. Then, how can you verify that you have Git installed? Just type git and you should get a message like this. If you also installed FFmpeg, you should get a message like this showing FFmpeg and its properties. Unfortunately, there is no easy way to verify the C++ tools installation, but everything is explained in that video, so watch it and install your Python and set it up accurately.
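For quick reference, these are the terminal checks just described (typing plain python or git as in the video works too; this is just a compact sketch):

```bash
python --version   # should report Python 3.10.11
git --version      # should report the installed Git version
ffmpeg -version    # optional; prints FFmpeg build details if it is installed
```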
Once Python is set up, the first thing to do is clone OneTrainer. OneTrainer is an open-source trainer for Stable Diffusion; actually, it supports a lot of different text-to-image models, like Stable Cascade, as well. It has excellent documentation and a wiki. Moreover, the developer is extremely active
in Discord so you can join their Discord channel and ask any questions that you have. Also, don't forget to join our Discord channel
as well; you see, we have over 1000 active AI learners. So, I copied the repository URL, then enter the folder where you want to install it. I am going to install it on my F drive. I already have OneTrainer here, so I will make
a new subfolder. Let's say tutorial OneTrainer. By the way, do not make your subfolder name
the same as the original name. Since the original name is OneTrainer, do
not make the subfolder name OneTrainer as well. It may cause issues. Then, right-click and it will paste your copied
text, and hit enter, and it will clone the repository into this folder. Then, enter the OneTrainer folder. OneTrainer has automatic installation and automatic starting bat files, so I will just click the install.bat file, and it will generate a new virtual environment and install everything automatically for me.
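If you prefer to do the same clone and install entirely from the command line, a minimal sketch would look like this; the repository URL is my assumption of the official OneTrainer repo, and the folder name is just the example used above:

```bash
cd "F:/tutorial OneTrainer"                       # any folder NOT named OneTrainer
git clone https://github.com/Nerogar/OneTrainer   # assumed official repo URL
cd OneTrainer
install.bat                                       # creates the venv and installs dependencies
```

After the install finishes, the Start UI bat file in the same folder launches the interface.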
What does this mean? It means that whatever it installs will not affect your other AI installations, such as Automatic1111 Web UI, Kohya, or whatever you are using. Everything will be located inside this virtual environment folder. Okay, let's return to Massed Compute,
and we see that our instance has started. So, how are we going to use this instance? To use Massed Compute, we need to use the ThinLinc application. The link is here; when you click it, it will direct you to download
options. I am going to download and use the Windows
version. After download, just click it, open it, and
just click next and next and next to install. It will automatically install everything. Then just run the Thin Client, and you will
get to this interface. So, this is the connection interface, and what is important here is to click Advanced and then Options. The Thin Client allows you to synchronize folders with the virtual machine; this is how you can transfer files from your computer to the virtual machine that we are going to use. There are options for windowed or full screen, and local devices, whichever you want to connect. You need to set the folder that will be synchronized between your computer and the virtual machine. So, click Details, make sure that you have ticked the Drives checkbox, and add the path of your folder. You see, I have added this folder. Let me add it again. So, I click Add, and I need to select the
folder from here. It will be inside my R drive, inside Massed Compute. Let's see, here. This will be my folder. So, I click OK, and it says Read Only. If you want the virtual machine to only read from your folder and not be able to write to it, let it stay as Read Only. But if you want to download files from the virtual machine automatically into your folder, then you need to make it Read and Write; I am going to set it that way. There are also some optimizations and some security options if you wish; I am just leaving everything at default, with only Drives set. Okay, and there is End Existing Session. This means that it will restart your virtual machine, so be careful with that. Moreover, there is no way to turn off this virtual machine; there is only Terminate. So, you need to back up all the files that you need before you terminate your virtual machine. This is the only downside compared to RunPod,
but we are going to have a desktop environment and a very cheap, very powerful GPU here. So, how are we going to connect? Copy your login IP from here and paste it into the Server field. You see, the username, Ubuntu, is automatically set for us. Then copy the password; you don't need to show it. Paste the password here. When you set End Existing Session, it will restart the virtual machine. Moreover, if you get problems with folder synchronization, you should use End Existing Session, and it will allow you to synchronize folders. But be careful: this will close all of the running applications, so if you are training at that moment, your trainings will all be terminated. Be careful with that. Since we are just starting the virtual machine, I will just select this option. Click Connect. Then you will get this message; click Connect again, and you will get to this screen. On this screen, it may freeze or look frozen, so click Start to just skip that part, and
you will get to this screen. And in a moment, we are going to get, yes,
this is our virtual machine. You see, it is like a desktop computer; it uses an Ubuntu image, so this is the Ubuntu Desktop operating system. We have Kohya, Automatic1111 Web UI, and also OneTrainer pre-installed here, so you can start using them directly. For example, if I start OneTrainer, it will launch automatically for me, because they are all added into this virtual machine. We have worked with Massed Compute to make it super easy for you to use. So, it looks like our Windows installation has been completed, because I no longer see the installer running. When I click the Start UI.bat, it should start, and it started. You see, it says "no file training presets," but the presets are here; I don't know why it gave this error. So, this is the interface of OneTrainer. We have now installed OneTrainer, and we have also set up the Massed Compute virtual machine. What are we going to do as the next step? First of all, let's start the training on my computer. So, to start the training on my computer,
I am going to load my preset configuration. Don't worry, I am going to show all of the
configurations in this tutorial; they will not be hidden. However, I may find better configurations in the future, because I am doing research all the time. I did over 100 trainings for this tutorial. So, I am going to download the SDXL configuration. SDXL and SD 1.5 training are essentially the same: what changes is the resolution, the base model that you are going to use, and one more thing, which is the model type that you choose from here. For Stable Diffusion 1.5, you choose 1.5; for Stable Diffusion XL, you choose this. You see, it also supports Stable Cascade, PixArt Alpha, and Wuerstchen version 2 trainings. I haven't looked into these yet; I am only doing Stable Diffusion training for now, but I have amazing configurations for both Stable Diffusion 1.5 and Stable Diffusion XL. I did huge research on both models. So, let's go to our Patreon post to download
our best configuration. At the bottom, you will see the attachments: Tier One 10 Gigabyte training and Tier One 15 Gigabyte training. So, if your GPU has more than 16 gigabytes, you can use that one; it is the faster one. But let's start with the slow one. I will show you and explain all of the options. So, after downloading it, cut it, move into your OneTrainer folder, and you will see the Training Presets folder here. Paste it there. I am going to delete all of the other presets,
because we don't need them right now. Actually, I tested the included Stable Diffusion presets, and they weren't good, nothing like my presets. After that change, we need to restart OneTrainer, so I turn it off and click the Start UI.bat file. Now I can see the preset. After I select that preset, it will
load everything automatically for me. So, what are these options? Workspace Directory is where the checkpoints
and generated backups or sample images will be saved. This is the main saving directory. For each training, you can set a different directory, which may make it easier to manage and see everything. So, let's click this three-dots icon and go to our new installation. You see, I have a OneTrainer Workspace folder here. I am going to make another one: OneTrainer Video Workspace. Okay, I made a folder like this, entered it, and clicked Select Folder. You can also copy-paste the path. Let's make another one for the cache; let's say Cache 1. By the way, caching is extremely important. I suggest you set a different caching folder whenever you use different settings. Which settings, you may be wondering? Some of the settings require re-caching. Unfortunately, there is not a list of them, but anything that modifies the training dataset requires re-caching. So, if you select Only Cache, it will only
cache. Continue From Last Backup, we don't need; I never used it. Debug Mode is there for debugging. So, there is nothing else you need to enable here. I don't find TensorBoard useful for Stable Diffusion training. There is also Train Device; you can set this to CPU. I never tried it, but it would be super slow. Okay, now we are going to select our base model. You see, my configuration is set with a base model from Hugging Face. If I kept it this way, it would download the model into my Hugging Face cache folder and load it from there. However, I suggest you use a model that
you downloaded on your computer. I find that SDXL RealVisXL version 4 is very
good for realistic training. If you are looking for stylized training, then you should use a stylized model, but I am mostly focused on realism. Okay, from Hugging Face, you can click
this icon, and it will start the download. This model can also be downloaded on platforms without a desktop interface, such as RunPod: right-click and copy the link address, then use wget with that link to download it (and delete the trailing part of the URL). On Windows, you don't need any of this.
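As an illustration, a headless download on RunPod or any Linux machine would look roughly like this; the URL below is only a placeholder, so copy the real link address from the readme:

```bash
# download the model with wget; remove any trailing "?download=true" from the copied link
wget "https://huggingface.co/<model-repo>/resolve/main/RealVisXL_V4.0.safetensors"
```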
Moreover, by clicking this link, you can download the Hyper Realism version 3 model. This is the best SD 1.5 model that I have
found. I have a recent video where I have compared
160 SD 1.5 based models to find the best realistic model. If you watch this video, you will learn which
models are best for realism, for stylization, for anime, for 3D. So, this is an extremely useful video for
you to watch. That is how I have determined the best SD
1.5 model. I also did some testing of realistic models for SDXL, but I don't have a video for that yet. So, the model is getting downloaded, and now it has been downloaded. Let's move it into our Automatic1111 Web UI installation. I have fully automatic installers for Automatic1111
Web UI installation. So, I am putting it into my Stable Diffusion
models folder, like this. It was already there, but I downloaded it again to show you. Then click this three-dots icon and navigate to the folder where you downloaded it. You can put it into any folder you wish; it is not mandatory to put it here. Select it. You can also give its full path, like this. When you are using such a custom model, do not set a VAE. Moreover, only this Hugging Face VAE works; if you download a VAE onto your computer, it will not work yet in OneTrainer. So, when you use such a custom model, you don't need a VAE; you can use the embedded VAE. Now, this is super important: this is where the final saved model checkpoint will be. So, you need to set wherever you want to save your final checkpoint. I will set the final checkpoint inside my Stable Diffusion Web UI folder; let's say tutorial1.safetensors. Also, don't forget to set its extension, .safetensors, to avoid any issues. You see, .safetensors. Okay, so these are the very best settings
that I have found for this configuration. They may change for different models and different configuration setups, but these are the best configurations; just pause the video and look at them. Then move to the Data tab. In here, I only use Latent Caching. If you have images of different resolutions, you can also enable Aspect Ratio Bucketing. However, I suggest you use single-resolution images first; then, after you get some decent results, you can try training with Aspect Ratio Bucketing enabled and different resolutions, and compare them. Clear Cache Before Training means that it will re-cache every time you start training, even if you didn't change anything regarding the dataset. If you are unsure whether you need to re-cache
or not, you can enable this to be sure. But if you are not changing anything but just
trying different parameters with the same training dataset, you don't need to enable
it. Now, the Concepts. This is a super important part to set up accurately. First of all, I am going to add my training
dataset concept. So, click Add Concept, then click this Add icon. In here, I am going to give the name "training." It is enabled; if you don't enable it, it will not be used. Then you need to give the folder path of your training images. My training images are in here; they were automatically generated with my scripts, which I am going to show you. By the way, this is not the correct one; I need to use this one. Then you need to set their captions. OneTrainer supports three different captioning modes. The first one is "from text file per sample," which means that you can use any captioner to caption each image, and it will then read the caption file with the same name as the image file. What does this mean? Let's say you are using our Kosmos 2 image
captioner. The link for this is on the Patreon. This is an automated installer and batch captioner that I have prepared for Kosmos 2; let's just start it. You see, Kosmos 2 is an extremely efficient one. Let's start it in full precision, because we have the VRAM, and let's go to my training images dataset, which is in here. You see, there are also masks, which I will explain. I am going to delete the masks, and also the prompt. Maybe let's keep the prompt. Actually, I will just delete it, so you will understand the process. So, this is my raw folder. The automatic captioner has started, so just enter the path and batch caption the images. Kosmos 2 is an extremely good captioner, but I don't suggest captioning your training images if you are training a person. Okay, this error shouldn't be important. Let's look at the captions. Yes, each file is captioned. Let's open this. This is the batch captioning result, and each
file is captioned like this. You see, a man with dark hair and glasses
is standing in a room, looking at the camera. He is wearing a blue shirt and appears to
be in a home office." It generated a caption like that for each file. However, you also need to add a prefix to each one of the captions, such as "ohwx man" followed by a comma. This way, you will have a unique identifier in your captions if you use this methodology.
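If you do go the per-image caption route, a small hypothetical helper like this (GNU sed, for example on the Massed Compute Ubuntu machine) can prepend the rare-token prefix to every caption file the captioner produced:

```bash
cd /path/to/your/captioned/images
for f in *.txt; do
  sed -i 's/^/ohwx man, /' "$f"   # prepend the "ohwx man, " prefix to each caption
done
```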
But I don't suggest this methodology. Do you know why? I have a public Patreon post where I have explained it and compared the effect of image captioning: when you caption your images for a person training, it reduces the likeness greatly. Therefore, I don't suggest image captioning if you are training a person with as few as 10, 15, or 20 images. You can compare the image captioning effect yourself, but I don't suggest captioning when you are training a person. You can read that article in detail and see how it reduced the likeness of the model. So, we set the folder, which was this folder. Let's open it. This is the folder. What I am going to do is use "from single text file." There is also "from image file name." If you select this, the captions will be like
this: "image 2023 04 30" and such. So, don't use that unless you are sure. Use "from single file text." Now, what does this mean? This means that you generate a new file wherever
you want. It doesn't matter. Let's say "ohwx man" and then edit it. Let's edit it. ohwx man This is it. So, why am I using this as a caption? "ohwx" is a rare token, which will learn my
unique characteristics into this token. This is a rare token. Rare token means that there weren't many images
during the initial training on this token, and "man" is the class that I am going to
train my characteristics on. So, the model will know that I am a "man"
class, a man, and I am "ohwx man" So, this is super important to understand. This is the logic of training something unique,
something new, into the Stable Diffusion model. The specific characteristics are learned into
this rare token, and it utilizes the existing knowledge of the model with the "man" class. I have a more in-depth tutorial for this on
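Just to make the contents of these caption files concrete, here is a hypothetical way to create them from a terminal (on the Massed Compute Ubuntu machine, for example; on Windows, Notepad works just as well):

```bash
echo "ohwx man" > "ohwx man.txt"   # training concept caption: rare token + class
echo "man" > "man.txt"             # regularization concept caption: class only
```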
I have a more in-depth tutorial on this on my channel if you need it, but this is sufficient for now. So, I selected "from single text file." There are image variations; when you hover your mouse over them, it will show you an explanation. However, I am not using them. There are also text variations; I am not using those either. Repeating is extremely important. It means each image will be repeated one time in each epoch, and we will use a different repeating value for the regularization images' effect. OneTrainer does not support DreamBooth
training, so we are doing full fine-tuning. However, we are going to recreate the DreamBooth effect with another concept. There is also "loss weight," which determines how much weight these images will get. So, if you have an unbalanced training dataset, you can give different weights to each dataset and try to balance them. However, having an equal number of original images is better. Image augmentation: I don't use crop jitter or random flip. There are also some other options, but I don't use anything here. When you click "update preview," you will
see the images and text augmentation. Since I am not using captioning, this is also
not important. So, this is my first training concept, the training images dataset. Now, this is super important: people prepare very, very low-quality images for training. This dataset is also not great quality. Why? Because it has repeating backgrounds and repeating clothing. I took all of these images with my mobile phone myself, a Poco X3, which is not a great phone, in less than one hour. However, this dataset still has some quality. What I mean by that: for example, let's open this image. You see, it is extremely sharp, well focused, and it has great lighting. These things are super important to have. You need extremely well-focused, non-blurry images, you need extremely good lighting, and your training dataset images shouldn't contain other people, always only you. If you can also vary the backgrounds, the clothing, and the time of day, then you will have a much better dataset. There is also no limit to the number of dataset images, as long as they are high quality. But for now, I am using this medium-quality dataset to demonstrate, because I have seen much worse datasets. People have a really hard time preparing
a good dataset. Hopefully, I will make a dedicated tutorial
on how to prepare a dataset. But for now, we are going to use this one. Now, as I said, OneTrainer does not have DreamBooth, but we are going to recreate the DreamBooth effect by adding a regularization images concept. So, let's add another concept. In here, I give it any name I want and select the path; my regularization images are inside here. I am going to use 1024 x 1024, because all of my images are 1024, so I select it. I will show you the folder, and in here I will again use "from single text file," which will contain only "man." So, you see, these are my regularization images. There are 5200 regularization images, and
every one of them was manually prepared by me. I have spent literally weeks on this dataset. You can find the dataset link here. When you open this post, you will see all the details you need: where to download them, how to download them, and how to use them. So, the dataset is here. It is not mandatory, because you can train without it. However, if you train without it, you will get lower quality; I also tested that. In this public Patreon post, you will see the effect of using regularization images as a concept. It will improve our quality, flexibility, likeness, and everything else. These are ground truth images, perfect-quality real images, not AI-generated images, all collected from Unsplash, and Unsplash allows such usage. Okay, for these images, I am going to use
"man" So, this is the "man.txt" What is inside this "man.txt"? When I edit it, you will see that just "man"
because this is the class token that we are going to train ourselves on. Therefore, as we train ourselves on the "man"
class, the model will forget what is "man" because it will see only our images as a "man" Therefore, as we do training, it will overwrite
the previous knowledge of the model. However, with using ground truth regularization
images, we are going to re-feed the model with whatever the original "man" is, and we will
make the model better. But there is one tricky issue here. All of my images are real images. Therefore, it will forget, it will still forget images that contain "man"
but not real. What I mean is like anime "man" drawing or
3D "man" drawing, or such. You understand that, like cartoon. So, this training is focused for realism. If you do training with stylization, you may
not use this dataset, or you can extract "Lora" from the trained DreamBooth model and use it on
a, let's say, stylized model. So, this is my regularization images setup. You see, there is also an option of "include
subdirectories," but I don't have. So, there is an extremely crucial thing in
the setup of regularization images. You see, "repeating" is currently set as 1.0,
which means that in every epoch, it will train all 5200 images, which we do not want. What we want is training an equal number of
"man" images with our training images. So, how you can calculate it: open a calculator
and type your training images number, which is 15 in my case, divided by the 5200 images
because we have 5200 images. You will get 0.0028. So, I am going to type like this, copy-paste
it, and just change the final number one upper, so it will use exactly 15 images in each epoch
from the regularization images dataset, randomly selected in each epoch. I also disabled these two, "update preview,"
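Here is the same arithmetic written out, just as a worked example of the ratio described above:

```bash
# repeating = training images / regularization images
#           = 15 / 5200 ≈ 0.00288  →  round the last digit up  →  0.0029
python -c "print(15 / 5200)"   # prints 0.002884615384615385
```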
I also disabled these two options, clicked "update preview," and it is done. So, let's return to OneTrainer. We have now also set up our regularization images concept. You can add as many concepts as you want and
train all of them at the same time. This is the beauty of OneTrainer; its interface is easier to use in some cases. Okay, training. Now, this is super important. I have done over 100 trainings, literally 100 trainings, empirically, to find the best hyperparameters. These hyperparameters may not work if you change any of them. So, I am using the Adafactor optimizer, and the Adafactor settings are super important. These are the settings: do not enable "relative step" or "scale parameter." By the way, "stochastic rounding" gives the effect of full precision training. Hopefully, I will research that further as well and update the best configuration, but for now, this is the best. And "fused back pass": this is the newest optimization that OneTrainer brought, which Kohya still doesn't have. With this optimization, we are able to train with only 10 gigabytes of VRAM usage. So, if you have a higher-VRAM GPU, don't enable this, to speed up your training; but if you have a lower-VRAM GPU, such as an RTX 3060, then use it. "Learning rate scheduler": "constant." "Learning rate" is 1e-05. "Learning rate warmup" doesn't matter, because
we don't use linear, cosine, or anything like that; we are using "constant." Learning rate cycles is 1. "Epochs": now, this is super important. Let's say you are training yourself with 100 images; then 200 epochs may cause overtraining. Or let's say you are training with 10 images; then 200 epochs may not be enough training. So, there is no single number that works for everyone, but 200 is usually good. Still, you can make this 400, save frequent checkpoints, compare them (which I will show you), and find the best checkpoint. But for this tutorial, I will make it 200; you can change this. "Batch size": if you need speed, you can increase the batch size. However, batch size 1 gives the best quality when training a subject in machine learning. And this is "gradient accumulation steps"; I think gradient accumulation does not work with "fused back pass" due to the optimization. It is a fake batch size anyway; I never use it. "Learning Rate Scaler" is none. We train only Text Encoder 1 in the Stable
Diffusion XL model." You may be wondering why, because training
"Text Encoder 2" causes overtraining and doesn't bring any benefit. So, after I have tested every combination,
thoroughly, I only train "Text Encoder 1". By the way, you can also apply these hyperparameters
to "Kohya" and hopefully, I will make another tutorial for "Kohya" as well. So, "stop training" after 200 epochs. To be sure, you can make this 10,000 and select
this "never", so it will train the "Text Encoder 1" during the entire operation of the 200
epochs. However, let's say you only want to train
10 epochs "Text Encoder 1" then you can select this, it like this. But in all my trainings, I trained the "Text
Encoder" equally to training the model. This is also a beauty of "OneTrainer" that
you can set any specific number of training epochs. The "Text Encoder" learning rate is different,
much lower than the learning rate, so be very careful with that. If you use the same learning rate, it will
cook the model. And there is also "Clip Skip 1". I don't use it. This is useful when you are training anime. It is really or maybe training for stylization,
like generating yourself as anime drawing then it may be useful. Okay, Train "Text Encoder 2" is set as "offline"
and set as 0. Attention. Now, when we use this "fused backpass", I
think it is using a special attention mechanism, or maybe SDP by default. It doesn't make a difference. So, I let it default. EMA. EMA is extremely useful for training Stable
Diffusion 1.5 models. However, for SDXL, it doesn't bring any benefit,
so it is off. Now, gradient checkpointing: if you are training on an A6000 GPU, such as on Massed Compute, don't enable this; that will speed up your training. Gradient checkpointing doesn't make any difference in quality; it is only a trade-off between speed and VRAM usage. Training data types: BF16 and Float 32. Now, there is a very important thing: currently, with my settings, if your GPU is an old GPU that doesn't have bfloat support, then with this configuration you will not get the very best results. It really requires bfloat support. So, if you don't have bfloat support, use
the SD 1.5 configuration, which I will show at the end of the tutorial. Everything is the same, just the parameters
change. The resolution is 1024; this is the base resolution. If your images are of different resolutions and you use bucketing, I think it will downscale them, but I am not sure exactly how it handles that; you should ask the OneTrainer developer about this on his Discord or on GitHub. Therefore, I prefer to train with a single resolution, but you can train with different resolutions by enabling bucketing and see how it performs. Okay, we train the U-Net until the very end of the training, so you can set this to 10,000 and "never." The U-Net learning rate is the same as the main learning rate here. By the way, if you don't set these, it is supposed to use the learning rate you set above, but to be sure, I set them as well. I don't use Rescale Noise Scheduler; actually, I never tested it, so
I don't know. I don't change any of the default parameters
here, as you can see. I don't use AlignProp; when you hover your mouse over it, you can see what it does, but there is not much information. To learn about it, you need to check the OneTrainer wiki. You see, there are just so many things you can research. And masked training, now this is important. I have recently tested this with a lot of research, and it really improves the flexibility of the model. When you look at our readme file, you will see that I have tested masked training; this is also a public post. When you open it, you will see the images from masked training. You can download the full 1.5 gigabyte file and look at every image at full size, or you can download these half-size files and look at them like this. So, what I found is that, I think, a masked training
weight of 60% is the sweet spot, so you can use that. Why not use a lower weight? Because when you use a lower weight, the body proportions get broken: the head looks artificial on your body, because the model is not able to learn the proportions of your body and head. That is my finding. You don't strictly need to use it, because even without masked training you get very good results. But it helps if you want more flexibility, and especially if your dataset is bad; if your dataset is perfect, you don't need it. What is a perfect dataset? A dataset with no repeating clothing and no repeating backgrounds. So, if your dataset is perfect, you don't need this, but if it is not perfect, you may benefit from it. So, enable masked training and set the unmasked weight. Unmasked weight means that the non-masked areas will get 60% weight instead of 100%. So, the backgrounds and my clothing will get
60% weight during training, and my head will get 100% weight. This is how we get some more flexibility. And I also leave this area default. I don't change them. So, we also need to generate masks. How do we generate them? Let's go back to the... Let's go to the tools, and in here, you will
see Dataset Tools, Convert Model Tools, and the Sampling Tool. We are going to use the Dataset Tool. Click Open Folder and open the folder of your training images, which was this SDXL folder. Then you will see the images like this, and in here we will generate masks. Okay, in this section you see the folder is selected. You need to type a prompt; based on this prompt, it will generate the masks. So, I type "head," because I want my head to be masked. I don't change anything else. Create Masks. You see, the masks are now generated like this; my head is masked in all of the images. You should verify whether all images have accurate masks. You can also see that the masks are saved here. The white area means that it will get trained; the black area is the unmasked area, so those areas will get only 60% weight, 60% importance, instead of 100%. Now, you can generate samples during training. You can add a very detailed prompt. Actually, we need to add a sample first. You can set 1024 x 1024, any seed, and a prompt like "photo of ohwx man" and such. Or, you can click here and type a much more detailed
prompt like "photo of a ohwx man, negative prompt is like blurry," and you can set seed, CFG
scale, number of steps like 40, random seed, if you wish, and you can even select the sampler
like this. I find this the best sampler. So, during training, if you don't know how
many epochs you should train, or if you want to see the point where the model training
gets over-trained and the quality starts degrading, then you can generate samples. This would slow, of course, your training. You can also add multiple prompts like this. You see, you can add multiple prompts. All of them will be used. You can enable/disable any one of them. But I don't use these because I do x/y/z checkpoint
comparison at the end of the training. How do I do that? In the backup, you don't need backup. After this is backupping, I think, the diffuser
files, what you need is "save after." So, I am going to "save after" every 15 epoch,
which means it will generate 10 checkpoints, and each checkpoint will be like 6.5 gigabytes
because we are saving them in half precision. You can see output data type is Bfloat 16. This is half precision. So, I am going to generate 10 checkpoints,
and the prefix, this is very useful. After I contacted the developer of OneTrainer,
he added it, thankfully; I thank him. So, let's give it a name. What was our name? Tutorial one. So, let's say "Tutorial one." This will be the prefix of the generated backup files; it will save them like this. If you don't want this, you can set saving to never, or if you prefer, you can set it based on a number of steps, seconds, or minutes. It is really convenient to use. I will do that: every 15 epochs. Okay, everything is ready. Save your configuration as Config one, for
example, or whatever name you want. And let's check our VRAM situation. I have installed nvitop with pip install nvitop; this is a super useful tool. You can install it, then type nvitop, and it will open this screen.
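For reference, installing and launching it is just two commands, as described above:

```bash
pip install nvitop   # install the GPU monitoring tool
nvitop               # open the interactive VRAM / GPU usage view
```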
On this screen, you can see how much VRAM I am currently using: 15 gigabytes of VRAM. Why? Because we have two things open, I think. Oh, one of them is my Kosmos captioning UI that I started, and one of them is OneTrainer. So, that was the Kosmos error; I should check it and fix it later. Let's turn off Kosmos, and you see, now I am using 7.8 gigabytes. Now, people complain to me that their
training is super slow. Super slow training happens if your VRAM is
not sufficient, and it starts using shared VRAM. When you open your Task Manager, you will
see that there is shared GPU memory. If it goes over about 0.6 gigabytes, that means your computer has started using shared VRAM. Shared VRAM means that it is using your system RAM, and it will be at least 20 times slower than real VRAM. So, make sure that your VRAM usage is minimal before starting. How can you minimize it? You can turn off all of your startup items from here, restart your computer, and try to get it below 500 megabytes. You can easily get it under 500 megabytes, and you will have almost completely empty VRAM to start training. Okay, we have saved the configuration. Let's start training. First, it will cache the images, then it will
start training. This will take time. Since we have set up everything on Windows, now I am going to set up everything on Massed Compute. However, if you are starting from this point, I don't suggest that; you should also watch the previous part so you don't miss anything, even though I will also put a video section marker here. When you get this error, just cancel; you don't need to install any updates. Okay, I'm going to close this, because I'm going to start from the beginning. First, we will begin with uploading our training dataset into the Massed Compute virtual machine. For this task, I am going to use the shared folder that I have created. It was inside this folder; you see, there are already files that I have, and I even have the training images here. I already copy-pasted them, just in the regular way. I also put the configuration files there. Let's also put in the newest configuration file that we made. The configuration was inside the presets folder, don't forget that, and I will paste it into the Massed Compute
shared folder here. Then, on this screen, the main window that we are going to use, click the Home button and you will see all the folders here, including thin drives. When you enter the thin drive, you will see your computer's folder name, and the files and folders that I put inside that folder are synchronized here. It depends entirely on your upload speed, of course. I can see all of my folders and files here. So, let's copy this folder to wherever you want; I am going to copy it into home, into the Applications folder here. This is the folder where the applications are installed. So, I copy-pasted with Ctrl+C and Ctrl+V. Before starting OneTrainer, I am going to update it. The update instructions are written in
this readme file. Let's find the OneTrainer update. I'm going to search for it... Okay: how to update OneTrainer to the latest version. Just copy this and start a new terminal here, then paste it. Okay, it hasn't been copied yet; this is happening for some reason. So, I copy again and paste again. Still not copied. Let me copy again. Okay, why is it failing? Let's open a notepad. It appears that the copy button of the readme file is broken, so I select the text, right-click, copy, and paste here. Still not visible. Okay, probably I have an error. Yeah, it is copied here, so let's try. Interesting. Let's start another terminal. Okay, copy-paste is currently not working, so I will reconnect. I close the Thin Client. I am not going to use End Existing Session this time, because it would shut down everything, but if it still doesn't work, I will have to do that. So, let's copy the password one more time; this is the password. Copy it, paste it here, and connect. Okay, it is connected. Then, I am going to copy this command one more time. This is not mandatory, but you should do it, because a lot of bug fixes are happening all the time. And now you see, copy-paste is working, and it is going to update OneTrainer to the latest version automatically for you. Meanwhile, let's also download our regularization
images dataset onto Massed Compute. The regularization images link was here. You can log in to Patreon from your Massed Compute machine, or alternatively you can right-click and copy the link address, then open Firefox here and paste it. Okay, it still didn't copy; I don't know why it doesn't work the first few times. So, I will right-click and copy the link address again, then paste it here, and it will automatically start downloading for you. Let's watch the download from here: Show Downloads. It is being downloaded into the Downloads folder, so let's go to the Downloads folder. In here, you don't need to move it anywhere; right-click and select "Extract Here." You will see the extract operation happening here; this is where you can watch what is happening at any moment. It is extracting at about 200 megabytes per second, and our regularization images are ready to use. So, let's check the update process.
started icons, either you can click this icon to minimize them or Ctrl+Alt+D. Yes, Ctrl+Alt+D.
I also written this on here, Ctrl+Alt+D to minimize all. Then, double-click and start OneTrainer. Run OneTrainer. It will start the OneTrainer automatically
for us, like this. The rest is totally same with the first part
of the tutorial, same as setting them up on Windows, but I will set them here as well. So, my configuration is not visible. Why? Because I didn't copy-paste the preset. So, let's close this off, and I had put the
preset inside my ThinDrive, which was inside home, ThinDrives, Massed Compute. You can give any folder name. And which was my configuration, which was
this was. So, copy it. Where you need to copy it? Go to the home, apps. This is where the applications are installed. Inside OneTrainer, inside presets, and paste
it. So, I will just do Ctrl+V, and it is pasted. All right, you can also delete these all other configurations,
if you wish. Okay, let's start. OneTrainer again and it is starting. Let's select the configuration from here. Let's see. Okay, let's select the configuration from
here like this, and it will load everything. Unfortunately, the responsiveness of the UI in the virtual machine is not as smooth as on Windows. We need to re-set the workspace directory and everything. You can select the folders; this is the folder structure of the virtual machine, so I will use the desktop. Okay, then I will name it OneTrainer workspace, so it will be saved on the desktop of the virtual machine. This virtual machine is completely separate from your own computer; therefore, nothing you do here will affect your computer. It is running entirely on a different, remote machine. So, let's copy this, delete this, paste, and let's say OneTrainer cache. Okay, these are the same. In here you can also download the model and use it, or you can use the Stability AI base model. Let's use RealVis on Massed Compute again. The download links were here; let's see, this one. Let's copy the link address, close this, open a browser, paste it, and go, and we are there. Let's download it. It is getting downloaded. At the end of the tutorial, I will show the SD
1.5 configuration as well. Everything is the same, only the resolution
of the images changes. By the way, I forgot to explain how to resize your images to the correct resolution. This is a very important thing about your training images dataset: you need to prepare them at an accurate resolution. I prefer to make all of them 1024 x 1024. If you watch this tutorial, you will understand that zooming in on the subject and cropping, rather than simply squeezing the whole image down, is extremely important. I have automated scripts to do that, which you will see in that tutorial, or you can also resize them manually to the correct resolution. For example, these images are currently not at the correct resolution; I can resize them. Or, let's see, the raw ones: these were all generated at the right aspect ratio with my automated scripts. So, you can watch that tutorial to learn how to resize them, or you can resize them manually; it is totally up to you. You can also use birme.net, but I don't suggest it, because then the images will not be properly zoomed and cropped.
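If you want a quick command-line alternative for a centered zoom-crop to 1024 x 1024, a hypothetical sketch with ImageMagick would be the following; note this is not my automated script, which does smarter subject-aware cropping:

```bash
# resize so the shorter side becomes 1024, then center-crop to exactly 1024x1024
mogrify -resize 1024x1024^ -gravity center -extent 1024x1024 *.jpg
```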
Therefore, watching that tutorial is really useful for preparing the very best training images dataset. Okay, the model is downloaded. Then, I will move it into the installed Automatic1111
Web UI folder. So, cut it, go to home, then apps, Stable Diffusion Web UI, inside models. Where is it? Here, inside Stable Diffusion; this is where you put your models. Paste, and that is it. Then you can press Ctrl+C, which copies both the file and its path. Let's return to OneTrainer; this is the icon here. Delete this and press Ctrl+V to paste. You see, it pasted the entire path of the model automatically for me. You can also pick it from the file navigation. Let's delete the VAE, and now the output: the folder changes, of course. I want to save it into here, so I press Ctrl+L, which selects the folder path, copy it, delete this part, and paste it. You see, I copy-pasted the path like this; you can also type it however you wish. Let's call this one Massed Compute. Okay, this will be the full path. Okay, everything is the same; I have shown and explained everything
in the first part of the video, so I will quickly add the concepts. The first concept is train. Please watch the first part of the video,
the Windows part, where I have explained everything. So, click the path. Okay, I think, yeah, it opened here; you see, it was invisible, so I clicked it here to see it. Where did we copy them? We copied them into the apps folder, here. Okay, then let's go to the tools and set up the masked training here as well. By the way, why is this still lagging? Yeah, because it is still open here. Okay, let's select "from single text file." Why is this happening? It is opening the dialogs behind the window, so move the windows like this. You can write the caption file anywhere you wish. Currently it is here, so I need to save a new text file. I opened the Visual Studio Code editor. By the way, this file was changed incorrectly, so I need to change it back; yes, the original was like this. So, I type here: ohwx man, like this. Okay, we can't zoom; this is not good. So, let's open the text editor instead: ohwx man. Okay, I can't zoom in to show you a bigger font, but it's like this: ohwx man. Click save, and save it anywhere you wish; let's save it inside the apps folder as ohwx man, and the .txt extension is important: ohwx man.txt. It is saved inside apps as ohwx man. Let's go back. Yes, ohwx man is selected. Okay, it still opened another window here. Okay, the train dataset: this is the folder, this is the prompt, and I turn these off. Click preview, and it is set. Let's also generate the masks like we did before. Let's open the folder and generate
the mask. Let's type head, then Create Masks. The first time you click Create Masks, it will download the model; we can see it here. Yeah, it is downloading and generating masks. It is super fast, and the masks are generated. Okay, we are ready. Let's also add our regularization images concept. So, it is "reg," here. The path of the regularization images is inside the Downloads folder, which is here: Downloads folder, selected. Now, we set the repeating value as we calculated in the first part of the tutorial, 0.0029, so it will use only about 15 images in each epoch, like this, and we turn image augmentation off. We also need a prompt for it, which will be just "man." Let's save it with Ctrl+S as a man.txt file. Let's go back here and pick the folders. Yeah, it opened somewhere behind. See, you need to set the correct option first, "from single text file," otherwise you can't pick it. And let's pick the man file; repeating is set, everything is set, image augmentation is turned off. Let's create the preview. Yes, okay. So, we have set up the concepts in OneTrainer
as well. The settings are the same; however, here, since we have a huge amount of VRAM, I am going to use the speed-up settings. Let's type nvitop, and you can see we have 45 gigabytes of VRAM; actually it is 48, but I think it is reported like this. So, disable gradient checkpointing to speed things up, and also disable fused back pass here; it will use a huge amount of VRAM, but it will be much faster. Let's enable this and set it like this. I have explained everything in the first part, so do not skip it. And let's make this like the first part: okay, let's make this "never"; as usual, we don't change anything else, and let's also set up the backup. I will take 10 checkpoints, as in the Windows training, and the prefix here will be Massed Compute; Ctrl+C, and let's give it as the file prefix. So, this will be saved inside the workspace directory that we set, and the final model will be saved inside the folder we set earlier. Let's save this configuration as "fast," or whatever name you like. Okay, you see, the interface is a little bit slower. And start training. Currently, this will do the training on my
first GPU. What if I want to do another training on the other GPU, since I have two GPUs? All you need to do is go to home, then scripts. In here, you will see the OneTrainer.sh file; this is what starts OneTrainer. Open it with a text editor, and you are going to write a command in it. Which command? I have written it here; let's see, this export line. It tells the script to only see GPU 1. The GPU indices start from zero: this is the first GPU, you see, it shows index zero, and this is the second GPU. So, I just copy-paste the line here, and when I start another OneTrainer, it will automatically use the second GPU.
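The exported line in the readme is most likely the standard CUDA device mask; a hypothetical sketch of what gets added near the top of the OneTrainer.sh script would be:

```bash
export CUDA_VISIBLE_DEVICES=1   # make this OneTrainer instance see only the second GPU (indices start at 0)
```

The first instance, launched before this line was added, keeps using GPU 0.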
So, let's minimize everything with Ctrl+Alt+D and start another OneTrainer. I will demonstrate it to you. Let's also open the other window we have, the one that shows the VRAM usage. I will just load the preset that I have made. Not this one; preset loading is a little bit slow, unfortunately. Let's load the fast preset and hit start training,
and you will see that it will start using the second GPU. I just wanted to demonstrate this, so I will turn it off. Now, it has started on GPU 1; you see, the second process is using GPU 1. So, I am able to utilize both of the GPUs. By the way, this may have overwritten the caching folder, so I am not sure whether it broke the cache or not. The cache is here; I hope it didn't break the caching process, but maybe it did. So, what can we do? We can restart the process to be sure. Let's do that. Okay, I close them off, and currently, the
new OneTrainer instance that I am going to start will be using GPU 1 by default. Let's open it, select the fast preset, and I will make it re-cache. So, even though it has cached files, I will make it re-cache with Clear Cache Before Training, so it will start caching from zero, and we are set. Everything is set; it will do the training. Now let's look at our Windows training. You see, it is currently using 17 gigabytes of VRAM, and it was already using 7 gigabytes before we started training, so the training itself is only using about 10.2 gigabytes of VRAM, because that was our configuration. On Massed Compute, however, it will use a huge amount of VRAM; we will see. So, let's also start the Automatic1111 Web
UI on Massed Compute to see how it works. On Windows, it is so easy to use. In this tutorial, I have shown how to install
also Automatic1111 Web UI. However, if you want to use the auto installers,
I have auto installers for them as well. So, to automatically install Automatic1111
Web UI on Windows, go to this post, and you will see all of the details that we have,
the changes, the information, the scripts. Let's download the scripts. Let's extract it into our F drive, let's say,
video_auto_1111, it is pasted. Extract files here, and you will see that
several bat files. So, I am going to install Automatic1111 Web
UI with double-clicking. It will do everything automatically for me. It will install, and you will see a bunch
of other files, and what are they? So, I suggest you run the "Update Torch xFormers Install After Detailer Automatically" bat file after this installation has been completed. This installer also downloads some of the
best models automatically for you, as you are seeing right now. If you are interested in TensorRT, you can
also use this to automatically install it. If you are interested in ControlNet, you
can use this to install automatically, download all of the ControlNet models, which are over,
let's see, how many they were, which are over 50 models already. So, these scripts will automatically download
all of the models into the correct folders, including IP Adapter and Instant ID, which are really hard to install. This is an extremely useful script. I also prepared the same scripts for Massed
Compute. Let's open this link. This is the link that you need for Massed
Compute, and in here, you will find the Massed Compute version one zip file. Download this zip file. You will find the files and extract them into
your Massed Compute folder, like here. These are the content of the zip file, which
I can access already on Massed Compute. So, how to use them? Let's go to the thin drives, which is our
synchronizer drive. Let's go to the Massed Compute. So, which files do I need, if I want to download
and use ControlNet, IP Adapter, and everything? So, let's select, actually, from here to here. So, let's copy. Some of them are extra files that you are
seeing. I will show them too, don't worry. Let's go to the apps, and paste them here. So, you will see that they are getting pasted. The operation is here, as you see. Okay, they are done. So, on Massed Compute, what do I suggest to
you? I suggest you update the Automatic1111 Web
UI to the latest version. Also, in some cases, xFormers may not be automatically
updated for OneTrainer on Massed Compute. You can use this command to update it. And how do we update the Automatic1111 Web UI? With this command. So, copy it, open a new terminal, like a new window, right-click, paste it, and hit Enter. It will update the Automatic1111 Web UI to the latest version automatically. But you need to do this before starting the Automatic1111 Web UI. Then, you also need to add this to its startup file. So, let's copy this. How do we do that? Let's go back to the desktop of the Automatic1111 Web UI in Massed Compute. In here, start the Stable Diffusion settings file. It will open this sh file to modify, and paste it here. Okay, it didn't copy. So, let's copy again. This way, you can start using the latest
version of Automatic1111 Web UI on Massed Compute. However, if you need more, like, let's say, TensorRT, which is currently not working for some reason, or you want to install After Detailer, ControlNet, and Face Fusion, then you should execute this command. How? Copy it. By the way, you need to put these files inside the apps folder, which is in here. Let's go to the apps folder. You see, the scripts are here. Let's open a new terminal, and paste. Okay, it didn't copy again. I hate this. Copy, and paste, and it will update Automatic1111
Web UI with certain extensions. I am also going to show you the content of
this file, so even if you are not my Patreon supporter, you can still use it. So, let's open it with a text editor. And this is the content of this file. So, it is going to download and install the After Detailer extension, the ControlNet extension, and the Face Fusion extension. It's also going to install the necessary libraries for Face Fusion. This is it. This is the content of it.
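If you want to reproduce the core of that installer yourself, here is a minimal Python sketch of the idea: clone the extension repositories into the Web UI's extensions folder. The extensions path is a hypothetical example, and I am only listing the two repositories I am sure about; the Face Fusion extension and its extra libraries are handled by the real script:

import subprocess
from pathlib import Path

ext_dir = Path("/home/Ubuntu/apps/stable-diffusion-webui/extensions")  # hypothetical install location

repos = [
    "https://github.com/Bing-su/adetailer",             # After Detailer
    "https://github.com/Mikubill/sd-webui-controlnet",  # ControlNet
]
for url in repos:
    target = ext_dir / url.rstrip("/").split("/")[-1]
    if not target.exists():  # skip extensions that are already installed
        subprocess.run(["git", "clone", url, str(target)], check=True)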
If you support me on Patreon, I would appreciate that very much, because that way you will always get the most up-to-date version of the scripts, and you will be helping me to continue on this journey. So, you see, it is doing the necessary updates
of Automatic1111 Web UI with the necessary libraries. This will also update it to the Torch version
2.2.0, and xFormers 0.0.24. Another very nice thing about this virtual machine image template is that, by default, it comes with Python 3.10; let me check the exact version, I need to type python3, I think, yes, 3.10.12. So, this is a very rare template. In most of the templates that you will find, they install different Python versions, but on this machine, we have Python 3.10.12 automatically
installed. Okay, the update is continuing. Meanwhile, let's look at the training speed
on our Windows. So, it started training, and the speed is
2.5 seconds per it. And what does this mean? This means that each step is going to take this long, so how many steps am I going to train? I am going to train 200 epochs, so let's calculate. So, every image in my concepts will be trained for 200 epochs. How many concepts do I have? 2. In each concept, I am training 15 images, because I have 15 training images with repeating 1, and I have 5200 images with repeating 0.029. So, 15 + 15 = 30 images are trained per epoch. So 200 multiplied by 30 = 6,000 steps of training in total, and since each step is now taking about two seconds per it (I don't know why it became faster somehow), 6,000 steps at 2 seconds each will take 12,000 seconds, which, when we divide by 60, comes to about 200 minutes on my Windows machine.
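Just to make that estimate explicit, here is the same arithmetic as a few lines of Python; the numbers are the ones from my run, so plug in your own image counts and step time:

concepts = 2
images_per_concept = 15              # effective training images per concept per epoch
epochs = 200
seconds_per_step = 2.0               # observed speed on this Windows GPU

steps_per_epoch = concepts * images_per_concept       # 30
total_steps = steps_per_epoch * epochs                # 6000
total_minutes = total_steps * seconds_per_step / 60   # about 200 minutes
print(total_steps, total_minutes)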
And is this update done? Yes, let's close this terminal. Let's look at the training speed on Massed Compute, if it has started. No, it is still caching. So let's start the Automatic1111 Web UI now that
we made all these changes. Run Stable Diffusion, and it will start the
Automatic1111 web UI latest version with the most commonly used extensions. You can also install any extension like on
your computer here. It is exactly the same, nothing changes,
only the folders change. So you see, it is making the necessary updates
since it is starting for the first time. Our training on Windows is continuing, and you see, we got a checkpoint at 450 steps, which is equal to 15 epochs. You see, it also shows the epoch in the naming: 450 steps, 15 epochs. And if your training pauses when you click inside the cmd window, just hit Enter and it will continue. Okay, we got an error; yeah, we have basicsr as our error. I think I had fixed this, but it looks like it is not fixed. So let's fix this error. How are we going to fix it? We need to activate the virtual environment
of the Stable Diffusion and install it manually. Okay, so let's do it. Let's go to the home, apps, Stable Diffusion
web UI, start a new terminal here, and for activating this virtual environment here,
we are going to use this command: source ./venv/bin/activate Let's copy this and paste it here. Yes, it is activated. pip install basicsr and it is getting installed. Yes, it is installed. Okay, let's move back and start the Stable
Diffusion Web UI. I am going to add this to my automatic scripts,
so you won't have this issue. I will update the scripts. Oh, we got another error. This is weird. These are all errors from Face Fusion, so this is why you should support me: because you will not have these errors, or if you do have them, you can message me and I will fix them. So let's edit this to add installing basicsr, and which was the missing one? It was realesrgan, so pip install realesrgan. Let's also install this with our activated virtual environment. Okay, it's installed.
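If you would rather script this fix than type it in a terminal, here is a small Python sketch of the same idea; it calls the pip that lives inside the Web UI's virtual environment directly, which is equivalent to activating the venv first. The venv path is a hypothetical example, so adjust it to your installation:

import subprocess
# pip from inside the Web UI's venv installs into that venv, no activation needed
venv_pip = "/home/Ubuntu/apps/stable-diffusion-webui/venv/bin/pip"  # hypothetical path
subprocess.run([venv_pip, "install", "basicsr", "realesrgan"], check=True)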
Now we need to restart the web UI. Okay, which terminal was the web UI? Okay, let's close this terminal. Not this one. Okay, this is the OneTrainer update terminal. This is the OneTrainer training. Okay, not this one either. Okay, I think we had already closed it, so let's start the Stable Diffusion web UI again. Okay, it is started, and I don't see any error. Yeah, I think these remaining warnings are not very important,
and you see, you can either use it locally with this URL. How to use it? So, open the link, it will open it in the
browser, so you can use it locally on the remote computer, or you can use it on your
computer or even on your mobile phone with the Gradio share. You see, copy the link, and I can open it
in my browser. Let's open it here. So whatever I generate here will be generated
on the cloud machine, not on my computer. If you don't want Gradio share, what you need
to do is, when copy-pasting the startup parameters, which was here, just remove the share from
here that we have added in the beginning. If you remove it, then it will not start Gradio live share. And you see, I can just type, let's generate
100 images. Let's see the speed of image generation. By the way, currently, OneTrainer is running
on the second GPU, and the Automatic web UI is running on the first GPU, so they are not
blocking each other. How can I be sure? nvitop, and I can see that. So you see, the second GPU, which is OneTrainer,
is using 34 gigabyte VRAM because we have disabled all the VRAM optimizations, and its
speed is 1.2 seconds per it. And on my computer, its speed is, what was
it? Two seconds per it. You see the difference. It is faster. And this is the speed of the image generation
on the first GPU. It is 20 it per second for Stable Diffusion
1.5 model, which is here. Let's interrupt it. There will be, of course, some delay with
the Gradio live share because it will download the images from the Gradio live. It was really fast. Okay, let's load the RealVis XL model. If this screen takes forever, if this screen
doesn't work when you are loading, it can happen on your Windows or anywhere. That means the Gradio is bugged in your browser. So, turn off all of your browsers and reopen,
and it will fix it. Okay, it is loaded. This is 1024 since this is SDXL model, and
let's say this is the best sampling method that I like, and generate images. Massed Compute is totally secure, so no one
can see whatever you are doing. Okay, it is getting generated. It is like 6 it per second. You see, this is the image generation speed. It is almost equal to RTX 4090. Actually, it's a pretty decent speed, really,
really good speed, actually. 6 it per second. I don't know why, but TensorRT is currently
not working for some reason. I couldn't solve it yet, but hopefully, I
will solve it and update the Patreon scripts. And the training is continuing. So all I need to do now is just wait for the checkpoints to be generated, and compare them, either on my computer or on Massed Compute. Both are the same. It has been a while, so let's check out the
status of the training. When you are training, in the left bottom
of the screen, you will see the status of the epoch, the number of epochs completed,
and the current step of that epoch. So this is where you will see the status of
the training. In the cmd window, also, you will see some
of the messages. However, it doesn't show the current epoch. It only shows the current step at that particular
epoch. So let's see the status of the Massed Compute
training. To see that, I click this. This is the interface of the OneTrainer, and
already, 158 epochs have been completed. While training, I suggest you start uploading files to Hugging Face, because if you use folder synchronization to download models, it may not work, or it may be very slow. So how are you going to do that? It is so easy. We are going to start Jupyter Lab. You see, there is "Run Jupyter Notebook". So let's run it. It will start the Jupyter interface like this. Now, you can either load the file that you
have downloaded from Patreon, or you can make a new notebook. Either way works. You select Python3 ipykernel. So to load it, it was inside our thin drive,
Massed Compute. This is my folder. It may be different in your case, depending on how you set it up. And you see, there is the "Upload Hugging Face Notebook File". If you double-click it, it will not be opened correctly. So how are you going to open it? To open it in this interface, we click this "Upload" button. So we have to select it from this interface. Let's go to home, go to the thin drives, Massed Compute, and select the notebook file, "Upload Hugging Face". Open. This will open it correctly. You see, now it has appeared here. Double-click it. It will open like this. And these are the code cells that you need to run. There are several cells, and each one does
something different. So, first of all, let's install the Hugging
Face Hub and ipywidgets libraries. Okay, the cell has been executed. When you click here, yes, you see, we got
this screen. So what we are going to do right now is, go
to your Hugging Face settings, and in here, click "Access Tokens", and make a new token
with read and write permission, like "test test test", whatever the one you want, and
then copy it. By the way, you need to register a Hugging
Face account. It is free. Then paste it here at login. Once you're logged in, then you can upload
model files or generated images into a private repository. So, there are these code cells. What do they do? The first one allows you to upload a single checkpoint into a destination repository in your Hugging Face account. The next one lets you upload a folder, and the last one also uploads a folder, but in a better way. So, if you are going to upload an entire folder, you should use that last method. If you are not uploading an entire folder, just a single checkpoint, you should use the single-file cell.
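For reference, here is a minimal sketch of what those cells boil down to, using the huggingface_hub library. The token, repository name, and file paths are placeholders, so substitute your own values:

from huggingface_hub import login, upload_file, upload_folder

login(token="hf_xxx")  # the write token you create under Settings -> Access Tokens

# Single checkpoint upload
upload_file(
    path_or_fileobj="/home/Ubuntu/Desktop/OneTrainer/workspace/save/example-checkpoint.safetensors",  # hypothetical path
    path_in_repo="example-checkpoint.safetensors",
    repo_id="your-username/test11111",
)

# Whole-folder upload
upload_folder(
    folder_path="/home/Ubuntu/Desktop/OneTrainer/workspace/save",  # hypothetical path
    repo_id="your-username/test11111",
)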
So, let's upload a single checkpoint first. Where are our checkpoints saved? They are inside home, actually on the desktop, inside OneTrainer, workspace, inside save. So our checkpoints are saved here. Let's upload the first checkpoint. So, I press Ctrl+C. It will copy the path of
the file, then I will paste it here. Okay, maybe the copy-paste didn't work. Okay, the copy worked, but it doesn't let me paste. Let's open a new window. Yes, it is copied, but in here, why can't I... Okay, now I can paste it. So, first, I have to paste it into an editor, then I can copy it from there. So, this is the full path, and then we need to give the model name that it will have in the repository we are going to upload to. So, this is the name. I copy it, delete this one, and paste it like
this. And the repo ID, so this is the repository
that you are going to set on your Hugging Face. Let's click "New Model". You can make it either public or private. Test11111. Let's make it as private, so no one else can
access it. Click here to copy your Hugging Face repository. You see, username and repo name. Delete it, and click "Play" to upload. I will restart the notebook. Okay, so let's close the Jupyter Lab interface. Then, let's return back to the desktop, Ctrl+Alt+D.
Start the Jupyter Notebook. Okay, it is getting opened. Let's open the upload Hugging Face and play
again. It is done. Let's click "Play". Okay, you see, the widget is now started. Perhaps we had to restart the terminal to
get this widget. Then, let's copy our token one more time,
which was... Let's go to our settings. I'm going to delete this access token after
the video, so it's fine. Copy it, paste it, login, and yes, token is
valid. Now, I can start uploading. Okay, this is also saved. So, let's just click this and wait. By the way, we should see that it is running this cell, but the color of this cell is also not right. I wonder why. Oh, because this cell is set to raw. We have to make it a code cell; I was just playing with it earlier. So, you need to select "Code" from here. Then, let's just click "Play". Okay, now it is going to start uploading to
the Hugging Face repository. I suggest you do this while training. Why? Because this way, you will save time. You will not wait for the upload; you can upload files while training. And also, let's say you wanted to upload an entire folder. So, you can set the folder here. You can upload either the images or all of the generated models. You just change the folder path here. You change your repository name here to whatever repository you set up. For example, I set this and play this cell. So, this will upload everything one by one. You see, the upload speed is pretty decent. It is about 50 megabytes per second, which translates to roughly 400 megabits per second. And this way, you will be saving your time. Another thing is that our freshly installed Automatic1111 web UI has started. If you remember, we had installed it from
installed Automatic1111 web UI has started. If you remember, we had installed it from
here. So, after the initial installation, I can
also install ControlNet or I can update xFormers, install After Detailer, and the Face Fusion. So, you can see that currently, it is using
Torch 2.1.2 with xFormers 0.0.23, and After Detailer extension or ControlNet extension
is not here. So, I am going to install those extensions
and update everything automatically. I am also going to show you the code right now, because, as I said, there is no paywall in this tutorial. So, this is the code that installs everything automatically; you can write it yourself if you wish. And I am also going to show you the updater code. So, let's edit it. This is the code that is going to install
After Detailer automatically and update the Torch version and the xFormers version for
me. If you need Face Fusion on here, I can also
add it to my automatic installer. Just message me on Patreon, and I will add it for you. There is also an automatic installer and downloader for ControlNet that sets everything up in the correct folders, because that is hard to do manually. But don't worry, I have a full tutorial on
my channel regarding ControlNet. So, just type "YouTube SECourses ControlNet". And this is my grand master ControlNet tutorial. You see, it is 90 minutes. You can watch it to learn more about ControlNet if you are interested. However, if you use my automatic installers, it is better for you. There is also some extra information here regarding how to use the newest Instant ID and also IP Adapter Face ID. But if you ask my opinion, the ControlNet extension's IP Adapter Face ID and Instant ID implementations are way behind their original repository implementations. I have a one-click installer for Instant ID
and also IP adapter face ID. When we go to the Patreon exclusive post index
and search for instant ID, you see, I have instant ID for Windows, Runpod, and Linux,
and Kaggle notebook. And also for Face ID, we have the same. Let's find it here: IP Adapter Face ID Plus version 2, zero-shot face transfer. This is another standalone Gradio installer
for them. Hopefully, I will make a new tutorial for
instant ID. I already have a tutorial for face ID. If you want to watch it, just type "SECourses
face ID". And this is the tutorial for it. Okay, so our model has been uploaded into
our repository. Let's check it out. When I return back to my repository and refresh
here, I should see the model inside files and versions. And yes, I can see it. Now, I can download it and start using it
on my computer. If I want, I can also already use it on Massed
Compute. How? I can move the model file into the models,
or I can give this as a path of model loading. So, let's move the saved models into the models
folder. You don't need these YAML files; Automatic1111 web UI generates the YAML files automatically. To select all of the models here, while holding the Shift key, I select the first one and then the last one, and it selects all of them. Alternatively, you can right-click and select all, and then copy or cut. We don't need to copy; we should move them, so cut, and let's go to home, inside apps, inside Stable Diffusion Web UI, inside models, inside the Stable Diffusion folder, and paste. Now I can start using them on Massed Compute.
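If you ever want to do that move without the file manager, here is a small Python sketch of the same operation; both paths are hypothetical examples, so point them at your own save folder and Web UI installation:

import shutil
from pathlib import Path

src = Path("/home/Ubuntu/Desktop/OneTrainer/workspace/save")                    # hypothetical
dst = Path("/home/Ubuntu/apps/stable-diffusion-webui/models/Stable-diffusion")  # hypothetical

for ckpt in src.glob("*.safetensors"):  # the .yaml files can stay behind; the Web UI regenerates them
    shutil.move(str(ckpt), str(dst / ckpt.name))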
The training is not finished yet; it is still generating checkpoints. However, you see, we have already got the checkpoint of 180 epochs. So how are we going to test them? If you remember, we had started the Automatic1111
Web UI on Massed Compute. We can see its terminal somewhere around here. Yes, here. So there was a Gradio live URL and also a local one. Currently, it is running on the local URL as well, which is 127.0.0.1 on port 7860. This is the default port that the Automatic1111 Web UI starts on, and now I can do testing. How do I test? First of all, I need to refresh the models here
so they appear, you see, and then I refresh this interface so that these models will appear
in the x/y/z checkpoint, and then you need to type your prompts. For example, let's check out the first checkpoint,
150, because 150 epochs is a sweet spot that I found with my configuration, but it may not
be the same for you; it depends. Then what I'm going to do is type a prompt
that I like. You can find a lot of good prompts on my CivitAI
profile, so type CivitAI SECourses like this, then you will get my page. On my profile, click my profile, go to my
images, and you will see my generated images here with their png info. For example, for testing, let's pick a prompt
from here. Okay, like this one. This is an SD 1.5 image; I am going to use
this prompt. For SDXL, I don't need many negative prompts, and later I decided to remove the canon word because I think it is not really necessary. Okay, it's looking good. I prefer to do 40 steps for sampling. I am using DPM++ 2M SDE Karras; this is the best sampling method that I have found. Then we are going to generate at 1024 because
this is our base training resolution, and a very important thing is that you should
use After Detailer. Let me demonstrate to you why. So let's generate like 4 images to see
what we get and why we should use After Detailer. The reason is that the model learns our face and our body from a very limited number of images at a limited resolution; therefore, the face details are not very accurate when you generate distant shots, as opposed to close-up shots. So we in-paint the face automatically to improve
it; that is what After Detailer does. Okay, we got images. For example, this one, this one, or let's
see this one. Okay, I am going to improve this one's quality,
and there is this one. I don't know; we can also improve this one
too, whatever the one you wish. So where is the seed of this? Here, let's copy the seed, let's paste it
here, and let's make it batch size one and regenerate. So I am going to show you the effect of the
After Detailer. Okay, we are getting the same image. Okay, this is the image we got. You see, the face is not accurate; however,
the overall shape is accurate. So After Detailer can fix up to a certain
degree. To get perfect images, you need to have a
good training, and you can see that the model is not over-trained because the clothing details
are perfect; the background details are perfect. So this is not over-trained, and the anatomy
of the image is looking accurate. It depends on your taste. If you don't find this very natural, you can
generate more images to get a better image. So what we are going to do is, we are going
to type a prompt here, which is very important: photo of ohwx man. This is super important because this is what is most likely to improve your face. If you want a different pose in the image, you need to change it here as well; I will show that. In the detection settings, make it detect only the first face, and in the in-painting tab, I prefer to set the denoising strength to 0.5, and there is only one other thing that I change: use separate steps to improve the quality of the face, like 70. Let's try again. Now we will see that it will automatically in-paint the face. Okay, we forgot to enable After Detailer, so let's enable it and try again. With SD 1.5, you can do the same if you do 1024 training. And it is fixing
the face, and we got the image. You see, the face is much better. This image is not perfect; I would generate
more images to get a better one, but you see the logic. Also, if you want to upscale with the high-resolution fix, you can do that too. The crucial part is that you need to select some other upscaler here. If you select Latent, it will change the image a lot. For example, let's use our R-ESRGAN 4x+,
and let's upscale like 1.5, and we don't need to change denoising, but let's try. So let's upscale it. Currently, we are running on L40 GPU, so this
is faster than the A6000 GPU, which I suggest you use. It is faster, but the A6000 GPU is also not very slow. Okay, the upscale is being done at 1.5x. I think 1.5x broke some of the anatomy; let's see. So it is going to apply the face improvement to the upscaled image, and this is the upscaled image. Yes, I think the anatomy is broken at 1.5x. So you can do this at something like 1.25x and try again. These are the numbers that you need to play with. If you find the result too different, you can also
reduce the denoising strength. So you can play with these parameters until
you find your liking, but I use this in my images, the After Detailer. Okay, you see there are some new faces, but
they won't be changed because I have selected mask only the top k largest, and we got the
image. The anatomy is not accurate. So how can we fix the anatomy? By the way, this anatomy inaccuracy is also partly caused by the masked training, because we have reduced the weight of the background and the body; therefore, they are not fully accurate. So if you don't want any anatomical problems, you shouldn't use masked training. However, masked training improves the flexibility of the model; it allows you to generate better images. Also, if you don't like the generated image, you can skip it directly without waiting for it to finish. And moreover, there is a way to generate an
infinite number of images. I am going to show that in a moment, once
this image is done. Okay, it is fixing the face. The face quality is pretty decent. Yeah, the face in the background is not good in this case, though. So you can right-click here
and generate forever. So this is a way of generating forever. And how do we find the best checkpoint? Once you have decided on your prompt for comparison, you go to the script menu at the very bottom, where you see the x/y/z plot; you enable it, and in here, you will see checkpoint name. Select the checkpoints from beginning to end, like 15, 30, 45, 60, 75, 90; select all of them. Then I set the grid margins to 50 pixels, and
I generate 4 images for each case. Since this GPU is very powerful, we can also
make the batch size 4, and let's hit generate, and we can watch the progress from here. It will tell us how many images will be generated. These generated images will be saved in the
outputs folder, so I click that folder, and I can see images here. Let's change the view. Yeah, I think show list is better, and we
had generated some car images earlier, so can we make them look bigger? Yes, there is zooming. Okay, let's zoom in more. Okay, so you can see the images here. So how can you download all of these images? Since these are small images, you can use synchronization with your computer. Let's try it, or you can upload them to Hugging Face. So let's copy this folder and go to home, go to Thin Drives, Massed Compute, and paste the folder here. After I copy-paste the folder here, you need to wait first while it copies the files into the Massed Compute folder. Then I can go to my local Massed Compute folder, and you see, the folder has started appearing here. So now it is synchronizing the remote cloud folder with my local folder. They are being downloaded as you are seeing right now. I suggest you use this method for the images,
or still, you can upload them into the Hugging Face. If I were going to upload them into Hugging
Face, what would I do? What I would do is first zip all of the images. So how would I zip them? Right-click and let's see. Yes, there is compress, and you can give the zip any name, like test one; then it will be a single file. Then you can copy the path of this file with Ctrl+C, go to the text editor, open a new tab, paste it, then copy the path from there with Ctrl+C, return to your upload Hugging Face notebook, delete the old file path, paste the new file path, and give it any name you want. So you can upload the whole new zip file into your Hugging Face repository.
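The same zip-and-upload step can be scripted too; here is a minimal Python sketch of that idea, again with hypothetical paths and a placeholder repository name, and it assumes you are already logged in from the earlier cell:

import shutil
from huggingface_hub import upload_file

# Zip the Web UI outputs folder into a single file
archive = shutil.make_archive(
    "/home/Ubuntu/ThinDrives/MassedCompute/outputs-backup",  # hypothetical destination (".zip" is appended)
    "zip",
    "/home/Ubuntu/apps/stable-diffusion-webui/outputs",      # hypothetical folder to compress
)
# Push the single zip file to your repository
upload_file(path_or_fileobj=archive, path_in_repo="outputs-backup.zip", repo_id="your-username/test11111")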
Let's click play. This is a very small file, so it is hardly worth it, but you can do the same with 200 megabytes, and you see, it is uploaded. Now, when I go back to my Hugging Face folder,
it should appear here, and yes, here. So I can download it directly from this way
as well. Hugging Face has really become the backbone of AI. They give all of this to us for free. They are amazing, amazing. I hope that they get better and better every day. So I suggest you follow me on CivitAI. You will see my username as SECourses. Also, please star our GitHub repository. It is super important. When you click here, you will get to our main
page, and in here, please star it. This is super important for me. Please fork it. Please also watch it. If you also become my sponsor on GitHub, I
appreciate that very much. Moreover, in the top of this GitHub readme
file, you will see that you can support me on Patreon, you can buy me a coffee, you should
follow me on Medium, you should follow me on DeviantArt, you should subscribe to our
LinkedIn channel, you should follow me on LinkedIn, you should follow me on Twitter
if you wish. Unfortunately, our Udemy course is not available
right now, but I am working on it. So you can see my LinkedIn profile. You can follow me here. I have 4000+ followers. Hopefully, it will get even better. If you have any questions regarding this tutorial,
ask me by replying to the video, or you can also join our Discord and message me there. Either way works fine. And our x/y/z plot is getting generated right
now. Okay, meanwhile, x/y/z checkpoint comparison
is getting generated. You see the training has been finished. So what can we do? The last file is already saved inside the
given folder. Let's also move the last saved checkpoint. Let's go to home, apps, inside Stable
Diffusion Web UI, inside models, inside Stable Diffusion. Let's paste it. Let's also switch back to the list like this. Yeah, this is good. And let's find the last checkpoint. What was our name for the last checkpoint? It was this. So this is the 200 epochs checkpoint. Before terminating my Massed Compute, I need
to upload everything, or at least the one that I like most. So to upload everything, I am going to press Ctrl+L; it is going to select this folder path. Right-click and copy. Then I will return to my Hugging Face uploading notebook. I will change the path. Currently, it is set like that since this notebook is downloaded from Patreon, but you can adjust it however you need. So what was our repository name? It was this. Let's copy it. Unfortunately, there is no copy button, so select it
and press Ctrl+C, and then put it here. You see, there are the multi commits and multi commits verbose options; just hit play, and this will start uploading every file in that folder, with the correct folder structure, into my Hugging Face repository.
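For reference, that folder cell maps onto huggingface_hub's upload_folder; a minimal sketch is below. It assumes you are already logged in and that your huggingface_hub version still supports the experimental multi-commit flags, which may behave differently in newer releases:

from huggingface_hub import upload_folder

upload_folder(
    folder_path="/home/Ubuntu/Desktop/OneTrainer/workspace/save",  # hypothetical path to everything you want to keep
    repo_id="your-username/test11111",                             # placeholder repository
    multi_commits=True,          # split the upload into several commits with resume support
    multi_commits_verbose=True,  # print per-file progress
)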
Initially, it may take some time to prepare the commit; then it will start uploading everything. Meanwhile, our x/y/z checkpoint comparison
is almost completed. We can watch the progress here. By the way, since I am uploading at the same
time, it is also loading the checkpoint, so it will become a little bit slower. Let's press Ctrl+D. Okay, it is like this. Okay, you see, currently, it is using 41 gigabytes of RAM. It says, okay, this is... Yeah, we have 252 gigabytes of RAM, and we have a 1.2 terabyte hard drive with this virtual machine. We are only using about 200 gigabytes of the
hard drive. We can see the values here. Also, we can see them on the nvitop, which
I have shown you here. Yes, we can see how much RAM we are using,
how much CPU and GPU we are using. You can also start multiple Web UI instances
in the same way I have shown for OneTrainer: export CUDA_VISIBLE_DEVICES. That's all you need to do. Okay, let's watch the progress. I can also connect to this from my computer with Gradio live share, so that is your choice. Okay, it started uploading the files. While uploading this way, you will not see the files in the repository until it is completed, but when you go to the community tab, you will see a WIP (work in progress) "upload folder using Hugging Face Hub multi commit" entry. So this has resume capability as well. You will see that it is uploading the files, and as they get uploaded, these checkboxes get checked. So it is uploading. I have to wait for it to fully finish to see the files here. Okay, it is uploading. Our x/y/z plot is almost completed as well. So the x/y/z checkpoint comparison has been
completed. Sometimes you may not get the final image
here. When it is saved, you click this icon, it will open the folders, and then you go to the text2img grids folder; it is actually inside the grids folder, and at the very bottom, you will see a saved PNG file. It is a big file, 70 megabytes, so I will
copy it and paste it into my Thin Drive. Therefore, I will be able to open it on my
computer. So let's do that. We will see that it is pasting here. I think it is synchronous synchronization
not asynchronous, so it just pasted. Let's look at our Massed Compute device here. Okay, the grid is here. Let's open it. I am using Paint.NET, which is a free software. So here, look at each image. You see, in the first checkpoint, the face
is not like me, so as we move between the checkpoints, it will become more like me. Still not. Okay 45 epochs, still
not like me. By the way, you need to compare this face
with the face in the training dataset, not my current face, because there are some differences. You see, okay, not here. Okay, it becomes more like me, as you are seeing, at 75 epochs. The likeness should improve as we pass more
epochs. Yes, for example, here, here, we got... I think 135, pretty decent. This is a personal thing, so I can't say this
will work best. So this is 150 epochs, which I like in most
cases, then 165, but the model is far from being over-trained. Okay, I think it is a little bit over-trained here. So this is the way of comparing each checkpoint. I think the clothing also got over-trained
in the 180 epochs. You can see the quality of the clothing degraded. Let's compare it with 150, for example, or
135. I think the clothing is better in this one,
so you can see which checkpoint is performing best. The best clothing will be in the first checkpoint
because it is the least overtrained checkpoint. As you see, the clothing is looking better
than the last checkpoint, so this is the way of comparison. There is a trade-off between likeness and flexibility, so if you want more likeness, then you need to do more training, and the model
will become less flexible. However, it is still perfectly able to follow
the prompt. For example, let's try another prompt, another
complex one. Okay, I am going to disable x/y/z checkpoint
comparison because you should have understood the logic. So, make this none, and let's generate 4
images. Yes, decent. Okay, it is currently at the 90 checkpoint,
so let's pick the 150 checkpoint and let's say, photo of ohwx man wearing a yellow suit
and a red tie and a white shirt in an expensive hotel room. You see, none of my clothing was like this,
so I have picked fully different colors to see how much the model is able to follow my
prompt. Let's generate. I am, of course, using the photo of ohwx man
as the face prompt in After Detailer. I am not using the full prompt there because this way it focuses more on the face. Okay, the initial images are being generated. How much After Detailer can fix the face
totally depends on the quality of your training, and let's see how much it obeyed our prompt. If it obeyed perfectly, then that means the
model is not overtrained. People often ask me to try different expressions. If you want different expressions, then you
really should include them in your training dataset. Currently, my training dataset does not have
such different expressions, so the model capability of such different expressions will be limited. You need to add them into your training dataset. However, as you can see, it followed the prompt
perfectly accurately. The face is not the greatest one; however,
the jacket is here, the shirt is here, the tie is here, all colors are accurate. So, I can generate more images to get the
best one. Also, I can add some beautifying prompts like
hd, hdr, uhd, 2k, 4k. I can add some LoRAs if I want. It's all up to you, and 8k. So, it is all up to you to add more prompts
and negative prompts and LoRAs to improve your output. This is something more related to how well
you can prompt, and you can generate more images and find the best ones. For example, for the introduction part of
this video, I am going to generate a lot of images and pick some of the bests to show
you. This is how Stable Diffusion works. We can generate in masses, and we can use
the very best ones. Also, the base model that you use will make
a difference. Not all base models are perfectly trainable. For example, on Juggernaut XL, I didn't get
such good likeness, so not all models are properly trainable. Some models are better, and some models are
not. Moreover, masked training can cause irregular
anatomy. Therefore, I suggest you improve your training
dataset rather than depending on the masked training. So, let's see. I think this checkpoint could be a little
bit overtrained because on the face, I see some deformities. So, let's reduce it, like to 120 epochs, and
I will use the same seed by clicking here. We are doing 50 percent denoise. We are doing 70 separate steps for the face
inpainting. Okay, let's try again. The power of L40 is like, I think, RTX 4090
in terms of speed for inference. For training, it's a little bit slower in
the batch size 1, but you can increase the batch size to significantly increase training
speed if you need. Okay, we are getting the results, and I will
show you a different expression and how to do it with After Detailer. While it is generating, you can also open
the folder and watch the images here. Let's sort them by modified dates. Okay, you can see the last saved image here. Open it and see without waiting. The accuracy of the anatomy will depend on
how good the images you put into your training dataset are, because the model will learn the proportions
of the anatomy. For example, this image is looking pretty
good, pretty decent. Let's see the results. This, so it is up to you, whatever you like. And if you want a different expression like
smiling, photo of ohwx man, then you should add the expression here as well; otherwise,
the expression will be gone. Like, smiling photo of ohwx man, let's see
what we are going to get. But I don't have any smiling expressions in
my training dataset; therefore, it will be hard for the model to get a very high-quality
smiling expression. And some expressions are harder than others. Smiling expression is rather easy for the
model. There is also, hopefully, Stable Diffusion
3 is upcoming, and I think it will be much better. I will be hopefully the first one doing a
full training tutorial for it, releasing the training hyperparameters setup. So, follow me on Patreon. You see, the smiling expression is not that
great because the dataset does not have such great expressions. The dataset is all single expression, so if
you want different expressions, you should include the expression in the dataset. That is the way of doing it. Still, these could perhaps be counted as acceptable. Yeah, maybe this one. So, it is up to you; maybe use slightly smiling, because slight smiling is, I think, performing better than full smiling in my case. For example, let's try again. So, this is the logic of it. This is how you do it. There could be a better way of prompting,
including some LoRAs; however, it is up to you. I am not that much of an expert at prompting, to be fair. I just use it in my own use cases and demos. Let's see what we are going to get with slight smiling. I also prefer slight smiling over smiling. So, the expression should be written both here
and also here. This is not as good as manual face inpainting
in the inpainting tab, but this is an automated way. So, you can generate in masses and pick the
best one. However, if you inpaint manually, you can
get better results. That is for sure. Also, the YOLO detector here doesn't have a head model; it still only has face. I don't know why they didn't add a head option here. That could produce better results, to be fair. Okay, you see, the slightly smiling result is much
better because I have slightly smiling expression in the training dataset. So, how do we do inpainting? When you click this icon, it will go to the
inpaint tab with the image. So, you can change the size and carefully
mask the face. I'm not very careful right now. Carefully mask the face and type prompt, which
will be slightly smiling photo of ohwx man, nothing else. Then in here, we select only masked. We select the best sampler. You can select any number of steps, like 60
steps, and you can make it resize by 1. We are not upscaling. Denoising strength, whatever you wish, like
50%. You can also increase it. You should make the seed random, so that you
can get different results and pick the best one. And that should be the number of images, which
is here. Yeah, let's generate, batch size 4, and let's
see. Okay, let's just wait a little bit. You can always follow what is happening on
the terminal. This is the way of using AI applications. See, yeah, the terminal is here. This is the inpainting speed, but you need
to multiply this by 4 because we have batch size 4. It is over 6 it per second, and the inpainting is generated. Let's look at the different ones. You see, there is a difference. I think the first one is best. You see, like this. So, this is the way of generating different inpaintings, and this way, you can get the very best result. And then you can use our SUPIR upscaler to
improve the face with this selected base model to improve everything. Hopefully, I will also make a new tutorial
for our SUPIR application. That is really amazing. That's mind-blowing. What if you want to make a LoRA out of these
models? So, it looks like OneTrainer's model conversion is not a good option. You really should stick to the Kohya conversion. I have an explanation of how to do the conversion with Kohya if you are wondering. It is on my Patreon. This is a public post, so I go to the Patreon post index, and this is the article. Yes, in here, I explain how to extract LoRA models from a base model. You select these options in the Kohya menu, and you save it. Let's see, LoRA extraction, yeah, here: extract LoRA, inside utilities, LoRA, extract LoRA, save precision, load precision, and everything is here. Let's verify whether the models have been uploaded to Hugging Face. Not yet. You really should verify that everything has uploaded before you terminate your Massed Compute virtual machine. You see, it still says 8 to go, but these
are not all models. There are also some other files. We can also see the commit from the community
in WIP. Yes, you see, it shows the file uploaded and
the files waiting to be uploaded. Meanwhile, our Windows OneTrainer training has also completed. How do we know? You see, 200 epochs out of 200 epochs. It took way more time than the Massed Compute training because here we used the VRAM-optimized settings, not the speed-optimized ones. Therefore, it was way slower. So, where did we save the files? The folder was OneTrainer video workspace. So, when I enter this folder, I will
see that the saved checkpoints are here. To use them, I will move them into my freshly
installed Automatic1111 web UI. It is inside here, inside models, inside Stable
Diffusion, and when I move them here, I will be able to use them. Moreover, what was the final file? It was saved inside my other Automatic1111
installation. So, I just need to start my Automatic1111
web UI, and the models will be there to use. It is exactly the same as on Massed Compute. There is no difference. Everything is exactly the same, so I am not
going to repeat them on Windows again, but it is fully the same. You see, all the models are here. I can generate any image of mine, photo of
ohwx man wearing a leather jacket in a desert. Whatever the prompt you want, sampling steps,
selected sampling method, width, and height. These are super important. If you do SD 1.5 based training, don't forget
to set your resolution accordingly. Enable After Detailer, photo of ohwx man, and
detection. Only first face is mine. It allows you to inpaint different faces with
different locations as well. It is available on the After Detailer extension
page. Inpaint denoising strength, I find this very
good. Use separate steps and hit generate. So, this is my local computer, local training,
nothing different except this is running on my computer. The other one was running on the cloud, on
the Massed Compute cloud, and we are getting the image. The initial image is generated. Now, it is going to inpaint face. So, you see, this was the initial image face. However, even though the face details are
not great, you see, the head is my head; that matters. And the face is now fixed, and we got the
image. It's a great image. You can change your prompt to get better images. This is how you do it, how you use it on your
local computer. Okay, let's meanwhile look at the SD 1.5 configuration. So, where is the SD 1.5 configuration? In this post, I have also shared it here,
if you remember. And there are several configurations. This is Kohya, by the way. So, where is the OneTrainer? These are Kohya. Yeah, OneTrainer is here. OneTrainer, SD 1.5. Yeah, this is where you can download. We have tier 1 and tier 2. So, what is the difference? If you have a GPU that is 8 GB,
then you should use tier 2 configuration. Actually, I compared tier 1 and tier 2. There wasn't very much difference with the
latest version of xFormers, so don't worry about that. You can count both of them as tier 1. There isn't a difference. I compared it. There is a link here. And if you don't have a BF16 supporting GPU,
you still should use SD 1.5 training. SD 1.5 training is also amazing. We get amazing quality. So, let me show you the configuration of SD
1.5. The configuration is different, not the same
as SDXL. Remember that. I am going to open my OneTrainer. I have the configuration here. Let's start it. Okay, let's pick the configuration. Where is it? SD 1.5 slow. Yes, I'm going to show you the settings that I'm using. So, this is the first part. This is the same in all configurations. And in here, you select your base model. This is Hyperrealism Version 3. You can select a VAE, but you don't need to; they are all embedded, so you shouldn't select one. Model output settings, these are all the same. Now, the important part is the weight data types. With SD 1.5, you really need to train in full precision, float32. This is mandatory; the others do not work well. Also, you need to select base Stable Diffusion 1.5 here. Fine-tune is selected. I didn't research LoRA or embedding; my configuration is for fine-tuning only because
it is the best one. And in data, that is the same. Concepts are the same, but what differs is
you really should use 768-pixel resolution for training for this Hyperrealism Version
3 model. Some models support over 1024 pixels; however, I compared them, and I find that 768 is the sweet spot. So, that is the different thing. Also, you need to use regularization images of the same resolution. So, you see, the man images dataset is 768 by 768, and my training dataset is 768 by 768. Then, the training tab is also different. First of all, the Adafactor settings are the same as before. If you have high VRAM, you can disable this
to speed up. The learning rate is 7e-07. The rest is the same, like batch size, epochs,
and whatever. I don't suggest using more than 1 batch
size. Train text encoder, we do that. We never stop training it; let's make this like 10,000 to be sure. The text encoder learning rate is the same as the learning rate in this SD 1.5 configuration. Moreover, with SD 1.5, we use EMA. This is super important. You can run it on the CPU if you don't have sufficient
VRAM. However, if you have sufficient VRAM, use
it on the GPU. How do you know whether you have sufficient VRAM? When you do training, if it starts using shared VRAM, that means you don't have sufficient VRAM, and you need to reduce the VRAM requirements because it will become 20 times slower. So, these are the EMA settings: EMA decay is 0.999, and the EMA update step interval is 1.
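In case those EMA numbers feel abstract, here is roughly what they mean, sketched in a couple of lines of Python; in the real trainer these would be full weight tensors rather than single floats, so treat it purely as an illustration of the update rule:

decay = 0.999          # the EMA decay from the settings above
update_interval = 1    # the shadow copy is refreshed every training step

ema_weight, current_weight = 0.0, 1.0   # stand-ins for actual model weights
ema_weight = decay * ema_weight + (1.0 - decay) * current_weight
# Repeated every step, this keeps a smoothed average of the recent weights alongside the regular ones.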
Gradient checkpointing, again, saves VRAM but will make training slower. The resolution is 768. For attention, you should now pick xFormers. Actually, I compared the attention implementations with the
latest version of xFormers, and it didn't make a difference. In the past, it was making a huge difference,
but in my recent training, it didn't. Still, you can set this to default to get the best quality, but it will make training slower and use a huge amount of VRAM. So use xFormers, and use this on the GPU if your VRAM is sufficient; if not, use it on the CPU. And train U-Net: yes, we train it until the very end, like this, and the U-Net learning rate is the same. I didn't test the effect of the rescale noise
scheduler; I didn't test any of these yet. I also didn't test align prop. Masked training is the same as for SDXL, so you can train masked or not. When you do masked training, your anatomy may not be perfect, so that's the trade-off. I also didn't test this area, so these are the very best settings that I have found for SD 1.5. The rest is the same as for SDXL; there is no difference.
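To keep the SD 1.5 values from this walkthrough in one place, here they are as a small Python dictionary. The key names are descriptive labels I chose, not the exact OneTrainer field names, so treat it as a cheat sheet rather than a loadable preset:

sd15_settings = {
    "weight_data_type": "float32",         # full precision is mandatory for SD 1.5
    "training_resolution": 768,            # also use 768x768 regularization images
    "learning_rate": 7e-07,
    "text_encoder_learning_rate": 7e-07,   # same as the main learning rate
    "batch_size": 1,
    "train_text_encoder": True,            # never stop it; set the stop value very high
    "ema": "GPU",                          # or "CPU" if your VRAM is not sufficient
    "ema_decay": 0.999,
    "ema_update_interval": 1,
    "attention": "xformers",
    "gradient_checkpointing": "optional",  # enable to save VRAM, at the cost of speed
}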
Okay, we are almost finished uploading everything into the Hugging Face repository. As I said, with Massed Compute, it is
extremely important that you backup everything before you terminate your session because
once you click this terminate icon, it will delete everything. There is no stop option or permanent storage
option on Massed Compute yet. However, I am believing that they will add. Moreover, I am in contact with the Massed
Compute developers, and hopefully they will add more GPUs. You see, it became 5 available A6000 GPUs while we are recording. We can see the billing. Currently, my cost per hour is 2 dollars because I am using an L40 GPU. Let's say we were using 2 A6000 GPU instances: let's select our creator image, which is SECourses, apply our coupon, verify, and deploy. Okay, if you get this message, that means there aren't available GPUs on the same machine, so let's verify again. Don't forget to verify and check the price here; that is super important. Deploy, yes, it is deployed. This will only cost 31 cents per hour. Yeah, they are not separately displayed, but
you see that. Okay, let's terminate this one because we
don't need it. It will show you the instance name and instance
ID to be sure that they are matching, and it says that selecting terminate will delete
all data on this VM and recycle the machine. There is no way back. Let's terminate, and it is gone. My currently running instance is here, and
it is uploading with the 55 megabytes per second, which is a very decent speed. So, you really should read this readme file
very carefully. Read all of the links I shared here very carefully;
it will help you tremendously. This is super important. Don't skip reading this readme file very carefully. It is really, really important. Watch these videos that I have shared here. Ask me any questions from the Discord or from
the replies of the video. Hopefully, I will reply to every one of them. And if anything gets broken while you are
trying to use my Patreon scripts, please message me from Patreon. I fix them as soon as possible. Actually, the majority of my current time,
current employment, is maintaining the scripts. I have a lot of scripts, and it is really
hard to maintain everything. I am full-time working on this stuff, so this
is my main income, main source for continuing my life, paying my bills. Therefore, I hope that you understand it. I hope that you understand the importance
of supporting me. This tutorial is made after doing research
for weeks, so it was a huge tutorial, it was a huge task. This tutorial is literally the experience
of over 15 months, and if you are a company, or if you are interested in more professional
training like a bigger dataset, like training style, like training objects, or other stuff,
I am also giving private consultation. Just message me from LinkedIn or from Discord. I am also open to project-based working. I am open to every kind of collaboration,
so we can collaborate. Okay, you see, it says that it was merged. If I open this on this machine, it will say it does not exist because it is a private repository, but if I open it on my computer, I can see that all the model files we trained have been uploaded, along with whatever else we had
in that folder. We have, for example, 1.5 base model, we have
RealVis XL model, so this is the way of doing it. I hope you have enjoyed this tutorial, and
if your synchronization with the thin client doesn't work, what you need to do is, let me demonstrate: you close the thin client and open the ThinLink client again. Sorry, I said thin client; I mean the ThinLink client, which is not a very inaccurate name anyway. Then we need to get our password one more time, copy it, connect, check this "end existing session" option, and let's see what happens
when we do that because this is very dangerous. It will terminate everything running on the
pod. Therefore, all your unsaved progress will
be lost. Moreover, all your applications will be closed. Therefore, your training will be terminated;
your generation and other things will be terminated. Okay, something happens. Sometimes this may happen. Let's try again, and existing session, let's
connect, let's verify if our IP is accurate. Yes, it is accurate. You should always verify IP, username is matching,
starts. You see, it says mounting local drives. Wow, something is happening. I hope that this error is not a common one. Maybe we need to try several more times. This time I'm not going to select end existing
session because I did that already. Maybe that is the reason. It needs to wait more, so let's try again
without end existing session, start, mounting local drives, starting session. I think this time it will start. Yes, and you see, all of the applications
are gone, RAM usage is now 4 gigabytes. The data is, of course, remaining, but all
the running applications are terminated. However, there is no problem, no issues with
the data. I can see all the data here. I can start Stable Diffusion web UI and start
using it. This is almost equivalent to using the restart option from this menu at the top right: power off, restart. If you do a power off, you will probably lose everything; there is no way back. I am not sure, so I don't suggest that. Okay, there are also power settings. I think you should leave the default power
modes. Yeah, there is no problem. Okay, this is it. Now, when I terminate this session, everything
will be deleted, everything will be gone forever, and I won't be able to recover it. However, I have saved everything in my Hugging
Face repository. Therefore, there is no issues. I didn't save all the generated images, but
you know how to save it. I hope you have enjoyed this video. Please like it, subscribe to our channel,
support me on Patreon. You can see all our links here. Please also follow me on these links. I appreciate it, and also leave a comment
about my new voice. And by the way, maybe you noticed that my
voice changed because it is already 7 a.m. here, so my voice is degrading in quality,
but I would like to hear your opinion about my new microphone that I have purchased to
increase the sound quality. Hopefully, new more amazing tutorials are
on the way. Please also open the bell notification to
not miss anything, and in my channel, you can use this search icon to search anything
like ControlNet if you want to learn. You will see the ControlNet, like Stable Diffusion
if you want to learn, like type SDXL, and you will see the SDXL tutorials. I also have amazing playlists, so you can
look at all of my playlists. Hopefully, new amazing tutorials are on the
horizon. See you later.