Full Stable Diffusion SD & XL Fine Tuning Tutorial With OneTrainer On Windows & Cloud - Zero To Hero

Captions
Hello, everyone. Welcome to the most in-depth tutorial ever made for Stable Diffusion training. I have been doing Stable Diffusion training since 2022, so this tutorial is the cumulative experience of more than 16 months. As a result, I can confidently say that this tutorial is like an entire course that you would normally purchase for hundreds of dollars. In this tutorial, I am going to show you how to install OneTrainer from scratch on your computer and train Stable Diffusion SDXL and SD 1.5 based models locally. I will show the very best configuration parameters that I have found after more than 200 empirical research trainings, without any paywall, including masked training and the proper setup of training concepts. Moreover, I am going to show you how to train on Massed Compute cloud virtual machines at amazing discounted prices, with full privacy, if your computer is not powerful enough. The discounted price is 31 cents per hour for an A6000 GPU machine, which costs more than 70 cents per hour on RunPod as a comparison. The virtual machine we prepared on Massed Compute has a desktop interface and an operating system, so it will be as easy as using your own computer. Furthermore, I will show you how to utilize more than one GPU at the same time for different tasks on Massed Compute, such as running two separate trainings; the same strategy applies if you have more than one GPU in your own computer. In addition, I will show you how to caption your training datasets properly, and I will explain why your training can be extremely slow due to shared VRAM issues. Moreover, I will show how to use the very latest version of the Automatic1111 SD Web UI on Windows, Linux, and Massed Compute with amazing extensions such as After Detailer and ControlNet. Additionally, I will show how to get amazing pictures from your trained model with accurate After Detailer settings that improve faces in distance shots. This will help you generate high-quality images with ease. Finally, I will show how to upload and save generated model checkpoints, and literally anything else, to Hugging Face with a very easy Jupyter Lab notebook. Before we start this massive tutorial, I would like to ask you one thing: go to the Stable Diffusion link here and star our repository, fork it, watch it, and sponsor it if you like. I appreciate that very much. I also give one-to-one private lectures if you are interested; you can join our Discord and message me there. We also give huge support to everyone who joins our Discord channel; you can join from this link. Moreover, all the images I have shown during this introduction are shared at this link. Open it and you will be able to look at every image in detail with its PNG info data: just click an image and you will see the prompt and every detail that you need. So, let's begin. As usual, I have prepared an amazing GitHub readme file for this tutorial. It has all the links and instructions that you are going to need to follow along. The link to this file will be in the description of the video and also in the pinned comment. Moreover, since the video will be very long, I am going to put sections into the description so you can jump to any section you are looking for. Furthermore, I am going to add fully manually written English captions, and captions in other languages as well. So, if you are having trouble understanding my English, please watch the video with captions on. 
I am going to show you how to install on Windows and also on Massed Compute, step by step. First of all, I will begin by registering on Massed Compute and starting our virtual machine. Then I will install OneTrainer on my Windows 10 machine. It doesn't matter whether you have Windows 10 or Windows 11; it works exactly the same on both. If your GPU is sufficient, you don't need Massed Compute or any cloud computing; you can just use OneTrainer on your own computer with your own GPU. Please register on Massed Compute with this link, because we are going to use a coupon; I am not sure whether you need to register with this link for the coupon to work, but it is my referral link. Once you have registered, enter your billing information here. You can add your payment method, charge it, and delete it later if you want. Then we are going to do the deployment. This part is extremely important: our coupon is only valid for A6000 GPUs, and A6000 GPUs are more than sufficient for efficient training. For this tutorial I am going to start four A6000 GPUs, to show you how you can do a different task on each of them, and you need to choose our virtual machine template. Select Creator here, then select SECourses, and you will see my image; by the way, this image is also a pretty accurate DreamBooth result. Then type our coupon, or copy it from here, SECourses, enter it, and click Verify; it will show you the new price of the GPU. You see, currently 4 GPUs are not available, so let's reduce it to 2; that is not available either, so let's try 1. Okay, when I set it to 1 I can get 2 single-GPU instances, but when I set it to 2 I can't get any. The Massed Compute team keeps adding more GPUs, so even though 4 A6000 GPUs are not available right now, it is fine. After you enter your coupon code and click Verify, you get a huge price reduction: currently it is 0.62 dollars per hour, and when I click Verify it becomes 0.31 dollars per hour, with 48 gigabytes of VRAM, 256 gigabytes of storage, and six virtual CPUs. If we compare with RunPod, the RunPod price is 69 cents per hour for the same GPU on the community cloud, but on Massed Compute we are going to get it for only 31 cents per hour, less than half the RunPod price. Therefore, Massed Compute is giving us an amazing price for an amazing GPU. For this tutorial I am going to pick L40 GPUs, but you don't need that, and you don't need more than 1 GPU; I am only getting multiple GPUs to show you how to utilize more than one. As I said, you just need a single A6000 to follow this tutorial and do your training. Okay, select Creator, select SECourses. Unfortunately, the coupon will not work on other GPUs. Click Deploy. Okay, it says that I have reached the limit of running instances; you will get this message when there isn't a sufficient number of GPUs available, so I have to reduce the count. Unfortunately, they are also in high demand, so let's see. Yes, I can get 4 H100 GPUs, but that would be 10 dollars per hour, which is just too expensive, so I don't want to risk it. Okay, let's go with only 2 L40 GPUs; you will still understand the logic of using more than 1 GPU. You will get to this screen, and it will start initializing the virtual machine. Now it is time to install OneTrainer on our Windows computer. 
To be able to follow this tutorial and install OneTrainer on your computer, you need to have Python and Git installed; the C++ build tools and FFmpeg are optional, but it is better if you install them too. If you don't know how to install Python, please watch this tutorial, where I explain everything: how to install Python, how to set up its virtual environment, how to set its PATH, and everything else you will need, and also how to install Git. So how can you verify your Python installation? Open a CMD command line interface like this and type python, and you should see 3.10.11. By the way, 3.10.11 works with the Automatic1111 Web UI, Kohya, and any other major AI application you are going to use; it is the most compatible Python version, and it is the version I use in all of my tutorials. Then, how can you verify Git is installed? Just type git and you should get a message like this. If you also installed FFmpeg, typing ffmpeg should print its properties like this. Unfortunately, there is no easy way to verify the C++ tools installation, but everything is explained in that video, so watch it and set up your Python accurately. Once Python is set up, all you need to do is clone OneTrainer. OneTrainer is an open source trainer for Stable Diffusion; actually, it supports a lot of different text-to-image models, such as Stable Cascade, as well. It has excellent documentation and a wiki. Moreover, the developer is extremely active on Discord, so you can join their Discord channel and ask any questions you have. Also, don't forget to join our Discord channel; we have over 1000 active AI learners. So, I copied the repository URL, then I enter the folder where I want to install it. I am going to install it on my F drive. I already have OneTrainer there, so I will make a new subfolder, let's say tutorial OneTrainer. By the way, do not give your subfolder the same name as the repository: since the repository is named OneTrainer, do not name the subfolder OneTrainer as well, because it may cause issues. Then right-click to paste the copied text, hit enter, and it will clone the repository into this folder. Then enter the OneTrainer folder. OneTrainer comes with automatic installation and starting bat files, so I just double-click install.bat, and it will generate a new virtual environment and install everything automatically for me. What does this mean? It means that whatever it installs will not affect your other AI installations, such as the Automatic1111 Web UI or Kohya; everything will be located inside this virtual environment folder. Okay, let's return to Massed Compute, and we see that our instance has started. So how are we going to use this instance? To use Massed Compute, we need the ThinLinc client application. The link is here; when you click it, it will take you to the download options. I am going to download and use the Windows version. After downloading, just open it and click next, next, next to install; it will install everything automatically. Then run the Thin Client, and you will get to this connection interface. What is important here is to click Advanced and also Options. The Thin Client allows you to synchronize folders between your computer and the virtual machine. 
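If you prefer to do the verification and cloning from the command line, a minimal sketch of these steps could look like the following; the folder name is just the example used in this video, and the GitHub address is OneTrainer's public repository.

  # Quick sanity check of the prerequisites (3.10.11 is the version I use; yours may differ)
  python --version
  git --version
  ffmpeg -version

  # Clone OneTrainer into a new parent folder (do not name the parent folder "OneTrainer" itself)
  mkdir tutorial_OneTrainer
  cd tutorial_OneTrainer
  git clone https://github.com/Nerogar/OneTrainer.git
  cd OneTrainer
  # On Windows, double-click or run install.bat here; it creates the venv and installs everything.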
This is how you transfer files from your computer to the virtual machine we are going to use. There are display options, windowed or full screen, there are local devices you can choose to connect, and you need to set the folder on your computer that will be synchronized between the Thin Client and the virtual machine. Click Details, make sure the Drives checkbox is ticked, and add the path of your folder. You see, I have already added this folder; let me add it again. I click Add and select the folder: it is on my R drive, inside Massed Compute, here. This will be my folder, so I click OK, and it says Read Only. If you only want the virtual machine to read from your folder and not write to it, leave it as Read Only; but if you want to download files from the virtual machine automatically into your folder, you need to make it Read and Write, which is what I am going to do. There are also some optimization and security options; I leave everything at the defaults, with only Drives set. There is also an End Existing Session option, which restarts your virtual machine session, so be careful with it. Moreover, there is no way to simply turn off this virtual machine; there is only Terminate. So you need to back up all the files you need before you terminate your virtual machine. This is the only downside compared to RunPod, but in exchange we get a desktop environment and a very cheap, very powerful GPU. So how do we connect? Copy your login IP from here and paste it into the Server field. The username, Ubuntu, is set automatically for us. Then copy the password, you don't need to show it, and paste it here. When you enable End Existing Session, it will restart the virtual machine session. Moreover, if you have problems with folder synchronization, you should use End Existing Session, and it will let the folders synchronize; but be careful, this closes all running applications, so if you are training at that moment, the training will be terminated. Since we are just starting the virtual machine, I will select this option now. Click Connect, then you will get this message, click Connect again, and you will get to this screen. It may freeze or look frozen here, so click Start to skip that part, and in a moment we get to our virtual machine. You see, it is like a desktop computer with an Ubuntu image, so this is the Ubuntu Desktop operating system. Kohya, the Automatic1111 Web UI, and OneTrainer are all pre-installed, so you can start using them directly. For example, if I start OneTrainer, it launches automatically, because everything is already added to this virtual machine; we worked with Massed Compute to make it super easy for you to use. Meanwhile, it looks like our Windows installation has been completed, because the installer window is gone. When I click the start-ui.bat file, it should start, and it does. You see, it says there is no training presets file, but the presets are here; I don't know why it gave this error. So this is the OneTrainer interface. We have installed OneTrainer and set up the Massed Compute virtual machine. What is the next step? First of all, let's start the training on my computer. 
So, to start the training on my computer, I am going to load my preset configuration. Don't worry, I will show all of the configuration values in this tutorial; nothing will be hidden. However, I may find better configurations in the future, because I am researching all the time; I did over 100 trainings just for this tutorial. So, I am going to download the SDXL preset. SDXL and SD 1.5 training are exactly the same; what changes is the resolution, the base model you use, and the model type you choose from here: for Stable Diffusion 1.5 you choose 1.5, for Stable Diffusion XL you choose this. You see, it also supports Stable Cascade, PixArt Alpha, and Wuerstchen version 2 training. I haven't looked into those yet; I am focused on Stable Diffusion training for now, but I have amazing configurations for both Stable Diffusion 1.5 and Stable Diffusion XL after huge research on both models. So, let's go to our Patreon post to download the best configuration. At the bottom you will see the attachments: a Tier One 10 Gigabyte training preset and a Tier One 15 Gigabyte training preset. If your GPU has more than 16 gigabytes of VRAM, you can use the 15 Gigabyte one, which is faster. But let's start with the slower one, and I will explain all of the options. After downloading it, cut it and move it into your OneTrainer folder; you will see the training presets folder there, so paste it inside. I am going to delete all of the other presets because we don't need them right now; actually, I tested the built-in Stable Diffusion presets, and they weren't good, nothing like my presets. After that change we need to restart OneTrainer, so I close it and click the start-ui.bat file again. Now I can see the preset, and after I select it, it loads everything automatically for me. So what are these options? The Workspace Directory is where the checkpoints, generated backups, and generated sample images will be saved; this is the main saving directory. For each training you can set a different directory, which makes it easier to manage and see everything. Let's click the three-dots icon and go to our new installation. You see, I have a OneTrainer Workspace folder here; I am going to make another one, OneTrainer Video Workspace. I enter it and click Select Folder. You can also copy-paste the path. Let's make another folder for the cache, say Cache 1. By the way, caching is extremely important: I suggest you set a different caching folder whenever you change certain settings. Which settings, you may wonder? Some settings require re-caching; unfortunately there is no list of them, but anything that modifies the training dataset requires re-caching. If you select Only Cache, it will only cache. Continue From Last Backup we don't need; I never used it. Debug Mode is for debugging. There is nothing else you need to enable here; I don't find TensorBoard useful for Stable Diffusion training. There is also Train Device; you could set it to CPU, but I never tried that and it would be super slow. Okay, now we are going to select our base model. You see, my configuration is set with a Hugging Face base model; if I kept it this way, it would download the model into my Hugging Face cache folder and load it from there. However, I suggest you use a model that you have downloaded onto your computer. I find that RealVisXL version 4 is very good for realistic SDXL training. 
If you are looking for stylized training, you should use a stylized model, but I am mostly focused on realism. Okay, from Hugging Face you can click this icon and the download will start. You can also download it on platforms without a graphical interface, such as RunPod: right-click, copy the link address, then use wget with that link to download it; just delete this trailing part of the URL. On Windows you don't need that. Moreover, by clicking this link you can download Hyper Realism version 3, which is the best SD 1.5 model I have found. I have a recent video where I compared 160 SD 1.5 based models to find the best realistic model; if you watch it, you will learn which models are best for realism, stylization, anime, and 3D, so it is an extremely useful video. That is how I determined the best SD 1.5 model. I also did some testing of realistic models for SDXL, but I don't have a video for that yet. So, the model is downloading, and now it has been downloaded. Let's move it into our Automatic1111 Web UI installation; I have fully automatic installers for the Automatic1111 Web UI. I am putting it into my Stable Diffusion models folder, like this. It was already there, but I downloaded it again to show you. Then click the three-dots icon and navigate to the folder where you downloaded it; you can put it into any folder you wish, it is not mandatory to put it here. Select it; you can also give its full path, like this. When you use such a custom model, do not set a VAE. Moreover, only the Hugging Face VAE works: if you download a VAE onto your computer, it will not work in OneTrainer yet. So when you use a custom model like this, you don't need a VAE; you can use the embedded VAE. Now, this is super important: this is where the final saved model checkpoint will be. Wherever you want to save your final checkpoint, set it here. I will set the final checkpoint inside my Stable Diffusion Web UI folder, say tutorial1.safetensors. Don't forget to include the extension, .safetensors, so you don't have any issues. Okay, these are the very best settings I have found for this configuration; they may change for different models and different configuration setups. Just pause the video and look at the configuration values. Then move to the Data tab. Here I only use Latent Caching. If you have images of different resolutions, you can also enable Aspect Ratio Bucketing; however, I suggest you use single-resolution images first, and after you get decent results, try bucketing with mixed-resolution training and compare. Clear Cache Before Training means it will re-cache every time you start training, even if you didn't change anything in the dataset. If you are unsure whether you need to re-cache, enable it to be safe; but if you are only trying different parameters with the same training dataset, you don't need it. Now, the Concepts. This is a super important part to set up accurately. First of all, I am going to add my training dataset concept: click Add Concept, then click this icon. Here I give it the name training, and it is enabled; if you don't enable it, it will not be used. Then you need to give the folder path of your training images. 
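As an illustration of the wget approach on a machine without a browser, a download command could look like the line below; the repository path and filename here are an assumption, so always copy the real link address from the model's Hugging Face page.

  # Illustrative only: replace the URL with the link address you copied from Hugging Face,
  # and remove any "?download=true" suffix from the end of the copied link.
  wget https://huggingface.co/SG161222/RealVisXL_V4.0/resolve/main/RealVisXL_V4.0.safetensors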
My training images are inside this folder, which was automatically generated with my scripts, scripts that I am going to show you. By the way, this is not the right folder; I need to use this one. Then you need to set the captions. OneTrainer supports three captioning sources. The first one is "from text file per sample", which means you can use any captioner to caption each image, and OneTrainer will read the caption from a text file with the same name as the image. What does this mean? Let's say you are using our Kosmos-2 image captioner; the link is on Patreon. This is an automated installer and batch captioner that I prepared for Kosmos-2. Let's start it. Kosmos-2 is an extremely efficient captioner. Let's run it in full precision, because we have the VRAM, and go to my training images dataset, which is here. You see, there are also masks, which I will explain later; I am going to delete the masks, and also the prompt file. Actually, maybe I should keep the prompt; no, I will delete it, so you understand the process. So this is my raw folder, and the automatic captioner has started. Just enter the path and batch caption the images. Kosmos-2 is an extremely good captioner, but I don't suggest captioning your training images if you are training a person. Okay, this error shouldn't be important. Let's look at the captions. Yes, each file is captioned; let's open this one. This is the batch captioning result, and each file is captioned like this: "a man with dark hair and glasses is standing in a room, looking at the camera. He is wearing a blue shirt and appears to be in a home office." It generated a caption for each file. However, you would also need to add a prefix to each caption, such as "ohwx man" followed by a comma, so that you have a unique identifier in your captions if you use this methodology. But I don't suggest this. Do you know why? I have a public Patreon post where I explain and compare the effect of image captioning: when you caption your images for a person training, it greatly reduces the likeness. Therefore, I don't suggest image captioning when you are training a person with as few as 10, 15, or 20 images; you can also compare the captioning effect yourself. You can read that article in detail and see how captioning reduced the likeness of the model. So, we set the folder, which was this one. Let's open it. What I am actually going to do is use "from single text file". There is also "from image file name"; if you select that, the captions will be the file names, like "image 2023 04 30" and such, so don't use it unless you are sure. Use "from single text file". What does this mean? It means you create a new text file anywhere you want, it doesn't matter where, containing "ohwx man". Let's edit it: ohwx man. That is it. So why am I using this as the caption? "ohwx" is a rare token, which will learn my unique characteristics. A rare token means there weren't many images associated with this token during the model's original training, and "man" is the class that I am going to train my characteristics on. So the model will know that I belong to the "man" class, and that I am specifically "ohwx man". This is super important to understand: this is the logic of training something unique, something new, into the Stable Diffusion model. 
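If you do decide to keep per-image captions despite my advice, a minimal sketch of adding the rare-token prefix to every generated caption file could look like this; the folder path is just a placeholder for your own caption folder.

  # Prepend the rare-token prefix to every caption .txt in the dataset folder.
  # Path is a placeholder; run from Git Bash on Windows or a Linux terminal.
  for f in /path/to/training_images/*.txt; do
    sed -i '1s/^/ohwx man, /' "$f"
  done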
The specific characteristics are learned into this rare token, and the training utilizes the existing knowledge of the model through the "man" class. I have a more in-depth tutorial about this on my channel if you need it, but this is sufficient for now. So, I selected "from single text file". There are image variations; when you hover your mouse over the option, it shows an explanation, but I am not using them. There are also text variations, which I am not using either. Repeating is extremely important: a value of 1 means each image is trained once per epoch, and we will use a different repeating value to get the effect of regularization images. OneTrainer does not support DreamBooth training directly, so we are doing full fine-tuning; however, we are going to recreate the DreamBooth effect with another concept. There is also Loss Weight, which sets how much weight these images get. If you have an unbalanced training dataset, you can give different weights to each dataset and try to balance them; however, having an equal number of original images is better. Image augmentation: I don't use crop jitter or random flip, and there are some other options I don't use either. When you click Update Preview, you will see the image and text augmentation previews. Since I am not using captioning, the text part is not important. So, this is my first training concept, the training images dataset. Now, this is super important: people prepare very, very low-quality images for training. This dataset is not great quality either. Why? Because it has repeating backgrounds and repeating clothing. I took all of these images myself with my mobile phone, which is not a great phone, a Poco X3, in less than one hour. However, this dataset still has some quality. What do I mean by that? For example, let's open this image: you see, it is extremely sharp, well focused, and it has great lighting. These things are super important. You need extremely well-focused, non-blurry images with very good lighting, and your training dataset images should not contain other people, only you. If you can also vary the background, the clothing, and the time of day, you will have a much better dataset. There is also no limit on the number of dataset images, as long as they are high quality. But for now, I am using this medium-quality dataset to demonstrate, because I have seen much worse datasets; people have a really hard time preparing a good dataset. Hopefully I will make a dedicated tutorial on how to prepare a dataset, but for now we are going to use this one. Now, OneTrainer does not have DreamBooth, as I said, but we are going to recreate the DreamBooth effect by adding a regularization images concept. Let's add another concept; here I give it any name I want and select the path. My regularization images are inside this folder. I am going to use 1024 by 1024 because all of my images are 1024, so I select that. I will show you the folder, and here I will again use "from single text file", which will contain only "man". You see, these are my regularization images: there are 5200 of them, and every one of them was manually prepared by me; I literally spent weeks on this dataset. You can find the dataset link here. When you open this post, you will see all the details you need: where to download them, how to download them, and how to use them. So, the dataset is here. 
This is not mandatory, because you can train without them; however, if you train without them, you will get lower quality. I tested this too: in this public Patreon post you will see the effect of using regularization images as a concept. They improve our quality, flexibility, likeness, and everything else, because these are ground-truth images, perfect-quality real photos, not AI-generated images; they are all real images collected from Unsplash, and Unsplash allows this kind of usage. Okay, for these images I am going to use "man", so this is the man.txt file. What is inside man.txt? When I open it, you see it contains just "man", because this is the class token we are training ourselves on. As we train ourselves on the "man" class, the model would otherwise forget what a "man" is, because it would only see our images as a "man"; the training would overwrite the model's previous knowledge. However, with ground-truth regularization images, we keep re-feeding the model with what the original "man" looks like, and we actually make the model better. But there is one tricky issue: all of my regularization images are real photos. Therefore, the model will still forget non-realistic images that contain a "man", such as anime drawings, 3D renders, or cartoons of a man. So this training is focused on realism. If you are training for stylization, you may not want to use this dataset, or you can extract a LoRA from the trained DreamBooth-style model and use it on, let's say, a stylized model. So, this is my regularization images setup. You see, there is also an "include subdirectories" option, but I don't have subdirectories. Now, there is an extremely crucial point in the regularization setup: Repeating is currently set to 1.0, which means that in every epoch it will train on all 5200 images, which we do not want. What we want is to train on an equal number of "man" images to our training images per epoch. Here is how to calculate it: open a calculator and divide your number of training images, 15 in my case, by the 5200 regularization images. You get about 0.0028. I copy-paste that and round the final digit up to 0.0029, so it will use about 15 randomly selected images from the regularization dataset in each epoch. I also disable these two augmentation options, click Update Preview, and it is done. So, let's return to OneTrainer; we have also set up our regularization images concept. You can add as many concepts as you want and train all of them at the same time; this is the beauty of OneTrainer, and its interface is easier to use in some cases. Okay, the Training tab. Now, this is super important. I have literally done over 100 trainings, empirically, to find the best hyperparameters, and they may not work if you change any of them. I am using the Adafactor optimizer, and the Adafactor settings are super important; these are the settings. Do not enable Relative Steps or Scale Parameter. Stochastic Rounding, by the way, gives the effect of full-precision training. Hopefully I will also research that further and update the best configuration, but for now this is the best. And the Fused Back Pass: this is the newest optimization that OneTrainer brought, which Kohya still doesn't have. With this optimization, we are able to train with only 10 gigabytes of VRAM usage. 
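As a quick sanity check of that repeating value, using the numbers from this tutorial (15 training images, 5200 regularization images), the arithmetic looks like this:

  # Repeating value for the regularization concept = training images / regularization images
  python3 -c "print(15 / 5200)"          # 0.00288..., rounded up to 0.0029 in the UI
  python3 -c "print(0.0029 * 5200)"      # ~15 regularization images sampled per epoch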
So, if you have a higher-VRAM GPU, don't enable this, to speed up your training; but if you have a lower-VRAM GPU, such as an RTX 3060, use it. Learning Rate Scheduler: constant. Learning Rate: 1e-05. Learning Rate Warm-up doesn't matter, because we are not using a linear or cosine schedule; we are using constant. Learning Rate Cycles is 1. Epochs: now, this is super important. Let's say you are training yourself with 100 images: then 200 epochs may cause overtraining. Or let's say you are training with 10 images: then 200 epochs may not be enough. So there is no single number that works for everyone, but 200 is usually good. Still, you can set 400, save frequent checkpoints, compare them, which I will show you, and find the best checkpoint. For this tutorial I will set it to 200; you can change it. Batch Size: if you need speed, you can increase the batch size; however, batch size 1 gives the best quality for training a subject in machine learning. Gradient Accumulation Steps: I think gradient accumulation does not work with the Fused Back Pass optimization; it is a fake batch size anyway, and I never use it. Learning Rate Scaler is none. We only train Text Encoder 1 in the Stable Diffusion XL model. You may be wondering why: training Text Encoder 2 causes overtraining and doesn't bring any benefit. After testing every combination thoroughly, I only train Text Encoder 1. By the way, you can also apply these hyperparameters to Kohya, and hopefully I will make another tutorial for Kohya as well. Stop Training After: 200 epochs. To be safe, you can set this to 10,000 and select "never", so Text Encoder 1 is trained during the entire 200 epochs. However, if you only want to train Text Encoder 1 for, say, 10 epochs, you can set it like this. In all my trainings, I trained the Text Encoder for the same duration as the model. This is another beauty of OneTrainer: you can set any specific number of training epochs per component. The Text Encoder learning rate is different, much lower than the main learning rate, so be very careful with that: if you use the same learning rate, it will cook the model. There is also Clip Skip, set to 1; I don't use it. It is useful when you are training anime, or maybe for stylization, like generating yourself as an anime drawing; then it may be useful. Okay, Train Text Encoder 2 is set to off and its values are set to 0. Attention: when we use the Fused Back Pass, I think it uses a special attention mechanism, or maybe SDP by default; it doesn't make a difference, so I leave it at the default. EMA: EMA is extremely useful for training Stable Diffusion 1.5 models; however, for SDXL it doesn't bring any benefit, so it is off. Gradient Checkpointing: if you are training on an A6000 GPU, such as on Massed Compute, don't enable this; disabling it speeds up your training. Gradient checkpointing makes no difference in quality; it is purely a trade-off between speed and VRAM usage. Training data types: BFloat 16 and Float 32. Now, there is a very important point: with my current settings, if your GPU is an older GPU without bfloat16 support, you will not get the very best results with this configuration; it really requires bfloat16 support. If you don't have bfloat16 support, use the SD 1.5 configuration, which I will show at the end of the tutorial. Everything else is the same; only the parameters change. 
And the resolution is 1024; this is the base resolution. If your images have different resolutions and you use bucketing, I think it will downscale them, but I am not sure exactly how it handles that; you should ask the OneTrainer developer on his Discord or on GitHub. Therefore, I prefer to train with a single resolution, but you can enable bucketing, train with different resolutions, and see how it performs. Okay, we train the U-NET until the very end of the training, so you can set this to 10,000 and "never". The U-NET learning rate is the same as the main learning rate; by the way, if you don't set these, it is supposed to use the learning rate set above, but to be sure I set them as well. I don't use Rescale Noise Scheduler; actually, I never tested it, so I don't know. I don't change any of the default parameters here, as you can see, and I don't use AlignProp. When you hover your mouse over it, you can see what it does, but there is not much information; to learn more, you need to check the OneTrainer wiki. You see, there are just so many things you can research. Now, masked training: this is important. I recently tested this with a lot of research, and it really improves the flexibility of the model. When you look at our readme file, you will see that I have tested masked training; this is also a public post. When you open it, you will see the masked training comparison images. You can download the full 1.5-gigabyte file and look at every image at full size, or you can download the half-size files and look at them like this. What I found is that an unmasked weight of 60% is the sweet spot for masked training, so you can use that. Why not use a lower weight? Because with a lower weight, the body proportions get broken: the head looks artificial on your body, because the model is not able to learn the proportions of your body and your head. That is my finding. You don't have to use masked training, because even without it you get very good results; but if you want more flexibility, and especially if your dataset is bad, it helps. If your dataset is perfect, you don't need it. What is a perfect dataset? One that never repeats clothing and never repeats backgrounds. So if your dataset is perfect, you don't need this; but if it is not, you may benefit from it. So, enable Masked Training and set the Unmasked Weight. Unmasked Weight means that the non-masked areas get 60% weight instead of 100%: the backgrounds and my clothing get 60% weight during training, while my head gets 100% weight. This is how we gain some extra flexibility. I leave the rest of this section at the defaults. We also need to generate the masks. How do we generate them? Let's go to Tools; here you will see the dataset tool, the convert model tool, and the sampling tool. We are going to use the dataset tool. Open the folder of your training images, which was this SDXL folder, and you will see the images like this. Here we will generate the masks. In this section, you see the folder is selected; you need to type a prompt, and based on this prompt it will generate the masks. I type "head" because I want my head to be masked. I don't change anything else and click Create Masks. You see, the masks are now generated: my head is masked in all of the images. You should verify that all images have accurate masks. You can also see that the masks are saved here. 
So, the white area means it will be trained at full weight; the black area is the unmasked area, which, as set here, will get only 60% weight, 60% importance, instead of 100%. Now, you can generate samples during training. First you need to add a sample: you can set 1024 by 1024, any seed, and a prompt like "photo of ohwx man". Or you can click here and type a much more detailed prompt, like "photo of ohwx man" with a negative prompt such as "blurry", and set the seed, the CFG scale, the number of steps, like 40, a random seed if you wish, and you can even select the sampler; I find this one the best sampler. During training, if you don't know how many epochs you should train, or if you want to see the point where the training becomes over-trained and quality starts degrading, you can generate samples. Of course, this slows down your training. You can also add multiple prompts like this; all of them will be used, and you can enable or disable any of them. But I don't use these, because I do an x/y/z checkpoint comparison at the end of the training. How do I do that? In the Backup tab, you don't need backups; backups are, I think, the diffusers-format files. What you need is "save after". I am going to save after every 15 epochs, which means it will generate about 10 checkpoints, and each checkpoint will be around 6.5 gigabytes because we are saving them in half precision; you can see the output data type is BFloat 16, which is half precision. The filename prefix is very useful; after I contacted the developer of OneTrainer, he thankfully added it. Let's give it a name; what was our name? Tutorial one. So this will be the prefix of the saved checkpoint files, and it will save them like this. If you don't want intermediate saves, set it to never; or you can save based on a number of steps, seconds, or minutes. It is really, really convenient. I will save every 15 epochs. Okay, everything is ready. Save your configuration as Config one, for example, or whatever name you want. And let's check our VRAM usage. I have installed nvitop with pip install nvitop; this is a super useful library. Install it, then type nvitop, and it will open this screen, where you can see how much VRAM I am currently using: 15 gigabytes. Why? Because we have two things open, I think. Oh, one of them is my Kosmos-2 UI that I started, and the other is OneTrainer; so that was the Kosmos-2 error, I should check and fix it later. Let's close Kosmos-2, and you see, now I am using 7.8 gigabytes. Now, people complain to me that their training is super slow. Super slow training happens when your VRAM is not sufficient and the system starts using shared VRAM. When you open your Task Manager, you will see shared GPU memory; if it goes above 0.6 gigabytes, that means your computer has started using shared VRAM. Shared VRAM means it is using your system RAM, and it will be at least 20 times slower than real VRAM. So make sure your VRAM usage is minimal before starting. How can you minimize it? Turn off all of your startup items here, restart your computer, and try to get the usage below 500 megabytes. 
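The VRAM monitoring step is just two commands; a minimal sketch:

  # Install and launch nvitop to watch per-GPU VRAM usage while training
  pip install nvitop
  nvitop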
You can easily get it under 500 megabytes, and then you will have nearly all of your VRAM free for training. Okay, we have saved the configuration; let's start training. First it will cache the images, then it will start training. This will take time. Since we have set up everything on Windows, I am now going to set up everything on Massed Compute. However, if you are starting the video at this point, I don't suggest that: you should watch the previous part too so you don't miss anything, even though I will also put a video section marker here. When you get this update prompt, just cancel; you don't need to install any updates. Okay, I'm going to close this, because I'm going to start from the beginning. First, we will upload our training dataset to the Massed Compute virtual machine. For this task, I am going to use the shared folder that I created; it was this folder. You see, there are already files in it, and I even have the training images here; I already copy-pasted them the regular way. I also put the configuration files there; let's also put the newest configuration file that we made. The configuration is inside the presets folder, don't forget that, and I paste it into the Massed Compute shared folder here. Then, in the main window that we are going to use, click the home button and you will see all the folders here, including thindrives. When you enter the thindrive, you will see a folder named after your computer, and the files and folders that you put inside your shared folder will be synchronized there; how fast depends on your upload speed, of course. I can see all of my folders and files here. Let's copy this folder to wherever you want; I am going to copy it into home, into the apps folder here, which is where the applications are installed. I copy-paste with Ctrl+C and Ctrl+V. Before starting OneTrainer, I am going to update it. The update instructions are written in this readme file; let's search for "how to update OneTrainer to the latest version". Copy the command, start a new terminal here, and paste it. Okay, it didn't copy; this happens for some reason. I copy again and paste again, and it is still not copied. Let's open a notepad; okay, it appears the copy button of the readme file is broken, so I select the text, right-click, copy, and paste here. Still not visible. Interesting. Let's start another terminal. Okay, copy-paste between my computer and the virtual machine is currently not working, so I will reconnect. I close the Thin Client; I am not going to use End Existing Session this time, because that would close everything, but if it still doesn't work, I will have to. Let's copy the password one more time, paste it here, and connect. Okay, it is connected. Then I copy the update command one more time. Updating is not mandatory, but you should do it, because a lot of bug fixes land all the time. Now you see, copy-paste is working, and the command updates OneTrainer to the latest version automatically for you. Meanwhile, let's also download our regularization images dataset onto Massed Compute; the regularization images link was here. 
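The exact update command is the one in the readme, but conceptually it is just a git pull inside the OneTrainer installation folder. A rough sketch, assuming OneTrainer lives under ~/apps/OneTrainer on the virtual machine (the path and any follow-up dependency step may differ on your image):

  # Rough sketch only - the readme's command is the authoritative one.
  cd ~/apps/OneTrainer
  git pull                 # fetch the latest OneTrainer code
  # then re-run OneTrainer's own install/update script so any new dependencies get installed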
You can log in to Patreon from your Massed Compute machine, or alternatively right-click, copy the link address, open Firefox on the virtual machine, and paste it there. Okay, it still didn't copy; I don't know why it doesn't work the first few times. So I right-click and copy the link address again, paste it here, and the download starts automatically. Let's watch the download from here: Show Downloads. It is downloading into the downloads folder. Then go to the downloads folder; you don't need to move the archive anywhere. Right-click and select "Extract Here". You can watch the extract operation here; this is where you can see what is happening at any moment. It is extracting at about 200 megabytes per second, and now our regularization images are ready to use. Let's check the update process: okay, it has been updated. To get to the desktop and hide all of the open windows, either click this icon to minimize them or press Ctrl+Alt+D; yes, Ctrl+Alt+D, I have also written it here, Ctrl+Alt+D to minimize all. Then double-click Run OneTrainer, and it will start OneTrainer automatically for us, like this. The rest is exactly the same as the first part of the tutorial, the Windows setup, but I will set it up here as well. My configuration is not visible. Why? Because I didn't copy the preset. Let's close this; I had put the preset inside my thindrive, which is inside home, thindrives, Massed Compute (you can give the shared folder any name). This is my configuration, so copy it. Where do you need to copy it? Go to home, apps, which is where the applications are installed, then inside OneTrainer, inside the presets folder, and paste it with Ctrl+V. You can also delete all the other configurations if you wish. Okay, let's start OneTrainer again. Select the configuration from here, like this, and it will load everything. Unfortunately, the responsiveness of the UI in the virtual machine is not as smooth as on Windows, and we need to re-set the workspace directory and everything else. You can select the folders; this is the folder structure of the virtual machine. I will choose the desktop and name the folder OneTrainer workspace, so it will be saved on the desktop of the virtual machine. This virtual machine is completely separate from your own computer; nothing you do here affects your computer, because it is all running on a remote machine. Let's copy this, delete this, paste, and call it OneTrainer cache. Okay, these are the same. Here you can also download the model and use it, or you can use the Stability AI base model. Let's use RealVis on Massed Compute again. The download links were here; let's copy the link address of this one, open a browser, paste it, and go, and it is downloading. At the end of the tutorial I will show the SD 1.5 configuration as well; everything is the same, only the resolution of the images changes. By the way, I forgot to explain how to resize your images to the correct resolution. This is very important for your training images dataset: you need to prepare them at the correct resolution. 
I prefer to make all of them 1024 by 1024. If you watch this tutorial, you will understand that cropping by zooming in, rather than simply resizing, is extremely important, and I have automated scripts to do that; you will see them in that tutorial, or you can manually resize your images to the correct resolution. For example, these images are currently not at the correct resolution; I could resize them, or let's look at the raw folder: these were all generated at the right ratio with my automated scripts. So you can watch that tutorial to learn how to resize them, or you can resize them manually; it is totally up to you. You can also use birme.net, but I don't suggest it, because then the images will not be properly zoomed in. Therefore, watching that tutorial is really useful for preparing the very best training images dataset. Okay, the model is downloaded. Now I will move it into the installed Automatic1111 Web UI folder: cut it, go to home, apps, Stable Diffusion Web UI, inside models, and here, inside Stable Diffusion; this is where you put your models, and paste. Then press Ctrl+C on the file, which copies both the file and its path. Let's return to OneTrainer; here it is, this icon. Delete this field and press Ctrl+V to paste. You see, it pasted the entire path of the model automatically for me; you can also pick it through the file navigation. Let's delete the VAE, and set the output. The output folder changes too, of course; I want to save it into this folder, so I press Ctrl+L, which selects the folder path, copy it, delete this part, and paste. You see, I pasted the path like this; you can also type it manually. And let's name it Massed Compute. Okay, this will be the full path. Everything else is the same; I have shown and explained everything in the first part of the video, so I will quickly add the concepts. The first concept is train. Please watch the first part of the video, the Windows part, where I explain everything. Click the path; okay, I think it opened here. It was invisible, so I clicked it here to see it. Where did we copy the images? We copied them into the apps folder, here. Okay, then let's go to the tools and set up masked training here as well. By the way, why is this still lagging? Because the window is still open behind. Okay, let's select "from single text file". The file dialogs keep opening behind the main window, so I move the windows around like this. You can write the caption file anywhere you wish. I opened the Visual Studio Code editor; by the way, this file was changed by accident, so I need to change it back. The original was like this. So I type ohwx man here. Okay, we can't zoom in, this is not good, so let's open the plain text editor instead: ohwx man. I can't zoom in to show you a bigger font, but it says ohwx man. Click save and save it anywhere you wish; let's save it inside the apps folder as ohwx man, and the .txt extension is important: ohwx man.txt. It is saved inside apps as ohwx man.txt. Let's go back; yes, ohwx man is selected. Okay, another window opened behind again. Okay: train dataset, this is the folder, this is the prompt, and I turn these options off. Click preview, and it is set. Now let's also generate the masks like we did before: open the folder automatically and generate the masks. 
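If you prefer the command line over birme.net or my scripts, a rough sketch of center-cropping and resizing every image to 1024x1024 with ImageMagick, plus creating the single caption file, could look like this; it assumes ImageMagick is installed, and note that it only center-crops, it does not zoom in on the subject the way my automated scripts do.

  # Rough sketch, assuming ImageMagick is installed. Center-crop and resize to 1024x1024.
  mkdir -p resized
  for f in raw/*.jpg; do
    convert "$f" -resize 1024x1024^ -gravity center -extent 1024x1024 "resized/$(basename "$f")"
  done

  # The single-text-file caption used for the training concept:
  echo "ohwx man" > "ohwx man.txt"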
Let's type "head" and click Create Masks. The first time you click Create Masks, it downloads the segmentation model; we can see it here. Yes, it is downloading and generating the masks, it is super fast, and the masks are generated. Okay, we are ready. Now let's also add our regularization images concept. I name it reg, here, and the path of the regularization images is the downloads folder, so I select it. Now we set the repeating value as we calculated in the first part of the tutorial, 0.0029, so it will use only about 15 images in each epoch, and we turn off the image augmentation options. We also need a prompt file for it, which will be just "man"; let's save it with Ctrl+S as man.txt. Go back, pick the folders; it opened somewhere behind again. Oh, you need to select the right option first, "from single text file", otherwise you can't pick the file; yes. Pick man.txt, the repeating is set, everything is set, image augmentation is turned off. Create the preview; yes, okay. So we have also set up the concepts in OneTrainer on the virtual machine. The settings are the same; however, here, since we have a huge amount of VRAM, I am going to use the speed-up settings. Let's type nvitop, and you can see we have 45 gigabytes of VRAM; actually it is 48, but it is reported like this. So, disable Gradient Checkpointing to speed things up, and also disable the Fused Back Pass, so it will use a huge amount of VRAM but will be much faster. Let's enable this and set it like this. I have explained everything in the first part, so do not skip it. Let's make this like the first part: set this to never, as usual, don't change anything else, and let's also set the saving. I will take 10 checkpoints, like in the Windows training, and I give Massed Compute as the filename prefix with Ctrl+C and Ctrl+V. The checkpoints will be saved inside the workspace directory that we set, and the final model will be saved inside the folder we set. Save the preset as fast, or whatever name you like. Okay, the interface is a little slower, and start training. Currently this will train on my first GPU. What if I want to run another training on my other GPU, because I have two GPUs? All you need to do is click home, then scripts, and there you will see the OneTrainer .sh launcher file; this is what the shortcut starts. Open it with the text editor, and you are going to add a command there. Which command? I have written it here: the export line. This tells the script that it can only see GPU 1. GPU indices start from zero: this is the first GPU, you see it shows index zero, and this is the second GPU. So I copy-paste the export line into the script, and when I start another OneTrainer, it will automatically use the second GPU. Let's minimize everything with Ctrl+Alt+D and start another OneTrainer to demonstrate. Let's also open the other window that shows the VRAM usage. I load the preset that I made; not this one, preset loading is a little slow, unfortunately. Let's load the fast preset and hit Start Training, and you will see that it starts using the second GPU. I just want to demonstrate this, then I will turn it off. Now it has started on GPU 1: you see, the second process has started using GPU 1. I am able to utilize both of the GPUs. 
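The GPU-pinning change is just an environment variable export added near the top of the launcher script; a minimal sketch (the launcher file name is whatever your machine image uses):

  # Added near the top of the OneTrainer launcher .sh script on the virtual machine.
  # GPU indices start at 0, so this makes the launched process see only the second GPU.
  export CUDA_VISIBLE_DEVICES=1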
By the way, this second run may have overwritten the caching folder, so I am not sure whether it broke the cache or not. The cache is here; I hope it didn't break the caching process, but maybe it did. So what can we do? We can restart the process to be sure. Okay, I close them, and the new OneTrainer instance I am about to start will use GPU 1 by default. Let's open it, select the fast preset, and make it re-cache by enabling Clear Cache Before Training, so it will cache from scratch even though it has cached files, and we are set; it will do the training. Now let's look at our Windows training. You see, it is using 17 gigabytes of VRAM currently, and it was already using 7 gigabytes before we started training, so the training itself is only using about 10.2 gigabytes of VRAM, because that is our configuration; on Massed Compute it will use a huge amount of VRAM, as we will see. Now let's also start the Automatic1111 Web UI on Massed Compute to see how it works. On Windows it is very easy to use; in this tutorial I have also shown how to install the Automatic1111 Web UI, but if you want to use the auto installers, I have auto installers for that as well. To automatically install the Automatic1111 Web UI on Windows, go to this post, and you will see all of the details: the changes, the information, and the scripts. Let's download the scripts and extract them on our F drive into, say, video_auto_1111. Extract the files here, and you will see several bat files. I am going to install the Automatic1111 Web UI by double-clicking; it will do everything automatically for me. It installs, and you will see a bunch of other files. What are they? I suggest you run the "Update Torch xFormers Install After Detailer Automatically" bat file after the installation has completed. This installer also downloads some of the best models automatically for you, as you are seeing right now. If you are interested in TensorRT, you can use this one to install it automatically. If you are interested in ControlNet, you can use this one to install it automatically and download all of the ControlNet models; let's see how many there were, over 50 models already. These scripts automatically download all of the models into the correct folders, including IP-Adapter and InstantID, which are really hard to install, so this is an extremely useful script. I have also prepared the same scripts for Massed Compute. Let's open this link; this is the link you need for Massed Compute, and here you will find the Massed Compute version one zip file. Download this zip file and extract the files into your Massed Compute shared folder, like here. These are the contents of the zip file, which I can already access on Massed Compute. So how do we use them? Go to the thindrives folder, which is our synchronized drive, and go to Massed Compute. Which files do I need if I want to download and use ControlNet, IP-Adapter, and everything? Let's select from here to here and copy. Some of the selected items are extra files; I will show them too, don't worry. Go to the apps folder and paste them there. You can see them being pasted; the operation is here, as you see. Okay, they are done. So, on Massed Compute, what do I suggest? I suggest you update the Automatic1111 Web UI to the latest version. 
Also, in some cases, xFormers may not be automatically updated for OneTrainer on Massed Compute; you can use this command to update it. And how do we update Automatic1111 Web UI? With this command. Copy it, open a new terminal window, right-click, paste it, and hit enter. It will update the Automatic1111 Web UI to the latest version automatically. But you need to do this before starting Automatic1111 Web UI. Then you also need to add this line to its startup file. How do we do that? Let's go back to the Massed Compute desktop. In here, open the "start Stable Diffusion" settings file; it opens the sh file so we can modify it, and paste the line there. Okay, it didn't copy, so let's copy it again. This way, you can start using the latest version of Automatic1111 Web UI on Massed Compute. However, if you need more — say TensorRT (which is currently not working for some reason), or to install After Detailer, ControlNet, and FaceFusion — then you should execute this other command. How? Copy it. By the way, you need to put these files inside the apps folder, which is here. Let's go to the apps folder; you see, the scripts are there. Let's open a new terminal and paste. Okay, it didn't copy again — I hate that. Copy, paste, and it will update Automatic1111 Web UI with those extensions. I am also going to show you the content of this file, so even if you are not a Patreon supporter, you can still use it. Let's open it with a text editor. This is the content of the file: it downloads and installs the After Detailer extension, the ControlNet extension, and the FaceFusion extension, and it also installs the libraries that FaceFusion needs. That is all it does. If you support me on Patreon, I would appreciate it very much, because that way you will always get the most up-to-date version of the scripts, and you help me continue this journey. So, you see, it is doing the necessary updates of Automatic1111 Web UI with the necessary libraries. This will also update it to Torch 2.2.0 and xFormers 0.0.24. One other very nice thing about this virtual machine image template is that it comes with Python 3.10.12 by default — let me check by typing python3. This is a very rare template; most templates you will find install different Python versions, but on this machine we have Python 3.10.12 installed automatically. Okay, the update is continuing. Meanwhile, let's look at the training speed on Windows. The training has started, and the speed is about 2.5 seconds per iteration. What does this mean? It means each step takes that long, so how long will the whole training take? I am going to train for 200 epochs, so let's calculate. Every image in my concepts will be trained for 200 epochs. I have 2 concepts. In each concept, 15 images are trained per epoch: the training concept has 15 images with repeating 1, and the regularization concept has 5,200 images with repeating 0.029, which also comes to about 15 images per epoch. So 15 + 15 = 30 images are trained per epoch, and 200 epochs × 30 images = 6,000 training steps in total. Since each step is now taking about two seconds — I don't know why it became faster — 6,000 steps × 2 seconds is 12,000 seconds, and dividing by 60 gives about 200 minutes on my Windows machine. Is the update done? Yes, let's close this terminal.
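As a quick sanity check of the arithmetic above, here is a tiny sketch that reproduces the step and time estimate from the per-epoch image counts and the observed speed stated in the video; nothing here comes from OneTrainer itself.

```python
# Reproduce the training-time estimate using the numbers stated in the video.
images_per_epoch = 15 + 15   # 15 training images + ~15 regularization images per epoch
epochs = 200
seconds_per_step = 2.0       # observed speed on the Windows machine

total_steps = images_per_epoch * epochs              # 6,000 steps
total_minutes = total_steps * seconds_per_step / 60  # 200 minutes

print(total_steps, round(total_minutes))             # 6000 200
```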
Let's check the training speed on Massed Compute — no, it is still caching. So let's start the Automatic1111 Web UI now that we have made all these changes. Click "Run Stable Diffusion", and it will start the latest version of Automatic1111 Web UI with the most commonly used extensions. You can also install any extension here, just like on your own computer; it is exactly the same, nothing changes, only the folder paths change. You can see it is making the necessary updates since it is starting for the first time. Our training on Windows is continuing, and you see we got a checkpoint at 450 steps, which is equal to 15 epochs — the epoch is also shown in the checkpoint name: 450 steps, 15 epochs. And if your training pauses when you click inside the cmd window, just hit enter and it will continue. Okay, we got an error: the basicsr module is missing. I thought I had fixed this, but apparently not, so let's fix it now. How? We need to activate the virtual environment of the Stable Diffusion Web UI and install the package manually. Let's go to home, apps, stable-diffusion-webui, start a new terminal there, and activate the virtual environment with this command: source ./venv/bin/activate. Copy it, paste it, and yes, it is activated. Then pip install basicsr, and it is getting installed. Yes, it is installed. Okay, let's go back and start the Stable Diffusion Web UI. I am going to add this to my automatic scripts, so you won't have this issue; I will update the scripts. Oh, we got another error — this is weird. These are all errors caused by FaceFusion, and this is why you should support me: either you will not hit these errors, or if you do, you can message me and I will fix them. So which module was missing this time? It was realesrgan, so pip install realesrgan in the same activated virtual environment. Okay, it's installed. Now we need to restart the Web UI. Which terminal was running it? Not this one — this is the OneTrainer update terminal, and this one is the OneTrainer training. Not this one either. I think we had already closed it, so let's start the Stable Diffusion Web UI again. Okay, it has started, and I don't see any errors — the remaining warnings are not important. You can either use it locally with this URL — open the link and it will open in the browser on the remote machine — or you can use it on your own computer, or even on your mobile phone, with the Gradio share link. Copy the link, and I can open it in my own browser, like this. Whatever I generate here will be generated on the cloud machine, not on my computer. If you don't want Gradio share, just remove the --share argument from the startup parameters that we added at the beginning; without it, Gradio live share will not start. Now let's generate 100 images to see the image generation speed. By the way, OneTrainer is currently running on the second GPU and the Automatic1111 Web UI is running on the first GPU, so they are not blocking each other. How can I be sure? nvitop. You can see that the second GPU, which is running OneTrainer, is using 34 gigabytes of VRAM because we disabled all the VRAM optimizations, and its speed is 1.2 seconds per iteration. On my own computer the speed was about two seconds per iteration.
You see the difference — it is faster. And this is the image generation speed on the first GPU: about 20 iterations per second for the Stable Diffusion 1.5 model that is loaded here. Let's interrupt it. There will of course be some delay over the Gradio live share, because it has to download the images from the Gradio server, but it was really fast. Okay, let's load the RealVis XL model. If this loading screen takes forever or doesn't respond — which can happen on Windows or anywhere — it means Gradio is bugged in your browser; close all of your browsers, reopen, and it will be fixed. Okay, the model is loaded. This time the resolution is 1024, since this is an SDXL model; I pick my favorite sampling method and generate images. Massed Compute is fully private, so no one can see what you are doing. Okay, the images are being generated at about 6 iterations per second. That generation speed is almost equal to an RTX 4090 — a really decent speed. I don't know why, but TensorRT is currently not working for some reason; I couldn't solve it yet, but hopefully I will and I will update the Patreon scripts. The training is continuing, so all I need to do now is wait for the checkpoints to be generated and compare them, either on my computer or on Massed Compute — both work the same way. It has been a while, so let's check the status of the training. While training, in the bottom left of the OneTrainer window you will see the epoch progress: the number of epochs completed and the current step within that epoch. That is where you track the status of the training. The cmd window also prints some messages, but it does not show the current epoch, only the current step within that particular epoch. Now let's check the Massed Compute training: I click this — this is the OneTrainer interface — and 158 epochs have already been completed. While training, I suggest you start uploading files to Hugging Face, because if you rely on folder synchronization to download the models, it may not work, or it may be very slow. How do we do that? It is easy: we start Jupyter Lab. You see there is a "Run Jupyter Notebook" shortcut; run it and it starts the Jupyter interface like this. Now you can either load the notebook file you downloaded from Patreon or make a new notebook — either way works; select the Python 3 ipykernel. To load mine, it is inside our Thin Drive, in the Massed Compute folder — this is my folder, yours may be named differently depending on how you set it up — and there is the "Upload Hugging Face" notebook file. If you double-click it in the file manager, it will not open correctly. To open it properly in the Jupyter interface, click "Upload" and select it from there: go to home, Thin Drives, Massed Compute, select the "Upload Hugging Face" notebook file, and open. Now it appears here; double-click it and it opens like this. These are the cells you need to run; there are several, and each one does something different. First of all, let's install huggingface_hub and ipywidgets. Okay, the cell has been executed, and when we run the next one, we get the login screen.
So what we do now is go to your Hugging Face settings, click "Access Tokens", and create a new token with write permission — name it whatever you want, like "test test test" — and then copy it. By the way, you need to register a Hugging Face account first; it is free. Then paste the token into the login cell. Once you're logged in, you can upload model files or generated images into a private repository. There are these code cells — what do they do? The first one uploads a single checkpoint into a destination repository in your Hugging Face account. The next one uploads a folder, and the last one also uploads a folder, but in a better way; if you are going to upload an entire folder, use that one. If you only want to upload a single checkpoint, use the single-file cell. So, let's upload a single checkpoint first. Where are our checkpoints saved? They are inside home — actually inside desktop — inside OneTrainer, workspace, save. Let's upload the first checkpoint: Ctrl+C copies the path of the file, and then I paste it into the cell. Okay, the copy worked, but it doesn't let me paste directly, so let's open a new window. It is copied, but why can't I paste here? Okay, now I can: first I paste it into a text editor, and then copy it from there into the cell. So this is the full path, and next we need to give the model file a name in the repository that we are uploading to — I copy the file name, delete the placeholder, and paste it like this. Then the repo ID: this is the repository you create on Hugging Face. Click "New Model"; you can make it either public or private — I name it test11111 and make it private so no one else can access it. Click here to copy your Hugging Face repository ID — you see, it is username/repo-name. Paste it, and click "Play" to upload. I will restart the notebook: let's close the Jupyter Lab interface, return to the desktop with Ctrl+Alt+D, and start the Jupyter Notebook again. Okay, it is opening. Let's open the Upload Hugging Face notebook and run it again. Done — let's click "Play". Okay, you see, the login widget has now appeared; perhaps we had to restart the kernel to get the widget. Let's copy our token one more time from the settings — I am going to delete this access token after the video, so it's fine to show it. Copy it, paste it, log in, and yes, the token is valid. Now I can start uploading. This cell is also set, so let's just click it and wait. By the way, the cell should show that it is running, but its color also looks wrong — ah, because the cell is set to "Raw". We have to set it to "Code"; I was just playing with it earlier. Select "Code" from the dropdown and click "Play". Now it starts uploading to the Hugging Face repository. I suggest you do this while training — that way you save time, because you are not waiting for uploads after the training finishes. And if you want to upload an entire folder instead, you set the folder path here — it can be the generated images or all of the generated models — and change the repository name to whatever repository you created, like here. For example, I set this and run it, and it will upload everything one by one.
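The Patreon notebook itself is not reproduced here, but the cells described above map naturally onto the standard huggingface_hub API. Below is a minimal sketch under that assumption; the token, file paths, and repository name are placeholders, not the values used in the video.

```python
# Minimal sketch of the Hugging Face upload steps described above,
# using the standard huggingface_hub API (not the exact Patreon notebook).
# Token, paths, and repo name are placeholders.
from huggingface_hub import HfApi, create_repo, login

login(token="hf_xxx")  # paste the write token created in your HF settings

api = HfApi()
repo_id = "your-username/test11111"
create_repo(repo_id, private=True, exist_ok=True)  # same idea as clicking "New Model"

# Upload a single checkpoint file.
api.upload_file(
    path_or_fileobj="/home/user/Desktop/OneTrainer/workspace/save/checkpoint-450.safetensors",
    path_in_repo="checkpoint-450.safetensors",
    repo_id=repo_id,
)

# Or upload an entire folder of checkpoints / images.
api.upload_folder(
    folder_path="/home/user/Desktop/OneTrainer/workspace/save",
    repo_id=repo_id,
)
```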
You see, the upload speed is pretty decent — about 50 megabytes per second, which translates to roughly 400 megabits per second — and this way you save time. Another thing: our freshly installed Automatic1111 Web UI on Windows has now started. If you remember, we installed it from here. After the initial installation, I can also install ControlNet, update xFormers, and install After Detailer and FaceFusion. You can see that it is currently using Torch 2.1.2 with xFormers 0.0.23, and neither the After Detailer nor the ControlNet extension is installed yet. So I am going to install those extensions and update everything automatically. I am also going to show you the code right now, because, as I said, there is no paywall in this tutorial. This is the script that installs everything automatically; you can write it yourself if you wish. Let me also show the updater code: this is the script that installs After Detailer automatically and updates the Torch version and the xFormers version for me. If you need FaceFusion here as well, I can add it to my automatic installer — just message me on Patreon and I will add it for you. There is also an automatic installer and downloader for ControlNet that puts everything into the correct folders, which is otherwise hard to do. Don't worry, I have a full tutorial about ControlNet on my channel; just search "YouTube SECourses ControlNet" and you will find my grand master ControlNet tutorial — it is 90 minutes long. You can watch it to learn more about ControlNet if you are interested; however, using my automatic installers is easier. There is also some extra information here about how to use the newest InstantID and IP-Adapter FaceID. In my opinion, the ControlNet extension's implementations of IP-Adapter FaceID and InstantID are far behind the original repository implementations. I have one-click installers for InstantID and IP-Adapter FaceID: if you go to the Patreon exclusive post index and search for InstantID, you will see installers for Windows, RunPod, Linux, and a Kaggle notebook, and the same for FaceID. Here it is: "IP Adapter FaceID plus version 2, 0-shot face transfer" — another standalone Gradio installer for them. Hopefully I will make a new tutorial for InstantID; I already have a tutorial for FaceID — just search "SECourses face ID" and you will find it. Okay, our model has been uploaded into our repository. Let's check it out: when I go back to my repository and refresh, I should see the model under "Files and versions" — and yes, there it is. Now I can download it and start using it on my computer, or I can use it directly on Massed Compute. How? I can move the model file into the models folder, or I can give its path as the model load path. So, let's move the saved models into the models folder. You don't need the YAML files — Automatic1111 Web UI generates YAML files automatically. To select all of the models here, hold the Shift key, click the first one, then click the last one, and it selects all of them. Alternatively, you can right-click, Select All, and then copy or cut.
We don't want to copy, we want to move them, so cut, go to home, apps, stable-diffusion-webui, models, Stable Diffusion, and paste. Now I can start using them on Massed Compute. The training is not finished yet — it is still generating checkpoints — but we already have the checkpoint from 180 epochs. So how do we test them? If you remember, we started Automatic1111 Web UI on Massed Compute; we can find its terminal somewhere around here — yes, here. There is a Gradio live URL and also a local URL: it is running locally at 127.0.0.1 on port 7860, the default port Automatic1111 Web UI starts on, and now I can start testing. How do I test? First, I refresh the models here so they appear, and then I refresh the interface so the models also appear in the x/y/z checkpoint list. Then you type your prompts. For example, let's check the 150-epoch checkpoint, because 150 epochs is the sweet spot I found with my configuration — though it may not be the same for you; it depends. Then I type a prompt I like. You can find a lot of good prompts on my CivitAI profile: search for "CivitAI SECourses", open my profile, go to my images, and you will see my generated images with their PNG info. For testing, let's pick a prompt from there — this one, for example. It is an SD 1.5 image, but I will use its prompt. For SDXL you don't need many negative prompts, and I later decided to remove the word "canon" because I don't think it is really necessary. Okay, it looks right. I prefer 40 sampling steps, and I use DPM++ 2M SDE Karras, which is the best sampling method I have found. We generate at 1024, because that is our base training resolution, and a very important thing is that you should use After Detailer. Let me demonstrate why: the model learns our face and body from a very limited number of images at a limited resolution; therefore, the face details are not very accurate when you generate distant shots (as opposed to close-ups). After Detailer automatically inpaints the face to improve it. So let's generate 4 images and see what we get, and why we should use After Detailer. Okay, we have images. For example, this one, this one, or this one — I am going to improve the quality of this one, and there is also this one; we could improve that too, whichever you like. Where is the seed of this image? Here — copy the seed, paste it, set the batch size to one, and regenerate, so I can show you the effect of After Detailer on the same image. Okay, we get the same image. You see, the face is not accurate, but the overall shape is, and After Detailer can fix it up to a certain degree — to get perfect images, you need a good training. You can also see that the model is not over-trained, because the clothing details are perfect and the background details are perfect; the anatomy of the image also looks correct. Whether you find it natural is a matter of taste — if not, you can generate more images to get a better one. So, the next thing we do is type a prompt into After Detailer, which is very important.
"Photo of ohwx man" — this is super important, because this is the prompt that improves your face. If you want a different pose or expression in the image, you need to change it here as well; I will show that. In the detection settings, make it detect only the first face, and in the inpainting tab I prefer a denoising strength of 0.5 (50%). The only other thing I change is "use separate steps", set to something like 70, to improve the quality of the face. Let's try again: now it will automatically inpaint the face. Oh, we forgot to enable After Detailer, so let's enable it and try again. (By the way, with SD 1.5 you can also do 1024 training.) Okay, it is fixing the face, and we got the image — you see, the face is much better. This image is not perfect; I would generate more images to get a better one, but you see the logic. If you want to upscale with the high-resolution fix, you can do that too. The crucial part is to select a different upscaler here: if you select Latent, it will drift the image a lot. For example, let's use R-ESRGAN 4x+ and upscale by 1.5x; we don't need to change the denoising, but let's try. We are currently running on an L40 GPU, which is faster than the A6000 GPU that I suggest you use — but the A6000 is not slow either. Okay, the 1.5x upscale is being done. I think 1.5x broke some of the anatomy; let's see. It applies the face improvement to the upscaled image, and here is the result — yes, the anatomy is broken at 1.5x. So you can drop it to something like 1.25x and try again; these are the numbers you need to play with. If you find the result too different from the original, you can also reduce the denoising strength. Play with these parameters until you find your liking — I use After Detailer like this in my own images. You can see that there are some new faces in the image, but they won't be changed, because I selected "mask only the top k largest" — and we got the image. The anatomy is not accurate here. How can we fix anatomy? This anatomical inaccuracy is partly caused by masked training: because we reduced the weight of the background and the body, they are not fully accurate. If you don't want any anatomical problems, you shouldn't use masked training; however, masked training improves the flexibility of the model and lets you generate better images. Also, if you don't like a generated image, you can skip it directly instead of waiting for it to finish. And there is a way to generate an unlimited number of images, which I will show in a moment, once this image is done. Okay, it is fixing the face; the face quality is pretty decent, although the face in the background is not good in this case. You can right-click the Generate button and choose "Generate forever" — that is how you generate endlessly. Now, how do we find the best checkpoint? Once you have decided on a comparison prompt, go to the Script menu at the very bottom, select the x/y/z plot, and enable it. Choose "Checkpoint name" as the axis and select the checkpoints from beginning to end — 15, 30, 45, 60, 75, 90, and so on, all of them. I set the grid margins to 50 pixels and generate 4 images for each case. Since this GPU is very powerful, we can also set the batch size to 4. Hit generate, and we can watch the progress from here; it tells us how many images will be generated in total.
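For reference, here is a small summary of the After Detailer settings used above, written as a plain Python dict rather than the extension's actual configuration format; only values named in the video are included.

```python
# Summary of the After Detailer settings used above (illustrative only,
# not the extension's actual configuration format).
adetailer_settings = {
    "prompt": "photo of ohwx man",   # face prompt; add pose/expression words here too
    "detection": "first face only",  # mask only the top-k largest / first detected face
    "denoising_strength": 0.5,       # inpaint denoising strength (50%)
    "use_separate_steps": True,
    "inpaint_steps": 70,             # separate step count for the face inpaint
}
```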
These generated images are saved in the outputs folder, so I click that folder and I can see the images here. Let's change the view — I think the list view is better — and we had generated some car images earlier; can we make them look bigger? Yes, there is zooming, so let's zoom in a bit. Okay, you can see the images here. How can you download all of these images? Since they are small, you can use folder synchronization with your computer — let's try it — or you can upload them to Hugging Face. For synchronization, copy this folder, go to home, Thin Drives, Massed Compute, and paste the folder there. After pasting, you need to wait first while it copies the files into the Massed Compute folder; then, when I go to my local Massed Compute folder, you see the folder starts appearing here. It is now synchronizing the remote cloud folder with my local folder, and the images are being downloaded as you can see. I suggest this method for images — or you can still upload them to Hugging Face. If I were going to upload them to Hugging Face, I would first zip all of the images. How? Right-click, and yes, there is "Compress". Give the zip any name, like "test one", and it becomes a single file. Then copy the path of this file with Ctrl+C, go to the text editor, open a new tab, paste it, copy the path from there, return to the Upload Hugging Face notebook, delete the old file path, paste the new one, and give it any name you want. That way you can upload the whole zip file to your Hugging Face repository. Let's click play. This one is a very small file, so it isn't really worth it, but you could do it with, say, a 200-megabyte archive — and you see, it is uploaded. When I go back to my Hugging Face repository, it should appear there, and yes, here it is, so I can also download it directly from there. Hugging Face has really become the backbone of AI; they give all of this to us for free. They are amazing, and I hope they keep getting better every day. I also suggest you follow me on CivitAI — my username is SECourses — and please star our GitHub repository; it is super important. When you click here, you will get to our main page: please star it, fork it, and watch it. If you also become my sponsor on GitHub, I appreciate that very much. Moreover, at the top of this GitHub readme file you will see that you can support me on Patreon, buy me a coffee, follow me on Medium, follow me on DeviantArt, follow me on LinkedIn, and follow me on Twitter if you wish. Unfortunately, our Udemy course is not available right now, but I am working on it. You can also see my LinkedIn profile and follow me there; I have 4,000+ followers, and hopefully it will keep growing. If you have any questions about this tutorial, ask me by replying to the video, or join our Discord and message me there — either way works. Meanwhile, our x/y/z checkpoint comparison plot is being generated, and you can see the training has finished. So what do we do? The last checkpoint is already saved inside the output folder we specified. Let's also move that last saved checkpoint: go to home, apps, stable-diffusion-webui, models, Stable Diffusion.
Let's paste it, and switch back to the list view — yes, this is better. Now let's find the last checkpoint. What was its name? This one — this is the 200-epochs checkpoint. Before terminating my Massed Compute instance, I need to upload everything, or at least the checkpoint I like most. To upload everything, I press Ctrl+L, which selects the folder path, then right-click and copy. Then I return to my Hugging Face upload notebook and change the path. Currently it is set to the Patreon default, but you can change it to whatever you need. What was our repository name? This one — there is no copy button, so select it and press Ctrl+C, then paste it here. You see there are "multi commits" and "multi commits verbose" options; just hit play, and it will start uploading every file in that folder, with the correct folder structure, into my Hugging Face repository. Initially it may take some time to prepare the commits, and then it starts uploading everything. Meanwhile, our x/y/z checkpoint comparison is almost complete; we can watch its progress here. By the way, since I am uploading at the same time, loading each checkpoint becomes a little slower. Let's press Ctrl+D. Okay, you can see it is currently using 41 gigabytes of RAM — this machine has 252 gigabytes of RAM and a 1.2-terabyte hard drive, and we are only using about 200 gigabytes of the disk. We can see those values here, and also in nvitop, which I showed you earlier: it shows how much RAM, CPU, and GPU we are using. You can also start multiple Web UI instances the same way I showed for OneTrainer, by exporting CUDA_VISIBLE_DEVICES — that is all you need to do. Okay, let's watch the progress. I could also connect to this from my own computer with the Gradio live share — that is your choice. Okay, it has started uploading the files. With this upload method, you will not see the files in the repository until the upload is completed, but if you go to the Community tab, you will see a WIP (work in progress) "upload folder using huggingface_hub multi commit" entry. This method also has resume capability. You can see it uploading the files, and as each one finishes, its checkbox gets checked. So it is uploading; I have to wait for it to fully finish before the files appear. Meanwhile, our x/y/z plot is almost complete as well — and now the x/y/z checkpoint comparison has finished. Sometimes you may not get the final grid image shown here in the UI. When it is saved, click this folder icon to open the output folders, go to the txt2img grids folder — the grid is actually inside there — and at the very bottom you will see the saved png file. It is a big file, about 70 megabytes, so I will copy it and paste it into my Thin Drive so I can open it on my own computer. Let's do that — you can see it is pasting here. I think this synchronization is synchronous rather than asynchronous, so it pasted right away. Let's look at our local Massed Compute folder: the grid is here, so let's open it. I am using Paint.NET, which is free software. Now, look at each image: in the first checkpoint, the face does not look like me, and as we move through the checkpoints it becomes more and more like me. Still not quite there at 45 epochs.
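The "multi commits" cell described above corresponds to huggingface_hub's multi-commit upload mode, which pushes the folder through a resumable work-in-progress pull request. Here is a minimal sketch, assuming a huggingface_hub version from around the time of this video (the flags were experimental and were later superseded by other large-upload helpers); paths and the repository name are placeholders.

```python
# Minimal sketch of the resumable "multi commit" folder upload described above.
# Assumes a huggingface_hub version contemporary with this video; the flags were
# experimental and later superseded. Paths and repo name are placeholders.
from huggingface_hub import HfApi

api = HfApi()  # assumes you have already logged in with a write token

api.upload_folder(
    folder_path="/home/user/Desktop/OneTrainer/workspace/save",
    repo_id="your-username/test11111",
    multi_commits=True,          # split the upload into several commits (a WIP pull request)
    multi_commits_verbose=True,  # print per-file progress; interrupted uploads can resume
)
```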
By the way, you need to compare this face with the face in the training dataset, not my current face, because there are some differences. You see — okay, not at this checkpoint — the likeness improves as we pass more epochs. For example here, and here: I think 135 is pretty decent. This is a personal judgment, so I can't say one checkpoint will always work best. This is 150 epochs, which I like in most cases, then 165 — the model is still far from being over-trained — and here I think it is getting slightly over-trained. So this is how you compare checkpoints. I think the clothing also got over-trained by 180 epochs; you can see the quality of the clothing degraded. Compare it with 150 or 135: the clothing is better there, so you can see which checkpoint performs best. The best clothing will be in the first checkpoint, because it is the least over-trained; as you can see, its clothing looks better than in the last checkpoint. That is how you compare. There is a trade-off between likeness and flexibility: if you want more likeness, you need to train more, and the model becomes less flexible. However, it is still perfectly able to follow the prompt. For example, let's try another, more complex prompt. I am going to disable the x/y/z checkpoint comparison, since you should understand the logic by now: set the script back to None, and let's generate 4 images. The UI is currently on the 90-epoch checkpoint, so let's pick the 150-epoch checkpoint and prompt: "photo of ohwx man wearing a yellow suit and a red tie and a white shirt in an expensive hotel room." None of my training clothing looked like this — I picked completely different colors to see how well the model follows the prompt. Let's generate. I am, of course, using "photo of ohwx man" as the After Detailer face prompt, not the full prompt, because that way it focuses on the face. Okay, the initial images are being generated. How much After Detailer can fix the face depends entirely on the quality of your training. Let's see how well it obeyed the prompt — if it obeyed perfectly, that means the model is not over-trained. People often ask me about different expressions: if you want different expressions, you really should include them in your training dataset. My current dataset does not have many different expressions, so the model's capability for them will be limited — you need to add them to your dataset. However, as you can see, it followed the prompt very accurately: the face is not the greatest, but the jacket is there, the shirt is there, the tie is there, and all the colors are correct. I can generate more images to get the best one. I can also add some beautifying prompts like hd, hdr, uhd, 2k, 4k, 8k, and I can add some LoRAs if I want — it is all up to you to add more prompts, negative prompts, and LoRAs to improve your output. This is more about how well you can prompt; you generate more images and pick the best ones. For example, for the introduction of this video, I am going to generate a lot of images and pick some of the best to show you. This is how Stable Diffusion works: we generate in masses and use the very best ones. The base model you use also makes a difference — not all base models are equally trainable.
For example, with Juggernaut XL I didn't get such good likeness, so not all models train properly — some models are better than others. Moreover, masked training can cause irregular anatomy, so I suggest you improve your training dataset rather than depending on masked training. Let's see. I think this checkpoint might be a little over-trained, because I see some deformities on the face, so let's drop down to the 120-epoch checkpoint and reuse the same seed by clicking here. We are using 0.5 denoising and 70 separate steps for the face inpainting. Okay, let's try again. The L40 is about as fast as an RTX 4090 for inference; for training it is a little slower at batch size 1, but you can increase the batch size to significantly increase training throughput if you need to. Okay, we are getting results, and next I will show you how to get a different expression with After Detailer. While it is generating, you can also open the output folder and watch the images there; sort them by modified date and you can open the last saved image without waiting. The accuracy of the anatomy depends on how good the images in your training dataset are, because the model learns body proportions from them. For example, this image looks pretty decent. Let's see the results — it is up to you which one you like. And if you want a different expression, like smiling, then you should add the expression to the prompt — "smiling photo of ohwx man" — and also to the After Detailer prompt; otherwise the expression will be lost in the face inpaint. Let's see what we get. I don't have any smiling expressions in my training dataset, so it will be hard for the model to produce a very high-quality smile. Some expressions are harder than others; smiling is relatively easy for the model. Also, Stable Diffusion 3 is hopefully coming soon, and I think it will be much better; hopefully I will be the first to make a full training tutorial for it and release the training hyperparameter setup, so follow me on Patreon. You see, the smiling expression is not that great, because the dataset has essentially a single expression. If you want different expressions, include them in the dataset — that is the way to do it. Still, these could perhaps count as acceptable — maybe this one. It is up to you; in my case, "slightly smiling" performs better than a full smile, so let's try that. This is the logic of it; this is how you do it. There could be better ways of prompting, including using some LoRAs, but that is up to you — I am not a prompting expert, to be fair; I just use it for my own use cases and demos. Let's see what we get with "slightly smiling" — I prefer it over "smiling". So the expression should be written both in the main prompt and in the After Detailer prompt. This is not as good as manual face inpainting in the inpaint tab, but it is automated, so you can generate in masses and pick the best one; if you inpaint manually, you can definitely get better results. Also, the YOLO detector in After Detailer only has a face model, not a head model — I don't know why they didn't add a head model; that could produce better results, to be fair.
Okay, you see, "slightly smiling" works much better, because I do have slightly smiling expressions in the training dataset. So, how do we inpaint manually? When you click this icon, it sends the image to the inpaint tab. There, you can adjust the size and carefully mask the face — I'm not being very careful right now — and type the prompt, which will be just "slightly smiling photo of ohwx man", nothing else. Then select "only masked", choose the best sampler, pick any number of steps, like 60, and set resize to 1, since we are not upscaling. Set the denoising strength to whatever you wish, such as 0.5 — you can also increase it — and make the seed random so you get different results and can pick the best one. Set the number of images, here batch size 4, and generate. Okay, let's wait a little. You can always follow what is happening in the terminal — that is how you use AI applications. The terminal is here; this is the inpainting speed, but you need to multiply it by 4 because our batch size is 4, so it is over 6 iterations per second in total. The inpaints are generated — let's look at the different ones. You can see there are differences; I think the first one is the best. So this is how you generate multiple inpaintings and get the very best result, and then you can use our SUPIR upscaler, with a selected base model, to further improve the face and the whole image. Hopefully I will also make a new tutorial for our SUPIR application — it is really amazing, mind-blowing. What if you want to extract a LoRA out of these models? It looks like OneTrainer's model conversion is not a good option, so you really should stick to the Kohya conversion. I have an explanation of how to do the conversion with Kohya if you are wondering; it is a public post on my Patreon. I go to the Patreon post index, and this is the article — in it I explain how to extract LoRA models from a base model: you select these options in the Kohya GUI and save. Let's see: LoRA extraction — yes, here, Extract LoRA, inside Utilities, LoRA, Extract LoRA, with save precision, load precision, and everything else available there. Now, let's verify whether the models have been uploaded to Hugging Face. Not yet — you really should verify that everything is uploaded before you terminate your Massed Compute virtual machine. You see, it still says 8 files to go, but those are not all model files; there are some other files as well. We can also see the commit in the Community tab as a WIP: it shows the uploaded files and the files still waiting to be uploaded. Meanwhile, our Windows OneTrainer training has also completed. How do we know? You see, 200 epochs out of 200. It took far longer than the Massed Compute training because on Windows we used the VRAM-optimized configuration rather than the speed configuration, so it was much slower. Where did we save the files? The workspace folder was "OneTrainer video workspace", and inside it I can see the saved checkpoints. To use them, I move them into my freshly installed Automatic1111 Web UI, into models, Stable Diffusion, and then I can use them. And the final model? It was saved inside my other Automatic1111 installation, so I just need to start my Automatic1111 Web UI and the models will be there to use. It works exactly the same as on Massed Compute; there is no difference.
Everything is exactly the same, so I am not going to repeat it all on Windows, but it is fully identical. You see, all the models are here. I can generate any image of myself — "photo of ohwx man wearing a leather jacket in a desert", or whatever prompt you want — with the sampling steps, the selected sampling method, and the width and height. These are super important: if you trained an SD 1.5 base model, don't forget to set your resolution accordingly. Enable After Detailer with "photo of ohwx man", and in detection, only the first face is mine; the extension also lets you inpaint different faces in different locations — that is documented on the After Detailer extension page. I find this inpaint denoising strength good, I use separate steps, and I hit generate. So this is my local computer and my local training — nothing is different except that this runs on my machine while the other ran on the Massed Compute cloud. We are getting the image: the initial image is generated, and now it will inpaint the face. You see, this was the initial face — even though the face details are not great, the head shape is mine, and that is what matters — and now the face is fixed and we have the final image. It's a great image, and you can change your prompt to get even better ones. That is how you use it on your local computer. Meanwhile, let's look at the SD 1.5 configuration. Where is it? I have also shared it in this post, if you remember, and there are several configurations — these ones are for Kohya, by the way. Where is the OneTrainer one? Here: OneTrainer SD 1.5; this is where you can download it. We have a tier 1 and a tier 2 configuration — what is the difference? If you have an 8 GB GPU, you should use the tier 2 configuration. Actually, I compared tier 1 and tier 2, and with the latest version of xFormers there wasn't much difference, so don't worry about it; you can treat both as tier 1 — there is a link to the comparison here. And if you don't have a GPU that supports BF16, you can still do SD 1.5 training; SD 1.5 training is also amazing and gives great quality. So let me show you the SD 1.5 configuration. It is different from the SDXL one — remember that. I open OneTrainer, load the configuration — where is it? the SD 1.5 preset — and I will walk through the settings I am using. The first tab is the same in all configurations. Here you select your base model, which is Hyperrealism Version 3. You can select a VAE, but you don't need to — it is already embedded in the model, so you shouldn't select one. The model output settings are all the same. Now, the important part is the weight data types: with SD 1.5 you really need to train in full precision, float32. This is mandatory; the other options do not work well. You also need to select the base Stable Diffusion 1.5 model type here. Fine-tune is selected — I did not research LoRA or embedding training; my configuration is for fine-tuning only, because that gives the best results. The data tab is the same, and the concepts are the same, but what differs is that you really should use a 768-pixel training resolution for this Hyperrealism Version 3 model. Some models support over 1024 pixels, but I compared them and found that 768 is the sweet spot, so that is the difference. You also need regularization images at the same resolution — you see, the man regularization dataset is 768 × 768, and my training dataset is 768 × 768.
Then, the training tab is also different. First of all, the Adafactor settings are the same as before. If you have high VRAM, you can disable this to speed things up. The learning rate is 7e-07. The rest is the same — batch size, epochs, and so on — and I don't suggest using a batch size larger than 1. We train the text encoder, and we never stop training it, so set this to something like 10,000 to be sure. The text encoder learning rate is the same as the main learning rate in SD 1.5. Moreover, with SD 1.5 we use EMA — this is super important. You can run EMA on the CPU if you don't have sufficient VRAM; if you do, run it on the GPU. How do you know whether you have sufficient VRAM? If the training starts using shared VRAM, that means you don't, and you need to reduce the VRAM requirements, because shared VRAM makes training around 20 times slower. These are the EMA settings: EMA decay is 0.999 and the EMA update step interval is 1. Gradient checkpointing, again, saves VRAM but makes training slower. The resolution is 768. For attention, pick xFormers. Actually, I compared the attention implementations with the latest version of xFormers and it didn't make a difference; in the past it made a huge difference, but not in my recent trainings. You can still set this to default for the best possible quality, but it will be slower and use a huge amount of VRAM, so use xFormers — and, as for EMA, use it on the GPU if your VRAM is sufficient, otherwise on the CPU. We train the U-Net, and we train it until the very end, like this; the U-Net learning rate is the same. I have not tested the effect of the rescale noise scheduler or any of these options yet, and I also have not tested AlignProp. Masked training is the same as in SDXL, so you can train masked or not — but when you do masked training, your anatomy may not be perfect; that is the trade-off. I also have not tested this area, so these are the very best settings I have found for SD 1.5; the rest is the same as SDXL, with no difference. Okay, we have almost finished uploading everything to the Hugging Face repository. As I said, with Massed Compute it is extremely important that you back up everything before you terminate your session, because once you click the terminate icon, it deletes everything. There is no stop option or permanent storage option on Massed Compute yet, but I believe they will add one. I am in contact with the Massed Compute developers, and hopefully they will also add more GPUs — you can see it went up to 5 available A6000 GPUs while we were recording. We can check the billing: my current cost is 2 dollars per hour because I am using an L40 GPU. If we were using two A6000 instances instead: select our creator image, which is SECourses, apply our coupon, verify, and deploy. If you get this message, it means there are no available GPUs on the same machine. Don't forget to verify and check the price here — that is super important. Deploy; yes, it is deployed, and this will only cost 31 cents per hour. The two GPUs are not displayed separately, but you can see the total. Okay, let's terminate this one because we don't need it. It shows the instance name and instance ID so you can make sure they match, and it warns that selecting terminate will delete all data on the VM and recycle the machine — there is no way back. Terminate, and it is gone.
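For quick reference, here is a summary of the SD 1.5 hyperparameters called out above, written as a plain Python dict rather than an actual OneTrainer preset file (the real preset uses OneTrainer's own format, which is not reproduced here).

```python
# Summary of the SD 1.5 fine-tuning settings described above.
# Illustrative only -- this is not OneTrainer's actual preset format.
sd15_settings = {
    "base_model": "Hyperrealism Version 3",
    "model_type": "Stable Diffusion 1.5",
    "method": "fine-tune",                 # not LoRA / embedding
    "weight_dtype": "float32",             # full precision is mandatory for SD 1.5
    "resolution": 768,                     # sweet spot for this base model
    "batch_size": 1,
    "learning_rate": 7e-07,
    "text_encoder_learning_rate": 7e-07,   # same as the main learning rate
    "train_text_encoder_until": 10000,     # effectively never stop
    "ema": {"device": "GPU if VRAM allows, else CPU", "decay": 0.999, "update_interval": 1},
    "attention": "xformers",
    "gradient_checkpointing": "only if VRAM is limited (slower)",
}
```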
My currently running instance is here, and it is uploading at about 55 megabytes per second, which is a very decent speed. So, you really should read this readme file very carefully, and read all of the links I have shared in it; it will help you tremendously. This is super important — don't skip it. Watch the videos I have shared there, and ask me any questions on Discord or in the replies to the video; hopefully I will reply to every one of them. And if anything breaks while you are using my Patreon scripts, please message me from Patreon and I will fix it as soon as possible. Actually, the majority of my current work is maintaining these scripts — I have a lot of them, and it is really hard to maintain everything. I work on this full time, so it is my main income, my main way of paying my bills; I hope you understand the importance of supporting me. This tutorial was made after weeks of research, so it was a huge task, and it is literally the distilled experience of over 15 months. If you are a company, or if you are interested in more professional training — a bigger dataset, training a style, training objects, or anything else — I also offer private consultation; just message me on LinkedIn or Discord. I am also open to project-based work and to every kind of collaboration. Okay, you see, it says the upload was merged. If I open the repository on this machine, it says it does not exist, because it is a private repository, but when I open it on my own computer, I can see that all the model files we trained have been uploaded, along with everything else in that folder — for example, the SD 1.5 base model and the RealVis XL model. So that is how it is done. I hope you have enjoyed this tutorial. Finally, if your synchronization with the ThinLinc client stops working, here is what to do — let me demonstrate. Close the client and open the ThinLinc client again (sorry, I keep saying thin client; it is the ThinLinc client). Then get your password one more time, copy it, connect, and check the "end existing session" option — and let's see what happens, because this is dangerous: it terminates everything running on the machine, so all your unsaved progress will be lost, all your applications will be closed, and therefore your training, your generations, and everything else will be terminated. Okay, something happened; sometimes this can fail. Let's try again with "end existing session", connect, and verify the IP is correct — yes, it is. You should always verify that the IP and username match, then start. You see, it says "mounting local drives" — something is happening. I hope this error is not a common one; maybe we need to try a few more times. This time I will not select "end existing session", since I already did that — maybe that is the reason, and it just needs more time. Let's try again without "end existing session": start, mounting local drives, starting session — I think it will start this time. Yes, and you see, all of the applications are gone and the RAM usage is now 4 gigabytes. The data, of course, remains, but all running applications were terminated. There is no problem with the data, though; I can see all of it here.
I can start the Stable Diffusion Web UI and continue using it. Ending the session like this is almost the same as using the menu at the top right: power off, restart. If you do a full power off, you will probably lose everything — there is no way back; I am not sure, so I don't suggest it. There are also power settings; I think you should leave the power mode at its default. Okay, that is it. Now, when I terminate this session, everything will be deleted and gone forever, and I won't be able to recover it. However, I have saved everything to my Hugging Face repository, so there is no issue. I didn't save all of the generated images, but you know how to save them. I hope you have enjoyed this video. Please like it, subscribe to our channel, and support me on Patreon — you can see all our links here; please also follow me on them. I would also appreciate a comment about my new voice: you may have noticed that my voice changed, because it is already 7 a.m. here and my voice is degrading, but I would like to hear your opinion about the new microphone I purchased to improve the sound quality. More amazing tutorials are hopefully on the way. Please also turn on the bell notification so you don't miss anything, and on my channel you can use the search icon to find anything — type ControlNet if you want to learn ControlNet, or type SDXL and you will see the SDXL tutorials. I also have playlists, so you can look through all of them. Hopefully, new amazing tutorials are on the horizon. See you later.
Info
Channel: SECourses
Views: 7,966
Keywords: onetrainer, one trainer, dreambooth, dream booth, fine tuning, stable diffusion, sdxl, sd 1.5, stable diffusion xl, training, massed compute, massedcompute, OneTrainer
Id: 0t5l6CP9eBg
Length: 133min 13sec (7993 seconds)
Published: Tue Apr 09 2024