Become A Master Of SDXL Training With Kohya SS LoRAs - Combine Power Of Automatic1111 & SDXL LoRAs

Captions
Greetings everyone. Today I am going to show you how to install the Automatic1111 Web UI for Stable Diffusion XL, aka SDXL. How to enable quick VAE selection and use the correct SDXL VAE. How to use your trained SDXL LoRAs with the Automatic1111 Web UI; not only for SDXL, the same rules apply to SD 1.5 and SD 2.1 models as well. How to assign previews to your LoRA checkpoints. How to do an x/y/z grid checkpoint comparison test to find your best LoRA checkpoint. How to sort your generated images by resemblance to a reference image. How to update your existing Kohya GUI to the latest version or install it from scratch with the correct installation settings. How to prepare your training dataset images properly for Kohya training. How to do SDXL LoRA training on Kohya with the most VRAM-optimal settings for GPUs with lower VRAM. I will also show the best settings if you have a strong GPU such as an RTX 3090. Why I use ground truth classification / regularization images, with an explanation. Detailed information regarding the repeating logic of Kohya, which is a very commonly asked question. A very detailed explanation of Kohya settings and parameters for training. How to manually execute the training command from the command line interface for the Kohya GUI. How to save the state of your LoRA training and continue from a saved state. And how to get images as good as the ones you are seeing right now with LoRA inpainting. I have prepared a very detailed GitHub readme file. All of the instructions and links that you are going to need will be in this file. When watching this tutorial, follow the instructions written there, because some of the things I show may become outdated due to the speed of AI development; however, I will keep this GitHub readme file updated. To follow this tutorial, you need 2 things installed on your computer. The first one is Python. If you don't know how to install Python, I have an excellent tutorial for that; watch it. For the Automatic1111 web UI, Python 3.10.6 is suggested; however, I suggest you install Python 3.10.11, because the latest xFormers, which is our main optimizer library, now requires 3.10.11, and 3.10.11 works very well. Do not use Python 3.11; it won't work. It has to be a 3.10.x version. You also need Git. After you have installed Python and Git, you are ready to follow this tutorial. So I will show you how to download, install, and use the Automatic1111 SD web UI with SDXL. I already have an automatic installer. Installing it is actually very easy, but I also keep my Patreon post up to date, so let's do both the automatic and the manual installation. When you go to the Patreon post, you will see an auto_1111_sdxl.bat file and a download_sdxl.py file. Download both of them and move the files into any folder where you want to install your Automatic1111 web UI. This installation will not break or modify any existing installation you have. Double click the auto_1111_sdxl.bat file, click more info, click run anyway, and it will clone the Automatic1111 web UI repository, install it with xFormers, and also download the SDXL model files for you. Okay. Let's also do the manual installation. For the manual installation, we will begin by cloning the Automatic1111 web UI. Copy the clone command, make any new folder, and start a new CMD inside that folder like this. Begin with cloning. So the Automatic1111 web UI has been cloned. Let me also show you my Python version.
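For reference, the manual steps up to this point, run from a CMD window inside your chosen installation folder, condense to roughly the following sketch. The repository URL is the official Automatic1111 repo, and the COMMANDLINE_ARGS edit is the one explained in the next step:

    REM clone the official Automatic1111 web UI repository
    git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
    REM confirm that the Python on PATH is a 3.10.x release (3.10.11 recommended), as shown next
    python --version
    REM then edit webui-user.bat inside the cloned folder so it contains:
    REM   set COMMANDLINE_ARGS=--xformers --no-half-vae
    REM and double click webui-user.bat to install and launch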
Start a new CMD, type python, and it will display the Python version as you are seeing right now. This is important: you need to see your Python version like I am seeing 3.10.11. Okay. After the repo has been cloned, enter the repo, right click, and edit the webui-user.bat file. If you are not seeing the extension, go to View and check File name extensions; then you will see the extension. Right click; you can edit with Notepad, Notepad++, anything. I prefer Notepad++. In the command line arguments, we will add --xformers and --no-half-vae. That's it. Save it. Then double click the file and it will install the Automatic1111 web UI from scratch for you. If you are wondering why I added --xformers and --no-half-vae: these are command line arguments, and when you click this link you will see all of the command line arguments of the Automatic1111 web UI. SDXL requires --no-half-vae because the default SDXL VAE is not reliable in half precision, so it has to run in 32-bit; therefore we add it. xFormers is an optimization library; it will speed up our generation and our training as well. As you are seeing right now, the automatic installer is also downloading the model files automatically for us. So you need to download the SDXL model files if you haven't yet. I have put the direct download links here. This is the corrected VAE file; open this link and it will start downloading automatically, as you are seeing right now. Then we need the SDXL 1.0 base model; I prefer this one for LoRA training. Open this link, these are direct links, and it will start downloading that file as well. Once the files have been downloaded, you need to put them into the correct folders. Put the sd_xl_base_1.0.safetensors file into the models folder: cut it, move back into your installation folder, enter models, then Stable Diffusion; this is the folder where you need to put it, so paste it there. Then return to downloads, cut the sdxl_vae.safetensors file, move back into the fresh installation, go inside models, inside the VAE folder, and put it there. So these are the folders where you need to put your files. When you see "running on local URL" like this, the installation has been completed. The rest is the same for both the automatic installer and the manual installation. Open this URL, copy it, open it in a new window, and you will see the SDXL base model is selected. Are we done yet? No, not yet: we need to set the VAE that it is going to use. If you are wondering what a VAE is: with Stable Diffusion we work in the latent space, and the latent space resolution is 1/8 of the generated image. So for SD 1.5 it is 64 by 64; for SDXL it is 128 by 128, but we are generating 1024 pixel images. Therefore the VAE is used to reconstruct the latent image into the final output image. So how are we going to set our VAE? Go to Settings, then User Interface, then the Quicksettings list. When you click it, it will list all of the options. Type VAE, select SD VAE, then Apply settings and Reload UI. After you have reloaded your UI, you will see SD VAE: Automatic. What does Automatic mean? It means it will look for a VAE with the same name as the selected safetensors checkpoint. When you select None, it will use the VAE embedded in the model. Then you can select a specific VAE from here, and we will use this downloaded VAE. Why? I put the link here; open it.
I made a comparison between the SDXL 1.0 embedded VAE and the separately released VAE, which is the one we downloaded, and you will see the difference. With the embedded VAE it generates noise like this; you see those green pixels in the image. When you look at the downloaded VAE, you won't see them. So the embedded VAE has problems, and that is why Stability AI released this fixed VAE file. Now we are fully ready to start using SDXL with the Automatic1111 web UI. There is also the SDXL refiner, but it is not officially supported in the Automatic1111 web UI yet; it is in development. I am following the news and will hopefully make a new video when it is merged into the main branch. So you can type anything: photo of a speed car. SDXL works with 1024 by 1024 pixel resolution, don't forget that, and you are ready. Just hit generate, and this is our generated image. Now we can begin installing Kohya SS for our training. To use the Kohya SS LoRA training scripts, you first need to have the Visual Studio redistributable package installed. The link is here; open it and install it. After that, we will begin by cloning Kohya SS. This is its repository; let me open it. So this is the repository that we are going to clone and run. It is very simple to install: copy the clone command, make a new folder and enter it, open a new CMD, and paste the command. It will clone the Kohya SS repository into that folder and you will get a folder like this. Go inside the folder, find the setup.bat file, and double click it. It will ask you a bunch of questions. It expects Python 3.10.9; however, Python 3.10.11 also works. We are going to select option 1, Kohya SS GUI installation. You don't need to install the cuDNN files anymore because we are going to install Torch version 2, which comes with the proper cuDNN files. Then select option 2 here, Torch version 2. This is really important; do not select Torch version 1 anymore. This screen may take a while. When you open your Task Manager, go to Performance and select your Ethernet, you will see the download like this, because it is currently downloading and then installing the libraries. You won't see any messages until it completes the current package and moves on to the next one. So this is all you need to do for installation. Let's say you already have Kohya SS installed; how can you update it? Open a new CMD inside that folder and type git pull. It will fetch whatever is new; then run the gui.bat file, it will update the necessary libraries if there are any, and you will have the most up-to-date version of Kohya. This is really important because it is being updated frequently due to the SDXL model, so make sure you have the latest version. Recently one of my Patreon supporters was getting an out-of-VRAM error, and it turned out that his Kohya installation was not up to date; after we updated it to the latest version, the out-of-VRAM error was gone. If you ever feel like your installation has stopped, click on your CMD window and hit Enter, because sometimes when you click inside the CMD window it freezes; clicking it and hitting Enter makes it continue. After a while, when all of the dependencies are installed, you will be asked these options. Select This machine, hit Enter. Select no distributed training, hit Enter. Do you want to run your training on CPU only? No, hit Enter. Do you wish to optimize your script with Torch Dynamo? Type no, hit Enter. Type no and hit Enter for the next question as well. Do you want to use DeepSpeed? Type no, hit Enter.
Which GPUs do you want to use? Type all and hit Enter. Now this is important: if you have a GTX 1000 series card, use FP16. If you have some special kind of card, check whether it supports BF16 or not. If you have an RTX 2000, 3000, or 4000 series card, select BF16. I am going to select BF16 since I have an RTX 3060 and an RTX 3090. So BF16 is selected and we are done. That was the entire installation. What if you want to install on RunPod or on a Unix system? I have an auto installer here for RunPod. It should also work on Unix systems; you probably only need to change the workspace folder path. This is the video where I have shown how to install on RunPod, and its readme file is up to date. Hopefully I will make a new video about Kohya installation on RunPod for SDXL LoRA training. After the installation has completed, you can close this CMD and start the Kohya SS GUI in the browser. I will show you how to start it. Let's close this, enter the folder where you installed Kohya, and double click the gui.bat file; it will start the GUI automatically. If you have updated your Kohya installation, on this screen you will see whether any library was updated or upgraded. So this is my current installation; let's open this link. This is our Kohya GUI screen for LoRA training or DreamBooth training; it also supports Textual Inversion. Hopefully I will also work on DreamBooth training for SDXL and make an amazing video for that as well. For today's topic, we are going to train with LoRA. LoRA is an optimized version of DreamBooth training. If you are wondering what LoRA training is, I have this master tutorial, which is about 1 hour long; watch it. Also, if you are wondering about rare tokens, class images, and everything about DreamBooth training (which also covers LoRA training to some extent), watch this amazing tutorial; you won't regret it. So this is the Kohya LoRA GUI. First of all, we begin by selecting our source model. You see there are quick selections of models; however, we will use a custom model, so select custom from here. Then it will ask for the pre-trained model name or path. Click this icon. If you are on RunPod, you would need to give the full path, but here we can just browse. We downloaded our model inside the Automatic1111 web UI folder, so I will go there: models, Stable Diffusion, and the sd_xl_base_1.0 safetensors file. This is the base model file on which I will do my LoRA training. Don't forget to check the SDXL model checkbox. After doing this, make sure you save your configuration. I also shared my configuration on GitHub; when you go to this section, you will find the settings I used in this video. After downloading it, change the extension to JSON. Let me show you. So this is the file; I will change the extension to .json and click Yes. Now I can load it: open it from downloads, select this one, and it will load all of my settings. However, if you do this, you also need to change your image folder path, regularization folder path, output folder path, logging folder path, and the model output name if you wish. So let's start from zero again. Okay, I have selected my base model and checked SDXL. I will save my configuration: click Save as, choose where to save it (let's save it inside this folder, but you can save it wherever you want), name it test1, and click Save. It will generate this file and save whatever configuration you make.
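As an aside, if you ever need to redo the Kohya installation or update it later from the command line, the steps covered above condense to roughly this sketch (assuming the repository is the bmaltais/kohya_ss GUI repo linked in the readme):

    REM Fresh install: clone the Kohya SS GUI repo and run its setup script
    git clone https://github.com/bmaltais/kohya_ss
    cd kohya_ss
    REM in setup.bat, pick option 1 (Kohya SS GUI) and Torch version 2
    setup.bat

    REM Updating an existing installation later: pull the latest code, then start the GUI
    cd kohya_ss
    git pull
    gui.bat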
Back to the configuration: after each change, don't forget to click the Save button. When you do that, you will see a message in the CMD window where your Kohya GUI instance is running; whatever we do, we will see it there. So we will begin with the preparation of the training images. This is really important. But before that, I noticed that I was using the DreamBooth tab. Don't make this mistake; we are doing LoRA training, so select the LoRA tab and let's make the same changes quickly: select the model, Stable Diffusion, SDXL base, check SDXL model, Save as, choose the saving file here, save, OK. Make sure you are in the LoRA tab. Now we can begin preparing the training dataset. Click Tools. Here you will see it marked as deprecated. I don't know why they are deprecating it, but this is the best way of preparing your training images; otherwise you are likely to make mistakes. The instance prompt will be ohwx. There is also a debate that you should use the name of a celebrity who looks like you if you are training yourself; I will hopefully make a video comparing the celebrity-name approach versus the rare-token approach, but for this video I will use ohwx. The class prompt will be man. As I said, if you don't know what rare tokens, class tokens, and class images are, watch this excellent tutorial. Okay, training images. This is really important. Currently I am using these images, and these images are not good, not optimal. Why? Because they have repeating clothing and repeating backgrounds, which makes the model memorize those as well. Hopefully I will make a dedicated tutorial on how to prepare a good training dataset very soon, but for now we will use this; as I said, it is not a very good training image set. All of the images have the same resolution. Let me show you: right click, sort by, More, select width and height from here, then View, Details. Okay, all of the images are 1024 by 1024. Kohya scripts support bucketing as well, which means training with images of different resolutions, but I suggest you use images with the same dimensions for both the training images and the classification images, because for one of my viewers bucketing was causing an error; I had to connect to his computer to find the reason, and it was that he had so many different resolutions that bucketing was failing. So these are my training images. Copy the path: you can select the path from here, Ctrl+C to copy, and paste it here. Alternatively, you can select the path by clicking here. Now the regularization images. This is a really debated topic, but I will show you a proof of my approach right now. I am using the ground truth regularization / classification images approach. What is that? When you click this link, you will get to this page. This page is EveryDream2trainer; EveryDream trainer is also a training script, and hopefully I will do a training with it as well and compare the results with Kohya and with the DreamBooth extension of the Automatic1111 web UI. Here it is very well explained: ground truth means real images, like the ones originally used to train Stable Diffusion. So what are they in this case? They are real images of people, as you are seeing right now. These are my classification images. I spent a huge amount of time preparing them. They are shared in this Patreon post, but you can also prepare your own ground truth regularization / classification images; using very high quality real images is the key to ground truth images. I have shared these images in many resolutions.
They are collected from Unsplash.com, so they are freely available to use, even for commercial purposes. Images in all of these aspect ratios are shared on my Patreon, and you can download them from this link. So how did I generate perfect quality images in this many aspect ratios? When you open these images, you will see they are in the perfect ratio with the maximum possible resolution. I explained the scripts that I used in this tutorial video; watch it to learn more about my subject cropping script. I shared the script in the video for free, and you can also download it from my Patreon. The script is included in that post along with the previously prepared images. They are also not just auto-cropped; they are preprocessed as well. What do I mean by preprocessing? When you watch this excellent tutorial video, you will see the preprocessing scripts. With preprocessing I eliminated outlier images from the dataset, which is minimal when you use Unsplash, but there are still some images that need to be eliminated. For example, if there are two people in one image, I eliminate that image: I delete it or move it out of the folder, so we don't have any classification / regularization images with two people in them. I then manually verified all of the images before sharing them with you, so it was a huge task for me. Therefore, if you support me on Patreon, I would appreciate that very much. Still, you can find all of the scripts in these tutorial videos; they are YouTube videos, open them and you will be able to replicate what I have done. Since we are training at 1024 by 1024 pixels, I will use the dataset at this resolution. Let me show you their resolution: right click, Details, and you see all of the classification images are the same resolution. If you don't use classification images with the same resolution, especially in the DreamBooth extension of the Automatic1111 web UI, you will get an error and it will try to generate new images; I don't know what the behavior of the Kohya scripts is. So make sure your classification images have the same dimensions as your training images for best results. Let's copy this path, go over to the Kohya GUI, and paste it. Now, the number of repeats, which is one of the most confusing and debated topics in Kohya. I asked the original developer of Kohya about the repeating logic; click this link. You will see that I asked about four cases, and thankfully Kohya SS replied to my questions. Moreover, we are discussing how the repeating logic can be improved; you can also leave a comment there, and I hope you do, because I am hoping Kohya will improve the repeating logic. Read this thread and you will understand it. I will use 25 repeats, so in one epoch it will train on each image 25 times, with 25 different regularization images. You can easily calculate how many regularization images you need: it is the number of training images multiplied by the number of repeats. So I need 325 classification images in total. If I had 20 training images, I would need 20 multiplied by 25, which is 500 classification images. This is how you calculate the number of classification images you need. Now the destination training directory: this is where the generated LoRA model checkpoints and other things will be saved. If you want to use them directly with the Automatic1111 web UI, you can do this: click the folder icon, go to wherever you installed your Automatic1111 web UI, enter models, enter LoRA, and save it there.
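A quick recap of the math, plus a sketch of what the destination folder typically looks like after the next step (Prepare training data). The exact folder names come from your repeat count, instance prompt, and class prompt, so treat the layout and the example path below as an illustration, not the literal output on your machine:

    REM regularization images needed = training images x repeats
    set /a NUM_REG=13*25
    echo %NUM_REG%
    REM prints 325

    REM expected layout under ...\stable-diffusion-webui\models\Lora after preparation:
    REM   img\25_ohwx man\   <- the 13 training images (25 repeats, instance "ohwx", class "man")
    REM   reg\1_man\         <- the 325 ground truth regularization images (1 repeat, class "man")
    REM   model\             <- LoRA checkpoints and saved states are written here
    REM   log\               <- training logs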
So our generated checkpoints will be saved inside the LoRA folder, and therefore we will be able to use them immediately and very easily. After doing this, click Prepare training data. Once you click it, you will see the messages here, and when you enter that LoRA folder you will see that all of the folders have been generated like this. Make sure you get the done message, because it copies all of the images into those folders: you see all of the class images are copied, and then all of the training images are copied as well. After that, don't forget to click Copy info to folders tab. Then you are ready; the training image directories have been prepared like this. Now the model output name. This is important: whatever model name you give, the files will be saved with that name. Let's say 12vram; this will be my model output name, and then click Save. Don't forget to save your changes. Are we ready? No. Now we need to set the training parameters, and this is where LoRA training becomes very complex, because there are so many different LoRA types and so many different optimizers; each one has different effects and different settings, and finding optimal settings is very hard. I have done over 20 trainings and I will share my best training values with you. Train batch size: make this 1. Why? Because in machine learning, if you make the training batch size higher, you reduce your training generalization. Moreover, an increased training batch size requires a different learning rate, so if you increase your batch size you need to test increased learning rates as well; the learning rate will not stay the same. Number of epochs: I train people with up to 200 epochs. Since my repeat count is 25, I will train for 8 epochs, because in 1 epoch it trains each training image 25 times, which is effectively 25 epochs; the total number of effective epochs will therefore be 200. I will save every 1 epoch, so I will actually be saving every 25 effective epochs. Caption extension: if you use captions, you need to define their extension. However, for training a person, I don't find that captions improve the quality. If you were doing fine-tuning, like 200 subjects with 10,000 images, captions would be necessary, but for teaching a person I don't find them useful. Hopefully I will make a fine-tuning video and explain the logic there as well. So how can you improve your training quality without using captions? Improve your training dataset and it will improve your training quality: make sure your images are the very best quality and don't have repeating patterns such as clothing or backgrounds, and your training will become much better. Okay, select cache latents and cache latents to disk; these help with VRAM usage. Mixed precision will be BF16 since my graphics card is an RTX-series card, and save precision will be BF16 as well. If BF16 doesn't work on your graphics card, select FP16 instead. You may be wondering what these are. This is related to machine learning training: normally each number has a certain precision, which is FP32 by default, but that uses much more VRAM, and when you use mixed (half) precision the quality is almost the same. So for saving VRAM we use mixed precision: BF16, BF16 selected. Number of CPU threads per core: I increased this but didn't find much improvement.
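To make the epoch math above concrete, here is the same calculation written out, using this video's values of 25 repeats and 8 GUI epochs:

    REM effective epochs = GUI epochs x repeats per image
    set /a EFFECTIVE_EPOCHS=8*25
    echo %EFFECTIVE_EPOCHS%
    REM prints 200; saving every 1 GUI epoch therefore means a checkpoint every 25 effective epochs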
Actually, there is an amazing document you can read: find "explaining LoRA learning settings" in the readme and open that link. It is a huge file which explains the majority of the settings that you will see in Kohya training. This file is gold, believe me. Read it and you will learn so many things about how LoRA works, how Stable Diffusion works, and what the training parameters and options are. If you are interested in learning what these parameters mean, read this file; you won't regret it. Okay, the seed is optional; you don't need it. Now the learning rate. This is really important: the learning rate depends on your selected LoRA type and your optimizer. I am using Adafactor. This optimizer is supposed to be the lowest-VRAM optimizer. How do I know? Search Google for "Adafactor VRAM usage" and you will find the Efficient Training on a Single GPU page on Hugging Face; search for Adafactor there and you will get here. It uses a very low amount of VRAM, and to get over its instability issues we are going to use these extra arguments: scale parameter false, relative step false, and warmup init false. With those it becomes stable and uses a minimal amount of VRAM. You see AdamW uses 24GB of GPU memory, Adafactor uses 12GB, and 8-bit quantization uses 6GB. However, 8-bit quantization is not supported for DreamBooth training yet as far as I know; it is currently available for large language model training. So we pick Adafactor. We set the learning rate to 0.0004, which is equal to 4e-4 in scientific notation; however, we type it as 0.0004. For the learning rate scheduler I will use constant; as I said, read that file to learn more about all of these options. So, constant. When you use the constant learning rate scheduler, the warmup value stays 0. Okay, Adafactor is selected. Don't touch these two text boxes. Optimizer extra arguments: this is really important. In our readme file, copy these optimizer extra arguments and paste them here like this. Max resolution: this is also really important for SDXL. We are going to use 1024 by 1024. We could possibly use a higher resolution, and I will explore that, but for 12GB training you cannot go higher; you should use 1024 by 1024. Stop text encoder training: 0. Don't enable buckets. In some cases, when people enabled bucketing, they saw the training speed decrease; I think the bucketing system has some flaws, some bugs, so don't use it, and since we are training with a single resolution, we don't even need it. Text encoder learning rate: I use the same learning rate for both the text encoder and the UNet; as you are seeing right now, I made them equal. You can select full BF16 training. This is supposed to reduce VRAM usage, but it doesn't work with all optimizers; still, it is fine to select it. Now Network Rank (Dimension). This is one of the most crucial parts. As you increase it, the LoRA will be able to learn more details and more information; however, it will also increase your VRAM usage and take more hard drive space, since each checkpoint file becomes bigger. If you have an RTX 3090 or 4090, a graphics card with 24GB, I suggest you use a Network Rank (Dimension) of 256; with that dimension, each generated file will be 1.7GB. However, on GPUs with 12GB you will get an out-of-VRAM error, so I suggest you start with 32; you should be able to train with 32. If you are not able to train with 32, that means other applications are using your VRAM. How can you reduce that? Reduce your screen resolution, right click and open your Task Manager, go to Startup, and disable all of the startup applications.
So do whatever you can to reduce VRAM usage; you shouldn't have more than about 500MB of VRAM in use when Kohya is not running. That way you will be able to use Network Rank (Dimension) 32 or even 64. I will do the testing on my second GPU, an RTX 3060; it has 0 VRAM usage because it is my second GPU, and on it I am able to train with up to a 96 network rank. However, I tested this on one of my Patreon subscribers' computers and he was able to train with a network dimension of 32 on Windows 11, so you should also be able to train with network dimension 32. It generates decent results; this is also the network rank I used to generate the images I showed you at the beginning of the video. Network Alpha: this is not very well explained anywhere, but it actually affects the learning rate. When you are training, you are changing the weights of the model, and network alpha changes how strongly the learning rate is applied. For a standard LoRA you should leave the Network Alpha value at 1; if you use other LoRA types, then it changes. Okay, all of these are the default advanced configuration values. Here, all you need to do is check Gradient checkpointing; this is really important, otherwise you will get an out-of-VRAM error, and other than that you don't need to change anything. Use xFormers is selected. You can also select memory efficient attention, but it does not reduce VRAM usage. You can also try full FP16 training; I tested it and it didn't make any difference in VRAM usage either. So these are all of the settings. There is one more thing: No half VAE should be selected for SDXL. Some people also asked me how they can continue training at a later time. To do that, you need to check Save training state, and if you want to continue from a previous training state, you need to use Resume from saved training state; I will show you how to do that later in this tutorial. I checked Save training state. I am not generating any samples during the training: you are likely to get an out-of-VRAM error if you do that, especially on low-VRAM GPUs. So these are all the settings; let's save them, and then you can click Print training command. It won't start the training; it will display all of the settings and how many images it has found. This is really important: you need to see all of these values correctly. It says it found 13 training images and that it is going to take 325 steps for each epoch. Why? Because the repeat count is 25, and it reads this repeat count from the folder name. Then it says it will use regularization images, and the regularization image count will also be 325, because the regularization repeat count is 1 and in each epoch it does 325 steps, using 1 regularization / classification image per step; therefore it is 325. Training batch size 1, gradient accumulation steps 1. Do not increase these two unless you have to. In which case might you have to? If you need faster training. However, it will reduce your training quality and force you to find a new optimal learning rate; your learning rate will change if you increase the training batch size or the gradient accumulation steps. Gradient accumulation is effectively a simulated batch size; I explained this in previous videos, and it is not very important right now. Hopefully I will make a video about general fine-tuning and explain all of these things in more detail.
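To keep the key values from this section in one place, here is a compact recap of the 12GB-oriented settings used in this video. For the Optimizer extra arguments box, take the exact text from the readme; based on the three flags named above, it typically looks like the last line below, but treat the exact spelling as an assumption and prefer the readme if they differ:

    LoRA type: Standard
    Optimizer: Adafactor
    Learning rate: 0.0004 (same value for the text encoder and the UNet)
    LR scheduler: constant (warmup 0)
    Max resolution: 1024,1024 (buckets disabled)
    Network Rank (Dimension): 32 (256 if you have a 24GB card)
    Network Alpha: 1
    Train batch size: 1, gradient accumulation steps: 1
    Mixed precision and save precision: bf16
    Gradient checkpointing: enabled, No half VAE: enabled, Use xFormers: enabled
    Optimizer extra arguments: scale_parameter=False relative_step=False warmup_init=False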
So how many epochs will we train? We will train for 8 epochs. Then we will compare each of the generated checkpoints and find the best one; I will show you how to do that. The max training steps are calculated like this: in each epoch it will do 325 steps, and 1 step means one GPU execution. Why 325? 13 images times 25 repeats. Then it is divided by 1, because our training batch size is 1. Then it is divided by 1.0, because the gradient accumulation steps are 1.0. Then it is multiplied by 8, because we are going to train for 8 epochs. Then it is multiplied by 2, because we are using classification images: in each step it will also train on a classification / regularization image, which will further fine-tune our model, since we are using ground truth images. What if you don't have a subscription to my Patreon? Then you can generate your class images with the Automatic1111 web UI. Let me start it. It is starting my Automatic1111 web UI. Okay, it has started. You see it has taken the next available port because our Kohya GUI is already using this port. Let's open it. Let's say you are going to train a woman. I don't have regularization images for women yet, but it is on my to-do list and hopefully I will release them on my Patreon. All you need to do is type "photo of a woman". Use a plain prompt like this, because you want to get images that are as normal as possible. Set your width and height to 1024, then right click and generate forever, and generate as many as you need. Here is an example image. I also set the sampling steps to 30, but you see this image is nothing like a real photograph. This approach is also useful when you want to keep the original style of the model, or a certain style of a model: you can also use styled images, and whatever you use will further fine-tune the model toward that style or that quality, so it is really important. And finally, it will list the whole command like this. The GUI is basically generating the command line interface command for you and executing it; that is the only difference between the GUI and the original Kohya SS scripts. You can also copy this command with Ctrl+C, paste it into a text editor, and change any parameter you wish. For example, some people asked me how they can train only the UNet. So this is the --train_unet_only argument; append it to the end of the training command. Copy it, enter your Kohya GUI folder, enter the venv virtual environment folder, enter scripts, start a new CMD, and type activate. After you have activated the environment, paste the command and hit Enter, and it will start training. We will see in a moment. It gave me an error. Why? Because it is looking for the sdxl_train_network.py file, but since it uses a relative path, the file is not available from here. So I will move into the main folder like this and copy and paste the command again. I pasted the command again, hit Enter, and now it should work. Whenever you get an error, you should look at where the error happened; sometimes people ask me about this. Okay, we got another error: it says that sdxl_train does not support training the UNet only. However, the documentation says it is supported, so this is probably a bug; if they fix it later, you can try this. Okay, let's close this, return to our GUI instance, and start the training, and we will see that it starts. This time it will execute the training, not just display the command. However, this training will run on my main GPU right now.
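For reference, the manual command line run just demonstrated boils down to the following sketch. The folder name assumes the default kohya_ss clone, and the training command itself is the one the GUI prints for you (or the one shared in the readme), so it is not reproduced here:

    REM open a CMD inside the Kohya SS installation folder so relative script paths resolve
    cd kohya_ss
    REM activate the virtual environment that setup.bat created
    .\venv\Scripts\activate
    REM then paste the full training command printed by "Print training command" and hit Enter
    REM with this video's settings the expected step count is 13 x 25 x 8 x 2 = 5200 steps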
So how can you use your second GPU, and, for demonstration purposes, train on low VRAM? I will now start the training on my second GPU. Before closing the existing running CMD window, I will save the configuration. Then I will close both of these windows. We are going to use the set CUDA_VISIBLE_DEVICES environment variable; you can use this with pretty much every application. Where do we need to add this command? We need to add it to the gui.bat file, which is here: right click, click edit, and add that line after the call to activate.bat, which activates the virtual environment. Save it, then start your GUI. Now we will see that it displays the RTX 3060 on the screen with its available VRAM and other properties. So let's reload our interface, open the LoRA configuration we saved here, select it so it automatically loads everything, and hit Start training. Let's look at the VRAM usage of my RTX 3060: since it is fully empty, with no other application using it, I will get the best performance and we will see exactly how much VRAM it uses. Okay, it is starting. The first time, it will cache all of the images to disk as permanent files since we selected that option. It will take some time. Normally, using multiple threads is supposed to speed this up, but it doesn't; maybe it is a bug. Caching latents uses the CPU, but it is single threaded. I think it is actually supposed to follow the number of CPU threads per core option we defined, but it still uses a single thread, so it is slow. I reported this to Kohya and I hope he fixes it, because with many images it would take an unnecessarily long time. This caching of images to disk happens only once; then it starts training. You see it is starting to use more dedicated GPU memory. Your shared GPU memory will not be used, so don't count on it. With network rank 32, it is using 11.5 gigabytes of VRAM on my GPU, and the speed is 2.4 seconds per iteration. This is probably the maximum speed you are going to get currently; if the script's library dependencies improve, it may get better, but for an RTX 3060 this is probably the maximum right now, and it's a decent speed. You see, my training will be completed in 3 hours 30 minutes. When you run this script for the first time, you will see this duration decreasing faster than it should, because there is a statistics issue; so wait about 10 minutes to get the final seconds-per-iteration value and the real duration of your training. Until about 10 minutes in, it will keep decreasing the time. When you check your training folders, you will see these files. These are the cached latents of the images, since we cached the latents to disk as well; when I do another training, it won't generate these cached latents again, it will use the existing ones. Okay, it has dropped to 2.38 seconds per iteration, so this is not final yet. If you get an error like "image file is truncated", that means one of your training or classification images is corrupted. This usually happens when you use RunPod or other cloud services and your image upload did not complete. So if you get such an error during training, make sure all of your images open correctly. I have an image validator script on my Patreon; you can use it to verify the validity of all your images. The link to the image validator script is also shared in the GitHub readme file. The training is continuing.
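For reference, the gui.bat edit from the start of this step looks roughly like this. The device index is an assumption (0 is usually the first GPU and 1 the second), so pick the index that matches your own second GPU, and leave the existing activation line in gui.bat untouched:

    REM add this line in gui.bat, right after the existing call to venv\Scripts\activate.bat:
    set CUDA_VISIBLE_DEVICES=1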
Now it is displaying 3 hours, 21 minutes; so far 5 minutes have passed. If you get any error, join our Discord channel; the link is here. You can also follow me on Twitter from here. I have a Udemy course as well, and if you support me on Patreon I would appreciate that very much. You can also follow me on LinkedIn, and I have a CivitAI profile too. All of the links are here; you will find them in all of my tutorials, and I appreciate it if you follow me or support me. It has been almost 10 minutes and the estimate has almost stabilized: it is going to take 3 hours 26 minutes for 5200 steps. Kohya added a stop training feature as well. Let's stop it and try a network rank dimension of 8: if you can't make 32 work, you can reduce your network rank dimension to reduce VRAM usage. Let's start training again to see how much VRAM it uses. You see, after I stopped the training the VRAM usage dropped to 0. Then it starts training again. You see the screen had frozen; I hit Enter and it continued. Now it is starting. You will also get this message: error caught was "No module named triton". The Triton optimizer is not available on Windows, so you will see this error too; don't worry, it is just an optimizer, so we get a slightly lower speed, but it works perfectly fine. I hope the Triton developers release packages for Windows as well; I even opened an issue on their development GitHub. You will also see that it says it is using the DreamBooth method even though we are doing LoRA training, because LoRA is an optimized way of doing DreamBooth training; LoRA is an optimization technique used in machine learning training. Okay, this time it won't re-cache the images because it already has the cached latents. It is starting. Let's see how much VRAM it uses this time. You see the VRAM usage is reduced by about 300 megabytes: it was 11.5 gigabytes, now it is 11.2 gigabytes, and the displayed speed is still settling because we need to wait about 10 minutes to see the final training speed. Okay, the VRAM usage settled at 11.2 gigabytes, I think it has stabilized, and the speed has settled as well. What if you still can't make it work, and what if your VRAM usage is too high? You will get terrible speeds, such as 30 seconds per iteration, if it uses too much VRAM, because it will bottleneck the GPU. Sometimes you won't get an out-of-VRAM error; training will continue but will be far too slow. This happens when it uses too much VRAM. Let me demonstrate. I will stop the training and set the network rank dimension to 96; actually, let's try 108. Okay, now it is using 11.8 gigabytes and you see the speed has decreased significantly. Now I will increase the network rank dimension further to bottleneck the GPU VRAM even more: stop training, let's try a network rank of 116. Okay, with this higher network rank dimension it is now going to take almost 7 hours. So let's say that no matter what you do, you are getting an out-of-VRAM error: make the network rank dimension 4. This is really low and will not be very good, but if this is all you can do, you need to do it. Then start training and see how much VRAM it uses: it is using 11.2, the same as rank 32. Therefore, find your optimal rank. Okay, now I will continue the full training with network rank 32. It has been over 1 hour and we have saved 2 checkpoints; you see we are at epoch 3. You will notice that the first checkpoint, for epoch 1, was saved at 650 steps. Why? Because of the number of training images multiplied by the repeat count, which is 325,
and since we also have classification images, it is 650 steps per epoch. And since we are saving every one epoch, the model checkpoints get generated. Where are they generated? In the folder we defined. We are also saving the state for continuing later, and each state is 8 gigabytes; I think the saved state size is independent of the network rank dimension you use. The checkpoint files are only 200 megabytes, but each state is 8 gigabytes. I will not wait until this training is completed, because I have done that before. I will stop the training and show you how to continue from a saved state, because I was asked about it. In the Kohya GUI, go to the advanced configuration section, which used to be at the bottom (now it is at the top). Find the save training state section, and there you will see resume from saved training state. Click the folder icon and go to the folder where your training was saved: inside LoRA, inside model. Let's continue from state 2, which was generated at the second checkpoint. Okay, select the folder; now it should continue. Let's hit Start training and see what happens. Okay, you see the CMD window froze, so I need to hit Enter. Okay, now it is continuing. Whenever you start a training, it also generates a unique JSON file, saved here as you are seeing; so even if you don't save your configuration, you will have a saved JSON file that you can load. Okay, we should see it loading the saved state. Yes, we can see "resume training from local state", and this is the state it is loading and resuming from. This is how you resume your LoRA training when you are using the Kohya GUI. So now I will close my Kohya GUI. I have previously completed checkpoints here; you see the "12 GB settings" safetensors files. At the end you will also get a "12 GB settings" safetensors file without any additional numbering; this is the last file it generates. Let me move them into the new folders. So this is our fresh installation: models, LoRA. I will just paste them here for demonstration; they would have been saved into the model folder automatically if we had continued our training. Now let's start our Automatic1111 web UI and I will show you how to do a checkpoint comparison and how to use those LoRAs. Okay, double click the webui-user.bat file; Automatic1111 is starting. This is the commit hash I am using right now, and these are the web UI arguments I am using. I don't have any extensions installed. You don't need the Additional Networks extension anymore to use LoRAs with the Automatic1111 web UI, so don't use it. Okay, it started on this URL; our SD web UI has loaded. So how are you going to use your trained LoRA? When you click this icon, it will open this tab, and here you see Textual Inversion, which is another training methodology; Hypernetworks, also another training methodology (I have a tutorial for Textual Inversion, not for Hypernetworks); Checkpoints, which are our models; and LoRA, which is where we will pick our LoRA. You see it also displays the folders inside the LoRA folder and all of the LoRA checkpoints that I have. These "12 GB settings" files are the LoRAs I previously trained, and you see 12vram: this is our first checkpoint from this video's training, and this is the second checkpoint. How are we going to know which checkpoint performs best? We trained our face as "ohwx man": ohwx is our rare token, man is our class token.
Let's type "photo of ohwx man" and nothing else. Then we need to add the LoRA checkpoint. To add it, you can either type its full name or click this icon and it will append it here. Okay, after it is appended, click Generate. Currently we are using 512 by 512, so we get a very bad quality image, as you are seeing right now; with SDXL you need to use the native training resolution or bigger. Okay, let's hit Generate, and we got the image. It is not very good because of the prompt, and it has also pretty much memorized the background and the clothing. So this is our first attempt to see what we are getting; we will make this much better. Okay, let's try another checkpoint like this and let's also change the prompt. I have shared some prompts here, so let's copy the prompt from our GitHub readme file and paste it. This is going to use the seventh checkpoint. Then let's copy the negative prompt and paste it here, set the sampling steps to 30, and hit Generate. Okay, it is coming up. This is not cherry-picked; I will show you how to get very good quality images as well. Okay, this is the image. Let's generate 10 and then pick the best one to assign as a preview for our LoRA. How are we going to do that? Hit Generate; this will generate 10 images one by one, and you can see the whole progress here. It is 2.88 iterations per second. Why? Because I am recording a video right now with Nvidia Broadcast, which uses a lot of GPU power. Okay, I got 10 samples; let's see which one looks decent. Let's say this image looks decent and I want to set it as the preview of the seventh LoRA file. How do I do that? Open the LoRA tab again, go to the seventh checkpoint, click this icon, choose replace preview, and you see it replaces the preview with the selected image; save, and now this is the preview of my LoRA. So how am I going to find the best LoRA checkpoint? First of all, you need to decide on your prompt; this is really crucial. For finding the best LoRA I will use this prompt, copy and paste it, then copy and paste the negative as well. Make sure you adjust the LoRA checkpoint name to match yours; this one is mine, so I will delete it. Let's go to the LoRAs; for example, let's start from the first one: this is the first LoRA file. Moreover, to make it easier, I will do a manual rename. This is where my LoRAs are located; the final LoRA is named like this, and I will make its naming the same as the others, so I am renaming it as 8, because it is actually the eighth checkpoint. Then all we need to do is change this number. Copy it, go to the x/y/z plot, and use Prompt S/R (search and replace): it looks for the first string you list inside the prompt you have written, and replaces it with each of the values, like this. So I am going to check each one of the checkpoints, as you are seeing right now; it will test all 8 checkpoints. Make the batch count something like 12 so it generates 12 images for each checkpoint. Moreover, CFG 9 sometimes works better for LoRA training, at least for me, but it may not be the same for you. This is the weight of the LoRA: if you reduce it, the effect of the LoRA is reduced. This weighting here also increases the importance of these two tokens, which are our trained rare token and class token. Okay, this is all optional and up to you: you need to find a good prompt as a base and do this checkpoint comparison. For LoRAs, this is the way, as sketched below.
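As a concrete illustration of the x/y/z setup just described, the fields end up looking something like this. The checkpoint names assume Kohya's default per-epoch naming for a model output name of 12vram, with the final file renamed to follow the same pattern as done above, so substitute your own file names:

    Prompt:      photo of ohwx man <lora:12vram-000001:1> (plus the rest of the test prompt and the negative prompt from the readme)
    Script:      X/Y/Z plot
    X type:      Prompt S/R
    X values:    12vram-000001, 12vram-000002, 12vram-000003, 12vram-000004, 12vram-000005, 12vram-000006, 12vram-000007, 12vram-000008
    Batch count: 12, CFG scale: 9, resolution: 1024 by 1024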
Then let's set the grid margins to 50 and hit Generate. We can see the progress here: it says it will generate 96 images in an 8 by 1 grid, with 12 images per cell. Okay, let's just wait now. So the x/y/z plot generation has been completed. Let's open it from the folder: click here, it will open the text-to-image output folder; go to outputs, then grids, and there you will see the final grid file, which is 110 megabytes. I will open it with Paint.NET, and here it is. Now this part is totally subjective, and you need to figure out which checkpoint looks best. Looking at the first checkpoint, it is certainly under-trained. The second one also looks under-trained. Let's move to the third checkpoint: you see, it looks decent. The next one, checkpoint 4, again looks decent. With LoRA training, every checkpoint produces very different images; this is not the case with DreamBooth training, but with LoRA the results differ a lot between checkpoints even when you use the same seed values. Let's look at checkpoint 5. Okay, this is a really good picture; actually, let's compare it with our training dataset. The training dataset is here, and yes, let's open this image: you see, this is the LoRA-generated image and this is the real image. Okay, this is also good. Let's look at checkpoint 6; here we see checkpoint 6. Okay, let's look at each example. You need to find your best checkpoint; as I said, this is totally subjective, so I can't say which one will be best for you. And these are the other images. Now, there is something very important: at checkpoint 8 we see overtraining. How? It has even memorized this thing on the wall in the background, and here it is fully memorized. It has lost its flexibility: even though I specified a white suit, the suit is not even completely white, and the background is also pretty much memorized. So checkpoint 8 is definitely over-trained; this is a clear indication. Let me show it to you from the training dataset: you see, this is on the wall, and it is also on the wall here; even this detail, yes, you see, it is completely memorized. This is why you should have different backgrounds. So we eliminate checkpoint 8. At checkpoint 7 it is still a little bit memorized; the quality is decent, but still a bit memorized. Okay, let's look at checkpoint 6. On checkpoint 6 we don't see that kind of memorization and the results are also very decent; there is still some memorization, but not at that level. And on checkpoint 5, the results are really good. So either checkpoint 5 or checkpoint 6 is good. We can do another test to see which one generalizes better. This test was for realism: for realism we can use checkpoint 5 or checkpoint 6, and even checkpoint 7; actually, I used it to generate thousands of images, which I will show you. So what could another test be? On the GitHub file, when you click the Stable Diffusion link here, it will open my main repository. Please also star it; we have reached over 700 stars and it is really important. Please also fork it and watch it, and if you become my sponsor I would appreciate that very much. In here you will see an amazing prompt list for Stable Diffusion. I keep updating and improving this file, and recently I also started adding pictures along with the prompts. So I will test this prompt because it looks pretty decent.
When you click this image, it will show you the output. Let's do another test with this prompt: select and copy it. This prompt doesn't have any negatives, so let's paste it, delete the existing LoRA tag, delete the negative prompts, and hit Generate again to see what we come up with. Since the x/y/z plot is still selected, it will compare each one of the checkpoints. You could actually remove the first 4 checkpoints to speed things up, but for this experiment I will keep them so we can see the difference between every checkpoint. By the way, the last generation took about 18 minutes for all of the checkpoints. The new test is also completed; let's check out the results, which are also inside the grids folder. Let's open it. Okay, let's re-evaluate. This is the first one; okay, second checkpoint, third checkpoint, fourth checkpoint, fifth checkpoint, the sixth, which is still pretty amazing, especially this one. Okay, the seventh. Wow, I think the seventh looks even better; yeah, really, really good. And the eighth checkpoint: the eighth was over-trained, so the seventh is also really good. So this is how you can evaluate your results and see at which point they start to become over-trained; it is totally up to your taste. From my past experience as well, I find that up to 200 epochs delivers good results. So yesterday I used this prompt, changing the suit color, and generated over 2000 images; you are seeing them right now, starting from 0 and going all the way down to 2471. Finding the best images among this many is really, really hard: after looking for a while you become desensitized to faces, right? You won't even be able to recognize yourself. So what can you do? I have an amazing tutorial that will sort your generated images based on a reference image. The tutorial link is here; you can watch it and learn it, and I have shown the script that I use. Alternatively, you can download the script from this Patreon post; there are also instructions shared in that post, as you are seeing. So you can either become my Patreon supporter and download it, or watch the tutorial to learn how. This script is very easy to use: after you install the requirements, you just give it two folders, as you are seeing, the reference images folder and the generated images folder, and it will sort the generated images based on your given reference images. Now I will show you the results. At first I used this as the base reference image. Based on this real image, which is from the training images, it will sort all of the generated images. It doesn't have to be from your training images: you can provide any reference image, and based on face similarity the algorithm, the AI, will sort all of the generated images. Now let's look at the sorted images. Here we see the sorted results: you see they are amazing quality, and based on this sorting you can find your best images and use them on your LinkedIn profile, your Twitter profile, wherever you need them. Can these images be improved? Yes! If I had used a better training dataset, I would probably have been able to get better results. Moreover, I find that the SD 1.5 workflow in this tutorial is still better than SDXL for realism; however, SDXL is much better for styling than training SD 1.5-based models. I will also explore DreamBooth training of SDXL, and I am expecting even better results than what I obtained with the SD 1.5-based workflow. So these were the results for the first reference image.
Let me show you the second reference image. So this is the second reference image, and based on it you can see the new sorting. You can give any reference image, and the images will be sorted based on that reference. For example, this is the reference image and this is the SDXL LoRA generated image. I think it can still be improved further, and when we have SDXL refiner training I think we will get much better results, but it is already super similar and super high quality; the clothing is very high quality, and with an even better dataset I think we can get better results. I will also explore higher resolution training, not just 1024 by 1024: I will use an even higher resolution and test the results. So here is the third reference image. You see I am looking in another direction in this image, and now when I look at the sorted results, you will notice that most of the most similar images are looking in the same direction. So with this approach you can find images looking in a similar direction; it is very convenient to use. Moreover, you can put multiple reference images into a single folder and the script will sort based on the average similarity; that is also possible. So watch this tutorial to learn more and use these scripts if you wish. If you become my Patreon supporter, I would appreciate that very much, because my YouTube revenue is very low and your Patreon support is very important to me. You may be wondering about the styling capability of the model. In this Patreon post, I shared how to get amazing prompts for Stable Diffusion with ChatGPT. Let's download the prompts and open them. There are some prompts here; you can test all of them, generate images, and then compare. Let me show you the results. Okay, here are the results of those random prompts, totally generated by ChatGPT. There are some very, very interesting and different prompts. There was one prompt I liked very much; let me find it. For example, you see they are very, very interesting. Okay, let's work on this prompt because I liked it, and I will improve the face and show you how you can get much better quality. Since this is a distance shot, the face quality is not very good, but we can make it better. To reuse it, I will use PNG Info in the Automatic1111 web UI: drag and drop the image here, then send it to the text-to-image tab. Don't forget to change your LoRA version accordingly, so let's pick our best LoRA, which is the seventh checkpoint. Okay. Then there is a certain seed; however, this was generated with another LoRA, so we probably won't get the same result. Let's first try and see what we get; I want to show you the logic of how to fix and improve the image. Okay, it is somewhat similar, but not the same. Okay, I could use the same LoRA to get the same image, or alternatively let me generate some images and find a decent one so you will see what these settings are capable of. I will right click and generate forever until I get a decent one. Oh, by the way, don't forget to make the seed random, so I will cancel, make the seed random, and generate again. Let's skip this one; right click and generate forever. Okay, the images are being generated. For example this one: it looks decent. Or this one; we can improve this one significantly, and it also looks very cool. Okay, this one also looks really good.
Yeah, except it has some missing parts. Okay, here is another image which can be improved. This one is also pretty cool. Okay, here is another one. You see, SDXL is extremely flexible. It generates this realism, then this styling with the same prompt, or this realism. This is also a realistic image. It is capable of outputting very different concepts, very different art styles, with the same prompt. Okay, you see this is also like a dragon merged with a horse. Okay, let's try to improve this one. I won't wait too long; this is just to show you the logic. So let's cancel generate forever, then go to PNG Info. Delete the older one, drag and drop the image, and send it to the text-to-image tab. Now we need to test high-resolution fix until we get errors. What do I mean by that? For example, let's upscale by 50 percent, set the denoising strength to 50 percent, and hit generate. When you start getting the repeating issue, that is the point where you need to stop. Until you get that repeating issue, you keep trying a bigger resolution with upscale by percentage; a small Python sketch of this sweep follows below. Okay, with a 50 percent upscale there is no noticeable repeating problem, so let's try 60 percent. I will increase it to 60. Generate. Okay, the image has been generated. Let's look at it. Can we still say it is fine? We are starting to see some deforming here, and here it either doesn't exist or is not very noticeable. So let's use the 1.5 upscale, which is the 50 percent increment. What are we going to do next? Let's go to PNG Info, delete, load the image, and send it to inpainting. Okay, the same settings are loaded. What we need to fix first is the hand, because it is not looking very good. So I will just mask it like this. Select "Only masked" from here, don't change anything else, and try it like this: a hand of a male. Okay, and let's generate. By the way, it will only inpaint the masked area: inpaint masked, original, only masked padding 32 pixels. You can also increase this and try different values. You can also increase the denoising strength to see the effect. All right, let's look at the results. Click this icon; it will open the image-to-image output folder, and when we zoom in, this is the image. It is not looking very good. Also, the tone is not matching very well. So what can we do to fix this? First of all, set the seed to -1, so every time we will get a different result. We can zoom in and try to mask it more carefully. When you hover your mouse here it will show you the options, you see. While I am holding my Alt key, I am zooming in. While I am holding Ctrl, I can make the masking brush smaller, so let's mask it in a bit more detail. Okay, like this. Maybe like this. All right. Let's also fix the error in our prompt. While holding F you can move the canvas. Okay, I think we need to be on the canvas section. Yeah, now it is working. All right. We have 50 percent denoise and 32 pixels of padding. Let's try several more times until we get a decent result. During the generation, it will show you the zoomed-in generated area like this. Okay, after several tries, I got a very decent image. As you are seeing right now, there is still some toning mismatch, but when you look from a distance it won't be very visible. This is the logic; this can be perfected. So I changed the prompt to: ohwx man holding bike arm. You see, I described it more carefully. Then we need to send this image to inpaint again so the image here gets updated. Also, click this to update it. Then we need to use the original prompt we used to generate this image to fix the face.
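If you prefer to drive that upscale test from a script instead of clicking in the UI, here is a rough sketch that calls the Automatic1111 API. It assumes the web UI was launched with the --api flag on the default 127.0.0.1:7860 address; the prompt, seed, and upscaler name are placeholders, and you still judge the repeating artifacts by eye.

```python
# Rough sketch of the "raise the upscale factor until limbs/faces start
# repeating" test via the Automatic1111 API (web UI started with --api).
# Prompt, seed, and upscaler name below are placeholders.
import base64

import requests

URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"
payload = {
    "prompt": "photo of ohwx man riding a bike",  # placeholder prompt
    "seed": 1234,            # fixed seed so only the upscale factor changes
    "steps": 30,
    "cfg_scale": 7,
    "width": 1024,
    "height": 1024,
    "enable_hr": True,       # high-resolution fix
    "hr_upscaler": "Latent",
    "denoising_strength": 0.5,
}

# Render the same image at 1.5x, 1.6x and 1.7x, then inspect each by eye and
# keep the largest scale that does not show duplicated limbs or faces.
for hr_scale in (1.5, 1.6, 1.7):
    payload["hr_scale"] = hr_scale
    image_b64 = requests.post(URL, json=payload, timeout=600).json()["images"][0]
    with open(f"hires_test_{hr_scale:.1f}.png", "wb") as f:
        f.write(base64.b64decode(image_b64))
```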
So the original prompt was this. Now let's zoom in and mask the face; I will make the masking brush bigger. Let's mask it like this. There is also an extension called ADetailer that does automatic masking and automatically inpaints faces. Hopefully I will make a tutorial about that as well, but not in this one. All right, let's try again. Okay, even on the first try we got a decent image. Now all you need to do is try several times until you get a decent image, until you are satisfied with the result. You see, even on the first result I got a decent image. This is how you can improve distant-shot generated images of yourself. Previously we had to use the sd-webui-additional-networks extension developed by Kohya, but you don't need it anymore. Now the Automatic1111 web UI supports all of the LoRA types automatically, which is great. I added the link to that Patreon post, where I explained how to get amazing prompts for testing, here. So let's say you want to use the same training command that I used. It is shared here. Copy it, then paste it into any text editor. Then you need to change your folder paths. Make sure that these folder paths are properly prepared in your case, otherwise it won't work. After you have done this, all you need to do is copy it, activate the Kohya virtual environment, and execute it as I showed at the beginning of the video. I also have some other prompts here; you can play with them. You see, in some cases you need to reduce the weight of the rare token. Let's try this one, for example. Let's copy it. There are no negatives for the GTA 5 prompt. Okay, let's paste it. Let's disable high-resolution fix, make the seed random, delete the negatives, and generate 12 images. Okay, here we got the results. Not all of them are very good, like this one. This is decent. This is decent, as you are seeing right now. This one is not related. I like this one. Okay, this one is also decent. This is also a very decent one. Let's increase CFG to 9 and try again. Okay, here we got the CFG 9 results. As we are seeing right now, they are decent, but not very good. Okay, this one is decent. It follows the face strictly but is not very stylized. Maybe we can reduce the weight of the LoRA so it will follow the style more, like this. Let's also remove the weighting here and try again. The styling capability totally depends on your training dataset, the checkpoint you use, and the weight of the LoRA. If it doesn't get stylized enough, try reducing the LoRA weight and especially try to improve the variety of your training dataset. As I said, use different backgrounds, different clothing, and very high quality images in your training dataset. Okay, here are the results when we give 90 percent weight to the LoRA. This one, this one, this one, this one is decent, this one, this one, this one, this one, this one is not very related, this one, this one, and this one. So you see, you can reduce it even further and try again; a sketch of this weight sweep follows below. These are not cherry-picked. You know, with Stable Diffusion it is a numbers game: you need to generate a lot of images to get the perfect image you are looking for. That is why my similarity script is very important and very useful. And we got the results for 80 percent. Okay, let's look at them. Yeah, it is becoming more and more GTA 5 style, so you need to find a sweet spot between your likeness and the styling. And here are the results. It generated one black-and-white image, I don't know why, but the results are decent.
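Here is a similar hedged sketch for that LoRA weight sweep (1.0, 0.9, 0.8), again through the Automatic1111 API with the --api flag; the LoRA file name and prompt are placeholders, not the exact ones used in the video.

```python
# Sketch of sweeping the LoRA weight to find the sweet spot between likeness
# and styling, via the Automatic1111 API (web UI started with --api).
# The LoRA name and prompt are placeholders.
import base64

import requests

URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"
BASE_PROMPT = "gta 5 style artwork of ohwx man"   # placeholder styling prompt

for weight in (1.0, 0.9, 0.8):
    payload = {
        # Automatic1111 applies the LoRA through the <lora:name:weight> tag.
        "prompt": f"{BASE_PROMPT} <lora:my_sdxl_lora:{weight}>",
        "seed": -1,          # random seed for each batch
        "batch_size": 4,
        "steps": 30,
        "cfg_scale": 9,
        "width": 1024,
        "height": 1024,
    }
    images = requests.post(URL, json=payload, timeout=600).json()["images"]
    for i, img in enumerate(images):
        with open(f"lora_weight_{weight:.1f}_{i}.png", "wb") as f:
            f.write(base64.b64decode(img))
```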
This one is not very much like me, but you see it becomes more like GTA 5, especially maybe this one. Thank you for watching. I hope you have enjoyed it. Please subscribe, join, and support me; it is really important. Leave a comment, ask me anything you wish, like the video, share the video. All of these things will help me significantly. We have an amazing Discord channel. Click this link and it will open this page; join our server. We have over 4000 members and we are growing each day. We have some experts and we share knowledge with each other. Follow me on Twitter; click this link to open my profile. I have a PhD. This is my Twitter. You can also follow me on LinkedIn. Click this link; this is my LinkedIn profile. You can follow me and connect with me here. I also started a CivitAI profile. As I said, everything you need will be in this GitHub file. The link to this GitHub file will be in the description and in a comment on the video. Read this GitHub file very carefully while you are watching this video or after watching it. This file will be kept up to date; I will update it if necessary. Currently, when installing Kohya on Windows, there is nothing else you need to do, just install it as I showed in the video. If something changes, I will most likely update these two sections. These extra arguments are important. They are for Adafactor; if you use other optimizers, they will probably not work. This is the JSON config file that I used. This is the training command file that I used. There are some prompts here; you can use them. Also, this is my pip freeze information. Hopefully I will see you in another video. The raw recording of this video took more than 2 hours, so you can imagine how much time I have spent. I hope you consider supporting me on Patreon. Thank you so much.
Info
Channel: SECourses
Views: 14,344
Keywords: automatic1111, automatic 1111, sd web ui, sd, web ui, sdxl, stable diffusion, stable diffusion x-large, stable diffusion xlarge, chatgpt, stability ai, stabilityai, lora, kohya, kohya ss, kohya gui, training, sd training, stable diffusion training, sdxl training, sdxl lora training, lora training, dreambooth, dreambooth training, sdxl lora, x/y/z, checkpoint comparison, inpainting, inpaint, fixing faces, fixing hands, ai, generative ai, ai art, avatars, profile avatars, photography
Id: sBFGitIvD2A
Length: 85min 3sec (5103 seconds)
Published: Thu Aug 10 2023