How to use ControlNet with SDXL. Including perfect hands and compositions.

Video Statistics and Information

Captions
Why do you need ControlNet for SDXL? Have you ever generated images and experienced that one terrible defect that spoils your entire image? Yes, I'm talking about deformed hands. It happens in SDXL, SD 1.5 and even the new Stable Diffusion 3, but I have a solution for you: ControlNet, now also available for SDXL. ControlNet can do more than generate perfect hands or repair them, so stay tuned for the other features I'll show you in the second half and at the end of the video. Hi, I'm Patrick. With this channel I want to explain technology and AI clearly and correctly. But before we begin: this video is sponsored by FaceMod. With FaceMod you can do AI face swaps online, easily and fast, so you need neither a powerful GPU nor additional software, and it works with both images and videos. Just upload your image or video, choose a face to swap from your own media or from preset templates, and save the result. For easy access you can upload the images you want to use for swapping; I'm using a photo of myself and some images I've generated with SDXL. You want more muscles? Upload the muscle-man image and, after the analysis, swap the face. Now I'm strong. Select a portrait style from lots of templates and choose a face: now I'm wearing a suit. Wonka from the anime styles lets me look very creative. For swapping faces in videos there are numerous templates: choose a template and a face and swap. The result even has glasses. Of course, you can upload your own videos, including HD, and swap the face. Does this guy really look younger than me? You can swap up to six faces at the same time: assign a new face to each face in the video and press swap. Simple, isn't it? If you want to try FaceMod yourself, the link is in the description.

Now let's move on to installing ControlNet. We use Automatic1111; the process is the same for Windows, Linux and macOS, and it's not very difficult. First we install or upgrade ControlNet to the latest version; this is required to make sure that your ControlNet extension supports SDXL. We install Mikubill's sd-webui-controlnet, all links are in the description below. Copy the URL of the extension's git repository. In the web UI we select the Extensions tab and Install from URL, paste the URL we have just copied, and press Install if, like me, you haven't installed ControlNet before. If the ControlNet extension is already installed, press Check for updates instead. You should see a version and date at least as recent as the one in this video. Press Apply and restart UI. Congratulations, your ControlNet now supports SDXL.

Well, except for one important point we have to consider: you need special ControlNet models for SDXL, because the ControlNet models for Stable Diffusion 1.5 are not compatible with SDXL. There's no official repository for SDXL ControlNet models, so we use the collection from lllyasviel; under Files and versions we find everything we need for this tutorial. I'll show you how to use the most important models: OpenPose, Canny and Depth. Feel free to try other models yourself and download as many as you like, or as many as fit on your drive. I'll download Diffusers XL Canny and Diffusers XL Depth in the full version; choose the mid or small variants if you don't have that much disk space. For OpenPose I download T2I-Adapter Diffusers XL OpenPose, T2I-Adapter XL OpenPose and Thibaud XL OpenPose. Usually I start with Thibaud XL OpenPose and switch to the others if the results do not fit or if I need faster generation.
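If you prefer scripting the downloads instead of clicking through the browser, here is a minimal sketch using the huggingface_hub package. The repository id and the exact filenames are assumptions based on lllyasviel's collection page; verify them in the Files and versions tab, especially if you pick the mid or small variants.

```python
# Minimal sketch: fetch the SDXL ControlNet models from the Hugging Face Hub.
# The repo id and filenames are assumptions -- check the "Files and versions"
# tab of lllyasviel's collection before running.
from pathlib import Path
from huggingface_hub import hf_hub_download

REPO_ID = "lllyasviel/sd_control_collection"               # assumed repo id
TARGET = Path("stable-diffusion-webui/models/ControlNet")  # adjust to your install

FILES = [                                                  # assumed filenames
    "diffusers_xl_canny_full.safetensors",
    "diffusers_xl_depth_full.safetensors",
    "t2i-adapter_diffusers_xl_openpose.safetensors",
    "thibaud_xl_openpose.safetensors",
]

TARGET.mkdir(parents=True, exist_ok=True)
for name in FILES:
    # local_dir drops the file directly where the web UI looks for ControlNet models
    path = hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=TARGET)
    print("downloaded", path)
```

If the downloads land in this folder, the next step of the video, moving the models, is already done.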
Move all downloaded models into your stable-diffusion-webui directory under models/ControlNet. To test our installation and generate suitable images we need an SDXL checkpoint. I've chosen DreamShaper XL alpha 2, because the turbo version is not available for commercial usage; you can still use the turbo version privately, or choose any other SDXL checkpoint you like to try, even the SDXL base model is OK. Move the checkpoint into your stable-diffusion-webui directory under models/Stable-diffusion, and make sure to select the appropriate SDXL checkpoint: an SD 1.5 checkpoint will not work with the SDXL ControlNet models. The description of DreamShaper XL alpha 2 recommends DPM++ 2S a, so we select this sampling method. Sampling steps are 30, width and height are 1024 as we use SDXL, and for CFG we use the suggested 8. We type in a very simple prompt and generate. The result is well suited for showing you how to use ControlNet: although one hand looks quite OK, the other one is deformed and missing a finger. For sure you want to know how to repair this hand, and you won't be disappointed.

We send the image to the img2img tab and do our first exercise with ControlNet. There are three independent ControlNet units; we enable unit 0, and Pixel Perfect is usually a good choice. As you remember, we downloaded ControlNet models for Canny, Depth and OpenPose, and we start with OpenPose. There are different preprocessors, even one for animals, and we need a suitable model. We downloaded three; when speed doesn't matter I usually choose Thibaud XL OpenPose. Remember that Stable Diffusion checkpoints and ControlNet models have to match, so combine SDXL with SDXL and SD 1.5 with SD 1.5, otherwise you will run into trouble. With the control weight you choose how much the ControlNet unit influences the image generation; it's a good idea to try different settings, but avoid very high or very low values. For the control mode, Balanced is a good choice in most cases. Although the preprocessor openpose_full does a good job, we choose the improved version, DW OpenPose full. The denoising strength is quite important, as we want to override the deformed hand: we start with 0.7, but if your background becomes inconsistent you might want to lower the value, for example to 0.5. We don't want to regenerate the whole image, only the hand, so we send it to Inpaint and inpaint the deformed hand. Very important: I had to learn the hard way that it is not enough to inpaint only the deformed part. A perfect hand with all fingers will be wider than the current hand, so you have to grant it more space and inpaint a larger area of the image; otherwise the result will most probably be deformed in a similar way. Now it's time to generate. The hand is wider and has all five fingers, and we have a template for our next exercises with ControlNet and for generating images with perfect hands.
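If you drive the web UI from scripts, the same inpainting step can be expressed as a call to the Automatic1111 API (start the web UI with --api). This is a hedged sketch of the settings used above: the endpoint and the alwayson_scripts mechanism come from the ControlNet extension, but field and module names can differ between extension versions, and the model name must match the entry shown in your ControlNet dropdown.

```python
# Hedged sketch: the inpainting fix above, driven through the Automatic1111 API.
# Field names inside the ControlNet unit (e.g. "image" vs. "input_image") can
# differ between extension versions -- treat this as an illustration, not a reference.
import base64
import requests

def b64(path: str) -> str:
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

URL = "http://127.0.0.1:7860"  # default local web UI address

payload = {
    "prompt": "the same simple prompt as before",  # placeholder
    "init_images": [b64("image_with_bad_hand.png")],
    "mask": b64("hand_mask.png"),                  # white over the enlarged hand area
    "denoising_strength": 0.7,                     # lower to ~0.5 if the background drifts
    "sampler_name": "DPM++ 2S a",
    "steps": 30,
    "cfg_scale": 8,
    "width": 1024,
    "height": 1024,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{                             # ControlNet unit 0
                "image": b64("image_with_bad_hand.png"),
                "module": "dw_openpose_full",      # assumed module name
                "model": "thibaud_xl_openpose",    # must match your dropdown entry
                "weight": 1.0,
                "pixel_perfect": True,
                "control_mode": "Balanced",
            }]
        }
    },
}

response = requests.post(f"{URL}/sdapi/v1/img2img", json=payload, timeout=600)
response.raise_for_status()
print("received", len(response.json()["images"]), "base64-encoded image(s)")
```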
Do you want to know how to directly generate good hands without the need to fix them? Then the next chapters are for you. And if you don't want to miss my next videos about such topics, press the subscribe button. You have already seen how to use ControlNet in the img2img tab; we can use it in the txt2img tab too. Before I show you how to keep image compositions in newly generated images, we'll have a closer look at OpenPose. We drop our last generated image into ControlNet unit 0, select DW OpenPose full and press the small button for generating a preview; Allow Preview has to be enabled. You see a schematic representation of the pose. By choosing Edit you can even modify some of the points according to your needs and press Send pose to ControlNet in order to use it for your next image generation. The model is Thibaud XL OpenPose again, since we use an SDXL checkpoint. You can increase the resolution to 1024, but that's not a must; using the preview again, we see the schematic representation is more or less the same. Although there are several preprocessors you might want to try, as a rule of thumb choose DW OpenPose full for people and Animal OpenPose for animals. Okay, I nearly made a common mistake: don't forget to enable the ControlNet unit or units you want to use for your image generation. Let's generate. Style, background, hair and clothing all differ, but the pose is still the same, and both hands are correct. So if you have a template with good hands for the pose you want to generate, this is the way to go. Maybe I should mention that I sometimes take pictures of myself in the required pose in order to have such a template, just to give you another option.

Before we have a look at my preferred ControlNet model, we select Canny. The preprocessor is canny; we don't want to invert it. Make sure to choose the right Canny model. After running the preprocessor, you notice that the edges are detected; this can be used to reuse an image composition. The edge detection is influenced by the low and high thresholds. Important: contrary to what is sometimes stated, it is not the edges between the low and high thresholds that are kept. Rather, gradients below the low threshold are discarded, gradients above the high threshold are always kept, and values in between survive only if they connect to a strong edge. So setting both thresholds to low values will result in a lot of edges being detected, and setting both thresholds to high values will result in only the clearest edges being detected. Keep this in mind when planning the composition of your image. We choose the default settings and generate. The style of the image is very different, but the composition is the same and the hands are correct. You can influence the image by changing the prompt: yes, there is a lot of green now. If you want the result to be a bit more independent from the ControlNet, lower the weight, for example to 0.5. This result reminds me of an artistic butterfly woman.
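To see the effect of the two thresholds outside the web UI, here is a small OpenCV sketch using the classic Canny edge detector that the preprocessor is based on; the filenames are placeholders.

```python
# Small sketch of the Canny thresholds: low thresholds keep many edges,
# high thresholds keep only the strongest ones. "template.png" is a placeholder.
import cv2

img = cv2.imread("template.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Gradients below the low threshold are discarded, gradients above the high
# threshold are always kept, and values in between survive only if they are
# connected to a strong edge (hysteresis).
busy = cv2.Canny(gray, 50, 100)     # both thresholds low  -> many edges
clean = cv2.Canny(gray, 200, 250)   # both thresholds high -> only the clearest edges

cv2.imwrite("canny_busy.png", busy)
cv2.imwrite("canny_clean.png", clean)
```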
We continue with my preferred ControlNet model and choose Depth. Only if Depth doesn't work for me do I select Canny and try to capture the required edges by tuning the thresholds. Depth has several preprocessors with different levels of detail; Zoe and the hand refiner do not work on my machine. We start with MiDaS. It has the lowest level of detail, but sometimes this is exactly what we want. In a depth map, the brighter a pixel the closer it is, and the darker the farther away; it's a very good way of reusing compositions. LeReS++ has the most detail, LeReS sits between the ++ version and MiDaS, and Depth Anything sits between MiDaS and LeReS. Remember, more detail is not always better. (A small scripted sketch of producing such a depth map follows at the end of this section.) Let's use the original prompt: the style is different, the composition is the same, and the hands are good. Now we make an extreme change and try to generate a man in armor. I assume we have to lower the control weight, but first let's try with 1.0. I'm not sure it is really a man, and the silver arms are somewhat creative; nevertheless the fingers do not look good. So we lower the control weight to 0.5, and because we need more detail for the hands we choose LeReS++. The hands are much better now, and it might be a man, with long hair. Anyhow, the influence of our depth map, with its long hair, is obviously very, very high.

By changing the prompt you can influence both the foreground and the background. We see the volcano, but the foreground is very dark. Why not choose another SDXL model and check the result? We select the SDXL base model and generate with the same settings. Although we didn't use the refiner, the result looks better than the one with the custom model. SDXL requires a lot of performance and VRAM; you can use ControlNet in the same way with SD 1.5 checkpoints and SD 1.5 ControlNet models. For higher resolutions, you can simply upscale your results; if you want to know how this works, watch my video about upscaling.
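For readers who want to inspect such depth maps outside the web UI, here is a hedged sketch using the controlnet_aux annotator package, which bundles the same MiDaS-style estimators the Depth preprocessors are based on. The package, class and checkpoint names are assumptions to verify against the controlnet_aux documentation.

```python
# Hedged sketch: compute a MiDaS depth map for a template image with the
# controlnet_aux annotators. Class and checkpoint names are assumptions.
from PIL import Image
from controlnet_aux import MidasDetector

image = Image.open("template.png")

# Downloads the annotator weights on first use (assumed repo name).
midas = MidasDetector.from_pretrained("lllyasviel/Annotators")

# Brighter pixels are closer to the camera, darker pixels are farther away.
depth = midas(image)
depth.save("depth_midas.png")
```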
Info
Channel: Next Tech and AI
Views: 1,614
Keywords: stable diffusion, stable diffusion tutorial, stable diffusion ai, stable diffusion xl, nexttechandai, automatic1111 stable diffusion, ai art, controlnet stable diffusion, stable diffusion controlnet, ControlNet SDXL, SDXL controlnet, SDXL canny, SDXL openpose, SDXL depth, controlnet depth, controlnet canny, controlnet openpose, perfect hands, sdxl perfect hands, sdxl fingers, controlnet hands, controlnet fingers, automatic1111 controlnet, sdxl hands, sdxl fix hands
Id: NqTBV_vR-iM
Length: 15min 44sec (944 seconds)
Published: Mon May 13 2024