Today, I'm going to show you how to use ControlNet
for Stable Diffusion to create amazing images. Let's begin. Go to your Extensions tab and click
on "Load from:". Search for ControlNet and install it. Head over to this Hugging Face link
and download the models that you need. For now, I recommend these. Move them
to the ControlNet folder inside "models". Restart your UI after you're done. Go to
your web UI, and you will see ControlNet. The first thing we're going to do is click
here and upload your image. Now let's look at all the options here.
Click on "Enable" and "Allow Preview." Let's start off with our first model, Canny.
Canny detects the edges of objects in an image. By clicking on the preview icon, you can
see what the Canny preprocessor does. Also, enable the "Pixel Perfect" option, as it
automatically adjusts your ControlNet resolution. I'm using a realistic model to turn
this game character into a realistic person. The generated image you get is
a mix of ControlNet and your prompts.
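If you want to reproduce this step outside the web UI, here is a minimal sketch using OpenCV and the diffusers library. The file names, prompt, and model IDs are assumptions for illustration, not the exact checkpoints used in the video.

    import cv2
    import numpy as np
    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # 1) Preprocess: detect edges in the reference image with Canny.
    image = cv2.imread("reference.png")                # hypothetical file name
    edges = cv2.Canny(image, 100, 200)                 # low/high thresholds; tune per image
    control_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

    # 2) Generate: condition Stable Diffusion on the edge map with a Canny ControlNet.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")

    result = pipe("a realistic photo of a person", image=control_image).images[0]
    result.save("canny_result.png")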
Now let's try out a different control type. OpenPose is one of the more useful models.
It detects the key points on a human body, letting you recreate human poses
in your images. If you look at the preprocessors,
you'll have five different options. Let's start off with OpenPose. It tries to detect
all the major points in our input image. Now let's try to generate an image based on my
reference image. I'm going to paste my prompts and generate. If I also wanted to recreate
the hand position from my reference image, I would use OpenPose Hands. It is
a combination of hand detection and OpenPose. You can easily copy the exact hand and
finger placement shown in your reference image. OpenPose Face includes the face and
the body, whereas OpenPose Face Only only includes the face. These do
well with portraits and close-ups. The final one, OpenPose Full, combines
all the ones before it - the face, the body, and the hands. As you can see,
OpenPose also works with multiple people.
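As a rough code equivalent, the same annotators can be called with the controlnet_aux package. This is a sketch assuming a recent controlnet_aux release; older versions expose a single hand_and_face flag instead of separate include_hand and include_face arguments, and the file names here are placeholders.

    from PIL import Image
    from controlnet_aux import OpenposeDetector

    # Download the OpenPose annotator weights from the Hugging Face Hub.
    detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")

    reference = Image.open("pose_reference.png")   # hypothetical file name

    # Plain "openpose": body keypoints only.
    pose = detector(reference)

    # Roughly "openpose_full": body, hands, and face keypoints together.
    pose_full = detector(reference, include_hand=True, include_face=True)
    pose_full.save("openpose_control.png")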
Next, we have the Depth preprocessor. It is good for positioning things, but it loses
the finer details in the process. Depth has four different
preprocessors right now, with minor differences. The lighter shades represent close-up objects,
whereas the darker shades are farther away.
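If you want to generate a depth map in code rather than in the UI, here is a small sketch using a generic monocular depth estimator from the transformers library; the default model it downloads is an assumption and will not match the UI preprocessors exactly.

    from PIL import Image
    from transformers import pipeline

    # Generic monocular depth estimation; ControlNet Depth just needs a grayscale map
    # where lighter pixels are closer to the camera and darker pixels are farther away.
    depth_estimator = pipeline("depth-estimation")

    reference = Image.open("reference.png")        # hypothetical file name
    depth_map = depth_estimator(reference)["depth"]
    depth_map.save("depth_control.png")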
Normal generates a normal map based on our input image. Normal maps are images used
to make flat surfaces appear textured. Overall, Normal_Bae is
better for both the foreground and the background. MLSD is a straight-line detector, useful for
recreating buildings and interiors. All the curves and finer details are ignored. Line Art is
a really useful tool for coloring your digital or hand-drawn artwork. There are multiple preprocessors for
this with minor differences. For this generation, I'm using the Lineart Realistic preprocessor. Here
is the generated image, and here are two more. Next, let's talk about Soft Edge. It is an edge
detection algorithm like Canny, but overall, the edges are softer. It still helps you
keep the fine details and get a generated image close to the reference. It is best to use when
you don't want the outlines to be too strict. The Scribble preprocessor turns your images
into rough scribbles. HED and PiDiNet give you coarse outlines, whereas XDoG gives you fine
outlines. Segmentation assigns a unique color to every identified object in an image. You can look
up the color codes on the spreadsheet. Changing or adding colors within an editing app allows
you to introduce new objects into the image. In this new image, I'll replace the window
with a painting and add a new person.
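As a rough sketch of that editing step, you can also recolor part of the segmentation map in code instead of an image editor. The file names, the pixel region, and the RGB value below are placeholders; look up the real class colors in the ADE20K spreadsheet mentioned above.

    import numpy as np
    from PIL import Image

    # Load the segmentation map exported from the ControlNet preview.
    seg = np.array(Image.open("seg_control.png").convert("RGB"))   # hypothetical file name

    # Paint a block of pixels with the color of the class you want to introduce.
    # NOTE: (150, 5, 61) is a placeholder; use the real ADE20K color code for
    # "person" (or any other class) from the reference spreadsheet.
    new_class_color = (150, 5, 61)
    seg[300:500, 200:300] = new_class_color

    Image.fromarray(seg).save("seg_edited.png")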
Shuffle distorts and shuffles the colors of an input image. The shuffled image provides
a starting point to generate our image. It is really
useful for copying color schemes and themes. Here are more images with a different theme.
IP2P, or Instruct Pix2Pix, changes your images based on your instructions. There's
no preprocessor for it. You only need to
download the model. I don't really find it useful because the generated
image quality is not that good. Next, we have Reference. You'll
notice that it does not need a model, only the preprocessor. The Style
Fidelity slider doesn't do much, so I would leave it at the default.
Now let's talk about the preprocessors. Reference_only is good at creating minor
variations of your reference image. I found reference_only to be the best because it
follows your prompts as well as your reference. The results of reference_adain+attn are
quite similar to reference_only. You can try out both to see which is better suited for
your image. Reference_adain ignores the colors and the details while keeping the overall
composition. It is also heavily influenced by the prompt. Finally, we have Multi-ControlNet. It allows
you to use multiple ControlNets at once. To enable this, head over to Settings
> ControlNet > Multi-ControlNet. I'm going to select 2 and restart my UI. You
can now access multiple ControlNet tabs. For my first ControlNet, I'm using OpenPose. If
you already have a preprocessed image like I do here, you have to set the preprocessor
to "none." For my second ControlNet, I'm using MLSD, and I have a
preprocessed image once again. We're going to change one more option, which is
Control Weight. It determines how much ControlNet affects the generated image. I'm going to set this
to 0.5 so it doesn't affect my subject too much. As you can see, the generated images
are a mix of both ControlNets.
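For completeness, the same two-ControlNet setup can be sketched with the diffusers library, where the per-unit Control Weight corresponds to controlnet_conditioning_scale. The model IDs, file names, and prompt below are assumptions for illustration.

    import torch
    from PIL import Image
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Two ControlNets: OpenPose for the subject, MLSD for the room layout.
    openpose_cn = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
    )
    mlsd_cn = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-mlsd", torch_dtype=torch.float16
    )

    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=[openpose_cn, mlsd_cn],
        torch_dtype=torch.float16,
    ).to("cuda")

    # Already-preprocessed control images, matching "preprocessor: none" in the UI.
    pose_image = Image.open("openpose_control.png")
    lines_image = Image.open("mlsd_control.png")

    result = pipe(
        "a person standing in a modern living room",
        image=[pose_image, lines_image],
        # Control Weight per ControlNet: keep the pose strong, soften the line guidance.
        controlnet_conditioning_scale=[1.0, 0.5],
    ).images[0]
    result.save("multi_controlnet_result.png")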
Subscribe if you want to see more
videos about Stable Diffusion.