At some point while using Stable Diffusion and Fooocus, you'll probably find yourself wanting to get a specific pose, a specific pose mixed with a particular design, or even to swap faces. If you want to figure out how to get more consistent characters with Fooocus and Stable Diffusion, the features I'm going to cover in this video will be very useful for that. In this video, we're diving into the Image Prompt feature of Fooocus. If you've used other tools for Stable Diffusion, or Midjourney, this chart gives you an idea of the differences between Fooocus and the others: with Fooocus, you can mix image and text prompts reliably, and it performs better in most instances than other interfaces for Stable Diffusion. I can't confirm every part of that from my own testing yet, but overall I've found it very reliable. Now, this video assumes you already have Fooocus installed and know the basic usage; if not, I do have other videos that cover that. I'm going to be doing this with the regular run.bat file and the default settings and styles unless I say otherwise.

So, at this point, Fooocus is up and running.
I've turned on the Advanced tab, switched over to Quality, and set the 1024x1024 aspect ratio, just to keep things consistent with most of the images I'll be generating here. Now, to use the image prompt feature, we want to check the box down below here that says 'Image Prompt'. I'm going to show you the basic image prompt first, then we'll move on to the advanced options; in practice, you'll probably end up using the advanced options most of the time.

For this, I'm going to use a specific image - you can also drag in images from the gallery up above. I'm going to use this woman in a red dress, dancing in a field. The idea of the image prompt is that it influences the generated image. The result won't look exactly the same, but it should carry over things like the color she's wearing, the dress, and a similar-looking woman in a field - those sorts of things will carry over into most of the images you generate. Okay, as you can see, we have two images generated. In both of them, the red dress and a similar-looking woman are present; even the dress itself is very similar. The poses are slightly different and the background is different, but it's still the same idea for the picture, so you can see how that would influence the generation.
Now, I'm going to demonstrate that this isn't always going to be the case, and this is where the advanced features come in. I'm going to drop in this image of a TARDIS from Doctor Who and go ahead and generate. Again, I'm not putting any text in the prompt - you can, and I'll cover that afterwards - but for now I just want to show how this doesn't influence the image the way you might expect. Okay, as you can see, it did not influence this image very much. It did influence the color a little, but I can't really point to anything else. I'm not going to dive into this much, because the advanced features make the basic mode matter a lot less, but I am going to show you one more example. You can mix and match multiple pictures to try to influence the result, but you're going to get unreliable results - let's just put it that way.

For this one, I'm actually going to add a text prompt, but I've also dropped the picture of a dog down here just to show what this will do. What it should do is generate a picture of a cat with similar colors to this dog. As you can see, one of these was more influenced than the other, but you can still see the sun in the background carried over into both images. It won't always influence everything, and that's the reason the basic image prompt isn't necessarily that useful. For myself, at least, I don't ever really use it, because with the advanced controls, which I'm going to show you next, there's really not much reason to. But I did want to cover it.
Okay, at this point, we're going to move ahead to the advanced features, because I honestly think that's what you're going to be using - you'll find you really have no interest in the basic mode. So, we're going to go down to the bottom and check off 'Advanced'. Now you'll see more options come up: two sliders and the mode options down here, which I'll cover. The first mode is Image Prompt, which we were already using. Then you have the 'Stop at' and 'Weight' sliders.

Think of 'Stop at' as being all about timing: it determines when your image prompt's influence stops during the generation process. When it's set to 0.5 (50%), which is what it's at here, the image prompt's influence ends at the halfway point and Stable Diffusion takes over and generates from there. So the later you stop it, the more influence it has on your image generation. Now let's talk about the weight. That setting is like a volume control for your image prompt: the higher the weight, the stronger the impact your image prompt has on the generated image. It's like turning up the dial on how much your prompt shapes the style, composition, or other aspects of the final image. Usually, I start with the defaults and then play around with them to get what I want. A lot of this is trial and error, but always start with the defaults and go from there.
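If it helps to picture those two sliders as code, here's a very rough sketch of how I think about them. This is not Fooocus's actual code - the function and the names are made up purely for illustration:

def guidance_strength(step: int, total_steps: int, stop_at: float, weight: float) -> float:
    # How strongly the reference image influences this denoising step (conceptual only).
    progress = step / total_steps
    if progress <= stop_at:   # still inside the influence window
        return weight         # e.g. 1.0 pushes at the default strength; higher pushes harder
    return 0.0                # past 'Stop at': Stable Diffusion generates on its own

With a 'Stop at' of 0.5 and a weight of 1.0, for example, the reference guides roughly the first half of the steps; raising 'Stop at' toward 1.0 keeps it in control almost to the end, and raising the weight makes that control stronger while it lasts.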
So here, I'm going to do another image prompt, but I'm going to change a couple of things: we're going to increase the weight, and I'm also going to increase the 'Stop at' to a higher amount. Let's see what happens compared to these original two images. I'm going to put 'cat' in the prompt as well, just like we did before. Okay, the image generation is done. As you can see, this time the reference had much more influence on the image - it almost had a hard time creating a cat at all, though it did switch over at the end and manage to force that. So keep that in mind: if we had stopped the influence at an even higher amount, it would probably have had a much harder time creating the cat. At that point, the weight pulls more toward the colors and everything else in the reference image, and the result sticks much closer to that picture. That's how 'Stop at' and the weight work, and they work the same way for pretty much all of these modes.
Now, the next two I'm going to cover are PyraCanny and CPDS. I'll cover them somewhat together because they do similar things: they're used for bringing over the structure of an image. Maybe you want a certain pose, or you want a house to look a certain way. You can bring in an image and, unlike the plain image prompt we used before - a perfect example was the woman dancing, where the image was influenced but the pose itself wasn't - these will carry the structure over. This is where PyraCanny and CPDS come in.

PyraCanny breaks the image down to its outlines - think of it like a coloring book. If you look at this image on the left, you have the original photo of the house, and on the right, you have the outlines; that's similar to what PyraCanny extracts. How much detail is brought over is determined by the weight and the 'Stop at'. Raise or lower the weight and you get more or less detail, and the same idea applies to 'Stop at': it controls how long the original image influences the generation. At 0.5, once it gets halfway through, it stops using the original image and generates on its own from there.
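If you want to see roughly what that outline pass looks like, here's a small sketch using plain Canny edge detection in OpenCV. This is only an approximation of the idea, not Fooocus's actual PyraCanny code, and the file name and thresholds are made up:

import cv2

house = cv2.imread("house.jpg")                 # the reference photo
gray = cv2.cvtColor(house, cv2.COLOR_BGR2GRAY)  # edges are detected on a grayscale copy
edges = cv2.Canny(gray, 100, 200)               # white outlines on black, like a coloring book page
cv2.imwrite("house_edges.png", edges)           # an edge map like this is what steers the structure

Something along these lines is what ends up guiding the composition: the generation is pushed to follow those lines for as long as the 'Stop at' window lasts.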
Now, CPDS works for a similar purpose, but the difference is that instead of extracting outlines, it decolorizes the image. As you can see on this page, these are the original images and these over here are the results. CPDS can be better for certain images, whereas PyraCanny can be better for others; they both have their advantages and disadvantages, and a lot of it comes down to trial and error. PyraCanny is great for capturing and reproducing finer details and clean lines from an image - for instance, if you have a picture of a bunny and you want to keep its exact proportions in a new style, PyraCanny would be the tool to use. CPDS is more suitable when you want to transfer a complex scene or pose from one image to another, maintaining the general shape and depth but not necessarily the fine details. So experiment with the two: usually, if I don't get the results I want from one, I'll try the other to see if that method works better for what I'm trying to accomplish.
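In the same spirit, here's a rough stand-in for what CPDS-style decolorization keeps and throws away. OpenCV's contrast-preserving decolorization is a reasonable approximation, but again this is not Fooocus's actual code, and 'dancer.jpg' is just a placeholder name:

import cv2

img = cv2.imread("dancer.jpg")              # the reference photo
gray, color_boost = cv2.decolor(img)        # contrast-preserving grayscale plus a color-boosted version
cv2.imwrite("dancer_structure.png", gray)   # shapes and depth cues survive; fine color detail does not

That grayscale, shape-and-depth view is the kind of information CPDS works from, which is why it tends to preserve poses and scene layout rather than crisp line detail.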
Okay, so for these examples, I'm going to try a few different things just to give you a basic idea of how you can mix and match all of this. One note: from my understanding, you don't really want to combine PyraCanny and CPDS with each other, and I usually don't - use just one of those two across your image prompt slots. But alongside it you can have a regular image prompt, or a face swap, which we'll cover next. Those you can mix and match.
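Just to make that concrete, here's a hypothetical way to picture it - this is purely my own bookkeeping sketch, not Fooocus's internals: each image prompt slot is a reference image plus a mode and its own 'Stop at' and weight, and the slots all apply together on top of your text prompt and styles.

from dataclasses import dataclass

@dataclass
class ImagePromptSlot:
    image_path: str    # the reference image dropped into the slot
    mode: str          # "ImagePrompt", "PyraCanny", "CPDS", or "FaceSwap"
    stop_at: float     # 0.0-1.0, when this slot's influence ends
    weight: float      # how hard this slot pushes while it's active

# Illustrative values only - one structure mode plus a style reference and a face:
slots = [
    ImagePromptSlot("style_ref.jpg", "ImagePrompt", stop_at=0.5, weight=1.0),
    ImagePromptSlot("pose_ref.jpg", "PyraCanny", stop_at=0.5, weight=1.0),
    ImagePromptSlot("face_ref.jpg", "FaceSwap", stop_at=0.9, weight=0.75),
]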
So for this one, let's go with this house and PyraCanny, and I'm not going to touch anything else. I could generate with just that, but that's not really what you're going to be doing - you're going to be mixing it with text prompts and other image prompts, and that's what I'm going to show you here. Let's say I want a house that looks similar to this one, but in a lush forest. So we're going to prompt for a house in a lush forest and generate a couple of images. Actually, I'm going to switch over to Speed just to make this a little quicker for myself. For this one, as I said, we have PyraCanny set to the default settings: a weight of one and the 'Stop at' at 0.5, or halfway. I'll leave those as they are for now, show you the result you get, and then show you how adjusting those settings changes the results.
Now, just like with Stable Diffusion itself, you're always going to get varying results. We can see the first one actually came out with a pretty good forest around it; the other one, not so much. We can change how that's impacted. If you look at them, the house itself looks pretty similar to the original - the same basic structure - although, as you can see, this one added a chimney. So just like regular Stable Diffusion, it's going to add things; keep that in mind. I'm also keeping the prompt very simple, so that's another factor.

Now let's go in here and I'll show you a difference: we're going to keep the same weight but set the 'Stop at' to 0.2, just to give you a general idea of how that changes the picture. You can see the point where it stops using the reference and generates on its own from there, and the result is definitely not as close. It keeps the house in the middle, things like that, but otherwise the influence is much lower. Watching the preview, it starts off and you think, 'Oh wow, that actually looks like the house,' but almost in an instant, right there, it changes and you lose that influence.
So, the further along you let the influence run, the more the result is going to look like the original picture. The trade-off is that if you're trying to make something radically different, it's going to have a harder time doing that. Now let's crank the 'Stop at' way up - almost all the way to one - and generate again to see the difference. Notice pretty quickly that it keeps even the clouds; these clouds line up with the ones here, so it stays very close to the source picture until the very end, and then may add some variation. Usually it's only adding minor details toward the end, so nothing is going to change drastically in the image when you set the 'Stop at' that high. Another thing to keep in mind is that all of this can be mixed with styles as well. As we can see with this one, it actually did a pretty decent job even with the 'Stop at' set very high - it still kept that bush and the small tree back here. So you get the idea: play with these and get a feel for what they do.

Let's stop this one, go back to a 'Stop at' of 0.5, turn the weight down, and generate again so I can show you the difference. As you can see in this image, the influence is much weaker - the house is different, and the scene comes out much lusher for us. So you can play with these settings to dial things in once you know what you want.
Okay, so now we're going to use this image of a woman dancing in the field. I picked this one because it isn't a standard pose - it would be very hard to get this pose consistently with different clothing styles or different people, and it would be very hard to describe in a text prompt, so this is where this sort of thing comes in handy. We're going to use CPDS for this one; like I said, try different things, because they all behave differently. Another thing to keep in mind is that the fewer extra details there are in the picture, the better: if we mostly want the pose, we'd ideally use a picture without any trees in the background or anything else, if we can get one. Also, say you don't want the person wearing a dress - that's the reason I didn't use the dress image, because the outline of the dress would carry over into the result. You're better off using a reference that keeps a general body shape, as I'll demonstrate here.
Now, let's prompt for a warrior dancing in the forest and see what that does. Like I said, we've only enabled CPDS - the one that uses decolorization - with the 'Stop at' at 0.5 (50%) and the weight set to one, which are the default settings. As we can see while it's generating, we have that same pose. It will vary slightly, but I find that in most cases, if you have a good source image, the pose comes through almost every time, pretty closely; the hands might be slightly different, things like that. That's also why you want a source picture that preferably doesn't have much in the background. In these two results, as you can see, the pose is pretty exact to the original, and it's even in pretty much the same spot in the frame.

So now, let's say we have the pose, but we also want a certain style. We could describe it in the text prompt, but it might be easier to just drop in another image. So in this one, I'm going to drop this image in here and use it as a regular Image Prompt. Remember, that one influences the look; the other one we're leaving for the pose.
Now that we have those set, I'm going to remove the text prompt, because I want to show what the image prompts do on their own. As I said, the one over here drives the structure of the image, and the one over here drives the style, the color, and all of that. I am going to increase the weight on the style image a little, just because I don't have any text prompt to lean on, and then I'll hit generate. Okay, as you can see, we get a style similar to the image on the right, with the pose and structure from the image on the left, which is the one using CPDS. We could potentially use PyraCanny for this instead; as I said, they both do similar things, just in different ways. One thing to notice is that I don't have anything in the text prompt, so these two images are the only things influencing the result. But let's say I do want to add something: I'm going to type 'a warrior with a sword.' I'm not specifying 'dancing' or anything like that - the structure should carry over on its own, so I shouldn't have to worry about adding it. As we can see in these images, it's added a sword, kept the same pose, and taken the style from the image on the right. I could generate fifty of these - I tested this earlier and honestly generated a whole bunch; these are just some of the images - and as you can see, the results are pretty consistent. If you're not getting the right results, consider that your source image may not be good enough or may have too much background clutter. Even using an image with a watermark on it can produce some weird effects. Those are the things to keep in mind when you're doing this.
Okay, for the last thing, I'm going to show the face swap option down here in the image prompt. You can mix this with everything I've already shown you - text prompts, styles, the other modes - however you want, but here I just want to demonstrate how it works on its own. I'm going to start with this image, and I'll simply put in 'a woman with black hair' without getting any more detailed than that. The idea is that the generated face should look pretty close to the reference. We haven't cranked the weight up, and the 'Stop at' is already set fairly high for this mode, so it retains more of the structure of the face. I always start with the defaults and then increase them if I find it necessary to get what I'm looking for, because the defaults are usually set at what was found to be the best settings. Looking at these results, I'd say they're actually pretty close. They're not always going to be exact, and like I said, you can increase the weight. I'm also going to swap in a different face here without changing the prompt at all, and as you can see, that definitely influenced the result. It's obviously not always going to match exactly, and that's where you can use multiple references: if you have two, three, or four different face angles, you can add another one with a different angle, and that will improve the results you get.
Okay, I think that pretty much covers everything in the image prompts - the basics and the advanced options, though not the most advanced use cases. There's a lot more we could get into with mixing and matching, and I'll probably cover some of it in later videos on consistent characters and things like that. The biggest thing with any of these is to experiment, and if you don't get good results, keep your prompts simple and try different images. Low-quality images, I find, don't always work well; if you're trying to use a pose from a low-quality image with a lot of pixelation, that can really hurt the results you get. I hope you learned something from this. My next video will probably be on inpainting and outpainting - hopefully within the next four to six days, since it's going to take a little bit longer to produce, like this one did. After that, I hope to do some even more advanced tutorials on consistent characters and things like that. So thanks for watching, have a good day, and see you next time.