Stable Diffusion - Poses and FaceSwap - Fooocus - Image Prompts

Captions
At some point, when using Stable Diffusion and Fooocus, you'll probably find yourself wanting to get a specific pose, or a specific pose mixed with a particular design, or even to swap faces. If you want to figure out how to get more consistent characters with Fooocus and Stable Diffusion, the features that I'm going to be covering in this video will be very useful for that. In this video, we're going to be diving into the image prompt feature of Fooocus.

If you've used other tools for Stable Diffusion, or Midjourney, then this chart gives you an idea of the differences between Fooocus and the others. According to the chart, Fooocus lets you mix image and text prompts reliably and performs better in most instances than other interfaces for Stable Diffusion. I can't confirm that from my own testing yet, but I've found it very reliable overall.

Now, this video assumes you already have Fooocus installed and you know the basic usage. If not, I do have other videos that cover that. I'm going to be doing this using the regular run.bat file and the default settings and styles unless I say otherwise.

So, at this point, Fooocus is up and running. I have turned on the advanced tab, and I've changed over to the quality setting and the 1024x1024 aspect ratio, just to make it more consistent with most of the images that I'll be generating here. Now, to use the image prompt feature, we're going to want to check off the box down below here that says 'Image Prompt'.

I'm going to show you the basic image prompt first, then we'll move on to the advanced stuff. Ultimately, you're probably going to use the advanced options for the most part. Now, in this, I'm going to use a specific image. You can drag the images from up above for these as well. I'm going to use this woman in a red dress, dancing in a field. The idea of the image prompt is that it's going to influence the generated image.
It won't look exactly the same, but it should carry over things like the color she's wearing, the dress, a similar-looking woman in the field. Those sorts of things are going to pretty much carry over to most of the images you generate.

Okay, as you can see, we have two images generated. In both of them, the red dress and a similar-looking woman are present; even the dress itself is very similar. The poses are slightly different, and the background is different, but it's still the same idea for the picture. So, you can see how that would influence the generation.

Now, I'm going to demonstrate that this isn't always going to be the case, and this is where the advanced features will come in. I'm going to drop in this image of a TARDIS from Doctor Who and go ahead and generate. Again, I'm not putting any text in the prompt. You can, and I'll cover that afterwards, but just for this purpose, I'm going to demonstrate how this doesn't influence the image as much as you might expect.

Okay, as you can see in this image, it did not influence it too much. It did influence the color a little, and I don't know if anything else. Now, I'm not going to dive into this a lot, because once you have the advanced features, the basic mode doesn't really matter. But I am going to show you one more.

Now, you can mix and match multiple pictures to try to influence the result, but you're going to get unreliable results. Let's just put it that way. I'll show you one more example. Now, for this one, I'm actually going to add a text prompt, but I've dropped the picture of the dog down here just to show what this will do. What it should do is generate a picture of a cat with similar colors to this dog.

As you can see, one of these was more influenced than the other, but still, you can see the sun in the background carried over into both this image and this one.
It won't always influence everything, so that's the reason why just the basic image prompt isn't necessarily as useful. For myself, at least, I don't ever really use it, because with the advanced controls, which I'm going to show you next, there's really not much reason to. But I did want to cover it.

Okay, at this point, we're going to move ahead to the advanced features, because I honestly think that's what you're going to be using. You're going to find that you really have no interest in the basic mode. So, we're going to go down here to the bottom and check off 'Advanced'. Now, you will see more options come up. You're going to have these two sliders and these options down here, which I will cover.

Now, the first option is the image prompt, which we were already using. Then you have the 'Stop at' and 'Weight' sliders. Think of 'Stop at' as being all about timing. It determines when the influence of your image prompt will stop during the generation process. When generation gets to 0.5 (50%), which is what it's set at here, the prompt's influence ends, and Stable Diffusion takes over and generates from there. So, the later you stop it, the more influence it's going to have on your image generation.

Now let's talk about the weight. This setting is like a volume control for your image prompt. A higher weight means that your prompt will have a stronger impact on the generated image. It's like turning up the dial on how much your prompt shapes the style, composition, or other aspects of your final image. Usually, I start with the defaults and then play around with them to get what I want. A lot of this is going to be trial and error, but always start with the defaults and go from there.

So, on here, I'm actually going to do another image prompt, but I'm going to change a couple of things.
We're going to increase the weight, and I'm also going to increase the 'Stop at' to a higher amount. Let's see what happens when we do that compared to these original two images. And I'm going to put 'cat' in here as well, just like we did before.

Okay, so the image generation is done. As you can see in this one, the image prompt had much more influence on the image, to the point where it almost had a hard time creating a cat at all. It did switch over at the end and was able to force that. So keep that in mind: if we had set 'Stop at' much higher, it would probably have had a much harder time creating that cat. With the higher weight, the colors, the background, and everything else from the source image carried more influence, and it stuck with a much more similar picture. That's how 'Stop at' and the weight work, and they work the same for pretty much all of these.

Now, the next two I'm going to cover are PyraCanny and CPDS. I'll cover them somewhat together because they do similar things. These are used for bringing over the structure of an image. Maybe you want a certain pose, or you want a house to look a certain way. With the image prompt we used before, and the woman dancing was a perfect example, the source would influence the image, but the pose itself wouldn't carry over. This is where PyraCanny and CPDS come in.

PyraCanny breaks the image down to its outlines. Think of it more like a coloring book. If you look at this image, on the left you have the original image of the house, and on the right you have the outlines. That's similar to what PyraCanny does. How much detail is brought over is determined by the weight and the 'Stop at'; both have an influence on that. So, if you increase or lower the weight, you're going to get more or less detail. 'Stop at' works the same way as before: it sets how long the original image keeps influencing the generation.
At 0.5, once it gets halfway through, it stops using that original image and generates from there.

Now, CPDS serves a similar purpose, but the difference is that, instead of extracting outlines, it decolorizes the image. As you can see on this page, these are the original images, and the resulting images are over here. This can be better for certain images, whereas PyraCanny can be better for others. They both have their advantages and disadvantages, and a lot of it will come down to trial and error on which one works better for which. PyraCanny is great for capturing and reproducing finer details and clear lines from an image. For instance, if you have a picture of a bunny and you want to keep the exact proportions in a new style, PyraCanny would be the tool to use. CPDS would be more suitable when you want to transfer complex scenes or poses from one image to another, maintaining the general shape and depth but not necessarily the fine details.

So, experiment with the two. Usually, if I don't get the results I want from one, I'll try the other one to see if that method works better for what I'm trying to accomplish.

Okay, so for these examples, I'm going to try a few different things, just to give you a basic idea of what you can do by mixing and matching all of these. You can mix and match them, although you don't really want to combine PyraCanny and CPDS, from my understanding, and I don't usually. You just want one of those in one of these image prompt slots, but then you could have an image prompt, PyraCanny, and a face swap, which we'll cover next. You can mix and match these.

So for this one, let's go with this house and PyraCanny, and I'm not going to touch anything else. I could generate just with that, but that's not really what you're going to be doing.
You're going to be mixing it with text prompts and other prompts. So, that's what I'm going to show you here.

Let's say we have a house. I want a house that looks similar to this, but let's say I want it in a lush forest. So, we're going to type 'a house in a lush forest' and go ahead and generate a couple of images. Actually, I want to change this over to speed, just to make this a little bit quicker for myself. We're going to generate some images.

For this one, as I said, we have PyraCanny. It's set to the default settings, which are a weight of one and a 'Stop at' of 0.5, or halfway.

I'll leave these running for right now. I'm going to show you what you get for a result, and then I'll show you how adjusting the settings will impact the results that you get from this.

So now, just like Stable Diffusion itself, you're always going to have varying results. We see the first one actually came in with a pretty good forest around it; the other one, not so much. We can change how that might be impacted. Now, if you look at it, the house actually looks pretty similar to the original, with the same basic structure. This one added a chimney here, though. So just like regular Stable Diffusion, it's going to add things, so keep that in mind. I'm also keeping it a very simple prompt, so that would be another factor.

So let's go in here, and I'm going to show you the difference. We're going to use the same weight, but we're going to stop at 0.2, just to give you a general idea of how that can change the picture. So now we see where it stopped using the influence, and it's generating from there. As you can see, it's definitely not really that close.
I mean, it keeps the house in the middle, things like that, but otherwise, it's going to lower that influence. Because, as you can see here, it starts off, and you're like, 'Oh wow, that's actually looking like the house.' But you're going to be able to see, almost in an instant, right there, it changes, and you lose that influence from the image prompt.

So, the later you stop, the more it's going to look like the original picture. The problem is, if you're trying to make something too radically different, it's going to have a harder time creating that.

So now, let's crank that way up. We're going to crank that almost all the way to one, and we're going to generate. Now, we'll see the difference on this. Notice pretty quickly that it's going to keep even the clouds, as you can see here. These clouds also line up in here, so it's going to keep pretty close to that picture up until the very end, and then it may add some variation. But usually, towards the end is when it only adds the minor details, so you're not going to have anything drastically change in the image when you set the 'Stop at' that much higher.

Now, you can mix these with styles. That's another thing to keep in mind. These things can all be mixed with styles as well. So now, as we can see with this one, it actually did a pretty decent job, even with the 'Stop at' set very high. It even kept that bush, and that small tree is back here.

So, you get the idea. Play with these, and get an idea of what they do. Let's put the 'Stop at' back to 0.5, and I'll just show you the difference. We're going to lower the weight here, and we'll generate again.

As you can see in this image, with the weight much lower, the house is different, and the scene is much lusher. So, you can play with these settings to get a feel for them, once you know what you want.
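
To make the 'Stop at' and 'Weight' behaviour from the house example concrete, here is a minimal Python sketch. It assumes a simplified model in which the reference image applies at its full weight until the 'Stop at' fraction of the sampling steps has passed and then drops to zero. The function names and the 30-step count are hypothetical, chosen for illustration, and are not taken from the Fooocus code.

```python
def control_strength(step, total_steps, stop_at, weight):
    """Influence of the reference image at one sampling step (simplified model)."""
    progress = step / total_steps  # 0.0 at the first step, approaching 1.0 at the end
    return weight if progress < stop_at else 0.0


def guided_steps(total_steps, stop_at):
    """Count how many sampling steps still see the reference image."""
    return sum(1 for step in range(total_steps) if step / total_steps < stop_at)


if __name__ == "__main__":
    total_steps = 30  # assumed step count, for illustration only
    for stop_at in (0.2, 0.5, 0.95):
        print(stop_at, guided_steps(total_steps, stop_at))
```

Under this model, lowering 'Stop at' shortens the part of the generation that follows the reference image, while lowering 'Weight' makes every guided step follow it less closely, which matches the difference between the 0.2 run and the low-weight run above.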
Okay, so we're going to use this one here of this woman dancing in the field. I picked this one because it isn't a standard pose; that would be a very hard pose to get consistently with different clothing styles or different people. It would be very hard to describe in the prompt, so this is where this sort of thing comes in handy. We're going to put on CPDS for this one. Like I said, try different things; they're all different.

The other thing to keep in mind is that the fewer other details in the picture, the better. In this one, if we mostly wanted the pose, we would probably want a picture without any trees or anything else in the background, if we could get one. Now, the other thing is, let's say you don't want the person wearing a dress. That's the reason I didn't use the dress one, because if you use the dress one, the outline of the dress is actually going to affect the result. So, you're better off using one that keeps a general body shape, as I'll demonstrate here.

Now, let's say 'a warrior dancing in the forest'. So here we go with this one, and we're going to see what that does. Like I said, we only put CPDS on, and that's the one that uses the decolorization. We have 'Stop at' at 0.5, and we have the weight set to one, which are the default settings. Now, as we can see here, it's generating. We have that same pose. It will vary slightly, but I find in most cases, if you have a good source image and everything, the pose will be there almost every time, pretty closely. The hands might be slightly different, things like that.

So you really want a picture that preferably doesn't have anything in the background. In these two, as you can see, the pose is pretty exact to the original, and the figure is in pretty much the same spot and everything else.
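
As a rough illustration of the two structure-only views discussed so far, the sketch below builds a decolorized (grayscale) version of a tiny image and a crude outline map from it. These are simple stand-ins chosen for clarity, not the actual CPDS or PyraCanny algorithms; the function names, the luma weights, and the threshold are all assumptions.

```python
def to_grayscale(image):
    """Convert rows of (r, g, b) pixels into rows of brightness values."""
    # Common luma weights; a stand-in for real decolorization, not CPDS itself.
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in image]


def edge_map(gray, threshold):
    """Mark pixels whose brightness jumps sharply to the right or lower neighbour."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx = abs(gray[y][x + 1] - gray[y][x]) if x + 1 < width else 0.0
            dy = abs(gray[y + 1][x] - gray[y][x]) if y + 1 < height else 0.0
            edges[y][x] = 1 if max(dx, dy) > threshold else 0
    return edges


if __name__ == "__main__":
    black, white = (0, 0, 0), (255, 255, 255)
    image = [[black, black, white],
             [black, black, white]]
    gray = to_grayscale(image)
    print(gray)
    print(edge_map(gray, threshold=50))
```

The grayscale view keeps the broad light and dark regions, while the outline view keeps only the boundaries between them. That difference is one way to think about why a busy background or a dress outline in the source image can leak into the result.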
So now, let's say we have the pose, but we also want a certain style. We could describe it, but it might be easier to just drop in another image. So, in this one, I'm going to drop this image in here, and we're going to use it as the image prompt. Remember, that influences the look; this other one we're going to leave for the pose. Now that we have those set, I'm going to remove the text prompt, because I want to show what the image prompt does here. So, as I said, this one over here is going to be more the structure of the image; this one over here is going to be more the style, the color, and all that. Now, I am going to increase the weight on this one a little bit, just because I don't have any text or anything like that, and I'm going to go ahead and hit generate.

Okay, so as you can see, we have a similar style to the one on the right, and the pose and structure from the image on the left, which is using CPDS. Now, we could potentially use PyraCanny for this. As I said, they both do similar things; they just do them differently. One thing to notice here is that I don't have anything in the text prompt, so these two images are the only things influencing the result. But let's say I wanted to steer that, and I'm going to say 'a warrior with a sword.' I'm not specifying 'dancing' or anything like that; the structure should carry over, so I shouldn't even have to worry about adding that.

As we can see in these images, it's added a sword. It keeps the same pose, and the style and everything is taken from the one on the right. I could generate 50 of these. I tested this earlier, and in all honesty, I generated a whole bunch of them. These are just some of the images.
So, as you can see, the results you can get are pretty consistent. If you're not getting the right results, consider that your source image may not be good enough, or you may have too much background clutter. Even using an image with a watermark on it may give you some weird effects.

These are the things to keep in mind when you're doing this. Okay, now for the last thing I'm going to show, I'm going back to show the face swap down here in the image prompt. You can mix this with the other things I've already shown you. You can mix and match those how you want with text prompts, styles, and everything, but for this, I just want to demonstrate how it works.

For this one, I'm going to start off with this image here, and I'm just going to put in 'a woman with black hair.' I didn't get any more detailed than that. Now, the idea behind this is that the face should look pretty close. We haven't cranked the weight up, and the 'Stop at' is already pretty high for face swap. The idea is that it's going to keep more of the structure of the face. I always start off with the defaults and then increase them if I find it necessary to get what I'm looking for. Always start with the defaults, because those are usually set at what they found were the best settings.

Looking at those, I would say they're pretty close. They're not always going to be exact, and like I said, you can increase the weight. But I'm also going to show this: we're going to change the face in here, and we haven't changed the prompt at all. As you can see, this definitely influenced it. Now, like I said, it's not always going to get them exactly the same, and that's where you can use multiple images if you have two, three, or four different face angles. So you could add another one in there with a different angle, and that will improve the results that you get.
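
The mix-and-match setup described above (one image for structure, one for style, and one or more faces) can be sketched as a small data structure. This is a hypothetical model for illustration, not Fooocus's own configuration format: the class, the function, the file names, and the face swap and image prompt weight values are placeholders. Only the 0.5 and 1.0 CPDS defaults and the 0.5 image prompt 'Stop at' come from the walkthrough.

```python
from dataclasses import dataclass

# Both of these carry structure, so the walkthrough suggests using only one.
STRUCTURE_KINDS = {"PyraCanny", "CPDS"}


@dataclass
class ImagePromptSlot:
    image_path: str   # hypothetical file name
    kind: str         # "ImagePrompt", "PyraCanny", "CPDS", or "FaceSwap"
    stop_at: float    # fraction of the generation the image influences
    weight: float     # how strongly the image influences the result


def check_slots(slots):
    """Return human-readable warnings for a combination of slots."""
    warnings = []
    for slot in slots:
        if not 0.0 <= slot.stop_at <= 1.0:
            warnings.append(f"{slot.image_path}: stop_at should be between 0 and 1")
    kinds_used = {slot.kind for slot in slots}
    if STRUCTURE_KINDS <= kinds_used:
        warnings.append("PyraCanny and CPDS both carry structure; pick one")
    return warnings


if __name__ == "__main__":
    slots = [
        ImagePromptSlot("pose.png", "CPDS", stop_at=0.5, weight=1.0),
        ImagePromptSlot("style.png", "ImagePrompt", stop_at=0.5, weight=0.8),   # weight is a placeholder
        ImagePromptSlot("face_front.png", "FaceSwap", stop_at=0.9, weight=0.75),  # placeholder values
        ImagePromptSlot("face_side.png", "FaceSwap", stop_at=0.9, weight=0.75),   # placeholder values
    ]
    print(check_slots(slots))
```

Writing the combination down this way makes it easier to change one setting at a time while experimenting, which fits the advice to start from the defaults and adjust from there.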
Okay, so I think that pretty much covers everything in the image prompts, as far as the basics and the advanced features, but not the extremely advanced stuff. There's a lot more we could get into with mixing and matching, and I'll probably cover some of those things in later videos when it comes to consistent characters and things like that.

The biggest thing when it comes to any of these is to experiment, and if you don't get good results, make sure you're keeping your prompts simple. Try different images. Low-quality images, I find, don't always work well. So if you're trying to use a pose from a low-quality image and there's a lot of pixelation or anything like that, that could really negatively impact the results that you get.

I hope that you learned something from this. My next video is probably going to be on inpainting and outpainting; hopefully, I'll have that within the next four to six days. That's going to take a little bit longer to produce, like this one did. And then I do hope to have some even more advanced tutorials at a later date on, like I said, consistent characters and things like that. So thanks for watching and have a good day. See you next time.
Info
Channel: Kleebz Tech
Views: 24,401
Keywords: stable diffusion ai, face swap tutorial, fooocus tutorial, stable diffusion face swap, stable diffusion controlnet, stable diffusion image to image, stable diffusion poses, stable diffusion consistent, stable diffusion consistent character, stable diffusion consistent face, fooocus face swap, fooocus ai poses, fooocus consistent character, ai face swap, ai influencer, image prompts, stable diffusion tutorial, controlnet, stable diffusion img2img, Poses and FaceSwap
Id: 2hW0M-Y1R6k
Length: 22min 39sec (1359 seconds)
Published: Tue Jan 16 2024