LATENT Tricks - Amazing ways to use ComfyUI

Captions
Fascinating ways to use ComfyUI! Hello my friends, how are you doing? Today I want to show you several ideas for what you can do with node-based UIs like ComfyUI. In this video I show you how to install it and set everything up. I've created a zip file with an image from each of these projects, so when you drag one into your ComfyUI you will get the complete network that I built with these nodes, and you can use it yourself to experiment.

Today we're going to go through different ideas, and a lot of them use the concept of latent images. A latent image is not an image yet; it is the kind of information that the AI understands before it is decoded into an actual pixel image. Here you can see an example where I render an initial image and then change the ethnicity by handing the latent image down to the next rendering process.

So let's look at how this process works. Over here, right at the start, we load our model, Deliberate v2 in this case, and I set up an initial positive prompt and negative prompt. Up here we have an empty latent image; an empty latent image is just the starting noise that we are going to use. This goes down into our first sampler. You can see that we are using a seed (I have enabled a random seed every time), 24 steps, CFG scale 7, the sampler is Euler Ancestral, and we use the normal scheduler instead of Karras, simple, or DDIM. The denoise level is 1. The latent output of the sampler goes up into the VAE Decode node, which also uses Deliberate v2 as the VAE, and there the latent is decoded into a pixel image. This gives us the first image, and as you can see from the prompt, we want a 25-year-old Swedish woman in casual clothing.

Now this latent output goes on into the next sampler. This next sampler uses the same negative input but a new positive input: now we have a 25-year-old Asian woman. I want to point out that this method is of course limited, because we have already fully rendered the latent image here, so we can't turn a night image into a day image. But changing the ethnicity is a rather small change in the image, so it works rather well. Again we output the latent image, send it into the decoder, and it turns into an image.

Then we hand this off again: the latent image goes on into the next sampler, and we repeat the process with a new positive prompt, now a dark black African woman. I really intensified the darkness in the prompt, because otherwise it doesn't stick and we wouldn't actually get a black woman. Then we hand the latent image over once more. In the last sampler we have an Indian woman, again in casual clothing, going through the same process, so here we get an Indian-looking woman. She even has the dot in the middle of her forehead, which is a very nice little detail. Keep in mind that every time I click here, I generate a set of four pictures where the background is similar, the posing is similar, and the clothing is similar, but the ethnicity is different, and that makes this very interesting.
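To make the wiring easier to follow, here is a minimal Python sketch of that latent hand-off. Every helper name in it (load_checkpoint, encode_prompt, empty_latent, ksampler, vae_decode) is a hypothetical stand-in for the ComfyUI node of the same purpose, not ComfyUI's actual Python API, and the negative prompt text is just illustrative:

    # Minimal sketch of the latent hand-off; all helpers are hypothetical
    # stand-ins for the matching ComfyUI nodes, not a real API.
    model, clip, vae = load_checkpoint("deliberate_v2.safetensors")

    negative = encode_prompt(clip, "deformed, blurry, low quality")  # illustrative
    prompts = [
        "25 year old swedish woman in casual clothing",
        "25 year old asian woman in casual clothing",
        "dark black african woman in casual clothing",
        "indian woman in casual clothing",
    ]

    latent = empty_latent(width=512, height=768)  # pure starting noise
    images = []
    for text in prompts:
        positive = encode_prompt(clip, text)
        # Each pass re-samples the PREVIOUS latent with a new positive
        # prompt, so background, pose and clothing stay similar while a
        # small change like ethnicity takes hold.
        latent = ksampler(model, positive, negative, latent,
                          seed=None, steps=24, cfg=7.0,
                          sampler_name="euler_ancestral", scheduler="normal",
                          denoise=1.0)
        images.append(vae_decode(vae, latent))

The key point is simply that latent is threaded through the loop instead of fresh noise being used each time.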
Now let's go to my second example. This is kind of similar, but the method we use is different, because here I'm injecting a new style into the original prompt. It uses the same idea: we have our model, our positive and negative prompt, and this goes into the sampler; down here we have our latent image, so far so good. This goes into the VAE Decode node and creates our anime image as the first style. But I want the same pose and similar clothing in different kinds of styles: the first is anime, the second is a photo, the third is a 3D rendering of the character.

You can see here that this is part of a positive prompt: "raw photography, 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3". It is merged with a conditioning combine: down here we have our original prompt, "masterpiece, best quality, woman, anime, pirate, cute clothing", and we combine that original prompt with the new one, so both of them together make the positive prompt. On top of that, we use the latent output of the first sampler as the latent input: instead of starting from plain noise, we reuse the latent image that was already rendered, and then VAE-decode the result. That's why we get the same pose and the same image, but now as a photo. The same process happens down here: I inject "3D model" into the original prompt, combine both, send it into the sampler, render it, and we have our 3D model. You can see that works really well.
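As a rough sketch, reusing the hypothetical helpers from above plus a made-up combine_conditioning standing in for the Conditioning (Combine) node, the photo pass looks something like this:

    # The anime pass produced anime_latent; now inject a new style on top
    # of the unchanged base prompt.
    base = encode_prompt(clip, "masterpiece, best quality, woman, "
                               "anime, pirate, cute clothing")
    photo = encode_prompt(clip, "raw photography, 8k uhd, dslr, soft lighting, "
                                "high quality, film grain, Fujifilm XT3")

    # Both conditionings steer the sampler together, side by side, rather
    # than being concatenated into one text prompt.
    positive = combine_conditioning(base, photo)

    # Start from the already-rendered anime latent instead of fresh noise,
    # so pose, framing and clothing carry over into the new style.
    photo_latent = ksampler(model, positive, negative, anime_latent,
                            steps=24, cfg=7.0, denoise=1.0)
    photo_image = vae_decode(vae, photo_latent)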
Now here comes the third idea. This might be a little confusing because there are all these similar images; you don't actually need them all, they're just there to show you what is going on. What I'm doing here is a double upscale. We have our checkpoint, our positive and negative prompt going into the sampler, and then into the decoder. This is our original image, with a resolution of 512 by 768. As a baseline, I'm upscaling it with a normal upscaler, Real-ESRGAN 4x, like you would in Automatic1111. So this is the original image at the small resolution, and this is the upscale. Of course the upscale doesn't have any new detail in it; it's just bigger. It's nicer, it's not pixelated, but there is no new fine detail; the beard, for example, looks very blurry. That's not ideal.

Instead, what I'm doing here is taking the latent output of our 512 by 768 image and sending it to a latent upscale, which doubles it in size to 1024 by 1536. This goes into another sampler, because the latent upscaler is meant to be used between two samplers. Here we render again with the same steps, the same prompt and negative prompt, the same model, everything the same; only the latent input is now a latent upscale of our original image. The output comes out here and goes into a decoder, and this is now the higher resolution. Here again is the small image, and next to it the higher resolution, and you can see it has more detail now. Because, as I said, a latent image is not a pixel image, the AI can still add details to the image when we do a latent upscale. That's really interesting; the beard, for example, has more detail here.

Going back here, I now take the image output (this time the pixel image, no longer a latent) and send it into the upscaler. I'm using the Real-ESRGAN 2x model, because we have already upscaled two times in latent space. So first we have a 2x latent upscale, then a 2x pixel upscale, which leaves me with the same size as the 4x upscale we did before. But when you look at that image, you can see it has a lot more detail and looks a lot better. Of course you can do all of this in one go without the in-between steps; they are just there to show you the stages and how they differ. Here is a comparison: on the left side the 4x upscale with just the Real-ESRGAN method, and on the right side the 2x latent upscale followed by the 2x pixel upscale. You can see how much better the image quality is.
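Sketched with the same hypothetical helpers (latent_upscale and pixel_upscale are stand-ins for the Upscale Latent node and an ESRGAN-style pixel upscaler), the two-stage chain looks like this:

    # Base render at 512x768.
    latent = ksampler(model, positive, negative,
                      empty_latent(width=512, height=768),
                      steps=24, cfg=7.0, denoise=1.0)

    # Stage 1: 2x LATENT upscale, then sample again so the model can add
    # genuinely new detail; a latent upscale belongs between two samplers.
    latent = latent_upscale(latent, width=1024, height=1536)
    latent = ksampler(model, positive, negative, latent,
                      steps=24, cfg=7.0, denoise=1.0)  # same settings as before
    image = vae_decode(vae, latent)

    # Stage 2: 2x PIXEL upscale. 2x latent followed by 2x pixel ends at the
    # same size as a single 4x pixel upscale, but with far more detail.
    final = pixel_upscale(image, scale=2)  # e.g. a Real-ESRGAN 2x model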
Now for my next experiment we're going to get a little more complex. In this case I wanted to create different characters on the same background, in the same run, of course. Here I only created two different characters, but you can create as many different characters as you want on the background. I have my model again, and the positive and negative prompt for the scene, so first I render the background completely; you can see it here as a finished image. Now I output this as an image, this time a pixel image and not a latent, down to a VAE Encode (for Inpainting) node. For that we again use the Deliberate v2 VAE, and we use a mask.

I created a mask at the same resolution: I take a PNG with a transparent background and delete the parts of the image that should be masked, which creates an alpha layer and gives me more control over the output quality and the pose. I have also used a pose that we read with OpenPose from ControlNet. Let me show you how that looks in Affinity Photo. Here we have the pose; I placed the character a little lower so that we see the background better. Over this I put the mask, and I deleted the area where I expect the body to be. I used a very soft brush, which means the mask can fade into the surroundings and makes it easier for the AI to blend our new character into the background. Then I turn the background off, leaving our OpenPose puppet on the checkerboard, and export this as a PNG.

Back in ComfyUI we have the mask up here, with the deleted middle, and down here the OpenPose image you have just seen. Now comes the tricky part: I use the VAE Encode (for Inpainting) node to turn my already rendered pixel background back into a latent image. The reason I need to do that is that I want to combine the background with the mask. So here we have the pixel input, good, and we have the mask input I showed you before. This is turned back into a latent image with our Deliberate v2 VAE, which I can then use as the latent input of my sampler.

Of course we also need the ControlNet image. The way we do this: we have a new positive prompt for our "cyberpunk man wearing a shirt and long leather pants, 8k uhd, soft light, high quality, in a cyberpunk alley at night". This goes into the conditioning input of the Apply ControlNet node. We load the OpenPose ControlNet, which goes into the control_net input, and we load our pose image into the image input over here. We also have a new negative prompt that fits the image of the man we want. All of that goes into the KSampler: the conditioning from the ControlNet, consisting of our new positive prompt plus the ControlNet, goes into the positive input; we have the negative prompt; and, as I said, the background, which has been converted into a latent image together with the mask, goes into the latent image input. And of course we load our model. The latent output is then decoded into a pixel image again. So here we have our guy: on the left side we have our alley with these two lights, and on the right side the exact same image, but now with a man in that pose. The exact same process happens again with an additional set of nodes for the whole flow I just showed you, and that of course creates the image of a woman in the same pose. You can see that for all three images the background here is exactly the same.
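Here is the same flow as a sketch; vae_encode_for_inpaint, load_controlnet and apply_controlnet are again hypothetical stand-ins for the corresponding nodes, and the scene prompt is illustrative:

    # Render the background scene once.
    scene_pos = encode_prompt(clip, "cyberpunk alley at night")  # illustrative
    bg_latent = ksampler(model, scene_pos, negative,
                         empty_latent(width=512, height=768), steps=24, cfg=7.0)
    background = vae_decode(vae, bg_latent)

    # Soft-brushed alpha mask (feathered edge) and the exported pose image.
    mask = load_image_alpha("body_mask.png")
    pose = load_image("openpose_puppet.png")

    # Encode the finished background back into latent space; the mask marks
    # the only region the sampler may repaint.
    inpaint_latent = vae_encode_for_inpaint(vae, background, mask)

    # Pin the pose via an OpenPose ControlNet on top of the character prompt.
    char_pos = encode_prompt(clip, "cyberpunk man wearing a shirt and long "
                                   "leather pants, 8k uhd, soft light, high "
                                   "quality, in a cyberpunk alley at night")
    char_pos = apply_controlnet(char_pos, load_controlnet("openpose.safetensors"),
                                pose, strength=1.0)

    man = vae_decode(vae, ksampler(model, char_pos, negative, inpaint_latent,
                                   steps=24, cfg=7.0))
    # Repeat with a different character prompt for the woman: same background,
    # new character inside the masked area.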
If your head hasn't exploded yet, we're going one step deeper. This method is a little wonky, and the background comes out very similar but not exactly the same; on the other hand you get an actual AI image that is just one image, so there's a downside and an upside. You can also find a really good example of this on the ComfyUI examples page. It uses the same method in a different way: there, the same background is combined across the whole image with different characters in the foreground, while I am combining different characters with the same background across different images. Again this works through latent images, and it also stops the rendering partway through the steps, which is a really interesting thing to understand; that's why I want to show you this example.

So what I'm doing here: again we load our model down here, then up here we have an empty latent image, and further up another empty latent image. We have our positive prompt and negative prompt; in this case it's just a cyberpunk alley at night. Then we have our first sampler. This is the advanced sampler, and when you look closely you can see it has a start and an end step: the complete run is 25 steps, but we start at step 0 and end at step 10, with "return with leftover noise" enabled and "add noise" enabled. What this does: when this is rendered, we get this image, and it is already the alley. Look closely: there is some yellow noise on the side, some violet noise in the middle, and a little greenish noise on top, and when we look at our final image, you can see these are the same qualities. That means this noise already carries the fundamental elements of our final image, because it has already been rendered for 10 steps.

Now we do the same thing for our guy, but not for the complete image. That is why over here we have a second empty latent image: it's smaller, not 512 by 768 but only 192 by 576. It goes into a second sampler, again running steps 0 to 10 with otherwise the same settings, with another negative prompt and a positive prompt for our "centered cyberpunk man wearing a shirt and long leather pants, 8k uhd, soft light, high quality, in a cyberpunk alley at night". Of course his image only covers the body, so the two have to be composed together. That's what the Latent Composite node is for, with its "to" and "from" inputs: "to" is the background, the thing you want to render onto, and "from" is the image you want to apply to it, where you're coming from. Because our character latent is smaller in resolution, we have to push it a little to the side, 128 pixels to the right, and because it's shorter we push it 184 pixels down, and we use a feathering of 48, which means a soft edge.

The latent composite combines these two parts, but both of them have only been rendered for 10 steps so far; if I decode at this point, what I get is this half-finished image. Of course we need to finish the process, so we take the combined latent noise and feed it as the latent input of a third sampler. For that we again need a positive prompt, and this one describes the complete situation: "cyberpunk man wearing a shirt and long leather pants, 8k uhd, soft lighting, high quality, in a cyberpunk alley at night"; this time everything is in one prompt. Now we do not add noise, so that is disabled; we render from the 10th step until the last of the 25 steps; I have reduced the CFG scale a little; and "return with leftover noise" is disabled, so no extra noise is left in. We end up with this image, where both of the latent images we combined before are now rendered to the finish from the noise mix, basically, that we created.
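Here is a sketch of the whole three-sampler flow; ksampler_advanced is a hypothetical stand-in for the KSampler (Advanced) node and latent_composite for Latent Composite:

    STEPS = 25
    bg_pos  = encode_prompt(clip, "cyberpunk alley at night")
    guy_pos = encode_prompt(clip, "centered cyberpunk man wearing a shirt and "
                                  "long leather pants, 8k uhd, soft light, "
                                  "high quality, in a cyberpunk alley at night")

    # Stage 1: render background and character SEPARATELY, but only steps
    # 0-10 of 25, keeping the leftover noise in both latents.
    bg = ksampler_advanced(model, bg_pos, negative,
                           empty_latent(width=512, height=768),
                           steps=STEPS, start_at_step=0, end_at_step=10,
                           add_noise=True, return_with_leftover_noise=True)
    guy = ksampler_advanced(model, guy_pos, negative,
                            empty_latent(width=192, height=576),
                            steps=STEPS, start_at_step=0, end_at_step=10,
                            add_noise=True, return_with_leftover_noise=True)

    # Paste the half-rendered character onto the half-rendered background:
    # 128 px to the right, 184 px down, with a 48 px feathered (soft) edge.
    mixed = latent_composite(samples_to=bg, samples_from=guy,
                             x=128, y=184, feather=48)

    # Stage 2: finish steps 10-25 over the combined latent. No fresh noise
    # is added, so the sampler resolves the existing noise mix into one
    # coherent picture under a prompt describing the complete scene.
    full_pos = encode_prompt(clip, "cyberpunk man wearing a shirt and long "
                                   "leather pants, 8k uhd, soft lighting, "
                                   "high quality, in a cyberpunk alley at night")
    final = vae_decode(vae, ksampler_advanced(model, full_pos, negative, mixed,
                                              steps=STEPS, start_at_step=10,
                                              end_at_step=STEPS, add_noise=False,
                                              return_with_leftover_noise=False))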
Over here we have the woman, and you can see she's a little off center; the character moves rather freely inside the box you give it. It's not as exact as with the mask or the OpenPose method I showed you in the other process. Also, even though the layout of the background image is very similar, the details are still distinctly different. This is very interesting for experimenting with images where you use in-between steps to add things to the noise; that's why this example is so interesting. You also want to check out the example where different characters are rendered into the same background.

As I told you, you can simply take one of the images from my zip file, drag it into the background of ComfyUI, and it will load my complete build with everything; of course you still have to load your own model. I also want to invite you to my Discord, where I have a dedicated ComfyUI channel; people are very active there, figuring out all kinds of cool methods to experiment with ComfyUI. One more thing: the developer of ComfyUI invites you to code your own nodes and get in contact with him, so this project can grow. It is open source, and the more people work on it, the more amazing it can get. Thank you very much for watching, leave a like if you enjoyed this video, and see you soon, bye. Oh, you're still here? So, uh, this is the end screen. There's other stuff you can watch, like this, or that, that's really cool, and yeah, I hope I see you soon. Leave a like if you haven't yet, and, well, um, yeah.
Info
Channel: Olivio Sarikas
Views: 82,550
Keywords: oliviosarikas, olivio sarikas, olivio affinity, olivio tutorials, stable diffusion, stable diffusion tutorial, ai art, comfy ui, stable diffusion ai art, stable diffusion ai, controlnet stable diffusion, stable diffusion installation, comfyui guide, stable diffusion gui, comfyui explained, local comfyui, automatic1111 comfy ui, comfyui basics explained, comfyui tutorial, nodes comfyui, comfyui nodes, nodes, artificial intelligence, better than midjourney, adobe alternative
Id: OdMtJMzjNLg
Length: 21min 31sec (1291 seconds)
Published: Mon Mar 20 2023