ComfyUI EP02: AI Image Generation Process (Details of K-Sampler Node) [Stable Diffusion]

Captions

[Music] Hello, welcome to episode 2 of the Stable Diffusion ComfyUI tutorial on my channel. In this episode I will tell you the meaning of each node and of each parameter that can affect the final result of your AI work. So if you want to understand more about how AI image generation works, keep watching until the end of this clip. Let's get started. [Music]

In this example I will use the default workflow. If you have already opened ComfyUI, press Load Default first to make sure you have the same workflow as mine. I will use the same checkpoint model as in the previous episode, which is epiCRealism. Once you have loaded the default workflow and have this model, we will try to create an image. But if your seed is different from mine, you will get a different image, so I will fix the seed by changing the value to 1234 and pressing Enter. You also have to set the "control after generate" option to "fixed". If you change the seed to 1234, fix it, and press Queue Prompt or Ctrl+Enter, you will get exactly the same image as mine, which is this one.

Next I will explain this workflow: how the AI can create this image for us. If you look at the workflow, you start with the model and the prompts. The positive prompt is the thing that you want in the image, and the negative prompt is the thing that you don't want in the image. The prompt is the conditioning: it conditions how the image is created. However, if you look at this box called KSampler, it has many inputs. If you want to zoom in to see a larger font or a larger box, scroll the middle mouse wheel, and you can hold down the left mouse button and drag to pan the workspace. You can see that this KSampler has many inputs: the model, the positive prompt, the negative prompt, and the latent image.

I will explain the latent image first. If I drag it up, this node is called Empty Latent Image. A latent image is an image that humans cannot see or understand, but the AI can understand it. If we humans want to see the content of a latent image, we have to convert it back to an actual image, which we can do using this node. This node is called VAE Decode, and it converts the latent into an actual image. So if I want to see what the empty latent looks like right now, I can click this LATENT output, drag out, and choose VAE Decode, the node that converts the latent into an image. If you then click the IMAGE output and drag, you will see many nodes to choose from. I will choose Preview Image to see what this latent looks like.

However, if you press Queue Prompt now, you get a red circle on the vae input. It tells you that you have to connect a VAE to this node. You have already connected the latent to the samples input, but you haven't connected the VAE yet. Where can we get the VAE? From the model. Many models already contain a VAE, which is something like a sub-model that helps us. So if we want to convert the latent into an image, we have to connect the VAE: drag from the model's VAE output and drop it on the vae input of VAE Decode. If you press Queue Prompt again, it works, and you will see in Preview Image what the empty latent image looks like. It's empty, right? It is similar to a blank canvas, a blank canvas on which the AI can create an image. That is what "empty latent" means.

So this is the starting point. The empty latent goes into the KSampler through the latent image input, and the KSampler converts this empty latent into a beautiful image like this. How is that possible? The process that happens in this box is that the AI creates noise from the seed. If the seed is the same, the noise that is created is the same. For example, if the seed is 1234, maybe the noise looks like this, a noisy mess. This is the starting point. Then the AI tries to remove the noise, so that the noise becomes less and less until we get the beautiful image. The AI removes the noise over many steps, and each step is one pass of removing noise. The number of noise-removal steps can be configured here, in steps.

For example, suppose we set the number of steps too low, maybe only one step. That means we tell the AI to remove noise from the noisy image and try to get back to a beautiful image, but with only one step. It's like telling someone to clean something dirty only once: the result may still be dirty. So if you use only one step, the result will be messy. However, you can already see the shape of the glass bottle, right? That is because we have a condition that says we want the noise to go back to this picture, and a negative prompt that says we don't want the noise to go back to that picture. This is what conditioning means: how to get from the noisy image back to a beautiful image following this condition and not that condition.

If you increase the number of steps to maybe 5 and press Queue Prompt, the result will be much better, because it's like someone cleaning the thing five times. If you set it to 10 steps and press Queue Prompt, the result has more detail and better quality. If you increase it further, to 20, it takes longer for sure, because it's like cleaning things up 20 times, but you get a better image. If you increase it to 30 steps, the result changes, but only a little bit. Cleaning 20 times or 30 times does not make much difference. If you use 50 steps, it takes much longer, but the result again changes only a little. You should figure out how many steps satisfy you. For me, about 20 or 30 steps is right, and 20 is okay for this example and this prompt.

Actually, the number of steps also depends on the sampler and scheduler. The sampler and scheduler are the method of removing noise. Some samplers remove a lot of noise in the first steps and less in the last steps, some remove noise equally at every step, and some use a different method altogether, so you have to test them yourself. I cannot explain each one in detail because it is too technical, but you can test them. If you change the sampler to a different one, maybe this one, and press Queue Prompt, you will see that it is still 20 steps but the result has changed, and the time has changed too, because some samplers take more time. I think this one gives you much better quality, right? So it depends on many things, and you have to play with it.

The next parameter I want to explain is the CFG value. The CFG value is the degree to which the AI has to follow your prompt. If this number is high, the AI has to follow your prompt more and tries to create an image exactly like your prompt. If the CFG is low, the AI has the freedom not to follow your prompt. For example, the default CFG value is 8. You can see that the resulting picture has no galaxy. It has purple, and it has the nature and the glass bottle, but there is no galaxy in it. If you lower the CFG value to maybe 1 or 2 (1 is very low) and press Queue Prompt, you will see that the result follows your prompt less. There is no galaxy, there is something strange, and I don't know what it is. It does not follow your prompt, right?

Next I will try a higher CFG, maybe 10, because at 8 there is no galaxy. If I try CFG at 10, you start to see some galaxy here, because it follows your prompt more. If I try 15, you start to see the galaxy in the bottle, and you see more detail. If you try CFG 20, you will see the purple galaxy like in your prompt. If I set the CFG to 30, you start to see some strange colors. If I put it even higher, at 50, the result has exploded because the CFG is too high. So again, you have to find the value that works for you. Normally I keep the default CFG at 8, but if 8 does not follow my prompt as I wish, maybe I set it to 10. In this example I think 10 is okay.

If you think the CFG is high enough but you are still not satisfied with the result because you want to see more galaxy in the image, you can tell the AI to put more emphasis on specific keywords, such as "purple galaxy". To emphasize a keyword, highlight it, hold down Ctrl, and press the Up arrow. A number will appear. This number is the emphasis weight. If you press Up again, the number increases, so the weight of this keyword is increased. If you press Queue Prompt after increasing the weight of "purple galaxy", the result should have a purple galaxy. If you want to lower the weight of something, such as "beautiful scenery nature", highlight that keyword, hold down Ctrl, and press the Down arrow. Each press of the Down arrow lowers the weight. Maybe I set the weight to only 0.7, which means the weight is less than normal. The normal weight is 1. If you press Queue Prompt after lowering the weight of "beautiful scenery", you will see that the beautiful scenery appears less in the final result.

If you don't want to see any scenery, you can remove that keyword completely. If you remove the keyword and press Queue Prompt, there should be no scenery in the image. However, if you don't want something like this purple ball or purple circle, you have to put it in the negative prompt, as "ball" or "bead" or something like that. I misspelled it, but the AI understands that you don't want a ball or a bead. Now there is no ball or bead in the picture, only the bottle, the purple galaxy, and the glass bottle landscape. It is exactly like your prompt. This is the meaning of CFG and of the prompt weights.

The last parameter is denoise. It means what percentage of the initial latent you want to change. For example, if you connect the KSampler to an Empty Latent Image, the initial image is completely random noise; it is noisy and messy. That means you have to change 1, which is 100 percent, of the initial image: you want to destroy all of it and start generating the image. So you should keep this number at 1 if you connect to an empty latent. If you lower this value to maybe 0.5 and press Queue Prompt, you will see that the result has some residual noise. If you lower it even more, maybe to 0.2, you will see even more clearly that there is still some noise around the picture. It changes only 20 percent, and the rest still has residual noise. If you lower it further, 0.1 is very low, and it means that the AI changes only about 10 percent of the starting point. So if you connect to an empty latent, you should set it to 1, which means 100 percent, and you will get the best result.

We lower this value only when we connect the latent image input to something that is not empty, which I will show you in the next episode. We will connect an actual image, maybe an image from the internet or an image that we generated previously. You can put it at the starting point as the latent image and use a lower denoise value there. I will explain that in the next episode. For this episode, thank you for staying with me until the end. You should now know more about how the AI generates an image for us, right? If you have any questions, you can leave a comment and ask me. See you in the next episode. Bye.
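
The seed, steps, and CFG behaviour described in the captions can be illustrated with a small toy sketch. This is not ComfyUI code and not a real diffusion sampler: the function names (make_noise, cfg_combine, toy_denoise, max_error) and the fixed rate of 0.2 are assumptions made up for illustration. The only part that mirrors real practice is the guidance formula in cfg_combine, which is the usual classifier-free guidance combination of a negative-prompt prediction and a positive-prompt prediction.

```python
import random


def make_noise(seed, size=4):
    """Seeded pseudo-random noise: the same seed always yields the same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def cfg_combine(uncond, cond, cfg):
    """Classifier-free guidance: start from the negative-prompt prediction
    and move toward the positive-prompt prediction, scaled by cfg."""
    return [u + cfg * (c - u) for u, c in zip(uncond, cond)]


def toy_denoise(seed, steps, target, rate=0.2):
    """Toy stand-in for a sampler (hypothetical, not a real one): each step
    removes a fixed fraction of the gap between the noisy values and the target."""
    x = make_noise(seed, len(target))
    for _ in range(steps):
        x = [xi + rate * (ti - xi) for xi, ti in zip(x, target)]
    return x


def max_error(x, target):
    """Largest remaining distance from the target, a rough 'leftover noise' measure."""
    return max(abs(xi - ti) for xi, ti in zip(x, target))


if __name__ == "__main__":
    target = [0.0, 0.0, 0.0, 0.0]

    # Same seed, same starting noise.
    print("same seed gives same noise:", make_noise(1234) == make_noise(1234))

    # More steps leave less noise, with diminishing returns.
    for steps in (1, 5, 20, 30):
        print(steps, "steps, leftover:", max_error(toy_denoise(1234, steps, target), target))

    # Higher cfg pushes the result further toward the positive prompt.
    for cfg in (1.0, 8.0, 30.0):
        print("cfg", cfg, "->", cfg_combine([0.0], [1.0], cfg))
```

In this toy, going from 1 to 20 steps removes most of the leftover noise while going from 20 to 30 changes little, which matches the "clean it 20 times or 30 times" analogy in the captions. Very large cfg values push the combined prediction far outside the range of either input, which is one loose way to picture why the image breaks down at CFG 50.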

Info

Channel: AI Angel Gallery

Views: 2,508

Keywords: ai, stable diffusion, comfyui, tutorial, ai art, digital art, generative ai, สอน, เอไอ, artificial intelligence, sd

Id: T3LE33tvpdU

Length: 20min 8sec (1208 seconds)

Published: Thu Aug 03 2023