First time SD - ComfyUI Basic Introduction

Video Statistics and Information

Captions

All right, hello and welcome to a ComfyUI tutorial. I'm going to be focusing on people who are using SD, and possibly ComfyUI, for the first time. You may be coming at it from having never used node graphs before, or you may be coming at it from having used SD before, who knows. I'm going to cover the basics of how it works, what each of these strange node things does, and what the noodles, these connective lines, are actually doing, and I'm going to explain some ways of organizing it so that it's not completely and utterly confusing.

When you first open up ComfyUI, you're going to be presented with something along the lines of this layout here, and it might not make any sense to you. You won't have this particular little window here either; it's a live preview window. You can activate that elsewhere, and I'll explain that closer to the end. But you'll see that all these noodles are going underneath boxes, they're tangled, they connect to who knows what, and you can't tell what these boxes do. So I'm going to start off going from left to right and explain what each of these does.

The Load Checkpoint node is designed to load a model, so it will load the SD model that you've downloaded from somewhere. You may by default have 1.4, or you may have to go to a site called Civitai to find more of them, but typically there'll be something in here. So that's what it does. It also sends out the model as a noodle, a little line which tells anything connected to it what the model is. It also sends out a CLIP noodle, which is specifically for sending the information the model needs to this particular node, the CLIP Text Encode. And it's also sending out the VAE, which is what this model uses to encode and decode the images that are made in the sampler.

So the first thing is the CLIP thing. The CLIP text boxes are basically how it converts the text that you put into them into something that this model can understand. It'll run through all the words and try to find sentences that make sense, but mostly it focuses on individual words. Depending on the model you're using, it may be better at a natural kind of sentence or better at individual words, and each of these words is divided up with a comma and spaces.

The negative prompt is just down here, and you put stuff in here that you don't want to see, so it's very useful for controlling specific aspects of an image. It's also better at individual words; it won't understand even pairs of words very well, it just tends to like individual words.

Then we've got an Empty Latent Image, which is basically how you set how many pixels high and how many pixels wide your image is. It also has the ability to set a batch. So if I put a number bigger than one in here and I click generate (this is how you run your prompt, by the way, over here, this Queue Prompt thing), it'll start making images. You may notice that it's got a little thing on here now; if we make this bigger and click this button here, it'll show you all of the images that it just generated. So setting this number will tell it to send more images down the line. However, the more of these there are, the more likely you are to have problems if they are high resolution, which you can see by setting the resolution high and then having a batch count higher than one. One will just send one image.

All right, so the sampler itself is kind of like the heart of Stable Diffusion. It's the heart of ComfyUI, and without samplers you're basically not going to be doing much Stable Diffusion. You can edit images in all kinds of ways, but you can't do any Stable Diffusion without samplers. You can have many, many samplers, and each sampler basically runs a whole bunch of calculations on what's called latent noise. Latent noise is produced using this value here, the seed, and this image here, the input image. It says Empty Latent Image because it's sending just a blank kind of image; usually it'll be like 50 percent of all colors, so somewhere in the middle of everything, probably a grayish beige color. It applies this kind of noise to that, and then it just runs it through over and over again. It runs it through and calculates: okay, is this an image, is this an image, is this an image? It just keeps going, it uses your prompt in order to do that, and it filters it down and down and down. It's kind of like when you're watching CSI and they look at a number plate, and somehow it bounces off six things and they get a visible image off of a number plate, the whole "focus in on the thing", the whole "enhance" thing. You're just doing enhance, enhance, enhance, enhance, which you can see when we queue the prompt and it goes. So that's basically what it is designed to do.

In order to do that, it runs it through a certain number of times; this was running through 20 times. It's also applying a certain amount of CFG. CFG is how much of your prompt it's going to make use of when finding images within the noise, basically.

Now the sampler and the scheduler are kind of part of the same system. The sampler is the math it uses to calculate from that noise. Some are quicker, some are slower, some do things differently, and some will produce artifacts in your image. If we do Euler, see, it seems a little bit blurry. If we do DDIM and set the steps to a lower amount (I'm setting it lower so that it's a bit more obvious), you can see there's a lot of blurriness going on, and there's also this kind of noisiness to the image that comes in just because of the sampler you're using. Whereas if we use something like dpm2 and so on and then queue it, it creates a much sharper image all of a sudden, as you can see. It's still blurry because it's low resolution, but the samplers can really change what this thing produces.

The Karras one, the scheduler, is kind of like how it applies the sampler within the number of steps that you have. So if you have 20 steps it'll do certain things: it'll add a little bit of noise and then remove a little bit of noise at each step, and so on and so forth. Each of these will do different stuff in a different way. And then you have the denoise amount, and that is how much of this particular thing it is using.

When it's done calculating an image, it will go, okay, I need to decode this image. That's because in this state, you'll see, it becomes a pink line called latent, and that's because it's using a certain kind of data type which is basically just a big bunch of numbers; it's not like a bunch of pixels. So in order to make it into pixels that we can see, we have to run what's called a VAE Decode, where it says to the model, hey, how do I turn this into an actual pixel image? And the model will go, oh, you use my VAE. Now it could use this VAE or it can use another VAE; there's a node where you can load different VAEs. So later on, when you learn more about this stuff, you may have access to more of these, and they're basically a way of calculating images. Anyway, when it runs this decode it turns this into an image. As you can see, the color is now blue, and that is coming out as an image.

Now, in order to send something to a sampler (I'll clone this), you could send it as a latent image like this, or you can do a VAE Decode and then a VAE Encode. You do this because it always needs this pink line to come in. And this particular process here, where you switch it to a blue line, is, as you can see, going to take time, so it's not very optimal. But there's a reason you might do it: you might want to do things to it within this area. For example, you might want to upscale your image, at which point it is going to send it to here, upscale it, send it back into another sampler, and then it's a different resolution. You could also go in here and do an image blur, for example, and do the same thing. They only accept one input. So that's one of the useful things: within pixel space it can do certain stuff that it can't do in latent space, and you won't find image blur in latent space. That's a useful thing to know as well. I'm going to remove these because we don't need them.

Now let's talk about the complexities of all this wiring. You may have seen that some of this wiring is going under stuff and some of it's going around stuff, and as you build a more complicated thing you may be tempted to hide the wiring, which is not a smart idea. It may be more familiar to you, if you've used A1111, to lay things out like this, for example, where you have your prompts, and then your output, and your resolution size like that. It may make more sense to you from a "where things are and how I access them" point of view. But if you actually want to do stuff to this, you have to untangle it, and then you have to reorganize things so you can get access to all the different parts and figure out where things are, and you'll get lost. Six months later you'll open this cramped thing and you'll be like, what do these things do? What's this plugged into? It's just not smart, basically, to do it that way. You could lay it out the other way, where everything is in the order in which it runs. You could lay it out a number of different ways. Some people just have stuff all over the place in a big box where everything's hidden and it's just nodes next to each other, and that may be your thing. But there are some different ways of running this that I like to use, and I'll show them to you now. We'll do that by setting this up again. We're going to keep these because we're lazy, and I'll show you how you can set this up in a way that actually makes a bit more sense and is more usable at the same time.

First off, you're going to want access to everything in one place at any time, so that you will always have these things available. In order to do that, we build what's called a bus. These Reroute nodes are used to send the cables in certain directions between stuff, so you can send these around nodes or whatever you want to do with them. But essentially what we're doing here is building a little highway where we can grab everything that we need at all times. This is Ctrl+C and Ctrl+V to paste stuff. There are quite a few ComfyUI keyboard shortcuts, and you can find them on the GitHub page near the install instructions. There are quite a few useful ones, and I'm going to show you one in a minute as well which is very, very useful. Well, I'll show you two that are very useful. If you've used Photoshop or anything similar, you'll probably know what half of these things do already.

Anyway, now we have a bus, and the reason we have a bus like this is that we'll always have access to what we need in order to make a sampler. Because the sampler is the core of how to run ComfyUI, having this rainbow of information is always useful, and wherever we have a sampler we can connect the things that it might need. We'll do a VAE Decode, and because we have one here we'll do this. Now, as you can see, this makes a fair bit of sense as to what each thing is doing.

Back here we have some stuff that doesn't make sense; we can't tell what each of these is doing, so let's change them. Now we know that this one is positive and this one is negative. There are other ways of doing this. If you are running complicated stuff and you just want to have that initial interface that A1111 has, you can even build those things with nodes; it just requires a little bit more work. If you right click on a node and go down here, you'll be able to convert text to input. With all of these things you can create inputs on all of these values, and you can send them wherever you want if you have the right nodes. I'm going to need a text node for this. If you double click on the background it will open a search thing, and you can just type in words you might be looking for, so "text". Now it's got a bunch of text things, and we're going to add a text box. If we connect this to a text box, we could have this way up the other end of the thing, and then bring this all together with a bunch of these things. We can send this information back here as values as well. You could build your own interface over here and then have your entire node graph over here if you wanted to. I'm not going to do that, it's too complicated for a beginner thing, but it shows you some of the possibilities that we have.

Now I'm going to show you one of the keyboard commands that's very useful. If you copy this node, well, copy all the nodes, why not, with Ctrl+C, then normally we would do Ctrl+V and it'll paste them in like this, but then you have to rewire these. We don't need to do that. If we press Ctrl+Shift+V, it will paste them in with the wires intact. That's incredibly useful if you want to repeat this stuff. Now I could do a whole bunch of stuff over here, upscale a thing, and then do different stuff over here, continuing along each of these paths, because this pink line here is like saying "this is the thing I want to work on". So everything connected to pink is going to attempt to run. If I click queue it'll run through the prompt, and I missed it somehow. There we go. So it runs through the prompt and sends it to the samplers, and as you can see they're both the same, because the seed was the same when I copied them. But afterwards it's randomized them both, so they will be different when I click it again.

You can click this to change it to increment, decrement, or fixed. I like fixed, because a fixed seed will only generate this once and it will just keep it until you decide to change this from fixed. So if I render this again, yeah, well, it'll change it once. If you change it to fixed, it'll render it once and then it will go on. So if I do it a third time, it'll just do this one now.

If I now decide that I want another one of these, I can copy these, Ctrl+C, Ctrl+V. Now I did that wrong, so we will delete these; even I make mistakes. Ctrl+Shift+V, right, there we go. Now I'm going to show you something really cool with this. I'll queue the prompt and we'll run it, and it should run this one as well. Yep, same as this one, because the seed was the same again. But what we're going to do is press Ctrl+M. Now this node is muted, which means it's still there, it still has its information, but it's not going to be run when you click Queue Prompt. Instead it's going to throw an error over here and just keep running this stuff, because this needs this node to exist and currently it doesn't. So while it is throwing an error, it's not a bad error, because this is intentional. So, as you can see, because you can turn off nodes with Ctrl+M, you can build a complex workflow here and a complex workflow here, and then when you don't want to run this one you can just turn off part of it and the whole thing won't run.

Okay, one more thing to show you, then we're going to do an upscale, and then we're probably going to end the tutorial. If you have this stuff and you find you want to make this particular thing repeatedly, and you don't want to have to rebuild everything, you can select these nodes, right click on the background, and save the selection as a template. I'm just going to call it "KSampler simple". All right, so now it has saved it under this node templates menu. If we go down here and pick "KSampler simple", see, it's put my nodes in here again. That's a useful way of saving parts of your workflow that you can use at any time.

The next thing to show you is how to upscale. We could do what's called a latent upscale, or we could do, hang on, we have to use images for this, an image upscale. I'm going to remove this one and change it to upscale by, sorry, latent upscale by. The "by" one is just the simpler one because it just multiplies it, and we'll go two with each of these. And there's another one, which is also an image one, and it's going to be upscale using model. Those are the three basic upscale types; there are techniques for upscaling and then there are types of upscaling. So if we pull this one out: Preview Image, Preview Image, and VAE Decode and then Preview Image.

This one is in latent space, which means it's not really pixels, it's math that's in here, which means it can get a little fiddly if you're trying to manipulate it. Doing an upscale in latent is fiddling with the image, which means that any upscale for latent is going to be lossy, and you'll see that in a minute when I upscale this. If we queue this, it'll only reprocess the parts that have changed, and for simplicity's sake let's make these fixed for our purposes. All right, now it's running this one.

Okay, I'll show you the outputs next to each other. As you can see, I'm keeping these things pretty well organized. They're separate from each other, and I'm trying to keep the wires from going underneath each other too much. They do a little bit, but generally speaking they are pretty well organized. We do this because, with all the wires and stuff everywhere and the complexities of organizing it, it's a good idea to just be neat so you know what each thing is doing. Otherwise you're going to get lost in the weeds really easily with ComfyUI if you are not maintaining the organization of your workflow as much as possible.

All right, so the first one, latent: you can see it is lossy, and it has changed the image a fair bit as well. This one is kind of blurry and it shows pixelation, even though these pixels are probably made out of four or five pixels themselves. And this one is somehow a lot sharper, and I'll show you why: it's quite a bit bigger. This has gone beyond just two times bigger, it has gone four times bigger, because this model is a four-times upscaler model. As you can see, there's some oddity to the way that things look on this one. If I change all the seeds to the same, we'll get a better idea of what each of these is doing, so we'll just go one four four on each of these. We may be running a bit long on this tutorial, but I feel like I'm in a good place, so we'll just keep going with this.

Okay, so we're now generating the same image on each of them. To speed this workflow up I could remove the preview and VAE Decode on this one, but not on these ones, because they must be VAE decoded. All right, so this one is blurry as, and noisy. This one is just pixelated and blurry. And this one is not pixelated but strangely smooth, with odd focusing of things. The focusing of things is the same here, but there's just some odd weirdness with the way that it handles certain parts of the image. So those are the three different upscale methods, and you could stop the thing there.

Most of us who make really big, complex images will continue on. We'll make samplers after this. You can just go from the latent with this one, but usually you have to VAE Encode, and then more steps, and then more samplers. Each time, if you just keep bringing this along with you, you can keep everything organized in a linear fashion from left to right and you won't have too much overlap. Things will be relatively easy to follow visually, with what each of these sections is doing. You'll have to move around a bunch; you can't just sit here, enter all the information, and click go, which I guess is kind of an annoyance for some people. But that's one of the things about ComfyUI: you've got to do stuff differently than you might expect in a normal workflow.

Now, if we wanted to negate the problems that latent upscale causes, I'll show you the trick for that. Say you really need to use latent upscale for memory reasons, or you just want to use it for whatever reason; maybe you want to add some more detail to your image, because this will add more detail. Instead of VAE decoding (because we already know what this image looks like), we will make a sampler, and the sampler is going to be set to just over 0.5 denoise. That's because it needs a certain amount of denoising done, because of the fact that it comes out noisy. If we now do a VAE Decode again, that should come out clear.

While it's doing that, I'm going to show you how to get these preview things. If you go to your ComfyUI folder, find the .bat file that runs your server, and open it in a text editor, you can add this: --preview-method auto. This will allow you to have thumbnail previews; they'll appear on all of your samplers, but you must restart the server to make it work.

All right, so now it has calculated, and you can see it's actually made a new image, and it's sharp now. But you will notice a difference between this image and the other image. If you look at the bottle you'll see that the edges are strange, the glasses have got these odd reflections on the sides, and there are some other odd differences in the image as well. But we have improved the image over this one, which is also upscaled to the same size. You can see this is blurry as; this one's much sharper now, but it's added details that are different from the original image.

We can mitigate that a little bit by changing the upscale method. We can change this to area, or bilinear, or bicubic. I want to change it to bilinear. What that'll do is change how the noise is applied when it upscales, and what it should do, if it's working correctly, is make these edges a lot more like the original, a lot cleaner. See how the edges are a lot more like the original now? I mean, they're not dark like on this one, but it's a lot more realistic in some ways. This is because the noise that gets added is calculated differently, and it's not as blocky and noisy as the noise that comes out at nearest-exact. So in some ways this can be better upscaling than the image one previously, it would appear; it's probably about as good as this in terms of sharpness.

The way that we can make a non-latent one, so we don't have to run it through samplers and stuff, is: I'm going to change this one from this to, actually, I'm going to do this one, and then we're going to apply this to it. By doing this 0.5, we are upscaling it by four and downscaling it by half, which should get us to the same as going up by two. I'll do another preview, why not; I'll copy this one. Here we go. By doing this we can avoid the problems caused by this one, and avoid the problems caused by this one as well, because it creates some oddness in an image too. It's more obvious in faces and stuff like that than in this particular thing that we are generating. I should have set this to fixed so it didn't run it again. So these ones will sharpen it similarly to this, but they won't have a sampler in it, essentially, is the idea.

If we now look at these side by side, see how this is blurry as, but this one's sharp. We've gone up to 4K and then down again; it's at the same resolution as this one, but it's not blurry, and it's more like this image in terms of sharpness and quality. So those are the two competing ways of doing it: you can use latent and it'll take more time, or you can use this way, which is more complicated but does some different processes. As you can see, there are a few different options available, and we can put more samplers after these and change them and make them even bigger. We can add more of these. We could just add another step: copy this whole thing and go, okay, I want to go up to the next step of resolution. Ctrl+C, Ctrl+C, actually, I'm going to delete that. Sorry, I've been talking so long now my throat hurts. All right, Ctrl+C, Ctrl+V, and we could just plug this in somewhere along here and then throw images in.

So, yeah, hopefully I've given you a better idea of how to use ComfyUI now, how to upscale, and how to do some of the processes that are easily accessible in A1111 or any of the other user interfaces. You may enjoy some of my other tutorials. I cover creating much more advanced systems than this from nothing, so you could even walk through them not knowing anything about SD or ComfyUI and follow those as well. Anyway, thank you for watching, and I hope it helps you in the future.
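
The default graph described at the start of the captions can also be written down as data, which may help when tracing which noodle goes where. The sketch below is a Python dict laid out in the style of ComfyUI's API-format workflow export. The node class names and input names follow that export as best I understand it and should be treated as assumptions, and the checkpoint filename, prompts, and seed are placeholders. The helpers `wires_into` and `run_order` are hypothetical, written only for this example; they are not part of ComfyUI.

```python
# A minimal sketch of the default text-to-image graph as data.
# Assumption: class and input names mirror ComfyUI's API-format export.
# The checkpoint filename, prompts, and seed are placeholders.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "model.safetensors"}},
    # Outputs of node "1": 0 = MODEL, 1 = CLIP, 2 = VAE
    "2": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"text": "a bottle and glasses on a table",
                     "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"text": "blurry", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 144, "steps": 20, "cfg": 8.0,
                     "sampler_name": "euler", "scheduler": "karras",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "PreviewImage", "inputs": {"images": ["6", 0]}},
}


def wires_into(graph, node_id):
    """Return {input_name: source_node_id} for each noodle entering node_id."""
    return {name: value[0]
            for name, value in graph[node_id]["inputs"].items()
            if isinstance(value, list)}  # a [node_id, output_index] pair is a wire


def run_order(graph):
    """List node ids so that every node comes after the nodes wired into it."""
    seen, order = set(), []

    def visit(node_id):
        if node_id in seen:
            return
        seen.add(node_id)
        for source in wires_into(graph, node_id).values():
            visit(source)
        order.append(node_id)

    for node_id in graph:
        visit(node_id)
    return order


if __name__ == "__main__":
    print(wires_into(workflow, "6"))
    print(run_order(workflow))
```

Reading the dict left to right matches the walkthrough: the checkpoint loader feeds MODEL, CLIP, and VAE to everything else, the sampler collects the model, both prompts, and the empty latent, and the VAE decode turns the pink latent line into a blue image line.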

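The resolution arithmetic behind the upscale comparison in the captions can be checked with a few lines of Python. This is a sketch under stated assumptions: a 512x512 starting image (the captions do not state the resolution) and the usual Stable Diffusion 1.x VAE, which stores one latent cell per 8x8 block of pixels. The function names are made up for illustration.

```python
VAE_DOWNSCALE = 8  # assumption: SD 1.x VAE, one latent cell per 8x8 pixels


def latent_size(width, height):
    """Size of the latent grid the sampler works on for a given pixel size."""
    return width // VAE_DOWNSCALE, height // VAE_DOWNSCALE


def scale(size, factor):
    """Multiply a (width, height) pair by a factor."""
    width, height = size
    return round(width * factor), round(height * factor)


def total_pixels(size, batch_size=1):
    """Pixels produced per queue, a rough proxy for how batch size adds load."""
    width, height = size
    return width * height * batch_size


start = (512, 512)                         # assumed starting resolution
latent_route = scale(start, 2)             # latent upscale by 2
model_route = scale(scale(start, 4), 0.5)  # 4x upscaler model, then scale by 0.5

if __name__ == "__main__":
    print(latent_size(*start), latent_route, model_route)
    print(total_pixels(latent_route, batch_size=4) / total_pixels(start))
```

Both routes end at the same pixel size, which is why the outputs can be compared side by side; they differ only in how the extra pixels are produced.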
Info

Channel: Ferniclestix

Views: 4,593

Keywords: tutorial, beginner, comfyUI, stable diffusion, introduction, starter, guide, learn to, AI, image generation

Id: hdWQhb98M2s

Length: 33min 46sec (2026 seconds)

Published: Sat Aug 05 2023