ComfyUI SDXL Basic Setup Part 2

Captions
All right, next part. We've set up the loader, we've set up the prompt, and we've set up a sampler, which has made a rather bad cat, so the next thing to do is fix things up so this cat comes out a little better, because currently it's broken.

One way to do that is to change the sampler. The sampler is what turns the empty latent image into a picture: each option in the sampler list resolves the noise in a slightly different way. They all do the same job using different kinds of maths, and some are better at certain things than others - some are really quick, some are very slow, and some produce artifacts you may not want. I'm going to use DPM++ 2S Ancestral, which usually needs fewer steps than the others to make images and, in my experience, does a little better with photos, though it depends what you're after.

So it's generated another image using DPM++ 2S Ancestral. Sometimes changing the sampler alone is enough to bring out a better image, so if you're having trouble getting decent images, consider switching to a different sampler and see if it works for you. In this case it's still coming out weird - it's got strange stuffed-toy eyes and so on - and one of the main reasons is that we're not using the refiner. So let's set up our refiner workflow now. It doesn't use the same CLIP text encode node; it uses a different one. I'll set it up quickly, the same way we did the other one: Ctrl+C, then Ctrl+Shift+V, which pastes the node with any incoming connections still attached. It doesn't work on every node when you do this, though - some just won't copy with the connectors, I'm not sure why, but 99% will. Right: green, red... actually, no, we don't need to use this one. We'll use a text
input, like the other one. The cool thing about using a text input node is you can literally just slap the same text into both, which is especially handy for the negative prompt, because you use the negative in pretty much the same way everywhere and it doesn't change much between the different CLIP encoders.

Now that this is set up, we make a bus. I generally like to have one bus along the top and one along the bottom, because one part of the workflow will use the top one and the next part will use the bottom one, and it's easier if the two buses don't get cross-wired. If you plug the model from one bus into something that's getting its conditioning from the other, you'll get weird results. In theory the VAE will work across both, but I've had issues in the past with it not bringing out images properly - more of an issue if you're using an SD 1.5 model as your second model, which you can do: instead of the SDXL refiner you can actually use a 1.5 model like Juggernaut or Realistic Vision to make images look a little better. Anyway - making the bus, getting distracted. Positive, negative, right.

Let's do it the quick way: Ctrl+C, Ctrl+V, connect them... actually, no - ignore what I just did, I'm messing up the tutorial here. Here we go. We're building a second bus, and the second bus is going to carry everything for our second model, so we need our reroute nodes in the right place to feed our second KSampler. (I have so many custom nodes.)

There are a few different ways of setting up this whole sampler stage for SDXL: there's a complicated way, and then there's
a really simplistic way, which is to use two KSamplers and plug them together back-to-back. This will run the first model and send a preview image out here - I'll lay it out the way I like later. I put the VAE decode over here, because we only really need it once, and then you connect the two KSamplers.

Now, if you're using an SD 1.5 model for your refining step - Juggernaut or something - you'll need to break this out with a VAE decode before sending it to your second KSampler, because 1.5 models won't understand this latent image; it's encoded differently. You must convert it to an image and re-encode it using the second model's VAE so that model actually understands what it's looking at. But because we're just using the SDXL refiner, and the base and refiner share the same VAE and model family, you can plug the latent straight through.

Same setup, then, with the denoise turned down to something really low, like 0.2. VAE decode, a reroute node for my layout, and a preview image. You can minimize nodes by clicking the grey circle, by the way. You want these the same size, and if you don't want to faff around you can hold Alt and drag a copy across, then plug into that. You can drag connectors onto a minimized node, but you cannot drag connectors off one - so if you're lazy you can plug things into it, it's just not very easy.

Right, queue. Okay, it's given me a drawing of a cat, and the reason is that I haven't told it I want a photo, which means it has free rein to decide what style to bring the image out as. Bear in mind that your prompt has a large bearing on what it does: the simpler the prompt, the more the model gets to choose. Now we're going to compare these two images - you may notice it takes a lot longer on the second sampler, and
that's because I don't have a lot of video RAM, which means it has to unload one model and load the other, shifting them around each time.

Comparing the two now: there's practically no difference. The eye colour has changed between the two images, the background structures are all pretty much the same, and some of the lines are a bit finer in detail. What the refiner actually does is mostly a kind of darken-and-sharpen pass. Over here it's very grey, kind of confused and a bit pixelated-looking, but over here it's been refined: it's darker, the lines are crisper, the whites are clearer, and it does much the same thing everywhere. That's why you generally want the refiner step at a low denoise level - if you try to generate an image with just the refiner, it will make some pretty meh things; they're not going to look great with the same prompt.

Let's do one where the differences are a little easier to spot: "photo of a cat on the roof", which should make this image a photo. It won't always be a photo - to give it more detail we need to do some more complicated stuff with the prompts, but I'll do that later, probably in the next stage.

A key thing to know about SDXL is that the first pass - the first thing you generate - is generally going to be better than what you get by adding extra samplers with more steps. If you run this one at a denoise of 1.0 and the next one at 0.6 - that is, you let the second sampler redo, say, 60% of the image - it will generally come out worse rather than better, because the base model is set up to do the first step; it's not set up to do the second step, and if you use it for a second step it gets a bit odd.

Okay, so, differences between the two: the eyes look
generally worse on the second one, and the mouth looks a little worse too. The back foot is still messed up. The roof section in the corner here looks better - less mangled: this part connects oddly on one, but on the other it looks more like an actual edge of metal. I've seen the refiner fix certain things and make other things worse; that's what I'm noticing a lot. It's a little better at whiskers and things like that. The eye actually looks sharper even though it looks worse, because of that white line under it. The refiner doesn't have the sense to say "these parts are fine, I'll keep them" - it runs the same process over the whole image, and sometimes that just doesn't work as well.

Anyway, this is the bare-bones way of setting up the samplers. It's not the optimum way, and it's not the way that will give you the best results either - it's the way that gives you average-to-fair results. Let's go on to the next step and use the way a lot of people actually generate images. First, let's delete these two.

The way most people do it is with KSampler Advanced, which is more complicated - it's not entirely transparent what each part of it does, so you need a better understanding of how samplers work in order to use it properly, which is why it's called advanced. But it plugs in pretty much the same as the previous one did, so I'll copy that across and connect the latent. (I keep trying to connect the VAE when it doesn't need it.)

The tricky thing about the advanced samplers is that they use steps instead of a denoise value. If you want to use an actual image instead of an empty latent, you'd go Image > Load Image, then VAE Encode, and so on. Say you wanted to start from an actual image: the
important bit is the start-at-step value. For an image-to-image pass it needs to be a higher number, and it always needs to be lower than your total steps value. Total steps is how many steps the sampler runs for overall, and start-at-step is how far along it assumes the process already is when this sampler starts working: the higher the number, the more steps it assumes have been done before this sampler is triggered. Think of it as a percentage - pull out the calculator, work out what fraction of 25 it is, and that tells you roughly how much denoising is left to do.

Also, on the very first KSampler - the first one in the chain - you want "add noise" enabled; I'll talk about why in a second. If you're not using an input image and you're just using the empty latent, you set the start step to zero. (You can do image-to-image that way too, if you want.) So: latent image in, start at step 0, and for the sampler name I'll just use the same one as before.

For the second KSampler you generally want it to do only 10 or 20 percent of the image. A good way to judge it: assume 25 steps, so both samplers have the same total steps value, and start the second one at, say, step 20 - then it's only doing five steps or so. I'm going to do 18.
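The step arithmetic above can be sketched in a few lines of Python (the function name is mine; 25 total steps with the refiner starting at step 18 is the example from the video):

```python
def refiner_split(total_steps: int, refiner_fraction: float) -> int:
    """Return the start_at_step for the second (refiner) sampler so it
    handles roughly `refiner_fraction` of the total steps."""
    return round(total_steps * (1.0 - refiner_fraction))

# With 25 total steps, starting the refiner at step 18 leaves it 7 steps,
# which behaves roughly like a 0.28 denoise on a normal KSampler.
start = refiner_split(25, 7 / 25)
print(start)             # → 18
print((25 - start) / 25)  # → 0.28  (equivalent denoise fraction)
```

The same function gives start step 20 for a 20% refiner share, matching the "only doing five steps" example.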
"Return with leftover noise" is disabled, which makes sure it actually denoises completely. This step here will also add a tiny bit of noise, which makes the image slightly different than it would otherwise be. And there we go - yes, you can connect the minimized nodes, it's just really fiddly. Okay, let's run it... I'll do it again - and yes, you can actually connect to a minimized one. There you go. Cancel.

Oh - if you're wondering how I'm getting this live image preview, there's an option in your .bat file; hang on, I'll find it while it's generating. Right: if your .bat file sets the preview method to auto (the `--preview-method auto` launch option), you'll get live previews of your images as they generate. The important thing to realise is that it uses more VRAM that way, so don't enable it unless you've got the RAM for it.

So now we get to see what the actual proper way of setting this up will do. See, the eyes are like a hundred times better using this method. The fur is a bit messier here, in a good way - more realistic, less artificial than you see it here. The tiles are still out of whack, because it's not using enough steps to fix tiles or anything like that. The tips of the ears look a little better, maybe - it's hard to say - more realistic anyway. And you'll see the first one has artifacts around the whiskers, whereas these whiskers blend in with the background a little better. So that's why you want to use the advanced samplers instead of the normal KSamplers, and you generally want one set up as the normal base and one as the refiner.

So what we have now constructed: we've got our positive and negative prompts, our loaders for both models, two buses, two samplers, and a whole refining setup - and you can make better cats this way. Let's have it make "photo of RoboCop standing on a car in a burning city street". You'll
note I'm not using just individual words in this prompt - I'm using a sentence - and that's because CLIP Text Encode SDXL, like the other encoders, does better with a contextualised sentence than with isolated keywords. Let's see if we can generate an image off that. And because I've gone with 80s tech, you now know I'm a child of the 80s. Apparently this thing doesn't know what RoboCop looks like - it knows what colour he is, but not that he has a face - and it has also embedded him in the car, showing that AI is not perfect.

Let's see what differences there are between the two images. Writing is always going to be messed up - none of the normal text-to-image models do writing very well. There are minimal differences: the fire looks pretty much the same, with this one section just slightly more realistic; the rest is mostly the same. The lights appear close to the same; the front grille actually looks a little worse in the second one than the first, which is interesting. The background cars look a little better, maybe - it's hard to say whether that's just random seeds or whether the refiner is actually doing anything, if you get me. There are more visible letters on one side than the other, but generally speaking there's not a lot of difference between the two images. The job of the refiner is to do a very small part of a larger image.

The next step is going to be making the prompt much more advanced and doing the text_g and text_l inputs properly, and we might do a different method of latent image input that I like to use, called pre-diffusion. All right, I'll see you in the next video then.
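For reference, the base-plus-refiner pair of advanced samplers built in this video can be sketched in ComfyUI's API-format workflow JSON roughly as below. The node IDs, seeds, CFG value, and the upstream node references ("base_loader", "empty_latent", etc.) are placeholder assumptions, not from the video; the step split (25 total, refiner from step 18) and the sampler name are the video's settings. Disabling add_noise on the refiner is the common convention, though the video mentions letting the second stage add a little noise to vary the result:

```python
import json

TOTAL_STEPS = 25
REFINER_START = 18  # base does steps 0-18, refiner finishes 18-25

workflow = {
    "10": {  # base model pass
        "class_type": "KSamplerAdvanced",
        "inputs": {
            "add_noise": "enable",                   # first sampler creates the noise
            "noise_seed": 42,
            "steps": TOTAL_STEPS,
            "cfg": 8.0,
            "sampler_name": "dpmpp_2s_ancestral",
            "scheduler": "normal",
            "start_at_step": 0,
            "end_at_step": REFINER_START,
            "return_with_leftover_noise": "enable",  # hand a noisy latent to the refiner
            "model": ["base_loader", 0],
            "positive": ["base_pos", 0],
            "negative": ["base_neg", 0],
            "latent_image": ["empty_latent", 0],
        },
    },
    "11": {  # refiner pass
        "class_type": "KSamplerAdvanced",
        "inputs": {
            "add_noise": "disable",
            "noise_seed": 42,
            "steps": TOTAL_STEPS,
            "cfg": 8.0,
            "sampler_name": "dpmpp_2s_ancestral",
            "scheduler": "normal",
            "start_at_step": REFINER_START,
            "end_at_step": TOTAL_STEPS,
            "return_with_leftover_noise": "disable",  # denoise completely
            "model": ["refiner_loader", 0],
            "positive": ["refiner_pos", 0],
            "negative": ["refiner_neg", 0],
            "latent_image": ["10", 0],  # SDXL refiner takes the base latent directly
        },
    },
}

print(json.dumps(workflow, indent=2)[:60])
```

Note that the base sampler's end_at_step equals the refiner's start_at_step, so the two together perform exactly one 25-step schedule, split between the models.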
Info
Channel: Ferniclestix
Views: 3,139
Id: dvK6Lyah-64
Length: 25min 25sec (1525 seconds)
Published: Sat Jul 29 2023