Complete Comfy UI SDXL 1.0 Guide Part 1 | Beginner to Pro Series

Reddit Comments

Thank you for the ComfyUI guide. This will help new users transition from Automatic1111 to ComfyUI.

👍 4 · u/hashms0a · Aug 28, 2023

Great video! Can't wait for part 2.

👍 2 · u/rookan · Aug 28, 2023

Nice tutorial, much better production values than mine XD there's even, le-gasp, EDITING! :D

I love that you're stepping through it from nothing and showing comparisons with A1111, very nice.

👍 2 · u/Ferniclestix · Aug 28, 2023

Thank you so much. I can load pre-existing ComfyUI workflows and use them. But understanding WTF is going on... that's a different conversation.

👍 1 · u/esmeromantic · Aug 29, 2023

Where could we get the workflow?

👍 1 · u/cliffordp · Aug 29, 2023
Captions
SDXL has been out for a couple of weeks now, and the general consensus is that ComfyUI is the preferred way to work with this model, particularly because the base and refiner models have been split apart, not to mention the additional layers of control that ComfyUI gives you. A few checkpoint merges have started to be released that combine the base and refiner outputs into a single model you can use with Automatic1111. A lot of people have been complaining about how complex ComfyUI is and how scary it can look, with nodes and cables flying all over the place: essentially noodle soup. However, that doesn't have to be the case. ComfyUI can be as easy as Automatic1111 or as complex as you want it to be. In fact, I've included a link in the description to a configuration file that replicates the Automatic1111 text-to-image interface as closely as possible. In this series we aim to go from absolute beginners to ComfyUI pros, with an understanding of what each of the main components does. If this kind of content is interesting to you, please don't forget to like and subscribe; it really helps the channel out.

To start, we're actually going to open Automatic1111 and go over the key elements we'll try to replicate in ComfyUI: the positive and negative prompts; textual inversions, hypernetworks, and LoRAs; the seed; the CFG scale; Restore Faces and ADetailer; hires fix; and ControlNet. These are the components most commonly used in Automatic1111, and we'll aim to replicate them over the course of the series, as well as explore more advanced things you can do.

Heading over to ComfyUI (by the way, if you haven't installed it yet, or you're using a RunPod instance, I have a separate video that explains how to do that), you'll be greeted by a screen that looks something like this. It could be completely empty, or it could have the default nodes already set up. Just so we're on the same page, go ahead and click the Clear button so we start from an empty canvas.

There are two main ways to create nodes on the canvas. One is by right-clicking: a little menu pops up that lets you browse the different categories and select the node you want. This is great if you're not entirely sure what you're looking for and want to explore the nodes and what they do. The other way is by double-clicking on the canvas, which gives you a little search box where you can quickly type in the node you're looking for. This is great because sometimes I forget which category a particular node is in, and it's just faster to double-click and go straight to it.

In this case we're going to grab the simple checkpoint loader, named Load Checkpoint (CheckpointLoaderSimple): just type it in, grab it, and drop it over here. You'll also see there's an advanced version of the checkpoint loader, which allows you to load in a configuration YAML file; we won't be using it at this time, so we'll go ahead and delete that. In our Load Checkpoint node, I'm going to click the dropdown and select SDXL base 1.0. You'll notice there are three little dots here, called MODEL, CLIP, and VAE. These are how we connect this node to other nodes, and we'll get into that later on.
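Under the hood, every ComfyUI graph is just JSON: with the developer options enabled in ComfyUI's settings, a "Save (API Format)" button exports each node as a JSON entry. As a minimal sketch (written here as a Python dict; the node ID and file name are illustrative placeholders), the Load Checkpoint node looks something like this:

```python
# A minimal, hedged sketch of one node in ComfyUI's API-format JSON, written
# as a Python dict. The node ID is arbitrary, and the checkpoint file name is
# an assumption; use whatever sits in your models/checkpoints folder.
workflow = {
    "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"},
    },
}
# The node's three dots map to output indices: MODEL = 0, CLIP = 1, VAE = 2.
# Downstream nodes reference them as ["1", 0], ["1", 1], and ["1", 2].
```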
The next node we're going to grab is called KSampler. Again, there are two variants, KSampler and KSampler Advanced; we'll use the standard KSampler for now, but we will need the advanced one later on to make the most of SDXL 1.0. The KSampler is the component that does the heavy lifting for your model. You'll notice some fields that look similar to what you've seen in Automatic1111: this is where your seed and your CFG go, along with other parameters that affect the output of your prompt.

Next we need to connect the KSampler to the model we're using. To do this, we connect the two purple MODEL dots together. Then we need to connect our positive and negative prompts. We do this by adding a pair of nodes called CLIP Text Encode, which take your prompt and apply it to the model to get the desired output, based on the parameters of the KSampler. So we add two CLIP Text Encode nodes, one for the positive and one for the negative, and like before we connect the matching dots: CLIP on the model to CLIP on each node, and on the right side, CONDITIONING to positive on one and negative on the other.

Lastly, you'll notice the KSampler has a dot called latent_image. We have to connect it to a node called Empty Latent Image, which is a blank image in latent format, a format the AI models can understand. We're feeding in an empty one because it will be the starting noise used to generate our image. We need to set the latent image to 1024 x 1024, as this is the image size that works best with SDXL 1.0.

Great. Once that's done, we can head into the KSampler and start to understand what each of these lines means and how they compare to Automatic1111. As I said before, we have our seed, just like in Automatic1111. Underneath it you'll see control_after_generate, which lets you decide whether to keep your seed the same, generate a new random seed, or increment it by one, just like you can in Automatic1111. Steps is just like steps in Automatic1111: the number of iterations the model goes through to generate the image. CFG determines how much creative freedom you give your model, again just like in Automatic1111. sampler_name selects your sampling method, like in A1111, and the scheduler controls the noise schedule alongside it. Denoise is simply the percentage of the steps that you want the KSampler to complete: if we have 20 steps and set denoise to 0.5, it will only complete 10 of the 20 steps of the denoising process, which essentially means you may end up with a noisier image.
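Putting the pieces so far together, here is a hedged sketch of the graph up to this point in the same API format (the decode and save stages come next). The prompts, seed, and sampler settings are placeholders, not the exact values from the video:

```python
# Hedged sketch of the text-to-image graph so far, in ComfyUI's API-format
# JSON. Links are [source_node_id, output_index]; all literals are placeholders.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",     # positive prompt
          "inputs": {"clip": ["1", 1], "text": "a scenic mountain lake, golden hour"}},
    "3": {"class_type": "CLIPTextEncode",     # negative prompt
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality, watermark"}},
    "4": {"class_type": "EmptyLatentImage",   # the blank starting latent
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0],
                     "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0],
                     "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     # denoise is a fraction of steps: 0.5 with 20 steps runs
                     # roughly 10 denoising steps, leaving a noisier image
                     "denoise": 1.0}},
}
```

(control_after_generate is a UI convenience for choosing the next seed and, as a rule, doesn't appear in the API-format export itself.)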
Okay, now we're going to do a little housekeeping and lay out the nodes to look something like they do in Automatic1111, so we'll move this here and pop this here. Now let's finish the flow. You'll notice on the right-hand side of the KSampler there's a little dot that says LATENT; this means the KSampler outputs a latent image, and we need to convert that into a pixel image we can view. We do that using a VAE decoder. You may have noticed the VAE selector at the top right in Automatic1111, where you can pick different VAEs to affect your image output; this is the same thing. It's a decoder: it translates your latent image into pixels. So we drop in the VAE Decode node, and, like in Automatic1111, we add a Load VAE node where we can select the SDXL VAE and connect it to the vae dot, and we connect LATENT to samples. Finally, we want to be able to output and save our image, so we use the Save Image node and connect the IMAGE dot to the image dot on the Save Image node.

And there we go: with a little more adjusting, we've basically got the skeleton of Automatic1111 in ComfyUI, so now we'll just adjust some of these nodes to clean things up. Now we can plug in some prompts and try to generate an image. I've already set up a positive and negative prompt here (where I also forgot to connect my negative prompt to the KSampler, which I'm doing now). If you want to try this prompt, I've left it in the description below; it took a little experimenting to get an image I wanted to work with. Once we're ready to see the output, we click Queue Prompt, and if we want to follow the progress, we can click View Queue. You may notice some of the nodes starting to turn green; this shows which stage your image is at. Right here it's at Load Checkpoint, then you'll see it go through the CLIP encoders, the KSampler, and Save Image. And there you go, we finally got our image.
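In the same placeholder JSON, the decode and save stages are two more nodes, and clicking Queue Prompt is just an HTTP call under the hood. A hedged sketch, building on the dict above and assuming a local ComfyUI server on its default address of 127.0.0.1:8188:

```python
import json
import urllib.request

# Complete the sketched graph with the decode and save stages. Here the base
# checkpoint's own VAE (output index 2 of node "1") is used for simplicity; a
# separate VAELoader node, as in the video, would be wired in the same way.
workflow["10"] = {"class_type": "VAEDecode",
                  "inputs": {"samples": ["5", 0], "vae": ["1", 2]}}
workflow["11"] = {"class_type": "SaveImage",
                  "inputs": {"images": ["10", 0], "filename_prefix": "base"}}

def queue_prompt(workflow: dict, server: str = "127.0.0.1:8188") -> dict:
    """Queue a workflow against a running ComfyUI instance over its HTTP API."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        f"http://{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())  # includes a prompt_id you can poll

print(queue_prompt(workflow))
```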
Now, as you may know, part of working with SDXL is sending your prompts through the base model and then through the refiner model, so we're now going to work on the refiner side. Before we continue, let's group this up so it's nice and tidy and we know which nodes belong to the base side of the equation. As with the base model, we now need to load the refiner checkpoint, using the same Load Checkpoint node. We give the refiner its own KSampler, and of course we connect the checkpoint to the new KSampler. The KSampler also requires its own positive and negative CLIP encodes, but we don't want to type out the same prompt for both the base and the refiner, so for now we'll just connect the existing CLIP Text Encode nodes from the base model to the refiner's KSampler.

One of the key differences on the refiner side of the graph is that the latent image its KSampler takes as input is no longer a blank latent; it's the output of the base model's KSampler. So we connect LATENT to latent_image, and we're now feeding the semi-finished image into the refiner's KSampler to continue working on. As before, we also add the Load VAE and VAE Decode nodes, and finally the Save Image.

Now, this is going to give us an error, which we'll correct in a moment. The reason is that we're trying to use the same CLIP Text Encode nodes in multiple KSamplers. We did it this way because we want to keep our ComfyUI experience simple and easy to use, and one way to do that is to avoid inputting the same text again and again when it's used by multiple KSamplers. The correct way to do this, without multiple text boxes holding the same thing, is to extract elements out of nodes and reuse them across nodes. In this case, we right-click the CLIP Text Encode node and convert the text widget to an input. You'll notice the node goes blank; don't worry, your information is still saved. There's now a new input called text. Drag from that input out onto the canvas, and when you release the connection you'll see a drop-down of nodes come up; go down into utils and select Primitive. You'll notice the text you had before in the CLIP node has been filled into this primitive. We do the same for the negative prompt, and now that the CLIP nodes are empty and so big, we resize them to clean things up. Then we duplicate these CLIP nodes, move them over to the refiner side, connect the text primitives to the respective CLIP nodes, and connect the CLIP nodes into the refiner's KSampler. Essentially, we've given each KSampler its own positive and negative CLIP encodes while reusing the same text across them, which is why we were getting the error earlier. It does add a little messiness, but we can always tuck these nodes out of the way.

Now, if we click Queue Prompt, we'll see a new base image and the refiner output, which should in theory be an improvement on the base image. In this case, though, you'll see we got something kind of similar but also very different, and in fact quite gruesome. This is definitely not an improvement on the base image; it's a different image entirely. So why are we getting this problem? Well, as I mentioned earlier, to successfully use the SDXL base and refiner models together, we actually need the KSampler Advanced nodes. So now we'll replace the KSampler nodes with the advanced variants and explain the additional settings and how they make everything work together. Let's replace the base KSampler with the advanced one and reconnect everything, and do the same for the refiner.

Great. Now that everything is plugged back in, what do we need to do differently to make this work? If we head over to the base KSampler Advanced, you'll notice three new lines that weren't in the previous KSampler: start_at_step, end_at_step, and return_with_leftover_noise. These allow us to output an unfinished image from the base model. The return_with_leftover_noise setting is so important because it lets the model output an image with noise left over; the refiner then goes in and finishes removing that noise, focusing on improving and adding details, rather than what was happening before, where it tried to regenerate the image as if it still had a full layer of noise. For now we'll leave it at 20 steps, and if we head over to the refiner's KSampler, we've got the same lines, but we'll change start_at_step to 20, which is where the previous one ended, and add another 20 steps, finishing at 40, keeping the number of steps the same.

If we queue the prompt now, you'll see the two images look almost identical; in fact, if we zoom in, they are identical. So what's going on here? The refiner doesn't seem to have refined the base image at all. This is probably happening because we're giving the base model too many steps, so it outputs a finished image with no noise left for the refiner to work on. If we come back to the base KSampler and drop end_at_step to 16, and set the refiner's KSampler to start at 16, you'll now see that the base model's output gives us the outline of an image, still very noisy, which the refiner model can then take and tweak. And we can see here the refiner's output looks absolutely phenomenal: the image looks detailed, there's great lighting, and so on and so forth.
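For reference, here is a hedged sketch of that base/refiner handoff in the same placeholder API JSON. It follows the video's pattern of giving the refiner its own CLIP Text Encode nodes that reuse the same prompt text; node IDs and file names remain illustrative:

```python
# Hedged sketch of the KSamplerAdvanced pair. The base sampler stops early and
# returns a still-noisy latent; the refiner resumes at the same step index.
HANDOFF = 16       # base ends here, refiner starts here
TOTAL_STEPS = 20

workflow["7"] = {"class_type": "CheckpointLoaderSimple",
                 "inputs": {"ckpt_name": "sd_xl_refiner_1.0.safetensors"}}
workflow["8"] = {"class_type": "CLIPTextEncode",   # refiner positive, same text
                 "inputs": {"clip": ["7", 1], "text": "a scenic mountain lake, golden hour"}}
workflow["9"] = {"class_type": "CLIPTextEncode",   # refiner negative, same text
                 "inputs": {"clip": ["7", 1], "text": "blurry, low quality, watermark"}}

workflow["5"] = {"class_type": "KSamplerAdvanced", # base pass (replaces KSampler)
                 "inputs": {"model": ["1", 0],
                            "positive": ["2", 0], "negative": ["3", 0],
                            "latent_image": ["4", 0],
                            "noise_seed": 42, "steps": TOTAL_STEPS, "cfg": 7.0,
                            "sampler_name": "euler", "scheduler": "normal",
                            "add_noise": "enable",
                            "start_at_step": 0, "end_at_step": HANDOFF,
                            "return_with_leftover_noise": "enable"}}
workflow["6"] = {"class_type": "KSamplerAdvanced", # refiner pass
                 "inputs": {"model": ["7", 0],
                            "positive": ["8", 0], "negative": ["9", 0],
                            "latent_image": ["5", 0],   # the base's noisy latent
                            "noise_seed": 42, "steps": TOTAL_STEPS, "cfg": 7.0,
                            "sampler_name": "euler", "scheduler": "normal",
                            "add_noise": "disable",     # the latent is already noisy
                            "start_at_step": HANDOFF, "end_at_step": TOTAL_STEPS,
                            "return_with_leftover_noise": "disable"}}
# Repoint the decode/save stage at the refiner's output to save its image.
workflow["10"]["inputs"]["samples"] = ["6", 0]
```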
Before we finish up, in the spirit of keeping things simple and our workflow as easy to use as possible, we're going to extract the starting and ending steps out of the two KSamplers, much like we did with the positive and negative prompts, so you can set the value once and it becomes both the ending step for the base KSampler and the starting step for the refiner's. We do this by right-clicking each KSampler: on the base one we convert end_at_step to an input, and on the refiner, start_at_step. Then, just like before, we create a primitive and connect it to the new inputs on both. Now we no longer have to keep jumping back and forth between the two KSamplers every time we want to change the starting and ending steps.

And that's basically it: we now have a full working workflow from beginning to end, using both the base and refiner models. Don't forget to like and subscribe and click the bell icon to stay updated on new videos; it really helps the channel out. If you want to use this workflow, I've included the JSON file below; you can just drag it into your ComfyUI interface and it will work. I recommend you go in and play around with the parameters: adjust the starting and ending steps, the number of steps, the CFG, etc., and see what kinds of different images you can get. Have some fun with it, and I'll catch you in the next one, where we'll start looking into prompts and integrating embeddings and prompting techniques to try and get better images.
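If you'd rather run that kind of experiment from a script, here is a hedged sketch that reuses the placeholder workflow dict and queue_prompt helper from the earlier sketches to sweep the handoff step:

```python
# Hedged sketch: sweep the base/refiner handoff step programmatically. "5" and
# "6" are the placeholder node IDs of the base and refiner KSamplerAdvanced
# nodes from the earlier sketches.
def set_handoff(workflow: dict, handoff: int) -> None:
    # The code analogue of the shared primitive node: one value drives both
    # samplers, so the step boundary can never drift apart.
    workflow["5"]["inputs"]["end_at_step"] = handoff
    workflow["6"]["inputs"]["start_at_step"] = handoff

for handoff in (12, 14, 16, 18):
    set_handoff(workflow, handoff)
    queue_prompt(workflow)  # each run's images land in ComfyUI's output folder
```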
Info
Channel: Endangered AI
Views: 6,684
Id: 39k5e5_kfJ8
Length: 20min 26sec (1226 seconds)
Published: Mon Aug 28 2023