Stable Diffusion Tutorial: Using XYZ plots to Optimize Parameters and Get the Most Out of your Model!

Captions

Hello everyone, Keyboard Alchemist here, and welcome back to another Stable Diffusion tutorial. Today I will share with you my personal workflow for what to do when you're just starting to use a new model or checkpoint. Before we start, do me a favor and click the like and subscribe buttons to help support this channel. Your likes and subscriptions help me grow this channel and allow me to continue making quality content. Thank you.

Last time, we downloaded the Magic Mix Realistic version 5 model and added it to our web UI. If you don't already know how to add a new model to your Stable Diffusion models folder, go ahead and check out my previous video; the timestamp is 9:52. I'll link the video in the top right-hand corner.

Oftentimes, after you've downloaded a new model, it might not be readily apparent what values you should set the parameters to in order to generate good results. You could use a trial-and-error method and just try different parameters at random; however, that is time-consuming and not very efficient. Luckily, we have a great built-in tool that can help us do this in a more systematic and efficient way: the XYZ plot tool in the script section. The XYZ plot tool provides an easy way to make a three-dimensional grid of images with your choice of different parameters on the X, Y, or Z axis. I will show you how to use this tool to test the boundaries of a new model and how to find the optimal ranges for the most important parameters. Let's get started.

To me, the most important parameters to get right when you first start using a new model are the sampling method, sampling steps, and CFG scale. First, let's take a moment to understand how Stable Diffusion models generate images and describe each of these parameters in a bit more detail.

Stable Diffusion models generate images using a denoising process, which means the model first generates an image that is just random noise based on the seed value. Then it will try to take a portion of the noise away from the original noisy image by
guessing based on your text prompts. Each time it takes noise away from the original image is called a sampling step. After a certain number of sampling steps, we will get the final output image that we perceive as an object, a person, or whatever was described in the prompt. In general, more sampling steps means longer image generation time and usually a better final image, up to a certain number of steps. In other words, the quality of the image will plateau after enough steps have been used. This means you usually do not want to crank up the steps to the max value of 150, because you will lengthen your generation time without getting more benefits in return. Instead, you want to strike a balance between good image quality and fewer steps for faster generation.

The way the model makes a guess of the noise at each sampling step is by using a sampling method. Sampling methods are just ways to calculate the predicted noise that is to be taken away from the original noisy image. Each sampler is different: some samplers need more steps to take away enough noise to generate a decent image, and some samplers need fewer steps. You can see in this example that the top sampler took 150 steps to arrive at a decent image, while the bottom sampler took fewer steps. Also, some samplers run faster relative to others. This means you want to pick a sampler that will give you a good image with a small number of steps and that runs relatively fast for each sampling step.

Lastly, let's talk about the CFG scale. CFG stands for classifier-free guidance. Think of it as a creativity meter for how closely the model should follow your text prompts. A higher CFG value means the model will follow your prompt more strictly; a lower CFG value means the model will have more creative freedom when generating the image. The CFG scale goes from 1 to 30, with a default value of 7.
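
To make the CFG scale a little more concrete, here is a minimal Python sketch of how classifier-free guidance is commonly described: the model predicts the noise twice, once with the prompt and once without, and the scale controls how far the result is pushed toward the prompted prediction. The function name `apply_cfg` and the toy numbers are illustrative assumptions, not code taken from the web UI.

```python
from typing import List


def apply_cfg(uncond: List[float], cond: List[float], cfg_scale: float) -> List[float]:
    """Combine unconditional and prompt-conditioned noise predictions.

    Standard classifier-free guidance formula:
        guided = uncond + cfg_scale * (cond - uncond)
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy noise predictions for a three-value "image" (made-up numbers).
uncond_pred = [0.0, 0.5, 1.0]
cond_pred = [1.0, 0.5, 0.0]

# Compare a minimal, a default, and a maximal scale.
for scale in (1.0, 7.0, 30.0):
    print(scale, apply_cfg(uncond_pred, cond_pred, scale))
```

At a scale of 1 the result is simply the prompted prediction. Larger scales exaggerate the difference between the two predictions, which is one plausible reason very high CFG values tend to look harsh or full of artifacts.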
As we will see later, you would almost never need to use a CFG value that is much greater than 7, as it tends to introduce more artifacts, but some sampling methods do work better at lower CFG values.

Now that we understand what sampling methods, sampling steps, and CFG scale mean, how do we use XYZ plots to help us figure out the optimal ranges for these parameters within a specific model or checkpoint? Here is the TL;DR. First, we will make an XYZ plot of sampling steps versus CFG versus sampling methods using large intervals. Second, we will use the plot we made in step one to help us see which sampling methods, or samplers, work well with this model, and pick one or two samplers to work with. Third, after picking a good sampler, we will plot a sampling steps versus CFG XY plot using smaller intervals to find our optimal ranges for the sampling steps and CFG scale.

Now let's go through an example with our test model, the Magic Mix model. I'll start off by grabbing one of the test images that I have generated and dropping it into PNG Info, then sending it to text to image. This helps populate all the parameters that were associated with this image. We can see that the sampling method was Euler, sampling steps was 30, and CFG was the default 7. Then we can click the Generate button to regenerate the image. This image will be our reference image for the upcoming steps. We want to use this same seed to generate our first plot of sampling steps versus CFG. Note: if you generated your reference image using a random seed (-1), remember you can click on the green recycle button to reuse the last seed. The important thing here is that you need to use a fixed seed value to generate the XY plot, not a random seed.

To generate an XYZ plot, scroll down to the bottom of your web UI, click the drop-down list, and select XYZ plot. Under X type select Steps, and under Y type select CFG Scale. For the first XY plot, we are using large intervals for both the sampling steps and CFG. You can see the steps on this screen here. Then
on the left are different ways you can write out the range of values that you want to plot. For example, what I have here for the Y values tells the plot to go from 5 to 30 by increments, or steps, of five. It can also be written as a simple range or with the square bracket notation. These three expressions give you the same Y values. There are 10 values on the x-axis and six values on the y-axis, so we are going to get a 10 by 6 grid, or 60 total images, which took my computer about 15 minutes to complete. If you plot more numbers for each axis, of course it will take longer, but if you have a better graphics card with more VRAM, it will be faster for you.

Let's take a look at the completed plot. Again, sampling steps are on the horizontal x-axis and CFG values are on the vertical y-axis. We can see that we don't need to have a CFG scale of more than 10. There is too much noise with a CFG value higher than 10; even 10 is not so great with lower steps. For the sampling steps, we see that we can't go too low: 10 steps is not great. But higher steps are not ideal either, because they increase your image generation time and don't add more to your image quality. As the steps increase past about 50, you are not getting much more change in the character's appearance, pose, clothing, and background; it all looks basically the same. Between 20 and 50 sampling steps looks like a good range.

Okay, so we said we want sampling steps to be between 20 and 50 and CFG to be below 10, but we are not done yet. Now we need to use smaller, or finer, intervals to dial in both parameters. For the second plot, we are going to use sampling steps 20, 25, 30, 35, 40, 45, and 50, and CFG equal to 4, 5, 6, 7, 8, and 9. This is a seven by six grid, 42 images, which took about six minutes to generate, a lot faster since fewer images were created. Looking at this second plot, we see that we get more artifacts at lower steps and higher CFG scales. The difference is subtle, but if you zoom in here to CFG 8 or 9 with 20 steps, you can see slight
artifacts on the face and some shadowing. Here's another example that I have generated with a different seed value, which shows more of a difference at higher CFG values, to illustrate my point. We can see lots of artifacts above CFG 7. Also, at CFG 4 the images look a bit darker and less vibrant than at slightly higher CFG values. As for the steps, we can see that we don't really want to go below about 30. Even at the default value of CFG 7, if we only use 25 steps we can get many artifacts, most noticeable at the eyes and the mouth. But the image is pretty consistent when you go to higher steps, all the way to 50.

Now we have a well-defined region to work with that we know will give us good results: CFG between 5 and 7 and sampling steps between 30 and 50. Let's pick CFG 6 and sampling steps 40 and generate a batch of four images to see the result. Don't forget to turn off your XYZ plot in Script. This is where you want to switch to using random seeds.

[Music]

Okay, the batch is done. Let's take a look.

[Music]

These images look great. The facial features, hair, clothing, and background look great, and there are little to no artifacts, so this was a success.

But wait a minute, you might be wondering: I said in the beginning that the first step of this workflow is to generate an XYZ plot with all three parameters, including sampling methods, so why did we make only a steps versus CFG plot? That is because we took a little shortcut here. For the sake of understanding how to use the XYZ plot script, I wanted you to see the XY plot first, then add in the third dimension, which is the sampling methods on the z-axis. The creator of the Magic Mix model noted on Civitai that he recommends using the Euler sampler, so I kept things simple by just using the Euler sampler first.

But let's talk about how to select an appropriate sampler from the beginning. You should do this if you don't already know which sampler to use. There are currently 20 samplers available, but not all of them are going to
be suitable for your purpose, and generating a grid of 60 images for each of the 20 samplers is going to be very time-consuming. Not to mention, you will get an error telling you that plotting so many images in a grid exceeds the limit of what the XYZ plot script can do. So let's try to narrow down the list a bit. There is an article that provides a great summary of the different sampling methods and how they differ. I will link it in the description if you want to read more about the different sampling methods.

You might have noticed that some of the samplers have a small "a" in the name. This indicates that the sampler is an ancestral sampler, which does not converge. This means you might get different output images even if you do not change your input parameters at all. So if you're looking for reproducible output images, then the ancestral samplers may not be the right choice. I like my images to be reproducible when I use the same prompts and the same parameters; it makes it easier for me to fine-tune my prompts. So I will eliminate all the ancestral samplers. This reduces the sampler pool a bit. Even though the DPM fast, LMS Karras, and PLMS samplers are not ancestral, they do not converge well, so let's take them out of the running.

Out of the remaining 11 samplers, there are two categories: the first-order solvers (Euler, LMS, DDIM, DPM++ 2M, DPM++ 2M Karras, and UniPC) and the second-order solvers (Heun, DPM2, DPM2 Karras, DPM++ SDE, and DPM++ SDE Karras). The second-order solvers are more accurate but slower in comparison. So if you care more about image generation speed, go with the faster samplers, and if you don't care too much about speed, you can try a combination of both. The DPM adaptive sampler can be four times slower than the first-order solvers, so we are eliminating this one as well.

Another thing we can consider is the fact that there are samplers that use a standard, non-Karras noise schedule versus the Karras noise schedule. What's a noise schedule, you might ask?
The noise schedule controls the noise level being subtracted from the image at each sampling step. The noise is highest at the first step and gradually reduces with each sampling step. The Karras noise schedule has larger noise step sizes at lower sampling steps and smaller noise step sizes towards higher sampling steps, compared to the standard noise schedule. What this means for you and me is that samplers that use the Karras noise schedule in general improve the quality of the output image. So between Karras and non-Karras samplers with the same name, I would choose the Karras one, and this helps us further narrow down our choices to just eight samplers. This is a more manageable list.

You already know how to generate an XY plot of sampling steps versus CFG scale. Now we will add the samplers to the z-axis to create the proper XYZ plot. We want to use larger intervals first for both steps and CFG scale. If you recall, using CFG values greater than about 15 is not very helpful, since we already saw that it adds a lot of noise. So in your plot, you can save some time by plotting CFG values equal to 1, 3, 5, 7, 10, and 13.
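
As a side note on the noise schedule described above, the following Python sketch computes noise levels with the spacing formula from the Karras et al. (2022) paper and compares them with evenly spaced levels. The values `sigma_min=0.1`, `sigma_max=10.0`, and `rho=7.0` are illustrative assumptions, and the evenly spaced baseline is only a stand-in for comparison; it is not the web UI's actual default schedule.

```python
import math
from typing import List


def karras_sigmas(n: int, sigma_min: float, sigma_max: float, rho: float = 7.0) -> List[float]:
    """Noise levels from sigma_max down to sigma_min (requires n >= 2).

    Interpolates linearly in sigma ** (1 / rho) space, then raises back to rho.
    """
    max_inv = sigma_max ** (1.0 / rho)
    min_inv = sigma_min ** (1.0 / rho)
    return [(max_inv + i / (n - 1) * (min_inv - max_inv)) ** rho for i in range(n)]


def even_sigmas(n: int, sigma_min: float, sigma_max: float) -> List[float]:
    """Evenly spaced noise levels, used here only as a simple baseline."""
    return [sigma_max + i / (n - 1) * (sigma_min - sigma_max) for i in range(n)]


karras = karras_sigmas(10, sigma_min=0.1, sigma_max=10.0)
even = even_sigmas(10, sigma_min=0.1, sigma_max=10.0)

# Print how much noise level is removed between consecutive steps.
for i in range(len(karras) - 1):
    print(i, round(karras[i] - karras[i + 1], 3), round(even[i] - even[i + 1], 3))
```

In this toy comparison the Karras spacing removes large amounts of noise in the first few steps and progressively smaller amounts toward the end, while the baseline removes the same amount every step.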
I'm using CFG values between 5 and 30 for consistency and illustration purposes only. Fair warning: generating this large XYZ plot of all eight samplers took me about two hours, so plan ahead to avoid being bored two hours later.

Okay, when the plot is completed, this is what we see. Because of the added z-axis, we have eight 10 by 6 grids, one for each sampling method. And this is what you should expect to see in your output folder: other than the overall XYZ grid, you will get a JPEG file and a PNG file for each variable on the z-axis.

Here we are judging the samplers based on the quality of the images that each was able to generate. I converted these plots into a heat map of sorts by using the following criteria to put each image on the grid into either a red, yellow, or green category. If a step count and CFG value combination gives me any obvious artifacts or noise, then I mark it red. If the image has slight artifacts that are hard to notice unless you inspect it closely, then I mark it yellow. Slight artifacts include things like shadowing, slightly messed-up facial or other features, and, if the hands are showing, maybe some abnormal fingers. These types of things are easy to fix with some upscaling tricks or by using web UI extensions. We will talk more about these methods for fixing small issues in future videos. And if the image has no artifacts, then that would be green. You can find the final XYZ plots and my heat maps in the video description for your reference.

We can see that the samplers that are not suitable for this model are LMS, DDIM, and UniPC. DDIM and UniPC generated stylized images that look more like CGI or illustrations, and that is not what we are looking for, since we are aiming to create photorealistic images. These sampling methods might be good with another model or checkpoint. The LMS sampler provided some interesting results. It seems like this sampler can generate images well at low CFG values and high steps, so I tested it with low CFG and high steps only.
Here are the results. We can see that the results for CFG scale 2 to 4 were pretty good; only at about 140 to 150 steps and CFG equal to 5 were there some artifacts. So the moral of this story is that you can still get the LMS sampler to work. It might have a smaller green zone, or sweet spot, to work with, but the images created are not bad at all.

Finally, we can see that the good samplers for this model are Euler, Heun, DPM2 Karras, DPM++ 2M Karras, and DPM++ SDE Karras. These samplers give you plenty of green zones to work with on the lower end of both the steps and CFG scale, and the image quality was good, with plenty of detail. Any of these samplers would be a good choice for this model, but if you care about image generation speed, remember that the Heun, DPM2 Karras, and DPM++ SDE Karras samplers are slightly slower methods. So in this case I would pick either the Euler or the DPM++ 2M Karras sampler.

Then, using the two samplers we just picked, I would generate the second XYZ plot with the smaller, or finer, intervals. This is the same as what we already talked about earlier in the video, so I won't repeat it again. Here are the XY plots of the two final samplers. For the Euler sampler, we can see that the optimal zone is CFG 5 to 7 and sampling steps 30 to 50. For the DPM++ 2M Karras sampler, the optimal zone is CFG 5 to 6, because CFG 7 gave this weird artifact with the sleeves at certain step counts. The sampling steps are again between 30 and 50.
Lastly, to illustrate that the optimal zones are not the same between different models, I have generated another XYZ plot of steps versus CFG versus sampling method for the DreamShaper version 6.31 model. Here is the comparison between the first four sampling methods. We can see how different the green zones are for each sampler. Maybe it is because DreamShaper generates a more stylized image, but there seems to be a larger workable zone for each sampling method. I will leave a link in the description where you can download the XYZ plots for both models, so you can take a closer look at the actual images. Here are the other four sampling methods. Once again, there are very noticeable differences between the green zones when comparing the models. So the takeaway here is that, since the same parameter values may work in one model but not in another, it is a good idea to create one of these steps versus CFG versus sampling methods XYZ plots to figure out the optimal parameter ranges when you first start working with a new model.

In conclusion, the workflow we just went over includes just three steps. Step one: make the XYZ plot using all three parameters with large intervals. Step two: pick one or two sampling methods that work well for the model you are testing. Step three: make an XY plot with finer intervals to find the optimal working ranges for your sampling steps and CFG value. And don't forget the time-saving shortcut: if you already know which sampler works well from the model's documentation, then you can skip straight to step three.

Feel free to leave a comment down below if you want to share your experience working with different models and how you have found the best parameter values. I hope you enjoyed this video and found it helpful. I would appreciate it very much if you would show your support by clicking the like button, leaving a comment, and subscribing to this channel. It would help me a lot. Thank you, and I'll see you in the next episode.
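
For planning the plots in the workflow above, here is a small Python sketch that expands the three range notations mentioned in the video (an increment such as `5-30 (+5)`, a count in square brackets such as `5-30 [6]`, and a plain comma list) and counts how many images a grid will contain. The notation follows my understanding of what the web UI accepts; the parser itself, including the names `expand_axis` and `image_count`, is a hypothetical illustration and not the script's own code.

```python
import re
from math import prod
from typing import List

# Patterns for "start-end (+step)", "start-end [count]", and "start-end".
_INCREMENT = re.compile(r"^(\d+)\s*-\s*(\d+)\s*\(\+(\d+)\)$")
_COUNT = re.compile(r"^(\d+)\s*-\s*(\d+)\s*\[(\d+)\]$")
_RANGE = re.compile(r"^(\d+)\s*-\s*(\d+)$")


def expand_axis(text: str) -> List[int]:
    """Expand an integer axis expression into an explicit list of values."""
    values: List[int] = []
    for token in (t.strip() for t in text.split(",")):
        if m := _INCREMENT.match(token):
            start, end, step = map(int, m.groups())
            values.extend(range(start, end + 1, step))
        elif m := _COUNT.match(token):
            start, end, count = map(int, m.groups())
            # Evenly spaced values, truncated to integers (count must be >= 2).
            values.extend(int(start + i * (end - start) / (count - 1)) for i in range(count))
        elif m := _RANGE.match(token):
            start, end = map(int, m.groups())
            values.extend(range(start, end + 1))
        else:
            values.append(int(token))
    return values


def image_count(*axis_lengths: int) -> int:
    """Total images in a grid: the product of the number of values per axis."""
    return prod(axis_lengths)


cfg_coarse = expand_axis("5-30 (+5)")
print(cfg_coarse)
# The first plot had 10 step values (not listed in the captions) and 6 CFG values.
print(image_count(10, len(cfg_coarse)))
# Adding eight samplers on the z-axis multiplies the total by eight.
print(image_count(10, len(cfg_coarse), 8))
# The finer second plot: steps 20 to 50 by 5, CFG 4 to 9.
print(image_count(len(expand_axis("20-50 (+5)")), len(expand_axis("4-9"))))
```

Counting the images before pressing Generate gives a rough idea of how long a plot will take, since the total grows multiplicatively with every value added to an axis.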

Info

Channel: Keyboard Alchemist

Views: 9,050

Keywords: Stable Diffusion, Stable Diffusion Tutorials, Automatic1111, AI, AI Art, AI Tips and Tricks, AI Tutorials, A1111, XYZ Plot, Sampling Methods, CFG Scale

Id: reiZ4AXtjDs

Length: 21min 5sec (1265 seconds)

Published: Wed Jul 19 2023