Kasucast #21 - Improving Stable Diffusion images with FreeU (SDXL, LCM, Turbo) and ComfyUI API

Captions
Hello and welcome back. In this video I'm going to go over the FreeU node. First I'll briefly cover the FreeU paper, then outline the difference between the backbone (B) and skip connection (S) values. Once we understand how the values work, we can start performing a grid search, beginning with the B1 and B2 values and then progressing to the S1 and S2 values. I'll show my attempt at doing this with the Comfyroll custom nodes for ComfyUI, but also the bug that occurs when attempting to create the XY grid as a result of issues with the Comfyroll nodes. I then deploy another methodology utilizing the ComfyUI API, in which I show how to programmatically queue prompts for the images and then create a functional XY grid. Afterwards, I repeat the process for both the official LCM LoRA and an SDXL Turbo LoRA that I extracted myself. If you're interested in increasing the image quality of your generations or exploring the ComfyUI API, let's get started.

First off, what is FreeU? FreeU, or "free lunch in diffusion U-Net," is a method that, in the authors' words, "substantially improves the generation quality on the fly." It's a bold claim, but let's take a look at the paper. Right off the bat, figure 1 greets us with vanilla images and images generated with FreeU applied. From my observation, FreeU seems to increase the saturation of the image as well as sharpen up the details. It also appears to clear up deformations in the image; this one is the most evident example, where the teeth are set properly.

The first notable observation the authors make is in figure 2, which visualizes the denoising process: the generations proceed from left to right, going from very noisy to a refined image. The first row is the standard image generation we're used to. The second row is the low-frequency component being denoised; the authors point this out to show that the low frequencies handle the global, overall gestalt of the image. If you're familiar with sculpting, this would more or less equate to primary detail. The third row is the denoising of the high-frequency component, which handles the fine details; in the same sculpting analogy, this would be secondary detail, though it leans more into tertiary detail in my opinion.

Later on, in the methodology in section 2.2, the authors define the low-frequency path, denoted b, as the backbone, and the high-frequency path, denoted s, as the skip connections. Skip connections should be a familiar term, but let's refer to the Stable Diffusion paper for a quick review: the skip connection s, the gray arrow here, passes features from earlier encoder blocks directly to the decoder, hence the name.

Right after this section, the authors mention that in figure 5, changing the backbone value b enhances the quality of the image, while changing the skip connection value s doesn't impact the quality of the generated images. That's not to say that skip connections don't have their place in the overall aesthetics, though. Next, the authors make an important discovery in figure 6: the graph on the right shows that increasing the backbone scaling factor suppresses the high-frequency components in the images generated by the diffusion model. In case you've forgotten, the high-frequency components are the secondary and tertiary details of the image, so the S values of the skip connections control how much detail actually goes into those areas.
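To make those two roles concrete, here's a minimal sketch, assuming PyTorch, of the kind of scaling FreeU describes: multiply the backbone features by b, and scale the low-frequency band of the skip features by s in the Fourier domain. The channel coverage and the frequency threshold below are illustrative simplifications, not the paper's exact implementation.

```python
import torch

def freeu_scale(backbone: torch.Tensor, skip: torch.Tensor, b: float, s: float):
    """Toy version of FreeU's two operations on U-Net decoder features.

    backbone, skip: feature maps of shape (batch, channels, height, width).
    b amplifies the backbone (low-frequency, global structure) features;
    s rescales the low-frequency band of the skip (detail-carrying) features.
    """
    # Backbone scaling: the paper scales only a portion of the channels;
    # scaling everything here for simplicity.
    backbone = backbone * b

    # Skip scaling: FFT, scale the centered (low-frequency) region by s, inverse FFT.
    freq = torch.fft.fftshift(torch.fft.fft2(skip.float()), dim=(-2, -1))
    h, w = skip.shape[-2:]
    mask = torch.ones_like(freq.real)
    mask[..., h // 2 - h // 8: h // 2 + h // 8,
              w // 2 - w // 8: w // 2 + w // 8] = s  # illustrative threshold
    skip = torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real
    return backbone, skip.to(backbone.dtype)
```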
In short, the higher the B value, the less detailed the image will be. This makes sense if you recall that a higher B value means more focus on primary detail, which can in turn mean a flatter image. Conversely, when the S values are gradually increased (it might be difficult to see), you can make out more grainy detail in the images the higher you go. As such, our method of tuning FreeU becomes extremely simple: the B1 and B2 values control the overall image composition, while the S1 and S2 values control the finer details. Reframed this way, tuning FreeU becomes a simple grid-search problem: first vary the B1 and B2 values for the model in question until we find the B1 and B2 values that work for us, then keep those static and vary the S1 and S2 values. Simple.

Now that the crash-course theory is out of the way, let's get to the practical side of things. I'm going to show you two methods to tune your FreeU settings: the first uses the Comfyroll custom nodes, and the second employs the ComfyUI API. I ended up going with the ComfyUI API; the reason will become apparent after I show you the Comfyroll nodes and their deficiencies. People claim that the Comfyroll nodes can create image grids for whatever you want, but I have found that this doesn't work properly for me. If we go to the GitHub issues on the Comfyroll nodes page and search for "XY image from grid," you can see there's one closed issue. Click on that, and if you read through it, there is some documentation, or something similar to documentation, on Civitai, so let's open that up. In addition to the Civitai workflows for Comfyroll, there's also a YouTube video: if we go back to the GitHub issues and click on this one, created by a person named groer, he shows how to use the Comfyroll nodes for plotting CFG versus seed; he has a diagram here with CFG on the X-axis and seed on the Y-axis. The key difference, however, is that he uses the CR XY List node, whereas we want to use the CR XY Interpolate node. More on this soon.

So let's head back to the Civitai page and download the zip file with all the workflows. Once you download and unzip them, you can see that, coincidentally, there are two FreeU demos right here, so let's drag one into ComfyUI. There it is. There's something called a Mile High Styler in here which I'm not going to use, so I'll delete that. I'll also change the checkpoint to my own fine-tuned model. Aside from setting up these prompts as usual, that's all I changed.

Now let's take a look up here at the Comfyroll custom nodes. This is the CR XY Interpolate node I was speaking about earlier. We have this primitive over here that controls where the node starts from: it says to use index 1, and if you start at index 1 it should end at index 12, the reasoning being that you have three columns and four rows, 3 × 4 = 12, so your ending value should match that. After that, the only other thing we really have to look at is the output folder and the save node. The save image here goes into a folder called "freeU"; you have to create these folders yourself, "freeU" and also this "XY grids" folder, which I've already done for both. Please recall that I haven't made any changes aside from the prompts; this is straight out of the workflow.
To run this, since we have 12 images, we're going to need a batch count of 12: go to Extra Options, enable the checkbox, and type in 12. One more thing: we should have a stable seed somewhere, and yes, the seed is already fixed; this is important. Also note that when we run this, the counter up here will jump to 13, because it adds 1 to whatever the end index is. So let's run it and watch the counter go up to 13.

The generation has finished, and we can go over here and look at the grid. Weren't there supposed to be 4 × 3 = 12 images? I only see 11. Why is that? And how come the grid was created before the final image? That doesn't make sense to me. Perhaps it's a visual glitch, so let's consult the actual folder viewer. Go into the freeU folder: how many images do we have? 4, 4, 4 — 12, yes. Now go over to the XY grids folder, open it up, and... 11 images. Something is wrong in the workflow, so let's go back and investigate.

We're back in ComfyUI. The first thing we need to do before anything else is reset the counter back down to 1; otherwise we won't get a proper grid at all, because this counter keeps track of which images go where. Next, what if we change the start index to 0? Will that make sure every image is included in the final grid? There's only one way to find out. Batch count of 12 again, and generate. The generation has finished once again, and if we look at the grid, there are 12 images this time. But if we look more carefully, I think the 11th image is repeated down here, where this image should be. Is this image present anywhere in the grid? Let's check... I don't believe so. So if you start from an index of 0, you do end up with 12 images, but one of them is repeated. On top of that, you're missing the X-axis labels. What is going on, and how do we fix it? There's some kind of interference with this CR XY From Folder node.

If we consult the video put out by groer, you can see that he uses the XY List instead of XY Interpolate, so let's try that before writing off the Comfyroll nodes. There are a few things to do first. Move this XY Interpolate node up here; we can disconnect it if we want, just like that, and the annotation and trigger look okay. Now create the XY List: double-click, type "CR XY List," and there we go (we probably need some more room). Don't forget to move the start index back to 1; very important. In here we have to manually enter the values we want, since XY Interpolate is no longer doing this for us; you can look at the old node for reference: 0.9, 1.1, 1.3, and 0.8 to 1.1. I've finished typing in those values. Now we have to connect the index back to the primitive: right-click, Convert index to input, and drag this over. You can see it's back at 0; we have to start from 1. Next up we have the annotation, which we can connect directly, as well as the trigger. Then we just have to connect the XY values into the FreeU node down here. However, there's a bit of a gotcha, because we're working with text: this is all text, and we have to convert it into a number and then a float. Let me make some room over here. We need a Text to Number node; clone it, because we'll have two of these.
We also need a Number to Float node, and two of those as well, so the top row is X and the bottom is Y. Drag those over to B1 and B2, and I think that's it. Let's generate once again: remember, batch count 12, counter reset, going from 1 to 12. Good, looks fine to me. We can delete the Interpolate node since we don't need it, and if we want to see what's going on, just connect these to these. The generation has finished, and if we take a look at the grid, the same issue is happening: the last image isn't here; the grid is being generated before the final image can be added. Also, this time the labels aren't prepended with what they represent: the top is B1 and the Y-axis is B2, so you can prepend that on your annotation if you need the distinction, which you probably do. I don't know why this isn't working for me; in groer's video it's clearly working, and I'm fully updated on Comfyroll, using the latest version from GitHub.

So what do I do now, if the Comfyroll nodes aren't working for me? Let's take a step back and put on our thinking caps. In both cases, XY Interpolate and XY List, the images themselves were generated properly. Likewise, in both situations the grids were created with n − 1 images, where n is the total number of images. This leads me to believe there's some kind of issue with the XY Save Grid Image node, or possibly an asynchronous issue.

With that said, let's open up the second FreeU workflow from Civitai; you can see there's an "XY from folder" FreeU demo. Dragging it in replaces the current workflow. It looks pretty similar to what we had, but remember that here it should not be 1.2 to 1.5 but these values instead, and I guess we have to retype this B2. If you go back and look at the original node, the values are separated by semicolons, not commas. Give it another try, and there we go: finally, a working grid, with the axis labels showing. So if you want to use the Comfyroll nodes, you'll have to manually input your values here, and if you're using XY Interpolate, you'd have to do the same thing there as well.

Since I mentioned I'm not a big fan of manually entering lists or two-step workflows, I thought to myself, for my own personal sanity: is there any way around this issue? Luckily for me, if you head over to the ComfyUI repository, you can see this conspicuous "update how to get the prompt in API format in the example" commit. Hmm, what does the script_examples directory have inside it? Let's check it out. Inside, you have a basic_api_example.py file. Very interesting. The very first comments in it say: this is the ComfyUI API prompt format; if you want it for a specific workflow, you can enable dev mode options by clicking on the gear. Why don't we try that out?

I have a basic SDXL 1.0 workflow loaded here, without the refiner connected, and it's using FreeU up here in this node; FreeU comes right after the loading of the checkpoint. So how do we enable dev mode options? Click on the gear, this settings panel comes up, and there's a checkbox: "Enable Dev mode Options." Check it, escape out of there, and you'll see a "Save (API Format)" button appear in the bottom right corner of the menu. Click on that and find a place to save your JSON file.
For me, I've already done so; I usually like to keep them together, so it's in here, and I've saved it as this "custom save API" file. One thing to notice is that these API files are considerably smaller than the workflow files, even though they're both JSON. This is the file we just saved, and while it probably doesn't make much sense to you yet, let's get familiar with it. You have these numbers here, which I believe represent the IDs of the nodes: you can see 24 is this text multiline node, and looking around, the save image, an Image Save node, is node 63. So how do we make sense of all this information, and how does it compare with the non-API version? The non-API version is roughly the same, but a lot more verbose, with plenty of things that aren't really relevant, like this Reroute node, which I don't think I saw in the API file. Let's check: "reroute" — no, pattern not found. It looks like the API file only has the most relevant, bare-bones information for the workflow.

Now that we see that every node is indicated by an ID, let's go back to the example script. Once we make our way past the initial comments up here and scroll down to the bottom, the plot literally thickens. Starting from line 103 to the end, several things of note are happening. First, there's a method named queue_prompt that sends a POST request to your ComfyUI localhost address and port, with "/prompt" appended to it. Then you can see that the prompt simply functions as a dictionary, or object: you can change values by node ID. In the first example, on line 112, the ID is "6" and it's manipulating the text input. Go to the ID of 6 and you can see it's the CLIP text encode: inside the dictionary for the key "6," you go to "inputs," which is another dictionary, and then "text," which you can change to whatever you want. The base is "masterpiece best quality girl," and you can see it's being changed to "masterpiece best quality man" instead. In the same vein, the second example uses the ID of "3": we go there and change its inputs, specifically the seed. The default value is 8566257, and down here it's being overwritten with a simple integer of 5.

Now that we have a bit more context about the API JSON format, we can head back to the API JSON that I saved. Recall that this workflow has the FreeU node added; it's down here as node 51, and you can see there are b1, b2, s1, s2 inputs, and the class type is FreeU_V2, the newer version of the FreeU node. In my own script, all I have to do is programmatically change the B1, B2, S1, S2 values; once the images are generated, I can then create my own grids and search for the best values. Here is the final script that I'll be using. There isn't anything really special about it, as it's just a more advanced version of the example script from the ComfyUI repository. The first thing I do is load in the API JSON file, as opposed to defining the prompt text inline, here on line 39: workflow = json.load(f).
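To make that concrete, here's a minimal sketch of the pattern, modeled on script_examples/basic_api_example.py: load the API-format JSON, overwrite inputs by node ID, and POST the whole thing to the /prompt endpoint. The filename and the node ID "51" match my workflow; yours will differ.

```python
import json
from urllib import request

COMFY_URL = "http://127.0.0.1:8188"  # default ComfyUI address and port

def queue_prompt(prompt: dict) -> None:
    """POST the workflow to ComfyUI's /prompt endpoint, the same way
    script_examples/basic_api_example.py does."""
    data = json.dumps({"prompt": prompt}).encode("utf-8")
    request.urlopen(request.Request(f"{COMFY_URL}/prompt", data=data))

# Load the API-format JSON saved via "Save (API Format)".
with open("custom_save_api.json") as f:
    workflow = json.load(f)

# Every top-level key is a node ID. Node "51" is my FreeU_V2 node;
# check your own file for the right IDs.
workflow["51"]["inputs"]["b1"] = 1.3
workflow["51"]["inputs"]["b2"] = 1.4
workflow["51"]["inputs"]["s1"] = 1.1
workflow["51"]["inputs"]["s2"] = 0.2

queue_prompt(workflow)
```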
Then I create temporary variables that store the node IDs in the workflow corresponding to FreeU, the KSampler, and the save image — or rather, Image Save — node. Note that I'm not using the Save Image node but the Image Save node, so I should rename this variable. Using the Image Save node allows you to save to a specific directory, namely one named after the current datetime; this is how I set it up in Python. Let's take a look at the nodes once again: this is the API JSON file, and in node 63, the last one, there's an output path. Normally that's what gets used, but I've just overwritten it with my own.

To find the ideal values, recall that I mentioned this would be a two-stage grid search. The initial step is to find the ideal B1, B2 values; once we settle on those, we keep them and then vary the S1, S2 values to find the final optimal parameters. In the code, that's performed down here: when the dynamic values type I specify is "B," the static S1 and S2 values are loaded in alongside it, with the B1 and B2 ranges spanning a plethora of values, and vice versa when the dynamic value type is "S." For now, let's concern ourselves with defining the B1 and B2 ranges and keeping the S1 and S2 values static. However, we need some kind of baseline for those S1, S2 values; otherwise, no matter what B values we search over, everything will look atrocious. So how did I come to the values of 1.1 and 0.2? That's easy: let's go to the FreeU repository. Here we are on the FreeU repository page, and if you scroll down to the end, you can see under the SDXL section that there are sample values: the suggested values for B1, B2 are 1.3 and 1.4 respectively, and the suggested values for S1, S2 are 0.9 and 0.2. I did some testing on my own and found that I liked 1.1 and 0.2 for S1, S2. In addition, you can see there are ranges you can consider for exploring the parameters further, from 1.2 to 1.6 and anything below 1. I did a more extensive test, though, because I wanted to see exactly what kind of results I would get with different values.

Before we execute the code, let me briefly touch on the architecture of the grid-creation process. Since I had already been alerted to the pitfalls of the Comfyroll XY grid's asynchronous issue, I made sure to handle that in my own script, since you can calculate beforehand how many images you will generate. Here you can see run_image_creation_process, and the expected image count is the length of the X range times the length of the Y range, where those ranges hold the dynamic values depending on whether the type is B or S. Once the total amount of images, the expected image count, is calculated, I set up a polling system that only creates the image grid once it has found that total number of images; that's this wait_for_directory_creation function right here. I set the refresh rate to 5 seconds via time.sleep(5) down here, so the image search runs every 5 seconds.
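Here's a rough sketch of that architecture; the helper names and directory handling are my own shorthand for what the script does, not its verbatim code.

```python
import os
import time
import numpy as np

def value_range(start: float, stop: float, step: float) -> list:
    # Inclusive range, rounded to avoid float drift (e.g. 0.4, 0.5, ..., 1.7).
    return [round(v, 2) for v in np.arange(start, stop + step / 2, step)]

# Stage 1: vary B1/B2, hold S1/S2 static. Stage 2 swaps the roles.
dynamic_values_type = "B"
if dynamic_values_type == "B":
    x_range = value_range(0.4, 1.7, 0.1)
    y_range = value_range(0.4, 1.7, 0.1)
    s1_static, s2_static = 1.1, 0.2
else:
    x_range = value_range(0.0, 1.6, 0.1)
    y_range = value_range(0.0, 1.6, 0.1)
    b1_static, b2_static = 1.5, 1.3

expected_image_count = len(x_range) * len(y_range)  # 14 * 14 = 196 for stage 1

def wait_for_images(directory: str, expected: int,
                    timeout: float = 3600.0, poll: float = 5.0) -> bool:
    """Poll the output directory every `poll` seconds until `expected`
    images exist (or `timeout` elapses); only then build the grid."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.isdir(directory):
            images = [f for f in os.listdir(directory) if f.endswith(".png")]
            if len(images) >= expected:
                return True
        time.sleep(poll)
    return False
```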
If you make the overall polling wait time long enough — that's the image generation timeout up here — this also eliminates the issue of having to deal with asynchronous requests around the queue-prompt POST itself.

There are two more things I want to mention. If you're running the script from inside Linux on Windows Subsystem for Linux 2 (WSL2), you'll have to make these changes. The first is in the environment variables over here: you want the IP address of Windows, because inside Linux you have localhost, but your ComfyUI on Windows isn't reachable at the Linux localhost. For me it is 192.168.64.1. To get this value you need to run a command in Linux; I'm in my Ubuntu workspace right now, and the command is grep nameserver /etc/resolv.conf. Once you enter it, you can see 192.168.64.1, and that is my IP address. Second, if you're using ComfyUI pulled from the repository, as opposed to the portable executable, you have to change your run_comfy.bat file; I'm not sure if this file is default or one I made myself, but it's a batch file, and inside it I'm running commands that activate the virtual environment and then run python main.py with --listen. This is important if you want ComfyUI to be reachable everywhere on your LAN, so remember: --listen is key if you want to call the API from inside your Linux distro.
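If you'd rather not hard-code the Windows IP, here's one way to resolve it programmatically; a sketch assuming the tuning script runs inside WSL2 and that ComfyUI was started with --listen.

```python
import re

def comfy_host() -> str:
    """Inside WSL2, 'localhost' is the Linux VM, not Windows. The Windows
    host IP is the nameserver entry in /etc/resolv.conf (the same value
    `grep nameserver /etc/resolv.conf` prints)."""
    try:
        with open("/etc/resolv.conf") as f:
            match = re.search(r"nameserver\s+(\S+)", f.read())
            if match:
                return match.group(1)
    except OSError:
        pass
    return "127.0.0.1"  # fall back to localhost outside WSL2

COMFY_URL = f"http://{comfy_host()}:8188"  # e.g. http://192.168.64.1:8188 for me
```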
Before executing the code, let's review what should happen. The dynamic values type is "B," and in that case the S1 static and S2 static values are loaded in: 1.1 and 0.2. For the B1 and B2 ranges, I have both going from 0.4 to 1.7; if you calculate that, it's 14 values including both endpoints, since it increments by 0.1, so we should get a grid with 196 images on it.

Here are my two terminals: on top we have Ubuntu Linux, and down here we have Windows, which is where my ComfyUI lives. First, let's start up the ComfyUI server; for me it's this command. Okay, it's looking good, so now let's go back to Ubuntu. In Ubuntu, let me activate my shell script, which activates the virtual environment, and now I can run my tuning script. Before I execute it, take notice down here, because the server will receive an influx of queued prompts. Here we go: you can see all these "got prompt" messages, and the script is waiting for the first directory to be created. On the first image generation, a directory gets created; look carefully, because it will tell you about the directory — see, the path doesn't exist, so it's created automatically, and then up here you can see that the directory has been created and found. Now it's just waiting for all 196 images to be generated, so let's wait for that to happen.

Let's analyze the results. I'm in my image viewer right now, and I believe this is the folder with the output. If we click on any image, it says 197 objects down here, which makes sense: we should have 196 images plus one image grid. Scroll all the way down, and yes, we have an image grid. Click on it and we get this very high-resolution grid, which I'm going to take into Photoshop for some more analysis.

Here is the plot in Photoshop; I've added some annotations that I'll show you eventually. When the B1 value is low, 0.4 to maybe 0.8, the images look pretty bad; they seem very sketchy. So why don't we start with B1 at 0.4 and continue down the Y-axis, increasing the B2 value. Keep going down: when B2 is around 1.3, we start to get some form defined; this is much better than this. And if we go all the way down to the bottom, at 1.7, you can see this is very reminiscent of a cooked, or overbaked, LoRA. I would say this is still not very great, but it looks better than what we had up here at low B2 and B1. While we're down here, I'd like to point out a trend: when B2 is extremely high, at 1.7, increasing the B1 value doesn't have much of a quality impact. Across this entire stretch, nothing is really changing — same, same, just very minute improvements. Only around 1.1 do we start to get a big jump in quality. B1 at 1.1, 1.2, and 1.3 are good areas to look at for that jump; already from B1 1.2 to 1.3 you can see the hands becoming defined and the face not being as bad.

So let's go back up to where I was speaking about 1.1, 1.2, and 1.3. For a dramatic increase in image quality, 1.3 is my pick, so let's keep B1 at 1.3 and keep going downwards, increasing the B2 value. Keep going and keep going: once we hit B2 at 1.3 is where I think it's a good place to stop. If I go any further, there are more artifacts, so let me highlight them: there's an artifact here on the eye and the entire face, and lastly, at 1.7 it just seems very overexposed. It's not too bad, though; nothing that inpainting can't fix. There's one more thing I want to point out: increasing the B2 value seems to saturate the image, and increasing the B1 value also saturates it. Going back to the image I stopped at, B1 at 1.3 and B2 at 1.3: if we progress to the right, the image keeps getting better; go one more, and I would say this is very, very good; one more, and the image composition has changed, and I'm not sure how much I like it; one more, and there's too much loss of features for me. So I'm thinking B1 at 1.5 and B2 at 1.3 is a pretty good pair of static B values to use.

There's one more thing I want to handle, though: narrowing these values even further. Right now we're using an increment of 0.1, but I want to lower it to 0.05 to make sure we're honing in as accurately and precisely as possible. Now that we have a general range to search across, the second pass will search B1 from 1.3 to 1.7 and B2 from 1.1 to 1.7 in a narrow window. Make sure to comment out the previous B1/B2 range, which was 0.4 to 1.7 at an increment of 0.1, and enable the narrower one at an increment of 0.05; you can split as many hairs as you want here, but going from 0.1 to 0.05 is good enough for me. So let's run it, with the finer ranges sketched below.
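The range swap itself is trivial; here's a self-contained sketch with numpy, with endpoints from my run (tune them to your own ballpark):

```python
import numpy as np

# First pass (commented out): coarse sweep, increment 0.1.
# b1_range = [round(v, 2) for v in np.arange(0.4, 1.7 + 0.05, 0.1)]
# b2_range = [round(v, 2) for v in np.arange(0.4, 1.7 + 0.05, 0.1)]

# Second pass: narrow window around B1 ~1.5 / B2 ~1.3, increment 0.05
# (I evidently let it run up to 1.75 — more data points are welcome).
b1_range = [round(v, 2) for v in np.arange(1.3, 1.75 + 0.025, 0.05)]
b2_range = [round(v, 2) for v in np.arange(1.1, 1.75 + 0.025, 0.05)]
```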
Back in the image viewer, this is the original set of 196 values, and this is the second one. You can see there are fewer images, but there's also a grid here, so it's still working as expected. Let's take this grid into Photoshop as well and have a look. Here we are in the narrower window; apologies in advance, I think I actually went up to 1.75 for B1 and the same for B2, but nevertheless, more data points are always welcome. Remember that I ballparked around 1.5 for B1 and 1.3 for B2; the narrower window is to explore the general area around those values and see if there are better selections.

Before we do that, however, I want to make some more observations. If you zoom out and squint, it becomes apparent that you can separate the images into three categories: anything on the left is desaturated, anything in the middle is saturated, and everything on the right is a new composition. For the new composition, I want to point out that the images are a lot cleaner, if that makes sense: compared to any of the images before, they're not as chaotic or noisy, or what I like to call painterly or sketchy. Let's look again: this is the desaturated, then saturated here, and then one more block over, at the very end, you can see it's very clean. Depending on what you want, that might actually be the area you want to look in; for me, though, I want to return to what I had earlier, this square, and look around. Now that I've delineated the three categories and pointed out the center of my ballpark, we can see what exactly we want. The majority of the images fall into the category of saturated and cleaner than before, but not as clean as the third category, with only one of each from the other categories. Let me look around to see if there are any more improvements; remember that we're on a lower increment now, 0.05 instead of 0.1. I still really think 1.5 for B1 and 1.3 for B2 is what I'll end up using, but depending on what you like, you can choose something a bit higher or lower. This one doesn't look that great; it's okay. Maybe this one is also fine; for me, maybe not; not this one either. The second check was more of a precision check, and I'm happy with the output; depending on your aesthetic eye and use case, you can choose something in the other categories.

There are just one or two things I want to address before going on to the dynamic S values. You may ask: why did you choose preset static S1 and S2 values that aren't the same, and aren't zero or one? Great question, and I don't want to leave any stone unturned. Here in Photoshop you can see the plot for the dynamic B range with static S1 and S2 at zero, as noted in the title. At first glance, everything is extremely blown out and overexposed; it's basically impossible to do any kind of analysis here. Let's maneuver around for a better idea... there just isn't much to work with, so static S values of zero are unfortunately a dead end. Now what about static S values of one? Perhaps zero was too extreme a value. Once again the same seed, and this time we definitely have a lot more to work with, but we're confronted with a separate problem: all of these images seem extremely similar visually, and there isn't much clarity distinguishing the two groups of images from one another. On this left-hand side, you can see that the design really falls into one kind of aesthetic — very similar, pretty much interchangeable in my opinion — and then on the right side you have something we're a little bit more familiar with. It's really difficult to see the differences between the images, even on a closer look: how can you tell? How can you tell, indeed. As such, I think there's an argument against having the S1 and S2 values be too low or equal to each other when performing the dynamic B value search. This is just my opinion, but it does feel like a pretty solid conclusion.
Now that we've dialed in our values for B1 and B2 at 1.5 and 1.3, let's go ahead and explore the S values. This time I've set the S1 and S2 values to range from 0 to 1.6 with an increment of 0.1; also remember to change the dynamic values type to "S" instead of "B," like so. Before we get into the analysis, recall that the FreeU paper denotes by s the skip connections, which are in charge of passing on details: if the S values are low, the image will be less detailed.

With that said, let's look at the most extreme example. The B1, B2 values are static — here you can see 1.5 and 1.3 — and S1 is at zero, this value here in the first column. When we zoom out, you can see that no matter how much we increase S2 from zero, going in this direction, the image largely remains flat: flat, flat. Only around S2 of 1 do you start to see some definition in the image, albeit still very faint. Now how about the other direction? Let's go up here, keep S2 at zero, and this time increment S1; I've outlined it here, so this time we're going in this direction. See how many adornments are gradually added to the image, going from not as detailed over here to very detailed; when S1 is 1.5 and S2 is at zero, you still get pretty good details. Once again, much like the B1 value, the S1 value is the dominant factor here. In addition, the images can once again be roughly grouped into different categories: beyond a certain S1 value, in this case 1.2, the composition changes slightly; whereas the character was facing to the left, the character on the right now faces to the right. Let's increase the S2 values now, heading downwards on the plot. What I see is that as S2 increases, the image becomes more concretely defined; there's more of a blockout, or definition. At the end here there's a little bit of artifacting going on, but that's the trade-off you're making for more clarity. If you're familiar with Copic drawings, drawings made with Copic markers, this really resembles that kind of aesthetic, whereas low S2 values remind me of watercolors.

All in all, varying the S1, S2 values and seeing the corresponding results was extremely illuminating to me. From the perspective of a creative, I can definitely see a use case advocating straying from the accepted S1, S2 values in favor of something a lot more stylized, whether desaturated like this on the right or extremely flat on the left; in particular, these drawings on the left remind me of some kind of Elden Ring aesthetic. For the time being, however, I think going with my original values of 1.1 and 0.2 is prudent.

Let's do a final comparison now between the original image and the image using the tuned FreeU settings. In my code, I've saved the seed that I've been using for all of these tests to a text file: here I have the code with random_seed, and then it's saved into this random_seed text file. If I plug this seed value into ComfyUI, it should give me the original image without any FreeU tuning applied. Let's do that. This is the original workflow file; before we use this to get the original seed, we have to do two things. First, delete this FreeU node, because we don't need it. Second, we have to change the seed; however, if you recall from the API, we changed the seed on the KSampler, not in this random-noise node right here, so we have to create a Primitive node, disconnect the noise, connect the Primitive to the noise input, make sure it's on fixed (yes, it is), paste in our seed value, and generate. We get this image down here, and I don't believe it's reminiscent of the image we've been working with. In addition, when I press Queue Prompt, you can see this value changing — why is it changing? Let's go again... and it's not changing this time, which is strange to me. I cannot be sure this is the actual original image.

I went back and looked at the API JSON files, and I found my mistake. The issue is that in node 41, the KSamplerAdvanced, I'm changing inputs["seed"], except "seed" doesn't exist here; I'm supposed to change "noise_seed." Back in the code, the mistake is right here, where the KSampler's inputs "seed" is set to the random seed. This should be "noise_seed" instead, because workflow node 41 doesn't have a "seed" input, it has a "noise_seed."
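In code, the fix is a one-line rename, continuing from the script above. Node "41" is the KSamplerAdvanced in my API JSON; the input names are the ones ComfyUI exposes (plain KSampler takes "seed," KSamplerAdvanced takes "noise_seed"):

```python
# Wrong: KSamplerAdvanced has no "seed" input, so this value never takes
# effect and the workflow's own stored seed is used instead.
# workflow["41"]["inputs"]["seed"] = random_seed

# Right:
workflow["41"]["inputs"]["noise_seed"] = random_seed
```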
With that said, it means the random seed being saved here wasn't actually being used, so if I go back to the workflow and just don't change anything, I should be generating the base image. Here's the workflow once again, no FreeU and no Primitive node (I should have noticed this input is called noise_seed, not seed), everything the same, fixed. Let's go: this more or less looks like the image we've been working with. Let's do one more sanity check, though: add a FreeU node again (and notice this seed didn't change at all), set it to 1, 1, 1.1, and 0.2, and generate. Okay, it's done generating, and it's looking pretty familiar, I would say. Let's check again: this is with 1, 1, 1.1, and 0.2; here are all the generations from the very first trial; open that one up, and I do believe this is the exact same image. Let's check once more — this is the image viewer, this is ComfyUI, image viewer, ComfyUI. So always check your work, and be careful when copying the reference code: noise_seed, not plain seed. I'll make this adjustment in the code when I upload it to GitHub, but for now I'm just going to continue using this seed.

Here is a tentative image comparison that I've put together of the results so far. In the top left corner we have the original image on the fixed seed; then I applied Kohya Deep Shrink to it to see the effect, and it's not bad. In the bottom left corner we have FreeU applied to the original image, which is something we're familiar with; then I subsequently applied Kohya Deep Shrink on top of that as well to see. It seems that if we use FreeU together with Kohya Deep Shrink, we might have to re-tune our FreeU values. Personally, I haven't been using Kohya Deep Shrink in my own work; the reason is that in the past we had hires fix, and it always gave me very poor results. In this case it's maybe an incremental improvement, but I prefer the effect of FreeU over Kohya Deep Shrink; of course, you're free to use them together. Just for reference, let me show the node graph for this: it's the normal workflow, I just added in the FreeU node, and after that I added in the Kohya Deep Shrink; to add that node, double-click and type "PatchModelAddDownscale." Very simple.

We could call it a day here; however, I think it would be a good idea to also tune this for LCM and maybe SDXL Turbo, so let me do that very quickly. The workflow for LCM is loaded in here. To do it, make sure you have the LoRA after your model, make sure it's set to LCM, and then connect it to this ModelSamplingDiscrete node, set to lcm instead of eps.
Once you have that, connect your FreeU here; you can also add Deep Shrink right after this, but I'm going to go vanilla first. This is the grid for the dynamic B1/B2 search for LCM; you can see it's using the same static S values of 1.1 and 0.2 that I used in the very first trial for the normal version. The same issues persist: when B1 or B2 are low, the results are generally quite poor. It was actually pretty difficult for me to find a good value here. Just keep going, and it gets better around here — but this area is already near the end, and those values are at 1.9, which is pretty high. So I think LCM requires your B values to be very high, at least for B1. In the end, I'm not going to go with such a high B1 value and will instead stick to something here: this is B1 at 1.6, and I believe B2 at 1.1. I can double check, but I do like this result quite a lot. Yep, B2 of 1.1.

I'll use those B1, B2 values to calculate the ideal S1, S2 values. Once again, we get a very stylized, flat look when S1 is low and S2 is also low; I feel like this would really fit an animation style or a minimalist aesthetic. Let's just drink it in. Very interesting: if you know the video game Bravely Default, this really reminds me of that art style, even though I don't think any of that data is in the training data for this model — you can really see the effect of FreeU. Anyway, I looked pretty carefully and found S values that would work for me, highlighted in red: here we have S1 at 1.1 with S2 at 1.3, and the other one is S1 at 1.3 with S2 at 1.1. Something surprising that jumped out at me was that these values, while not exactly the same visually, are reflected across the diagonal: if we zoom out, we have a perfect square, and if you imagine the intersected boxes as the ones of an identity matrix and everything else as zeros, these two values are mirrored across it. That means that in the case of the bottom-left value, the S2 value is greater than the S1 value, which is a first. I'm not sure if LCM behaves differently, so it might be important to keep this in mind. For the time being, I'm going to settle on the upper-right image and its corresponding values, but I'll keep in mind that S2 being greater than S1 is also valid. If we go any further to the right, it becomes very, very harsh, and if we go to the bottom corner, away from the diagonal, it really feels like an oil painting; if that's a look you want to go for, go ahead, but it's not really for me. Since this is LCM, and I'll probably be working with it in real-time painting or something like that, I want it to be very loose and sketchy, and these are both results I can live with. Here is an updated comparison with the finalized tuning values: we have LCM on the original seed, and on the right, FreeU applied to LCM, so you can see the newest addition on the bottom-most row.

I want to do one more comparison, using the SDXL Turbo model. However, SDXL Turbo in its current form exists as a checkpoint, not a LoRA, so to do the comparison we'll have to extract a LoRA from SDXL Turbo. To extract the LoRA, I'm going to use the Kohya SS GUI by bmaltais. From the main page, go to Utilities, then on the very right there's LoRA, and there's a section showing all the types of LoRA you can extract; I'm going to use Extract LoRA. I'll make some changes here. First of all, I want a higher dimension. Clamp quantile and minimum difference are recent additions, so you can just leave them as is. For precision, you can keep it as float or fp16; let's go with fp16 for now. Next, we have to enter three different paths. Since SDXL Turbo is fine-tuned from SDXL 1.0 base, that's what goes in this first field — you can see it show up here. Next we need the fine-tuned model, so assuming you've downloaded SDXL Turbo already, this is where it goes. Done, very easy, nothing fancy. Lastly, we want to save our LoRA, and that's done here; just check the SDXL box and we're good to go. I'm going to run this and show you what's happening in PowerShell: I click the button, and at once you can see it sending this script. After around two minutes, it's done extracting, and we can test out our LoRA.
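For reference, the GUI is driving kohya-ss's extract_lora_from_models.py script under the hood. Here's a command-line sketch of the same extraction; the paths are placeholders and the flag names are my assumption from recent sd-scripts versions, so verify them against --help before running.

```python
import subprocess

# Hypothetical paths; the flags mirror what the Kohya SS GUI fills in.
subprocess.run([
    "python", "networks/extract_lora_from_models.py",
    "--sdxl",                                             # the SDXL checkbox
    "--save_precision", "fp16",
    "--model_org", "sd_xl_base_1.0.safetensors",          # base model Turbo was tuned from
    "--model_tuned", "sd_xl_turbo_1.0_fp16.safetensors",  # the fine-tuned model
    "--save_to", "sdxl_turbo_lora_256.safetensors",       # output LoRA path
    "--dim", "256",                                       # higher dimension, as in the video
], check=True)
```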
Back in ComfyUI, to use it, just have your checkpoint of choice here and load your LoRA in; for me it's this 256-dimension Turbo fp16 one. Then it's the same as usual, clip to clip, and I have FreeU disconnected for now, the same seed I've been using the entire time, four steps, and Euler Ancestral for the sampler. Since this is a LoRA, you can choose any scheduler you want; if you're using the SDXL Turbo base model, I think there's a special scheduler for that one. So let's try it out. Okay, it's done generating; this is without FreeU applied, so we'll have to do some tuning for this. However, just to show you the real-time capabilities, I'm going to change the seed to randomize for now and just rapid-fire generate. Yeah, see how fast it's going — it's even faster if you don't use the Image Save node. Let's do one more. Pretty cool. It is faster, but anatomically I don't think any of these are working out well; the torso seems way too big. It definitely seems a lot less coherent than LCM, although the image quality is superior in my opinion.

Here is the dynamic B value grid once again, this time for the SDXL Turbo LoRA; nothing has changed, still using the original S1, S2 values of 1.1 and 0.2. I found two initial results that worked for me, marked by these two red squares: this one, and the other one down here. The one I ended up going with is this one, B1 at 1.8 and B2 at 1.3. I did like the other one, B1 at 1.6 and B2 at 1.9, so to be sure, I went back into ComfyUI and did some A/B generations, and those values performed poorly on average; 1.8 and 1.3 it is. If you're curious how it looked at the lower B1 and B2 values: basically very watercolory, and in my opinion not very good. Let's just stay with the tradition of having the 1-value be greater than the 2-value with this one. And if you're curious what some of the generations looked like with varying B1 and B2 values, here are some of them. You can see that SDXL Turbo seems to fall into the trap of the earlier SD 1.5 versions, where depending on the image size you get repeating people, more than one person, extra heads, extra arms, that kind of stuff. I felt that by using B1 of 1.8 and B2 of 1.3, I avoided some of these issues, though they were still prevalent in maybe 20% of the images.

This time we're varying the S1, S2 values, as always, while keeping the B values at the 1.8 and 1.3 I settled on before. On the left-hand side, with the low S1, S2 values, it's pretty much a disaster: the anatomy is all messed up, and there are duplicated heads, a very classic Stable Diffusion 1.5 error. However, if we keep maneuvering to the right, which is where the higher S1, S2 values are,
I did find some that were very good for me, and this one is with S1 at 1.6 and S2 at 1.4. You're also free to go with this bottom one, but I think there's a little bit of a subtle difference — not too much — and I just prefer the look of this one.

So there you have it; let's finish up with the recap. Last we left off, I had added the LCM and FreeU-LCM results here, but now we have a new addition with the SDXL Turbo LoRA, and you can see there is quite a substantial increase in quality; it's basically night and day, whereas it might not have been so apparent with LCM. Then we have FreeU on the original, which is a marked improvement in my opinion. You also have the option of using Kohya Deep Shrink, though it seems that using Kohya Deep Shrink and FreeU together is probably not a good idea. There are times when you can use FreeU, and it does seem like it will almost always be a good idea to use it; however, your mileage may vary, and you may choose to go with another option. So there you have it: in-depth experiments across different samplers and models. Before I finish up, let's not forget to thank the channel patrons. Thanks for watching, and bye.
Info
Channel: kasukanra
Views: 4,293
Keywords: ai art, stable diffusion, digital art, machine learning, midjourney, photoshop, character art, concept art, character design, textual inversion, automatic1111, webui, how to make ai art, kasukanra, workflow, dreambooth, style, dreambooth style, training dreambooth for style, anime art, artstyle, controlnet, LoRA, digital sculpting, training LoRA for style, convolutional layer, vtuber, SDXL, SDXL1.0, style training, freeU, sdxl turbo, LCM
Id: WwsJ_QIgsG8
Length: 58min 11sec (3491 seconds)
Published: Fri Dec 08 2023