Stable Cascade ComfyUI Workflow For Text To Image (Tutorial Guide)

Captions

We are going to talk about Stable Cascade in ComfyUI and how we can run it there. Before that, let's review the Stable Cascade models again, in case some of you haven't seen this before. Stable Cascade uses several model stages to generate an image, so there is a different checkpoint file structure for us to download and use in ComfyUI. Previously we tried it in Automatic1111, and from my review it is not good there. I created this workflow in ComfyUI, and it actually works better than Automatic1111: there is more flexibility and we have more control over the settings.

Let's check out Hugging Face. Here we have the files and versions; click on this tab and hop over to the file list underneath. As you can see, there's Stage A, Stage B and Stage C, all these 7 GB, 2 GB and 3 GB files, and actually you don't have to download these files for ComfyUI. One day before I recorded this video, there was a new update of these models that is optimized for the ComfyUI nodes we use. So the only thing we need to download is these two files, Stage B and Stage C, at 4 GB and 9 GB. You don't have to consider how much VRAM you have and then decide which model variant to download, like with the previous versions of these models.

Nowadays we just go to the Stable Cascade repository and open the comfyui_checkpoints folder. It has the two models, Stage B and Stage C, and we can download them here. Where do we put them? In the ComfyUI models folder: hop over to the models folder and then go to the checkpoints folder, as usual for any checkpoint model we download. For organization, I like to create a subfolder here for Stable Cascade, and I just saved the file in that subfolder. There you go. Do the same thing for the Stage C checkpoint model; we have to download this one as well, and that's
all we need for ComfyUI. This is the latest update of these checkpoint models as of my recording date, February 20.

Here we have the very basic text-to-image workflow, where I connected all these nodes and the checkpoints. I also created a document that gathers information from around the internet on which ratios and image sizes are suitable for Stable Cascade, and I explain all the stages. For example, here I have the Stage C compression, with the default values and the optimized values. I captured screenshots of each node whose values we have to consider and remember when we use Stable Cascade in ComfyUI.

First of all, these two custom nodes are different from the Stable Diffusion nodes. All you have to do is search for "stable cascade": just type it in here and you can find the empty latent image node for Stable Cascade and also the model sampling node for Stable Cascade. Here is the compression value that we can set. This checkpoint loader is used for the Stage C checkpoint model. I have saved mine in the stable cascade subfolder, so my path is going to be different from yours; you have to manually select yours once, and then you can run this. The same applies to the checkpoint file name for Stage B.

For Stage B, I wrote notes in the document as well. The process takes the low-resolution latent from Stage C as the conditioning input for the Stage B model, which allows the Stage B model to enhance the image. Right here on Hugging Face, the diagram of Stable Cascade explains this very clearly, and I have summarized the concept in this document. Basically, we need to remember that this process is different from what we are used to in Stable Diffusion, and each of the
stages has its own individual KSampler. The KSampler, the Stable Cascade Stage B conditioning node and the conditioning zero-out node in this group are all used for Stage B. Once you select the Stage B checkpoint model here, you are able to run Stable Cascade without any problem.

Here are the overall connections. Remember to check that the Stage B conditioning's latent input is connected to Stage C, and that the conditioning zero-out is connected here; just follow this direction and you are good to go. The Stage C latent input is connected to the output of the KSampler in the Stage C group, so once the Stage C KSampler finishes processing, its result is passed along. The empty latent image is also passed to the latent image input of the KSampler. That is what we have to be aware of when creating a workflow for Stable Cascade.

Lastly, Stage A is simply VAE decoding. That is all Stage A is now, after the update of the ComfyUI checkpoints. About 7 days ago we had an individual Stage A checkpoint model, which was 73 megabytes, but we don't need that now that the Stable Cascade checkpoint models have been updated for ComfyUI.

Finally, of course, we have the image output. For testing, I will create a preview image node here, just for quick tests without saving too much junk on my system. So I will bypass this save image node, leave it on the side, and use the preview image. Let's click once and see what happens, in case there's any error. Before we do anything, we have to write the text prompt; I forgot about this one. Let's do a beautiful landscape of a snow mountain with cloud on the top of the mountain, and make the weather a sunny day. Basically, that is a very simple text prompt, nothing technically fancy. Let's try this. Oh, there's an error, something about no attribute
tokenize, so yes, we have to update ComfyUI, the whole UI, before we run this latest version of Stable Cascade. Remember to update: go to ComfyUI Manager, click Update All, and we are good to go. Let's try it again.

Here you go, the snow mountain image. It's kind of small, so let's make it bigger so we can see it clearly. Yes, that is clearer. The snow mountain is pretty realistic for something with nothing optimized, no enhancement and no detailers. Generate again, and we have another snow mountain landscape view. It is 1024 x 1024 pixels, which is the standard size for Stable Cascade. Later on we will test different dimensions, because in Stable Cascade for ComfyUI we can use other dimensions.

Let's go to the seed number and configure something. I want to raise the sampling steps in Stage C, and then let's go to Stage B and see if there's anything we can configure and test. The seed number is okay, so let's run once. The command prompt output in ComfyUI is cleaner than in Automatic1111: we just get two sampling stages loading in the command prompt, that's all. In the previous videos, when we tested with Automatic1111, we got four loading stages, which is kind of too much.

Here we have another snow mountain. Let me bring my command prompt into the same window so we have a clear view of what has been executed and what data is passed into the system for processing. Let me relocate that and make it smaller so we can see the image, and let's put the three-stage layout on the side here so we can see everything on one screen. There you go, this is better. My workflow is really clear: it shows the three stages of the Stable Cascade model, and I grouped each part of the process together. Let's try the John Wick image again, like in the previous videos where we tested in Automatic1111.
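As a rough illustration of how the width, height and Stage C compression values discussed above relate to the latent sizes, here is a minimal Python sketch. The function name `cascade_latent_shapes` is hypothetical, and the default compression of 42, the 16 and 4 channel counts, and the fixed divide-by-4 for Stage B are assumptions based on how the ComfyUI empty latent node for Stable Cascade appears to behave; the video itself does not state these numbers.

```python
def cascade_latent_shapes(width, height, compression=42, batch_size=1):
    """Return (stage_c_shape, stage_b_shape) as (batch, channels, h, w) tuples.

    Assumptions (not stated in the video): Stage C uses 16 latent channels
    and divides each side by `compression`; Stage B uses 4 latent channels
    and divides each side by 4.
    """
    if compression <= 0:
        raise ValueError("compression must be a positive integer")
    # Stage C works on a heavily compressed latent.
    stage_c = (batch_size, 16, height // compression, width // compression)
    # Stage B works on a much larger latent that Stage A then decodes.
    stage_b = (batch_size, 4, height // 4, width // 4)
    return stage_c, stage_b


if __name__ == "__main__":
    # Sizes mentioned in the video: the standard square and the thumbnail size.
    for w, h in [(1024, 1024), (1200, 700)]:
        c_shape, b_shape = cascade_latent_shapes(w, h)
        print(f"{w}x{h}: stage C latent {c_shape}, stage B latent {b_shape}")
```

Under these assumptions, a higher compression value gives Stage C a smaller latent to work with, which is one reason the compression setting matters when you change the image dimensions.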
Now I will be writing some other styles instead of the cyberpunk style, to see if it can do other things rather than cyberpunk or GTA-style images. All right, that's all I have for the test prompt, so let's run. As you can see, there are 20 steps and 10 steps running, and that takes about 2 seconds and 1 second of sampling time. We have a John Wick here. Okay, this one looks better: the face is clearer than in the Automatic1111 version, and it's actually more of a cinematic style, with the light effect from behind.

Let's go back and try another aspect ratio. Let me copy these numbers from my document; I will share this document as a PDF in my community group, and you guys can check it out if you want to. Let's try 3,000 width and 1,700 height. It will take a little bit longer since the width and height are larger numbers. Okay, there are awkwardly two John Wicks stuck together a little bit, or two heads on one body. Let's try it again. Maybe the aspect ratio changed but the style from the AI has not changed yet, so it expanded to fill up the landscape ratio without actually structuring the image for this aspect ratio. Let's try another image first, let the AI warm up, and try other data. Sometimes this happens: it generates a weird image, and then I usually try another image and come back to this text prompt later, and then the model is okay.

Okay, here's a forest landscape image with 3,000 width and 1,700 height. It took about 11 seconds in Stage C and 6 seconds in Stage B to render at this image size. Let's try it the other way: 1,700 width and 3,000 height for another aspect ratio. Let's run once again and see whether it still produces some weird structure in the image. It should be okay now. Yes, it is better now: we have another aspect ratio with forest trees, and it
looks pretty realistic, but when you zoom in there's lots of pixel noise in the image. As long as there is no detailer refinement, this noise will appear when you zoom in. When you zoom out to a small image, it looks okay; not really good, just okay.

Let's try some other stuff. Let me close the command prompt window first so we can fully view the workflow and the diagrams here. I will use another text prompt and see if it can handle multiple elements in one prompt, because last time, when we tested in Automatic1111, it performed pretty poorly at understanding a text prompt. I want to see if ComfyUI performs differently now that the checkpoint models have been updated and optimized for ComfyUI. As you can see, my text prompt has multiple things that I want in the image: the cat, the window behind it, and snow falling outside the window. There you go, it really did understand my text prompt in ComfyUI, and it performs, I would say, better than in Automatic1111, obviously. That's why I gave up that system a few months ago, when I started playing with ComfyUI and making workflows and all that stuff.

Let's generate the same text prompt again and see. Okay, this is another image using the same text prompt. It does look pretty accurate for what I want, and I have multiple elements in the prompt. Let's try other cat colors and be more specific: a cat with white fur. Let's see what we get. Yeah, this is way better than Automatic1111. It can analyze and understand what I mean in the text prompt; although I put multiple elements in, it's still able to generate them for me, and this cat looks pretty cute. Now let's try other settings in Stage C. Let's go for another empty latent image width
and height. For example, we usually generate thumbnails for YouTube videos at this size, 1200 x 700, so let's see this one. Okay, it looks better than in the previous videos. I think with this update of the checkpoint models for ComfyUI, they also enhanced something in the model itself so that it understands text prompts better, and it now matches what we saw in earlier testing on the Hugging Face demo page. I think this update of the checkpoint models is better.

Here's another image. I'm not using a cat, I'm using a woman, but the face is kind of awkward. Sometimes when we try to generate a person or character, it's still not very clear if you just input a simple text prompt. You need to specify more in the text prompt, and also in the negative prompt, to get a decent look and style for a character. The face still looks kind of blurry and the eyes are kind of broken in this image. And this one is obviously way off from what I expected. Maybe I have to restructure the text prompt. Let's put a comma after that and try again. Okay, this one is getting back to a proper face, but there are still some problems with the eyes. I have seen this many times with Stable Cascade; even on the Hugging Face demo page, where I tested a few images before, it still had problems with the eyeballs. It doesn't render the eyeballs very well.

Let's try other dimensions and see if there's something new. This one is kind of off with the eyeballs. Let's bring the sampling steps up to 40 and see if there are more details, and then change the CFG; let's use CFG 4 and see what happens. It's all about experimenting and testing a few things, and I just record everything here together as raw footage to let you guys know what kind of results to expect. This image looks a little better. Let's bring the Stage B sampling
steps to 20, and hopefully there are some more details. Okay, this one looks like a pretty cool face, but again the eyes are still not nice. I have no idea why Stable Cascade cannot do nice clear eyes like Stable Diffusion; even in Stable Diffusion 1.5 we could get good eyes and faces. I'm not sure if it's a problem with my text prompt or with the settings. The overall look, like the character's figure, is really good; the eyes are the one problem in what my text prompt generates here.

Let's go back to 1024 width and height and see if the problem is with my text prompt. Okay, good, the pose of the character is very good, but again there's the eye problem. Maybe we can save this kind of image and then fix the eyes in Stable Diffusion, SDXL or 1.5, because right now there are no detailer or enhancement custom nodes for Stable Cascade yet. Maybe there will be some updates later on, because there is a lot of updating in the ComfyUI project on GitHub right now, and I believe they are actively optimizing ComfyUI for Stable Cascade. So maybe in the future we can do more with Stable Cascade, like ControlNet, LoRA support, and animation custom nodes with motion models trained specifically for Stable Cascade. Hopefully there will be something like that after a few months.

Okay, this one looks better; this image actually looks good. Let's see if I can use this style of image to make YouTube thumbnails, because I will be using Stable Cascade to generate the thumbnails for these videos. Right now this is like gambling: I am clicking and generating images. And this one looks scary; the face really looks scary. Let's do other things and see if we can fix that. Again, the eyes are the problem. You see, it doesn't show a clear eyeball. Maybe I have to write a text prompt specifically for that, but I'm not sure, because in Stable
Diffusion I don't need to specify that, and it automatically does well with the eyeballs and the face. But I can save this image and we can try a detailer enhancement in Stable Diffusion.

Let's try a larger width and see if we get a better style. We are bringing the sampling steps down to 30 for Stage C and Stage B. This one has a little improvement, but it's still not the eyeballs that I want. Let's go to the text prompt and add something like "clear eyeball", and hopefully we get a better image. Wow, yes, this is what I want, a better style. The eyeballs are not really clear yet, but it's better, and it looks so elegant: the whole image structure and the outfit of the model are very elegant. We can save this image and enhance it in another workflow.

Let's bring in some light from the windows. One thing I can see Stable Cascade is very good at is lighting effects, like a spotlight or sunlight from a window. Like this one: it did really well, very consistent and very detailed. This model knows the details of the lighting, such as the direction of the windows, and all the light goes in that direction. For example, in this image the eyes are not clear again, but we can see the light from the windows shining in one direction. Let's put "ultra clear eyes" first. Yeah, the light effect is really good here. You see there's one direction, and on the character's chest there are some shiny areas affected by the sunlight. In this newly generated image the eyes look a little better, but still not as good as I need.

Let's try some other text prompt. With some new AI models, especially generative models, it happens like that: when you have just downloaded them, they generate some weird stuff, not what the demo page or the research paper promised. But eventually, once you generate a few
more times, it's like letting a player warm up, and eventually it generates good stuff. You just need to take the time to build up the momentum for generating a good image. That is what I have seen so far with other AI models.

This one looks good. There are some shadows under the eyebrows, but it is a little better than the previous generations. Here's the text prompt for this one. Maybe we have to add something like "photorealistic" or a similar realistic style to the prompt, and then we can get a better result. Let's try one more time with this text prompt; I have a feeling it will produce a good image after running for a while. Yeah, this one is way better than what we had. The sunlight comes in from one direction: the outer side of the sofa gets some sunlight, and behind the sofa, at the woman's shoulders, there's some sunlight coming through. It's very consistent; you see it in the hair as well. It's just like last time, when we did the flowers image in Automatic1111. This model does light effects very well: it detects the light direction and produces a really good light effect in the image.

So far, this is what I have tested for Stable Cascade in ComfyUI, and I will choose this image for this video's YouTube thumbnail. I hope you guys got some inspiration on how to use Stable Cascade in ComfyUI. We will come back to other features, such as Stable Cascade image-to-image and CLIP Vision, and build workflows for those in later videos about Stable Cascade. I will post this workflow and the document with my notes about Stable Cascade in the community group, so you guys can check them out and see what we can do with Stable Cascade in the future. Enjoy, and see you guys in the next videos. Have a nice day, bye.
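To recap the three-stage wiring described earlier in the captions, the sketch below models the workflow as a small dependency graph in Python and derives a valid execution order. All node labels here are hypothetical shorthand, not the actual ComfyUI node names, and the exact connections (for example, taking the VAE from the Stage B checkpoint and feeding the zeroed-out conditioning into Stage B) are my reading of the description, so treat them as assumptions rather than the author's exact graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a node; each value lists the nodes whose outputs it consumes.
# Labels are illustrative only and do not match ComfyUI's internal names.
WORKFLOW = {
    "stage_c_checkpoint": [],
    "stage_b_checkpoint": [],
    "empty_latent": [],  # provides one latent for Stage C and one for Stage B
    "positive_prompt": ["stage_c_checkpoint"],
    "negative_prompt": ["stage_c_checkpoint"],
    "stage_c_ksampler": [
        "stage_c_checkpoint", "positive_prompt", "negative_prompt", "empty_latent",
    ],
    "conditioning_zero_out": ["positive_prompt"],
    # Stage B is conditioned on the low-resolution latent produced by Stage C.
    "stage_b_conditioning": ["conditioning_zero_out", "stage_c_ksampler"],
    "stage_b_ksampler": [
        "stage_b_checkpoint", "stage_b_conditioning",
        "conditioning_zero_out", "empty_latent",
    ],
    # Stage A: plain VAE decoding of the Stage B result (VAE source assumed).
    "vae_decode": ["stage_b_ksampler", "stage_b_checkpoint"],
    "preview_image": ["vae_decode"],
}


def execution_order(graph):
    """Return one valid run order in which every node follows its inputs."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step, node in enumerate(execution_order(WORKFLOW), start=1):
        print(step, node)
```

The point of the sketch is only the ordering constraint: Stage C sampling has to finish before the Stage B conditioning can be built, and decoding comes last.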

Info

Channel: Future Thinker @Benji

Views: 3,176

Keywords: stable cascade, Comfy UI, Comfy UI tutorial, stages, models, file structure, optimization, VRAM, UI models, checkpoint folder, subfolders, image generation, model sampling, low resolution latent, stage C, stage B, k sampler, conditioning, latent image, vae decoding, image output, preview image, Comfy UI guide, Comfy UI walkthrough, Comfy UI tips, Comfy UI tricks, Comfy UI techniques, Comfy UI best practices, Comfy UI optimization

Id: RCbd9pbSJsc

Length: 26min 27sec (1587 seconds)

Published: Wed Feb 21 2024