Stable Cascade img2img in ComfyUI

Captions
[Music] Welcome back, everyone. I've got some good news. I know some people in the comments were desperate for image-to-image, and thankfully they brought in support for the effnet encoder. No sooner had I finished the video than I realized I needed to go back and refactor things, so we've released a clean version that doesn't need any additional custom nodes. Both versions are there. It is exactly the same as V16, but it doesn't use custom nodes for scaling. It's not quite as smart: it just forces the image to 1024, so you will need to use a square image, but we're going to cover how you can modify stuff like this quite easily.

I've updated the article, so we've got V10, the original text-to-image; V12, which has no custom nodes; V16, which is the image-to-image I just released today; and V17, which is the exact same thing without custom nodes. I like to make a basic version that's clean so people don't have to mess around with custom nodes, but from now on it's going to be custom nodes. Back over here we can see the various versions, so whichever one you want, you just click on it and then click download. V17 is the latest one. I think V16 is a little bit smarter because it's got a custom node for scaling, but if you have trouble with the custom nodes you can still use V17.

So let's jump in. Here it is. The only difference is, as I said, we've added in the model sampling, so you can change shift now. Some people use 3, some people use 0.8; the default is 2, so I've left it on 2 for people to mess with.

The thing I wanted to quickly go over is where a lot of people were falling over. I made this mistake myself when I came to do the conversion for image-to-image. The funny thing about this system is that we all think A, B, C, but it runs the other way: Stage A is the VAE decoder, Stage B goes in the second KSampler, which means that Stage C goes into the first KSampler.
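To keep the stage order straight, here is a minimal sketch in plain Python of the wiring described above. It is only a memory aid, not ComfyUI code: the `StageSlot` class, the `PIPELINE` list, and the node labels are hypothetical names made up for illustration, and only the C, B, A ordering comes from the video.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageSlot:
    """One Stable Cascade stage and where it sits in the workflow (illustrative only)."""
    stage: str   # "A", "B" or "C"
    node: str    # descriptive label, not a real ComfyUI node identifier
    order: int   # position in the run order, starting at 1


# Order as described in the video: C runs first, then B, then A decodes.
PIPELINE = [
    StageSlot("A", "VAE decode", 3),
    StageSlot("B", "second KSampler", 2),
    StageSlot("C", "first KSampler", 1),
]


def execution_order(slots):
    """Return the stage letters in the order they actually run."""
    return [slot.stage for slot in sorted(slots, key=lambda s: s.order)]


def node_for(stage):
    """Look up which node a given stage model plugs into."""
    for slot in PIPELINE:
        if slot.stage == stage.upper():
            return slot.node
    raise KeyError(f"unknown stage: {stage!r}")


if __name__ == "__main__":
    print(execution_order(PIPELINE))
    print(node_for("C"))
```

Reading the list by `order` rather than alphabetically is the whole point: the letters suggest A comes first, but the run order is the reverse.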
I got this the wrong way around, and when I came to do this modification I got it the wrong way around again. But it's really simple, guys. Go and get the effnet encoder, which is not on the Stability GitHub but on Hugging Face. It's called effnet encoder, and you put it in your VAE folder, so it goes in the same folder as Stage A. I probably want to put a note there and update this, but basically you put the effnet encoder into the VAE folder, and it is only for encoding for Stage C.

So what I've done is I've got an image. Let's just change this image; that'll do. So we've got the image, and we'll change it to "a heroic man in armor". [Music] I'm not going to change anything else. We've got it scaled to size, just to catch it if it's not actually 1024. I'm only using square images right now. I don't know if it's going to go weird if you give it a funny aspect ratio; it usually just squishes it. To prevent skewing, I'm just using a square image for this setup. It's easy to change.

The latent gets encoded by effnet and then it gets thrown into the first KSampler with Stage C. If you put it into this one instead, it will give weird errors about dimensions and things. Also, don't try to use the effnet as a decoder, because it will just make garbage.

That said, let's generate an image. Another trick, of course: most people will try to use the denoise on here. That's not correct. You want to use the denoise on here to control the image-to-image blending. Just a reminder, this is what my image looked like; it's one of the old JoJo poses. And this is what I got, and this is a 2048. Obviously you can play around with your CFGs and your steps.

One other addition is I've made this, which is a shrink down to 1024. In fact, you know what, these require a custom node, so I might do a quick update, because otherwise people are going to be wondering how.
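For anyone wondering how to get a non-square image down to the 1024 square this setup expects without squishing it, the arithmetic can be sketched as below. This is a rough sketch under assumptions, not what the workflow nodes do internally: the center crop is an added suggestion (the video only forces 1024 and uses square inputs), and `center_square_box` and `scale_factor` are hypothetical helper names.

```python
TARGET_SIDE = 1024  # side length the clean workflow forces, per the video


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square.

    Cropping to this box before resizing avoids the skew you get when a
    non-square image is forced straight to a square size.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def scale_factor(width, height, target=TARGET_SIDE):
    """Return the factor that brings the cropped square to the target side."""
    return target / min(width, height)


if __name__ == "__main__":
    w, h = 1920, 1080
    print(center_square_box(w, h))  # crop box in pixel coordinates
    print(scale_factor(w, h))       # below 1 shrinks, above 1 enlarges
```

The box uses the left, top, right, bottom convention common to image libraries, so it can be passed to whatever crop tool you prefer before the resize step.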
It's very simple, though. So that's pretty much it. I just wanted to do a quick update video for you guys. You can find the image-to-image workflow over here on Civitai, and you can also find it linked in the article. Open that one up there; this video will go in here too. That hasn't updated for some reason. Anyway, there you go. Thanks for watching. We'll be coming back next time with more Cascade.
Info
Channel: FiveBelowFiveUK
Views: 1,739
Keywords: ai, art, generation, image, img2img, stable, cascade, comfyui, workflow, explained
Id: pYqDgrcwJQU
Length: 5min 31sec (331 seconds)
Published: Mon Feb 19 2024