Creative Exploration - Ultra-fast 4-step SDXL animation | SDXL-Lightning & HotShot in ComfyUI

Captions
I think I'm going live... hey, I think I'm live; wait for YouTube to catch up. Hello friends, welcome. Here we are again: ComfyUI Creative Exploration with Purz. We'll be making animations with HotShot today: very fast animations, in four steps, with SDXL Lightning. It's surprisingly fast, and I'm a little blown away at the quality and the consistency. All the things we've been missing from HotShot are kind of here now, especially with video-to-video. This workflow works best when you feed it input footage and use depth maps to enforce it, but we can also try taking the images out of the VAE encoder, starting from an empty latent instead, and just letting it dream to see what happens.

Let me give some credit here and pull it up real quick. On the Banodoco server there was a workflow posted very recently called "vid2vid SDXL 4 steps lightning LoRA" by Klinter. Very cool workflow. If you aren't on Banodoco, I highly recommend you get on that server; I'll drop an invite in the chat right now. The workflow is in the resources section if you want to follow along, so go download it and throw them a like. If you want, we can rebuild this entire node structure from the ground up, but for now we'll just start by playing with it. Klinter based it on Inner Reflections' LCM workflow, the one we worked on last week, but with the Lightning LoRA, and organized it a bit. The upscale does need a little work, but we've figured out a workflow for that, so I'll cover it.

The only other things you'll need: I've been using the DynaVision XL all-in-one checkpoint, which works really well for this process, and the Lightning LoRAs. We're using the 4-step LoRA here, and it just goes in your loras folder. So download the Lightning 4-step LoRA, put it in your loras folder, load it with the LoRA loader, and you're off to the races. You can just install that workflow, install those models, upload your footage, and hit go. But we'll walk through this workflow, how it all fits together, and why it's possible to get such cool animations.
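For reference, here is roughly what the same four-step Lightning setup looks like outside ComfyUI, following the usage published on the ByteDance/SDXL-Lightning model card. The stream's workflow swaps in DynaVision XL and a CFG of 1 inside ComfyUI; the vanilla SDXL base and zero guidance below are the model card's defaults, and the prompt is just the one from the stream:

```python
# A minimal sketch of 4-step SDXL-Lightning inference with diffusers,
# per the ByteDance/SDXL-Lightning model card usage.
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_lora.safetensors"  # the 4-step LoRA from the stream

pipe = StableDiffusionXLPipeline.from_pretrained(
    base, torch_dtype=torch.float16, variant="fp16"
).to("cuda")
pipe.load_lora_weights(hf_hub_download(repo, ckpt))
pipe.fuse_lora()

# Lightning is distilled for trailing timesteps and little-to-no CFG,
# which matches the CFG-of-1 behavior discussed on stream.
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

image = pipe(
    "50-year-old Walter White dancing in a meth lab",
    num_inference_steps=4, guidance_scale=0,
).images[0]
image.save("lightning_4step.png")
```

The key points carry over to ComfyUI: load the 4-step LoRA on top of an SDXL checkpoint, sample exactly four steps, and keep the guidance at or very near the bottom.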
I have noticed that with a CFG of 1 it seems to ignore my negative prompt, or sometimes even add it to the image; I'm not sure what's up with that, so I'll try to stick to men today so we don't get any accidental breasts on the stream. It might also be a side effect of this model; maybe it's a little hornier than other models. It is what it is.

Let's start by loading 16 frames of a video I downloaded off Pexels. We'll skip the first 140 frames or so, basically picking a random spot in the video, and I've been loading every second frame to cover more motion; the animation you saw earlier is the smoothed-out, 24-frames-per-second version. (I meant to kill Steam before we started; there go the awful notifications.)

All right. We load our video into the video upload node, and we set our base resolution and our upscale resolution. If you put letters instead of numbers in the upscale values, it won't upscale, which is great, because usually you want to wait before upscaling the first thing you make. All the seeds are fixed, so when you've got a good generation you want to upscale, all you have to do is put the numbers back in; otherwise leave letters in there and the whole upscale stage gets bypassed.

We're loading DynaVision XL, the SDXL VAE, the Lightning 4-step LoRA, and the HotShot AnimateDiff motion model. I've added IP adapters to this; they're not in the current setup, so if you want a screenshot of how to add them, that's it right there. You load the Apply IPAdapter node inline with the model string, wherever you want to plop it in, then plug everything else in as shown: Load Image goes into Prepare Image For ClipVision, that goes into Apply IPAdapter, and the model loaders sit above it. You can put them wherever you want, but that's the idea: load the models, load the image, and route the model through it. We're not using the attention mask here, but that's another thing we could play with. The IP adapter will generally overwrite your positive prompt with whatever image you give it, and you can reinforce it through the prompt too: if I wanted Heisenberg dancing, I'd use a picture of Heisenberg but also prompt something like "50-year-old Walter White dancing in a meth lab". We can run that while we're waiting and see if it works. Okay, we've loaded up Walter White, and hopefully he'll dance in a meth lab. That's our prompt.

Next thing is ControlNet. We're using Depth Anything, which makes a depth map of the input footage, so we've got something to dream on top of. Then it runs through AnimateDiff: we're using the HotShot motion model with the linear HotShot beta schedule. In the sample settings the noise type is set to empty, and we're using the looped uniform context at 8, 1, 3, so that if our input video loops, the animation loops too. That's the typical setup for HotShot. You can change the context overlap if you want, but don't change the context length: leave it at 8 for HotShot, 16 for regular AnimateDiff. The Apply AnimateDiff Model node is literally just loading a model; you could use the simple one too, no big deal.

Then we get to our KSampler. For this we need a CFG of 1 and four steps, because we're using the 4-step Lightning LoRA; the sampler name is dpmpp_sde and the scheduler is simple, and that gives us some pretty good results.
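To keep those numbers in one place, here are the key sampler and context values from the stream, written as plain Python dicts. The key names approximate the ComfyUI widget labels; this is a reference summary, not executable API code:

```python
# Sampler and AnimateDiff/HotShot context settings used in the stream.
ksampler = {
    "steps": 4,                  # matches the 4-step Lightning LoRA
    "cfg": 1.0,                  # Lightning wants CFG at (or very near) 1
    "sampler_name": "dpmpp_sde",
    "scheduler": "simple",
}
context_options = {
    "context_schedule": "looped_uniform",  # loops if the input video loops
    "context_length": 8,         # 8 for HotShot, 16 for regular AnimateDiff
    "context_stride": 1,
    "context_overlap": 3,        # safe to adjust; length should stay fixed
    "noise_type": "empty",
}
```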
I don't know why his shirt keeps coming off; I don't get it, but there he is. You'll notice we get some crazy noise in the background, and the consistency isn't nearly as good as a long run, but for doing fun animations and playing around, the iteration speed is insane; we're just blasting through these tests. So let's try different input footage where he doesn't get naked. For this one we want it wide, not tall, so we'll go 1024 by 576, and let's check that everything's plugged into the right spots: negative to negative, yeah, we like that. Oh, I think I found the problem... is this the positive conditioning? No, that's positive to positive; it was right. I'm just checking the routing here. I still feel like I'm getting my negative prompt in my main prompt, so maybe I'll try removing it and see if that goes away. Everything seems connected correctly, so I really do think it's a byproduct of the CFG of 1. Anyway, I swapped those two values, so it'll render wider than it is tall.

Yep. He doesn't exactly look like he's in a meth lab, but a lot of that comes from using the input footage, so I'm actually quite curious what it looks like if we dream inside an empty latent instead and just let the depth-map ControlNet take over. We'll see. Sorry, I meant to mention: this is running the SDXL ControlNet LoRA, depth, rank 256. You can use either the rank 128 or the rank 256; I think the 256 is better, maybe. It might just take more VRAM and only be slightly better; I haven't tested that. Oh, there we go: Walter White is dancing, which is important work. I know, we're really doing important stuff here.

All right, now instead of using this VAE encoder I'm going to use an empty latent and see what that does. We add a reroute so we remember that's our latent image, then add an Empty Latent Image node (not a Get node) and pop it in here. We convert its width and height to inputs and wire them from the input values up top, and the batch size comes from the frame count of our video, so we convert batch_size to an input and pull that in too. So now we're starting from an empty latent image, and only the depth ControlNet is controlling the dream.
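What that Empty Latent Image node hands the sampler is just a zero tensor sized from those inputs; a minimal sketch, assuming the standard SDXL latent layout of 4 channels at 1/8 spatial resolution:

```python
# What an empty SDXL latent batch looks like when batch_size is wired
# to the video's frame count (values from the stream's setup).
import torch

width, height = 1024, 576   # base resolution set in the workflow
frame_count = 16            # wired from the video loader's frame count

latents = torch.zeros([frame_count, 4, height // 8, width // 8])
print(latents.shape)        # torch.Size([16, 4, 72, 128])
```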
We'll get to see whether this is wildly different or wildly useless. We might need those images on the way in to actually form the scene, or we might get a more interesting, robust dream, or it might be exactly the same; we might have to back off the ControlNet a bit. It's hard to say. This is why we test stuff like this: to find different methods of dreaming. These results are cool, but they're sticking so hard to the input footage that I suspect that's what's causing all this artifacting, so I'm hoping that if we don't use the input footage as the basis for each frame, we'll get a freer dream.

"What do I have to do to get the CPU/GPU memory graphs, same as Master FX?" It's a plugin, it's in the Manager, and it's called Crystools. Just install it, restart Comfy, and you'll have them.

Okay, so we got a slightly more interesting background and about the same amount of noise. That's pretty interesting. Let's turn this down to select every frame and do twice the number of frames, so 64; it doesn't take that much longer, and we'll see what it looks like at proper speed, because it's a little fast. Well, this is pretty impressive considering what HotShot normally looks like. It can be good, but a lot of the time it doesn't look as sharp as this, and remember this is at four steps; normally you'd do 25 or 30 steps to get this kind of quality.

I'll also mention that, as far as I know, Lightning is not usable for commercial work, so keep that in mind; this is for research and fun. I don't know if the Lightning LoRAs are covered, so check the licenses before considering any of the Lightning stuff for commercial use, because as far as I know it's still under a research license. Although I might be wrong; that might be a Cascade thing. Let's find out, actually... license: OpenRAIL. Yes, that might just be a Cascade thing.

Christopher Moody: "Did you see that the tile ControlNet model for SDXL was finally released last night?" I saw that right before I went to sleep, so that's pretty exciting; we'll have to do some upscaling craziness with it on another stream, after I've learned how to use it. All right, Heisenberg, get it man, get it.

Okay, now I'm curious what this does for non-people. Let's try a car race, maybe; can this dream crazy footage? I'm actually really curious about this bike footage... it's 60 frames per second, though, so I'll avoid it; that's way too many frames to render. Let's see what this does. I wonder if it gets the ground and all the details, or if it's just gooey noise. Might be gooey noise. All right, 64 frames, skip the first chunk, otherwise literally the same settings. For the prompt: "hovercraft gliding across an alien desert". Two s's in desert, or one? I feel like it's one. We should probably turn the IP adapters off, because we don't need it to look like Walter White, and I'll throw "nsfw, nude" in the negative just in case. I'm also curious whether Depth Anything can even see this, because it's mostly trained on people (it comes from TikTok), so I might have to use a different depth ControlNet. We'll see.

"Can't get it to work right in Comfy; hoping someone will release a nice workflow." Yeah, I'm sure one will be on Banodoco within the hour; the community moves so fast. I don't think this is going to work, but it might be interesting, and if it looks totally messed up we can try adding the original frames back in as the VAE base and see if that helps. And that's a good way to remember the spelling: one s in desert, two s's in dessert, because you always want more dessert. I like that.

Interesting. So this is with an empty latent. I know you're always curious how much VRAM these things take, so let's find out... it looks like about 12 gigs of VRAM for inference, so that should work on any 12-gig card. I'm obviously using some VRAM for streaming, OBS, Discord and all that, so I have a feeling you could do this on a 12-gig card. The VAE decode may take a little longer; if you do longer batches you'll render more frames and can go make a coffee while it hits the VAE decoder, and there'll probably be some swapping to memory at that point. Actually, we can watch how much VRAM it takes when it hits the decode too; I'm curious myself... 12.3 gigs.
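If you want that number without the Crystools overlay, the raw PyTorch equivalent is a couple of lines; a sketch, assuming you're driving the pipeline from your own Python session:

```python
# Peak-VRAM check around a generation, roughly what the Crystools
# monitor reports inside ComfyUI. Standard torch.cuda APIs.
import torch

torch.cuda.reset_peak_memory_stats()
# ... run the sampler and VAE decode here ...
peak_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"peak VRAM: {peak_gib:.1f} GiB")
```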
What about Macs? Oh, unfortunately it's very, incredibly, painfully, annoyingly, uselessly slow on a Mac. It's not about the VRAM on a Mac; it's the lack of CUDA cores. NVIDIA has a stranglehold on this technology. There's always hope that down the road they'll figure out how to port PyTorch to Core ML so you can unlock all the power in these MacBooks, but as it stands, inference is just painfully slow, which sucks. There are online cloud solutions, though, like RunDiffusion and RunPod; if you check out Consumption AI on YouTube, he's got guides on how to set all that up.

Well, that's kind of cool. It's sitting in place, though, which is interesting; I wonder if that's because it looks like that in the depth map. So we've got it sitting in place, which is trippy, and they turned into weird creatures or something. Let's re-add the VAE encoder instead of the empty latent image and see what we get. Now we're running the original frames through, as well as the depth map, to see if it changes the appearance.

Sorry, Mob, I didn't mean to burst your bubble about the Macs; I know it's frustrating, and I know people with really powerful MacBooks who would love to do this stuff locally. It's not that it doesn't work, it's that it's too slow for animation. You can get away with single images; they just take a bit longer. But for large batches of animation it would simply be faster to use somebody else's PC, RunPod or RunDiffusion or something. I think RunDiffusion is as low as about 50 cents an hour for ComfyUI instances, which is pretty sweet; that's what I'd do if my 3090 exploded, knock on wood.

We're definitely getting a lot more detail on this one because we're using the input footage, so this should be pretty sweet. It would be trippy to have a whole video where the car isn't moving but the tires are spinning and the dust comes from the wrong direction. But hey, this is not a competition with AnimateDiff and SD 1.5, and we're not comparing to see which one's better. The fact that we can do HotShot animations in four steps is the exciting part; the fact that they look kind of weird is fine, because this is one model, one motion model, one method, and we're right on the cusp of something really cool. The results might not be mind-blowing, but the fact that it converges in four steps is mind-blowing to me, because my favorite part of AI animation is fast turnaround. When you're working at 20 or 30 steps, the iterative process of getting your piece to look good takes so much longer than when you can do 16 frames in 30 seconds or so; you can just bang away until it looks cool and then run out the whole render, super fast. We're getting there with AnimateLCM on SD 1.5 too; I'll go back to that on the next stream, or maybe once we've exhausted HotShot I'll dig into it a bit more.
The stuff I'm getting right now from AnimateLCM and SD 1.5 is mind-blowing, so hopefully we'll get access to that kind of quality with SDXL soon enough. Oh, cool: we got more detail, but it still looks like it's holding in place. That's fine though, and the person is sort of becoming part of the car, then not part of the car.

So here's the thing about HotShot, and this is definitely very HotShot: it has an 8-frame motion window, unlike AnimateDiff's 16-frame context length. HotShot was trained on 8-frame chunks of animation, AnimateDiff on 16-frame chunks. Notice how everything changes really fast in these videos? That's because the context window is very short, only eight frames, whereas with AnimateDiff there's twice as much time to shift between contexts. Things still change with AnimateDiff, but half as fast, or more precisely the context windows are twice as long, which gives the illusion of more cohesion. Even so, the fact that this can do eight frames, jump context windows, and not produce a big garbled mess of goofy pixels is incredible.
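To make the 8-versus-16 point concrete, here is a toy sketch of how uniform context windows tile a 64-frame batch. The windowing math is illustrative only, not the actual AnimateDiff-Evolved scheduling code:

```python
# Toy illustration: shorter context windows mean more window hops per
# video, which is why HotShot output shifts faster than AnimateDiff.
def uniform_windows(frame_count, length, overlap):
    step = length - overlap
    starts = range(0, max(frame_count - overlap, 1), step)
    return [list(range(s, min(s + length, frame_count))) for s in starts]

for name, length in [("HotShot", 8), ("AnimateDiff", 16)]:
    windows = uniform_windows(64, length, overlap=3)
    print(f"{name}: {len(windows)} windows of up to {length} frames")
# HotShot: 13 windows of up to 8 frames
# AnimateDiff: 5 windows of up to 16 frames
```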
"Do you think 48 gigs of VRAM will become the norm within four years, if AI game physics is the future?" I hope not. Yes, the high-end stuff will probably double within the next couple of years, but what I'm hoping is that we get more for less. We don't necessarily need a ton of VRAM for inference itself, the actual image diffusion; what we need the large chunks of VRAM for is large batch animations, really high-res inference, and training, and most people don't need those day to day. Hopefully the VRAM needed to infer higher-res stuff goes down, so you can ramp up what you can achieve with less. I don't think the future is everyone owning 48-gig cards; that's crazy excessive, VRAM runs really hot, and there are a lot of reasons to push inference toward less VRAM and less power, so we get to the result faster and cheaper. So will they exist? Yes. Will they be the norm? Probably not, unless there's a massive boom in AI gaming where things get dreamed up in real time and you need honking gear for it; maybe that would usher in a whole new wave of power-hungry PC gamers mad for the latest video cards, but I feel like we're a ways away from that. For now we get to play with this low-stakes inference stuff and try to get the best results we can, so our friends without 24 gigs have the same amount of fun as we do with these big stinking video cards. That was a really long answer to a really short question.

All right, this is only kind of better. What else can we do? Maybe we'll add more ControlNet to the mix; I want to toss a line art ControlNet in there. ControlNet into ControlNet into the next ControlNet is how ControlNet works: you just pipe your conditioning through each one. So we start with depth and move on to line art, and I have to remember the wiring: bottom is positive, top is negative. Then we add another ControlNet loader and change it to... oh, I need to download the line art model first. Manager, models, search ControlNet SDXL... Stability AI canny... really, there's no SDXL line art ControlNet? Interesting. Okay, I guess we're using canny. Install it, refresh, and it should show up... there you are, ControlNet canny. So we use a canny preprocessor; it takes its image from the input footage and feeds the Apply ControlNet node.

If we want to see what the canny pass produces, we can drop in a Video Combine node and turn off save_output, because we never want to save these files, just look at them; the file name doesn't matter for the same reason. I'll actually replace this preview image with one of those as well, right here, so we can watch videos of both the depth and the canny, and I'll move this down a bit because the preview pops up underneath. Okay, I think that's everything... oh no, what did I miss? Cancel. I should probably plug this in. ("I downloaded Pinokio but it gets stuck at 80%." Oh, that's annoying.)

Yeah, save metadata is great. So there's our depth map and there's our line art; hopefully that gives us the motion and the outlines we need in the animation, but who knows. There are two buttons on the Video Combine node: save metadata and save output. Save metadata writes a PNG alongside the video carrying the metadata of the whole project, so if you made a video you can always go back and grab it. Save output actually saves the file when it's done: turn it off and you just get previews, turn it on and you get results. You can rename outputs here, or, if you switch file_name_prefix to an input, there's a wicked node in the Mikey Nodes pack called FileNamePrefix that you plug in. It can make a folder for the date you're working on, append or prefix the date to the name, set a custom directory, and set custom text for the file name. These are incredibly useful for dumping things into date folders so you can see what you made each day versus the day before, because if your output folder is anything like mine, it becomes an absolute hodgepodge of madness. Highly recommend the FileNamePrefix node: just switch the file name prefix to an input and plug it in. It's in Mikey Nodes.
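As a rough stand-in for what that node builds, here is the same date-folder idea in plain Python. This is hypothetical illustration code, not the Mikey Nodes implementation:

```python
# Build a dated output prefix like the Mikey FileNamePrefix node does:
# a folder per day, plus optional custom directory and custom text.
from datetime import date
from pathlib import Path

def file_name_prefix(custom_text="output", custom_dir="", date_folder=True):
    parts = []
    if date_folder:
        parts.append(date.today().isoformat())  # e.g. "2024-02-25"
    if custom_dir:
        parts.append(custom_dir)
    return str(Path(*parts) / custom_text)

print(file_name_prefix(custom_text="hotshot_lightning"))
# e.g. "2024-02-25/hotshot_lightning"
```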
"Do I have a channel for my outputs?" No; I post my outputs on Twitter, do some NFT stuff, art, music videos and things like that with them, but for the most part I just post them on Twitter, so you can check them out there, and then we usually talk about how they were made and have a stream about it.

This should be interesting... okay, we definitely got some outlines, but it still feels like it's sitting in the same spot. It's cool, though. All right, let's try some different input footage; this might be cool. It's 60 frames per second, but we'll make it work. If you want input footage to play with, I always use Pexels; they have a wealth of really fun stuff. "Hovercraft gliding across an alien desert" might still work; let's try it. We might have to play with the canny settings a bit here. I don't know what it's going to make, because the trees have become the most prominent thing, but it should be a cool mask for the input at least, plus it's got the images going through the VAE encoder, so it could be mental. Go ahead, stand up and stretch your legs, that's what I'm doing. Or don't; I'm not your mom. Did anyone catch whether the VRAM spiked during the VAE decode? I missed it; try to catch it this time. I think this one's going to be goo, but I've been very wrong before. Me, I need to get a Stream Deck with a cough button.

"What process is making the line art?" It's called the canny preprocessor; it's a ControlNet preprocessor that comes with Comfy, and you plug it into the Apply ControlNet node as the image: input video in, preprocessed frames out. A little rundown on ControlNet in case anybody's unaware: ControlNet is a way of imposing a composition on your piece, using models trained on different databases of image information. Take the depth ControlNet as an example: it's trained on a bunch of depth maps, so you generate a depth map from your input footage and feed it in, and frame by frame it tries to apply the dream to that depth map, because that's what it was trained on. Canny works the same way with the canny ControlNet (line art is actually a separate kind of ControlNet; canny is its own kind): it makes canny edge-detection videos, or more specifically a batch of images showing, frame by frame, where all the edges are. You can adjust the low threshold and high threshold to control how many edges come through; you have to find the right thresholds for your footage. Then you stack them through the conditioning: ControlNet one into ControlNet two into ControlNet three, as many as you have VRAM for.
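The canny preprocessor wraps standard edge detection; here is the same thing done directly with OpenCV on a single frame, to show what the two thresholds do. File names are placeholders:

```python
# Canny edge detection on one frame, mirroring what the ComfyUI canny
# preprocessor does across the whole batch.
import cv2

frame = cv2.imread("frame_0001.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Gradients above the high threshold seed edges; gradients above the low
# threshold extend them. Lower both values for more (noisier) edges,
# raise them for fewer, cleaner ones.
edges = cv2.Canny(gray, threshold1=100, threshold2=200)
cv2.imwrite("canny_0001.png", edges)
```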
Looks pretty cool, actually; it kind of changed it into the desert a little bit. I think we'd actually get better-looking results with AnimateLCM and SD 1.5 right now, but this is very interesting because HotShot has never been this fast; we've never really been able to mess around with it like this, since animating HotShot videos at high res and high step counts takes a while. I'm trying to think what would improve this... I don't think we can; it's very interesting looking. Just looking for somebody driving a car now. What's this guy doing? Nothing, he's driving in the mud. Someone suggests a "man running down a hill" prompt; I'll try to find somebody running after this one. Let's pop this in, load it at twice the speed, maybe 200 frames in, and try "Mad Max, Road Warrior, dystopian wasteland". I think the motorcycle footage is bad for HotShot, or at least bad for this process: too much motion, too much noise. Murder Basket: "this is so unhinged, I love it." If this turns out good we can try upscaling it.

"I know this is a little late to ask, but what is HotShot versus AnimateDiff versus SVD?" All right, let's talk about that. HotShot is a motion model that runs through AnimateDiff but is trained at eight frames instead of 16. AnimateDiff is a model and method for doing batched animations with 16-frame context windows, twice as long as HotShot, but it really only works with SD 1.5, while HotShot works with SDXL. SVD is Stable Video Diffusion, a newer Stability model that's protected under the Stability membership plan: if you want to use it commercially you need to pay Stability, I think 20 bucks a month, to be part of the creative program. What Stable Video Diffusion gives you is Runway-esque image-to-video: you can dream up a new image and generate a video from it, or use existing images and create videos from those. It has increasing levels of motion (motion bucket IDs, or whatever they're called), so you can tell it to use more or less motion, and there's a new SVD LCM version that's apparently absolutely stupid fast; I haven't tried it yet, so we could cover it in a future stream. SVD is really cool, and it's also the backend for stablevideo.com, so you can play with it there, or play with it locally all you want; you just can't use it commercially unless you pay Stability. SDXL and SD 1.5 don't fall under that, so you can use them for whatever you like. So: AnimateDiff and HotShot are parts of the same thing, and SVD is another architecture altogether, also made by Stability, available for local use, web use, and commercial use if you pay. That's pretty much the landscape for ComfyUI video right now; there's other stuff people are working on too, like the new Deforum nodes, which look really cool.

This is crazy looking; I kind of want to upscale it. Everything looks dirty, though. It always looks dirty; I wonder why. Oh, maybe because I prompted "dystopian wasteland". What if we try "abandoned mall, 1980s, neon signs, emissive lights, food court", all that kind of stuff? I'm curious how much influence the prompt has on this. A trippy video. It does look very Mad Max, which is exactly what I asked for. Crazy. So yeah, I've noticed the noise is a little wonky with the upscaler. I've tried to mitigate it with the sample settings, starting from an empty latent and adding a layer of FreeNoise at 0.8, so 80 percent, weight; that seems to help bring some of the detail back.
Basically, when you do the upscale with empty noise, the person becomes uncomfortably smooth: all the detail washes away and they end up looking like plastic, with no detail anywhere. So you need to inject a bit of noise into the upscale to get that detail back. I did try swapping the KSampler for the regular one and using denoise to bring it down, but when I ran that, the upscaler crashed my computer, so I'm not sure what the answer is yet; I don't know if this is an upscalable workflow as it stands. We'll see what we get; I can do an upscale run after this one, because I'm hoping it turns out cool. The input footage itself looks dusty and dingy, so I get why that's coming through the VAE encoder. We could try it with an empty latent image and just let the crazy depth and canny maps rebuild the scene; actually, maybe I'll even grab a picture of a mall and use it for the IP adapters. Okay, let's do one with the IP adapters, and then one where we swap in the empty latent image.

So now we're pushing the IP adapters in at about 80 percent weight, with this image, to try to capture that abandoned-80s-mall vibe I asked for in the prompt. Let's see what we get. "Isn't there an SDXL beta version of AnimateDiff now?" There sure is. I have not gotten a single good result with it yet, but someone might. So that's with the IP adapters: it's trying to add a little more neon, but it's still not really understanding what we want. And now this one is the empty latent image, so we'll just get the pure empty-noise dream.

Hey Kush, right on. Yeah, the workflow is on the Banodoco server; if you scroll up, the invite is in this chat, the Discord link. You can download the workflow from the resources channel; it's the one called four-step lightning LoRA SDXL AnimateDiff HotShot blah blah blah, the second or third from the last. ("Who directed that clip? I bet it's three separate shots; it's dangerous otherwise." Not sure what you mean, but maybe.) "What is Banodoco?" Banodoco is a Discord server for the ComfyUI community. It has an incredibly vast trove of workflows and resources, and it's where we get a lot of the stuff we cover in these streams. You've got to get on Banodoco; it's the best. The link is at the top of this stream, but I'll drop it again. This workflow is available in the resources channel, it's by Klinter, and it's the vid-to-vid lightning four-step one.

Okay, that result is definitely late, but still insane. That's probably as good as we're going to get there, so let's go back to normal human people, because I think we can get something cool. Where's the other one... not you... that one. We'll do the first 64 frames starting at zero; actually, just the first 16, because right now we're style diving to figure out how to make this look cool. 1024 by 576. Let's try "anime, 30-year-old man dancing on a dock in front of the ocean" and see if we can change styles. We'll turn the IP adapters off, so this should just do a quick 16-frame run.
Banodoco is just... yeah, it's a community. "Have I explored using LLaMA to remove objects in video?" No. Okay, that's better; the ocean looks cool. Yeah, Sora is going to be amazing, but I feel like it'll be laughably expensive to use. What happens if we add this in? Does he turn into a mall... Pixar? Let's try Pixar; he's already kind of Pixar. I notice the colors are a little washed out; I'm not sure if more steps will fix that, but we could try. Exciting stuff. I think diffusion transformers, or transformer diffusion or whatever it's called, is probably going to be the new hotness.

Oh, Kush: pexels.com. It's all CC0 footage you can use for input and for playing around. There's another big CC0 site too... come on, brain... Pixabay. I think Pixabay has videos as well, but always check the licenses and make sure they're CC0 before you do anything crazy; some of the stuff on Pixabay is CC attribution, so you have to give credit whenever you use it.

So it made crazy stuff in the background, but it didn't really infuse the style we wanted. Maybe we can make him Hugh Jackman... Pixar Hugh Jackman... kind of looks like him. "Do I concentrate on vertical video?" No, I like all aspects; it's just whatever my input footage is, or whatever aspect I want to work in. That doesn't really look like Pixar Hugh Jackman, but the dock and the ocean look great. I mean, it nailed the hands, right? Those look totally human and perfect... I like how the fingers are clipping into each other; the design is very human. I wonder if that's the canny fighting with the LoRA... oh no, they're barely in it; I think that's just the depth. Although we're not using the input footage anymore, are we? Our latent input to the KSampler is... yes, we are. Let's plug the original footage back in and see what that does; I'm curious whether he'll still be on the ocean or end up in a house. Again, it's not giving Hugh Jackman... although, did I turn the IP adapter on? Yeah, I did, so now it's fighting with the background from the video, because the video has a light blue background. It is working, though. I bet we get a lot cleaner edges with this, because it's got the dude; the edges are better, and the overall color is better too, since it's really picking up the input colors. The pure dream was very dark; I've noticed that with this whole model. "Make Elon Musk dance": all right, "Elon Musk disco dancing at a dirty dive bar". Hey, IO: workflows are on... I have some on GitHub, I just don't have them all there yet; the rest are available on my Discord, which is in the links at the very top of purz.xyz, the third link right there.
They're all in the resources channel, and they're all tagged by what architecture they're for, like AnimateDiff or SVD. I've got some mask packs and other resources on there too, things you can use to get started, and some fun open-pose animations. Come on... oh, is it done? Does it look like Elon at all? Did I run it? Maybe I didn't run it.

So my takeaway for HotShot is this: if you're trying to do video-to-video, this is probably the best way to do it fast. If you want to go deeper and do a lot more with it, I'd recommend sticking with AnimateLCM and SD 1.5 for now; all this SDXL stuff is still pretty new. But I hope this is a step toward the SDXL stuff becoming the new standard, or getting ready for SD3. SD3 looks really cool, I'm really excited to play with it, and I hope I get into the testing group; that would be sweet. Oh, that's pretty smooth; Elon's looking smooth. Maybe it thinks he's too young; let's try "40-year-old Elon Musk". I don't see the dive bar, though, just that blue background. It's interesting: I'm just not getting a lot of control out of this HotShot setup, and that's why I'd say maybe stick with AnimateDiff, where you can do way crazier stuff, and mixing IP adapter with the prompt works better at the higher CFGs you get with AnimateLCM, like 2 or 3.

What we sacrifice for the speed is the width of our CFG, and CFG is the promptness, the measure of how closely it follows the prompt. These Lightning models are distilled down from other models into a single CFG stream, so they reach a result really quickly, but there's only that one result at that seed; the rest has been omitted, distilled away. If you want more variety, more room to be ambiguous with your words, you want a model with a wider CFG window. The regular models we use at around 20 steps sit at about CFG 7 and are usable from roughly 3 up to 15; distill down to AnimateLCM and you've got a smaller window of about 2 to 4; and with this, it's just one CFG of 1, and if you raise or lower it, the output looks like crap. So if you don't care about this level of speed at this resolution, AnimateLCM is definitely the thing to look at, and check out the Inner Reflections workflow from last session; that's what this one is loosely based on, but as an AnimateDiff-style thing.

That's pretty much all I wanted to cover with this HotShot workflow, so I'll leave it here for today, and next time we'll dig into something else. This is really cool, and if you want really fast results, this is the way to go: grab the workflow from Banodoco. If you want to try the upscaling, I highly recommend these settings; I'll leave them up so you can screenshot them. If you don't use something like this, the upscale will either be way too busy or way too smooth. The range here is about 0.75 to 0.82 for the noise weight of the FreeNoise; feel free to play around. You see all this noise jumping around the edges? That's the issue: without injected noise the people and backgrounds come out unnaturally smooth, but overdo it and you get that stupid soupy noise in the background, so the weight is a balance.
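A toy illustration of that noise trick: start from an "empty" latent and blend in a layer of noise at roughly 0.75 to 0.82 weight. This mimics the idea behind the sample-settings noise layers, not their exact implementation:

```python
# Inject noise into an otherwise-empty upscale latent so detail survives
# without the background turning to soup. Weights per the range above.
import torch

empty = torch.zeros([16, 4, 144, 256])  # e.g. a 2048x1152 SDXL latent batch
noise_weight = 0.8                      # sweet spot is about 0.75-0.82

latent_in = empty + noise_weight * torch.randn_like(empty)
```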
So yeah, that was the HotShot four-step Lightning workflow. If you enjoy what we do here, hit up the Patreon, hop on the Discord, and join in. I'll be holding biweekly hangouts on the Discord soon, where we'll just sit and make stuff together, not on stream, so everybody can talk, hang out, listen to some tunes, ask questions, and work through whatever workflow problems you're having; we'll sort it out together and try to make the community more vibey that way. I'll be hanging out more in the hangout area, so if you've got questions, feel free to pop by the Discord and ask; if I'm not around, there are other people who can probably help you. Come on by, enjoy the resources, and join the community.

Thank you, everyone; we'll see you really soon. I'll be back probably tomorrow, Monday at the latest, with some new AnimateLCM revelations I've come across that'll help you get really great-looking animations out of AnimateLCM, AnimateDiff, and the t2v motion model. I really like the idea of text-to-video not always needing input video, so we're going to dig in hard on that: like the Inner Reflections stuff, but based on text-to-video, making up your own material from prompts instead of always needing input footage. It'll be really fun, unless something else crazy comes out, in which case we'll probably cover that instead. There's a lot of new stuff to get to: FreeControl, the motion-brush stuff whose name I can't remember right now, and MotionDirector for motion LoRAs (we'll be covering using motion LoRAs and some of the motion LoRA repositories that exist now, not training them). There's so much to cover and not enough time, so I really appreciate all the time you spend with me. We'll see you really soon. Thanks, everybody, and have a great afternoon, evening, morning, night, whatever it is where you are. Bye-bye!
Info
Channel: Purz
Views: 1,138
Id: 6IcF0WxTG3A
Length: 77min 21sec (4641 seconds)
Published: Sun Feb 25 2024