Stable Video Diffusion /w Temporal Depth Control in ComfyUI

Captions

If I'm live... I might be live. Hello, [Music] friends. Hopefully we're live. I feel live. OBS has been having some trouble with the connection lately, but I think it's live. All right, cool.

Hello everyone, welcome back. Hope everybody had a good holiday. I figured I'd do a stream before New Year's, because lots of cool stuff came out in the last few days, last few weeks, whatever. I'm not going to cover it all tonight; I'm just going to cover the SVD temporal ControlNet. Some of you might have seen Stable Video Diffusion and what it does, where you take an image and bring it to life, like Runway or Pika. Someone very smart, CiaraStrawberry, figured out how to use depth maps to guide SVD, and then Kijai wrapped it up for ComfyUI, so now we can use it in Comfy. The link to the download is in the description below if you want to install along with me.

I will just go through the two things I had to do to get it to work, because if you're trying to use it and it doesn't work, this will probably be why. I ran into a bunch of problems when I started to run it, and that's because the Python environment that comes with Comfy had the wrong version of diffusers. So I had to go to my command line, go to Comfy, and inside the Comfy folder there's a folder called python_embeded, and in that folder you will find a python.exe. That python.exe is the one you need to run pip install on to make things happen.

Let me pull up the command real quick so I don't get it wrong. I got all this help on the Banodoco server, as usual. If you're not on there, get on there, it's great. Okay, in the Stable Video Diffusion channel... yeah. So: python.exe -s -m pip install -U diffusers. I'll just paste that into the chat here for everybody, just in case you need that for when you're running it, and basically what that
will do is it will update diffusers and allow this custom node to load, so when you load it up it'll get rid of the error.

This one isn't in ComfyUI Manager, so you need to install it manually. You can do that by git cloning into the folder, like so, in ComfyUI's custom_nodes. So you git clone and then paste in the URL to the thing, the ComfyUI temporal ControlNet, and it will put the thing in place. Then you can reboot ComfyUI and it will install. If it fails, it's probably this other command you have to run, the diffusers command, so run that and that'll get you all up to date.

Then if you look in this folder here, in the examples folder, there is actually a JSON file that you can use to load it up in ComfyUI, and it's basically this. I've extended it and done a couple of things with it, but I'll go over what I've done now.

All right, cool. So you get this workflow up and running, and this node here is the one that will fail if your diffusers is out of date. If that's red and it's complaining, and you go the usual route of going to the Manager and then Install Missing Custom Nodes, it's not going to be there. It's too new, I guess. I should mention this is very new, so there will probably be a proper... I forget what the word is... a proper version of this in Comfy soon, but right now we've got this hack.

So what this does is it takes an input image, or video in this case, and puts the subject on a green screen using Robust Video Matting. Then I just resize and crop the image that goes into the depth maps to match the input width and height here. So set your input width and height, set the video that you want to upload for the input, and tell it what frame you want it to start on, and it'll do 14 frames, as is SVD's nature. You prompt it, and then what it'll do is take the model, it'll take the first frame, and it will
do a generation based on that. Then it will use this generation to feed into SVD to animate it, using this depth map as the source. Pretty straightforward, pretty cool.

We can watch it in action here. Let me grab a video from Pexels. One person in the frame tends to work better, but it's not too bad at multiples. That looks like a person walking; yeah, that should work. So just download the file and then load it in Comfy. I've been noticing this thing with Comfy lately: you have to go away and back to the video for it to show up. I don't know why, I don't know what's going on there, it's driving me crazy. And then the only thing I want to do here is, in my crop node, pick top-center, because I want to try and get her face in there. So let's see if we can change her into Mr. Bean: illustration, pencil crayon.

So now it's cutting the background out of the video and grabbing 14 frames starting at frame number 20. Then it's going to crop it in the center to this resolution. There we go, there's Mr. Bean, and now it is rendering. If you want to retain the original background from the image, you don't have to use Robust Video Matting; you can just bypass it. If you don't have this node, you can get it in the Manager: video matting. It's cool, it can pull the background out of anything and put the subject on the front.

So now it's made the first frame and fed the first frame into SVD. We don't have an output of the depth map yet, but it's going to use the depth map from this animation and try to guide this. Hopefully we'll get Mr. Bean walking, holding a cell phone. Not bad. Now it's going to interpolate that, give us a nice interpolated output, and then it's going to just slap them all together so we can compare it to the original input footage. It's pretty impressive. Now, I noticed the footage
is in... no, it's not in slow motion. Kind of is, though, by the time we get it here. So when you load a video with the VHS nodes, the Load Video node, you can actually skip frames. This select_every_nth means: if you have it set to one, it'll do every frame; if you have it set to two, it'll do every other frame; if you have it set to three, it'll do a frame, skip two frames, do the fourth frame, skip two frames, do the seventh frame, so on and so forth. The idea is you can make the video twice as fast, three times as fast, four times as fast by skipping frames when it's loading them. In this case, because our interpolation is actually like three times slower, if we set this to three we'll get three times as much, in the sense that three times as many things will happen in the time, but there'll be fewer frames in between them.

The JSON workflow comes with the SVD temporal ControlNet thing; it's in the downloads below. You just go into the examples folder and drag that JSON file in here. The only thing I added was the Robust Video Matting; the rest of this stuff's all built in. So let's try it at three times the speed, because we're interpolating here three times. Where's our RIFE? Yeah, multiplier by 3. So if we do this, we should get more video, but it's way faster. But because we're interpolating the frames in between, it might split the difference here. We'll see.

And if you want to see how to add the Robust Video Matting node, I literally just loaded it after the load video: image to video frames, then image out to the Video Combine. You just pop it right in between, and if you don't need it, just bypass it with Ctrl+B. The only other node I added was the crop. The logic's kind of confusing, but we're getting our width and height from the top here, so we have to feed that into both of the
image resize and image crop, and then you just have to play with it until you get them in the right positions. You don't actually need to crop; I just wanted to so I could get one-to-one stuff in the videos, otherwise stuff gets squished when you just resize. But there's also, on the resize node, a keep proportion true/false. If you keep proportion, it'll resize it and maintain aspect. That's what I used before the crop, so that it would keep the aspect ratio but resize it to the size. So: get the width, resize the image, crop it, drop it into the other stuff. That way your three images are the same width and you won't get an error when they all combine. These image concatenate nodes are really picky about the videos being the correct width and height.

Yeah, that's pretty cool. Mr. Bean looks good. So if we remove the Robust Video Matting node and run it again, it should keep the background in place, and that might make a different type of animation. The reason I've been pulling the backgrounds out is because it makes the depth map focus more on the person. We'll see when this one's done whether having the background in... yeah, looks like it made a whole bunch of crazy stuff. So I think with this method it's probably best to either use pre-matted, pre-green-screened footage or just use Robust Video Matting. And it doesn't have to be green; green is just the default. If you go here, you can change this background color to any color, or I believe any hexadecimal color code. So if you want it to be black, white, purple, red, whatever color doesn't clash with the footage, I guess, and that'll help the depth map understand the background better. Otherwise it looks like this might just be nonsense, but it's hard to tell until it's done, so let's just wait and see.

Oh, I should mention, this process requires
12 to 16 gigabytes of VRAM. It's usually hitting around 13 gigabytes on my machine, so I would say probably a 16 gigabyte minimum of VRAM to make this work. Just fair warning: if you're trying to get this to run on something with 8 or 12, it might still work, it just might be slow because it's swapping to system memory. But the actual process doesn't really take that long, so it might not be that bad. But yeah, fair warning, this is a very, very VRAM-intensive process.

It actually probably can... the problem with Marigold depth, as you probably know, is it takes so long to render.

Well, I'm not actually too upset at how this looks. It's nice to have the background information; you can actually see it represented in the background. But I think the background is taking on that pencil crayon thing, so maybe I have to describe the scene more now that I have the background added. So what if we have Mr. Bean holding a mobile phone, standing in a concrete building? Will that help the background, possibly? Oh, did I... yeah, I unchecked it. Anyway, let's cancel that, let's uncheck you and run it again. Oops, just the one, please. So now that I've added a background to the prompt, will it... yes, it did actually affect the background. That's pretty interesting. So I guess it's really about what you prompt to get that first image.

And if you don't like where it's at, you can cancel this at any point while SVD is running and just rerun with a new prompt, new whatever. You don't have to wait for it to finish. If you don't like the generation, you can just kill it: this here, View Queue in the side, you make that big and then you'll see your tasks here, and you can just cancel it.

Sometimes it's really good at backgrounds. Marigold depth is really cool, it just takes forever to generate, especially for videos. It's fine for an image, but once you have more than 16 frames, you [Music] know. Nice. Love the eyeball
but, you know, whatever. So let's try a different input video. I'm mainly looking for someone in the center, easy to crop, interesting looking. That'll work, I think. Yeah, let's try that video. And again, if it doesn't show up, just go over to another video and then back and it'll show up. I don't know what's going on with that; it's very annoying.

All right, should we have Mr. Bean contemplating? Sure, thinking about life. See if we can turn this into a concrete background. Oh, it did. Okay, actually, before we do this, let's get off the animated model. Let me try the Realistic Vision one and go: Mr. Bean, portrait photograph, 35 mm lens, Fujifilm. See if we can get a nice realistic... nice. Damn. [Music]

Now I wonder if the speed is actually having a negative effect on SVD, because I'm pulling in the input frames at three times the speed, so it might not be the right speed for the motion flow in SVD. Let's see what this looks like. Don't know why I'm yawning; had a coffee, maybe I need another one. Why is he rubbing his hands together? Okay, let's get that back to one. It's getting a little flickery. Let's just try that again.

Yeah, either more RAM, or hopefully we'll find methods of doing this stuff that take way fewer resources. That'd be ideal: just make it possible to do it on six and eight gig cards. I think most people don't mind waiting as long as they can get their foot in the door and do it. I didn't mind waiting on my 3070 on the things I could do, but when you don't have enough VRAM it's just annoying, because you can't even try. Luckily you can go on RunDiffusion and RunPod and stuff. I think you get 50 cents an hour on RunDiffusion or something; it's pretty good.

I also have my motion bucket ID kind of high, it's at 100. Could turn that down a bit, we'll see. This should be smoother, because the input video will be slower, or more specifically, more frames. Nah, it still just mangles the face. I wonder if there's anything I can do about that now.
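
A rough way to sanity-check the speed question raised here: how fast the result plays relative to the source depends on the Load Video node's select-every-nth value, the interpolation multiplier, and the frame rate the final video is written at. The sketch below is only arithmetic, not part of the workflow. The function name is made up for illustration, the 30 fps source rate is an assumption (the stream never states it), and it assumes the interpolator outputs roughly multiplier times as many frames.

```python
from fractions import Fraction


def apparent_speed(select_every_nth, source_fps, interp_multiplier, playback_fps):
    """Return playback speed relative to the source clip (1 means real time).

    Hypothetical helper for illustration only; it is not a ComfyUI or VHS API.
    """
    # Each loaded frame stands for `select_every_nth` source frames,
    # so one loaded frame covers this much source time (seconds).
    source_seconds_per_frame = Fraction(select_every_nth, source_fps)
    # Interpolation turns each loaded frame into about `interp_multiplier`
    # output frames, which are shown at `playback_fps` (assumed approximation).
    output_seconds_per_frame = Fraction(interp_multiplier, playback_fps)
    return source_seconds_per_frame / output_seconds_per_frame


if __name__ == "__main__":
    # Assumed 30 fps source, 3x interpolation, final video written at 30 fps.
    print(apparent_speed(1, 30, 3, 30))  # every frame loaded
    print(apparent_speed(3, 30, 3, 30))  # every third frame loaded
```

Under those assumptions, loading every frame plays at a third of real speed, and loading every third frame brings it back to real time, at the cost of bigger motion jumps between the frames SVD sees.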
A last image? Last image... wait, did they add... no. Why give us the last image unless we can... can we use this to extend the runs? I feel like it's already been VAE decoded, so if we use this as an init image for the next run... I think it would actually be better to just do unrelated runs with fixed seeds. I am curious to see if we can turn that into absolute goo, though. Tempting. Why would I want the last image? I'm just trying something stupid here with width and height, to see if we can extend the run. Ah, screw it. 25, SVD XT, frames 25. Okay, so we're going to generate the first one with the depth and then we're going to extend it with SVD, but I think it's just going to turn into goo.

I mean, that should just run, right? Oh right, yeah, of course. I bet this is too much VRAM. Well, maybe not. Oh, 17 gigs of VRAM. No, that made goo. Why did that make goo? Oh, do I need to run a KSampler on this? Yes, of course I do. Dumb. Okay, latent, and then latent to here, and then positive and negative, and the model comes from here. Okay. We stitch these two together at the end; we have image concatenate. [Music] I don't know if this is correct, but it's funny. If I'm right, that should concatenate the two videos together, assuming this one isn't total garbage, which it might be. Let's fix that seed so we can run it again. Yeah, 16, 17 gigs to do this dumb workflow.

Oh, Jesus. Now why did that happen? Let's have a think. It's like the wrong kind of noise or something. Oh, yes, good call, good call. It's totally the CFG; it's supposed to be super low with SVD, right? Good call. I love when you Google the answer and your face shows up; it's like you don't even know the answer. [Music] What do you say, bae? Still going to be goo, because I had the CFG up way too high again. Oh right, 14 and 25, it's not going to work with the two different lengths. Oh, concatenate is not the right
thing I want to do anyway. I actually want... is it combine? Yeah, Image Combine. So we're going to take the first run out of here. Where's the first run? There it is, first run, and then the second [Music] run, and then combine that into this, right? Yeah, sure. All right, let's try it.

It's crazy how it runs again and again with the fixed seed on SVD. I guess that lends credibility to the idea that the motion bucket IDs are not specific motions. I have another crazy idea if this works. I wish we had loops in Comfy, but yeah, I have a terrible idea, and unfortunately I think it'll work. [Music] So in this method we're extending it. Yeah, it's looking better. Yeah, it was definitely the CFG. eay cryptic [Laughter]

Okay, so instead of using SVD to extend it like this, what if we... okay, I'm going to leave this, I'm going to unplug these, I'm going to delete all this lovely work we just did, because we don't need it; it was just a testing thing. Okay, we have the first 16 frames... first 14 frames of this video, right? Why can't we just do this over and over and then stitch them together? Grab 15, grab the next 15, grab the next 15. So I'm going to do three of them in a row with a fixed seed. I'm just trying to figure out the logistics of this. I don't want to duplicate all this stuff, but I think I have to. So let's think about it another way: why don't I make a stitcher? I will do the process several times, then we'll make a new workflow that just stitches those videos together. That's easier.

Okay, so 896 by 640, we're grabbing 14 frames, we're starting on frame one. For the first animation we'll be skipping no frames at the beginning. I mean, we technically already have it. I'm just going to save this workflow and delete some of this stuff. I don't need this anymore. Don't need... yes, of course I need this, that makes my thing. And then we're going to use this to stitch together the final video, so I'm going to just
move it down here for now. Okay, so if we think about it this way: we'll grab 14 frames, then 14 frames, then 14 frames, and 14 frames, and then I'll just make another thing that stitches those four together and combines them, and we'll see what that looks like. I think it's essentially like a sliding context window but without the context; the context is the depth map. This will probably make more sense when we actually do it.

Okay, so let's do our first clip. We're going to save these into a new folder so I don't lose my mind. We're going to call this Mr. Bean; we're going to put these in the Mr. Bean test. All right. And those are all the 10 frames per second renders, and then we'll stitch those together and RIFE them. Okay, save the output; almost missed that. They just added the save output button, it's pretty sweet. Music just got suddenly loud. All right, cool, you're good, you're good.

So we'll do our first run, and that's going to grab the first 14 frames, right? Did I miss something? Oh well, can I just bypass you guys for now? What... oh right, this is done. Okay, so let's check the output folder and see if it's there. Mr. Bean, there you are. Beauty. All right, so now we're going to skip the first 15 frames... no, we're going to set skip_first_frames to 14. I can't figure out the math. I think it's 14; I don't know if it's 14 or 15, but we'll see. So now this should run a new run. That's going to change his shirt; it is a fixed seed, so whatever. So let's get four of them and then we'll stitch them together and see if it makes a coherent animation. If it does, then we could think about building out a giant workflow for doing it all in one go, but it's kind of easier just to move the needle yourself and stitch together later. The nice part is we're going to get our context refresh every 14 frames, so the
smearing that happens as it's pulled across SVD will be sort of fixed every 14 frames. So this should be the next 14 frames coming through here now. Funny. Okay, cool, so this will work. So let's go 14 + 14, 28, hit it; 28 + 14, 42, hit it; and 42 + 14, 56. So we've queued them up, and now it's just going to run them one after the other, and when it's done we will have the thingies. While that's going, I'm actually going to build the workflow for stitching them together. That's the nice part about Comfy: you can run whatever you want while it's running, it just goes in the queue.

All right, we're going to need to stitch. I think I'm doing five, so let's set up five loaders. Let's use the VHS Load Video (Path) nodes; the Load Video node's been giving me problems lately. So you're going to need five of these. Okay, this is hilariously awkward, but it'll work, I think. Does that make sense? First video and second video are combined; those are then combined with the third video; those three videos are combined with the fourth video; those four videos are combined with the fifth video. And then we're going to add a RIFE node, that's our interpolation. We'll go three times, because the input footage is at 10 frames per second, so that'll give us 30 frames per second footage. And we're going to add a Video Combine node, and that'll do our final, and we're going to set that to 30 frames per second. We're going to call that final stitch: SVD final stitch, MP4. Lower this to about 18 or 19. This is your quality: the lower the better, but the higher the file size. Diminishing returns below about 14: much larger files for not much more quality. This hurts my whole brain. It's very hard [Music] to figure out the smart way to display all this, but whatever.

Okay, so if I go to AI, Comfy, output, and I go to my Mr. Bean folder, I want this one: copy as path, paste; second one, paste; third one, paste;
fourth one, whoops, paste, and there should be one more going. I'm going to set the final to ping-pong as well, so it goes back and forwards. All right, cool, that's all five. So let's get the path for this one: just right click on the file, copy as path, paste it into the corresponding loader. We can leave all the settings alone, they're all fine. We're going into RIFE, we're going to thingy it, ping-pong it, and render it.

Oh, it's a problem. Interesting. MP4 not work? Why does it need to know the frame rate? Oh, I guess it's 10. Okay, let's try the VHS Load Video node. That's weird. AI, Comfy, ComfyUI, outputs.

Hello khed, hey Filipe. We're playing with Stable Video Diffusion, using depth maps to do the footage. Right now I'm trying to see if I can extend the clips past the 14 frame limit we have, but Comfy is not playing nicely.

Output, Mr. Bean, where are you, Mr. Bean? There you are. Okay, Mr. Bean test one. All right, let's just see if it works. Okay, I guess the path loader one is broken right now, so that's fine, I'll just do this: VHS Load [Music] Video, one, two, three, four, five. All right, number [Music] two, and three, number four, and number five. Okay, and then one, two... nope, that's not right. Okay, [Music] let's disconnect all these. Thank you. Okay, that's all correct. So you're going to one, you're the second one, you're the third one, you're... oh, you're the fourth one, and you're the fifth one. No. Awkward. Okay, you, you, you, you... yeah, you're good. All right, cool, let's move you guys back down here. Okay: 3, 4, 5; 1, 2, 3, 4, 5; RIFE, Image Combine. What do you say, bae?

Oh, that's kind of cool, actually. I mean, it's working; it's doing what I thought it would. But, you know, bad: we lose so much facial detail as time goes on with the thing. It's probably not the best way to even be using SVD, but this is very interesting. It makes very cool, weird footage. Okay, let's [Music] yeah, let's
try it with... so I don't need you at all, so undo you, and then grab new footage from Pexels. Somebody walking towards the camera. Away might be interesting too, but I think it'll put their face on the back of their body, which is always creepy. That might work: she's actually facing the camera and walking towards it. Let's try [Music] that. Okay, maybe I'll pull the background out too. Let's try: a superhero holding a coffee, photograph. Oh, I see what I did. Cancel. Zero, let's go from the very beginning. Might make a male superhero; that's interesting. Let's see what SVD does with that. I love how it just changed the color of her shirt but kept the pattern. That's crazy.

So this one just works by generating one frame, the first frame of the video, with regular SD 1.5, and then using that frame as the input for SVD. So we have a lot of control here, actually, when it comes to what we want the video to look like with this method, because we're not tied to a conditional prompting system on something we don't understand. We're all pretty used to 1.5 by now, and we can get around it and use LoRAs and do whatever we want to do. So if you want to maintain character consistency and stuff, you can probably train your own LoRAs and pop them in there, and use IP adapters and all that stuff to get you towards the result you want for your initial frames before we even get into the SVD side of it.

That's pretty good. Let's try this one. Zero plus 14 is 14, 14 + 14 is 28, 28 + 14 is 42, and 42 + 14 is 56. All right. Okay, so we've just gone in a completely different direction with the face. This should be totally insane as the person's face keeps changing. Weird. Yeah, it's like Everything Everywhere All at Once: the person keeps changing and flipping. That zombie one might be cool. We can crop in different quadrants, so it should be possible, if the subject's off to the side or something, I think we can fix [Music] that. Wonder what it
does with, like, animals. We should be... right. It's doing a lot better job keeping the face really stable. It's crazy. I should put up a female superhero, probably. I'm going to grab a water while this finishes. You guys should stand up and stretch. You don't have to, I'm not your mom, but it wouldn't hurt.

All right, are we done? I think we're done. Let's stitch them. ComfyUI, output, Mr. Bean. Should probably change that folder name, since it makes no sense now. Okay, final stitch, 30, save. Let's turn the ping-pong off. Hit it.

So I had an idea while I was making coffee. I actually like this, this is kind of neat; I like the way it's sort of shifting between realities like this. But I had an idea: we're generating this image using SD 1.5, which means we can actually use IP adapters to get us closer to where we want to be. So I'm wondering if I can use IP adapters to really influence the way the person looks. How do we do that? Well, IP adapter runs through the model, so we're loading our Realistic Vision model here, and we need to load a, I think it's Apply IPAdapter... apply encoded, yeah. And then we get all the other nodes from here. IP adapter: we're going to get an IPAdapter model loader, and we're going to use the plus face model, because we are doing face stuff. Embeds: we're going to get an IPAdapter encoder, which is going to give us our image embeds, and we're using IP adapter plus because we're doing faces. CLIP vision: we can get a CLIP Vision loader. All right, so these two loaders can go together: plus face, and CLIP vision, SD 1.5.

If you need these models, or IP adapters at all: Manager, Install Custom Nodes, IP adapters, and it's this one, ComfyUI IPAdapter plus. You go to the GitHub by clicking on the link, and here are all the downloads you need and what folders they go in. It's all explained here. Just go here, read it, do it. You can ask on my Discord, but, uh, lots
of people have already asked, because it's all here. All right, and then, yeah, we could even try some masking, but we're not going to do that right now. Load image: a Load Image node. But between the Load Image node and the encode, we're going to prepare an image for CLIP vision; that's a node. Whoops, did I... yeah, okay. Prep Image For ClipVision. All right, image goes there, and with IP adapters I just like to put it on all four. Set this to like 0.85, 0.8 to start, because it's usually very strong. And now we can just pick a person. I'll just use my stupid face, I guess, so let me get my face. All right, and let's do that first video. Actually, this one may work. Okay, so we'll start at zero; skip first frames, zero. Now let's go: 43-year-old man, horn-rimmed glasses, gray beard. It's going to get all that from the image, hopefully. So if I'm right, when this runs it will really try to put my likeness on it, but you never know. Let's try it. I mean, not bad. You can see where it got most of that from. Pretty cool. Let's let that run and see what it looks like.

Cool. Okay, so I've just done four more runs, skipping 14 frames sequentially. Forgot to set that to center, whoops. Maybe I'll clean this up and just throw the workflow on my Discord, because this is confusing. Confusing AF, as the children say. Get depth frames... where are the depth frames going? Did I delete something? The depth frames going into the SVD... oh yeah, we're good. Oh, this is probably so we can see them. Don't save the output. Show me MP4s. 10 gaser. Nice. All right, that's pretty cool.

All right, let's get this cleaned up. So this is crop and resize. Where do you come from? Robust Video Matting. We're already using Set and Get, so let's clean it up more. We'll set a... we'll call you input frames. Okay, and then you're fine, and then we'll get it: Get node, input frames. Let's color you and you a nice blue color. That's not it.
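
The sequential runs mentioned here follow a simple pattern: each run loads 14 frames, and the skip-first-frames value advances by 14 each time (0, 14, 28, 42, 56 in the stream). A minimal sketch of that bookkeeping follows. The function names are hypothetical, and it assumes the skip value counts frames to drop and that every source frame is loaded (select-every-nth of 1), so skipping 14 starts the next clip on the 15th frame with no overlap and no gap.

```python
def clip_offsets(num_clips, frames_per_clip=14, start=0):
    """List the skip-first-frames value to use for each sequential run.

    Hypothetical helper for illustration; not part of ComfyUI or the VHS nodes.
    """
    return [start + i * frames_per_clip for i in range(num_clips)]


def frame_ranges(num_clips, frames_per_clip=14, start=0):
    """Return (first, last) zero-based source frame indices covered by each run."""
    return [(skip, skip + frames_per_clip - 1)
            for skip in clip_offsets(num_clips, frames_per_clip, start)]


if __name__ == "__main__":
    # Five runs of 14 frames each, as queued up in the stream.
    for skip, (first, last) in zip(clip_offsets(5), frame_ranges(5)):
        print(f"skip {skip:>2}: source frames {first}-{last}")
```

This also settles the "14 or 15" question under that assumption: the first run covers zero-based frames 0 to 13, so the second run should skip 14.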
That's not it. Oh, I want that color. That's annoying. All right, I thought these were supposed to go the right color, and whatever. How many have we got? Two [Music] more. So: input video, matting, that's the matting preview, input frames, yep. Okay, should probably put you guys beside each other. Matting down here, you go up here, the same size. Yeah, that's better. All right, cool. Frame load cap, input [Music] video. These are settings. Technically we could just pop the prompts here. Actually, maybe we pop... wait, oh yeah, you're good, lineer. What do you need the width and height for? Oh, for... wait, oh yes, it's all connected. Okay, I get it, this makes sense. Then that's all ControlNet. ControlNet, IP adapters, settings, maybe like that. Settings, depth frames. Cool.

Those are done, let's stitch them up. Cool. If I wanted to stitch the end onto the... could I not get the first frame? Can I add another RIFE here? No, I want Image Combine here, into the RIFE. I also want to add, from this one, the first image. Flip it around. If I'm right, that should stitch the first frame back onto the end of the video and make it a perfect loop. Oh right, that just made it three times slower than the other video, so for that the multiplier on this RIFE would have to be one, right? I think it's kind of cool in ultra-slow mode, though. What's fast mode? Okay, that should be regular speed. Oh, it's really fuzzy though. I need to get rid of this... go away, Steam. Okay, actually that one's kind of bad too. I wonder if it's just bad nature. Anyway, my point is, I guess I don't actually need to... I want to stitch them all together. I could stitch the final frame onto the last one before it goes into this RIFE, right? We just do this. Where are you coming from? No, you're supposed to come from here. Combine it with the first image. Great. I think I have to RIFE them, though. That's pretty cool. It jumps, though. See, I think there needs to be one RIFE before there,
might be wrong, but let's see. No. Do and do, then you two combine, RIFE and RIFE, and then combine, right? I forget how I did this. Let me look at my... oh, I don't have it anymore. Bummer. Forgot I lost all my node templates when I formatted. All right, but that's all fine. All right, well, this is cool anyway. With the IP adapters it works, so let's try [Music]... I'm trying to think of a character that's recognizable. Let's try Kermit the Frog. Want a headshot. That might work. That's probably the one. All right, let's try Kermit. All right, we're going to clip center, yep. All right, let's try a different input video. Should Kermit be dancing? Maybe. Where's all my dancing videos? Pexels... What's the best video for Kermit? Maybe this one. Yeah, this might be weird. Okay: "Kermit the Frog brushing his hair", and start on frame zero. See what we get for... oh Jesus. I mean, I'm willing to ride out the nightmare. See what we get. You just die on me, OBS? What the heck. Oh, okay. I mean, it has groups now. Not so great. All right, well, I mean, it's working. It's just nightmarish because I used, you know, Kermit the Frog instead of, you know, something normal. But yeah, I mean, this is actually working surprisingly well. I really like stitching them all together; it's pretty neat. I can't tell why it looks like crap, though. I guess the inputs kind of look like that. The RIFE isn't helping. Maybe FILM. Maybe this makes more sense like this. Yeah. Yep, that's true. Yeah, Mickey Mouse is in the public domain tomorrow. Only the Steamboat Willie version, though. Well, this is a butt-ugly workflow, but it works. All right, I feel like I want more of an interpolation back to that first frame, but whatever. Get that shabon back there. All right, cool. All right, let's find a good image and... oh, maybe RoboCop. Wonder if we can do RoboCop. There we go. No, I just want his head. Is RoboCop's head... good old RoboCop. We got two Comfys open? Oh, okay. "RoboCop brushing his hair, 35 millimeter photography". Oh no, I'm not trying to do face mapping, I'm just really messing
around. But yeah, just playing with IP adapters and depth motion control and all the fun stuff. Not bad. This should be nightmarish. Oh, weird. Oh, let's just get off the IP adapters completely, turn it off, and let's just try "Margot Robbie, 30 years old, brushing her hair". Yeah, that's true. Jonathan floff just tweeted and mentioned I might be able to use AnimateDiff as a finisher for these too. Might be worth investigating, especially with LCM. I think that's all experiments for another night, though. Yeah, SVD loses so much in the motion sometimes. Let's try this at double speed, but let's skip it ahead to like 64 frames in. Oh, what's the matter? Oh, why you mad, bro? Oh, can't just mute Robust Video Matting. Interesting. Oh, something's wrong with her nose, like when she pulls her scalp off her head. All right, well, that's pretty cool. I mean, this is pretty much all I wanted to cover tonight: just temporal SVD and loading up depth maps and getting that from the input footage. So yeah, if you want to play with this at home and you have more than 16 gigs of VRAM, or more than 14 gigs, this is a really cool workflow. I think I will try some experiments on my own with feeding this directly to AnimateDiff for image to image, but the trick there is to get good input footage on the way in. And with SVD, yeah, it's just losing the cohesion over the course of the depth map, which I kind of expected, honestly, because this isn't really how you talk to SVD. But you know, it's fun and it's cool and it's interesting, and I like smeary, weird AI art, and this definitely fits the bill for smeary, weird AI art. So have fun, play, do your thing. If you have any questions, all the setup is at the beginning of the video. And as always, if you like these videos and you want them to keep going, you can subscribe to my Patreon. You can subscribe for free just to get access to, you know, files and stuff, but
there's also a $5 a month one for, you know, just supporting, a tip-jar type situation. You get a special... I think I see a watermark. Oh, that's not good. You see that? That's a watermark. Interesting. That's not good. Anyway, yeah, I wonder where that's coming from. Hopefully not SVD. Weird. All right, yeah, so hit up the Patreon, hit up my Discord. Everything's at purz.xyz; all the links are there. Yeah, Purz loves you, and have a great night, and I'll be back next year, a.k.a. two or three days from now. Have a happy New Year, everybody, happy holidays, and we'll see you very soon. And yeah, we'll be going over more ComfyUI nonsense and doing crazy stuff and getting mental with the workflows again soon, but tonight I just wanted to take it easy and play with SVD and the temporal depth thing now that I got it working. So yeah, if you have trouble getting it working, hit up Banodoco and check out the stable video diffusion channel in there, or on my Discord there's people that have got it working. So feel free to ask in the help channel, and yeah, have a great night, everybody. Thank you for hanging out, as always, and we will see you very, very soon.
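
Editor's sketch: the stream renders the video in sequential runs of 14 frames (each run skipping 14 more input frames than the last), stitches the first frame back onto the end to close the loop, and notes that the interpolation multiplier changes the playback speed. The plain-Python sketch below only illustrates that arithmetic. It is not the ComfyUI workflow. The helper names `chunk_offsets`, `close_loop`, and `interpolated_count` are hypothetical, frames are stand-in integers, and the frame-count formula assumes an interpolator that inserts `multiplier - 1` frames between each neighboring pair, which may not match the RIFE node exactly.

```python
def chunk_offsets(total_frames, chunk=14):
    """Skip values for sequential runs of `chunk` frames: 0, 14, 28, ..."""
    return list(range(0, total_frames, chunk))


def close_loop(frames):
    """Return a new list with the first frame appended to the end."""
    if not frames:
        return []
    return list(frames) + [frames[0]]


def interpolated_count(n_frames, multiplier):
    """Frame count after inserting (multiplier - 1) frames between neighbors.

    Assumption: this mirrors a typical frame interpolator, not a specific node.
    """
    if n_frames < 2:
        return n_frames
    return (n_frames - 1) * multiplier + 1


if __name__ == "__main__":
    # Five runs of 14 frames: the first run plus "four more".
    print(chunk_offsets(70))

    # One 14-frame run, looped by re-adding frame 0 at the end.
    looped = close_loop(list(range(14)))
    print(len(looped), looped[-1])

    # At a fixed frame rate, more frames means slower playback.
    print(interpolated_count(len(looped), 3), interpolated_count(len(looped), 1))
```

Under these assumptions, a multiplier of 1 leaves the looped clip at its original length, which is why the multiplier has to drop back to one to get regular speed.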

Info

Channel: Purz

Views: 1,584

Id: qXZSAMwaGrw

Length: 109min 40sec (6580 seconds)

Published: Sun Dec 31 2023