ComfyUI AnimateDiff + OpenPose & ControlNet

Video Statistics and Information

Hello, hello, YouTube friends. Let me make sure all my settings are right here. Desktop audio off, everything cool. All right, welcome. Here we are, back again on YouTube and back in ComfyUI.

I've been having some fun today with OpenPose, so I figured I'd keep the fun running and we'd do some more OpenPose extraction and inference. Earlier today I made a video using ControlNet OpenPose. I'll drop the link, because I don't want to play this song on the stream and get my stream copyright-struck, but you can see it there; I posted a link in the chat. And JDownloader... well, actually, we'll need that later.

So basically I extracted the pose from a video of some awesome dancers dancing to Missy Elliott's "Throw It Back", and then put myself in there. We did that with IP-Adapter. This is literally just four pictures of me and an OpenPose skeleton. As you can see, we just loaded up four pictures of me into the IP-Adapter. Sorry, I totally smashed my workflow apart in the last few hours trying to get this all working.

Then I took the OpenPose information from Beyoncé's "Single Ladies" video and said "three 40-year-old men dancing on stage in unitards and pantyhose wearing high heels", just like the video, and here we are, pretty much nailing the vibe. How do I look? Pretty good. So as you can see, there's a lot of power here. Is it perfect? No. Is it amazing? Yes, hell yeah. All the single ladies, all the single ladies. It's even sexier in interpolated motion; it's all slow-mo.

This is only rendered at 768 x 512, I think. Yeah, 768 x 512. So if you took longer and rendered it at a higher resolution, you'd get way better results. And again, this is just OpenPose as well, no other
ControlNet, so it's pretty cool. I figured I would just walk through the process of extracting OpenPose data from a video and then making a video from it.

So let's open up Resolve and open up project settings... no, project manager, and create a new project. Hey Frank. I got some footage earlier of someone doing a rings routine. Oh, but it doesn't work in Resolve, does it? No. Okay, then let's do something else. Let's do the Napoleon Dynamite dance; that'll be perfect. Download... nice, 480p, this will be fast. It's two and a half minutes and we don't actually need that much of it, so let's cut it up. Let's go to where he really starts moving and grab that whole section. Use a blade to cut these, and now we have our little clippy-whippy. Let's zoom in to get more Napoleon, and let's just make it 1080 by 1080. Then let's get him centered and make sure all the body parts stay inside the frame the whole time. Perfect.

Now we're going to extract the skeleton information from this video. All we do is load a video, use the DW preprocessor, and then save it. That's all you have to do. So we'll go to our video, which is in my video folder. Napoleon, yep. Right click, copy as path, and paste the path into the video path loader. Frame rate of 24, no frame load cap, we're going to select every single frame and start on frame zero. Image goes in here and image goes in here.

We're using the DW preprocessor, which is DW OpenPose, which is slightly better than regular OpenPose. Well, a lot better, actually. It's slower, but it's better. So you just add that at the beginning of your project and run it, and when it's done, just cancel your project. Another easy way to do that is to just have an
image load cap of one on the ControlNet later, so that when it's done it only has to diffuse one frame. So yeah, hit it. That's the node setup there, if you're confused about how to do an OpenPose extraction. This workflow is already available on my Discord in the resources channel if you want to grab it, play with it and follow along.

So I just rendered out a square video, and we're going to render out the square frames of Napoleon dancing. The reason I send it into an image save node with this OpenPose thing is that the DW preprocessor takes forever to work, so we want to save what we make. That way, if we want to make a second video, we don't have to run this again.

So maybe we should get some pictures of who we want to do the Napoleon Dynamite dance. Google... let's do Mr. Bean. Let's get four good pictures of Mr. Bean: that one, that one, that one and that one. Perfect. More Bean than anyone can handle. Hope you like Bean.

So basically we're using this thing called the "encode IP-Adapter image" node, and you can encode up to four images into it. That's going to basically be like our quick LoRA. It's going to look at these images with CLIP Vision and use IP-Adapter to push our prompt, or model, towards what it sees, towards Mr. Bean. You've got the four weights here, and I haven't played with them much yet, but we were doing some interesting hybrid stuff earlier, like a picture of a lion and a person and a couple of other animals, and you just get these creepy animal people with weird fur. It's really weird. And again, that's just four pictures of me doing the Single Ladies dance, looking absolutely stellar while I do it. I think I need some leotards. Good look for me, apparently.

Oh, let's go, DW. I wasn't kidding,
it takes forever. Go away, Steam, if I can find you. One of you is Steam... there you are. Bye-bye, no more messages please. Quit, close, and all the things that I'm not using. Okay, let's go, DW.

As always, I don't play music on my stream, so if you want to put some tunes on in the background, please do so. I'm not trying to get these videos copyright-struck, because there's useful information in them every now and then and I don't want them to mute me. Not all the time; in fact, it's a lot of not-useful information.

Can... no, OpenPose only tracks people. If you wanted to track a spinning object, you'd have to use some combination of depth and normal, and maybe segmentation.

There's no way that's how you spell Napoleon. I must have spelled that wrong. Also, my favorite part about the DW preprocessor is that there is no progress bar at all. It's just going till it's done, and it doesn't tell you when. While this goes, we could always make more masks in Blender. Come on, DW. Desktop audio is off? Yes, okay.

I should check how long this is. Actually, it's only 10 seconds. It's only 240 frames and it's taking that long. I guess it's 300 frames, but still, let's go. I don't even think Single Ladies took this long. I love the OpenPose animations; they are hilarious. It's so crazy that we can just get basic motion capture from video on a consumer computer in less than 20 minutes. Come on, DW, do your thing. Still using eight gigs, so it's still doing stuff.

Hey Food, what's up, buddy? We're going to make Napoleon Dynamite dance. Did you see my Single Ladies? I think I nailed it. Wish I could play the song.

Oh, it's done. Nice. Now we're saving all the Napoleons. We're going to make Mr. Bean do the Napoleon Dynamite dance now. There we go. How are we looking? Not bad. Cool. All right: "Mr. Bean dancing on stage, cinematic lighting". We've got these input images; they're all prepared for CLIP, centered and so on. Let's get
that running, and let's just do a quick batch to see what it looks like. We need to load the thing we just generated, which is in our output. Oh, I saved them in an OpenPose folder. OpenPose, there it is. No, "my head" was just for pictures of my face. Okay, and we're going to load 16 frames just to see how it looks, at 512 x 512. So this should just be a quick 16-frame Mr. Bean, nice and fast. Nice. All right, I like it. Let's let it run. How many frames is it? 251. Easy. It'll take like 10 minutes, maybe 20. Let's get them going.

What are some other really iconic dances? I'm trying to think, because there's Singin' in the Rain, obviously. Oh, 11 minutes, not bad. Yeah, RunPod. Oh yeah, Elaine, that's a good one. Consumption on YouTube has videos on how to get Comfy working on RunPod for AnimateDiff setups, so check out Consumption on YouTube; that'll get you going. Consumption does workflow videos like me, but they're fast, to the point and tutorial-shaped, whereas mine are long, rambling info dumps for five hours straight. Slightly different approach.

I'm wondering what we do for the face, because the face is always a little messed up. I wonder if there's an inline face fixer we can get going on this thing eventually. I feel like it would have to be in the decode space, though, and I feel like it would take forever. Maybe it's just a combination of IPA and OpenPose and nothing else; maybe with depth or Canny or something we can get the faces. We'd have to start stacking ControlNets to see if they're better or not. Could be an interesting test. Let's try depth.

Wait, my brain hurts. I'm just going to turn this whole thing off for now and rebuild that later. All right, you're connected to that, you're connected to that... Sorry, I've confused myself quite a bit here, but I think we can get there. So basically, the way you can stack ControlNets is by running
them through the CLIP on top of each other. So right now I'm just trying to stack a second ControlNet, and then instead of the DW preprocessor we'll use... let's do MiDaS; I think it's best for faces. This one's OpenPose and this one's depth. If you wanted to go straight from source video, you could run both of these preprocessors straight into the ControlNets and just let it go, but I like this one step in between where I'm saving them and then loading them up after, because then they're reusable. I've got to find a way to keep them together, though, because I'll forget which ones are connected.

So we have that. We're going to grab the depth maps on the next run. Let's use the other one. Then we'll load them into this one. We go CLIP out, positive and negative: positive to positive, negative to negative, and then out of this one, positive into the KSampler and negative into the KSampler. So it's IP-Adapter, CLIP, ControlNet, ControlNet, KSampler, which is just the whole thing anyway. We run the IP-Adapter into the checkpoint. We aren't loading a LoRA right now, but we could later. Take our CLIP into CLIP skip, load our VAE into the decode, run AnimateDiff in between, set up our settings, run that into the KSampler, and then run our CLIP into ControlNet to grab any guidance we want on top.

I leave my seed on randomize, because I'll forget to change it a lot. But if your computer's crapping out because you're doing too-large batches, you can use a fixed seed on the batch runs to run them in chunks: you do 96 frames, then 96 frames, then 96 frames. Earlier I was doing a long animation, like 800 frames or something, and it just kept crapping out video-memory-wise, so I was breaking it up into 240-frame chunks and it was a lot easier to deal with. Fixed seeds are great for that. Yeah, up
to you. It's whether you're trying to do the same run or trying different stuff every run. I usually like to try different stuff every run. That's a philosophical thing, I guess. I'm not really chasing a result; I'm just letting it happen, whereas I think some people are definitely chasing a result and trying to get something specific out of the AI.

Yeah, it will, but only within the run. It just means it'll be the same run next time if all your settings are the same, because this seed is for the entire animation; it's not a frame-by-frame thing. Two minutes. That's the interesting thing about AnimateDiff: it does it all as one big chunk. It's not like Deforum, where you're flipping seeds for every frame like a flip book. This one's using the same seed across the entire animation. I mean, the context window might be jumping seeds; I'm not sure how that works exactly, but I really think that's just for the motion model. I'm not a scientist, though. Happy to be proven wrong if somebody knows how it actually works.

"Any ControlNets you've had luck with for flickering?" I mean, depth is great. They're all great. Do you mean flickering like your depth maps are flickery? What's flickery? I'll turn my camera back on once I've stopped this yawning fit. Which ControlNet are you using? Oh yeah, no, that's just going to happen. That's really hard to deal with these days.

Here we go. Right on, looks good. Love the way the stage lights are moving around. Maybe IP-Adapter is the solution then, because his suit is staying pretty coherent, and that's just OpenPose.

So let's grab the depth maps like we were going to last time. We'll set this to one. Oh, let's think this one through: these guys both need their image load caps to be the same every time. Let's add an integer and plug that into both. We'll do one frame. Yes. All right,
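
The chunked-run idea described here (splitting a long animation into batches, each rendered with the same fixed seed) can be sketched in a few lines of Python. This is only an illustration of the arithmetic: `plan_chunks`, `FIXED_SEED` and the printed labels are hypothetical names, and the `skip`/`cap` pairs are assumed to correspond to the "skip" and "frame load cap" settings mentioned on the loader, not to any specific node's API.

```python
from typing import List, Tuple


def plan_chunks(total_frames: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split total_frames into consecutive (skip, cap) pairs.

    skip = how many frames to skip before this chunk starts,
    cap  = how many frames to load for this chunk.
    """
    if total_frames < 0 or chunk_size <= 0:
        raise ValueError("total_frames must be >= 0 and chunk_size > 0")
    chunks = []
    start = 0
    while start < total_frames:
        cap = min(chunk_size, total_frames - start)  # last chunk may be short
        chunks.append((start, cap))
        start += cap
    return chunks


# Hypothetical seed value; the point is only that every chunk reuses it.
FIXED_SEED = 123456

for skip, cap in plan_chunks(800, 240):
    print(f"seed={FIXED_SEED} skip={skip} cap={cap}")
```

With the numbers from the stream, an 800-frame animation in 240-frame chunks gives three full chunks and one 80-frame remainder. Whether separately rendered chunks line up cleanly at the joins is not something this sketch addresses.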
we're going to do a quick depth map pass. Well, "quick"; I don't know how long. 12 likes!

"Why are the lights not moving in the draft render?" Which draft render? Oh, you mean the little 16-frame one we did first. Because it didn't have time to jump context windows. The context window is 16 frames long, so every 16 frames it's going to shift around a bit. It's the nature of AnimateDiff.

On this pass we're grabbing the depth maps. That was fast. And we're going to run the depth ControlNet as well, so hopefully that'll make the background and the character and everything a lot stickier, and we'll try some 64-frame runs or something to test the settings.

I don't know enough about how the motion models work to answer that, sorry. Maybe there's a smarty-pants in the party who knows about that. I'm not quite sure.

These must be big-ass files. Oh, that looks so bad. Completely useless. Let's cancel that. No, no, I went against my better instincts. Actually, we can just do this: if you don't close your window, Comfy resumes its state when it comes back up, so you can always just kill Comfy.

Well, an animation with a 24-frame context would just be goo. If you change that number to anything but 16, you pretty much get goo.

All right, let's try this with the other depth mapper. Oh, can we force size to 512 by 512? Yes. Cancel. I'm just going to take my own advice and kill this real quick and start it again. All right, let's try that again at 512 by 512; maybe that'll be faster. And absolutely no indication of how long it's going to take. I love that. That was fast as hell, though. How are we looking? Even worse. Completely useless. All right, let's try Zoe. You ever want to watch somebody restart ComfyUI 65 times on
stream? I'm that dude. Okay, let's try Zoe, because that looks horrible. What do you say, Zoe? If I recall correctly, Zoe takes forever.

"TemporalDiff version one can understand 24." Oh, that's cool. That has not been my experience with changing the context. It's probably because I use TemporalDiff for something... I know you use it too. That's weird. How long does Zoe take? Hopefully not super long at 512.

You know, I made an animation for Looking Glass the other day, and it might actually be a cool AnimateDiff animation too, because I can get real depth maps for it from Blender. Maybe we'll try that after this: we'll try to convert a Blender scene to AnimateDiff and see how it goes. I can fight with getting depth maps out of Blender again.

How are we doing? How does it look, most importantly? AI, Comfy, output, depth. Okay, this is what I was hoping for. That's more like it. Having this as a backup layer underneath the OpenPose should hopefully be just what we need. So that's how many frames? AI, Comfy, output, OpenPose: 251. Okay, and we want depth; this is the folder. We're going to skip zero, select every one. Yeah, it's fun, eh? I get to see all the submissions for the contest tomorrow. Cool.

So we'll be loading from depth, and we'll be using depth at a strength of, let's try, 0.8. And let's do a render length of 16 just to see. Okay, so we've got OpenPose and depth. Let's see how it affects it, see if it helps with the face. Yeah, Zoe is the best; it's really good on AD animations too.

Nah, his face isn't better, and the arm is totally messed up too. I don't know if it's better. Maybe 0.5, just to reinforce it. His face is, I guess, a little less flickery, but I don't like how the arm just kind of clips into his throat. It's kind of gross.

I broke my whole switch selection for the FILM interpolation thing. I can't figure out how to do a switch where it'll send it to two different places. I can only find switches that'll
take something in from two different places, which is a totally... it's fine, but yeah, whatever. It's really hard.

Okay, if I remove this, does the arm still clip in with just OpenPose? I grabbed toyxyz's stuff, and I've got to figure out how to get that body onto the Mixamo thing so I can render out the OpenPose animations from it. That's a little better. You can see how the depth was blacking out, or cleaning out, the background. More research is needed.

All right, let's open up Blender, open up that scene, and see if we can find a way to export it. I'm curious now. This is the scene; this is the Looking Glass thing. Let me turn on my Looking Glass. My screen might go blank here for a second. There we go. Switch on the little Bridge... might have to restart Blender, I forget how that works. Oh no, there it goes. So then you get this little viewer thing, so you can see how it looks on the Looking Glass and tell if your depth is way off, if you don't have a Looking Glass connected. But then if you click this guy... oh, it doesn't see it. I think I have to restart Blender for it to see it. It's not a priority right now, because I've got this, which is fine.

I'll show you what the scene looks like with Cycles. So that's what it looks like on the Looking Glass. I can't preview it in Cycles because it has to render 48 views at once to do the other render. But if we take all the elements of this, we could probably make a really cool AD animation. So let's look at it. First off, I'm going to save this as a new project so I don't ruin the other one. I think we'll just get rid of the Looking Glass camera. Maybe get rid of the N panel.

Oh no, we need toyxyz. I should have read the README. "Select target bone." Oh, I feel dumb. How do you look at bones again? There they are. The spine, I guess? Nope. Oh: select target bone, select OpenPose bone. Okay, let's look at the README. Oh, do you
have to put your own... no, that might be too stupid for this. "So you can render the images you need for multiple ControlNets." Yeah. "Need to install the OpenPose attach add-on." Oh, maybe I don't have that. We can do this. We should do more about rigs, because basically I just want to turn this into the OpenPose guy. Do I have that auto-align plugin? Nope. Let's see if I can find that OpenPose attach. Did I not download that Blender file? Where's the damn preferences? Downloads... maybe I didn't download it. I don't even see this thing. Welcome to my brain; it sucks in here.

All right, let's figure this out. It might be harder to attach it to an existing model than I thought. That's too bad, because it'd be really good. We can do it the old-fashioned way, though: just render it. Let's add a camera, and let's do 1080 by 1080. Let's move that camera around a bit. Let's try that.

Yeah, no, the thing is that plugin has an entire rig for OpenPose, so you could get perfect OpenPose animation without having to estimate it. But I have to learn how to use it before I try to use it on stream, that's all. Total waste of time. No, I'm just kidding.

Okay, but I'm curious. We go back to Eevee. Actually, I think we can go to Workbench, because what I really want is the depth maps. Camera... come on, brain. No, we can go back to Eevee. Get rid of the light. Where's that light? There you are; get out of here, bud. HDR, okay, cool. Compositing. Where are my freaking cubes, man? Where are the cubes? It's a perfectly valid question. Why don't you all show up, though? What is going on here? So weird. Is there something in the way? No. What's going on with my depth pass? Oh, I know what's going on. The Mist pass should be there. Okay, whatever, let's go back. It's fine, whatever. That's
better. Okay, so that's our depth. I wish Blender's compositor wasn't a huge pile of trash. I love Blender, but this compositor just makes me absolutely crazy. How do I adjust my depth map settings? Because these are crazy.

"Yeah, for you it's all right." It sucks for AnimateDiff, though, in my opinion. Blender depth map manipulation... interesting. Nah, it's not what I need. Hey Daniel. I'm just feeling really stupid right now; we just did this the other night.

Let's try Color Ramp and see what we get. Come on... there we go. Awful. Okay, so that's not the move. And Normalize... ah, hello, that's what I wanted. Color Ramp adds too much contrast between the frames sometimes. I think this is backwards. Sorry, one sec here, just trying to get this right. No, that's that, but then I guess it's just really close to the screen. What if we give it some more... ah, bingo. Close. There we go. I think that should do it.

Ah, the orbs, though. I need the orbs to have some love; they've got no depth, man. Now the orbs have depth but the person doesn't. That's going to be a hard line to follow. Nah, this ain't going to work. Well, that's just a bummer and a half.

Yeah, this has been my experience. I haven't been able to get better animations with it. The detail's better but the lighting sucks. Not a good tradeoff. I feel like... am I doing something wrong here? All right, whatever, we can try this. If we've got that, the arm will be depthy. All right, screw it, let's dump these depth maps, baby. Temp, depth, kg runner. They'll be black and white. Easy peasy.

Oh, maybe if I just move this... this is kind of hacky. How far back do they need to go, really? It's weird, right? Am I dumb? Why would that be the case? Was it because of the material, I wonder? Couldn't be. I just can't get those a little darker. That's weird, right? You'd think it would work. Oh wait, are we in orthographic
view? Shouldn't be. Oh, we're in perspective. What the... I don't get it. Screw it, let's see what it does. We've got our depth maps.

We could do one where... yeah, this is this, and then the background is... right, and then delete. Why is it that color? Okay, we need even light, even simple light. Does Sun mode work in this thing? Not well enough. Okay, so it needs to be just a straight-up background. And then we'll just throw... so what about instead of that we do... awful. Lights should be able to get it there. There should be enough.

All right, let's try running that into the OpenPose as well. OpenPose render, cool. Let's drop that in. We need to preprocess it with DW OpenPose, and we already have depth maps, so we don't have to do them. We set that to one for this run, then we'll load up the depth maps there. Nice.

Well, what did I do wrong? OpenPose, Looking Glass, runner... is that not right? Oh, "load video path". Derp. I want to load an image sequence path. There we go, image into here. Yeah, if it's an image sequence and not a video, you need to use a different loader. It's only 47 frames or so; should be fast. We've got our depth maps, we're loading our DW preprocessor, it's doing its thing.

There we go. Oh, we lost a frame. We lost the torso for a little bit. Hopefully that's not the end of the world. But I'm curious, though: would we get a better... I wonder if I'd get better frames like that. Where's my background? Just fully, properly lit. I did a depth map straight from Blender, but I'm just trying to get a really good OpenPose. Hopefully that's enough.

"Would Freestyle help in this case?" I don't think so. Maybe. No, I don't think it will. All right, we'll do standard. Let's just try this one more time and see how it looks. Is that what it's supposed to... yeah, that's what it's going to look like. Perfect. Runner two. All right, let's try and do the DW again.
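
When the pose preprocessor drops a frame, as happens here, one simple workaround is to reuse the nearest good frame in its place. The sketch below shows that idea only; it is not something done on the stream. It assumes the dropped frames have already been identified and marked as `None` in a list, and `fill_dropped` is a hypothetical helper name.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def fill_dropped(frames: List[Optional[T]]) -> List[Optional[T]]:
    """Replace each None entry with the nearest non-None frame.

    If two good frames are equally close, the earlier one is used.
    If there are no good frames at all, the list is returned unchanged.
    """
    valid = [i for i, frame in enumerate(frames) if frame is not None]
    if not valid:
        return list(frames)
    filled = []
    for i, frame in enumerate(frames):
        if frame is not None:
            filled.append(frame)
        else:
            # Sort key: distance first, then index, so ties go to the earlier frame.
            nearest = min(valid, key=lambda j: (abs(j - i), j))
            filled.append(frames[nearest])
    return filled


# Frame names are placeholders for saved skeleton images.
poses = ["pose_000", None, None, "pose_003", "pose_004"]
print(fill_dropped(poses))
```

Holding a pose for a frame or two will still read as a small stutter in the final animation, so this trades a missing skeleton for a brief freeze.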
Runner two. All right, go. I hope it's in the same place.

OpenPose, the skeleton... like, it does, but you need a bone structure that looks like OpenPose, which toyxyz made, but it's for posing the existing rig. I need a Blender rigging smarty-pants to help me learn how to attach the rig to an existing one, so I can just attach it to this Mixamo rig, because I couldn't figure that out. But ideally you would render it in Blender and it would be perfect.

44. All right, here we go. Ah, we lost even more frames. I wonder if I can replace the missing ones with the ones that are... interesting, let's have a look. AI, Comfy, output, OpenPose. Come on. I need a new keyboard pretty bad. Yeah, that's just a head. 12... no, it's just a head too. That's great: it's got the head; this one doesn't have the head. It's dropping the same frames. Okay, this is going to be fine. We'll get some flickers, but it's all good. It ain't getting better tonight.

AI, Comfy, output, OpenPose. We want the first one. This one we're going to do 44. "Mr. Bean running up a hill with monsters chasing him." All right, Mr. Bean, how's your chance, bud? I need a loop node: I've got those two image sequences and I just want to loop them three or four times. I wonder if you could do that with batch nodes. Probably could.

Huh. Didn't really work at all. It kind of did, but not really. Might be this... oh, we're not using the depth map at all, so there's that. Let's turn that back on, shall we? Nothing from the depth map is showing up. I wonder why. Oh, strength says zero. There we go. That's making a whole lot more sense now. Nice.

You'll have to post some results in the Discord, Tron; I'd like to see what that looks like. Or tag me on Twitter. Let's try some weird pictures with it next time instead of just Mr. Bean. Oh, that is freaking cool. I mean, he looks like a crazy, crazy thing, but that is so neat. "Swirling clouds behind them." Let's not do Mr. Bean, let's do me. See, that one...
Trying to find one where my whole head's in there. There we go. "40-year-old man running up a hill with monsters chasing him, swirling clouds behind them, neon Nike tracksuit and emissive headband." Right, yeah, of course.

All right, while this one runs, I am going to go get some water and stretch my legs, and I suggest you do the same. But I'm not your mom, so you do you, boo. I'll be back in five.

All right, hello, we're back. How are y'all? My wife made some delicious tacos, so I just double-fisted them like an absolute sasquatch, and now I'm full of tacos and back.

Yeah, this came out really cool. I'm going to keep running these. I think I even like the prompt; it looks cool. Didn't get the emissive headband at all. What I am going to do is increase the quality a bit and let it run a little longer. I want to see what it looks like at 768 x 768. Curious.

Let's see what we've got here. "Hello Mr. Purz, why motion model TemporalDiff and max resolution is only 512 by 512? If I change this, will it produce a CUDA error?" I don't think so. I'm doing 768 x 768 right now.

Oh, I am so full of tacos now. I could be none more full of tacos. Too many tacos? I don't know, only time will tell. Perhaps three too many tacos. It's hard to say. It's a waiting game right now.

So: two ControlNets, 768 by 768, plus IP-Adapter. Who wants to bet? Oh, 16.1 gigs of VRAM. I am streaming too, though. You might still get away with this on a 4060 Ti 16 gig. Maybe, maybe. Now I'm actually very curious. Oh, nvtop, why are you like this? I'm curious how much VRAM this is going to take when we hit the VAE decoder. How do you get rid of that? Oh, there we go. All right, 16.1. When this hits 20, I'm going to... Did it make an island out of celery? Watch the VRAM spike when it hits the VAE decode. It probably won't take long, because it's only 47 frames or 45 frames or whatever, but
I bet it hits like 21 or 22 gigs. 19... 19.8. The computer gets a little hot when that happens.

It's actually not much better at 768, and in some cases it actually looks worse. That's interesting. Was that because our depth maps are 512? No, the depth maps are higher res. It's interesting: the face is totally decimated, but the clouds look a little better. Let's go back to 512 for brevity. Are we using Rev Animated? Oh no, we're using Deliberate. That's cool. I like that. The loop's pretty good too. I guess we could always add an upscale run after. I don't really want to run it back through ControlNet for that, though, and you kind of have to. It's awkward.

All right, well, let's see: ControlNet, IP-Adapter... yeah, that's pretty much everything I wanted to cover tonight: the OpenPose stuff, how to pull poses out of videos and how to inject them into new ones. So I think I'm just going to cut it for tonight before I ramble on for another three hours for no reason. Play with this at home and tag me in your posts; I want to see what you make with this stuff.

So basically, what we did tonight was we went through OpenPose and learned how to pull OpenPose data from existing videos and even Blender animations, and then how to pull the depth maps out of Blender or generate them with the preprocessors, how to use the preprocessors in Comfy, and how to route things. And as always, if you want to use this workflow, it's on my Discord. Slide on over to the Discord; the link's at purz.
xyz. We've got mask packs on there, some OpenPose animations, this workflow, and lots of really helpful, knowledgeable people who are ready to help you get your animations going. So don't be a stranger; come by and hang out. And if you like what I do and want to support it, come by the Patreon and become a supporter. Supporters get access to a special channel on Discord, and there'll be more rewards later, but for now I'm not trying to put any of the knowledge behind a paywall. All the live streams will continue to be free, and I'll figure out some other kind of reward for supporters.

All right, thanks everybody for hanging out. I'll be back again probably Thursday, and at the latest Friday night. All right, have a great night.
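
The Blender depth wrangling in the stream (Normalize, "I think this is backwards", contrast shifting between frames) comes down to mapping raw camera distances into a grayscale range. The sketch below illustrates that mapping in plain Python. It assumes raw depth values grow with distance from the camera and that the depth ControlNet expects near objects to be bright, as MiDaS-style maps are usually drawn. `normalize_depth` and its `near`/`far` parameters are hypothetical names, not Blender or ComfyUI API.

```python
from typing import List, Optional


def normalize_depth(
    depth: List[List[float]],
    near: Optional[float] = None,
    far: Optional[float] = None,
    invert: bool = True,
) -> List[List[int]]:
    """Map raw depth values to 0..255 grayscale.

    near/far default to the min/max of this frame. Passing the same fixed
    near/far for every frame keeps brightness consistent across a sequence.
    invert=True makes near pixels bright and far pixels dark.
    """
    flat = [d for row in depth for d in row]
    if not flat:
        return []
    lo = min(flat) if near is None else near
    hi = max(flat) if far is None else far
    span = hi - lo
    out = []
    for row in depth:
        new_row = []
        for d in row:
            t = 0.0 if span <= 0 else (d - lo) / span
            t = min(1.0, max(0.0, t))  # clamp values outside near..far
            if invert:
                t = 1.0 - t
            new_row.append(round(t * 255))
        out.append(new_row)
    return out


# One tiny 1x2 "frame": a near pixel (0.0) and a far pixel (10.0).
print(normalize_depth([[0.0, 10.0]]))
```

Normalizing each frame against its own min and max is one plausible source of the frame-to-frame contrast shifts mentioned above; a fixed `near`/`far` pair for the whole sequence avoids that particular effect.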
Channel: Purz
Views: 4,107
Id: clbvjYszIIs
Length: 124min 5sec (7445 seconds)
Published: Thu Nov 02 2023