AnimateDiff: Most Consistent Motion Model?

Video Statistics and Information

Captions
Hello everyone, welcome back to House of Dim. I had a great suggestion to do a comparison of different AnimateDiff motion models: when to use one over another, and the results of each. Really appreciate it, thank you, you know who you are. So that's what we're doing in this video.

Here's the workflow. It's probably not as important in this video as in past ones, because we're really just focusing on the results we get out of AnimateDiff. This is the same workflow I used in my first AnimateDiff tutorial, with a few minor changes. If we look over here, I have a Load LoRA node; this is so we can use the version 3 adapter when we try that model, and I'm actually going to try it with and without it and compare those. Moving on, I'm using one ControlNet, OpenPose, and this is what it looks like: pretty basic. I wouldn't worry too much about it; I'm just using it so we have some motion. It's not a very good pose, and you'll see some of the frames are missing hands, but we just want the motion. Going down here, I have turned off the IP-Adapter; we don't want to use it, because we want to focus on just the motion models. I'm not using the Motion LoRAs either, and they wouldn't work anyway: those only work with version 2 of the motion model. And these are the models we're going to go through. I'll try them one at a time with the same seed and the same checkpoint. I realize the output will still change a little, because each motion model affects the model we're loading, but I want the testing to be as consistent as possible. From there we go through the first KSampler, and then we upscale. I have the face swap turned off, because I want to see how the motion models affect the face. I'm rendering 8 frames per second, and here's the node that adds in-between frames to bump it up to 24 frames per second and smooth it out, but we're turning that off too, because we don't want anything to affect what's coming out of AnimateDiff. So it's there, but disabled, bypassed.

I wanted to go over one other thing before I render them all out and we do comparisons. I tried to find out, as much as I could, what these models are: what the goal was of the developer that created them. We'll start with these two, Stabilized Mid and Stabilized High. This is directly from their Civitai post, and you'll have to bear with me; some of these descriptions are very short, and some don't make much sense to me, but we'll read them anyway. Stabilized Mid: "a bit more stable than the base model." Stabilized High: "more stable than the base model, but at a cost of having much less movement." At the very top it says these are very experimental models, so don't expect amazing results. I expect we're not going to get amazing results.

The next one is TemporalDiff, and here's what they say (I'll put links to all of these in the description): "TemporalDiff is a fine-tune of the original AnimateDiff weights on a higher resolution dataset (512x512). Testing so far indicates a higher level of video coherency than the original weights. I also adjusted the stride from 4 to 2 frames to improve how smooth the motion was. The current limitation is that the labeling for my dataset was a bit off, so it has a slightly reduced ability to interpret the prompt; I'll be releasing a new version that fixes that soon. This should work the same way as any base model in terms of use: just drag and drop it into ComfyUI or the AnimateDiff repository and use as normal. This does not require any additional memory to run, as the generations were 512x512 before; the training was done at 256x256." Okay, some of that went a little over my head, but I wanted to read it to you anyway.

Now on to the AnimateDiff models themselves: version 1 (which I have labeled V1 in there), version 2, and version 3. For version 1 I couldn't find a description, but it's their start, it's where they began, and it's pretty rough. For version 2, this is what they say: "In this version, the motion model is trained upon larger resolution and batch sizes. We observe this significantly helps improve sample quality. Moreover, we support MotionLoRA for eight basic camera movements." That's the key part of this one: if you want to use the Motion LoRAs, where you can pan, zoom, tilt, and chain those movements together, you have to be using version 2. On to version 3. Here's what they say: "In this version, we did the image model fine-tuning through Domain Adapter LoRA for more flexibility at inference time. Additionally, we implement two sparse control encoders, which can take an arbitrary number of condition maps to control the generation process." My understanding (and if I'm wrong, somebody please correct me) is that with the sparse control encoders you could start with one image, have an end image, feed those in, and, using a ControlNet, coax the animation to fill in the gaps between the two. I need to test it, and probably make a whole new video on just that, because I'm sure the results would be interesting.

Okay, let me get these rendered out, and we'll look over the results and see which ones we like best. I've got them all here side by side, and I used two different checkpoints to test: one more realistic, Analog Madness, and one more artistic, Dark Sun. Hopefully it's not too small, but we'll zoom in on some. Let's just play forward and roughly look at them all. You'll see with V1 there's hardly any movement at all; she's just kind of wobbling, and the legs are blinking back and forth. V2 is better, as expected. V3 is actually really great: it's following the OpenPose almost perfectly, and I'll turn that overlay on so you can see it. Even with the gaps in the hands that I have there, it's still following it, which is awesome; I'll turn that back off. TemporalDiff is not great, in my opinion. Stabilized Mid is kind of the same, and Stabilized High is following the pose better there, but to me it's hands down V3. We'll zoom in so you can take a closer look. Great results on that one.

Now I want to compare V3 with V3 plus the LoRA, or the adapter as they call it, and looking here, I just don't quite understand what the adapter is supposed to accomplish. It does not look as good with it as without. Maybe I'm using it wrong; I just loaded it as a LoRA, and this is what we got. Ignore the pant color change: all of these models affect the checkpoint as they go, and even with the same seed you're going to get different output. But yeah, I definitely prefer plain V3.

Now let's look at the more artistic checkpoint, Dark Sun again, and you'll see much the same thing. Version 1 is not moving at all; version 2 is better, pretty good even, but version 3 still tops everything. TemporalDiff is barely moving; Stabilized Mid and High are both moving, but I just don't think anything is going to beat version 3. Now let's compare version 3 against version 3 with the adapter loaded as a LoRA, and again I prefer version 3 by itself. Again, you can ignore the pants, but look at how the foot changes to no shoe here, where the plain V3 results are pretty solid. We are getting some cloth on the arms out of nowhere, but that's something you might be able to negative-prompt out; if you put "sleeves" in the negative prompt, it may not show.

I didn't go over the faces as much, but for the most part all the faces do okay. I still think you get better results, and it doesn't take that much longer, if you run a face replacement at the end. A good trick would be to use the same prompt on a static image and use that face as your face replacement; then there will be consistency, and hopefully your face won't blink out or turn into a monster or whatever.

In my opinion, I'm probably going to almost always use version 3, unless you want to do the camera pans and zooms, in which case you have to use version 2. It just depends on what you're doing. If you're using a ControlNet and you want the model to follow it, you don't want the pans, zooms, and tilts anyway. But if you're just doing a portrait of somebody and you want it to kind of pan and zoom on them, then V2 would probably be great for that.

So that's it, those are my results. I hope I was able to provide some information and clarity on AnimateDiff motion models. What are your thoughts? If anyone has any other suggestions on topics I should cover, please post them. Really, genuinely, thanks everyone for your support. Be safe and have fun creating.
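As an aside on the frame-interpolation node mentioned in the workflow (rendering at 8 fps and smoothing up to 24 fps): the arithmetic behind that step is simple. Here is a minimal sketch in plain Python, not a ComfyUI node; `interpolation_plan` is a hypothetical helper name, and it assumes the target rate is an integer multiple of the source rate, which is the common case.

```python
def interpolation_plan(n_frames: int, src_fps: int, dst_fps: int) -> tuple[int, int]:
    """Return (in-between frames synthesized per gap, total output frames)
    when lifting a clip from src_fps to dst_fps without looping back to
    the first frame. Assumes dst_fps is an integer multiple of src_fps."""
    if dst_fps % src_fps != 0:
        raise ValueError("dst_fps must be an integer multiple of src_fps")
    multiplier = dst_fps // src_fps   # e.g. 24 // 8 == 3
    per_gap = multiplier - 1          # 2 new frames between each original pair
    total = n_frames + (n_frames - 1) * per_gap
    return per_gap, total

# A 2-second clip rendered at 8 fps is 16 frames; interpolating to 24 fps
# synthesizes 2 frames per gap, giving 46 frames total.
print(interpolation_plan(16, 8, 24))
```

This is also why the node is bypassed for the comparison: every third frame would be raw AnimateDiff output and the rest synthesized, which would mask differences between the motion models.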
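On the sparse-control idea from the version 3 description (a start image and an end image, with the animation coaxed to fill the gap): the actual v3 encoders are learned networks, but the scheduling intuition, sparse keyframes expanded into a dense per-frame schedule, can be illustrated with plain linear interpolation. This is a conceptual sketch only; `densify_keyframes` is a hypothetical helper using scalar stand-ins for condition maps, and the real sparse control encoders do not do simple blending.

```python
def densify_keyframes(keyframes: dict[int, float], n_frames: int) -> list[float]:
    """Expand sparse {frame_index: condition_value} keyframes into one
    value per frame: hold before the first and after the last keyframe,
    linearly blend in between."""
    idxs = sorted(keyframes)
    dense = []
    for f in range(n_frames):
        if f in keyframes:
            dense.append(keyframes[f])
        elif f < idxs[0]:
            dense.append(keyframes[idxs[0]])   # hold first keyframe
        elif f > idxs[-1]:
            dense.append(keyframes[idxs[-1]])  # hold last keyframe
        else:
            lo = max(i for i in idxs if i < f)  # nearest keyframe before f
            hi = min(i for i in idxs if i > f)  # nearest keyframe after f
            t = (f - lo) / (hi - lo)
            dense.append(keyframes[lo] * (1 - t) + keyframes[hi] * t)
    return dense

# Start condition at frame 0, end condition at frame 4, 5-frame clip:
print(densify_keyframes({0: 0.0, 4: 1.0}, 5))
```

The point of the learned encoders is precisely that the in-between frames are not a naive blend like this; the motion model generates plausible motion that is merely anchored at the sparse conditions.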
Info
Channel: House of Dim
Views: 3,257
Keywords: comfyui, stable diffusion, ai art, tutorial, animatediff, controlnet, controlnets, animation, motion models, best motion models, animatediff motion models, comparison, motion model comparison, modles, modules, nodes
Id: sANtkmbzQFM
Length: 9min 24sec (564 seconds)
Published: Wed Jan 17 2024