ComfyUI: Vid To Vid AnimateDIFF Workflow Part 1

Video Statistics and Information

In this video, I'll test and fine-tune the vid-to-vid workflow from Inner Reflections. You can find the download link in the description.

First, let's take a closer look at how this workflow functions. We're using the AnimateDiff node to load our motion models along with the checkpoint model. We're also using ControlNet to guide the animation using our loaded animation, which you can load right here. Additionally, we're using the loaded animation frames as latent images in the KSampler. For more detailed explanations of these nodes, you can check out the guide by Inner Reflections in the description.

But for now, let's test this workflow by adding some tweaks. We can input the video path here to load the reference video. I'll employ a different node that allows us to load the video directly. In this step, I'm loading this simple animation for the test, but you can also load a real video here.

I'm still using my favorite AnimateDiff model, which is mm-Stabilized. I've tried the latest models, including V3, but I'm still impressed with the results of this mm-Stabilized model.

Okay, here's our first basic vid-to-vid result with minimal tweaks. I think the black woman comes from just a random seed that it picked. All right, now let's take a look at our reference animation. It did a good job of copying the movements, and even the background is quite similar. In this video, we're also going to try changing the background style. To execute the tests faster, I'll set a lower frame count here.

Now let's make this animation even cooler. Instead of just using text prompts to describe our style, we can also include an IP-Adapter node. This way, we can use a reference image to help shape the look of our animation. I'm using an AI image of a girl walking by the beach to give a cool look to this animation. Now let's tweak the text prompt to match this background. All right, let's run it.

Here's what we got. Let's see the adjustments: the clothes from the IP-Adapter image and the beach background. There are still some elements from the animation too.

Okay, let's tweak the workflow further. First, let's disconnect the latent images from the reference video and connect an Empty Latent Image node. This way, the resulting animation gets more freedom in the background style. Set the batch size to the same number of frames that we are using from the reference video.

Pay attention to context overlap in AnimateDiff. By determining the frames shared between consecutive runs, context overlap impacts animation consistency and smoothness. For instance, with a sequence of 32 latents and a context length of 16, you can run AnimateDiff twice with no overlap (0 to 15 and 16 to 31) or with some overlap (0 to 15 and 12 to 27). Higher overlap reduces flickering and artifacts but makes the animation less diverse. Lower overlap increases diversity and motion but may introduce more noise and glitches. Let's increase this to eight. Doing this will be useless in our case, because we are using only 10 frames for now and the AnimateDiff animation length is 16, so let's increase our animation length.

I will smooth the loaded animation with interpolation. I will use RIFE for this. Hopefully this method will make the final result smoother. Okay, let's run this.

Here's our result. Yes, the style and the background are mostly from the IP-Adapter image. The face does not look perfect; I will try to rectify this in subsequent attempts. But first, let's try another example image in the IP-Adapter and add a face swap node on the resulting animation. I'm using the face image I created in my last video, where I talked about maintaining a consistent character in ComfyUI.

Okay, here's the result, and here's the face-swapped version. The face swap is fine, but it could have been better if I had turned on face restore inside the ReActor node. Let's try one more time.

Okay, this time it picked up very little detail from the IP-Adapter. Let's see the face swap with restore on. Looks good. Let's try another. Let's tweak the text prompt.
I'll use epiCRealism again. Okay, this is the kind of result I was looking for. Looks impressive. You can see the details from the IP-Adapter image. It could be better, but you get the idea of how to do it. The face swap looks good.

Now let's make it even better by upscaling it. I'm going to use the Upscale with Model node to enhance the quality. Here's the upscaled version. Let's compare these.

I'm playing around with different ways to improve this workflow, and I'll be sharing all the cool things I find in future videos. Thanks for watching and sticking around for the fun stuff.
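
The context overlap arithmetic mentioned in the transcript (32 latents, a context length of 16, and different overlap values) can be illustrated with a small sliding-window sketch. This is only an assumption about how such windows could be laid out, not the scheduler that AnimateDiff actually uses; the function name `context_windows` and its parameters are hypothetical and chosen for illustration.

```python
def context_windows(num_frames, context_length=16, overlap=0):
    """Return inclusive (start, end) frame windows for a simple sliding scheme.

    Hypothetical helper for illustration only; it is not taken from
    AnimateDiff or any ComfyUI node.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if not 0 <= overlap < context_length:
        raise ValueError("overlap must be in the range [0, context_length)")

    # With fewer frames than one context, there is a single window,
    # so the overlap setting has no effect.
    if num_frames <= context_length:
        return [(0, num_frames - 1)]

    stride = context_length - overlap  # frames advanced between runs
    windows = []
    start = 0
    while True:
        if start + context_length >= num_frames:
            # Shift the final window back so it ends on the last frame.
            windows.append((num_frames - context_length, num_frames - 1))
            break
        windows.append((start, start + context_length - 1))
        start += stride
    return windows


if __name__ == "__main__":
    for overlap in (0, 4, 8):
        print(overlap, context_windows(32, 16, overlap))
    # 10 frames fit inside one 16-frame context, whatever the overlap.
    print(context_windows(10, 16, 8))
```

With no overlap this gives the two windows from the transcript, 0 to 15 and 16 to 31. With an overlap of 4 it gives 0 to 15 and 12 to 27, plus a final window in this sketch so the last frames are covered. With only 10 frames there is a single window, which is why raising the overlap has no effect until the animation is longer than the context length.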
Channel: Zuhaib R
Views: 4,446
Id: YhAS0hO3mqI
Length: 6min 46sec (406 seconds)
Published: Sun Jan 07 2024