Animations with IPAdapter and ComfyUI

Captions
Hello everyone, this is Matteo. I am the developer of the IPAdapter extension for ComfyUI, and today we are going to talk about animations. By now you should know that IPAdapter supports attention masking. This is a very simple example: you just need three masks, one for each image, plus a rough description of the scene, and the IPAdapter does its magic. What you probably don't know yet is that the mask can be animated. Let me show you.

This is the most barebones AnimateDiff workflow. In the prompt I only have "a cat morphing into a dog", and of course it is not going to work. You can try prompt scheduling and a better prompt, and you can also try your luck changing the seed, but Stable Diffusion is all about increasing the chances of getting exactly what you want. So first of all we add two IPAdapter nodes, one for a dog and one for a cat. We need the IPAdapter model (I'm using the IPAdapter Plus SD1.5 one), and we can reuse the same model for the second IPAdapter node. I need the CLIP Vision model, and I can reuse this one as well. Then we go from the AnimateDiff node to the first IPAdapter, from the first to the second, and finally to the KSampler. We need to lower the weight a little: I'm setting 0.8 for the dog and 0.75 for the cat, because in most checkpoints cats have a higher weight. This won't be enough, but let's see what we get. Yeah, the result is a kind of cat/dog morph that does nothing.

So what I'm going to do now is add a transition that goes from the dog to the cat. First of all I'm loading a series of masks that I prepared. I'm using the VHS Load Images node and selecting "slide right 16"; 16 is the number of frames. Let me show you what this is with a preview: these are 16 frames of a transition that goes from black to white. Now I need to convert these frames into a mask with an Image To Mask node, and I am connecting this to the first IPAdapter. Since I wrote "a cat morphing into a dog", I want to start with an image of a cat, and everywhere the mask is black it means that it is
covered and the image won't show up. In this case the mask starts black, so at the beginning there will be no dog. To show the cat I have to invert this mask with an Invert Mask node and send it to the second IPAdapter. Let me show a preview of what we've done: the first transition goes from black to white for the dog, and the second goes from white to black for the cat. All we have to do now is reactivate the Video Combine node and generate.

So, something happened. As you can see there's some kind of transition, and it's very smooth, but it is still not doing what I want. So let me try a couple of things: first I'm going to prep the image for CLIP Vision, and I'm also adding some noise. Okay, we are still not there, and that is probably because we are using an IPAdapter Plus model; the Plus models are very strong. So I'm going to try with the standard SD1.5 model, and finally I got my transition from cat to dog. Of course we can tweak the animation even further, try more seeds and whatnot, but I wanted to show you the whole process, including a couple of failed tries, because this is not a simple process: it involves a lot of trial and error. The base checkpoint is very important, and you probably also need to try a few AnimateDiff models.

Okay, this is cool and clearly has potential, but the result is a little underwhelming, so let's up our game. I want to create a cool animation with my logo to put at the beginning of my videos. Let me grab the Latent Vision logo; I've also created a weird image of an eye with SDXL, which is very good at creating this kind of logo/sticker thing. I want some kind of cool transition from this picture to the other. I start by grabbing a v2 AnimateDiff model, which usually gives better results. In the text prompt I'm putting something generic like "a logo morphing into an eye". Then I'm going to change the kind of transition: I don't want this left-to-right mask, but something that starts from the center and reaches the border, like this one. I want to go from the logo
to the eye, so the logo mask has to start from white. I need to invert the masks: the one that starts white goes into the logo, and the one that starts black goes into the eye. I'm also trying to force a white background with "flat white background" in the positive prompt, increasing the strength, and removing "illustration" from the negative, because this actually is an illustration. Since I want the images to stay very close to the references, I'm setting back an IPAdapter Plus model and increasing the strength to one. This won't be enough, but let's see what happens.

Okay, it's something, but the main checkpoint of course is not able to recreate my logo, so it needs a little convincing, and we can do that with a ControlNet. I'm using the Apply Advanced ControlNet node (from the Advanced-ControlNet extension), then I'm loading the model with the Advanced ControlNet loader and selecting the Tile ControlNet. The reference image is my logo. Then we connect negative and positive back to the KSampler, and I'm stopping the ControlNet influence at 75% just to give the model a little leeway. Since I want the eye to really pop out, I'm also increasing the noise a little and using "channel penalty" as the weight type, which usually has a sharper, stronger character. Everything else should be fine. I can probably lower the weight of the logo, because now I have the ControlNet. What I can also do is connect the logo mask to the ControlNet, so the ControlNet will only influence the logo and not the eye. I think we should be set, and we can give it a try.

This is starting to look like something interesting. We probably have to play a little with the seed; this is pretty cool already, but let's see if we can improve it even more. I'm going to smooth the ControlNet by adding a Scaled Soft Control Weights node with a base multiplier of 0.9. Let's see... and now it's perfect. I really like how the eye unfolds like it's a sticker, here on the right. Really cool. I don't know why, but this technique is incredibly good with logos: I've seen a guy on Discord
who transitioned his logo into a dragon, and it was really cool. A few recommendations when you are working on this kind of transition: the base checkpoint is very important. I've had the best results with Deliberate v3; I've also tried version 4, but it is not as good, at least for logos. Also remember to try all the IPAdapter models and of course the AnimateDiff ones. Then it's just a matter of having a little bit of luck with the seed.

So, this animation is very smooth but only 16 frames. Of course we can increase the number of frames, but for this kind of animation I think it is better to stick with 16 and then interpolate the frames, possibly to 32, with a model like RIFE. But let's see what happens if I set the frames to 32. I need to set a context window so that 16 frames are generated at a time, because the AnimateDiff models are generally trained at 16. Now it's 32 frames and super weird. You probably noticed that I didn't change the mask transition, so the mask is still 16 frames while our animation is 32. Fortunately the IPAdapter is smart enough to use a transition mask even if it has a different number of frames: in this case it simply repeats the last mask 16 more times. I can of course feed a 32-frame mask animation like this one instead, and as you can see we now have a completely different animation, but it is 32 frames.

Let's see what happens with a different reference. I have this one, also made with SDXL. Let me change the prompt to "a logo morphing into a brain". What's cool about masking is that it's not 1-bit: it responds to shades of gray. So let me take a 32-frame fade-out animation. I believe this one is inverted; let me check... it starts from white and goes to black, so I have to invert the masks. The first one goes here and to the ControlNet, and the inverted one goes to the second reference image. That should be fine, let's see. As you can see, we now have a smooth transition from one reference to the other. I can try to remove this node, and I'm also changing the
seed. Let's see if we get lucky. And of course I can use another ControlNet for the second reference image. Let's give it a try: we connect the two together, the second one to the KSampler, then to the Tile ControlNet model; the reference image is our brain, and we are also using the mask. And this is so cool! I like the lights blinking in the corners, and I'm really looking forward to seeing what you're going to do with this new IPAdapter feature.

Okay, we've seen how to send a batch of masks to the IPAdapter, but what if I want to send a batch of reference images instead? This is a basic IPAdapter workflow: I have one reference image and I get this result. I can of course add multiple reference images; let me add this one using an Image Batch node, and now the result is a mix of the two. If I increase the batch size in the Empty Latent node, I get two results, and both take elements from all of my references. Now, if I set the unfold batch option to true and generate, I get two different results: the first takes only from the first reference, and the second takes only from the second reference. I believe this is already pretty cool without even taking animation into account, because it lets you generate a bunch of images in one pass with a single KSampler.

But of course we can use this for animation, so let me move this away. I'm removing the second image, and I'm also removing the batch for the moment. I'm going to create a 16-frame animation, but using only four reference images. I'm going to repeat this one six times with Repeat Image Batch, and this one two times; of course I changed the frame in this second picture: it's basically the same character with closed eyes. You know, sometimes you just want a character to blink, and that's tedious to do with AnimateDiff alone; you'll see how easy it is this way. Now I need an Image Batch node to merge these images, another Image Batch repeating the same two images, and a third Image Batch to merge everything together. Then on to the IPAdapter.
So, in this node I now have 16 frames: the first six are the character with the eyes open, then two frames with the eyes closed, and then the same sequence repeats. Now I need an AnimateDiff node; we connect it to the model, then to the IPAdapter. We set a v2 model and also apply the v2 model options properly. Now I'm setting a batch size of 16 and using a Video Combine node to view the animation. Let's see how it goes... and she is blinking! All right, maybe we can make it even slightly better by saying in the prompt that the girl is blinking. Let's see... yeah, now it's a little more natural. To add a little interest I can try a RescaleCFG node; let's see if it helps. Ah, this is very cool, there's also a flare in the background, very nice. And this is not just sheer seed luck: all the animations will have a blinking girl. This is an incredibly powerful tool that can also be used to further stabilize your animations, or to create some crazy stuff like this, or this.

I've just scratched the surface of what can be done with IPAdapter and animations, but I wanted to introduce the new features to you as soon as possible. I will make a more in-depth video about animations in ComfyUI, so stick around. Before I leave, I wanted to let you know about a ComfyUI contest that is taking place right now. It's held by OpenArt.ai, and if you look closely you'll see that I am part of the jury. The contest is about creating actually useful workflows, not the best-looking pictures, so I think it's accessible to anybody. Head to contest.openart.ai and have a look at the rules. That's all for today; I'm looking forward to your feedback, and see you next time. Ciao!
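As a rough illustration of the "slide right 16" transition used in the cat/dog example, here is what those mask frames amount to. This is a NumPy sketch, not ComfyUI code; the function name and frame/size values are made up for the example:

```python
import numpy as np

def slide_right_masks(frames=16, size=64):
    """Batch of grayscale masks where white slides in from the left:
    frame 0 is all black, the last frame is all white."""
    masks = np.zeros((frames, size, size), dtype=np.float32)
    for f in range(frames):
        edge = round(size * f / (frames - 1))  # columns turned white so far
        masks[f, :, :edge] = 1.0
    return masks

masks = slide_right_masks()   # black-to-white: feeds the first IPAdapter (dog)
inverted = 1.0 - masks        # white-to-black: feeds the second IPAdapter (cat)
```

Inverting the batch is exactly what the Invert Mask node does in the workflow: where one reference is hidden, the other shows.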
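The center-out transition used for the logo/eye example can be sketched the same way: a white disc growing from the image center until it passes the corners. Again a hypothetical NumPy illustration, not the actual mask pack from the video:

```python
import numpy as np

def radial_masks(frames=16, size=64):
    """Center-out transition: each frame is a white disc growing from the
    center; frame 0 is all black, the last frame is all white."""
    yy, xx = np.mgrid[0:size, 0:size]
    center = (size - 1) / 2.0
    dist = np.sqrt((yy - center) ** 2 + (xx - center) ** 2)
    radii = [dist.max() * f / (frames - 1) for f in range(frames)]
    return np.stack([(dist <= r).astype(np.float32) for r in radii])

masks = radial_masks()
logo_masks = 1.0 - masks  # the logo side has to start from white, so invert
```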
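The behavior described for a 16-frame mask batch applied to a 32-frame animation (the IPAdapter repeating the last mask to fill the gap) can be imitated like this. This is a hedged sketch of what I understand the extension does internally, not its actual code:

```python
import numpy as np

def extend_masks(masks, target_frames):
    """If there are fewer mask frames than animation frames, repeat the
    last mask until the counts match; if there are more, truncate."""
    n = masks.shape[0]
    if n >= target_frames:
        return masks[:target_frames]
    tail = np.repeat(masks[-1:], target_frames - n, axis=0)
    return np.concatenate([masks, tail], axis=0)

# A 16-frame grayscale fade, stretched to cover a 32-frame animation.
short = np.linspace(0.0, 1.0, 16)[:, None, None] * np.ones((16, 4, 4))
extended = extend_masks(short, 32)
```

This also shows why supplying a proper 32-frame mask animation changes the result: with the repeated tail, the second half of the animation sits on a frozen mask.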
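The blink workflow's batching steps (Repeat Image Batch six times for eyes open, two times for eyes closed, then merge and repeat) boil down to building one 16-image sequence out of two stills. A plain NumPy sketch, with dummy 8x8 arrays standing in for the rendered references:

```python
import numpy as np

# Dummy stand-ins for the two reference renders (eyes open / eyes closed).
eyes_open = np.zeros((1, 8, 8, 3), dtype=np.float32)
eyes_closed = np.ones((1, 8, 8, 3), dtype=np.float32)

# 6x open + 2x closed, then repeat the whole sequence to get 16 frames.
half = np.concatenate([np.repeat(eyes_open, 6, axis=0),
                       np.repeat(eyes_closed, 2, axis=0)])
batch = np.concatenate([half, half])  # frames 0-5 open, 6-7 closed, repeated
```

With unfold batch enabled, each of the 16 latent frames then takes its reference from the matching position in this batch, which is what makes the character blink at frames 6-7 and 14-15.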
Info
Channel: Latent Vision
Views: 32,042
Id: ddYbhv3WgWw
Length: 16min 7sec (967 seconds)
Published: Thu Nov 30 2023