ComfyUI Workflow Create Animation With Differential Diffusion Latent Noise

Captions
With Stable Diffusion and ComfyUI there are lots of ways to do animations. In this tutorial I'm going to create a workflow with segmentation masks, IPAdapter, and Differential Diffusion to get a more consistent style of video animation, and by using an LCM checkpoint model we can also speed up the generation process. Let's get started.

Hello everyone. In today's video I'm going to explore Differential Diffusion. This is a really cool feature and it's already included in ComfyUI by default, as long as you have an updated version. Basically it takes an input map of the image: once the image is fed in, it can mask areas of the image and create another output that keeps the shape or form of your input image. It's really cool once you dig into the paper. It isn't a new paper, it has been public for a while, but I've been trying out a lot of masking: how we can customize it for a Differential Diffusion workflow, and how we can use different methods like IPAdapter and segmentation masking. I've been researching this recently and playing around with it, and the results look pretty nice. Some images here aren't loading, but in the soft inpainting section of the paper you can see the input elements and how you can inpaint a mask into an output image, one that is re-formed while keeping the original image structure.

That made me wonder: what if we could do the same with Differential Diffusion and generate new animations in this form, applied to a character's movement? We always use dancers as the practice example for changing the whole outfit and look of a character: the shape, form, and movement stay the same, but the style looks different. For example, an old temple turned into a post-apocalyptic city, something like that. Maybe we can use masking in the same way.

In my previous workflow I tried a few masking methods. There's the good old COCO segmenter that we've always used for segmenting characters, highlighted in red, followed by a grow mask and a mask blur to smooth the edges before the image mask is generated. With Differential Diffusion we also tried Segment Anything with a prompt-based segment, then combined the mask with the original image frame and passed it on as mask data. Both of those work okay, but what if I just want to change a jacket, or only part of the character's outfit, without dealing with the backgrounds and everything else? In the Automatic1111 extensions there is one called a person mask generator, and it's quite fast; we played around with it in Automatic1111 before, and I saw they also have ComfyUI custom nodes available for download, so we can try that as well. So whether the mask comes from the COCO segmenter, Segment Anything, or this person mask generator, we can pass that mask data on. For example, the ComfyUI KJNodes pack that I always use has the Set and Get nodes for defining values, and it also has a GrowMaskWithBlur node, so you can drop in that single custom node and control the mask growth and its shape or form before passing it to Differential Diffusion for masking the character's outfit and so on.
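To make the grow-and-blur step concrete, here is a minimal sketch of the same idea using OpenCV. This is not the KJNodes implementation; the function name and default values are my own assumptions.

```python
import cv2
import numpy as np

def grow_and_blur_mask(mask: np.ndarray, expand_px: int = 8, blur_radius: int = 4) -> np.ndarray:
    """Dilate a 0/255 mask and feather its edges, in the spirit of a
    grow-mask plus mask-blur pair (or GrowMaskWithBlur as a single node)."""
    kernel = np.ones((expand_px, expand_px), np.uint8)
    grown = cv2.dilate(mask, kernel, iterations=1)        # expand the masked region
    ksize = blur_radius * 2 + 1                           # Gaussian kernel size must be odd
    return cv2.GaussianBlur(grown, (ksize, ksize), 0)     # soften the edges
```

The feathered edge is what lets the masked and unmasked regions blend smoothly instead of leaving a hard seam.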
What we usually did in that workflow was IPAdapter masking: take a masked character image and pass it to the IPAdapter tile node together with our reference image. We're not going to stick to one method all the time; it depends on the situation. If I just want to change the character, I enable the character group. If I want to change the whole thing, character and background, I use the other group, because it has masks for both the characters and the backgrounds. So that workflow is pretty complete already, and I want to build a new workflow on the same structure but with a new method. I'll probably still keep the COCO segmenter and also add the ComfyUI version of the person mask generator as an optional method for you to try, and it will use the Differential Diffusion node that comes with ComfyUI by default. You can check this keyword yourself: go to ComfyUI, search for "differential diffusion", and the node is right there; it ships with ComfyUI.

What it does is sit on the model path before the sampler. For example, if I have a KSampler in the first Differential Diffusion sampling group, I can place the node next to the RescaleCFG and pass the model data through it. As the paper describes, it gives each pixel a strength that controls how much it can be modified, based on the mask map and the input source image, in our case the frames of the input source video. The model data then goes on to the KSampler at the final step. You can also pass that model data out to other parts of the graph, but I rarely do that myself. Usually we just generate with this KSampler, pass the image frames and the latent image data to a second group for a small latent upscale to make it clearer, and then at the final step go to a detailer to enhance hands, faces, or whatever else you want. That workflow is pretty complete and I've posted it in previous videos, so you can check those out.

This time I want to try a simpler workflow without so many nodes connected to each other: just the IPAdapter, Differential Diffusion, the COCO segmenter mask, and the person mask generator. As we saw when we played with it in Automatic1111, the person mask generator is a fast way to mask a character's face, hair, body, and so on, but it really only focuses on characters. If we want to mask or change the backgrounds, or enhance other elements in the video, we need other options: the IPAdapter mask and the COCO segmenter to mask the backgrounds and maybe change something else in the video. So I'll create a fresh new workflow based on this method and try it out.

First, start with an empty workflow. I have some existing custom nodes I can reuse, and the first is of course Load Checkpoint. Here I'll use an LCM checkpoint model; you can use whatever other checkpoint you want, but I like to stick with Stable Diffusion 1.5 base checkpoints. In this case I used RealDream Turbo LCM 6, and then I add a Set node for it: let's say this is the Stable Diffusion model, and I like to call it the model base data.
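Conceptually, Differential Diffusion turns the soft mask into a per-pixel change budget: at each denoising step, only the pixels whose mask value exceeds the current threshold accept the newly denoised content, while the rest are reset to the re-noised original frame. The sketch below is a simplified illustration of that idea, not ComfyUI's node code, and every name in it is illustrative.

```python
import torch

def differential_merge(denoised: torch.Tensor,
                       renoised_original: torch.Tensor,
                       change_map: torch.Tensor,
                       step: int,
                       total_steps: int) -> torch.Tensor:
    """Blend the freshly denoised latent with the re-noised source latent.
    change_map is in [0, 1]: 1 = free to change, 0 = keep the source pixel.
    Early steps only unlock the strongest-change regions; later steps unlock more."""
    threshold = 1.0 - (step + 1) / total_steps          # falls toward 0 as sampling proceeds
    gate = (change_map >= threshold).float()            # per-pixel on/off gate for this step
    return gate * denoised + (1.0 - gate) * renoised_original
```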
Once we pass the model data out we can do other things, for example the CLIP layers. We have a CLIP set-last-layer node, and after the CLIP layers of course come the positive and negative prompts, which I highlight in green and red. For these two I mostly just use a preset template, for example something like this, maybe excluding unwanted skin colors, because we rely on the IPAdapter for the styles and for influencing the characters and backgrounds; the text prompt here is really just a template. Secondly, we can use LoRAs: you can add a detail LoRA since we're using LCM, to get a little more detail on the characters. You don't have to follow exactly what I did; it's just the way I normally like to set it up. And by the way, we don't strictly need the CLIP layers node if we don't need it, but I'll keep a CLIP output anyway, let's say the CLIP layer data, so it can be passed to other groups in the future. Keep that as an output in the data structure.

Then we can start on the conditioning. In my case that means ControlNet, and the first ControlNet will be the pose: search for "pose", add the ControlNet, and it will wait for the image frames from our input source video. So we also need the input source video here. I've pre-rendered some animations before, so it's easier to create other styles later. By default I always test with 50 image frames, and I set that as the global frame number so it can be passed to other groups if they need the image frames; the frame count is set and exposed as well. In the meantime we also have to pass the resized image. I copied it from my existing workflow, so I don't need to rebuild the same thing, and the integer width and height nodes are needed too, so I grabbed those from the existing workflow and plugged them in. Let's put all of this in one area where everything is set before we execute the data. That's basically it for loading our source video.
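Outside of ComfyUI, the load-video plus resize-image part boils down to reading a capped number of frames and resizing them to the target width and height. Here is a rough sketch with OpenCV; the path, frame cap, and dimensions are placeholder values, not settings taken from the workflow file.

```python
import cv2

def load_frames(path: str, frame_cap: int = 50, width: int = 512, height: int = 768):
    """Read up to frame_cap frames from the source video and resize each one,
    mirroring the load-video and resize-image group."""
    cap = cv2.VideoCapture(path)
    frames = []
    while len(frames) < frame_cap:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.resize(frame, (width, height), interpolation=cv2.INTER_AREA))
    cap.release()
    return frames
```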
Now we take the initial resized image from the resize-image output and pass it to the ControlNet: add a Get node, select the resized image, and there we go, our first ControlNet model. Connect the positive and negative conditioning here, and then follow the same pattern with SoftEdge, or you can use a depth map instead. For example, the SoftEdge that I used in the detailer before, I want to try putting it up front this time so we can use it to soften the character edges. Pass the preprocessor and the resized image into the second ControlNet area the same way, and that's basically fine for our ControlNet group. One more thing I want to add is Depth Anything, to give a little more layering to the generated image: it creates more separation between the characters and the backgrounds, which is good especially for dance animations or TikTok-style videos. With that we have a fairly complete ControlNet part, and lastly we pass those conditionings out to the positive and negative outputs; I can copy that from my existing workflow so I don't have to type it again, and the positive and negative part is almost complete.

So this is the whole settings group for loading the video; let's call it "load video", organize it, and we're good to go. I put it at the top because that's the first step: having the source video. The second step is the loader group, which loads the LCM checkpoint model, adds some detail with a LoRA, sets the template text prompts, gets the motion and movement from the source video into the ControlNets, and passes that data into our conditionings. That's the loader group, the long one I always build by default. So now we have step one, load video, and step two, the loader group, and let's give the ControlNet section one color so it's easy to spot.

The third step is to inject the mask into the latent space. This is a different way of doing it: before, I would run the COCO segmenter ahead of the latent stage, but you can also prepare the mask and pass it together with the image frames into the latent image, so the whole set of frames is generated in a single process with your IPAdapter-influenced image. That way it's faster and smoother; we discussed this on Discord with other animation creators and found it runs better with LCM models, so that's what I'll try in this workflow. This is the segment mask group, the third part. One way to do it is to copy my whole existing mask setup into the new workflow, but this time I'll try the simpler route to keep things tidy, since we've done the COCO segmenter many times already; maybe after this workflow is done I'll add it back as an optional segmentation method. For now I'll use the KJNodes GrowMaskWithBlur together with the person mask generator, which you can install directly and use in ComfyUI. So, starting from the mask, we of course need a Get node for the resized image; once we have the image we're ready to build the mask. The person mask generator works the same way as the Automatic1111 version: mask the face and skin, the face and body, the hair, or only the face, and it works in ComfyUI as well, as you can see. I'll follow that structure in our workflow: pass the resized image to the mask generator, get the mask out, and add a mask preview so you get an idea of what kind of masks to expect. That's one method, using the person mask generator: if you just want to change the face you can do that, or you can go for the background only.
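For intuition, a person mask like this can be produced with MediaPipe's legacy selfie-segmentation solution, which is the kind of model person-mask extensions are typically built on. This is an illustrative sketch rather than the node's actual code, and the confidence default is just the rough value mentioned in the video.

```python
import cv2
import mediapipe as mp
import numpy as np

def person_mask(bgr_frame: np.ndarray, confidence: float = 0.3) -> np.ndarray:
    """Return a 0/255 mask of the person in a frame; confidence plays the same
    role as the node's confidence setting (around 0.25 to 0.3 in the video)."""
    with mp.solutions.selfie_segmentation.SelfieSegmentation(model_selection=1) as seg:
        rgb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)
        result = seg.process(rgb)
    return (result.segmentation_mask > confidence).astype(np.uint8) * 255
```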
But there are pros and cons. The con is that you can only select either the characters or the backgrounds. For example, I can select the face, hair, body, and clothing, which is almost the whole character, but if you enable the background mask together with the character parts, the whole image gets masked, which is useless; so don't select everything. From the GitHub page you can see they only select certain things at a time, and logically the maximum is either everything belonging to the person, or nothing of the person and only the background. That's just how it works. The confidence setting here acts like a masking strength; setting it around 0.25 to 0.30 is usually good enough to mask a person or a background with this node. In this case I'll mask the face, hair, body, and clothing, since I want to change the whole style of the dancers in our videos.

The other option is the COCO segmenter, which we've used in many workflows and is very stable, so I'll basically copy it in here as a second option. It gets the frame count from our loader group, and since I reused the variables from my existing workflow, the width and height are the same too; let me lay it out so it's easier to see. This method comes from Matt's previous video about masking with the IPAdapter, and it works really well together with the IPAdapter, so I have no problem using it. What it does is this: we go through the COCO segmenter, pass all the images into a segmentation color map, and use a mask-from-color node to turn each frame into a mask (a small sketch of that color-to-mask step appears after this part). We mostly select red, meaning the characters segmented in red, then grow the mask a little, blur it, and pass the result into an ImageCompositeMasked node. I also have an invert mask here so I can select the background around the characters, plus an empty image to composite the masks into images; before that there's an upscale-image node that takes things from the source image to the destination and composites them into a new masked image list. We connect that back to the image input, and that's the COCO segmenter branch; I'll color it yellow as we usually do. So that's one option, and the other option is the person mask generator with just that one node. See which one performs better for your scenario: if you like the quick, lazy way, use the single node; if you want to dive deeper into every segmentation setting and play with all the parameters, the COCO segmenter is the better choice.

Let's move on. By the way, we also need the outputs, so let's grab them from my existing workflow and put them in here: we set the mask background, the mask character, and the mask image, which come from this branch. That's one way of outputting the mask, and the person mask generator branch is the other. For the new method I'll name the output something distinct, like mask_person_gen, so the mask data is easier to remember. This goes to the final destination, and we add some preview images so it's easier to see what kind of data we have.
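The color-to-mask step referenced above just picks one label color out of the segmentation map. Here is a minimal NumPy sketch; the exact red value and tolerance depend on the segmenter's palette and are assumptions.

```python
import numpy as np

def mask_from_color(seg_rgb: np.ndarray, target=(255, 0, 0), tol: int = 30) -> np.ndarray:
    """Turn a color-coded segmentation frame into a 0/255 mask by selecting pixels
    close to the target color (red = the character class in this workflow)."""
    diff = np.abs(seg_rgb.astype(np.int16) - np.array(target, dtype=np.int16))
    return (diff.max(axis=-1) <= tol).astype(np.uint8) * 255

def invert_mask(mask: np.ndarray) -> np.ndarray:
    """The background mask is simply the inverse of the character mask."""
    return 255 - mask
```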
So basically that's it for the masking and segmentation masks. Enable one of them, either one, then highlight the other part and bypass it; you don't need to run both at the same time. You can, if your machine has enough processing power, but usually you don't have to.

The next thing is to start preparing the latent image, so we create one group for the latent image and sampling. Actually, the latent image gets passed to AnimateDiff, and AnimateDiff will be the center of this group, so I'll copy it over: we use LCM sampling and FreeU, which we need to keep things clean with the LCM data, and then tidy everything up. For the sample settings I've recently liked using "from model data", and that's basically it for the AnimateDiff loader. Before the FreeU V2 we add what we talked about, Differential Diffusion: search for this node, it's built into ComfyUI, so you don't have to install anything extra. From there we go to the KSampler and the sampling method, and everything runs through here. But before that we have to prepare the latent image as well, so I'll call this group "AnimateDiff prepare"; it prepares the AnimateDiff settings and the motion models. I like to use the v3 motion model, and by the way, I'm fine-tuning a motion model for AnimateDiff at the moment; it's in training right now and hopefully some good results will come out of it. You can follow the updates in our Discord group, where we've been playing around with it recently with some other animation creators.

The next thing is to receive the mask from the segmentation section. Let's try the person mask generator one, mask_person_gen, and run it through the KJNodes GrowMaskWithBlur. It's the same concept as the separate grow mask and mask blur nodes, just combined into one compact custom node, so why not use it? As you can see, its settings are very similar to the mask grow and mask blur we used in the COCO segmenter group: you get the expand increment, the blur rate, and so on. One thing to remember is that the blur radius has to be tested; we can't just leave it at zero, and sometimes you want 1, or 0.2, or even 10, it depends. Let's test it at 1, which is a very normal value for a mask blur. Then we remap the mask range. There's also a mask inverter here, which is essentially the InvertMask node packed into the same compact custom node; if you use the inverter you get the background of the masked image instead. This time we'll use the remap range, setting the top to around 0.7, or in some cases I like to set it around 0.3, to tone the mask down a little.
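Remapping the mask range is just a linear rescale of the mask values. Here is a small sketch; the bounds in the 0.3 to 0.7 ballpark mentioned in the video are used as example defaults of my own choosing, not the node's defaults.

```python
import numpy as np

def remap_mask_range(mask: np.ndarray, new_min: float = 0.0, new_max: float = 0.7) -> np.ndarray:
    """Rescale a float mask from [0, 1] into [new_min, new_max]. Lowering new_max
    weakens how strongly the masked area is allowed to change; raising new_min lets
    the protected area drift a little as well."""
    return new_min + mask.astype(np.float32) * (new_max - new_min)
```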
You can experiment with that here as well. But there is one thing we actually need: the Apply Latent Noise Mask node, which comes with ComfyUI by default (a sketch of the idea behind it follows at the end of this part). This node is different from the other latent image nodes. Usually when we generate an AI image we use an empty latent image or an upscaled latent, but with this one we inject noise into existing latent data, the latent image data. So here it's different: we take the resized image, first do a VAE Encode to get the latent of the original frames, and then inject the latent mask we produced into the Apply Latent Noise Mask node. The effect is the same concept as in the paper: we input the image, cover part of it with the mask, and the output comes out different in exactly those regions. For that we need the VAE; going back to my old workflow, yes, we have the VAE there, so let's put it into the loader group as well, because with these LCM models we don't have a baked-in VAE and have to load one. With the VAE in place we finally get the latent-noise-mask output. Let's check that there's nothing we don't want, and I'll name this output the initial mask latent.

Lastly comes the sampler. To make things easier to recognize I pass the model data out under a name, let's call it the model AD data, and I also pass the mask latent out to the first sampling group. So we create one sampling group, call it sampling one, and add Get nodes: first to get the model data, second to get the initial mask latent, third the resized image if we need it, and of course the positive and negative conditionings. That's mostly all we need, and then we go to the KSampler. You can use the simple KSampler or KSampler Advanced, whichever you prefer. You can convert the seed into an input; like I did previously, I have a global seed number that I put into the loader group, so I copy that over as the seed, along with the negative and positive conditionings, and finally the mask latent image runs through here. Last is the VAE Decode, which of course uses the VAE; grab it from the loader so we don't have to set it again. Because we're using LCM we choose the LCM sampler, and you can pick the sgm_uniform or ddim_uniform scheduler, whichever you like; just test it with your images. For the LCM model we use low sampling steps and go for a CFG of 1.5, just like we usually do. And lastly we need the Video Combine node, so let me copy that one in too rather than typing it again and again. With that, everything is mostly complete; we don't need the resized image here, so let's try it and see what we get.
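The masked-latent preparation mentioned above, a VAE Encode plus the latent noise mask, amounts to encoding the frames and attaching a downscaled mask alongside them so the sampler only re-noises and re-denoises the masked region. This sketch only illustrates that data flow; the vae.encode call, tensor shapes, and dict keys are assumptions rather than the exact node code.

```python
import torch
import torch.nn.functional as F

def prepare_initial_mask_latent(frames: torch.Tensor, mask: torch.Tensor, vae):
    """frames: (B, 3, H, W) in the VAE's expected range.
    mask:   (B, 1, H, W) float mask in [0, 1] after grow/blur/remap.
    Returns the encoded latent plus a noise mask at latent resolution, in the
    spirit of 'apply latent noise mask': only the masked region is regenerated."""
    samples = vae.encode(frames)                                   # assumed to return (B, 4, H/8, W/8)
    noise_mask = F.interpolate(mask, size=samples.shape[-2:], mode="bilinear")
    return {"samples": samples, "noise_mask": noise_mask}
```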
Our frames come out here, and there's one more thing to do: restyle the characters, which means we need the IPAdapter as well. For the IPAdapter I'll just copy my existing group so I don't have to rebuild it; it's an IPAdapter for the character only. So this is the IPAdapter group, and it takes the model base. In this workflow the names are a little different, so I reconnect it to the model base like I used to and tidy it up a bit, and there we go. I also use an image background remover now, so if there's any reference image where I want to remove the background and keep only the character for the IPAdapter, I can do that. That's one of the nice things about combining the IPAdapter with an image background remover. So this is the model path for the IPAdapter only.
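Stripping the background from the IPAdapter reference image can be done with rembg; the ComfyUI background-remover node may use a different backend, so treat this as an equivalent sketch rather than the node's implementation.

```python
from PIL import Image
from rembg import remove

def character_only_reference(path: str) -> Image.Image:
    """Remove the background from an IPAdapter reference image so that only the
    character, not the scene, influences the style transfer."""
    return remove(Image.open(path))
```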
Now we pass this to our KSampler... actually no, we should pass it into the AnimateDiff connection: get the AnimateDiff models, take the IPAdapter character-only model output, and pass it in there. Before that we have to get the base model data packet and pass it into the IPAdapter, and then our model data output contains the IPAdapter in the model pipeline. With that, almost everything is complete, so let's run it once and see. For the first run I'll temporarily bypass the rest, but using it like that is a bit confusing, so let's separate the masks: one output only for the COCO segmentation masks, and this one for the other method. I'll set the COCO branch to disabled and use the person mask generator branch to see what we get on the first run.

After trying it and wiring everything up, check one more time: remember you have to select the right mask for your IPAdapter. In this case I've defined mask_person_gen, which belongs to the person mask generator group I'm using. If you use the COCO segmenter instead, say you've enabled that branch, then you have to switch to the mask character label. I'll temporarily disable the COCO branch and focus on this one for now, and we'll try masking our character, the dancer, with a purple dress as the reference. Very simple: see whether her outfit changes to match the IPAdapter image. Click generate, and here's the result. What I did was take this video as the source, run 50 frames, and pass the purple dress through the IPAdapter, hoping to restyle the plain white tank top and pants with those colors. It won't fully swap the outfit into that style of long dress, because we're heavily influenced by the OpenPose, SoftEdge, and Depth Anything ControlNets. You can always change those settings and play around; sometimes you want the SoftEdge lower, like 0.3, for less influence on the overall character, but I like to start in the middle on the first try. The Depth Anything strength I also don't push all the way to 1.0.

So we pass the conditioning to our AnimateDiff, then get the mask, run the grow mask and blur mask, remap the range, and set it in the Apply Latent Noise Mask. Again, you can always test these numbers; I wouldn't go straight to 1.0 or anything like that, just start testing around 0.7, and if that feels like too much for your video, turn it down to 0.5 or 0.6. Then the model data, using the IPAdapter output, goes into the models, we pick up the AnimateDiff v3 motion model settings to do our animation, and obviously we use FreeU V2 and the Differential Diffusion we talked about earlier in this video. That model data goes out to our first sampling step: we get the model data and the initial mask latent, pass them to the KSampler, and it runs. I like to pass the result out for a second sampling, because the first sampling result, while okay, has fairly low resolution as always. The second sampling takes the latent image data, does a 1.5x upscale, and samples again with the denoise set to 0.3; sometimes you want a bit more, up to about 0.4 at most on the second pass, because if you go too high the video will look completely different from what you expect. The second sampling looks much clearer: the character's outfit changed, the face and hairstyle changed, but some remnants of the original background remain, simply because we only masked the person and didn't influence the background.
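The refinement pass described above is essentially a 1.5x latent upscale followed by a low-denoise re-sample. Below is a simplified sketch of that step; the sample_fn argument stands in for whatever sampler call you use and is a hypothetical placeholder, not a ComfyUI API.

```python
import torch.nn.functional as F

def second_sampling_pass(latent: dict, model, positive, negative, sample_fn, denoise: float = 0.3):
    """Upscale the first-pass latent by 1.5x, then re-sample with a low denoise so
    the structure stays intact while resolution and detail improve."""
    samples = F.interpolate(latent["samples"], scale_factor=1.5, mode="nearest")
    upscaled = {"samples": samples}
    # hypothetical sampler call; in the workflow this is the second KSampler node
    return sample_fn(model, positive, negative, upscaled, denoise=denoise)
```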
So that's how this workflow goes. If you want to change the backgrounds as well, you can of course do that here: add an invert mask, or use the ControlNet part with an inverted mask to build a background mask for your IPAdapter, just like we did in the old workflow where we passed mask data to the IPAdapter to influence the backgrounds. There's always some inconsistency in the backgrounds with that approach, or you can drive it with the ControlNet input; it's up to you, so try different ControlNets here, I'd say that helps. Also test things like the sampling steps: 8, or 8 down to 6, on the first sampling is good enough, and don't go too high with LCM. If you're not using LCM you'll of course need more steps, but with LCM, 6 to 8 sampling steps is enough. So that's the result we got, plus other styles of characters with black hair, a different face, and different-colored clothing. I made other test videos with this before recording the workflow demo. One uses Wonder Woman as the character, changing the hair and the outfit, and in the earlier dance video I used another dress as the reference, but it was only able to influence the top, because the IPAdapter reference image in that example is very strong on the upper outfit. Let me see if there's an example: here's the original outfit, here's the IPAdapter image, and that's how I got this result; the hair and the face are influenced by the IPAdapter too, as you can see. So always play around with the settings; there isn't only one fixed way when I create a workflow, and you don't have to go in one direction using my settings.

That's it for this video, and I hope you got some inspiration on how to create this. If you don't want to join Patreon to get the workflow, that's okay; you can just follow along with the video and build the workflow yourself. And please don't blame me for not sharing the workflow publicly again; I've explained in previous videos why I no longer do that. It is what it is, and it sucks sometimes; if you don't like it, go after the people who tried to ruin the community. I hope you enjoyed this, got some inspiration, and we'll keep playing around with ComfyUI. See you in the next video. Bye.
Info
Channel: Future Thinker @Benji
Views: 8,845
Id: alnrfN6F-lI
Length: 37min 8sec (2228 seconds)
Published: Fri Apr 19 2024