ComfyUI AI: IP adapter new nodes, create complex sceneries using Perturbed Attention Guidance

Video Statistics and Information

Captions
Hello and welcome to my new video, again narrated by the AI voice of my choice, Charlotte. One reason why we find some images exciting is that we can perceive the inner dynamics of what is happening in the image. In other words, we get the impression that the figures or objects in the picture are interacting. It is not easy to create multi-layered scenes with AI models, as the models still struggle to realistically depict such complex actions and events. When the new IP adapter nodes came out a few days ago, I immediately started creating a workflow to see if they could help. And what could be more entertaining and dynamic than two ninjas having a fight in a rainy swampland?

But first things first: setting up the workflow. In addition to the new IP adapter nodes, I also integrated the new and fantastic upscaling and image enhancement method, Perturbed Attention Guidance (PAG), into the workflow. The performance is simply phenomenal, as you can see briefly here, but more on that later.

In addition to the basic SDXL nodes (I am again using the Juggernaut XL Lightning model as the checkpoint here), we need four Load Image nodes. We connect these again to Prep Image For ClipVision nodes so that our loaded images are reliably in the square shape required by the IP adapters. Then the first new nodes, like the IPAdapter Regional Conditioning node, come into play. We have to combine each of these with a CLIP Text Encode node, which provides the IP adapter with a short description of the source image for this region.

Once we have connected them all correctly, we can connect the fourth image loader to the new Mask From RGB/CMY/BW node. I have added another Image Resize node here, which has several advantages that we will see later or in the uncut version of the video. To make sure that the mask works, we can connect two mask preview nodes, one to the red output channel and one to the green output channel. We also need the black color channel, but we wouldn't see anything interesting in its preview. As the test shows, the mask works perfectly. To ensure that the node recognizes the shapes and colors, it is helpful to paint the mask image in the brightest possible colors.

We then connect the mask outputs to the mask inputs on the IPAdapter Regional Conditioning nodes. The adapter, and of course the KSampler, will later use these to identify which region of the image to be generated should be assigned to which source image. Once all CLIP connections have been made, we can combine the params of all IPAdapter Regional Conditioning nodes via the new IPAdapter Combine Params node. Then we combine the prompts using the Conditionings Combine Multiple nodes, one for the positive prompts and one for the negative prompts. The last regional node does not need to be combined here, as it will basically only be used for the background of the image. However, we should not forget the prompts from the basic SDXL setup and connect them to the Combine Multiple nodes as well.

But we are not finished yet. The actual IP adapter is still missing, and this is also one of the new nodes: IPAdapter from Params. In this workflow we can use the IPAdapter Unified Loader, as only one adapter model is needed. The Plus model performs well overall, at least in my experience. The correct wiring runs from the checkpoint to the unified loader and via the IP adapter to the KSampler. Only the positive and negative prompt conditionings then need to be connected. Insert the source images, write the prompts, make the settings, and the first part is done.

The second part is also done quickly. All we need for this is a KSampler, which we can simply copy by dragging it to the right place while holding down the Alt key, with the advantage that all settings for the Juggernaut XL Lightning model are carried over immediately. For upscaling, we leave the image information in the latent space to save resources by using the NNLatentUpscale node. As our checkpoint model is an SDXL model, we only need to switch the node to SDXL and can set a factor of 1.2 for upscaling.

The next node is a type of support node: Automatic CFG. At each step, the node evaluates the potential average of the minimum and maximum values of the CFG value from the KSampler. Although this only has a small influence, it still has a stabilizing effect on the result.

The node that matters, however, is this one: Perturbed-Attention Guidance (Advanced). Because it delivers such amazing results, I want to pause the workflow setup briefly and, true to the channel's motto "Show, don't tell!", demonstrate what the node can do. It won't take long, I promise. In addition to the SDXL basic setup, the nodes just shown are required for it to run optimally. I quickly set this up, connected a Canny ControlNet to the first KSampler, and left the positive prompt empty so that as little input as possible is available.

The upper value, scale, is something like the CFG value of the KSampler. In general, a higher value can strengthen the image structure; if it is too high, the image burns out. The value below it, adaptive scale, counteracts this in the later stages of denoising. A value of zero means no counteraction, and a value of one switches PAG off completely. The unet block setting has the options input, middle, and output. These determine which stage of the denoising has the greatest influence; the values set above are of course decisive here. I keep reading that the middle option is recommended, but during experimentation I found that input can achieve pretty good results. As far as I understand it, the image generation process in Stable Diffusion is divided into blocks, and PAG can influence these with the unet block id value. The sigma start and sigma end settings provide a further option to influence how the node deals with image noise. If these values are negative, this feature is deactivated.

Back to the workflow. Basically, we just need to tie everything together correctly and we're ready to go. Have fun trying it out!

Thanks for watching the video. If it was interesting and/or helpful for you, I would appreciate a like and a subscribe. Last but not least, have a great day.
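
The regional masking step described in the captions (one painted image split into a red, a green, and a black mask) can be illustrated with a small standalone sketch. This is not the code of the Mask From RGB/CMY/BW node; the function name `split_region_masks`, the threshold of 128, and the tiny example image are assumptions made purely for illustration. It also hints at why painting the mask in the brightest possible colors helps: a dull red falls below the threshold and ends up in the black mask.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel 0..255
Mask = List[List[int]]         # 1 = pixel belongs to the region, 0 = it does not


def split_region_masks(image: List[List[Pixel]], threshold: int = 128) -> Dict[str, Mask]:
    """Split a color-coded region map into red, green, and black masks.

    Hypothetical helper for illustration only; the threshold is an assumption.
    """
    masks: Dict[str, Mask] = {"red": [], "green": [], "black": []}
    for row in image:
        red_row, green_row, black_row = [], [], []
        for r, g, b in row:
            # A channel counts as "on" only if it is bright enough.
            hi_r, hi_g, hi_b = r >= threshold, g >= threshold, b >= threshold
            red_row.append(int(hi_r and not hi_g and not hi_b))
            green_row.append(int(hi_g and not hi_r and not hi_b))
            black_row.append(int(not (hi_r or hi_g or hi_b)))
        masks["red"].append(red_row)
        masks["green"].append(green_row)
        masks["black"].append(black_row)
    return masks


# Tiny 2x3 region map: left column red, middle column green, right column black.
region_map = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 0)],
    [(250, 10, 5), (20, 240, 0), (0, 0, 0)],
]
masks = split_region_masks(region_map)
for name, mask in masks.items():
    print(name, mask)
```

In the workflow from the video, each of these masks would feed the mask input of one regional conditioning node, with the black region serving as the background.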
Info
Channel: Show, don't tell!
Views: 4,740
Id: 5RV793RiC6c
Length: 9min 34sec (574 seconds)
Published: Thu Apr 25 2024