New IP Adapter Model for Image Composition in Stable Diffusion!

Video Statistics and Information

Captions
You know how last week I showed Visual Style Prompting? Well, this week I've got the perfect companion for you: the IP composition adapter. Just as the name suggests, this model is for image composition. Scrolling down their page, you should easily be able to see what's going on. We've got some SDXL examples, starting with a slightly badgered hugging face. Maybe you're thinking, "Oh, this is just like a Canny or depth ControlNet." Well, not really. Scrolling down, we get to see a guy hugging a tiger, because that's a perfectly normal thing to prompt for. Seriously though, it isn't; don't use any type of cats in your prompts, people. Oh, sorry, I mean... You can see the composition is similar, but the guy is facing the other way and things are different, which Canny wouldn't normally do. There are a few more SDXL examples there too, with a person standing and holding a thing, a face, and another one holding a stick. Scrolling down even further, there are some Stable Diffusion 1.5 examples doing exactly the same sort of thing. As you can see, it's a lot less strict and imposing than something like a ControlNet, instead giving you a similar composition, all without having to type a single prompt.

Being a model, it should work in anything which supports IP-Adapter, such as the Automatic1111 and Forge web UIs, though I'm going to be using ComfyUI because I like making workflows. To use it, you'll just need to download the model to the usual place for your chosen interface: for ComfyUI that's the models/ipadapter directory, and for Automatic1111 that's the models/ControlNet directory.

Right, let's get into this and start with a bog-standard workflow. Talking of workflows, if you want to use the exact same ones I'm using in this video, they're all already available to the wonderful nerdling-level Patreons. Easy-peasy bit to start with here, just so we're all on the same page as to what's going on: I'm going to disable this composition group here, there we go, bypass, then generate and see what that image looks like as a normal SD 1.5 image. And there you go, we've got an image. If I was to generate another one, is it going to be the same image? Well, no, it isn't; it's going to be random each time, because I haven't got anything in my prompt.

Turning the composition adapter on, however, means I no longer get those random images; instead I now get images that are very similar to the one I have provided. There you can see I've got a person, a little bit of a desert, a blue sky and some mountains in the background. So it's not exactly the same, like you'd get with a Canny or a depth; it's just taking the composition and going, "yeah, there you go, it looks something like this", and I haven't had to use a positive prompt at all yet either. How cool is that? Totally different style, though, of course, but the composition is great. I'll touch on style in a minute, but like we saw in the earlier examples, you can use prompting to change certain aspects of your composition. Rather than an empty prompt, you could do something like "a bearded man" in order to get a guy there instead, "forest" will replace that desert with a forest, or you could do something like "by a lake" and, yes, you've guessed it, that will turn the desert into a lake, each time keeping the actual composition pretty similar to the guide image you've provided.

If you've used IP-Adapter before, you'll know some models have a stronger impact than others, meaning the weight value is something you'll have to adjust. This composition model isn't as strong as some of the others I've tried, so do be aware of that; I've often found myself turning the weight up instead of down. In some cases I find that values below 0.6 will barely match the composition at all, and once you hit a weight of around 1.5 things may start to look a little bit messy. One is typically just right, though sometimes going a bit higher can help, depending on what it is you're trying to achieve. You can change the end_at value as well, and it even seems good around 0.5 with a weight of one.
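If you'd rather script this than use a GUI, here's a rough sketch of the same idea using the Hugging Face diffusers IP-Adapter API. This is not the ComfyUI workflow from the video, and the checkpoint and adapter names below are placeholders rather than confirmed paths; point them at whichever SD 1.5 checkpoint and composition-adapter file you actually downloaded. The scale value corresponds to the weight discussed above.

import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

# Placeholder model paths: use any SD 1.5 checkpoint and swap the standard
# IP-Adapter weights for the SD 1.5 composition-adapter file from its page.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(1.0)  # the "weight": below ~0.6 barely matches, ~1.5 gets messy

composition = load_image("guide_image.png")  # the image whose composition you want to keep
image = pipe(prompt="", ip_adapter_image=composition,
             num_inference_steps=30).images[0]
image.save("composed.png")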
With the composition sorted, style is probably something you'll be looking at next. For that, you can do all the normal things, like prompt for it: if you wanted a more watercolour look, you could just throw "watercolor" into your prompt, or perhaps a more black-and-white sketch style is more your thing. Another easy change you could make is the model you're using. Up until now I've been using RealCartoon3D, so changing to something like Analog Madness, along with a more photorealistic style prompt, will also change the output quite considerably.

Does it work with all the usual stuff, such as ControlNets and whatnot? Well, yes it does; it's just a model for IP-Adapter. And what better to go with a composition adapter than a style adapter? This time I'm using the SDXL composition adapter along with an SDXL checkpoint and the Visual Style Prompting from my previous video.

Another thing to note is their suggested guidance scale, which is way down at three. Now, this one I'm not too sure about, as they say it applies more to SDXL models than to Stable Diffusion 1.5, yet for me it seemed to be the other way around. If I just go back to that 1.5 workflow for a second, here I'm using a guidance scale of seven, and to me that looks to be going too hard on the colours, so it should definitely be lowered. However, over here in SDXL land with a guidance scale of seven, well, that looks absolutely fine to me. The other thing I found is that the lower the guidance scale, the more the style will come through over the composition. In this particular example I've dropped the guidance scale from seven to three, and I'm sure you can see the difference. Rescaling will change these values, of course, so say I slap a rescale node in and connect it up: now I can easily double that seven up to 14 and still get a fairly reasonable image.

Now, this isn't completely magic, of course. You can't just slap any old pictures in there and hope that it will come out brilliantly, like in this example: here I'm prompting for a graffiti-art-style rodent, and yet I've given it a composition of sort of three pins and a little bowling ball, and for the style I've got a painting. So the style in my prompt isn't matching the style I'm sending in with the picture, and the composition is just really weird. It does its best, as you can see, and it is pretty cool: it's got the people there, it's got some faces, they've got their rodent ears, so it tries really well, but it's perhaps not exactly the best collection of things to try to merge together. What I've found tends to work best is when everything is coherent. Here I've got "smiling" and "happy" because my composition is of a person, so the prompts relate to that (smiling is something a person can do), and for the style I've got a pattern. Those things tend to work really well when all the bits complement each other.
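To experiment with the guidance-scale behaviour described above outside ComfyUI, here is another hedged sketch, this time with an SDXL pipeline. Again, the checkpoint and adapter names are placeholders rather than the exact files used in the video, and diffusers' guidance_rescale argument is only a loose analogue of the rescale node mentioned earlier.

import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.utils import load_image

# Placeholder model paths: substitute the SDXL checkpoint and the SDXL
# composition-adapter weights you actually use.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models",
                     weight_name="ip-adapter_sdxl.bin")
pipe.set_ip_adapter_scale(1.0)

composition = load_image("guide_image.png")
for cfg in (3.0, 7.0):  # a lower CFG lets the style prompt come through over the composition
    img = pipe(prompt="watercolor painting",
               ip_adapter_image=composition,
               guidance_scale=cfg,
               # guidance_rescale=0.7,  # loose analogue of a CFG rescale node
               num_inference_steps=30).images[0]
    img.save(f"sdxl_cfg{cfg:g}.png")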
There you have it, then: both style and composition, just using images, and you can guide it with prompts too. It's certainly something I've been having a lot of fun with, as you can tell. Want to know more about Visual Style Prompting? Just click the link for the next video.
Info
Channel: Nerdy Rodent
Views: 11,927
Keywords: IPAdapter, stable diffusion, composition adapter, ip adapter composition, stable diffusion composition and style, comfyUI, generative art, tutorial, guide, howto, education
Id: Sld3wMPbbb4
Length: 8min 37sec (517 seconds)
Published: Fri Mar 22 2024