IP ADAPTORS - A FAST New ControlNet for ComfyUI and SDXL! Plus RunwayML - Image to Video Prompts

Video Statistics and Information

Captions

So it's been a few months now since Runway added the ability to take text and turn it into video. You can now, however, use image prompts, and you can see some of the really cool things that the users here have created. I'll be doing that with some of my own images, and we'll actually be working with an image that I created using a new type of adapter for Stable Diffusion, so we'll be seeing what Runway does when it's given one of the images that I created.

Now, before we get into that, I want to remind you guys that I have got a couple of courses on Stable Diffusion, ComfyUI and SDXL. We've got a beginner's course here and also an advanced course, and I will give you some really amazing discounts in the description. Make sure you use those quickly, otherwise they tend to get used up fairly quickly.

We'll be taking a look at this new feature, which is called the IP-Adapter, from Tencent. It's a really nice new adapter which will work a little bit like a ControlNet.

[Music]

It looks like the video that I created inside of Runway is ready, so let's see what that looks like.

[Music]

Okay. I can never get it to do exactly what I wanted it to do, but this is okay. This is the alien city, monumental sculptures.

I'll be showing you the images that I created inside of ComfyUI. So this is one of the images here: a monumental sculpture above this city on an alien planet. To get there, we brought in an image of Mount Rushmore, which I'm using as a control, and that image I got from Canva. With Canva I was able to download about four or five of these images, and we're going to use those as sources for the depth map. The depth map is going to allow us to control where the sculptures appear. There's obviously a text prompt, which is talking of the alien city and these sculptures, Mount Rushmore, all that type of stuff. I wanted to create a classical feel, so there's classical architecture here, and that's supposed to represent a city. It's supposed to be an actual planet where you can see these giant sculptures just coming out of the rock.

The software did a fine job; it represented more or less exactly what I wanted here. It took a few attempts, though. You can see the depth map that we got. I found with the depth maps that it really helped if you're using an HDR image, which I was doing there, and the depth map correlated strongly with the results that we got. You can see Abe Lincoln, and I think that's Washington. So where you've got a good depth map, it can work with the ControlNet really well. Here I was not using a ControlNet; I was using T2I-Adapters, which gave me a little bit more aesthetic freedom. We'll take a look at a couple of ControlNet examples, which are fine, but they don't give me the same aesthetic results.

The input image was this guy here, who seems to be ready for any type of campaign that you want to throw at him. So that was the basic idea: you take a depth map and T2I-Adapters, then you take the IP-Adapter, you give it a prompt and an image, and it produces something quite amazing.

Now, I was working with SD 1.5, so the images are expected to be 512 by 512, but I ended up expanding the images quite a lot into a landscape. I think I'm using DreamShaper as the model, and it allowed me to create images that were fairly large, much larger than the standard 512 by 512, and most of them look quite okay.

So let's take a look at a few cherry-picked examples. This is one here. I think this guy was involved in the COVID information wars; I think he lost. We saw him turned into these monumental sculptures. It's basically the same theme: we've got the alien city at the front, we've got the monumental sculptures, and my, he's looking impressive there. They did a good job with the sculpture.

If you're wondering what an ordinary photograph would look like, I used this ordinary one. Same control; here we're using a T2I-Adapter, and it was beautiful. It's kind of like this surreal image. This does look like a movie poster, doesn't it?

And one now with a ControlNet. This is one of the ControlNet ones, and I tell you, the ControlNet ones I didn't like aesthetically. This is the woman, the input. The depth map is again one of the Canva Mount Rushmore images, and it's figured out the trees there, and there are sculptures in the background. The depth map wasn't as good, and I felt something seemed a little bit off with the aesthetics of every one of the ControlNets that I used; I didn't like the overall effect. With the T2I-Adapters, sometimes it worked really well and sometimes you got results like this, where it just looked a bit ridiculous. So I found with this particular one it just didn't translate into something that was good, but with the T2I-Adapters, generally speaking, I got better results.

With the input images, I really like this one with this Roman gladiator. You've got him represented reasonably well there and there, and it really looks like they made some kind of metallic golden statue out of him. He's looking impressive there. It's all looking kind of Roman and gladiatorial over here, and that was intentional; I used that in the text prompt.

Another ControlNet here. With the ControlNet ones there was something about the color, something about the contrast, that just didn't seem right. We've got the same thing happening here, but this was probably the best ControlNet one that I got. After that I started using the T2I-Adapters.

And then there were some really funky ones. This one was using the same image as the depth map, and we ended up with this really fascinating result, which gave me the confidence to try other depth maps.

But there were some earlier examples where I was telling it just the basics: just give me some sculptures that are coming out of the granite rock. It does a fantastic job; even without the ControlNet it generally does a fantastic job. You can see these are some of the examples here, some of the small examples and some of the enlarged examples. This one was where we were not using a ControlNet at all, so we were just using the image prompt and the text prompt. Once again we're asking for something that looks a little bit like Mount Rushmore, so you can see they're sort of coming out of the mountains, supposed to be gigantic. We've got our lady there, who is the inspiration, the image prompt rather than the text prompt, and she's being used as the inspiration in quite a nice way there.

It's also possible to do some abstracts. This was one of the abstracts that I did, where we took this image, no depth map or anything, and just gave it free rein to produce something amazing, and it did so, producing this particular outcome here. This one I liked; I thought this was a really nice attempt to play on this image and create something really dynamic, really original. And this is one where we took an abstract image, gave it that monumental statue prompt, and it did something pretty cool with that. It was able to produce something that resembles both this and the kind of monumental statue that I was looking for.

So it's a very creative way of working, and it allows you just the right amount of control; it allows you to give the software just the right amount of freedom to get the results that you want. You can use it to create all sorts of results. The ones that I was using weren't perfect by any means. There were some errors that I encountered, not that frequently, but occasionally I did encounter some errors, and some of the files were prone to errors.

This is, I think, the one that we used in the Runway animation, and you can see again this is SD 1.5, but it's a huge image. We managed to create some extremely large images from SD 1.5. There are also some abstract ones, so that's a helix being used to create this image, and you can see it in the enlarged version here. It's something which is kind of original; sometimes you can trace it back to the source image, sometimes it's a little bit more difficult. But yeah, a very creative and very powerful way of working with images and text prompts.

Now, in the paper for this particular IP-Adapter, what they say is that in this work they propose IP-Adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models. The core design of the IP-Adapter is based on a decoupled cross-attention strategy, which incorporates separate cross-attention layers for image features. And they say both qualitative and experimental results demonstrate that the IP-Adapter, with only 22 million parameters, performs comparably or even better than some fine-tuned image prompt models and existing adapters.

And to be sure, it was very, very fast when I was using it. It was incredibly fast; it was almost like there was nothing there, almost like, hey, this is not taking up any extra time at all.

So the IP-Adapter is up here on GitHub from Tencent, and they've got some stuff here that you can take a look at if you want to. We've also got some models over on Hugging Face, so I'll link to these if you want to take a closer look.

Remember, I'm probably going to start including these in the courses; I'm probably going to have some lectures coming up. I did find that there were one or two problems with some of the files, but once I'm a little bit more confident that everything's good, I think I'll include a couple of lectures. If you want to see whether or not the IP-Adapters have been included, you can come to the course content and just open up the sections. Most likely it's going to be in the advanced course for ComfyUI and SDXL, so just come in and check to see whether the IP-Adapters have been included. I do also have another course for Automatic1111, so I might update that one as well, but I'm not 100% sure as yet whether I'm going to be working on that soon. Anyway, I'll have discounts for these in the description, so hopefully see you over there.

[Music]
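
Editor's sketch: the decoupled cross-attention idea mentioned in the captions can be illustrated roughly as below. This is a toy example in plain Python, not Tencent's implementation. The function names, the `image_scale` parameter and the tiny hand-written vectors are all assumptions made for illustration, and the learned projection matrices that a real model would apply to the text and image features are left out. The point it shows is only that the same queries attend to the text features and the image features in two separate attention passes, and the two results are added together.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention on plain lists of vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def decoupled_cross_attention(queries, text_kv, image_kv, image_scale=1.0):
    """Attend to text and image features separately, then sum the results.

    image_scale is a hypothetical knob for how strongly the image prompt counts.
    """
    text_out = attention(queries, *text_kv)
    image_out = attention(queries, *image_kv)
    return [[t + image_scale * i for t, i in zip(t_row, i_row)]
            for t_row, i_row in zip(text_out, image_out)]

# Toy data (made up): one query, two text tokens, one image token.
queries = [[1.0, 0.0]]
text_kv = ([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])   # (keys, values)
image_kv = ([[0.0, 1.0]], [[5.0, 5.0]])                          # (keys, values)

text_only = attention(queries, *text_kv)
with_image = decoupled_cross_attention(queries, text_kv, image_kv, image_scale=1.0)
without_image = decoupled_cross_attention(queries, text_kv, image_kv, image_scale=0.0)
```

Setting `image_scale` to 0 in this sketch gives back the text-only result, which is one way to think about the image prompt as an add-on to the normal text conditioning rather than a replacement for it.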

Info

Channel: Pixovert

Views: 4,853

Keywords: Stable diffusion Controlnet, Stable diffusion tencent, Stable diffusion SDXL, Stable diffusion ip adaptors, RUNWAYML image to video, runway ml i2v, runwayml, runway ml image2vid

Id: vOWQMb9-1EM

Length: 11min 2sec (662 seconds)

Published: Wed Sep 06 2023