ComfyUI: IPAdapter Update | Stable Diffusion | Deutsch | Englische Untertitel

Video Statistics and Information

Captions
Hello and welcome to this video, in which I'd like to share some knowledge with you again. There was an IPAdapter Plus update, and quite a bit has happened there; unfortunately I haven't had the opportunity to make a video about it yet, so I'm doing that now. Matteo, the developer, has published an update video on his channel Latent Vision. I'll link it down in the description, you're welcome to take a look; there is a bit more information in there than I'm covering right now. I've picked out two cool additions, or rather one change to the node itself and one new technique that he presented, and we'll take a look at those. In addition, I'd like to give a small update to the last video I published on this channel about the IPAdapter. I noticed a few things there, and Matteo also jumped onto our Discord afterwards and cleared a few things up, and I'd like to pass those on to you as well.

I would say we get into it now; I'll show you briefly what I mean. If you have the ComfyUI Manager, go to Install Custom Nodes, search for "IPAdapter", and up here that's the ComfyUI IPAdapter Plus. That's what we'll talk about; you can also find it on Matteo's GitHub page, where his videos and his channel are linked as well. Take a look at it. As I said, this is an update to another video that I've already made, so feel free to watch that again too. It's really a great node and a great technique for creating amazing images.

Well, if you've installed it, let's jump into the first piece of information I'd like to give you. We always set up the IPAdapter by adding an Apply IPAdapter node, pulling the inputs out here, and selecting the model we want, let's take the Plus model. We pull out Clip Vision and select the Clip Vision Loader, that's already the right model. I'll also build a workflow around it: we take RevAnimated as the checkpoint, this VAE, Clip Skip at minus two, positive and negative prompts stay empty, and otherwise everything stays as it was. I'll take a KSampler (Advanced) right away, we'll get to why in a moment. Just right-click here and add a preview, then the whole thing works, or I'll build in an Image Save node instead. That's practical because I can enter a folder name here and set Embed Workflow to false, since I don't need the workflow embedded in these images, and then we can start.

So, to load an image we always used Load Image; here's one of my old ones, I'll just pull this one in, and then we simply routed the model down here and through the Apply IPAdapter, and the whole thing worked. An important piece of information, in my opinion: because of Clip Vision, the images are always scaled down to 224 by 224 pixels. We have 300 by 300 here, and everything gets scaled down because Clip Vision can't work with anything else. That's why there is the Prepare Image for Clip Vision node, where we pull the image in here and then onward. I'll take another one, this picture here, which is a bit bigger and a bit wider, it's pretty large, and it doesn't help to load large images into the IPAdapter because, as I said, everything is rendered down to 224 by 224 pixels. That's why I use the Prepare Image for Clip Vision node to decide which area of the image you want to focus on.
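To make the idea concrete, here is a minimal Python sketch, using Pillow, of what Prepare Image for Clip Vision effectively does: crop a square from the chosen side of the picture, resize it to 224 by 224 with the Lanczos filter (more on that filter in a moment), and optionally sharpen. The real node works on ComfyUI's image tensors; the function name, the file name and the "position" values here are only illustrative assumptions.

from PIL import Image, ImageEnhance

def prepare_for_clip_vision(path, position="center", sharpen=0.0, size=224):
    # crop a square from the left, right or center of the image
    img = Image.open(path).convert("RGB")
    w, h = img.size
    side = min(w, h)
    if position == "left":
        box = (0, (h - side) // 2, side, (h + side) // 2)
    elif position == "right":
        box = (w - side, (h - side) // 2, w, (h + side) // 2)
    else:  # "center"
        box = ((w - side) // 2, (h - side) // 2, (w + side) // 2, (h + side) // 2)
    # scale the crop down to the 224x224 that Clip Vision actually sees
    img = img.crop(box).resize((size, size), Image.LANCZOS)
    if sharpen > 0:
        img = ImageEnhance.Sharpness(img).enhance(1.0 + sharpen)
    return img

# e.g. focus on the left part of a wide reference image:
# prepare_for_clip_vision("reference.png", position="left", sharpen=0.5).save("prepped.png")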
And if possible, and I would say this applies to everything in ComfyUI, if you use any scaling, always use the Lanczos filter. There is usually bicubic, there's nearest-exact and so on, sometimes box or bilinear; if possible, always Lanczos. Matteo also gave an example of this at the end of his video: just by using the Lanczos filter you get a lot more crunch and a lot more sharpness into the image.

But I'll show you: if we add a Preview Image here and mute these two nodes with Ctrl+M, then let's take a look, and here we see 224 by 224, that's the size Clip Vision works with. And don't think that if the images are larger you get more details out of them; unfortunately it's not like that. The node isn't strictly necessary, if we're honest, it also works perfectly fine as it is, but it definitely has its justification, especially when we want to choose the crop. If we say we want Left here, then we see it took the red blobs; if we say Right, then it takes these blue blobs; and if we even say Center, then it focuses on the middle of the image. So always keep in mind that large images don't bring anything, no extra details or anything else, and this node is quite important for putting the focus where you want it in the image. We can also add a little sharpening, then it gets a bit sharper, but we still stay in the 224 by 224 range.

Well, I'll throw in the other new features over the course of the video, because they fit better into the process. I would say we start with the biggest innovation the suite has received. Let me first make a little more room here, then it becomes clearer. So far it was like this: we have the Apply IPAdapter here (I just drank an energy drink and my mouth is a bit sticky, excuse me), and if we wanted to load several images, we did it by using an Image Batch node. We pull the images in, go from there into the Apply IPAdapter, and with that we have combined these two styles. If I start that now, it has to load the model again, and then these two impressions are mixed together. The IPAdapter describes what it sees there, to put it loosely, and converts it into information for the sampler; that's what Clip Vision does here. Remember, CLIP is there to connect images and text descriptions. And now we have a rough mixture; here is a good example: we see our picture of our woman together with this oil style, but in the oil style also this pencil sketch, and the whole thing without any prompts. So we have mixed these two styles.

What we were also able to do is add noise here; this is a noise injection at this point, and it ensures that we get even more details into the picture. You can see that pretty well here, it has become much finer and more detailed. It would also be advisable to add the Prepare Image for Clip Vision node again; we do that here and pull it into image 1 of the batch. It's a bit ugly, but I think we can manage. If we say we want to put the focus on the left area, I would expect a little more red, and yes, we now have the red blobs. If we put it on the right area, I would expect more of these blue blobs; it didn't work out quite so well, but you can see the color scheme drifting a bit into the blue, because we now have this blue area in focus.
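Before we move on, a side note: the Image Batch node used in this older approach essentially just concatenates the reference images along the batch dimension. A tiny sketch of that, assuming ComfyUI's usual [batch, height, width, channels] tensor layout; the variable names are made up for illustration.

import torch

def image_batch(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    # both inputs are [1, H, W, 3] tensors with values in 0..1; sizes must match
    assert img_a.shape[1:] == img_b.shape[1:], "images must share H, W and channels"
    return torch.cat([img_a, img_b], dim=0)  # -> [2, H, W, 3]

# e.g. an oil-painting reference plus a pencil-sketch reference, mixed with equal weight
oil = torch.rand(1, 224, 224, 3)
sketch = torch.rand(1, 224, 224, 3)
batch = image_batch(oil, sketch)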
Now for the innovation; I'll take this out again so it's a bit clearer for the video. The innovation that we now have is the... one moment, no, I mixed up the names a bit: the Encode IPAdapter Image node, that's it.

By the way, the second point I learned from Matteo on Discord: this noise injection is coupled with the weight. So if you turn down the weight and thereby give more weight to the prompt you enter (the less weight we give the IPAdapter, the more weight the prompt gets), then the noise injection is also turned down, because it is coupled with the value of the weight. Just keep that in the back of your head.

Well, we can actually throw this node away; we need a little more space again, and this is one of the big innovations, or actually the big innovation. What we have to do now is bring Clip Vision in here, but we can skip the Image Batch at this point, and now we can throw the pictures directly into the Encode IPAdapter Image node. At the back we have to add another node called Apply IPAdapter from Encoded; quite a lot of space here. And here we have our model again, so we now pull the model into this node and from there into the sampler, the IPAdapter goes in here, and the embeds that come out of the encode node go in here. Here we also have a weight for everything together, but the interesting thing is that we can now put different weights on the four images that are available here. That means we can say image 1 should get a little less weight, let's say 70 percent, and weight 2, so the second picture, should still get 100 percent.

And here is already the third piece of info I'd like to give you, which was also mentioned in the comments, thank you very much for that: if we select the IPAdapter Plus model up here, or any of the Plus models, then we also have to set ipadapter_plus to true here. That also has an influence on saving. I said in the last video that we can save the embeddings we create here and load them again later with a load node. That means if you use a Plus model and you want to save the embeddings away, then you have to set ipadapter_plus to true beforehand; only then does it work. When loading it is no longer so important, because when loading we feed the embeds directly into the Apply IPAdapter from Encoded up here, but when saving it matters. That's why it might be quite good if you don't just write "ipadapter" here but "ipadapter plus" and then the name, not of the model, but of the embeddings you want to save away, to stay correct. Then you know that you used a Plus model here, and above all others know it too. Although it is probably no longer so important for others, since they can just throw the embeds into the Apply IPAdapter from Encoded and don't have to worry about it anymore, I still think a certain naming convention at this point isn't wrong.

Well, let's throw those away again and see what our settings do here. We said weight 1, so picture 1, only gets 70 percent, weight 2 gets 100 percent, and we use a Plus model. Exactly, now it has made a great picture again; we have to see if we can push it a little further at this point. Yes, here you can see it: we still get colors, but the character of the pencil sketch comes through much more strongly. Let's pull it a little further, that's how it works: if we now say we only want 20 percent here, then the pencil sketch comes through much more strongly.
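Conceptually, the per-image weights on this node boil down to scaling each reference image's Clip Vision embedding before everything is handed to Apply IPAdapter from Encoded, and the naming advice for saved embeddings is easy to automate. A rough Python sketch of both ideas; the tensor shapes are dummy values and the helper names are made up, this is not the node's actual code.

import torch

def combine_encoded(embeds, weights):
    # embeds: one embedding tensor per reference image (all the same shape)
    # weights: one weight per image, like weight 1 .. weight 4 on the node
    assert len(embeds) == len(weights)
    weighted = [e * w for e, w in zip(embeds, weights)]
    return torch.cat(weighted, dim=0)  # what "Apply IPAdapter from Encoded" would receive

def embeds_filename(name: str, is_plus: bool) -> str:
    # naming convention: keep "plus" in the file name so you know to set
    # ipadapter_plus to true again if you ever recreate these embeds
    return ("ipadapter_plus_" if is_plus else "ipadapter_") + name

# e.g. oil painting at full strength, pencil sketch at 70 percent
oil_embed = torch.rand(1, 257, 1280)     # dummy shape, not the real Clip Vision output
sketch_embed = torch.rand(1, 257, 1280)
combined = combine_encoded([oil_embed, sketch_embed], [1.0, 0.7])
print(embeds_filename("oil_and_sketch", True))  # -> ipadapter_plus_oil_and_sketch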
We can of course also go the other way: less of the pencil sketch and more of the oil painting, so we turn down weight 2, and then it comes out a bit mixed, proportion-wise. A little better, but we see we are much more in the territory of the oil painting instead of the pencil sketch. Here too we can add noise, let's do that once. Okay, that's not so nice now. Yes, that's also far from nice, but you can see what can come out of it; it's just a bit crispier and grittier again, and here too, as I said, the noise depends on the weights.

Now you might think, okay, we only have four image inputs here and I would like more. We can still use image batches and just hang one on image 1, that works as before. So you already have a great opportunity to batch a few images together thematically, hang them into the corresponding inputs and then control them specifically. Up here you of course also have a weight again for the complete output that comes out of here. And I'll give it a prompt, "a sketch of a cyberpunk woman", just to be safe. So, yes, we see it's trying a bit of leather clothing and the like, it goes more in the direction of cyberpunk now, though only with 40 percent IPAdapter influence, and we can mix all of that, so it gives us a lot of possibilities to control all these things. Give this one a little more here, say the pencil sketch should have less, and we get very, very nice pictures; that looks really cool. Pack in a little noise, say about a third of noise thrown in, that makes a difference. We still have the oil painting loaded here with a weight of 1, the pencil sketch only with 57 percent, and something nice comes out of it. That is the first big innovation, and I think I would always build a workflow with this node right from the beginning, even if we only want one image or a batch of several in one input; that just gives us the possibility to add more pictures later without having to rebuild anything.

The other innovation, or technique, that Matteo showed he called time stepping, and I thought that was very, very cool. So we put the weight back to 1 and leave everything at the front as it is, and I have taken an advanced sampler here. With a KSampler (Advanced) we have, among other things, the possibility to say up to which step, out of let's say 30 steps, this sampler should render, and after that another one can take over. So we add a second sampler here; we can now say that the first sampler should stop at a certain step in the rendering and the second should then do the rest of the work. And to make it nice and comfortable for us, we share the end_at_step of the first sampler and the start_at_step of the second sampler: we double-click and get a Primitive node, we say we have 30 steps, the first sampler should stop rendering at step 10, and the second should continue rendering from step 10. We still have to set add_noise to enable and return_with_leftover_noise to enable in the first sampler; in the second we set add_noise to disable and return_with_leftover_noise to disable, and then we let it run through.

Well, that looks terrible. Did I do something wrong? We have a wrong sampler setting here. Let's try again. Still looks terrible. So, done. Yes, very silly: I set 20 steps here and not 30 as in the other sampler. So the actual recommendation would be to convert steps to an input on both samplers, double-click on steps, say 30 here, and hang that value into both, as in the sketch below.
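To summarise the two-sampler setup: the two KSampler (Advanced) nodes only differ in a handful of widget values, and wiring one shared steps primitive into both avoids exactly the 20-versus-30 mix-up from a moment ago. Written out as plain Python dicts purely for illustration; the keys mirror the widget names you see on the node, the dict itself is of course not how ComfyUI stores them.

TOTAL_STEPS = 30
SPLIT_STEP = 10  # how many steps the IPAdapter-patched model gets

first_sampler = {                  # gets the model coming out of Apply IPAdapter
    "steps": TOTAL_STEPS,
    "start_at_step": 0,
    "end_at_step": SPLIT_STEP,
    "add_noise": "enable",
    "return_with_leftover_noise": "enable",
}
second_sampler = {                 # gets the plain base model (RevAnimated here)
    "steps": TOTAL_STEPS,
    "start_at_step": SPLIT_STEP,
    "end_at_step": TOTAL_STEPS,
    "add_noise": "disable",
    "return_with_leftover_noise": "disable",
}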
Then something like that can't happen anymore. It's the little details sometimes. So, I only set it to 30 steps. So, time stepping. Matteo called it time stepping, and I think it's pretty cool, because he did it to give the base model a little more room again. I'll do it this way: I want to take out this oil style and just keep our pencil sketch at this point. Now we do the following. Here we see that it has taken the pencil sketch as the reference again. What Matteo did at this point: we take the model from the front, and I'm going to take a detour over reroutes because I'm a little unsure how the tinyterraNodes work internally. If I had pulled the model out of here now, I don't know whether it would have passed on the optional model at this point or only the model that comes from the pipe. I think it passes the optional model on, and we don't want that; we want to get back to the point where we use our original model. That's why we take it right here at the front, at the loader.

And if we let that run now, we see that the first 10 steps were rendered with the IPAdapter model, so we push it in the direction of the pencil sketch, but then the pure base model takes over for 20 steps. In this case we get a drawing, but with more influence from RevAnimated itself. And here we can say we want to do half of all the steps with the IPAdapter model; you can see it's going in the direction of the pencil sketch. And here we can see that it is clearly going in the direction of our IPAdapter setup again. So you have good control over how much room you want to leave to the pure model at the end. We can therefore also push motifs much more strongly in the direction of the style we want. So here there is nothing of cyberpunk, but here we only want 8 steps up top; then we give the model behind it 22 steps to pick things up again. And now you can see that we get a picture similar to what we throw in, so our portrait of a woman, but the model takes over a lot more at the end. And if we exaggerate and say 20 steps with the IPAdapter model here, then RevAnimated hardly has a chance to do anything at the end, but that can be quite desirable. So actually the trick is that we render with the IPAdapter for a certain number of steps and then allow the model to do its pure work again without the influence of the IPAdapter.

What did it do here? Well, I think it has a lot of problems with the oil painting; it's trying to get something like that here. Pretty cool comic style, by the way. We can do the mix of the two variants again: we are now at 12 up here, I'll go down to 10 steps. Now it has made the mixture of the two again, and at the end you can still see the outlines it created before and the picture that RevAnimated makes afterwards. I think it's pretty cool that it now translates the oil painting into such nice angular edges at this point. It looks a bit tacky; I like that. We can pass two more steps to the IPAdapter sampler. Yes, you can already see that pretty cool things come out of it, very, very cool things. And now let's take the example picture again and see what happens there. Hey, beautiful. Whoa, that one gets me, I'm blown away. With something like that, it's probably better if we just give the IPAdapter six steps and then let the model take over from there.
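If you prefer to think in fractions ("half of the steps for the IPAdapter", "a third", and so on), a tiny helper like this, purely illustrative, turns that into the step number you type into the shared primitive:

def steps_for_ipadapter(total_steps: int, fraction: float) -> int:
    # e.g. 30 total steps and half for the IPAdapter model -> stop/start at step 15
    return round(total_steps * fraction)

print(steps_for_ipadapter(30, 0.5))   # 15
print(steps_for_ipadapter(30, 1 / 3)) # 10
print(steps_for_ipadapter(30, 0.2))   # 6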
Yes, with six steps that looks much better, saved, and actually quite cool, except that it gave her several arms; you can still control that a bit with prompting. But it's a really cool style, I like it. Yes, okay, good. These are the new features, and this node here, as I said, is really super cool, because you can still say, okay, we want our image 1 here, but it should only have 60 percent influence, turn the noise up a bit again, and let it run. That's why it's pretty cool; RevAnimated now suddenly produces such an absolute comic style. Fascinating, I think it's really good. That's the biggest innovation, then.

Remember: big pictures are not worth it, Clip Vision needs the focus set for it, so use the Prepare Image for Clip Vision node, as it's called; let me check again before I tell you nonsense, yes, Prepare Image for Clip Vision. And the noise is coupled to the weight at this point. And remember to set the plus option to true; that means if you don't take a Plus model, you set it back to false and it just works that way. You don't notice it that much. Yes, look, that's the difference from Plus to non-Plus: the Plus model provides a lot more tokens, which means the description will be better. We now have a little more interpretation here because we took the normal model instead of the Plus model. Also cool, but fine. So when you want to save or pass on the IPAdapter embeddings, it's best to use a naming convention that also has "plus" in it, so that you know you may have to activate plus somewhere. Maybe there will be more updates where that becomes necessary again.

Yes, then I hope I was able to give you a little insight. In any case, take another look at Matteo's video; he also touched on other topics. Among other things he explains the models a bit and he also goes into the area of AnimateDiff. I left that out here because we haven't even talked about AnimateDiff on this channel yet, but it's definitely worth it. The link is in the description, take a look over there, and otherwise I wish you a lot of fun with the IPAdapter. It's already pretty much mandatory in at least 50 percent of all workflows, I would say, to create images or to push a bit in directions that a model can no longer follow on its own. Just have fun with it and have a good time. See you in the next video. Until then, take care. Bye.
Info
Channel: A Latent Place
Views: 1,518
Keywords: ComfyUI, Stable Diffusion, AI, Artificial Intelligence, KI, Künstliche Intelligenz, Image Generation, Bildgenerierung, LoRA, Textual Inversion, Control Net, Upscaling, Custom Nodes, Tutorial, How to, Img2Img, Image to Image, Model to Model, Model2Model, Model Merge, IPAdapter
Id: CGP_j0nGdxE
Length: 29min 34sec (1774 seconds)
Published: Mon Oct 30 2023