Hi everyone. I'm Vishnu Subramanian, founder of
Jarvislabs.ai. In this video, we have brought in a copilot and his name is Thamim, who is going to
help us with creating a workflow for IP adapter, combining it with ControlNet. We are going to
bring in our image and we're going to bring in another image and see how we can bring
the face from one image and structure from another image. Let's do that. If you are new to
IP adapter or ComfyUI, we have been making videos and you can check our playlist. It'll be super helpful. You can go check it out and come back to this. If you already know how to use IP adapter, then let's get started. Right. The first thing we will do is load a checkpoint. No, let's load an image,
Right. The reason is, let's start with the ControlNet. To pass an image to ControlNet, we actually need to extract the features of the image first. So let's not go with this one. Let's not even go with this. Let's choose a good image, maybe this one. Okay. Nice. She's an Indian actress, by the way, in case you don't know her. Okay. And let's bring in a depth map.
Right. This one is called the MiDaS depth map, I think I'm pronouncing it right. Okay. And let's add a preview image. Right. So this is going to be our depth map, and let's quickly put it into a group. Let's add it to a group. Okay. Then we can give it a name. Image preparation. Okay.
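By the way, if you drive ComfyUI over its HTTP API instead of the UI, this image-prep group roughly corresponds to a prompt fragment like the one below. The node class names (especially the MiDaS preprocessor) are assumptions taken from the ControlNet auxiliary node pack and may differ in your install, so treat this as a sketch:

```python
# Sketch of the image-prep group in ComfyUI's API (prompt) format.
# Node class names are assumptions from the controlnet_aux pack;
# verify them against your own installation.
image_prep = {
    "1": {"class_type": "LoadImage",
          "inputs": {"image": "reference.png"}},
    "2": {"class_type": "MiDaS-DepthMapPreprocessor",
          "inputs": {"image": ["1", 0]}},  # ["source_node_id", output_index]
    "3": {"class_type": "PreviewImage",
          "inputs": {"images": ["2", 0]}},
}
```

Posting this, wrapped as `{"prompt": image_prep}`, to a running ComfyUI's `/prompt` endpoint should queue the same depth-map extraction the UI nodes perform, assuming the node pack is installed.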
Image prep for ControlNet, right. So what we're going to do now is build a very quick workflow. For our workflow, let's start with Load Checkpoint, okay. Let's choose a model, Turbo Vision XL. And let's bring in our prompt text encoders. Let's connect these two things. I would say Option+Enter. That should have given us a duplicate. So what is the shortcut to duplicate it with the connections intact? There is a keyboard shortcut, Command+Shift+V or Ctrl+Shift+V. If I don't want to use the shortcut and just want to do it with a click, what do we do? I still haven't figured that out, so fine.
So let's attach it to the KSampler. And sometimes the copilots don't work, so we are on our own. Right. Let's bring in our VAE decoder. Right. If you are already comfortable with this, you can just fast-forward. You need to send the latent image also. Yes, that's right, we missed the latent image. Very helpful. Yeah. Let's change the resolution to 1024x1024. Right. And
let's say a beautiful, beautiful woman. And we don't want it to be nude, ugly, malformed, or nsfw; I don't want a bad image. Did I miss something? Yes, I know, I should have connected the clip. Right? What went wrong? No, I think we did a double copy-paste or something. Right. So I'll use Command+Shift+V; there are multiple ways to do things. Oh, we lost the prompt also: beautiful woman. All right, now we should have our image. Maybe... did you change the CFG scale? No, I did not. Okay. Right. So the problem with this is, if I want an image,
let's say, of this particular structure, then I have to tweak the prompt a lot, or work really hard to find a prompt that gives a similar image. Right. So let's see how we can use ControlNet to make our images follow the structure of our reference image. Right. So before we do that, let's just do some restructuring so that things become a lot easier for us to understand. Right.
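A quick conceptual aside before wiring it up: ControlNet runs a trained copy of part of the diffusion model on the control image (here, the depth map) and adds the resulting residuals into the base model's features, scaled by a strength value. This toy Python sketch, with made-up numbers, only illustrates that scaling idea, not the real tensor math:

```python
# Toy illustration of ControlNet conditioning: residuals from the
# control branch are added to the base features, scaled by strength.
# Real ControlNets do this per UNet block with tensors, not floats.
def apply_control(base_features, control_residuals, strength=1.0):
    return [b + strength * r for b, r in zip(base_features, control_residuals)]

base = [0.2, -0.5, 1.0]       # hypothetical base model features
residual = [0.4, 0.1, -0.3]   # hypothetical control residuals

# strength 0.0 leaves the base model untouched;
# strength 1.0 applies the full structural guidance.
weak = apply_control(base, residual, strength=0.0)
full = apply_control(base, residual, strength=1.0)
```

This is also why lowering the ControlNet weight loosens how strictly the output follows the depth map.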
So let's bring the ControlNet here. The first node is going to be Load ControlNet Model. Right. We already have a Control-LoRA depth, which is actually a ControlNet model, but I'm not sure why they called it a LoRA. It works for SDXL and it's really good, so you can use that. And we need to apply it. To do that, we use the Apply ControlNet (Advanced) node, so we get access to the positive prompt and the negative prompt. All right. So what do we do? Let's connect this
ControlNet. And we need to connect the image; let's bring the image from here. And we need the positive and negative prompts. So let's expand this and take this one here. And I think I've got the negative one. It's always good to color these things so that we don't get confused. So let's color them. For the positive one I want green, and for the negative one I want red. You can also change the text if you want, but for now I think it's fine. So what we're going
to do now is this. What this ControlNet is basically doing is taking the text prompt and the negative prompt, converting them to embeddings, and adjusting the overall embeddings so that the model understands the structure of the image we are passing into it. Okay. In simple words, it lets us create images based on a pose, or depth, or whatever else; there is also something called line art, so if you draw sketches, it can be very helpful. So now let's bring these positive and negative outputs here. Right. So what
we are basically doing is taking our base workflow, taking an image, passing the image to ControlNet, and then connecting the ControlNet output to our base workflow. Now let's see the magic, assuming we have not done anything wrong. Right, launch. Yeah. Did you change the CFG? No. Okay. We are using a turbo model, so we don't need a higher CFG scale. Okay. So let's put these in a group. Not really required, but maybe it will be helpful. So right-click and choose group selected nodes. Right. Now what we're going to do is, we'll bring
in another image and we'll try to create an image which looks like... let's bring in the image first, so that it becomes a lot easier. Let's search for Load Image, right. And let's bring in the popular actress; I'm not sure of her name, I think Thamim knows. Ana de Armas. Yes. Okay. So what we are going to do is create an image with a similar structure but with her face in it, or a face which is similar to it. Right. To do that we will use IP adapter, okay. Correct. Okay. So
start with the loader. We have the IP adapter plugin already installed; it's v2. If you don't have it, you can install it. If you're using a Jarvislabs instance, it's already pre-installed for you. So, IPAdapter Unified Loader FaceID. We just need FaceID, right? No, okay, we just need to load the FaceID model, right. Correct. So if I drag it from this, why not from this? Does it pick it up automatically? Yes, it picks up the Unified Loader FaceID. All right. So now what we do is we connect
this image to this. Let's bring it here. Right. And we need the model, so we need to connect the checkpoint. We're using only one IP adapter; if you're using another IP adapter, then we can connect these, I think that's what that node is for. And now let's connect the model to the KSampler, or maybe you can just copy-paste. Yes. Right. Or, like Thamim said, let's try to save some time. Let's copy this entire thing and press Command+Shift+V. We got it. Now just connect the model. Yeah, I just want to bring all of them together. Okay. All right, I got them together. Let's connect the model; the positive and negative prompts are all there, so I think we're good to go. Yes. The first time you use an ip-adapter node, it will take some time, so please be patient. Yeah. There's also one more thing: we still have these nodes enabled. What we can do is just bypass them, so they don't run and things become slightly faster. Okay. So while this is loading, Vishnu, do you want to, like, tell the audience we
did some experiments using the ip-adapter model. Yeah. Before we actually started this recording, we were playing with multiple things. Okay, so there are these, what do I call them, watermarks? Yes. So let's go and change our prompt, maybe say not to add watermarks. Yes, let's make sure the spelling is right. Okay, so the experiment we were doing was trying to bring in an Indian actress using some structure, and we had some challenges, so we will also do that now. It's kind of a mixed result; sometimes the results are really good and sometimes they are bad. In this case we added watermark to the negative prompt, but I'm not sure why it's still coming. Maybe add a weight in the negative prompt also. Yeah, sure. Mostly it works; I'm not sure why it didn't work this time.
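For reference, ComfyUI's text encoder supports up-weighting a token with the (token:weight) syntax, so a weighted negative prompt could look like this; the weight values here are just illustrative starting points, not tested recommendations:

```python
# (token:weight) multiplies that token's influence in the embedding;
# 1.0 is neutral. The exact weights below are illustrative guesses.
negative_prompt = "(watermark:1.5), (text:1.3), nsfw, ugly, malformed"
```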
Okay. I hope it doesn't come now. Okay, so let's do a couple of other things. Let's try to bring it in now. Okay, something is weird. I think it's taking this. Did I spell it wrong? Yes, I did. Okay, I think that's why the watermark is coming. I never knew it could generate text properly; I thought only Stable Cascade could. Oh, that's crap. I'm not sure why it's happening. Any suggestions, Mr. Copilot? Still thinking, figuring out what went wrong. Let's try it one last time; if it doesn't work, let's stop. So maybe try it with a different sampler. We are using Euler, right? We usually don't use Euler; we usually use something like DDPM. That's right. Yeah. Okay.
It's not going to change. So let's change our... okay, let's try the same image. The first image is going to give the structure, and this one is going to give the face. Probably we'll get more variants of a similar image. And there is already a watermark; I think the watermark is actually coming from the depth map. Okay. So to fix this issue, maybe you can use something called prep image, or we can use this to adjust the weight of the ControlNet. Yeah, you can. So this one changes it. No, I think we have to adjust the depth. Right. So this one, the earlier one, changes the object's depth map. Maybe you can use a different image. Yeah, I want to. Okay, let's try one last time; if it doesn't work, let's change the input image. Oh yeah, it worked perfectly. Right. Now
let's try the other image also and see if it works well. So make sure you don't have any watermarks in the input image; otherwise, adjust the weights. Yeah, or let's have a better copilot next time. Okay, let's change the image back to the one we were using. So what we'll do here is... the image is not very clear, right? Let's forget about the watermark; we know the culprit, it's basically the input image for ControlNet. Earlier it was not working, and I'm not sure why. Let's use an Image Crop Face node. Right. I think the reason it did not work is, like you said, we were using a slightly different thing. Let's now use this, and let's also add a preview
image. Right. Once we do this, we will also make one more change. Okay. What this is basically doing is just extracting the face from the image that we want to use. Hopefully we don't get the watermark in this. No, I think we may still get it. Okay, it's in the ControlNet; yeah, this is just the IP adapter reference image. Okay, it's going to come. Right. What we'll probably do is prep the image before we actually apply the MiDaS node. Okay. The watermark did not come. Okay.
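What the crop and prep nodes are doing can be approximated in a few lines of Pillow: resize so the short side matches the CLIP vision input size, then crop a square from the top, where the face usually sits. This is only a rough sketch of the idea, not the plugin's actual code:

```python
from PIL import Image

def prep_for_clip_vision(img, size=224):
    """Resize the short side to `size`, then crop a square from the top."""
    scale = size / min(img.size)
    img = img.resize((round(img.width * scale), round(img.height * scale)))
    left = (img.width - size) // 2          # center horizontally
    return img.crop((left, 0, left + size, size))  # crop from the top

face = prep_for_clip_vision(Image.new("RGB", (640, 960), "gray"))
print(face.size)  # (224, 224)
```

The real node also offers interpolation and sharpening options, but the resize-then-crop step is the core of it.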
And I think the model has got more facial details, so we can play more with this. Or we will just do one last thing before we wind up this video. Let's actually make this group slightly bigger so that it can accommodate more content. So what node are we supposed to use? We are supposed to use the Prep Image For ClipVision node. Yeah, to prepare the image for CLIP vision. I'm not sure if this will fix it, but let's try it. It will basically crop the image. So cropping from the top is enough, right? Yeah. Okay, let's try it. Maybe you can reduce the threshold value, or change it to, let's say, 5. I don't know, let's run it. I think we already have something running. Let's see. But I think we have got a good result. Right. Yes, okay. It's, like,
very stubborn. So let's do something different. I mean, it's working, but it doesn't want to lose. So let's actually try this. This is what we tried earlier, and we had bad results with it. So basically we thought the IP adapter model is quite biased. Well, not IP adapter; most stable-diffusion-based models usually have that challenge. But I think it has done a really fabulous job. I really like the image, though it's black and white. I think we can try again.
I'm not sure why it went black and white; it's probably just randomness. But I think it has done a decent job, still a pretty decent job, because at least compared to the results we got earlier, this is a lot better. Right? So what we will do is, we will add this workflow to the
comments so that you can try it on your own, and I hope you enjoyed this video. If you face any challenges, you can join our community on Discord; we are actively building it, and you can talk to us there. Also, if there are any other new topics you would like us to cover, let us know. Please hit the like button and subscribe to our channel to watch more such videos. Thank you. Bye bye. Thank you. See you all in the next video.