ComfyUI: Master Face Swap & Pose Mimicry with IP-Adapter v2 & ControlNet | JarvisLabs

Captions
Hi everyone. I'm Vishnu Subramanian, founder of Jarvislabs.ai. In this video we have brought in a copilot, Thamim, who is going to help us create a workflow that combines IP-Adapter with ControlNet. We are going to bring in two images and see how we can take the face from one image and the structure from another. If you are new to IP-Adapter or ComfyUI, we have been making videos on both; check out our playlist first, it will be super helpful. If you are already using IP-Adapter, then let's get started.

The first thing we will do is Load Checkpoint. No, let's load an image first, because we'll start with ControlNet, and to pass an image to ControlNet we need to extract its features. Let's choose a good image, maybe this one. Nice. She's an Indian actress, by the way, in case you don't know her. Now let's bring in a depth map node; it's called the MiDaS depth map, I think I'm pronouncing that right. And let's add a Preview Image node. This is going to be our depth map. Let's quickly put these in a group (Add to group) and give it a name: "Image prep for ControlNet".

Next we have to build a very quick base workflow. Start with Load Checkpoint and choose a model, TurboVision XL. Bring in our prompt text encoders and connect them. Option+Enter should have given a duplicate. What's the shortcut to duplicate a node along with its connections? There is a keyboard shortcut, Command+Shift+V (Ctrl+Shift+V on Windows). And if I don't want to use that, just with a click, what do we do?
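The "Image prep for ControlNet" group described above is just three nodes. Here is a minimal sketch of that fragment in ComfyUI's API (JSON) workflow format, built as a Python dict; inputs reference other nodes as `[node_id, output_index]`. The class name `MiDaS-DepthMapPreprocessor` comes from the comfyui_controlnet_aux extension and the file name is a placeholder, so treat both as assumptions:

```python
# Sketch of the depth-map prep group in ComfyUI API format.
# Each top-level key is a node id; values wire nodes together.
depth_prep = {
    "1": {  # load the structure/reference image (placeholder file name)
        "class_type": "LoadImage",
        "inputs": {"image": "reference.png"},
    },
    "2": {  # MiDaS depth estimation (assumed name, from comfyui_controlnet_aux)
        "class_type": "MiDaS-DepthMapPreprocessor",
        "inputs": {"image": ["1", 0], "a": 6.28, "bg_threshold": 0.1},
    },
    "3": {  # preview the resulting depth map
        "class_type": "PreviewImage",
        "inputs": {"images": ["2", 0]},
    },
}
```

The depth map (node "2", output 0) is the image that ControlNet will consume later in the workflow.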
I still didn't figure it out, so fine. Let's attach the KSampler. And sometimes the copilots don't work, so we are on our own. Let's bring in our decoder. If you are already comfortable with this, you can skip forward. You need to send the latent image also. Yes, that's right, we missed the latent image. Let's change the resolution to 1024x1024. For the prompt, let's say "beautiful woman", and for the negative prompt "nude, ugly, malformed, nsfw"; I don't want a bad image. Did I miss something? Yes, I know, I should have connected the CLIP. What went wrong? I think we did a double copy-paste or something. So, Command+Shift+V; there are multiple ways to do things. Oh, we lost the prompt too: "beautiful woman" again. All right, now we should have our image. Did you change the CFG scale? No, I did not.

The problem with this setup is: if I want an image with a particular structure, I have to tweak the prompt a lot, or work really hard to find a prompt that gives a similar image. So let's see how we can use ControlNet to make our images follow the structure of our reference image. Before we do that, let's do some restructuring so that things become easier to understand.

Let's bring ControlNet in. The first node is Load ControlNet Model. We already have control-lora depth, which is actually a ControlNet model; I'm not sure why they called it a LoRA. It works for SDXL and it's really good, so you can use that. We need to apply it, and the way we do that is Apply ControlNet (Advanced), which gives us access to the positive prompt and negative prompt. So let's connect this ControlNet, and we need to connect the image.
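The base text-to-image workflow assembled above can be sketched in the same API format. These five class names (`CheckpointLoaderSimple`, `CLIPTextEncode`, `EmptyLatentImage`, `KSampler`, `VAEDecode`) are core ComfyUI nodes; the readable ids and the checkpoint file name are placeholders I've chosen for illustration:

```python
# Minimal txt2img graph in ComfyUI API format (a plain dict).
base = {
    "ckpt": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "turbovisionxl.safetensors"},  # placeholder
    },
    "pos": {  # positive prompt
        "class_type": "CLIPTextEncode",
        "inputs": {"clip": ["ckpt", 1], "text": "beautiful woman"},
    },
    "neg": {  # negative prompt
        "class_type": "CLIPTextEncode",
        "inputs": {"clip": ["ckpt", 1], "text": "nude, ugly, malformed, nsfw"},
    },
    "latent": {
        "class_type": "EmptyLatentImage",
        "inputs": {"width": 1024, "height": 1024, "batch_size": 1},
    },
    "sampler": {
        "class_type": "KSampler",
        "inputs": {
            "model": ["ckpt", 0],
            "positive": ["pos", 0],
            "negative": ["neg", 0],
            "latent_image": ["latent", 0],
            # turbo-style models want few steps and a low CFG scale
            "seed": 0, "steps": 8, "cfg": 2.0,
            "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0,
        },
    },
    "decode": {
        "class_type": "VAEDecode",
        "inputs": {"samples": ["sampler", 0], "vae": ["ckpt", 2]},
    },
}
```

Note how the checkpoint loader fans out: output 0 is the model, output 1 the CLIP encoder, output 2 the VAE, which is exactly the wiring done by hand in the video.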
Let's bring the image from here, and we need the positive and negative prompts. So let's expand this and take this one here; I think I've got the negative one. It's always good to color these nodes so that we don't get confused, so let's color them: green for the positive and red for the negative. You can also change the title text if you want, but for now I think it's fine.

What ControlNet is basically doing is taking the text prompt and the negative prompt, along with the image, converting them to embeddings, and adjusting the overall conditioning so that the model understands the structure of the image we are passing in. In simple words, it lets you create images based on a pose, a depth map, or line art; if you draw sketches, that last one is very helpful. Now let's bring the positive and negative outputs here. So what we are basically doing is taking our base workflow, passing an image through ControlNet, and connecting the ControlNet output back into the base workflow. Now let's see the magic, assuming we haven't done anything wrong. Launch. Did you change the CFG? No. Okay, we are using a turbo model, so we don't need a high CFG scale. Let's put these in a group as well; not strictly required, but maybe helpful. Right-click and choose "Add group for selected nodes".

Now what we're going to do is bring in another image and try to create an image which looks... let's bring in the image first, that makes it a lot easier. Search for Load Image, and let's bring in a popular actress. I'm not sure of her name; I think Thamim knows. Ana de Armas. Yes. Okay.
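The rewiring described above, where ControlNet sits between the text encoders and the KSampler, can be sketched like this. `ControlNetLoader` and `ControlNetApplyAdvanced` are the node class names I believe ComfyUI uses for these nodes; the model file name and the ids `pos`, `neg`, and `depth` (pointing at hypothetical text-encode and depth-map nodes elsewhere in the graph) are placeholders:

```python
# Apply ControlNet (Advanced) takes both conditionings plus the depth image
# and returns adjusted positive/negative conditioning.
controlnet = {
    "cn_model": {
        "class_type": "ControlNetLoader",
        "inputs": {"control_net_name": "control-lora-depth-sdxl.safetensors"},
    },
    "cn_apply": {
        "class_type": "ControlNetApplyAdvanced",
        "inputs": {
            "positive": ["pos", 0],        # from the positive CLIPTextEncode
            "negative": ["neg", 0],        # from the negative CLIPTextEncode
            "control_net": ["cn_model", 0],
            "image": ["depth", 0],         # the MiDaS depth map
            "strength": 1.0,               # lower this if structure is too rigid
            "start_percent": 0.0,
            "end_percent": 1.0,
        },
    },
}
# The KSampler's positive/negative inputs are then rewired to
# ["cn_apply", 0] and ["cn_apply", 1] instead of the raw text encodings.
```

This is the "advanced" variant mentioned in the video precisely because it exposes both the positive and negative conditioning paths.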
So what we are going to do is create an image with a similar structure, but with her face in it, or a face which is similar to hers. To do that we will use IP-Adapter. We have the IP-Adapter plugin (v2) already installed; if you don't have it, you can install it. If you're using a Jarvislabs instance, it's pre-installed for you. We need IP-Adapter Unified Loader FaceID; we just need to load the FaceID model. If I drag from this, does it pick it up automatically? Yes, it picks up the unified loader FaceID.

All right. Now we connect this image to it; let's bring it here. And we need the model, so we connect the checkpoint. We're using only one IP-Adapter; if you're using another IP-Adapter, then you can chain them through that extra input, I think that's what it's for. Now let's connect the model to the KSampler. Or, like Thamim said, let's try to save some time: copy this entire thing and press Command+Shift+V. We got it. Now just connect the model, positive, and negative prompts; everything is there, so I think we're good to go.

Yes. The first time you use an IP-Adapter node, it will take some time, so please be patient. There's also one more thing: we still have the earlier nodes enabled, so we can bypass them so that they don't run and things become slightly faster. Okay, in between the loading time, Vishnu, do you want to tell the audience about the experiment we did using the IP-Adapter model? Yeah. Before we actually started this recording, we were playing with multiple things.
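The IP-Adapter step can be sketched the same way. These class names (`IPAdapterUnifiedLoaderFaceID`, `IPAdapterFaceID`) and their parameters are assumptions based on the v2 release of the community IP-Adapter extension (ComfyUI_IPAdapter_plus) and may differ in your version; `ckpt` is a hypothetical id for the checkpoint loader and the image file name is a placeholder:

```python
# FaceID unified loader patches the base model; the patched model then
# replaces the checkpoint model at the KSampler's "model" input.
ipadapter = {
    "face_img": {  # the face reference image (Ana de Armas in the video)
        "class_type": "LoadImage",
        "inputs": {"image": "face_reference.png"},
    },
    "ipa_loader": {  # assumed node/parameter names from ComfyUI_IPAdapter_plus
        "class_type": "IPAdapterUnifiedLoaderFaceID",
        "inputs": {"model": ["ckpt", 0], "preset": "FACEID PLUS V2",
                   "lora_strength": 0.6, "provider": "CPU"},
    },
    "ipa_apply": {
        "class_type": "IPAdapterFaceID",
        "inputs": {
            "model": ["ipa_loader", 0],
            "ipadapter": ["ipa_loader", 1],
            "image": ["face_img", 0],
            "weight": 1.0,
        },
    },
}
# The KSampler's "model" input becomes ["ipa_apply", 0]; the ControlNet
# conditioning and latent image stay wired exactly as before.
```

The key idea shown here matches the video: ControlNet modifies the *conditioning* path, while IP-Adapter modifies the *model* path, so the two compose cleanly in one graph.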
Okay, so there are these, what do I call them, watermarks? Yes. So let's go and change our negative prompt to say no watermarks, and let's make sure the spelling is right. The experiment we were doing was trying to bring in an Indian actress using some structure, and we had some challenges, so we will also do that now. It's kind of a mixed result: sometimes the results are really good and sometimes they are bad. In this case we added "watermark" to the negative prompt, but I'm not sure why it's still coming. Maybe add a weight to the negative prompt too. Yeah, sure. Mostly it works; I'm not sure why it didn't here. Okay, I hope it doesn't come now.

Okay, so let's do a couple of other things. Something weird is happening; I think it's taking this. Did I spell it wrong? Yes, I did. I think that's why the watermark is coming. I never knew Stable Cascade could generate text properly. Oh, that's bad. I'm not sure why it's happening. Any suggestions, Mr. Copilot? Still thinking, figuring out what went wrong. Let's try it one last time; if it doesn't work, maybe try a different sampler. We are using euler, right? We usually don't use euler, we usually use something like DDPM. That's right. Okay, it's not going to change.

So let's try the same image: the first image is going to give the structure, and this one is going to give the face. Probably we'll get more variants of a similar image. And there is still a watermark. I think the watermark is actually coming from the depth map. So to fix this issue, maybe you can use something called Prep Image, or we can adjust the weight of the ControlNet. Yeah, you can; this one changes it. No, I think we have to adjust the depth.
It was the earlier one. This one changes the object's depth map. Maybe you can use a different image. Okay, let's try one last time; if it doesn't work, let's change the input image. Oh, yeah, it worked perfectly. Now let's try the other image as well and see if it works well. So: make sure you don't have any watermarks in the input image; otherwise, adjust the weights. Yeah, or let's have a better copilot next time.

Okay, let's change the image back to the one we were using. The image is not very clear. Let's forget about the watermark; we know the culprit, it's basically the input image for ControlNet. Earlier it was not working, and I'm not sure why. Let's use an Image Crop Face node. I think it didn't work before because, like you said, we were using a slightly different thing. Let's use this now and also add a Preview Image. Once we do this, we will make one more change. What this is basically doing is cropping out just the face of the image we want to use. Hopefully we don't get the watermark in this. No, I think we may still get it; it's in the ControlNet path, and this is just the IP-Adapter reference image. What we'll probably do is prep the image before we actually apply MiDaS. Okay, the watermark did not come, and I think the model has got more facial detail.

We can play more with this, but we will just do this one last thing before we wind up this video. Let's make this group node slightly bigger so that it can accommodate more content. The node we're supposed to use is Prep Image for ClipVision. I'm not sure if this will fix it, but let's try; it will basically crop the image. Crop position "top" is enough, right? Yeah. Okay.
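The cropping fix discussed above can be sketched as one extra node between the reference image and the IP-Adapter. `PrepImageForClipVision` is, to my knowledge, the class name this node uses in the IP-Adapter extension, but the parameter names here are assumptions, and `face_img` is a hypothetical id for the Load Image node:

```python
# Prep Image for ClipVision crops/resizes the reference to the square input
# the CLIP vision encoder expects, which also crops away edge watermarks.
prep = {
    "prep": {
        "class_type": "PrepImageForClipVision",  # assumed extension node name
        "inputs": {
            "image": ["face_img", 0],  # raw reference image
            "interpolation": "LANCZOS",
            "crop_position": "top",    # keep the face near the top of the crop
            "sharpening": 0.0,
        },
    },
}
# The IP-Adapter node's "image" input then takes ["prep", 0] instead of the
# raw LoadImage output. The same idea (prepping before MiDaS) is what keeps
# the watermark out of the depth map on the ControlNet side.
```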
Let's try it. Maybe you can reduce that threshold value as well; let's say 5. I don't know, let's run it. I think we already have something running. Let's see. But I think we have got a good result. Yes. Okay, it's very stubborn, so let's do something different. I mean, it's working, but it doesn't want to let go. So let's actually try this; this is what we tried earlier, and we had bad results with it. Basically we thought the IP-Adapter model was biased. Well, not IP-Adapter; most Stable Diffusion-based models usually have that challenge. But I think it has done a really fabulous job. I really like the image, though it's black and white. We can try again; I'm not sure why it went black and white, but that's probably just randomness. Still, it has done a decent job, because compared with the results we got earlier, this is a lot easier.

So what we will do is add this workflow to the comments so that you can try it on your own. I hope you enjoyed this video. If you face any challenges, you can join our community on Discord; we are actively building it, and you can talk to us there and tell us about any other topics you would like us to cover. Please hit the like button and subscribe to our channel to watch more such videos. Thank you. Bye. See you all in the next video.
Info
Channel: JarvisLabs AI
Views: 3,776
Id: 2kgZ1vM5UfY
Length: 16min 32sec (992 seconds)
Published: Mon Apr 22 2024