Hi everyone. I'm Vishnu Subramanian, founder of
Jarvislabs.ai. In this video, we have brought in a copilot and his name is Thamim, who is going to
help us with creating a workflow for IP adapter, combining it with ControlNet. We are going to
bring in our image and we're going to bring in another image and see how we can bring
the face from one image and structure from another image. Let's do that. If you are new to
IP adapter or ComfyUI, we have been making videos and you can check our playlist. It'll be super helpful. You can go check it out and come back to this. If you already know how to use IP adapter, then let's get started. Right. The first thing we will do is load a checkpoint. No, let's load an image,
Right. The reason is, let's start with the ControlNet. To pass an image to ControlNet, we actually need to extract the features of the image first. So let's not go with this one. Let's not even go with this. Let's choose a good image, maybe this one. Okay. Nice. She's an Indian actress, by the way, in case you don't know her. Okay. And let's bring in a depth map.
Right. This one is called the MiDaS depth map, I think I'm pronouncing it right. Okay. And let's add a preview image. Right. So this is going to be our depth map, and let's quickly put it into a group. Let's add it to a group. Okay. Then we can give it a name. Image preparation. Okay.
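By the way, if you drive ComfyUI over its HTTP API instead of the UI, this image-prep group roughly corresponds to a prompt fragment like the one below. The node class names (especially the MiDaS preprocessor) are assumptions taken from the ControlNet auxiliary node pack and may differ in your install, so treat this as a sketch:

```python
# Sketch of the image-prep group in ComfyUI's API (prompt) format.
# Node class names are assumptions from the controlnet_aux pack;
# verify them against your own installation.
image_prep = {
    "1": {"class_type": "LoadImage",
          "inputs": {"image": "reference.png"}},
    "2": {"class_type": "MiDaS-DepthMapPreprocessor",
          "inputs": {"image": ["1", 0]}},  # ["source_node_id", output_index]
    "3": {"class_type": "PreviewImage",
          "inputs": {"images": ["2", 0]}},
}
```

Posting this, wrapped as `{"prompt": image_prep}`, to a running ComfyUI's `/prompt` endpoint should queue the same depth-map extraction the UI nodes perform, assuming the node pack is installed.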
Image prep for ControlNet, right. So what we're going to do now is build a very quick workflow. For our workflow, let's start with Load Checkpoint, okay. Let's choose a model, Turbo Vision XL. And let's bring in our prompt text encoders. Let's connect these two things. I would say Option+Enter. That should have given us a duplicate. So what is the shortcut to duplicate it with the connections intact? There is a keyboard shortcut, Command+Shift+V or Ctrl+Shift+V. If I don't want to use the shortcut and just want to do it with a click, what do we do? I still haven't figured that out, so fine.
So let's attach it to the KSampler. And sometimes the copilots don't work, so we are on our own. Right. Let's bring in our VAE decoder. Right. If you are already comfortable with this, you can just fast-forward. You need to send the latent image also. Yes, that's right, we missed the latent image. Very helpful. Yeah. Let's change the resolution to 1024x1024. Right. And
let's say a beautiful, beautiful woman. And we don't want it to be nude, ugly, malformed, or nsfw; I don't want a bad image. Did I miss something? Yes, I know, I should have connected the clip. Right? What went wrong? No, I think we did a double copy-paste or something. Right. So I'll use Command+Shift+V; there are multiple ways to do things. Oh, we lost the prompt also: beautiful woman. All right, now we should have our image. Maybe... did you change the CFG scale? No, I did not. Okay. Right. So the problem with this is, if I want an image,
let's say, of this particular structure, then I have to tweak the prompt a lot, or work really hard to find a prompt that gives a similar image. Right. So let's see how we can use ControlNet to make our images follow the structure of our reference image. Right. So before we do that, let's just do some restructuring so that things become a lot easier for us to understand. Right.
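A quick conceptual aside before wiring it up: ControlNet runs a trained copy of part of the diffusion model on the control image (here, the depth map) and adds the resulting residuals into the base model's features, scaled by a strength value. This toy Python sketch, with made-up numbers, only illustrates that scaling idea, not the real tensor math:

```python
# Toy illustration of ControlNet conditioning: residuals from the
# control branch are added to the base features, scaled by strength.
# Real ControlNets do this per UNet block with tensors, not floats.
def apply_control(base_features, control_residuals, strength=1.0):
    return [b + strength * r for b, r in zip(base_features, control_residuals)]

base = [0.2, -0.5, 1.0]       # hypothetical base model features
residual = [0.4, 0.1, -0.3]   # hypothetical control residuals

# strength 0.0 leaves the base model untouched;
# strength 1.0 applies the full structural guidance.
weak = apply_control(base, residual, strength=0.0)
full = apply_control(base, residual, strength=1.0)
```

This is also why lowering the ControlNet weight loosens how strictly the output follows the depth map.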
So let's bring the ControlNet here. The first node is going to be Load ControlNet Model. Right. We already have a Control-LoRA depth, which is actually a ControlNet model, but I'm not sure why they called it a LoRA. It works for SDXL and it's really good, so you can use that. And we need to apply it. To do that, we use the Apply ControlNet (Advanced) node, so we get access to the positive prompt and the negative prompt. All right. So what do we do? Let's connect this
ControlNet. And we need to connect the image; let's bring the image from here. And we need the positive and negative prompts. So let's expand this and take this one here. And I think I've got the negative one. It's always good to color these things so that we don't get confused. So let's color them. For the positive one I want green, and for the negative one I want red. You can also change the text if you want, but for now I think it's fine. So what we're going
to do now is this. What this ControlNet is basically doing is taking the text prompt and the negative prompt, converting them to embeddings, and adjusting the overall embeddings so that the model understands the structure of the image we are passing into it. Okay. In simple words, it lets us create images based on a pose, or depth, or whatever else; there is also something called line art, so if you draw sketches, it can be very helpful. So now let's bring these positive and negative outputs here. Right. So what
we are basically doing is taking our base workflow, taking an image, passing the image to ControlNet, and then connecting the ControlNet output to our base workflow. Now let's see the magic, assuming we have not done anything wrong. Right, launch. Yeah. Did you change the CFG? No. Okay. We are using a turbo model, so we don't need a higher CFG scale. Okay. So let's put these in a group. Not really required, but maybe it will be helpful. So right-click and choose group selected nodes. Right. Now what we're going to do is, we'll bring
in another image and we'll try to create an image which looks like... let's bring in the image first, so that it becomes a lot easier. Let's search for Load Image, right. And let's bring in the popular actress; I'm not sure of her name, I think Thamim knows. Ana de Armas. Yes. Okay. So what we are going to do is create an image with a similar structure but with her face in it, or a face which is similar to it. Right. To do that we will use IP adapter, okay. Correct. Okay. So
start with the loader. We have the IP adapter plugin already installed; it's v2. If you don't have it, you can install it. If you're using a Jarvislabs instance, it's already pre-installed for you. So, IPAdapter Unified Loader FaceID. We just need FaceID, right? No, okay, we just need to load the FaceID model, right. Correct. So if I drag it from this, why not from this? Does it pick it up automatically? Yes, it picks up the Unified Loader FaceID. All right. So now what we do is we connect
this image to this. Let's bring it here. Right. And we need the model, so we need to connect the checkpoint. We're using only one IP adapter; if you're using another IP adapter, then we can connect these, I think that's what that node is for. And now let's connect the model to the KSampler, or maybe you can just copy-paste. Yes. Right. Or, like Thamim said, let's try to save some time. Let's copy this entire thing and press Command+Shift+V. We got it. Now just connect the model. Yeah, I just want to bring all of them together. Okay. All right, I got them together. Let's connect the model; the positive and negative prompts are all there, so I think we're good to go. Yes. The first time you use an ip-adapter node, it will take some time, so please be patient. Yeah. There's also one more thing: we still have these nodes enabled. What we can do is just bypass them, so they don't run and things become slightly faster. Okay. So while this is loading, Vishnu, do you want to, like, tell the audience we
did some experiments using the ip-adapter model. Yeah. Before we actually started this recording, we were playing with multiple things. Okay, so there are these, what do I call them, watermarks? Yes. So let's go and change our prompt, maybe say not to add watermarks. Yes, let's make sure the spelling is right. Okay, so the experiment we were doing was trying to bring in an Indian actress using some structure, and we had some challenges, so we will also do that now. It's kind of a mixed result; sometimes the results are really good and sometimes they are bad. In this case we added watermark to the negative prompt, but I'm not sure why it's still coming. Maybe add a weight in the negative prompt also. Yeah, sure. Mostly it works; I'm not sure why it didn't work this time.
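For reference, ComfyUI's text encoder supports up-weighting a token with the (token:weight) syntax, so a weighted negative prompt could look like this; the weight values here are just illustrative starting points, not tested recommendations:

```python
# (token:weight) multiplies that token's influence in the embedding;
# 1.0 is neutral. The exact weights below are illustrative guesses.
negative_prompt = "(watermark:1.5), (text:1.3), nsfw, ugly, malformed"
```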
Okay. I hope it doesn't come now. Okay, so let's do a couple of other things. Let's try to bring it in now. Okay, something is weird. I think it's taking this. Did I spell it wrong? Yes, I did. Okay, I think that's why the watermark is coming. I never knew it could generate text properly; I thought only Stable Cascade could. Oh, that's crap. I'm not sure why it's happening. Any suggestions, Mr. Copilot? Still thinking, figuring out what went wrong. Let's try it one last time; if it doesn't work, let's stop. So maybe try it with a different sampler. We are using Euler, right? We usually don't use Euler; we usually use something like DDPM. That's right. Yeah. Okay.
It's not going to change. So let's change our... okay, let's try the same image. The first image is going to give the structure, and this one is going to give the face. Probably we'll get more variants of a similar image. And there is already a watermark; I think the watermark is actually coming from the depth map. Okay. So to fix this issue, maybe you can use something called prep image, or we can use this to adjust the weight of the ControlNet. Yeah, you can. So this one changes it. No, I think we have to adjust the depth. Right. So this one, the earlier one, changes the object's depth map. Maybe you can use a different image. Yeah, I want to. Okay, let's try one last time; if it doesn't work, let's change the input image. Oh yeah, it worked perfectly. Right. Now
let's try the other image also and see if it works well. So make sure you don't have any watermarks in the input image; otherwise, adjust the weights. Yeah, or let's have a better copilot next time. Okay, let's change the image back to the one we were using. So what we'll do here is... the image is not very clear, right? Let's forget about the watermark; we know the culprit, it's basically the input image for ControlNet. Earlier it was not working, and I'm not sure why. Let's use an Image Crop Face node. Right. I think the reason it did not work is, like you said, we were using a slightly different thing. Let's now use this, and let's also add a preview
image. Right. Once we do this, we will also make one more change. Okay. What this is basically doing is just extracting the face from the image that we want to use. Hopefully we don't get the watermark in this. No, I think we may still get it. Okay, it's in the ControlNet; yeah, this is just the IP adapter reference image. Okay, it's going to come. Right. What we'll probably do is prep the image before we actually apply the MiDaS node. Okay. The watermark did not come. Okay.
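What the crop and prep nodes are doing can be approximated in a few lines of Pillow: resize so the short side matches the CLIP vision input size, then crop a square from the top, where the face usually sits. This is only a rough sketch of the idea, not the plugin's actual code:

```python
from PIL import Image

def prep_for_clip_vision(img, size=224):
    """Resize the short side to `size`, then crop a square from the top."""
    scale = size / min(img.size)
    img = img.resize((round(img.width * scale), round(img.height * scale)))
    left = (img.width - size) // 2          # center horizontally
    return img.crop((left, 0, left + size, size))  # crop from the top

face = prep_for_clip_vision(Image.new("RGB", (640, 960), "gray"))
print(face.size)  # (224, 224)
```

The real node also offers interpolation and sharpening options, but the resize-then-crop step is the core of it.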
And I think the model has got more facial details, so we can play more with this. Or we will just do one last thing before we wind up this video. Let's actually make this group slightly bigger so that it can accommodate more content. So what node are we supposed to use? We are supposed to use the Prep Image For ClipVision node. Yeah, to prepare the image for CLIP vision. I'm not sure if this will fix it, but let's try it. It will basically crop the image. So cropping from the top is enough, right? Yeah. Okay, let's try it. Maybe you can reduce the threshold value, or change it to, let's say, 5. I don't know, let's run it. I think we already have something running. Let's see. But I think we have got a good result. Right. Yes, okay. It's, like,
very stubborn. So let's do something different. I mean, it's working, but it doesn't want to lose. So let's actually try this. This is what we tried earlier, and we had bad results with it. So basically we thought the IP adapter model is quite biased. Well, not IP adapter; most stable-diffusion-based models usually have that challenge. But I think it has done a really fabulous job. I really like the image, though it's black and white. I think we can try again.
I'm not sure why it went black and white; it's probably just randomness. But I think it has done a decent job, still a pretty decent job, because at least compared to the results we got earlier, this is a lot better. Right? So what we will do is, we will add this workflow to the
comments so that you can try it on your own, and I hope you enjoyed this video. If you face any challenges, you can join our community on Discord; we are actively building it, and you can talk to us there. Also, if there are any other new topics you would like us to cover, let us know. Please hit the like button and subscribe to our channel to watch more such videos. Thank you. Bye bye. Thank you. See you all in the next video.