Hello, I'm Donhyeon Choi from Stable Diffusion Korea, Soylab. Nice to meet you. Today we're going to look at InstantID, which is now supported by ControlNet. This all started with a post by a very good member of Stable Diffusion Korea. If you look closely, it was created using my face, taken from a promotional video for a lecture I recently made at Fast Campus. It was so natural that I was surprised; I was almost confused about whether I had actually taken a photo like this. Thank you very much. Since they were so kind, I even wanted to call and hear their voice — but of course I don't have their number, so that was just me joking around. I'd genuinely like to meet them someday. So, as a way of thanking that member, I prepared today's content. Today, we're going to talk
about the ControlNet InstantID. If you click this link, you'll land on the discussion page of the ControlNet extension that supports the SD Web UI (A1111), and there you can check everything related to InstantID: parts you can test, based on the page I've been studying, and information on where to get the models. Let me explain briefly. At first glance it looks very similar to the Replacer and Face Swap tools we've been using, and in some ways it's very similar to IP-Adapter; from a distance you might think it's just a face swap. But when you look at how it actually works, it takes the reference face and creates an image while still maintaining the direction you want — that's what it's really about. If you look at the process, it works like this: the IdentityNet is integrated into the pipeline and plays the same role as a ControlNet. There are two functions inside. One handles the facial landmarks — similar to OpenPose in a way — and the other handles the face embedding, and the two work together to build the face. Looking at the results, several styles were set up to compare IP-Adapter, IP-Adapter FaceID, FaceID Plus, PhotoMaker, and InstantID. With plain IP-Adapter, when a prompt is entered, the style of the face gets over-interpreted, or the output can end up looking like a different person. With InstantID, though, the original characteristics of the face are preserved while you create the person you want, and you can render them as an oil painting, a watercolor, or a black-and-white picture. Putting this together with the process structure I showed earlier: two ControlNet units are used simultaneously, one conveying the identity of the face and the other the landmark information about where the face should look, and together they produce this kind of picture. So, to get started with all of this,
you need to download the models. There's a section for it here: download the relevant models and change their names. As downloaded, the IP-Adapter model and the ControlNet model each come with different default file names, so rename them — I'll be using the names ip-adapter_instant_id_sdxl and control_instant_id_sdxl later — and save them where your ControlNet models live.
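If you'd rather script that rename step than do it by hand, a minimal sketch could look like this. Note that the source file names and the target names are assumptions on my part (they depend on exactly what you downloaded), so adjust them to your own files:

```python
import os

# Assumed default names of the two downloaded files -> the names the
# ControlNet extension setup in this guide expects. Both sides of this
# mapping are assumptions; check them against your actual downloads.
RENAMES = {
    "ip-adapter.bin": "ip-adapter_instant_id_sdxl.bin",
    "diffusion_pytorch_model.safetensors": "control_instant_id_sdxl.safetensors",
}

def rename_instantid_models(model_dir: str) -> list[str]:
    """Rename the downloaded InstantID files so the two models can be
    told apart later. Returns the list of new file names created."""
    renamed = []
    for old, new in RENAMES.items():
        src = os.path.join(model_dir, old)
        if os.path.exists(src):  # skip quietly if a file wasn't downloaded
            os.rename(src, os.path.join(model_dir, new))
            renamed.append(new)
    return renamed
```

Point it at the folder where you saved the downloads; files that aren't present are simply skipped.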
Now, here's the important part I explained briefly earlier. When you use ControlNet for this, you have to use two units, and there's an order: unit A has to run before unit B. The page says to use the IP-Adapter model as the first model. What this means is that the two units do not use the same model. Of the two models we downloaded, ip-adapter_instant_id_sdxl must go into the slot marked number 1 here, and control_instant_id_sdxl must go into the slot marked number 2. The matching preprocessors are the face embedding and the face keypoints. So first, unit 1 performs the face embedding, and then diffusion runs on that content while the landmarks from unit 2 determine the form in which the face is rebuilt. That's how it proceeds. Next, you don't just run it right away — there's one more setting to finish. What I'm showing you now is the order
of the settings, so you can follow along later; you just have to finish this before you start. If you open Settings and click the ControlNet section, you'll see these options at the bottom. The model cache size is set to 1 by default. Change it to 2, press Apply settings, and then shut the Web UI down completely, closing the CMD window. If you relaunch it after doing this, it will run without any problems. Now, let's get started.
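As a side note, the same change can be sketched as a direct edit of the Web UI's config.json while the Web UI is shut down. Both the key name (control_net_model_cache_size) and the idea of editing the file directly are assumptions on my part — the Settings tab is the normal way — so verify against your own install:

```python
import json

def bump_controlnet_cache(config_path: str, size: int = 2) -> dict:
    """Set the ControlNet model cache size in the Web UI's config.json.
    The key name "control_net_model_cache_size" is an assumption based on
    the ControlNet extension; confirm it in your own config file."""
    with open(config_path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    cfg["control_net_model_cache_size"] = size  # default is 1; we need both units cached
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=4)
    return cfg
```

Run it only while the Web UI is not running, since the UI rewrites config.json on Apply.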
We're going to start with these two source images and this image made from prompting, and I'll try to produce the result. As I explained earlier, the first step is the face embedding; the second is the face keypoints step, which extracts the landmarks. Once this runs properly, you'll be able to create a picture like this one, and that's what I'll make. Now, let's get started in earnest. The settings and prompts are very simple, but I should still explain them briefly. If you look here, the models we need
are the ControlNet SDXL models, so the Stable Diffusion checkpoint you use here also has to be an SDXL model. I'm using the DreamShaper Turbo model. You also need to switch the VAE to one made for SDXL. Now I'll apply the settings. First, I'm going to turn off Hires.fix for a moment, and turn off ControlNet as well, and generate once. Now the base image has been created. In this state, I'm going to put a new face into that style. The first unit gets the image for the face embedding; the second gets the image we'll extract the landmarks from. Comparing the two units more closely: the first preprocessor is instant_id_face_embedding, and its model should be changed to ip-adapter_instant_id_sdxl; the second unit's preprocessor is instant_id_face_keypoints, and its model is control_instant_id_sdxl. Let's run it right away in this state.
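For anyone driving the Web UI through its API instead of the browser, the same two-unit setup can be sketched as a txt2img payload. The unit field names here follow my understanding of the ControlNet extension's API, and the prompt is just a placeholder, so treat the details as assumptions and check them against your installed version:

```python
def instantid_units(face_b64: str, pose_b64: str) -> list[dict]:
    """Build the two ControlNet units in the required order:
    the IP-Adapter (embedding) unit first, the keypoints unit second."""
    return [
        {
            "enabled": True,
            "module": "instant_id_face_embedding",
            "model": "ip-adapter_instant_id_sdxl",
            "image": face_b64,   # base64 image of the face to embed
            "weight": 1.0,
        },
        {
            "enabled": True,
            "module": "instant_id_face_keypoints",
            "model": "control_instant_id_sdxl",
            "image": pose_b64,   # base64 image the landmarks come from
            "weight": 1.0,
        },
    ]

payload = {
    "prompt": "a man, watercolor painting",  # placeholder prompt
    "negative_prompt": "nsfw",
    "alwayson_scripts": {"controlnet": {"args": instantid_units("FACE", "POSE")}},
}
# requests.post(base_url + "/sdapi/v1/txt2img", json=payload)  # hypothetical call
```

The key point the payload encodes is the ordering rule from earlier: the embedding unit must come before the keypoints unit in the args list.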
You can see the image was created like this: the two images I showed you earlier are mixed exactly, blended into the style that was originally intended. This time, I'm going to create another image, based on famous people. Let's do "Hold an Apple," which I've been campaigning for for a while. I set it up with the DreamShaper model, like this. If you look now, the prompt is just the keywords "a man, wearing white shirts, apple, pop art style," and the negative prompt contains only "nsfw." In this state, ControlNet is activated — you do have to activate it yourself. The first unit holds the image I'm going to embed, and the second holds the landmark image that will guide it: the keypoints that carry the angle and direction of the face. Let's run it and generate in this state. Oh, that's good. Now we have an image like this, and we can upgrade it later through Hires.fix. If you use the image as-is it's a little narrow, but if you expand it like this, you can create a very artistic character image. And we can't just end it right here — I'll try one more thing. I'm going to use two
people to create an image. When I first introduced InstantID, the image information table showed it handling two people at the same time, and that's the part I'll try now. The method is the same, but there's something else that's important: a total of four ControlNet units go in — A and A2, then B and B2. That's a lot of ControlNets, so it would be good to have enough VRAM. This process is also likely to improve and generalize beyond where it is now, so it's worth keeping an eye on. I'm changing the model this time, to one called RealVisXL — an SDXL model — and applying the settings. First, I'll generate without ControlNet: "two men, white shirt and black shirt," just like that. The important thing is simply to prompt for two people. This is the image that was created, and now I'm going to process the references. How? In the first reference image, I erase the right side of the face; in the second, I erase the left side — two prepared images. Here's why: with four ControlNet units, two of them are face keypoints, and if the two people overlap, there can be a problem where landmarks are generated for only one person in both units, because the detector picks up one face at a time. So I masked the halves to prevent that. Listen carefully from here. I put the characters in: Altman goes here, and Musk goes here. One thing is different from before: you have to adjust the control weight value at the bottom. Earlier I used 1.0 everywhere because there was a single subject, but this time four units are working at the same time, so here it's 0.5 and here it's 0.35. In practice you'll have to find the right numbers yourself, but basically you should work below 0.5 — that way the image generates correctly without breaking. I loosen the overlapping parts as much as possible and prepare each unit: apply 0.5, 0.35, 0.5, and 0.35 in order, and you're done. I didn't touch any of the other options.
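Those four units and their reduced weights can be sketched in the same assumed API shape as before (the field names, and the fact that the half-masked reference images are already prepared, are assumptions):

```python
def two_person_units(face_a, pose_a, face_b, pose_b):
    """Return four ControlNet units: (embedding, keypoints) for person A,
    then (embedding, keypoints) for person B, with the reduced weights
    0.5 / 0.35 applied in order. Image arguments are assumed to be the
    pre-masked references (right half erased for A, left half for B)."""
    specs = [
        ("instant_id_face_embedding", "ip-adapter_instant_id_sdxl", face_a, 0.5),
        ("instant_id_face_keypoints", "control_instant_id_sdxl", pose_a, 0.35),
        ("instant_id_face_embedding", "ip-adapter_instant_id_sdxl", face_b, 0.5),
        ("instant_id_face_keypoints", "control_instant_id_sdxl", pose_b, 0.35),
    ]
    return [
        {"enabled": True, "module": m, "model": mdl, "image": img, "weight": w}
        for m, mdl, img, w in specs
    ]
```

The weights below 0.5 are the point here: with four units active at once, full-strength units tend to break the image, as described above.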
In this state, I generate the image, and you can see it was created with two people — that image is this one. You can push these ideas further: try swapping the landmark source images in the middle for different ones, and you'll be able to create a whole range of images from each combination. It's very interesting. That's all I've prepared for today. It was a chance to see that this IP-Adapter line of techniques is going to be used in a lot of places. Don't forget to subscribe and like. I'll be back with more good content next time. Thank you.