Hello and welcome to this video, in which I'd like to trade a little bit of your time for some knowledge again. In this video I want to introduce you to the MeshGraphormer. It's a way to improve hands. It hasn't been out for very long, so there is already a bit of material about it, but unfortunately I hadn't gotten around to making a video about it until now. For me it's the most practical option for improving and refining hands. I know you can still get decent results with the FaceDetailer and the corresponding hand model, and some people work with OpenPose for hands and so on. To be honest, I find that a bit inflexible. It might make sense if you're working professionally and really designing and styling a picture, but for everyday use, for simply improving hands, this is in my opinion the easiest technique so far, and I'll show it to you now.

I've already prepared a small workflow here; it's very simple. I loaded the epiCRealism model, the prompt is simply "woman waving at the camera", with "text, watermark" as negatives. We take 512x512. While experimenting I caught a very good seed here, I've set it to fixed, and this is the picture we get. I actually think it's a good example, because the hand on the right from our point of view has gone completely wrong, while the one on our left, which is actually her right hand, has turned out relatively okay and will probably be improved further by additional sampling. But we'll get to that over the course of the video.

Before we get started, I'll quickly show you where to get everything from. Open your ComfyUI Manager, go to Install Custom Nodes, and you'll find it in the ComfyUI auxiliary preprocessors. That means we search for "prepro" up here, and the pack is by Fannovel16. You can install it if you don't have it already; it's a very neat collection of preprocessors for the various ControlNets. If we take a look at the page, we can already see the MeshGraphormer in action up here, but it gets more interesting when we scroll down, because that's where the normal and depth estimators are. We can already see the MeshGraphormer Hand Refiner node here; I'll put a link to further information at the end of the video. And importantly, we still need this control net here. I'll just open it, follow the link, click Download at the top left and put it in the controlnet folder, and then we're ready to go. Of course I'll put the link to this page in the description. We don't need the pages anymore, so we'll close them. I've already installed everything, so we can go straight ahead.

Here is the basic prompt, the default workflow. I'll also run the output through a Save Image node so that I have it later for thumbnails and such — better to save it now than later. Okay, let's build the whole thing up. In principle you'll find the node under ControlNet Preprocessors, under Normal and Depth Estimators, and there it is: the MeshGraphormer Hand Refiner. Alternatively you can just search for "meshgraphorm" or "meshgrap" like I do, and you'll find it that way too. Let's connect it and take a look at what it spits out. We add a Preview Image and a mask preview, or Mask Preview as it's called here; incidentally that one comes from ComfyUI Essentials, the link is also in the description below, and it's quite practical for looking at masks. The MeshGraphormer comes with a few settings, which I won't change for now.
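By the way, if it helps to see the wiring so far in text form, here is a minimal sketch of the graph as a ComfyUI API-format prompt that you could POST to a locally running instance. The checkpoint filename, seed, sampler settings and node IDs are placeholders, and the MeshGraphormer class name and input names are assumptions based on my install of the auxiliary preprocessors, so check them against yours.

```python
# Hedged sketch: the base 512x512 generation plus the MeshGraphormer Hand Refiner
# preprocessor, written as a ComfyUI API-format prompt. Checkpoint name, seed,
# sampler settings and the MeshGraphormer class/input names are assumptions.
import json
import urllib.request

graph = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "epicrealism.safetensors"}},            # placeholder filename
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "woman waving at the camera"}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"clip": ["1", 1], "text": "text, watermark"}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 123456789,          # fixed seed, any value
                     "steps": 20, "cfg": 7.0, "sampler_name": "euler",
                     "scheduler": "normal", "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "base"}},
    # Hand refiner preprocessor from comfyui_controlnet_aux (names assumed):
    "8": {"class_type": "MeshGraphormer-DepthMapPreprocessor",
          "inputs": {"image": ["6", 0], "mask_bbox_padding": 30,
                     "resolution": 512, "mask_type": "based_on_depth",
                     "rand_seed": 88}},
    "9": {"class_type": "PreviewImage",
          "inputs": {"images": ["8", 0]}},                                 # depth-map output
}

# Queue the prompt on a locally running ComfyUI (default port 8188).
req = urllib.request.Request("http://127.0.0.1:8188/prompt",
                             data=json.dumps({"prompt": graph}).encode("utf-8"),
                             headers={"Content-Type": "application/json"})
urllib.request.urlopen(req)
```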
Mask Expand, for example, lets you change the size of the mask that is generated. The random seed I have tried; I couldn't really see any changes, so I leave it as it is. Resolution is the resolution the node works at. If you get to the point where you work with dynamic resolutions, i.e. not a fixed 512x512, then convert the resolution widget to an input and connect a GetImageSize node, for example — I take the one from ComfyUI Essentials. Feed your image into it, connect the width output, and the node then automatically adapts to the size of your image. For now, though, it's enough to leave everything as widgets on the node.

And we can select mask types here. "Based on depth" basically generates — I'll let it run — a mask that roughly follows the shape of the hands this node calculates. That can sometimes be a bit tricky depending on the finger positions, so let's take a look. This hand here is what the node has calculated for us based on the picture we got. We can already see that the right hand has been improved; we now have five fingers and so on. The node detects the positions of the hands and then computes optimized versions of them. On the right we see what the mask looks like when we leave it on "based on depth". As I said, you can see that there are gaps between the fingers, and it can happen that the old fingers still show through later during inpainting. Nevertheless, this mask type can be very useful when the hands are in front of a face, for example, because — let me show you the other options. There is also "tight bboxes", which creates small boxes around the hands, and these boxes carry a certain margin with them. That means when the hands are in front of the face, parts of the face get re-rendered as well. For this example, however, we'll go with "original". Original gives us nice big inpainting boxes, and for this picture that fits quite well.

Okay, now we can see what this node does. What we need next is inpainting, and there are different variants. I'll take normal inpainting, so a VAE Encode for Inpainting. We encode our image together with our mask, and here we also need another checkpoint — I'm going in with an inpainting model. Feel free to watch my video on how to turn any existing model into an inpainting model; that's quite handy for things like this. We take the VAE from it. And of course we need a second sampler, so let's copy one over and connect the latent here. The denoise stays as it is, because we only inpaint the masked area anyway. What we still need is the control net I told you to download at the beginning. I'll show you in this example what has worked best for me so far, which is why I'm using an Advanced ControlNet apply node here — a great node pack for ControlNets that I recommend downloading in any case, because it gives you far more options than the control net nodes that ship with ComfyUI itself. I'll show you in a moment how I think you can shape the hands a little more nicely with it. So first of all we have to route the positive and the negative through our control net, then connect that to our sampler down here and tidy things up a bit. And we need the control net itself.
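To connect that to the sketch from before: the inpainting encode described so far might look roughly like this in the same API form. Node IDs continue the earlier fragment, the inpainting checkpoint filename is a placeholder, and I'm assuming the MeshGraphormer node's second output is the mask — double-check that against your install.

```python
# Hedged continuation of the earlier sketch: encode the base image plus the
# MeshGraphormer mask with the inpainting checkpoint's VAE. Filenames and node
# IDs are placeholders; ["8", 1] assumes the node's second output is the mask.
import json

inpaint_encode = {
    # For this example the mask_type on node "8" would be switched to "original".
    "10": {"class_type": "CheckpointLoaderSimple",
           "inputs": {"ckpt_name": "epicrealism-inpainting.safetensors"}},  # placeholder
    "11": {"class_type": "VAEEncodeForInpaint",
           "inputs": {"pixels": ["6", 0],     # the decoded base image
                      "vae": ["10", 2],       # VAE from the inpainting checkpoint
                      "mask": ["8", 1],       # mask output of the MeshGraphormer node
                      "grow_mask_by": 6}},
}
print(json.dumps(inpaint_encode, indent=2))
```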
We go up there and take a ControlNet Loader. Coincidentally, the right control net is already selected here: the SD 1.5 inpaint depth hand control net in fp16 precision, which is the one you can download from GitHub. We need our picture once, and that's what we're wiring up down here. Wait a minute, why is that connection green? I wanted blue. What is it doing? No, that's right — I just wondered why it was green while dragging. In any case, we need the picture that is generated here, because that's what the control net works with. It's simply a depth control net, this is an image with depth information, and that's what it needs. The mask input? No, we don't need that. We could connect it, but I haven't found any improvement from using the mask there as well; it's enough that we use it here.

In principle we can now look at what the whole thing produces as an intermediate step. I'll copy that over — I have to cut briefly, one moment. So, as always, I forgot something: I copied this over here but forgot to switch the node down here back on. We'll let the whole thing run now and see what comes out. This is our inpainting sampler now; of course it first needs the control net up here, then it loads the checkpoint again, and we can hook that in down there. That means the sampler now does the inpainting with the results we got from our MeshGraphormer Hand Refiner node. We wait briefly... and I got an error while loading, that took a while. Now I've spotted the mistake: of course we have to connect the inpainting model to the sampler. Let's run it again, and out comes the result. We can see the hand has been repaired. It also went over this other one again, even though it was fine — I'll show you a trick for that in a moment, which can be useful depending on what you need. But we already have five fingers here. That's better than our picture from up here, where we only had three fingers and a thumb. That worked out pretty well.

Then I'd say we do a second pass on the image we've generated here. For a second pass we really only need another sampler, but we take the positive and negative from the front again, not from the control net, because we don't want to use the control net a second time. We just want a plain second pass here. I'll move over so we can watch the result. Latent goes to samples. And now a little caution is required. We go out of the latent and into the sampler here, and we take the denoise down — I'd say we go in with 0.4. As I said, you have to be a little careful. Let's see if we can spot anything... unfortunately it's not visible. Maybe if I briefly increase the steps — unfortunately this preview hasn't been working for me for a little while; I need to fix that. If the preview worked properly, you would see that the mask used in the VAE Encode for Inpainting is also passed along through this latent connection. That means the second sampler would only do its second pass on the hands, but we want it on the whole picture.

To get rid of this mask again, one option, if you have the Impact Pack, is the Remove Noise Mask node — I first typed "remove latent noise mask", but it's just "Remove Noise Mask", without "latent". That works pretty well, but it's part of the Impact Pack. There is also a Set Latent Noise Mask node; we'll need that later. What happened now? Ah, I know — good that we ran into this. The error happened because, through copying, this second KSampler is still connected to our inpainting model.
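For reference, the control net and inpainting sampler wiring described above might look roughly like this in the same API-sketch form. The node IDs continue the earlier fragments, the control net filename should match the file you downloaded, and I'm using the core ControlNetApplyAdvanced node in place of the Advanced-ControlNet custom apply node, which is wired the same way for this step.

```python
# Hedged sketch of the hand control net plus the inpainting sampler. The control
# net filename and node IDs are assumptions; the core ControlNetApplyAdvanced node
# stands in here for the Advanced-ControlNet custom apply node used in the video.
controlnet_inpaint = {
    "12": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_sd15_inpaint_depth_hand_fp16.safetensors"}},
    "13": {"class_type": "ControlNetApplyAdvanced",
           "inputs": {"positive": ["2", 0], "negative": ["3", 0],
                      "control_net": ["12", 0],
                      "image": ["8", 0],                 # depth image from the MeshGraphormer node
                      "strength": 1.0, "start_percent": 0.0, "end_percent": 1.0}},
    "14": {"class_type": "KSampler",                     # the inpainting pass
           "inputs": {"model": ["10", 0],                # inpainting checkpoint, not the base model
                      "positive": ["13", 0], "negative": ["13", 1],
                      "latent_image": ["11", 0],         # output of VAEEncodeForInpaint
                      "seed": 123456789, "steps": 20, "cfg": 7.0,
                      "sampler_name": "euler", "scheduler": "normal",
                      "denoise": 1.0}},
}
print(controlnet_inpaint)
```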
It now wants to do inpainting at that point, but no longer gets a mask, and that's why the whole thing crashes. That means we have to route the base model back here, and I'll do that with reroute nodes because we will probably need it again further back. Let's run it along here — is that still readable? Yes, that'll do. From here we go into the sampler, and now we get a re-render of the entire picture, our desired second pass. And it looks like this.

Alternatively — that's why I had this node in my head — if you don't have the Impact Pack, you can use the Set Latent Noise Mask node that comes with ComfyUI itself and create a mask for it. Oh no, the mask node is called Solid Mask. We set this mask to white; a value of 1 means white here. We can also leave 512 and 512, because that's our picture size. If you work with other image sizes, convert width and height to inputs and connect them to the GetImageSize node, as shown earlier at the front; then it adapts to your picture automatically. If we now send this mask in here, we essentially tell the sampler that the entire picture area is masked, and that's why it re-renders the whole picture. For now, though, we'll continue with the Impact Pack node. Just as a side tip with the Solid Mask: push it up a bit so it doesn't collide with the model routing here.

So now we have the setup: the sampler for generating the base image, then we let the node generate the corrected hands, then we do an inpainting over the hands with the corresponding control net from up here and an inpainting model, then we go into a second pass. Here we remove the mask again, because we are still working in latent space and the mask from the inpainting gets carried over — it has to go, since we want a second pass over the whole picture, and we do that with the second KSampler. But here we again use the base model, i.e. not the inpainting model but the original one.

Now we go into an upscaling process. I've built my upscaling here from a template I use quite often: an upscale via model and then an image resize back down to a different size. I'm going with 1080 here so the render time doesn't get too long. Now I right-click, Convert to Group Node, call it our upscaling, and we only have one node left. How that works, I explained in the video I'm showing on screen; it's pretty handy, and you save a bit of space and time, as you can see. So we've built an upscaling group here. We can now push the latents in here, we want our VAE, which we have up here, we push that in too, and what we get out the back is another latent. We can feed that quite comfortably into another sampler. Here we go in with a denoise of 0.5. Then we decode the image at the back and check that everything works. It just recalculated because I switched from the Set Latent Noise Mask back to the Remove Noise Mask node. So now it's doing the upscaling and rendering the image again at the larger size. Let's take a look at what came out. That looks pretty good, and the hands are repaired too. It's still a bit blurry here, though.
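Before moving on, here is the same kind of hedged API sketch for the mask reset and the full-image second pass just described: a white Solid Mask fed into Set Latent Noise Mask (the core alternative to the Impact Pack's Remove Noise Mask), followed by the second KSampler running on the base model at a reduced denoise. Node IDs and sampler settings are again placeholders.

```python
# Hedged sketch: clear the inpainting mask with an all-white SolidMask via
# SetLatentNoiseMask, then run the second pass with the ORIGINAL (non-inpainting)
# model at a lower denoise. Node IDs and sampler settings are assumptions.
second_pass = {
    "15": {"class_type": "SolidMask",
           "inputs": {"value": 1.0, "width": 512, "height": 512}},   # all white = whole image
    "16": {"class_type": "SetLatentNoiseMask",
           "inputs": {"samples": ["14", 0], "mask": ["15", 0]}},
    "17": {"class_type": "KSampler",
           "inputs": {"model": ["1", 0],                  # base model, not the inpainting one
                      "positive": ["2", 0], "negative": ["3", 0],   # plain conditioning, no control net
                      "latent_image": ["16", 0],
                      "seed": 123456789, "steps": 20, "cfg": 7.0,
                      "sampler_name": "euler", "scheduler": "normal",
                      "denoise": 0.4}},                   # second pass over the whole image
}
print(second_pass)
```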
I always mix up upscaling and sampling in my head and end up saying "upsampling". In any case, I would still recommend that for this step we use another control net, namely the Tile control net, which exists for exactly this purpose. Wait a minute, let me just type it in here: control net, Tile. The Tile control net basically helps the sampler by giving it the small original image as a reference. Wait a moment, I'll copy the VAE Decode node from over here so the VAE is already connected — that gives us our small image in this case. So we take it from here, feed the samples in, and send the result to the control net. We also need the positive here; we can grab the positive from the upscaling sampler down here, because that's where we need it. I'll make it a little smaller to save space. Now we're telling the upscaling sampler to orient itself on the small image that comes out at the front, i.e. our second pass. That helps it sort the pixels a bit, which means we stay closer to the original image after upscaling. Let's look at the result. Yes, I think it's gotten a little better. The dress — we can pull the original image from the front to the back again and compare the original with the upscaled image. That's pretty good.

As the next step, something I've found out: it's definitely worth using time-stepping on the control net in the sampling up here at the front. We can do that with the Apply Advanced ControlNet node; it's also possible with the ControlNet Apply Advanced node that ships with ComfyUI, which supports time-stepping too. However, I'm going to extend it a little, and that's why we use this node here. I think the Advanced ControlNet pack is the best anyway — it offers the most options from the start. So we take this node, along with this mask input, which we don't need right now; I'll just move that aside. With the time-stepping, I think it works well to say the control net should kick in immediately at 0%, i.e. from the very beginning, but from 80% onward it no longer influences the sampling. That way we create a basis from the Graphormer result, so to speak, but give the model itself more room to apply its own details and style. Let's run the whole thing. You could see briefly down here that it twitched once — from that point on the control net dropped out — and now we'll see what changes. I think it has changed a little for the better; everything has become a bit straighter, things have shifted back and forth a little again. That was already quite good. Let me just check the VAE up here again — no, that's right. I find the dress in the picture a bit weird right now, but well, this is about the hands, not the dress.

What I also find quite practical, or rather quite good: we have the option of doing a weight override here. Then we take the ScaledSoftControlNetWeights — I always have to search for it myself; the node shows up under "scaled soft weights" and is called ScaledSoftControlNetWeights here, quite a mouthful, but fine. If we plug it in: it's actually intended for animation. It multiplies the weight over a number of frames by a certain multiplier and reduces the influence of the control net from frame to frame, but you can also flip that around. I've found 0.9 with flip_weights set to false to be a very good value.
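To tie these two tweaks to the earlier API sketches, here is roughly what they might look like: the hand control net (node "13" from before) limited to a 0%–80% window, and a Tile control net guiding the upscale sampler with the small second-pass image as its hint. The tile control net filename and node IDs are placeholders, I'm again using the core ControlNetApplyAdvanced node, and the ScaledSoftControlNetWeights override is a feature of the Advanced-ControlNet custom nodes that isn't sketched here.

```python
# Hedged sketch: timestep window on the hand control net plus a Tile control net
# for the upscale pass. Filenames, node IDs and referenced outputs are assumptions
# carried over from the earlier fragments.
tile_and_timestep = {
    # Same node ID as the hand control net earlier, now ending at 80% of the steps.
    "13": {"class_type": "ControlNetApplyAdvanced",
           "inputs": {"positive": ["2", 0], "negative": ["3", 0],
                      "control_net": ["12", 0], "image": ["8", 0],
                      "strength": 1.0, "start_percent": 0.0, "end_percent": 0.8}},
    # Decode the small second-pass image to use as the tile hint.
    "18": {"class_type": "VAEDecode",
           "inputs": {"samples": ["17", 0], "vae": ["1", 2]}},
    "19": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_v11f1e_sd15_tile.pth"}},   # placeholder name
    "20": {"class_type": "ControlNetApplyAdvanced",
           "inputs": {"positive": ["2", 0], "negative": ["3", 0],
                      "control_net": ["19", 0], "image": ["18", 0],
                      "strength": 1.0, "start_percent": 0.0, "end_percent": 1.0}},
    # The upscale sampler (not sketched here) would take ["20", 0] / ["20", 1] as
    # its positive / negative conditioning and run at a denoise of around 0.5.
}
print(tile_and_timestep)
```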
That means the strength of the control net is not increased over the duration but reduced. Since we only have one image, it doesn't do that much, but I still think it's very good — it has consistently had a good influence on my results with this control net, and here too I think it had a very positive effect. We'll just wait until it's done and see what it has produced. I'll open the picture and we'll wait. I can't explain exactly why this is the case, but whenever I work with control nets and use the values we just set, I think the result comes out a bit nicer than without them. You can see that here, for example, on this finger — the fingers somehow look a tick better than without it. That's the next tip; it's optional, you can handle it however you like, and you can also work with a normal control net node, that works too. But this is my tweaked version, which I think works best — it has always given me the nicest results.

Now, with this example — that's why I kept the seed — we have our image preview nodes here at the back. So the left hand, from our point of view on the left... man, that's confusing. This one here actually turned out fine, while only this one went wrong. And then I thought to myself: why repaint a hand that doesn't look off in the first place? So I came up with the following. It probably doesn't work with every example, depending on where the hands are, but in this case it's fairly clear, and I just want to pass it on as a little tip. We can create a Solid Mask that is black; we do that with a value of 0. Behind it we put a Mask Composite, because we want to combine masks with each other. We take this black mask as the source, and as the destination we take the mask that comes out of the Graphormer node up here. We can already look at what this does — I'll add another mask preview node and we'll take a look. Now we can see that everything is black, which means we don't have a mask anymore. Now we go here and say 512 divided by 2, which gives us a width of 256. It looks like this, and with this mask we have already managed to exclude the hand on our left and keep only the right one for the inpainting. That means we nudge the offset a bit — I won't get it perfectly placed now, but roughly like this. If we now connect this mask up here into our inpainting, only our right hand gets inpainted and improved.

We go through the upscaling and look at the difference again. I'll open the old picture. And there we are: now only this hand has been inpainted, while this one is still from the original picture — as you can see from the background, that inpainting area wasn't touched. So in principle you can say: okay, I have two hands in the picture, one of them is fine, the other went wrong, and I only want to inpaint that one. But the node always gives us masks for all recognized hands — two masks, or X masks for X hands. This is how you can mask one of them out. If we wanted to exclude the other side instead — I'll mute this so we don't start rendering and just look at the mask — we would simply have to shift the X coordinate by the corresponding number of pixels, and then only the other hand would be inpainted. That's just a tip in case you want to exclude something here.
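Here's that trick as a hedged API sketch: a black Solid Mask at half the image width, composited onto the Graphormer mask with a multiply operation so that one hand's mask gets wiped out. The node IDs, the exact x offset and the assumption that the Graphormer node's second output is the mask are all placeholders to adjust for your own image.

```python
# Hedged sketch of the "inpaint only one hand" trick: multiply half of the
# MeshGraphormer mask with a black SolidMask so one hand's mask is wiped out.
# Node IDs, the x offset and the ["8", 1] mask output are assumptions.
one_hand_mask = {
    "30": {"class_type": "SolidMask",
           "inputs": {"value": 0.0, "width": 256, "height": 512}},   # black, half the image width
    "31": {"class_type": "MaskComposite",
           "inputs": {"destination": ["8", 1],     # mask output of the MeshGraphormer node
                      "source": ["30", 0],
                      "x": 0, "y": 0,              # shift x to choose which half gets wiped
                      "operation": "multiply"}},
}
# Feed ["31", 0] into the "mask" input of VAEEncodeForInpaint instead of ["8", 1].
print(one_hand_mask)
```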
Of course there are problems when the hands overlap somehow and it creates one big mask or something like that. Let's do a second example. The first one was a case where it works quite well, but we'll plug the full mask back in here, and now I'll dare to generate a brand-new image — the one before was prepared. It doesn't always work as well as with those hands; sometimes you get very strange results, and we'll take a look at that now. I'll mute this node once and — oh, come on — I'll put a Preview Image in here so we can see whether we get a usable new picture. Yes, her hand here is also nasty. That's fine; we'll send it through the whole process now and see what happens. Because, to be honest, there is of course no all-round solution for fixing hands. It goes wrong a few times too; it can't save every picture. Let's be honest, there's a bit of seed luck involved at this point, and of course prompting plays a role. But I mainly wanted to show you the technique.

So let's see how it did here. Yes, well, you can see what I mean. It gave us five fingers and not just four — you can't really tell what the fourth one is supposed to be. It tried to correct it, but it's a bit difficult. If you want several variants, you can also add a Repeat Latent Batch; let's just hang that in between. Then at least we get X suggestions that we might be able to work with. Of course everything takes correspondingly longer, since it now has to calculate four pictures at once. Let's see what suggestions it generates. That took a while — don't be surprised, I always fast-forward a little. Let's look at the whole thing a little bigger. We've now received these four suggestions, and spontaneously I would say this one or this one. Difficult, difficult — but I wanted to show it, because earlier I went in with a prepared seed where I knew it worked quite well. Of course we also have the seed here on fixed; something can still change in the second pass, which is also on fixed. If you set it to randomize, there can be a bit of variation as well. Otherwise, as I always say: don't blindly trust the example pictures on the net and things like that — those are of course cherry-picked. A lot of attempts simply fail, and then you have to experiment a bit.

Well, I'll put the workflow under the video. You now know in principle how it works. I showed you the setup that has given me the best results so far; as always, you can convert all of this to the normal control net nodes. Either way, just play around with it a bit, and feel free to write your experiences with this technique in the comments — it's always good for me to learn something and get feedback. Otherwise, I wish you a lot of fun making and experimenting. We'll see you again in the next video. Until then, take care and bye!