ComfyUI: Face Detailer (Workflow Tutorial)

Captions
Hi, I am Mali, and welcome to the channel. I have lots of AI-generated images where the faces are distorted, and fixing them via manual inpainting is a tedious process. Using CLIPSeg image segmentation, I can edit all these facial elements in one go. With the same tools, you can also take a photorealistic image and give it a graphical face, or add some facial realism to a graphical AI-generated image, all without manual inpainting. Let me show you my workflow and hacks for doing this in ComfyUI. A shout-out to the channel members for their continued support.

There will be four workflows in this tutorial. The workflows at the beginning are basic, and they become a bit more advanced toward the end. Some custom nodes are required for the workflows. I will leave the GitHub page of Dr.Lt.Data in the description; please thank him for making the ComfyUI Manager and the quality-of-life nodes, because without his nodes none of this would have been possible. Both the Impact and Inspire packs are required. Then install CLIPSeg; this node is crucial for the last workflow. If you are loading the JSON file, do not use "Install Missing Nodes": the ComfyUI Manager picks up another node which would install unnecessary things not required for this tutorial, so ensure that you install this node manually. The WAS Node Suite, pythongosssss, and Ultimate SD Upscale custom nodes are also required for the tutorial.

To download the Ultralytics models, go to "Install Models" in the ComfyUI Manager and search for "ultralytics". For the tutorial I will be using two of these models: one for bounding boxes and the other for segmentation. I will use the SDXL base checkpoint and the refiner for the first workflow; these are the Hugging Face models.

With this basic workflow, around 80% of your low-resolution photo faces can be fixed. The remaining 20% would be images that can only be partially fixed by a single pass. To check the robustness of the workflow, I will show you some worst-case scenarios and how the workflow can be tuned to make it work. Add two checkpoint loaders, one for the base and the other for the refiner, plus a VAE loader for the base, and use the SDXL CLIPTextEncode nodes for the prompts.

From the Impact Pack, search for and add the ToDetailerPipe (SDXL) node. This is a very useful node: it creates a pipeline carrying information like the conditioning, VAE, models, etc., and reduces the clutter. Without it, if you add multiple nodes that require this information, you have to connect these inputs again for each added node, which becomes messy with long workflows. The only difference between the SDXL and the normal version is the refiner inputs on the node.

When you drag out the bbox detector, the Ultralytics node doesn't show up, so just search for and add it manually. Anything in this suite with "provider" in the name provides some information, such as the different YOLO and segmentation models; provider nodes are a source and do not require any input connections. The Ultralytics loaders have bounding-box and segmentation detector outputs. When connecting bbox, ensure the model is a bounding-box model only; similarly, choose a segmentation model for connection to a segm input. Bbox detects the subject as a rectangle; in comparison, the segmentation model uses a silhouette as a mask. Segmentation combined with bbox is more accurate, and I will be using a combination of both wherever possible. I will use the YOLOv8s face model for bbox and the YOLOv8n-seg2 model for the segmentation. For the silhouette extraction, the Impact Pack uses a more sophisticated Segment Anything model (SAM) instead of a segmentation model; the SAM loader's model will be downloaded automatically.
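To make the bbox-versus-segmentation distinction concrete, here is a minimal Python sketch of what the Ultralytics provider nodes load under the hood, calling the ultralytics package directly. The model filenames, image path, and confidence value are illustrative assumptions, not the exact files the Manager downloads.

```python
# Minimal sketch: bounding-box vs. segmentation detection with Ultralytics.
# Model filenames and image path are assumptions for illustration.
from ultralytics import YOLO

# Bounding-box model: returns a rectangle around each detected face.
bbox_model = YOLO("face_yolov8s.pt")            # assumed bbox face model
results = bbox_model("portrait.png", conf=0.5)  # conf ~ the bbox threshold slider
for box in results[0].boxes:
    print(box.xyxy, float(box.conf))            # rectangle corners + confidence

# Segmentation model: additionally returns a silhouette mask per subject,
# which is why combining segm with bbox gives a more accurate selection.
seg_model = YOLO("yolov8n-seg.pt")              # assumed segmentation model
seg_results = seg_model("portrait.png")
if seg_results[0].masks is not None:
    print(seg_results[0].masks.data.shape)      # (num_subjects, H, W) masks
```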
Search for and add the FaceDetailerPipe node, and connect it with the detailer pipe and the Load Image node. The guide_size value is used only when the detected mask size is smaller than it; if the mask size is larger than this value, the node will just scale it to the max size and then add details. The guide_size_for option determines whether the guide_size value is based on the bbox-detected area or the crop area; the crop area is determined by the bbox crop factor. The noise mask, when enabled, will add noise only to the mask area; when off, it will go beyond the mask area to add noise, but it will limit itself to the area defined by the crop factor. Force inpaint will force regeneration even if the mask size is smaller than guide_size; when it does that, the upscale factor is fixed at one. The bbox threshold is the confidence level of the detection: at lower values, smaller subjects will be selected, while a higher value ensures only bigger subjects are selected.

The Segment Anything model has different methods of detection. The detection hint determines which points should be included when performing segmentation: for example, center-1 means one point in the center of the mask, vertical-2 means two points on the center vertical line, and diamond-4 specifies four points in a diamond shape around the center point. To be honest, I did not bother experimenting with this option; center-1 is what I will be using throughout the tutorial.

Always add image preview nodes to the cropped_refined, cropped_enhanced_alpha, and mask outputs; the previews will help you fine-tune the settings. Let's choose this image and hit Queue Prompt. With a 0.5 bbox threshold value, it has detected five faces. The detection system has a limitation: it can only detect above a threshold, which means that if you want it to detect the smallest face only, it cannot do that; it will select all faces above that small threshold value. However, you can make it select only the main face; just increase the value, in this case to 0.8.

The bbox dilation will increase the mask area. The right ear of the subject is not entirely in the mask area; to apply the detailing effect to the ear as well, the dilation should be increased. The SAM dilation will increase the overall mask area, whereas the SAM bbox expansion will expand the mask toward the boundaries of the bounding box. Feathering is how well the mask blends with the background; a higher value reduces the defined edges of the mask. The crop factor value plays a crucial role: the model needs enough surrounding area to understand the context and add details, and the value here is a multiplier of the area surrounding the mask. Later I will show you practical examples of when this value should be changed to get the desired results.

With the default guide size and max size values, I am not getting the desired results; you can see the face is not properly restored. When I increase the resolutions to 1024, I do get decent results. This, however, may be taxing on some systems, depending on the source resolution, so there is another solution. After some testing, I understood that adding some sort of prompt helps; however, getting the exact prompt for each image I tested became difficult, so I added the BLIP analyzer node, which works in this case. You can skip it if you want to do manual prompting. To connect the string with the conditioning, add a node called Text to Conditioning. Here are the results for comparison: 768 with the prompt gives the best results for the performance; however, going further below does not output the desired results for this image.
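As a rough illustration of what the dilation, feathering, and crop-factor settings do to a detected mask, here is a short OpenCV sketch. This is not the Impact Pack's actual implementation, only the underlying image operations; the file name and slider values are placeholders.

```python
# Rough illustration of mask dilation, feathering, and crop factor in OpenCV.
import cv2
import numpy as np

mask = cv2.imread("face_mask.png", cv2.IMREAD_GRAYSCALE)  # 0/255 detected mask

# Dilation grows the mask outward, e.g. to pull the subject's ear into the
# region that will be re-detailed.
dilation = 10                                    # ~ bbox dilation slider
kernel = np.ones((dilation, dilation), np.uint8)
dilated = cv2.dilate(mask, kernel)

# Feathering softens the hard edge so the regenerated area blends in;
# a higher value reduces the defined edges of the mask.
feather = 5                                      # ~ feather slider
ksize = feather * 2 + 1                          # Gaussian kernel must be odd
feathered = cv2.GaussianBlur(dilated, (ksize, ksize), 0)

# The crop handed to the sampler is the mask's bounding box scaled by the
# crop factor, giving the model surrounding context to work with.
x, y, w, h = cv2.boundingRect(dilated)
crop_factor = 3.0
crop_w, crop_h = int(w * crop_factor), int(h * crop_factor)
```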
Next, I am taking an entirely distorted image; the sole purpose is to test how robust the workflow can be. It could not restore it, and this is expected. Running it through a second pass with the same checkpoint would not make that much of a difference, since the output is terrible, so I am going to use a different checkpoint. Since the first checkpoint is SDXL, ensure the second one is also SDXL-based; selecting a non-SDXL checkpoint would give an error. The checkpoint you choose depends on various factors, and you can pick any SDXL checkpoint you are comfortable with. Here, add an EditDetailerPipe node; this is useful for making changes along the way to the existing pipe flow. Connect the face detailer to this node, duplicate it, and the image output from the first face detailer will connect to the second one. This is much better. I tested this specific image with random seeds, and six out of ten images were usable.

For this workflow, I will delete the second-pass nodes I added previously, the BLIP nodes, and all the SDXL conditioning and refiner nodes. I will use a simple SD 1.5-based checkpoint and pass the image twice through two face detailer nodes. Replace the detailer pipe node with the non-SDXL version, then duplicate all the face detailer nodes and connect them for the second pass. There is a reason why the base SDXL is not used here: it just doesn't work. Fine-tuned checkpoints work better for intrinsic details. Depending on the desired output, the checkpoint is crucial here; a checkpoint trained for anime will not work for realistic details. I will show you some examples.

Say I want to change the face to a more realistic look. For this I am using the Juggernaut Reborn checkpoint. Unless the model gets something wrong, I prefer to use only the negative prompt and leave the positive empty. Let's try another example without changing any settings. Now I will show you one example of using a prompt: with the DreamShaper checkpoint, I will try to convert the face to an illustrative style, so for the positive I will just put "illustration" and "vector art". This is a feminine look and is not consistent with the original image; I am only going to add the word "male" to the positive prompt, and it's fixed. One last example to show you why the second pass matters: some checkpoints do cause some artifacts in the first pass, but they get corrected in the second, while other checkpoints overdo the image in the second pass.

For this workflow, delete the second-pass face detailer nodes, and also delete the Ultralytics provider nodes. The objective is to have different facial features auto-selected upon command and to enhance or change them in one go. Except for the hair selection, everything else is pretty straightforward; to automate the hair selection process, you have to create a whole new set of nodes, but once done, it is very accurate. As shown earlier, I am adding the BLIP nodes for the positive conditioning; this works best and maintains consistency with accurate prompting. The MediaPipe Face Mesh detector is specifically designed for facial details; bbox and segm are combined in this node, and it should be connected to the detailer pipe. All you have to do here is enable a specific facial detail, and it then automatically creates a mask for it. Let's queue the prompt first. Since "face" was selected, it masked the face; let's change that to the left eye. Pretty cool, huh? You can add a LoRA and a prompt, which will affect only the masked area. Let's add an SDXL-compatible LoRA to enhance the eyes, also adding "light green eye" to change the eye color.
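To give a sense of how a face-mesh detector can auto-select one facial feature, here is a hedged sketch that builds a left-eye mask with the mediapipe package directly. It only approximates what the node automates; the image path is a placeholder, and error handling (e.g., no face detected) is omitted.

```python
# Hedged sketch: isolating the left eye as a mask via MediaPipe Face Mesh.
import cv2
import mediapipe as mp
import numpy as np

image = cv2.imread("portrait.png")               # placeholder path
h, w = image.shape[:2]

with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as mesh:
    results = mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

# Assumes a face was found; results.multi_face_landmarks is None otherwise.
landmarks = results.multi_face_landmarks[0].landmark

# FACEMESH_LEFT_EYE is a set of landmark-index pairs outlining the left eye.
eye_idx = {i for pair in mp.solutions.face_mesh.FACEMESH_LEFT_EYE for i in pair}
pts = np.array([(int(landmarks[i].x * w), int(landmarks[i].y * h))
                for i in eye_idx], dtype=np.int32)

# Fill the convex hull of the eye landmarks to get a binary mask that a
# detailer pass could then denoise selectively.
mask = np.zeros((h, w), np.uint8)
cv2.fillPoly(mask, [cv2.convexHull(pts)], 255)
```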
To select another facial detail, add an EditDetailerPipe node and connect the pipe from the face detailer output. Then duplicate and connect the MediaPipe Face Mesh node, duplicate the face detailer nodes, and connect them with the EditDetailerPipe. I will copy and paste the same prompt and change the eye color to blue. You can now repeat this process to automatically select each MediaPipe Face Mesh element.

I will be using CLIPSeg to automate the hair selection process, so add the CLIPSeg node. CLIPSeg can detect single-word categories like person, face, hair, clothes, etc. First we will generate the hair mask. For the tutorial, the blur value should be set at one. A higher dilation factor means a bigger mask area for the hair; this varies from image to image, so start at four and go up to 30 depending on your desired hair coverage. The threshold value determines the sharpness of the mask edges: a lower value spreads it more and gives it a fading gradient, while a higher value makes it more defined and sharp. Figures ranging from 0.35 to 0.5 seem to be the best; however, it depends on the image. Duplicate these three nodes and create a mask for the face, because we want to avoid any denoising happening on the face.

The next node doesn't come up in the search, so right-click, choose Add Node > ImpactPack > Operations, and select the Mask Minus Mask node. We need to subtract the face mask from the hair mask: connect the hair mask to the first input and the face mask to the second. The mask generated by the CLIPSeg node is not binary; to convert it, add a ToBinaryMask node. When you increase the dilation for the hair mask, say to a value of 40, you can see the mask expands outwards as well as inwards. We want it to expand only outwards; otherwise the face would be regenerated. This is where the mask subtraction is useful.

To generate the hair, I will add the DetailerForEachDebugPipe node. A higher value in the mask threshold will reduce the mask area, and a lower value will increase it; this is used to fine-tune the mask further. The mask has to be converted to SEGS with the MaskToSEGS node, which will then connect with the detailer debug input. The combined value should be set to true. I have already explained the crop factor before. To get the other details to this node, we need to create a basic pipe: add a node called ToBasicPipe and connect all its inputs. Drag it out to the second face detailer and add a node called FromBasicPipe_v2. But before we connect this node to the last detailer, we need to modify the prompts for the hair, so add an EditBasicPipe node in between, add CLIPTextEncode nodes, and complete the pipeline.

Let's try "curly blonde hair" in the positive prompt. Oops, don't forget to connect the clip inputs. Nice. Now let's try something else: how about "straight purple hair"? A couple of things are off here; this can be fixed by reducing the binary mask threshold value or increasing the mask dilation. If the hair is short, you can increase the length slightly by increasing the mask size. Here I will reduce the face mask dilation to 1 and the mask threshold to 30. You can really fine-tune the hairstyle by adjusting the mask.

Now I am trying to make changes to both the eyes and the mouth, and I want to show you some of the settings that affect the generation. When dealing with confined areas like eyes, lips, or eyebrows, the model sometimes does not have enough pixel data to understand the context. This causes artifacts in the output, and it can be corrected by expanding the crop factor and giving the model more data to work with.
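Stepping back to the CLIPSeg masking above: here is a minimal sketch of the prompt-driven hair/face masks and the Mask Minus Mask subtraction, using the Hugging Face port of CLIPSeg rather than the custom node itself. The model id, image path, and threshold value are assumptions for illustration.

```python
# Minimal sketch: text-prompted masks with CLIPSeg, then mask subtraction.
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

model_id = "CIDAS/clipseg-rd64-refined"          # assumed model id
processor = CLIPSegProcessor.from_pretrained(model_id)
model = CLIPSegForImageSegmentation.from_pretrained(model_id)

image = Image.open("portrait.png").convert("RGB")
inputs = processor(text=["hair", "face"], images=[image, image],
                   padding=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits              # (2, 352, 352) heatmaps

threshold = 0.4                                  # ~ the node's threshold slider
hair, face = (torch.sigmoid(logits) > threshold).float()

# Mask Minus Mask: subtract the face from the hair so that dilating the
# result only grows outward and never regenerates the face.
hair_only = torch.clamp(hair - face, 0.0, 1.0)
```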
Have a look at this area. Increasing the bbox dilation to 50, the SAM dilation and bbox expansion to three, and the crop factor to four fixed it. If changing the settings has a negligible effect, keep the same settings and try random seeds.

Some distortions can be produced by the hair overlapping the clothes. Reducing the mask is not a proper solution here; in such cases a clothing mask has to be created and subtracted from the overall mask. Creating a clothing mask is the same as the hair mask; the text for the CLIPSeg node should be "clothes". Duplicate the mask subtraction node: the clothes mask will connect to mask 2, and the output from the first mask subtraction connects to mask 1. Adding the clothes subtraction gives a better result.

We can further streamline the process by using a switch node. The switch node setup will allow you to switch between all masks, only the face and hair masks, and only the hair mask. Rename the first binary node to "all", then search for and add the Impact Switch node. Connect the binary node to input one, and connect the switch output to the MaskToSEGS node. Add another ToBinaryMask node connecting with the first mask subtraction; this will be the face and hair only. Duplicate it again and connect only the hair mask directly to this node, renaming it "hair only". Connect hair only to input two and the hair-and-face only to input three. Rename the Switch (Any) node for reference, and that's it; change the select value based on the image before hitting Queue Prompt.

For the upscaling, add the Ultimate SD Upscale node. Switch the mode type from Linear to None to avoid adding further details during upscaling; this will be a basic image-to-image upscale only, and I recommend 1.5x to a maximum of 2x at a time. Extend the basic pipe connection, add another FromBasicPipe node, and connect the relevant inputs with the Ultimate SD Upscale. And that's about it. I hope the tutorial was helpful and you learned something new in Comfy. Until next time.
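As a closing addendum, here is a tiny sketch of the three-way mask switch described above, with plain NumPy standing in for the Impact Switch (Any) node; the function name is illustrative, but the wiring order mirrors the tutorial's setup.

```python
# Illustrative stand-in for the mask switch: pick which mask feeds MaskToSEGS.
import numpy as np

def select_mask(select: int, hair: np.ndarray, face: np.ndarray,
                clothes: np.ndarray) -> np.ndarray:
    """Inputs are binary masks; select matches the tutorial's switch wiring."""
    if select == 1:                        # "all": hair minus face minus clothes
        return np.clip(hair - face - clothes, 0, 1)
    if select == 2:                        # "hair only": raw hair mask
        return hair
    return np.clip(hair - face, 0, 1)     # "face and hair": hair minus face
```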
Info
Channel: ControlAltAI
Views: 26,455
Keywords: impact pack, stable diffusion, comfyui, custom nodes, workflow component, comfyui inspire pack, stable diffusion face detailer, comfyui face detailer, comfyui face restoration, comfyui face fix, comfyui face detect, comfyui workflow, comfyui custom nodes, comfyui segs, comfyui segmentation, comfyui clipseg, comfyui sam, comfyui face fixing, segment anything
Id: _uaO7VOv3FA
Length: 27min 16sec (1636 seconds)
Published: Sun Jan 07 2024