ComfyUI Tutorial - Automatic Subject Masking via COCOSemSeg

Video Statistics and Information

Captions
Okay, what's up. I'm poisenbery and I make AI, quote unquote, art. I'm gonna teach you how to use ComfyUI to do automatic subject masking, because ComfyUI is awesome until it's not, and then you have to figure out really complex stuff in order to do things.

So what's going on here? It's doing the things with the stuff: put this in here, and then this goes in the negative prompt, because I want this to be on YouTube. It goes through here, it does the sampling, does the base image right over here, and then, oh, what's this? How do you do that? How do you make a solid mask based on the person? Based on, kind of, the person; it's good enough. The way that I do this is with the COCO SemSeg preprocessor.

Just a basic overview of what's going on: base image, COCO SemSeg... actually, you know what, let's start from scratch. Let's remove all these. Oops, didn't mean to do that. Get an image preview to see what's going on: preview image, clone preview image.

So what COCO SemSeg does: I could be wrong about this, but I'm fairly confident this is the same type of segmentation preprocessor that self-driving cars use. That's why I use this prompt right here to show what's going on. You can't see my mouse pointer, by the way; I forgot about that last time. So this prompt has a thing with the person, and as you can see there's some cars in the background, some trees, some buildings. There's one girl with a handbag, whatever; it's floating, it's attached to her. And then if you look at the thing, it's able to know the street. Let me put these right next to each other, because you can't see my mouse pointer. So it's able to segment the person, it's able to segment the street, it's able to segment the cars very well, because a self-driving car needs to know what cars and people look like so it doesn't hit them, and it needs to know what a tree is so that if there's an unavoidable collision it can hit the tree
instead of the alive stuff. I mean the alive stuff we actually care about; I'm not a vegetarian. The buildings, it wants to avoid those. The sky, it wants to avoid flying into the sky, I guess. Whatever.

So the question is, how do you actually use this? How is this useful? We're going to use it to create masks for the subject. So how do you use this information to get to that black-and-white mask that I had earlier? The answer is: with math. Everything in computers is math. These colors, don't let them fool you; they're not actually colors, they're all numerical values. The computer only understands numerical values; everything is a one and a zero.

So we need... not that, mask... no, image... I forgot which one it was. MaskComposite... Convert Image to Mask, there we go. Masks are black and white, so it has to convert based on color channels. We're going to get all of the color channels: the red, the green, and the blue. Then plug this COCO SemSeg output into the Convert Image to Mask nodes, and it'll separate it into all the color channels. Now what we need to do is convert all these masks back to images so that we can actually see what's going on and have a better understanding of what it's doing. So I'm going to clone all these bad boys, or bad girls, we don't know. Move all this stuff out of the way, this here, plug it in, plug it in.

All right, so what's going on? Let's get a new image. Let's just put it on fixed to make it easy. So, color channels. What do we have to do next? We've separated the segmentation into the color channels, so how do we get a black-and-white mask from all this stuff? We have to do math. The thing that I've learned about this is that cars will always be blue, and that doesn't matter. The subject, the person, will always be the most red, in the red channel. It's always going to be this color no matter what, every
single time, and then inanimate objects that might be attached will be a slightly different color. This might be problematic: there's a little bit of red up here. Why, I don't know. All right, let's use this image. Fixed. Cool, we got the same image. So it segments them, we get the color channels, and now we have to do math.

The way you do math with masks is with the MaskComposite node. You can see there's a source and a destination. I don't know why they did it like that; it should be A and B and an operation: A times B, A plus B, A minus B. But yeah, that's how it works: it's this minus this. Like I said earlier, the person is the most red; the person will always be this color.

So how are you doing math with colors? That doesn't make any sense. Like I said before, everything is numbers for the computer. The computer doesn't know colors; the computer knows numbers. You can think of a color as a three-dimensional array that has a red, green, and blue channel: red channel, comma, green channel, comma, blue channel. And then it just separates that out into grayscale. It converts it from a three-dimensional array into a single float where, instead of the RGB, it's just gray: somewhere between a zero and a one. So now we have a float, a single number, and we're able to do math with that, because math with an array is a little more complicated. Not really, but it just makes it easier for what we're trying to do.

We need this red color, because the red's always the person, so we have to subtract the green and the blue from the red. We take the red and then we subtract the green. See: subtract. And then again we want another image preview so that we can see what we're doing, because doing math with colors is not
really intuitive to some people, including myself, and I've been doing this for a long time. So we subtract the green from the red and now we have this. This is going to be a problem. All right, this is going to be a problem, but we'll fix it, don't worry. Unless we can't. So that's something to be aware of: this is AI, and we're trying to be lazy by setting up something to do the work for us, which means that it might not always work the way we want. Just keep that in mind. But there's ways around that, don't worry, I'll show you later.

So what do we do? We subtracted the green from the red, but we still gotta subtract the blue, because the blue is cars and we don't care about that. We just want the person; we want the red information. So clone that, clone these two. Shift-drag... crap. Control-click, control-click, shift-drag. Make sure you have the right ones selected. So we have this: this is the red minus the green, the red channel without green. Now we need to subtract the blue. Like I keep saying, we only want the red. Boom. I knew it, I knew this would be a problem.

All right, there's a couple things. For some reason that's there, but whatever. We need to find out how to get rid of that, if there's any extra information that you don't want. I don't know if you can see, but there's some other very dark gray in the corner here. We could bring that up: if I use a binary mask, it's going to convert anything that isn't black into white. See? So we want to do something to get rid of these values, which is very easy. For that you just use another MaskComposite. What did I just do? It's all right. Oops. So we clone this, clone these as well, shift-drag, and then we need a solid mask for this. What's a solid mask? It's a number with a height and a width. If this is not the same resolution as the base image, then it
won't subtract for the whole thing. So we want this to be the same, 512 by 768. We need the masks to be the same size as the original image or else it'll only work on a portion. And since AI creates images differently every time, we just want something that works 50% of the time, every time. That's what we're doing.

Oh yeah, by the way, what exactly is this value? If you don't know about numerical values and how they relate to math: one is pure white, zero is black, and anything in between is a shade of gray. So we have dark gray, we have slightly lighter gray, we have almost white but not white; it's gray. There's lots of gray, there's more than 50 shades of gray.

All right, so we're trying to get rid of this extra information up here and these little tiny dark gray patches. I don't know if you can see them, but they're definitely there; that's what the binary mask showed us earlier. So we have this; this is maybe like 0.7, maybe 0.8. So let's do 0.5, subtract. Oops, no, we need the source, even though this is the source. They should change this to A and B. Whatever. All right, so what's going to happen? That stuff's gone now. And now we go to Impact, Operation, binary mask. Swap this... actually, let's make a new one. Clone, shift-drag, binary mask to image. So now this should, oops, this should only have the subject now, like we want. There we go.

Does it work every time? No, it doesn't. But does it work most of the time? Yes, it does. See, we get a new image. Do we have a new image? What's going on? We have a new image now with a girl and a bag and some cars and stuff. So it does the segment thing, the red is the most person, we have all the different color channels, subtract, subtract, subtract anything else that we probably don't need, any small gray values, and then boom. That got rid of the handbag. So this value might be a little too high. It's maybe 0.3, repeating of course. So what
do we have now? We've got a new mask. She has a handbag. Why do they always have handbags? Oh dude, oh my God, it's all the hands. It almost had the correct number of fingers.

The bow is not person. So this is the only thing I haven't really figured out how to work around: when it segments bows and ties, for some reason, into a non-red color. I literally have no idea why it does that. I think it thinks it's a butterfly, and then it's like, "I could run that over," because I think cars use this. But yeah, this is the only thing I haven't really figured out how to work around, like I said two seconds ago. But it's doing everything automatically, so you shouldn't really expect it to be a hundred percent every time. This is AI, you know how it works: you generate 200 images, then you cherry-pick the best one, then you upload it and tell people, "Oh, I use Photoshop with my art." No, you don't.

All right, this might be a little too high still. Let's bring it down to like 0.15. Actually no: point one six nine, six nine, four twenty. It looks like it rounds up, but it actually doesn't. No, it needs to be six nine, four twenty, six nine. There we go. There's always going to be a little bit of, you know, sus. The AI thought the bow tie was something else. Maybe it sees people wearing bow ties and it's like, you can run them over for wearing a bow tie. I don't really know how it works; I'm not the person who made it. They obviously hate bow ties, because this always happens with bow ties. See, we got another one. She's not wearing a bow tie, so she is okay. But the handbags, those will always be a different color for some reason. And there we go.

So you can do some additional work with this, like maybe a blur or a feathering. Let's see: Impact... mask, Feather Mask. I hate this; I'm not changing all four of those numbers. No, no, no. Utils, Primitive. So I
am trying to be as lazy as I possibly can. Five, and then put that here. Wait, no, that's not what I wanted to do. We need to feather... oh yeah, all right, this goes here, then it converts it to binary, so it should maybe stop these little lines. We're gonna find out. Yay. And if you want, you could make this a little bigger. "Control after generate", what the heck? Oh, I think that's for seeds. Just in case... no.

All right, so basically, if she has a bow tie she's gonna die. That's the mnemonic: if you wear a bow tie, you will die. So just put "bow tie" in the negative. Why do they always have handbags? Come on. There we go, it's all right. Cool. Except not. Uh oh. No, what does that even... oh okay, I understand: her hand sucks, so it just said no thank you. I'm not sure what's going on there. So I guess it's that.

It's not gonna work every time, but it's gonna work enough times that you should be able to make use of this. So I hope this has been informative, and yeah, good luck.
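
The mask math described in the captions can be sketched outside ComfyUI with NumPy. This is only a rough illustration, not the node graph itself: the function name `subject_mask`, the sample pixel values, and the default threshold of 0.15 are assumptions made up for the example, and it assumes each subtraction is clamped to the 0 to 1 range the way a mask would be.

```python
import numpy as np


def subject_mask(seg_rgb, threshold=0.15):
    """Return a binary subject mask from a segmentation image.

    seg_rgb: float array of shape (H, W, 3) with values in [0, 1].
    threshold: value of the solid mask subtracted at the end (hypothetical default).
    """
    # Split the segmentation image into one grayscale mask per color channel.
    red, green, blue = seg_rgb[..., 0], seg_rgb[..., 1], seg_rgb[..., 2]

    # Red minus green, then minus blue, clamping after each step
    # (assumption: subtraction results are clipped to [0, 1]).
    mask = np.clip(red - green, 0.0, 1.0)
    mask = np.clip(mask - blue, 0.0, 1.0)

    # Subtract a solid mask of the same size to remove faint gray leftovers.
    solid = np.full(mask.shape, threshold, dtype=mask.dtype)
    mask = np.clip(mask - solid, 0.0, 1.0)

    # Binary mask: anything that is not black becomes white.
    return (mask > 0).astype(np.float32)


# Made-up 2x3 "segmentation" image: person, car, handbag / sky, faint red, person.
seg = np.array([
    [[0.9, 0.1, 0.1], [0.0, 0.0, 0.9], [0.5, 0.3, 0.1]],
    [[0.3, 0.5, 0.7], [0.2, 0.05, 0.05], [0.9, 0.1, 0.1]],
])

mask = subject_mask(seg)
print(mask)
```

With these sample values only the two strongly red "person" pixels survive; lowering `threshold` lets the weaker reddish pixels (the handbag-like one, for instance) back into the mask, which mirrors the tuning done in the video.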
Info
Channel: poisenbery
Views: 11,471
Id: ySoIptW2huI
Length: 16min 42sec (1002 seconds)
Published: Thu May 25 2023