DIP Lecture 12a: Image Segmentation

Captions
Okay, so the topic of today's lecture is an introduction to what's called segmentation. Last time we talked about thresholding, which was basically a way to take an image and turn it into a black-and-white image. Today we want to talk about a couple of slightly more sophisticated methods for this process, so this is an intro to segmentation. I'm going to cover maybe two or three methods today that are general-purpose segmentation tools, and as I'll show you, these are built into MATLAB for the most part. Then next time we're going to talk about an even more advanced segmentation algorithm: the stuff today is at the pixel level, and the stuff in the next lecture is more like evolving a boundary, a closed contour, to squeeze around an object, so it's a little bit fancier.

The first thing I want to talk about is what I would call basic region growing. The idea is that if you threshold an entire image, what you may get is a bunch of disconnected regions that are above the threshold, and as we'll discuss in a couple of lectures, you can clean some of this up with what's called morphological image processing, where you can select which region you care about and so on. But you certainly may get something that looks like a scattering of blobs, with everything else zero. Instead, suppose I just want one nice connected region of stuff that I feel belongs to a certain neighborhood. If we only want one region, if we want to click on a point, for example, we can do the following: I click on a point and I grow out a blob of all the pixels that are of similar intensity to that point.

The algorithm is very simple. I take my input image, f(x, y), and I get what's called a seed image, let's call it S(x, y), for locations of interest. As we'll talk about, there are ways to get those seeds automatically; for example, you could take those thresholded blobs, shrink them down to just the centers of each region, and call those the seeds. In the example I'm going to show you, I'm just going to click on a point in the image and see what we get, but you could get the seeds, for example, by thresholding. Then I reduce this seed image down to just a set of points: I take each blob of the seed image down to a single point. This isn't really the main point here; there are MATLAB tools that help you do this. One is called regionprops, which you could use to get the centroid of each blob, or you could do something called erosion, which we'll talk about not next week but the week after, where you can take a big white blob and shrink it down to a nugget at its center. Fundamentally, every place I care about gets shrunk down to one location; I set that equal to 1 and everything else equal to 0. Then the algorithm is really easy: I let my new, processed image T equal 1 if the image satisfies some kind of predicate or condition, and 0 elsewhere.
I mean, this is super vague, so let me be a little more specific. For example, I might say something like: I want to get all of the points that are connected to a given seed point, and I want the intensity of each of those points to be close to the intensity of the seed point. All this means is, I click on a point and say, okay, that point's intensity is 129 and maybe my threshold is 10; now find me all the points that are close enough to 129 and that I can reach by a path from that seed point. So it's like growing outwards from that seed point to all the other pixels that satisfy the condition, and once I hit a rind of pixels that don't satisfy it, I stop. Unlike thresholding, where I could get points all over the place, here I'm constrained to growing outwards from a given seed point: I have my image, I click on a point, and my region grows outwards until I've got all the points that satisfy that condition.

So let's do an example of this in MATLAB. MATLAB has a command, let me remember what it is; I wrote my own little wrapper around it, and the underlying function is called grayconnected. Let me switch over for a second. The function is called grayconnected, and all you do is provide an image, a row, a column, and a threshold. I've made my own version that just grows the region with an automatically estimated threshold when I click on a point. So let's look at an image; here we've got a hippo. You can do this in color or in grayscale. My original statement of how this works doesn't necessarily apply only to grayscale; I could use the RGB norm if I wanted to, but the MATLAB command only works on grayscale values, which is why I have to turn this hippo into grayscale for the moment. Of course you could modify this to work in color if you wanted to. My region-grow command basically lets me click on a point and grow all the pixels whose intensity is sufficiently similar to that point. As you would expect, I click on the middle of that ear and I get all the dark ear points, and if I keep doing this I can click on other places and get the gray of the nose, or I can click on the tooth and get just the tooth. And if I'm lucky I can click on the snout here and get the whole thing... here we go; you can see that even though it looked like this was one whole big connected chunk of stuff, it isn't. Maybe if I click a little better, maybe I'd hit on a color that... oh, it's even worse. Let's try clicking around here, which is a relatively bright area. If I wanted to, I could supply a threshold that says how close you have to be, but as you can see, the region growth is pretty uncontrolled: the region that I click on could still squirrel out to the edges of the image, so possibly you need a more sophisticated algorithm for this, and we're going to talk about some more sophisticated stuff in just a second.
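For reference, the little wrapper being described doesn't need to be more than a few lines. Here is a rough sketch (the image file name, the tolerance of 32, and the click-to-seed step via ginput are my own placeholder choices; grayconnected itself is an Image Processing Toolbox function):

I = rgb2gray(imread('hippo.jpg'));    % grayconnected works on grayscale images
imshow(I)
[c, r] = ginput(1);                   % click once to pick a seed point (x = column, y = row)
tol = 32;                             % how far intensities may stray from the seed value
BW = grayconnected(I, round(r), round(c), tol);
figure, imshow(BW)                    % the grown region as a binary mask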
Also, you can see the type of region that I get: if I had been a little more aggressive about my threshold, maybe this white blob would have been a little more connected, a little more solid. I could either fix that by being a little more generous about my threshold, or I could fix the blob with what are called morphological operations, where I essentially fill in all the gaps in the middle of the blob. We'll talk about that stuff a little bit later. So, a very simple idea; let me pause and ask if there are any questions about this. This is just the warm-up, the beginning-of-the-week kind of thing.

Okay, so let's talk about something else. For those of you who are a little more computer-graphics-y, you may have heard of things like quadtrees and hierarchical decompositions of images. The second very basic idea is what I would call region split and merge. The idea there is, again, that I want to find places in the image that meet some sort of condition. So I specify a condition or a rule, and I split up my image, first maybe into quadrants, and I set a quadrant equal to 1 if it satisfies the rule and 0 if it doesn't. Suppose these regions satisfy it and these regions don't; then I say, okay, I'm happy with the regions on the left, and now I'm going to sub-split the sub-regions on the right and ask which of them follow the rule. Maybe again I get some ones and some zeros, and I keep doing that; I keep splitting the image into sub-regions until every region follows the rule. Eventually you can imagine that what I get is a mosaic of bigger and smaller squares that do or don't satisfy the rule. If I think of it as a binary image, maybe the white pixels are the ones that satisfy it and the black pixels are the ones that don't, so you get this chunky, blocky version of a thresholded image. You can imagine storing and implementing this very efficiently using a quadtree-type algorithm, where you recursively keep splitting the image. In a way this is similar to the blockproc command in MATLAB, except that this is a hierarchical thing, whereas blockproc is more of a thing where I specify the smallest block size and then everything has the same block size. So again, I wouldn't necessarily call it region growing, but it's a similar "find everything in the image that satisfies this rule" kind of thing.
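The splitting half of this idea is essentially what MATLAB's qtdecomp function does, so a quick way to play with it is something like the following sketch (cameraman.tif ships with the Image Processing Toolbox; the 0.2 threshold, meaning split any block whose intensity range is more than 20 percent of full scale, is just a number I picked):

I = imread('cameraman.tif');      % 256-by-256 grayscale, a power-of-2 size, which qtdecomp wants
S = qtdecomp(I, 0.2);             % keep splitting blocks whose max-minus-min intensity range exceeds the rule
blocks = full(S);                 % nonzero entries mark each block's top-left corner and give its size
disp(unique(blocks(blocks > 0)))  % the block sizes the decomposition ended up using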
All right, so let's get to the more interesting stuff, which I alluded to at the end of the last lecture. We were doing that example with color thresholding, where I was trying to find the yellow of the bananas or the blue of the box, and I said something like: you could do a better job at this if you started to use some basic machine learning, where you tried to find where the colors were in color space and cluster them together. That's the next thing, which is what I would call clustering, and this is going to lead to what are called superpixels.

First I need this idea called k-means clustering. This is from statistics; I don't know whether or not you would have had it in a class, probably not a probability class, maybe a very first machine learning class or a pattern recognition class. It's a very easy idea. Suppose I've got some sets of points that are described, in this example, in 2-D space; here it's just x-y space. The premise is that, as a human, I can see that there are visual groupings of points in this two-dimensional space, and what I want is an algorithm that will say: all this stuff is one chunk, all this stuff is another chunk, all this stuff is a third chunk. Of course, in image processing we have more than just x and y, we have RGB, so I'd really have to illustrate this in a three-dimensional sense. But the basic idea of k-means clustering is that we want to gather n-dimensional data into natural clusters, and in this case the user chooses the number of clusters.

So how does the algorithm work? I'm going to talk about it in a very general way. First, you don't know where the clusters are going to be, so you make a guess about where the cluster centers might be in this space: I specify an initial set of cluster centers, which I'll call m1 through mK, which live in the same space as my data points. Then, very simple idea: for each of my data points, I assign it to whichever cluster center it is closest to, in the Euclidean sense; you're a member of cluster j if you are closer to cluster center mj than to any of the others. At the end of that step I have a preliminary assignment of which data point belongs to which cluster. Then I update my cluster centers: now that each cluster has new members, I update each mean mj to be the average of all the x's in cluster j; however many points there are in cluster j, I take all those x's and average them together, and I do that for every cluster center. Then I keep alternating those two steps until convergence, that is, until the mj stop changing; usually I put a threshold on how far the means need to move before I can stop. So it's a very simple idea: I throw some possibilities down, I collect the neighbors of each of those possibilities, and I continually update until I'm done.

As you can imagine, this algorithm depends on where I start, on how I initialize it. If I do a bad initialization, I may get bad clusters, so what happens in practice is that an algorithm may randomly throw down some cluster centers in the space and then run the whole thing 10 or 20 times to see which run results in the tightest clusters. Some sense of cluster tightness has to do with the standard deviation of the points within a cluster: if I've got a bunch of very tightly focused points, then my sigma is going to be low, and if the points in my cluster are really spread out, the sigma is going to be high. So you can imagine some automatic rule that says: given a bunch of candidate clusterings, this one is the best.
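Just to make those steps concrete, here is a from-scratch sketch of plain k-means on some made-up 2-D points (K, the synthetic data, and the stopping tolerance are all arbitrary choices of mine):

X = [randn(100,2); randn(100,2) + 4; randn(100,2) - 4];   % three loose blobs of 2-D points
K = 3;
m = X(randperm(size(X,1), K), :);           % step 1: pick K data points at random as initial centers
for iter = 1:100
    [~, labels] = min(pdist2(X, m), [], 2); % step 2: assign every point to its nearest center
    mOld = m;
    for j = 1:K
        members = X(labels == j, :);
        if ~isempty(members)
            m(j,:) = mean(members, 1);      % step 3: move each center to the mean of its members
        end
    end
    if max(abs(m(:) - mOld(:))) < 1e-6      % step 4: stop once the centers stop moving
        break
    end
end

In practice you would just call the built-in kmeans, for example idx = kmeans(X, 3, 'Replicates', 10), which is exactly the run-it-several-times-and-keep-the-tightest-result idea that comes up again in a minute.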
So, for example, how would this apply back to image processing? Let's suppose we revisit the image histogram. Last time we talked about Otsu's algorithm for a binary threshold, which in a way you could think of as a way to cluster the points in the image into either black or white based on some boundary line. But if I had a histogram that looked like this, you can imagine running a k-means algorithm with K = 3, maybe initially choosing the peaks of these values, and if I clustered, maybe I'd find that all the points in here were one cluster, all the points in here were the second cluster, and all the points in here were the third cluster.

So let's do an example in MATLAB. First let me clean things up; the hippo is aggressive, okay. There is a MATLAB command called kmeans, and it does exactly what you want. In fact, let me make this a little bigger: I just took the kmeans example from the documentation and made it into a simple m-file; I took the help text and made it into this. On the left is my raw data, which you can see as a human breaks into two pieces, and on the right is the automatically estimated k-means clustering. There are some diagnostics here that basically say how much better my clustering got as I iterated between assigning dots to clusters and updating the cluster centers; the sum is saying how much better things are getting, and things get pretty good pretty fast; it was very quick.

Then you can imagine applying this to images. If you search in MATLAB for, let's see, "image clustering", you get "segment using auto cluster", which I'll show in a second, but there was a nice worked-out example called "Color-Based Segmentation Using K-Means Clustering". They've got this whole worked example, which I made into my own m-file; you can do your own googling, but here's the MATLAB file. There's a setup: I've got this color image, which is coming from a microscope slide, and I can see it's definitely got some purplish stuff, some bluish stuff, and some whitish stuff. What I'm going to do is cluster in a certain space. We've talked about describing colors in RGB space; in this example we're going to use what's called L*a*b* space. If I look at the clusters over here, the x-axis is a* and the y-axis is b*; those are the color components of each pixel, and the one I'm leaving out is the luminance, which is basically just the grayscale intensity. You can see there is this dense set of points, and if I run the k-means clustering on it, I end up chopping that cloud of points into three regions. The three regions, if I go back and look at which is which in color, end up being pretty much: the white cluster is basically the white stuff in the image, the gray cluster is the pinkish stuff, and the black cluster is the bluish stuff. So this does a pretty good job of segmenting the image into clusters.
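The core of that worked example is only a few lines; a rough recreation looks something like this (the file name is a placeholder for the stained-tissue image, k = 3 matches the three clusters just discussed, and the replicate count is an arbitrary choice of mine):

RGB = imread('tissue.png');                % a stand-in for the microscope-slide image
lab = rgb2lab(RGB);                        % move to L*a*b*; we cluster only a* and b*
ab  = reshape(lab(:,:,2:3), [], 2);        % one row per pixel: [a*, b*]
idx = kmeans(ab, 3, 'Replicates', 3);      % three color clusters, best of a few random starts
labels = reshape(idx, size(RGB,1), size(RGB,2));
figure, imshow(labeloverlay(RGB, labels))  % paint each cluster on top of the original image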
Then what I could do is break out the pieces of the image, where I can really see that I've got all the pink stuff in one place, all the white stuff in another, and all the bluish stuff in another. And then, as a final little cherry on top, I could say that the blue stuff is divided into dark blue stuff and light blue stuff, and if I wanted to further discriminate that, I could bring back the channel I didn't use, the luminance channel, and try to split it into a dark part and a light part. If I did that, I could threshold just this image on stuff that is darker than a certain value, using something like Otsu's threshold from last time. So you can see that with some pretty simple ideas I can take a fairly complicated image and boil it down into something fairly simple; you can imagine that maybe I could count the nuclei, count the number of cells in this image, just from this series of steps. So the clustering idea is pretty powerful and pretty simple. Let me pause and ask: any questions about this idea? You can run this demo yourself; it's just something I found in the MATLAB help text, nothing crazy fancy that I did myself.

One thing that's still a little unsatisfying is that when I do this, I still get these disconnected chunks of image. If I go back and look at, well, I guess this is an okay example: if I look at just the pink stuff, I can see I've got pink stuff all over the place, lots of squirrelly, weird-shaped things. What I'd like to do is lend a little more spatial coherence to the blobs that I get, and that leads to the idea of what are called superpixels. This is a cool idea that has a lot of image processing applications. Oh man, I've got a zillion windows here; first of all, let me close all these windows to make my life a little easier and minimize this guy, and second of all, let's go back to here.

Before I leave k-means, let me write down a couple of things I said. One was that, in general, for k-means I could initialize with K random locations, converge, and then do this n times and take the best result. If you look in the kmeans help, for example, it has an extra option that says not only can I specify K, but I can also specify, sorry, waiting for it, I guess it's using the parallel stuff, yeah, here we go, a number of replicates. That's what this 'Replicates', 10 is: for each of those trials, it reports how many iterations it took and how well it did. And of course, if you had a GPU you could farm this out to a bunch of different cores if you really wanted to go all the way.

Okay, so let's make this a little more specific to image processing; what we've done so far is just general machine learning, what's called unsupervised clustering. A nice modification of k-means that's often used in image processing is called superpixels, and these are defined as regions of the image that are contiguous, meaning they're all one chunk, and that have similar intensity or color. To make this a little more concrete, let me just show you an example of what I mean. Here's my tropical drink, and again I wrote a little wrapper of my own around a MATLAB function
called superpixels, which we'll talk about in a second. Let's just suppose I look at what the superpixels are; say I want to divide this into 500 superpixels. The effect is something like this: it's like saying, find me 500 regions that nicely divide up this image, and if you look, every region has more or less the same color, and the regions are tightly divided along the edges of the image. So you can imagine this could be pretty useful for segmentation: if I just want to segment the cup, for example, I could pick out only those superpixels that I cared about, and instead of having to click on ten thousand individual pixels, I click on thirty chunky ones. And of course, if you color each of these regions by its average value, you get this nice impressionistic, artistic-looking rendering of the cup.

So let's talk about how this is done. First of all, why would we want to do this? In a way, it's a compact representation of an image: I could take an image and, with thousands of superpixels (which seems like a lot), represent millions of pixels. Also, in a loose way, this keeps things together better for subsequent segmentation, and that has a direct effect on the computational efficiency of segmentation algorithms; the last thing I'll talk about today, the graph cut algorithm, benefits greatly from using superpixels instead of regular pixels.

So how do I collect the pixels into these nice chunks? As you can imagine, it's a modification of the k-means algorithm, except that in addition to color being a dimension, I'm also going to add the spatial domain into my n-dimensional data. This is a special kind of superpixel called SLIC superpixels; there are a bunch of different algorithms, and this one stands for Simple Linear Iterative Clustering. The idea is to do clustering in a five-dimensional space: for each pixel I record the red, green, and blue, and also the x and y. So I have some stuff for color and some stuff for location, and again, just like in the other example, maybe I don't do it in RGB; I could use L*a*b* colors, for example.

So here are the steps. Just like in k-means, first I have to decide where my initial centers are going to be. The first thing I do is initialize the superpixel centers by sampling n locations on a regular grid in the image plane. All that means is that if I want 500 superpixels, initially I just space out those 500 points in a grid across the whole image. As a side note, these superpixel centers should sit in the middle of fundamentally flat regions, so one little adjustment I might make is to move each of them slightly, within a 3-by-3 neighborhood, to lie on the lowest-gradient position. That's like saying I want to make sure my seeds are in the flattest locations possible, because that's where I want to grow out from; we don't want to start on an edge. Now, just like in k-means, for every pixel in the image I'm going to compute a distance to each of these centers.
But now the distance is not just spatial distance and not just color distance; it's a combination of the two, so I have to be a little specific about what I mean by distance. Let me be a little cagey about that for just a second while I finish the algorithm. Basically, for each cluster center mi, I compute a distance, which I'll define in a minute, between mi and each pixel in a neighborhood of mi. This is where it differs a little from normal k-means clustering. What I mean is: if I go back to my picture of where I set down these cluster centers, there's this spacing S between where I laid them down, and that spacing sets the tone for how big these superpixels are going to be allowed to get. I'm not going to be able to cluster anything outside of that range; for a given center, I only look in a neighborhood that is about S away. This prevents the superpixels from getting too large. Then I assign a pixel to cluster i if its distance is better than it was before; basically, the pixel already has a label, it looks around to see what its new neighbors might be, and it takes the cluster with the best distance. I keep looping over all the clusters, then I update the cluster centers like I usually would in k-means, and I repeat until convergence. An optional final step: if I want that nice stained-glass effect, I can replace the colors of the pixels in each cluster with the cluster's average color.

So it's a pretty simple idea; the only twist compared to normal k-means is that I have this spatial requirement, and I also have to specify what I mean by the distance. I've got this five-dimensional vector, so what is my distance? Let me be a little more specific. The distance function could be made out of two pieces: a color distance, which is the usual distance between the RGB at pixel i and the RGB at cluster j (this could just be the normal 2-norm of the color difference), and a spatial distance, ds, which is how far away I am. The issue is that these two distances are not on the same scale: x-y is measured in pixels, so my spatial distances could range from zero up to the diagonal of the image, which could be thousands of pixels, whereas in color space, if I'm in RGB I'm on a scale of 0 to 255, and in grayscale or some other space it could be 0 to 1. I can't just directly add these two things up, because they might be on wildly different scales. What I need to do is make a distance where I divide the color part by something and divide the spatial part by something: the C could be something like the maximum color distance possible, and the S could be the maximum spatial distance, which, as a user, I already set in the initial phase of the algorithm, since I already specified that I don't want the superpixels to get bigger than about S. So that S can be the same as this S.
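Written out, one common way to combine the two normalized pieces being described is roughly (this is a sketch of the idea; the original SLIC paper folds the same two normalizers into a slightly different but equivalent-in-spirit expression):

D(i, k) = sqrt( ( d_color(i, k) / C )^2 + ( d_spatial(i, k) / S )^2 )

where d_color is the Euclidean distance between the pixel's color and the cluster center's color (in RGB or L*a*b*), d_spatial is the Euclidean distance in x-y, S is the grid spacing between the initial centers, and C is the color normalizer, the compactness knob discussed next.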
The C may be a little harder to guess, since it's not obvious how far apart things should be in color space, so in practice maybe I don't set some hard-and-fast number for it; maybe I use C to tune the trade-off between the color distance and the spatial distance. For example, if C is really big, then the color part of the equation gets really small and doesn't really matter, so what matters for membership in a superpixel is mostly how spatially close I am. So C big means my superpixels are going to be more compact, because they will favor keeping the spatial distance small. And C small means my superpixels will stick more tightly to, for example, image boundaries: that's like saying the color is really important, the color term dominates, and you have to be really similar in color to me to be part of my cluster, but I don't care so much about how spatially far away you are. So maybe I'd get superpixels that are a little long and skinny or a little twisty, still subject to the fact that they can't get very big. Let me pause and ask, before I go back to MATLAB, whether there are any comments or questions about this idea.

All right, so let's see it in action on a couple of examples. I guess we already looked at my drink example, so let's see it again. Here I asked for 500 superpixels and got this. If I ask for a smaller number of superpixels... actually, I'll start with a larger number: if I ask for five thousand, I get a very fine division, and from where you're sitting this maybe looks a little smudgy, but it's actually not that bad. Let's think about this: how big is this image originally? It's 720 by 960; multiply those together and that's about 690,000 pixel locations, each of which has a color. If instead I boil that down to only five thousand locations, I get something that looks pretty close to the original image. And I could certainly ask, what if I use about ten percent of my pixels? This image looks pretty close to the original; maybe it's a little chunkier, but at first glance you might not think the right-hand image had anything wrong with it, and I'm using only fifty thousand numbers to describe it. Of course, if I swing the other way and ask for only 50 superpixels, I get this blobby thing; you can see what the superpixels are trying to do, and even here they're doing a pretty good job of following the contour of, for example, this pink part of the cup, which looks pretty well segmented, but they're just a little too big to really do a great job. So it's a question of tuning to see what you like; I thought 500 was a pretty good choice here, and you can see that the superpixels really follow along the edges of the image.

Now, this is all based on a MATLAB function called, oddly enough, superpixels. If I do doc superpixels (is it "superpixel" or "superpixels"? superpixels), you specify the image and the number of superpixels you want, and then there are some other parameters you can use. One of those parameters is this compactness number, which is basically the trade-off value C I was talking about before: if I make the compactness larger, the superpixels are going to be more compact. So this is basically exactly the C parameter.
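In code form, the whole demo is on the order of a few lines; a sketch along the lines of the help-file example being referred to (the image file, the 500, and the compactness value are all placeholders you would tune):

A = imread('drink.jpg');                          % stands in for any RGB image
[L, N] = superpixels(A, 500, 'Compactness', 10);  % label matrix L, and N = number of regions actually produced
BW = boundarymask(L);                             % superpixel boundaries, for drawing them on the image
figure, imshow(imoverlay(A, BW, 'cyan'))          % the grid of superpixels snapped to image edges

% the "stained glass" rendering: paint each superpixel with its mean color
out = zeros(size(A), 'like', A);
idx = label2idx(L);                               % linear pixel indices belonging to each label
numRows = size(A,1); numCols = size(A,2);
for k = 1:N
    redIdx   = idx{k};
    greenIdx = idx{k} + numRows*numCols;
    blueIdx  = idx{k} + 2*numRows*numCols;
    out(redIdx)   = mean(A(redIdx));
    out(greenIdx) = mean(A(greenIdx));
    out(blueIdx)  = mean(A(blueIdx));
end
figure, imshow(out)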
So let me just see: I think that in my superpixels wrapper I included this compactness parameter... did I, though? I guess I didn't, and I didn't do anything fancier; this is just literally copying and pasting from the MATLAB help file, so it's not like I did anything fancy to write this program. Let's try a couple of other images. Here is some tropical fruit; I'm on a tropical kick. Again, if I look and see where the superpixels are, I think it has tried to conform more to the image edges than it is concerned about how square the regions are, and I think it looks pretty good. And then something like... sorry, actually I think this is just an image rendering issue, so let's zoom in here a little bit. Here you can see, for example, that in the absence of any real color change, like in the middle of this door, you can see how big a superpixel is allowed to grow: there's a big red square in the middle of this door, and that's basically the superpixel hitting its upper bound on how far it can stray from its center. If I make the number of superpixels smaller... whoops, no, not that way, I want the number of superpixels to be smaller, so something like this, yeah, here we go. Inside here you can see that there are some straight lines in the middle of these regions that correspond to growing out from the center and going as far as the superpixel can, although I was expecting a few more squares than I see here. But you get the idea, and here again you can see that the nice thing about superpixels is the way they really snap to the edges of the image; that's a nice feature.

Okay, so you could imagine that I could then devise an algorithm to group these superpixels into regions; you can imagine a higher-level clustering on top of these things, where in some sense I've removed all of the image noise. For example, if I look at this image, you might imagine that inside this rocky wall there's a lot of texture that could confuse an algorithm trying to follow edges or go tightly around some boundary. If I do superpixels, the insides of those regions are nice and smooth, so I don't get unnecessarily caught up on little noisy edges that I don't care about. The important stuff is there, but I've abstracted out the unimportant stuff. For example, you can see that these knobs or bolts or whatever on the doorway have gotten collected into the same superpixels, and that little low level of detail has been removed, which helps me concentrate on the bigger-picture stuff and not worry about the small stuff.

Okay, question? Right, so it's not perfect. If we look in here, you can see that some of these superpixels are straddling the door a little bit, and that means that when I look at the shapes over here, there are some burps, because for whatever reason those guys did not converge very well. So where is that... I guess that's probably a little bit north of where we are; I lost track of where I am in the image, it's on the other side of the door. So let's try looking over here and see what happened.
Yeah, so I think you can see that what we're looking at is this: these three superpixels, for whatever reason, have slid over the door boundary, and I could try to fix that by, for example, cranking up that compactness parameter to say, look, you're not allowed to get too squirrelly, you have to conform to image edges. I'm not sure about my implementation; I think the default value is something like 10, so what I could do is, I think it's called Compactness, crank this value up a little bit and see if we can make this any better. Thinking, thinking... okay, this is interesting. What happened here looks a little better, and the superpixels actually look a lot different: now they almost look like they're inclined to be squares, and I only have a little bit of a burp over the edge. Let's go the other way just to see what happens: what if I make my number a lot smaller than the default? I'm not sure how much of a difference we're going to see here; there definitely is some qualitative difference. So again, there's no magic bullet to make this work, but it generally works pretty well, and you can play with it at home to see how it behaves. I guess I haven't written the next homework yet, but I'll probably put in some sort of superpixel problem for you to play with on your own images. Okay, other comments or questions?

All right, so the last concept I want to talk about today (let me close all this junk and clear all my stuff to make my world nice) is an idea that has really taken computer vision by storm, and maybe it seems like old news to you guys, since you were all born in the 90s, but in the mid-to-late 90s there was all this excitement about this thing called graph cuts. So I want to talk a little bit about what that is: graph cut segmentation. Here we're only going to talk about splitting the image into two pieces, basically foreground and background; the idea is to separate foreground from background. The philosophy behind graph cuts is that thresholding and simple algorithms like that are going to fail when both the foreground and the background are pretty complicated in terms of texture or color. Thresholding works great when you've got a big blue ball that you want to segment, but it won't work so well if you've got, say, a zebra to segment, where the animal is black and white and the background is green and grassy and there are trees. For complex images, segmentation can be challenging, and basic thresholding and the like are not going to be up to the task.

So the idea here is kind of clever, and I'm going to draw it first like this, and then I'll show you a better rendering that I made previously. The idea is that I've got my image plane here, and each of these dots is a pixel (this drawing is not very even, but okay), and I have what I'm going to call a foreground terminal and a background terminal. Initially, every one of these pixels is connected to both the foreground terminal and the background terminal, which is why it's tough to draw on a piece of paper, so let me show you a nicer picture.
The setup is the one on the left: in this case it's just a 3-by-3 pixel grid, very simple, and each pixel is initially connected to the foreground and the background. What I want to do is cut some of these edges, so that after I slice some of them, I've disconnected this graph into one chunk that belongs to the foreground and one chunk that belongs to the background. That implies a segmentation: all the stuff that stays connected to B is background, and all the stuff that's connected to F is foreground. So I want to take my scissors and chop through these gray edges and these black edges until there's no way to get from F to B along any path. That's the name of the game. Anything that separates the foreground from the background like this is called a cut on the graph, and what I want to do, in some sense, is find the best cut, what we call the minimum cut.

So let me go back to here. A cut is a set of edges that, when removed, separates the F and B terminals, and that cut implies a segmentation. What I want is the best cut, and what do I mean by that? Well, we assign a weight to each edge. If you go back to the picture, there are two kinds of edges: edges between the pixels in the image (these black lines) and edges between each pixel and the foreground and background terminals (these gray lines). We assign a weight to each of these edges, both pixel-to-pixel and pixel-to-terminal, and we want to find the minimum cut: if I call C a cut, a set of edges, I want to minimize the sum, over all the edges in C, of the weights that I've sliced away.

So let's think about the intuition for how I should set these weights. The intuition is that stuff that is roughly the same color should be kept together, and where two pixels are very different, like across an edge, I should be okay with cutting the line that connects them, because I want those to end up on opposite sides of the segmentation. Let's formalize that. Between adjacent pixels I could use something like: the weight between pixels i and j, maybe normalized by the distance between i and j (although if I'm only looking at adjacent pixels, that distance is basically always one, so in some sense I don't have to worry about that term), where the most important part is e to the minus the squared difference between the two pixels' intensities, scaled by some variance. So if I(j) is similar to I(i), the exponent is close to zero and I get a weight close to 1; if I(i) is not like I(j), I get something like e to the minus a big number, which is essentially zero. The point is that for neighbors with different colors, the graph cut algorithm is inclined to cut the edge between them, because there's very little weight, very little cost, to doing so. So: low cost to cut dissimilar edges.
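Written out, the neighbor weight being described is something like the following (a sketch; sigma is a user-chosen scale that plays the role of the "variance" mentioned above, and dist(i, j) is just the pixel spacing, which is 1 for adjacent neighbors):

w(i, j) = exp( -(I(i) - I(j))^2 / (2*sigma^2) ) / dist(i, j)

so similar neighbors get a weight near 1 (expensive to cut) and very different neighbors get a weight near 0 (cheap to cut).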
Also, at the end of the day, each pixel can only stay connected to either the foreground or the background, so I also need a weight that says how similar I am to the foreground node and how similar I am to the background node. That's where a slightly more advanced idea comes in: we let the user scribble on the image to denote some initial foreground and background pixels, and those scribbles form probability distributions, the probability that I'm in the foreground given a color and the probability that I'm in the background given a color. Scribbling does two things. First, when I scribble on the image, I'm nailing those points down: the scribbled points have to belong to either the foreground or the background, so the scribbled pixels are forced to stick with one terminal. How do I do that? For a scribbled foreground pixel, I just manually set the weight between that pixel and the foreground node to infinity, or some massive number, so expensive that you never cut it, while the weight between it and the background terminal is zero, so there's no cost to cutting that edge; and vice versa for a pixel I've scribbled as background. Then for all the other pixels, I could use something like: the weight between a pixel and the foreground is some scalar times the negative log of its background probability, and vice versa. This maybe seems a little backwards, so let's think about it for a second. Suppose the probability that I am background is very low; then the log of that very small number is close to minus infinity, which means... I got myself turned around, so let me say this one more time. If the background probability is low, then its log is a big negative number, which means that minus the log is a big positive number, so the weight on my edge to the foreground terminal is large and expensive to cut. That makes sense: I should be comfortable leaving that pixel attached as a likely foreground pixel. So the upshot is: I scribble on the image, I seed some foreground and background pixels, and all the other pixels follow along; if a pixel is similar to something that was already scribbled as foreground, there's a high value on the edge between it and the foreground node.

Let me make this more concrete. MATLAB has this graph cut algorithm built in, and it lives in kind of a different place: you go to Apps, then Image Processing and Computer Vision, then Image Segmenter. That brings up a little GUI that has all sorts of stuff in it. Let's load an image. Okay, here is an image, and if I go to Graph Cut, it walks me through it: step 1, mark a foreground object, so I can use my mouse and scribble to say, very casually, this is the stuff I think is foreground. Then I say mark background and, very casually, this is the stuff I think is background. Now it goes off and automatically computes a segmentation: stuff that is similar to the foreground intensities gets clustered into foreground, and stuff similar to the background gets clustered into background. I can see that it's not perfect, but it's already pretty good, and if I do some more scribbling, like saying no, this is background, and this is background, and this is background, and maybe, oh, this is foreground over here, the nice thing is that the segmentation can be iteratively recomputed.
And if I use that to make a segmentation, I can see that I can fairly quickly sketch out this foreground object, even though it has a fair amount of edges and texture and stuff like that: the background is complicated, the foreground is complicated, but I can still do this thing where sketching on the koala basically seeds the algorithm to say, I think the foreground stuff I want is grayish and whitish, and I think the background stuff I want is brownish and greenish, and so on. So it's a pretty slick and simple algorithm.

Let's try it on something else and see what happens: an airplane. Again, graph cut: all I do is sketch along the stuff I think is foreground (again, this is a very complicated image in terms of color) and then I say, okay, I think this stuff is background. Let's see what you get. It's thinking, I assume... I guess for some reason... oh, you're right, let's go back; I guess I was too opaque, yeah, actually, okay. I'm not sure I can go back to my sketching, so let's try that again and repeat the example. When I sketch, I want to make sure I get a little bit of all the stuff I think is going to be relevant, and I did pretty well, and I can sketch a little more on here. Oh... it's a little sloppy, to be honest. No offense to MATLAB, but I think their implementation is not... there are smarter implementations of this, and I'll show you one more before I'm done. But let me go back to the idea we just talked about, superpixels: here, if I do Show Sub-regions, you can see that this graph cut algorithm is, under the hood, implemented by collecting superpixels together, not by collecting pixels together. That may account for some of the weird stuff; in fact, you can see that it was a fool's errand to try to fix things over here, because the superpixels actually overlap the nose of the plane and the background, so I could never have gotten the plane on one side and the ground on the other, because these are bad superpixels. Maybe if there were a way for me to tune the superpixels, I could do a little better, and that would give a better segmentation.

I do have another graph cut implementation that I got from elsewhere; there's a modification of the graph cut algorithm called GrabCut, which is kind of neat. Without going into too many details, the fundamental innovation is that all I have to do is draw a bounding box around the foreground, and then, I believe, it does a lot of the work. Is it thinking, is it stalled, not responding... here we go. That's doing pretty well right out of the gate, and if I go back to my image, I guess I have this overlay; I can see this was actually much better, and probably this image isn't being segmented using superpixels, it's probably segmented using actual pixels, which accounts for why the detail is much finer. Just like in the other MATLAB interface, I can click and say, no, actually, this stuff here you missed, it was foreground, and it will automatically update (I guess my computer is a little slow because of all the other junk I have running). So here I think this is a pretty decent segmentation that came just from drawing a bounding box and editing it a little bit. What's happening is that, more or less, the stuff inside the bounding box is used to seed the foreground density and the stuff outside the bounding box is used to seed the background density.
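As an aside, if you have a recent enough Image Processing Toolbox, there is a built-in grabcut function that works the same way: you hand it a superpixel label matrix plus a rough region of interest instead of scribbles. A sketch (I'm assuming a newer MATLAB release than the one used in this demo, and the file name and box coordinates are placeholders):

A = imread('airplane.jpg');
L = superpixels(A, 500);                 % grabcut wants a label matrix to work over
roi = false(size(A,1), size(A,2));
roi(50:300, 80:500) = true;              % a rough bounding box containing the foreground object
BW = grabcut(A, L, roi);                 % everything outside the box seeds the background model
figure, imshow(labeloverlay(A, BW))      % the recovered foreground mask over the image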
And the reason that graph cuts became so popular in image processing and computer vision is that it used to be thought that solving this minimum cut problem was super time-consuming computationally. Then there were a couple of groundbreaking papers (Boykov is the guy who was really involved in this, along with his colleagues) showing that for the kinds of segmentation problems you find in image processing, where instead of some arbitrary graph with random vertices and edges you have a very regimented pixel-level graph in which each pixel is connected only to its four neighbors, there are very efficient ways of solving graph cut problems on those kinds of grid-based graphs. That's what turned this into a computationally feasible algorithm, which is why you can see I'm able to update this on the fly. That being said, I think you'll still find that graph cut algorithms may choke a little if you load up a massive, say, 15-megapixel image; I think you'll start to see MATLAB struggling even if it's using superpixels under the hood, because you're still having to create a fundamentally large graph over all of the edges between pixels in the image. Also, just as a side note, if you took an algorithms course, you probably learned about max-flow problems, min-cut problems, the Ford-Fulkerson algorithm; those are the things that are under the hood for solving these kinds of problems. In general, these computational problems are inherited from the fact that the dual of the so-called minimum cut problem is the maximum flow problem: basically, how much stuff can I push from point A to point B along these links that have finite capacity? That problem turns out to be related to the minimum cut problem, and if you took algorithms, you learned about ways to solve it.

Okay, so I think that's a good place to stop for a moment; let me pause and ask if there are any comments or questions. Yes? So I believe the superpixels are probably using four-neighborhood or eight-neighborhood connectivity, but, and it would be nice, there's clearly no way in this little GUI to tune anything about the superpixel size; I don't see any little knobs I can turn to change things. You could do this yourself very easily, though, by going to the superpixels command in MATLAB, modifying how many you wanted, and then calling the graph cut algorithm on that. I think that under the hood (now, don't quote me on this) let's see what the function is in MATLAB. If I just look for "graph cut": "segment image with graph cut", yes, I believe under the hood it's probably using this maxflow command, but let's see... graph, graph cut... here it's wanting me to use the Segmenter, but let's see if there's an actual file that goes along with it. Hmm, I feel like there has to be something I can use without going through the GUI... so again, all this stuff is nicely built into MATLAB alongside the Image Segmenter functions: graythresh, multithresh, you've learned about some of this already... it could be lazysnapping... yes, that's what it's called. So basically this is what's under the hood of the algorithm, and it's using the superpixels to do this.
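For the record, the programmatic route being hinted at here looks roughly like this (a sketch; the image, the scribble coordinates, and the superpixel count are all placeholders, and lazysnapping takes the label matrix plus foreground/background marks in place of the GUI scribbles):

A = imread('koala.jpg');
L = superpixels(A, 500);                 % the label matrix the algorithm groups together
fore = false(size(A,1), size(A,2));
back = false(size(A,1), size(A,2));
fore(200:220, 300:400) = true;           % a crude "scribble" over the foreground object
back(1:20, :) = true;                    % and one over an obvious patch of background
BW = lazysnapping(A, L, fore, back);     % graph cut over the superpixels
figure, imshow(labeloverlay(A, BW))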
So here, I mean, it's a little bit chunkier because you now have to specify the foreground and background yourself: instead of giving you the GUI, you have to specify some points and so on. But if you want to do this yourself, lazysnapping is the function you would use to do it. And of course, all of this stuff is also built into OpenCV and the like by now, so Python too. Other questions or comments? Great. So next time we're going to talk about a more advanced method where we really model the shape of the curve that is going to go around the foreground object, and we'll talk about how to evolve that curve step by step so that it tightly wraps around it. That's going to be called snakes and active contours. So I'll see you next time, and now I will turn off this guy.
Info
Channel: Rich Radke
Views: 33,207
Keywords: rich radke, radke, rpi, digital image processing, image processing, region growing, k-means, superpixels, graph cuts, segmentation, image segmentation
Id: ZF-3aORwEc0
Length: 76min 24sec (4584 seconds)
Published: Mon Mar 05 2018