Image Analysis in Python with SciPy and scikit-image | SciPy 2018 Tutorial | Stefan van der Walt

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

good afternoon ladies and gentlemen and welcome to the psychedelic to toriel it's exciting to see so many of you in the room today and yeah I'm I'd like to introduce you to my co-presenters today so my name is Stefan I'm the founder of psychic and the researcher at the UC Berkeley data Science Institute with one Nunez Iglesias who's a core contributor to Syfy and currently a researcher at Monash yes and Josh Warner also core contributor to psycho psycho damage previously at the Mayo Clinic and currently at the University of Arizona all right so I'd like to make sure that everyone in the room has the latest version of the material we made some last-minute tweaks so if you clone the repository yesterday just do a git pull if you haven't downloaded the material just go to github the URL is here on the screen so k-dubb comm for drug psycho - image for drug ESCA image tutorials you can click on the download button it will give you a zip file or if you have git installed just to a get flown and a get pull so just a rough check who already has the material on their machine Oh awesome okay excellent so usually we go into a little bit of detail about what psychic is and what it's used for but I presume most of you know that since you're in this tutorial so all I'll say is that you know images are all around us it's a very common modality of imaging there are very rich ways of acquiring images nowadays the instrumentation is improving drastically every day the types of images we get out of the quality of those images the multitude of data that we can gather increases and while there are certainly a growth in neural networks and these kinds of algorithms that can automatically learn a lot about your data it still helps to be able to apply the human heuristics on your data and extract meaningful insights from your images so cycle image is a Python library written for for getting access to these image processing primitives it's aimed at both industry and education and research we focus on code that is well documented well tested accurate and easily accessible both to users and developers it's a fully open source project so if at any point you feel that that is not the case we encourage you to contribute for example you can attend the sprints this Saturday or just mention it to one of the core contributors and we'd be happy to help you work through any of your issues or problems I'd also like to encourage you to visit our website psyche - image org we've got a full set of documentation there as well as a gallery of examples that illustrates the most common usage patterns of the package right so what we presume today is that you've already had some experience in the scientific Python ecosystem specifically we presume that you've at least seen some numpy before that you might have touched on syfy and that you have made a plot with Matt Butler before if that's not the case you'll you'll have example code in the notebook but we're at least hoping that we can sort of start from that level of experience all right so I want to be I want to make sure everyone launches their notebooks correctly because otherwise it might be difficult finding the images or the paths and the image to the images so once you've cloned the repository or unzip the file you downloaded from github change into that directory and launch Jupiter straight from that directory so don't go into any subdirectories just launch it from the root of that directory the directory should be called Eska image tutorials when you fire it up your Jupiter notebook should land you here so you should see you know a subfolder with lectures and workshops so just navigate into the workshops folder navigate to 2018 sy PI and click on index I py and B and that should bring you to this page okay so that's that's our launching point for today you scroll down that document you'll see the schedule so schedule in certain part of the word well so we're going to I'm going to do a brief introduction now into how images are represented as numpy arrays that form sort of the foundation of the lectures that will follow one will then continue showing you how to apply filters to images Josh will continue with image segmentation for extracting objects out of an image and then we will proceed to apply our knowledge to some stack overflow challenges so these are problems that were posed on stack overflow real-world questions that people had that they wanted solve so you can either work on those or bring your own problem and we can try and help you work on your own data set or photos yep right so we'll each hour will be 50 minutes lecture ten minutes break just to give you a chance to stretch your legs snacks are available it's not 2:15 I think I think it's to 25 25 plus two until for the snacks are available yep alright so at the end of that first page you can check if you've got all the prerequisite software installed so if you run the the setup script it should check for you that you've got the latest release of psychic image that you've got numpy sci-fi and MATLAB install those are the ones we'll be using today thank you alright everyone good so far if you have any questions just raise your hand and Josh or one could help you out all right let's get started so images in psychic Emmet are represented as numpy arrays we don't have a specialized image structure we use the nd array which is the common exchange format between the different scientific Python packages because we're using this generic container it means that psychic image is interoperable with most of the other Python packages in the Python scientific ecosystem including sight scikit-learn sci-fi in the image etc so I'm going to show you here how to construct your first image so we're going to import numpy as MP that's standard Convention and to display it we're going to import the PI plot package from matplotlib and here then is the crux of what I'm doing I'm asking numpy to generate a random array of dimension 500 by 500 so 500 by 500 matrix of numbers all between 0 and 1 and I'm going to ask mat blood lip to display it and that's what I get so this kind of follows our intuition that you know we have an array of numbers 0 maps to black 1 maps to white so this was a an image that I just constructed on the fly but the same holds for real-world images so if you import from Seiken image the data module the data module contains a few example images that we shipped with a package we can load up the coins image and I'm going to both display the coins image and a few attributes about that image specifically the images type in other words what kind of Python object are we talking about its data type and its shape so the type is a numpy array like I said we always use numpy arrays to represent these images the data pipe type is U and 8 in other words it's positive integers running between 0 and 255 and the dimensions of the image in this case is 303 rows by 384 columns so that's a gray level image so if a gray level image is a single matrix what do you think the shape of a color image would be well we're going to need different layers to represent the different colors so we're going to need a red and a green and a blue layer there are other ways to represent images like you could think in terms of you saturation and value and there are various other such color spaces but by default we represent them as red green and blue so let's let's grab Chelsea the cat will again display the shape of the array the minimum maximum values so there you see the image displayed in your seed shape has now changed it's no longer the two-dimensional array but it's 300 rows 451 columns and three layers red green and blue so because these are just some pie arrays you can manipulate them in the way that you manipulate any other numpy array you can slice them you can assign to them you can do reduction operators on them you can do whatever you would do on your numpy arrays so in this case I'd like to set part of that image to bright red so what value is bright red well it should be a high value in the red band and low values in the green and the blue bands so my image values ran between 0 and 255 in this case so I'm going to set a small block of that image to 255 0 and 0 red green and blue so you see when I do that I took this slice between rows 10 100 and 10 columns 10 and hundred and 10 that gives me that area and across all the color bands that's what that colon means when I assigned it to 5500 so I get that that red block in the corner if you want to represent transparency in an image you can add a fourth layer called an alpha layer but we will not be dealing with that today there are few other shapes that you might run into in practice so we've now seen two-dimensional grayscale images those have rows and columns we've seen two-dimensional multi-channel images red green and blue so those have rows columns and a certain number of channels we could run into three-dimensional grayscale images so those would have rows columns and a certain number of planes you can think of a microscope scanning through a certain volume you'll have several planes each with its own image or you can have 3d multi-channel data so in this case for each slice that you scan with a microscope it will be a color slice so in that case you'll have planes each consisting of rows and columns with a certain number of channels so for those of you who haven't used matte lip for a while this cell is sort of meant to be a mini primer it's basically contains everything that we're going to use today for displaying images so again I'm using that data sub module of cycle image I'm loading two images Chelsie and an image of a rocket signing them to image zero and one then we've important math mud lips pipe lot sub module as PLT we're going to use the subplots command to generate two axes for us to draw on we can we can generate an arbitrary grid of axes to join but in this case I just want a side-by-side comparison of two images so we're gonna do subplots its first argument will be the number of rows in this case one row of axes and I want two columns so that should give me these two axes side by side and I'm going to ask for a figure size of 20 by 10 nice big figure that I can explain the in the notebook subplots will return to you a figure as well as the two axes that you asked for and then you can use MATLAB to plot the top of them so X 0 dot M show image 0 will display the first image x0 set title we'll set its title etc so we're going to display the image set its title turn of the axes for the second image we're going to display the image we're going to set its title to rocket we're going to add an X label to that image and then we're going to add some vertical lines and plot sort of an arbitrary line somewhere else in the image and finally add a legend so this cell you can refer to later on back in the tutorial if you need to plot things yourself it sort of meant as a small summary of all the functionality you'll need today so there is the plot you can see two axes side by side you can see the two vertical lines going next to that rocket and the rockets tab you can see the legend in the top corner as well as the text captions for titles as well as for x-axis one of the x-axis right so let's get get back to images we've already now seen that you get at least two kinds of ranges for images so the first image that I constructed the values ran from 0 to 1 and in the second image that I loaded up from disk the values ran between 0 and 255 so what's going on there which is the real representation that we use in cycle image well inside in which we support various data types so you could use you could use either and what I'm going to show you here is that you can essentially have the same image and in a floating-point representation or in an integer representation so I'm going to do a limb space here in other ways I'm going to in other words I'm going to take numbers between 0 & 1 2500 of them and I'm going to reshape all those numbers running from 0 to 1 into a 50 by 50 image so what do you think I should see just visually what do you expect to come out there exactly so a gradient that starts order from dark and goes towards light I'm gonna do the same thing with the integers here I have to be a little bit more careful I'm going to run between 0 and 255 I'm going to again take 2500 values but those values will be fractions or floating-point numbers so I'm going to ask numpy to turn them back into integers for me right so I'm going to display both the both of them side by side and there you can see the floating point versus the integer representation essentially yields the same image of course the one is slightly more coarsely quantized but they're both valid representations for your image internally cycle image usually works with floating-point images so it's often beneficial to start off from flooding point images that's an option for converting between the different formats we have the image as whatever functions so you'll see images float or images view bite I specifically want to point two images float because that's a function you will see crop up all over the place your typical workflow would be read the image from disk you call images float on it to make sure that it's got values running between 0 and 1 and then continue with your pipeline so the question is whether images float normalizes so images float if it gets an image with values between 0 and 255 it looks at the data type to decide what the range is and then it will convert the range down so if you put in the Chelsea image from before so what was the minimum and maximum value of Chelsea so that was 0 and 231 right so if we now ask it to do images float on that cat image to give us a new floating point image out the question is what will the range of that image be so here we go so it pulled it back down and in fact you can see you can see how that down conversion worked because it's its maximum value was 231 it was in a number range it runs between 0 and 255 so that's where the point line comes from yeah right so as I said you know we've got these example images but I presume you just don't want to you don't always want to do your research on the example images we ship from cycle to actually pull in some additional images from disk so for that you need to image need image i/o we support various input plugins basically whatever is installed in the system we'll try and use we recommend by default now that you'd use the image io package but you could also use matplotlib pillow etc you don't have to worry about that complexity for now we've got an i/o sub module and if you do IO dot M read it will find whatever is installed on your machine and use that to load the image so from SK image before import IO to i/o dot M read of the balloon image and then we're going to print these attributes of the image the type of object data type shape etc again so there we have the balloon image numpy array you into eight runs between 0 and 255 and as expected it's an RGB image with three layers sometimes it's convenient to be able to access multiple images from disk so for this we provide image collection so nothing stops you from just writing a for loop and calling in read on each image and packing into a list but sometimes you have say you know two and a half thousand images in a directory you want to access all of them you might not want to load all of them into memory at the same time so in that case you use image collection so image collection takes a wild card as a parameter so star dot jpg and it would load up all the JPEGs it only loads the images once you need them so it takes care of all of your memory that you don't accidentally run out of memory because they're all loaded up initially right so in this case I'm just making an image collection of all the images included in the tutorials directory if you call if you look at the files attributes of that image collection you'll see a list of all of them and then I'm just going to display them in MATLAB a little for loop so the important thing I want you to see here is that whenever I need an image I can I can access the image I guess that's not actually written down here in code but you could call ic0 that will give you the first image in the collection as a numpy array and at the point where it's accessed it will be loaded from disk you can also use the list as standard innumerable object like a list or a tuple so who here knows how to use enumerate or what it does okay so this is like one of the most useful functions for me in pythons right I like to showing what it does so imagine you have this list animals that contains cat dog and leopard if you do if you enumerate it basically goes over that list and it grabs the items one by one and tells you whether you're dealing with a zero the first the second the third object etc so it gives you our two things an index and one of the items from your list so for index and animal in enumerate animals I can go over that list and print out the index and the animal and there's what I get get the animal position zeros cat one is dog two is leopard right so it just counts along the elements for you super useful all right so I think that gives us enough of a background that we can start working with images ourselves and what I'd like to do in the next 20 minutes is give you a chance to become familiar with what I've just spoken about so we've got three exercises outlined here the first in the first I asked you to load an image from disk like the cat image and then modify that image to display the letter H in bright green so that is appear numpy array manipulation exercise in this second exercise I asked you to take an image and split out the red the green and the blue layers and look at each individually to see what they contain and then stack them all back together and display them as a color image in the final exercise we ask you to convert images from color to gray now it seems like that should be easy right you can just average the red the green and the blue channels but it turns out that that's not the best way to convert between RGB and gray the different channels have to be weighed with different weights so we give you those weights that give you the optimal conversion and then ask you to use matrix multiplication to weigh the channels together and convert from color to gray all right so let's try that now for the next twenty minutes I'll give you about 15 minutes and then show you some solutions and then we'll take a short tea break before one continues with filters so by the way one trick for in the jupiter notebook if you type something like PLT hist and you want to know what its arguments are just hold the shift key and press tab if you do that once you get the signature if you press it twice you get a little bit of the doc string if you keep pressing shift tab gives you more and more documentation until finally you get of all the documentation for the function so you do the same for any of the psychic image function so color RGB - gray open the bracket type shift tab and there you have the whole doc string popping up with shift tab alright shall we look at some solutions to these so the first one drawing the H on the image there are numerous ways of doing it like you could just slice directly into the array calculate the offsets what I'm gonna do is I'm going to just grab a window out of my array or out of my image and then I'm gonna draw inside of that window just to make the coordinate calculation easier so let's say the H window will be my image and I'm going to slice it between chords zero two chords zero plus how many rows should we have 24 and I'm gonna slice the number of columns chords one two chords one plus 20 take all the colors okay so that should give me that little block out of the array into which I want to draw the H so remembering numpy if you slice out a part of your original array you just get a view onto that erase if you manipulate the values they change the original array as well yes what's that Oh out you're right yeah well yeah I guess it doesn't matter at this point then alright so it does yes all right so H when I'm just gonna set the first ten by ten block to color to show you what happens so you see there I drew a little block on the image and you know then you can say well I want the first I'll watch these things be three columns first three columns to be the color so there you draw the line of the H and so forth okay for visualizing the RGB channels we reading the image then we want to assign the red green and blue channels to different variables so that would become image rows columns zero image rows columns one and image rows columns to note that there's is slightly nicer syntax for this inside Pi in Nampa you can say image dot zero which means basically like all dimensions except for the one I'm specifying now so that gives you the same thing alright and then we have some plotting code that I provided and if you look at that you'll see that you know the blue channel is quite a bit darker why well we have bushes here that green they don't contain all that much blue the ocean on the other hand has got quite a bit of blue so that's bright and it's darker on the red side the ocean also turns out to be quite green so the green channel is also bright and then here I stacked the three images back together and displayed them and that gives you your your color image then I gave you this thing just to sort of develop an intuition say these were red green and blue values what do you expect when you see those combined well it's - its stack them together what are they called red green blue along which axis do we want to stack them along X is 2 so we want to pack them all on top of one another and that's what we get up [Music] finally there was the exercise to convert to grayscale so how do you convert your image to grayscale well use the add operator and you multiply it the red channel by 0.2 1 to 6 the green by 0.7 1 5 2 and the blue by point zero 7 to 2 and you see that your RGB to gray then conforms to the one inside of psychic if you however changed those values to be thirds so if you average the three layers together you'll notice that they do not look the same here's the one from psychic image much darker around here and yours looks quite quite light in that case okay but if you use the correct weights they're identical alright so that concludes images the image representations section we'll take a short break now five to ten minutes and then one we'll get started on ten minutes ten minute break and then we'll get started on filters alright guys so that's ten minutes hopefully people will stream back in so I'm gonna talk about image filtering and what that means this is something that before I was an image analysis I didn't understand these words so hopefully everyone will learn something here so yeah everyone please open the notebook either click on filters or under lectures it's one image filters so the first thing we do is we set up method clubbing fun with that and then we're going to talk about what filtering means and we're gonna talk about it first about one dimensional signals because the two-dimensional case is completely analogous but it's a lot easier to see these things happening in 1d so get up some imports and then we're going to make a very simple signal which is just a step signal so yeah 100 measurements and then halfway between you jump from zero to one so now analyzing a signal like this is very easy but usually signals are not perfectly clean like this they're noisy like that so let's add some noise to that signal and see what that looks like alright so that's you can still see the step but you can only see it because you're mentally averaging across many values so let's try to see if we can get a smoother signal but let's start some simple denoising on that noisy signal so the simplest thing you can do is do a running average and the simplest running average is the one where you take the average of two adjacent pixels so you can do that with numpy vectorized operations like this we're gonna average the all of the values up to the up to but not including the last value together with all of the values from the second up to the last value and plot that so you can see comparing that and that but you get a slightly smoother so you all right so if we want to average a few more pixels let's go with three then the expression gets a little bit more complicated so you can see that now I take three values and I have to do a bit more funky indexing with it so you get a slightly smoother signal again I'm following the orange has the mean of three but again it's a more complicated special so is there a general way to make this type of nearest-neighbor averaging that will not require me to do all of this reason to take myself and that way is filtering so I'm not gonna go through what we did here but essentially so that this is the numpy vectorized version but another way of looking at it is let's take an output signal that starts at not at zero like in most Python things it starts at one and it ends at minus two and then for every position in that signal we're gonna grab the values around it so one value before it the value on top and the value to the right and we're gonna take the average of those three so with filtering what you do is you define a kernel and that kernel is going to be that size three array with values one-third one-third one-third and then the convolve function which we'll see in a second will run through the whole array and take those averages for you so here's how we define it empty dot full for those that haven't seen it it makes an array of this size full of this valley it's slightly cleaner than doing NP dot once and multiplying which is what I've been doing my entire life basically so we define this kernel this is an array of size three actually I'm gonna just show what it looks like edit split so so no okay so this is what we call the colonel and now for every pixel in our input array we're gonna put that Colonel there and we're gonna multiply each value with the corresponding value in the original signal and then we take the Sun and that's the output signal so that's what we do here we do we use the function and pecan valve and I'll talk a bit about the mode in a second for now we use mode equals valid and you can see that they are this is exactly equivalent to what we did before with taking the mean signal vectors and averaging them together with the slicing any questions so far yeah stefano detective I presume it's a you have maybe a older version of the notebook I'm not sure did you did you update at the start of the class okay I can just never so just navigating from where we started we were you launched your notebook in the root of the repository and then if you go to two workshops 28 inside but you click on the index it opens this page and this page links to the okay whoops there goes to the lecture yeah I'll go check okay good so now that we have this formalism we can take a much bigger running mean of eleven values using numpy doubtful same as before convolve it and you'll see that we get how much smoother signal still okay so now we're doing a running window of size eleven there's a slight sort of kink in what we're doing which is that if you have if you have a signal of length 100 and you want to take a running mean of 11 you can only fit about 80 such means because you run into the edge so you can see that my running mean of 3 goes almost up to 100 but my running mean of 11 doesn't quite make it there okay so there's lots of ways that you can deal with this one way so this is what mode equals valid means it means only go up to the edge of the data that you have there's mode equals same which means return the same size of output as you had the book and we will see when we do that what effect that has so what it's doing there does anyone have any ideas based on this thing at the end yeah it's putting in zeros that's right so at the very end it gets to the end there's no more values here you don't notice it because all your values are close to zero but here your values are close to one and when you start to take the running average of the values beyond the edge you start getting closer to zero as you take that mean so that's not ideal and we'll actually there's an exercise right now so this is the numpy evolved convolution is such a common and useful operation that there's many different implementations and they do slightly different things one of them is inside PI and iam involve so I want you guys to input this look at the documentation and then figure out how to do this without having this nasty edge effect using an TMS cut off so we'll take five minutes side till 250 we'll do that and we're here to help you so raise your hand if you are lost sir you need one particular bump in the right direction all right so that's five minutes so I think I saw quite a few of you who got pretty far with this so I'm going to import that function and to reiterate what Stefan said before this is an extremely useful thing to know is how to look up the documentation in the environment that you're working in so you don't have to keep like open in browser tabs and then suddenly the browser suggest that you open Facebook and not the afternoon angle so you can do in Jupiter notebooks you can use a question mark this also works inside of a night Python terminal and then you get the documentation like this or sometimes and your if you know almost exactly what you want you open the parentheses in blue shift tab you get very quickly the function signature we do a shift tab tab you get the full documentation like this in line and if you do shift tab tab Oh what happened I think I was not in edit mode here if you are in edit mode and into your shift tab tab tab it pops that and the same thing as a question like basically okay so now we see that the mode in and the image lock involve is something different altogether from what we had with none fighter convolve and you have this reflect mode which so you can see it has a whole bunch of different modes the options are reflect constant nearest mirror and rep so as I explained before in numpy when you're when you say mode equals same what is doing is the same thing as handy image mode equals constant with the constant being zero nearest just extends it the final value passed the end of the array and reflect just mirrors the array around that edge so it takes a final value then the second final value and self so over here if we want to get some kind of realistic signal if you just wanted to not go to 0 you can use edge if you want to get some kind of realistic signal with similar variance for example then you would use mode equals just liked which is the default so let's give that a go what's the name of the eyes a signal so smooth and the I is equal to involve of lazy signal mean Colonel 11 mode equals reflect and because it's the default I could just as easily have not included the mode and I can be appeal to your plot fig x equals beauty about subplots that stuff smooth yeah alright so now you see that you don't get that drop off at the end was that clear to everyone ish okay all right so that's a smoothing that's called the main filter basically is the simplest filter you can do let's look at some slightly more complicated ones and one of them is a difference filter so let's go back to our original step signal looks like that so can anyone tell me what you would predict if I convolve with this kernel minus 1 0 1 so don't don't run the cell after it and see if you can think through this anyone want to is it so so what happens and yeah so it's going to be flat anytime that the images that the values to the left and right are the same and then you're gonna get a spike right here yeah so you're only gonna get a value other than 0 when the values are changing and you're right ok so that's what it looks like ok and now I don't know if any of you predicted this you'd have to basically know about convolution already if you did but you can see that it's here I would have thought it's 1 minus 1 so it should be sorry 1 minus 0 so it should be 1 but actually for technical reasons to do with filtering theory that I'm not familiar with convolution actually inverts the kernel before doing the multiplication addition so if you want to do it yeah not by very sorry it changes the order of the array to be the reverse so if I do this I will get what I predict which is a bump going up ok so this is the the simplest difference filter and you can see how it would be useful in this specific scenario so you can basically immediately pull out wherever there's a change in your signal ok so again this is the inversion that I showed just before so one cool thing about filtering is that the the convolution operation is commutative and associative which means that you can do it in any order so if I want to do a smoothing and then a difference I can actually converse together to get a second filter and then apply that filter to the signal so let me demonstrate that so here's if we add noise to the signal you can see that suddenly in in terms of the different signal it's very hard now to spot the change in the values and that's because the difference filter is actually very sensitive to noise so what we want to do is we wanna do smoothing and then we want to do a difference filter so as I said before you can actually just convolve the two filters together so here's the 1 0 minus 1 and the one-third one-third one-third and mode equals full will actually go beyond the edge of the array so that you have even just a value of one overlap between the two and again it's gonna have zeros on ambiens so you can see that now you get this mean difference filter which does a wider window and it takes the average of the two values on the left the average of the two values on the right and then takes the difference between those okay so now have week involved a signal with those two sorry with this new filter you can see that the peak becomes a bit clearer now okay so let's do an exercise another five minutes so this is a Gaussian filter it's in signal theory it's kind of the the optimal smoothing filter rather than taking a mean you take a weighted mean where the value is close to your current value are weighted more strongly than the values that are far away and they're weighted by this particular function so make a kernel that looks like this and draw the kernel so you can see what it looks like then convolve it that with the difference filter just like you did up here and then involved with the noisy signal until the result all right so we'll take till 105 to do that sorry 305 all right so the interest of time I'm going to keep moving forward hopefully we've got far enough to get a flavor for this so we're doing these slightly complex formulas and numpy I like to look at things step by step right so if I want a filter of with nine I'm gonna do a rage nine and just make sure that you've got what it is that you want I want the center to be four so here I do X I minus X minus four so I can do that now if I take the square of that now you can see that it's a symmetric function now I'm gonna do MP dot X of this divided by 2 so you get these values which are positive but small oh sorry they're not positive but small they're positive and big because I forgot a minus sign this is why you need to do this real tight okay positive but small and now we do okay equals 1 over V dot I security f2 x MP dot pi times that so that's the kernel and we're gonna do big X close guilty about subplots so we're gonna make all three together and we're gonna do XO 0 flock ok well that's pretty ugly that's like this domina order okay so you see that it's a bit coarse but you get a bell curve shape and the first point is weighted by 40 percent points to the side of it or 25 percent and so on progressively okay so now we want to convey the difference filter so we do ask me difficulty and fee convolve we do k1 0 minus 1 my ID cause full and X of 1 o'clock smooth all right and so now you get a difference filter that actually takes something close to the main difference to the local difference it's a simple difference here but then it averages as you go further up from the central point and I may have flipped the sign here but I'm not gonna worry about it and so now we can do smooth signal equals and the I do I haven't yeah yes noisy signal and smooth so actually it's not smooth signal it's smooth and let's plot that well okay why did it autocomplete for no reason okay and so now you can see our peak when the signal changes here in the middle all right any questions about that cool so that's 1d filtering now let's generalize these concepts to two dimensions and you'll see I hope that it's very similar to do for images and actually if you have 3d images you can also do this kind of thing in 3d so let's start do we start with a very simple step signal let's start with a very simple 2d signal which is a bright square which is just ones in the middle and zeros all around it and that's what it looks like when you look at it as an image going back to Stefan's points about flooding point values being between ziering one they've got white on the middle black on the outside okay so now if we we didn't mean filtering we had values of 1/3 over an array of length 3 in two dimensions we're going to have a 2d array and the values will now be 3 it will be 3 by 3 so there's nine values so we divided by nine for the mean so that's the mean kernel is 1/9 in every position in a little square okay so a filter in 1d was sent to the colonel on the pixel multiply the pixels under the colonel by the valleys in the colonel sum all those results together and then replace that center pixel with that result so this is exactly the same thing in 2d 3d and E so now if we look at that image if we look at this pixel it's that location yeah 1 1 so it's this pixel here if you take the mean of the surrounds you get 1 over 9 and that's this value zero point eleven here if you look at this pixel here it's got two pixels in the neighborhood and so you get zero point two two as a value and so on okay so a few years ago Tony you who's another core developer for psychedelic I write this demo to show you interactively how it works so we're not gonna go through the code it's kind of tricky but we're gonna just run the demo and show you what it looks like so what we see here on the left is the original image you've got zeroes around it and ones in the center and then over here we have the footprint of the kernel we're gonna assume that we've got zeros outside of the image and this is what the mode in the MDA image to involve controls if he's said that the mode was constant and the constant value was one then you would get a different result because these values would be assumed to be we wanted that are outside the image so now we're gonna take the filter and we're gonna move it across the image and then we're gonna update the image on the right based on the result of the convolution so does this work yeah for the first row it's only catching zeros everywhere so the result is zero the second row here you can see there's still only zeros under that footprint as soon as we get to one that's not very visible and you can see it so that's one night so now it's got one value so it makes that picks a little bit brighter it's got two values that pixels a bit but it's still three values been part is that snow one-third of the pixels are bright and so on all right so and you can see here in the center it's an average of one because it's averaging all of the bright Square and you can see that is it is indeed a smoothing filter you get you had a very sharp edge in the original image going from zero one and I got this sort of nice ramp going from zero to one over several steps okay was that clear to everyone what's happening so I don't know how many of you have heard of convolutional neural networks a large number it's the hot new thing right it's not even that new anymore so this is exactly what those are so a a neural network the first type of a neural network is to have a convolutional neural network network is to take one of these filters and pass it over the image and then the you would take this imagine you would apply something called a non-linearity but the principle is they're exactly the same except instead of designing our own filter as we did here we've made a main filter you learned the values using machine learning and if we have time I'm actually gonna get you guys to do that for very simple toy one layer neural network but yeah the basic idea is exactly the same the other thing that I want to mention is in all of these main kernels we've made the sum of all of the values to be equal to one and the reason for that is so that the overall brightness of the image is not affected by our filtering because otherwise if if they sum to less than one for example and you do repeated filters you might end up with your image sort of whittling away to nothing so this is just a common thing to be doing all right so we're gonna use a pixelated image because I think seeing the pixels actually makes you understand what's going on a bit better so we're gonna take the camera image which comes with second image and we're gonna down sample it just by sampling every tenth pixel this is not a good way to down sample as it says in your notebook you should use SK image transform a scale if you want to do real down sampling of an image but for us you get a nice Jaggi image which is exactly what wrong okay and we're also going to because we're going to be showing lots of images by the side-by-side we're just going to use this I am show all function you don't actually it's not doing any magic it's just plopping images next to each other okay so let's run that mean kernel on the pixelated image and see what it looks like so again as you can see you get something smooth like this and that's exactly what we expect so now we're going to talk about the essential filters the the most common filters that you're gonna see over and over again and the very first one is the Gaussian filter which I talked about already in one-dimension you can do the same thing in two dimensions using socket image so finally after all this we get to from psyche dimension port filters and this contains some pre-built filters for you and now we're going to compare the mean of the bright square with the filters Gaussian of the bright square what was the comment I'm not reading oh yes totally forget that that's like ancient history okay so it's not very clear here but you can see that this is a smoother ramp than the mean filter we're gonna look at the pixelated image and the difference between the two filters and show that you get a much better result with the Gaussian filter smoothing how's the Gaussian filter upon first inspection you don't really see much difference but what we're gonna do is we're gonna do a much bigger filtering doing mean filtering of size 60 or a Gaussian filtering of an equivalent size and see the results on that original image and so you can see that the mean filter even though I was doing really good for a small local neighborhood it gives you all of these artifacts you can sort of see the repeated strut of the camera same here it sort of spreads out whereas the Gaussian filter really looks like a blurry version of the original image okay so if you're if you're blurring images the mean filters not usually there I go you want to do something like a Gaussian filter and I think this applies to most signal processing okay so I'm going to talk a little bit the Gaussian filter in two dimensions is very similar to the Gaussian filter in one dimension and as you can see here it's actually just the multiplication of the two dimensions together and so I'm gonna look at what that looks like so to look at a kernel you can either make the kernel yourself or you can filter a single bright pixel and then the output of that is the shape of your kernel so that's what I've done here you can see the smooth bump I'm just going to give you a tiny exercise and give you one minute which is to plot a line profile of this line of what the brightness looks like along this line of the kernel okay so this is just repeating the fact that our images are only numpy right so it's just an umpire right indexing puzzle okay so the kernel is just an umpire ray I want to take every row and the 22nd : and just plot that has a line and you can see what you got a nice bell-shaped curve shift you've done many statistics is exactly what a Gaussian distribution is like okay and so basically this looks like a 2-dimensional Bell bump in the image okay so the next thing that we're going to do that's smoothing in 2d differents filters in 2d and they're a little bit more complicated because I have to do a difference in one axis and a difference in another axis so I'm drawing this vertical kernel which is just a difference taking the average of the three pixels above and the average of the three pixels below and taking the difference so I'm going to look at that for the pixelated image and you get this image where here you have a negative change from Bright's to dark as you're going down the image and then here you have a positive change from dark to bright in the sky I'm just going to show actually I'm sure it's elated and high gradient vertical why did that not work pixelated does I just do pixelated any ideas - son I think it's because it it's the negative values versus the positive values only here is what's happening it's a fight so gradient has has values not between 0 & 1 but between minus 1 and 1 and I am sure all is trying to normalize all of the images to the same space so that's why I'm not getting the same badness but just to show that here it's going from dark to light in the original image and that's what the filter is detecting okay in the interest of time I'm just gonna quickly do this exercise for you so any ideas how do I get a horizontal kernel from the vertical one yeah you just need to transpose the matrix so what did I call it vertical kernel and then I can do Oh all right so this is exactly the same principle but now you can see that you're getting the horizontal edges of the image instead of the vertical edges yes good point update all right I'm not even gonna go there yeah okay and usually what you want is just a overall gradient and we can do that by because these are just numpy rays you can just use vectorized operations on them so I can do a gradient equals NP dot scrt horizontal gradient squared plus vertical great why did I call the great vertical that's pretty okay all right and so now you get a nice gradient around the whole figure so is your basic edge filtering there's a filter called the Sobel edge filter and this is probably the most common edge filtering and it's kind of a baby version of the Gaussian filtering that I showed before where you instead of taking just a simple edge filter you want to do a bit of averaging so if you click on the documentation for these you can see what the kernel is it's taking the edge with white to along the central pixel position but it's also looking for a continuation of the edge above and below that that's specific value but with a slightly lower weight of one instead of two all right does that make sense cool so the vertical V stubble is the sorry edge level is the opposite of that and filters dot Sobel by itself without the edge of the V does exactly what I did here which is to take the square root of the sum of the squares and just give you a normal gradient so you can see that you get a much smoother edge then all we saw before and if I do it with the pixelated image you get that gradient again and again you can do this you can chain these filtering operations together so I can do a Gaussian filtering followed by sobel filter and then get a smoother gradient they can see that if you do a bit too much smoothing you get a fatter line for the thing so this is you do pay a price for smoothing but it's just just something to know about they could see some of the edges along the struts are a lot cleaner okay so I'm gonna leave this exercise for the break you guys are free to look at it this is the one layer and you'll network but since we're running out of time I just want to quickly show you some more elaborate filters that are available in psychedelic and the first one is the median filter so we saw that the main filter blurs out the image but it also contains a lot of artifacts and we saw that the Gaussian filter can blur out your edges as as well as making them smoother but it'll make them blurrier so the median filter is takes your footprint and now it doesn't apply this linear multiplication and addition it just takes the median of all of those values and that's what you get in the middle and that's has the property of preserving your edges I'm not going to go through the logic of that but it's you work through it it'll go quickly from one value to the other and forever gonna pick an intermediate value and so get that blurring effect so let's show what that looks like so you can see for the pixelated image you get some smoothing but it preserves the edges of the figure and that's because when it gets to this pixel it only has a bright patch and the dark patch to choose from and the median value has to come from either of those patches so it's not going to pick a median value a value intermediate to that some gray level value that you get here that's right yeah okay and you can do that with you can see that it works for not just for pixelated images but for images that are more detailed so you can get a nice smooth image with some sharp edges in it still okay and then there's some very far more elaborate filters which will let you guys read about in the documentation so I recommend that you look at you can look for Succot image API filters and it will take you directly to that module and show you all of the filters available there so hopefully what we wanted to show you is the very basics of filtering so you understand what's happening and then reading up about ever more elaborate filters is not hard step for you guys to keep doing at home yeah and that's all the documentation is so take a ten minute break a five minute break what do you guys think how has your timing Josh know what I wanted decisional ten-minute break so 3:40 all right I think most people are back and the rest will filter back in I'm really excited to talk to you today about segmentation was a big part of the work that I did in my PhD and we get quite a few questions about this topic on Stack Overflow as you'll hear from Stefan after this section my goal really is to give you an true overview and ability to attack a given problem from a conceptual level so segmentation at its most basic level is you have an imager or a data set and your goal is to usually with some meaning attached to extract or identify a subset of that data that is more interesting you for some it to you for some reason so you've seen this in popular culture the example on your screen is a screenshot from terminator vision and one of the early Terminator films the way that this gentleman in the cowboy hat every Texan is outlined is essentially a segmentation the Terminator vision or the 3dfx guys behind the movie drew or calculated this outline around that individual and they could have taken this individual out and maybe put them in a different frame and that's something that we often refer to as photoshopping colloquially with proprietary software which of course can do with and other other things the important thing to note about segmentation is that we basically do that via two major ways we either use some sort of guiding input at which point we are the ones that are supervising the algorithm so we say hey I wanted that person I wanted that face in my field of medical imaging I wanted that organ segmented out of a given image or maybe an image volume we're seeding it with that data maybe we have just even put a point in there and saying that's liver and around it is not liver and maybe that's the entire input that we gave to the image but it had some sort of human input and so that's our supervision of it separate from this is the subfield of unsupervised segmentation where we're trying to a priori there is no information known the algorithm has no idea what's going on in the image we haven't given it any hints or help and it is going to try and identify some sort of logical sub regions based on contrast or local features or or other things so we have a number of different segmentation algorithms psychic image and here I've kind of broken down the main ones between supervised versus unsupervised and you'll note that threshold and kind of crosses crosses the divide and we'll talk about why that is we'll use them most of these in the course of this example so first off that is that under is that well understood does everybody kind of conceptually grasp what the goal is in terms a segmentation because that's truly key to this example and if there's one thing I want everyone to walk away from is that there's this distinction is if you can have human input or some guidance to the algorithm versus not and it sends you down a completely different pathway any questions about that no okay we'll move on so first off we're going to just do some standard end points imports will get numpy matplotlib and then various aspects of psychedelia on board and this is just a quick convenience function like we've seen in other examples to give us a pretty plot efficiently so on a very basic level threshold is segmenting and we've already done a little bit of Thresh holding in previous examples but basically we have these image values if we only want to look at all of the values that are above a certain a certain point the we can set a threshold at that point and we just get we get either you know black on one side and white on the other depending on where we set the greater than or less than and that can actually give you some reasonable results depending on if you actually run the matplotlib inline yourself go back down here so this is just a page of Oh actually it's a segmentation text that it turns out it's not the greatest image though the this site is kind of a little bit darker but maybe we can pick a value that will give us a reasonable segmentation of this just just without any functions at all it's based on the UM pie itself now to help us pick that value instead of just throwing darts at the board we can look at the histogram and I don't think the histograms have been greatly looked at in our previous lectures but basically a histogram is just where on the y-axis it's the frequency or the number of times we encounter a given value and then the x-axis is the the values that are in the image itself this image happens to be an 8 bit image so we have a total of 256 total possible values and we just run this cell and you can change the bins around but the bins are the number of actual the way it's moves things together within the histogram and you can see that there's a concentration of pixels that are up here fairly light that's most likely our fairly light text background but then the rest of its kind of smeared out an ideal segmentation case would be bimodal fairly separated and then you could pick a number right in between them so still maybe this is our most of our background that's light and quite a few pixels are in these so let's just try a few so just go ahead and take a minute or two and just make a few segmented images just based on simple thresholding from what we know about this histogram here not bad honestly but we still have this dark area back in the corner if we if we reduce this down to let's say 50 now everything else we start to lose the text over there we've gotten rid of our dark corner we go up higher and obviously well if I don't leave an extra zero around you know the this dark edge just kinds to continues to encroach on us so that's potentially not ideal there are a number of thresholded x' some of them are specifically just designed to tell us where would be an ideal threshold within this histogram and within this image without any additional input so that's an important distinction we just did supervised segmentation because we looked at the histogram and said I think that's where it would make sense but we have a few that would be unsupervised if you had let's say a database of text images of books like the Google project to digitize libraries they're probably using something that's fully unsupervised on each image so that they can binarize the text away from the background they may be using something like threshold underscore and if you just put your cursor right here and hit the tab button these are all the different threshold methods that are available in filters and I would encourage you to just try a few try Otsu trial ii and anything else that you might be interested in but leave local for last and when you get to local take a look at the actual actual docstring and then I'll give you a few minutes for that and we'll go through it together so all of these to answer the question for the for the room all of these are going to give you back not the thresholded image itself not the result but the value at which the threshold should be taken which could be one single value or could be an image that's the same shape as the original image except at each point it will be the appropriate threshold that's how local works okay so let's try a couple of these together again I'll just do a two because that's ooh and Lee are very similar so we'll go ahead and put the text image in Otsu and then we will go ahead and take a look at what that threshold was chosen so it chose 157 and we can make this look more like you would expect a book to look if we do the other greater than it's just reversing the ones and zeros so odds who thought of threshold of 157 made sense and that's a algorithm that you need to know further input from us so that would work and give you the same thing in an unsupervised manner li is similar so I'll just skip past that but local is more interesting in the local thresholding we're actually looking at each point in the image and trying to find an ideal threshold at that point and you have to look in the documentation just a little bit to figure out exactly all the other things you need to give it so first we need to pass an image that's our text image then we need to tell it how large of a local area we're going to use and it has to be an odd value so let's look at a reasonable sized area with something like 15 and then we'll go ahead and leave all the other defaults and that actually gives us thing it's readable now and but we have all of this artifact it's certainly not ideal but you'll note that when I printed the result it's now a huge array so at every point in this image we now have an individual threshold to extract the the most ideal value right out of that right out of that point so if that worked reasonably except we just need to get rid of these artifacts that are based because our 15 point region is trying to extract something too small between lines there's nothing of value there and so it's basically giving us noise we just Jack that up to let's say I want to do 42 but that won't work and now we were starting to get rid of some of these artifacts but not quite all of them so we can keep increasing this and at some level it starts to actually become a problem we're getting this dark corner back in because we're looking at now this entire region and and we have this gradient across it so there there is an upper limit and so let's just settle on something around 50 and see if there's anything else we can do to alleviate this this artifact well it turns out that we do have a value in here an option called offset and that is specifically designed to tune the result of the tune the threshold slightly up or down by the default is just trying to find the the ideal mean or median value but the offset can tune that up or down and so in this case let's threshold just a little bit higher so that ideally we lose all of this extra values behind there in the noisy regions and now that's pretty good if we just increase the offset slightly anyone have questions about that so it's still mostly unsupervised if we had a similar camera that was taking pictures of books in a similar way we could generalize this to other images because we're only really telling at how big of an area to look at and that we'd prefer to live have less noise by just increasing that offset a little bit but the the real point here is that we've iteratively kind of tweaked the parameters a little bit for our particular problem okay so that's as all I really want to say about threshold encodings or segmentation in general yes okay so the question was we are visually saying and this is a good image is there a metric that we can use to quantitatively say that yes this is a good result yes I'm not getting into that in psyche it in just a measure there are a number of different metrics that I would try to apply to that problem but generally speaking what you would probably want to do first would be good at ground truth and then you can compare the difference between your result and the ground truth it's a little hard to do that a priori but you could do some fancy things and look at local entropy or local variation to see if you've still got too much noise floating around a priori but that's how I would attack that that make sense any questions yeah yeah so thanks for asking the question was so we've just done all this stress holding but we're really is the segmentation the segmentation is this result that we have here so when we take our threshold and apply it to the image we get a result that is all it's binary so it's all true and false or all 0 and 1 so depending on where or how we've set this less than or greater than we've selected for either entirely the background to be true or entirely the text to be true and the segmentation is the 0 or the ones in that in that example does that make sense I want to make sure we're bringing everyone along great okay moving on to general supervised segmentation methods threshold is limited to only images that are in specific situations the images a text are a great example where that can work images of real-life situations let's say this room would not be well suited to Thresh holding as as a method of attack so we need to think outside that particular simple box so we're gonna instead use a real image that we shipped with psychedelic domain image from NASA of an astronaut named Eileen Collins before she went on a space shuttle mission and this is a real actual image now one important step that we're going to basically gloss over in this is that every time you're doing a segmentation project it's always a good idea to remove any noise before you proceed so this is an excellent image that is already done not very noisy and so we're going to basically move forward but keep it in the back your mind that when you're segmenting things especially if you'd underlying data has some noise in it consider dealing with that first okay so that's our astronaut and the contrast is pretty good so let's let's consider what if we wanted to try and segment out the astronauts heads this would be kind of a typical more of a photoshop like task where you're maybe gonna if I that and then do something else with it but first you have to know where where the person's you know ahead of interest is if we're just Lin specking this the actual astronauts hair versus the background it's decent contrast this shadow versus the background has decent contrast skin versus they the collar of the suit is pretty decent contrast over here there's also a shadow so we probably are in a decent spot to just go ahead and convert to grayscale this is again using RGB to gray which we've seen before and Stephane was mentioning uses a perceptual method to do the grayscale conversion as as we can see we're probably in a good spot here with grayscale and our to segmentation methods that we'll look at here for supervised segmentation they use very very different approaches so one of them is called an active contour which is often colloquially referred to as a snake and the idea is that you've drawn some sort of a contour around outside the area of interest and it is going to slowly contract but be attracted or repelled from light and edges and so the idea is that each individual point along there is going to move very very small amounts and that will be repeated iteratively until it grabs a corner that it really likes and then it's adjacent points might keep to move and but that one might stay the same and there are a few parameters that we can tweak in there but that's the basic idea behind an active contour random Walker I'll talk about when we get there that's completely different so okay so we're going to use active contours that means that we need to be able to see our image so we again are supervising so we have to actually tell it where we want the contour to start and then we also have to tell it if that contour is going to be a closed contour or not the default for this algorithm is that it will be so like if you were to a half moon it would consider the final point to be connected to the first point the the tail per function I've I've provided here is really just a way to draw a number of points in a circle and so if we just go ahead and use that and we do it as circle with a resolution of two hundreds a reasonably high resolution with a center point at this point we're also not going to include the last point because linspace will get us all the way to and including two times MP dot pi and for those of us that know trigonometry off the top of our heads that's the same place as we were at zero we're not supposed to repeat points in this so we're just cutting off that last one so if we just look at the result of that actually I'm just gonna come on out this line so this is the contour that we're starting with so before we got here I just identified a point that was kind of near the near the middle of the forehead and then we've drawn a circle that's large enough to completely be around the area of interest and this is the only input that we're going to give this function other than tweaking the parameters slightly okay so now then actually running the active contour algorithm itself is pretty simple to begin with if you just look at the documentation with shift-tab there's quite a few different options in here but we'll start without using any of them and just use our greyscale astronaut and this circle number of points that we have generated note that the periodic means that it's going to connect the first in the last that's what I was mentioning before so it's going to consider this to be a closed curve and the rest of this kind of well we'll get to those so let's see how it works that's gonna take a second to run it's done and that didn't really work very well did it hmm okay so the blue is our snake and it two things happened or didn't happen one it didn't really get very far from our initial points and two it doesn't really seem to be grabbing edges yet although arguably it really hasn't found many edges to grab so just like stephan1 and I do when we run into these kind of situations we're gonna look at the documentation and say well it seems like we want this to move a little bit more than it has been and one of the first things that pops out is alpha so higher values make this drink snake contract faster okay so the default for alpha is point zero one let's increase that by an order of magnitude and again it's finished a little better yeah - again of the octopod rawr - seven radians of color and then yeah so if you the idea is that the snake itself it's tracking each one of those 200 points that we gave it to begin with and each each time those points can move up to one pixel we could allow it to move a little bit more if we change this but the right of the default is that it can move up to one pixel per iteration and essentially though the other things that we may or may not have tweaked like the how attracted it should be to lines and how attracted it should be to edges define the underlying function that will say if it wants to move or not it's going to by default contract that's what this alpha parameter does and that's also why we wanted to make sure that we fully encircled what we were interested in for this particular algorithm that won't always be true depending on what algorithm you're looking at and how you're trying to approach the problem but for active snakes in this particular one we needed to include the whole thing and then think about it sort of slowly shrinking and then grabbing edges as it as it narrows in its it's at each point it's looking at the local region and there's a fairly complicated function because you can see there's waiting for the smoothness of the snake and it's it's saying well I don't want to get too far away from my neighbors because that would make this really weird heart acute edge but you can set it so that that's okay if you've reduced the smoothness parameter which is beta as it turns out so it's a little bit more complicated than just running a gradient and then sliding down it because each point is almost independently but not quite because of these parameters like theta2 figure out how it can tweak itself to get a more optimal point and then the entire algorithm looks at a convergence criteria which is basically how much have I moved am i pretty stable here and and it will stop earlier than 2500 iterations if it thinks it's getting to that point make sense and other questions okay so that's a pretty decent result for an active contour now the result of the the what we got out was the actual perimeter here but we could use in the draw sub-module there's we could define this as a polygon and then fill within that polygon at that point we've got an image that has the face segmented out and that's exactly what we wanted to begin with using an active contour so if there's anything if there's any other questions about active contours we can hang around up here otherwise I'll move on to random Walker yes yeah it's sorry so the question was if you just wanted to pick out the skin just just the face or just the hair yeah so you would change a couple things one that would help would be narrowing that circle so that it would start in the hair and then narrow down from there instead of starting outside everything then the the hardest edge that it might find would be the edge of the face instead of the edge of the background and the hair so that would help so you can change your initialization you could also change its attraction to light or dark so there's WAW underscore I think it's edge changes how much weighting it thinks it would like to find lightness versus darkness and you could use that to try and find it basically exclude the hair or exclude the face if you were interested yeah most of the time snakes contract but you could set that alpha to a very low value and I know that you can have it where it will expand locally from where it started if it feels like that would be a better edge over in that local area Stephane do you know if this one can actually be reversed fully yeah we'd have to look into that for you okay in the interest of time I'll move on to random Walker so random Walker works completely and utterly differently than active contours the idea behind random Walker is that you've provided some random pixels they do not have to be connected they do not have to be so it's no longer a contour you can just sprinkle them throughout the image if you feel like and then they have some sort of labeling associated with them the algorithm is then going to determine a penalized penalisation cost to move from each pixel to the neighboring pixels and it's going to iteratively figure out what the cheapest path is back to each label type and whatever is the cheapest is the one that owns it so the result it's a little bit more general this is where you can have a GUI that instead of having to draw circles around things you can just sort of draw a line down the middle of an object of interest maybe a squiggle outside of it and it's going to segment the whole image based on that because everything is just what's the cheapest path back to all of the labels and whatever at every point whichever one is actually the cheapest wins so so we need some labels and in this case I'm just going to see the labels array that starts all the zeros now we've already got our circle that we drew before and in this case we're just going to borrow that same idea from our we're going to borrow those indices that we use previously and set them to sorry we're gonna borrow our points that we used before and set them to two so the outside is two and then we'll use the draw function within psychedelic draw and we're just going to draw a smaller circle near the near the middle of the astronauts face so we could have done different initializations but in the interest of time basically now we have this circle in here which is fully inside the astronauts face but does slightly cross into the hair over in the top and then this is the same circle that we had previously and then we're just gonna go ahead and use random Walker on that and see what happens again just know a default everything except using the grayscale astronaut and the labels alright so let's see how that worked hmm not not perfect is it kind of kind of expanded the circle but it's mostly still a circle it doesn't look like it's grabbing edges like we would hope so that's a that's maybe a problem let's let's look and see if we can fix that okay what's available in this function so the main tunable parameter and random Walker is this beta parameter and that is what weights the differences in the pixels to penalize pixels that are farther apart and the greater back gets the harder it is for them to actually cross over that pixel in order to get to their next point so with very low beta value which as it turns out our default value as applied to this image is fairly low it's actually pretty reasonable when you apply it to floating-point images but we put in a UN an unsigned the 8-bit integer so we've got 256 so full it's a bigger range so it actually makes sense that we would need to change our beta here and we want to increase the beta so let's increase that the default was 130 let's try 400 I'll take a couple seconds and it's better it's not quite where we want it to be but it's improved you can see that it is and I should say that the way that I'm showing this is that the actual segmented image I'm just extracting the face labels from the segmented image and it's defaulting to the various color maps so the lower values are on the bluer side normal values are a little bit on the yellower side except I've put an alpha blending so you can see through it and that's why it kind of looks like a sepia tone if you will but this area in here that's a little more sepia like that's our actual segmentation so we're getting closer but but it's still not quite where we want it to go so let's keep increasing beta and you can see what you figure out as the ideal beta value here all right now we've got the entire face we're missing a little bit of this ear and still not quite gotten through the hair so let's keep going all right that's very similar to before and if we just keep increasing it back out a little bit further in this area and in the interest of time I'll just jump to 2500 and now similar to before we've basically fully identified exactly what we would like it's actually still missing a little bit of this earlobe because there's not enough of a gap here to keep it out so you would expect that so again supervised segmentation we gave it our inputs and then we had to tweak the parameters a little bit it's a common theme but you would have but we would have expected to need to tweak this if I did ran that as float to begin with the beta would not have needed to be changed as much okay that's gonna be all I have for supervised segmentation any questions about supervised segmentation all right in interest of time I'll go ahead and move forward so we're just not always able in our analyses to have the luxury of a human looking at an image and interactively saying this is what you know this is where you want to start this is where you you know need to go and so we have a number of algorithms available a lot of them are based on and there's a lot of correlations here to machine learning including this one so SLI C which stands for we didn't put it when when it stands for the algorithms called slick and and it uses K beans with one set simple linear inert of clustering it uses k-means under the hood so it's basically using psyche at Len under the hood in order to take the space of the image which is all of the pixel values and try and determine or try and separate them out and do a given number of sub regions and we can just put that image right in here and all we're doing is labeling via labelled RGB we're just setting each sub image or sub region that we found to the average of that region which makes it look less like a patchwork of randomly assigned colors and more like an image that has just had been decomposed into areas that are kind of similar and this is actually pretty reasonable so we've taken an image that started at well several thousand values 262 odd thousand pixels down to a total of 100 regions which is the default you can tell it is if you want more or less and it'll give you more or less regions but that's very helpful if you're needing to decompose your data down as a first step and they're pretty logically relevant you know the the face came out pretty much as one but you would have to go in and then interactively pull out what you wanted to but it's certainly a good decomposition step to begin with anybody have questions about simple iterative linear classifications okay Shawn visa is a different algorithm which is based on a level set and level sets are going to give us a binary result so you wouldn't want to use this if you were trying to get multiple sub regions out it's very very different than slick so Qian visa is going to give us basically a it's gonna threshold but it won't necessarily have to be connected and it's evolved this level set over actually takes several seconds to run it also requires a grayscale image yeah so a level set is basically a function that's working behind the image itself and instead of like the snake moved pixels over time the level set you could think of it as sort of a bendable piece of rubber paper and then over time instead of moving the actual individual points around it's changing the way that it's bent in space positively and negatively and then the level part of the level set is where you threshold that usually around zero and so the positive aspects of that level set end up being the result of your image the ones that were a little bit higher and the negative aspects ended up being excluded or vice versa it's a very powerful technique although it does take a few seconds to run and it initializes using a very general it's just a checkerboard that's applied to the image so it starts off with random positive and negative values everywhere and then smooths out to the result that you see so in this case it's kind of interesting it's basically gotten all of the either all the shadows or all of the light tones so in this case I've chosen the the zeros out of it and but the zeros were all of whites and the shadows were what it picked out for all of the darks so if you had an image that you wanted to do that era Thresh holding on you could go back you can try this on the text I think I'm a little pressed for time or else I would have yeah so I encourage you to try that but feel free to so Champy's unsupervised based on a level set must be grayscale input and it will give you a binary result all right your timing okay well let's just go ahead and try it on the text a little faster on the text because that was a smaller input and and that didn't work quite as well did it this is that checkerboard initialization that I was telling you about hmm nah I think I've got it because it's still it's the Chan Beast result that I'm showing all right this one I haven't prayer at prepared so I'm gonna let it run a little bit longer and increase the speed that it runs so it hopes that that evolves it a little bit faster he's still having some difficulty yes okay we're gonna have to get back you on this one I think I think the issue is that it's looking for large features and these are all such small features that it's having trouble identifying them as individual things that it should grab hold of and the idea that the way to get around that live I'm unsure of yeah oh yeah well it could be because I am still running Python 2 on this laptop didn't want to change things up right before the actual presentation thank you for that see if my colleagues can get that working in Python 3 okay so one more algorithm we're gonna go over today is called I'm probably gonna butcher this I think it's a feller in Schwab and this one is again using machine learning under the hood it's using minimum spanning trees and this one interestingly does not guarantee the number of clusters that you get you just get two basically at a high level say what scale it's going to consider but it's going to run and generate as many as it thinks is appropriate for that given scale or zoom factor on the image also it full requires a color image we cannot use a gray so we'll just go ahead and use the they astronaut the original astronaut image and take a look at the results and that's quite a lot of different segmented regions how many actually is it so here I'm going to introduce a very handy function function and numpy called unique so the number of unique labels in our result here just the size thereof okay so we've got nearly 3300 sub Regents that we've found in a very short period of time and all we're going to do is then just recolor them using the the region averaging that we like we did in that previous example so just go right back up to slick where we took this region coloring coloring in with the averages we use that except we're going to use the result we just got based on the original takes a minute to do it because there's 3,300 regions and this almost looks more like a posterized image posterization is just a reduction of the number of colors that you started with except we started with 255 times 3 channels and now we've got 3,300 so we haven't really helped ourselves there yet but there is a common technique in image processing which is where you over segment you actually get more regions than you need but lesser than the original number of pixels and then within those sub regions you can more easily select between them or define which images sub regions you're you're interested in or want to look at and one of the ways to combine them is with a region adjacent together I like these because they're conceptually very similar to random Walker so before in random Walker we were looking at the penalties of moving pixel to pixel back to wherever we were starting where the region adjacency graph were looking at the penalties of moving from region to region and seeing which ones are kind of hanging together and the only real catch here is that we're gonna add one to this because in a step that we do with the region properties which is a really helpful function it doesn't include zero but we do want to actually include zero that's a useful value that comes out of fellowship swap so we're importing from the future rags are in the package but they're not actually the the API might change they're stable they're they're robust but we might change the way that we interact with them in the future or we might not we might just move them right into the base I can image graph sub-module but right now we have to import them from the future if we want to actually use them so will based on color so we'll do the mean color is is what this entire region of JCCC graph is gonna be based on so this time it's not local contrast or anything else it's literally the mean color like what we just generated and now we have our Regent adjacency graph now the graph itself is conceptually a little bit difficult to understand so the next all we did in this step was to for each region we're gonna tell the graph where the center of that region is and that will help us plot the center of that region so that we can show how it connects to all the other regions around it and this particular function iterates through the graph and for each edge between regions that it finds it's going to plot it between the centroids of those two regions that's really all that's going on here and then it will only include the ones where that the weight of that connection is above a certain or below a certain threshold yeah above a certain threshold okay so if we just set the threshold to infinity we get every edge in the graph that's pretty complicated but we've got three thousand plus sub regions in here and the interesting thing is that we can yes okay all right yeah it just it just becomes yeah alright I'm not gonna hit chip dinner on my screen cuz this works for me if this works for you don't do that it doesn't work for you make this change you should being brave yeah thank you alright if there's issues like that please pick up okay so this is obviously too too complicated and too convoluted but if we just threshold at some value and in this case we have to basically just iterate here so let's try interest a time let's try 40 so then we're going to display only the edges that are below 40 so there's the mean color differences below 40 so we're actually separate threshold alow and the goal is to find similarity and this does directly edit the graph so if you want to change this you've got to go back up and recreate your graph here and then rerun this and then get back down there but with around 40 you can see that now things like this stripe have hung together but aren't really connected to other adjacent this is similar but it isn't all necessarily connected to its adjacent areas and when we are happy with the way that this looks we can continue to threshold down because you can't ever go back up iteratively unless you recreate the negraph so we can go down to 30 here and now that separated things out even a little bit better the face no longer connects to the hair for example and so let's let's consider that our final cut and again we can cut based on a threshold so it's just like what we came up with here and use this label to RGB coloring that we've done just in the past and see how see how that looks and we've lost a lot of detail in the face but again we have logical sub regions that we've managed to combine so we've reduced from 3000 to under 400 individual areas and some of those could be further reduced because they're fairly small just in the in the middle of other things okay I'm not sure we really have the example the time to do the the exercise with Stefan's cat but on your own time or if you're at home and watching you I've basically given some starting points that are in the middle of the cat's eye in the middle of the cat's nose and then you could use this to iteratively attempt to segment out just the cat's nose or the cat's eye and think about the the challenges you might encounter along the way there because you've got the black eye and maybe you wanted the iris as well as well as this and how you how would you do that which algorithms would be more appropriate as an intellectual intellectual exercise so any other questions about segmentation we get a lot of questions about this okay I'm not sure if silence is golden given the issue that seemingly half the class had with networks but the what I really would like everyone to take away is that there's this large distinction between supervised and unsupervised algorithms and that given the approach and the problem that you have and the luxury you may have given the ability to have a human involved in in the process you are down in an entirely different pathway in terms of what you're trying to do and then show you just kind of iteratively how any one of us would work through a new problem when it doesn't quite give us the result we encounter the first time or that we wanted the first time we look into the parameters see what we can tweak because for a given problem or forgiving input data there's a good chance the defaults might have to be changed slightly okay alright I'm gonna assume silence is golden but you can hit me up at any time alright so we're going to switch gears a little bit now get our hands dirty on some real-world problems and this should be kind of a little bit more relaxing the last section for the afternoon so if you click through to that stack overflow challenges notebook these are questions that were posed on Stack Overflow real questions that people had in how to apply Seiken image to their their problems so I've grabbed a few that I thought would be fairly easily easily solvable and maybe alford are two to a few hours we have often our sir so let's run through that through different ones here so we've got the first one is from an NIH challenge that we participated in and the idea was to build a mobile app for the cell phone that would allow people to photograph their medicine and then identify the medicine that they would take the right pills so the images looked pretty much like the one displayed up here and the challenge is find that pill identified segmented and calculate its area so that would be a first step in identifying the kind of medicine there and then you can imagine you know later on you would measure its its exact shape you would measure maybe some attributes of the bowl itself but for this exercise just try and isolate the pill and calculate its area and I gave you some advice on how to do that so a good start is like equalized image to get the brightness up so play around with the exposure equalize methods once you've brightened the image up nicely do some edge detection on it using the canny edge detector I think that lives in feature now and then you can fit a circle model using the ransack model fitter yep this is like you know all these problems are thrown out there and we kind of used heuristics to solve them so just use whichever creative method you want to solve them but this is one example that I happen to know it works one pathway that I happen to know works in the second example you have a photo of a bunch of coins I asked that do you count the number of coins automatically in the third example you've got this image of a bunch of snake-like lines and the question is identify the start and end points of each of these in this picture we've got a bunch of M&Ms question is how many blue ones do we have and we want to count them automatically in this real-world experiment we have some liquid that goes through some phase change and there's actually a pretty tough image to work with it's very noisy the boundaries aren't very clear everything's translucent so all I ask here is that you try and find some kind of meaningful boundary of that snowflake like object so just try and see if you can segment somehow segment out that that boundary and you'll see that the noise makes it quite hard to do so yeah so pick one of these and and give it a give it a try if you have a problem in the back of your head that you that you brought along maybe some of your own data that you want to play with feel free to to do that and we'll be here to answer questions as you try and apply these tools right so will I'll give you 25 minutes to try and solve one of these problems and then we can quickly run through the solutions and maybe do you have a five-minute Q&A at the end all right so it's about time to wrap up so let's run through a few of the solutions like I say there are many many ways to solve these problems but we have a few sample solutions so electro solutions right so for the establishing the parameters of the the pill you want to use IO Emery to to load the image I equalize the image using adaptive adaptive Instagram equalization and then on that image calculated edges using the canny edge detector so here's the original image here's the equalized image and here are the edges that you get from Kenny the plotting doesn't show that very nicely that's just because the pictures shrunk in the browser to change this to something else yes you can see that you've got nice edges out there and then you can use rance active to fit a circle to that data so rancic is it's a really nifty method for if you've got noisy data for fitting a model to that data so it basically chooses three points at random from your edges and it tries to fit a circle and it sees how many of my data points lie on that circle then it randomly chooses and other points repeats the experiment fits the circle keeps doing that several times and that looks back at the history and it says like where did I do best and gives you back that circle so so yeah if you if you do that then you get a nice circle that fits perfectly to the fill and then you can use the the circle to calculate its area I also applied morphological snakes so that's similar to what Josh said earlier on it's a newer addition to cycle image they take awhile to compute and I haven't tuned the parameters perfectly yet but when it's done you'll see that it sort of grabs the pill plus there we go pearl plus the shadow yeah that makes sense that's sort of where the edges lie and then you saw early on that you've got parameters that you can tweak to say like stay away from the dark area stay closer to the edges that sort of thing this would also be useful if your pills aren't necessarily around so if you have hexagons or triangular poles that sort of thing for the coin counting example the procedure that we do is we equalize the image to get the lighting up we threshold the image to get a black and white version in this case we just use a two threshold thing to do it automatically we remove all objects close to the boundaries then we close the binary objects which means basically we fill in all the small little holes and then we count the objects that we get so what is what am I seeing oh that Jesus did not come out here where'd I get rid of edges what you made these changes thanks 1 all right I'll have to fix that ok so guess what oh here there we go yes it makes sense ok so there we have the coins labeled different colors and counted twenty four coins so this is the easier version of the problem in the M&M one which looks easier is the harder version because you've got the M&Ms touching one another but this one you could get away with a fairly simple strategy so with this the snake example is kind of cool because it uses the full Turing that one taught you earlier today so we so we have two issues here so first we want to know like how do we define an endpoint of one of these snakes because those are the things we want to count so what is an attribute that an endpoint would have like can you think of any attribute that an endpoint would have yes what what was that about the whitespace most sides of it except for for one exactly yes so that's the definition needs so the end point is like it's part of a snake but it's only connected to the snake in one direction so to the sides up and down etc right so how do you how do you find those in the image well you could write a for loop and you could say at each position go and check first is the pixel dark okay great then it's part of the snake now check all its neighbors does it have only one neighbor but this sounds an awful lot like the convolution that we saw earlier on convolution can help us do this counting so indeed what we do is we define the following convolution that the following kernel we have ones with a zero in the middle so this if you convolve the image with with this kernel it basically counts the number of neighbors that each point has so right so that's one part of the problem is convolved the image count the number of neighbors now I've got a map for each pixel we know how many neighbors it's got but then we're uninterested in the pixels that are actually part of snakes so to do that we have to do an end so we have to do give us the number of neighbors but only if that pixel is actually part of the snake so for that we use that and operator so let me show you first count of the whole solution here so that's the whole solution so you can see it finds the end points but I just want to show you this intermediate result that you get just after you do the convolution just counting the neighbors so you see that it gives you back this map where most of these points that are part of the snakes they have two neighbors one to either side and then right at the ends the color goes down a little bit you can't see that on this image but it goes from two to one because it only has one neighbor so then we're interested in this image in the positions where it's got only one neighbor yeah for the M and M one one came up with a better solution during the tutorial but I did something fairly simple which was denoise the image because we don't really care what's going on inside the mms so we just denoise the image using any method available I chose to use D noise nonlinear means I calculated the distance away from blue to get an idea of of where the blue ones are this I call the MAS my mosque so sure yeah that looks so just calculating the distance away from blue and segmenting that distance gave me a very nice mosque and then you can go and count the number of objects you see inside of this mosque you have a bit of a problem because some of these touch so in order to disambiguate those you can use watershed segmentation and if you apply watershed segmentation then it counts yeah gives you a count of 12 and there are 12 names for the finger the viscous fingers here you convert the image to grayscale because it is a grayscale image really they just add some odd lighting there and then if you tried canny edge detection on it you would have noticed that it's an extremely noisy image you get noises edges all over the place so what you definitely need to do some smoothing of the image so we I used the turtle variation denoising algorithm and then apply the canny edge detector and when you do that with the right parameters you get something this so you can see it's not it's not a done tusk you still need to do a lot of clean-up on that result but at least you have a nice boundary that you can work with all right so I hope that gave you sort of a good overview and a bit of a taste for Seiken image and the tools surrounding it in the scientific Python ecosystem if you have any questions we'd be happy to stick around and answer them but otherwise thank you for attending and we look forward to meeting you during the rest of the conference [Applause]

Info

Channel: Enthought

Views: 31,744

Rating: undefined out of 5

Keywords: python, scikit-image, image analysis, SciPy 2018

Id: arXiv-TM7DY

Channel Id: undefined

Length: 139min 6sec (8346 seconds)

Published: Thu Jul 12 2018