EEUS 2018: Image Segmentation and Object-Based Methods

Captions
Morning, everyone. Okay, so I'm Noel Gorelick, I'm one of the co-founders of Earth Engine, and this is a talk on segmentation and object-based analysis. It's a talk about a bunch of tools that I built in the past couple of months, specifically so I could give this talk at this summit. It is important to note that every piece of this talk and all of the tools are a massive work in progress; I was still working on those tools Sunday or Monday night as I was preparing to give this talk the first time yesterday, so just be aware that there will be surprises.

This talk is available here if you want to bring it up and follow along. The talk is entirely pictures. It's been a lifelong goal of mine to give a presentation that is nothing but pictures, and I almost got there; there are some words on the pictures occasionally, but this is all about concepts and results. If you want to see how it's actually done, this is the script that made all of the things I'm about to show you, so you only need this one; it will get you everything else. But should you want to look while I'm talking, there is the second one as well.

Okay, so here we go. This is a red-green-blue-near-infrared NAIP image (the National Agriculture Imagery Program), a one-meter image. It's a pretty typical kind of image on which you might want to do segmentation and object-based analysis, so that you can get clean classifications. The idea is that if you classify on the mean of all the feature vectors inside this polygon, you'll get a nice solid result, as opposed to classifying on the individual pixels, where you might get a lot of salt-and-pepper noise. So a lot of people want to do object-based classification because of the improvement in signal-to-noise: there are a thousand pixels in here, and if you can use them all, you should be able to beat your noise down by a factor of the square root of a thousand. So it's a reasonable thing to do.
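That square-root-of-a-thousand claim is just the standard error of the mean. A tiny pure-Python check (illustrative only, not Earth Engine code; the noise level is made up):

```python
import math
import random

def standard_error_of_mean(sigma, n):
    """Noise of an average of n independent pixels with per-pixel noise sigma."""
    return sigma / math.sqrt(n)

per_pixel_noise = 0.10                      # arbitrary reflectance units
object_noise = standard_error_of_mean(per_pixel_noise, 1000)

# Empirical check: average 1000 noisy pixels many times and measure
# how much those averages themselves scatter.
random.seed(42)
means = [sum(random.gauss(0.5, per_pixel_noise) for _ in range(1000)) / 1000
         for _ in range(200)]
mu = sum(means) / len(means)
scatter = math.sqrt(sum((m - mu) ** 2 for m in means) / len(means))
```

The measured scatter of the 1000-pixel averages lands right around sigma divided by sqrt(1000), about a thirty-fold reduction.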
But that means you need the object, to be able to get at all those points, before you can put them together and use them. You might also be a conservationist or someone like that, where you actually want to know the size of each one of these fields, or the size of the areas that aren't fields: what's the size of the river, what's the size of the forest patch, what's the perimeter, what's the connectivity, what's the contagion, and all those things that conservationists do that I don't quite understand; I just know how the math works. So there are a lot of reasons you might want to turn these things into objects, and they serve different purposes, and you end up doing different things with them. My job as an Earth Engine engineer, and specifically as one of the founders of Earth Engine, is to try to make that easy for you to do. My goal was to stand up here and say: I have solved all your problems. Turns out that has not been the case. I have solved a few of your problems, and I'm sure I've created a couple more along the way, but my long-term goal is to be able to say: hey, here's a simple tool you can use to do object-based analysis, you don't have to fiddle with all those things anymore. That's sort of the underlying Earth Engine theme, and more or less the theme of my entire career: y'all shouldn't have to fiddle with all that stuff; you should be able to jump straight to the science. So I am trying to make this work. If there are things you think will make it better, I certainly want to know about it; if you start using these tools, I want to know about it; and if there's something you want to do, like contagion, that I don't have yet (although I'm about to), then I want to know so I can make that tool work.

That particular image is covered by the USDA Cropland Data Layer; here it is. This tells you the crop type for each of those fields, or at least what the USDA thinks the crop type is.
Clearly this image is not perfect; it is highly unlikely that this field was planted with five different crops. It's just that their classification wasn't so good either; they might have benefited from some object-based classification themselves. But it is training data, and in fact the class I just taught was about taking an image like this and an image like this, putting them together, and doing the classification. You would sample it at a bunch of points, something like this, and you might end up with a pixel-based classification that looks like this. It's not much better than the CDL; in fact it's probably a little worse. That area where they didn't know what was going on, our classification didn't come up with any better answer, because we were using their data as truth to begin with. More than likely this was planted with some crop that none of us recognized, or maybe it was bare, or something like that.

Having done a classification like this, one of the next steps might be to ask: what is the area of the rice being grown in this region? I think blue is rice, or we'll pretend that it is. You might take this image, simply sum up the area of all the blue pixels, and say that in this region there were, you know, 7,000 hectares of rice being grown. But it would be an approximation, and it might be a pretty gross approximation, because in a lot of places these fields, which are almost certainly homogeneous, are not homogeneous in the classification. You're just ignoring that and saying, well, I got 7.2 million pixels, that means seven thousand hectares, and you might be comfortable reporting that number or you might not. So you might want to clean this up, and we have some tools you could use for that: you could apply a mode reducer with reduceNeighborhood, and that would let you get rid of single-pixel outliers.
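That mode-reducer cleanup can be sketched in pure Python (an illustration of the idea, not Earth Engine's reduceNeighborhood; the function name and the toy class grid are mine):

```python
from collections import Counter

def mode_filter(img, radius=1):
    """Replace each pixel by the most common class in its
    (2*radius+1)^2 neighborhood, the way a mode reducer applied
    over a moving window would."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            votes = Counter()
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        votes[img[yy][xx]] += 1
            out[y][x] = votes.most_common(1)[0][0]
    return out

# A "rice" field (class 1) with two salt-and-pepper outliers (class 2):
field = [[1, 1, 1, 1],
         [1, 2, 1, 1],
         [1, 1, 2, 1],
         [1, 1, 1, 1]]
cleaned = mode_filter(field)
```

The isolated class-2 pixels are outvoted by their neighborhoods and disappear, which is exactly the single-pixel-outlier cleanup described above; larger blobs survive, which is why this doesn't solve the whole problem.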
Taking the mode of the neighborhood around each pixel would erase a bunch of single-pixel outliers, and that would help, or at least give you more confidence. But this sort of thing it's not going to fix; there's really no good way to do that on a per-pixel basis. Same thing here in the red: there's a sort of bare patch in the middle of this field. You might want that bare patch pulled out as a separate patch, or you might want it to go away. And that's the easy case, on the crops. Over here, where it's a lot more ambiguous, clearly a lot more ambiguous, do you believe any of these values? I don't know. But more importantly, you might simply want to say: okay, this whole patch is uncultivated, let's cut it out. You need the patch as an object to do that. Same thing for the river; the river is super spotty here, not a very good classification. Part of that is because these are all the points that went into the classification: I simply sampled at each one of these little dots to get a feature vector, went back to the Cropland Data Layer, and said, okay, that point is that crop, whatever it is. So we've got a few thousand points, I think it was actually seven thousand, for a classifier with ten different classes; not a great combination to produce a great classifier, and in fact the result is kind of iffy. Like I said, you can clean it up.

You might get to this point and say: well, the problem was that I classified directly on the red-green-blue-near-infrared data; maybe I can do better. So what you might think to do next is convert to NDVI. This is NDVI, and you can see it doesn't help much, but I'm going to help you see that it doesn't help much.
Here in the solid crops there's still this wavy pattern going on where they've done some kind of weird planting; I have no idea what's happening there, really weird planting. These black things are what I think is rice: they've got water in them, so they're super negative in NDVI, and they're not going to come out well in the NDVI. And over here you've got lots of stuff just merged together; there's not really a field boundary or anything in there that helps us. To point this out, here is NDVI thresholded to five different levels, and you can see there's no way we could just pass an NDVI threshold over this image and get to fields: the fields are at vastly different levels of NDVI, and even inside a single field you often have vastly different levels of NDVI. So an NDVI threshold is not going to get us to any kind of objects we can work on. One of our users recently went through this whole process and posted a question, and I said, come to the summit, or at least, I'm going to address this at the summit. The reason it doesn't work is that the NDVI values are all over the place; this is zero NDVI, and there's a wide range of NDVI values in this image; we can't get to segments easily that way.

A typical thing that's done in the field, especially in computer vision, is that instead of working on the raw data or the transformed data, you work on gradients. This is a gradient image of the NDVI, and now it's a little more promising. The stuff inside the fields is fairly low, and we might be able to threshold that away. At the edges of the fields, where we're going from some kind of crop into some kind of road, we're expecting big gradients, and so we get nice bright edges. Bright edges are good, because they give us something we could maybe vectorize; those are object delineation points. The river is well delineated, the roads are well delineated; maybe this will get us somewhere.
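A finite-difference gradient magnitude of the kind shown here (horizontal and vertical differences combined by sum of squares, as described later in the talk) can be sketched on toy data (pure Python, not the Earth Engine gradient operator):

```python
import math

def gradient_magnitude(img):
    """Forward-difference gradient in x and y, combined by sum of squares."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            out[y][x] = math.sqrt(gx * gx + gy * gy)
    return out

# A flat NDVI-0.8 "field" next to a flat NDVI-0.2 "field": the two fields
# are at very different NDVI levels, but the only strong gradient is at
# their shared boundary, which is what makes gradients attractive.
ndvi = [[0.8, 0.8, 0.2, 0.2],
        [0.8, 0.8, 0.2, 0.2],
        [0.8, 0.8, 0.2, 0.2]]
grad = gradient_magnitude(ndvi)
edges = [[1 if g > 0.3 else 0 for g in row] for row in grad]
```

Thresholding the gradient finds the field boundary regardless of the absolute NDVI levels, which a plain NDVI threshold cannot do.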
Looking at the histogram of all of this, maybe one of the points in here is a good place to threshold. Through lots of optimization and looking at a lot of images, this was the best I could do with that gradient image and a single threshold. Some of these fields are perfect; that's a great object for that field. This one, however, is not: it's got a nice boundary all the way around, and then it's got a hole, and if I just vectorize it, the vectors are going to go right through that hole and keep going, and all of this is going to become one big object. In fact, it's going to go over here and all the way down there, because the vectorizer doesn't know not to go through those holes. And even with all my playing, I still wasn't able to get rid of the inter-field variations; in some cases they're just big enough that they showed up in the gradient image. So I think we're on a bad path: we've gone down several steps, and we're not really going to be able to vectorize this image and just have our objects. We need to do something else.

Our first something else: maybe we just used the wrong gradient; maybe we can't see what we need to see in there. It turns out there's a good bit of literature on spectral gradients. We had four bands, we turned them into one band, and then we tried to do a gradient on the one band; that was maybe not the best choice. There are a bunch of ways to compute spectral distance: take any two spectra and compute their distance in some metric. My favorite is spectral angle, where you simply compute the angle between the vectors, the arccosine of the normalized dot product. There's also squared Euclidean distance, which is this one. There's this thing called spectral information divergence, which I was not able to find a nice picture of anywhere, so I actually don't know what it means graphically, but it's math, and it says these spectra differ by this much. And then a really neat one is earth mover's distance, which doesn't look like the other three.
Those three look very similar as you look at them; the earth mover's distance looks different, and therefore there's some kind of orthogonality in it that might be useful. There's a really nice paper that says: given a spectral distance, we can compute a spectral gradient. Normally, if you're computing a gradient, you just take the min and the max of the pixels around your pixel and say, okay, my maximum nearby is four, my minimum nearby is two, so my gradient is two; and you might do this in the horizontal and vertical directions. In fact, that's what the earlier image was: a horizontal gradient and a vertical gradient, combined by a sum of squares to give a magnitude. What you're looking for is the difference between all the pixels in the neighborhood. So as you're passing a window over your image, you can take all the pixels in the window, cluster them together using spectral distance, find the one that's closest to the centroid and the one that's farthest from the centroid, and the distance between those two can be used as a spectral gradient. There are some tools I made to do this: spectral distance is the starting point, spectral gradient is the end, and it turns out there are two things in the middle that turn into erosion and dilation for spectra. The dilation replaces the center pixel with the pixel farthest from the centroid, and the erosion replaces it with the one closest. I haven't figured out what spectral dilation is good for yet, but spectral erosion is actually pretty good for eroding away outliers: if you've got a field with one bad pixel in it, you can apply a spectral erosion and that pixel will go away, and you haven't really munged up your spectra, because it was one of your own spectra that replaced it.
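The centroid-based erosion/dilation/gradient scheme just described can be sketched in pure Python, using spectral angle as the distance (my own toy version, not the Earth Engine spectralErosion/spectralDilation/spectralGradient implementations; the window spectra are made up):

```python
import math

def spectral_angle(a, b):
    """Angle between two spectra: arccos of the normalized dot product."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def centroid(spectra):
    n = len(spectra)
    return [sum(s[i] for s in spectra) / n for i in range(len(spectra[0]))]

def spectral_morphology(window):
    """For a window of spectra: 'erosion' keeps the spectrum closest to
    the centroid, 'dilation' keeps the farthest, and the distance between
    those two is the spectral gradient."""
    c = centroid(window)
    erosion = min(window, key=lambda s: spectral_angle(s, c))
    dilation = max(window, key=lambda s: spectral_angle(s, c))
    return erosion, dilation, spectral_angle(erosion, dilation)

# A window of mostly "vegetation" spectra plus one "water-ish" outlier
# (bands: red, green, blue, NIR; invented values):
window = [[0.1, 0.2, 0.1, 0.6],
          [0.1, 0.2, 0.1, 0.5],
          [0.1, 0.2, 0.1, 0.7],
          [0.2, 0.3, 0.4, 0.1]]   # the outlier
ero, dil, grad = spectral_morphology(window)
```

Note that both erosion and dilation return one of the window's own spectra, never a synthetic mixture, which is why erosion cleans up outliers without munging the spectra.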
So it's a little useful for cleanup, or at least I have used it for cleanup, but really the whole point here is that the spectral gradient is simply the difference between the spectral erosion and the spectral dilation, and since I had to compute them anyway, I turned them into functions, and you can use them if you find any use for them, or abuse them, which is much more likely. Jump in anytime, by the way. Yes: spectral distance is the starting point for all of these things; it uses all the bands you give it, and you can always select which bands go in, but all of those metrics just say: you've given me twelve bands, I'm going to compute the twelve-band spectral distance.

Okay, so maybe we were just using the wrong gradient. This image is the same image as before, computed with the spectral gradient on the red-green-blue-near-infrared bands directly, but for three different metrics, stuck together in red, green, and blue so you can see how they compare. Where they all look pretty much the same, you get white; the blue is where the spectral information divergence did a good job. You can see things have gotten a little better: most of my breaks have gone away. Maybe I could vectorize this; maybe I could fiddle with the threshold a little more to get rid of this stuff, I don't know, but it's a place you could start: simply turn these into vectors, and maybe it gets you somewhere, or maybe not. Going down this gradient path, there's a lot of fiddling; if you've got really obvious answers, maybe this will get you there, but also maybe there's a better answer.

Any questions yet? Yes: the original image is one-meter resolution, and the stuff you're looking at I think is actually 18 meters; I'm using 18-meter pixels here, which means roads are basically one pixel wide in most cases.
One more question: could you skeletonize this? There are erosion and dilation functions you could iterate on this to do skeletonization and turn it into something easier, or you could just vectorize the black bits directly and use the white bits as the mask: you mask out the white bits, and what's left you vectorize. As long as there are no breaks, it will kind of work, but even in the blue here there are a few points where there are breaks. If you want to go this route, people do it, it's done, and here are some tools you can do it with, but it's not actually what I want to talk about; I wanted to point out a bad way to go before we talk about a good way to go.

Okay, so let me talk about superpixels for a minute; this is where most of my work has been for the past month. Just like before, we're going to make a regular grid of points, and for each of those points we're going to collect the pixels around it. If one of those points lands in a nice field like this, then this particular algorithm grows outward and you should be able to get the whole field. The thing about superpixels is that they're not designed to be the objects; they're designed to reduce the image to a few pieces. I don't want to say "reduce the objects to a few objects," but that's more or less what's happening here. I'll show you what that looks like: take each point, expand out. This is a random-color visualization, and sometimes it works great and sometimes it doesn't. But the nice thing about this one is that the shapes are in there: the river is well defined, most of the fields are well defined, although in some cases they're split across a couple of points, and the roads are there. One downside of this particular approach is that all the pixels have to go somewhere, so sometimes you will get a cluster that leaks out into a street or a road, because those road pixels had to go into some cluster and there wasn't any point on top of the road.
The clusters just grow, and the way they grow is a little bit controllable, but they grow until they've ultimately collected all the pixels, and so you end up with clusters containing pixels that aren't so great. But we take that, and using the exact same tools as before, we classify each object. This is arguably a better or worse image, depending on your point of view. The better part is that there's no salt-and-pepper noise; it can't happen, because we're classifying each entire region as a single vector, so each region becomes one homogeneous result in the end. But it does mean that some of the roads get misclassified, and the river is broken into multiple classifications, because some of its clusters include lots of pixels that aren't river. Still, the problem we were originally trying to address, salt-and-pepper noise, is completely gone; it just can't happen in this case. It still has issues, though. Here's the CDL for comparison, and you can see as I go back and forth: here, that's actually a better result than the CDL got; this one is iffy, I can't tell; this field has been broken in half, but in fact that field is planted in halves, so it's probably a little better. Is it better overall? I don't know yet. I will point out at this point, and I'll probably reiterate it a couple more times: I'm an engineer, I make tools; you, the scientists, are responsible for figuring out whether those tools did what you want. We often call this loading the gun: I will happily load the gun and help you point it at your foot, but you're the one who needs to decide whether or not to pull the trigger.

Okay, so the nice thing about the superpixel mechanism is that you get to pick where the seeds go. They start off on a grid, but you can say: no, no, don't do that, here's where I want you to put those seeds. And you can take that to some ludicrous extremes.
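A minimal seeded-region-growing sketch, in the spirit of the superpixel growth just described (pure Python, one toy band; not the actual Earth Engine algorithm):

```python
import heapq

def seeded_region_growing(img, seeds):
    """Grow clusters from seed points, always absorbing next the
    unassigned pixel whose value is closest to its claiming seed's value.
    Every pixel ends up in some cluster, which is why clusters can
    'leak' across roads when no seed lands on the road."""
    h, w = len(img), len(img[0])
    labels = [[None] * w for _ in range(h)]
    heap = [(0.0, sy, sx, label) for label, (sy, sx) in enumerate(seeds)]
    heapq.heapify(heap)
    while heap:
        cost, y, x, label = heapq.heappop(heap)
        if labels[y][x] is not None:
            continue                      # already claimed by a cheaper path
        labels[y][x] = label
        sy, sx = seeds[label]
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] is None:
                heapq.heappush(
                    heap, (abs(img[ny][nx] - img[sy][sx]), ny, nx, label))
    return labels

# Two flat "fields" with one seed dropped in each:
img = [[10, 10, 10, 50, 50],
       [10, 10, 10, 50, 50],
       [10, 10, 10, 50, 50]]
labels = seeded_region_growing(img, [(1, 1), (1, 4)])
```

Each seed claims its own spectrally homogeneous field; remove the second seed and the first cluster would be forced to swallow the 50-valued pixels too, which is exactly the leaking behavior described above.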
At some point I made this image. Go back to my spectral gradient for a second: I used some computer vision on it to say, give me one seed point at the maximum distance from any white pixels, so that should be a point at the center of each good field. Then I said, okay, I've still got roads and whatnot that will be a problem, so invert the whole thing and give me a seed on top of the worst points, the brightest spectral gradients. So these are seeds at the local minima and the local maxima, which is why the label up there says "gradient min/max clusters." There were a lot more local-maxima clusters, which means you get much smaller pieces; the river is super well delineated, like every piece of the river is more or less its own cluster. Did this do a better job? I think I have a classification right there... nope, I didn't do the classification. But you can classify this just like we did a second ago, and it looks more or less like the previous one. The point is that you can put the seeds anywhere you want. Hopefully, what you would like is that the seeds go in at the right places, you get your results, and you don't have to fiddle. That is not the case; you'll have to fiddle, and I'm going to show you some tools I've developed to help you fiddle with this, and some results, to help you put these in the right places. I'm basically positing that, through some mechanism that I have not yet solved, we can get essentially perfect clusters: we can figure out a way to drop a point on top of every field, drop a point on top of all the other things, let them grow into their own regions just fine, and then do things with those clusters. So I'm kind of solving two problems at once here, and positing that the first one works even though it doesn't yet, and that it will just get better.
Hopefully, that happens over the next few months, maybe weeks, but we'll see.

A little bit more about seeded region growing and SNIC. One of our users, almost two years ago, posted an example of doing SLIC, simple linear iterative clustering, which uses k-means with this superpixel mechanism, and I said we should really get that in. Eighteen months later I got around to working on it, and in the meantime people had come up with a better answer: another algorithm called SNIC, simple non-iterative clustering. Non-iterative and Earth Engine go together really well; iterative, not so much, so I was all over that. I implemented it, and it turns out it was really simple; it's an algorithm really well suited to Earth Engine, so I'm going to show it to you. If you don't do anything else, it will just give you a grid of seeds and grow from the center of each of them, adding in the pixels with the minimum distance, so it adds the closest spectrally matching pixels first and grows out. There is a parameter on the algorithm that lets you define how much influence the spatial component has: if you want a purely spectral segmentation, it will do that; if you want a purely spatial segmentation, where spatialness is defined as distance from the center point, it will do that and you'll just get squares; and you can do anything in between. So you can turn that knob and watch the regions change depending on how much compactness you want in them. Right now, SNIC is limited to Euclidean distance; I'm adding a parameter any day now to let you use the other metrics. I finished this before I finished the spectral distances, so now that those exist, I can bring them in and let you use any of them. Does that make sense? Right: you change how much weight you're giving to the closeness of the pixels.
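The compactness knob can be illustrated with the SLIC/SNIC-style joint distance (a sketch of the published formula under my own naming, not Earth Engine's exact code):

```python
import math

def snic_distance(pixel, cluster_mean, xy, cluster_xy, size, compactness):
    """Joint distance: spectral distance plus spatial distance, with the
    spatial term normalized by the expected cluster size and weighted by
    a 'compactness' knob. compactness=0 gives a purely spectral
    segmentation; large values force square-ish clusters."""
    ds2 = sum((p - m) ** 2 for p, m in zip(pixel, cluster_mean))
    dxy2 = (xy[0] - cluster_xy[0]) ** 2 + (xy[1] - cluster_xy[1]) ** 2
    return math.sqrt(ds2 + (compactness ** 2) * dxy2 / (size ** 2))

# The same candidate pixel scored against two clusters, under two
# compactness settings (two toy bands, invented values):
pixel, xy = [0.2, 0.4], (10, 10)
mean_a, ctr_a = [0.9, 0.1], (11, 10)   # spatially close, spectrally far
mean_b, ctr_b = [0.2, 0.4], (20, 10)   # spatially far, spectrally identical
loose_a = snic_distance(pixel, mean_a, xy, ctr_a, size=10, compactness=0.0)
loose_b = snic_distance(pixel, mean_b, xy, ctr_b, size=10, compactness=0.0)
tight_a = snic_distance(pixel, mean_a, xy, ctr_a, size=10, compactness=5.0)
tight_b = snic_distance(pixel, mean_b, xy, ctr_b, size=10, compactness=5.0)
```

With the knob at zero, the pixel joins the far-away but spectrally identical cluster; turned up, the nearby cluster wins, and the segmentation drifts toward squares.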
If you set it really low, you can get clusters that stretch very, very far and are primarily spectrally homogeneous; like this black one, which squishes down around into the non-field fairly nicely at low values of the spatial weight. One more nice thing: there's another function called seedGrid that will let you put the seeds anywhere you want, so you can actually set them on a hex grid, and you end up basically with hex cells, which then morph around the same way, preserving the spectral features of the underlying data. Is there an optimal setting? Correct, there's no optimal answer, so I don't know what the right value is; it's a knob you can play with, set it where you think it should be set.

All right, so again, as I mentioned, I'm presuming that the clustering will eventually give you good objects. Once we have clusters, what can we do with them? If you're trying to do forest conservation and you want to know connectedness and compactness and contagion and things like that, you need clusters that actually represent your forests, and I don't have that yet; we're going to assume that I do, and we're going to talk about what you can do when you've got them. So: clusters as objects. Here are some clusters; assume for a minute that they're the right clusters. One thing we can do uses a new tool, and this is the important part of this whole talk, by the way: reduceConnectedComponents. Previously, you could take all these things, turn them into vectors, and do all kinds of vector-based stuff: distance, perimeter, all that sort of thing. But the process of doing that has significant limitations in Earth Engine. The size of a table that you can generate in Earth Engine has limits; if each one of these polygons has, I don't know, 50, 60, 100 points in it, it's a pretty large feature, and some of these polygons can be much worse.
If you get your polygons not from SNIC but from wherever else, you can end up making hundred-thousand-point polygons, million-point polygons, and the number of million-point polygons you can fit into a table in Earth Engine before it runs out of memory is actually quite low. We're optimized for tiled images, and anything we can do to stay in image space means the scale can be increased, sometimes by four, five, or six orders of magnitude. So we want to do everything we possibly can in image space and avoid table space if at all possible, or maybe at the very end extract your final results into vectors. To do that, a piece has been missing, so I built it last week. It's called reduceConnectedComponents: it takes a tile, finds all the connected homogeneous pixels in that tile, and then applies a reducer to everything underneath those pixels. The cluster band is where the homogeneity comes from, and the spectral bands underneath it are what you then do things with. This particular image is my original NAIP image, clustered, and then within each cluster we compute the standard deviation. The reason I did this is that my clusters that are really good field clusters have low standard deviation, and the ones that include both field and road end up with high standard deviation. This is a great example: it's got several roads in it, and its standard deviation is way high. So this is a great indicator of good clusters and bad clusters. It's also per band, so you're actually seeing red, green, and blue standard deviation here; the colors don't mean a lot, other than that the standard deviation in green was a little higher than in red and blue here, while over there the standard deviation in all the bands is high. And it makes a cool, pretty image.

Yes, question: for all tile-based operations in Earth Engine in which the tiles matter, you specify a neighborhood.
So if this were my tile, you'd say: go this much additional distance and do that extra work, so that you don't miss any partial objects. That's the neighborhood size; several dozen of the functions in Earth Engine have it, and we just do the extra work. I had another point there... oh, right: in this particular case it might be called the neighborhood size, but it might better be called the max object size, because if you have an object that actually spans not one tile but two or three, then the neighborhood comes all the way over to here, and that's a really big neighborhood that's hard for us to process. You can tell us to do that: you can say the max object size is 8,000 pixels, and we will try to bring all that data in and do that huge amount of extra work just for this one 256-by-256 tile, but you might run out of memory in the process. So to make sure that we never give you a wrong answer, instead of a neighborhood size this function takes a max object size, and objects bigger than that are thrown away, masked out, so you'll know we weren't able to do the computation, rather than getting a partial computation that would be completely incorrect. Yes, that's what's happened here: the standard deviation of all the pixels in each cluster, put back into all of those pixels.

Okay, so we go forward. We have our standard deviation map telling us where our bad clusters are, so we can threshold them straight away, saying these are the outliers, and then compute the spectral distance for all the pixels inside each of those clusters. Here's one cluster that was a problem: the spectral distance from the mean of this cluster to every pixel in the cluster is what you're seeing here. The field is low spectral distance, because the mean is mostly field, and then we can actually find the worst outlier and stick a point on it. So for this cluster, this was the worst outlier; for that cluster, that was the worst outlier.
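The per-cluster standard deviation and the worst-outlier hunt can be sketched together in pure Python (a toy stand-in for reduceConnectedComponents with a stdDev reducer; the names, threshold, and one-band data are mine):

```python
import math
from collections import defaultdict

def cluster_stats(clusters, values):
    """Group pixels by cluster id; per cluster, compute the standard
    deviation and find the member pixel farthest from the cluster mean
    (the 'worst outlier', a candidate new seed)."""
    members = defaultdict(list)
    for y, (crow, vrow) in enumerate(zip(clusters, values)):
        for x, (cid, v) in enumerate(zip(crow, vrow)):
            members[cid].append((y, x, v))
    stddev, worst = {}, {}
    for cid, pix in members.items():
        mean = sum(v for _, _, v in pix) / len(pix)
        stddev[cid] = math.sqrt(
            sum((v - mean) ** 2 for _, _, v in pix) / len(pix))
        y, x, _ = max(pix, key=lambda p: abs(p[2] - mean))
        worst[cid] = (y, x)
    return stddev, worst

# Cluster 0 is a clean field; cluster 1 accidentally swallowed a road pixel.
clusters = [[0, 0, 1, 1],
            [0, 0, 1, 1]]
values   = [[0.5, 0.5, 0.5, 0.5],
            [0.5, 0.5, 0.5, 0.9]]          # 0.9 is the road pixel
stddev, worst = cluster_stats(clusters, values)
new_seeds = [worst[cid] for cid, sd in stddev.items() if sd > 0.1]
```

The clean cluster's standard deviation is zero, the contaminated one stands out, and its worst outlier (the road pixel) becomes a new seed for the regrowing step.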
We can use those as new seed points and grow new clusters, assuming those were the problem points, and hopefully the problem goes away. It doesn't always; let me rephrase that: that is not the case, but it helps. So here is a new set of clusters grown on those new points, and here is the new standard deviation map: much darker, which is good; black would be perfect. There's one cluster that's still a problem child, and you can see it's got lots of road in it. We could repeat this until we stop having clusters above a certain level of standard deviation, and suddenly I have taken my wonderful non-iterative clustering algorithm and made it iterative again; but maybe it's just one or two iterations until it's better. This is all just a workaround because we don't get perfect clusters yet, but I'm showing you the workaround so that you can use it.

Okay, so again, backing up and assuming that we've got good clusters, that we've gone through whatever mechanism we need to get good clusters, we can then do more stuff with them. And this is the part I'm excited about, because it's new. I'm not sure anybody has done this before; I may need to go do a literature search. We can now do per-cluster statistics in image space; we don't need to pull the clusters out to compute, for instance, the area. Up until now, you would have had to make polygons out of all your clusters and then ask each polygon for its area. But Earth Engine has a function that will produce an image where each pixel's value is its own area, and you can simply sum those up underneath each cluster, and now you've got cluster area.

Or cluster perimeter. This is the one I think is really cool, and I wore my shirt specifically to blend into this slide today. This is the one where I woke up in the middle of the night and went: yes, I figured it out. What we do is take every pixel and find the minimum and maximum of its neighbors in the cluster image.
find the minimum and the maximum of its neighbors in the cluster image. On the edges, the minimum and maximum of the neighbors will include some other cluster, whereas in the center the minimum and the maximum of all the neighbors are the same. So this is cluster twelve: all these pixels have neighbors with a min and a max of 12, while these pixels over here will have a min of 12 and a max of 13. So where they differ, I know I'm on an edge; I've got an edge pixel. I can erase all the pixels that have the same value for min and max, and I've eroded away the centers of all the clusters, and what's left are the perimeter pixels. Now I can compute perimeter area entirely in image space, and that's what you're seeing here: each perimeter is colored by its own cluster color, whatever that cluster's color was. I don't know if you can see from the back, but there's green up here, blue down here; each cluster has its own color. And we can compute perimeter area, again, by just summing what's left. So the two tools being used together here are reduceNeighborhood, to get the neighborhood pixels (what's around me), and then reduceConnectedComponents, to do something with all of those inside the cluster. Cool. Now you're all as happy about this as I am; I was really happy about this result.

All right, a couple more easy ones. This is the perimeter size, the perimeter area just turned back into an image. Two more real easy ones: you can get cluster width and cluster height by just doing the same kind of reduction on top of a lat/lon image. We've got another image where every pixel's value is its latitude and longitude, so you can get the width of each cluster by just taking the min and the max of the latitude and longitude inside each cluster. So there's the width and the height, and now you've got a lot of the base stats that are used in FRAGSTATS: you can do perimeter-to-area ratio, you can do width-to-height ratio for compactness.
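Stepping back to the outlier-reseeding step from a moment ago: the logic (per-cluster spectral distance to the cluster mean, worst pixel becomes a new seed) can be sketched outside Earth Engine in a few lines of plain Python. The pixel layout and spectra below are invented for illustration; this is not the Earth Engine implementation.

```python
import math

# Toy data: each pixel has a cluster id and a spectral vector (e.g. R, G, B, NIR).
pixels = [
    {"xy": (0, 0), "cluster": 1, "spec": [0.1, 0.4, 0.1, 0.6]},
    {"xy": (0, 1), "cluster": 1, "spec": [0.1, 0.5, 0.1, 0.6]},
    {"xy": (1, 0), "cluster": 1, "spec": [0.8, 0.8, 0.7, 0.2]},  # a road pixel caught in a field cluster
    {"xy": (1, 1), "cluster": 2, "spec": [0.2, 0.3, 0.2, 0.5]},
]

def cluster_means(pixels):
    """Per-cluster mean spectrum."""
    sums, counts = {}, {}
    for p in pixels:
        c = p["cluster"]
        counts[c] = counts.get(c, 0) + 1
        s = sums.setdefault(c, [0.0] * len(p["spec"]))
        for i, v in enumerate(p["spec"]):
            s[i] += v
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def worst_outlier(pixels, cluster):
    """Pixel in `cluster` with the largest Euclidean spectral distance to the cluster mean."""
    mean = cluster_means(pixels)[cluster]
    members = (p for p in pixels if p["cluster"] == cluster)
    return max(members, key=lambda p: math.dist(p["spec"], mean))

print(worst_outlier(pixels, 1)["xy"])  # → (1, 0): the road pixel stands out
```

In the talk's workflow, that worst-outlier location would then be added to the seed set and the clustering re-run.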
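The min/max-of-neighbors perimeter trick itself (done in Earth Engine with reduceNeighborhood) can be sketched in plain Python on a toy label grid. The grid is made up; interior pixels see only their own label in their 8-neighborhood, so min equals max, and only edge pixels survive.

```python
def perimeter_mask(clusters):
    """True where the 8-neighborhood min and max of cluster labels differ (edge pixels)."""
    h, w = len(clusters), len(clusters[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [clusters[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))]
            # Interior pixels see only their own label, so min == max there.
            edges[y][x] = min(vals) != max(vals)
    return edges

clusters = [
    [1, 1, 1, 2],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
mask = perimeter_mask(clusters)
for row in mask:
    print(row)
```

Summing a per-pixel area value over the True pixels of this mask, per cluster, gives the perimeter-area statistic described above.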
My goal, when I'm not talking to you guys this week, is to actually just work my way through FRAGSTATS and see how many of those metrics I can knock out entirely in image space, and then hopefully write a paper. I don't assume nobody has done this before; I'm not so smart that I think I invented this, but we'll see just how many of them are easy to do. The one I'm working on right now is contagion, so we'll see.

[Audience question] Do you have a seed pixel for every one of these? For every one of these there is a seed pixel that is the origin of the cluster. It's not the spatial center, it's not the spectral center, but it is guaranteed to always be in that cluster; the cluster can never lose it. So if you want to just go get all the per-cluster parameters for some reason, you can simply sample every cluster at the seed location, and the seed band is included in everything. It's one call: you sample the whole image, all the pixels that aren't seed pixels are immediately masked out, because sample doesn't let you work on partially masked pixels, and then the seed pixel is sampled with whatever stats you've got.

[Audience question] This one? This one was essentially seeded at all of the gradient minima and all of the gradient maxima, and it can get crazy. But one of the reasons to do this is that my original image is, I don't know, n million points, while this is only seven thousand clusters. So if you're doing something that is limited by the number of points you can put into it, like a table or any of our table-based operations, this lets you convert. This is a crazy image, right? I would not ever advocate that these are good clusters, except that each one is representing a hundred pixels, and through this mechanism there's a pretty good suggestion that they're homogeneous. So I can represent those hundred pixels by one and get a hundred times less data to then have to do things with.
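The sample-at-seed compression described above can be mimicked in plain Python: every pixel in a cluster carries the same per-cluster statistics, so keeping only the seed pixels loses nothing and shrinks N pixels down to K cluster records. The records and field names here are invented for illustration.

```python
# Toy clustered image: per-cluster stats have already been broadcast to every pixel,
# and exactly one pixel per cluster is flagged as the seed.
pixels = [
    {"cluster": 1, "seed": True,  "mean_ndvi": 0.71, "area_m2": 980.0},
    {"cluster": 1, "seed": False, "mean_ndvi": 0.71, "area_m2": 980.0},
    {"cluster": 2, "seed": True,  "mean_ndvi": 0.12, "area_m2": 1450.0},
    {"cluster": 2, "seed": False, "mean_ndvi": 0.12, "area_m2": 1450.0},
]

# Masking out everything that is not a seed leaves one row per cluster:
table = [p for p in pixels if p["seed"]]
print(len(table))  # → 2 rows instead of 4 pixels
```

This is the same move the talk describes for turning millions of pixels into a table of a few thousand clusters that table-based operations can handle.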
So, I kind of did that; I don't think there was a seed grid on that one at all. Yeah. Okay, any other questions?

[Audience question] Yeah, that's fine; this gets into a kind of minimum-mapping-unit discussion. You want the seed spacing to be pretty close to the size of the things you're hoping to find. If we're hoping to find roads, this was way too big a seed setting, but we're really trying to find these... what is this, forty? These are forty acres, probably, is what these blocks are; they're relatively big farms. So you'll note that my seeds are roughly about the size of my farms, but not because I was principled in that decision; it was more like, yeah, that's a good number.

Oh, there's a whole thing I've got to tell you about. These algorithms, and several other algorithms in the system, like connectedComponents, connectedPixelCount, and quite a number of others, are entirely pixel based. So if I simply change one zoom level, I'm still using 32-pixel seed spacing, but I'm suddenly working on twice as high resolution data. It is often the case that this is fine; this all works out great if you end up not zooming. Just don't zoom and you'll be fine. But sometimes you want to zoom in: what's going on here, on the edges? Is that really where I want it to be? And you want to zoom in on that rather than having it recompute at each new zoom level. That's done by forcing a reprojection, and this is the use case that reproject was invented for. People abuse it all through the system, but this is what we invented it for. You can say: do this entire computation in this projection, and then as I zoom in on it, it will stay in that projection and I get to see it. The important part, and this is the big giant caveat for Earth Engine: reproject always happens last. Write all your computations, which isn't actually doing any computation, and reproject the final thing right as you're about to
display it, and then we propagate that final projection backwards through all those computations. It's super unintuitive, but that's just the way our engine works: we reproject last to make all the computations happen in that projection. You won't remember that until you zoom and things get weird, and then hopefully you'll remember it.

[Audience question] Yeah, you're always using whatever is on the map unless you've specified otherwise. When you inspect, there's actually something that tells you what zoom level it was using; it was using 76-meter pixels or something like that, and that's the answer you're getting. But if you're inspecting something that's been reprojected, it's nearest-neighbor duplication, so you can inspect at 76 meters even though you were processing at 2 meters and it'll work out. Or the other way around: viewing at 2 meters, run at 76 meters, you're just inspecting one of a bunch of repeated pixels. Okay? Yep.

[Audience question] Yes, the perimeters. It's a one-pixel perimeter inside each cluster, right next to another one: this one's orange, this one's purple. There are two; they're always two wide, because there are always two clusters touching each other, but one wide in each one. And this is done with eight-way connectedness, so in certain circumstances you can actually end up with one-and-a-half-pixel-wide borders, because the eight-way connectedness allows that to happen. You can specify four-way connectedness if you want, so that you can only ever have one pixel of thickness, but then you can get two pixels that don't actually overlap; they just touch on the corner. So it's a trade-off between the two; in reality, at this scale, neither of those matters. Any other questions?

[Audience question] I sum the pixels that are left, using the same summing technique I used here: give every pixel its own area, then sum up all the areas under the cluster. This is the same thing, but I throw away all the pixels that aren't on the edge, so I'm just summing the area of
the pixels that are left. Yep, and I can divide by the square root of the pixel area and get a length, but there's a whole fractal thing involved here: what does length really mean in pixel space? The closest I can get is, and I think in fact I did divide by the square root specifically, an approximation of length all the way through this. You want the wiggly ones? Every seed pixel initiates a new object.

[Audience question] No, and that was the future-work slide. The big piece that's missing here, the thing that keeps us from having more or less perfect clusters: this whole process involves growing and splitting and merging. What I want to happen, and this suddenly becomes not SNIC anymore, I'd need to call it something else, because SNIC doesn't do this: as these pixels grow and they start encountering pixels that don't look like what you've got, you want them to split automatically and make something else over there. That's easy; I'm an engineer, I want simple steps, and that's the next simple step. And then, once you've got them all, or more likely as two regions end up touching each other and they say, oh, we're really the same thing, you want them to merge. Splitting and merging is how segmentation is done; I just haven't folded it into these algorithms yet. The next piece of work is figuring out how to do that without a pile of knobs. I could just give you a splitter and I could just give you a merger, but then there's a whole bunch of knobs, and you guys hate that; remote sensing scientists hate that. So if I can boil it down into something really simple... SNIC is really easy to use: it's three parameters, four if you really want. It's super simple. That's my job, right? That's why I get paid the big bucks: to figure out how to boil this down to the simple parts of what you need for merging and splitting. So that's what I'm doing.

Yeah, so, you're not supposed to see that. Don't look. I should go all the
way back to my warning sign with the barricades. So the reason I was really excited about SNIC, let's go back to here, is that I can make this work without tile boundary effects by simply expanding my neighborhood and doing the calculations, etc., etc. No tile boundary effects in here at all. As an engineer I think: job done, no bugs. Turns out, not so much. Running SNIC twice, with another step here in the middle where other things happen, somehow I'm getting tile boundaries that should not be there, like all of this business here. That's a fake tile boundary; it shouldn't be there. I've got a bug somewhere in there; some part of the tile processing is exposing it, and Mike and I will spend an entire day together tracking down what's probably a one-character typo to find that tile boundary bug. But yeah, there are a couple. Any other questions?

[Audience question] Width and height, yes. A thing that I think I can do is maximum length inside of the cluster. I don't quite know how to do that yet, but I think I can figure it out. Road networks specifically are gonna be pretty tricky here, because you're going to end up with very long features, and that just doesn't work in image space, so they're gonna have to be broken up, and how the breakup happens, I'm waving my hands as much as I can out here. I don't know how that's gonna work, not yet.

That's what I'm doing for contagion right now. I have figured out a way to make contagion work, with one caveat: I need to add something to make it work. You can compute, for any pixel, what all of the neighbor values are. So for this pixel here I can say: enumerate all your neighbors. Go do reduceNeighborhood with a toArray reducer or something like that, collect all the neighbors into something, and then I can distinct them. That's what I need to do for contagion. For all the pixels in the center here, all the neighbors will be zero or self,
and so you can erase them. So by aggregating over all of the pixels in the perimeter, where each pixel knows all of its neighbors, I can know all the neighbors of my cluster.

[Audience question] Yeah, that's the general thought: doing SNIC-style growing. It's all in one place, and I have all that information at the same time, so I'll fork SNIC into some kind of standard region-growing thing, and then I can do that. Like I said, it's all about figuring out how to boil it down to the bare minimum you need to know to say: merge in this case, split in this case. Maybe that's just a standard deviation threshold: if the standard deviation is below this, merge; if it's above this, split. That seems pretty simple. If any of you have used our existing region-growing algorithm, which is in test and has been in test for five years, it's kind of hard to use, because it's got a fair number of parameters that are context dependent: if you're working on a five-band image you need different numbers than if you're working on an eight-band image. So I've learned that isn't going to work for you guys, but maybe this standard deviation option will get us there.

Okay, so that's the tools: SNIC, reduceConnectedComponents, and some others around them. They are in two places; I meant to have a summary slide here so I didn't have to do this. Some are just on images, so you can do image.spectralDistance straight away. SNIC and its followers, and the other stuff that I'm producing around that, are in a new section called segmentation, ee.Algorithms.Image.Segmentation, and they just accumulate in there. If you are one of our users who reads the changelog on a weekly basis, you might have already seen that I released k-means and G-means segmentation options in there as well. Those are getting filled in; you should mentally change the word "segmentation" to "test" for a little while, because that's where I'm putting these things as I develop them and try to make them work. SNIC is pretty
robust, other than this one bug that I've got to fix about tile boundaries, so if you were to use it and publish a paper, you can expect it not to change dramatically. I found an issue with k-means and G-means, so if you start using those, next week they're gonna change. But if you're an Earth Engine user, you already know that things change a little bit underneath you, hopefully always for the better. We try to announce those, so if you aren't reading the changelogs, you might want to read the changelogs to see when things change. Or, as I've just told you, those things in that segmentation bucket are gonna change for a little bit as I try to make them better.

Okay, so that's my talk. Thank you for coming. But I actually really want to know what you want to do with this, and more importantly, if there are segmentation things that you're already doing that you think work better, let me know; I want to try to make those work for everybody. If there are stats besides the ones I showed, those were easy, area and width and height were easy, but there may be things I haven't thought of... again, I'm super proud of this one. I'm literally gonna go get the FRAGSTATS package and see if I can replicate all those things in image space. And if everything you want is in FRAGSTATS, okay, we don't have any more conversation to have, but that's probably not the case. So if there are more things that you think you want to do with object-based analysis that I have not mentioned here today, that means I'm not thinking about it yet, and I would like to think about it, so that I can get you the tools you need.

[Audience question] Yeah, so I skipped that; I talked about it the first time I gave this talk and didn't get a chance to this time. The answer is yes and no. These tools are designed to work on spectra: when you read the literature, they're expecting you to have four reflectance bands, and you're doing
some kind of spectral distance on reflectance bands. Earth Engine doesn't care. So if you've got an image with a bunch of temporal composites in it, this will attempt to do spectral distance in time, and I'm excited about that. I think there's a whole bunch of interesting work there that is free, if you can build the temporal composite. And if you can't build a temporal composite, I just did that in my class right before this one, so come to my next one and I'll show you how. Building temporal composites is super easy, and then you can do all this stuff as a difference-in-time measure. No, I haven't actually made any of these sorts of images from that, but I want to. Like I said, I was literally working on this Monday night; actually, even worse, I was back there working on this during Rebecca's talk, and the first time I gave this talk was ten minutes later. So as soon as I get a chance, I want to go do this on a bunch of temporal data that I've already got built and see what I can get out of it. I think there's a lot of easy stuff that happens that way. Does that make sense? You just stack up temporal data, and then the per-band stuff happens automatically on it.

[Audience question] I thought you were asking a different question, about scale dependence. I don't know how to do that; where do you get that information from? Yeah, I don't have that information either, so I don't know how to do that. Almost everybody... I go out and I talk to scientists, I go to people's labs and I help them solve their problems, and every single scientist that I end up talking to is almost certainly data limited. They want to do that, but they don't have that data. I don't have that data either; I don't know who has that data. So: awesome idea.

[Audience question] Are you talking about, not spatially now, just: here's a bunch of farms, find the ones that are more similar than others? That's a straight-up cluster. You don't even need to do that. All you need to
do is, here where you've got per-pixel things or per-cluster things, you can now apply all our other tools on the seed points. So you could simply k-means cluster the clusters. That's part of the point of Earth Engine: get everything into a standard shape so that all the tools will work on whatever is there. K-means clustering these results: trivial. Classifying these results: we showed you classifying. That's the whole point, to let you do those not on single pixels but on whole clusters at once. You can still stay in image space; all those things work in image space as well. You're just masking off every point that's not under a seed, so instead of working on a hundred billion points, you mask out all the ones you don't want and you're left with a hundred thousand points, or fifty thousand points, and it works the same way. Anybody else? Anybody using object-based analysis and going, man, this is not the right thing, I need...?

All right, yeah: everything you've seen is in one script, which is in that repository, which is at that link. So if you've got the presentation, you've got that link; if you take a picture of that, you've got that link; if you send me an email and say, where was that, I will send you a link to my repository. As I mentioned, nothing is easy yet, but as I was pointing out, it's one or two lines to get all the cluster stats per pixel, and then you can just aggregate them together over the cluster. Yeah, so I will stick that example in there. Contagion: I have to do this for contagion, that's the next one I'm working on, so at the bottom of the script, in a day or two, you'll see a section on how to compute contagion, and that is: get me all my neighbors, and then the frequency of all my neighbors. So it'll work, or I will make it work.

[Audience question] Interesting: will this work on things that didn't go through SNIC? Basically, this will work just fine. reduceConnectedComponents, where is it, there we go: reduceConnectedComponents just
wants a map of clusters; it doesn't care where it came from. So here, if I'd uploaded this image, I wouldn't have the seed points; that's fine, it just uses everything in that homogeneous region. So yeah, you don't have to get the clusters from us; you can get them from somewhere else. You could in fact use the Cropland Data Layer directly to do connected-components work with this, and it would work.

[Audience question] Yes. Jeff Cardille from McGill University has a hackathon on segmentation. He wants to help you build a good segmentation for whatever it is you're looking for, but he has a meta-goal of trying to collect what good segmentation parameters are for different use cases. So if you're in forestry, you might have a completely different set of clustering and segmentation parameters than if you're doing water quality or something like that, and he thinks there would be some interesting things to learn in comparing the across-application results. I think he's silly, but we'll see.

[Audience question] I don't have a quantitative answer; I haven't seen a good description of what earth mover's distance does as a picture, but I can show it to you and simply point out how it looks different. Here's an example segmentation run, and I think earth mover's distance is still in here... yeah, right there, EMD. So while this thing's running away, we will zoom in. That's the same picture you've been seeing over and over; I'll turn off all these and load up the gradients and the earth mover's distance. So there are the gradients, with slightly different stretches than you saw, but you should be able to relate them to what you just saw, and then there's the earth mover's distance. Among other things, it's quite bright, much brighter, but it really works out differently. I mean, we can fiddle with these stretches so that it kind of comes to the same normalization, and one of the things that's quite different is that these inter-cluster
things stand out a lot more. But all I can do is wave hands here. The connectedness is good; the connectedness is actually a little better than before in some cases. I don't know, it just looks different. If I throw that one in with any two of the others as a three-color composite, it always stands out, regardless of how the stretch works out. So it's just a different kind of measure. No, I'll go with no: I can't answer that question, which I hate. All right, anybody else?

[Audience question] Oh, sure. Apparently I'm out of time, and my advisor is sitting here in the first row. Directly, none of it; my dissertation is not about my techniques. Indirectly, maybe a lot of it. That's a long conversation, but my dissertation is about your techniques and how they're all the same. That's the entire elevator pitch; it's as far as I've gotten. That is the one sentence of my research proposal. Part of my job is trying to make it easy for you guys to do this stuff: if I can boil this down into three lines and you've suddenly got clusters with all the FRAGSTATS metrics, you can go worry about conservation and not have to worry about how to do the computation. That's my job. But it's also interesting as a research topic: the things you want as a forest conservationist are actually quite similar to the things you want as a water quality manager, which are very similar to the things you want as a poverty mapper, and that's what my dissertation is all about. Well, I actually said that in front of a crowd of a hundred, so I guess I'm committed now, huh? Anybody else? What am I missing? What do you need that I didn't do?

[Audience question] Yep. If you wanted to beat down the signal-to-noise, beat down the noise, on a simple cluster: this step here, where you've found a bunch of clusters and replaced the bad ones, I think that's a quite reasonable technique. You can just use every
one of these clusters to beat down the noise by a factor of, in this case, a hundred, because there are, you know, a thousand, ten thousand points in some of those clusters. Maybe not ten thousand, but you can average together a whole bunch of points for a much higher signal-to-noise ratio, if that was the limiting factor you had. If you're working with an instrument where your signal-to-noise is a limiting factor, I'm interested, because that tends to not often be the case. Unless you're, say, trying to distinguish between two species of similar trees in the Brazilian Amazon; then signal-to-noise, or a piece of the signal-to-noise, is interesting. And since you asked about my dissertation: I'm always interested in what you're doing, one, so I can help you do it, but two, so I know about the breadth of what people are trying to do. So yeah, like I said, if you've got a signal-to-noise-limited classification problem, I'm super interested in at least knowing what that is, and then maybe I can help you solve it. If you're looking for clear-cutting, it's fine, but if you're looking for long-term degradation, you're right, it's much lower signal-to-noise than you want. But that's interesting to me.

Okay, so a feature of this talk being new is that it's short, which means you get an extra 20 minutes in the coffee line before anybody else gets there, so we'll call that a bonus. Thanks for coming, and please let me know if you're able to use these; I'm very interested in that. If you're able to use them for a specific application, I'm even more interested, and where they don't work for you, I'm really interested, because my job is making them work for you. So let me know where it's not working. Thanks for coming. [Applause]
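The contagion ingredient discussed in the Q&A above, enumerating each cluster's neighbors from its boundary pixels, could be prototyped like this in plain Python, using 4-way neighbors and an invented label grid (Earth Engine would do the equivalent with reduceNeighborhood and an aggregation over the perimeter pixels):

```python
def cluster_adjacency(clusters):
    """Map each cluster label to the set of other labels it touches (4-way)."""
    h, w = len(clusters), len(clusters[0])
    adj = {}
    for y in range(h):
        for x in range(w):
            c = clusters[y][x]
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and clusters[ny][nx] != c:
                    adj.setdefault(c, set()).add(clusters[ny][nx])
    return adj

# Toy label grid with three clusters.
labels = [
    [1, 1, 2],
    [1, 3, 2],
    [3, 3, 2],
]
print(cluster_adjacency(labels))  # → {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
```

Counting how often each neighboring label occurs, rather than just collecting the distinct set, gives the adjacency frequencies that contagion is built from.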
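The square-root-of-N signal-to-noise claim made in the talk (averaging a thousand pixels beats the noise down by about the square root of a thousand) is easy to check numerically. The noise level and block size below are invented for the simulation:

```python
import random
import statistics

# Simulate 10,000 noisy measurements of the same true value,
# then average them in blocks of 100.
random.seed(42)
true_value = 0.5
pixels = [true_value + random.gauss(0, 0.1) for _ in range(10_000)]

block_means = [statistics.fmean(pixels[i:i + 100])
               for i in range(0, len(pixels), 100)]

noise_single = statistics.pstdev(pixels)       # ~0.1, the per-pixel noise
noise_mean = statistics.pstdev(block_means)    # ~0.01, i.e. roughly sqrt(100) = 10x lower
print(round(noise_single / noise_mean, 1))
```

The ratio comes out near 10, matching the sqrt(N) rule for averaging N independent samples.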
Info
Channel: Google Earth
Views: 7,191
Keywords: earth engine, mapping, maps, remote sensing, geospatial analysis, google earth, google maps, google cloud science, data, trends
Id: 2R0aTaMtYTY
Length: 71min 24sec (4284 seconds)
Published: Tue Aug 21 2018