OTB QGIS Machine Learning Overview 2018 0126

Video Statistics and Information

Captions
Hello, Frank Obusek here again. Today I have QGIS open. I'm running a Windows 10 operating system with QGIS 2.18.5, and I have my nice color-infrared image here that I typically use to test out workflows and so forth. Today I'm going to be looking at the functionality inside the Orfeo ToolBox (OTB); more specifically, I've been working towards understanding the machine learning algorithms that are available in it. There are a variety of algorithms dedicated to machine learning that you can use in a typical supervised land cover classification approach. In the same way that you would collect training data and apply a supervised land cover approach in the traditional sense, you can collect training samples of your classes of interest, train on your imagery, and then use one of these machine learning algorithms.

I also noticed that there have been some messages and issues posted to the OTB user message board about access to OTB functions inside QGIS. What I usually do is go to Processing, then Options, and then come down here to Providers. This appears to be a little different depending on the user, but in the Orfeo Toolbox section I click on Activate. Right now I have my OTB files located in a specified directory, OTB 6.2.0. The applications folder is the lib/otb/applications subdirectory, and the command-line tools folder is the bin subfolder of whatever your OTB folder is. As long as you have those two settings pointed to those two locations, the functionality here works pretty well. I actually had some issues trying to figure that out, because it didn't seem to be working initially.

Going forward, I have some training samples here. I've collected water, impervious, bare ground, trees, open, and shadows. This is four-band NAIP imagery near Charlotte,
North Carolina (north of Charlotte, North Carolina, USA), and since it is leaf-on imagery it has a lot of shadows. We'll just zoom in here and look around. You can see some of my impervious surface samples here, in kind of this rust color, water samples, tree canopy samples, and what I'm calling open, which is basically grassy areas. I typically go across my image and make sure I collect a good variety of samples. These purple polygons are my shadow samples. If I go over to my training data set, open up my attribute table, and sort by label or by class, you can see I have about 37 samples, maybe four or five of each of my features. Initially the idea here was just a test of some of these algorithms, but some of the results are actually pretty surprising and pleasing, so I thought it would be worth showing a little video.

What we do here is use this icon, create new shapefile. You make it a polygon, you make sure the reference system is set, and then you add your fields. My label field, which holds the numbers, is an integer field, and then I have a text field, class, where I actually write out which class it is. I go ahead and set up my training data that way. Then I go into edit mode, select the polygons that I want to use for training, and save them. Those are my training samples, so I can turn off editing.

So the way this works: I saw a nice video on YouTube, "OTB and QGIS, a nice wedding" or something like that. It's a great video for getting the idea of how to do supervised training inside QGIS using the OTB tools. The video shows that the first thing we want to do is compute stats. All we're doing is computing stats and saving them as an XML file, and then the next thing I want to do
is choose the classifier we want to use. We have the support vector machine, the random forest algorithm, and LibSVM. So we have two SVMs here: one is based on LibSVM and the other is based on OpenCV, the open computer vision library. Then we have k-nearest neighbor, a boosting algorithm, a decision tree, another boosting algorithm, our Bayes classifier, and then artificial neural networks. If I click on any one of these and go into the help, you can see some of the explanations, and the help for any one of them covers all of the different algorithms. Every one of the help dialogs looks the same as long as you choose one of these TrainImagesClassifier tools.

With that, I identify my input image, then my vector list, and then I input the stats file that I created with the compute images second order statistics tool. One thing I've noticed: if this "on edge pixel inclusion" option is checked, and by default it is, I go ahead and turn it off. You also want to make sure that you're using the right attribute field to determine the classification, and this has to be the integer field, so in my case it's label. Then here we select the classifier we want to use. Since I selected LibSVM, there is only one option there, but underneath each one of these you may have a few options. In another video I think I might show that the RBF kernel turned out the best out of all the LibSVM functions.

Then you save a CSV file, which is your confusion matrix, and a model file. Those are the extensions I've been using: for the confusion matrix it's whatever name you give it plus .csv (comma delimited), and the model file is simply .model. When you actually set that up, it runs the training, and then you use
the image classification algorithm. Here you can see that it's going to ask for your model. You input the image you want to use; if you have a mask you can add it, but you don't have to select one; then you input the model file you just created, then your statistics file, and then you save your output. Here's what the model file looks like, .model, and here's the confusion matrix. I would select my model file, name the output, and run it.

What I've done here is produce several different outputs from the machine learning algorithms, in the traditional sense of a supervised classification approach. I'm just going to zoom in on an area so you can see some of the differences. I started at the bottom and went up the list. If we look at the SVM options, I first used the linear option, and that's what it looks like. Overall it looks really good, but when you look at it closely you'll notice some of the differences. Here's the RBF option that I used; clicking it on and off you can see some subtle differences, but they're not huge. Then I used the poly option and then the sigmoid option. Those are the four different options. The sigmoid looks like it's bringing some of my water pixels in on top of my impervious pixels, so that wasn't that great.

Here is random forest. One thing I've noticed about random forest is that, especially in the tree canopy, there's a lot of shadowing and smaller polygons within the overall canopy, and if you use a decision tree it fills in a lot of those holes. The other thing about this particular random forest example is that it really missed the road. With random forests there is such a thing as overfitting and underfitting, and I'm just using the default
parameters for most of these, so that's why we're seeing some of this. Then I have the k-nearest neighbor output, and it's kind of grabbing some of the blinds; I'm sure I can go in and make some modifications.

Then I'm going to come down here to the boosting one that I used. All of these other algorithms took anywhere from 30 seconds to a couple of minutes, and this image is a 190-megabyte TIFF, so that gives you an idea. But this GBT (I had to grab a file real quick) is a boosting algorithm. I'm just scrolling down; I have a primer here that I basically copied and pasted from a really good website. GBT is a gradient boosting algorithm, and it looks like it combines several of the algorithms and uses just a tree classifier. If you want some really good information about general machine learning algorithms and what the differences are, type "Analytics Vidhya" into a Google search, maybe with "machine learning", and you'll find this information. It's a really good website that talks about each one: here's decision trees, here's support vector machines. It's a great primer and a great way to get familiar with the details of machine learning. But this particular process took four to five hours, which I wasn't expecting when I ran it. But there it is.

Out of all of these, the boosting algorithm and the LibSVM RBF option were probably the two best for this particular example. I ran another example, and I have to say that random forest and decision trees were actually better in that one. So I just wanted to do an overview of the machine learning algorithms that are available in Orfeo ToolBox. There are several good ones in there. I just did a general classification here of some basic classes using a four-band color-infrared image, and I want to probably
create another video, because when I go through classification I like to use derivative layers to help answer some questions about pixel confusion. If I do that, I will post that video shortly. I hope that was helpful and you got something out of it. Have a nice day, and we'll talk to you soon. Thank you.
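
The three steps walked through in the captions (compute image statistics, train a classifier, classify the image) can also be sketched outside the QGIS dialogs using OTB's command-line launchers. What follows is a minimal sketch, not the presenter's own procedure. The application names and flags are the ones OTB 6.x is believed to expose and should be confirmed with each application's -help output. All file names and the helper function names are hypothetical. On Windows the launchers in the bin folder may carry a .bat extension.

```python
import subprocess
from typing import List


def stats_cmd(image: str, stats_xml: str) -> List[str]:
    """Step 1: compute per-band statistics and save them as XML."""
    return ["otbcli_ComputeImagesStatistics", "-il", image, "-out", stats_xml]


def train_cmd(image: str, vectors: str, stats_xml: str, model: str,
              confmat_csv: str, label_field: str = "label",
              classifier: str = "libsvm", kernel: str = "rbf") -> List[str]:
    """Step 2: train a classifier from training polygons.

    label_field must name the integer attribute, as noted in the captions.
    Flag names are assumed from OTB 6.x; verify them with -help.
    """
    cmd = [
        "otbcli_TrainImagesClassifier",
        "-io.il", image,
        "-io.vd", vectors,
        "-io.imstat", stats_xml,
        "-sample.vfn", label_field,
        "-classifier", classifier,
        "-io.out", model,
        "-io.confmatout", confmat_csv,
    ]
    if classifier == "libsvm":
        # Kernel choice (linear, rbf, poly, sigmoid) applies to LibSVM only.
        cmd += ["-classifier.libsvm.k", kernel]
    return cmd


def classify_cmd(image: str, stats_xml: str, model: str,
                 out_image: str) -> List[str]:
    """Step 3: apply the trained model to the image."""
    return ["otbcli_ImageClassifier", "-in", image, "-imstat", stats_xml,
            "-model", model, "-out", out_image]


def build_workflow(image: str, vectors: str, prefix: str) -> List[List[str]]:
    """Return the three commands in order, with outputs named from prefix."""
    stats_xml = f"{prefix}_stats.xml"
    model = f"{prefix}.model"
    return [
        stats_cmd(image, stats_xml),
        train_cmd(image, vectors, stats_xml, model, f"{prefix}_confmat.csv"),
        classify_cmd(image, stats_xml, model, f"{prefix}_classified.tif"),
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in turn; stops at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling build_workflow("naip.tif", "training.shp", "libsvm_rbf") only builds the argument lists, so they can be printed and checked before anything is executed. Passing the result to run_all requires the OTB bin folder to be on the PATH.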
Info
Channel: GeoFranker
Views: 22,025
Rating: 4.9652176 out of 5
Keywords: OTB, QGIS, Machine Learning, Orfeo Toolbox, libsvm, random forest, decision tree, Gradient Boost, Frank Obusek, GeoFranker, remote sensing, image processing, image analytics, OpenSource, Supervised Classification, Artificial Neural Networks, K Nearest Neighbor
Id: emoGMibsgv0
Length: 14min 13sec (853 seconds)
Published: Fri Jan 26 2018