RUS Webinar: Crop mapping with Sentinel-2 - LAND01

Captions
Welcome again, everybody, to another RUS webinar. My name is Miguel Castro, and today I will be guiding you through this quite interesting topic, which is crop mapping with Sentinel-2 data. Before starting, let me go very quickly through some organizational details.

First of all, what can you expect from this webinar? During the next hour, more or less, you will learn two main things: first, how to do a crop mapping analysis with Sentinel-2 data, and second, what the RUS service is and how it can help you when working with Sentinel data. For this exercise we will combine both RUS and the Sentinel data, and we will download, process, analyze and visualize the free data acquired by the Sentinel satellites. If it is the first time you are here with us at RUS, I will show you where you can find all the past webinars and videos and, most importantly, how you can repeat the exercise on your own. It's very easy. That's all for this very brief introduction, so let's get started and have a look at the outline for this webinar.

We will first have a look at the study area and understand the characteristics of the location we are using today. We will then cover very briefly the remote sensing background and understand how satellites can help in agriculture and in crop mapping. We will then describe the RUS service, and we will learn what the challenges are nowadays when working with remote sensing and why RUS is a great alternative to solve all those problems. Then we will use the RUS service to perform our exercise. At the very end we will have some time for Q&A, more or less 20 to 30 minutes, but since we are quite a lot of people here together for this webinar, I will ask you to please send your questions as soon as you have them. I'm here with a couple of colleagues who are ready to answer your questions, so don't wait for the Q&A session: send them to us through the question option that you have in the GoToWebinar tool and we will reply to you right
after. As you can see, the duration of the webinar is going to be one hour 30 minutes, more or less. Let's get started.

Today's exercise is going to be in the south of Spain, in a region called Andalusia, and more specifically south of its capital, Sevilla. This area is known as the Guadalquivir marshes and represents one of the last territories occupied by man in this region. Settlements were not definitive until 1941. The expansion of specialized agriculture led to a complete transformation of the area. Nowadays 35,000 hectares are used to grow rice, with an average production of 8,500 to 10,000 kilograms per hectare, so it's pretty intensive here. Other crops such as wheat, cotton, sugar beet or sunflower can be found as well.

Let's have a very quick look at some historical images to see the creation and evolution of this agricultural area. In this first image, from 1945, we can already see the presence of agricultural fields in some areas, but water bodies are also still present; we can see some water bodies here in the south of this area. The next image, from 1956, already shows clear agricultural patterns, and from this moment on, the images from 1977, 2008 and later 2017 with Sentinel show the consolidation of the agricultural activities. The blue line is just a reference so you can compare.

Knowing that, let's now move on and check the remote sensing background behind the use of satellites for crop mapping. Reliable information on crops is required to improve agricultural management, reduce its environmental impact and face food security challenges. Different methods can be used to gather this information, but satellite Earth observation techniques offer a suitable approach based on the coverage and type of data that are provided. The imagery from the Sentinel satellites enables a new approach for agricultural monitoring, and the combination of the temporal, spatial and spectral resolution, together with relevant analysis, can
lead to improvements in the decision-making process. For today's exercise we will be using Sentinel-2 data, and more specifically the multispectral data provided by the Sentinel-2A satellite. In case it's the first time you hear about them, the Sentinel satellites are part of the space component of the Copernicus programme of the European Union and the European Space Agency. Once completed, this programme will be formed by six constellations of two satellites each, with a range of technologies from SAR to multispectral imaging. And last but not least, in case you don't know, the data are completely free to any registered user, so there is no charge for those images.

As you can see in this slide, I just want to highlight two main characteristics of this satellite: the revisit time, which is five days at the equator, and the different spatial resolutions that we can find, from 10 meters to 60 meters. Combining those spatial resolutions with this great revisit time, we can improve and enable new applications with satellite remote sensing, such as monitoring activities for agriculture, etc.

Knowing that about the data we are going to use for this exercise, let's now have a look at the RUS service and understand what RUS is and what it offers to the Sentinel community. First of all, RUS stands for Research and User Support for Sentinel Core Products. It is an initiative funded by the European Commission and managed by the European Space Agency, with the objective of promoting the uptake of Copernicus Sentinel data and supporting research and development activities. The service provides a free and open scalable platform in a powerful computing environment, hosting a suite of open-source toolboxes pre-installed on virtual machines, which allow you to handle and process the data derived from the Sentinel satellites. What does that mean, in other words? Well, with the large amount of data produced by the Sentinel satellites, the challenge is no longer data
availability but rather storage and processing capacity. So RUS offers virtual machines so that you have the appropriate computing environment to handle the data. But that's not all: in addition, RUS also provides a specialized user help desk to support your remote sensing activities with Sentinel data. In case you are working, for example, as today, on crop mapping in an area, and you're not sure how to use the data or how to solve specific problems, you can contact our remote sensing experts and they will come back to you with help and support for your projects. Besides that, we also have a specific training plan and specific training activities such as this webinar, and we also have face-to-face events where we train users on different applications.

You can find all the information about the RUS project and the RUS service on two main websites: rus-copernicus.eu and rus-training.eu. On the first one you can find all the information about the project; the second one contains mainly the information about our training activities. Let's go very quickly to those websites and have a look. If we go to RUS training, we can see here the homepage, and something I wanted to highlight is this Training tab. If we click on it we can access the past trainings, and this is very relevant because, in case you have missed one of our past webinars, such as the one that took place in December on burned area mapping, you can come here, click on the webinar, and you will be able to see the video. You will also have a Q&A document where we summarize all the discussion that usually takes place during the transmission of the webinars, and of course some information about how to repeat the exercise. But don't worry, I will tell you in a few minutes exactly how to do this. Something else to mention is that we have recently launched
our e-learning material, so here you can find courses and videos in an interactive way, so that you can learn more about Sentinel, about radar or optical data, etc. So this is our RUS training portal. If we go now to the other one, which is rus-copernicus.eu, we see here all the information concerning the project. From here I just want to highlight two things. If we go to the RUS offer, and more specifically to the computing environment, we can see in more detail the virtual machines that RUS offers. As I said, we are offering cloud computing power so that you can work with the data, and here you can see the specifications of those virtual machines, for example the list of software that is already available. This software is already installed and ready to use, so there is no need for an installation process. For example, we see SNAP, and we can also see Python, R and many other packages. Remember that in the RUS virtual machines you have full administration rights, so you are able to install any software you want, either open source or proprietary, but remember that in the latter case you need to provide the license yourself.

So that is the software we provide. As for the specifications of the machines, depending on your project and your needs you will have a different virtual machine with a different storage capacity and a different processing capacity. Here you can see all the details. Just to mention that in case you have a very large project that requires a lot of resources, there is an option to cluster the virtual machines and have more capacity.

Knowing that, let's have a look now at the process of registering for RUS and applying for those virtual machines. Today's exercise will be done in one of those virtual machines, so I just want to show it to you so that you know about it and can maybe use it in your own projects. I think I didn't mention it, but remember that the virtual machines are free, so there is no cost for
this. If we go to the upper right corner, to the login/registration tab, we can first of all register for RUS. We need to create a Copernicus single sign-on registration, so you just need to fill in all the information and confirm your account. Once you click the Register button, you will receive a confirmation email and your account will be activated. Once you are done with that, you are able to log in to RUS. The process is the same: we go to the login tab and now we press Log in. I already have my account here, so I will log in. Now the relevant thing to mention is that a new tab appears here, called Your RUS Service, and here you have three main options. There is Your Profile, where you can change the details of your account, but the most important one is Your Dashboard, so let's have a look. Here in your dashboard you are able to interact with RUS, and apply for and access your virtual machines. As you can see, I already have two virtual machines, called RUS training 1 and 2, and this is because I have already applied for those VMs. If you are a new user you will not have this, because the first thing you have to do is apply for the virtual machine. How to do that? It's very easy: we just need to go to the dashboard and click on Request a new User Service.

Let's have a look at the process. We have here a questionnaire where we have to fill in some details about our project. As I said before, depending on your project you will end up with a different virtual machine, with more or less processing and storage capacity. It's here that you can specify your needs and tell us the type of analysis you are going to do and the type of data you want. That way we can really understand your needs and give you the cloud environment that is really beneficial for you. Let's go through this registration process very quickly. The first thing to state is, for example, your years of experience, and also if you have already worked with Sentinel data,
if you have already handled Copernicus data. And here, and this is very important, you can insert the training code. Later on, once we finish this webinar, I will provide a training code for this exercise. If you wish to repeat the exercise, you just need to register in RUS, go through this application procedure for the virtual machine, and specify the training code here. That's very helpful for us, because that way we know that you specifically want to repeat this exercise, so we will give you the virtual machine with the data, the software and everything ready, so that you can focus on the analysis and don't waste time.

Let's continue with our application for the virtual machine. We provide this information, and the next thing is to specify more details about our project, for example what kind of activities we are performing: algorithm development, very basic data exploration, etc. Then there is another feature that is very interesting in RUS, which is that you can ask us to download the data. In case you are working with a large data set, for example a complete year over a study area, or in case you have a lot of study areas because you are running different projects, you can send us the request and we will download all the Sentinel data that you need. You will just have to specify, of course, the sensing period and the type of product you want, and with those details we will put all the data in the virtual machine. Once you access it, it will be there ready to use: you will have the data, the software and the computing power, and you will be able to focus on your research. Anyway, today I'm going to do the downloading process myself. Here we can specify some extra details, and then of course let's give a name to our project, so today it's going to be crop mapping. Then we click Next, and the final thing to specify is the data: what type of data are you
going to use? Let's take today's example: we are using Sentinel-2 data, and we are using Level-2A. You can also combine other satellites if you want, but today we are not doing that. The next thing is to define our study area. In our case we zoom in on Spain and go to the south, next to Sevilla, and we can define our study area here in the Guadalquivir marshes. You can also specify the lat/long coordinates or even upload a shapefile if you have one. Then another question: are you going to do a multi-temporal analysis, or are you just looking at a single image? In today's exercise we are only using one image, but in case you are doing a multi-temporal analysis you can specify the time here. So I'll leave it as No. And here is some extra information about the data, if you want to give it to us. Remember that the more information we have about your project, the better we can understand what you are doing, and the better the virtual machine and the environment you are provided with will be.

That is all for the application procedure. The only thing left now is to review the application, check that all the details are correct and, of course, have a look at the terms and conditions of the service, where you will find all the information about how RUS is organized. If you agree with that, just click on the option and submit the request. What will happen after that? The submission will be sent to our team, and within a couple of days they will come back to you with the activation of your virtual machine. So don't worry if you don't receive an answer right after your submission. This is, let's say, a customized process in which we check your project and give you the best answer, so it takes a few days, I would say two to three, or three to five, more or less. That's how it works. Knowing that, I'm not going to submit the request, because I already have two virtual machines. I will go directly to the dashboard, where we can find what
happens once it is activated. Let's go back to the dashboard. Once the RUS team comes back to you with the approval of your virtual machine, you will see this screen here with this table, with the project name that we introduced before, etc. What is relevant to know here? Well, maybe two things. The first one is the Access my Virtual Machine option: it's here that you can access the virtual machine, go into your cloud environment and do all the processing. In case, for example, you want to close the service because you are done with your analysis, you can also use this option. But something also very important is Get Support. As I said, here in RUS we also provide a dedicated Earth observation help desk, in case you have doubts or questions about Sentinel data, you don't know how to apply a methodology for your specific application, etc. It's through this Get Support button that you can write to us: you send us your question, it will be forwarded to the appropriate people, and they will come back to you later.

Knowing that, we are ready to access the virtual machines and start our exercise, so let's go. We click here on Access my Virtual Machine, and this is the environment you will find. It didn't ask me for the credentials because I was already logged in, but remember that once your virtual machine is confirmed you will receive the link to it. It will be activated on the dashboard and you will also have your username and password, so you will just have to enter them and you will access the cloud environment. So here we are in this cloud environment. It's Linux based, so if you are a Linux user you will recognize it. As you can see, we have some software here on the desktop that's already installed. As on a regular computer, we have the internet browser, where we can do regular internet actions, and we also have a file manager to store everything. Remember, we are working in a cloud environment, so all the data will be
stored and processed in the cloud environment. We are not using the resources of our own laptops, and this is the advantage, because processing Sentinel data can at some point be quite a large task for our computers. Something important to know about the virtual machine is that it's very easy to interact with your own computer. Let's imagine you have downloaded 10 Sentinel-2 images, you have done a crop mapping analysis and you end up with the result (we will not have it today). You might want to download this result to your computer, maybe to print it, maybe to send it to some colleagues, or whatever. It's very easy: we just need to activate the appropriate menu, and this is done by pressing the keys Ctrl+Alt+Shift. By doing so we activate this menu here, and now we can go to the Devices section, go to the path where we have the classification file or whatever data we want to download, and, for example, I'm going to download this classifier text file. To download it we just need to double-click, and then a regular internet download will start and you will have this file on your own computer. In case you want to upload something to the virtual machine, it's exactly the same: we first go to the path where we want to put the data, and instead of double-clicking on the file we press Upload File. The pop-up window appears and we can, for example, import this PDF. You can also import shapefiles, raster images, or whatever you want.

So this is the virtual machine we are going to be using, and now we are ready to start. In this exercise, for this crop mapping analysis, we will perform a supervised classification using the Random Forest algorithm. If it's the first time you hear about this algorithm, don't worry, I will explain later how it works, at least in a basic way. The analysis will be done using the ESA SNAP software. So let's get started, and for that let's open our internet browser and download the Sentinel
image we are going to use. We go to the Copernicus Open Access Hub and we click on Open Hub. To download Sentinel data you need a Copernicus Open Access Hub account. It's very easy: you just need to go here and sign up in case you don't have one already, filling in the information. The activation is very fast: you will receive a confirmation email and you will be ready to download the free data from the Sentinel satellites. I already have an account, so let me log in. OK.

So how do we download Sentinel data? First of all we select the pan option so that we can navigate on the map, and we zoom in on the south of Spain, next to Sevilla. Here is Sevilla, and we are focusing on the south, which is the marshes of the Guadalquivir River; this here is the Guadalquivir River. So here we are in our study area. The next thing to do is to define this study area, and for that we can use either the box or the polygon option. I'm going to use the box option, but you may have your own preference. This is going to be, more or less, my study area.

The next thing is to specify the parameters of the image we want to download, and for that we activate this side menu here and specify the sensing period. The image we want to use for our analysis is from 1 June 2017. It will be one image from that day, so we take the same date again for the end of the sensing period. Then, we do not want Sentinel-1 data, since we are not doing a radar exercise today, so we go down and select Sentinel-2, and we select a specific product type, which is Level-2A. Level-2A products for Europe have been released by ESA since last year, so we can already work with the bottom-of-atmosphere values. Those are all the parameters we need, and now we just need to search for our images. We click on this icon here and we get all the images that fulfil our query. As you can see, we have four images in total, but the
one we are interested in is the last one on the list, which ends with 225. We can see a couple of parameters about the image, for example the mission, the instrument and the sensing date, but if you want more details about the image you can click on this eye icon, and a menu will open where you can see all the information: for example, more details about the size of the product, more details about the metadata of the image, etc., and also a quick look of the image, which is sometimes very nice if you are checking for clouds. Here the image looks very nice, with no clouds, so that's great for us. If you are OK with the image and you are sure this is the one you want to download, you just need to press this arrow icon and the download will start. We save the image on our virtual machine. Remember, we are storing everything on the virtual machine, not on our computer. I want to emphasize that, because if you are doing a lot of processing with a lot of images, that is where the virtual machine gets very useful. I already have the image downloaded on the virtual machine. The process doesn't take that long, but I did it beforehand to save time.

So we are now ready to have a look at the image and do the analysis, and for that let's open, as I said, the ESA SNAP software. I have the icon here on the desktop, so let's open the software. OK, there it goes. In case it's the first time you are seeing this software, let me give you a very brief introduction. It's very easy to use. We have the Product Explorer area, where all the files we will be working with are listed; you will see that later. We have here in the lower left corner some quick menus, such as quick looks of the image, colour manipulation, etc.; again, you will see later how they are used. We have this big grey area where the images will be displayed, and of course we have all the
tools that are needed to process the satellite imagery here in this bar and also in the menus: for example the raster tools, the tools for optical data, the tools for SAR data, etc. So it's a very complete package.

Let's open our image and have a look. How to do that? We have mainly two options: we can press the Open Product icon here, or we can go to File > Open Product, and then we navigate to the path where we have our image and open it. In SNAP 5, when you download a Sentinel-2 image it comes as a zip folder, so the first thing you have to do is unzip the folder. Once it's unzipped you have the SAFE format that is used for the Sentinel-2 images, the SAFE structure, let's say, and we are looking for this specific XML file, which is the one that will open the image in SNAP. We select it and press Open, and now SNAP is reading the file, and here we have the product. This is our Sentinel-2 image. We see here the name, with some key information such as the date, the tile, etc. We can expand the product to have a look at some folders, such as the Metadata folder, containing all the information about the image. We also have the Bands folder, which is where we have the image itself: band 1, band 2, etc.

Now what we are going to do is open a true colour RGB composition. For that we right-click on the product and select Open RGB Image Window. By default SNAP proposes the true colour combination from the red, green and blue channels, but you can also use other combinations or create your own. I'm going to use true colour today, so we click OK, and there we go, this is our image. The first thing to explain, very briefly, is how to move around. We have here the pan option, this hand; we select it and that way we can move around. We can also use the zoom tool to zoom into our study area, so let's do it, and a little bit more. OK, so this is going to be the area we want to classify. As you
can see, we have some fields over here (we will see what they are later), we have the river in the middle, and we have this area with different crops, so you can see the very large variety of crops that can be found here.

The first step in our processing is going to be resampling. As I mentioned at the beginning when I was introducing Sentinel-2, it has 13 spectral bands, but not all of them have the same pixel size, the same spatial resolution. So one of the main steps that is required, most of the time I would say, when working with Sentinel-2 data is resampling: we want all the bands to have a common pixel size, all of them at 10 meters, 20 meters, 60 meters, or any other resolution we want. For that we go to Raster > Geometric Operations > Resampling, and here we have the common interface that SNAP uses for its tools. It's very easy, and especially easy to get used to if it's the first time you are working with SNAP. We have here the input/output tab, where we first specify the input for the specific operation we are going to do. In our case it is product number one: check this index here, [1]. We then define the output name. By default SNAP adds the keyword of the tool we are using, and this is very convenient, I would say, because that way you can keep track of your processing chain and it's very easy to remember what you have done in case you are combining a lot of steps. So this is the name for the product, and now you can select whether you want to save it or not. If you don't save it, it will be kept virtually in the memory of the program, so to say, but you can also save it in your virtual machine. I'm going to select this option; just make sure that you select the appropriate path to save the product. OK, and now let's have a look at the parameters of the resampling tool we are using. In SNAP
you can resample the images in three ways: you can use a reference band, you can use a target width and height, or you can specify the pixel size that you want. In my case I'm going to use a reference band, because I think it's easier. I will select band 2 as the reference: I want all the bands to have the same pixel size as band 2, which is what this option means. Then we leave the remaining parameters as default and press Run. Again, for timing reasons, I'm not going to run the process, so I will directly open the result of this operation, but in case you are doing the exercise on your own, just press Run and the process will start. I will close this now and open my pre-processed image, which is here. OK.

Once the tool has finished, the new product will appear here in the Product Explorer. As you can see, it now has index [2] and it has this extra keyword added at the end. This is our resampled image; if we open a true colour view it will look exactly the same, so I will not go through that. Now let's go to the next step of our methodology, which is the subset. As you can see, the image is pretty large and we are only interested in a specific area, so we are going to reduce the extent of the image, so that we also reduce its size and the processing time is improved. For that we go to Raster > Subset. OK. There are different ways to subset the image in SNAP. We can first of all do a spatial subset, reducing the extent as we are going to do, but we can also do a band subset, to reduce the number of spectral bands in our product, or even a metadata subset, so there are several functions. Today we are focusing on the spatial subset, and for that we again have three options: we can draw the new area we want within this thumbnail, which is very convenient if we don't need to be very precise, or, if we want to be a little bit more precise, we can use pixel coordinates or geo-coordinates (lat/long).
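As a rough illustration of the two preprocessing steps described here (resampling a coarser band onto the grid of a reference band, then taking a spatial subset by pixel coordinates), the following is a minimal pure-Python sketch. It is not SNAP's implementation: the function names `resample_nearest` and `subset_pixels` are hypothetical, nearest-neighbour interpolation is an assumption made for simplicity, the subset bounds are treated as half-open ranges, and the tiny 2 x 2 and 4 x 4 grids stand in for a 20 m band and a 10 m reference band such as band 2.

```python
def resample_nearest(band, target_height, target_width):
    """Resample a 2-D band (list of rows) onto a target grid by nearest neighbour."""
    src_height, src_width = len(band), len(band[0])
    resampled = []
    for row in range(target_height):
        # Map the centre of the target pixel back to a source row index.
        src_row = min(int((row + 0.5) * src_height / target_height), src_height - 1)
        new_row = []
        for col in range(target_width):
            src_col = min(int((col + 0.5) * src_width / target_width), src_width - 1)
            new_row.append(band[src_row][src_col])
        resampled.append(new_row)
    return resampled


def subset_pixels(band, x_start, y_start, x_end, y_end):
    """Cut a spatial subset using pixel coordinates (end indices are exclusive here)."""
    return [row[x_start:x_end] for row in band[y_start:y_end]]


# Hypothetical 20 m band (2 x 2 pixels) covering the same ground as a
# 10 m reference band of 4 x 4 pixels.
band_20m = [
    [1, 2],
    [3, 4],
]
reference_height, reference_width = 4, 4

# Step 1: bring the 20 m band onto the reference band's 10 m grid.
band_10m = resample_nearest(band_20m, reference_height, reference_width)

# Step 2: keep only the central 2 x 2 window of the resampled band.
window = subset_pixels(band_10m, x_start=1, y_start=1, x_end=3, y_end=3)

print(band_10m)
print(window)
```

With a reference band, every band ends up with the same width and height, which is why the subset can afterwards be expressed once in pixel coordinates and applied to all bands alike.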
Today I'm using pixel coordinates, but you could use any other method. Let me enter the numbers properly, very quickly. OK. So this is the new extent that we want for our image, and once we have finished with that we just need to press OK, and the new subset product will appear here. Again, check this number [3], so the products are in order, and as you can see the subset keyword is now added to the product name so that we can keep track. Let's now have a look at this subset: again, right-click, Open RGB Image Window, then true colour. Let's see them together. A very nice feature of SNAP is that we can put two images together and compare them visually, and it's very easy: we just need to go to Window > Tile Horizontally. That way both views are combined and we can see the difference between the subset and the original. You'll see that if I move around, the image moves here as well, but don't worry, it's OK. You can already see the difference in size between the images. Of course, it makes sense; this is what we have done. So let's close view number two and focus on view number three, which is our subset product.

What's our next step for the crop mapping? Next we want to create the training vectors. Remember, I said we are doing a supervised classification, so as in any other supervised classification we need training data. To put the training data in the Sentinel-2 product we have two options: we can either create the polygons in SNAP or import the polygons from our own files. I will show you very quickly how to do both, in case you are interested. Let's first of all zoom to an area; for example, I'm going to focus on the river. This is definitely water, I know that, and I'm going to create a polygon to represent it. The first thing is to expand our product number three, the subset, and go to Vector Data. It's here that the polygons will be stored, and right now we don't have any.
So the first thing to do is to create what in SNAP is called a new vector data container, and we give it a name. Let's call this data container "water"; this is, let's say, like a class. We just press OK, and now this vector class, or vector data container as it is called, is created. If we open it, you will see it's empty; there's no geometry here. So we need to add polygons, and for that we can use the drawing tools, either the rectangle or the polygon. I'm using the polygon, and we just create our water polygon like this; to finish the edit we just double-click. Now if we go to the water vector data container again and open it, we see that we have a polygon here, with the coordinates of the points, etc. We can keep doing that and create our training data for water; for example, we can create another polygon here, etc.

Now, if we want to create another class, it's the same process again: just go to Vector, New Vector Data Container, and for example let's call this new class "rice". Just press OK, and now pay attention: when we create a new polygon, SNAP is going to ask us where we want to save it. So when you use the drawing tool to draw the new polygon, it will ask: is it a water polygon or a rice polygon? It's a rice polygon now. We can draw a polygon here, or another polygon here; again it asks the same, etc. So this is how you create training data in SNAP.

But I'm not going to do that, because I need a lot of training polygons for my classification, so I'm going to import them. We have here our vector data with no training data, and we go to Vector, Import, ESRI Shapefile. Why? Because I have my polygons in this format, but you might have them in CSV or whatever. So I'm selecting this option, I'm navigating to the path where I've saved my polygons, I'm selecting all of them and pressing Open. So now,
when you're importing polygons, SNAP asks you if they should be imported separately or not. I'm saying no, because I want each polygon to be independent so that they can be treated separately. It asks that for each of the polygons. OK, so here we have our polygons on top of the image. You can change how they are displayed by going to the Layer Manager on the right, and as in other GIS software you can activate or deactivate layers; for example, you can remove the study area or put it back again, etc.

Once this is done we are ready for our next step, which is going to be the reprojection. We need to make sure that the polygons and the image both have the same metadata, so that the classifier will not have any confusion, and for that we go to Raster, Geometric Operations, Reprojection. Again we face the same interface as before: in the I/O Parameters tab we have the source product, which is going to be number three, and then the output details. Just make sure that you are specifying the path where you really want to save the data, for example here. Then we go to the Reprojection Parameters. There are three options to define the new coordinate reference system, as you see here; you can use a specific EPSG code, you can select it in different ways, but I'm using a very convenient one, which is the automatic UTM zone. It's very convenient because you just need to select it and the software will automatically locate the image in the UTM grid and say: OK, this belongs to UTM zone, in this case, 29 North. That's all; then click Run and the reprojection will start. Again, I'm not running this; I will directly show you the output. So I close here and I open my new reprojected product directly, which is this one. Here we have the new product number four, and now we can open it again with right-click, Open RGB Image Window, and OK. Now we can close this view to avoid confusion, and here we have our
reprojected image. We can remove the polygons from the view in the same way, so just for visualization I will remove all of them. And don't worry, the image is coming... there it goes. The original Sentinel-2 image was projected in zone 30 North, but this one, because the subset is located in a different place, is in zone 29, and that's why we see the image a little bit unbalanced. So this is our image.

Now the next step before running the classification is going to be a mask, to remove all the pixels that we do not want to be part of the classification. How do you do that? It's very easy; in other software this is called a clip, not exactly the same, but we are going to remove those pixels that we do not want. For that we go to Raster, Masks, Land/Sea Mask. Here again we define the input and output, we go to the Processing Parameters, we select the "Use Vector as Mask" option and we untick the "Use SRTM 3sec" option. Now the software asks which vector you want to use as a mask for the image. Remember to make sure that you are selecting the appropriate input: I'm going to select the study area vector as the mask, because I want to remove all the pixels that are outside the study area. Once you are ready with this, you can click Run and the process will start, and again I will show you the result directly. So let me open it: this is the masked product, this one. Open, and here for number five we can again go and open the RGB. OK, there it goes, this is the mask. Let me again remove the vector data from the display, and now we see that the Sentinel-2 image is completely cropped to the shape of our study area. Maybe you didn't see that before, but this is exactly the shape of our study area, which is this pink layer on top. The remaining polygons, by the way, are the training polygons; I will show you that. So the
polygons we imported before are the ones that will be used for training the classifier; you can see some of them here, for example here, here, etc. I will talk more about training data later. So now we are ready: we have done all the pre-processing we need for our classification, and as I said at the beginning, today we are running a well-known and very famous algorithm, which is the random forest algorithm. In case you don't know this algorithm, let me give you a very basic introduction to it. I think it's very important and very relevant as well, because it's used a lot by the remote sensing community, so it's maybe a good chance to learn a little bit.

For that I'm coming back to my presentation, and to explain the algorithm I will use a very easy example that is not related to remote sensing, and you'll see why. Let's imagine we have the following task: we want to classify the world population into boys and girls. As always in classification tasks, we need some knowledge about our target. We all have characteristics and attributes: height, weight, eye color, hair, etc.; you can think of any other characteristic of people in general. Knowing that, and as always in statistics, we cannot work with the complete population; we need some samples. With those samples we can build a model, and with this model we can then derive conclusions about the population. Those conclusions can be more or less accurate, you can be more or less confident, but this is how it works. So we need some samples, or what we more often call in remote sensing training data, and the training data is going to be you, the attendees of this webinar. So how is the random forest algorithm going to classify the world population using you as training data? Here we have the training data, this is all of you, and the first thing that is going to happen is that we
are going to select, and pay attention to this, a random subset. OK, a random subset, for example this one, the red people. Then random forest is going to do the following: it is going to put it back. This is called in the literature sampling "with replacement". So we create this random subset, we say OK, this is subset number one, and then we put it back into the training data. Then we create another random subset, for example the blue one, but since we are working with replacement it might happen that some of you will be selected once, twice, or not at all for the creation of the subsets. For example, here we can see that some blue people are selected, but also some of the red people from before. Then the process keeps going like that: we put it back and we create another subset, and another subset, and so on. Just remember, the selection is completely random and with replacement.

So then what's going to happen? For each of those subsets we are going to create a model, and random forest uses as models what are known as decision trees. Decision trees are predictive models that work by splitting a data set into regions, using a set of binary rules, to calculate a target value. They can be used for classification and also for regression, but today we are doing classification. Following my example, first we have to ask the training data a question so that we can split it into boys and girls. Remember, this is training data, so we know the class you have. Again random forest is going to pick, and this is very important as well, a random subset of the characteristics of the training data. You remember we all have hair, weight, height, etc.; well, we are going to pick randomly not all of the characteristics but just a subset, and out of that subset of characteristics we are going to select the one that provides the best split for the data. I'm
not going to cover exactly how this is done, because that's too much detail for this introduction; if you want to know more I definitely recommend you check the literature. But this is how it's done: from the subset we select the characteristic that provides the best split between boys and girls, because this is our final task. For example: is the hair short or long? Then comes another binary question: we select again another random subset of characteristics, we take the one that in this case provides the best split, and we ask again a binary question: is the height above or below 170 centimeters? Depending on the answer you will be classified as a girl or as a boy. And if we go back to the other branch, again we select another subset, again you will be asked a question, and based on that you will be classified as girl or boy. This is just an example, of course, but this is how a decision tree is built, in a basic way.

Remember, we are building this decision tree for the red subset. Now we do exactly the same for the other subsets. For the blue and red subset we create a tree which is different from the first one; for this blue, green and red subset we again create another tree, which is completely different as well. Each of the trees has its own set of binary rules that classify, in this case, the people. So what happens when you combine all the trees together? Well, you now have a forest: here is the random forest.

So how is this going to work? We have our model, and the next step is to draw some conclusions about the general population of the world. Let's imagine someone joins this webinar now and we want to know if it's a boy or a girl. This person will be sent, let's say, through each of the decision trees, and each of the decision trees, based on the questions it has, will vote for a class. One tree will say this is a boy, another one will say this is a boy, and another one will say
this is a girl, etc. The class with the most trees voting for it will be the class assigned to this person, and this is how random forest works for classification.

But now someone can ask: OK, what about accuracy in random forest? I'm going to cover this very quickly. You remember we were creating a subset and then for this subset we were creating a tree. Well, random forest has a unique way of providing an accuracy assessment based, and this is important, on the training data. Take the people that are not used in this tree, because for this tree we are just using the red people. For these people, who are training data, we know whether they are girls or boys, so they will be sent one by one through the tree and we will see if the tree votes for the same class as the one they have assigned. In that way we can know whether the classification is working or not. This is called in the literature "out-of-bag"; no need to remember that, it's just in case you check some papers, so that it rings a bell. This is done for each of the trees, and putting all this information together we can know the overall accuracy based on the training data; pay attention to that.

So now let's go back to the remote sensing world and see how this works for a Sentinel-2 image. We have here our Sentinel-2 image with the blue, green and red channels, etc. As you may know, the attributes of a satellite image are the spectral bands; those are the attributes in this case. The training data is going to be some pixels that we have identified by overlaying a polygon on top of them, and for those pixels we know the class. This is how supervised classification works in general. So, for example, purple pixels are, let's say, sugar beet, green pixels are rice, yellow pixels are wheat, etc. The same process that I explained before is going to happen now: from the training data, that is, from all those pixels, we are going to
select a subset and create a tree, using the attributes to ask questions, let's say binary rules, and we will create a lot of trees: subset number one, one tree; subset number two, another tree; etc. So now what will happen? Let's imagine we have a pixel that is not in the training data; it's a pixel of our study area and we want to classify it. It will go through each of the trees, each of the trees will vote for a class, and again the class with the most votes will be the class assigned to that pixel. If you do that for a complete image you now have a classified image using random forest. This is the ideal case, the theory of random forest. I hope you have understood the logic, because it's very important to know how the algorithm works in order to understand the result we will get.

So let's go back to SNAP and to our exercise. Now we just need to do the random forest classification, and it's very easy; the software does everything for us. We just need to select product number five and go to Raster, Classification, Supervised Classification, remember, and here we have the Random Forest Classifier. We click here and again we have this interface. The first thing to do is to add the input for the classification, that is, the Sentinel-2 image that we are going to use. Remember, today we are only using one image; of course you can do multi-temporal analysis, and I will talk about that at the end. How do you add images here? You can click on this button and navigate to the path of the image, or you can just click here and all the products opened in SNAP will be added. What we can also do here is update the metadata of the images. But remember, we are just working with the image that has gone through all the pre-processing, which is this image here, the one that is masked. The other ones are just pre-processing steps and we don't want them, so we select them and press the minus button. OK, so this is
our input image. Then we need to set a couple of parameters for random forest, and that's also why I like random forest: you just need to set two parameters and that's all, and it's also very easy to understand how it works. The first thing is to say: OK, we are doing a new classification; we are training on vectors, because we have imported those polygons; and we want random forest to evaluate the accuracy, based on what I explained about how accuracy is assessed in random forest. We can now select here the number of training samples, that is, the number of pixels that will be used for the subsets, and the number of trees in the forest, you remember. Ten is a good number for testing, just to check how the training data is working. If you check the literature you will see that the more trees you have the better; it's not an exponential function, but up to a certain level this is true, and usually in the literature you see that 500 trees are enough to reduce the noise and get a homogeneous response. So I'm going to put 500.

The next step is to select the training vectors: of course, all of the polygons I imported before. But be aware, do not select the study area, because this is not a training vector. We have here the polygons for sunflower, for rice, water, built-up, etc., so those are the classes I'm going to classify. The next step is to select the characteristics of the Sentinel-2 image that I want to use in my classification. You remember those questions that we asked in the trees? They are based on the attributes of the image, and here we can select which attributes we want to use. I am using all of them; you can select, for example, RGB and near-infrared only, you can do whatever you want, and you can even try different combinations and check how this affects the result. But be aware that the more information the classifier has, the better for it, up to a certain level; be aware of that as well, it's not exponential either.
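The random forest steps described in the talk (bootstrap subsets drawn with replacement, a random subset of attributes at each split, majority voting, and out-of-bag accuracy) can be sketched in plain Python. This is a teaching sketch under stated assumptions, not SNAP's implementation: every function name is invented for the example, the two band values per pixel are hypothetical and deliberately well separated, and 25 trees are used only to keep it fast (the talk uses 500 in SNAP).

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Return the (band, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, n_features, rng):
    """Grow a tree; every node only looks at a random subset of the bands."""
    features = rng.sample(range(len(rows[0])), n_features)
    split = best_split(rows, labels, features)
    if split is None:  # pure node or no useful split: become a leaf
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left], n_features, rng),
            build_tree([rows[i] for i in right], [labels[i] for i in right], n_features, rng))


def predict_tree(tree, row):
    """Follow the binary rules down to a leaf (leaves are class-name strings)."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def train_forest(rows, labels, n_trees, n_features, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap: draw as many samples as we have, with replacement
        idx = [rng.randrange(len(rows)) for _ in rows]
        tree = build_tree([rows[i] for i in idx], [labels[i] for i in idx], n_features, rng)
        forest.append((tree, set(idx)))  # remember which samples the tree saw
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree, _ in forest)
    return votes.most_common(1)[0][0]


def oob_accuracy(forest, rows, labels):
    """Out-of-bag accuracy: each sample is judged only by trees that never saw it."""
    correct = total = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        votes = Counter(predict_tree(t, row) for t, used in forest if i not in used)
        if votes:
            total += 1
            correct += votes.most_common(1)[0][0] == label
    return correct / total if total else float("nan")


# Toy training pixels with two hypothetical band values each
water = [(0.02 + 0.005 * i, 0.01 + 0.005 * i) for i in range(10)]
rice = [(0.40 + 0.01 * i, 0.30 + 0.01 * i) for i in range(10)]
rows = water + rice
labels = ["water"] * 10 + ["rice"] * 10

forest = train_forest(rows, labels, n_trees=25, n_features=1)
print(predict_forest(forest, (0.03, 0.02)))
print(predict_forest(forest, (0.45, 0.35)))
print(oob_accuracy(forest, rows, labels))
```

Real training pixels overlap far more than this toy data, which is where the number of trees and the choice of bands start to matter.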
Then, well, that's all. We just need to check that we are saving our product properly in the path that we want; again, it's adding this "RF" keyword to the product name. So we just press Run and the process will start; it's very easy. Again, let's close this, open the output of this operation directly and have a look at the crop mapping result. I have it here, and there it goes.

The first thing you have to do when checking the output of random forest is to expand the product and go to the Bands folder. As you can see, we no longer have the spectral bands of Sentinel-2, because this is not a Sentinel-2 image: we have the classification and the confidence. What is the confidence? Well, SNAP implements random forest in a version in which a confidence value is derived, to know how confident the classifier is about the class of each pixel. I'm not going to go deep into that, you can check it in the literature; this is just a basic implementation of random forest. But what you have to know is that you have to check some parameters here. If we go to the classification band, I will show you that now, we go to Properties and we are going to change the valid pixel expression. Why? Because if we leave it like that, only the pixels that are classified with a confidence higher than 0.5, in this case, will be shown, and we do not want that; we want to show everything for this example. Of course, for other implementations of the classification you can have your own rules. So we put 0 and we close, and now let's just double-click and have a look at the image. There it goes.

Let's analyze this little by little. First of all, let's remove the vector data that is on top. OK, now let's change the color scheme. You can have your own color scheme; you can change it manually here in the Colour Manipulation tab, which is in the lower left corner, or you can also import a specific color palette that has been created before, and that's
very useful sometimes. I have my own color palette here that I created before, so I will just use it, and now let's have a look at the image. So this is our classification using random forest, using only one image, etc. What can we see? From this perspective we see rice fields in light green, we see the water in blue, and we see that some rice fields are still flooded; this is common, well, that's how rice is grown. But now let's have a closer look, for example here. Here we can see different colors, different crops, and we can see that some classes are classified pretty well. For example, sugar beet in purple looks very nice: the fields are identified pretty sharply, the boundaries of the fields are well defined, etc. Corn in yellow also looks very nice. Some other classes, such as tomato and cotton, have some confusion, but of course this is something that you have to expect, and that's why it's important to understand how the algorithm works.

So how can we really interpret this result? First of all, the result depends, definitely and a lot, on the quality of your training data, that is, the polygons we imported. A different strategy or approach to create the polygons will definitely change your result, and more training data will help the classifier. So in case you're interested, I really recommend you just give it a try: open a RUS virtual machine, set your own study area and play with the training data, and see how different strategies can improve, or maybe not, the classification. Looking at the classification here, for example, we see that the boundaries of the rice fields are very clear; we can even see some paths through the fields, some of which are very well detected. And here, OK, we see again some confusion between some classes, for example the red and the gray, etc.

Something else important to mention is that random forest is a machine learning algorithm. In an easy way, what does that mean? It means that we
are not assuming anything about the data before the classification, as we do in parametric classifiers. So what does that mean? It means that the more information we give to the algorithm, the better it will learn the relation between what you have, your image, and what you want to have, which are your classes. Because of that, if we use a multi-temporal approach, if we add more images, if we add more training data, the classifier will have more inputs and will be able to learn more about this process. So if you want to give it a try, I really recommend you check, for example, a multi-temporal classification, or for example adding vegetation indices: maybe if we add NDVI this can help to differentiate between one crop and another, maybe not. Sometimes vegetation indices help and sometimes they don't; you can test that.

Something else that I want to mention about random forest and SNAP: random forest provides an accuracy assessment, as I said before, using what are called out-of-bag samples. But remember that this is just a validation of how well the classifier was able to fit the training data. So this is not, let's say, a complete assessment of the accuracy of the classification; it's just an assessment of how well the classifier works on classifying the training data. If you want an accuracy assessment of the classification you need an independent validation data set, and then you can do a regular accuracy assessment such as a confusion matrix, or derive some statistics such as the kappa coefficient, etc. Unfortunately this type of accuracy assessment is not supported in SNAP, so personally I recommend you try it in QGIS; they have very nice tools for that.

So this is how you perform a crop mapping analysis in SNAP, with a basic strategy, let's say. Something important from an operational point of view is that you can export this result. So let's imagine you want to send this image to another software or whatever,
well, you just need to go to File, Export; sorry, you first select the image and then File, Export. In this case I'm using GeoTIFF, but you can use other formats. We click here and that's all; we just need to put the name and save the image wherever you want. In that way you can export your results, and of course you could, for example, download them from the virtual machine to your local computer and do other analyses, such as area per crop, or change detection, maybe from one year to another, etc. So that is all for the crop mapping analysis, and for the interpretation of the result and how it can be analyzed and improved.

Before finishing, let me give you some take-home messages, especially about RUS, to let you know how this kind of application is better implemented in the RUS environment. For that I'm going back to my presentation. Remember that nowadays we have a massive volume of data produced by the Sentinel satellites, and there are more remote sensing satellites to come; the amount of data we have will be huge. That's great, because with more data, new algorithms such as random forest and other machine learning options, and much better implementations, we can produce results that can help a lot, for example in crop mapping, in agriculture, etc. So with the new Sentinel satellites, the challenge in satellite remote sensing is no longer data availability but rather how to store and process all this information. Let's imagine, for example, we want to do this crop mapping exercise, but instead of using one image we want to use 20 images, because Sentinel-2 has this great revisit time and we end up having a lot of images. Well, it might be that on your computer, even though it can be very powerful, this is not possible because it will at some point crash. So we have this challenge now, but of course we also have solutions. In addition to that, it's also necessary to explain how the data can be used, and we, I mean, we from the
Sentinel community, have to explain to people how to use the data: for example, how to do crop mapping with Sentinel-2, what you have to take into account and what not. So we have those two challenges to solve, and RUS is here exactly to solve them. The RUS service provides a free cloud computing environment, those virtual machines that you have seen me using in this exercise, and they are ready to use: you have the data ready to use, you have the software ready to use, the storage, which goes up to one terabyte or even more if you need it, and the processing capacity, which also scales up to multiple cores, etc. So it is a great tool to process and handle the large amount of data we have now. In addition to that, we also have a dedicated help desk with Earth observation experts to support your activity. So let's imagine you want to repeat this exercise, crop mapping, in your local area; you want to give it a try and see how well you can identify the crops in your region, and you have some doubts, or you don't understand some steps, or maybe you are working on another application, such as burned area mapping or whatever. Well, you can contact us, and a team of Earth observation experts will be there to answer your questions and help you in your process.

Before moving to the Q&A session, let me tell you again how you can repeat this webinar. Let me remind you as well that this webinar is being recorded and will be published, but you can also repeat the methodology afterwards, and how to do that is very easy: just go to the RUS Copernicus website, register in RUS and open a new user support request. Within the application procedure you have to specify the training code; the training code for this webinar is LAND01. It's ready to use: you can send your applications and we will give you virtual machines like the one I've used today, with the Sentinel-2 images, the training data, the software and everything, and then you can check different
implementations; you can also load your own set of images and perform your analysis in your own area. And it's free. So that's all from my side. Thank you very much for your attention; I really hope you have learned something today. Remember, this is the training code, here we have the two websites of RUS, and of course we are also on social media, so you can follow us on Twitter, YouTube, Facebook and LinkedIn; search for RUS Copernicus and you will find us. Thank you very much for attending this webinar, thank you also to my colleagues who were involved, and, well, it was a pleasure for me to be here and I hope to see you in the next webinar.
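For the independent accuracy assessment recommended in the talk (a confusion matrix and the kappa coefficient computed on validation samples that were not used for training), a minimal sketch could look like the following. The function names and the ten validation labels are invented for illustration; in practice the reference labels would come from an independent validation data set and the predicted labels from the exported classification.

```python
from collections import Counter


def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are predicted classes."""
    counts = Counter(zip(reference, predicted))
    return [[counts[(r, p)] for p in classes] for r in classes]


def overall_accuracy(matrix):
    """Share of validation samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def cohen_kappa(matrix):
    """Agreement corrected for the agreement expected by chance."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical validation samples: reference label vs. classified label
classes = ["rice", "water", "corn"]
reference = ["rice"] * 4 + ["water"] * 4 + ["corn"] * 2
predicted = ["rice", "rice", "rice", "corn",
             "water", "water", "water", "water",
             "corn", "rice"]

matrix = confusion_matrix(reference, predicted, classes)
for name, row in zip(classes, matrix):
    print(name, row)
print("overall accuracy:", overall_accuracy(matrix))
print("kappa:", cohen_kappa(matrix))
```

Unlike the out-of-bag figure, these numbers say something about pixels the classifier never saw during training.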
Info
Channel: RUS Copernicus Training
Views: 26,839
Rating: 4.9172416 out of 5
Id: jpPoZ6wv9dM
Length: 74min 10sec (4450 seconds)
Published: Wed Jan 17 2018