See What Others Can’t Using Deep Learning in ArcGIS

Video Statistics and Information

Captions
Hello everyone, thank you for attending this session. Today we will be talking about "See What Others Can't Using Deep Learning in ArcGIS." First off, a little bit about myself: my name is David Yu, and I'm a data scientist on the GeoAI team here at Esri. My background is in computer science, whereas Esri has traditionally been a GIS company, and what that means is I spend a lot of my time trying to bring the latest and greatest tools, technologies, processes, and practices from the world of machine learning to solve real-world GIS problems. In particular, I work on things like object tracking, anomaly detection, pattern analysis, and more recently generative networks.

We begin by reviewing a definition. We've all heard about AI; it's really become a buzzword in recent years. But what do we mean by an AI-enabled imagery workflow, and how do we quantify the effect of its application? Technically speaking, AI is a paradigm of computing that gets computers to perform tasks at the level of a human, if not better. Machine learning, another term we hear about a lot, is a specific implementation of AI that's all about learning from large quantities of data in order to derive rules and recognize patterns without being explicitly programmed to do so. Deep learning, on the other hand, is really what we're interested in today and what I will be showing you in a little while: it's a machine learning technique that's been gaining a lot of traction and that leverages deep neural networks to analyze high-dimensional data like videos and images at a higher level.

All of these deep learning workflows within ArcGIS consist of the following steps. You start off with your training data; these are typically large volumes of imagery, which can be efficiently managed by ArcGIS. Then ground truth labels can be created on top of your rasters, and these can come in many forms: bounding boxes for object detection, semantic masks for semantic or instance segmentation, or class labels for pure classification. Data prep is another important step, where we can leverage ArcPy or ArcGIS Pro to export imagery into various formats that a deep learning model can then consume; these formats would be something like KITTI labels or PASCAL VOC if you're interested in object detection, Mask R-CNN masks, and so on. After this, the ArcGIS API for Python can apply image augmentation to your data, either using intelligent defaults or transformations that you specify. Model training, which is an important step, can then be done within ArcGIS Pro or the Python API, and we give you a lot of canned models to choose from, but you also have the option to integrate your own models through a simple raster function. As for inferencing, it's often the performance bottleneck for a lot of our clients, so here we support inferencing at scale through image and raster analysis servers, which enables you to carry out advanced analytics on top of those deep learning results; these are basically post-processing tools, and by the way, all of these tools are native to ArcGIS. Finally, these results can be used and visualized in a variety of end information products such as dashboards, web apps, and field apps.

I talked about the Python API. We really recognize that in this day and age Python has become the language of data science, so we put a lot of effort and time into making sure that all of our deep learning capabilities, tools, and libraries are made available through a Python API module which we call arcgis.learn. What this learn module does is provide a streamlined experience to get your data science project off the ground. For those of you who are data scientists, you probably recognize that getting a model to work from scratch requires a lot of boilerplate code: you'd have to start off with writing your own training and validation loops, you have to specify a loss function, you have to figure out the optimal learning rate (in that case you'd probably use something like AutoML), you have to write your own optimizer, and if you use PyTorch you'd also have to write your own datasets and data loaders, which is especially difficult to do if you're working with geospatial data. This is the type of work we're aiming to simplify with the ArcGIS API for Python. Now, with the Python API, getting model training off the ground is very simple: all you have to do is use one single line of code to instantiate your model, and then to train it you call model.fit(). You can also specify the learning rate to use; we have a handy tool called the learning rate finder that returns the optimal slice or region from which to choose your learning rate. By the way, all of these deep learning functionalities and classes in the Python API are based on fast.ai, which, if you know PyTorch, is basically a PyTorch wrapper, and of course you have the option to override or replace some of these functionalities with your own implementation. We also give you an easy way to integrate your own model, by specifying an Esri model definition file and an associated Python raster function. I also want to stress that the features included in the learn module are not limited to the training part: basically every single step of the end-to-end pipeline we have just talked about is represented in the ArcGIS API for Python, everything from data prep to model management to the inferencing APIs and so on.
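As a minimal sketch of what this looks like in practice (the paths and parameter values here are illustrative, not from the talk):

```python
from arcgis.learn import prepare_data, UnetClassifier

# Load chips produced by the Export Training Data For Deep Learning step
data = prepare_data(r"C:\data\exported_chips", batch_size=16)

# One line of code to instantiate a model (here, a U-Net pixel classifier)
model = UnetClassifier(data)

# The learning rate finder suggests where to pick the learning rate from
lr = model.lr_find()

# model.fit() runs the whole training loop; no boilerplate required
model.fit(epochs=10, lr=lr)

# Saving writes out an Esri model definition (.emd) alongside the weights
model.save("my_unet_model")
```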
The first example I'd like to dive into straight away is accident prediction, and if you're familiar with machine learning, this is not an uncommon workflow. The central question you're asking is: can we predict the risk of a car accident along roadways? If you think about it, there are really dozens of variables at play here. First of all, there are temporal factors, like the time of day and the day of the week, and weather patterns, such as whether there's any snow coverage on the road, as well as spatial factors, things like whether there's any curvature in the road, how far a location is from an intersection, and so on. With that many variables, it's impossible for a human to understand the precise combination that may lead to a higher risk of an accident. This is really where machine learning can help, by analyzing this high-dimensional data. However, what people don't realize is just how much insight a GIS-centric approach brings to the table, so to motivate this I have a link here to a story map on the impact of geographic considerations on accident risk.

When we consider a problem like this, let's first think about what factors definitely contribute to the likelihood of an accident. How about the number of cars on the road? Obviously this type of data is going to be a little difficult to come by, especially if we're thinking about collecting highly granular data at regular intervals, perhaps in real time. For our purposes, though, it's good enough to use something much simpler, say a population density layer; that's a good enough proxy for the number of cars on the road at any given time. Something else we might want to consider is the various physical attributes of a stretch of road. Let's say we have a road here that I can pan around and play with, and we have a good understanding of some of its physical attributes: the number of lanes, the surface type, the speed limit that's been imposed, and so on. So why is geographic context important in this situation? Well, let's say we have this road versus some other road with the same physical attributes: the same number of lanes, the same speed limit, the same surface type, and so on. How can we differentiate the two? We can do so by bringing in additional contextual information, namely geographic context. If we place this in geographic context, we can see immediately that the second road hugs a mountainside, with a steep drop-off to one side and perhaps some obstruction on the other side that prevents the driver from seeing across, and this may raise the likelihood of an accident, versus the first road, which lies on a flat plain with good visibility on both sides, where you can very clearly see it's going to be pretty difficult to have an accident. This is why geographic context is important.

Let me show you some of the model results. Here is a model running within ArcGIS Pro; specifically, we are using the scikit-learn library with XGBoost to understand the level of risk on a per-road-segment basis. All of these road segments are color coded according to the risk assigned by the model: green represents a low level of risk, whereas orange and red represent regions of high risk. We can also see red dots that correspond to locations where accidents actually happened, and they essentially lie on road segments for which the model predicted a high likelihood of an accident. Coming back to the point of GIS: what ArcGIS allows you to do is perform this type of end-to-end analysis very seamlessly. It's essentially a framework for data collection, measurement, visualization, analysis, planning, and finally turning that data into actionable insight, and all of that is available within ArcGIS.
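As a rough illustration of the modeling step, here is a hedged sketch of an XGBoost classifier (via its scikit-learn-style interface) scoring accident risk per road segment; the file name and feature columns are hypothetical stand-ins for the temporal and spatial variables discussed above:

```python
import pandas as pd
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# Hypothetical table: one row per road segment, with temporal and
# spatial attributes plus a label recording whether an accident occurred
segments = pd.read_csv("road_segments.csv")
features = ["hour_of_day", "day_of_week", "snow_cover",
            "curvature", "dist_to_intersection", "speed_limit",
            "lane_count", "population_density"]

X_train, X_test, y_train, y_test = train_test_split(
    segments[features], segments["had_accident"], test_size=0.2)

model = XGBClassifier(n_estimators=200, max_depth=6)
model.fit(X_train, y_train)

# Per-segment risk scores, ready to join back to the road layer
# for the color-coded display described above
risk = model.predict_proba(X_test)[:, 1]
```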
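In arcgis.learn terms, each of these four tasks has a corresponding family of model classes; a brief sketch pairing each task with one representative class (each is instantiated against data from prepare_data, as shown earlier):

```python
from arcgis.learn import (
    UnetClassifier,     # pixel classification / semantic segmentation
    RetinaNet,          # object detection (bounding boxes)
    MaskRCNN,           # instance segmentation
    FeatureClassifier,  # object / image classification
)
```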
From a deep learning perspective, here are the four key imagery tasks you can perform with deep learning in ArcGIS. First, we have pixel classification, which in the world of machine learning is also known as semantic segmentation; this is the task of assigning class values on a per-pixel basis. Then you have object detection, which is all about returning a minimal bounding box or minimal bounding polygon over objects of interest. Then you have instance segmentation; those of you familiar with something like Mask R-CNN will recognize that it's an instance segmentation workflow, and it basically combines the best of both worlds between object detection and pixel classification. One of the shortcomings of pixel classification is that, if we look at the example over here, some of those houses, even though they've been very successfully segmented out, are joined together, and the model has no way of knowing which are unique instances of a house; that's where you want to segment out those unique instances as well, and that's where instance segmentation can help. And then finally you have your standard machine learning 101 model that allows you to perform image classification.

Now, the second demo I want to talk about is AI for disaster response. This is a project we started back in October 2018, essentially in the wake of Hurricane Michael. For those of you who know, Hurricane Michael was one of the worst natural disasters in recent memory. It made landfall at Tyndall Air Force Base in Florida and caused an immense amount of damage: it had maximum sustained winds of about 155 miles an hour, which makes it a Category 4 storm, very close to Category 5, and it nearly leveled the entire air force base in a matter of minutes. Unfortunately, disasters like this are becoming very common nowadays, and coordinating a response is becoming challenging in a number of different ways. Usually, following a disaster like this, whether a hurricane, an earthquake, or flooding, certain agencies will fly aircraft over the area to capture a large amount of imagery in order to determine the level of damage. But all that imagery needs to be processed somehow, and the way this is usually done is that analysts sit down and manually comb through hundreds and thousands of images, circling locations where structures may be damaged or roads may be blocked or have debris on them. Unfortunately, this takes a lot of time and effort in a situation where you don't have all the time in the world. This is where we see an opportunity for parts of the workflow to be automated by AI.

Our workflow is broken down into four pieces. First, we want to understand which structures are damaged. We want to know where people may be trapped and may need rescue. We also want to understand whether there are any spatial patterns, such as clustering behavior in the building damage; if there's any kind of hotspot, we can isolate it and prioritize rescue in those locations. We first trained two AI models: one model to identify building footprints in an area of analysis, and a second model, run on top of the output of the first, to classify them as either damaged or undamaged. To show you what that labeling workflow looks like: this is a web app that we created. We understand that labeling is a very time-consuming task, and it's definitely not something a single person wants to do, so we built this web app so that all of us in the office could collaborate, with multiple people working on it at the same time. You can see there is an underlying building footprint layer already, and labeling is very simple: we just click on a footprint and select from a drop-down menu whether it's damaged or undamaged. Of course, if we don't have access to an underlying building footprint layer, we can also just draw bounding boxes through this web app.
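Once both models are trained and published, the two-stage inference described above might look like the following sketch using the raster-analytics functions in arcgis.learn; the portal URL, item names, and output names are all hypothetical:

```python
from arcgis.gis import GIS
from arcgis.learn import detect_objects, classify_objects

gis = GIS("https://example.com/portal", "user", "password")
imagery = gis.content.search("post_storm_imagery")[0].layers[0]
footprint_model = gis.content.search("building_footprint_model")[0]
damage_model = gis.content.search("damage_classifier_model")[0]

# Stage 1: extract building footprints from the post-storm imagery
footprints = detect_objects(input_raster=imagery,
                            model=footprint_model,
                            output_name="detected_footprints")

# Stage 2: classify each detected footprint as damaged or undamaged
damage = classify_objects(input_raster=imagery,
                          model=damage_model,
                          input_features=footprints,
                          output_name="damage_assessment")
```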
The second model we built is for detecting damaged roads. It's not enough to know which buildings are damaged; we also want to know the best way to route first responders to regions where people may need help, and that's where the second model comes in, because we want to know if there's debris on the road, if roads have cracked, or if roads are completely washed away. So again we train an AI model, this time to detect road debris, and it trains very similarly to the previous models for building footprint detection and damage classification. Again, this is a case where high-quality, high-volume training data is needed, and again we built a web app that does just this: you can click on a segment of road and assign it a level of damage. All of the training is done completely within the ArcGIS platform, and this is the type of output you get: the model predicts that these regions have blocked or damaged roads, and we can switch back and forth between the imagery and the predictions to confirm that yes, these are indeed locations where trees have fallen down or there's debris on the road.

The next piece is what I've talked about: optimal routing. We want to understand not only the locations of damaged structures and which roads are passable or impassable, but we also want to automatically and intelligently route first responders from, let's say, their forward operating base to locations where people may need help. In this case we are using the ArcGIS Network Analyst extension to perform the routing task. We've assigned a forward operating base to a local fire station here, and you see over here the damaged structures: these red boxes represent damaged buildings. Running all of this through Network Analyst returns the optimal routes for getting first responders to those locations.
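One way to script this routing step is with the closest-facility solver in the ArcGIS API for Python; this is a sketch under the assumption that the damaged-building and fire-station layers are published to the portal, and all names here are hypothetical:

```python
from arcgis.gis import GIS
from arcgis.network.analysis import find_closest_facilities

gis = GIS("https://example.com/portal", "user", "password")

# Hypothetical layers: damaged buildings (incidents) and the
# forward operating base / fire station (facility)
damaged = gis.content.search("damaged_buildings")[0].layers[0].query()
stations = gis.content.search("fire_stations")[0].layers[0].query()

# Solve a closest-facility problem: best route from the station
# to each damaged structure
result = find_closest_facilities(
    incidents=damaged,
    facilities=stations,
    number_of_facilities_to_find=1,
)
routes = result.output_routes  # route features for the map or dashboard
```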
The last piece of this equation is getting this information into the hands of the people who are going to use it, and for that we utilized two things. First, ArcGIS Workforce, which is a mobile solution from ArcGIS; second, ArcGIS Operations Dashboard, which is essentially an end information product that allows us to visualize some of this data. First, the dashboard: in this case we created something like a situational awareness dashboard that lets us see the level of damage for buildings and roads, with a breakdown for the current area of analysis. For the current extent on the screen, we can see the severity of the building damage, the rough cost incurred by the storm on this area (the average damage per building), as well as the number of damaged roads and the number of damaged buildings. The other part is an internal dashboard exposed to analysts: someone at HQ who is, say, monitoring this response can keep up with live progress updates from all of the first responders out in the field. We can see the same information, but we also have access to things like the location of all of the operators in the field, their current task, and their level of completion of that task, say if they're rescuing somebody. And finally, just to conclude with some very cool visuals: following this successful project, we also collaborated with Microsoft and brought this workflow onto Azure Stack Edge, which is basically their edge device. This piece was actually presented at the Microsoft Ignite conference in 2019, where we showcased the same damage outputs but now in a 3D scene, which I think also attests to the versatility of the ArcGIS platform; here is that final 3D scene output.

With that, I'd like to dive really quickly into a demo that showcases some of those end-to-end components we have just talked about, specifically the components that exist within ArcGIS Pro; you can do the same workflows within the ArcGIS API for Python as well. The problem we're interested in solving is classifying clutter classes, essentially zoning information: we want to understand what part of the image constitutes high-rises, what constitutes commercial urban versus commercial suburban versus residential versus public, and by public I mean parks or public facilities. Of course, with any deep learning workflow, the first step is to create some training samples. I already have an imagery layer brought in here: this is a NAIP image that I obtained from the Living Atlas and then clipped to this boundary, our area of interest, and the image has one-meter ground resolution. Creating training labels is very simple: highlight your imagery layer, and inside the Imagery tab there's a button called Classification Tools, where you can open the Label Objects for Deep Learning tool. This is what shows up after you click that tool. Let me first turn on some pre-drawn training labels; to draw more, all you have to do is select one of these tools (my personal favorite is the freehand tool), click it, and start drawing away, and as you can see, the labels are color coded according to their classes.

The next step after this is Export Training Data for Deep Learning. It's very important to do this pre-processing step, because a machine learning model by itself is not going to be able to take in a giant raster; you have to tile it out into smaller rasters and tile out the corresponding labels as well. This is a geoprocessing tool where you can specify the tile size and, most importantly, the metadata format you want. In this case you can see I've selected Labeled Tiles because I'm interested in an object classification workflow, whereas if I were interested in, say, object detection, i.e., coming up with bounding boxes, I would probably choose something like KITTI labels or PASCAL VOC; if I were interested in semantic segmentation I would choose Classified Tiles, or RCNN Masks for instance segmentation. Then I click Run on this tool, and the output is basically a couple of directories: one containing all the image chips that have been clipped out, and one containing all of the labels that correspond to those images, plus some other metadata files.
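For reference, here is what that export step might look like as an arcpy geoprocessing call (Image Analyst extension); the layer names, paths, and tile settings are illustrative:

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

arcpy.ia.ExportTrainingDataForDeepLearning(
    in_raster="naip_clip",               # the clipped NAIP layer
    out_folder=r"C:\data\zoning_chips",
    in_class_data="training_labels",     # the labeled polygons
    image_chip_format="TIFF",
    tile_size_x=256, tile_size_y=256,
    stride_x=128, stride_y=128,
    metadata_format="Labeled_Tiles",     # for object classification
)
```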
The next step after that is training the model, and you can do this either within Pro or within the Python API; I'm going to show the Pro step for now. It's again a geoprocessing tool, very simple, called Train Deep Learning Model, and all you have to do is specify your training data: you basically drag in the folder that was just created in the previous step. The tool is intelligent enough to figure out that, because the metadata you selected in the previous step was for object classification, it will only offer models that pertain to object classification; whereas if you had chosen PASCAL VOC as the metadata format, the model type drop-down would give you a bunch of canned models for PASCAL VOC formats, something like SSD or Faster R-CNN and so on. You can also choose some advanced parameters, such as the depth of the backbone model, which really depends on the complexity of the problem space you're dealing with; in this case ResNet-34 suffices. I can then run this tool, and this model actually took less than one hour to train.

Let me show you some of the inference results. To perform inference, it's again another geoprocessing tool, in this case called Classify Objects Using Deep Learning (we actually have two more tools that deal with object detection and semantic segmentation). You pass in the input raster and click Run, and this is what the output looks like. Let me turn off the actual training labels so you get a better look, and we can zoom into some interesting areas. For instance, we can zoom into this location where, in the background, these are basically just parks or public facilities, and as you can see from the symbology information, they have indeed been classified as public. Some of these other areas that have a lot of concrete or large structures have been classified as either suburban or urban commercial zones. One interesting thing you see as you move closer to the city center is that there are definitely a lot more commercial buildings as opposed to residential buildings, which are these gray tiles over here. So that was just a very quick demo of how you can employ an end-to-end deep learning workflow completely within ArcGIS Pro.
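And the training step as a geoprocessing call, a sketch mirroring the demo's choices (object classification with a ResNet-34 backbone); paths are illustrative:

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

arcpy.ia.TrainDeepLearningModel(
    in_folder=r"C:\data\zoning_chips",   # output of the export step
    out_folder=r"C:\data\zoning_model",
    max_epochs=20,
    model_type="FEATURE_CLASSIFIER",     # matches Labeled_Tiles metadata
    backbone_model="RESNET34",
)

# Inference then runs through the Classify Objects Using Deep Learning
# geoprocessing tool, pointing it at the .emd/.dlpk produced above.
```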
I want to quickly switch gears and also talk about some of our machine learning and 3D capabilities as they pertain to point cloud data. Within ArcGIS, we're not only able to host, manage, and analyze lidar point cloud data natively, but we're also able to introduce machine learning into the mix and come up with a number of useful workflows, such as the one you see here: 3D building reconstruction with correct roof form prediction. This was achieved through a combination of a machine learning approach that predicts seven different types of roof forms by analyzing rasterized point cloud data, and extrusion through our 3D local government solution; we're able to show the finalized product through a web scene, which is what you see on the right. Here are some additional resources to learn more if you're interested, and thank you for listening to my presentation today.
Info
Channel: Esri Industries
Views: 375
Rating: 5 out of 5
Keywords: Esri, ArcGIS, GIS, Geographic Information System, ArcGIS Pro, deep learning
Id: vAHwxg1MgQI
Length: 26min 0sec (1560 seconds)
Published: Fri Oct 01 2021