The Need for Normalizing Data (Mapping Density)

Captions
There are a number of different ways to measure density. In our raster GIS class we created density surfaces. In these density surfaces we found the number of phenomena, or points, per square mile or per unit area within a certain distance of a pixel. We were able to define the cell size, the type of neighborhood, and the radius or other parameters for that particular neighborhood. Quadrat analysis is also a kind of pseudo-density function: we create quadrats of the same size, say five miles by five miles. Because they're all the same size, if we know the number of points that occur within each one, we can compare their densities directly just by comparing the counts.

Now, quadrats work in theory, but they don't work in the real world, because we have counties, ZIP codes, and block groups, each of which has a different area. You can see in my example here I have deer data for 2010. These represent deer-vehicle collisions. I want to count these within counties and report them to some sort of county-level agency, so that we can find the density of deer-vehicle collisions and the county can make decisions with these data. I could do some quadrat analysis with this. I could also create a deer density surface with this. Instead, I'm going to look at an example where we use a spatial join, create a new attribute, and calculate it, so that we can find the counties with the highest density: the number of deer-vehicle collisions per thousand square miles.

This is how I can do it. I'm going to run a spatial join like I've done in the past. Under my NC counties layer I right-click and run a join. I'm going to join data from another layer based on spatial location, and that layer is going to be the deer data. I could calculate the average or the sum of the attributes, but I'm just going to save this in my C temp drive and call it join output 3, because all I want here is the count, which I'm then going to take and work with.

So here's my spatial join. I think I have about 20,000 deer-vehicle collisions. All I'm doing is assigning each of those to one of the 100 counties we have and representing that with a count attribute. When I right-click and open the attribute table, you can see the data for each of my hundred counties, and over on the right is a column called Count. When I sort descending, the highest is 824, all the way down to only two. I can map these: I double-click the layer to open its properties, go to Symbology, then Quantities, and under Value I choose Count. I can click Classify just to see what the distribution looks like. It's definitely right-skewed, and I can map it like this.

Now, the one problem with mapping count is that larger counties tend to have more deer-vehicle collisions than smaller counties, so we need to normalize. If we were looking at crime or some other social phenomenon, such as birth defects, we might normalize by population. With more physical phenomena, say deer-vehicle collisions, tornadoes, or earthquakes, we would normalize by area. You can see here that Wake County, and I think this is Johnston County over here, are typically going to have higher values of, in this case, deer-vehicle collisions. First of all that's because there are more people, but with a larger area there's also just more of a chance of them happening there. I can express this per unit area or per person, and we're going to look at both.

When I right-click and open my attribute table, I'm going to add a couple of columns: I'm going to normalize by population and then by area. Under Table Options I add a field called DV per person. I make it a double and give it a precision and a scale; with a shapefile I have to define those. In this case I'm going to compute count divided by pop 2013. The one problem with dividing count by pop 2013 is that I get very small numbers, like 0.01, so I'm going to multiply by a thousand. I want the number of deer-vehicle collisions per 1,000 people by county. So I right-click, open the field calculator, and enter 1000 times my count divided by pop 2013.

This gives the deer-vehicle collisions per 1,000 people. If I sort descending, you'll see about 10.03 in this particular county right here, which is Tyrrell County, versus Duplin County, versus Washington County. So I can look at how many deer-vehicle collisions occur for every 1,000 people. If I did this per person, this value, if my math is correct, would be about 0.01. I don't know what 0.01 deer-vehicle collisions per person looks like, but I know what ten for every thousand people looks like. So I'm just moving the decimal point over a little bit. I can map this, keeping in mind that it's per thousand people, and you can see where the values are relatively high. The counties that are darker have more deer-vehicle collisions for every thousand people.

The other thing I can do is normalize by area. I click Add Field, call it DV per area, make sure it's a double, and define the precision and scale. In the field calculator I enter 1000 times my count divided by my square miles. These are the number of deer-vehicle collisions for every thousand square miles. You can see the highest county, which I've highlighted in blue: Wake County has 961 deer-vehicle collisions for every 1,000 square miles. Its count is 824; if we multiply that by a thousand and divide by the area, which is 857, you can see where I get 961. Now we can make a map of this per-area value. When I click Classify, you can see it's still right-skewed, though it has shifted over a little.

So we can look at a combination of these: the density per area or the density per population, because obviously if there aren't a lot of people around, the chance of a deer-vehicle collision is going to be a lot lower than if there are more people around. We can start to use our knowledge of the subject to decide which of these to use, and to be honest with you, I'm not a big expert on deer-vehicle collisions.

There are two things I really want to stress. The first is that we do not want to map count. If we map homicides or crime, obviously they're going to occur in high-population areas because there's more of a chance of them happening there. It's the same with tornadoes: Texas has the most tornadoes because it's so big, but it's only third or fourth on the list of tornado densities, tornadoes per thousand square miles. The top of that list belongs to states like Kansas and Oklahoma. So we need to normalize data.

The other thing we need to do is put the result into units that we understand. If we compute something per capita, such as crimes per capita, deaths per capita, or deer-vehicle collisions per capita, we get very small numbers, something like 0.02. So let's move the decimal point over three places: instead of 0.02 per capita, we report 20 for every 1,000 people, or per 1,000 square miles. These are some effective ways to measure density, just like we did with our quadrat analysis and with our density maps.
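
The two field-calculator expressions from the walkthrough, 1000 times count divided by population and 1000 times count divided by square miles, can be sketched in plain Python. This is only an illustration of the arithmetic, not the ArcMap workflow shown in the video. The function name `rate_per` and the "Example County" numbers are assumptions invented for the sketch; the Wake County count (824) and area (857 square miles) are the figures quoted in the captions.

```python
def rate_per(count, denominator, per=1000):
    """Normalize a raw count by a denominator (area or population),
    scaled so the result reads as 'events per `per` units'."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return per * count / denominator


# Wake County figures quoted in the captions:
# 824 collisions over 857 square miles.
wake_per_1000_sq_mi = rate_per(824, 857)
print(round(wake_per_1000_sq_mi))  # 961

# Hypothetical county; these numbers are invented for illustration only.
example_count = 40
example_population = 4000
print(rate_per(example_count, example_population))  # 10.0 per 1,000 people

# The same value per person is much harder to read.
print(rate_per(example_count, example_population, per=1))  # 0.01
```

Keeping the scale factor as a parameter makes the "move the decimal point" step explicit: the underlying ratio is the same, and only the reporting unit changes.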
Info
Channel: DEEGSNCCU
Views: 5,478
Rating: 5 out of 5
Keywords: GIS, DEEGS, NCCU, Normalizing Data
Id: QMClRuAYY5g
Length: 9min 40sec (580 seconds)
Published: Thu Dec 15 2016