Hi everyone! In this lab, we're gonna make
a continuous map of PM 2.5 or fine particulate matter for the entire state
of Ohio as an application of GIS in management of air quality dataset.
In the first step, let's Google EPA air quality data: open up your browser
and then google EPA air quality data and then click on download daily data.
Here we're gonna make him a based on PM 2.5 so the pollutant: PM 2.5 for the
year 2018 because 2019 is is still not complete dataset so the most recent
version of the data is 2018 for the whole year and the geographic area: Ohio and we want to have all of the monitoring sites so click on get data and then
download CSV. You need to click on "show in folder"
and copy the data into your lab 7 folder and paste it here so let's change
the name of this to PM25 don't use the point PM25_OH so now
let's open the CSV file so you can see we have almost 15,000 records of data so
because the data are daily data and each station so we have a variety of a
stations in Ohio that record data daily so for each station we have almost daily
data. Sometimes you have 3 days 3 days for example for this one 3 days 3 days
we have so many stations that record the data and some of them are daily some of
them are after a few days so in a equal interval they record the data so what
we're looking for is daily mean PM 2.5 concentration and also daily air quality
index and we also have the latitude and longitude right so for example for West Union the latitude and longitude is the same right? so what we're gonna do is
that we're gonna have the yearly data so for this one for West Union I'm gonna
have the average of the PM 2.5 for the entire year and also the average of the
AQI for the for the entire year and also for this station we also need the
average for the entire year of the PM 2.5 and AQI so what we should do we have
to use a function with the name of subtotal. First of all, before using that function let's select this site name and then
sort it-->go to the Data and then click on sort from lowest to highest and then
sort so it sorts the data and after that select all of your data
okay control+A and all of the data so the next step is to in this tab (Data tab)
you have to go to the subtotal and we're gonna classify the data so that for each
station we have the yearly average of the AQI and also the yearly
average of the PM 2.5 so here we have to select the based on
which field based on site_name so based on site name we're gonna have daily mean
PM 2.5 so I checked this and also daily AQI and I'll also need a latitude and
longitude ok then click OK. let's see what happened so if you
go to the second tab you can see we have the (something that was wrong is that
when you go to the subtotal you have to select the average instead of the Sum)
okay and then click OK so now if you go to the second tab you
can see we have the average for each site for the daily AQI for daily mean
PM 2.5 and also the latitude and longitude which is the same average of
all of them all of the same value so it doesn't have any influence
taking average.. The next step is to I need to bring the data into a new data
set and based on latitude and longitude I can bring it into GIS. But what is the
problem? the problem is that this data is based on so many *records okay based on
for example 123 records sorry so if I copy this example if I copy this into
a file: new: blank (and then paste it!) so what happens? the the problem is it
has 15,000 record but I only need the subtotal these values okay so
what we can do is first of all, we need to convert this to the values okay so if
I copy this field and then here if I instead of paste so again copy and then
I use paste "Values" okay So I have this values and also for the
latitude and longitude copy these values here paste values. The next thing is
let's remove this field, so the next thing is now I can use
the filter and only select these records okay? the average so one thing that I can do is that if I only select this blank
cells in this field it shows me just the average right? so I can put a data
and then filter, filter out this field based on the blank and then okay
so what happens I only have the average okay so I can now select all of these
values Except for the grand average we don't
need to crown the average so that's for the average for all of these locations
to copy that and paste it in another excel so now we have the average right? And it's only based on the 55 stations okay? we
don't need the "unit" so we did a very good job for us we couldn't do that
without the unit okay so you have to select the values that are unique all of
the averages so the other thing is we need to remove the word average because
we're gonna put a label based on the site name so how can I remove the word
average so we need to write a query. here we need to write a query that I
only want from the left part and use tabs so when you write the left and then
push the tab button so from this field and based on the lengths of the field
again you have to like Len of the this field - so I have to I'm gonna remove 1
2 3 4 5 6 7 8 so to remove this 8 characters of this field, so what happens?
now we can see that I remove the word average okay so from the left part so we
remove the 8 characters which is common in all of this fields so you have to
remove the word average and also there is space after that so
now I can apply this for all of the fields okay if I click on this one I can
apply the formula for all cells. This one is so much better so I can
type site underscore name because I'm gonna import this into ArcGIS so do not
use a space and then I can remove this field or I can copy here and then paste
values yes then delete so everything looks good the only thing that you
should do is that this one is not good for importing into GIS so I just
and it's not daily PM yearly it's a better name and do not use a space PM_yearly. So I type just the AQI air
quality index the average air quality index we have this one also should be
Yearly. site name site latitude and longitude so we can easily bring the
data into GIS so let's save the data as CSV file in GIS_lab--> lab 7 with the
name of PM_AQI_2018_OH so now we can
oops one thing that I forgot is that this one should be based on CSV
file right? so comma separated value and then save, Yes. Now we can remove it So we have PM_AQI_2018_OH. Let's
launch your ArcMap and then at the counties of Ohio I think we have County
information from lab6 so I add the counties_Ohio here and then I add this CSV
that I just prepared PM_AQI_2018_OH and so the next thing is I have to geocode the
value "Display XY data..." okay so if I click on "Display XY data..." so X field is your
longitude Y field is your latitude and Z is based on so you don't need that
but you need to you need to tell ArcMap the coordinate system let me look
at the values or I can look at the values from here.... So the values are latitude and longitude
and okay so geographic coordinate system because it's latitude and longitude is
not based on meters a decimal degree value so display XY data and then X is
longitude Y is latitude geographic coordinate system North America 1983 so
that's a very good option because when we add the counties
of Ohio so it's kind of got their information the coordinate system from
this one so that's a good option and then click OK so you can see the
location of the of the stations that record the information of the air
quality so you can see here we have so many station but we don't have any
stations here. So question is for example, what is the air quality index and pm in
this stations okay so we only have the data for a specific points so we need to
create a raster, a continuous surface for this, right? so to do that we need to
use interpolation so the interpolate between the points and then change
this points to the continuous surface so if you open up your ArcToolbox and
then we need to find the interpolation and then "Spatial analyst" tools
interpolation and then IDW inverse distance weighting so it requires the
license so we need to before that we need to go to the customize-->
extensions and then check the "spatial analyst" and then close and now it works
because we need to activate the license first.
Here your point feature will be this one PM_AQI and you're Z value where
you're gonna make a map based on the PM 2.5 right so PM_yearly is okay so
where you want to save it in lab7 with the name of PM_2018_OH and then output cell size so we accept this default values
okay and to the environment you can go to the Raster Analysis and set the mask to
counties of Ohio and then okay and okay. What happens? you need to
change the order of the layer so it changed the points to the
continuous surface but it can't do that for this part because we don't have any data here, this process cannot extrapolate so it can only interpolate between the
points so here you can see higher concentration of the p.m. so in these
areas you can change the symbology to for example five classes, natural breaks
okay and then maybe use this color. Here, we
have higher concentration of the PM here also we have higher concentration but
these areas you have lower concentration of p.m. okay so that's a map of PM 2.5
and also we need to do this for the AQI air quality index, the Z value
that you want to make a map is now AQI_Yearly so this time that will be AQI_2018_OH AQI_2018_OH and save and again raster analysis, mask:
counties. Okay and okay. You can change the classification 5, natural break
classification okay apply but let's flip the symbol so that's very similar to
this one so here you can also you can compare PM & AQI that's very
interesting that AQI & PM are highly correlated to each other right
because AQI is made one of the one of the important pollutant in AQI is p.m.
so that's why we have high correlation between these two maps so you have to
go to the layout view and then make two maps the first one for the AQI which is this one and the other one is
for so you need to insert data frame and also one of them is for pm the
other one is for AQI. You can create a map based on both of them that
shows the air quality index Yearly air quality index for the year 2018 yearly
PM for the year 2018 and also you can insert if you open up the
attribute table of this one the shapefile and then look at the....if you
go to the table option--> create graph you can create a scatter plot that shows the
relationship the x for example can be PM the Y can be AQI so you can see
there is they are highly correlated okay so after that add to legend.. remove this
so it's obvious that if I add a trendline
so it's this line which is kind of we already know that so click Next
and then here we can put AQI yearly; PM yearly so that's or maybe you can add
PM 2018 and bottom AQI 2018 remove the legend
finish. So you can also copy this legend to your map and make a map of the
relationship between AQI and also the PM 2.5 and add the legend, North arrow,
you may also add the histogram showing the histogram is not really necessary I
don't think so let's look at the histogram of the AQI: statistics so
histogram is almost normal distribution of the AQI so most of them
AQI are between 33 to 38 and also for the PM we have few *stations that have
very high PM okay so you can select them we have few stations that have
very high PM because these are based on a stations right.