Howdy! In this tutorial we will be creating a
confusion matrix in ArcMap to assess the accuracy of an image classification. For
instructions on the classification process please see one of our image
classification tutorials. Thematic maps created from imagery will have some
classification errors. Accuracy assessments provide the user with more
information on where the errors are occurring. Depending on the acceptable
level of error, the user will be able to determine whether their classification
is usable or whether they need to reclassify the image. The class accuracies are
determined by comparing test pixels with the corresponding location in the
classified image. In a perfect world we would be able to use field-verified
ground reference locations for the test pixels. This is not always possible, in
which case the user may also select reference points that they have visually
identified from the imagery. The test pixels should be evenly distributed
across the image. They should also be distinct from the training pixels used
for a supervised classification. Confusion matrices are a widely accepted
method of determining the accuracy of a classification, but it is important for
the user to remember that any biases present in their test pixels will also
bias the accuracy of their confusion matrix. The rule of thumb that
I learned was to have 10 times the number of pixels for each class as there
are classes, so if there are 3 land cover classes then there should be 30
test pixels for each land cover, for a total of 90 test pixels. It may not
always be possible to have an equal number of pixels for each class in a
classification. If you know that there is not very much forest in your
classification and there is a large amount of water, it would make more sense
to have 20 test pixels for the forest and 40 for the water. The test pixels need
to be as close to evenly distributed as possible; if all of your test pixels
come from one section of the image, the result will be biased and will only
reflect the accuracy of the section of the image for which you had test pixels.
In ArcMap, open your pre-classification image. At the top of the map, open the
ArcCatalog interface. Navigate to the folder that you wish to work in, or
create one. I have created a folder for my accuracy assessment; now we will
create a new shapefile. I'm going to name mine Reference Points. Make sure that
the type of shapefile is set to Point. Now I'm going to drag my new shapefile
into the map. Reference Points is currently empty. Before I add points to the
shapefile, I'm
going to add two fields to the attribute table. Open the attribute table by
right-clicking on the shapefile and selecting Open Attribute Table. We will
left-click on the button in the upper left and then select Add Field. The first
field I am adding will be the reference for me and any other user of the data
set. The data type will be text and I will name it Landcover. I'm going to
repeat the process of adding a column, but this time I'm going to name it Class
and leave it as a short integer.
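As an aside, these setup steps can also be scripted rather than clicked through
in ArcCatalog. Here is a rough arcpy sketch run from the ArcMap Python window;
the folder path and file name are placeholders I have made up, not part of the
tutorial data.

```python
# Rough arcpy sketch of the setup above; the workspace path and file name are placeholders.
import os
import arcpy

workspace = r"C:\accuracy_assessment"                        # hypothetical working folder
ref_points = os.path.join(workspace, "Reference_Points.shp")

# Create an empty point shapefile to hold the reference points.
arcpy.CreateFeatureclass_management(workspace, "Reference_Points.shp",
                                    geometry_type="POINT")

# Add a text field for the human-readable land cover label
# and a short-integer field for the numeric class code.
arcpy.AddField_management(ref_points, "Landcover", "TEXT")
arcpy.AddField_management(ref_points, "Class", "SHORT")
```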
In order to add a point, I'm going to use the Edit function. To use this, open
the Editor toolbar by right-clicking at the top of the map and selecting the
Editor toolbar. Left-click on the arrow next to the word Editor and select
Start Editing. Next we will select the Reference Points shapefile to open an
editor window. Select the Reference Points file in the editor box, which will
open options in the Construction Tools. Select Point. You will now be able to
add points. When adding points, be sure that you are using the image that you
used to create the classified image, and be zoomed in sufficiently that you are
able to see where you are placing the point. Before I start placing points, I
need to determine how many land covers I can identify in the image. I am
noticing four: trees, pasture, water, and urban areas. Based on the four
classes and the rule of thumb to have 10 times the number of test points for
each class as there are classes, I will select 40 points for each class and end
up with a total of 160 points. For the
moment I'm just going to select a few points for the water class. Once you have
selected your points for one class you will need to add information to the
attribute table so that you know what the points represent and so that the
computer knows. This is why we added both a text and a number column. You may not
know what the class number is yet, because you should not have your classified
image open; having it open leads to the temptation to select points that match
your classified image. When adding the information to the attribute table, it
is easiest to use the Select By Attributes tool: choose Landcover in the dialog
box and set it equal to the unique value that appears as empty quotation marks
(the blank value). The attribute table for the reference points will now have
all of the blank records highlighted, and I can input the value of water by
using the Field Calculator; because water is text, I need to put quotes around
the word water. Now you can
start adding points to the next class. These will be blank in the attribute
table, and you can fill in their values the same way that water's value was
filled in. I'm going to skip ahead here and open up a point file to which I
have already added all of my reference points. After completing all of your
points, add the classified image to your map document. Check what the number
code is for your classes and then assign those codes to the point file so that
class 1 is the same in both. In my case class 1 is pasture, so I will go to
Selection, Select By Attributes, and enter "Landcover" = 'Pasture', then hit
OK. The attribute table for the reference points will now have all of the
pasture points highlighted, and I can input the value 1 by using the Field
Calculator. I have already input the numbers into my file.
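For anyone who prefers scripting, the same select-and-calculate pattern has a
rough arcpy equivalent. This is only a sketch: it assumes a layer named
Reference Points is loaded in the map, and the values shown are just the
pasture example from above.

```python
# Sketch of the Select By Attributes + Field Calculator pattern described above.
# Assumes a layer named "Reference Points" is loaded in the current map document.
import arcpy

layer = "Reference Points"

# Select the records labelled as pasture...
arcpy.SelectLayerByAttribute_management(layer, "NEW_SELECTION",
                                        "\"Landcover\" = 'Pasture'")

# ...and write the matching numeric class code into the selected records.
arcpy.CalculateField_management(layer, "Class", "1", "PYTHON_9.3")
```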
There is one major problem with my data that is now apparent when I compare the
classes that I observed with the classes that the computer observed: there is
no urban class in my classified image. At this point I have two options: I can
stop and perform the classification over again, or I can go ahead and run the
accuracy assessment so that I know what the urban areas are being classified as
and can pay more attention to them when selecting my training data, if I am
using a supervised classification method. To continue with our accuracy
assessment we need to set the system to align the pixels we will create from
the reference points with the pixels of the classification. To do this, go to
the Geoprocessing menu, Environments, Processing Extent; fill in the extent as
your classified raster and set the snap raster to the classified raster as
well, then hit OK. If you do not have the Spatial Analyst extension turned on,
let's do that at this time. Go to Customize, Extensions, and check Spatial
Analyst.
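If you are working from the Python window instead of the menus, the same
environment settings might be set roughly like this; classified.tif is a
placeholder for whatever your classified raster is called.

```python
# Sketch of the environment settings described above; the path is a placeholder.
import arcpy

classified = r"C:\accuracy_assessment\classified.tif"

# Align later outputs with the classification: same extent, snapped to the same grid.
arcpy.env.extent = arcpy.Describe(classified).extent
arcpy.env.snapRaster = classified

# Make sure the Spatial Analyst extension is available for the later steps.
arcpy.CheckOutExtension("Spatial")
```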
Now we will convert the reference points into reference pixels. Open the
Toolbox and go to Conversion Tools, then To Raster, and finally Point to
Raster. The input feature is the reference points, the value field will be
Class, and then browse to the location where you would like to save the output
raster file and name it. You can leave the cell assignment and priority fields
at their default settings, but make sure that the cell size is correct; mine
should be 30 because it is based on a Landsat image. If you do not know what
size it should be, you may also drag the classified raster into the cell size
field and the computer will match it. Hit OK.
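A rough arcpy equivalent of this Point to Raster step, again with placeholder
paths, might look like the following.

```python
# Sketch of the Point to Raster step; the paths are placeholders.
import arcpy

ref_points = r"C:\accuracy_assessment\Reference_Points.shp"
ref_raster = r"C:\accuracy_assessment\ref_pixels.tif"

# Convert the reference points to pixels using the Class code as the cell value;
# the 30 m cell size here matches a Landsat-based classification.
arcpy.PointToRaster_conversion(ref_points, "Class", ref_raster, cellsize=30)
```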
Zoom in to one of your points to see whether it appears to be aligned
correctly. I'm going to zoom in to a river, because this is a small object and
it is easier to tell whether the pixel is in the same place and is the same
size when it is next to multiple classes. Mine looks good; if yours has issues,
revisit the environments and cell size settings. Now we will combine the
reference pixels and the classified image. Go to the Toolbox, then Spatial
Analyst Tools, Local, and Combine.
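The Combine step also has a scripted equivalent; here is a brief Spatial
Analyst sketch with the same placeholder paths.

```python
# Sketch of the Combine step (Spatial Analyst); the paths are placeholders.
import arcpy
from arcpy.sa import Combine

arcpy.CheckOutExtension("Spatial")

classified = r"C:\accuracy_assessment\classified.tif"
ref_raster = r"C:\accuracy_assessment\ref_pixels.tif"

# Combine the reference pixels with the classified image; the output attribute
# table lists each unique (reference class, classified class) pair and its count.
combined = Combine([ref_raster, classified])
combined.save(r"C:\accuracy_assessment\combined.tif")
```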
Looking at the results of the Combine is interesting. If it were not for the
urban class, I would say this was a very successful classification. Class 3 has
39 out of 40 points correctly classed, and the missing point isn't showing up
in any other class, which means it was an unclassified pixel. Classes 1 and 2
were 100 percent correctly placed. My urban class was split nearly equally
between pasture and forest. Although this has told us a lot of information, it
is still not a confusion matrix. To create a confusion matrix we will need to
use the Pivot Table tool, but this cannot be done with the attribute table, so
we will have to export the table. Click the button in the upper left corner and
select Export, and then browse to the location where you wish the table to be
saved. Save the table as a dBASE table. The Pivot Table tool may be found under
Data Management Tools, then Table; the Pivot Table tool should be there. The
input table should be the table you just exported. Select the classified raster
field as your input field and the reference points field as the pivot field.
The value field will be Count. Now navigate to the location where you wish to
save your table, name it, and then hit OK.
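For completeness, a minimal sketch of the Pivot Table step in arcpy is below.
The class field names are assumptions based on my own inputs (Combine names
them after the input rasters), so check your exported table before relying on
them.

```python
# Minimal sketch of the Pivot Table step; table and field names are assumptions.
import arcpy

exported = r"C:\accuracy_assessment\combined_table.dbf"   # table exported from the Combine result

# Pivot so the classified-class field becomes the rows and the reference-class
# field becomes the columns, with the pixel Count as the cell values. The field
# names "CLASSIFIED" and "REF_PIXELS" are placeholders for your own field names.
arcpy.PivotTable_management(exported, "CLASSIFIED", "REF_PIXELS", "COUNT",
                            r"C:\accuracy_assessment\confusion_matrix.dbf")
```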
My matrix is in an order that makes it more difficult to read, but that's OK
for now. I'm going to use Excel from this point forward. This may not be the
best way, but I like to export my table as a text file. Then in Excel I will
open the text file, select Delimited, and hit Next; the file is comma
delimited, so check that box and then hit Next and Next. I'm going to get rid
of the Object ID field. Next I'm going to rearrange my columns so that they are
in 1, 2, 3, 4 order. I'm going to add a row for class 4, my urban class, and
input zeros for all four columns. To make it easier to interpret, I'm going to
change my classes from numbers into land cover names; in my case that is
pasture, forest, water, and urban. So now the matrix is looking pretty, and it
is time to add the formulas so that we can get meaningful numbers out of the
data. The measurements that we will be finding are the kappa coefficient, the
overall accuracy, the class accuracies, the commission, and the omission. I'm
going to insert several rows above my matrix. First I'm going to find the total
number of pixels for each class. I can do this easily by using the =SUM()
formula, selecting the cells that I want, and then dragging the formula into
the neighboring cells. I am going to find the ground truth totals in the same
manner. I know that I had 40 points in water, so I'm going to manually correct
the total for the water class. Next we will find the ground truth percent;
these are the class accuracies. These will have the same headers as the matrix,
so I'm going to copy my matrix and then delete the pixel data from my copy. In
my first cell I will input a formula: the pasture cell divided by the pasture
reference total, multiplied by 100. Add a dollar sign between the letter and
the number of the reference total cell so that the reference stays fixed when
the formula is copied. Now pull the formula down and then pull it across.
Next we'll find the commission. Commission is how many test pixels were
incorrectly classified as a given class, so it is the incorrectly classified
pixels in the row divided by the total number of pixels in the row. This is a
little more time consuming to fill in, because you cannot just drag the formula
down the way you could for the percent. We can think of commission as the rate
at which a class has been over-classified. I'm going to skip ahead here.
Omission is the opposite: it is the incorrectly classified pixels in the column
divided by the total number of pixels in the column. Let's take a moment to
fill these in. I'm going to put 1 in for water, because I had added that one
point back in earlier but I know it was not classified. In mine I have zero
percent omission for pasture and forest, but I have one hundred percent
omission for urban. This makes sense, because there were no pixels classified
as urban in my classified image. Next we will add the producer's accuracy.
These are the correctly classified cells for pasture, forest, water, and urban
divided by the reference point total for each class, which are all 40, and then
we will find the percentage for each land cover. The user's accuracy is like
the producer's accuracy in that it is the correctly classified cells for
pasture, forest, water, and urban, but this time it is divided by the total
number of points that were classified as each class, so 57, 63, 39, and 0.
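If you would like to sanity-check the spreadsheet formulas, the class
accuracies, commission, omission, and producer's and user's accuracies can all
be computed directly from the confusion matrix. The Python sketch below uses a
matrix reconstructed from the counts quoted in this tutorial, with rows as
classified classes and columns as reference classes; note that the one
unclassified water point is not in the matrix, whereas the spreadsheet handles
it by manually setting the water reference total to 40.

```python
# Per-class accuracy figures computed directly from a confusion matrix.
# The matrix is reconstructed from the counts quoted in this tutorial
# (rows = classified classes, columns = reference classes, in the order
# pasture, forest, water, urban); the unclassified water point is omitted.
import numpy as np

matrix = np.array([
    [40,  0,  0, 17],   # classified as pasture
    [ 0, 40,  0, 23],   # classified as forest
    [ 0,  0, 39,  0],   # classified as water
    [ 0,  0,  0,  0],   # classified as urban
], dtype=float)

class_names = ["pasture", "forest", "water", "urban"]

correct = np.diag(matrix)          # correctly classified pixels per class
row_totals = matrix.sum(axis=1)    # pixels classified as each class
col_totals = matrix.sum(axis=0)    # reference (ground truth) pixels per class

with np.errstate(divide="ignore", invalid="ignore"):
    users_acc = np.where(row_totals > 0, correct / row_totals, 0.0)      # 1 - commission
    producers_acc = np.where(col_totals > 0, correct / col_totals, 0.0)  # 1 - omission

commission = 1.0 - users_acc       # rate at which each class was over-classified
omission = 1.0 - producers_acc     # rate at which each class was missed

for name, u, p in zip(class_names, users_acc, producers_acc):
    print("%-8s user's accuracy: %5.1f%%   producer's accuracy: %5.1f%%"
          % (name, u * 100, p * 100))
```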
Now let's go back to the very top and find the overall accuracy and the kappa
coefficient. The overall accuracy is the sum of the correctly classified cells
divided by the total number of cells. This one is pretty easy to find. The
kappa coefficient is a little less straightforward, so let's take a moment to
talk about it. The kappa coefficient may be used as a measure of agreement
between the model predictions, which would be our classified image, and
reality, or to determine whether the values contained in an error matrix
represent a result significantly better than random. A value of one would
indicate perfect agreement between reality and our classified image, and zero
is representative of complete randomness. Here is the kappa coefficient
equation: kappa = (N × Σ x_ii − Σ (x_i+ × x_+i)) / (N² − Σ (x_i+ × x_+i)),
where the sums run from i = 1 to r. N is the total number of sites in the
matrix, r is the number of rows in the matrix, x_ii is the number in row i and
column i, x_i+ is the total for row i, and x_+i is the total for column i. So
what is this actually asking us to do? We need to multiply the total number of
reference points by the sum of the correctly classified pixels. From this
number we subtract the sum, over all classes, of each class's row total
multiplied by its column total. Then we divide all of this by the square of the
total number of reference points minus that same sum of row-total-times-
column-total products.
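To make that arithmetic concrete, here is a short Python sketch of the overall
accuracy and kappa calculation, using the same reconstructed matrix as in the
earlier sketch; it only illustrates the formula and is not a substitute for
working it out in your own spreadsheet.

```python
# Overall accuracy and kappa coefficient from the confusion matrix
# (rows = classified classes, columns = reference classes).
import numpy as np

matrix = np.array([
    [40,  0,  0, 17],
    [ 0, 40,  0, 23],
    [ 0,  0, 39,  0],
    [ 0,  0,  0,  0],
], dtype=float)

n = matrix.sum()             # total number of reference pixels in the matrix
correct = np.trace(matrix)   # sum of the correctly classified (diagonal) cells
chance = (matrix.sum(axis=1) * matrix.sum(axis=0)).sum()  # sum of row total x column total

overall_accuracy = correct / n
kappa = (n * correct - chance) / (n ** 2 - chance)

print("overall accuracy: %.3f" % overall_accuracy)
print("kappa coefficient: %.3f" % kappa)
```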
This is more sophisticated than the overall accuracy, as it takes the
misclassifications into account as well as the correctly classified cells and
the agreement that would be expected by chance. The harder part is now putting
all of this into Excel, but I'm going to leave that to you. Your kappa result
should be similar to your overall accuracy. This concludes our tutorial on
accuracy assessment in ArcMap. Thanks for listening, and for more information
or resources from the Texas A&M University Maps & GIS Library, please visit our
website at library.tamu.edu/maps-gis.