Hello, and welcome to the
second video in Module One of Working With Data
& APIs in JavaScript. Now, we're going to do some
real stuff in this video. We did real stuff in
the previous video, but the previous video was just
about practicing with the fetch API and getting
some image files. We weren't really
working with data and we weren't doing
anything with that data yet. In this video, I
want to take a look at this idea of tabular data. There are a lot
of different file formats for storing data in a
table format, as tabular data. The one that I want to
look at in this video, and I would think probably
the most common one, is CSV, or comma-separated values. Meaning the
data in the table is literally
separated by commas. The first line of text (after
all, the CSV file is ultimately just
a plain text file) might function as
something like a header. So it would have the
names of the fields of data you're going to have. So it might have something
like, Item, comma, Cuteness. So you're going to have a table
of things and a cuteness score. So all the rest of the lines
would have the actual items and their cuteness scores. So you could have,
puppy, comma, 10. Kitten, comma, 10. Because we live in a world
where everything has a cuteness score of 10. But I want to do something
with real data, data that's out there in the
world, that I can grab with the fetch function,
load onto my web page, and do something with,
for example, graph it. And so the data set that
I'm going to show you comes from NASA, the
National Aeronautics and Space Administration, in
particular from the Goddard Institute for Space Studies. This CSV file includes the
combined global average land surface air and sea
surface water temperature from 1880 all the
way to present. It's stored in sort
of a funny way. The values that are
actually in the data set are the difference from
the mean temperature. And what do I mean by
the mean temperature? Well, I mean the
average temperature. Well, what's the
average temperature? Well, it so happens that
the average world temperature, as recorded
from 1951 to 1980, is reported by NASA,
on the Earth Observatory website, as 14 degrees Celsius. So the data in this CSV file is
the difference from the mean, from 14 degrees Celsius,
of the combined land surface air and sea surface water
temperature from 1880 to present. So I want to load that CSV
file, parse it, graph it, and we're done. I'm going to do
this in two parts. The first part, that
you're watching right now, is just loading the
CSV file, parsing it. I want to be able
to see it, maybe as a console log in the browser,
in the Chrome Developer Tools. And then once I see I
have the data there, then I want to try to graph it. I'm going to graph it using a
particular JavaScript library, called Chart.js. I'll talk about some other
ways that you can choose to graph stuff, then, as well. If you want to follow
along with the coding (I'm about to start
coding), you're first going to want
to grab that CSV file and have it stored
locally on your computer. So it's a pretty easy process. You just want to go to
data.giss.nasa.gov/gistemp. The URL's here and in
the video's description. And then you want to scroll all
the way down and find the place on the web page that
says, Tables of Global and Hemispheric Monthly
Means and Zonal Annual Means. So, there are actually a ton of
different data sets on this web page, and you
might explore them, and perhaps as an
exercise try graphing a different data
set from this web page. But the one that I'm
using is the last entry in that section, called
Zonal Annual Means, from 1880 to present. And I'm using the
CSV file format. You'll notice there's also a
TXT file format, that's probably tab delimited so that
the sort of data records have tabs in between
them instead of commas. Again, there's a variety
of different formats, but the CSV is the one
that I'm going to use. Time to start coding. So,
if you want to follow
along, let's check in and see if you have
exactly what I have. So what I have is some
boilerplate HTML. It's just a plain
index.html file with a title, a head, a body,
and an empty script tag. I've got links to where
the data is coming from, just to make sure I'm
referencing and crediting properly in my code. And then I also have
that CSV file itself. So, yeah, I'm in
Visual Studio Code. And you can see,
there's my index.html and there's my CSV file in
the same local directory. But you might be
using a different text editor or a different
environment. All of this will work as long as
you have your HTML file and your CSV file. Let's take a look
at that CSV file. So here's the CSV file. You can see that there are a
number of columns: Year; Glob, which I assume stands
for global; NHem, northern hemisphere;
et cetera, et cetera. And then you can see that
the columns of data are each separated by commas. So this isn't really meant
to be human readable. There are ways of viewing a
CSV that are more human readable. For example, here's
how that same data looks in spreadsheet format. You might notice here,
however, that it is colored. Each column has a
different color. This is because I'm using
a Visual Studio Code extension called Rainbow
CSV, which is as if it was tailor-made for me. And I'll include a link
in this video's description if you want to install that
extension as well so you can have things color coded. Another thing I
really like to do when I'm working with a
data set for the first time, is I like to give
myself a test file that has much less stuff in it. Because if I want to console
log and check stuff, sometimes this big file--
this isn't that big of a file, it's just 1880 to 2018. But, potentially, I could have
a really, really big data file. So something I'm
going to do is, I'm going to just quickly do a
Save As and call this test.csv. And so now you can see that I
have a separate test.csv file. And I'm going to just
leave two years in there. So I'm going to scroll all the
way down and delete everything. And now I have a CSV file that
just has three rows in it: the header row and then the data
for 1880 and the data for 1881. So I'm going to work
with this first. And once I have the
parsing and everything I want to do working properly,
then I'll load the real data. So the first step is exactly
what you might think, fetch test.csv. Let's write that code. I'm going to write,
fetch test.csv. And then remember,
fetch returns a promise, and I can handle the
resolution of that promise when it is finished loading
the data with .then and any errors with .catch. But I prefer to use the
await and async syntax, so I'm actually going to put
this in a function called getData. I might think of a different
name for that later. And I'm going to say
response equals await fetch test.csv. So I'm writing a function--
oh, and this function, I almost forgot, has to have
the keyword async because it's an asynchronous function that's
making asynchronous calls with the await keyword. So response equals
await fetch test.csv. Now you might
remember, in the web Fetch API there are a variety
of kinds of data streams that might come in. There is a blob, there's
JSON, there's an array buffer, there's text. And this is what I want
to actually use, raw text. Even though it's tabular
data in CSV format, I'm going to do the parsing
of it manually in my own code. So I just want to
receive it as text. And that means-- I'm going to say-- I'm just going to
say table equals-- maybe I'll just
call this data, equals await
response.text. So let's console log that data. And let's call the
function getData here. And then let's go and
see this actually running in the browser. And here it is. And you can see,
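
A minimal sketch of the loading step described here. It assumes test.csv sits in the same directory as index.html and that the page is opened through a local web server, which fetch typically needs. The names getData, response, and data follow the narration; the .catch on the last line is an addition so that a failed request gets reported.

```javascript
// Load a CSV file as raw text and log it to the console.
// Assumes test.csv is served from the same directory as the page.
async function getData() {
  const response = await fetch('test.csv');
  const data = await response.text(); // plain text, not JSON or a blob
  console.log(data);
  return data;
}

// An async function returns a promise, so any error can be
// handled with .catch().
getData().catch((error) => console.error(error));
```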
there we go, the data has been logged to the console. Now, ultimately,
there are a variety of JavaScript libraries that
will parse a CSV for you. And by parse,
I mean figure out where all the commas are
and break up the data and put it into objects
and make it usable for you. D3, which is a JavaScript
library for data visualization, has a parser in it. p5.js,
which is a JavaScript library that I use a lot
on this channel, has a loadTable function,
which will actually parse the CSV for you. And there are many others. So I'll include some links to
those in the video description. But I think it's a useful
exercise right now. It's simple enough data for
us to do the parsing manually with the split function. What? What split function? What are you talking about? So the JavaScript
string class (any time you have a piece of text
in a variable in JavaScript, it's a string object) has
a function called split. And that function allows you
to take any arbitrary text and split it up into different
elements of an array. And that's basically
what we want to do. We want to split up all the
rows, and then for each row, we want to split
up all the columns. The split function takes an
argument, a separator, otherwise known
as a delimiter. And in this case, we have
two kinds of delimiters. For each row, the
delimiter, the thing that differentiates one row
from another, is a line break. So first let's call
split with a line break. Going to my code, I can say, and
I'm going to call these rows, rows equals data.split. And I'm going to
split by backslash n. So backslash n is an
escape sequence that indicates a line
break or new line. Depending on your
file format, you might also need a backslash r,
which is a carriage return. You can also use something
called a regular expression here. This should also work. Instead of quotes, if
I have forward slashes, the delimiter is a
regular expression. What's a regular expression? So that's beyond
the scope of what we're doing in this
particular video, but I think regular expressions are
so useful when doing string parsing that I will also
link in this video's description to a whole series of videos
that I have about that. But for now, just the
backslash n in single quotes should do for us. So I'm going to say
console.log rows, just to make sure that works. And it does. So we can see here,
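
The row split can be sketched on a small in-memory string, here the made-up cuteness table from earlier rather than the NASA file. The plain '\n' version is what the walkthrough uses; the regular expression /\r?\n/ is one assumed way to also accept files whose lines end with a carriage return plus a line break.

```javascript
// Stand-in for the text returned by response.text().
const data = 'Item,Cuteness\npuppy,10\nkitten,10';

// Split on the newline escape sequence: one array element per line.
const rows = data.split('\n');
console.log(rows); // three elements: header, puppy row, kitten row

// Same idea with a regular expression that also tolerates '\r\n'.
const windowsText = 'Item,Cuteness\r\npuppy,10\r\nkitten,10';
const rowsRegex = windowsText.split(/\r?\n/);
console.log(rowsRegex.length); // 3
```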
this is the raw text. And now this is split into
an array with three elements, and each element is one
line of the text. And one thing, though, we don't
actually need the first row. The first row is really useful,
important information for us as human beings to think
about what the data is. But just for parsing it,
I don't actually need it. An easy way that we can
remove that first row is with the slice function. The slice function is an array
function in JavaScript that makes a copy of
a portion of an array, from a begin index to an end index. So I want the array all
the way to the end, but I want it from element 2,
which is index 1, to the end. So in other words,
what I can do is, I can say data split by line
break, dot slice, index 1. So this will basically
drop the zeroth element and give me a copy of the
array from index 1 to the end. And if we go back to
the browser, we can see, there we go. We now have an array with
just these two rows in it. Perfect. So what's the next step? The next step is splitting
each one of these rows into all of the fields. And the truth of the matter is,
for what I'm doing right now, I only need the year
and the difference from the mean
temperature globally. And that's this data,
this negative .18. So I'm now going to say,
for let i equal zero, i is less than rows.length. So I'm going to just iterate
over all of the rows. You know what? This will be a nice time
for a forEach loop. So I'm going to do
rows.forEach row. Once again, I'm using the ES6
JavaScript arrow syntax. So forEach is a
higher-order function that allows me to
apply something to every element of the array. And each element of the
array is represented by this variable row. So if I just say console.log
row, and go look in here, we can see, there we go, we're
console logging each row. But that's not
what I want to do. I want to say-- you know what I'm going to do,
I'm going to call this row. And I'll call this elt, for
like element of the array. And I'll say row equals
elt split by commas. And then I'll console log row. So what I want to
do is, for each row, I want to split it up by commas. So let's make
sure that works. Let's code. And we can see, OK. So we can see that I've got
both 1880 as an array and 1881 as an array. And then I want to say const
year equals row index 0. And then const
temp, temperature, equals row index 1. Let me remove the first console
logs, sort of clean things up a little bit. And let me run this. And I should see just 1880
temperature, 1881 temperature. And that's exactly
what I have here. And guess what? Now that we've worked
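
Putting the pieces so far together, here is a sketch of the parsing loop at this point in the walkthrough. The sample text is a hypothetical stand-in for test.csv: the column names match the first columns of the file, but the numbers are placeholders, not NASA's values. The variable names rows, elt, row, year, and temp follow the narration; the parsed array is an addition so the result can be inspected.

```javascript
// Placeholder text shaped like test.csv (values are illustrative only).
const data = 'Year,Glob,NHem\n1880,-0.1,-0.2\n1881,-0.3,-0.4';

// Split into lines, then drop the header row with slice(1).
const rows = data.split('\n').slice(1);

const parsed = [];
rows.forEach((elt) => {
  const row = elt.split(','); // one array of fields per line
  const year = row[0];
  const temp = row[1];
  console.log(year, temp);
  parsed.push([year, temp]); // kept only so the result can be checked
});
```

Note that split returns strings, so year and temp here are pieces of text such as '1880', not numbers.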
this out, I can go and use the full data set. So I'm going to just change
from test.csv to this more complicated file name,
ZonAnn.Ts+dSST.csv. So that's the full data set
that I downloaded from the NASA website. I'm going to save. I want to go back. And we can see, there we go. We have every single
year and the difference from the mean
temperature next to it. Now I just noticed, if I go
all the way to the bottom, there's a little bit of
an extra undefined here. So it looks like probably
I need to clean up my data file a tiny bit. I'm assuming there's an extra
line break at the bottom. You can see there's
an extra line, 141. So I'm going to just delete
that, hit Save, and then we can see. There we go, undefined is
no longer appearing there. So it's important
for me to mention that I have kind of
created this very pre-prepared, easy situation. I know that this data file
has no empty data, no mistakes in it, no empty pieces. It's actually already
in CSV format. Just removing that little
extra line break at the end was a tiny bit of cleanup
that I needed to do. And in fact, there is a
function in JavaScript, I could have just
said data.trim, that would have cleaned
it up for me anyway. But I do want to emphasize,
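
A small sketch of why the stray undefined shows up and how trim() avoids it, using made-up sample text that ends with an extra line break. trim() is a standard string method that removes whitespace, including line breaks, from both ends of a string.

```javascript
// Sample text that ends with an extra line break (placeholder values).
const data = 'Year,Glob\n1880,-0.1\n1881,-0.3\n';

// Without trim(): the last element is an empty string, so its
// second column is undefined.
const untrimmed = data.split('\n').slice(1);
console.log(untrimmed.length);           // 3
console.log(untrimmed[2].split(',')[1]); // undefined

// With trim(): the trailing line break is removed before splitting.
const table = data.trim().split('\n').slice(1);
console.log(table.length);               // 2
```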
what if the data actually has commas in it? So if the data has commas
in it, my parsing system is going to break down. Well there are
conventions for this. CSV files actually use
quotes around the information that shouldn't be split,
where there actually is a comma in there. You might find that your data
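
To illustrate the problem, here is a sketch with a made-up row whose first field contains a comma and is wrapped in quotes. The helper splitCsvLine is hypothetical and deliberately minimal: it handles quoted fields but not escaped quotes inside them. For real-world files, one of the parsing libraries mentioned earlier is likely the safer choice.

```javascript
// A made-up row: the first field contains a comma, so it is quoted.
const line = '"puppy, sleepy",10';

// The naive approach splits inside the quotes too.
console.log(line.split(',').length); // 3, not the 2 fields we want

// Hypothetical minimal splitter that ignores commas inside quotes.
function splitCsvLine(text) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (const ch of text) {
    if (ch === '"') {
      inQuotes = !inQuotes; // toggle on every quote character
    } else if (ch === ',' && !inQuotes) {
      fields.push(current); // a comma outside quotes ends a field
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current); // the last field has no trailing comma
  return fields;
}

console.log(splitCsvLine(line)); // two fields: 'puppy, sleepy' and '10'
```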
isn't already in CSV format. You found this data
that you want to use, but it's a PDF, or it's a scan. You're going to have
to use optical character recognition to turn it into
data that you can work with, or transcribe it manually. This might be data that you
want to collect yourself, from your own sensor readings. So there is a ton
of work that can go into prepping and cleaning
data for a project like this, but we're getting
started here in the sort of basic sense of just already
having an easy-to-use data set for us. In some of the future
videos I will look at collecting your own data. And we'll see that as well. Another little quick
bit of refactoring that I could do here is, I think
this rows variable is a little bit confusing. This is really
ultimately the variable that holds the entire table. Right? I'm taking the raw
data, splitting it up into rows, that's the table. And now, taking the raw data
and splitting it up into rows, the table is holding
all that information. So the table-- this
is really looking, not at each element
of the array, this is now looking at
each row of the table. And then it would make sense
to rename what I get from splitting it up. This is an array
that's all the columns. So maybe I'll write
that fully out. So this is columns, this is
row, and then this is row.split. So I think this is
a bit more clear, in terms of what's
actually going on here in parsing that CSV. So I'm getting the
raw data as text. I'm splitting it up, putting
it into a variable called data, going through each--
sorry, I'm putting it into a
variable called table, going through each row of
the table, splitting each row into its corresponding columns. And then, I forgot, this now
has to be columns index 0, columns index 1. There we go. I think I like this better. A little bit of refactoring. This is something that's
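
After this renaming, the code might look like the sketch below. It assumes the full file name from the walkthrough and a page served over HTTP. The parsing is pulled into a separate helper, parseTable, which is not in the video; it is only there so the logic can be tried on a string without a network request. The call to trim() is also an assumption, standing in for the manual cleanup of the trailing line break.

```javascript
// Hypothetical helper: turn CSV text into [year, temp] pairs.
function parseTable(data) {
  // trim() drops a trailing line break; slice(1) drops the header row.
  const table = data.trim().split('\n').slice(1);
  const result = [];
  table.forEach((row) => {
    const columns = row.split(',');
    const year = columns[0];
    const temp = columns[1];
    result.push([year, temp]);
  });
  return result;
}

async function getData() {
  const response = await fetch('ZonAnn.Ts+dSST.csv');
  const data = await response.text();
  parseTable(data).forEach(([year, temp]) => console.log(year, temp));
}

// Try the parsing on placeholder text (not real NASA values).
console.log(parseTable('Year,Glob\n1880,-0.1\n1881,-0.3\n'));
```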
very useful to do when you're working on something. Maybe you make up some
variable names as you're going and you come back
and refactor it to something a bit more clear. We're ready for the next step. Now that we see the data
logged there in the console, we know we could do something
like add it to a DOM element. We could present it
back to the user, to the viewer, the user of
that web page in some form. So what I want to do is try
to do a simple line chart. I think this will be a nice
way of showing the data. And I'm going to do that with
a JavaScript library called Chart.js. So before I get there,
though, maybe you want to try a little
exercise yourself. Can you console log a
different column of data? Can you load a different
CSV that you found and do the same thing with it? See if you could
find your own data set that you might want
to play with and just get the data appearing
in the console. And then you'll be ready for
the next video, doing something with charting it. [MUSIC PLAYING]
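
As a starting point for the exercise, one possible sketch: instead of discarding the header row, use it to find a column by name. The helper getColumn and the sample text are hypothetical; the header names are assumed to match the ones visible in the NASA file (Year, Glob, NHem), and the numbers are placeholders.

```javascript
// Hypothetical helper: return every value in the named column.
function getColumn(data, name) {
  const lines = data.trim().split('\n');
  const headers = lines[0].split(',');
  const index = headers.indexOf(name); // -1 if the column is missing
  if (index === -1) {
    throw new Error(`No column named ${name}`);
  }
  return lines.slice(1).map((row) => row.split(',')[index]);
}

// Placeholder data, not real NASA values.
const sample = 'Year,Glob,NHem\n1880,-0.1,-0.2\n1881,-0.3,-0.4';
console.log(getColumn(sample, 'NHem')); // the two NHem values as strings
```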