1.2 Tabular Data - Working With Data & APIs in JavaScript

Captions
Hello, and welcome to the second video in Module One of Working With Data & APIs in JavaScript. Now, we're going to do some real stuff in this video. We did real stuff in the previous video, but the previous video was just about practicing with the fetch API and getting some image files. We weren't really working with data and we weren't doing anything with that data yet. In this video, I want to take a look at this idea of tabular data. There are a lot of different file formats for storing data in a table format, as tabular data. The one that I want to look at in this video, and probably the most common one, I would think, is CSV, or comma-separated values. Meaning the data in the table is literally separated by commas. This file, the CSV file, is ultimately just a plain text file, and the first line of text might function as something like a header. So it would have the names of the fields of data you're going to have. So it might have something like Item, comma, Cuteness. So you're going to have a table of things and a cuteness score. All the rest of the lines would have the actual items and their cuteness scores. So you could have puppy, comma, 10. Kitten, comma, 10. Because we live in a world where everything has a cuteness score of 10.
But I want to do something with real data, data that's out there in the world, that I can grab with the fetch function, load onto my web page, and do something with, for example, graph it. And so the data set that I'm going to show you comes from NASA, the National Aeronautics and Space Administration, in particular from the Goddard Institute for Space Studies. This CSV file includes the combined global average land surface air and sea surface water temperature from 1880 all the way to the present. It's stored in sort of a funny way. The values that are actually in the data set are the difference from the mean temperature. And what do I mean by the mean temperature? Well, I mean the average temperature.
Well, what's the average temperature? Well, it so happens that there is an average world temperature, as recorded from 1951 to 1980, which is reported on NASA's Earth Observatory website as 14 degrees Celsius. So the data in this CSV file is the difference from the mean, from 14 degrees Celsius, of the combined land surface air and sea surface water temperature from 1880 to the present. So I want to load that CSV file, parse it, graph it, and we're done. I'm going to do this in two parts. The first part, which you're watching right now, is just loading the CSV file and parsing it. I want to be able to see it, maybe as a console log in the browser, in the Chrome Developer Tools. And then once I see I have the data there, I want to try to graph it. I'm going to graph it using a particular JavaScript library called Chart.js. I'll talk about some other ways that you can choose to graph stuff then, as well.
If you want to follow along with the coding, and I'm about to start coding, you're first going to want to grab that CSV file and have it stored locally on your computer. It's a pretty easy process. You just want to go to data.giss.nasa.gov/gistemp. The URL's here and in the video's description. And then you want to scroll all the way down and find the place on the web page that says Tables of Global and Hemispheric Monthly Means and Zonal Annual Means. There are actually a ton of different data sets on this web page, and you might explore them, and perhaps as an exercise try graphing a different data set from this web page. But the one that I'm using is the last entry in that section, called Zonal Annual Means, from 1880 to present. And I'm using the CSV file format. You'll notice there's also a TXT file format. That's probably tab delimited, so that the data records have tabs in between them instead of commas. Again, there's a variety of different formats, but the CSV is the one that I'm going to use. Time to start coding.
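Since the file stores differences from the mean rather than temperatures, it can help to see the arithmetic once. The following is a minimal sketch, assuming the 14 degrees Celsius baseline quoted above; the names `BASELINE_CELSIUS` and `toAbsoluteCelsius` are made up for illustration and do not appear in the video.

```javascript
// Baseline quoted in the video: the 1951 to 1980 mean, about 14 degrees Celsius.
const BASELINE_CELSIUS = 14;

// Hypothetical helper: turn a difference-from-mean value from the CSV
// into an approximate absolute temperature.
function toAbsoluteCelsius(difference) {
  return BASELINE_CELSIUS + difference;
}

// A difference of -0.18 means roughly 0.18 degrees below the baseline.
console.log(toAbsoluteCelsius(-0.18));
```

The video itself works with the raw difference values, so this conversion is optional.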
So if you want to follow along, let's check in and see if you have exactly what I have. What I have is some boilerplate HTML. It's just a plain index.html file with a title, a head, a body, and an empty script tag. I've got links to where the data is coming from, just to make sure I'm referencing and crediting properly in my code. And then I also have that CSV file itself. So, yeah, I'm in Visual Studio Code. And you can see, there's my index.html and there's my CSV file in the same local directory. But you might be using a different text editor or a different environment. All of this will work as long as you have your HTML file and your CSV file.
Let's take a look at that CSV file. So here's the CSV file. You can see that there are a number of columns: Year; Glob, which I assume stands for global; NHem, northern hemisphere; et cetera, et cetera. And then you can see the columns of data, each separated by commas. So this isn't really meant to be human readable. There are ways of viewing a CSV that are more human readable. For example, here's how that same data looks in spreadsheet format. You might notice here, however, that it is colored. Each column has a different color. This is because I'm using a Visual Studio Code extension called Rainbow CSV, which is as if it was tailor made for me. And I'll include a link in this video's description if you want to install that extension as well, so you can have things color coded.
Another thing I really like to do when I'm working with a data set for the first time is give myself a test file that has much less stuff in it. Because if I want to console.log and check stuff, sometimes with a big file that gets unwieldy. This isn't that big of a file, it's just 1880 to 2018. But, potentially, I could have a really, really big data file. So something I'm going to do is, I'm going to just quickly do a Save As and call this test.csv.
And so now you can see that I have a separate test.csv file. And I'm going to just leave two years in there. So I'm going to scroll all the way down and delete everything else. And now I have a CSV file that just has three rows in it: the header row, then the data for 1880, and the data for 1881. So I'm going to work with this first. And once I have the parsing and everything I want to do working properly, then I'll load the real data.
So the first step is exactly what you might think: fetch test.csv. Let's write that code. I'm going to write fetch test.csv. And then remember, fetch returns a promise, and I can handle the resolution of that promise, when it is finished loading the data, with .then, and any errors with .catch. But I prefer to use the await and async syntax, so I'm actually going to put this in a function called getData. I might think of a different name for that later. And I'm going to say the response equals await fetch test.csv. So I'm writing a function, and this function, I almost forgot, has to have the keyword async, because it's an asynchronous function that's making asynchronous calls with the await keyword. So the response equals await fetch test.csv.
Now you might remember, with the web fetch API there are a variety of kinds of data that might come in. There's a blob, there's JSON, there's an array buffer, there's text. And text is what I actually want to use, raw text. Even though it's tabular data in CSV format, I'm going to do the parsing of it manually in my own code. So I just want to receive it as text. And that means I'm going to say, maybe I'll just call this data, data equals await response.text. So let's console.log that data. And let's call the function getData here. And then let's go and see this actually running in the browser. And here it is. And you can see, there we go, the data has been logged to the console.
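The steps described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the `url` parameter, the `return data` line, and the browser check at the bottom are additions so the function is easier to reuse and to try outside a web page.

```javascript
// Load a CSV file as raw text and log it to the console.
// The url parameter is an addition; the video hard-codes 'test.csv'.
async function getData(url = 'test.csv') {
  const response = await fetch(url);  // wait for the request to complete
  const data = await response.text(); // read the response body as a plain string
  console.log(data);
  return data; // returned only for convenience; the video just logs it
}

// Run automatically only in a browser, where 'test.csv' resolves
// relative to the page that contains this script.
if (typeof window !== 'undefined') {
  getData().catch(console.error);
}
```

Note that fetching a local file like this generally needs the page to be served from a local web server rather than opened directly from disk.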
Now, ultimately, there are a variety of JavaScript libraries that will parse a CSV for you. And by parse, I mean figure out where all the commas are, break up the data, put it into objects, and make it usable for you. D3, which is a JavaScript library for data visualization, has a parser in it. p5.js, which is a JavaScript library that I use a lot on this channel, has a loadTable function, which will actually parse the CSV for you. And there are many others. So I'll include some links to those in the video description. But I think it's a useful exercise right now. It's a simple enough data set for us to do the parsing manually with the split function.
What? What split function? What are you talking about? So the JavaScript string class (any time you have a piece of text in a variable in JavaScript, it's a string object) has a function called split. And that function allows you to take any arbitrary text and split it up into different elements of an array. And that's basically what we want to do. We want to split up all the rows, and then for each row, we want to split up all the columns. The split function takes a single argument here, a separator, otherwise known as a delimiter. And in this case, we have two kinds of delimiters. For each row, the delimiter, the thing that differentiates one row from another, is a line break. So first let's call split with a line break. Going to my code, I can say, and I'm going to call these rows, rows equals data.split. And I'm going to split by backslash n. So backslash n is an escape sequence that indicates a line break, or new line. Depending on your file format, you might also need a backslash r, which is a carriage return. You can also use something called a regular expression here. This should also work. Instead of quotes, if I have forward slashes, the delimiter is a regular expression. What's a regular expression?
So that's beyond the scope of what we're doing in this particular video, but I think regular expressions are so useful when doing string parsing that I will also link in this video's description to a whole series of videos that I have about them. But for now, just the backslash n in single quotes should do for us. So I'm going to say console.log rows, just to make sure that works. And it does. So we can see here, this is the raw text. And now this is split into an array with three elements, where each element of the array is one line. One thing, though: we don't actually need the first row. The first row is really useful, important information for us as human beings to think about what the data is. But just for parsing it, I don't actually need it.
An easy way that we can remove that first row is with the slice function. The slice function is an array function in JavaScript that makes a copy of a portion of an array, from a beginning index to an end index. So I want the array all the way to the end, but I want it from element 2, which is index 1. So in other words, what I can do is say data split by line break, .slice, index 1. This will basically drop element zero and give me a copy of the array from index 1 to the end. And if we go back to the browser, we can see, there we go. We now have an array with just these two rows in it. Perfect.
So what's the next step? The next step is splitting each one of these rows into all of the fields. And truth of the matter is, for what I'm doing right now, I only need the year and the difference from the mean temperature globally. And that's this data, this negative .18. So I'm now going to say, for let i equal zero, i is less than rows.length. So I'm going to just iterate over all of the rows. You know what? This will be a nice time for a forEach loop. So I'm going to do rows.forEach row. Once again, I'm using the ES6 JavaScript arrow syntax.
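The split and slice steps described above can be tried on their own with a small string. This is a sketch under assumptions: the `data` string is a hypothetical stand-in for the contents of test.csv, only the -.18 value comes from the video, and the name `allRows` is introduced here for illustration.

```javascript
// Hypothetical stand-in for the text of test.csv (header plus two years).
// Only the -.18 value is taken from the video; the rest is placeholder data.
const data = 'Year,Glob\n1880,-.18\n1881,-.10';

// Split the raw text on line breaks: one array element per line.
// For files with carriage returns, data.split(/\r?\n/) is one alternative.
const allRows = data.split('\n');

// slice(1) returns a copy from index 1 to the end, leaving out the header row.
const rows = allRows.slice(1);

console.log(allRows.length); // 3: the header plus two data rows
console.log(rows);           // the two data rows, still as unsplit strings
```

slice does not modify `allRows`; it returns a new array, which is why the video describes it as making a copy.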
So forEach is a higher-order function that allows me to apply something to every element of the array. And each element of the array is represented by this variable row. So if I just say console.log row, and go look in here, we can see, there we go, we're console logging each row. But that's not what I want to do. You know what I'm going to do, I'm going to call this row. And I'll call this elt, for element of the array. And I'll say row equals elt split by commas. And then I'll console.log row. So what I want to do is, for each row, split it up by commas. So let's make sure that works. Let's go. And we can see, OK, I've got both 1880 as an array and 1881 as an array. And then I want to say const year equals row index 0. And then const temp, temperature, equals row index 1. Let me remove the first console logs, sort of clean things up a little bit. And let me run this. And I should see just 1880 and its temperature, 1881 and its temperature. And that's exactly what I have here.
And guess what? Now that we've worked this out, I can go and use the full data set. So I'm going to just change from test.csv to this more complicated file name, ZonAnn.Ts+dSST.csv. That's the full data set that I downloaded from the NASA website. I'm going to save. I'm going to go back. And we can see, there we go. We have every single year and the difference from the mean temperature next to it. Now I just noticed, if I go all the way to the bottom, there's a little bit of an extra undefined here. So it looks like I probably need to clean up my data file a tiny bit. I'm assuming there's an extra line break at the bottom. You can see there's an extra line, 141. So I'm going to just delete that, hit Save, and then we can see, there we go, undefined is no longer appearing there. So it's important for me to mention that I have kind of created this very pre-prepared, easy situation.
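The forEach step described above might look like the sketch below. It uses the `elt` and `row` names from this point in the video, before the later renaming. The `rows` array is hypothetical placeholder data (only -.18 comes from the video), and the `parsed` array is an addition so the result can be inspected after the loop.

```javascript
// Hypothetical data rows, as they would look after split('\n') and slice(1).
const rows = ['1880,-.18', '1881,-.10'];

// Addition for illustration: collect the values as well as logging them.
const parsed = [];

rows.forEach(elt => {
  const row = elt.split(','); // one array element per column
  const year = row[0];        // first column: the year
  const temp = row[1];        // second column: global difference from the mean
  console.log(year, temp);
  parsed.push({ year, temp });
});
```

Both `year` and `temp` are still strings at this point. Converting them with something like `parseFloat` would be a separate step that this part of the video does not cover.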
I know that this data file has no empty data, no mistakes in it, no empty pieces. It's actually already in CSV format. Just removing that little extra line break at the end was a tiny bit of cleanup that I needed to do. And in fact, there is a function in JavaScript for that. I could have just said data.trim, and that would have cleaned it up for me anyway. But I do want to emphasize, what if the data actually has commas in it? If the data has commas in it, my parsing system is going to break down. Well, there are conventions for this. CSV files actually use quotes around the information that shouldn't be split, where there actually is a comma in there. You might find that your data isn't already in CSV format. You found this data that you want to use, but it's a PDF, or it's a scan. You're going to have to use optical character recognition to turn it into data that you can work with, or transcribe it manually. This might be data that you want to collect yourself, from your own sensor readings. So there is a ton of work that can go into prepping and cleaning data for a project like this, but we're getting started here in the basic sense of just already having an easy to use data set for us. In some of the future videos I will look at collecting your own data. And we'll see that as well.
Another little quick bit of refactoring that I could do here: I think this rows variable is a little bit confusing. This is really ultimately the variable that holds the entire table. Right? I'm taking the raw data and splitting it up into rows, and that's the table. The table is holding all that information. So this is really looking not at each element of the array, this is now looking at each row of the table. And then it would make sense to rename the thing it's being split up into. This is an array that's all the columns. So maybe I'll write that fully out. So this is columns, this is row, and then this is row.split.
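The stray undefined and the data.trim fix mentioned above can be reproduced with a short string. This is a minimal sketch with hypothetical placeholder data (only -.18 comes from the video); the variable names `untrimmed` and `trimmed` are made up for the comparison.

```javascript
// Hypothetical file contents with an extra line break at the very end.
const data = 'Year,Glob\n1880,-.18\n1881,-.10\n';

// Without trimming, the final line break produces an empty last row.
const untrimmed = data.split('\n').slice(1);
console.log(untrimmed.length);           // 3
console.log(untrimmed[2].split(',')[1]); // undefined: the empty row has no second column

// trim() removes leading and trailing whitespace, including the final line break.
const trimmed = data.trim().split('\n').slice(1);
console.log(trimmed.length);             // 2
```

Trimming in code means the data file can stay exactly as downloaded. It does nothing for values that contain commas inside quotes; for that case a CSV parsing library is the safer route.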
So I think this is a bit more clear, in terms of what's actually going on here in parsing that CSV. I'm getting the raw data as text. I'm splitting it up, putting it into a variable called table, going through each row of the table, and splitting each row into its corresponding columns. And then, I forgot, this now has to be columns index 0, columns index 1. There we go. I think I like this better. A little bit of refactoring. This is something that's very useful to do when you're working on something. Maybe you make up some variable names as you're going, and you come back and refactor them to something a bit more clear.
We're ready for the next step. Now that we see the data logged there in the console, we know we could do something like add it to a DOM element. We could present it back to the viewer, the user of that web page, in some form. So what I want to do is try to make a simple line chart. I think this will be a nice way of showing the data. And I'm going to do that with a JavaScript library called Chart.js. Before I get there, though, maybe you want to try a little exercise yourself. Can you console.log a different column of data? Can you load a different CSV that you found and do the same thing with it? See if you can find your own data set that you might want to play with and just get the data appearing in the console. And then you'll be ready for the next video, doing something with charting it. [MUSIC PLAYING]
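Putting the refactored names together, the whole thing might look something like the sketch below. It follows the table, row, and columns naming described above, but it is not a copy of the code on screen: the `url` parameter, the `trim()` call, the `results` array, and the browser check are additions made for this example.

```javascript
// Fetch a CSV file, split it into rows and columns, and log year and temperature.
async function getData(url = 'ZonAnn.Ts+dSST.csv') {
  const response = await fetch(url);
  const data = await response.text();

  // trim() is an addition here; the video removes the trailing blank line by hand.
  const table = data.trim().split('\n').slice(1); // drop the header row

  const results = []; // addition: keep the values so a caller can use them
  table.forEach(row => {
    const columns = row.split(',');
    const year = columns[0];
    const temp = columns[1];
    console.log(year, temp);
    results.push({ year, temp });
  });
  return results;
}

// Run automatically only in a browser, next to the CSV file.
if (typeof window !== 'undefined') {
  getData().catch(console.error);
}
```

For the exercise of logging a different column, changing `columns[1]` to another index is the only edit this sketch would need.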
Info
Channel: The Coding Train
Views: 143,446
Id: RfMkdvN-23o
Length: 17min 15sec (1035 seconds)
Published: Thu May 23 2019