Using NCDUMP to reveal the secrets of Netcdf

Video Statistics and Information

Captions
[Music] Welcome back. What we're going to do in this video is download and examine the NetCDF file that we get from the data request we made in the previous video. If your request completed satisfactorily, then when you go to the "Your requests" tab on the Copernicus website, which I introduced in the last video, you'll find a list of requests with the status "Completed", each with a green download button on the right-hand side. We also have the size of the file: one month's worth of global data for a single field on a single level (we retrieved the two-meter temperature) has a file size of 1.5 gigabytes, because we have all the days of one month and all the times in one day, 24 hourly steps. So the files are quite large.

Now we can click the download button, and you should get a save dialog box. You'll notice that the suggested file name is very strange, with an internal request number and so on, which is rather obscure, so you probably want to change it to a more logical name such as t2m followed by the year of the data. Then you choose a location; I previously clicked on New Folder and made a folder here called class, and then clicked Save. Depending on your internet speed, it will take quite a while to download the data, so you set it off and let it download. I'm going to move this window to one side and minimize it.

What you need to do now is open a terminal window. I'm using a Mac here, so I've got a terminal window that runs the zsh shell, which is, shall we say, an enhancement of the classical bash shell. If you're using an Ubuntu machine, you can also open a terminal window and run basically these same commands. If you're using Windows 10, as you'll see in the comments below the
video, what you can do is install Ubuntu as a subsystem under Windows 10. It's very easy to do, and once you've done that you can simply click on the terminal symbol and open a terminal, and you basically have Ubuntu directly under Windows 10. So on all three of these common platforms you can use the same commands.

First of all, we have to make sure that we're in the directory where the file is found. So the first Linux command we need is pwd, print working directory, and you can see I'm in Users/tompkins/class, which is the directory, the folder, where I saved the data file earlier. We can list the files using ls. We can see that we've got one previously downloaded NetCDF file, which I've just called t2m.nc, and then this is the download which I just started from the website, which is currently underway.

What we want to do now is examine the NetCDF file, and we're going to use two utilities to do this. The first utility is called ncdump. After ncdump we need the file name. A little tip here, by the way: if you start to type a file name and press the Tab key, it auto-completes up to the point where the file name is unambiguous. So if you have two files whose first few letters are the same, it will only complete up to the point where the names differ and then stop, and if you press Tab again it will give you a list of the choices. Here we only have one file starting with this name, so when I press Tab we get the completion of the whole file name.

So if we do ncdump t2m.nc, what happens? Well, it basically dumps everything in the file. There are a lot of numbers; as we said in the previous video, these are very, very large files, so I'm going to interrupt this with Ctrl-C. Now we're going to run the command again, so I can press the up arrow to get to the
previous command, and instead of simply dumping the whole file I'm going to add an option. So we have ncdump, a space, and then -h. The h stands for header, so this option says only dump the header of the file. Let's see what happens now. We press Enter, and again it has scrolled, but we can scroll back to where we first typed the command; you can see the command here.

So what does it show us? First of all it says netcdf, the format of the file, then t2m, the file name, and then we have a curly bracket, and now we see the different sections that were described in the previous video. Let's go through these in turn.

The first is the dimensions. If you recall, these describe the dimensions of which the variables can be a function. We have a dimension called longitude, and it says = 1440; that tells us how many longitude cells we have, so 1440 longitude cells. Latitude has 721 cells, so that's about half the number of longitude points, as we would expect if the resolution of the data is the same in the longitude and latitude dimensions. We then have 744 times. If you do a quick back-of-the-envelope calculation you can see this is what we expect: it's the number of days in the month multiplied by 24 time slices.

Now we come to the next section, which is the variables section, where the values and the metadata for the dimensions are defined. The first thing it says is float, which tells us what type that variable is. A float, of course, is a real number, but you can see that the time is actually stored as an integer. So we have the longitude, and then the other metadata: the longitude has units of degrees_east, and the longitude long name is longitude as well, so it's the same as the short name.

Just to point out, if you recall, right at the end of the last video I said that there are Climate and Forecasting
conventions that have been defined, usually abbreviated to CF conventions, and this units value, degrees_east, is the standard CF convention for longitude. We could have just put east, or d east, or deg east; all of those would be allowed in the file, and a user reading those attributes would understand what was meant. However, because that's not the standard defined by the CF conventions, many programs that open the file wouldn't recognize this as longitude, even though it has the name longitude. So it's important, if you're creating a file, to look up what the CF conventions are and try to stick to the standards in the definition of your file.

For latitude we have degrees_north, and for the time this is again using CF conventions: the units are hours since, followed by a date and time in a standard date-time format. You can use hours since or days since. The long name is time, and then we have the calendar type, which is gregorian.

Next we come to the variables. This file only has one variable inside it. It's a short, and we'll come back to what that means in a moment, and it's t2m, which is a function of time, latitude and longitude. First of all we have two metadata items called scale_factor and add_offset. I'm going to come back to those in just a moment, so let's skip by them for a second. We have the fill value for t2m: if you remember from the last video, if you find this value, -32767s, where the s again stands for a short type, it means that the value is missing. Then there are the t2m units, which are kelvin, K, and the long name is two-meter temperature. If you recall, t2m maybe doesn't say very much to a user who's not familiar with this kind of data, but in the long name we can immediately see what the variable is.

Last but not least, we have the global attributes. We have Conventions, which is CF-1.6, so these are the conventions used to
define these attributes, and then we have history. This is a common global attribute, and it tells us how the file was created. We requested the data in NetCDF format, but on their internal systems this data is actually stored in GRIB format, which is another self-describing file type. So after being retrieved it has been converted from GRIB into a NetCDF file using this command internally, and that command has then been added to the history, so you can see exactly what has gone on and how this file was created.

I just quickly want to go back and talk about this add_offset and scale_factor. What you'll see is that t2m is listed as being a short. So what does that mean? Well, let's go back to the previous command, where we dumped the whole file, so that dumps not just the header but the actual contents of the dimensions and the variables too. These numbers, and I'm going to interrupt this, are actually temperature values, but they don't seem very realistic as temperature values: 3681 kelvin would be pretty toasty. This is because the data has been stored in a compact way. To get the real value of the temperature, we need to take that integer, apply a scale factor and add an offset. In other words, the actual value is the scale factor times the value in the file, and then you have to add the offset, 2.261.7724. In this way the data is compacted. If you don't like that and you want to store it as a float, I will show in the next video how to convert to a full float, which of course will inflate the file size, but it means that when you do ncdump you see the values directly.

So we've seen in today's film how we can use ncdump to interrogate a file and see what its contents are by reading the header. In the next video we're going to use another utility, ncview, to graphically display that file and the variables within it. We can use that
utility to see the values at certain locations, and also to generate time series or even animate the different frames of the file. So hopefully I'll see you soon in the next video. Thank you very much. [Music]
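The unpacking rule described in the video (real value = scale factor times the stored integer, plus the offset) can be sketched in a few lines of Python. This is only an illustration: the scale factor and offset below are hypothetical placeholders, not the values from the file in the video, and the function names `unpack` and `pack` are made up for this example. Only the fill value of -32767 comes from the header discussed above.

```python
import math

# Hypothetical packing attributes, chosen for illustration only;
# read the real ones from your own file's `ncdump -h` output.
SCALE_FACTOR = 0.002
ADD_OFFSET = 260.0
FILL_VALUE = -32767  # the fill value shown in the header (a short)


def unpack(packed):
    """Convert one packed short to a physical value, or NaN if missing."""
    if packed == FILL_VALUE:
        return math.nan
    return SCALE_FACTOR * packed + ADD_OFFSET


def pack(value):
    """Inverse operation: physical value to the nearest packed short."""
    if math.isnan(value):
        return FILL_VALUE
    return round((value - ADD_OFFSET) / SCALE_FACTOR)


# A stored integer such as 3681 is not a temperature until it is unpacked.
packed_values = [3681, -5000, FILL_VALUE]
temperatures = [unpack(p) for p in packed_values]
print(temperatures)
```

With these placeholder attributes the stored integer 3681 becomes roughly 267.4 K, which is a plausible two-meter temperature, while the fill value is mapped to NaN rather than being scaled.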
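The back-of-the-envelope checks on the dimensions can also be written out explicitly. The sketch below uses only the numbers read from the header in the video (1440 longitudes, 721 latitudes, 744 times) plus the fact that a short occupies 2 bytes; the variable names are made up for this example, and the assumption that the latitude grid includes both poles is an inference from the arithmetic, not something stated in the video.

```python
# Dimension sizes as read from the `ncdump -h` header in the video.
n_lon, n_lat, n_time = 1440, 721, 744

# Grid spacing implied by the longitude count: 360 degrees / 1440 cells.
resolution = 360 / n_lon

# 744 hourly time steps correspond to a 31-day month.
days = n_time // 24

# With the same spacing in latitude, and assuming both poles are
# included, 180 degrees gives 180 / 0.25 + 1 points.
expected_lat = int(180 / resolution) + 1

# Each value is stored as a short (2 bytes), which explains the file size.
n_values = n_lon * n_lat * n_time
size_bytes = n_values * 2
size_gb = size_bytes / 1e9

print(resolution, days, expected_lat, round(size_gb, 2))
# prints: 0.25 31 721 1.54
```

The result of about 1.54 GB matches the 1.5 gigabyte download size quoted at the start of the video, and it shows why storing the same data as 4-byte floats would roughly double the file size.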
Info
Channel: Climate Unboxed
Views: 657
Keywords: netcdf, ncdump, climate, weather, gridded data, analysis, metadata
Id: ggp6pEHllgU
Length: 13min 59sec (839 seconds)
Published: Fri Apr 02 2021