Calculating spatial statistics of netCDF files - Don't make this mistake!

Captions
Hi, and welcome back to Climate Unboxed. Today I want to talk about how we make spatial averages of fields. If you look online, you'll see lots of people have posted this kind of question, and often you'll see an answer giving a piece of code based on calculating the arithmetic mean. People suggest simply adding up the values in each of the cells, one, two, three, four, five, six, all the way up to the N cells, and then dividing by the number of cells in the file. The problem with that suggested solution is that it's absolutely wrong, and I'll explain the reason why.

When we have a regular grid of regularly spaced latitude and longitude points, it's easy to forget, when we project that field onto a flat screen, that it is not in fact a flat surface. We are not living on a flat Earth; what we see is a projection from the globe. When we look at the original layout of the grid points, we can see that as we move towards the poles the size of the grid cell changes.

Let's look at this example here. Here we have a globe of regularly spaced latitude-longitude points. If we look at the grid cell that's coloured red near the equator and compare it to a grid cell lying further north, we see that the cells become smaller and smaller in size as we move further north, because essentially the lines of longitude converge at the North Pole. This means that if we want to take a spatial average, we need to weight by the cosine of the latitude.

It sounds like a minor point, but it can be quite important, especially if you're looking at a field which has strong spatial gradients in the latitude direction. One example, of course, is temperature. If we take a temperature data set and average it, we find that the differences can be substantial. The blue line, which is the weighted average, is a lot warmer than the simple arithmetic mean. This is simply the result of the fact that temperature gets colder as you move further north, so the unweighted mean gives the small, cold high-latitude cells too much influence. If you look in the winter, when the temperatures are colder, the differences are far greater because the gradients are larger there, and the differences can be up to five degrees, so they're not negligible.

So what I'm going to do now is show you how to do a spatial average using Climate Data Operators (cdo), which accounts for the weighting automatically.

In this directory we have our two-metre temperature file, t2m, and I am going to calculate the field mean using cdo. The command is simply cdo fldmean, then the input file t2m, and then we need to think of an output file name. It is as simple as that. There we go: 2.76 seconds. Now we list the contents of the directory; I'm listing it in long form with -l. We can see we have a second, new file, and its size is much smaller than the original global file. If we look at this file using ncview, there we have it: a time series of the global mean t2m temperature.

In cdo there are other statistical functions that we can substitute for the mean. Instead of the field mean we can, for example, calculate the spatial variance: we keep fld for the spatial calculation of the statistic, but instead of mean we substitute var for variance, giving fldvar. We have the input file name, t2m, and the output file name, here t2m_var.

Now we have an error. The reason, as we can see here, is that there is a problem with the precision, and that's because the data as stored and downloaded from the Climate Data Store is in a compact, packed form. If we look at the file using ncdump on t2m.nc, we see that it's not actually stored as floating point but as a short, which needs to be unpacked using a scale factor and an offset. In other words, we multiply the integer value stored here by the scale factor and then add the offset to get the full value. The problem is that when cdo tries to make a calculation, it sometimes runs into problems with precision. But we saw what the solution was up here: we can add the option -b F32 or -b F64 to turn the output into 4-byte single-precision or 8-byte double-precision floating point. So let's just add -b F32. The function now works, and we have another file which tells us what the variance is. We open it with ncview, and here we have the spatial variance. Of course, for a global field these values are very large.

So we've seen in today's video how, with the Climate Data Operators (cdo), we can very quickly and easily calculate spatial statistics of a netCDF or GRIB stored field, accounting for the underlying grid structure of changing cell sizes. In addition to calculating the mean, just to remind you, you can substitute a whole host of other functions, such as the maximum in space, the minimum in space, the variance and the standard deviation. You can check the documentation for the full range of mathematical functions that are available. I hope you found the video useful, and I look forward to seeing you soon on Climate Unboxed.
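To illustrate the weighting point made in the captions, here is a minimal Python sketch that compares a simple arithmetic mean with a cosine-of-latitude weighted mean on a regular latitude-longitude grid. It is not taken from the video: the function names, the toy latitudes and the made-up temperature field are all assumptions chosen for illustration, and weighting by the cosine of the cell-centre latitude is only an approximation to weighting by true grid-cell area.

```python
import math


def arithmetic_field_mean(field):
    """Unweighted mean: every grid cell counts equally (the mistake)."""
    values = [v for row in field for v in row]
    return sum(values) / len(values)


def weighted_field_mean(field, lats_deg):
    """Mean weighted by cos(latitude), one weight per latitude row.

    field    : list of rows, one row per latitude, values along longitude
    lats_deg : cell-centre latitude of each row, in degrees
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats_deg):
        w = math.cos(math.radians(lat))  # approx. proportional to cell area
        for value in row:
            weighted_sum += w * value
            weight_total += w
    return weighted_sum / weight_total


# Hypothetical toy data: 6 latitude rows, 8 longitudes, and a made-up
# temperature (in kelvin) that falls off linearly towards both poles.
lats = [-75.0, -45.0, -15.0, 15.0, 45.0, 75.0]
toy_field = [[300.0 - 0.6 * abs(lat)] * 8 for lat in lats]

simple = arithmetic_field_mean(toy_field)
weighted = weighted_field_mean(toy_field, lats)
print(f"arithmetic mean: {simple:.2f} K")
print(f"weighted mean:   {weighted:.2f} K")
```

On this toy field the weighted mean comes out warmer than the arithmetic mean, because the cold high-latitude rows receive smaller weights. That is the same direction of difference as the one described for the temperature data set.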
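The unpacking step described in the captions (multiply the stored integer by the scale factor, then add the offset) can be sketched in a few lines of Python. The numbers below are invented for illustration and are not read from the t2m file; the names scale_factor and add_offset follow the usual netCDF attribute naming convention for packed data, which is assumed here rather than confirmed by the video.

```python
def unpack(packed_values, scale_factor, add_offset):
    """Convert packed integers to physical values:
    value = packed * scale_factor + add_offset
    """
    return [p * scale_factor + add_offset for p in packed_values]


# Hypothetical packing parameters and 16-bit integer samples.
scale_factor = 0.002   # assumed: physical units per integer step
add_offset = 270.0     # assumed: physical value represented by integer 0
packed = [-32767, 0, 1000, 32767]

temperatures = unpack(packed, scale_factor, add_offset)
for p, t in zip(packed, temperatures):
    print(f"{p:7d} -> {t:.3f}")
```

With a 16-bit short, the representable values are spaced scale_factor apart, so a computed statistic can fall outside the range or resolution that the packing supports. That is one way to see why writing the result as 32-bit or 64-bit floating point avoids the precision problem.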
Info
Channel: Climate Unboxed
Views: 656
Keywords: cdo, netcdf, grib, spatial statistics, average, variance, standard deviation, maximum, minimum, climate data, weather data
Id: LS4RMiwsURk
Length: 7min 11sec (431 seconds)
Published: Fri Apr 02 2021