Creating a land-sea mask for gridded data - just like magic!

Video Statistics and Information

Captions
[Music] The Earth is heating up year by year, but the rate at which it does so depends on your location: land points are heating faster than ocean points. If we want to separate the two, we need a way of masking our data. That's why today on Climate Unboxed I'm going to show you how to generate a land-sea mask.

So how are we going to do this? Well, just above my head right here we can see a schematic of three grid boxes, and in these grid boxes we have a schematic of the topography, the height of the surface. The first box is a sea point, and in that box the topography is actually below sea level; it's showing the sea floor. In the second and third boxes we're instead showing land, where the topography is above sea level. So if we have a topographical file that shows the height, we can use this to decide whether a grid box is a land point or an ocean point.

Where are we actually going to get a topographical file? Well, this is where we introduce a little piece of magic from CDO: CDO will allow us to create a topographical file from nothing. [Music] Just like magic. The command to perform this little piece of magic is cdo topo, and then we need an output file name; here we're calling it topo.nc.

So here we have an empty directory. Let us now try to create a topographical file with the topo command, and this is the output file. The command has worked: there we have topo.nc. But when we try to open this with ncview we get this error message: ncview can't recognize the format of input file topo.nc. The reason for this is that CDO produces GRIB by default, irrespective of the file extension. What we need, then, is the command-line option -f nc4 to ensure that the output is in NetCDF format. We can also have the standard NetCDF3, or we can specify GRIB or GRIB2 formats as well. So now we try again, but this time we specify -f for the file format, nc4; the command is topo, and then we have topo.nc. And this time, when we open the file,
we have a global file of the topographical height. There are positive values over the land and negative values over the oceans, and the resolution is half a degree. We can confirm the resolution by first confirming the names of the longitude and latitude variables; we can see we have lon and lat. We can then print those out using ncdump -v (for variable) and then, for example, the longitude, and we notice we have half-degree resolution.

So we can create a topographical file very easily using CDO. It's global, and by default it has half-degree resolution. The problem is that we can't combine this with our data file, because our data file may be on a smaller domain, and even if our data file is global it won't necessarily be on the same half-degree grid. This means we need to remap the data.

Now, I've already shown you how to remap a data file using remap followed by, for example, nn for nearest neighbor; you can see the video in the link just above my head. We can also use bilinear or conservative remapping. So we can remap the topographical file to a new file, which I'm calling topo_era, using cdo remap and then the target data file to specify the grid right after the comma.

However, we don't actually need to do this in two separate stages, because we can also provide the data file directly as an argument to the topo command. Here we combine the two: we have cdo -f nc4, then topo to create the topographical file, then a comma and the target resolution, for which we can just provide our data file, data.nc, and then an output file. This will automatically regrid the topographical file onto the same grid and domain as the data file. Now, this uses nearest-neighbor interpolation, so if you want to use a different interpolation method you will need to do the two separate steps.

So that's step one: we now have a topographical file that's on the same resolution. What we need to do next is to
convert this into a mask, and we can do that using the logical commands that we've also seen in a previous video. We take the topographical file and we use the logical command greater than or equal to a constant (gec), and we provide the argument zero. What does this do? In the output file it makes a mask: wherever the topographical height is above sea level we end up with a one, and where we're below sea level we end up with a zero. We can also use less than or equal to a constant to make a mask which has ones for the ocean points.

Now, it seems like a simple step to combine this with our data file, but wait, there's a problem. Imagine a situation where we have our three grid boxes, one ocean and two land points, and again we're imagining a dataset of temperature; in the middle row we have our newly created mask, with one over land and zero over ocean. If we simply multiply these two together in order to mask the data, we end up with a field which retains the temperature over the land points but sets the temperature to zero over the ocean points. Immediately you can see that if we take the spatial average we end up with the incorrect result, because if we add up the contents of these boxes we have 32 plus 28 plus 0, which makes 60, divided by the 3 boxes, and that gives us an average temperature of 20 degrees. That's not what we want.

What we actually want to do is set the zeros to missing in these data files. If we do that and then take the mean in the spatial sense, we have 32 plus 28 plus missing, and the mean function ignores the missing values, so now we have 60 divided by 2, which gives us the correct spatial mean of 30 degrees. Remember, we need to use mean and not average, because average will just set the result to missing if there's a missing number present.

Now, how do we actually do this? Well, there's a command in CDO which is setctomiss, and again this should be familiar. We can break this down: set
is for setting parameters, c is the abbreviation we've seen already for constant, and tomiss is self-explanatory: we're just setting the missing value. Then we need to give an argument, and in this case we want to set the sea points, which are 0 in the mask, to missing; they will be set to the value specified in the fill value attribute of the field. Instead of c you can also use r for range, which gives us another option to mask out data, as we'll see later in the video.

So now we are able to mask the data using cdo mul, with the data and then the mask, to provide the final output, which is a masked dataset. In this case, using the mul operation to multiply two fields, it will take the data file and the mask file with the missing points set over ocean, and will provide a file which is correctly masked.

So let's see this put together. We're going to be using the ERA5 monthly averaged data on single levels: the monthly averaged reanalysis, the two-metre temperature, all of the years and all of the months, and we select NetCDF. Here's the API request. I've done it here, and here's the file that results. This is actually the data from 1950 to 1978 merged with the data from 1979 to 2020. This is a monthly file; I'm going to average it using yearmean, and the output is going to be a year-mean NetCDF file. So now in this directory we have the t2m file and the t2m year-mean file.

Let me move myself up here. First of all, we make a topographical file: we have cdo -f nc4 for NetCDF4 format, then topo, comma, the t2m file, and then topo as the output. So this is already being projected onto the same grid as the input data. And, if you recall, we then do cdo with greater than or equal to a constant, then the topo file, and then we'll call the output topo mask. Look at this: now we have topo_mask.nc. Now we have a file which is red over the land and blue over
the ocean, and if we look at where the cursor value is read out here, we can see that we have zeros over ocean and ones over land. Step three was to then turn the zeros into missing, so we say setctomiss, comma, zero. This sets all of these zeros over the ocean to missing values. The input is the topo mask file, and we call the output topo_mask_miss.nc. So now when we open this latest file [Music] we have one over all the land points and the missing fill value over the ocean points, which are now colored white.

So now all we need to do is take the product of the data file, the year mean, and the topo_mask_miss file, and I'm going to call it t2m_mask.nc, or we can call it t2m_land.nc. Now, notice that when we make this multiplication the first file has many time slices, while the second one, the mask, has only a single time slice. CDO is broadcasting this data by repeating the slice as many times as needed to multiply it by all of the time steps in the first file. When we open this latest file, t2m_land, we see we are left with just the land points, and all of the ocean points are masked out and set to missing.

So now we can do any spatial statistics. For example, I was showing in the opening scene the time evolution of the land temperature; we can do this with cdo fldmean, then the land file, and then an output file for the mean. When we look at this latest file, we click, and there we have it: the global mean land temperature from ERA5, in this case starting from 1950 and going all the way up to 2020. As easy as that.

So let's summarize. We have four steps. The first step was to generate the topographical file, which can be remapped onto the target dataset's grid by passing the dataset as an argument to the topo command. Step two was to create a binary land-sea mask, and we use the logical function greater than or equal to a constant if we want to mask out the ocean and keep the land points, or less than a constant, with an argument of zero,
if we want to retain the sea points. The third step was to set all the zero points to missing in this mask by using setctomiss, and the last step was to multiply the dataset by this new mask to give us the final masked data. Four easy steps.

Now, if you look online you might find some pages that make these steps in a slightly different order, and that's where setrtomiss (set range to missing) comes in. We can reverse the second and third steps and use the range command directly on the topographical file. In this case we're taking the topographical height, and the range from minus 20 kilometres all the way up to zero is set to missing; then on the resulting file we use the logical greater than or equal to a constant. It doesn't really matter which way you do it. If you use this way, of course, you need to make sure that the lower bound of the range of values set to missing is lower than the lowest ocean floor depth.

So there we have it: we've seen how we can very quickly and easily mask land or ocean points using CDO in four easy steps. I hope you've enjoyed this video, and I look forward to seeing you again soon on Climate Unboxed.
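
The three-grid-box arithmetic described in the captions can be checked with a small sketch. This is only a toy illustration in plain Python, not CDO itself: the helper functions are hypothetical stand-ins that imitate the behaviour described for the greater-than-or-equal test, setting a constant to missing, multiplying two fields, and a mean that skips missing values. The topography heights and the ocean temperature of 15 are made-up values; only 32 and 28 come from the video.

```python
import math

# Hypothetical toy values for the three grid boxes in the schematic:
# one ocean box followed by two land boxes.
topo = [-3500.0, 120.0, 850.0]     # surface height in metres (made up)
temperature = [15.0, 32.0, 28.0]   # 32 and 28 are from the video; 15 is made up


def gec(field, constant):
    """Imitate a 'greater than or equal to a constant' test: 1.0 where true, else 0.0."""
    return [1.0 if value >= constant else 0.0 for value in field]


def set_constant_to_missing(field, constant):
    """Replace every value equal to `constant` with NaN, used here as the missing value."""
    return [math.nan if value == constant else value for value in field]


def multiply(a, b):
    """Element-wise product; NaN in either input propagates to the output."""
    return [x * y for x, y in zip(a, b)]


def mean_ignoring_missing(field):
    """Average over the non-missing values only."""
    valid = [value for value in field if not math.isnan(value)]
    return sum(valid) / len(valid)


# Step 2: binary mask, 1 over land and 0 over ocean.
mask = gec(topo, 0.0)

# Naive masking: the ocean box becomes 0 and drags the average down.
naive = multiply(temperature, mask)
naive_mean = sum(naive) / len(naive)             # (0 + 32 + 28) / 3

# Steps 3 and 4: zeros become missing, then multiply and average.
mask_miss = set_constant_to_missing(mask, 0.0)
land_only = multiply(temperature, mask_miss)
land_mean = mean_ignoring_missing(land_only)     # (32 + 28) / 2

print(naive_mean)  # 20.0
print(land_mean)   # 30.0
```

The naive product gives the misleading 20 degrees, while the version with missing values gives the land-only mean of 30 degrees, which is why the zeros are converted to missing before the multiplication.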
Info
Channel: Climate Unboxed
Views: 166
Id: -GC3e7fqF7I
Length: 15min 38sec (938 seconds)
Published: Tue Nov 23 2021