Python: How To Subset NetCDF File For The Study Area

Captions
Welcome back. Hi everyone. Today I'm going to show you how to make a subset of a NetCDF file. If you have a NetCDF file that covers a wide area and you want to reduce it to a smaller area, that is, cut out the values and the data for your study area, this is how. The original file may be pretty big, and you may not even be able to process it, because your computer can run out of memory and it takes a lot of time to process the entire file. Basically, we don't need all of that. So, in order to make the process faster, we have to make a subset of the data. Let me quickly show you how to do that.

If you are familiar with Python programming, you can do this, and by this time you probably know how to read a NetCDF file. If you don't, I have another video where I already demonstrated how to read a NetCDF file. It is pretty easy. You can just use one module called netCDF4, and then you can easily read the variables, or you can use Panoply to inspect the variables. Then you can call those variables by their names and make the subset. But in order to make a subset, you have to have the coordinates of your boundary. If you have a GIS shapefile, you can get the corners from that, so I'm going to show you how to do that.

First, let me show you the boundary box, or bounding box. I'm sharing my screen right now, and that is the code. Before going to the code, I'm going to show you the extent of the data using Panoply, which I already mentioned. First you have to understand the extent: if the data covers your area, then we can definitely create the subset; otherwise we won't be able to. So we have Panoply there, and I'm opening that
wind and pressure field file. We can open any NetCDF file. Let's plot the pressure, just to see the extent of the data, nothing else. See, it's covering a large part of the USA and part of the Atlantic Ocean. My study area is located there, in Texas, so yes, the data definitely covers it. No worries. So we'll extract, not streamflow, but pressure. If we had streamflow from the National Water Model, we could subset that too, because it is also in NetCDF format. But you can't extract streamflow as point data if you have a resolution like 0.25 kilometers; you can only do this because these are gridded data, and we are going to use gridded data here.

So that is the extent. Now let me make the boundary: the lower-left corner and the upper-right corner. How can I do that? You can create a new rectangular shapefile that covers this area. Go to your home directory and your home geodatabase, go to New, and create a feature class. Because we are going to create a polygon, a rectangle, I'm naming it "boundary", that is, the study area boundary, and I'm keeping the same name as the alias. The type is polygon, that's okay, and I'm clicking Next. The coordinate system is WGS 1984. Next, and Next. It will create an empty shapefile.

What you have to do is go to Editor, click Start Editing, and select that boundary layer, because that is the one we are going to edit. It's going to show you some options; click Continue. Then on the right side you can see the Create Features option. If you click on the boundary, it gives you these options in the lower-right corner: polygon, rectangle, circle, and so on. For this case I'm going to select Rectangle. That is the easiest one, because if you select two points it will just continue with that. So I'm clicking there for one point, and I'm clicking
there for another point, making it a perfect rectangle that covers my area. If I go down, it is covering my area, so double-click and it's done. Now I have to stop editing, and it will ask you to save. Yes, I'm saving it. It's done. If you want to make it hollow, you can do that.

Now, here is what you need to know: you can read off the coordinates this way. In the lower-right corner you can see the latitude and longitude, and they change wherever you move your mouse pointer. I put the pointer here because I need this point and the upper-right corner; those two alone define the bounding box. If you go up and down, the latitude changes but the longitude stays the same, and if you go from left to right, meaning from west to east, your longitude changes but your latitude stays the same. So we have one latitude along this edge, one latitude along that edge, one longitude along this edge, and one longitude along that edge. So I'm clicking there and writing the values down.

Before doing that, I'm editing the code at the same time, so I'll show you step by step. Here I have the code. I'm importing netCDF4 as before. I already mentioned how to read a NetCDF file; if you are not familiar with that, you can watch the previous video, where I demonstrated it. You have to import netCDF4, and netCDF4 provides Dataset for reading a NetCDF file, so I'm importing that. Since I'll do some array processing, I'm importing NumPy, and since the data is also related to time, I'm importing the time module as well. I'm also suppressing
some warnings this way. You don't need to do that; you only need the upper part of this code. Here I'm specifying where the file is located. Let me find the file. It should be somewhere here. Here it is, the wind and pressure NetCDF file. How can you get the path? You can click on the file, go to Properties and then Security, and it will show the full path, including the name of the file. I'm copying the full path and pasting it there, because we need that file name. Since this is Windows, I'm doing it this way. If you want to use forward slashes, you can do that.

Okay, I have the path of the file. Now I have to specify my boundary: the latitude boundary and the longitude boundary, each from minimum to maximum. For longitude it looks like the opposite: see, 84 is the larger number, but since it is negative, it is the minimum. So that one also goes from minimum to maximum.

Now we need the GIS. If you point there, the lower-right corner shows the coordinates. We are focusing on the latitude first. It is giving us 28.96, so a slightly lower value is better. If you go that way it increases to 29; you could use 29, that's not a problem, but 28.9 is better. So that will be 28.90. And what will be the other one, the maximum? At the upper boundary it is 30.85, so we can go with that: 30.85. And what is the longitude boundary? That side is the maximum value, and if you go that way, that is the minimum. So
here is the minimum: minus 96.28, so there it will be negative 96.28. And what will be the other one? It may be about 93; let's check the decimal degrees as well. That one is 93.88, or .87, so we can go with 93.87, negative again. So we have the boundary.

The data covers part of the Atlantic and basically half of the USA. I don't need all that, because my study area is pretty tiny. I need to exclude all the rest, because this is high-resolution data and it is pretty big. It will eat up your disk space, and sometimes it will show an error that you are out of memory, because your computer can't handle the array when we do array operations. That's why I'm making the subset.

So we have the boundary. Now we do the same thing we did to read a NetCDF file. I use Dataset to get the data object, and from this data object I'm reading the variables by name: the latitude, the longitude, and also the time variable. I'm also getting the total length of the time axis, because I'm extracting every time step; I know we have 38 steps.

Then, here, I'm taking the minimum latitude of my boundary, which I know, and passing it in. The code takes the difference between that value and the latitudes in the file, does the same for the upper latitude, and then calls argmin. argmin gives you the index of the minimum value, and argmax gives you the index of the maximum value. That's why I'm taking the difference: I'm trying to catch the closest value. So I'm using the same logic here
that I used for extracting the time series from a NetCDF file. If you haven't watched that video, you can go back and check it. I'm taking the difference between all the latitudes in the NetCDF file and the latitude I'm giving here, and I'm extracting the index of the file latitude that is closest to my point. If the two points coincide, the difference will be zero, which is the minimum. If the closest difference is, say, 0.1, and that is the minimum, we'll take that grid value as our 28.9. I'm doing the same thing for the other bounds as well. This is how I get the indices of the four bounds, the four corners, and once I have those indices, I can easily extract the values between those corners. That is the logic I'm using to make the subset.

Now I'm building the latitude and longitude boundary from the data file itself, because my points may not fall exactly on grid points of the NetCDF file. That's why we take the closest grid point to each boundary value. Making the boundary is pretty simple: from the lower latitude index to the upper latitude index, and likewise for longitude, I take all the latitudes and longitudes from the NetCDF file and exclude everything else. So the indices come from the NetCDF file, not from my own coordinates: I give the latitude and longitude, take the difference, and ask for the index of this point, that point, and so on. Once I have them, I get all the latitudes from this index to that one, all the longitudes from this index to that one, and, at the same time, all the values within this bounding box. That is what I'm doing here.
And here we specify where I'm going to save the file. It is on the E drive; I can copy that path, so the output will also be there. You have to specify the name of the file. Now I'm creating an empty NetCDF file; this is how we write a NetCDF file. It's on the E drive, and I can use today's date in the name, and since I'm extracting this data for Hurricane Florence, I'm going to use that in the name too. The format will be NETCDF4 as well, and I'm opening the file in write mode, not read mode. For the first file I used read mode, because I was reading the main file. Now I'm opening a new, empty NetCDF file in write mode so that I can write the values into it.

Then I'm giving the file a description: it is the Florence pressure, reduced to mean sea level, and a subset. You can write whatever description you want. I'm also recording when I created the file as the history. These are options; you can add a description or anything else.

Now I'm creating the dimensions for the new variables. What will the dimensions be? Previously we saw that we have something like 3200 columns and rows. How many columns and rows will we have here? It depends on the resolution of the data. That's why we extracted the minimum and maximum latitude and longitude, and we calculate within that range. I'm taking the difference between the lower latitude index and the upper latitude index, and that gives me the number of values, which is the size of the dimension. It is the same for longitude. Once I have the numbers, I'm creating the dimensions. The command is called on the file object, because I'm opening the
file in write mode as file_my_file, so I use that name and call createDimension on it. I give the name of the longitude dimension and its size. Similarly for latitude, and similarly for time. For time I'm not giving any size, so you can add as many time steps as you want. Time is the third dimension. You can keep extending it while the spatial boundary stays the same: the main boundary is the spatial box, and time runs in the vertical direction, so in this case we can go all the way to infinity.

Then, here, I'm creating the variables to be written in the new file. Those were the dimensions I created, and now I'm creating the variables. For time, I call createVariable, giving the name, time, the data type, which is float32, and the dimension, which is also time. Then I'm giving the unit: the command is times.units, and I'm setting it to days since the reference date. It is the same unit as in the original file. You can read the NetCDF file and reuse the same time unit and every other unit, because we are making a subset and not changing the variables. We could change them, but that is another story. In this case we are just making a smaller domain, not changing any values, dimensions, or data types. I'm using exactly the same types and everything; I'm just creating the variables. The units are the same too, degrees north and so on.

Then, our main variable is the pressure. I'm creating another variable and naming it P, because in the main file the atmospheric pressure is also called P. It is three-dimensional: the first dimension is time, then latitude and longitude. There is also the fill value, for any empty or missing values. I'm using this fill value; you can use it or any other value you want. And
for the U wind and the V wind it is the same command. These are three-dimensional data: first time, then latitude, then longitude. You can change the order as well; you could even put longitude first, latitude second, and then time. But time, latitude, longitude is the common convention we use. And for the unit, you can specify whatever unit you want with that command: the variable name, dot, units.

Now I have the empty NetCDF file with those variable names. What I need to do next is fill those variables with the actual values from the main NetCDF file, but only within the bounding box. To do that, I'm first filling in the latitude. The latitude variable is empty now, so I use latitude with square brackets and a colon, which means all elements, and assign the latitude subset to it. I made that subset earlier, when I created the boundary from the lower left to the upper right. I'm doing the same for longitude, adding all the values. So this is simple for latitude and longitude.

But what about time and the other variables? These are three-dimensional data, so we have to loop through each of the times. We have 38 time steps, so I have to go through 38 steps, extract the values of P, U, and V for each one, add those, and then go to the next time step. The steps are stacked on one another, like layers: we have one layer at time zero, I add another layer on top of that, and then another. This is how the data are arranged in a NetCDF file. It is pretty simple, and it is pretty convenient for storing a lot of information within a small
amount of storage. The format is pretty good; that's why scientific data are so often stored as NetCDF files. It's pretty convenient if you know the commands and the operations.

I have 38 times; that's why I read the times at the beginning. You can see the times variable there, and I already know the total number of time steps. Here I'm looping through the times with i and val, because I'm reading the exact time values from the file. If I used only "for val in times", it would give me the values, but I need the index and the value at the same time. That's what "i, val" is for: the first one is the key, or index. If you use enumerate, the built-in function, it gives you the index as well as the value. So the first one is the index into that list, and the second one is the value.

I'm reading each time value by its index: for the first index I get the first value, and I write it into my times variable, the empty variable I created. So whatever the time is in the main file, it will be the same in the subset. I'm adding pressure, U wind, and V wind the same way. The first index is the time index, zero for the first step, and for that time the code adds all the values within the latitude and longitude range. In the command I give i, then the latitude range from lower to upper, and the longitude range from left to right. That adds all the P values in that range; not the latitudes and longitudes themselves, I'm only specifying the latitude and longitude range. This is how the variable P is indexed: if you give the time, latitude, and longitude, it will give
you the pressure value, not the latitude and longitude. We added the latitude and longitude with a different command; that's already done. Now we are dealing with the data variables, and we have the U wind and V wind as well. For the first time step, index zero, the code adds that time and reads all the values for that step within the boundary; that's why I'm passing the boundary. Once that is done, it closes the input data file, and it also closes the NetCDF file I just wrote.

Let me quickly run it. If everything is okay and we have values within this range, it won't show any error; otherwise, if there are no values within this boundary, it will show an error. Let's see if it works. Yes, it's working. On the right side you can see the console, and it's done, pretty quickly, even with 38 time steps. First it opened the file, stored the variables and everything, and then it finished quickly.

If I open that directory, I now find the Florence file, with the name and the date and time; it was just created. It's the bottom one. Let me open it using Panoply and compare. What is the name of the file? It's Florence, starting with F, with today's date. I'm opening it. See, we have the same variables. Now the dimensions have only 160 and 130 values, whereas in the original file they were 2761 and 3241. Here the data has just been subset, and everything else is the same: the pressure, the unit, the time, the U wind, and the V wind. Let me plot the pressure again. See, we have the values, and we have only the subset there, nothing else. I'm fixing the proportions. So that is the boundary. And
if you want to check whether the subset really overlaps your area, we can do that too. How? Since I have this shapefile, I can export it; I don't know where the saved copy is, so I'm exporting it again. I'm going to put it in the data download folder, in a folder called Galveston. Now let me show you whether the subset really matches our area. To do that, we have to use the overlay option and add that file. It should be in the download folder, then Galveston. Yes, that is the shapefile. It is asking whether to import it; yes, okay. See, it shows up right away.

So this is how we can subset our data. It's pretty simple and pretty interesting. We had data covering a large part of the USA, and we created a subset, from this to that. I'm placing them side by side: see, it's a pretty tiny location here in Texas. That's it. If you want to do this type of processing in your research, or for any other task, you can follow this approach. Maybe it will help you. I was trying to do the same thing previously and didn't know how, but I needed it, so I had to work it out, and finally I did. Thank you very much for watching, and you can
try it yourself. If you have any questions, leave a comment, and if I need to improve anything, I'll do that. If you need any help, I'll try to offer suggestions if I know the answer; otherwise I'll try to figure it out. I think you are smart people, and you can do this as well. Thank you very much, and see you in the next tutorial.
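The code shown on screen is not part of the captions, so the following is only a minimal sketch of the index-lookup and slicing logic described above, not the presenter's script. It uses NumPy alone and a synthetic 0.25-degree grid in place of the real file. The function names (nearest_index, subset_box), the grid, and the inclusive upper slice bound are assumptions made for illustration; the bounding-box values are the ones read off the map in the video.

```python
import numpy as np

# Bounding box read off the map in the video (decimal degrees).
LAT_MIN, LAT_MAX = 28.90, 30.85
LON_MIN, LON_MAX = -96.28, -93.87


def nearest_index(coords, target):
    """Index of the grid coordinate closest to target (hypothetical helper)."""
    return int(np.argmin(np.abs(coords - target)))


def subset_box(lats, lons, field):
    """Cut a (time, lat, lon) array down to the bounding box.

    Assumes lats and lons are 1-D and sorted in ascending order.
    """
    i0, i1 = nearest_index(lats, LAT_MIN), nearest_index(lats, LAT_MAX)
    j0, j1 = nearest_index(lons, LON_MIN), nearest_index(lons, LON_MAX)
    # Assumption: the upper bound is included, hence the "+ 1".
    lat_sub = lats[i0:i1 + 1]
    lon_sub = lons[j0:j1 + 1]
    out = np.empty((field.shape[0], lat_sub.size, lon_sub.size), dtype=field.dtype)
    # Copy one time layer at a time, mirroring the enumerate() loop in the video.
    for t, _ in enumerate(field):
        out[t] = field[t, i0:i1 + 1, j0:j1 + 1]
    return lat_sub, lon_sub, out


# Synthetic stand-in for the real file: a 0.25-degree grid with 3 time steps.
lats = np.arange(20.0, 40.0, 0.25)
lons = np.arange(-110.0, -80.0, 0.25)
pressure = np.random.default_rng(0).random((3, lats.size, lons.size))

lat_sub, lon_sub, p_sub = subset_box(lats, lons, pressure)
print(p_sub.shape, lat_sub[0], lat_sub[-1], lon_sub[0], lon_sub[-1])
```

With a real file, lats, lons, and pressure would instead be read from a netCDF4 Dataset opened in read mode, and the three returned arrays would be written to a second Dataset opened in write mode, using createDimension and createVariable as described in the captions.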
Info
Channel: Md Arifur Rahmahn
Views: 340
Keywords: python, NetCDF, ArcGIS, Python programming, listing directory, for loop, read csv file, multiple directories, time series conversion, time series interpolation, read nercdf file, mrms, nldas, ghcn
Id: aHzr43rSww4
Length: 28min 53sec (1733 seconds)
Published: Sat Sep 04 2021