Download and Convert CHIRPS Gridded Satellite Rainfall Data into Time Series using Python

Captions
Hi, welcome to another tutorial. In this tutorial I'm going to show you how to download the CHIRPS rainfall data for free, and then how to create a daily time series of rainfall from the downloaded CHIRPS data using a small Python script. CHIRPS stands for Climate Hazards Group InfraRed Precipitation with Station data. It's basically a collection of 30-plus years of quasi-global rainfall data, and it has a grid resolution of 0.05 degrees, which translates into approximately 5.5 kilometers by 5.5 kilometers.

Over here you can see I have one station of interest. I have selected one station in the southern part of South Africa. If I zoom in a little bit, you can see that this station is located in the Tankwa Karoo National Park. I'm interested in deriving a daily time series of rainfall for it using the downloaded gridded satellite CHIRPS rainfall data, and I'm going to do that using a small Python script. So let's get started.

First of all, I would like all of you to go to this website: climateserv.servirglobal.net. Once you are on the website, just click on get started to get the data using ClimateSERV. Here you get multiple options to specify the area that you need the data for. I'm going to go with the first option, where I get the opportunity to draw a custom polygon. After that I'm just going to zoom into the southern part of South Africa and draw a polygon which covers the region I'm interested in. You can just double-click to complete the polygon. Now you can see the type of data request is datasets, and the data source is CHIRPS rainfall. For the calculations, make sure you download the raw data. If you select the average, you will get an average for this entire polygon, and that's not what we need. We need the precise rainfall value for that particular
station that we are interested in. So just make sure you select download raw data. Here you can specify the starting date and the ending date. Just for this tutorial, I'm going to specify the starting date as the 1st of January 2010, and I'm going to download the data up until the 31st of December 2015. All right, now you can just submit the job. It might take a while, but it usually doesn't take that long, and as you can see it's already ready for download. So you can click here and download the data file. I'm going to save the file.

All right, now let's have a look at the downloaded data. You see that our starting date is the 1st of January 2010, and we have quite a lot of rasters as TIFF files, all the way up to the 31st of December 2015. As you can see, we have one raster for each day. Let me check how many rasters: it's 2191 rasters, which corresponds to the number of days from the 1st of January 2010 all the way up to the 31st of December 2015. So let's export all of these files to a location of your choice. Once you're done exporting, you can have a look, and you see that it's actually a big collection of rasters.

Now let's open ArcGIS and, using ArcCatalog, navigate to the place where you downloaded your data. You can see it's over here. I'm just going to drag one file and drop it over here, just to see how it looks. This one is actually covering a much bigger area than what I need, but that's all right. I can drag another one, and we can even change the color scheme. Now, just to explain the meaning of this raster: maybe I'll remove this one and move the ground station to be on top of this raster. You can see that we have a color scale over here, and each of these pixels corresponds to a different rainfall value in millimeters. Now, for example,
these red regions show low rainfall. You can see that the name of this raster is basically the date: this is the 10th of January 2010. On the 10th of January 2010 this region did not receive much rainfall; in fact, it's either zero or very close to zero. But in this northeastern region you see that it received a relatively higher rainfall amount. You can even take this Identify tool and click on each of these pixels. For example, if I click over here on this blue pixel, you can see that this location received a rainfall of 14.1 millimeters, whereas this coastal region received absolutely zero. So that's how you interpret this raster. But you can see that this raster is actually covering a huge region.

All right, for the purposes of this demonstration, in case you need to create a new point, let me just get rid of this ground station. I will go to my folder and create a new shapefile. I will name this one ground station A. It's a point, and the coordinate system is WGS 1984. Then I'm going to edit these features: start editing, continue, then go to Editor, Editing Windows, and Create Features. Let's say I'm interested in knowing the rainfall time series at this location. I created the point, and now I'm going to save it. Let's change the symbology to be something like this.

Now you can imagine that, for this particular raster, if I were to check the rainfall at this particular location, it shows a rainfall of 15.05 millimeters. That is the 10th of January 2010. If I select a different day, the 4th of March, and check the rainfall value over here, it's 14.22. You see that if I were to do this manually, it's going to be a very tedious
process. So in this tutorial I'm also going to teach you how to use a simple Python script to read all of these files and extract the date and the corresponding value for a point of interest. I'm going to specify the location of that point by giving its latitude and longitude. It will generate a continuous time series, which we can later use for various types of analysis. So let's see how to do that.

First of all, you need to open your Python IDE. I'm using Spyder as my IDE. Over here you can specify the folder that you intend to work in. After you specify the folder using this browse-to-working-directory option, you can go to the file explorer and create a new module. This is going to be your Python script, and I'm going to name it CHIRPS to time series.

All right, first of all we need to import a couple of libraries. The first library that I would like you to import is rasterio, because we are going to read in the rasters that we downloaded. If you have not installed the rasterio Python library, you can check this tutorial out; I'll put the link in the description below, so you can see how to install rasterio without any hassle. It's a very straightforward installation. So I'm going to import that library: import rasterio. I'm also going to import the NumPy library. If you wonder why we import these libraries right now, just bear with me, because I will explain exactly why we import each of them during the tutorial. So import numpy as np, because that's how we do it. I would also like you to import the os library, and finally we are also going to make use of pandas.

All right, the first thing that I'm going to do is create an empty table. I'm going to name it
table, which is equal to pd.DataFrame. I'm going to create this table and fill it with zeros for the time being. Then we specify the index of the table: it's going to be np.arange, starting from 1 up to 2192. Now you might wonder why we type 1, 2192 here. For the time being I'll just put a comment over here, so that when I run this you see that all the libraries got imported. Now, to demonstrate: if I type np.arange(0, 10), you see that it creates an array which starts from 0 and ends at 9. Over here I want to create an empty table to fill in values from all of these rasters, and you see that we have 2191 rasters. So I would like this table to start from 1 and end at 2191, and to do that you have to start from 1 and end at 2192. That's going to be my index.

Now I have to specify the columns. Here I'm going to specify two columns within square brackets: the first column is the date, and the second column is going to be rainfall. What we are going to do is extract this date over here, from the file name, then pick the corresponding rainfall value at this location and save it to the table which we are creating now as a pandas DataFrame. Once you are done with this, just run, or you can hit F5. If you want to see how this looks, you can go here and type table, because you assigned this whole DataFrame, which consists of zeros right now, to a variable called table. Type table and press Enter, and that will show you this table, which starts from 1 and ends at 2191 and consists of two columns, date and rainfall. Right now it's filled with zeros, and later we are going to fill it with the corresponding values. Next,
what I'm going to do is run a for loop over all of these TIFF files. You can check the type of a file by going into its properties, and you can see that it's a .tif file, a raster. So I'm going to run a for loop through all of these files, read the information in them, and extract it. For that I'm going to type for files in, and now I'm going to make use of the os library, because it can read the files which are located in a folder. I'm going to type os.listdir, and here we have to specify the path where you keep those rasters. I'm just going to copy it from here and paste it as a path, and add the letter r in front so that it recognizes that this is actually a path. This is the first line of our for loop. Put a colon, and now we have to say what it's supposed to do. Let's just write something like print(files) and see what happens.

If you run this, you can see that it's reading all the files inside that folder. It's reading all the TIFF files, plus unwanted stuff as well. We don't want to read this CHIRPS script folder and this ground location folder. I don't need these two folders; I only want to read the TIFF files. So now I'm going to insert a condition over here. My condition is: when you read the file, if the last four characters are equal to .tif, then print the files variable. You get the meaning, right? First it iterates through all of the files inside that folder, and then there's a condition saying: once you're reading this file, just check the last four characters
and if these last four characters are equal to .tif, then print the files. If they are not equal to .tif, it will not read those files. That means it's going to ignore these two folders, because the last four characters of their names are not equal to .tif. So it's only going to read the rasters which we are interested in. That's how you specify a condition.

Now I'm going to put a counter, i = i + 1. If you put a counter, it's better to specify its starting value, so I'm going to set the starting value of the counter to zero. Now I'm going to create a new variable called dataset. We'll get rid of this print(files) as well, because we won't need it. This dataset variable equals rasterio.open; this is how you open a raster with the rasterio Python library. Inside, you have to specify the path to the location where you have saved the rasters, followed by files, because every time it loops over the rasters it reads the name of a raster, and we need to pass each of those names here as well. To do that, you can copy this path where you keep the rasters, put a plus sign, then two backslashes, then another plus sign, and then add the name of the file at the end. That way it iterates through the TIFF files inside this folder by their file names.

Then we will need to specify the x and y coordinates. First I'm going to create this empty bracket, and I need the x and y coordinates in decimal degrees. So I'm going to go back to ArcGIS, and over here in ArcGIS actually
you can retrieve the decimal degree value quite easily. Either you point your mouse over here and look at the lower right corner, where you can see the decimal degrees, or you can go over here, add a new field called X, make it a double, and add another field called Y. Then you can right-click over here, choose Calculate Geometry, and ask it to generate the x coordinate of the point in decimal degrees. That's 21.78. We can do the same thing for the y coordinate, and that's negative 31.38. So I'm going to copy this one and paste it over here, go back, copy this one, and paste it over here.

All right, now I'm going to create new variables, row and column, equal to dataset.index. This is going to retrieve the row and column index of x and y in that raster. After we do that, let's run and see what happens. You can see it's still running as long as you see this red button over here. All right, it ran without any issues. Now we can see what each of these variables means. For example, if I copy this one and paste it over here, you can see that this particular x, y coordinate in decimal degrees is located in the 27th row and the 95th column of this raster. So that's how we extract the row and column based on the given x, y coordinate. That's quite handy, actually.

The next thing we do is create a new variable called data array, and that's going to be equal to dataset.read. Inside we specify 1, so this reads the first band of the raster as a whole array. We can run this command and see how this data array looks. All right, seems to be fine. If I select this data array and check how it looks, you can see some numbers. You can even go to the Variable Explorer, and then we can actually open this
data array, and now we can see some values over here: so many zeros, and some values over here in the middle. By looking at this, I think you get the idea of what it means. We are still running in a loop, so for the last iteration it read the last TIFF file, based on this condition, and it opened that raster for us. This negative 9999 refers to the areas where there is no data available, the areas with zero mean that there was no rainfall in those regions, and these other values show how much rainfall there was in that particular raster. So this is quite handy, as you can see. We'll close this one for the moment.

Now, if I may open this table for you: what I'm asking is that every time you iterate through these TIFF files, you pick the date from the file name and put it in here. We know that it's going to iterate 2191 times. So during the first loop, pick the characters which specify the date and copy them over here, during the second loop copy them over here, during the third loop copy them over here, and so on until you have read all the 2191 files. Let's see how we do that. I can put a comment saying that now I'm going to copy the date to the date column in table during each iteration. This table refers to the table which we created, the pandas DataFrame. So we go to table and specify which column we are going to fill in, which is date, and then we specify the location using loc. Each time it iterates, this i value will keep on increasing, so that's why we specify i here. So for the very first loop you see that the i
value is zero, and by the time it reaches here it becomes 1. Every time it loops, it keeps adding one to the existing i value, so in the second loop the value of i is going to be two, and so on. That's why we specify i over here. And that's going to be equal to files, and if I put a colon and negative 4, that means it's going to pick everything before the last four characters. For example, when it reads a file name like this, it's going to ignore the last four characters and pick everything in front of them, and that's exactly what we need, because that's how the date is specified. So that's what we do over here. Let's run this and see what happens. Press F5. We get a warning, but that should be fine. After a while you will see that the processing has been completed. Now you can go to table and see how it looks. You can see that the table got filled with all the dates of the rasters that we saved inside our folder. But the rainfall value is still empty.

So now we also fill in the rainfall value. We follow the same format, and over here, instead of date, it's going to be rainfall, because that was the heading of the column. We are still going to go with location i, and that's going to be equal to this data array variable, indexed by row and column. Now, the row and the column are definitely integers; there cannot be any decimal point. So just to be on the safe side, we can specify this one to be an integer and this one also to be an integer. Once that's done, you can go ahead and run this command and wait for the result. You can see that the processing is done already. We can go to table again and check, and now you can see
that this rainfall column has also been filled. There have been rainfall events only very rarely, but that's not a problem, because that's what the data tells us. So over here you can see that the data got extracted without any issue, and for each day we get the corresponding rainfall value as well.

Now we can export this into a CSV. How do we do that? It's quite straightforward: table.to_csv, and here we just specify the path. Let's say that I want to put it in here, and I'm going to name it rainfall.csv. Then we can run the command again. Once it's done, we can navigate back to the folder, and this is the file which just got created. You can open that file in Microsoft Excel, and this is how it looks. You can remove this column, which we do not need. You can see that the dates are not sorted. You can either sort them in Python itself, or you can just select the whole thing and sort from oldest to newest.

You can even do a cross-validation. You can see that on the 4th of March 2010 the rainfall value is, let me zoom in and check the exact pixel, 14.225. So I'll go back and check the 4th of March in the table: it's 14.225. And if you want, you can check the 6th of March; it's supposed to be 4.83. Maybe I'll change the color scheme. 6th of March, and you can see that it's 4.83. So you can verify that our data is correct. If you need to do any further analysis, or if you need to plot the whole thing, you can just plot it out over here, so that you can visually see how the rainfall varied during those years that we selected. You can see during
which periods we got the higher rainfall events, and during which periods we had prolonged dry spells. So I guess that was the objective of this tutorial. One thing to mention is that you don't have to use these specific coordinates. If you decide to extract the values for a different place, you just have to create a new point, extract its x, y coordinates, replace these x, y coordinates with the new ones, and run the script again. It will then give you the rainfall values as a time series for that particular coordinate.

So I think that's about it for this tutorial. If you have any questions, please comment them down below, and I will try to answer as soon as possible. If you would like to see more tutorials which incorporate GIS and Python together to generate valuable and useful outputs like this, you can definitely consider subscribing to this channel. Stay tuned for another interesting tutorial.
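
The script walked through in the captions can be sketched roughly as follows. This is a minimal sketch, not the exact code typed in the video: the function names sample_point and chirps_to_time_series, the sampler parameter, and the folder path in the closing comment are hypothetical, and it collects rows in a list instead of pre-filling a table of zeros with a counter. It assumes each .tif file name (minus the extension) is the date, as described above, and it uses only rasterio calls mentioned in the captions (rasterio.open, dataset.index, dataset.read).

```python
import os

import pandas as pd


def sample_point(path, x, y):
    """Return the band-1 value of the pixel that contains (x, y)."""
    # Imported here (an assumption of this sketch) so the helpers below
    # can be used even where rasterio is not installed.
    import rasterio

    with rasterio.open(path) as dataset:
        row, col = dataset.index(x, y)   # row/column of the coordinate
        data_array = dataset.read(1)     # band 1 as a 2-D NumPy array
        return float(data_array[int(row), int(col)])


def chirps_to_time_series(folder, x, y, sampler=sample_point):
    """Build a Date/Rainfall table from every .tif file in `folder`."""
    records = []
    # sorted() orders by file name; this is chronological only if the
    # names begin with year, month, day (an assumption).
    for name in sorted(os.listdir(folder)):
        if name[-4:].lower() != ".tif":  # skip sub-folders and other files
            continue
        date_text = name[:-4]            # file name without the extension
        value = sampler(os.path.join(folder, name), x, y)
        records.append({"Date": date_text, "Rainfall": value})
    return pd.DataFrame(records, columns=["Date", "Rainfall"])


# Example use (the folder is a placeholder, the coordinates are from the video):
# table = chirps_to_time_series(r"C:\data\chirps", x=21.78, y=-31.38)
# table.to_csv("rainfall.csv", index=False)
```

Passing index=False to to_csv leaves out the extra index column that the video removes by hand in Excel. Nodata pixels (negative 9999 in the rasters shown) are returned as they are, so they may need to be filtered before any analysis.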
Info
Channel: GeoDelta Labs
Views: 15,809
Keywords: CHIRPS, Gridded, Satellite, Rainfall, Data, GIS, Python, how to, convert, rasterio, ArcGIS, QGIS, numpy, pandas, Os, DEM, Raster, vector
Id: _uaVrSeLFmA
Length: 34min 20sec (2060 seconds)
Published: Fri Nov 08 2019