Using SatPy to Process Earth observing Satellite Data | SciPy 2019 Tutorial | David Hoese

Captions
Welcome to the SatPy tutorial at SciPy 2019. All of the tutorial materials are at the GitHub URL here. If you haven't followed the installation instructions already, please do so as quickly as possible; it involves creating a conda environment and downloading about three gigabytes of data, so it takes a little while. I'm Dave Hoese, a software developer at the University of Wisconsin-Madison's Space Science and Engineering Center. I'm a core developer on SatPy and a member of the Pytroll open source group that maintains it. We have two helpers today: Jeff, sitting right there, and Alex. If you have any issues, with installation or something not working the way you expect, raise your hand and hopefully they can come help you. Feel free to interrupt me at any time throughout this tutorial, ask questions, ask for clarification, and hopefully everybody can have a good time.

Based on the survey that about twenty of you took, we have a wide range of experience here across the different technologies I hope to use today. Nobody has used SatPy, so that's good, I guess; I'll try to explain what I'm doing with everything I use so that all the bases are covered. Also, there are instructions here if you didn't download the data and the Wi-Fi just isn't working well: there are two USB drives, one of them USB 3.0, so grab those and continue following the installation instructions; that should hopefully save you a couple of minutes.

I think we can get started and verify that everybody has things working. We're going to start in our terminal; we'll use plain Jupyter notebooks today and step through those. This tutorial as a whole provides an overview of SatPy. I might not cover your exact use case, but hopefully I give you enough of an overview that you can see where SatPy is going
and the use cases we were trying to solve when we made it, and hopefully you can give advice on where we should go from there.

In a terminal we need to start Jupyter notebook, but we need to make sure we're in the right spot and starting it with the right Python environment. Here I'm in this long path, but I'm at the root of the tutorial directory I cloned from GitHub; you should have cloned the GitHub repository too. If I list what's in here, there's the notebooks directory plus the README and the install files; that's the directory you should be in. If you have a recent version of conda you probably see "base" in your prompt; we want to activate the satpy environment you created for this tutorial, so type "conda activate satpy".

The other thing to check is that all of the data directories exist the way they should. I've been updating this repository over the last week or two with fixes based on feedback I've gotten from some of you. If you do an ls on the data directory, you should see an abi_l1b and a viirs_sdr directory. The other important thing: if you do "ls data/abi_l1b", you should see two directories in there, one that ends with CONUS and one that ends with Meso. There was a bug in the original download script that only made one of those directories; if you don't have both, rerun the download script with "python download_data.py --data-only". That will make sure you have the zip files, extract them again, and then you should have all the data.

Hopefully we're all generally at that spot. We can start the notebooks by typing "jupyter notebook"; this should open your browser. When Jupyter notebook first opens, it gives us a file-browser view of what we have available. All of the lessons
for this tutorial are in the notebooks directory, so if I click on that I see a lot of .ipynb files. We're going to start with the introduction, so click on that and it should open a new tab. I'd suggest keeping the notebooks tab open so you can get to future lessons. If you're worried about your computer's performance, you may want to shut down notebooks after each lesson just to clear out any memory being used. Some of the processing in the later lessons can take a little while and a fair amount of memory; it will eventually finish, but if you close out old lessons your computer may perform a little better.

A little introduction to SatPy. SatPy was created by the Pytroll open source group. Pytroll has existed since 2009 or 2010; I joined it in 2015, and a lot of the members are based in Europe. SatPy was created as a combination of their original mpop library and my Polar2Grid command-line tool suite. We realized we were basically doing the same thing and created SatPy as a result, so that we could work together; this is just a lot of scientists and programmers who all had to read satellite data and decided to do it together.

SatPy can read data files, it can write them to different formats, it can resample data to different projections, and it can make RGB composite images. Because of the tools SatPy is built on, we can easily use other visualization and Python libraries like CartoPy or GeoViews, and hopefully we'll see all of these things today. This notebook includes some other helpful links. Pytroll has a Slack team; if you just want to talk about working with satellite data, it doesn't have to be about SatPy or the other Pytroll libraries we make, just come and talk to us. There are links to that and to the various repositories. Like I mentioned earlier, hopefully everybody ran the installation instructions; that
should have involved creating the conda environment and downloading the data. We're going to run this simple command just to verify that things import and that our Jupyter notebook is being run with our satpy environment. If you click on this cell, a code cell with Python code in it, you can run it by hitting Shift-Enter, and you should get output where everything says OK. That means all the imports succeeded; maybe we'll run into issues later, but this is a pretty good start. You should have already downloaded the test data; again, if you didn't, there are USB drives up here and some steps to hopefully cut through the wireless issues.

So what are we working with today? Earth-observing meteorological satellite instrument data. What does that mean? These satellites are Earth-observing: they're not looking at space or at other planets, they're looking at the Earth, its atmosphere, and the land. We'll be dealing with geostationary and polar-orbiting satellite instruments; those are the two types of orbits you typically deal with when working with this data. A co-worker made some videos for me from his modeling tool, so this is an example of a geostationary orbit: the satellite stays over the same spot on the Earth as it orbits, so you see the same view every time the instrument observes the Earth and records data. A polar-orbiting satellite, the little blip going around here, goes from pole to pole, and throughout the day it covers the entire Earth. It's a much lower orbit than the geostationary satellites, so it sees less of the Earth at once, but it's much closer, so you typically get higher resolution data
because of that, and we'll explore that more later today.

We'll be looking specifically at data from imager instruments. Essentially this means we have 2D arrays that we can view as images: we map each data value to some color on the screen, whether grayscale or some matplotlib colormap. Our data is geolocated, so we have a latitude/longitude for every element in the array. Sometimes the data is gridded, meaning uniformly spaced; sometimes it is not, so we really need that lat/lon coordinate for each element to know where it is. In general, each pixel of these instruments' detectors represents some kind of circle or ellipse on the Earth; in a lot of cases we simplify that to a square when looking at screens. The lat/lon point we have refers to the center of that pixel, and the pixel then has some cell height and cell width, usually measured in meters.

These instruments have multiple bands, or channels, or wavelengths; you'll hear me use those terms interchangeably. We're looking at the radiance reflected or emitted from the Earth: the sun beats down on the Earth and objects reflect that energy, or maybe a fire is emitting energy, and these satellites passively receive that electromagnetic radiation. Each channel on an instrument looks at a specific wavelength so we can get different types of information about the Earth. In this image we have four different channels from the ABI instrument, and you can see there are real differences in what's visible in each one because of the wavelength being observed: some are good at looking at water vapor, some at the oceans, some at snow and ice or land vegetation.
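The geolocation idea just described, a pixel-center coordinate plus a cell width and height in meters, can be sketched in a few lines. The grid origin and cell size below are made-up numbers for illustration, not values from a real ABI file:

```python
# Toy mapping from a (row, col) pixel index to the projected x/y
# coordinates of that pixel's CENTER, given a grid origin and cell size.
# All numbers here are hypothetical, not from a real satellite file.
cell_width = 2000.0                   # hypothetical 2 km pixels, in meters
cell_height = 2000.0
x0, y0 = -3_500_000.0, 1_500_000.0    # hypothetical upper-left grid corner

def pixel_center(row, col):
    # add half a cell so we land on the center, not the corner
    x = x0 + (col + 0.5) * cell_width
    y = y0 - (row + 0.5) * cell_height   # row index grows downward
    return x, y
```

A real file stores this kind of grid information (origin, cell size, projection) in its metadata, which is roughly what SatPy's area objects wrap.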
Depending on which wavelength we look at, we get different information. With geostationary satellites you also typically have sectors: although the satellite is looking at one spot on the Earth, it has different cutouts it can scan at different time resolutions. We might take an image of the full disk, everything the satellite can see, every 10 minutes; a smaller chunk every 5 minutes; and an even smaller chunk or two every one minute. This lets us get the large overview of the Earth, but when there's a storm or some interesting event, we can get very high time resolution views of the changes.

Some of us in here are newer to notebooks; you don't need to be an expert in Jupyter notebooks for this tutorial. The basics are that there are markdown cells like this one with text, and code cells like the ones we ran before when testing the installation; Shift-Enter runs a cell. If you need other tips, you can read what I wrote here about keyboard shortcuts and creating and deleting cells, or hit the H key to get a nice little pop-up with a lot of keyboard shortcuts.

All right, we can go on to the first real data lesson, so close this tab if you wish and click on the reading lesson. SatPy depends on xarray and dask arrays. In this lesson I'll go over some of the basic interfaces of SatPy, but we'll really be talking about what your data looks like when it's loaded into SatPy, how you can access information from the xarray DataArray objects, and, since we're using dask arrays underneath those, what that means for you the user and how you can treat your data differently. In this first code cell we'll be using SatPy's Scene object.
I'll note that I start every notebook with this init_notebook script; it's just a little script I set up to configure dask and numpy, to make sure we're not using too many of your machine's resources, since there are different levels of machines in the room. Then we import the Scene object from satpy, and we use the built-in glob function to collect some file names; these are the files you should have downloaded. If we run this cell with Shift-Enter, we get 16 files. We create a Scene object by calling Scene, passing the reader to tell it what format our data is in, here abi_l1b, and giving it the file names, and we get the Scene object back.

A Scene acts a lot like a dictionary, so it has things like keys. If we run this cell we get nothing; we haven't loaded any data yet. We can ask the Scene what's available, and running that cell we see 16 different channels listed. This is the ABI instrument, which has 16 channels, and in SatPy they're named like this. What we've done by creating this Scene is pass these file names to the reader in SatPy; it looked through the files, figured out what's in them and what a user can actually load, and now we've asked it to tell us that, so we get these 16 channels.

As I mentioned in the introduction, instruments have different channels, these channels measure different wavelengths, and they typically have different resolutions too. For ABI we have this list, not even that long, of 16 channels: some are 2 kilometer resolution, meaning every pixel represents 2 kilometers on the Earth, some are 1000 meters, and one of them is 500 meters. Now we're going to load one of these channels. If you're not familiar with wavelengths, I suggest just picking one for now; we'll explore
them in the future, but it's up to you to pick a channel, C07 for example; you'll be using that "C" name. In this next cell there's a little "edit me"; I'm going to use C07 here, but you can use whatever you'd like. We pass that string to the Scene's load method; we could load multiple channels, but for simplicity we're just loading one. Then we use typical dictionary bracket access to print out the object in the scene. In Jupyter notebooks, whatever the last line returns gets its representation printed, so here we're getting a DataArray object.

I ran that cell and we have an xarray DataArray object. A couple of important things about this: a DataArray can be thought of as a numpy array with extra stuff added on. We have named dimensions: the y dimension, which you might think of as rows (in SatPy we call these y), with 1500 rows here, and the x dimension, our columns, 2500 here. The DataArray is telling us that our data underneath is a dask array. Normally, if you were working with numpy arrays and printed one out, this is where it would print every value; here it just says dask.array, and I'll explain why in a second. We have coordinates, meaning for every row and every column we have some coordinate associated with it; these are actually in meters, which has to do with projections and the projected data we're looking at, and we'll talk more about that later. And we have attributes: a lot of metadata that was either added by SatPy or came directly from the file for the particular variable we're looking at.

I loaded channel 7. Channel 7 is a brightness temperature band; that means the wavelength it
observes is an infrared band, so we measure the radiance but can convert it into a brightness temperature, which is easier for the scientists who sometimes use that. Oh, I should have also said: I'm not a scientist, I'm a developer, so if you're smarter than me on this, please correct me, but I'll tell you what I know and I think it's mostly correct. So we're looking at a brightness temperature band; some of the lower bands like channels 1, 2, and 3 are reflectances, so those have units of percent. We can see by looking at the units attribute here that I have K for Kelvin; SatPy tries to follow CF conventions. You typically have a lot of other metadata here, things like what platform it's on: platform_name is G16, GOES-16, also referred to as GOES-East. That comes from the metadata inside the file. There are also start_time and end_time, which are important: that's when the instrument started observing the data we're looking at and when it finished.

As far as geolocation, we have the coordinates for the rows and columns. This is gridded data, which means that in x/y projected space every column is at the same coordinate and every row is at the same coordinate; we'll talk more about that when we get to projections and resampling. The way SatPy encodes all of this is through the area attribute, an actual object in the metadata, and we'll learn what it represents and how to use it later on.

So we have some metadata, and with a DataArray we can access it through the .attrs property: scn[my_channel] gets us the DataArray, and .attrs can be treated like a dictionary, so asking for "start_time" gives us the datetime object for our data. We also have access to the dimensions, so we can get a list of them; as I mentioned before, y for rows and x for columns.
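To make those access patterns concrete even without satpy installed, here is a toy stand-in; this is not satpy or xarray, and the channel name, sizes, and metadata values (including the start_time) are made up for illustration:

```python
from datetime import datetime

class ToyDataArray:
    """Minimal stand-in for the metadata interface of xarray.DataArray."""
    def __init__(self, dims, lengths, attrs):
        self.dims = dims                         # named dimensions, e.g. ("y", "x")
        self.sizes = dict(zip(dims, lengths))    # dimension name -> length
        self.shape = tuple(lengths)
        self.ndim = len(dims)
        self.attrs = attrs                       # metadata dictionary

# A Scene behaves a lot like a dictionary of channel name -> data array.
scn = {
    "C07": ToyDataArray(("y", "x"), (1500, 2500), {
        "units": "K",
        "platform_name": "GOES-16",
        "start_time": datetime(2018, 11, 8, 17, 2),   # hypothetical time
    })
}

start = scn["C07"].attrs["start_time"]   # dictionary-style metadata access
rows = scn["C07"].sizes["y"]             # length of the y (rows) dimension
```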
We can check how big those are by using the sizes dictionary to get the length of each dimension. DataArrays also have properties you might see on a numpy array: we still have access to shape, and we have the ndim property, so we have a two-dimensional array here; if you're looking at a different resolution band, you might see different numbers of rows and columns. And although it's not typically needed, we do have access to the dask array underneath the DataArray. The DataArray is the large overall container that holds on to our dimension names, our coordinates, our metadata, and some pointer to our data; that could be a numpy array, but in SatPy we're using dask arrays. When we do this, you might get something like this, depending on what versions of the libraries you have, since some things have changed recently: you might see a string representation of the object, or you might see a little graphic showing a tan chunk with boxes in it, which is trying to show you what the dask array looks like, with some stats about how much memory it's using. I think that's available in newer versions of dask.

We use dask arrays in SatPy because we're typically dealing with more data than we can hold in memory: throughout our operations we end up using more data and more memory than the machine actually has. We don't want to swap data onto the hard disk, which just makes things slow; we'd like to compute things as best we can with the memory available. That's why we use dask arrays. The other thing about dask is that it automatically runs things in parallel, using multiple threads so we can do things faster. The way it achieves this, the way it looks to us as users, is that everything is delayed: we don't see the actual results, and we don't have to wait for them, as we're preparing these
calculations. We can see this in this code cell: we create a new Python variable, access that DataArray, and add two and a half. Even though I have a 1500 by 2500 array, that finished instantly, because dask isn't actually running through each of those pixels adding the two and a half (this is an arbitrary example; adding two and a half isn't some scientific operation). Dask is building a graph of the calculations we build up over time, and it won't actually perform them until we tell it to.

We can print out what my_new_var is: we see a DataArray that still has the dimensions we had, still has a dask array underneath, and still has the same coordinates as before, but we lost all of our attributes. This is how xarray works by default: it assumes that since we did some calculation, the metadata might not apply anymore, so to be safe it removes the attributes. If the attributes do still apply to the new DataArray, we have to set them again manually.

This whole delayed-operation concept can get a little confusing when you're used to working with numpy arrays, because you might want a result right away. Say we want to know the maximum value in the channel we loaded: we run .max() and we see some text, but it's not a number. This is a new DataArray that max produced; you can see by the dimensions, which would normally be right here, that it has no dimensions and no shape. It's a scalar array, a single value, which is what we expect, the single maximum for this whole DataArray, but nothing has been computed yet. The way we do that is to ask xarray, and through it dask, to compute this thing by running the compute method. We run that and we get a DataArray that actually has a numpy array underneath, because it's fully computed; it's no longer a dask array delaying the operations.
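The build-a-graph-then-compute behavior can be illustrated with a tiny pure-Python toy; this is only a sketch of the idea, not how dask is actually implemented:

```python
class Delayed:
    """Toy deferred computation: operations record a graph instead of
    running immediately; nothing executes until .compute() is called."""

    def __init__(self, func, *deps):
        self.func = func   # the work to do
        self.deps = deps   # upstream Delayed nodes this work depends on

    def __add__(self, value):
        # record "add value to every element" without doing it yet
        return Delayed(lambda xs: [x + value for x in xs], self)

    def max(self):
        # record a reduction; still nothing runs
        return Delayed(lambda xs: max(xs), self)

    def compute(self):
        # walk the recorded graph and actually run the functions
        args = [dep.compute() for dep in self.deps]
        return self.func(*args)

data = Delayed(lambda: [240.0, 300.0, 411.0])   # pretend "load from disk"
new_var = data + 2.5        # instant: only extends the graph
result = new_var.compute()  # now the work actually happens
```

Both the addition and a reduction like max just extend the graph; only compute() walks it and does the arithmetic, which is why the earlier cell "finished instantly".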
We have a numpy array with a real value in it: about 411 for the data I loaded. We can also compute from the new variable we made where we added two and a half, and see it's two and a half greater. So nothing was done in place: we got a new DataArray, we computed these things and got the results we wanted, but we had to ask for them to be computed.

Another common operation you'll see if you're googling things about xarray and dask is using .values. With .values we don't get a DataArray back; we get the numpy array underneath. Because we're using dask arrays, this produces the computed numpy array implicitly: even though we didn't run .compute(), .values is still computing the dask array. These operations are a lot like using numpy arrays, except for the compute part; we can mostly use DataArrays just like numpy arrays. So if we start running some matplotlib code and want to make a plot, we can give it a DataArray and it will create our plot. Here we imported matplotlib.pyplot as plt, created a figure (give me some image where I can put data on it), ran imshow (image show) with the Scene's DataArray object, and asked for a colorbar. This is what I got back: no labels, but I have a colorbar, and this is what our data looks like. You can kind of pick out that this is looking at the US; the upper-left corner here is actually space, where the satellite looks just off the edge of the Earth on this CONUS (continental United States) view, which is what this sector is for ABI.

That was passing DataArrays to matplotlib. xarray also comes with its own utilities for plotting. We'll create a new figure first, then access them through the .plot attribute, where there are your typical plot and imshow functions. I'm going to specify the colormap explicitly, and I run this.
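As an aside on why both .values and handing a DataArray straight to imshow trigger computation: numpy (and through it matplotlib) converts unfamiliar objects using the array protocol. A toy container, not real xarray or dask, showing the mechanism with made-up values:

```python
import numpy as np

class LazyBox:
    """Toy lazy container: converting it with np.asarray() materializes
    the data on demand, similar in spirit to how .values (or passing a
    dask-backed DataArray to matplotlib) implicitly computes it."""

    def __init__(self, build):
        self._build = build            # deferred "computation"

    def __array__(self, dtype=None, copy=None):
        data = self._build()           # the work happens here, on demand
        return np.asarray(data, dtype=dtype)

# hypothetical brightness temperature values
box = LazyBox(lambda: [[240.0, 300.0], [380.0, 411.0]])
arr = np.asarray(box)                  # implicit "compute", like .values
```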
Now I get pretty much the same image I had before, with some key differences; I'll stretch this out, there we go. We see that our colorbar now has a label, coming from the long_name attribute of the DataArray. xarray is making some educated guesses: these attributes are typically used for this kind of label, so it uses the long_name, takes the units and puts them in brackets, and labels our dimensions for us, so our x and y axes are labeled. In some cases it will also add a title, depending on what data we're looking at and how the attributes are set up. It's a nice little utility, and because it's using matplotlib underneath we still have access to everything: we could add a title if we wanted, so I can do plt.title here and say, I don't know, "C07", or my channel name, rerun it, and now I have my title. So I can use all my normal matplotlib functionality.

Another thing you can do here: if you're familiar with the vmin and vmax keyword arguments in matplotlib, you might want to use those to limit the colorbar range. I see my data goes from about 240 to 400; maybe that's not what I actually want to look at, maybe I want vmin=300 to vmax=380. We can specify those keyword arguments here, and now I get a darker image. OK, maybe that was a bad choice, Dave, but we get a different image; I can specify vmin and vmax. This is important later on when we might be looking at differences, or at data with outliers that would make our colorbar not look so great. So: vmin, vmax, cmap. If we wanted to change the colormap, magma is another popular one; maybe I'll remove vmin and vmax and set cmap to some colormap name, any matplotlib colormap name, and we get a different image.
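Numerically, vmin/vmax just clamp values to the ends of the color scale, the same clamping that numpy.clip does; a quick sketch with hypothetical brightness temperatures:

```python
import numpy as np

# hypothetical brightness temperatures in Kelvin, including an outlier
bt = np.array([240.0, 290.0, 320.0, 411.0])

# vmin=300, vmax=380 in imshow maps everything outside that range to the
# end colors of the colormap -- the same clamping as np.clip:
clipped = np.clip(bt, 300.0, 380.0)
```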
I should also point out that since we're using matplotlib, when I created my figure earlier I added the %matplotlib notebook magic command. This tells matplotlib that we want its interactive widgets so we can actually move around our image. This empty square here is your zoom rectangle: click on it, drag a little square, and it zooms into that square of data. If we want to pan and zoom, we can click these four arrows and move around on our data; to go back to the full view, click home. If you wanted to save this image as a PNG, you could either run the normal matplotlib commands to save it or click the download button. That's how we'll use matplotlib today, just to explore our data and look at it; we'll do other things to look at our data, but this is the easiest way in an interactive notebook environment. I listed here some other colormaps you could use; some are more useful than others, and some I just thought made weird images, going down that whole jet route of colormaps, just having fun with it. You could try those out too. Does anybody have any questions so far?

So the question was: when we were making the plots, is dask smart enough to know, or is it being recomputed every time? The answer is that it is recomputed every time. I'll go over some ways to get around that, but yes; luckily right now we're only loading data from disk with a few scaling calculations, and boom, it's on the plot.

Oh, I should have mentioned this before: if you created your conda environment more than about a week ago and haven't updated xarray, you should do that now or at a break, and restart your notebooks. I believe it's the latest 0.12 release; it includes a bug fix by me, which I discovered while making these notebooks, where xarray's plotting utilities were actually computing the dask arrays twice every time you
make a plot. It's not a huge deal right now, because we're just plotting data from the file, but when we get into making RGBs and things with more calculations, you end up waiting a minute for things that shouldn't take that long. If you need help with that, raise your hand and talk to the helpers; you can check your version by running "conda list xarray" in your terminal, and we can talk about it at the breaks. Hopefully it won't affect anybody; I think that release has been out for about a week, so if you created your environment within the last week you should be fine.

Yes? So the question was: if you zoom in on part of the plot, can you reuse that view for later plots? Not interactively through matplotlib, as far as I know, with what I'm doing here. We're going to talk about slicing in a bit, and we'll talk about cutting out areas of the data programmatically using SatPy based on the geolocation. As far as selecting an area in matplotlib and then using it later, I know a lot of people are working on that and I'm sure it exists in a lot of libraries already out there, but I won't be using that today.

Yes? So the question is: if I zoom in like this and I save this image, is it saving every data point of the whole image? This is all through matplotlib, and it's saving the actual PNG colors of the plot. It's not looking at every data pixel; it's not even looking at all the real data that made this plot, because if my data was 1500 by 2500, that's not how many pixels are in this image, and matplotlib is interpolating, or whatever it wants to do, to make the image fit in this
plot. So no, it's not saving every data point; it's saving whatever it has to in order to make this image, just red-green-blue PNG pixels.

OK, let's move on to slicing. I talked about how DataArrays can be used a lot like numpy arrays, and part of using numpy arrays is getting the exact rows and columns you want; we can do that with DataArrays too. Here we call the Scene's load method again: even though we created the Scene and loaded data before, we can load more data later if we want to. Here we load channel 2; if you loaded channel 2 already, it's fine to rerun this, it should realize it's already loaded. Then we can use slicing, along with a feature of slicing called striding. For our y dimension, our rows, we give a start index, an end index, and a step of every 4 pixels, and the same for the columns: start at this index, go to this index, take every 4 pixels. If we run this, the sliced DataArray gives us a new DataArray back; it realizes we're looking at a smaller chunk of the big array, so it keeps all the dimensions, keeps the coordinates but slices and strides those coordinates for us too, and keeps all of our metadata. You can see that through the striding I end up with 250 by 250 pixels, and we can use this just like any other DataArray and plot it in a matplotlib figure. We run this with the utilities we just learned, and we get every 4th pixel. You can tell that obviously we're looking at a smaller chunk of the data, but it's also lower resolution, because we only kept every 4th pixel of the original data.

What I'd like you to do now is take five to ten minutes and use what we learned about matplotlib: load different bands and see if you can spot the differences between the channels. Remember we have C01 up to C16, and you can use the Scene's load method.
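The start:stop:step slicing described above behaves the same way on a plain numpy array; the indices below are made up to reproduce the 250 by 250 result:

```python
import numpy as np

# stand-in for a 1500 x 2500 ABI channel (values here are synthetic)
data = np.arange(1500 * 2500, dtype=np.float32).reshape(1500, 2500)

# start at an index, stop at an index, take every 4th pixel in y and x
subset = data[500:1500:4, 1000:2000:4]
```

An xarray DataArray additionally carries its coordinates and metadata through the slice, which a plain numpy array does not.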
Feel free to play around with vmin and vmax, and look at color maps. Take five to ten minutes and explore the different bands; see what you see. We'll use these bands later on when combining them, but if you find one that you think looks better than the others, maybe use that in the later lessons as we go along. Feel free to raise your hand and ask questions of me or the helpers, and we'll come back together and start again in a few minutes. Oh, and also explore the data using the interactive matplotlib tools. You may have guessed by the names of the files that you downloaded that there's a particular event in these files; you might be able to find it by zooming in.

So the question is: does SatPy do anything with higher resolution satellites or higher resolution data? Do you have an instrument in mind with that resolution? Okay, and are those remote sensing satellites that measure wavelengths? Okay, no, I have not done that. We'll see later in this lesson that SatPy has a lot of readers; we have contributors who work with EUMETSAT and different European organizations, and I do a lot of work with NOAA and NASA, so I use a lot of the data they're providing. Another thing I could have pointed out earlier is that we'll be looking at what's called level 1 data, which is the radiance data. There's level 0, which is what we get directly from the instrument. You can also have level 2 data, where people have taken that level 1 and said "this wavelength sees this, that wavelength sees that, so I know there are clouds, or I know there's this type of cloud." We support reading all of those types of data. There are also different types of instruments, but as far as remote sensing imagery instruments go, as far as I know GOES is the highest resolution NOAA one for geostationary. I've never had to use anything higher than that, or been told of anything higher than that, so maybe
we can talk later. All right, why don't we bring it back. A couple of questions came up during this. One thing some people had issues with is that you do have to explicitly load a new channel: even though it shows up in the available dataset names, you have to load it to actually use the data. Another thing that came up is questions about what readers SatPy has. We'll see a function in a little bit that lists which readers you have available and which ones you have all of the dependencies installed for. They're also listed in the SatPy documentation at satpy.readthedocs.io; there's a nice big long table of what's supported. SatPy and the PyTroll organization are not run by a company or funded directly at all; it's all volunteers working at different agencies across the world, so a lot of the readers were added by necessity. As far as I know we don't have anything for Landsat, but that's just because nobody has needed it yet, and there's documentation on how to create your own reader. Me personally, I get paid to do a lot of work with NOAA and NASA satellites, or the products that they produce, so that's where the readers I've contributed come from. But we have developers all over Europe who work with Russian satellites and EUMETSAT's European satellites, plus China, Japan, India, Australia; whatever these contributors ended up needing, they probably contributed those readers. So that's where those came from.

Another question that came up: why can't you do something like take the loaded channel and change a specific value, say "I want the value at [0, 0] to equal 5 now"? Well, you can't do that; you get an error like "this variable's data is stored in a dask array, which does not support item assignment. To assign to this variable, you must first load it into memory explicitly using the load method or access it with .values". Because of how dask works, we are
creating a series of tasks that produce the final dask array we're using, and we can't change a task once it exists in that series of tasks; we have to create a new task that affects the result of the array. One way to do that: you may be familiar with the numpy where function, and there is a dask.array where function too. I'm just going to show this quick; you don't have to follow along with this part because it's just proof of concept. So if I do da.where, and hopefully this works because I've never tested this, with "greater than or equal to 300", and say set it to 50... and of course that doesn't work the first time, oh, because of the value I was using, okay. So I get a new dask array; it has a new set of tasks built on top of that first dask array I had. Most numpy functions now, based on the work that numpy, dask, and xarray have done, will accept dask arrays or xarray DataArray objects and return the same type you originally gave them: a dask array or a DataArray. numpy's where is one of those functions that, as far as I know, doesn't do that: if I import numpy as np and call np.where, I'm not sure what this will return... yeah, so that actually computed the data and returned a numpy array. There are a couple of functions like that which have to be computed, based on how numpy structures things. I'm sure those will change over time; if you're curious about it, I'm sure there are plenty of numpy and dask developers here at this conference this week that you could talk to.

So let's move on a little further. Any other questions regarding that? No? Okay. One thing I didn't mention is that with xarray we can slice based on those coordinates that we have, using the .sel method on the DataArray object: we specify the dimension that we want to slice and slice based on some of those coordinate values.
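The dask where demonstration above, done with a toy dask array instead of a loaded channel (the threshold and replacement value are arbitrary stand-ins for the live demo's numbers):

```python
import dask.array as da
import numpy as np

# A small lazy array standing in for a channel's dask data.
arr = da.from_array(np.linspace(200.0, 320.0, 16).reshape(4, 4), chunks=2)

# da.where does not modify arr; it adds new tasks on top of the old graph.
clipped = da.where(arr >= 300.0, 50.0, arr)
print(type(clipped))   # still a lazy dask array, nothing computed yet

result = clipped.compute()   # the tasks actually run here
print(result.max())          # 296.0: everything >= 300 became 50
```

At the time of this tutorial, np.where on a dask array computed eagerly and returned a numpy array, as described above; with the NEP 18 dispatch work in newer numpy and dask, np.where may now return a dask array instead, so the behavior depends on your versions.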
For the channel I loaded, and this should work for yours too, I can slice based on metered values. I mentioned those x and y coordinates are in meters; I can slice based on those using the fancy slice object. If you're not familiar, slice is the actual object that gets used when you use the ":" syntax in your brackets. So we can create these and slice our DataArray based on the coordinates rather than the index. This is very useful if you're very familiar with your geographic coordinates in a certain projection: you're slicing into your array with the data's coordinates, not some array location.

Next up, accessing data in different ways. I mentioned that every channel represents a wavelength. If you were using higher-level products, like cloud type ("what type of cloud is this?"), those don't have a wavelength associated with them, but for the data we're working with right now, each channel has a wavelength associated with it. We can access that through the wavelength attribute, so .attrs["wavelength"], and this will print a three-element tuple where the middle number is the nominal wavelength. This is what the instrument was designed to observe; "see" is a bad word, but I'm not a scientist, so I can say it. So 3.9 is what this channel is meant to be recording, but based on the design and engineering of the instrument it sees a little wider than that exact wavelength. That's what this tuple is trying to tell you: anything between 3.8 and 4.0 microns could be observed by this channel. We can use this wavelength number instead of the channel name in our scene. This is very helpful for scientists who are used to using wavelengths instead of channel names, because a different satellite, even in the same family of instruments (we're using ABI right now; the Himawari AHI instrument is the same type of design), has different channel numbers with slightly different wavelengths.
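A standalone sketch of coordinate-based slicing with .sel and slice objects. The coordinate values here are invented; with SatPy they would be the channel's real metered x/y coordinates.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(
    np.random.rand(100, 100),
    dims=("y", "x"),
    coords={"y": np.arange(100_000, 0, -1000),   # meters, descending like ABI
            "x": np.arange(0, 100_000, 1000)},   # meters, ascending
)

# slice() is the object behind the ":" syntax.  .sel matches coordinate
# *values*, not array indices.  Because y is descending, its slice runs
# from the high value down to the low one.
subset = data.sel(y=slice(80_000, 20_000), x=slice(10_000, 50_000))
print(subset.shape)  # (61, 41)
```

Note that label-based slicing in xarray is inclusive of both endpoints, unlike index-based slicing, which is why the result is 61 by 41 rather than 60 by 40.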
But we know we want the 3.9 micron channel, so I can use the "in" syntax to ask: is there a 3.9 micron wavelength channel in this scene? And I don't have to use the exact wavelength; I can do something like 3.85, and anything within that range works. This is useful because, whether from the engineering like I mentioned or just the specification for the instrument, it might not be the exact wavelength, but it's close enough that I can still do the same type of science I want to do. We can use this with the same bracket syntax that we've been using, but with the wavelength instead of the string channel name, and we could use it in a load call as well. Just another way to access things.

We'll do a quick mention of geolocation. A lot of people are used to using longitude and latitude arrays to locate their data, to have points for everything. We can get access to those arrays with the area attribute, which has a get_lonlats method. If we call it, it takes a little bit, and we get big numpy arrays of all the longitude and latitude coordinates. We could have it return dask arrays if we wanted by saying chunks=2048; that's just a way to tell SatPy and the libraries underneath that it should break these arrays up into dask arrays. We run this and get dask arrays back right away. If that's what you want to use, that's how you can do it. It's not usually necessary to use raw longitude/latitude arrays with SatPy, but I wanted to mention it.

It is important to know what an area object is and, like I said, we'll do more with this later, but I want to point it out now just so things make sense. The area for our ABI data is an AreaDefinition object; it comes from the pyresample library and is used to define gridded, projected data. If we print it out, we get information about the projection, the size, and where our data is in this projection space.
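To see how an AreaDefinition encodes geolocation, here is the arithmetic it implies, in plain numpy. The numbers are invented, not the ABI area's real extent, and this sketch stops at projection x/y coordinates; pyresample's get_lonlats additionally converts those through the projection to longitudes and latitudes.

```python
import numpy as np

# A made-up area: 500 columns x 300 rows covering this extent in meters,
# given as (x_min, y_min, x_max, y_max) like AreaDefinition.area_extent.
width, height = 500, 300
x_min, y_min, x_max, y_max = -2.5e6, -1.5e6, 2.5e6, 1.5e6

# Pixel size falls out of the extent and the grid shape...
dx = (x_max - x_min) / width    # 10000.0 -> a "10 km resolution" area
dy = (y_max - y_min) / height   # 10000.0

# ...and so do the projection x/y coordinates of every pixel center.
x = x_min + dx * (np.arange(width) + 0.5)
y = y_max - dy * (np.arange(height) + 0.5)  # y decreases down the rows
print(x[0], y[0])  # -2495000.0 1495000.0
```

This is why the printed AreaDefinition only needs a projection, a shape, and an extent: everything else about where each pixel sits can be derived from those.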
That's how SatPy encodes all of this geolocation information, and we'll use it later on. Lastly, if you want to see what readers are available to you based on what your environment has installed, you can run the available_readers function. If I run this I get a ton of warnings, because it's trying to import things that I don't have the dependencies for, and then I get a nice long list of reader names that I can use to read data files. Like I said, this is also on the SatPy documentation website in a big long table. I don't think these are alphabetical... they mostly are. So there's a list there with a small description of what each one does. Any questions on reading or basic dask and DataArray stuff before we move on to the next lesson? No? All right.

I'm going to close this tab and shut the notebook down, just to show how to do that: I check the box on the notebooks page and click Shutdown, and actually I'll do that for the introduction notebook too. That killed the Python kernel that was running the notebook; I don't need it anymore, so it's not necessary to keep it. Now I'm going to open the writing notebook.

A very common thing we deal with in SatPy is needing to give data that we either created or loaded to somebody else who's not familiar with Python or SatPy, or who's required to use some visualization client that doesn't support the raw data files we have. In the lessons we're doing today we're lucky that the data files are NetCDF and HDF5, which are fairly popular and fairly easy to read. There are a lot of data file formats out there that are just random custom binary things that are very difficult to read; that's one of the benefits of having readers that everybody can use. But there are some clients, like AWIPS, which the US National Weather Service has to use to view their data, and AWIPS only supports certain file formats,
so we need a way to convert to something that it can read. Another popular target is a web mapping service, a WMS server; you typically use GeoTIFFs with those, so we might want to make some GeoTIFFs. SatPy comes with a couple of writers. We can run the available_writers function, just like we did available_readers; you'll probably get a warning if you don't have the dependencies for some of the writers, but here are some of the ones that exist. Some of them are proprietary formats that contributors needed; they come with SatPy because they're not overly complicated and just have a few dependencies. The ones we'll be talking about today are the GeoTIFF writer, "simple image" (PNGs and JPEGs), and the CF writer for trying to write CF-standard NetCDF files.

In the next cell we're going to make GeoTIFFs. We're going to have you pick a channel again; I'm going to do channel seven, so change that "EDIT ME" to whatever channel name you want. We load the same files we were looking at before, create a Scene, load that channel, and then, in the least exciting way possible, we run one method and get a GeoTIFF. What this did, the save_datasets method on the Scene, is by default take every channel you've loaded and try to write it out to a fairly generic GeoTIFF file; it uses a lot of the same defaults that GDAL has. In notebooks you can use an exclamation mark to run shell commands, the commands you would normally run in your terminal. Hopefully this works for everybody; I didn't have a lot of Windows testers for this tutorial, so hopefully it works on all Windows machines. I can run pwd to show my current directory, which tells me I'm in the notebooks directory of this tutorial, and I run ls to list the directory and now see that I have a C07 GeoTIFF with the date in its file name.
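Put together, the whole cell looks roughly like this. This is a sketch only: it is not runnable without SatPy and the tutorial's downloaded ABI files, and the glob pattern is a guess at the tutorial's directory layout, not the actual path.

```python
from glob import glob
from satpy import Scene

# Hypothetical path to the downloaded ABI L1b files.
filenames = glob("../data/abi_l1b/*.nc")

scn = Scene(reader="abi_l1b", filenames=filenames)
scn.load(["C07"])

# Writes one generic GeoTIFF per loaded channel, with GDAL-style defaults.
scn.save_datasets()
```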
I can use the gdalinfo command to list out information; this comes from the GDAL library, however you want to pronounce it. Oh, and I'm using question marks here as a wildcard, so whatever channel you chose, this command should work for you. From gdalinfo's output we see some size information about this GeoTIFF, some projection and geolocation information, and metadata that SatPy added, like the start time, the date/time of the image. We can see that it's using DEFLATE compression by default, and at the bottom we can see there are two bands in this file. Our data is a single 2D array, and by default what SatPy stores is called an "LA" image: luminance plus alpha. The value, the level of gray, is in the first band; that's where our real data is. The second band is an alpha band that determines what's transparent in our image: if it's 0 it's fully transparent, if it's 255 it's fully opaque. I think that's right.

We can look at this image. Go to your operating system's file browser and navigate to where this GeoTIFF is, so your tutorial directory (can I make this bigger? Ooh, that's kind of ugly), then into notebooks. On a Mac this should open in Preview if I double-click the GeoTIFF, and I get an image like this. On Windows I think it's your default photo viewer; Linux should use ImageMagick if you have it installed. If you had some visualization client like ArcGIS or QGIS, or a WMS server if you really wanted to start one up and upload it, the GeoTIFF should work fine for that. You'll notice this looks a little different than it did in matplotlib. That's because there's an extra step going on in SatPy where it has to convert our float observed data values to something the imagery can understand, something that fits inside the image, which is an 8-bit unsigned integer, so
there are just some extra "enhancement" steps, as we call them. This will come into play even more in later lessons, but just know that extra step is there. So the question was: why are we writing 8-bit integers instead of floats? I guess my question back is: what applications do you use that can read float GeoTIFFs for viewing images? In my experience image viewers can't understand float data; Preview on Mac, for example, doesn't understand float GeoTIFFs. The other thing is: are you saving a float scaled to 0 to 1, or are you saving the raw data? That is, are you normalizing the data before writing it out to the GeoTIFF? There are just different use cases, and what we'll do next is actually write out a float GeoTIFF. One thing I should have pointed out about that image: the space pixels aren't considered valid data with ABI, so those were transparent.

Another type of GeoTIFF that is very common is called a cloud optimized GeoTIFF (COG), also called a tiled GeoTIFF. This is just a different way of writing the data to disk: it compresses better and it works very well for WMS servers, those web mapping services. It's also supposed to work very well if you're storing GeoTIFFs on cloud storage, because you can read individual tiles instead of the default, which is writing row by row to disk. If you're curious, there are links in this notebook explaining what cloud optimized GeoTIFFs are. If you're familiar with the Pangeo project, they prefer that GeoTIFFs be written as COGs so that they're more friendly to cloud computing resources. We can create these by specifying some extra keyword arguments to our save_datasets method; anything available in rasterio, or in GDAL which rasterio uses, can be passed here. So we can say tiled=True and run this. I also added the dask diagnostics ProgressBar in the cell; this
will detect, inside the with block, any time a dask array is computed and show us a progress bar for that computation. We see it took about 0.3 seconds on my laptop and it wrote that GeoTIFF. If we do an ls again, we see we still have the C07 file; SatPy by default doesn't care about overwriting files, it assumes you, the user, know that you were going to overwrite it. We can run gdalinfo again, and the main difference is that the block parameter now says it wrote a square of data instead of a line of data; that's the tiled layout. Otherwise everything else is the same.

As was mentioned with that question, sometimes you want to write float data. I mentioned there's that enhancement step; if you want to write the raw data values, the brightness temperatures I talked about, or the reflectance values, sometimes you don't want to scale them to some smaller data type like an 8-bit integer, you want to save the raw float values. We can do that by passing some additional keyword arguments to save_datasets. We can also tell it to save to some other directory, which I'll do here by specifying base_dir and saying put everything in a float_images directory. I can customize the file name and specify that I want it named after the channel with a ".float" suffix, if that's what I want to call it. I can specify what data type the output should be, a numpy 32-bit float, and I can turn off the enhancement, that conversion from observed data values to image values, by saying enhance=False. If I run this, I get a progress bar, and an ls shows that I have a float_images directory now; it was created automatically since it didn't exist. If I ls float_images, I see I have a single GeoTIFF there. A nice key point here is that I can specify anything from the .attrs dictionary in this format string, which can be very helpful if you have some long custom name that you need or you need things in a different order.
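The float-GeoTIFF call from this cell, as a sketch. The keyword arguments mirror what's described above; the path and the exact filename pattern are illustrative, and this is not runnable without SatPy and the tutorial data.

```python
import numpy as np
from glob import glob
from satpy import Scene

scn = Scene(reader="abi_l1b", filenames=glob("../data/abi_l1b/*.nc"))
scn.load(["C07"])

scn.save_datasets(
    base_dir="float_images",        # created automatically if missing
    filename="{name}.float.tif",    # anything in .attrs can go here
    dtype=np.float32,               # keep raw float values...
    enhance=False,                  # ...and skip the 8-bit enhancement
)
```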
From earlier steps we had C07 with the start time; maybe you want the end time, maybe you want the standard name in the file name, maybe you want the platform name. You can access those by using a standard Python format string like this: curly brace, the name of the attribute, end curly brace. Running gdalinfo on this one, we see that instead of two bands at the bottom we have one band; it says it's a Float32 data array image, and there are no-data values, so wherever there is bad or missing data it'll be NaN in this output image. So that's how you make a float GeoTIFF, if you need to do that with whatever applications you use.

So take a couple of minutes real quick, just so this lesson has some amount of experimentation, and see if you can change the file name and add things like start time or standard name. Just play with that for a minute or two, and I'll show you something cool afterward if nobody's familiar with it. If you have any questions, feel free to ask now, and we'll get back together in a minute or two.

Any other questions before I do a quick example? Okay, one thing I wanted to point out before we go too far. Let's do this: I'm going to select a cell and hit "b", which gives me a new cell underneath, and type a save_datasets call; you don't have to type this if you don't want to, this is just me playing with stuff. Like I said, we can use anything that's in the attributes in our file name, so here I put the start time in the file name like this. If I save this and do an ls, I get kind of an ugly file name, at least I think it's ugly, where it has the name and, by default, the datetime object formatted the long way. If you aren't familiar with Python format strings, you can customize how datetime objects are formatted right in the string: if you put a ":" after the name of the attribute,
you can add format codes like you would normally pass to strftime: %Y, %m, %d, %H, %M. Maybe I didn't need the seconds in my file name, so now I have a year, month, day, hour, and minute file name, and I didn't have to do anything fancy with the datetime; I just added some format codes to the format string. A nice little trick if you didn't know it existed.

Another format that's very common is PNG. If you don't want to make plots but you want a real one-to-one scale, where a data pixel is a pixel in my output image, no interpolation, the full image, you can use save_datasets with this "simple_image" writer; by default it makes a PNG image. You may notice from the progress bar that this took longer than the last one. That's because with rasterio we're able to use dask more effectively to write to the file in parallel; as far as I know that's not possible with PNG, at least the way the PIL library implements it, so it's a little slower. Our images are small enough that this isn't a huge deal. If I list the directory I have this PNG image, and I could go back to my file browser: same kind of output as I had with the GeoTIFF, as far as the way I'm looking at it. save_datasets will also guess which writer it should use based on the file name: if I pass filename equal to something like "test.png", it will make a PNG based on that name, and similarly for ".tif" it would guess GeoTIFF. If you tried JPEG there, that should work too; there are some gotchas based on what kind of data you can write to JPEG. So yes, it will guess at what writer you're trying to use.

Lastly for writers today: there are other writers, but most of them are specific to particular visualization clients, so we'll talk about NetCDF, which everybody should hopefully be kind of familiar with. You store your data in a NetCDF file and it's supposed to be self-describing.
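The datetime-in-format-string trick is plain Python, nothing SatPy-specific. A self-contained example: the attribute name start_time matches what SatPy puts in .attrs, while the channel name and timestamp values are invented.

```python
from datetime import datetime

attrs = {"name": "C07", "start_time": datetime(2018, 11, 8, 17, 2, 31)}

# Default datetime formatting: long and a bit ugly for a file name.
print("{name}_{start_time}.tif".format(**attrs))
# C07_2018-11-08 17:02:31.tif

# A ":" after the attribute name takes strftime codes, so we can drop
# the seconds (and the spaces) right in the format string.
print("{name}_{start_time:%Y%m%d_%H%M}.tif".format(**attrs))
# C07_20181108_1702.tif
```

This works because str.format delegates everything after the ":" to the object's own formatting, and datetime objects accept strftime codes there.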
There are CF conventions; I think CF stands for Climate and Forecast. These are standard attributes that you should have with your NetCDF file to make it more self-describing, more useful to more applications. We can create the same Scene object that we created before, and this time we're going to load three channels at once: C07, C08, and C09. I'll use that progress bar again and call save_datasets, telling it we want to make a NetCDF file and what engine to use. The engine isn't technically needed; there was a bug in past versions of SatPy where the engine wasn't set properly depending on what you had installed (there are different NetCDF libraries you could use), so here I'm explicitly saying engine equals netcdf4. If you have an older version of these tutorial notebooks it might not have that, and you may want to specify it. I run this and now we get multiple progress bars. That's because it's not only saving the channel data for the three channels, it's also saving the longitude and latitude data, so it has to compute those values too; I think that's why there are five. There's a warning here about missing extra time information; I could argue that warning doesn't need to be there.

If we list our directory, we see that we get the file we specified in the filename keyword argument. We can use the ncdump command with -h to say "give us the header", the information about this NetCDF file, and we get a nice long output of what was written. Like I said, this is the CF writer in SatPy, and it's trying to generate a CF-standard NetCDF file; we're making improvements every release. You have the x and y coordinates, those 1D arrays of metered coordinates for the data, and longitude and latitude as 2D arrays; there are options to turn generation of those off. We have the
data arrays with all of the attributes that could be converted to NetCDF written to them; by default this is writing float data. xarray has its own utilities for writing NetCDF files, but they can't handle everything that SatPy uses, and they also do some things slightly differently from how SatPy writes its NetCDF files. Oh, and there are global attributes about which CF standard we're trying to follow and what software created this NetCDF file.

So that was the shorter lesson, but I thought it was important: if you need to use GeoTIFFs or NetCDF files, this functionality is here, and it can be very useful when working with this data. I would say reading the data from the files is the hardest part of using satellite instrument data, but usually you also have to give it to colleagues who need to do something outside of Python, or there's a visualization tool you want to use, so being able to save these formats should hopefully be easy; that's what we've tried to do in SatPy. Any questions before we move on to something else?

The question is, do I know what NCO is? With NCO... hmm, pull requests are welcome. The fill values, oh, that's an interesting topic. The original question was: is there a way to set a fill value other than NaN? Some people prefer not to have NaN as a fill value in NetCDF files. I understand it can be hard to deal with sometimes, but I also agree that NaN is a good value to use for that. I wish integers had a NaN value; I wish there were multiple NaN values easily accessible with numpy; there are not. So the answer is no, not right now. There was another question after that... oh, what do we do for NaNs? In SatPy, a relatively recent standard practice that we're trying to stick with is that for integer fields we use the _FillValue attribute that you might see with NetCDF files.
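The NetCDF-saving cell, sketched. Again this is not runnable without SatPy and the tutorial data; the file paths and output filename are illustrative, not the tutorial's exact names.

```python
from glob import glob
from satpy import Scene
from dask.diagnostics import ProgressBar

scn = Scene(reader="abi_l1b", filenames=glob("../data/abi_l1b/*.nc"))
scn.load(["C07", "C08", "C09"])

with ProgressBar():
    scn.save_datasets(
        writer="cf",                    # the CF-standard NetCDF writer
        filename="abi_l1b_c070809.nc",  # illustrative output name
        engine="netcdf4",               # pin the NetCDF backend explicitly
    )
```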
We try to keep that alongside the DataArrays we use in SatPy. One problem you might run into, if you're used to working with integers, is that in xarray certain operations will change the data type of your data. If you have integers and you say "I want to mask out everything, I want to replace every value greater than this with zero" using that where function, it might (I think there are some gotchas, I can't be sure) change your data type to something like float. That's because underneath, xarray defaults to using NaN as its fill value, and if you have integers, NaN is not an integer. SatPy tries to preserve data types as best it can; it does that by detecting that there's a fill value and that we have an integer data array, and keeping track of those things. That's very important when working with level 2 products where you have category products, like cloud type: a cloud being type 0 and a cloud being type 2 doesn't mean the clouds in between them are type 1, so you need to be really careful with any type of interpolation or averaging and keep things as integers. It's a heavily loaded concept.

Have we thought about using a binary mask? SatPy has gone through many revisions; we used to use masked arrays for everything before we used dask arrays. When we switched to dask arrays it was easiest to use NaN as a fill value for everything, and that also reduces the amount of memory you're using, because you don't have both the data and a mask; that was a major plus. Now dask arrays support masked arrays much better, as far as I understand from what I've read, so maybe that's something we could do in the future, but right now it's not heavily needed by us and we're doing okay. Any other questions?
So the question is: if I'm saving a ton of data to something like a GeoTIFF, will SatPy automatically split it into multiple files? It will not. There are writers that are meant to do that by design, like the SCMI writer, a NetCDF writer for AWIPS (for the National Weather Service) that I wrote, which is built around the idea of writing tiled NetCDF files, a NetCDF file for each tile, so it does that by default. There's nothing like that for GeoTIFF; we're pretty plainly using rasterio and just selectively writing tiles one at a time. Okay, I think that's it for writing. Let's go on to the super fun concept of resampling. (I'm going to shut down the writing notebook, by the way, and open the resampling one.)

I say it like that because resampling, if you're not used to it, is one of the most complex topics to explain and really understand. Hopefully you don't have to fully understand it to do your own work, but because of the concepts involved it's good to have a small handle on them; they provide you with some nice functionality and let you do things you might not otherwise be able to do. So let's start with a quote from the map projections book by Battersby: "Map projection is the process of transforming angular (spherical or elliptical) coordinates into planar coordinates. All map projections introduce distortion to areas, angles, or distances in the resulting planar coordinates. Understanding what, where, and how much distortion is introduced is an important consideration for spatial computations and visual interpretation of spatial patterns, as well as for general aesthetics of any map."

What that means is that we typically model the earth as a sphere or a spheroid, a sphere that's smooshed a little bit, and that's not a perfect representation of the earth; there are certain distortions. We're working on computer screens, which are 2D surfaces, and it would be nice if our data were on a 2D surface so that we can do more calculations and, first of
all, so that we can see things better. You can imagine, it's globe versus map: if you take the globe and try to smoosh it onto a flat surface, things get distorted and don't look great. There are different ways you can make a flat surface from the round-ish earth, and here are some examples in this image. We could take a flat surface, place it against the globe, and project the earth onto that surface; you might get something like the bottom image in the first column, just what you're seeing of the earth from that flat surface. We could use different shapes: in the middle column here we're using a cone; if you project the earth onto that cone, cut the cone, and flatten it, you might get something like the middle column's bottom image, and you can see it looks like a cone that was cut. Or you might use a cylinder, so you get something like the right column. One thing you might notice in these images, especially the cone, is this blue line in the middle; that's where the cone is intersecting the earth. That's kind of your reference longitude, where the distortion is least, where you're referencing all your coordinates from. Depending on how you change the shape of the cone, how far you push it into the earth, how you expand it, you get different coordinates, different results, different distortions. So some people have their favorite projections for where the data they're looking at is, and it is its own science, trying to find where the least distortion is to make your data look best. It's nice to put data onto these projections to reduce those distortions, and it also gives you a single coordinate space for working with multiple instruments, with data from different sources. You may get one dataset on one projection, you may have random latitude/longitude points, but you want to combine all of these together.
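The cylinder case can be made concrete with the classic spherical Mercator formulas. This is a standalone numpy sketch of the math, not SatPy or pyresample code.

```python
import numpy as np

R = 6371000.0  # mean Earth radius in meters (the spherical model)

def mercator(lon_deg, lat_deg):
    """Spherical Mercator: lon/lat in degrees -> planar x/y in meters."""
    lam, phi = np.radians(lon_deg), np.radians(lat_deg)
    return R * lam, R * np.log(np.tan(np.pi / 4 + phi / 2))

# The equator is where the cylinder touches the sphere: no distortion.
x0, y0 = mercator(0.0, 0.0)

# Away from that line, east-west distances stretch by 1/cos(latitude):
# at 60 degrees north everything is drawn twice as wide as it really is.
stretch = 1.0 / np.cos(np.radians(60.0))
print(round(stretch, 6))  # 2.0
```

The "blue line" in the figure plays the same role the equator does here: the place where the projection surface touches the earth and the stretch factor is 1.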
Maybe the sphere of the earth isn't the best way to look at that, so let's try some projections and start peeking into this concept with SatPy. We start with our first code cell, loading channel 6 — everything is the same as before, same reader, just loading channel 6 this time. Let's talk about that area definition again. If I print it out, we see an identifying name like "GOES-East" and a human-readable description saying this is a 2-kilometer-resolution area. This is the geolocation of the data we're looking at: our data is an image on the earth, so where is it on the earth, and how do we describe that? The way SatPy does it is through this area object. We have names, we have how many pixels there are in columns and rows, and we have this area extent. The extent says where our rectangle of data is: the first value is the left-most coordinate, then the bottom, the right side, and the top. Those are coordinates in meters, telling us where our rectangle of data sits in this x/y coordinate space. Then there's the projection. You don't have to know what every parameter means; just know that these parameters choose which of those shapes we're projecting onto, how big it is, and what that shape — that cone or cylinder — looks like. SatPy uses parameters from the PROJ library; you may also have heard of well-known text, or WKT, which is another way to describe these parameters. By changing these parameters and their values, you change the shape, the resulting distortion, and what you're looking at. One of the key points here: "geos" is short for geostationary — that's the type of projection this is, and it looks like this image of the earth.
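Since the extent is just four numbers in projection meters, the per-pixel resolution falls out of the extent and the row/column counts directly. A quick sketch with made-up numbers shaped like a "2 km" ABI-style area (the extent values here are illustrative, not taken from a real file):

```python
# Hypothetical area-definition pieces: extent is
# (x_min, y_min, x_max, y_max) in projection meters.
area_extent = (-3_627_271.29, 1_583_173.65, 1_382_771.92, 4_589_199.58)
ncols, nrows = 2500, 1500

# Width/height of the rectangle divided by pixel counts gives resolution.
pixel_size_x = (area_extent[2] - area_extent[0]) / ncols
pixel_size_y = (area_extent[3] - area_extent[1]) / nrows
# Both come out near 2004 m, i.e. roughly a "2 km" area.
```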
I have a little link here I can open: it's the PROJ description of this projection, and it's where I took the image from. This image of the earth is described by this string of parameters — in SatPy we were using a dictionary, but it can also be represented as a string. There's all this math that goes into it, all these parameters we can change, but the end result is an image of the earth: if you were the satellite looking at the earth, this is roughly what you'd see. That's how we use this with geostationary data, or gridded data in general. If our data were on a different projection, like Lambert Conformal Conic, we can click on that link to bring up its PROJ description: a different image, described by different parameters, with different distortions and a different view. What we're trying to do is describe where our rectangle of data sits within that image, so we know where every point is. We could use latitude/longitude values for every pixel, but then we'd have to hold all of those in memory as we pass our data around. With something like this area definition, we have a couple of descriptive strings, this dictionary of projection parameters, a four-element extent tuple, and the number of rows and columns — and that's all. That fully describes where this data is in the projection, so it saves memory and computation time.
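The memory argument is easy to see with a back-of-the-envelope sketch. The image size below is hypothetical; the point is the ratio between storing per-pixel coordinates and storing a handful of area parameters:

```python
import numpy as np

nrows, ncols = 7600, 6400  # hypothetical image size

# Per-pixel geolocation: a float64 longitude AND latitude for every pixel.
lonlat_bytes = nrows * ncols * 2 * np.dtype(np.float64).itemsize

# Area-definition style: 4 extent floats + shape + projection parameters.
# Call it ~1 kB to be generous about strings and dict overhead.
area_bytes = 1024

ratio = lonlat_bytes / area_bytes  # hundreds of MB vs. about a kilobyte
```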
Let's go a little further and look at the differences between some resolutions. Our image sits on one of these projections, but there are different numbers of rows and columns for the different bands because they have different spatial resolutions. If we load channel 5 — channel 6 is already loaded — and compare the two areas, we can see the columns and rows differ: channel 5 has twice as many rows and columns. Its area extent — where the outer edges of the image are — is essentially the same, and most importantly the projection information is identical. That's because both are recorded from the same instrument: the data is on the same projection, which means it's in the same coordinate system, so the point x = 1000, y = 1000 is the same location for both datasets. So we have two DataArrays of different sizes — how do we do calculations with them? You have to make them the same shape so you can actually work with them, and the easiest way to do that in SatPy is to resample. There are many different resampling algorithms, and we'll step through some of them here. The most basic is "native", which is technically not resampling, but it makes the most sense in SatPy's interface to put it there. We call the Scene's resample method with resampler='native', and it returns a copy of the Scene with that resampling applied. Let's run the cell and I'll explain a bit. The last line of the cell asks whether the channels are the same shape now — the same number of rows and columns — and it's true. What native resampling does is ask, by default: what is the highest-resolution channel we're working with, and how do I make the lower-resolution channels match it? The easiest way is replicating pixels — that's why native resampling isn't technically resampling, but it makes things the same resolution, and that's close enough for a lot of the work we do. We can verify by looking at the area attribute of channel 6 again: it now has the same number of rows and columns as channel 5. So let's say we want to plot a difference now that things are the same resolution.
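Native resampling's pixel replication can be sketched in plain numpy — this isn't SatPy's implementation, just the idea: upsample the coarse band by an integer factor so an element-wise operation with the finer band becomes possible.

```python
import numpy as np

# A tiny stand-in for a coarse 2 km band.
coarse = np.array([[1.0, 2.0],
                   [3.0, 4.0]])

# Replicate each pixel 2x along both axes, mimicking "native" resampling
# up to a band with twice the rows and columns.
factor = 2
coarse_up = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

# Now a same-shape arithmetic operation with a "fine" band works.
fine = np.full((4, 4), 0.5)
diff = coarse_up - fine
```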
We can actually do that now — our arrays are the same size. We load matplotlib, and we treat DataArrays just like numpy arrays: we access the Scene object, get channel 6, and subtract the channel 5 dataset. Notice I'm using the new Scene variable I got here — the one resample returned. With this new DataArray, I use the xarray plotting utilities, and running this I get a plot. I've now worked with data that was at different resolutions, which I couldn't do before. Like I said, the native resampler replicates data if it needs to: this pixel is 1/4 the resolution of the destination we want, so it repeats the pixel four times to match. So that's one way to resample — make things the same resolution. But what if we want to change the projection? We can use a different resampling method for that: nearest-neighbor resampling. To start, though, we need an output area and output projection to go to. We can use the pyresample library's create_area_def for this: if you want to change the projection, change some of the projection parameters, or cut out a specific area of the data in a different projection, create_area_def lets you define the projection, its parameters, and the area you're looking at. So I import it, and — you can put a question mark at the end of any object or function in a notebook and it will give you the documentation — this opens a little pane with the docstring for the function, and you can look at all the parameters it accepts. I'll close that by hitting the X in
the upper right. So what this function does: you give it some set of parameters describing an area, and it figures out what it has to do to make a valid area definition that fully describes the area you're trying to make. Here I'm calling it with a dictionary of projection parameters — just ones I grabbed and wrote down — saying I want a 1,000-by-1,000-pixel image. We've talked about how the area extent is in meters; if somebody knows how to magically come up with those meters, that's amazing — I do not. I typically work in degrees: I know where things are in latitude and longitude, so I can specify things in degrees using the units keyword argument and saying "degrees". That means any number representing a distance or coordinate in here is in degrees. So the area extent here is the left side of the image, then the bottom, the right, and the top coordinate, as longitude and latitude values. If I run this and print out the resulting area definition, we see the projection I asked for, the rows and columns, and create_area_def did the conversion to meters for us, using PROJ to transform those coordinates. Now we have the output area we're trying to get to. We can pass this area definition to resample — by default it will do nearest-neighbor resampling — and we get a new Scene, similar to what we did before. Oh, I should have pointed out — I think these cells are out of order — that as part of this area definition we're also defining a resolution: we defined the number of pixels, and the other side of that coin is a resolution for each pixel. We can use the pixel_size_x and pixel_size_y attributes on the area definition to see that pixels are about 1,500 meters in the x dimension and about 2 kilometers in the y dimension, which tells us the resolution of our output area.
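The degrees-to-meters conversion create_area_def performs depends on the projection, so PROJ does the real work; but for the simplest case — an equirectangular-style projection where x and y are just arc lengths — the conversion is plain geometry, which this hedged sketch shows. The spherical radius and the Texas-ish extent are assumptions for illustration only:

```python
import math

R = 6371000.0  # assumed spherical Earth radius in meters

def degrees_extent_to_meters(lon_min, lat_min, lon_max, lat_max):
    """Degrees -> projection meters for a plain equirectangular mapping,
    where one degree of arc is the same length everywhere."""
    m_per_deg = R * math.pi / 180.0
    return tuple(v * m_per_deg for v in (lon_min, lat_min, lon_max, lat_max))

extent_m = degrees_extent_to_meters(-105.0, 25.0, -95.0, 35.0)
# The 10-degree-wide box becomes roughly 1,111,949 meters wide.
```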
Going back to our resampled Scene: if I plot channel 5 after the resampling, we'll see it took a little longer, but we now have a smaller area at a different resolution — I can tell you it's 1,000 by 1,000 pixels. It kept all of our attributes: the color bar has its label, and we have our axis labels as well, but we're at a different resolution, different size, and different projection, and all of that happened with that one resample call. We could also look at the DataArray that's produced, and we'll do that next. Are there any questions so far before I move on to the next part? Yes — nearest is the default, so you could also write resampler='nearest' explicitly here. Earlier, when I did native resampling, I wrote resampler='native' and didn't specify an area, because the default area is the maximum resolution of whatever channels we're looking at — so that would return the same results. So far, we created an output area definition with everything specified: we gave it enough information to say what projection, what size, all of that. What if we didn't know everything right away? Say we have the projection and we know we want 5-kilometer pixels, but we don't know where our data is — just make it work, just make an image that shows all of our data. We can use that same create_area_def function without providing all of the information, and it will create what's called a dynamic area definition. A dynamic area definition doesn't have all the parameters filled in; they get filled in by the resampling method when we call it. So let's create this area definition — same projection as before, but now I've passed a keyword argument asking for 5-kilometer resolution, meaning in the x and y dimensions I want each pixel to represent five
kilometers on this projection. We pass this area to the resample method just like before — it defaults to nearest-neighbor resampling — and we can print out the shape of our area. Now, this takes a lot longer than the last resample, because it's actually figuring out, from the longitude/latitude points of the input data, what the output area definition needs to be. Based on the projection, the resolution we asked for, and where our data is on the earth, it decided the best size for this output area is this many rows and this many columns. We can now plot this, and this actually runs the nearest-neighbor resampling we prepared, using dask: we built up those computations — loading from the file, scaling the values out of the file, doing the actual nearest-neighbor resampling, and plotting the data — and all of that happened and was computed, via xarray, when we did this plot. So this is what that GOES ABI data looks like when you transform it from a geostationary projection to a Lambert Conformal Conic projection and tell it to show all of the data. We mentioned earlier that the top-left corner of this data was in space — that's why we have this long arm going off to the upper left: in this projection, that data is going off the earth, and the projection doesn't have a place to put it. The data also has some distortions up there: we're going from one projection to another, and in the output projection those pixels are really far apart, so the nearest-neighbor search's default for how far it looks for valid pixels wasn't enough, and we get this kind of separation. If I use the zoom tool, we get these cool artifacts. So that's something you have to worry about: how far do I tell the algorithm to look? There are a lot of options for resampling.
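What the dynamic area does when it "figures out" the output size can be sketched simply: given the projected bounds of the input data and the requested pixel resolution, the number of rows and columns follows. The bounds below are made up for illustration:

```python
import math

# Hypothetical bounds of the input data after transforming its lon/lats
# into the target projection (meters).
x_min, x_max = -1_200_000.0, 800_000.0
y_min, y_max = 200_000.0, 1_700_000.0

resolution = 5000.0  # the requested 5 km pixel size

# Enough whole pixels to cover the data in each dimension.
ncols = math.ceil((x_max - x_min) / resolution)
nrows = math.ceil((y_max - y_min) / resolution)
```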
There are different resampling algorithms, but that's how we change and transform our data from one projection to another. Any questions? No? All right — what I'd like you to do is take about ten minutes and play with the cells I have started here: we create an area definition, we resample, and we plot, using channel 5 the whole time. I give you some suggestions for projections you could try. We're looking at a relatively small area of data, so the distortion won't be very large, I'll tell you that — but try these different projections in create_area_def, change some of the parameters, maybe the area extent, and feel free to look at the other parameters create_area_def accepts. Try resampling and see how it affects things. If you come up with questions, feel free to ask — remember we have helpers, so raise your hand if you have problems. So let's take ten minutes to play around with the projections. ... All right, I'm going to cut this a little short so I can show an example that came up from a question. The question was: I chose Mercator, but why are the coordinates in meters? Mercator is an actual metered projection — it just looks a lot like a lat/lon grid, because it's a cylinder and we're unrolling that cylinder. If we want degrees, we can use "longlat" for the projection — that's part of PROJ. I'll remove this lon_0; everything else I'll keep the same: I want 1,000 by 1,000 pixels and the same area extent. I run the same resample commands, I get a plot, and we can see my x and y are now in degrees, because I'm in a longlat projection — or lat/lon, however you want to say it; same thing. The extents and coordinates are in degrees, and it all worked. Does anybody have questions based on what they saw, or anything
else? So, the question is: what does resolution mean, given that the resolution is different in each projection? What we're defining is a grid of pixels in a particular projection's coordinate system. With this projection, we define where in that shape the origin (0, 0) of our x/y coordinates is, and then, from that origin, the coordinates of each pixel in our image. We're saying, in that projection space, how wide and how tall each pixel is. Those meters are based on the projection parameters — on what a meter means in that projection's coordinate system. Generally, it represents a meter on the earth at the best reference location, where there's no distortion. So 750 meters in the geostationary projection is not 750 meters in a different projection: we're defining a grid with equal distances between pixels in projection space, not on the earth, so depending on which projection we choose, we get different distortions from those distances. Does that kind of answer it? Yeah — for example, if you were working over Austin, Texas, and you changed lon_0 and the lat parameters, then based on the definition of that projection there's some location with the least amount of distortion — typically around wherever lon_0 and your lat parameter are; that's your reference. There you should have roughly a one-to-one correspondence between the pixel resolution you define and actual distance on the earth. But as was mentioned in that book's definition of map projections, depending on which projection you're looking at, you're not always maintaining distance between the projection and the actual distance on the earth.
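For Mercator specifically, the mismatch between projection meters and ground meters has a simple closed form — on a sphere, the local scale factor is 1/cos(latitude). A quick sketch (spherical assumption again, not pulled from any library):

```python
import math

def mercator_scale(lat_deg):
    """Local scale of spherical Mercator: projected distance divided by
    true ground distance at that latitude. Exactly 1.0 at the equator."""
    return 1.0 / math.cos(math.radians(lat_deg))

k0 = mercator_scale(0.0)    # reference latitude: no distortion
k60 = mercator_scale(60.0)  # one map "meter" spans only half a ground meter
```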
Sometimes you're maintaining the angle, or the area of something, instead. So yes — it's complicated; some math happens, and then some things happen. Could you visualize the distortion? I'm sure you could, and I think Wikipedia actually does: searching Wikipedia for the Mercator projection shows its deformation, and I think all the Wikipedia pages for these projections have figures like this trying to show the amount of distortion. There are people in this room who know much more about this than I do, but as an explanation in general: this is a cylinder over the earth, and if you imagine spreading the globe out onto the cylinder — projecting onto the cylinder — you get much more distortion at the higher latitudes, because of how far those pixels have to spread. Understanding this requires a good amount of 3D visualization in your head, at least that's the way I understand it, but yes, there's a lot of math and complex stuff. How do I pick a projection? Short answer: I don't know in general. Normally whoever I'm working with tells me what projection they want, so I get around it by doing that. Otherwise, I have projections I typically use for different areas, because I've been told, or have had some level of experience, that there should be less distortion at that location. For mid-latitudes, Lambert Conformal Conic does pretty well; Mercator is pretty good for things at the equator, because the distortion is uniform there; anything near the poles, you might want to use a polar stereographic. I typically stick with those three projections. Everyone has their own little reason for choosing a projection, and people come up with new projections all the time because they think there's less
distortion. But we're getting to the point where — yeah, I am not an expert. For dynamic areas, the question is about introspection: how is it deciding what it does, and what are the options? It's definitely not the simplest thing to inspect, but there is some logic — for example, if we have a projection we know is very strange, completely different calculations are needed to estimate things. Dynamic areas are very much "give me your best guess"; I don't know that there's an overall goal in the algorithms involved, like what exactly is being minimized as far as distortion or size, and it also differs based on which parameters you provide and what you want out. Okay, let's take a 15-minute break and get back here around 10:20. Feel free to ask me questions — I'll probably be around. ... So, talking with Jeff — one of our helpers — during the break, he brought up a good idea for explaining this distortion and resolution business, and I can only think of doing it with hand gestures; I don't know if anybody would see it if I drew it on the board, and I'm not very good at drawing anyway. Think about the geostationary projection: we have our sphere, the earth, and we're putting a piece of graph paper up against it and projecting whatever is on the earth onto that paper. When we talk about area definition resolutions, we're talking about distance on that piece of paper, not distance on the earth. The distortion is the difference between a distance on that piece of paper and what it corresponds to when projected back onto the earth. So when we say this pixel's neighbor is 500 meters to the left, that's 500 meters on the piece of paper. And Jeff pointed out that this image I showed — if I can make it
bigger — so this is from Mat Gunshor at CIMSS, which is at the Space Science and Engineering Center at the University of Wisconsin. This shows the GOES ABI sectors — what the instrument sees — and the colors, if I'm understanding it correctly, are the approximate pixel area on the earth. Within this dark blue circle, the ratio is supposed to be pretty close to one-to-one, but as you move outward, the earth is curving away while we still represent it on a flat piece of paper, so there's more and more difference between what 1 kilometer on our paper represents on the earth — what the instrument is actually seeing — and you get those curving effects. So, we're going back to resampling, and we're going to talk about our first polar-orbiter satellite. The polar-orbiter instrument we're dealing with now is VIIRS. There is a VIIRS instrument on two satellites, Suomi NPP and NOAA-20, also called JPSS-1. Like I mentioned, these are polar-orbiting satellites: throughout the day they go from pole to pole, and as the earth rotates, the instrument ends up seeing the entire earth in strips of data. I had you download some VIIRS data from the NOAA-20 satellite that sees this same fire we've been looking at with the ABI data — that's what this Scene creation here is pointed at. We create the Scene and use it like we did with the ABI data: we ask what datasets are available, and we see a few things — a list of satellite and solar angles, longitude and latitude, and two datasets called I03 and I04. These are two of the bands of VIIRS; VIIRS has 22 bands. It has two main band resolutions: I-bands, the imagery-resolution bands, and M-bands, the moderate-resolution bands. But it follows the same kind of concept as ABI, where
each band represents — is measuring — some wavelength, and we get the same type of metadata with VIIRS data. With polar orbiters in general, you usually get more channels, and higher-resolution channels. So let's start by looking at the I04 band and some of its properties. We use the load method like we did with ABI and look at I04. We have a DataArray again, with y and x dimensions. One thing you should notice: this data is one small chunk, I believe, covering just about Texas, and it's already 7,600 by 6,400 pixels. That shows how much higher-resolution polar-orbiter data generally is compared to geostationary — again, the satellite is lower to the earth, so it's easier to get higher resolution without having to look as far. We have a wavelength associated with this band — it's a brightness-temperature band — a start time and end time, and an area that's slightly different from the areas we dealt with before; I'll explain that in a second. I-bands from VIIRS are 375 meters per pixel; M-bands are generally 750 meters; and there's another band called the DNB, or day/night band, which we won't be looking at today. Now, VIIRS being a polar orbiter, and the way it actually operates, the data is not gridded: the pixels are not all the same distance apart, so we have to have a latitude and longitude point for each pixel to know exactly where it is. It's still the same idea — we have a coordinate for the center of each pixel, and we know generally what size it should be — but we're going to simplify and say that the resolution right under the instrument, where the satellite is passing over, is one number, and just pretend it's the same resolution throughout. That's not true, but we're generally going to resample the data so that it
doesn't matter, and we'll see why in a second. Let's look at the data, and I can explain this more. The "area" for the VIIRS data is a SwathDefinition. It's just like the area definition as far as its purpose — it defines the geolocation of the data — but because of the properties of the instrument, it holds on to full longitude and latitude arrays. If we call get_lonlats, we get dask arrays the full size of the original image, and we could use them like we would any longitude/latitude data. If we plot this data — plt.figure creates a new figure, then we use the xarray utilities; this is going to take a little bit — we get something, but it doesn't look very good. Two things you should notice. One, our data is flipped horizontally: Baja California is on the right side of the image and Texas is on the left. That's just coming from how the data is stored in the file and how the satellite is orbiting the earth — whether it's ascending or descending; it's just what the satellite is seeing. Two, the white pieces on the sides — I'll try zooming in and see if this makes a little sense. The way the VIIRS instrument works is by scanning the earth: as the satellite orbits, it does one scan of some number of rows, then another scan, still orbiting, then another, and you end up with a little bit of overlap on the edges of each scan, just because the earth is round, not flat. As you look away from nadir, each pixel you record sees more and more of the earth, because of the curvature and the way the instrument scans, so for each scan you end up overlapping just a little bit. These white parts of the plot are called bow-tie deletion, or the bow-tie
effect — because you can imagine the expanding scan looks a bit like a bow tie. It's a processing-software feature, I guess: those duplicated pixels aren't needed, so they remove them before sending the data down to earth from the satellite — they don't have to send as much, because they know it's duplicate. It's a compression and transmission-saving thing, but you get this effect. So, looking at the whole image again: we have this bow-tie effect on the edges, which isn't what we want to look at; we have latitude/longitude points; and our raw array does not look like what we expect the earth to look like. The way we resolve that is by resampling. We're going to use the my_area we defined before — mine is an LCC projection, as we see here, 1,000 by 1,000, with some extents; it should be the same for everybody. We call resample the same way we did with the geostationary satellite, but now on a polar scene, and we plot. This is going to take a little while because — to be technically correct — we're going through each output pixel and asking which input pixel is closest to it. This is one of the few places where the speed of your machine actually matters; hopefully this finishes. While it runs: nearest neighbor here uses a KD-tree. A KD-tree is a structure that lets you query — at least the way we're using it — "I have this coordinate; what is the nearest coordinate in your tree to that location?" It looks through the tree using standard search-tree logic and gives you some value back. One bottleneck is that we have to create that KD-tree, and we can't use dask's ability to do things in parallel for that, at least with the way the algorithm is written right now.
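The KD-tree query step can be sketched with SciPy's cKDTree — this is not pyresample's actual code, just the same idea, including the "how far to look" radius (pyresample calls it radius_of_influence) whose default caused the gaps in the LCC plot earlier. All coordinates and values here are made up:

```python
import numpy as np
from scipy.spatial import cKDTree

# Input swath pixel centers in projection meters, with their data values.
in_xy = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0]])
in_vals = np.array([10.0, 20.0, 30.0])

tree = cKDTree(in_xy)  # built once, serially, from every input pixel

# Output grid pixel centers; the last one is far from any input pixel.
out_xy = np.array([[10.0, 10.0], [990.0, 5.0], [50000.0, 50000.0]])

# Cap the search radius; misses come back with index == len(in_xy).
dist, idx = tree.query(out_xy, distance_upper_bound=2000.0)
valid = idx < len(in_xy)
out_vals = np.where(valid, in_vals[np.minimum(idx, len(in_vals) - 1)], np.nan)
```

The third output pixel has no input pixel within the radius and comes back as NaN — the same mechanism that produced those separated, speckled regions in the reprojected plot.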
We have to build that KD-tree out of every input pixel — we make no assumptions about where input pixels are relative to each other — so we build this huge KD-tree, and then we can fairly quickly query it to find the output pixels. Up to that point, dask is doing things in parallel; we get to resampling and say, nope, everything in one thread, all in memory, right now; then we can split things up and do calculations in parallel again. But yes, we have to hold that whole thing in memory. There are pull requests sitting right now for pyresample with algorithms that do this better — things like knowing that a pixel in the input data is geographically right next to its neighbor in the array, so we don't have to look at everything. The other thought we had was a KD-tree of KD-trees, so we can build those sub-trees in parallel and know later whether or not we need to use them. Thank you, Matt. So, that finished: we have our x/y meters, as we did with the geostationary data. One thing that's hopefully obvious is all of the white space in this image. This is the same my_area we used before with the ABI data, but we have a lot of missing space — there's no data there. In fact, this is the only data I could find for the VIIRS instrument that covered this Texas fire during its duration, for hours either way. This shows one of the major downsides of polar-orbiter data: how long it takes to get another pass of the satellite over the same area. ABI, as we talked about, has sectors it's viewing every minute — taking data every minute. Here I have to wait hours to get another pass, and even though there are two VIIRS instruments in orbit, one right behind the other, this was the only data I could get that had any bit of the fire in it. We can actually zoom in —
there's a little yellow pixel right here — there's the fire in Texas. So, to explain more about these pros and cons: with geostationary we have high temporal resolution — data every 1, 5, or 10 minutes with ABI — but generally lower spatial resolution. The highest resolution for ABI is 500 meters, and that's one channel, while VIIRS has five channels at 375 meters. Another current con of geostationary satellites is that they generally have fewer channels — fewer wavelengths they're looking at. With that temporal resolution, though, you can very easily watch atmospheric and meteorological events as they evolve over time. With VIIRS and polar-orbiter data, you have higher spatial resolution — the satellite is closer to the earth, so it's easier to see detail — you generally have more bands, and you get full coverage of the earth, whereas with geostationary you're only seeing one part of the earth, because the satellite is not moving. But it takes more time to get that full coverage, or to get another image of the same chunk of the earth. So that is the quick-and-dirty introduction to polar-orbiter VIIRS data — any questions on that so far? Okay. So now, something you might want to do — and this really depends on your area of science and what you're trying to do with this data — is a very simple proof of concept: we have ABI data showing something; we have VIIRS data showing something; how do we compare the results from these different instruments? Maybe we want to make sure they're seeing the same thing; maybe one has a wavelength the other does not. But the way we have our data as we start, the ABI data is on the geostationary projection and the VIIRS data is in lat/lon coordinates — not technically on a projection. So how can we compare these? Well, this is the resampling lesson, so we're going to resample. We're going to create two separate
scenes, one for ABI, so we're going to call that the ABI scene, and we're going to load channel 5 and channel 7. One thing I should point out here, if I'm remembering correctly, this ABI data is from the mesoscale sector, so those are those one-minute sectors. That was one of the reasons I chose this fire: one of the meso sectors for ABI was over the fire. I didn't mention that they can move the meso sectors for ABI, so if there's a hurricane, we can look at the hurricane and get one-minute data; there's a fire in Texas, we're going to move it and watch Texas. So this is mesoscale data; before, we were looking at CONUS, continental US data, so that was a larger area, and we're now looking at a smaller set of data. Next we're going to load a VIIRS scene, using the VIIRS SDR reader, I believe, just based on my file pattern. I'm loading slightly less data, but we're going to load I03 and I04. Let's print out some information from these: the start time and the size of these arrays. The ABI data that I have that's closest to the VIIRS data is 8 seconds different from the VIIRS data as far as start time. You can see that the ABI data, even though it generally covers the same geographic area, has many fewer pixels. Let's see, what else? These bands do line up, do I have this somewhere: channel 5 from ABI and I03 from VIIRS are the same wavelength, and channel 7 from ABI is about the same wavelength as I04 from VIIRS, so we're comparing two pairs of nearly the same wavelengths. I'll move to this big chunk of code. What we're doing here, and actually let's start this, run this and I'll explain: we're using dask's progress bar again just to see where we are in processing. In the first line inside this progress bar we're creating a new scene: we're taking the ABI scene and resampling it with the native resampler. We want to do this because channel 5 and channel 7 are different resolutions.
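The two-scene setup described here can be sketched as follows. The file globs are hypothetical stand-ins for wherever the tutorial data lives, and the block only touches satpy when matching files are found, so treat it as an illustration rather than the notebook's exact code.

```python
# Sketch of creating the ABI and VIIRS scenes; paths/patterns are hypothetical.
from glob import glob

abi_files = glob("data/abi_l1b/*RadM1*.nc")         # hypothetical meso files
viirs_files = glob("data/viirs_sdr/*SVI0[34]*.h5")  # hypothetical I03/I04 SDRs

if abi_files and viirs_files:  # only run the satpy calls when data is present
    from satpy import Scene

    abi_scn = Scene(reader="abi_l1b", filenames=abi_files)
    abi_scn.load(["C05", "C07"])
    viirs_scn = Scene(reader="viirs_sdr", filenames=viirs_files)
    viirs_scn.load(["I03", "I04"])

    # Same wavelengths, very different grids:
    print(abi_scn["C05"].attrs["start_time"], abi_scn["C05"].shape)
    print(viirs_scn["I03"].attrs["start_time"], viirs_scn["I03"].shape)
else:
    print("tutorial data not found; skipping")
```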
We want to make them the same resolution so that we're comparing the same thing and the numpy arrays are the same shape. We'll skip down to the VIIRS part for a second: for VIIRS we're going to resample, and I'm going to pass in this new area definition that we haven't used yet. We take the maximum-resolution area from the ABI scene and resample VIIRS to that projection, so we're taking VIIRS lat/lon data and resampling it to the ABI geostationary projection. By doing that we're putting them on the same projection, same resolution, same array size, and so now they're comparable. The other key part of this is that I'm using this fancy dask persist function and giving it these data arrays. What this is doing: we talked about how dask builds up these tasks that it performs. If we were to plot the results of this resampling, it would have to recompute everything up to this point every time we replotted, every time we wanted some results. This is a long resampling process; we can see that resampling the VIIRS took 24 seconds. Persist is saying: do all of the computations up to this point and then hold them in memory; still treat it like a dask array, but compute everything up to here. So now everything we do in the future with this data starts from that checkpoint that we made, rather than having to recompute everything and load things from disk. This will save us in future operations, so it doesn't have to take 24, 25 seconds every time we want to do something. Now that we've done this resampling, let's use those new scenes we got: we'll access those channels and plot them using the same xarray plotting utilities that we had. I get channel 5, that looks like this, okay, and plot I03. You can see how fast these were; before, when I was plotting VIIRS data, it took quite a while.
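The persist idea generalizes beyond satpy. Here is a minimal sketch using a plain dask array standing in for the resampled scene; the shapes and arithmetic are made up purely for illustration.

```python
import dask.array as da

# A lazy array and a stand-in for a costly graph (e.g. the resampling above).
arr = da.random.random((1000, 1000), chunks=(250, 250))
expensive = (arr * 2 + 1).mean(axis=0)

# persist() computes the graph once and keeps the result chunks in memory.
# It is still a dask array, so later work builds on this checkpoint instead
# of re-running everything (reading files, resampling) from the start.
checkpoint = expensive.persist()

total = float(checkpoint.sum().compute())  # starts from the checkpoint
peak = float(checkpoint.max().compute())   # so does this
```

The difference from `.compute()` is that `persist()` keeps the laziness: you still get a dask array for further chained operations, just with its inputs already evaluated.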
That's because we persisted the data into memory. So, some things to point out, I might have to make this a smaller resolution first. Okay, you can see the colors are a little different, and you can see that the VIIRS data doesn't fully cover the area that we've defined, so it doesn't have the same coverage as the ABI data; but you should see that the shades of the colors are slightly different. So let's say we wanted to take a difference of those channels. We know they're the same resolution now, so we should be able to do that. Let's take 10 minutes and use what you know about matplotlib, about plotting these data arrays, and about treating data arrays like numpy arrays and doing basic arithmetic, and see if you can create a plot for each of the differences here. It's less scientifically useful, I was told by one of the scientists at my work, but if you finish quickly you could also do the average of two of the bands. The other thing you could try, if you're feeling adventurous: instead of resampling the VIIRS data to the ABI projection, you could try reprojecting to a new area definition that you define with a higher resolution. Right now we've kind of negated all of the high-resolution-ness of the VIIRS data because we've put it at a lower resolution. So anyway, take 10 minutes, see if you can create some plots; there are solutions there, just don't cheat, and we'll talk about it in 10 minutes. ... I'm going to bring us back a little early, I'm a little worried about time. You should get an image like this for your first plot, channel 5 minus I03, if that's the way you chose to do that difference. I set this color map here, this red-blue one, just to show the difference, and I limited the color bar from negative 25 to 25 Kelvin because these are brightness temperature bands, so they're in Kelvin. You would expect things that are at the same wavelength to have generally no differences, but we see quite a few here; a lot of them are on the edges of clouds.
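The difference-plot exercise can be sketched with synthetic arrays standing in for the two resampled channels; in the lesson the real values come out of the resampled scenes, but the plotting pattern is the same.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Synthetic brightness temperatures (Kelvin) standing in for the two
# matched channels after resampling to the same grid.
rng = np.random.default_rng(0)
abi = 280.0 + 10.0 * rng.standard_normal((500, 500))
viirs = abi + 5.0 * rng.standard_normal((500, 500))  # same scene, small differences

diff = abi - viirs  # data arrays support plain numpy-style arithmetic

fig, ax = plt.subplots()
im = ax.imshow(diff, cmap="RdBu_r", vmin=-25, vmax=25)  # +/-25 K, as in the lesson
fig.colorbar(im, ax=ax, label="Brightness temperature difference (K)")
fig.savefig("difference.png")
```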
Especially if you look at places like where this fire is, you have this big dark shadow. One reason for these differences is that the data was not recorded at exactly the same time, so you have slight shifts just from things moving and the satellites moving. The other main reason is that these satellites are not viewing the Earth from the same spot, not just the height of their orbit, but one is over here and one is over here, so you see different sides of shadows and clouds; you just see things from a different angle, so there are differences. To summarize what we did here: we took data from two different instruments that were on completely different projections, or not even on projections, we resampled them, and we compared them fairly quickly, in what, 5 to 10 lines of code. This opens the door for us to compare satellites, or compare instruments from different satellites, or compare instruments on the same satellite. Right now we've been looking at VIIRS, but there are other instruments on the satellite that VIIRS is on, and they might record data in different ways, different resolutions, different locations, different scanning patterns; sometimes you get some weird ones where they spin halfway through their scanning pattern. By resampling we're able to put them all on the same domain so that we can work with them together. That's that for that part of resampling. Because I'm worried about time, I'm going to go pretty quick through this next part. Let's say you wanted to cut out a specific chunk of your data without having to do this complex resampling step, or what if you didn't need to do all the processing of every pixel and just want a general idea of what the image looks like? SatPy comes with some methods for helping with that, and you can act on the scene as a whole. So we're going to create an ABI scene again. There is a cropping method on the scene: we can specify a bounding box, just like we've been doing with the area extents, in degrees; we can also
specify x/y if you want to do that, or you can provide an area and say: match this area. If we do this, it quickly returns a result and we have a much smaller data array, so we can plot it as we would, and we see we have a much smaller section. I hard-coded this so that it shows the fire. This didn't have to do any of the resampling steps, it didn't have to do nearest neighbor at all, it didn't even have to replicate pixels necessarily; it just cut out the section of the data that we wanted, based on the geolocation. The other method similar to this, if we don't want to look at every pixel and just want a general idea, is that we can aggregate data. By default this will average data by dimension, so we can say: ABI scene, aggregate every 10 pixels in the y dimension and every 10 pixels in the x dimension, and return me a new scene with that average. If we do that we get 100 by 100; our original was 1000 by 1000. If we plot this we can see it's much lower resolution, a lot of jaggedy edges here. One interesting thing, I believe we're looking at channel 5: you don't really see anything from the fire, and that's because it's averaging every 10 pixels. If we want to take the maximum instead, we can provide a different function for it to use underneath; the aggregate function is using xarray's new coarsen function, new-ish. Here we did the maximum instead of the mean, so now we can see those bright fire pixels from that fire in Texas. Before we move on to the next lesson, any questions on resampling? ... So the question is: do we normalize based on the size of the pixel, for example as you go higher in latitude your pixels might stretch or represent more area? We do not; I did not write this function, but it's using xarray's coarsen, which I believe does everything just on number of pixels, so if your data is not already normalized, it does not do that. ... These are custom algorithms written in, and available through, the pyresample library, which is part of Pytroll.
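Since the aggregate method is built on xarray's `coarsen`, the mean-versus-max behavior described here can be seen with a tiny stand-alone example (the array is synthetic):

```python
import numpy as np
import xarray as xr

data = xr.DataArray(np.arange(100.0).reshape(10, 10), dims=("y", "x"))

# Average every 5 pixels in each dimension (the aggregate default)...
averaged = data.coarsen(y=5, x=5).mean()
# ...or keep the maximum, so small bright features (like fire pixels)
# survive the downsampling instead of being averaged away.
maxed = data.coarsen(y=5, x=5).max()

print(averaged.shape, maxed.shape)  # both (2, 2)
```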
Back when these were written, access to the GDAL resampling algorithms was not easy from Python, and GDAL also, back when we started, had trouble dealing with non-gridded data, dealing with things like those VIIRS swaths; it didn't really handle that they weren't uniform distances. So yeah, they're all custom in pyresample right now, but the idea is to expand and use GDAL's interfaces when we can. Okay, let's close out resampling. Oh, during the exercise I was looking up other projections on Wikipedia, so you have some strange ones, just whatever you want to do. Okay, so I'm going to kill the resampling lesson and load lesson 5, composites. Let's see, just to remind myself how long this is... okay, let's see how fast we can get through this. We're now going to work on another common step when working with satellite data, which is to make RGB images, or combine bands in different ways to make something new. We kind of talked about how you might combine different wavelength bands to figure out what type of clouds there are; a simpler form of doing something like that is making an RGB image. RGB images have three values per pixel, red, green, and blue, representing how much of that particular color should be in that pixel, and when you do that you get a nice color image out. What we can do is set all of the red values to a particular satellite channel, all the green values to another channel, and all the blue to another, and by doing this, particular features that a wavelength picks up end up being a certain color in your output image. So let's look at a couple of these. We're going to look at the ABI CONUS data again, same setup as always, except we're going to end this cell by asking the scene what available composites there are. I get this nice, long-ish list of various composites and variations on them. These are all built into SatPy, and they're things that either the contributors or the developers
have been playing with, or they're semi-standard from some organization, some space organization, who might publish them in their papers and say: we have this new satellite, here's how we combine these bands to make an RGB, and here are the results. So they'll give this recipe: we took this channel and set it to red, we took this one and set it to green, and this is what we got. These RGBs are used by forecasters a lot because they give you a quick way to see, I guess, the state of the atmosphere; if you're trained at looking at these RGBs you can say, oh, I know that yellow is going to mean this in this particular RGB. We'll see an example of this by looking at the airmass RGB. The airmass RGB, and this is going to get confusing because I'm going to accidentally say channel when I'm referring to the red component of the image, so I'll try to keep it straight: the red component of the image will be the channel 8 minus channel 10 data, the green component will be channel 12 minus channel 13, and blue will be channel 8 by itself. EUMETSAT, that's the European organization, their training material talks about airmass and how it involves water vapor channels, and says the main applications are the detection of dynamic processes such as rapid cyclogenesis, jet streams, and potential vorticity anomalies. I don't know what that means, I'm not a scientist, but it shows stuff; that's as far as I'm going to go with that. We can take these composite names that are pre-configured in SatPy and ask for them in the same load method we've been loading our channels with. If we run this load method, we can see it returns relatively instantly and we have a data array as we had before. The main difference here is that we now have a bands dimension, and it's got a size of three. If we look at the coordinates, we can go to the next cell and print them out; we can do .coords, which gives us access to those coordinates, and you can print out bands.
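The airmass recipe described above is just band arithmetic plus a stack. Here is a sketch with synthetic arrays in place of the real ABI channels; with satpy, the inputs would come from the loaded scene instead.

```python
import numpy as np

# Synthetic stand-ins for four ABI brightness-temperature channels;
# real values would come from scn["C08"], scn["C10"], etc.
rng = np.random.default_rng(1)
c08, c10, c12, c13 = (230.0 + 20.0 * rng.random((200, 200)) for _ in range(4))

red = c08 - c10      # channel 8 minus channel 10
green = c12 - c13    # channel 12 minus channel 13
blue = c08           # channel 8 by itself

# Stack along a leading "bands" dimension, matching the (bands, y, x)
# layout of the data array that loading the built-in composite produces.
airmass = np.stack([red, green, blue])
print(airmass.shape)  # (3, 200, 200)
```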
Bands itself is a data array with values R, G, and B, so this index in this array represents the red component of the image, and so on. We can save this just like we did in the writing lesson; it's treated just like any other data array, so I can write it as a PNG. We'll see that it takes a lot longer than before, well, relatively a lot longer. I can go back to my file browser and find an airmass image, and this is what airmass looks like. This is a lot more interesting than looking at, well, depending on who you are, it's a lot more interesting just because it's more colorful. But this isn't a color map; this isn't viridis or whatever matplotlib color map. These colors are determined by the data from three separate bands, well, in this case five separate bands, and so you can imagine forecasters being trained on this and saying: okay, I know that red represents this state of the atmosphere, green is this, and so on. Moving on, there's a little bit of extra added complexity when trying to plot this stuff with xarray, because what we try to do in SatPy for these composites, for these RGBs, is keep the actual temperature differences: for that red component we want the actual temperature differences. But when we save it to an image, there are typically color limits, so they say everything below this difference should be clipped, everything above this difference should be clipped; that goes into that enhancing step I talked about, converting data values to image values, those 8-bit unsigned integers. We're going to use a new function in this step to do that normalization, that limiting, for us, called get_enhanced_image. If we call it with a particular data array, it will go through those built-in enhancements that SatPy has for doing things like this and give us an image object back. The image has a data attribute, we're just going to kind of skip over that for now, but it gives us a data array back, and we can
use the same xarray plotting utilities. So, using that extra little step of get_enhanced_image and doing .data, we can tell xarray here that we're making an RGB and that the bands dimension is where those bands are located. We tell it that the data has been normalized, so it shouldn't have to do anything with values less than 0 or greater than 1; if we didn't do that, it would take the min and max of each channel, I believe, or of the whole thing. Anyway, point is, we get this airmass RGB. And we got lucky here, and by lucky I mean I wrote it this way: airmass combines bands of all the same resolution, so when we asked it to make airmass it could do all those calculations, they're all the same resolution and all the same shape, just do the subtractions and boom, we have an RGB. If we are compositing things of different resolutions, the only way we know to do that is by resampling. So let's try to load a different composite now, the natural_color composite. It's made up of channel 5, channel 3, and channel 2, and channel 2 is at a different resolution than the other channels. EUMETSAT's training materials say snow on the ground, as well as ice over mountains, frozen lakes, and sea ice, appear cyan in the natural color RGB images; so that's what people might use this RGB to look for. We'll use the scene's load method again, ask for natural_color, and we get this warning: it says the following datasets were not created and require resampling to be generated, and there's this big long thing about natural_color. This is telling us what I already told you, which is that natural_color combines different resolutions, and it's not going to guess at how those should be combined. So if we actually check, using that in syntax, natural_color isn't actually in the scene even though we asked for it to be loaded. We can use the missing_datasets attribute, or property, and it tells us that natural_color is missing. So if we wanted to generate
this RGB, we'd have to resample to make things the same size. A simple way of doing that is using the native resampler; here, instead of using the default of the maximum-resolution area, we're going to use the min area, and we're going to call resample, and then we'll see what natural_color gives us. Now we actually have our data array. So: we asked for natural_color, we were warned it couldn't be generated, we checked the properties and it said yeah, it wasn't generated, it's not in the scene yet; we resampled, and SatPy's scene said, oh, I know you wished to have this RGB before and I resampled, let's see if I can make it now. It successfully did, and now we have that natural_color data array in our scene. Same kind of idea as with the airmass: we have a bands dimension set to RGB, we have some attributes filled in, taken from the channels that were involved, but also some extra things added by the compositing step. So let's use that same matplotlib functionality: get_enhanced_image, getting the data, telling it that we're making an RGB over the bands dimension. This will take a little bit longer; I made it so there's a progress bar at the bottom, maybe I should have started running this before I talked. This is going through loading the data, doing that native resampling, doing the differences, combining them, and we get this image. Cyan is supposed to be ice and all that kind of stuff, and that's what we see. So this is called natural color, and this recipe comes from the EUMETSAT organization. Now, so far we've been working with RGBs that are built into SatPy. A lot of these are standardized by some organization like EUMETSAT or NOAA, and they're the kind of RGBs that they make all the time and give to people; that's why they're built into SatPy, because they need to be generated all the time. But let's say we are making up our own RGB. This is actually an area of research for some people.
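The load-warn-resample-generate flow for natural_color can be sketched like this. The file pattern is a hypothetical stand-in, and `min_area()` is the method name of this era of satpy (newer releases renamed it), so check against your installed version; the block only runs the satpy calls when data is found.

```python
# Sketch of generating a mixed-resolution composite; paths are hypothetical.
from glob import glob

filenames = glob("data/abi_l1b/*RadC*.nc")  # hypothetical CONUS files

if filenames:  # only run the satpy calls when the data is present
    from satpy import Scene

    scn = Scene(reader="abi_l1b", filenames=filenames)
    scn.load(["natural_color"])      # warns: mixed resolutions, not generated
    print(scn.missing_datasets)      # natural_color shows up here
    new_scn = scn.resample(scn.min_area(), resampler="native")
    print(new_scn["natural_color"].dims)  # now generated, with a bands dim
else:
    print("tutorial data not found; skipping")
```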
Figuring out what RGB looks the best, what bands you can combine to see different information just from the basic bands that we're loading from the data files: we're going to do that now. We're going to use the same CONUS image, but we're just going to start from scratch with the scene. If we list out the available dataset names, we see we have 16 channels as we did before, and now we're going to load three channels that we plan to use in an RGB. This is up to you, choose whichever three you want; you might want to choose at least one of the reflectance bands, I think that's channels 1 through 5, as at least one color, just to give some differences to it. I'm going to do channel 7, channel 8, and then channel 1, and I'm going to load all three of those channels. Then, to do the compositing in a way where we're keeping the attributes that we want, the shared attributes, keeping the dimensions the way we want, and also doing certain checks like making sure they're the same resolution before we combine them, we're going to use a built-in base compositor class called GenericCompositor from SatPy. We import it, we create it, and we give it a name; we're just going to call it my_rgb, and I suggest you keep that name because we'll use it later. Then we call the compositor. What this compositor is going to return is a new data array: we give it data arrays, it does the math and the combinations and the aggregating that it needs to do, and returns a new data array. If we run this, let's see what we get: a big, long, ugly exception in the Jupyter notebook, but the important part to see is this part that says IncompatibleAreas. This is SatPy telling us the resolutions are different, at least for mine; you may not have gotten this depending on what bands you chose, and that's completely fine, you can still follow along, nothing will be wrong. But this is SatPy's way of saying, hey, these are not the same resolution, we need to do something about it. So, just as before, we're
going to resample, using the resample method again with the native resampler. Then I call the compositor object that I created before with the same bands, but notice that here I'm using the resampled scene, this new scene, and we'll print out my_rgb. Just like all the other RGBs, we have a bands dimension of size 3, R, G, B, and the coordinates, and we have some attributes that the compositor figured out. Now we can use the matplotlib stuff we've been going over, saying it's an RGB over bands, and we get something out. One issue with this is that by default SatPy said: I don't know what my_rgb is, I don't know what EUMETSAT or whoever has defined as good limits for these values, I don't know how to make this good, quote-unquote, as defined by those official recipes. So it took the minimum and the maximum of every component, red, green, and blue, and stretched the data linearly to fit that full range in the output image. So you can end up with very bright values and very dark values; mine doesn't actually look that bad, but we might want to control that. Since we're doing everything manually, what we can do over the next, let's say, five-ish to ten-ish minutes, is take these calculations, rebuild our composite, and replot it. Note that the numbers here are Kelvin in the first two, for the red and green; the bottom one is the reflectances. What we're doing here is a linear stretch between two values: we have a minimum of 220, and I use that twice in this equation, and a maximum of 260. So take the next five to ten minutes, play with these values, and see if you can get a plot that looks kind of okay. Make sure you know whether the band you chose is a brightness temperature or a reflectance; otherwise it should be kind of easy to tell once you plot it with the wrong values. So let's do that for, let's say, five to ten minutes, and see how people are doing. Let's bring it back.
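The linear stretch used in this exercise can be written as a small helper; the limits below are the ones mentioned in the lesson, and you would pick different ones per band depending on whether it is a brightness temperature or a reflectance.

```python
import numpy as np

def linear_stretch(band, vmin, vmax):
    """Map [vmin, vmax] linearly onto [0, 1], clipping values outside."""
    return np.clip((band - vmin) / (vmax - vmin), 0.0, 1.0)

# Brightness temperatures in Kelvin, stretched with the lesson's 220-260 limits.
bt = np.array([200.0, 220.0, 240.0, 260.0, 300.0])
stretched = linear_stretch(bt, 220.0, 260.0)
print(stretched)  # [0.  0.  0.5 1.  1. ]
```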
So this is what I came up with; I think I had channels 7, 8, and 1, came up with some limits, and there are some colors on it, I'm not going to look up what they actually mean. So far we had to do all of these steps manually: we had to import that compositor, we had to load the bands we wanted to use, we had to set the limits, we had a compositing step, and we had that enhancing, scaling, stretching step. We might want to reuse this a lot, or share it with somebody. SatPy allows you to define YAML configuration files so that you can reuse this stuff, or provide it to SatPy in a pull request as a composite you think other people should have access to. By default it will look at what's built into the SatPy library; there are also ways to point it to a particular directory, or it will look in your current directory for a certain directory structure to find these YAML files. So what we're going to do now is edit these YAML files and configure our composite in them. Down here below the plot that you made, there's a section with YAML text; it starts with composites:. Highlight that and copy it, and we're then going to go to our notebooks directory and edit a text file, a YAML file, which we can do through Jupyter notebook. So I go to notebooks, and in my notebooks directory inside composites there's an abi.yaml; I click that and it gives me this YAML text document. I paste the highlighted text in here, and what we're doing is defining the recipe of what bands go into this composite. We used the generic compositor, so we're taking one channel and assigning it to each component of the image. What you should do now is take this text and edit each of these channel names to whatever channels you used; I used channel 7 here, channel 8 here, and channel 1 here, and that's in order R, G, and B. So we can
specify that in the YAML, and then you can save it, File, Save; I'll wait a little bit to make sure everybody's got it. When you're done with that part, if we go back to the lesson, below that text there was also an enhancements section, starting with enhancements:. I'm going to copy that just like before, but instead of going into composites/abi.yaml, we're going to go to notebooks/enhancements/abi.yaml and paste that text in there, if I paste correctly. Here's where we put in those stretch limits that we came up with. So I'm going to edit our min; I don't actually remember what I did for this, so I'm going to go back... 220 to 300 I did, so 220 here, 300 here, and so on. I'll let everybody fill those in and then also save that file. So we have created these YAML files that represent the RGB recipe we've come up with over the last 10 to 15 minutes: specifying what channels go in, that's our composite recipe, and then specifying what the limits are. These YAML files essentially point to Python functions or Python classes that should be called, and give them parameters so they know how to operate. If you looked at the source code for SatPy, in the repository there are configuration files like this for every instrument, and some composites are defined for all instruments that have certain wavelengths, so that we can reuse these recipes. Right now we listed every channel by name; we could have also listed wavelengths, just a different way to define these recipes so that we can reuse them later and don't have to type and copy this code everywhere. Because we are in the notebooks directory, and the composites and enhancements directories are in there, SatPy will automatically find those and use them. So hopefully everybody has those defined, has those limits in there, and we can go back to our lesson, create a new scene, and run that available composite names method again.
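The two pasted snippets end up looking roughly like this. The channel names and stretch limits are the ones I used above, and the keys mirror satpy's built-in YAML files from this era, so double-check against a real abi.yaml in the satpy repository if you adapt it:

```yaml
# composites/abi.yaml
composites:
  my_rgb:
    compositor: !!python/name:satpy.composites.GenericCompositor
    prerequisites:
      - name: C07   # red
      - name: C08   # green
      - name: C01   # blue

# enhancements/abi.yaml
enhancements:
  my_rgb:
    name: my_rgb
    operations:
      - name: stretch
        method: !!python/name:satpy.enhancements.stretch
        kwargs:
          stretch: crude
          min_stretch: [220, 220, 0]     # per-component lower limits
          max_stretch: [300, 260, 100]   # per-component upper limits
```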
Everybody should see my_rgb in that list now. All right, that tells us we added it correctly to the composites configuration. Now we can use it just like every other composite and pass it to load. Mine involves multiple resolutions, so I get that warning about there being different resolutions; I do the native resampler, and when I do that I now have my data array with the bands dimension and everything else set. I can now skip that process of specifying how to stretch things: I can do get_enhanced_image, which looks into that enhancement configuration, does the stretching the way I specified, that linear transformation, and I get the image I had before. So what we've done is gone from using built-in recipes defined by somebody else, to coming up with our own recipes, playing with the limits, seeing the changes interactively, and then storing them in a configuration file so that we can reuse them later; in the next lesson we will reuse them again. Are there any questions on composites before I move on? No? Okay. ... So the question is how does it know where to find the composite file. By default it will take everything that's built into SatPy and then update that set of configuration with whatever is specific to your setup. There's an extra setting that I didn't mention here to specify where your configurations are; maybe you have a certain directory on your system where you like to store configurations, so you can point it there. It will also look in your current directory for a composites directory that has YAML files in it. Because our data is ABI, because the scene we created was for the ABI sensor, the ABI instrument, it knew to look for the abi.yaml file. There's also this little thing at the top, a sensor name, visir/abi is what that's supposed to say; this is telling SatPy that we're using an ABI
instrument, so here are all the composites for the ABI instrument; but ABI is also a visir instrument, it's got visible channels and IR channels, so it will also load all of the composites for visir instruments, any generic composite that can apply here, and those are also built into SatPy. Okay, I'm going to close these lessons, close the YAML files, go back to the notebooks directory, shut down the notebooks I had open, and load lesson 6, animations. So far we've been dealing with single time steps. We have GOES data, and I've been mentioning all this stuff about temporal resolution, how it takes images all the time, rapidly, but we haven't actually used them; now let's use them. We want to see things evolve over time, that is the main benefit of doing stuff like this. As part of the data you downloaded, you have mesoscale data, that's the one-minute-resolution data, and we're going to look at an hour's worth of it. If we run this first cell, we're using glob and saying: give us all the files that have this date-time in them, and we list the file names; there are 960 files. There are 16 files per time step, because there's one file per channel, that's just how ABI is structured. If you wanted to create a scene for each one of these, you'd have to find a pattern in the file names to sort through them, do replacing, and make sure you organize them all correctly. SatPy comes with some utilities to do that for you, and the way you access them is through the MultiScene object. The MultiScene can be treated very similarly to a Scene object; it just operates on multiple of them. So we import the MultiScene and use this class method called from_files, and we say: here are all of our file names, here's the reader that I know can read these files; sort them and give me a MultiScene object. If we run this, it takes a couple seconds, and we get a
MultiScene object back. The MultiScene is semi, I think in the documentation I still have it marked as, experimental, because it doesn't do everything I want it to do yet. But it operates similarly to how dask does things, lazily: it might not create each of those scenes right away, it will create them as it needs to process them. This saves you processing time, memory usage, and the number of open files it has to have, while still giving you an interface very similar to the Scene. We have a load method, so we can load the my_rgb that you created in the last lesson. We can resample, which returns a new MultiScene; we'll do native resampling again, and I should have run this before I started talking because it will take a while, so run that last save_animation cell. save_animation is a new function that's only available on the MultiScene. What it's doing is using a library called imageio underneath, and underneath that, the ffmpeg tool; it's generating the images for you, sending them to ffmpeg, and creating an MPEG-4 video. Through these keyword arguments we said we want 12 frames per second, we gave it a file name, and it did all the time formatting. What's happening underneath is the MultiScene stepping through each scene it created: it grouped those files and said okay, these 16 files go with this time step, let's load that channel, do the native resampling, and now generate an image and write it to an MPEG-4 video. Hopefully this doesn't take too much longer... what else is it doing? It's trying to load things as it needs them. Oh, it finished; on my computer it took one minute five seconds. We loaded that RGB, so it's looking at those recipes again. If I navigate to my notebooks directory in my file browser, you should see a my_rgb mp4 file, and I can open that in QuickTime, for me at least.
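The MultiScene animation workflow just described can be sketched as below. The file pattern is a hypothetical stand-in, the whole thing needs satpy plus imageio and ffmpeg installed, and the satpy calls only run when matching data is found.

```python
# Sketch of the MultiScene animation workflow; paths are hypothetical.
from glob import glob

filenames = glob("data/abi_l1b/*RadM1*.nc")  # hypothetical meso file pattern

if filenames:  # only run when the tutorial data is actually present
    from satpy import MultiScene

    mscn = MultiScene.from_files(filenames, reader="abi_l1b")
    mscn.load(["my_rgb"])                       # the composite from the YAML
    new_mscn = mscn.resample(resampler="native")
    # imageio + ffmpeg under the hood; {start_time} is filled in per scene
    new_mscn.save_animation("my_rgb_{start_time:%Y%m%d_%H%M%S}.mp4", fps=12)
else:
    print("tutorial data not found; skipping")
```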
And now I have an animation that I made of my RGB over time. Down by the player controls, let's see if I can make this loop... no loop option, OK. When the controls go away, the fire is right underneath them, so we can see as the smoke increases, and we can analyze this, looking at the different colors and seeing how things evolve, using pre-configured RGB recipes, in what, five lines of code or something like that. The other option we could have gone with here: the MultiScene has a save_datasets method, so we could have saved all of these images to disk and done our own complex set of ffmpeg operations. ffmpeg, or imageio, also supports other animation formats like GIF, or "jif," however you want to say it. Keep in mind those are much larger. ffmpeg is compressing things for us down to a reasonable size; if we used Pillow, which is what imageio would use for GIF, we'd be writing the raw data arrays, and GIF doesn't compress well, so you end up with very large animated images. It's just better to use MPEG. [Question] Yes, imageio allows extra keyword arguments, so you can control the frames per second and all of the encoding settings. It has some fairly sensible defaults that are supposed to let it work on most machines. The one thing I've run into is that the output is too high quality for PowerPoint, which doesn't do well with large video files, so I've had to use ffmpeg after the fact to lower the quality of the video; otherwise it has worked everywhere I've seen. So that's making an animation with the MultiScene. For the next step, I'm going to run all of these cells, and I suggest you do the same, and then I'll talk about what they're doing. OK, so if we start at the top of this stacking example, we are now loading VIIRS data, and I talked about how VIIRS orbits around the
poles, so over time you end up with one orbit, then the satellite comes around again and you end up with another orbit. To make a full image of the area you're looking at, you might want to stack those orbits on top of each other, and that is what we're doing here. We're taking the fire data, and if you looked at the file names, there are JPSS-1 files, that's NOAA-20, and there are NPP files, and I've loaded both of them. OK, mine finished. I'm creating a MultiScene the same way I did before using from_files; loading the I03 band; resampling to a custom area that I defined in an LCC (Lambert Conformal Conic) projection, doing all of this with the MultiScene; and then using a new blend method, which by default takes every frame and stacks the next frame on top of it. So it took one orbit and stacked the next one on top, and when we plot this (it took about 40 seconds), this is actually two separate orbits from two different instruments that we put on top of each other. If you had the time, and I hadn't already made you download gigabytes of data, you would get a much larger image with much more data on it; you can build up an entire scene of just VIIRS polar-orbiter data. OK, so that's that. I wanted to show some examples of other kinds of videos you could make and why this is very neat for animations. You have things like this: this is Hurricane Florence, I don't remember what year that is from, I think 2018, yeah, it has to be, and you can actually see the rotation of the hurricane; I'm not sure how well that shows up on the screen. That was generated with SatPy. And I think I forgot to load the other one I wanted. This next one is not out of the box with SatPy, but I wanted to show what's possible: this is VIIRS data where we're making what's called a true-color RGB in the daytime, and then we use a special compositor to say, at night I want a different image.
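The "stack" blend described above can be illustrated in NumPy: keep the first orbit's pixels, and wherever it has no data (NaN), fill in from the next orbit. This is only a sketch of the idea, not SatPy's actual blend implementation, and the tiny 1-D arrays are invented for the example:

```python
import numpy as np

# Two fake "orbits" covering different parts of the same area;
# NaN marks pixels an orbit did not observe.
orbit1 = np.array([1.0, 2.0, np.nan, np.nan])
orbit2 = np.array([np.nan, 9.0, 3.0, 4.0])

# Stack: first orbit wins where it has data, gaps come from the next.
stacked = np.where(np.isnan(orbit1), orbit2, orbit1)
print(stacked)  # [1. 2. 3. 4.]
```

Note that where both orbits saw a pixel (the value 9.0 above), the first orbit's value is kept; repeating this fill for each additional orbit builds up the full polar mosaic.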
So I have it loading that day/night band, and then I throw this into the MultiScene to give me one granule at a time. As the satellite is orbiting, we're getting different sections of the swath and making this video as we go through time. I'm still working on making that make sense in the SatPy interface. So that kind of thing is possible. I didn't mention that there are a lot of different compositors in SatPy that you might want to play with. So that's compositing and using the MultiScene with those composites; you could do the same stacking and videos using regular bands. I don't have any exercises for this because these steps take quite a while to process all the frames, sorry. Let's go on to the next lesson: Cartopy and GeoViews. Based on the survey, the idea I got is that most people in the room don't really use Cartopy, and I think very few use GeoViews, and that kind of works to my advantage here because I'm not an expert on either of them. But we added some utilities to SatPy so that you can more easily use these tools. They have their advantages; they're very useful; I just don't have to use them all the time in my day-to-day job. But we'll see what they can do here. We're going to use ABI CONUS data again. I'm going to use Cartopy to start: create the Scene, load whatever channel you want to load (again, I'm going to stick with channel 7), and then we have this big long matplotlib block; we'll run this and then talk about it. You should get an output image like this: I have my channel 7 data, I still have the colorbar, I have lat/lon lines (they're kind of far apart), and I have coastlines added. If we look at the code that generated this, I guess I should first say what Cartopy is: Cartopy is kind of an extension of matplotlib that understands geographic information, so coordinate systems, and it understands that there could be coastlines
or other features on the earth that you want to plot. To do that, Cartopy uses its own internal CRS (coordinate reference system) object, and to use SatPy's loaded data with Cartopy we have to convert our area definitions to Cartopy CRS objects. We do that with this line of code: take our DataArray, get the area, and call its to_cartopy_crs method, which gives us a CRS object. We can then use fairly familiar matplotlib functions: we create a figure; we create an explicit axes object where we tell it the projection, the coordinate system for these axes, so we're saying "we have a geostationary projection, use it for our figure's axes"; we get our DataArray; and, because we're doing everything with matplotlib, we can use the same utilities from xarray, so my_data.plot.imshow, and we tell it this data is in this coordinate system by giving it that CRS again. Then we use the coastlines and gridlines methods of the axes object to add those to the figure. There is a lot of extra functionality in Cartopy for adding features; they have a whole gallery of examples of things you might want to do. This is a proof of concept, I guess. The other cell in this section shows how you can let Cartopy do the resampling if you want: here we're telling it we want a big plot in a projection covering all of our data, so it took that and resampled it however it wanted to get our data onto the map, and that's the output we get. It was all the same code except that I changed which CRS object I was using, and I took that one from Cartopy. So that's really all I have for Cartopy. In the links here I have a Cartopy tutorial by Phil Elson; he is one of the creators of Cartopy, and I highly recommend you read through it. It has some good advice and some extra examples.
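At its core, a CRS like the ones Cartopy passes around encapsulates a map projection: a function from longitude/latitude to plane coordinates. As a toy illustration of what such a transform does, here is the forward spherical Mercator formula in plain Python; this is a standalone math sketch, not Cartopy code, and the radius constant is just the WGS84 equatorial radius commonly used for this:

```python
import math

R = 6378137.0  # sphere radius in metres (WGS84 equatorial radius)

def mercator(lon_deg, lat_deg):
    """Forward spherical Mercator: degrees -> projected metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

x, y = mercator(0.0, 0.0)
# x is exactly 0 at the origin; y is ~0 (up to floating-point noise)
```

A real CRS object bundles many such forward/inverse transforms plus datum information, which is why plotting libraries exchange CRS objects rather than raw formulas.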
The Cartopy documentation, which I also have linked here, has, like I mentioned, a gallery of examples that would be good to look at. GeoViews is a similar tool that provides interactive plots in notebooks and other interfaces, so you can plot your data and interact with it; we're going to step through some examples of that. There are utility methods in SatPy to help us work with GeoViews, and we'll see some cool examples here. First we have to import everything that GeoViews is going to use; this is pretty much copied and pasted from some of their examples, I just changed some of the defaults for the options I was setting. Then we take the Scene we had, with my channels, and create this GeoViews object that GeoViews can then use to create plots. We tell it we want to look at my channel, what colormap we want, and to add coastlines and borders; GeoViews handles that by multiplication, using the star (asterisk) on your GeoViews objects. If we run this we get a nice warning, but eventually we get a plot, hopefully... there we go. This is still something I'm playing with to get it to look better; I think what Cartopy is doing here is hitting the edge of the data and trying to stretch it to fill the projection it's using by default. GeoViews is guessing which projection would be best to look at our data in; we're giving it the lat/lon values, or maybe the CRS object. We have a colormap, and, similar to matplotlib tools, we have tools on the side here: a magnifying glass so we can zoom into certain areas; zooming with the mouse wheel, I believe, which we can enable by clicking that button; and pan, so we can click the four arrows and move around. So it provides an interface similar to matplotlib, but a different way of plotting the data. We can hit this little refresh-looking button to reset the axes. GeoViews also uses, or at least can use, the CRS objects from Cartopy.
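The `*` overlay syntax mentioned above is just Python operator overloading: multiplying two plot elements yields an overlay containing both. Here is a toy version of that pattern, with an invented `Element` class, only to show the mechanism (GeoViews/HoloViews' real classes are far richer):

```python
class Element:
    """A toy plot element; overlaying with * mimics the GeoViews style."""

    def __init__(self, name):
        self.layers = [name]

    def __mul__(self, other):
        # Combining two elements produces an overlay of their layers.
        combined = Element.__new__(Element)
        combined.layers = self.layers + other.layers
        return combined

overlay = Element("image") * Element("coastlines") * Element("borders")
print(overlay.layers)  # ['image', 'coastlines', 'borders']
```

Because `*` is left-associative, chaining it builds up the layer list in drawing order, which is why `image * coastlines * borders` draws the borders on top.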
So we can run these same examples but resampled to a different area. This next cell will resample in the same projection: we told GeoViews projection=crs, and now we get that ABI scene we've been looking at all day, but with the coastlines added and our colorbar, same kind of interface. And, just like with Cartopy, we can have it do the resampling, a Lambert Conformal projection here; this is also using those CRS objects from Cartopy, give it a sec. Same kind of idea where it's stretching the data, I'm working on that, but different projections, different views, different distortions, looking at the data interactively in the notebook. GeoViews has a lot more functionality than just plotting, and this next section shows some of it. This is something one of our users came and asked about: "I would like to look at my data in GeoViews, what do I need to do?" And I said, I don't know, and they said, OK, I'll figure it out, and they came up with this interface, which takes advantage of GeoViews' ability to give you sliders and tools on your plots. Running through these cells, we start by loading the meso data again and create a MultiScene; this is the mesoscale data we made that video from. We load channel 7, and we use that blend method, which previously we used to stack VIIRS orbits, but we give it this special timeseries function and say: join these DataArrays together on a time dimension. That returns a regular Scene object, but one where our 2D data is now 3D with a time dimension. So we took those mesoscale one-minute time steps, blended them together into one DataArray, and then we can use the to_geoviews call we used before, telling it that we want to dynamically step through this data. If you run through these cells, you should have a slider on the right side labeled "time," and if you move it around you will see the data as we step through it.
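The timeseries blend just described is, at its core, a concatenation of the per-time-step 2D arrays along a new leading time dimension. In plain NumPy (SatPy does this on xarray DataArrays and keeps a real time coordinate; the tiny frames here are invented), the central operation is a stack:

```python
import numpy as np

# Three fake 2x2 "mesoscale" frames, one per minute.
frames = [np.full((2, 2), float(t)) for t in range(3)]

# Join them on a new time dimension: shape becomes (time, y, x).
cube = np.stack(frames, axis=0)
print(cube.shape)     # (3, 2, 2)
print(cube[1, 0, 0])  # 1.0 -- a pixel from the second time step
```

Once the data is 3D like this, a slider widget only has to index `cube[t]` to show each time step, which is exactly what the GeoViews time slider does with the blended DataArray.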
So I'm going to use the zoom box to zoom in on the fire down here, and now I can slide through and look through all the data that I've loaded, go back and forth, and do all your normal interactive stuff; and we've got colormaps and added borders and coastlines. I believe this image is too zoomed in to show any actual coastline... yeah, if I zoom out you can see that there are coastlines there. I went through this kind of fast, but are there any questions? I know there weren't any exercises, and like I said, I'm not an expert, so if anybody can give me hints or tips on how to do this better, I'm all ears, but I've shown you the basics of these tools and what they provide. Any questions on that? [Question] Do you just want to view the data, or are you doing something with it? That is one of the big problems of dealing with this data. I typically resample, if I can, so that the antimeridian is not the splitting point of my projection. You could use Mercator and tell it that 0 degrees longitude isn't the starting point, 180 is, or something like that. I think dynamic area definitions can do that too, but I don't have a better suggestion than that, not that I can think of offhand at least. OK, so this was an introduction to these tools, and this really marks the last bit of the code part of this tutorial; we're getting towards the end. The last thing I have is talking about contributing to SatPy. What we did today is we used SatPy, mostly the Scene object, to load data. I pointed out that there are multiple readers in SatPy; we used two of the easier ones. We loaded data that I had you download. I got the GOES-16 ABI data, I believe, from Google's cloud platform: NOAA has put their data up there on Google Cloud, and they also have it on Amazon Web Services, so I downloaded the data from there.
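The antimeridian workaround in that answer, re-centring longitudes so the seam falls somewhere other than 180°, can be sketched as a simple wrap of the longitude values. This is only the idea, not SatPy's dynamic-area code, and the helper name is made up:

```python
import numpy as np

def recenter_lons(lons, center=180.0):
    """Wrap longitudes into the half-open range [center-180, center+180)."""
    return (np.asarray(lons) - (center - 180.0)) % 360.0 + (center - 180.0)

# Data straddling the antimeridian: 170E..175W becomes a contiguous run.
lons = np.array([170.0, 175.0, -180.0, -175.0])
print(recenter_lons(lons))  # [170. 175. 180. 185.]
```

After this wrap, the values increase monotonically across the old seam, so a projection centred on 180° (e.g. a shifted Mercator) can plot the data without splitting it.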
There are other services that provide that data; the VIIRS data I got from NOAA CLASS. Every organization behind these satellites has its own websites or tools for downloading data. Like I mentioned, we were loading NetCDF and HDF5 files; many instruments have binary formats, some use JPEG 2000, some use custom binary formats, and some use HDF4, depending on how old they are, though sometimes new ones use HDF4 too, don't ask me why. But we can load that data using a simple interface, the Scene: ask for the datasets that SatPy has configured for us and get them into an xarray DataArray object. Once we have that, we can use any xarray tool that understands our data, which should be everything. One thing being worked on in the open-source xarray community: I'm trying to start a geoxarray project, and there's also the rioxarray project, which is trying to standardize how xarray DataArray objects record, or specify, their geolocation information. If that's something that's important to you and your work, I can provide you some links. I'm hoping it becomes semi-standard; I'm trying to standardize it in SatPy, other people are doing it in their own tools, and we're trying to make it the same across all these tools. So: we loaded data with our Scene; we wrote data to GeoTIFF and NetCDF files; we plotted data with xarray's utilities and the standard matplotlib tools; we loaded pre-configured composites, our RGBs; we created our own composites, our own RGBs, our own channels, our own enhancements; we used those to make MPEG videos showing how the satellite data evolved over time; we stacked multiple orbits of polar data, using resampling and a couple of commands to get all of that data into one image; and we showed a quick view of what Cartopy and GeoViews can do and how you can provide SatPy's data to those tools. If there is a huge use case that you think SatPy is missing, or you think something isn't working the way
you would expect, that's where I would ask you to let us know. SatPy has its own GitHub repository, and I mentioned the Pytroll Slack: you can come talk to us, complain to us if you want; it doesn't have to be about SatPy or pyresample or any of our other libraries, you can just come talk about satellite processing and satellite instruments. But if you do have an issue, please raise an issue on GitHub on the SatPy repository. Pull requests are always welcome. Let us know what you're using SatPy for, what you think could be done with it, and what features you'd like to request; it would be great to add you to the community. As far as what Pytroll and SatPy use underneath, a lot of the tools we used today are shown in this graphic. We used xarray DataArray objects with Dask underneath; we used the basics of Dask, doing things like multi-threaded computing and computing things only as memory allowed, so we only computed a chunk of the data at a time, trying to save our systems from killing themselves. This is the Pangeo logo; Pangeo is an open-source community that strives to bring scientists' compute to the data. They're big supporters of having scientists use cloud-computing resources, and I believe they have a couple of talks this week. They want you to go to those cloud resources, and they want the people providing the data to put it on those cloud services, so that you can work with the data without needing huge compute on your laptop and without having to download a ton of data like you did today. We used GeoViews, Cartopy, and matplotlib; we used matplotlib for almost everything, and NumPy is underneath all of that. It's also possible to load this tutorial with JupyterHub and BinderHub and run this stuff on the cloud with Google, though there are still a few bugs that I'm tweaking. All of the data was provided by NOAA: the NOAA GOES-16 data is what we looked at, plus the VIIRS instrument on NOAA JPSS-1 (also called NOAA-20) and Suomi NPP. I hope
that you got something out of this tutorial. Even if you don't think you're going to use SatPy, hopefully you learned a little bit about projections, even if you didn't want to, or about satellite instruments in general. Thank you for coming, and I hope you have a good week. Thank you. [Applause]
Info
Channel: Enthought
Views: 3,706
Rating: 5 out of 5
Keywords: python, scipy, satpy
Id: t4a_NrHy7NA
Length: 190min 10sec (11410 seconds)
Published: Thu Jul 11 2019