Introduction to Python Programming for Scientists I

Thank you. First of all, does everybody have a handout? This is how to install Python on your own computer. If you’d like to do that, it’s really easy, at least the way I found here. And for next time, it’d be good if you actually have Python on your computer. So this is your homework. So what we’re going to do today is just go over some really basic stuff. I’ll try to show it instead of just tell it. How many people already have some Python experience? Okay, okay. So hopefully this won’t bore you if you already know some stuff. And feel free to ask questions at any time, or if you have seen a different way to do something and want to ask about it, go ahead. So, just to recap some stuff from last time about what Python is. It’s a general-purpose programming language. It’s not specific to a particular task like Matlab is, or GEMPAK or something. It’s designed to be very readable and very easy and fun to program in. It is an interpreted language, which means as soon as you type something, it does it. That’s as opposed to a compiled language, where you type something, you save it, you compile it, you save that, you run that, and there are many steps and you have to deal with a lot of overhead. So, because it’s interpreted, you can use it interactively and it will respond right away, or you can save your code and run it later with the interpreter offline. It’s also very modular, which is one of the reasons it’s very popular. It’s very flexible: there’s a core language that you can extend with all these different modules. And we’ll talk more about that in a minute. Another reason it’s popular is because it’s free: you can go out and get it and download it and not pay anything. And it has a big community of people working on it, and there’s a lot of activity and improvements happening all the time, so it’s just this big thing. It’s been around a long time: the actual language was started in 1991, and the name Python comes from Monty Python, as in the comedy troupe.
Now, it didn’t start out as a language for scientific computing; it was just a language, a programming language. But what happened is that over time, people started writing modules, these extensibility parts that add functionality to the language. They wrote modules that do useful things, so that scientific data analysis and model running and that kind of stuff can be done. So the modules, they’re a big part of this. If you just had the core language, you could do some things, but it would be a lot harder. The modules make things easier and add a lot of depth to things. So, let’s see. Mainly what I’m going to be talking about is the collection of modules called SciPy, scientific Python. And those are the core of the scientific computing part of Python. So there’s NumPy, which gives you numerical arrays, very similar to matrices in Matlab. Then you have Matplotlib and Pyplot, which let you make figures very similar to Matlab. Basically, they said, we’re used to Matlab, but it costs too much and we don’t like the way it works, so we’re going to do all that same stuff in Python. So, a lot of the scientific stuff in Python is sort of similar, not identical, but similar to how it’s done in Matlab. There’s another one called Pandas, which I probably won’t get to this time; hopefully next time I can show that a little bit. That has more data handling stuff for larger data sets. So if you take SciPy and you put it with Python, you get some powerful programming abilities. I just want to show a couple of webpages I came across. This one’s talking about how Python is now the most popular introductory language for computer science. And then, Nature magazine is telling scientists to try it, so I think that’s pretty interesting. This one over here was from February this year. Alright, so just to cover what we’re talking about today.
It’s going to be how to get started with Python on your own, certain commands and concepts and modules that you need to have some familiarity with. I’ll show examples for loading and saving data. This is like really basic stuff. Examples for creating plots and saving the images and hopefully I’ll get to also where you can go for more because there’s a lot of stuff out in the world and I’m only going to be able to cover a very small part. And then in two weeks, that’s where we’re going to get a little more in depth and go into just examples. One of the things I’d like you to do is, my e-mail address is on this handout, send me ideas for examples that you want to see. I’ll come up with some if I don’t hear from anyone, but you know, the more interactive this is, the better. If you have something you know how to do in Matlab, say or some other language and you want to know how to do it in Python you can offer that as a suggestion. It’s going to happen anyway. So, in order to use Python, you have to have Python. The handout goes into the details of installing something called Anaconda, obviously another snake name. Anaconda is a distribution of Python. Because Python is free software and all these modules that I’m going to talk about, they’re all free, they’re all out there. But if you wanted to go collect them all and put them on your computer, it would take a while and you’d have a lot of hurdles you’d have to jump over and it wouldn’t be fun. So these distributions, they go out and do all this work for you and they make it an easy download and click on it kind of installer, that gives you pretty much everything you need, not 100 percent, but enough that you can start and there’s also some infrastructure and tools that you can go out and fetch other pieces that you might need. Because they don’t give you everything because they don’t know exactly what you need but they give you a lot of it. 
So, the thing is that these distributions are made by companies, for-profit companies. The distributions themselves are free, but they’re kind of like a hook to try to get you to buy something of theirs. And someday they might not be free, but for now they are. And if they did become not free, someone would come along and make a free one. So, for now I recommend this Anaconda one; it’s complete, it’s really easy. Even though it looks like there are a lot of instructions, it’s really not that big a deal to install it. You basically pick the one you want to download, download it, run it, and you’ve got it. You’ve got everything I’m going to show you. Okay, let’s move to the next thing. So Python is a developing thing. These modules, this code that’s out there for it, it’s all under development. So things change over time. Python has been around a long time. Version 1 is long gone, but version 2 has been around for many years. Version 3 has been around for 5 years or so and they’re trying to get everybody to switch from using 2 to 3, but there’s always a lot of inertia with that kind of thing. So, when you go to get the Anaconda stuff, you’re going to have an option between version 2 and version 3. I’m going to recommend version 3 because it’s newer. There are a few modules out there that you might want to use that are not yet available for 3, so if that happens you could always go back to 2. But I would recommend starting with 3 because that’s the way the future is, the way forward. In fact, the version of 2 that’s out now, version 2.7, is the last major 2 version they’re going to have. So they’re saying we’re done, move over to 3. So, anyway. It’s on the sheet to do that, but I just wanted to mention it, because if you end up writing some Python code and getting further into it, you may encounter a situation where you need to switch back to 2. Anaconda actually lets you have both if you want.
It’s not that big a deal, but it’s something you need to know about. Okay, so I’m just going to show you these things. Oh, let’s start with this. Do you see the URL here? try.jupyter.org. You can do that right now on your laptops. You can go and you will have access to Python right then and there. In fact, I’ll do it too, just to show what we’re talking about. Okay, so this is a really cool website, where people are essentially donating server time to let you play with Python. Now the thing is that you get this server and you can play with it, but if you’re idle for 10 minutes, it kills what you’ve done. And it’s not for saving things. It’s not for doing your real work. It’s for playing around. Okay, so this is something you can mess around with today while we’re exploring. Then you go to the handout and you install the real thing on your computer so that you can save things and use your own data. There’s no way to get your own data into this. But if you go here, then what you want to do is go to Welcome to Python. There’s some data they made available; I haven’t looked at that. So this is called the Jupyter notebook and it’s one of the really futuristic things about Python nowadays. It’s a webpage that has in it a connection to an actual running Python program. And you can type code into it and execute it within your web browser, and you can also interchange between text and code. And so, what you do, well, it says right here what to do: run some Python code. So you click on this box that has some code in it. You press shift-enter and it computes for a minute. The little star shows you that it was working. The 1 means it’s finished, and now it has created this figure. So you can see, in this much code, you can make this figure. And you are free to sit there at your desk and edit this and play with this and see what it does, and you’re not going to hurt anything or break anything, so just have fun.
But I’m going to go through the commands that you kind of need to know to get anywhere with this. But again, if you're working on this and you don't do anything for 10 minutes, it's going to erase what you've done. I think you might be able to save your work. Yeah, you can download this if you want. If you download the IPython Notebook (IPython is another name for Jupyter), the .ipynb file, you can save it on your computer, and then once you have your own Python you can bring it back up and running. So anyway, if you have your laptop today you can play with this, but I'm not going to mess with this particular one. I have my own notebook that I'm going to use. So the notebook is a way to interact directly with Python. There are other ways to talk to Python; let me show you. Is that readable? So what I can do is just type python. If I type python, it gives me this command prompt and says, hi, I'm Python, what do you want me to do? So you can start typing Python commands like print, hi there, and it does what you say. But this is like the least useful environment to interact with. This is really the one meant to run non-interactive programs, so let's show that. So if I make a file, this is an empty text file using my favorite text editor, you can use whatever one you like, and you can start typing commands in here. So you can say print, hello world, from Python. You save this and then you say python test.py and it runs your program. So that is the non-interactive way, the save-your-code, run-your-code way, which is very good if you have stuff you need to reuse a lot. I feel like you want to do the interactive stuff when you're exploring, and then once you've gotten everything sort of figured out, you write it into a script so that you keep a permanent copy of that. So Jupyter is like the notebook, but the notebook is only part of Jupyter. Jupyter is another way of talking to Python, and in here it's always going to be interactive.
So you can say a equals 12, print a (I meant lowercase a), and it does what you say. You can say %run test.py; the percent means this is not a Python command, this is a Jupyter command, and then it reads the file and does what it says. So I'm not going to be using these scripts a lot right now because I want to do the interactive stuff, but that's just to show you how that works. Are there any questions right now? [Audience member asks question] Yeah, IPython has been renamed to Jupyter, so IPython was the interactive Python. What they've done is generalize it. So Jupyter can now do Python through IPython, but Jupyter can also run other languages like R and Julia and other stuff, so Jupyter has gotten so big it's more than Python even. So Jupyter is this environment that you run things in, but it typically defaults to Python. It's been called IPython for a while and they just renamed it, so you're going to see both names. Any other questions? [Audience member asks question] I'm sure you can, but I'm not sure how right now, because I think arcpy comes with the Arc software. So I know you just have to put it in the right place. Of course. So when you start dealing with code that's written by a third party, you have these issues. In that case, you want to install the 32-bit Python 2.7, and then, I don't know the exact instructions, but you would take the arcpy stuff and put it in the right place so that Anaconda can see it. Maybe. There may be other modules. My understanding is that Arc is proprietary so it's probably not going to be available through the normal channels, but I think there are other modules out there, so I'll write that down as something to look into for next time. Any other questions right now? Alright, let's see. So going back to here, oh, Spyder, yes. Another thing that comes with Anaconda is Spyder, which is what's called an integrated development environment. This is kind of like when you run Matlab and it brings up this big window with all these sub-windows and things.
This is the Python version of that, and it'll take a minute to load. This is like a full environment for you to edit Python code, run it, debug it, all kinds of stuff like that. So this is yet another way to interact with Python. So here, this window is a text editor which is Python intelligent, so it knows what you're typing and what it means, and it can help you type it faster. Then over here is like IPython or Jupyter: you can type things in here and get immediate results. And then up here is where you can find information about stuff as you type it, so if you're not sure how a particular module works you can look it up on the right side. I haven't used this a whole lot but it looks promising, so it's just another option. So if I do jupyter notebook, what that does is start running the notebook. What the notebook is, is a web server that runs on your own computer, and then it brings up your browser to connect to that web server, and the web server underneath talks to Python. I'm going to actually use the other browser for this. So this is where the Jupyter notebook starts, and it gives you your files and you can look at them. So anything that is .ipynb is a notebook. And you can open those and play with them. So let's start with basic Python. What this has is a collection of commands that I've typed in and text that I've written, and we can go through and edit them and run them and see what they do. So let's start with that. We're going to start with really basic things, and again, stop me if you have questions. So the first thing you usually want to know how to do is print things, and what I'm going to do is go through and hit shift-return in each one, and it's going to execute when I do that. So you see the 1 here means this one has been executed and it was the first one to be executed.
That's important because if you wanted to you could execute these out of order, and if you did that, the outcome of each one affects the ones you do later. They're all in the same memory. So we said print, hello world, and it printed hello world. This is always the first program anybody does in a computer language. So, a couple of things here. There's the print function, which you send something to and it does something with it, and the thing that we're sending it is a string; it has these quotation marks. A string is just some text, some letters or numbers that you want to display or produce something with. So we tell it to print that and it prints it, okay, great. That's not very useful yet but it's something. So now we have some variables. This is where you have information and you want to store it somewhere. So I have a variable I call message and that is equal to this string, Hello world, and these things here are comments. If you're not used to that in a language, that is stuff meant for the humans, not for the computer, so anything after the hash mark is ignored by Python. It's just for our readability. You can have an integer, like this one, and you can have floating point numbers, so these are the three really basic types. And notice that I'm trying to use good names for these, not just blah and x and stuff like that. You always want to name things well, and especially in science contexts, you probably want to indicate units somewhere, so I put this: it's in Fahrenheit for temperature. So if I execute this, it's not printing anything, it's just storing things. If we want to see them we have to print them. So let's print these, let's do that. This one's been run second, this one has been run third. First it prints the message, so that says hello world. Then it prints this thing, the answer to the ultimate question is, and then the answer.
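The cell contents described here are not reproduced in the captions, so the following is only a minimal sketch of what they might look like. The names `message` and `the_answer` and the value 42 come from the talk; the spelling `body_temp_F` and the value 98.6 are assumptions, chosen to match the spoken "body temp F" and the 37 degrees Celsius result mentioned later.

```python
# Three basic kinds of value: a string, an integer, and a floating-point number.
message = "Hello world"   # a string (text inside quotation marks)
the_answer = 42           # an integer
body_temp_F = 98.6        # a float; assumed value, the name records the unit (Fahrenheit)

# Assigning stores the values but prints nothing; print() displays them.
print(message)
print("The answer to the ultimate question is", the_answer)
```

Passing two arguments to `print`, separated by a comma, puts both on one line with a space between them.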
So that creates this one line where it's got two pieces of information on the same line, the string that I gave it followed by the number that was stored in the answer. Let me scroll up. And then, I created another string here just to represent the units in a human-readable way. Oh wow, this is scrolling off as you can see. So what I've got here is this string, and this string is called a format string. It contains text that you want to print and instructions for including other information. The other information is represented by these curly braces here, so curly braces 0 means the first thing that you give it and curly braces 1 means the second thing you give it, and what I mean by giving it is that I call this dot format. So it's string dot format and then the information that we want to stick in this string. So I'm giving it this body temp f as the first thing and then I'm giving it units as the second thing. So units is going to go here and body temp f is going to go here. Now the colon point 2 f is the instruction for how I want the body temp f to be shown. What that's saying is that you should write that number with two places after the decimal point. That's often useful if you want the numbers to line up and stuff like that. So these are examples of different ways you can output stuff. Any questions here? Confusion? I just threw that in here and tried to keep this short, but it's still not short enough. Now you can also do it this way, let me show that. So this is another way that you can do it. You can actually store the format string in another variable and then use that to do the formatting. It doesn't matter. It does make it a little easier to read, but it's the same behavior. Any other questions? Let's do a little math, so let's convert the Fahrenheit to Celsius. So this has all the basics of math: you have parentheses, you've got subtraction, multiplication and division, and you can also do addition of course.
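As a hedged sketch of the format string and the conversion just described: the exact wording of the format string, the variable names, and the value 98.6 are assumptions, since the captions do not show the notebook cell. The `{0:.2f}` and `{1}` placeholders and the `.format` call are as the speaker describes them.

```python
body_temp_F = 98.6                # assumed value
units_F = "degrees Fahrenheit"    # hypothetical units string

# {0:.2f} -> first argument, shown with two digits after the decimal point
# {1}     -> second argument, inserted as-is
line = "Body temperature is {0:.2f} {1}".format(body_temp_F, units_F)
print(line)   # Body temperature is 98.60 degrees Fahrenheit

# The format string can also be stored in a variable first; the behavior is the same.
fmt = "Body temperature is {0:.2f} {1}"
same_line = fmt.format(body_temp_F, units_F)

# Fahrenheit to Celsius: parentheses, subtraction, multiplication, division.
body_temp_C = (body_temp_F - 32) * 5 / 9
print(fmt.format(body_temp_C, "degrees Celsius"))   # Body temperature is 37.00 degrees Celsius
```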
So that's pretty straightforward I think and then if I print this, oh no I got to run it, so that gives us 37 degrees Celsius. Now because this is an interactive thing, you don't actually have to print everything to see it. So if you just type a variable on a line by itself and run it, it shows you that result. This is the input. This is the output. When you use print, it doesn't say out, it's just doing what you told it to do. So this is what you told it to do and this is the sort of implicit output that you want to see. But if you have two such things like that it only shows you the last. So this does not print the message followed by the answer plus three, it just prints the last line. So if you did want multiple things to show up you would need to use print for that. It's just different ways of interacting with the notebook basically. Alright? Questions? And feel free, you can type this kind of stuff into the trial Jupyter thing and play with it and do whatever you want. Let me show you something about ordering, so this is specific to the way that Notebook works. If I were to go up here to this one that I’ve already executed and hit shift enter again, you see that the 2 turned into an 8 here. That means this cell has now been executed eighth in the sequence. So even though that’s not the order they’re presented on the screen, that’s the order that Python remembers things. So you can actually go back and edit these and re execute them and that will update the situation that Python sees. So, I didn’t change anything so it really made no effect. But if I, for example, changed this to 24 and then did it. It would change that value of the answer. Now if you go down here, to where I’d done this, it still shows 42 because that’s what it was at the time that I ran it. But if I run it again, now it’s updated to the new value. So all these things are in memory somewhere and you’re messing around with what’s in memory and that could affect the outputs that you’re seeing. 
Alright, let’s see here. Now we have to get into slightly more complicated things. But these are the things that start to make Python really useful and simple compared to some other languages. So one of the main types in Python is the list. This is, instead of a single value, a collection. So you see the square brackets and commas, so I’m storing 4 separate members into temps_C. It doesn’t have to be 4, it can be 4,000, whatever. And then, this one is demonstrating that they don’t all have to be the same kind of thing. This one’s a string, this one’s a floating point, and this one’s an integer; they’re all different, but that’s okay. So, another way to do this: I could do list 3 equals empty brackets, so that’s an empty list, it contains nothing. I can say list 3 append the_answer and list 3 append 654321. So, if I run that, and then let’s stick in here print list 3. Okay, so I’ve printed these 5 things, so let’s look at that. You’ve got temps_C, which was these 4 numbers, so it prints those 4 numbers, just as you’d expect. Now what I’m doing here is I’m saying the length of temps_C equals, and I’m calling the len function, L E N, short for length. So when you call that and you give it a list, it tells you how many things are in the list. Since that’s included in the print, it’s sticking that here. So the length of temps_C equals 4. The 4 is coming from len. Then I print arbitrary list, which is this list of things, and you can see it’s putting quotes around the things that are strings and it’s showing the different types, the integer and the floating point. And doing the same thing here with the length of that list, you see the length of that list is 3. You know, 1, 2, 3. And then I’m printing list 2, which I built in a different way. And that one is here. Any questions about that? Okay. Right. Well. So, this list is not the one you use for mathematical stuff. This is a generic way of storing information. You can also have lists of lists.
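A minimal sketch of the list operations described above. Only the first temperature (12.3), `the_answer`, and 654321 come from the talk; the other list contents and the name `list_3` (spoken as "list 3") are placeholders.

```python
the_answer = 42

# A list of four floats; only 12.3 is from the talk, the rest are placeholders.
temps_C = [12.3, 18.5, 21.0, 25.7]

# Members do not have to share a type: a string, a float, and an integer.
arbitrary_list = ["some text", 3.14, 7]

# Start with an empty list and grow it with append().
list_3 = []
list_3.append(the_answer)
list_3.append(654321)

print(temps_C)
print("The length of temps_C =", len(temps_C))                 # len() counts the members
print(arbitrary_list)
print("The length of arbitrary_list =", len(arbitrary_list))
print(list_3)
```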
So you can have something, we can call this matrix, and I can say 1, 2, 3, 4, 5, 6, 7, 8, 9. Now I’ve got here a list containing 3 lists. And each list contains 3 things. You might think that that would be useful for math, but that’s too cumbersome. We’ll get to the real thing for that in a minute. But the concept is very similar to what we’ll see in a minute. So if we do print matrix, it prints that. And if we did, we’re going to insert a cell below here, type matrix. Yeah, it’s the same thing. Sometimes it prints them differently. Never mind. So, yeah, it’s not a row or a column, it’s a list. And we’ll come to rows and columns in a minute. Another thing is if you have a list, you probably want to grab things out of the list instead of dealing with the entire list. So if you use square brackets after the list name and put in an index, then that allows you to retrieve the item at that location. But this may be non-intuitive to some people: Python starts counting at 0. So, the first thing in the list is at location 0, the second thing in the list is at location 1. So basically, it’s like an offset: how many away from the beginning is it. You’ve got to remember that, because in Fortran and Matlab and stuff they start at 1. In Python they start at 0. It’s going to throw you off if you’re not careful. So, temps_C of 0 should give us the first number in the list. And then, what I’ve got here is 1:3, and if you’re used to Matlab you might recognize that idea of starting at 1 and going to 3. But Python is a little tricky there. Let’s execute this, so I can show you. So, alright, remember this one here is temps_C. If I say 0, that gives me 12.3, which is the first one. If I say 1:3, you might expect to see this one, this one, and this one. But Python doesn’t do that. It gives us this one and this one. Because what Python does, and there’s a programming reason for this, is it never gives you the last one. You go up to, but not including, that one.
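The nested list and the indexing behavior above can be sketched as follows. The values in `temps_C` after 12.3 are placeholders, not the ones on the speaker's screen.

```python
# A list containing three lists, each holding three numbers.
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(matrix)

temps_C = [12.3, 18.5, 21.0, 25.7]   # only 12.3 is from the talk

# Indexing counts from 0, so index 0 is the first item.
first = temps_C[0]
print(first)          # 12.3

# A slice start:stop includes start but stops before stop,
# so 1:3 gives the items at positions 1 and 2 only.
middle = temps_C[1:3]
print(middle)         # [18.5, 21.0]
```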
So 1:3 means start at 1, go up to, but not getting to, 3. So that means you get 1 and 2. And so if that’s 0, and that’s 1, and that’s 2, then you get those 2 as your answer. It’s a little weird at first, but it’s very consistent with the way Python does a lot of different things, so it’s important that it works that way. Now this one: if you know Matlab, you know that there’s a way to specify the beginning of the interval, the end of the interval, and a skip so you don’t get every single one. Now, in Matlab, the skip goes in between, but in Python the skip goes on the outside. So this is where you start, this is where you end, and this is how much you skip. But what does a minus 1 mean? Minus 1 means from the end. So minus 1 means the last one. So, I’m saying I want to go from 0 to the last one, skipping by 2s. So that gives you this one, then we skip that one, we do that one, we skip that one, there’s nothing left, so we’re done. So it gives you the first and the third. Yes? Let’s find out. Let’s go up here and add another one. So I have to re-execute this one for that to show up. I’ll re-execute this one just so we can see, there’s the 88. Do this one again, and it didn’t do it. So, I couldn’t remember which way it would go actually. I think that the minus 1 says that one, so it never does the last one. But I think, if I remember right, we can do this. If I don’t say anything at all, that means all of them. In fact, I don’t even have to put the 0 in this case. Colon, colon means all of them. And then the 2 is just saying every other one. You’ll have to play with that to get used to it. But, that’s just how it is. Any other questions so far? Okay. Alright, dictionaries! That’s the other big deal in Python. This is the kind of thing that in other programming languages like C or Fortran would take many, many, many lines of code and be very cumbersome to deal with; in Python it’s quick and easy.
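Before moving on to dictionaries, here is a hedged sketch of the stepped slices from the experiment above, including the 88 that was appended. As before, only 12.3 and 88 come from the talk; the other values are placeholders.

```python
temps_C = [12.3, 18.5, 21.0, 25.7]   # only 12.3 is from the talk

# start:stop:step, where -1 as the stop means "up to, but not including, the last item".
every_other = temps_C[0:-1:2]
print(every_other)        # [12.3, 21.0]  (the first and the third)

# Add a fifth value and repeat: the last item is still excluded by the -1 stop.
temps_C.append(88)
still_excluded = temps_C[0:-1:2]
print(still_excluded)     # [12.3, 21.0]

# Leaving start and stop empty means "all of them", so the 88 is now reached.
whole_list_by_twos = temps_C[::2]
print(whole_list_by_twos) # [12.3, 21.0, 88]
```

Used as a plain index rather than a slice boundary, `temps_C[-1]` does return the last item.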
So what a dictionary is, it’s kind of like a list, except you label the things in the list. So, in this case, I create an empty dictionary with the curly braces instead of the square brackets. And this is kind of like up here, when I said bracket 0 I meant the first element in that list, except this isn’t a list, it’s kind of a labeled list. So, instead of giving numbers here, I give labels. And the labels can be anything. The labels can be numbers, but they’re not indexes, they’re just arbitrary numbers. So I could also say quantities of 5,000 equals the string 5,000. So, what I’m doing is I’m creating an entry for each one of these lines. I’m creating an entry in the dictionary and I’m saying I want to associate with this thing, this value, and I want to associate with this thing, that value. And they don’t even have to be numbers. You can say something like that. And so, when I print it, I get this curly brace thing that says the label milk goes with the value 1, the label egg goes with the value 6, the label 5,000 goes with the string 5,000, and the label bread goes with 0. You’ll notice they’re not in the same order that we put them in. The order is arbitrary. [Audience member asks question] Potentially. You can’t rely on that order. It’s the order that it appears in the computer’s RAM, which is based on a lot of different things. But usually, you don’t use one of these for ordering purposes. I think there is a variation where you can specify order, but I’ve never used that. But what you do with this is, this is for looking up things, it’s a dictionary. So instead of trying to remember that milk is the first one on the list, for example, I don’t have to remember that, I just say I want the one that goes with milk in my dictionary. So I can look it up. So if I say print quantities bracket bread, then that prints the value that was associated with bread, which is 0.
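A minimal sketch of the dictionary being built here, using the labels and values read out in the talk. The order in which the entries were added and the exact form of the 5,000 entry are assumptions.

```python
# An empty dictionary uses curly braces instead of square brackets.
quantities = {}

# Each assignment creates one entry: a label (key) associated with a value.
quantities["milk"] = 1
quantities["egg"] = 6
quantities["bread"] = 0
quantities[5000] = "5,000"   # a key can be a number, and a value can be a string

print(quantities)

# Look a value up by its label rather than by its position.
print(quantities["bread"])   # 0
```

The remark about arbitrary ordering reflects the Python versions current when the talk was given. Since Python 3.7, a plain dictionary keeps its entries in insertion order, although lookup by key remains its main purpose.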
Now then I try to look up something that’s not in the dictionary, oh, and I get an error. So this is the type of error you’ll see in Python. It’s not super user-friendly, but you can start to get some ideas. Here it’s trying to show you the line of code that it doesn’t like. It’s saying, okay, this is the one that I had a problem with, and then it says KeyError: 'soda'. To understand that, you have to know that keys are what these labels are. So KeyError means that key is not in the dictionary. You can always Google that or something. Now, it’s not clear until you start doing things with these why they’re useful. But with the combination of list and dictionary, you can do a whole lot of different stuff. So to illustrate, I’m going to do this. Well, let’s go through the code first. So temp_data is an empty dictionary. I’m adding something called name and giving it this value. I’m adding something called units, giving it that value. And then a thing called values, and I give it a list. So now I’ve got a list inside of a dictionary. That’s where it starts to get useful. You can start to build these things inside of each other. Then I create another dictionary and I say data['temp'] equals this whole dictionary. So now I’ve got a dictionary inside a dictionary, which has a list. So you can create this whole infrastructure. And then data['press'] is another entry that I came up with. It has a name of pressure, it has units of millibars, and values of whatever. So the idea is that you could, for example, store information with not just numbers but also associated useful things. So let me run this, and it prints out kind of ugly, but the outer curly brackets are the entire dictionary, and press is the label, colon, and that associates with another dictionary which goes from here to here, and then temp is another dictionary. And within each one you can see the name and the values and units. Who’s confused? I mean, any questions?
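The failed lookup and the nested structure above can be sketched like this. The key names (`name`, `units`, `values`, `temp`, `press`) follow the talk; the numeric values and the units string for temperature are placeholders, and catching the error with `try`/`except` is an addition for illustration, since the speaker simply lets the error appear.

```python
quantities = {"milk": 1, "egg": 6, "bread": 0}

# Looking up a key that is not in the dictionary raises KeyError.
try:
    print(quantities["soda"])
except KeyError as err:
    missing_key = err.args[0]
    print("KeyError:", missing_key)

# A list inside a dictionary.
temp_data = {}
temp_data["name"] = "temperature"
temp_data["units"] = "degrees C"                 # placeholder units string
temp_data["values"] = [12.3, 18.5, 21.0, 25.7]   # placeholder values

# Dictionaries inside a dictionary.
data = {}
data["temp"] = temp_data
data["press"] = {
    "name": "pressure",
    "units": "millibars",
    "values": [1013.2, 1009.8, 1005.1],          # placeholder values
}

print(data)
print(data["press"]["units"])        # millibars
print(data["temp"]["values"][0])     # 12.3
```

Chaining the brackets, as in `data["temp"]["values"][0]`, walks from the outer dictionary to the inner one and then into the list.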
It’s not; I know it’s not completely clear yet. But these are the building blocks of writing code in Python. So, the thing is is that when you start to use these modules, I haven’t used any modules yet, this is just the core language. When you start to use the modules, they take these ideas and run with them. So they do things that are very similar to dictionaries and lists, but aren’t quite. They actually have other behaviors and abilities and performance and stuff. So I’ll try to show that next. Okay, so I need to talk about modules a little bit more. So the modules are these bits of code that get written by people and added on to the language. They extend the functionality in different ways and the idea is that they are usable not just for one program, but for many. And in order to have the module, you have to install it. Now if you install Anaconda, it comes with hundreds of modules but there might be occasionally one or two that it doesn’t come with. In fact, the last line of the instructions, suggests to install a couple of modules that it doesn’t come with, which might be useful. So, here’s some not necessarily scientific, but generally useful modules. So the math module gives you things like log and sin and cosine and all that. And then you’ve got the os module, which lets you talk to the operating system and find files and things like that. The datetime module lets you deal with dates and times in a semi sane way. Now the scientific modules, I mentioned these a little bit. So you’ve got the numpy, which gives you numerical arrays. Numerical arrays are like lists but they’re more suited for math and for containing lots of data, like numerical data. You got the pyplot, which is for making figures. I’m going to show that. Scipy would be if you needed to do actual number crunching. Fourier transforms, optimization, what was the other one, interpolation and then like linear algebra type stuff. Pandas is for dealing with spreadsheet like data. 
netCDF4 is for any kind of geophysical data, which usually comes in a format like that. And xray is a way to deal with NetCDF data in a little easier way. Alright, so the core language is what you get when you start Python. In order to grab one of these modules and make it available, you use the import command. So, like if I said import numpy, then that would make the numpy module available. It would just appear and I can use it. But every time you use something in that module, you have to specify the name of the module, and then a dot, and then the thing that you’re using. Now, you could also alias that. So if I said import numpy as np, which is commonly done, then instead of saying numpy dot whatever, I say np dot whatever. It’s just a way of saving typing. You could also specify a particular thing you want. So I can say from numpy import ndarray and then I don’t get everything, I just get that. And it doesn’t stick the module name on it, you just use ndarray. [Question from audience] Yes. Well, so yeah let’s see. I didn’t even mention that. So there’s a help system. So if I’m interested in the module numpy, I can ask for help on numpy. There’s different ways of doing this, but this is like a simple one. So this gives you sort of the rough documentation about numpy. Numpy’s pretty big, there’s a lot of stuff in it, so there’s a lot of text here. But usually, the easiest thing to do is to go find the online documentation. Python knows what’s in there and it can show you, but it can be a little overwhelming when it shows it to you like this. Look how long this scrollbar is. So that’s probably not the best thing to do, but it demonstrates that there is help. If I said import numpy as np. Now it’s imported. And I say np dot tab. If I hit the tab key, this is specific to the way that Jupyter works; it actually gives you the list of things you can type next. Which is very big because I’m talking about an entire module. So there’s all these different things I could do.
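The three import styles and the help lookup described above look roughly like this. It is a minimal sketch that assumes numpy is installed; the small arrays are placeholders.

```python
import numpy                 # use as numpy.something
import numpy as np           # same module, shorter alias
from numpy import ndarray    # pull in just one name

a = numpy.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 3.0])

same_module = numpy is np            # the alias points at the same module
is_ndarray = isinstance(a, ndarray)  # ndarray is usable without a prefix

# help(np) prints the documentation for the whole module, which is long.
# Asking about one function, e.g. help(np.mean), is usually more manageable.

# dir() lists the names in a module, roughly what tab completion
# shows in Jupyter after typing "np." and pressing Tab.
names = dir(np)
print(len(names), 'names in numpy')
```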
So these are different ways of finding out, but they don’t have a great way of like pinpointing exactly what you want. [Audience member] Yeah, yeah, yeah. Like some modules are really big, it takes a while to bring everything in. You’re probably not going to do that third one very often, but that’s for particular cases. Yeah, good question. Any other questions on importing? Alright, let’s go to the examples. So, just one other thing before I get to the other stuff. So, if you wanted to write a function. So that would be where you have some code that you want to reuse for a lot of things. Then here’s how you do it. You say def for define, and the function name, and then in parentheses the inputs, and then a colon. And then you’d have some code here and then you’d return whatever the function gives back. I didn’t show things yet like looping and if statements. I don’t know if I’ll have time now but maybe next time I can show that. But, so now that I have defined this function, I can call this function. If I don’t run this cell it won’t work. So I run it, then I call it. So now I’ve converted 20 degrees Celsius to 68 degrees Fahrenheit. Alright, so here’s an interesting thing. So here, my temperatures are in Celsius and I want to call this function on all of them. I do this sequence. This uses the square brackets to represent a list, but instead of the actual items in the list, it’s the code to generate the list. So, what it says here is for t in temps_C, which means go through each thing in temps_C and call it t and then do this for that, for t. So then I call convert_C_to_F on t. So what it’s doing is it’s automatically looping through the first list, taking each item, sending it into the function, taking the result, and putting it in a new list. So you can see the two lists here. It took 12.3, it ran it through this function, and got back 54.14. And it did the same thing for each value. You don’t have to do that too often.
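A sketch of the function and the list-building expression (a list comprehension) described above. The function name and the inputs 20 and 12.3 come from the talk; the rest of the list and the parameter name are assumptions.

```python
def convert_C_to_F(temp_C):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_C * 9.0 / 5.0 + 32.0

# Call the function on a single value.
print(convert_C_to_F(20))    # 20 C is 68 F

# Hypothetical list; only 12.3 is mentioned in the talk.
temps_C = [12.3, 20.0, 0.0, -40.0]

# List comprehension: loop over temps_C, call the function on each item,
# and collect the results in a new list.
temps_F = [convert_C_to_F(t) for t in temps_C]

print(temps_C)
print(temps_F)
```

The comprehension is shorthand for creating an empty list, writing a for loop, and appending each converted value.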
This is just to show some of the differences between the basics and the more useful things. Let’s skip that. This is a little bit out of sorts here. So let’s start with numpy, and what I’m going to do here is actually load a file. So let’s look at that file real quick. So these are some temperature numbers that I grabbed from the Rutgers PAM site, which is an instrumented research facility nearby. These are hourly temperatures from October of 2013. And it’s just a list of numbers so there’s no way to see the context of where this came from. But those numbers are in there. And if I go back to my notebook and I do this line here, then I can look at that. And now it’s got all these numbers. It just crashed. My browser crashed. That is not good. It must have been too big. Let’s do this again. Sorry. So this is the first 10 items in the list, instead of the whole thing, which apparently crashes the browser. But it’s basically a bunch of numbers. But you see here it’s got the square brackets that the lists had before, but you see wrapped around that is array. So array is the numpy thing. So I called numpy’s loadtxt and I gave it the name of the text file that has numbers in it. And it read those numbers and it converted them into numeric values and stored them in this long vector essentially. And so, what I’m going to do is skip through all these lines and go to the bottom because it’s easier. So if I say temps_hourly.shape, this is telling me the dimensionality of that variable. So it’s saying it’s 744 rows and 0 or 1 column, basically. It’s one-dimensional. But I could do this, temps_daily = temps_hourly.reshape, hopefully I’m doing this right. So now temps_daily. Alright, so what I’ve done here is I took a shape that was 744 lines long and I said now rearrange that so it’s 31 rows by 24 columns. So now I made it two-dimensional. And I’m asking for the shape and it says 31 by 24. Then what I say here, this is like Matlab.
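The load, inspect, and reshape steps above can be sketched like this. The original data file is not available here, so this example first writes a synthetic file of 744 made-up hourly values to a temporary folder; the file name and the numbers are assumptions.

```python
import os
import tempfile
import numpy as np

# Stand-in for the lecture's data file: 744 synthetic hourly temperatures
# (31 days x 24 hours) with a simple daily cycle. Not real observations.
hour_index = np.arange(744)
fake_temps = 15.0 + 5.0 * np.sin(2.0 * np.pi * hour_index / 24.0)
path = os.path.join(tempfile.mkdtemp(), 'temps_hourly.txt')
np.savetxt(path, fake_temps)

# Read the text file into a numpy array.
temps_hourly = np.loadtxt(path)

print(temps_hourly[:10])      # first 10 items only, not the whole array
print(temps_hourly.shape)     # (744,) means one-dimensional, 744 long

# Rearrange into 31 rows (days) by 24 columns (hours).
temps_daily = temps_hourly.reshape(31, 24)
print(temps_daily.shape)      # (31, 24)
```

Printing a slice such as `temps_hourly[:10]` is a cheap way to peek at a large array without flooding the notebook.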
I say give me the 0th row, which is the first row, and give me all the columns. So with a comma here I am telling it what I want for each dimension, the first dimension comma the second dimension. So it’s giving me here the 24 temperature values for the first day in October of 2013. And what it’s not showing is all the rest of the rows because it’s too much information. So then what I can do, since I reshaped it, I made each day have its own row. I take the mean along each row and I get the daily mean temperature for every day. So I took some data, I read it. I rearranged it and I calculated something useful. Any questions so far? Yeah, just hourly numbers. Yes. So, well, if you have this long list and what I’m saying is, you know, this is 744, which is 31x24. So what it’s doing is it’s taking the first 24 and making that the first row. And then it’s taking the next 24 and making those the next row, like that. So whereas the first position is here, the first position is here. And it’s going down here and going to the right here, basically. And then it wraps around when it gets to this one. Does that make sense? Yeah. Right. That’s because Ferret uses the Fortran ordering of data. It’s completely arbitrary, but each language can do it either way. So, in Ferret, things are stored essentially down the columns, whereas in Python, it’s based on C, which stores things in rows first. It’s just the order that it is in memory. So, yeah, that’s a good point. If you’re used to Ferret or, I think Matlab might even be the same as Fortran too in some ways. So this might be a difference from Matlab but I’m not quite sure. Well you could transpose the result. You know, all the types of matrix operations are available, so. Okay, here we go. So, wait before I do that, I want to do this. I’m actually going to replace temps_daily with its mean. So if I say temps_daily, now I just get the actual daily temperatures.
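To make the row selection, the daily mean, and the row-versus-column ordering concrete, here is a small sketch. It uses the numbers 0 to 743 in place of real temperatures so that it is easy to see where each value lands; that stand-in data is an assumption.

```python
import numpy as np

# Stand-in data: 0, 1, 2, ..., 743 instead of real temperatures.
temps_hourly = np.arange(744, dtype=float)

# Default (C, row-major) order: the first 24 values become row 0,
# the next 24 become row 1, and so on.
temps_daily = temps_hourly.reshape(31, 24)

first_day = temps_daily[0, :]     # row 0, all columns: 24 values
print(first_day)

# axis=1 averages across the 24 columns of each row: one mean per day.
daily_means = temps_daily.mean(axis=1)
print(daily_means.shape)          # (31,)

# Fortran (column-major) order fills down the columns instead.
# Here each column holds one day; transposing recovers the same layout.
by_column = temps_hourly.reshape((24, 31), order='F')
same_after_transpose = np.array_equal(by_column.T, temps_daily)
print(same_after_transpose)
```

Replacing the 2-D array with its row means, as done at the end of this part of the talk, would simply be `temps_daily = temps_daily.mean(axis=1)`.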
So what I’m going to do now is, let’s import. So, matplotlib is this big module. I’m asking for a piece of it. matplotlib.pyplot, it’s actually a module inside of the module. And I’m naming that whole thing as plt. So whenever you see plt, I’m referring to pyplot. And then, what I’m doing here, so if I have my hourly temperatures; well let’s not do that yet. We’ll come back to that. plt.plot(temps_daily). C’mon. Oh, I think I forgot an important thing. Yes, I have to do this: %matplotlib inline tells the notebook to keep the figures in the notebook. So if I do this again it puts the figure here. So we’ve made a plot. We had some data, we’ve made a plot. The bottom, since I didn’t give it an x value for the data, it’s just using the index. So it’s starting at 0 and going to 30. The y is the value in the list. Now if I wanted to plot the hourly temperatures as well, I can do that. But the problem is they don’t have a consistent x-axis associated with them. So, this one has 30 things in it, or 31, and this one has seven hundred and forty-something, so it’s not really helpful. So what I do is I make some x-axes. So the arange function inside of numpy, you tell it a start and an end and an interval, and it creates an array of numbers that follow those rules. So, I’m saying, I want the first number to be 1, the end to be 32, and the step in between to be 1/24th. So these are essentially days, or fractional days, along which my hourly data runs. This one is the same thing, except it’s daily data. So, if I then put days and hours, now they have consistent x-axes and they go together. So you’ve got your hourly fluctuating temperatures, and you’ve got your daily means. But now, you know, this isn’t right, well it’s starting at 1 now, but this is starting at 0 so we can do some things. plt.xlim(1, 31), plt.xlabel('Day in October 2013'). If you’re familiar with Matlab, these function names would look similar. They basically stole them from Matlab.
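A sketch of the plotting steps above, assuming matplotlib and numpy are installed. The temperatures are synthetic stand-ins. Two choices here are mine rather than the speaker’s: the Agg backend, so the example runs without a display (in a notebook you would use `%matplotlib inline` instead), and building the hourly axis from integer counts rather than calling arange with a step of 1/24, which sidesteps floating-point surprises in the array length.

```python
import matplotlib
matplotlib.use('Agg')             # no display needed; a choice for this sketch
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the hourly temperatures (not real data).
hour_index = np.arange(744)
temps_hourly = 15.0 + 5.0 * np.sin(2.0 * np.pi * hour_index / 24.0)
temps_daily = temps_hourly.reshape(31, 24).mean(axis=1)

# x-axes in units of "day of month".
hours = 1.0 + hour_index / 24.0   # 744 fractional days starting at 1
days = np.arange(1, 32)           # 1, 2, ..., 31 (the end value is excluded)

plt.plot(hours, temps_hourly)     # hourly fluctuations
plt.plot(days, temps_daily)       # daily means on the same axis
plt.xlim(1, 31)
plt.xlabel('Day in October 2013')

x_limits = plt.gca().get_xlim()
x_label = plt.gca().get_xlabel()
```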
So now, I’ve got a nice x-axis, I’ve got labels, I can do, you know, all the stuff you can do with plots, it’s all in here. But I just wanted to show. And then the final thing you’d want to do is do savefig and then October temperatures. So now, it’s taking that figure and saving it to a file. So now I’ve got some output. Let’s try, I don’t know, eps, okay that one might be tricky. Let’s see. There’s, okay, so open. Yeah, I don’t know if it worked or if the conversion failed. Yeah, it looks like it didn’t work very well. It doesn’t have enough information in here. So, I may need to install something for that to work. But we could try a different thing like gif. Nope, it didn’t like that. Png is the one I’m used to but I’m sure there’s other ones, it’s just, off the top of my head I don’t know. [Audience question] Let’s see, matplotlib savefig. So, yeah there’s stuff about it. I’d have to look into it, but. Okay, and then, so I can save the figure. Now I want to save the data that I made because I generated some new data here. And where’s that example? I think that’s np.savetxt. Oh, well yeah it supports eps but it didn’t work. Probably because there’s something on my computer that’s missing. Tiff is another good option. I would not use jpg for any kind of figures. It’s going to screw it up. Maybe I could save as pdf directly. That one might be interesting. Let’s see, because Macs are very pdf friendly. Yeah, but there’s something not working there. Maybe next time I’ll have that working. np.savetxt, temps_daily. That didn’t work. Let’s see, file name first, and there. There are my daily temperatures saved off. So we’ve got the whole loop. We could read data, save data, manipulate data, make a figure, save a figure. Those are the basic pieces for doing scientific computing. We’ll get into a little more interesting things next time. If you have any ideas or suggestions, please email me. If anything was unclear let me know.
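Saving the figure and the derived data might look like the sketch below, assuming matplotlib and numpy are installed. The output file names, the temporary folder, and the synthetic data are assumptions. The set of image formats that works depends on the installed backend and libraries, so the sketch asks the figure’s canvas what it supports instead of guessing.

```python
import os
import tempfile
import matplotlib
matplotlib.use('Agg')             # no display needed; a choice for this sketch
import matplotlib.pyplot as plt
import numpy as np

# Synthetic daily means standing in for the lecture's data.
temps_daily = 15.0 + np.linspace(-3.0, 3.0, 31)

out_dir = tempfile.mkdtemp()      # hypothetical output location

plt.plot(np.arange(1, 32), temps_daily)
plt.xlabel('Day in October 2013')

# Ask which file formats this backend can write.
supported = plt.gcf().canvas.get_supported_filetypes()
print(sorted(supported))

# The file extension selects the format.
fig_path = os.path.join(out_dir, 'october_temperatures.png')
plt.savefig(fig_path)

# np.savetxt takes the file name first, then the array.
data_path = os.path.join(out_dir, 'temps_daily.txt')
np.savetxt(data_path, temps_daily)

# Read the file back to confirm the round trip.
reloaded = np.loadtxt(data_path)
```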
This is being recorded so you can watch it again. Any questions right now, before we stop? I’ll do that. I’m actually working on a website to do just that, so. Okay. Thank you.
Info
Channel: Rutgers University
Views: 28,248
Rating: 4.940043 out of 5
Keywords: Python (Programming Language), programming, environmental science, Rutgers University (College/University), education, higher education, Programming Language (Software Genre)
Id: h4B7txfsqvs
Length: 77min 46sec (4666 seconds)
Published: Mon Nov 23 2015