Thank you. First of all, does everybody have
a handout? This is how to install Python on your own computer. If you’d like to do that,
it’s really easy, at least the way I found here. And for next time, it’d be good if
you actually have Python on your computer. So this is your homework. So what we’re
going to do today is just go over some really basic stuff. I’ll try to show it instead
of just tell it. How many people already have some Python experience? Okay, okay. So hopefully
this won’t bore you if you already know some stuff. And feel free to ask questions
at any time, or if you have seen a different way to do something and want to ask about
it, go ahead. So, just to recap some stuff from last
time about what Python is. It’s a general-purpose programming language. It’s not specific
to a particular task like Matlab is, or GEMPAK or something. It’s designed to be very
readable and very easy and fun to program in. It is an interpreted language, which
means as soon as you type something it does it. As opposed to a compiled language, where
you type something, you save it, you compile it, you save that, you run that, and there’s
many steps and you have to deal with a lot of overhead. So, because it’s interpreted
you can use it interactively it will respond right away, or you can save your code and
run it later with the interpreter offline. It’s also very modular, which is one of
the reasons it’s very popular. It’s very flexible, there’s a core language that you
can extend with all these different modules. And we’ll talk more about that in a minute.
Another reason it’s popular is because it’s free, you can go out and get it and download
it and not pay anything. And it has a big community of people working on it and there’s
a lot of activity and improvements happening all the time, so it’s just this big thing.
It’s been around a long time: the actual language was started in 1991, and
the name Python comes from Monty Python, as in the comedy troupe. Now, it didn’t start out as a language for
scientific computing, it was just a language, a programming language. But what happened
is over time, people started writing modules, these extensibility parts that add functionality
to the language. They wrote these modules that started to do useful things, so that scientific
data analysis and model running and that kind of stuff can be done. So the modules,
they’re a big part of this. If you just had the core language, you can do some things,
but it would be a lot harder. The modules make things easier and add a lot of depth
to things. So, let’s see. Mainly what I’m going to be talking about is the collection
of modules called SciPy, scientific python. And those are the core of the scientific computing
part of Python. So there’s Numpy, which gives you numerical arrays, very similar to
matrices in Matlab. Then you have Matplotlib and Pyplot, which let you make figures very
similar to Matlab. Basically, they said we’re used to Matlab, but it costs too much and
we don’t like the way it works, so we’re going to do all that same stuff in Python.
So, a lot of the scientific stuff in Python is sort of similar, not identical, but similar
to how it’s done in Matlab. There’s another one called Pandas, which I probably won’t
get to this time, hopefully next time I could show that a little bit. That has more data
handling stuff for larger data sets and stuff. So if you take SciPy and you put it with Python,
you get some powerful programming abilities. I just want to show a couple webpages I came
across. This one’s talking about how Python is like the most popular introductory language
for computer science now. And then, Nature Magazine is telling scientists to try it,
so I think that’s pretty interesting. This one over here was from February this year. Alright, so just to cover what we’re talking
about today. It’s going to be how to get started with Python on your own, certain commands
and concepts and modules that you need to have some familiarity with. I’ll show examples
for loading and saving data. This is like really basic stuff. Examples for creating
plots and saving the images and hopefully I’ll get to also where you can go for more
because there’s a lot of stuff out in the world and I’m only going to be able to cover
a very small part. And then in two weeks, that’s where we’re going to get a little
more in depth and go into just examples. One of the things I’d like you to do is, my
e-mail address is on this handout, send me ideas for examples that you want to see. I’ll
come up with some if I don’t hear from anyone, but you know, the more interactive this is,
the better. If you have something you know how to do in Matlab, say or some other language
and you want to know how to do it in Python you can offer that as a suggestion. It’s going to happen anyway. So, in order to use Python, you have to have
Python. The handout goes into the details of installing something called Anaconda, obviously
another snake name. Anaconda is a distribution of Python. Because Python is free software
and all these modules that I’m going to talk about, they’re all free, they’re
all out there. But if you wanted to go collect them all and put them on your computer, it
would take a while and you’d have a lot of hurdles you’d have to jump over and it
wouldn’t be fun. So these distributions, they go out and do all this work for you and
they make it an easy download and click on it kind of installer, that gives you pretty
much everything you need, not 100 percent, but enough that you can start and there’s
also some infrastructure and tools that you can go out and fetch other pieces that you
might need. Because they don’t give you everything because they don’t know exactly
what you need but they give you a lot of it. So, the thing is that, these distributions,
they’re made by companies, for-profit companies. The distributions themselves are free, but
they’re kind of like a hook to try to get you to buy something of theirs. And someday
they might not be free, but for now they are. And if they did become not free, someone would
come along and make a free one. So, for now I recommend this Anaconda one, it’s complete,
it’s really easy. Even though it looks like there’s a lot of instructions it’s really
not that big a deal to install it. You basically pick the one you want to download, download
it, run it and you’ve got it. You’ve got everything I’m going to show you. Okay, let’s move to the next thing. So Python
is a developing thing. These modules, this code that’s out there for it, it’s all
under development, and under work. So things change over time. Python has been around a
long time. Version 1 is like long gone, but version 2 has been around for many
years. Version 3 has been around for 5 years or so and they’re trying to get everybody
to switch from using 2 to 3, but there’s always a lot of inertia with that kind of
thing. So, when you go to get the Anaconda stuff, you’re going to have an option between
version 2 and version 3. So, I’m going to recommend version 3 because it’s newer.
There are a few modules out there that you might want to use that are not yet available
for 3, so if that happens you could always go back to 2. But I would recommend starting
with 3 because that’s the way the future is, the way forward. In fact, the version
of 2 that’s out now, version 2.7 is the last major 2 version they’re going to have.
So they’re saying we’re done, move over to 3. So, anyway. So it’s on the sheet to
do that, but I just wanted to mention it. Because if you end up writing some Python
code and getting further in it, you may encounter a situation where you need to switch back
to 2. With Anaconda, it lets you have both actually if you want. It’s not that big
a deal, but it’s something you need to know about. Okay, so I’m just going to show you these
things. Oh, let’s start with this. Do you see the URL here? try.jupyter.org. You can
do that right now on your laptops. You can go and you will have access to Python right
then and there. In fact, I’ll do it too, just to show what we’re talking about. Okay,
so this is a really cool website, that people
are essentially donating server time to let you play with Python. Now the thing is
that you get this server and you can play with it but if you’re idle for 10 minutes,
it kills what you’ve done. And it’s not for saving things. It’s not for doing your
real work. It’s for playing around. Okay, so this is something you can mess around with
today while we’re exploring. Then you go to the handout and you install the real thing
on your computer so that you can save things and use your own data. There’s no way to
get your own data into this. But if you go here, then what you want to do is go to “Welcome
to Python”. There’s some data they made available, I haven’t looked at that. So this is called
the Jupyter notebook and it’s like one of the really futuristic things about Python
nowadays. It’s a web page that has in it a connection to an actual running Python program.
And you can type code into it and execute it within your web browser and you can also
interchange between text and code. And so, what you do, well it says right here what
to do; run some Python code. So you click on this box that has some code on it. You
press shift, enter and it computes for a minute. The little star shows you that it was working.
The 1 means it’s finished and now it has created this figure. So you can see, in this
much code, you can make this figure. And you are free to sit there at your desk and edit
this and play with this and see what it does and you’re not going to hurt anything or
break anything, and just have fun. But I’m going to go through sort of the commands that
you kind of need to know to get anywhere with this. But again, if you're working on this and you
don't do anything for 10 minutes It's going to erase what you've done. I think you might
be able to save your work. Yeah, you can download this if you want. If you download the IPython
Notebook, IPython is another name for Jupyter. If you download the .ipynb file you can save it on
your computer and then once you have your own Python you can bring it back up and running.
So anyway, if you have your laptop today you can play with this, but I'm not going to mess
with this particular one. I have my own notebook that I'm going to use. So the notebook is
a way to interact directly with Python. There's other ways to talk to Python, let me show
you. Is that readable? So what I can do is that I
can just type python. So if I type python it gives me this command prompt and says hi
I'm Python, what do you want me to do? So you can start typing Python commands like
print("hi there"), and it does what you say. But this is like the least useful environment
to interact with. This is really the one meant to run non-interactive programs, so let's
show that. So if I make a file, this is an empty text file using
my favorite text editor, you can use whatever one you like, and you can start typing commands
in here. So you can say print("hello world, from Python"). You save this and then you say
python test.py and it runs your program. So that is the non-interactive way, that is the
save your code, run your code way, which is very good if you have stuff you need to reuse
a lot and I feel like you want to do the interactive stuff when you're exploring and then once
you've gotten everything sort of figured out, then you write it into a script so that you
sort of keep a permanent copy of that. So Jupyter is like the notebook,
but the notebook is part of Jupyter. Jupyter is another way of talking to Python and in
here it's always going to be interactive. So you can say a = 12, print(a), I meant
lower case a, and it does what you say. You can say %run test.py, the percent means
this is not a Python command, this is a Jupyter command and then it reads the file and it
does what it says. So I'm not going to be using these scripts a lot right now because
I want to do the interactive stuff but just to show you how that works. Are there any
questions right now? Yeah, IPython has been renamed to Jupyter,
so IPython was the interactive Python. What they've done is generalized it. So Jupyter
can now do Python through IPython, but Jupyter can also do other languages like R and Julia
and other stuff, so Jupyter has gotten so big it's more than Python even. So Jupyter
is this environment that you run things in, but it defaults to Python typically, so it's been
called IPython for a while, they just renamed it so you're going to see both names. Any
other questions? I'm sure you can, but I'm not sure how right
now because I think ArcPy comes with the Arc software. So I know you just have
to put it in the right place. Of course. So when you start dealing with
code that's written by a third party, you have these issues. In that case, you want
to install the Python 2.7 32-bit and then you could, I don't know exactly the instructions,
but you would take the ArcPy stuff and put it in the right place so that Anaconda
can see it. Maybe. There may be other modules. My understanding is that Arc is proprietary
so it's probably not going to be available through the normal channels, but I think there
are other modules out there, so I'll write that down as something to look into for next
time. Any other questions right now? Alright, let's see. So going back to here,
oh Spyder, yes. Another thing that comes with Anaconda is Spyder, which is what's called
an integrated development environment. This is kind of like when you run Matlab and it
brings up this big window with all these sub windows and things. This is the Python version
of that and it'll take a minute to load. But what this is, is like a full environment for
you to edit Python code, run it, debug it, all kinds of stuff like that. So this is yet
another way to interact with Python. So here, this window here is a text editor which is
Python intelligent, so it knows what you're typing and what it means and it can
help you type it faster and then over here is like IPython or Jupyter, you can type things
in here and get immediate results and then up here is like where you can find information
about stuff as you type it, so if you're not sure how a particular module works you can
look it up on the right side. I haven't used this a whole lot but it looks promising, so
just another option. So if I do Jupyter notebook, what that does
is it starts running the notebook, so what the notebook is is a web server that it runs
on your own computer and then it brings up your browser to connect to that web server
and the web server underneath talks to Python. I'm going to actually use the other browser
for this. So this is the Jupyter notebook where it starts
and it gives you your files and you can look at them. So anything that is .ipynb is
a notebook. And you can open those and play with them. So let's start with basic Python.
So what this has is a collection of commands that I've typed in and text that I've done
and we can go through and edit them and run them and see what they do. So let's start
with that. So we're going to start with really basic things and again stop me if you have
questions. So the first thing you usually want to know how to do is print things and
what I'm just going to do is go through and hit the shift return in each one and it's
going to execute when I do that. So you see 1 here means this one has been executed and
it was the first one to be executed. That's important because if you wanted to you could
execute these out of order and if you did that what happens is that the outcome of each
one affects the ones you do later. They're all in the same memory. So we said print,
hello world, and it printed hello world. This is always the first program anybody does in
a computer language. So, a couple of things here. There's the print function, which you
send it something and it does something with it and the thing that we're sending it is
a string, it has these quotation marks. A string is just some text, some letters or
numbers that you want to display or produce something with. So we tell that to print that
and it prints it, okay great. That's not very useful yet but it's something. So now we have
some variables. So this is where you have information and you want to store it somewhere.
So I have a variable I call message and that is equal to this string, Hello world, and
these things here are comments. If you're not used to that in a language, that is stuff
meant for the humans, not for the computer, so anything after the hashmark is ignored
by Python. It's just for our readability. You can have an integer, like this one and
you can have floating point numbers, so these are like the three really basic stuff. And
notice that I'm trying to use good names for these, not just blah and x and stuff like
that. You always want to name things well and especially in science contexts, you probably
want to indicate units somewhere, so I put this, it's in Fahrenheit for temperature.
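A minimal sketch of the three kinds of variables just described. The names message and the_answer come from the session; body_temp_F and the number 98.6 are assumptions made for illustration.

```python
# Three basic kinds of values, each stored under a descriptive name.
message = "Hello world"   # string: text inside quotation marks
the_answer = 42           # integer: a whole number
body_temp_F = 98.6        # floating point; the name and value are assumed, _F marks Fahrenheit

# Assignment only stores the values; nothing is displayed until they are printed.
```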
So if I execute this, it's not printing anything, it's just storing things. So if we want to
see them we have to print them. So let's print these, let's do that. This one's been run
as second, this one has been run as third. First it prints the message, so that says
hello world. Then it prints this thing, the answer to the ultimate question is, and then
the answer. So that creates this one line where it's got two pieces of information on
the same line, the string that I gave it followed by the number that was stored in the answer.
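The two print calls described here might look like the following sketch. The exact wording of the string is taken from the spoken description and may differ from what was on screen.

```python
message = "Hello world"
the_answer = 42

# Print a single variable.
print(message)

# Several items separated by commas are printed on one line,
# with a space inserted between them.
print("The answer to the ultimate question is", the_answer)
```

Running this shows Hello world on the first line and The answer to the ultimate question is 42 on the second.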
Let me scroll up. And then, I created another string here just to represent the units in
a human readable way. Oh wow, this is scrolling off as you can see. So what I've got here
is I've got this string, this string here is called a format string. It contains text
that you want to print and instructions for including other information. So the other
information is represented by these curly braces here, so {0} means the first
thing that you give it and {1} means the second thing you give it, and what
I mean by giving it is that I call this .format. So it's string .format and then
the information that we want to stick in this string. So I'm giving it this body temperature
f as the first thing and then I'm giving it units as the second thing. So units is going
to go here and body temp f is going to go here. Now the :.2f is the instructions
for how I want the body temp f to be shown. So what that's saying is that you should write
that number with two places after the decimal point. That's often useful if you want the
numbers to line up and stuff like that. So these are examples of different ways you can
output stuff. Any questions here? Confusion? I just threw that in here and tried to keep
this short, but it's still not short enough. Now you can also do it this way, let me show
that. So this is another way that you can do it. You
can actually store the format string in another variable and then use that to do the formatting.
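A sketch of a format string kept in its own variable. The variable names (body_temp_F, units, template) and the wording of the text are assumptions; only the {0}, {1} and :.2f pieces come from the session.

```python
body_temp_F = 98.6
units = "degrees Fahrenheit"

# {0:.2f} -> first argument to .format(), written with two digits after the decimal point
# {1}     -> second argument to .format(), inserted as-is
template = "Body temperature is {0:.2f} {1}"

line = template.format(body_temp_F, units)
print(line)  # Body temperature is 98.60 degrees Fahrenheit
```

Keeping the template in a variable means the same layout can be reused with different values.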
It doesn't matter. It does make it a little easier to read, but it's the same behavior.
Any other questions? Let's do a little math, so let's convert the
Fahrenheit to Celsius. So this has all the basics of math, you have parentheses, you've got
subtraction, multiplication and division, you can also do addition of course. So that's
pretty straightforward I think and then if I print this, oh no I got to run it, so that
gives us 37 degrees Celsius. Now because this is an interactive thing, you don't actually
have to print everything to see it. So if you just type a variable on a line by itself
and run it, it shows you that result. This is the input. This is the output. When you
use print, it doesn't say out, it's just doing what you told it to do. So this is what you
told it to do and this is the sort of implicit output that you want to see. But if you have
two such things like that it only shows you the last. So this does not print the message
followed by the answer plus three, it just prints the last line. So if you did want multiple
things to show up you would need to use print for that. It's just different ways of interacting
with the notebook basically. Alright? Questions? And feel free, you can type this kind of stuff
into the trial Jupyter thing and play with it and do whatever you want. Let me show you something about ordering,
so this is specific to the way that Notebook works. If I were to go up here to this one
that I’ve already executed and hit shift enter again, you see that the 2 turned into
an 8 here. That means this cell has now been executed eighth in the sequence. So even though
that’s not the order they’re presented on the screen, that’s the order that Python
remembers things. So you can actually go back and edit these and re execute them and that
will update the situation that Python sees. So, I didn’t change anything so it really
made no effect. But if I, for example, changed this to 24 and then did it. It would change
that value of the answer. Now if you go down here, to where I’d done this, it still shows
42 because that’s what it was at the time that I ran it. But if I run it again, now
it’s updated to the new value. So all these things are in memory somewhere and you’re
messing around with what’s in memory and that could affect the outputs that you’re
seeing. Alright, let’s see here. Now we have to
get into slightly more complicated things. But these are the things that start to make
Python really useful and simple compared to some other languages. So one of the main types
in Python is the list. So this is, instead of a single value, it’s a collection. So
you see the square brackets and commas so I’m storing 4 separate members into temps_C.
Doesn’t have to be 4, it can be 4,000, whatever. And then, this one is demonstrating that they
don’t all have to be the same kind of thing. This one’s a string, this one’s a floating
point, and this one’s an integer, they’re all different, but that’s okay. So, another
way to do this, I could do list3 = [], so that’s an empty list, it contains nothing.
I can say list3.append(the_answer) and list3.append(654321). So, if I run that, and then
let’s stick here print(list3). Okay, so I’ve printed these 5 things, so let’s
look at that. You’ve got temps_C, which was these 4 numbers, so it prints those 4
numbers, just as you’d expect. Now what I’m doing here is I’m saying the length
of temps_C equals, and I’m calling the len function, L E N, short for length. So when
you call that and you give it a list, it tells you how many things are in the list. So since
that’s included in the print, then it’s sticking that here. So the length of temps_C
equals 4. So the 4 is coming from len. Then I print arbitrary list, which is this list
of things and you can see it’s putting quotes around the things that are strings and it’s
showing the different types, the integer and the floating point. And doing the same thing
here with the length of that list, you see the length of that list is 3. You know, 1,
2 , 3. And then I’m printing list 2, which I built in a different way. And that one is
here. Any questions about that? Okay. Right. Well. So, this list is not the one
you use for mathematical stuff. This is a generic way of storing information. You will
use lists to, so you can have lists of lists. So you can have something, we can call this
matrix and I can say 1, 2, 3, 4, 5, 6, 7, 8, 9. Now, I got here, a list containing 3
lists. And each list contains 3 things. You might think that that would be useful for
math, but that’s too cumbersome. We’ll get to the real thing for that in a minute.
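The list operations covered so far can be sketched as follows. Apart from 12.3, 654321 and the nested 1 to 9 values, the contents of the lists are made-up placeholders, and the spelling of the names arbitrary_list and list3 is an assumption.

```python
the_answer = 42

temps_C = [12.3, 15.0, 9.8, 21.4]          # four numbers in one list
arbitrary_list = ["a string", 3.14, 7]     # members may have different types

list3 = []                                 # start empty...
list3.append(the_answer)                   # ...and add members one at a time
list3.append(654321)

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] # a list containing three lists

print("The length of temps_C =", len(temps_C))
print(arbitrary_list, len(arbitrary_list))
print(list3)
print(matrix)
```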
But the concept is very similar to what we’ll see in a minute. So if we do print matrix,
it prints that. And if we did, we’re going to insert a cell below here, type matrix.
Yeah, it’s the same thing. Sometimes it prints them differently. Never mind. So, yeah,
it’s not a row or a column, it’s a list. And we’ll come to rows and columns in a
minute. Another thing is if you have a list, you probably want to grab things out of the
list instead of dealing with the entire list. So if you use square brackets after the list
name and put in an index, then that allows you to retrieve the item at that location.
But this may be non intuitive to some people, Python starts counting at 0. So, the first
thing in the list is at location 0, the second thing in the list is at location 1. So it’s
basically, it’s like an offset, how many away from the beginning is it. So, you’ve
got to remember that because like in Fortran and Matlab and stuff they start at 1. In Python
they start at 0. It’s going to throw you off if you’re not careful. So, temp_C of
0, should give us the first number in the list. And then, what I’ve got here is 1:3,
and if you’re used to Matlab you might recognize that idea of starting at 1 and going to 3.
But Python is a little tricky there. Let’s execute this, so I can how you. So, alright
so, remember this one here is temp_C. So if I say 0 that gives me 12.3 which is the first
one. If I say 1:3, you might expect to see this one this one and this one. But Python
doesn’t do that. It gives us this one and this one. Because what Python does, and there’s
a programming reason for this, it never gives you the last one. You go up 2 but not including
that one. So 1:3 means start at 1, go up 2, but not getting to 3. So that means you get
1 and 2. And so if that’s 0, and that’s 1 and that’s 2 then you get those 2 as your
answer. It’s a little weird at first, but it’s very consistent with the way Python
does a lot of different things so it’s important that it works that way. Now this one, also if you know Matlab, you
know that there’s a way to specify the beginning of the interval, the end of the interval and
a skip so you don’t get every single one. Now, in Matlab, the skip goes in between but
in Python the skip goes on the outside. So this is where you start, this is where you end,
and this is how much you skip. But what does a minus 1 mean? Minus 1 means from the end.
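A sketch of the indexing and slicing forms from this part of the session. Only the first value, 12.3, and the later 88 are from the session; the other numbers are assumed.

```python
temps_C = [12.3, 15.0, 9.8, 21.4]

first = temps_C[0]           # counting starts at 0 -> 12.3
middle = temps_C[1:3]        # positions 1 and 2, stops before 3 -> [15.0, 9.8]
last = temps_C[-1]           # negative index counts from the end -> 21.4
skipping = temps_C[0:-1:2]   # start:stop:step, stop is still excluded -> [12.3, 9.8]

# With a fifth member, stopping at -1 still leaves the last one out,
# while omitting start and stop covers the whole list.
temps_C.append(88)
still_excluded = temps_C[0:-1:2]   # [12.3, 9.8]
whole_list_by_2 = temps_C[::2]     # [12.3, 9.8, 88]
```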
So minus 1 means the last one. So, I’m saying I want to go from 0 to the last one, skipping
by 2s. So that gives you this one, then we skip that one we do that one, we skip that
one there’s nothing left, so we’re done. So it gives you the first and the third. Yes? Let’s find out. Let’s go up here and add another one. So
I have to re execute this one for that to show up. I’ll re execute this one just so
we can see, there’s the 88. Do this one again, it didn’t do it. So, I couldn’t
remember which way it would go actually. I think that the minus 1 says that one, so it
never does the last one. But I think, if I remember right, we can do this. If I don’t
say anything at all, that means all of them. In fact, I don’t even have to put the 0
in this case. Colon, colon means all of them. And then the 2 is just saying every other
one. You’ll have to play with that to get used to it. But, that’s just how it is. Any other questions so far? Okay. Alright, dictionaries! That’s the other
big deal in Python. This is the kind of thing that in other programming languages like C
or Fortran, would take many many many lines of code and be very cumbersome to deal with;
in Python it’s quick and easy. So what a dictionary is, it’s kind of like a list, except
you label the things in the list. So, in this case, I create an empty dictionary with the
curly braces instead of the square brackets. And this is kind of like up here, like when
I said bracket 0 I meant the first element in that list, except, this isn’t the list,
it’s a labeled list kind of. So, instead of giving numbers here, I give labels. And
the labels can be anything. The labels can be numbers, but they’re not indexes, they’re
just arbitrary numbers. So I could also say, quantities[5000] = '5000'. So, what I’m doing
is I’m creating an entry for each one of these lines. I’m creating an entry in the
dictionary and I’m saying I want to associate with this thing, this value and I want to
associate with this thing, that value. And they don’t even have to be numbers. You
can say, something like that. And so, when I print it, I get this curly brace thing that
says the label of milk goes with the value of 1, the label of egg goes with the value
of 6, the label 5,000 goes with the string 5,000 and the label bread goes with 0. You’ll
notice they’re not in the same order that we put them. The order is arbitrary. [Audience member asks question] Potentially.
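A minimal sketch of the quantities dictionary, assuming the keys and values read out in the session. The lookup of a missing key, and the name missing_key, are added here for illustration.

```python
quantities = {}              # curly braces make an empty dictionary
quantities["milk"] = 1       # label (key) on the left, value on the right
quantities["egg"] = 6
quantities[5000] = "5000"    # keys can be numbers, values can be strings
quantities["bread"] = 0

print(quantities)

bread_count = quantities["bread"]   # look a value up by its label -> 0
has_soda = "soda" in quantities     # check for a label without raising an error -> False

# Asking for a label that was never stored raises KeyError.
try:
    quantities["soda"]
except KeyError as err:
    missing_key = err.args[0]       # the label that was not found
```

On Python 3.7 and later a plain dict prints its entries in the order they were added; the arbitrary order seen in the session presumably reflects an older interpreter, where no order was guaranteed.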
You can’t rely on that order. It’s the order that it appears in the computer’s
RAM, which is based on a lot of different things. But usually, you don’t use one of
these for ordering purposes. I think there is a variation that you can specify order,
but I never used that. But what you do with this is, this is for looking up things, it’s
a dictionary. So it’s like, well I didn’t know, like instead of trying to remember that
milk is the first one on the list for example, I don’t have to remember that, I just say
I want the one that goes with milk in my dictionary. So I can look it up. So if I say print(quantities['bread']),
then that prints the number or the value that was associated with bread,
which is 0. Now then I try to look up something that’s not in the dictionary, oh, and I get
an error. So this is telling me, this is the type of error you’ll see in Python. It’s
not super user-friendly, but you can start to get some ideas. Here it’s pointing, it’s
trying to show you the line of code that it doesn’t like. It’s saying, okay this is
the one that I had a problem with and then it says KeyError: 'soda'. So you have to know,
to understand that, that keys are what these things are. So KeyError means that key is
not in the dictionary. So you can always Google that or something. Now, it’s not clear until you start doing
things with these why they’re useful. But the combination of list and dictionary, you
can do a whole lot of different stuff with that. So to illustrate, I’m going to do
this. Well let’s go through the code first. So temp_data is an empty dictionary. I’m
adding something called a name and giving it this value. I’m adding something called
units, giving it that value. And then a thing called values and I give it a list. So now
I’ve got a list inside of a dictionary. That’s where it starts to get useful. You
can start to build these things inside of each other. Then I create another dictionary
and I say data['temp'] equals this whole dictionary. So now I’ve got a dictionary inside a dictionary,
which has a list. So you can create this whole infrastructure. And then data['press']
is another variable that I would come up with. It has a name of pressure it has units of
millibars and values of whatever. So the idea is that you could, for example store information
with not just numbers but also associated useful things. So let me run this, and it
prints out kind of ugly but the outer curly brackets is the entire dictionary and press
is the label, colon and that associates with another dictionary which goes from here to
here and then temp is another dictionary. And within each one you can see the name and
the values and units. Who’s confused? I mean, any questions? It’s
not; I know it’s not completely clear yet. But these are the building blocks of writing
code in Python. So, the thing is that when you start to use these modules, I haven’t
used any modules yet, this is just the core language. When you start to use the modules,
they take these ideas and run with them. So they do things that are very similar to dictionaries
and lists, but aren’t quite. They actually have other behaviors and abilities and performance
and stuff. So I’ll try to show that next. Okay, so I need to talk about modules a little
bit more. So the modules are these bits of code that get written by people and added
on to the language. They extend the functionality in different ways and the idea is that they
are usable not just for one program, but for many. And in order to have the module, you
have to install it. Now if you install Anaconda, it comes with hundreds of modules but there
might be occasionally one or two that it doesn’t come with. In fact, the last line of the instructions,
suggests to install a couple of modules that it doesn’t come with, which might be useful.
So, here’s some not necessarily scientific, but generally useful modules. So the math
module gives you things like log and sin and cosine and all that. And then you’ve got
the os module, which lets you talk to the operating system and find files and things
like that. The datetime module lets you deal with dates and times in a semi sane way. Now
the scientific modules, I mentioned these a little bit. So you’ve got the numpy, which
gives you numerical arrays. Numerical arrays are like lists but they’re more suited for
math and for containing lots of data, like numerical data. You got the pyplot, which
is for making figures. I’m going to show that. Scipy would be if you needed to do actual
number crunching. Fourier transforms, optimization, what was the other one, interpolation and
then like linear algebra type stuff. Pandas is for dealing with spreadsheet like data.
netCDF4 is for if you’re doing any kind of geophysical data, which usually comes in a format like that.
And xray is a way to deal with NetCDF data in a little easier way. Alright, so the core language is what you
get when you start Python. In order to grab one of these modules and make it available,
you use the import command. So, like if I said import numpy, then that would make the
numpy module available. It would just appear and I can use it. But every time you use something in
that module, you have to specify the name of the module, and then a dot, and then the
thing that you’re using. Now, you could also alias that. So if I said import numpy as np,
which is commonly done, then instead of saying numpy dot whatever, I say np dot whatever.
It’s just a way of saving typing. You could also specify a particular thing you want.
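As a rough sketch, the three import forms mentioned here might look like this in a notebook cell. The array contents are made up purely for illustration.

```python
# Form 1: import the whole module and use its full name as a prefix.
import numpy
a = numpy.array([1.0, 2.0, 3.0])

# Form 2: import the module under a shorter alias (np is the common choice).
import numpy as np
b = np.array([1.0, 2.0, 3.0])

# Form 3: import one specific name; it is then used without any prefix.
from numpy import ndarray

# All three forms refer to the same underlying module and class.
print(np is numpy)              # the alias is just another name for the module
print(isinstance(a, ndarray))   # arrays made by numpy.array are ndarray objects
```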
So I can say from numpy import ndarray and then I don’t get everything, I just get
that. And it doesn’t stick the name on it, you just use ndarray. [Question from audience] Yes. Well, so yeah
let’s see. I didn’t even mention that. So there’s a help system. So if I’m interested
in the module numpy, I can ask for help on numpy. There’s different ways of doing this,
but this is like a simple one. So this gives you sort of the rough documentation about
numpy. Numpy’s pretty big, there’s a lot of stuff in it, so there’s a lot of text
here. But usually, the easiest thing to do is to go find the online documentation. But
Python knows what’s in there and it can show you, but it can be a little overwhelming
when it shows it to you like this. Look how long this scrollbar is. So that’s probably
not the best thing to do, but it demonstrates that there is help. If I said import numpy
as np. Now it’s imported. And I say np dot tab. If I hit the tab key, this is specific
to the way that Jupyter works; it actually gives you the list of things you can type
next. Which is very big because I’m talking about an entire module. So there’s all these
different things I could do. So these are different ways of finding out, but they don’t
have a great way of like pinpointing exactly what you want. [Audience member] Yeah, yeah, yeah. Like some
modules are really big, it takes a while to bring everything in. You’re probably not
going to do that third one very often, but that’s for particular cases. Yeah, good
question. Any other questions on importing? Alright, let’s go to the examples. So, just
one other thing before I get to the other stuff. So, if you wanted to write a function.
So that would be where you have some code that you want to reuse for a lot of things.
Then here’s how you do it. You say def for define, then the function name, and then the
inputs in parentheses, and then a colon. And then you’d have some code here
and then you’d return whatever the function gives back. I didn’t show things yet like
looping and if statements. I don’t know if I’ll have time now but maybe next time
I can show that. But, so now that I have defined this function, I can call this function. If
I don’t run this cell it won’t work. So I run it, then I call it. So now I’ve converted
20 degrees Celsius to 68 degrees Fahrenheit. Alright, so here’s an interesting thing.
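A minimal sketch of the function and the list-building expression (a list comprehension) discussed here. The function body is an assumption, namely the standard Celsius-to-Fahrenheit formula, and every value in temps_C other than 12.3 is invented for illustration.

```python
def convert_C_to_F(temp_C):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_C * 9 / 5 + 32

# Calling the function on a single value.
print(convert_C_to_F(20))   # 20 C corresponds to 68 F

# Hypothetical Celsius temperatures; only 12.3 appears in the talk.
temps_C = [12.3, 15.0, 9.8]

# List comprehension: run every item of temps_C through the function
# and collect the results in a new list.
temps_F = [convert_C_to_F(t) for t in temps_C]
print(temps_C)
print(temps_F)
```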
So here, my temperatures are in Celsius, and if I want to call this function on all of
them, I do this sequence. This uses the square brackets to represent a list, but instead
of the actual items in the list, it’s the code to generate the list. So, what it says
here is for t in temps_C, which means go through each thing in temps_C and call it t and then
do this for that, for ‘t’. So then I call convert_C_to_F on t. So what it’s doing
is it’s automatically looping through the first list, taking each item, sending it into
the function, taking the result, and putting it in a new list. So you can see the two lists
here. It took 12.3, it ran it through this function, and got back 54.14. And it did the
same thing for each value. You don’t have to do that too often. This is just to show
some of the differences between the basics and the more useful things. Let’s skip that. This is a little bit out
of sorts here. So let’s start with numpy, and what I’m going to do here is actually
load a file. So let’s look at that file real quick. So these are some temperature
numbers that I grabbed from the Rutgers PAM site. Which is an instrumented research facility
nearby. These are hourly temperatures from October of 2013. And it’s just a list of
numbers so there’s no way to see the context of where this came from. But those numbers
are in there. And if I go back to my notebook and I do this line here, then I can look at
that. And now it’s got all these numbers. It just crashed. My browser crashed. That
is not good. It must have been too big. Let’s do this again. Sorry. So this is, this is
the first 10 items in the list, instead of the whole thing, which apparently crashes
the browser. But it’s basically a bunch of numbers. But you see here it’s got the
square brackets that the lists had before, but you see wrapped around that is array.
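A sketch of what loading such a file might look like. The file name and the sample values are hypothetical stand-ins, and the sketch writes its own small file first so that it can run anywhere. Note that np.savetxt takes the file name first and the data second.

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-in for the real data file: one number per line.
sample = np.array([12.3, 11.8, 11.2, 10.9, 10.5, 10.1,
                   9.8, 10.4, 11.9, 13.6, 15.2, 16.7])
path = os.path.join(tempfile.mkdtemp(), "temps_hourly_example.txt")
np.savetxt(path, sample)          # file name first, then the array

# Read the text file back into a one-dimensional numpy array.
temps_hourly = np.loadtxt(path)

# Look at only the first 10 items rather than printing everything.
print(temps_hourly[:10])
print(type(temps_hourly))
```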
So array is the numpy thing. So I called numpy loadtxt and I gave it the name of the text
file that has numbers in it. And it read those numbers and it converted them into numeric
values and stored them in this long vector essentially. And so, what I’m going to do is skip through
all these lines and go to the bottom because it’s easier. So if I say temps_hourly.shape,
this is telling me the dimensionality of that variable. So it’s saying it’s 744 items
long with no second dimension, basically. It’s one-dimensional. But I could do this temps_daily = temps_hourly.reshape,
hopefully I’m doing this right. So now temps_daily. Alright, so what I’ve done here is I took
a shape that was 744 lines long and I said now rearrange that so it’s 31 rows by 24
columns. So now I made it two-dimensional. And I’m asking for the shape and it says
31 by 24. Then what I say here, this is like Matlab. I say give me the 0th row, which is
the first row and give me all the columns. So with a comma here I am telling it what
I want for each dimension, the first dimension comma the second dimension. So it’s giving
me here the 24 temperature values for the first day in October of 2013. And what
it’s not showing is all the rest of the rows because it’s too much information.
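The reshaping and indexing steps can be sketched as follows. Synthetic numbers stand in for the real hourly temperatures (an assumption made so the block is self-contained), the name daily_means is hypothetical, and the daily mean is taken with mean(axis=1), which is one common way to average along each row.

```python
import numpy as np

# Synthetic stand-in for 744 hourly values (31 days x 24 hours).
temps_hourly = np.arange(744, dtype=float)
print(temps_hourly.shape)          # one-dimensional: (744,)

# Rearrange into 31 rows (days) by 24 columns (hours).
# numpy fills rows first, so row 0 holds the first 24 hourly values.
temps_daily = temps_hourly.reshape(31, 24)
print(temps_daily.shape)           # (31, 24)

# Row 0, all columns: the 24 values for the first day.
first_day = temps_daily[0, :]
print(first_day)

# Average across the 24 hours of each row: one mean per day.
daily_means = temps_daily.mean(axis=1)
print(daily_means.shape)           # (31,)
```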
So then what I can do, since I reshaped it, I made each day have its own row. I take
the mean along each row and I get the daily mean temperature for every day. So I took
some data, I read it. I rearranged it and I calculated something useful. Any questions so far? Yeah, just hourly numbers. Yes. So, so well
if you have this long list and what I’m saying is you know, this 744, which is 31x24.
So what it’s doing is it’s taking the first 24 and making that the first thing.
And then it’s taking the next 24 and making those the next thing like that. So it’s
just taking it and it’s, so whereas the first position is here, the first position
is here. And it’s going down here and going to the right here, basically. And then it
wraps around when it gets to this one. Does that make sense? Yeah. Right. That’s because
Ferret uses the Fortran ordering of data. It’s completely arbitrary, but each language
can do it either way. So, in Ferret, things are stored essentially down the columns, whereas
in Python, it’s based on C, which stores things in rows first. It’s just the order
that it is in memory. So, yeah, that’s a good point. If you’re used to Ferret or,
I think Matlab might even be the same as Fortran too in some ways. So this might be a difference
from Matlab but I’m not quite sure. Well you could transpose the result. You know,
all the types of matrix operations are available, so. Okay, here we go. So, wait before I do that,
I want to do this. I’m actually going to replace temps_daily with its mean. So if
I say temps_daily, now I just get the actual daily temperatures. So what I’m going to
do now is, let’s import. So, matplotlib is this big module. I’m asking for a piece
of it. Matplotlib.pyplot, it’s actually a module inside of the module. And I’m naming
that whole thing as plt. So whenever you see plt, I’m referring to pyplot. And then,
what I’m doing here, so if I have my hourly temperatures; well let’s not do that yet.
We’ll come back to that. plt.plot(temps_daily). C’mon. Oh, I think I forgot an important
thing. Yes, I have to do this: %matplotlib inline, which tells the notebook to keep the figures
in the notebook. So if I do this again it puts the figure here. So we’ve made a plot.
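A minimal sketch of this first plot, written so it also runs outside a notebook. The data values are synthetic, and the Agg backend is selected only so the sketch works without a display; in a Jupyter notebook the %matplotlib inline line would be used instead.

```python
import matplotlib
matplotlib.use("Agg")              # non-interactive backend; an assumption for scripts
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for 31 daily mean temperatures.
temps_daily = 10 + 5 * np.sin(np.linspace(0, 3, 31))

# With no x values given, matplotlib plots against the index 0, 1, ..., 30.
(line,) = plt.plot(temps_daily)
print(line.get_xdata())
```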
We had some data, we’ve made a plot. The bottom, since I didn’t give it an x value,
for the data, it’s just using the index. So it’s starting at 0 and going to 30. The
y is the value in the list. Now if I wanted to plot
the hourly temperatures as well, I can do that. But the problem is they don’t have
a consistent x-axis associated with them. So, this one has 30 things in it, or 31, and
this one has seven hundred and forty-something, so it’s not really helpful. So what I do is I make
some x-axes. So the arange function inside of numpy, you tell it a start and an end and
an interval, and it creates a list of numbers that follow those rules. So, I’m saying,
I want the first number to be 1, the last one to be 32, and the ones in between are
1/24th. So these are essentially days, or fractional days, along which my hourly data
runs. This one is the same thing, except it’s daily data. So, if I then put days and hours,
now they have consistent x-axes and they go together. So you’ve got your hourly fluctuating
temperatures, and you’ve got your daily means. But now, you know, this isn’t right,
this is not, well it’s starting at 1 now, but this is starting at 0 so we can do some
things. plt.xlim(1, 31), plt.xlabel('Day in October 2013'). If you’re familiar with
Matlab, these function names would look similar. They basically stole them from Matlab. So
now, I’ve got a nice x-axis, I’ve got labels, I can do, you know, all the stuff
you can do with plots, it’s all in here. But I just wanted to show. And then the final
thing you’d want to do is do savefig and then October temperatures. So now, it’s
taking that figure and saving it to a file. So now I’ve got some output. I might... Let’s try, I don’t know, eps. Okay, that
one might be tricky. Let’s see. There’s okay, so open. Yeah, I don’t know if it
worked or if the conversion failed. Yeah, it looks like it didn’t work very well.
It doesn’t have any; it doesn’t have enough information in here. So, I may need to install
something for that to work. But we could try a different thing like gif. Nope it didn’t
like that. PNG is the one I’m used to, but I’m sure there are other ones; it’s just,
I don’t, off the top of my head I don’t know. [Audience question] Let’s see, matplotlib savefig. So, yeah there’s
stuff about it. I’d have to look into it, but. Okay, and then, so I can save the figure.
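Putting the plotting steps together, a sketch might look like this. The temperature values are synthetic and the output file name is hypothetical. The hourly axis is built as 1 + hour/24 rather than with np.arange(1, 32, 1/24), a deliberate swap to avoid floating-point surprises in the number of points. PNG is used because it is the format that worked in the session.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")              # non-interactive backend; an assumption for scripts
import matplotlib.pyplot as plt
import numpy as np

# Synthetic hourly temperatures for 31 days, with a daily cycle.
hour_index = np.arange(744)
temps_hourly = 12 + 4 * np.sin(2 * np.pi * hour_index / 24)
temps_daily = temps_hourly.reshape(31, 24).mean(axis=1)

# Consistent x-axes measured in days of the month.
hours = 1 + hour_index / 24        # fractional days from 1 up to just under 32
days = np.arange(1, 32)            # 1, 2, ..., 31

plt.plot(hours, temps_hourly)      # fluctuating hourly values
plt.plot(days, temps_daily)        # one mean value per day
plt.xlim(1, 31)
plt.xlabel("Day in October 2013")

# Save the figure; the extension of the file name selects the format.
out_path = os.path.join(tempfile.mkdtemp(), "october_temperatures.png")
plt.savefig(out_path)
print(out_path)
```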
Now I want to save the data that I made because I generated some new data here. And where’s
that example? I think that’s np.savetxt. Oh, well yeah it supports eps but it didn’t
work. Probably because there’s something on my computer that’s missing. Tiff is another
good option. I would not use jpg for any kind of figures. It’s going to screw it up. Maybe
I could save as pdf directly. That one might be interesting. Let’s see, because Macs
are very pdf friendly. Yeah, but there’s something, there’s something not working
there. Maybe next time I’ll have that working. np.savetxt with temps_daily. That didn’t work. Let’s
see, the file name goes first, and there. There are my daily temperatures
saved off. So we’ve got the whole loop. We could read data, save data, manipulate
data, make a figure, save a figure. Those are the basic pieces for doing scientific
computing. We’ll get into a little more interesting
things next time. If you have any ideas or suggestions, please email me. If anything was
unclear, let me know. This is being recorded so you can watch it again. Any questions right
now, before we stop? I’ll do that. I’m actually working on
a website to do just that, so. Okay. Thank you.