DAVID MALAN: All right. This is CS50, and this is week 8. So for the past several weeks have
we been focusing on first Scratch and then C. And now today do we
introduce another language altogether, that of Python. Indeed, even though we've spent
all this time talking about C-- and hopefully understanding
from the ground floor up what's going on inside of a
computer and how things work-- the reality is that C is not
the best language with which to solve a whole lot of problems. Indeed, as you yourselves
might have realized by now, the fact that you have to manipulate
sometimes memory at its lowest level-- the fact that any time you want
to get something real done, like add capacity to a data
structure or grow a string, you have to do all of
that work yourself-- means that C really creates a whole
lot of work for the programmer. But ever since C's
invention many years ago has the world developed any
number of new languages-- higher level languages,
if you will-- that add on features, that fill
in gaps, and generally solve problems more effectively. And so today, we start to
do exactly that transition, having motivated this just a week ago
with our look at machine learning. Indeed, one of the tools that
we used to have that conversation was to introduce snippets
of this language, Python, because indeed it is much more of a
well-suited tool than something like C. But let's begin this transition now. We, of course, started this
conversation many weeks ago when we looked at Scratch. And yet even though you probably
found it pretty fun, pretty friendly, and pretty accessible, the reality
was that built into Scratch was quite a lot of features,
loops, and conditions, and customized functions, and variables,
and any number of other features that we then saw the week after
in C-- albeit a little more arcanely with more cryptic syntax. But the expressiveness of
Scratch remained within C. And indeed, even today, as we transition
to another language altogether, you will find that the
ideas remain consistent. And indeed, things just get
easier in many ways to do. So we transitioned to C. And
today we transition to Python. And so let's, just as
we did with Scratch, try to convert one
language to another, just to emphasize that fundamentally the
ideas today are not changing, simply the way of expressing them. So this perhaps was
the very first program we looked at in C--
arguably the simplest, and yet even then there was
quite a bit of overhead. Well, starting today, if you wanted to
write a program that does exactly that, voila! In Python, you simply say what you mean. If you want to print "hello world,"
you literally in a Python program are going to write print open
parenthesis quote unquote, "hello world." And you can even omit
the semi-colon that might have hung you up so many times since. Now in reality, you'll often see
a slightly different paradigm when writing the simplest of programs. You might actually see
some mention of main. But it turns out that a main function
is not actually required in Python as it is in C. Rather, you can simply
write code and just get going with it. And we'll do this
hands-on in just a bit. But you'll find that
a very common paradigm is to actually have code
like this, where you do, in fact, define a function called main. And as we'll soon see
through quite a few examples, this is how now in Python,
you define a function. You literally say "def"
for define, "main" if that's the name of the function,
open paren, close paren, and maybe zero or more parameters
therein, and then a colon, and an absence of the curly
braces-- with which you might now have gotten so familiar. But then indented beneath that,
generally four spaces here, would be the code that
you want to execute. And we'll come back to this before long. But this is just a
common paradigm to ensure that at least one function in a
Python program is called by default and by convention, we'll
see-- it's called main. But the reality is that the program
can now be as simple as this. So let's distill some
of the fundamentals that we first saw in Scratch, then saw
in C, and now see in Python as well. So Python has functions and it also
has something called methods-- but more on that when we talk about
object-oriented programming. But a function in C for printing "hello
world" might have looked like this. Notice the printf for
printing a formatted string. Notice the backslash
n that's inside there. Notice the semi-colon. In Python, it's indeed going
to be a little simpler. We can distill that to just this. So we're not going to use printf,
we're just going to use print. We don't, it turns out, have to have
the backslash n in this example. You're going to get that for free. Just by calling print are you going
to get a trailing newline printed. And we don't, again, need
the semi-colon at the end. Well, what about loops? Well, in Scratch, we
had the repeat block. We had the forever block and
some other constructs still. In C, we had things like for loops
and while loops and do while loops. Well, let's do a couple of conversions. In C, if you wanted to do something
forever, like print "hello world" again and again and again, never
stopping, one per line, you might use a while loop like this. In Python, you're going to do
something pretty similar in spirit, but it's going to be formatted
a little differently. We still have access
to the while keyword. The boolean value true now has to
be capitalized with a capital T. And again, instead of
using curly braces, you're going to use a colon
at the end of this statement and then indent all of the code beneath
it that you want to happen cyclically. And again, we've borrowed print
"hello world" from before, so no semi-colon necessary there. No f and no backslash n is required. Meanwhile, if we had a
for loop in C that we wanted to say print
"hello world" 50 times, we might use a fairly
common paradigm like this. Well, in Python you can do
this in any number of ways. But perhaps one of the most common
is to do something like this, to literally say for i in range
50-- more on that in just a moment-- and then print "hello world." So this is shorthand notation. And this is perhaps the first instance
where you really see just how pedantic, how much C belabors the
point, whereas in Python you just probably with higher
frequency just say what you mean. So for implies a looping
construct here. i is declaring implicitly a
variable that we're about to use. And then what do you want i to be? Well, you want it to be in a range of
values from 0 up to but excluding 50. So you want to go from
0 to 49, effectively. And the way you can express
that here is as follows. You call this range
function, which gives you essentially a sequence of numbers
starting at 0, and then 1, and then 2, and then 3-- all the way up to 49. And on each iteration of this loop
does i get assigned that value. So functionally, what we've
just done in Python is equivalent to what we did in C, but it does it
in a more Pythonic way, if you will. We don't have access to
that same for construct as we did in C. We
actually have something that's a little easier, once
you get used to it, to use. Now how about variables? Well, recall that in Scratch, we had
variables, those little orange blocks. And we didn't have to
worry about the type. We could just put in numbers
or other such things into them. And then in C, we had to
start caring about this. But we had booleans, and
we had floats, and we had doubles, and chars, and strings,
and longs, and a few others still. Well, in Python, we're still going
to have a number of data types. But Python is not nearly as
strongly-typed, so to speak, whereas in C-- and languages
like C and a few others-- you have to know and care about and
tell the compiler what type of value some variable is. In Python, those types exist. But the language is more loosely-typed,
as we say, whereby they have types, but you as the programmer don't
have to worry about specifying them, a bit more like our world from Scratch. So whereas in C, we might have declared
an integer called i and assigned it an initial value of 0-- we
might have used syntax like this. In Python, it's going to be similar
in spirit, but a little more succinct. Again, just say what
you mean. i gets zero with no semi-colon, no
mention of the type. But insofar as Python
supports numbers, it's going to realize-- oh, that zero
looks like an integer, is an integer. I'm going to define, ultimately,
i as being of type int. Meanwhile we have boolean
expressions in Python as well. And these actually translate perfectly. If you have an expression in C
testing whether i is less than 50, this is the same thing
in Python as well. You literally use the same syntax. If, instead, you want to generally
compare two variables, just like we did a few weeks
back in C, you might do x less than y-- same exact
code in Python as well as in C. Now how about conditions? So conditions are these
branching constructs where we can either go this way
or maybe this way or another way. So it's the proverbial fork in the road. Well, in C, if you wanted
to have an if statement that has three different branches,
you might do something like this. And as you may recall,
these curly braces are not strictly necessary,
simply because we have one line of code nested beneath
this if, and one line of code beneath this else if, and one
line of code beneath this else. Technically, and you might have seen
this in section or other resources, you can actually omit all of these
curly braces, which to be fair, makes the code look a
little more compact. But the logic is pretty straightforward. And we saw similar
yellowish blocks in Scratch. Now in Python, the idea is
going to be exactly the same, but some of the syntax is
going to be a bit different. So if we want to say, is x
less than y, we still say it, but we don't need the parentheses. In fact, if they don't
add anything logically, we're just going to start omitting
them altogether as unnecessary. We do have the colon, which is
necessary at the end of the line. We do have consistent indentation. And those of you who
have not necessarily had five for fives for style, realize
that in Python the language by design is going to enforce the
need for indentation. So in fact, I see myself being a little
hypocritical here, as I inconsistently indent this actual code. So this would not
actually work properly, because I've used a
variable amount of spacing. So Python is not going to like that. And in fact, that's why I made that
mistake to make this point here, so that you actually have to conform
to using four spaces or some other, but being consistent ultimately. So notice this. This? Not a typo. I didn't make that many
mistakes here. "elif" is actually the keyword that
we use to express "else if." So it's simply a new keyword
that we have in Python, again, ending the same
line with the colon. And then here, logically,
is the third and final case. else, if it's not less than
and it's not greater than, it must in fact be equal to. So we've used print as before to
express these three possible outputs. What about things like arrays? Well, Scratch had things called
lists that we essentially equated with arrays, even
though that was a bit of an oversimplification at the time. Python also has effectively
what we've been using and taking for granted now in C, that of arrays. But it turns out, in Python we're
going to start calling them lists. And they're so much easier to use. In fact, all of this
low-level memory management of having to allocate
and reallocate and resize arrays potentially if you want to
grow or shrink them-- all of that goes out the window. And indeed, this is a
feature you commonly get in a higher-level language like Python. It's a lot of this functionality
built into the language, as opposed to you, the programmer,
having to implement those low-level details. So, for instance, whereas in C,
particularly in a main function, we've been using for some time argv,
which is an argument vector or an array of arguments at the command line-- you
might access the first of those with argv[0]-- we're actually going to
have that same syntactic capability. We're going to access, in particular,
argv a little differently via an object called sys. So sys.argv, as we'll see,
is going to be the syntax. But those square brackets
are going to remain and the ideas of arrays, now called
lists, are going to remain as well. So what's a little bit
different in Python? We're about to see a
whole bunch of examples. And indeed we'll port-- so to
speak-- convert, or translate some of our previous C examples into Python. But what's the mental model that
you need to have for Python? Well, all this time, C, we've
described as being compiled. In order to write and use a program in
C, you have to write the source code. And you have to save
the file in something.c. And then you have to run something like
clang something.c in order to output from source code your machine code. And then that machine code, the zeros
and ones that the-- Intel, usually-- CPU inside understands, can actually be
run by double-clicking or doing ./a.out or whatever the program's
name actually is. So as you may have realized already,
this gets fairly tedious over time. Every time you make a
darn change to your code, you have to recompile it
with clang-- or with make, more generally-- and then run it. To make a change, compile, run it. Make a change, compile, run it. Wouldn't it be nice if we could
reduce those numbers of steps somehow by just eliminating
the compilation step? And indeed, a feature you get with
a lot of higher-level languages like Python and JavaScript and PHP and
Ruby is that they can be interpreted, so to speak. You don't have to worry so much
about compiling them yourself and then running resulting machine code. You can just run one command in
order to actually run your program. And there's a lot more going on
underneath the hood, as we'll see. But ultimately if we had a
program that looks like this-- simply a function called
main as we saw earlier, and we'll see some more
examples of this soon-- that simply prints out
"hello world," it turns out that you can run this program
in a couple of different ways. We can either, in the spirit
of clang-- whereby in C, we ran clang hello.c and
then ./a.out-- in Python, if this program is stored
in a file called hello.py-- where .py is the common file extension
for any programs written in Python-- we can distill those two steps,
as we'll soon see, into just one. You run a program called Python, which
is called the Python interpreter. And what that does
underneath the hood for you is it compiles your Python source
code into something called byte code, and then proceeds to interpret that
byte code top to bottom, left to right. So this is a lower-level
implementation detail that we're not going
to have to worry about, because indeed one of the
features of this kind of language is that you don't need
to worry about that. And you don't need that middle step
of having to compile your code. But for the curious, what's going to
happen underneath the hood is this. If we have a function like main that's
simply going to print "hello world" and we do run it through
that Python command, what happens underneath the hood is
that it gets converted first into something called byte code--
which fairly esoterically looks a little something like this, which you
can actually see yourself if you run Python with the appropriate commands. And then what Python
the interpreter does is it reads this kind
of code-- top to bottom, left to right-- that we the programmers
don't have to worry about in order to actually make your program do work. So you'll often hear that Python
is an interpreted language, and that kind of is indeed the case. But there can indeed be
this compilation step, and it actually depends on
the implementation of Python that you're using or even the
computer that you're using. And indeed, what we're
now starting to see is the dichotomy between what
it means to be a language and what it means to be a
program, like this thing Python. Python is a language. C is a language. Clang is a compiler. Python is also not just
a language, but a program that understands that language,
otherwise known as an interpreter. And so anytime you see me starting
to run the command "python," as you will too for future problem sets,
will you be interpreting the language, the source code that you've written. All right. So let's go ahead now and make a
transition in code from the world of C to the world of Python. And to help get us there, let's
put back on just temporarily some training wheels of
sorts-- a reimplementation of the CS50 library from C to
Python, which we've done for you. And we won't look at the
lower-level implementation details of how that works. But let me propose that at
least for part of today's story, we're going to have access
to at least a few functions. These functions are going to be
called GetChar, GetFloat, GetInt, and GetString, just like those
with which you are already familiar. The syntax with which
we access them is going to be a little different in this case. By convention, we're going to
say cs50.GetChar, cs50.GetFloat, and so forth, to make clear that
these aren't globally available functions that might have even come
with the language, because they're not. Rather, these are inside
of a module, so to speak, that CS50 wrote that implements
exactly that functionality. We'll soon see that Python has at
least these data types of bools, true or false, whereby
the T, and in turn the F, have to be capitalized
in Python, unlike in C; floats, which are going to give
us real numbers, floating point values with decimal points; int,
which is going to give us an integer; and str or string, which is going
to give us the string that we've now come to know and love. But nicely enough, you
can start to think again of string as an abstraction,
because it's actually what's called a class that has a
whole lot of functionality built-in. No longer are we going to
have to worry about managing the memory for our strings
underneath the hood. Now Python, realize, also comes with a
bunch of other features, some of which we'll see today too. You can actually represent
complex or imaginary numbers in Python natively in
the language itself. You have the notion of lists,
as we mentioned before, an analog to C's arrays. We have things called
tuples, so if you've ever seen like xy coordinates or any kind
of groups of values in the real world, we can implement those too in
Python; ranges, which we saw briefly, whereby you can define a range
that starts at some value and ends at some value, which is often helpful
when counting from, say 0 to 50; a set, which like in mathematics, allows
you to have a collection of objects-- and you're not going to
have duplicates, but it's going to be very easy to check whether
or not something is in that set; and then a dict or
dictionary, which is actually going to be really just a hash table. But more on that in just a bit. And these are just some of
them that we'll soon see. So let's now rewind in
time and take a look back at week one and perhaps this first
and simplest example that we ever did, which is this one here called hello.c. And meanwhile, let me go ahead
here on the right-hand side and create a new file that I'm
going to go ahead and call hello.py. And in here, I'm going to go ahead
and write the equivalent Python program to the C program on the left. print "hello world" Done. That is the first of
our Python programs. Now how do I run it? There's no clang step. And it's not correct
to do just ./hello.py, because inside of this
file is just text. It's just my source code. I need to interpret that code somehow. And that's where that
program Python comes in. I'm going to simply do
python space hello.py-- and I don't need the
dot slash in this case, because hello.py is assumed to
be in the current directory. Hit enter and voila! There's my first Python program. So what I haven't put in
here is any mention of main. And just to be clear, we could. Again, a common convention
in Python, especially as programs get a
little more complicated, is to actually do something like
this-- to define a function called main that takes, in this case,
no arguments, and then below it, to have this line pretty
much copied and pasted. If name equals, equals, underscore,
underscore, main, underscore, colon, then call main. So what's actually going on here? Long story short, this line 4 and line
5 is just a quick way of checking, is this file's default name quote
unquote "main" with the underscores there? If so, go ahead and
just call this function. Now generally, we won't bother
writing our programs like this when it is not in fact necessary. But realize, all these
two lines of code do is it ensures that if you do have a
function called main in your program, it's just going to call it by default.
That does not happen automatically. And indeed, if I just wrote hello.py
like this, and gave it a main function, gave it some code, like
print "hello world," but did not tell Python
to actually call main, I could run the program like this, but
nothing's actually going to happen. So keep that in mind
as a potential gotcha as you start to write
these things yourself. Well, now let's take a look back at
another program we had in week 1. This one might have had
me doing this in string.c. So in string.c did we introduce
the CS50 library in C. And we also introduced from
it the GetString function. And to use it, we had to declare
a variable, like s, of type string and then assign it the
return value of GetString. Well, let's go ahead
and do this same program in Python, this time
calling it string.py. And I'm going to go ahead now
and include the CS50 library. But the syntax for this is a
little different in Python. Instead of pound including,
you do import cs50. And that's it, no angle brackets, no
quotes, no .h, or anything like that. We have pre-installed in CS50
IDE the CS50 library for Python. And that's going to
allow me now to do this. s gets cs50.get_string
print "hello world" And we'll fill in this
blank in just a moment, but let's first see what's going on. On line 3 here, I'm declaring a
variable called s on the left. I'm not explicitly mentioning its type,
because Python will figure out that it is in fact a string, because the
function on the right hand side of this equal sign, cs50.get_string, is going
to return to s a value of type string. Now as an aside, in C, we kept
calling these things functions. And indeed, they still are. But technically, if
you have a function-- like get_string in this case--
that's inside of an object, that's inside of what's called a module
in Python, like the cs50 module here, now we can start calling
get_string a method, which just means it's a function associated with
some kind of container-- in this case, this thing called cs50. Now unfortunately, this program,
of course, is not yet correct. If I do python space string.py
and then type in my name "David," it still just says "hello world." So I need a way of
substituting in my name here. And it turns out there's
a couple of different ways to do this in Python, some of which
are more outdated than others. So long story short, there are
at least two major versions of this language called Python now. There's Python 2 and there's Python 3. Now it turns out-- and we didn't really
talk about this in the world of C-- there's actually different versions
of C. We in CS50 have generally been using version C11, which
was the 2011 version of C, which just means it's the most recent
version that we happen to be using. For the most part, that hadn't mattered
in C. But in Python, it actually does. It turns out that the inventor of
Python and the community around Python decided over the past several years
to change the language in enough ways that they are breaking changes. They're not backwards
compatible, which means if you wrote code in version 2 of
Python, it might not work in version 3. And unfortunately both
versions of the language have been coexisting for some time,
such that there's a huge community that still uses Python 2. There's a growing community
that uses Python 3. So that we stay at least as current as
possible, we for the class' purposes will use Python 3. And for the most part, if you're
learning Python for the first time, it's not going to matter. But realize, unfortunately,
that when you look up resources on the internet
or Google things, you'll very often find older
examples that might not necessarily work as intended. So just compare them against what we've
done here in class and in section. All right. So with that said, let's
go ahead and substitute in my name, which I'm going to do
fairly oddly with two curly braces here. And then I'm going to do this. .format open paren, s, close paren. So what's going on here? Well, it turns out that in
Python, quote unquote "something" is indeed a string, or
technically an object of type str. And it turns out that in Python
and in a lot of higher level languages, objects-- as I keep calling
them-- have built in functionality. So a string is no longer just
a sequence of characters. It's no longer just the address of a
byte of memory terminated eventually with backslash 0. There's actually a lot more
going on underneath the hood that we don't really have to care about. Because indeed, this is a good thing. We can truly now think
of a string in Python as being an abstraction for
a sequence of characters. But baked into it, if you
will, is a whole bunch of additional functionality. For instance, there is a
function that is a method called format that comes with strings now. And it's a little weird
to call them in this way. But notice the similarity. Just like the CS50 library, or module,
or really object, has inside of it a get_string method or
function, so does a string, like quote unquote "whatever"
have built inside of it a method or function called format. And as you might have guessed, its
purpose in life is just to format the thing to the left. So you get used to this format--
and there's no pun intended-- and there's other ways to
do this still, but we'll see why this is useful in just a moment. For now, it just looks like a
ridiculously unnecessarily complex way of plugging in a name to simply do this. If I type in my name David, and
hit enter, now I get "hello David." But trust for now that this
is going to be useful as we start to use other file formats still. Now as an aside, so that we're not
just removing training wheels and now putting them back on you
just for the sake of Python, let me emphasize that we can actually
implement this program exactly the same way without using anything CS50
specific using built-in functionality, like the input function
in Python version 3. The input function here optionally takes
a prompt inside of its parentheses. But if I exclude that, it's
just going to ask for some text. And here I can do this now. If I run Python string.py and
type in my name, it still works. And if I actually do something like
this, name colon space, save the file, and rerun it, now I
get a prompt for free. So here, too. Super simple example. But whereas in C,
typically we would have had to add that prompt using printf
and loop again and again as needed, here we can simply prompt
once via the input function and get back a value all at the same
time, such as say, Zamyla's name here. So we're only using the CS50
library for today's purposes to show you the equivalence
of some of our C examples vis-a-vis these Python examples. But it is by no means
necessary, just gives us a bit more functionality that's useful. For instance, if I were to write a
program very similar to this one-- recall way back when we had this
program in C, which simply got an int from the user and printed
it out-- let me this time create a new file called int.py. And inside of it, import
the CS50 library, which also has a function called cs50.get_int. And then use this function to simply
say, print, quote, unquote, "hello." Open curly brace, closed
curly brace, .format i. Save this file. Run Python int.py. I can type in a number like 42. And voila. Now we've used Get Int. But now let's actually format something. You'll recall that in the world of
C, we had some issues of imprecision. So recall that this program, whereby I
printed the value of 1/10 to 55 decimal places, actually did not
yield 0.100000 to infinity, as I was taught in grade school. Rather, we saw some rearing
of the head of imprecision, whereby floating point values in C were
not represented infinitely precisely. In fact, let's do this too. Imprecision.py shall be
the name of this file. And you know what? I don't even need to
write much code here. I'm just going to go ahead and
print out, somehow or other, a value like, say, 1 divided by 10. Let me go ahead and save that. Run Python of imprecision.py. And I do get 0.1. So this is kind of interesting. And in fact, it's revealing
a feature of Python. But I don't want to see
just one decimal point. I want to do the equivalent
of %.55f, as we saw in C. It's almost the same in Python. But instead of using the percent sign,
I'm going to use a colon instead. And now notice inside of all of this
is just .55f preceded by that colon. So it's almost exactly
what we did earlier, but with a bit more specificity. And now I see again that ridiculously
disappointing imprecision eventually, which we also saw in C. So it turns out in Python, too,
only a finite number of bits are used typically to represent
a floating point value. So we still have, unfortunately,
that issue of imprecision. But what we don't seem
to have is something that we stumbled over some weeks ago. And in fact, the reason in the C version
I did 1.0 divided by 10.0 was what? Why didn't I just do 1 divided
by 10 in the C version? What happened? So as I recall, if you take an int in
C and then divide it by an int in C, you get back an int in C.
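
For reference, the Python side of this comparison, imprecision.py as described a moment ago, can be sketched roughly as follows. The variable names tenth and precise are assumptions for illustration; the format specification {:.55f} is the colon-based counterpart of %.55f mentioned above.

```python
# Rough sketch of imprecision.py, assuming Python 3.
tenth = 1 / 10  # in Python 3, / applied to two ints produces a float

# Default printing shows a short, rounded representation.
print(tenth)

# Asking for 55 digits after the decimal point exposes the imprecision:
# the stored value is only the closest binary approximation of one tenth.
precise = "{:.55f}".format(tenth)
print(precise)
```

The second line of output starts with 0.1 followed by a run of zeros, and then nonzero digits appear, much as they did in the C version.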
Unfortunately, 1 divided by 10 should be 0.1. But that's not an int. That's a floating point value. So we saw this issue of
truncation with integers whereby, if you have a value 1 divided
by a value 10, both of which are ints, you're going to get back an int. The closest int after
throwing away everything after the decimal point,
which unfortunately would have been 0 if I didn't
define them instead as being floats. But it seems that Python
has actually fixed this. In fact, one of the features of
Python 3 is to redress exactly this. For many years, we've all
had to deal with the fact that an integer divided by an integer
is, in fact, an integer and therefore mathematically incorrect, potentially. Well, turns out that's been fixed
such that now 1 divided by 10 gives you the value that you
actually expect-- not, in fact, 0. But what does this actually mean? Let me go ahead and open up an example
that I wrote in advance, this one being a translation of what we
did see some time ago, like this. You'll recall that in the
version we wrote weeks back, we just tested out the plus operator
in C, the subtraction operator, multiplication, division,
and modulo for remainder. Well, it turns out we can do
something almost identical in Python here if we look at ints.py. But notice that just as I've
changed the program slightly to use this CS50 library
for Python to get a value x here, to get a value y here. Notice that there is one
additional example down here. I'm still demonstrating plus. I'm still demonstrating minus,
multiplication, division. And what is this? So it turns out that in Python
3, if you want the old behavior and you actually want to do integer
division such that you not only divide but effectively floor the value
to the nearest int below it, you can actually use this syntax,
which somewhat confusingly, perhaps looks like a comment in
C. It is not a comment in Python. In fact, in Python, as you
may have gleaned already, comments typically will start
with just a single hash symbol. But there's other ways
to do comments as well. But notice one other curiosity, too. This program does not print out
new lines when prompting the user. In fact, if I run this
program, let me go ahead and run this example-- which,
again, is called ints.py. Notice that it prompts me
for an int x and an int y. And I supply the new lines. They don't get printed for me. And then we get back the answers
that we hopefully expect here. But what is this going on here? Well, in the previous examples, I
got away with not using \n anymore. On the one hand, that's nice. I don't have to remember this
annoying thing that often you might omit accidentally. And therefore, your prompt
ends up on the same line. And things just look incorrect. Unfortunately, the price we pay by no
longer having to type a \n in order to get a new line from Python's print
function is that, if you don't want that freebie, if you don't want
that \n, you're going to have to pass a
second argument to the print function in Python that overrides what
the default line ending is. So whereas you would be
getting by default \n for free, if I instead say comma end
equals, quote, unquote, nothing, that means Python, don't
use the default \n. Instead, output nothing whatsoever. So it's a tradeoff. And again, much like you might
have gleaned from the recent test, there's this theme of tradeoffs. So even in terms of the usability of a
language, might there be this tradeoff? If you want one feature, you might
have to give up some other altogether. So let's just tie this all together
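A minimal sketch of that second argument to print, assuming nothing beyond the standard library. The `buffer` and `captured` names are hypothetical and exist only to show exactly which characters get written.

```python
import io

# By default, print appends "\n". Passing end="" overrides that default.
print("x is ", end="")
print(1)   # continues on the same line as the prompt text

# Writing into an in-memory buffer shows the exact characters produced.
buffer = io.StringIO()
print("x is ", end="", file=buffer)   # no newline after the prompt
print(1, file=buffer)                 # default newline after the value
captured = buffer.getvalue()
```

Here `captured` holds the text `x is 1` followed by a single newline, since only the second call kept the default line ending.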
and implement a program together for temperature as follows. Let me go ahead and create a
file called temperature.py. And this simply I want
to use to convert, say, Fahrenheit to Celsius,
to convert two temperatures. I'm going to go ahead for
convenience and use the CS50 library. I'm going to declare a variable called
f that's going to become, as we'll see, of type float by using cs50.get_float. And now I'm going to declare another
variable, c, for Celsius, that's going to equal 5 divided
by 9 times f minus 32, which I'm pretty sure is the formula
for converting Fahrenheit to Celsius. And then I'm going to
go ahead and print this, not with printf but
with print, as follows. I'm going to have some placeholder
there formatting this variable c. And what do I actually
want to put inside of here? Well, if I want to go ahead and
format it to just one decimal place, I'll use .1f. Let's go ahead and run
Python on temperature.py. Enter. Let's type in a temperature
like 212, 100 in Celsius. Let's type in the only other temperature
I really know, 32, zero in Celsius. So we've done the conversion. And we've not had to
worry nearly as much as we did a few weeks ago about
all of the issues of integers being truncated when you divide. All right. So let's not focus so much
on math and operators. Let's actually do a little bit of logic
by way of this example from a while back. We had an example in C called
logical.c, which simply did this. It asked me for a char. And it stored it inside
of-- and actually, this could have been this--
char c gets get char. And then I compared that char c against
Y in capital letter or y lowercase. And if they matched, I printed yes. Otherwise, if it was capital N
or lowercase n, I printed no. Else, I just said error. So it's just an arbitrary
program that's meant to assess, did I type yes or no effectively by its
first letter, capitalized or otherwise? Let's go ahead and port this,
translate this to Python as follows. Let me go ahead and create
a new file over here. We'll call this logical.py. And I'm going to go ahead as
before and import the CS50 library. But again, you could just use Python's
built-in input function to do this. But at least this way, I'm guaranteed
to get exactly the data type I want. cs50.get_char. And then over here, I'm
going to now say conditions. So remember some of
the syntax from before. You might be inclined to
start saying, if open paren. But we don't need that here. We can instead just say if c equals
equals capital Y, or c equals equals lowercase y, then go ahead and print yes. Now, this just seems ridiculous. All these weeks later, finally, you
can truly just say what you mean? And indeed, in Python, there's not
going to be the same double vertical bar or double ampersand that we've used
now for some time to express or or and. Rather, we can really type this a
bit more like an English sentence. It's still somewhat cryptic, to be
sure, but at least there's less clutter. There's no required parentheses anymore. We don't need the curly braces even. We don't need vertical
bars or ampersands. We can just use the word with which
we're more familiar in the real world. But notice, too, I've done
something subtly different from C. In the C version, to compare
this variable c against y in capital letters or
lowercase, I use single quotes. Why was that? In C, you actually have
a data type called char. And it's fundamentally
distinct from a string. So if I'm checking a char in C
against some hard coded value, I have to use single quotes to make
clear that this is just a single ASCII byte, capital Y or lowercase y. It's not capital Y \0. It's not lowercase y \0. It's just a single byte
that I'm trying to compare. But it turns out in Python, there really
is no such thing as a single char. If you want a character like capital
Y or lowercase y, that's fine. But you're going to get an entire
string-- a string with just one character in it plus whatever
else is hidden inside of a Python string object. But what that means for us is that
we don't have to worry as much about, is this a char? Is this a string? Just compare it in the
more intuitive way. In fact, notice moreover
what I am not using. In C, when we started
to compare strings, we used things like
strcmp, or string compare. No more. You want to test two
strings for equality? Does c from the user actually
equal y, capitalized or lowercase? We can just double quote it like this. And in fact, it turns out that
it doesn't matter in this context whether I use double
quotes or single quotes. Generally in Python, you
can actually use either. I'll simply adopt the habit here,
and throughout these examples, of using double quotes,
if only because they're identical to what we've done in CS50
for C. But realize that both of these are correct. Stylistically, generally just be
consistent with respect to yourself. All right. So let's do another example and
start to build on the sophistication. Because this isn't all that impressive. And actually, this of
course is not yet done. Else if c equals equals N or
c equals equals lowercase n, then I'm going to go ahead
and print out-- oops, not with printf but with print-- no. Else, colon, I'm going
to print out error. Almost forgot to finish my thought. So that's why the program was so short. Now it's almost as long although,
again, if you ignore the curly braces, it's pretty much the same length. Just a little syntactically simpler. All right. So let's build up
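Put together, the logic of logical.py might look like the sketch below. To keep it runnable without typing anything, the comparison lives in a hypothetical helper called `classify` that takes the character as an argument; the lecture's file gets c from the CS50 library and prints directly.

```python
def classify(c):
    # "or" replaces C's ||, and == compares strings directly,
    # so no strcmp and no single-quoted chars are needed.
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    else:
        return "error"


print(classify("Y"))
print(classify("n"))
print(classify("?"))
```

Single quotes would work just as well as double quotes in these comparisons, since both produce ordinary Python strings.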
something a little more interesting in the interest of design. So some weeks ago, we introduced this
example in C, the purpose of which, in positive.c, was to implement
a program that doesn't just get an int from the user. It gets a positive integer. And this was a useful
opportunity way back when to implement a custom
function of our own, a feature that we had in Scratch. But it also was a nice
way of abstracting away what it means to be get
positive int, because we could use get int underneath the hood,
but not necessarily care about it thereafter. So in C, recall a few details. We needed, one, not only
our header files up top. But we also need this
forward declaration. We need this prototype
at the top of the file because C is going to read things
top to bottom, left to right. So we'd better tell Clang or whatever
compiler we're using about the function before we use it in the code itself. I now have an int i
getting a positive int. And then I just go ahead
and print this out. So the real magic seems
to be below the break here whereby we implemented
get positive int. And to do this in C,
notice a few features. One, we declared it as a function, get
positive int, that takes no arguments and returns an integer. Inside of that, we declared a
variable n outside the scope of the do while loop because we want
n to exist both here and here, as well as when we
actually finally return it. And then in this do
while loop, we just kept pestering the user so long as he or she
gave us a value that's less than one, so non-positive. And then we returned it and printed it. Let's try to now port this to Python. In Python, let me go ahead
now and do the following. I'm going to create a new
file called positive.py. I'm going to go ahead and import
the CS50 library as before. And I'm going to go ahead and define a
main function that takes no arguments. We're not going to worry
about command line arguments. And indeed, even when we are
going to worry about them, we're not going to declare them
inside those parentheses anymore. Now I'm going to go ahead and
do i gets get positive int. And now I'm going to go ahead
and print out, with print, the placeholder is a positive
integer, closed quotes. And then I'm going to do format
i, plugging in that value. So let me shrink the screen
here a little bit so that things fit a little better on the Python side. And now that's it for main. No curly braces. I just unindent in order to now
start my next thought, which is going to be this. I'm going to go ahead and define another
function called get positive int. I don't use void in Python. I simply leave the parentheses empty
and add a colon at the end to say, here comes the function's
implementation. And it turns out in Python, there
isn't this do while construct. So the closest match to do while
we did see earlier is just while. And a very common paradigm in
Python is to deliberately induce, as you might have in C, an
infinite loop capitalizing True because in Python, a bool that's true
or false is going to be capitalized. And then inside of this loop, let's
go ahead and do the following. Let's go ahead and say, print n is. And now below this, I'm
going to say n gets get int. But this is inside the CS50 module. So I need to do that there. And then I'm already
in an infinite loop. So you know what? If n is greater than or equal to
1, I'm going to go ahead and break. So the logic is a little
bit different this time. But I'm breaking out of the
loop once I have what I intend. So I need to do one last thing. Once I've broken out
of this loop, what do I need to do to complete the
implementation of get positive int? I've gotten it. But I need to hand it back to the user. So let me go ahead on this last
line and return that value as n. So notice a few distinctions here
versus C. Whereas in C a few weeks ago, we had to give some hard
thought to the issue of scope. Turns out we don't have to
worry about that as much. As soon as I declare n here, it's going
to be within scope within this function such that I can return it down here,
even though that return statement is not indented and not inside, so to
speak, that actual looping construct. Notice too, because we don't
have a do while construct, I had to re-implement
it using while alone. And I actually could
have done that in C. Do while does not give us any
fundamental capabilities that we couldn't implement for ourselves
if we just implemented it logically a little more like this. We're still printing out n is first. We're then getting an int. We're then checking if it's positive. And if so, we're breaking
out and returning. There are one or two bugs in here. And we'll trip over
these in just a moment. Let me go ahead now and save this file
and then run Python positive.py, Enter. Nothing seemed to happen. Hm. It's not running anymore. I'm back at my $ prompt. Let me try running it again. Python positive.py. I mean, there's no error message. And in the world of C, no error message
usually meant something's right. And it's right. I've just kind of
forgotten a key detail. I've imported the CS50 library. I've defined main. I've defined get positive int. But what is different in
this world now with Python? Main is not called by default. So
if I want to actually call main, I'd better adopt a convention
of, for instance, this paradigm. So if __name__ equals equals, quote unquote,
__main__, then, with a colon, actually call the main function. And technically, as an aside, this
would still work even without this. We could simply put main down here. But let me wave my hand
at that detail for now and just emphasize that anytime
you want to proactively call main, if you've set up your code in this
way, we should indeed do it like this. Let me go ahead now and
rerun Python positive.py. n is 42. n is a positive integer. Let me go ahead and
run n is, and then 0. Nope. Negative 1. Nope. Foo. Retry. That's the CS50 library kicking
in noticing that's a string. Let's try 50. And OK. That worked. Now, the bug I alluded to earlier
is just that this looks stupid, having the cursor now on the next line. I can fix this, recall, by adding the
second argument whereby the line ending for print is just quote unquote. Let me go ahead and rerun it. n is 42. And now things look
a little bit cleaner. Now, at the risk of complicating, let
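Here is a sketch of the overall shape of positive.py as built up so far, under one loud assumption: instead of calling the CS50 library, `get_positive_int` takes a hypothetical `read_int` callable, and main feeds it scripted answers so the example runs without a keyboard.

```python
def main():
    # Scripted answers stand in for a user typing 0, then -1, then 50.
    answers = iter([0, -1, 50])
    i = get_positive_int(lambda: next(answers))
    print("{} is a positive integer".format(i))


def get_positive_int(read_int):
    # Python has no do-while, so loop forever and break once n is valid.
    while True:
        print("n is ", end="")
        n = read_int()
        if n >= 1:
            break
    # n is still in scope here, even though it was first assigned
    # inside the loop.
    return n


# main is not called automatically in Python, so call it explicitly.
if __name__ == "__main__":
    main()
```

Because the call to main sits on the last line, both functions are already defined by the time anything actually runs.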
me just point out one other detail. Technically, I could also do this. If you don't need a main function,
then why do I have it at all? It stands to reason that I could
just write my program like this. Yes, I'm defining an additional
function, get positive int. And that's going to work as expected. But technically, if I
don't need a main method-- and all of the simple
examples we've done thus far just have me writing code
right in the file itself and then interpreting it at the command
line-- I should be able to do this, I would think. So let me try this. Let me go ahead and run again Python
positive.py but on this new version. Enter. And now we get the first
scary-looking error message. So traceback, most recent call last. File positive.py, line 3, in
module, i gets get positive int. Name error name get
positive int is not defined. So the first of our
Clang-like error messages-- this one coming, of course, not from
Clang, but from the Python interpreter. And even if the first few lines
are indeed pretty cryptic-- name error name get
positive int is not defined. But yes it is. It's right there at
the moment on line 6. So it turns out Python is not
all that much smarter than Clang when it comes to reading your code. It too is going to read it
top to bottom, left to right. And insofar as I'm trying to
call get positive int on line 3, but I'm not defining it
until line 6, unacceptable. Now, you might be inclined to fix
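The ordering problem can be reproduced safely in a few lines. This is only a sketch: the function `get_answer` and the flag `called_too_early` are hypothetical names, and the early call is wrapped in try/except so the script keeps going instead of stopping with a traceback.

```python
try:
    # At this point the def below has not been executed yet,
    # so the name get_answer does not exist.
    print(get_answer())
    called_too_early = False
except NameError:
    called_too_early = True


def get_answer():
    return 42


# Now the def has run, so the very same call succeeds.
value = get_answer()
print(called_too_early, value)
```

The interpreter executes a def statement like any other line, top to bottom, which is why the position of the call matters and a C-style prototype does not help.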
this like we did in C, whereby you say, all right, well, let me
just do get positive int up here maybe, and just put a prototype. But this now looks especially weird. This now looks like a function
call, not a prototype, because we're omitting now the
return type because there is none. And there's no semicolon
here by convention. And indeed, if I do this
again, it's the same error. Now the problem is I'm
calling it in the wrong place even earlier-- on this line,
still line 3, in addition to line 5, which is now there. So how do we fix this? Well, back in C, we didn't technically
need prototypes in most cases. We could instead just
kind of work around it by moving the code to,
say, the top of the file and ignore the problem, really. And now run the program. And now it's back to working. Why is that? Well, the Python interpreter is reading
this file top to bottom, left to right. It imports the CS50 library. It defines a new function
called get positive int. And then, on lines 11 and 12
now, it uses that function and actually then prints
out the return value. But again, this very
quickly gets a little messy. Now to find what this
program does, I have to look all the way at the bottom
of the file just to see my code. It would be nice if the
actual logic of the program were at the top of the file, as has been
our norm with C, putting main up top. So another good reason
for having a main method is just to avoid these kinds of issues. If I rewind all of these
changes that we just made and go back to this last version,
this avoids all of these issues. Because if you're not calling main until
literally the last line in your file, it's going to be defined at that point. So are any functions that it calls. And all of that will
be implemented for you. And so now we're good to go. So again, we're complicating
the program deliberately, but to proactively address
those kinds of issues. Let's introduce one other topic now. Abstraction has been a theme,
not only recently in the test, but also in the earliest
weeks of the course. Well, you might recall
from those early weeks, we had examples like this, where we
had an example called cough0.c, whose purpose in life was to do [COUGHING]. So three coughs in a row. Now, this was clearly copy paste
because all three of these lines are equivalent. But that's fine for now. Let me go ahead and verbatim convert
this to Python as closely as I can. And cough0.py turns
out it's pretty easy. Print quote unquote cough. And then I can really
demonstrate how poorly designed this is by literally copying
and pasting those three lines. I don't need stdio.h. I don't need the CS50 library. I don't need main. We know-- because now, if I just do
Python cough0.py, Enter, cough, cough, cough. All right. But we improved upon
this example in C. Recall that in C, we then looked at
cough1, which at least used a loop. So how do I do this in Python? Let me go ahead and save
this now as cough1.py. And let me try to borrow
some logic from earlier. Let me do for i in. And you know what? I'm going to do range 3. We had 50 before. But I don't need it to
iterate that many times. Now let me just go ahead
and print cough three times. And now run Python cough1.py, Enter. Cough, cough, cough. All right. But recall in the world of C,
we improved further in cough2.c as follows. We abstracted away, so
to speak, what it means to be coughing by wrapping it in
its own function called cough. Because we don't really care that
cough is implemented with printf. We just like the idea, the
semantics, if you will, of having a new custom
function called cough. So let's go ahead and
try to do that in Python. Let me go over here and create
a new file called cough2.py. And in here, let me go ahead
and define main as before. Inside of this, let me
do for i in range 3. And let me go ahead here
and call proactively cough, even though it doesn't yet exist. Let me go down here now and
implement cough in such a way that it simply prints cough. Let me go ahead now and
do Python cough2.py. Wait. Something's wrong. What's going to happen? Nothing. I need to actually call the function. And again, the paradigm
that we'll adopt is this. If the name of the file is the default
name of, quote unquote, __main__, then let me go ahead and call main. So now if I run this again, voila. Cough, cough, cough. Notice again no prototype. No imports from CS50
because we don't need it. But let's improve upon this further. In C, we took this one step further
and then parameterized cough so that we could cough three times
but not have to implement the loop ourselves in main. We just want to punt,
so to speak, or defer to the actual implementation of cough
to cough as many times as we want. So if I want to do that here, let me go
ahead and save a file called cough3.py. And let me go ahead and again define
main to just do a cough, but this time three times, actually
giving it an argument. And then we go ahead
and define cough again, but not with open paren, close paren,
but with an actual variable called n. Here too, I don't need its data type. Python will figure that out for me. And then here, I can do
for i in range of not 3 anymore, but n, because that's a
local argument that's been passed in. And now let me go ahead and
print cough that many times. Down here, let me go ahead and do my if. If the name of this file is
the default name of __main__, then go ahead and call main. So now let me run this, cough3.py. And I get cough, cough, cough. And you recall we kind of took
this to an extreme a few weeks ago. Suppose I now want to implement
the notion of sneezing. Well, sneezing was
deliberately introduced, not so much because it's all that
useful, per se, as a function, but because it allowed me to
factor out some common code. It would be a little lazy of
me if, to implement sneeze, I went ahead and did something
like this, whereby I literally copy and paste the
code, call this sneeze, and then say "achoo" here instead. Because look how similar
these two functions are. I mean, they're literally
identical except for the words being used therein. The lines of code
logically are the same. So instead of that, let me go ahead
and port this as I did in C as follows. Let me go ahead and save this as
cough4.py and in here go ahead and define main. And main now is going to
call cough three times. And it's going to call
sneeze three times, which just means I need to implement them. So let me go ahead and define cough
as before, taking in an integer n, we can call it. But we could call it anything we want. But now you know what? Let me generalize this and just
have it call a say function with the word we want it
to say, and how many times. Meanwhile, let me go
ahead and define sneeze as taking a similar int that simply
says achoo, n that many times. And now I just have to define say. And before in C, on the left-hand
side here, say took two arguments. We can do that as well in Python. We can simply say word and n
without worrying about their data type and declaring them. And now in here, I need to
do this for i in range of n. Let me go ahead and print word. Now technically, if I really
wanted to be consistent, I could do print quote, unquote,
curly braces, format word. But I literally gain nothing
in this case from doing that. So it's a lot cleaner
and a lot more readable just to literally print the word. You don't strictly
need that placeholder. Then down here, let's do if the name of
the file equals equals, main as before. Call main. Voila. Let's go ahead now and
do Python of cough4.py. Enter. Cough, cough, cough. Achoo, achoo, achoo. So it's kind of an exercise
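Assembled from the description above, cough4.py plausibly looks like the following sketch. The function and parameter names follow what is said aloud; the exact layout of the real file is an assumption.

```python
def main():
    cough(3)
    sneeze(3)


def cough(n):
    # Delegate to say rather than duplicating the loop.
    say("cough", n)


def sneeze(n):
    say("achoo", n)


def say(word, n):
    # No types are declared for word or n; Python infers them at run time.
    for i in range(n):
        print(word)


if __name__ == "__main__":
    main()
```

The common loop lives only in say, so adding another sound would take one more two-line function rather than another copy of the loop.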
in futility in the end because the program still
doesn't do anything that's all that fundamentally interesting. But notice how quickly we've
moved from just printing something like hello world just
a little bit ago to defining our own main function that calls two
functions that are parameterized, each of which in turn calls
some other function that takes multiple parameters. So we're already very
quickly building up these building blocks, even
faster than we might have done in the earliest weeks of the class. All right. So that's essentially week one
that we've now converted to Python. Recall now in week two of CS50,
we started to look at strings. We looked at command line arguments. So let's now, with
relatively fewer examples, compare and contrast what we
did then to what we'll do now and see what new features we have. Recall indeed that in week two,
we implemented strlen ourselves. Before we even started
taking it for granted that there is a strlen function
that returns the length of a string, recall that we
implemented it as follows. We got a string from the user. We initialized some counting
variable, like n to 0. And then while that location
in the string, using our square bracket notation, was not
equal to the special sentinel value, \0, do n plus plus,
thereby incrementing n, and then eventually print
out what the value of n is. So this, though, assumed in
week two an understanding of what's going on underneath the hood. In Python, we're not going
to want to worry about what's going on underneath the hood. Indeed, this whole principle
of abstraction-- and more specifically, encapsulation--
whereby, these implementation details are deliberately hidden from us, is now
something we can embrace as a feature. No longer do we need to worry as much
about how things are implemented, but just that they are implemented. So increasingly will we start to rely
on publicly available documentation and on examples online
that use features of code, as opposed to worrying
as much about how they're implemented underneath the hood. So toward that end, let me go
ahead and implement the equivalent of this program in Python in a
manner that would be appropriate here with strlen.py. I'm going to go ahead and import
the CS50 library so that I can get a string like this with get string. And then I'm going to
print the length of s. So recall, of course, in C, we
could have done this with strlen. In the world of Python, we're not
going to use strlen, but rather len, or L-E-N for length,
which it turns out can be used on any number of
different variables and objects. It can be used on strings. It can be used on lists and
other data structures still. So for now, know that this is how we
might print the length of a string. So let's go ahead and try this. Python of strlen.py. Type in something like foo,
which is three letters. And indeed, that's what we get back. Well, now let's actually take a
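A short sketch of len next to the kind of manual counting done in week two. The string is hard-coded here rather than read from the user, and the counting loop is shown only for comparison.

```python
s = "foo"

# Manual counting, in the spirit of the C version. There is no \0
# sentinel to look for in Python, so the loop just visits each character.
n = 0
for c in s:
    n += 1

print(n)        # 3
print(len(s))   # 3, the built-in does the same job

# len also works on other kinds of objects, such as lists.
numbers = [10, 20, 30, 40]
print(len(numbers))   # 4
```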
look at the fact that we do still, nonetheless, have this notion of
ASCII underneath the hood going on, although not necessarily
ASCII but Unicode, which is a far more
powerful encoding of symbols so that we can have far more characters
than just, say, 128, or even 256. Let me go ahead and create
the following example. We'll call this ascii0.py so
that it lines up with the example we did called ascii0.c a few weeks back. And let me go ahead
and do the following. For i in the range of 65, 65 plus 26. So if I want to start
iterating at 65, and then iterate ultimately over 26 characters
like we did a few weeks ago, I can actually do this. I can say something like,
something is something, specifically if I format two values. I essentially want to
format i and i again. But the first of these I want to
actually print as a character. So it turns out that if you
have in a variable, like i, a decimal value, an integer, that
corresponds underneath the hood to an ASCII value, or really
Unicode value, which is a superset, you can call the chr function,
which is going to convert it to its character equivalent. If I go ahead now and run Python
of ascii0.py, I've made a mistake. And you'll notice even
CS50 IDE noticed this. And I didn't notice CS50 IDE. If I hover over that little x,
it's yelling at me, invalid syntax. Because CS50 IDE actually understands
Python even more than it does C. So I can actually fix this with that
additional in keyword, which I forgot. And now I can see the exact
same tabular output which, again, prints out capital A as 65. So not necessarily a useful program
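The corrected loop might look like this sketch. Collecting each line in a list called `rows` is an addition for illustration; the lecture's version simply prints.

```python
rows = []

# range(65, 65 + 26) covers the code points for A through Z.
for i in range(65, 65 + 26):
    # chr converts an integer code point to its one-character string.
    row = "{} is {}".format(chr(i), i)
    rows.append(row)
    print(row)

# ord goes the other way, from a character back to its integer value.
code_of_a = ord("A")
```

The first line printed is "A is 65" and the last is "Z is 90".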
other than to show us this equivalence. Well, what about arguments
at the command line? Let me go ahead and implement a program
similar in spirit to argv0.c a while back, this time calling it .py. And in here, let me
go ahead and do this. If-- and actually, let me go
ahead and import sys first. So sys is a system module that has
a lot of lower level functionality, among them command line
arguments-- which, again, we do not declare as being part of main. They're globally
accessible, if you will. I'm going to go ahead and do this. If the number of command line arguments
in that list there equals equals 2, then I'm going to go ahead and
print out hello placeholder. And then format inside of
that sys.argv bracket 1. So if there are two command
line arguments-- something, something-- I'm going to
print the second of those because the first of them is going to be
the program's name or the file's name. Else, I'm going to go ahead and just
print out generically hello world. Let me go ahead and save that. Run Python argv0.py. Enter. And voila. We have hello world. Now, as an aside-- and
just so that you've seen it-- there are other ways of
outputting strings because frankly, this very quickly gets tedious
if all you're trying to do is plug in some value. Generally, for consistency,
I'll still do it this way. But we could have done
something like this. And those of you who took, for
instance, AP Computer Science A in high school, or a Java
class more generally, might know that the plus
operator is sometimes used as the concatenation operator
to take one string and another and jam them together. And indeed, we could do this as follows. I could now do Python of
argv0.py and get the same result. But you'll find generally that using the
format approach, as I originally did, tends to be a little more sustainable
once your code gets more complex. Let's do something else. Let's go ahead and print out a whole
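Both ways of building the greeting can be sketched side by side. The helper names `greeting` and `greeting_concat` are hypothetical, and the exact wording of the greeting is an assumption; each helper takes the argument list as a parameter so the logic can be tried with any list, not just the real sys.argv.

```python
import sys


def greeting(argv):
    # argv[0] is the program's name, so exactly one extra word
    # means the list has length 2.
    if len(argv) == 2:
        return "hello, {}".format(argv[1])
    else:
        return "hello, world"


def greeting_concat(argv):
    # Same result, using + to concatenate strings instead of format.
    if len(argv) == 2:
        return "hello, " + argv[1]
    else:
        return "hello, world"


print(greeting(sys.argv))
```

Note that + only joins strings to strings, so a number would first need str() around it, which is one reason format tends to scale better.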
bunch of command line arguments, just as we did a few weeks ago,
this time in argv1.py, which again corresponds to our earlier code. And here, I'm going to go ahead and
import the sys module again and do for i in range. And now this time, I'm going to do the
length of sys.argv which, to be clear, is going to give me the number of
arguments in the argument vector. And that list, called
argv, which sounds awfully equivalent to what special
variable that we kept using in C? If you recall, not just argv, but argc? The latter doesn't exist in Python. But we can query for it by
just asking Python, what is the length of the argument vector? That means what is argc? So I'm going to go ahead now and
just print out sys.argv bracket i. And if you think through
these lines of code, it would seem that this is going to
iterate from 0 on up to the number of arguments in that argv vector, or
list, and then print out each of them in turn. So let me go ahead and
run Python of argv1.py. Enter. And indeed, it just printed out one
thing, the name of the program itself. What if I did foo, bar, [INAUDIBLE],
some arbitrary words, and hit Enter? Now it's going to print
all of those as well. So this is just printing out,
as we did a few weeks ago, all of the words in argv. But we can do something a
little neater now as follows. Suppose that in, argv2.py, just
like a few weeks ago in argv2.c, I wanted to print out
all of the characters in all of the words of the
command line arguments. I'm going to go ahead
and import sys again. And now I'm going to
do for s in sys.argv. So here's a new approach altogether. And then do for c in s. And then in here, I'm
going to do print c, and then eventually,
just print a new line. So now things are getting a
little magical, or frankly, just a little convenient. I'm still importing the
sys module so that I have access to argv in the first place. And it turns out that insofar as
sys.argv is just a list-- like in C, it's similar in spirit
to an array-- I don't have to do the for loop with the int i
and index into the array using bracket i. I can get from Python's for keyword
this beautiful feature, whereby if I just say, much like the ranges
I've been using it with thus far, for s in sys.argv, this is going to
assign s to the first string in argv. Then on the next iteration,
to the next string in argv. Then on the next iteration, the next
string in argv, each time updating s. Meanwhile, on line 4
here, which is indented as part of being inside this
outermost loop, for c in s. Well, it turns out that Python treats
strings similar in spirit to C, as sequences of characters. But rather than put the burden on
you to declare an int called i or j or whatever, and then iterate
over bracket i or bracket j in each of these variables,
you can just tell Python, for each character in the
string, for c-- and this could have been any variable name altogether
in the current argument from argv-- go ahead and just print out c. So again, here we see another hint of
the ease with which you can write code in a language like Python without
having to worry nearly as much about low level implementation details
about random access and square bracket notation and indexing into
these arrays effectively. You can just allow the language
to hand you more of the data that you care about. So let's run Python of argv2.py. Enter. And it looks a little weird. But if I increase the screen, you'll see
that it printed one character per line, exactly those command line arguments. And if I do foo, you'll
see argv2.py space F-O-O. It's doing the exact same thing. So not a useful program. But it indeed is allowing us to actually
access those characters and strings still. So let's just open up an example
I wrote in advance to demonstrate one other point altogether. If I go into week two's folder here
from this week and go into exit.py, you'll see this example. It doesn't do all that
much, this program. But it does seem to check this. On line 4, it checks
the length of sys.argv. And if it doesn't equal
2, it yells at the user. Missing command line argument. And then it just exits. So just like in C, we have the ability
to return an exit code to the shell, to your prompt, not using
return, as we did in C. You still use return in Python, but
to return from methods or functions. In Python, when you want to
exit the program altogether, because there is not
necessarily a main function, you just call exit and then pass
inside of its parentheses the number that you want to return--
the convention, as always, being 0 for success and
anything nonzero for failure. And so that's why I'm
arbitrarily, but conventionally, returning 1 here to the prompt. I'm exiting with an exit
status code or exit code of 1 to indicate as much here. Otherwise, I'm just printing
out whatever the word is. So if I run this program, and I
go into today's second directory, and I run Python of exit.py,
missing command line argument. And you might recall this
trick from a few weeks back. If you, at your prompt, run echo $?, it
will show you the exit code of the most recently run program. So if I run this correctly this
time with, for instance, my name, it says hello David. And if I now do echo $?, I should see a 0. So just a lower level way of seeing
what's going on underneath the hood. Well, let's go ahead and do
another example demonstrating what also has changed for the better. Let me go ahead and now do this. In a file called compare1.py,
which will line up, you'll find, with
compare1.c a few weeks back, I'm going to go ahead and
import the CS50 library. I'm going to go ahead
and print out just quote, unquote s, and then kill the new line. And then s gets CS50.getstring. And then let me do this
once more with a t variable, also getting rid of the new
line, just for aesthetics. And then t gets CS50.getstring. And then let me go ahead
and do a sanity check. It turns out-- and you would only
know this from reading our source code or the documentation therefor--
turns out that get string could return a special value. It's not null because Python
does not have pointers. We don't have to worry about
addresses anymore, per se. But it does have special
sentinel values like this one. If s does not equal None with a
capital N, and t does not equal None. Indeed, None is a special value
similar in spirit to null or similar in spirit to false, but
different from both. It's not a pointer, as it is
in C. And it's not a Boolean. It's sort of the absence of a value. And indeed we, in designing
the CS50 library for Python, decided that if
something goes wrong with get string-- maybe the computer or the
interpreter is indeed out of memory, even though there is no notion
of allocating memory per se. But something goes wrong inside
of get string for whatever reason, these calls could return None. So I'm just for good measure
checking that s is not None and t is not None so that I can indeed
trust that they're indeed strings, so that I can now do
something like this. If s equals equals t, then print same. Else print different. And you will recall, perhaps, that
when we did this in C some time ago, this did not work. In the world of C, line 10 would
not have worked as intended because it would have been comparing
two pointers, two memory addresses. And insofar as in C, get string
returns two distinct addresses. Even if the user types the same
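As a rough sketch of the compare1.py program described above (not the actual course file): the CS50 library's string-getting function is left out so that the snippet runs on its own, and the names compare, first, and second are made up for illustration. The second string is assembled at run time from two pieces, to stand in for two separate calls that each hand back their own string.

```python
def compare(s, t):
    """Return "same" or "different" based on the strings' characters."""
    # Mirror the sanity check from the lecture: give up if either value is None.
    if s is None or t is None:
        return None
    # In Python, == on strings compares their contents, not memory addresses.
    return "same" if s == t else "different"


# Hard-coded stand-ins for what the user would type at the two prompts.
first = "Zamaila"
second = "".join(["Zam", "aila"])  # assembled at run time from two pieces

print(compare(first, second))   # prints: same
print(compare(first, "David"))  # prints: different
```

Python also has an is operator, which asks whether two names refer to the very same object. That is closer in spirit to comparing addresses in C, so == is the one to reach for when the goal is to compare contents.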
word as we did a few weeks back, it's going to use the heap via malloc to
give you two separate strings somewhere in memory whose first byte's
address is going to be different. And so s and t in the world
of C were not the same. But that was never
really all that useful. I didn't really care about
those memory addresses. I wanted to compare the strings. And I had to resort back
in the day to strcmp. Well, as we've already
seen, you don't need to worry as much about that in Python. If you want to compare s and t, just
do it using equals equals as always. So that when I run this program
now and type in Python compare1.py, something like Zamaila,
something like Zamaila. Those are indeed the same. But if I instead type
Zamaila and then my own name, those are indeed different. And so this is as expected
whereby, if I type two strings that happen to be the same, and they're
both retrieved by two different calls to get string, they're
nonetheless going to be compared as expected for equality. Let's do one other thing to
demonstrate one other point of Python. Let me go ahead and open up a new file. I'm going to call this copy1.py. And you'll see that
it lines up in spirit with copy1.c from a few weeks back. Let me import the CS50 module. Let me go ahead and print out
s with no newline ending. Let me go ahead and do
CS50.getstring as before. And let me go ahead
and do a sanity check. If s equals None, then let's just
exit because this program's not going to be useful if something
bad happened underneath the hood. And now let me go ahead and capitalize
this thing, as I tried weeks ago. Let me go ahead and
do t gets s.capitalize. And then print out s,
and then a placeholder that I can format with s itself. Then let me go ahead and print out
t colon, and a placeholder, and then format t itself. And then let me go ahead,
just for good measure, and exit with 0, even though that
will be assumed to be the default. So what's going to happen here? Let me run this program,
Python copy1.py. Type in something like
Zamaila in all lowercase. Enter. And you'll see that it's now
capitalized in just t, and not s. Let me go ahead and do another
example with Andy's name. And we've indeed
capitalized Andy's name. So what's going on? And what's with all these dots? The only time we ever
really got into dots in C was when we had structures
or pointers thereto. But it turns out that Python is
an object oriented programming language in the sense
that it has support for objects, full-fledged
objects, really built into it. C just has structs. And structs, by definition,
contain typically only data. They will contain fields
like dorm or house or name, or whatever it is we're implementing,
like a student structure in C. But it turns out that in Python and
in other object-oriented languages, you can have inside of structures,
or objects as they're more properly called, not only pieces
of data, as we'll eventually see, but also built-in functionality. So the syntax, to be fair, has been
very weird when we look at strings. But if you trust me when I say a string,
or an STR variable, is an object, that object has inside of it
somewhere underneath the hood a sequence of characters,
whatever I've typed. But it also has apparently
built-in functionality. Among that functionality
is a function, a.k.a. a method called format. Similarly do string
objects in Python have a built-in function called capitalize
that does exactly what you would expect. So in C, we had toupper. But that operated on
just a single character. And the burden was entirely on me to
figure out what character in a string I wanted to make uppercase. In Python, this built-in capitalize
function for the string class will do exactly what
we intend, uppercasing the first letter in a string and
lowercasing everything else. But it turns out that
in Python, a string is immutable, which is to say that
once it's created, you can't change it. And this is not the case in C. In C, when we used getstring,
or scanf, or malloc, or created strings on the stack by
allocating them effectively as arrays, if we allocated memory on the heap
or the stack and put strings there, we could change those
strings thereafter. And in fact, the earliest
version of this program in C was buggy insofar as it accidentally
capitalized both s and t, even though we only
intended to capitalize t. But it works right out of the box with
Python, at least as implemented here. Because it turns out once s
exists as a string, that's it. That's the sequence of
characters you're going to get. You can't go in and
change just one of them. And so what's really
happening here when I call s.capitalize is this function
is designed underneath the hood by the authors of Python
to give you a copy of s but quickly change the first
letter to a capital letter, and then return the resulting copy. All of that happens for me. I do not need to use malloc. I do not need to do strcpy. I don't need to iterate
over the characters. All of this we get for free,
so to speak, with the language. Let's look now at just
where else we can go. One of the biggest problems
we ran into, recall, in C was near the end of our focus on it. And we started tripping
over issues like memory. You'll recall in C, we had
this example here, noswap.c. And this program was pretty arbitrary. It allocated an x and
a y int and assigned them the values 1 and 2 respectively. It claimed to swap them by
calling the swap function. But then even though it
said it swapped them, it swapped only copies
of those variables. And indeed, the swap function, if
we scroll down below the break here, you'll see that it declares
two parameters, a and b, that by nature of how C argument
passing happens become copies of x and y such that a and b do get
successfully swapped, but there's no permanent effect
on the caller's variables in main's stack frame because
that was fundamentally flawed. And so we fundamentally fix
that with this version here. In swap.c some weeks
ago, we instead started passing an x and y by
reference, by their addresses using the ampersand operator
to get their address in memory, passing in effectively pointers, as
declared here with the star operator. And then we had to use the star
operator inside here of swap to dereference those pointers,
those addresses, and to go to them and actually change or get
the values at those addresses. So this worked. But let me go ahead now and implement
in Python something very similar. I've already written this one
up in advance in noswap.py. And it looks like the following. I define main up top. I'm not going to bother
using the CS50 library because everything is hard coded here. x and y shall be 1 and 2 respectively. Don't need to mention int again because
it's a loosely typed language. Now I'm going to go ahead
and print x is this, y is this, swapping dot,
dot, dot, and then call swap, passing in x and y. And then I print what's here, swapped. I claim it's swapped. I print them out again. Swap seems to be implemented. I'm a little nervous about this. This seems to really be just an
implementation of literally noswap.c. So let's try to confirm as much. Let me go ahead now and go into this
fourth week's directory and run Python noswap.py, Enter. Indeed, it doesn't seem to work. So it would seem that Python, too,
passes these things in by value. So how do I fix this? Unfortunately, the fix isn't as-- and
this is kind of an understatement-- easy as it was in C to just change
these arguments to be by reference, and then use pointers to
actually dereference them and actually do the actual swap because
we don't have pointers in Python. So in some way, here's another
tradeoff that's been thematic. We were getting all these new features. Things are relatively
simpler syntactically, even though it will take some
getting used to, by all means, and some practice. But now we've given up that ability
to look underneath the hood and change what's going on underneath the hood. So pointers were scary. And pointers were hard. And managing memory is risky
because you risk seg faults, and you might have memory
leaks, and all of the headaches you might have had with psets four or
five or any number of the challenges we had involving addresses. You really start to bang
your head against the wall, potentially, because you have
access to that level of detail. Unfortunately, as soon
as it's taken away, we would seem to lose the ability
to solve certain problems. And indeed, in this case, can't
really solve it in the same way. There are multiple ways
we could address this. But let me propose one that has
the advantage of introducing a tiny piece of syntax that's pretty
cool to see for the first time. So in swap.py, let me go ahead
and declare x is 1 and y is 2. Let me go ahead and print out x is this
placeholder, and then plug in x there. And then go ahead and print
out y is this placeholder, and then plug in y there. And now let me go ahead and say
print swapping dot, dot, dot. And then we'll come back to this TODO. And now I'm going to go ahead
and say print swapped boldly, and then print x is this
placeholder, x, and then print y is this placeholder, and then format y. So all that remains to do
is the interesting part. So it turns out we could
do something like this. We could say temp gets x, and
then x gets y, and y gets temp. And that would work. It's a little inelegant
because now, the beauty of having a swap
function before in C was that we were factoring out that logic. We could use it in multiple places. Made the code a little more readable. And now, in the middle of this
beautiful print statement, I've got this mess here. But it turns out that's
the right spirit, at least to keeping the solution simple. But notice what you can do in Python. It turns out that you can
actually swap two things at once. And it's because of a feature
that's implicit in the syntax here. These are actually data types
on each side of the equals sign. It turns out that Python
supports not just lists, which we've generally known
thus far as arrays in C, but it also supports, again,
tuples, a data structure that allows you a comma
separated list of values, the burden of which is entirely on
you to remember what comes first, what comes last, what's in the middle. But by way of doing this-- and I can
do this in a couple of different ways. And I can do it not
even just with tuples. You can think of this a little more
like this, like x, y coordinates on a Cartesian plane and so forth. You can actually consider this as
happening really simultaneously, but letting the language,
Python and its interpreter, figure out how to do that
switcheroo without losing one or both of the
variables in the process. It doesn't matter to us the
low level implementation detail that that might actually
require some kind of temporary storage. That is now a feature
of the language that we get for free if we actually want to
assign two values simultaneously. And this is actually powerful
for that same reason. It turns out that if you
have some function called foo that returns a single value,
you could do something like this to get back that value, as we've been
doing all throughout these examples. But it turns out foo could
potentially return two values, which you could assign like this. Or foo could return
three values like this. If foo was indeed implemented
as returning a tuple, a comma separated list
of values like this. So you don't want to take this
necessarily to an extreme. But in C, you might
recall that we did not have this capability of being
able to return multiple values. And that is now an option,
although there's alternatives to needing to do that altogether. So we're almost caught up in time
in Python vis-a-vis where we started and where we ended with
C. But let's introduce one other feature of Python that
allows us to translate something from C as well. Recall that we introduced
structures some time ago. And indeed, I'm going to
go ahead here and save a file called structs0.py, which
is a bit misleading because these aren't technically structures. They're objects, as I'm about to use. But we'll clarify that in a moment. Let me go ahead here and import CS50. And let me also import, using
slightly different syntax, this. In a moment, I'm going to create on the
fly my own Python module, my own class, if you will, called
student, inside of which is going to be a class
called Student capital S. And first, let's assume that
it exists so that I can just take on faith that it will soon exist. And let me give myself a list of
students like this, an empty array, if you will, as implied by the
square bracket notation here. So new syntax. But what's nice is it's pretty readable. On the left is the
variable's name, assigning what's on the right hand side. We've seen square brackets for
arrays or lists more generally. So this just means give me an empty
list and assign it to students. Unlike strings, a list in
Python is mutable, changeable. So this does not mean that students
is forever going to be an empty list. We can add and append things to it,
much like a stack or a queue or a linked list more generally. So now let me go ahead and do this. For i in range three-- I'm just going
to arbitrarily do this three times, just like we did a few weeks ago. I'm going to in here now print out
name with no line ending, just to keep things pretty. Let me go ahead then
and use CS50.getstring to actually get a student's name. Then let me say hey, give me
your dorm with no line ending, just to keep it clean. And then dorm gets CS50.getstring. And then down here, let me do
students.append Student name dorm. So this is new now. And we'll come back to
this in just a moment. Then after this loop, let's
just for good measure do this. For student in students,
print the following placeholder is in placeholder. Then format student.name, student.dorm. So now things are getting
a little more interesting. I have now done a few
things in this program. I have imported something
called a student, which doesn't yet exist but will in a moment. I have declared a variable,
or a list, specifically, called students, and
assigned it an empty list. Then I'm iterating
three times arbitrarily just so we have a demo to play
with saying, give me your name, give me your dorm, and then this. So students is an object,
as we would have said in C, a structure. But now we call them objects,
inside of which is going to be data. There's not much data now. It's just an empty list. But it turns out, if you read
the documentation for Python, you'll see that a list has some
built-in functions, or methods, as well-- not just data, but also
functionality-- one of which is called append. And if we read the
documentation, we see we can pass in an argument to append
that is a variable or a value that we want to append to the
list, add to the end of it. And we'll see in a moment
what this syntax means. It turns out this is similar in spirit
to using malloc in C to malloc a struct and then put inside of it
two values, name and dorm. But what's nice about Python
and languages like PHP and Ruby and Java, all of which support
something similar in spirit, is this single line gives me a new
student object, inside of which is that student's name
and dorm as strings. Later, outside of this
loop, just for good measure, we reiterate over this list as follows. For student in students,
well, what is this doing? This, again, is an iterable list. So not irritable, iterable list,
whereby you can iterate over this list, calling each element inside
of it temporarily student, as in our previous use of for. And then just print
so-and-so is in this dorm, formatting those two values using the
same dot notation as we used in C. So we need a Student object. Otherwise, what's going to happen? Let me go ahead and try to run
this incorrectly as follows. Python structs0.py. Enter. Import error. No module named student. So creating a Python module,
it turns out, is super simple. I create a file called student.py. I now have a module called student. Of course, there's nothing in there. So I need to actually populate it. So let me go ahead and do this. And we'll come back to this in the
future with a bit more complexity. But for now, let me introduce,
with a bit of a wave of the hand, the following. If I want to create a structure
called Student, technically in Python, it's called a class. And that class should be
Student, the convention of which is to call your structures
in Python, your classes, with a capital letter
for the first letter. And now I'm going to define
a standard method called init that takes as its first argument
a parameter that's conventionally called self, and then any number of
other arguments that I want to pass it. And then inside here,
I'm going to do self.name gets name and self.dorm gets dorm. So this is perhaps the most
new-looking piece of code that we've seen thus far in Python. And we'll explain it just
at a high level for now. But in line 1, we're saying, hey
Python, give me a new structure. Give me a class called Student,
capital S. Line 2, hey Python, inside of this class, there shall
be a method, a function, that's called init for initialization. And it's going to take by convention
three arguments, the first of which you just have to do, let's say,
for now, the second and third and beyond of which are
completely up to you. Name and dorm is what I chose. And what's neat is this. Lines 3 and 4 mean whatever the user
passes into me as a student's name and dorm when this class is
instantiated, allocated as an object, go ahead and remember their name and
dorm inside of these instance variables called self.name and self.dorm. So if you think of the
scenario as follows, in structs0.py, we had this
line of code toward the end. Not only were we appending something
to the list called students. We had this highlighted portion of code. Capital Student, open paren,
name, dorm, closed paren. That is similar in spirit,
again, to calling malloc in C and automatically, all in one
breath, installing inside of it two values, name and dorm. So if this is similar in spirit to
malloc, you can think of this line here, this highlighted portion,
as creating somewhere in memory, in your computer-- doesn't matter
where-- a structure like my fist here, passing into it name and dorm. And then what happens on those two lines
of code in student.py, lines 3 and 4, is if name and dorm are the
two values that were passed in, they get stored inside of this
structure and saved permanently in what are called instance
variables inside of self. Self just refers to the object
that has been allocated. So we'll come back to that before long. But just take on faith for now that
init has to be the name of the method that you use. Self is conventionally used
as the first argument there. And this just ensures that
we're remembering a student's name and his or her dorm as well. So if I now run this, you'll
see I'm prompted for David. And I'll say Mather and Zamaila
and Courier and Rob and Kirkland. Enter. And the program doesn't
do all that much. But it manipulates and
it creates these objects, and ultimately does
something useful with them. But it throws the information away. And so for our command
line examples here, let's do one final example that
improves upon that as follows. Let me go ahead and create a new file
called structs1.py, similar in spirit to what we did some
time ago in structs1.c. I'm going to start with
that same code from before. And I'm going to keep around student.py. But instead of just printing
it, you know what? I'm going to get rid of the
printing of these names. I'm going to instead do this. File gets open students.csv,
w, quote, unquote. Writer gets csv.writer file. For
student in students, just as before, writer.writerow student.name,
student.dorm. File.close. Definitely a mouthful,
and it's not perfect yet. But let's try to glean what I'm doing. Open turns out is similar
in spirit to fopen from C. And it takes two arguments just
like in C, which is wonderful, the name of the file
to open and the mode in which you want to open it--
writing, w, or reading, r. And there's a few other options too. This just returns to me a
reference to that file somehow. And indeed, all this time
I've been describing variables as just that, variables. But technically speaking, all of these
variables-- x and y, and now file and s and t and others-- are
references or symbols that have been bound to objects in memory. Which is just to say that
you'll see online, especially when reading up on Python, that
there's certain terminology that's associated with the language. But at the end of the day, the
ideas are no different fundamentally from what we've been doing in Scratch
and in C. These are just a variable called file. Here's another variable called writer. And it is storing the return
value of CSV.writer file. So what's this? I only knew this by reading
up on the documentation because I was curious in Python, how
do I actually save my data inside of a CSV, Comma Separated Values file? This is sort of a very
super simple Excel file that just uses commas to
separate what are effectively different columns in a file. So my goal here is to ultimately
print David, Mather, Enter. Zamaila, Courier, Enter. Rob, Kirkland, Enter. And that's it. And save it permanently
on disk, if you will, so that we actually keep
this information around. So what does this do for me? It turns out that Python comes with
a built-in feature called the CSV Module, inside of which is a whole
bunch of functionality, some of which is this one here, a
class called writer that takes one argument when you
instantiate it called file. So this just means, hey Python,
give me a writer for CSVs. Give me an object whose purpose in life
is to write CSV files to hard drives. Iterate over my students in students. And then just from
reading the documentation, I know that I can call writer.writerow,
which is a bit hard to say quickly several times, but writerow. And then it takes as an
argument a tuple in this case. That's why there's the
double parentheses. A tuple, a comma separated list
of values, which in this case I want to be student.name
and student.dorm. Then I close the file at the end. So the net result here is
kind of underwhelming to run. And indeed, we're about to see a bug. Python structs1.py. Enter. David Mather, Zamaila
Courier, Rob Kirkland. Damn it. After all that input,
then there's an error. But this is actually illustrative of
another feature, or design aspect, of Python. I'm not necessarily going
to get compilation errors. I might actually get
runtime logical errors. If I have made a mistake
in my program that isn't something super simple
or dumb or frustrating, like leaving off a parenthesis
or a misplaced comma, or something like that
that's syntactically invalid, Python might not notice
that my program is buggy. Because if it scans my code
top to bottom, left to right and doesn't notice some
glaring syntax issue, it might proceed to just run the program
for me, that is, interpret the program. Only once the Python interpreter gets
to a line of code that syntactically is correct but confuses it might it
bail out with a so-called runtime error, or more properly, throw an exception. This one's saying name
CSV is not defined. And indeed, if I scroll
up, the first time I mention CSV was indeed on this line
with the x, undefined variable CSV. You know what? I messed up. I should have imported the CSV module. And I would only know that
from the documentation. But I can infer as much from the
fact that CSV does not exist. Let's try this one more time. David Mather, Zamaila Courier,
Rob Kirkland, and Enter. Nothing seems to happen. But notice students.csv
has now appeared. And indeed, I have David, Mather,
Zamaila, Courier, Rob, Kirkland. I have my own tiny little database. It's not a database in a
particularly fancy sense. I can't query it. I can't change it very easily. I have to just rewrite the
whole thing out essentially. But I have now persisted this data. And never before in
these Python examples have we kept any of
the information around until now, much like the
equivalent C version. So guess what else we
can do with Python. Not only can we
reimplement all of weeks 1 through 5's examples from C in Python, so can we implement the entirety
of our recent spell checker. For instance, you may recall that
the staff solution for speller was run a little something as
follows at the prompt, whereby we specify optionally a dictionary. But I'm going to go ahead
and use the default. And then I can spell check
something like AustinPowers.txt, which, in the CS50 staff solution,
which this one happens to use a trie, took me a total of 0.05
seconds to spell check a pretty big dictionary with 19,190 words. But it took me a long time
to implement that trie. It probably took you
quite a while to implement your trie, your hash table, your
linked list, or other data structure. But let me propose that
today, we have in our speller directory a reimplementation
of speller in Python. And this was the program you
didn't need to worry too much about in C. Speller.c we asked you
to read through and understand. But you didn't need to change it. And so indeed today, we
won't change it either. But I'm going to go ahead and
create a file called dictionary.py, inside of which is going to be my very
own implementation of this dictionary. And it turns out in Python, we
can implement speller as follows. Class dictionary, thereby giving me
really the equivalent of a structure, much like we have in C. And I'm
going to go ahead inside of this and declare a function that's by
default, and convention called init. That takes in one argument,
in this case called self. And I'm going to simply do self.words
gets set, where set, it turns out, is a function in Python
that returns to me an empty set, a collection of
values that facilitates, generally, in the average case, constant time
lookups of whether something's there, and constant time insertions of
putting something into that set, much like a mathematical set. I'm now going to go ahead and implement
my load function in Python as follows, whereby I take in self as an argument
as before, by convention, but then also the name of the file to
use as my dictionary. And similar to C, I'm going
to use a function like fopen, but this time called open, where I
simply pass in dictionary and quote, unquote, r. And then for each line in that file, I
am going to access the set called words and add to it the line I've just
encountered after stripping off the trailing new line. Then I am going to close the file. And I'm going to return true. And I'm going to have
finished my homework for load. With just those few
lines of code, can we reimplement the entirety
of the load function for problem set 5 speller
dictionary in Python itself. Now the check function, maybe
that's where the price is paid. Maybe the tradeoff is check's
going to be really, really scary. So I'm going to implement this
one as a method inside here too, taking in a word that
we want to spellcheck. And I'm going to return
word.lower in self.words. And that's it for the check method. What is this doing? This is saying, return, true or
false, whether the lowercase version of the given word is in my own word set. So self.words just refers to this
container that's initially empty but that has just been
populated by the load method by adding in all of the words
that we loaded from that file. So this true or false is
implemented as follows. Lowercase the given word and
check whether it's in that set, and return true or
false in just one line. Well, all right. Maybe size is going to be
where the price is paid. Maybe size is what's really broken here. So let's go ahead and implement size. And let me return the length of self.words. All right. That one's perhaps not a surprise
since size in C is also pretty easy. But what about unload? Well, how about in unload,
we similarly declare it. Well, there's nothing to
unload because Python does all of your memory management for you. So even though you might be
allocating more and more memory as you use this set, there's
nothing to actually unload because the interpreter
will do that for you. So it turns out that all of these
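
Put together, the four methods just described might look roughly like the following. This is only a sketch of what was said aloud: the class name Dictionary and the choice to have unload return True are assumptions, not something shown in this part of the lecture.

```python
class Dictionary:
    """Sketch of the speller dictionary described above (class name assumed)."""

    def __init__(self):
        # An empty set: average-case constant-time insertions and lookups
        self.words = set()

    def load(self, dictionary):
        # Open the dictionary file for reading, one word per line
        file = open(dictionary, "r")
        for line in file:
            # Strip the trailing newline before storing the word
            self.words.add(line.rstrip("\n"))
        file.close()
        return True

    def check(self, word):
        # True if the lowercased word is in the set, False otherwise
        return word.lower() in self.words

    def size(self):
        # Number of words loaded so far
        return len(self.words)

    def unload(self):
        # Nothing to free: the interpreter manages memory for us
        # (returning True here is an assumption)
        return True
```

Calling load with the path to a word list and then check("CAT") would report True whenever "cat" appears in that file.
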
conversions from C to Python are useful in part
because clearly, you can implement the same kinds
of programs that we've been implementing for a week. And frankly, in many cases, more
easily and quicker, or with fewer lines of code, or in a way that's
just much less painful to write. All of that low level stuff where
you're implementing hash tables or trees or tries is wonderfully illustrative
of how those things work, and hopefully gives you a true understanding of
what's going on underneath the hood. But my god. If you just wanted to store
words in a dictionary, if you had to implement dozens of lines
of code to implement your own trie, or your own hash table or linked
list, programming very quickly devolves into an incredibly
mundane, frustrating profession. But in this case do we begin to
see hints of other languages, Python among them, that allow us to
solve the same problems much more quickly, much more efficiently,
much more effectively, much more pleasurably, such that
now we can start to stand on the shoulders of even more
people who have come before us, start building on not
only this language, but on other APIs and libraries. And indeed, that's now
why we introduced Python. No longer in the weeks
to come are we going to be focusing on the command
line alone, but rather on web-based interfaces. Indeed, in Python do we have the
ability to so much more easily than in C write web-based software,
actual websites that are dynamic, not just
built out of HTML and CSS, but that have shopping carts and use
databases and send emails or SMSes, or any number of dynamic features,
all of which, to be fair, we could implement in C. But it
would be the most painful experience in the world to implement
a dynamic website with all of those features in a lower level
language like C. But with Python can we start to do this
so much more readily. So how do we go about using
Python to generate websites? A couple of weeks ago when we
first looked at HTML and CSS and talked more generally about HTTP,
we hard coded everything we wrote. We wrote HTML in our text editor. We wrote CSS in our text editor. We saved those files. And then we loaded
them using our browser. But there was nothing dynamic about it. There wasn't even a hello world program
that dynamically took my name. But we did discuss, in
the context of HTTP, this ability of web
browsers and web servers to use HTTP parameters in order to
transmit inputs in between the two. For instance, we talked
about get, whereby you can pass in these key value pairs
via the get string, the query string, in the URL itself. We talked a bit about
post, whereby you could transmit more sensitive information,
or bigger things like photographs and passwords and
confidential information, via post, which is still passing in
key value pairs from browser to server. But we didn't at the
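
To make those key value pairs concrete, here is a small sketch using Python's standard urllib.parse module. The search term cats and the form values are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# A GET request carries its parameters in the URL's query string
url = "https://www.google.com/search?q=cats"  # hypothetical search
parts = urlsplit(url)
get_params = parse_qs(parts.query)
print(parts.path)   # /search
print(get_params)   # {'q': ['cats']}

# A POST request typically carries the same key=value format in its body
body = "name=David&dorm=Matthews"  # hypothetical form submission
post_params = parse_qs(body)
print(post_params)  # {'name': ['David'], 'dorm': ['Matthews']}
```

Each value comes back as a list of strings because a key may legally appear more than once.
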
time have any ability to actually read or parse those
inputs and produce dynamic outputs. In fact, the most dynamic
we got a couple of weeks ago was with those search examples
whereby I reimplemented the front end interface of Google, sort of our very
low budget version of Google's website. And then I just completely
punted to their back end using the action attribute of
https://www.google.com/search, pretty much deferring entirely to Google
all of the interesting, dynamic output for my search results. So today, we won't generate
those search results ourselves. But we will give ourselves, now that
we have a language and the environment with which to handle those
inputs, we will give ourselves the capability to start creating
websites more like that. In fact, ultimately, the goal
of creating web-based software is to dynamically
output stuff like this. This, of course, is
the simplest web page we could perhaps implement in HTML. But it's entirely hard coded. Wouldn't it be nice if we
could minimally, for instance, add someone's name
dynamically to that output so that it actually interacts
with them in some way? And you can, of course, extrapolate
from that kind of feature to things like Gmail,
where it's constantly, dynamically interacting
with your keyboard input based on who you put in the To field,
what you put in the subject line. The website's going to do and behave
differently in order to send that mail. Facebook Messenger or Gchat
or any number of tools are constantly taking
web-based input from users and dynamically producing output. But how do we get at
that input and output? Especially since at the end of the
day, this is all HTTP boils down to. Inside of those virtual
envelopes, so to speak, going between client and
server or browser and server, are requests like these from the client. Get me the home page
using this version of HTTP specifically from this host name here. And then maybe some other additional
detail and maybe some parameters in that URL string. Meanwhile, the server is
going to respond similarly with something pretty simple-- a textual
response, some HTTP headers like this saying the content type is text HTML,
if it indeed is, followed by the HTML that the server has generated. So it would seem that we
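
As a rough illustration of what sits inside those envelopes, the request and response can be written out as plain Python strings. The host name www.example.com and the HTML body are placeholders, not the values shown on screen.

```python
# What the browser sends: a request line, headers, then a blank line
request = (
    "GET / HTTP/1.1\r\n"
    "Host: www.example.com\r\n"  # placeholder host name
    "\r\n"
)

# What the server sends back: a status line, headers, a blank line, then HTML
body = "<!DOCTYPE html><html><body>hello, world</body></html>"
response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    + body
)

# The first line of the request names the method, the path, and the version
method, path, version = request.split("\r\n")[0].split(" ")

# Everything after the first blank line of the response is the page itself
headers, _, payload = response.partition("\r\n\r\n")
```

It is all text, which is why a server's job comes down to reading strings like the first and generating strings like the second.
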
need the ability, when writing web-based
software, to be able to, one, dynamically generate
HTML based on who the user is or what he or she wants
to see dynamically. So we have the ability to write
HTML, of course, per two weeks ago. But we haven't yet printed it
or generated it dynamically. And we're also going to need a
feature whereby, somehow or other, any HTTP parameters
coming to us from browsers can be interpreted so that if a
user is trying to add something to their shopping cart, we can actually
see what it is they've requested to add to their shopping cart. So it turns out we need just
one mental model, if you will, for this world of the web. Back in the day, this mental
model didn't necessarily exist. But over time, we humans have come
up with certain paradigms, or design patterns, so to speak, that guide common
implementations of web-based software or mobile software. Because the world realized over time
that they adopted certain habits. Or there are certain convenient
ways to implement software. And one such method, or
one such design pattern, is generally called MVC,
Model View Controller. And in this world, the
controller is really where the brains of your program or
your website are-- all of the logic. The logging in of users,
logging out of users, adding things to a shopping cart,
removing things, checking out, billing them, all of that sort
of business logic so to speak. And that exists in one
or more files, typically, on a web server that collectively
are called the controller. So it's not a technical term per se. It's just a descriptor for what
your code is ultimately doing. View, meanwhile, the V in MVC,
refers to the aesthetics of your site typically-- the templates
that you use for HTML, or the CSS files that you use
in order to style your website. In other words, while the
thinking, all of the code logic might be embedded in files
called your controller, all of the sort of fluffier
but still important stuff, the aesthetic stuff, might be
in the view side of things. And then lastly is the M in MVC, Model,
which is where your data typically comes from. So we just did an
example using a CSV file. That's a model of some sort. It's a super simple model. But a model is just a general term
describing where your data lives and how you access it. And before long, we're
going to use a fancier version of a model, an
actual database server, that we can query and insert
into and delete from and edit, and any number of
other features as well. But for now, today, let's just focus
on the C and the V in MVC as follows. I'm going to go ahead and open
up CS50 IDE, where we have a simple program here called serve.py. And this is perhaps
among the lowest level ways we could go about
implementing our own web server. So again, CS50 IDE comes
with its own web server. And Google has its own web server. And Facebook has its own web server. And many of them are using, like us,
open source software, freely available software that's super popular. But suppose we want to
implement our own web server that listens on TCP
port 80 for HTTP requests for those virtual envelopes. In Python, we might do it as follows. And a lot of the words on
the screen might be new. But the syntax is fundamentally the same
as what we've been focusing on today. So from some module that comes with
Python called http.server, we import two classes called BaseHTTPRequestHandler
and HTTPServer. So it turns out that Python comes with
some built-in web server functionality. It's not all that user
friendly, as we'll see. We have to do a lot of work to use it. And the names are fairly
verbose unto themselves. But it comes with the
ability, as a language, to let you implement a web
server, a piece of software that when you run it just starts
listening on the internet, on your computer's IP address on TCP
port 80 for incoming HTTP requests and then responds to
them as you see fit. So we've defined a class here
called HTTP server request handler that descends from
this parent class, so to speak. But more on that in the days to come. On line 7 here, I'm defining a
method conventionally called do Get, where Get is capitalized, thereby
making super clear that this is the function, the
method, that's going to be called if our server
receives a request via HTTP get, as opposed to
post or something else. Self is, again, the convention when
implementing a class for methods to take in a reference to
themselves, so to speak. A reference to the containing
object we'll just call self. Now inside here-- and you'd only know
this from having read the documentation or having done this before--
notice that we're going to do a few things in this web server. Super simple. We're going to, no matter
what, just send a response code, a status code of 200. Everything is always OK in this server. It's not realistic. Things could certainly go wrong,
especially if the user asks us for something that we don't have. A 404 might be more appropriate. But we're going to keep the
example simple and no matter what, send 200, OK. Meanwhile, we're also going to send
another HTTP header using this Python call here of self.send_header. And to be clear, these features--
send_response, send_header, and soon end_headers-- are
methods or functions that come with Python's
built-in web server that we are simply extending the
capabilities of at the moment. What is the header that we want to send? Content type colon text HTML. So we're going to behave exactly like
that canonical example I put up a moment ago. Lastly, we're going to send
a super simple message. We're simply going to write
essentially to the socket connection that my server has with the browser,
the internet connection that we have. I'm going to write the following bytes. Hello, world. And I'm going to use an
encoding called UTF-8, which is a way of encoding
Unicode, which, again, is an encoding scheme
that's a superset of ASCII, as we discussed back in week 0. That's it. Return. Now, this just defines a class, my
own customization of a web server. Python comes with a web server
built in-- specifically, that class called base
HTTP request handler. And I'm simply extending
its capabilities to specifically return hello
world with content type text HTML and with a status code of 200. That wouldn't necessarily
be the case by default-- certainly not that generic message. But I have to start this server. And I could add a main function or
implement this in any number of ways. But I'm going to keep it simple. At the bottom of the file, I'm
going to configure the server here, hard coding port 8080 to be
the value of this variable. A server address here
is going to be a tuple. And you would only know this,
again, from the documentation. This tuple, this comma
separated list of values, is going to be this weird-looking
IP address, and then that same value, 8080. And this weird-looking IP
address is a convention. If you specify that you
want a web server to listen, to talk on IP address
0.0.0.0, that's generally shorthand notation for saying, listen
on all possible network interfaces that are built into my computer, whether
it's CS50 IDE, or an actual server, or a Mac, or a PC. This is sort of like
the wildcard saying, just listen on any one
of your ethernet cables or Wi-Fi connections for incoming
requests, but specifically on this port 8080. This last line here essentially
instantiates an HTTP server, passing into it our request handler,
which is that customization of behavior that I described earlier. And then lastly, nicely
enough, there's a method, a function built into this Python
server called serve forever, which just turns the server
on and never turns it off unless I forcibly kill
it with, say, Control-C. So let's go ahead and actually run this. I'm going to go ahead into
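
A sketch of a serve.py along the lines just described follows. The exact class name and the make_server helper are assumptions; the lecture's file configures the server at the bottom of the file instead, but wrapping that in a function keeps this sketch from blocking until you ask it to.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HTTPServerRequestHandler(BaseHTTPRequestHandler):
    """Customizes Python's built-in handler (class name spelling assumed)."""

    def do_GET(self):
        # Always answer 200 OK, which is not realistic but keeps things simple
        self.send_response(200)
        # Promise the browser that HTML follows
        self.send_header("Content-type", "text/html")
        self.end_headers()
        # Write the message to the connection as UTF-8 encoded bytes
        self.wfile.write("hello, world".encode("utf-8"))
        return


def make_server(port=8080):
    # 0.0.0.0 means: listen on all of this computer's network interfaces
    server_address = ("0.0.0.0", port)
    return HTTPServer(server_address, HTTPServerRequestHandler)


# To start listening until interrupted with Control-C:
#     make_server().serve_forever()
```
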
the folder containing serve.py and run Python serve.py, Enter. And nothing seems to happen just yet. But I'm going to go ahead and
open up another tab in CS50 IDE. And I'm going to go to
http://127.0.0.1:8080. So why this IP address? Even though this is a little
inconsistent with what I just said, technically, 0.0.0.0 is
not your actual IP address. It's, again, just kind
of a wildcard string that represents all of your
possible network interfaces. Every computer on the
internet, generally, has a local host address--
not its public IP, not even a private IP
that's in your own home network behind your own
firewall-- but 127.0.0.1 represents your own local
host address, an IP address that by default every
computer on the internet has insofar as it refers to itself. So we all have generally, in our
own Macs and PCs, or CS50 IDEs, access to this IP address,
which just refers to myself. And port 8080 after the colon. Normally, using a browser, you
don't specify the port number by saying colon 80 or colon 443. But in this case, because
it's a nonstandard port, what I want to do with Google Chrome here is
talk to my computer on this local host address on that port. Now, if you play along at home
using CS50 IDE on the web, your address will actually be different. I simply happen to be using a local
version of CS50 IDE on my own Mac here so that I don't have to
contend with any Wi-Fi issues. But the idea is going
to be exactly the same. Whatever your workspace's
IP address is or host name, the English version of it, colon
8080, is what you will type. Let me hit Enter. But it's not all that interesting. Indeed, if I view the page source, as
we have in the past, this is not HTML. I've been super lazy right now, simply
outputting a promise via that header that I'm outputting a
content type of text HTML. But this isn't really HTML. This is just text. And so this really isn't
a full-fledged web server. It's certainly not dynamic in that
I've literally hard coded hello world. So let's do something a little better,
a little more pleasurable to write. And for that, we're actually going
to need something called a framework. And so it turns out that writing
code like this-- totally possible, and folks did it for some time. But eventually did people
realize, you know what? We're doing the same kinds of
lines of code again and again. This isn't particularly fun to
implement the website or the product that I'm working on. Let me actually start to
borrow ideas from past projects into current projects. And thus were born
things called frameworks, collections of code written
by other people that are often free or open source that you can
then use in your own projects to make your life easier. And indeed, this is thematic. Especially as we get farther and
farther from C and lower level languages toward Python, and
eventually JavaScript and beyond, you'll find that it's
thematic for people to sort of stand again
on each other's shoulders and use past problems solved to
solve future problems more quickly. So what do I mean by that? Well, one of the very first
things I did way back in the day when learning web programming myself,
after having taken CS50 and CS51, is I taught myself a language called Perl. It's not really in vogue these
days, though still around and still under development. But it's similar in spirit to what
we're talking about today in Python. And I happened to use that
language back in the day to implement a website, the first
ever website for the freshman intramural sports program. So all the freshmen or
first years who want to participate in sports
just for fun, back in my day, we would register for sports by
walking across Harvard Yard, uphill both ways in the snow, and
then slide a piece of paper under one of the proctor's
or RA's doors saying, I want to register for volleyball,
or soccer, or whatever it was. So this was an opportunity ripe
for disruption with computers. So I taught myself web
programming back in the day and volunteered to make
this website for the group so that students like
myself could just-- well, maybe students not like myself
could register for sports online. And so what did I actually do? Well, we won't look at the Perl version. We'll look instead at a Python version
using a very popular framework, freely available code called Flask. So Flask is technically
a micro framework in that it doesn't have a
huge number of features. But it's got relatively few
features that people really seem to like lately and that help
you get work done faster. And by that I mean this. This is how I might implement the
simplest of websites for the freshman intramural sports program. Now, admittedly, it's lacking
in quite a few features. But let's see how it works. And indeed, with some of our
future web-based projects in CS50, will we build upon Flask
and borrow these same ideas. So you'll notice that
from Flask, am I importing a whole bunch of
potential features, none of which I want to implement
myself, all of which, pretty much, I would have had to implement myself if
I used that base HTTP web server that comes with Python itself. So Flask is built on top of
that built-in functionality. How does it work? Once I've imported this
module's components and classes, I'm going to go ahead
and instantiate, so to speak, an application of type Flask,
passing in the name of this file. So this is just a
special symbol, __name__, that we've seen before in the context
of main that just refers to this file. So this says, hey Python, give me a
Flask application based on this file. So now notice on line 5,
a slightly new syntax, something we'll call a decorator. And it's a one liner in this case that
simply provides Python with a hint that the following method should
be called anytime the user requests a particular route. A route, typically, is something like
/foo or /bar or /search or the like. So a route is like the path that you
are requesting on the web server, slash generally being the default. So this is saying to Python,
anytime the user requests slash, the default home page, go
ahead and call this index method. Technically, we could
have called anything. But this is a good convention. And return what? The rendering of this template. In other words, don't just
return a few bytes, hello world. Return this whole HTML file. But it's a template in the sense that we
can plug in values, as we'll soon see. Meanwhile, hey Python, when you
see a request for /register, not using get by default,
but by using post, which might happen in a form submission,
go ahead and call this method called Register. And just as a sanity
check, let's do this. If the request that we have
received has a form in it that has a Name input in it that's
blank, equals quote, unquote, or the request we've received has
a form whose Dorm field is blank, then return this template
instead, failure.html. Otherwise, return success. So in other words, if the user has
submitted a form to register for sports and he or she has not given
us their name or their dorm, let's render this failure message. Don't let them register because
we don't even know who they are or where they're from. So we're going to display failure.html. Otherwise, by default,
we'll display success.html. So let's see what this looks like. I'm going to go ahead and hit Control-C
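
The two routes just described might be sketched as below. To keep the sketch in one file, it renders short inline strings with render_template_string instead of the lecture's separate template files, and it uses plain text inputs and request.form.get; those choices, and the exact wording of the form, are assumptions.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline stand-in for the lecture's form template (assumed markup)
FORM = """<form action="/register" method="post">
  <input name="name" placeholder="Name" type="text">
  <input name="dorm" placeholder="Dorm" type="text">
  <input type="submit" value="Register">
</form>"""


@app.route("/")
def index():
    # The default route shows the registration form
    return render_template_string(FORM)


@app.route("/register", methods=["POST"])
def register():
    # Sanity check: refuse the registration if either field is blank
    if request.form.get("name", "") == "" or request.form.get("dorm", "") == "":
        return render_template_string("You must provide your name and dorm!")
    return render_template_string("You are registered! (Well, not really.)")
```

Because the register route lists only POST, visiting /register directly in a browser (a GET) is rejected rather than handled.
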
to get out of the old web server. I'm going to go into
this Frosh IMs directory. And this time, instead
of running Python, I'm instead going to run Flask, Run. And then I'm going to be
just super specific here. I'm going to say the host I want
to use is every possible interface. And then the port I'm
going to use is 8080, though you can configure
these in different ways. And I'm going to hit Enter. A whole bunch of stuff
scrolled on the screen. But the essence of it is that it's
serving the Flask app called application. Debug mode in CS50 IDE is turned
on by default at the moment. And now we're ready to go. If I now go back to my web page and
reload, I'm still at the same URL. But a different web server is
now responding to my requests. And this is sort of in the spirit of
1996, '97, whenever I implemented this. This is what the web looked like. And in fact, this might
be a little worse. So now, suppose I'm kind of in a rush. I just want to register for sports. I don't think to
provide my name or dorm. Let me hit Register. And I'm yelled at. You must provide your name and dorm. And notice, where am I? If I look at the URL, I'm
at that same IP and port number, which will vary based on where
you are in the world and what service you're using, like CS50 IDE. But I'm at /register, that route. All right. Let me go back. Let me go ahead and give
them my name at least. David. Register. And voila. I am being yelled at again because
even though I provided my name, I've not provided my dorm still. So this seems to be a fairly
lightweight error message. But let me cooperate. Let me provide both David and
say Matthews, and click Register. Aha. You are registered. Well, not really. So why not really? Well, that's because this particular
website has no database yet. There's no place to store the data. There's not even a CSV file. There's no email functionality. All it's being used for
today is to demonstrate how we can check for the
presence of form submissions properly to make sure the user is
actually providing those values. So if I actually go back into CS50 IDE,
let's go into this Frosh IMs directory, inside of which is a
Templates directory, and take a look at Failure,
the first thing that we saw. Now, this admittedly
looks a bit cryptic. But for a moment, notice that it's
extending something called layout.html. And actually, it looks like
there's this special syntax here. So it turns out that Flask
supports a templating language. It's not HTML. It's not CSS. It's not even Python. It's sort of a mini
language unto itself, a templating language that gives you
simple instructions and features that allow you to dynamically
plug in values to your HTML. So you don't have to
hard code everything. And so this is saying,
hey Flask, go and get the template file called
layout.html, and then plug in this title, and then this body. So block title here. Notice the funky syntax,
curly brace with percent sign, percent sign with curly brace. This is literally saying, hey Flask,
the title of this page shall be Failure. And the body of this page,
as per this block here, shall be, quote, unquote, "You
must provide your name and dorm." Meanwhile, if we open up
success.html, it's similar in spirit. But notice it has a title of Success. And it has a body of,
"You are registered." Well, not really. So nothing interesting is happening. But this body and this title
will be plugged into this layout, this other template file. So let's now look at layout. This looks more familiar. So layout.html is sort of the
parent of these children-- success.html and failure.html. And this is because I
realized, when designing the website for the first time, I
don't want to have to copy and paste a whole lot of similar HTML. I don't want to have HTML in
every file, head in every file, title in every file, body in every file. There's a lot of redundancy, a lot
of structure to these web pages. It would be nice if
I can kind of come up with a general layout, the aesthetics
for my overarching website, and then, on a per page basis,
just plug-in a custom title, just plug-in a custom body. And that's all this
funky syntax is doing. It's saying, hey Flask, put
the body of this page here. And hey Flask, put the
title of this page here. So again, this has nothing
to do with Python per se, nothing to do with HTML or
CSS per se, except that it is a templating language,
another language used for really plugging in values in this way. And it's conventionally used in
exactly this context with Python, with CSS, and with HTML. It helps us keep things a
little cleaner and avoid a whole lot of copy and paste. The form, meanwhile,
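
The parent and child relationship between layout.html and failure.html can be tried out on its own. Flask's templating language is Jinja, so this sketch uses the jinja2 package directly and keeps the templates in a dictionary rather than in files; the exact markup of the lecture's layout is an assumption.

```python
from jinja2 import DictLoader, Environment

templates = {
    # The parent: overall page structure with two named holes to fill
    "layout.html": (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "  <head><title>{% block title %}{% endblock %}</title></head>\n"
        "  <body>{% block body %}{% endblock %}</body>\n"
        "</html>"
    ),
    # A child: extends the parent and supplies only the title and body
    "failure.html": (
        '{% extends "layout.html" %}\n'
        "{% block title %}Failure{% endblock %}\n"
        "{% block body %}You must provide your name and dorm!{% endblock %}"
    ),
}

env = Environment(loader=DictLoader(templates))
html = env.get_template("failure.html").render()
print(html)
```

The printed page has the full structure from the layout, with Failure in the title and the error message in the body.
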
that we originally saw is perhaps even more familiar
except for the block up top. It too extends layout.html. It has the title of Frosh IMs. And then it's pretty
much just got an H1 tag, which you might recall
from a couple weeks back. It's got a form tag. It's got some BR for line breaks,
a select element, and more. And the only thing that's a little
interesting here is notice this. Whereas two weeks ago, I hard coded
an action value like google.com/search or to my own file, this is a
nice abstraction, if you will, whereby I can say in my template,
give me the URL for my register route. Now realistically, it's probably just
/register because that's what we hard coded into the file. But it would be nice to
not have to hard code things that could change over time. And so this URL for is a way of
dynamically asking the templating language, you go figure out
what this route is called and plug in the appropriate
relative URL here. So that's all Frosh IMs
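
Here is a small sketch of what that lookup does, using Flask's url_for both from Python and from inside a template. The one-line form and the register function's return value are made up for illustration.

```python
from flask import Flask, render_template_string, url_for

app = Flask(__name__)


@app.route("/register", methods=["POST"])
def register():
    return "registered"  # placeholder response


# Assumed one-line form; the lecture's form has more fields
TEMPLATE = """<form action="{{ url_for('register') }}" method="post"></form>"""

# url_for needs a request (or application) context to work outside a real request
with app.test_request_context():
    action = url_for("register")
    form = render_template_string(TEMPLATE)

print(action)  # /register
print(form)    # <form action="/register" method="post"></form>
```

If the route's path were later changed in the decorator, the form would pick up the new URL without being edited.
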
does on the front end. What's the back end? Well for that, we need to
look at application.py. Again, this is where we started. When I submit via post that super
simple HTML form to /register, that is, this route, first,
this if condition runs. If the request form's Name field
is blank or its Dorm is blank, return failure. Else, return success. But this isn't especially dynamic. It would be nice, if I keep saying
"dynamic," that things actually are dynamic. So let's look at one final example. So now let's rerun Flask in this Store
subdirectory, still on the same IP, still on the same port, but now serving
a different application. Indeed, when we now
reload the browser, we see not the Frosh IMs site, but a
super, super simple e-commerce site, a web storefront that allows us to buy
apparently foos and bars and bazes. This is just an HTML form. These are text fields here. These are just labels on the side. And this is a Submit button. And the shopping cart
ultimately is going to show me how many of these things
I've added to my shopping cart already. Indeed, let's try adding one
foo, two bars, and three bazes, and click Purchase. I'm redirected automatically to my cart. And I just see a reminder that
I've got one foo, two bars, three bazes. And let's just confirm that
this is indeed the case. Let me go ahead and continue shopping. And let me go ahead
and buy 10 more foos. Click Purchase. And indeed, it's
incremented this properly. And indeed, while you can't quite
see what I'm doing on my keyboard, I'm hitting Reload now. And nothing's changing. Indeed, if I close the
window and then reopen it, you'll see that it retains state. In other words, no matter whether
I close or open my browser, it seems to be remembering that I've
got 11 foos, two bars, and three bazes in my shopping cart, so to speak. So how did we implement
this functionality? Well first, notice that the
store itself is super simple. It's just some HTML and a template
that has a whole bunch of input fields textually for foos, for
bars, and for bazes, as well as that Submit
button called Purchase. And it, like before,
extends layout.html, which this application
has its own copy of. The cart, meanwhile, is
actually pretty simple. And notice what's nice about
this templating language. It lets us do this. This is a file called cart.html
that also descends from that layout. And we have this H1 tag
here that just says Cart. And now notice this template language
looks quite like Python here. Has for item in cart. And it allows me using this curly
bracket notation, two of them on the left, two of them
on the right, to plug-in the Quantity field of this Item object. So it seems that Cart
is some kind of list, and Item are the elements, the
objects inside of that list. And this is giving me the Quantity
field inside of this structure and the Name field
inside of this structure. And that's why I see 11 foo and two bar
and three baz in my templating language here. I'm just iterating over what
apparently is my shopping cart. Now this invites the question,
what is this shopping cart? And for that last detail, we
need to look at application.py. So as before, we instantiate
a Flask application up here. But we also configure it with
a property called secret key per the documentation for Flask. You probably shouldn't
use a value of shh, but for now we'll keep things simple. But that key, long story short, is
used to ensure with higher probability the security of the sessions of
the shopping cart that we're using. This line here again
declares a route for slash, specifying we only want
to accept get and post. Turns out there's other verbs
like put and patch and others, but we're going to ignore those for now. And if that route is requested,
call this method store. Meanwhile that method says, hey, if
the request method is post-- that is, if the user submitted a form
not by get but by post-- go ahead and iterate over
the three possible items that we sell in the
store, foo, bar, and baz. If the item is not
already in the session-- and you can think of session
as our shopping cart. It's a special object dictionary
that allows us to store keys and values, a hash table of sorts. Then go ahead and add to the session,
the shopping cart, that item-- foo or bar or baz-- and then as an
integer the number of foos or bars or bazes that the user
requested via the form. We're calling int here, effectively
casting whatever that value is. Because it turns out HTTP, all of
those messages going back and forth all this time are purely textual. So even though it looks like 10 or 1
or 2 or 3, those are actually strings. So using int here converts
that to the integer. So we're actually storing numbers with
which we can do simple arithmetic. Otherwise, if a foo or bar or baz
was already in the shopping cart, go ahead with Python's plus
equal operator, which C also had, and just increment that
count from 1 to 11, for instance, as I did a moment ago. And then redirect the user to whatever
the URL is for the cart route. In other words, after I
add something to my cart, let's just show me the cart right away
rather than showing me the order form instead. Otherwise, if the user
requested this page via get, go ahead by default and just
return store.html, which is that simple form that lists the text
fields and the numbers of foos and bars and bazes that you might want to buy. Meanwhile, the shopping cart is
implemented in this application as follows. Here's a route for /cart, in which
case a method called cart is called. We then declare inside of this
method an empty list called cart. And then as before, we iterate
over the available items in our store-- foo and bar and baz. And then what do we do? We simply append to this cart,
this list object, the following. We append what's called a dictionary,
a hash table, a collection of key value pairs, simply by associating a name
with an item capitalized properly, and a quantity associated
with the actual number of those items in my session. And then we return this time not just
the template via its name, cart.html. We furthermore render cart.html,
passing into that template a variable called cart
whose value is also cart. In other words, this variable
is going to be called cart. And it's going to be equal
to whatever this list is. And it's these two lines
here in the for loop that are appending a
set of key value pairs so that we know how many foos we have,
how many bars, and how many bazes. And that's why in cart.html do
we have access to, on line 9 here, a cart list over
which we can iterate. So there, we're just scratching
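
The store and cart routes described above might be sketched as follows. As before, the templates are inlined with render_template_string so the sketch fits in one file; the markup, the ITEMS list name, and the use of form.get and session.get with defaults are assumptions.

```python
from flask import Flask, redirect, render_template_string, request, session, url_for

app = Flask(__name__)
app.secret_key = "shh"  # placeholder from the lecture; use a real secret in practice

ITEMS = ["foo", "bar", "baz"]

# Inline stand-ins for store.html and cart.html (assumed markup)
STORE = """<form action="{{ url_for('store') }}" method="post">
{% for item in items %}
  <input name="{{ item }}" type="number" value="0"> {{ item }}<br>
{% endfor %}
  <input type="submit" value="Purchase">
</form>"""

CART = """<h1>Cart</h1>
{% for item in cart %}
  {{ item["quantity"] }} {{ item["name"] }}<br>
{% endfor %}"""


@app.route("/", methods=["GET", "POST"])
def store():
    if request.method == "POST":
        for item in ITEMS:
            # Form values arrive as strings, so convert before doing arithmetic
            quantity = int(request.form.get(item, 0))
            if item not in session:
                session[item] = quantity
            else:
                session[item] += quantity
        # After a purchase, show the cart right away
        return redirect(url_for("cart"))
    return render_template_string(STORE, items=ITEMS)


@app.route("/cart")
def cart():
    # Build a list of dictionaries, one per item, for the template to loop over
    cart = []
    for item in ITEMS:
        cart.append({"name": item.capitalize(), "quantity": session.get(item, 0)})
    return render_template_string(CART, cart=cart)
```

The session is stored per browser via a signed cookie, which is why the counts survive reloading the page.
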
the surface of what we can do. But we now have a
language with which we can express these new kinds of features. We now have a server environment
that allows us to actually execute Python code not only
at the command line, but also via HTTP and in
turn TCP/IP, and in general, over the internet itself. So now, using this language--
and soon a database, and soon a client side language
called JavaScript and more-- can we start to build the very kinds of
websites with which you're probably already familiar and that you use
every day on your laptops and phones. We're just now beginning our
foray into web programming. And next week, we'll add a back
end so that we can actually do all this and more. [AUDIO PLAYBACK] [MUSIC PLAYING] -I never even got to know him. I just-- I don't know what happened. Please, please. I-- I need to be alone. The people need to know. [INAUDIBLE] No. Please-- please go. I never did get that dinner with
him at his favorite restaurant. [END PLAYBACK]