CS50 2021 - Lecture 2 - Arrays

All right, this is CS50, and this is week two. Now that you have some programming experience under your belt in this more arcane language called C, among our goals today is to help you understand exactly what you have been doing these past several days, wrestling with your first programs in C, so that you have more of a bottom-up understanding of what some of these commands do and, ultimately, what more we can do with this language.

So this, recall, was the very first program you wrote, I wrote, in this language called C, much more textual, certainly, than the Scratch equivalent. But at the end of the day, computers (your Mac, your PC, VS Code) don't understand this actual code. What's the format into which we need to get any program that we write, just to recap? So binary, otherwise known as machine code, right? The zeros and ones that your computer actually does understand. So somehow we need to get to this format, and up until now we've been using this command called make, which is sort of aptly named because it lets you make programs. And the invocation of that has been pretty simple: make hello sort of looks in your current directory or folder for a file called hello.c, implicitly, and then it compiles that into a file called hello, which itself is executable, which just means runnable, so that you can then do ./hello.

But it turns out that make is actually not a compiler itself. It does help you make programs, but make is this sort of utility that comes on a lot of systems that makes it easier to actually compile code by using an actual compiler, the program that converts source code to machine code, on your own Mac or PC or whatever cloud environment you might be using. In fact, what make is doing for us is actually running a command automatically known as clang, for C language. And so, in fact, here, for instance, in VS Code is that very first program again, this time in the context of a text editor. And I could compile this with make hello, but let me go ahead and use the compiler itself manually.
And we'll see in a moment why we've been automating the process with make. I'm going to run clang instead, and then I'm going to type hello.c. So it's a little different how the compiler is used: it needs to know explicitly what the file is called. I'll go ahead and run clang hello.c, Enter. Nothing seems to happen, which, generally speaking, is a good thing, because no errors have popped up. And if I do ls now, for list, you'll see there is not a file called hello, but there is a curiously named file called a.out. This is a historical convention; it stands for assembler output, and this is just the default file name for a program that you might compile yourself manually using clang itself.

Let me go ahead now, though, and point out that that's kind of a stupid name for a program. Even though it works (./a.out would work), if you actually want to customize the name of your program, we could just resort to make, or we could do explicitly what make is doing for us. Because it turns out some programs, among them make, support what are called command-line arguments, and more on those later today. But these are literally words or numbers that you type at your prompt after the name of a program that just influence its behavior in some way; they modify its behavior. And it turns out, if you read the documentation for clang, you can actually pass a -o, for output, command-line argument that lets you specify explicitly what you want your outputted program to be called. And then you go ahead and type the name of the file that you actually want to compile from source code to machine code.

Let me go ahead and hit Enter now. Again, nothing seems to happen, and I type ls and, voila, now we still have the old a.out, because I didn't delete it yet, and I do have hello now. So ./hello, voila, runs hello, world again. And let me go ahead and remove this file. I could, of course, resort to using the Explorer on the left-hand side, which I am in the habit of closing just to give us more room to see,
but I could go ahead and right-click or Control-click on a.out if I want to get rid of it. Or, again, let me focus on the command-line interface, and I can use... anyone recall? We didn't really use it much, but what command removes a file? rm. So rm for remove: rm a.out, Enter. Remove regular file a.out? y for yes, Enter. And now if I do ls again, voila, it's gone.

All right, so let's now enhance this program to do the second version we ever did, which was to also include cs50.h so that we have access to functions like get_string and the like. And let me go ahead and do string name gets get_string, what's your name, question mark. And now let me go ahead and say hello to that name with our %s placeholder, comma, name. So this was version two of our program last time, which very easily compiled with make hello. But notice the difference now if I want to compile this thing myself with clang, using that same lesson learned. All right, let's do it: clang -o hello, just so I get a better name for the program, hello.c, Enter. And a new error pops up that some of you might have encountered on your own. So it's a bit arcane here, and there's this mention of a cryptic-looking path with temp, for temporary, there. But somehow my issue's in main, as we can see here; it somehow relates to hello.c. Even though we might not have seen this language last time in class, there's an undefined reference to get_string, as though get_string doesn't exist.

Now, your first instinct might be, well, maybe I forgot cs50.h, but of course I didn't; that's the very first line of my program. But it turns out make is doing something else for us all this time. Just putting cs50.h, or any header file, at the top of your code, for that matter, just teaches the compiler that a function will exist. It sort of asks the compiler to trust that I will eventually get around to implementing functions like get_string in cs50.h and, in stdio.h, printf therein. But this error here, some kind of linker command,
relates to the fact that there's a separate process for actually finding the zeros and ones that CS50 compiled long ago for you, or that the authors of this operating system compiled for you long ago in the form of printf. We need to somehow tell the compiler that we need to link in code that someone else wrote, the actual machine code that someone else wrote and then compiled. So to do that, you'd have to type -lcs50, for instance, at the end of the command, so additionally telling clang that not only do you want to output a file called hello and compile a file called hello.c, you also want to, quote unquote, link in a bunch of zeros and ones that collectively implement get_string and printf. So now if I hit Enter, this time it compiled OK, and now if I run ./hello, it works as it did last week, just like that.

But honestly, this is just going to get really tedious really quickly. Notice already, just to compile my code, I have to run clang -o hello hello.c -lcs50, and you're going to have to type more things too. If you wanted to use the math library, like to use that round function, you would also have to do -lm, typically, to specify: give me the math bits that someone else compiled. And the commands just get longer and longer. So moving forward, we won't have to resort to running clang itself, but clang is indeed the compiler, that is, the program that converts from source code to machine code. But we'll continue to use make because it just automates that process, and the commands are only going to get more cryptic the more sophisticated and more featureful your programs get. And make, again, is just a tool that makes all that happen, so to speak.

Let me pause there to see if there's any questions before then we take a look further under the hood. Yeah, in front. Sure, let me come back to that in a moment. What does the -lcs50 mean? We'll come back to that visually in just a moment, but it means to link in the zeros and ones that collectively implement get_string and printf. But we'll
see that visually in a sec. Yeah, behind you. Really good question: how come I didn't have to link in standard I/O, because I used printf in version one? Standard I/O is just literally so standard that it's built in; it just works for free. CS50, of course, is not. It did not come with the language C or the compiler; we ourselves wrote it. And other libraries, even though they might come with the language C, might not be enabled by default, generally for efficiency purposes, so you're not loading more zeros and ones into the computer's memory than you need to. So standard I/O is special, if you will.

Other questions? Yeah. Sorry, a little louder? Oh, what does the -o mean? So -o is shorthand for the English word output, and so -o is telling clang to please output a file called hello, because what I wrote at the command line, recall, was clang -o hello, then the name of the file, then -lcs50. And this is where these commands do get and stay fairly arcane. It's just through muscle memory and practice that you'll start to remember what the command-line arguments are that you can provide to programs. But we've seen this before: technically, when you run make hello, the program is called make, and hello is the command-line argument. It's an input to the make program, albeit typed at the prompt, that tells make what you want to make. Even when I used rm a moment ago and did rm of a.out, the command-line argument there was a.out, and it's just telling rm what to delete. It is entirely dependent on the programs to decide what their conventions are, whether you use dash this or dash that, but we'll see over time which ones actually matter in practice.

So to come back to the first question about what actually is happening there, let's consider the code more closely. So here is that first version of the code again, with stdio.h and only printf, so no CS50 stuff yet, until we add it back in and have the second version, where we actually
get the human's name. When you run this command, then, there's actually a few things that are happening underneath the hood. And we won't dwell on these kinds of details; indeed, we'll abstract it away, so to speak, by using make. But it's worth understanding, at least from the get-go, how much automation is going on, so that when you run these commands, it's not magic; you actually do have this bottom-up understanding of what's going on.

So when we say you've been compiling your code with make, that's a bit of an oversimplification. Technically, every time you compile your code, you're having the computer actually do four distinct things for you. And these are not four distinct things that you need to memorize and remember every time you run your program, but it helps to break it down into building blocks as to how we're getting from source code like C into zeros and ones. It turns out that when you compile, quote unquote, your code, technically speaking, you're doing four things automatically and all at once: pre-processing it, compiling it, assembling it, and linking it. Humans just decided, let's call the whole process compiling. But for a moment, let's consider what these steps are.

So pre-processing refers to this. If we look at our source code here, version two, that uses the CS50 library and therefore get_string, notice that we indeed have these include lines at top. And they're kind of special versus all the other code we've written, because they start with hash symbols, specifically. And that's a special syntax that means that these are technically called preprocessor directives, a fancy way of saying they're handled specially versus the rest of your code. In fact, if we focus on cs50.h, recall from last week that I provided a hint as to what's actually in cs50.h. Among other things, what was the one salient thing that I said was in cs50.h, and therefore why we were including it in the first place? So get_string, specifically the prototype for get_string.
We haven't made many of our own functions yet, but recall that any time we've made our own functions and we've written them, like, below main in a file, we've also had to, somewhat stupidly, copy-paste the prototype of the function at the top of the file, just to teach the compiler that this function doesn't exist yet (it does, down there), but it will exist, just trust me. So again, that's what these prototypes are doing for us. So therefore, in my code, if I want to use a function like get_string, or printf for that matter, they're not implemented, clearly, in the same file; they're implemented elsewhere. So I need to tell the compiler to trust me that they're implemented somewhere else. And so, technically, inside of cs50.h, which is installed somewhere in the cloud's hard drive, so to speak, that you all are accessing via VS Code, there's a line that looks like this: a prototype for the get_string function that says the name of the function is get_string; it takes one input, or argument, called prompt, and the type of that prompt is a string. get_string, not surprisingly, has a return value, and it returns a string. So literally that line and a bunch of others are in cs50.h. And so, rather than you all having to copy-paste the prototype, you can just trust that CS50 figured out what it is; you can include cs50.h, and the compiler is going to go find that prototype for you.

Same thing in standard I/O. Someone else: what must clearly be in stdio.h, among other stuff, that motivates our including stdio.h too? Yeah, printf, the prototype for printf. And indeed, I'll just change it here in yellow to be the same. And it turns out the prototype for printf is actually pretty fancy, because, as you might have noticed, printf can take one argument, just something to print; two, if you want to plug a value into it; three or more. So the dot dot dot just represents exactly that. It's not quite as simple a prototype as get_string, but more on that another time.

So what does it mean to pre-process your code? The very first thing the
compiler, clang in this case, is doing for you when it reads your code top to bottom, left to right, is it notices, ooh, here is hash include; oh, here's another hash include. And it essentially finds those files on the hard drive, cs50.h and stdio.h, and does the equivalent of copying and pasting them automatically into your code at the very top, thereby teaching the compiler that get_string and printf will eventually exist somewhere. So that's the pre-processing step, whereby, again, it's just doing a find-and-replace of anything that starts with hash include. It's plugging in the files there so that you essentially get all the prototypes you need automatically.

OK, what does it mean, then, to compile the results? Because at this point in the story, your code now looks like this in the computer's memory. It doesn't change your file; it's doing all of this in the computer's memory, or RAM, for you, but it essentially looks like this. Well, the next step is what's technically really compiling, even though, again, we use compile as an umbrella term. Compiling code in C means to take code that now looks like this in the computer's memory and turn it into something that looks like this, which is way more cryptic. But it was just a few decades ago that, if you were taking a class like CS50 in its earlier form, we wouldn't be using C if it didn't exist yet; we would actually be using this, something called assembly language. And there's different types, or flavors, of assembly language, but this is about as low-level as you can get to what a computer really understands, be it a Mac or PC or a phone, before you start getting into actual zeros and ones. And most of this is cryptic; I couldn't really tell you what this is doing unless I really thought it through carefully and rewound mentally years ago from having studied it. But let's highlight a few key words in yellow. Notice that this assembly language that the computer is outputting for you automatically still has mention of main, and it has mention of
get_string, and it has mention of printf. So there's some relationship to the C code we saw a moment ago. And then, if I highlight these other things, these are what are called computer instructions. At the end of the day, your Mac, your PC, your phone actually only understands very basic instructions like addition, subtraction, division, multiplication, move into memory, load from memory, print something to the screen: very basic operations. And that's what you're seeing here. These assembly instructions are what the computer actually feeds into the brains of the computer, the CPU, the central processing unit. And it's that Intel CPU, or whatever you have, that understands this instruction and this one and this one and this one. And collectively, long story short, all they do is print hello, world on the screen, but in a way that the machine understands how to do.

So let me pause here. Are there any questions on what we mean by pre-processing, which just finds and replaces the hash include symbols, among others, and compiling, which technically takes your source code, once preprocessed, and converts it to that stuff called assembly language? Correct, each type of CPU has its own instruction set, indeed. And as a teaser, this is why, at least back in the day when we used to install software from CD-ROMs or some other type of media, you can't take a program that was sold for a Windows computer and run it on a Mac, or vice versa, because the commands, the instructions, that those two products understand are actually different. Now, Microsoft or any company could generally write code in one language, like C or another, and they can compile it twice, saving a PC version and saving a Mac version. It's twice as much work, and sometimes you get into some incompatibilities, but that's why these steps are somewhat distinct. You can now use the same code and support even different platforms or systems, if you'd want.

All right, assembling. Thankfully, this part is fairly straightforward, at least in
concept. To assemble code, which is step three of four that is just happening, literally, for you every time you run make or, in turn, clang, this assembly language, which the computer generated automatically for you from your source code, is turned into zeros and ones. So that's the step that last week I simplified, and said when you compile your code, you convert it from source code to machine code. Technically, that happens when you assemble your code, but no one in normal conversation says that; they just say compile for all of these terms. All right, so that's assembling.

There's one final step. Even in this simple program of getting the user's name and then plugging it into printf, I'm using three different people's code, if you will: my own, which is in hello.c; some of CS50's, which is in cs50.c, which is not a file I've mentioned yet, but it stands to reason that if there's a cs50.h that has prototypes, it turns out the actual implementation of get_string and other things are in cs50.c; and there's a third file somewhere on the hard drive, so to speak, that's involved in compiling even this simple program. hello.c, cs50.c, and by that logic, what might the other be? Yeah, stdio.c. And that's a bit of a white lie, because that's such a big, fancy library that there's actually multiple files that compose it, but the same idea, and we'll take the simplification.

So when I have this code now and I compile my code, I get those zeros and ones that end up taking hello.c and turning it effectively into zeros and ones that are combined with cs50.c, followed by stdio.c as well. So let me rewind here. Here might be the zeros and ones for my code, the two lines of code, essentially, that I wrote. Here might be the zeros and ones for what CS50 wrote some years ago in cs50.c. Here might be the zeros and ones that someone wrote for standard I/O decades ago. The last and final step is that linking command that links all of these zeros and ones together,
essentially stitches them together into one single file called hello, or called a.out, whatever you name it. That last step is what combines all of these different programmers' zeros and ones. And my God, now we're really in the weeds. Who wants to even think about running code at this level? You shouldn't need to, but it's not magic. When you're running make, there's just some very concrete steps happening that humans have developed over the years, over the decades, that break down this big problem of source code going to zeros and ones, or machine code, into these very specific steps. But henceforth, you can call all of this compiling.

Questions or confusion? Yeah. Sure, what does a.out signify? a.out is just the conventional default file name for any program that you compile directly with a compiler like clang. It's kind of a meaningless name, though it stands for assembler output, and assembler might now sound familiar from this assembling process. It's just kind of a lame name for a computer program, and so we can override it by outputting something like hello instead. Say that again? Does it use all of them? So, to recap, there are other prototypes in those files, cs50.h and stdio.h. Technically, they're all included on top of your file, even though you, strictly speaking, don't need most of them, but indeed they are there just in case you might want them. And finally, any other questions? Yeah. Does it matter what order we're telling the computer to run? Sometimes with libraries, yes, it matters what order they are linked in together, but for our purposes it's really not going to matter; make is going to take care of automating that process for us.

All right, so with that said, henceforth compiling technically is these four things, but we'll focus on it just as a higher-level concept, an abstraction, if you will, known as compiling itself.

So, another process that we'll now begin to focus on all the more this week, because invariably this past
week you ran up against some challenges. You probably created your very first bugs, or mistakes, in a program, and so let's focus for a moment on actual techniques for debugging. As you spend more time this semester and in the years to come, if you continue to program, you're never, frankly, probably going to write bug-free code. Ultimately, your programs are just going to get more featureful, more sophisticated, and we're all just going to start to make more sophisticated mistakes. And to this day, I write buggy code all the time, and I'm always horrified when I do it up here, but hopefully that won't happen too often. But when it does, it's just a process now of debugging: trying to find the mistakes in your program. And you don't have to just stare at your code or shake your fists at your code; there are actual tools that real-world programmers use to help debug their code and find these faults. So what are some of the techniques and tools that folks use?

Well, as an aside, a bug in a program is a mistake, and the term has actually been around for some time. If you've ever heard this tale: some 70-plus years ago, in 1947, this is actually an entry in a logbook written by a famous computer scientist named Grace Hopper, who happened to be the one to record the very first discovery of a, quote-unquote, actual bug in a computer. This is actually a moth that had flown into what was, at the time, a very sophisticated system known as the Harvard Mark II computer, very large, refrigerator-sized systems, in which an actual bug caused an issue. The etymology of bug, though, actually predates this particular instance, but here you have, as any computer scientist might know, the example of a first physical bug in a computer.

How, though, do you go about removing such a thing? Well, let's consider a very simple scenario from last time, for instance, when we were trying to print out various aspects of Mario, like this column of three bricks. Let's consider how I might go about
implementing a program like this. Let me switch back over to VS Code here, and I'm going to go ahead and write a program. And I'm not going to trust myself, so I'm going to call it buggy.c from the get-go, knowing that I'm going to mess something up. But I'm going to go ahead and include stdio.h, and I'm going to go ahead and define main as usual, so hopefully no mistakes just yet. And now I want to print those three bricks on the screen using just hashes for bricks. So how about for int i gets 0, i less than or equal to 3, i plus plus. Now inside of my curly braces, I'm going to go ahead and print out a hash followed by a backslash n, semicolon. All right, saving the file, doing make buggy, Enter. It compiles, so there's no syntactical errors; my code is syntactically correct. But some of you have probably seen the logical error already, because when I run this program, I don't get this picture, which was three bricks high; I seem to have four bricks instead.

Now, this might be jumping out at you why it's happening, but I've kept the program simple just so that we don't have to actually find an actual bug; we can use a tool to find one that we already know about. In this case, what might be the first strategy for actually finding a bug like this, rather than just staring at your code, asking a question, trying to just think through the problem? Well, let's actually try to diagnose the problem more proactively. And the simplest way to do this, now and years from now, is honestly going to be to use a function like printf. printf is a wonderfully useful function, not just for printing formatted strings and all that, but for looking inside the values of variables that you might be curious about, to see what's going on. So you know what, let me do this. I see that there's four coming out, but I intended three, so clearly something's wrong with my i variable. So let me just be a little more pedantic. Let me go inside of this loop and just temporarily say something explicit,
like i is %i, backslash n, and then just plug in the value of i. Right, this is not the program I want to write; it's the program I'm temporarily writing, because now I'm going to go ahead and say make buggy, ./buggy. And if I look now at the output, I have some helpful diagnostic information: i is 0 and I get a hash, i is 1 and I get a hash, 2 and I get a hash, 3 and I get a hash. OK, wait a minute, I'm clearly going too many steps, because maybe I forgot that computers are essentially counting from zero, and now, oh, it's less than or equal to. Now you see it, right? Again, trivial example, but just by using printf, you can see inside of the computer's memory by printing stuff out like this. And now, once you've figured it out (oh, so this should probably be less than 3, or I should start counting from 1), there's any number of ways I could fix this, but the most conventional is probably just to say less than 3. Now I can go ahead and delete my temporary print statement, rerun make buggy, ./buggy, and voila, problem solved. And to this day, I do this. Whether it's making a command-line application or a web application or a mobile application, it's very common to use printf, or some equivalent in any language, just to poke around and see what's inside the computer's memory.

Thankfully, there's more sophisticated tools than this. Let me go ahead and reintroduce the bug here, and let me go ahead and reopen my sidebar at left here. And let me go ahead now and recompile the code to make sure it's current, and I'm going to run a command called debug50, which is a command that's representative of a type of program known as a debugger. And this debugger is actually built into VS Code, and all debug50 is doing for us is automating the process of starting VS Code's built-in debugger. So this isn't even a CS50-specific tool; we've just given you a debug50 command to make it easier to start it up from the get-go. And the way you run this debugger is you say
debug50, space, and then the name of the program that you want to debug, so in this case ./buggy. So you don't mention your C file; you mention your already compiled code. And what this debugger is going to let me do is, most powerfully, walk through my code step by step. Because every program we've written thus far just kind of runs from start to finish, even if I'm not done thinking through each step at a time. With a debugger, I can actually click on a line number and say, pause execution here, and the debugger will let me walk through my code one step at a time, one second at a time, one minute at a time, at my own human pace, which is super compelling when the programs get more complicated and they might otherwise just fly by on the screen.

So I'm going to click to the left of line 5, and notice that these little red dots appear, and if I click on one, it stays and gets even redder. And I'm going to now run debug50 on ./buggy, and in just a moment you'll see that a new panel opens on the left-hand side. It's doing some configuration of the screen. And now let me go ahead and zoom out just a little bit here so we can see more on the screen at once. And sometimes you'll see in VS Code that the debug console opens up, which looks very cryptic; just go back to the terminal window if that happens, because the terminal window is where you can still interact with your code.

And let's now take a look at what's going on. If I zoom in on my buggy.c code here, you'll notice that we have the same program as before, but highlighted in yellow is line 5. Not a coincidence: that's the line I set a so-called breakpoint at. The little red dot means break here, pause execution here, and the yellow line has not yet been executed. But if I now, at the top of my screen, notice these little arrows, there's one for play; there's one for this, which, if I hover over it, says step over; there's another that's going to say step into; there's a third that says step out. I'm just going to use the
first of these, step over. And I'm going to do this, and you'll see that the yellow highlight moved from line 5 to line 7, because now it's ready to print, but hasn't yet printed, that hash. But the most powerful thing here, notice, is at top left. It's a little cryptic, because there's a bunch of things going on that'll make more sense over time, but at the top there's a section called Variables, and below that something called Locals, which means local to my current function, main. And notice there's my variable called i, and its current value is 0. So now, once I click step over again, watch what happens. We go from line 7 back to line 5, but look in the terminal window: one of the hashes has printed, but now it's printed at my own pace; I can think through this step by step. Notice that i has not changed yet; it's still 0, because the yellow highlighted line hasn't yet executed. But the moment I click step over, it's going to execute line 5. Now notice at top left, i has become 1, and nothing has printed yet, because now highlighted is line 7. And so if I click step over again, we'll see the hash. And if I repeat this process at my own comfortable human pace, I can see my variables changing, I can see output changing on the screen, and I can just think about, should that have just happened? And I can pause and give thought to what's actually going on without trying to race the computer and figure it all out at once.

I'm going to go ahead and just stop here, because we already know what this particular problem is, and that just brings me back to my default terminal window. But this debugger (let me disable the breakpoint now so it doesn't keep breaking), this debugger will be your friend moving forward, in order to step through your code step by step, at your own pace, to figure out where something has gone wrong. printf is great, but it gets annoying if you have to constantly add print this, print this, print this, print this, recompile, rerun it, oh wait a minute, print this,
print this like the debugger just lets you do the equivalent but automatically questions on then this debugger which you'll see all the more hands-on over time questions on debugger yeah really good question we'll see this before long but those other buttons that i glossed over like step into and step out of actually let you step into specific functions if i had any more than main so if main called a function called something and something called a function called something else instead of just stepping over the entire execution of that function i could step into it and walk through its lines of code one by one so anytime you have a problem set you're working on that has multiple functions you can set a breakpoint in main if you want or you can set it inside of one of your additional functions to focus your attention only on that and we'll see examples of that over time all right so what else and what's the um you know the sort of elephant in the room so to speak is actually a duck in this case why is there this duck in all these ducks here well turns out a third genuinely recommended debugging technique is talking through problems talking through code with someone else now in the absence of having a family member or a friend or a roommate who actually wants to hear you talk about code of all things um generally programmers turn to a rubber duck or other inanimate objects if something animate is not available and the idea behind rubber duck debugging so to speak is that simply by looking at your code and talking it through okay on line three i'm i'm starting a for loop and i'm initializing i to zero okay then i'm printing out a hash just by talking through your code step by step invariably finds you having the proverbial light bulb go off over your head because you realize wait a minute i just said something stupid or i just said something wrong and this is really just a proxy for any other human teaching fellow teacher friend colleague but in the absence of any of 
those people in the room you're welcome to take on your way out today one of these little rubber ducks and consider using it for real any time you just want to talk through one of your problems in cs50 or maybe life more generally but having it there on your desk is just a way to help you hear illogic in what you think might otherwise be logical code so printf debug50 and rubber duck debugging are just three of the ways you'll see over time to sort of get to the source of code that you will write that has mistakes which is gonna happen but it'll empower you all the more to solve those mistakes are there any questions on debugging in general or these three techniques sorry what's the difference between what and what what's the difference between step over and step into at the moment the only one that's applicable to the code i just wrote is step over because it means step over each line of code if though i had other functions that i had written in this program maybe lower down in the file i could step into those function calls and walk through them one at a time so we'll come back to this with an actual example but step into will allow me to do exactly that in fact this is a perfect segue to doing a little something like this let me go ahead and open up maybe another file here actually we'll use the same buggy and we're just going to write one other thing that's buggy as well let me go ahead up here and include as before cs50.h let me include stdio.h let me do int main void so all of this i think is correct so far and let's do this let's give myself an int called i and let's ask the user for a negative integer this is not a function that exists technically yet but i'm going to assume for the sake of discussion that it does and then i'm just going to go ahead and print out with percent i and a new line whatever the human typed in so at this point in the story my program i think is correct except for the fact that get negative int is not a function in the
cs50 library or anywhere else i'm going to need to invent it myself so suppose in this case that i declare a function called get negative int its return type so to speak should be int because as its name suggests i want to hand the user back an integer and it's going to take no input to keep it simple so i'm just going to say void there no inputs no special prompts nothing like that let me now give myself some curly braces and let me do something familiar perhaps now from problem set one let me give myself a variable like n and let me do the following within this block of code assign n the value of getint asking the user for a negative integer using getint's own prompt and i want to do this while n is less than zero because i want to get a negative int from the user and recall from having used this block in the past i can now return n as the very last step to hand back whatever the user has typed in so long as they cooperated and gave me an actual negative integer now i've deliberately made a mistake here and it's a subtle sort of silly mathematical one but let me compile this program after copying now the prototype up to the top just so i don't make that mistake again let me do make buggy enter and now let me do dot slash buggy i'll give it a negative integer like negative 50 uh huh that did not take how about negative five no how about zero huh all right so it's clearly sort of working backwards or incorrectly here logically so how could i go about debugging this well i could do what i've done before i could use my printf technique and say something explicit like n is percent i new line comma n just to print it out let me recompile buggy let me rerun buggy let me type in negative 50 okay n is negative 50.
so that didn't really help me at this point because that's the same as before so let me do this debug50 dot slash buggy oh but i've made a mistake so i didn't set my breakpoint yet so let me go ahead and do this and i'll set a breakpoint this time i could set it here on line eight let's do it in main as before let me rerun debug50 now on dot slash buggy that fancy user interface is going to pop up it's going to highlight the line that i set the breakpoint on notice that on the left hand side of the screen i is defaulting at the moment to zero because i haven't typed anything in yet but let me go ahead now and step over this line that's highlighted in yellow and you'll see that i'm being prompted so let's type in my negative 50 enter all right and notice now that i'm stuck in that function so all right so clearly the issue seems to be in my get negative int function so okay let me go ahead and stop this execution my problem doesn't seem to be in main per se maybe it's down here so that's fine let me set my same breakpoint at line eight let me rerun debug50 one more time but this time instead of just stepping over that line let's step into it so notice line 8 is again highlighted in yellow in the past i've been clicking step over let's click step into now and when i click step into boom now the debugger jumps into that specific function and now i can step through these lines of code again and again i can see what the value of n is as i'm typing it in i can think through my logic and voila hopefully once i've solved the issue i can exit the debugger fix my code and move on so step over just goes over the line but executes it step into lets you go into other functions you've written all right any questions then on this yes let's switch over to carter and our classmates online carter i don't have a question about you know when you add these breakpoints does that change the underlying file or is do people actually looking somewhere else for these break
points as well good question so if you change a breakpoint a breakpoint is something that's useful only in the context of the debugger not actually changing your code or making any permanent changes to it it's just a clue to the debugger as to where you want to stop execution and step through more methodically did i answer that correctly carter all right it's back to us here in sanders all right so let's go ahead and do this we've got a bunch of possible approaches that we can take to solving some problem let's go ahead and pace ourselves today though let's take a five minute break here and when we come back we'll actually take a look at that computer's memory we've been talking about see you in five all right so let's dive back in and up until now both by way of week one and problem set one for the most part we've just translated from scratch into c all of these basic building blocks like loops and conditionals boolean expressions variables so sort of more of the same but there are features in c that we've already stumbled across like data types the types of variables that don't exist in scratch but that in fact do exist in other languages in fact a few that we'll see before long so to summarize the types we saw last week just recall this little list here we had ints and floats and longs and doubles and chars there's also bools and also string which we've seen a few times but today let's actually start to formalize what these things are and actually like what your mac and pc are doing when you manipulate bits as an int versus a char versus a string versus something else and see if we can't put more tools into your toolkit so to speak so we can start quickly writing more featureful more sophisticated programs in c so it turns out that on most systems nowadays though this can vary by actual computer this is how large each of the data types typically is in c when you store a boolean value a zero or a one a true or a false it actually
uses one byte that's actually a little excessive because strictly speaking you only need one bit which is one eighth of this size but for simplicity computers use a whole byte to represent a bool true or false a char we saw last week is actually only one byte or eight bits and this is why ascii which uses one byte or technically only seven bits early on was confined to only 256 maximally possible characters notice that an int is 4 bytes or 32 bits a float is also 4 bytes or 32 bits but the thing that we called long is literally twice as long 8 bytes or 64 bits and so is a double a double is 64 bits of precision for floating point values and a string for today we're going to leave as a question mark because we'll come back to that later today and next week as to how much space a string takes up but suffice it to say it's going to take up a variable amount of space depending on whether the string is short or long but we'll see exactly what that means before long so here's a photograph of a typical piece of memory inside of your mac or pc or phone and odds are it might be just a little smaller in some devices this is known as ram or random access memory and each of these little black chips on this circuit board the green thing these little black chips are where zeros and ones are actually stored each of those stores some number of bytes maybe megabytes maybe even gigabytes nowadays so let's actually focus on just one of those chips just to give us a sort of zoomed in version thereof and let's consider the fact that even though we don't have to care exactly how this kind of thing is made if this is like one gigabyte of memory for the sake of discussion it stands to reason that if this thing is storing one billion bytes one gigabyte then we can number them kind of arbitrarily like maybe this will be byte zero one two three four five six seven eight and then maybe way down here in the bottom right corner is byte number one billion right we can just number
these things as might be our convention so let's actually draw that graphically not with a billion squares but fewer than those and let's just zoom in further and consider that all right at this point in the story let's abstract away all the hardware and all the little wires and just think of data as taking up some number of bytes so for instance if you were to store a char in a computer's memory which was one byte it might be stored literally at this like top left-hand location of this black chip of memory if you were to store something like an integer that uses four bytes well it might use four of those bytes but they're going to be contiguous back to back to back in this case if you were to store a long or a double you might actually need eight bytes so i'm just kind of filling in these squares to represent how much memory a given variable of some data type would take up one or four or eight in this case here well from here let's go ahead and just abstract away from all of the hardware and just really focus on memory as being a grid or really like a canvas that we can paint any types of data onto that we want at the end of the day all of this data is just going to be zeros and ones but it's up to you and i to sort of build abstractions on top of that things like actual numbers and colors and images and movies and beyond but we'll start lower level here first suppose i had a program that needs three integers like a simple program whose purpose in life is to like average your three scores on an exam or some such thing and suppose that your three scores were these 72 and 73 not too bad and 33 which is particularly low let's go ahead and write a program that actually does this kind of averaging for us let me go back to vs code here let me open up a file called scores.c and let me go ahead and implement this as follows let me include stdio.h at the top int main void as before and then inside of main let me go
ahead and declare score 1 which is 72 give me another score 73 and then a third score called score 3 which is going to be 33 and now i'm just going to use printf to print out the average of those things and i can do this in a few different ways but i'm going to just print out percent f and i'm going to do score 1 plus score 2 plus score 3 divided by 3 closed parenthesis semicolon so just some relatively simple arithmetic just to compute the average of three scores if i'm curious like what my average grade is in the class with these three assessments all right let me go ahead now and do make scores huh all right so i've somehow made an error already but this one is actually germane to a problem we hopefully won't encounter too frequently what's going on here so underlined is score one plus score two plus score three divided by three format specifies type double but the argument has type int well what's going on here because the arithmetic seems to check out yeah correct and we'll come back to this in more detail but indeed what's happening here is i'm adding three ints together obviously because i defined them right up here and i'm dividing by another int three but the catch is recall that c when it performs math treats all of these things as integers but integers are not floating point values so if you actually want to get a precise average for your score without throwing away the remainder everything after the decimal point it turns out in this case we're going to have to convert this whole expression somehow to a float and there's a few ways to do this but the easiest way for now i'm just going to go ahead and do this up here i'm going to change the divide by 3 to divide by 3.0 because it turns out long story short in c so long as one of the values participating in an arithmetic expression like this is something like a float the rest will be treated as promoted to so to speak a floating point value as
well so let me now recompile this code with make scores enter this time it worked okay because i'm treating a float as a float and let me do dot slash scores enter all right my average is 59.33333 and so forth alright so the math presumably checks out floating point imprecision per last week aside but let's consider the design of this program like what is kind of bad about it or if we maintain this program longer term are we going to regret the design of this program what might not be ideal here yeah sorry can you say a little louder yeah so in this case i have hard-coded my three scores so if i'm hearing you correctly this program is only ever going to tell me this specific average i'm not even using something like get int or get float to get three different scores so that's not good and suppose that we wait until later in the semester i think other problems could arise yeah i can't reuse the number because i haven't stored the average in some variable which in this program not a big deal but certainly if i wanted to reuse it elsewhere that's a problem and let's fast forward again a little later in the semester i don't just have three test scores or exam scores maybe i have four or five or six where might this take us yeah i've sort of capped this program at three and honestly this is just kind of bordering on copy paste even though the variables yes have different names score one score two score three imagine doing this for like a whole grade book for a class having score four five six eleven ten twelve twenty thirty like that's a lot of variables and you can imagine just how ugly the code starts to get if you're just defining variable after variable after variable so it turns out there are better ways in languages like c if you actually want to have multiple values stored in memory that happen to be of the same data type and so let's take a look back at this memory here to see what these things might look like in memory here's that grid of memory and each of these
recall represents a byte so just to be clear if i store score one in memory first how many bytes will it take up so four aka 32 bits so i might draw score one as filling up this part of the memory it's really up to the computer as to whether it goes here or down there or wherever i'm just keeping the pictures clean though for today from the top left on down if i then declare another variable called score two it might end up over there also taking up four bytes and then score three might end up here and so that's just representing what's going on inside of the computer's memory but technically speaking to be clear per week zero what's really being stored in the computer's memory are patterns of zeros and ones 32 total in this case because 32 bits is four bytes but again it sort of gets boring quickly to think at think in and look at binary all the time so we'll generally abstract this away as just using decimal numbers in this case instead but there might be a better way to store not just three of these things but maybe four maybe five maybe ten maybe more by declaring one variable to store all of them instead of three or four or five or more individual variables and the way to do this is by way of something generally known as an array an array is another type of data that allows you to store multiple values of the same type back to back to back that is to say contiguously so an array can let you create memory for one int or two or three or even more than that but describe them all using the same variable name the same one name so for instance if for one program i only need three integers but i don't want to sort of uh messily declare them as score one score two score three i can actually do this instead and this is today's first new piece of syntax the square brackets that we're now seeing this line of code here is similar to int score one semicolon or int score one equals 72 semicolon this line of code is declaring for me so to speak an array of size three and 
that array is going to store three integers why because the type of that array is an int here the square brackets tell the computer how many ints you want in this case three and the name is of course scores which in english i've just deliberately pluralized now so that i can describe this array as storing multiple scores indeed so if i want to now assign values to this variable called scores i can do code like this i can say scores bracket 0 equals 72 scores bracket 1 equals 73 and scores bracket 2 equals 33 the only thing weird there is admittedly the square brackets which are still new but also notice we're zero indexing things to zero index means to start counting at zero and we've talked about that before our for loops have generally been zero indexed arrays in c are zero indexed and you do not have choice over that you can't just start counting at one in arrays just because you prefer to you'd be sacrificing one of the elements you have to start in arrays counting from zero so out of context this doesn't necessarily solve a problem but it definitely is going to once we have more than even three scores here in fact let me go ahead and change this program a little bit let me go back to vs code here and let me go ahead and delete these three lines here and let me replace it with a scores variable that's ready to store three total integers and then let me go ahead and initialize them as follows scores bracket 0 is 72 as before scores bracket 1 is going to be 73 scores bracket 2 is going to be 33.
notice i do not need to say int before any of these lines because that's been taken care of already for me on line five where i already specified that everything in this array is going to be an int now down here this code needs to change because i no longer have three variables score one two and three i have one variable but that i can index into i'm going to here then do scores bracket 0 plus scores bracket 1 plus scores bracket 2 which is equivalent to what i did earlier giving me back those three integers but notice i'm using the same variable name every time and again i'm using this new square bracket notation to quote unquote index into the array to get at the first int the second int and the third and then to do it again down here now this program's still not really solving all the problems we described like i still can only store three scores but we'll come back to something like that before long but for now we're just introducing a new syntax and a new feature whereby i can now store multiple values in the same variable well let's enhance this a bit more instead of hard coding these scores as was identified as a problem no we don't want to ask siri let's go ahead and use getint to ask the user for a score let's then use getint to ask the user for another score let's use getint to ask the user for a third score storing them in those respective locations and now if i go ahead and save this program recompile scores huh i've messed up here but now these errors should be getting a little familiar what mistake did i make let me give folks a moment cs50.h so that was not intentional so still making mistakes all these years later i need to include cs50.h now i'm gonna go back to the bottom in the terminal window make scores okay we're back in business dot slash scores now the program's getting a little more interesting so maybe this year was better and i got 100 and a 99 and a 98 and there my average is 99.000 so now it's a little more dynamic so it's a little more 
interesting but it's still capping the number of scores at 3 admittedly but now i've kind of introduced another sort of symptom of bad programming there's this expression in programming too called code smell where like something smells a little off and there's something off here in that i could do better with this code here does anyone see an opportunity to improve the design of this code here if my goal still is just to get three scores from the user but without it like smelling kind of bad yeah yeah exactly those lines of code are almost identical and honestly the only thing that's changing is the number and it's just incrementing by one we have all of the building blocks to do this better so let me go ahead and improve this let me go ahead and delete that code let me go ahead now and have a for loop so for int i gets 0 i less than 3 i plus plus then inside of this for loop i can distill all three of those lines into something more generic like scores bracket i equals get int and now ask the user just once via getint for a score so this is where arrays start to get pretty powerful you don't have to hard code that is literally type in all of these magic numbers like zero one and two you can start to do it programmatically as you propose with a loop so now i've kind of tightened things up i'm now dynamically getting three different scores but putting them in three different locations and so this program ultimately is going to work pretty much the same make scores dot slash scores and 100 99 98 and we're back to the same answer but it's a little better design too if i really want to nitpick there's something that still smells a little bit here the fact that i have indeed this magic number three here that really kind of has to be the same as this number here otherwise who knows what's going to go wrong so what might be a solution per last week to kind of cleaning that code up further too okay so we could leave it up to the user's discretion and so we could actually do
something like this let me take this a few steps ahead let me say something like int n gets get int how many scores question mark then i could actually change this to an n and then this to an n and indeed make the whole program dynamic ask the human how many tests have there been this semester then you can type in each of those scores because the loop is going to iterate that many times and then you'll get the average of one test two tests three tests or however many scores were actually specified by the user yeah question how many bytes are used in an array so the purpose of an array is not to save space it's to eliminate having multiple variable names because that just gets very messy quickly if you literally have score one score two score three dot dot dot score 99 that's literally like 99 different variables potentially that you could actually collapse into one variable that has 99 locations if you will at different indices or indexes as someone would say the index for an array is whatever's in the square brackets so it's a good question so i'm using ints for everything and honestly we don't really need ints for scores because i'm not really likely to get a 2 billion on a test anytime soon and so you could actually use different data types and that list we had on the screen earlier is not all of them there's actually a data type called short which is literally shorter than an int you could actually technically use char in some form or even other data types as well generally speaking in the year 2021 these tend to be overly optimized decisions like everyone just uses ints even though no one's going to get a test score that's 2 billion or more because int is just kind of the go-to years ago memory was expensive and every one of your instincts would have been spot on because memory was so tight but nowadays we don't worry as much about it so what is the difference between dividing two ints and not getting an error as you might
have encountered in a program like cash versus dividing two ints and getting an error like i did a moment ago the problem with the scenario i created a moment ago was printf was involved and i was telling printf to use a percent f but i was giving printf the result of dividing integers by another integer so it was printf that was yelling at me and i'm guessing in the scenario you're describing for something like cash printf was not involved in that particular line of code so that's the difference there all right so we now have this ability to create an array and an array can store multiple values what then might we do that's more interesting than just storing numbers in memory well let's take this one step further as opposed to just storing 72 73 33 or 100 99 98 at these given locations because again an array gives you one variable name but multiple locations or indices therein bracket 0 bracket 1 bracket 2 on up if it were even bigger than that let's now start to consider something more modest like simple chars chars being one byte each so they're even smaller they take up much less space and indeed if i wanted to say a message like hi i could use three variables if i wanted a program to print hi h i exclamation point literally i could of course store those in three variables like c1 c2 c3 and let's just for the sake of discussion go ahead and whip this up real quickly let me create a new program here now in vs code this time i'm going to call it hi.c and i'm not going to bother with the cs50 library here i just need the standard i o one for now int main void and then inside of main i'm gonna simply create three variables and this is already hopefully striking you as a bad idea but we'll go down this road temporarily with c1 and c2 and finally c3 storing each character in the phrase i want to print and i'm going to go ahead now and print this in a different way than usual now i'm dealing with chars and we've generally dealt with strings which
was easier certainly last week but percent c percent c percent c will let me print out three chars like c1 c2 and c3 so kind of a stupid way of printing out a string so we already have a solution to this problem last week but let's just poke around at what's actually going on underneath the hood here so let's make hi dot slash hi and voila no surprise but we again could have done this last week with a string and just one variable or even zero at that but let's go ahead now and start converting these characters to their apparent numeric equivalents like we talked about in week zero too let me go ahead and modify these percent c's just to be fun to be percent i's and let me just add some spaces so that there are gaps between each of them let me now recompile hi and let me rerun it and just to guess what should i see on the screen now any guesses yeah the ascii values and it's intentional that i keep using the same word hi because it should be hopefully the old friends 72 73 and 33 which is to say that c knows about ascii or equivalently unicode and can do this conversion for us automatically and it seems to be doing it implicitly for us so to speak notice that c1 c2 and c3 are obviously chars but printf is able to tolerate printing them as integers if i really wanted to be pedantic i could use this technique again known as typecasting where i can actually convert one data type to another if it makes logical sense to do so and at the end of the day we saw in week zero chars or characters are just numbers like 72 73 and 33 so i can actually use this parenthetical expression to convert incorrectly three chars to three integers instead so that's what i meant to type the first time there we go strike two today so parenthesis int close parenthesis just says take whatever variable comes after this c1 or c2 or c3 and convert it to an int the effect is going to be no different here make hi and then running dot slash hi still works the same
but now i'm explicitly converting chars to ints and we can do this all day long chars to ints floats to ints ints to floats sometimes it's equivalent other times you're going to lose information taking a float to an int just intuitively is going to throw away everything after the decimal point because after all an int has no decimal point but for now i'm going to go ahead and rewind to the version of this that just did implicit type conversion or implicit casting just to demonstrate that we can indeed see the values underneath the hood all right let me go ahead and do this now the week one way this was kind of stupid let's just do printf quote unquote actually let's do this string s equals quote unquote hi and then let's go ahead and do a simple printf with percent s printing out s itself so now i've rewound to last week where we began this story but you'll notice that if we keep playing around with this whoops oh what did i do here oh and let me introduce the cs50 library here more on that before long let me go ahead and recompile re-run this we seem to be coding in circles here like i've just done the same thing multiple different ways but there's clearly an equivalence then between sequences of chars and strings and if you do it the real pedantic way you have like three different variables c1 c2 c3 representing hi exclamation point or you can just treat them all together like this h i exclamation point but it turns out that strings are actually implemented by the computer in a pretty now familiar way what might a string actually be as of this point in the story where are we going with this let me try to look farther back yeah and way back yeah a string might be and indeed is just an array of characters so last week we just took for granted that strings exist technically strings exist but they're implemented as arrays of characters which actually opens up some interesting possibilities for us because let me see let me see if i can do this let me try to
print out now three integers again but if string s is but an array as you propose maybe i can do s bracket zero s bracket one and s bracket two so maybe i can start poking around inside of strings even though we didn't do this last week so i can get at those individual values so make hi dot slash hi and voila there we go again it's the same 72 73 33 but now i'm sort of hopefully like wrapping my mind around the fact that all right a string is just an array of characters and arrays you can index into them using this new square bracket notation so i can get at any one of these individual characters and heck convert it to an integer like we did in week 0 as i might but let me get a little curious now too what else might be in the computer's memory well let's toggle back to the depiction of these same things here might be how we originally implemented hi with three variables c1 c2 c3 of course that mapped to these decimal digits or equivalently these binary values but what was this looking like in memory literally when you create a string in memory like this string s equals quote unquote hi let's consider what's going on underneath the hood so to speak well as an abstraction a string it's h i exclamation point taking up it would seem three bytes right i've gotten rid of the bars there because if you think of a string as a type i'm just gonna use one big box of size three but technically a string we've just revealed is an array and the array is of size three so technically if the string is called s s bracket zero will give you the first character s bracket one the second and s bracket two the third but let me ask this question now if this at the end of the day is the only thing in your computer memory and the ability like a canvas to draw zeros and ones or numbers or characters or whatever on it but that's it like this is what your mac and pc and phone ultimately reduce to suppose that i'm running a piece of software like a text messenger and 
now i write down bye exclamation point well where might that go in memory well it might go here b-y-e and then the next thing i type might go here here here and so forth my memory just might get filled up over time with things that you or someone else are typing but then how does the computer know if potentially b y e exclamation point is right after h i exclamation point where one string ends and the next one begins right all we have are bytes or zeros and ones so if you were designing this how would you implement some kind of delimiter between the two or figure out what the length of a string is what do you think okay so the right answer is use a null character and for those who don't know what does that mean yeah so it's a special character let me describe it as a sentinel character humans decided some time ago that you know what if we want to delineate where one string ends and where the next one begins we just need some special symbol and the symbol they'll use is generally written as backslash zero this is just shorthand notation for literally eight zero bits zero zero zero zero zero zero zero zero and the nickname for eight zero bits in this context is null n-u-l so to speak and we can actually see this as follows if you look at the corresponding decimal digits like you could do by just doing out the math or doing the conversion like we've done in code you would see for storing hi 72 73 33 but then one extra byte that's sort of invisibly there but that is all zeros and now i've just written it as the decimal number zero the implication of this is that the computer is apparently using not three bytes to store a word like hi but four bytes whatever the length of the string is plus one for this special sentinel value that demarcates the end of the string so we might draw it like this instead and this character is again sort of pronounced null or written n-u-l so that's all right if humans at the end of the day just have this canvas of memory they just needed 
to decide all right well how do we distinguish one string from another because it's a lot easier with chars individually it's a lot easier with ints it's even easier with floats why because per that chart earlier every character is always one byte every int is always four bytes every long is always eight bytes how long is a string well hi is one two three with an exclamation point bye is one two three four with an exclamation point david is d-a-v-i-d five without an exclamation point and so a string can be any number of bytes long so you somehow need to draw a line in the sand to separate in memory one string from another what's the implication of this well let me go back to code here let's actually poke around this is a bit dangerous but i'm going to start looking at memory locations past my string here so let me go ahead and recompile uh make hi whoops what did i do here oh i forgot a format code let me add one more percent i now let me go ahead and rerun make hi dot slash hi enter there it is so you can actually see in the computer unbeknownst to you previously that there's indeed something else going on there and if i were to make like one other variant of this program let's get rid of just this one word and maybe let's have two so let me give myself another string called t for instance just this common convention with bye exclamation point let me then go ahead and print out with percent s s and let me also print out with percent s whoops printf uh print out t as well let me just recompile this program and obviously the out this is what happens when i go too fast all right third mistake today close quote as i was missing make hi fourth mistake today make hi dot slash hi okay voila now we have a program that's printing both hi and bye only so that we can consider what's going on in the computer's memory if s is storing hi and apparently one bonus byte that demarcates the end of that string bye is apparently going to fit into the location directly after 
and it's wrapping around but that's just an artist's rendition here but bye exclamation point is taking up one two three four plus a fifth byte as well all right any questions on this underlying representation of strings and we'll contextualize this before long so that this isn't just like okay who really cares this is going to be the source of actually implementing things in fact for problem set two like cryptography and encryption and actually scrambling actual human messages but some questions first a good question too and let me summarize as if we were instead to use chars all the time we would indeed have to know in advance how many chars you want for a given string that you're storing how then does something like get string work because when cs50 wrote the get string function we obviously don't know how long the words are going to be that you all are typing in it turns out uh two weeks from now we'll see that get string uses a technique known as dynamic memory allocation and it's gonna grow or shrink the array uh automatically for you but more on that soon other questions good question why are we using a null value isn't it wasting a byte yes but i claim there's really no other way to distinguish the end of one string from the start of another unless we make some sort of mark uh notation so to speak in memory all we have at the end of the day inside of a computer are bits therefore all we can do is spend those bits in some creative way to solve this problem and so we're minimally going to spend one byte to solve this problem here how does the computer know to move to a next line when you have a backslash n so backslash n even though it looks like two characters it's actually stored as just one byte in the computer's memory there's a mapping between it and an actual number and you can see that for instance on the ascii chart from the other day it would be if i had put a backslash n in my code here right after the exclamation point here 
and here that would actually shift everything in memory because we would need to make room for a backslash n here and another one over here so it would take two more bytes exactly other questions and what's the last thing you said it's context sensitive so if at the end of the day all we're storing is these numbers like 72 73 33 recall that it's up to the program to decide based on context how to interpret them and i simplified this story in week 0 saying that photoshop interprets them as rgb colors and imessage or a text messaging program interprets them as letters and excel interprets them as numbers how those programs do it is by way of variables like string and int and float and in fact later this semester we'll see a data type via which you can represent a color as a triple of numbers and red value a green value and a blue value so we'll see other data types as well yeah really interesting question why could we not just make all data types variable in size and some languages some libraries do exactly this um c is an older language and so because memory was expensive memory was limited the reality was you gain benefits from just standardizing the size of these things you also get performance increases in the sense that if you know every int is four bytes you can very quickly and we'll see this next week jump from integer to another to another in memory just by adding four inside of those square brackets you can very quickly poke around whereas if you had variable length numbers you would have to kind of follow follow follow looking for the end of it follow follow you would have to look at more locations in memory so that's a topic we'll come back to but it was generally for efficiency and other questions yeah good question why not store the okay same one why not store the null character at the beginning uh you could i uh let's see why not store it at the beginning you could do that um you could absolutely well could you do this if you were to do that at the 
beginning short answer no okay i retract that no because i finally thought of a problem with this if you store it at the beginning instead we'll see in just a moment how you can actually write code to figure out where the end of a string is and the problem there is you wouldn't necessarily know if you eventually hit a zero at the end of the string because it's the number zero in the context of like excel using some memory or if it's the context of some other data type altogether so the fact that we've standardized the fact that we've standardized strings as ending with null means that we can reliably distinguish one variable from another in memory and that's actually a perfect segue now to actually using this primitive to building up our own code that manipulates these things at a lower level so let me go ahead and do this let me create a new file this time called length and let's just use this basic idea to figure out what the length of a string is after uh it's been stored in a variable here so let's go ahead and do this let me include both the cs50 header and the standard i o header give myself int main void again here and inside of main let me go ahead and do this let me prompt the user for a string s and i'll ask them for a string like their name here and then let me go ahead and actually let me name it more verbosely name this time and now let me go ahead and do this let me iterate over every character in this string in order to figure out what its length is so initially i'm going to go ahead and say this int length equals 0 because i don't know what it is yet so we're going to start at zero and then while the following is true well let me do i want to do this let me change this to i just for clarity let me go ahead and do this while name bracket i does not equal that special null character so i typed it on the slide as nul but you don't write nul in code you actually use its numeric equivalent which is backslash 0 in single quotes while name bracket i does 
not equal the null character i'm going to go ahead and increment i with i plus plus and then down here i'm going to print out the value of i to see what we actually get printing out the value of i all right so what's going to happen here let me go ahead and run make length fortunately no errors dot slash length and let me type in something like hi exclamation point enter and i get three let me try bye exclamation point enter and i get four let me try my own name david enter five and so forth so what's actually going on here well it seems that by way of this while loop we are specifying a local variable called i initialized to zero because we're figuring out the length of the string as we go i'm then asking the question does location zero that is i in the name string which we now know is an array does it not equal backslash zero because if it doesn't that means it's an actual character like h or b or d so let's increment i then let's come back around to line nine and let's ask the question again now i equals one so does name bracket one not equal backslash zero well if it doesn't and it won't if it's an i or a y or an a based on what i typed in we're going to increment i once more fast forward to the end of the story once i get to the end of the string technically one space past the end of the string name bracket i will equal backslash zero so i don't increment i anymore i end up just printing the result so what we seem to have here with some low level c code just this while loop is a program that figures out the length of a given string that's been typed in let's practice our abstraction and decompose this into maybe a helper function here let me actually grab all of this code here and let me assume for the sake of discussion for a moment that i can just call a function now called string length and the length of the string is name that i want to get and then i'll go ahead and print out just as before with percent i the length of that string so now i'm abstracting away 
this notion of figuring out the length of the string that's an opportunity for me to create my own function if i want to create a function called string length i'll claim that i want to take a string as input and what should i have this function return as its return type what should string length presumably return yeah an int right and it makes sense float really wouldn't make sense because we're measuring things that are uh integers in this case the length of something so indeed let's have it return an int i can pretty much use the same code as before so i'm just going to paste what i cut earlier in the file and the only thing i have to change here is the name of the variable because now this function i decided kind of arbitrarily that i'm going to call it s just to be more generic so i'm going to look at s bracket i at each location and i don't want to print it at the end this would be a side effect what's the line of code i should include here if i actually want to hand back the total length yeah say again return i in this case so i'm going to go ahead and return i not print it because now my main function can use the return value stored in length and print it on the next line itself i just need a prototype so that's my one forgivable copy paste here i'm going to rerun make length hopefully i didn't screw up i didn't dot slash length i'll type in hi oops i'll type in hi again that works i'll type in bye again and so forth all right so now we have a function that determines the length of a string well it turns out we didn't actually need this all along it turns out that we can get rid of my own custom string length function here i can definitely delete the whole implementation down here because it turns out in a file called string.h which is a new header file today we actually have access to a function called more succinctly strlen which literally does that this is a function that comes with c albeit in the string.h header file and it does pretty 
much what we just implemented manually so here's an example of admittedly a wheel we just reinvented but no more we don't have to do that and how do you know what kinds of functions exist well let me actually pop out of my browser here to a website that is cs50's incarnation of what are called manual pages it turns out that in a lot of systems macs and unix and linux systems including the visual studio code instance that we have in the cloud there are publicly accessible manual pages for functions they tend to be written very expertly in a way that's not very beginner friendly so what we have here at manual.cs50.io is cs50's version of manual pages that have this less comfortable mode that give you a sort of cheat sheet of very frequently used helpful functions in c and we've translated the sort of expert notation to things that a beginner can understand so for instance let me go ahead and search for string up at the top here you'll see that there's documentation for our own get string function but more interestingly down here there's a whole bunch of string related functions that we haven't even seen most of yet but there's indeed one here called strlen calculate the length of a string and so if i actually go to strlen here i'll see some less comfortable documentation for this function and the way a manual page typically works whether in cs50's format or any other system is you see typically a synopsis of what header files you need to use the function so you would copy paste these couple of lines here you see what the prototype is of the function so that you know what its inputs are if any and its outputs are if any then down below you might see a description which in this case is pretty straightforward this function calculates the length of s then you see what the return value is if any and you might even see an example like this one that we've whipped up here so these manual pages which are again accessible here and we'll link to these in the problem sets 
moving forward are pretty much the place to start when you want to figure out has a wheel been invented already is there a function that might help me solve some problem set problems so that i don't have to really get into the weeds of doing all of those lower level steps as i've had sometimes the answer is going to be yes sometimes it's going to be no but again the point of our having just done this together is to reveal that even the functions you start taking for granted they all reduce to some of these basic building blocks at the end of the day this is all that's inside of your computer is zeros and ones we're just learning now how to harness those and how to manipulate them ourselves any questions here on this any questions at all good question is it so common that you would have to specify it or not you do need to include its header files because that's where all of those prototypes are you don't need to worry about linking it in with dash l anything and in fact moving forward you do not ever need to worry about linking in libraries when compiling your code we the staff have configured make to do all of that for you automatically we want you to understand that it is doing it but we'll take care of all of the dash l's for you but the onus is on you for the prototypes and the header files other questions on these representations or techniques yeah a good question if you were to have a string with actual spaces in it that is multiple words what would the computer actually do well for this let me go to ascii chart.com which is just a random website that's my go-to for the first 127 characters of ascii um this is in fact what we had a screenshot of the other day and if you look here it's a little non-obvious but sp is space if a computer were to store a space it would actually store the decimal number 32 or technically the pattern of zeros and ones that represent the number 32. 
all of the u.s english keys that you might type on a keyboard can be represented with a number and using unicode can you express even things like emojis and other languages question only strings are accompanied by nulls at the end because every other data type we've talked about thus far is of well-defined finite length one byte for char four bytes for ints and so forth if we think back though to last week we did end the week with a couple of problems integer overflow because like four bytes heck even eight bytes is sometimes not enough we also talked about floating point and precision thankfully in the world of scientific computing and financial computing there are libraries you can use that draw inspiration from this idea of a string and they might use 9 bytes for an integer value or maybe 20 bytes you can count really high but they will then start to manage that memory for you and what they're really probably doing is just grabbing a whole bunch of bytes and somehow remembering how long the sequence of bytes is that's how these higher level libraries work too all right this has been a lot let's take one more break here we'll do like a seven minute break here and when we come back we'll flesh out a few more details all right so we just saw strlen is an example of a function that comes in the string library let's start to take more of these library functions out for a spin so we're not relying only on the built-ins that we saw last week let me go ahead and switch over to vs code and let me create a file called say string.c just to kind of apply this lesson learned as follows let me go ahead and include cs50.h let me include stdio.h and this new thing string.h as well at the top i'm going to do the usual int main void here and then in this program suppose for the sake of discussion that i didn't know about percent s for printf or heck maybe early on there was no percent s format code and so there was no easy way to print strings well at least if we know that 
strings are just arrays of characters we could use percent c as a work around so to speak a solution to that sort of contrived problem so let me ask myself for a string s by using get string here and i'll ask the user for some input and then let me go ahead and print out say output and all i want to do is print back out what the user typed now the simplest way to do this of course is going to be like last week printf percent s and plug in the s and we're done but again for the sake of discussion i forgot about or someone didn't implement percent s so how else could we do this well in pseudocode or in english like what's the gist of how we could solve this problem printing out the string s on the screen without using percent s how might we go about solving this just in english high level what would your pseudo code look like yeah okay so just print each letter and maybe more precisely like some kind of loop like let's iterate over all of the characters in s and print one at a time so how can i do that well for int i equals zero is kind of the go-to starting point for most loops i is less than okay how long do i want to iterate well it's going to depend on what i type in but that's why we have strlen now so iterate up to the length of s and then increment i with plus plus on each iteration and then let's just print out percent c with no new line because i want everything on the same line uh whatever the character is at s bracket i and then at the very end i'll give myself that new line just to move the cursor down to the next line so the dollar sign's not in a weird place all right so let's see if i didn't screw up any of the code make string enter so far so good dot slash string and let me type in something like hi enter and i see output of hi too let me do it once more with bye enter and that works too notice i very deliberately and quickly gave myself two spaces here and one space here just because i literally wanted these things to line up properly and input is shorter 
than output but that was just a deliberate formatting detail so this code is correct which is a claim i've made before but it's not well designed now it is well designed in that i'm using someone else's library function like i've not reinvented a wheel there's no line 15 or below i didn't implement string length myself so i'm at least kind of practicing what i've preached but there's still an imperfection a sub-optimality this one's really subtle though and you have to think about how loops work what am i doing that's not super efficient yeah in back yeah this is a little subtle but if you think back to the basic definition of a for loop and recall when i highlighted things last week what happens well the first thing is that i gets set to zero then we check the condition how do we check the condition we call strlen on s we get back an answer like three if it's h i exclamation point and zero is less than three so that's fine and then we print out the character then we increment i from zero to one we recheck the condition how do i recheck the condition i call strlen of s get back the same answer three compare three against one we're still good so we print out another character i gets incremented again i is now two we check the condition what's the condition well what's the string length of s it's still three two is still less than 3. so i keep asking the same question sort of stupidly because the string is presumably never changing in length and indeed every time i check that condition that function's going to get called and every time the answer for hi is going to be 3 3 3. 
so it's a marginal sort of sub-optimality but i could do better right like don't ask multiple times questions that you can remember the answer to so how could i remember the answer to this question and ask it just once how could i remember the answer to this question let me see yeah back there so store it in a variable right that's been our answer most any time we want to keep something around so how could i do this well i could do something like this int maybe length equals strlen of s then i can just change this function call so to speak and let me re fix my spelling here let me fix this to be now comparing against length and this is now okay because now strlen is only called once on line nine and i'm reusing the value of that variable aka length again and again and again so that's more efficient turns out that for loops actually let you declare multiple variables at once so we can actually do this a little more elegantly all in one line and this is just now some syntactic improvement i could actually do something like this n equals strlen of s and then i could just say n here or i could call it length but heck while i'm being succinct i'm just going to use n for number so now it's just a marginal change but i've now declared two variables inside of my loop i and n i is set to zero n is set to the string length of s but now hereafter all of my condition checks are just i less than n i less than n and n is never now changing all right so a marginal improvement there now that i've used this new function let's use some other functions that might be of interest let me go ahead and write a quick program here that maybe like capitalizes the beginning of uh that uh changes to uppercase some string that the user types in so let me go ahead and code a file called uppercase.c uh up here i'll use my new friends cs50.h and stdio.h and string.h so just as before int main void and then inside of main what i'm going to do this time is let's 
ask the user for a string s using get string asking them for the before value and then let me go ahead and just print out something like after uh so that it uh just so i can see what the uppercase version thereof is and luke do you mind taking the volume down a little bit and then after this let me go ahead and do the following for int i equals zero oh let's practice that same lesson so n equals the string length of s i is less than n i plus plus so really nothing new really fundamentally yet how do i now convert characters from lowercase if they are to uppercase in other words if i type in hi h i in lowercase i want my program now to uppercase everything to capital h capital i well how can i go about doing this well you might recall that there is this ascii chart so let's just consult this real quick on asciichart.com we've looked at this last week notice that a capital a is 65 capital b is 66 capital c is 67 and heck here's lowercase a lowercase b lowercase c and that's 97 98 99 and if i actually do some math there's like a distance of 32 right so if i want to go from uppercase to lowercase i can do 65 plus 32 will give me 97 and that actually works out across the board for everything else 66 plus 32 gets me to 98 or lower case b or conversely if you have a lower case a and its value is 97 subtract 32 and boom you have capital a so all right there's some arithmetic here involved but now that we know that strings are just arrays and we know that characters which are in those arrays are just binary representations of numbers i think we can manipulate a few of these things as follows let me go back to my program here and first ask the question if the current character in the array during this loop is lowercase let's force it to uppercase so how am i going to do that if the character at s bracket i the current location in the array is greater than or equal to lowercase a and s bracket i is less than or equal to lowercase z 
kind of a weird boolean expression but completely legitimate because in this array s is a whole bunch of characters that the humans typed in because that's what a string is greater than or equal to a might be a little nonsensical because when have you ever compared numbers to letters but we know from week zero lowercase a is 97. lowercase z is what is it one i don't even remember what's that 132. we know and so that would allow us to answer the question is the current letter lowercase alright so let me go ahead here and answer that question if it is what do i want to print out i don't want to print out the letter itself i want to print out the letter minus 32 right because if it happens to be a lowercase a 97 97 minus 32 gives me 65 which is uppercase a and i know that just from having stared at that chart in the past else if the character is not between little a and little z i'm just going to print out the character itself by printing s bracket i and at the very end of this i'm going to go ahead and print out a new line just to move the cursor to the next line so again it's a little wordy but this loop here which i borrowed from our code previously just iterates over the string aka array character by character through its length this line 11 here is just asking the question if that current character the ith character of s is greater than or equal to little a and less than or equal to little z that is between 97 and 132 then we're gonna go ahead and force it to uh uppercase instead all right and let me go ahead and zoom out here for just a second and sorry i misspoke 122 which is what you might have said there's only 26 letters so 122 is little z let me go ahead now and compile and run this program so make uppercase dot slash uppercase and let me type in hi in lowercase enter and there's the capitalized version thereof let me do it again with like my own name in lowercase and now it's capitalized as well well what could we do to improve this well you know 
what let's stop reinventing wheels let's go to the manual pages so let me go here and search for something like uh i don't know lowercase and there i go i did some autocomplete here our little search box is saying that okay there's an is lower function check whether a character is lowercase well how do i use this well let me check is lower now i see the actual man page for this function um now we see include ctype.h so that's the proto that's the header file i need to include this is the prototype for is lower it apparently takes a char as input and returns an int which is a little weird i feel like is lower should return true or false so let's scroll down to the description and return value it returns oh this is interesting and this is a convention in c this function returns a non-zero int if c is a lowercase letter and zero if c is not a lowercase letter so it returns non-zero so like one negative one something that's not zero if c is a lowercase letter and zero if it is not a lowercase letter so how can we use this building block let me go back to my code here let me add this file include ctype.h and down here let me get rid of this cryptic expression which was kind of you know painful to come up with and just ask this is lower s bracket i uh that should actually work but why well is lower again returns a non-zero value if the letter is lower case well what does that mean that means it could return one it could return negative one it could return 50 or negative 50. 
it's actually not precisely defined why just because this was a common convention to use zero to represent false and use any other value to represent true and so it turns out that inside of boolean expressions if you put a value like a function call like this that returns 0 that's going to be equivalent to false it's like the answer being no it is not lower but you can also just in parentheses put the name of the function and its arguments and not compare it against anything because we could do something like this well if it's not equal to zero then it must be lowercase because that's the definition if it returns a non-zero value it's lowercase but a more succinct way to do that is just a bit more like english if islower then print out the character minus 32 so this would be the common way of using one of these is functions to check if the answer is true or false okay sorry say it again no so it's not necessarily one it would be incorrect to check for one or negative one or anything else you want to check for the opposite of zero so not equal to zero or more succinctly like i did by just putting it into parentheses let me see what happens here so this is great but some of you might have spotted a better solution to this problem a moment ago when we were on the manual pages searching for things related to lowercase what might be another building block we can employ here based on what's on the screen here yeah so toupper there's a function that would literally do the uppercasing for me so i don't have to get into the weeds of negative 32 plus 32 i don't have to consult that chart someone has solved this problem for me in the past and let's see if i can actually get back to it there we go let me go ahead now and use this so instead of doing s bracket i minus 32 let's use a function that someone else wrote and just say toupper s bracket i and now it's going to do the work for me so if i rerun make uppercase
and then do dot slash uppercase and type in hi now it's working as expected and honestly if i read the documentation for toupper by actually going back to its man page or manual page what you'll see is that it says if it's lowercase it will return the uppercase version thereof if it's not lowercase if it's already uppercase or it's punctuation it will just return the original character which means thanks to this function i can actually tighten this up significantly get rid of my whole conditional there and just print out the toupper return value and leave it to whoever wrote that function to figure out if something's uppercase or lowercase all right questions on these kinds of tricks again it all reduces to week zero basics but we're just building these abstractions on top yes unfortunately no there is no easy way in c to say give me everything that was historically for performance reasons they want you to be explicit as to what you want to include in other languages like python and java one of which we'll see later this term you can say give me everything but that actually tends not to be best practice because it can actually slow down execution or compilation of your code yeah say again what kinds of special characters does toupper accommodate special characters like punctuation yes if i read the documentation more pedantically we would see exactly that it will properly hand me back an exclamation point if i passed one in so if i do make uppercase here and let me do dot slash uppercase hi with an exclamation point it's going to handle that too and just pass it through unchanged yeah really good question too no we do not have access to a function that at least comes with c or comes with cs50's library that will just force the whole thing to uppercase in c that's actually easier said than done in python it's trivial so stay tuned for another language that will let us do exactly that all right so what does this leave us with
let's come full circle now to where we began today where we were talking about those command line arguments recall that we talked about rm taking a command line argument the name of the file you want to delete we talked about clang taking command line arguments that again modify the behavior of the program how is it that maybe you and i can start to write programs that actually take command line arguments well here is where i can finally explain why we've been typing int main void for the past week and just asking that you take on faith that it's just the way you do things well by default in c at least the most recent versions thereof there are only two official ways to write main functions you might see other formats online but they're generally not consistent with the current specification this again was sort of the boilerplate for the simplest function we might write last week and recall that we've been doing this the whole time what that void means for all the programs i have written thus far and you have written thus far is that none of our programs take command line arguments that's what the void there means it turns out that main is the way you can specify that your program does in fact take command line arguments that is words after the command in your terminal window if you want to actually not use get int or get string you want the human to be able to say something like hello david and hit enter and just run hello and print hello david on the screen you can use command line arguments words after the program name on your command line so we're going to change this in a moment to be something more verbose but something that's now a bit more familiar syntactically if you change that void in main to be this incantation instead int argc comma string argv open bracket close bracket you are now giving yourself access to writing programs that take command line arguments argc which stands for argument count is going
to be an integer that stores how many words the human typed at the prompt c automatically gives that to you string argv stands for argument vector that's going to be an array of all of the words that the human typed at the prompt so with today's building block of an array we have the ability now to let the human type as many words or as few words as they want at the prompt c is going to automatically put them in an array called argv and it's going to tell us how many words there are in an int called argc the int as the return type here we'll come back to in just a moment let's actually use now this definition to make maybe just a couple of simple programs but in problem set two we'll actually use this to control the behavior of your own code let me go ahead and code up a file called argv dot c just to keep it aptly named and here we have include cs50.h include stdio.h int main not void let's actually say int argc string argv open bracket close bracket no number in between because you don't know in advance how many words the human is going to type at their prompt now let's go ahead and do this let's write a very simple program that just says hello david hello carter whoever the name is that gets typed but not using get string let's instead have the human just type their name at the prompt just like rm just like clang just like make so it's just one and done when you hit enter no additional prompts let me go ahead then and do this printf quote unquote hello comma and instead of world today i want to print out whatever the human typed in so let's go ahead and do this argv bracket 0 for now but i don't think this is quite what i want because of course that's going to literally print out argv bracket 0 so i need a placeholder so let me put a percent s here and then put that here so
if argv is an array but it's an array of strings then argv bracket 0 is itself a single string and so it can be plugged into that percent s placeholder let me go ahead and save my program and let me go ahead and compile argv so far so good let me go ahead now and type in my name after the name of the program so no get string i'm literally typing an extra word my own name at the prompt enter okay it's apparently a little buggy in a couple of ways i forgot my backslash n but that's not a huge deal but apparently inside of argv is literally everything the human typed in including the name of the program so logically how do i print out hello david or hello so and so and not the actual name of the program what needs to change here yeah yeah so presumably index to one if that's the second thing that i or whichever human has typed at the prompt so let's do make argv again dot slash argv enter huh hello null so this is another form of null but this is user error now on my part i didn't do exactly what i said i would yeah yeah i forgot the parameter so i should probably deal with that somehow so that people aren't sort of breaking my program and printing out random things like null but if i do say argv david now you see hello david i can get a little curious like what's at location two well we can see make argv dot slash argv david enter all right so just nothing is there but it turns out in a couple weeks we'll start really poking around memory and see if we can't crash programs deliberately because nothing is technically stopping me from saying oh what's at location 2 million for instance we could really start to get curious but for now we'll do the right thing but let's now make sure the human has typed in the right number of words so let's say this if argc equals 2 that is the name of the program and one more word after that go ahead and trust that in argv bracket 1 as you proposed is the person's name else let's go ahead and just default here to
something simple and basic like well if we don't get a name from the user just say hello world like always so now we're sort of programming defensively this time the human even if they screw up they don't give us a name or they give us too many names we're just going to say hello world because i now have some error handling here because again argc is argument count the number of words total typed at the command line so make argv dot slash argv let me make the same mistake as before okay i don't get this weird null behavior i get something well defined i could now do david i could do david malan but that's not currently supported i would need to alter my logic to support more than just two words at the prompt so what's the point of this at the moment it's just a simple exercise to actually give myself a way of taking user input when they run the program because consider it's just more convenient in this new command line interface world if you had to use get string every time you compile your code it'd be kind of annoying right you type make then you might get a prompt what would you like to make then you type in hello or cash or something else then you hit enter it just really slows the process but in this command line interface world if you support command line arguments then you can use these little tricks like scrolling up and down in your history with your arrow keys you can just type commands more quickly because you can do it all at once and you don't have to keep prompting the user more pedantically for more and more info so any questions then on command line arguments which finally reveals why we had void initially but what more we can now put in main that's how you take command line arguments yes if you were to type at the command line something like not a word but something like the number 42 that would actually be treated as a string why because again context matters so if your program is currently manipulating memory as though it's
characters or strings whatever those patterns of zeros and ones are they will be interpreted as ascii text or unicode text if we therefore go to the chart here that might make you wonder well then how do you distinguish numbers from letters in the context of something like chars and strings well notice 65 is capital a 97 is lowercase a but also 49 is 1 and 50 is 2 so the designers of ascii and then later unicode realized well wait a minute if we want to support programs that let you type things that look like numbers even though they're not technically ints or floats we need a way in ascii and unicode to represent even numbers so here are your numbers and it's a little silly that we have numbers representing other numbers but again if you're in the world of letters and characters you've got to come up with a mapping for everything and notice here here's the dot even if you were to represent 1.23 as a string or as characters even the dot now is going to be represented as an ascii character so again context here matters all right one final example to tease apart what this int is and what it's been doing here for so long so i'm going to go ahead and add one bit of logic here to a new file that i'm going to call exit dot c so in exit dot c we're going to introduce something generally known as an exit status it turns out this is not a feature we've used yet but it's just useful to know about especially when automating tests of your own code when it comes to figuring out if a program succeeded or failed it turns out that main has one more feature we haven't leveraged an ability to signal to the user whether something was successful or not and that's by way of main's return value so i'm going to go ahead and now modify this program as follows like this suppose i want to write a similar program that requires that the user type a word at the prompt so that argc has to be 2 for whatever design purpose if argc does not equal two i want to quit out of my program prematurely because i
want to just insist that the user operate the program correctly so i might give them an error message like missing command line argument backslash n but now i want to quit out of the program now how can i do that the right way quote-unquote to do that is to return a value from main now it's a little weird because no one called main yet right main just gets called automatically but the convention is anytime something goes wrong in a program you should return a non-zero value from main one is fine as a go-to we don't need to get into the weeds of having many different exit statuses so to speak but if you return one that is a clue to the system the mac the pc the cloud device that something went wrong why because one is not zero if everything works fine like let's go ahead and print out hello comma percent s like before quote-unquote argv bracket 1 i want to signal to the computer that all is well and so i return zero so this is just a version of the program without an else this is the same as doing essentially an else here like i did earlier but strictly speaking if i'm already returning here if i really want to be nitpicky i don't technically need the else because the only way i'm going to get to line 11 is if i didn't already return so what's going on here the only new thing here logically is that for the first time ever i'm returning a value from main that's something i could always have done because main has always been defined by us as having an int as its return value by default main automatically sort of secretly returns zero for you if you've never once used the return keyword which you probably haven't in main it just automatically returns zero and the system assumes that all went well but now that we're starting to get a little more sophisticated with our code and you know in the program that something went wrong you can abort programs early you can exit out of them by returning some other value besides zero from main and this is
sort of fortuitous that it's an int right zero means everything worked unfortunately in programming there are seemingly an infinite number of things that can go wrong an int gives you four billion possible codes that you can use aka exit statuses to signify errors so if you've ever on your mac or pc gotten some weird pop-up that an error happened sometimes there's a cryptic number in it maybe it's positive maybe it's negative it might say error code 123 or negative 49 or something like that what you're generally seeing are these exit statuses these return values from main in a program that someone at microsoft or apple or somewhere else wrote something went wrong they are sort of unnecessarily showing you the user what the error code is if only so that when you call customer support or submit a ticket you can tell them what exit status you encountered what error code you encountered all right any questions then on exit statuses which is the last of our new building blocks for now any questions at all yeah the question is can you do things again and again at the command line like you could with get string and get int which by default recall are automatically designed to keep prompting the user in their own loop until they give you an int or a float or the like with command line arguments no you're going to get an error message but then you're going to be returned to your prompt and it's up to you to type it correctly the next time good question yeah if you do not return a value explicitly main will automatically return zero for you that is the way c simply works so it's not strictly necessary but now that we're starting to return values explicitly if something goes wrong it would be good practice to also start returning a value from main when something goes right and there are no errors in fact so let's now get out of the weeds and contextualize this for some actual problems that we'll be solving in the coming days by way of problem set two and beyond so here for
instance is a problem that you might think back to from when you were a kid the readability of some text or some book the grade level at which some book is written if you're a young student you might read at a first grade level or third grade level in the us or if you're in college presumably you're reading at a university level of text but what does it mean for text like in a book or in an essay or something like that to correspond to some kind of grade level well here's the title of a childhood book one fish two fish red fish blue fish what might the grade level be for a book that has words like this maybe when you were a kid or if you have siblings still reading these things what might the grade level of this thing be any guesses yeah sorry again before grade one is in fact correct so that's for really young kids and why is that well let's consider these are actually pretty simple phrases right one fish two fish red fish i mean there aren't even verbs in these sentences they're just nouns and adjectives and very short sentences and so that might be a heuristic we could use when analyzing text well if the words are kind of short the sentences are kind of short everything's very simple that's probably a very young or early grade level and so by one formulation it might indeed be even before grade one for someone quite young how about this mr and mrs dursley of number four privet drive were proud to say that they were perfectly normal thank you very much they were the last people you would expect to be involved in anything strange or mysterious because they just didn't hold with such nonsense and onward all right what grade level is this book at okay i heard third seventh fifth okay all over the place but grade seven according to one particular measure and we can debate exactly what age you were when you read this and maybe you're feeling ahead of your time or behind now but here we have a snippet of text what makes
this text assume an older audience a more mature audience a higher grade level would you think yeah it's longer different types of words there's commas now and phrases and so forth so there's just some kind of sophistication to this so it turns out for the upcoming problem set among the things you'll do is take as input texts like this and analyze them considering well how many words are in the text how many sentences are in the text how many letters are in the text and use those according to a well-defined formula to prescribe what exactly the grade level of some actual text might be well what else are we going to do in the coming days well i've alluded to this notion of cryptography in the past this notion of scrambling information in such a way that you can hide the contents of a message from someone who might otherwise intercept it right the earliest form of this might also be when you're younger and you're in class and you're passing a note from one person to another from yourself to someone else you don't want to just necessarily write a note in english or some other written language you might want to scramble it somehow or encrypt it maybe you change the a's to a b and the b's to a c so that if the teacher snaps it up and intercepts it they can't actually understand what it is you've written because it's encrypted now so long as your friend the recipient of this note knows how you manipulated it how you added to or subtracted from the letters they can decrypt it which is to say reverse that process so formally in the world of cryptography and computer science this is just another problem to solve your input though when you have a message you want to send securely is what's generally known as plaintext there's some algorithm that's going to then encipher or encrypt that information into what's called ciphertext which is the scrambled version that theoretically can get safely intercepted and your message has not been
spoiled unless that interceptor actually knows what algorithm you used inside of this process so that would be generally known as a cipher these ciphers typically take though not one input but two if for instance your cipher is as simple as a becomes b b becomes c c becomes d dot dot dot z becomes a you're essentially adding one to every letter and encrypting it now that one would be what we call the key you and the recipient both have to agree presumably before class in advance what number you're going to use that day to rotate or change all of these letters by because when you add one they upon receiving your ciphertext have to subtract one to get back the answer so for instance if the input plaintext is hi exclamation point as before and the key is one the ciphertext using this simple rotational algorithm otherwise known as a caesar cipher might be i j exclamation point so it's similar but it's at least scrambled at first glance and unless the teacher really cares to figure out what algorithm are they using today or what key are they using today it's probably sufficiently secure for your purposes how do you reverse the process well your friend gets this and reverses it by subtracting one so i becomes h j becomes i and things like punctuation remain untouched at least in this scheme so let's consider one final example here if the input to the algorithm is u i j t x b t d t 5 0 and the key this time is negative 1 such that now b should become a and c should become b and a should become z so we're going in the other direction how might we analyze this well if we spread all the letters out and we start from left to right and we start subtracting one letter u becomes t i becomes h j becomes i t becomes s x becomes w b becomes a and so on this was cs50 we'll see you next time

Info

Channel: CS50

Views: 78,271

Keywords: cs50, harvard, computer, science, david, j., malan

Id: zsKs38dwPZk

Length: 152min 5sec (9125 seconds)

Published: Mon Sep 13 2021