CS50 2020 - Lecture 5 - Data Structures

Video Statistics and Information

Video

Captions Word Cloud

Reddit Comments

Captions

[Music] [Music] all right this is cs50 and this is week five recall that last week in week four we introduced a few new building blocks namely pointers and spoke in great detail about how you can now manipulate a computer's memory and begin to do things at a lower level with it well today we'll sort of use those basic building blocks to start creating things called data structures in the computer's memory it turns out that once you have this ability to refer to different locations in the computer's memory you can kind of stitch together your own custom shapes your own custom data structures as they're called and indeed we'll start doing that by rolling back for just a moment to where we first saw a data structure in week two so recalling week two which was our second week of playing with c we introduced you to the notion of an array and an array is just a contiguous sequence of memory in which you can store a whole bunch of integers back to back to back or maybe a whole bunch of chars back to back to back and those uh arrays might have been represented pictorially like this so this would be an array of size three and suppose that this were an array indeed of integers we might have one two three inside of it but suppose now and here's where we'll start to bump up against a problem but also solve a problem today suppose that you want to add another number to this array but you've only had the forethought to create an array of size three the catch with arrays in c is that they're not really easily resizable again you all know that you have to decide in advance how big the array is going to be so if you sort of change your mind later or your program's running long enough where you need to store more values in it you're kind of in a bind like intuitively if you wanted to insert the number four into this array you would ideally just plop it right here at the end of the array and continue about uh your business but the catch with an array is that that chunk of memory is not uh it doesn't exist in a vacuum recall if we sort of zoom out and look at all of your computer's memory this byte this byte and a whole bunch of other bytes might very well be in use by other variables or other aspects of your program so for instance for the sake of discussion suppose that the program in question has one array of size three containing the integers one two three and then suppose that your same program has a string somewhere in the code that you've written and that string is storing hello world well next to this array in memory just by chance may very well be in h-e-l-l-o comma space w-o-r-l-d backslash zero and there might be free memory so to speak memory you could use that's filled with garbage values and garbage isn't bad it just means that you don't know you don't really care what values are or were there so there is free space so to speak each of these oscars represents effectively free space with some garbage value there remnants may be of some execution passed but the problem here is that you don't have room to plop this four right where you might want to put it so what's the solution if we have this array of size three containing three integers one two three but it's kind of been painted into a corner whereby h e l l and so forth are already immediately next to it we can't just put the four there without sacrificing the h and that doesn't really feel like a solution so any thoughts on how we can solve this problem are we sort of completely out of luck can you just not add a number to an array in a situation like this or is there a solution perhaps that comes to mind even if you've never programmed before if on the screen there sort of the lay of the land santiago what do you think i would say that um maybe you could copy the elements in the original array and create a new array but that's one size bigger or one element bigger and then add that new element yeah that's really good intuition after all there's all these oscars on the screen right now which again represent garbage values or in turn free space so i could put one two three four up here i could put one two three four over here i could put one two three four down here so we sort of have some flexibility but santiago is exactly right intuitively all we really need to do is let's focus only on four of the available spots a new array of size four if you will which initially has these four garbage values but that's okay because the santiago also notes we it suffices now to just copy the old array one two three into the new array and heck maybe we can now even free the memory from the original ray much like we could if we used malloc and that leaves us then of course with just an array of size four with that fourth garbage value but now we do have room for like the number four itself so it would seem that there is a solution to this problem that doesn't violate the definition of an array again the only definition of an array really is that the memory must be contiguous you can't just pop the four anywhere in the computer's memory it has to become right after your existing memory if this whole thing this whole structure is indeed going to still be an array but i worry that that might have cost us a bit of time and in fact let me go ahead and open up on the screen here in just a moment a question that you're all welcome to buzz in for what would be the running time of inserting into an array let me go ahead and reveal the poll question here feel free to go to the usual url which brian if you wouldn't mind copying pasting as usual what do you think the running time is of inserting into an array inserting into an array recall in the past we talked about arrays in terms of searching but today this is the first time we're really talking about inserting into them and if we take a look at the results coming in it looks like 80 some percent of you feel that it's linear time big o of n whereby it might take as many as n steps to actually insert into an array five percent of you propose n log n seven percent of you n squared and then two and five percent respectively for the other values as well so this is a kind of an interesting mix because when we talked about arrays in the past and we talked about searching recall that we typically achieve uh big o of log n that's really good unfortunately if we have to do what santiago proposed and actually copy all of the elements from the old array into the new array it's going to indeed take us as many as n steps because we have to copy each of the original elements one two three over into the new array which is going to be of size n plus one so it's on the order of n steps in total so the running time then of inserting into an array in terms of an upper bound at least is going to be indeed big o of n because you've got to copy potentially all of those elements over but perhaps we might feel differently if we consider a lower bound on the running time of insert what might the lower bound of insert be when it comes to an array again omega notation is what we can use here how many steps maybe in the best case might it take me to insert a value into an array we won't do a poll for this one brian why don't we go ahead and call in a hand for this what's a lower bound on the running time of insert ryan what do you think well the best case scenario would be if there's only one element in the array so you would just have one or it would just be one element right into the array yeah so if you've got an array and let me emphasize that's already empty whereby you have room for the new element then indeed omega of one constant time is all you need so long as you have space available in the array and it doesn't matter how large the array is maybe it's a size four but you've only put one number in it that's okay because you can immediately put the new number in place recall that arrays support random access you can use the square bracket notation and just jump to any location in so-called constant time in just one step so if your array is of sufficient size and is not full then yes the lower bound on the insertion into an array is going to be constant time omega of one but as we saw in santiago's situation whereby you have an array that's already filled with elements and you want to add another well in that case then upper bound is indeed going to be big o of n because you have to do the additional labor of copying the values over from one to the other now those of you have programmed before maybe have used java you might be familiar with the phrase vector a vector is kind of like an array that can be resized can grow and shrink that's not what arrays are in c arrays and c are just contiguous blocks of memory with values back to back to back but once you decide on their size that is it you're going to have to resize it essentially yourself they're not going to automatically grow for you so an array was the first and really the simplest of the data structures we'll see but it's also not nearly as powerful as what we can do now that we have access to a computer's memory today we're going to start to leverage these things called pointers the addresses by which we can refer to locations in memory and we're going to start to stitch together some fancier data structures first one-dimensional in some sense then two-dimensional in some sense by using some very basic building blocks recall these three pieces of syntax from past weeks struct recall is this mechanism this key word in c whereby we can define our own structures in memory we saw one for a person whereby a person might have both a name and a number in something like a phone book and you've seen the dot operator the dot operator was how we go inside of such a structure and get at dot name or dot number the specific variables inside of the struct and then last week recall we saw the star operator whereby the dereference operator which colloquially means go to this particular address so just by using these three ingredients are we going to be able to now build up our own custom data structures that are even fancier than arrays and can ultimately help us solve problems more efficiently and in fact this is such a common technique in programming in c that these two operators the dot and the star are often combined into one that intentionally looks like an arrow so we'll see its usage in just a little bit so syntactically today it's pretty much building blocks past but we're going to use these building blocks now in new ways to start solving problems differently and we'll first do this by way of something called linked lists so for instance a linked list is going to be a data structure that solves some of the problems with an array and what's a problem arguably well if it takes big o of n steps to insert into an array frankly that's kind of annoying that's kind of expensive because over time if you're writing a really big program with lots of data you're the googles of the world the twitters of the world it's fine if you want to store all of your data in a big array for the sake of efficient searching which recall was big o of log n if we use something like binary search and keep everything sorted but it's going to be pretty painful if every time you want to add another tweet to the array or some other web page to the array depending on what the problem is that you're solving you might potentially santiago notes have to copy all of the contents of your original smaller array into a new bigger array just to add more tweets or more web pages or the like so a linked list is going to be a data structure that's more dynamic whereby you can grow and shrink the data structure without having to touch all of the original data and move it from old location to new so what might like what might this look like well let's consider again our computer's memory and let's propose that i want to store those same values again so like the number one and just for the sake of discussion suppose that it's in my computer's memory at address ox123 ox just means it's a hexadecimal number one two three is completely arbitrary i made it up just for the sake of discussion so let me stipulate that that's where the number one happens to be in the computer's memory in this new solution to the problem of storing lots of data suppose i want to store number two maybe it's at address ox456 and then suppose i want to store a number three suppose it's at address ox 789 so notice deliberately these numbers are spread out in the computer's memory because after all the problem we ran into a moment ago with arrays is that you might have hello world or other values from other parts of your program kind of in the way so if i'm proposing instead that if you want to store one and then two and then three that's fine plop them anywhere you want and you don't have to worry about where there is already existing values instead you can just put these values where there is room the problem though is that if you just start plopping values like one two three anywhere you want in the computer's memory you can no longer very easily find those values right you might know where one is but it no longer suffices to just look one location to the right to find the next value or add two to find the next value after that in an array everything is contiguous but if we instead start to treat the computer's memory it's just a canvas where we can just draw numbers anywhere we want that's fine so long as we can somehow help ourselves get from the first to the second to the third irrespective of all the other stuff that's cluttering up the computer's memory so in fact let me propose that we do this by maybe stealing a bit more space from the computer so rather than use just enough memory to store one two and three let me store twice as much information in addition to every number that i actually care about one two three my data let me store a little metadata so to speak values that i don't fundamentally care about but that are going to help me keep track of my actual data and let me propose that in this box here i literally store the value ox456 again it's written in hexadecimal but that's just the number that's the address of somewhere else in memory in this box let me propose that i store ox 789 and in this box let me arbitrarily say ox0 which is just all zero bits now why have i done this even if you've never seen this structure that's evolving to be what's called a linked list why have i just done what i've done in addition to storing one two and three respectively i'm also now storing ox456 in an additional chunk of memory and ox 789 in an additional chunk of memory but why sophia so that we know how the first element like relates to the second or how they're linked together exactly between the first and second yeah so now i'm using essentially twice as much space to store each piece of data one two and three respectively but in the second chunk of space i'm now storing a pointer to the next element in the thing i'll now think of as a list so this is ox456 because the number two which is the next number in the list i care about lives at ox456 this number is ox 789 because the next number after that that i care about is at address 0x789 so it's just a helpful way now of kind of leaving myself breadcrumbs so that i can plop the one the two the three anywhere i want in the computer's memory wherever there's available space and still figure out how to get from one to the other to the other and we've actually seen some of this syntax before it turns out that ox 0 which is all zero bits that's just uh the technical uh that's the numeric equivalent of what we've called null n-u-l-l which we introduced last week is a special symbol indicating that something has gone wrong with memory or you're out of space it's sort of the absence of an address c guarantees that if you use ox0 aka null that just indicates the absence of any useful address there but you know what again like last week this is sort of getting way into the weeds i don't really care about ox one two three four five six seven eight nine so let's just kind of abstract this away and start thinking about this really as a list of numbers that are somehow linked together underneath the hood the links are implemented by way of addresses or pointers those low level numbers like ox123 456 nine but pictorially it kind of suffices for us to just start thinking of this data structure here after known as a linked list as being a collection of nodes so to speak and ode that are connected via pointers so a node is just a generic computer science term that refers to some kind of structure that stores stuff you care about what i care about here is a number and a pointer to another such structure so this is a linked list and each of these rectangles represents a node and in code we'll implement those nodes ultimately by way of that thing in c called a struct but let me pause here to see first if there are any questions about the structure we've built up any questions about this thing called a linked list before we see it in some code brian yeah a question came in in the chat asking isn't this kind of a waste of memory that we're now using all of these cells to store addresses too yeah really good observation isn't this kind of a waste of memory and that we're storing all of these addresses in addition to the numbers one two three that we care about yes and in fact that is exactly the price we are paying and this is going to be thematic this week last week and really every week thereafter anytime we solve a problem in cs and programming in particular there's always going to be some price paid there's going to be some cost and really there's going to be some trade-off so if a moment ago it was unacceptable that inserting into an array is in big o of n because man that's going to take so many steps to copy all of the values from the old array into the new array if that is unacceptable to you for performance reasons or whatever the problem is that you're solving well that's fine you can solve that problem and now have much more dynamism whereby you can plop your numbers anywhere in memory without having to move the existing numbers anywhere else thereby saving yourself time but the price you're going to pay indeed is more space so at that point it kind of depends what's more important to you the computer's time your human time or maybe the space or the cost of the space the more memory that you might really need to literally buy for that computer so this is going to be thematic this trade-off between space and time is omnipresent really in programming well let's consider how we might translate this now into actual code recall that when we last saw structs in c we did something like this to define a person as having two things associated with them a name and a number so today we don't care about persons and names and numbers we care about these things we're going to start calling nodes so let me go ahead and rewind from that erase that and let's instead say that every node in this structure renaming person as well to node is going to have a number which will be an int in our case here and i've left room for one other value because we ultimately need to be able to store that second piece of data that second piece of data is going to be a pointer it's going to be an address of something but how might i express this this is going to be a little less obvious but we laid the foundation last week with pointers how could i describe this structure as having a pointer to another such structure any thoughts on verbally the syntax to use or even if you're not sure of exactly the incantation exactly what symbols we should use to express and address to another such node uh someone is suggesting we use a node star as a pointer to a node node star alright so star i i definitely remember from last week in indicating that if you have int star this is the address of an int if you have char star this is the address of a char so if all of these arrows really just represent addresses of nodes it stands to reason that syntax is probably going to be something akin to node star now i can call this pointer anything i want by convention i'll call it next because literally its purpose in life is to point to the next node in the data structure and that's going to be in addition to the int called number that i'll propose describes the top of that individual data structure but there's a subtle problem here in c recall that in c it's kind of a a simplistic language complicated though it might often seem in that it doesn't understand anything that it hasn't seen before so at the moment notice that the very first time i have mentioned node up until now was on the very last line of the snippet of code the problem is that by nature of how typedef works the thing called a node does not actually exist until the compiler is done reading that last line of code and the semicolon which is to say that it would actually be incorrect to use or refer to quote unquote node inside of this structure because literally by definition of how typedef works a node does not exist until again that last line of code in the semicolon are executed thankfully there's a workaround it's a little weird to look at but what you can do in c is this you can actually add an additional word after literally the keyword struct and we'll keep it simple we'll use the exact same word though technically that's not necessary and now i'm going to change the inside of the structure to say this instead so it feels a little verbose and it feels like a little bit of copy paste but this is the way it is done in c type def struct node is essentially a hint similar in spirit to the prototypes we've talked about for functions that gives the compiler a clue that okay something called a struct node is going to exist you can then use it inside of that data structure and refer to it as struct node star it's more of a mouthful this week we've not seen multiple words like this but it's similar to char star or instar like last week and i'm going to call that arbitrarily next and down here the same thing happens as in the past with persons by calling this node at the very last line of the code that just tells the compiler you know what you don't have to refer to it as struct node all over the place you can just call this thing node so it's a little bit of it's a little verbose in this case but all this is done is create for me in the computer the definition of a node as we have depicted it pictorially with that that rectangle all right so how can we now translate this into more useful code not just defining what these structures are but how do we begin building up linked lists well let me propose that a linked list really begins with just a pointer and in fact here we have thanks to the theater's prop shop just kind of a null pointer if you will i'm going to call this variable list and list is currently pointing to nothing the arrow will say is just pointing at the floor which means it's null it's not actually pointing at anything useful suppose i now want to start to begin to allocate a linked list with three numbers one two and three well how am i going to do this well at the moment the only thing that exists in my program is going to be this variable called list there's no array in this story that was last uh that was uh in week two today is all about linked lists so how do i get myself a wooden block that represents one another wooden block that represents two and a third that represents three well i need to use our new friend from last week malloc recall that malloc allows you to allocate memory uh as much memory as you might want so long as you tell it the size of that thing so frankly i think what we could do ultimately today is use malloc to allocate dynamically one struct and put the number one in it another struct put the number two on it another struct put the number three in it and then use these playful arrows here to actually stitch them together having one point to the other so thankfully the prop shop has wonderfully created a whole bunch of these for us let me go ahead and malloc a very heavy node that has room for two values and you'll see it has a room for a number and a next pointer so the number i'm going to first install here is going to be the number one for instance and i'm going to leave its pointer as just pointing at the ground indicating that this is a null pointer it's not actually pointing at anything else but now that i'm starting to instantiate that is create this list now i'm going to do something like this and say that all right my variable called list whose purpose in life is to keep track of where this list is in memory i'm going to connect one to the other by actually having this variable point at this node when it comes time then to allocate another node i want to insert into this linked list back in the world of arrays i might have to allocate a new chunk of memory copy this value over into the new values i don't have to do that in the world of linked lists i just call malloc for a second time and say give me another chunk of memory big enough to fit a node thankfully from the prop shop we have another one of these so let me go ahead and unlock another node here at the moment there's nothing in it just the placeholder so it's garbage values if you will i don't know what's there until i actually say that the number shall be number two and then i go over to my linked list whose variable name is list and i want to insert this thing so i follow the arrow i then point the next field of this node at this node here so now i have a linked list of size two there's three things in the picture but this is just a simple variable this is a pointer that's pointing at the actual node which in turn is pointing at an actual other node now suppose i want to insert the number three into this linked list recall that malloc is powerful in that it takes memory from wherever it's available the so-called heap from your computer and that means by definition that it might not be contiguous the next number might not actually be here metaphorically in the computer's memory it might be way over there so that might indeed happen the third time i call malloc now and allocate a third node it might not be available anywhere in the computer's memory except for like way over here and that's fine it doesn't have to be contiguous as it did in the world of our arrays i can now put the number 3 in its place but if i want to keep a pointer to that node so that all these things are stitched together i again start at the beginning i follow the arrows i follow the arrows and now if i want to remember where that one is i'm going to have to connect these things and so now that pointer needs to point at this block here and this visual is very much deliberate these nodes can be anywhere in the computer's memory they're not necessarily going to be contiguous the downside of that is that you cannot rely on binary search our friend from like week 0 onward binary search was amazing in that it's big o of log n you can find stuff way faster than big o of n that was the whole point of even the phone book example in the very first week but the price the upside of this approach is that you don't have to actually keep allocating and copying more memory and all of your values anytime you want to resize this thing and i'm a little embarrassed to admit that i'm out of breath for some reason just mall locking notes here but the point is like using malloc and building up the structure does come at some price it's it's exhausting frankly but it's also going to spread things out in memory but you have this dynamism and honestly if you are the twitters of the world the googles of the world we have lots and lots of data it's going to kill you performance wise to have to copy all of your data from one location into another santiago originally proposed as a solution to the array so using these dynamic data structures like a linked list where you allocate more space wherever it's available and you somehow remember where that is by stitching things together as with these physical pointers is really the state of the art and how you can create these more dynamic structures if it's more important to you to have that dynamism all right any questions before we now translate these physical blocks to some samples of code a question came in in the chat when in this whole process of linked lists are we actually using malloc and what is malloc being used for really good question so where are we using malloc every time i went off stage and grabbed one of these big blocks that was my acting out the process of mallocking a node so when you call malloc that returns to you per last week the address of the first byte of a chunk of memory and if you call malloc of one that gives you the address of one byte if you call malloc of 100 that gives you the first address of a hundred bytes that are contiguous so each of these nodes then represents the return value if you will of a single call to malloc and in fact perhaps brian the best way to translate to answer that question in more detail is to translate this now to some actual code so let's do that and then revisit so here is for instance a line of code that represents the beginning of our story where we only had a variable called list that was initialized to nothing initially this arrow was not pointing at anything and in fact if it was just pointing up down left or right it would be considered a garbage value right this is just memory which means there are values inside of it before i actually put actual values in it so if i don't assign it a value who knows what it's pointing to but let me go ahead and change the code this list variable by default has some garbage value unless i initialize it to something like null so i'll represent that here figuratively by just pointing at the ground that now represents null this would be the corresponding line of code that just creates for you an empty linked list that was the beginning of the story now recall that the next thing i did was i went off stage and i called malloc to bring back one of these big boxes that code might instead look like this node star n though i could call the variable anything i want malloc size of node so size of we might have seen briefly in that it's just an operator in c that tells you how many bytes large any data type is so i could do the math and figure out in my mac or pc or cs50 ide just how many bytes these nodes are supposed to take up sizeof just answers that question for me so malloc again takes one argument the number of bytes you want to allocate dynamically and it returns to the address of the first of those bytes so if you think of this as like one of my earlier slides in yellow it returns to like the address of the top left byte of this chunk of memory if you will and i'm going to go ahead and assign that to a variable i'll call n to represent a node and it's node star because again malloc returns always an address and the syntax we saw last week for storing addresses is to use this new star syntax here so this gave me a block that initially just had garbage values so there was no number in place and who knows where the arrow was pointing at so it looked a little something like this if i draw it now on the screen list is initialized to null it's not pointing at anything right now but n the variable i just declared is pointing at a node but inside of that node or who knows what it's garbage values number and next or just garbage values because that's what's there by default remnants of the past but now let me propose that we do this code so long as n is not null which is something you want to get into the habit always now of checking anytime in c when you call a function that returns to you a pointer you should almost always check is it null or is it not null because you do not want to try touching it if it is indeed null because that means there is no valid address here that's the human convention when using pointers but if n does not equal null that's a good thing that means it's a valid address somewhere in the computer's memory let me go ahead now and go to that address the syntax for which is star n just like last week and then the dot operator means go inside of the structure that's there and go into the variable inside of it called number in this case so when i go ahead and do something like this when i have a list that's initially of size one where this variable is pointing at the one and only node at the moment it's going to have the number one in it as soon as i execute this line of code star n means start at the address embodied here go to it and then put the number one in this case in place but i need to do one other thing as well i want to go ahead two and replace the garbage value that represents the next pointer in that structure and replace it with null null just indefinite this is the end of the list there's nothing there i don't want it to be garbage value because a garbage value is an arbitrary number that could be pointing this way this way this way metaphorically i want to actually change that to be null and so i can use the same syntax but there's this clever approach now i don't have to use star and then dot all over the place as mentioned earlier star and dot comes with some syntactic sugar you can replace the star and the parentheses and the dot that we just saw with an arrow notation which means follow the arrow and then set this thing equal to null which i'll represent again by just having the arrow literally point at the floor for clarity so again rewinding from where we started star n dot number with the first of those in parentheses is the same thing as this and the reason that most people prefer using the arrow notation so to speak is that it really does capture this this physicality you start at the address you go there and then you look at the field number or next respectively but it's equivalent to the syntax we saw a moment ago with the star and the dot as before so after these two steps have we initialized this node to containing the number one and null inside of it but what comes next what comes next well at this point in the story in this code i have some other variable here that's not pictured because we're now transitioning from the world of woodwork to actual code so there's another variable n which i might as well be representing myself if i am this temporary variable and i too am pointing at this value it's not until i actually execute this line of code list equals n that i remember where this node is in the computer's memory so n up until now has really just been a temporary variable it's a variable that i can use to actually keep track of this thing in memory but if i want to add this node ultimately to my linked list list started out as null recall that this pointer was pointing at the ground representing null but when i now want to remember that this linked list has a node in it i need to actually execute a line of code like this list equals null all right what did we next do let's take one more step further so at this point in the story if i was representing n i'm also pointing at the same block n is temporary so it can eventually go away but at this point in the story we have a linked list of size one let's go ahead and take this further suppose now i execute these lines of code and we'll do it a little faster in all at once so that you can see it more in context the first line of code is the same as before hey malloc give me a chunk of memory that's big enough for the size of a node and again let's use this temporary variable n to point to that and suppose that means that i if i'm representing this temporary variable i'm pointing at this new chunk of memory here i then check if n does not equal null then and only then do i go ahead and install the number two as i did physically earlier and do i initialize the pointer originally to pointing not at some other node which doesn't yet exist but pointing at let's just call it the floor thereby representing null and that's it that has now allocated the second node but notice literally this disconnect just because i've allocated a new node and put the number i care about and initialized its next pointer to null that doesn't mean it's part of the data structure the linked list is still missing a pointer from old to new so we need to execute one other line of code now so that we can get from this picture ultimately to the final one and here's where we can use the same kind of syntax as before if i start at my list variable i follow the arrow as per that code and then i update the next field to point to n which is my newly allocated node now after that final line of code do i have a link list of size 2 because i've not only allocated the node initialized its two variables number and next respectively i've also chained it together with the existing node on the linked list and let me do this one even slightly more quickly only because it's the same thing just so we can see it all together at once now that we have this picture let's execute the same kind of code again the only difference in this chunk of code is that i'm initializing number to three so what has this done for me that chunk of code has malloc a third and final node i've initialized the top to the number three i've initialized the bottom to null as i'll represent by just pointing the arrow at the ground there's one final step then if i want to go ahead and insert that third node into this linked list i've got to go ahead now and not just point at it myself me representing again the temporary variable n i need to now do this and this is syntax you won't do often we'll see it in an actual program in just a moment but it just speaks to the basic building blocks we're manipulating if i want to go ahead and link the 2 to the 3 i can start here at list i can follow the arrow once i can follow the arrow again and then i can set that next field equal to n because again n is the current address of that most recently allocated node so even though the syntax is getting a little new the two arrows i'm using are literally just like a code manifestation of just follow the pointer follow the pointer boom assign one value to the next all right so at this point in the story the picture now looks like this so long as i and i'm still in the picture but if i just get rid of myself because i'm a temporary variable voila we've just built up step by step a brand new linked list of size 3. seems like a lot of work but it allows us to now grow this thing dynamically but let me pause here any questions or confusion on linked lists any questions or confusions on the linked list one question just came in in the chat if we're trying to make a list that's going to be much longer like more than three elements wouldn't it get tedious to have like arrow next arrow next over and over again yeah really good observation and so that's why i said you won't usually write it like this we'll do it temporarily in a full-fledged program in just a moment just to demonstrate it but in general you'll probably use something like a loop and what you'll do is use a temporary variable that points at this one then iterates again then iterates again and let me stipulate for now that if you use a loop in the right way you can end up writing just a single arrow by just keep updating the variable again and again so there is a way to avoid that and you can do it much more dynamically let me go ahead and ask a question of my own here let me go ahead and ask this one if you'd like to buzz in to this question what is the running time of searching a linked list what's the running time of searching a linked list in big o notation so what's an upper bound worst case of searching a linked list like this one whether it has three elements or even many more so it looks like as before about eighty percent of the group is proposing that it's big o of n and this is actually correct because consider the worst case suppose you're looking for the number three you're going to have to look at all three numbers big o of n if you're looking for the number 10 you're going to start here at the beginning you're going to keep looking looking looking looking you're going to get to the end realize oh the number 10 is not even here at which point you've already looked at n elements but here's one of the trade-offs again of linked lists with arrays you could jump to the end of the list in constant time you could just use a little bit of arithmetic you could jump to the middle element or the first element all using constant time a linked list unfortunately is represented ultimately by just a single address the address that points to the very first node and so even though you all on camera can see this node in this node and this one as humans all at once the computer can only follow these breadcrumbs and so searching a linked list is going to be big o of n in that case but let me ask a follow-up question what's the running time of inserting into a linked list what's the running time of inserting into a linked list so you've got some new number like the number 4 or 0 or 100 or negative 5 whatever it may be there's going to be a malloc involved but that's constant time it's just one function call but you're going to have to insert it somewhere and here it looks like 68 percent of you are proposing big o of one which is interesting constant time uh 25 percent of you are proposing big o of n would anyone be comfortable chiming in verbally or on the chat as to why you feel it's one or the other it is indeed one of those answers it could be o n and because because of the fact that uh even though you're using malloc to create a new note essentially i think ah all the computers doing when you assign it is gonna it's used like those arrows like going from one hour to the next to the next to the next uh-huh and that way i would think it would be over and it is own event as you've described it but you two are making an assumption like 80 or like 25 of other people are making you seem to be assuming that if the new number suppose it's number four has to go at the end or if it's the number five it has to go at the end and i kind of deliberately set things up that way i've happened to maintain them in sorted order one two three from left to right but up until now i have not made the condition that the linked list has to be sorted even though the examples we've seen thus far deliberately that way but you know what if you want to get fancy and a little more efficient and you want to allocate the number four and frankly you don't really care about keeping the linked list in sorted order well heck just pull this out put your new node here plug it in here plug the other one back in here and just insert the new element at the beginning of the list and for every number thereafter malloc as before but just keep inserting it here inserting it here inserting it here now it's not going to take a single step because as i verbalized it there's like the malloc step i have to unplug this i have to plug it back in so it's like three or four steps total but four steps is also constant that's big o of one because it's a fixed number of steps so if you're able to sacrifice sorted order when it comes to this list you can in constant time insert insert insert insert and the list is going to get longer and longer but from the beginning of it rather than the end so that too is a trade-off if you don't care about sorted order and none of your algorithms or code require that it be sorted then you can go ahead and cut that corner and achieve constant time insert whichever twitter or google or the like maybe that's actually a net savings and a good thing but again you sacrifice the sorted order in that case well let's go ahead i think and let's translate this to some actual code let me go ahead here in cs50 ide and let's go ahead and write a couple of variants of a program that now actually do something with numbers and start to manipulate things in memory so i'm going to go ahead here and create a program called list dot c and my first version is going to be very simplistic i'm going to go ahead and include standard i o give myself an int made and void and then inside of here let me go ahead like we began and give myself a list of integers of size three so this is going to be an array that's of size three and this array of size three i'm going to go ahead and hard code some new values into it so at the very first location i'll put the number one at the second location i will put the number two at the third location i will put the number three and then just to demonstrate that this is working as i think i intend i'm gonna do a quick for loop so for int i get zero i less than three i plus plus and then inside this loop i'm going to go ahead and print out percent i and then i'm going to print out the value of i so now that i've printed out all of these values in my loop let me go ahead and do make list let me go ahead then and do dot slash list and hit enter and indeed i get oops not not what i wanted so good uh teachable moments not intended admittedly but what have i done wrong here my goal is to print out the list but somehow i printed out 0 1 2 and those are indeed not the numbers in this list uh great so you've printed i you should have printed list of i yes so i should have printed the contents of the array which is list bracket i so that was just a newbie mistake by me here so let me fix that let me go ahead and recompile make list dot slash list and voila i've now printed the list so this is sort of week two stuff when we first introduced a raise in week two but now let me go ahead and transition now to something more dynamic where i don't have to commit in advance to creating uh an array i can do this with a dynamically allocated chunk of memory so let me delete everything i've done inside of main and let me go ahead and give myself this let me go ahead and declare a list of values where list is now going to be an address as per the star operator and i'm going to go ahead and malloc let's see i want space for three integers so the simplest way to do this if i'm just going to keep it simple i can actually do this three times sizeofint so this version of my program isn't going to use an array per se it's going to use malloc but it's going to dynamically allocate that array for me and we'll see what the syntax for this is as always now anytime you use malloc i should check whether list uh equals equals null and if so you know what i'm just going to return one recall that you can return 0 or 1 or some other value for main to effectively quit your program i'm gonna go ahead and just return one if list is null just assuming that something very badly went wrong like i'm out of memory altogether but now that i have this chunk of memory that's of size three times the size of an int this is actually the malloc way to give yourself an array up until now every time we've created arrays for ourselves we've used square bracket notation and you all have put a number inside the square brackets to give yourself an array of that size but frankly if we have malloc and the ability to just ask the computer for memory well if i want to store three integers why don't i ask malloc for three times the size of an integer and the way malloc works is it's actually going to return to me a contiguous chunk of memory of that size so that many bytes back to back to back and that's just like well that's a technique that we'll use in just a moment when allocating actual nodes so at this point in the story so long as list does not equal null i now have a chunk of memory that's big enough to fit the size of three ins and as before i can go ahead and initialize those the first element will be one the second element will be two the third element will be three and notice this sort of equivalence now between using arrays and using pointers c is kind of versatile in this way in that if you have a chunk of memory returned to you by malloc you can per last week use square bracket notation you can use square bracket notation and treat that chunk of memory as an array because after all what's an array it's a contiguous block of memory and that is exactly what malloc returns if you want to be fancy instead you could actually say go to that address and put the number one there you could say go to that address plus one and put the next number there you could say go to that address plus two and put the third number there but honestly this just be very quickly becomes unreadable at least to most people this is that thing called pointer arithmetic you're doing arithmetic with pointers that is equivalent to using the syntax that we've used for a while now which is to just use the square brackets and the nice thing about square brackets is that the computer will figure out for you how far apart each of those integers are because it knows the size of an int but now at this point in the story things get interesting and also annoying and santiago recall was the one that helped us solve this earlier suppose i didn't plan ahead i only allocated three integers on line five there but now at line 13 i'm like oh damn it now i want to add a fourth integer to the list i could obviously just redo all the code but suppose that part of the story here is to go ahead and dynamically allocate more memory well how can i do this well let me go ahead and allocate another chunk of memory temporarily so i'll call it temp by convention and this time i'm going to go ahead and allocate four times size of ints because again for the sake of the story i've messed up and i want to allocate enough space now for that original i haven't so much messed up i have now decided that i want to add a fourth number to this array as always i should check if temp equals equals null you know what i'm going to go ahead and free the memory i already allocated and then i'm just going to get out of here return one something went wrong there's nothing to demonstrate so i'm going to exit out of maine entirely but if malloc did not return null and all is well what am i going to do well let's first do what santiago proposed when we first began this this conversation for int i gets 0 i less than 3 i plus plus let's go ahead and copy into this new temporary chunk of memory whatever is at the original chunk of memory so when santiago proposed that we copy one two three from the old array into the new array here's how we might do that in code just using a simple for loop a la week two and then let me go ahead now and add one more value temp bracket three which is the fourth location if you're starting from zero i'm going to go ahead and put the number four there and now at this point i'm going to go ahead and remember the fact that temp is my new list so i'm going to go ahead and free the original list and i'm going to update my old list to point at the new list and then lastly i'm going to go ahead and use another for loop just to demonstrate that i think i did this correctly this time iterating up to four instead of three i'm going to go ahead and print out with percent i the contents of list bracket i so let's rewind real quick we began the story by allocating a an array of three integers but we did it this time dynamically to demonstrate that malloc just returns a chunk of memory and if you want to treat that chunk of memory as an array you absolutely can this stuff here if list equals equals null is just error checking just to make sure that nothing went wrong the interesting code resumes here i'm putting the numbers 1 2 and 3 at location 0 1 and 2 respectively in that chunk of memory which again i'm treating like an array but now at this point in the story i've stipulated that wait a minute i want to go ahead and add a fourth value how can i do that and let's stipulate that i want to go back and change the existing program because suppose that for the sake of discussion this is code that's running at google or twitter over time and it's only after receiving another tweet that their code realizes oh we need more space so how do i do this here on line 15 this time i allocate enough space for four integers and i again do some error checking if temp equals null then something bad won't happen let's just exit altogether but if nothing bad happened let's take santiago's suggestion and translate his english advice into c let's use a for loop from zero to three and copy into this new temporary chunk of memory the contents of the original chunk of memory so temp i gets list i and then here which was the point of this exercise let me add my fourth number at temp bracket three which is the fourth location if you start counting from zero but at this point in the story much like my earlier slides i have both the one two three in an array of size three and i have one two three duplicated in the array of size four let me go ahead and free the original list and give back to the computer that original chunk of memory let me then remember using my better named variable what the address of this new chunk of memory is and then just to show off let me go ahead and with another for loop this time counting four times not three let me print out all of those values now here's where i'll cross my fingers compile my new program it does not compile okay because it looks like i have one too many parentheses there so let's recompile the program with make list another error so let me scroll up there and oh interesting so this is a this is a a a common mistake implicitly declaring library function malloc something something so anytime you get an implicitly declaring error odds are it means you just did something um simple like this you forgot the requisite header file in which that function is defined and indeed recall from last week malloc is in standard lib as is free so now let's do make list cross my fingers again that time it worked dot slash list voila one two three four so this is now a completely literal translation of all of that code into a working program that again starts off by using uh an array of size three having dynamically allocated it and then it resizes it by creating a new one of size four copying old into new freeing the old and then proceeding as before and i've deliberately used malloc both times here as follows if you create an array in c using square bracket notation you have painted yourself into a corner you can't use any lines of code that we have seen and resize an array that you have declared using square brackets more technically speaking when you use the square brackets you are statically allocating the array on the stack you're putting it into the frame of the computer's memory that belongs to that computer to that uh functions stack frame per the diagram last week if however you use malloc our new tool our new tool from last week and say give me a chunk of memory that comes from the heap and that you can resize that you can give back and take more of and back and forth and in fact there's even a more simple way of doing this relatively speaking if you want to reallocate an array of a chunk of memory by resizing it you don't have to do all of this which i did before you don't have to use malloc twice you can use malloc once at the beginning and then you can use a new function that's actually kind of helpful in this case called reallock and you can actually do this re-alloc in a chunk of memory of size four times size of end but specifically reallocate the thing called list so re-alloc is very similar to malloc but it takes two arguments one is the size of the memory you want whether bigger or smaller but it takes a second argument its very first argument now is the address of a chunk of memory that you have already allocated as with malloc so again at the top of the same program recall that i used malloc to give myself a list that points at a chunk of memory big enough for three integers on line 16 i'm now handing that address back to reallock saying wait a minute here's that same address you gave me please now resize it reallocate it to be of size 4. and what the function does is if all goes well it returns to you the address in memory that it is now of sufficient size otherwise it returns null if anything bad one happens so i'll leave that code alone but what i don't have to do anymore is this reallock actually copies the old into the new for you so again coming back to santiago's story at the beginning of today reallock will not only give you a bigger chunk of memory if you ask for it by handing back the address of the memory you already requested and it's going to hand you back the address of a new chunk of memory that is big enough to fit all of those new values and it's smart too if there happens to be room at the very end of the existing chunk of memory there's no hello world like we saw on my slide earlier then you're actually going to get back the exact same address but the computer's operating system windows mac os or linux is going to remember okay yes i know i gave you three bytes originally there happened to be room at the end of that chunk of memory so now i'm going to remember instead that that same address has room for four integers or whatever number you pass in so again you don't have to bother copying yourself you can let the computer actually do the reallocation for you any questions then on malloc on reallock on free or fundamentally unlinked lists notice that this isn't yet a list this is still an array so we still need to take this program one step further and actually transition from this chunk of memory using arrays to these actual nodes but before we do that any questions or confusion yeah a question came in why do you not need to free temp at the end of the program why do i not need to free temp at the end of the program because i'm an idiot and glossed over that key important detail you absolutely should free not temp in this case but list so at this line here 27 i use my list variable which just has a better name and i make it equal to temp so that i can just refer to it as a bigger list but you are quite right that was an oversight on my part valgrind would not have liked that at the very end of this program i should absolutely free list however i don't need to free temp per se because i've simply reused the variable name through that assignment good question and good catch unintended other questions or comments brian uh question came in why does the linked list improve this situation if we can just use arrays and reallock and malloc to do all this stuff yeah really good question so how have we improved the situation if we can just use arrays in this way recall that with this kind of a regression what i just did is a regression to where we started the story whereby in any of the versions of code i just wrote i reallocated more space for this array which means that i manually with that for loop or realock with its own for loop how to copy all of the old values into the new so the approach we've taken in all three versions of this program that i've written thus far on the fly they've all been big o of n when it comes to insert they have not given us the dynamism of a linked list to just add without that duplication and we haven't had the ability yet to just do an insert for instance at the beginning of the structure in big o of one time so again this is the code translation really of that slower approach from which we began so the ultimate goal now is going to be to change this code and give us that dynamism and actually implement things as a proper linked list not just as an array of integers but we're about an hour in let's go ahead and take our first five-minute break here and when we come back we'll translate the nodes themselves to a full program all right we are back and recall that we began today by revisiting arrays and pointing out that searching is great in arrays if you keep them sorted you get the big o of log n that we liked back from week zero but as soon as you want to start dynamically modifying an array it gets very expensive quickly it might take you a big o of n steps to copy the contents of an old small array into a new bigger array and honestly over time especially for real-world software with lots of data even big o of n is expensive like you don't want to be constantly copying and copying and copying all of your data around the computer's memory so we can avoid that by using pointers and in turn stitching together the structure called the linked list all be at a price of spending more memory but with that additional memory that additional cost comes dynamism so that if we want we can even achieve constant time when it comes to inserting but of course then we have to sacrifice things like sortability so this theme of trade-offs we've just seen a few examples of actual c programs that implement first the old school array per week zero where we just hard code the array's length and unfortunately we painted ourselves into a corner using the bracket notation alone so we deployed instead malloc which is more versatile tool that lets us get as much memory as we want and we use that to recreate the idea of a list implemented as arrays but even then we saw that i had to copy using a for loop or we had a copy using indirectly relock old and to new and again for these small programs you don't even notice the difference the programs run like that but for large real world software all of that big-o of end time is going to add up quickly so it's best if we can try to avoid it all together and achieve dynamism so the code by which you can add to linked lists dynamically is actually part of the challenge for problem set 5 this coming week but let's see some of the building blocks via which we can syntactically start to allocate nodes and stitch them together when we know in advance how many we want which is not going to be the case for problem set 5 but for now is indeed the case because i only want three of these things so i'm going to go back to my program from before and i'm going to rewind and erase everything inside of main and i'm going to go ahead and declare myself a type called struct node initially with a number inside of it and a struct node star called next inside of that and i'm going to call this whole thing quite simply node so that's quite similar to what we did with a person but now it's a little fancier in that i'm giving the structure itself a temporary name struck node i'm referring to that temporary name inside of the structure so that i can have a pointer there too and then i'm renaming what was person to now node now let's go ahead and actually use this thing inside of main so let me go ahead and create an empty linked list the simplest way to translate the simple block with which we began today is just doing new node star list semicolon unfortunately anytime you declare in a variable that does not have an assigned value it's garbage and garbage is bad in the world of pointers again to be clear if you create this variable called list and you do not explicitly initialize its value to be something like null pointing at the ground but instead leave it as a garbage value it's the sort of metaphorical equivalent of this arrow pointing this way this way this other way that is to say you might accidentally in your own code follow this arrow to a completely bogus place and that's the point at which you have what are called segmentation faults as some of you might have experienced already with problem set four when you touch memory that you shouldn't so garbage values are bad ever more so in the context of pointers for that reason so you rarely want to do this you almost always want to initialize the pointer to some known value in the absence of an actual address we're going to use null to indicate that there's nothing there but that's deliberate on our part now suppose i want to insert just as i did physically by lugging the block number one onto stage before let me go ahead and allocate a node we'll call it n temporarily using malloc this time asking for the size of a node so the story is now changing i'm not allocating individual ins i'm allocating individual nodes inside of which is enough room for an integer and another pointer to a node and this size of operator figures out from the definition of this structure up here above main how much space is needed to store an integer and a pointer to a struct node so as always now i'm always going to check if n equals equals null i'm going to get out of this program immediately and just return one because something went wrong and there's just not enough memory but if all went well i'm going to go ahead now and go into that node n i'm going to go into its number field and assign it the value 1 and i'm going to go into that node n and go into its next field and for now assign it the value null so this is as though i've just allocated the wooden block with a 1 in it and i have initialized its next pointer to null now i'm going to go ahead and update the list itself to point at that value so again my variable called list is the variable via which i'm representing the whole list and now that i have an actual node to point to i'm setting list which again is a pointer to a node equal to whatever n is the address of an actual node so at this point in the story i have the small wooden block connected to the larger block containing one let's suppose for the sake of discussion i now want to add the number two to this list this time using a node as well not just an integer i'm going to go ahead and allocate n using malloc giving myself the size of another node i'm going to again just going to check if that thing equals null let me go ahead and free the list so i don't leak memory then let me go ahead and return one so that's just a quick sanity check to make sure i free any memory i've already allocated before but if all goes well and that's what i'm hoping for i'm gonna go ahead and go into this node n and store in its number field literally the number two and then now because this thing i'll insert in sorted order for now i'm going to go ahead and insert this next as null and if i indeed want to put this number two node after the number one node i can start at the top of the list i can go to the next node and inside of its value i can say n so this line of code here starts at the little block follows the arrow and then updates the next pointer of that first node the one node to instead store the address of this new node n and then lastly let's do one more of these so n gets malloc sizeof node one last time let me go ahead and do my sanity check one more time if n equals equals null something bad happened so now i'm going to go ahead and don't worry about the syntax just yet but i'm going to go ahead and free list next and i'm going to go ahead and free list and then i'm going to go ahead and return one but more on that another time that's just in the corner case where something bad happened but if nothing bad happened i'm going to update the number field to be three i'm going to update the next field to be null and now i'm going to update the list next block next block to equal this new one n and then here after this i can proceed to print all of these things if i want and i'll in fact i'll go ahead and do this with a loop the loop is going to look a little different from before in this case but it turns out we can use for loops pretty powerfully here too but at this point in the story my list pointer is pointing at the one node which is pointing at the two node which is pointing at the three node and again as someone observed earlier it's not common to use this double arrow notation in this case i bet i could actually use a loop to iterate over these things one at a time and we can see this here when it's time to print let me go ahead and do this 4 and instead of using i because there really aren't any numbers in question this is no longer an array so i can't use square bracket notation or pointer arithmetic i need to use pointers so this might feel a little weird at first but there's nothing stopping me with a for loop from doing this give me a temporary pointer to a node called temp and initialize it to be whatever is at the beginning of the list keep doing the following so long as temp does not equal null and on each iteration of this loop don't do something like i plus plus which again is not relevant now but go ahead and update my temporary pointer to be whatever the value of the temporary pointer's next field is so this looks crazy cryptic most likely especially if you're new to pointers as of last week as most of you are but it's the same idea as a typical for loop you initialize some variable before the semicolon you check some condition after the first semicolon and you perform an update of that variable after the second semicolon in this case they're not integers though instead i'm saying give myself a temporary pointer to the beginning of the list like my finger pointing at or if you prefer the foam finger pointing at some node in the list go ahead and call that temporary variable temp and now do the following so long as temp is not null that is so long as it's pointing at an actual legitimate wooden block what do i want to do let me go ahead and print out using printf and percent i as always whatever value is in the number field of that node there and that's it with this simple for loop relatively simple for loop i can essentially point at the very first node in my list and keep updating it to the next field updating it to the next field updating it to the next field and i keep doing this until my finger sort of walks off the end of the list of wooden blocks thereby pointing at null o ox0 at which point the loop stops and there's nothing more to print so in answer to that question earlier do we need to use this double arrow notation short answer no this is kind of the secret ingredient here this syntax inside of the for loop takes whatever you're pointing at follows one arrow and then updates the temporary variable now to point at that structure instead so this is kind of the equivalent in the world of pointers and linked lists of doing i plus plus but it's not as simple as i plus plus you can't just look one byte to the right or to the left instead you have to follow an arrow follow an arrow but by reassigning this temporary variable to wherever you just followed it's a way of following each of these orange arrows as we did physically a moment ago after this i should for good measure go ahead and free the whole list and let me just offer up a common way of freeing a linked list i can actually do something like this while list not equals null so while the whole list itself does not equal null go ahead and get a temporary pointer like this to the next field so i remember what comes after the current head of the list free the list node itself and then update list to be temp so again this probably looks crazy cryptic and certainly in the coming days especially with problem set five you'll work through this kind of logic a little more uh logically a little more pictorially perhaps but what am i doing here first i'm going to do the following so long as my linked list is not null and if i've got three nodes in it by definition it's not null from the beginning but my goal now is to free all of the memory i've allocated from left to right so to speak so how do i do that well if i've got two wooden block if i've got a wooden block in front of me it's not safe to free that wooden block yet because that wooden block recall contains the pointer to the next node so if i free this memory prematurely i've then stranded all subsequent nodes because they are no longer accessible once i've told the computer you can take back this chunk of memory for the first node so this line of code here on line 52 is just saying temporarily give me a variable called temp and point it not at the list itself the first node point at the next node so it's like using my right hand to point at the current node my left hand to point at the next node so that i can then on line 53 free the list itself which is should not be taken literally list represents the first node in the linked list not the whole thing so when you say free list that's like freeing just the current node but that's okay even now this memory has been given back i still have my left hand pointing at every subsequent node by way of the next one so now i can update list to equal that temporary variable and just continue this loop so it's a way of sort of pac-man style like gobbling up the entire linked list from left to right by freeing the first node the second node the third node and then you're done but by using a temporary variable to look one step ahead to make sure you don't chomp the free the memory too soon and therefore lose access to all of those subsequent nodes all right that was a big program but it was meant to be in succession starting with an array transitioning into a dynamically allocated array followed by finally an implementation using linked list albeit hard coded to support only three nodes but in that example do you see some sample syntax by which you can manipulate these kinds of nodes questions or confusion that i can help address yeah someone asked similar to one of the examples you did before why could we not have just done malloc three times sizeof node to get three nodes and do it that way really good question could i not just use malloc and allocate all three at once absolutely yes that is completely your prerogative i did it a little more pedantically one at a time but you could absolutely do it all three at once you would then need to use some pointer arithmetic though or you would need to use square bracket notation to treat that bigger chunk of memory as essentially an array of nodes and then stitch them together so i am assuming for demonstration purposes that even though we have these little simple examples that demonstrate the syntax in a real world system you're not going to be inserting one then two then three odds are you're going to be inserting one sometime passes then you want to insert two so you allocate more memory then some more time passes then you want to insert three and so there's gaps in between these these chunks of code in the real world other questions or confusion yeah another question came in why would malloc ever fail to allocate memory why would malik ever fail it's rarely going to fail but if the computer is out of memory so essentially if you're writing such a memory hungry program with so many variables big arrays big structures lots of data you may very well run out of memory maybe that's two gigabytes maybe it's four gigabytes or more but malloc may very well return null to you and so you should always check for it in fact i dare say on macs and pcs one of the most common reasons to this day for programs to freeze to crash for your whole computer to reboot is truly because someone did something stupid like i've done multiple times now already today and last week by touching memory that you shouldn't have so in problem set four and now five anytime you experience one of those segmentation faults whereby your program just crashes that is the uh the problem set version of like your whole mac or pc crashing because someone more experienced than you made that same mistake in their code just to reinforce this let's take a quick final example involving linked lists which again are this very one-dimensional structure left to right and then we'll add a second dimension and see what that buys us but we've changed the numbers around here now we still have our list but it's first pointing at the number two here and then the number two is pointing to some other chunk of memory that's been mallocked way over here and this then is the number four and this then is the number five so we have a linked list of size three but i've deliberately spread the numbers out this time two four five because suppose that we do want to insert more numbers into this list but in sorted order it turns out that we have to think a little bit differently when we're adding nodes not to the end and not to the beginning but in the middle like when we want to allocate more nodes in the middle there's a bit more work that actually has to happen so how might we go about doing this suppose that we want to allocate for instance the number one and we i want to add the number one well we could use code like this this is the same code as we used before we allocate the size of a node we check whether it equals null we initialize it with the value we care about and by default we set next equal to null and pictorially it might look like this it's kind of floating somewhere in the computer's memory i have this temporary variable n no longer pictured that i'm just pointing at when i allocate the number one so what does this look like this is like having the number one maybe somewhere over here and i'll just put it in place we got lucky and there's a chunk of memory right there so what do i want to now do well i want to go ahead and connect this so what i could do just intuitively if i one should go before two i can unplug this and i can plug this into here which makes sense but there's already a problem if i have done nothing else up until this point i have just orphaned three nodes two four and five to orphan a node means to forget where it is and if i don't have another variable in my code or if i'm not acting out with one of my hands pointing at the original beginning of the list i have literally orphans the rest of the list and the technical implication of that per last week is that now i have a massive memory leak you have just leaked the size of three nodes in memory that you can literally never get back until you reboot the computer for instance or the program quits and the operating system cleans things up for you so you don't want to do this like order of operations actually matters so what i should probably do is this when i insert the number one first i should probably recognize that well the one begins at the beginning of the list so what i should really do is point this arrow also at the same node and we'll sort of do it a little sloppily like that but let me stipulate those are both pointing at the same node now that my new node aka n in the code i showed is pointing at this thing now i can do kind of a switcheroo because i'm already pointing at the final destination there and now i can remove this safely because this is my list this is n therefore i have variables pointing to both and i can go ahead and insert that correctly so long story short order of operations matters so graphically if i were to do this as before just by saying list equals n if this is n and this is list and i adjust this arrow first bad things are going to happen indeed we end up orphaning 2 4 and 5 thereby leaking a significant amount of memory potentially and leaking any memory typically is bad so i don't want to do that so let's look at the correct code the correct code is going to be to start at n and update its next field this arrow here to point at the same thing as the list was originally pointing at and then go ahead and update the list such that both of them are currently pointing in duplicate then update the list to point to the new node so again the code's a little different this time from before because before we kept adding it to the end or i proposed verbally that we just added to the beginning here we're adding it indeed at the beginning and so the actual steps the actual code are a little bit different well let's do one final example if we want to allocate three well i gotta malloc another node the number three suppose that ends up somewhere in the computer's memory let's go ahead and plop this one over here so now three is in place how do i now insert this thing well similar to before i'm not going to want to update this pointer and go like this and then plug this guy in over here because now i've orphaned those two nodes so that again is the wrong step when you're in beside the middle of a linked list any code that you write to insert into the middle if you care about inserting in sorted order this should be updated first and odds are i should kind of cheat and point this at the same thing even though there's only one physical plug at the moment so we'll just pretend that this is working there we go and now i can go ahead and safely say that n and the previous note are already pointing where they should be so now it's safe for me to unplug this one and go ahead and update this final arrow to point at the new node in the correct location so let's see that in code again if i go here i've got graphically the number the node three kind of floating in space i first update its next field to point also at the four in duplicate then i update the two to point to the three the goal being again to avoid any leaking of memory or orphaning of nodes all right we are about to leave linked lists behind because as multiple of you have noted or probably thought they're good but maybe not great they're good in that they are dynamic and i can add to them as by inserting at the beginning if i really want and don't care about sorted order but there's still a good amount of work to do if i want to keep them in sorted order and i insert them in the middle or the end because that's like big o of n if i keep traversing all of these darn arrows so we get the dynamism but we don't necessarily get the performance increase but we fundamentally have opened up a whole new world to ourselves we can now stitch together these data structures in memory using pointers as our thread if you will we can just use memory as a canvas painting on it any values we want and we can sort of remember where all of those values are but this is indeed very single uh one dimensional left to right what if we give ourselves a second dimension what if we start thinking sort of not left right but also left right up down so again this is meaningless to the computer the computer just thinks of memory as being byte 0 1 2 3 but we humans can kind of think of these data structures a little more abstractly a little more simply and we can think about them in a way familiar perhaps to us in the real world trees not the ones so much that grow from the ground but if you're familiar with family trees where you might have a matriarch or patriarch and then sort of descendants hanging off of them graphically on a piece of paper something you might have made in grade school for instance we can leverage this idea of a tree structure that has a root that kind of branches and branches and branches and grows top to bottom so again more like a family tree than an actual tree in the soil so with trees it turns out this idea of a tree we can take some of the lessons learned from linked lists but we can gain back some of the features of arrays and we can do that as follows consider the following definition of what we're gonna we're about to call a binary search tree binary search being a very good thing from the first week zero and a tree now being the new idea here's an array from week zero week one week two whenever and it's of size seven and recall that if it's sorted we can apply binary search to this array and that's great because if we wanna search for a value we can start looking in the middle then we can go either left or right halfway between each and then we can similarly go left or right so binary search on a sorted array was so powerful because it was big o of log n we've concluded by just having and having and having the problem again and again define it tearing the phone book in half again and again and again but the problem with binary search is that it requires that you use an array so that you have random access you have to be able to index into the array in constant time using simple arithmetic like bracket zero bracket n minus one bracket n minus one divided by two to get the halfway point you have to be able to do arithmetic on the data structure and we've just proposed getting rid of that random access by you more uh transitioning to a dynamic data structure like a length list instead of an array but what if we do this what if you and i start thinking not on one dimension but on two dimensions and what if we alter our thinking to be like this so think of an array perhaps as being a two-dimensional structure that has not only width or length but also height and so we maintain its seams visually this relationship between all of these values but you know what we can stitch all of these values together using what well pointers pointers are this new building block that we can use to stitch together things in memory if the things in memory are numbers that's fine they're integers but if we throw a little more memory at them if we use a node and we kind of wrap the integer in a node such that that node contains not only numbers but pointers we could probably draw a picture like this not unlike a family tree where there's a root node at the very top in this case and then children so to speak left child and right child and that definition repeats again and again and it turns out the computer scientists do use this data structure in order to have the dynamism of a linked list where you can add more more nodes to the tree by just adding more and more squares even lower than the one the three the five and the seven and just use more pointers to kind of stitch them together to sort of grow the tree vertically down down down if you will but a good computer scientist would recognize that you shouldn't just put these numbers in random locations otherwise you're really just wasting your time you should use some algorithm and notice does anyone notice the pattern to this tree can anyone verbalize or textualize in the chat what pattern is manifest by these seven nodes in this tree they're not randomly ordered they're very deliberately ordered left to right top to bottom in a certain way can anyone put their finger on what the definition of this thing is what is the most important characteristic besides it just being drawn like a family tree uh great uh you have put in the middle of them on top of them you have put the middle number for example between one and three you have put two between um five and seven you have put six so on top of them you have put the middle number exactly there's this pattern to all of the numbers between 1 and 3 is 2 between 5 and 7 is 6 between 2 and 6 is 4 and it doesn't even have to be the middle number per se i can generalize it a little bit and comment that if you pick any node in this tree so to speak its left child will be less than its value and its right child will be greater than its value and we can do that again and again so here's four its left child is two that's less than here's four it's right child to six that's greater than we can do this again let's go to two its left child is one which is less its right child is three which is more six its left child is five which is less six its right child to seven which is more and so this is actually if you don't mind the um revisiting recursion from last week this is a recursive definition this is a recursive data structure so it's not only algorithms or functions that can be recursive by calling themselves a data structure can also be recursive after all what is this thing this is a tree yes i'll stipulate but it's technically a tree with two trees right this node here number four technically has two children and each of those children is itself a tree it's a smaller tree but it's the same exact definition again and again and anytime we see a recursive data structure it's actually going to be an opportunity to use recursive code which we'll take a look at in just a moment but for now notice what we've achieved again the dynamism of using pointers so that we can if we want add more nodes to this tree as by stringing them along the bottom in the correct order and yet we've preserved an important order for binary search aka the formal name of this data structure binary search tree by making sure that left child is always less right child is always more because now we can go about searching this thing more efficiently how well if i want to search for the number three what do i do well i start at the beginning of the tree just like within a linked list you start at the end of the link the beginning of the linked list so with the tree you start with the root of the tree always suppose i want to search for three well what do i do well three is obviously less than four so just like in week zero where i tore the phone book in half you can now think of this as like chopping down half of the tree because you know three if it's present is definitely not going to be anywhere over here so we can focus our attention down here here's the number two this is another tree it's just a smaller sub tree if you will how do i find the number three well i look to the right because it's greater than and boom i found it but by contrast suppose i were searching for the number eight i would start here i would look here i would look here and then conclude no it's not there but again every time i search for that aid i'm ignoring this half of the tree this half of the sub tree and so forth so you're going to achieve it would seem the same kind of power the same kind of performance as we saw from week zero so how do we translate this idea now into code we have all the building blocks already let me go ahead and propose that instead of the node we used before for a linked list which looked like this with a number and one pointer called next but again we could have called those things anything let's go ahead and make room for not just a number but also two pointers one that i'll call left one that i'll call right both of those is still a pointer to a struct node so same terminology as before but now i have two pointers instead of one so that one can conceptually point to the left and b point to a smaller subtree one can point to the right and point to a larger subtree so how do we go about implementing something like binary search well let's actually see some code and this is where recursion really gets kind of cool we kind of forced it when building morrow's pyramid with recursion like yeah you can do it but like and yes the pyramid was i claimed a recursive physical structure or virtual structure in the game but with data structures and pointers now recursion really starts to shine so let's consider this if i declare a function in c whose purpose in life is to search a tree for a number it's going to by definition search from the root on down how do we implement this well my function i'll propose is going to return a bool true or false the number is in the tree yes or no it's going to take two arguments a pointer to a node aka tree i could call it root or anything else and it's going to take a number which is the number i care about whether it's four or six or eight or anything else so what's going to be my first chunk of code well let me do the best practice that i keep preaching anytime you're dealing with pointers check for null so that your program doesn't freeze or crash or bad thing happens because who knows maybe you will accidentally or maybe intentionally pass this function a null pointer because there's no tree maybe you'll screw up and that's okay so long as your code is self-defensive so always check pointers for null if so if the tree is null that is there's no tree there obviously the number is not present so you just return false so that's one of our base cases so to speak else if the number you're looking for is less than the tree's own number so again this arrow notation means take tree which is a node star so take this pointer go there to the actual node and look in its own number field if the number you're looking for from the argument is less than the number in the tree's own number field well that means that you want to go left and whereas in the phone book i went to the left of the phone book here we're going to go to the left sub tree but how do i search a sub tree here's where it's important that a tree is a tree is a recursive data structure a tree is two sub trees with a new root node as before so i already have code by which i can search a smaller tree so i can just say search the left subtree as expressed here which means start at the current node and go to the left child and pass in the same number the number is not changing but the tree is getting smaller i've effectively encode chopped the tree in half and i'm ignoring the right half and i'm returning whatever that answer is otherwise if the number i care about is greater than the number in the current node do the opposite search the right subtree passing in the same number so again just like with the phone book it kept getting smaller and smaller here i keep searching a smaller and smaller sub tree because i keep chopping off branches left or right as i go from top to bottom there's one final case and let me toss this out up to the toss this out to the group there's a fourth case verbally or textually what else should i be checking for and doing here i've left room for one final case a few people are suggesting if the tree itself is the number if the tree itself contains the number yeah so if the number in the tree equals equals the number i'm looking for go ahead and only in this case return true and this is where the code again gets kind of recursion rather gets a little mind bending i only have false up here true here but not in either of these middle two branches no pun intended if you will but that's okay because my code is designed in such a way that if i search the left subtree and there's nothing there like literally i'm at the leaf of the tree so to speak then it's going to return false so that's fine that's like if i search for the number eight it's not even in the tree i'm only going to realize that once i fall off the end of the tree and see oops null i'll just return false but if i ever see the number along the way i will return true and these two inner calls to search these two recursive calls to search are just kind of like passing the buck instead of answering true or false themselves they're returning whatever the answer to a smaller question is by searching the left or right tree instead respectively so again this is where recursion starts to get not really forced or even necessarily as really as forced but really as appropriate when your data is itself recursive then recursion as a coding technique really rather shines so if ultimately we have oh and minor optimization as we might have noted in week zero with scratch we of course don't need to explicitly check if the number is equal we can get rid of that last condition and just assume that if it's not null and it's not to the left and it's not to the right we must be standing right on top of it and so we just return true there well let me re-summarize the picture here this is now a two-dimensional data structure and it's sort of better than a linked list in that now it's two dimensions i gain back binary search which is amazing so long as i keep my data in sorted order per this binary search tree definition but i've surely paid a price right nothing is absolutely better than anything else in our story thus far so what is the downside of a tree what price have i secretly or not so secretly paid here while preaching the upsides of trees and again the answer is often in this context sort of space or time or developer time or money or some resource personal or physical or real world yeah how about over to uh besley so i think that inserting is no longer constant time and i guess you need more memory you need memory to sort two pointers instead of one this time yeah it seems insertion is no longer constant time because if i need to preserve sorted order i can't just put it at the top i can't just keep pushing everything else down because things might get out of order in that case it would seem or rather even if i maintain the order it might kind of get very long and stringy if i add for instance another number another number and i keep jamming it at the top i probably need to kind of keep things balanced if you will and yeah the bigger point too is that immediately i'm using twice as many pointers so now my node is sort of getting even bigger than these things i now have room for not only a number and a pointer but another pointer which is of course going to cost me more space again so a trade-off there and let's go ahead and ask the group here when it comes to insertion why don't we consider for a moment what the running time of insertion might be when inserting into a binary search tree if you'd like to pull up the url as always let me go ahead and present this one what's the running time of inserting into a binary search tree so if you want to insert the number 0 into that tree if you want to insert the number um eight or anything in between or bigger or smaller what are the answers here so not quite as big of a victory for the tallest bar about 60 percent of you think log n and the good instincts there frankly are so that is going to be the right answer and that's the kind of the right instinct anytime you have binary search odds are you're talking something logarithmic but we've also seen divide and conquer in merge sort with n log n so not unreasonable that about 10 of you think that too and squared would actually be bad so n squared is like the worst of this times we've seen thus far and that would suggest that uh a tree is even worse than a link list is even worse than an array and thankfully we're not at that point but it's indeed log n based on certain assumptions so why is that so if we consider the tree from a moment ago it looked a little something like this and what is involved in inserting into a linked into a tree well suppose i want to insert the number 8 well i start here and it obviously belongs to the right because 8 is bigger i go here belongs to the right because 8 is bigger i go here it belongs to the right because 8 is bigger and so a new node is going to be created somewhere down here and even though it doesn't fit on the screen i could absolutely call malloc i could update a couple of pointers and boom we've added an eighth node to the tree so if it took me that many steps starting at the root one two three how do i generalize this into big o notation well a binary search tree if you lay it out nice and prettily like this nice and balanced if you will the height of that binary search tree it turns out is going to be log of n if n is the number of total nodes seven or eight now in the story then log base two of n is going to be the height of the tree so if you take n nodes n numbers and you kind of balance them in this nice sorted way the total height is going to be log n so what is the running time of insert well in the running time of insert is equivalent to how many steps does it take you to find the location into which the new number belongs well that's one two three and as it turns out log base two of eight is indeed three so the math actually works out perfectly in this case sometimes there might be a little rounding error but in general it's going to indeed be big o of log n but what if we get a little sloppy what if we get a little sloppy and we start inserting nodes uh that are giving us a bit of bad luck if you will so for instance suppose that i go ahead and let me do something on the fly here suppose that i go ahead and insert the number one the number two and the number three such that this is what logically happens this adheres to the definition of a binary search tree if this is the root it's one it has no left subtree and that's not strictly a problem uh because there's nothing violating the definition of a search tree here there's just nothing there two is in the right place three is in the right place so this two technically is a binary search tree but it's a bit of a corner case a perverse case if you will where the way you inserted things ended up in the binary search tree actually resembling more of a what would you say if you want to chime in in the chat and brian if you might want to relay a few people are saying it looks like a linked list yeah so even though i've drawn it sort of top down so in the sort of second dimension that's really just an artist's rendition this tree is a binary search tree but it's kind of sort of also a linked list and so even the most well-intentioned data structures given unfortunate inputs or some bad luck or some bad design could devolve into a different data structure just by chance so there's a way to solve this though even with these values when i insert 1 2 3 i could allow for this perverse situation where it just gets long and stringy at which point everything is big o of n it's just a linked list it just happens to be drawn diagonally instead of left right but does anyone see a solution intuitively not in terms of code no formal language but there is a solution here to make sure that this tree with one two three does not get long and stringy in the first place what might you do instead to solve this a few people in the chat are suggesting you should make two the new root node at the top of the tree so if i instead make two the new root node let me go ahead and mock this up real quickly and in a moment i'll reveal what i think you've just verbalized what if instead i make sure that when inserting these nodes i don't naively just keep going to the right to the right to the right i exercise some judgment and if i notice maybe that my data structure my tree is getting kind of long and stringy maybe i should kind of like rotate it around using that second dimension for real so that i change what the root is and we won't go through the code for doing this but it turns out this is the solution like that is exactly the right intuition if you take a higher level class on data structures and algorithms specifically in computer science you'll study trees like avl trees or red black trees which are different types of tree data structures they kind of have built into them the algorithms for kind of like shifting things as needed to make sure that as you insert or maybe as you delete you constantly rebalance the tree and long story short doing so might cost you a little extra time but if you've got a lot of data keeping that thing balanced and logarithmic in height so to speak and not long and stringy and linear in height is probably depending on your application going to save you quite a bit of time overall so we might say that insert into a balanced binary search tree is indeed big o of login but that is conditional on you making sure that you keep it balanced and that's going to be more code than we'll go into today but indeed a possible design decision all right any questions then on now trees and binary search trees in particular we started with a raise a few weeks ago we've now got linked lists which are good better but not great trees which seem maybe to be great but again it's always a trade-off they're costing us more space but i bet we can continue to stitch some of these ideas together for other structures still brian anything outstanding yeah one question came in as to why it's a problem if you have like the one and the two and the three all in just one sequence on the right side yeah really good question why is it a problem maybe it isn't like if you don't have a very large data set and you don't have many values in the structure honestly who cares like if it's three elements definitely don't care if it's ten elements if it's a thousand heck if your computer is fast enough it might be a million elements and it's not a big deal but if it's two million elements or a billion elements and it totally depends again on what is what is the business you're building what is the application you're writing how big is your data how fast or how slow is your computer it might very well matter ultimately and indeed when we've seen some of our algorithms a couple of weeks ago when we compared bubble sort and selection sort and merge sort even though those were in a different category of running times n log n and n squared just recall the appreciable difference and log of n in the context of searching is way better than n so if your data structure is devolving into something long and stringy recall that's like searching a phone book a thousand total pages but binary search and not letting it get long and stringy gives you like 10 steps instead of a thousand steps in order to search those same pages so again even in week zero we saw the appreciable difference between these different categories of running times all right well let's see if we can't maybe take some of the best of both worlds thus far again we've seen arrays we've seen linked lists we've seen trees what if we kind of get a little frankenstein here and mash things together and take sort of the best features of these things and build up something grander in fact i feel like the holy grail of a data structure would be something for which cert and insertion aren't n big o of n r big o of log n but wouldn't it be amazing if there's a data structure out there where the running time is like constant time big o of one like that's the holy grail if you can cleverly lay out your computer's memory in such a way that if you want to search for insert a value boom you're done boom you're done and none of this linear or logarithmic running time so let's see if we can't pursue that goal let me propose that we introduce this topic called hash tables hash table is another data structure that's essentially an array of linked lists so again it's this frankenstein monster whereby we've combined arrays ultimately with linked lists let's see how this is done let me propose that we first start with an array of size 26 and i'm going to start drawing my arrays vertically just because it sort of works out better pictorially but again these are all artists renditions anyway even though we always draw arrays left to right that's completely arbitrary so i'm going to start for now drawing my array top to bottom and suppose that the data structure i care about now is going to be even more interesting than numbers suppose i want to store things like names like dictionaries uh or names like uh contacts in your phone if you want to keep track of all the people you know it would be great if it doesn't take linear time or logarithmic time to find people instead constant time would seem to be even better so here's an array for instance size 26 and i propose that deliberately in english there's 26 letters a through z so let's consider location 0 as a location 25 is z and if i now go and start inserting all of my friends into my new phone into the contacts application where might i put them well let me go ahead and do this let me go ahead and think of each of these elements is again 0 through 25 or really a through z and let me upon inserting a new friend or contact into my phone let me put them into a location that has some relationship with the name itself let's not just start putting them at the very beginning let's not necessarily put them alphabetically per se let's actually put them at a specific location in this array not just top to bottom but at a specific entry so suppose the first person i want to add to my contacts is albus well i'm going to propose that because albus starts with an a he is going to go into the a location so the very first entry in this array suppose i next want to add zacharias well his name starts with z so he's going to go in the very last location so again i'm jumping around i went to from 0 to 25 but it's an array and i can do that in constant time you can randomly index to any element using square brackets so this is both constant time i don't have to just put him right after albus i can put him wherever i want suppose the third person is hermione well i'm going to part her at location h why because i can do the math and i can figure out h okay i can just jump immediately to that letter of the alphabet and in turn thanks to ascii and doing a bit of arithmetic i convert that to a number as well so she ends up at 0 1 2 3 4 six seven because h uh ends up mapping to the eighth character or location seven all right what el who else all these other people end up in my address book and so they're all spread out i don't have as many as 26 friends so there's some gaps in the data there but i fit everyone here but there might be a problem and you can perhaps see this coming thus far i've kind of gotten lucky and i've only know people whose names are uniquely start with a letter but as soon as i meet someone at uh you know school and i add them to my contacts well now harry for instance has to go in the same location now this is a problem if i want to store both hermione and harry because their names both start with h but again if it's an array it's absolutely a deal breaker at that point all things break down because i could yes grow the array but if i grow the array then it's size 27 and then it's like how do i know what number is what letter at that point it just devolves into a complete mess but if i borrow the idea of a linked list what if i make my array an array of linked lists so yes even though there's this collision where both hermione and harry's belong at the same location in the array that's fine in the event this happens i'm just going to kind of stitch them together into a linked list from left to right so it's not ideal because now it takes me two steps to get to harry instead of one using simple arithmetic and square bracket notation but heck at least i can still fit him in my address book so a bit of a trade-off but feels reasonable well someone else hagrid all right it's not ideal that now it takes me three steps to get to hagrid in my address book but three is way better than not having him in there at all so again we see a manifestation of a trade-off here too but we've solved the problem and a hash table is indeed exactly this data structure it is an array of linked lists at least it can be implemented as such and it is predicated on introducing the notion of a hash function this is actually something we'll see in other contexts before long but a hash function is going to allow us to map not only hermione harry and hagrid but also ron and remus severus and sirius to their respective locations deterministically that is there's no randomness involved here every time i look at these people's names i'm going to figure out the location at which they belong and that location is never going to change so how do i do this well it turns out we can think back to problem solving itself and what functions are so this is problem solving as we've defined it this is also the function a function in any language that takes inputs and outputs a hash function is going to be sort of the secret sauce inside of this black box for now and so what is a hash function well hash function is literally a function either mathematically or in programming that takes as input some string in this case like hermione or harry and it returns some output and the output of a hash function is usually a number in this case the number i want is going to be between 0 and 25. so in order to implement this notion of a hash table not just pictorially on the screen but in actual code i'm literally going to have to write a c function that takes a string or if you will char star as input and returns an int between 0 and 25 so that i know how to convert hermione or harry or hagrid to the number 7 in this case so what does this hash function do it takes as input something like albus and it outputs zero it takes someone like zacharias and it outputs 25. and you can probably see the pattern here the code i would write in order to implement something like this is probably going to look at the user's input that char star and it's going to look at the first character which is a or z respectively for these two and it's then going to do a little bit of math and subtract off like 65 or or what not and it's going to get me a number between 0 and 25 just like with caesar or some of our past manipulations of strings so from here we can now take this building block though and perhaps solve our problems a little more effectively like i don't love the fact that even though yes i've made room for harry and hermione and hagrid and now luna and lily and lucius and lavender these some of these linked lists are getting a little long and there's another term you can use here these are kind of like chains if you will because they look like chain-link fences or little links in a chain this is these are chains or linked lists but some of them are starting to get long and it's a little stupid that i'm trying to achieve constant time big o of one but technically even though some of the names literally take one step some of them are taking two or three or four steps so it's starting to devolve so what would be an optimization here if you start to get uncomfortable because you're so popular and you've got so many names in your contacts that looking up the h's looking up the l's is taking more time than the others what could we do to improve the situation and still use a hash table still use a hash function but what would maybe the logical solution be when you have too many collisions you have too many names colliding with one another how could we improve our performance and get at locations again closer to one step not two not three not four so literally one step because that's our holy grail here a few people have suggested you should look at more than just the first letter like you could look at the second letter for example yeah nice so if looking at one letter of the person's name is obviously insufficient because all of a whole bunch of us have names that start with h or or a or z or the like well why don't we look at two letters and therefore decrease the probability that we're going to have these collisions so let me go ahead and restructure this and focusing on the hermione harry and hagrid problem why don't we go ahead and take our array in the hash table the vertical thing here and let's think of it as maybe not just being h at that location but what if we think of that location specifically as being h a and then h b h c h d h e h f all the way down to h z and then i a i b i c and so forth so we now enumerate all possible pairs of letters from a a to z z but this would seem to spread things out right because now hermione goes in the h e location in the array now harry goes in the h a and now hag oh damn it like hagrid still goes in the same location so what would maybe be a better fix again this isn't horrible like two steps is not a big deal especially on fast computers but again with large enough data sets and if we're no longer talking about people in your contacts but maybe all the people in the world who are who have google accounts or twitter accounts and the like where you want to search this information quickly you're going to have a lot of people whose names start with h and a and z and everything else it would be nice to spread them out further so what could we do well instead of using the first two letters frankly i think the logical extension of this is to use the first three letters so maybe this is the h-a-a bucket this is the h-a-b-h-a-c-h-a-d-h-a-e dot dot dot all the way down to h-z-z and then i-a-a but now when we hash our three friends hermione goes in the h-e-r bucket so to speak the element of the array harry h a r goes in that bucket and hagrid now gets his own bucket h-a-g as would everyone else so it seems to have solved this specific problem you could still imagine and i have to think harder and probably google to see if there's other harry potter names that start with h-a-g or h-a-r or h-e-r to find another collision because you could imagine using four letters instead but what price are we paying like i'm solving this problem again and again and i'm getting myself literally faster look up time because it's giving me one step i can mathematically figure out by just doing a bit of ascii math what the number of the index is that i should jump to in this bigger and bigger array but what price am i paying brian any thoughts you'd like to relay yeah a few people are saying it's going to take a lot of memory yeah my god like this is taking a huge amount of memory now previously how much memory did it take well let me pull up a little uh calculator here and do some quick math so if we had originally 26 uh buckets so to speak elements in the array that of course isn't that bad that feels pretty reasonable 26 slots but the downside was that the chains might get kind of long three names four names maybe even more but if we have a a through z z instead of a through z that's 26 times 26 that's 676 buckets doesn't sound like a huge deal though that's bigger than most things we've done in memory thus far not a huge deal but if we have three that's 26 possibilities times 26 times 26 for aaa through zzz now we have 17 576 buckets in my array and the problem isn't so much that we're using that memory because honestly if you need the memory use it that's fine just throw hardware at the problem buy and upgrade more memory but the problem is that i probably don't know that many people whose names start with h z z or a z z or any number of these combinations of letters of the alphabet a lot of those buckets are going to be empty but it doesn't matter if they're empty if you want an array and you want random access they have to be present so that your arithmetic works out per week too where you just use square bracket notation and jump to any of the locations in memory that you care about so finding that trade-off or finding the inflection point with those trade-offs is kind of an art and or a science figuring out for your particular data your particular application which is more important time or space or some happy medium in between the two and with problem set five as you'll see you'll actually have to figure out this balance in part by trying to minimize ultimately your own use of memory and your own use of computers time but let me point something out actually this notion of hash table which up until now definitely the most sophisticated data structure that we've looked at it's kind of familiar to you in some way already like these are probably larger than the playing cards you have at home but if you've ever played with a deck of cards and the cards start out randomly odds are you've at some point needed to sort them for one game or another sometimes you need to shuffle them entirely if you want to be a little neat you might sort them not just by number but also by suits so hearts and spades and clubs and diamonds into separate categories so honestly i have this literally here just for the sake of the metaphor we have four buckets here and we've gone ahead and labeled them in advance with spade there so that's one bucket here we have a diamond shape here and here we have here we have hearts here and then clubs here so if you've ever sorted a deck of cards odds are you haven't really thought about this very hard because it's not that interesting you probably mindlessly start laying them out and sorting them by by suit and then maybe by number but if you've done that you have hashed values before if you take a look at the first card and you see that oh it's the dime it's the ace of diamonds you know yes you might care ultimately that it's a diamond but that it's an ace but for now i'm just going to put it for instance into the ace but into the diamond bucket here's the two of diamonds here i'm going to put that into the diamond bucket here's the ace of clubs so i'm going to put that over here and you can just progressively hash one card after the other and indeed hashing really just means to look at some input and produce in this case some numeric output that outputs to like bucket zero one two or three based on some characteristic of that input whether it's actually the suit on the card like i'm doing here or maybe it's based on the letter of the alphabet here and why am i doing this right i'm not going to do the whole thing because like 52 steps is going to take a while and get boring quickly if not already but why am i doing this because odds are you've probably done this not with the drama of like actual buckets you probably just kind of laid them out in front of you but why have you done that if that's indeed something you have done yeah over to us sophia there's a possibility that we could actually get to things faster like if we know what bucket it is we might be able to even search things for like oh one or less yeah something like that yeah you start to gain these optimizations right like at least as a human honestly like i can just i can process four smaller problems just much easier than one bigger problem that's size 52 i can solve for 13 card problems a little faster especially if i'm looking for a particular card now i can find it among 13 cards instead of 52 so there's just kind of an optimization here so you might take as input these cards hash them into a particular bucket and then proceed to solve the smaller problem now that's not what a hash table itself is all about a hash table is about storing information but storing information so as to get to it more quickly so to sophia's point if indeed she just wants to find like the uh ace of uh ace of uh diamonds ace of diamonds she now only has to look through a 13 size problem a linked list of size 13 if you will instead of an array or a linked list of size 52 so a hash table allows you to bucketize your inputs if you will colloquially and get access to data more quickly not necessarily in time one in one step it might be two might be four might be 13 steps but it's generally more fewer steps than if you were doing something purely linearly or even logarithmically ideally you're trying to pick your hash function in such a way that you minimize the number of elements that collide by using not a through z but a a through zz and so forth so let me go ahead here and ask a question what then is the running time when it comes to this data structure of a hash table if you want to go ahead and search in a hash table once all of the data is in there once all of my contacts are there how many steps does your phone have to take given n contacts in your phone to find hermione or hagrid or anyone else so i see again eighty percent of you are saying constant time big o of one and again constant time might mean one step two steps four steps but some fixed number not dependent on n eighteen percent of you or so are saying linear time and i have to admit the twenty percent of you or so that said linear time are technically asymptotically mathematically correct and here we begin to see sort of a distinction between like the real world and academia so the academic here or rather uh the real world here the real world programmer would say just like sophia did obviously like a bucket with 13 cards in it is strictly better than one bigger bucket with 52 cards like that is just faster it's literally four times as fast to find or to fill us flip through those 13 cards instead of 52 like that is objectively faster but the academic would say yes but asymptotically and asymptotically is just a fancy way of saying as n gets really large the sort of wave of the hand that i keep describing asymptotically taking 13 steps is technically big o of n why well in the case of the cards here it's technically n divided by 4. like yes it's 13 but if there's n cards total technically the size of this bucket is gonna end up being n divided by four and what did we talk about when we talked about big o and omega well you throw away the lower order terms you get rid of the constants like the divide by four or the the plus something else so we get rid of that and it's technically a hash table searching it is still in big o of n but here again we see a contrast between like the real world and the theoretical world like yes if you want to get into an academic debate yes it's still technically the same as a linked list or an array at which point you might as well just search the thing left to right linearly whether it's an array or a linked list but come on like if you actually hash these values in advance and spread them out into 4 or 26 or 576 buckets that is actually going to be faster when it comes to wall clock time so when you literally look at the clock on the wall less time will pass taking sophia's approach than taking an array or linked list approach so here the ant those of you who said big o of n are correct but when it comes to the real world programming honestly if it's faster than actual end steps that may very well be a net positive and so perhaps we should be focusing more on practice and less sometimes on the theory of these things and indeed that's going to be the challenge the problems at five to which i keep eluding is going to challenge you to implement one of these data structures a hash table with 100 000 plus english words we're going to in a nutshell give you a big text file containing one english word per line and among your goals is going to be to load all of those 140 000 plus words into your computer's memory using a hash table now if you are simplistic about it and you use a hash table with 26 buckets a through z you're going to have a lot of collisions if there's 140 000 plus english words there's a lot of words in there that start with a or b or z or anything in between if you maybe then go with a a through z z maybe that's better or aaa through zzz maybe that's better but at some point you're going to start to use too much memory for your own good and one of the challenges optionally of problem set five is going to be to playfully challenge your classmates whereby if you opt into this you can run a command that will put you on the big board which will show on the course's website exactly how much or how little ram or memory you're using and how little or how much time your code is taking to run and so we just sort of put aside the sort of academic hand waves of the hand saying well yes all of your code is big o of n but a big but n divided by four sophia's approach is way better in practice than n itself and we'll begin to tease apart the dichotomy between theory here and practice but these aren't the only ways to lay things out in memory and we wanted to show you just a few other ideas that come out now that we have all of these building blocks one of which is a data structure that we're going to call a try the tri is actually short for the word retrieval even though it's not quite pronounced the same it's also known as a prefix tree and it's a different type of tree that is typically used to store words or other more sophisticated pieces of data instead of just numbers alone so a try is actually a tree made up of a raise so you can kind of see a pattern here like a hash table was an array of linked lists a try is a tree each of whose nodes is an array so at some point computer scientists started getting a little creative and started just like literally smashing together different data structures to see what they could come up with it seems and so a try begins to look like this but it has this amazing property that's better than anything in theory we've seen before here is one node in a try and it's a node in the sense that this would be like a rectangle or a square but inside of that node is literally an array of size 26 in this case and each of those locations or buckets if you will represent a through z and what we're going to do is anytime we insert a word like a name like harry or hagrid or hermione or anyone else we are going to walk through the letters of their name like h a g r id and we are going to follow a series of pointers from one node to another as follows so for instance if this is a through z or zero through 25 here is location h so if the goal at the moment is to insert the first of our uh contacts for instance harry i'm going to start by looking at the first node the root of the tree looking up the h location and i'm going to kind of make mental note that harry starts there the h in harry starts there then if i want to insert the a in harry i'm going to go ahead and add a new node representing 26 letters but i'm going to sort of keep track of the fact that okay a is here so now i'm going to have another pointer oh i'm sorry not harry hagrid first h a g r i d so what have i just done a try again is a tree each of whose nodes is an array and each of those arrays is an array of pointers to other nodes so again we're really just mashing up everything together here but it's the same building blocks as before each node in this tree top to bottom is an array of pointers to other nodes and so if i wanted to check if hagrid is in my contacts i literally start at the first node and i follow the h pointer i then follow the a pointer i then follow the g pointer the r pointer the i pointer and then i check at the d pointer is there a boolean value inside of that structure more on that another time perhaps that just says yes or no there is someone named h-a-g-r-i-d in my contacts notice there's no other letters noted at the moment and there's no other green boxes green just denotes a boolean value for our purposes now so that means there's no one whose name is h-a-g-r-i-a or r-i-b or r-i-c it's only h-a-r-h-a-g-r-i-d that exists in my contacts but notice what happens next now if i go ahead and insert harry notice that harry and hagrid share the h the a and then this third node but then harry needs to follow a different pointer to store the r and the y and notice the green there it's sort of a check mark in the data structure a boolean value that's saying yes i have someone in my context name h-a-r-r-y and then if we add hermione she shares the h and then also the second node but hermione requires some new nodes all together but notice this key property the reason for this sort of complexity because this is probably the weirdest structure we've seen thus far is that even if i have a billion names in my phone book how many steps literally does it take me to find hagrid someone feel free to chime in in the text the chat window if you'd like even if i have a billion names in my contacts how many steps does it take for me to look up and check if hagrid is among them people are saying 6 6 h a g r i d if i have two billion names four billion names how many steps does it take six and this is what we mean by constant time constant time at least in the length of these uh the humans names in the data structure so what does this mean no matter how many names you cram into a try data structure the number of other names does not impact how many steps it takes for me to find harry or hagrid or hermione or anyone else it is only dependent on the length of their name and here's where we can get a little academic if you assume that there's a finite number of characters in any human real or imaginary's name maybe it's 20 or 100 or whatever it's more than six but it's probably fewer than hundreds then you can assume that that's constant so it might be big o of like 200 characters for someone with a super long name but that's constant and so technically a try gives you that holy grail of lookup times and insertion times of big o of 1 because it is not dependent on n which is the number of other names in the data structure it is dependent only on the length of the name you are inputting and if you assume that all names in the world are reasonably length less than some finite value like 6 or 200 or whatever it is then you can call that and it technically is big o of one that is constant time so here is that the goal of this whole day like trying to get to constant time because constant time is better than it would seem linear time or logarithmic time or anything else we've seen but but but again there's always a trade-off what price have we just paid if you see it wire tries not necessarily all that there's still a catch uh danielle for example if let's say you wanted you had two people in your contact list one person was named daniel and one person was named danielle and and you know that daniel is in your list so the l would have this like boolean operator of like true but then how would you get to danielle if your l was an operator and it didn't point to another l for danielle really good question a corner case if you will what if someone's name is a substring of another person's name so dan daniel d-a-n-i-e-l and i think you're saying danielle d-a-n-i-e-l so the second name is a little longer let me stipulate that we can solve that i have not shown code that represents each of the nodes in this tree let me propose that we could continue having arrows even below so if we were to have daniel in this tree we could also have danielle by just having a couple of more nodes below daniel and just having another green check mark so it is solvable in code even though it's not obvious from the graphical representation but absolutely a corner case but thankfully it is solvable in tries what might another downside be though of a try like you do get big o of one time you can solve the daniel danielle problem but there's still a price being paid any thoughts yeah how about over to uh uh attend if i'm saying it right uh yeah i'm eating um i think it's because it would just take a lot of uh memory to house all of that and that could take a lot of time a lot of um could slow down the system exactly yeah you can kind of see it from my picture alone we only added three names to this data structure but my god like there's dozens maybe a hundred plus pointers pictured here even though they might all be null right if the absence of an arrow here suggests that they're null ox0 but even storing null ox0 is eight zero bits so this is not lacking for actual memory usage we're just not using it very efficiently we have spent a huge number of bits or bytes or however you want to measure it because look even with the h's i'm using one pointer out of 26 25 pointers are probably initialized to null which means i'm wasting 25 pointers and you can imagine the lower and lower you get in this tree the less likely there is to be a name that even starts with h-a-g-r-i-d daniel came up with a good example with danielle but that's not gonna often happen certainly the lower you get in this tree so as e10 says you're wasting a huge amount of memory so yes you're gaining constant time lookups but my god at what price you might be using megabytes gigabytes of storage space because again the most important property of an array is that it's all contiguous and therefore you have random access but if you have to have every node containing an array of size 26 or anything else you have to have it you have to spend the memory over and over again so there too is a trade-off while this might be theoretically ideal theory does not necessarily mean practice and sometimes designing something that is textbook less efficient might actually be more efficient in the real world and in problem set five in your own spell checker uh when you build up this dictionary that we then use to spell check very large corpuses of text will you begin to experience some of those real-world trade-offs yourself well we wanted to end today with a look at what else you can do with these kinds of data structures just to give you a taste of where else you can go with this and what other kinds of problems you can solve again thus far we've looked at arrays which are really the simplest of data structures and they're not even structures per se it's just contiguous blocks of memory we then introduced linked lists of course where you have this one-dimensional data structure that allows you to stitch together nodes in memory giving you dynamic allocation and if you want de-allocation inserting and deleting nodes if you want then we had trees which kind of gives us the best of both worlds arrays and linked lists but we have to spend more space and use more pointers then of course hash tables kind of merge together two of those ideas arrays and linked lists and that starts to work well and indeed that's what you'll experiment with with your own spell checker but then of course there's tries which at first glance seem better but not at great not without great cost as e10 says so it turns out with all of those building blocks at your disposal you can actually use them as lower level implementation details to solve higher level problems and this is what are known as abstract data structures or abstract data types an abstract data structure is kind of a mental structure that you can imagine implementing some real world problem typically that's implemented with some other data structure so you're sort of writing code at this level but you're thinking about what you've built ultimately at this level and that's abstraction taking lower level implementation details simplifying them for the sake of discussion or problem solving higher up so what's one such data structure a queue is a very common abstract data structure what is a queue well those of you who grew up and say in britain call a line outside of a store a queue and that's indeed where it gets its name a queue is a data structure that has certain properties so if you're standing outside of a store or a restaurant in healthier times waiting to get in you're generally in a queue but there's an important property of a cue at least if you live in a fair society you'd like to think that if you are the first one in line you are the first one to get out of line first in first out it would be kind of obnoxious if you're first in line and then they start letting people in who are behind you in that queue so a queue if it's implemented correctly has a property known as fifo first in first out and we humans would think of that as just a fair property and a queue generally has two operations associated with it at least nq and dq those are just conventions you could call it add and remove or insert delete whatever but nq and dq are sort of the more common ones to say so enqueuing means you walk up to the store and you get in line because you have to wait dq means they're ready to serve you or have you and so you get out of line that's dq and again fifo is just a fancy acronym that describes a key property of that which is that it's first in first out so how could you implement a queue then well what's interesting about that data structure is that it's abstract right it's more of an idea than an actual thing in code you want to implement some kind of fair queuing system and so you think of it as a queue but frankly if we're going to translate that example to code you could imagine using like an array of persons that could implement a cue you could use a linked list of persons frankly either of those would actually work underneath the hood as the lower level implementation details but what would be a problem if we translate this real world analogy like queuing up outside of a store to get in into code and you used an array what would a downside be of using an array to represent a queue in general even though we're making a bit of a leap from like real world to code suddenly but there's probably a downside what might a downside be of using an array to implement a queue ryan if you're if you're using it an array you can't really just take out the existing values you would still have a bunch of because if you're thinking about doing this in a line you would have to take out the first person take out the second person but you can't really dynamically sort of change the memory after that yeah yeah that's a really good point think about a line suppose that there's like a line that can fit 10 people outside the apple store because apple's pretty good right now during the health crisis of letting people in only so many at a time so suppose they have room for 10 people 6 feet apart that's actually a pretty apt analogy this year more than ever but as ryan says if you want to take someone if you want to dq someone then the first person in line is going to go into the store and then the second person you're going to dq them they go into the store the problem with an array it would seem is that now you have essentially empty spaces at the beginning of the line but you still don't have room at the end of the line for new people now there's an obvious real world solution there you just say hey everyone would you mind taking a few steps forward but that's inefficient like not so much in the human world you got to get into the store but in code that's copying of values you have to move like eight values two places over if two people were just let into the store so now your dq operation is big o of n and that doesn't feel quite ideal and we can do better than that if we're a little clever with some local variables and such but that would be one challenge of a cue certainly is just how well we could implement it using an array so you might imagine too and array is limited too because it would kind of be obnoxious if you get to the apple store there's already 10 people online and they don't let you get in line they say sorry we're all full for today when they're obviously not because eventually there'll be more room in the queue a linked list would allow you to keep appending more and more people and even if the line outside the store gets crazy long at least the linked list allows you to service all of the customers who are showing up over time an array of fixed size would make that harder and again you could allocate a bigger array but then you're going to have to ask all the customers hey could everyone come over here no go back over there i mean you're constantly moving humans or values in memory back and forth so that is only to say that to implement this real world notion of a cue which is very commonly used even in the computer world to represent certain ideas for instance printer queue when you send something to the printer especially on a campus or in a company there's a queue and ideally the first person who printed is the first one who gets their printouts thereafter cues are also used in software but there's other abstract data types out there besides cues one of them is called a stack so a stack is a data structure that can also be implemented underneath the hood using arrays or linked lists or heck maybe something else but stacks have a different property it's last in first out last in first out so if you think about the trays in a cafeteria in healthier times when everyone is on campus using trays from a cafeteria you'll recall of course that trays tend to get stacked like this and the last uh the last tray to go on top of the stack is the first one to come out if you go to a clothing store or your own closet if you don't hang things on hangers or put them in drawers but kind of stack them like here like all of these sweaters this is a stack of sweaters and how do i get at a sweater i want well the easiest way to do it is with last in first out so i constantly take the black sweater the black sweater but if i've stored all of my sweaters in this stack you may never get to the sort of lower level ones like the red or the blue sweater because again of this data structure so lifo last in first out is in fact the property used to characterize stacks and stacks are useful or not useful depending on the real world context but even with the computing we'll see applications over time where stacks indeed come into play and those two operations that those things support are generally called push and pop it's the same thing as add or remove or insert or delete but the terms of art are generally push and pop where this is me popping a value off of the stack this is me pushing a value onto the stack but again it's last in first out otherwise known as lifo and then there's this other data structure that actually has a very real world analog known as a dictionary a dictionary is an abstract data type which means you can implement it with arrays or linked lists or hash tables or tries or whatever else an abstract data type that allows you to associate keys with values and the best analog here is indeed in the real world what is a dictionary like an old school dictionary that's actually printed on paper in book form what is that what is inside that book a whole bunch of keys a whole bunch of bold-faced words like apple and banana and so forth each of which have definitions otherwise known as values and they're often alphabetized to make it easier for you to find things so that you can look things up more quickly but a dictionary is an abstract data type that associates keys with values and you look up the values by way of their keys just like you look up a word's definition by way of the word itself and dictionaries are actually kind of all around us too you don't think of them in these terms probably but if you've ever been to sweet green for instance in new haven or in cambridge or elsewhere this is a salad place where nowadays especially you can order in advance online or on an app and then go into the store and pick up your food from a shelf but the shelf the way they do it here in cambridge and in other cities is they actually have letters of the alphabet on the shelves a b c d e f all the way through z the idea being that if i go in to pick up my salad it's probably on the d section if brian goes in to pick up his it's in the b section and so forth now here too you can imagine perverse corner cases where this data structure this dictionary whereby letters of the alphabet map two values which are people's salad is not necessarily fail proof like can you think of a perverse corner case where sweet green's very wonderful methodical system actually breaks down can you think of a limitation here even if you've never been to sweet green or never eaten salad like what could break down with this system if going into a store and picking something up based on your name any thoughts a few people say there might be a problem if two people have the same name yeah if two people have the same names you start to sort of stack things up so literally sweet green will start stacking one solid on top of the other so there is actually an interesting incarnation of one data type being built on top of yet another data type so again all of these are sort of like custom scratch pieces if you will that we're constantly sort of reassembling into more interesting and powerful ideas but at some point if there's a lot of b names or d names or any letter of the alphabet i surely see a finite height to this shelf so it's kind of as though sweet green has implemented their dictionary using stacks with arrays because arrays are fixed size so there's surely only a so many inches of space here vertically so you can see a real world limitation so what does sweep green do if that happens they probably just kind of cheat and put the b's in the c section or the d's and the e section like who really cares in the real world your eyes are probably going to skim left and right but algorithmically that is slowing things down and in the worst case if you're like really late to pick up your salad or sweet green is really popular and there's a huge number of salads on the shelf your name might be albus but your salad might end up way over here in the z section if they're just out of room and so that too is a valid algorithmic decision to just make room somewhere else but again trade-offs between time and space and so we thought we'd end on a note thanks to some friends of ours at another institution who made a wonderful visualization that distinguish these notions of stacks versus cues stack and again a queue are these abstract data types that can be implemented in different ways they have different properties each of them respectively fifo or lifo and here for instance is a final look in our final moments together here today about how these ideas manifest themselves perhaps in the real world not unlike this stack of sweaters here [Music] once upon a time there was a guy named jack when it came to making friends jack did not have the neck so jack went to talk to the most popular guy he knew he went up the blue and asked what do i do lou saw that his friend was really distressed well lou began just look how you're dressed don't you have any clothes with a different look yes said jack i sure do come to my house and i'll show them to you so they went off the jacks and jack showed lou the box where he kept all his shirts and his pants and his socks luce said i see you have all your clothes in a pile why don't you wear some others once in a while jack said well when i remove clothes and socks i wash them and put them away in the box then comes the next morning and up i hop i go to the box and get my clothes off the top lou quickly realized the problem with jack he kept clothes cds and books in a stack when he reached for something to read or to wear he chose a top book or underwear then when he was done he would put it right back back it would go on top of the stack i know the solution said a triumphant lou you need to learn to start using a queue lou took jack's clothes and hung them in a closet and when he had emptied the box he just tossed it then he said now jack at the end of the day put your clothes on a left when you put them away then tomorrow morning when you see the sunshine get your clothes from the right from the end of the line don't you see said lou it will be so nice you'll wear everything once before you wear something twice and with everything in queues in his closet and shelf jack started to feel quite sure of himself all thanks to lou and his wonderful cue alright that's it for cs50 we will see you next time [Music] so [Music] [Music] you

Info

Channel: CS50

Views: 141,389

Rating: 4.9652176 out of 5

Keywords: cs50, harvard, computer, science, david, j., malan

Id: 2T-A_GFuoTo

Channel Id: undefined

Length: 146min 39sec (8799 seconds)

Published: Thu Dec 31 2020