Rust Memory Management - Ownership and Borrowing

Video Statistics and Information

Captions
Hi guys, my name is Tensor. Today we're going to get into Rust's memory management pattern and talk about ownership and borrowing. Ownership is one of the features that sets Rust apart from other programming languages. Languages like Java, C#, Dart, and Go use a garbage collector, which is a part of the runtime or a virtual machine that runs along with your program, dynamically cleaning up and freeing memory where it can. Garbage collectors are efficient, but they are a bit too costly for a systems-level language like Rust. Other systems-level languages like C and C++ allow the developer to explicitly define when and where memory can be allocated and deallocated. While this method yields control and performance over a garbage collector, it also introduces complexity and the potential for many different errors. Rust, on the other hand, opted to set a few intuitive rules for how the compiler will treat data types in memory via its ownership and borrowing models.

Now, before we get into these rules, let's discuss the stack and the heap. We'll use this simple example as a guideline. As the compiler reads through our code here, it populates a bunch of frames called stack frames, and I'm going to represent the stack frames using these block comments. Each of our frames will be named after the function that it represents, and it will contain pointers, values, and variables which are used by that function. If we look at the main function frame here, it's called main, and it has the variable x with the value 1 as well as the call to the print function down here. When the compiler hits a new function call, a new frame is created and populated with the contents of the call, so when we hit this print function we can create a new frame like this, representing the print function.

Now, it's important to note that our newly created frame cannot access any of the data from our main frame unless it's explicitly passed through as an argument. I can't access this x = 1 value unless it gets passed into
our print function, which it is, but if we add another variable in here then I couldn't access it from our print frame. When we hit the end of the frame, which would be the final line in the comment block, we go back to where the execution started in the last frame. Once we hit the end of the print frame, this frame gets removed from the stack and we come back to just after where print was called in the main frame. In this case, because nothing happens after print, this frame then gets removed and we go back up to the global frame, where the global frame gets removed and the application exits because it's finished. It's important to note that each of these frames represents the scope of the function that it represents, with the global stack frame being the global scope of the application, and so on.

The stack memory model is pretty efficient, but it must follow a set of rules. Firstly, all data is put on the top of the stack and it's popped off the top of the stack. In other words, when we get data or when we put data onto the stack, we're always doing it from the same place. Secondly, data on the stack must be a fixed size at compile time. If the compiler cannot determine how much memory a type needs, then it cannot be put directly on the stack. So if we look at our example here, we have our x value, which is of type u8. We know at compile time that a u8 has a size of 8 bits in memory, and because we know that, we can put it directly on the stack, and that's why it shows up in our stack frame. In Rust, all of the primitive types live on the stack. Types like the boolean, numbers, slices, characters, fixed-size arrays, tuples containing primitives, and function pointers can all sit on the stack.

What happens when we need to use a more complex data structure which may grow or shrink in the future? Types like strings and vectors put a pointer on the stack which points towards the memory address inside of something called the heap. You can think of the heap like a big hash
map of data where each piece of data corresponds with a memory address. If we look at our example here, we now have a string. We're mutating that string by using s.pop() to take the last character from the string and just throw it out, and then we're printing the string out. Again, the compiler will build a stack frame, but when it runs into the string in our stack frame, it takes the string and allocates it on the heap like this. We know that our string has a length of 8, so we need to allocate 8 cells in our heap map to this string, and then we can take the address of the first cell and put it back on the stack so that we can identify where on the heap our data structure actually is. We take that address along with the length and the capacity, and that defines the pointer which actually sits on our stack. So instead of putting the entire string on the stack, we just have this metadata which points towards the data on the heap. This pointer which we've now created has a fixed size and can reside on the stack, but it points towards a piece of data which does not have to have a fixed size and which resides on the heap.

If we change the string, some of the values in our pointer will also change. For instance, we're calling s.pop(), which removes the g from our string. The length of our string will change from 8 to 7, but the capacity will stay the same, because when we first initialized the string we had a length of 8, and the capacity represents how many cells in the heap we've reserved for this data structure. So even though we've popped off the g, that last cell is still reserved for this data structure just in case we decide to grow it some more. If, on the other hand, we go to increase the size of our string, then both the length and capacity will increase to account for this new change, and the new data will sit next to the old data on the heap. Essentially, we'd start to allocate cells next to the existing cells on our heap hash map. Sometimes,
however, it might be necessary for the address pointer to change in response to this increase in size. For instance, maybe there aren't enough slots next to each other for us to fit our new string in the same position, and so we need to find a new position on the heap.

All right, so now that we know a bit about the stack and the heap, let's move on to the rules of ownership. Ownership in Rust follows three rules. Firstly, each value has a variable which is its owner. In our example, the variable s owns the pointer to our string, and if we create another variable here called x, then we can say that x owns the value 1.

The next rule in ownership is probably the most important one, and that is that there can only be one owner of a value at a given time. Let's look at this example using primitives. We've taken 10 and assigned it to x; in other words, x owns 10. Then we're taking x and assigning it to y, and we're also assigning it to z. We know that there can only be one owner at a given time, but as you'll note, we're not getting any kind of error here. What's actually happening is that the compiler is creating copies every single time we assign x to a new variable. So the stack frame for this main function would look like this, where we have x equaling 10, y equaling 10, and z equaling 10. It does not, however, look like this. x is the sole owner of this value of 10, and y can't own this value of 10 too, and z can't own this value of 10. It's extremely cheap and fast for the compiler to do this because the data type has a fixed size, which means that the data type is only on the stack. If we come back up to our stack types, we can now amend this list to be copy types as well, because all of these types which reside on the stack are able to be cheaply copied by the compiler, and they also implement a trait called Copy.

We can create a function like this to see if our type implements the Copy trait. We're passing in a generic type called T and we're using a guard to say that T
implements the Copy trait. If we come up here and put in, say, a boolean, you can see that there's no error. If we then put in a character, again there is no error. If we put in a string slice, again there's no error, and if we put in our number 10, again there's no error. However, if we put in a String, you can see that we do in fact get an error, where it says that the trait Copy is not implemented for the String type. Don't worry too much about the syntax of this function; I'm just using it to make a point.

So then what happens if we try to copy a non-fixed-size data type like a String? As we saw, our String type doesn't implement the Copy trait. Our dynamically sized type leaves a pointer on the stack which points to the data on the heap. If we copied this pointer over to a new variable, say b, both a and b would be owners of the data on the heap, which breaks the one-owner rule. And because both of these pointers contain information about the length and capacity of the data on the heap, if the data on the heap is changed through one of these variables, then the other variable has no means of knowing about it. So if, for instance, I called b.pop(), a would not change its length to correspond with it because it wouldn't know about that operation, and so the pointer that we have for a would be fundamentally wrong. Then, when the compiler goes to deallocate these values, it would try to clean up the same data twice. This could also lead to a pointer which points towards a piece of data that doesn't exist anymore, which is called a dangling pointer.

So instead of copying the data from one variable to the next, we move the pointer's ownership to the latest variable. This means that the original variable no longer has any ownership over the pointer or the data in the heap. Our stack frame would look like this, where we have a pointing towards the pointer, and then as soon as we assign b, all of this data associated with a goes away and a becomes invalid. After we've declared b, we can now
say that b is the sole owner of the data on the heap. If I try to do something with a after we've moved it to b, you can see that we get an error, because a has moved to b and a no longer exists in this scope. Moving variables is very performant for the compiler because it only needs to copy the pointer from one variable to the other and then delete the original copy.

Now, there are cases where we may want to have two variables with the same data inside of them, and for these cases we have the Clone trait. The Clone trait copies the data structure in the heap to another address in the heap and then gives the new pointer to the associated variable. If we come down to our stack frame here, we can see that a points towards the address 0, and b points towards a completely different address in the heap, even though it has the same length and capacity and it even points towards a string that says "a string". This of course obeys the rules of ownership because there are now two distinct and separate pointers and two data structures in the heap. If one changes, then the other one is not affected at all.

The final rule of ownership has to do with scope: when the owner of a value goes out of scope, the value will be dropped out of memory. I've mentioned before that our stack frames here are essentially the scope of our functions. In Rust, when you define a variable, the variable is scoped to the set of brackets that surrounds it. This is called block scoping. So both a and b exist from where they're defined until we hit the ending bracket here, at which point they get dropped, or deallocated, from memory.

With regards to ownership, function arguments work similarly to variable assignments. If we look at this example here, where we have a function called own_string, we define that we want to have a variable a which is of String type, and then the function does nothing with it. What essentially happens here is that when we pass a and b into this own_string function, the variables a and b get moved to the
scope of the own_string function and to this new a variable. So this function takes ownership of these values, and when the function terminates, the value is dropped. We're also allowed to return ownership from a function by just returning the value from the function. Here you can see that we take in the value a and then we just return that value a. What we can do is pass a into own_string and then pass it back into another variable called a, and then pass b into own_string and pass it back to another variable called b. So ownership moved from our main scope to this function scope and then back to our main scope.

Tracking ownership may seem easy enough, but it can get complicated when you start to deal with larger examples, so we need a way to pass around values without having to pass around ownership, and this is where the concept of borrowing comes into play. Borrowing as a concept has its own set of rules. When we borrow, we're allowed to have infinite borrows for read-only access. If we look at this example here, you can see that we create a, which is of course our string, and then we're taking a and assigning it to b, c, and d, and we're using this ampersand before the value of a. What this is doing is actually taking a and borrowing it to b, c, and d rather than just assigning it to b, c, and d. Because we're just using an ampersand and not an ampersand with the mut keyword, we're giving these variables read-only access to our string. So even though our string here is mutable, these values are not, because they are read-only borrows of this string. Each of these ampersands creates what's called a reference, so we're creating references to a by borrowing a to b, c, and d.

Another important rule in borrowing is the fact that when we're making read-only borrows, the original data becomes immutable for their duration. So even though our string here is defined as a mutable string, it can't be mutated while we're borrowing from it with these variables. And like with
ownership, a borrow will last until the end of the current scope, so these borrows start when we define the values and then they end at the end of this bracket here. Once the borrow ends, though, we can go back to the original owner and mutate the value freely. As you can see here, I've created a new block scope and I've bound all of our borrows to the block scope, so these borrows will end at this curly bracket, and once this scope finishes we can then go back and start to mutate our original value using the a variable.

The final rule of borrowing has to do with what are called mutable borrows. We're only allowed to pass one borrow at a time for write access, or mutability. So while we're allowed to create infinite read-only borrows, we're only allowed to create one mutable borrow. We define a mutable borrow by using the ampersand followed by the keyword mut followed by the value that we want to borrow. In this case, we're borrowing a mutably with the variable x, and I can take this x and bind it to its own scope. This way, when I go to print out a, we'll have called pop on the a string twice, so it will lose two characters.

The reason why Rust restricts mutable borrows to only a single borrow at a time is to avoid a concept known as a data race. A data race happens when you have two or more pointers trying to access the same memory location at the same time while at least one of these pointers is writing to the data. When this happens, the operations are not synchronized across all of the pointers, and then we get all kinds of issues.

Function arguments, just like with ownership, also work like variables in the case of borrowing. Here I've defined a function called own_and_borrow_stuff, and in it we're passing in three different values: one of them is a mutable borrow of a String, another is a borrow of a boolean, and the last one takes ownership of a u8 number. So when we invoke this function, we want to pass in a mutable borrow for our string,
then the reference for our boolean, and then finally the value for our u8, and the function will take ownership of or borrow the values accordingly.

It's also important to note that you can borrow parts of complicated data structures like strings, structs, and vectors. Here you can see that I'm taking the index from one to three of our string and putting it into a mutable value a as a borrow, then I'm taking the index from two to five of the string and putting it into b as a borrow, and then again I'm borrowing from these values into other values. These slices will work like any other borrow, and they follow all of the same rules of our borrowing system. But keep in mind that, for instance, when we slice from a String we turn the String into a primitive type, so now we have the Copy trait and we can move our data around more freely than when we just deal with a String or a vector.

Another thing I want to show you guys really quickly before we leave is the definition of a String in Rust. As you can see here, the String is just a struct, and it's a struct with one field, and that field is a vector of u8 type. If we dig into the vector type, you can see that it's a bit more complicated than a String because it has two fields, but the two fields are literally just the length and the raw vector type. Digging further into the raw vector, you can see that it's almost set up like the pointer that we were talking about before: we have the pointer itself, then we have the capacity, and then we have this a value. So even though our representation of pointers was a little bit simplified compared to how they actually look inside of memory, it's more or less the model that they use when we're dealing with those types of things.

All right guys, well, I hope you enjoyed this video. If you did, feel free to like and subscribe. If you have any questions or comments, feel free to leave them in the box below, and if you dislike this video then by all means downvote it as much as you like. If you
want to catch more videos like this, go ahead and click that notification bell, and if you want to support the channel then feel free to go check out the Patreon channel and pitch in a few dollars. Have a good night.
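
Code sketch for the length and capacity part of the captions. The on-screen code is not part of this page, so this is only a minimal reconstruction, assuming the example string is "a string" (8 characters, ending in g, as described). The helper name len_and_cap is made up for illustration.

```rust
// Hypothetical helper: reads the length and capacity stored in the
// String's stack-side metadata.
fn len_and_cap(s: &String) -> (usize, usize) {
    (s.len(), s.capacity())
}

fn main() {
    let mut s = String::from("a string");
    let (len_before, cap_before) = len_and_cap(&s);

    // Removes the trailing 'g'; the heap buffer stays reserved.
    s.pop();
    let (len_after, cap_after) = len_and_cap(&s);

    println!("before: len={} cap={}", len_before, cap_before);
    println!("after:  len={} cap={}", len_after, cap_after);

    // The length shrinks by one, the capacity does not change.
    assert_eq!(len_after, len_before - 1);
    assert_eq!(cap_after, cap_before);
}
```

The exact starting capacity is left unasserted here because the standard library only promises that it is at least the length.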
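
Code sketch for the Copy and move part of the captions. This is an approximation of what the video describes, not its actual source: is_copy follows the description of a generic function with a Copy bound, while copy_u8 and move_string are hypothetical names added so the two behaviours can be seen side by side.

```rust
// Only compiles when T implements Copy (the "guard" from the captions).
fn is_copy<T: Copy>(_value: T) {}

// Copy types: assignment duplicates the value, so x stays usable.
fn copy_u8() -> (u8, u8, u8) {
    let x: u8 = 10;
    let y = x;
    let z = x;
    (x, y, z)
}

// Non-Copy types: assignment moves ownership from a to b.
fn move_string() -> String {
    let a = String::from("a string");
    let b = a;
    // println!("{}", a); // would not compile: the value of `a` was moved to `b`
    b
}

fn main() {
    is_copy(true);
    is_copy('c');
    is_copy("a string slice");
    is_copy(10);
    // is_copy(String::from("a string")); // would not compile: String is not Copy

    println!("{:?}", copy_u8());
    println!("{}", move_string());
}
```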
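
Code sketch for the Clone part of the captions, again assuming the string "a string". The function name clone_then_change is hypothetical. It shows that a clone gets its own heap data, so changing one value leaves the other alone.

```rust
// Hypothetical demo: clones a String, changes only the clone, and
// reports whether the two values pointed at different heap addresses.
fn clone_then_change() -> (String, String, bool) {
    let a = String::from("a string");
    let mut b = a.clone();

    // Both buffers are alive here, so their addresses must differ.
    let separate_buffers = a.as_ptr() != b.as_ptr();

    // Changing b has no effect on a.
    b.pop();

    (a, b, separate_buffers)
}

fn main() {
    let (a, b, separate_buffers) = clone_then_change();
    println!("a = {:?}, b = {:?}, separate = {}", a, b, separate_buffers);
}
```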
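
Code sketch for the part of the captions about passing ownership into and out of functions. The captions name a function own_string; the second function, own_and_return, is a hypothetical name for the variant that hands the value back, since the video's exact code is not available here.

```rust
// Takes ownership of the String; it is dropped when the function ends.
fn own_string(_a: String) {}

// Takes ownership and returns it to the caller again.
fn own_and_return(a: String) -> String {
    a
}

fn main() {
    let a = String::from("a string");
    let b = String::from("b string");

    // Ownership moves into the function and back out into a new binding.
    let a = own_and_return(a);
    println!("{}", a);

    // Ownership moves into the function and the value is dropped there.
    own_string(b);
    // println!("{}", b); // would not compile: `b` was moved into own_string
}
```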
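
Code sketch for the borrowing rules in the captions: any number of read-only borrows, or a single mutable borrow. It assumes the string "a string" and uses block scopes the way the captions describe. The function name borrow_demo is hypothetical.

```rust
// Hypothetical demo of read-only borrows followed by one mutable borrow.
fn borrow_demo() -> String {
    let mut a = String::from("a string");

    {
        // Any number of read-only borrows may exist together.
        let b = &a;
        let c = &a;
        let d = &a;
        // a.pop(); // would not compile: `a` is still borrowed by b, c, and d below
        println!("{} {} {}", b, c, d);
    }

    // The read-only borrows are over, so the owner can mutate again.
    a.pop();

    {
        // Only one mutable borrow may exist at a time.
        let x = &mut a;
        x.pop();
    }

    // pop ran twice in total, so two characters are gone.
    a
}

fn main() {
    println!("{}", borrow_demo());
}
```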
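
Code sketch for the last two borrowing topics in the captions: a function that mixes borrowed and owned arguments, and slices of a String. The parameter order follows the captions, but the function body is invented purely for illustration, and slice_demo is a hypothetical name. The ranges 1..3 and 2..5 are one reading of "one to three" and "two to five".

```rust
// Mutable borrow of a String, read-only borrow of a bool, owned u8.
// The body is an assumption; the captions do not say what it does.
fn own_and_borrow_stuff(text: &mut String, flag: &bool, number: u8) {
    if *flag {
        text.push_str(&number.to_string());
    }
}

// Borrows two overlapping parts of a String as string slices.
fn slice_demo(s: &String) -> (&str, &str, &str) {
    let a = &s[1..3];
    let b = &s[2..5];
    // &str is Copy, so a stays usable after being assigned to c.
    let c = a;
    (a, b, c)
}

fn main() {
    let mut s = String::from("a string");
    let flag = true;
    own_and_borrow_stuff(&mut s, &flag, 7);
    println!("{}", s);

    let original = String::from("a string");
    println!("{:?}", slice_demo(&original));
}
```

Slicing a String by byte index like this panics if an index falls inside a multi-byte character, which is not a concern for the ASCII text used here.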
Info
Channel: Tensor Programming
Views: 19,055
Rating: 4.9331942 out of 5
Keywords: rust programming, learn to program, learn to code, programming in rust, introduction to rust, intro to rust lang, intro to rust, programming tutorial, programming tutorials, rust programming tutorials, rust ownership, memory management, rust borrowing explained, rust borrowing ownership, rust references and borrowing, rust smart pointers, rust language, rust programming language, references and borrowing, Stack, Stack frames, Stack and Heap, the Heap, Scopes, Rust Scope
Id: 2IxQgXQl_Ws
Length: 21min 33sec (1293 seconds)
Published: Tue Dec 17 2019