Rust Ownership and Borrowing


Captions
Hey, welcome to this video on Rust ownership and borrowing. My name is Doug Milford from Lambda Valley. Ownership and borrowing goes to the heart of why Rust is Rust and fundamentally different than other languages. If you can get this topic, Rust will be an amazing tool. If you can't, Rust will feel like it's constantly fighting you. In previous videos, when I talked about the significant trade-off decisions that Rust made in order to do what it can do well, this is it. This topic initially frustrates both new and senior developers. Don't feel that because you've mastered a dozen languages, this will be just one more. Have some patience and realize nobody gets this right out of the gate. Nobody makes the first jump.

What we're really talking about when I say ownership and borrowing is memory management. For those of you coming from a different language, you may be surprised that there's no garbage collection. That means you have to manage your memory as a developer. But don't worry, the Rust compiler will help you every step of the way. To understand ownership and how to manage your memory, we'll need a basic understanding of stack versus heap, so we'll discuss that.

There are significant benefits to all this effort; it's not just pain with no gain. The first is that your runtime will be extremely fast. It frequently benchmarks faster than C++. Also, parallel and concurrent processing will become much easier. And last but not least, safety. Memory issues plague other languages and cause incorrect data or even crashing of your application. Rust puts safety first.

So let's go to Visual Studio Code and start off by showing the compile error that initially causes so much confusion for people learning Rust, including experienced developers. If I create a simple string variable and then create a second variable based on the first, something unexpected happens. If I try to use my first variable after that, we'll get a mysterious compile error. Other languages don't have an issue with this. This is at the
heart of Rust ownership, and by the end of this video you'll understand why Rust doesn't like this, as well as multiple ways to handle it depending on your data and your coding needs downstream.

To truly understand what's going on here, we first need to discuss stack versus heap. Stack variables are used for fast memory creation and retrieval. This helps you make your program extremely fast. Memory management of stack variables is very easy, in that the memory of the variable is automatically recaptured by the program after the variable goes out of scope. I'll create a variable on the stack as an example. What this does is create enough memory to store our i8 variable. Rust uses the stack by default for its memory needs. Let's create some more stack variables.

So what do all these have in common? Well, each of these has a memory size that is known to Rust at compile time. When it needs a boolean, it knows precisely how much memory to allocate for that, regardless of whether it holds a true or false. Same goes for floats and integers, even though those types can hold a range of values. Our i8 will take eight bits no matter what we do with it, even if we mutate the value. So all these stack variables are fixed in size. Even though they are different types and have different memory needs, the fact that the compiler knows precisely how much memory to allocate at compile time makes them viable for stack variables.

Other data types, such as collections and vectors, cannot be stack variables because they can change in size. The exception is a fixed-size array, which can be on the stack because the compiler can know the exact memory size needed at compile time. Hence the name: the fixed-size array cannot grow. As a side note, strings are collections of u8s that can grow, so they can't be on the stack. We'll get to that in a minute.

So what do I really mean by a stack? Well, as your program creates new variables, it will actually allocate enough memory right on top of the previous stack variable. It's just like
stacking papers, and each paper represents some data. When you create a new variable, it adds it on top of the stack. When that variable is no longer needed, it's removed from the stack and discarded so that the memory can be recaptured for future use. Here I've created four separate variables. The memory allocated for these variables is literally right next to each other on the hardware. That's what makes stack memory so fast. There's no need to search for where the next variable should be located in memory by having to find an open slot big enough. It just places it right on top, and when it needs to access that memory, it's just as easy to locate and update. That's also why these stack variables have to be of a known, fixed memory size in Rust. Since the variables are crammed in tightly like that, there's no wiggle room for a stack variable to grow in size.

So where does the memory management part come into this? Didn't I say the developer has to manage the memory manually? Well, Rust knows to clean up all memory associated with a scope once the scope exits. In this case, it's at the main end curly bracket. But let's do some more examples so that you get the idea. Anything that has curly brackets will create a new scope, so I'm going to create a new scope with an if statement. If I create a variable inside of here, this scope will literally be placed on top of our stack char variable, as that was the latest variable created on our stack. I can use the new variable to my heart's content, but as soon as I leave the if statement, it goes out of scope and the memory is recaptured. Now if I try to use it outside of the scope, you'll see that it'll give me a compile error. That's because Rust already cleaned up that memory by the time it reached this point in the code. It's gone, poof. Rust knows to junk the inside-scope variable and leave the other variables we created earlier intact. So far, this is very common to other languages.

This concept really goes for anything that has curly brackets. If I create a
procedure or a function, it's the same idea. Any variables that get created inside of that scope automatically get cleaned up, and the memory is freed once the end curly bracket is hit. The concept is called LIFO, meaning last in, first out. It's just like a stack of papers where you add memory for the variables on top, and then once the scope finishes, it just takes all the papers off for the scope and throws them away. Begone. Okay, let's clean up a bit.

Now let's talk about the heap. The heap is going to give us the flexibility we need in certain situations. Heap variables can grow in size. For example, a vector can add elements to its listing as needed. The same goes for other collections such as hash maps. One thing that you may not realize needs to be on the heap is a string. As mentioned, a string is just a collection of u8s. You can add to or remove from the string as you please, which makes it unsuitable for the stack, and it needs to be on the heap. There is a runtime cost to using the heap to allocate memory, but sometimes the heap is just what the doctor ordered. Anytime a variable needs to change in size, it can't live on the stack and must live on the heap. Heap memory can live beyond the scope that created it, and similar to the stack, heap memory will be automatically cleaned up and recaptured when it goes out of scope, or more specifically, when its last owner goes out of scope, which we'll see in just a minute. If you're confused by any or all of this, hang tight.

Let's create a few variables that will be used on the heap. The first is a vector of i8, and I'll use the Vec::new functionality. We could have used the built-in vec! macro instead, which allows you to pre-populate the vector with initial values, but I'll stick with Vec::new for this example. Let's also create a heap-allocated string. Quick quiz: when can you allocate a String on the stack? Answer: never, because it's really a collection of u8s under the hood, and collections need the flexibility to grow that only the
heap can provide. And lastly, let's create a heap-allocated i8 to complement our stack-allocated i8. I don't do this very often, but there are some use cases.

Okay, I'm going to pick on our i8s for this example. We have our stack and our heap versions. Let's create a second version of the stack i8 based on the original and then print both versions afterwards. Okay, so that works just fine, and at this point, if you have any programming experience, you probably shrug your shoulders and say, so what? However, let me also create a second version of the heap i8 based on the original and then print them out. We get a compile error. If you hover your mouse, it'll say "value borrowed here after move". So what the ding-dong is that?

This has to do with the concept of ownership and borrowing in Rust. Every piece of data in memory has an owner, and there can only be a single owner at a time. When I created heap_i8, enough memory was allocated on the heap to represent that data, and then the ownership of that allocated memory was assigned to heap_i8. It's the one and only official owner. When I created heap_i8_2 from heap_i8, what I was really doing was keeping the original allocated memory intact but transferring ownership of that allocated memory to heap_i8_2. The variable heap_i8 is no longer pointing to any allocated memory, so Rust makes sure it's never used inappropriately by a compile-time check. This is in stark contrast to most other languages. Most languages will simply have two variables pointing to the same allocated heap memory, and when they do, they open up a Pandora's box of issues, which we'll discuss near the end of this video.

We still have our compile error, and we'll deal with that in a minute. I want to go back to the stack test we did. So I told you that Rust wanted only one owner for a piece of memory, but if that were true, why isn't our stack test up here giving us issues? You lied to me! Well, not really. For stack data, it's so cheap to create a new copy of
the data that it will just go ahead and do that. It'll make a copy of the memory that stack_i8 is pointing to and assign it to stack_i8_2. Managing memory on the heap is expensive. For efficiency purposes, Rust defaults to stack allocation whenever it can.

Back to the compile error. How do we keep heap_i8 intact after the assigning of heap_i8_2? The first question we should ask is, should we? But I'll table that discussion for now. If you want to keep both heap_i8 and heap_i8_2, there are a couple of ways to do that, and it depends on what you want to do with both variables downstream in your code. One way is to borrow the ownership, and that's done with the ampersand. Hold that thought, because we'll be describing borrowing in just a second. Another way is to call clone. Clone will create a new heap-allocated version of your data and assign it to heap_i8_2. So really, these are two completely different variables now. If I change the value on one, it doesn't affect the other. In other languages, where two object variables can point to the same allocated memory, changing one affects both. That leads to parallel and concurrency programming issues, such as race conditions, down the road. The way Rust does things, though, that's really not an issue.

Cloning may seem like the easiest way to solve our compile error, but cloning is relatively expensive. If performance is your top priority, you probably don't want to be doing expensive clones everywhere. Similar to stack variables, the variables heap_i8 and heap_i8_2 are both automatically cleaned up by Rust when they go out of scope. As mentioned, heap memory gets cleaned up when the last owner goes out of scope. Both of these variables own some memory, and both will be cleaned up and recaptured for future use once the curly brackets are hit.

Okay, let's clean up so we can get to some meatier examples. I'll create two variables representing floats, one on the stack and one on the heap. The memory associated with both of these variables
will get cleaned up once it hits the end curly bracket, so that's not really a difference here. First, I want to create a procedure that accepts the stack variable. I'll pass in the appropriate parameter for the test and just print it to the terminal, both inside the procedure and after we call it. Note, I'm just identifying the procedure as stack for naming purposes. There's nothing forcing parameters to be all stack or all heap. You can mix and match heap and stack parameters all you like. I'm just trying to keep things clean for demonstration purposes.

Okay, as you can see, if I call the stack procedure, I can still use my variable afterwards. As a reminder about what's going on here, when stack_f64 gets passed to the procedure, it'll create a copy of the memory and place it on the stack to represent the param. To illustrate, I'll make my parameter mutable and do a modification. Rust assumes immutability by default, and you have to explicitly tell the variable to be mutable. There are very good reasons to favor immutability, and when I learned about functional programming, it opened my eyes. Since Rust has made a completely new copy of the f64 and put it on our stack, it will have no effect on the variable that we passed in. If I run this, you see that the value of 10 is correctly printed out in our stack procedure, but when we print out our main version, it still has one. Mutating a parameter had no effect on the variable being passed into our procedure, indicating it's pointing to a different memory location.

Let's create a heap procedure and call it to see how it's different. I'll pass in my heap_f64 and then print it to the terminal after the procedure call. Even in a simple print statement, we get a compile error. Again, this is our friend about the value borrowed after move. The ownership of the memory associated with heap_f64 gets transferred to the procedure's parameter. Since the procedure's param now owns that memory, when it reaches the end of its scope, which is the end curly bracket of the
procedure, the memory automatically gets cleaned up. It no longer exists. So that's why Rust gives a compile error. It won't allow us to use heap_f64 because that doesn't have viable memory associated with it. It no longer owns any memory at all.

So how would we go about solving this? Oftentimes a function or a procedure will have several variables, and it wouldn't be feasible to just lose access to them. Well, you might first be tempted to do a clone of your variable and then pass it into the procedure. This has the same effect as how the stack works. It creates newly allocated memory and uses that for the parameter. But the heap is inefficient, so cloning variables is a massive waste. Imagine if you had a vector with a million entries. Just using that vector as a parameter for some calculation would have to create new memory for those million entries. That's not right.

Another way is to make a function that returns a heap-allocated f64, and then at the bottom just return your original parameter back to the caller. Now we can reassign the result back to our original variable, and we'll have to make it mutable so that it knows it can update that variable. Okay, so that works. We never create a copy of the memory, and our heap memory is preserved after we return it. Now we can use the original variable after the fact. But the solution sucks rocks. What happens if we have more heap variables that we want to pass in? Now we'd have to pass back some tuple, and then we'd have to manage the return and shove it into multiple variables. It gets super ugly super quick.

Rust has a much better way, though. Let's go back to our original situation. Okay, back to our wonderful original compile error. Instead of giving ownership away to the function or procedure, Rust has a way to borrow the ownership of the memory and then return it once the procedure ends. Remember, allocated memory needs to have one and only one owner. There are exceptions to that beyond the scope of this video, but for now, memory needs to have one and only
one owner. To tell Rust to borrow ownership temporarily, you need to put an ampersand in front of the parameter type. It's saying, hey there, I'd like to borrow your memory for a while, and when I'm done, I'll give it back to you. And now the compiler is complaining on our call. It's complaining that it's expecting a reference type. To do that, put an ampersand in front of the variable being passed in, and now everything compiles fine. If I'm reading someone else's code, or even my own, the ampersand immediately tells me that the memory's ownership will be borrowed for the procedure call. So temporarily, the procedure's parameter will become the owner of the allocated memory, but as soon as the procedure is done, it'll make the passed-in variable the owner again, and you can continue on your merry way. heap_f64 regains ownership after the procedure call, and hence the compile error goes away. Rust will do its best to make sure I can't do anything stupid with the memory, and believe me, I've tried.

This method has the benefit of being able to do calculations without having to worry that someone else will change your data unexpectedly. There is one and only one owner at a time, so if nobody else owns it, they can't touch it. This makes parallel programming ultra simple and allows you to utilize more cores on your computer, which we'll do in a video later on in this course.

This basic concept of memory management, and all the compile errors it produces, is why so many developers have such a hard time with Rust at first. This goes for new developers as well as experienced ones. Once you fundamentally understand what's going on with memory, though, things start to click into place. It's a steep learning curve and a much different way of thinking about your variables. If you're foggy-headed so far, that's to be expected a bit.

Note, the stack procedure can also use the ampersand borrow. Rust just prefers to create simple copies of stack variables, so borrowing on stack variables is the same as on heap, but it's not necessary. I
like consistency, so I'm a little frustrated there isn't one and only one way to do things. It just makes it harder to learn. But if the benefit is that some stack variables can speed up my runtime, yeah, who can complain?

Earlier we had a video on string versus string slices. Let's explore that a bit. I'll create a String variable first. Remember, a String by its nature can grow in size, so it can't be on the stack and is always on the heap. A string slice, though, is not locked to either the stack or the heap, but is more of a pointer to someone else's memory location. The string slice is borrowing it. The string slice does not own a memory slot. It only borrows it and points to the original memory. That's why there's an ampersand in front of your string slice type. It's a reference to someone else's memory and is never truly the owner. If you try to create a string slice without the ampersand, you'll get all sorts of craziness. Okay, moving on.

Now let's create a procedure that accepts both a String and a string slice, and now let's call it. Even though this compiles, can you think of what might cause you problems down the road? Before I continue, what do you think will happen if I try to print out both variables? Hmm. Let's do that and see what happens. Wow, we get our friend the compile error again. The string slice is fine because Rust can ensure memory problems won't be caused, but our String is on the heap. How would you solve this? One possible, inefficient way might be to clone it. This will create a copy of the allocated memory and use that for the parameter, but that's a dirty trick. Another way is to put an ampersand in front of the calling variable and the parameter type. This once again tells the compiler you will be borrowing the ownership of the memory temporarily and then return the ownership once it's done. Yeah, that compiles.

In the string versus string slices video, I mentioned that unless you need to pass a mutable String to a function or procedure, it's recommended you make your parameters string
slices. It's just more versatile. I'm only doing the String type parameter as an example here. We're not doing any mutations, so in real code I'd probably have defined param_a as a string slice. Okay, let's clean up.

Although heap data can have one and only one owner at a time, it can actually have multiple references, and it can only do so if the variable is immutable, or doesn't change. To show you what I mean, let's create a variable and then reference it by two other variables. I'll then print out all three to the terminal. At first, this seems to contradict that there is one and only one owner. People think that because var_b is created from var_a, var_c will give a compile error, because they think that ownership of the data has already been transferred to var_b. And likewise, the print statement seems like it should complain on both var_a and var_b. If you'll notice, though, I declared b and c by using the ampersand. This means that var_b and var_c are just referencing var_a's data and never truly are the owners. If I remove the ampersand, that tells the compiler that you're moving ownership of the data to var_b, and then we get our favorite compile error again. var_a no longer points to valid memory, but the code is trying to use it downstream.

So who is the true owner in this situation? Because of the ampersands, var_a remains the owner throughout this example. b and c are just read-only pointers to that data. But why do the ampersands work? The Rust compiler knows that var_a will not change downstream, so it doesn't care that multiple variables are reading from the same memory. But as soon as that guarantee is broken and potential data issues could occur, Rust will produce a compile error.

Let's do so in a few different ways. If I try to make var_b mutable and make a change, Rust will complain that var_b does not own the memory. It's only a reference. It's only allowed to read it. If I try to make var_a mutable and then modify it like so, that too will produce a compile error. The situation here is
different, though. It's saying, hey, I have var_b and var_c referencing my data. They were told they can rely on that data not changing on them while they're using it. But the print statement is using b and c downstream, so the guarantee that the data will not change is broken. But if I put the mutation after b and c are last used, Rust is smart enough to know that there is no possible way for b and c to access bad data. Their job has been completed prior to the mutation. If I try to use either after the mutation at any point, though, Rust will rightfully complain. Likewise, if I make the mutation above the declarations of b and c, then too it can guarantee that b and c are okay. There is never a situation where a memory problem could occur, so Rust says everything's cool.

This really boils down to: can Rust guarantee that memory issues won't occur based on your code? If it can, it'll compile. One way to guarantee that is to use immutable data, which Rust does by default. If data can never change, there's never a chance for data to change in your reference. Another way is just to be careful where mutations occur. And yet another way is to make a clone of your heap data, if you truly need to. That's valid in many situations.

When would references like this come in handy? Well, in heavy calculation situations where you can have millions of data elements, you would probably want to utilize all the computing power your computer has to offer. That gets to the topic of parallel processing, which is beyond the scope of this video. You also probably don't want to create a copy of your data, which could eat up time. Let's create a mini example so that you get the idea. I'll create two string variables. Now I'll put them into a vector. Note, I only created two, but this could have literally been millions of values. I'll now create a function that accepts a vector reference and spits out a result. I'll just pick an i64 as the return type for simplicity's sake. Now I can do some heavy-duty calculations that utilize available cores
of my computer. I'm not really going to go through all the calculations. I'm just setting up an example framework, and we'll pretend the result comes out to be, I don't know, 10. So I'll just hard-code that. I can now perform calculations based on the vector's string pointers, and I'll print out the results. And because everything is immutable, I can still use each of my variables downstream. Let's print out each one. Because Rust is so strict about memory management and safety, if it compiles, you can sleep soundly knowing that it's not going to have a data race or multi-threading issue. Effectively utilizing your computer's resources in safe, effective ways is a major strength of Rust. On a side note, if you had watched my strings versus string slices video, you would also know that this can be done with string slices because of coercion. Okay, let's clean up again.

What about structs? I'll create a simple struct with two fields, an integer and a float. I'll annotate it with derive Debug. This is a macro that will give DougStruct the ability to print out to the terminal if needed. I'll create a procedure that accepts DougStruct, and I'll just print it out to the screen. The colon and the question mark are used here to print structs. If this looks odd to you, I'll refer you to the video where I already discuss printing and formatting. I'll now create a variable and pass it into our new procedure. Because DougStruct is comprised entirely of stack variables, I might be tempted to think that a copy of the data will automatically be copied to param_a. But in fact, if we try to use var_1 after passing it to our procedure, we get our favorite compile error. But why? In my mind, DougStruct should behave just like a simple float would. The exact size of DougStruct is known at compile time, so there shouldn't be issues with using the stack. The only thing I can think of is that DougStruct could potentially have thousands of fields. If so, making the assumption to copy each of the fields automatically without asking the
developer if they should probably isn't recommended. For performance-critical applications, passing a borrowed reference in that situation would probably be much more performant. So yes, we can put the borrow ampersand in front of the parameter and it will compile. But we have some other options as well. When passing a boxed variable, we were able to type .clone on it to explicitly copy the value and allocate a new memory slot for the parameter. It doesn't seem to like that, because clone isn't defined for the struct. But there's an easy way to fix that. All you need to do is tell it that you'd like DougStruct to support Clone by putting in the annotation. This is actually a macro that automatically creates the Clone trait implementation for you.

We could have done this by hand, like so, by implementing Clone for DougStruct and then filling out the clone function. It's pretty trivial. All you do is return a new struct of the Self type and then fill out each field based on the original. Simple, but that can get very tedious and, as you can tell, is really quite patternistic. If I had a thousand fields, I'd code each field one by one, which would be... There are shorthand ways to copy fields from one struct to another, but I don't really want to get sidetracked. Anyways, instead, someone has created a nice little macro that will actually generate our code for a Clone implementation just by using the annotation. Kind of neat, huh?

But our primitive types of integers and floats didn't need to explicitly call clone. It just implicitly knew to create a new copy of the memory. The way to do that is to derive Copy, and now we can pass in our variable as if it behaves like a primitive type. One small note: Copy requires that you also derive Clone. If you don't include it, it will give you a compile error, and unfortunately, in this case, it doesn't give you enough information to know why it's having an issue. The only clue I see is that the trait definition has a colon and Clone on it, but if I were
scouring the error message, I'm not sure I'd know what was wrong with it off the bat.

You might be wondering, why don't structs automatically implement Copy and Clone if it's that easy? I can think of two reasons, but there are probably more. The first is that the macro will literally create code in the background that eventually has to be compiled, and that takes time. The more you assume is automatically included, the slower things become. In extremely large applications, I'm not sure how much effect this would have on our compile time, but it's not zero. I find it's better to be lean and build up functionality when you need it than to make global assumptions that add bloat. The second, and probably most important, is that not all structs can implement Copy. For example, let's add another field as a String. You'll see that the Copy macro now complains, but it can do Clone just fine. Let me clean up the compile errors, and there we go.

Even though I've discussed Clone and Copy for a while now, I want to get across to you that you should probably still use the borrowing ampersand for many situations. I'm just showing you what's possible in case you need it. I'm going to revert to borrowing the variable for the next discussion.

Sometimes a function or a procedure has the need to mutate a variable being passed in. I'm not a fan of mutating data like this and prefer pure functions whenever possible, but you may find this useful at some point. First, let me make the parameter mutable. I'll make a modification on our parameter. If we didn't have param_a defined as mutable, this would have given us a compile error. The variable will need to be passed in as mutable so that it can be modified by the procedure. And lastly, the variable itself will need to be designated as mutable. All of this ceremony may seem like overkill, but when I'm reading the code, it's crystal clear what the variable represents and whether mutation is likely to happen. So the workflow is that var_1 is created as a DougStruct and is the
owner of that allocated memory. It's designated as mutable so that the memory can be altered at some point. We call some procedure and pass the ownership of the memory to the procedure. It does its work, modifies param_a, which represents the original allocated memory, and then returns the ownership back to var_1 once it's done. If I run this, the last line shows that yes, var_1's a field has been changed from 9 to 15. Success.

So your question might be, why is all this necessary? Yes, this is more work for the developer up front, but it all but eliminates certain memory issues such as null pointers, dangling pointers, and data races. All of those are deadly in non-trivial applications. In Rust, I don't even have to worry about them. It also eliminates the garbage collector. Some people may not care about this one. The effect of a garbage collector is that on occasion your program has to pause so that it can recapture memory not being utilized anymore. If you have a game, for example, it would be necessary to pause everything while it does this job. Even if it's just a fraction of a second, that pause significantly reduces user experience. You're going to get some pretty poor reviews saying that the screen freezes on occasion. With Rust, because you're doing so much to manage your memory at compile time, the garbage collector is not even necessary and doesn't even exist. Those pauses caused by a garbage collector are not even possible in Rust. And parallel processing is a breeze because so much care has been taken to ensure data issues are taken care of at compile time.

Mastery of this topic will take time, and you'll have your fair share of struggles with the compiler. Try to be patient and stick with it. Ownership and borrowing is by far the biggest hurdle to becoming a true Rust developer. Once you have this part down, you're pretty much there. Part of the confusion of learning memory management is that Rust deals with stack variables differently than heap. People often get confused about when the ampersand is needed and when it isn't,
and eventually they get frustrated with the compiler. Hopefully you understand the difference at this point and can work with either fluidly. And remember, even though other languages may make working with data and memory feel easier at first, you still have to constantly write code to handle situations like the "object is null" error. Those languages didn't actually make memory management easier. They just covered it up with a newspaper and pretended the cat didn't dump on the carpet. Rust made bold moves that other languages couldn't, and it was absolutely the right choice.

In a future video, we'll be talking about arrays, vectors, and collections, and how you can work with those to continue this conversation. Thank you for watching this video. I hope you learned some good stuff. My name is Doug Milford from Lambda Valley, and I'll see you next time.
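The captions do not include the code shown on screen, so the following is only a minimal sketch of the main cases the transcript walks through: copy on the stack, move and clone on the heap, borrowing with an ampersand, and a mutable borrow of a struct. Every identifier (stack_i8, heap_i8, heap_f64, DougStruct, param_a, var_1, and the procedure names) is an assumption chosen to mirror what is spoken, and the values other than 9 and 15 are made up for illustration.

```rust
// Hypothetical reconstruction: all names are assumptions mirroring the
// identifiers spoken in the captions, not the video's actual code.

#[derive(Debug, Clone, Copy)] // Copy requires Clone to be derived as well
struct DougStruct {
    a: i32,
    b: f64,
}

// Takes ownership: the caller's Box is moved into `param` and freed on return.
fn heap_procedure(param: Box<f64>) -> f64 {
    *param
}

// Borrows: the caller stays the owner and can keep using its variable.
fn heap_procedure_borrowed(param: &Box<f64>) -> f64 {
    **param
}

// Mutable borrow: the change is visible to the caller afterwards.
fn some_procedure(param_a: &mut DougStruct) {
    param_a.a = 15;
}

fn main() {
    // Stack data: i8 implements Copy, so assignment copies the value.
    let stack_i8: i8 = 10;
    let stack_i8_2 = stack_i8;
    println!("{} {}", stack_i8, stack_i8_2);

    // Heap data: clone makes a second, independent allocation.
    let heap_i8: Box<i8> = Box::new(30);
    let heap_i8_2 = heap_i8.clone();
    println!("{} {}", heap_i8, heap_i8_2);

    // Plain assignment moves ownership instead.
    let heap_i8_3 = heap_i8;
    // println!("{}", heap_i8); // error[E0382]: borrow of moved value
    println!("{}", heap_i8_3);

    let heap_f64 = Box::new(1.0_f64);
    println!("{}", heap_procedure_borrowed(&heap_f64)); // borrow
    println!("{}", heap_f64); // still usable after the borrow
    println!("{}", heap_procedure(heap_f64)); // move: heap_f64 is unusable after this

    let mut var_1 = DougStruct { a: 9, b: 10.0 };
    let copy_of_var_1 = var_1; // allowed only because DougStruct derives Copy
    some_procedure(&mut var_1);
    println!("{:?} {:?} {}", var_1, copy_of_var_1, var_1.b);
}
```

Taking `&Box<f64>` follows the transcript's heap f64 example; in everyday code a plain `&f64` parameter would usually be preferred, for the same reason the speaker recommends string slices over `&String`.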
Info
Channel: Doug Milford
Views: 54,820
Keywords: Rust, Training, Programming, Language, Own, Owns, Ownership, Borrow, Borrowing, System, Tutorial
Id: lQ7XF-6HYGc
Length: 38min 20sec (2300 seconds)
Published: Sun Nov 17 2019