Understanding Ownership in Rust

Captions
Welcome back to Let's Get Rusty. My name is Bogdan, and this channel is all about the Rust programming language. If that sounds interesting to you, hover over that subscribe button and give it a fist bump. Last time we went over chapter 3 of the Rust book, in which we covered basic programming concepts in the context of Rust. If you haven't already, check that video out. In this video we're going to be talking about chapter 4, and this is a very special video because chapter 4 covers Rust's most unique feature: ownership. Ownership is what allows Rust to make memory safety guarantees without the use of a garbage collector. We'll also cover references, borrowing, the slice type, and how Rust lays out data in memory. So with that, let's get started.

First, we'll answer the basic question: what is ownership? The ownership model in Rust is a way to manage memory. Now you might ask, why do we need a way to manage memory? To understand, it's helpful to look at the two other solutions for managing memory today.

First we have garbage collection. If you've ever written an app using a higher-level programming language such as Java or C#, you didn't have to worry about memory management because the garbage collector did it for you. This approach has some pros and cons. The first pro is that it's error-free, meaning that if you were to manage memory yourself you might introduce memory bugs, but since that's being handled by the garbage collector, you can be pretty sure that there aren't any memory issues. I put an asterisk because garbage collectors could have bugs themselves, but for the most part you can rest assured that your memory is being managed safely. The second pro is faster write time: because you don't have to deal with memory, you can write your programs faster. Now let's look at the cons. Firstly, we're giving up fine-grained control of our memory, because the garbage collector now handles all our memory. Secondly, our runtime performance can be slower
and more unpredictable: slower because we can't manually optimize our memory, and unpredictable because the garbage collector could choose to clean up memory at any point in time, and when it does so it slows down our program. Lastly, we have a larger program size, because the garbage collector is a piece of code that we have to include within our program.

Now let's look at manual memory management. If you've written C or C++, you have to allocate and deallocate memory manually. The pros of this are full control over your memory, which leads to a faster runtime because you can optimize things, and a smaller program size because you don't have to include a garbage collector. The cons are, first, that it's extremely error-prone: many bugs and security issues are caused by incorrect memory management. Secondly, you have a slower write time: because you have to think about memory, it takes longer to write your program.

Notice here that the pros and cons of garbage collection and the pros and cons of manual memory management are exact opposites; we're making opposite trade-offs. Either of these solutions could be appropriate depending on the context. If you're writing a high-level app like a website, it makes sense to sacrifice runtime performance and accept a larger program size for the ease of use and faster write time you get with a garbage collector. On the other hand, if you're writing low-level system components, then you probably care a lot more about runtime performance and program size, so it would make more sense to use manual memory management.

Now we can talk about the ownership model, which is a third way to manage memory. Rust is a systems programming language, so it does care about runtime performance and program size, and as you can see here, we get all the benefits of manual memory management: control over our memory, faster runtime, and smaller program size. Rust, however, is also a memory-safe language, so we can't use manual memory management because, as we said before,
it's error-prone. As you can see here, the ownership model is error-free, and the way Rust achieves this is by doing a bunch of compile-time checks to make sure you're using memory in a safe way. I put an asterisk here because even though Rust is memory safe by default, it does allow you to opt out of memory safety with the unsafe keyword, but that's meant to be used sparingly. As you might know, everything in software is a trade-off, so the ownership model gives us memory safety, but the con is that we get a slower write time, slower even than manual memory management. That's because Rust has a strict set of rules around memory management, and if you break those rules you'll get compile-time errors you have to deal with. This is sometimes known as fighting with the borrow checker, and it can be frustrating, but as with anything, with time you'll get better and things will become easier. The big idea here is that this trade-off is worth it: it's worth spending the time up front dealing with the borrow checker so you don't have to spend hours and hours later debugging runtime memory issues.

Now, because Rust is a systems programming language, it's important for us to know how our memory is laid out during runtime. Rust makes certain decisions based on whether our memory is stored on the stack or on the heap, so next we'll briefly go over what the stack and heap are.

During runtime our program has access to both the stack and the heap. The stack is fixed in size and cannot grow or shrink during runtime. The stack also stores stack frames, which are created for every function that executes, and the stack frame stores the local variables of the function being executed. The size of a stack frame is calculated at compile time, so the variables inside a stack frame must have a known, fixed size. Variables inside a stack frame also only live as long as the stack frame lives. So, for example, in this program a gets executed first, so we push a onto the stack; then a executes b, so we push the stack frame for b
onto the stack. Now when b finishes executing, it gets popped off the stack and all of its variables are dropped. Then when a is done executing, all its local variables are dropped. The heap, on the other hand, is less organized. It can grow or shrink at runtime, the data stored in the heap can be dynamic in size, it can hold large amounts of data, and we control the lifetime of the data.

Let's go back to our example. First we execute the function a, which creates a new stack frame. a initializes the variables x and y. x is a string literal, which is actually stored in our binary, so in the stack frame x will be a reference to that string in our binary. y is a signed 32-bit integer, which is a fixed size, so we can store y directly in the stack frame. Then we execute the function b, so another stack frame is created, and b creates its own variable named x. This x is of type String, which can be dynamic in size, so we can't store it directly on the stack. Instead we ask the heap to allocate memory for the string, which it does, and then the heap passes back a pointer. The pointer is what we actually store on the stack.

Note that pushing to the stack is faster than allocating on the heap, because the heap has to spend time looking for a place to store the new data. Also note that accessing data on the stack is faster than accessing data on the heap, because with the heap you have to follow the pointer. I know that was a very brief explanation, so if you're still confused or want a more thorough explanation, I put a link in the description to a video that goes into a lot more detail on the stack and heap, so make sure to check that out.

All right, let's get back to ownership. Before we go any further, there are three ownership rules that are crucial to remember. Write these down, put them in a Word doc, get them tattooed, whatever it takes. The rules are: one, each value in Rust has a variable that's called its owner, so one variable, one owner. Two, there can only be one owner at a time, so a value cannot
have two owners at the same time. And three, when the owner goes out of scope, the value is dropped.

As an example, I created a new scope here using the curly brackets, and inside I defined the variable s. Here s is not valid because it's not declared yet. Then we declare s, and it's valid from this point forward; we can do things with s. Then when the scope ends, s is invalidated and Rust drops the value. s here is a string literal, and as I've mentioned before, string literals are stored directly in the binary and are fixed in size. So what if we wanted a string that's dynamic in size and that we could mutate? We would have to use the String type. I went ahead and converted s to the String type, and now our string is stored on the heap. In programming languages such as C++, to allocate memory on the heap you would have to use the new keyword, and then you would have to deallocate that memory using the delete keyword when you're done with it. In Rust this happens automatically. When we declare s here, Rust automatically allocates memory on the heap for our string, and then when the scope ends, s is invalidated and Rust drops our value, meaning that it deallocates the memory on the heap automatically.

Next, let's talk about how variables and data interact. Here we have two variables, x and y. x is set to 5 and y is set to x, and as you can tell by the comment, this is going to do what you might expect, which is copy the value 5 into y. Let's look at a more interesting example. Here we have the variable s1, and we set that equal to a String. On the right-hand side you can see what s1 looks like under the hood: we have a pointer that's pointing to the actual memory location on the heap, we have a length, which is the length of the string, and we have a capacity, which is the actual amount of memory allocated on the heap for our string. On the next line we declare s2 and set that equal to s1. So what would we expect to happen in this situation? Some might expect the value to be cloned, like we see on the
right-hand side: s1 is pointing to a string on the heap, and s2 is another pointer pointing to a new string on the heap. But this is not what happens, as it would be very expensive to create a new string on the heap. Others might think that we do a shallow copy, so s1 has a pointer that points to hello on the heap and s2 has a pointer that points to the same hello on the heap. This is not quite what happens either, because to ensure memory safety Rust invalidates s1. So instead of being a shallow copy, this is called a move.

Going back to our program, let's try to print s1 after it's been moved into s2. We'll call cargo run, and as you can see, we get a compile-time error which says s1 was moved here and then we tried to use s1 after it was moved. What if we did actually want to clone the string instead of moving s1? Rust has a common method for that: instead of setting s2 equal to s1, we'll set s2 equal to s1 with the clone method called on it. Now we can run our program and it compiles successfully. So Rust defaults to moving a value, and if you want to perform the more expensive clone operation, there's a method for that. There's one other detail here. Up above, when we set y equal to x, this did a copy, not a move. Rust has a Copy trait, and simple types stored on the stack, such as integers, booleans, and characters, implement this trait, which allows those types to be copied instead of moved.

Next, let's talk about ownership and functions. Here we have a variable called s that's a String and a function called takes_ownership. The takes_ownership function takes in some string as a parameter and then prints it out. In main we try to print the variable s after we call takes_ownership, but we get an error, and the error states that s cannot be borrowed after a move. That's because passing a parameter into a function is the same as assigning s to another variable. So here, passing in s moves s into the some_string variable, then some_string gets printed out, and after this scope is done, some
string gets dropped.

Let's look at another example. We have the variable x here, which is an integer, and a function called makes_copy. makes_copy takes in an integer and then prints it out. Here you can see we pass in x, but instead of being moved, remember, integers are copied, so it gets copied into the some_integer variable and printed out, and we can still use x after the function call.

This also works the opposite way. Here we have a variable s1 that's equal to the return value of gives_ownership. gives_ownership is a function that returns a String, and here you can see we create a string called some_string and then we return it. Returning the string moves the ownership of the string to the s1 variable, and then we can use it afterwards. Lastly, we could take ownership and give it back. For example, here we have a variable called s2, which is a String, and we pass that into a function called takes_and_gives_back. takes_and_gives_back takes in a String argument called a_string and then just returns it. So here we're moving the value of s2 into the function, and then we're returning the string, which moves the value back out of the function into s3.

Moving ownership into a function and back out is tedious. What if we just wanted to use a variable without taking ownership? That's where references come in, and that's what we'll talk about next. To understand references, let's see how they could fix the following situation. Here we have a function called calculate_length. What we want to do is take in a string and return the length of that string; however, we don't want to take ownership of the string. The solution here is to return a tuple that contains the string and the length of the string. Up here you can see we assign the string to s2 and the length to a variable called len. You can also see that this looks very strange and is probably not something you would ever want to write. To fix this, let's first modify our calculate_length function. First we'll change the
return type to be just the length of the string. Next, we'll return just the length. Up above, we'll get rid of s2 and print s1 down here. As expected, we get an error, because we're borrowing s1 after it's been moved into the function. To fix this error, instead of calculate_length taking a String, we'll make it take a reference to a String, and we do that by adding an ampersand before the String type. Next, in the main function, instead of passing in a String, we'll pass in a reference to a String by using the ampersand again. Great, now we have no errors, and that's because s is a reference to a String, and references don't take ownership of the underlying value.

Here in the Rust book you can see a diagram of what a reference looks like. s is the reference, and it points to s1, which actually points to our string on the heap. s is a local variable inside calculate_length, and when the function finishes executing, s is dropped. But that's okay, because if we drop s here we still have s1 pointing to our string. Passing in references as function parameters is called borrowing, because we're borrowing the value but we're not actually taking ownership of it.

Also note that references are immutable by default, so if we try to modify s here, you can see that we get an error which says we cannot borrow the value as mutable. Here I have another example. We have a string and a function called change, which takes a reference to that string and then attempts to modify it. Again, references are immutable by default, so we can't modify the value. But let's say we actually did want to modify the value without taking ownership of it. To do that, first we'll need to make s1 a mutable variable. Next, instead of passing in a reference, we're going to pass in a mutable reference. Finally, we'll have the change function take a mutable reference. Now the change function is able to mutate our string without taking ownership of the underlying value. Mutable references have a big restriction, though, which is that you can only have one
mutable reference to a particular piece of data in a particular scope. For example, we have a string here and a variable called r1, which is a mutable reference to the string. Now let's add another mutable reference and print out both references. Here we have r2, which is another mutable reference to the string, and then a print statement that prints out both r1 and r2. You can see we get an error here; the error states that you cannot borrow s as mutable more than once at a time. The big benefit of this restriction is that Rust can prevent data races at compile time. A data race occurs if we have two pointers pointing to the same piece of data, one of those pointers is used to write to the data, and there's no mechanism to synchronize data access between those pointers. In that situation you could imagine that one pointer will try to read the data in the middle of the other pointer modifying the data, and in that case we'll get corrupt data back. To fix this error, we can switch these references back to be immutable references, and now we don't have any errors.

But what happens if we mix immutable references with mutable references? Let's go ahead and add another variable. We'll also go ahead and make s mutable, but as you can see, we still get an error. Here we're running into another restriction: you can't have a mutable reference if an immutable reference already exists. Immutable references don't expect the underlying value to change, and this is problematic if we do have a mutable reference. You can, however, have multiple immutable references. It's okay to have multiple immutable references because the underlying data is not going to change.

Note that the scope of a reference starts when it's first introduced and ends when it's used for the last time. So, for example, r1 is introduced here, so the scope starts here, and it's used for the last time in this println statement, so the scope ends here. This means that we can add a third reference, this time a mutable one, underneath this println statement, and
we'll make s mutable, and this works just fine, because at this point r1 and r2 are out of scope, so we can declare a mutable reference.

Next, let's talk about dangling references. What happens if we have a reference that points to invalid data? Here we can see we have a variable called reference_to_nothing, and it's set to the return value of dangle. dangle is a function that returns a reference to a String. Inside dangle we have a variable called s, which is set to a String, and then we return a reference to that string. However, s is defined within the scope of this function, so when this function finishes executing, Rust will drop our string, deallocating it from the heap. Therefore our reference would be pointing to invalid memory. But as you can see, Rust prevents this from happening, because we get an error here. If we hover over it, we see that the error states that this function's return type contains a borrowed value, but there is no value for it to be borrowed from. We don't have a value for it to be borrowed from because the value gets dropped. We also get a recommendation to use something called a lifetime. Lifetimes are something we'll talk about in chapter 10, but for now we can ignore them. The point here is that, again, Rust prevents us from doing something that's memory unsafe.

To wrap up references, let's review the rules of references. Number one: at any given time, for a particular piece of data in a particular scope, you can have either one mutable reference or any number of immutable references. Number two: those references must be valid, meaning the data they point to must be valid.

The last thing we're going to talk about is slices. Slices let you reference a contiguous sequence of elements within a collection instead of referencing the entire collection, and just like the references we covered in the last section, slices do not take ownership of the underlying data. To understand why slices are useful, let's start with a problem. Imagine we have a function we want to
write called first_word. It takes in a reference to a String, because we don't want to take ownership of that string, and then it returns the first word in that string. Looking at the function signature, what would our return type be? We don't have a way to return part of the string, but maybe we can return an index to the end of the first word. That implementation will look something like this. Here we're returning an index to the end of the word, and inside our function we take the string and convert it into an array of bytes. Then we use a for loop: we take the bytes, call the iter method, and then call enumerate so we can get the index of each item and the item itself. Then for every item we check if it equals a space, which signifies the end of a word, and if that's true we return that index. If we go through this entire loop and don't find a space, then the entire string is one word, so we can return the length of the string.

There are two problems with this implementation. The first is that the return value is not tied to the string, and here's what I mean. In main we have a string under the variable s, we pass that string to first_word, and we store the return value in a variable called word. In this case our return value is 5, because our first word is hello: h is at index 0, then we have e at index 1, the two l's at index 2 and 3, o at index 4, and the space is at index 5.
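The index-returning version described above can be sketched roughly as follows. This is a minimal reconstruction from the spoken description, not the exact code shown on screen; the literal "hello world" is inferred from the indices the narration walks through, and the printed messages are assumptions added for illustration.

```rust
// Returns the index of the first space in `s`, or the length of `s`
// if it contains no space (the whole string is one word).
fn first_word(s: &String) -> usize {
    // View the string as a slice of bytes so each byte can be compared.
    let bytes = s.as_bytes();

    // enumerate() yields (index, item) pairs for every byte.
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }

    s.len()
}

fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s); // 5: the index of the space after "hello"
    println!("first word ends at index {}", word);

    // Emptying the string does not touch `word`; it is just a number
    // with no connection to the string it was computed from.
    s.clear();
    println!("after clear: word = {}, s.len() = {}", word, s.len());
}
```

Because the return type is a plain usize, the compiler has no way to know that the number is only meaningful while the string keeps its contents.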
Now the problem is that on line 4 we call the clear method on our string, which makes it an empty string. So now our variable word is still 5 even though the string is empty, and again, this is because our return value is not tied to the string itself. This means we have to manually keep the return value in sync with the string, which is extremely error-prone. The second problem comes when we want to change the implementation. Imagine that instead of the first word we wanted to return the second word. In that case we would have to return a tuple with two values, an index to the start of the word and an index to the end of the word, and now we have more values that we need to keep in sync with the string.

To get around these issues, let's introduce the string slice. Here, underneath our string, we've defined two string slices, hello and world. They look very similar to string references, except we have this bracket syntax with a range inside. This is saying: give us a reference to the string, but we only want part of the string. Specifically, we want the part of the string starting from index 0 and ending at index 4.
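As a rough sketch of the two slices being described, assuming the string is "hello world" and the ranges are 0..5 and 6..11 as the narration states (the helper name hello_and_world is hypothetical and exists only so the slices can be returned and checked):

```rust
// Hypothetical helper: borrows two parts of `s` without copying them.
// It assumes `s` is at least 11 bytes long; a shorter string would panic.
fn hello_and_world(s: &String) -> (&str, &str) {
    let hello = &s[0..5]; // bytes at index 0 through 4
    let world = &s[6..11]; // bytes at index 6 through 10
    (hello, world)
}

fn main() {
    let s = String::from("hello world");
    let (hello, world) = hello_and_world(&s);
    println!("{} {}", hello, world);
}
```

Both slices point into the same heap allocation that s owns; no new string is created.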
The 5 is exclusive here, and that will give us the word hello. On the second line we're doing something similar, but from index 6 to 11, which will give us world. The Rust book has a nice diagram to show us what's going on. Here we have s, which is our String, and you can see it has a pointer pointing to the string on the heap. Then world is our string slice, a reference pointing to the same string on the heap but starting at index 6.

We can simplify these string slices a bit further. If we're starting from the beginning of the string, we can omit the zero, and if our range continues to the end of the string, we can omit the 11 as well. Lastly, if we want a string slice to span the entirety of the string, we can omit both the first and second value. In this case world is a string slice that will reference the entire string.

Now that we know about string slices, let's modify our first_word function to take advantage of them. The first thing we'll do is change the return type from the index to a string slice. Next, if the first word is found, instead of returning the index we'll return a string slice. Here we're returning a string slice from the beginning of the word to the index where the space was found. Lastly, at the end, instead of returning the length of the string, we'll return a string slice of the entire string. Now our word variable is a string slice, which is tied to the string itself, and to prove that, let's try printing out word after we clear the string. Here we can see we get an error: cannot borrow s as mutable because it is already borrowed as immutable. word is a string slice, which is an immutable reference to the string, and here we call clear, which mutates the string. That means we need a mutable reference, and if you recall, we cannot mix mutable and immutable references in the same scope.

Let's get rid of the code that's erroring, and we'll create a new variable called s2 and set it to a string literal, hello world. As you can see, string literals are
actually string slices. As I mentioned before, string literals are stored directly in the binary, and s2 here is actually a string slice pointing to that location in the binary. Now let's say we wanted our first_word function to also work with string literals. In that case it would have to take in a string slice, so let's change that. Notice that our call to first_word here kept working. This is because our String reference gets automatically coerced to a string slice, and now we can pass in s2 and our function still continues to work. Notice also that strings have the type String with a capital S, and string slices have their type written as ampersand lowercase str (&str).

We can also have slices on different types of collections. For example, we have an array here. Let's say we want to create a slice of this array, so we'll create a new variable called slice and set that equal to a portion of the array a. Here you can see we use a similar syntax: we have the ampersand, a for the array, and then brackets, and here we use a range from 0 to 2. So our slice references the first two values in the array a, and you can see that instead of the type being ampersand str, it's ampersand square brackets i32 (&[i32]), because the array stores a list of signed 32-bit integers.

And there you have it, chapter 4 of the Rust book complete. We learned about ownership, memory management, references, borrowing, the slice type, and much more. If you liked this video, make sure to give it a thumbs up, and if you want to see more Rust content, make sure to hit subscribe and the notification bell so you can be notified when the next video comes out. I'll see you in the next one.
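For reference, the final slice-based first_word and the array slice described in the last part of the transcript might look like the sketch below. It is reconstructed from the narration rather than copied from the screen; in particular, the array contents [1, 2, 3, 4, 5] are an assumption, since the video's values are not stated in the captions.

```rust
// Takes a string slice so it works with both &String (via deref
// coercion) and string literals, and returns a slice tied to the input.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            // Everything from the start up to, but not including, the space.
            return &s[..i];
        }
    }

    // No space found: the whole string is one word.
    &s[..]
}

fn main() {
    let s = String::from("hello world");
    let s2 = "hello world"; // a string literal is already a &str

    let word = first_word(&s); // &String is coerced to &str
    let word2 = first_word(s2);
    println!("{} {}", word, word2);

    // Slices also work on arrays; the element values are assumed here.
    let a = [1, 2, 3, 4, 5];
    let slice: &[i32] = &a[0..2]; // the first two elements
    println!("{:?}", slice);
}
```

With this signature, calling s.clear() while word is still in use is rejected at compile time, which is the error the transcript walks through.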
Info
Channel: Let's Get Rusty
Views: 17,238
Rating: 4.9831223 out of 5
Keywords: rust tutorial, systems programming, rust lang, rust programming, rust programming language, rust programming tutorial, learn rust, rust language, rust programming for beginners, rust programming language tutorial, rust programming language course, rust programming tutorial #1, rust programming full tutorial, learn rust programming, learn rust programming language, learn rust 2020, rust tutorial 2021, system programming linux, rust programming project
Id: VFIOSWy93H0
Length: 25min 30sec (1530 seconds)
Published: Sun Feb 07 2021