Rust and RAII Memory Management - Computerphile

Captions

Previously you've done a video about garbage collection. We need resources for our programs to run, but if we hang on to those resources for too long that's a problem, because no one else can use them and eventually we run out of memory. We covered garbage collection before, which is one way of solving that problem. The difficulty with it, as was mentioned in that video, is that garbage collection itself takes memory and CPU time to run, which isn't great: if you're trying to run something really fast, you want to reduce the amount of resources you're using as much as possible.

So what we're going to talk about today is a concept called RAII, "resource acquisition is initialization", and specifically we're going to talk about how a language called Rust has built that into the language in order to do the most it possibly can to stop the programmer making mistakes with memory. It's not a niche language, but it's probably not as well known as some others, so we could show a brief hello world if you want.

If I just make a new file here, hello.rs, we'll start with the main function.

So there's a bit of C-like then, is it a little bit C-like?

Yeah, it definitely wants to borrow from that general type of C, Java, that general syntax. So if I just print "hello world" here and close it there, you can see that the basic structure of it is going to look very familiar. We use fn to declare a function instead of starting with a return type; if I wanted to return something, that would come after the function name. But the basic syntax of it doesn't look too different.

The first place where it starts to look a little bit different from languages like C and C++ is the way we declare variables. Instead of putting the type first, then the variable name, then the value, we use this keyword let with a variable name, then I'm going to specify the type, so we're going to have an i32, and let's give it the value 10.
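The hello world and the variable declaration just described might look roughly like the sketch below. The helper names greeting and ten are hypothetical, added only so the pieces can be checked; they are not from the video. The return type sits after the parameter list, following an arrow.

```rust
// `greeting` and `ten` are hypothetical helpers for illustration only.

// `fn` declares a function; the return type follows `->` after the parameters.
fn greeting() -> &'static str {
    "Hello, world!"
}

fn ten() -> i32 {
    // `let`, then the variable name, then the type, then the value.
    let x: i32 = 10;
    x
}

fn main() {
    println!("{}", greeting());
    println!("x = {}", ten());
}
```

Saved as hello.rs, this should build with rustc hello.rs, assuming a standard Rust toolchain is installed.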
So the syntax is a little bit different compared to what you're used to in C and Java, but nothing too different so far. The first interesting bit, the very first place where Rust starts to try and stop you from making mistakes, is the fact that if you want to be allowed to change a variable you have to explicitly tell it you want to do that. There's a distinction between a normal variable and one that you're allowed to mutate. If I want to mutate a variable I have to explicitly say let mut, and let's have y, an i32, be 20. Some other languages do this as well: in Kotlin you've got a difference between vals and vars, and in TypeScript you have let for a mutable variable and const for one that you can't change. A few other languages are picking this up as well, but that's the first place where Rust is going to start trying to help you avoid making mistakes. If you want to change a variable you have to explicitly let it change, and that's going to help guarantee that your variables have got the values you think they have.

I'm going to go over to C++ first, because it's going to be easier to demonstrate, particularly to people who haven't seen Rust before, and then we'll switch over to showing how Rust has embedded that in the language. In C++ you have to do it yourself; in Rust it's pretty much built in.

The basic idea of RAII... I don't know if there's an established way of shortening that. You could say "R-A-I-I" or "ray" or "rye". You know what, I'm going to call it "ray"; if that's wrong, doubtless I'll find out in the comments. So the basic idea of RAII is that when you construct an object, you allocate the memory you need for that object at the time you create it, and then you make sure it's destroyed when the object is destroyed. The idea is that the lifetime of the memory you request is tied to the lifetime of the object that's using it. It means you don't have to worry about freeing it elsewhere. It means you
don't have to worry about allocating it elsewhere. When I create my objects I allocate my memory; when I destroy my objects my memory gets freed for me. It's a load off my mind as a programmer: it means I have much less to worry about with dangling memory.

So if I do a brief example in C++, let's have a class called Bob, and Bob is going to hold on to some integers for me. So it creates a class variable called n, which is a pointer to some integers. Now we'll have the constructor, which is what we're going to call to initialize Bob, and we're going to pass the number of integers I want Bob to hold as an argument, and then we can say this->n = new int[x]. OK, so I'm saying that when I create a new Bob I'm going to allocate enough memory to hold x integers. In C that would be a call to malloc; C++ has a keyword to do that, so it makes it slightly nicer to program, but it's pretty much doing the same thing underneath.

Then in my destructor, which is something that C++ has (a destructor is a function that gets called whenever the variable gets freed or goes out of scope), I am going to delete this->n, and finish my class.

So what we've just done there: as soon as I create a Bob, I'm going to allocate enough memory to store all of Bob's numbers, and when Bob goes out of scope or Bob gets deleted, Bob releases all of the memory he was using to hold his numbers. That's really great, because it means that I don't have to keep track of Bob's memory anymore. I don't have to remember that Bob's holding on to a load of integers. As soon as Bob goes out of scope, Bob drops his memory. It reduces the chance of memory leaks. That's awesome.

The one slight drawback, obviously, is that I still have to implement that myself. I still have to write that myself, so it still puts some of the work on me as a programmer to remember to make sure I've done that properly. It doesn't save me the problem; it just means I'm only doing it in one place.

And just sometimes people cut corners with that?
I don't know if people will cut corners, but it's definitely possible. Maybe Bob's holding a lot of different kinds of memory for different things, and I just forget, I miss one of those in the process. Or, this works great so long as absolutely every class uses this pattern. If I forget to do that for something and I just use a plain old pointer somewhere, then memory leaks can still happen. If done properly, it makes it much harder to get memory leaks, but it's still contingent on me as a programmer doing my job properly and implementing it properly. Mistakes can still happen.

One of the biggest, actually, is that this stops only one of the problems with memory management. It stops the problem of a memory leak, which is where I forget to free something. It doesn't solve the problem of what you call a dangling pointer, which is where I try and access memory after it's been deleted. This doesn't stop that.

So if I add a method to Bob, that's going to be int* get_n(), and get_n is just going to return that pointer to the memory that Bob's holding. Maybe I want to use that somewhere else. Then in my main function, let's have int main. I'm going to make a new Bob, b, and it's going to hold five numbers. Then I'm going to have an int pointer that I'm going to call xs, which is going to be b's get_n. Then I'm going to delete Bob and, for good measure, printf the first number in xs. And I should probably include the right header as well to make that compile.

So what we've just done is we've created a Bob. Bob allocated his memory. We then got a pointer to that memory so that I can do something else with it. Then I deleted Bob, and Bob was very good: Bob uses RAII and deletes his memory. The problem is I've still got another pointer to that memory. When I try and print out the first element of that memory, it might work, it might not. This is a very simple example; I've not done anything else to the memory in between, so this will probably work. If this was a more complicated
program, someone else might have used that memory, and then at best I'm going to get garbage; at worst I might crash. So RAII solves one of the problems with memory management, but it doesn't fix any of the others. I've still got to be thinking carefully: is my memory still in use? Am I still allowed to use it? That's where Rust comes in, and that's where Rust tries to hold your hand a bit and stop you from making these kinds of mistakes.

So let's write a simple bit of Rust that's hopefully going to showcase these same sorts of things, and then I'll explain some of the concepts behind how it's working. Firstly, Rust doesn't have classes as such; it just uses structs. That's probably not too important, but let's have a struct called Bob, and Bob is going to have n, which is a vector of i32s. So, same as the C++ version, Bob is going to hang on to some number of numbers for me. It's got a vector, and then when I create a new Bob, let's just say we're going to return a Bob where n is a new vector. So I'm making a Bob, and when Bob is made he's going to allocate some memory on the heap for me to store his numbers. And that's pretty much all I have to do to make sure that Bob's memory always remains valid.

The way this works: every variable in Rust has something called an owner. Every piece of memory has what's called an owner, and, as you'd imagine from the name, the owner is the one who has control of that memory. Every bit of memory has exactly one owner and, like we just talked about with RAII, as soon as the owner gets deleted, goes out of scope or whatever, all the memory that the owner owns gets destroyed. So there's our RAII principle, and that's just baked into the language.

If I now write a main function for my Rust program here, let's have let n = Bob::new(), then I will end the function there. I mean, this is a main function, so the program is finished at this point, so everything gets freed. But you know what, let's call
this func1, and let's then call func1 from a different main function, just to make the case. There we go, we'll do it like that. So n was created inside func1, which is our new Bob. Bob allocates memory for his numbers. Once func1 returns, our variable n has gone out of scope, so n gets deleted, and the memory that Bob was holding gets deleted at the same time. Fine, that's RAII.

This all hinges on maintaining this property of one bit of memory having one owner. We need to make sure that that stays true, because the moment a bit of memory has two owners the whole system collapses. That's where we ended up, if we go back to the C++ for a moment: Bob had a pointer to that block of memory that we called n, and then I also got a second pointer to the same block of memory. Our variable b owned that data, our variable xs also owned that data, and that's where the problem came in. So Rust needs to stop us from doing that. There are a few concepts that Rust uses that we want to talk about.

The first thing we need to talk about is moving. Let's go back to func1. n is my new Bob, and maybe I want to get a second variable that refers to Bob. So I'm going to say let m = n. I'm allowed to do that: m is a new variable, and I'm going to assign it to the same thing as n. That might look like m and n are now looking at the same memory, but the first thing that Rust is doing behind the scenes is what's called a move, because we're only allowed one owner for a piece of data.

So it's sort of made a copy of it?

Quite the opposite: it has absolutely not made a copy, because that would mean... well, you can get a copy, but what it's done in this case is that m is now the owner, and n is not allowed to be used anymore. If I was to try now and do something with n, the compiler would tell me that I can't use n because the data has been moved out of it. It's guaranteeing that at any time there's only one owner for the data. And that applies to function calls as well. If I'm going to write, let's
write func2, and that's going to take an argument of type Bob, and we're just going to println b.n, and we'll have the first element of that. In our func1, let's say we'll call func2 with our Bob variable. In other languages, when you pass something as an argument to a function, either it's going to make a copy of it or it's going to give that function a reference to it. But this isn't quite what we want: either we're going to end up with a duplicated version of the data, or we've got two owners. With a function the same thing applies: our Bob gets moved into the function. So now my b variable that I defined in the function owns that Bob, and in func1, if I try and do something with n, try and do the same thing I was doing in func2, the compiler will tell me: can't do that, n has been moved. So there's the first step where we're guaranteeing that our memory stays safe.

So that's great: one owner at a time. That makes sure that when my variable goes out of scope, when the owner goes out of scope and gets deleted, I know for a fact no one else is using it. There are no dangling references to that data. Perfect.

The problem is, sometimes I do want to keep using my data after I've moved it. Maybe func2 does something to n, but then I still want to keep using n. So that's when the second part of this comes in, which is this business of borrowing. In Rust you have this idea where you don't take ownership of the data, but you say: just for a moment I'd like to do something with it. I'm not going to take ownership of it, it's still your data, but I just want permission to look at it or maybe tweak it for a bit, and I'm going to give it back to you. That's what we use borrowing for, and the way we do that in Rust is through a syntax called referencing. So I'm going to edit func2 now so that, instead of taking ownership of my Bob, it's just going to borrow Bob for a moment and then give it back. The way we represent that in Rust: we just stick an ampersand in front of
the type, and that says that func2 gains a reference to a Bob; it doesn't take ownership of the Bob. Then I put an ampersand operator in front of the n as well in func1, and that says: don't pass n into func2, pass a reference to n into func2. So now func2 has borrowed Bob. It can do whatever it wants with Bob, but when the function returns, func1 still has ownership of Bob, because func2 just borrowed him; it didn't take control of him completely. And then if I was to compile this func1 now, it would compile absolutely fine, no issues. So between this moving and borrowing we make sure we've got our RAII principles baked into the language, and we make sure we don't have any dangling references.

So presumably if something's borrowed, like any good library book, you need to give it back at the end. Is that how it works?

Yes, exactly. The borrow gets given back at the same time as it would be deleted. So at the end of func2, func2 does its thing, it's done, and that gets given back. It's slightly different though, and this is the last thing we'll talk about, because I introduced at the very start the idea that in Rust, if you want something to be mutable, you have to explicitly say so, and that plays into referencing as well. You just talked about library books: if I borrow a library book, eventually I've got to give it back.

Or you get fined.

Or you get fines, which in this case I guess means the compiler would just refuse to compile. I guess the analogy is slightly different, because so long as I'm not modifying the thing I've borrowed, multiple different functions could borrow the same value. If I introduce a function called func3...

A bit like them being able to read it but not to write to it?

Exactly, yeah. If I just borrow like normal, I can read the data. So here I'm going to println, and I'll print the second element of n. Now, I mean, everyone's going to tell me I haven't put
anything in n yet, so technically this wouldn't work, but you get the point. That's func3. So instead of passing a reference directly, let's have n1 as a reference to n and n2 as a reference to n, and then I could call func2 with n1 and func3 with n2, and Rust is going to be perfectly happy with that, because neither n1 nor n2 is trying to modify n. They can both borrow it and it's absolutely fine.

What Rust doesn't let you get away with, though, is a mutable borrow alongside those. If I want to modify a piece of data that I'm borrowing, I'm allowed to do that. Let's say func2 is now going to modify Bob. I'm going to tag that mut keyword onto the end of the reference to tell Rust that I want a mutable reference, and then I could do stuff. Let's reassign n in here, for example. So func2 is going to modify Bob, and that's allowed: I'm allowed to borrow something and change it so long as I say so. And let's make n1 a mutable borrow. But what Rust is going to tell me is that I can't have a mutable reference to the data and an immutable reference to the data at the same time. You can imagine that situation: if I've told my function that it's getting an immutable reference, I'm telling it that this data is not going to change. You're not allowed to change it, and I'm guaranteeing to you that your data is going to stay the same. If I have a mutable reference and an immutable reference at the same time, you can see how that causes problems. It's a slightly different type of problem: this is a problem where the memory holds something different from what you think it should. And so again the compiler is going to stop you doing that. It's going to say no, you're being naughty, you can't have an immutable reference and a mutable reference to the same bit of data at the same time. So it's protecting you from accidentally modifying data that another bit of your program assumes is constant.

Really we've just touched on the main way that you'll work with memory in Rust, this business of moving and
borrowing. There are other bits to it. If you want to do it in the old C and C++ way of manually allocating exactly what you need and manually freeing it when you need, you can do that. Rust supports that; it just forces you to put it in a special block that's marked unsafe, to make sure that you know, as a programmer, that you're taking 100% control of the memory. There are also constructs that you can use if you want that reference-counting, garbage-collection style of thing. That's particularly useful if you're sharing memory between multiple threads; you have constructs available that'll do that as well.

I think this is one of the reasons why people like Rust, and it's certainly why I like it: it gives you this whole range of options for how to work with memory. Borrowing and moving is great for the vast majority of cases. Sometimes, actually, I want reference counting, because sometimes I do want several different people to take ownership of this data, and in that case I can introduce some reference counting and it will take care of it. It'll be a bit slower, but I can do it. Other times maybe I want precise, fine-grained control of when memory gets allocated and freed, because maybe I know that I can free up some data halfway through an object's lifetime, and I can do that as well. I can open an unsafe block and start using pointers directly. Rust lets me do that as well.

Saves a whole lot of complexity in writing the code, for not really much downside.

Stairs is red and football is great. It's a terrible example.

Because if you wanted memory you had to ask for it and release it yourself. So there are two primitives here. One's called malloc, and malloc says give me some quantity of
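The ownership, move and borrow rules discussed in the captions above can be sketched in Rust as follows. This is a minimal illustration, not the code shown in the video: the function names consume, first, second and reset are hypothetical stand-ins for the transcript's func2 and func3, and the vector is pre-filled with values (an assumption) so that indexing cannot panic.

```rust
// Sketch of the `Bob` struct from the captions: it owns a vector of i32s.
struct Bob {
    n: Vec<i32>,
}

impl Bob {
    // Assumption: pre-filled here; the video's version starts with a new, empty vector.
    fn new() -> Bob {
        Bob { n: vec![1, 2, 3] }
    }
}

// Takes ownership: the caller's value is moved into `b`.
fn consume(b: Bob) -> i32 {
    b.n[0]
} // `b` goes out of scope here, so the vector it owns is freed.

// Immutable borrows: these can read the data but never take ownership.
fn first(b: &Bob) -> i32 {
    b.n[0]
}

fn second(b: &Bob) -> i32 {
    b.n[1]
}

// Mutable borrow: may change the data, but only one such borrow can be live.
fn reset(b: &mut Bob) {
    b.n = vec![10, 20];
}

fn main() {
    let n = Bob::new();
    let mut m = n; // move: `m` is now the only owner
    // println!("{}", n.n[0]); // would not compile: value was moved out of `n`

    // Any number of immutable borrows may coexist.
    let r1 = &m;
    let r2 = &m;
    println!("{} {}", first(r1), second(r2));

    // Allowed only because `r1` and `r2` are not used after this point.
    reset(&mut m);
    println!("{}", first(&m));

    // Moving `m` into the function ends its use in `main`.
    println!("{}", consume(m));
}
```

Uncommenting the line that reads n after the move, or using r1 after the call to reset, should make the compiler reject the program, which is the behaviour the transcript describes.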

Info

Channel: Computerphile

Views: 219,851

Keywords: computers, computerphile, computer, science

Id: pTMvh6VzDls

Length: 24min 22sec (1462 seconds)

Published: Thu Feb 23 2023