Crust of Rust: Smart Pointers and Interior Mutability

Reddit Comments

At one point he struggles to demonstrate why a program with UB is actually UB (the memory is freed but isn't re-used, so printing the previous values still works). This is the whole point about why UB is bad: the program works fine until you change some code somewhere else, then suddenly you get a segfault. Or worse, even though the program works fine normally, an attacker could craft program input that lets him/her jump into shellcode.

Same with the `unsafe Sync`: if setting a value requires multiple steps, possibly including allocating memory for example, and the 2 threads randomly interleave their commands sometimes, then most of the time everything will be fine, but occasionally the data will be corrupted. Again, if an attacker can control timing, he/she might be able to jump into their own shellcode.

It's good that he struggles to show these problems, because it highlights the fact that they are intermittent and often hidden, and therefore often missed by normal debugging practices.

I'm enjoying the video, in case that isn't clear. :)

👍︎︎ 28 👤︎︎ u/richhyd 📅︎︎ Jun 18 2020 🗫︎ replies

There goes my morning, love these videos!

👍︎︎ 22 👤︎︎ u/GhostNULL 📅︎︎ Jun 18 2020 🗫︎ replies

After 30min I realised that I should code this myself with the video on the side. Thanks for doing these!

👍︎︎ 12 👤︎︎ u/Hedshodd 📅︎︎ Jun 18 2020 🗫︎ replies

Thank you for making these videos. I've watched the first 3 Crust of Rust videos and they were very helpful and thorough. The length is just right and they all focus on very practical issues. I'll watch this one too, for sure. Interior Mutability is another concept that I was hoping to figure out at some point and never got around to it.

The previous kind of videos you made were slightly too advanced and intimidating for me. That said, I'm sure they're great. As I become more proficient I will try watching them again.

👍︎︎ 12 👤︎︎ u/pickyaxe 📅︎︎ Jun 18 2020 🗫︎ replies

Thanks Jon! I really like these videos.

👍︎︎ 8 👤︎︎ u/Jasperavv 📅︎︎ Jun 18 2020 🗫︎ replies

This guy knows his stuff

👍︎︎ 8 👤︎︎ u/suddenarborealstop 📅︎︎ Jun 18 2020 🗫︎ replies

It's amazing, this guy types faster than my brain processes what he is typing or saying. Great content though, bookmarked for when I feel brave enough.

👍︎︎ 5 👤︎︎ u/jiffier 📅︎︎ Jun 18 2020 🗫︎ replies

This is a good video but I have a small request for you to re-read the questions you are reading off-camera before answering them. It's a little tricky to understand what you are answering context-free.

Many thanks!

👍︎︎ 2 👤︎︎ u/nicodemus26 📅︎︎ Jun 19 2020 🗫︎ replies

If a pointer is immutable, then of course it can't be changed itself, but I think dereferencing it and changing the value is only natural. I've never declared the heap-allocated struct to be immutable, only the pointer.

int a = 10;
int* const my_ptr = &a;
*my_ptr = 20;

printf("%d\n", a);

This works in C, and I think in Rust it works the same way. The pointer itself doesn't change, like 0xoFF1DE4 or something like that remains permanent, but the data it's pointing to is changeable.

👍︎︎ 1 👤︎︎ u/[deleted] 📅︎︎ Jun 20 2020 🗫︎ replies
Captions
All right, nice. Hi everyone, welcome back to another Crust of Rust. This time we're gonna be talking about smart pointers and interior mutability. It's a little bit of a vague topic, but I want to talk about some of the types you come across a lot in the Rust world that maybe you have some passing knowledge of, but they're so pervasive that you need to have knowledge of them. These are things like Arc, Rc, RefCell, Mutex, Cell, the Deref and AsRef traits, the Borrow trait, maybe even things like Cow and Sized if we get to it; I don't quite know. I figured the best way for us to try to understand these is to try to implement some of them ourselves, and so that's what we're gonna do.

As before, I do a lot of these streams, and I post all the recordings on YouTube afterwards, so if you have to leave during the course of the stream or something, just jump on there. There are also a lot of older videos online you can go look at. Follow me on Twitter if you want to provide input on these episodes or new episodes, or if you just want announcements of upcoming streams. To that end, I actually recently added a sub-channel to the Rustacean Station Discord server. Rustacean Station is a podcast that is sort of intended to be a Rust community podcast, and I figured it would work pretty well for it to have audio-visual content as well. So the Discord right here has both integration with all of the live chat that goes on in the stream, so in theory people can see it here (I'll post the link to the Discord in the chat for those of you who may want to move), and announcements for upcoming streams. Here you can see this announcement of my stream; Ryan Levick, who's also a Rust streamer, has a stream announced here; and hopefully we'll get more of the Rust streamers onto this one server, so that there'll be one place where you can get announcements for these streams rather than having to go all the way around. And hopefully we can have some
useful discussion here as well about the streams, the content of them, and where to move forward. And of course go to my YouTube channel if you want to see some of the past videos. For those who haven't watched them before, I do these longer programming videos as well that you can also find here, such as porting Java's ConcurrentHashMap to Rust. So if you want some more in-depth, longer videos, go check those out.

All right, so today we're going to be implementing, well, if we get to it, Rc, RefCell, and Cell, and also discuss Arc and Mutex, and maybe also look at things like AsRef, Deref, Borrow, and Cow. We'll see where we get to there. I think we're actually gonna start out with Cell, because it's the one that has the least amount of quirkiness to it. And as before, these Crust of Rust streams are intended to be a way for you to understand the somewhat more intermediate concepts in Rust, so I'm not assuming you're an expert here. If you have questions, please ask them; other people will have the same questions as well. Just post in chat and I'll make sure to check it every now and again. Hopefully this will mean that whatever questions you have get persisted in the recording of the stream that other people might watch later. So if there are any questions you have, other people will have them too; feel free to ask them, and I'll try to keep up with the chat as we go.

All right, great. So you'll see that the Rust standard library (let's move this over there) has a module called cell, and this module, as the top-level comment says, is "shareable mutable containers". Now this might sound a little bit like a weird concept in Rust, because in Rust you have the notion of a shared reference, which is just the ampersand, and you have a mutable reference, or an exclusive reference as it's more aptly described, where only one thing has that pointer, and so therefore
you're allowed to mutate the thing that you have an exclusive reference to. So a shareable mutable container sounds a little weird, right? This is something where you have a shared reference to something, so someone else also has a reference, but yet you're still allowed to mutate it. It should immediately set off alarm bells in your head. But this module has various container types that allow you to do this in a controlled fashion, under the constraints of where it is permitted. This is often referred to as interior mutability: it's a type that externally looks like it's immutable, but it has methods that allow you to mutate it.

The primary two ones, well, the primary three ones you have are Cell, RefCell, and Mutex. Mutex is not in cell, it's in sync, because it uses synchronization primitives that are provided by the operating system or by the CPU to make those operations safe. So it doesn't really belong in cell, but it kind of belongs in cell, and you can really think of a Mutex as a type of cell, a type of interior mutability. We're gonna look at Cell first, because Cell provides interior mutability in a kind of interesting and very Rust-y way, and it's a good segue into some of the more advanced things we're gonna look at.

First, any questions about interior mutability, or exclusive or shared references, just before we dive into Cell? Hopefully it should strike you as odd that we can mutate things through shared references in Rust; that seems antithetical to what shared references sort of imply, even though in reality there are ways to do this. Right, I don't see any questions, so let's move forward with Cell.

Okay: "Can Cell still be used for recursive type storage?" Depends what you mean by recursive type storage. You can store any type in a Cell, if that answers your question. "If we could outline why using one versus the other, that would be great. Shared mutation sounds fine, but there seem to be a number of specialized tools to
a similar problem." Yeah, so one thing you'll see as we go through this is what the restrictions are of the different cell types, the different interior mutability types. Cell and RefCell have different restrictions on what things you can stick inside of them and how you can use them, and generally, the farther you go towards Mutex, the freer you are to put whatever you want inside, but the cost, the overhead of doing the required logistics to make the type work out, also increases. I'll discuss that a little bit as we go through the types.

Box does not provide interior mutability. If you have a shared reference to a Box, then you cannot mutate the thing inside the Box. "Is there a way to tell that a supposedly immutable struct has some stuff inside it that is mutable, like a Cell?" No, you do not know externally from a type whether it has interior mutability.

All right, so let's dive into Cell. Cell is kind of interesting. If we look down at what Cell provides, you'll see that, perhaps unsurprisingly, you can create a new Cell, and you give it a value of some type T. You can also change the value by calling set, and you'll notice that set takes an immutable reference, but it still allows you to modify the value that's contained within the cell. It also has a swap method, which lets you take references to two cells and swap the values that are inside of them. There's a replace. There's an into_inner that consumes self, so assuming you have ownership of the cell, which of course means that there are no shared references. And then we get down here, and you'll see that Cell where the type is Copy has a get method, and you'll notice that get does not give you a reference to the thing inside the cell. Instead, it copies the thing that's inside the cell and gives you a new copy of that value, but you do not get a reference inside the cell. And if you were to look through all the different methods on Cell, you would see that there is no way with Cell for you to get a reference to what's inside the cell.
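As a rough illustration of the API surface just described, here is a minimal sketch using the standard library's `std::cell::Cell`. The helper name `cell_demo` and the values are made up for this example; only `new`, `set`, `swap`, `replace`, `get`, and `into_inner` come from the walkthrough above.

```rust
use std::cell::Cell;

/// Hypothetical helper: exercises the `Cell` methods mentioned above and
/// returns what they yield.
fn cell_demo() -> (i32, i32, i32) {
    let a = Cell::new(1);
    let b = Cell::new(2);

    // `set` only needs a shared reference (`&self`), yet it changes the value.
    let shared: &Cell<i32> = &a;
    shared.set(10); // a = 10

    // `swap` exchanges the contents of two cells through shared references.
    a.swap(&b); // a = 2, b = 10

    // `replace` stores a new value and hands back the old one by value.
    let old_b = b.replace(5); // old_b = 10, b = 5

    // `get` requires `T: Copy` and returns a copy, never a reference.
    let copy_of_a = a.get();

    // `into_inner` consumes the cell, so no shared references can remain.
    let inner_b = b.into_inner();

    (copy_of_a, old_b, inner_b)
}

fn main() {
    let (copy_of_a, old_b, inner_b) = cell_demo();
    println!("a = {copy_of_a}, old b = {old_b}, final b = {inner_b}");
}
```

Note that none of these calls ever produces a `&T` pointing into the cell; every value comes out by copy or by move.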
You can replace it, you can change it, and you can get a copy of it, but you can never get a pointer into the cell itself. And this turns out to be important. Think of it this way: if there's no way for you to get a reference to the thing inside a cell, then it's always safe to mutate it, right? Because if no one else has a pointer to it, then changing it is fine. Does that make sense? Think about that for a second. If we know no one else has a pointer to the value we're storing, then changing that value is fine. And that is what Cell tries to provide, just by virtue of the method signatures that it provides: it never gives one out, and therefore it knows that no one has it.

The other restriction that Cell has in order to make this safe is that Cell, you will see, does not implement Sync. What this means is that if you have a reference to a Cell, you cannot give away that reference to a different thread. The reasoning for this is pretty straightforward: if I had two threads that both have a shared reference to the cell, then both threads could try to change the value at the same time, and that is obviously also not okay. But if you have both of those restrictions, if you know that there's only one thread that has a pointer to the cell, then you also know that if I have a shared reference to that cell, then no one has a shared reference to the value inside the cell, so it's safe to change it, as long as I don't give out a reference to what's inside the cell. That's a lot of words, so let's check in on whether that made sense before we continue. All these guarantees are at compile time for Cell, by the way.

All right, let me just open the window. Question: "Why can't we borrow as mut more than once for Rc if it's owned?" There's no Rc here, and there's no borrowing. With a Cell, you never use an exclusive reference in general. If you have an exclusive reference to the cell, you can get an exclusive
reference to the value inside, but at that point you can't get or change the value, right?

"In the example you mentioned, you say if there's a single thread then there's no need to worry about multiple references to a cell. In that case, what benefit does using Cell provide?" The benefit that Cell provides is that you can have multiple shared references to a thing. Usually Cell is used with something like Rc, where you want the cell, or pointers to it, to be stored in multiple places, like in some data structure. Imagine a graph where some of the things might share a value. Then you might have multiple references to a thing, but because it's single threaded, you know that you will only be using one of the references at a time, and so what Cell lets you do is mutate that value in safe code.

"Cell should usually just be used for small Copy types." Yes. You'll notice that you can only get the value out of a Cell either if you have a mutable reference to it, in which case you probably don't need the Cell at all, or if the value is Copy. So you generally want to use Cell with types that are Copy and relatively cheap to copy out, because that's the only way you can get their values.

"Is there a Sync version of Cell?" No, I don't think you can do this just with the type system if you don't rely on Sync.

Okay, so let's try to implement Cell ourselves; it's usually a good way to understand any of these things. So, a new lib, and we're gonna call this pointers, why not. Okay, src/lib.rs: we're gonna get rid of that, we're gonna do a mod cell, and we're gonna do cell.rs. All right, so we're obviously gonna need a pub struct, and I guess a pub mod cell. So we're gonna have a Cell type, and it's gonna hold a T. We're gonna have to figure out what's inside here; let's for now just assume that it's going to be a T. And we're gonna implement for Cell: there's gonna be a new, which takes a value of type T and
returns Self, and that gives us a Cell that contains the value that was given. I guess this is gonna be pub. We're gonna have a pub fn set, which is going to take an immutable reference to self and a value T, and it is going to do self.value = value. This of course currently will not work, right? We're trying to assign to self.value, which is behind a shared reference, and so we can't modify it. And we want to do a get, taking &self, which is going to return a T, and this is gonna be self.value. This is the basic API we're going for.

Okay, let's try to figure out how we might actually do this. At the core of almost all of these types that provide interior mutability is a special cell type called UnsafeCell. If you go back to the browser (let me zoom in a little bit here, that might help), if you go back to cell, you'll notice that there is one called UnsafeCell, which lists itself as the core primitive for interior mutability in Rust. And UnsafeCell is totally unsafe to use. It really just holds some type, and you can get a raw exclusive pointer to it whenever you want, and it's up to you to cast that into an exclusive Rust reference when you know that it's safe to do so. It's sort of a building block. So here we're gonna use std::cell::UnsafeCell, and the value here is gonna have to be an UnsafeCell. The only way that we can mutate something through a shared reference is by using UnsafeCell.

"Is there a classic example of when someone would want to use Cell?" It's usually used for smaller values, like numbers or flags that need to be mutated from multiple different places. For example, it's often used with thread locals. With a thread local, you know that there's only one thread accessing it, and you might want to keep some thread-local state, like a flag or a counter or
something, but the thread local only gives you a shared reference to that thing, because one thread might try to get the thread local multiple times, and then Cell is a good way to provide mutability to it.

"Why does Cell have an as_ptr method?" That gives you a raw pointer to the thing inside the cell, right, but trying to bring that back to a shared reference would be unsafe. So it's fine for Cell to expose the raw pointer, because you can't do anything with that raw pointer unless you write an unsafe block.

All right, so the value here is gonna be just UnsafeCell::new of value, so that's fine, that's not too bad. And here, UnsafeCell has a get method, and the get method takes a shared reference to self and gives you a raw exclusive pointer to T. So what we're trying to do is this, and similarly here we're gonna do a star of .get(). All right, so the code I'm writing now is currently incorrect, and we're gonna see why it's incorrect. Here we're trying to dereference a raw pointer, and the compiler is telling us that that is unsafe, and rightly so. We have a shared reference to this UnsafeCell, this T, and the compiler doesn't know that it's okay for us to change that value. It doesn't know that no one else is currently mutating that value under us. For example, it doesn't know that there's not some other thread somewhere that's changing this T that we're trying to dereference at the same time. So if we write unsafe here, what we're doing is telling the compiler "I have checked that no one else is currently mutating this value." And what we're currently doing is simply wrong. How do I want to do this? Because even though we have said unsafe here, so the compiler accepts it, the code is wrong. There's nothing currently preventing the following from happening. It's sometimes useful to do this in a test-driven way, right? So let's do
a simple test here: bad. So this is gonna use super::Cell. It's going to do Cell::new of 42, and then we're gonna do a thread::spawn. thread::spawn won't actually let us do this, which is a little awkward. Let's do an Arc::new of a Cell::new. We haven't talked about Arc yet, but it basically lets us share a reference to something across thread boundaries. x.set(43), and in this thread we're gonna do x.set(44). std::sync::Arc; there's a little bit of setup here, but Arc::clone of x. Yeah, right. So currently nothing stops a developer from writing this code, to start two threads. Well, it actually will be prevented, but that's annoying. So here we haven't written anything to prevent this from happening: to have two references to the same cell, and then two different threads both call set at the same time. This unsafe is us just telling the compiler that's fine, but it's not fine. If two threads try to write to a value at the same time, what value does that thing now have? It doesn't have a well-defined value, and so this is not okay.

Instead, what we need to say is, we need to basically impl !Sync for Cell<T>. We need to tell the compiler that you can never share a Cell across threads. This is what we talked about for Cell when we initially looked at the API for it. The compiler has support for this syntax, but it's nightly only. The way you get around this for now is that you basically stick a value in there that is not thread safe. And guess what is not thread safe: UnsafeCell itself. It is not Sync, and so we actually already get this implementation, because UnsafeCell is not Sync and therefore Cell is not Sync. So this is implied by UnsafeCell, which means that this unsafe is actually now okay and this code will be rejected. And if we try to run that code (let me get the compiler error a little larger; fine, let's do this, this is just to get it to compile; oh fine, I was hoping to do that later), if I now try to compile the tests, you'll see that it says
UnsafeCell cannot be shared between threads safely, and it specifically tells us that within our Cell type, Sync is not implemented for UnsafeCell, which means that Cell is not Sync. So even though we tried to pass it to multiple threads, that did not work, as intended. This is a little bit of a roundabout explanation, so let's pause here before we move on.

All right, this was a lot, so let me try to walk through it one more time from the top. The Cell type allows you to modify a value through a shared reference because no other threads have a reference to it, so you can't have multiple concurrent modifications, and because you've never given out a reference into the value you store, and therefore you can replace it just fine. Because get here returns a copy of the value that's stored inside, we never gave out a reference, and so even if we change the value, we don't have to invalidate any references, because there are no references outside. And this is why this code doesn't compile: because here we're trying to mutate the cell from two places at the same time.

To give another example of what wouldn't work and what Cell defends you against: bad2. Here I'm just gonna have a single thread and show you why a single thread can also go wrong. So here we're gonna do something like a vec of 42. Imagine that I do: first is x.get(). Imagine that Cell allowed you to get a reference out. Then I can do this to get a reference to the first thing inside the vector, and then I could do set of vec![] or whatever. And now, even though this is single threaded, if I now try to print first, this is clearly not okay, right? Because here first is a pointer to this 42. Once I call this set, that vector is gone, so first should be invalidated. So we can't allow this code either. And the way that we don't allow this code with Cell is by get not returning a reference. get only returns a copy, and we
never give out a reference, which means that set is always safe. All right, let's see if what we've done so far makes sense.

"Can you unsafely implement Sync to show your test failing?" Yes, I can: unsafe impl Sync for Cell<T>. Come on. So we can say that. In fact, here's what I'm gonna do: here I've removed all the safety restrictions, so now these tests are both going to pass, or sorry, are both going to compile. So if I now try to run the tests and run bad2... oh right, I need to actually mark them as tests, that's a good idea. Let me try to run bad2 here. That's interesting. That should definitely not work. Oh, I wonder if it's because it doesn't actually get deallocated. Let's do Box::new instead. Or a String; String is good. Because we're gonna replace it with an empty string and then try to print out the original string. Why does this work? I think what's happening here is that even though the memory has been deallocated, the pointer is still valid. So if I try to do this, that might... --nocapture. Why is it not printing this out?

Yeah, so see here, for example. In this test we create the string "hello", we make our cell point to it, and we get a reference to that string, so this should now point to "hello". Then we change that value, and now suddenly the pointer that we initially took out is pointing to "world" instead, even though that's a completely different allocation. So this should print "hello" but doesn't. Yeah, it's basically that the allocator didn't release the memory, so the pointer is still valid. But hopefully you see why this shouldn't be okay, right? Because this is a pointer to this memory, and once we change this, once we allocate some new string here, then this memory should be deallocated; it goes away and this pointer is invalid. It happens to still be valid because of the memory system, and that's why this doesn't crash, but if this was a larger, more busy application, that would not be okay. All right, so we want to
disallow this, and the way we disallow that is by never giving out a reference. So we want get to only work when the type is Copy, and then we give out a copy of that value. And now this won't compile, because you can't get that reference in the first place.

In this case, the case where you have multiple threads, this one is probably not gonna fail easily; it's gonna be hard to write this as a test that fails. What's a good way to demonstrate that this is broken? There isn't really a good way to demonstrate this is broken, even though it is, because the two threads are both going to modify the value in place, and the problem is you don't know what value it's gonna be set to. Maybe the way to do this, actually, is to have this be an array. It's gonna be 1,024 zeros, and this is gonna set it to be 1,024 ones, and this is gonna set it to be 1,024 twos. So we're gonna have one thread that tries to set the whole array to be one, and one that's gonna set the whole array to be two. Let's make this larger so it takes a little bit longer. And then we're gonna wait for both threads to finish, and then we're gonna print out the value that ends up being stored in there. So that's gonna be x.get()... that's awkward. I guess for i in x.get(), print i. Let's see what happens if we run this. That doesn't seem right: x.get() is not an iterator. Fine, like this then. So if we scroll up here, let's see if we find any that are broken. Are all of these set to one? Because if so, that undermines my point. Yeah, they might all be... that's just awkward. I see, here's what we're gonna do. I'm gonna do sort; sorry, this is great. Okay, this printed all of them as twos. Sometimes, though, this really ought to not do that. Make it a little larger. Stack overflow, you say? I'm just trying to make it large enough that the threads start interleaving. Apparently it won't let me do that. Okay, I guess in
that case I'm gonna have to argue why this is problematic. So the problem here is that this thread is gonna be writing out this long array of ones, and this thread is gonna be writing out this long array of twos. Both of them are gonna take a while, and we're allowing both of them to be modifying the same bit of memory, the same memory that's stored inside the cell, at the same time. So we have no guarantees that these threads aren't gonna be stepping on each other. Imagine this thread runs for a while and then goes to sleep, then this thread runs for a while and then goes to sleep, and we'll see an interleaving of ones and twos. In this particular case, the test, when we run it in practice, doesn't fail, and the reason it doesn't fail is because the underlying memory system is fast enough that these interleavings don't actually show up. But this is something that can happen, and if it happens, you can basically think of this as: we're gonna end up with a corrupted array. We're gonna end up with an array that contains some ones and some twos, even though nowhere did we set that to be the value. We expect at the end of this test for the entire array to be ones or the entire array to be twos. We do not expect it to be interleaved, but the way we've set this up, that could happen if the threads start yielding inappropriately.

Yeah, so there's another way for us to demonstrate this, which would be this. Imagine that this thread does let x = x1.get(); x1.set(x + 1). It's the silly way to do this, right? And x2, we're gonna do this. This is just gonna be 0, and we're gonna do this a hundred thousand times. Maybe this is a better way to demonstrate it; you're probably right. And we're gonna assert that when we get out the value, it's gonna be two hundred thousand, because each thread is incrementing by one a hundred thousand times, so hopefully by the end it should be two hundred thousand. And we're gonna stick in here
thread yields. If it's too tight of a loop, then the computer is too fast. I doubt this will actually pick it up; we'll see. Let's try to run this. Great, it failed, fantastic. It expected the value to... I need to add a zero to this, but it will still fail. Great, it failed: it expected the value to be two million, and instead it was this lower number. The reason here is that the threads get to race. They're both modifying this value, and so some of the modifications end up being lost, because one thread writes its value and the other thread writes its value, and they both read before they both write, before they read again.

All right, hopefully I've sufficiently convinced and confused you that this implementation is necessary. So specifically, if we declare our Cell as not being Sync, then this code won't compile, because it'll recognize that we're trying to share the cell across threads, and that's not okay. And so really, if we want to do this the proper way, we're gonna document why this is safe: we know no one else is concurrently mutating self.value, because !Sync, and we know we're not invalidating any references, because we never give any out. Similarly here, SAFETY: we know no one else is modifying this value, since only this thread can mutate, because !Sync, and it is executing this function instead. A given thread can only do one thing at a time, and because we know it's not shared between threads, and we know that it's calling get, because we're in get, that means set is not being called, and so therefore this value is not being modified. Okay, so hopefully this should explain why Cell is safe. Actually, I guess I can leave those tests in, in theory. Okay, does Cell make sense? We went back and forth on a bunch; I apologize for that, but it sometimes happens.

"What is the point of allowing T to be non-Copy if we only have the get method for Copy types?" Or, you're saying up here, why not do this and require it, or why not do this and
require it for the whole thing? We could totally do that. Generally, the only thing that requires it is the get method, and the idiomatic Rust way is to only put the bounds where they're needed. You usually don't want it on the type, because then any type that contains the Cell would also need the Copy bound, and it ends up just putting a bunch of extraneous bounds all over the place. Putting it only in the most constrained place means that callers only have to put it where they are actually using the Cell themselves.

"Will you be able to give a quick explanation of what's under the hood in UnsafeCell? And can you explain why we need UnsafeCell and cannot just unsafely cast the shared reference to an exclusive reference?" Yes, this is an important point. The only way in Rust to correctly go from a shared reference to an exclusive reference is with UnsafeCell. You are not allowed to cast a shared reference into an exclusive reference; it's just not allowed. The only way is through UnsafeCell. The reason why that's true is a little complicated, and comes down to the way that the Rust compiler optimizes your code and how it interacts with LLVM, for example. But you are never allowed to cast a shared reference to an exclusive reference, just ever, except by going through UnsafeCell. If you do, the compiler might optimize your code in such a way that it breaks.

All right, so that was Cell. Now let's move on to RefCell. So RefCell is a little different. UnsafeCell is really just a T, but the compiler has special knowledge of UnsafeCell. Cell does not have any special compiler instructions, no; UnsafeCell is a special type, but Cell is not.

All right, so RefCell. Let's go back to our documentation here. RefCell is a little different. If you look at the documentation, you'll see it says "a mutable memory location with dynamically checked borrow rules". Normally in Rust, all of your borrow checking is done at compile time, right?
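Pulling the Cell discussion above together, the type built on stream might look roughly like the following. This is a reconstruction from the spoken description, not a verbatim copy of the code written on screen: the field name `value` and the wording of the safety comments are assumptions.

```rust
use std::cell::UnsafeCell;

/// A hand-rolled `Cell`, sketched from the description above.
pub struct Cell<T> {
    // `UnsafeCell<T>` is not `Sync`, so this `Cell<T>` is automatically
    // not `Sync` either; no explicit `impl !Sync` is required.
    value: UnsafeCell<T>,
}

impl<T> Cell<T> {
    pub fn new(value: T) -> Self {
        Cell { value: UnsafeCell::new(value) }
    }

    pub fn set(&self, value: T) {
        // SAFETY: no other thread can be mutating `self.value` concurrently
        // (because `Cell` is not `Sync`), and no references into the value
        // are invalidated, because none are ever handed out.
        unsafe { *self.value.get() = value };
    }

    // The `Copy` bound sits only on the method that needs it.
    pub fn get(&self) -> T
    where
        T: Copy,
    {
        // SAFETY: only this thread can touch the value (not `Sync`), and it
        // is executing `get` right now, so `set` is not running.
        unsafe { *self.value.get() }
    }
}

fn main() {
    let cell = Cell::new(41);
    // Two shared references, yet mutation through either one is allowed.
    let (r1, r2) = (&cell, &cell);
    r1.set(r2.get() + 1);
    println!("{}", cell.get());
}
```

Trying to move a reference to this `Cell` into `std::thread::spawn` (for instance via `Arc`) should be rejected at compile time, which is what makes the two `unsafe` blocks justifiable.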
With compile-time checking, either you have a shared reference, in which case you cannot mutate, or you have an exclusive reference, in which case you can mutate, but those are all determined at compile time. What RefCell lets you do is basically check at run time whether anyone else is mutating. This is really handy if you're traversing a tree or something, or you're traversing a graph where there might be cycles. It might be that earlier in your recursion you already got a mutable reference to this thing, but later down you're trying to take a mutable reference to the same thing, and RefCell will catch these cases. Or imagine that you have a graph that you know has no cycles, because you've checked your graph for cycles, but this is a run-time check. So you know that you can always get a mutable reference to any given node, but the compiler doesn't know that, because the graph is not known at compile time. So RefCell is a way for you to get safe dynamic borrowing. "Dynamically checked borrowing", yes, exactly. So RefCell is a good fit for things like graphs and trees.

And in fact RefCell is fairly straightforward. RefCell is a type that is basically also just an UnsafeCell<T>, but it also has this special value that keeps track of how the thing is currently borrowed. We're gonna call this, I guess, flag... references. Let's make it an isize. And RefCell is gonna have a new, which takes a value: T and gives you Self, like standard setup stuff. And references is basically going to be a reference count: how many references have we given out to this thing, and of what type. If you think of it as a positive number, it's how many shared references there are, and of course that number can be any value. There can be zero to a million references, or however many; it's basically infinite, right? And there can only
ever be one exclusive reference, because this is what the Rust ownership system requires that we guarantee. So we're gonna have a method called borrow, which takes a shared reference to self and gives you, let's for now go with an Option<&T>. And then we're gonna have a borrow_mut, which gives you an Option<&mut T>. Initially let's just stub these out. So this is the basic API we're going for: if you try to mutably, or exclusively, borrow a RefCell that has already been borrowed, whether exclusively or not, then you'll get a None back, because we're not willing to give you another exclusive reference, since that would violate the reference rules in Rust. If you try to borrow, then you will get a Some, unless an exclusive borrow has already been given out, because we want to guarantee this contract Rust has: if you have a shared reference, then there are no exclusive references, and if you get an exclusive reference, there are no shared references. And so that's why this has to be an Option. Does this API roughly make sense?

We're gonna make it be isize because it needs to also handle the case when there are exclusive references. It could be an enum instead, if you'd rather. In fact, if we want to be a little bit more explicit about this, we can say RefState, which is going to be either Unshared, or Shared with some count, or Exclusive, and this is going to be a RefState. Maybe that's easier.

"Why can't we use the Borrow and BorrowMut traits here?" The Borrow and BorrowMut traits are for something very different. Also, you'll see why in a moment.

Okay, so it looks like the rough API makes sense, so let's try to actually write one of these. So what is borrow_mut going to do? Well, it's gonna depend on the state of self. Basically, if self.state is RefState::Unshared, then we're gonna give out this, and otherwise we're
gonna give out nothing, right? So if it currently hasn't been shared, then we're willing to give out the value; otherwise we're not. And borrow is going to be a little bit similar, in that if the state is currently Unshared, then we're fine to give it out, because no one else has a reference. If it's currently Shared with some number, then it's also fine to give out. And if it's currently exclusively borrowed, then it's not fine to give out. So if we've given out an exclusive reference, then it's not okay to give out a shared reference. Similarly, if we have given out any reference, then it's not okay to give out an exclusive reference. So this is really just us typing up the Rust rules for references.

On this deviating from the RefCell API: I've simplified it, because I want to explain why we can't simplify it this way, as you'll see shortly.

So there's one thing that's obviously missing here, and that is that we're not doing any ref counting, right? We're never changing self.state, and so this is clearly not okay. Here, if we give out an exclusive reference, we need to set self.state to be Exclusive. And similarly, if it was Unshared but we give out a shared reference to it, then we need to record that it is now Shared, and if it was Shared and we give out another shared reference, then we need to update the reference count. So that's certainly one thing that was missing. But of course this won't actually work, because we have a shared reference to self and we're trying to mutate something that's inside of that shared reference.

Well, one thing you'll notice here is that we're modifying RefState in a way that's not thread safe. This is basically another instance of the problem we saw before: if multiple threads were allowed to borrow at the same time, they
might both read the old n and both set the new n to be n plus one, and you would end up losing one of the increments. So this type just cannot be thread safe. So this, just like Cell, is not Sync. But if this type is not Sync, and we need some way to mutate state in a place where it's not thread safe anyway, well, we can just use Cell. So in fact what we can do here is make this a Cell. If you think about what we talked about for Cell, the restriction is that it's not thread safe, which is fine because RefCell is also not thread safe. It doesn't allow us to get references to the thing that's inside, and that's fine: RefState can easily be Copy, there's nothing really preventing us from that. Great, so why don't we just use Cell? Cell gives us the ability to mutate something through a shared reference, so it gives us exactly the thing that we need.

Someone's asking, "Could you use an AtomicIsize to make it thread safe?" We'll talk about that when we get to Mutex. RwLock and Mutex are basically thread-safe versions of RefCell, so we'll get to that later.

Okay, so let's use crate::cell. Let's use the Cell type we just made, because why not use the thing we made ourselves? All right, so this is Cell::new, this is going to be a .get, and this is going to be a .set. Great. "Cell-ception", yeah, that's right.

Okay, so now we can write the safety argument here: no exclusive references have been given out, since state would be Exclusive. And similarly here, this is also safe for the same reason. And no exclusive references... actually, that's not even necessary. And down here, for this: no other references have been given out, since then state would be Shared or Exclusive.

"When using something like rayon, would RefCell and Cell make no sense?" Yeah, with rayon you need something that's thread safe, and this is not thread safe, so you need Mutex or RwLock or some other sync primitive.

Right, so as chat
just observed, the problem here is that we're increasing these but we're never decreasing them. So if you wrote this code, the moment you exclusively borrow something, you can never borrow it again, which seems kind of useless, right? The moment I stop having this exclusive reference, I want to be able to get shared references again, otherwise this whole data structure is kind of useless. And so this is why we can't really have these be just shared references and exclusive references, because we have no way to track when they go away. So really we need some other type here. What we're gonna do is use a Ref type and a RefMut type, and we need to define those down here.

So a Ref is gonna have a lifetime that points to the RefCell, because when the RefCell goes away, we certainly need to make sure that all of the references have gone away; otherwise those references would be dangling pointers, so that's not okay. And we're gonna have to figure out what actually goes in here, and the same for RefMut. And what really are these? Well, really they just contain a reference to the RefCell. That's all they really need to hold.

And now what we can do is implement Drop for Ref. When you drop a Ref, we want to decrement the reference count. So what we're gonna do is a match on self.refcell.state.get(). If it's RefState::Exclusive, then that should be impossible, right? The Ref is a shared reference, and so the state must be Shared, because otherwise how did we get here in the first place? If it's RefState::Unshared, that should also be impossible, because we have a shared reference, so clearly it's not unshared. So these two aren't possible. But what about Shared? Well, if it was marked as being Shared with one reference, then now it is unshared when this thing goes away: RefState::
Unshared. And if it's Shared with some higher count, and I guess, well, let's leave that for now, then we're gonna set it to Shared with a count one lower. Does that tracking make sense? So every time someone borrows a shared version of our inner value, we increment the count and we return one of these Refs, and when that Ref is eventually dropped, we decrement the count and set it either to n minus one, or to Unshared if there are now no shared references. All right, questions about this?

Right, so this Ref type... we're going to return Some of Ref where the refcell is self, and same thing here. But now if the user gets a Ref, how do they actually get to the T? Previously we gave out a shared reference to the T, but now they're just given this weird Ref type, and really, if they borrow, what they want to do is get to the T. The way that we solve this problem is that we implement the Deref trait. So we're gonna implement std::ops::Deref for Ref<T>. Deref is basically the trait that gets invoked whenever you use the dot operator. So if you have something of type T and you do that value, dot, and then some method, then if T doesn't have that method but derefs to something that does, the Deref trait gets called. It's basically a way to automatically follow deeper into a type, and this will make a little bit more sense once you see the signature. What the Deref trait requires you to give is the following signature: given a reference to self, give me a reference to this Target type. In this case the Target type is T. What this allows you to do is, if you have a Ref<T>, you can call any method that requires a &T on it; it basically dereferences into that inner type T. So at this point Ref is a smart pointer, right? It's a pointer that is really just a
transparent pointer to some inner type, but it has additional semantics when you drop it, which is basically what a smart pointer is. And then the question becomes how do we actually get the value. Well, doing that is pretty straightforward: we just get the value inside the RefCell. We have to update the safety argument a little, though, to say that a Ref is only created if no exclusive references have been given out. It's good to get into the habit of writing these out. Once it is given out, state is set to Shared, so no exclusive references are given out, so dereferencing into a shared reference is fine, is the argument here.

"Deref automatically does the arrow operator from C?" Yeah, you can think of it that way.

And we basically want to pull the same trick for RefMut. So let me go ahead and copy all of what we just did for Ref, and we're gonna make this RefMut. RefMut works much the same way. It will also implement Deref, but it also has to implement DerefMut. DerefMut is a similar trait to Deref, except that it says you can go from a mutable reference to the smart pointer type and get a mutable reference to the pointed-to type. This is true for RefMut, but it's not true for Ref. The reason it's not true for Ref is that there could be multiple Refs to the same value, and so us giving out a mutable reference to the thing inside would not be okay. If I have a Ref to a thing and you have a Ref to the same thing, and we both tried to do deref_mut, we would now both have an exclusive reference to the inner value, which of course is not correct; that should never be legal in Rust. Whereas with RefMut here, we can write the argument better, which is: a RefMut is only created if no other references have been given out. Once it is given out, state is set to Exclusive, so no future references are given out, so we have an exclusive lease on the inner value. So dereferencing is fine, mutable or
immutable; dereferencing is fine. And the safety here is: see safety for DerefMut.

"Is it common practice to write safety comments for every unsafe use?" Yes, I highly recommend you do this. People vary a little bit in how they do it. You can look at the stream that I did a while back porting Java's ConcurrentHashMap to Rust; we did a lot of this. But in general you should make sure to document all your safety requirements, both where you write unsafe and for the module as a whole.

Okay, and then we need to implement Drop for RefMut, and this one is also pretty straightforward. Here, Shared or Unshared should not be possible. It should not be possible for the RefCell to be in a shared state, because we have an exclusive reference, and it shouldn't be possible for it to be unshared, because we have an exclusive reference. So it must be in the Exclusive state, and when we drop our exclusive reference, we now know that it's unshared. And of course here we now need to return a RefMut. Notice that there's nothing stopping people from writing code here that will crash; they can unwrap this Option all they want. But there's no way for them to get two exclusive references to some inner value at the same time.

Okay, does RefCell roughly make sense? Hopefully this was a little bit more cogent than the explanation of Cell. "Should borrow_mut return a RefMut?" Yes. Does it not? It does. Great. They both make sense? All right, perfect.

So now we're gonna get to Rc, and Rc is a little bit trickier. So let's do pub mod rc. Okay, so what is Rc? Well, let's go back to our documentation here. So Rc is a single-threaded reference-counting pointer. The type Rc<T> provides shared ownership of a value of type T, allocated in the heap. Invoking clone on Rc produces a new pointer to the same allocation in the heap. When the last Rc pointer to a given allocation is destroyed, the value stored in that allocation (often referred to as the "inner value") is also dropped. Shared
references in Rust disallow mutation by default, and Rc is no exception: you cannot generally obtain a mutable reference to something inside an Rc. If you need mutability, put a Cell or RefCell inside the Rc, which is what we've already talked about. So an Rc is in some sense similar to a RefCell in that it keeps count of references, but it's dissimilar in the sense that it never provides mutability. All it does is allow you to have multiple shared references to a thing and only deallocate it when the last one goes away.

Usually this is useful in data structures where you might have one element present in multiple places. The best example is: imagine that you have some string in your program, or some large binary blob, or a configuration or something. You don't want to keep multiple copies of that blob around in your program; you just want to keep pointers to it. But the problem becomes, how do you know when to deallocate that big blob? Well, the answer is when all the pointers go away, but how do you know when all the pointers go away? This is what Rc gives you. But crucially, Rc is not Sync, and it's also not Send. We'll get back to what that means in a second, but basically Rc is not thread safe, so it will only do reference counting on a single thread. But even on a single thread this is often useful, again in the context of data structures or things like graphs.

"How does Rc handle cyclic references?" It doesn't. If you have a cycle, then the cycle just prevents it from being deallocated. Generally, though, especially if you look at the standard library implementation of Rc, you have weak pointers and strong pointers. We're not gonna implement them because they're not that interesting, but basically the difference is that a weak pointer will not prevent the thing from being deleted, whereas a strong pointer will. So if the strong pointer count goes to zero, then the thing is deallocated.
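The strong/weak distinction is easiest to see with the standard library's own types. The sketch below uses the real `Rc::clone`, `Rc::downgrade`, `Rc::strong_count`, and `Weak::upgrade`; the function name and the string contents are just placeholders for illustration.

```rust
use std::rc::{Rc, Weak};

/// Returns (strong count with two handles, upgrade before the drops, upgrade after the drops).
/// (Hypothetical helper name, for illustration only.)
fn weak_upgrade_demo() -> (usize, Option<String>, Option<String>) {
    let strong: Rc<String> = Rc::new(String::from("config blob"));
    // Cloning the Rc bumps the strong count; the String itself is not copied.
    let second = Rc::clone(&strong);
    // A weak pointer does not keep the value alive.
    let weak: Weak<String> = Rc::downgrade(&strong);
    let count = Rc::strong_count(&strong);

    // While a strong pointer exists, the weak pointer can still reach the value.
    let before = weak.upgrade().map(|rc| (*rc).clone());

    // Dropping the last strong pointer deallocates the value.
    drop(second);
    drop(strong);
    let after = weak.upgrade().map(|rc| (*rc).clone());

    (count, before, after)
}

fn main() {
    let (count, before, after) = weak_upgrade_demo();
    println!("strong count: {count}, before: {before:?}, after: {after:?}");
}
```

Breaking a cycle usually means making one direction of the cycle a Weak, so that the strong count can actually reach zero.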
And weak pointers, you basically need to upgrade them to a real pointer before you use them, and that upgrade will fail if the value is already gone.

All right, so what is Rc? Well, Rc is basically a pointer to some type T that's stored on the heap. It needs to be stored on the heap because if multiple functions in my code are all referencing this value that's stored somewhere, it can't be on the stack of any given function, because when that stack frame goes away, when that function returns, the value would disappear for all the other places where I have it as well. So you can sort of think of this as having to be a Box<T>. Of course it can't actually be a Box<T>, because if we clone the Rc we're gonna clone the Box, and cloning the Box clones the T. So really this is just gonna be something like value again, and it's really gonna be a pointer to a T.

And then we're gonna implement Clone for Rc<T>, and notice that we don't actually require here that T is Clone. The reason for that should be apparent: when we're cloning the Rc, what we're really doing is increasing the reference count, but we're not actually copying the inner value. There's only one of the inner value. And so the question then becomes, where do we keep the reference count? We can't keep it here, right? If we keep it here, then each clone of the Rc would have its own reference count, so how would we ever know when the count goes to zero? Instead, the reference count has to be in the value that is shared amongst all the copies of the Rc. So what we're gonna do is define an RcInner, and the RcInner is the thing that's gonna actually hold the value, and in addition it's going to hold the refcount. And when you clone, what we're actually going to do is get a reference to the inner thing, increment the refcount, and then return just another Rc.
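The layout described here can be sketched in a few lines. This is a simplified illustration, not the code from the video: the names `MyRc`, `count`, and `my_rc_demo` are made up, and the `Cell<usize>` counter, the `Box::into_raw` allocation, and the `Drop` impl are assumptions filled in so the example runs on its own.

```rust
use std::cell::Cell;
use std::ops::Deref;

// The shared heap allocation: one value plus one count for all handles.
struct RcInner<T> {
    value: T,
    refcount: Cell<usize>, // assumption: a Cell, so it can change through &self
}

// Hypothetical stand-in for the Rc being described; each handle is just a pointer.
struct MyRc<T> {
    inner: *mut RcInner<T>,
}

impl<T> MyRc<T> {
    fn new(value: T) -> Self {
        let boxed = Box::new(RcInner { value, refcount: Cell::new(1) });
        // into_raw hands over the allocation without freeing it.
        MyRc { inner: Box::into_raw(boxed) }
    }

    fn count(&self) -> usize {
        // SAFETY: the allocation is only freed when the last handle is dropped,
        // and `self` is a live handle.
        let inner = unsafe { &*self.inner };
        inner.refcount.get()
    }
}

// Note: no `T: Clone` bound. Cloning bumps the count and copies the pointer only.
impl<T> Clone for MyRc<T> {
    fn clone(&self) -> Self {
        // SAFETY: same argument as in `count`.
        let inner = unsafe { &*self.inner };
        inner.refcount.set(inner.refcount.get() + 1);
        MyRc { inner: self.inner }
    }
}

impl<T> Deref for MyRc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: same argument as in `count`.
        let inner = unsafe { &*self.inner };
        &inner.value
    }
}

impl<T> Drop for MyRc<T> {
    fn drop(&mut self) {
        let count = {
            // SAFETY: same argument as in `count`; the reference ends with this block.
            let inner = unsafe { &*self.inner };
            inner.refcount.get()
        };
        if count == 1 {
            // SAFETY: this is the last handle, so nothing else points at the allocation.
            drop(unsafe { Box::from_raw(self.inner) });
        } else {
            // SAFETY: other handles exist, so the allocation stays alive.
            unsafe { &*self.inner }.refcount.set(count - 1);
        }
    }
}

/// Returns (length seen through Deref, count with two handles, count after one is dropped).
fn my_rc_demo() -> (usize, usize, usize) {
    let a = MyRc::new(String::from("hello"));
    let b = a.clone();
    let len = b.len(); // Deref lets us call String methods on the handle
    let two = a.count();
    drop(b);
    let one = a.count();
    (len, two, one)
}

fn main() {
    println!("{:?}", my_rc_demo());
}
```

Keeping the count inside `RcInner` rather than in `MyRc` is the design point: every handle reads and writes the same counter, so the last one to be dropped can tell that it is the last.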
Right, and then we're also going to implement Deref, the same way that we did before, for Rc<T>, and it's gonna deref into the inner type T. And this is gonna be unsafe, because this is a raw pointer. I'll explain why we need these unsafe blocks in a bit.

So the problem we run into here is that if you have an Rc, the compiler doesn't know whether this pointer is still valid. If we define new, this might become a little bit clearer. So new is really just going to do: inner is Box::new of RcInner, where the value is v and the refcount is 1, which is the current thing we have. And then it's gonna return an Rc whose inner is Box::into_raw of inner. So we're gonna do a heap allocation, we're gonna stick in this shared state (we could call this Shared instead of RcInner if we wanted to), and then Box::into_raw sort of consumes the Box and gives us a pointer to it. The reason we want to use Box::into_raw rather than just dereferencing it is that if I just did this here, then when this scope ends the Box gets dropped, and so the memory gets freed, and that wouldn't be okay. We need to not drop the Box even though we don't have a Box anymore, and we do that by doing this.

But of course, inside of the deref (let's move the clone down here for a second), we're just saying: take this random pointer that's inside of this Rc and dereference it, it's fine, and then give back a reference. But the compiler doesn't know that this is still valid. It doesn't know that the Box that we initially allocated hasn't been freed since. For example, if I'd written this the way I wrote it earlier, this same code would have compiled, but it would have been wrong, because when this function returns the Box is freed, and so this pointer that we stored inside the Rc is invalid, and so this dereference would be invalid. And so this needs to be an unsafe
block. And what we're asserting here, SAFETY: self.inner is a Box that is only deallocated when the last Rc goes away. We have an Rc, therefore the Box has not been deallocated, so deref is fine.

Let's see. "Rcs are quite useful when building GTK apps, since passing a reference to a closure is somewhat tricky: wrap your value in an Rc, clone it, and send it to your closure." Yeah, in anything that's single threaded, which is often GUI loops for example, Rc is great for this kind of stuff.

"Could you explain the difference between reference types such as `&mut T`, `*mut T`, and `*const T`?" So `*mut` and `*const` are not references, they're raw pointers. In Rust there are a bunch of semantics you have to follow when you're using references. If you use the ampersand symbol, an `&` means a shared reference, and you have a guarantee that there are no exclusive references to that thing. Similarly, if you have an `&mut`, an exclusive reference, you know that there are no shared references. The star versions of these, `*const` and `*mut`, do not have these guarantees. If you have a `*mut`, there may be other `*mut`s to the same thing, there might be `*const`s to the same thing; you have no guarantees. But you also can't do much with a raw pointer. The only thing you can really do with it is use an unsafe block to dereference it and turn it into a reference, but that is unsafe, and you need to document why it is safe.

The difference between `*const` and `*mut` is a little fuzzy, but the basic semantics are: a `*mut` is usually something that you might be able to mutate, something you might have an exclusive reference to, whereas `*const` is intended to signify that it will never be okay for you to mutate this. So for example, in general you're not able to go from a const pointer to an exclusive reference, but you can go from a mutable
pointer to an exclusive reference.

"What does Box provide for us?" The Box here provides us with heap allocation. That's what lets us go from this RcInner, which would otherwise be on the stack, to a pointer to something on the heap, which is what we store here.

Okay, so for the clone here we're gonna increase the reference count. But here we have the same problem as we did for RefCell, which is that we have a shared reference to self but we need to mutate something inside of it. And lo and behold, the answer is the same thing that we've done before: it is our friend Cell. If this is a Cell<usize>, then what we can do is: c is inner.refcount.get(), and then we do inner.refcount.set(c + 1).

"Isn't unsafe a pretty weird keyword name? It just means something the compiler cannot guarantee is safe; nothing is actually unsafe." Yeah, the unsafe keyword is a little weird, because really what it means is "I have checked that the stuff inside the brackets is safe". It's like I, as the programmer, certify that this is safe. So it's not really unsafe; it's in some sense saying "I acknowledge that this code seems unsafe, but it's actually safe". So I agree with you, it's a little bit of a weird keyword name.

So we have the same problem for Rc as we did for RefCell, and this is the smart pointer part: we need to make sure that when the last Rc goes away, we actually deallocate. Otherwise it's gonna be a memory leak: you keep cloning your Rcs, eventually all of them go away, but nothing here actually drops this Box, and so the value is just going to live on forever on the heap, which is obviously not okay. So we need to implement Drop for Rc. And the question here becomes: when you drop an Rc, should we or should we not drop the inner value? So what we're going to do here is check what the count is. If the count is one, we are the
only reference; we are the only Rc left, and we are being dropped, so after us there will be no Rcs and no references to T. Otherwise there are other Rcs, so don't drop the Box, because other things are going to need it. But up here we are the last Rc, and so at this point we need to actually drop the inner value. And how are we going to do that? Well, there's Box::from_raw, which takes, lo and behold (and I need to drop inner here), a raw pointer and gives you back the Box, and we're gonna drop it immediately.

"What is the relationship between Box::into_raw and Box::leak?" Box::into_raw gives you a raw pointer that you can then do whatever you want with, including mutating through it. Box::leak gives you a 'static reference to the heap memory, because when you leak a value it's gonna live on the heap till the end of the program, and so giving out a 'static reference to it is fine: that reference will indeed always be valid, as 'static implies. And if you turn that into a shared 'static reference, you can't then mutate through it, for example, because it's just completely shareable.

You'll see that this code doesn't actually compile, and the reason is the difference between `*mut` and `*const`. Box::from_raw requires that we give it a pointer that is not `*const` but `*mut`. The contract here, as I mentioned, is a little fuzzy, but basically what they're trying to encourage is that you don't take some raw pointer that might be shared and try to turn that into a Box. Now, there are some fairly subtle things at work here, and this ties into something called coherence in Rust... well, sorry, not coherence but variance. Coherence is another beast. Variance is one of the primary differences between `*mut` and `*const`. It's not something you will usually run into, so I'm not gonna dive too much into it here, but
basically we need to give it a `*mut`, and currently we're giving it a `*const`. We could just cast it to a `*mut`, but instead what we're gonna do is use something from the standard library, this really neat thing called NonNull. You'll notice here it mentions variance. The primary reason to use NonNull, though, is for optimization purposes. A `*mut` can be null; it can have a value of zero, it can point to nothing. But with a NonNull, the compiler knows that the pointer is not a null pointer, which means that it can use the null pointer as an extra value. So the example they give here is: if you have an Option<NonNull<T>>, then the compiler can use the null pointer to represent None, so there's no overhead to an Option<NonNull<T>>. And the other thing that's nice about NonNull is that it's sort of like a `*mut`: we give it a `*mut`, which is what we get from Box::into_raw, and we can use it to get back this `*mut T`, which is what we need for Box::from_raw.

So let's go ahead and use that instead: use std::ptr::NonNull. So this is gonna be a NonNull of these, and then here this is gonna be a NonNull::new_unchecked. And here, of course, the safety argument is that Box does not give us a null pointer, because the Box actually does give us a heap allocation. And now what we can do here is use the unsafe as_ref method on NonNull instead of having this `&*` thing. So as_ref, and same thing here. And now down at the end here we can do self.inner.as_ptr(). And this is obviously unsafe, right? This is another case where the compiler doesn't know that we have the last pointer, and therefore that it's safe to turn this back into a Box and drop it. But we know, because we know that we're keeping the reference count correctly.

"Can't we leak a mutable reference to the value in RefCell by calling deref_mut
on RefMut, storing the return value somewhere, and then dropping the RefMut?" No. So this is a good question. Let's go back to this briefly. This ties into the way that Deref and DerefMut work. So the observation was: why don't I just get a RefMut, call deref_mut, take the mutable reference that I get back and save it somewhere, then drop the RefMut, and then use the mutable reference? This won't work, and the reason is how Rust deals with lifetimes. There's an implicit lifetime here, which is that the mutable reference we return lives only as long as the mutable reference to self, so the mutable reference to the RefMut. So if you tried to stick the mutable reference you got back from deref_mut somewhere, and then dropped the RefMut, and then tried to use this mutable reference again, the compiler would say no, that's not allowed: you're trying to use this mutable reference after the lifetime it's tied to has already expired, because the RefMut has gone away. So the compiler would not let you do that.

"If we have a mutable pointer, why do we need a Cell?" We don't have a mutable pointer in Rc... well, we have a mutable pointer, but it's not safe for us to mutate through it, is the difference. This is the difference between a mutable pointer and a mutable reference. A mutable reference guarantees that no one else is currently modifying it; it is an exclusive reference. A mutable pointer has no such guarantee. A mutable pointer is just a pointer with certain semantics, and we call it `*mut`. It does not carry the additional implication that it's exclusive, which is what allows you to mutate through things.

"If this memory is usize-aligned, can the Rust compiler fit other non-null variants in 0, 1, 2, and 3?" This is something that's being discussed in the unsafe working group for Rust. I don't think they've reached a verdict on it yet, though.

Whoa, Restream
seems to be duplicating a bunch of my things.

Someone asked, "Why do I drop the inner before I do this?" So this is me being paranoid. At this line we're dropping the Box, and so any pointer into that Box is invalid the moment we do this. inner is a pointer into that Box, and it doesn't get dropped until here or here, and so technically this reference is no longer valid from this point forward. So if I didn't put this here, someone could later come along and write inner.refcount -= 1 or .set(0), and the compiler wouldn't warn them that this isn't okay. Here they're accessing something through the pointer that we just deallocated, but they won't know that this is the case. So that's why: this code compiles, but if I drop this, this code no longer compiles, or at least shouldn't, because the inner has gone away. So that's the only reason.

All right, so now we have Cell, RefCell, and Rc. Great. Now let's talk about... actually, no, there's one more thing we need to do, and this is going to make your head hurt, so I apologize for that in advance. Let me just type this out first and then I will explain why it's there. Okay, so I think we need a test for me to demonstrate why this is a problem. So this is something in Rust called the drop check, and I don't want to get into it too much because it's fairly complicated, but I'm gonna try to cover it a little bit. If you look at the Rustonomicon, the unsafe 'nomicon, it goes into a lot more detail here.

So here's what we're gonna do... how am I going to explain this? This is a good question. This is some really gnarly stuff that usually you will not even need to know about, but I'm going to cover it because we're implementing Rc and we should do it properly. Imagine that I write the following code: y and x, and then I say x is... and y is Rc::new of this. I'm gonna try to explain what goes wrong if we don't add the stuff I added, and then I'm gonna explain why that is the case. Well, there are actually multiple things,
right? If I remove this for a second, it might make it easier to explain, and then I do `cargo test`... I don't know how to explain this. Let me try to explain it without explaining it. It's going to seem weird, but I think the full version is too detailed to be useful. When we write this, if we don't have this marker here, Rust does not know that this type owns a `T`. All it knows is that this type has a pointer to a `T`; it doesn't know that when this `Rc` goes away, there might be a `T` that gets dropped. This matters if `T` might contain lifetimes.

Rust has this thing called the drop check, which I'm going to explain in very basic terms because it's fairly complicated, but the intuition is there. Imagine that I have some type that contains a reference, and when it gets dropped it's going to modify what that reference points to. So I have some code that looks a little bit like this: a `struct Foo<T>` that has a `v` that's a mutable reference to `T`, and I implement `Drop` for `Foo`. And let's imagine that the drop does, I don't know, it calls some mutating method on `v`. Now imagine that someone writes a `main` function where they create a `t`, a `String`, then they create a `Foo`, and then they drop the `t`, and then they drop the `foo`. This code is problematic, right? Here we create a string, we create a `Foo` that has a mutable pointer to that string, and then we drop the string. At this point we cannot touch that string again, because when `t` is dropped the string goes away. But then we drop `foo`, and dropping `foo` calls a method on the string through its `Drop` implementation.

But drop is implicit, right? So if I write `let foo` and `t` here, what's going to happen is that the `t` is going to be dropped first and then the `foo`, because Rust drops things in reverse order, but I haven't written `drop` anywhere. And so when any type gets dropped, the compiler has to assume that every use of that type,
sorry, every drop of that type, is a use of the type and of any fields that it contains. So even though I haven't written `drop(foo)` here, there's an implicit drop of `foo` at the end of the scope of `main`, and Rust is going to treat that as accessing every one of its fields. This means that this code is going to be rejected. (I'm just putting something here so that it'll compile.) So when you drop the `Foo`, Rust considers it a use of all the fields of `Foo`, which includes the string. If these were dropped in the wrong order, Rust would actually catch this as a problem. It would say that when `foo` is dropped it tries to access `t`, but `t` has already been dropped. This is what's known as the drop check. But it can only do this because it knows that `Foo` holds a `T` in here, and because it knows that a `Foo` is dropped at the end of the scope.

Imagine that we did this with `Rc` instead. If I wrote `Rc::new` of `Foo`, then when that `Rc` is dropped, Rust looks inside of `Rc` and asks: does `Rc` contain any `Foo`s? In the old case, where we just had this, `Rc` doesn't contain any `Foo`, so the compiler is going to assume that when we drop the `Rc` no `Foo` is dropped, and therefore it doesn't have to check `Foo`'s implementation of `Drop`. And that's not okay. It means that this code would actually be allowed to compile when it shouldn't, even though `t` has gone away, because the dropping of this `Rc` will not count as a use of this string, since the compiler thinks that `Rc` doesn't contain a `Foo`.

That is what this marker is for. `PhantomData` is a way to say: there's one of this type, but I'm storing none of them. It basically tells the compiler to treat this type as though we have one of these in here, even though we only have a pointer to it. This lets the compiler know that we own something of that type, and so when you drop an `Rc`, it needs to treat it as dropping one of these. Okay, that is a
complicated, simplified explanation of a very complicated topic in Rust. I highly recommend that you go look at the Nomicon's section on the drop check. (Ooh, that's very bright, sorry about that.) The Nomicon on the drop check has way more details about what this check is, and about an escape hatch that we technically need in `Rc`, but I'm not going to go through it. Hopefully that made a little bit of sense, but I'm going to take some questions, because I agree that what I just explained is complicated. I felt like I couldn't not put this in there, because without it `Rc` is broken, and I couldn't put it in there without explaining a little bit about it. So yes, the `PhantomData` tells Rust that when you drop an `Rc`, an `RcInner<T>` might be dropped, and you need to check that. If we didn't have the marker, Rust would not assume this to be the case, because it's only a pointer to one.

"Would it be sufficient to have `PhantomData<T>` instead of `PhantomData<RcInner<T>>`?" Yes, although it's a good question. I think there actually was a pull request to the standard library to change it from being just a `T` to the wrapper type that we have internally in the standard library. I forget why that was changed. I think it's to guard against someone accidentally writing an `impl Drop` for `RcInner`, for example. If someone wrote that implementation and we just had `PhantomData<T>` here, the drop for `RcInner` would not be checked. That's just off the top of my head, though; it's my guess as to why.

"This is only needed when `T` is not `'static`?" Yes, but we want to allow any `T` here, right? There's no reason why `Rc` shouldn't work for other types.

All right, so there's one other thing I want to mention for `Rc` in particular: if you look at the real definition of `Rc` over here, you'll see that the `Rc` type allows the `T` to be unsized. `?Sized` here means it's opting out of... so Rust normally requires that every generic argument is `Sized`. We talked about this in a previous video, and `?Sized` is
the way to say I'm opting out of that `Sized` requirement. We're not actually going to go through how this works internally in this stream, because it's a little complicated and because you need some unstable features to actually support it fully. But if you are curious about the kind of stuff that `?Sized` will let you do, we'll probably do a future stream on trait objects that will go into it in a little bit of detail. You'll also want to look up the `CoerceUnsized` trait, which deals with some of the restrictions that make it hard to implement `Rc` fully yourself if you want to support dynamically sized types.

"The real question for me is: should this problem exist in the first place? I'm feeling that if I write the same thing in C, it'll be clearer and I won't be wasting time on all these." So I want to stress here that you will very rarely be writing the kind of convoluted stuff that I wrote just now in your own code. What we're writing is a very low-level primitive in Rust, right? The `Rc` type is a very low-level smart pointer type, and usually these concerns won't come up for you. And it's important that you can express these restrictions in Rust. In C, if you wrote this code, the C compiler would never check it. In C, the way this would manifest is that you run your code and it randomly crashes at runtime. In Rust, you have to think a little bit more about the code to make the types correct, but if you do, these problems are caught at compile time rather than at runtime. So it's true that in C you don't have to do this reasoning, but in C you also don't get the benefit of having this checked at compile time. So I don't think this is wasting time.

"What is the difference between `!Sized` and `?Sized`?" `!Sized` means not sized; `?Sized` means it does not have to be sized. This is because the default is that everything has a `Sized` bound, and `?Sized` is a way to opt out of that bound.

All right, so the next thing we're going to
cover in the last bit of time is the synchronized versions of these. If you have multiple threads, then the strategies we've written so far don't quite work, right? In the `Cell` case, if you have multiple threads, two can mutate at the same time. There just is no equivalent of `Cell`, because even though you're not giving out references to things, having two threads modify the same value at the same time is just not okay. So there actually is no thread-safe version of `Cell`.

`RefCell` is a little interesting. In the `RefCell` we wrote, you have `borrow` and `borrow_mut`, and they return `Option`s. You could totally implement a thread-safe version of `RefCell`, one that uses an atomic counter instead of `Cell` for these numbers. It turns out that the CPU has built-in instructions that can increment and decrement counters in a thread-safe way, so you could do that. In practice this is usually not what people want, because if you borrow and get a `None` but you need the `Some`, just because some other thread has it, you would have to spin in a loop to get the `Some` that you wanted, which isn't great. And so the synchronized version of `RefCell` is usually `RwLock`. If we look in `std::sync`, you'll see that it has a bunch of different types, and `RwLock` is one of them. A reader-writer lock is basically a `RefCell` where the counters are kept using atomics, so they're thread-safe. But also `borrow` and `borrow_mut`, which in the reader-writer lock are called `read` and `write`, don't return an `Option`. Instead they always give you back the equivalent of the `Ref` or the `RefMut`, but they block the current thread if the borrow can't succeed yet, until the conditions are met. So for example, if you call `borrow`, the equivalent of which in the reader-writer lock is called `read`, and another thread has an exclusive reference to the value, it will block the current thread until that exclusive reference is given up, and at that point that
thread will resume and you'll have the shared reference. Similarly, if you try to take the write side of the lock, the exclusive part of the lock, then it will block if there are any shared references that have been given out, and it will only stop blocking once there are no more shared references.

`Mutex` is sort of a simplified version of `RefCell`, if you will, where there's only `borrow_mut`, so you don't need to keep all these extra counts for how many readers or how many shared references there are. It's just: either some other thread has a reference to it, or no other thread does. It similarly has the blocking behavior, where when you call `lock` on a `Mutex`, it will block until there are no other references to the inner value, and at that point you're given that reference. And it similarly has a guard, the same way `RefCell` does, where you get back the equivalent of a `RefMut`, and when you drop it, it releases the lock and lets someone else go.

Okay, does the translation... oh, and `Rc`: the synchronized, thread-safe version of `Rc` is `Arc`, or atomic reference count. If we go back here to `std::sync`, you'll see there's an `Arc`, and this is a thread-safe reference-counting pointer. `Arc` is pretty much exactly the same as `Rc`, except that it uses these thread-safe operations, these CPU atomics, for managing the reference count, rather than a `Cell` the way that we did. And this is why an `Arc` is indeed `Send` and `Sync`. Actually, that's a good point: our `Rc` needs to not be `Send`. I don't know whether this is already not `Send`, but it needs to be not `Send`, which I'm going to mark as a... what types are not `Send`? Pointers are not `Send`. Actually, I don't know if `NonNull` is `Send`, which means we may be fine. With an `Rc`, it's not safe to send it to different threads, because the count is not thread-safe, right? If I sent an `Rc` to some other thread, and that other thread dropped the `Rc` and I dropped an `Rc` at the same time, both of us would try to use the `Cell` to decrement the count, but that's obviously
not okay, because `Cell` is not thread-safe. And so the `Rc` cannot be `Send`, and `NonNull` is indeed not `Send` by default.

"Can you guarantee thread safety across FFI boundaries, or is it just that the C code that calls in needs to uphold some guarantees?" Across FFI boundaries there are no guarantees. That's why everything across an FFI boundary is inherently unsafe, and any guarantees you want to give, you have to establish by wrapping it in an `unsafe` block.

"Why would you ever prefer `Rc` over `Arc`?" `Rc` is much cheaper. There's a cost to using these atomics: they're much more expensive in terms of number of CPU cycles and coordination overhead between cores. So in general you want to prefer the non-thread-safe versions if you can, because they have lower overhead.

"Is there an async `RwLock`?" Yeah, I think async-std, Tokio, the futures crate, and the futures-intrusive crate all have asynchronous mutexes.

All right, so now we've covered the... oh, sorry: `Rc` is indeed not `Send`, because `NonNull` is not `Send`, so we're fine there. We have a couple of minutes left, so in the last couple of minutes let's look at the `borrow` module. The `borrow` module is a little weird, because it's sort of a smart pointer. We're going to look specifically at the `Cow` type. `Cow` is an enum that is either `Owned` or `Borrowed`. So if you have a `Cow` of `T`, then the `Cow` either contains a reference to a `T` or it contains a `T` itself, roughly. Think of this as: either it contains a reference to a string, or it contains the string itself. The name comes from copy-on-write, and the idea here is that when you have a... let me pull it up here. So `Cow` implements `Deref`, so you can get a shared reference into the `Cow`. If the `Cow` just holds a reference to something else, it just passes access through to that, and if it owns the thing it contains, then it gives you a reference to that. But crucially, if you want to modify the value inside of a `Cow`, then if
it's a reference, you can't modify it, because it's a shared reference. So the magic of `Cow`, in some sense, is that if you require write access, so you call `to_mut`, and it's currently borrowed, then it will clone the value and turn it into the owned version. So if you had a reference to a string, it will clone the string and store a copy of the string instead, hence the copy-on-write, and then it will give you a mutable reference into this thing.

The place `Cow` usually shows up is if you have some operation where most of the time you don't need a copy, because you're only going to read, but sometimes you need to modify it. This comes up pretty often in string operations. So imagine that you have a... what's the best example of this? Imagine that you have an escaping function. Let's go back to `lib`. Imagine you have something like an `escape` that takes a `&str` and returns a `String`. This is going to turn every single quote into a backslash single quote, it's going to turn every double quote into a backslash double quote, etc. But imagine that you're given the string `foo`, right? The string `foo` just turns into the string `foo`. There's no escaping; you don't need to modify the string at all. And so it would be kind of sad if we returned a `String` here, because then when you're given a `foo` you would still need to clone it, even though you didn't need to modify it at all, because the signature says you're returning a `String`.

This is where `Cow` comes in really handy. You can return a `Cow` here, and of course the lifetime here is the same as that of the input. And now what we can do is: if we don't have to modify it, if it's already escaped, if you will, then we can return a `Cow::Borrowed` of `s`, and we don't need to do any allocation. Only in the other case do we do `let string = s.to_string()`, this is mutable, and we do
something to `string`, like add backslashes, and then we return a `Cow::Owned` of `string`. So the benefit here is that if we don't have to change anything, we just pass it through, and only if we do have to change something do we do the cloning and the mutation. And if we do have to do the copy, we also pass that ownership to the caller, right? They're going to get a `Cow::Owned` that they can then operate on, and they then have ownership of it.

"Why does `from_utf8_lossy` return `Cow` but the other UTF variants don't?" Yeah, that's a good example. The `String` type in the standard library has a function called `from_utf8_lossy`. What it does is take a bunch of bytes and return a `Cow<str>`. The reason it does this is that if the given byte string is completely valid UTF-8, then you can just pass it straight through: you can just cast it to a string reference and pass it on. I guess this should be `Cow::Borrowed` of `bytes as &str`. It's not quite what it does, but that's the idea if the bytes are valid UTF-8. And if they're not valid, then what they do is create a vector of bytes (the input is a slice, right?), and then they walk through the bytes and replace anything that's not valid UTF-8 with the invalid-character UTF-8 symbol, and then they return `Cow::Owned` of those bytes as a string. This is a little simplified, a little pseudocode, right? But the idea here is basically the same as what we saw for the escaping: if you don't need to modify, then don't allocate, and the `Cow` type lets you do that. The other `from_utf8` functions usually allocate regardless, and if they allocate regardless, then there's no reason for the `Cow` type. Some of them are checked, like `from_utf8`, so they return a `Result`, and there you never need the mutation, so you can always return a reference.

I think we've now covered most of the smart pointer types in Rust, right? So we talked
about... go back here. We talked about `Cell` first, which is for non-thread-safe, non-reference interior mutability. Then we talked about `RefCell`, which is dynamic interior mutability. Then we talked about `Rc`, which is for dynamically shared references: something where you don't know how many references there are going to be, and you don't know when the inner value is going to be dropped; you only know that at runtime. Then we looked at the thread-safe versions, the synchronized versions, of those types. And then we looked at `Cow`, which is not really a smart pointer but kind of a smart pointer, right? It's like a copy-on-write pointer that upgrades when you need it.

All right, I think that's all I wanted to cover. There's a lot to digest here, so consider going back and watching the stream again when the recording goes up, and see if you can tease apart some of the explanations, which are a little convoluted at times. Apart from that, thanks for watching. I think the next thing will probably be a bit of a potpourri, a mix of different things: maybe trait objects, maybe things like the `Borrow` trait. Oh, I forgot to talk about trait delegation. Yeah, I'll probably make that into a subsequent stream. Great. All right, thanks for watching. I'll see you next time. Bye! Hey, if I can make this work... nice.
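The `PhantomData` marker from the drop-check discussion can be hard to picture without the surrounding type. What follows is a minimal reconstruction of an `Rc` with that marker, not the code from the stream: the field names (`value`, `refcount`, `_marker`) and the `count` helper are assumptions made for illustration, and where the stream used an explicit `drop(inner)` before freeing the box, this sketch ends the reference's scope with a block instead.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;

// Heap allocation shared by all clones (field names are assumptions).
struct RcInner<T> {
    value: T,
    refcount: Cell<usize>,
}

pub struct Rc<T> {
    inner: NonNull<RcInner<T>>,
    // Tells the drop check that dropping an `Rc<T>` may drop an `RcInner<T>`,
    // even though we only store a pointer to one.
    _marker: PhantomData<RcInner<T>>,
}

impl<T> Rc<T> {
    pub fn new(value: T) -> Self {
        let boxed = Box::new(RcInner { value, refcount: Cell::new(1) });
        Rc {
            // SAFETY: `Box::into_raw` never returns a null pointer.
            inner: unsafe { NonNull::new_unchecked(Box::into_raw(boxed)) },
            _marker: PhantomData,
        }
    }

    /// Hypothetical helper for the example: current reference count.
    pub fn count(this: &Self) -> usize {
        // SAFETY: the allocation lives as long as any `Rc` to it exists.
        unsafe { this.inner.as_ref() }.refcount.get()
    }
}

impl<T> Clone for Rc<T> {
    fn clone(&self) -> Self {
        // SAFETY: `self` keeps the allocation alive.
        let inner = unsafe { self.inner.as_ref() };
        inner.refcount.set(inner.refcount.get() + 1);
        Rc { inner: self.inner, _marker: PhantomData }
    }
}

impl<T> Deref for Rc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        // SAFETY: `self` keeps the allocation alive.
        &unsafe { self.inner.as_ref() }.value
    }
}

impl<T> Drop for Rc<T> {
    fn drop(&mut self) {
        // The reference into the allocation is confined to this block, so it
        // cannot be used after the box is freed below.
        let is_last = {
            // SAFETY: the allocation is still alive at this point.
            let inner = unsafe { self.inner.as_ref() };
            let count = inner.refcount.get();
            if count > 1 {
                inner.refcount.set(count - 1);
            }
            count == 1
        };
        if is_last {
            // SAFETY: this was the last `Rc`, so nobody else can observe the free.
            drop(unsafe { Box::from_raw(self.inner.as_ptr()) });
        }
    }
}

fn main() {
    let a = Rc::new(String::from("hello"));
    let b = a.clone();
    println!("{} (count = {})", *b, Rc::count(&a));
}
```

Because the count lives in a `Cell` and the pointer is a `NonNull`, this type is neither `Send` nor `Sync`, which matches the point made in the stream about why `Rc` must stay on one thread.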
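For the `?Sized` discussion, a small sketch may help show what opting out of the default `Sized` bound buys you at a call site. The `show` function is a hypothetical name chosen for this example; it is not from the stream or the standard library.

```rust
use std::fmt::Display;

// Without `?Sized`, `T` would have to be `Sized`, which would rule out
// calling this with `str` or `dyn Display` behind a reference.
fn show<T: ?Sized + Display>(value: &T) -> String {
    format!("<{}>", value)
}

fn main() {
    let as_trait_object: &dyn Display = &1.5;
    println!("{}", show("a str slice")); // T = str (unsized)
    println!("{}", show(&42)); // T = i32 (sized)
    println!("{}", show(as_trait_object)); // T = dyn Display (unsized)
}
```

The bound only relaxes what `T` may be; the function still has to take `T` behind a pointer, because an unsized value cannot be passed by value.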
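The thread-safe counterparts described in the stream (`Arc` for `Rc`, `Mutex` and `RwLock` for `RefCell`) can be sketched with the standard library types directly. The function names `count_with_mutex` and `read_then_write` are assumptions made up for this example.

```rust
use std::sync::{Arc, Mutex, RwLock};
use std::thread;

/// Spawns `n` threads that each add 1 to a shared counter, then returns the total.
fn count_with_mutex(n: usize) -> usize {
    // `Arc` keeps the reference count with atomics, so it can cross threads.
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // `lock` blocks until no other thread holds the guard.
                let mut guard = counter.lock().unwrap();
                *guard += 1;
                // The guard is dropped here, which releases the lock.
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

/// Takes a shared (read) borrow, releases it, then takes an exclusive (write) borrow.
fn read_then_write() -> Vec<i32> {
    let data = RwLock::new(vec![1, 2, 3]);
    let len_before = {
        // `read` is the blocking analogue of `RefCell::borrow`.
        let reader = data.read().unwrap();
        reader.len()
    }; // read guard dropped here
    // `write` is the blocking analogue of `RefCell::borrow_mut`; it would
    // wait if a read guard were still alive on another thread.
    data.write().unwrap().push(len_before as i32 + 1);
    data.into_inner().unwrap()
}

fn main() {
    println!("{}", count_with_mutex(8));
    println!("{:?}", read_then_write());
}
```

Where a `RefCell` written as in the stream hands back `None` on a conflicting borrow, these locks make the calling thread wait instead, which is the trade-off the stream describes.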
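The escaping function that was typed out as pseudocode in the stream can be written as a small runnable sketch. The exact escaping rules here (a backslash before each single or double quote) follow the description in the stream; the implementation details are one possible way to do it, not necessarily what was on screen.

```rust
use std::borrow::Cow;

/// Escapes single and double quotes with a backslash.
/// Borrows the input when nothing needs escaping, so that case never allocates.
fn escape(s: &str) -> Cow<'_, str> {
    // Fast path: nothing to escape, so hand the input back as-is.
    if !s.chars().any(|c| c == '\'' || c == '"') {
        return Cow::Borrowed(s);
    }

    // Slow path: build an owned copy with the backslashes added.
    let mut escaped = String::with_capacity(s.len() + 2);
    for c in s.chars() {
        if c == '\'' || c == '"' {
            escaped.push('\\');
        }
        escaped.push(c);
    }
    Cow::Owned(escaped)
}

fn main() {
    let plain = escape("foo");
    let quoted = escape("it's");
    println!("{} {}", plain, quoted);

    // `to_mut` clones a borrowed value into an owned one on first write.
    let mut cow: Cow<'_, str> = Cow::Borrowed("foo");
    cow.to_mut().push_str("bar");
    println!("{}", cow);
}
```

The caller can treat the result as a `&str` through `Deref` in both cases, and only pays for an allocation when the input actually contained a quote.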
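The `from_utf8_lossy` behaviour discussed in the stream can be observed directly from the standard library. In this sketch `decode` is a hypothetical wrapper added only to report whether an allocation was needed; `String::from_utf8_lossy` and `std::str::from_utf8` are the real standard library functions.

```rust
use std::borrow::Cow;

/// Hypothetical helper: decodes `bytes` lossily and reports whether the
/// result had to be an owned (allocated) string.
fn decode(bytes: &[u8]) -> (Cow<'_, str>, bool) {
    let text = String::from_utf8_lossy(bytes);
    let allocated = matches!(text, Cow::Owned(_));
    (text, allocated)
}

fn main() {
    // Valid UTF-8: the bytes are reinterpreted in place, no allocation.
    let (ok, ok_allocated) = decode(b"hello");
    println!("{:?} allocated={}", ok, ok_allocated);

    // Invalid byte 0xFF: replaced with U+FFFD, which requires a new String.
    let (bad, bad_allocated) = decode(b"hel\xFFlo");
    println!("{:?} allocated={}", bad, bad_allocated);

    // The checked variant never modifies, so it can always return a reference.
    println!("{}", std::str::from_utf8(b"hel\xFFlo").is_err());
}
```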
Info
Channel: Jon Gjengset
Views: 45,637
Rating: 4.969543 out of 5
Keywords: rust, live-coding, mutability, cell, refcell, smart pointers, rc
Id: 8O0Nt9qY_vo
Length: 123min 4sec (7384 seconds)
Published: Wed Jun 17 2020