Crust of Rust: Lifetime Annotations

Reddit Comments

I really enjoy all of /u/jonhoo's streams, but this is probably my favorite so far. The shorter length made it reasonable for me to view and digest in a single sitting, and the topic was especially relevant for a Rust programmer of my skill level.

Thanks a lot! I would really love to see more streams in this format!

38 points · u/highspeedlynx · Apr 23 2020

Jon you are amazing! I love your streams a lot.. please keep em coming!

9 points · u/d3adbeef123 · Apr 23 2020

The way Jon explains things is really coherent and well done. Loved his livestream

6 points · u/beemstream · Apr 23 2020

Actually a good video to explain lifetimes.

I really recommend it.

Especially, the one with two lifetimes when a function has a shorter lifetime.

5 points · u/Lexikus · Apr 23 2020

Always great content. You are one of the programmers that got me into Rust. Thanks a lot.

6 points · u/staszewski · Apr 23 2020

Wait what, their Firefox is upside down, how can I get that :O !!

4 points · u/villiger2 · Apr 23 2020

What happens when it drops a 'static value at the end of StrSplit's lifetime? Does the "runtime" account for the fact that that particular &str was pointing to the binary's strings?

3 points · u/[deleted] · Apr 23 2020

This is great. I would suggest either splitting it into 20 minute videos or at least time-stamping different checkpoints, which will help intermediate-level people (I think I am between beginner and intermediate), so they can skip the parts they feel confident in and focus on the parts they don't understand. It would be even better if every video time stamp had a corresponding git commit.

Using this video as an example:

https://youtu.be/rAl-9HwD858?t=3466 - Introducing and managing multiple lifetimes.

https://youtu.be/rAl-9HwD858?t=4527 - Making structs and impls generic over lifetime + Trait

2 points · u/wouldyoumindawfully · Apr 23 2020

Hi,

thoroughly enjoyed that video. I especially liked the fact that you were answering questions in the chat as well, as there were some pretty good ones.

I also have the issue that there is a huge mental jump from beginner to intermediate. I don't want to read yet another post/tutorial on lifetimes, etc., but how to use those concepts in real-world scenarios, and this video did exactly that.

I personally would benefit a lot from a similar video on how to work with Rust in projects with Rc, RefCell, mutability in those cases, etc. Maybe something like that is in the cards?

Anyway, I hope there is more coming...

Thank you,

2 points · u/st-man · Apr 24 2020
Captions
Hello everyone. I saw the Rust Survey 2019 results, and while reading through them I came across this little bit: people are asking for more learning material about Rust, specifically intermediate-level material, and a lot of them are asking for video content specifically. If you're watching this, maybe you already know I have a YouTube channel where I do live streams and upload intermediate Rust content, so I was like, huh, this sounds like my wheelhouse. So I tweeted out: what would you like to see if I were to do some of this that's a little less advanced than the stuff I normally do, and a little shorter and more self-contained than what I usually do? I got a ton of responses to this, so it was pretty cool to see all the ideas that people had. The suggestions were sort of all over the place, and basically what I got from it is that people are confused about lifetimes, and they don't want to see another explanation of lifetimes; they want to see code that actually uses them in order to try to understand what's going on. So I figured maybe that's something I could do.

I only have one video that's more beginner-friendly than the normal ones I do, which are usually much longer sessions where we build something real in Rust, and that is where we did a live coding of a linked hash map. That one actually turned out pretty well; I think it's easy to follow if you're relatively new to the language. But I figured I would do something that's dedicated to this, and if it turns out to work well I might do more in the future. So that's where we are now. Specifically, in this stream what I want to do is have us write a bunch of code in Rust, not very much, but you'll see we cover multiple lifetimes, a little bit about strings, because that seems to be something that's confusing people, and a little bit of generics, depending on how we turn out on time.
My guess is this stream will be about ninety minutes; that's sort of the target range I'm going for with this, which is much shorter than my usual streams. I'll try to be less verbose than I usually am. If this is the first time you watch any of my videos, first of all, welcome, and second of all, if you go to my Twitter account, this is basically where I post about any new videos I'm about to release, both to announce the live streams in the first place and to ask for input on what you want to see. Whenever I upload a recording of a live stream, I will put it up there too. So without further ado, let's get started.

Because this stream is geared more towards people who are still getting to grips with some of the complexities in Rust, I'll also be taking a bunch of questions from chat. If at any point you feel like you're not quite following, or something doesn't make sense and you would like me to explain it again, then please mention it in chat, and I'll try to look over there every now and again to make sure that you're all following along. This is particularly important because the people who are watching this after the fact won't have the opportunity to ask questions. If you have a question live, chances are someone else will also have it later, and if you ask it, the answer will be recorded in the stream.

Great. I also got some questions from people who are relatively new to Rust, like: how do I even start a Rust project? That's not normally something I cover, but I figure given that we're starting something new, I'll start from the very beginning with cargo new. We're going to make a library, and the library we're going to make is one that lets you take a string, split it by another string, and walk the splits of that string. So we're going to call it string split, or strsplit. It's not a very original name, but it doesn't have to be. Inside here you'll see the
Cargo.toml, where we get to define metadata about our new crate, and also src/lib.rs, which currently has basically nothing of real value. As a starting point, there's a bunch of stuff I like to add as a prelude to any package that I make: I like to add warn for missing_debug_implementations, rust_2018_idioms, and missing_docs. There are a bunch of others you might add too. These aren't going to matter that much for the stream; we're not going to be writing a ton of documentation, even though normally I do when I publish something like this. Here we're going to focus more on the insides of the thing we're going to build, but I figured it would be good to give you that prelude; it's something you can use in your own crates too. I like this to be warn, not deny, because these change over time: the compiler gets smarter about some of these lints, and you really don't want a lint to break your compile because someone has a later version of the compiler than the one you originally built with.

All right, so here's what we want. We want a type, and it's going to be called, let's say, StrSplit. We're going to add fields to it after a while, but there's going to be a method called new, and it's going to take some haystack (the haystack is usually the thing that you are searching in) and some needle, which in this case we're going to call the delimiter; this is the thing that we're splitting by. And it's going to return a Self. Self here is a special type that refers to the type named in the impl block. We could write StrSplit here; using Self is just nice because it means that if we rename the type later, we don't have to change all the return types of the methods as well. So this is saying "I want to split this by this". We could call this something like split_by instead of new, but we're not really doing API design here.
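A minimal sketch of the skeleton as described up to this point, before any fields or lifetimes exist. The empty struct body and the discarded arguments are placeholders assumed here so that the sketch compiles on its own; the lint prelude is shown as a comment because it belongs at the very top of src/lib.rs.

```rust
// At the top of src/lib.rs, the prelude from the stream would read:
// #![warn(missing_debug_implementations, rust_2018_idioms, missing_docs)]

/// Placeholder type: the stream adds fields (and lifetimes) later.
#[derive(Debug)]
pub struct StrSplit {}

impl StrSplit {
    /// `Self` is an alias for the type named in the `impl` header, so
    /// renaming `StrSplit` would not require touching this signature.
    pub fn new(haystack: &str, delimiter: &str) -> Self {
        // Assumption for this sketch: the arguments are ignored until the
        // struct has fields to store them in.
        let _ = (haystack, delimiter);
        Self {}
    }
}
```

Writing the return type as `Self` rather than `StrSplit` is purely a convenience; both spell the same type inside this `impl` block.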
This is just to give you a sense for what this type is going to do. Then what we want to do is implement Iterator for StrSplit. The Iterator trait allows you to do something like "for part in": let's say you have an x that's of type StrSplit, then you can do "for part in x" as long as x's type, StrSplit, implements Iterator. The Item here is going to be &str, and the only thing that you need to implement on iterators is the next function, which takes a mutable reference to self and returns an optional Item. Notice I haven't put in any implementations or any fields yet; I'm just giving you roughly the API we're going to be working with. The idea here is that the for loop construct is really turned into a "while let Some(part) = x.next()"; that's what for desugars to. So it's going to keep calling next while it's still returning Some, and when it no longer returns Some, the loop will be terminated.

We could arguably write this as a test, although this test won't really do much at the moment. The idea is that you have some string, let's say "a b c d e", and you want to do "for letter in StrSplit::new(haystack, " ")". That's going to be the idea, and this is going to produce "a", then "b", then "c", then "d", then "e", ideally. How do we write a test for this? Well, we could collect the letters, or in fact it could just be the iterator itself, and then we can assert that letters equals something like an into_iter; you can compare iterators as long as they have the same item type.

Does the basic thing we're building make sense?

We'll probably not talk about higher-kinded lifetimes; that probably won't come up.

"Won't that just add noise when debugging early prototypes?" Yes. This whole prelude: in the initial phases of development you probably don't want it on, because it's just going to cause you to get more warnings, so it's harder to see which
ones matter and which ones are more stylistic, if you will.

"How did you decide between library and binary, and how do you check the library output results while coding?" You decide between library and binary depending on whether you're building a binary or a library, right? A binary is if you're building a program that someone's going to run on the command line; everything else is a library. You might build a thing that has both a library and a binary, in which case it doesn't really matter. The only difference between the --lib and --bin flags is that the --bin flag creates a src/main.rs and the --lib flag creates a src/lib.rs. Those are most of the differences, but you can have both in one crate. And if you build a library, the way you check its output is you write tests.

"What do you do to mock external dependencies?" I'm not going to cover that in the stream, but there are good ways to do it.

"I thought all loops desugar to loop with a break condition." Well, in MIR I think while loops desugar to loops as well. It's just easier to explain it as for turning into a while, but you're right that deeper down, while turns into a loop.

Yep, all of this is going to be about lifetimes, as you're about to see. Great.

"Equality comparison between iterators is element-wise?" Yeah, and it also checks that the lengths are the same.

Okay, so now we have a program, and now we're going to have to actually figure out how to write this. There are a couple of things that immediately come up. Let's start with a field I'm going to call remainder, and we're going to have a delimiter. The remainder is the part of the string that we've not yet looked at; we haven't returned anything from it yet. The delimiter is what we're splitting by, which we need to remember over time. So new is really just going to create a new Self where the remainder is the haystack and the delimiter is the delimiter. Notice that for delimiter here I don't have to write "field: value", because the field and the
variable have the same name, so I can sort of de-duplicate them. In the case of remainder, the field and the variable do not have the same name, so I give both. I want the field to be called remainder and I want the argument to be called haystack, and that's why I end up with this format.

Okay, so new is pretty straightforward, and most of the magic here is going to be in the implementation of next. So the question now becomes: what do we do in order to implement next? Well, the implementation is pretty straightforward. We're going to find where the next delimiter appears in the remainder, then we're going to chop off that part of the string, which is what we're going to return, and we're going to set the remainder to what remains after the delimiter.

So next_delim is going to be self.remainder.find, and we're going to look for self.delimiter. This is an "if let Some", because it could be that the delimiter no longer appears in the string, in which case we're sort of done. But if it does appear, then until_delimiter is going to be self.remainder from the start until the next delimiter. Then we're going to modify the remainder to be everything following that delimiter, so that's going to be next_delim plus the length of the delimiter, everything from there on out. And then we're going to return Some of until_delimiter.

If the delimiter is not found, we have two options: either the remainder is empty, in which case we return None, or the remainder is not empty, in which case we return the remainder. In that case rest is going to be self.remainder, self.remainder is going to be set to the empty string, and then we're going to return Some of rest. You'll see there's actually a bug in this code; I'll get to that later.

Okay, let's take some
more questions.

"Is the capital-S Self really the preferred way of implementing that? I'm very much new to Rust and it seems a bit odd coming from other languages." I've seen both. I personally prefer using Self, as I mentioned, because it means that if I change the name of the type I don't have to change anything else. I think the biggest cost to this is that you can no longer do local reasoning: looking at this line of code, you have to figure out which type's impl block you're inside, which is trivial here, but if you have a really long one it might be trickier. Also, you need to rely on a later release of the Rust compiler, although this was added a decently long time ago now, I think.

"Can you explain later when I should use associated types versus generics?" There is actually a decent description of that in the Rust book, I believe. The basic idea is: use generics if you think that multiple implementations of that trait might exist for a given type; use associated types if only one implementation makes sense for any given type.

"When to use match versus if let Some?" I use match if I care about more than one of the patterns. If I only care about one of the patterns, then I use if let.

"How should I read line 15?" I'm using relative line numbers, so when you say 15 it's not clear what you mean.

"Isn't an else if required?" Oh, you're right, that should say else if. Good catch.

Great. All right, so we have an implementation now that we think works. I'm just going to write a "TODO: bug" here, because I know there's a bug but I don't want to talk about it yet. Okay, so in theory we're now done, right? We have our implementation. Great. We're just going to cd into strsplit and run cargo test. Oh no, it doesn't compile. What is this? It's telling me "missing lifetime specifier" all over the place. Let's run cargo check instead so we don't get these duplicated. It's telling me, in all these cases where I have references, that I need to give them a
lifetime, because it can't figure out what that lifetime is. Okay, so we do what we've been told. The compiler said to add a 'a, so we're going to add a 'a. Okay, now it's telling us we have to do that here as well, so we're going to use the new, fancy anonymous lifetime here, and I'm going to do the same thing here. All right, so we did that; now they all have a lifetime. I'll talk a bit about what this actually implies, but let's just see if it's happy.

It's still not happy down here. Down here in the iterator, we say that the iterator returns a string reference. Think of this as a pointer to a string. Rust needs to know: how long can I hold on to this pointer for? For example, this pointer, we know it points into the remainder here. But Rust, when it calls the iterator's next method, just gets back a pointer to a string, and if that's all it gets, then can it hold on to that until the end of the program? Can it drop the StrSplit and then still use it? It doesn't know how long it's okay to keep using this pointer for, and it needs to know, because otherwise it might use that pointer after the string it's pointing to has already gone away, after the memory has been deallocated, and that would be a problem.

Well, we have this 'a now. This 'a here is really a lifetime; let's keep calling it 'a for now. It's really "how long does this reference live for?" What we're saying here is that if you have a StrSplit, then the remainder and the delimiter both live for this long; the pointers are valid for that long. And down here, really what we're saying is: if the remainder is valid for this long (the 'a here and the 'a here are the same; we could call them different letters, think of this as being generic over a lifetime), then the thing that we return has the same lifetime. You could imagine
other lifetimes here, right? The lifetime of the returned string could be something that's tied to the lifetime of the StrSplit itself. But that's not the case here: there's some lifetime that's longer than the StrSplit. Even after you drop the StrSplit, the thing that you got back from the iterator is still valid, because it's about the lifetime of the string we were originally given. The haystack we were given, that's what matters.

All right, that was a lot to cover, so let's talk about that before we continue.

"That looks very foreign." Yeah, it is foreign; you don't have lifetimes in other languages in general.

The plugin here is coc with rust-analyzer.

"Can I be wrong by specifying lifetimes?" You can never be wrong by specifying lifetimes. Think of it this way: it's like using the wrong type. You can use the wrong type, but eventually you have to call a function and provide something of some type, and you give it some other type, and the compiler goes, "these are not the same". The compiler generally won't let you compile a program with the wrong lifetimes. So you can't really give the wrong lifetimes any more than you can give the wrong type, in the sense that the compiler is going to catch that you did that.

"How do you tell where an anonymous lifetime can be used?" Anonymous lifetimes are places where you tell the compiler "guess what lifetime", and that only works when there's only one possible guess. One example of this: let's have some other impl here, I'm just going to make one up, with something like get_ref. If it takes a reference to self and it gives back a reference to str, there's only one other lifetime here, and that's the lifetime of self, so the compiler can guess what this lifetime is. You don't need to give this; the compiler understands when you give this
that it must be this lifetime. So that's an example of where the compiler can guess.

"What's the difference between 'a and '_?" '_ is telling the compiler "you guess the lifetime". It's sort of the same as underscore for types; that's where it originally comes from. Sorry, not underscore for types: it's sort of like a pattern that matches anything. That's not quite true either, but it's something you can use when you don't want to specify the lifetime and you think the compiler can figure it out. 'a is a specific lifetime; it's similar to a generic, like a T.

"Is there any kind of ordering on lifetime specifiers, like is 'a more than 'b?" No. Well, yes, you can order lifetimes based on how long they are. For example, the special lifetime 'static is one that lasts for the entire duration of the rest of the program, so you can have some 'a that's shorter than, or smaller than, 'static. In general, though, the name you give does not matter, just like the name of a generic does not matter; 'a versus 'b is just a name that you choose.

"How does the compiler know it's wrong if it cannot infer it?" Think of it this way: I can write a function multiply that takes an x that's the unit type and a y that's an i32. This is wrong, right? I can't write the implementation here, x times y, and so the compiler knows that it's wrong. But the compiler doesn't know what this should be; only you know what this should be. So the compiler can tell you that you're wrong, but it can't tell you what the right answer is.

Yeah, so '_ is basically type inference for lifetimes.

"Why would you not elide the lifetime instead of leaving the '_ in the type?" You basically want to elide whenever you can. There are some cases where you can use '_ to say: don't consider this lifetime for the purpose of
guessing. As we expand this a little bit more, this might become clearer.

"Is there a way to use multiple lifetime specifiers in the same impl?" Yes, we'll see that in a second.

Yes, you can specify an order for lifetimes. We won't need that here, but you can.

"Does '_ only get used if there's only one possible lifetime?" No. You can also use '_ if, imagine you're writing a function something like this, where x is a &'a str and y is a &'b str, and you want to say this returns something with the lifetime of x. We could call the lifetimes 'x and 'y instead to make this easier to read. If this is the impl you have, you can simplify it with anonymous lifetimes by writing this, in which case this one gets ignored. In argument position, it gets turned into an arbitrary unique lifetime, and in output position it means type inference, basically lifetime inference. So it's going to infer that this must be tied to x but must not be tied to y, because y has an anonymous lifetime.

"So in other words, the lifetime of StrSplit's remainder and StrSplit's delimiter is now tied to the lifetime of the StrSplit itself?" No. This is where you'll see that we still get a compile error, so let's actually move on to that, because I think those are most of the questions.

Great, so let's do a cargo check. Okay, what does this do? Now we get an error saying "lifetime of reference outlives lifetime of borrowed content". This is where we get into sort of weird lifetime land, and this is probably an error that you've seen in the past, where you throw up your hands and go, "what is even going on?" So let's try to actually read through this. It's complaining about the new function, and it's saying specifically that there's a problem with haystack: the reference is valid for the lifetime '_ as defined on the impl up here, but the borrowed content is only valid for the anonymous lifetime #1 defined
on the method body at 10:5. So 10:5, line 10, column 5, is right around here. What it's telling us is: you told me that you were going to give me something with this lifetime (when we say new returns Self, that Self has this lifetime), but the thing that you gave me for remainder, which is supposed to have that same lifetime (the remainder here is supposed to have 'a, where the 'a is the one from the definition of StrSplit), has a lifetime that is just whatever this lifetime is, and those are not the same. Specifically, I don't know that the haystack pointer here lives for as long as this lifetime here. They're both just some lifetime, and we haven't given any relationship between them. The same thing goes for delimiter: the delimiter that's passed in is a pointer to some string, but for all we know, the moment new returns, the strings that haystack and delimiter point to might be deallocated immediately, because we haven't put any restrictions on the lifetimes of those parameters.

Imagine if that were the case. If the caller immediately removed those strings from memory, we would still have a StrSplit hanging around with some lifetime that has some random name, and that StrSplit struct would still have pointers to those strings. That should obviously not be okay, because it's not okay for us to continue to refer to those strings once the memory has gone away. So clearly there has to be some relationship between the string pointers that are passed in here and the lifetime of the pointers we hold inside StrSplit, and we want the compiler to ensure that as long as the StrSplit is around, those strings are still accessible through the pointers we were given.

How can we express that? What we really want to say here is: I can give you a StrSplit with a lifetime 'a if you give me string
pointers that are also 'a. You see the difference here? Here we're saying: the pointers you give me can live for however long you want, but they have to live for at least some duration 'a, and the type I give you back has a lifetime that is the same as that lifetime. The compiler is now going to check that you can only keep using this as long as that lifetime is still live, which implies, by the fact that it is connected to this lifetime, that you can only keep using the StrSplit for as long as the input strings are still valid. Does that make sense?

Okay, that was a lot more. Let's iterate on that and then continue.

"Why do we use these generic names for lifetimes and not proper names like typical variables?" I mean, why do we use T for generic types? That said, I have seen an increasing number of people using more descriptive names, and my plan is to do the same here. We can't currently do it, but I'll get to it a little bit later.

"How resilient is the anonymous lifetime? Will you get yourself in trouble if you rely on it too much, or is the compiler going to pick correctly the vast majority of the time?" Use it if you can is generally the answer for the anonymous lifetime.

"Can you impose restrictions between lifetimes?" Yes, you can. Here, so far, we only have one lifetime, 'a, so there's no relationship to really give. But yes, you can have more than one lifetime and then give relationships between them, saying this reference must live at least as long as this other reference. I don't think we'll need that here, but we'll see.

Great. Yes, this is very much related to type systems. Lifetimes are like types, and you can use similar language to talk about them. I don't know to what extent this is actually accurate, but in general you can think of the relationship between lifetimes as sort of like subtyping.

Next question: why is the 'a next to the
impl keyword needed? Yeah, so notice we're doing 'a here and 'a here. Those are needed for the same reason as if you have some struct Foo that's generic over T: you cannot write an impl for Foo of T without declaring T first. If you did, the compiler would say "you're using a type T here and I don't know of a type T". Placing it after the impl is what makes it a generic impl block; it's saying this impl block is generic over any type T. Similarly, this is saying this impl block is generic over any lifetime 'a.

"The Rust type system has two bottom types." Yep, subtyping is actually the language used for lifetimes in the Rustonomicon. Yeah, that makes a lot of sense. I'm generally not going to be answering questions about other things, because I want to keep this stream short.

Great. All right, so let's see whether this works now. If we run cargo check now, you see that the errors we got from new have gone away. Oh, that should not be that; it's an empty string. Let me get rid of these warnings, because they're not that useful at the moment. So the compiler is not giving us any errors now. The compiler is totally okay with me having a StrSplit that contains a &'a str, and me just assigning the empty string to it. Why is this okay? Think of it this way: self.remainder has type &'a str, and this has type &'static str. So why is it okay for me to take one of these and assign it to something here?

Well, this gets back to the 'static lifetime. The 'static lifetime is the lifetime that extends until the end of the program; think of it as basically never ending. And this is where the subtyping relationship comes in. If you have a reference of any lifetime, or a thing that contains any lifetime, you can assign to it anything of the same type but with a longer lifetime. The reason for this is sort of straightforward, right?
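The state of the code at this point can be sketched roughly as follows. This is a reconstruction from the spoken description, not a copy of what was on screen, so treat the exact layout as an assumption. The trailing-delimiter bug the stream flags with a TODO is deliberately left in. `first_word` is a hypothetical helper added here only to show that a returned slice may outlive the `StrSplit` that produced it.

```rust
#[derive(Debug)]
pub struct StrSplit<'a> {
    remainder: &'a str,
    delimiter: &'a str,
}

// `impl<'a>` declares the lifetime, making this block generic over any 'a.
impl<'a> StrSplit<'a> {
    pub fn new(haystack: &'a str, delimiter: &'a str) -> Self {
        Self { remainder: haystack, delimiter }
    }
}

impl<'a> Iterator for StrSplit<'a> {
    // Items borrow from the haystack, not from the StrSplit itself.
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(next_delim) = self.remainder.find(self.delimiter) {
            let until_delimiter = &self.remainder[..next_delim];
            self.remainder = &self.remainder[(next_delim + self.delimiter.len())..];
            Some(until_delimiter)
        } else if self.remainder.is_empty() {
            // TODO: bug (a trailing delimiter should yield one last empty item)
            None
        } else {
            let rest = self.remainder;
            // "" is a &'static str; assigning it to a &'a str field is fine
            // because 'static outlives every 'a.
            self.remainder = "";
            Some(rest)
        }
    }
}

// Hypothetical helper (not from the stream): the temporary StrSplit is
// dropped before returning, yet the slice stays valid because it is tied
// to `text`. The " " literal is 'static and shortens to the same lifetime.
pub fn first_word(text: &str) -> &str {
    StrSplit::new(text, " ").next().unwrap_or("")
}
```

Collecting `StrSplit::new("a b c d e", " ")` into a `Vec` should give the five letters in order, which is what the stream's test checks.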
If I need something that lives for at least 'a, then some other lifetime that's longer than 'a can trivially be reduced to that description. The other way is not true: if I require a pointer that is valid until the end of the program, I can't give it anything that has a shorter lifetime, because it wouldn't meet those criteria. But going the other way is fine.

All right, so let's try our test case. StrSplit does not implement Debug. Okay, so we're going to derive Debug up here. Oh, I forget what the trick here is; it's like letters dot this, I guess. Great. Okay, so our test passes. So the question now is: are we done? Now that we have sort of a complete program and the description of 'static, let's see whether that roughly made sense.

"So everything by default has a 'static lifetime?" You can sort of think of it that way, although it's not really true. Any value has a lifetime of however long that value lives. If you have a value that you assigned to a variable, say, the lifetime of that value is until that value is moved. If the value is never moved, then it has a static lifetime, but the value itself, if you store something on the stack of a given function, the lifetime of that value, unless you move it somewhere else, is going to be the lifetime of that function, basically the stack frame for that function. When the function returns, that lifetime ends, and it has to, because if you gave out a reference to something that's on the stack, then that reference can't be allowed to continue living after the function returns. That wouldn't be okay.

"Can I think about StrSplit like a fold?" No, it's not a fold, it's a split. It takes a string and it splits it into multiple smaller strings separated by some delimiter.

Yeah, so one reason why this empty string over here is 'static is because any constant string, any string that you write
directly in double quotes, is compiled into your binary. It's stored in the program that's stored on disk, and when your program is launched, the operating system is going to load that binary into memory. Anything that's a value written into the binary is in sort of read-only memory that will never move. So if you take a pointer to it, which is effectively what this does behind the scenes, it takes a pointer into the text segment of your binary... well, it takes a pointer into a particular segment of your program, then that reference naturally lives for the rest of your program. That pointer is always going to be valid, because that part of your program's memory never changes.

Yeah, so eq is the thing on iterators; it's very handy. You're right that we don't need it. Another way to do this would be to collect this into a Vec and then assert_eq of letters and this. We can do that instead, that's fine too. This is more to show that it's neat.

"Don't variables die at the end of scope, not just return?" Yes. Lifetimes are for as long as a value still lives. So it's not like values default to being static; they default to living for as long as they do. There's not really a default. Any value only lives for as long as it lives for, like until it's moved or dropped, basically until it goes out of scope. But it can be shorter too, right? If you call some other function with that argument and it gets moved, then the lifetime for that value ends, and you can't use it even later in the same scope.

Yeah, so one reason to prefer assert_eq is you get nicer errors.

Okay, so I mentioned a bug. Let's just deal with that before we go on to the next thing. The bug is this: if I run this, it's going to fail. Specifically, here we have a delimiter that tails the string. In this case the iterator should produce the last element as an empty element, because the delimiter was there and so
technically it should produce an element there. So we need to distinguish between whether the remainder is empty or whether the remainder is an empty element we haven't yielded yet. These are a little bit subtle. The way we're probably going to end up doing this is I'm going to make the remainder (oops, that's not at all what I meant) an Option of this. And so here we're going to do... hmm, how do we want to do this? In fact, there's a separate problem here, which is even more subtle... or not even more subtle, this is tricky, which is that currently it's not even going to do the right thing for... actually no, that is the only case that gets it. Does the bug make sense?

"Why can't the compiler infer these lifetimes?" The compiler does infer these lifetimes. The compiler infers the lifetime for every value here. What we're doing is writing code that is generic over lifetimes, and so the compiler can't infer that the lifetime we return here is tied to the lifetime of the remainder, which is tied to the lifetime here. It would have to do some pretty sophisticated code analysis to figure that out. So we're adding these lifetime annotations to tell it how long we need different pointers to live for.

"Lifetimes over all allocated memory?" If you have a heap allocation, then that still has a lifetime. The heap allocation lives until it is dropped, so it still has a lifetime. If it's never dropped, then it would be static, but in general the only way you can get something on the heap and then never drop it is with something like Box::leak, and Box::leak does return a 'static reference.

"If you dump the binary, could you spot this allocation?" Yes, you could. Not for the empty string, because it gets optimized out, but in general yes. In fact there's a program on Unix called strings that prints all the strings in a binary.

Great, all right, so let's fix this
bug. I think what we want to do here is: else if let Some(remainder) = self.remainder.take(), then we're going to return Some(remainder). Actually we don't even have to do that, we can just do self.remainder.take(). I'll write this out first; it might be easier to follow. Right, if let Some(ref remainder) = self.remainder. Here I could use the new smart matching patterns; I don't really like that feature, but I could. So if there is some remainder, then we're going to search through the remainder. I guess this has to be a ref mut. People are going to have all sorts of questions about this and I'll get to that in a second. And then this is going to be None. Sorry, this code is currently ugly, let me get to that.

Yeah, so if there is some remainder that's still to be searched, then we're going to look for the delimiter in that remainder. If we find the delimiter in that remainder, that's inside this second, sort of nested, if let Some, then we do what we did before: we extract the stuff until the next delimiter, and then we set the remainder to be everything past that delimiter, and then we return Some. Otherwise we're going to return just what the remainder was. This will trivially be Some, because it was Some up here, but we want to take it so that we leave None in its place. And then I guess this just becomes... and so now you might wonder, well, what's this business going on down here, why do we have to do any of this? So actually, this might be fine. All right.

Yeah, so a question here... I'm aware of str's split_at, thanks. So one question here is, what is the ref keyword? If I did this, that moves out of self.remainder. This is assuming that I own this and I get to move the value, but that's not really what I want to do here. I want to get a mutable reference to the value inside of self.remainder if it is Some, and that's what ref mut does here. The type of this here, right,
is an Option<&'a str>, and I want the type of remainder to be a mutable reference to the &'a str. Oh, I did not want that wrapped. That's what I want remainder to be, and that's what this ref mut does. If I did not have the ref mut here, then what I would get back is this, which wouldn't help me, because I need to reassign that value to move it to be beyond the next delimiter. So I don't want to take that value, I want to modify the existing one. Yeah, so ref means that I'm matching into a reference: I want a reference to the thing I'm matching rather than the thing I'm matching itself. Similarly, ref mut means I want to get a mutable reference to the thing I'm matching rather than get the thing I'm matching itself.

"What was the ampersand star?" We're going to ignore it because you don't need it. It was an attempt at a reborrow that wasn't needed.

"Why can't you write..." okay, so this is a good question. Why can't I write this, I think is the question. So this sort of does the opposite. This is saying: take what the right-hand side is and try to match it against this pattern. The mutable reference here is a part of the pattern. It's saying this would only match something that was an Option<&mut T>, and then remainder would be the T. Let me try to line these up so you can see it more clearly. Right, so remainder would be assigned to T, because it would automatically do the dereference. If I write ref mut and you give me something like this, then remainder is going to be a mutable reference to that T. So this is the way in which they differ; they're sort of inverses of each other.

If let Some mut remainder equals... yeah, so with the new auto-magic things I could also do this; that would also work. I don't like writing the code this way because it looks weird to me, but it does mean you get rid of the ref mut. But there's more magic going on here, so I like
writing it this way. Yeah, so you can think of ref as "make a new reference" or "take a reference to", and similarly for ref mut.

"What's the deref on the left side of that assignment doing?" Ah, so remainder here is of type this, right, but the right-hand side is of type this, and I can't assign something like this to something like this. That won't work because they're not the same type. So I need to dereference: I want to assign this into where remainder is pointing, hence the dereference there.

"next_delim + self.delimiter.len()?" Yeah, so this might end up being one past the end of the string; it might point to just beyond the end of the string, and that is a valid position to cut a string or a slice at. It basically gives you the empty slice.

"What is the take call doing?" So take is a really handy method on options. Take is a function implemented on Option<T>... I guess, fine, let's make it proper. Take takes a mutable reference to the option and gives you back an Option<T>. The idea behind take is: if the option is None, then it returns None; if the option is Some, then it sets the option to None and then returns the Some that was in there. That's what we want here, right? We only want to return the remainder that doesn't have a delimiter once, and that's what this would do, because the moment you take it, what's left is None. So on a subsequent call to next, this would no longer match and you would get to the None branch instead. And in fact, so just to check, this should work now.

We can simplify this code even more, which is: the question mark operator, the try operator, also works on options. So we can do, and people are going to hate me for this, we can do this. Okay, so this is something most Rust programmers would rarely actually write. Remember that every let statement is a pattern match, and so here what we're saying is I want to pattern match on what was inside the Some of self
remainder to take a reference to what is in there. This is weird Rust that you won't see very often. You could also write this as this, and it would do the same thing. They are sort of inverses of each other.

"If self is mutable here, why is self.remainder not mutable by default?" So you need to keep in mind that mutable references are only one level deep. If you have a &mut self, what that means is you're allowed to modify any of the fields of self. So I'm allowed to modify remainder, I'm allowed to modify delimiter. But what delimiter is, is an immutable pointer to some string. So while I can change delimiter itself to make it point somewhere else, I can't change the thing that delimiter is pointing to. For that, delimiter itself would have to be a mutable reference.

Question mark on Option is available and stable as well, or should be at least. Okay, great, so this now works. Let's just double check that it does. I guess it does not work, because... probably this would be my guess. Oh, huh. So this actually doesn't move, so we can't do that. So this has to be an as_mut. Okay, so this is kind of subtle, so let's go over this as well. This is not something I planned to cover, but we might as well while we're here.

What this does is, if self.remainder is None, then it returns None; otherwise it returns the value inside the Some. Normally that would move the T that's inside the Some, but because the thing inside is Copy, we get copy semantics instead of move semantics, so it copies this reference out of the Option. This means that remainder is no longer the same remainder as the one that's in here. It's not a mutable reference to this; it is just a separate pointer. This means that when we modify it down here, what we're actually modifying is just our copy of that pointer. It's not modifying the pointer that's stored inside self. And so we can do as_mut here. as_mut is a function on Option that takes a mutable reference to self
and returns an Option that contains a mutable reference to what's inside. So now if this is None, then we return None; if this is Some, what we get back is a mutable reference to the thing that's inside the Option. And so now remainder will be a mutable reference inside of StrSplit. Great. All right, so now we have a working implementation, it doesn't hang, that's all fine.

Now, you might wonder: well, I came to the stream to learn about multiple lifetimes. So that's what we're going to look at next. Imagine that you want to write the following: you want to write this function that is a split by character, or actually, let's do even better, let's do until_char. It's going to take a string s and it's going to take a character, and it's going to give you the string until the first occurrence of that character. So if we wanted to write a test for it, we'd write something like until_char_test: I'm expecting that if I do "hello world" and I give it an 'o', then this should return "hell". It's a fun coincidence; I did not plan that.

Okay, so that's our plan. And here we'll do a '_ to tell the compiler to just infer this. In fact we might not even need to, but with the rust_2018_idioms lint it's going to complain at us. And naively, now that we have StrSplit, this should be pretty straightforward, right? We should be able to just do StrSplit::new, give it the s, just format the c to be a string, and do next, and do an unwrap, because we know that there will be a zeroth element. So here let's make this an expect and say "StrSplit always gives at least one result". So we'd sort of hope that we were able to do this. If we run cargo check here, it tells us, okay, it expected a &str and it found a String, so we're going to just take a reference to this string. And then it says: cannot return value referencing temporary value; returns a value referencing data owned by the current function. Okay, so let's dig into what this is actually saying.
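Before digging into that error, here is a minimal sketch of the single-lifetime splitter with the Option-based fix described above. It is a reconstruction from the spoken description, not the code shown on stream: the names StrSplit, remainder, delimiter and next_delim are assumptions taken from how they are pronounced in the transcript.

```rust
// Reconstruction of the single-lifetime version; names are assumed from the transcript.
#[derive(Debug)]
pub struct StrSplit<'a> {
    // `None` means the final piece has already been yielded.
    remainder: Option<&'a str>,
    delimiter: &'a str,
}

impl<'a> StrSplit<'a> {
    pub fn new(haystack: &'a str, delimiter: &'a str) -> Self {
        Self {
            remainder: Some(haystack),
            delimiter,
        }
    }
}

impl<'a> Iterator for StrSplit<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<Self::Item> {
        // `as_mut()` turns `&mut Option<&str>` into `Option<&mut &str>`;
        // `?` returns `None` once the remainder has been taken.
        // Without `as_mut()`, the `&str` would be copied out and the
        // assignment below would only update a local copy.
        let remainder = self.remainder.as_mut()?;
        if let Some(next_delim) = remainder.find(self.delimiter) {
            let until_delimiter = &remainder[..next_delim];
            // Write through the mutable reference so `self.remainder` advances.
            *remainder = &remainder[next_delim + self.delimiter.len()..];
            Some(until_delimiter)
        } else {
            // No delimiter left: yield the rest exactly once, leaving `None` behind.
            self.remainder.take()
        }
    }
}
```

Collecting StrSplit::new("a b c d ", " ") with this sketch yields "a", "b", "c", "d" and then a final empty string, which is the trailing-delimiter case the bug fix was about.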
It's saying you're creating a temporary value here and you're trying to return a value that references that temporary value. Basically what it's saying is the &str that we're returning is tied to the lifetime of this String. But that's stupid, right? Because we know that StrSplit only ever returns substrings of this string, the first argument, the haystack. It never returns references into the second string. The lifetime of the second string doesn't matter for the purposes of what StrSplit returns.

But if we look at our definition, we can sort of understand where Rust is coming from here. We've said there's only one lifetime: both of these have that lifetime, and the thing that we return from the iterator has that same lifetime. So when you create a new StrSplit, it's saying that these two things have the same lifetime. And so when we down here pass two elements that have different lifetimes (one has a lifetime that's only the scope of this function, whereas s has whatever the lifetime of this is), then Rust goes: okay, these two have different lifetimes, so in order to make them the same, I'm going to take the longer lifetime and turn it into the shorter lifetime. So the 'a for this StrSplit is going to be the lifetime of this scope. And so when we try to return a reference to that here, that reference has a lifetime tied to the scope of this function. But what we've said in the function definition, if we sort of fill out the elided lifetimes here, is we've really said that this is the contract we want. But the lifetime that this returns is tied to the scope of this; it's not 'a, or let's call it 's, because that's what the argument is called.

So how can we tell Rust that this is okay? Well, what we need to do is we need to have two lifetimes here. Okay, let me see if everyone understands the problem first. Let's see: "Should we copy the delimiter into our struct?" Okay, so one option is we don't
have multiple lifetimes; we just store the delimiter as a String. So this gets us... in fact, let's explore that option first. I sort of don't want to, but let's talk about this without necessarily exploring it fully. So imagine the delimiter was a String instead of a &str. You'll notice that the String does not have a lifetime associated with it, and this gets back to the differences between str and String.

So a str is similar to, but not quite the same as, [char]. It does not have a size, just like a slice that's not behind a reference does not have a size. It's just a sequence of characters. It doesn't know how long that sequence is; it just knows that it is a sequence of characters. Usually you will see str in the context of a reference to a str, just like you would normally see a reference to [char] here. There's all sorts of things we could talk about here, but the basic idea is that this reference is a fat pointer, not a shallow pointer, or not a narrow pointer, I guess. The fat pointer stores both the pointer to the start of the string, or in this case the start of the slice, and the length of the string or the length of the slice. So this is just a thing that remembers both where the string starts and how long it is, just like a reference to a slice is the same thing.

String is a little bit different. String is more equivalent to a Vec of characters. There are two ways in which this differs. First of all, a String is heap-allocated. This reference can point anywhere: it could point to something on the stack, something on the heap, something that's in static memory; it's just a pointer to a sequence of characters. A String, though, has the property that it is heap-allocated and it is dynamically expandable and contractible. It's a heap-allocated thing; just like a vector, it can shrink and grow. Now, if you have a String, you can
get a reference to a str, right? If I have a String, then I can go to a reference to a str, because the String obviously knows where the string starts, and the in-memory representation of it is a sequence of characters, and it knows how long that sequence of characters is. So going from a String to a &str is trivial, and in fact this is why String implements AsRef<str>: if you have a reference to a String, you can trivially get a reference to a str. Going the other way is harder. If you have this and you want to go to a String, you don't know where this reference is pointing, so the only way you can construct a String is by doing a heap allocation and then copying all the characters over, and now you have a String. So this is cheap and uses AsRef (that's not what I wanted), and this is expensive: it basically uses Clone. It's not quite Clone, but it has to do a memcpy.

I guess so, it's true that we could store the delimiter as a String, but this has two downsides. The first of those is that now we require an allocation: in order to create a StrSplit, you have to allocate. This is not great for performance, but it also ties into the second problem, which is now you need to have an allocator. This means that once we start using a String, this library can no longer be compatible with embedded devices, for example, which may just not have an allocator. You don't have a heap. And so really we'd like to keep this a &str if we can.

Let's see, questions about this. "Can you get that character from until_char and transform it back to a &str?" Yeah, so that's basically what this reference in front of the format is doing. format! produces a String, and then the ampersand takes a reference to that. It's just that the lifetime of the reference we get back is tied to the lifetime of the String. This might be more visible if we move this out. So if I say delimiter is this, this might be more obvious. This String is going to be deallocated,
it's going to go out of scope here. So when we take a reference to it, the lifetime of that reference is going to be this scope, and the lifetime of s is 's. When the compiler is told these have to have the same lifetime, it's going to use the shorter one. And we can't make it longer, right, because this is going to be deallocated; the memory is going to be gone. So this reference is just not okay the way we've written this above.

All right, so how do we fix this? Well, the solution here is to have multiple lifetimes. And I will say before we start this: usually you do not need multiple lifetimes. There are only some cases where you do. This is one of them, and it took me a while to figure out a case that needed multiple lifetimes. It is quite rare. The time it comes up is when you need to store multiple references and it is important that they are not the same, because you want to return one without tying it to the other.

So let's name these lifetimes 'haystack and 'delimiter. I told you I was going to name them. Now this impl block is going to be generic over 'haystack and 'delimiter, and the haystack is going to be 'haystack, the delimiter is going to be 'delimiter. Notice that these now have different lifetimes, so the compiler no longer has to force these to be the same by downgrading the lifetime down here. And now down here we do the same thing. Oops. Right, and now we have access to another lifetime here: we can say that the reference we give back is tied to only the 'haystack lifetime. It is not tied to the lifetime of the delimiter. And notice here that the compiler is totally happy with this, because the code we wrote indeed follows that contract: any reference that we return from in here is a reference into the haystack. If I change this, and let's say I just made a mistake and somehow returned self.delimiter, now the compiler is going to complain. So there was a question earlier about whether you can use the wrong
lifetime. So here the compiler is going to be very mad at us, saying: cannot infer an appropriate lifetime due to conflicting requirements. First, the lifetime cannot outlive the lifetime 'delimiter as defined, so that the reference does not outlive the borrowed content. So this is saying the thing you returned, self.delimiter, has a lifetime of 'delimiter. But the lifetime must be valid for the lifetime 'haystack as defined on the impl up here, so that the types are compatible. This error is a little bad. This should ideally be pointing at Item; what it should be pointing at here is the Self::Item up here. We promised in our code that the Item would have a lifetime of 'haystack, and so that's what the compiler is pointing out. It's saying you said that the lifetime should be valid for the lifetime 'haystack, that's the contract, the guarantee you gave over here, but the thing you returned, self.delimiter, has a lifetime of 'delimiter, and these two are not the same. In fact, we haven't even given a relationship between the two.

So one stupid way, if I actually wanted to write this code, is I could say where 'delimiter: 'haystack, that 'delimiter is greater than 'haystack. This is something I'm allowed to write. Now the compiler is going to go: okay, you returned something with a lifetime of 'delimiter, you promised you were going to return something with a lifetime of 'haystack; normally that would not be okay, but here I have a clause saying 'delimiter is longer than 'haystack, or, sort of phrased differently, 'delimiter implements 'haystack. This is the subtyping relationship. If 'delimiter lives for at least as long as 'haystack, then if I have a reference with lifetime 'delimiter, it can also be downgraded to the lifetime of 'haystack, whereas the reverse is not true. Of course, this is not the code I want to write, and I don't want that bound there, so I'm going to turn it back to what it was, and now the compiler is going to be happy. If I now run
cargo test, now it passes, and the until_char function actually works. The reason, of course, is that even though this String gets deallocated really quickly, that doesn't matter to us. That's totally fine, because the items that are yielded by StrSplit as an iterator have a lifetime that's only tied to the first argument that was given to new.

Let's see: "Can you put underscore for the delimiter lifetime to say it's not needed?" Yes, you can. So here I can do this: this block does not care what this lifetime is, we don't need to be able to name it, and so we can use the anonymous lifetime to say any lifetime here will do; it's going to be unique from all the other lifetimes. It's a good catch. And same thing down here, actually, for until_char: here the compiler can just infer that this must be tied to this lifetime, because there are no other lifetimes to attach to.

Now let's say that someone writes... actually, that's not that important. Okay, does what we did now make sense? Think so. Yeah, so this does do a heap allocation, and that's the next thing we're going to look at.

"Introduce multiple threads?" I don't see in what way multiple threads are relevant here. Each thread would have its own StrSplit if you ever were to make one.

You don't technically need this, I guess, but if you turn on rust_2018_idioms, one of the things that they introduced... oh, actually, I guess, yeah, one of the things they introduced was that if you return a lifetime, they really should require it here too. Fine, fine, you're right, it can be left out here. I like to give it just to indicate that it is auto-inferred, but it's not required.

All right, so I want to cover one more thing in the last 15 minutes or so, and that is: we do have an allocation here, which is kind of sad. Can we get rid of that? And this is actually going to end up getting rid of the lifetime for delimiter. Which is: what if, instead of the delimiter being a string, we want it to be able to
be anything that can find itself in a string? A string is one such example, but it doesn't have to be. So let's say that this is going to be D for delimiter. So now this is generic over D, and notice that D is not a lifetime, D is just a type, and the delimiter is going to be D. And then here, what are the requirements on D? Well, the only thing we really need is the ability to figure out, basically, the bounds of where that delimiter next appears in a string. So we're going to introduce a trait, pub trait, and let's call it just Delimiter, why not. And the thing that a Delimiter has to be able to do, at least for the time being, is it has to be able to give us its length so that we can skip past it. Actually, let's do skip: so it's given a reference to a string and it's going to return a reference to a string. We can even do better. We can say that it's going to be find_next, given a self and a string, and what it needs to return is an Option with two numbers, where it starts and where it ends. That's all we get.

So now let's say here that D has to implement Delimiter. So we want to implement Iterator for StrSplit for any D where D implements Delimiter, and now we just need to write this in terms of find_next. So this is going to be delim_start and delim_end, and then this is going to be self.delimiter.find_next of the remainder. And now we know that "until the delimiter" is delim_start and "after the delimiter" is delim_end. Okay, that wasn't too bad, right? We just sort of flipped it around.

And now what we can do is we can implement Delimiter for a reference to a str. s is a &str, and it needs to return an Option of (usize, usize). This, for strings, is pretty straightforward, right? Finding a string in a string, we already did this, and it's essentially just s.find(self). And then I guess we're going to map that to start, because we also need to return the end; that's part of the trait contract. So it's going to start at start, and it's
going to end at start plus self.len(). So let's see whether this still works. I'm going to cheat a little; ignore that. Okay, so that still works. It's now generic over whatever the type D is. Notice that there's no longer a 'delimiter lifetime, and yet we were allowed to give a reference to a str. The reason is because here we're saying we're generic over any D, and that D could be a reference; it could live for whatever time it wants. There are no requirements on it other than that it implements Delimiter.

And now where this gets really neat is we can implement Delimiter for other things. We can implement Delimiter for char. And now we want to find... well, we can sort of cheat here, but what I'm going to do is s.char_indices() position where c is self, start plus one. Let's see if that actually... this is not to be position; I guess this can just be... actually no, it can't. Find. All right, so what this is doing is it's iterating over all the characters of the string, looking for one that is the character we're searching for. Then, whatever result it finds, if it finds one, we're going to map that Some to take the position and return that position and that position plus one, because the character is only one character long. And now let's see if this works. So instead of now allocating the string, can we just pass the c here? We can. Okay, so now the allocation is gone, and now we can implement this pattern for all sorts of other types. Anything that can find itself in a string will now just work. And so now we have a generic StrSplit implementation that works for anything that can find itself in a string.

All right, questions. "Why not use Pattern?" I'll get to Pattern. "self here in the implementation down here?" The self here is a reference to this type, so it's a reference to a reference to a str. "This does need to be len_utf8?" I think you're right... it's not even a thing. No, I think that's what len will do. I think that's right. Is there a simpler way than character
indices? There is, but this shows the concept of it. You can do way more efficient things than this. "What does find(self) return?" So find is a method on strings: you can give it a string and it will tell you the start of that string in that string. "That plus one is wrong and will panic your code." Is that true? I think you slice by... oh, it's by byte indices. Yeah, so this is going to have to be... is this a thing? It's going to be bright for those of you in dark rooms. So char, does char have like a UTF-8 length to it? Yeah, okay, so this is going to be len_utf8, so that's the one you were referring to. So that should do it. Great.

All right, so some questions here. "Can you explain find and map?" Yeah, so find gives you an Option of the position where the thing is found, and the map is: if it's None, then just return None; if it's Some, then I want to change the contents of the Some to be this, because remember, the trait requires that we give both the start and the end. "Why self.len() and not s.len()?" Because self is the thing we're searching for; it's the length of the delimiter. In order to find the end of the delimiter, it has to be the start of the delimiter plus the length of the delimiter, not the length of the string we're searching in. The s here is what we're searching in.

Okay, great. So now some of you have been observing: why are you doing this, find just works, and why not use Pattern? My plan was to keep this a secret, and now I successfully have. All of the things we implemented today exist in the standard library. Trust me, I am aware, even though I don't bring it up. If you look at str, you will find the find function. Better yet, you will find the split function. So split on a string takes a reference to self and some pattern P, and it returns a Split. If we look at Split, Split implements Iterator, and it gives you the things split by that. You'll notice that Split has a lifetime of 'a, and the 'a is the lifetime of the
string you're searching in. But then it also takes a pattern, a delimiter, that implements this trait Pattern. Pattern is a trait that's a little more convoluted than our Delimiter trait, but it's basically the same thing: it gives you a way to look for something in a string. So really what we've done today is go the whole route through how you get to what's in the standard library today for how you split a string, but going through it in such a way that we also go through where multiple lifetimes are useful and how to turn these kinds of things into traits and generics. You could find, actually, that for all the tests we wrote today we could just use the standard library instead. We could just do haystack.split on a space and it would work the same way. Similarly, we could do haystack.split on characters, because characters also implement Pattern. So for all the things we did today, there's no reason to publish this as a crate, because it's already in the standard library. But hopefully it was a useful exercise in understanding how these different pieces fit together: different types of lifetimes, when you might need multiple lifetimes, how to read some of these lifetime errors, and also things like the differences between Strings and strs and references.

Okay, so I think that's getting us close to the end of the 90 minutes, but let's take some questions now that I've revealed the big secret. Let's see here. "Why can't you create a String from a str fat pointer?" Because you don't own the memory. A String assumes that it owns the underlying memory. It assumes that when it's dropped it has to free that memory, and it assumes that it can grow or shrink that memory as necessary, which would not be true if you took some arbitrary pointer and length and just decided "I own this now". That's not generally true.

It's true, we should have a Unicode character test. Yeah, so usually, obviously, the top-level thing is being reimplemented, but it's usually not
obvious the deeper things are also done for the same reason. Yeah, so it's a good summary of what I was trying to do here. And some people jumped the gun in chat, but that's okay; it's good that you observed that this exists in the standard library.

"Don't you think that Rust is kind of less readable than other languages?" No, I don't think so. I think if you wrote Rust using only the features that exist in those other languages, it's equally readable. But Rust has additional features that require additional syntax, and it's true that when you use those additional features your code becomes harder to read, but it also adds additional features, so you couldn't even do the same things in those other languages.

"The pattern and the haystack seem to be sharing the same lifetime 'a." Yes. So Pattern in the standard library is a little interesting. There's a reason why it's nightly-only, and that is basically because they haven't quite figured out what the design for it should be. The 'a here is the lifetime of the string that the pattern is searching in. That gets communicated sort of all the way down the stack. You'll see there's a second trait called Searcher, which then lets you do basically the same thing we did today. So you see there's a next_match and a next_reject, and all these things get to operate on the 'a reference to the haystack. So you'll see actually this is very similar; it's just more convoluted in order to basically write more efficient implementations.

"When you see something like Type<'x>, how do you know what the 'x is the lifetime of?" You don't, just like if you see Type<T> you don't know what that type T is.

"What do you think of Rust having a future in the industry?" I gave a whole talk on this. If you look it up on my YouTube channel, it's called "Considering Rust", and the last 10 minutes or so of it is basically looking at the future of Rust in industry.

"Could you publish this as a gist so we can play with it?" Absolutely. I
will post this somewhere. Actually, I can do that; I'll do that after the stream and post it in the description for the video as well.

"Can you work with stdin as an input instead of a str?" That's harder, because stdin is a stream; it's not just a constant, so it's not something you can seek in, for example. So that might be trickier, but you could try. Great.

"How do you think generic associated types will improve trait definitions?" I think it will help a lot. So GATs, generic associated types, will probably not necessarily help with trait definitions; for that you need existential types more so, and a couple of other things. It will help a lot with being able to clone less; that's one of the big things it will help with.

"Does it matter that the second usize in find_next is the end index and not the length?" It could be either, to be honest. It might actually be better for it to be the length, and that's something we could modify.

"Do you intend to do some lectures for newcomers to Rust?" I'm not planning to do any complete beginner streams. I think that would be a good addition, but it's not something I plan to do. I might do more of these sort of relatively focused videos, though, at least if it seems like there's some appetite for them and people enjoyed this style. And so I might do some more similar types of things, but they will still be more geared towards intermediate than beginner.

All right, I think that's about time for now. If you want to hear about other upcoming videos of this style, or of the other videos that I do, check out some of the past recordings on my YouTube channel. And also, I am on Twitter as jonhoo, and there I will post any and all notifications that are relevant. Sweet, thank you all for joining me. I hope you learned something, I hope it was possible to follow, and I hope that having it be 90 minutes instead of six hours made it more digestible. Thanks, everyone. Stay safe, stay home, and I will see you, I guess, next time there's a video. Bye!
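The char-delimiter fix discussed in the Q&A above (the end index is the start plus the delimiter's UTF-8 byte length, not start plus one) can be sketched roughly as follows. The names Delimiter and find_next follow the transcript, but the exact signature shown here is an assumption, and split_first is a hypothetical helper added only for illustration; this is not necessarily the code written on stream.

```rust
/// Assumed shape of the stream's trait: report the (start, end) byte
/// indices of the next delimiter occurrence in `s`, if there is one.
pub trait Delimiter {
    fn find_next(&self, s: &str) -> Option<(usize, usize)>;
}

impl Delimiter for &str {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        // `self.len()` is the delimiter's byte length, not the haystack's.
        s.find(*self).map(|start| (start, start + self.len()))
    }
}

impl Delimiter for char {
    fn find_next(&self, s: &str) -> Option<(usize, usize)> {
        // `str` is indexed by bytes, so a multi-byte char needs `len_utf8()`;
        // `start + 1` could land inside the char and panic when slicing.
        s.find(*self).map(|start| (start, start + self.len_utf8()))
    }
}

/// Hypothetical helper: split `s` around the first delimiter occurrence.
/// Both halves borrow from the haystack, not from the delimiter.
pub fn split_first<'s, D: Delimiter>(s: &'s str, delimiter: D) -> Option<(&'s str, &'s str)> {
    let (start, end) = delimiter.find_next(s)?;
    Some((&s[..start], &s[end..]))
}
```

With these definitions, split_first("aéb", 'é') slices at byte offsets 1 and 3 and yields "a" and "b", the same pieces the standard library's "aéb".split('é') produces.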
Info
Channel: Jon Gjengset
Views: 72,226
Rating: 4.9881406 out of 5
Keywords: rust, lifetimes, strings, generics
Id: rAl-9HwD858
Length: 93min 23sec (5603 seconds)
Published: Wed Apr 22 2020