Dynamic vs Static Dispatch in Rust

Captions
All right, welcome everybody to another stream. It's been a while, but I'm back and excited to be here with all of you for another day of streaming some Rust. The plan for today is going to be an interesting one, I hope. It is going to be an exploration of trait objects in Rust, of dynamic dispatch, and I guess by contrast also of static dispatch. We're going to understand what these concepts are, when you use one versus the other, why we have to care about this in Rust, and what's actually happening underneath the hood. Hopefully we'll also get a chance to go into a debugger and take a look at what's actually happening at the assembly level, and hopefully you get a better understanding of how Rust handles method calls, or function calls in general, because that's something that you actually have control over in Rust, and in many languages you don't have that control.

Oh, and it looks like, yes, the stream is improperly titled. For those that are watching this live, sorry about that, I forgot to change the name of the stream. That's a bit embarrassing. But if you're watching this on YouTube, then hopefully I have correctly titled the stream on YouTube. So hello, future, this video is hopefully correctly titled. But yes, we are focusing on dynamic dispatch and trait objects, and not on implementing a custom Vec type, which I did last time.

All right, so let's switch on over to the screen, to my code. We're starting with just an empty project here. I just ran cargo new and it's got hello world in it and stuff like that, and in fact we're going to real quick change this over to be a lib instead, so it's not going to be a binary, it's going to be a library. I was thinking about good examples for talking about dynamic dispatch, and it's kind of tough to just come up with an example that's relevant for dynamic dispatch and is interesting on
its own. But I've been messing around with some other code that has to do with spell checking for another project that I'm working on, and I thought that we could use a spell checker API as kind of a motivation for what we're working on today. Basically what I want to do is have a function... looks like we're in a good state now, okay. So let's get started.

What are we doing today? We're going to do a spell checker, and really what I want out of this is to have a library that has something like a public function, spell_check. It takes in an input, let's say, of type &str, a string slice, and it's going to return back a String. So this whole thing will basically spell check the input string and return a new string where things have been auto-corrected, let's say. As a second argument after input I want to take another thing, which is the spell checker.

What type is the spell checker going to be? Well, the first thing that we're going to try out is the generic Rust way of just saying some generic type, let's call it C for checker, and at the beginning of the function we say we have some generic type C. So now what does this function say? It says it has two arguments: an input of a string slice, and a spell checker of some type C. And what do we know about type C? We know absolutely nothing, so we can't really do anything with it. It can be literally any type; we're just happening to call it C here. That's not super useful.

The code that I eventually want to be able to write is basically something like: for changes in spellchecker.check, where we pass in the input. So the spell checker will have a method on
it called check, and we're going to pass an input string into it, and it's going to return a collection of changes, of change sets, on that input. And then for each, not changes, for change, for each change that the spell checker returns to us, we'll do something like update_text, where we pass in the result, which is just going to be a copy of our input, so we'll pass a mutable handle to our result and the change that we want to apply. Let's call this, instead of update_text, apply_change or something like that. So for each change that our spell checker returns to us we apply the change, and then at the end we return our result.

So naturally, two things. First, we have to have an apply_change function. We can write some docs for it, like "takes a change and updates the string with that". It's going to take a change, and we don't have this type yet, but we'll have some type representing a change that the spell checker returns to us, and it's going to take in a mutable reference to a String that it can update. This stream is not really about implementing a spell checker. I don't know if this is a good idea for how you would implement a spell checker or not; I can think of several issues with this particular design, but that's not the point. So in here we're going to just say "TODO: implement", and for reasons that will become apparent soon, I'm not using the todo! macro that panics at runtime. I really want apply_change to do absolutely nothing, so we're just going to ignore these changes.

So what is Change then? Well, Change is just an enum that represents changes that a spell checker might return to you. For instance, a spell checker might want you to delete certain text, and here we'll have a std::ops::Range of usize, so the range of text that it wants you to delete: from index, you know,
a to index b, delete that text. Or maybe it wants you to replace some text, so again a range from a to b, where it wants you to replace it with some updated string, or something like that. So this is our spell checker; it can return changes.

Hopefully this makes sense. We take in an input string, we get an owned copy of it with to_owned, so now result is a String. We call the check method on our spell checker, and that's erroring for maybe obvious reasons, which we'll talk about in just a second. We get back a whole collection of changes, and for every change that we get we apply that change to the result string, and at the end we return that result to the user. So this is how we're modeling our spell checker, and this is, you know, world-class code here.

But of course we have an error from the compiler saying, I don't know what check is for type C. I know nothing about type C, and I sure as heck don't know if it has a check method on it. And that makes sense. Again, we know nothing about type C. It could be usize; usize doesn't have a check method on it. It could be String; String doesn't have a check method on it. Nothing has a check method on it as far as we're concerned. In Rust we have to give more information than that. We don't have dynamic typing here, where it'll figure it out at runtime, give us an error message if there is no check method, and do the right thing if there is. We have to convince the compiler ahead of time.

So how do we do things like that? Well, we use traits in this case. We can say that we're going to have a trait called Spellchecker. I'm always confused whether "spellchecker" should be spelled with a capital C or not; I'll go with lowercase c for now. So we have a trait called Spellchecker, and the
trait has one method on it called check. What check does is it takes &self, so it's a method, not a function, and it takes some input string and returns back a Vec of changes. All right, so this is our Spellchecker trait. We can now say that types can implement the Spellchecker trait if they implement a check method that takes &self and an input string and returns back a Vec of changes.

We're still getting an error up here, because type C still could be anything, but the magic happens when we constrain C to be any type as long as it implements the Spellchecker trait. Pretty straightforward there. And I think that this here is just saying, yeah, result needs to be marked as mutable, that's fine. And now everything is compiling, except it's complaining that none of this stuff is marked as public, but that's fine, let's just mark it all as public.

Any questions about this so far? Basically what we've done here is we've modeled a spell check function that can take an input string and any type C, as long as it implements the Spellchecker trait, and it allows us to plug in any spell checker that we want. This is a public function; anybody can provide us any spell checker that they want, as long as that type implements the Spellchecker trait. Looks good, sounds good. If you don't quite understand this, then the rest of the stream might be a little bit too advanced for you, because this is kind of the basis that we're building on. Feel free to stick around and ask questions, but this is the assumed knowledge you need to get the most out of this stream.

All right, so what is happening here? We have a generic function where we can plug in any type C into our spell check function, as long as it implements the Spellchecker
trait. So let's implement a few for illustration purposes. Let's impl Spellchecker for, let's call it NoOpSpellchecker, with check, and we're going to have some struct called NoOpSpellchecker. What the NoOpSpellchecker will do is just return back an empty vector, so it essentially always finds that there are no changes to make. No-op: it doesn't do anything, just returns an empty vector of changes.

And if we want to have a little bit of fun here, what was I thinking about doing earlier? I think it was a capital letter spell checker. What this one is going to do is check for words that are completely composed of capital letters and propose to delete them. Basically it doesn't like capital letters; it wants you to get rid of them. So what's the best way to do this? input.match_indices, and we'll just look for one single space as the word separator. It's not quite right; you probably would want to use a regular expression library or something here to make this more robust, but that's fine, we can just assume that all words are separated by a single space for now. This will return back an index and a word (whoa, word, not world), and we can check whether all the chars of the word are uppercase. There's an is_uppercase, but no, I think you have to do word.chars().all with is_uppercase. And instead of using map here I'm going to use filter_map, which allows us to map and filter at the same time. That can be handy here, because we can say: if it is an uppercase word, return back the change Delete, where the range is index to index plus the word's length, and otherwise we didn't find a capital letter word, so we return back None. The last thing that we have to do here is collect this up into a vector. Oh, and it looks like we're
missing... cool. So just to repeat what this does real quick: match_indices basically looks for one empty space, and when it finds that... well, actually this is not exactly correct. This is not the point of the exercise here, so let us go ahead and just change this name to something like AntiSpaceChecker. This just deletes all the spaces now. And this check makes no sense anymore, so we can just do this instead. So now we're just saying that we should delete all the spaces. Again, we're not actually implementing a spell checker here, this has nothing to do with what we're really doing, but I wanted to implement several spell checkers so that we can see what happens when we use different ones at the same time.

All right, so now we have a spell check function, we have two separate spell checkers, and our spell check function is capable of taking either one of them, because it is generic over any type C as long as it implements Spellchecker. We can prove that to ourselves by writing a little test: cfg(test), mod tests, and let's use super::*. I've forgotten how to type... it works. We can say let text equal "hello is it me you're looking for", and we want to spell check this with the text as our input, and the spell checker will first be the NoOpSpellchecker. Then we can assert that result equals text. So hopefully this works, and it does. Cool. So we have spell check here using the NoOpSpellchecker, and down here we can call spell check with the AntiSpaceChecker, and that also compiles fine. We won't check the result here; it's fine either way. The nice part about this is that it's generic, and it takes two different types, because it knows that they both implement Spellchecker, which is great.

All right. Now, this particular form of generic function relies,
particularly here when we call spellchecker.check, on something called static dispatch. What that means is that each time we call spell check with a different spell checker (in the case of our test we're calling it with two different spell checkers), the Rust compiler will go ahead and create a specialized spell check function for each one of those spell checkers. The technical term for this process is monomorphization. Basically it copy-pastes the implementation, and the copies differ based on the two types that we're calling it with.

All right, let's look and see if there are any questions here. "If I want to fetch and parse approximately 300 links, would you recommend it right now?" If you want to do 300 different HTTP requests at the same time, then you might want to just use an async implementation, something like surf, which allows for parallelization pretty easily. If you just want to use a standard blocking HTTP implementation, then something like crossbeam probably is the right way to go, or you might be able to use something like rayon as well. And there's another question about comparisons between capital-S String and &str: does that work? Yes, it does. There is an implementation for equality checks between owned Strings and string slices. "I don't copy and paste code from the internet, I monomorphize it." Exactly.

Cool. So this is kind of the typical way that you would write something like this with generic Rust. Now, when you're out there in the world you might see some different syntax. Instead of using this generic parameter, you might see something like this, and we'll have to give it a different name, spell check 2. This compiles fine, there are no error messages or anything like that, but this
is a different syntax and has different semantics than what we saw before, but in effect accomplishes something very similar. In fact, down here, if we want, we can call spell check 2 with our text as input and with NoOpSpellchecker, and with AntiSpaceChecker as well, and this compiles just fine. So there's nothing wrong with this. It works totally fine and in a way accomplishes the exact same thing as our original spell check function: it will have the same results, and the same things will be possible. There's just a little bit of a difference when it comes to what's actually happening at runtime. You'll notice that we have ampersands here, so we're taking references to these things instead of just taking them by value. This is a trait object, and it is using dynamic dispatch instead of static dispatch, and we'll talk about what that is in just a second.

The first thing I want to look at: I added this ampersand and this dyn. What happens if I don't have that? What happens if I try to do something like this without the dyn? Well, it works, the code compiles, but it's a warning, and most likely (I have a pull request open for this on the Rust compiler) this will end up being an error in the next edition of Rust. The reason is that having dyn here makes it very, very clear that we are using a trait object. The point is to be explicit in saying that this is a trait object, it's using dynamic dispatch. Spellchecker is a trait; it's not a struct, it's not a concrete type, and dyn makes that very apparent. You don't have to know that Spellchecker is a trait. Once you see dyn right here, you can be sure; you don't have to look at the implementation of Spellchecker at all. So that's all well and fine: the dyn is necessary just because Rust wants you to be explicit when you're using a trait object. What
happens if we go ahead and remove the ampersand here? Okay, cool, it gives you the error that I was hoping it would give you. It's saying that the size for values of type dyn Spellchecker + 'static cannot be known at compilation time. This is very important to remember. In Rust you'll often see these types of error messages about not knowing the size of something at compilation time. What is that all about?

The important part is that Rust compiles your code, and when it compiles your code it has to lay it out in memory. In order to be able to do that for something like passing arguments into a function, it has to know how much space that argument will take up on the stack. That's an important thing to know when you're creating a stack frame: how much space does your argument actually take up on the stack? Now, if you don't know what the stack is exactly, I encourage you to look into that. It's basically the space that a method or function call takes up in memory when it's being run, and arguments need to be allocated onto the stack. Some other languages, like Java, which was mentioned in the chat, get around this because almost everything in Java except for primitive types is heap allocated. That's not how Rust works. Almost everything in Rust is stack allocated unless you explicitly heap allocate.

And when we're passing a trait... we can't pass a trait. A trait is not a thing. A trait is just a contract saying, hey, this is a type that happens to implement this check method. When we're actually running, we have to pass a concrete type that implements that trait, and that can be anything. It just so happens that our two types here, NoOpSpellchecker and AntiSpaceChecker, take up no space. They're zero-sized types. They don't have anything associated with them, no data
associated with them; they have no fields, basically. But we can imagine adding some fields to our checkers. NoOpSpellchecker doesn't really need any fields, so we'll leave that as zero-sized. But down here, AntiSpaceChecker maybe wants to keep track of the number of spaces, and we'll do that as a usize, and we'd change our implementation at some point to keep track of the number of spaces that it encounters, for analytics purposes or something. I don't know, I'm just making something up here.

Right now NoOpSpellchecker takes up no space. It's a zero-sized type; at runtime it effectively doesn't exist anymore, because it has no fields. And AntiSpaceChecker takes up, how many bytes, four bytes? It takes up the size of what a usize is in memory. So these two things have different sizes. When Rust comes here and says, okay, how much space on the stack do I need to allocate for the spellchecker parameter, it looks and says: when NoOpSpellchecker is being passed, I would allocate zero bytes, because it's a zero-sized type, and when I want to pass in AntiSpaceChecker, I need to allocate four bytes, or eight bytes actually, because I'm on a 64-bit machine. And that doesn't fly. You can't have a function that sometimes takes zero bytes for one parameter and sometimes takes eight bytes. It always needs to take the same size.

And that's exactly the difference with spell check up here. As we said, it gets monomorphized: the compiler copy-pastes different functions depending on whatever type we pass in for C, in a way so that it can allocate different space on the stack. When we pass in NoOpSpellchecker it won't allocate any, and when we pass in an AntiSpaceChecker it will allocate eight bytes for that parameter.
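The size argument above can be checked with a small sketch. This is not the code from the stream: the type names are reconstructed from how they are spoken, and the num_spaces field plus the count_changes and checker_sizes helpers are hypothetical names introduced only for illustration. It assumes AntiSpaceChecker has been given a single usize field, as discussed above.

```rust
use std::mem::size_of;
use std::ops::Range;

/// A change a spell checker might propose (names reconstructed from the talk).
pub enum Change {
    Delete(Range<usize>),
    Replace(Range<usize>, String),
}

pub trait Spellchecker {
    fn check(&self, input: &str) -> Vec<Change>;
}

/// No fields, so this is a zero-sized type.
pub struct NoOpSpellchecker;

/// Hypothetical version with one usize field, so it is as big as a usize.
pub struct AntiSpaceChecker {
    pub num_spaces: usize,
}

impl Spellchecker for NoOpSpellchecker {
    fn check(&self, _input: &str) -> Vec<Change> {
        Vec::new()
    }
}

impl Spellchecker for AntiSpaceChecker {
    fn check(&self, input: &str) -> Vec<Change> {
        // Propose deleting every single space in the input.
        input
            .match_indices(' ')
            .map(|(index, space)| Change::Delete(index..index + space.len()))
            .collect()
    }
}

/// Generic, statically dispatched: the compiler emits one copy per concrete C,
/// so each copy knows exactly how large its `spellchecker` argument is.
pub fn count_changes<C: Spellchecker>(input: &str, spellchecker: C) -> usize {
    spellchecker.check(input).len()
}

/// Sizes in bytes of the two concrete checker types.
pub fn checker_sizes() -> (usize, usize) {
    (size_of::<NoOpSpellchecker>(), size_of::<AntiSpaceChecker>())
}
```

On a 64-bit target, checker_sizes() should report 0 bytes for NoOpSpellchecker and 8 bytes for AntiSpaceChecker, which is the mismatch the generic version absorbs by generating one function per type.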
So that's how our generic parameter handles this problem. For the trait object, the way that it gets around this is that we always have to pass our trait object behind a pointer of some type. We have to pass it indirectly. In this case we're passing it as a reference, but instead of a reference we could pass it as a Box. You can see I got rid of the reference here, but we're still behind a pointer. It's now heap allocated instead of whatever a reference would be, but we have introduced some indirection, so that the type that implements Spellchecker is living somewhere else and all we're doing is passing a pointer to it. And a pointer always has the same size. On my machine, a 64-bit machine, it's 64 bits big, 8 bytes. On a 32-bit machine it would be 32 bits big. So this indirection allows us to say: I don't care what spell checker you pass, it could be different every single time, but you're always going to pass it behind a pointer.

All right, I've talked a lot; let's answer some questions. There's a good question here: what's the difference between using Box and ampersand with the dyn keyword? Box is always, always, always a heap-allocated pointer. Essentially Box owns the thing that it refers to, and that thing necessarily lives on the heap. When this spellchecker gets dropped, which will happen at line 15, the contents of the Box will also be dropped, so our spell checker will go away at the end of this function. We know that for sure: it's not borrowed, it's owned here, it just happens to live on the heap. Whereas when we have a reference, we're borrowing it, so we don't really know where our spell checker actually lives. It could be living on the heap, it could be living in some other function's stack frame, it could be somewhere else in memory, we don't know. In fact it could even be embedded in our binary. We don't really
know where; we're just borrowing it. So essentially the difference between &dyn and Box<dyn> is just the difference between ampersand-whatever and Box-of-whatever: ampersand references are borrows, and Box<T> is an owned pointer to something. Hopefully that makes a little bit of sense. All right, there's a question about vtables, which we're going to get to in just a second.

What's the trade-off here? The trade-off in this particular case, and we'll talk more about trade-offs as we get deeper into this, is that spell check, as we said, will be monomorphized, so every time we use it with a different type it will be copy-pasted. What does that mean? It means there are potentially going to be many, many copies of this spell check function, and that can mean larger binary sizes. Let's call it spell check 1 so I can more easily refer to it. Every time you use spell check 1 with a different type it gets copy-pasted, and that means more code inside of your binary. With spell check 2, there's only one function called spell check 2 that exists in your binary. It's this one right here. It's not being copy-pasted for every different type, and that means smaller binary sizes, but you pay in terms of runtime cost, for reasons that we'll talk about in just a second.

With static dispatch, which is what this function uses, you're just making a straight function call. At the assembly level it's using the call instruction and going straight to wherever this check function is actually implemented, and that's pretty cheap. With dynamic dispatch, the call to check has to go through some indirection first. It has to go chase this spellchecker pointer and find where it's living, and what it will encounter there is (I guess now is a good time to mention it, as chat did) a vtable. A vtable is basically a data structure in memory that points
to where the implementations of that specific type's methods live in memory. And again, that's going to be a pointer, so it's going to be somewhere else. So it's basically chasing two pointers to call that function instead of making the direct function call, and that will have performance impacts.

Another performance impact on top of that is that with static dispatch Rust can very easily look through this and do a whole bunch of optimizations. For instance, it can look at the check method and see, oh, that check method's pretty small, I'm going to go ahead and inline the code. I'll take the code from where it is and paste it in, so I won't call the method, I'll just paste the implementation of check inline there, because it's small and it will be faster that way. So we don't even actually end up calling the method at all. The assembly just gets plopped in directly there, and that's when things get really fast, because you don't have to pay the price of a function call. You don't have to use the call assembly instruction; you just execute the assembly of that method's implementation straight away.

You can never do that with dynamic dispatch. The compiler just has no way of knowing, because which implementation we actually call is determined at runtime. It could never inline that implementation, because it doesn't know which implementation to inline. Effectively that's the difference between static dispatch and dynamic dispatch: static dispatch is determining which function to call at compile time, and dynamic dispatch is determining which implementation of the function to call at runtime. So all of the interesting optimizations that you can do with static dispatch, which we have in spell check 1, fall away, and you can't do them in spell check
2. So if you use it in really performance-sensitive cases, you can end up seeing pretty large performance penalties for it.

And yes, as chat is also mentioning, static dispatch typically will be slower to compile, because it has to do that copy-pasting and it has to compile two functions and so on. So it is often the case that you might move to dynamic dispatch because you can afford the performance penalty; your code's fast enough even while paying it, and you only have to compile this one function. Let's say we have a hundred different spell checkers: we'd have to have a hundred different versions of spell check 1, which can get pretty expensive to compile.

Is there a difference between spell check 1 and spell check 2 in terms of pure functionality? No. If we pass the same spell checker to each one, we'll get the exact same answer from either. So from a functionality perspective there is no difference in this case between dynamic dispatch and static dispatch. In just one second I'll talk about how dynamic dispatch can also unlock additional functionality that you cannot get with static dispatch. There's another question in chat: can I do everything with static dispatch that I can with dynamic dispatch? In general the answer is no, you can't, but in this particular case, with what we've written right here, the answer is yes. In everything we've written here there's nothing special about dynamic dispatch other than the performance differences we talked about, both compile time and runtime, and binary size, things like that. That's the only difference in this case.

But let's take a look at one interesting thing. We renamed this to spell check 1... oh yeah, we added some fields here, which I don't want to do
anymore. Cool. So here's an interesting thing that we can do with dynamic dispatch that we cannot do with static dispatch. Let's say we want to run multiple spell checkers over our input text. We can say let spell checkers equal a vec of NoOpSpellchecker and AntiSpaceChecker, so now we have two spell checkers here. And, well, yes, this won't compile. Why is that? Because when we call vec with NoOpSpellchecker and AntiSpaceChecker, we're trying to construct a vector with two different types in it, and we can't do that in Rust. It's essentially for the reasons that we talked about before with the function parameter. When we create vectors, we have to allocate enough space in the vector for each item in it, and NoOpSpellchecker and AntiSpaceChecker might have different sizes in memory. In this case they don't, so go back to thinking about the version where they had different ones. You need to be able to lay the vector out in memory so that each slot takes up x number of bytes, and the only way we can be sure of that is if every element in the vector is of the exact same type, and NoOpSpellchecker and AntiSpaceChecker are two different types. So this doesn't work, and if we look at the error message it basically says: hey, I'm building a vector of NoOpSpellcheckers here, and you passed me an AntiSpaceChecker. That's not cool, you can't do that.

We can get around that. Let's go ahead and box these up real quick. Come on... sorry for the bouncing around here. This also doesn't compile, but it will in just one second, so bear with me. It fails for the exact same reason: we're now building a vector of boxed NoOpSpellcheckers, and we're passing it a boxed AntiSpaceChecker, and
that's not cool, so we can't do that yet. But if we go ahead and say that this is actually going to be a vector of Box<dyn SpellChecker>, now this does compile. What we're saying is that this vector actually does contain all the same type: each element is a boxed dyn SpellChecker, a trait object for this SpellChecker trait here. So our spell checkers is simply a vector where each element is an owned pointer, a box, that points to this trait object, and the trait object in memory is basically just that vtable we talked about: a table of where the actual implementations are for all the methods of the trait it implements. In the case of SpellChecker there's only one method. So this is fine, because every element in the vector is the same size.

We can go ahead, for instance, and loop over it: for spell checker in spell checkers, we'll call spell check 2 on the text with the spell checker. What is it complaining about here? Ah. It's complaining that we're trying to pass in a boxed trait object, but our spell check 2 takes a reference to a trait object. That's very easy to get around: we just dereference the box and then re-reference it, so we basically borrow the contents of the box. And as_ref should probably work as well, chat. Yeah, it also works.

There's a question in chat again about what the purpose of the dyn keyword is. Well, again, you don't strictly need it, because this compiles just fine without it, but we do get a warning saying: hey, this is a trait object, it's not static dispatch, this is dynamic dispatch, and you're not using the dyn keyword here. The reason that the dyn keyword
was introduced was basically to make this very explicit. Let's assume I know nothing about what spell checkers are and I'm looking at spell check 2 here. Without dyn I have no way of knowing that SpellChecker is actually a trait, that this is a trait object, and that dynamic dispatch will happen here. If I had never seen this code base before, I'd look at it and go: okay, this could either be a reference to some struct called SpellChecker or it could be a trait object. Before the dyn keyword was introduced, around Rust 2018, there was no way to tell; you basically had to go look. Is SpellChecker a trait? Okay, it is, so this is a trait object. The dyn keyword was introduced to make that very explicit, and now when you look at it you know right away: this is a trait object, SpellChecker is not a struct, there's no way it could be a struct. In fact, if we try to put the no-op spell checker here, the compiler goes: I expect this to be a trait and you're giving me a struct. Exactly; it's got to be a trait.

There's a question: why do we need to re-reference, it's a Box<&dyn T>? No, it's not; it is a Box<dyn T>. There's no reference inside of this box, so this works here. What happens if we try to use spell check 1 here? It goes nope: I'm trying to monomorphize, to create a fresh copy of this spell check 1 function where dyn SpellChecker implements SpellChecker, and that's just not the case. I'll leave it to the reader to try to implement SpellChecker for &dyn SpellChecker and see the fun you get into with that as well.

A silly question, which is not a silly question: why is dyn pronounced "done" and not "dine"? I think a
lot of people call it "dine". I call it "done", I don't know, maybe it's just me. Call it whatever you want; D-Y-N is also fine. Yeah, "den" is also fine. I'm not exactly sure what the correct way to say it is, or if there even is an official way to say it. I'll have to ask the RFC author what their opinion is. Yeah, the old "gif" versus "jif" debate. Who knows.

So does everybody understand, at a high level, the difference between this static dispatch here, this monomorphized function with a generic parameter of type C that implements SpellChecker, and this dynamic dispatch on a trait object here with spell check 2? Is this making some sense?

Now the question that might be coming up is: okay, in this particular case, which one should I pick? You said they're functionally equivalent. Well, largely it's going to come down to the trade-offs we talked about. Does binary size matter much to you? How many different spell checkers will there be? Will you potentially balloon compile times? Most of the time, static dispatch is going to be what you want for these plain function calls, so most of the time you'll end up seeing the static dispatch pattern being employed for function calls and the like, because the trade-offs usually end up being better for it. Compile times usually aren't affected too badly; it's faster, and doing whatever is fastest is generally what you want by default; binary sizes usually aren't affected too badly either. It's only when you're using it a lot that you'll notice the compile time and binary size impact. So most of the time you're going to want to pick this. But remember that if you want to support the use case, for
instance, of having these different types of spell checkers in a collection or something like that, then dynamic dispatch comes in pretty handy. There's another common use case. Let's say I have a struct for my text editor. My text editor has a ton of fields in it: it's got the buffer, which is a string, let's say, and it's got the cursor, where we are in the text, and so on and so forth, a ton of different things. And you also want your text editor to have a spell checker in it. So let's go ahead and have a spell checker here. If we want a pluggable spell checker, well, we have three options.

The first option is to pick a concrete spell checker: the spell checker will always be the no-op spell checker, always and forever. (I'll get rid of cursor, since that type doesn't exist.) Now we've hard-coded the spell checker in our text editor to always be the no-op spell checker. We can't switch it out for anything. That's not so nice, right?

The second thing we can do is make this text editor generic over what type of spell checker it has. With two fields that's not too bad; maybe this is fine. But then you always have to think about what generic type you want to pass to a text editor. It makes things bulkier to use: when people are interacting with text editors, they always either have to say "I don't care what type of spell checker it is, some generic type C", or they have to plug in the one they actually care about. That's sometimes not so nice. Maybe it is, maybe it isn't. And the last thing that you can
do is say: okay, I'll just make this a Box<dyn SpellChecker>. Now notice that text editor is not generic. It's always just a text editor, not generic over anything, but our spell checker can be swapped out. For instance, we could just make this field pub, and people can say my text editor's spell checker is this different type. It's plug and play: any type that implements SpellChecker can live behind this box pointer. So that's another area where you have to decide what kind of flexibility you want. Do you want the ability to plug in a different implementation? And do you want to make your type generic over that, or would you rather just pay the runtime cost of having it be a trait object?

This is how inheritance and interfaces end up working in a lot of languages. In Java and C#, for instance, basically everything is boxed; everything lives on the heap except for primitive types. You don't have special syntax for it, but everything ends up going through dynamic dispatch. So if this dynamic dispatch sounds funny: in most languages that's just the default, everything goes through dynamic dispatch and there is no static dispatch. It's only really in languages like C, C++ and Rust that static dispatch plays a role.

There's a comment saying this is a confusing type and you might want a type alias for it. Yeah, you could do that as well: type SpellCheckers equals this type, so you don't have to write it out. It is a bulky type, that's for sure. So, does using a Box<dyn Trait> object add more runtime overhead? Yeah, as we talked about before, dynamic dispatch has a little bit of runtime overhead and can't be optimized as aggressively as static dispatch.

Let me check in on the time. Okay, we're
running a little bit low on time, because I have a hard-ish stop in about 45 minutes. So I want to go ahead and start looking at what this looks like at actual runtime; let's open up a debugger and check it out. I'm going to clean this up by commenting out some stuff that we won't need. To illustrate the point, I'm going to get rid of most of the implementation of spell check here and just leave the actual call to the check function. In fact, let's clean it up even more and say these things don't return anything. This is kind of a waste, obviously, but it will be good for looking at it in the debugger. All right, and we don't need this anymore. Oops. Okay, let's just use the no-op spell checker. So we're not going to look at the anti-space checker anymore; we're just going to look at calls to spell check 1 and spell check 2. Again, spell check 1 is using static dispatch and spell check 2 is using dynamic dispatch.

There's a question: does dyn dispatch add significant memory overhead? No, it shouldn't be significant. A vtable with one function in it will probably just be the size of one pointer, the pointer to wherever the implementation of that method is. So that's eight bytes or whatever; that's not very big. Memory is not what you need to worry about in terms of overhead for dynamic dispatch. It's really runtime performance: you have to go through two pointer lookups, and the compiler can't optimize that away, whereas it can optimize away the static dispatch call.

There was also a question about using &dyn SpellChecker inside the text editor type we just talked about, instead of Box<dyn SpellChecker>. That's totally fine, you can also do that, but then you're running into the thing about
lifetimes. Box is great because it's owned, so you don't have to worry about lifetimes, but any time you borrow something inside of a struct you have to worry about the lifetime of that borrow. That's not really specific to dynamic dispatch, though.

Cool. I really hope that gdb plays nice inside of this VS Code terminal. Let's go ahead and run cargo test again. You'll see that when we run cargo test it creates a whole test binary, and this is where the test binary lives, right here. So what we can do is run gdb on it, and I'll run it in TUI mode, the terminal UI mode, on target/debug/deps and the test binary that we were running. Okay, that seems not too bad, so that's good. Then we can say that we want to put a breakpoint at the it_works function, and we'll go ahead and run. Cool. I know my face is blocking things down below here, and you can't see my mouse right now, for instance. Let me know if I end up blocking something important; I'll try to keep an eye out for that. And I think I can... yeah, cool, so this is now full-ish screen.

Okay, so we're inside the debugger now, and we've set a breakpoint on the first line of our function. We're going to take a look at what actually happens when we call spell check 1 and spell check 2, and what that looks like in assembly. Don't worry if you're not used to assembly; we're not going to look at it in detail, we're just going to get a rough idea of what's going on. I think the first thing to do is to go ahead and step to here. As you can see over here, we're getting a warning that our test is running very long; that's fine. We're now at the line where spell check 1 gets called inside our test, and what we can do is run layout asm. What that does is switch from the
source code to the assembly that the source code represents. If I scroll up and we just look here, this is basically the entire it_works function that we had from before, and we're about to make a call to spell check 1. All right, so that's cool. We can go ahead and step, and now we're inside of spell check 1. Let's switch back to the source, and you can see we're in spell check 1, and all it does is call check on the spell checker that we've passed in. All right, so layout asm again. Again, don't worry too much about the assembly. All this junk here is basically putting the arguments in the right place, setting up the stack and things like that; there's nothing much to worry about. The important thing is that this line right here is where we actually call check on our spell checker, and you can see that it knows it's going to be calling check on the no-op spell checker. So this is the no-op spell checker, as a SpellChecker, having its check called, and it's just one assembly instruction, call, to go into that function. We can go ahead and stepi, step instruction, with si. We're going along, executing, executing, and then we execute one more instruction, the call, and we're inside the spell checker's check. Check is not doing too many interesting things here; again it's setting up the stack and so on, which is not a bunch of fun, and then it's calling into the vector implementation, because we're basically calling Vec::new here, and that's it. So nothing too fun, but that's basically what happened: we've gone into the
spell check 1 function, and we've called, using the call assembly instruction, directly into the no-op spell checker's check method. Hopefully that makes sense; it's pretty straightforward, plus a whole bunch of other assembly junk that has to happen for esoteric reasons that don't really matter for today.

All right, I'm going to start over again. My gdb gets messed up sometimes when I do this. What do I need? I think I need to call refresh here. Yeah, there we go. I've restarted the program, so let's switch back to our source. Here we are in our source again. I'm going to call next, which goes to spell check 1, and I'm going to call next again, and it just runs spell check 1; we've already seen what happens in there. Now we're at spell check 2, so let's take a look at it. Layout asm. We're about to go into spell check 2. Nothing very interesting here: we're stepping through, setting some arguments up or whatever, and then we call spell check 2 and go into it. So now we're in spell check 2. Let me refresh again so it doesn't look all messed up.

This is spell check 2, and it looks very different from spell check 1. Remember, spell check 1 was four, maybe five instructions doing these interesting, weird mov qword ptr things, again setting up some stuff on the stack, setting up arguments and so on, and then around here-ish is where we saw the assembly instruction for call. Remember, it said to call the no-op spell checker's check method, so it knew exactly where to go. But there's no call here. It's all the way... there's all this
other stuff first, and the call is way down here. So okay, we're obviously doing more work than we were doing before, a bunch of stuff, but we have no idea what it is. So step, step, step, step, step, we're getting down here, and then... sorry, I need to restart; I went one instruction too far, of course. Layout source, next, next, and then, hello, okay, we're in spell check 2 again. Let's go back to here. We go through just like before, past all these things, boom, mov, mov, and then finally we get to this call assembly instruction again. But notice that it's not calling a hard-coded address anymore; it's calling this qword ptr thing. If you don't know x86 assembly: basically it has calculated an address, that's living inside a register somewhere, and it's going to call the function that lives at the address it has calculated. If you don't care too much about assembly, the point is that we're doing more work now than we did before, and we're calling a function whose address we've calculated at runtime, because it wasn't hard-coded in the assembly, unlike what we had before. Before, in spell check 1, we could just directly call that function. And basically what happens here, if we step in, is that we get the vtable. That's what was happening before the call: we were calculating the address of the implementation we want to call from the vtable, and then we go ahead and execute it. We can go ahead and execute it here.

All right, any questions about what's happening here? The difference we saw between spell check 1 and spell check 2, why spell check 1 was shorter and just had that call
directly to a hard-coded address, and how spell check 2 was longer and didn't call a hard-coded address but rather an address that was calculated at runtime. Do we get why that is? Again, it's important to point out that the reason for that in our code, if we go up to spell check 2 here, is that spell check 2 is calling the check function on a trait object. So it doesn't know until runtime which specific implementation to go to; it needs to calculate that at runtime by using the vtable.

All right, the last thing that I wanted to show you is also interesting. Let's go back here. Oh, it looks like I totally screwed up gdb. Let's set a breakpoint again at it_works; I just want to look again at the static dispatch that we had before. Let's run this, and we want to step here, and then step again, and we need to refresh so you can see it. Again we're about to call check on our spell checker, and if we look at the assembly again, one, two, three, four, five, six instructions into the function we're already at this hard-coded address that we're going to call for check. But we are going to call a function here, right? We're going to call this no-op spell checker check function. So when we get to this point, we leave this stack frame, we leave this area in memory, and we go somewhere else, and now we're in a totally different area of memory.

Now, there are two interesting things that will happen if we go ahead and change our code. Let's add back a call to spell check 1 with our anti-space checker, and we'll add spell check 2 also with the anti-space checker, but as a trait object. So now we're calling spell check 1 and spell check 2 twice each, with the two different spell checkers that we created before. Wait, I have to
sorry, I have to go ahead and compile this with cargo test, and my terminal is all blown away because gdb's TUI blew it away. Then run here, and we can set a breakpoint at it_works and run it. So here we are again: we've got our text, and we've got the four function calls that we're going to make. We can check out what that looks like if we go to layout asm and take a look. You can see up here, this is where we make our call to spell check 1, and then this is where we make our call to spell check 2, and if we scroll on down... sorry, let me go to the top again. This was the first call to spell check 1, and this was the second call to spell check 1, and then our two calls to spell check 2 are here and here.

Is there anything that you notice about the addresses that we're actually calling? If we scroll up to the top: the first time we call the spell check 1 function, we call it at address 555-whatever-68040, and the second time we call it at 555-whatever-680b0. So we're calling two different functions. That proves to you that spell check 1 is actually two different functions in the code: it gets compiled out to two separate functions, one that works with the no-op spell checker and another that works with the anti-space checker. But if we scroll to the bottom, this call to spell check 2 is to 555-whatever-698-something, and this one is also to 69860. So there is only one spell check 2. That's exactly the trade-off we were talking about before between static dispatch, with monomorphization, where you have to copy-paste the code and that can lead to more code, and dynamic dispatch. With dynamic dispatch there's only one version of spell check 2, and we're calling it right here, but there are two versions of spell check 1 in our code.

Just to prove that to you, let's go ahead and step here. We're about to call spell check 1,
and if we step again and refresh so this is viewable, now we're inside of spell check 1, and you can see, just like before, that we're going to call the no-op spell checker's check. Now I'm going to go back up the stack frame. We've returned from the first call to spell check 1, and I'm going to make the second call to spell check 1, which again is a separate function. So we step... sorry, we have to go through all this stuff; I screwed up my gdb real quick. We're going through the implementation, blah blah blah, drop. Okay, now we've finally reached the point in our call stack where we've called spell check 1 again. But look: we're in a different spell check 1 now. We're in the second copy of spell check 1, and here's the call to the anti-space checker's check function. So there we have proven to ourselves again that there are multiple copies of this spell check function. Does that make sense to everybody?

The last interesting thing that I want to show real quick is what happens if we go in here and add this annotation, inline(always). What inline(always) says is: compiler, when you encounter this function call, I don't want you to compile it down to a function call with the call assembly instruction. I want you to take all the code that's inside of this function and, wherever this check method is being called, just copy-paste the implementation right there, so I don't have to pay the overhead of actually calling the method. This is something that the compiler often does by itself, but we're running these tests in debug mode, so a lot of these optimizations are turned off. So inline(always) is basically saying: do this optimization no matter what, even if you, dear compiler, don't think it's a good idea. There's a question about how smart
rustc is about inlining, or does that happen in LLVM? My understanding of the situation right now is that LLVM makes the vast majority of inlining decisions, and there's not a ton in the Rust compiler front end that makes decisions about inlining, except that it can provide hints to LLVM. For instance, inline(always) eventually just gets passed to LLVM, saying: you have to inline this, I'm telling you. So it does know some things about inlining.

So let's see if this makes a difference when we actually run it. Again, real quick: when we call the no-op spell checker's check method, it should now be inlined. We can do cargo test, which gives us a newly compiled binary, and we can run gdb in TUI mode on it; it should be inside of target/debug/deps. Then we can set a breakpoint at it_works again and run it. So here we are again. Let's step, and we'll step into spell check 1, and now we're about to call the no-op spell checker's implementation of check. Let's take a look at what this looks like in assembly. You can tell already: this is the entire spell check 1, and there is no call anymore to the no-op spell checker's check function. In the source code it says to call the no-op spell checker's check method, but in the assembly we don't see that, because it's been inlined. And what is the implementation that we're inlining? It's a call to Vec::new, which is right there. So here we can see that the call to Vec::new has been inlined directly into spell check 1.

But if we go back to the source here (my gdb is just loving it), we're at spell check 1, and if we hit next again, now we're going to go into spell check
2. Let's step into spell check 2, which is again going to call the no-op spell checker's check method, just like spell check 1 originally did. We step into it, and if we take a look at the assembly, we can see that, unfortunately, we're doing all the same work that we did before: we end up calling, right here, into the dynamically calculated address of wherever the no-op spell checker's check method is. Inlining was not possible here because, again, we don't know that we want to call the no-op spell checker's function until runtime. The compiler emits code that figures that out at runtime, so we can't possibly inline that implementation. And this is oftentimes the real reason that dynamic dispatch is slower than static dispatch: we can't inline.

All right, that was a discussion of static versus dynamic dispatch: the differences between the two, a little bit about when you would use one versus the other and what the pros and cons are, and we ended by taking a look at the actual assembly code that gets emitted when we use one versus the other, to see what impact they have at the assembly level. We were able to convince ourselves that static dispatch actually ends up producing multiple copies of the function, depending on the types that we pass in, whereas dynamic dispatch does not; but static dispatch allows for inlining, whereas dynamic dispatch does not. And with that, you know a lot now about dynamic versus static dispatch and what actually happens at runtime when you use one or the other. I think we'll go ahead and end it there. If there are any more questions, I'd be happy to answer them. And please, you can see at the bottom of
the screen... I know you can't; you can see it if I switch over to here, oops. At the bottom of the screen here you can see my Twitter and the YouTube channel where I post this. Please make sure to like, subscribe, all that stuff; it really helps me know if you like this stuff. And let me know on Twitter, for instance, what you want to see more of, whether you found this interesting or not, whether it was too slow, too fast, too advanced or too beginner. This really, really helps me, because I'm taking time out of my day to do this, so it really helps to know what you think about it.

There's a question here: what do you mean by inlining? We saw before, when we added this annotation, inline(always), that when the compiler encounters this attribute it looks at the check method and says: instead of compiling this as an actual function call in assembly (and in assembly a function call is done through the call instruction), I'm going to take the implementation inside of this function and just paste it where I would otherwise do the function call. That avoids setting up arguments, switching out the stack pointer and so on, and then going to a different place in the code. It just executes the function's body inline, directly where you would otherwise have called the function. That's what inlining does, and it has the effect of making things much faster, because there's a whole bunch of overhead that you pay for calling functions at the assembly level that you don't have to pay if the function is inlined.

There's a question about what is needed to collaborate on windows-rs. For those that don't know, I'm working on a project at Microsoft called windows-rs, which is a Rust project for interacting with the various Windows APIs. If you want to collaborate, then please use it, let us know what you think, open up bug issues, and take a look at some of the issues that we have
already in the issue tracker. There are some that I think are marked as good for beginners, and you can feel free to go ahead and implement those as well. All right, it doesn't look like there are any more questions. Feel free to reach out on Twitter. I appreciate everybody's time; it's been a lot of fun. I will see everybody around. Bye bye.
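
The code typed during the stream is not part of the captions, so here is a minimal sketch of roughly what the discussion describes. Every identifier and signature in it (`SpellChecker`, `NoOpSpellChecker`, `AntiSpaceChecker`, `spell_check_1`, `spell_check_2`, `TextEditor`, the `Vec<String>` return type, and what the anti-space checker reports) is an assumption made for illustration, not a copy of the on-screen code.

```rust
// Sketch only: all names, signatures and behaviors here are assumed.

trait SpellChecker {
    fn check(&self, input: &str) -> Vec<String>;
}

struct NoOpSpellChecker;

impl SpellChecker for NoOpSpellChecker {
    // Adding #[inline(always)] here is the attribute discussed in the stream.
    fn check(&self, _input: &str) -> Vec<String> {
        Vec::new()
    }
}

struct AntiSpaceChecker;

impl SpellChecker for AntiSpaceChecker {
    // Hypothetical behavior: report every space found in the input.
    fn check(&self, input: &str) -> Vec<String> {
        input
            .match_indices(' ')
            .map(|(i, _)| format!("space at byte {i}"))
            .collect()
    }
}

// Static dispatch: the compiler generates one copy of this function
// per concrete type C it is called with (monomorphization).
fn spell_check_1<C: SpellChecker>(input: &str, checker: C) -> Vec<String> {
    checker.check(input)
}

// Dynamic dispatch: a single compiled function; the target of `check`
// is looked up through the trait object's vtable at runtime.
fn spell_check_2(input: &str, checker: &dyn SpellChecker) -> Vec<String> {
    checker.check(input)
}

// A non-generic struct whose spell checker can still be swapped out.
struct TextEditor {
    buffer: String,
    spell_checker: Box<dyn SpellChecker>,
}

impl TextEditor {
    fn run_spell_check(&self) -> Vec<String> {
        self.spell_checker.check(&self.buffer)
    }
}

fn main() {
    let text = "Hello, is everybody here?";

    // Two different monomorphized copies of spell_check_1 get called.
    println!("{:?}", spell_check_1(text, NoOpSpellChecker));
    println!("{:?}", spell_check_1(text, AntiSpaceChecker));

    // Different concrete types in one vector, behind Box<dyn SpellChecker>.
    let spell_checkers: Vec<Box<dyn SpellChecker>> =
        vec![Box::new(NoOpSpellChecker), Box::new(AntiSpaceChecker)];
    for spell_checker in spell_checkers {
        // Dereference the box, then borrow its contents as &dyn SpellChecker.
        println!("{:?}", spell_check_2(text, &*spell_checker));
    }

    let mut editor = TextEditor {
        buffer: text.to_string(),
        spell_checker: Box::new(NoOpSpellChecker),
    };
    println!("{:?}", editor.run_spell_check());
    // Plug in a different implementation without changing TextEditor's type.
    editor.spell_checker = Box::new(AntiSpaceChecker);
    println!("{:?}", editor.run_spell_check());
}
```

Passing `&*spell_checker` mirrors the "dereference the box and re-reference it" step from the stream; `spell_checker.as_ref()` should work in the same place.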
Info
Channel: Ryan Levick
Views: 8,828
Rating: 4.9733334 out of 5
Keywords: rust, programming, computers
Id: tM2r9HD4ivQ
Length: 88min 25sec (5305 seconds)
Published: Mon Feb 08 2021