C# LINQ Performance Tips #1 - Let keyword & Custom Lookup

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hi everyone so today we're gonna be talking about link performance tips and it's video one in a series of videos i've touched on a link before so i made a video about performance problems in links a couple of days ago but i obviously have more material to show you so um we're gonna get back to that video because there's some things that i haven't shown and i wanted to show you now um but let's jump in so today we're gonna be talking about uh link in general so we're gonna talk about a couple of interesting things so first it's the let keyword in link then we're gonna move to a group by operator and then we're gonna touch on closures in general and in link and we're gonna go back to that circle back to the video that i made a couple of days about about link and i'm going to show you how closures are generated in that example okay so moving on let's just say for the second of our for for the sake of argument now that we have something that's called a transaction and that transaction has a single field in it called name so we have a list of transactions and we have two arrays which filter out these transactions so we have an array of domestic transactions and we have an array of foreign transactions so our task here is to filter out transactions that are on one of these two lists so the rest is going to be filter out and these will just remain okay so as you might probably saw these are lowercase names and these are like next case but in this example they're all uppercased so in your typical link fashion we would write something similar to this expression here so from x in transactions where domestic transactions contains name to lower or the foreign transactions contains name to lower to list and return the result that's it simple expression and nothing really fishy going on here yet okay so let's test the performance and again i'm not gonna use benchmark.net because for the sake of video it takes too long so i've written my own measure function which kind of works like benchmark.net um it will do all of sorts like these good things like the warm-up phase it's going to run a couple of times it's going to calculate the average and this should be enough for the most cases if you want i can add like benchmark.results into the description of the video or maybe somewhere on the end so this example takes 110 milliseconds all right but what we're doing here so notice that we're doing name to lower twice because there's no way for us to be able to do it once because we have to do it for every transaction and sorry turns out that there's a keyword for that where you can define a variable in link and then use that variable so that's called a let so here we have the same example where we're going to use the let keyword so we're going to define a variable called n and we're going to assign that name to lower and then we're just going to use that n so what we effectively done is we decreased number of operations by half in terms of these checks here because we're doing two lower once and then we just check so that should be better for performance right well yeah but there's always but keeping keep kind of keep that in mind that what you write in link is not what you're gonna get so this is just an expression that gets converted to some code but you might be surprised that in order to solve this problem the compiler has to go to great lengths and will generate code that you would not expect and yourself would never write so if we would write this the plain one the plain way uh we would probably never write such code as the compiler will generate whatever that's good or bad is yet to be seen and let's actually see the performance so example two the first one took 110 milliseconds so this is already surprising because it takes longer 147 milliseconds and it's even slower the second times we we sort of run it right so it's around 20 percent worse at least so what's happening here right like i said uh what you see is not what you get in link and you have to be careful and you have to know that because what this will generate is will it will actually do first of all it's going to select so it's going to transform each transaction to it to a new object not a new variable it's going to be a new object completely that's that's one of the things the second bit is that this new object will be captured in a new lambda where it will still do like to lower and stuff like that and it's going to be very slow and it's going to probably do things that i'm not even aware about because this is just so big so let me tell let me actually show you what the c-sharp code compiler generated because that's that's really interesting so let's go to this tool what you want to check um whatever you want to check what the compiler generates you can go to ielts pi and decompile the code and you're gonna see what's up so this is our first example so we're gonna do where transactions uh we're gonna do the function and we're gonna not just do the lower and it's you know short and this is the led version so this is the led version here it's pretty long and it's pretty surprising so let's uh i've actually copied it to the c sharp code and as yeah as you can tell it does the select it does the filtering and it does the select again and then it will do create a bunch of delegates like one with two lower two uh sorry one delegate another delegate yet another leggett yeah so it's bananas basically so it's it's completely insane and there's probably a reason why it has to be solved like that but i i cannot get the feeling that um you could do better all right so how to solve this problem right so stick uh with this here um so stick with the simplest link expressions that you can find even if it has like a duplicated code because it might not get duplicated actually so it's something to think about you have to decompile and see the c sharp generate code unfortunately to understand some of these problems but that's how it is and if we wanted to go even faster so if we wanted to actually cache that that value here well we could not do link for example if performance is key if it's not then stick with this but if it is then let's just write a simple version of this expression so we're just going to have a loop um we're going to take our transaction we're going to do name to lower and then we're just going to check in an if expression and lastly we're going to add to the resulting list and that's it so that's that's all it takes to solve this problem so let's see how the version three performs with just a loop well it's at least two times faster so it's two times faster from the first version and two and a half times faster from the second version which uses the leg expression so that's pretty good um yeah so this is one of the the pitfalls that you fall in when you're using link expressions and link is not all bad so um but you have to be very careful how you use it and there's some pitfalls like this so this example probably is a okay if you're trying to use that but this is not but while we're at it can we do even better than this version turns out yes we can so there's a four version here that is faster but what it does is surprising because it just changes the conditions so now we're looking at foreign transactions and then we're looking at domestic transactions and this is sort of um cheating because if you know the business context that you have more foreign transactions than domestic transactions then um you can be thoughtful about your if statements and if you do it's gonna get even faster so you have to measure this and have statistics and data that back it up but if you do then you can go a bit faster it's it's like a couple of milliseconds but you know still okay so as it turns out you have to be careful with using link but let's move on because there's some other things that i wanted to show you so for example um there's this well this thing that you have to be uh careful about as well so we if we have a list of transactions and we're gonna do transactions last and measure the performance of getting the last item it's gonna be 1.1 milliseconds but if we increase this a bit and run it again um probably too much yeah so too much um let's leave it at two zeros extra no still okay so now it's 154 milliseconds that's that's something that we can sort of measure and this is a list right so the last should be like instant it shouldn't take any time but since link doesn't have the notion of type systems it just has like interfaces and all of the things are like based on i enumerable there's a couple of things that are happening here so first of all all of the codes will be virtual calls which already takes um just a little bit of overhead on top but that's not important the important bit is that it doesn't know that it's dealing with the list because if it did know that it's dealing with the list we just could get the last element easily so let's see and link can totally do it because you can have checks to check if your innermobile interface can be converted to a list and if it can sorry cast it much rather to the list and if it can then you can do it but for some reason they don't so let's write a function that will get our last element so it's this function let's assign something to it for x and let's run it so it's 30 milliseconds it's 10 times faster at least so for some reason the c-sharp devs didn't include that i'm sure there's a valid reason for that or maybe there's just time but it's something you have to be careful about when you when you're using link because it has all of these operators that can bite you but we could for example implement this operator ourselves because as it turns out you can in fact overwrite the default link implementation because if you have this included in your project in the same namespace and you and you have this this thing here then it's going to use your implementation of the last and if that happens then we're just going to use this implementation here and as you can see this is the default implementation of the link but we're checking against our ilist and if we find the eye list then we're just going to do the same thing that i did so if we're dealing with these sorts of situations then we can easily replace the implementation of the last operator so for example i if i know that i'm even dealing with an ilist on a source then i can already do it and now if i'm gonna do the last it's already faster right it's it's not as fast for some reason because we have a bunch of more checks maybe i don't know something but it's faster already so this check is so actually there's no checks but this is an ilist so it's a virtual interface it's not our own so perhaps there's some cost associated to that okay so um we can overwrite all of the link operators i i do not um encourage you to do it i just encourage you to overwrite some of the implementation of link if you feel that it's bad for you for some reason but let's actually now talk about the good things of link because there's totally a bunch of good things in link so let's switch to example number three and again the same thing transactions but this time around there we have a different task so our task is to group by the transaction and we only want to return transactions that have the same name appeared on that list more than once so in this example we have two transactions that have a count of greater than one so we should just return this so we're gonna do a group by using link and let's see so this is the group by we're gonna group by into names where names count greater than uh one here and we're just gonna select the names and let's run 18 milliseconds pretty good okay so um let's just say that we would we don't trust link we say that yeah we can do better let's try to do better by having a dictionary uh which we'll to use to count all of these things and then we're just gonna um go through the dictionary and add all of the you know key values much other keys that are greater than one they have a count greater than one so this is the dictionary version it's what most people would do when they would try to re-implement this okay turns out it's a lot slower how much slower two times two and a half times something like that two and a half times slower so as you can see there's a good things there's good parts about link and you cannot implement this really faster because the implementation of group by uses something that that's called a lookup table so this is like a data structure that they've built and this data structure is extremely robust so you just cannot well you can but you have to try hard to implement a performance version of that so there's just no point in doing that because in this example link actually wins and it wins by a lot but let's say that we're stubborn and we want to prove that we can do it so we're going to implement our custom lookup which we think is going to be faster and let's see how that works so we have a basically a hash table we're going to calculate the hash this is here to not have minus values because get hash code can give you from minus x to plus x and this and here will just flip that last bit and it's going to be all positive values then we're going to have buckets which we call groups and we're going to add to group so this is using a bunch of like clever stuff to be able to do it quickly there's almost no if statements so branches are eliminated and it's all good and it's all pretty good as you can see it's using inlining and a bunch of stuff and let's try and you know let's actually see how it performs because you have to just take my word for it but this is an optimized data structure it's not like dumb data structure so it still takes yeah so it takes 26 milliseconds while the first version of link takes 14 when all of the caches are like optimized and stuff that's why i'm running a couple of times because the caches get hot and you can see the most optimal version the most performant version much rather so as you can probably tell my version of the data structure while optimized is still slow as compared to the link version so i would have to do a bunch of other stuff here to be able to make it faster so for example uh i would need to use like a linked list here um to like filter the news out of that group then i could have a much more optimized hash code implementation because this is not like a non-deterministic version all of the the deterministic versions would be faster but like i said it's difficult to do so there's no point in unless you're like have a valid use case there's no point in implementing group by by your own link is just better so as we can see link is not all bad so link is good as well and you just have to know where to use these things so yeah and lastly let me show you a thing about closure so in order to get this you would have to probably uh watch my previous video but um let's see here so i previously had this find function written in link where we had the where operator let me actually show you real quick so we have this find function here or we have this where operator and i know we can do like for example this um we can remove that and we can add this here but it was just to show you that people make this mistake all the time so they do where's something something and then first to default first or default and this is like a redundant call it doesn't have to be here you could just do first or default and pass that lambda in here but this holds all sorts of problems so first of all this is innumerable and again virtual calls already this is on the list under the hood but we're not we don't know so we're gonna promote our iterator to the heap we're gonna um get this item push it to the heap we're gonna get this lambda expression push it in the heap and there's gonna be a bunch of locations that we didn't even wanted to have and why is that happening here because some of the lambdas don't have to be promoted to the heap at all but when you have a situation like here where you just pass something inside of the lambda from some inside context then it has to be promoted to the heap because whatever you run this it has to be in a public class where this guy can reach it and this code here shows you how it's implemented so this is our find function where we have our first default and we have this display class here which we're going to initialize and this item here that we're interested in because it's used in the where expression is going to be hooked to this field of this display class and now our condition is going to be presented as a function here which you know returns a ball and we're going to use this here and that's why in this example we uh we're actually boxing the item and lambda and everything else because like i said it doesn't have to be like that but here there's just just no choice so how would we solve this problem and this is a discussion for um like a series of videos i guess because it's tied to compiler and compiler designs you would have to effectively bake that value so what i mean by that so when you when you pass it passing this along you create a copy of it inside of this link method and you just use that copy and then that lambda doesn't have to reach anything anywhere else because the value is effectively baked so the compiler could even turn it into a static at this point of invocation so for example if this is 10 the compiler could just do greater than 10. and now there's no problem because if this is greater than 10 then we don't have to create all of these classes and you know do all sorts of these uh weird things we can sort of verify that the compiler does this because i'm not sure if it does but if this would be baked so baking means like you know evaluating the variable and cashing it in somewhere is it here or just turning it into a cons that will be fine as well then there's no problem but then you have to throw the value array but it's possible so it it's possible to create a compiler like that to be able to do these things and maybe we're going to see it at some point in that as well so let's compile this let's throw this guy away because it does not refresh automatically and let's add this and let's see now so we still have a class which is sealed um but now we don't have any uh sort of um variables so that variable is not going to get promoted to the heap but still that function is going to be permanent heap and that's not um optimal still but [Music] we could still probably um i mean the compiler could probably skip generating this class altogether because if we're gonna have a function um then we could probably use something like a local function here so there's ways to do it but it's just it's out of the scope of this um you know video at least so that's all for today um so thanks for watching um if you have questions um leave them like in the comments if you liked the video leave a thumbs up and um thanks for watching so yes bye [Music] uh
Info
Channel: LevelUp
Views: 19,296
Rating: undefined out of 5
Keywords: dotnet, dotnetcore, C#, csharp, programming, performance, allocations, computer science
Id: Dv_nsoEmC7s
Channel Id: undefined
Length: 26min 10sec (1570 seconds)
Published: Wed Aug 12 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.