To LINQ Or Not To LINQ - That is the Question

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
to link or not to link that is the question what's up youtube in this video we're going to be looking at link and we're going to be looking at performance now i don't want this video to come off as never use link or don't use link i'm a big fan of link from a technology point of view but i'm not a big fan of link from the point of view of usage which is it makes it really simple to mess up so this video is kind of in that direction it was i guess the genesis of this video the chord you're seeing here is like real scenario real world scenario if you would like vocal had to modified a bit but the genesis of this entire benchmark that i'm going to show you today is the result of something that occurred in the real world and then sort of the pursuit of trying to find a better more optimal solution let us down a path where different members on my team came up with different alternatives and so i put them all together to kind of benchmark and compare them all so i want to show you that result of that here my opinion of link it's a incredible feature i think most languages actually don't have that feature yet maybe they don't think it's worth it but as i said earlier it makes it really really too easy to mess up and i don't like that about languages where it's it's so easy and you think it's coming for free or there's no cost to it in terms of performance or memory allocations but then it you find out later that that's a heavy cost there so i use link in sort of the simple scenarios and even there i'll show you in this video even in those simple scenarios there are performance issues but again you know what does a performance issue actually mean if something is faster or slower does it actually have a matter have a does it matter to you i think that's where you need to look at this this video from or any anything based on performance right or benchmarks nanoseconds milliseconds here and they may not make a difference if that method is being called just once let's say you know every second or what are the the frequency that you have right but if it's a hot path where that method is being called so many times in a second tens of thousands of times a second then minor improvements in performance will give you huge gains but if you call that method once in a while and it's millisecond to microsecond differences i'm not sure that makes a difference right so i don't want you to chase performance for the sake of squeezing out performs even when there's just one method but i do want you to understand the stuff i'm going to talk about here and the other benchmarking videos i've talked about you need to understand what happens why things get slower so that and when you have a need to make something as optimal as possible you're aware that you understand the reason for why something might be slower or faster so that's essentially the point of this video right so we're going to be looking at some code and then we're going to look at an excel spreadsheet but i've charted out the results to make it more visual and then we'll discuss the reasons for why certain versions methods if you will the benchmarking methods the different ideas are more or less costly so that gives you a better under the hood kind of sense of link how it works and what it does on the hood all right okay so let's look at the code here so i'm going to basically explain the the code here there's a bunch of a collection of customers here where the customer has a first name and a last name property and i'm just gonna put some names here and then the last name is just start with al okay and the idea is that given a list of customers and then given a list of three other customers and it's not a list because there's a hash set for a specific reason here but that is these are also first names of customers that you've got from somewhere the three completely different collections with only the first image there and the idea is you want to get the intersect if you will of these lists right meaning any customer that's here and either here here or here in either the list needs to be your result that's the basic premise or the idea for this thing i've simplified this i've pulled it out of context it may not make sense to you but as i said this was a the genesis of this whole thing was based off of some some real need we had where i was doing a code review and i found the the baseline if you will i'll show you that and then i kind of sent that out of the team saying guys okay let's see who can tell me what the reason is for the slowdown or how to improve the performance of this thing because we were having an issue and so the team set out to implement various versions of it and they send their versions back i benchmark them and i'll show you right but i hope you understand the basic premise is that you want to get a list of customers that are in here or from this list of customers rather you're going to get any customer that's in any one of these three lists in other words since there's a a a d d e e e we want a a a but we don't have a ddd and e so we want to aaa from here we want the bbb from there and the ccc from here right that's the output of course it's case insensitive compare so that's the other aspect of this whole test is some of things are focused on the up in the two upper or two lower case insensitivity comparisons and not the outcome of that all right i also have in this case an array of customers but i also have a a new ball of the same same set of customers so one is seen as an i renewable of customer and this is intentional i'll explain why and there is an array of customers so this thick set of customers the list of customers in each of them is the same right they're exactly the same reference there this is the reference type just one is typed to be an iron renewable of customer that is typed to be a and ray of customer right okay let's look at the baseline now this was the original sort of code or style in which the code was submitted saying customers.where and then you do an or for every one of the the three different lists in this case and this is a hash set so you can use the contains and you do a c which is the current iteration if you will of the customer from that collection first name not to opera but then you're doing it multiple times so that and the tool this is only to kind of materialize the list if you will so that you know uh it's not a default execution in this case so the tools will force the execution of link don't worry about if you don't understand that [Music] so this the kind of the thing that triggered me to even want to start to optimize some of this stuff was the fact that that the two operators here is multiple times you don't need never ever need to do that right so the first order of business would be saying okay well i don't need to do the two upper of the same name multiple three times for the same iteration remember this where this delegate would be called multiple times per element in the collection of customers here so yes you might want to do it once for each iteration but you don't need to multiply right so that was my initial thing i said okay listen we are having some issues here that two upper multiple lines needs to go so let's try with that and see what happens right and so the question then became what how do you do it like where do you do the two upper and once in the iteration if you understand link you'll say okay well i'm not sure how i do that right so there are two alternatives in link that you could do one is to use the let keyword within the link statement right so but that would require to write um the query syntax you know from blah in last kind of things in syntax and the other is to use a lambda statement block right so we'll get to those two alternatives in the code that we see over here all right then somebody from the team said well you know let's just do this uh with the equality comparer instead of doing the two upper thing like we're doing over here we'll define any quality compare that is ordinal ignore case if you want to see how that's by the way the source code and the excel spreadsheet always will be available to you on my github in fact i've shared this code before uh in a for a previous video as well where we are doing other benchmarks so this excel spreadsheet in this project has got multiple benchmarks string comparison equality a whole host of them this is just one tab if you will in excel especially so one of the benchmarks with the source code available on my github i'll put a link in the show notes below all right so the string comparer here is defined as string compare ordinal ignore case which is sort of considered to be the the best way to compare strings their ordinal comparison and ignore case and uh says define initialize as a single variable here up front then somebody on the team suggested why not do a union or sort of combine okay this is physically modifying the collection this collection is being union with that and that so these three collections are being combined into one but it's a physical alteration so you can't go back so we said well let's try it you know just get a sense of what it takes first and what it looks like then let's worry about whether you're modifying that collection and if that's acceptable or not right the benefit here is that you're seeing this case you're comparing the same first name three times right there's three comparisons here in this case you're com converting it three times per iteration this is comparing three times per iteration but here you are comparing it just once per iteration the full iteration nonetheless but you're not comparing you're not doing it multiple times so this two of us exactly once per iteration this is the same as sort of an optimized version of it so let's see what that looks like and let's look at the results so you can as we as i'm describing the implementation you can look at the results as well so the baseline that is the the baseline and that turns out that it had a average domain time of 706 nanoseconds again these are all nanoseconds so it's not like a whole lot or a huge deal keep your eye on also on the allocations that take place here so this is using link of course i don't mention it here but the get using multiple two upper if you're doing multiple two uppers it takes about seven or six nanoseconds and it's allocating 680 bytes during that process okay the allocation for some cases is the string allocations the two upper and other cases just link has allocations and that's one of the big differences between link and let's say a four or f for each even a for each has allocations a for loop does not i mean i'm not saying that you never have allocation depending what you're doing inside the form but by itself the for loop itself has no allocations whereas with link you always have allocations all right the next was the multiple equality comparer let me also flip to the code here so you can look at the code so this is the baseline then multiple equality comparisons we are doing this same same code here except we are using the comparer and the compiler is doing an order ignore case rather than doing it two upper three times basically let's just compare it that arnold ignore case compare and let's look at the result of that that is this and it's actually slower than it consumes more memory than the original this to me was a bit i didn't expect it and why would it be slower i mean you're you're comparing it or not in our case but keep in mind even though ordinal ignore case is the sort of the more recommended version of comparison string comparison maybe you have a culture sensitive thing that's a different story comparing converting it to upper or two lower and then doing comparison is a much faster comparison by itself it may not be the right or the correct component to do but it's the faster compiling and it's not a whole lot slower or faster it's got more allocations it takes a little more time that's this one here right so again i would resort to ordinal ignore case comparisons just because that's sort of more sensible to me if unless you're dealing with some culture sensitive things as well the doing the two-hour to do a comparison doesn't doesn't sit well i just do an ordinal ignore case and it has a tidbit of information in the dotnet framework they had optimized the two lower implementation over the two upper okay so the two lower was a bit more optimized than the two upper in.net framework and dotnet code they change that to make the two upper more performant than the two lower i'm guessing because they found that most people are using two upper instead of two lower to do comparisons like this and so they decide to optimize that method more than this method so anyway just a little bit of trivia all right then the merge sets was the so the union if you will where we combine let me show you the code so this one set is being combined or these two sets are being combined into the one set and see then you just do a comparison once it's still doing a two upper but of course it's happening only one time per iteration as compared to like over here three times per iteration right so let's compare that with that one so you can see this is actually quite a bit faster than even the baseline right it's got less allocation and it's faster but remember that the one of those sets got modified permanently modified so that may not be an alternative right so but we did it anyways just to see what it looks like and of course we expected since there's only one comparison one conversion happening so that's the difference between that and the baseline all right all right then somebody came up with this idea just trying to do a concat [Music] and let me show you the code here they can cat so essentially we are taking the customers collection and we're concatenating it three times like you know three times and then we're doing it two uppers you're concatenating so still modifying the customers and then you're doing a first upper once yes but oh my gosh that's a lot of work you know as simple as this looks and link if you were to look at the code generator for this and the allocations that occur behind the scenes this huge amount of work and huma many allocations that occur as well and plus you modified the original list so it's not quite working but it was worth the shot the idea was when we were trying to do this just come up with anything while let's you know i've got the benchmarking stuff in place give me the code and you know i just plug it into this thing you see the you know the result of the benchmarks comparisons right so no harm if you were to just try something wild maybe it works maybe it doesn't all right so that one as you can see took the the longest amongst so all the trials we did all right the next one was this which is this one here now i'll show you the code the using query expression let so this was the alternative as i mentioned earlier where you want to do this once the two upper once per iteration but how do you do it just once per iteration in a link query of course if you use the query expression syntax rather than the method syntax then you can say let and so it's going to do this once per iteration and for every one of these it's going to use this upper version to compare it with that right so that's one way to do that and sort of equivalent to that my favorite actually this was my suggestion upfront meaning that while i was doing the code view itself i said you guys you know like this two upper saying just do it once and to fall in step with what the dev had already written i said just do it like this so if we went from a lambda statement sorry lambda expression to a lambda statement block what's the lambda statement block here it is scott within the where or any lambda is really a curly open closed curly and then you have to explicitly it's like a method body if you will right it's a full full out method body unlike where you have only one line statements we are returning the result of this bowl expression so we have to actually explicitly define or say return in the lambda statement block lambda expressions don't need a return statement return statement in there but lambda statement blocks do all right so what does that look like so the let the query expression with the let and the lambda statement block i don't call it lambda statement block i call it local variable assignment for two parts we do the two upper once assigned to a local variable within the lambda statement block it's still happening just one time per iteration even though we're comparing it three times but we only do the two upper ones let's compare this and this the query expression and the statement block all right so that's the query expression the blue and then the local variable here is this one so you can see there's a big difference in fact this compared to the baseline i was very impressed even though i was the one that came up with that immediate kind of result because i'm thinking just from the perspective of i know the two hours is going to cost us time so before we have that issue let's just take it out and this maybe some of you don't agree with this style i'm not saying you want to pre-optimize but define free optimization right i'm sure a lot of people will say pre-optimize and if you've read donald trump's book or at least that one chapter you'll realize that it doesn't just end there but we just say pre-optimization is the root of all evil and that we don't follow the next statement which actually kind of explains the whole process so the whole statement much better nonetheless when you know something is going to cost you and it's not you're not writing weird code you're not making your code so unreadable and if it's just simple steps or simple things you can do to clean up your code and if that's going to give you that little bit of performance here and there go for it right i don't unnecessarily leave performance on the table i'm talking about not kind of bending or backwards to try and get that performance i'm talking about just simple things for example i've found uh radio.net to be way faster than ef i don't use it.net even for my little blog i won't use it it's just going to cost me you know so my blog is running on on azure with the smallest little vm that i can buy or pay for an ir why would i want to make it more expensive that doesn't make sense i mean it just doesn't make sense so that's what i'm talking about the code looks the same it's not any different right but it gives you that you're not leaving so much performance on the table actually what i'm getting at so this alternative that i showed you here this one was basically just looking at what he had the original bass line if you will this is already there in the code that okay since you have this there i'm not happy with this part here right the best would be to you know what make this is a lambda expression make your landlord statement block declare a local variable to this two upper once and then use it here three times right that's it so i would be happy with this modification that's it so you want to use link great you know no harm but be sensible don't unnecessarily leave performance on the table be sensible use link when you can when it's simple in my world we use link mainly for like where clauses that's it and even in where clauses i can show you even where clauses simple basic simple where clauses can cost you okay so but that's okay we don't need that that's super duper performance every every place in our system so we don't worry about it and we it looks sort of clean even this may not make sense but this compared to the four loops might actually make more sense so we just look at the for loops now so the first sort of comparison or implementation we had for without link and that was my my second step was just as my own inquisitiveness i said okay well that let you know the land statement block version okay that's cool but what would the for loop be so i normally sort of baseline it at that that's okay i'm using four each or four or ironing rubles what are the cases let me see what would happen with the for loop i shoot down to that photo basically let me compare this with everything else and then the difference will tell me whether i'm okay or not right so there's a option to use for each instead of four right and then there's the four and then of course there's also the innumerable versus arrays and there's a big difference there one link treats the different in collections if you will the annuals differently so internal to the implementation of the irony rule extension method like the where and the select and the count and the first and last all these methods if you can imagine the implementations internal to the implementations they have checks is this source this collection that comes in is this an array okay if it is then do this if this is it online list then do that if it's everything else do something else right so they have conditional logic where they do the implementation of the the first or the select or the the last of the count or whatever that is that the method is is different based on the type of collection all right so here we're using the innumerable version of customers intentionally because the four each will have also do the same thing this is not linked specific but it's 40 specific the four huge implementation also smart says okay if i'm gonna you remember the four is just lowered have you seen my loading video on c sharp lowering lowering upper thing to it over here the lowering video is the lowering step is a step in the compiler that happens before we get compile the whole code base anyway but take a look at that video i want to go into the hole right so the time of lowering before each gets converted if it's an array it gets converted to a for loop so the compiler in this case the c of c sharp compiler smart enough say okay you got your four each loop here for each you used for each at the time of lowering it says well is that an array if it is i'm just going to make it a for loop all right so to force it to use fouries out to make and give it a new wall that won't compile it down to a for loop in case you're wondering four loops are the fastest four reaches are second best if you will link is the third right but i was trying to explain that the four loop four each can also be converted to a for loop and you might get bad or misguided readings because they both are the same so i'm using the annual layer idea is the same you do the four up four two upper once and then you compare that nice per iteration you're just doing the two upper ones then this is without laying but with the for loop so you use a for loop here again same idea the two upper ones right so we have the four version the for each version and of course there's a link version so these two so we'll compare the this one as well the local variable so that's the local variable one the green and it turns out that the four each with the renewable actually is a bit slower now all right and if you look at the the memory allocations they're about sort of the same but the link for reaches which lower the intervals so the question is how is the 40 slower than the link version this is something people don't understand with link link has many allocations that occur and also things that slow things down in link right let's take a look delegates delegates are an allocation delegates are slower than calling even a virtual method in c chart meaning if you call a non-static method or a static method or virtual method they're faster than a delegate method or a delegate right interface methods or classes if you since you're working with an interface like iron renewable and you're calling methods on it you're i call i'm just calling them interface methods right something of course that has to implement that interface but when you call through an interface it's slower than even calling a virtual method or a delegate it's the slowest so when i gave that for each and i enumerable it had no choice but to call it through the interface because it doesn't know it as some other type it's seeing it as an interval so it's calling it through the interface and that's why it's slower okay so these are certain nuances i'm not sure you necessarily need to worry about when unless you get to that performance site right but it's nice to know isn't it it's nice to know that certain things can change slightly but you may not you may not need to worry about it but at least you know why it's happening right so that's why it's happening all right then back here done these two the four loop okay now these two other methods here i put comments here they basically i'm just trying to get a difference between the two upper and not so here i'm not interested in getting an output i'm going through the motions but i'm not comparing it with the two upper in this case it's the four each with the innerval and in this case it's the four with the array with the array but no or two upper glass we're going through the motions doing the checks just not returning anything in the collection all right and that looks like that so this is the four and that's the four so the difference between the two upper with the for loop and the two upper without the for loop going through the same motions is that much right this versus that and allocation is 280 bytes versus 32 bytes right what do we want what about allocations exactly allocations may not matter in fact c sharp is super fast in allocation it's faster than in c plus plus allocating allocations are faster we love to create things in csharp.net so we're super fast allocating we pay the price during the garbage collection so the garbage collection is what costs us so how do you make the garbage collection faster don't create so many things the few things you create the few things to garbage collect the less often the garbage collection will kick in the more performance the application becomes right so that's the kind of the tension between performance and ease so in this case we'll be creating all these strings with regards to the two upper and just to throw them away that's adding pressure to the garbage collection that starts slowing things down right the garbage collection has to start collecting if you will look at the gen zero collections here look at the difference there's also fewer gen zero collections uh in this one compared to that one right that's point zero three versus point zero zero three right so this has many more garbage collections gen zero garbage collections than that one only because of the two uppers all the strings that are being allocated are being thrown away and thus that might have to be created per iteration and then it has to be thrown away but here i'm not intentionally not creating the doing the two uppers i'm not allocating it's not like this is a correct code the implementation is not correct those last two methods are there just to get that difference without the two upper but a for loop without the two power button for each loop what's the difference right the for each is this is the one the link for each and it's got uh 320 bytes and it took four and thirty point six nanoseconds and the no upper equivalent is this one so these two are link for each base basically so again same thing with fewer allocations between that and that you can see fewer garbage collections and faster almost half the time so the two upper is costly now you may not be able to avoid it for the most part so i'm not saying that right the last these two methods here right but these are not correct they're not giving the right output or expected output but then do it as infrequently as possible which is the the doing it once per iteration option right so again in the actual code because it wasn't so performance critical that that piece of code i suggested just going with the lambda statement block with that the local variable for the you know the two upper ones in the local variable which is i don't call it a lambda statement block here but it's uh this one right so that's this one uh this one yep compared to that and i'm good with that for like any production use where it's not you're not trying to optimize you're just not you're trying not to be wasteful right i think there's a big difference between trying to squeeze out as much performance as you can versus just you know being conscious of the waste in this case you're creating on uh unnecessary allocations why if you can do less why not you know so just being more cognizant of the waste if you want so i'm good with this for production but if you want more performance then you might want to resort to that which is just slightly faster these two between these two it's 390 and 341. this is faster that's invent less allocations as well right so but i'm good with that for production use because they're just being mindful of the waste right how many times you have to do the two upper for the same thing right can we just not clean that up a little bit that's it so but i hope you found this video interesting some of the thought processes that went through you know the different people on the team coming over there maybe if i could work on cat i mean you can do a merge of the sets and you know and the outcome of that and of course eventually what was the result that we went with as i said this was the one with this lambda statement blocks but i hope this was fun i hope you've learned a few things about link about c sharp about the c compiler if you have please give me a thumbs up and i will see you next time
Info
Channel: Shiv Kumar
Views: 2,062
Rating: undefined out of 5
Keywords: LINQ, Delegates, Anonymous objects, BenchmarkDotNet, Benchmark, Performance, .NET, .NET Core, ASP.NET, ASP.NET Core, Programming with Intent, Memory Allocations, Lambda, Lambdas, Interfaces, IEnumerable, For, Foreach, Roslyn, Compiler, Compiler Lowering
Id: D1m-RIWFrhM
Channel Id: undefined
Length: 32min 34sec (1954 seconds)
Published: Sun Mar 28 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.