Stop using LINQ to order your primitive collections in C#

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello everybody i'm nick and in this video i want to introduce you to the brand new ordering methods added in dotnet 7 and explain why even though they're slightly better than what they aim to replace they're not really the thing you should be using for ordering for most cases there's actually something way way better both faster and memory efficient than we're gonna see it in this video if you like about content and you want to see more make sure you subscribe bring this notification bell and for more training check out nickchapsas.com all right so let me show you what i have here i have a simple project over here which is running dot net seven and i have currently installed the latest rc one on my pc so that's what we're working with and this is just a simple console application and what i wanna have is some items collection but just to make this more easy to understand for this example i'm going to make a collection here from one to let's say 100 items but i'm going to randomize the items in that collection so i'm going to say give me some random data with a seed and then i'm going to say dot select and i'm going to select random dot next and this will give me in this case let's say do we want to enumerate let's say two array just so you can actually visualize this better it doesn't have to be an array for this to work it works on any innumerable so if i have an array of integers or a list of integers or a collection of integers if i wanted to order this with link i would have done something like this i would say item dot in fact this should be items dot order by and then i would order by itself and if i say to array i don't have to do it twice i could just skip it and only do it once and this will only allocate the array once but if i do that and let me show you what happens i'm going to add this so it doesn't exit and i'm going to quickly debug this i'm going to go ahead and press debug to run it and then as you can see it's going to create those items over here all random unordered and if i do this and order it by itself then i'm getting everything ordered in a ascending fashion which is the default and if i want this to be ordered descending then i could say order by descending but now because i'm enumerating this multiple times i have to say or it should be better to say to array over here and if i do that then i can get the things order once i know the twice this one is ascending so from the smallest to the biggest and this one descending from the biggest to the smallest now what dotnet seven does is it actually adds a method that removes the need for you to have this lambda over here which will be allocated and have memory implications so we cannot remove that and say simply order or order descending and now you don't have to have this lambda they're targeting to itself so if i go here and run this then as you can see the numbers will be the same because we're using a randomization seed so the data is the same we're going ascending over here and we're going descending over here and everything works now i've seen it in many places where people presenting this as the way to order collections less arrays and it's not the case what i'm going to do is first show you how this compares to the order by versions of ordering that we had in dot net six and actually in any version before that since link was basically introduced just to show you how the delta between the two things are so i'm going to say add benchmark.net in here to run some benchmarks and i'm going to add a benchmarks class over here i would like to have a memory diagnoser so i can actually see what's happening with my memory and then i'm going to have the randomization element again so i can actually run the tests in a deterministic way just so there is no difference between run to run i can do it by simply adding a random field in my class and then for these tests i'm going to use an array first we're going to see a list later but for now we're going to start with an array so we have an array over here of integers and i'm using the same technique to create a randomized array with shuffled elements basically and our tests look like this we have the order by and the order then the order by descending and the order descending and in both cases we're enumerating to an array just so we can actually evaluate the results of those things because if i was to return innumerable because that's lazy loaded it wouldn't actually return anything making sure that this is in release mode and then go here and comment all that out and say benchmark runner dot run benchmarks not bit vector bench marks fat fingers here we go and that should now be enough to run what you see in here so let's go ahead and run this and see what we get back in terms of performance between order by and order and order by descending and no descending so results are back and let's see what we have here so order by an order just a small minor nanosecond level difference in speed however less memory allocated because we don't have that delegate anymore and the results are the same for descending that doesn't seem to be a natural difference this is all within margin of error between the two things so yes it is just slightly faster but nothing that would you'd ever see really um and same goes with the memory yes you're saving some memory but it isn't anything major i think this was just added mostly for cosmetic reasons and if you actually see behind the scenes it is using a static identity function here which is what will ultimately call the same order by method anyway so it all goes all the way back to order by and order by descending just in a more efficient way that is only allocated once now why i think you shouldn't really be using those methods in the first place well because in this use case you really have two things right you have an array you have a list you want to order well both of these methods have very efficient sorting methods so i'm gonna remove the descending from the equation just because it will just make the tests run slower the order by ascending will be enough but what i'm going to do is i'm going to change this test now and i'm going to merge the allocation of the items inside the test itself because in order to showcase the next method i kind of have to because there is a way to set up the data per test execution but it is a bit tricky so i'm just going to merge that so everyone is on the even playing field and i'm going to show you how you can very very easily write a sort method that does the exact same thing significantly faster so we still have the items to array and what i want to do is say array dot sort and then pass down the items and then we can just return the items i don't need to do the two array now i don't really need to enumerate the two array twice i can operate on the new rule directly so this will make order by an order just a bit faster but we're comparing the fastest ways to do this thing so i think that's a fair compromise and just by using the sort method over here we actually are going to be way way faster how much faster well let's see i'm gonna go ahead and run this benchmark and let's wait and see what we get back all right so results are back and let's see what we have here so as you can see over here order by an order in terms of speed pretty similar basically the same thing order is more memory efficient just because of the absence of that delegate however the sort method doing the exact same thing 2.4 microseconds and basically one-fourth of the memory of order by and one-third of the memory of order now you have to keep in mind that array.sort will actually mutate the array it won't return a new array it is a void that actually mutates the array so that might not always be the thing you want to do with your collection because in all of these scenarios order and order by are pure methods are actually tagged as pure i'm sure if i go uh over here in hover they must be okay order by is tagged as pure uh pure means it doesn't have any side effects so yes it is significantly faster it's actually more than twice as fast and three to four times more efficient but you should know that it actually changes the data set and that's why i'm setting the items every single time because if i had it over here it would only change them once and then you'd be operating on an already sorted array now what i want to do is change all that to a list just to show you how that would also work because that might be the thing you might be more familiar with and might be using because it's really it has more functionality at least an array arrays fixed in size it is a bit more tricky to work with i haven't seen many people use arrays that much but lists are all over the place so i'm going to change all that to lists however we can no longer use the array.sort on the list because it's not an arrayed list but what we can do is use items dot sort the method in the list and by the way if you wanted this to be descending you'd have to write the comparison so that with the i comparer in this example we only work on ascending that's why i'm not showing that but it is possible you can actually change the sorting in any way you want oh and if you're wondering that because you have a delegate it will actually be slower well you can cache the delegate and use it here or you can use an i compare implementation you don't have to worry about that so it will be equally as fast so now everything is list i'm going to run the same tests with a list to see both how order buy and order perform with a list and how sort list compares to the array list not the array list the array compared to the list all right so results are back and let's see what we have for that so as you can see just a little bit slower than the array just just a few nanoseconds but equally as memory efficient if you actually add the delta between an array in the list which is a few bytes i think 32 bytes in this case um so equally as efficient and the changes do stay here in scale now the last thing i want to show is that well i've been showing numbers for this test and you might be working with text so you might be like okay how does this work if i have strings i want to order which is the other most common thing you might be ordering well let's take a look at that i'm going to make a benchmarks with text over here i'm just going to slightly change that so what i'm going to do is just to string the numbers so i'm still going to use the same numbers but as strings i'm going to keep this as a list for now but a list of strings over here and over here and over here and that's it and i want to go back to the program.cs and change that with the benchmarks with test i'm going to go ahead and run those benchmarks and see how it differs between running for numbers and running for text which obviously will be less efficient all right so results are back and as you can see now the sort method isn't really the fastest it is from one to two microseconds slower however it is still the more memory efficient and the grand scheme of things these two kilobytes will actually save you more time in gc than this so pick your battles just know it is not everywhere that sword will always be more efficient in speed it will usually be more efficient in memory though now the last thing i want to show is that in the numbers world over here if i revert everything to an array and i should have shown this really earlier but if everything is an array then what you can also do is you can work with a span and sort the span so if you happen to be working with a span and not necessarily turn something into a span just to sort it then you can say span of type int in this case and span has a sort method so items dot sort and then you can return the items but of course you have to array them which will be inefficient in memory because you allocate the array twice but know that span does have that sort method now i want to close this video by showing you what this array dot sort is actually doing and why it is so efficient and that's basically the same for the span.sort and the list.sort that basically operate on the same logic so we're going to go here and find this array sort helper and i'm going to find the array implementation and as you can see this is using an introspective sword which is why it is fast both in small data sets and also in large data sets and also memory efficient it is because introspective sort is a hybrid algorithm that actually combines multiple algorithms into one this is a very hard word to pronounce for me so as you can see over here we go with an intro sort over here and then as you can see this starts as a quick sort and then their checks on the partition size so if the partition size of the data we're operating on is less or equal to the interest size threshold which in dotted core and 5 and 6 and 7 is actually 16 then it checks for partition sizes and it switches to an in session sort otherwise if a specific depth limit is reached then we go for a hip sort and in the end you can see recursively it goes back to the intro sort which is the quick sort and the heap sort and the incessant sword over and over again and this is how it is both fast and memory efficient in both small and big data sets again i think it's great that they added these simplified methods because it does remove some memory allocation however for that example of ordering on itself these short methods will be faster and more memory efficient basically always as long as you can deal with the fact that they will mutate your collection which you might not always want but you should know that they are there well that's all i have for you for this video thank you very much for watching special thanks to my patreons for making videos possible if you want to support me it's really going to find the link description down below leave a like if you like this video subscribe for more content like this during the bell as well and i'll see you in the next video keep coding
Info
Channel: Nick Chapsas
Views: 95,714
Rating: undefined out of 5
Keywords: Elfocrash, elfo, coding, .netcore, dot net, core, C#, how to code, tutorial, development, software engineering, microsoft, microsoft mvp, .net core, nick chapsas, chapsas, dotnet, .net, linq, linq performance, linq slow, linq fast, improve linq, linq c#, linq .net, linq .net 7, The 2 NEW LINQ methods you probably shouldn't use in .NET 7, The 2 NEW LINQ methods you probably shouldn't use, new linq methods, linq order, linq order by, linq order by descending
Id: K1Ye_QEpAq8
Channel Id: undefined
Length: 14min 56sec (896 seconds)
Published: Thu Sep 08 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.