How IEnumerable can kill your performance in C#

Captions
Hello everybody, I'm Nick, and in this video I'm going to show you how enumerables and the IEnumerable interface in C# and .NET can actually tank your application's performance if you don't fully understand what's going on behind the scenes. I made a short on that topic, and many people in the comments weren't aware of this issue; they were like, "Oh, that's why my application was behaving strangely," or why it was so slow. So in this video I'm going to explain all of that in depth and show you how you can actually fix it. If you like this type of content and you want to see more, make sure you subscribe and ring the notification bell, and for more training check out nickchapsas.com.

Speaking of that, I just launched my back-to-school sale on my website, nickchapsas.com, where I have all of my courses, and I'm offering the first 100 of you 20% off anything on the website, including the courses and the already discounted bundles. So if you were waiting for an offer to jump in, get some of those courses and expand your knowledge on any of those topics, use the link down below and use code SCHOOL2022 at checkout for 20% off anything. Now, back to the video.

All right, so what I have here is a simple console application with a very small hidden region, and I'm going to show you later what's in there. Then I have this customers.csv, which has a bunch of people with different ages and names. What I want to do is write a method that parses that file, and then do something with that data; in my case I just want to print it. So I'm first going to create a record Customer, and I'm going to say string FullName, and then I'm going to have int Age.

Now, to process that, I'm going to build a method that returns IEnumerable<Customer>, and I'm going to call it GetCustomers, and these customers will be coming from that file. To read the text in that CSV I'm just going to say var lines = File.ReadAllLines. I could use the async version, but in this case I don't really care; it
doesn't really matter, and I'm going to say customers.csv, here we go. There are actually many ways to extract that information from the CSV; we're basically just going to split on the comma and then map it to the object. I'm going to start with this approach over here, where I say foreach line in lines, do something. What I can do is say yield return new Customer, and then I'm actually going to split the line on the comma, so I'm going to say line.Split, here we go. Now I know the first thing is the full name and the last thing is the age, which I'm going to parse as an integer, so splitLine[1], here we go, and that's it.

Now what I can do is go up here and say foreach var customer in GetCustomers, and I can simply say Console.WriteLine(customer), here we go. If I go ahead and run this code, what I'm seeing is the three customers being printed, and that's great.

Now let's say I want to do more than one thing with those customers. I might just create a variable and say customers = GetCustomers, and then one of the things is to iterate over them, same as before. I can also say up here var count = customers.Count(), and I can say Console.WriteLine, "There are", and then count, "customers". If I run that now, as you can see, the thing works: there are three customers, and then I'm getting the three customers printed over here.

However, this code has a fatal flaw that isn't that obvious, and the flaw is that this method will actually run twice. I'm going to go ahead and add a few breakpoints here, one up here where I'm reading the file and one where I'm iterating and yield returning, and I'm going to walk you through the execution of this program. First we get the customers, and that doesn't really do anything, because nothing has asked for the result of that enumerable yet; it has the instructions on how to build it, but it doesn't actually do anything at this point. Then, when I step over the Count, because at this point we need to get something
about the enumerable, in this case the count, it is going to enumerate it: it reads the lines from the file (we have them here in memory, as you can see) and then iterates and yield returns. Once we're done with that, it's going to give us three, and the count gets printed on the console. Now I'm going to iterate over them, and when I'm iterating, I'm reading the file again, because it needs to recompute that.

Now, this is called multiple enumeration of the enumerable, and it's actually a problem, because clearly we're doing the compute twice. Notice that this time we're yield returning as the loop is processing the data; that's because the loop only deals with one item at a time, as opposed to the Count, which needed all of it to actually give us a count in the end, and that comes down to the implementation. But as you can see, we are now basically running a flawed program, because we're wasting resources reading this thing twice. It can be the same with database calls, it can be the same with a call to a streamed response from somewhere; it can be a problem with I/O in general, or with heavy compute.

And it's not limited to yield return. For example, if I don't yield return, if I just comment that out and say return lines.Select, and use this sort of mapping mechanism over here, where I copy that in and return a new Customer, and x here is my line, this is now using LINQ to effectively do the same thing. But if I stick a breakpoint here, as you can see, the same thing happens: I'm doing this twice, once over here and then once again over here. And as you can see, it actually works the same way as the yield return, it processes one item at a time, which is actually quite interesting.

Now, this is clearly a problem, but in some cases it is also a feature, because you might not want to enumerate the enumerable just yet: the person consuming the code might want to do something else with it, and depending on what that is, it will probably be more
efficient to have an enumerable and then chain, let's say, a Where clause or a SelectMany or something onto it, and then enumerate, rather than having a list or an array, which is already allocated and computed, and then doing the Where on it, which of course operates on a smaller subset but still does some work to reallocate that final result with ToList or ToArray or whatever you need to actually work on it. So this isn't so much a flaw in how enumerables are implemented, because there can be legitimate reasons why you don't want to enumerate early, but what you should know is how they work: in this case it's going to be computed multiple times. And if there's no reason for it to be computed multiple times, or chained, then there are different things you can do on your end to mitigate that.

For example, since IEnumerable is just the interface that every other type of collection effectively implements, what you could do, if you wanted to prevent multiple enumeration here, is create a list, so now this is an allocated list every single time. Instead of using LINQ or yield returning (I'm just going to comment that out), you could simply say list.Add and then return list. If you do that, what's going to happen, let's run the code again, is that even though you still return an IEnumerable, the thing is computed on demand, once. It didn't skip over the code this time; as you can see here, it does return the list at that point. The moment you step over the call, it builds it and allocates it once, and then no matter how many times you enumerate it later, you still work on that original list that has already been allocated. And there is code in Count, GetEnumerator and all those methods to detect that and be as efficient as they can be.

Now, what I want to show you is the warning you would get in something like Rider or ReSharper, which would actually tell you that, hey, there might be a problem like this. And here it is: I
just removed the region that was hiding the suppression of this warning. So here we have this warning saying "Possible multiple enumeration", and this is all you're getting. The name can be a bit vague, so maybe you don't understand the full impact of the thing. Now, it does say "possible", and it is a warning, because even though this warns me of a potential multiple enumeration, I know that, because this thing is using a list behind the scenes even though it's returning an IEnumerable, there won't be one. So know that the warning might be there without there being multiple enumeration in every one of those cases. For example, if you have a call to a database using Dapper, Dapper buffers those results in memory using a list, so it won't call the database multiple times. But behavior like this is entirely the SDK's responsibility to implement correctly, and relying on it could be dangerous.

So, if you don't trust the interface and what is happening behind the scenes (I'm just going to quickly revert this to the original approach, which is just yield returning, and which, for the record, is in my opinion a better way to go about this, because there are advantages to this being lazily loaded and only done on demand), then in my specific scenario I would also call ToList or ToArray to force an enumeration. Then the thing knows not to enumerate multiple times, because I have a concrete implementation of a list at that point. And of course, if I go and add the same breakpoints here, now that I have the ToList call to enumerate it, it will actually enumerate on the spot: one, two, three, everything has been computed, and then it doesn't happen for anything else, not for the foreach loop and not for the Count, which at this point can also be turned into the Count property, because we do have the size in memory, because we have a list now.

So, to recap: you need to be very careful when you're consuming an IEnumerable. It might be your own code or it might be someone else's code, but know that multiple
enumeration can cause the thing to be processed multiple times, unless it has already been enumerated inside the thing returning the enumerable, at which point I probably wouldn't return an IEnumerable in the first place, I would return the concrete type; but maybe that's a discussion for a different video.

But now I'm going to pass the question to you: did you know about this problem? Are you taking it into account when you work with enumerables? Leave a comment down below and let me know. Well, that's all I had for you for this video. Thank you very much for watching. Special thanks to my Patreons for making this possible; if you want to support me as well, you're going to find the link in the description down below. Leave a like if you liked this video, subscribe for more content like this, ring the bell as well, and I'll see you in the next video. Keep coding.
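
The multiple-enumeration behavior described in the captions can be sketched roughly as follows. This is a minimal sketch and not the code from the video: it assumes .NET 6 or later (top-level statements and records), it uses a hypothetical in-memory array in place of customers.csv so that it runs on its own, and the sourceReads counter is a hypothetical addition that only exists to show how many times the source gets processed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the lines of customers.csv (name,age).
string[] lines = { "Customer One,30", "Customer Two,41", "Customer Three,25" };

// Hypothetical counter: how many times the iterator body has started running.
int sourceReads = 0;

// Lazy iterator: the body does not run when GetCustomers() is called,
// only when something enumerates the returned sequence.
IEnumerable<Customer> GetCustomers()
{
    sourceReads++; // in the video, this is where the file would be read
    foreach (var line in lines)
    {
        var splitLine = line.Split(',');
        yield return new Customer(splitLine[0], int.Parse(splitLine[1]));
    }
}

// 1) Lazy enumerable consumed twice.
var customers = GetCustomers();
Console.WriteLine($"After the call: {sourceReads} read(s)"); // 0, nothing has run yet

var count = customers.Count();        // first enumeration
Console.WriteLine($"There are {count} customers");
foreach (var customer in customers)   // second enumeration, the body runs again
{
    Console.WriteLine(customer);
}
Console.WriteLine($"Lazy: {sourceReads} read(s)"); // 2

// 2) Forcing a single enumeration with ToList().
sourceReads = 0;
var customerList = GetCustomers().ToList(); // enumerated once, on the spot
Console.WriteLine($"There are {customerList.Count} customers"); // Count property, no enumeration
foreach (var customer in customerList)      // iterates the already-built list
{
    Console.WriteLine(customer);
}
Console.WriteLine($"Materialized: {sourceReads} read(s)"); // 1

// Record matching the one described in the captions.
record Customer(string FullName, int Age);
```

Whether to materialize with ToList is a judgment call: it costs one allocation up front, but it means Count and every later loop work on the same in-memory list instead of re-running the source.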
Info
Channel: Nick Chapsas
Views: 82,807
Keywords: Elfocrash, elfo, coding, .netcore, dot net, core, C#, how to code, tutorial, development, software engineering, microsoft, microsoft mvp, .net core, nick chapsas, chapsas, dotnet, .net, How IEnumerable can kill your performance in C#, performance, IEnumerable, IEnumerable c#, IEnumerable performance, multiple enumeration
Id: cLsmW7a8MkU
Length: 11min 2sec (662 seconds)
Published: Thu Sep 01 2022