Writing C# without allocating ANY memory

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hello everybody i'm nikki in this video i'm going to show you how you can write a location-free c-sharp that is very fast or as allocation-free as possible using the latest dot-net and c-sharp features now understand that the purpose of this video and why you should watch it is because you should know how to optimize for memory or speed if you need to and this will only focus on one use case but the tools we're going to use can be applied in many other use cases you just have to try and think how you can adapt them so keep in mind that even though many of you will think hey this is pretty much your optimization or i shouldn't actually be doing this right now well the truth is you should know how to optimize when the time comes to optimize and for me for example this is my reality i have to write code that is so efficient just because of the throughput i have to deal with so it really depends on your use case but you should know this type of stuff if you like a lot of content and you want to see more make sure you subscribe ring the notification bell and for more training check out nickchapstass.com so let me show you what i have here and i have quite a few stuff so i'm going to start with the program.cs here so here's our use case i have a good and i want to convert that grid to a url friendly id and then from that url friendly id i want to put it back to a good to use in my application this is very common many services do that type of thing from a url friendly type of id as you can see right now on youtube or this one over here this is a url friendly base64 encoded id and even though this is not the only approach this is one of the most popular ones and the reason why we do that is because it's very unlikely you're gonna have a long grid in the url it just doesn't look friendly so people tend to shorten them and also because it's less bytes to have that shortened some people do that for efficiency as well now let me run what you see right now on your screen to see what we're dealing with here because i need to put everything into context so first we have a random good right so this is where we're starting then if you were to just convert that grid to a base64 string you end up with 24 byte string that looks like this now the reason why this is a 24 byte string is because we know that these are ascii characters and it's 24 because that's how much the 16 bytes of the grid are converted into a base64 string now here's the thing these two last equals in the end those are padding you do not need them so we can basically just remove them from the end and because we know that the size of a good is fixed we can just reappend them when we do the conversion back into the code so we can optimize as you can see here that they're missing that part of the base64 string the other thing we need to do is that the plus symbol and also the forward slash symbol that are part of the base64 string are not playing nicely with urls for that reason plus is being replaced by underscore and slash is being replaced by hyphen so with these three changes we're making a short url friendly base64 string and then from that string we write something that converts it back to a good how does that look well let me show you i've made this class called guider weird name and then i have the two string from good and the two good from string the implementation is something very basic that at the first glance might look absolutely fine we're getting the byte array from the grid because we need to pass it to the two base64 string method of the convert class then we replace like i said slash to hyphen and plus to underscore and then replace the last two equals with an empty string and then in the end we do the opposite we replace the hyphen and the underscore with the slash and the plus and then we append the equals in the end and then we bring this all together using the guide from a byte array here so that's what we're dealing with in case you didn't know a good is 16 bytes and those 16 bytes come from four bytes from an end two and two from two shorts so four bytes they're eight bytes and then eight bytes individually and the color coding here represents what each item here maps to here it's just sets of two it's hexadecimal you don't really need to know that but i think it's good to know now what we want to see first is benchmark this and see how efficient it is both in memory and speed to put it into context so first what i'm going to do is i'm going to go to the program.cs and i'm going to comment everything here out and i've already added a benchmark so i have this goiter benchmarks which has a constant and a static read-only good and it just calls those methods to give me results both on memory and speed on how they perform so what i'm gonna do is we're gonna say benchmark runner dot run and i'm gonna point to goiter benchmarks that's it i'm gonna change that to release mode and run it and i'm gonna give it a minute to run and let's see what we get back so results are back and what we have is the two grid from string taking 80 nanoseconds and allocating 184 bytes of memory and then the two string from good taking 95 nanoseconds and 256 bytes of memory now this might look insignificant but actually small things like this if overused can amount to a big amount of memory really they can so our goal here is to completely remove any allocation on the first method the two guide from string and significantly reduce the second one because we cannot completely eliminate it because we eventually return a string which is a reference type allocated on the heap what can you do so let's go ahead and optimize those so first things first let's delete that and uncomment this because we're going to need it right so here we go i'm going to start with the second method the one that returns a good because here we can actually eliminate all the memory allocations on the heap good is actually a value type because it is extract as you can see here and even though we accept a string here which is a reference type allocated on the heap we can change that to accept a read-only span of character and let's go ahead and create the new signature so we want a public static grid here to do it from string op for optimized and this can now be a read only span of character or char or car or whatever you call it and call that id here and then this can return a new gui and we're going to implement that in the end and the reason why this is acceptable and we don't break the api is because if i go to program.css and i say for example here go it again o p and add the op here this works because a string can be implicitly converted to a read-only span and that's because strings are immutable and the span can use the arbitrary memory of the string to read the characters and do stuff with it right so back in the method the first thing that i know i'm going to need is a span of character and this will replace what normally we would use which is an array which would be allocated on the hip here span is not going to be allocated on the hip it's a stack object and for that to happen i need to say base 64 characters and i can say stack a lock to allocate on the stack 24 characters and this will give me an area in the stack represented by this object which i can play with without having to worry about heap allocations i should also say here that this video will not be using unsafe code at all no fixed no one safe no none of that we could but maybe that type of optimization is for a separate video because there are concerns around that so what i know is that i'm dealing with 22 characters in that incoming id and the last 2 the 23rd and 24th which i can represent here with 22 and and 23 i know that those will be the equal symbol or it would be that so that's how i can add it but i can also extract that into a separate constant and say private cons star equals car equals equals a lot of equals and i can use it then like this so what's left now for me is to assign from the incoming id to the base64 characters so the only thing i need to do is replace the characters that are incoming so the hyphen to the dash and the underscore to the equals so to do that all i want to say is base 64 characters and i here equals id i and i have a switch here and first i have the hyphen so i could say something like this and we will but i'm gonna again extract it into a field so that should turn into a forward slash and then we have the underscore that is turned into a plus and anything that we haven't mapped is id of i here we go and this will now map everything we need here now of course i don't want to have this like this so i'm going to extract them in constant so private const let me quickly do that private const character so that's what it looks like and now we need a few more things first i know that a good is 16 bytes in its good form so i need a span of bytes which again i don't want to have an array here i want the span of bytes because it's a stack allocation and i can say id bytes equals stack unlock that's why it's a stack allocation and i'm going to say byte array of 16. it's not really an array it's 16 bytes in memory and then eventually i'm gonna use the grid constructor that accepts a read-only span as you can see here which doesn't allocate any memory and what i need to do to finalize this is write the base64 characters of this span of characters into this span of bytes from 24 characters to 16 bytes to do that i can use the convert class and say try from base64 characters and first i passed down the characters so base 64 characters and this goes into the id bytes and then i do not need the last parameter which is the by it's written so i'm just discarding it and now this is our method if i go here as you can see this is our method this is what we added and what i can do is i can write it in the end if i go ahead and i run this watch what happens here you go the grid is constructed let's go ahead and debug this to see exactly what's going on in slow motion so i'm stepping into the method over here and we have this implicit conversion of the read-only span which is represented as you can see here and then we step over this base64 span which allocates 24 characters and then we iterate over each item we're mapping the ones that needed to change in this case we don't really have any that need change so it's just the same thing we append the two remaining characters so that's what the final thing looks like we allocate the 16 bytes here and as you can see before the conversion step everything is just by zero in that span but if i go over that and i write into it then every bite in that span is written and i can step over this and this returns the final good so how does that look in terms of performance well let me go ahead and duplicate this benchmark and add the op at the end and run it and see how it performs so i'm going to say first return here because i want to return early because i want to say benchmark runner dot run grid benchmarks here we go change this to release mode and let's go so results are back and as you can see zero memory allocations this will make a visible difference in your application if you're using this conversion a lot and also it is more than half the speed so as you can see this has an impact not so much on the execution of the operation i mean it is half the time but it's not like we're dealing with seconds here but this memory this will make a difference now let's go ahead and optimize the remaining stuff and that is the two string from good and there like i said before we can't really eliminate all the allocations because ultimately we return that string so if we are constrained in safe code which we are we cannot use and save here let's see how far we can get so i'm gonna go ahead and move this this fields up here here we go and try to create the efficient version of this code so duplicate that say op keeping the signature the same and i'm just going to return empty here and let's see what we can do so a lot of the things will look the same so we are fundamentally trying to do the opposite here but the main reason why this approach has a problem well there's many problems here first we're allocating four strings here but also we have this two byte array and we're using that because we need to access the byte array behind the scenes in the good and there is no as byte array or something so when you have two two will actually allocate the thing that it says it will so if it's a two tara array or two list or two something two means i'm going to allocate this thing again into that new type if it says as like as list as innumerable as quebel those do not allocate the thing again they just cast it behind the scenes so we need to find a way to access the byte array without allocating the byte array again how do we do that well first i note i'm going to need 16 bytes so i'm going to say id bytes and that's going to be a stack a lock of 16 bytes again no hip memory here and that's going to be for the bytes of this id then i know that i'm going to need a span of 24 characters again on the stack and the reason why i need that is because i'm gonna have to convert the id bytes to the base64 characters here now the first thing for this to work like i said is for me to find a way to access the bytes of this grid without allocating a byte array again because that's a heap allocation to do that i can use the memory marshall class and again this is all safe code nothing unsafe here and say try right and i have a destination and a value my destination that i'm going to write into is this span over here and then i have the id and it needs to be a ref id because that method needs a reference to the type that you want to pass into that span they can't all be winners but at least this means we can write into that span and just to prove that this works what i'm going to do is i'm going to do the same thing as here url friendly base 64 id um op and change that to op and then call that o p here i'm going to stick a break point and debug it and see what we get so step into that and as you can see this starts empty nothing in that span and then i step over the mushroom as you can see actually the grid is is full and then i'm stepping over the try write method and now that id by span has the bytes of that good written into it without having to allocate 16 bytes of the grid again so that's great stuff here now i also need to write those 16 bytes in the span into the 24 characters of this base64 span and to do that all i need to do is say base 64 dot and code to utf-8 and say id bytes here and then base 64 characters here and actually not i'm looking at this this should not really be characters it's not exactly the opposite of this this at this point should be bytes so base 64 bytes here is what we should be dealing with and this is actually what this method even accepts and then i can discard the last two items here so this and this can go bye-bye and now if i do that i'm gonna write those 16 bytes into base64 24 bytes here from that point on it's just about doing the opposite replacement so first we need our span of characters and that's going to be the final characters equals stack unlock of hr and then i need to do the same thing so 4 22 items here i say final characters get the i equals base 64 bytes get the item switch i love the switches and then we have the slash but in this case it needs to be a byte so i'm going to also create a slash byte here and change that to a byte and to do that i need to cast it to a byte so slash byte and the byte is converted into a hyphen here we go and then i have the plus and that also needs to be a byte surprise surprise so i'm gonna change that to a byte change the name to a byte and cast this to a byte here we go and i'm gonna say plus byte goes into underscore and then the remaining here we go we can cast that to a character and it is whatever incoming base64 byte we are getting here we go and now i can say return new string and pass down the final characters as a span and there's an overload for that so this will allocate no memory again i mean of course it's gonna allocate the string but not the parameter going into the method and now with that if i go ahead and debug this code let's see what we get so i'm gonna step into that and as you can see we allocate the two things then i can step over the writing and the base64 bytes are all written here and i'm going to step over all the iterations don't want to go through every single one of them so i'm just going to take a break point and do a five and in the end all the characters are here and in that span and we can return them back in a very memory efficient and fast way and if i let display as you can see in the console over here both conversions are correct both from and to a good how does that perform let me go back here and duplicate this benchmark so i'm gonna say this underscore op and then op here and let's go ahead and run those benchmarks again and see what we get back so results are back and let's see what we have here so as you can see the optimized way is how many times is that 256 divided by three and a half times less memory allocated and it's three times faster again the visibility of the speed of the execution depends on how many times you're using it but the memory will have a very visible impact in your garbage collection now does this mean that you have to go ahead and change everything in your code to be super optimized and use spans or whatever well not really you have to be very careful with how we're using those tools for example the stack has the maximum amount of memory and if you exceed it you're going to get the stack overflow now it's highly unlikely that you're going to hit that one megabyte limit i think it's one megabyte but you do have to be careful now let me know if you want to see more things like this and actually let me know if you can make this even faster and more memory efficient without the use of unsafe code because i can make it faster with unsafe code too it's no biggie i just want to know how fast you can make it with safe code well that's all i have for you for this video thank you very much for watching special thanks to my patreons making videos possible if you want to support most well you're going to find it in the description down below leave a like if you like this video subscribe more than likely sharing the bell as well and i'll see you in the next video keep coding
Info
Channel: Nick Chapsas
Views: 137,993
Rating: undefined out of 5
Keywords: Elfocrash, elfo, coding, .netcore, dot net, core, C#, how to code, tutorial, development, software engineering, microsoft, microsoft mvp, .net core, nick chapsas, chapsas, dotnet, .net, writing fast c#, fast c#, efficient c#, zero allocation c#, stackalloc, malloc, span, span of t, readonly span, c# span, .net span, .net performance, allocation free c#, allocation free .net
Id: B2yOjLyEZk0
Channel Id: undefined
Length: 19min 36sec (1176 seconds)
Published: Thu Mar 31 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.