Go 1.20 Memory Arenas Are AMAZING | Prime Reacts

Video Statistics and Information

Captions
"Go 1.20 experiment: memory arenas versus traditional memory management." Is "traditional" like managed memory? Because that would be more traditional, right? Isn't garbage collection modern memory management? I don't even know, now you've got me all confused, Dimitri, but let's find out what you have to say.

"Go arenas are an experimental feature. The API and implementation is completely unsupported, and the Go team makes no guarantees about compatibility or whether it will even continue to exist in any future release." Awesome, okay. This seems exciting. This is exciting, this is some cutting-edge Go, people. All right.

"Go 1.20 introduces an experimental concept of arenas for memory management, which can be used to improve the performance of your Go programs. In this blog post we'll look at: what arenas are, how they work, how you can determine if your program could benefit from arenas, and how we used arenas to optimize our services."

So this is super exciting, because again, if you can make Go slightly faster... Go is already a super simple language: get it right, get it shipped, and move on. So if you can make it even slightly faster, that'd be crazy. Because memory is a huge concern in any system; memory is going to be one of your biggest causes of things being slow.

Let's see, what are memory arenas? "Go is a programming language that utilizes garbage collection, meaning that the runtime automatically manages memory allocation and deallocation for the programmer. This eliminates the need for manual memory management, but it comes with a cost: the Go runtime must keep track of every object that is allocated, leading to increased performance overhead." Yep, classic. And then it also has to find out which ones can be cleaned up. "In certain scenarios, such as when an HTTP server processes requests with large protobuf blobs which contain many small objects, this can result in the Go runtime spending a significant amount of time tracking each of those individual allocations and then deallocating them. As a result, this also causes significant performance overhead."

One thing I don't know about Go that's true in JavaScript is that everything is its own object; therefore a map with maps in it, or an object with maps in it, is actually two separately tracked items. I'm not sure if that's true in Go or not; I don't know.

So anyways: "Arenas offer a solution to this problem by reducing the overhead associated with many smaller allocations. In this protobuf blob example, a large chunk of memory (an arena) can be allocated before parsing, enabling all parsed objects to then be placed within the arena and tracked as a collective unit. Once parsing is completed, the entire arena can be freed at once."

Okay, so this is effectively, in some sense, saying that this object can only be referenced by a certain set of items, and they're all in one single group, so it's very, very simple. Okay, this is actually pretty cool; this is kind of a cool concept, I like it. So, garbage collector: yeah, they have all these individual objects, versus just having them all in one. Okay, I like this. So this is what they mean by an arena.

"Identifying code that could benefit from arenas: any code that allocates a lot of small objects could potentially benefit from arenas, but how do you know if your code allocates too many? In our experience, the best way to find out is to profile your program." Yep. Nice, Pyroscope. Pyroscope is one of these cool tools; they give you a little allocation graph. I believe they call these icicle graphs, because they hang from the top. It's just a flame graph; I call it a flame graph. Just invert it and boom, you've got yourself a flame graph. I'm not really sure why we decided to flip flame graphs upside down, but you know, we did, we went there, and now look at us: we've got icicles.
Okay, I don't understand, what the hell's happening here? "The purple nodes in this allocated-objects flame graph represent where arenas may be most effective." Oh, interesting. I wonder how that works, or why they're colored purple, what makes them that way. Oh, samples: there's a lot of samples, objects in RAM, there's a lot of objects in RAM. Okay, okay.

"You can see the majority of allocations come from one area of code." Oh, okay, so it's this one right here? Or are you talking about this one? I'm not sure which one they're talking about, but it's somewhere in there. Usually how I read these is: this one is the one that allocated all of this, this one allocated from here to here, and this one allocated from here to here, right? That's how I'd read it. "Given that it represents 65% of allocations, this is a good candidate for using arenas. But is there enough of a performance benefit to be gained by cutting down these allocations? Let's take a look at the CPU profiler."

Okay, so it does look like you're getting the same kind of area right here. Exciting. "Purple nodes in this CPU flame graph represent potential for performance improvements." All right, let's go. "A few things stand out: the program spends a lot of CPU time in the same insert-stack function." Okay, so there could be some gains. Is it the memory that's causing it? "If you search for runtime.mallocgc (multiple pink nodes at the bottom), you'll see that the function is called frequently in various different places and takes about 14% of our total execution time."

So this is typically how I do this for Node: I'll look for major and minor GC, and I'll look at how much of the program's time is being spent in a major or minor GC, and that's really your garbage-collection win. Now, where this makes a huge win is inside of requests, inside of a server, because once your server can reduce that, it actually makes a disproportionately huge effect on how much you actually get done.
Why is that? Because it's like a multiplier, right? In Node, when a single request hits a garbage collection, all the other requests that are waiting get hit with the same garbage collection. So a 200-millisecond stop isn't just 200 milliseconds for a single request; it's 200 milliseconds for the ten requests in there. The amount of speed you gain by reducing garbage collection goes up by significant amounts inside a Node application. That's why garbage collection is a really good thing to think about, you know what I mean? I don't know if it's the same in Go, I don't know exactly how Go works, but if it also has stop-the-world garbage collection, then you could argue the exact same thing: every single one of those requests will all have to freeze, and therefore your 14% isn't just 14%, it could be 140%. You don't know how much it will actually improve your response time and all that.

"About five percent of the CPU time is spent in runtime.gcBgMarkWorker." Okay, awesome. "So in theory, if we optimized away all of our allocations in this program, we could cut about 14% + 5% = 19% of CPU time. This would translate into cost savings and latency improvements for all of our customers. In practice, it's unlikely that we could truly get those numbers down to zero, but this is still a significant chunk of work." Yep, okay.

"Optimizations we made: if you're interested in following along, there's a public pull request in the Pyroscope repository that you can use as a reference. To begin, we created a wrapper component that is responsible for dealing with allocations of slices or structs. If arenas are enabled, this component allocates slices using an arena; otherwise, it uses the standard make function. We do this by using build tags." Okay, I don't know much about build tags in Go. "This allows for easy switching between arena allocations and standard allocations at build time." Okay, perfect: so there's no runtime overhead, is what they're saying; it just chooses one or the other.
Perfect. "Then we added initialization and cleanup calls for our arenas around the parser code. After that, we replaced regular make calls with make calls from our wrapper component. Finally, we built Pyroscope with arenas enabled and gradually deployed it to our production environment."

Okay, let's see: a flame graph with the representative CPU time per function. Is this supposed to be the same as that earlier one? I mean, it looks the same, right? So 13 percent, 13... okay. "The flame graph above represents a profile after we've implemented the changes. You can see that many of the runtime.mallocgc calls are now gone." I couldn't see them before; I guess I'd have to run through this. Oh, is this one over here? Yeah, okay, so it was this one, four percent. Oh, is it all pink? Okay, so all pink is allocation stuff. So now pink is all in one nice little area, and I can't really tell if it's different; I don't know how to do searches, I don't know how to use this thing well. Anyways: "You can see that many of the runtime.mallocgc calls are gone, but are now replaced with arena-specific equivalents. You can also see that the garbage collection overhead is cut in half. It's hard to see the exact amount of savings from solely looking at the flame graphs, but when looking at our Grafana dashboard, which combines our flame graphs with CPU utilization from AWS metrics, we saw an approximate eight percent reduction in CPU usage. This translates into an eight percent cost savings on our cloud bill for these particular services."

Well, that's only if you can technically scale it correctly, right? I think you have to be at a pretty good volume. But look at that, that's cool, right? That means you can handle a lot more requests. What I would really like to see is the latency, or, not the latency,
the round-trip times. What happens to the 50th percentile? And the 75th percentile? Because the 99th will probably remain near the same, right: the 99th and the 99.9th, those probably all remain the same. But the 50th percentile or the 75th percentile, how much do you shrink those back? You could actually see a very significant percentage shrink back.

"This may not seem like a lot, but it's important to note that this is a service that has already been optimized quite a bit. For example, the protobuf parser that we use doesn't allocate any extra memory at all, and the garbage collection overhead (five percent) is also at the lower end of the spectrum for our services. We think that there's a lot more room for improvement in other parts of the codebase, and so we're excited to continue to experiment with arenas."

This is actually really cool, this is super cool. Now I want to play with it. Damn it, I'm supposed to be studying HTMX and OCaml, and now I want to go play with Go all of a sudden.

"Trade-offs: while arenas can provide performance benefits, it's important to consider the trade-offs before using them. The main drawback of using arenas is that you now have to manage memory manually, and if you're not careful, this can lead to serious problems." Absolutely. "Failing to properly free memory can lead to memory leaks." I know, but this is a problem with all maps, right? I mean, you still have this exact same problem with any long-living map. "Attempting to access an object from a previously freed arena may cause a program crash." Absolutely, classic.

"Here are our recommendations: only use arenas in critical code paths; do not use them everywhere." Good call. "Profile your code before and after using arenas to make sure you're adding arenas in the areas where they can provide the most benefit." Yep, definitely profile your code first and find where the memory is being churned the most. Even Node has this: you can add object
pools to Node, and you can see huge performance benefits by looking at where you allocate the most memory. And what I've found, especially with callback objects, is that it's really good to use bound functions and all that; it's very, very good. The HOG stack, I know: very excited for the HOG stack, the HOG, or the GOH stack.

"Pay close attention to the life cycle of the objects created in the arena; make sure you don't leak them to other components." Yep, so you definitely have to keep arenas as internal implementation details. "Use defer with a.Free() to make sure memory is released." Yes, beautiful. "Use Clone to clone objects back to the heap." Okay, beautiful. "The other major drawback at the moment is that Go arenas are an experimental feature; the API and implementation are completely unsupported, and the Go team makes no guarantees about backwards compatibility or whether it will even continue to exist in future releases."

I think it's really exciting though. I really do hope they pursue this, because I really love the idea of having escape hatches to manually manage your own memory, because it is such a huge benefit when it is a benefit. For the most part, a lot of the things you do are super ephemeral and you don't care, but there are those few times where it's just: if I could manage memory right here, I could eliminate half my program's running time, just for this one thing. I have this exact situation right now, and I wish I could.

So what I do in JavaScript, I kid you not, what I do in JavaScript at my job right now, is this stupid stuff where I'll write something like const items = await getSomething(); I get some sort of array back, right, this returns an array of, you know, something. And then I have to go through it: I do some sort of while (items.length) loop, something that looks like that, and do items.pop(), because then I'm reducing it one at a time, and
then at the end I go items = null, because I have to, which means that I have to use a let there, and it's all complicated in that one spot, because I actually use too much memory. By doing this, and importing GC and forcing GC, I can keep memory down. This is crazy, what I have to do: my program goes to eight gigabytes sometimes, or I can keep it at 200 megabytes by enforcing manual memory management in Node, which is totally the worst thing ever, and I hate it, and I have to write stupid code like this, but I do, because that's what I have to do, and I don't want to do it. I wish I had better options; I wish I had more things. Anyways.

"The Go team has received a lot of feedback about arenas, and we'd like to address some of the concerns that we've seen from the community. The most frequently mentioned issue with arenas is that they make the language more complicated by adding an implicit and not immediately obvious way for programs to crash." Absolutely. One positive thing about Go is that Go does not want complexity, and there's something about that that is very beautiful in and of itself; even if you don't like it, it's still beautiful. "Most of the criticism is well-founded but misdirected. We are not anticipating arenas becoming widespread: we view arenas as a powerful tool, but one that is only suitable for specific situations. In our view, arenas should be included in the standard library, but their usage should be discouraged, much like the usage of unsafe, reflect, or cgo. Our experience with arenas has been very positive, and we were able to show that arenas can significantly reduce the amount of time spent in garbage collection and memory allocations. The experiment described in this article focused on a single, already highly optimized service, and we were still able to squeeze out eight percent extra performance by using arenas. We think that many users can benefit a lot more from arenas." Absolutely: if you don't have an optimized service, if you're just creating wild
objects, you could really get some good stuff. "In addition to that, we also found that arenas are easier to implement compared to other optimizations that we have tried in the past, such as using buffer pools" (pools are very hard to use; pools are super easy to leak) "or writing custom allocation-free protobuf parsers" (this is, like, ultra duper hard). "Compared to those other types of optimizations, they share the same drawbacks but provide more benefits, so in our view, arenas are a net win."

I am completely on board with this one right here; this is totally a W, because it makes perfect sense, because it is so hard to do the other optimizations. Pooling objects is non-trivial: it's easy to get it wrong, it's easy to leak memory, it's easy to do the wrong thing. Yeah, pools are easy if you ignore exceptions; yeah, they're simple, and also, oopsie-daisy, stale data and some other things that accidentally happen, and blah blah blah.

"Arenas are a powerful tool for optimizing Go programs, particularly in scenarios where your program spends a significant amount of time parsing large protobuf or JSON blobs. They have the potential to provide significant performance improvements, but it's also important to note that they are an experimental feature and there are no guarantees of compatibility." Yep. All right, awesome. Beautiful, beautiful article. Thank you, Dimitri, this was fantastic, I really liked it, and I'm actually pretty excited about Go. To me, this just makes Go more appealing, because what this says to me is that Go isn't fine with 95% performance; they want that 99%, and if you can get 99%, you're looking creamy smooth. Because with JavaScript, you cannot do this; this is just not currently a thing for JavaScript, and nor do I really want it in JavaScript, because anything that is done with JavaScript tends to get wildly abused, you know what I mean? Wildly abused. So this is beautiful, this is actually really beautiful. And I think Greg here is Leptos Greg.
Greg, I'm using some more Leptos tonight, I don't know if you know that, but dude, I'm going deep on Leptos, Greg. Look at this, look at this beautiful stuff. I'm just about to grab some data from Tercel, but I'm using a local file client, which makes it even better, just grabbing it from that example. Love it. Anyways, Greg, you're a great guy. Great Guy Greg, everybody give Great Guy Greg big claps. I really do love Leptos, and it is solely the reason why I'm continuing to use Rust. No matter what, I think I would otherwise switch to OCaml and use OCaml for my backends, but at this point I still use Rust because of Leptos, because Leptos is that amazing. Again!
Info
Channel: ThePrimeTime
Views: 96,055
Keywords: programming, computer, software, software engineer, software engineering, program, development, developing, developer, developers, web design, web developer, web development, programmer humor, humor, memes, software memes, engineer, engineering, Regex, regexs, regexes, netflix, vscode, vscode engineer, vscode plugins, Lenovo, customer service
Id: eglMl21DJz0
Length: 16min 37sec (997 seconds)
Published: Mon Jul 17 2023