New Go Billion Row Challenge w/ Great Optimizations | Prime Reacts

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
1 billion row challenge in goang from 95 seconds to 1.96 seconds okay so if you guys don't know um I like reading these because often they they're going to be they're going to be filled with a bunch of like information I love goang okay there's a lot of code in here so I want to go look at how he's doing the codes and how he's making this thing fast so I'm actually pretty excited about this okay I know we've already done one but I want to love it even more okay this is exciting the 1 billion row challenge is quite simple the task is developing a program capable of reading a file with 1 billion lines aggregating the information contained in uh in each line by the way if you don't know the 1 billion Road Challenge started off as a Java Challenge and then spread to every single language this is like the fourth thing that I've read on the 1 billion Road Challenge it's actually super super cool I actually think this is quite awesome um and print a report with a result each line within a file contains a weather station name a temperature reading in the format of station name temperature where station name may be spaces and other special characters uh excluding that and the temperature is a floating Point number ranging from 99.9 to 99.9 with Precision limited to one decimal point the expected output format is station name equals Min or uh min mean Max holy cow dyslexia is kicking in sorted alphabetically by station name in Where min mean and Max denote the computed minimum average and maximum temperature reading for each respective station example of a measurement file Yellow Knife 16.1 whatever in in in in 32.9 example of the expected output all right awesome uh given that 1 billion line file is approximately 13 gigabytes instead of providing a fixed database the official repository offers a script to generate synthetic data with random readings just follow the instruction to create your own database although the challenge is primarily targeted for Java developers the problem Pro uh presents presets the problem presents probably is the T is going for uh an interesting toy exercise to experiment in any language as I've been working with goang in a daily basis at Gamers Club I've decided to give it a try to test how deep I could go but before going forward with this article I want to acknowledge that despite being well-versed I am no specialist in goang I am the kind of dumb for low-level optimizations uh domain to which I have never been very interested in this article I will pres uh present all the steps I took to optimize the solution everything was written and tested on Aizen 9 7900x PC not overclocked so 4.7 GHz with 12 cores and 24 threads an as rock b650 mhdc slm2 motherboard a 2x6 GB 6,000 bajillion mehz ddr5 Kingston Fury Beast Ram man you know when you're reading when you start reading computer specs out loud I feel it feels funny to say is is he bre is this humble bragging yeah just a little personal PC you know just backto back 32 gigs of really really really fast not even overclocked not even overclocked just standard clocked which is fast enough uh 6 GHz is normal for DDR okay I didn't know that and Kingston SSD sv300 s37a 1220g Windows 11 and go 1.22.0 amd64 okay okay okay okay the partial results I present or the results I present is the lowest consistent value I got for the runs while the editor and browser were open the final result is presented the final result is presented by presenting the aggregated result from 55 executions I'm not going to lie if he says the word presented once more I'm going to lose it okay I can't I can't I'm getting I'm getting satiated have you ever had visual satiation it's happening right now it's happening right now how slow is too slow before deciding to work seriously on this challenge I was curious how much uh how much slow is reading and processing the scary 1 billion rows file I had a feeling that the naive approach to it would take a long time driven by this curiosity I wanted to give it a try and implement the simplest solution possible name min max count okay good so this is a pretty classic approach right if you have the sum and the count you can create the average at the very very end pretty straightforward you have the Min in the max you got a couple extra uh you know you got a couple extra things right here you got the name you really technically don't need the name you could use the key to the map as the name but whatever okay I digress all right uh data map string station I wonder if you don't want it as a pointer would you want it not as a pointer for even better since you're doing literally a billion lookups on this thing is at that point would that like a pointer actually make a difference here if you're doing one billion lookups I would assume a pointer would actually now you're starting to think about pointers uh anyways file open up the measurements Panic if that's not there defer the closing scan uh the new scanner scan line text string line okay so this is like your very simple version of doing this we all know this one my my assumption is this would actually take quite a bit of time to run all right so this is like your very simple one right because buffered scanning I'm not exactly sure what it has to do but I'm sure it doesn't it probably scans the string at least once if not more uh anyways okay so we see all this to my surprise this code ran uh the code above ran in 95 seconds a lot better than I expected it would be note that print result and main functions will be used for the remaining of the article with little to no change okay okay so I think everybody saw this one coming right it it seems like if I were to try to optimize this problem I would start with file reading like how fast can I get file reading done and like that's really your bottom like your your your bibity bottom here what whatever speed you can do this you win right like that that's the rest seems like it's it you don't have to probably consider too hard for optimization I'm curious about that uh how fast is uh how fast is possible satisfied I went to bed and I couldn't sleep I knew how much time I needed to process the data but I couldn't stop asking what would be the fastest time possible to just open and read the file without overhead of processing oh I'm very actually happy he's going this direction because this is actually my question with go I have no idea I am actually very excited about uh this because I wanted to know how fast can you read it in right that was what I just stated and I'm curious how he does this because I assume you just can't use this you can't use a buff iio because a buff iio by the very Nature has to read data put it in here and then you have to read it again to like you doing something is my understanding right and so I can't imagine this will work right let's see first try notice that I'm using bytes instead of string uh a quick research told me that string conversion is slower and involves allocation of memory the bytes function reuse of an internal buffer returning the same object so there's no additional allocation the result was astonish thing 36 seconds okay I think you can still go faster scanner but scanner default configuration is really bad for the task I already knew that it was possible to reach much much faster since Java uh entries could reach the same result in 1.5 seconds but 36 seconds was surprisingly slow the scanner class has a buffer function which accepts a predefined bite object and a maximum number of elements for the case when the buffer can grow up to uh up in size without much details about how it works internally I tried to use it and tested some different values for the buffer size oh nice okay okay look at that interesting reduce this calls equals win yeah okay so you I mean with this with this buffered one you can get it at least down quite a bit okay I mean I guess this makes sense right in the sense that if you have a small buffer you have to constantly be reading over and over and over and over and over and over again well at some point it has to get worse right at some point it has to get worse because this is a lot of buffering right I think at some point it has to get worse because you're no longer I wonder what why the reason is I'd have to think about that actually I guess you're not is there a reason why that'd be worse cuz my assumption is allocating this would take almost no almost no time right allocating this should take almost no time reading empty space Oh you think it's reading empty space H much better so using a buffer around 20 uh this one uh bytes 4 to 16 megabytes respectively could improve uh 80% reaching around 6.7 seconds okay so I think you can still get faster so experimenting how fast go can do this okay so that's I mean this is pretty interesting so scanner is scanner is not all that fast buff reader another quick test I could do was using a buff io. reader object which would read bite by bite okay this change actually increased the time to 25.5 seconds okay yeah I I would assume that makes sense because I assume you're having a lot of the same problems here right uh all right we were there let's see buff let's see buff reader line instead of a reader read bite I tried read line I assume this would actually still get this is still bad which got it all the way down to this okay okay we're not there yet so we're still not there yet I like I like his I like his attempts here like I like what he's doing by the way which is just I I do like just trying a bunch of stuff and just measuring it empirically and coming up with an idea as to what what's going to happen here I think this is good after some initial exploration I checked out how scanner. scan Works internally and I noticed that it does a lot of things that I don't need it manipulates the buffer object a lot not sure why I also found not sure why chesterton's fence mentioned uh I also found that it uses file. read which had never used before let's try it out ah all of a sudden you get the control all of a sudden you get the control how big that buffer is hopefully he takes his ideas from the previous one these larger buffers and maybe uses that okay that would be fantastic right all right resulting in 18.86% file so I tested different buffer sizes again there we go let's go okay we're starting to get closer we're starting to get closer all right I like it I like it I'm surprised so I'm surprised it's going to take him a half second to process through all that data but okay okay I I guess I could believe it it's a billion objects it's a billion things you have to parse in a half second actually that you know now that I say that out loud that's it feels fast I wonder what it would take in JavaScript I wonder how fast you could make this in JavaScript I don't know now I want to try now I actually want to see how fast can you make it in JavaScript just because it's so hard to get JavaScript to run correctly probably close you could probably make it fairly close because you got to remember that when you call to a map you're calling to C++ functions when you read from a file you're effectively doing C++ stuff I think this is where bun versus no node would make a huge difference to business days all right T it's going to be about two business days um great now that we make a lot more sense okay all right now that makes uh now now that make a lot more sense okay good so we're looking good I like it I lack the knowledge to explain why large buffer file reads is so much better than other versions but I believe it may be related to how the information is retrieved from SSD yes you're calling the system a lot less right you stay in your land longer than than in other person's land uh to finish up the minimum structure I want to communicate with multiple go routines to get the feel of how much overhead that could add my idea was to create a single go routine send the buffer directly to it uh using a channel so I can measure the cost of communication alone interesting yeah because you almost because you couldn't reuse the buffer I bet you could kind of cheat the system by making like something like five buffers the indent is insane it's just because it's a tab right Welcome To tabs tabs on the internet you know for all those people saying Tabs are appear good and everyone should just use tabs there's always internet problems okay we still got a long way to go we still got a long way to go okay all right to finish up the minimum structure I want to communicate with multiple go routines okay we already did this one so this makes sense so here consumer is just reading from the channel uh writer is literally just sending these things one at a time okay cool this increased uh time to 1. 1266 seconds interesting I'm a little bit surprised by that I'm surprised it took uh oh is he using a buffer Channel yeah he's even using a buffer Channel I'm a little bit surprised that it took 3 seconds to send that over is it really that that much I guess yeah it's a billion Road I guess you're sending over a billion items you're setting over a billion items I I I I can never remember do is a go routine different than a co- routine I would say no i s I'm assuming depending on how you allocate this but probably not like not a co- routine you think a co- routine is different like a sort of I can never tell when it comes to a co- routine is that just con is is is that exclusively so just for my own knowledge when people use the term co-routine does that imply no parallelism or does it imply there could be parallelism that's the one thing I don't know co- routines are two generators yielding to each other there's no actual parallelism yeah so that that was my question was is a co- routine because Lua co- routines uh aren't they don't run in parallel there's no parallelism okay okay then a go routine does have parallelism you can have parallelism and go routines are effectively just you know you got these nice green threads so my question was does a go does a channel is a channel a lock free Q cotlin cobes can be par okay well that ruins everything yeah it does have a of course it has some sort of schedule during the leg the leg work um I mean anything that anything that's doing acing stuff or anything that's doing stuff and running in parallel you have some managed environment running it right there's there's no it would make zero sense if you didn't right they how could it ever even run something's doing it right gertin are are made parallel viia threads yes a go routine has a mutex okay interesting it has a mutex okay you mean not a routine a a uh a channel a channel has a mutex okay I feel like I've asked this question before if a channel is a lock free Q or if it's a mutex protected item but okay so it's a it's a I wonder I wonder if there's there's a speed difference I have no idea I honestly have I have no idea the speed difference between a lock fre Q versus a newex channel I have no idea um copying the buffer all right let's see I still got some problems the first one is that the file. read buffer will override the buffer every let's see let's see buffer every reading thus if we send the buffer to the channel directly among other synchronization problems the consumer will read inconsistent data this problem will be even worse when we add more go routines correct I wonder if you even really need more go routines uh to avoid this situation I will copy the buffer data into another array see I don't think you need I wonder if you need to do that I don't think you need to do that uh increasing the time to about 2.3 seconds uh almost doubling it notice that I tried to create a slice and copy the data uh manually with for this without any Improvement okay so now we we're back to being slow we're back to being slow in Sag again single girl routine copying buffer people let's see uh people uh take over two weeks to review my diffs and I'm being pushed out oh sorry we're in the middle of something we're in the middle so I can't I can be asking questions about things plus you kind of gave me like the middle you gave me like the middle of it I don't even know like the ends of it I have no idea what you're talking about flip take that out if you can flip please flip I'm telling you flip could you take that out all right leftover logic a let's see a natural way to scale our solution is spending each chunk or sending each chunk of data to a different go routine in parallel the go routines aggregate or aggregate the data independently and when finished the main thread should merge the information I believe that this process is similar to how no squeal databases optimize their queries however at this point the main thread uh reads a fixed buffer amount from the file but the lines can have different lengths which means that the buffer will cut off the last line unless we're really lucky yes yes I added leftover logic to store incomplete last line from one read uh from one reading to be used as the first part of the next chunk interesting interesting interesting interesting I don't know if I would do that nice by the way for those that are wondering 10 10 is the value of a new line character if I'm not mistaken right it's hexa which is the new line because you don't actually have to do this interesting anyways whatever uh all right so we're at 2.3 seconds okay so how does he get it lower you're hexy yeah I know all right workers and Communications as I stated previously the Natural Evolution from here is to create a workflow where go routine process the uh data chunks and return the partial aggregation uh and the main thread merges and presents the results okay yeah my idea was creating a list of go routines and sending the data to each of them sequentially cycling the go routines until the end of the file there was no significant increase of uh of time with this modification at this point I just copied the old processing with a few changes okay so why are we still using scanner I thought we're done using scanner here people were we done using scanner did we not prove that you got to do you got to do this one you guys drop the scanner anyways input chan oh is that thing this article is much better than I expected why I thought this article is going to be great right all right so we see all these things we're going to do some parsing some floating some blah blah blah blah all right consumer okay so this is a consumer here we go okay so it's reading this and it's using a scanner to read the data oh interesting it's using a scanner crazy okay it seems like a lot of work for this especially since the temperature you know is a single decimal point to precision and always within like this very specified amount it kind of feels like you could always make like a pretty easy read here right the read are somewhere the the read is somewhere between three to five bytes every single time all right anyways okay so they do a lot of this blah blah blah we've already seen all this they've read they oh file reading we've already talked about that they do the copying with the leftover data send over the data current worker doing one of these little cyc things by the way okay nice beautiful uh do this thing do all that thing little weight group wait for all the workers to finish okay fantastic and then we put all the data together and then I guess you aggregate all the data okay okay now we got two new parameters to adjust the number of workers and the size of the buffer channel to discover the impact of these parameters I created a grid test uh that run with each pair of configuration to see the results okay well that well done well done buffers and workers okay so I don't know why that one I don't know why that one's better I can't I don't know why that one's better but okay I like this I mean I like I like what he's doing here this is good this is really great exploration work by the way uh as expected for somebody that like cuz you got to remember this guy explicitly stated he is not somebody that knows about a bunch of low-level optimization like he just does he has no idea so he's just experimenting I love I love watching empirical investigations go down this much code better be building a full operating system or I'm going to be disappointed uh as expected a few go routines uh with a s uh single message buffer in the channel lock the main thread waiting to channel let's see waiting for the channel to become available there was no significant gain with more than 25 workers after the buffer size of 10 for a balanced setting I will proceed with 25 workers in a buffer size of 25 okay right in this he's shooting for this region right here okay I love Latin modern serif okay okay I'm glad you love that Discovery and progress managers be like push it to P we're done good enough optimization starting from the basic implementation I'll show how identified and worked to optimize individual code pass if you wish to repeat the process you can add the following stimp it at the beginning of your program CPU profiler Prof this okay so we're going to start doing a little bit of that delicious CPU profiler let's go all right then I'll run the uh Tool uh go tool PPR HTP this thing right here CPU profiler Prof which will open uh a detailed site showing the CPU profile insights the image below is one let's see is one of the reports presented by the site called a flame graph uh here technically it's an icicle chart okay cuz it's hanging from the it's a stag M cuz it's hanging from the ceiling and not going upwards like flames you know some people try to call these things icicle charts instead of flame graphs you know we used to have just one term for things and then people got so cutesy and then all of a sudden now some people toss out the phrase icicle chart and I'm just like is it stag tight I think it's stag might is it am I wrong on this one stag stag I dud I don't even know how to spell the effing thing oh damn you're right it's rising from the floor man oh man I'm stupid damn stag these nuts got him God my my cap my cave knowledge gosh dang people my cave knowledge is so weak anyways by the way can we just take one brief second here just one brief second here and just appreciate that this is supported in go can we just all be very happy that this is free free and go JavaScript can do that JavaScript cannot do that my friend okay you have to use what is referred to as the Chrome debugger protocol to hook up specifically to V8 to be able to transfer out this type of information you have to be able to use that Chrome does not do that or I mean uh JavaScript does not do that V8 does that um JavaScript core has some level of support for that and it's been varying over the years and as as Chrome adds something like say the performance timeline which is the new thing they've been pushing you'll notice that something like JavaScript core does not have it uh bun Doo support so those are run times bun uses JavaScript core which means it's going to lag behind anything that you see in V8 because V8 is Chrome right I mean the same same team so they're working hand and glove Dino on the other hand's using V8 so technically it should be able to do that long as the runtime allows for you to interact and set those flags all right anyways fantastic okay so what do we got here look at that look at that parse float you knew it was going to be parse float that son of a was going to be parse float you knew for a fact that son of a was going to be a parse float let's see let's begin with a bite split function or which I'm let's see which I'm using to split the name and temperature readings for each line as we can see the flame graph most of the time consumed is uh attributed to memory allocations runtime make slice and make Malik GC the simplest solution is to keep a fix siiz buffer for the name and temperature and copy the bites from the original line into new buffers yeah yeah yeah yeah that makes sense um how how do they how do they do that okay we got this nice little output Channel weight weight group we all have all this stuff do we have like a fix size buffer there you go named buffer temperature buffer we do this thing I'm trying to find where he does this thing I don't know where he's doing this could you like not convert it to a string could you just store a series of bytes a bite array as a key I wonder how that Keys into this with this change we could only reach 5.5 seconds uh optimizing solution removing the bite split reduced about almost 3 seconds of total time this is where things get get funny right isn't it funny how once you get into this level of optimization when it comes to this many items simple things like this actually make like second difference that's why like you know that's why I always have that Grudge with the 4 each Loop in JavaScript when I was using a 4 each Loop throughout my program because I was trying to be Mr good guy and not use you know not use Boomer loops and then all of a sudden when I just tried for fun to remove my boom or to or to use Boomer Loops instead of these four eaches it it was like on the order of like 6 seconds and I was just like okay I'm just not going to use stupid Loops anymore I'm going to use Boomer Loops because that's way better Boomer Loops four four Loops 4 I equals z i has to be less [Music] than there we go damn now I feel old yeah well I'm a boomer and I like those Boomer Loops all right custom bite hash now the next major offender is bites to string conversion see I was wondering about this I was wondering about this a subsequent map lookup the former is a problem because the statement string name buffer this is the exact line I was asking about it's almost like I pre-read this also allocates memory luckily this conversion is not necessary for interacting of the loop the name as string serves two purposes first to store it in the station data struct second to be used to look up in the map the the map lookup involves extracting a hash from the key and applying internal logic to uh to locate the corresponding data within the struct we can speed up the processing by sending a preash key uh so I decided to use the fnv hash which is a built-in and go I have no idea how it works but it works all right hell yeah yeah yeah there we go give me that give me that data give me that data reset right Bam Bam Bam Bam let's go I like it I like it I like it did did they also alter the line they must have also altered the line right here the string data wait hold on did you actually keep it in right here you what why why you got to keep the St first off you spelled that wrong okay uh first off typo second off you can't just be removing one name and then keeping it also right here just drop the station name dog all right there you go that improved it all right all right fantastic fantastic uh parsing float NE next biggest vendor is parse float I tried the same approach of converting uh the bites directly to the float big Indian o we got the nice big Indian indan babies oo the big Indians uh attempt one attempt two I don't see how these things are going to work cuz you're still using parse float you don't need parse float the attempt was using the binary built-in package but its performance was a lot worse the second attempt was using the btes uh the btes converted from uh the perf package let's see as you can see here but the results was equivalent so I considered parsing the individual bites but I let's see but I couldn't thought in any real Improvement to the function but I couldn't thought however at this point I already had consulted some of the Java Solutions and one of the best approaches uh they used is converting the temperature to an INT instead of floats which proved to be a lot more efficient he's lost uh he lost his mind while optimizing yeah something English is a little hard in this one something English is a little hard I told you God damn it integer is a gift from God it well yes it's that's because uh the float specification is crazy and you already know that it's up to two digits a period and one digit so it's kind of like you really don't even need to parse int you need to just simply you know you could have the simplest number function that's just like 0 through 9 0 through n period skip it 0 through n that's it right like I I don't know it feels like you don't even need to do parsent at all already uh showed some improvement but notice that if we just conver because you don't even have to check like you check for a negative sign once you know um uh but notice that if we convert the data uh to int from string we'll lose the decimal point thus I wrote a custom int conver ver that will keep the decimal point and generate an INT equivalent in float temperature this this this uh we can just have the final result by dividing the mean min max by 10 let's go that'd be good that's all you need to do I think bites to int is it negative yes it is negative result there you go I like it I like it I like it this is good okay this is exactly what you want this is fantastic this is fantastic negative do this this I don't think you really need to do this I guess you do need to do that you do have to multiply by 10 yeah what am I saying you have to do that anyways uh again well maybe I'm sure there's probably another way you could make this better but uh again another large Improvement 3. uh 3 seconds nice nice let's go we're getting so far down oh we've seen a lot we've seen a lot of main time thing we got stack these bad boys Uh custom scan uh the custom scan function is very straightforward we just needed to read the bytes until we find the slash end yep so there must be yeah there you go scanner so they're going to try to take out this chunk right here because if you take out this chunk you're going to see a huge you're going to see a huge W right here yeah you have to do the 10x to yeah uh this let's see the custom scan function is very straightforward we just read the btes until we find the slash head yep all right right that shouldn't be too hard read all these things blah blah blah blah I think we all know what's going on here by the way multiple return items are just fantastic okay I love implied tupal can we all just take a moment and say that if you can't return multiple values from a function your language is weak okay weak no no that's not returning multiple you're returning a single one you're destructuring on both sides that does not count this does not count there's a whole set of problems that go along with that when you return an object again you have to like typee that object you have to build it up there's like all sorts of things you have to do when you have to do that okay it does it in fact does not count this does not count this in fact does not count this is coping this is called cope you you typescript bros have just you you guys have lost it you guys have genuinely lost it you you know what you should do you should just take the L okay and say yeah it's true we don't got multiple return parameters but we can we can at least create an object WRA the object return the object and destructure set object okay that's not the same we are not the same you got to remember when you return an array or an object you're also getting all the other things that an object does must be talking about JS we are tech savvy Travy people are trying to say that this like JavaScript can do this too you just wrap a little array around it or just do or just put a little object right there easy not hard it's just like well it's not the same there's still like JavaScript constructs a lot of things when it comes to an object right it does put something in the nursery GC you must remember that there's a nursery GC it gets added to the nursery GC there's like things that happen okay it's not for freezy it's not it's not zero it's not zero cost all right so as usual look at this and this is our flame graph okay so it looks like we got we got to do some sort of we got to do some stacking here okay we got to stack these sons of custom hash I began considering a custom uh computation for the map hash after noticing an increased relevance of function some rate at this point I had the done some analysis and could extract some insights about the data in the measurements file one interesting finding was determining the number of bytes required uh to represent a station name without colliding with other stations in my database I found that I needed nine bytes with values ranging from 65 to 196 using this information I had the idea of concatenating every number into a single large un 64 while ensuring that the value does not surpass the upper limit of this I think that's fine but you're still having to like do a bunch of math to get stuff out I'm not sure if I buy this as like a win I'm not sure if I buy this before implementing the my solution I benched it against my uh F half running those functions this one okay yeah well I guess this wins there you go I guess you're doing a special case hash but how often are you colliding like the real question is is that are you actually slowing down all the map side of things that's where the danger comes whenever you write your own hash function your hash fun your hash function may be faster but it may Collide more and you may you may get yourself some oopsy daisies on the other side you get a you get you get a real problem potentially so you know I'm not saying this is a great idea it's worth noting that this hash is very situation and May Fail fall let's see fail in uh different data sets a problem for later nonetheless in this particular case the result was 2.7 seconds okay awesome okay okay I still think I still think we got some things left here okay uh inline functions the consumer let's see consumer is a hot let's see is the hot path thus any function call outside of it can potentially generate unnecessary overhead however inlining all the functions didn't show any Improvement but made me lose the profiling information edit I just learned that go inline functions automatic automatically whenever possible you don't yeah there you go fantastic uh workers reading the file while drafting this article I realized that I was super close to reaching the Max uh the minimum threshold I had set with the main thread configuration upon analyzing most of the time in the main thread was related to go routine communication with .98 seconds to reading the file and 0.13 seconds to communicating the data to co- routines it struck me that I could just move the file reference to a consumer and completely eliminate the communication overhead including replacing the communication buffer for a fixed buffer reducing the memory allocation overhead yeah but you're locking are you really I mean I guess if you're saying that your your reading is is better it's not cheating it's just that you're assuming that you're I mean it could be right because you're not actually doing any sort of I mean to be completely fair you're not coping buffers anymore and you're able to uh if if there's already a mutex in the channel then why not just try your own because then there's no because you're not actually do there's no buffering there's no extra anything going on you're just literally doing that this is actually pretty interesting okay by delegating the reading task to the consumers the go routine can locally the the only hard part is when you read how do you not accidentally read so far as to hit hit a new line that's where my question is is that you have to read to a new line and then you have to like not read a mutex is a spin lock underneath the hood I think you're right I think it might be a spin lock underneath the Hood by delegating the reading task to the consumer the go routine can locally read from the file using a mut text to avoid any concurrency issues for testing purposes I let's see only I will discard the first and last line of the buffer to avoid complex distribution leftover Logic for now the results uh were reduced further to 2.1 seconds yeah see I'm consider I'm curious about the distribution of little endline things right how do you do the end lines I don't know how they did that trash bin in order to recover the first and last line of each chunk I created a trash bin go routine uh that receives the discarded paths from other go routines and tried to merge the individual parts to complete the line that sounds crazy how this can't be right notice that the first line uh of the go routine is always complete complete uh valid line the last line of the go routine is always empty the file ends with sln all parts of the line between the match let's see matched by their ID each read from uh each read from file increases the ID the first line is kept as the previous ID and the last line assumes the next ID this process is controlled by the same M text used in reading the file guaranteeing the cons uh concurrency consistency I don't know how this works I don't understand how things are reading file like this I don't know how you're even guaranteeing this order question couldn't you just have a global bite that's like say I don't know 500 bytes or something like that just some small amount of bytes that the last one that you're reading while you have the lock mutex is that you read back take that trash amount and put it into this little Global buffer and then when you unlock you're actually locking two things at once instead of one wouldn't you just have that you like I I don't see why you'd want to use like a Channel or something like that a go routine that receives like that's the confusing part is this go routine receiving business that I don't really quite understand name and temp puffer I finally realize that I don't need a name and temp buffer yeah I know uh I let's see if I just use the sub slice of the read buffer I don't need the copy of the name and temperature buffer over and over again with this change of further reduced it to two seconds pretty much 2.07 seconds okay we're getting we're getting Swiss Swiss map I think we missed we missed some statement with the Swiss map all right all right we're getting smaller we're getting smaller finishing up uh with grid test I to finish uh up in style I wanted to perform a new grid test however I needed more samples for a setting in order to address the time variation which is too uh which is Much Too Close uh to each other now since I removed the channel buffer I only have two parameters read the buffer size read buff size and the number of workers after 15 runs for each configuration as you can see let's see all right could be achieved in less than two seconds and some runs now using the winner configuration for the final test I increased the number of runs to 55 and closed everything in the computer but the terminal the results are the following pretty good job pretty dang good job I feel like this would be a fun challenge to do just to because I've never used any the gopr tools uh the pepr tools I've never I haven't used them in the goand I just love the fact that they exist here it just seems so beautiful it'd be fun to do link I've already linked it once people come on get get your in the game people the P Prof the pp Prof that that a p Prof gun thank you uh still mind blowing that Java managed to get it to 1.5 seconds there must be something that they're doing you know there's probably something that is done wrong here cuz it you know someone was saying they got it in like milliseconds but there has to be something wrong with that right and what I mean by let's say I'm I'm going to have to do this aren't I I know Travy that's how I already feel I already feel like I'm going to have to do this I already feel like I do whipped mango thank you very much um final thoughts okay moreover for me the results are more incredible because I I didn't bother too much about manipulating the bites individually like the Java best like the best java solution okay there we go fantastic all right this was a great article this is a great do you see why we like to do this does everyone see why you like to do this this is actually kind of fantastic this is great abs absolutely love it uh this one's a lot better than the Java one you think so compile Linux from scratch and let's get those numbers down absolutely that's the easiest way to do it the name is someone saying it's less than one second no guys guys look it says right here 1.96 okay dog what are you talking about it's right here hey the name the name is the prime gen you know what I mean that's the name
Info
Channel: ThePrimeTime
Views: 105,118
Rating: undefined out of 5
Keywords: programming, software engineer, software engineering, developer, web design, web development, programmer humor
Id: SZ1PDS7iRU8
Channel Id: undefined
Length: 39min 42sec (2382 seconds)
Published: Sat Mar 23 2024
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.