G1 Garbage Collector Details and Tuning by Simone Bordet

Captions
welcome to this session. This session is basically my story: we have been using and suggesting G1 to our customers, and there was not very much information on the internet about how G1 really works internally, so I had to do some research, contact people, and look into the HotSpot code to figure out the details I was interested in. This session is me telling you what I discovered, and hopefully you will find it interesting. I work for an American company called Webtide; we provide support for the Jetty open source project, and we are basically the company that employs the Jetty committers. We are very active in open source, we have now implemented HTTP/2 and we are very happy with that; I gave another presentation on HTTP/2. Jetty is a 20-year-old open source project: it was first released at the end of 1995, using Java 0.9, for a contest that Sun was running at the time to encourage people to start writing in Java, and Jetty was one of those submissions. Twenty years later we are still here, still kicking ass, so if you don't use it, take a look, because it is super interesting. But let's go into G1. G1 is, in JDK 8, the low-pause collector, and the first paper on this collector dates back to 2004, so the algorithm is now twelve years old; it is about four years old in production and it is supported by Oracle. It has been designed as the long-term replacement for CMS, it is targeted at low pauses, and it will probably become the default collector in JDK 9. The reason is that the majority of Java applications actually benefit from very short collection pause times: if you have a web application, you really want the interaction with your users to be super fast; you don't want to pause for 5 or 10 seconds to do a full garbage
collection on the server before answering an HTTP request. And if you are on the UI side, maybe writing JavaFX or Swing, you also want to be very fast, because maybe you are rendering frames and you have a fixed time budget that you need to stay within in order to render each frame. So it is very important that garbage collection is efficient, and especially that it is low-pause. G1 has been designed to be really easy to tune: theoretically you just need two parameters, the max heap size (-Xmx) and how long a stop-the-world pause you are willing to accept (-XX:MaxGCPauseMillis); in this example, 100 milliseconds. These two parameters should be enough; there is none of the infinitely long tuning command lines we had with CMS, G1 is supposed to be this simple. However, this presentation will show you that it is not that simple: there are many other tuning parameters you may want to use, and we will go into the details. G1 is also a generational collector, like all the other collectors in the JDK. This means it divides the heap into two major zones, the young generation and the old generation, and it has a different algorithm for each generation. The young-generation algorithm is typical: like all the other collectors in HotSpot, it is a stop-the-world, parallel, copying algorithm, very classic. The old generation is where G1 is very different: it performs a mostly concurrent marking, very similar to what CMS does, but unlike CMS it does not do any sweeping; it does not reclaim the space immediately. We will see in a bit, in more detail, exactly how it does it, but basically it piggybacks the reclamation of old-generation space onto young-generation collections. This next slide is important, because G1 produces very detailed logging, so I always suggest that you keep
these flags enabled in your application. They have very little overhead on your application's throughput, but they provide very important information when you have a problem with G1; if you don't have this logging enabled, which tells you why and when G1 failed, it is going to be very difficult for you to figure anything out. So always keep them enabled: they provide a lot of information that can be parsed by tools, or by custom parsers that you can write; it is very easy. So let's go into a little more detail about the G1 memory layout. I guess you are familiar with the classic picture of contiguous young and old generations; most of you are, so, well, forget about it, because G1 is very different from that. G1 divides the heap into regions, and the regions are pretty small: G1 targets 2048 total regions. If you have a small heap, it makes 2048 regions that are very small; if you have a large heap, it still makes 2048 regions, which are now of course bigger. Then it tags those regions with particular names and says: I want these regions to be called Eden regions, or Survivor regions, or Old regions; these concepts are very similar to the ones we have in the other collectors. There is one more region type, called a humongous region, which contains one object only, typically very large, occupying at least 50% of the region size. When such large objects, typically byte arrays or char arrays, are allocated and they occupy more than 50% of a region, then that region is called humongous; we will see why they are special. So this is the G1 memory layout, very different from the classic one: we have Old regions, Eden regions, Humongous regions, Survivor regions, and so forth.
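To make the region math concrete, here is a small runnable sketch of the sizing rule just described. The class and method names are mine, and the clamping rule (power of two between 1 MB and 32 MB) is a simplification of what HotSpot actually does:

```java
// Simplified sketch of G1 region sizing. Assumption: the real HotSpot
// heuristics differ in detail; this only illustrates the "aim for ~2048
// regions, clamp to a power-of-two size" idea from the talk.
public class G1RegionSizing {
    static final long MB = 1024 * 1024;

    // Pick a power-of-two region size so that heap / regionSize ~= 2048.
    public static long regionSize(long heapBytes) {
        long target = heapBytes / 2048;
        long size = 1 * MB;
        while (size < target && size < 32 * MB) {
            size *= 2;
        }
        return size;
    }

    // An allocation is "humongous" when it fills at least half a region.
    public static boolean isHumongous(long objectBytes, long regionSize) {
        return objectBytes >= regionSize / 2;
    }

    public static void main(String[] args) {
        long size = regionSize(4096 * MB);         // 4 GB heap -> 2 MB regions
        System.out.println(size / MB);             // 2
        System.out.println(isHumongous(MB, size)); // true: 1 MB is half of 2 MB
    }
}
```

So a 4 GB heap gets 2 MB regions, and a 1 MB byte array is already humongous there; on a very large heap the region size is capped, so you end up with more than 2048 regions.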
So how does G1 work? Let's dive into the details. When the JVM starts, it allocates the heap, carves it into regions, tags them, and says: this bunch of regions here are empty for now, but they are my Eden regions. Then the application starts allocating, and G1 starts filling the first Eden region; when it is full, it goes to the second Eden region and fills that up, and then the third, and the fourth, and so forth. When all the Eden regions that G1 has reserved are full, G1 starts a young-generation collection. However, allocating is not the only thing your application does: the other thing it does is modify existing pointers. So one big question was: let's assume there is an object in an old region that points to an object in an Eden region. A typical case: you allocate a map at the very beginning of your application, and later in the lifecycle of your application you insert a new object into that particular map. The map lived long enough to be promoted into an old region, but the new map entry is allocated right now into an Eden region, so you have a pointer going from an old region to a young region. G1 must track these inter-generational pointers: from Old or Humongous regions (remember, a humongous region could be an array) into Eden or Survivor regions. Why does it have to do that? Well, if we look at one single Eden region in isolation, it looks like this: there is an object D pointing to F, and then there is an object E; but because nobody points to D and nobody points to E, we could say: well, this is all garbage, there is no way to reach object D from outside, so this region is basically full of garbage and I can reclaim it and reuse it for allocating other objects. However, when you take into
account that there may be inter-region pointers, the situation is very different. There could be an old region that points to D, and an old region that points to E, and then there could be an external pointer; these external pointers are called the roots, and this one could be, for example, a static field in a class. Follow those pointers and now you can actually see that these two objects are alive, because they are pointed to from other objects; while B, for example, is an object that nobody points to, so it is dead, even though it points to another object here, which is therefore dead too. But you have to figure all of this out; it is not that simple. Then it goes deeper: OK, I understand there are these pointers, but how does G1 track them? G1 has to be really precise in detecting those pointers, because when it performs a young-generation collection it only looks at the Eden and Survivor regions; it does not look at the whole heap, because that would be too expensive. Before we go into that, I will tell you that there is one more data structure that is very important, called the remembered set. This is the data structure that remembers, for a particular region, which objects living in other regions point into this region. Basically it says: there is someone from outside that points into me, and I remember where those guys from outside are. There is also an additional data structure called the card table, which tells you where exactly, inside a particular region, the external pointers into another region are. So how exactly does G1 track the inter-region pointers? Well, it installs what is called a write barrier: basically a small piece of code that the JVM injects into the code you are actually running.
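The card table plus dirty card queue combination can be sketched in a few lines. Assumptions: real HotSpot cards cover 512 bytes of heap, but everything else here (class names, the int "address") is purely illustrative:

```java
// Minimal sketch of a card table plus dirty card queue.
// Assumption: 512-byte cards as in HotSpot; the rest is a toy model.
import java.util.ArrayDeque;
import java.util.Queue;

public class CardTable {
    public static final int CARD_SIZE = 512;      // heap bytes covered per card
    public final byte[] cards;                    // 0 = clean, 1 = dirty
    public final Queue<Integer> dirtyCardQueue = new ArrayDeque<>();

    public CardTable(int heapBytes) {
        cards = new byte[heapBytes / CARD_SIZE];
    }

    // Post-write barrier: after "obj.field = x", mark the card covering
    // obj's address as dirty and enqueue it for the refinement threads.
    public void writeBarrier(int objAddress) {
        int card = objAddress / CARD_SIZE;
        if (cards[card] == 0) {                   // avoid duplicate enqueues
            cards[card] = 1;
            dirtyCardQueue.add(card);
        }
    }
}
```

The refinement threads later drain `dirtyCardQueue`, look at the dirty cards, and update the remembered sets of the regions those cards point into.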
Every time you do the operation obj.field = something in your code, G1 says: I need to change the value of this field, but I do not only change the value, I also run a little bit of code that lets me keep track of inter-region pointers. At that point G1 knows both objects, the one being pointed to and the one doing the pointing, so it can keep track of all the information about this pointer. Every time a pointer is written, G1 stores the information in a card: it marks the card and says, this is dirty; and then it puts that information into a separate queue called the dirty card queue. This queue is divided into four zones: white, green, yellow and red. As my application runs and modifies pointers, the queue grows: many modifications to objects living in old regions are made, and the queue fills up. As long as the number of entries stays in the white zone, nothing happens. When the number of entries exceeds the white zone and enters the green zone, background threads are started by G1; these threads are called refinement threads, and what they do is go back and say: I know there is a queue of cards pointing into these regions; I want to update the remembered set, I want to be sure this data structure is totally up to date. Why doesn't G1 update the remembered set immediately? Because updating the remembered set immediately would be very costly: this data structure would be heavily contended; imagine several threads all trying to write the same remembered set. Instead, G1 says: let's use a queue; queues are much cheaper data structures for holding information. I store the information into the queue, and then I fire up a background thread that,
every time there is a change, drains the queue and updates the remembered set while the application is running. And because it is a G1 thread, G1 knows everything about it; it is the only one writing into that data structure, so that is good. When we enter the yellow zone, G1 has started all the refinement threads. And when we enter the red zone, meaning there are so many changes that the queue fills up very quickly, G1 uses a trick: it says, I will not handle these changes anymore; I will ask the application to do the work. By asking the application threads to do this bit of garbage-collection work, an additional bit of code runs that was not running before, which slows the application down; by slowing down the application, I slow down the rate at which it modifies pointers, and therefore I give G1 a little bit of time and space to catch up, drain the queue, and update the remembered set. So that is what the refinement threads are about. Every explanation of G1 you find on the internet says: oh, and by the way, there are these parameters, -XX:G1ConcRefinementGreenZone and so on, but there is really no explanation of what the green, yellow and red zones are. That is the research I had to do in order to understand why they are important, how you might want to tune them, and what the meaning behind these refinement threads is. So now let's go into a little more detail about the G1 young-generation phases: what happens when G1 performs a young GC? The young-generation collection in G1 is still a stop-the-world collection, so the first thing G1 does is stop the application threads, and then it waits for all of them to stop, which can take time. At that point G1 builds what is called the collection set; remember this term, because we will come back to it when we look at the old-generation phase.
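The zone policy just described can be written down as a small decision function. Assumptions: the zone boundaries are configurable in HotSpot via flags like -XX:G1ConcRefinementGreenZone, but the names and numbers in this sketch are arbitrary:

```java
// Sketch of the white/green/yellow/red zone policy for the dirty card
// queue. Assumption: thresholds are illustrative, not HotSpot defaults.
public class RefinementZones {
    public enum Action {
        NONE,                      // white zone: let the queue grow
        SOME_REFINEMENT_THREADS,   // green zone: start background threads
        ALL_REFINEMENT_THREADS,    // yellow zone: all refinement threads run
        MUTATOR_HELPS              // red zone: application threads refine too
    }

    public static Action actionFor(int queuedCards, int green, int yellow, int red) {
        if (queuedCards <= green)  return Action.NONE;
        if (queuedCards <= yellow) return Action.SOME_REFINEMENT_THREADS;
        if (queuedCards <= red)    return Action.ALL_REFINEMENT_THREADS;
        // Red zone: slow the mutator down by making it do the work itself.
        return Action.MUTATOR_HELPS;
    }
}
```

The red-zone branch is the trick from the talk: making the application threads do refinement both drains the queue and throttles the pointer-mutation rate.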
So what is the collection set? It is the set of all the regions that G1 wants to look at during that particular collection. Because this is a young collection, the collection set contains all the Eden regions and all the Survivor regions; I do not want to look at the old regions now, it is a young collection, so I just look at Eden and Survivor. The first phase G1 runs, while the world is stopped, is called root scanning. G1 has to start from known places that are known to be alive: for example static fields in classes, but also all the local variables on the stack of each thread. So G1 walks each thread's stack and says: I am in the middle of executing this method; this method has allocated or references some local variables, and therefore the objects pointed to by those local variables are alive, because this thread is using them. It searches the first frame, then goes back one frame, and frame by frame until it reaches the top of the stack. The second phase is updating the remembered set. Remember the dirty card queue: it may be, for example, that my application does not change references very often, so all the changes are still in the white zone of the queue; or maybe a refinement thread has been spawned to help update the remembered set but has not finished yet, and there are still entries in the dirty card queue. So this phase is about draining the dirty card queue and updating the remembered set, so that the remembered set gives a consistent view of who is pointing into each Eden region. The third phase is: now that I have all the data I want, I process it. I go to the remembered set, follow the pointers back into the old regions, figure out which objects point into this Eden region, and really determine
which objects are actually alive in this Eden region. So I go there, follow the pointers, and say: remember object D from before? That object is alive; I check it and say, this guy is alive. That is processing the remembered set. The next phase is what is called object copy, and it is a very important phase because it does many things at once. Now that I know which objects are live, what happens is this: I start from the roots and follow the whole object graph of the objects I know are alive. And while I am traversing the object graph, following the pointers to children and then grandchildren and so forth, I can do many things. One of them is that I can copy those objects from one region to another: I copy the roots, then follow the pointers and copy the children, follow the pointers and copy the grandchildren, and so on. But while I am traversing the graph I can also keep track of what kind of objects they are: in particular, whether an object is a soft, weak, phantom, final or generic reference, because I want to set those aside and process them a little later. Not only that: while traversing the object graph, I can also keep track of how long it takes me to clean up one single region, or many regions, or the whole Eden and Survivor set; I can keep track of the times. So basically the object copy phase is mainly about copying the live objects out of the Eden regions, into either a Survivor region or an Old region. The fifth phase, then: because the traversal kept track of the references, I now do the reference processing. I see, say, a weak reference; maybe the heap is full, so I had better clear this reference and schedule its referent for garbage collection; and so I do that.
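The copying traversal described above can be sketched as a worklist algorithm. Assumptions: `Obj` is a toy stand-in for heap objects, the "to region" is just a list, and the forwarding-pointer trick (`copy`) is a simplification of what a real evacuating collector stores in the object header:

```java
// Sketch of the "object copy" phase: evacuate everything reachable from
// the roots into a new region, fixing up pointers as we go.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ObjectCopy {
    public static class Obj {
        public final String name;
        public final List<Obj> fields = new ArrayList<>();
        public Obj copy;   // forwarding pointer, set once evacuated
        public Obj(String name) { this.name = name; }
    }

    public static List<Obj> evacuate(List<Obj> roots) {
        List<Obj> toRegion = new ArrayList<>();
        Deque<Obj> worklist = new ArrayDeque<>();
        for (Obj r : roots) forward(r, toRegion, worklist);
        while (!worklist.isEmpty()) {
            Obj from = worklist.poll();
            for (Obj child : from.fields) {
                forward(child, toRegion, worklist);
                from.copy.fields.add(child.copy); // new copy points at new copies
            }
        }
        return toRegion;
    }

    private static void forward(Obj o, List<Obj> toRegion, Deque<Obj> worklist) {
        if (o.copy == null) {          // not evacuated yet
            o.copy = new Obj(o.name);
            toRegion.add(o.copy);
            worklist.add(o);           // its children still need visiting
        }
    }
}
```

Anything the traversal never reaches simply never gets a copy; that is how the garbage in the old regions is "left behind" without ever being touched.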
There are two switches you can enable here: -XX:+ParallelRefProcEnabled makes reference processing run in parallel on multiple threads (by default it is done by a single thread), and -XX:+PrintReferenceGC adds information about, for example, how many references G1 has processed. This can be very interesting information, because if you see very large numbers, say millions of references (it depends on the heap size), then you probably have a problem in your application, or there is a library that allocates a lot of weak references which are immediately thrown away; this operation is costly, so you want to minimize it. As I said before, G1 tracks the phases and the time it takes to run through a young-generation collection. Why does it do that? Remember that at the beginning you told G1 the maximum stop-the-world pause you are willing to accept; that is why it keeps track of the times. If I have to process a hundred regions and those hundred regions take too long, what is the solution for G1 to respect the max GC pause you have given it? It can simply reduce the number of regions it processes. If, instead, the Eden regions are processed very quickly and the target is much larger, G1 can say: I can use more regions, and therefore my GCs become less frequent in time, because Eden grows and grows and it is larger, and filling it takes more time. By doing fewer collections there is less overhead on the throughput of your application; G1 wants to keep your application running as much as possible, and to do GC as rarely as possible. So G1 tries to enlarge Eden up to the point where it can still respect the max GC pause target you have given it, and when it cannot, it shrinks the Eden size. It is a kind of simple algorithm.
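The feedback loop just described can be caricatured in a few lines. Assumption: the real G1 policy uses decaying averages of past pause times and much finer-grained cost models; this crude halve-or-grow rule only shows the direction of the adjustment:

```java
// Toy model of G1's pause-time feedback on the Eden size.
// Assumption: a crude proportional rule, not the real HotSpot policy.
public class EdenSizer {
    public static int resize(int edenRegions, long lastPauseMs, long maxPauseMs) {
        if (lastPauseMs > maxPauseMs) {
            // Pause too long: evacuate fewer regions per young GC.
            return Math.max(1, edenRegions / 2);
        }
        // Under budget: grow Eden so young collections happen less often.
        return edenRegions + edenRegions / 4 + 1;
    }
}
```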
But we will see that in real life, by parsing the logs, G1 shows a very peculiar behavior that is typical of this algorithm, and I will show you in a moment. Graphically, this is what happens: this is the heap; at a certain point a G1 young-generation collection triggers, and a few regions are chosen, Eden and Survivor regions, and G1 says: I want to relocate all of these into other regions. An Eden region can also become a Survivor region here; maybe a Survivor region contains a number of objects that are always alive, objects that you allocated at the beginning of your application and that stay alive for its whole duration. So what happens is: we get a new Survivor region here, and all the regions I started from become empty, and I can fill them up again. You with me so far? Yes? Good. So let's go to the old generation now, which is a little more complicated, but in the end very similar, because it piggybacks on the young generation. What happens when the heap fills up? As we have seen, Survivor regions get promoted into Old regions, so the number of old regions grows and grows, and at a certain point G1 has to schedule an old-generation collection. An old-generation collection is scheduled when the heap reaches 45% full, measured over the whole heap; so the heap is actually still less than half full. At that point G1 says: it is time to take a look at the old regions before I fill the whole heap, because if I wait too long it may be too late; in the time it takes me to scan all the old regions, they may fill up even more, and at a certain point I go out of memory, and that is not good. So 45 is a very conservative number; it is the default, and it can be tuned with -XX:InitiatingHeapOccupancyPercent.
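The trigger condition is just a percentage check; here it is spelled out, with the class and method names being mine:

```java
// When heap occupancy crosses the initiating threshold (45% by default,
// -XX:InitiatingHeapOccupancyPercent), G1 starts a concurrent marking cycle.
public class Ihop {
    public static boolean shouldStartMarking(long usedBytes, long heapBytes, int ihopPercent) {
        // Integer form of: usedBytes / heapBytes >= ihopPercent / 100
        return usedBytes * 100 >= heapBytes * (long) ihopPercent;
    }
}
```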
The old-generation algorithm consists of marking the live objects: it tries to find all the live objects in the old regions, and it does so concurrently with the application. Now remember what I said before: your application is not only allocating, it is also modifying pointers. How can G1 navigate an object graph that is actually changing? That was another doubt I had and wanted to explore and understand better. The algorithm G1 uses is called tri-color marking, and it works this way. This is an object graph. The first thing G1 does is say: there are root objects that I know are alive for sure; I mark those guys black, I mark the children they point to gray, and I put the gray ones in a queue. Then: let me take the first gray object out of the queue, say this one; let me analyze it and follow its pointers. I follow the pointers and mark the children gray, and when I have followed all the pointers of that object, that object becomes black. This one has only one pointer, so its child is marked gray and put into the queue, and then this guy is marked black. I dequeue the next gray object, follow one pointer, mark that child gray and put it in the queue; follow the other pointer, mark that child gray and put it in the queue; and then this guy is done, mark it black. Pop another gray object: it has no pointers, or its pointers are null, or it has only primitive fields like int or long or boolean, so I have nothing to follow; mark it black, done; and the same for the remaining gray objects. So what is the status of the heap at this point? All the objects that are alive are black.
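The marking walk just narrated can be written down directly, following the talk's convention (roots go straight to black, their children go gray into a queue). Assumption: `Obj` is a toy object, and this is the single-threaded version without any concurrent mutation:

```java
// Runnable sketch of tri-color marking as described in the talk.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TriColorMarking {
    public enum Color { WHITE, GRAY, BLACK }

    public static class Obj {
        public final List<Obj> fields = new ArrayList<>();
        public Color color = Color.WHITE;
    }

    public static void mark(List<Obj> roots) {
        Deque<Obj> grayQueue = new ArrayDeque<>();
        for (Obj root : roots) {
            root.color = Color.BLACK;              // roots are known live
            for (Obj child : root.fields) shade(child, grayQueue);
        }
        while (!grayQueue.isEmpty()) {
            Obj o = grayQueue.poll();
            for (Obj child : o.fields) shade(child, grayQueue);
            o.color = Color.BLACK;                 // all its pointers followed
        }
        // Invariant on exit: no black object points to a white object.
    }

    private static void shade(Obj o, Deque<Obj> grayQueue) {
        if (o.color == Color.WHITE) {
            o.color = Color.GRAY;
            grayQueue.add(o);
        }
    }
}
```

After `mark` returns, everything reachable is black and everything still white is garbage, which is exactly the invariant the next part of the talk relies on.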
Whatever remains white is, by definition, garbage: I cannot reach those objects, therefore they are garbage by definition. Notice also something very important: there is never a pointer from a black object to a white object. We may have black to gray, and maybe, as in this case, gray to white, but never black to white; this is an important invariant that G1 has to respect. However, the application is modifying pointers underneath me while I am traversing the object graph to find out who is alive, and so there is what is called, in the literature, the lost object problem. This is how it goes. Marking is in progress, and I am about to analyze object B, popping it out of the gray queue; then the GC thread gets stopped. The application thread now does this: A.c = C, so it creates this pointer; and then it deletes the other one, B.c = null, so that pointer gets deleted. That is something an application can legitimately do. Then the GC thread that was doing the marking resumes and says: let me pop object B out of the gray queue; look, there is no pointer, the pointer is null, I have nothing to follow, so I mark this object black, I am done. But now we have this pointer from black A to white C, which is bad. Why is it bad? Imagine I copy object A into a different region: from that new region I will have a pointer to an object that is considered garbage, because it is white, and whose memory could be overwritten by new allocations; I would have a dangling pointer into some random memory, and the JVM would crash. So, in order to detect this particular situation, the JVM again installs a write barrier, and in particular a write barrier that triggers every time a pointer is deleted. If I can detect B.c = null, I know many things: I know the object the pointer was pointing to, and I know the
object that contained the pointer; so I know both of them, and I have a bunch of information I can use for garbage collection. This technique of remembering the deletion of a pointer is called the snapshot-at-the-beginning (SATB) technique. This is another term that comes up a lot when you search online for how G1 works, and it is very seldom explained; nobody really tells you more than "yes, we use a snapshot-at-the-beginning technique". To learn what it actually is, you have to go buy the big book on garbage collection, and there you can find more information, but the book costs a hundred euros or so. What G1 does is speculate that the object whose incoming pointer has been deleted remains alive. If we go back to the graph: here I made the case where A.c was created pointing to C, so C was being kept alive; but this may not be true. It may just happen that the B.c pointer gets deleted and the A.c pointer is never created by the application, so the object is actually garbage. However, G1 says: you know, this is a rare case, a race between the GC thread and the application thread that happens rarely; I will assume C stays alive. This kind of technique happens to be much more efficient; CMS does not use it, and in G1 they chose it because it is known to be faster. Yes, I may retain some floating garbage, which is what an object like C is called, but it does not matter: on the next cycle nobody is really pointing to C, so when we navigate the object graph again, C will not be visited and will therefore be garbage, and on the next cycle I will remove it, no problem. However, this gives you a hint: G1 works really well when it has more space than strictly needed by your application, exactly because floating garbage can remain around for a cycle. The more headroom you give G1, the better it works.
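The SATB pre-write barrier can be sketched like this. Assumptions: `Obj` and the single `field` slot are toy stand-ins, and `markingActive` models "a concurrent marking cycle is in progress":

```java
// Sketch of the snapshot-at-the-beginning (SATB) pre-write barrier:
// BEFORE a pointer field is overwritten, the OLD value is recorded,
// so the object it referred to is kept alive for this marking cycle.
import java.util.ArrayDeque;
import java.util.Deque;

public class SatbBarrier {
    public static class Obj {
        public Obj field;
    }

    public final Deque<Obj> satbQueue = new ArrayDeque<>();
    public boolean markingActive = true;

    // Models "owner.field = newValue" with the barrier installed.
    public void writeField(Obj owner, Obj newValue) {
        Obj old = owner.field;
        if (markingActive && old != null) {
            satbQueue.add(old);   // speculate: the old target stays alive
        }
        owner.field = newValue;
    }
}
```

During the remark phase the collector drains `satbQueue` and marks everything in it, which is exactly the "process those C objects again" step the talk describes next; the price is that some of them may really be dead, i.e. floating garbage.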
So let's go into the details of the G1 old-generation phases. It stops the world shortly, and it piggybacks on a young-generation collection: when G1 wants to do an old GC, it asks the collector, do a young GC now, even though it is not strictly needed, and while you are doing the young GC, please also record for me the old-region roots, so that I know where to start from. Then, when the young-generation collection finishes, the application threads are restarted, and now the concurrent marking can start. Again, because it traverses the object graph, it can keep track of references (soft, weak, and so on), and it also computes per-region liveness: it can say, I am traversing this object, it belongs to this old region, and look, it is the only live object in that region, the region is basically full of garbage; or, I am traversing many objects and they all belong to that particular region, so that region is full of live objects. G1 keeps track of that information. Then it stops the world again and performs what is called the remark phase. In the remark phase it processes the snapshot-at-the-beginning queue: remember all those C objects that G1 speculated were alive? It processes them; it follows them again, maybe they have children, so it navigates them again; and then it does the typical reference processing again, during a stop-the-world phase. And then of course there is a cleanup phase, because G1 says: while I was navigating the object graph, it turned out I never went into that particular region at all, which means that region is completely full of garbage.
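The per-region liveness accounting mentioned above is conceptually just a counter per region, updated as marking visits objects. Assumption: class and method names are mine; the real implementation stores live bytes in per-region metadata:

```java
// Toy version of per-region liveness accounting during concurrent marking.
import java.util.HashMap;
import java.util.Map;

public class RegionLiveness {
    public final Map<Integer, Long> liveBytes = new HashMap<>();

    // Called by the marking traversal for every object it marks live.
    public void objectMarkedLive(int regionIndex, long sizeBytes) {
        liveBytes.merge(regionIndex, sizeBytes, Long::sum);
    }

    // A region the traversal never entered can be reclaimed in the
    // cleanup phase with no copying at all.
    public boolean fullyGarbage(int regionIndex) {
        return liveBytes.getOrDefault(regionIndex, 0L) == 0;
    }
}
```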
No problem: I can reclaim that region immediately, put it back and say, you are free, free to receive promotions from the young generation. And then of course the application threads are resumed. However, there is something missing: where is it that I reclaim the regions that are partially full of live objects and partially full of garbage? There is no phase for that, and that is the real difference between CMS, or other algorithms, and G1; we will get there in a second. This is how it looks in the log: we have an initial-mark here, which detects the roots; then we have the concurrent region scan and the concurrent mark, which in this case took about 1.67 seconds to navigate the whole object graph; then the remark phase, 48 milliseconds; and then the cleanup phase, where in this particular case the heap went from 16 gigabytes to 14, so it reclaimed two gigabytes' worth of old regions that were completely full of garbage, and it took 55 milliseconds to do that. So, as I was saying: what about the regions that are not empty, not completely full of garbage? And how is fragmentation resolved? Well, this is how it works in G1: G1 performs what is called a mixed GC. A mixed GC is basically this. G1 says: I know that all these old regions contain some live objects but also some garbage; let me divide the number of candidate regions by 8, take one eighth of them, and put them into the collection set while I am doing a young-generation collection. So: marking is finished; some time passes, the application runs, allocates some more; eventually G1 says, I need to do a young-generation collection, but there is a mixed flag turned on, so I have to take into account not only Eden and Survivor but also the old regions, one eighth of them, because we just finished an
From there, the algorithm is the same as a young-generation collection: I know which objects are alive, I navigate the object graph, I copy the live objects into other regions, and I'm done. That's how G1 compacts: it goes into an old region, moves all the live objects out, leaves the garbage behind, and that old region can then be recycled. I can take five or six old regions, copy all their live objects into a single old region, and reclaim the five or six regions they came from. G1 can compact in this way, which is the big difference, and the big plus, over CMS, which did not do this.

Of course G1 is very smart. Remember that it counts how many live objects there are per region, so within that particular one eighth it first targets the regions that are mostly empty: if I only have to move two objects and the region is done, that's very cheap and reclaims a lot of space. By default, regions are only considered if their live occupancy is below a threshold of 85 percent, and the mostly-garbage ones are evacuated first. There is also a waste threshold. G1 can say: I know this region contains maybe 92 percent live objects and 8 percent garbage, but copying those 92 percent is going to be costly, so I don't care about the 8 percent of garbage that remains there; I won't even look at that region, because it's very expensive for very little gain. Maybe I reclaim a few bytes, maybe a few kilobytes, but it's not worth it. So you can tell G1: you can waste 5 percent of the heap that way, don't worry.

Then the application runs again, more allocation happens, and another young-generation collection is triggered.
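The selection policy just described can be sketched as a toy model. This is not HotSpot code: regions are reduced to a live-occupancy percentage, and the 85 percent cut-off mirrors what the JDK 8 tunable G1MixedGCLiveThresholdPercent controls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of how G1 picks old regions for a mixed GC (not HotSpot code). */
public class MixedGcCandidates {

    /**
     * Returns the regions eligible for evacuation, cheapest (least live) first.
     * Regions with more live data than the threshold are skipped entirely,
     * because copying an almost-full region costs a lot for very little gain.
     */
    public static List<Integer> selectCandidates(List<Integer> liveOccupancyPercent,
                                                 int liveThresholdPercent) {
        List<Integer> candidates = new ArrayList<>();
        for (int live : liveOccupancyPercent) {
            if (live <= liveThresholdPercent) {
                candidates.add(live);
            }
        }
        // Mostly-empty regions first: moving two objects to reclaim a region is cheap.
        candidates.sort(Comparator.naturalOrder());
        return candidates;
    }

    public static void main(String[] args) {
        // The 92%-live region from the talk is skipped; the rest are ordered
        // so the emptiest regions are evacuated first.
        List<Integer> regions = Arrays.asList(10, 92, 40, 5, 85);
        System.out.println(selectCandidates(regions, 85)); // [5, 10, 40, 85]
    }
}
```

The sort order encodes the "first target the regions that are mostly empty" heuristic from the talk.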
At that point I take the second eighth of the old regions, and so forth. The further I go into these eight slices, the less efficient those regions are to reclaim, so G1 can stop early and say: I was supposed to do eight mixed GCs, but I already reclaimed enough space after maybe three, so I'm not going to do the other five; I'll waste a little bit of heap, and that's OK.

Graphically, this is what happens: I target old regions, but also Eden and survivor regions, and I move the live objects, so the old region that was here ends up over there, but now it's compacted, and the freed region can be used for the overflow from survivor to old, compacting even more. That's how space is reclaimed. So this is basically how G1 works.

Now let's go to the advice: what do you have to do to make it perform well? First: avoid full GCs at all cost. This is a mantra. Why? Because full GCs in G1 are single-threaded and, believe me, they are really slow; you don't want to go there at all. Avoid another situation too: you have free space, but when G1 has to evacuate certain old and survivor regions, the live objects may not fit into the free regions. That situation is called to-space exhausted, and you don't want it either. When you encounter a to-space exhaustion it means your heap is too small: enlarge it and give G1 room to function properly. The regions being collected could be entirely garbage, in which case you get away with it (you copy zero objects and you're good), but if the live objects overflowing from those regions cannot fit, then you get a to-space exhaustion and fall back to a full GC. Also avoid humongous allocations.
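As a sketch, the knobs behind the mixed-GC behavior described above (the pause target, the eight slices, the 85 percent liveness cut-off and the 5 percent waste allowance) map to these JDK 8 flags; the values shown are the defaults except for the pause target, and G1MixedGCLiveThresholdPercent is an experimental flag that must be unlocked:

```
-XX:+UseG1GC
-XX:MaxGCPauseMillis=250
-XX:G1MixedGCCountTarget=8
-XX:G1HeapWastePercent=5
-XX:+UnlockExperimentalVMOptions
-XX:G1MixedGCLiveThresholdPercent=85
```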
Humongous objects are treated very specially in the G1 code: there are special paths that need to be followed when you do humongous allocations, and when G1 goes over the heap and sees humongous regions it has to treat them specially, and so forth. Basically G1 executes more code when you do humongous allocations, and executing more code means the collection lasts longer. So again, you don't want to do that. How do I solve humongous allocations? You just make the regions bigger. Remember, the limit is 50 percent of the region size, so with 16-megabyte regions any object larger than 8 megabytes is humongous; if I was allocating a 12-megabyte array, it was humongous. If I double the region size to 32 megabytes, the limit for humongous is now 16, and 12 megabytes is not humongous anymore. Super cool.

Reference processing is something to watch as well. If you connect, for example, a JConsole to a live production web application, RMI creates weak references; there could be a few, there could be a lot of them, so pay attention to that. ThreadLocals also allocate weak references, and take a look at your third-party libraries too.

Now, where is the real meat? A real-world example. The customer was running online chess games: a single server running on Jetty, doing 20,000 requests per second, on a pretty beefy box, 64 gigabytes, and there's a mistake on the slide: not 32 cores but 24 cores, 2 by 12. It had an allocation rate of 0.5 to 1.2 gigabytes per second, which is pretty high, and the application was running 24 hours a day: 0.5 during the night, 1.2 during peaks, but always running. We went from CMS to G1, and the first bump we hit was: oh, we moved to JDK 8, and the permanent generation is gone.
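The region-size arithmetic above can be checked with a few lines. This is only the rule of thumb from the talk, not HotSpot code; on a real JVM the region size would be set with -XX:G1HeapRegionSize=32m.

```java
/**
 * Illustrates the humongous-object rule from the talk: any single object of
 * at least half a region is allocated as "humongous". Arithmetic only.
 */
public class HumongousCheck {

    static final long MB = 1024 * 1024;

    public static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        // 12 MB array with 16 MB regions: the threshold is 8 MB, so humongous.
        System.out.println(isHumongous(12 * MB, 16 * MB)); // true
        // Doubling the region size to 32 MB raises the threshold to 16 MB.
        System.out.println(isHumongous(12 * MB, 32 * MB)); // false
    }
}
```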
Yes, thank you! It has been replaced with something called metaspace: rather than being the permanent generation, it basically just changed name, and now, instead of throwing OutOfMemoryError, it says: OK, I need to expand, let me do a full GC. For a web application like this one, at 20,000 requests per second, if you stand still for 14 seconds you basically have an outage, and that's not good. I didn't do the math, but imagine how many requests were dropped in those 14 seconds at 20K per second; it's a lot. The fix is easy: you just enlarge the metaspace size. The problem is that you have no idea how big it must be, so it's trial and error; do your homework before you go to production.

The second issue was the pause target. We said: OK, let's go with 250 milliseconds. Then we collected a 24-hour run and computed percentiles of how long the pauses really were. It turned out that only 50 percent of the pauses were less than or equal to 250 milliseconds; the other 50 percent were greater than that, up to a maximum of three times the target. So G1 tries to respect the max GC pause, but in our particular case this was the result we got. Your mileage may vary, your application may be different, but know that the target is not respected as a maximum; it is respected more as a median, so expect to be around there.

Third, the most interesting behavior: the mixed GC. Remember what G1 does when it has to stay within the pause time? It shrinks the Eden generation. And this is what happened: when G1 has to take into account one eighth more regions for a mixed GC, that takes time, and therefore G1 says: because it takes more time, I have to shrink Eden.
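The percentile analysis mentioned a moment ago can be sketched like this. The pause samples are made up to mirror the shape the talk reports (target held as a median, worst pause about three times the target); the percentile method is a plain nearest-rank computation, not a real log parser.

```java
import java.util.Arrays;

/** Sketch of the 24-hour pause-percentile check described in the talk. */
public class PausePercentiles {

    /** Nearest-rank percentile (p in 1..100) over the samples. */
    public static long percentile(long[] pausesMillis, int p) {
        long[] sorted = pausesMillis.clone();
        Arrays.sort(sorted);
        int rank = Math.max(1, (int) Math.ceil(p / 100.0 * sorted.length));
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Hypothetical pause durations, in milliseconds, from a long run.
        long[] pauses = {120, 180, 200, 240, 250, 300, 400, 500, 600, 750};
        System.out.println("p50  = " + percentile(pauses, 50));  // 250: the target holds as a median
        System.out.println("p100 = " + percentile(pauses, 100)); // 750: three times the target
    }
}
```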
In this particular log line you can see that Eden went from 12.4 gigabytes to 0.6; that's twenty times smaller. But what stayed the same? The application: it didn't stop running, it was still allocating one gigabyte per second. So if before it took about twelve seconds to fill Eden, now it takes 0.6 seconds.

There is a metric for this, called minimum mutator utilization (MMU), which basically says: within a window of time, how long does your application actually run? These are the events as they happened, about 10 seconds apart: a young GC, 10 seconds pass, another young GC, 10 seconds, another young GC; in the meanwhile the marking had finished, so at this point G1 decided it had to do mixed GCs and shrank Eden to 0.6. Now it's young-generation collections bam bam bam bam, one after the other, until the mixed GCs are finally done and G1 can enlarge Eden again and go back to plain young-generation GCs. Eden is enlarged again, but during that period I was frantically doing young-generation collections one after the other. The graph of the Eden size looked like this: 12 gigabytes here, down to less than 1, then slowly recovering.

And this is the minimum mutator utilization. Take a two-second window: the ideal is that during those 2 seconds only my application runs, and that's 100 percent. If within those 2 seconds there is garbage-collection activity, my application does not run, and I get a dip. For a young-generation collection I go down to maybe 95, 90, 85 percent; but during the mixed GCs I go from 100 percent down to 40.
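The MMU just defined can be sketched as follows. This is a toy computation, not taken from any tool: pauses are hypothetical [start, end) intervals in milliseconds, and windows are only evaluated starting at each pause start, which is enough to find the worst case for this illustration.

```java
/**
 * Toy minimum-mutator-utilization (MMU) computation: for windows of a given
 * length, the fraction of the window the application (the mutator) runs,
 * taking the worst window found.
 */
public class Mmu {

    public static double mmu(long[][] pauses, long windowMillis) {
        double worst = 1.0;
        for (long[] anchor : pauses) {
            long winStart = anchor[0];
            long winEnd = winStart + windowMillis;
            long paused = 0;
            for (long[] p : pauses) {
                long overlap = Math.min(p[1], winEnd) - Math.max(p[0], winStart);
                if (overlap > 0) {
                    paused += overlap;
                }
            }
            worst = Math.min(worst, (windowMillis - paused) / (double) windowMillis);
        }
        return worst;
    }

    public static void main(String[] args) {
        // Six 200 ms pauses crammed into a 2-second window: even though each
        // pause respects a 250 ms target, the application only runs 40% of
        // the window, which is the kind of dip described in the talk.
        long[][] pauses = new long[6][];
        for (int i = 0; i < 6; i++) {
            long start = i * 333L;
            pauses[i] = new long[] {start, start + 200};
        }
        System.out.println(mmu(pauses, 2000)); // 0.4
    }
}
```

This is why a log full of in-target pauses can still hide an unresponsive system: the MMU, not the individual pause time, reveals the storm.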
It means that even if each single pause of G1 respects the target you gave, those pauses look like: 200 milliseconds of GC pause, one millisecond of application run, 200 milliseconds of GC pause, one millisecond of application run. Your application tries to run, but the GC keeps interrupting it. You are respecting the pause target, yet your application only runs 40 percent of the time during that particular window. That can be bad: if you look at the logs, everything seems great, I'm respecting the GC pauses, so why is the customer complaining? Well, because they're making a request exactly at that point, and the system is not responsive. This is important to know, because it will help you figure out problems in your application that are not evident from the logs: in the logs all the pauses are well within the time you gave them, so you have to graph them, in particular the minimum mutator utilization, and see if you have big dips. This is the same graph over a five-second window: even over five seconds, which is a human time scale, the time it takes to click on a user interface, the application was running at only 60 percent of its capacity.

So, conclusions. G1 is the future: it's going to be the default in JDK 9. There are very good chances that you just set the maximum heap size and the pause target and it will just work. It is easier to tune than CMS, more or less, but you have to know it, you have to understand how it works, because you may be surprised by behavior that is not evident. For us it was a very interesting quest: understanding G1, understanding how it works, understanding why customers were complaining even though the application seemed to run really well. It is still based on a stop-the-world algorithm, so if you really need extremely low pauses, less than one millisecond, you
may want to look at different solutions, which are available. And always use the most recent JDK, because they are working very actively on G1, fixing bugs and improving performance every single day. If you subscribe to the OpenJDK hotspot-gc mailing list you will see a ton of commits and a ton of improvements, and basically everything relates to G1; the work on CMS and the parallel collector is reduced to a bare minimum, like less than 1 percent.

A few references: you can go to SlideShare and search for G1 and you'll find a bunch of authors that have put out information, plus Oracle's sites and the OpenJDK mailing lists. The people there are really kind: if you send them your logs they are able to parse them and understand what's going on, when they have time of course, and they'll give you suggestions on how to tune G1 if you have problems.

With that, I'm done; my time is out. Thank you for listening, and if you have questions I'll be around here, so stop by and ask me anything you want. Thanks!
Info
Channel: Voxxed Days
Views: 28,883
Rating: 4.9852128 out of 5
Keywords: VDZ16, Voxxed, VoxxedDays
Id: Gee7QfoY8ys
Length: 55min 47sec (3347 seconds)
Published: Sun Mar 06 2016