Performance optimization: how do I go about it? by Kasia Zien

Captions
Hi everyone — can you hear me at the back? Cool. How's everyone doing at 5 p.m.? Everyone's still alive? I hope you enjoyed the whole day of talks; thanks for sticking around until the end. I'm Kasia — that weird Polish-sounding handle on Twitter. I live in London and I work for Monzo, a bank some of you might know from the flashy hot-coral cards, and today I'm going to talk to you about performance optimization.

As developers, when we pick up a task we usually like to start with planning — maybe proper planning with sticky notes on the wall, maybe just figuring out in your head what you're going to do. Then you go off and do some coding, and then hopefully you write some tests with the code — or tests first, whatever you prefer. The tests pass, great, deploy to staging, everything is fine, we're good to go to production, right? Well — have we considered performance anywhere in this? I think it's a very typical scenario, and I don't really want to blame anyone for it, because let's just say it out loud: analyzing performance is hard, especially when you've never done it before, when you're facing it for the first time and have no idea how to get started. So I thought I'd try to show you today how you might go about starting with performance optimization, based on my own experience of having to do it the first time around with no clue where to start.

First things first, let's remind ourselves why we should care about performance. I'm not going to spend too much time on this because hopefully the answers are obvious to everyone, but in case they're not, how about this: fast is generally better than slow — everyone wants fast. Memory-efficient is good: saving money by using fewer servers to run your application is good — it's the much smaller AWS bill. Running out of memory in production is bad, and suddenly running out of memory in production is really bad.
And language and hardware improvements will only take you so far. The PHP JIT RFC contained a sentence from the PHP authors saying that they believe they have hit the limits of opcache optimizations. Essentially, up until around 2004, performance was improving by about 50% year on year — the classic Moore's law — so all we had to do was wait a bit until we got the latest hardware and it magically made things faster. Then we hit the limits of how much we can solve with hardware alone, and the rate of improvement these days is more like 22% per year. As Gordon Moore himself said, all exponentials come to an end. So gone are the days when we could just be lazy and wait for the hardware manufacturers to do our work for us. I'm not saying hardware isn't improving — it still is, especially if you go into multi-core architectures: instead of using one core for your application you use all four at the same time, so if you have a concurrent design you can still gain some improvements there. I think this is why languages which promote concurrency, like Go, are so popular these days, and even in PHP there are now frameworks popping up for concurrent programming — Swoole is one of the most popular ones. So the ball is in our court again — in the developer's court. That covers the whys.

In terms of when you should think about performance: pretty much at every stage of the development lifecycle. In development, when you're writing an app, or when you're asked to analyze an existing application, you should do it locally — it's very convenient because you can use any tools you like without worrying about affecting production. You might have a dedicated load-testing phase before you release; you might have automated load tests running as part of your CI pipeline. And even after you're done working on your software, after every deploy you should still keep an eye on performance and see whether you've made things better or worse — how is the app doing? You should monitor your application over time in production to spot any issues.

So now we know the whys and the whens — what should we optimize? Most of you probably have a pretty good idea already: we generally like to measure things like execution time and memory usage, the two main things. It's good to establish the minimum, maximum and average values for both. That's very useful in itself — performance aside, your sysadmins and devops will love you for that information, because when you come to them for the first time and say "hello, I would like you to deploy my application to production, please", and they ask what kind of server you want, you don't want to answer "I don't know". It's very useful to know what your application typically needs to operate.

You could look into concurrency in PHP — like I mentioned, Swoole is one of the frameworks; there's also ReactPHP, there's Amp, and there's the pthreads extension, which has been around forever. And you could of course upgrade your hardware to the latest technology, also known as throwing money at the problem — until you can no longer do it because you've run out of options. Sometimes it's genuinely the valid thing to do: maybe you've gained users over time and the traffic has naturally increased. But more often than not it's just running away from tackling the real problem, until one day you have no other options and you have to face the real thing.

This is my very unscientific graph to try and illustrate the goal of optimizing performance. There are three axes — memory, time and code — and obviously we're aiming for that zero point: using as little memory as we can, having as little code as possible — just the essential code to run our application, no redundancy — and a short execution time.
But there is a caveat to this goal, because that zero point is a bit of a unicorn. Let's not forget that optimizing performance will often compromise the readability and maintainability of your application. Sometimes you come up with a crazy solution that, yes, is faster — but six months down the line even you won't remember why you did things in this really weird way. You'll remember that it made things faster, but it's no longer easily understandable by you or your co-workers. So let's not forget that aspect and that potential danger. I would say optimizing performance is more of a balancing act, and you should stop at a sensible point: don't desperately try to reach that unicorn — make it good enough for production, but also readable and maintainable.

For most people, their adventures in performance start when their memory graph looks like the one at the top. Here you see memory usage over time, and you can see it creeping up, with that seesaw-like shape where memory drops and then picks up again, over and over. That sawtooth is just garbage collection happening in the background — the garbage collector doing its job, cleaning up unused objects and elements from memory so it can be used again. You can see that despite that still happening, the overall memory used keeps creeping up over time, which suggests there's a memory leak somewhere and we can't free all the memory we would like to be freed. Or maybe your graph looks something like this one, which shows request times: generally everything is fine, except for those spikes, and one of them shows a 30-second request time — that request probably hit a timeout and just got killed, and you might be wondering why: what was your application doing, what was it waiting on?
Or maybe you got everyone's favorite, the dreaded PHP fatal error: allowed memory size exhausted — the black (or white) screen of doom. Those are all clear signs of performance issues. Other than memory leaks, here are some other things you might encounter. You might have timeouts — that's just your application being stuck waiting on something. If you're using concurrency in your design, you get all the lovely stuff that comes with it, like race conditions, deadlocks and livelocks. And then there's the everyday redundancy or inefficiency in your code — maybe you're doing the same thing more than once unnecessarily. You might think dead code or unused libraries can't really impact performance, and maybe they don't have a huge impact on runtime performance, but they will affect things like your build time and your deploy time: if you're pulling down libraries, say via Composer, that you're not using, then you still have to wait for them to be downloaded, so your build is slower and your deploy might be slower — and you're also loading that code into memory, so if you're never using it, you're just wasting memory.

So then we do what every good developer does: we go to the Internet and start googling around. There is some sane advice on the Internet — things like: don't evaluate your condition inside the loop; do it once, save it in a variable, and use that in the condition. That's fine. Avoid doing database queries inside a for loop — if you can do one aggregate query instead of a hundred individual queries, that's probably better. That sounds sensible. But then you get things like "use empty() instead of count() === 0", or "use the === operator instead of strcmp() for comparing strings because it's faster", and then my absolute favorite: you should use single quotes, not double quotes, when concatenating.
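The one genuinely sensible tip above — evaluate the loop condition once, not on every iteration — translates to any language. A hedged bash sketch, where the hypothetical `expensive` function stands in for something like a `count()` call in PHP (the names and the number are made up):

```shell
#!/usr/bin/env bash
# 'expensive' pretends to be costly loop-invariant work, e.g. count($items).
expensive() { echo 5; }

# Bad: the condition re-runs 'expensive' on every single iteration.
i=0
while [ "$i" -lt "$(expensive)" ]; do i=$((i+1)); done

# Better: evaluate it once, store it, reuse the stored value.
limit=$(expensive)
i=0
while [ "$i" -lt "$limit" ]; do i=$((i+1)); done
echo "done after $i iterations"
```

The same shape applies to the database advice: pull the invariant work (the query) out of the loop and do it once as an aggregate.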
Another one: you should use foreach, not array_walk. array_walk is a function where you pass in an array and a callback function, and it executes that callback for every element. Some people will tell you not to use it because that's an extra function call per element, whereas a plain loop is just native PHP without that extra call, so it's faster. But that sacrifices readability: reading through the code, if you see array_walk you understand much better what the code is doing than if you're looking at a bare-bones for or foreach loop, trying to work out why we're looping through this and what we're trying to do. And then some people tell you to use regexes for crazy string manipulation because they're faster than the native functions — while other people tell you not to use regexes because the native string functions are faster. So where is the truth?

And actually, coming back to the list of issues you might be dealing with: none of those tips from the Internet help you deal with any of them. What do they all have in common? They're all micro-optimizations. I'm not saying they're things we should always ignore — though 99% of the time, ignore them. Sometimes a lot of those optimizations together can make a difference, even if on their own they seem insignificant or show almost no difference; maybe the aggregate difference amounts to something, especially if you're optimizing a very critical path in your code. Maybe then they're worth it. But usually they're not worth spending time on unless you can actually prove that this is your real bottleneck — and very often it's not. You should always benchmark and measure to prove the gains and actually justify refactoring the entire code base to change the double quotes to single quotes or the other way around. And most importantly, don't avoid fixing the real bottleneck while you waste time on these things.
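To make "always benchmark and measure" concrete, here is a minimal, hedged shell harness that times N runs of a command — the sort of thing you'd use to check whether a micro-optimization actually matters before refactoring anything. It assumes GNU date with nanosecond support (`%N`); on macOS you'd use `gdate` from coreutils.

```shell
#!/usr/bin/env bash
# Run a command N times and print the total elapsed milliseconds.
bench() {
  local n=$1; shift
  local start end
  start=$(date +%s%N)                               # nanoseconds since epoch
  for (( i = 0; i < n; i++ )); do "$@" > /dev/null; done
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))               # convert ns -> ms
}

ms=$(bench 100 true)
echo "100 runs took ${ms} ms"
```

Run each variant of your code through something like this, compare the numbers, and only then decide whether the "faster" version earns its keep.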
Just to show you some stats — these are taken from an online PHP benchmarks website whose author runs benchmarks of the different functions. Here is the single versus double quote comparison in the top two rows: you're only saving about four milliseconds — that's a drop in the ocean. In the bottom two rows, the variable and constant on either side of the equals sign — which one goes first, is one faster than the other? No difference. preg_replace is only three milliseconds slower than str_replace — again, a drop in the ocean.

All right, so if not micro-optimizations, then how do we identify the real problems? The "how" is actually what most people get stuck on. The general answer is of course "measure" — but then how do you measure, what do you measure, and how often do you measure? I was there myself, so I thought it would be useful to do a little demo — a fake demo, because it's all screenshots — basically a walkthrough of how you might actually go about this.

Let's start with the tools. They're potentially the first confusing or overwhelming thing when you're starting out, because there are a lot of them and they all do slightly different things, so I've grouped them here roughly by what they allow you to do. At the top we've got the general monitoring tools. They're just for keeping an eye on things, or alerting you when things have changed: they give you things like your logs, your request times, your CPU usage over time, your app's uptime, and so on. They give you the bigger picture about your application, and you might be able to tell from those graphs whether there's anything to worry about — a spike in memory, or memory creeping up, might be a hint — but they won't really tell you what's going on inside the app.

The next group is tools which help you simulate load, for when you have no production data available or you're not profiling in production: the load-testing and benchmarking tools. There's a whole load of them — one for every language under the sun. Some of them are super simple, some are complicated. Some will just let you point them at a given URL and do ten thousand requests; some, like Gatling and Taurus, are a lot more sophisticated — they allow you to vary the load and specify different parameters, and rather than doing requests one by one they'll simulate real-life traffic, spikes and all. Even a simple bash loop — a curl call in a loop — is a very quick and dirty way of generating load. So use whatever you want; just avoid sitting there refreshing your page manually, because there's no need for that when you can easily automate it. My personal favorite is go-wrk: a very simple Go tool where you just give it a URL and specify some params.

Those monitoring and benchmarking tools will give you the higher-level idea, but to look inside your app and actually work out what's going on under the hood, you need to dig deeper. You need more detailed information: stack traces, call graphs, what's in your memory at any given time, the execution times for specific portions of your code. For that you need a profiler — and then usually a visualization tool on top of it, because the raw output of a profiler is just a string of nonsense that isn't really human-friendly. There are a few really good off-the-shelf, all-round solutions: Blackfire is probably the most well known; Datadog and New Relic too. They all do a great job of giving you your performance metrics and your graphs, and they'll alert you if something changes, out of the box.
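The quick-and-dirty "curl in a loop" idea mentioned above is about three lines of shell. A hedged sketch — the URL and curl flags in the comment are placeholders for whatever your app exposes; substituting `echo` shows the mechanics without needing a running server:

```shell
#!/usr/bin/env bash
# Fire the same command N times — the crudest possible load generator.
# Real use would look something like:
#   hammer 10000 curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/
hammer() {
  local n=$1; shift
  for (( i = 0; i < n; i++ )); do "$@"; done
}

hammer 3 echo "GET /health"
```

It won't model realistic traffic the way Gatling or Taurus can, but it beats refreshing the page by hand.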
The setup is usually minimal: you just install an extension, and a daemon runs in the background collecting the data. I'm totally not against them, but I'm not going to focus on them in this talk, because sometimes you can't use them — I was there myself. For starters, they're all paid, so if your company isn't willing to pay for them, you're kind of stuck. And what if you want to profile locally — what if you don't want to send your data to a third party? Companies are sometimes limited in where they can send data, so you might want to keep it to your local machine or your own infrastructure. The good news is that there are tools you can set up locally that achieve the same effect, and that's what I'm going to walk you through today.

So how does the tooling landscape look in PHP? To start with, none of the tools are built into the language itself — which is why I put a little elephant and gopher logo at the top: with Go, for example, all the tools come with the language itself, whereas with PHP you have to install them separately. But that's not the end of the world, because they're usually pretty easy to install, usually via package managers. You could also roll your own: the simplest quick-and-dirty benchmarking you can do is to capture the start time and the end time and subtract the two, and you roughly get an idea of how long something took. But that obviously doesn't scale very well — it doesn't give you data over time and doesn't let you aggregate — so it's really just a quick-and-dirty option.

A quick recap, just on the slides, of how you install extensions in PHP, for all the tools: my favorite way is to use a package manager if you can — brew, apt-get, or whatever your flavor is — and most of the time you can get away with just doing that. If not, I've posted these slides online already, so this is just a cheat sheet you can come back to later.
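As a rough cheat-sheet sketch — package and extension names vary by platform and PHP version, so treat these as examples and check each project's README rather than copying them verbatim:

```shell
# Profiler extensions via PECL (then enable them in php.ini):
pecl install xhprof                # an XHProf port published on PECL
pecl install xdebug                # Xdebug

# Visualizers via your package manager:
brew install qcachegrind           # macOS
sudo apt-get install kcachegrind   # Debian/Ubuntu
```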
There's always a README in all of the projects, so always check that first. Anyway, now that we know how to install extensions, let's go get ourselves a profiler. We have a few choices here: the go-to ones in PHP are usually Xdebug and XHProf, and you might be wondering what the difference is, or which one to pick. These days they're both installed as PHP extensions, so there's no difference in how you set them up; I think the history of the two extensions explains the difference and might help guide your choice. In the old days of PHP 5, Xdebug didn't have memory profiling built in — it just had the debugging side — so you would use Xdebug for debugging, but if you wanted memory profiling you would use XHProf. Then, with PHP 7, Facebook stopped maintaining XHProf because they switched to HHVM, and a few people picked it up and upgraded the original XHProf from 5 to 7. I've put links to three ports of XHProf here: one of them is optimized slightly for PHP 7 with a slightly easier setup; one is Tideways' — the UI is completely different and they've modernized the original; and the third, I think, is just the original port as it was. But Xdebug — since version 2.6, I think, and on PHP 7 — now has memory profiling built in as well, so in theory you can now use Xdebug for all your profiling needs. The exception is profiling in production: Xdebug will make your app a hundred times slower, so you should never use it in production — always look for something else there. XHProf used to be the go-to tool if you wanted to profile in production, and sometimes you have to. But there's a new kid in town called phpspy — a project that came out of Etsy, released in April this year, I think, so it's very, very new — still very much alpha or beta. It only runs on Linux and the setup is very raw.
You might have to do some magic to get it going, but it's very promising, because it promises a footprint so small that you can run it in production — it kind of promises to be "Xdebug, but good for production". It can also produce flame graphs, which is really cool and very exciting, because until now there hasn't really been an easy way to do that in PHP — I'll talk about flame graphs at the end. I've also listed php-meminfo, an extension for taking a look at what's inside your memory while your application is running.

All right — to work out what your diagnosis is, you need to do some diagnostics, and guessing definitely isn't enough. I tried to find a more real-life example of an app that I could go and optimize, because I didn't want to use foos and bars — that's not what real life is like, and everyone can optimize a simple foo-and-bar application. A while ago a friend of mine, Matt Brunt, tweeted about his side project called Cigar, and it turned out to be this lovely app written in PHP 7. It's basically a smoke-testing tool: you provide an input file with a list of URLs, and for each URL you say what HTTP status code you expect back. It takes every one of those URLs, queries it, compares the actual HTTP code with what you were expecting, and flags anything that doesn't match. So it's just a very simple tool: it takes an input file and goes off to the Internet. And I'm deliberately leaving it at that, because very often that's what real life is like — you're basically told roughly what something does, but you don't know the details under the hood, and you don't necessarily need to know all the details right now.

So imagine the first scenario: the application is suddenly running slowly. Somebody comes over to you and says:
"Hey, we used to run Cigar in under one second — now it suddenly takes three seconds. What's going on?" Your job now is to figure out why. First things first — and this is something that's so easy to forget; I've done it so many times — you need to capture the baseline, because you need to compare the end result against your initial state to actually know whether you've made things better or worse. So don't forget to capture the baseline, the current state of things.

Let's set up XHProf. This is how you set it up — you don't have to worry about the code bits, because it's all in the README. You basically paste a bit of code at the beginning and at the end of what you want to profile: you can use PHP's directives for prepending and appending files, you can use include files, or you can just copy and paste it around the edges of the code you want to profile. You view the results of XHProf in a browser, so you need a web server set up, with the document root pointed at the xhprof_html output — and then you see something like this in the browser.

We can see some suspects already: we have this Parser class, which calls something called getUrlObjects, which calls a closure — and that closure is close to the top, so it seems to be taking the longest. Then you can click on the callgraph link and view the call graph, which looks something like this. XHProf tries to be helpful by showing you the hot path through your code, and we can see that it finishes on the closure. But you can't really tell from this graph what's inside that closure — what is it about it that makes it so bad? At least it gives you an idea of which area of the code to look into. So how do you find out what's going on? We could obviously go and read the source code and maybe work it out that way, but let's actually use Xdebug for that.
To set up Xdebug you mostly just have to set a few PHP directives — again, this is all in the Xdebug setup docs, so you don't have to worry about copying it down. We also want to install a visualization tool, because Xdebug just outputs a bunch of files that aren't really easily understandable. I like to use qcachegrind on Mac — I think it's called kcachegrind on Linux. This tool is very old — it looks like it's from 1995 — but it's actually the most powerful tool I've found for reviewing profiling data. So we run Cigar with Xdebug enabled — Xdebug runs in the background, and its output is saved as cachegrind files. Then we open up the file and get something like this. It looks pretty old, but it's actually pretty powerful, so I'll zoom into the different areas of the tool in the next few slides.

At the top you have a drop-down for time or memory — let's select time to start with. It basically lists the functions executed, in order of which took the longest. We can see that array_map is close to the top — so we're already at the very low level of the PHP code — and we also see the familiar Parser class. Then on the right-hand side you've got something called the callee map, which is essentially a pyramid graph. The first time I saw this I thought, how do I read this? You might assume you should focus on the areas with lots of small squares stacked on top of each other, because that's clearly the busiest part of the program — but it's actually the opposite. What you look for in this graph is the big empty squares and spaces, because that means that, proportionally, that big fat green square is what took the longest; lots of little squares doesn't necessarily mean busy. And if you hover over the green box — I've zoomed in here — you see the Parser class. Xdebug and qcachegrind also give you the call graph. It looks a bit different, but it's essentially the same thing.
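The "few PHP directives" for Xdebug's profiler can also be passed ad hoc on the command line for a one-off run. A hedged sketch — these are Xdebug 3 setting names (version 2 used `xdebug.profiler_enable` and friends), and the script path is just an example:

```shell
php -d xdebug.mode=profile \
    -d xdebug.output_dir=/tmp/profiles \
    -d xdebug.start_with_request=yes \
    vendor/bin/cigar
# Xdebug writes /tmp/profiles/cachegrind.out.<pid>; open that file in
# qcachegrind (macOS) or kcachegrind (Linux) to explore the run.
```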
It's saying the same thing as XHProf: the closure is taking a long time. Debugging closures can be especially tricky, because you don't get a function name — a closure is a nameless function — and if you have a few closures in your code it can be hard to narrow down exactly which one you're looking at. If we compare the two trees, they look the same. They're two separate runs, which is why the timings differ very slightly — 79% for the parser in one versus 81% in the other. That's something to keep in mind as well: every single run will give you slightly different data, and you usually just average it out, so slight differences like this are usually fine to ignore.

Then we can select array_map in qcachegrind at the top and view a list of its callees — and there's our closure again. The cool thing about qcachegrind is that you can jump straight into the source code, so you don't have to switch tools. And it looks like there's some weird stuff in this array_map callback — it looks like somebody put in some code that calculates pi in a very inefficient way, to make this run for three seconds. (I put that into the getUrlObjects function just for demo purposes — it's not in Brunty's original code.) So we remove that code, and ta-da, we've improved things as expected. And always remember to compare against the baseline, against what you had at first, because that's your actual proof: now, when you send someone a pull request with this fix, you can say "hey, look at the data — I'm proving to you that this is the right thing to do", and the person reviewing your code doesn't have to wonder whether it actually is. After removing it, if you run XHProf again, Parser doesn't even show up on the new call graph, and we have a new hot path indicated.
If you zoom into the new hot path, it seems to be mostly Guzzle stuff — and that's not necessarily alarming, because you know from what I've told you that this app mostly makes calls to URLs. If Guzzle is proportionally taking the longest, that kind of makes sense, so it's not really suspicious any more. However, it's important never to just ignore library code. Just because it's Guzzle, and you didn't write it — somebody else wrote it and it's used by millions on the Internet — doesn't mean it can't have memory leaks inside. It happened to me: a metrics library we used all over the place actually had a memory leak. Just because something is third-party code or framework code doesn't mean it's without problems — so if you see it popping up in your hot path, even if it's third-party code, don't necessarily ignore it.

So that was a good first step; we made things better. What more can you get out of Xdebug's output and qcachegrind? You can see how many times every function was called throughout the lifecycle of your application, and you can click around to see what called it. For example, here I've selected strtolower, and it was called 123 times overall. You might be wondering why we're calling this 123 times. Well, you can click on the callers, and you can see that Guzzle seems to be using it a lot in calls to do with request headers — so you're probably thinking it's lowercasing all the headers for comparison or something. This sort of exploration gives you a really good idea of what the code actually does under the hood. What else? We see fopen with nine calls. I've told you about one expected call — opening the input file to read the list of URLs — so where are the other eight coming from? Well, it looks like maybe Guzzle is saving some temporary files.
some temporary files because again if you click on the list of colors we can see one at the bottom from our parse class but then the top two are is guzzle stuff again so you might be guessing well maybe it's like saving some temporary files and again if you click on the source code yep that's exactly it it's saving into the slash slash temp location so you can really dig down into what the app is doing even the third-party libraries together to get a good understanding of what's going on so let's flip the drop down to memory at the top and let's come back to the get file contents function so we saw that it uses F open and I've pasted the original bit of code on the left so if you click on the of the on the colors of F open you can see the get file open function and it calls three things F read F open and file size so it seems like it's taking quite a bit of memory per call like over eight eight and a half thousand bytes so I'm wondering or you might be wondering if your if you know you're always only reading from those files like they're kind of read-only the input file is read only would it be better to use the get file content if I'll get contents function and would that be quicker and sure enough if you profile this again with X debug just changing the function it uses less memory it uses 496 bytes instead of 600 so obviously this is a very silly micro optimizations here but I just wanted to show you like if you're ever wondering if this function or this way of doing things is better than that way then this is a really great way to prove or disprove your theory and also get a very quick feedback loop because all you have to do is just edit the code run your application again X debug is just running in the background output you file you opened a new file in queue cache grind and you've got the new data so it's a really quick easy feedback loop we don't have to wait for something to deploy and so on and so forth so what if you want to do some general explai 
So what if you want to do some general exploration — like, how do you check whether your program is leaking memory? It would be nice to be able to see into the memory and see what's in there at the end of the run. For that I tend to use the php-meminfo extension. It's fairly simple to use: you just add one line at the end of the file, at the point at which you want to stop and save the memory data, and it saves the data as JSON files, so they're a bit more understandable. php-meminfo also comes with a very useful analyzer command-line tool. You can start with the summary command, which is very useful for spotting at a glance any objects that you wouldn't expect to be in your memory at the end of a run, or at the end of the profiling, and it gives you an exact count of instances as well, so you can see exactly how many objects of every type are there. If there was a memory leak, you'd just see loads of instances of classes that you wouldn't expect to be there, or a suspiciously high number of them. And it's actually really hard to reproduce a memory leak in PHP 7: I tried all the usual tricks, like making circular references between objects to make them stay in memory, or having open file handles or open buffers, and none of it works — PHP 7 is really clever at working out what you're trying to do and stopping you. So, just for the sake of the demonstration, let's have a look at what we have in this list, and there's our favorite Parser class again. Let's have a look at the source code. This is the main file, the entry file into the application; it's procedural, we're not into the whole OO thing here, and this is where we create the Parser object and kick it off. So it looks like we create a variable called parser, and it seems to stay with us until the end of the execution: it's in the list, and there is one instance of it. So you might be wondering why it's still there, when we know we only use it to read the file and then we never touch it
again, so you'd expect the garbage collector to probably just clean it up, because it's never referenced afterwards. Well, let's find out. You can use the query command with the analyzer tool, and you can filter by class. If we filter by the Parser class, it tells us things like the memory ID of the object, and that the execution frame is global — which makes sense, because we're in the procedural entry file, so we're in the global space — that it's the root object, not a child object, and that it has two object handles, i.e. two references. So let's dig deeper and find out what those references to this object are. To do that you can use the ref-path command: you give it the memory ID of the object, and it produces a pretty-print of all the references. Here we have the two references: one of them is from the global context, which makes sense, and one of them is a self-reference, which every object in PHP has — every object references itself. So that explains why the garbage collector might not necessarily want to touch this variable: it's referenced by the global space, and because PHP is an interpreted language — you don't compile it upfront — the interpreter has no idea whether the parser object will or will not be referenced later; it just looks at the code as it goes along, roughly speaking. So because it's referenced by the global space, it's probably just too scared to touch it and leaves it in there. It could also be that this app runs so fast that the collector just doesn't get to it, because there's no need to be particularly worried about cleaning memory — it's not a particularly memory-heavy application — but in theory this would be the academic explanation for why it might not be removed. And if you had a lot of objects which were referenced by the global scope, that could then result in a memory leak: those objects would never be cleaned, they would just stay in memory forever and could grow over time.
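The whole php-meminfo workflow described above — dump the memory state, then dig into it with the bundled analyzer — can be sketched roughly like this. The dump path and the Parser class name are just examples, and the hex memory ID is a placeholder for whatever the query command actually prints:

```php
<?php
// Dump the current state of memory to a JSON file at the point where you
// want to inspect it (typically the end of the run). Requires the
// php-meminfo extension; guarded so the script still runs without it.
if (function_exists('meminfo_dump')) {
    meminfo_dump(fopen('/tmp/meminfo_dump.json', 'w'));
}

// Then explore the dump with the analyzer CLI, outside this script:
//
//   bin/analyzer summary /tmp/meminfo_dump.json          # instance counts per class
//   bin/analyzer query -f "class=Parser" -v /tmp/meminfo_dump.json
//   bin/analyzer ref-path 0x7f8acd8e4240 /tmp/meminfo_dump.json
//
// The hex address is the object's memory ID as printed by the query command.
```

That's the entire integration surface: one function call in your code, and everything else happens on the command line against the JSON dump.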
So that's something to watch out for. Out of curiosity, I've looked at the globals ID as well, just to see what's in there, and you can see all the familiar things, like _POST and _GET. It's all there in memory; you can see it all, get the IDs of those things and then query those, so you can really see what objects are there in your memory — which I thought was really cool for a teeny tiny extension. So, can we make that Parser object go away somehow? We know we don't really need it once we're done reading the config file, so can we somehow make it go away? Well, we can use a little trick: we can access a member of the newly created object in just one expression, avoiding assigning it to a variable at all. If we do that and rerun the application with php-meminfo in the background, there is no Parser object at the end. So this is a handy trick to remember if you're creating lots of objects in the global scope and you know they're not going to be used again; it's basically a hint to the interpreter: don't worry about this object, don't even assign it. And again, we've got proof — we're not just guessing that this is better than the other way, we actually have the data to prove it, because it removes the object from memory. All right, so before we wrap up, let's look at another tool which might help you work out what's going on inside your application and where most of your time is spent, and these are called flame graphs. There are a few ways to generate them in PHP, some less painful than others — it's not super painless. You can use phpspy, which is more widespread and more stable; you can use that to generate them. For now I prefer the alternative, which is to use xhprof.
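The one-expression trick described above looks like this in code. This Parser is a stand-in class I've written for illustration, not the app's actual one:

```php
<?php
// Stand-in for the talk's Parser class: reads a list of URLs from a file.
class Parser
{
    public function __construct(private string $path) {}

    /** @return string[] */
    public function urls(): array
    {
        return array_filter(array_map('trim', file($this->path)));
    }
}

$path = tempnam(sys_get_temp_dir(), 'urls');
file_put_contents($path, "https://example.com\nhttps://example.org\n");

// Before: the object is bound to a variable in the global scope, so it
// stays referenced (and in memory) until the script ends.
$parser = new Parser($path);
$urls = $parser->urls();

// After: construct and use the object in a single expression — it is never
// assigned, so nothing keeps it alive once the expression has been evaluated.
$urls = (new Parser($path))->urls();

var_dump(count($urls)); // int(2)
unlink($path);
```

With the second form, a php-meminfo dump taken at the end of the run no longer shows a Parser instance hanging around in the global frame.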
You run it with sampling enabled, and then you take the output and put it through a script — I've linked to it here — that translates it into the flame graph format, and then you pass that through Brendan Gregg's Perl script, which generates the flame graph. Brendan Gregg is the original author; he invented flame graphs. So flame graphs look like this — you get something like this — and you're probably wondering, like a lot of people, how do I read these, how do I understand these? If you imagine that thing cut in half, that's exactly what I did here, to zoom in a little bit. A lot of people are just not sure how to interpret this, and they immediately jump to the conclusion that it tells you about time, and it's completely not that. So how do you understand them? I thought I'd try to explain it by describing to you how I go about my day. If I was to describe what I do during my day, it might be: I'll wake up, I'll shower, I'll get dressed, then I go to work; sometimes I might take our office dog, Bingo, for a walk; and in the evening I might run, I might watch some TV, or I might go and see some friends; and then when I wind down, sometimes I go straight to sleep, sometimes I read a book before going to sleep. So if you were to present this in a graph somehow — to work out what I'm most often found doing, or maybe whether there's anything I do too much of and should do a bit less of — how would you present it? Well, imagine I had a profiler following me during my day, and every minute it would take a snapshot of what I'm doing. I would essentially get stack traces of my day. A stack trace is essentially a chain of functions, from the start of the execution, going through all the functions that have been called so far, and ending with the function which is currently running on the CPU — so it's what the profiler found on the CPU
when it took a snapshot. So you can imagine that I would end up with potentially hundreds of thousands of those stack traces at the end of my day: if the profiler is running every minute and I take a shower for 15 minutes, I'll get 15 identical stack traces finishing with the shower, so quite a lot of them will be very similar. So if you imagine having thousands of these, how do you group them together? It would be very useful to group them and identify the common ones, but you're obviously not going to do it by hand, and that's essentially why Brendan Gregg invented flame graphs: he ended up with thousands and thousands of stack traces from a MySQL database application, looked at them and went, how do I make sense out of this? And then he invented these. The flame graph generation algorithm essentially does this for us: it takes all the stack traces, groups the same ones together, figures out the common paths, and then figures out where things branch off into different functions. So it produces something like this — it basically just dedupes all the stack traces while keeping in mind how many of each there were overall. So again, you get something like this, and if I cut it in half and zoom in, I get this. Side note: you can use emoji for PHP function names. These graphs are interactive: you can click around on the diagram, you can highlight things, you can select just parts of the graph and zoom in, and there's also a search bar so you can search for a specific function. Let's just flip over to actual function names, because the emoji are a bit confusing. The first thing you might notice is that the functions in every row are listed alphabetically from left to right. You read this graph bottom-up; you can also reverse the graph to get the icicles, and then you would read it top to bottom, but the traditional one you just read bottom-up.
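That grouping step is simple enough to sketch: fold each sampled stack into a single semicolon-joined line and count the duplicates. The function names come from the day-in-the-life example above, and the output is the "collapsed" stack format that Brendan Gregg's flamegraph.pl turns into the SVG:

```php
<?php
// Each sample is the chain of functions on the stack at that instant,
// from the entry point down to whatever was running on the CPU.
$samples = [
    ['main', 'get_ready', 'shower'],
    ['main', 'get_ready', 'shower'],
    ['main', 'get_ready', 'get_dressed'],
    ['main', 'enjoy_evening', 'watch_tv'],
    ['main', 'enjoy_evening', 'watch_tv'],
    ['main', 'enjoy_evening', 'read_book'],
];

// Group identical stacks together and count how many times each occurred.
function collapse(array $samples): array
{
    $counts = [];
    foreach ($samples as $stack) {
        $key = implode(';', $stack);
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    ksort($counts); // flame graphs order siblings alphabetically, not by time
    return $counts;
}

// Print in the collapsed format consumed by flamegraph.pl:
//   ./flamegraph.pl collapsed.txt > day.svg
foreach (collapse($samples) as $stack => $count) {
    echo "$stack $count\n";
}
```

The counts are what give each box its width in the finished graph; the alphabetical sort is why the x-axis has nothing to do with time.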
So here in the top left I have my day, and then I might enjoy my evening, and so on and so forth, if you read upwards. And at the bottom, in the bottom half, you can see that again I start with main, I start the day, I get ready, and then that calls get_dressed, and that calls something called faff_about — which I might not even have known was there, but that's what the profiler found. That's how flame graphs aid your exploration: you might spot functions that you didn't even realize were called. And then I have an overall percentage of how much faff_about was found, and it was 14%, so based on that I could say, well, 14% of my day I spent faffing about. So it really lets you see easily the upstream and downstream functions, the dependencies between functions. The sleeps at the top — I actually had to make this program sleep to get the profiling data, otherwise it was running through too fast and the graph was flat or very unreadable, so the sleeps at the top are just the native sleep function at a couple of points. And the x-axis — no, it has nothing to do with time. A lot of people assume you read it from the start on the left, progressing to the right; it has nothing to do with time, it's all arranged alphabetically, so enjoy_evening comes before get_ready, left to right. This graph is more for working out, percentage-wise or proportionally, how long something took compared to something else. And the y-axis is the depth of the stack, and again, just the fact that the stack is deep — just like with the pyramid graphs — doesn't necessarily mean it's a bad thing or that it took a long time; it just means there were a lot of nested calls, and they all could have been really, really quick. So the depth of the stack has nothing to do with time either. And the colors don't mean anything either. A lot of people think that the red boxes are the bad boxes; it's
just a completely random choice of colors: he just went for the warm-tones palette, and the only thing the flame graph algorithm will try to do about the colors is make sure that no two adjacent boxes have the same color. It's just working out how many colors of a certain palette it needs so that every adjacent box gets a different one. You can switch to cold colors if you prefer. So again, red boxes do not mean anything. Now, this example is a little bit artificial and a little bit simplified; normally it would be a lot more varied, and a real-life example looks more like this. Again, like I said, in the browser it's all interactive: you can click on every box, you can zoom in. This, I think, shows the actual MySQL application with all the functions that MySQL calls; sometimes you have operating system functions show up there as well, but there is a search box, so you can just search for your function and start from there. So, hopefully this all seems a little bit less scary now. Performance optimization can be a little bit tedious at times — sometimes it will take you a few days — but it's very rewarding at the end of the day, especially when you get the results that you wanted. So, some key takeaways. For me, when I look at a problem, I like to start with this question: do I really need to do this? It's so easy to forget; it's so easy to just jump right into the optimization part. But while doing something faster is better, not doing it in the first place is the best optimization you can ever make. So don't optimize prematurely; think instead: can I simplify this, can I just get rid of this entirely? This applies to dead code, operations that you only need to do once, maybe HTTP calls you can save — anything to make the code simpler, because simpler code is a lot easier to optimize. Prioritize what to focus on: look at the gains versus the effort, and basically improve the most important things
first, and ignore the micro-optimizations for the most part. Remember that design changes — overall design changes to your application — will very often trump the individual code changes that you make, so sometimes you might actually have to take a step back and redesign the application, for example to make it concurrent. The hot paths are usually the things to focus on, but not always, so take them more as a hint, not a given. Customize the optimization you're doing for your own needs, and don't be afraid to break the rules if that fits your use case; there might be some more unconventional ways of coding something that might actually be faster for you. It's just like with denormalizing databases: in theory you should get to third normal form, but if you're doing a lot of reads and you're always joining the same tables, you're probably better off denormalizing and doing it that way. So, just like that, with optimization, don't be afraid to break the rules if it works for you. Use the data to prove your guesses: measure everything, but keep in mind to do it with minimal impact on performance, especially if you're doing this on production. And remember that optimizing performance is an iterative process. It's not enough to do it once; you need to know your trends over time. Sometimes things only come out in the wash on production, after your application has been used for some time, so you should aim more for iterating over your solution over time and adapting it. And whenever you upgrade your infrastructure or your components, you should redo all your performance testing, because things might have changed between versions: things might now be faster or slower, maybe some of the optimizations or workarounds that you put in before don't apply anymore, maybe they're not necessary anymore — a lot of things go away when you're upgrading from PHP 5 to 7, for example. Your hardware might have changed, your configuration, how many cores
you have — all of that might have changed. So have monitoring in place, however you can make it happen, and try to have some alerting as well, based on those trends, so that you don't have to wait for your customers to tell you that something's gone wrong. If you can get to the point of having automated tests, then again, that saves you a lot of manual work and will really help you prevent performance degradation over time. You can even correlate your deployment points with the performance graphs — there is a really good blog post by Etsy on how to do this, from a good four years ago — and essentially you get graphs which show you that, say, after this deploy the memory usage started going up, so you know exactly what you've changed and what you might have to roll back. If you have any questions, feel free to tweet me or grab me afterwards — I'll be around, and at the social — and thank you for listening [Applause]
Info
Channel: Laracon EU
Views: 21,259
Keywords: laracon, laravel, php, laraconeu
Id: hOajLLej68Y
Length: 50min 27sec (3027 seconds)
Published: Thu Dec 05 2019