droidcon SF 2018 - Debugging Application Performance

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hi everyone my name is Carmen Jackson and I'm on the Android performance team at Google and today I'm going to talk to you guys about debugging application performance so like I said I'm on the Android performance team and it's our goal for the phone to seem magical to the users right fast smooth intuitive responsive and for that to happen we need to be on point with the system and we need to be on point with the apps it really only takes one slow app to make a person think that their phone is slow right so it's kind of up to all of us to keep things in shape and maintain androids good reputation but when we say good performance what do we mean first 60 frames per second so 60 frames per second is the magic number for smooth interaction as long as you can get your work done in time to draw the next frame you're good but if you drop one users will notice and they really will even if you just drop one and that's what we call jank and second we don't want our users to have to wait and the most common time they might have to wait is during app startup so we make it our goal to minimize the downtime when switching between apps and minimize the amount of time before that first app frame is drawn at work I spend a lot of time making sure that we keep a background processes or background apps in memory as long as we can so that when that app is restarted it's a fast warm start when we can't do that and an app is killed in the background or of course when it's the first time an app is opened on the system at all we need to make sure the cold app starts are as efficient as possible and a lot of that effort comes from you the app developers and that's really it with just those two goals in mind everything else performance-related just kind of follows so who here has heard of systrace cool who here has used it all right who has discovered a big performance win using systrace all right so that's a little bit less so I'm gonna try to try to bring that number up a little bit systrace is working systrace is the Android performance team's favorite tool to use so we're all really familiar with the code that we write right and we know what it does in general but I've found across the board from apps to the Android platform to the entire system sometimes we're wrong sometimes frequently something is influencing our code that we don't expect so like if you have an a/b test running or an experiment running or you depend on a library and maybe you've updated that library and it has changed or you're calling into a framework API and sometimes we're surprised by the amount of time something takes so like we knew exactly what we were doing but we kind of didn't know it would turn out like that so when we look at systrace systrace shows you a very very precise picture of what's happening all the way down to the individual CPU time slice so sub-millisecond once you're looking at that you can make much much better informed decisions about what's going on on your code and then you can make well-informed trade-offs as well what do you want to do what is worth it to change so just to make sure we're on the same page I'll remind you all what the trace viewer looks like when you open a trace up this is what you see so you're on a time line left to right and you can use the WASD keys to scroll left and right and zoom in and out of your trace these tools in the upper right corner are a little palette you can click the tools or you can use the numbers one two three four on the keyboard to select between them which is kind of the next level wizardry way to do it so if you use the one first tool it's a selector arrow so you can select select things singular or in aggregate and view information about them in this bottom info section tools 2 & 3 are for panning and zooming so some people like to use those instead of WASD they're a little bit smoother instead of jumping and the fourth tool is for selecting a region of the trace so it locks to the edges of the adjacent views to let you kind of get in exact exact representation of the amount of time something takes and you can also use the letter M to lock to the edges of a single view so if you have selected of you and you hit M it will select just that view so when you've selected a section or a view that'll look like this and I use this all the time to select an area of the trace that I care about and then scroll up and down through the trace to see what else is happening during that period of time so as you scroll down each section below this top kernel section represents a process that's running on your device so here at the top we can see the surface flinger process running with pit six to seven and then each row inside that process represents either a thread or a trace counter that's running in that process so today I'm gonna share with you some tricks for investigating app startup performance and I'm gonna focus on startup specifically because it's a little bit easier to draw conclusions about app startup on a general level when you're debugging jank it's gonna have a lot to do with the specifics of your app so you'll probably want to add some trace points in out trace points inside your app to add the context of what you're trying to do inside your app during each frame for app startup the platform level trace points platform level trace points are good enough to give us some good debugging information and really show us what's going on without requiring app specific context so i've redacted any app specific information but I got these startup traces using apps on my personal phone so a lot of you say you've looked at traces before there's a chance you'll recognize your app in here so first thing you do when you're trying to look at an app startup trace is go into your trace and locate the launching section for your app and so that's going to be located inside the system server process and you can find it by using control F to search the page so you'll select that with the one tool and hit M to highlight it and this time represents your startup time until the first frame of your app is drawn so then you'll scroll down into your process to the UI thread of your process and that's what it'll look like down here this is generally what I expect to see from an app that just has a straightforward simple app startup so first we'll see bind application and this pink section inside bind application is going to be the system opening up your APK and loading in your Dex files and then over to the right we'll see activity start activity resume the choreographer laying out your first frames and then the time timer continues until the first frame is successfully sent from your app so you can see that this um f in the green circle represents the first frame being successfully sent and so the first thing that you do when you are taking a look at your app startup or really when you're taking a look at any trace that you're trying to evaluate the performance of any area is to examine the critical path just look at the top level what is being traced what is happening here so in this case here are two different app startups that are actually starting extra activities in the process of their startup so this is actually super common and the question you need to ask yourself as an app developer if you're doing this is is it worth it so the first one adds two hundred and thirty-five milliseconds to their app startup starting this first activity second one is adding a hundred and sixty six milliseconds starting that first activity there's a few reasons we might see apps doing this the first one that comes to mind is showing a splash screen but a lot of apps will also do this to show a permissions page or a login page if they need it but if they don't need it this first activity just turns into a pass-through to the activity the user really wants to see so if your goal is to do a splash screen there's kind of another solution to that called a launch theme and that's a specific theme that only applies when you're launching your app if you want to do a permissions or a login page that's a little bit more complicated but you you know like a module that you code up and a base activity class to do that check and only open the second activity if you need it rather than opening it every time either way if this is a strategy you've decided to use here you can kind of see exactly how much of an impact that's really having you can decide for yourself whether that's what you really want to do so I had to zoom in to the top level critical path in order to see this one but service transactions are also on the top level of your apps critical path so these will also delay the first frame from being drawn and the absolute value of these is pretty small so it's eight milliseconds for the top one and eleven milliseconds for the bottom one but even that a small amount adds up I would fix a 10 millisecond app startup progression and for some services this might be worth it some services you need to start immediately while your app is starting up but for others you might be able to move this start outside of the critical path or after the first frame is drawn with no impact to your users so definitely something else to really consider after you've looked at the top-level critical path when your performance debugging you'll want to dig a little bit deeper and this is a this is one that's super common recently so I wanted to call it out specifically Apache is no longer included in Android peace so inside bind application where you can see your apk and Dex files being opened you might also see Apache being loaded as well so you can see that in the trace it's taking 17 milliseconds and Pia the Pachi library is removed from the default class path so if you're using this it now needs to be loaded in additionally and especially if it needs to be paged from disk oh we can see that it can take a long time the recommendation as it has been for a few years has been to switch to HTTP URL connection and you know I'm really sympathetic if that's not something you want to do because HTTP URL connection is not similar but it might be worth it at this point now that there's a now that there's a performance impact if you're already thinking about app performance then paying attention to what views you're inflating is probably not news to you but examining them with systrace to see what their actual inflation times are can really help you prioritize what's important and this is definitely an area where there's a ton of performance advice floating around and like I've watched all those Colt McAnlis videos and I'm sure it's all good advice but there's no substitute for actually looking at the numbers and seeing what your views are actually using so in both cases here there's a whole lot of views being inflated in the first app this might be a little hard to read this long view that's being inflated is called the welcome screen address bar and there are a whole lot of nested views inside this address bar there's four different sub inflates in there and only the developer would know for sure but when I loaded up this app and I looked at what was drawn on the first screen it did not have this many views in it so there might be something more complicated going on I can also see that app based libraries in this view so it could be a more complicated view that only some parts are enabled for or it could be a view where the developer has a lot of components of the view that are hidden kind of waiting for the user to interact with it in some way but either way here you can see they're making a pretty big trade-off to kind of be ready for that down here in the second view we're seeing a drawer being loaded and so it's a modern drawer view and it's taking up about half of the total time for inflates in this activity start it's taking up about 42 milliseconds drawer views are really interesting they're kind of a conundrum because it's possible for someone to open up your app and then really like slam on that drawer view and open it up right away but drawer views almost always have a ton of child views right that's just the the UI of them and it takes a lot of work to get them ready in advance so whenever we load a drawer view proactively inside activity start we're making a little bit of a trade-off between having every user absorb the inflate time for that drawer rather than have the occasional user who tries to lose it who tries to load it immediately seeing a small delay and that's a that's a hard question to answer but again interesting to note that this is the this is kind of the size of that trade-off when you're looking at views another thing that often jumps out at you real fast is large images and I'm constantly surprised when I open up traces and I see these giant images being loaded so here both of these look like based on the name that their background images the top one is a background search welcome dot web P and the bottom one is background underscore afternoon dot ping ones in activity start the other is an activity resume and it's really possible these app developers or you know the designers they work with really wanted these beautiful detailed gorgeous images in their app startup flow right but 50 milliseconds is objectively kind of a lot of time in terms of app startup and so these developers should be really aware of the trade-off they're making and you know consider this a really good opportunity for optimization here first off you can almost always replace a ping with a web P with no visual change but you'll see often a size improvement often a load time improvement so that would be a good immediate a suggestion to make for this app developer and then second in a case like this I'd really recommend that you play with your image quality you know toggle your compression up and down until you find something that balances an image quality that is what you want to see with the loading time impact that you're going to have and one specific thing I want to call out about large images that I see a lot is that we'll often see them on splash screens and that's a real dilemma right because on one hand the splash screen is just there to bridge the gap between your loading time and your activity being started so on the other hand if you're slowing down and blocking your activity being started by showing a huge detailed picture in your splash screen you're kind of fighting with yourself a little bit there right so uh so that's a really common thing once I've kind of taken a look at what trace points are in there on the top-level kind of on the second level the next thing I'll look at is the state of the UI thread or the thread that I'm trying to debug or performance evaluate so the thread state is this narrow row on top of the threads row in your sisterís you know it kind of looks like confetti with just all these colors everywhere and there's four main thread states that you'll want to know about so the green one is running which means a thread is scheduled on a CPU right now and you can see when you click on a green segment in the thread state when you click on a green thread state up there you can actually see details about that CPU time slice in the information section at the bottom so you can see how long is that time slice running for when it started which CPU it's on the next thread state we're thinking about is runnable so that means your thread is ready to run but it's not scheduled yet sleeping means your thread is not eligible to run at all which usually means your thread is waiting on something like a response from another thread often this is something like a binder transaction so if you take a look at why your thread is sleeping and you track it to the other process often that other process is just continuing and doing the work that you asked it to do but it's always good to kind of double-check that when you see sections of threads that are sleeping and then the last one were thinking about is blocking i/o so blocking i/o means that you're reading from disk if you use the number one tool select a section of the thread you can actually view aggregate data about the thread state and that can really help you get a quick kind of mental image if there's any bottlenecks going on in that section of your code so if you select the whole area and you see that it's always running the majority of the time is running but you still feel like that code is just objectively executing too slow that just means that you need to optimize what algorithms you're using so take a look at the libraries take a look at your data structures and see where you can speed things up that way if you are sleeping a lot you will need to check the other threads that are running to make sure they're doing the right thing to make sure there's no gotchas elsewhere where you're not looking if you see a lot of blocking i/o and I'll talk about this a little more in a second you'll want to look at what libraries and files you're loading and see what you can optimize there and if you're seeing a lot of runnable time the system is scheduling other things ahead of you and specifically I want to call out for a runnable time you should never see a lot of runnable time in a UI thread for a running app but if you're trying to debug performance on a background thread you might see a lot of it and that's a kind of red flag that means you have set the priority incorrectly on that background thread so if you're looking at a background thread it's too slow it's always runnable it's maybe not really as background as you thought it was right it might need a higher priority so here's an example of a case where there's a huge gap where the thread is sleeping and rather than just see like oh it's a binder transaction like the work continues elsewhere and we came back here this this activity resume is seeing monitor contention so when you notice monitor contention in your trace you can click on the trace segment and see an explanation of where exactly that monitor contention is happening in the code so here it kind of looks like our X Java is owning a lock the monitor contention is the same as lock contention so rx Java is owning a lock of some sort and then it looks like this whatever app this is is also using retrofit and there's some kind of monitor contention going on there so these are often actually transient errors so you might not see a lot contention like this in every trace that you take of your app but if it happens once it can happen more and so this would be a flag for the developer to take a look at how they're using these libraries see if there's a way they could like restructure their apps setup to avoid the potential for this contention so here we see a section where the where we're seeing a lot of blocking i/o inside activity starts so 25 milliseconds of blocking i/o almost all at once is a really long time and the best recommendation for how to figure out what's going on here is to add some in-app Trace points to identify what's being loaded into your app at that time and what I'll always recommend often this doesn't produce any results but what I'll often recommend is to start off by doing a little light evaluation of your library usage so if you're not if you're calling a library but you're not really using it or you're just loading it in to call one function or you know do like one little bit of complicated math or something like that maybe there's alternatives or workarounds that will allow you to like completely cut the dependency on this library that you're loading in so it's always worth checking another thing when you see sections of long aisle like this that's always kind of a surprise is if you scroll down here we can see that there's three other background threads on this app that are doing i/o at the same time we have a scheduler for prioritizing the UI thread on the CPU of Android devices and on a CPU we have we have multiple CPUs so we can actually paralyze that work most Android devices only have a single IO channel for doing these requests to disk so sometimes the i/o from a background process can occur ahead of the i/o you're trying to prioritize in the foreground process just because they're competing with that time on your one i/o channel so it's really common to see app developers being proactive about doing background threads especially background view inflation for views that they're not using right away but those view inflation's will still use still access the disk so sometimes even though you're kind of trying to get ahead of the game there you'll start kind of stepping on your own toes by blocking the i/o in your main thread from your background threads this is a this is super common and it's really worthwhile to be thoughtful about when you see something like this can you move this i/o heavy background work to later on in the app startup process can you kind of wait to start those threads until after you've drawn your first frame or a little bit later in the app startup with no impact that might be really worth it to do so there you have kind of an outline of how you might use systrace to analyze your app and try to turn up some performance issues and I want to emphasize that all of these strategies I applied them to app startup but I use the same strategies if I was trying to evaluate the performance of any segment of an app at any time so before I leave you I'll briefly go over how you can get these traces for your own app and the first thing that I want to call out is I've mentioned in App trace points and adding an app trace points is super easy to do so we have the Android OS trace API and it has only two functions in it trace top begins section and traced ends section an inside trace top begins section you add a string that's a label to that trace section and not all just show up as a as a trace point in the in the trace HTML but just like you were looking at before the string that you write will be the label there and this might seem obvious to you but I'm definitely gonna call it out you can add variables to that string and that can be super handy you know actually include the name of the file you're loading or the of the image or stuff like that also you can nest these tres begin section and trace end sections and they will kind of stack as you would expect like a flame graph in the trace that you end up taking so the first way to actually take a trace from your device is called catapult and catapult is actually included in your Android tools but I prefer to recommend that you actually it's open-source so I prefer to recommend that you go ahead and clone the github project so you can get whatever are the actual latest updates to catapult or to the trace viewer and so all you'll need to do is clone it and then run this systrace command at the bottom and this you'll need your phone plugged in this operates via ADB and it will take the trace and output it into the HTML file that you supplied so this giant list of arguments is trace categories and this is actually the list of trace categories that I would recommend for getting relevant information out of your apps so these are kind of the good ones and also I want to specifically call out this - a argument so if you've included in app trace points you'll need to include this - a argument with your package name and that app has to be compiled as debuggable for systrace to see those trace points and that's for security reasons we don't want people to you know if you have accidentally included app trace points in your app and then shipped it other people won't be able to read those because that app won't be marked as debuggable the second way to take traces is to use the on device systrace app and that's located inside developer options at the bottom of the debugging section and so you don't have to be the difference is that you don't have to be connected to a db2 use this so you can open up the app set your trace settings you know add the debuggable applications flag pick categories and so on and you can even add a quick setting this tile to your quick settings panel so that if you are just kind of out and about and you notice your app thing bad you can quickly start a trace record the bad interaction and then stop it and save it so I definitely use the quick settings tile every time once you have finished taking a trace it will show you some notifications you know I'm taking a trace I'm saving a trace and then it will ask you to share it so you can share it through kind of the normal share sheet via email or Google Drive or whatever or you can plug in your phone later and pull all the traces down with ADB pull in data local traces once you've done that this tool generates compressed trace files dot C trace files because these HTML files are like 250 megabytes they're massive so so we'll output these dot C trace files and you'll need to run them through this quick post processing script systrace - - from file to create the HTML files to view them in your browser and last I would be kind of remiss if I didn't point out Android Studios CPU profiler so you can view a lot of the trace info that I just shared with you via this UI in Android studio if you prefer to use Android studio if you're familiar with their profiling tools this might feel better to you it's a different trace viewer but it's viewing the showing the same information from the traces it's a little bit more closely targeted towards showing only your app and only your apps process so it's just a little bit of personal preference whether you prefer Android Studios or whether you prefer the traditional HTML trace viewer which has well maybe show more information but you may or may not care about that information so so here's another option for you for viewing traces and that's about all I have and so I do have a bunch of time for questions if anyone has any so I think your question was it's pretty normal - inside your start activity to fire off your initial rx Java request and does that seem like a problem I don't think I don't know if I've looked at trés for that and I would definitely want to I think the question would be when you fire off that initial rx Java request probably how much work gets done you know setting up rx Java initially and sending out that request and does it get in the way so so I would say I don't know offhand but I would definitely recommend look at a trace and see you know put a trace point around that rx Java work look at the trace and see is that work actually blocking or is rx Java doing the right thing and putting that on a background thread that's operating correctly and staying out of your way does that make sense so in general generally speaking should you defer network requests I haven't generally seen kicking off a network request take a lot of time because that usually does pretty quickly switch to a background thread right so in general my guess would be that you don't need to defer something as simple as kicking off a background network request but like I was saying before you kind of never know and that's that's kind of the main thing that I want to emphasize to everyone with when you're thinking about the performance of your app and when you're thinking about using systrace is you know I think it seems ok but you would really need to take a trace of it look at where that network request is happening see how long it really does take because it's gonna take a little bit of time and you'll probably think oh that's ok but you should know for sure right so so it's probably ok but check on your own thank you that was a good question yeah so um so he's kind of asking when I'm looking at a trace like this let's get a little bit more specific about what exactly are we looking at so find application is one thing that's happening and then anything that's underneath that is kind of a nested function that's happening within bind application so when you look over at the activity start and you see underneath it the inflate the inflate is happening inside the activity start so if you think about um tagging the activity start is tagging the beginning in the end of the activity start function and then anything underneath it is happening during that activity start function does that make sense does that answer your question oh yeah yeah so I think I don't have a super good if I go a little bit so here's a case where like a bunch of stuff is nested right like you're seeing the apk is getting open and then inside that a part of that work is open Dex files from oat you know part of that work is the pink and green underneath that let's see so he was asking um just kind of when we're looking at the nested trace points what are we seeing there so here here's an example of where we're seeing several threads which happened to be in the same process this is what several threads look like and so the top one the UI thread has an activity start trace point and then nested underneath that it has several other trace points that are happening during that activity start these other threads look different because they don't necessarily have trace points that describe the work that's happening on that thread so that's why inside the selected area there's no box with words under it because there's no trace point there to define that area right so so as he was saying when you add a trace point with a string it would come up as an actual box with words so if if this was a section of your app that you added a trace point to and you'd add those words then you would see a box here underneath these orange sections describing what was happening but there doesn't happen to be a system trace point for that section describing what's going on there so all we can see is the narrow thread state all we know about it at this time is the state of the thread which is doing blocking IO so the answer to that is kind of complicated sorry so the question was in the new sister a stools which are in Android studio they'll show the frame which is being rendered and did they show that in this systrace and the answer to that is kind of complicated the short answer is no it doesn't show it in a very convenient way but the longer answer is uh when you are when you're looking at a trace of your activity start so these kind of F's inside the circles represent when your process is trying to send a frame up to to the surface flinger which is the process on the device that consolidates all the frames that are being sent from all of the visible processes you know so your app system UI and so on consolidates those and sends them along to the hardware composer so you can see here your apps sending working on each frame and sending its frames up and you can use those EPS in a circle as kind of a guideline you can also scroll up to surface flinger the process that receives them and look at the work that surface flinger is doing and see it's work per frame so so you can access all of that information inside this trace view the Android studio trace view simplifies it a little bit and makes it a little bit easier to see right next to each other how your apps work relates to surface slingers work and so on does that make sense great in the back yes so so his question is is there a way to get a text file or a JSON file from the SIS trace instead of HTML because he wants to do some processing on the trace instead of you at visually and that is huge that's an awesome idea and yes you can there's actually inside systrace there are two scripts one is called tres to HTML with the number two and one is called HTML to trace the HTML files that get generated actually include the trace file in text inside them so you can use those files to convert to basically take off the HTML wrapper and put it back on so so yes you can definitely access the the trace file as a text file unfortunately whether we can get that in like a more convenient txt file like JSON the answer is no so the trace files have a particular trace format there are open source libraries for processing the trace format one is called trebuchet because it's related to catapult and so if you're interested in looking and doing that I would definitely recommend checking out that trebuchet library for trace parsing come see me afterwards I'll spell it for you trebuchet so with regards to app startup how is your code loaded into memory and how does that affect app startup so over here over here is where your code is initially loaded into memory it doesn't it will and then throughout app startup as it uses so it'll load your apk files load your Dex files here and then throughout activity or throughout the activity start it will load the XML files and things that you use libraries that you call into memory as well sorry did that answer your question about how it's how it loads into memory no sorry it will load your Dex files but it will not necessarily load your entire apk I would actually I would actually need to look that up because I don't think it loads in all of your bytecode I think you can optimize because I think you can optimize out what exactly it loads but I would have to check check on you for that yeah and then it loads the rest later yeah I would have to yeah I would have to do some research on exactly which is loading in here um maybe you're a little bit more familiar with it actually but uh I know if you have multiple Dex files that will need to load in the secondary Dex files later if you if you call a library or you have XML files that are being inflated or loaded in or whatever it also loads them later when you use them so it's not loading in every library or every XML file that you have right away it's kind of loading them when they get accessed yeah it's not going to have a complete answer for either anyone else any other questions I think if I were to do it I would do a background thread I would be nervous about waiting a certain amount of milliseconds you know writing a set amount of milliseconds um but I might I might do it on a background thread that gets kicked off later in the process or even look at a callback for the frames being drawn like do it in a non draw callback or something like that yeah probably one of those two it's been a minute since I actually wrote that code though I have before but I think there are a few different options for for what you might want to do also depending on the the amount of activity that you have kind of at the beginning or the amount of like you might also be able to get away with waiting until the person clicks on the drawer to open it depending on how much whether there will be a visible delay once you factor in the animation at that time so that would maybe be an easier thing to do if your UI can support it so this question was since we only have the one IO channel for reading from disk are there other areas where we're restricted like that where multi-threading won't help us not that I have come across I think there might be a similar contention if you are trying to interact with some other kind of hardware you know like another kind of sensor or something but I see the IAO stuff so often that I people are not generally running into running into that elsewhere so nothing comes to mind can you run these traces on the emulator and if you can is that data meaningful in any way so if you use the Android studio CPU profiler it's actually really easy to take those traces on an emulator the data is like I would characterize it as kind of meaningful like it still lets you figure out what's happening you know it shows you what's happening so if something really surprising is happening you know I didn't realize I was you know calling out to this service or that I was inflating so much stuff or something like that um yeah that's useful but since it's not information from an actual device the actual absolute values of the times are not as useful so so when you're using you know when you're when you are using an emulator you kind of naturally have to take it with a grain of salt understanding that that's what you're doing but I still think it's really valuable to it's always really valuable to look at a trace of your app and kind of map your mental model of how you understood your app to work with what is actually happening in the trace and that's valuable whether or not you can 100% trust the numbers that are that go with it if that makes sense so if thank you for that question so if any of you guys have any more questions or if you want to go over a trace of your app with me I actually have office hours in the Copeland's room in an hour at 3:00 from 3 to 4 so if you have more Citrix questions or general assistant performance questions come over there and we can chat thank you [Applause]
Info
Channel: droidcon SF
Views: 3,001
Rating: undefined out of 5
Keywords:
Id: qXVxuLvzKek
Channel Id: undefined
Length: 42min 11sec (2531 seconds)
Published: Sat Dec 01 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.