Drawn out: How Android renders (Google I/O '18)

Captions
[MUSIC PLAYING] CHET HAASE: Hi. Welcome to our talk, "Drawn Out-- How Android Renders the UI." It was almost called something else. We called it this, and then somebody in some administration position decided it was actually going to be called "How to Optimize Your App for Top Rendering Performance" or something. That wasn't what the talk was about, so fortunately, we changed it back. ROMAIN GUY: And it still isn't. CHET HAASE: Instead, we're going to tell you the details of how, actually, stuff works. I'm Chet Haase. I'm from the Android toolkit team. ROMAIN GUY: And I'm Romain Guy. I'm on the Android framework team, and I do real time graphics. CHET HAASE: And that's kind of what we're talking about today. So we have given versions of this talk before, and we thought we were done. And then we realized that enough stuff has changed inside the system that maybe it was time to go through this again and see where we're at. So this is our attempt to do that. Let's go. So first of all, there is this word, rendering. What do we mean by that? Normally, it means melting fat in order to clarify it. That's not quite what we're going to talk about today. Instead, we're talking about the process of actually turning all that stuff, like buttons and check boxes and everything on the screen, into pixels that the user can look at. And there's a lot of things going on. There's a lot of details that we are glossing over today because we only have 40 minutes to do this. But we'll dump a lot of details on you along the way. So first of all, I'm going to take you through-- there's going to be a bunch of colored dots on the top, and these will be sort of a visual cue for a lot of the rest of the talk. So I'm going to walk through sort of the life of what happens in the flow of information down to the pixels on the screen really quickly. We have this thing called the choreographer, and that kicks in usually 60 times a second, and it says, hey, Vsync, this is the interval at which the frame is being synced. Buffers flip onto the screen. It's a good time for us to process a lot of information, and then handle rendering that information as a result of that. So we get a Vsync operation that's sent up to the Java SDK lands, and we're on the UI thread. And all of a sudden, we need to process input events, which can trigger changes in properties. We also run any animations. So we change property values. Again, that may trigger things like layouts and invalidation. We do, then, the whole traversal pass, which is measuring views to figure out how large they are, laying them out, which is actually positioning them where they need to be, and then drawing them. Once all of that information is done, we sync the result of that information over to this thing called the render thread. And the render thread takes that and says, OK, well, I'm going to then execute these. I'm going to basically turn these into native versions of all that information we produced at the Java layer, and then I'm going to get a buffer from the GPU, so that I have a place to write this information. And then I'm going to actually issue all these GPU commands as OpenGL stuff over there. And then I'm going to say, OK, it's time to swap the buffer, and then I turn it over to the GPU. And then the graphics system does something called compositing, and we're going to talk about most of these steps today. ROMAIN GUY: So compositing is, I think, something we've never explained before. 
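The per-frame tick Chet just walked through can be observed directly from application code with the Choreographer API. A minimal sketch in Java that simply logs the interval between Vsync-driven callbacks (the class name and log tag are made up for illustration):

    import android.util.Log;
    import android.view.Choreographer;

    // Logs the interval between the Vsync-driven frame callbacks described above.
    public class FrameLogger implements Choreographer.FrameCallback {
        private long lastFrameNanos;

        public void start() {
            Choreographer.getInstance().postFrameCallback(this);
        }

        @Override
        public void doFrame(long frameTimeNanos) {
            // frameTimeNanos is the Vsync timestamp this frame is produced against;
            // input, animation, and the measure/layout/draw traversal all run off this tick.
            if (lastFrameNanos != 0) {
                Log.d("FrameLogger", "frame interval: "
                        + (frameTimeNanos - lastFrameNanos) / 1_000_000.0 + " ms");
            }
            lastFrameNanos = frameTimeNanos;
            Choreographer.getInstance().postFrameCallback(this); // keep observing
        }
    }

On a 60 Hz display the logged interval hovers around 16.6 ms, which is the frame budget the rest of the talk keeps referring to.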
So we're going to go in a little bit of detail about this part of the Android rendering architecture. CHET HAASE: So the little colored dots on the top of the screen will be noting where we are in the process, as we work through a few examples, so that we can understand this better. Speaking of examples, here's a simple one. So let's suppose that we have a user, and the user clicks on an item. I wrote this awesome RecyclerView application that looks exactly like this. I know it is because that's a screenshot from my awesome application. It's a RecyclerView with a bunch of items in it, and when the user clicks on one, this amazing thing happens. It turns into a random color on the background. It's incredible. I could give you the source, but I don't know. It's pretty complicated. I'm not sure you'd understand it. So here's the amazing layout for my amazing demo application. There's a ConstraintLayout. There is a RecyclerView inside of it, and then I populated it at runtime with a bunch of random items in there. The view hierarchy for this thing looks basically like this. In fact, it doesn't look basically like this. It looks exactly like this. So you walk down from the DecorView and you've got a LinearLayout and a FrameLayout. I'm not exactly sure why we have the deep nesting there, but whatever. History. We have a bunch of stuff for the action bar in there. None of that really matters. What we're concerned about here is what's actually going on in the content hierarchy, because that's what you can affect with your application. So we have the content FrameLayout, we have the ConstraintLayout on the outside, wrapping the RecyclerView, and then all of the items. Specifically, these are the items that are on screen, because those are the only ones that are actually being measured and laid out and drawn. So what happens? Let's walk through this example and walk through that entire flow that we went through at the beginning. So the user clicks, there's a Vsync operation. That gets sent up and we process input during the input phase, and we notice that this is a click. I'm glossing over some details here. Actually, we're going to notice first that there was a down, and then there was an up, and then it gets processed as a click. Just take it for granted we're eventually going to process a click here. That ends up in this itemClicked method that I have in my amazingly complex example, and in there, we're going to set the background color on this item to a random color. That's why I called the method random. That gets sent over to setBackgroundColor in View.java, which does a bunch of stuff to set the color on the background drawable, and then it eventually calls this method called invalidate. Invalidation is the process-- it doesn't actually redraw the views. It's the process of telling the View Hierarchy that something needs to be redrawn. So you get a click. That happens on the item down at the bottom. So that item two-- you see it's surrounded by green. We have a little invalidate method that gets called on that, and that basically walks up the tree. It calls a series of methods all the way up the tree, because the view knows that it needs to be redrawn, but it actually needs to propagate that information all the way up the hierarchy, so that then, we can redraw all the things on the way down, later. So we call invalidateChild all the way up the hierarchy. That eventually ends up in a massive class that we have called ViewRootImpl.java, and we have this invalidateChild method there.
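To make the click example concrete, a handler in the spirit of the demo just described might look like the sketch below (this is illustrative, not the talk's actual source). The interesting part is that setBackgroundColor() ends in invalidate(), which only marks the view dirty; nothing is drawn until the traversal runs later in the frame.

    import android.graphics.Color;
    import android.view.View;
    import java.util.Random;

    // Illustrative click handler: change the item's background to a random color.
    public class ItemClickHandler implements View.OnClickListener {
        private final Random random = new Random();

        @Override
        public void onClick(View item) {
            int color = Color.rgb(random.nextInt(256), random.nextInt(256), random.nextInt(256));
            // Updates the background drawable and calls invalidate() internally,
            // which propagates up to ViewRootImpl and schedules a traversal.
            item.setBackgroundColor(color);
        }
    }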
And that basically does nothing except says, OK, I need to schedule a traversal, which means, OK, I've taken information that somebody got invalidated somewhere. That means that I need to run my traversal code later at the end of this process. Traversal is the process of doing all the phases that are necessary for actually rendering that frame. Specifically, it does measure-- how big the views are, layout-- setting the views' position and size, and drawing the views. All of that is called traversal. So we've scheduled a traversal. That's going to happen at some later time, and that later time is now. So in the same frame, we end up in the traversal code in this performTraversals method. It's going to do a PerformDraw, which ends up calling a draw method onto core view, and basically, that's going to propagate all the way down. So the draw method actually ends up in an optimization that we implemented back in Honeycomb called Get DisplayList. So a DisplayList is a structure that stores the rendering information, right? So if you look at the way the button code is written or view code in general, we call graphics commands in Canvas like DrawBackgrounds, DrawDrawable, DrawLine, whatever. But these end up as operations in a DisplayList. This is a compact way of representing those operations, as well as the parameters to the operations. So we call Get DisplayList. The decor view did not change, in fact. So it's going to say, well, I didn't change, but I can certainly get the DisplayList for my child. And then on all the way down the tree, until it gets to item two, and it says, oh, I did change. When invalidate was called on me, that triggered something so that I know I need to redraw myself. So Get DisplayList actually ends up being a draw call on the view, where it regenerates its own display list. So now, it ends up in this onDraw method. That ends up in the operations in the DisplayList for that. The DisplayList consists of, for this item, basically rect information and text information-- pretty basic. And then we have the DisplayList for basically the entire hierarchy. So it wasn't just the view itself, but we have the View Hierarchy itself is reproduced in this hierarchy of DisplayLists, all the way down. So now we have the DisplayList for the entire tree, and that's all that we need to do on the UI thread. Now we need to sync that information over to the render thread, and the render thread is a separate thread that's actually dealing with the GPU side of this operation. On the Java side, we produced all the information. On the native side, then, we actually produce-- we take that information, then sync it over to the GPU. So we have the sync operation, where basically we copy a handle over there. And we also copy some related information. We copy the damage area, because it's important to know that that item two-- that's the only thing that changed during that frame, which means we don't need to redraw anything else outside of that area. So we're going to copy over the clip bounds there, so that we know what needs to be redrawn. Now, we're also going to do some optimization stuff like uploading bitmaps. So this is a good time to do it at the beginning of the frame. Give them some time to actually turn those into textures along the way, while we're doing the other stuff. ROMAIN GUY: It mentions here that we're uploading the non-hardware bitmaps. So hardware bitmaps is a new type of bitmap configuration that was added in Android O. 
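To tie the DisplayList recording back to code you write: on a hardware-accelerated window, the Canvas passed to onDraw() records operations into the view's DisplayList instead of touching pixels. A hypothetical item view matching the rect-plus-text description above:

    import android.content.Context;
    import android.graphics.Canvas;
    import android.graphics.Color;
    import android.graphics.Paint;
    import android.view.View;

    // Hypothetical list-item view. These Canvas calls are recorded as operations in
    // the view's DisplayList; the render thread replays them later on the GPU.
    public class ItemView extends View {
        private final Paint paint = new Paint(Paint.ANTI_ALIAS_FLAG);

        public ItemView(Context context) {
            super(context);
            paint.setTextSize(48f);
        }

        @Override
        protected void onDraw(Canvas canvas) {
            paint.setColor(Color.GREEN);
            canvas.drawRect(0, 0, getWidth(), getHeight(), paint);   // rect operation
            paint.setColor(Color.BLACK);
            canvas.drawText("Item 2", 24f, getHeight() / 2f, paint); // text operation
        }
    }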
So typically, when you have a bitmap, we have to allocate the memory on the Java side. And then when it comes time to draw, we have to make a copy of the bitmap on the GPU. This is expensive. It takes time, and it doubles the amount of RAM we're using. So using the hardware bitmaps that are available in Oreo, you can keep the Java side of the equation, and you can have a bitmap that lives only on the GPU. So if you're not going to modify this bitmap ever again, this is a really efficient way, memory-wise, to store your bitmaps' memory. CHET HAASE: We have mentioned the render thread before. This is something that we introduced in the Lollipop release. It is a separate thread that only talks to the GPU. It's native code. There is no call outside to Java code. There are certainly no call-outs out to application code. It just talks to the GPU. We did this-- so we still need to do basically the same thing we did, pre-rendered thread, which is we produce all the DisplayList information, and then we send that DisplayList information to the GPU. So its sort of serial, but the render thread is able to do things atomically, like the circular reveal animations, as well as the ripple animations, as well as vector drawable animations-- can happen atomically on the render thread. So that something that can happen without stalling the UI thread. And in the meantime, the UI thread can be doing other things while it's idle after it syncs, like maybe some of the idle prefetch work for RecyclerView that was done last year. So render thread kicks in. We've synced everything. We have the DisplayList, we have the damage area, and then we turn the DisplayList into something that we call DLOps-- display list operations. So you can see that we have that fill operation in the middle. That's the thing that we turned green there. And then we have some optimizations that we perform. ROMAIN GUY: So we have various optimizations we do here. So for instance, if you do Alpha rendering by calling set Alpha on the view or if you set a hardware layer on the view, we try to identify the drawing commands that need to be targeted to those layers, and we move them at the beginning of the frame. This avoids state changes inside the GPU, which are extremely expensive. So without doing this kind of optimization, you would see horrible, horrible performance. And it's not because the GPU itself will be slow. The GPU would just be waiting for the CPU to give it instructions. The other one we're doing, and we're going to show you an actual practical example, is called reordering and matching. We look at all those operations, and you can see in this example, because we have list items, we interleave a lot of operations that are similar. So we're going to draw a rectangle, and then we're going to draw text, and then we're going to draw a rectangle and text again. And again, here, we're changing the state of the GPU several times, but instead, what we can do is if the commands don't overlap, we can draw all the rectangles together, and then we can draw all the text together. So this is part of the reordering and matching. And sometimes, what we do-- we say, look, if we see a bunch of text that uses the same color and the same font, they don't have to be different draw text calls. They can be just a single one that covers the entire screen. CHET HAASE: So you can see here, the original DLOps had a fill operation, and then it wanted to draw some text, which is going to end up being texture map copies from the glyph cache. 
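The hardware bitmaps Romain mentioned a moment ago are requested through a normal decode; a sketch assuming API 26+ and some drawable resource (the resource id is a placeholder):

    import android.content.res.Resources;
    import android.graphics.Bitmap;
    import android.graphics.BitmapFactory;

    // Sketch: decode directly into a HARDWARE bitmap. The pixels live only in GPU
    // memory, so there is no second Java-side copy to upload at draw time, but the
    // bitmap is immutable and its pixels cannot be modified afterwards.
    public final class HardwareBitmaps {
        public static Bitmap decodeHardware(Resources res, int drawableId) {
            BitmapFactory.Options options = new BitmapFactory.Options();
            options.inPreferredConfig = Bitmap.Config.HARDWARE;
            return BitmapFactory.decodeResource(res, drawableId, options);
        }
    }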
Then it's got a fill operation, and then some more text, and a fill. And so we've got all these operations interleaved. So after the reordering operation, then it looks a little more like this. We've got a series of fills and a series of text operations. They can even be batched together to be more optimal, which we will see here. ROMAIN GUY: So this is an example of Gmail. So that was in the Honeycomb era. You can see how here, we modified the pipeline to slow down the rendering, and you can see exactly how Gmail was drawing. So we have a lot of list items, and we just draw them in the exact order that they exist in your view hierarchy and in the order of your code, actually. All the draw calls that you make on the Canvas will respect that order. Unfortunately, like I said, it's very inefficient. So instead, after batching and merging and reordering, we get this. You can see, in particular, that all the stars were drawn at the same time, and most of the text appear all at once. What's interesting is we are drawing all the list item backgrounds one after the other. So that's good. The reordering worked. The batching didn't work, and it's partly because the list items are slightly overlapping. So when commands overlap, we have to draw them one after the other to honor the blending, to make sure the Alpha values are correct. And so the effect really depends on the application. If I remember correctly, in KitKat, the settings application was we could draw the whole screen in about six draw calls, instead of dozens and dozens, as seen by the View Hierarchy. So this is a very important optimization for us. CHET HAASE: I think this work at the time on current devices saved something like a millisecond, which doesn't sound like much, unless you realize that we have to do everything within 16. So it was actually a huge improvement that allowed Gmail to be less janky, because then it wasn't pushed out into the next frame as often. So back to our explanation of everything. Then we have the clipReject. So this is where we feed in the information about the damage area, right? So we know where item two was on the screen, and we know that we don't need to draw anything outside of that. So as we're processing these DLOps, we know that we can basically throw away anything that is drawing outside of that area. In graphics, that's called a trivial reject. So we trivially reject all the DLOps that weren't intersecting with that area, and now all we have to do is draw a fill and some text and a line. So we do that. In the process of doing that, we can do a GetBuffer. This is usually an implicit operation. We don't request a buffer. It's more that as soon as we start doing GPU operations, then the GPU hands us the buffer-- more particularly, SurfaceFlinger hands us the buffer-- that we can then put these commands into. Then we issue the commands. This is a series of GL commands. As you can see on the slide, it says glCommand. Basically, the equivalent of what we need for whatever-- doing the fill or the text-- Bitmap copies lines, whatever. And then we swap the buffer. So this is us saying, we are done with all of our rendering operations. We're ready to display this frame on the screen. It's a request to SurfaceFlinger to then swap the buffer. Basically, we're done drawing to the buffer. You can swap this with the one that's in front, and it'll put it on the screen. Meanwhile, in SurfaceFlinger, then we have the composite step, which Romain is going to talk a lot about later. 
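The trivial reject described a moment ago has an application-level counterpart: the Canvas handed to onDraw() is already clipped to the damage area, so a view that draws many rows can skip the ones that fall outside the clip. A sketch with made-up row sizes:

    import android.content.Context;
    import android.graphics.Canvas;
    import android.graphics.Paint;
    import android.graphics.Rect;
    import android.view.View;

    // Sketch: only draw the rows that intersect the damage area, mirroring the
    // pipeline's trivial reject. Row height and count are arbitrary.
    public class RowsView extends View {
        private static final int ROW_HEIGHT = 96;
        private static final int ROW_COUNT = 100;
        private final Paint paint = new Paint();
        private final Rect clip = new Rect();

        public RowsView(Context context) {
            super(context);
        }

        @Override
        protected void onDraw(Canvas canvas) {
            canvas.getClipBounds(clip); // the area that actually needs repainting
            int first = Math.max(0, clip.top / ROW_HEIGHT);
            int last = Math.min(ROW_COUNT - 1, clip.bottom / ROW_HEIGHT);
            for (int i = first; i <= last; i++) {
                canvas.drawRect(0, i * ROW_HEIGHT, getWidth(), (i + 1) * ROW_HEIGHT, paint);
            }
        }
    }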
But basically, it takes all of the windows on the screen. We see here the navigation bar, the status bar, and the actual content window for our application. It combines all of those in the hardware compositor, puts them on screen, and then, tada. We're done. So that was a really simple example. Let's look at a super complicated example. This one is going to be in two phases. One, so we're going to drag the list up. So we're going to drag it, and then as we drag it, we're going to move the items a little bit. And then eventually, if we keep moving them, we're going to have a new item appear. So we're going to look at two versions of this. There's the move only version. So as we drag it up, a new item-- not a new item appears. Everything just shifts a little bit up. So first of all, we need to process the down. So we have a Vsync. It says, time to process input event. So we do that, and we end up in code like this. On touch event in RecyclerView, it says, well, there was a down operation. And all it needs to do is register where that down happened. It doesn't need to process anything. Nothing changed on the screen. We just registered that the user actually pressed down. So we record that for later, and there's no op. We don't do any of the rest of the stuff we talked about, because nothing changed. They keep dragging, and then we end up in similar code. So we process input on the next frame, and we say, OK, on touch event. Oh, now we know that they've actually moved, and we know how much they moved because we saved the old X and Y. We calculate the delta. And now we call this thing called offset top and bottom. Basically, for all the views on the screen, we simply move them in Y. And offset top and bottom calls something-- it's an invalidation method, but it's slightly different. It says, invalidateViewProperty. This is an optimization that we put in probably Honeycomb second release or something with DisplayList properties. So when I talked about DisplayList earlier, there was one nuance that I left out. We have the information about the operations and the parameters for the graphics operations. But we also have the information about some core display properties, which are basically properties of the view, like the translation property, rotation alpha. And these are properties which we don't need to rerender the view to change. We can simply change them in the DisplayList structure, itself, and then they get picked up at GPU issue time. So it's a very fast operation for us to do that. So instead of invalidating a view and redrawing everything in that view all we do is say, change the translation property in this view. So the way that happens is we call invalidateViewProperty. That propagates all the way up the tree, because we still need to know, at the top layer, what happens, but it's a much more optimal step. So this ends up in scheduleTraversals, as it did before. In the draw, it ends up in performTraversals, but PerformDraw can do a much simpler version of this because the DisplayList didn't actually change. All we did was change DisplayList properties inside of it. So we can immediately sync that information over to the render thread. We can then execute that, turn that into DisplayList ops, get the buffer. Basically, everything is as before. Let's go through the second phase of that super complicated example. User keeps dragging, and as they drag, a new item appears on the bottom. So Vsync, we process the input. We end up in a method something like this. We know that they've moved. 
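The DisplayList-property fast path described above is what you hit whenever you move a view through its transform properties rather than re-laying it out. A small sketch (the helper class is illustrative):

    import android.view.View;

    // Changing translation only updates a property stored in the view's DisplayList;
    // onDraw() is not called again and no new display list is recorded, so it takes
    // the cheap invalidateViewProperty() path described above.
    public final class MoveExamples {
        public static void nudge(View view, float dy) {
            view.setTranslationY(view.getTranslationY() + dy);
        }

        // Property animations ride the same fast path, frame after frame.
        public static void slideUp(View view, float dy) {
            view.animate().translationYBy(-dy).setDuration(150).start();
        }
    }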
Oh, but that means that we need to trigger the creation and the bind of that new item there. That ends up in code like this, where we actually add a view to the parent. So the RecyclerView is going to get a new view, and it's going to call requestLayout. So requestLayout is kind of like invalidation, but instead of saying, I need to be redrawn, it says, I need to be remeasured and relaid out. And that could affect everybody there, so we basically propagate a requestLayout all the way up the tree, like invalidation. And then we do measure and layout on the entire tree, just to see what changed there. So requestLayout happens on the parent, and then that propagates all the way up. And that ends up, again, in scheduleTraversals, our friend. And then performTraversals-- we're not talking about the draw stuff now. We're going to do a performMeasure and a performLayout. Measure is basically asking all of the views how big they would like to be. It's a request. And then layout says, this is how big you're going to be, and this is where you're going to be positioned. It's a negotiation between the views and all of their parents, according to all the constraints in the system. So we do a performMeasure. That basically calls measure at the top, and that propagates all the way down. And then we have all the information about how big all of the views want to be, and that is good enough for us to calculate the layout information. Then we propagate layout all the way down the tree. And once that happens on the item and the parent that changed, then we actually lay out that item, and we're ready to go. Now, we can actually draw things, and everything is as before. So the nuance here was just the layout side, except an important nuance is we're talking about all of this requestLayout and measure and layout happening for this RecyclerView situation. However, RecyclerView optimizes this. It knows enough about its own parent and its children that it actually can just offset the views. Instead of doing the requestLayout, it can actually just move the views out of the way and create the new item. So it's an optimization for RecyclerView, as well as the older ListView. ROMAIN GUY: So now we're going to talk about how SurfaceFlinger, our window compositor, composites all the windows on the screen. This is interesting, well, first of all, because it's always interesting to learn something new about technology, but also because you will understand some concepts that are behind some of the public APIs like Surface, SurfaceTexture, SurfaceView, or MediaCodec. So before we can understand composition, we have to understand a very important concept on Android, called the buffer queue. So the buffer queue, as the name suggests, is just a queue of buffers where our graphics buffers live. Typically, we're going to have one to three buffers. There are different options, internally, where when we set up a buffer queue, we can request how many buffers we want. And very importantly, a buffer queue has two endpoints. We have the producer and we have the consumer. So typically, the way we use a buffer queue-- the producer calls a method called dequeueBuffer on the queue. It grabs a buffer from the queue. Now it owns it. It can do any kind of rendering. It can be sending the pixel data directly, it can be using OpenGL, it can be using the Canvas. It doesn't really matter. And when you use OpenGL, this is basically what happens when you call eglSwapBuffers at the end. That's when we're producing the content inside the buffer.
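For a concrete, public pair of buffer-queue endpoints, ImageReader (consumer) and ImageWriter (producer, API 23+) map almost one-to-one onto the dequeue/queue/acquire/release cycle just described. A sketch of one trip around the queue (sizes and format are arbitrary, and whether the pixels can be written from the CPU depends on the device):

    import android.graphics.PixelFormat;
    import android.media.Image;
    import android.media.ImageReader;
    import android.media.ImageWriter;

    // Sketch of the producer/consumer cycle on a buffer queue, using ImageReader and
    // ImageWriter as the two endpoints. In real use the producer is usually OpenGL,
    // Vulkan, the camera, or a codec.
    public final class BufferQueueSketch {
        public static void oneTrip() {
            ImageReader reader = ImageReader.newInstance(640, 480, PixelFormat.RGBA_8888, 3);
            ImageWriter writer = ImageWriter.newInstance(reader.getSurface(), 3);

            Image in = writer.dequeueInputImage();  // producer: dequeueBuffer
            // ... fill in.getPlanes()[0].getBuffer() with pixel data ...
            writer.queueInputImage(in);             // producer: queueBuffer

            Image out = reader.acquireNextImage();  // consumer: acquireBuffer
            // ... read the pixels ...
            out.close();                            // consumer: releaseBuffer

            writer.close();
            reader.close();
        }
    }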
So when the producer is done producing the content, it calls queueBuffer, and gives the buffer back to the buffer queue. Now, a consumer can grab the next buffer in the queue using acquire. So it calls acquireBuffer, it takes the first available buffer in the queue, it does whatever it has to do with it, and when it's done, it puts it back by calling release. So it's a pretty simple concept. Of course, if you were to look at the code-- all the header files and all the code involved-- it's pretty complicated, in part because the two endpoints of a buffer queue can live in different processes. And this is exactly what happens. This is how our surface compositor works. So when you create a window in the system, you have the Window Manager and you have SurfaceFlinger. So the Window Manager is effectively the producer in this scenario, and SurfaceFlinger is our consumer. So when you call WindowManager.addView-- and this is what's done automatically for you when you create a dialog, when you create a toast, I believe, when you create an activity-- internally, we create a window object. That window object has a sibling on the SurfaceFlinger side, called a Layer. The names can be a little confusing because in graphics, we have to deal with buffers and queues, and that's all we do, and we quickly run out of names. So that's why we have Surface and SurfaceTexture and buffer queue and layer and window. Yeah. So it's a little bit messy. So we have a layer in SurfaceFlinger. It's basically a window. And the layer is the component in the system that creates and owns the buffer queue for your application. So it creates a buffer queue, and we have a way to send the endpoint to your application by creating a Surface. So whenever you see a Surface in one of our APIs, you really have the producer endpoint of a buffer queue that lives somewhere else in the system, either in your process or in some other process. Most of the time, it's going to be inside SurfaceFlinger. So now, the typical use case for you, as an application developer-- you're going to deal with the Surface API when you create a SurfaceView. So the way SurfaceView works is your window has its own Surface that we see here. Then we cut a hole through that Surface, effectively. And we ask the Window Manager and SurfaceFlinger to create a second Surface. And we just slide it underneath, and we pretend that they are part of the same window. But they are not. They are two different Surfaces. They have two different buffer queues, and they can be completely independent from one another. So if you use a SurfaceView, you're probably going to use OpenGL or Vulkan or a media player to generate content. So for instance, in this case, we have OpenGL ES. It's going to dequeueBuffer. It's going to do some rendering. It's going to queue the buffer back into the Surface, and therefore, into the buffer queue. If you use a SurfaceTexture, your consumer will be OpenGL. So you create a SurfaceTexture by giving it a texture ID. In that case, the SurfaceTexture creates and owns the buffer queue, so that will often be in your own process. Then you have to pass that SurfaceTexture to some producer, and to do this, you create the endpoint yourself, by creating a Surface. There's a constructor of Surface that takes a SurfaceTexture. So you create your Surface, you send it to some other application, and then when your OpenGL code is ready to render, it calls acquire to get a buffer from the buffer queue.
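As a concrete producer example for the SurfaceView described above, the sketch below draws once into the separate Surface that backs it; lockCanvas() dequeues a buffer from the queue owned by SurfaceFlinger's layer, and unlockCanvasAndPost() queues it back for composition:

    import android.content.Context;
    import android.graphics.Canvas;
    import android.graphics.Color;
    import android.view.SurfaceHolder;
    import android.view.SurfaceView;

    // Sketch: produce one frame of content into the SurfaceView's own buffer queue.
    public class ColorSurfaceView extends SurfaceView implements SurfaceHolder.Callback {
        public ColorSurfaceView(Context context) {
            super(context);
            getHolder().addCallback(this);
        }

        @Override
        public void surfaceCreated(SurfaceHolder holder) {
            Canvas canvas = holder.lockCanvas();  // dequeue a buffer
            canvas.drawColor(Color.BLUE);         // produce some content
            holder.unlockCanvasAndPost(canvas);   // queue it for composition
        }

        @Override
        public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {}

        @Override
        public void surfaceDestroyed(SurfaceHolder holder) {}
    }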
It does its rendering. It's going to, itself, produce a buffer inside another queue, and then when it's done, it can call release. TextureView is the widget that's part of the UI toolkit that you can use to benefit from SurfaceTexture. In that particular case, the render thread that we talked about is the consumer of the SurfaceTexture, and you're still responsible for getting the Surface from the TextureView and giving it to a producer of your choosing. You can think of it as a fancy image, effectively. It's an image view that you can update really efficiently, using hardware acceleration. In recent years-- we used to tell you that TextureView was the solution, instead of SurfaceView, when you wanted to integrate a video or OpenGL rendering inside a complex application. For instance, if you had a ListView or CountView or anything that was animated. SurfaceView, because it's made of two different windows, was naturally efficient for that, but it was not synchronized with the rendering of your own application. This has been fixed in the recent versions of Android. So most of the time, on recent versions of Android, you should use a SurfaceView instead of the TextureView. Use the TextureView only when you have to. Maybe it's sandwiched between your other views, or it uses an animation that's not supported by the SurfaceView. CHET HAASE: I think that was the O release. ROMAIN GUY: That was the O-- maybe N. CHET HAASE: Maybe. ROMAIN GUY: One of the two. You have to test. CHET HAASE: That's why we say recent. ROMAIN GUY: And here's a list of different producers and consumers in the platform. So we looked at SurfaceView and SurfaceTexture. OpenGL ES is a producer. It can also be a consumer. So remember when Chet was saying that in that lifecycle of a frame, at some point, we get a buffer. That's when we call dequeueBuffer from the render thread, and this is done, typically, when we do the first draw call. And at the end, when we call eglSwapBuffers to tell the driver that we're done with our frame, it will actually produce the frame and put it back in the buffer queue. You can also use things like Vulkan, the MediaPlayer, and MediaCodec. And we have many, many more throughout the platform. Now, the actual composition. So we have created multiple windows that each have their own layer. SurfaceFlinger knows about all those layers, and to talk to the display, SurfaceFlinger actually talks to something called the Hardware Composer. The Hardware Composer is a hardware abstraction layer that we use because we want to avoid using the GPU when we need to composite all those windows on screen. One of the reasons is to save battery. It's more power efficient that way. But it's also to make sure that your application has access to basically all the capabilities of the GPU. We don't take it away from you. And in the past, you may have heard that you should limit the number of windows you put on the screen. And you'll see why in a few slides. So we have the Hardware Composer. It's effectively a block of hardware that's really fast at taking multiple bitmaps and composing them together on screen. And we just talked about this. So the way it actually works-- the Hardware Composer is really kind of a protocol. Here, I'm going to describe the older Hardware Composer. It's called Hardware Composer 1 or Hardware Composer 0. I'm always confused now. We use one called Hardware Composer 2, but it's much more complicated, so I'm not going to describe it here. But the gist of it is it basically works the same way.
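Going back to TextureView for a moment: wiring it up means taking the SurfaceTexture it hands you (the render thread is its consumer), wrapping it in a Surface, and giving that producer endpoint to something like a MediaPlayer. A sketch that assumes a MediaPlayer already prepared elsewhere:

    import android.graphics.SurfaceTexture;
    import android.media.MediaPlayer;
    import android.view.Surface;
    import android.view.TextureView;

    // Sketch: feed a prepared MediaPlayer into a TextureView's SurfaceTexture.
    public class VideoTextureListener implements TextureView.SurfaceTextureListener {
        private final MediaPlayer player;

        public VideoTextureListener(MediaPlayer preparedPlayer) {
            this.player = preparedPlayer; // assumed to already have a data source
        }

        @Override
        public void onSurfaceTextureAvailable(SurfaceTexture texture, int width, int height) {
            player.setSurface(new Surface(texture)); // producer end of the TextureView's queue
            player.start();
        }

        @Override
        public void onSurfaceTextureSizeChanged(SurfaceTexture texture, int width, int height) {}

        @Override
        public boolean onSurfaceTextureDestroyed(SurfaceTexture texture) {
            player.setSurface(null);
            return true; // let the SurfaceTexture be released
        }

        @Override
        public void onSurfaceTextureUpdated(SurfaceTexture texture) {}
    }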
So SurfaceFlinger has a bunch of layers, and we're going to call prepare on the Hardware Composer. And we're going to send all the layers to the Hardware Composer and ask it to tell us what it wants to do with every layer. Every Hardware Composer is a proprietary piece of hardware in the phone or tablet you're using, and there is no way that we can write the drivers for all the different Hardware Composers out there. So instead, the Hardware Composer's going to tell us what it wants to do. So in this case, we have a layer. The Hardware Composer replies, overlay. That means the Hardware Composer understands the pixel format of that window and tells us that it can handle it and is going to do the composition for that window. So we keep going. It's telling us overlay for the second layer. It's telling us overlay for the third layer. So that's great. That means all the composition can be done automatically for us on our behalf in a very efficient way. So now, all our layers are matched to overlays. We call set. We send all the layers to the Hardware Composer, this time for actual composition, and the Hardware Composer sends everything to the screen. Now a more complex example. So we have a number of layers. We call prepare. For the first one, everything goes fine. The Hardware Composer says, overlay. It can handle it. But for some reason, for the next one, it says, frame buffer. So that can happen when you're using a pixel format that's not supported, or it could be because you're using a rotation, and the Hardware Composer doesn't know how to handle rotation, or we have too many layers on the screen, or any number of reasons that are specific to that Hardware Composer. CHET HAASE: This was much more common in devices probably three to four or more years ago. ROMAIN GUY: Right. We used to have about four hardware layers that you could use. That's four. That's five. That's four. So we used to have four, and on Pixel 2, without going into too much detail, you have basically seven. So it's much better than it used to be. But if you have a Pixel 2 XL, we use two of those layers to draw the rounded corners. So you don't really have seven; you have five. It's actually more like six, because they can be merged by the Hardware Composer. Anyway, lots of details that can be really complicated. You don't need to know about all those details. Anyway, in this case, we have one layer that can go directly to the Hardware Composer, but we have two layers that have been marked frame buffer. And that's where the hard part starts for us, because when we have layers that are not handled by the Hardware Composer, we need to use the GPU to composite them ourselves. So SurfaceFlinger has to be able to do everything that the hardware can do. And in that situation, we need to create, basically, a scratch buffer-- another layer-- in a format that we know the Hardware Composer can accept. And then we use custom OpenGL code to do the composition ourselves of those two layers. So then, once we're done with that part, we're left with only two layers, and we know that those can be sent to the Hardware Composer. So that's what we do. We call set, and then they show up on screen. CHET HAASE: So if you're curious sometime, you can run this command, adb shell dumpsys SurfaceFlinger. ROMAIN GUY: Capital S, capital F. CHET HAASE: Very important. And it'll spit out way, way more information than you want.
But one of the things that it's going to show you is a table of the windows on the screen and whether they are currently represented as overlays or frame buffers. ROMAIN GUY: Although you have to run this command pretty quickly because there are tons of optimizations internally. So what can happen sometimes is, if layers have been on the screen for a while, and we know they are not changing, the Hardware Composer might be collapsing them into a single layer until they change again. So the output of that command is sometimes a little bit misleading, because you might be seeing the result of optimizations based on time. So the best thing to do is usually to run this when you're running an animation or something is changing on screen. That's going to be the most valuable information for you. So a few other things that we haven't talked about. We used to tell you to use the variant of invalidate that takes in rectangles. So you could invalidate only the part of your view that you knew needed to be repainted. That was particularly important with older Android devices, because bandwidth was extremely limited. We were using software rendering. And even in the early days of GPU rendering for us, we were pretty easily maxing out the GPU. So those were really important savings. You don't need to do this anymore, and even on recent versions of Android, before this was deprecated, it was actually ignored by the system. Now what happens, every time you call invalidate or invalidate with a rectangle on the view, the render thread's going to rerender that whole view and recompute the whole damaged area. You don't have to worry about it. And one of the reasons we're getting rid of it is not only is it not that necessary anymore for savings, but it's also because it's error-prone. And it's really easy to have an off-by-one error or rounding error and to get artifacts on screen. And Chet and I can attest to that because we fixed so many bugs linked to the use of those APIs. And the framework itself still has bugs, I believe, around that. So now, you don't have to worry about it. RecyclerView is now able to do prefetching of items ahead of time. CHET HAASE: Yeah, we mentioned this earlier. This is one of the wins that we're now getting out of having a separate render thread, because now, well, there's idle time. The UI thread is done with its work after it syncs. Well, it can use that idle time productively for doing other things, like fetching things that it knows it might need in the next few frames. ROMAIN GUY: For composition, internally, we have a concept of a virtual display. This is an API you can use. For instance, that's what we use when we take a screenshot or when we record a video or when we do casting-- Chromecasting, for instance. What we're effectively doing is asking SurfaceFlinger to perform a composition, but not to display it directly-- just into another surface. So it's another way of producing a surface. If you're interested, there's an excellent sample application called Grafika-- with a K-- available on GitHub. It was written by a member of the graphics team a few years ago. It's basically a collection of all the things you can do with SurfaceFlinger, with Surfaces, with SurfaceView, with the media encoder, with virtual displays. It's a very interesting piece of code to look at. Color transforms. So in Android O, we introduced color management, and that's one of the color transforms we can apply. Things like Night Light are also color transforms. We also have colorblindness simulation.
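The virtual display mentioned above is available as a public API: you give DisplayManager a Surface and SurfaceFlinger composites into it instead of the physical screen. A sketch using an ImageReader as the destination (sizes and density are placeholders; mirroring the real screen additionally requires MediaProjection):

    import android.content.Context;
    import android.graphics.PixelFormat;
    import android.hardware.display.DisplayManager;
    import android.hardware.display.VirtualDisplay;
    import android.media.ImageReader;

    // Sketch: ask for composition into a Surface we own rather than onto the screen.
    public final class VirtualDisplaySketch {
        public static VirtualDisplay create(Context context) {
            ImageReader reader = ImageReader.newInstance(1080, 1920, PixelFormat.RGBA_8888, 2);
            DisplayManager dm = context.getSystemService(DisplayManager.class);
            return dm.createVirtualDisplay(
                    "sketch-display",    // name is arbitrary
                    1080, 1920, 320,     // width, height, density
                    reader.getSurface(), // where the composited frames land
                    0);                  // 0 = a private virtual display
        }
    }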
Those can be handled by the Hardware Composer in specific situations, and they can be the cause of performance issues. For instance, a while back, Night Light was not supported, I believe, on the Nexus 5X or the Nexus 6P. One of the reasons was the hardware. We didn't have the drivers that could do the color transform, so we had to fall back to GPU composition. It was really expensive. It was hurting the battery, so we didn't have the feature on the device. And like Chet said, there are many, many more details about our rendering pipeline. This was just a very high-level overview. We gave a number of talks in the past that explained in more detail what we do, for instance, in the UI renderer itself, how we do the batching and merging-- that kind of optimization. So if you're interested, you can refer to those. CHET HAASE: Shadow calculation is interesting. I think we had a talk where we talked about some of those details. But exactly where that fits in the rendering pipeline is sort of clever. But yes. Lots more going on there, but hopefully, this gives you a general sense of how things work on Android. ROMAIN GUY: And with that, we're done. CHET HAASE: So we'll stop it there. Thank you. [APPLAUSE] [MUSIC PLAYING]
Info
Channel: Android Developers
Views: 31,297
Keywords: type: Conference Talk (Full production); pr_pr: Google I/O; purpose: Educate
Id: zdQRIYOST64
Length: 36min 2sec (2162 seconds)
Published: Tue May 08 2018