Gestures in Jetpack Compose

Captions
JOLANDA VERHOEF: Adding custom gestures to your app helps bring your app's look and feel to the next level. In this video, I'll show you how to create beautiful, rich interactions. After this video, you'll be able to reason about and easily implement custom gestures. But if we want to talk about these rich interactions, we need to make sure we have some common terminology.

First, Android nowadays isn't just phones with touch interaction anymore. To be inclusive of other form factors and input types, we need a more general term than "touch." In Compose, we use the term "pointer" to refer to any type of input that is used to point at things on your screen. With a pointer, a user can perform a gesture. On a phone with touch input, a gesture would consist of the user putting their finger on the screen, then optionally moving around a little bit, and finally lifting their finger again.

In Compose, on a low level, such a gesture is represented by a stream of pointer events. So for our gesture, we first see a press event, then a whole lot of move events, and finally a release when the user lifts their finger off the screen. The sum of all those events between pointer down and pointer up is what we call a gesture.

Of course, the user can also use more than one finger to perform a gesture. In this case, the pointers are performing a pinch gesture by moving outwards and inwards. The user will probably not put their fingers down at exactly the same time. So you'll first get a press and some move events for one pointer, then the second press, a bunch of move events for both pointers simultaneously, and finally those release events. Recognizing these events as a pinch gesture is not trivial, which is why Compose includes several helper methods.

And this brings us to our toolbox. What helpers can we use to implement gesture handling in our app? On the lowest level, we can listen for and directly handle raw pointer events. This is done using the pointer input modifier. It gives you the most flexibility to implement exactly the gesture recognizer you want. For some common gestures, such as dragging, tapping, and zooming, the pointer input modifier contains a set of gesture recognizers. These recognizers translate raw pointer events into full gestures. One level up, gesture modifiers, such as clickable and draggable, allow you to add gesture handling to arbitrary composables. These also contain extra functionality, which we'll get to in a bit. And finally, some components, such as button and slider, include gesture recognition right out of the box. In this video, I will focus on the middle two layers: gesture recognizers and gesture modifiers.

Let's start with the gesture recognizers. To create a gesture recognizer, you first apply the pointer input modifier to your composable. You pass a lambda to this modifier, and inside this lambda, you are in a so-called pointer input scope. This scope provides the various gesture recognizers. For example, you can use the detectTapGestures method to recognize various tap gestures: double tap, long press, press, and tap. You pass the lambdas for the tap gestures that you want to respond to, and they will be executed when that gesture is recognized. Let's add this method to our toolbox. You can also detect drag gestures. During a drag, the onDrag lambda will be continuously executed.
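A minimal sketch of both recognizers on a plain box (the box, its size, and the log output are placeholder choices, not from the talk):

```kotlin
import androidx.compose.foundation.background
import androidx.compose.foundation.gestures.detectDragGestures
import androidx.compose.foundation.gestures.detectTapGestures
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.offset
import androidx.compose.foundation.layout.size
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.geometry.Offset
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.input.pointer.pointerInput
import androidx.compose.ui.unit.IntOffset
import androidx.compose.ui.unit.dp
import kotlin.math.roundToInt

@Composable
fun RecognizerDemo() {
    // Hold the accumulated drag so we can see the drag recognizer at work.
    var dragTotal by remember { mutableStateOf(Offset.Zero) }

    Box(
        Modifier
            .offset { IntOffset(dragTotal.x.roundToInt(), dragTotal.y.roundToInt()) }
            .size(128.dp)
            .background(Color.LightGray)
            // Inside pointerInput we are in PointerInputScope: pass only the
            // tap lambdas we want to respond to.
            .pointerInput(Unit) {
                detectTapGestures(
                    onTap = { offset -> println("tap at $offset") },
                    onDoubleTap = { offset -> println("double tap at $offset") },
                    onLongPress = { offset -> println("long press at $offset") },
                    onPress = { offset -> println("press at $offset") }
                )
            }
            // A second pointerInput for drags: onDrag runs continuously while dragging.
            .pointerInput(Unit) {
                detectDragGestures { change, dragAmount ->
                    change.consume()
                    dragTotal += dragAmount
                }
            }
    )
}
```

Because Unit is passed as the pointerInput key, the recognizers are set up once and survive recomposition.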
You can limit your drag detector to respond only to vertical dragging, or to horizontal dragging, or to detect drags only after the user long-presses. And you can detect multi-touch transformation gestures, such as panning, pinching, and rotating, by using detectTransformGestures. That makes a full list of all the gesture recognizers available in the pointer input scope. If your requirements are more complicated than this, you would need to fall back to handling raw pointer events. So that completes our list of gesture recognizers.

Let's continue with the gesture modifiers. These gesture modifiers can be applied directly to a composable without needing to use the pointer input modifier. Besides this more concise way of writing, they give us some handy extra functionality beyond just gesture recognition, and we'll look into that later. First, the clickable modifier adds click behavior. This is similar to the detectTapGestures recognizer, but it only responds to single taps. Again, you pass a lambda that is called when the composable is clicked. If you want to react not just to taps but also to long presses or double taps, you can use the combinedClickable modifier instead. Again, you just have to implement the callbacks that you're interested in. You can use the draggable modifier to listen to horizontal or vertical drag gestures. In this case, instead of passing a single onDrag lambda, you pass a state. This is a common pattern that allows you to hoist the state and mutate it outside of the modifier. The scrollable modifier works the same, but it includes logic for scrolling and flinging. And the transformable modifier makes it possible to listen to multi-touch transform events on a composable.

We filled our toolbox up quite nicely. We have a set of gesture recognizers and a set of modifiers that we can use directly. So how do we choose between them? For example, we see both a clickable modifier and a detectTapGestures recognizer, and both of them recognize taps. However, the modifier does much more than that. Let's quickly look at its source code. If you look at the implementation of the clickable modifier, you can see that it not only adds gesture recognition. It also adds click semantics to deal with accessibility, key detection and focus information to support keyboards, and an indication modifier to show ripples on top of the clicked element. This tells us that we should consider these as well when we drop down from modifiers like clickable and draggable to use the pointer input modifier directly. This is a common pattern: when you use the various detect methods, you're only recognizing gestures, while the high-level modifiers, such as clickable and draggable, include support for other types of input as well. In general, try to use the gesture modifiers, and use the gesture recognizers only if there are good reasons to. We'll see some of these reasons later in this talk.

Now, one other thing to keep in mind is that these modifiers exist to recognize gestures; they don't actually transform anything. If we, for example, look at the transformable modifier, we see that it gets continuous updates of zoom, offset, and rotation changes. Let's say that we want to use this modifier to create an element that we can move around with two fingers. Just having the modifier is not enough. We need to apply the modifier to the box composable. This is the part where we recognize the gesture. Then we need to hold the current scale, rotation, and offset values.
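In code, the recognize-hold-apply pattern that the next few sentences walk through looks roughly like this (the box and its color are placeholders; the shape of the state callback follows the transformable API):

```kotlin
import androidx.compose.foundation.background
import androidx.compose.foundation.gestures.rememberTransformableState
import androidx.compose.foundation.gestures.transformable
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.size
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.geometry.Offset
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.graphics.graphicsLayer
import androidx.compose.ui.unit.dp

@Composable
fun TransformableBox() {
    // Hold: the current scale, rotation angle, and offset, with sensible initial values.
    var scale by remember { mutableStateOf(1f) }
    var rotation by remember { mutableStateOf(0f) }
    var offset by remember { mutableStateOf(Offset.Zero) }

    // Recognize: the state callback receives zoom, pan, and rotation deltas
    // for every change in the gesture.
    val state = rememberTransformableState { zoomChange, panChange, rotationChange ->
        scale *= zoomChange
        rotation += rotationChange
        offset += panChange
    }

    Box(
        Modifier
            // Apply: a graphicsLayer uses the held values to transform the box.
            .graphicsLayer {
                scaleX = scale
                scaleY = scale
                rotationZ = rotation
                translationX = offset.x
                translationY = offset.y
            }
            // Recognize: transformable feeds multi-touch updates into the state above.
            .transformable(state = state)
            .background(Color.Blue)
            .size(128.dp)
    )
}
```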
In this case, we keep track of the current scale, the current rotation angle, and the current offset, and we pass sensible initial values for each of them. From our gesture recognizer, we update these values whenever the recognizer receives a new change. And finally, we apply the transformation values to our composable. In this case, I'm using a graphicsLayer modifier to set this composable's scale, translation, and rotation. So, to repeat: we recognize, hold, and then apply the data coming in through a gesture.

Decoupling the recognition from the application also means that a gesture recognizer does not need to be applied to the composable that you're transforming. Instead, you can move modifiers around to recognize a gesture in a parent but apply it to a child composable. In this case, the user can perform a gesture on the parent box, but the transformation happens on the inner blue box. A real-world example of this is the Material slider. The gesture recognizer is defined on the full slider component, while the transformation only happens to its thumb. This way, I don't have to grab the thumb exactly; I can start the drag from anywhere on the slider.

So now that we have a basic understanding of the tools at our disposal, let's go and create some fancy interactions. We'll be creating this photo grid sample app. There's a lot happening here, and we'll take it step by step. And yes, before you ask, the code is available in a gist. So what requirements do we have to implement? First, we tap a photo to open it full screen. Then we can double tap it to zoom in or out. Alternatively, we can pinch to zoom. While we are zoomed in, we can drag to move around the photo. And we can tap the gray background outside of the photo to exit full screen mode. Let's implement these one by one.

For the first requirement, we want to tap the image to open it full screen. We can choose between the detectTapGestures recognizer and the clickable modifier. Both will do the trick. But as we discussed before, the clickable modifier adds some extra functionality, so we'll go with that. Let's implement our code. We start with our PhotoGrid. We create a composable and pass it a list of photos. We create a lazy vertical grid with cell sizes of at least 128 dp. Inside, we add all the photos as items, with their ID as the unique key for each item. Each item is then represented by a PhotoItem composable that we'll implement later. With this, we actually already have some gesture handling in our app: lazy vertical grid comes with scroll support right out of the box. This is an example of a composable that includes support for gestures.

Now let's implement our first requirement. Clicking one of the photos should open it in full screen mode. We first wrap our photo grid in a top-level App composable and add some state: we keep track of the photo ID that should currently be showing full screen. When we start the app, this should be null, because no photo is selected yet. If this activeId is not null-- that is, a photo has been clicked-- we show the full screen photo. We look up the photo that belongs to this identifier and pass it to the FullScreenPhoto composable that will be responsible for showing it. Finally, we add the callbacks to update the activeId: we pass the navigateToPhoto lambda to the PhotoGrid composable, and the onDismiss lambda to the FullScreenPhoto composable.
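A sketch of that wiring, assuming a simple Photo model with an id and a URL (the model, the stubbed composables, and their exact signatures are assumptions; the talk's gist has the full code):

```kotlin
import androidx.compose.foundation.clickable
import androidx.compose.foundation.lazy.grid.GridCells
import androidx.compose.foundation.lazy.grid.LazyVerticalGrid
import androidx.compose.foundation.lazy.grid.items
import androidx.compose.runtime.*
import androidx.compose.runtime.saveable.rememberSaveable
import androidx.compose.ui.Modifier
import androidx.compose.ui.unit.dp

data class Photo(val id: Int, val url: String, val contentDescription: String? = null)

@Composable
fun App(photos: List<Photo>) {
    // Hold the id of the photo shown full screen; null means only the grid is showing.
    var activeId by rememberSaveable { mutableStateOf<Int?>(null) }

    PhotoGrid(photos = photos, navigateToPhoto = { activeId = it })

    if (activeId != null) {
        FullScreenPhoto(
            photo = photos.first { it.id == activeId },
            onDismiss = { activeId = null }
        )
    }
}

@Composable
fun PhotoGrid(photos: List<Photo>, navigateToPhoto: (Int) -> Unit) {
    // Cells at least 128 dp wide; the grid scrolls out of the box.
    LazyVerticalGrid(columns = GridCells.Adaptive(minSize = 128.dp)) {
        items(photos, key = { it.id }) { photo ->
            PhotoItem(
                photo = photo,
                // Requirement 1: clicking a photo opens it full screen.
                modifier = Modifier.clickable { navigateToPhoto(photo.id) }
            )
        }
    }
}

@Composable
fun PhotoItem(photo: Photo, modifier: Modifier = Modifier) {
    // The actual image rendering is shown in the PhotoImage sketch further on.
}

@Composable
fun FullScreenPhoto(photo: Photo, onDismiss: () -> Unit, modifier: Modifier = Modifier) {
    // A Box with a Scrim and a PhotoImage, as the talk describes next.
}
```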
Going back to our PhotoGrid composable, it now gets this navigateToPhoto lambda. We add a clickable modifier to the PhotoItem, and we call that navigate lambda when the user clicks the element. And with that, we finish our first requirement. The clickable modifier can be used to make any composable clickable, including our photo in the grid.

Next, while we're showing the full screen photo, a double tap should zoom in or out. So how would that work exactly? When the user double taps, we remember the tap location on screen. We then scale the image around that point, while making sure it stays within its bounds. To implement this requirement, we again have two options: detectTapGestures and combinedClickable. Let's implement our code and see which one fits.

For that, we first need to learn more about the FullScreenPhoto composable. It gets a photo, the onDismiss lambda that should be called to exit full screen mode, and an optional modifier. Inside, we define a box that fills the whole parent and centers its contents. In the box, we'll have a Scrim and the image. The Scrim will basically be a semi-transparent background that we can click to exit full screen mode; we'll get to it later. Right now, the PhotoImage composable is the one we're interested in for zooming its content.

So let's look at this PhotoImage. It contains an image that we initialize with the URL of the photo. We pass it the photo's content description and set its aspect ratio to make sure that it's square. Let's add some variables to hold our state: we'll keep track of the offset and the zoom level. The offset is used so we can zoom in on a specific part of the image. We can then use a graphicsLayer modifier to apply these values to the image. We use the offset and zoom values to translate and scale the composable, and set the transform origin to make some of the calculations easier. And we also add a clip modifier to make sure that the image stays within its bounds, even when we scale it up or move it around.

Now we're ready to add the gesture recognizer. We might be tempted to use the combinedClickable modifier. However, its onDoubleClick lambda does not receive any information about the location on screen where the double click happens, so we can't use it to update our offset value. Instead, we fall back to the detectTapGestures recognizer, which passes the tapOffset as a parameter to the onDoubleTap lambda. We use that tapOffset to calculate our new zoom and offset values. I'll leave the actual calculation of the offset out for now because it's a little bit difficult to wrap your head around, but rest assured, it's just a one-liner doing some calculations. Great, that finishes our double-tap-to-zoom functionality. And remember that I said earlier that dropping down to the pointer input modifier leaves out some functionality? We will get to that later. Don't worry.

Next, let's cover the pinch to zoom and drag to pan. For these requirements, we should recognize all sorts of movements, both with two pointers for zooming and with a single pointer for panning. We again have two options: detectTransformGestures and the transformable modifier. Let's first see what behavior we want. While the user pinches to zoom, we want the image to zoom around the middle of the pinch gesture. This middle, represented by the orange dot, is called the centroid. The transformable modifier does not get this centroid, but the detectTransformGestures method does. So let's choose that one. Going back to our code, our hold and apply blocks are still identical.
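Here's a sketch of PhotoImage with both the double-tap recognizer described above and the transform recognizer discussed next, assuming Coil's AsyncImage for image loading. The zoom bounds and the offset math are simplified stand-ins for the one-liners the talk defers to its gist (which also clamps the offset so the image stays within bounds):

```kotlin
import androidx.compose.foundation.gestures.detectTapGestures
import androidx.compose.foundation.gestures.detectTransformGestures
import androidx.compose.foundation.layout.aspectRatio
import androidx.compose.runtime.*
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clipToBounds
import androidx.compose.ui.geometry.Offset
import androidx.compose.ui.graphics.TransformOrigin
import androidx.compose.ui.graphics.graphicsLayer
import androidx.compose.ui.input.pointer.pointerInput
import coil.compose.AsyncImage

@Composable
fun PhotoImage(photo: Photo, modifier: Modifier = Modifier) {
    // Hold: the offset and zoom level, with sensible initial values.
    var offset by remember { mutableStateOf(Offset.Zero) }
    var zoom by remember { mutableStateOf(1f) }

    AsyncImage(
        model = photo.url,
        contentDescription = photo.contentDescription,
        modifier = modifier
            .aspectRatio(1f)
            // Keep the image within its bounds even when scaled or moved.
            .clipToBounds()
            // Recognize: double tap to zoom in around the tap position, or back out.
            .pointerInput(Unit) {
                detectTapGestures(
                    onDoubleTap = { tapOffset ->
                        zoom = if (zoom > 1f) 1f else 2f
                        // Simplified: put the tapped point in the middle of the view.
                        offset = if (zoom == 1f) Offset.Zero
                        else tapOffset - Offset(size.width / 2f, size.height / 2f) / zoom
                    }
                )
            }
            // Recognize: pinch to zoom and drag to pan, zooming around the centroid.
            .pointerInput(Unit) {
                detectTransformGestures { centroid, pan, gestureZoom, _ ->
                    val oldZoom = zoom
                    zoom = (zoom * gestureZoom).coerceIn(1f, 5f)
                    // Keep the content under the centroid stationary while zooming.
                    offset = offset + centroid / oldZoom - (centroid / zoom + pan / oldZoom)
                }
            }
            // Apply: translate and scale the image via a graphicsLayer.
            .graphicsLayer {
                translationX = -offset.x * zoom
                translationY = -offset.y * zoom
                scaleX = zoom
                scaleY = zoom
                // A top-left origin keeps the offset math simple.
                transformOrigin = TransformOrigin(0f, 0f)
            }
    )
}
```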
We still want to change the offset and zoom values, and we still want to apply them to our image composable. We just have to add another gesture recognizer: another pointer input modifier, this time detecting transformation gestures. This will continuously call the lambda with delta values for movement and scaling, and we calculate the new offset and zoom values from them. I'll leave out the actual calculation for now, but again, it's just a few lines of code that you can find in the gist. And with that, we have a zoomable and pannable image using detectTransformGestures.

Now we only need to allow the user to exit the full screen image by tapping the screen. Going back to our full screen photo, remember that Scrim composable? If we look at its definition, we can see that it currently is simply a full screen box with a grayish color. We can add behavior to it by passing an onClose lambda and calling that lambda when the user taps the box. In the FullScreenPhoto, we can simply forward the existing onDismiss parameter to this Scrim composable. And with that, we finish our list of requirements. We can tap to open, double tap to zoom, pinch to zoom, drag to pan, and tap the scrim to cancel.

But actually, we're not done. There's one more thing that we should consider. Specifically, I'd like to talk for a bit about TalkBack, an accessibility service on Android devices that is used by people with visual impairments. When TalkBack is enabled, it captures all of the gestures that a user makes, so any of those custom gestures that you built will not be available to the user. Instead, TalkBack handles all gestures. For example, a swipe to the right anywhere on the screen moves the focus to the next element, a double tap behaves like a click, a two-finger swipe creates scroll events, et cetera. So any gesture recognizers that you apply to your composables won't work out of the box. Instead, you have to configure the composable to explain to TalkBack what it is you're trying to accomplish with the gesture recognizer. Let's check our requirements. The clickable modifier that we use to open photos full screen adds the necessary information. But the double tap to zoom, the pinch to zoom, the drag to pan, and the tap to cancel full screen won't work out of the box. I won't go into detail on how exactly to add this behavior, but check out the documentation on accessibility in Compose. That explains it in detail.

And with this, let's move on to our next gesture. Our goal is to be able to select multiple photos in the grid. A user should be able to long-press and drag to select multiple photos at once. Let's break this behavior up into separate requirements. First, we want the user to be able to enter selection mode by long-pressing a photo. Then tapping any photo should add it to or remove it from the selection. When the user long-presses a photo and drags, the photos in between should be selected or deselected. And finally, when the user removes the last photo from the selection, we should exit selection mode.

Now, before we implement the gestures, I'd like to pause for a second and think about state management. That is, what state should we keep track of in our composable to correctly render our grid? At any given time, we will have a set of photo IDs that are selected. When there are no photos selected, like here on the left, this set will be empty. On the right, you can see an example where we have several photos selected, with IDs 4, 5, 6, and 8.
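Sketched in code, that state, together with the selection-mode flag discussed next, might look like this (the PhotoItem parameters are assumptions):

```kotlin
import androidx.compose.foundation.lazy.grid.GridCells
import androidx.compose.foundation.lazy.grid.LazyVerticalGrid
import androidx.compose.foundation.lazy.grid.items
import androidx.compose.runtime.*
import androidx.compose.runtime.saveable.rememberSaveable
import androidx.compose.ui.unit.dp

@Composable
fun PhotoGrid(photos: List<Photo>, navigateToPhoto: (Int) -> Unit) {
    // The set of selected photo ids, persisted across configuration changes.
    var selectedIds by rememberSaveable { mutableStateOf(emptySet<Int>()) }

    // True while at least one photo is selected. derivedStateOf makes sure readers
    // only recompose when this flips between true and false, not on every set change.
    val inSelectionMode by remember { derivedStateOf { selectedIds.isNotEmpty() } }

    LazyVerticalGrid(columns = GridCells.Adaptive(minSize = 128.dp)) {
        items(photos, key = { it.id }) { photo ->
            // Same trick per photo: recompose the item only when its selection flips.
            val selected by remember { derivedStateOf { photo.id in selectedIds } }
            PhotoItem(photo = photo, inSelectionMode = inSelectionMode, selected = selected)
        }
    }
}
```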
We also need to keep track of whether or not we are currently in selection mode; that is, whether at least one photo is selected at the moment. When selection mode is false, all photos should show as is. But when selection mode is true, the top left corner of each photo should show a radio button, and the photos should respond differently to taps.

We will keep track of this information in the PhotoGrid composable. Remember that the PhotoGrid composable is currently simply showing a lazy vertical grid with the photo items in it. We add a variable selectedIds to keep track of all the photos that the user selected. Note that we use rememberSaveable here so that the variable is persisted across configuration changes. In addition, we want a simple Boolean to tell us whether or not we are currently in selection mode. Since this Boolean will change only when we move from an empty set to a set with at least one photo, or the other way around, the right API to use here is derivedStateOf. This makes sure that the composables that rely on the value of the inSelectionMode variable will only recompose when the value changes from true to false or back, and not for every change in the set of selected IDs. In addition, for each photo in the grid, we can look up whether that specific photo is selected by checking if it's part of the set of selected IDs. Again, we're using derivedStateOf to make sure that we only recompose composables depending on this state when the selection state changes from true to false, or the other way around.

We then forward these properties to the PhotoItem composable so it can adapt its UI based on their values. Inside the PhotoItem composable, we use these values to choose what icon to show, if any. When inSelectionMode is true and the photo is selected, a check mark is shown. If inSelectionMode is true but selected is false, an unselected radio button is shown. And if inSelectionMode is false, no icon is shown at all. And with that, we have state management, which is reflected in our UI. But we're not changing the state yet, and that, of course, is where our gesture handling comes in.

Remember, we want a long-click on an item to start selection mode and directly add that photo to the set of selected photos. But we also already had a normal click listener defined on our photo item to open it full screen. So in our case, we want to listen to both click and long-press events on the same composable. Let's go to our toolbox and see which handlers fit that requirement. We can choose either detectTapGestures or combinedClickable. Remember that detectTapGestures only gives us the raw gesture handling, while combinedClickable includes extras, such as accessibility and focus support. That sounds great for our use case, so let's go with combinedClickable.

We go back to our PhotoGrid composable. In the first part of this presentation, we applied a clickable modifier to the PhotoItem composable. Now we change that to combinedClickable. For the onClick lambda, we pass the same behavior as before, allowing the user to navigate to the full screen photo view. When the user long-presses, we add the ID of the photo to the set of selected IDs. By simply changing the mutable selectedIds state, any composables that depend on that state will automatically recompose. Our first requirement is done: we can long-press to enter selection mode, and we used combinedClickable for that.
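Continuing the items block from the earlier sketch, that change might look as follows (depending on your Foundation version, combinedClickable may require opting in to ExperimentalFoundationApi on the enclosing function):

```kotlin
items(photos, key = { it.id }) { photo ->
    val selected by remember { derivedStateOf { photo.id in selectedIds } }
    PhotoItem(
        photo = photo,
        inSelectionMode = inSelectionMode,
        selected = selected,
        modifier = Modifier.combinedClickable(
            // Same click behavior as before: open the photo full screen.
            onClick = { navigateToPhoto(photo.id) },
            // A long press adds this photo to the selection, entering selection mode.
            onLongClick = { selectedIds += photo.id }
        )
    )
}
```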
However, right now, if we click a photo while we're in selection mode, it opens full screen. That's not the behavior we're looking for. Instead, a tap in selection mode should add the photo to or remove it from the selection. So let's go back to our PhotoGrid composable and adapt its gesture behavior. Currently, it always applies the combinedClickable modifier, navigating onClick and adding the photo to the selection onLongClick. We want this behavior to differ depending on the inSelectionMode value. We add an if-else statement and apply a different gesture handler based on the value of inSelectionMode. If inSelectionMode is true, we should listen to taps and add or remove the photo from our selection. So we add a clickable modifier and let it remove the photo ID from the set when it's currently selected, or add it to the set when it's currently unselected. We solved our second requirement with the clickable modifier. We can now enter selection mode and add or remove photos while we're in selection mode.
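Continuing the sketch, the if-else around the modifier could look like this:

```kotlin
items(photos, key = { it.id }) { photo ->
    val selected by remember { derivedStateOf { photo.id in selectedIds } }
    PhotoItem(
        photo = photo,
        inSelectionMode = inSelectionMode,
        selected = selected,
        modifier = if (inSelectionMode) {
            // In selection mode, a tap toggles this photo in or out of the selection.
            Modifier.clickable {
                selectedIds = if (selected) selectedIds - photo.id else selectedIds + photo.id
            }
        } else {
            // Outside selection mode, keep click-to-open and long-press-to-select.
            Modifier.combinedClickable(
                onClick = { navigateToPhoto(photo.id) },
                onLongClick = { selectedIds += photo.id }
            )
        }
    )
}
```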
Next up, we want to use long-press and drag to select multiple images at the same time. Let's open up our toolbox and see which gesture handler fits. We need a handler that waits for a long-press and only then passes on any drag events. There's a detector that does exactly that: the detectDragGesturesAfterLongPress method will help us implement our multi-select. It waits for a long-press to happen and will then forward all the drag events to a lambda. It also contains lambdas for when the drag starts, when it's canceled, and when it ends. For our multi-select case, when the drag starts, we should add the photo underneath our pointer to the selection. Then while we drag, when the pointer moves to a new photo, all photos in between the old and new photos should be added or removed from the selection.

But which composable do we apply this gesture handler to? Do we apply it to the single photo item, or do we apply it to the whole grid? To answer that question, I need to tell you a bit more about hit testing. Let's take our app and look at it from an angle. The UI of our app consists of layers, where our photo items are at the top, the grid is in the middle, and the main app container is at the bottom of the stack. When we touch our screen, a hit detection algorithm runs. You can envision this as a ray being shot at your UI hierarchy. In this case, it will first hit one of the photos, then the grid, and then the main app container. These composables get the chance to register themselves to receive pointer input events. Now, as the user drags their finger over the screen, this imaginary ray changes position. But it is important to understand that no more hit testing is performed. So even when the pointer exits the bounds of the middle photo, that photo composable would still be the one receiving the drag events. As long as the gesture is in progress, only the composables that were originally hit will receive updates.

So why is this important to us? Well, our requirement includes adding and removing photos from our selection as we drag over them. Imagine that we registered the drag listener only on a single photo item. This item does not know anything about its siblings, so when the pointer exits its bounds, it will not know which photo it moved to. Instead, we need to listen to the gestures on the whole grid. Given the location of the pointer within the grid, we can calculate which photo it is pointing at, and when it moves out of those bounds, we can recalculate that.

So let's go back to our PhotoGrid composable with its lazy vertical grid. We add the pointer input modifier, and inside, we call the detectDragGesturesAfterLongPress method. We have a way to listen to this gesture, but now we need to actually respond to it-- in our case, by updating the set of selected IDs. During onDragStart, we want to figure out which photo was initially hit and keep track of it. We also want to add this initial photo to the set of selected photos. When the drag is canceled, or when it ends, we should make sure to reset any internal state in the modifier so it is ready for the next gesture. And during the onDrag callback, we need to find out if the pointer changed from one photo to the next, and if so, add or remove the right photos from the selection. To know which photos to add or remove, we need both the identifier of the photo that was initially hit by the long-press and the identifier of the photo that the pointer was last over.

In the onDragStart lambda, we get an offset that indicates where within the bounds of our lazy grid the pointer went down. We can use that offset to find out which photo the user is pointing at, using a helper method. You can find the implementation of this photoIdAtOffset on GitHub. The method can also return null if the user is pointing at the space between photos, so we do a null check on the result. If a photo is indeed hit, we check whether it is already in the selection. If it isn't, we save it as both the initial and the current photo ID and add it to our selection. In onDragCancel and onDragEnd, we reset this initial photo ID to null. And in the onDrag lambda, we handle the actual drag events. This lambda gets a change event as input, and that change event has a position field that contains the offset within the composable where the event happened. We can reuse the same photoIdAtOffset helper method to see which photo the user points at. Now, during the drag, we are continuously tracking which photo ID is underneath the pointer. We're only interested in the movement from one photo to the next, so we check if the pointed-at photo ID is different from the current photo ID, and we update the current photo ID when we're done with this update. Finally, we update the set of selected IDs. This addOrRemoveUpTo method is part of our business logic, and it updates the set of selected IDs based on the pointed-at, current, and initial photo IDs.
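Pulled together, the grid's recognizer might look like the sketch below. photoIdAtOffset and addOrRemoveUpTo are the helpers the talk mentions; the bodies here are my simplified stand-ins (the canonical versions are in the talk's gist and on GitHub), and this addOrRemoveUpTo always selects the new range rather than implementing the full add-or-remove behavior:

```kotlin
import androidx.compose.foundation.gestures.detectDragGesturesAfterLongPress
import androidx.compose.foundation.lazy.grid.GridCells
import androidx.compose.foundation.lazy.grid.LazyGridState
import androidx.compose.foundation.lazy.grid.LazyVerticalGrid
import androidx.compose.foundation.lazy.grid.rememberLazyGridState
import androidx.compose.runtime.*
import androidx.compose.runtime.saveable.rememberSaveable
import androidx.compose.ui.Modifier
import androidx.compose.ui.geometry.Offset
import androidx.compose.ui.input.pointer.pointerInput
import androidx.compose.ui.unit.dp
import androidx.compose.ui.unit.round
import androidx.compose.ui.unit.toIntRect

// Hit test: map a pointer position inside the grid to the photo underneath it, or
// null when pointing at the space between photos. A simplified take on the
// photoIdAtOffset helper whose full implementation the talk links on GitHub.
fun LazyGridState.photoIdAtOffset(hitPoint: Offset): Int? =
    layoutInfo.visibleItemsInfo.find { itemInfo ->
        itemInfo.size.toIntRect().contains(hitPoint.round() - itemInfo.offset)
    }?.key as? Int

// Simplified stand-in for the talk's addOrRemoveUpTo business logic, assuming photo
// ids follow grid order: drop the previously covered range, then add the new one.
fun Set<Int>.addOrRemoveUpTo(pointedId: Int, currentId: Int, initialId: Int): Set<Int> =
    this - (minOf(initialId, currentId)..maxOf(initialId, currentId)).toSet() +
        (minOf(initialId, pointedId)..maxOf(initialId, pointedId)).toSet()

@Composable
fun PhotoGrid(photos: List<Photo>, navigateToPhoto: (Int) -> Unit) {
    var selectedIds by rememberSaveable { mutableStateOf(emptySet<Int>()) }
    val gridState = rememberLazyGridState()

    LazyVerticalGrid(
        state = gridState,
        columns = GridCells.Adaptive(minSize = 128.dp),
        // The recognizer goes on the whole grid, not on the individual items.
        modifier = Modifier.pointerInput(Unit) {
            // Internal gesture state, reset after each drag so the next one starts clean.
            var initialPhotoId: Int? = null
            var currentPhotoId: Int? = null
            detectDragGesturesAfterLongPress(
                onDragStart = { offset ->
                    gridState.photoIdAtOffset(offset)?.let { id ->
                        // Only start drag-selecting on a photo that isn't selected yet.
                        if (id !in selectedIds) {
                            initialPhotoId = id
                            currentPhotoId = id
                            selectedIds += id
                        }
                    }
                },
                onDragCancel = { initialPhotoId = null },
                onDragEnd = { initialPhotoId = null },
                onDrag = { change, _ ->
                    val initial = initialPhotoId
                    val current = currentPhotoId
                    if (initial != null && current != null) {
                        gridState.photoIdAtOffset(change.position)?.let { pointedId ->
                            // Only react when the pointer moves from one photo to another.
                            if (pointedId != current) {
                                selectedIds =
                                    selectedIds.addOrRemoveUpTo(pointedId, current, initial)
                                currentPhotoId = pointedId
                            }
                        }
                    }
                }
            )
        }
    ) {
        // items { ... } as in the earlier sketches.
    }
}
```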
Now, we can successfully long-press to enter selection mode, tap to add or remove from selection, and long-press plus drag to-- wait. Long-press and drag doesn't work. Why is that? To find the answer to this problem, we need to talk about conflict resolution. Remember that we solved our initial requirement with a combinedClickable modifier, letting it listen to long-press events. But now we're also detecting drags after long-press. So both the photo items themselves and the grid are listening for, and trying to react to, long-press events. We have a conflict.

If we look at the layered version of our UI again, when a pointer event comes in, the layer that lies on top gets the event first. In this case, that's our photo item. It chooses whether or not to consume the event. Only if this layer does not consume the event is it forwarded to the next layer, and so on. In our case, the long-press recognizer that we added to the PhotoItem composable captures and consumes all events. This means that our photo grid will not get any events, and its long-press-and-drag recognizer will not work. Looking at the code, the onLongClick in our combinedClickable is the culprit: it stops the long-press from ever reaching the grid. But actually, we don't really need this long-click listener anymore. The long-press-and-drag listener of the grid already adds our photo to the selection after a long-press. So we can change the combinedClickable back to the original clickable modifier. Fixed! Now we can successfully long-press to enter selection mode, tap to add or remove from selection, and long-press and drag to add or remove multiple photos.

Finally, we want to be able to tap the last selected photo to exit selection mode. This already works out of the box: because we made our selection mode dependent on the set of selected IDs, deselecting the last photo automatically exits selection mode. We're done! If you learned something new about gestures in Compose, then please share this video or like and subscribe.
Info
Channel: Android Developers
Views: 19,319
Keywords: Gestures in jetpack compose, custom gestures in jetpack compose, what are gestures, how to use gestures, how to use gestures in jetpack compose, how to use custom gestures in jetpack compose, custom gestures tutorial, jetpack compose, jetpack compose tutorial, jetpack compose latest, jetpack compose updates, developer, developers, android developer, android developers, google developers, android, google, Jolanda Verhoef
Id: 1tkVjBxdGrk
Length: 31min 33sec (1893 seconds)
Published: Wed Oct 11 2023