[MUSIC PLAYING] SUMIR KATARIA: Hi, everyone. My name is Sumir Kataria. And I'm an engineer
on the Android team. I work on architecture
components. And today, I want to talk to
you about a new library we have called work manager and
background processing in general on Android. So let's talk about
background processing in 2018. What are we trying
to do these days? Just this morning I
was trying to send a picture of my lovely
wife and my beautiful son to the rest of my family. So that's an example of
background processing. We're also sending logs, syncing
data, processing that data. All of this work is being
done in the background. And on Android, there's
a lot of different ways you can do this work. Here are a lot of them. You can do tasks on
threads and executors using JobScheduler, AlarmManager,
AsyncTasks, et cetera. Which one should you use? And when should you use it? Meanwhile, Android also has a
lot of battery optimizations that we've introduced
over the last few years. For example, we introduced
doze mode in Marshmallow. If you've been
following Android P, we've had app standby buckets. In Oreo, we restricted
background apps and background services. So all of these
things have to be taken care of as a developer. And finally, we
always have to worry about backwards compatibility. So if you want to reach
90% of Android devices, you want to at least
have a minSdk of KitKat. So given all of this, what
tools do you use and when do you use them? And the trick behind
this is that you have to look at the
types of background work that you're doing. I like to split this
up into two axes. The vertical axis here is
the timing of the work. Does the work need to be done
right when it's specified? Or can it wait for a little bit? So that if your device enters doze mode,
you can still do it after that. Also, on the
horizontal axis here, how important is the work? Does the work only need
to be done while your app is in the foreground? Or does it absolutely need
to be done at some point? So for example, if
you're taking a bitmap, and you decide that you want
to extract a color from it, and update your
UI with it, that's an example of
foreground-only work. You don't care about it once
the user hits home or back. That work is irrelevant. Meanwhile, if
you're sending logs, you always want that to happen. That's an example of
guaranteed execution. So for things that
are best-effort, you really want to use things
like ThreadPools, RxJava, or coroutines. For things that require
exact timing and guaranteed execution, you want to
use a foreground service. So an example of this
would be that the user hits a button, and you
want to process a transaction, and update the UI and the
state of the app based on that. That really needs a
foreground service. That needs to happen right then. Your app cannot be killed by the
system while that's happening. This fourth category
is very interesting. So you want
guaranteed execution, but you're OK if it happens
later, doze mode can kick in. And there's a variety
of ways to solve it. On newer APIs, you'll
use JobScheduler. If you want to go a
little further back, you can use Firebase
JobDispatcher to do that. And if you don't have
Google Play Services, you'll probably end
up using AlarmManager and BroadcastReceivers. And if you want to target
all of those things, well, you'll use some
mix of these four things. And that's a lot of APIs,
a lot of work to be done. WorkManager falls here. It's guaranteed execution
that's deferrable. So WorkManager, let's talk a
little bit about its features. I just mentioned guaranteed
execution. It's also constraint-aware. So if I want to upload that
photo that I talked about, I only want to do it
when I have a network. That's the constraint. It's also respectful of the
system background restrictions. So if your app is in doze
mode, it won't execute. It won't wake up your
phone just to do this work. It's backwards compatible
with or without Google Play Services. The API is queryable. So if you haven't
queued some work, you can actually check,
what is its state? Is it running right now? Has it succeeded or failed? These are things that you can
find out with WorkManager. It's also chainable. So you can create
graphs of work. So you can have Work A
depending on Work B and C, which in turn depends on Work D. Also
WorkManager's opportunistic. So this means that we
will try to execute that work in your process as
soon as the constraints are met without actually needing
JobScheduler to intervene or call you and wake you up. It doesn't wait
for a JobScheduler to batch your work
if your process is up and running already. So let's talk about a
little bit of the basics and talk through the code. So I just described the example. I want to upload that photo. So how would I do that
using WorkManager? Let's talk about
the core classes. There's a Worker class. This is the class
that does the work. OK. This is where you will write
most of your business logic. And there's a WorkRequest class,
which comes in two flavors-- OneTimeWorkRequest for things
that just need to be done once, and PeriodicWorkRequest
for recurring work. And these will
both take a Worker. I'll show you just now. So here's my UploadPhotoWorker. It extends the Worker class, and
it overrides the doWork method. This is the method that
will run in the background. We take care of running it on a background thread for you. You don't need to move it to a background thread yourself. So you simply do your work. So in this case, we upload
the photo synchronously. And we return a result.
So in this case, let's say we succeeded. And the WorkerResult in here
has three values-- success and failure, which are fairly
obvious; and retry, which says, I encountered a transient error. Let's say that the device
lost network connection in the middle, so retry me after
a little bit with some backoff. So now that I have this, I can create a OneTimeWorkRequest using the UploadPhotoWorker, and then I can enqueue it using WorkManager.getInstance().enqueue(). So soon after this is enqueued, it'll start running. You'll upload your photo.
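Here's a minimal sketch of what that might look like in Kotlin, using the alpha-era names described in this talk; uploadPhotoSynchronously() is a hypothetical helper standing in for your own upload code:

```kotlin
// Sketch only: class and result names follow the alpha API discussed in this talk.
class UploadPhotoWorker : Worker() {
    override fun doWork(): WorkerResult {
        // WorkManager runs this on a background thread for you.
        uploadPhotoSynchronously()   // hypothetical upload helper
        return WorkerResult.SUCCESS  // or FAILURE / RETRY
    }
}

val uploadWork = OneTimeWorkRequest.Builder(UploadPhotoWorker::class.java).build()
WorkManager.getInstance().enqueue(uploadWork)
```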
But I just talked about this. What if you lose connectivity in the middle of this, or even before it does? What if you've never
had connectivity? You actually want to use
constraints in this case. So an example of the constraint you want to use here is you make a Constraints.Builder. And you say setRequiredNetworkType to be connected, so you need a connected network connection. You build it. And you also set the constraints on the request that you just built. So by simply doing this, and then enqueuing it, you make sure that this work only runs when your network is connected.
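A sketch of that, assuming the same uploadWork request as above:

```kotlin
// Only run this work when the device has a network connection.
val constraints = Constraints.Builder()
    .setRequiredNetworkType(NetworkType.CONNECTED)
    .build()

val uploadWork = OneTimeWorkRequest.Builder(UploadPhotoWorker::class.java)
    .setConstraints(constraints)
    .build()

WorkManager.getInstance().enqueue(uploadWork)
```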
So let's say you want to observe this work now that we've done it. So I want to show a spinner
while this work is executing, and then I want to hide
the spinner when it's done. How would I do that? So as I said, I'll
enqueue this request. And then I can say getStatusById on WorkManager using the request.id. So each request has an ID. And this returns a LiveData of WorkStatus. If you remember architecture components, LiveData is a lifecycle-aware observable. So now you can just hook into that observable, and you can say, when that work is finished, hide that progress bar.
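A sketch of that observation, assuming this runs in an Activity or Fragment (so `this` is a LifecycleOwner) and `progressBar` is a view from this example; getStatusById is the name used in the API described here:

```kotlin
WorkManager.getInstance().enqueue(uploadWork)

// Observe the WorkStatus and hide the spinner once the work reaches a finished state.
WorkManager.getInstance().getStatusById(uploadWork.id)
    .observe(this, Observer { status ->
        if (status != null && status.state.isFinished) {
            progressBar.visibility = View.GONE
        }
    })
```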
So what is this WorkStatus object that you were looking at in the LiveData? It has an ID. This is the same
ID as the request. And it has a State. The State is the current
state of execution. There are six values here: enqueued, running, succeeded, failed, blocked, and cancelled. And we'll talk about
the last two later. So let's move a step
up in concepts here. Let's talk about chaining work. So I promised that you
can actually make directed acyclic graphs of work. How would you do that? Let's say this is
my problem now. I'm uploading a video. It's a huge video. I want to compress it
first, then upload it. So these are both eligible
for background work because they're time
intensive things. So let's say I have two
Workers, CompressPhotoWorker and UploadPhotoWorker. They're both defined to do
the things that I just said. So you can make
WorkRequests from them. And you can say WorkManager.beginWith(compressWork), then uploadWork, and enqueue it. So this ensures that compressWork executes first. And once it's successful, then uploadWork goes.
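In Kotlin, that chain looks roughly like this, assuming compressWork and uploadWork are the two OneTimeWorkRequests built from the Workers above:

```kotlin
// compressWork runs first; uploadWork only runs after it succeeds.
WorkManager.getInstance()
    .beginWith(compressWork)
    .then(uploadWork)
    .enqueue()
```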
And if you were to write this out,
fluent way of writing it, what happens behind
the scenes here is that beginWith returns a WorkContinuation. And a WorkContinuation
has a method called then that also returns
a WorkContinuation, a different one. So you're using that to
create this fluent API. So you can actually use
these WorkContinuations and pass them around
if you want, et cetera. So let's say that I'm
uploading multiple photos. I take lots of photos. No one takes just
one photo of their child. So how would I upload
all of these in parallel? Well, so let's say I've got a
WorkRequest for all of them. I can literally just say,
.enqueue and put all of them there, so you can pass more than one thing there. And these are all eligible for running in parallel. They may not actually run in parallel depending on your device, and the executor being used, and all of that, but they could be.
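A sketch of enqueuing several independent requests at once; uploadWork1 through uploadWork3 are hypothetical OneTimeWorkRequests, one per photo:

```kotlin
// All three requests are independent, so they are eligible to run in parallel.
WorkManager.getInstance().enqueue(listOf(uploadWork1, uploadWork2, uploadWork3))
```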
So let's choose an even more complex example. Now you want to
filter your photos. So you want to
apply some kind of-- I don't know-- grayscale filter
or a sepia filter to them, then you want to compress them,
then you want to upload them. How would you do this? WorkManager makes
it very simple. So first you say, beginWith. You do all the filter
works in parallel. After those have all
completed successfully, then you do your
compression work. And after that has completed successfully, then you do your uploadWork. And don't forget to enqueue at the end.
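A sketch of that fan-in chain, assuming filterWork1 through filterWork3, compressWork, and uploadWork are OneTimeWorkRequests built for this example:

```kotlin
// The three filter requests run in parallel; compression starts only after all of
// them succeed, and the upload starts only after compression succeeds.
WorkManager.getInstance()
    .beginWith(listOf(filterWork1, filterWork2, filterWork3))
    .then(compressWork)
    .then(uploadWork)
    .enqueue()
```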
So we've talked about all of that, but there's a key
concept that I want to cover that's very
much related to chaining. This is inputs and outputs. So let's talk about this
problem that I have here. It's a MapReduce,
and really a good way of explaining a MapReduce
is to give an example. I love reading. I've loved reading
Sherlock Holmes novels since I was a kid. And the other day, I
was thinking, well, Arthur Conan Doyle has a
very specific way of writing. What are the top 10 words
he uses in his books? Well, how would I
figure that out? I would go through each book. I would count the
occurrence of each word, and then I would
combine all of that data and sort it so that I would
find the top 10 of those. This is a distributed problem
that we could call a MapReduce. And for inputs and outputs,
the common unit of operation here is Data. Data is a
simple class that's a key-value map under the hood. The keys are strings. The values are
primitives and strings, and the array versions of each. So this is kind of like
a Bundle or Parcelable, but it's its own thing. It's serialized by WorkManager, and we limit it to
10 kilobytes in size. And I'll go more
into that part later. So how do we create a data? So in Kotlin, you can
make a map very easily. So in this case, we're mapping
the key file_name to the value a_study_in_scarlet.txt. That's the novel that
I'm going to look at. And I'll convert
it to a WorkData. So this is a Data object. And once I create my workRequest builder, I can set the input data on it. So this is the input data built from that map. And I pass it along.
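A sketch of that setup; CountWordsWorker is a hypothetical name for the word-counting worker in this example, and toWorkData() is the Kotlin extension mentioned at the end of the talk:

```kotlin
// Build the input data from a Kotlin map and attach it to the request.
val inputData = mapOf("file_name" to "a_study_in_scarlet.txt").toWorkData()

val countWordsWork = OneTimeWorkRequest.Builder(CountWordsWorker::class.java)
    .setInputData(inputData)
    .build()
```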
So inside my worker, I can actually retrieve this input data by
just calling the getInputData method. And from that I can get the
string for the file name. And now I have the fileName. And I can say,
count all the word occurrences in this fileName. That's some method that
I've written somewhere else, and I can return my success. But you don't want
to do just that. You actually want to
also have outputs, right? Now you've done all this
work, it should do something. There should be
an output for it. So let's say that method that
we have returns a map of words to their occurrences. We can convert that
map to a WorkData. And we can call a method called setOutputData that sets this data-- so getInputData, setOutputData.
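Putting both halves together, a sketch of the counting worker; countAllWordOccurrences() is the hypothetical helper mentioned above, and the names follow the alpha API described in this talk:

```kotlin
class CountWordsWorker : Worker() {
    override fun doWork(): WorkerResult {
        // Read the input data that was set on the request.
        val fileName = inputData.getString("file_name", null)
        // Count occurrences of each word in the file (hypothetical helper).
        val occurrences: Map<String, Int> = countAllWordOccurrences(fileName)
        // Publish the counts as output data for the next worker in the chain.
        outputData = occurrences.toWorkData()
        return WorkerResult.SUCCESS
    }
}
```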
So the key observation that you need to know here is that the worker's
outputs become the inputs for its children. So what happens is the
findTop10Words worker, which goes next, its
inputData is coming from the previous worker. So in this case, you can pass
the data all the way through, find the top 10
words, and return out. So the data flow for one
book becomes like this-- I'll count all the word
occurrences in that book. I'll pass it to the
findTop10Words worker. Its inputData will be
whatever I pass through. And it will do the sorting
or whatever it needs to do. But here is a
really tricky thing: what happens when you
have multiple books? What's the input for the
findTop10Words worker? You're passing multiple
pieces of data, but I've only been able
to get one inputData. What happens to the rest of
them, or how do they combine? For this, you want to
look at InputMergers. So InputMerger is a
class that combines data from multiple sources
into one data object. And we provide two
implementations out of the box-- OverwritingInputMerger,
which is the default, and ArrayCreatingInputMerger. You can also create your own,
but let's talk about these two. First,
OverwritingInputMerger-- so we have two data objects here, each
with their own keys and values. What does
OverwritingInputMerger do? It first takes the
first piece of data and it just puts everything
in a new data object. So it's an exact copy of this. Then it takes a
second piece of data and it copies it
over, so overwriting anything that's the same key. So in this case, the name Alice
becomes Bob, and the age of 30 becomes Three Days. Note that it changed type. So a number became
a string here. The scores key was new,
so it just got added. Note that if we did this
in reverse, instead of Bob, you would have Alice
as the final output. So this is something
a little tricky. You want to make sure that
OverwritingInputMerger is the right tool for the job. But it is very simple. What about
ArrayCreatingInputMerger? This is the one
that actually takes care of those collision cases. So in this case, let's
go just key by key. The name becomes an
array of Alice and Bob. Color becomes a
singleton array of blue because it's only
defined in one of them. Scores, notice that
there is one integer and one array of integers. These combine and they
just concatenate together. Order is not specified, but
all the values come through. What happens for age? So there's an integer,
and there's a string. This is an exception. We do expect it to be the same basic value type. So let's go back to that
example I was telling you about, Sherlock Holmes. Implicitly, there is an
InputMerger before this stage. So we combine all of that data. And which InputMerger
do we want to use? Well, we
don't want to throw away any of this calculation
that we've done. So we actually want to use an
ArrayCreatingInputMerger, which preserves all of the
data and gets it through. So how do we do that? Well, we just say setInputMerger
on the request builder of the findTop10Words. So it merges data using an
ArrayCreatingInputMerger. So you say, beginWith the countWords workers, then do the findTop10Words worker, and then enqueue.
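A sketch of wiring that up; FindTop10WordsWorker and the countWords requests are hypothetical names for the workers in this example:

```kotlin
// Merge the outputs of all the countWords workers into arrays instead of overwriting.
val findTop10Work = OneTimeWorkRequest.Builder(FindTop10WordsWorker::class.java)
    .setInputMerger(ArrayCreatingInputMerger::class.java)
    .build()

WorkManager.getInstance()
    .beginWith(listOf(countWordsWork1, countWordsWork2))
    .then(findTop10Work)
    .enqueue()
```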
So for example, if the first book had 10 instances of the
word Sherlock, 5 of Watson, and 30 of elementary, and the
second one had 12, 15, and 5. You would get arrays like this. Sherlock would be 10, 12. Watson would be 5, 15. Elementary would be 30, 5. So in your
findTop10Words worker, you would sum all of that up,
sort them, find the top 10, and that's your output. And I just said that's
your output, right? So you can actually observe
the output in your work status using that LiveData. So you can actually
get that output data. So that's super useful because
you can display it in your UI. How do you cancel work? So I just decided to
send up a picture, but I'm like, wait a sec,
this is not the picture I'm meant to send up. How do you cancel that upload? Very simple, you just
say, cancelWorkById. But do note that
cancellation is best effort. So the work may have
already finished. These are all
asynchronous things. They may be happening
in the background. Before you have had a chance
to do that cancelWork, it may already be
running or finished. So it's best effort. OK, so let's talk a little
bit more about tags. And tags are solving
this problem. IDs that I just told you
about are auto-generated. They're not human-readable. So they're actually
UUIDs under the hood. And you can't really
understand them. They're not useful
for debugging. If you log them, they're
not going to make sense. What kind of work was running? I don't know. It's just some big number. I don't know what that is. Tags solve this issue. Tags are a readable way
to identify your work. So tags are developer-specified
strings, and each work request can
have zero or more tags. You can query and
cancel work by tag. Let's look at an example. So I used to work
on the G+ team here. And the G+ app
supports multi-login. So you can have multiple users
logged in at the same time. And each of those users
could be doing several kinds of background work. You could be getting favorites. You could be
getting preferences. So if you have three users
logged in on your phone, and they're doing
two kinds of work, you have six things happening. How do you identify what you're
looking at any given time? Well, you can use tags. So for example, in this
workRequest builder, you can add tags to
say this is user1, and this is the
get_favorites operation. So now you can actually
identify that work. And if you wanted to
look at the statuses, you could say, give me
all of the work for user1. And this will return a list
of work statuses as a LiveData because each tag can correspond
to more than one workRequest. So this is a list
of work statuses. Similarly, you can also
cancel all work by tag. Cancellation is best-effort, again. But you can cancel all of one particular kind of work, in this case.
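A sketch of tagging, querying, and cancelling by tag; GetFavoritesWorker is a hypothetical worker name, and the tag strings come from this example:

```kotlin
val favoritesWork = OneTimeWorkRequest.Builder(GetFavoritesWorker::class.java)
    .addTag("user1")
    .addTag("get_favorites")
    .build()

// LiveData<List<WorkStatus>>: one status per request carrying this tag.
val user1Statuses = WorkManager.getInstance().getStatusesByTag("user1")

// Best-effort cancellation of everything tagged for this user.
WorkManager.getInstance().cancelAllWorkByTag("user1")
```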
Tags are also useful for a couple of other reasons. Tags namespace your type of
work, as I just told you. You can have tags for the kinds
of operations you're doing, get favorites, get
preferences, et cetera. But they also namespace
libraries and modules. So if you're a library
owner or a module owner, you should always tag your
work so you can get it later. Let's say that you
have a library, and you move to a new version
of that library in your app, maybe you want to cancel
all the work you had. You can cancel all
work by your tag. So always use tags when
you are using a library. And WorkStatus also has
tags available in it. So if you're ever
looking at a WorkStatus, you can get the
tags for that work and see what you yourself
called it in the past when you enqueued it. One more thing I wanted to
talk about is unique work. So unique work solves a
few different problems. But one of the common ones
that almost every app has is syncing. You want to sync when
you first launch the app. You want to sync maybe
every 12 to 24 hours to get the freshest data. And you may also want to sync
when your language changes. Maybe you have a version of your
data in a different language. So you want to sync
at that point, too. So you're doing
all this syncing, but you really only want
one sync active at a time. You don't want four
sync operations running. Which one is the right one? Which one wins? You don't know. You just want one. Unique work can solve this. A chain of work
can be given a unique name. You can enqueue, query,
and cancel using that name, and there can only be one
chain of work with that name. Let's take a look at
that sync example. So if I say, beginUniqueWork
with my name, let's say sync, in this case. And that next argument is what
I call the existing work policy. So if there is work
with this name, sync, what should I do with it? In this case, I say, keep it. I want to keep the
existing work, ignore what I'm doing right now. The next argument is
actually your workRequest, in this case, a syncRequest
when you enqueue it. So if there's work with the name sync already in flight, it will keep that. If there isn't, it will enqueue this and execute it. So this is how you dedupe your syncs.
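A sketch of that unique work, where SyncWorker is a hypothetical worker name:

```kotlin
val syncRequest = OneTimeWorkRequest.Builder(SyncWorker::class.java).build()

// KEEP: if a chain named "sync" is already in flight, keep it and drop this request.
WorkManager.getInstance()
    .beginUniqueWork("sync", ExistingWorkPolicy.KEEP, syncRequest)
    .enqueue()
```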
So here at Google,
we love chat apps. And maybe you're updating
your chat status. So you want to say, I'm bored. And then 10 seconds
later, I'm watching TV. Then, I'm bored again. OK, I'm going to sleep. And you're in a bad
network connection state. You have bad Wi-Fi, and
maybe the first thing hasn't gone through when you
type your second chat status update. And really the second
one should win, and the third one
should win over that. So you want to make sure
that the last one wins. How would you solve this? Here's a simple function. You don't even need to
read the rest of it. It's the last line that
I want you to care about, which is beginUniqueWork. Your name is update_status, and
you choose the REPLACE option. REPLACE cancels and deletes any
existing in-flight operations of that name. So the last one always does win. In this case, if you have two update chat status calls, the last one will win.
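A sketch of that function, where UpdateStatusWorker and the status key are hypothetical:

```kotlin
fun updateChatStatus(status: String) {
    val request = OneTimeWorkRequest.Builder(UpdateStatusWorker::class.java)
        .setInputData(mapOf("status" to status).toWorkData())
        .build()

    // REPLACE: cancel and delete any in-flight "update_status" work, so the last call wins.
    WorkManager.getInstance()
        .beginUniqueWork("update_status", ExistingWorkPolicy.REPLACE, request)
        .enqueue()
}
```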
And finally, I love music. I love the Foo Fighters. I was building a playlist the other day with all their songs. There's a lot of songs. There's like 150 or 200 songs. And I was doing all of this. I was adding a song. I was shuffling
two songs around. I was moving something to
the bottom of the list. I was deleting a song because I
had it already somewhere else. These are all things that I
want to do using WorkManager, but how would I do that? These things all have
to execute in order. And so since the
order is important, we provided the ability to
use the APPEND existing work policy that says, do this
work at the end of the list of update_playlist operations. So append this to the
end of this thing, so everything else
must successfully execute before this executes. So you can add operations to the end.
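A sketch of appending one of those playlist edits; addSongRequest is a hypothetical OneTimeWorkRequest for a single edit:

```kotlin
// APPEND: this request runs only after everything already enqueued under
// "update_playlist" has executed successfully.
WorkManager.getInstance()
    .beginUniqueWork("update_playlist", ExistingWorkPolicy.APPEND, addSongRequest)
    .enqueue()
```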
So ExistingWorkPolicy, as a summary, there are three types:
KEEP, REPLACE, and APPEND. A few notes about
PeriodicWork, it works very similarly to
everything you've seen so far. Just a couple of notes on it--
so the minimum period length is the same as JobScheduler. It is 15 minutes. It is still subject to doze mode
and OS background restrictions, just like any of the other
work we've talked about. It can't be chained, and it
can't have initial delays. And we think that that just sort of makes good API sense. It's much more reasonable to think of it in those terms.
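A sketch of a periodic request, using the hypothetical SyncWorker from earlier and a 12-hour period (the minimum allowed period is 15 minutes):

```kotlin
// Repeats roughly every 12 hours, subject to doze mode and OS background restrictions.
val periodicSync = PeriodicWorkRequest.Builder(SyncWorker::class.java, 12, TimeUnit.HOURS)
    .build()

WorkManager.getInstance().enqueue(periodicSync)
```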
All right, so we've talked a lot about code. Let's talk about how it
all works under the hood. So you've got a work. You enqueue it. We store it in our database. What happens after that? Well, if the work is
eligible for execution, we just send it to the
executor right away. By the way, this executor,
you can actually specify it, but we do provide
a default one. But let's say that your
process gets killed. Well, what happens then? How does it get woken up, and
how does this work run again? So if you're on API 23+, we
send it to JobScheduler as well. And JobScheduler invokes an
IPC, wakes up your process. It goes to the same executor,
and that's where it runs. If it's an older device, and
you include the optional Firebase JobDispatcher dependency, we can send it to Firebase JobDispatcher. Same thing, invokes an IPC,
runs it on that executor. What if you don't have that or
you're not using a Google Play services device? So you're using something else. We have a custom AlarmManager
and Broadcast Receivers implementation. And the same thing,
uses an IPC, wakes up your app when the time is
right, and runs the job. A couple of
implementation details-- so JobScheduler and
Firebase JobDispatcher are through Google
Play services. They provide a central load-balancing mechanism for execution. So if every app on your
device is trying to run jobs, they'll load balance them. They'll make sure
that you're not running too much
work on your device and burning up your battery. The AlarmManager
implementation that we have, unfortunately, can't
do that because it only knows about your own app. Of course, concepts
like content URI triggers, idle, doze mode, et cetera,
are only available at the API levels that they
were introduced at. So those methods
will be marked with @RequiresApi with the
appropriate API level. We take care of obtaining
wake locks when necessary. So especially, this is
true for the AlarmManager implementation. Don't take wake locks
in your workers. You don't need to do that. We take care of it for you. Finally, let's talk a
little bit about testing. You want to test this app. We provide a testing library. It has a synchronous executor. Use WorkManager as normal
to enqueue your requests. And we provide a class
called TestDriver, which executes enqueued
work that has constraints. So we can just pretend that
the constraint has been met. Periodic and initial delay
triggers are coming soon. We don't have them yet. So if you wanted to
look at the code for it, you can
initializeTestWorkManager. You can get the TestDriver. Create and enqueue your
work as you normally would, with a constraint, in this case. And we can tell the
TestDriver, hey, all constraints are
met for this work. Your work executes at
that point, synchronously, and you can verify the state of your app and make sure that everything is right.
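A sketch of that test flow, using the names from the testing library as described in this talk (initializeTestWorkManager and TestDriver); context is a test Context from your instrumentation setup:

```kotlin
// Set up WorkManager with the synchronous test executor.
WorkManagerTestInitHelper.initializeTestWorkManager(context)
val testDriver = WorkManagerTestInitHelper.getTestDriver()

// Enqueue work as you normally would, with a network constraint in this case.
val request = OneTimeWorkRequest.Builder(UploadPhotoWorker::class.java)
    .setConstraints(
        Constraints.Builder().setRequiredNetworkType(NetworkType.CONNECTED).build()
    )
    .build()
WorkManager.getInstance().enqueue(request)

// Pretend the constraints are met; the work then runs synchronously,
// and you can verify the state of your app afterwards.
testDriver.setAllConstraintsMet(request.id)
```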
I also want to talk a little bit about best practices before I end here.
when to use WorkManager. WorkManager is for tasks that
can survive process death. It can even wake up your
app and your app's process to do the work. So for example, it's OK when you
want to use it to upload media to a server. It's also OK when you want
to parse data and store it in your database. It's not OK for that
example I gave earlier. You're extracting the
palette color from an image and updating an image view
with it, because that's foreground-only work. It's also not OK when you're
parsing data and just updating the contents of a view. Because you could
switch screens. You could go in the background. It's not work that needs
to use WorkManager. It doesn't need to
survive process death. Also, it's not OK to process
payment transactions in it if they care about
timing right then. So if you click
buy, and you want to update the state of
the app, that really needs something else. So that last one needs
a foreground service. The other two may just
use thread pools or Rx. Also, WorkManager is
not your data store. Instances of data are
limited to 10 kilobytes each when serialized. So data is really meant
for light, intermediate transportation of information. You can put file URIs or keys
to other databases in there if you want. You can put simple
information to update your UI. If you want to use
a full data store, I would recommend using Room. Yigit would be very happy
that I'm saying this. It's an awesome database. Finally, be opportunistic
with your work. So here's a filter compress
upload example again. The reason that these
are not just one big job is because they all have
different constraints. So they can execute
at different times. Let's say, I'm getting
on an airplane, and I'm uploading
a bunch of images and running this chain of work. Well, I go into airplane mode. Maybe I don't have network
for the next 12 hours because I'm flying
across the world. Well, the other work can
still execute, and it should. So if you architect like
this, you can do that. This also, by the way, makes
your code a little bit more testable because
you can write a test for filtering that
isn't conflated with compression and upload. All right, and I want to talk
about a few next steps for you. If you need to reach us and
talk to us about WorkManager, we are in the Android
Sandbox, just behind us, I think, over here. d.android.com/arch/work,
that's more information on the official developer
website about WorkManager. These are all the
Gradle dependencies. The first one's a required one. The second one is if you use
Firebase JobDispatcher, also include that. There's a testing library, and of course, we have Kotlin extensions as well.
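A sketch of what those dependencies might look like in a Gradle Kotlin DSL build file, using the pre-AndroidX artifact names from this era; the version string is a placeholder you'd swap for the current release:

```kotlin
dependencies {
    val workVersion = "1.0.0-alpha01" // placeholder: use the latest release

    implementation("android.arch.work:work-runtime:$workVersion")            // required
    implementation("android.arch.work:work-firebase:$workVersion")           // optional: Firebase JobDispatcher support
    androidTestImplementation("android.arch.work:work-testing:$workVersion") // testing library
    implementation("android.arch.work:work-runtime-ktx:$workVersion")        // Kotlin extensions
}
```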
WorkManager is part of the architecture components in Android Jetpack. And we have a bunch of
talks here tomorrow, Navigation Controller, 8:30 AM. Hope you guys make it there. And thanks for being
part of this talk. We look forward to hearing
back from you soon. Thank you. [MUSIC PLAYING]