[MUSIC PLAYING] DON TURNER: Hello, I'm Don, and
I'm an engineer on the Android Developer Relations team. Today, I'm going to show
you how to build a data layer for your app. This will be a workshop based
on the data layer codelab. If you'd like to
follow along, you can do so using the
link in the description. This workshop will cover
how to create repositories, data sources, and data models
for effective scalable data management, exposing data to
other architectural layers, handling data updates, and
complex or long-running tasks, data synchronization between
multiple data sources, and lastly, how to create local
and instrumented tests that verify the behavior
of your data layer. In this workshop, we'll
build a task management app. This will allow you to add tasks
and mark them as completed. We won't be writing
the app from scratch. Instead, we'll be
working on an app which already has a UI layer. During this workshop,
we'll add the data layer, then connect it to
the existing UI layer, allowing the app to
become fully functional. So I've already cloned
the codelab project and imported it
into Android Studio. And if we run the
project, we can see the app running
with a loading spinner, waiting for the
data to be loaded. By the end of this
workshop, a list of tasks should be displayed
on this screen. And we'll do that by
building the data layer, but what is the data layer? Well, it's an
architectural layer, which, as its name suggests,
manages your application data. It also contains
business logic, which is what gives value to your app. These are the real
world business rules that determine how application
data is created, stored, and updated. It provides methods
to other layers to allow the data to be
read and updated following those business logic rules. The key component types
which make up the data layer are repositories, data
sources, and data models. Let's take a look at each of
these types in more detail. The application data is usually
represented as data models. These are in-memory
representations of the data. Since we're creating
a task management app, we need a data model for a task. Here's the task class. A key point about data models
is that they are immutable. Other layers cannot change the task properties. They must use the data layer if they want to make changes to a task.
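As a rough sketch (the property names here follow the codelab's Task class but treat them as approximate), it's just an immutable Kotlin data class:

```kotlin
// Illustrative sketch of the external Task model: read-only (val) properties
// mean other layers can't change a task in place.
data class Task(
    val id: String,
    val title: String = "",
    val description: String = "",
    val isCompleted: Boolean = false,
)
```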
Task is an example of an external data model because it is exposed outside the data layer and can be accessed by other layers. But we'll also define internal
data models, which are only used inside the data layer. It's good practice to define
a data model for each place that it's stored. So for example, in this
app, we have a local task, which is a task stored
in a local database, and a network task, which is a
task which has been retrieved from a network server. These are both examples
of internal data models, and they come from data sources. Speaking of data sources,
this is an object responsible for reading
and writing data to a single source, such as a
database or a network service. In this app, there
are two data sources. TaskDao, which stands
for data access object, is a local data source
which reads and writes tasks to a database. TaskNetworkDataSource
reads and writes tasks to a network server. Lastly, let's talk
about repositories. A repository is what brings
these data sources together. It's responsible for a
single external data model. In this app, we'll
create a task repository, which manages tasks. Its role will be to
expose those tasks, provide methods
for updating tasks, execute business
logic, such as creating a unique ID for each task,
combine or map internal data models from data
sources into tasks, and lastly, synchronize
data sources, so copy data between the local
database and the network. OK, that's enough theory. Let's write some code. We'll start by creating
a data model and a data source for the local database. So empty files have
already been created for us inside the data
source local package, so let's open LocalTask
and create a data class. Just get rid of that. So I'm just going to copy
in the data class code here. OK, so the @Entity annotation here tells Room that we want to create a table named task. And because of annotations like @Entity, this class is strongly coupled to Room and shouldn't be used for other data sources, such as DataStore. The Local prefix in the class name is used to indicate that this data will be stored locally.
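Here's a sketch of roughly what gets pasted in, assuming the codelab's field names:

```kotlin
import androidx.room.Entity
import androidx.room.PrimaryKey

// Room entity for tasks stored in the local database. The table name follows
// the codelab; treat the exact property names as an approximation.
@Entity(tableName = "task")
data class LocalTask(
    @PrimaryKey val id: String,
    var title: String,
    var description: String,
    var isCompleted: Boolean,
)
```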
So now that we have a data model, let's create a data source
to create, read, update, and delete those local tasks. And I'll do this inside
TaskDao for reasons that will become clear shortly. So again, I'm just going
to copy, paste the code in. So since we're using Room,
we can create our local data source using Room's data access
object, or DAO for short. And this is as simple as
specifying the DAO annotation for an interface, and then
defining methods and associated SQL for reading and
writing the data. You'll notice the
methods for reading data are prefixed with
observe, and these are non-suspending functions
which return a flow. This means that each time
the underlying data changes, a new item will be
emitted into the stream, and this is great because you
can listen for data changes rather than polling the database. The methods for writing data are suspending functions because they are performing I/O operations. And in case you're wondering what upsert means, it just means to update an item if it already exists, or insert it if it doesn't.
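Sketched out (the codelab's exact queries and method names may differ slightly), the DAO looks roughly like this:

```kotlin
import androidx.room.Dao
import androidx.room.Query
import androidx.room.Upsert
import kotlinx.coroutines.flow.Flow

// Sketch of the local data source as a Room DAO.
@Dao
interface TaskDao {

    // Non-suspending: returns a Flow that emits a fresh list whenever
    // the task table changes.
    @Query("SELECT * FROM task")
    fun observeAll(): Flow<List<LocalTask>>

    // Suspending one-shot writes, since they perform I/O.
    @Upsert
    suspend fun upsert(task: LocalTask)

    @Upsert
    suspend fun upsertAll(tasks: List<LocalTask>)

    @Query("UPDATE task SET isCompleted = :completed WHERE id = :taskId")
    suspend fun updateCompleted(taskId: String, completed: Boolean)

    @Query("DELETE FROM task")
    suspend fun deleteAll()
}
```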
So the next thing we'll do is update the database so that it will store those local tasks. We just need to replace this blank entity here with the LocalTask,
which we just created. So I'll just change that here. OK. And we should also add a method
to return the data access object. There we go. And I can also get
rid of this code. It's no longer required. OK, and just to be
clear, the data source is responsible for providing
access to the data. The database is the mechanism
used to store data to disk. So this project uses Hilt
for dependency injection, and Hilt needs to know
how to create our data source so that it
can be injected into classes that use it. We already have a Hilt module,
which will inject the database. So let's use that same module
to inject the data source. We're just going to add a @Provides method, which will return the TaskDao from the database, and I'll just fix this import here.
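The addition is roughly this (the module and database class names, DatabaseModule and ToDoDatabase, are assumptions based on the codelab's setup):

```kotlin
import dagger.Module
import dagger.Provides
import dagger.hilt.InstallIn
import dagger.hilt.components.SingletonComponent

// Sketch of the addition to the existing Hilt database module.
@Module
@InstallIn(SingletonComponent::class)
object DatabaseModule {

    // ...the existing @Provides method for the database lives here...

    @Provides
    fun provideTaskDao(database: ToDoDatabase): TaskDao = database.taskDao()
}
```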
OK, great. We now have all the pieces required to read and write tasks to the local database. Now, we've written
quite a lot of code so far, but how do we know
that it works correctly? It's easy to make a mistake
with all of those SQL queries. So let's create tests
to verify that TaskDao behaves as it should. Now tests aren't
part of the app, so it should be placed
in a different folder, and there are two
testing folders. I'll just expand the
Project Explorer here. So we have androidTest
here, which contains tests which are run on
an Android emulator or device, and these are known
as instrumented tests. And then we also have
this Test folder, which contains tests which
run on your local machine, also known as local tests. So TaskDao requires
a Room database, which can only be created
on an Android device. So therefore we need to
create an instrumented test. So let's create a class
called TaskDaoTest. OK, so here we go. Now, the first thing
we'll do is add a database to it, which is initialized
before each test. And this just ensures
that the database is created afresh every
time we run a test. So we're using an
in-memory database here because it's much faster
than a disk-based database, and this makes it a good
choice for automated tests because the data
doesn't need to persist for longer than those tests. So now we can start
writing tests. Now a good way of
structuring tests is to follow the given-when-then structure. So to test that a task
is inserted correctly, we can say, given
an empty database-- so let me just paste this
in so you can see it. So given an empty database,
when a task is inserted and we start observing the task stream, then the first item in the task stream matches the task which was inserted, and here's the test.
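In outline it looks something like the following; the database class and property names are assumptions based on the codelab, and the assertion is the deliberately wrong one described next:

```kotlin
import androidx.room.Room
import androidx.test.core.app.ApplicationProvider
import androidx.test.ext.junit.runners.AndroidJUnit4
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.test.runTest
import org.junit.Assert.assertEquals
import org.junit.Before
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class TaskDaoTest {

    private lateinit var database: ToDoDatabase

    @Before
    fun initDb() {
        // A fresh in-memory database before every test.
        database = Room.inMemoryDatabaseBuilder(
            ApplicationProvider.getApplicationContext(),
            ToDoDatabase::class.java
        ).allowMainThreadQueries().build()
    }

    @Test
    fun insertTaskAndObserveTasks() = runTest {
        // GIVEN an empty database, WHEN a task is inserted and we observe the stream...
        val task = LocalTask(id = "1", title = "title", description = "description", isCompleted = false)
        database.taskDao().upsert(task)

        val tasks = database.taskDao().observeAll().first()

        // ...THEN the stream contains the inserted task.
        // Deliberately failing first: the size will actually be 1.
        assertEquals(0, tasks.size)
    }
}
```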
So it's good practice to start with a failing test, and this just verifies that the test is actually running and that the correct objects
and their dependencies are being tested. In this case, we'll just check that the size of the task list is 0, rather than the size of 1 that we would expect, given that we are
inserting a task. So if we click the Play
icon next to the test, we'll go through
the build process, and this will actually be
deployed onto the emulator, and we should see
a failing test. And this should
come up in orange. OK, there we have it. So we see this message expected
0, but it was actually 1. So let's now fix that by
changing the assertions here. And we're going to test
that one and only one task is inserted into the
database, and we'll also make sure that it's the
same task that we inserted. So again, we'll just run
this, and hopefully we should see a green tick. OK, there we have it. So that confirms
that our local data source is behaving correctly,
both now and in the future. So it's great that tasks can
be saved locally on the device, but what if we also want to
save and load those tasks to a network service? In this next step, we'll
create a data source to communicate with a
simulated network service. To do this, we'll
follow a similar process to the local data source. So we start by defining
a model for a task coming from the network
called NetworkTask. To do this, I'll just use
one of these files that's already been created. So here's our
NetworkTask data class. And the differences between
NetworkTask and our LocalTask are that the description field is named shortDescription rather than just description; instead of an isCompleted Boolean, we have a status field, which is an enum that can be ACTIVE or COMPLETE; and we also have an extra priority field, which just isn't used in the app.
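A sketch of that model (the exact shape in the codelab may differ slightly):

```kotlin
// Network representation of a task: different field names, a status enum
// instead of a Boolean, and a priority field the app doesn't use.
data class NetworkTask(
    val id: String,
    val title: String,
    val shortDescription: String,
    val priority: Int? = null,
    val status: TaskStatus = TaskStatus.ACTIVE,
) {
    enum class TaskStatus { ACTIVE, COMPLETE }
}
```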
So now we have a data model to represent the network data. Let's go ahead and create the network data source. So again, I will
just paste this in. OK, if I remove the
emulator and just make the local window go away,
we can scroll through this. So the network data source
provides two methods. We have loadTasks, which loads
all the tasks from the network, and we have saveTasks, which
saves all the specified tasks to the network server. And each of these methods has a delay of 2 seconds, and that's just to simulate either network or server latency. And this class also includes some test data, which we will use later on to verify that everything's working correctly.
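In outline, the simulated data source is something like this simplified sketch (the codelab's version also seeds the test data mentioned above):

```kotlin
import kotlinx.coroutines.delay

// Simulated network data source: an in-memory list plus artificial latency.
class TaskNetworkDataSource {

    private var tasks = listOf<NetworkTask>()

    suspend fun loadTasks(): List<NetworkTask> {
        delay(SERVICE_LATENCY_MILLIS) // simulate network/server latency
        return tasks
    }

    suspend fun saveTasks(newTasks: List<NetworkTask>) {
        delay(SERVICE_LATENCY_MILLIS)
        tasks = newTasks
    }

    companion object {
        private const val SERVICE_LATENCY_MILLIS = 2000L
    }
}
```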
OK, great. So we now have two data sources, one for the local data and one for network data. Each one allows reads and writes and has its own representation of a task. Now let's create a repository,
which will use these data sources and provide an API so
that other architectural layers can access this task data. We'll start by creating a class
called DefaultTaskRepository. Now, this repository
class is going to take our two data
sources as dependencies, so we'll just import them here. There we go. OK, let's add a method
called observeAll, which will expose all of the tasks. Now, repositories
should expose data from a single source
of truth, and it's common to make that source
of truth the local database. And we can obtain tasks
using our local data source's observeAll method. But that will
return local tasks. Here we go. And we need to convert
those local tasks to our external
model, named Task. To perform this
conversion, we need to map the fields from
LocalTask to the fields in Task, and we can do this by
creating an extension function inside LocalTask. So I've actually got a
couple of functions here which can do those conversions. And this is just mapping the fields directly from LocalTask to Task, and then we have an extra toExternal function here, which does the same thing, but on a list of those local tasks.
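Roughly, and assuming the field names used above, those conversions look like this:

```kotlin
// Maps a Room entity to the external model. Lives alongside LocalTask,
// on the boundary where the conversion is needed.
fun LocalTask.toExternal() = Task(
    id = id,
    title = title,
    description = description,
    isCompleted = isCompleted,
)

// Convenience overload for whole lists.
fun List<LocalTask>.toExternal() = map(LocalTask::toExternal)
```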
So a key point here is that mapping functions should live on the boundaries of where they are used. In this case, LocalTask
is a good place for mapping functions
to and from that type. Now, whenever we need to convert a LocalTask to a Task, we can just call toExternal, and that's exactly what we'll do inside TaskRepository. So we can just map all of those tasks to the external representation.
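In the repository that's roughly a one-liner (a sketch; the constructor parameters shown are assumptions consistent with what we've built so far):

```kotlin
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.map
import javax.inject.Inject

class DefaultTaskRepository @Inject constructor(
    private val localDataSource: TaskDao,
    private val networkDataSource: TaskNetworkDataSource,
) {
    // The local database is the single source of truth: observe it and map
    // each emission of LocalTasks to the external Task model.
    fun observeAll(): Flow<List<Task>> =
        localDataSource.observeAll().map { tasks -> tasks.toExternal() }
}
```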
Now each time the task data changes in the local database, a new list of local tasks will be emitted into the flow. Each local task is
then mapped to a task. Great. Now, other layers can
use observeAll to obtain all the tasks from
our local database and be notified whenever
those tasks change. OK, so a to-do app
isn't much good if you can't create
and update tasks, so let's add a method
now to create a task. So it takes the title and
description of the task and returns the ID of the
task once it's been created. You'll notice that this
is a suspending function, whereas the previous
method is not. This is because createTask
is a one-shot operation, whereas observeAll
allows the caller to be notified of
changes over time by returning a flow immediately. Another important point
here is that the data layer forbids the task from
being created directly by other layers. They must use createTask
by supplying a title and description. This approach allows
any business logic for creating and storing
the task to be encapsulated, so let's add a method to
create an ID for a task. And now we use that
method to create the ID inside createTask. And there we go. OK, so this ID creation
code might be fine, but what if it's
computationally expensive? Perhaps it's using
cryptography, which takes several seconds
to create each ID. This could lead to UI jank if
it's called on the main thread. And the data layer
has a responsibility to ensure that long running
or complex operations do not block the main thread. We can protect against this by
using a coroutine dispatcher to execute these instructions,
so let's add a dispatcher to our repository. So we can use the already
defined default dispatcher qualifier to tell Hilt
to inject this dependency with Dispatchers.Default, which is a dispatcher optimized
for CPU-intensive work. Now we can place
the ID creation code inside a withContext block. And this will ensure
that the correct thread is used for execution. So now we have a task ID. We can use it along with
the supplied parameters to create a new task. Before inserting the task
into the local data source, we need to map it
to a local task. And we can do this
the same way as we did before by
creating an extension function inside LocalTask. This is the inverse
mapping function to the toExternal function
that we created earlier. So just flick back to LocalTask
and add another extension function here. And we can now use this to insert the task into the local data source. So there we go. And the last thing we do is return the task ID.
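Putting those pieces together, the inverse mapping and createTask look roughly like this (a sketch: dispatcher is the injected default dispatcher, and helper names are approximations of the codelab's):

```kotlin
// In LocalTask.kt — the inverse of toExternal:
fun Task.toLocal() = LocalTask(
    id = id,
    title = title,
    description = description,
    isCompleted = isCompleted,
)

// In DefaultTaskRepository (imports assumed: java.util.UUID,
// kotlinx.coroutines.withContext):
suspend fun createTask(title: String, description: String): String {
    // Run the potentially expensive ID generation off the main thread.
    val taskId = withContext(dispatcher) {
        UUID.randomUUID().toString()
    }
    val task = Task(id = taskId, title = title, description = description)
    localDataSource.upsert(task.toLocal())
    return taskId
}
```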
So while we're adding methods to update the data, let's also add a method to complete a task. It just takes the ID of the task
we want to mark as complete. OK, great. We now have some useful methods
for creating and completing tasks. The final step in
creating the repository is to implement a data
synchronization strategy. Our repository should perform
three types of operation. Firstly, loading data. This is done from
the local data source only-- no network interaction. Next, when saving
data, it should be written to the
local data source first, then to the network. And finally, a refresh operation
will load all the information from the network and overwrite
the local data source. It's important to note
that this strategy is far too basic for a real app. Luckily, we have some
excellent guidance for more robust and
efficient strategies for data synchronization,
which you should definitely check out. OK, let's head back
to Android Studio to implement our
basic sync operations. So the repository already
loads tasks from the local data source using the
observeAll method, which we defined up here. We just need to add methods
to save and refresh data from the network data source. First, let's create a mapping
function from LocalTask to NetworkTask and vice versa inside NetworkTask. And we'll just fix this import.
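A sketch of those two mappings, using the field and status names described earlier:

```kotlin
// LocalTask -> NetworkTask
fun LocalTask.toNetwork() = NetworkTask(
    id = id,
    title = title,
    shortDescription = description,
    status = if (isCompleted) NetworkTask.TaskStatus.COMPLETE else NetworkTask.TaskStatus.ACTIVE,
)

fun List<LocalTask>.toNetwork() = map(LocalTask::toNetwork)

// NetworkTask -> LocalTask
fun NetworkTask.toLocal() = LocalTask(
    id = id,
    title = title,
    description = shortDescription,
    isCompleted = status == NetworkTask.TaskStatus.COMPLETE,
)

fun List<NetworkTask>.toLocal() = map(NetworkTask::toLocal)
```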
So here you can see the advantage of having separate models for each data source. The mapping of one
data type to another is encapsulated into
separate functions, and we could, and probably
should, in a real app, test these functions. So back to the repository,
we'll add a refresh method. Now, this replaces
all the local tasks with those from the network. So withContext here is used for the bulk toLocal operation, because there's an unknown number of tasks and each mapping operation could be computationally expensive.
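In sketch form, refresh is roughly:

```kotlin
// In DefaultTaskRepository — overwrite local data with the network's copy.
suspend fun refresh() {
    val remoteTasks = networkDataSource.loadTasks()
    localDataSource.deleteAll()
    // Map on the injected dispatcher: the list size is unknown and each
    // mapping operation could be expensive.
    val localTasks = withContext(dispatcher) {
        remoteTasks.toLocal()
    }
    localDataSource.upsertAll(localTasks)
}
```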
So now let's add a method which saves tasks to the network. OK, fixed that. So the first() function here is used to obtain the first item from
the task stream, which contains every task in the database. And this is then used to replace
all of the network tasks, which happens here. Now we can update the
methods that modify data so that the local data is saved
to the network when it changes. So we'll just update
completeTask, which is here. So we just make a call
to saveTasksToNetwork. And we'll also
update createTask, so right at the bottom here. Make sure that happens as well. OK, one issue with this code is
that saveTasksToNetwork forces its callers to wait until the
data is saved to the network, and this isn't great
because it means the user will have to
wait for two seconds to create or update a task. And no one wants to
wait to create a task, especially when
they're really busy. A better solution is to
use a different coroutine scope to allow the operation
to complete in the background. So same way as we did
with the dispatcher, let's add a coroutine scope
to our repository. And there we go. Now we've previously defined
a Hilt qualifier, ApplicationScope, which we can use to inject a scope that follows the lifecycle of the app. Now if we wrap the code inside
saveTasksToNetwork inside scope.launch-- And we'll just move this inside. And fix the formatting. There we go. So now saveTasksToNetwork
will return immediately and the task will be saved to the network in the background, allowing the user to carry on creating and updating tasks.
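Sketched out, the background save and one of its callers look roughly like this (scope is the injected application-scoped CoroutineScope; names follow the transcript but details are approximate):

```kotlin
// In DefaultTaskRepository (imports assumed: kotlinx.coroutines.launch,
// kotlinx.coroutines.flow.first).
private fun saveTasksToNetwork() {
    // Fire-and-forget: callers return immediately and the upload runs
    // in the background on the application scope.
    scope.launch {
        val localTasks = localDataSource.observeAll().first()
        networkDataSource.saveTasks(localTasks.toNetwork())
    }
}

suspend fun completeTask(taskId: String) {
    localDataSource.updateCompleted(taskId = taskId, completed = true)
    saveTasksToNetwork()
}
```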
We're almost there. We've implemented a repository, which can expose, update, and synchronize data between our two data sources. So we've added a
lot of functionality to the data layer, so
we should definitely verify that it all
works by creating local tests for our repository. So let's go ahead and
create those tests. So here we have a blank file
for testing the repository. So before we can
write any tests, we need to instantiate
the repository with test dependencies for the
local and network data sources, so we'll actually
create those first, starting with a fake
data access object. I've got a class here,
which is the fake. Now all this does is replicate
the behavior of the DAO without needing it to
be backed by a database. So in a real app, we'd also
create a fake dependency to replace the
network data source, but since it's already just
simulated here in code, we'll just use it
directly in this workshop. Let's now create a test
class for the repository. So this contains some test data. It contains the local
and network data sources. It contains a test
dispatcher and a test scope, so that we can have
deterministic tests, which run on a single thread. And lastly, it
contains the subject under test, our repository. So now we can start
writing local tests. So let's start by
testing the exposed data. So what we do here is we get the first item emitted into the task stream, then test that those tasks match the external representation of the tasks from the local data source.
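In outline it looks something like this (the fake's constructor and the repository's final parameter list are assumptions based on what we've built):

```kotlin
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.test.StandardTestDispatcher
import kotlinx.coroutines.test.TestScope
import kotlinx.coroutines.test.runTest
import org.junit.Assert.assertEquals
import org.junit.Test

class DefaultTaskRepositoryTest {

    private val localTasks = listOf(
        LocalTask(id = "1", title = "title1", description = "desc1", isCompleted = false),
        LocalTask(id = "2", title = "title2", description = "desc2", isCompleted = true),
    )

    // Test doubles: a fake DAO plus the simulated network data source, driven
    // by a single-threaded test dispatcher and scope for determinism.
    private val testDispatcher = StandardTestDispatcher()
    private val testScope = TestScope(testDispatcher)
    private val localDataSource = FakeTaskDao(localTasks)
    private val networkDataSource = TaskNetworkDataSource()

    private val repository = DefaultTaskRepository(
        localDataSource = localDataSource,
        networkDataSource = networkDataSource,
        dispatcher = testDispatcher,
        scope = testScope,
    )

    @Test
    fun observeAll_exposesLocalData() = runTest(testDispatcher) {
        // The first emission should be the local tasks, mapped to Task.
        val tasks = repository.observeAll().first()
        assertEquals(localTasks.toExternal(), tasks)
    }
}
```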
Let's just run that and make sure that it passes. Great, OK, that passes. Now let's test the data updates. This verifies that
when a task is created, it appears in both the local
and network data sources. So we create the task
here, then verify it in the local and
network data sources. And we can create a similar test
for marking a task as complete. And lastly, let's
create a test which verifies the refresh operation. And all this does is verify
that after a refresh operation, which happens here, that
the network tasks match the local tasks. So let's run all of
these tests together. OK, great. They all pass. Now we know that the
data layer works. It's time to connect
it to the UI layer. Let's start with TasksViewModel. And this is a ViewModel for
displaying the first screen in the app, the list
of all the tasks. So we can pass our
repository as a parameter. And we'll just import it and
initialize this task stream using our observeAll
method in the repository. Our ViewModel now has access
to all the tasks provided by the repository and will receive a new list of tasks each time the data changes, and that's all in just one line of code.
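In rough outline (just the data-layer hookup; the codelab's ViewModel also maps this stream into a UI state, which is omitted here):

```kotlin
import androidx.lifecycle.ViewModel
import dagger.hilt.android.lifecycle.HiltViewModel
import kotlinx.coroutines.flow.Flow
import javax.inject.Inject

@HiltViewModel
class TasksViewModel @Inject constructor(
    private val taskRepository: DefaultTaskRepository,
) : ViewModel() {

    // The one line in question: every data change flows straight from the
    // repository into the ViewModel's task stream.
    private val tasks: Flow<List<Task>> = taskRepository.observeAll()

    // ...UI state and user actions are wired up next...
}
```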
All that remains is for us to connect the user actions to their corresponding methods in the repository, so let's start
with completeTask. Here it is. And we'll also connect
the refresh operation. There we go. And we can follow
a similar process for the
AddEditTaskViewModel, which is responsible for adding
and editing the tasks. We have the repository
as a parameter. And then we call it when
we create a new task. OK, it's the moment we've
all been waiting for. Let's run the app. OK, great. So the data has been
loaded, although we don't have any tasks just
yet in our local data source. So let's go ahead and refresh
the data from the network. Make that a bit bigger. OK, great. So the loading spinner
appeared for two seconds, then the tasks appear, which
came from our network data source. Now let's create a task. Just move this over. There we go. So we add a title
and description, and now tap the tick
button to save it. And you'll see the task
appears in the list. And if we tap on the
checkbox, the task is then marked as complete. So that's it. Our task management app
is working as intended. Excellent. So to wrap up, in
this workshop, we learned about the
role of the data layer in Android app architecture. We learned how to create
data sources and data models. We learned about the
role of repositories, how they expose data, and
provide one-time methods to update that data, as well as when to change
the coroutine dispatcher, and why it's important to
do so for complex jobs. We also learned about
data synchronization using multiple data
sources, and how to create local and instrumented
tests for common data layer classes. You can check out our
data layer guidance for more detailed
information on this topic. Or for a more complex
real world example, check out the Now
in Android app. Thanks very much. [MUSIC PLAYING]