How to build a data layer

Captions
[MUSIC PLAYING] DON TURNER: Hello, I'm Don, and I'm an engineer on the Android Developer Relations team. Today, I'm going to show you how to build a data layer for your app. This will be a workshop based on the data layer codelab. If you'd like to follow along, you can do so using the link in the description. This workshop will cover how to create repositories, data sources, and data models for effective, scalable data management; exposing data to other architectural layers; handling data updates and complex or long-running tasks; data synchronization between multiple data sources; and lastly, how to create local and instrumented tests that verify the behavior of your data layer. In this workshop, we'll build a task management app. This will allow you to add tasks and mark them as completed. We won't be writing the app from scratch. Instead, we'll be working on an app which already has a UI layer. During this workshop, we'll add the data layer, then connect it to the existing UI layer, allowing the app to become fully functional. So I've already cloned the codelab project and imported it into Android Studio. And if we run the project, we can see the app running with a loading spinner, waiting for the data to be loaded. By the end of this workshop, a list of tasks should be displayed on this screen. And we'll do that by building the data layer. But what is the data layer? Well, it's an architectural layer which, as its name suggests, manages your application data. It also contains business logic, which is what gives value to your app. These are the real-world business rules that determine how application data is created, stored, and updated. It provides methods to other layers to allow the data to be read and updated following those business logic rules. The key component types which make up the data layer are repositories, data sources, and data models. Let's take a look at each of these types in more detail. 
The application data is usually represented as data models. These are in-memory representations of the data. Since we're creating a task management app, we need a data model for a task. Here's the Task class. A key point about data models is that they are immutable. Other layers cannot change the task properties. They must use the data layer if they want to make changes to a task. Task is an example of an external data model because it is exposed externally to the data layer and can be accessed by other layers. But we'll also define internal data models, which are only used inside the data layer. It's good practice to define a data model for each place that it's stored. So, for example, in this app, we have a LocalTask, which is a task stored in a local database, and a NetworkTask, which is a task which has been retrieved from a network server. These are both examples of internal data models, and they come from data sources. Speaking of data sources, this is an object responsible for reading and writing data to a single source, such as a database or a network service. In this app, there are two data sources. TaskDao, where DAO stands for data access object, is a local data source which reads and writes tasks to a database. TaskNetworkDataSource reads and writes tasks to a network server. Lastly, let's talk about repositories. A repository is what brings these data sources together. It's responsible for a single external data model. In this app, we'll create a task repository, which manages tasks. Its role will be to expose those tasks, provide methods for updating tasks, execute business logic, such as creating a unique ID for each task, combine or map internal data models from data sources into tasks, and lastly, synchronize data sources, that is, copy data between the local database and the network. OK, that's enough theory. Let's write some code. We'll start by creating a data model and a data source for the local database. 
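As a rough sketch of what an immutable external model looks like, here is a minimal Task data class. The exact field names are assumptions based on the transcript (an ID, a title, a description, and a completion flag), so the codelab's actual class may differ slightly.

```kotlin
// A minimal sketch of the external Task model. All properties are `val`,
// so other layers can read a Task but cannot mutate it; changes must go
// through the data layer's methods.
data class Task(
    val id: String,
    val title: String,
    val description: String = "",
    val isCompleted: Boolean = false,
)
```

Because it is a data class with `val` properties, "updating" a task means creating a new copy (for example via `copy()`), which is exactly the behavior the data layer relies on.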
So empty files have already been created for us inside the data source local package, so let's open LocalTask and create a data class. Just get rid of that. So I'm just going to copy in the data class code here. OK, so the Entity annotation here tells Room that we want to create a table named task. And because of annotations like Entity, this class is strongly coupled to Room and shouldn't be used for other data sources, such as DataStore. The Local prefix in the class name is used to indicate that this data will be stored locally. So now that we have a data model, let's create a data source to create, read, update, and delete those local tasks. And I'll do this inside TaskDao for reasons that will become clear shortly. So again, I'm just going to copy and paste the code in. So since we're using Room, we can create our local data source using Room's data access object, or DAO for short. And this is as simple as specifying the Dao annotation on an interface, and then defining methods and associated SQL for reading and writing the data. You'll notice the methods for reading data are prefixed with observe, and these are non-suspending functions which return a Flow. This means that each time the underlying data changes, a new item will be emitted into the stream, and this is great because you can listen for data changes rather than polling the database. The methods for writing data are suspending functions because they are performing I/O operations. And in case you're wondering what upsert means, it just means to update an item if it already exists or insert it if it doesn't. So the next thing we'll do is update the database so that it will store those local tasks. We just need to replace this blank entity here with the LocalTask, which we just created. So I'll just change that here. OK. And we should also add a method to return the data access object. There we go. And I can also get rid of this code. It's no longer required. 
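The entity and DAO described above might be sketched roughly as follows. This assumes Room 2.5+ (for the `@Upsert` annotation); the exact fields and SQL in the codelab may differ.

```kotlin
import androidx.room.Dao
import androidx.room.Entity
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.Upsert
import kotlinx.coroutines.flow.Flow

// Room-coupled internal model: the @Entity annotation maps it to a "task" table.
@Entity(tableName = "task")
data class LocalTask(
    @PrimaryKey val id: String,
    var title: String,
    var description: String,
    var isCompleted: Boolean,
)

// The local data source: Room generates the implementation of this interface.
@Dao
interface TaskDao {
    // Non-suspending, returns a Flow: a new list is emitted every time
    // the underlying table changes, so callers can listen instead of polling.
    @Query("SELECT * FROM task")
    fun observeAll(): Flow<List<LocalTask>>

    // Suspending because it performs I/O: updates the row if it exists,
    // inserts it otherwise.
    @Upsert
    suspend fun upsert(task: LocalTask)
}
```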
OK, and just to be clear, the data source is responsible for providing access to the data. The database is the mechanism used to store data to disk. So this project uses Hilt for dependency injection, and Hilt needs to know how to create our data source so that it can be injected into classes that use it. We already have a Hilt module, which will inject the database. So let's use that same module to inject the data source. We're just going to add a Provides method, which will return the TaskDao from the database, and I'll just fix this import here. OK, great. We now have all the pieces required to read and write tasks to the local database. Now, we've written quite a lot of code so far, but how do we know that it works correctly? It's easy to make a mistake with all of those SQL queries. So let's create tests to verify that TaskDao behaves as it should. Now, tests aren't part of the app, so they should be placed in a different folder, and there are two testing folders. I'll just expand the Project Explorer here. So we have androidTest here, which contains tests which are run on an Android emulator or device, and these are known as instrumented tests. And then we also have this test folder, which contains tests which run on your local machine, also known as local tests. TaskDao requires a Room database, which can only be created on an Android device, so therefore we need to create an instrumented test. So let's create a class called TaskDaoTest. OK, so here we go. Now, the first thing we'll do is add a database to it, which is initialized before each test. And this just ensures that the database is created afresh every time we run a test. So we're using an in-memory database here because it's much faster than a disk-based database, and this makes it a good choice for automated tests because the data doesn't need to persist for longer than those tests. So now we can start writing tests. Now, a good way of structuring tests is to follow the given-when-then structure. 
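The instrumented test setup just described might look roughly like this. The database class name (`ToDoDatabase`) and the test names are assumptions; the codelab's actual code may differ.

```kotlin
import androidx.room.Room
import androidx.test.core.app.ApplicationProvider
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.test.runTest
import org.junit.Assert.assertEquals
import org.junit.Before
import org.junit.Test

class TaskDaoTest {

    private lateinit var database: ToDoDatabase

    @Before
    fun initDb() {
        // A fresh in-memory database before every test: much faster than a
        // disk-based one, and the data doesn't need to outlive the test.
        database = Room.inMemoryDatabaseBuilder(
            ApplicationProvider.getApplicationContext(),
            ToDoDatabase::class.java
        ).allowMainThreadQueries().build()
    }

    @Test
    fun insertTask_andObserve_emitsInsertedTask() = runTest {
        // Given an empty database, when a task is inserted...
        val task = LocalTask(id = "1", title = "Title", description = "Desc", isCompleted = false)
        database.taskDao().upsert(task)

        // ...then the first item of the observed stream is exactly that task.
        val tasks = database.taskDao().observeAll().first()
        assertEquals(1, tasks.size)
        assertEquals(task, tasks[0])
    }
}
```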
So to test that a task is inserted correctly, we can say, given an empty database-- so let me just paste this in so you can see it. So given an empty database, when a task is inserted and we start observing the task stream, then the first item in the task stream matches the task which was inserted, and here's the test. So it's good practice to start with a failing test, and this just verifies that the test is actually running and that the correct objects and their dependencies are being tested. In this case, we'll just check the size of the task list is 0 here, rather than the expected size of 1, which is what we would expect, given that we are inserting a task. So if we click the Play icon next to the test, we'll go through the build process, and this will actually be deployed onto the emulator, and we should see a failing test. And this should come up in orange. OK, there we have it. So we see this message expected 0, but it was actually 1. So let's now fix that by changing the assertions here. And we're going to test that one and only one task is inserted into the database, and we'll also make sure that it's the same task that we inserted. So again, we'll just run this, and hopefully we should see a green tick. OK, there we have it. So that confirms that our local data source is behaving correctly, both now and in the future. So it's great that tasks can be saved locally on the device, but what if we also want to save and load those tasks to a network service? In this next step, we'll create a data source to communicate with a simulated network service. To do this, we'll follow a similar process to the local data source. So we start by defining a model for a task coming from the network called NetworkTask. To do this, I'll just use one of these files that's already been created. So here's our NetworkTask data class. 
And the differences between NetworkTask and our LocalTask are that the description field name is changed, so it's shortDescription rather than just description. Instead of an isCompleted Boolean, we have this status field, which is an enum, which can be active or complete. And we also have this extra field, priority, which just isn't used in the app. So now we have a data model to represent the network data. Let's go ahead and create the network data source. So again, I will just paste this in. OK, if I remove the emulator and just make the local window go away, we can scroll through this. So the network data source provides two methods. We have loadTasks, which loads all the tasks from the network, and we have saveTasks, which saves all the specified tasks to the network server. And each of these methods has a delay of 2 seconds, and that's just to simulate network or server latency. And this class also includes some test data, which we will use later on to verify that everything's working correctly. OK, great. So we now have two data sources, one for the local data and one for network data. Each one allows reads and writes and has its own representation of a task. Now let's create a repository, which will use these data sources and provide an API so that other architectural layers can access this task data. We'll start by creating a class called DefaultTaskRepository. Now, this repository class is going to take our two data sources as dependencies, so we'll just import them here. There we go. OK, let's add a method called observeAll, which will expose all of the tasks. Now, repositories should expose data from a single source of truth, and it's common to make that source of truth the local database. And we can obtain tasks using our local data source's observeAll method. But that will return local tasks. Here we go. And we need to convert those local tasks to our external model, named Task. 
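A rough sketch of the network model described above might look like this. The priority type and the default values are assumptions, since the transcript only says the field exists and is unused.

```kotlin
// Sketch of the network-side internal model. Note the differences from
// LocalTask: shortDescription instead of description, a status enum
// instead of an isCompleted Boolean, and an extra (unused) priority field.
data class NetworkTask(
    val id: String,
    val title: String,
    val shortDescription: String = "",
    val priority: Int = 0,
    val status: TaskStatus = TaskStatus.ACTIVE,
) {
    enum class TaskStatus { ACTIVE, COMPLETE }
}
```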
To perform this conversion, we need to map the fields from LocalTask to the fields in Task, and we can do this by creating an extension function inside LocalTask. So I've actually got a couple of functions here which can do those conversions. And this is just mapping the fields directly from LocalTask to Task, and then we have an extra toExternal function here, which does the same thing, but on a list of those local tasks. So a key point here is that mapping functions should live on the boundaries of where they are used. In this case, LocalTask is a good place for mapping functions to and from that type. Now, whenever we need to convert a LocalTask to a Task, we can just call toExternal, and that's exactly what we'll do inside TaskRepository. So we can just map all of those tasks to the external representation. Now, each time the task data changes in the local database, a new list of local tasks will be emitted into the flow. Each LocalTask is then mapped to a Task. Great. Now, other layers can use observeAll to obtain all the tasks from our local database and be notified whenever those tasks change. OK, so a to-do app isn't much good if you can't create and update tasks, so let's add a method now to create a task. So it takes the title and description of the task and returns the ID of the task once it's been created. You'll notice that this is a suspending function, whereas the previous method is not. This is because createTask is a one-shot operation, whereas observeAll allows the caller to be notified of changes over time by returning a flow immediately. Another important point here is that the data layer forbids the task from being created directly by other layers. They must use createTask by supplying a title and description. This approach allows any business logic for creating and storing the task to be encapsulated, so let's add a method to create an ID for a task. And now we use that method to create the ID inside createTask. And there we go. 
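The mapping functions described above can be sketched like this. Minimal stand-ins for the two models are included so the example is self-contained; the codelab's real classes carry Room annotations and live in separate files.

```kotlin
// Minimal stand-ins for the internal and external models.
data class LocalTask(val id: String, val title: String, val description: String, val isCompleted: Boolean)
data class Task(val id: String, val title: String, val description: String, val isCompleted: Boolean)

// Convert a single LocalTask to the external model. This lives next to
// LocalTask: mapping functions belong on the boundary of the type they map.
fun LocalTask.toExternal() = Task(
    id = id,
    title = title,
    description = description,
    isCompleted = isCompleted,
)

// The same conversion over a whole list, so the repository can map each
// Flow emission in a single call.
fun List<LocalTask>.toExternal() = map { it.toExternal() }
```

In the repository, `observeAll()` then becomes a one-liner along the lines of `localDataSource.observeAll().map { it.toExternal() }`.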
OK, so this ID creation code might be fine, but what if it's computationally expensive? Perhaps it's using cryptography, which takes several seconds to create each ID. This could lead to UI jank if it's called on the main thread. And the data layer has a responsibility to ensure that long-running or complex operations do not block the main thread. We can protect against this by using a coroutine dispatcher to execute these instructions, so let's add a dispatcher to our repository. So we can use the already defined DefaultDispatcher qualifier to tell Hilt to inject this dependency with Dispatchers.Default, which is a dispatcher optimized for CPU-intensive work. Now we can place the ID creation code inside a withContext block. And this will ensure that the correct thread is used for execution. So now we have a task ID. We can use it along with the supplied parameters to create a new task. Before inserting the task into the local data source, we need to map it to a LocalTask. And we can do this the same way as we did before, by creating an extension function inside LocalTask. This is the inverse mapping function to the toExternal function that we created earlier. So let's flip back to LocalTask and add another extension function here. And we can now use this to insert the task into the local data source. So there we go. And the last thing we do is return the task ID. So while we're adding methods to update the data, let's also add a method to complete a task. It just takes the ID of the task we want to mark as complete. OK, great. We now have some useful methods for creating and completing tasks. The final step in creating the repository is to implement a data synchronization strategy. Our repository should perform three types of operation. Firstly, loading data. This is done from the local data source only-- no network interaction. Next, when saving data, it should be written to the local data source first, then to the network. 
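Putting the pieces from this step together, createTask might be sketched roughly as follows. The UUID-based ID is a stand-in for the codelab's actual ID logic, and `toLocal()` is the inverse mapping function mentioned above.

```kotlin
import java.util.UUID
import javax.inject.Inject
import kotlinx.coroutines.CoroutineDispatcher
import kotlinx.coroutines.withContext

class DefaultTaskRepository @Inject constructor(
    private val localDataSource: TaskDao,
    private val networkDataSource: TaskNetworkDataSource,
    @DefaultDispatcher private val dispatcher: CoroutineDispatcher,
) {
    suspend fun createTask(title: String, description: String): String {
        // Run the (potentially expensive) ID creation off the main thread,
        // on the CPU-optimized default dispatcher injected by Hilt.
        val taskId = withContext(dispatcher) {
            UUID.randomUUID().toString()
        }
        val task = Task(id = taskId, title = title, description = description)
        // Map the external model to its Room representation before storing it.
        localDataSource.upsert(task.toLocal())
        return taskId
    }
}
```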
And finally, a refresh operation will load all the information from the network and overwrite the local data source. It's important to note that this strategy is far too basic for a real app. Luckily, we have some excellent guidance for more robust and efficient strategies for data synchronization, which you should definitely check out. OK, let's head back to Android Studio to implement our basic sync operations. So the repository already loads tasks from the local data source using the observeAll method, which we defined up here. We just need to add methods to save and refresh data from the network data source. First, let's create a mapping function from LocalTask to NetworkTask, and vice versa, inside NetworkTask. And we'll just fix this import. So here you can see the advantage of having separate models for each data source. The mapping of one data type to another is encapsulated into separate functions, and we could, and probably should, in a real app, test these functions. So back to the repository, we'll add a refresh method. Now, this replaces all the local tasks with those from the network. So withContext here is used for the bulk toLocal operation because there's an unknown number of tasks, and each mapping operation could be computationally expensive. So now let's add a method which saves tasks to the network. OK, fixed that. So the first function here is used to obtain the first item from the task stream, which contains every task in the database. And this is then used to replace all of the network tasks, which happens here. Now we can update the methods that modify data so that the local data is saved to the network when it changes. So we'll just update completeTask, which is here. So we just make a call to saveTasksToNetwork. And we'll also update createTask, so right at the bottom here. Make sure that happens as well. 
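The two sync operations just described might be sketched like this, inside DefaultTaskRepository. The `toLocal()`/`toNetwork()` bulk mapping helpers and the `deleteAll`/`upsertAll` DAO methods are assumed from context; names in the codelab may differ.

```kotlin
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.withContext

// Refresh: the network wins. Fetch everything, then overwrite local storage.
suspend fun refresh() {
    val remoteTasks = networkDataSource.loadTasks()
    localDataSource.deleteAll()
    val localTasks = withContext(dispatcher) {
        // Bulk mapping of an unknown number of tasks, kept off the main thread.
        remoteTasks.toLocal()
    }
    localDataSource.upsertAll(localTasks)
}

// Save: take the current snapshot of the task stream (the first emission
// contains every task in the database) and replace the network copy with it.
suspend fun saveTasksToNetwork() {
    val localTasks = localDataSource.observeAll().first()
    networkDataSource.saveTasks(localTasks.toNetwork())
}
```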
OK, one issue with this code is that saveTasksToNetwork forces its callers to wait until the data is saved to the network, and this isn't great because it means the user will have to wait for two seconds to create or update a task. And no one wants to wait to create a task, especially when they're really busy. A better solution is to use a different coroutine scope to allow the operation to complete in the background. So in the same way as we did with the dispatcher, let's add a coroutine scope to our repository. And there we go. Now, we've previously defined a Hilt qualifier, ApplicationScope, which we can use to inject a scope which follows the lifecycle of the app. Now, if we wrap the code inside saveTasksToNetwork inside scope.launch-- and we'll just move this inside. And fix the formatting. There we go. So now saveTasksToNetwork will return immediately, and the tasks will be saved to the network in the background, allowing the user to carry on creating and updating tasks. We're almost there. We've implemented a repository which can expose, update, and synchronize data between our two data sources. So we've added a lot of functionality to the data layer, so we should definitely verify that it all works by creating local tests for our repository. So let's go ahead and create those tests. So here we have a blank file for testing the repository. So before we can write any tests, we need to instantiate the repository with test dependencies for the local and network data sources, so we'll actually create those first, starting with a fake data access object. I've got a class here, which is the fake. Now, all this does is replicate the behavior of the DAO without needing it to be backed by a database. So in a real app, we'd also create a fake dependency to replace the network data source, but since it's already just simulated here in code, we'll just use it directly in this workshop. Let's now create a test class for the repository. So this contains some test data. 
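The fire-and-forget version of saveTasksToNetwork might be sketched like this, assuming an application-lifetime CoroutineScope injected via the ApplicationScope qualifier.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch

// `scope` is injected with @ApplicationScope, so the work survives as long
// as the app does, independent of whichever screen triggered it.
private fun saveTasksToNetwork() {
    scope.launch {
        // Launching in the application scope means this method returns
        // immediately; the network write completes in the background.
        val localTasks = localDataSource.observeAll().first()
        networkDataSource.saveTasks(localTasks.toNetwork())
    }
}
```

One trade-off of this design: callers can no longer observe whether the network save succeeded, which is acceptable here only because the sync strategy is intentionally basic.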
It contains the local and network data sources. It contains a test dispatcher and a test scope, so that we can have deterministic tests, which run on a single thread. And lastly, it contains the subject under test, our repository. So now we can start writing local tests. So let's start by testing the exposed data. So what we do here is we get the first item emitted into the task stream, then test that those tasks match the external representation of the tasks from the local data source. Let's just run that and make sure that it passes. Great, OK, that passes. Now let's test the data updates. This verifies that when a task is created, it appears in both the local and network data sources. So we create the task here, then verify it in the local and network data sources. And we can create a similar test for marking a task as complete. And lastly, let's create a test which verifies the refresh operation. And all this does is verify that after a refresh operation, which happens here, that the network tasks match the local tasks. So let's run all of these tests together. OK, great. They all pass. Now we know that the data layer works. It's time to connect it to the UI layer. Let's start with TasksViewModel. And this is a ViewModel for displaying the first screen in the app, the list of all the tasks. So we can pass our repository as a parameter. And we'll just import it and initialize this task stream using our observeAll method in the repository. Our ViewModel now has access to all the tasks provided by the repository and will receive a new list of tasks each time the data changes, and that's all in just one line of code. All that remains is for us to connect the user actions to their corresponding methods in the repository, so let's start with completeTask. Here it is. And we'll also connect the refresh operation. There we go. And we can follow a similar process for the AddEditTaskViewModel, which is responsible for adding and editing the tasks. 
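The ViewModel wiring described above might look roughly like this. The repository method names (`complete`, `refresh`) and the exposed Flow type are assumptions based on the transcript.

```kotlin
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import dagger.hilt.android.lifecycle.HiltViewModel
import javax.inject.Inject
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.launch

@HiltViewModel
class TasksViewModel @Inject constructor(
    private val taskRepository: TaskRepository,
) : ViewModel() {

    // One line: the UI observes the repository's task stream and receives
    // a new list every time the underlying data changes.
    val tasks: Flow<List<Task>> = taskRepository.observeAll()

    // User actions delegate to the repository's one-shot methods.
    fun completeTask(taskId: String) {
        viewModelScope.launch { taskRepository.complete(taskId) }
    }

    fun refresh() {
        viewModelScope.launch { taskRepository.refresh() }
    }
}
```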
We have the repository as a parameter. And then we call it when we create a new task. OK, it's the moment we've all been waiting for. Let's run the app. OK, great. So the data has been loaded, although we don't have any tasks just yet in our local data source. So let's go ahead and refresh the data from the network. Make that a bit bigger. OK, great. So the loading spinner appeared for two seconds, then the tasks appear, which came from our network data source. Now let's create a task. Just move this over. There we go. So we add a title and description, and now tap the tick button to save it. And you'll see the task appears in the list. And if we tap on the checkbox, the task is then marked as complete. So that's it. Our task management app is working as intended. Excellent. So to wrap up, in this workshop, we learned about the role of the data layer in Android app architecture. We learned how to create data sources and data models. We learned about the role of repositories, how they expose data, and how they provide one-shot methods to update that data, as well as when to change the coroutine dispatcher, and why it's important to do so for complex jobs. We also learned about data synchronization using multiple data sources, and how to create local and instrumented tests for common data layer classes. You can check out our data layer guidance for more detailed information on this topic. Or, for a more complex real-world example, check out the Now in Android app. Thanks very much. [MUSIC PLAYING]
Info
Channel: Android Developers
Views: 20,339
Id: P125nWICYps
Length: 32min 19sec (1939 seconds)
Published: Wed May 10 2023