Advanced offline caching techniques in Cloud Firestore

Captions
[MUSIC PLAYING] MARK DUCKWORTH: Hey, everybody. [APPLAUSE] Hey, I'm Mark Duckworth. I'm an engineer on the Firestore SDK team. And I'm actually relatively new to the team and the product. And one of the things I've really enjoyed about getting started is learning about the customers, how you use Firestore, and the challenges you face. And one of the challenges that I've seen is understanding the Firestore SDK cache and how it can be used to your benefit. So that's what I'm here to talk about: using the cache, and using it to build more efficient applications.
So just to take a step back, if you're not familiar with Firestore, Firestore is Firebase's NoSQL document database. And with a NoSQL document database, you can build lots of different types of applications, but it's common to build read-heavy applications. Think, for example, of an e-commerce app with a product catalog with thousands of items being read thousands of times per day by different users. And that product catalog doesn't get updated as often as it's read.
Well, there's something else that I've seen as a common consideration when using Firestore, and that's how much it costs. And I don't think that's unique to Firestore. I think it's common across all software, but it's important here in this context today, because with Firestore, you're billed by the number of reads and writes. And so if you're building a read-heavy application, then you want to be mindful of the number of reads against the server.
So to explore these ideas, I built an e-commerce app called FireSale. Let's check it out. You can see FireSale here on my laptop and my mobile phone. And you can browse a catalog of pins, stickers, and patches. Browse it on the laptop or the phone. And I'm going to log in on both devices so I share the same shopping cart. And when I add items to my shopping cart, you'll see that those items are synchronized through the Firestore server.
So I'm going to talk about how I built FireSale. I have my application running on both devices. It's using the Firebase SDK and two different Firebase services: Firebase Authentication and Cloud Firestore. Now, I want to point out that I built just one application, a responsive web app. But I could have built a native iOS or native Android app using the Firebase Android or iOS SDKs and gotten all the same functionality you're going to see today.
Let's dig in a little bit into how I'm using Firestore. Well, I use an API called getDocs, which I use to query my product catalog from the Firestore backend. I use a second API called addDoc when I want to add an item to my shopping cart. My shopping cart is represented by a collection in Firestore. And I add an item, or add a document, to that shopping cart. And third, I use onSnapshot to set up a listener on my shopping cart. So when a new item is added to the shopping cart, my application is notified, and I can update the UI with the new item in the shopping cart. So with those three basic APIs and some helpers, I built a pretty basic e-commerce app.
But I want to mention that it also works offline. So how does it work offline? Well, actually, let's check out offline functionality. So the first thing I want to do is take my application offline on my laptop and add a couple of items to my cart. When I add them, you can see the items get added to the cart on the laptop. But they don't show up on the cart on the phone until I bring my laptop back online.
So how does this work? Well, getDocs, addDoc, and onSnapshot all work seamlessly offline thanks to the Firestore SDK cache. Now, the cache keeps copies of documents that have been queried from the Firestore backend so they can be queried again when offline. And it also keeps mutations, changes to documents and collections that were made while offline, so they can be sent to the server when the application comes back online.
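To make the two roles of the cache described above concrete, here is a deliberately simplified toy model in TypeScript. It is not how the Firestore SDK is implemented; the class name `ToyOfflineCache`, its methods, and the `Doc` and `Mutation` types are all hypothetical and exist only to illustrate the idea of keeping local document copies plus a queue of pending writes.

```typescript
// Toy model only: illustrates the concept from the talk, NOT the SDK's internals.
// All names here (ToyOfflineCache, Doc, Mutation) are hypothetical.
type Doc = { id: string; data: { [field: string]: unknown } };
type Mutation = { collection: string; doc: Doc };

class ToyOfflineCache {
  private documents = new Map<string, Doc>();
  private pendingMutations: Mutation[] = [];

  // Keep a local copy of documents that came back from a server query.
  storeServerResults(collection: string, docs: Doc[]): void {
    for (const doc of docs) {
      this.documents.set(collection + "/" + doc.id, doc);
    }
  }

  // Answer a read from local copies, which works with no network.
  read(collection: string): Doc[] {
    const prefix = collection + "/";
    const results: Doc[] = [];
    this.documents.forEach((doc, key) => {
      if (key.indexOf(prefix) === 0) results.push(doc);
    });
    return results;
  }

  // Apply a write locally and queue it until the app is back online.
  writeOffline(collection: string, doc: Doc): void {
    this.documents.set(collection + "/" + doc.id, doc);
    this.pendingMutations.push({ collection, doc });
  }

  // Once online, hand every queued write to `send` in order.
  // Returns the number of mutations that were sent.
  flush(send: (mutation: Mutation) => void): number {
    let sent = 0;
    while (this.pendingMutations.length > 0) {
      send(this.pendingMutations[0]);
      this.pendingMutations.shift();
      sent++;
    }
    return sent;
  }
}
```

In this model, an item added to the cart while offline is readable locally right away, and only reaches other devices after `flush` runs, which mirrors the laptop and phone behavior in the demo.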
So the only other thing I've done with my application is enable persistence using enableIndexedDbPersistence. And that's just to get the offline behavior that I want. So let me talk about enabling persistence a little bit. Persistence causes the data to stay in the cache so it can be queried offline. And it's not the only way to have your data stay in the cache for offline querying. But it was the easiest way for my application. And also, persistence ensures that the data is in the cache even if your application restarts, which can be really critical in certain scenarios. And so I also want to point out that persistence is enabled by default in the Android and iOS SDKs. But it's disabled by default in the Web SDK, so you need to enable it, as I did in my application.
So I'm pretty excited about what I built with Firebase. And I'm telling my brother how powerful it is. And he says, well, if it's so powerful, then does it cost a lot? And I decided I need to come up with a cost model. So I think about what success might look like for my app. And I say, maybe that's 10,000 monthly users viewing 25 items per catalog page, loading three pages from my catalog, and maybe refreshing those pages five times. And that comes out to a total of 3,750,000 document reads. And I'm a little concerned until I plug that number of reads into the Firebase pricing calculator and see that that's only going to be $1.38 a month.
But as an engineer, I'm thinking about the future, where maybe I achieve global scale. Maybe I have 10 million monthly users and a product catalog that's 10 times that size, where I could see bills that are in excess of $20,000 a month. So I still want to see if I can reduce the number of billed document reads in my application. So I have an insight, which is basically that my product catalog doesn't update very often.
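The cost model above is a straightforward multiplication, which can be written out as a short TypeScript sketch. The interface and function names are hypothetical; the input numbers are the ones quoted in the talk. It counts document reads only and does not attempt to reproduce the pricing calculator.

```typescript
// Hypothetical helper for the read estimate described in the talk.
interface UsageAssumptions {
  monthlyUsers: number;  // distinct users per month
  itemsPerPage: number;  // documents read per catalog page
  pagesPerUser: number;  // catalog pages each user loads
  loadsPerPage: number;  // times each page is loaded, refreshes included
}

// Every page load reads every item on the page from the server.
function estimateMonthlyReads(u: UsageAssumptions): number {
  return u.monthlyUsers * u.itemsPerPage * u.pagesPerUser * u.loadsPerPage;
}

const talkEstimate = estimateMonthlyReads({
  monthlyUsers: 10000,
  itemsPerPage: 25,
  pagesPerUser: 3,
  loadsPerPage: 5,
});
console.log(talkEstimate); // 3750000
```

Changing a single field, for example `loadsPerPage`, shows how each assumption scales the total linearly.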
So I decide that if I update my application to pull the product catalog from the Firestore server only once a day, and then any other time I need to query the product catalog, I query it from the local cache, I can reduce my billed reads. So this updates my application. So now I'm using getDocs to query the server for the product catalog only once a day, and the product catalog is stored in the cache. And then any other time I need to query the product catalog, it will be queried using getDocsFromCache and not hit the server. So this updates my billed read estimation, potentially reducing those five page refreshes down to one, or saving me up to 3 million document reads.
But I still have a problem, which is that each of my 10,000 users is querying the full product catalog. And it's the same product catalog for each of those users. So I want to see if I can solve this problem. And I have another insight, which is, well, I think that I can query the product catalog only one time and share those results across all users. And in fact, I can do this with a built-in feature called bundles.
So what are Firestore bundles? Well, Firestore bundles are the packaged results of queries from Firestore. And typically, you use this with common queries. But it allows an application to load your query results from a bundle quickly without querying the backend. And in this case, it's also helping me save on billed reads.
So how did I create a bundle from my product catalog? Well, in this case, I need to use a Firebase server SDK, or Admin SDK, to run that same query for my product catalog and get all the documents, and then I can run the documents through the bundling API to create this bundle, that icon you see on the left. And then with my bundle, I can distribute it to one application, or 30 instances of my application, or 1,000 instances of my application, and I've only queried the Firebase server one time. So this changes my application a little bit.
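The once-a-day refresh described earlier in this passage can be sketched as a small policy function in TypeScript. This is one possible way to structure it, not the code from the talk. The names `chooseSource`, `loadCatalog`, and `CatalogSources` are hypothetical, and the two callbacks are stand-ins: in the Web SDK they could wrap `getDocsFromServer` and `getDocsFromCache`, but here they are injected so the sketch does not depend on Firebase.

```typescript
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical stand-ins for a server query and a cache-only query.
interface CatalogSources<T> {
  fromServer: () => Promise<T>;
  fromCache: () => Promise<T>;
}

// Go to the server on the first load, or when the last server fetch
// is at least a day old. Otherwise read from the local cache.
function chooseSource(
  lastServerFetchMs: number | null,
  nowMs: number
): "server" | "cache" {
  if (lastServerFetchMs === null) return "server";
  return nowMs - lastServerFetchMs >= ONE_DAY_MS ? "server" : "cache";
}

// Load the catalog and report which source was used, so the caller
// can persist the new fetch time when the server was queried.
async function loadCatalog<T>(
  sources: CatalogSources<T>,
  lastServerFetchMs: number | null,
  nowMs: number
): Promise<{ catalog: T; source: "server" | "cache"; fetchedAtMs: number | null }> {
  const source = chooseSource(lastServerFetchMs, nowMs);
  if (source === "server") {
    return { catalog: await sources.fromServer(), source, fetchedAtMs: nowMs };
  }
  return { catalog: await sources.fromCache(), source, fetchedAtMs: lastServerFetchMs };
}
```

Where the last fetch time is stored between sessions is left to the caller; the sketch only decides which source to use.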
It changes how data gets from the Firestore backend down to the SDK and my application. So I have a process that runs once a day to update that generated bundle file for my product catalog, and puts that in Cloud Storage. And then when an application starts, or once a day, it fetches that bundle from Cloud Storage, and then loads the bundle into the Firestore SDK cache using loadBundle. And so from there, I can query that product catalog out of the cache. So once again, this updates my billed read estimation, reducing those 10,000 monthly catalog loads, one per user, down to 30, one per daily bundle build, for a total of 2,250 billed reads and a savings of over 99%. Now, I want to point out that if you are serving a bundle out of Cloud Storage, then you'll still have to pay for egress on that bundle from Cloud Storage.
So I'm pretty excited about that. However, I found another issue, a blog post from 2021 that says, basically, if you overload your SDK cache, it can slow down querying from the cache, and potentially slow down your application. So I want to investigate this and see if I can fix that. And it turns out there's a relatively new feature called client-side indexing, which can be used to speed up queries against cached documents. Now, I'd like to point out that client-side indexing is in preview, or public beta. So you can use it, but it may change.
And so what is client-side indexing? Well, client-side indexing simply brings indexing to the SDK. Indexing is always performed on all documents on all fields on the server. And it's new to the SDK. And if you're not familiar with indexing, well, it's just a mechanism that databases use to perform very efficient queries. Take, for example, this table on the right, which represents an index on the field category. If I get a query where category equals pins, I can use this index to very quickly find all the documents where category equals pins. So like I said, indexing is automatically performed on the server. But in the SDK, you have to configure it.
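For readers who have not worked with indexes, the idea behind the category example above can be illustrated with a small TypeScript sketch. This is a conceptual illustration only, not the SDK's index format or API; the `Product` type, the function names, and the sample data are hypothetical.

```typescript
// Conceptual illustration of an index on `category`; not the SDK's format.
interface Product {
  id: string;
  category: string;
}

// Without an index: inspect every document to answer the query.
function scanByCategory(products: Product[], category: string): string[] {
  const ids: string[] = [];
  for (const p of products) {
    if (p.category === category) ids.push(p.id);
  }
  return ids;
}

// Build the index once: category value -> ids of matching documents.
function buildCategoryIndex(products: Product[]): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const p of products) {
    const ids = index.get(p.category);
    if (ids) {
      ids.push(p.id);
    } else {
      index.set(p.category, [p.id]);
    }
  }
  return index;
}

// With the index: a single lookup instead of a full scan.
function lookupByCategory(index: Map<string, string[]>, category: string): string[] {
  return index.get(category) || [];
}

// Hypothetical sample catalog.
const sampleCatalog: Product[] = [
  { id: "p1", category: "pins" },
  { id: "p2", category: "stickers" },
  { id: "p3", category: "pins" },
];
const categoryIndex = buildCategoryIndex(sampleCatalog);
console.log(lookupByCategory(categoryIndex, "pins")); // [ 'p1', 'p3' ]
```

The scan cost grows with the number of cached documents, while the lookup does not, which is why an index helps when the cache holds a large catalog.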
And you configure it using setIndexConfiguration. And you pass in a configuration telling the SDK which fields to index. So after I've set that up, when I want to query from cache using something like getDocsFromCache, it's now going to query very efficiently.
So in summary, with FireSale, I built an e-commerce app with offline support thanks to the SDK cache. And I've saved on read costs by querying from the cache and using bundles. And I've sped up querying from the cache using client-side indexing. So if you want to learn more about these features, check out our online documentation, or I'll be out at the lounge right after this. And thanks for coming. And thanks to all my team and everybody who helped build this stuff. [APPLAUSE] [MUSIC PLAYING]
Info
Channel: Firebase
Views: 16,622
Keywords: Firestore SDK, caching system, reduce latency, offline querying, offline mutations, reduce billed document reads, Firestore cache, firebase summit, firebase summit 2022, firebase products, google products, firebase, google, google firebase
Id: iQOTjUko9WM
Length: 12min 34sec (754 seconds)
Published: Thu Oct 20 2022