Firestore Data Modeling - Five Cool Techniques

Captions
[Music] One of the most challenging aspects of building any app is determining the optimal data structure. Ideally, you want to maximize performance while minimizing costs, but that's easier said than done, and especially so with a NoSQL database like Firestore, because you really have to plan ahead on how you model your data.

Last week I asked everybody on our Slack channel to send me the biggest challenges that they faced modeling data in Firestore, and this conversation snowballed into an entire new course on Fireship dedicated completely to NoSQL data modeling and queries. So if this is a topic you're interested in, consider becoming a pro member to get access to that course, and I'll leave a discount code in the video description. In today's video we're going to look at five different ways you can structure and query your data that you may not know about, and each technique that we'll look at is based on a real submission from the community.

A minute ago I said that you have to plan ahead when structuring your data, so what do I mean by that exactly? Well, the basic idea is that you want to pre-render your data so that it fits the view or the screen in your app as closely as possible. Ideally, you only need to make one document read or one query to a collection to fill the entire view with data. This differs from a relational SQL database, where the idea is to break your data into a lot of small pieces and then join it all together server-side. I've heard people say that Firestore, or NoSQL databases in general, are not good for relational data modeling, but that's actually not true, and it's really just a matter of rethinking the way that you model data in general. What you'll find is that you can solve all the same problems that you can with SQL, but you can do it in a way that's faster, less expensive, and more flexible.

With that being said, let's go ahead and jump into our first data model and access pattern. This suggestion comes from Samrat, and he wants to query a collection based
on whether or not a user was mentioned in a post in that collection. This is similar to something like Twitter, where if you mention a username with the @ symbol it will notify that user that they've been mentioned, and they also might want to see a list of all the tweets that they've been mentioned in. In this case we have a many-to-many relationship, where a tweet can have many mentioned users and a user can be mentioned in many tweets. But the thing to keep in mind here is that a tweet or a post can maybe only have one to 10 mentioned users, but a user can be mentioned in potentially billions of tweets. In situations like this, it's usually best to have the document with the smaller side of the relational data manage the relationship.

In this example we have an array of usernames embedded on the tweet or post doc. You'd implement this logic somewhere in your front-end code, using regex to extract the usernames out of the post text and then duplicate them on the same document as an array. Now it becomes very easy to query all the tweets that a user is mentioned in by making a reference to the posts collection. Then we can use the array-contains operator to get all the tweets that have a corresponding user ID in that mentions array, and then we could use a Cloud Function to listen for a new post to be created and then send a push notification or an email to a user when they're mentioned.

So the main limitation with array-contains is that you can only query for one item at a time, and that leads me to the next challenge, which is a category system that allows you to filter by multiple categories at the same time, while also being able to see if a post contains a given category out of potentially multiple passed values, and also the ability to filter by not containing a certain value. So these requirements roughly translate into an AND, OR, or NOT statement in a SQL database. I'm going to go ahead and call these tags instead of categories, but the general idea here is the same. We have a
tags collection, which is really just there to hold any extra data about a tag, like a description or a URL or something like that. I'd recommend giving your tags a unique ID that's something descriptive that you can actually show in the UI, and then you'll take that ID and associate it to a post by using a map. So we have a map called tags, and each key in that map is the ID from the tags collection. The cool thing about Firestore is that it automatically indexes the keys in the map, which means we can query them without any additional configuration.

First let's take a look at how we can do a logical AND query, or in other words, we want to get all of the items that have both of these tags present. We can do this by simply chaining together where statements, and as long as they're using the equal-to operator we can do this as many times as we want. So this works great if you're filtering by multiple boolean values, but if you throw in a range operator, that means you'll need a composite index, and keep in mind you can only use one range operator per query, so that's a limitation that you want to consider.

Now if you wanted to make something that was more like an OR query, you can just make multiple queries for separate tags. We're able to run these queries concurrently, and then we can just join them together and filter out any duplicates in our client-side code.

Now the most difficult part of this challenge is implementing the NOT logic. The values on the map are set up as booleans, so the most straightforward approach here would be to add all of the tags to every document and then set them to false by default. This would be perfectly fine if you have a finite set of tags and know their values upfront, but it would be much more challenging if tags are generated by users and there could be potentially billions of tags, so that's a limitation to consider. But keep in mind that if you have a numeric value, or if you have a string value with some sort of ordering embedded in it, you
can simulate a NOT query by using range operators. For example, if we wanted to get all the items that were not twenty dollars, then we could query on either side of that range, or in other words, all the items that are less than twenty dollars and all the items that are more than twenty dollars. And if you're doing a lot of stuff like this, it might be time to look into something like Algolia to index some of your data in a full-text search engine. That's what I do for Fireship, and Firebase and Algolia work really well together.

Now moving on to our next data model. This one comes from Lenny Cunningham and his flight now dotnet app. Geolocation queries bring up a really interesting data modeling technique called composite strings. Not only do composite strings allow us to do things like geohashing, but they also allow us to do things like tree traversals and threaded comments. Take a look at this tree structure we have here, where each letter represents a document in the database. What I'm going to show you next is how to write a query that will traverse down one node of the tree, and this is especially useful if you're building something like Reddit comments that can be threaded multiple levels deep, or if you have a hierarchy of categories and only want to query a specific node in that tree.

We can do this with all of the documents in the same collection. You can see here at the top of the tree we have the document A, and it has a parent value set to false. That would be our top-level comment. Then let's say a user responds to this comment. That will be our B document, and it has a parent value set to A. Now let's imagine we have a response to the B document. Then we would have a C and D document, and they both reference the same parents, A and B. So what we've done here is create a composite key where the items at the third level of the tree reference the IDs of the parent documents at the first and second levels. These don't need to be in alphabetical order or anything like that;
you can just use Firestore's automatically generated IDs. The only thing that's important is that the composite string is created in the same order that the parent elements appear in the tree.

Let's take a look at how we can actually traverse this tree. If we just want to grab the top level of the tree, which you would probably want to do if you're showing the top-level comments, you just query where the parent equals false. Now if you want to query across the breadth of the tree, or just get all of the top-level responses or something like that, you can also save a level property on each document, which would allow you to do so. But I think most use cases will require you to query the depth of the tree, so this part's technically optional. Then, assuming you have the level property, which is just an integer that increases by one for each level, you can use where to query all the documents that live at a certain level, or you could use a range operator to get everything above or below a certain level.

Now things get a lot more interesting when we query the depth of the tree. The composite key that we set up earlier will get larger and larger as we get deeper in the tree. That means we can take a document ID and use it as a starting point, and then query all of the children that start with that same ID in their composite parent string. We can make this query by saying where the parent is greater than or equal to the ID, and where the parent is less than or equal to the ID plus a high Unicode character. That would allow us to start from any node in the tree and then traverse downward. This is actually the same way that geohashing works, and it's a very powerful feature, but it is a little more advanced.

Now moving on to our next data model, which comes from Stefan. We're going to look at how we can query a collection starting with an array of document IDs. His data model was a lot more complex, but I just wanted to pick out one little thing that I think is helpful to a lot of people. When working
with Firestore, it's ideal to denormalize your data or embed it on a single document, but there are a lot of cases when it's just not practical or possible to do that. So one thing you can do is create a more normalized model like we see here with a sizes array, where each element in that array is a string referencing a document in a different collection. But because there are no server-side joins in Firestore, how do we actually join this data so we can use it in our UI? We can actually do this very efficiently in Firestore, because we can send multiple read requests to the database and Firestore will handle all of those requests concurrently.

Here's a little helper function you can use in JavaScript to join an array of IDs to a collection. It takes a collection and an array of IDs as its arguments, then it maps the IDs to a document read. That will give us an array of promises, and then we can use Promise.all to resolve all of those promises concurrently. Lastly, we can map each document snapshot to its raw data. Now if we wanted to use this helper method in our code, we can just pass it the collection reference and the array of IDs that we want to read, and it will return an array of the document data. So if you want to model your data in a more normalized SQL style, you can do it using something like this.

Now that brings us to the final model, coming from Troy. He's building something like a social-media-style follower feed. In other words, you follow a bunch of users and you want to see the most recent posts from those users that you follow. This is actually a very challenging requirement, so I built a full demo to show that it's possible in Firestore. As you can see here, I'm logged in as jeffd23, and when I unfollow these users their posts are removed from my feed. So basically what we're doing here is grabbing the most recent posts from these users and then ordering them by date, and on top of that we're also maintaining the user-to-user follow relationship. Let's start
by taking a look at the data model. The only way to make a system like this scale to a decent amount of users is to duplicate some data in our database. We have three collections at play: one for users, one for posts that can be posted by a user, and another one for followers. All the relational data will be handled by the followers collection, and in fact all of the data that you see in the UI is coming from a query on the followers documents.

The followers document belongs to the user who is being followed and keeps track of all of the followers in an array. But in addition to keeping track of the followers, it also duplicates some of the user's recent posts. It doesn't duplicate everything; it only duplicates the data that we need to show in the feed, like the title and maybe a preview of some of the text. The last thing we need to do is keep a timestamp of the user's last post. What we're going to do is make a query on this last post property along with the users array, and the result of that will be all these duplicated posts from the users that the user is following. Now an important thing to point out here is that we're keeping track of the followers on a single document, so that means we can only scale up to one megabyte of data, and that means you may need to break this up into multiple documents as your app grows.

I'm going to go ahead and write this query inside of an async function, because we'll need to do some data wrangling after we retrieve the initial data. First we'll make a reference to the followers collection, and then we'll query it using array-contains with the username of the logged-in user. After that we'll make this a compound query by using orderBy with the last post timestamp. Keep in mind this is a compound query, so it's going to require an index, and Firestore will give you a warning about that in the console. Now the cool thing about this is we can limit it to 10 document reads, so it's a very efficient read, but it can provide us with
data to populate potentially dozens of different posts in the feed. Once we have the result of this query, we have an array that we need to then map to the document data, and on the document data we're going to have an array of the user's most recent posts. That's what we actually want to be showing in the view, so I'm going to reduce this array of documents down to a new array that only contains the recent posts, and then I want to sort those recent posts based on their published date. We can use Array.sort to handle that, and at this point we now have a sorted array of posts that spans across multiple users that this user is following.

I'm going to go ahead and wrap things up there. If you want to learn more about data modeling and how these particular data models work, consider becoming a pro member at Fireship IO to get access to the course. Thanks for watching, and I will talk to you soon. [Music]
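The mention-extraction step from the first data model could be sketched roughly as follows. This is a minimal illustration, not the code shown in the video: the `extractMentions` name, the regular expression, and the sample post are assumptions, and only the idea of a `mentions` array queried with `array-contains` comes from the transcript.

```javascript
// Pull the unique @usernames out of a post's text so they can be stored
// as an array on the same post document (hypothetical helper).
function extractMentions(text) {
  const matches = text.match(/@(\w+)/g) || [];
  // Drop the leading "@" and remove duplicates while keeping first-seen order.
  return [...new Set(matches.map((m) => m.slice(1)))];
}

const post = { text: "Thanks @samrat and @jeff, and @samrat again!" };
post.mentions = extractMentions(post.text);
console.log(post.mentions);

// With the array saved on the document, the lookup described in the video
// would be a single query along these lines (namespaced Firebase SDK style,
// collection and field names assumed):
//   db.collection("posts").where("mentions", "array-contains", "samrat")
```

The simple `\w+` pattern also matches the part after the @ in an email address, so a real app would likely want a stricter pattern.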
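For the tag filters, the OR approach (one query per tag, run concurrently, then merged and de-duplicated on the client) might look like the sketch below. The helper names `mergeUnique` and `queryAnyTag`, the `runTagQuery` callback, and the sample data are all hypothetical; an in-memory array stands in for the posts collection so the example runs without a database.

```javascript
// Merge several query results into one list, keeping the first copy of each document ID.
function mergeUnique(resultSets) {
  const byId = new Map();
  for (const docs of resultSets) {
    for (const doc of docs) {
      if (!byId.has(doc.id)) byId.set(doc.id, doc);
    }
  }
  return [...byId.values()];
}

// Run one query per tag concurrently, then merge the results.
// `runTagQuery` is a hypothetical callback; in a real app it might wrap
//   db.collection("posts").where(`tags.${tagId}`, "==", true).get()
async function queryAnyTag(runTagQuery, tagIds) {
  const resultSets = await Promise.all(tagIds.map((tagId) => runTagQuery(tagId)));
  return mergeUnique(resultSets);
}

// In-memory stand-in for the posts collection (assumed shape: a `tags` map of booleans).
const samplePosts = [
  { id: "p1", tags: { cool: true, fresh: true } },
  { id: "p2", tags: { cool: true } },
  { id: "p3", tags: { fresh: true } },
  { id: "p4", tags: {} },
];
const fakeTagQuery = async (tagId) => samplePosts.filter((p) => p.tags[tagId] === true);

queryAnyTag(fakeTagQuery, ["cool", "fresh"]).then((posts) => {
  console.log(posts.map((p) => p.id)); // p1, p2 and p3 each appear once
});

// The AND case needs no client-side work; it is just chained equality filters:
//   db.collection("posts").where("tags.cool", "==", true).where("tags.fresh", "==", true)
```

Firestore SDK releases newer than this video also offer `in` and `array-contains-any` operators, which may cover some OR cases in a single query.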
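The "not twenty dollars" trick, querying on either side of the excluded value and combining the results, can be illustrated in memory like this. The `excludeValue` helper, the `price` field, and the sample items are assumptions made for the sketch; against Firestore the two halves would be two separate range queries.

```javascript
// Simulate a NOT query on an ordered field by taking everything below the
// value and everything above it, then concatenating the two halves.
function excludeValue(docs, field, value) {
  const below = docs.filter((d) => d[field] < value); // like where(field, "<", value)
  const above = docs.filter((d) => d[field] > value); // like where(field, ">", value)
  return below.concat(above);
}

const items = [
  { id: "i1", price: 10 },
  { id: "i2", price: 20 },
  { id: "i3", price: 35 },
];
console.log(excludeValue(items, "price", 20).map((d) => d.id)); // i1 and i3 only
```

Later Firestore releases added a `!=` operator as well, so the two-query approach may not be needed on current SDK versions.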
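The depth query over the comment tree relies on a prefix range over the composite parent string. The sketch below works under a few assumptions that the transcript does not spell out: parent IDs are joined with an underscore, `\uf8ff` is used as the "high Unicode character", and the helper names are invented. It filters an in-memory array the same way the two `where` clauses would filter a collection.

```javascript
// A code point that sorts after the characters used in typical document IDs (assumed choice).
const HIGH_CHAR = "\uf8ff";

// Bounds for "every parent string that starts with this prefix".
function descendantRange(pathPrefix) {
  return { start: pathPrefix, end: pathPrefix + HIGH_CHAR };
}

// In-memory equivalent of:
//   .where("parent", ">=", start).where("parent", "<=", end)
function descendantsOf(docs, pathPrefix) {
  const { start, end } = descendantRange(pathPrefix);
  return docs.filter(
    (d) => typeof d.parent === "string" && d.parent >= start && d.parent <= end
  );
}

// The tree from the video, plus a second top-level comment E with a reply F.
const comments = [
  { id: "A", parent: false },
  { id: "B", parent: "A" },
  { id: "C", parent: "A_B" },
  { id: "D", parent: "A_B" },
  { id: "E", parent: false },
  { id: "F", parent: "E" },
];

console.log(descendantsOf(comments, "A").map((d) => d.id)); // B, C and D
console.log(comments.filter((d) => d.parent === false).map((d) => d.id)); // top level: A and E
```

To walk down from a node deeper in the tree, the prefix would be that node's full path (for example `A_B` for the children of B) instead of its bare ID.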
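The join helper described for the sizes array is not reproduced in the captions, so the following is a reconstruction from the spoken description and not the video's exact code. It assumes a collection reference with the namespaced SDK shape (`doc(id).get()` returning a snapshot with `data()`); `joinByIds`, `fakeCollection`, and the sample records are hypothetical names used so the sketch runs on its own.

```javascript
// Read every referenced document concurrently and return the raw data.
async function joinByIds(collectionRef, ids) {
  const reads = ids.map((id) => collectionRef.doc(id).get()); // array of promises
  const snapshots = await Promise.all(reads);                 // resolved concurrently
  return snapshots.map((snap) => snap.data());                // snapshot -> plain object
}

// Tiny in-memory stand-in for a collection reference, for demonstration only.
function fakeCollection(records) {
  return {
    doc: (id) => ({
      get: async () => ({ id, data: () => records[id] }),
    }),
  };
}

const sizesCollection = fakeCollection({
  sm: { label: "Small" },
  lg: { label: "Large" },
});

joinByIds(sizesCollection, ["sm", "lg"]).then((sizes) => {
  console.log(sizes.map((s) => s.label)); // Small, then Large
});
```

Because `Promise.all` preserves input order, the returned array lines up with the order of the ID array on the parent document.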
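The last step of the follower feed, reducing the followers documents to one list of recent posts and sorting it by publish date, might be sketched like this. The field names `recentPosts`, `published`, `users`, and `lastPost` are assumptions based on the spoken description, and timestamps are plain numbers here for simplicity.

```javascript
// Flatten the recent posts duplicated on each followers document,
// then sort the combined list newest-first.
function buildFeed(followerDocs) {
  const posts = followerDocs.reduce(
    (all, doc) => all.concat(doc.recentPosts || []),
    []
  );
  return posts.sort((a, b) => b.published - a.published);
}

// Sample documents shaped like the result of a query such as (names assumed):
//   db.collection("followers")
//     .where("users", "array-contains", "jeffd23")
//     .orderBy("lastPost", "desc")
//     .limit(10)
const followerDocs = [
  {
    users: ["jeffd23"],
    lastPost: 300,
    recentPosts: [
      { title: "A2", published: 300 },
      { title: "A1", published: 100 },
    ],
  },
  {
    users: ["jeffd23"],
    lastPost: 200,
    recentPosts: [{ title: "B1", published: 200 }],
  },
];

console.log(buildFeed(followerDocs).map((p) => p.title)); // A2, B1, A1
```

Ten document reads can therefore fill a feed with many more than ten posts, which is the efficiency the video points out.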
Info
Channel: Fireship
Views: 140,706
Rating: 4.9353514 out of 5
Keywords: firebase, app development, typescript, javascript, lesson, tutorial, firestore, cloud firestore, data modeling, data, nosql, app data, big data, mongodb, sql, nosql vs sql, firestore tips, firestore query, firestore data model, nosql data modeling
Id: 35RlydUf6xo
Length: 11min 44sec (704 seconds)
Published: Thu Apr 18 2019