Streamlit & Google Sheets: The Easiest "Database"

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
Ok, so in this video, I will show you how to connect your Streamlit app with Google Sheets. I'll demonstrate how to link it to a publicly shared spreadsheet and also to a private one. Let me actually give you a quick example of why this could be useful. Let's say you're working on a small project. Often, such projects involve a database where you store and retrieve data. Instead of setting up a full-blown database, you could use Google Sheets. With the method I'm about to demonstrate, you need just one line of code to create a new worksheet, query your data using SQL, update your data, or even clear the entire sheet—all directly through Streamlit. As I mentioned, this approach is a fantastic alternative to a relational database. With that said, let's dive in. To get started, head over to the following GitHub repo. I'll also provide the link in the description below. First things first, you need to install the package. To do this, copy the following link to your clipboard, navigate to your terminal or command prompt, paste the command, and execute it. With that in place, let's kick off with a basic example. In this scenario, we'll connect Streamlit to a publicly shared spreadsheet, meaning you'll have read-only access. All you have to do is copy this example and paste it into a new Python file. I've named mine streamlit_app.py. I'll open it using VS Code and paste the content there. After saving it, I'll run the app using the command streamlit run and the name of my Python file. Once the app is live, it will display a dataframe. Now, I'll position my app and my text editor side by side so that I can explain the code. After importing the package, you just have to provide the URL to your publicly shared spreadsheet. With that, you can establish a connection and use the object to read the content of the spreadsheet. To validate the content, I'll copy this URL and paste it into my browser. This dataset consists of two columns. You can be more specific by selecting which columns you want to use. In this case, it's column A (so index 0) and column B (so index 1). A quick tip: You can also specify columns using a list and setting the range. If I save the script and refresh my page, you'll notice we get the same result. One challenge I encountered was specifying the sheet name. There's a parameter named worksheet. If you input the sheet name, like "Example 1," in my case, save the script, and refresh the app, you will get an error. This is because you can only use the name when connecting to a private sheet, which I'll demonstrate shortly. For publicly shared sheets, you should use the worksheet ID found in the URL. Simply copy everything after "gid" and input those numbers as the worksheet parameter. When I save the script and reload the app, we see that it works. Obviously, you can repeat the process also for other sheets. Just select the desired sheet, copy the ID from the URL, and adjust your code. When I refresh the page now, we can view data from the second sheet. As I mentioned earlier, the entire code functions because I'm using a publicly shared spreadsheet. For the rest of this video, I'll be using a different spreadsheet. The structure of this spreadsheet is like a typical database where each sheet could be like a table in a relational database. I've included sheets for Products, Sales, and Customer information. To use this data in my Streamlit app, you'll first need to share the spreadsheet and set its access to "Anyone with the link". Once done, copy that link and integrate it into your code. I'll also remove the "usecols" parameter and update the sheet name. With these changes made, I'll save my code and refresh my app. Now, we can see the product information in my app. But here's the exciting part: With this package, your data is treated as a database connection. This means you can even use SQL to execute queries. For instance, I crafted an SQL script to filter products where the current inventory level is below the reorder threshold. Note that I've used double quotation marks for column names. This SQL can be input into the query method, which returns a dataframe. After saving these changes and refreshing, we can see the resulting dataframe. However, if you're handling sensitive information, it is better to keep your spreadsheet private. Yet, with my current setup, private sheets won't be supported. To show you this, I'll restrict the spreadsheet's permissions and refresh my app. At first, everything still seems to work, but that's because the connected dataset is cached by default for 60 minutes. I'll discuss caching in more depth later in the video. For now, I'll clear the cache and refresh the page. And now, as expected, an error pops up. So, how can we connect to a private sheet then? While it isn't too complicated, it requires some initial setup. The GitHub repo provides all the necessary steps. Start by visiting the Google Developers Console. Once logged in, create a new project, give it any name you like, and then click 'create'. This might take a moment. When it's complete, make sure to select the new project from the dropdown list. In this project, we'll need to enable both the Google Drive and Google Sheets APIs. To do this, simply search for them. I'll start with the Google Drive API. Just select it, and then hit the enable button. After enabling, you'll be redirected back to your project. Next, search for the Google Sheets API, select it, and also enable it. With that settled, we need to create credentials for our project. Click 'credentials', followed by 'create credentials', and then select 'service account'. Assign a name to this service account (I'll name mine 'python Google Sheets access') and continue. Assign the role 'Editor' to it. This will allow the service account to not only read the data but also to update and delete it. Next, click 'done'. Now, select the recently created service account, navigate to 'keys', click 'add keys', 'create keys', and choose 'JSON'. This action will automatically download your keys. It's important not to share this sensitive information. However, for this tutorial, I'll display mine. Next, we'll integrate this information with the Google Sheets connector package. Return to your app's root directory and create a new folder named '.streamlit'. Within this folder, generate a file called 'secrets.toml' and open it (I'll use VS Code). To populate this file, refer to the GitHub repo, find the following snippet, and copy it to your clipboard. Paste this content into the 'secrets.toml' file. The next step is inserting the values from the JSON file you downloaded here. To make it easier, I will align the two files side by side. The only thing that cannot be found in your JSON file is the spreadsheet link. So, I will return to my browser, copy the link and paste it into the secrets file. Next, copy all the other values except for the worksheet name. You can leave that empty. The private key is actually much longer, but you also need to copy it. Just select and copy everything and paste it here. Due to my auto-formatting, my file looks a bit messed up, but that is okay. Just save the file and go back to your main code. We now no longer need the URL, so remove it everywhere it appears. Now, as we use the Google Sheets credentials, we can reference the actual sheet names. So, instead of using the ID, input the worksheet name here. With these changes made, I'll refresh the app. Okay, so we have an error. Therefore, let me restart the app, as we added the secrets.toml file. Back in my terminal, I'll stop the app using Control + C on my keyboard and then start it again. When I reload the page, there's another error. And this is because, I forgot an important step. We have to share the spreadsheet with our service account. To do this, go back to your secrets.toml file and copy the client's email. Then, in your spreadsheet, share it with that email address. Just paste the email and give it access. After doing that and refreshing the page, it now works. The data isn't public, but we can still see it. And, as I mentioned before, now we can use the real sheet names instead of their IDs. So, to see my sales data, I'll change the name here and reload the page. But you will notice that there are some extra empty columns. To fix this, I'll only use the first five columns with the usecols parameter. Okay, now that looks better. Now that we can access the data, we can manipulate it just like a typical database. To show you this, I coded out a simple example. Let me run the code, and then I will explain it. The first part is creating some dummy data. I made a function to generate a dataframe called 'orders'. Then I made another dataframe that's like the 'orders' one but with one change: I multiplied the 'total price' by 100. Then, I'm showing both dataframes using an 'expander'. When I open it, we see our 'orders' data and the changed one with the new price column. Okay, and here's the kicker: just like a typical database, we can now manipulate our Google Sheets data. So, let me place our spreadsheet next to the app. As before, I simply created the connection. With this connection, we can create a new sheet, query the data, update the data, and clear the sheet. To do that, I've created different buttons. To generate a new sheet, you just need to take the connection object and use the 'create' method. You can then specify the name of the sheet and the data you want to insert into this new sheet. In my case, I'll insert our orders data. So, when I click this button, we'll have a new tab in our spreadsheet. When I select it, we can now see our orders data. And, just like before, you can also query this new data using SQL and the 'query' method. For example, I will calculate the total price. In the resulting dataframe, I can now see the sum, which is 1,000. Next, I want to update this data. I'll use the connection object again, but this time I'll use the 'update' method, specify the sheet name, and provide the new data. Now, the total price sum is 100k. But if I rerun our SQL query, you'll see that we still get 1,000 back. This is because the data connector has a default TTL (Time-To-Live) of 60 minutes, meaning it's cached for 60 minutes. This is often practical, allowing you to interact with the data, filter it, and plot it without constantly establishing a new connection. However, just for demonstration purposes, I'll set the TTL to 5 seconds. Now, when I reload my side and calculate the sum again, you can see it's calculated correctly. Lastly, I'll clear the sheet by running the 'clear' method. Okay, guys, and that's all I have for you today. While Google Sheets won't replace a real database, I believe for smaller projects, it might be all you need. Simply create a well-structured spreadsheet, and then you can interact with it using Streamlit. And with that said, thanks for watching, and I'll see you in the next video.
Info
Channel: Coding Is Fun
Views: 5,078
Rating: undefined out of 5
Keywords: Streamlit, Google Sheets, Database, Streamlit App, Streamlit Tutorial, Python, Streamlit & Google Sheets, Streamlit Database Integration, Google Sheets as Database, Python Streamlit, Streamlit Web App, Google Sheets API, Python Database, Easy Database Setup, Spreadsheet Database, Google Sheets Integration, Streamlit Project, Streamlit Guide, Python Web App, sql
Id: HwxrXnYVIlU
Channel Id: undefined
Length: 12min 19sec (739 seconds)
Published: Sun Aug 27 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.