5 Tips for Building Powerful Data Dashboards in Python

Captions
Last year we started building our own ArjanCodes data dashboard. We wanted to get an overview of how our social media channels, notably YouTube and LinkedIn, were performing, and it also just seemed like a fun project that we could do a video about in the future. Well, the future is now: it's the video you're watching right now. While building this dashboard we learned a lot, so today I'm going to share five things we learned that will hopefully also help you build better data dashboards. There's also one issue that we haven't been able to solve yet, and I'll talk about that as well. Finally, this video is sponsored by Taipy. More about them later.

Before I go through the tips, let me quickly show you the dashboard. This is what it looks like. We have these three sections: Overview, LinkedIn, and YouTube. This allows us to track some metrics from LinkedIn and YouTube, and we have a sort of overview page here. The way this works is really simple. I can click, for example, on the LinkedIn tab and then I can see a table with some totals. I can scroll down and select a couple of metrics, and then that shows a different chart. For example, this is the impressions. It looks like it goes down, but this fluctuates a lot over time, so it doesn't say all that much. You can also look at, for example, page visits or other types of metrics. For us this is very interesting to see, and it also makes things very easy, because now we can simply go to our dashboard page. We don't have to log into LinkedIn, go to the analytics page, and whatnot; we can simply view it directly here, which is quite useful.

We built something similar for YouTube. On the YouTube page we have some other information that's useful for us, such as the average number of views on the latest videos. This is helpful, for example, when you're dealing with sponsorships: potential sponsors want to know this sort of information, and this allows us to get it very quickly. And then we also have again some
overview table, and we have some metrics that we can compare. For example, this is the total number of views over time, but we can also select other metrics here, like the subscriber numbers. This may look a bit weird to you: it looks like there have been zero new subscribers since March 26. That's not actually what's happening. The YouTube API that we use to collect the data rounds the subscriber count to thousands, so that's why this number looks the same. If I select, say, the last 6 months at the top and then scroll down again, you see that we have this sort of staircase behavior, and that's because YouTube rounds this to thousands of subscribers.

The first tip is to pick the right tool for the job. We initially thought the easiest route would be to go for some all-in-one business analytics solution like Power BI or Tableau. We quickly found out, though, that these tools often don't look all that great, and they're pretty hard to use, especially if the data that you want to use in your dashboard doesn't come with a standard integration. And if you want to adapt the way the dashboard looks, it's often pretty hard and it doesn't do exactly what you want, so we found that pretty limiting. On top of that, some of these tools are really expensive: they can easily cost hundreds and hundreds of dollars a month.

A pretty standard tool for doing this type of business intelligence is Power BI from Microsoft. That seemed quite affordable, but we soon found out that it was almost impossible to use, in particular if you want to integrate data that you get from an external API like LinkedIn or YouTube. If you want to incorporate that in Power BI, you can if you use the desktop app, but we'd like something that's web-based so we don't have to install applications on our computers. The web-based part of Power BI is really underdeveloped, in my opinion, and if we wanted to get that to work we had to run scripts on a local
machine to get the data that we wanted. It was a complete pain, so ultimately we decided not to go for Power BI and to escape Microsoft. Finally, we decided to look into Python libraries and tools that allow us to build it ourselves, which we should have done in the first place. We've now built the first version of our data dashboard using Streamlit, but like I mentioned, there's one issue with it that I'm going to cover later in the video.

The second tip is to make sure that the user interface is actually good. If it's not, you're not going to feel compelled to actually use your data dashboard. We spent quite a bit of time figuring out in what order we should show things and which things we shouldn't show. A very important part of designing a great user interface is to leave some things out and keep things really simple. To do that properly, you need to know who is going to use the dashboard. I'm using the dashboard myself to see what's going on on the ArjanCodes social channels, but Victoria, for example, who's responsible for marketing at ArjanCodes, also needs to look at the dashboard regularly. So a big part of building this dashboard is not so much writing the code but interviewing people and figuring out what they actually want to see in the dashboard. That's a really important step that you should never forget.

The nice thing about Streamlit is that out of the box the user interface already looks pretty good. You get these basic components that you can use, and then the dashboard is already pretty usable. I really recommend that you don't overcomplicate things with many panels, many tabs, lots of different charts, and whatnot, because it's just way too much. Keep it simple, and don't hesitate to remove things so that your dashboard is easy to use and the important information is always on top. Now, you can actually build a dashboard like this completely from scratch, just writing
raw HTML. I definitely don't recommend that. Use a tool; there are many tools available. We've done experiments with both Dash and Streamlit. Overall I found that Streamlit looks nicer and is a bit easier to use, so that's what we went for in the end. They each have their pros and cons, though. If you'd like me to dive deeper into Dash versus Streamlit and what some of the differences are, let me know in the comments and I might make a video about that in the future.

What I'm showing you in this dashboard is simply a visualization of the data, but if you want to do more than just display your data in a dashboard, a really nice option to consider is Taipy, and they also happen to be the sponsor of today's video. Taipy is an open-source Python library designed for easy development of data-driven web applications. Taipy handles both the front end and the back end. It's open source and you can use it for free. To install it, simply type pip install taipy, or if you're using Poetry, simply type poetry add taipy to add it to your pyproject file. You can use Taipy in VS Code directly by using the Taipy Studio extension, and it also works in Jupyter notebooks.

The nice thing about using a tool that's dedicated to running pipelines is that it offers a lot of features that you then don't have to build yourself. For example, Taipy has scenarios, which are a sort of registry for all your pipeline runs. It also doubles as a great comparison tool for what-if analysis, because it allows you to launch pipeline runs using different parameters. This will help you take projects that you set up as a pilot using just a simple machine learning model and make them available to your users with a much higher quality model very easily. On top of that, Taipy has many other features like parallelism, caching, data scoping, and pipeline versioning. Go to Taipy's GitHub page to check it out. I've also put the link in the description. Now, back to the video.

The third tip is that you should implement some sort of filtering
mechanism, and the easiest way to do that is to make those filters global. You can see that right here in the data dashboard that we built. At the top we have this area of the user interface where we can select a time period, and when I do that it automatically changes the view regardless of which tab you're on. For example, let's say I pick the last seven days. If I scroll down, you can see that we get this chart here, but if I switch this to, say, the last 6 months, we get the updated chart automatically. The same goes for the number of views and the number of subscribers that you gained in that period. So this is a very nice, simple way of quickly getting a view on your data, and we've also added the option here to select a custom time period.

User-interface-wise, I'm not really that happy with this. I think it takes up a lot of space, which means that if I select a different time period here and I want to see the result, I need to scroll down, so that's not ideal. Normally I do view the dashboard smaller, maybe something like this, so I have to scroll less, but you still have to scroll. We didn't really find a ready-to-go component for Streamlit that combines these two, because I think ultimately these need to be combined into a single dropdown where you can select either a fixed time period, like the last six months, or a custom time period. Another thing that would be nice is that the entire dashboard doesn't scroll, but that this filter always stays at the top and just this area of the user interface scrolls. So I think there are a couple of things you can do to make this better than it currently is, and we're still working on this.

A final thing that you need to take care of when you define these filters is that they make sense to the stakeholders, to the people who are using your
dashboard. These options that you see here, the last seven days, this month, last month, etc., we picked together with the people who are actually going to use the dashboard. Make sure you do that, otherwise the filters are going to be useless.

The fourth tip is to separate data collection from data visualization. This is especially important if you're building a real-time dashboard. If you need to collect data from different APIs, like in our case, where we need to collect it from YouTube and LinkedIn and in the future potentially other social networks, then it's really crucial to do this. By separating it you make it much easier to maintain your dashboard and you make it way more scalable. You can use Python scripts, for example, for collecting the data and then storing that data in a database or a cache. If you try to put data collection and visualization in a single app, your dashboard is going to be really slow, because it needs to access all these APIs and you have no control over how fast these APIs are. And if in the future you want to expand the dashboard with new APIs that you're collecting data from, this is going to get even worse, and you may end up in a situation where your dashboard is just completely unusable because it wastes so much time accessing all of these different APIs. Besides being slow, you're now also dependent on all of these APIs being live, and you have to make sure you don't get hit by rate limits if lots of people are using your dashboard at the same time. So separate that.

Also, don't underestimate how complex API interactions can be in terms of security, authentication, and different types of endpoints. All APIs are different. You may even need to get the data from various places in the API; it may not be a single endpoint. And if the API is not very stable, you may need to implement a retry mechanism for when a certain endpoint call fails, so that you still get the data. Building this is really complex. In fact, we spent, I believe, over half of the time of the project on
properly interacting with these different APIs. So make sure you schedule enough development time to properly address all these issues related to interacting with different APIs.

What we did is that we have a Google Cloud function that runs once a day and retrieves the data from the various APIs. In our case there are two: YouTube and LinkedIn. Here's an example of what those scripts look like. We've created a bunch of functions that help us access the YouTube API using the Google API client. This is also where interactions with APIs become complex, because like I said, every API has a different way of interacting with it and has different security requirements, so it's all going to be different. We have a function that gets recent video IDs and a function that gets the video views for a particular video. So we made all these functions that each do a specific thing with the YouTube API, and then we can collect that data, create a sort of snapshot, and store that in a database, for which, by the way, we're using MongoDB.

We have a similar setup for LinkedIn, where we also have functions that retrieve a particular type of data, and we have some helper functions for dealing with dates and times. An example is that we want to know how many followers we gained on LinkedIn, so we use a LinkedIn API client to get that information, and as you can see this can get pretty complex. In terms of the time we spent on coding the various parts of this project, both the data collection and the data visualization, I'd say about two-thirds of that is data collection and interacting with APIs and just one-third is the actual user interface. So dealing with APIs is really, really complex. I have a couple of other functions here as well that get specific types of information, like the number of page visits or the number of shares, which is also a useful metric to have.

Now, a final bonus tip for when you're dealing with APIs and you are
collecting data: always make sure that you not only store your aggregated data, the data that you're actually going to use, but also store the raw response. That's something that we also do here. Instead of just storing the data that we want, like the page visits and things like that, we also store the raw responses. We don't really do anything with them in the dashboard, but in the future you may want to extend the capabilities of your dashboard with a new type of chart that shows more information. It's then possible that the raw response you stored contains some data that you want to use, and you can go back in time and add that to the higher-level data in your database. That means you won't rely on the original API to get this data, because you stored the raw response. When you look at the data that we get from the YouTube API, we did exactly the same, so we also store the raw responses there. This makes our data a bit more future-proof.

The fifth tip is that you need to think about security and determine who should have access to the data in your dashboard. This depends, of course, on the type of data that you store. In our case we're just storing some social metrics, which you can probably get from a public API, so the data is not that sensitive. Still, you want to think about who has access to what, and there are basically two aspects of this that you need to think about. One of them is authentication, in other words who you are, and the other is authorization, which is what you have access to. For authentication you need some sort of login method; there's the OAuth standard that will allow you to do that. For authorization you typically have something like role-based access control: a user has a particular role in a system, and depending on the role you have certain permissions. There are services that do authentication and authorization all in one for you, so you don't have
to code that yourself, but if you like a challenge you can also code this up yourself. Here's an example of how you could set that up. I have a permission enum, in this case just reading and writing, and then we also have a role, where each role has a particular name and a set of permissions. In this simple example we have a user that has a name and a particular role, and then here's a simple function that checks whether a user has a specific permission: it should be in the permissions belonging to the user's role. Then I have a main function here where I create a couple of roles, like the admin role and the user role, but you can obviously add other roles as well. Then you can assign roles to users, so we have user Alice, who's an admin, and Bob, who's a regular user. And then you can do some example checks, like "does this particular user have write permission?", by checking for the permission. When I run this, we see that Alice has write permission and Bob doesn't have write permission. Of course, you can add other types of permissions to a system like this as well. These kinds of access checks are something that you would add to your API, but you can also add them to your data dashboard so that certain users can only view certain types of data. By the way, this is one of the many examples taken from my Software Architect Mindset course. If you want to learn more about that, check the link in the description.

Unfortunately, we discovered that Streamlit is a bit finicky when it comes to authentication and authorization, especially if you want to host the dashboard yourself in the cloud. This is the issue that we haven't solved yet. So if you know of a way to have authentication and authorization in Streamlit that's easy to implement when you're hosting it yourself, let me know in the comments below, because we'd be very interested in that.

Now, a bonus tip: you want to make sure that you optimize your dashboard a bit so that it's
performant. Removing the data collection job from the dashboard and putting it into a separate thing already helps tremendously, because then you just have the database to interact with. But there are other things you can do as well to make your dashboard more performant. For example, you could aggregate data so that when you load the dashboard it doesn't have to get everything from the database, but just a subset or a high-level set. In our dashboard this is done in the overview part, where we just get a couple of high-level things. Another thing you can do is cache things. When you retrieve, say, a view of the data for a particular time period, you could store this in a Redis instance, for example, and then get that information really quickly. Of course, you also have to make sure that you set an expiry date so that you don't get stale cache values, but implementing this can definitely help, and you want to make sure that interacting with your dashboard is performant even if many people are using it at the same time. Another thing you could do to improve the user experience, if you need to load a lot of data and there's just no way around it, is to use asynchronous data loading. You already show the dashboard, but you use skeletons and then fill in the data as it comes in. That's typically a better experience for your users.

One thing that we learned from this project is that it's actually really hard to integrate APIs. That took up most of the time for this project, and not so much the user interface. The main time we spent on the user interface was not so much the coding part but figuring out what we actually wanted to show, and also removing things that were not necessary to keep things simple. In that sense, building a dashboard like this is kind of a combination of art and science. It was actually a really fun project to do, and we're going to work more on this dashboard in the future, so I might do a follow-up video where I share more
details of how we actually built this.

Now I'd like to hear from you. Have you built dashboards like this before? What was your experience? What were the most important things that you learned? Share them in the comments below. If you enjoyed this video, you might also be interested in another video series that I did a while back. It's a Code Roast where I completely refactored a data science project, and you can watch that right here. Thanks for watching and see you next time.
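The role-based access check described under the fifth tip (a permission enum, roles with permissions, users with a role, and a function that checks a permission) could be sketched roughly as follows. This is a minimal illustration, not the code shown in the video: the names Permission, Role, User, and has_permission are assumptions chosen to match the spoken description.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Permission(Enum):
    """The two permissions mentioned in the talk: reading and writing."""
    READ = auto()
    WRITE = auto()


@dataclass(frozen=True)
class Role:
    """A named role carrying a set of permissions (hypothetical structure)."""
    name: str
    permissions: frozenset[Permission] = field(default_factory=frozenset)


@dataclass
class User:
    """A user has a name and exactly one role in this simple sketch."""
    name: str
    role: Role


def has_permission(user: User, permission: Permission) -> bool:
    """Return True if the permission belongs to the user's role."""
    return permission in user.role.permissions


def main() -> None:
    # Two example roles; more roles or permissions could be added the same way.
    admin = Role("admin", frozenset({Permission.READ, Permission.WRITE}))
    regular = Role("user", frozenset({Permission.READ}))

    alice = User("Alice", admin)
    bob = User("Bob", regular)

    for user in (alice, bob):
        verdict = "has" if has_permission(user, Permission.WRITE) else "doesn't have"
        print(f"{user.name} {verdict} write permission")


if __name__ == "__main__":
    main()
```

In a dashboard, a check like has_permission could guard which tabs or charts are rendered for the logged-in user. How the user is identified in the first place (the authentication part) is left out of this sketch.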
Info
Channel: ArjanCodes
Views: 28,269
Keywords: data dashboards, dashboard, data dashboard, dashboards, dashboard tips, dashboard tips and tricks, python dashboard, python dashboards, data python dashboards, python data dashboards, python data dashboard, data visualization, business intelligence, interactive dashboard, interactive dashboard examples, data visualization storytelling, data science python, dash plotly, building data dashboards, data analyst dashboards, social media dashboard, social media dashboard project
Id: 6xncb3aTrXk
Length: 19min 5sec (1145 seconds)
Published: Fri Apr 19 2024