Build a TikTok Data Science App with Streamlit and Python | Data Science Project

Captions
What's happening, guys? My name is Nicholas Renotte, and in this video we're going to put our data analytics skills to the test and build our very own real-time TikTok analytics dashboard. That's right: you're going to be able to get real-time data from TikTok using a special Python library that I managed to find, and you'll be able to visualize that using a Streamlit dashboard. We're going to take this step by step, as though we're actually in a real-life client and developer relationship. Let's take a deeper look at what we'll be going through.

Alrighty, so what exactly are we going to be going through in this tutorial? First things first, we're going to need to get some data. Now, the TikTok API isn't all that open when it comes to getting trending data or data by specific hashtag, so I've found a great Python library that's going to help us get through this. We're going to grab that TikTok data using Python, then we're going to need to do a bit of processing with it. What you'll see is that when we get this data, it's not exactly in a format that makes it appealing to throw into a data frame, so we're going to need to do a bit of pre-processing. This is a really good technique for you to practice, because it's something you're going to do an absolute ton if you want to get into data science, deep learning or machine learning. Once that's done, we're going to go on ahead and build an interactive Streamlit dashboard, so we'll be able to type in a particular search query; this is going to go out to TikTok, bring back our data and visualize it in our dashboard. Ready to do it? Let's go speak to our client.

Hey Matt, how you doing? I heard you were looking for some insights.

Hey Nick, yeah, I was wondering if you could get me some TikTok data by any chance. One of our social media teams is looking to explore getting onto the platform.

Right, any data in
particular?

Uh, just anything that's popular at the moment. Maybe see if you can get it by hashtag as well.

Got it, I'll see what the team can do.

Right, so that's our first call with Matt done. We know that he wants to get some TikTok data. In order to do this we can use Python, and specifically we can use the TikTok API library to scrape data down from TikTok. What we'll try to do is export this to JSON so that his team's got some initial data to work with. Let's get to it.

Alrighty guys, so you heard Matt: the first thing that he wants is some TikTok data to help out his social media team. Now, Matt didn't necessarily specify what format it should be in, or what date ranges, or what features he wants in this data set, so we're just going to focus on this main requirement: get some data in there, ensure that we've got some TikToks, ensure that we've got some hashtags. This is a very agile way of development: take some requirements, build some stuff, iterate quickly and show value. It's a good way to ensure that you're going to hit the client brief, because going fast means the client can see some results quickly and iterate if they need to. This is a pretty common way of doing development in real life, and it's exactly what we're going to be doing here.

Now, I've gone ahead and done a little bit of legwork and found that we can get some TikTok data without using the official TikTok API, by using the unofficial TikTok API in Python. I'm going to walk you through my exact workflow step by step. This is from the ground up: I'm not going to be copying code and I'm not going to be skipping steps. Whenever you're using a new library or a new module, whether that be Python or JavaScript or any other language, the best thing is to try to find some documentation and see how to get started.
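The first deliverable described above (some records that carry hashtags, exported to JSON) can be sketched with the standard library alone. This is only an illustration under stated assumptions: the sample records, their field names, and the helper names `has_hashtags` and `export_videos` are hypothetical and are not the schema or API of any TikTok library.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample records: these field names are placeholders for
# illustration, not the real structure returned by a TikTok library.
SAMPLE_VIDEOS = [
    {"id": "1", "desc": "learning python", "hashtags": ["python", "coding"]},
    {"id": "2", "desc": "no tags here", "hashtags": []},
]


def has_hashtags(video: dict) -> bool:
    """Return True when the record carries at least one hashtag."""
    return bool(video.get("hashtags"))


def export_videos(videos: list, path: Path) -> int:
    """Write the records that carry hashtags to `path` as JSON and return how many were written."""
    tagged = [v for v in videos if has_hashtags(v)]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(tagged, f, indent=2)
    return len(tagged)


if __name__ == "__main__":
    out = Path(tempfile.gettempdir()) / "export_sketch.json"
    print(export_videos(SAMPLE_VIDEOS, out), "record(s) written to", out)
```

Checking the requirement in a small function like this makes it easy to tell the client exactly what the first export does and does not guarantee.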
Now, we can see here that there's a section in the documentation (I'll put the link in the description below) called Getting Started. It says that to install this we can run pip install TikTokApi, and then we need to install another thing called Playwright, a browser automation tool that drives a headless browser. This is effectively what's going to be used to go through to TikTok and scrape that front end. We're going to do this from scratch, so we're going to create an environment and you're going to see how to actually do this.

The only thing that I've gone ahead and done is create a folder called tiktok, and you can see that it's empty right now. I've got it on my D drive, inside of my YouTube folder. What I'm going to do first up is create a virtual environment. This is going to isolate all of our Python packages, so that everything is encapsulated and we don't have any conflicting dependencies.

So let's go ahead and do that. I'm going to open up a command prompt and go into that folder: I go into my D drive, then cd youtube, then cd tiktok. If I type in dir on a Windows machine, we can see that we don't have anything in that folder, and if I type in cls, that's going to clear everything. Now, remember what I said: the first thing that we want to do is create a virtual environment. This is relatively easy to do. We can type python -m venv and then the name of the environment. I'm going to call it tiktokanalytics, and this
is going to create a new virtual environment for us to work in, almost like an isolated Python installation. Actually, before we activate it, let me show you what's in that folder now. If I type in dir, I've now got this folder called tiktokanalytics. If I show you this inside of the main folder, you can see that this is all my Python stuff: I've got all my includes (doesn't look like there's anything in there), all of my libraries (not too many at the moment) and all of my scripts. In order to activate the environment, we need to come into the scripts folder and run activate. This is a little bit different if you're running on a Mac; I'll include a command that you can run on a Mac in the description below.

What we need to do now is activate the environment, because right now we're still running in the default environment on our machine. I can type .\tiktokanalytics\Scripts\activate. Let me show you what we're running: I'm going into tiktokanalytics, then Scripts, and then I'm running activate. If I'm running through PowerShell it'll trigger this script, and if I'm running through a batch script it's going to run this one. This is going to start up our virtual environment, so let's go ahead and run that and you'll see it activate. You can see that we're now running inside of our virtual environment, because the prompt says tiktokanalytics. Cool.

The next thing that we want to do is install our dependencies which, if you remember correctly, were pip install TikTokApi and python
-m playwright install. So let's run those two commands. Let's run the first one, pip install TikTokApi. You can see that's now installing, so we'll give that a sec. Okay, that looks like it's installed correctly. I'm getting this warning saying you are using pip version 19.0.3, however version 21.3.1 is available; I'll show you how to fix that in a second. But let's scroll back and see what we ran there. The first command that I've written was pip install TikTokApi, which is exactly what's in the documentation. That installs the TikTokApi dependency inside of our Python environment so that we can begin using it. We can verify that it installed successfully by running pip list: if we scroll on up, you can see we've got TikTokApi version 4.1.0, so it looks like we're pretty successful there.

Now let's get rid of this warning. It's basically telling us that we're using an older version of pip, the Python package installer. The warning actually tells us how to fix it: python -m pip install --upgrade pip. You're not necessarily always going to get this, and this step is purely optional, but if you do, this is how to fix it. This command looked really weird when I first encountered it, but basically it's saying: use Python to run pip as a module, then install, and upgrade the pip package. So think of it as using the Python package installer to upgrade itself. It's going to go from version 19.0.3 to 21.3.1. If I run that, you can see it uninstalls 19.0.3 and installs 21.3.1. So that's all well and good, and we can type in cls to clear that. That
is all now installed. Now, remember what was in our documentation: the next command it advised us to run is python -m playwright install. So let's install that now: python -m playwright install. We're running the exact command that's specified in the documentation. If I run that now, fingers crossed, that works. That looks like it's run successfully. If I type in pip list (I think I saw it there already), yeah, we've got playwright installed as well. Cool, those are our two dependencies now installed, and we've also upgraded pip, so you can see that's 21.3.1.

That's pretty much the setup that we need done for now, so the next thing we want to do is start doing a little bit of coding. I'm going to open up this current folder inside of VS Code. You can use a different code editor if you want to; I just find VS Code pretty straightforward. I'm going to type in code . and it opens up my current folder. You can see I've got tiktokanalytics over here and my PowerShell terminal running down here. Right now we don't have any Python scripts, so let's create a new file: right click, New File, and I'm going to call it tiktok.py. That's going to be our first file, where we're going to start doing a little bit of coding.

At the moment, the environment that's associated with this tiktok.py file is one of my older environments. You can see in the bottom corner it's saying deadlift and then venv; this is a previous virtual environment that I used. It might not be the case for you, but let me show you how to fix this. So if I select
that, it's going to ask me which interpreter I want to use. We want to use our tiktokanalytics interpreter, which you can see has popped up down here. If it doesn't show up, you can hit Enter interpreter path and then Find, and then go into the folder where we created our virtual environment. In this case you can see tiktokanalytics: I can double click into that, go into Scripts, choose python and then hit Select Interpreter. You can see we're now running on our tiktokanalytics virtual environment, which means that when we hit the play button up here, we're going to be using that same virtual environment. Okay, that's now done.

The first thing that we want to do from here is import a couple of dependencies, so let's do that first and then I'll walk you through what I've imported. Alrighty, so I've written two lines of code; let's take a look. First up, I've written from TikTokApi import TikTokApi as tiktok. This line is importing the TikTokApi package that we installed first up, as we saw in the documentation. In this particular case I'm importing a specific class from it and renaming it so that it's just called tiktok, which is going to make it a little bit easier to use. Then I'm importing json, which is what we can use to export our data. Let's make some comments: I'm going to say the first line is importing the TikTok API, or actually it's an SDK, so Python SDK, and the second is import json for export of data. Cool, so we've got our two dependencies now. What we need to do next, or what is probably a good idea, is take a
look at how we can use this TikTok API or Python SDK. If we go back to our documentation and scroll on down, you can see that we've got a bit of a quick start guide. It tells us that we can set up our instance, then get our information and print it out. We're going to try to do something similar, but I'm going to tweak it a little bit. So let's write a couple of additional lines of code and then we'll be able to get some TikTok data.

Okay, so those two lines of code are now done. The first one is blank at the moment, so let me mark it as incomplete, because we need to get some information from the TikTok site to be able to scrape it. The next line is complete, so let me explain what it does. I've written api = tiktok.get_instance, and we're passing through two keyword arguments. The first one is custom_verify_fp, and we're setting that equal to this value over here. What's going to happen is we need to go to the TikTok website and grab some cookie data to be able to connect to the TikTok page and effectively scrape it. That is what this TikTok SDK is doing: it's almost like a real-time scraper. Now, on the production version of TikTok you can get rate limited or get hit with a CAPTCHA, so we can use test endpoints to get around this. Hence the second argument: I've written use_test_endpoints=True.

Now let me show you where to get the value for verifyFp. This may change in the future, but as of right now this is how you can do it. If the code does change, I'll update it on GitHub. Okay, so what we need to do is go on over to TikTok. So if you go to
tiktok.com (let me get rid of this), just go to tiktok.com and pause whatever random TikToks are playing. If you hit the little lock button, then hit Cookies, and go to www.tiktok.com, expand it and go to Cookies, then scroll on down, you can see that there's a cookie called s_v_web_id. If you select that, you get a value underneath labelled Content. If I copy that value, that is what we need to pass through as verifyFp.

Let me show you how to get that again. Go to the lock, go to Cookies, expand www.tiktok.com (I can't expand this window any more, sorry guys), hit Cookies, and if you scroll on down you're going to get all of these cookies. Again, this may change in the future, but as of right now, under Cookies you're going to find s_v_web_id. Select that, copy its value, go back into our TikTok code and paste it there. This is the value coming from that cookie, and it's what allows us to connect and get this information. Cool, so that line is now complete and we can get rid of the incomplete note.

All right, let's take a look at what we've done so far. We've imported a couple of dependencies: from TikTokApi import TikTokApi as tiktok, and then json, which we're going to use in a sec. Then we've got that cookie data (remember, we went to the lock to get that), and then we've set up our instance. We've written api = tiktok.get_instance and we pass through this value, so we've set custom_verify_fp equal to this variable, and that's what we
got from the site. Then we pass through use_test_endpoints=True to ensure that we hit the non-production endpoint, so we can do this multiple times. Now what we can go ahead and do is test this out and get some information.

Okay, so let's test this out. I've written two additional lines of code. I've written trending = api.by_hashtag (stop popping up), and we've passed through Python as the hashtag that we want. If I type in api. you can see that we've got a bunch of different methods that we can use here. Say, for example, you wanted to get by sound, or by trending, or by username: there's a whole bunch of different ways that we can search, but we're mainly going to be focused on the by_hashtag search type. So I've written trending = api.by_hashtag with Python passed through, and then we're going to try printing that out. If we run this, let's see if we get any information back.

All right, so it doesn't look like we've got a successful run there. It's saying TikTokApi.exceptions.TikTokNotFoundError: Challenge Python does not exist. Sometimes it's going to throw errors depending on what you pass through here, so let's try lowercase python. All right, that looks way better. In this case you can see that we've got all of this information here, and there's a ton. So sometimes you might need to try a different hashtag to get this information in, and sometimes you might get rate limited as well. But in this case it looks like we've got a successful dump, so we've
got all of this info over here, which is a ton. But I don't know if our client's really going to like us just giving him this. This is not all that great, or at least I wouldn't be satisfied with it as somebody taking on the role of a data scientist or a developer. Let's try another one: if we type in AI, or ai, and run it again... looks like that's run successfully. Let me try that again. All right, so that's coming up with challenge does not exist. Let's try running that again. Okay, there you go. So sometimes it'll throw that challenge does not exist error, but if you run it again it'll work. You can see there that it's run successfully, but this is just way too much to understand in the terminal. We can hit open in file editor, but that's not going to bring it up; exporting this as a JSON file is going to be a little bit better.

Okay, cool, so where are we right now? We've got some information and printed it out, and it doesn't look all that great. So let's export this to JSON, which will at least satisfy our first client requirement of exporting some data. What we can do is use our json library to export this out. Let's double check how we can use it. I always forget whether it's json.dump or json.dumps: it's json.dump that we want, since dump writes to a file while dumps returns a string. Yep, cool. The way we're going to do this is we pass through trending. I'm not sure whether this is actually in a JSON-friendly format at the moment, so let's test it out and see if it works. The line that I've written is with open, and this is going to give us a file handle that we can actually
work with. So: with open("export.json", "w") as f. We're specifying that we want to write, so we're passing through the write flag, and then we write json.dump and pass through trending. That was the object, the results that we're getting back from the TikTok API. So we use json.dump to dump out this trending variable, and we pass f as the file, which means it should export to a file called export.json. Let's run this now and see if it runs successfully. Challenge ai does not exist: this error is just because sometimes it's going to throw errors when we run this search, so let's run it again. Okay, that looks like it's run successfully.

Let's take a look. It looks like we've got some JSON. What we can do is right click and hit Format Document, and this is going to format our data a little bit better so we can see it. But this is obviously huge: it's what, like 8,000 lines? Holy crap, that's a ton of data. (How do we get to the top? I'm just hitting page up.) So there is a ton of information here. If we take a look, we've got an id, we've got a description (what do you mean, Iron Man, Marvel), and it looks like we've got a hashtag, ai. Remember, our client requirement was that we need to be able to get hashtags out. So we've got hashtags, we've got TikTok data, we've got our video information. It looks like this is nested at the moment: we've got author, music, challenges, stats, duet info, text extra, and author stats, which is great and is probably going to be super useful later on. We've got stickers, and then this is our next video. So these are the main keys that you can see there, and we have successfully grabbed some TikTok data.

Now, this is pretty good. We could take this export.json file and send it over to our client now, and that would be our first requirement satisfied. If we run this script again we could change it up: let's say, for example, we want to give our client data on Python TikToks. So if I write python and run that now... right, that looks like it's run successfully. Let's just hit don't save, open this back up, right click and hit Format Document. There's got to be a faster way to get to the top of this; I'll probably fast forward this in editing. Let's just double check we've got some TikToks that are to do with Python. Okay, so this is not necessarily Python coding, this is Python snakes, but it looks like it's grabbed some TikToks and got the links, if we want to go and see them. Pretty cool, right? So we've at least got some data now. Remember, the client requirement was: get some TikTok data, ensure that it's got hashtags in it. So let's send this over to our client and see what they say.

Matt, got you some awesome data from TikTok. You want to take a look?

Oh sweet, I saw that in my email, but I noticed it was just in JSON format. It would be great to have it as a CSV so I can open it up in Excel, or maybe share it around with the executives.

Say no more, we'll get on it.

So Matt's come back and told us that he needs this data transformed from JSON to something that's a little bit more transferable and usable with pandas. What we can do is take our nested JSON file and transform it using a bit of an ETL pipeline. We'll build this up in a file called helpers.py. Let's do it.

Okay, so Matt's kind of happy. We sent him the JSON data but, as expected, it's not necessarily the best format for data analytics. Really, when you're using something like this, you
want it to be in a slightly more structured format, something like a CSV or an Excel document. This is pretty common whenever you're working with different business users or different teams within an organization. So we want to take this one step further and at least do a little bit of pre-processing. One thing I'm immediately seeing that's going to be an issue when it comes to using this in something like pandas is the fact that these fields are nested. When it comes to bringing this into pandas, it's not going to play all that nice. Let me show you what I mean by this, and along the way I'm going to show you how I set up my data science style environments whenever I'm doing a project like this.

Now, you're probably also thinking: why haven't you written this TikTok code inside of a Jupyter notebook, Nick? We know you love Jupyter notebooks. The TikTok API doesn't necessarily play all that nice with Jupyter, and I believe that's because Jupyter is asynchronous and the TikTok API is asynchronous as well. Basically, they don't like playing together at the same time, so it's better to run them separately. You're also going to see this later on when we integrate it into Streamlit.

So let's set this up so that we can use our environment inside of Jupyter. First up, I'm going to type python -m pip install ipykernel. This is going to allow you to install this virtual environment into Jupyter. It's
going to make your life a bunch easier. So let's go ahead and install this. This is purely optional as well, guys, but I want to take you through the actual process that I go through when building something like this. Cool, that's now installed. If we type in pip list to make sure, you can see ipykernel there.

Now that that's done, what we need to do is install this tiktokanalytics virtual environment into Jupyter. Again, you could skip this if you didn't want to use Jupyter, but I find it a little bit easier to work with. We can type in python -m ipykernel install --name tiktokanalytics, which is the name of our environment. Since we're running this from our activated environment, it registers that environment as a Jupyter kernel. This means we'll be able to see it when we go to create a Jupyter notebook, so we're going to have consistent packages across the board when it comes to doing development. When you do stuff inside of Jupyter versus inside of VS Code, you're not going to have differences in the environment: all the libraries that you're using when you're writing Python code inside of VS Code are going to be the same as when you're writing it inside of Jupyter.

So let's do this. Cool, that says it's now installed: installed kernelspec tiktokanalytics in C:\ProgramData. I believe you can list the kernels with something like python -m ipykernel kernelspec list... no, that doesn't look like it's the right command.
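The point above about Jupyter and VS Code sharing one environment can be checked from Python itself. As a minimal sketch using only the standard library (the function names `in_virtualenv` and `describe_environment` are illustrative and not from the video), running the same snippet in a notebook cell and in a script should report the same interpreter path if both use the same environment:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_environment() -> dict:
    """Collect the details that identify which Python environment is active."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": in_virtualenv(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the executable path printed in the notebook differs from the one printed in the terminal, the notebook is probably using a different kernel than intended.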
I've gone and screwed that up, so let's go back and reactivate the environment: .\tiktokanalytics\Scripts\activate. I'm trying to show you all the kernelspecs from here. What is it? It's jupyter kernelspec. All right, that's better. By running jupyter kernelspec list you can see all the virtual environments that we've got installed inside of Jupyter. You can see here that I've got tiktokanalytics, and that is now installed, so we can work with it. Cool.

So I've shown you that it's now installed inside of Jupyter. If I now start Jupyter Notebook, by typing jupyter notebook (promise we're not going to spend a lot of time here; maybe we will, we'll see), we've got our virtual environment folder, we've got export.json, which looks ridiculous, and we've got our tiktok.py script, or the initial version of it. Now, let's say for example we wanted to read our export.json file into a Jupyter notebook. We can hit New, and down here you can see that we've got our tiktokanalytics virtual environment. We can select that, and the notebook is going to be using the same environment as our Python code. I'm going to rename this: if I hit Untitled, let's rename it TikTok Analytics Test.

Say all you wanted to do is bring this data into a pandas data frame. We can type in import pandas as pd... do we have that installed? Okay, so what we now need to do is install pandas. If you haven't heard of pandas before, it's a great library for working with tabular data: think Excel spreadsheets, JSON, CSVs, and probably a bunch of others that I'm forgetting. It works really, really well; it's almost like Excel but for Python, really,
but not necessarily GUI-based; it's all Python-based. So what we now need to do is install pandas. We can type in !pip install pandas and it's going to install it inside of our environment. We'll give it a sec. Cool, that's now installed. If we now run pip list, you can see that we've got playwright and all of the packages that we previously installed, but we also have pandas.

All right, let's go and import pandas: import pandas as pd. The reason I'm doing this is that I want to show you what that JSON file looks like as a data frame in its raw format. So if I type in df = pd.DataFrame... is it pd.DataFrame? Let's see. No, it needs to be pd.read_json to actually read it in.

Also, pro tip: whenever you're writing commands inside of Jupyter, if you type a dot and then hit Tab, it's going to show you all of the methods that you've got available. In this case I wasn't sure how to read in a data frame, and this is super common: even though I do this day in, day out, I always forget whether it's pd.DataFrame or pd.read_json. So the reason is that I was typing in pd.DataFrame.
then you can see that we've got a bunch of different options here, but there's no command for reading JSON. Oh, there is to_json, but that will export out; we want read_json. So if I type in pd.read and hit Tab, you can see it down there. This is still a bit small, isn't it? That's a bit better.

All right: df = pd.read_json. This is going to allow us to read in some JSON, and it just so happens that we've got our export.json file here, so let's try reading it in and see what it looks like. I've written df = pd.read_json and passed it export.json, and that should give us a data frame with our TikTok data. If I type in df.head(), that's going to show me the first five rows of data, and there you go, we've got our data frame. Pretty cool, right?

But this looks like crap, I'm going to be honest. We've got the id of the video, and that looks fine, as does the fact that we've got the videos running down the rows. But take a look at the video column: we've got a dictionary there. Our author is a dictionary as well, and so is music, and challenges, and stats. This isn't all that useful. If we sent this to a client, they'd probably have to do a ton of formatting to get it into a format that's going to work for them. So what we want to do is tidy this up a little.

This JSON file is what we call nested data. If we take a look at the raw information (let me close this; it's all expanded at the moment, so let's collapse a few of these), we've got a ton of videos, so let's expand the first one. You can see that we've got one key here, which is id; we've got another key, which is description, and that's got a string; we've got a created time, which is going to be a number. But then if we take a look at
video, the value for video is actually another dictionary, or another object in this particular case. That is why pandas is not playing nicely with it. So we've got to work out a way to clean that up and put it in a format that's going to work for us, because if we go over to here, this looks like rubbish. That is not going to fly when it comes to opening it up in Excel.

Let's actually do that: what would this look like if I opened it in Excel? To export a data frame we can type in df.to_... we'll do it to CSV, so df.to_csv, and we're going to call it export, or rather TikTok data. Remember that Matt's requirement in this case was to convert it into a CSV, or produce it in a format that's going to be a little bit easier for him to consume. If we exported it as it is, this is what we'd be giving him. We've got id; we've got description, which looks okay, no issues there (let's change the column width, make it a bit smaller). But look at our video column: this is a dictionary, this is a dictionary, this is a dictionary. These formats are not intuitive enough to give to a user, so it's not all that great.

And guys, what I'm showing you here, opening stuff up in Excel, is perfectly fine. When you are doing data analytics, use all of the tools at your disposal. Just because you're a data scientist or a developer doesn't mean that you have to preclude yourself from using tools like Microsoft Excel. They've been around for a while for a reason: they are useful when it comes to being productive.

All right, enough of me ranting, back to our data. Look at this: it is not great. It makes it difficult for us to filter. How do we filter again? Ctrl+Shift+L. So if we wanted to filter,
we're filtering by this entire dictionary. Not all that great, right? Crap experience; that is not what we want. How do I zoom on this again? There we go.

Okay, so what we now need to do is a little bit of pre-processing. Rather than have these columns structured as they are, where we've got id, desc (that was the description) and created time, which are fine, I want to unpack all of the nested ones. Let's zoom in so I can show you this a bit better. If we take a look at our video column, I want to unpack it so that rather than it reading id, it reads video_id, and I want that as its own column. Height should be video_height, and that is its own column; width, its own column. You get the idea: every single unique value should have its own column rather than sitting inside these weird nested structures.

So we're going to write a function to unpack that, and that's where our helpers.py file is going to come in handy. I'm going to prototype it inside of Jupyter. We could definitely do this inside of a Python script, but it's going to be a little bit easier this way because you'll be able to see how I iterate through it. I'm going to include all of this code at the end as well, so you can skip this prototyping if you want, but it's useful because it shows you what this process actually looks like.

Let's clean up our Jupyter notebook a little bit. This section is going to be exploratory analysis. What have we done here? We can delete that pip list; we don't need that. Let's add comments, so: import pandas. I find it's useful to include comments as you're going along because it helps you remember what you're doing and what you're up to. So here we're reading in the initial JSON object,
then we're viewing the first five rows, and then we're exporting to CSV. So that's a little bit of exploratory data analysis already: we've imported pandas, read in the JSON, taken a look at the first five rows and exported it to CSV. But we don't like that CSV format; it looks crap, and we want to make it better.

So let's add in a bunch of cells. Again, pro tip: if you want to add cells, hit B twice and it's going to give you a couple of cells; if you want to delete a cell, hit D twice. Let's add another section. What I'm doing when I add these sections is converting the cells to Markdown. If you haven't played around with that before: when you select a cell it's green, which means it's active; if I hit Escape it goes blue, and then I can hit M, which converts it to Markdown. Markdown is a great way to annotate your code. So here we're going to write: create helper function to process data. Coolio.

Okay, the first thing we want to do now is bring in this data in its raw format as JSON. So, import json; that is going to be our JSON helper to load export.json. Then we can load it up: I'm going to write with open, export.json, pass through the read flag, as f, and then data = json.load(f). I think that's our data loaded. Yeah, that looks good.

Let's take a look at what we wrote there. I've written with open and passed through the name of our file, which is just this file here, export.json, so this is going to open it up. Then I've passed through the r flag, because we want to read it, and I've specified that I want to work with the file as the variable f: so as f, and then a colon. And then I've gone and specified
what we want to store that as: I've created a new variable called data and set it equal to json.load, and to the json.load function I've passed through the file. This is effectively a long-winded way of saying: grab the file and load it as JSON, and store it in a variable called data. If we take a look at our data variable here, that's all of our monster JSON data; there's a ton of it. Cool.

Okay, so what's the next thing we want to do? We want to loop through each one of these and un-nest it. It's probably useful to take a look at which of the values are nested, because when we do our processing we're going to want to unpack each of those. So let's go back into here: what's actually nested? Video is nested, author is nested, music... Let's create a list for this (not a dictionary, a list) called nested_values. I'm just storing the keys that are nested, because as we're iterating through we'll check whether or not a key falls into the nested_values list, and if it does, we're going to do some additional processing.

So video is nested, author is nested. What else is nested? Music. Challenges... challenges is not a dictionary, though; it's a list. If you take a look, each of the other nested values is a separate dictionary, whereas the challenges value is an array. To be honest, the value inside of that is kind of rubbish anyway, so I'm going to skip it. Let's create another list called skip_values, and we are going to skip challenges, because I don't think it's going to add much value. All right, stats: we definitely want stats. We're data scientists; we want that.
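A minimal sketch of the two steps just described: loading the raw JSON with json.load, then checking which first-level keys hold dictionaries or lists so they can be sorted into the nested and skip lists. The sample record and its key spellings (desc, createTime, diggCount and so on) are assumptions standing in for the real export.json, which is written to a temporary folder here only so the snippet runs on its own.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for one record of export.json; real field names may differ.
sample = [{
    "id": "101",
    "desc": "first video #python",
    "createTime": 1600000000,
    "video": {"id": "v101", "height": 1024, "width": 576},
    "author": {"id": "a1", "nickname": "nick"},
    "music": {"id": "m1", "title": "original sound"},
    "challenges": [{"id": "c1", "title": "python"}],
    "stats": {"diggCount": 10, "playCount": 250},
}]

# Write the sample to disk so the load step mirrors the walkthrough.
path = Path(tempfile.mkdtemp()) / "export.json"
path.write_text(json.dumps(sample))

# Load the raw JSON, as in the notebook.
with open(path, "r") as f:
    data = json.load(f)

# Inspect the first record: which keys are dictionaries, and which are lists?
first = data[0]
dict_keys = [key for key, val in first.items() if isinstance(val, dict)]
list_keys = [key for key, val in first.items() if isinstance(val, list)]

print("dictionaries:", dict_keys)
print("lists:", list_keys)
```

Checking types this way is only a convenience for spotting candidates; which keys to unpack and which to skip is still a judgment call, as in the walkthrough.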
What else is good? Stats, we need that. Duet info doesn't look like it's got much info anyway, so let's skip that one. Text extra is an array as well; if we really wanted it we could probably grab it, but I'm going to skip that one too. Actually, wait, we need to put that in the skip list. Let's collapse that. Author stats: we definitely want that. Stickers on item: ah, screw that, we don't need it.

Remember, this is the first iteration. If we wanted to go back through and add more data, or do some additional pre-processing to unpack the challenges or the text extra values, we definitely could. This is purely the first iteration.

Let's make sure that we've got everything we need. Id is fine, description is fine, created time is fine. Video: we've got that added there. We've got author, we've got music, we've got stats. Challenges we've put in our skip values, fine. Duet info, skip that. Text extra, skip that. Author stats, we've got that in our main list (I'm going to show you this in a sec). Stickers on item, we're going to skip that. And the rest are flat. Okay, cool.

So we've created two new lists, and these lists represent the values that need some additional handling: for nested_values, we're going to need to unpack each one; for skip_values, we're just going to skip them. We're going to iterate through each one of the values inside of this data object and process them. So let's take a look at what we need to do: first up we'll create a flattened dictionary, which we're going to call flattened_data, and then we want to loop through every value inside of data. Let's go and get this done. Okay, so that's our
initial loop. Again, nothing groundbreaking there: I've written flattened_data equals a blank dictionary, and then to loop through we've written for value in data, and we print value. But now we need to iterate through every single key inside of each value, so we can create a new loop there. I'm going to write for prop_idx, prop_value in value.items(). I should probably call these something better, but I'll leave it; it's fine. We then loop through each one of those and print prop_idx (need a colon there). Okay, so this is giving us all of our keys; you can see them there.

Let me show you what I'm doing. The first thing we've written is for value in data, so we're first looping through each one of the results, because remember we've got a ton of videos and they're in a list. Then we loop through every single property that we've got inside of one result. If we take a look, you can see that id is a property, description is a property, created time is a property, video is a property, and so on and so forth.

We should probably enumerate here, so let's create idx. What we want to do is create one unique id per video in this data set, so rather than just having a list, we're going to store each one against an identifier. Yeah, this is going to work better. So: for idx, value in enumerate(data). Let me show you what this changes. If I comment this out and print idx, this shows us that we have 30 different videos inside of
this result. Coolio. That means we can use idx as the identifier, and value is the actual video. Next, we're going to create a new record inside of our flattened_data dictionary: flattened_data[idx], and we set that equal to a blank dictionary. This is the initial phase; let me make sure the indentation is working. If I comment that print out and type in flattened_data, we've now got a dictionary with a unique id per video, and against each id another dictionary, which is going to hold that video's data.

Then we want to check whether or not a property falls into one of these lists. We should probably include the skip values in the nested list as well, and then just check whether or not the key falls into nested_values. Yeah, let's do that. So nested_values is going to represent all of our nested values, and then we check whether or not the key is within skip_values; if it is, we skip it. Okay, that makes more sense.

One of the easiest things we can do first is add in all of the flat data. Anything that's not nested we can add straight back into our flattened_data dictionary, which is going to make it easier when we write it out to CSV. So let's check whether or not the key is in nested_values; if it's not, we can go ahead and add it. Remember, prop_idx is our property name. So: if prop_idx in nested_values (have we run this cell? No, we haven't. All right), we're just going to pass for now. Then elif... actually, it's just else. Else, we add it to our flattened_data dictionary: flattened_data[idx], then [prop_idx], because that's the key, equals prop_value. All right, that looks like it's worked.

If we take a look at flattened_data, that's looking good so far. What we've got now is a dictionary of dictionaries: if we take a look at record zero, we've got all of our flat values now attached to the flattened_data dictionary. We've got id, description, create time, original item, official item, secret, for friend, blah blah blah, all of that other stuff. So that's looking better; we've got our hashtags there as well.

But now we need to do an additional bit of pre-processing when we get to the nested branch. If the key is in nested_values, we want to check whether or not it's in skip_values: if prop_idx in skip_values, then we're going to do nothing; else, we want to flatten it, so we then need to loop through every single nested property. I know this is pretty complicated, guys, but this is what data pre-processing is like.

Let's add some comments, because I think it's getting way too hardcore. Create blank dictionary; loop through each video; loop through each value in each video; check if nested; and down here, if it's not nested, add it back to the flattened dictionary. So we're creating a blank dictionary, then looping through each one of these videos, then creating these blank records. If you take a look, we've got these zeros over here: effectively we're going to have one key, which is just a number, per video. So if we've got 30 videos, it's going to start at zero and go to 29.
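The loop described so far can be sketched roughly as below. It also includes the nested branch that the walkthrough goes on to fill in, following the plan already stated: skip anything in skip_values, otherwise unpack each nested key into its own entry named property_nestedkey (for example video_id). The variable names follow the ones spoken in the video; the sample records and the camelCase key spellings are assumptions, not the real TikTok payload.

```python
# Keys whose values are nested. The skipped keys are listed here too, as discussed.
# Exact key spellings are assumed for illustration.
nested_values = ["video", "author", "music", "stats", "authorStats",
                 "challenges", "duetInfo", "textExtra", "stickersOnItem"]
skip_values = ["challenges", "duetInfo", "textExtra", "stickersOnItem"]

# Hypothetical sample standing in for the data loaded from export.json.
data = [
    {"id": "101", "desc": "first #python", "createTime": 1600000000,
     "video": {"id": "v101", "height": 1024, "width": 576},
     "author": {"id": "a1", "nickname": "nick"},
     "challenges": [{"title": "python"}],
     "stats": {"diggCount": 10, "playCount": 250}},
    {"id": "102", "desc": "second #python", "createTime": 1600000500,
     "video": {"id": "v102", "height": 960, "width": 540},
     "author": {"id": "a2", "nickname": "matt"},
     "challenges": [{"title": "python"}],
     "stats": {"diggCount": 3, "playCount": 80}},
]

# Create blank dictionary
flattened_data = {}
# Loop through each video; idx becomes the record's key (0, 1, 2, ...)
for idx, value in enumerate(data):
    flattened_data[idx] = {}
    # Loop through each property in each video
    for prop_idx, prop_value in value.items():
        # Check if nested
        if prop_idx in nested_values:
            if prop_idx in skip_values:
                continue  # not processed in this first iteration
            # Loop through each nested property and give it its own key
            for nested_idx, nested_value in prop_value.items():
                flattened_data[idx][prop_idx + "_" + nested_idx] = nested_value
        else:
            # If it's not nested, add it back to the flattened dictionary
            flattened_data[idx][prop_idx] = prop_value

print(flattened_data[0])
```

Each record ends up as a single-level dictionary keyed by the video's position in the list, which is a shape that pandas can turn into a table with one row per video.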
Then we are looping through each value, or rather each property, in each video; let's change that comment to 'each property in each video'. Over here, prop_idx is going to be the property name and prop_value is going to be the value attached to it. Let me explain: this is prop_idx and this is prop_value; this is prop_idx and this is prop_value. If a property is not nested, we're perfectly fine to add it straight back to our flattened_data dictionary: we just grab the key and the value and bang them into that dictionary. If it's nested, what we want to do is unpack it: instead of the key reading id, we want it to be video_id, and we don't want it to be nested.

So let's go ahead and do that now. If it's inside of skip_values, we skip it, mainly because I can't be bothered doing additional pre-processing for lists, though it shouldn't be too hard; if you guys do update that code, let me know. If it's in the nested_values list and it's not in the skip list, then we need to process it, so we're going to do another loop: we now need to loop through prop_value, through all of these nested keys. Again, I know this is pretty hardcore, but this is what it's like when you're working with raw data and new data sets.

Okay, let's write this loop, with the comment: loop through each nested property. What I've written is for nested_idx, nested_value in prop_value.items(). Looking at video, for example, this is going to be the nested index (the key) and this is going to be the nested value. So this is effectively going to cover this bad boy, everything under video for example. Let me show you exactly where it goes
to. Yeah, well, okay, hold on: this one's got something weird in it. We might need to drop share cover; anyway, that's fine, we'll just throw it in, and we could always drop it in the data frame. So it's going to get all of that and loop through all of that.

So I've written for nested_idx, nested_value in prop_value.items(), colon, and then flattened_data. What we're doing there is grabbing this idx, so say we're processing TikTok zero, we go to record zero, and then we create a new key. That key is a combination of the original property name and the nested key, so in this case it's going to be video_id. Let's take a look at another nested one: for author it's going to be author_id, author_uniqueId, author_nickname, author_avatarThumb and so on. That is effectively what this is doing here: prop_idx plus underscore plus nested_idx (this is just some string concatenation) equals nested_value. I think that's right.

Okay, so I've written a bunch of code there with a bunch of loops. Probably not the most Pythonic thing in the world. Let's run this and see if it works, and then I'm going to walk you through it all again. If I run this... that looks okay, no errors. You can see there that we've got our id, we've got video_id; it's no longer screwy. We've got rid of the majority of our nesting. I can see that there's still this one here, but I can't be bothered fixing that up; if you wanted to, you definitely could. In this particular case it's looking way better than what we had before.

So now let's try to make a data frame out of flattened_data. This is a dictionary now, right? It's no longer raw JSON. Let's check our type: yes, this is now a dictionary. To make a data frame out of it, let's create a new section, 'test out output'. What are
we doing? We're creating a data frame, so df_test = pd.DataFrame.from... we've got a from_dict method when it comes to creating a data frame from a dictionary. So I type in .from_dict and pass through flattened_data to that. No errors, which is always good. Oh my gosh, that is looking way better. There's one thing I can see that we're going to need to fix, but that's fine.

So what we've written there is df_test = pd.DataFrame (that's in camel case, capital D and capital F), dot from_dict, and to that we've passed flattened_data, which started out as the blank dictionary we created up here. You can see that we've got all of our values, but if you look closely, we now have all of our features as rows and our videos as columns. We can fix this pretty easily: we just need to pass orient equals 'index', and this will flip it. There you go, problem solved. We've now got our videos as rows, with id, description and so on as columns; looking way better. You can see here that we've got a little ellipsis, which is just the display cutting it off.

Let's export this to CSV to make sure it's looking good, and then I'm going to walk you through the code. If I type in df_test.to_csv, we can call it analytics.csv. We're going to open this up now: YouTube, then the TikTok folder (wait, no, wrong one), and I think we called it analytics, right? So we've got analytics over here. It's a bit small, but you can see it there. If I open that up, boom, that's looking way better. I can still see that we've got this video share cover thing here, but that's fine; we can delete that when we send it through to Matt. We've got our id, our description, create time, video id. Looking way better, right? We don't have all this nested crap that we were working with
before. We've got a ton of columns that we can actually begin to work with now. Look at that. Sick. Okay, so that is our data now pre-processed; you can take a look at all the descriptions. We're looking good, guys. So that is phase two, I think, now done.

Now let me walk you through the code that we wrote to pre-process. The first thing we did was set up these two lists over here: one for nested_values and one for skip_values. The nested_values list represents every first-level value inside of a video that is nested, which means it is not a base data type: it's not a string, an integer or a float, it's another object, such as an array or a dictionary. Then we've specified skip_values, which are values that we just don't want to process at the moment. We might come back and process those later on, but for now we're going to skip them.

Then we've written this bit of script over here. flattened_data creates our blank dictionary, and then we loop through each video: for idx, value in enumerate(data). Our idx is going to be a number, and our value is going to be each individual video. Then I've written flattened_data[idx] and set that to a blank dictionary, so this is effectively creating a blank dictionary for every single index in our data set. Then we loop through every property in each video: for prop_idx, prop_value in value.items(). This gives us our different properties, for example our id, our description, our video, and so on and so forth. We then check whether or not the properties are nested, so whether or
not our specific properties are a dictionary or a list. If they are, we check whether or not the key is in the skip_values list, which is that one over there. If so, we skip it; else, we go through and unpack it, creating that new unique key so that it's unpacked and flattened. And if it's not nested, we just add it back to the flattened_data dictionary. Cool. That was a lot to process.

Rather than leaving this here, though, we should probably create a helper script so that we can run this on the fly. So we're going to create a new file inside of our folder (we don't want to save that), and we're going to call it helpers.py, and start bringing some stuff across. Let's convert this into a function. We're going to call it process_results, and to that we're going to pass through the raw... let's call it json_data; actually, let's just call it data. So: def process_results, we pass through data, and then a colon. Inside of that, are we going to load it from a file? No, we don't need to load it from a raw file; we're going to be passing the value straight through.

Then we grab these two lists, our nested_values and our skip_values, and paste them in; you can see that there. We'll test this out as well. Let's add a heading in the notebook: convert processing code to function. (We can hide our explorer.) So we've got our two lists, and we then need our processing code, so we can copy that across. And then what are we going to do? We should just return flattened_data. Okay, I think that is looking good. So what have we actually gone and
done there? We've taken our two lists and put them into a function called process_results, which expects data as an argument. We've basically dumped all of our code into there, and we return flattened_data at the end. So if we save that, let's import it... into our Jupyter notebook? Actually, let's test it out inside of our Python script. We have trending over here, so let's import the process_results function into our tiktok.py script.

Let's add another line of code, with a comment: import helper into TikTok code... actually, that's a dumb comment. Import data processing helper. So we're going to write from helpers import (what did we call it again?) process_results. We don't need to dump this out to JSON and read it back first; it's already in that format. So what we should be able to do is say flattened_data = process_results and pass through trending. trending is effectively the raw data that we were originally exporting out to JSON, so we can comment that export out for now. We're now going to grab that raw data and pass it to the process_results function from over here, and all things being equal, this is going to give us a nice clean output which we can then load into a data frame. So this is going to process the data.

How are we going to test this? Let's dump it out to JSON. We're going to tweak our JSON code so that rather than dumping out trending, we dump out flattened_data. Let's try that now. We can probably stop our Jupyter notebook; we don't need that for now. So let's try running this code. What did we tweak there? We added this: from helpers import
process_results, and that is just the code that we wrote inside of Jupyter. All this code is going to be available on GitHub as well, guys, if you want to take a deeper look. So we've created our process_results function inside of helpers.py, we've imported that into our tiktok.py script, and then we've used it down here on line 15: we're passing our trending value to the process_results function and storing the result in a variable called flattened_data, which we're then exporting using json.dump. If we try running this, all things being equal, it should work. Let's see. Okay, it doesn't look like we've got any errors there just yet.

If we take a look at our export file now... this is looking better already. Okay, that is looking good, guys. Apart from this video share cover, which I can't be bothered fixing in this case, you can see that we're now getting flattened dictionaries exported from our tiktok.py script. So I think that is phase two now done.

What we should probably do now is set it up so that it exports out to a CSV, so let's go ahead and do that. Rather than just exporting out to JSON, we're going to tweak this a bit. We're going to comment that out, since we don't need it anymore, and export to CSV instead. First up we need to convert the data to a data frame, so right up here we need to import pandas, with the comment: import pandas to create data frames. I'm going to write import pandas as pd, and then down here we add: convert the pre-processed data to a data frame. We're going to write df = pd.DataFrame.from_dict and pass through flattened_data. Now remember, we also need to pass through (why is this killing me?)
orient='index'. I think that is good. So we grab that and pass it through to pandas.DataFrame, and all we now need to do is export it: df.to_csv, and we're going to call the file, actually, let's just call it tiktokdata.csv. We could specify index=False, or actually let's just export it as is for now.

Okay, so what have we actually done there? We added the pandas import, so import pandas as pd, and we commented out our JSON export because we don't really need that anymore. Matt said he wasn't really happy with the JSON export and would rather have a CSV or something of the like. Then we converted the flattened data dictionary into a data frame: I've written df = pd.DataFrame.from_dict, passing through the pre-processed value from our function and specifying orient='index', because remember that flips the axis, so rather than having the videos across the top as columns, we've got them as rows. And then I've written df.to_csv with tiktokdata.csv. Okay, so assuming this works, this should satisfy Matt's requirement. Remember, in the first iteration we just got the TikTok data and exported it to JSON; he came back and said he wanted a CSV, so now we're going to give him a CSV. It's still done our pre-processing, but now the output is going to be a lot cleaner. So if we run it, cross our fingers, hope it works. All right, so it looks like we've hit that 'challenge' error on python. Remember, sometimes we're going to get that error. That's fine, run it again. Okay, it doesn't look like we got any errors down there, so let's go and check out our data. What did we call it again? We called it tiktokdata, down here. Let's open that up. Boom, that is our python data. Let's check: we've got our python hashtags all in there, we've got video ID, video height, video width. Let's try a different hashtag, so rather than doing
python, I don't know, what's something that's popular? Let's try, that's something I'm looking into actually, let's try crypto. Save and run. So all I need to do to change the hashtag is change this api.by_hashtag call and pass through a different hashtag. We've got that first error, that's fine, just run it again. That looks like it's run successfully, so let's go and take a look at our data. TikTok and then, what is it, analytics? No, it was tiktokdata. All right, there you go, we've got crypto. You can see we're now getting all the data on crypto TikToks. I don't know if people are doing TikToks on crypto; I guess they are, because we can see them there. So you can see all of that information, that is good. We can now go ahead and shoot this off to Matt, and that is the second part of this now done.

Oh Nick, I forgot to mention, would you be able to take this data and maybe make it a little bit more interactive? What exactly did you mean by that, like a filterable pandas table or something similar? Not quite, any chance we could have a dashboard? Oh, okay, sure, what would you like to see on it? It would be great to see the tweets by views and maybe some analysis on popular authors. Could also just throw in a tabular view of the raw data, maybe at the bottom or something, so we can still take a look at the source. Done. Anything else? Yeah, actually, I had one suggestion: is there any chance we could include an input box or something, so if we wanted to we could search by a different hashtag? Because you never know what exactly our social media teams might want to target. No problems, we'll get on it.

So in order to build our dashboard we're going to be using a Python library called Streamlit. Streamlit makes it easy to create user interfaces and dashboards using Python. In order to build our Streamlit dashboard we're going to create a new file called app.py, and within that we're going to be importing Streamlit. What
we'll then do is set up a search bar which allows us to update our search by hashtag to get new TikTok data. We'll then load this data using pandas, and finally we'll visualize it using Plotly. Let's wrap this up.

Alrighty, so we've now given Matt his CSV data that he can work with, but this is where I thought this was going to go: Matt now wants a dashboard. So we now need to start doing some additional value-added analytics. How are we going to go about doing this? Well, in this particular case I think one of the best and easiest ways to build dashboards (that's our Jupyter notebook now dead) is to use Streamlit. This isn't sponsored in any way, I just think it makes stuff pretty easy. So Streamlit's great; Gradio's great if you want to try an alternative. What's another good one? I've actually started playing around with Tkinter. But in this particular case Matt wants a dashboard, he wants it to be a little bit interactive, and Streamlit is going to suit our circumstances for that very easily. So the first thing we need to do is install Streamlit. Now remember what I said when it comes to installing stuff: just go ahead and take a look at the documentation. You can see here that it says 'get started in under a minute'; this website is just streamlit.io. To install it we just need to run pip install streamlit, and then we can kick things off. So let's go ahead and do this. I'm just going to clean this up: we don't need export.json open, or helpers.py. Let's actually take a look at where we are right now, so let me open up my explorer. Inside of our folder, let's clean it up a bit. We've got analytics.csv; I think that was our existing CSV. We don't actually need that, so we can delete it.
export.json, that was our raw JSON export. That was the first thing we gave to Matt, so just as a record it's probably useful to keep that. This is actually the, the retweets one, so you can see that we've got the zero, right. We've then got a file called helpers, and this contains all of our helper code. Remember (let me zoom out a bit), this is what does all of our pre-processing, and this is all the stuff that we wrote inside of our Jupyter notebook. We've got the Jupyter notebook that we originally wrote (let me clear that a bit), so this was all of that iteration to produce our helper code, which gives us our pre-processed data. We've got tiktok.py, which is still our main file that we've been working with; this is what pulls in the TikTok data and exports it to a CSV. And then we've got tiktokdata.csv, which is the last deliverable that we gave Matt. All right, cool.

So what does Matt want now? Matt now wants his dashboard, and we decided that we're going to be using Streamlit. So first things first, we've got to get Streamlit set up, and to do that we can run pip install streamlit. I'm going to open up my terminal, and you can see that our environment is already activated, so we don't need to activate it. Let's bring this up so you can see it a bit better. So, pip install streamlit, give that a sec, and all things holding equal that should install successfully. Okay, that's Streamlit now installed. It's good practice to always test stuff out, so rather than committing to it straight away, just make sure that it works initially. So let's create a new Python file and call it app.py; this is going to be our dashboard. Let's delete that, because we don't want just a random file. So new
file, app.py. And what are we going to do? We're going to go ahead and test out Streamlit now. Let's hide our explorer, we don't need this. Okay, so in order to test out Streamlit we can type import streamlit as st, with a comment as always, so 'import base Streamlit dependency'. And remember we've already got our raw data (I don't know why I closed the explorer), so tiktokdata.csv, which means we can at least test it out and make sure that this is going to work. Then we import pandas to load the analytics data, so import pandas as pd. So we've imported two dependencies: import streamlit as st and import pandas as pd. What we can then do is load up our data, so 'load in existing data to test it out': df = pd.read_csv, and what's our CSV called again? Nope, that's not going to work. Is it called tiktokdata? Yeah, tiktokdata.csv. Okay. And then I think in Streamlit you can just go and render the data frame, so 'show tabular data frame in Streamlit' (not TikTok). Let's show the first five rows... actually, let's show everything and see what that does.

Okay, so what I've written there is four lines of code. import streamlit as st is going to give us Streamlit as our dependency. Then I've imported pandas, so import pandas as pd. Then we've used pandas to load in our data frame, so df = pd.read_csv, and it's going to import the CSV that we're exporting from over here. And then we've just written df, so this should show our tabular data frame in Streamlit. Now, running a Streamlit app is a little bit different to the way that you run a normal Python script, so let me show you. If I clear this down the bottom, to run this as
a Streamlit app we type in streamlit run app.py. All things holding equal, that should run. If you get this Windows Defender Firewall prompt, just hit allow access. And that is it working: you can see that we've got our data frame now available in Streamlit. So this is your very first Streamlit app, if you've got this far. We can go and play around with our data and filter it, so this still gives Matt and his team the ability to play around with the data frame. I don't like the fact that we've got this unnamed column over here, though. You can see that we've got zero, one, two, three, four, five, and then we've got it all over again, and this is because when we exported the data we exported it with an index. There's an easy way to fix that, but for now just take a look: we've got our Streamlit app up and working, so that is a great first start. But I want to fix this first, so we can stop our Streamlit app, go back into tiktok.py, and when we export to CSV we're just going to add index=False. I think that's it. Yep, index=False. Okay, so now let's run our tiktok.py script again and start our Streamlit app again, so streamlit run app.py. If it doesn't open up for you automatically, you can get the URL that your Streamlit app is running at. In this particular case we can see it's running at http://localhost:8501.
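To see where that unnamed column comes from, here is a minimal sketch of the export round trip. The flattened_data dictionary and its field names are made up for illustration (the real data comes out of process_results), and the CSV is written to an in-memory buffer rather than to a file; the pandas calls are the ones mentioned in the walkthrough.

```python
import io

import pandas as pd

# Hypothetical flattened records keyed by position; the real field names may differ.
flattened_data = {
    0: {"id": "111", "desc": "first video", "stats_playCount": 1200},
    1: {"id": "222", "desc": "second video", "stats_playCount": 3400},
}

# orient="index" turns each outer key into a row instead of a column.
df = pd.DataFrame.from_dict(flattened_data, orient="index")


def round_trip(frame, **to_csv_kwargs):
    """Write the frame to an in-memory CSV and read it straight back."""
    buffer = io.StringIO()
    frame.to_csv(buffer, **to_csv_kwargs)
    buffer.seek(0)
    return pd.read_csv(buffer)


with_index = round_trip(df)                   # index written as an extra, unnamed column
without_index = round_trip(df, index=False)   # index left out of the file

print(list(with_index.columns))
print(list(without_index.columns))
```

The first list should start with a column that pandas names for you because the index was written without a header; the second should contain only the real fields, which is what the dashboard ends up showing once index=False is added.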
So if I copy that and paste it in there, you can see that's our Streamlit app, and you can see we've got rid of that weird column. So that is now looking better, and if we wanted to expand it we definitely could. So you've actually got a way to play and interact with your data online, and that is your baseline dashboard now up and running. But we're not going to stop there; we're going to take this way further.

So what's the next thing that we want to do? The first thing I think we should look at is building up the search. What do we need for search? We probably need an input box and a button, so let's do these two things first. I'm not going to stop my Streamlit app; the nice thing about it is that it does hot refreshes, so if we want to write code and leave it running, it'll still work. So let's start writing some code to build up our dashboard. First up, let's create our inputs: input, and then button. Okay, so those are the next two aspects of our app. If I hit save, I've written two additional lines of code there. I've written hashtag = st.text_input, and the text input is one of the components that we can use to build this up. Let me give you some more info on this. If you go to the Streamlit docs at docs.streamlit.io and take a look under API reference, there are a whole bunch of different elements that you can use. Say, for example, I wanted something about text elements: it tells you that you can use markdown, title, header, subheader. In this particular case we wanted some input, so what would that be under? It's probably under input widgets, and you can see it there. So we are using a text input, and this gives
us all the information about how to use the text input. You can see all of that there. Where was I going with this? All right, that's a bit of documentation. Okay, so we've used the text input: hashtag = st.text_input, and we pass through the label, which in this particular case is 'Search for a hashtag here', and we set the value equal to a blank string. That basically means that when we want to work with that input, we have it inside a variable called hashtag. And then we need a button to trigger how we go and get new data. So I've created a button using st.button, and our button is going to say 'Get Data'. Now, in terms of how you actually do something when the button is triggered, we write if st.button and then we can trigger something underneath it. So I could write 'run get data function here', but at the moment we don't have a function to get the data, so we'll just leave it like that for now. But we're starting to build up our framework: we've got a hashtag, which is what we pass through, and we've got the button that we're going to use to go and get that data. So if we go back to our dashboard, which is running over here (I'm zoomed in a little too much, let's zoom out to 100%), that's our dashboard. If I go up to the little info tag it says 'Source file changed', so we can hit rerun and this is going to refresh our dashboard. You can see it's now starting to take shape: we've got our hashtag input over here and then we've got our button. So I can type in python and hit Get Data. In this
particular case it's not actually doing anything, but if we hit Get Data you can see it's running something up there. So, ai, Get Data: you can see it's running, but we don't have anything to show for it. What we could do is add st.write(hashtag). Let's try that; it should be that, let's see if it works. So if we rerun, you can see it's now writing out what we're passing through, and if I type in python you can see the value changing down the bottom. So this at least shows that we've got some way to trigger that get-data pull, however we end up doing it. Okay, so that's all looking good at the moment: we've got some data loaded, we've got our input working, and we've got a Get Data button working.

Now, how are we going to handle getting this data? At the moment we've got our TikTok script over here, tiktok.py, which works pretty fine, but it isn't exactly a function; it's just a Python script that we've thrown together. How do we need to reshape this to get it to work? Well, we're going to convert it into a proper script in and of itself, so rather than leaving it like this, I'm going to tidy it up a little bit. The other thing we're going to need is some way to pass a hashtag through to this tiktok.py script. Now, before I get any experienced Python guys coming at me for the way I'm about to do this, just know that the TikTok API library doesn't necessarily like working with Streamlit, so I've had to jerry-rig this a bit, but it does work. So let's go ahead and do this. First up, what we're going to do is we're
going to encapsulate all of this code here inside of a function. We're going to call it get_data, and it's going to expect a hashtag. I'm going to tab all of this in (let me zoom out so you can see what I'm doing, that's a bit better), so this is all of our code. And I think we can get rid of the JSON bit as well now, because we don't need that. So we're now going to have a function called get_data which takes in a hashtag. Next, we go to the line that says api.by_hashtag, and rather than leaving it as crypto, which was the last run that we had, we change this to hashtag. The reason we're doing this is so that when we call this script we can pass in a hashtag from somewhere else, which is going to give us the ability to run this from the command line. So that's the first thing. We also deleted the export to JSON that we used originally for pre-processing. So that's basically our get_data function now tidied up.

Now what else do we need to do from here? Let's test running this. I'm going to call get_data and pass through python. Actually, let's just double check what our data looks like at the moment. If we go to D drive, YouTube, TikTok, remember we've got this tiktokdata file, and the last one that we ran was crypto. So if we test this out now, we're going to run get_data and pass through python. Let's create a new terminal and run this; we need to stop this one. [Music] So I'm going to run python tiktok.py. Okay, it doesn't look like we got any errors; looks like it went and wrote out. If we open
this up again, all right, it's brought in python. Let's just do one more test to make sure that's okay. For now let's try ai. So remember, we've taken all of our code and encapsulated it into a function called get_data, and now rather than leaving it as python we're going to call it with ai. We run the script again the same way, using python tiktok.py, so that's now running with ai. You can see that there: ai, ai, ai, and so on. Cool, okay, so that is now working.

Now, we're not going to leave it like this, because this still doesn't give us the ability to trigger it from Streamlit. Actually, let me show you. It's not going to work, but let's take a look. Rather than leaving it like this, let's try to run this from app.py. I'm going to show you the errors that pop up if you try to run these together, because ideally the right way to do this is to import get_data inside of app.py. So we'd write from tiktok import get_data, and then we'd run it under the button, so get_data with a hashtag passed through. The hashtag is going to come from up here, so if I pass that through and save, then go back to our Streamlit app and rerun, it might run fine the first time, but what you're going to notice is that it starts screwing up the second time. So that looks like it's run okay, no errors there. Where's our other terminal? I'm just trying to find the terminal that's running Streamlit. Has that crashed? Okay, it's crashed, so let's restart it with streamlit run app.py. This is what I found while iterating on this and building it up the first time: it would run fine the first couple of times, but then might throw a bunch of
errors. So let's try it out with python. Okay, that looks like it's fine; it looks like it's printing stuff out. Now if I try to run it again, you can see this error popping up: greenlet.error: cannot switch to a different thread. This is because of the way that Streamlit runs and the way that the TikTok API library runs: there are conflicts between the different threads that are running at the same time. But we're not going to let that stop us. There is a way around it that I've managed to work out. So I'm going to kill this, hit clear, and refresh our Streamlit app. Rather than doing it that way, we can run it separately from the command line. So rather than importing our get_data function, we can run it as though we're starting up a new command line, because running it like this, python tiktok.py, doesn't cause any conflicts. We can run that multiple times and it doesn't seem to throw any errors, but if we run it side by side inside of our Streamlit app, that starts to throw a couple of issues. So we're going to run it as though we're doing this, but we're going to trigger it from our Streamlit app.

Let's start doing that. What we first need to do is clean up how we run this script. I'm going to write if __name__ == "__main__", which is basically checking whether or not we're running this file directly, for example from the command line, and underneath it goes whatever we want to trigger in that case. In this particular case we want to trigger get_data, so I call get_data and we're just going to pass through ai for now. Let's run it again. Right, no issues. ai was the last one we ran, wasn't it? Was it? I don't know. Let's try crypto
again. Right, no errors there. So let's take a look: going to YouTube, going to TikTok, and going to tiktokdata. Crypto, crypto, yep, crypto. All right, so that's worked successfully. But rather than having to go into the tiktok.py script and update the hashtag, we ideally want to be able to just run it from the command line and have it do its thing. We can solve this pretty easily. We're going to go up here and import a new dependency, sys, so 'import sys dependency to extract command line arguments'. Let me explain this, because when I was starting out with Python I really had no idea what this was doing. When you run a Python script you can pass through a few arguments. So if I type in print(sys.argv), sys.argv is going to hold all of the arguments that are passed whenever you trigger a Python script. If I run this command right now by itself, without passing anything else, you can see that we're getting one argument here: tiktok.py. Now if I were to run the same thing and pass through an argument (let me bring this up a little so you can see it), say our hashtag. What's a good one? Let's just pass through ai for now. You can see that we're able to get our second argument over here, so it says ai. Now this is great for us, because we just need to grab that second value and pass it through to our get_data function. I'll show you how to do that. This is just a list, so we can grab the second value, which is index 1, and that's going to give us that ai value. So if we run it again, boom, ai. Pretty straightforward. Now, in order to run this for our get_data function
we can just grab that and pass it into here, so this is now going to run get_data for the command-line argument that we're passing through down here. Let me prove to you this is actually running; I'm always super skeptical when I watch tutorials, because I want to see them do it. What does the file hold right now? This is a crypto one, so the last run that we did was crypto. Let's close this; I'm going to leave that file there. Right now it's at crypto, remember that. So if we run this again, remember we're passing through ai down here (let me make this bigger). All right, so we're getting that 'challenge does not exist' error. That's fine, we just run it again; normally it runs successfully the second time. Okay, so that ran successfully the second time, and if I go in now, boom, we're getting ai results. You can see that there. Cool, so that is now successfully working. Let's try it again. I'm trying to think... Formula One, that's popular right now. Lewis Hamilton is about to beat Verstappen, famous last words. All right, let's... I can't run that. Okay, so that's run successfully, no errors. We should probably have some logging when it's run successfully. So you can see it's run and we've got the hashtag f1, so that is working now. Cool.

So we've taken our original tiktok.py script and converted it so that we can run it from the command line and just pass through an argument. Now you're probably thinking, Nick, how on earth are we going to run this inside of our app.py file? Well, hold on, we've got this. Let's quickly recap what we did to get this running. We took all of the code that we wrote initially to get our TikTok data and encapsulated it inside of a function called
get_data, which takes in a hashtag. We then imported the sys dependency so that we could extract the arguments that we're passing through at the command line. And we cleaned it up by writing if __name__ == "__main__", so that when we run the file from the command line as the main script, we run get_data and extract the argument from the command line to do it. So we can just type in python tiktok.py followed by your hashtag, and we can effectively run this for any hashtag. As long as you pass through your hashtag here, this is going to run. And the hashtag needs to be without spaces; key thing to know, without spaces.

Okay, so what's next? Now what we need to do is integrate this back into our app.py file, because remember, when we tried to do it with that get_data function directly, it was throwing errors. So let's try doing this. First up we need to import another dependency called subprocess, and this is going to allow us to run the script from the command line. So, 'import subprocess to run TikTok script from command line', import subprocess. All right, cool. Actually, wait, we need to write that slightly differently: from subprocess import call. This call function effectively allows us to run... let's bring up some doco, because doco is your friend. subprocess, and we need the call function. Okay, so down here you can see the call function: 'Run the command described by args.' This basically means that we can pass a list through to this call function and it's going to run what we need. So what we're going to be passing through is python, and then we're going to pass through
tiktok.py, and then we're going to pass through the hashtag. So we can go down here and run call: python is our first argument, tiktok.py is our second argument, and our third argument is the hashtag that we're extracting from our input. Do we need... I'll leave that there for now. Okay, so let's run this now. Is our TikTok app still running? I have no idea how to get back to the terminal for that. It doesn't look like it's running anyway, that's fine, so we can run streamlit run app.py. Okay, fingers crossed this works, guys. So we can now type in python and hit Get Data, and that looks like it threw an error. This is a slightly different error, that's fine, let's try it again. Is this bringing in python? This is still the f1 data, so that doesn't look like it ran successfully. Let's try it again. Okay, that's run successfully, and you can see that we're getting python now. Let's try another one: ai. Right, so that's now running for ai. So we have now successfully implemented the search for Matt. That was one of the biggest things he wanted: to be able to search for a specific hashtag and have that processed and rendered into the dashboard.

Now, the one thing that I don't like is the fact that this shows up with the last set of data. I'd rather have it blanked out until you run Get Data. We can fix that really easily: all you need to do is tab this under the button, so that this df code down here is only run once the button is hit. If I hit save, go back into our Streamlit app and rerun, you can see that when I refresh it there's nothing until we go and run the script. So if I run python, okay, there you go: it only shows up once we've run it. We can probably get rid of this, and that is just this line over here, so we can get rid of that and hit save. So you can see that.
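The pattern described here, a script that reads its hashtag from sys.argv and a caller that launches it in a separate process with subprocess.call, can be sketched on its own. This is only an illustration under a few assumptions: worker.py is a hypothetical stand-in for tiktok.py, its get_data just records the hashtag instead of fetching anything, the extra output-path argument exists only so the result can be checked, and sys.executable is used in place of the literal "python" so the same interpreter is reused.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for tiktok.py (hypothetical): the real script would fetch, flatten
# and export TikTok data; this one only records the hashtag it was given.
WORKER_SOURCE = '''
import sys


def get_data(hashtag, out_path):
    # Placeholder for the fetch, pre-process and export steps.
    with open(out_path, "w") as f:
        f.write(hashtag)


if __name__ == "__main__":
    # sys.argv[0] is the script name; the arguments start at index 1.
    get_data(sys.argv[1], sys.argv[2])
'''


def run_worker(hashtag):
    """Run the stand-in script in a fresh interpreter; return (exit code, recorded hashtag)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "worker.py"
        out_file = Path(tmp) / "out.txt"
        script.write_text(WORKER_SOURCE)
        # Each list element becomes one command-line argument of the child process.
        code = subprocess.call([sys.executable, str(script), hashtag, str(out_file)])
        return code, out_file.read_text()


if __name__ == "__main__":
    print(run_worker("python"))
```

In the dashboard itself the equivalent call sits under the if st.button(...) check, with the list being python, tiktok.py and the hashtag from the text input. Because the child process starts from scratch each time, nothing from a previous run is carried over into the Streamlit process, which is presumably why the thread conflict stops showing up.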
That's not going to show up anymore. How sick is that? So we've now got a real-time TikTok data feed.

Okay, so now what we probably want to do is, not necessarily clean this up, but add some slightly nicer analytics, because at the moment we've just got a data frame, which isn't revolutionary, right, guys? So what we're going to do is minimize this and start building a couple of plots. This is where we start to get a little bit more data analytics-y. Okay, so let's think about the flow. We've got our input running fine, we've got our button running fine. We could probably encapsulate this inside of a function, but for now I'm just going to leave it. Then our data is loaded: imagine it runs that Python script and our data is refreshed, and we've then got our CSV in a file called tiktokdata.csv inside of the same folder. What we can then do is use Plotly, or some other visualization library, inside of Streamlit to visualize it. So I'm going to import Plotly to start doing some viz. Let's do that. And I just realized we don't actually have Plotly installed, so this line I've written isn't working yet. Let me walk you through it anyway: I've written import plotly.express as px. Now, Plotly Express is amazing. If you go to the Plotly site, there are a ton of different types of visualizations that you can do with it. Let's type in Plotly Express. This is a really straightforward visualization library. Say, for example, we wanted a scatter plot, which we'll do in a sec: it's literally (let me zoom in on that) import plotly.express as px, and then to create a scatter plot it's px.scatter, and you pass through your x value and your y
value and that's sort of done so pretty straightforward and then you can like hover over your values and actually see those values pretty cool right so we're going to do exactly that but first up i just realized that we haven't actually gone and installed plotly so let's take a look at our documentation so remember whenever you're looking at doco always look for getting started or some way to install so i'm gonna hit getting started so how do we install it all right so it looks like we can run pip install plotly and this is for a specific version so we don't necessarily need to do that we can just run pip install plotly so let's try that and i'm gonna stop our streamlit app for now and clear that so i'm going to run pip make this bit bigger pip install plotly i think that's all we need all right that's plotly installed so we can clear that now so if we go and so you can see that that error is now gone right so now when we try to import it it's not throwing a whole bunch of errors so what are we doing let's start our streamlit app again so streamlit run app.py looks like it's running successfully we can close the old one let's test out our search again so we're gonna run python hit get data and remember if you get that challenge error just run it again normally it'll work all right so that is our baseline streamlit app now running but we are not happy with that we want to do some viz and we're going to do viz underneath so we've got our call function we've got our data frame function we're going to add in start adding in our viz here viz yeah or plotly viz here all right let's uh i don't know let's create a bar chart or something of all the tick tocks to begin with so let's do it okay so that is our first chart now done so i've gone and used a histogram now uh i can't remember hold on let's take a look our histogram is going to calc yeah so this is going to do a count let's see if it automatically does a sum so what i'm doing here is i've got the data frame okay 
let's just give it a crack so i've written fig equals px dot histogram and then i'm passing through our data frame and then what i'm actually doing is i'm passing through the two columns that i want to visualize so in this particular case i've passed through my x variable which is going to be the description of the tweet so if i actually show you that at the moment so it's going to be this column here by the let me show you uh the value uh where is it stats so stats underscore digg count so that column there so that is going to be our histogram so x by description y and then stats underscore digg count and then i've written st dot plotly underscore chart and then we're going to pass through this figure so this is effectively going to take a plotly chart and put it on a streamlit app so remember we imported streamlit as st up here so st.plotly underscore chart we're passing through our figure and then i'm passing through a keyword argument called use underscore container underscore width equal to true so this is basically going to take up the full width of the app so if i go and hit if we go and actually just refresh this now python fingers crossed we should get some viz and there you go that is our first chart now showing so you can see that we've got our sum of our digg count values and we've got all of our all of our tick tok values how cool is that right i'm not sure whether or not these num no that look okay pretty sick right so that is the beginnings of our dashboard so we've now gone and created our hashtag we've now gone and created our first visualization um we might want to do some scatter plots as well so let's do the same thing so again it's pretty straightforward to do this we just go and use the different plotly express visualization types and then we use whatever the chart type is and add it to our plotly chart so i'm going to add in a couple more and show you how to do these all right so i went and did this one a slight bit different so what i've gone and 
done is i've gone and created two columns within my streamlit dashboard so i've written left underscore col comma right underscore col and i've set that equal to streamlit dot columns so that basically means i can put something inside of my left column and put something inside of my right column so inside my left column i'm first i'm going to put in this scatter plot so i've written scatter one equals px dot scatter and then to that i'm passing through my data frame again pretty much the same format as what we had for our histogram up here so df then x and then i've set that equal to stats underscore share count and remember this is just data from our data frame so we had this column called let's just go to the stats section and it's not going to be that easy i can't search so we've got stats uh share count and then what else did i do it by stats underscore comment count so we're going to have a scatter plot of um our share count by our comment count so um our comment count is going to be on our y axis so that might show the relationship between how many shares we've got versus how many comments we've got and then to associate that to our dashboard i've written left underscore col let's remove this search left underscore col dot plotly underscore chart so up here we used streamlit.plotly chart down here we use the left underscore col dot plotly chart so when you're creating the two new columns you're effectively creating a mini dashboard in and of itself so what we're actually doing is we're associating the plotly chart to the mini left col dashboard effectively and so to that what i'm passing through is scatter one and then again i'm setting use container width equal to true so if we go and run this again hit get data right so that is our first scatter plot now running so you can see that down the bottom there so we should probably add a label but again you could customize this to your heart's content so we've got our first scatter plot let's create another 
scatter plot so we could actually just copy this let's zoom out a little bit so we can see it side by side we're going to copy this and this time rather than making it scatter one we can make it scatter two a second chart scatter two and we are going i don't know what should we visualize maybe some author stats so let's do uh video count by their heart count so we've got a whole bunch of additional stats in here so we've got uh and again you could do it by any columns it's it's this is the world's your oyster right matt didn't specify so we can throw in whatever we want at least initially you're probably going to get new new requirements coming through as and when they do i'm just looking through what columns we've got here we go we've got some author stats so let's do the video count on the x-axis video count and then we're going to do a heart count on the y-axis so we're going to do what is it i'm trying to author stats underscore video count and then our y-axis is going to be our heart count so over here author stats underscore heart count and we need to put this inside of our right column and what we're going to be passing through is our second chart so it should be scatter two go and refresh get data cool so we've now got our author stats we've now got our baseline stats um we could probably tweak this to have some color and sizing for these little plots or for the actual points let's go on ahead and do that so in order to set the size we can type in size equals stats play count we can also set the color the same way so color equals stats underscore play count then down here we can set our size size equals what did i use follower count and a color would be the same thing and again you could customize these you can play around with the charts i'm just sort of giving you a baseline and we're ensuring that we keep matt happy so we've gone and set the size and the color and we've just passed through a couple of arguments there so let's go and refresh it again and 
there you go so we've now got some different sizing we've now got some different colors and you can see that we've got whenever you hover over let me zoom in on this whenever you hover over one of these values you get some information now we can actually add more information or subtract the information if we really wanted to but in this particular case i think it'd be useful to have some information at least over here about maybe the name of the tweet or something along those lines so we can actually do that using the hover what is it keyword argument inside of our plotly chart so let's actually finalize that so for my scatter plot i'm passing through hover and i'm setting that equal to d e s c so when we hover we'll actually be able to see the full tweet and we can also make this a little bit shorter because at the moment it's taking up a bit of real estate right so you don't actually see these plots until you scroll down let's say for example we'll make this like 300 pixels so we can make this height equal to 300. 
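As a rough sketch of the charts described up to this point, the block below gathers the keyword arguments for the histogram and the two scatter plots in one place. The column names (`stats_diggCount`, `stats_shareCount`, `authorStats_followerCount` and so on) are assumptions reconstructed from what is spoken in the captions, so they may not match the real CSV header; `required_columns` and `build_figures` are hypothetical helper names, not part of the original app. One thing that can be said with certainty is that Plotly Express has no plain `hover` keyword; the argument is `hover_data`, so that is what the sketch uses.

```python
# Column names are assumptions reconstructed from the spoken captions;
# compare them with the header row of your own CSV before relying on them.
HISTOGRAM = {"x": "desc", "y": "stats_diggCount",
             "hover_data": ["desc"], "height": 300}
SCATTER_1 = {"x": "stats_shareCount", "y": "stats_commentCount",
             "size": "stats_playCount", "color": "stats_playCount",
             "hover_data": ["desc"]}
SCATTER_2 = {"x": "authorStats_videoCount", "y": "authorStats_heartCount",
             "size": "authorStats_followerCount",
             "color": "authorStats_followerCount"}


def required_columns(*specs):
    """Return every data frame column that the given chart specs refer to."""
    columns = set()
    for spec in specs:
        for key in ("x", "y", "size", "color"):
            if key in spec:
                columns.add(spec[key])
        columns.update(spec.get("hover_data", []))
    return columns


def build_figures(df):
    """Build the histogram and both scatter plots from a pandas DataFrame."""
    # Imported lazily so the helpers above can be used without plotly installed.
    import plotly.express as px

    missing = required_columns(HISTOGRAM, SCATTER_1, SCATTER_2) - set(df.columns)
    if missing:
        raise KeyError(f"CSV is missing expected columns: {sorted(missing)}")
    return (
        px.histogram(df, **HISTOGRAM),  # with y given, bars show the sum of y per x value
        px.scatter(df, **SCATTER_1),
        px.scatter(df, **SCATTER_2),
    )
```

Inside the Streamlit script this could be used as `fig, scatter1, scatter2 = build_figures(df)`, followed by `st.plotly_chart(fig, use_container_width=True)` and the matching `left_col.plotly_chart(...)` and `right_col.plotly_chart(...)` calls. Checking the columns first is a design choice made here because the names are guesses; a clear error is easier to act on than a Plotly traceback.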
and let's just do our hovers for the last couple so what i'm going to set the others to i think this one we're going to set to description as well and this one let's set it to the unique id for the author and again these are just values that i'm getting out of this data frame so again i've sort of picked some previous ones that looked good uh so we are going to use uh author unique actually let's use author nickname author what do we need so author underscore nickname alright so if we save that do we add so we just added the hover for our histogram we added the hover for our scatter and we set that equal to the description that means if we go and refresh this now let's take a look uh and we've gone and have we do it shouldn't actually be hover it's hover data hit that there and update that there try that again okay so we can see that our histogram is a little bit shorter now but if we go and hover over it you can see that we've actually got our description showing up there so you can see there's oh julius julius the python on youtube this one says stain petite python banana happy birthday little buddy check out this baby it's not actually python specific um but again if we go and take a look at our plots down here you can see that we're getting our descriptions probably a bit long to be honest but anyway down here we can see that we've got our author nickname so in this case we've got j brewer popping up the reptile zoo galaxy exotics let's try rather than doing it for python let's do it for ai again same type of things collegehumor pretty cool right okay so that is our baseline dashboard done so i mean i probably skipped i went through this a little bit faster than i would have liked but um you can start to see that in order to use plotly express all you need to do is pass through the x value and the y value and these are just values from your data frame so the first value that you pass through is your data frame itself so what we had from up here then you 
specify what you want your x to be what you want your y to be if you're using like a 3d scatter then you can pass through let me actually show you if you're using a 3d scatter let's search for 3d scatter so when you're using a 3d scatter you need to pass through your data frame again but then you pass through your x your y and then your z value so that allows you to actually use a little bit more data cool all right and that is uh that done so i think that's all of our histogram done our scatter done and our additional scatter i think we're looking good the last thing that i think we should probably do to make this a little bit more presentable is you can see that when you go full screen you don't it still sort of stays kind of narrow so i think the last thing that would be sort of useful to do is make this wide and then maybe add like a sidebar that gives us a little bit of information about the dashboard so let's actually go and do those last two and then i think we'll be able to send it over to matt and be done so the first thing i'm going to do is i'm going to make this wide and this is pretty easy you can actually just use the page config so let's do it all right so i've gone and so let's just say uh set page width to wide so i've gone and written one additional line there so just after all of my imports i've written st dot set underscore page underscore config and to that i'll pass through one keyword argument and i've set that equal to wide so i've set layout equal to wide so if i save this now and we go and run get data again at the moment you're probably not going to see anything because we've sort of collapsed it but if i go and expand you can see that we're now taking up the majority of the screen so it looks a little bit better we've got a little bit more room we can play around at least visualize what these labels are actually saying in a little bit of an easier manner right so we can actually see the full the full tweet or at least the majority of the 
tick tock text and again if you wanted to expand any of these you definitely could all you just need to do is hit the little arrows so you can see that there so we can make this big and see everything that's happening okay now the last thing that i think we should probably do is add a sidebar so streamlit's got sidebar capabilities i don't think i'm probably using it the best way but let me show you what they look like streamlit sidebar right so you can add like these sidebars like that i'm gonna be using it just for some information uh we don't have them here but uh this at least gives you the ability to provide a little bit of context as to how to use the dashboard how you can play around with it so let's go add our sidebar and then i think we'll be done ready to send it off to matt so let's do it actually before we do our sidebar i actually want to include a logo of tick tock in there so i'm going to go and find a tick tock logo let's just make sure we've got a transparent one i closed it uh this one looks okay yeah let's copy this one and i'm gonna save it in the same folder as where we've been doing all of our work so if we go into tick tock i'm just gonna call it logo and then we should be able to bring this into our dashboard okay so let's uh let's go and write this up okay so that's uh the one now done now probably need to do a little bit of formatting so actually i showed you how to get the logo but in this particular case i realized that's not actually going to work we need a link to an image so you can see that we've got this image here but we need to make it a little bit smaller so we can do that using the markdown so there's different ways to actually work with the sidebar in this particular case i found using the markdown capability works the easiest so i can actually just pass through i think it's just width equals 200 here let's try that okay so that's way smaller uh is 200 i think let's make it 100 a bit better okay cool so what i've written is st 
dot sidebar dot markdown and then i've started to use a little bit of markdown so this is pretty much html so open parentheses or open um brackets image source equals and this is all as a string right so if i just let's take it out of the string so you can see it a little bit easier so source and then to that i've passed through the url to the tick tock image that i showed you and this just makes it easier to work with and then i've specified width equals 100 to make it a little bit smaller and then forward slash and then close brackets and then we actually put that inside of double quotes to get this to work and then i've written unsafe underscore allow underscore html equals true to get our tick tock image now if we wanted to we're going to also add in a let's add in a title so if i type in h1 i think it's h1 yep and we're going to set our style this is so difficult to read um h1 and then style that'll make this easier to read guys let me write in a second line and then i'll show you all right so h1 style equals uh how do we do this it's in parentheses so display and then it is inline block and so this is really just html when you think about it and that's what markdown is it allows you to use html inside of your markdown code um so h1 style and then this is going to be our title so we could call it tick tock dashboard because right now we've just got the logo so if i show you that's just the logo we want it to say i don't know tick tick tock analytics matt will love that okay so if we grab that and then i think that's it uh yeah and then if we paste that just after our logo and make this single quotes and hit save let's see what that looks like okay so it says tick tock analytics i don't know why it's not picking up the h1 tags h1 and maybe put it inside of a separate div what does that look like oh that's better okay so it needed to be in its separate div so you can see that now it says tick tock analytics slick right okay so to that markdown i then 
went and added h1 and then style and then this is just a little bit of html so you can pretty much just copy this and if you say for example we want it to be tick-tock dashboard then just change the text which is here so you could write dashboard right and then if we rerun now i'm going to say tick-tock dashboard but we're going to leave it as tick tock analytics so rerun cool and then i've pretty much done the same thing down here so i've written st sidebar dot markdown and then i've written this dashboard allows you to analyze trending tick tocks using python and streamlit and then i've written st.sidebar.markdown and then to get started and then i've just used an ordered list and some list items down here so that gives us the ability to give some instructions to our potential users so i think i can actually get rid of that i hit save and you can see that basically just reads out to get started enter the hashtag you want to analyze hit get data and then get analyzing so if we hit save now i think that's pretty much everything now done so let's go ahead and rerun it so you can see it looks pretty good right we can close all of this stuff so let's type in ai i know you could also add some searches and stuff on the side and there you go there's our ai dashboard how cool is that so that effectively gives matt and his team the ability to actually go and analyze some tick tock data so if we wanted to say for example now no formula one uh let's actually type in red bull racing i don't know if that's a hashtag might be nothing there oh there is pretty cool let's uh do a mercedes f1 there you go so it's coming up with mercedes there you can even see it pretty cool right so we can obviously expand this do some more analysis hamilton runs over his mechanic classic vettel but you can start to see that we've got a whole bunch of analytics at our fingertips here and that's pretty much it so what the last couple of things that we went and added to our dashboard so we obviously 
went and created our sidebar and really we can just write whatever we wanted here if you didn't want to write this you could write whatever you wanted we also went and added our different plots so our scatter one and our scatter two we went and split our columns and that about wraps it up guys so we can now go on ahead and show this to matt or i don't know create a screenshot and get him over to our desk if we really wanted to you can actually deploy these as well um but i won't be going through this right just yet um but you can actually deploy these apps um or deploy them to streamlit cloud as well but that in a nutshell is how to actually go about building up your tiktok analytics dashboard let's shoot it over to matt hey matt this is v1 of the dashboard it allows you to type in a specific hashtag into the search bar and actually shows you the information that you're looking for in real time now if you scroll all the way down to the bottom you've got the full tabular view as well if you want to go on ahead and take a look at that there's probably a little bit of additional work that we can do in terms of improving the stability of the data feed but other than that that's at least v1 done so hopefully this is going to help out your social media team thanks so much for that dashboard that was exactly what our social media teams are looking for that search is absolutely perfect i mean we could probably work on the stability of the search but i mean as a first cut this is absolutely brilliant thanks nick and team thanks so much for tuning in guys hopefully you enjoyed this video if you did be sure to give it a big thumbs up hit subscribe and tick that bell and let me know what you thought of this new format i figured it'd be super useful for people that aren't as experienced when it comes to working with a client as to how to take down requirements and actually translate that to code but thanks again for tuning in guys peace
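For reference, the overall flow of the app walked through in this video (wide layout, sidebar, hashtag input, and a button that runs the scraper and only then loads the data) might look roughly like the sketch below. This is not the author's exact file. The names `tiktok.py` and `tiktokdata.csv` are taken from the spoken captions and may be spelled differently on disk, `LOGO_URL` is a placeholder because the real image link is never read out, the widget labels are guesses, and `scraper_command`, `sidebar_header` and `main` are hypothetical helper names. The sketch also swaps in `sys.executable` where the video passes the literal string `python`, so the scraper runs under the same interpreter as the app.

```python
import sys
from subprocess import call

SCRAPER = "tiktok.py"            # scraper script name as spoken in the video (assumed spelling)
DATA_FILE = "tiktokdata.csv"     # CSV the scraper is said to write (assumed spelling)
LOGO_URL = "https://example.com/tiktok-logo.png"  # placeholder, not the URL used in the video


def scraper_command(hashtag):
    """Argument list for subprocess.call: interpreter, script, hashtag."""
    return [sys.executable, SCRAPER, hashtag]


def sidebar_header(title, logo_url, width=100):
    """HTML for the sidebar: logo and title, each in its own div."""
    return (
        f"<div><img src='{logo_url}' width={width} /></div>"
        f"<div><h1 style='display:inline-block'>{title}</h1></div>"
    )


def main():
    # Imported lazily so the two helpers above can be tried without these packages.
    import pandas as pd
    import streamlit as st

    st.set_page_config(layout="wide")  # must be the first Streamlit command in the script
    st.sidebar.markdown(sidebar_header("TikTok Analytics", LOGO_URL),
                        unsafe_allow_html=True)
    st.sidebar.markdown("This dashboard allows you to analyze trending "
                        "TikToks using Python and Streamlit.")
    st.sidebar.markdown(
        "To get started <ol><li>Enter the hashtag you want to analyze</li>"
        "<li>Hit Get Data</li><li>Get analyzing</li></ol>",
        unsafe_allow_html=True,
    )

    hashtag = st.text_input("Search for a hashtag here", value="")
    if st.button("Get Data"):
        # Everything below is indented under the button, so nothing is
        # shown until the user actually requests data.
        call(scraper_command(hashtag))
        df = pd.read_csv(DATA_FILE)
        st.dataframe(df)
```

To try it, save the block as `app.py`, add a call to `main()` as the last line, and start it with `streamlit run app.py`. The charts would go inside the `if st.button(...)` block, after the CSV has been read.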
Info
Channel: Nicholas Renotte
Views: 2,173
Keywords: data science projects for beginners, data science project from scratch, data science project in python, data science project walkthrough, data science project ideas, data science projects using python, data science project tutorial, streamlit tutorial, streamlit python tutorial, streamlit dashboard, streamlit app
Id: E6B3uWF-V7w
Length: 137min 11sec (8231 seconds)
Published: Sun Dec 12 2021