Using Python and YouTube API to Create Analytics on any Channel.

Video Statistics and Information

Reddit Comments

I created a video tutorial as well as a written tutorial on using the official YouTube API to pull in data and build some charts. In the video tutorial, I use MrBeast's channel to gather data on all of his videos and plot the most thumbs-downed videos. In the written one I used Good Mythical Morning, my favorite YouTube channel! Let me know what you think.

Written Tutorial

Just give me the code!!!!! GitHub

👍︎︎ 1 👤︎︎ u/Dwigt-Snooot 📅︎︎ Mar 17 2021 🗫︎ replies
Captions
Welcome, Clarity Coders. In today's video I'm going to show you how you can use Google's official API to pull YouTube data into your programs. You can use this if you want to track channels, analyze channels, whatever you want. By the end we'll have a working program that pulls in data, and I'll also create some visualizations for you, like these. Let's not waste any more time and jump right in. To get started, we're going to sign up for Google's official API. To do this, head over to this URL. If you don't have a Gmail account, you'll have to sign up for one and then log in. You'll end up on a screen that looks like this. From here you can click the drop-down up here. You probably won't have a project yet, so click New Project, name it whatever you like, and hit Create. Now we have to enable the YouTube Data API, so go down here and hit Library. You can search through all the different API libraries here; the one we're going to use is YouTube Data API version 3.
Go ahead and click that, and you should have a blue button that says Enable. Enable it, then head back to the original screen. Once it's enabled, click Credentials, then Create Credentials, then API key. Once you have an API key, it'll show it on the screen and you can push this copy button over here. We're going to use this in our app. The interface I'm going to use today is Deepnote. I do that for a couple of reasons; one, you don't have to install a lot of the libraries we're going to use. If you want to follow along exactly like this, it's free: go to deepnote.com. It runs Python just like you'd expect in a Jupyter notebook. If you want to do this locally or in an IDE, that's fine as well; do it however you like. In Deepnote, I'm going to hit New Project, and you can see we've got our new project up and running. We're going to head over to Integrations and add a new environment variable, and I'm going to put our API key in here. I'll name it api_key, then paste in the value we got from creating the API key. You can call the integration whatever you want; I'm going to call mine youtube, then push Create. Now we can connect this integration and use it in our project. What we're doing here is putting our API key inside an environment variable so it isn't sitting there in the open in the notebook. Once it's connected, you should have a button that says How to use. If you click on that, it'll show you that you have to import os, and then you can get your API key like this. I'm going to grab that code and head over to our notebook. This is our new notebook, so I'm going to minimize this over here; I'm zooming in just for you guys. We can delete this first line and paste in what we had from above, so essentially this should pull
in our API key. Let's save that in a variable; I'm going to call it api_key, and now we've got it in our program. These work just like Jupyter notebook cells, so we can do something like print hello and you'll see we get our output there. Now that we have this set up, I'm going to move our API key down to this next cell, and we can put all the libraries we're going to use in this first cell. We're not going to use much, and most of these are already installed on Deepnote if you're using it. We're also going to import pandas as pd, import seaborn as sns, and import matplotlib.pyplot as plt. These three bottom libraries are just for visualizing the data at the end, so if you're not doing that you don't have to import them. Then, to connect to the Google API, we're going to use from googleapiclient.discovery import build. This is the only library we have to install here, so I'm going to add a code block above this. If you go to Google's quick start guide, you can grab this pip install line; this is the line that installs the libraries we need from Google. I'm going to add an exclamation point in front of it, then Shift+Enter to run it. Now we can run our imports, they should all import, and you can see we're good to go. Now we're going to investigate a channel, so let's head over to YouTube and pick one. I'm going to go into incognito mode, just because I only have programming channels and things like that, not very exciting. Let's grab something really popular, maybe on the front page here: MrBeast or Dream. I'm going to pick MrBeast. What we want is his channel ID, but you can see he has a custom URL up top. So when you're on the channel's page, you can right-click and choose View Page Source. Once you've got the source code up, you can search for browse_id
and that will lead you to the channel ID. It's in browse_id, in the value field, so we're going to copy this and paste it into our script. I think we're done with this, so I'm going to close out of that. I'm going to leave this up top as well, with a note that the channel we're looking into in this case is MrBeast, and that's the channel ID. Below that, one more little piece of setup: our YouTube instance. We're going to say youtube equals build, which we imported above, then "youtube", then our version, which is "v3", and then developerKey equal to api_key, where we saved it above. We can run this cell, and now we should have everything set up. Next I'm going to create functions so you can reuse these. First, a function to get a channel's statistics. This isn't video based; it's the channel as a whole, like how many views MrBeast's channel has. I'm guessing it's more than mine. We're going to define get_channel_stats; you could call this whatever you like. Whenever I'm making a function, I try to think ahead of time about what I'm going to pass in. We can always change this later, but I'm thinking we'll need our YouTube instance and a channel ID. If you're wondering how I'm going to build out these functions, they all come from the official YouTube API documentation. If you head over to this URL and hit Reference, you'll see the basic API on the left side. If you want channel information, which is what we're doing first, click Channels. We don't want to update a channel and we don't want the overview; we want list, to list some data out. You'll see it has some common use cases, so this is pretty great documentation. If you click the little code-looking icon down here, you can even see how to access this information in specific languages, and you
can go from there. You can also execute sample requests. If I have a channel ID here and put in part to get snippet, contentDetails, and statistics, you can see what that'll look like, and this is the response we're going to get back. We're going to follow along with this kind of setup, so let's move back to our program. Inside our function we're going to make a request. You can call the variable whatever you want; I'm going to call it request. We're going to use the YouTube instance that we passed in and say channels().list, and inside that list call is where we pass our parameters. The part parameter is required, and you can see what they passed in for it. I don't know which of these we're going to find useful, so I'm just going to take everything: I'll copy that whole value and paste it in here, so we get snippet, contentDetails, and statistics. I'm going to set id equal to the channel ID that we passed in, and I think that's all we need. Don't forget the comma to separate your arguments. Now we're going to get the response back. I'm going to call this variable response and set it equal to our request from above, executed. For right now we're just going to explore this a little, so let's print out that response and see what it looks like. I'm going to Shift+Enter to run this cell and define the function, and below I'm going to call it: get_channel_stats, passing in youtube and our channel ID. I run that and you can see we get a response back. I'm going to highlight it all, go to a JSON formatter, and paste it in. Now you can see a prettier view of what we're looking at. We only got one result back, which we expected because we used a very specific channel ID, and if you scroll down you can see that
most of our information is inside the items key. What it's giving us back is a dictionary with key-value pairs: kind, etag, pageInfo, and the one we're interested in, items. Inside items there are even more results we might be interested in. There isn't really anything useful in kind, etag, or id, so it starts to become useful again at snippet, which has things like our channel ID and that sort of information, then contentDetails, and further down, statistics. It does look like all of those are useful, so the only thing I really want to do is cut this down so we're just getting items back. Inside here, the only thing I want to print out of that response is items. Let's define this again and run it again, and now we've narrowed it down to just the items, so we can get to our snippet, our contentDetails, and our statistics. Now, instead of printing the response, I'm going to return it so we can use it later. I'm going to set a variable called channel_stats equal to whatever comes back, run this again, clear the output, and now we should be good to go: all of our channel information is inside channel_stats. Next we need a way to get all the video data back from MrBeast, but we don't want to use search, because the YouTube API has a quota limit per day. Your quota is 10,000 units per day, but each search costs 100 units, so search is very expensive, and the most results you can get back from one search is 50.
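The channel-stats helper described above can be sketched roughly as follows. This is a reconstruction from the narration, not the author's exact code. It assumes `youtube` is the client object created earlier with `build("youtube", "v3", developerKey=api_key)`; the function name, the `part` value, and the decision to return only `items` follow what is said in the video.

```python
def get_channel_stats(youtube, channel_id):
    """Return the `items` list from a channels.list call for one channel.

    `youtube` is assumed to be a client built with googleapiclient's build().
    """
    request = youtube.channels().list(
        part="snippet,contentDetails,statistics",
        id=channel_id,
    )
    response = request.execute()
    # Everything of interest sits under "items"; a specific channel ID
    # is expected to produce a single entry in that list.
    return response["items"]
```

With a real client, `channel_stats = get_channel_stats(youtube, channel_id)` followed by `channel_stats[0]["statistics"]` would show the channel-level counts discussed in the video.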
So one little trick: inside our channel stats we only had one result, remember, a single channel, so we have to access index zero even though there's nothing else in the list. We want to grab out contentDetails. If we look at that, we have an uploads ID here. That's a playlist ID, so we can look at the playlist of his entire uploads, which is every video on the channel. Instead of doing a search, we can pull items from that playlist, and that will cost us a lot less quota. This is the playlist ID we're going to look up, so let's set playlist_id equal to that part of our channel stats. This will let us find his playlist a little further down the road, so we'll hold that for now. Let's check out some of the channel statistics to see what we're working with. Take channel_stats, remember we have to grab index zero again, and remember there was a snippet and a statistics; let's look at statistics. You can see he does in fact have more views than my channel, and more subscribers, and 704 videos. That's what we'll reference here: we're going to see if we can pull those 704 videos down and take a look at them. This was more or less just to see some information. We got what we needed, which was the playlist ID, and now we can move on and try to get his video data. We've got our channel stats function above, so I'm going to create another function. Here I'm going to try to get an entire video list. Eventually I want the details on each video, but right now I need the video IDs, since each video has an ID you use to get it from the API. We're going to use that playlist called uploads and try to get every single video ID, so hopefully 704 video IDs in a list that we can later pass to the API to get specific details on
each video: how many views that video had, how many likes, how many dislikes, all that kind of information. This is almost like a helper function, so I'm going to call it get_video_list. We're going to pass in youtube and our upload_id. Let's create a video_list and set it equal to an empty list for now. Then we make our request again, this time with youtube.playlistItems().list. Again, you could use search to find this, but we don't want to because that's really expensive. Inside here we define some parameters. We need to say what we want back, so part equals snippet and contentDetails this time; we're hoping this has the video ID in it. We need to pass in our playlist ID; the parameter is called playlistId and we set it equal to the upload_id that we passed in. The upload ID is a playlist ID, but it's the playlist of your entire channel, so every single video. Below that we're going to set maxResults equal to 50. This may surprise you, because you probably think we should pass in 704 or something bigger, since he has that many videos. But this value is how many results YouTube puts on a page; the default is 5 and 50 is the max. We want more results than that, right? We want 704 in this case, but the most we can get at one time is 50. The response will have a next page token that we can use to get the 50 results after that. So we'll get results 0 to 49, then 50 to 99, and we can keep moving through the pages using that page token. That's why we want to set maxResults as high as we can, and the highest we can is 50.
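The paging scheme just described can be sketched as below. It is a minimal sketch under stated assumptions, not the code from the video: it uses a `while True` loop with `break` instead of a separate boolean flag, and it only adds `pageToken` when a token exists. `youtube` is assumed to be the googleapiclient client and `upload_id` the channel's uploads playlist ID.

```python
def get_video_list(youtube, upload_id):
    """Collect every video ID in a playlist by following nextPageToken."""
    video_list = []
    page_token = None  # the first request has no token
    while True:
        params = {
            "part": "snippet,contentDetails",
            "playlistId": upload_id,
            "maxResults": 50,  # largest page size the API allows
        }
        if page_token:
            params["pageToken"] = page_token
        response = youtube.playlistItems().list(**params).execute()
        for item in response.get("items", []):
            video_id = item["contentDetails"]["videoId"]
            if video_id not in video_list:  # sanity check against duplicates
                video_list.append(video_id)
        # A missing nextPageToken means this was the last page.
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return video_list
```

Forgetting to pass the token back in would request the first page forever, so the loop exit depends entirely on `nextPageToken` eventually being absent.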
So we're going to use a variable called next_page and set it equal to True. When this is False, that means there are no more pages and we stop looking. Then I'm going to use a while loop: while next_page. As long as that variable is true, it keeps asking for more results. Inside the loop we want to execute our request from above, so response equals request.execute(). Then I'm going to grab our data out: data equals response, and again it has an items field on it. You can look at these individually to break them down; I did this ahead of time. It has the same JSON structure, and we're grabbing the items out of it. Inside this data there's a single entry for each video. For this first page it's obviously going to have 50 results, and we want to iterate over them. We won't always know that it'll be 50, though, so we use a for loop: for each video in our data, we want to grab out a video ID. That's what we're ultimately after, so we can get the details on the video later. We say video_id equals video, grab out contentDetails, and inside there is another key, videoId. I'm going to do a little sanity check here: if our video_id, what we set above, is not already in our video_list (not get_video_list, but the video_list from up here), let's append it. So video_list.append, and we append that video_id. This would work as long as we got out of the while loop, and it would give us 50 results. But we want more than that; we want all the videos on the channel, no matter how many there are. So down here we're going to say if there's a next
page token. If nextPageToken exists in response.keys(), that is, if that JSON has a key called nextPageToken, there are more results, so let's go find them. In that case next_page stays True; it's kind of redundant, and I don't think we need to add that, but anyway. Our next request will be almost the same as before, so we copy it, paste it in, highlight it, and push Tab a couple of times so the indentation lines up. You can see this is the exact same request as before, so we'd get the exact same results. If you left it like this, you'd get the same 50 videos over and over again forever, and you'd blow through your API limit. What we want to do is add another parameter in here: pageToken equals the nextPageToken from our response. Now the difference is that it's actually going to reach out and grab the next page of results, and it's going to keep iterating over those results until it's all done. If there isn't a next page token, so on this if statement here, we add an else and set next_page equal to False, and this will exit our while loop. You want to be careful with this: if you mess up the indentation or something like that, you may make a lot more calls than you intended. So we've got everything set up. At the end of our function we can return our video_list. It shouldn't have any details in it; it should just be a simple list with, in our case, hopefully 704 results or so. Here's our whole get_video_list function: we've got our while loop, and we return our video list. To get started we need to pass in our YouTube instance and our upload ID, and it returns a video list. Let's do that down here. I'm using the same variable name; you don't have to.
I'm going to call it video_list, because why not, and our function was called get_video_list. Don't forget, if you're using Jupyter notebooks (I don't use Jupyter notebooks that often), to run your cell to define your function. I'm going to Shift+Enter so we actually define that function. So get_video_list: we pass in youtube and our upload_id. Shift+Enter, cross our fingers. Where is our upload_id? Oh, we called it playlist_id here, so we can use playlist_id. You'll notice the name is different from the function parameter, and that's fine; arguments are matched by position, so we can call this whatever we want. You could make it the same if you wanted to, but you don't have to. Shift+Enter. Oh, and I messed that up: we actually need to dig a little deeper to get the playlist ID. We need to go into relatedPlaylists and then uploads. We'll run that, and that should give us a new playlist ID. You can check it if you want; I'll evaluate playlist_id here, and you can see we got a playlist ID out. I think this is the one we want, so we'll pass that instead. Notice it had just given me a bad response back saying that playlist didn't exist. Try this again. Cool, we have a video list. We don't know what's in it, but we hope it's video IDs, so we can look at video_list at index zero and see what we got. You can see we got an ID back. Now one more sanity check: let's take the length of our video_list, and you can see it's 704. So we got the entire list of all our video IDs, and we're ready to build our next function, which actually gets the details of each video. All right, so we've got our video list back; let's continue. I'm scrolling up just a little and adding another code block, and this one is going to get our video details. Now that we have our list of videos, we can actually get our video
details back from YouTube. I'm going to define a function called get_video_details and pass in our YouTube instance and our video_list. Here I'm going to build out a list called stats_list. You would think we could pass our entire list of videos to YouTube and it would hand back the details for all of them, but it's not going to do that; again there's a hard cap of 50. This time we do know how many videos we have, right? We have a list with 704 items. So we're not going to use a while loop, because we know exactly how many we have. It's still dynamic: even if you change the channel to Good Mythical Morning, which has 2,000-plus videos, it will know the length from that video list, so we're not going to hard-code it. But we are going to use a for loop. We're going to say for i in range. range takes three parameters. You have your starting parameter and your ending parameter. We don't want to type in 704 and hard-code it, because it won't work for anyone other than MrBeast, so we say len of our video_list, however long that is. And here's the tricky part: we add a third parameter, so comma and then 50. That means we jump 50 every time: it does 0 to 49, then 50 to 99, and so on. So it's going to count by 50.
So we're not going to miss any videos and we're not going to go over YouTube's limit of 50 items per request. Now we can make our request: request equals youtube.videos().list this time. We set part to the same thing as before, the full snippet, contentDetails, all that jazz from up here, so we grab everything we can about a video. Our id is going to be a portion of our list, a piece of it, but only 50 at a time. We take video_list and slice it from wherever our i index is right now to i plus 50. That may be a little out there at first, but the first time through the loop i is 0, so it slices from 0 to 50. The 50 is exclusive, so it grabs 0 through 49. It passes in 50 IDs and YouTube gives us back 50 pieces of data. Then we go to the next one: the next time it slices from 50 to 100, and it actually gives us back results 50 to 99.
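The step-by-50 loop and the slice can be sketched like this. It is an illustration under assumptions rather than the code from the video: the helper name `get_video_items` is hypothetical, it returns the raw `items` without picking out fields, and it joins each batch into a comma-separated string for the `id` parameter.

```python
def get_video_items(youtube, video_list, batch_size=50):
    """Fetch raw video resources, sending at most `batch_size` IDs per request."""
    items = []
    # range(start, stop, step): i takes the values 0, 50, 100, ...
    for i in range(0, len(video_list), batch_size):
        batch = video_list[i:i + batch_size]  # the slice end is exclusive
        request = youtube.videos().list(
            part="snippet,contentDetails,statistics",
            id=",".join(batch),  # one comma-separated string of IDs
        )
        items.extend(request.execute().get("items", []))
    return items
```

For a list of 704 IDs this makes 15 requests: fourteen full batches of 50 and a final batch of 4, because slicing past the end of a list simply returns what is left.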
So on and so forth. That's how we avoid blowing the limit we have here. Now we can get our data back: data equals request.execute(). If you want, you can look at this response and see all the details that come back for a video. Why am I showing this? I'm going to pull certain things off these videos, but maybe you want something different, so I'll show you how to find it. In the docs we go to list; I opened the code snippet again. We leave everything as it is. It has a sample video here; we don't really care which, we just want to see what the response looks like, so we execute that piece of code. You can see our response down here, and you can see what we can grab off of it. If you wanted to change this up, you can grab whatever you want. I'm going to grab the title, and I'm going to grab a bunch from statistics: the view count, the like count, the dislike count, and that sort of thing. If you want something different, look in here and you'll find all the fields; that's how I'm figuring this out. So we've got a response, and it should have 50 videos on it, at least the first time. We're going to do another for loop here: for video in data, and we grab it out of items again, since that's the only really useful part. What to grab is up to you. This is what I'm taking off the video, so just follow along with me for now, and you can add or remove whatever you want. I'm going to grab the title, and that equals video, off of snippet, at the key title. Copy this; the next one is also on snippet. The next one is published, and it's at snippet, publishedAt. Then we have a description, which is on snippet as well, at description. Then finally, I was interested in how many tags people are
using, so, copying the line from above, I did another one called tag_count. That's on snippet as well, and it's actually a list, so if we grab tags here it returns a list and I can take the length of that. This tells us how many tags the video had. I don't like how I'm spacing these; there we go. Next is our view count, and these are going to start being on statistics, so video, statistics. The next ones are all on statistics: view count, like count, dislike count, comment count. That should be good. This one is our like count, this we can make our dislike count, and this can be our comment count. Now we need to dig into each of these. I'm going to use .get here, because I'm afraid a video might not have a comment count or something if it's a weird private video or whatever. So I do .get and look for viewCount, to see if there's a key of viewCount. If there isn't one, if it doesn't show up, I just want to put a zero in there. I'm going to do that for all of these so we don't get an error. Make sure there's no space or anything. For the like count we look for likeCount. These have to match exactly, because we're looking up the keys that are in that JSON, so capitalization matters, no underscores, don't change them at all. Then dislikeCount, and this one is called commentCount. Now we should have everything we want to build our stats dictionary, and we're going to put that inside our stats_list, so we have one dictionary for each video. I'm going to say stats_dictionary equals, and create a dictionary here: title equals title, published equals published, and so on and so forth. Cool. So now we should have a dictionary, and after we
create that dictionary, we can append it to our stats_list: stats_list.append, and we append our stats_dictionary. I forgot a comma here, so make sure you remember your commas. Then, when we're done with this function, we want to return our stats_list. I also stopped putting commas here, so make sure you put commas on these as well. Cool, it actually worked. Now we've got the function defined, and hopefully we can get our stats list back out. Let's actually run that code, and hopefully we get a list with 704 items, each holding the stats of one video. Let's call this video_data and set it equal to get_video_details, passing in our YouTube instance again and our video_list. We run that, and it does not like our key of tags. Let's go up and look at that; I probably just misspelled it. Let's use .get, pass in tags, and if it doesn't find anything, return a blank list, whose length will be zero. So if it doesn't find any tags, we just say that the video had zero tags. We'll put a comma there. Oh no, we don't want a comma: this is the parenthesis that closes our get call, and we need another one to close our len call. There we go. Okay, that worked, so we've got it defined again. Let's try this again and see if we get any other errors. You can see this time it's executing and taking some time. That's a good sign; it's probably getting us some video data. Let's take a look at one of the results in our video_data. We're hoping it has 704 results, but let's look at the first: "Offering People $100,000 to Quit Their Job." That sounds very MrBeast-y, so I think we've got something here. You can see it's in the cleaned-up format we picked out, too, so you've got your view count, your likes, your dislikes, and all that jazz. Let's see if we have all of our
videos: we look at the length of our video_data, and you can see there are indeed 704 results. This video is getting a little long, but let's take a quick peek at some visualizations. We can create a DataFrame from this data pretty easily in the format it's in: df equals pd.DataFrame, and we pass in all of our video_data. That creates a DataFrame for us and makes our visualizations a little easier. Now I'm going to paste in a few lines of code and walk through them with you. First, I'm adding a new column called title_length, and I'm using this line, df["title"].str.len(), to find the length of the title, because I'm curious how many characters he uses in a title. The next four lines just change these columns to numeric; it put them in as objects, so we say that view count, like count, dislike count, and comment count are all numeric columns. Finally, I'm adding a reactions column as well. For reactions I'm counting, kind of like YouTube's own metrics, any time someone reacts to a video, like liking a video or subscribing, things like that. So we add up the like count, the dislike count, and the comment count for a video (oh, here it should be just one comment count). Then we save it to a CSV, so if we want to do more analysis down the road we don't have to re-download all of his videos. The file name isn't Good Mythical Morning anymore; this is MrBeast. Let's run this. You can see that it added those columns, and if you expand the folder you should see we have a MrBeast data CSV now. You can see we also called head(), which prints out the first five rows so you can take a look at your data. Once you have that, we can start doing some plotting
here. I'm going to copy-pasta some of these in and talk about them a little; you can get this notebook, obviously, if you'd like. This one plots the tags he used. I'm using sns, the seaborn library from above, to do a distribution plot of tag count, so this is the number of tags he used and how many times that number occurred. You can see most of the time he doesn't use any tags at all: there are 200 occurrences of him using zero tags. The next one does exactly the same thing, except with title length; I want to see how long his titles are on average. You can see most of the time it's right around 40 characters, sometimes longer, but you can see the distribution here. In this next block I'm going to create a new DataFrame called highest_views. I take df and use nlargest to get the 10 largest view counts, sorted by those. Then I truncate the titles: if they're longer than 40 characters, I cut them off there so they don't blow up our chart. And I'm assuming his biggest videos are going to be in the millions of views, so I take the view count and divide it by a million. If you're doing a smaller YouTuber like me, who only has hundreds or thousands of views per video, you obviously don't want to put it in millions, but MrBeast's top 10 videos will all be in the millions for sure. We run this, which gives us a brand-new DataFrame to work with, and then we can plot those results. We're going to look at his 10 most viewed videos. What we're doing here is sns.set, just to set the plot dimensions. Our x is our view count, so view count is horizontal, and our titles are on the y-axis. So, the most popular videos. You can see I'm setting the title and the labels, and you can play with that however you want on your charts. I'm going to run this cell, and you'll
see here I used a set limit between 20 and 32 million, but all of his videos are larger than that, so we're going to delete that limit and run it again. You'll see that his 10 largest are all between 80 and 120 million. If you want to spread this chart out a little more, we can add that limit back in and say between 80 and 120. This is going to be the same data; it's just going to show up a little differently. So you can see that's a little more dramatic of a chart, I guess. Actually, through 115. Cool, now you get the gist. I'm going to leave the rest of these up here. I want to look at one more: the most thumbs-down videos for Mr. Beast. So I'm going to copy-paste this in here. I'm doing the exact same thing except on the dislike count, and I'm creating a new DataFrame for this as well, just in case you want to do something else with it. Now let's plot these. Again, a catch to watch out for when you're doing this: I did dislike count divided by 1,000. So again, if you're doing someone like me, I don't usually have a thousand dislikes on my videos, not because people don't dislike them, just because there aren't that many people watching them. So you want a different divisor depending on the size of the YouTuber, but for Mr. Beast that should be good. We're going to make another plot, which should give his 10 most thumbs-down videos. You can see it's between 50 and 250 thousand.
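The two "top 10" DataFrames follow the same recipe (take the largest rows, shorten the titles, rescale the count), so they could be sketched with one helper. The function name `top_videos`, the column names and the sample rows below are hypothetical and not taken from the notebook.

```python
import pandas as pd


def top_videos(df, column, n=10, scale=1_000_000, max_title=40):
    """Return the n rows with the largest `column`, prepared for a bar chart."""
    # nlargest returns the rows already sorted from largest to smallest.
    top = df.nlargest(n, column).copy()
    # Cut long titles so they do not blow up the chart's axis labels.
    top["short_title"] = top["title"].str.slice(0, max_title)
    # Rescale the raw count (millions for views, thousands for dislikes).
    top["scaled"] = top[column] / scale
    return top


# Hypothetical sample data, not real channel statistics.
df = pd.DataFrame({
    "title": ["Video A", "Video B", "Video C",
              "A very long title that keeps going well past forty characters"],
    "viewCount": [120_000_000, 80_000_000, 5_000_000, 95_000_000],
    "dislikeCount": [50_000, 250_000, 1_000, 120_000],
})

highest_views = top_videos(df, "viewCount", n=2, scale=1_000_000)
most_disliked = top_videos(df, "dislikeCount", n=2, scale=1_000)
print(highest_views[["short_title", "scaled"]])
print(most_disliked[["short_title", "scaled"]])
```

For a smaller channel, a `scale` of 1 or 1,000 would be the more sensible choice, which is the catch the video points out.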
So if we wanted to, we could add in a limit there as well. Copy this; we can start it at 50 instead, and go 50 to 250, to 275 or something. Awesome. Now you can see his most thumbs-down videos. Okay, I could see this, saying Logan Paul a hundred thousand times; I could see disliking that video. So it looks like we're on to something. I hope you guys enjoyed this video; it's a little different format. If you have any questions, let me know in the comments. If you want any other videos, any other analytics stuff, let me know. If you enjoyed this video, please like and subscribe; it really helps the channel. Until next time, keep coding.
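A rough sketch of the horizontal bar chart with an adjustable axis limit, assuming seaborn and matplotlib are installed. The helper name `plot_top`, the column names and the sample values are hypothetical; only `sns.set`, the bar plot and the 50 to 275 limit come from the walkthrough.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_top(top, value_col, title, xlabel, xlim=None):
    """Draw a horizontal bar chart: counts on x, video titles on y."""
    # sns.set is used here only to set the plot dimensions.
    sns.set(rc={"figure.figsize": (10, 6)})
    fig, ax = plt.subplots()
    sns.barplot(x=value_col, y="short_title", data=top, ax=ax)
    ax.set(title=title, xlabel=xlabel, ylabel="")
    if xlim is not None:
        # Narrowing the x range spreads the bars out visually.
        ax.set_xlim(*xlim)
    return ax


# Hypothetical values in thousands of dislikes, not real statistics.
most_disliked = pd.DataFrame({
    "short_title": ["Video A", "Video B", "Video C"],
    "dislikes_k": [250.0, 120.0, 50.0],
})

ax = plot_top(most_disliked, "dislikes_k",
              "Most thumbs-down videos", "Dislikes (thousands)",
              xlim=(50, 275))
```

Starting the axis at 50 rather than 0 shows the same data but exaggerates the differences between bars, which is why the chart looks more dramatic once the limit is added.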
Info
Channel: ClarityCoders
Views: 7,044
Keywords: YouTube API Python, youtube api, python api tutorial, YouTube API Analytics
Id: 2mSwcRb3KjQ
Length: 46min 49sec (2809 seconds)
Published: Wed Mar 17 2021