Data Analysis of IPL data | Python for Data Analysis | IPL Data Web Scraping | Great Learning

Captions
Hello guys, I welcome everyone to this live session. Since this is the IPL season, as I told you in my previous live class, we'll be dealing with a lot of IPL-related stuff and see how we can use Python and data analytics to do quite interesting things with whatever data we can get our hands on. Before we start off with the session, if you could just give me a confirmation that you can see my screen and hear me, that would be really wonderful; just put up "yes" in the chat. Also, before we start the session, I'd like to inform you folks that we have launched a free learning platform called Great Learning Academy. As you see over here, this is the site, and you'll find the link to Great Learning Academy in the chat as well. Great Learning Academy provides more than 100 courses across different domains: as you see over here, there are courses related to artificial intelligence, cloud computing, digital marketing, and a lot more. Once you complete these courses you will get a course completion certificate which you can add to your resume, which will be a huge value-add for you. We also regularly conduct live sessions, and if you want to register for one you can click on any of these. As you see over here, today's session is this data analysis of IPL data; tomorrow we have another live session where I'll be taking a Q&A on Python and data science; and then on Monday we've got a live session by Fazan on JavaScript. If you are interested in these live sessions, you can go ahead and register over here. Also, for all the folks who are new to our channel, I'd like to request you to click the subscribe button and the bell icon, as it will encourage us to come up with more such live sessions on a regular basis, and
also, just another request: we see that we have 77 watching and around 21 likes, so if you could quickly make the number of likes equal to the number watching, that would be wonderful. The YouTube algorithm works in such a way that if a video has more likes, it is presented to more people and shown in their recommendation lists, and if they're interested in watching, they can go ahead and watch the session. So quickly hit the like button, and hit the subscribe button if you haven't done it yet.

Now we'll move on to the agenda for the session. There are two agendas for today. The first agenda is web scraping: I'll just show you this website, ipl t20.com, and as you see we have information about all sorts of things related to the IPL. We'll see how to scrape data from this particular website, specifically from a table. After we do that, the second agenda is to analyze some data which we already have. Those are the two agendas.

Now let's start with the first one. I'll go ahead and open up the correct page over here; let me just copy this and paste it, and let's see what we get. I see we have the team standings: this is the 2020 season, and as you see, Mumbai Indians are currently leading the table; they've played four matches, they've won two, and they've got four points over here. Now what I want to do is extract all of this data which is present in this table. Let's see how we can do that using Python. When it comes to extracting data from tables with Python, we need a framework called Beautiful Soup. Beautiful Soup is a web scraping library with the help of which you can scrape data. So how do we do that?
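Before typing it out live, here is roughly what that whole scraping flow looks like as a minimal sketch. The table's class name (`standings-table`) and the inline HTML snippet are stand-ins, not the site's actual markup, so check the real page source before pointing this at ipl t20.com; the inline snippet takes the place of the text a `requests.get` call would return, so the example runs on its own.

```python
from bs4 import BeautifulSoup

# Stand-in for the live page. Against the real site you would fetch it with:
#   import requests
#   page = requests.get("https://www.iplt20.com/points-table/men/2020")
#   html = page.text
# (URL and class name below are assumptions; verify them in the page source.)
html = """
<table class="standings-table">
  <tr><th>Team</th><th>Pld</th><th>Won</th><th>Lost</th><th>Pts</th></tr>
  <tr><td>Mumbai Indians</td><td>4</td><td>2</td><td>2</td><td>4</td></tr>
  <tr><td>Delhi Capitals</td><td>4</td><td>3</td><td>1</td><td>6</td></tr>
</table>
"""

# Parse the HTML, then locate the table by its tag and class
soup = BeautifulSoup(html, "html.parser")
league_table = soup.find("table", class_="standings-table")

# Each <tr> is one row; get_text() flattens the cell markup to plain strings
rows = []
for team in league_table.find_all("tr"):
    rows.append([cell.get_text(strip=True) for cell in team.find_all(["th", "td"])])

print(rows)
```

From here, `rows[0]` holds the headings and `rows[1:]` the values, which is exactly the shape `pandas.DataFrame(rows[1:], columns=rows[0])` expects if you later want this as a proper table.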
Before we go ahead to that, could you guys quickly tell me in the chat: what is a web page composed of? What do we actually have in a web page? I understand there's a delay of around 10 to 15 seconds. And guys, I want this session to be as interactive as possible; the purpose of a live session is to speak with you and engage with you, so that if you have any doubts or queries I can solve them, so I'll feel good if you answer the questions I'm asking. Right, so Neville says HTML, script; someone says text, images, tags, navigation; Nishant Singh says JavaScript; we have links, JSON, and so on. Simply put, as some of you have said, a website is composed of HTML, which stands for HyperText Markup Language.

Now, how do you find the source code? Say I have to extract data from this table: this data is essentially present in HTML, isn't it? You have different HTML tags, and inside those tags all of this data is present. If I want to extract this data from the HTML tags, I need to know under which tag the data is present, and that is why I have to find the source code. To find the source code you have two options; one is to hit Ctrl+Shift+C. As you see, I have just pressed Ctrl+Shift+C and we have the entire source code for this particular page. We already know the data is present in a tabular format, so we have to find the table tag; I just clicked, and you can see I have found it. If I want to extract all of the data present in this table, I have to extract it via this table tag's class; you can consider the class to be the name of this table. That is how we'll be extracting the data.

Now that I've told you how to find the source code, let's quickly look at the Beautiful Soup script. We need two libraries for this. The first is the requests library, with the help of which we send a request to this particular web page, because if you want to extract data from a web page you obviously have to send it a request; we bring it in with import requests. Going ahead, I need the Beautiful Soup framework as well to extract this HTML data; for that we need the package called bs4, and from bs4 we import BeautifulSoup. The requests module gives us a method called get, and with this get method we fetch the data from the website: all I do is give the URL of the page I want to extract data from. The command is requests.get, and whatever result we get is stored in r.

Now let me run this entire thing. Hmm, it seems the code has got lost over here, so just give me a couple of minutes, guys; let me just remove all of this and write the code out ourselves. So, we have imported all of the required libraries. After importing them, what I need to do is give the name of the website: I'll copy this link and paste it over here. Now that this is done, I'm going to parse the result and put it in soup. Let me quickly run this; it will take a couple of seconds to load properly. All right: when I give the name of the website, we get a result, and that result is stored in this object called page. If I want to check whether we actually got the result, I just print out page. Let me also print out soup, and we get the entire HTML text present in the page.

Now that the basic part is done, what we want is the table data. We already know it's present in the form of a table, so we need to find the table tag and, from the table tag, the class. Here the name of the class is this whole standings-table string, exactly as it appears in the source; let me copy it and paste it over here. Now let me run this, and after that let me print out league_table and see the result. As you see, when I print out league_table I get all of the HTML present inside the table.

I hope this isn't too confusing; let me quickly check the chat. Some of you are saying it is a bit confusing, so sure, I'll quickly give you a recap. What we are doing right now is extracting data from this website, ipl t20.com. Now, if you want to extract data from this particular website,
we already know that a website is composed of HTML tags, and to extract data from HTML tags we need a framework that helps us do that; for that purpose we have Beautiful Soup. We start off by importing all of the required libraries. For this part we don't actually need pandas, so I'll quickly remove the libraries we don't require for now. Out of these, the two important libraries for this session are BeautifulSoup and requests. With the help of requests we send a request to the page and get the data from this particular website; when we print out page, we know we have successfully fetched it. After this, since I want only the HTML data from this website, I need an HTML parser. This method, BeautifulSoup, takes two parameters: the first is the text of the response we got from the URL, and the second, "html.parser", says to parse that text as HTML. When I print out soup I get the entire content of the website, but I don't need all of it; I need only the data present in this table. Since I need only the table, I use soup.find, give the tag, table, and set the class: table is the tag, the class is the name, and that is what I pass over here, and that is the data I'm extracting. After extracting this, I also have to extract the td elements: this is the table, but inside it, as you see, I have tr, and inside those I have td, and all of the data is present in the td cells.

Before we go ahead to that, let me scroll down a bit. What I'll do is open up the source code: I'll press Ctrl+U to get the source, which will help us glance at this in a better way. Let me quickly scroll down to where the table tag is present, and this is what we have over here. Now that I have the table, what I want next is tr, so I'll go ahead and extract the data present in tr. For that, I'll loop over the rows and print out team.text. Let's wait for the result, and as you see, I get the entire tabular data over here. From this entire page, I have extracted only the tabular data. First I have the headings, as you see: Team, Played, Won, Lost, Tied, Net Run Rate, For, Against, Points, and Form. Then I have the values for each row: as you see over here, I have Mumbai Indians, MI (that's the shorthand), and then the values 4, 2, 2, 0, 0, where four is the number of matches played, two the number won, two the number lost, and zero each for tied and no result. This is how we extract the tabular data from this page. I hope this gives you a basic idea of what we are doing over here; if you have any doubts, please let me know. Again, I believe there's a 10 to 15 second delay, so I just need a confirmation from you folks whether you are able to follow this, and if not I can quickly give a recap. Siddharth is asking, should we not take a transpose of this? Yes, we'd need to restructure it: all of this is present in the form of rows, and you'd want it in separate columns. Rishit is asking
what is the purpose of scraping data from a webpage? As you see over here, this could be one application: if you want to analyze some data or some table from a website, or extract some images or some tabular data, then you can go ahead and use frameworks such as Beautiful Soup. Someone is saying, I'm in class 10, which is the best course for me? That would depend on which field you want to enter. You're in class 10 and you're watching this session, so I believe you'd want to get into computer science, or maybe data science; I'd recommend you start with a language called Python. If you don't really know Python as of now, start learning a bit of it, because Python is really versatile, and if you know Python you can go into any domain: with Python you can do web development, game development, and a lot of data science and artificial intelligence applications. I hope that helps you.

Okay, a lot of you folks are saying it is absolutely clear, great. Someone is asking, what is data scraping? What we've done just now is data scraping: we did not have this tabular data with us earlier; we looked at this website and then at its source code, and by looking at the source code we were able to extract the data from inside the table. That is what is known as data scraping, or web scraping. Some of you are asking us to share the source code; sure, we'll definitely do that. Nishanth is asking, do we need to import pandas for this exercise? No, we do not: the two main libraries required are requests and Beautiful Soup; with requests you send a request to the website to collect the data, and Beautiful Soup parses it. Creative World is asking, can we arrange this as tabular data? Sure, we can do that; since we had two agendas today, we'll cover that maybe in some other session.

So the first part of the session was to show you how to scrape tabular data, and this small piece of code is pretty much it. First you send the request and get the result within try and except blocks, because sometimes the website might not give you access to fetch the data, and that is when you want it inside try/except. After that you start the parser, and with the help of the parser you extract whichever data you want: since I want the tabular data, I use the find method, give the tag, which is table, and then its class. From this table what I want is the data present in all of the rows, so I write for team in league_table.find_all('tr'): I need all the data present in tr, which is the table row. Whatever is stored in team is still HTML, but from that HTML we only want the text, so we just print out team.text, and as you see we get the entire text data from that table. Now, as someone asked, if you want to convert this into tabular data, you can extract each part individually: the titles into one list, the names of the teams into another list, the wins into another, the losses into another, and then combine the entire thing into one single table. That is how it can be done. Farhan is asking, do you want me to learn C or Python first? From my perspective I'd recommend Python. I do understand that in most colleges you start with C, but the disconnect between college academia and what happens in the industry is large, and right now in the industry it's Python which is largely prevalent. So, after completing
your graduation, or if you have already completed your graduation and want a really good job, Python should be your preferred language rather than C. C comes into the picture if you want to get into the field of embedded systems or chip design; in that case you'd want to be good at C. If you want to get into the field of data analytics or data science, I'd recommend you folks learn Python rather than C. Ramraj is asking, is Python easy or hard to learn? I'd say it's relatively easy when you compare it with languages such as Java or C++, because it's dynamically typed and there is no boilerplate template; it's just simple lines of code. If you invest at least two to three hours every single day, you should be able to get a good grip on Python in two to three weeks. Sarfaraz Shaikh is asking, is scraping data from any site possible? Yes, absolutely, you can scrape data from any website; when I say any website, it's just that it has to be legal. Say the website asks you to enter a username and password, you don't have them, and you illegally try to access that data anyway: that is a different domain altogether; that is trying to hack something from the website. Here we are not really hacking anything: I have public access to this website and to its source code, and since I have access to the source code, I am just extracting data from inside it. For whichever website you have access to the source code, you can extract data from that particular website. Someone is asking, is the Great Learning app available in the Windows Store? The Windows Store I'm not sure about, but you can download it on your phone on Android. Ramraj is asking, which field has more scope for learning Python? I'd say data science and artificial intelligence are the trending domains

right now for which Python is really in demand; the other is obviously the SDE roles, the software development engineer roles, because Python is versatile, and if you know Python you can apply either for SDE roles or data science roles. Sarfaraz is asking, how prominently would data scraping be required for data analysts? Data scraping is required, as you see over here, whenever you want to extract raw data from somewhere. Let me put it very clearly for you folks: in a data analyst job role, what you're doing is mostly exploratory data analysis. And what exactly is exploratory data analysis? Simply put, you are exploring the data to find simple insights from raw data. Wherever you have to extract data, wherever you have to find insights from the data, that is what a data analyst does. Great, I've taken up most of your questions over here; I'll head back to the questions later on.

So we are done with the first part of our agenda today; now we'll head on to the second part. I'll close this. Also, before I start the second part of the session, quickly, guys: hit the like button if you haven't done it yet. We've got around 82 people watching, and if we can get the number of likes to 100, that would be amazing; it would encourage us to come up with more such live sessions on a regular basis. Also, if you find this session helpful, you can spread the word about Great Learning to your friends, peers, and colleagues, and let them know we'll be conducting these live sessions regularly. And if they complete any of the courses on Great Learning Academy, they'll get a certificate they can add to their resume. So quickly hit the like button, and if you haven't subscribed yet, hit the
subscribe button and click the bell icon; that will be really helpful for us.

Now, the second part: we have some pre-extracted data, and this is complete exploratory data analysis. In the first part we extracted data; now we are going to analyze some. First we import all of the required libraries. I need the pandas library, which is used for data manipulation; then the matplotlib library, which is used for data visualization; and the seaborn library, which is also used for data visualization. In matplotlib we have a submodule called pyplot, and that is what we'll be importing. What you see here, pd, is the alias for the pandas library; plt is the alias for the pyplot submodule; sns is the alias for seaborn. Let's just wait for these three libraries to be imported... So we have loaded all of the required libraries.

After doing this, I have to load the dataset. To load the dataset we have the read_csv method: I write pd.read_csv, and inside it I give the name of the dataset, which is matches.csv, and I store the result in this object called ipl. I'll quickly run this. Then I'll have a glance at the first five records of this data frame: to do that, I just write ipl.head(), and as you see over here, this gives me those records. The season column tells me in which season the match was played; the city column tells me in which city it was played; the date column tells me on which particular date it was played; then I have team1 and team2, which tell me between which two teams the match was played. Then I have the toss_winner column, which lets us know which team won the toss, and the toss_decision column: what the team decided to do after winning the toss. Then we've got result; there are three results over here, normal, tie, and no result, which we'll look at later. Then we have dl_applied: DL stands for the Duckworth-Lewis method, so if the Duckworth-Lewis method was applied in the match we have a 1, and if it wasn't we have a 0. After that we have the winner column, which tells us which team won the match: in this match between Royal Challengers Bangalore and Sunrisers Hyderabad, Sunrisers Hyderabad seem to have won.

Then we have win_by_runs and win_by_wickets, which are quite interesting; let's understand them properly. Here Royal Challengers Bangalore won the toss and decided to field, which means Sunrisers Hyderabad were batting first. If the team batting first wins the match, it wins by a certain number of runs. Just to give you an example, say Sunrisers Hyderabad scored 100 runs batting first, and Royal Challengers Bangalore were all out for 65 runs; that is why we say Sunrisers Hyderabad won by 35 runs. On the other hand, if the team batting second wins the match, we'll have a value over here in win_by_wickets: if the team batting second won with, say, three wickets remaining, we'll have a value of 3. In this match between Royal Challengers Bangalore and Sunrisers Hyderabad, Yuvraj Singh was named player of the match. After that we have the venue: this match was played at the Rajiv Gandhi International Stadium. Then we've got these three columns: umpire1, umpire2, and umpire3.
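A quick sketch of this loading-and-inspecting step. Since matches.csv isn't bundled here, a tiny hand-made frame with a subset of the columns described above stands in for pd.read_csv("matches.csv"); the rows are made up (the first one mirrors the 35-run example from the session), and only the head/shape calls are the part taken from the walkthrough.

```python
import pandas as pd

# Stand-in for: ipl = pd.read_csv("matches.csv")
# Column names follow the dataset as described; the rows are illustrative.
ipl = pd.DataFrame({
    "season": [2017, 2017, 2017],
    "city": ["Hyderabad", "Pune", "Rajkot"],
    "team1": ["Sunrisers Hyderabad", "Mumbai Indians", "Gujarat Lions"],
    "team2": ["Royal Challengers Bangalore", "Rising Pune Supergiant", "Kolkata Knight Riders"],
    "winner": ["Sunrisers Hyderabad", "Rising Pune Supergiant", "Kolkata Knight Riders"],
    "win_by_runs": [35, 0, 0],       # team batting first won by this many runs
    "win_by_wickets": [0, 7, 10],    # team batting second won with wickets in hand
})

print(ipl.head())   # first five records (here, all three rows)
print(ipl.shape)    # (rows, columns); the real matches.csv gives (636, 18)
```

Note how the two "win by" columns are mutually exclusive: a run margin implies a wicket margin of 0 and vice versa, which is exactly the pattern described above.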
So umpire one and umpire two are listed for this match (umpire two was Nigel Llong), and umpire three is empty, so either there was no third umpire or we don't have information about the third umpire over here. That is a brief bit of information about this particular data.

Now that this is done, I want to have a glance at the number of records and columns present over here, so I run ipl.shape, and this gives me (636, 18), which means there are 636 records and 18 columns in this data frame. Since each record represents a match, there are 636 matches in this data frame. Going ahead, I want to analyze the player_of_match column: I want to know which player has won the most man of the match awards. For that, I first give the name of the data frame, which is ipl; then, inside brackets, the name of the column, player_of_match; after that I use the dot operator and the value_counts method, which gives me the frequency of the different categories present in this column. As you see over here, Chris Gayle has won the most man of the match awards: in this dataset Chris Gayle has won 18, then you have Yusuf Pathan with 16, then AB de Villiers with 15, then David Warner with 15, and so on. Now, out of these, say I want to have a glance at only the top 10 players with the most man of the match awards: I give the name of the data frame, ipl, select the player_of_match column, and the command is the same except that I take only the top 10 records. And as you see over here, I have the top 10 players with the most man of the match awards, right?
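The value_counts-then-plot step looks roughly like this. It is a sketch with a small made-up player_of_match column standing in for the real one, using the same plt.bar pattern described in the session: the keys for the names on the x axis, the counts as the bar heights, a green color, and a figure size. The headless Agg backend and the savefig call are substitutions so it runs without a display; in a live notebook you'd call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for ipl["player_of_match"]
player_of_match = pd.Series(
    ["CH Gayle"] * 4 + ["YK Pathan"] * 3 + ["AB de Villiers"] * 2 + ["DA Warner"]
)

# Frequency of each category, sorted most-frequent first, then the top slice
counts = player_of_match.value_counts()
top3 = counts[0:3]

plt.figure(figsize=(8, 5))                 # set the figure dimensions
plt.bar(top3.keys(), top3, color="green")  # names on x, counts as heights
plt.ylabel("Man of the match awards")
plt.savefig("top_players.png")             # plt.show() in an interactive session
```

The slice `counts[0:3]` works because value_counts returns the categories already ordered by frequency, which is why taking the first n entries gives the top n players.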
so this is an interesting observation over here now similarly let's say instead of the top 10 players if i would want to have a glance at only the top five players with most man of the match awards so i'll have ipl player of the match value count zero to five and you would see that these are the top five players so you have chris gayle leading the chart then you have use of pathan ebd williams david warner and suresh rainer at the fifth place and now what i want to do is i would want to make a bar plot for this particular result so to make a bar plot we would have to use this so to make a bar sorry for that so as i was saying to make this bar plot we would need this matplotlib library to visualize and we have already imported this pipeline sub module with the alias plt now what we'll do is we'll be starting off by making a bar plot so i'll have plt dot bar and inside plt dot a bar i have certain parameters the first parameter is where i'll be giving the names of these players or the first parameter is basically the categorical values and here as you see this command so i'll just take this command this command gives me the names along with the um the names along with the number of man of the match awards they have one but what i'd want is only the names if i want only the names i would have to attach the keys method at the end of this and as you see i've got only the names from this and that is what i've given over here so now that i have extracted all the names of these players then the second parameter would just be the value counts so to get the value counts i will just remove this keys method so here the first parameter i'll get the names of the players the second parameter i'll get the value counts for the players the third parameter i'm just assigning a color and i'm assigning the green color to these bars and plt dot show would just print out the bar plot and we also have this plt dot figure pick size so with the help of this we are setting the figure size 
eight comma five so this is the dimension of the figure which we have just plotted over here so as you see chris gale has the most number of man of the match awards and at fifth place we have surish shriner so unfortunately um suresh rainer i don't think would be playing this ipl and uh chris gayle is on bench in most of the matches because seems like he is not at his peak form right now so this is some interesting observation again right so again let me just head back to the chat section and if uh i'll just check if there are more questions the goodgear is asking if i'm okay yes i'm fine it's just a bit of persistent cuff which i've had since a couple of weeks mahima is asking data science job role is less for b-tech and tell which company will allow for job in btech level that's an interesting question when you talk about data science job roles there are different types of job roles so there's the data analyst job role you have the data scientist job role you have the machine learning engineer job role you have the artificial intelligence job role you have the computer vision engineer job role and so on now when you talk about a fresher getting a job in the data science field you would mostly start off as a data analyst and not a data scientist or a machine learning engineer because what a data analyst does is completely different from a data scientist job role a data analyst is involved in the eda side of things now what is edm you have raw data with you you would have to convert that raw untidy data into a structured format into a tidy structured format from which you can find simple insights so right now what we are doing is expiratory data analysis so this is uh pretty much the job profile of a data analyst where you take raw data and you find simple insights from this raw data now when we head on to a data scientist job role or a machine learning engineer's job role what these two folks should be doing is they'll be implementing some data science of machine 
learning algorithms such as linear regression logistic regression decision tree random forest and so on and on the basis of that they'll be trying to predict or classify something so right now what we're doing is we're just finding simple insights but a data scientist or a machine learning engineer would try to predict who will win the tournament or which team has the highest probability of winning so when you have those sorts of problem statements those are the sort of problem statements which are solved by a machine learning engineer or a data scientist then we have the artificial intelligence engineer or the computer vision engineer these folks mostly deal with frameworks such as tensorflow keras and so on and there the data is really huge and the analysis is extremely complex so these would be the three distinguishing factors between the data analyst role the data scientist role and the artificial intelligence role so as a fresher after b tech if you would want to get into the data analytics field you can try for the data analyst roles so you have companies such as mu sigma zs associates and so on which actually do hire freshers for these sorts of roles so maybe you can research about these companies and also there's a website called kaggle so if you go ahead and participate in a lot of contests on kaggle what happens is there are a lot of companies which directly recruit from kaggle itself but then again you would have to be in the top 50 or top 100 in the entire world in these contests so if you're in the top 50 or in the top 100 in these contests then some of the companies directly recruit from the website itself so i hope that answers the question rashid is asking could you let me know the road map for data analysis and data scientist so throughout all of my live sessions i say the same thing you have three basic pillars which you have to master to get into the field of analytics the first pillar is
statistics because statistics is all about data and you're dealing with a lot of data in the analytics field so you'd have to know the fundamental concepts such as measures of central tendency measures of deviation central limit theorem normal distribution poisson distribution and so on so these are some of the fundamentals of statistics you'd have to be good at them once you cover the first pillar then comes the second pillar which is the programming language and when it comes to the analytics field there are two most important languages which are python and r so i'd recommend you guys to learn both of them because both of them are equally used and equally important and it would depend on the project which the team is currently focused on whether they would want to go with python or r then the third pillar would be ml because ml is obviously the most difficult part so go ahead and learn all of the concepts of these algorithms like linear regression logistic regression and so on and try to implement them by yourself so if you master these three pillars then you can be sure that you'll easily crack the data science or the data analytics interview someone is asking how to convert the scraped data to csv as i've told you guys we had extracted the entire data now from that entire data what we can do is we can individually segregate it we can store the names of the teams in one list then we will store the wins in another list then we'll store the losses in another list then we'll store the net run rate in another list then finally we can combine all of these lists into one single data frame and write that data frame out as a csv file so that is how you can convert that scraped data into a csv right so again let's head on to what we were doing over here so we have done this and now we have the result column over here and as i have told you guys we've got three categories we've got normal we've got no result
we've got tie let's just understand what these three are so ipl result dot value counts and let's see what we get so out of the 636 matches 626 were normal which would mean that in 626 matches one team won the match and the other team lost the match then we have seven ties which would mean both of the teams scored the same number of runs then we have three no results which would mean that the match did not happen for some reason either it must be because of rain or something else and the match was called off now i'd want to know which team has won the most number of tosses so to understand that first i'll use the name of the data frame which is ipl then i will give the name of the column which is toss winner then again we know that to get the frequency count we will use the value counts method so ipl toss winner dot value counts and here you would see that mumbai indians has won the most number of tosses so mumbai indians has won 85 tosses kolkata knight riders has won 78 tosses delhi daredevils has won 72 tosses and so on then we'll be doing something interesting i'd want to know how teams are performing after batting first or basically i want to analyze all of those teams which have won the match after batting first and i've told you that if a team has won the match after batting first you will have a value in this particular column the win by runs column so what i'd basically want is a value over here so to extract all of those records where a team has won batting first it would mean that this value should not be equal to zero because if this value is equal to zero that would mean that the team batting second has won the match so what i am doing over here is from the ipl data frame we have the win by runs column and from the entire table i'm extracting all of those records where the value in this win by runs column is not equal to 0 and that i will store in
this object called batting first now i'll have a glance at this new data frame which i have extracted from the original data frame and you would see over here that i don't have any zeros over here so i've got a new data frame which comprises all of the teams which have won batting first now i'd want to do something interesting again so i'd want to make a histogram for this win by runs column and since this is a numerical column obviously i'll be making a histogram so here you'd have to understand the difference between a histogram and a bar plot whenever you're dealing with continuous numerical values you will use a histogram but if you want to visualize categorical data then you'll make a bar plot and since this is a continuous numerical column we'll be making a histogram over here so first we'll start off again by setting the figure size so i'll have plt dot figure where figsize will be 7 comma 7 then since we are making a histogram i'll have plt dot hist and inside this i will pass in the name of the data frame after that i'll pass in the name of the column which is win by runs so basically we are passing in this column inside plt dot hist then we are setting a title for the plot we are setting the x label for the plot and we are showing the result with plt dot show over here and this is what we get so we have something interesting over here so as you see we have a distribution of 0 to 140 runs and this would tell you the number of matches so there have been around 100 odd matches where the team batting first has won by between 0 to 20 runs this over here would tell you that there have been around 18 odd matches where the team batting first has won by somewhere around 70 to 90 runs and this decreases very very steeply so there are very very few matches where the team batting first has won by more than 100 runs so this again is an interesting observation over here so
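the filter-and-histogram step described above can be sketched in pandas and matplotlib roughly as follows. note this is an illustrative sketch on a tiny made-up data frame, and the column name win_by_runs is an assumption about the real dataset used in the session:

```python
import pandas as pd
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# tiny made-up stand-in for the real matches table; in this column a value
# of 0 means the team batting second (the chasing side) won the match
ipl = pd.DataFrame({"win_by_runs": [0, 14, 0, 97, 5, 0, 33, 110]})

# keep only the matches won by the team batting first
batting_first = ipl[ipl["win_by_runs"] != 0]

# histogram of winning margins, binned in 20-run buckets
plt.figure(figsize=(7, 7))
plt.hist(batting_first["win_by_runs"], bins=range(0, 160, 20))
plt.title("Winning margins when batting first")
plt.xlabel("win by runs")
plt.show()
```

on the real 636-match dataset the same boolean-mask filter produces the heavily right-skewed distribution discussed in the session, with most wins in the 0-to-20-run bucket.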
um i guess we are done for the hour i'll just quickly head back to the chat and see if there are any more questions om is saying i love the mumbai indians team that's great to know someone is saying please provide notes unfortunately notes are something which i don't think we can provide but if you want the code file for this we can add it onto great learning academy so from there you can go ahead and download the code file great so thank you very much guys for attending the session and before signing off if you haven't yet clicked on the like button if you could just click on it that would be very helpful and also if you haven't yet subscribed to our channel or clicked on the bell icon if you could just quickly hit the subscribe button that would be wonderful as well so thank you very much for attending the session guys we will meet in the next live session which is tomorrow so tomorrow we'll have a q a session on python and data science it will be a one hour q a session where you can ask me anything related to python with respect to all of the libraries in python and anything in data science if you would want to transition into data science or what are the things you'd want to know in data science you can ask me anything with respect to both of these topics and i'd gladly help you out for one hour so thank you guys and we'll meet tomorrow at the same time
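for reference, the two other pandas steps walked through in the session, frequency counts with value_counts and turning scraped lists into a data frame and a csv file, could be sketched roughly like this. every column name, team name and file name below is made up for illustration and may differ from the real dataset:

```python
import pandas as pd

# made-up stand-in for the matches table; column names are assumptions
ipl = pd.DataFrame({
    "result":      ["normal", "normal", "tie", "normal", "no result"],
    "toss_winner": ["Mumbai Indians", "Kolkata Knight Riders",
                    "Mumbai Indians", "Delhi Daredevils", "Mumbai Indians"],
})

# frequency counts, as used for match results and toss winners
print(ipl["result"].value_counts())        # 'normal' appears most often
print(ipl["toss_winner"].value_counts())   # Mumbai Indians tops this toy data

# scraped points-table data, segregated into one list per column ...
teams  = ["Team A", "Team B", "Team C"]    # placeholder team names
wins   = [3, 2, 1]
losses = [1, 2, 3]
nrr    = [1.327, 0.494, -0.133]

# ... then combined into a single data frame and written out as csv
points = pd.DataFrame({"team": teams, "won": wins, "lost": losses, "nrr": nrr})
points.to_csv("points_table.csv", index=False)
```

value_counts returns the categories sorted by frequency, which is why the top toss winner appears first, and to_csv with index=False keeps the row index out of the saved file.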
Info
Channel: Great Learning
Views: 8,406
Keywords: Data Analysis of IPL data, Python for Data Analysis, IPL Data Web Scraping, Python Porject, Data science for beginners, python for data science, data science python, data science with python, data science projects for beginners, web scraping with python, web scraping using python, web scraping, python programming, python for beginners, learn python, python projects, Ipl, ipl 2020, ipl match analysis
Id: 8E0FdOKM2P0
Length: 61min 55sec (3715 seconds)
Published: Sat Oct 03 2020