Data visualization with python | Create and customize plots using Matplotlib, seaborn and pandas

Captions
Greetings, folks. In this tutorial we're going to see some plotting strategies using the seaborn and matplotlib libraries. At first we are just importing our necessary libraries, and notice that here we are using %matplotlib notebook instead of %matplotlib inline to have some interactive plotting.

Now we are going to read our data. You don't have to worry about getting the data: it is a built-in data set, and I guess you just need an internet connection the first time you load it. As you can see, this is the total bill and tip data set. Depending upon sex, we are going to see how the tip or the total bill varies, and likewise for smoker (yes or no), day, and time (dinner time or lunch time).

We are going to start off with a scatter plot. To make a scatter plot you just call plt.scatter, where plt comes from matplotlib. You can find out what parameters this method takes by holding Shift and pressing Tab. You can see that it takes x, y, and many other parameters. Let's start with x as the tip amount and y as the total bill, and let's see how it turns out. [Music] You also have to pass the data set name, so data is df.

Now you can see our first plot, and if you want to change the plot structure you can do it. You can also see the values for a point in the bottom right corner, and you can zoom in. Now we are going to do some other stuff. You can change the color from the default with c: let's say you want it in green, and now it's green. You can change the size of the data points with s equal to, let's say, 15.
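The scatter plot call described so far can be sketched as below. This is a minimal sketch, not the video's notebook: the small data frame is a hypothetical stand-in for the tips data set, and the Agg backend line is only there so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical miniature of the tips data set (the video loads the real one)
df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
})

# x and y are column names looked up in `data`; c sets the color, s the marker size
points = plt.scatter(x="tip", y="total_bill", data=df, c="green", s=15)
```

Passing column names together with data=df keeps the call short; passing df["tip"] and df["total_bill"] directly would work as well.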
With s at 15 the points are a bit small, and with a larger value they are a bit large. Now what? You can make another scatter plot within this plot, so Ctrl+C, Ctrl+V. Let's say our y is now size and everything else stays the same. We want the color as red, we don't need the marker size since the default is okay, and we want our marker as a plus sign. Now you can see size and total bill, both plotted against our tip amount. You can see that higher tip amounts are not that frequent and lower tip amounts are frequent, and when the total bill is high the tip seems to be high compared to when the total bill is low.

Now, how can I know that the green color means total bill and the red color means size? To do that you can write label equal to total bill here, and in this case it's just size. No legend appeared, so you have to call plt.legend (legend, not label). Now you can see the plus signs are size and those green data points are total bill. You can also give your legend a title: title equal to, let's say, just legend. You can also pass those labels to the legend itself, so we are just removing the labels here. To do that you have to pass a list of names: our first legend entry was total bill and our second was size. Now see... oh, come on. Now you can see the same thing.

If you want to write an x label and a y label, use plt.xlabel for our x-axis label, which is tip, and plt.ylabel for our y label, which is bill and size. Now you can see the x-axis is tip and the y-axis is bill and size. You can change the font size of the x label and y label with fontsize equal to, let's say, 14, and now you can see it's a bit bigger.

You can create the same plot with sns. For creating a scatter plot with sns you use lmplot, with our x as tip, and paste the rest here. Now you can see total bill against tip, and by default there is this regression line, and you can undo
it with a parameter. What was it? It's fit_reg: by default fit_reg is True, so you just have to set it to False. Now you can see there is no line here. That's all about our scatter plot.

Now we are going to see the histogram, or distribution plot. Let's take the tip column: df.tip.plot. This is actually the pandas plot method, and it has a bunch of kinds. The default kind is a line plot; we want a histogram, so kind is hist. Because we are making a new plot, we need to write plt.figure before it. Now you can see our frequency plot, our distribution plot. A tip of two to maybe three dollars is most frequent, with a frequency of about 80, and the distribution is a bit right skewed, which in this case is neither good nor bad. So this is our histogram.

We can make the same thing using sns: sns.distplot, and here our data is df.tip. Oh, plt.figure first. Fine. We could turn this kernel density curve off here, but it's actually not so bad: it tells us the distribution is positively skewed, and we have one peak, so it is a unimodal distribution. If we set kde to False, it is more or less like our previous one.

Now let's say you want to plot the count by sex: how many of them are male and how many are female. You can do that with df.sex.value_counts. Let's see. This value_counts gives us the counts: male is 157 and female is 87. You can also plot it with .plot, kind equal to bar. Oh, plt.figure again; I'm used to %matplotlib inline, that's why. Now you can see the plot of the male and female counts. You can do the same thing using sns, and you don't have to use the value_counts method: just sns.countplot with our data, the sex column of the tips data. Again, plt.figure. Now you can see that our male customers are nearly double the female customers. Okay, so let's do something new.
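The histogram and count plots above can be sketched as follows. This is a rough sketch under stated assumptions: the data frame is a hypothetical six-row stand-in for the tips data, so the counts differ from the 157 and 87 quoted in the video, and the Agg backend line is only needed outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical miniature of the tips data set, built inline
df = pd.DataFrame({
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
    "sex": ["Female", "Male", "Male", "Male", "Female", "Male"],
})

# pandas histogram of the tip column; plt.figure() starts a fresh figure
plt.figure()
hist_ax = df["tip"].plot(kind="hist")

# Count each category with value_counts, then draw the counts as bars
counts = df["sex"].value_counts()
plt.figure()
bar_ax = counts.plot(kind="bar")

# seaborn does the counting itself, so value_counts is not needed here
plt.figure()
count_ax = sns.countplot(x="sex", data=df)
```

The sketch leaves out sns.distplot because recent seaborn releases deprecate it; sns.histplot with kde=True or kde=False is the closest current equivalent.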
Let's say I want to group our data by time and day: I want to see the mean tip depending upon the time, dinner or lunch, and the day. So df.groupby with a list of names, time and day, and we want our output on the tip column, then .mean. You can see the means. Before plotting, plt.figure, then .mean().plot with kind equal to bar. Oh, I need to rotate those x tick labels, so plt.xticks with rotation. Now you can see that the tip amount was highest on Sunday, at dinner time.

These are the mean values; we can also see the standard deviation. To do that you use aggregate, with mean and standard deviation. Now plot it, with a rotation of 10 degrees. You can see the data with the standard deviation: on Saturday the standard deviation was comparatively high. You can change this plot a bit: if you pass the parameter stacked equal to True, the bars are stacked. And if you change the kind from bar to barh, it is plotted horizontally.

Now we are going to see how to make a box plot. We are going to use seaborn, so sns.boxplot and our column name. This is our box plot. Seaborn has its own rule for detecting outliers, and these points are outliers according to the sns box plot. The middle line represents the median of the data: 50 percent of our data lies below this line and 50 percent lies above it. The whiskers extend to the lowest and highest values that are not flagged as outliers. This edge of the box is the 25th percentile and this one is the 75th percentile, I guess.

Let's see one fascinating plotting system from sns; so far I think it's the most beautiful sns plotting system. We are going to plot our previous scatter plot with sns. See that first: this is just
a scatter plot of tip by total bill. Now you can assign another parameter, hue, and we are going to pass sex. Let's see how it turns out. Now you can see that male and female are colored differently. But that's not the end; you can pass more things, like col. For col, let's use time, dinner or lunch, and now you can see time as well. You can pass another thing, row. What else do we have? Day, so row is day. Now we have different columns for our time, dinner and lunch, and rows for the day, Friday or Saturday and so on. It seems like on Saturday at dinner time the tip and bill were high, and for Saturday lunch time... [Music] Missing values? I think not; we just don't have data for that combination, I think.

Now we are going to see the most interesting part of our plotting: subplots. To make a subplot we are going to use plt.subplot. Here I want one row and two columns, and the last number, 1, means our current axis is the first one. Now you can see one plot area in the figure, but if you make another subplot with 1, 2, 2, that becomes our current axis and you'll see two subplots. Now you can plot anything you want. Previously we used plt.scatter; now I'm just going to use sns.distplot with df.tip, and here sns.distplot with df.total_bill. Now you can see this is our tip and this is our total bill, one beside the other. You can also put one below the other. Here 1 means this is our current axis for this plot: whatever you plot below this subplot call is drawn in this subplot, and after making another subplot, whatever you plot below that call is drawn in that subplot.

Now we are going to see another way of making
subplots, which is what you normally see on web pages: fig, ax equal to plt.subplots, something like that. Here I want one row but two columns, so now we have two axes, ax1 and ax2, and we put them within a bracket. It's the same thing we did above. Now if we are going to plot the same thing, so Ctrl+V, you just have to pass one more parameter here, ax equal to ax1. Now there is just one plot here and the other is empty. If you write ax equal to ax2 here, the current axis is ax2, so it goes in this one. You can plot both of them, tip and total bill. So that's how we make these subplots.

Here setting the x label and y label is a bit different. In the format above, using matplotlib, we used plt.xlabel, but now we have to use ax1 for axis 1: ax1.set_xlabel. What do you want to write? Our ax1 is total bill and the x label is already the same, so we're just going to write the y label, total bill, and similarly for the second plot you have to write ax2 here. Now you can see total bill; I just haven't changed the text. ax2 is just a distribution plot, so you can write whatever you want, and now it's changed. If you want to write a title, it's just ax1.set_title. That's it.

Now we are going to see something more, using another data set, and this one is time series data. You can see date, rainfall, depth to water, which is our groundwater depth, and temperature, so temperature variation over time, groundwater variation over time, and that's about it. We have converted our date column to the pandas datetime format, just parsed it. Now we want to plot all the columns against date, to see how they change by date. We are going to make seven subplots because we have a total of seven columns besides the date column,
and from now on we are going to use %matplotlib inline. Here the number of rows is seven, the number of columns is just one, and then the figure size. You can also increase the figure resolution with dpi; I don't remember the default, but you can set it to 300 or something. We'll see that later. So we're making seven plots.

Now we are going to write a for loop: for i, column in enumerate(df.columns). If you just print i and column, you'll see that i is the column number, 0, 1, 2, 3, and column is the column name. Let's see that. Now you can see the number beside each column name, and we have seven subplots. We don't want to print it, so just comment it out.

Now sns.lineplot, that's a line plot. We don't want to plot the date itself; we want date as our x-axis, so we need to drop it from the columns: df.drop date, then .columns. In sns.lineplot, x is the date column of df, and y is df of the column, just those column names one by one. Now we have to pass the axis, ax equal to ax of i. I think everything is fine. You can also change the color from the default. Now let's run it and see how it turns out. Yes, it worked. Our x-axis is date and the y-axes are temperature, depth to water and so on. You can see that depth to water is somewhat cyclic: high, then low, again high, again low, again high, but this time not as low. And temperature over time goes from low to high, cyclically. If you would like to write a title or change the labels, you can do that with ax of i dot set_xlabel and so on.

Now, this data is from maybe 2006 to 2020. Say, for example, you want just your data from 2018 to 2020.
To limit the time range like that, you have to call ax of i, inside the loop so it applies to all columns, dot set_xlim; we are just giving them a time limit. For that you have to import date from the datetime module. Here, in a square bracket, we write date, then in parentheses the start you want, 2018, month 1, day 1, then a comma and again a date in 2020, second month. Let's see how it turns out. [Music] It's taking some time. Now you can see it's just from 2018 to 2020: all of the plots are within that time range.

You can also make a box plot. To do a box plot you have to write .dt.year here, because the box plot is going to show our data grouped by year. So lineplot here just becomes boxplot, and everything else is the same. If your date column is the index, then you don't have to write that .dt accessor. It's somewhat like accessing strings: for that we use the .str accessor, and here we use the .dt accessor. Now let's see it. Oh, one small fix. Now you can see that the yearly trend is, I think, a bit more clear. To see the trend you can also use resampling, for example, or a rolling mean. The temperature is more or less in a steady state; in 2020 it was a bit low for those regions.

Now we are going to see that resample technique, and we are going to use %matplotlib notebook again. Here we select just two columns from our data frame, date and depth to groundwater, and resample by week. The on parameter is the column you are resampling on, because the date is a regular column here and not the index, so it is just this one. Then mean and plot. Now you can see the plot. If you do it on a daily basis, it's a bit more noisy, and you can zoom in because it's an interactive plotting system. This gap is actually a missing value, and you can see the values for those areas: here y is minus 29, which means the groundwater depth
is 29 meters below the surface, and the time is mid April. If you make it yearly it will be smoother. See, smooth. If we had converted our date column to our index, we wouldn't have to do things like on equal to date or write df.date; it would be much simpler. And as we did before, you can set the x limit. We are not using those axes objects here, so it's plt.xlim, for 2012 to 2020.

Now we are going to see our last plotting strategy. Here again we are loading our previous tips data set, and we want to plot x as our tip and y as our total bill, as I did before, but a bit differently. Here we are using %matplotlib inline, not notebook, because when you are making a big plot %matplotlib notebook is a bit problematic. Now you can see this trend line, and not just the trend line: we also have the p value and the correlation coefficient in our plot, along with the data points. How do we do that? You have to import scipy.stats, and scipy.stats has the linregress function. You just pass linregress our two arrays, x and y, and it returns the slope, intercept, Pearson correlation coefficient, p value, and the standard error.

Then plt.subplots with figsize 10 by 6 and dpi 150. If you leave it at the default, the graph is not very clear, but if you increase the resolution to dpi 150 it looks better than the previous one. Our first plot is plot x, y, just how we plot a scatter plot; the line style is not important here, then the marker and the label, data points. Our second plot is for this line. x is x, and for this line y is mx plus c. This is our x and this is our slope m, which we unpacked from the scipy result, and this is our intercept c. The label is for this line, and in it we are just printing our p value and r value.
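The trend line built from scipy.stats.linregress can be sketched as below. This is a minimal sketch: the x and y arrays are hypothetical values chosen for illustration, not the tips data, and the label format is an assumption about how the p value and r value might be printed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Hypothetical data standing in for tip (x) and total bill (y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# linregress returns slope, intercept, r value, p value and standard error
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

fig, ax = plt.subplots(figsize=(10, 6), dpi=150)
# First plot: the data points only (no connecting line)
ax.plot(x, y, linestyle="", marker="o", label="data points")
# Second plot: the fitted line y = m*x + c, with p and r in the legend label
ax.plot(x, slope * x + intercept,
        label=f"fit: p = {p_value:.3g}, r = {r_value:.3f}")
ax.legend()
```

Putting the statistics into the line's label means they show up in the legend without any extra text placement.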
After that comes the legend, and everything is fine. If you want to write some text on this plot, you can do it. Let's say you want to write something in this zone, where x is 8 and y is 15. To do that you use ax.text, and here x is 8 and y is 15, the position where you want to write something. Let's say you want to write station 3 or whatever you want, maybe xyz, with fontsize 12. Now you can see xyz here. You can do many more things with that: you can draw other lines and much more.

That's all for this plotting tutorial, but there is much more; you can even put two or three plots in one single plot. I hope you found this video helpful and learned something new. If you did, please subscribe. I hope to see you in my next video. Thank you.
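The ax.text call mentioned near the end of the captions can be sketched as below. The plotted line is a hypothetical placeholder so the annotation has something to sit on; only the position (8, 15), the text xyz and the font size 12 come from the captions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Placeholder line so that the point (8, 15) falls inside the axes
ax.plot([0, 10], [0, 30])

# Write "xyz" at data coordinates x=8, y=15 with font size 12
note = ax.text(8, 15, "xyz", fontsize=12)
```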
Info
Channel: Data Analytics.m
Views: 7,251
Keywords: plotting with python, plot in python, matplotlib plotting, Matplotlib plotting, data visualization in python, visualization with python, visualization with matplotlib, data analysis, data visualization with python
Id: NF4WX-cDsTA
Length: 43min 52sec (2632 seconds)
Published: Thu Feb 18 2021