Python Pandas Tutorial 10. Pivot table

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
here friends today we are going to talk about pivot and pivot table in pandas let's start with pivot pivot allows you to reshape or transform your data frame what I mean by that is let's say you have input data frame like this where you have weather data and you have like date city temperature and humidity columns for your data analysis purpose you might want to transform this into this format where your x-axis is date and your y-axis is city or you want to transform in this format your where your x-axis is your city and as a column you want to have dates all of this can be done using pivot I have Jupiter not booked here so I launched it used by typing Jupiter not booked come on on my command prompt and I created this new notebook and I have imported this weather dot CSV file here so here is how my data frame looks now to transform it I will call D F dot pivot function okay and pivot function takes couple of argument the first argument is what you want on your x-axis or row ok so at a row level I want let's say date so in that case I will say index equal to date so date here is the name of the column and the second argument is columns columns I want city to be in in my columns ok and when you run this you get this so now I have dates as my rows and my columns are these and then I have multi-level columns for showing temperature and humidity let's say I want only humidity to be present in my output data frame so for that you can supply 3rd argument which is values so values means what you want your value to be ok so when you run this you get only humidity you can slice and dice in whatever way you want let's say I want humidity to be my rose and my columns are less City that is also possible that was all about pivot now let's talk about pivot table pivot table allows you to summarize and aggregate your tabular data now I'm going to use a different spreadsheet for this for demoing pivot table so I will first call read CSV to read my CSV files so I have better to dot CSV file which I'm going to read it into my data frame and this is how my data frame looks so as you can notice here for same day I have two numbers so for New York on first off me I have two temperature and humidity numbers this is less the temperature in the morning and in the evening okay so same way for Mumbai I have morning and evening temperature and now I want to create a data frame which contains an average temperature throughout the day for all these dates okay so this is something that I can do using pivot tables so you will say D F dot pivot table and in the argument you have index as your city and your columns is let's say your date so here now I can see that for Mumbai on date five one my average temperature work was 81.5 so you can see that this is Mumbai so Mumbai is here and temperature was 80 and eighty three so the average of these two numbers is 81.5 okay so you can see that this is pretty useful now I have aggregated my input data frame to produce this meaningful output which can be useful in my data analysis process you can also define your aggregate function by providing AG func argument and let's say I mean I I know it doesn't make sense but I can maybe say some as my aggregate function and it's gonna sum them up so for again for Mumbai its 163 because the sum of 80 and 83 is 163 so you can see that it is doing aggregation on this input data frame you can find a list of all the functions on Google so you can just say so this is using num py functions by the way so if you look at the number Y function so we just tried some here you can try it even like def or count there are various things you can try so let me try count here so count is gonna show you the counts of I have two numbers for each of these dates so that's what it is showing me if def works but shows weird result mean is default so that's what we get when you don't supply egg func argument and you can confirm this by going to Google and just say ponders pivot table so it will open pivot table documentation and you can see that the default aggregate function argument is set to be mean okay and you can read all other arguments documentation on other arguments here so if I check one other argument I have margins is equal to false so let's say let's see what this does okay so I'm going to supply margins equal to true so margins equal to true okay i misspelled this so it should be this now margins will provide margins which will have aggregate statistics so you can see that it added all column here and it added all row so what this is doing is this is showing you the aggregate like the average of these two numbers here and the average of these two numbers here so again based on what kind of problem you are solving this might be useful last thing we are going to talk about is grouper in pivot table so for demoing that also I have one more data of example so I'm going to read that CSV file here the file is called waiter three dots CSV and when you print it it looks something like this so as you can see I have data only for New York City but the data is now coming from different dates so till now I had dates from month of me which is five here but I have some dates in December also you can see the in December New York is pretty cold compared to summer okay now you can apply grouper function to aggregate based on date frequency what I mean by that is let's say you want to know average temperature in a month of me and average temperature in a month of December okay this is something that you can do using pivot table grouper so what you need to do for this purpose is you will say D D F dot pivot table okay and index is equal to P D dot grouper now if you look at the documentation for pandas grouper then grouper has couple of arguments so first argument is frequency okay and if you read the documentation for frequency it has all these possible values okay so it has month as a possible value so we are going to use month because we want to know monthly average temperature okay so I'm going to say frequency is month and my key is your date so date is the column on which I am applying this operation grouping operation okay and my columns is equal to city because I want to city I want to show city as my columns okay when you run this you're getting some error where it says only valid with date time index okay so the problem here is that when you read your CSV file although it looks like a date it is not really dead it is a string object so you need to convert that to date and the way you do it is you will simply say DF date equal to PD dot to date time DF date okay and when you print your data frame after that it it the column is now date so if you want to just conform you can conform it by this and you can say it's a series but inside series the individual element is a timestamp so this is now a date column and now let me execute this one more time so when I run this now what I'm getting is the average temperature in month of May and month of December so you can see it is showing me the end date of May and end date almost December and the temperature that you are seeing here are average temperatures so if you average 65 61 and 70 then you are going to get this much 65 okay and same for humidity alright cool so that's all we had for pivot and pivot table I hope you are having fun time learning pandas and Python you can see that pandas and Python makes data analysis process very so even if you don't have any programming background you can get started very easily by following my tutorials and I'm pretty sure within few days you will have a good handle on how to use these awesome tools for solving your data analytics problems I have provided a link to my Jupiter knot book in the video description below thanks for watching goodbye
Info
Channel: codebasics
Views: 201,868
Rating: undefined out of 5
Keywords: pandas pivot, pandas pivot table, pandas pivot table example, pandas pivot table tutorial, pandas pivot vs pivot_table, pandas pivot example, pandas pivot_table, pandas pivot_table examples, dataframe pivot table, pivot table pandas, pivot table python, python pivot table, pivot table in pandas, pivot python, pandas pivot table margin, pandas pivot table total, pandas pivot table aggfunc, creating pivot table in python, how to create pivot tables in python
Id: xPPs59pn6qU
Channel Id: undefined
Length: 11min 26sec (686 seconds)
Published: Sat Jun 03 2017
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.