Automate Excel Pivot Tables Across Workbooks - Learn Python Pandas

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[Music] hey everyone welcome back to the channel today I have a quick video on how we can group data across multiple Excel spreadsheets using Python and pandas let's get started in our first excel file we have a very typical database structured excel file we have all the headings up top so we have a line number and operator the product that the operator is making the amount of that product that the operator makes how long and then the shift we have this same style of data on a second Excel spreadsheet the structure is the same and we just have different values in these fields we're using two different sheets in the same workbook for this example but feel free to use two different workbooks entirely let's say that we wanted to know how efficient each of our operators were on each of the different lines so what we need to do is to first combine these two data frames together and then group by the line and the operator let's open up a text editor and the first thing that we need to do is to import the package that we'll be using so we'll be using pandas so import pandas as PD next we need to designate the file path for each of our Excel workbooks since I'm in the same directory how to say excel file and then the name of my workbook is workbook 1 so work book 1 dot xlsx if you're using a second workbook you should put it here as well but we're just using one that has two different sheets so now we'll say data frame one will be PD I've read Excel and then we'll pass in that file path since this one will be the first sheet the default for the read excel function is the first sheet of the workbook so we don't need to put anything extra we will drop down and say DF 2 is equal to PD dot read Excel so the same function in the same file path but for this one we'll put sheet name is equal to the second sheet name and the second sheet name was second for the second shift so we'll type that in here second let's go ahead and drop down and we'll say print DF underscore one dot head five and then we'll do the same thing for the second one what this is doing is just making sure that we've right in the to excel files successfully and that there's no errors we'll open up a command prompt or a terminal and say Python 3 and then the name of my file is productivity dot pi we see that we get the first five lines of both of these sheets so there's no problem in how we read them in okay now let's drop down and now since we have multiple data sources only two in this case but we could have potentially a lot more we need to combine them together using the concoction-- will designate all the data frames that we want to combine first so we'll say frames is equal to square brackets and then DF 1 and DF 2 let's go ahead and create a new variable so I'll say all data data frame this will just be the data frame that has all the information from all the different data sources that we want we'll say P D and then use the concoction-- will use that frames variable that we just created and then we'll say that axis is equal to zero we're setting this access equal to zero here because we want both of the data frames to be under the same columns if we set it equal to one they would be side-by-side and now let's make sure that worked so we'll say print all data DF well save and execute again so we see that we've successfully combined the two data sources that we have but feel free to use as many as you like across as many workbooks as you need to so now that all of our data is together let's go ahead and get the results that we wanted we want an each operator and how effective they were on each line will go down and create a Productivity data frame and this will be equal to the all data data frame we'll use the group by function which is already built into pandas and now what we need to do is to specify the columns that we want to group by for us we wanted to group by the operator so let's go back up to the top of your data frame we'll take this and copy it and paste it in and we also want them to group by the line number so we'll take this and copy it and paste it in as well and now we need to specify the calculation that we want to group by function to do since we want to know how efficient our employees are let's go ahead and take the mean of the product number and that they make so we're taking the mean of all the other columns here but we'll pull out the amount column this one right here in the next line so let's say print productivity data frame we'll use square brackets then we'll copy this amount column and paste it here save your file and then we'll execute again and now we get the result that we wanted we're pulling out each operator and we're seeing how effective they are whenever they're running each line so now as a manager you could see very obvious points that you may want to fix so for example Mike on line one may not produce the best results compared to the other employees there's plenty of other calculations that we can use on the group by function here instead of mean we could have done minimum we could have done maximum or we could have done something like a summation this group by function is a very easy way to replace pivot tables whenever you have a large number of excel sheets and many different data sources if you have any questions about this please let me know I'll see you in the next video [Music]
Info
Channel: Derrick Sherrill
Views: 23,213
Rating: 4.9283385 out of 5
Keywords: python, python code, scripting, tutorial, automation, python automation, python excel automation, automate excel, python excel, python tutorial excel, pandas, python pandas, groupby, pandas groupby, pandas groupby tutorial, Derrick Sherrill, Automate Excel Reporting, Automate Excel pivot tables
Id: esY2L6fDHC0
Channel Id: undefined
Length: 5min 26sec (326 seconds)
Published: Sat Jul 20 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.