Automate Multiple Sheet Excel Reporting - Python Automation Tutorial | Full Code Walk Through (2019)

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey everyone welcome to the channel my name is Derek and today I want to go over how we can start automating multiple sheet Excel reporting using Python we'll cover the basics in this one let's get started if you're an Excel user and you haven't used Python or any programming language before it don't worry because python is designed to be easily picked up by beginners if you haven't used Python before I'll have a link in the description below to show you how to download it the link also has a full course on all the Python basics but why would you even want to use Python when VBA is built directly into Excel I would argue a couple reasons the first is that there's a large community already creating Python scripts that you can easily use in your Excel projects the second is the ease of learning Python compared to vba python is a much easier language to learn of course you can learn VBA and do all the same functions that you could with Python but Python just makes it easier this video will cover the basics of how we can begin automating Excel reporting so we'll look at how we can import data do calculations create a graph and then output new data to do these operations we'll use the Python libraries pandas numpy and matplotlib here's the excel data that we'll be working with if you would like to download this and follow along step-by-step I'll put the link for that download in the description but say that we're working with data from a pizza plant we had different types of pizzas that we make we have the production run times how much of those units we make in those production run times and also who makes those products just first shift so we have a second tab for a second shift and then we have an entirely different worksheet for third shift so each of these sheets in these two cell workbooks have data that we need to use in our Excel report now we'll pull up a text editor and get started using Python the first thing we need to do is to import the libraries that we'll be using so it's import pandas as PD then we'll import numpy as MP and lastly we'll import matplotlib PI plot as PLT if you don't have any of these packages or dependencies you can install them using pip just open up a terminal or a command prompt depending on your operating system and type in pip 30 install and then whatever package you need to install so if we needed to install paint as we could just type pandas so we'll close out of this and go back to your script the first thing that we need to do is to denote the path to our excel files so we'll drop down and say excel file one will be equal to the name of your workbook the name of my first workbook was shift data xlsx and then we'll do the same thing with any number of workbooks that you have we'll just use two for this example but feel free to use as many as you would like and my second one was third shift data xlsx or drop-down now that we have the file path denoted we need to read this data into our Python script we can do this using the pandas library we'll create a data frame which is just a tabular data structure that pandas uses to store data it looks very similar to an Excel spreadsheet we'll say D F first shift as a variable and we'll set this equal to the pandas function read Excel will say the excel file that we want to read is Excel file 1 and then the sheet name will be equal to the name of the sheet that you want to read in our name of our sheet was first we'll do the same thing with second ship so let's say D F second shift will be equal to the same function read Excel of the same excel file so Excel file one and then sheet name is equal to second now we'll drop down and reading the third ship data so we'll put DF third shift is equal to PD that read Excel and then this one will just be Excel file to the default is to read in the first sheet so we don't need to put anything else here by doing this we have our excel data in a data structure that Python can very easily interpret we can view any of this data that we want so let's say print data frame first shift we'll save her file and open up that terminal again keep the file will say Python 3 and then the name of the file is automate Excel reporting PI well execute so we see that we have a data structure that looks very similar to an Excel spreadsheet we have each of the column headings above and then the values in those columns our data frame is a little too large to view in this terminal completely so that's why we have a few dots here but all information is in this data frame we can verify this by seeing that we have 29 rows and 6 columns if we wanted to pull the information from a specific column we could say print data frame first shift so whichever data frame contains information and then use square brackets we'll go back over to the Excel spreadsheet and let's say that we wanted to pull all the products we'll copy this name and pass it in here now when we execute the script again we should be returned all the products in our first ship data frame which is exactly what we get we get each of the products and the position that they're in in our data frame so now that we know how to view our data let's see how we can do a calculation across all of our worksheets or drop down and let's go ahead and combine all of our data together so we'll create a new data frame and we'll say this one is equal to all now we'll use a new painter's function called kin cat put parentheses because it's a function and then we'll use square brackets we're just making a list here of all the frames that we want to add to your single data frame so let's add each of these into this list we'll say D F first shift D F second shift and then D F third shift drop down and let's print data frame all now we should have a lot more rows but we should maintain the number of columns that we have the way the concoction-- works is it joins all the values where the column heading is the same so now we see that we have a lot more rows and still the same number of columns you could also use the concatenation a few different ways but I'll make a different video on that after doing this we have all the data from across three different worksheets in one single data frame so let's say that we wanted to see which shift was the most productive in creating the products will drop down and we'll say pivot so this is just a variable is equal to data frame all will use a new function called group by and we want the information by shift so we'll put square brackets and then we'll pass in shift then we wanted the average so we'll say mean then we'll drop down and say shift productivity is equal to pivot LLC so this is just a new penas function that we can use to select multiple rows of data instead of just one so we'll use this loc we'll put square brackets and then we'll put a colon because we want every row and then we'll pass in the two columns that we want we'll go over and copy this column and paste it here and then we'll copy this column as well they print shift productivity and get the return in our terminal and a Python code we see how powerful Python is at doing calculations we've taken all the data from across three Excel workbooks and done a group by function to see the average number of the production run time and the products produced so from this we can very easily see that third shift is the most productive by far now if we wanted to graph this it's just as simple as saying shift productivity dot plot and we'll say the kind of graph that you want will be equal to a bar graph then all we need to put as PLT does show so this will create a graph from this grouping that we have right here and display it for us we'll run the follow again get a new graph pulled up on our desktop that has all that information in a nicely formatted bar graph now lastly let's talk about how we can output this data at NU Excel workbook or drop-down and let's say that we wanted all the information so DF all and one single excel file will say DF all and then we'll use a new method called - Excel and then we'll pass in a new path that we want this new excel workbook so we'll say output dot xlsx I'll go ahead and comment these out and we'll run this follow again when this executes we should have a new Excel workbook called output xlsx on our desktop which is right here zooming in we see that it has all the data from every Excel workbook that we used and that's pretty much it for this one I hope this shows you the baseline on how we can begin automating our multiple sheet Excel reporting using Python once we write one workflow using Python we can recycle that script over and over for any number of Excel workbooks so all you would need to do is to write them once and then change the excel file path to whatever workbook you wanted to apply to I made content like this on my channel all the time so if you enjoyed this please consider subscribing if you have any questions or comments please let me know until next time [Music]
Info
Channel: Derrick Sherrill
Views: 540,668
Rating: 4.9270306 out of 5
Keywords: Python, python script, scripting, atom, tutorial, coding, automation, python automation, python script automation, python scripts, python code along, code tutorial, python code tutorial, Automate Excel, Automate Python excel, Derrick Sherrill, Excel, automate, pandas, numpy, matplotlib
Id: 1Kcco6koC34
Channel Id: undefined
Length: 9min 53sec (593 seconds)
Published: Wed Aug 28 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.