Example Spreadsheet Automation - Python Automation for Remote Workers Series

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey everyone derek here in this one let's continue on our automate remote work series and take a look at how we can automate spreadsheet tasks let's get started starting off let's take a look at the data that we'll be using in this tutorial what we have here is an excel workbook that has the amount purchased of each of these given materials but what we don't have here is the pricing so we have two workbooks that we need information from so let's say that our task is to match our items on the id field and then multiply the price times the total purchased amount and produce a data visualization that looks like this if you want to follow along with this example completely these excel workbooks will be linked in the description below but with that let's get started we'll open up a text editor and then we'll import the libraries that we need to use anytime that i'm working with any information that looks like a table i'll use pandas to work with that so we'll import pandas as pd then we need to choose a data visualization tool of our choice to produce that pie chart in this example we'll be using plotly the import for that is plotly dot express spx plotly is just one data visualization tool that we can use and there's others such as maplib and cborn which could do the same thing the way that pandas works is we're able to read in information from many different sources into something called a data frame a data frame is just the way that we can view this information easily within python to read the information in that we need we'll create a variable so we'll say excel book 1 relative path and like the variable says this will just be the relative path to the information inside the first excel workbook since my script which is right here is in the same directory as my excel workbooks i can just pass in the excel workbook name for me that's purchases home a dot xlsx we'll do the second one so excel book and then this time let's say prices will be equal to price book.xlsx so right now we're just telling our python script where these excel workbooks are now we need to create the data frame so we can see the information inside of them so we'll say something like df data frame prices will be equal to the panes library and then we'll use a method called read excel what this does is take in the information from the source that we specify we have to use read excel because we're working with excel workbooks but if you were working with something like csv you could just say read csv we'll keep this as excel for right now and then we'll pass in this variable so excel book prices we'll do the same thing with df home one and this will be read excel and this time excel book 1 relative path now that we have these i usually like to do a little step to make sure that we're reading these in successfully so i'll just print df prices and df home one to my terminal to make sure that we've read these in successfully we'll open up a terminal or a command prompt depending on your operating system and we'll execute our script for me i'll do that using python 3 and then the script name once that executes we see that we get no errors and we see our information displayed in the terminal it doesn't look too nice right now but we're just working with it here so that doesn't matter too much i'll comment out that line and go down a few one reason why pandas is so powerful is because most of the time there's already methods built into the library that we can use to simplify our workflow in this example we're trying to do something similar to what vlookup does in excel what we can do is merge on that condition where the ids are equal we'll join that information and create a new data frame to do that we'll say df total so this will be a new data frame will be equal to df home one so we're just saying this one that has our information but not the prices we'll use that method so period merge so this is joining two data frames together our first argument within this method is the data frame that we want to join so for us that's df prices and now we can specify a keyword argument of on for us we want to join these two data frames on a column called the id if you remember on these excel workbooks the id column is going to match between these two workbooks here we have an id with new information and then we have id with the material and the purchase amount so we need to match the id numbers for each of these and then take the purchased amount and multiply it by the price at this point what we have is a data frame that is combining the information of these two data frames on the id so what we need to do is say something like df total and create a new column for us we can create a new column using square brackets and we'll say something like total price this new column of total price will just be equal to the data frame total this column of purchased amount and all caps so purchased amount and then we need to multiply that column by the price so we have df total multiplied by df total price if you're brand new to pandas all we're doing here is saying df total and then this column which if it did exist would be overwritten but since it doesn't it'll be created of total price is equal to the df total column of purchased amount multiplied by d of total column of price these square brackets are just indexing out one series or one column of our entire table so now if we were to go down and say print df total and we see that we now have a larger data frame that has all information between the two sheets and a new column of total price so it works now we need to do is to create a data visualization that looks like this where we can hover over it and see the total price the material and the percentage of our cost in our project to do that we'll say something like a variable name we'll just say figure is equal to this library that we imported up here px and the way that this library works is they have a lot of different chart types that we can just use as a method at the end so we're saying px.pi to access the pie chart method the first argument of this method will be the information that we want to use to populate that pie chart for us that information is stored in this df total data frame but we don't need all of it so we'll use those square brackets to index it but this time we need multiple columns so we'll pass in a python list value into this index so that's why there's two square brackets here so the values that we want are the material and a total price we're picking these two columns from all of these because we want to know the total price for each material if you wanted different information you could just use that in place of these here next plotly needs values and these will be the numbers that you use on the y-axis of a chart which will just be total price and then we need a names and this will just be equal to material so in this one line we're creating a pie chart from this information that we're creating from these two excel workbook sources now let's say figure dot show and this will just display the chart whenever it executes we'll save or file and let's execute once that executes we see that we have that new pie chart created this is cool but why is this automating spreadsheet work what you could do is create a function here that automatically did it for you so i have this second excel workbook on my desktop of purchases home b if we wanted to do the same thing with a completely different workbook we could what we'll do is we'll change this to home b and then we'll save the file and we'll execute it again once that runs we see that we've completely automated the process if we have spreadsheets that look similar so here is homes a and this is homes b and all we had to do was change one letter and that's it for this one i have plenty more content on my channel about how we can automate spreadsheet tasks using python so if you found this one enjoyable please check out some of those videos that i'll link in the description below if you have any questions or comments as always please let me know and i'll get back to you until next time [Music] bye
Info
Channel: Derrick Sherrill
Views: 25,334
Rating: undefined out of 5
Keywords: Python, Python Automation, Tutorial, How To, Derrick Sherrill, pandas, python pandas, excel, excel automation, python programming, programming, data visualization, plotly, python plotly, pie chart
Id: qHJLTvmPM44
Channel Id: undefined
Length: 10min 1sec (601 seconds)
Published: Wed Sep 09 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.