Are You Still Using Excel? AUTOMATE it with PYTHON

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
so if you do any kind of sales reports or any reporting in excel that involves doing the same thing over and over again this video is really going to help you out so what we're going to do is we're going to use pandas to actually take a few sales files combine them and give us some results at the end so i'm going to import pandas at the top of my file as pd and we're also going to use numpy so we'll do import numpy as mp these are standards i definitely recommend sticking to them we're also going to use glob because glob is going to let us get all of the files within our directory so inside my project file i have three separate sales files i just open up you'll see we have order id date email and we have country item and quantity now all of these files are the same i'm going to assume that if you're working with a real work computer system your exports will always be the same so this should apply what we want to do first is actually open up all of the files using glob so files is equal to glob glob and we're sticking within our main folder here so we don't need to give a path i'm just going to do star.csv so we only get the csv files so if i print out files now we should get our files here there you go you can see our house now we have a nice list of all the csv files that are in this folder so what we're going to do is we're going to create a data frame for each file that's going to be in our directory and then we're going to concatenate them all together at the end you don't want to append a data frame to a data frame to a data frame it will cause you memory issues so what i'm going to say is we're going to have our df list to start with which we're going to append all of our data frames to so we're going to do 4 f in files and i'll just call it our csv is equal to pd.read csv you can also do read excel if you need to if you're using excel files and we're going to give it rf then we want to do our df list dot append the csv file once we've done that we'll have a list of data frames and we can then say ah let's call it sales is equal to pd.concat so we can concatenate them all together from our df list like this so now if i go ahead and print out our sales data frame we should get some information back here there we go you can see it 3 000 rows that's because each of my files has a thousand lines in now if you've ever had to open up large excel files copy a load of stuff out to try and make one big one you'll know how difficult that is and this has just done exactly the same thing instantly brilliant so what we're going to do now is we're going to talk about column type so i'm going to do dot d types so this is important because what we want to know is the actual data type of each of the information that's in the columns of our data frame now you can see that it's correctly identified our order id is a number and also the quantity however and although we're not going to use it so much in this example the date is an object and not a date time so what we're going to do is i'll leave this d types print in here but we're going to go ahead and say that we want to change that to a date time object now to do this you call the name of the data frame which is sales and then reference the column name so this is important this is the column name in michael it's called date you just need to make sure that whatever you put in here is the actual name of the column say that is equal to and i'm going to do pd.2 date time so we're using pandas.two datetime and we're going to give it the exact same thing that we're just talking about the date so what we're saying here is that we're going to we're going to use pd.datetime on this column to change it to a datetime and then we're going to make our version of it equal to what comes out so now if i copy this and run it you'll see the second time around we have the date time object there as opposed to just this lane object here so we've successfully converted our column uh into the actual format that we want it to be the data type let's move that up there so now we want to work out what we actually want to do with this data i'm going to suggest that probably one of the most common things is to create a pivot table to show and summarize the data so you can pass that on to whoever you need to so to do that it's nice and simple i'm going to say let's have a new data frame we're going to call it total sales if i can type thank you and we want to use pd.pivot table there we go now the first thing that we need to put into here is the actual original data frame that we created which we called sales then we need an index so this is going to be the sort of the left-hand side of our day of our pivot table and that's going to be the item column again this needs to match the actual column name and then we want some values so we're going to say our quantity so this is going to give us the items on the left and then the total quantity on the right but to get the total quantity we need to give it our function so we'll say ag func is equal to mp.sum and that's why we've used mp up here numpy so let's go ahead and print out total sales and we'll run that there we go so you can see we now have our item and then the total number of each one summarized from all three of those files into a quick and easy data frame which we can then crucially export out to another csv so we've got no messing around we can just take all that data here's the summary send it off move on now you notice that my numbers are all very similar and that's just the way that this vape data has been generated now if we wanted to summarize this by country as well i'm going to copy the row with control d in pycharm and we'll do sales by country and to make it have multiple indexes we just simply turn our index into a list so i'm going to go item and also country we're going to leave the rest of the same because we want to know the total quantity just like we did before but we wanted to see it by country as well so let's run that there we go so now we can see that we have by country separated out you can see the numbers are slightly different here so once you've figured out what your pivot tables you want them to look like and you've messed around with your data enough you've changed your dates around or whatever you need to do to sit to save these to a new csv file all you have to do is call the name and then do dot 2 csv or you can do.2 excel if you like and give it a name and then the same control d again sales by country sales by country so i'm going to run that and we'll have two new files in our folder with exactly what we just had there that we can then create that we've just created so there's our item quantity one and here's our sales by country one that we could easily now just fold on to whoever we need to or do whatever we want to do with that data so as i said if you do anything repetitive in excel i definitely recommend learning how to do it in pandas it will save you tons of time in the future if you've enjoyed this video i think you're gonna like this one as well
Info
Channel: John Watson Rooney
Views: 47,936
Rating: undefined out of 5
Keywords: python pandas, python automation, python pandas tutorial, python pandas dataframe, pandas, pandas read csv, pandas read mutli, automate reports, automate excel, pandas pivotes, pandas tutorial python, python excel, python pandas csv, john watson rooney
Id: PBRMvAKz_rQ
Channel Id: undefined
Length: 7min 19sec (439 seconds)
Published: Fri Feb 18 2022
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.