Hey guys,
In this video, I will demonstrate how you can use Pandas to automate the creation of
Excel files, which you can then distribute via email to a list of contacts. To send out the emails, we are going to use
the Desktop application of Microsoft Outlook. So, if you want to follow along, you also
need to have Microsoft Outlook installed on your computer. Before coding out the solution, let me shortly
demonstrate what we are going to create. In my Excel file, I have two worksheets. The data worksheet contains financial information
for five different countries for the years 2020 and 2021. I intend now to separate the data for each
country and only include the transaction for 2021. So, that I will have five separate excel files
- one Excel file for each country. Afterwards, I would like to send out the reports
via Outlook according to my email list. As an example, the Excel file for Japan should
be sent to Tanaka-san. To do that, we will create the following Python
file. Let me go ahead and execute this script. As a first step, the script will create a
new 'attachments' folder for us, which contains the five Excel files. Once done, each file will be attached to a
new outlook email according to our excel email list. Ok, and without further ado, let us get started. To explain the code, I have created the following
Jupyter Notebook, which I will also upload to my Github page. You will find the link to that GitHub repo
in the description box. First things first, I am going to install
the required dependencies. To separate the Excel files, I am going to
use Pandas. Under the hood, pandas is using the openpyxl
library to interact with Excel files. Therefore, make sure that you have both libraries
installed. You can install them by typing pip install
pandas & openpyxl in your command prompt or terminal. To interact with Outlook, I will use the pywin32
module, which you can install by running 'pip install pywin32'. Once we have gotten that out of the way, we
can start separating the Excel file. As the first step, I will declare the path
to the Excel file stored in the current working directory of this Jupyter Notebook. Additionally, I will define the path to the
'attachment' folder, where we are going to save the Excel files. If this folder does not exist, I will create
it using the 'make directory' function from pathlib. After executing the cell, we will see our
new 'attachment' folder. Once done, I will load the 'data' worksheet
into a pandas dataframe and print out the head. Then, I will store all country names in a
list by using the pandas unique method. If I print out this list, we will see the
five countries. As a second step, I will iterate over that
list and filter the dataframe. But before doing that, let me give you an
example of how to filter the data for one country. Let us say we only want to return the financial
data for Japan. We can do this by using the query method. So, data.query followed by the column name
and the filter value. This dataframe now contains only the data
for Japan. Now that we know how to query the dataframe,
I will iterate over all country names so our unique value list. Inside the loop, I will also use the query
method, as seen before. Yet, we are referring to a variable this time,
so I am using the at symbol here. Additionally, I would only like to return
the data for 2021. Once we have filtered the dataframe, I will
construct the output path before exporting it to an excel file. I will also set the workbook and worksheet
name to the respective country name. So, as an example. In the first iteration, Python will replace
the unique value with 'Canda.' I am also using a variable for the column
name. Because, by doing this, we are more flexible. Perhaps in the future, I want to filter the
dataset based on the segment. In that case, I only need to change the column
name variable. After filtering the dataset, I will construct
the output path. So this line of code will return the following
path. I will then use this path in the to_excel
function from pandas. Additionally, I will set the sheet name to
Canda. As we do need the index, I have set it to
false. After running this cell, we will find our
five excel files in the attachments folder. To validate if everything was working correctly,
I will open up one file. As we can see, this dataset only contains
the data for Japan. And also, if we check the year's column, you
can see that only 2021 is included. Now that we have our Excel files, we can create
and send out the outlook emails. As you remember, Python should send out the
emails according to our email distribution list. So, let me load that email distribution list
into a pandas dataframe. Then I am going to initialize the outlook
application using the pywin32 module. Afterwards, I will iterate over the rows of
the email list. For each row, I am going to create a new Outlook
email. Within that email, I am going to set the receiver
address equal to the email column. By the way, if you want to send the mail to
multiple recipients, you can use a semicolon. I will do the same for the CC field. For the subject, I will use an f-string to
concatenate 'Financial Report' with the country. For the body, you could keep it simple by
using the mail.Body followed by your message. However, I am going to use an HTML body. In that case, I can use HTML formatting tags
to style the content of the mail. For example, I will set the first line to
a bold text and insert a couple of linebreaks. Next, I am going to construct the attachment
path before adding the attachment to our email. Last but not least, I am going to display
the email. If you want, you could also directly send
out the mail using mail.send. Ok, and that is all there is to it. After running this cell, we see that Python
has created our five emails with different attachments. Before wrapping up this video, please note
that it was a simplified example. In real life, your dataset might look messier. Additionally, by default, pandas or, more
specifically, openpyxl will not remember or save the formatting of your source excel file. So, in your source file, you might have a
bold header with a blue background colour and white font. If you use pandas and export the dataframe,
the excel file would like this. Nonetheless, I hope you learned something
new today, and this video gave you some inspiration on how to automate some tedious office tasks. If you have any questions or want to see more
excel automation examples using Python, let me know in the comments. Thanks for watching and see you next time.