Automatically Fill Word Files with Python

Video Statistics and Information

Captions
What is going on, guys, welcome back. In this video today we're going to learn how to automatically fill Word files with Python, to automate things like creating contracts or invitations to events. So let us get right into it.

All right, so we're going to learn how to automatically fill Word files using Python today, and this can be useful for a number of different use cases. For example, you can create contracts in a customized way on demand: you just provide some information, use a template, and get a customized contract instead of having to write everything yourself or fill out the placeholders manually. You can also use this to send customized invitations to potential clients, customers, or business connections. If you have a database of a thousand people, it makes sense to automate this procedure instead of opening the template and manually copy-pasting information from a CSV file or a database, and there are many more uses. So you can use Python to automate the filling of Word templates, and this is what we're going to do in this video today.

For this we need to have the package python-docx installed. So you open up your command line and you say pip or pip3 install python-docx. We're also going to use pandas, because we want to read from a CSV file later on, so we're going to install these two packages.

Once you have them installed, what you also need is some sort of template. You don't have to use the template I use; you can just design your own. It's a very simple one that I have here. I'm going to open it with OnlyOffice since I'm on Linux. This is my basic template, some grand NeuralNine event. You can see "Dear salutation, first name, last name, it's been a while since our last interaction on last contacted". Everything that you see here in square brackets is a placeholder. You can see "we hope all is well at company name, we are delighted to" and blah blah blah, and then you have company name again, and again. Then you have the static information here, which is the event. Of course you can also make this variable if you want it even more customizable, but for now this is going to be the same event every time, the same dress code every time, and it's just going to be a different invitation based on who we're sending it to.

So we're going to learn how to fill out this template. We're going to do it first with just one particular set of information, so we're not going to use a database or a CSV file; we're going to go with just one person, and then we're going to automate this with a CSV file.

We start with from docx import Document. The idea now is to go through the Word document and find specific placeholders. You can design the placeholders however you want; you can do it with curly brackets or something else. The reason I use square brackets is that certain words can occur in the document without square brackets, and then they're not placeholders. For example, if I use company as a placeholder, the problem is that company might occur as an ordinary word. If I replace all occurrences of company, I might also replace the word company where it is not a placeholder. Whereas if I put it in square brackets, it's obviously a placeholder; this is not natural text that will usually occur in the contract, or in the invitation in this case.

All right, so what we're going to do is define a function fill_invitation. This function takes the template path as a parameter, the output path as a parameter, and the data as a parameter in the form of a dictionary. We define doc equal to Document and load the template path, whatever the template path is. Then we go through each paragraph in the document's paragraphs. When we open this file up again, a paragraph is just one section here: this is a paragraph, this is a paragraph, every single thing here is a paragraph, every time you have a line break.

The idea now is that every paragraph has multiple runs. In this case we only have one run per paragraph, so all of this is one run. But I can of course change certain styling here: I can mark this, highlight it, change the boldness, the size, the font, everything, and then I would have multiple runs per paragraph. In this case we can just do the runs individually. The only thing you have to keep in mind is that if you have different runs and a placeholder is split between them, you're not going to be able to easily detect it. If you're looking for company name in the runs, but one run ends here and another run starts here, it's not going to find company name. In this case you would have to replace the full text of the paragraph. We're going to see what this looks like in a second.

For now we just say for paragraph in doc.paragraphs, and then for key, value in data.items(); this is the data we're going to provide in a second. We say if the key is part of paragraph.text, which is the content of the paragraph, then for run in paragraph.run we say run.text equals run.text.replace, and we replace the key with the value. We're going to see why this makes sense once we have the dictionary. As I said, if the runs produce problems, what you can also do is say directly paragraph.text is equal to paragraph.text.replace with key and value. This is also a possibility, so if you don't want to deal with runs and you don't have any formatting issues, you can just go with that. We're going to go with the runs here; it doesn't really matter. So this is the replacement code. The only thing we need to do now is say doc.save with the output path. Nothing too fancy.

All right, so with this function, let's define a main section here: if __name__ is equal to "__main__". We say the data is a dictionary, and as the key we always want to put the placeholder of the template. So let's open it up again and see all the placeholders that I have in here. I have salutation, first name, last name, last contacted, and company name. That is everything. So now I'm going to say salutation is in this case Mr, first name is Mike, last name is Smith, and then last contacted. I think we also had date, don't we? No, we don't have date, because the date is static. Okay, so last contacted is the last time you spoke to that person or messaged that person; let's say it's October 1st, 2023. And finally company name is, I don't know, Smith Inc or something like that. So this is the data, and it just defines which placeholder is mapped to which value.

Then all we have to do is say the template path, which in our case is just template.docx, and the output path, which is just going to be filled.docx, and then call fill_invitation with the template path, the output path, and the data. Now if I run this, what I get is an exception, because of course it has to be paragraph.runs. And now it works. So I can open this in files, I can double-click it, and what you see now, hopefully, is, there you go, a customized invitation: "Dear Mr Mike Smith, it's been a long while since our last interaction, October 1st 2023, we hope all is well at Smith Inc", and so on and so forth. You can see that the placeholders are now filled with the actual values.

Now, this does not make a lot of sense if you do it like this, if you define the data manually. It makes a lot of sense, however, if you do it automatically using a database or a CSV file with contacts. Here I have a CSV file with 20 contacts. This is generated by ChatGPT, so instead of copying this, you can just go to ChatGPT and tell it you need a CSV file with those columns: first name, last name, last contacted, company, salutation. Tell it you want 20 entries or more. This is now the data that I have here. This is a database, and it potentially contains 1,000, 2,000, 3,000 different people that I want to contact and invite. So why would I do it manually? Why would I open the template, fill in the information, and then save it? I can just use the automation that we just learned about.

So we're going to keep the same function fill_invitation, but now generate the invitations from the CSV file. We add a separate function here, which we name generate_invitations_from_csv. It takes the CSV path and the template path, which we then pass on to the fill_invitation function. What we do now, basically, is load the CSV file as a data frame, which is of course why we need pandas imported, so import pandas as pd. We say the data frame is equal to pd.read_csv, and we read contacts.csv. Then we say for index, row in the data frame's iterrows, so we iterate over the rows, and we create the data dictionary that we have down here for all of the entries. I'm going to just copy this, and instead of having these fixed values, we're going to have the data frame values here. So we say row first_name, oh, actually this one is first name and this one is salutation, then row last name, then row last contacted, and finally row company. Yeah, just company, not company name.

All right, so for every row we create such a data dictionary, and all we have to do now is say output path is equal to some file name structure, something like invitation underscore and then the index, just to have different names. Or let's say index plus one, so we don't have an invitation_0.docx. What's the problem here? "Expected int, got Hashable instead." Why is that? Index should be a number, actually. I think it should work; let's see if it works or if I'm missing something. Then we just say fill_invitation with the template path, the output path, and the data. So we're just calling the function. We're basically doing the exact same thing as here; the only difference is that we're doing it for each row of the data frame.

All we have to do down here now is say CSV path is equal to contacts.csv. Actually, yeah, we need to use the CSV path here. I mean, we don't have to; we can also just use contacts.csv directly. Then we say template path is again equal to template.docx, and finally we call generate_invitations_from_csv with the CSV path and the template path. So that would be the automation. If I run this now, hopefully I get 20 invitations, as you can see here. I can open them in files, I can open any one of them, and here I can see the invitation for Mr Robert Brown of Brown company, or corporation. We can see here that the date is formatted differently, but this is just because the date is like that in the CSV file. If I want it in a different way, of course I can change the CSV file or I can reformat it programmatically. But you can see now we have this invitation. I can go back, I can open another one in files, there you go, and here you can see, okay, Mr Brian Gray, gray grain, and then you can see here the last time contacted.

So you can see that this makes a lot of sense if you create invitations. Of course you can then go further and convert all of them into PDF files, or you can also automatically send them via email. You might have a full management system where you just click a button and send out invitations to all these people automatically as PDF files. This can be integrated into a larger pipeline or procedure, but this is fundamentally how you fill in Word files using Python.

So that's it for today's video. I hope you enjoyed it and hope you learned something. If so, let me know by hitting the like button and leaving a comment in the comment section down below. And of course don't forget to subscribe to this channel and hit the notification bell to not miss a single future video for free. Other than that, thank you very much for watching, see you in the next video, and bye.
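
The CSV step from the captions, turning each row into a placeholder dictionary and an output file name, might look like the sketch below. The function name build_invitation_jobs, the column names other than first_name, and the placeholder spellings are assumptions made for illustration. It numbers the files with enumerate instead of index + 1, which sidesteps the "expected int, got Hashable" warning mentioned in the captions, since the index yielded by iterrows is not guaranteed to be an integer.

```python
import pandas as pd


def build_invitation_jobs(csv_path):
    """Return one (output_path, data) pair per CSV row (hypothetical helper)."""
    # Read every column as text so the values can be used directly in
    # string replacement; empty cells become "" instead of NaN.
    df = pd.read_csv(csv_path, dtype=str, keep_default_na=False)

    jobs = []
    # enumerate(..., start=1) gives invitation_1.docx, invitation_2.docx, ...
    for number, (_, row) in enumerate(df.iterrows(), start=1):
        data = {
            # Placeholder spellings and column names are assumed.
            "[SALUTATION]": row["salutation"],
            "[FIRST_NAME]": row["first_name"],
            "[LAST_NAME]": row["last_name"],
            "[LAST_CONTACTED]": row["last_contacted"],
            "[COMPANY_NAME]": row["company"],
        }
        jobs.append((f"invitation_{number}.docx", data))
    return jobs
```

Each pair can then be passed to a fill function together with the template path, one call per row, which is what generate_invitations_from_csv does in the captions.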
Info
Channel: NeuralNine
Views: 17,537
Keywords: word files, python, docx, fill word python, python automate word, python fill docx
Id: 3qjEPRmge8I
Length: 14min 35sec (875 seconds)
Published: Sat Oct 21 2023