🤖 Microsoft Power Automate Tutorial - Extract data from PDF invoices

Captions
[Music]

Welcome back, this is the RPA Champion, and in today's video we are going to look at form processing with AI Builder. In more detail, we are going to see an example of how to extract data from invoices. Imagine we have some PDFs that are different every time we see them, and we want to extract some information such as the due date, the invoice number, or the amount due. We are going to use AI Builder to train our own model, and then we are going to apply that model inside a flow. The flow we create is going to extract invoice information from PDF documents and either insert it into an Excel sheet or send it via email. Which of the two is not important: the objective of this video is to show you how to use AI Builder to train your own model on invoices, or on any other use case that you might have.

For this, we go to AI Builder and click on Build. After that, we are going to use form processing. There are also models here that are already pre-trained by Microsoft; you can just click on one of these and start using it right out of the box. But we don't want to do that. We want to train our own model, because in our case our invoices are going to be different. So we click on form processing and give a name to our model. I have already created one called Invoice, so I will create this one as Invoice B. Keep in mind that training your model requires only five documents, but obviously the more documents you have, the more accurate your model is going to be. So let's create our model.

Now, what does it mean when we create a model? We are basically creating an
algorithm that is going to learn from the different PDFs we receive and understand where the different pieces of information are located in these documents.

So what is the data that we want to extract from this PDF? In this example, let's say we want the invoice number. We might also want the total, the total that is due. We might also want the due date. What else? Invoice number, total, due date: I think that should be sufficient, and let's also add the customer ID. This is all the information that we want to automatically find and extract from our PDF documents. Imagine that previously we were doing this manually: we received these PDF documents by email, opened them, and extracted the data by hand. Now we are creating a flow that is going to do this for us, and you are going to see how simple it is. You could train a model like this for basically any kind of document you want.

Excellent. We have created the fields that we want to extract and identify. Now we need to create a collection. What is a collection? A collection is just a container that holds the data we are going to train our model on. Let's say we want to process invoices coming from, for example, Coca-Cola. We would create a folder called Coca-Cola and put as many Coca-Cola invoices in it as we can, so that we can train our model on them, and we would do this for as many vendors as we have. So let's create two collections. For anybody interested in more detail, there is a tutorial about this on the Microsoft Power Automate page, where you
could also download the sample data that I will be uploading and get a little more detail about the different steps I'm covering. I will name one of my collections Adatum, which will be one of my vendors, and the other vendor will be Contoso.

Excellent. I will now upload the different invoices that I want to train my model on. Here again, keep in mind that you could use a SharePoint site, you could use Google Drive, or you could use some other online location where the files are uploaded dynamically, maybe as part of a pipeline. So which collection have we selected? Adatum, I believe. I already have some invoices here that I will upload. [Music]

Just to recap the process so far: the first thing we did was identify the different fields inside our documents that we wanted to extract. What we are doing now is uploading a minimum of five documents so that we can train our model on them. But before we train our model, we need to tag these fields and show our model where they are. So let's proceed to the next step. First we have to wait for the documents to be analyzed by Power Automate.

Excellent. Now that the documents have been analyzed, we are on the screen for tagging them. We see two red dots here, which mean that these documents have not been tagged yet and are ready to review. So let's first click on Contoso. What we need to do is show our model where the different information that we want is going to be. For example, the invoice number: where is the invoice number on this document? It's right here. Perfect. So this is going to be the
invoice number, and I will select it like this. Now the total: the balance due is here, but it's also here. I will select this one as the total. So we have the invoice number and the total, and now we need the due date. This is the due date; we select the full date and add it as the due date. On this form I don't have the customer ID, so I will mark that this field is not available in this collection, which means the model is not trained to identify the customer ID on invoices from this vendor.

Now I have completed the first page, and I have five more pages to do, so I have to repeat this same process on all the other pages. That way I will be able to train a model to recognize this information in any future PDFs that arrive from this vendor or source and have a similar structure. AI Builder is doing a pretty good job: on the following page it has already identified the fields for me, so I don't have to do it. Let's see if it does the same on the third page. Let's click on it, and it has identified the data correctly. That is perfect. I believe it will be able to identify the data automatically on the fourth and fifth pages as well without any problem; we just have to wait a few seconds for it to finish loading. This is page four, and it has identified all the data correctly. Now let's go to page five, because in our process we need to visit every single page and make sure the data is correctly labeled on all five pages.

Once we have done that, it is enough for AI Builder to train a model on the data you have provided. When you think about it, five documents is a very, very small number of data points for training a machine learning model to do something that not so long ago was so
complex that it required several different technologies, and was maybe also not the most reliable because of all the technology stacks and complexity involved.

Here we go: here we have the other invoices, and we see that we also have the invoice number. So not only do we have the total, which should be somewhere around the bottom, but we also have the date, the invoice number, and all this information. So let's start tagging. This is the due date. This is the invoice number, or the customer ID... ah no, this is actually a mistake, so let's change it. This is not the customer ID; this will be our invoice number. So let's select it again and choose invoice number. Perfect. I believe the customer ID is right here, so this is the customer ID. And now let's select the total as well. For this invoice we have all of the required information. Perfect.

Now we have to repeat the process, just like with the previous invoices, for our five documents. If Power Automate is not able to identify all the fields inside an invoice, it is going to stop us and ask us to identify them and make sure the data is correct. As you can see, in this case for some strange reason it was not able to identify the invoice number, so let's help it out and tell it: this is the invoice number, and this is the due date. Maybe in the following ones it will now be able to recognize those missing fields as well, or maybe we will have to tag them again. Let's wait and see what happens. Again it was not able to identify the invoice number and due date, but that is not an issue, because we have to do this initial tagging on at least five documents so that the model learns how to get it right.

Now let's move on to the final document. And just to remind you how simple and easy all of this is: I am training a machine learning model without
using any programming language, without any statistics, advanced mathematics, or advanced coding tools that might be beyond the capabilities of an ordinary business user who needs functionality like this. So it is really great to see that this technology has been made available for anybody to use and to create their own complex flows like this.

All right, perfect. I have completed almost all of the invoices. I believe there is one more, so let's tag this last invoice as well: this is the due date, and this is the invoice number. We are obviously going to run a test after all of this is done, just to make sure that all of this data, including the fields it was having a hard time matching, is matched correctly and is going to work before we put the model inside our flow. Inside our flow we are going to do something really nice: we are going to send in images and extract all of the information from them, such as the invoice number, the amount due, the due date, and so on.

So we have tagged five documents, and this is the last thing we had to do. Now we just have to train our model and wait for it to finish training. Once it is trained, we publish it, and then we are ready to use it inside our flows. This is going to take a couple of minutes, so let's go and see our models. Just to recap while we wait for this page to load: when you are building your models, you go to Build, and to see all the models that you have built, that you have in production, or that you are working on, you go to Models. As you can see, now that we are on the Models screen, we can see our Invoice B model, and its current status is Training. So we are going to
have to wait another couple of minutes for this to finish training, and then we will immediately be able to use our model.

As you can see, it has already finished training, so let's start using it right away in one of our processes, and you will see how easy it really is to use and integrate inside your Power Automate flows. I can either quick-test this model or publish it. It is good practice to quick-test your models the first time you create them. To test, we upload an invoice that we have not used for training. So let's use a new invoice that the application hasn't seen, and let's see if the model we trained is able to correctly identify the information on it. It should identify the invoice number, the due date, and the total at the bottom of the page. And as you can see, it has identified the invoice number with a confidence score of 100, and the same for the other three pieces of information that we require.

Now let's close this, publish our model, put it inside a Power Automate flow, and see how we can access this information and use it there. Again, I really want to stress how simple and straightforward this whole process is, and I really invite you to start playing with Power Automate and building your own models. It is a really powerful tool that is so straightforward to use. So let's use this immediately in a new flow.

Excellent. This is the page we are redirected to. We are going to be using the custom model that we have created, so let's click on Continue, and we are redirected to a page where we can create our flow with this component. In addition, since I have chosen to take this path and create and
integrate the model that I created via AI Builder, I also have a few additional things here that might not be required in your flows, but for our testing purposes we are going to use them. I will show you what I'm talking about as soon as the page finishes loading.

Okay, perfect. What I was talking about is this initial part, a manually triggered flow where we have the ability to upload a picture. We could also upload the invoice to a shared drive, or send it via email and pick it up from an email folder; there are different ways. Just for simplicity, we are going to keep this and upload an invoice. After that, we use the model that we trained, which is this model right here, on the document type and the document that we get from the trigger. After that, we send an email to ourselves with the information that we have extracted. So let's do this very quickly. In the same way, you could also insert a row in an Excel sheet or a Google Sheet for each invoice, or add whatever step you can think of, for example an approval step. This is just to show you how to use the data extracted by your model.

So let's send an email, and we are going to send this email to myself. Apologies for that. Excellent. Now in the body we are going to use the different values. First the total value, and right underneath we are going to have the invoice number, and after that the due date. We also want to know who the vendor is, so let's use the name of the collection to send the vendor name. All right, we should have all of the information. And I'll just repeat again: you could also have a kind of approval
workflow, meaning that only if your invoice is larger than 100 does it send me an email for approval, and if I approve it, it inserts the invoice in an Excel sheet, sends another email, or continues with your flow. So this is pretty much it, and now we can save and test our process immediately.

This has been a slightly longer video, but that is only because we went into a little more detail explaining how AI Builder works and walking through all the different steps. When you do this by yourself, you will see that in 10 to 15 minutes you can create a machine learning model, deploy it, and use it immediately inside your flows.

Now let's select an image that we would like to analyze. I will not use this image again; just for testing purposes, let's use this other one. This is a PDF document. Let's upload it and see what happens. I have done the first step of the process, which is to manually upload the document. Now the document has been sent to the server where our model lives, and we have also received an email. I will show you the email that has just arrived on the other screen. This is the email we have just received: this is the total value, this is the invoice number, this is the due date, and this is the vendor whose invoice we extracted this information from.

Now imagine you had 100 invoices or a thousand invoices, or had to process dozens and dozens of invoices every day, and how a solution like this could help you speed up your work. You could build it for the majority of your invoices, and for those special invoices that require particular care, you could continue working by hand until you build a more robust
solution.

I really hope that you enjoyed this video and that you didn't fall asleep; it was a long one. If you like my content and you like this video, subscribe to the channel and give me a thumbs up, and I will see you in the next video. Have a great day, and thank you so much for watching. [Music]
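
Editor's note: the last part of the flow described in the video (take the extracted fields, put them in an email body, and optionally route invoices over 100 to an approval step) can be sketched in plain Python. This is only an illustration of the logic, not how Power Automate or AI Builder implements it. The names `ExtractedInvoice`, `needs_approval`, and `build_email_body` are hypothetical, the sample values are made up, and nothing here calls any Microsoft service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedInvoice:
    """Hypothetical container for the fields tagged in the video."""
    vendor: str                        # collection name, e.g. "Contoso"
    invoice_number: str
    total: float
    due_date: str
    customer_id: Optional[str] = None  # not available in every collection


# The "larger than 100" example mentioned in the video (assumed currency-agnostic).
APPROVAL_THRESHOLD = 100.0


def needs_approval(invoice: ExtractedInvoice,
                   threshold: float = APPROVAL_THRESHOLD) -> bool:
    """Return True when the invoice total is strictly above the threshold."""
    return invoice.total > threshold


def build_email_body(invoice: ExtractedInvoice) -> str:
    """Lay out the fields in the order used for the email in the video."""
    lines = [
        f"Total value: {invoice.total:.2f}",
        f"Invoice number: {invoice.invoice_number}",
        f"Due date: {invoice.due_date}",
        f"Vendor: {invoice.vendor}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = ExtractedInvoice(vendor="Contoso", invoice_number="INV-001",
                              total=250.0, due_date="2021-02-15")
    print(build_email_body(sample))
    print("Needs approval:", needs_approval(sample))
```

In an actual flow these steps would be a condition action and a send-email action configured in the designer; the sketch only makes the branching and the field order explicit.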
Info
Channel: RPA Champion
Views: 60,809
Keywords: power automate, microsoft flow, microsoft power automate, power automate microsoft, power automate tutorial, automate invoices, power automate invoices, power automate invoice processing, microsoft flow tutorial, upskilling and reskilling, microsoft power automate tutorial, rpa champion
Id: CSvugOZIruo
Length: 24min 37sec (1477 seconds)
Published: Thu Jan 28 2021