Set up Microsoft Exact Data Match - Hash and Upload your Data

Captions
Hey everybody, welcome back to the channel. Today we're going to walk through uploading the EDM data that we prepared in our previous video with the schema and cleanup exercise. Now we're going to take that and upload it into the Office 365 solution. I'm going to show you all the steps needed to set up your user accounts and set up your server for hashing and uploading that data into M365. So that being said, let's get into it and start setting this solution up.

All right, welcome everybody, let's go ahead and get into it. This is the guide that we're going to be using to create the accounts, set up the server, and hash and upload the data. The first step is that we need to set up a custom security group and a user account, so we're going to do that real quick in Azure Active Directory. We're going to go to New user and just create a standard service account, EDM upload; we'll call this one user one. Go ahead and create it.

Next up, we need to create a group for this, and in this case it is a very specific group that you have to create. So let's scroll down and copy this name in here. It does need to be a security group, so we're going to go back under Azure Active Directory, Groups, and we'll call this EDM uploaders. Of course, Security is selected as the group type, with assigned membership. Then we're going to go to Members and search for the service account that we made. I have two in here because I was doing this before, so I'm going to use this EDM uploader one, then select and create.

While this is baking, we can start setting up our server, so we're going to pivot over to on-prem in our server environment. The next step is setting up our machine, and you'll see here the .NET 4.6.2 version requirement. For running the EDM Upload Agent, what you're going to need on this server is the upload agent itself. So if you click on this, it'll
take you down to the commercial + GCC version, and that's what we're doing over here. So here I am on my server, and I have just downloaded the EDM Upload Agent to this environment, and we need to go ahead and install it. You'll notice that you need to check that .NET version. If you haven't done that before, we can check it in our environment, so let's run this check code. We have 4.7; 4.6 is required, so we're well above.

Next, let's install the EDM Upload Agent. Here we go: we're going to put this on our server and run through the install. Notice the file path that's in here. We're going to need to sign in, so I'm going to use that EDM service account, and we'll go ahead and set up the install package for this.

Okay, now that it's installed, we're going to make a new PowerShell script so we can do the setup components. We're going to call this EDM. Okay, so there we are, all set here, and we'll walk through the rest of these components. We have a working directory; in my case it's just going to be the C drive and then this EDM folder. I have my sample data set in here, the one that we worked on in the previous video, so it's all set up for our custom schema. Again, this data is from dlptest.com, so if you haven't watched that video, you can grab the data from it and watch through the setup process.

All right, now we need to authorize this account, so I'm going to grab this. I always like to work in the ISE for PowerShell, just because I like to be consistent and be able to reproduce this if I need to set it up on a different server. So let's run this authorize command. Oops, I didn't go to the right folder for this, so we're going to cd. Let's go find the install path that it went to. And there it is. So we're going to grab this path and we'll change our working
directory to match. Whoops, run just that command, and we'll switch the directory we're in. Now let's run that EDM Upload Agent. Oh, I forgot to add the dot. And we'll run that. What this is going to do is set up this server with an authorization token so that we can keep running uploads with this account. All right, so we're going to sign in. You'll never need to sign in again with this account after this; you just need to do it during the setup process. After that, it's going to keep this consistent and reuse the same authorization token.

Okay, so we're set there. Next in the process, let's pivot over to the upload. Once we're signed in, we have to download and save the schema file. Depending on how you set things up, this might not be a necessary step, but for most environments, especially if you're using the new wizard, you're going to need to save a copy of your schema so that we can then use it for our uploads. So we're going to grab that command and, again, put it into this script.

Okay, the data store name in this case is going to come from our compliance admin center, so we're going to go grab that. Under Data classification, EDM classifiers, I'm going to turn off this new experience, and here is that DLP test schema that we used previously, so we're going to put that in here. Where do we want to save this schema file? Well, in my case I have this entire working directory here, so I'm going to have it output the file into this location, and I'm going to use double quotes. Let's run that line. I probably should make this bigger so it's easier for you all to read, huh? That would make a lot of sense. All right, so go ahead and run that command right there. Okay, and there you can see it generated the XML for our schema. Let's open that up real quick in Notepad, and just like we
saw before, it's exactly the same schema. It just went and downloaded a new copy of it, and that's what we're going to use for our upload.

Okay, next step: let's do a test upload of our file. In this case I'm just going to copy this, and we'll walk through it as kind of a one-time script. It's EdmUploadAgent /UploadData with the data store name. All right, here it is. Data file: that's going to be the file that we've created, our actual data set, so we're going to copy that path in. So that is that piece there. Hash location: where do we want it to save the hash file? In this case I'm going to save it into the same directory, but I'm going to create a Hash folder. Where is the schema? In this case we're going to grab that path again. If you hold Ctrl+Shift, you get the Copy as path option, so that's what I'm doing there, just because it's easier. Column separator: we don't need that, I believe, because we are not using tab-separated values. That's only needed if we had a TSV file, so we're going to delete that.

And then the allowed bad percentage. We're going to say five percent. Actually, I'm not going to allow any bad percentage. This is an option for you if your data isn't clean: you can set an allowed bad percentage, saying, hey, I know maybe 10% of my data doesn't match the correct format for upload; upload it regardless and we'll come back and fix it, but if it's over 10%, then fail it out. So I think that's good for now. Let's run this command and see how that works, see if we get any errors. We've got to give it a little bit of a delay, and we're set.

So that now is the code that we're going to need. We went through the authorize and the save schema, we have it all set up, and this is going to be our setup file. Now, if you are running this and the file
is taking a long time to upload and hash, that can be expected if you have a large data set. I think I did 10,000 rows the other day and it took about 15 minutes to run the process. So just know that if you are doing a lot of uploading, it can take a while. If you are concerned and want to check on the progress of the file, you can actually run the EdmUploadAgent /GetSession command, and it'll tell you about the status of the uploads and where things are at. It'll let you know, like, oh yes, this one completed successfully, and all that kind of information about the data. So that's that component.

The next step is really integrating these commands into a scheduled task for your upload, so let's walk through that process of actually scheduling this to be a recurring thing. We're going to call this file EDM. Actually, we're going to take the same setup that we did and save it as a command that we're just going to run on a scheduled task. You obviously don't have to do this as a scheduled task. If your database team is going to be responsible for uploading this data into the Microsoft solution, they just need to run this at the end of their ETL. Wherever they're dumping that CSV, you can have them export the data at the end of their SQL package, or if they're doing Python, all they need to do is run this one line that we worked on and they should be good. This should recur and continuously upload those files.

There are some limitations, though. I think Microsoft only allows two uploads per day to a schema, so just go into it knowing that there are limits like that. You can't be uploading this data 10 times a day or something like that. As long as you know that, you should be pretty good. Talk to your database team,
or if you are the database team, you can either just run this at the end of your ETL, or, in my case, my database team usually just dumps the file I need into a network share, so I'm going to set this up on this server to run once a day and continuously upload the same file. To do that, I'm going to save this as its own file, EDM_upload, and save that. I took the exact same file that we built here, and let's just set it up as a scheduled task.

So let's go into Task Scheduler and create a task real quick to run this: EDM upload. I'm going to run this whether the user is logged in or not. For triggers, we're going to set this to run on a schedule, one time every day, at, say, 12:05. That's fine. What action is it going to take? Well, we're going to start a program, and in this case the program is PowerShell. I believe the argument is... hold on, I'm going to pull up a previous job that I had done for this same thing. Give me one second while I pull that up. Yeah, -File. So that's the command: we're going to run PowerShell with -File and then the file path. So let's go over to our new demo task and put in -File, then grab the file that we want it to run, again with Ctrl+Shift, Copy as path. Make sure it's in double quotes, and we are set.

So now we are officially ready to go. This is going to run this command every day. We've of course got to save our password, otherwise it won't run when the user is not logged in. And boom, we are set. If you need to adjust this, say the data file changes and it's now being put into a new location such as a different network share, you just need to adjust that command, and then you should be good to go.

If you would like to get alerting on successes and failures of this, you can actually do that with this solution. Yes, you
can set it up on the scheduled task to get an alert if the task failed or something happened on the server, and I would definitely recommend coming in here and setting up an action, such as getting an email or running some other trigger. Maybe you're going to adjust the PowerShell and have it do something more complicated than just the run command. But what I would also recommend is using the compliance center audit logs and getting alerts on those.

Every one of these steps in the EDM process is actually stored in your audit logs. So if we pull up the classic search in here, we can search for EDM. So, EDM completed and failed: those are the two activity types that I would recommend setting up an alert for from the compliance center. This will let you know whether the Microsoft solution is successfully ingesting the file when the upload runs. Especially that failed one; that's the one I would definitely recommend setting an alert for. That way you get ahead of any issues you're having with the uploader or a change in your data set.

If these activities aren't showing, like if you run the upload and then immediately come in here and run the search, just know that it can take a while for them to show up in the audit logs, so give it some time to come through. Here you can see in this audit entry that, yes, we were successfully able to upload the data. You can see it here, record type 109. That's great; we're getting all that content uploaded, and we'll see a lot of those data pieces in here.

If you want to set up an alert for it, you can go into your alert policies and create it manually in the GUI. Otherwise, you can use PowerShell to do this. I have this pre-built in
my GitHub, and I'll put a link to this alert, but it's really just alerting on upload data completed or upload data failed. That's the activity that we're going to target. Just put yourself in as the email recipient and get an alert when it comes through.

So again, I hope this helps someone out there get started with this and walk through the solution. It's not terribly complicated. A lot of the hard part is, of course, going to be the data cleanup and getting your data. But once you have access to it, and once it's routinely being dumped into a location that you can take advantage of, hashing and uploading is a pretty simple process. I hope this helped. In the next video, we're going to build the EDM sensitive information types. That might be a little bit more of a difficult process, but we're going to walk through it pretty quickly, and hopefully you'll be fully set up after that one to use EDM with Microsoft 365. Have a good one. Bye.
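
Editor's sketch: the captions mention that a database team working in Python only needs to run the single upload line at the end of their ETL. A rough idea of what that step might look like is below. It is a sketch under assumptions, not the presenter's script: the install path, the data store name `dlptestschema`, and the file names are hypothetical placeholders, and the flag names follow Microsoft's published EdmUploadAgent syntax as best I recall it, so check them against the agent version you installed. It also assumes the server was already authorized once with the `/Authorize` step shown in the video.

```python
import subprocess
from pathlib import PureWindowsPath

# Assumed default install folder; use the path your installer actually showed.
AGENT = PureWindowsPath(
    r"C:\Program Files\Microsoft\EdmUploadAgent\EdmUploadAgent.exe"
)


def build_upload_command(data_store, data_file, hash_dir, schema_file,
                         allowed_bad_pct=0, agent=AGENT):
    """Return the argument list for a one-step hash-and-upload run."""
    if not 0 <= allowed_bad_pct <= 100:
        raise ValueError("allowed_bad_pct must be between 0 and 100")
    cmd = [
        str(agent), "/UploadData",
        "/DataStoreName", data_store,
        "/DataFile", str(data_file),
        "/HashLocation", str(hash_dir),
        "/Schema", str(schema_file),
    ]
    # Only pass the tolerance flag when some bad rows are acceptable.
    if allowed_bad_pct > 0:
        cmd += ["/AllowedBadLinesPercentage", str(allowed_bad_pct)]
    return cmd


def run_upload(cmd):
    """Run the agent from its own folder.

    Assumes (not verified here) that the agent exits non-zero on failure,
    in which case subprocess raises CalledProcessError.
    """
    subprocess.run(cmd, check=True, cwd=str(PureWindowsPath(cmd[0]).parent))


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    command = build_upload_command(
        data_store="dlptestschema",
        data_file=r"C:\EDM\sample_data.csv",
        hash_dir=r"C:\EDM\Hash",
        schema_file=r"C:\EDM\dlptestschema.xml",
    )
    print(" ".join(command))  # inspect first; call run_upload(command) on the server
```

Building the command as a list rather than one string avoids quoting problems with paths that contain spaces, which is the same reason the video wraps its paths in double quotes. Remember the limit mentioned in the captions of roughly two uploads per day when deciding how often the ETL calls this.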
Info
Channel: Doug Does Tech
Views: 1,428
Keywords: Microsoft Purview, EDM, Exact Data Match, EDM Setup
Id: 5QrfMi0QQ_c
Length: 19min 28sec (1168 seconds)
Published: Fri Sep 30 2022