SQL Data Quality Services 10 SSIS Cleanse Match

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
okay we're gonna use the data quality services cleansing transformation and matching in an integration services package so we're going to open up a sequel server data tools your project integration services project seek a 4-6-3 chapter 20 and I'm gonna create them a desktop okay already exists so I'm just gonna stick there's one on the end okay first off we're gonna rename the package to dqs cleansing and we can drag on a data flow task then we're going to right click connection managers new connection old a DB ad and it's gonna be to our local host and dqs staging data we would have said new local host dqs staging data that's not connection succeeded we've already got that there so we're gonna click OK we've got the connection there and then connection managers so we're gonna double click our data flow task actually while we're here we shall rename this on or leave the flow task so let's make a new source of our dirty data that can be sequel server and we're gonna use the connection that we've got and we're gonna call this customers dat just as a reminder we created two tables one of clean data then we populated a dirty table with the same data and then to the dirty data we applied a series of random changes so you can see there that there's been some amendments to the data for the name street address slight changes and I'm gonna make an amendment to the customer key there we're gonna change the CT dot where that should change to court so he's gonna apply that updates we know one where this court and just to show you what that will do if we go to our knowledge base we're gonna open the TK 4-6-3 customer click Next and then write which one is a street address sorry remember we got to him based relations CT dot will change the court so that's gonna happen when we add our data quality tasks in a moment so I was gonna apply from this knowledge base so let's back to our new business services package so let's take data from the duty table that we just created so customers duty so the columns the key full name street address updated and clean customer key click ok then we're gonna drag on our DQ s cleansing task add a connection and what we're gonna do here is clean cleanse the street address so double-click that so I did the quality manager it's gonna go to our server which is just a local host test that click ok this is our connection to our daily quality services and a knowledge base is the customers and just to see it again there's our domain which is the knowledgebase teach a 4-6-3 customers and at the moment looking at street address where CT dot changes the court okay so we're gonna click street address that's why we're gonna click street address here and the domain within the knowledge base is gonna be street address I'm gonna change the output alias to street address you'll see you've got the source you've got the output we've got status and in advanced we're gonna also have confidence and reason enabled click OK and the next thing we're going to do is we can prepare for exact matches so we're gonna use a lookup transformation I'm gonna call that exact matches so let's connect those and our connection it's gonna be the staging data again and we're going to connect to the customer clean table we go to our columns we're gonna match on the full name and we're gonna match on the street address and we can bring through from the available input columns I may be able to look up column sorry the customer key and we're gonna replace clean customer key with looked at customer key from our clean data and then we can go back to our connection oh no I'm sorry our general tab and what are we gonna do with no matches we're gonna redirect rows - no match output so where there's no exact matches I'll name and address we're going to redirect them to no match and then what we're gonna have to multicast transformations one here and we're gonna drag a connection to there the first one's gonna be our match output so I'll call that match a drag on another multicast it's gonna be on no match and then we're gonna actually create some destination tables the first table is gonna be dirty match table so it's gonna be a duty data coming through these are gonna be with the ones that actually matched in the lookup so we'll execute that so that one table and the next one is gonna be the dirty data coming through with no match so the ones I haven't matched anything to our lookup so we go back to integration services and we're gonna drag on our destination and again it's gonna be in a staging data and we're gonna drag that to our destination we're just gonna rename it's we're just gonna be customers dirty match that's gonna bug me we double click that so our connections to our DQ s staging data that's gonna go into dirty match here's our mappings so everything Maps there except record state this which just go to space there so we've matpat click ok and just to make it a little bit easier we're gonna copy that get a paste and this one's gonna go to our no match table no match drag that connection no match mappings everything is mapped including the record state this click okay everything said to run so we're just gonna execute so the papers coming through it's cleansing the street address based on the knowledge base once it's done that that's the data quality services transformation running they're based on the knowledge base next bit is just perform the lookup on exact matches then you can see there there's a thousand rows going to the match into the customers daily match 817 going to the no match and the duty no much table so go back to seed qual just have a look at out dirty no match and there we go all the mains that couldn't match and why I just want to show you we can scroll through and have a look you can see there that they have been some changes to the street addresses where CT dot has changed the court it's a new record but the confidence he has in that tournaments relation is once at 100% and it's being updated it doesn't have a clean customer key obviously because it wasn't matched so let's have a look at the oops I look at the matching so this is the date records have found a matching key and so these now have matched a proper key on the clean customers and again if we look to see T dot we will see that the knowledge base has changed CT dot to court based on a turn base relation from our knowledge base
Info
Channel: DataGriff
Views: 1,267
Rating: 5 out of 5
Keywords:
Id: UizaI79vMIQ
Channel Id: undefined
Length: 10min 49sec (649 seconds)
Published: Thu Jan 04 2018
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.