Delete Duplicates in SQL - SQL Training Online - Quick Tips Ep43

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
hey joy blue here and I'm going to talk about a little trick on how to delete duplicates in sequel server and so first off let's create a little table that has some duplicates in it and so I've got a little script here and let me zoom out so we can see all this and so I've got a little table I want to create that has some duplicate emails in it and it could be some kind of opt-in list or our email marketing list you have and let's go ahead and execute that and let's see what's inside of the table now so select star from email list and you can see that we have some duplicates here so Bob's in there twice Jill's in there quite a few times so let's go ahead and order this so we can see what is really duplicated in here and you can see that Bob's in there three times Jan's alinder one so she's okay and Jill's in there three times so we want to get rid of these duplicates well there's really two nice tricks where you can use si te s our common table expressions in order to get rid of these duplicates so let's try that out so I've got another script here the first way and let me copy that over and so here's the first query and it uses the row number over and then the order by so this is a special Clause inside a sequel server that you can use and let's just see what this little select statement does first it actually puts a row number on all the records so Bob you can see is 1 2 3 and then Jan is 4 and then Jill is 5 6 7 ok so now that we have a road number it makes it easier to actually delete and so let's take a quick look at the second part of this where we actually do our delete and we say where row number not in and then we say select the minimum row number from this group here and so the minimum row number is going to be one for Bob four for Dan and five for Jill and so we want the row numbers where it's not equal to the one for Bob the four for Jan and the five for Jill so that means we're going to actually delete those two rows and these two rows so let's run this and see what happens four rows were affected so now let me come back over to the email list and we got rid of the duplicates so that's one way to do it now there's another way we can do it I'm using a similar statement first let's reload our table and I'm just going to put a whole bunch more of these back in it so we've got a ton of duplicates now so Bob's in there whole bunch jams even in there a few times and Jill's in there quite a few times so let's go ahead and look at the second way we can do this I'm going to copy over another statement and I'll just paste it below so we can compare these two so this statement uses another Clause the partition by and let's take a look at what that does differently so I'm just going to highlight that part of it and I will go ahead and execute and let's take a look at what that does differently well what it does is it actually groups by the first two columns and then numbers so you can see that when we get to the next group it starts the numbering over so for Bob we have one two three four and then we start numbering again at Jan so we have one two and then we start numbering again at Jill one two three four and so then all we have to do to delete is actually look for the duplicate count in this case this column here greater than one and so here the duplicate counts bigger than one so we're going to delete all those duplicate counts bigger than 1 so we'll delete Dan duplicate counts bigger than 1 so we'll delete all those so 2 for me this is a little bit easier to follow than the other one I like it better they both work just fine though so let's go ahead and run this and make sure it does what we think it should do so I'll highlight here f5 seven rows were deleted let me go ahead and select and we have three rows and so everything's done so I'll go ahead and put these scripts over on my blog and as always if you find this helpful please leave a comment below click the like button or subscribe to my channel you can also visit me at sequel training online.com and I hope you enjoyed this free video you
Info
Channel: Joey Blue
Views: 42,192
Rating: 4.886179 out of 5
Keywords: Delete Duplicates in SQL, sql server 2012, SQL (Programming Language), Technology, Software, Data, Youtube, Software (album), Business, Management, Office, delete, duplicates, in, sql, server, 2012, remove, duplicate, records, select, distinct, access, mysql, oracle, query, to, take, out, statement, how, 2010, Oracle Database (Software), Microsoft Office (Software), Software Tutorial, find, clean, sql training online, training, online, video, course, class
Id: glEpavFojSM
Channel Id: undefined
Length: 5min 35sec (335 seconds)
Published: Fri Nov 09 2012
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.