Data cleaning in Excel - 10 tricks *PROs* use all the time

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
dirty data is a pain solve it with these 10 powerful yet easy ways of cleaning data in excel watch this video until end to unlock two more bonus ways to clean data in excel let's clean number one first name extraction here we have some names and we would simply like to extract the first name part of it now you could write some formulas or something but here is a dead simple way of doing this just go to the adjacent cell and type the first name of the first few items as you start typing excel will guess what you are doing and automatically suggest that reminding first names this feature is called flash fill now when you see this kind of highlighted values like in dal color just press enter and excel will automatically do the extraction for you you can undo this if you are not happy especially if you would like to tweak the way this needs to happen for example i'm going to show you how the same can be done for last name now i'll say f-a-u-g-h-n-y funny cross white and as i'm doing it already showed me the last name as well but if there are some middle names somewhere you can go and tweak that and excel will pick up the pattern and show this you can also if the suggestion doesn't come up type the value and press ctrl e to trigger the flash fill you can find more about flash fill from the data section data ribbon and flash fill button so that's number one number two number to percentage let's say you have some values in a range of cells and you would like to turn all of these into percentages you copied this from another system or from a web page and they are coming up as 7278 instead you would like them to be 72 percent 78 percent here is a very very simple way to do that go to a blank cell and type 100 there right and then select that copy ctrl c select your original range of values and then right click paste special and from here you can say these numbers need to be divided by that 100 so in the paste special area just click on divide operation what this does is it takes the 100 and divides them with all of those and you will instantly get the values as fraction now at this point if you apply percentage formatting you will get the percentage thing done but this requires two steps let's show another technique you can go to a cell and type one percent in there copy that one percent and then select these paste special and do a multiplication instead this will apply the percentage format number three remove blank rows let us say you have copied some data from another system or you have done an import and you have noticed that some of the values are blank and wherever there is a blank value you want to remove that entire row because it's irrelevant to data for your analysis you can manually select and delete them but this is going to take a while especially if you have multiple columns and hundreds of rows here is a dead simple way select all your data and now press f5 function key 5 key this opens the go to box and from here click on special now using the go to special area you can select items that are meeting a specific criteria in our case we would like to select all the blank cells so click on blanks and ok now what excel will do is in the highlighted range it will find all the blank cells and it will select them for you right once you have selected press the shortcut control minus this is for deleting cells and wherever there is a blank cell we would like to remove the entire row so just say entire row and click ok your data is cleaned up all the blanks are gone number four find duplicates let us say you have a set of invoice numbers that are provided by ross and another set of invoice numbers provided by monica and you now want to see if there are any duplicates between both lists all you have to do is select the first list and then you can hold down the control key select another list now that you have highlighted both lists just go to home conditional formatting highlight cell rules and click on duplicate values this will instantly show you all the invoice numbers that are appearing in both lists in a different color a great way to find any duplicates between two lists number five clean up formats let us say you have some data that you have downloaded or a colleague provided to you and all the information is there but the format is all messed up you don't have to go and manually clean any of this up you can simply select all your data and from the home rib bun there is a clear button just click on that little down arrow and say clear formats this will instantly tidy up your data and set it back to the original raw formatting from here you can format it in the way that you like now time for our first bonus trick let us say you have some data where there is inconsistent spacing so here you could see that bar funny has an extra space at the beginning likewise denison cross white has an extra space at the end as well as at the space in the beginning and this person has some inconsistent spacing this sort of problem usually happens when you copy paste data into excel from another system or a web page here is a very simple trick to clean this up you can use the beautiful trim function and select the value and close the bracket what trim function does is it will remove any spaces from the end as well as beginning and if there is any extra spaces in the middle more than one space that will also be removed once you have that formula just drag it down or go to the corner of the first cell and double click and it will automatically clean up the values and give you data without those spaces that's the first bonus trick number six unstack data imagine you have copied or someone provided you some data where the data is not in the proper order so you have addresses where the street address city and state are in different cells and they go like this and this kind of format is called stack data because everything sits one on top of another and you would like to unstack it so that you get it like in this format you may think this requires a bit of complex formulas or anything but this is really simple all you have to do is note down how many values are there in each set so we have three values straight city and state and set up your structure accordingly so here i have three values so i have got state city and street as three values now go to street and say equal to and point to that because that's my street address enter go to city say equal to and then select the city go to state select west virginia and then now all of these are there you simply just drag this down you will get everything filled like this so once this is there you select all of this ctrl c copy and then right click and paste them as values so now what we did is we have taken the values and we have kind of staggered them here but there is no more formulas it's just values once this step is done while keeping it selected use our delete blank cells trick from the earlier so press f5 special blanks and click ok now in the earlier trick we used control minus and then we said entire row but this time we will say shift cells up because if i delete entire row i will use i will lose this data as well so we'll say shift cells up this will clean up the data and set it up in the correct order for us in an unstacked format number seven duplicate combinations imagine you have some data where name and department or three or four columns are there and there is some duplicate combinations going on so for example you could see that bar funny in procurement has appeared twice and you would like to remove one such combination you can use just select all of this data go to data and you can use the remove duplicates option to remove duplicate combinations too this is not something many people know so let me demonstrate that select all of your data click on remote duplicates and from here keep both options selected this means you are looking for a combination of name and department and click ok it will remove all the duplicate combinations and it will keep values as they are number eight remove yellow cells this is again a favorite especially in office situations where you have a shared spreadsheet or a file that multiple people access to occasionally people would like to go and put some yellow color or some other formatting on it to differentiate values so here we have our data with a lot of yellow highlighted cells select all your data and press ctrl f to find from here instead of finding any single value we want to find all yellow color so click on the format button and go to the fill section and select the yellow color now what we are asking excel to do is find all the cells that have yellow color as their fill color in them and then click on find all this will identify all the six cells so it will say six cells find or how many our yellow color cells are there in your data now select the first one hold down the shift key and select the last one this way you are now able to select all the six cells so once you have selected you can click on the excel window and press ctrl minus which is used to delete cells and simply say delete the entire row click ok and your yellow cells are gone number nine extract numbers from text here i have some expenditure items pulled out from a credit card statement now the expenditure dollar amount is there in the detail but it is not usually at the end sometimes it is at the end sometimes it's in the middle and we want to just extract the number portion this technique works very well if you have only one number in the value if you have multiple items then it can it won't work so go to the expenses section and simply type the expenditure for the first few items so 49.08 and you can see that as i was typing 49.08 um excel flash fill kind of guessed what i'm trying to do so it extracted but this was not right so you can see that 44.9 to spend it extracted that so what flash will at this point is trying to do is it is thinking oh you want the number i will get you number and from there on everything so it's giving me that so i can go to the third cell and then i can simply edit that criteria and once i teach that new behavior flash field will then go and extract what i want so this is how i can extract numbers from text again a very powerful and amazingly simple technique the last one number 10 file name from full path this works for website addresses or anything email addresses or whatever may be the case so here i have some file names with their detailed path so i have d column something something something something and then the file name sometimes there is three or four folders sometimes there are five six folders and all i want is just the last bit the file name you can use find replace to do this do you believe that let me show you select all your data and press ctrl h to open up the find replace now if you already have that kind of yellow colored formatting there you want to first clear the formatting conditions and what we want to find is we want to find anything that has a path in it so i'm going to find star slash okay and then i want to replace it with nothing what this does is star is a wild card and what that does is excel will try to find anything followed by a backslash right so it's gonna find all of that things and it will replace them with nothing and when you say replace all see the magic happen it will instantly replace the entire full path and just leave the file name there so this is what it is now from here if you would just like to extract only the file name without the extension the extension is something that appears after the dot you can again use the same technique we can say i want to delete anything after a dot so dot star replace r and then that will remove the file name the the dot thing as well and then leave the file name keep in mind that if your file name also has some dots in it then it's gonna mess up but otherwise it will work beautifully now time for our bonus two technique numbers formatted as text this is again a common big painful problem that you see in dirty data let us say you've got some values here they look like numbers but they're not really numbers because when i select the number and select a bunch of them i only see the count i'm not seeing the sum here so excel is thinking these are text values here is a dead simple way to convert those into numbers just select any blank cell copy that and select your values right click paste special and from here just add the blank cell to these values now by definition a blank cell in excel has zero value so when you add 0 to a text excel will see oh you're trying to add 0 let me see if i can convert this into a number so it will convert and then it will add the 0 which won't change the value but now you have your values converted into numbers instead of text so there you go the 10 powerful and easy ways to clean dirty data in excel thank you so much for watching this video i hope you enjoyed it if you liked this video please give it a thumbs up and share it with some of your friends my name is chandu and my mission is to make you awesome in excel see you again in another video bye
Info
Channel: Chandoo
Views: 213,201
Rating: undefined out of 5
Keywords: data cleaning in excel, data cleaning in excell, data cleaning in excel course, advanced data cleaning in excel, cleaning survey data in excel, cleaning up messy data in excel for tableau, Using flash fill in excel to clean data, remove spaces from text in excel, trim formula example, data clean up excel, how to clean up raw data in excel, remove spaces from text in excel formula, data cleaning, chandoo, excel tricks, excel
Id: SNVjndgWBlw
Channel Id: undefined
Length: 17min 2sec (1022 seconds)
Published: Wed Mar 10 2021
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.