Introduction to Data Analysis with Excel: 2-Hour Training Tutorial

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
subscribe and click the Bell icon to turn on notifications we've made the accompanying exercise files for this tutorial available for free just click the link below in the video details to get these hello and a very warm welcome to the data analytics in Excel course I am Omar manzer and I have over 14 years of experience in organizations all over the world throughout my career I have been involved in data analytics and financial management processes for companies in various sectors spanning from software as a service or SAS for short to banking I am currently working as a full-time freelancer and my clients range from startups to large multinational corporations in my experience when someone mentions data analytics people start to think of complicated Concepts that they will not be able to grasp I am here to demystify the topic of data analytics and show you that you do not need to have a PhD to analyze data I have designed this course around the most fundamental concepts of excel if you lead an organization or work to support the decision-making process of organizations you are often faced with overwhelming amounts of data you need to process that data to decide what should be done next the data needs to be processed using spreadsheet software such as Microsoft Excel think of Microsoft Excel as a tool that can solve your data analysis varies with that promise in mind let's get started but first let's discuss what data analytics actually means data is pervasive it's collected every time someone buys something takes a flight clicks on an ad or likes a social media post so organizations have never had more access to it but having access to data isn't enough for a business to be successful it also needs employees who know how to handle analyze and interpret the data welcome to the world of big data in this world of Big Data basic data literacy which means being able to analyze interpret and even question data is a skill that is becoming more and more valuable this is where data analytics comes in data analytics helps us question and analyze data through the application of data analytics you can learn about your customers and predict future Trends there are a lot of expensive software packages in the market some of you might be surprised to hear that good old Excel is very effective in analyzing data as well if you've spend countless hours trying to make sense of data without making any significant progress you have come to the right place this course will get you started on understanding Excel features and help you in applying these features to easily analyze data I promise this is not a long list in order to succeed in this course you need to have a basic knowledge of mathematics and statistics you need to have a basic understanding of spreadsheets having a recent version of excel will be beneficial Excel 365 works best and lastly patience to go through the material and a desire to learn will go a long way this is how the course is structured in section one we will be discussing what data analytics is and I will give you a refresher of the Excel layout in section 2 we will have a look at how to use Excel spreadsheets to analyze data we will discuss a number of Concepts such as data entry formatting data types formula Basics and Excel functions Section 3 will explore the importance of cleaning and managing data we will discuss why data quality is important and how do we manage data in consistencies in section 4 we will discuss the various useful functions that Excel has to offer we will go through how to sort and filter data tackle complicated data sets work with pivot tables and Excel charts finally in Section 5 we will discuss the what-if analysis scenario management and several useful functions of a very useful Excel add-in called analysis tool pack however please note that the analysis tool pack has several advanced Advanced statistical tools such as Anova and t-tests which are beyond the scope of this course please remember to make use of the exercises in this course we have one exercise for sections two three four and five please also make use of the Excel workbooks that come with this course you can use the unsolved workbook to work alongside me as we go through the modules and you can check your Solutions against the solved workbook that is provided with this course now that we have discussed the course requirements and the course content it's time to get started I look forward to our association throughout this course let's quickly review each of the parts that make up Excel starting off with the ribbon all Microsoft products use the ribbon a tabbed interface albeit the tabs vary depending on the product the tasks that are utilized the most frequently will be on the Home tab depending on your demands you can modify your search it's much simpler to come up here to the search box and type the task you're trying to do if you can't find what you're looking for in any of these tabs for instance merge and center cells if you type in merge and center these are some of the options and the suggested actions that I get the ribbons icons and actions are also included here you will also notice that there are quite a few arrows all over the ribbon these become more apparent As you move through the ribbon when you click on every arrow you can see the various options that you get here are various additional arrows and actions I can do some of these we will be discussing later the only tab in this tabbed interface that is not used is the file Tab and when I click on it I'm brought to a page where I can make a brand new blank workbook or file I can either save the one I'm working on open an existing one by selecting the left pointing Arrow at the top of the screen you can go back to excel let's navigate to the screen's Bottom by scrolling down although this file might include more than one worksheet I can only see the one that is currently chosen we will also discuss that later I can zoom out or I can zoom in making the font smaller or larger let's move to the screen center I can navigate through these rows by scrolling all the way down the rows are all numerical while the columns at the very top are all alphabetical any of these cells can be clicked we'll go through data types and talk about why text alignment can alternate between left and right the entire text can be formatted and we'll go over how to do that additionally by clicking on the column header for a column or a number for a row you can highlight the entirety of either a column or a row in order to make something broad or narrow you can easily adjust the width for both columns and rows for example if you want to make column b a little Broad you just move the mouse over and it makes it broader the same process can be done for rows if you want to undo something that you have mistakenly done you can click on the back arrow on the top left corner of the screen you will regularly use excel's undo tool The Last Action you took will be reversed if you click on it you can click on it repeatedly or reverse your actions endlessly the number of activities you can undo is up to 100. in truth you can undo something if you unintentionally undo something by clicking the redo button which is the arrow underneath this button so there you have it the basics of excel navigation you may typically use your daily tasks by adhering to the home ribbon tab everything from formatting to alignment to numbers to cells will be covered in this tab let's now discuss how to create open and save a file even though this new blank file is open I can create a new one at any time I can always choose file then new blank workbook in the top left corner of the screen by tapping more templates I may view a list of templates if I don't want to create a brand new blank one I can view all of Microsoft's templates if I scroll down examples include Fitness Vision personal budget volunteer assignments retirement planner there is a lot of variety when it comes to excel templates by selecting the button labeled search for online templates I can hunt for a template if I can't find one I can always make a blank file by clicking the enormous new blank workbook icon if I still can't find what I'm looking for I can now start entering text into the cell because it will open a new empty file by pressing the enter key or tab keys I can switch to a different row or column I'll return to file and choose save as when I'm ready to save the file clicking browse will open the PC's browse dialog where I may store it to a location like documents if I wish to save it locally on my computer at any point I may open a file by going back to the file menu open by selecting a push pin icon I can move an item that I use regularly to the top of this list as a result I can now find any file I need by going to the pinned tab and double clicking on it I've started a new blank file and I'd like to discuss the structure of an Excel file before entering some text the first thing I'll do is zoom in so that you see what I'm doing a little better and I'll do that by clicking the plus sign all the way down to the bottom right of the screen you've probably heard me refer to this as an Excel file and you probably will but it's important to note that the technical term for what this is is an Excel workbook it's a workbook comprised of sheets or worksheets this is sheet 2. as you can see over here I can enter my data and add multiple sheets here when I click a new sheet we have a completely separate worksheet within this workbook which is the Excel file I can refer to these sheets in a variety of ways I can use formulas on one that refer to the cell on another sheet I can use all of them interchangeably and we'll go over the details a little later although you can call this file or a workbook the official term is an Excel workbook so now that we've opened our file let's begin populating it with some basic text and filling out some columns and rows I'm going to enter some product data for a fictional company called XYZ 360. to begin I could place my cursor in any cell by directly clicking on it and if I look I'm currently in column A Row 1 A1 is the cell number as you can see over here if I ever get lost and don't know where I am I can see that this is cell A1 all the way on the top left hand side is the screen in the name box for the time being I'll place my cursor in cell A5 and begin typing I'll press enter to move the cursor to the next row and then I can resume typing GPS trackers let me just press enter I add cameras and SIM cards okay so here we have some products now the data in column A is taking up some space in column B column B does not contain the data it's simply too long and I can't see what's in column B we'll go over how to resize these in more detail later but for now I'm going to hover my mouse between columns A and B and Slide the mouse over I'll move my cursor to cell B5 and begin typing instead of pressing the enter key which would take me to the next row I'm going to press the Tab Key which will take me to the next column and now I have some information over here this was a quick refresher module as we discussed the axle ribbon creating opening and saving Excel files we also entered some text in an Excel workbook hope you found it helpful since 2019 Microsoft has continued to expand on the data types you can utilize to get real-time information you can access the data types option by going under the data Tab and hitting the more button and you get a list of options now currently Microsoft is offering stocks currencies and geography there are a number of other data types as well such as locations medical Space Yoga universities and so on now how exactly do we link this data type to the internet to get real-time information so I have a few subsets of data over here where we have a list of countries list of companies and the list of cities so let's see how we can get real-time information on these data sets let me highlight the countries first and if I go in the data tab I select the data type type as automatic let's see what happens Excel is working its magic and now you see an icon before each and every country and if I select the icon for Denmark I get this information which includes the full name the nationality internet code and so on I also get information about the geography what's the capital city you know I get information around demographics just with the click of a button I get economic information as well and there are a number of other pieces of information I can gather with the click of a button now let me select the entire data set again and let me click the insert data button that appears after selecting the list I have a number of options available let's say I'm interested in getting the country code of each and every country I'll hit this button and I get the country codes so let me call this column country code if I select the list again and this time I'm interested in finding the flags for every country I get the flags just like that and let's try one other option let's say I'm interested in getting the GDP of every country and I get the GDP of each and every country as well so rather than doing the search for each and every country separately on the internet you can just link the data to the internet and get real-time information similarly we have a list of companies over here so if I go back in my data tab go to the data types and I select the stock option over here it's going to convert my data set into stocks let's see what happens so it was able to associate every company to a stock symbol other than Ford so let me select Ford manually over here and now I have a ticker symbol for each and every company let's see the options that I get for stock data I get the percentage change I get the number of employees let's say I'm interested in finding the number of employees for every company so I'll just hit this button it takes a bit of time but then it populates Alibaba has 245 700 employees in our data set let's look at one other option let's try it again so if I go back in my insert data option let's say I'm interested in finding the percentage change in stock price over the past one day for our data set I'll just remove the field error on this cell because there is no Associated stock name so I get the percentage change for this data set let me add a another column here to have some partition between the data sets my last data set is listing some cities so I'll select this data set I'll go in the data types option and I'll select this as cities Excel is smart enough to identify that there are a number of cities named London in the world so Excel is asking me which city am I interested in let me select London the capital of United Kingdom same with Washington I'll just select the first one over here Paris I'm interested in the capital of France I'm interested in the German Berlin so I get all these cities let's see what options I get for City Series so I am interested in the city population so I get the population over here London has over 8.6 million inhabitants followed by Berlin with 3.5 million let's try something else let's say this time I'm interested in finding the nicknames of each City I'll hit the nicknames button so this is the nicknames column this time around Excel was not able to find a nickname for Budapest and Washington so let me just remove that that's okay seems like they don't have a nickname but all the other cities so London is known as the Big Smoke Paris is the city of lights and city of love Berlin also has two nicknames Athens on this free and the parted City it's nicknamed departed City because of the Berlin Wall this gives you an idea of how interesting the data types tool is in Excel the recently introduced data types functionality in Excel is a very interesting tool in my opinion it elevates the status of excel from a mere spreadsheet to an information Powerhouse gone are the days when you have to Google for every piece of information you're looking for you can simply get the information in real time using the data types functionality however please note that Excel might not be able to find all the information that you're looking for but it does a really good job in most instances feel free to explore this functionality yourself data entry is one of the most basic things we do in Excel and it can be tedious at times but there are some techniques shortcuts and a feature called autofill that will greatly assist before we begin let's open our Excel workbook vrn module 201 there's a slider bar with a black rectangle in the lower right hand corner you can zoom in or out by holding down the left Mouse button we are only going to use about seven columns as we will create a small little worksheet so let's just zoom in a bit let me enter some information in cell A3 I can use the arrow keys or click on A3 and make that cell the active one and I type Revenue have you noticed it's on the left hand side of the cell unless you type values that is what automatically happens in Excel moving down a column I'll type in costs and then hit enter and profit enter later on I decide to change the word costs to expenses I'll click there we don't need to erase it I'll simply type over it to edit a cell either click on it and then go into the formula bar click in the appropriate spot and make some changes or double click near the entry to do the editing directly in the cell I'm going to enter some numbers the number I'm going to type is 80 and I accidentally typed o instead of 0. have you noticed that on the keyboard they're right next to each other I tab over one three zero correctly typing this one Tab and you can see the difference when entering numbers in Excel they're always right aligned let's just go through this again eight zero if it is not purely numerical or a formula Excel will left align it whereas numbers are always right aligned okay I'm going to enter four numbers and then press the tab as I moved to the right here so I'll type 225 tab three zero zero tab three two five tab 500 and then press enter after the last entry then six more numbers here thank you and then after the last one press enter I'm going to put monthly names across the top so I'll start by typing January and then I tab over now let me give you a time saving tip it's an autofill shortcut when you have a month spelled out or a three letter abbreviation you can point to the fill handle in the lower right hand corner of itself take note of how the mouse pointer changes shape and becomes thinner I'm dragging across while holding down the left Mouse button the little pop-ups below hint at what's to come you will get the remaining months we can drag on until December and then it will start over or we can leave it as is we can also accomplish this in a variety of ways you may begin with any month and may use a three letter abbreviation so let me type in o c t and then I'll point to the lower right hand corner and click and drag you've probably already guessed what we are about to see the same thing happens if you drag right we can go downwards you get used to this feature pretty quickly it also works with the weeks of the year we can also use the letter q and the number one in a variety of ways Q means quarter in financial terms so if I just type in q1 over here and I drag downwards it goes all the way to Q4 remember there are four quarters in a year you can also go right words of course these are frequently used for columns and row headings so these are pretty useful techniques let's look at another example so we have another data set over here if you want to delete some data we can simply place the cursor inside the cell and press the delete key you can also delete entire columns by highlighting The Columns you want to delete and you can hit delete you can clear everything by pressing the delete key if you click over here and you hit the delete key it's going to delete everything when we talk about columns and rows we'll learn how to handle all of that I'm just making sure you understand how to delete data from cells right now you can now copy data from cells as well the simplest method is to use your keyboard shortcut I'll start by copying everything so let me select everything over here and then I'll press the control and c key and then I'll hit the control and Wii key to paste and now I have a completely new copy of everything let's just reverse that another thing you can do is drag data it can be much easier and faster at times I'll select all of my text and if I slowly move my cursor to the right edge of the data set you might notice that the cursor changes to a cross here and all I have to do now is Slide the data around I could put it wherever I want maybe I'll just move it down a few rows I can let go and everything will move I can also slide it over and everything moves again I like to tell you about a specific situation in this data set we have a very simple formula in column e which is just adding these two numbers together for every line when we copy everything it preserves the formula and when we paste it anywhere we want the formula is going to be preserved but if I just copy this column and paste it everything is blank the formula doesn't work because your data will be considered absolute at times and your formulas will be considered relative at others that is they will be locked into where they are in a sheet at times and will move around at other times in these cases we can just paste the value of the formulas and result in because I don't really care about the formula we're just moving data around so I'll copy this data set and I'll paste it here as values if there's a formula in a data set what and if you need to move the data set around it's always best to copy and paste the data as values so if you go in paste special over here you have a paste values feature so you click on that and there we go all right in this module we went through the concepts of viewing and entering data in Excel we also discussed how to copy and delete data autofill and moving data using the Crosshair cursor functionality can help save loads of time especially when you are drowning in work when it comes to formatting the appearance of an Excel spreadsheet there are no rules it depends on the type of data who it is for and how much time you have to devote to formatting it there are numerous reasons let's have a look at a data set that we discussed in the previous module let's copy it across and let's try to format it so what we see over here is we have revenue and expenses for the months from January through June so the first thing that we can do over here is we can type in financial data over here in cell A2 and in the font section you will see that we have a number of options so we can make it bold we can italicize the text and even underline it if you need to the other thing is when we talk about formatting cells it needs to look neat so we can select all the months and we can align it by using the options that we have over here if I click on this option make sure the data is in the center of the cell if I click on this option it's going to align to the left side we can even middle align the data set so it really depends on what works for you and how does the data look on the spreadsheet I usually prefer the central align the data especially when we are looking at financial data we can also make the data bigger or smaller so if I select these line items I can change the font size I can make it smaller as well and we can also change the color of the font we can make it blue or orange or gray so we have a number of options that are disposal we can even color the cells so let's go with this option for now let me share another interesting feature of excel you don't need to format the data by clicking on each and every cell you can format the entire row by selecting the the row number so let's say we want to format row number four we can change the color of the text in this row and we can make it red and for Revenue we can change it to Green so the entire row of data updates based on what you want to do it's also worth spending some time on the merge and center functionality so let me copy and paste this data set over here and instead of typing financial data in cell A8 I want to make sure that it's merged across these highlighted cells because at the end of the day the data underneath is all financial data so what I can do is I can select the cells where I want the financial data to be merged across I go in the Emergency Center option and I click merge across so as you can see this data set is now merged but in order to make it look a little nicer I can Central align it and there you go this looks much better than this it's far clearer let's now talk about the Excel data types and why they are important this is an excellent example because it contains a wide range of data types we have text we have numbers we have dates and we even have some formulas when you enter data into a cell Excel does an excellent job of determining which data type to use for instance if I type in a word let me type my own name u m a r I'll press the enter key and the text Will Remain it's left aligned this indicates that it is simply treating it as a text string however notice when I enter a number I'll press the enter key and the text Will Remain to the right that is how Excel determines whether the field is a number or a value that is a date an integer a Time a currency or any other type of value that isn't a straight text field it's significant because it determines what kind of formula or function you can perform on that cell for example if I go to the formulas ribbon let me click on this cell and go to the formula as ribbon if we have a number of options available if I go in the financial formulas these are all the formulas that I can perform using this data set we can also perform various functions on text such as transforming text directly in a Cell you can perform calculations using date and time data as well but Excel must understand what type of data set it's dealing with and that's why the alignment within a cell helps Excel determine the data set let's look at these dates over here dates can be entered in a variety of ways so let me type in a date for example as you can see it's perfectly centered in fact if I go over to the formula bar I can see that the date has been expanded I have the year I have the month and I have the day Excel will adjust the dates based on the Locale you have selected so let me look at a time as well we can also enter time in a variety of ways let's see I enter 17 55 pm as you can see it's right aligned once more and if I look at the formula bar we see the PM after the time you always have the option of selecting and formatting the data cell that you want so let's highlight this field over here so amount let me go to the home ribbon tab and over here we can see that the formatting is set to General this denotes the number format and currently it's set to General I can easily change it to currency and the entire data set converts to currency automatically I can also change the number of decimals used the currently the data shows two decimal places if I click on this button it will increase the number of decimals and clicking on the decrease decimal button is going to decrease the number of decimals if you look at this column the days since issue there are some formulas in here but these are very simple formulas this cell for example denotes the number of days since the ticket number was issued it's a simple formula that subtracts today's date from the value in this cell when the ticket number was issued and these are the types of calculations that can be performed when the type is properly set so keep an eye out in the back of your mind for anything that doesn't seem right for example I just typed in time but it's left aligned so it seems that Excel has not recognized that I was putting in time however if I put in a space over here it's going to be right aligned again Excel recognizes that it is indeed time and adjusts accordingly we have a number of options available to format data and ensure that it's readable we can also save valuable time if we understand the different data types and format accordingly formula writing is one of the most common things that Excel users do now most formulas are clear easy to understand and simple to use but they begin with an idea that might be new to you on this sheet we want to figure out how much money we made so in January if we subtract the expenses from the revenue we're going to get the profit so it's going to be 77 minus 47 and we get 30. in Excel if we have a look at the formula all formulas start with an equal sign we can and sometimes do use external numbers but most of the time we use addresses in effect we want to say that whatever is in B4 should be subtracted from B3 so we write the formula and this is one of the three main ways to write the formula we type the location of the first cell B3 minus B4 we press enter or Tab and we have our answer another way to type this so let me just delete it is to use the arrow keys to get to the cell you want this could be a better way to write a formula but the cells have to be close together so let's see how this works so let's press equal then we use the arrow key we go up to B3 then I press minus and we use the arrow key again to go up to B4 press enter we get the same answer 30. and the Third Way which might take two hands is firstly to type equal then we click on Cell B3 type minus then click on Cell B4 press enter we get the same solution in all three cases no matter how you write the formula you should learn to keep an eye on what we call the formula bar this is the formula bar over here you can see it at the top of the screen in a way the cell doesn't really have a number it has a formula instead we keep an eye on the formula bar much like we keep an eye on a rear view mirror when we are driving we look at it often now we need a total in cell and three if everything we knew about the formula came from what we had seen before we would do something like this equal let's click on B3 then add C3 then add D3 and so on but it's not the best way to do it think about trying to do that for the entire year or if you had a lot more Columns of data simply press Escape has hundreds of functions that come with it a lot of them exist you can find them this way on the ribbon's formula tab so we have some Financial functions some logical functions Text data and time the options are virtually endless most people probably use some the most when they use Excel and just like a formula this starts with the word equal and then we type sum s u m and there is no need to capitalize it then we have left parenthesis and then we select the cells in question by dragging from left to right so I want the sum of these columns over here all the cells from B3 are now included then we press enter and just like that the answer comes up in cell O3 we need to figure out the average that is of course the total number divided by 12. as we have 12 months over here we need to add up all of these but the addition has already been done right here so the formula in the average column is just equal then we click on this cell divided by 12 and press enter please note that the division sign is the slash which looks like this and not the one that goes the other way now when we have formulas set up like this if I change a number for example let me make this 87 notice how the profit and the total and average changes have a look at it again let me change the number to 97 the numbers in these columns change and that's how you can start to see how Dynamic these formulas are and how we can change them depending on the situation also if I go back to cell B4 let me change something else over here so the 47 let me make that 57 the number changes we don't have anything on the right yet so it won't change anything over here but sometimes we see negatives and that's what happens when you see a formula in Excel you may want to duplicate it in adjacent cells either across a row are down a column so there is a formula in cell B5 if I double click on it we'll see what's happening and popping a formula in Excel it's important to remember that you're copying a formula relatively so let me just copy the formula from here so I'll press Ctrl C formula is copied and let me just paste it over here to see TRL we I paste the formula by pressing Ctrl and we on my keyboard so let's see what happened in the formula so I have moved one column to the right and the column reference has changed from column B to column C if I keep copying the formula across the references will be changed to column d e f and so on you see if you copy a formula it is adjusted relative to the original otherwise you would be repeating the exact same formula which does not quite help us and of course the benefit is we don't want to type a formula here then hair then hair imagine if you have to do this for every day in a year that would be ridiculous and time consuming and we can simply copy a formula by dragging and entering it from the lower right corner this is referred to as the fill handle as I drag to the right from the fill handle you'll notice that these values are correct right away so in May it's 21 minus 28 which is minus seven so one by one we've got the right formulas and if we had even more columns you can imagine how time efficient and better it would be by just copying the formula across using the fill hand Loop similarly now in column n we need to calculate the totals so if you look at column N3 we are summing all 12 months so the formula is working on the 12 cells to its left so if we just drag this formula across we are mimicking this relationship or the other two rows as well so for expenses the formula over here is working on the 12 columns to the left and same for the profit Row the basic concept is the same but I am moving down a column similarly the average function is the cell to its left divided by 12. you want to do it again for the other two rows so I'm going to go on the cell wait for the fill handle and I'm going to drag it downward using my mouse if you remember the worksheet originally began with very few formulas and we have added more formulas by dragging the fill handle downwards are across the list of numbers on the screen appears to be a random collection of numbers at first glance the rest are all formulas and not numbers you'll need to copy formulas as you work with Excel as demonstrated in the examples in this video in this module we have seen how to use the mouse and keyboard in different ways to make simple formulas once you have created a formula copying and pasting it enables you to save a lot of time especially when you're dealing with a very large data set once you have created a formula dragging it using the fill handle enables you to save a lot of time especially when you're dealing with a very large data set you should you can use excel's most common functions sum and average to add up numbers in a column or a row or find the average of those numbers now you can use these functions with the help of a menu button or shortcut keys we'll be exploring that further what we see over here is a worksheet with numbers let's say Intel N3 we want to add the revenue from January through December so how do we go about doing that you may have seen in a previous module that we can start by typing equal s-u-m left parenthesis highlight the data and so on but there is a better way a faster way or even two faster ways in fact firstly you can find the autism button in the editing group on the right hand side of the Home tab when you click on this button Excel looks up for data and then it looks leftward no numbers show up above it but they do show up to the left when we click this button Excel tells us what it's going to do and we press enter after you've used it a few times you feel more comfortable with it and be able to speed it up a bit we can double click on autosum to do this as well so let's see how that works let me delete this click click and there we go so what did we get out of this as a reminder we can see what it did if I double click this cell you can also see it in the formula bar in cell M3 so it's summing B3 to M3 we can apply the same concept over here in cell B10 I click autosum twice now let's say we we want to add these cells over here so I can highlight the data over here and press alt and equal from my keyboard and that's another quick way of adding the numbers up let's do it again so I delete this I highlight the cells that I want to add and I press alt and equal and the sum of these four numbers is 954 that we see over here if we go on the autosum button again you will see that there are various options we have some we have average count numbers Max Min and there are some more functions as well so if we change this to average you will see that automatically Excel can now calculate averages so the average of these four cells is 239 let's apply the same approach over here I select average and then I press enter and there we go the average is 79.5 for these 12 numbers on the left hand side there's one more thing that I would like to tell you you should never type sum in a Cell why is that well I mean there's nothing wrong with it it's okay to type sum but when you press alt equal in Excel it automatically types sum for you and you can change the ranges based on what you want to do so if you want to add these three numbers together by pressing alt and equal I get sum and I just change the range so why type sum when you can just press alt equal and Excel does it for you so these are useful tools for working with formulas in Excel there's no question that most Excel users use the sum and average functions a lot let's explore another commonly used function in Excel if we scroll downwards we'll see that there is another data set where we have the employee first name employee ID Department and the breakdown of the total salary by month if you're just starting out with Excel you might not need it right away but you probably will at some point it is called vlookup it works with another function called hlookup and as of August 2019 vlookup and hlookup will be replaced by a new function called xlookup and the whole point of this kind of function is to let you gather information that is related you may be working on more than one worksheet at the same time as you do this in this example over here we have some employee data with their total salaries and you can imagine it to be a very large data set with thousands of rows of data and what we need to do is to to find the first names based on the employee IDs so if we use the X lookup function it will make things much easier in addition to combining the features of vlookup and hlookup xlookup makes some new improvements and gets around some of the problems that we have had with the other two I'm going to type equal x lookup left parenthesis and if you have never done anything but sum and average you might be surprised by what you find if you type in the name of the function and add a left parenthesis you might get a hint about what you need to do and if you want some extra features that are between the brackets when you type in xlookup basically what you're trying to do is I have a value like the one over here so I'll click on the lookup value then I need to identify a lookup array so that that will be the employee ID column so I'll select all of it and then I will type in a return array so if we want to find the first name that's in column A so I'll select the entire column A where you have all these rows of data and that's it right parenthesis enter so the first name of the employee with ID 12 is fluorian so we can drag down this formula and this is what we get what happened here we are getting an error and why would that be if we look closely at the formula you see that these ranges are Dynamic so the ranges are changing as you drag the formula downwards if in cell t23 we had B22 to b98 by scrolling downwards we are at b23 and b99 and if we click on the formula again the range has an empty row that it's looking into which is not correct in a nutshell you want to make these ranges static so what we can do is if we go back to the original formula in cell t23 we click on the Range and we press F4 from our keyboard so you'll have these dollar signs you do the same over here press F4 and the same for this range and the next range and there we go now drag the formula down and you don't get any errors so ID number 12 is fluorine we can do a quick check yeah that's correct that's fluorine ID number 12 Sanchez ID number six that's Sanchez ID number six so we have easily populated the first names by looking up the IDS now let's do another example together so we need to find out the total salaries and we just have the IDS so I'll just start by typing in equals x lookup left parenthesis then I'll click on the lookup value then the lookup array which is this column over here I'll select the whole array and then I need to find the return array which is the total salary so I'll select the total salary right parenthesis enter let's make these ranges static by clicking F4 and press enter again and I'll just drag the formula down so these are the total salaries of the employees with these IDs in this module we discussed some more time saving Excel techniques we explored the autosum and ALT plus equal keyboard shortcuts to sum a set of numbers automatically we also explored xlookup which is a far more powerful function of excel when compared with the more commonly used hlookup and vlookup functions values like interest rates or other percentages can be put right into a formula but almost always it's better to get a value from a cell in your worksheet putting a value in a Cell instead of putting it in a formula makes the value visible and lets you change it without having to change the formula so in this module I'll show you how to use cell references we have a spreadsheet over here with monthly revenues from January through December for fiscal years 2020 and 2021 and we need to fill in the revenue numbers based on some requirements that have been laid out let's say I want to show the value of a particular cell in this case we need to find how much money was made in January 2021 so when I look at my list I can tell that the value is in cell C4 I will go back to cell F4 and type my formula there in this case you only need to do three things the first thing I need to do is type an equal sign to show that I'm making a formula then I'll click on 258 421 as a reference in cell C4 and press enter this is the value in cell C4 and that's what we need if I want to use more than one value I can choose the cells let's say I want to find the total for quarter four that's between October and December in F5 I'll type an equal sign then sum which stands for the function sum followed by a left parenthesis and then I simply go in my data set and select the cells relevant for Q4 that is October through December and hit enter if you notice that the cell closely there's also an error indicator because I didn't include all of the values around the ones I chose but you don't need to worry about that I can do the same thing with q1 and Q4 I'll begin by typing equals s-u-m left parenthesis I select q1 that is January through March right parenthesis so I've selected cells B4 to B6 I can type in a comma and then I'll select the Q4 months that is October through December write parenthesis enter if you look at the formula closely you can see that I have two ranges each are highlighted in a different color and separated by a comma to calculate the total revenues for the entire 2020 let me show you a quick way of doing this I'll begin by typing in equals sum left parenthesis I'll select cell B4 and I'll press Ctrl shift and down arrow to quickly select the entire data set enter and the total revenue for 2020 was in excess of 4.2 million this technique saves time because you don't have to drag the formula using your mouse last but not the least I can also choose values from a horizontal range let's say I wanted to know how much revenue have we made in the month of September for 2020 and 2021 so I'll click on the empty cell over here and I'll start by typing equals sum left parenthesis I'll click on Cell B12 which is the month of September for 2020 and I'll drag this across enter let's now discuss what happens when you copy formulas and show you how to use relative and absolute references to keep track of what changes and what doesn't you got a glimpse of that in the previous module and we'll discuss it some more in this module as it's really important if we scroll down below we have some head count related information for a global Corporation this workbook tells me that this Corporation had 21 020 employees the projected growth rate is three percent from 2016 till 2024 so what would be the head count numbers from 2016 through 2024 so I want to make that formula I will start by clicking on Cell B25 I'll type in the equal sign I'll select my starting head count amount which is 21 020 in B24 and I'll put in an extra three percent on top of the initial amount so I'll do plus I'll type B24 again multiplied by the growth rate of three percent right parenthesis enter so in 2016 the head count number would be 21 651. let's say I want to drag the formula in cell B25 till cell b33 to do this I will click on cell B25 and then grab the fill handle at the bottom right corner of that cell it looks like a small square I will just drag down the formula to copy it and I get something I did not expect if there are a lot of number signs there is a mistake in this case the value won't fit into the width of column B as it is now I know that's not right because the numbers in cell b26 till b33 should be around the 22 000 mark because the formula should simply add a three percent growth rate on the ending value of the previous cell so if I click on Cell b26 let's see what's happening over here we have the starting value which is correct I want to add the growth rate on top of the starting value so if I see what's happening here it takes B25 again which is fine and it's multiplying it by the original head count rather than the growth rate so it seems by dragging the formula down the ranges have changed so how can we fix that we first correct the formula by selecting the growth rate and I'll hit F4 using my keyboard and you'll see some dollar signs appearing which makes this range static I'll press enter and this looks much better now the head count in 2017 is 22 300. so by hitting F4 the ranges are absolute but there's something else I wanted to show you if I hit F4 again there's just one dollar sign before row number 23 so this makes the row static if I hit it once more there's a dollar sign before column B so this is making the column static if I press it once more the dollar signs disappear and the range is not static anymore so in a nutshell you have a lot of flexibility here but you need to be sure what you want to do do you want to make the column static or the role for now let's make both the row and the column static so I'll hit F4 again and there we go we are back to 22 300 and now if I drag the formula down till cell b33 the numbers look much better and the formulas are working and by just glancing at the data you will note that the formula is indeed correct and it's working in this module we took a deep dive into the absolute and relative references remember in Excel it's not just about typing in formulas it's about working with addresses when we talk about formulas you can save a lot of time using the F4 button on your keyboard especially if you're working with large data sets foreign exercise this exercise will have two parts in the first part we need to identify the employee ID and the department from table one in which we have the employee ID department and the employee name and we need to identify the ID and the department using xlookup that's the first part of the question in the second part we have the initial investment amount and we have the investment rate and we see that the formula has been applied to cell B52 and b53 but then from cell b54 onwards there seems to be an error so we need to look into the formula and correct it please pause this video if you do not want to watch the answer right away starting off with the first question let's use the employee name to determine the ID and the department we just need to know how the X lookup formula works so we'll Begin by typing equals x lookup we're going to select the employee name as the lookup value then for the lookup array we are going to select the column with the employee name in our table and for ID the return array should be the employee ID column which is column A we're going to close the brackets and we're going to hit enter and there we go before I drag this formula down I need to lock the cell references and I can do that by pressing the F4 key on my keyboard so C7 is locked c42 is locked same for A7 and a42 I'll hit enter again and now I can easily drag it down so the employee IDs have been filled now for the Department I need to apply the X lookup formula again so I'll type X lookup I'll select the lookup value as the employee name I'll select the lookup array as the employee name column then for the return array I'll select the department column and there we go right parenthesis enter before dragging the formula down I'll lock the cell references again so I'll press F4 for the cells I want to be locked I'll drag it down and now I have the ID and Department per employee name for the second question let's first have a closer look at the formula so in 2016 to calculate the final investment amount for 2016 I have the initial investment plus then I have the initial investment again times the investment rate so it looks good in 2017 again I have the investment amount in 2016 plus the investment amount in 2016 times the investment rate that's fine as well so what's happening in 2018 onwards I have the initial investment amount for 2017. which is correct then I have the investment amount for 2017 again which is correct and then instead of multiplying it by the investment rate which is in cell b50 the formula is multiplying it by the initial investment of 1.15 million which is not correct so I need to fix this part of the formula instead of the formula referring to cell b51 I needed to refer to cell b50 so I'll fix this part of the formula P51 becomes b50 that's fine now and then I can drag this formula down and fix the rest of the cells but before I do that I need to lock cell b50s while hit F4 and that's it so the investment amount in 2018 is 1.256 million which makes sense and I can easily drag the formula down and there we go the error has now been corrected for the next section you'll want to download the course exercise files click the link below in the video description to get these you can also scroll through the details to find timestamps for each section in this course if you're enjoying this training please leave us a comment before you start working with lists in Excel you should be aware of a few guidelines that will help you become more proficient as you consider using sorting filtering subtotaling and possibly pivot tables in order to utilize the entire Excel toolkit at your disposal your data needs to meet certain quality standards now in this workbook I have the data over here I can use some of it it can be printed there's a lot that can be done however if I try to sort or filter the data as it currently appears a number of features will not work properly let's have a closer look at this data set in column A we have the full name then we have the office phone number in column C so it seems that column B is hidden the date of birth we have the age then column f is empty we have customer sins in column G which tells us how long has this person been a customer for our company and then we have the contract value down to three decimal places to begin with cleaning the data you might notice that row 12 and row 23 are empty again with such a small list there will be no major issues but as an example suppose the list is much longer and there is an empty row every 100 or 300 rows what if you click in the middle of the data and say to yourself I want to sort the data we can immediately see what can go wrong let me show you what I mean under the data tab I go in the sort option and see what happens Excel has selected the data set from column A through e as column f is empty so if you sort this data only columns a through e are going to be sorted leaving columns G to H static so that is certainly going to lead to inaccuracies in our data analysis so first of all we need to delete empty columns and empty rows wherever they appear so starting off with column f let me just delete that then let's delete row 12. I select right click and I hit the delete option here follow the same approach for row 22 right click hit the delete option now the data set does not have any empty rows or empty columns however one of the columns is hidden so let me unhide this I take my cursor where the column is hidden and then when it changes shape I unhide it by increasing the width of column B so after unhiding column B I notice that there are some notes in here so if you want to keep the notes as is that's fine but there is another node with no Associated data in the line that's row 25 so let me just delete that and now let's proceed to sort the data again if I press the sort button over here it's going to select the entire data set so this is also a useful tool that can help us determine if Excel is taking the entire data set into consideration and as you can see over here the entire data set has been selected however looking at this closely you might notice that the titles have also been selected looking closely at the titles they do appear a little out of the ordinary that is you have these titles that are on two lines you have full and then name you have office and then phone you have date and then off-birth this is not a good idea if you begin working with this data and possibly pivot tables or if you are about to insert subtitles Excel may fail to recognize that the titles are in two rows so we need to put our titles on a single line in order to consolidate the titles on a single line let me show you a simple Excel shortcut if I go on cell A3 I click on it and after the L are full I press alt and the enter key and then I type in name as you can see by using the alt and Enter key I've made sure that the full name is in one cell rather than this being in two cells and I can just drag it down I can press OK here it's been dragged down so I have full name I can do the same thing for office phone so clicking on the office cell pressing alt and enter and typing in phone let me just drag this down as well press OK we can do the same thing for date of birth date alt and enter offs births press enter let's drag it down repeat the process for the other two columns age can be dragged down as is the nodes can be dragged down as well and we can delete Row 3 now as it's blank so we have now cleaned up the titles let me share a few more Excel tips with you if we take a closer look at the full name column we may notice that we are limited in our capability especially if we want to sort the names by last name or by the first name so how do we clean that up we can enter a new column by clicking the insert button and then over here I'll just use the same title full name and then let's say I want the last name to come first then a comma and then the first name so that there's some more structure to the name convention so how do I go about doing that one way would be to just type in the last name comma and then the first name for all these cells but that is going to be very time consuming there is a very cool Excel shortcut I would like to share with you so I'll just do the first name I'll just type in the last name comma and then the first name and rather than just pressing enter I'm going to hit this control Enter key and then under the data options I'll hit the Flash Fill button and see what happens and just like that the same format of the naming convention has been applied to all the other rows as well so this is a very good time saving tip now we can just delete the first column and then if you have a look at the office phone there are a number of different formats that you see you have the area code separated in some lines whereas you have some dashes in between the numbers on others and some line items are without any spaces or any dashes so the data in this column is clearly inconsistent so how do we make this data clean if we select the entire data set in this column we go back to the Home tab and then if you click on this drop down under the number tab you will see a number of formatting options but we don't see a phone number option in this drop down so I'll click on more number formats I'll go in special and you will see that you have a couple of options here you have the phone number and you have the social insurance number so if we are working with phone numbers I select phone number I hit the OK button and let's see what happens now most of the numbers are consistent you didn't have to go through every single cell and re-enter the data there are a couple that are still not updated so we can manually update these so now the phone numbers have also been updated and now we can look at the contract value this contract value is very difficult to read the numbers in the contract value column because there are no decimals there are no commas it's simply not clean enough so let's say that this contract value is in dollars how do we fix this data set with a click of a button so I select the entire data set and I press the dollar sign under the number tab let's see what happens the data set is now much cleaner but let's say we want to deal with whole numbers so I'll select the entire data set again and decrease the number of decimals using this button and there you go the data set that we have right now is much cleaner than the data set we started off with in this module we discussed the importance of data quality remember Excel is packed with features like autosum sort and pivot tables using data cleaning techniques such as find and replace number formats and Flash Fill we can clean any data no matter how large very quickly if you work with colleagues who do not have access to Microsoft Excel you can request that their data be exported to a text file or a comma separated value file in this module I'll show you how to import that data into Excel there are several methods for saving a text file the first is to use a comma separated value file which is file that has a comma between each individual value Excel has the advantage of opening comma separated files directly you are under no obligation to import anything else let's open up a comma separated file I'll go in file and I'll open that file by going in open a browse for that file I'll show you that this file is indeed a comma separated value file I select it I hit the open Button the file which was a comma separated value has been opened directly in Excel one thing to keep in mind is that there is no formatting because it is a text file column headers may be bold and centered in an Excel file but not in this one let me close this file by hitting the control and W button on the keyboard and return to our Excel workbook you can also open a CSV file by going into the data Tab and then over here you see the get and transform data section you can hit the get data button and then from there you select from file and you click on from text CSV you select the file and you hit the import button when I click on that button the import data dialog box appears you can see I have items customer name units cost price markup location category and length and you can see that I have my file origin as well which in this case is Western European for Windows a comma acts as a delimiter in this data set and the data type which has been detected is based on the first 200 rows there's nothing special in this data set all I've got are numbers and text there are no dates or formulas now I will hit the load button and the data will be loaded as an Excel table into my worksheet there you go there's a separate Tab and the data is loaded the queries and connections task pane is also visible on the right if I go in the queries tab I will see that there are 50 000 rows that were loaded and then in the connections tab there are zero connections so in other words I have created a query but not a direct connection and now I can work with data just like I would in any other Excel workbook please note that when you import data in Excel you don't just copy the data and then unlink it from the original file instead Excel makes a data connection to the source file and if you tell it to see if its contents have changed and your worksheet needs to be updated you can control how Excel updates your worksheet data by displaying the external data ranges property styling box so let's see how that works if I go in the data tab I have the queries and connections section over here if I go and refresh all and look at the connection properties this is what I get so I have the query name I have the usage so this table enables a background refresh which is checked I also have the option to refresh every 60 minutes and I can change it as needed or as required I can also refresh the data when opening the file so there are a number of options over here I will show you how the refresh functionality in Excel works let me just hit OK here this is our data set now if I open that data set in a separate file and I change a customer name here the customer name is John Smith and that's also change the markup to 40 dollars and the length to one meter and if I just close this file on my original file if I hit refresh all as you can see the data has changed the length is one meter the markup is forty dollars and the customer name has changed as well so by linking your Excel workbook to a data set will ensure that the latest and the most up-to-date data is being included in your data set you just need to hit the refresh button to make sure you're always dealing with the latest and greatest information by using the get and transform data functionality in Excel you create a link to a CSV file this ensures that whenever the CSV file is updated your data set in the Excel workbook replicates those changes this is a very powerful and time-saving technique foreign we'll look at some of excel's advanced formatting options so we have the last name first name email start date and age and around 51 rows of data that you can see over here I've kept this list short to make it easier to see the two commands we'd be looking at namely filtering and removing duplicate values the true power of these two commands will be revealed when you have thousands of rows of data to work with these tools will help us make sure that our list contains no duplicate information so if we just have a look at this list you might notice that there is a duplication on row 4. and Row 18 and then we have another duplicate data on row 7. and row 26. because this is a reasonably short list just by having a closer look at the data this becomes apparent but you might have to deal with very long data sets and this will become extremely complicated to look at every row of data to make sure there is no duplication that's where the Excel filtering and removing duplicate functionality comes in when looking for duplicate data keep in mind that every single column of data within that row must be identical for Excel to recognize it as a duplicate so the last name first name email start date and the age needs to be absolutely identical in another row for Excel to identify as a duplicate record if anything differs in any of the columns it will not appear as a duplicate implying that it must be an exact match this is another crucial distinction to make between filtering and removing duplicates when you filter for unique values you will notice that it only hides the value rather than removing it so it simply returns a clean list of data whereas removing duplicate data removes the data from your spreadsheet this is a far more powerful command than simply looking for values that are unique so if you want to clean up your list you start by filtering it and then removing duplicates this is precisely what we will be doing now the first step will be to look for filtering for distinct values so let's make sure that we are on the data tab in our ribbon which is selected over here this will greatly simplify access to the commands that we'll be using also we need to make sure that we have chosen our table by clicking on one of the cells so we have cell C9 chosen in this data set so that's fine and we'll go on the advanced option that shows up in the sort and filter category and we'll click on this we need to make sure that the correct data set is selected so that's shown on in the list range where it's picking up A3 to e54 let's have a quick look yes that's correct it's selecting the entire data set now you can either filter the list in place or you can copy to another location so let's say we want to copy this data set to another location we'll select this action over here copy to another location and then Excel proposes the address where it wants to copy the data so it's showing up as a57 to e57 we can also change this address by clicking on this button over here so let's say we are fine to copy the data set over here so I'll go back and then we only want unique records so we'll check this box and we will hit OK let's see what happens so the data set has been populated let's count the number of rows so in the new data set we have 49 rows of data and then in the old data set we have 51 rows of data so the two rows of data that we identified as duplicates have been removed using the advanced filter option let's go back to our table and this time we are going to filter the list in place with unique records only let's see what happens as you can see Row 18 and row 26 has been hidden it's worth noting that the data hasn't been removed it still shows the same data but the rows have been hidden this is the primary distinction when filtering for Unique values while the data is still present let's undo the previous command and recreate our list with the duplicate included so I'll go back to the Home tab I'll hit the undo button let's now try to identify the difference between the Advanced filter option and the remove duplicates option so I will reselect my list I'll go back to the data tab and I'll hit the remove duplicates button it asks which columns I want to compare the data set on I select all five columns and I'll click ok when I hit OK I'm immediately given my result so Excel has removed the two duplicate values that it found and it also tells me that 49 unique values remain in the data set let's hit OK when removing duplicates keep in mind that Excel is looking for the information that is displayed not the information that is stored in the cell itself let me explain what this means I'll go back to the Home tab hit the undo button so one of the rows of data was John Andrews in Row 4 and Row 18 that Excel currently identifies as a duplicate entry so if I just change the format of John Andrews age on Row 4 I'll just put two decimal places now logically the age is still the same but let's see how smart Excel is if I click on the table again and I go in the data tab and I hit the remove duplicates button I select all the columns and I hit OK let's see what happens now this time Excel only found one duplicate value which was removed there are 50 unique values that remain so according to excel this data entry is not the same as this data entry because it also looks at formats which are different in both of these cells in this module we discussed how to apply the advanced filter and remove duplication Tools in Excel remember when using these tools Excel is looking for exact matches and that includes exact matches in formatting this is very important when we are looking at data sets with numbers and dates Excel is smart but not as smart as us as an end user we need to ensure that the results that Excel provides us is in line with what we are expecting the most important thing to do when you have a duplicate record or records in a list is to get rid of them as you probably saw in the previous module but sometimes you just need to identify the data attributes which includes the duplicates in a data set it might be useful to see the source information together this list appears to be arranged in a random order we have the last name first name email start date age Department the ID and the respective salaries of these Personnel that are listed over here now we have no idea if there are any duplicates let's sort the list and if you're trying to group duplicate records together the best field probably is the ID field you might think that the last name is suitable but keep in mind that not all people with the same name are the same person right so let's sort this list using column d a quick way to sort this list would be to click on column d then on the data Tab and then you have a couple of options here A to Z Z to A so Z to a meaning highest to lowest and A to Z would be lowest to highest it doesn't matter let's just click on this button and the list is now sorted let's go through the list and see if there are any duplicates so we see a duplicate over here we have another duplicate over here and we have another duplicate over here you can see them right now but imagine if the data set consisted of thousands of rows of data would you want to sift through them then so let's add a column over here to identify duplicate records so let me just name this column duplicate records I will be using an if formula and using that formula I will compare this data within the cells to determine if the data is duplicate or unique so I4 is left blank because it's the first row and after filtering you can clearly see that there are no duplicates I'll start with cell I5 I'll begin by typing equal if left parenthesis and I will compare the data in A4 is equal to A5 and we can even extend the comparison to a number of columns and all you need to do in that case is to include an and a and d before writing your comparative formula and then it will look like if and A4 equals to A5 then B4 is equal to B5 and I want to extend this all the way to column H but this becomes very complicated as you can see the formula is becoming larger as I extend the analysis two more columns then I will check the email address as well C4 is equal to C5 and then D4 is equal to B5 the formula is already very long but let's see what happens I'll close the parenthesis and then if the value is true meaning if there is a duplicate record I will say that you should check otherwise if the data is unique I will say that those rows of data are okay then I'll close this statement by using the right parenthesis and hit enter and then I'll drag the formula down so as you can see there are a number of line items where the data seems to be duplicated I can identify four rows of data which are duplicated now the formula is already really long so I need to make the formula crisp because this is a very long formula so how can I do that all I need to do is I will just ask Excel to compare A4 through H4 with A5 through H5 and then I delete the rest of the formula because I've summarized the formula by comparing the row A4 through H4 with A5 through H5 and I hit enter and then I just drag the formula down and I get the same result four rows of data need to be checked because they have duplicate records now let's say we also want to know how many unique departments do we have I can either go in this on the filter list and count them but imagine if you have a much larger data set that will be very time consuming and prone to error so what would be a simpler way to just count the unique records in your data set there is a very simple formula for that so I'll just show you how to do that using a very interesting Excel formula it's called unique so I will just start by typing equals unique left parenthesis and then I select the entire array of Department and write parenthesis enter and as you can see there are five departments within this data set but this data is not sorted let's say we want to sort this data so I'll just update my formula by typing in sort s-o-r-t left parenthesis and then I'll close the formula by adding a right parenthesis and then in order to close this particular formula this is called an array formula because this is a dynamic formula so in order to close the formula I need to hit the control shift and Enter key and now the data is sorted in alphabetical order now let's say I want to count the number of unique records so how can we do that so there is another very useful Excel formula called count a I can use that formula by typing equals cou and T a and then I select the department section right parenthesis enter there are 53 line items in the department category and if I want to count the number of unique records I will just enhance this formula by adding unique U and i q u e right parenthesis enter so there are five unique departments as I mentioned earlier these formulas are Dynamic so let's say that we have another new Department in this company and a couple of people from that company are going to move to that department that department is called RND so see what happens if I type R and D over here all of a sudden it also shows up in the list of departments and the count increases from five to six so these formulas are extremely useful because the change when the underlying data set changes foreign we discussed some very useful Excel formulas to identify duplicates in our data set we discussed the compound if statement unique and County formulas all of these functions can quickly summarize the duplicate data these functions also help us in quickly determining the most common characteristics in a data set such as the number of departments in a company before you start working with data it's very important that you have a look at the data and make sure that it's sensible if you look at this data set we do have several Fields such as the last name first name email start date age Department ID and salary but if you look closely at the data you will see quite a few abnormalities for example the last name is in lowercase the first name is in uppercase there are some strange spaces after the email address and even before the email address and then the salary field has the numerical value and then it also specifies that it's in dollars so this kind of data for example where you have numbers and alphabets within one cell is not very easy to process so in every cell ideally have one kind of data it should either be a number or an alphabet especially when you look at numerical Fields like salary age and so on so how do we go about cleaning this data first of all I would like to introduce you to a tool called find and replace in Excel there is a keyboard shortcut for that tool you just need to hit Ctrl F on your keyboard and you get this pop-up box so for example let's start by cleaning up column H which is the salary field we want to find dollars but if you look closely at the data there is a numerical value then there is a space and then dollars so we need to get rid of the space in between the numerical value and the dollars as well so if we find Space and you spell out dollars exactly the way it's written in the cell and then you press the replace button here and you replace it with nothing just leave it empty and you hit replace all Excel notifies us and says it's all done we made 53 Replacements okay great that was quick so the salary field now looks good we have numerical values and this data will be easier to process now if we have a look at the name the last name and the first name we can use some pretty handy Excel formulas to clean this data up rather quickly so for the last name let me just work on a new column and I'll call that last name updated let me just expand the column width and then over here as the last name is all in lowercase I can update this column and make this all in uppercase just like the first name is the first name is all in uppercase the last name can all be in uppercase and then the data will be consistent so let me just type a formula here equals upper left parenthesis select the cell right parenthesis enter and you just drag the formula down and just like that the last name has also been updated and it's all in uppercase or what we can also do rather quickly is if we use another Excel formula which ensures that the first letter is in uppercase and the rest of the letters are in lower case and that formula is called proper so if I just type in equals proper left parenthesis I just click first cell which is A4 right parenthesis enter Andrews and then I just drag the formula down so this column we can call this last name dropper similarly we can update the first name using the proper formula as well so let's just see how that works we use the same logic equals proper left parenthesis click on the cell B4 right parenthesis enter as John you drag the formula down and that's it so we have very quickly updated the last name and the first name fields as well let's look at how we can clean up the email field so if we look closely there are some spaces before and after the email address we can use the find and replace function here as well so let's use this column L to update the emails so what I'll do first is I'll select all the cells within this column that need to be updated and then I'll hit the control F key I just want to find the space and I want to replace it with nothing just keep it blank I'll hit the replace ALT key and there we go 530 Replacements have been made now there is another way you can clean up the emails rather than just using the find and replace functionality so if I just hit undo go back all right we have the spaces again and I can use a formula called trim so you start by typing equal then trim t-r-i-m left parenthesis select the the cell you want to trim right parenthesis enter and now the formula has been updated but you can clearly see there are no spaces in this email address I don't want this to be bold so let me just hit this button so now we have the email address and I just drag the formula down and all the formulas are also updated and there are no spaces in this email address anymore but the email addresses usually they are all in lower case but over here in the email address we see that the first letter is uppercase then it's lowercase and the last name you have uppercase and then lowercase how can we make the entire email address in lowercase we have a very similar formula in order to make all characters in lowercase so let me use another column over here I'll just call it email lore the formula is very simple equal lower left parenthesis you select the cell right parenthesis enter and now the entire email address is in lowercase let's drag the formula down and there we go showing you the various formulas and techniques that can be used to clean the data once you are satisfied with the results that you want you can delete the columns that you don't need such as the last name and the first name over here cleaning data is often misunderstood and underestimated using the find and replace functionality or the trim upper lower or proper formulas that we have in Excel makes your work a lot easier this also enables you to process data far more efficiently and effectively it's now time for an exercise to test our knowledge in this exercise we have a list of appliances and we need to use the sort and unique formula combination to find the number of unique subcategories in this list this is a fairly long list the sort and unique formula combination can help us to determine the number of unique subcategories please pause this video if you do not want to watch the answer right away so let's determine the number of unique subcategories in the list we just need to remember what the formula combination for sort and unique looks like so we'll use column I for that so we'll Begin by typing equals sort and then left parenthesis and then for the array I will type in unique then there will be another left parenthesis and I'll select the array in question which is the subcategory so I'll select the line items in column B I can use the control shift and down key to quickly select the entire array and that's it I'll close the unique formula by typing in a right parenthesis and then I'll close the sort formula by typing in another right parenthesis so I had two brackets both brackets are now closed I'll press enter and let's see what we get there we go these are the unique subcategories based on our list now if we have to count the number of unique subcategories we will use a formula with the combination of count a and unique the formula goes something like this equals sound a left parenthesis unique another parenthesis to open the bracket then I'll select the subcategory which is B6 through b 761 right parenthesis to close the first bracket and then another right parenthesis to close the second bracket I'll press enter and there we go so we have 18 unique subcategories in the list if you're not a subscriber click down below to subscribe so you get notified about similar videos we upload to get the course exercise files and follow along with this video click over there and click over there to watch more videos on YouTube from Simon says it
Info
Channel: Simon Sez IT
Views: 483,986
Rating: undefined out of 5
Keywords: excel tutorial, excel, data analysis, excel data analysis, data analysis with excel, data analytics, excel data analytics, data analytics with excel, data analysis using excel, data analytics using excel, data analytics in excel, data analysis in excel, data analyst, excel formulas, excel 365, excel formulas and functions, excel functions, data cleaning, clean data in excel, clean excel data, remove duplicates, Removing Duplicate Data, data quality, data attributes
Id: kghcAk7l6eA
Channel Id: undefined
Length: 113min 39sec (6819 seconds)
Published: Tue Mar 07 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.