Excel 2010 Statistics #15: Frequency Distributions, Histograms, Column Charts (11 Examples)

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
welcome to Excel 2010 statistics video number 15 we're gonna talk all about frequency distribution and the charts for frequency distributions for categorical and quantitative data here are all of our topics this is gonna be an epic video so when you're watching this video if you look below the video it says see more if you click on that link you'll see this actual listing of all the topics in this video and next to each topic is a minute mark and if you click on that you will jump to that part of the video because this is going to cover a lot of topics in essence this is frequency distribution in charts almost all the topics all in one video now let's go over to our Excel workbook if you want to download this workbook business 210 Excel 2010 chapter 2 click on the link below the video alright so we're starting chapter 2 and in Chapter 1 we define descriptive statistics tabular graphical and numerical summaries of data now why do we have to summarize data anyway if we go over to the sheet data this is raw data and this is actually the data set we're going to be using for this whole chapter sample of 200 sales from WWF of boomerangs calm and we're gonna use descriptive statistics to reveal patterns in the raw data oh yeah that's why we have descriptive statistics where we summarized down because it's really hard to make sense of this what how many of the Carlota boomerangs or the iñaki boomerangs did people buy what age group bought the most boomerangs in what region do we buy the most boomerangs so there's lots of good information we can create from this raw data so that's what we're doing I'm gonna go back over to the topic sheet so in Chapter 2 that's this chapter we'll do tabular and graphical tabular just means hey create a frequency distribution and count how many times something occurred in a category and graph vocal means make a chart chapter 3 will do numerical that means calculating the average which means mean median or mode or standard deviation alright so here's our little schematic for chapter two there's a lot of topics here we're doing a lot of different types of frequency distributions and a lot of different types of charts and graphs categorical data that means label we saw last chapter that could be nominal or ordinal tabular methods means frequency distribution we'll also see how to do a relative a percent frequency distribution and cross tabulation for categorical data graphical methods will used column and pie very important when we do a column for categorical data the column will have no gaps between the columns and we'll we'll see that don't forget the textbook we're using always calls column chart bar but you know columns are vertical they stand up they hold up Greek architecture so and that's the word Excel uses so we're gonna call them column alright quantitative hate quantitative is just a fancy way of saying number tabular graphical tabular we're gonna do this similar to categorical data frequency distribution relative and percent frequency distributions will also do a cumulative frequency distribution and then accumulative relative or percent frequency distribution and cross tabulation I know that's a mouthful there but we'll get to it and once we do at a time or two you'll see it's pretty straightforward and then from this data we'll create histograms and ogive charts also we will create so the histogram and OGM are from made from these dot plot will is a graphical method we'll do separately stem-and-leaf is a graphical method we'll do sense separately both the dot plot on the stem and leaf display are similar to a histogram and then we'll see how to do a scatter diagram now all of this the point of all of this is what oh yeah raw data hard to make sense hard to see patterns or trends these methods here summarize so we can see patterns and trends and help us make decisions all right let's go over to the sheet data this is the data set we're going to use throughout these videos for Chapter two now we're gonna do a bunch of different charts here's the whole data set as we go through I'm only gonna take certain columns just so everything fits on the screen so for example if I go over to frequency distribution and charts ff4 formulas you can see I've taken just a few of the columns now the reason I've done this is so that it fits on the screen while I'm shooting the video in general these are the data sets you have to deal with and you would just you know select this column if you wanted to analyze it or summarize it or make a chart alright we're going over to fdn charts formulas now here I want to do three frequency distributions and then three charts and these what we see in these three examples here actually I'm going to minimize this ribbon you can either right-click minimize or do ctrl F 1 DS 3 frequency distributions for categorical data quantitative data when we have integers now look at this age is a quantity is a number right age we have 20 157 25 these are the ages of the people that bought the boomerangs right so but these are integers so when we go to create a frequency distribution we'll have a slightly different method of creating our categories or classes than we would if we were doing revenue which has a decimal so that's down here quantitative decimal all right so one two three frequency distributions and from those will create our charts now let's see if I can zoom in here all right the first one west east South Midwest we just want to count in which region did we make the most sales now these are transactions if we go back to this original data set on the data sheet you could see each one of these is a transaction a particular sale right so we have a date coupon MasterCard East oh that's the column reason right now the price the quantity the actual transaction amount etcetera so each one of these is a single sale so in this column here and if we go back over to this here oh yeah if we just count how many Wes there are in this column it tells us how many transactions we sold in the West all right so we want to count West East South and Midwest now this is a counting situation with one criteria we're not going to count how many transactions are are that's 200 we want to count how many West's East South etc alright and actually we've already done this a few times in this class we're going to use the count if function equals count if now it needs a range and a criteria now as we copy this formula down the range argument has to be locked but the criteria I notice as the formula is copied down each successive row it has a new criteria so let's keep that in mind as we create our formula here okay range I'm going to click in the first cell and use our keyboard shortcut ctrl shift down arrow to highlight all the way down to the bottom and now we need it locked so we hit the f4 key now notice I know some of you in this class are brand-new to Excel those the f4 key lock set that means dollar signs in front of the letter and number the column and row reference that just means it's locked as we copy it down the blue box will not move and we saw that in the last two chapters chapter 0 in chapter 1 all right so now I'm going to look at my screen tip type a comma and you can see the criteria argument is bold that means hey give me the criteria and it is boom so I click right there with my mouse f6 it not really f6 it's one cell to my left that way as I copy you down that the little box will move now I'm gonna close parenthesis actually with the function like this you can just enter it and it will Excel will automatically put the closed parenthesis in now I'm gonna control and hit enter that just puts the formula in the cell and keeps the cell highlighted and that's good here because I immediately want to do something which is to copy it down now I can sort of see up here in the formula bar I mean I can see up in the formula ball the formula I'm going to point to the little fill handle in the bottom corner and with mine my selection cursor turns to an angry rabbit I click and drag down I'm gonna click in the last cell and hit f2 oh look at that the blue boxes I copied it down this blue box right here remain locked and that green cell moved absolutely perfect now let's add these up the keyboard shortcut for auto sum is alt equals so I'm using the sum function and of course there are 200 transactions from our sample so of course when we count these are mutually exclusive categories which means anytime we find an item in here it only fits into one category and they are collectively exhaustive which means these there's nothing in this column that won't fit into one of these categories which means we have all of the categories or enough categories to account for everything so of course when we add it up we get 200 now does this frequency distribution give us good information yeah sure much easier to see here in this frequency distribution that West had the most sales so out of our sample of 293 of them were in the West Midwest had the second-most East and South were just about tied for last you know maybe we did a great job of selling in the West or maybe we there's just more boomerang enthusiasts in the West so we have some useful information summarized in a tabular form now this is great but how about it chart charts are great because they give you an immediate visual impact so let's go ahead and make a column chart column charts without with gaps between the columns are great for categorical data let's go ahead and highlight and you want to make sure and have your column headers or field names or variable names at the top for both the categories and the numbers and remember from last chapter these are called the elements right so we've collected some summarized information about an element which means we've listed each item exactly one time and then we have some summary statistic all right so we've highlighted this notice the ribbon is minimized but I once I click on insert I can see go to the charts group I click on this now let's fix this up a little bit let's click on the legend and delete let's click on the lines inside delete let's click on the columns and notice they're all selected now one important thing about categorical data you got to have gaps between this is just a category it's not like later when we have our numerical data that we have to get rid of this gap to indicate that there's a continuous stretch of numbers here we absolutely want that gap it visually says there is a separation between these categories all right so I've highlighted these there's little white dots on each one I'm either gonna right click format or use the keyboard shortcut control 1 now I'm just gonna be silly here I'm gonna go to film and say very colors by point absolutely do not have to do this I'm a big color guy so I like to show the different colors all right now let's go up to our chart tools and if you want to turn this back on while we're the the ribbons on you can either right click unclick the minimize or ctrl f1 is a toggle ctrl f1 all right there's our context-sensitive chart tool ribbon tabs we want to go to layout okay I'm gonna move this up a little bit and I want to show a label at the top so I'm gonna go to data labels outside in oh look at that that's really cool we have the frequency at the top so you know what I don't need this axis I'm gonna click on it and hit delete so frequency all right so we have our categories our frequency let's give this a better title I'm gonna click on this in last couple chapters videos we saw how to either create a formula or type a text title as long as you have your solid line here I'm going to click up here and I'm going to type hopefully I spelled it right now notice I've typed it there and I'm gonna hit enter number of transactions by region so we could put a label here that says frequency but this says number of transaction and then there's our little number now in a couple other important things about this column chart the order of these labels here does not matter it conveys no information now if we sorted this there's a type of chart we'll see you later called a pietro chart if I pronounced it that correctly now you can sort it but the information comes from the height of the column not the labels right so even if they're sorted right the information about which one did best when you're looking at this comes from the height not from the order of the labels alright so a number of transactions by region now I'm going to point to the edge here and click and drag in and make it a little bit smaller and then I'm gonna zoom in again you can either use your zoom down here or hold control and roll I'm gonna bring this over here like this see if I could make this a little bit bigger here I'm gonna right-click this and here's my mini toolbar I'm going to click there on type 212 all right so columns do not touch for categorical data heighth of the columns convey the count order of categories conveys no info we could see things like West had the most in South in East had the least so categorical frequency distribution and the appropriate chart a column chart now let's go on to quantitative Oh age now here's the deal we have categories here we just listed the categories but here what's the category well we could go and count how many people were exactly 21 or exactly 57 but that would be a lot of that be you know zero to 70 or something like that so that would be 71 rows I don't want that I want to summarize it a little in in a more tight fashion now in this example we're gonna have an age kind of will say something like zero up to nineteen twenty up to forty nine in a later example we'll see how to make the the categories a little bit more stretched out so what that means is when you get to numbers like age or revenue we don't have a category so we have to make one now in the textbook says there's an art to this and there is an art to it you actually have to just do trial and error and kind of stretch the data and contract the data until you get the you know a shape that makes sense now here I'm just going to kind of give you the increment in on our next tab we'll actually see how to calculate what the increment should be and how many classes you should use but even if you have a method it's still an art you know so it takes practice to figure out how many categories we should have what the increment on each category should be alright so I'm just picking 19 well obviously if we're doing age right we're gonna start at 0 and so we want 0 up to 19 now here's the deal this is integers as opposed to decimals so our category is going to be 0 to 19 20 to 49 alright so let's go ahead and see how to do this um here's the starting point so I'm gonna say equal whatever the lower is one cell to my left + and actually let me blow this up a little bit alright equals this + the increment now as I copy this down I need that lock so I'm gonna hit the f4 key alright I'm gonna hit enter so we got 0 to 19 that's gonna be our first category here and we'll count how many ages are between 0 and 19 but we need a next our next category now what's beautiful about this little table here is we're gonna be able to have it created by formulas and once we change the increment everything will change here so if this is 0 to 19 what does this have to be 20 to 49 so I say equals this that means the lower from the category the upper from the category from before +1 notice this is a relative cell reference so right now it's looking to the one before when I copy it down here it will be looking right there control-enter now remember this is one cell to the left plus the increment so if I drag it down one oh I can't add if it's 20 to 39 not 49 lucky I have Excel to tell me add 19 all right now I can copy these two different formula down and I have my categories what's beautiful about this is I could change this to 15 now what would what would be incorrect about this well I calculated the min just by using the min function and the Max the problem with these categories is that the categories would not be collectively exhaustive which means we would have some numbers that needed a an additional category which we didn't create so 63 we can clearly see there's some people that are 65 that bought some boomerangs so this won't work that's the trial-and-error part of it all right and we'll get more into all of that in a little while but here suffice it to say these categories include all the numbers from 18 to 65 all right now when we're calculating our creating a formula we're actually going to use the lower and upper and we saw this in Chapter zero zero in our introduction to excel we can use the count if function with two criteria however this is we are going to use this frequency distribution and the chart and we need a good label here so the people looking at this frequency distribution in the chart get good information so I want a little label here that says zero to 19 so we're gonna create a text formula I'm gonna say equals and I'm gonna click on that now right now that's just says hey go over there and get that number but now we need to join and we saw this in our chapter zero when you use the ampersand shift 7 this formula says take whatever is in that cell 1 2 3 4 5 cells to my left and and then I'm going to in double quotes space - space and double quote now right now all this is done is joined some text in quotes there's the join symbol and whatever I was in K 16 if I enter this you could see I have a 0 and a dash but what do I want I want to join one more thing so shift 7 and the upper this is called a text formula control-enter and now I can copy this down now that's pretty cool and notice if we change the increment here we have all of our category all of our upper and lowers that we're gonna use in our frequency or cowntess formula change and the labels change really this is the magic of Excel you know we're doing lots of data analysis and we want to build everything in a robust efficient way so we change one input and it all changes all right I'm gonna change this back to 19 now we can do our count ifs here's what we got to do we got to tell the countifs formula to look through column and we have two criteria so for the first category its hey look at each number and ask two questions are you greater than or equal to zero and are you less than or equal to 19 so for here we'll ask the question are you greater than or equal to zero true are we less than or equal to 19 false so that one's not included in this first category but when we get down here we get is this number right here greater than or equal to zero ding-ding being true is it less than or equal to 19 ding-ding-ding true so it will be counted all right let's make our formula countifs now countifs is awesome it's got an S right the count if we saw up here doesn't have an S so that means we can have as many criteria as we want I think up to 255 oh that would be silly but we only have two now here it is the screen tip will be polite it says give me the criteria range 1 and then the criteria 1 so let's go ahead and highlight our age column control shift down arrow f4 notice the screen tip is bold comma criteria now we're gonna have to put our comparative operator and join it with our lower end for this class I'm going to in double quotes greater than or equal to and double quote ampersand and the zero now the way I remember how to do this is notice the small side is pointing towards the zero so that means the zero has to be smaller and the open and on the greater than symbol or the alligator right the open end is pointing to to what oh the whole column so I always think that oh yeah that's pointed over there so that number in that column has to be greater than or equal to the zero all right so that's criterion one comma and then criteria range to same range control shift down arrow f4 you have to repeat it twice comma and then criteria two now gets kind of trickier because see now my formula is hanging over and I can't really click there but no problem I'll show you a way to do this now this is the upper so we double quote less than or equal to remember there's no such thing as a less than equal to sign together or greater than or equal to so you always have to list the two symbols equal sign always come second so I am double quote that and then join it too and I can't quite get to that well there's a couple ways it's l16 so I could type L 16 right you could sort of see the people there purple there or you can click on the cell above and the dancing ants remember when the dance and ants are going you can change it with the arrow key so then I'll down arrow all right that's all formula now we are in the course of this chapter in class gonna see a number of different ways of summarizing quantitative data in a class this is called a class and the key is there's a lower and upper we'll see how to do this with pivot tables we'll see how to do it with the frequency function and there's even some data analysis add-ins that will use the count ifs is the most versatile and the reason why is you have control over the comparative operators if I want it which we will down here for decimals you don't have to include the equal sign all right so the countifs is really the most versatile of all the methods of counting quantitative data between a lower and an upper value all right control enter and now I'm gonna drag this down I'm going to the last one and I'm going to hit f2 to put it in edit mode and since there was a lot to this form you really have to check it and it looks like the blue box is locked all the way down and it looks like thick k-19 and the l19 worked like a charm alright I'm gonna hit enter I'm gonna add this up alt equals and enter so I've added up mutually exclusive categories collectively exhaustive we captured all we counted all of the records given this criteria so we can clearly see from this frequency distribution number the point is to summarize and get some information Wow most of the boomerang to her sir between 20 and 39 and this website Colorado it's a pretty good company makes pretty good boomerangs you know a lot of the competitors are in this age range right here also we can notice well there's not very many in this age range right well these are purchases this is not saying anything about whether they're throwers this is these are the people who actually bought right not many below 19 but look at this there's a bunch the bunch between 40 and 59 in essence 40 and 60 a lot of people there there's even some between sixty and seventy nine so boomerang throwing as a sport is enjoyed by a wide range of Ages now in a little while we'll see how to break this apart and we'll actually do increments of five years I think it is to give us a better picture so this is a kind of very tight compact categorization of age again later we'll see how to expand this and get more detail all right so this is quantitative categorical for a quantitative we're going to use a column chart but again we're gonna have to do something slightly different so let's see how we do this now we have our label here and our categories this will show up on the horizontal axis and then our label for our numbers quantitative data also the chart wizard calls them series so I'm gonna go up to insert column now let me show you a keyboard shortcut right here cuz the default chart on most excel systems is the column chart so there is a keyboard shortcut it's alt f1 an alt f1 let me put a little note here I put a default chart because that is what it puts default chart on the sheet you're working on you can change that but most computers don't and since we're using so many column charts I'm going to use that from now on alt f1 alright now this is going to represent quantitative data and it is also continuous data they happen to be integers but our categories for 0 to 19 and 20 to 39 since these are all expressed in integers there's nothing that could fit between 19 and 20 right we're never going to get a number between these so we want to portray that visually by removing the gap width so I'm gonna click on this column right here notice they're all selected and control 1 or you could right-click format control 1 ah look at that under series options there's gap width just like that alright I'm gonna go to fill again this is unnecessary we'll actually see how to have columns that are the same color later on and I'm in a very color spy point oh that is so cool look at that now we're giving visual information right there is no gap 0 to 19 20 to 39 nothing could fit in between now I'm going to delete this I'm going to click here and I'm going to click up here and type number of number of transactions by age group enter now you actually can click in here and highlight and do that also there but I kind of have in this class kept to the formula Barca sometimes will make formulas that link the title or the label to the cells and other times will type all right I'm going to right click and go to my mini toolbar on type 12 I'm going to click on these lines delete I'm going to click on the columns that are all selected I go up to layout data labels outside end I'm going to click here now in this case over here I don't know we kind of got away with not a putting a label here cuz region and these regions were quite obvious but down here I want to put a label and include the unit so I'm gonna go up to layout axis horizontal below and type something like age and then in parentheses years alright alright so let's we can clearly see a couple things here now again this is quantitative data with a column chart this is called a histogram we can clearly see that this is the biggest category and there kind of tails off here we can sort of see the shape of the data to up on this categorical data we can't really see the shape because the order of the categories isn't doesn't give us any information but here this order does give us some information right it goes from left to right small us age to big us so we can kind of see a shape to this data alright so we can clearly see which ones are biggest which ones are smallest and we can see some shape also that shape will become very important later we'll talk about skew and in later chapters we'll do a lot with frequency distribution in shape you can kind of see it's sort of looks like it's skewed towards some couple bigger values are probably pulling up the average if you were to calculate the average or the mean and we'll talk about some other shape issues later I'm going to make this chart a little bit smaller I'm gonna zoom in right this is called a histogram columns touch to indicate that no one but numbers can fit between the classes height of the columns convey the count order of classes important to help reveal the shape of the data all right now we're gonna do another quantitative down here so I'm going to scroll down here this time we're gonna do sales now what's important the difference between these two here we have 0 to 19 and 20 to 39 but what about these decimals over here we couldn't do Z we couldn't so for our sales right we couldn't do 0 to 50 and then 51 to a hundred because if we had 50 as the upper and 51 as the lower there would be some numbers that fit in between now the textbook tells you to do zero up to $49.99 but I'm going to show you an alternative method here and this method is a little bit more robust especially if you have data with lots of decimals it's like how do you pick the upper end alright so what we'll do is we'll say 0 up to 50 but not including 50 50 up to 150 but not including 150 so let's see how to do this actually I've said that wrong we're gonna do 0 to 75 75 up to 150 so let's create our categories and then I'll show you very clearly how to label this through there's no ambiguity again our goal is to have mutually exclusive and collectively exhaustive categories all right so we're gonna say equals then Plus and we're gonna have to lock this here I'm going to say equals that so now we have 275 so far that means there could be some ambiguity but we'll fix it in just a moment now I can copy this down remember it's one cell to my left plus the increment now I copy this and notice checking this sure enough it's looking at hey lower is looking the upper from the previous class and this one's looking at the lower and adding the increment now we're gonna create our label here and this is where you have to be careful now when we do our pivot table later it looks now let me just do a little trick here all right the pivot table will spit out this and this is a slightly ambiguous you're not sure whether the 75 is fitting here or down here now after you use pivot tables a while and we'll see how to do this later you know you maybe just get used to it and you know that the 75 is always included on the lower end but I do not want to let our viewers on that uses this frequency distribution and the histogram to ever get confused so I'm going to be more careful with our labels now I'm going to create a formula equals the lower end and we're gonna join join the shift 7 to get the ampersand in double quotes a space up to space and double quote alright so when we enter that formula there those are relative cell references so when we copy it down we get 0 up to 75 75 up to 50 50 up to 225 now this is calm and I've seen this in histograms in charts before but I like to be even a little bit more explicit so I want to create a slightly different label now well we can use these for our charts and these can beef here for us as we learn it or vice versa I sometimes will use these sometimes we'll use these but we're going to say zero ampersand and then I want to say zero has to be less than or equal to sales the sales have to be less than space so this label right here all inside of our double quotes is going to be joined with the lower and the upper now let's think about this what does this say you may have seen something a construction like this in algebra or something like that the sales is sitting in the middle that sales number it represents all the data in this column so notice sales the bigger end is pointing towards the sale so it says sales have to be bigger than or equal to zero the little signs pointing toward the zero so zero has to be smaller but it's that big sign that's pointing towards the sale and the little sign is pointing towards the sale so it says this is a label this is not the same construction we're gonna use for our formula at all but as a label sales in the middle bigger than equal to the sales every time it goes through this sorry I was pointing to the wrong column every time through it goes through the sales column the sales number each individual one will have to ask the question am i greater than or equal to zero and am i less than 75 now notice all the way down and you got to be consistent with this the upper end is not included right and that is the trick the conceptual trick to creating categories one category to category three category or four categories that are mutually exclusive and collectively exhaustive it just means we're knocking a double count if we had an equal sign here and here we would we may double count if there was a hundred and fifty and we had an equal sign here in here 150 we counted here and here alright now this is just a label I want to compare this up here we talked about the construction of our logic up here and how the bigger that greater than sign is pointing towards all the numbers and the less than is pointing towards all the numbers so when we get to our formula it's different than our label construction now this formula is almost exam it is very similar to what we're going to do down here the only difference is this was integers so there was equal equal when we get down to our formula down here it's going to be the lower end is equal but the upper end does not have the equal all right let's create our formula equals countifs criteria rings one while it's sale so I'm gonna scroll over at the top control shift down arrow f4 comma you know and I'm gonna do a little trick here I'm gonna highlight this in edit mode copy ctrl C because I know I'm gonna have to use it again later criteria 1 in double quotes greater than or equal to n double quotes right so again that big alligators open mouth is pointing towards all the other numbers in the column so those numbers have to be greater than equal to the 0 that's criteria 1 comma and the criteria to control V comma our criteria to this one has to be in double quotes less than ampersand the upper end alright so the key here for our decimals we have the equal sign on the lower end but not the upper end that's so we don't double count closed parentheses control enter copy it down click in the last cell f2 whoa looks like it's working ok I'm going to scroll over here and make sure that blue yeah look at that now we got to add them up alt equals is the keyboard shortcut and enter so now we have counted our sales data now what this says is 0 up to 75 bucks people spent 0 up to 75 bucks 119 of the transactions 75 up to 150 bucks 61 of the transactions 158 to 225 15 and 5 now actually this is a common pattern you observe when you are making purchases right when you go to this website people are going to buy one or two boomerangs right but probably not that many that's a lot so 0 to 75 maybe that's three boomerangs right Wow here that would be like 12 boomerangs and that's true for purchases so the pattern that is revealed in this frequency distribution for purchases if you went to a department store right and said you know lots of people are buying a shirt here some socks there but not many people are buying you know $2000 refrigerator so this kind of pattern is actually quite typical now let's visualize this I'm going to highlight I'm going to use my keyboard shortcut consult f1 alt f1 you don't have to do that you can go up to insert column delete I'm going to click here click up in the formula bar and say number of so I think I is it revenue it's either sales or revenue amount hopefully I spelled that right I'm going to right-click mini toolbar 12 enter I put it in edit mode I can f7 and spellcheck it I'm going to delete the lines I'm going to click on the columns control one that's format the data series and remove the gap fill very colors by point you don't have to do that absolutely beautiful I'm going to delete this oops what's that oh I can go to layout data labels outside in and then I'm going to add a axis horizontal below sales amount and I'm gonna put a unit all right so we see clear clearly from this chart we can see the pattern right we can see individual things like wow 0 up to $75 the most by far and also the pattern right as we tail off to the upper sales transactions amounts we get fewer and fewer wow isn't descriptive statistics fun or summarizing creating useful information from raw data let's see if I can right click and maybe change it to 10 right click it's already 10 all right so let's I'm gonna zoom in I'm gonna close my ribbon up here right-click minimize or ctrl F 100 keyboard shortcuts get confusing all right so there we have these the the essence of chapter 2 we did frequency distribution for categorical frequency distribution for quantitative and note the difference in how we create our category or class for integers or decimals and then our charts column charts with gaps for categorical data column charts without gaps called histograms now I want to make a note this is categorical right these are numbers we have no gap a gap so I want to make a note about here cuz when we get to chapter 5 gonna be going well I thought only categorical data could have columns with gaps let's go over to says column note just note here discrete numbers discrete means 0 1 2 we're counting and there are gaps between there's no number when we're counting if we're counting flips and we're flipping a coin and counting how many heads right we'll do that in Chapter five or six or something you can't get one-and-a-half heads right so you actually sometimes with numbers have to use a column chart with gaps to indicate that nothing can fit in between all right so that's just a little preview about columns all right so now in this example here kind of nicely summarized everything but I want to show you how do we calculate so for our age how many how do we calculate how many categories are classes and what the increment is let's go to this construct FD here's the same age data here step 1 2 3 this is just a guide and in fact when we go through this example and use our little rule like 2 to the 7th power and calculate everything we're gonna see that it actually doesn't work we're actually gonna have to judge you know change it so all of these things we're gonna do step one to three there this is like a useful recipe but ultimately it is a judgment call this does help to kind of give you a basic idea of where to start and then adjust things to figure out what the lower and upper and the classes should be now notice we're gonna do this and we're gonna have a lot more data or summarized data revealed to help us see the patterns of who is buying the boomerangs what age group are buying transactions first thing step one determine the number of classes using a guy 2 to the K has to be greater than n now n we'll see this throughout the rest of the book n means the count of the sample 2 to the K whatever that K is the K is going to tell us number of classes so let's just count first now these are numbers so we can use the count function control shift down arrow and you could hit enter if you want to beat because let me just show you I'm all the way down at the bottom I just just show you notice this is a simple function we don't have to put that close parentheses right so if we were down at the bottom instead of taking the time to scroll up we can just hit enter right now enter brings this down one when you hit enter the formula goes down we've seen how to do ctrl enter which is useful another method in this case I like to use it is shift enter because it goes up shift enter puts the formula in the cell and goes up one so let me do that again equals count I'm going to click at the top control shift down arrow and then shift enter alright however you do it we got our count now a number of classes you just got a start I'm gonna say two to the fourth and then I'm gonna build my formula equals two and shift 6 is caret which means exponent 4 all right now our rule is 2 raised to the K has to be greater than n right so is 16 greater than n no so now I'm going to increase this 5 control enter 6 control enter 7 control enter 8 control enter ok so finally now I get a number that is greater so 2 to the 7th is greater than 200 so I'm gonna start there next step determine the width of the class so now this kind of it's going to be common sense but first we need to figure out what the max is right remember one of the main guides is it whatever categories we create we have to have enough to accommodate all the numbers ctrl shift down arrow shift enter I'm gonna do oops yeah I did the max and then the min min control by the way equals M I n I see that it's on the list I can either type open parentheses double click this or hit tab click on the top cell ctrl shift down arrow shift enter so sixty-five to eighteen well let's look at the range this is a measure of tells us the spread in the data well the range is going to be big mine or max - min now 47 so whatever we do our low end of the first class and our upper end of the last class better span 47 years right so our approximate width we just use common sense if this is how many classes that's the range equals this divided by this five point eight seven so width of the class well we could say I'm gonna move it down to five I'm not gonna use five point eight so I'm fine I'm not gonna even really use six because in our classes would say like six to 12 12 to whatever that is 18 if I can do math in my head so I'm gonna start with five and kind of the way I'm thinking this is either has to be five or ten all right range now range for classes class widths times K so let's go ahead and calculate that class width width of the class times our K oh yeah right oops so we can see right here if we were to use eight that was the first number that got our two to the seventh above two hundred right if we used eight and this width it wouldn't work so now if I change this to ten we definitely have enough but you know I want to see the spread and this data spread out more so I'm going to keep it five what does that mean well this isn't big enough right so I'm going to change this to nine ten all right so finally we get a 10 gives us 50 here right so I took my 5 times my 10 and 50 just is above 47 notice that once I change this input here it changes the approximate width alright so we still have 5 there so that's looking good now must this number must be bigger than this number now I can I this but I'd like a formula to tell me whether this is true or not and I want to start up here I'm going to move this down back to 8 I want to do a true or false from that remember in our chapter zero zero we talked about different types of formulas there's calculating formulas texts for notes which we used for our labels this is a true/false formula or a logic formula I want to ask the question is that greater than this this logical formula or true/false form that has only two possibilities either true or false false so that way I have a visual cue here all right so now I can come up here and try nine ten and finally when I see it true I can see now you know that's kind of unnecessary because we can see the difference between this and this but I don't know I've had times when I'm doing Excel and the true/false logical formula catches an error I make all right so all of this looks like a we have a recipe but we're certainly deviating from the recipe a lot our third step is to determine the class limits well first let's calculate the low limit well 18 it's got to be less than 18 less than or equal to 18 right and it's 5 so the only logical thing to keep the classes neat is to go down to 15 notice over in this example here we started from 0 to 19 all right so let's go ahead and do this the lower I'm gonna say equals that notice all of these labels are going to be linked up here so if we were to change any of this this would change also tab now for the upper we need to say equals that plus the increment and I'm gonna have to lock this f4 but wait a second this these are integers so we're gonna choose to go 15 up to 19 so really what's 15 plus 5 it's 20 so I have to subtract 1 now why that there really is 5 to our increment because both the 15 and the 19 are included count on your fingers 15 16 17 18 19 is 5 all right so now we say equals this and that will give us 19 but we need to add one up remember we're dealing with integers here so nothing can fit between 19 to 20 so this is okay all right this formula will work down here we highlight these dragged down all right and we have our categories now very quickly we just did this just a moment ago we're gonna calculate our label age class we're going to calculate using countifs remember we have two criteria upper and lower and we're looking through that age countifs criteria range one control shift down arrow f4 whoops I better scroll over so we can see criteria range one comma criteria greater than equal to comma and I control V because I just copy that paste there's criteria range two and then less than or equal to these are the integers so we have the equal sign on both side control enter double click and let's see I don't want to double click because I don't want to replace that I'm going to copy it down but there's my total now we can create our histogram alt f-14 column I'm going to click right here delete I'm going to click up here in the formula bar whoops number of transactions by age class at age group I'm going to go up here and change it to 12 click on the lines delete click on the columns control one gap with none I'm going to not change the fill I'm gonna go to border color solid line and this the reason why is it's hard to see the difference between the columns when you keep them the same color but if you add a border color alright now I'm going to delete go up to layout outside so we added those I'm going to go up to layout horizontal title below age by years all right so now we can clearly see as if compared to our other chart we can see with our frequency distribution we can see that spread in the data we can actually look at the tabular form or the graphical form much more detail here so we get a more nuanced picture 22 24 is the most 25 to 29 and then there's a tail off but look at this there's a bunch of people over in the looks like the 50 to 64 group that throw boomerangs here so again we're doing frequency distributions and then graphical representation of that frequency distribution since it's quantitative data and there's no gaps between our classes or categories we show our column chart with no gaps all right so that's about how to go through and determine how many classes class width etc you can see hits a lot of back-and-forth trial and error now I want to show you how to do all this with pivot tables I'm gonna click on the sheet' FD frequency distribution and charts pivot table now we have our data here let's create the categorical one first that's this one with pivot tables as we saw you click in one cell in the data set insert pivot table pivot table now I'm going to use the keyboard shortcut from now on alt n V T one cell selected alt n V T I'm gonna say in this sheet right here and I'm gonna click in the very top cell II 1 and click OK I'm gonna drag region let me see if I can I'm gonna drag region down to RO region down to values oh that's so easy remember pivottables annoy us with the part of the formatting so I'm going to go up to design report layout and show in tabular Wow that's amazing I'm going to click right here and type frequency and I'm going to create a column chart I'm gonna open up my ribbons either right click on minimize or ctrl f1 it's a toggle I'm going to go to here and click pivot chart oh I'm sorry let's not do that let's click in one cell and alt f1 how that is so cool now for pivot these things show up here which are fine if you want to filter but we don't so right click and then hide all field buttons click here delete click on the lines delete click here delete type your title you can color them different or not I'm going to go up to layout data labels outside in and there we have our pivot table and pivot chart categorical now let's go and do our first our age and our revenue or sales now notice integer decimal we should make a little bit integer and decimal the pivot table will pick that up let's click in one cell doesn't matter which one alt nvt I'm going to place it in this sheet now this is dangerous to put pivot tables on top of pivot tables in a sheet because if you pivot this it will can destroy the other pivot table so just to keep everything on one sheet for this video I'm doing them all on one sheet I would probably put them on separate sheets if you're never going to pivot them then it works ok I'm gonna click OK we're gonna have to drag age down to row labels now we want to come over and right-click and group now we've already done our determination here so start at I'm gonna say 1565 that's fine here and then bye and I'm gonna put five so now we get all the way to but notice what the pivot table does it includes the cludes a different last group so 55 up to 59 that's exactly five but this one's from zero to sixty five that's six so the pivot table does something a little bit different but that's okay remember it's all we're grouping it to see patterns all right let's go ahead and drag age down to values we can see we get a slightly different count I think we had 17 and then four in our last group which was 65 up to 69 alright so now we can change the design tabular and I can create a chart I'm gonna cloak I'm gonna close this if we need to open it back later we right click show field list all right so I'm going to alt f1 there's my column right click hide delete I'm going to click up here and type number of transactions by in each group enter delete these lines click on the columns control one no gap with I'll change the very color spy point go up to layout data labels outside in axis let's do horizontal age in the years I can probably delete this too all right so this that's from a pivot table so we have one two and finally we want to see how to do our last one click here alt nvt existing location I'm gonna click right here this is dangerous if we were to ever run group this it would just go bush and destroy our pivot table I'm gonna drag revenue we've already done our analysis to figure out what the the start number the increment etc so I'm going to right click group I'm gonna start at 20 tab tab 300 tab and here we're gonna do 50 now these are decimals we've entered this in and watch how the labels come out different 20 - 70 ok so we did something quite I didn't do something right here I don't want that less than 20 and these odd groupings here so I'm gonna right click go back to group then what did I do here oh start at 0 so we luckily I made that mistake we can clearly come here and edit it once we've grouped it click OK also so now we get the categories we want also you can right click ungroup and by the way in 2007 and 10 if you group these any other pivot tables based on this data set right here will also be grouped whom in here I don't really like those labels so she's got to show you a great trick here we could go in and edit each one of these in tight but I don't want to do that there's a find and replace feature so if I go up to fine replace the keyboard shortcut is ctrl H I can say find a dash and replace with space up space to space notice I've highlighted these so it's only going to search that range and then click replace all made six replacements click OK that's find and replace remember it's up in the home ribbon edit find I'm gonna go up to design and change that to tabular by the way I have no idea I don't know how to change the default to a particular style like that I wish we could do that all right I'm gonna close this and make a pivot table pivot chart in high speed you ready alt f1 Oh delete what did I do wrong there I forgot to add an extra feel to do the calculation upon so I'm going to right-click Schofield list right revenue down to here alt f1 delete delete click up here a number of transaction my revenue or sales which I'm going to like phony I could spell right-click hide all field buttons click on the columns control one gap width I'm going to keep it the same color so I'm going to add a border color go up to layout outside end data labels click here delete axis horizontal sails all right so wow that's a lot a lot same exact three pivot tables and I mean frequency tables distributions and same three charts done with a pivot table one so we did categorical we did integer quantitative decimal quantitative including a replacement fine and then we made our charts a couple more points in this video here I want to go over to F or PT when do you decide when to use a pivot table and when to use a formulas formulas for what we're doing here you know formulas take more energy but and there's an exact option inside a pivot table this is a pivot table do the same thing so if that's the case then pivot tables are much faster so that's the big advantage there but formulas formulas update instantly if you change the data so if your data is changing you might want to consider using a formulas instead of a pivot table and I'm going to show you example of that the other situation is sometimes their pivot tables are limited they only they only have so many calculations whereas formulas are you know not as limited all right let me show you an example here I want to convert this table and in chapter zero zero we saw an example I want to convert it to an official excel table so I'm going to go up to insert table or ctrl T and then I'm going to click OK so now this is a table let's add a new record down here now the advantage of this table feature is if we add a new record meaning a new transaction down here it becomes part of the table formulas that look at this will automatically update now the pivot table recognizes the table too but it won't automatically update so let's add a record and see how this works I'm going to say 7/2 sold a quad now before I hit tab and enter this notice the count is three four quad and the count is three but when I hit tab for this one does not update 400 bucks so the formula updates the pivot table doesn't know it's easy to update a pivot table all you have to do is notice it's a three right click anywhere in the pivot table and refresh and sure enough it updates but if data is continually changing that gets annoying and something people oftentimes switch over to formulas so formulas automatically update when any source data changes pivot tables do not pivot tables are faster and easier than formulas all right let's go over a couple more topics for frequency distribution sometimes you have a situation like this here's our frequency distribution we have categories reasons for dissatisfaction so we have these categories these are from surveys or phone interviews right so we have some percentages and down here we have a count of two and one and here are the percentages once these get below 5% sometimes people like to group so instead of rough-road and no instructions we want to group these together and call them other so here's how you would do that here this formula is simply a count ifs I'm going to copy it down all the way down to there now notice I replace the formatting when I copied it down so I'm going to point to the smart tag and say fill without formatting well look at this formula that's silly it's looking at other there is no author here so we have to know our categories and up here I have rough wood and no instructions I'm going to instead of clicking there click on rough wood and that's kind of hard to see there I should not have made it green and I'm going to add now right now it's going to look good look through there and get rough wood which would be a count of two so countif will give us a count of two plus I'm going to paste this again without the equal sign and I'm going to replace a double click that g11 and now I'm going to click there now it's going to count it's as if we took the formula from that cell and the formula from that cell and put them together in one cell so that's one way to do it here's another way with a pivot table you can click in the data set alt n VT and I'm going to place it down here I'm going to drag dissatisfaction down to the row labels dissatisfactions down to values I'm going to close the pivot table frequency all right so these last two right here we can simply highlight them this is not the same order so I want to sort first so notice it sorted these by alphabet but I want to sort on the numbers but I'm going to right click sort bring the biggest to the top so you can sort a pivot table now we have rough wood and no instructions at the bottom so I'm gonna highlight these right click group what group the we're not grouping numbers we're grouping categories here net word so I'm going to group ooh look at that now it's called this group I'm going to click here and type other and then we need to collapse all of these right and so there we have our other I'm gonna right-click sort and the biggest on top so that's another way with pivot tables to do it alright and finally one last topic if you're in the class we don't need to do this but this is here just so that if you ever encounter 8 you know because there's something called the frequency function now the frequency function is the easiest way to do a formula and count but the categories are always the same and you can't change them but with the count ifs you can change them so here's how the frequency frequency function 0 I'm just going to highlight this type frequency tell it the upper limits and the data and then use a special keystroke called for array functions and it will automatically create without adding all the fancy comparative operators and joining all that but again these categories will automatically be created you only give it the upper bins the first category will automatically be created the revenue counter in this category will be less than or equal to 50 the subsequent categories are similar to what we had before except for the up and is included and the lower is not and finally they have one extra category whatever you give is the upper limit of the last class it'll create automatically one extra class that says greater than that now the one great thing about this is that this set up of categories will always capture any value because the first one is less than or equal to the first upper limit all of these ones are between with the upper limit included and the last one is everything above so no matter what numbers are over here you put these upper limits in it will capture them all right you ready here some steps here highlight all of the cells you want in advance now look at this that's how many kept upper limits we gave it because it automatically creates this extra one you have to highlight one cell more in the active cell you type equals frequency it wants the data and the bins that's it data control shift down arrow now I'm gonna scroll up this is a unique type of function it's called an array function you do not need to lock those cell references R comma and then the bins you want the upper bins close parentheses now if I you have to use a special keyboard shortcut to tell Excel that you're entering an array function and you have to use control shift enter so you hold ctrl and shift both down at the same time and then tap enter notice up here in the formula bar you can see these curly brackets they're automatically added in when you do an array function like this and it is all considered one thing it actually is called array because it enters an array of values in to multiple cells at one time all right that's the longest video ever did here at YouTube all about frequency distributions we'll see a next video for more about chapter 2
Info
Channel: ExcelIsFun
Views: 86,212
Rating: undefined out of 5
Keywords: Excel, Statistics, 2010, excelisfun, Mike, Gel, Girvin, Highline, Community, College, Busn, 210, Statistical, Analysis, Business, Economics, Slaying, Dragons, Frequency, Distribution, Column, Histogram, Charts, For, Categorical, Quantitative, Data, COUNTIF, COUNTIFS, function, Other, categories, FREQUENCY, PivotTable, Pivot, Table, Chart, PivotChart
Id: 0I0g16PvuoY
Channel Id: undefined
Length: 78min 28sec (4708 seconds)
Published: Thu Jun 30 2011
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.