A VISUAL Every Analyst Should Know | HISTOGRAM with Dynamic BINS in Power BI

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
I would like to know how many of my products I sell very frequently versus just sometimes versus almost not at all well for that we need to take each product count the number of orders for that product and assign it to one of my buckets and of course we need to know how big are these buckets well not frequent could mean from zero to 100 orders and sometimes could be from 100 to 200 and everything above could be well very frequent now these buckets we sometimes refer to as bands and when we visualize this in a column chart well then we have the histogram a tool that every data analyst needs because you can get to Super useful insights about the distribution of your data now in this video I'm going to show you how to set it up and most importantly how we can let the user choose the size of these bits which is more tricky than I thought and even if you're not interested in histograms at all this video contains many super useful tricks so let's get started [Applause] welcome to how to power bi my name is boss and if this is the very first time for you visiting this channel make sure to hit that subscribe button to stay up to date on all of my videos in which I everything you know about power bi but over here I already have a gun charge with one color and it shows number 1003 which is the total number of products that I have in this data set for GIF sales all right now what I would like to have is an understanding how is this distributed how many of these products do I sell very frequently not so frequently versus almost not at all now what we're going to do as a starting point is show a breakdown by all of the products and we're going to count the number of sales orders that we have for each product okay so we can get rid of the count of the product key so I'm gonna remove that one I'm going to add over here on the y-axis the number of sales orders and here on the x-axis I want to have a breakdown by the different products so I'm going to add the product name okay now here you see for each product on the x-axis the number of orders that we have for that product now you see we have here also a horizontal scroll bar and well it's quite many products so is this really helpful not so helpful to get a better idea of how the data is distributed now what we could do is well create buckets where we say all right we are going to have a bucket from 0 to 50 orders how many products do we have that fall inside of that bucket and then from 50 to 100 how many products do we have that fall inside of that bucket that will give us a better understanding of how our data is distributed now one way in which you could do that is by going to our product table and adding a column however you will see that it's kind of static which will not be ideal for what we are going for but maybe as a first step let's start there okay I'm gonna go here to the data view I have here my product table select that and we're going to add a new column now here I want to calculate the number of sales orders that we have for each product so I'm going to call this one sales orders count and this is going to be equal to well I already have imagined that's called number of sales orders press enter and and that gives me a new column with the number of sales orders for each product now that's the exact number but I just want to assign it now to a bucket now we could use an F4 switch or maybe to keep it easy we can also go for a round function I can can also round it down and when I round down to 100 so -2 digits then this 193 would become 165 will become zero and the 218 will become 200 so that's going to be the beginning point of each bucket that is 100 in size okay now I'm going to use around down function the number is given by the number of sales orders and over here I want to have minus two number of digits and so you see if I have minus one that 193 over there will become 190 all right and if I do now minus two then that one will become 100. exactly and now we have kind of a bucket all right so now we have our starting points for our buckets for our bins so maybe a better name would be not the sales order count but bins all right and now with that new bench column we go back to the report View and here I want to first of all have a breakdown by these bins these buckets so I'm going to replace product name we have that new column which we have in a product table all right and for each bucket we want to know not the number of sales orders no we want to know how many products do fall inside of the pocket so I'm going to take out the number of sales orders all right and now I'm going to go here to my sales table take the product key so there it is and I would like to have not accounts but the testing counts okay now and that tells me how many products in my sales demo phone inside of each bucket but the Sorting is a little bit off it is now sorting by the count of product key and I want it to sort by well depends so bins and I want to have it in ascending order now an overview of the distribution already of how many products do I have that sell not so frequent versus just sometimes versus very frequent okay now this is nice however it's kind of static because maybe you're missing still valuable insights that you could get if you well would make more bins well more granular breakdown now how could I achieve that for that I would need to go back again to the data View and then instead of having buckets bin sizes of 100 well we might want to change that to 50 or 25 however it's always static the user cannot make the change and that's the thing that we want to solve next which also means we cannot do it with a calculated column because that only refreshes when we refresh the data set so that's not going to work so we have to find a different solution now let's get started all right step on we're going to add a new table with just one column that's going to have all of the starting points for our bins all right now to do that we just go here to modeling new table and this one is actually really easy we just are going to say then start that's the name of this table all right and we're going to generate a series now it's a series with the starting points for our bins now let's say we're going to have at the most granular level let's say Benz of 25 so I'm going to start over here as a minimum Point 25 and value Let's see we go all the way up to a thousand and the incremental value is going to be the most granular step that you will want to choose from 25 in my case all right I'm going to close the brackets now what the builders do well let's have a look I'm going to go and switch to the table View binstar is the table that select that and you see over here we have this one column with values that make jumps of 25 25 15 75 100 all the way up to 1000. now instead of calling this column value maybe let's rename it to something more intuitive this is going to be bin start all right now let's switch to the report View and first let's create a table so that we have an easy way to see if everything works all right let's make it a little bit bigger and put it right next to the other one and now here on the right hand side I'm going to go to my new table then start and see there's an aggregation I can write in front of it but we don't want to aggregate it we want to use that column to break down our columns right we want to use that later on on the x-axis so I'm going to select that field and then here we have column tools summarization I'm going to put this one to do not summarize okay and now we can grab it and just put it onto our table visual now you see we have jumps of 25 25 50 75 100 okay that's good however now I would like the user to be able to choose what are the starting points and when the user says 50 well it should go from 50 to 100 to 150 200 if so as the user says 100 should jump from 100 to 200 to 300 okay now first we need well the ability for the user to choose that value so that means numeric parameter so I'm gonna go all the way to the top to modeling new parameter and this is going to be a numeric range parameter so numeric range the name of this parameter well it's going to be that bin size data type whole number is okay then the minimum is going to be let's say 25 the maximum let's put it to 100 for now and increments of let's say 25 not the default let's put it to 50 and also add a slice to the page okay let's click on create all right there we have a slicer from which the user can choose the bin size value now if I slide a little bit to the right you see we have 50 75 100 and all the way to the left is 25. okay now let's put it to 50. and now I'm going to going to take the slicer and put it right above our table okay now when I have here 50 selected I would need to see here 50 100 150 how can we do that well we need a measure that's going to filter pin start so that when we have here 50 all the other ones that are not in the increments of 50 disappear now we're going to write that measure so let's add a new measure all right now let's call this measure bins filter all right because we only want to keep the relevant ones and then we can work with variables makes it a little bit easier and the first one let's call it bin size okay now that's the easy one because for that we just created a numeric range parameter and when you create that numeric range parameter you see over there also creates automatically a new measure bin size value so we just refer to that one bin size value okay no that's the first one now let's create the second one so here another variable and this is then the bins to keep now how can we figure out which bins starting points we want to keep well first of all we need to know about the bin starting point so for that we can use it selected value function which returns just the value and there's one value and here we get that say ah we have been start table bin start column all right and we want to check that value and then divide it by the bin size that is selected okay so that variable that we just created above it now let's see what this returns before we continue okay so I'm going to return bins to keep okay now here we have been filter I'm going to add it to our table now pay attention bin size that's currently selected it's 50 so we have 25 divided by 50 as a half 50 by 50 is one and for 75 is one and a half every time when we have the ones that we want to keep is always an integer one two three four so when we make jumps over 50 and when I would set this to let's say 100 okay then we have only these integers when we have here 100 200 300 Etc okay now I'm just gonna put it back to 50. so with that knowledge how can we then well we turn true for when we have over here one or two or three well let's go back and let's make that change okay so I'm going to select that bins filter measure and over here I'm going to check if this is equal to all right and now I'm just going to copy it all right and I'm going to see if this is equal to the rounded value of this so here I'm going to round it and then for the second argument well just zero close the brackets and now let's see when it returns true or false now again 50 is a select bin size and you see it only returns true for 50 100 150 okay if I set this to 100 it only returns true for 100 200 300 Okay so that's not logic because now the only thing that we need to do is go back once more then here we can add another variable let's call this one bins filter which is then going to be equal to well we want to check if bends to keep is true right so bins to keep and you don't have to say is equal to true well because it already returns true or false now if it's true then we want to return one otherwise we want to return a zero now also important don't forget to update what is after the return statement so that's going to be bins filter all right and that is the final measure that we can use as a filter and you see instead of add true and false we have now ones and zeros then we can select the table visual open the filter pane and here we want to use bins filter measure now it is already added so therefore it shows already here and I want to say that it needs to be equal to 1. and apply the filter okay so now we see we have the starting points that make jumps of 100 100 200 300 because the bin size is 100 if I set this to 50 then we make jumps at 50 if we set it to 25 we make jumps of 25. so that is already working so we have a dynamic way to choose the bucket size and Link it to the breakdown of a visual all right now the next thing the next thing is that we need to count how many products fall inside of heat bucket all right so for that again a new measure so let's add a new one all right now let's call this one number of products within bin no this is going to be equal to well first of all we need to know the start and ending point of the bin that we're looking at of the selected bin all right so first variable men bin or we can say bin size then all right that sounds a bit more professional and here we can use the selected value function because we want to check if there's only one value and then return that value and we are interested in the starting point okay now then we also need to know the ending point so we have bin size and then Max or we could also call them start that's what this one starts oops start and this one and and this is going to be equal to well bin size start plus d value of the size of the pins which we can get from the measure bin size value okay so now that we have the variables for this start ending points for each single bin we need to go row by row of all of the products in the product table and then calculate the number of orders that we have for the product and if within the bin range then return one another one is zero all right now let's see how that works I'm going to add another variable let's go on this one sales number and then we well we need to iterate so I'm going to use this sum X and we need to iterate over all of the products so I'm going to iterate the product name Point per key all right and then we can calculate B well the number of orders now to do that well let's wrap it inside of here and we want to check if the number of sales orders bigger than or equal to D pin size start okay now at the same time so Ampersand Ampersand and another condition needs to be true because the number of sales orders also needs to be lower than well the ending points so bin size at okay now if this is true then I want to return one if this is not true then I want to return a zero so if it falls within the bucket for that Brack it returns a one right and then for all of these products that we turn on then we can just sum up the ones and that gives us the well the count the number of products that fall within the bucket that we are looking at in the visual within the filter context okay now let's then return over here it did that last variable so sales number okay well that's and then this looks a little bit better and let's now add it to the table okay now let's take the table and make it a bit wider so that we can see the numbers a bit better now I made one mistake and that is I should have started at zero for the bin starting range so I go back here to the table and update the starting point to zero all right so now we have there the first bucket as well bin start zero so meaning the bucket that goes from zero to fifty we have 153 products within that bin and then we have 50 to 100 Etc okay now what is actually happening to the 50 itself when somebody has well count of orders of exactly 50 well does it fall inside of the first bucket or the second bucket well we have to make a choice otherwise we could be double counting some products so let's go back and here we can just say all right we have the bin size and well at the bin size value minus one all right so that we go from 0 to about 50 minus one 49 and then we go from 50 to 100 minus 1.99 okay so then here we have number of sales orders is equal to or bigger than okay so that's fine and then here is smaller than or equal to bin size ends all right so now we can just slide this one to the left or to the right and add the number of products that fall within that bin nicely updates okay so that is working so what is the next step well now we can use this but then on that column chart from before so let's get rid of the table we don't need that anymore I'm going to put that slicer right next to a bar chart okay and then I take over here the bar chart and here I want to first of all up into what's on the x-axis all right so I'm going to go here to the bills panel now instead of having that bends column that we added all the way at the beginning I'm going to remove that one and now we're going to have here bin start that new table that we added after all right and here instead of the count of products from before we are going to have here the last measure that we wrote the number of products Within the bin okay so that's it now looks a little bit weird because the Sorting is still off so I'm going to change that so sword access pipe and start and we want to have it in a standing order okay now here we need to make sure that the filter on the bin size is still applied so you see over here that's not the case just yet and I need to go here to the fields panel the data fields panel and then here we can take the bins fail to measure drag it onto the filters for that visual say that it needs to be equal to one and click on apply all right so now it should work if we drag the handle to the right for the pin size you see we make jumps now of 50. once more and we make jobs of 75 100. okay so we got that to work okay now I'm gonna put it back to 50. however you see it also shows me over here blanks well for well where I don't have any number count so let's make it a small change and so I'm going to go back here to the number of products within the bin and here instead of returning zero I just wanna well return a blank or I can just leave it out and it returns blank by default okay perfect so that is working now what still bothers me a little bit is that well here it's not straight away clear for the user that that first bucket where it says zero is from 0 to 50 and then 50 to 100 it needs a little bit of explanation right so maybe we can do a better job now how can we now change that from 0 to 50 well for that we would have to go back to the data view then go to that bin start table and then over here maybe create an actual column call it bin label and then whenever we have over here zero we could say 0 to 25 is equal to 50 Etc however the thing is that would again be static because it doesn't react to any slicer selection so that will not get us there now so we have to think of a different solution what we could do is that we make use of custom labels so that we just get rid of the x-axis entirely now let's do that as a starting point that's right take over here the chart go to the formatting options I'm going to get rid of the x-axis all right and now the next thing I want to have these labels on top on top or I mean above the bars okay so I go over here to data labels actually I want to use total labels but it's grayed out so why do I want to have total layers because then we can use the data labels for showing the range from 0 to 25 25 to 50. use them for a different purpose okay now how do we get total labels to show up well we can do the following we can create a dummy measure all right so I'm going to add a new measure and let's just call this one dummy and this is going to just return a black okay so not uh zero or anything like that just a blank okay so now I take that one I'm going to add it to a visualization now nothing really changed however what did change is over here that we can select total labels and we can turn it on so I'm going to just turn data labels off for a second so that you can see the total labels there you go all right so we have the total labels now we can use the data labels to show the range of the bin size okay so for that we can open up data labels choose the series you want to add it to now the number of products moving then okay and then we can say where we want to position it we're going to have an inside base all the way at the bottom then for the for the values let's make them as small as possible all right and then here we want to use custom labels now we need to write again a measure for this all right so a measure that Returns the start and ending points right so we did this kind of already here so I'm just gonna copy this part over and then create a new manager and then just paste that in and this is going to be well our custom labels that's just there call it that for now custom labels and over here we want to have the starting point the ending point and well then the label right so the label is going to be equal to pin size start then we're going to add Maybe a minus sign a hyphen and then here we want to have the ending point okay and that's what we want to return now with that measure we can go back to the formatting options data labels number of products within bin is selected short data labels all right and then I'm here for the custom label we're going to use that new measure that we just wrote custom labels okay and you see here at the bottom we have the range showing up now instead of having the x-axis there okay and if we change this now to 75 okay nicely updates and to 100 perfect now just one idea of how you could integrate that slicer that we now put right next to a visualization would be in the title right we make this a little bit smaller and you will see that well that timeline will disappear however if we make it too small you get this filter again so we go into properties and then here on the advanced options you can turn responsiveness off and also here for in the visual this is the header we can get rid of that we actually only have a box for well for the number and then here for the values let's go and add a little bit of a background so over here maybe this color okay and now I'm going to slide it right here now it still has a white background so let's go into size and style turn the background off all right and now we just have to resize it so that it looks a little bit better it fits in with the rest just like this now in the original visual app three dots there so to get rid of those we can go back to visual title and here we have the protagon that's the main title subtitle there it is size dot I'm just gonna get rid of that perfect okay so now the user can come and just click in that box and change it to whatever number they like for the bin size to get those extra insights of how the data is distributed because sometimes the bin size can make a big difference now if the user would choose a very small bin size for example 25 what could happen though is that well the range doesn't show anymore because it doesn't fit maybe that is something to keep in mind when you create this visualization that the only way that they could figure it out now is by well offering it off right right so that would not be ideal so maybe that option you would well want to take out all right and that's how you set up a histogram with Dynamic bin sizes now you see quite a few tricks that were integrated that you can also use for all different kinds of visualization tracks now if you're interested in seeing more of these tricks then check out these videos over here I want to thank you for watching if you have any questions put them in there section below I hope to see you in the next video
Info
Channel: How to Power BI
Views: 40,552
Rating: undefined out of 5
Keywords: histogram, powerbi histogram, power bi histogram, create a histogram in power bi, histogram in power bi, dynamic bins in power bi, bins in power bi, power bi dynamic bins, create bins in power bi, create bins power bi, powerbi bins, create groups in power bi, power bi groups, create groups power bi, group items in power bi, cluster in power bi, power bi cluster
Id: oa0sBSJBFe0
Channel Id: undefined
Length: 27min 0sec (1620 seconds)
Published: Sun Sep 10 2023
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.