MSPTDA 22: DAX Data Modeling to Create Date & Time Dimension Tables, Server Downtime Visualizations

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
[MUSIC PLAYING] Welcome to MSPTDA video number 22. Yes, Microsoft Power Tools for Data Analysis. And like last video, we're going to learn how to do some data modeling. But in this video, we're going to use DAX instead of Power Query. And our topic is how to create date and, very importantly, a time dimension table to analyze server downtime. Now, we're starting off with the CSV file we have to import, and there are lots of machines. And very importantly, we have a DateTime marker that says when this event occurred, what the duration is, and the reason. We're going to use Power BI Desktop, because we want an amazing, interactive visualization like this. This is July 2018, and I simply want to hover and see the total downtime for each one of the server computers. Also, if I click on July 2018, I want the reasons and the sum of downtime to appear. And over here, we have a word cloud that shows us that the word changeover and atom have the largest summation of downtime. If I select an individual server, I want to see the accounts and the name of the machine. From below the video, I've downloaded this start file. Also downloaded the CSV file. We'll start out by going to Get Data, Text/CSV. There's the file that we downloaded. Double-click. The delimiter is comma. Click the Edit button. I'm naming this fDowntime. It has the source, promoted headers, and change type. We want to select Machine, DateTime, Duration. Holding Control the whole time, now I'm going to click ReasonText, right click, Remove Other Columns, select DateTime. We're going to add column, because we need to extract an hour. So Add Column has a great built-in feature. From date and time, dropdown for Time, and I want Hour. There's our hour. I'm going to change the name. Double-click. We're going to call this HourID and Enter. Select DateTime, and instead of Add Column, we're going to Transform. This will transform the column, not adding a new column. We want date only. Now, we don't have any other data sets to add later, but that file might update, and I want to filter. Also, last video, I made a slight error in what the upper limit is if I want to get just data for certain years. So let's do it correctly this time. Select the dropdown, date filters between, and we want data from '16 all the way to '19. So is after or equal to 1/1/2016, so all the way from 1/1/2016 with no time and everything above. And then the upper limit will say is before the first day in the next year we don't want-- 1/1/2020. Now, last video, I did is before or equal to and did the last day in the year, and that missed all of the time values for the last day in the year. But this way, we'll get exactly what we want-- all the records for '16, '17, '18, and '19. Click OK and the applied step filters to get just the records we want. Now, this is our fact table. And from this fact table, we'll extract data to create our date and time dimension tables. Now I go up to home. Close closes the Power Query editor. Apply applies the Power Query steps and loads it to the data model. Click Close and Apply. Over here in data or table view, I can see it's been loaded. In Relationships, I have just one table. Now I want to go back over to table view, over to modeling. And the first dimension table we're going to create is the Date table. So I'm going to click New Table. This is awesome. This isn't over in Excel, but over here, we can create a DAX-created table and load it to the data model. This will be dDate equals and then CALENDAR. We need a start date and an end date. We're going to have to get that information from the Date column in the Fact table. We're going to create the date for StartDate using the DATE function. Then we need to get the year of the min date fd, for downtime. There's our DateTime. I actually want to change that to DateID, but no problem. We'll use it here, go back and change it, and everything will update. That will get-- close parentheses, close parentheses-- the correct year for the minimum date. Comma 1, comma 1-- that'll work for month and day-- close parentheses. Now, if I copy this whole thing- Control-C, click at the end, comma Control-V. I just need to change -in to -ax and then month 1 to 12, day 1 to 31. Now when I hit Enter, I get the correct list of every single day between the min and max, including making sure that it goes to the first of the year and the end of the year, which is required for a date dimension table if you're going to use any of the time intelligence functions. That DAX formula delivered a table to the data model. Now we can attach columns. New Column. We'll name it MonthNumber, equal sign. There's the function MONTH, and we need to look at, down arrow, the Date column. Close parentheses and Enter. New Column. And to get from a date to a formatted version of that date, we can use the FORMAT function. Dd-- there it is, the Date tab-- comma, and we have to know custom number formatting, in double quotes. MMM-- that means please show me the month three-letter abbreviation. End double quote, close parentheses, and Enter. Now, as we've seen numerous times through this class, if I use this attribute column in a report, this will sort alphabetically. So no problem. We come up and say, please sort month by month number. And in fact, the month number column is not an attribute column that we drag and drop into visuals. It's a helper column to help sort this column. I can right-click, hide in report view. Part of data modeling is including only the columns we need in the reporting area. New Column, and we will create a column for year, and Enter. I also want to create a column for day. So I come up to New Column. And we're going to use the format on the date column, comma. In double quotes, I want to show MM-- that's a number with a lead in zero-- slash dd-- that's the day with a leading zero-- comma YYYY, end double quotes. And actually, I'm going to change this. I want to see the day name, so DDD, space. That way, I have a three-letter day abbreviation, and Enter. Now I want to come back to the Date column. This is showing date and time. I'm going to go up to Formatting-- actually, to Data Type and say, this data type is Date. The format-- DateTime. And there's how I want to show it. Now, this column will not sort correctly, either. All of the Sundays will be together, all the Mondays, and so on. And so I say, hey, sort that column by the date. And now when we drag this in any visualization, it will sort according to the date. Now, over in our line chart that we saw at the beginning, we'll actually-- we won't have a hierarchy, but we'll drag year and then month and then day. But we also want hour. So now we need to create a Time dimension table. This Date table is completed. Now we come up to new table. And what in the world are we going to do for dTime? Well, there's only 24 hours, and we want to start at 0 and go to 23. So we used the GENERATESERIES function, Tab. The start value-- 0-- comma, end value is 23, comma, and the increment is 1. Close parentheses and Enter. I'm going to double-click and call this HourID, and Enter. Now, that reminds me. I want to go back over to dDate and this column right here, double-click, and I'm going to call this DateID. We're going to use that to connect to the Fact table. This is what we'll use as an attribute column for Day. And over in dTime, that is our ID column to connect to the Fact table. Now, I do want a label that will show 1:00 PM, 2:00 PM, and so on. And I'm going to make it easy on myself, since creating a serial number time is easy enough. The serial number time for 12:00 AM is 0. The serial number time for 1:00 AM is 1/24, 2:00 is 2/24. New Column. SerialTime is equal to dt-- there's our HourID-- divided by 24 and enter. Of course, time in Excel and Power BI is expressed as a proportion of a 24-hour day. So those are all correct. Decimals to indicate the hour. New Column. Hour equals the FORMAT dt-- there's our serial time-- comma, in double quotes, hh space AM/PM. End double quotes, close parentheses, and Enter. I don't know how this extra column got here. I must've clicked it twice. Right-click, Delete. If we wanted to list the full category, 12:00 AM to 1:00 AM, we could do this formula. Hey, that's just 12:00 AM. I join it using the ampersand in double quotes, and we'll say up to, and double quotes join ampersand. And I'm going to cheat and copy, Control-C, Control-V, and guess what? I want this to be not 12:00 AM, but 1:00 AM. So what do I have to add when I'm dealing with time if I want one hour? Plus 1/24. Now, when I hit Enter, I have a category. Now, both of these will not sort correctly in a visualization. 1-2 will end up right over here. So I need to tell this column to sort by either one of the others. And let's say HourID. Same with this one, HourID. Now we do not need serial time to show up. Right-click, Hide in report view. Right-click, Hide in report view. Now, you would probably not have both of these columns, but for the time being, I'll leave them both there. So our TimeDimension table is completed. We used a table function, GENERATESERIES. We added one, two, three DAX columns, and then we hid the columns that we don't want to show up in report view. Over here, we can see dTime. Let's go look at dDate. These are the three columns-- year, month, and day-- that we want to show up over in report view. I don't need DateID, so right-click, Hide in report view. Same thing here, Dimension table. We used the table DAX function, and then we had various DAX-calculated columns. And we hid the columns we don't want to show up in report view. Let's go look at fDowntime. Now, this table was created using Power Query, so I'm going to go back up, because I want to change that name. Back up to Home, Edit Queries. We could have changed it in DAX, but I'm going to keep it simple, since I did everything over here in the first place. Double-click, and we'll call this DateID, and Enter. That adds a new step. Now I click Close and Apply. Any time we need to come back and do something, we can go back and forth between Power Query and the data model using DAX. Close and Apply. Now, for downtime, we actually want to add a measure, because we want a sum of the duration. So with fDowntime selected, we go up to Modeling, New Measure. Total Downtime, and I'm going to indicate the units, since it's minutes. And that's going to equal to SUM of fd-- and there's our Duration column. Tab, close parentheses, and Enter. We'll add comma. Make sure it's zero decimal. Over here, we see our measure. Now, we need to hide some of these columns, but it's easier to do over in relationship view. Hey, there's our Fact table. I'm going to select that first column, hold Shift, select HourID, right-click, Hide in report view. Now, Machine and Reason, we might pull these out into a dimension table. But for our purpose here, we're going to leave both of these columns here. We'll drag and drop these as attributes from the Fact table. And there's our measure. Now, here's dDate and dTime. HourID, this is the one side over to the many side. There's our one-to-many relationship. DateID, one-to-many relationship. Now, the cool thing about this model here is now, in our visualization, if we want to have a day and then an hour, that day will flow over, filter that Fact table down to just those days. And then with the Fact table filtered from whatever attributes are taken from the Date Dimension table, we can pull Hour. And that will further filter the Fact table and give us our desired total downtime in minutes. Now, let's go create our visualization over in report view. Now, the first thing we want is a line chart. Over here Visualizations, I'll click Line Chart. We can resize it. We want our date attributes along the horizontal axis. So I come up to Date, Year down to axis, Month below Year, Day below Month. And then, from dTime, let's drag Hour down below Day. From fDowntime, there's our measure, down to Values. Now, with a line chart like this, we want to learn about these drill-down options in the upper right-hand corner. When we have multiple items in the axis, this button allows us to go to the next level in the hierarchy. Now, we're showing year. So if I click go down to the next level, that shows me all of the Octobers for all of the years. If we go back up, if we use this button instead-- expand all down one level on the hierarchy-- click, now I see October 2018, October 2017. The other amazing option is this-- drill down. So if I select expand all down to the next level, with drill down on, when I select October 2018, that shows me all of the days in the next level down. If I hover over Friday, October 26, now when I click, if drill down is on, now I see all of the hours for that day. I'm going to uncheck this and go up, up. And now I'll click Expand All Down. We also have this amazing feature. That is called a tooltip. The legend items are shown with the total. Now, the next thing we want to do is click in the white, and we want a matrix. And we want the reason. So we come from fDowntime and reasons to rows. We can resize this. Click and drag. Sometimes it's hard to click and drag. Put that white arrow, usually, near the edge. Now we want total downtime in minutes. Now if I come up here and click on October 2018, there are all the reasons and the total downtime for that month. Now, if you click again, you could get an individual machine. Now, that's hard to see, and I can't really tell up in legend. So I actually want to, when we do that, have a card here that tells me what machine that is. Now, in order to do that, we're going to create a new measure. So I click on downtime, Modeling, New Measure. I'm going to call this Selected Machine equals. And there's a great DAX function called SELECTEDVALUE. We need the column, fd-- there's our machine, Tab, comma, and then the AlternativeResult. Double quotes, all, end double quotes, close parentheses, and Enter. In the white, I'm going to come up and select card, selected machine check, and there it is. So now we can test this. Select October, and there it says all. Now, we want to do something kind of wild here. We see there's lots of different reasons. And we can use what's called a word cloud that actually can look at this column and show the words with the largest summation of downtime with the biggest font. Now, I don't see word cloud up here, but this is amazing. There's all sorts of custom visuals we can import. Click the More button. Import from Marketplace. Wow, a bullet chart, a histogram. We know how to create histograms without that special chart, but there it is. I'm going to click Add, Import, click OK. And there it is, our custom visual. I'm going to click in the y. Select Word Cloud, check Reason. That goes to category. And I want to measure the total downtime. That's our measure-- adding up the total minutes. And that is amazing. And guess what? I can unfilter this, and it's totally changing. Now, there are some repeats. You can see ATOM, ATOM, ATOM. This may not be exactly what you want, but it definitely shows you what word is appearing the most according to total duration time. The combination together may be helpful. This tells us exactly changeover ATOM, that's the total downtime. So a little over 2 million for changeover ATOM, whereas up here, just ATOM is 13,000. But guess what? That ATOM right there, if you hover, that gives you the total of all of the summations for every time ATOM appears. So if you went through and added total downtime in minutes for each occurrence of ATOM, that's the total right there. Now, if I come up and fill to this by October or down to the individual machine, it is working. I'm going to click on 2018 twice, get back to all. Now I want to increase the font size here. There is my paintbrush. Down to title. Font size, 15. Font color, black. Come back over to May. Everything's updating. All right, in this video, we saw how to create some interactive visualizations, but our main goal in this video was creating our Date table with DAX measures and also our Time Dimension table. All right, if you like that video, be sure to click that thumbs up. Leave a comment and subscribe, because there's always lots more videos to come from ExcelIsFun, including next video, number 23. We'll see how to deal with two Fact tables and how to merge them together using DAX and data modeling. All right, we'll see you next video.
Info
Channel: ExcelIsFun
Views: 14,665
Rating: undefined out of 5
Keywords: Excelisfun, Highline College, Data Analysis, Business Intelligence, BI 348, Mike Girvin, Microsoft Power Tools for Data Analysis 22, MSPTDA 22, DAX Data Modeling, SELECTEDVALUE DAX Function, Word Cloud Visualization, Download Custom Visualization in Power BI, CALENDAR DAX Function, GENERATESERIES DAX Function, Create DAX Data Table, Create DAX Time Table, Power BI Data Table, Power BI Time Table, Time Dimension Table with DAX, Data Dimension Table with DAX, Sort By Column
Id: RhPzRk-wEYw
Channel Id: undefined
Length: 20min 8sec (1208 seconds)
Published: Tue May 07 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.