Pick Your Poison: Level of Detail or Table Calc?

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments
Captions
all right welcome everyone thank you for being with me today in a lunchtime so happy to see that it's a full room and welcome to this session where we will explore how calculations can help you analyze data even further today so this event is live streamed so I would like to say hello to and good evening to people who are watching from Europe and hopefully nobody's watching from Asia but if anybody is very early good morning to them as well so my name is Sal Chuck and well I I don't know how to pronounce it so that people can get it from the first time but it's just pronounced as it is written so just if you come to me just say Sal Chuck and yeah that's it but I come from our London office so I've been working for tableau for four years so far but as you can see from the accident I'm not actually native to London so some of you may think from the name that maybe I'm coming from Turkey because it's a Turkish name but actually I'm coming from a tiny country in Europe called Bulgaria and I live and my hometown is a really really well it is a work actually but it's a really really small town in South Bulgaria called mum children I just I didn't how can i play this you go and probably you can feel fill in this room with one third of the population of the town so it's really tiny but four years ago this week I actually moved to London to work for tableau so this is my fourth year anniversary in table so it's a special way for me and I actually remember this week four years ago because I was sending pictures of my brand-new US visa to HR so four years later I'm here well when I moved to London and started working with tableau that was a time when Elodie calculations were just released and that was the big news and everybody was talking about them and you can imagine that as a new person like protocol within tableau I was a little bit confused what they do but once I learnt them I started using them for everything and then I realized the trend so people who started working with tableau or using tableau before LEDs were really really confident and comfortable using table calculations so they would solve everything with the table calculation because they knew all the hex whereas people who started using tableau after LOD calculations were released would stick to LOD calculations for everything well this session is about learning why you should use both and when to use them so we I want to explore with you some examples where I where we will see when it makes sense to use a table calculation and when it makes sense to use an LOD and why do we have this session well because as you know in tableau there are many ways to get to the same number to solve the same use case using different methods but not every method is the best and as a good consultant my favorite answer to the question which is the best method is well it depends so it depends on the use case it depends on what you're exactly visualizing and what you're aiming and with this session hopefully we will explore what are the questions that we need to ask ourselves before making any choice of what type of calculation to use so when you make a calculation type you basically end up with a trade-off it's a trade-off between the simplicity of the calculation so how easy you write it it's flexibility so how you can use it for advancing the calculation or the analysis further on and how performant the calculation will be or how expensive it will be on the database or on tableau your machine itself the goal of this session is to learn what is the proper method to choose from the calculation types so that we can actually have more efficiency when we write the calculations so it will save a lot of time if you just know what calculation to start with second it will increase the collaboration within the team because if you select the proper calculation types then people who work on the dashboards that we have created will be able to expand the analysis using let's say web hosting or self-service analytics and finally it will increase the adoption of any analysis that you do because the dashboards that you create and the analysis that you do will be in their optimum performance so for today the agenda is very simple we'll go through some basics although this session is designed as an advanced session and we will not dive deep into what table calculations are how they work or lod calculations themselves i would like still to go over some of the basics so that everybody in the room even if you haven't use table calculations or LD know a little bit about them so that you can understand the concepts so let's start with some basics in tableau when we open tableau we see a visualization it's a simple interface with drag and drop and create immediately a nice visualization but actually a lot is happening behind the scenes so we need to understand what's going on behind the scenes so that we can understand what table is giving us as a result and the things that we really do need to take it into consideration are the records of the data set so what each role in our data represents the order of operations that tableau is using to do the calculations then we need to understand where the calculation is actually executed and finally what is the scope of the view so what is exactly that we are seeing as results on the screen so let's start with the records of the data this is a very simple data set which is about a survey with 4 participants because every record is actually as a unique participant with a unique survey the average that we see on the screen is the average score of affection rate by participant well it becomes a little bit different within this dataset it's actually the same data set but we'll look at it let's say three months later and we realize that the score has dropped the reason for that is that one person has actually filled in the survey three times so this score is no longer the average score per participant because not every role is a unique participant it's an average score maybe by number of surveys filled in so we need to understand what our records are in the data set that we are using so that we understand what the values we see on tableau represent once we understand that we also need to check what the order of operations in tableau is this is a screenshot from our website and actually if you google order of operations tableau you will find a lot of pictures and variations of this process and what it shows us is in what order everything will be executed to get to the number that we are seeing on the screen so that you can understand where exactly the calculation happens and what it takes into account as data the third thing is the execution of the calculation so the execution of the calculation can happen actually in two places it can be either in the database or it can be in tableau so understanding where the calculation happens is important again to understand what data is considered to get to the result and finally scope of the view or some people call it with the level of detail of the visualization so the scope of the view is what we would say how the results are broken down by the dimensions that we have used in the example that you see on the screen we have category subcategory as well as segments on the screen dragged and dropped in different shelves on the interface and as you can see every result is a combination of these three dimensions so if we look at a single mark a single result this blue stack bar we see that this is representing office supplies category the binder subcategory and the consumer segment so the the scope of the view is all of these three and all the calculations that we do or some of them actually will see later will take into account that scope so the scope of the view is determined by the dimensions that we drag and drop on the view but not every shelf is changing the scope of the view so does the shelves that you see which are highlighted like columns rows mark card like color size etc will change the scope of the view whereas the shelves like pages filter tooltip they will not change the scope of the view they will change the result because you are filtering but they will not change the scope just to repeat the four things that we need to take into consideration when we do any analysis with tableau are the records of the data set that we are using the order of operations where the the actual calculation is being executed as well as what is the scope of the view at that point of time so let's go now to some basics related to calculation types in tableau we have three types of calculations the basic expressions the table calculations as well as level of detail expressions so this session is about table calculations and level of data expressions only but basic expressions is also one of the mix in the in this pot so if you think of these calculations as a way to go from point A to point B you can think that these are different types of transportation to take you there they have different properties let's say often they will get you to the same point that you want or maybe close to what you want and they will depend on different way of operating so basic expressions can be something like a public transport in a large city type table calculations could be considered to be like bicycle whereas level of digital expressions let's our cars personal cars so let's start with basic expressions and why they are like public transportation well basic expressions will use every record in the data so they they happen in the data source itself but they will depend heavily on the scope of the view so when you write a basic expression the scope of the view will determine the the results of the calculation so that's why they're a little bit like a public transportation you can go anywhere in the city but there are defined stops that you need to get to so you can't be in between a stop so the scope of the view will be your stops that you can you can get to with these calculations the second type are the table calculations table calculations are calculations that are executed within tableau so they depend on what is on the view so what numbers or values we have on the view and they will take into account only those values they don't go back to the database to check for or get extract other values and because of that filters that are applied on the view will be very important because any filter will change the values that you see on the screen and that's why probably they are a little bit like cycling they are very flexible but you need to use your own power so the power of tableau and you may not go may be everywhere in the city because they will not be able to have the power to get you there and finally we have level of detail expressions they as basic expressions happen in the data source itself so they are very powerful and they are more flexible than basic expressions because they don't depend necessarily on the scope of the view they can be on any other level of detail so that's why they're a little bit like cars they are more flexible than public transportation they can't go everywhere like table calculations so it's some places cars are not allowed but they will be powerful enough to get you to almost anywhere in the city that you want and just a reminder of how LEDs are written or what the syntax of LEDs is we basically have three keywords that we can use for LOD expressions fixed include and exclude and I would encourage you if you want would like to learn more about how to use and when to use them to go to a session which is specific for LEDs and I will mention it in the end of this session and then we define on what level we want this calculation to happen and this gives us the flexibility that we want because we can define the level of detail ourselves and finally we define what exactly is going to be calculated in this case we want to see the sum of sales by each customer so let's go and do a little bit of tableau with the knowledge that we have now although and see how this works in practice for the examples that I will show I'm using data from the FAA which is actually real data you can probably you can put you can download this data from their website and it's about claims submitted by passengers about things like broken luggage lost luggage or personal injury when they were flying with an airline every records of the data represents a specific claim and as we said this is the first thing that we need to know about the data and that's why I would like you to see how the data looks like what it contains and so that we can understand what the results of our calculations are so now we will use this data to explore when to use table calculations and when to use LEDs the first example that I want to show is about the simplicity of the calculation so when table calculations are simpler than LEDs and when LEDs becomes simpler than table calculations on the screen we are looking at the number of claims submitted to each airline these numbers are showing us the total number so Delta Airlines had in this period of time it's about 10 years of data 25,000 claims submitted for again lost luggage personal injury etc obviously these claim numbers are related also correlated to the number of flights that these airlines do so they don't actually represent anything specific for the airline itself and we are looking at the top 15 Airlines in that fly to us they are not only domestic but also international airlines what we want to calculate is instead of looking at the total number we want to see the average claims submitted to an airline on a yearly basis so what is the average claims per year the first option is to use table calculation so let's see how this calculation how this analysis can be done with table calculation so I'll just create a new calculation here and call it average claims TC for table calculation and then the calculation will look like something like this we want to take the average of the total number of claims per year so one of the calculations that is really powerful is window that's a type of table calculation and if you want to see all other types of table calculations that are existing in table you can basically look at this grouping of calculations and select table calculations so under table calculations you will see all the functions that will give you as a result of table calculation and these are things like first index lust so the sequence of any row or column ranking running totals as well as window calculations which will give you any aggregation that you want for that specific window that you define so in this case we want to use window average because we want to look at the average number per year and what we are looking at is the number of claims so the calculation will look something like this I hope that it's visible from the back as well window average some number of claims so if I drag and drop this on to the view you will see that table is giving me the same number for each airline nine thousand and eight hundred and this number actually shows me the average number the average number of claims for all the airlines that we see on the view so table immediately took the numbers that we see on the screen and took the average of those numbers but the question is what is the number of claims per year for each airline but because table calculations work on the numbers on the view I cannot calculate that number without introducing also the year dimension on my view on my this so I'll just drag and drop the incident date here and it will become like a larger table that you can see here so let me just make it standard so now you can see that the calculation happened on a yearly level because the results are repeated for each year so for 2002 for example we have on average 69 claims submitted whereas we want the calculation to happen not on a yearly level but per airline so table will always get a guess and we'll try to give you a result but not always the result will be in the order that you want but solving or editing table calculations is really easy you can just click on the on the measure itself and edit the table calculation and it will give you this really useful highlighting so that you can see exactly what tableau is considering when it is doing the table calculation in this case the average is being calculated for each year separately and the compute using is tabled down so the only thing that I need to do is change to table across and all of a sudden the calculation will happen now on an airline level and I can see the average claims per year for Delta Air Lines is 1797 and for American Airlines its 1670 that's it well the calculation is done I got to the number that I wanted to have but I have one issue and that issue is the fact that now this calculation is repeated many times as many years I as I have on my data what if I I just want it to be seen on the screen only once to do that we can actually use another table calculation which is index index will give me the order of each column on the view so if I just type index like this it will immediately show me that this is column 1 column 2 column 3 etc I can use this calculation as a filter and tell table just keep the the index when it's equal to 1 so if I drag and drop index 2 to my filter shelf I can just keep that number and because it's a table calculation it will just hide the other numbers it doesn't really filter the results it just hides them so I can get to this number and the view that I want with a little bit of a hack so let's see how it it looks like if you want to calculate this with an LOD 4 analogy I will create a new calculation and I'll just say average claim LOD with an LOD I need to define on what level I want to calculate to happen so I want to take the average per airline only for each year that we have in the data so I will use a calculation as fixed and what I want to first know is how many claims I had for Delta Airlines let's say as it is in the top four every year in the data that I have so that I can first calculate the total amount per year and then take the average of these totals so the first thing that I need to consider is the airline name because I have many Airlines I need to do the calculation for each airline separately but also we need to do the calculation for each year as well so I'll just type here incident date and it will take the year of that incident date and we'll calculated the total amount for each year in it for each airline and finally the thing that we want to calculate is the sum of our number of times so fixed airline name in your incident eight I need to use two dimensions in my dimensionality of the lod calculation so now if I just drag and drop that on the view and then take the average of that calculation because we are looking at the average itself I will get to the exactly to the same number that I got with the table calculation which was 1797 for Delta Airlines and because this calculation happens in the database and I didn't have to introduce the year on the view as you can see immediately I guess I have one result per airline I didn't have to use any hack so let's look at another example about this which will show us when a table calculation could be simpler that LOD in this case we are looking at the total number of claims per year as a trend line and I can see the average of those years is 11,000 so we had a peak of claims around 2004 and 2005 and then there was like a steep increase in the number of claims so now what I want to do is to calculate the so called residual or the difference between the actual number in each year and the average for the total total period so let's do that first with the table calculation and I'll call this average difference TC the calculation itself will be exactly the same so if you look at what we have on the screen already we have the numbers of claims displayed what I'm missing is the average of these numbers and if you remember from the first example that was a really easy calculation of window average and looking at the sum of number of claims so it's exactly the same calculation that we used with example number one if you want to test your calculations what they give you as result there is a really cool trick to just highlight the part of the calculation that you want to test and then drag and drop it on the shelf and you will see exactly what this calculation does and if I just show you the labels you will see that this calculation gives me the average that we see here for repeated for every year and that's really handy because what I can do now is take the different difference between the actual number for the year and the difference so what I need to do is just add here the number of claims sorry and then take the difference when I drag and drop this view I can now see the difference between that year and the average and if we want to make it like a really nice chart we can do something like this maybe combine these two charts and synchronize it and visualize everything in one view so now let's see how this can be calculated exactly the same thing with an LOD okay what we want to do is first again calculate the different the average number for for the period that we are looking at one of the LOD functions is exclude because the scope of this view is yearly level but I just want to ignore that year is there and calculate the average I can use exclude and say exclude here and give me the average so let's do that exclude year of incident date and give me the average again let's let's see what this calculation does and I'll just drag and drop highlight and drag and drop and see the result so as you can see the result of this calculation is not actually what I expect and it's only one because if you think about the data and how it is structured once I exclude here and take the average what table is going to do is check is go to the field called number of claims which is one for each claim and take the average but because it's one for each claim basically the average of one is one so what we actually need to do is before we exclude the year we need to tell we need to tell table calculate the the total for each year store that store those values in a lod calculation and then exclude the year so it becomes a little bit complicated it's it's like a nested LOD calculation so before we exclude I'll just do it in honor on a new line we actually need to fix the calculation on a yearly level and calculate the sum so what this calculation will give us this part of the calculation is the numbers that you already see on the screen so this part of the calculation is storing those yearly values now I can take these numbers and take the average of those numbers by excluding the year itself and now if I look at what this calculation does should be giving me exclude here fixed let me see something is wrong with the calculation let's see what what it does actually so it we fixed the numbers to the yearly level and now we want to exclude that number let's see it's good here I'll just click OK and drag and drop it again as you can see it gets really complicated when you use an LED and you can get lost in your own calculation that you have done hundreds of times but the idea of this calculation is that once you create an LOD calculation with an exclude you actually can't just write a simple Xcode calculation and take the difference because once you try to take the difference between claim numbers and the exclude calculation that you have created tableau will give an error message it will tell you you cannot mix level of data expression with an aggregation because the level of data expression itself is happening on a row level so what you need to do is understand that concept and put something like a minimum or attribute or maximum in front of the LOD itself to aggregate the LOD to the level of the number of claims so I'll just make this calculation a bit bigger and the expression will look something like this although it doesn't give the exact result that I want it will be as long as complicated as that that you see on the screen and then you need to drag and drop it again because the aggregation of the calculation changed and you now can look at the difference which is interesting because the result of the calculation is correct so if you have so if you look at what we see on the screen is actually what we calculated with the table calculation magic okay so the idea of these examples is to show you that sometimes table calculations will be simpler and sometimes lod calculations will be simpler but what is what makes them different from each other if you look the concept is really really easy actually so if you look at the first example what we needed to do to calculate what we want is to bring extra data to the view not everything that we needed was already there to do the calculation to take the average on a yearly level whereas on the second example we already had everything that we needed on the view so you had the yearly claim numbers and you could use the yearly claim numbers to calculate the average and then take the difference between the two and the first thing that I want to share with you as a takeaway about simplicity of the calculation is that table calculations will be simpler than lod calculations when you already have everything all the information that you need to do the analysis on the view so there will be much much simpler than writing a complexity calculation but all the expressions will be much simpler than table calculations if you need to go back to the data and extract more information from the data itself because taking that information using a table calculation requires you to drag and drop new dimensions and then change the calculation to take into account the dimensionality or the scope the new scope that you introduced on the view so let's build up a framework for our understanding which calculation we need to use the first question that we need to ask ourselves before we start the writing any calculation is do we already have all the numbers that we need to do the calculation on the view itself if the answer is yes then a table calculation is the way to go whereas if the answer is no then an LOD will be the preferred option let's go to the next question we now know how to write a simple calculation and when to use the calculation type that will make it simple for for that use case but what about flexibility so is every calculation although it's simple flexible enough for the analysis that we want let's see it with an example about proportions what we see on the view is again the number of claims by Airlines but this time there these values are broken down by the status of that claim so what is the outcome of the claim was it approved in full was it denied was it settled or whether it's unknown yet and what we want to know is what is the percent of claims that each airline denies so that we don't use that airline maybe so or you don't have a choice sometimes so let's let's try to calculate this number do we have everything that we need on the view to calculate percentages percent of totals yes because we already have the total value which is the full bar and we have every value broken down by by the outcome of that claim I can use a simple table calculation to get to the result that I want and that table calculation actually doesn't require writing it because some of the table calculations are already pre-programmed and the way to find them is just to click on the measure itself that we want to calculate and go to quick table calculations so for things like running total percent difference percent of total year-to-date growth etc you can use this instead of writing the calculation itself table will write the calculation for you and you can save it or amend it if you want to so in this case I want to use percent of total and I'll just select it immediately my axis becomes percentages and if I hover over I will see some small percentages and you can you can guess what happened basically table took all the data all the like not the claims per airline but total claims to calculate the percent of total and you know that we need to just adjust it amend the calculation again highlight shows us that everything was taken now just this time change it to again table across and immediately all of our results will go to 100% total as we expect now I can see that Delta Airlines denies 51 percent of the claims American Airlines denies similar percentage but America West for example denies 35 percent the next thing that I want to do is it's everything is fine with this calculate with this calculation it did exactly what we want but then I have a next question now can I sort these airlines based on the number of denied like percent of the night claims so that I can see which are the worst which are the best in approving claims to do that you may think of just filtering for denied denied claims and once you do everything goes to 100% and you know you probably now can guess why this is the case because table calculations work only on the numbers on the screen once you apply a filter you lose the detail and then it basically checks each airline with itself in its hundred percent so this was a really really simple way of calculating percent of total it was just clicking without even writing a calculation but now if I want to expand the the analysis itself and go one step further I I have a basically a showstopper although in this case LOD calculations will be a little bit more complex because you need actually to write the calculation you will see why they can be useful in some cases although the simplicity may not be there so let's try to calculate the proportion using an LED to calculate the proportion we need a nominator and a denominator and our nominator is just the number of claims that we see on the screen and why is it only number of claims well because the scope of the view is already on the outcome in the airline level so if I just say right number of claims in the calculation it will give me the results based on airline and outcome so I don't have to do anything there for the denominator I need to write LOD calculation as we discussed LOD calculations can be independent from the scope of the view so I can write them on any level that I want and what I want the denominator to be is the total number of claims per airline and as the thank you if you just tell yourself in simple English what you want to calculate it it will give you already the syntax that you need to use so I I need to fix the total number of claims by airline and my dimensionality that's why in the LOD calculation will be airline name and look at the total number of claims like this so if I drag and drop this calculation you will see that it gives me exactly the same results as the table calculation but the nice thing about this calculation is that once I filter the results are still there because they happen in the database itself tableau still knows what these numbers are although we filter them in tableau it can go back to the database to check them and I can now sort the airlines by their denied number of claims so I travel with British Airways not the best choice but nothing happened to my luggage so it's fine so ok the next question about flexibility is a is there are there situations where table calculations are much more flexible than LOD calculations and actually there are there are many use cases where table calculations will be much more flexible and sometimes there will be the only choice that's why they're like a bicycle they can go anywhere in the vicinity of your vicinity basically they are not like cars which need to travel on the road they can go in like on the pedestrian well probably not legally but they can go anywhere more or less so what we will do is we'll try to find out which are the worst offenders or the I'd say the airlines which had the most claims by airport because most of the claims we will not look at that part of the data but if you analyze this data most of the claims are related to either lost luggage or broken luggage and that happens in the airport itself often I guess so what we will do is look at the top five Airlines for each in each airport and to do that I'll just drag and drop airline name like so let's just use a standard view so that you can see what's going on so for Jake JFK the top five for JetBlue Delta Airlines American Airlines etc and how can I solve this with the table calculation well it's really simple what I will do just so that you can see the the result itself is just copy the number of claims and do the calculation on this part it's a table calculation that is also pre-programmed and it's the rank so one of these quick table calculations so if I use a rank it will show me the rank of the Airlines based on the number of claims I just need to make sure that this rank happens for each Airlines separately so if you look at how the rank happened it basically takes into account all the data so the rank doesn't start from one for each airline and the way to make it start from each airline is that each airline is actually a separate pane in the interview so there is a border between the two airlines and I just need to change it from table down to pane down so if I do that / pain the compute using pain down this calculation will start from one for each Airport every time now I can use the calculation itself as a filter so I'll just drag and drop this calculation and say show me just the top five per per airport and you can see immediately the result if you don't need the ranking you can just take it out but the result will be there so again no calculation needed or at least not by just writing the syntax by yourself and you can immediately get to the top five ranking well when it comes to an LOD calculation can you do a ranking with LOD yes actually you can but it's not a very preferred way to do that I will not write the calculation itself because I don't want to miss make make another mistake but just a second let me reopen it it's getting more complicated than necessary to be honest in this calculation what it does is just ranking the top airline per Airport so if I want to get to the second rank I need to write another calculation and if I want the third and it right yet another calculation and you can see that it's not scaling so if I have a certain number of ranking like I want to see the top five maybe I can do it like if I have time just for the you know for the fun of it but if you want to take the ranking of each airline then you can see that it goes to hundreds you will not be probably able to practically write all this calculation so that's why in those cases where you need recursive calculations just recursive calculations or moving averages etc you are much better off using a table calculation in in many cases it's the only way actually so when it comes to flexibility table calculations or lod calculations are very flexible if you need later on to apply a filter on a higher degree because table calculations will take that filter into account but in many cases you don't want the filter to be applied on the calculation result itself then LOD expressions will be more flexible than table calculations whereas table calculations become really flexible when you want to do recursive type of calculations they can get you with one calculation you can get many results using just one table calculation as you saw with the ranking so you just apply it once but it calculates it hundred times or as many times as you need in the data in another use case where table calculations are more flexible than LOD calculations are when you need to work across database data sources so if you have two data sources that you need to refer to in a calculation then a table calculation will would actually be the way to go because LOD calculations can work only in one data source so you can take a measure and a dimension when they are from different data sources now our framework is getting a little bit more complex before getting to the actual question whether we need to use all values or only some of them basically there are two more questions that you may want to ask yourself the first one is whether data blending is required because if data blending is required you need to use a table calculation by default and the second question is is it a recursive calculation like ranking moving average etc or like lookups like in Tarot calculations then again a table calculation will be the way to go there is one use case where the only way to go or the only option that you have is actually an LOD calculation and it's when you need to create a dimension and the only LOD calculation that will take you there is fixed level of detail so you can't use include or exclude or table calculation or basic expression actually you need to use a fixed LOD and the reason for that is because with a fixed level of detail calculation you can basically create a result which is independent from the scope of the view and because it's independent it can be deterministic and you can basically use it as a dimension itself so the use case that we have here is we can see what we see on the screen is how many - how many airports each airline flies to so that Delta Airlines has the most advanced network at least in the data that we have it may not be the case in in real life I don't know so it's 297 airports that they had claims submitted about and you can see all the ranking here but what's interesting is that you have some Airlines which fly only to one Airport and then there are some other Airlines which fight to two airports three airports etc so I want to know how many airports flower airlines fly to only one airport how many airlines flight to two airports etc so basically it's the reverse of this analysis and because I need to create a dimension to break down the result by one two three etc I can create I can use a level of detail expression I'll just write this number of airports LOD and that will be a simple calculation like for each airline count how many airports they go to they fly to like so because this calculation is as I said independent from the scope of the view it will always give me the same result now it's it's a measure because the answer is 1 2 3 etc but what I can do is just drag and drop it to dimensions and it all of a sudden will become a dimension and now what I can do with this dimension is look at the number of the numbers that we saw on the on the view on the on a reverse scale so from 1 to as many airports an airline flies to and now instead of looking at the number of airports I can check how many airlines go to only one Airport and the answer is 32 and then 35 airlines fly only to 2 airports and only one airline which probably is Delta Airlines flies to 299 so as you can see with a fixed level of detail you can create a dimension and if you your analysis requires that then then it will be the way to go and that will include one more question to our framework whether the analysis requires a new dimension so in this case you need to use an LOD expression so before we move on I would like just to talk about performance itself using LOD or table calculation calculations will have different performance implications and that's because of how they work so with LOD expressions because they are part of the under lying query to the data source we basically depend on the speed of the data source that we are using so all the calculations all the analysis will be done by the data source itself and if the performance is slow then we need we either need to replace it with the table calculation or do the pre calculation within the database itself so because we will be bound by the speed of the database in this case whereas with table calculations because we use only tableaus power so you are cycling basically we need to understand how many calculations tableau will do for that analysis so in the case with the years and the Airlines we had let's say a combination of 15 Airlines with 10 years so 150 calculations but imagine that you are trying to do a calculation on a customer level where you have 1 million customers so a table calculation then would need to do 1 million times whatever other dimension calculation operations to get you the result and the result may be just one number sometimes so you in the cases of table calculations we need to be aware of what exactly we need to bring as data to the view to do the calculation itself and if it's like too much of data probably we're either doing something wrong or we need to you know go back again and do it in the database if level of detail expression is not possible maybe I can show you one example how it works with performance and that is about looking instead of number of claims per year looking at number of claims per day so as I said when it comes to number of claims per year we do hundred and fifty combinations or one hundred and fifty operations but now if we want to calculate it per daily level we will need to multiply this 150 by 365 so it will be a like a big number and the reason why table calculations may not be the way to go in this case even if they are simpler is that just to do the calculation I will first need to go to the level of detail that I want to do the calculation at so I need to open each and every day in the data and do the calculation on this basically practically infinite table so if you look at the marks number it's forty four thousand so I need to do forty four thousand operations in tableau just to calculate fifteen numbers at the end with an lod calculation as we saw it would be a simple statement of saying what is the number of claims per day because incident date is on a daily level I can just say for each day calculate the number of claims I just forgot that it should also be on an airline level so for each day in each airline calculate the number of claims and then let's call it daily average and when we drag and drop it and look at the average we can see how many claims each airline got on a daily level and it will give me only 30 results because all the calculations will happen in the database whereas with this example we have to do 44,000 calculations and then use something like an index just to get the third numbers that we want because probably we don't want people to scroll to get to the next airline so this is a constant consideration that you need to make as well performance because it will be a trade off by itself so to conclude when we do decide between any two calculation type table calculation LOD or basic expression we always do a trade-off between simplicity flexibility and performance and you saw that sometimes simple calculations can be non flexible sometimes they may be non performant and sometimes flexible calculations may not be so simple but they will give us the flexibility that we need and because there are multiple ways of getting to the same number we need to use something like the general framework that we just build together to answer first questions that are important before even starting analyzing the data because that will save us a lot of time just use the method to get to the point that you want and it will be much Pleasant much more pleasant experience to get the analysis that you would like and just again a simple reminder of the general framework that we built it's really simple we have just three outliers probably that we first need to consider is the recursive calculation does it require blending do we need to create a new dimension from the measures that we have and if the answers are no to everything we ask ourselves the general question of whether all the values that we need are already on the view then we use a table calculation and if it's not the case then we go and use an LOD you will be able you may be able to find even more elaborate frameworks if you go to Google with many more questions but I would like to keep it as simple as possible so that it's really easy to digest and to answer these questions on the fly with any analysis that you do if you would like to learn more about table calculations or LEDs how they work what are the use cases for this type of calculations I would suggest you to go to one of these sessions one of them is called table calculations for the advanced analyst and it's a hands-on and it will be tomorrow and the other one is called lots of fun with level of data expressions or calculations and in that session you will be able to start from scratch what an LOD is what are fixed include exclude when to use fixed when to use include when to use Xcode etc and you can combine this knowledge with what we learned today to make sure that all the analysis that you do is fantastic easy and speedy probably you've heard it many many times already but please fill in the survey and give us feedback we would like to improve these sessions as much as possible everything that you write in these evaluation forms is considered later on and thank you very much for being here [Applause]
Info
Channel: Tableau Software
Views: 983
Rating: 5 out of 5
Keywords: Visual data, Visual analytics, Business analysis, Business analytics, Business analysis tool, Data analytics tool, Data Analytics, Analytics, Analytics platform, Cloud application, Business analytics platform, data analysis, data visualization, business dashboards, business intelligence, tableau, tableau software
Id: QWOML5mFweU
Channel Id: undefined
Length: 59min 6sec (3546 seconds)
Published: Fri Nov 15 2019
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.