The Forecast Looks Bright: Tableau Forecasting

Captions
All right, so welcome. This is The Forecast Looks Bright: Tableau Forecasting. I'm Jason Miller, and this is my colleague Joshua Rao. We are both sales consultants in the Midwest, and both of us are based out of Chicago. So, is there going to be a clap for Chicago? Whoo, okay. Yeah, awesome.

So what are we going to be talking to you about today? Well, Jason's going to come up first, and he's going to talk to you a little bit about what we can do in the product natively today using some of our drag-and-drop methodology, and then how you can better understand what's actually going on, along with some of the bells and whistles and the knobs you can turn and tweak to better fit that to your own analytics. From there, I'm going to come back and talk a little more about what we can do when we use external services, so MATLAB, Python, etc.

Now, this being Friday, I'm sure a lot of you have been to multiple sessions. We do like to take a moment at the beginning so that you can learn a little more about us, who you'll be spending the next hour of your life with. So I'll kick this off. The very first thing you should probably know about me is that I absolutely adore dogs. These are my two dogs here. That's Bowie on the top (thank you), and the other one is on the bottom. This works out great for me, because I was fortunate enough to marry my beautiful wife Molly, who happens to be a veterinarian, so there will always be dogs in our house. In the last two pictures, that's my daughter Ella there on the right-hand side. She's two. And again, this being Friday, maybe you've been to a few sessions, maybe you've been to some Tableau Doctor sessions, and you think to yourself, "I've seen that guy before, and he seems to be everywhere, or he changes clothes a lot." Don't worry, you're not crazy. I'm actually an identical twin. My identical twin brother Adam is also a sales consultant here at Tableau, and he's been presenting too. So, truth
be told, in that picture there I actually don't know which one I am. Just pick whichever one you think is cuter, and that's me. And, a little bit of a gotcha moment, that's actually not me holding Ella on the right-hand side; that's actually Adam. So with that, I'm going to turn it over to Jason to get things started. I'll see you guys in a little bit. Thanks.

All right, thanks, Josh. So, Jason Miller, native of the Chicago area. For those of you that were clapping, I'm actually from the suburbs; some Chicagoans make a big deal about that if you say you're from Chicago. So I'm a native of the Chicago suburbs. This picture was taken about four and a half years ago, after we moved back to the Chicago area. I had just started working for Tableau, and the future was definitely looking bright at that time. My kids are about nine, eight, and seven now, and it's been awesome the last four years. This one is three years prior, so this was actually in Washington, DC, at the Smithsonian. This is actually Photoshopped. My littlest one, I think, is seven months there, so she's not even walking yet. I had just started with my previous company, which was doing integration and analytics work with the federal government.

But this is actually a second career for me. So for an introduction, I want to take you guys back about 14 years, to 2005. I had recently graduated from college, in 2003. I was on the seven-or-eight-year plan with a psychology degree, and for those of you that knew the forecast at the time for someone with a bachelor's in psychology, the career and salary forecasts were not that great. So I had just thrown my resume up on HotJobs (I don't think LinkedIn was a thing at the time) and CareerBuilder, and I was very surprised that I got a lot of responses, just for generic business and sales, and a lot of them were in mortgages. I was a pretty naive college student who really knew nothing about the housing industry or mortgages or anything. So I went on some interviews and was
like, wow, these jobs pay a lot of money. So I took one. Fast forward three years, and I'm working for the third-largest subprime mortgage lender in the country.

So this is the S&P 500; this is the stock market at the end of 2005. These were our financials. We were actually owned by H&R Block; we were a subsidiary, and we had just posted record-breaking earnings in the second quarter. We were told that over 90 percent of the earnings were from the mortgage company alone. The H&R Block stock price was at an all-time high, and the fundamentals of the housing market were absolutely on fire. Housing prices were at all-time highs and increasing, in some places in California by a hundred percent per year. You had all-time-high homeownership rates and all-time-low foreclosure rates. So by all measures, things were looking up. All of these forecasts were done with Tableau, and most of them are actually very good forecasts: they show that they reduce the error quite a bit compared to a naive forecast, and I'll explain how that works. But despite that, as I'm sure all of you know, every single one of those forecasts turned out to be wrong.

By the end of 2008, the housing market had imploded and half of the subprime mortgage lenders were out of business. You can see that the perfect heartbeat of trend and seasonality in the quarterly earnings was just ruined by a 500 million dollar loss from the mortgage company. So the forecasts were clearly wrong, all except for one: the stock market heading into 2008 was still going strong, so the housing issues had not quite yet trickled over. And that leads us to probably one of the most infamous economic forecasts in history. On January 10th, 2008, Ben Bernanke, who was the chairman of the Federal Reserve, made the statement that the Federal Reserve is not currently forecasting a recession. Why is that funny? Because we were at the very start of the worst recession in history, which would later be dubbed the Great Recession. I'm going the
wrong way. There you go, that's the drop. All right, so it dropped faster and farther than it even had in 2001.

So why did I share that little tidbit? One reason is to point out that forecasting is hard. Even if we had developed the perfect forecast, tuned and tested perfectly, it's still inherently based on an assumption that past and current events will accurately predict the future, and specifically that trends and repeating patterns, called seasonality, will continue. There's absolutely no way, looking at the data available here for the earnings, that you could predict from the model that the next month the profits would tank into the negative. And with Tableau specifically, which does univariate forecasting, it's even more strict: what you see is what you get. Here we're looking at the H&R Block quarterly earnings, and we're basing our forecast strictly off the earnings. You can't take anything else into consideration. So the forecast models were good; they just didn't arrive at the right outcome.

All right, so I'm going to jump into forecasting in Tableau with a demo. The first method of forecasting in Tableau that I want to show is just using calculations, or table calculations. This may be something that many of you have tried, and you may have run into a roadblock. There is a little trick I'll show you that makes this possible, but I'm going to start off by looking at my sales by month. Okay, that looks great. Now, my sales leadership is saying, "We expect a 30 percent increase next year. Can you please forecast our revenue out at 30 percent over what we did each month last year?" So I have a calculation here. This is a table calculation, which is most likely what you will be using, because a table calculation allows you to look up a value, or look at a window, from a prior period, which is exactly what we want to do.
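The "look up last year's month and raise it by a percentage" idea can be sketched outside Tableau. This is a minimal Python illustration, not the Tableau calculation from the demo: the function name `growth_forecast` and its parameters are hypothetical, and it assumes monthly values in chronological order with at least one full year of history.

```python
from typing import List


def growth_forecast(history: List[float], horizon: int, growth: float,
                    season: int = 12) -> List[float]:
    """Forecast each future period as the value one season earlier, raised by `growth`.

    `growth` is a fraction, so 0.30 means "30 percent over last year".
    All names here are illustrative assumptions, not Tableau functions.
    """
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    # The most recent full season is what every forecast period looks back to.
    last_season = history[-season:]
    # Period k in the future reuses the same position in the last season,
    # wrapping around if the horizon is longer than one season.
    return [last_season[step % season] * (1.0 + growth) for step in range(horizon)]


# Example usage with made-up monthly sales figures.
monthly_sales = [120.0, 95.0, 130.0, 110.0, 150.0, 160.0,
                 155.0, 170.0, 210.0, 180.0, 260.0, 300.0]
next_year = growth_forecast(monthly_sales, horizon=12, growth=0.30)
print(next_year[:3])
```

With `growth=0.0` the result is simply last year's values repeated, which is the seasonal naive forecast the quality metrics are later compared against.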
We want to look at January's value last year raised by 30 percent, then February's last year raised by 30 percent, and so on. So I have a table calculation. This just checks that the order date is after the latest date in my data, looks up twelve periods prior, which would be the same month in the prior year, and then raises that by 30 percent. I'll hit Apply, and then I'm going to drag out that forecast calculation. All right, you can see it did nothing.

So the trick here is that with a table calculation, or any other type of calculation in Tableau, Tableau needs those date values in the future to work with, right, to input into the calculations. Since I don't have any future dates, I'm going to have to create some somehow. In this situation I'm just going to union with another table of data. All I've put into this table is the column with the order date that I'm using, and then I added a few of the measures with zeros. Really the only thing necessary is the order date; it just makes things a little easier when you're not having to deal with nulls. You can put any value in here you want. The main idea is just to get a date for each month in the next year. So I will go ahead and create that union, and now if I go back, you can see that I've had these zeros added to the sum, and I'm able to calculate the forecast because I do have those dates in the data. Then I can color this by whether or not it is a forecast value. Great.

Okay, so that's just a simple way of using a table calculation. If I wanted to make this interactive for an end user, so they could change the percent, all I need to do is show the parameter and replace this 30 percent with my parameter. Now if I wanted to do 150 percent I could do that, or I could do 0 percent, and that's going to give me exactly the values that we had last year for those months, including the seasonality. I'll leave this one up here, because this is actually a special case that we're going to run
into later. For a data set that has seasonality, this is what you would call a naive forecast: all we're doing is taking the value from the same point in the last period and forecasting that forward. If this didn't have seasonality and it was just a line, the naive forecast would be just the final value, straight across. And when we look at the quality metrics, most of them are measuring the difference in error between the naive model, which we're looking at here, and the forecast that Tableau comes up with.

All right, so now I'm going to jump into the forecasting that most of you are probably more familiar with. Let me get a show of hands: how many people in here have created a forecast in Tableau using the Analytics pane, drag and drop? Okay, I would say 80 or 90 percent. Now, how many of you that raised your hands have actually used that forecast, or know of somebody who has, for any type of business decision? Okay, so still a decent amount; I'd say it dropped down to maybe 20 percent. That's not surprising to me, and I think there are probably two reasons. One, as I'm about to show you, Tableau makes it incredibly easy to create a forecast, and I think it's human nature to dismiss something that seems very easy. So I'm hoping to dispel that and give you confidence that these really are good models that you can trust. The second reason, obviously related, is that you just don't understand it, so how would you know whether it's a good model or a bad model? So my goal in the next 10 or 15 minutes is to give you a basic understanding of how Tableau does forecasting, and of how to tell how good a forecast is, so you know whether or not you can trust it.

All right, so we'll take the same data and create a forecast with Tableau's built-in forecast feature. All right, and we
have this little tail of data; we can just exclude that. Now I will go to the Analytics pane, go under Model to Forecast, and drag that out to add a forecast. And that's really it. If this was a talk on how to create a forecast with Tableau, that would be about it. But I don't feel that's where the value is. The real value is when you right-click on the forecast and go to Forecast Options and Describe Forecast. Forecast Options is where you can configure some of the settings for forecasting, and Describe Forecast is where it tells you the error and what type of forecast was created.

If we start with Forecast Options, you can see that Tableau has made automatic decisions on a number of items. One is the forecast length: by default it's going out 13 months, and if I wanted to change that to 3 years, I can do that. Next is the aggregation. By default Tableau will ignore the last period. The reason is that it's very common for the data in your last period to be incomplete. Let's say we're getting sales data every day and it's still November: I'm not going to want to create a forecast based on November's numbers, because it's not a full month yet. In this case I happen to know that we have a full month of data, so I can change that to 0.

The last part here is the forecast model, so here's a quick primer on exponential smoothing algorithms, which is the type of forecasting model that Tableau uses. I'm going to switch back over to my slides. As you can see here under Custom, there are two primary components: one is the trend and one is the season. If we look at just the trend component, that can be none, additive, or multiplicative. The trend is the overall long-term direction of the data. Is there no trend up or down? That would be none. An additive trend would be essentially a straight line, and a multiplicative trend would be essentially a curved line. So with multiplicative, because you're multiplying the
coefficients, as you get farther and farther along the time series the magnitude goes up, so it grows very strongly, and that leads to the curved line. If we look at the seasonality component, that is the repeating pattern. Typically with monthly data you would have a pattern repeating every 12 months, with quarterly data every four quarters, etc. Seasonality can also be additive or multiplicative. An additive seasonality is going to be relatively constant, or rise sort of linearly, whereas with multiplicative, the farther along you get, the larger the spikes will get. When you put those together you're left with nine different combinations, and these are the models that Tableau evaluates and calculates the error on, and that determines which is the best model to use for the data. Okay, so those are nine different exponential smoothing algorithms.

Now, you'll see in the documentation, just as a point of trivia, that we say we choose from eight models. I had assumed the one we don't use was none-none, but that's not the case; that is the simple exponential smoothing algorithm. The one we don't use is the multiplicative-additive combination, and the reason is that for that particular combination the math tends to be unstable, so it doesn't do well. You can look that up; it's just not typically used.

Okay, so we've covered the forecast model, and the last portion is the error bands. You can specify whether you want to show 90, 95, or 99 percent, and that indicates, for example, that you're 95 percent confident the actual value will fall within that band. Sometimes they're pretty broad, and that is determined by the model that is selected.

All right, so let's do another forecast, with profit this time, and look at profit by month. Okay, and I actually want the continuous month. All right, great, and let's create a forecast. So you might look at that forecast and say,
yeah, that's not a good forecast. A straight line, what good is that? Now, as I mentioned earlier, Tableau is automatically evaluating all of the possible models and selecting the best one based on which model fits better. So essentially what Tableau is saying here is that this is the best fit. But I can actually go in here and force it to do some seasonality. I can go from Automatic to Custom. Let's say there's an additive trend there; you can see the overall direction is going up. And seasonality, let's say, is additive. That looks better to me, but does that really fit the data? I can tell you straight out, no, and I'll show you how you can tell.

The second most important screen, I think, when creating a forecast is Describe Forecast. This will give you a summary of the forecast itself. I can see that we're ignoring the last month, there's a 12-month seasonality, and this is the initial value. I can see that 1 percent of the value is contributed by the trend and 99 percent by the season. And if I go to the Model page, there's a lot of information here. In the first component you have the model: level, trend, and season. The trend and the season we've already discussed. The level you can think of as where it's going to start the forecast, so that's the base value, and then it applies the trend and the season on top of that.

In the second portion are the quality metrics. I would say for all intents and purposes you can ignore the first two. Those are measures of error, but they vary with the magnitude of the data: if you're dealing in billions of dollars, they'll be in billions or millions; if you're dealing with tens of dollars, they'll be single digits. The last three, the MASE, the MAPE, and the AIC, are all normalized. With the MASE and the MAPE, the closer to zero the better. With the MASE in particular, a value of 1, or a hundred percent, means the same amount of error as the naive forecast, so it's almost like a guess, and sometimes
you will see it above that, but essentially the lower the value, the less error. So when I look at the MASE value, that's the mean absolute scaled error. What 0.5 is telling me is that I have 50 percent as much error in this model as I would with that naive forecast. So compared with just taking the value from the last period, this is 50 percent better. The last one, the AIC, which is the Akaike information criterion, is similarly used to compare models, and the model with the lowest value is chosen by Tableau. This is the value that Tableau actually uses, and it also includes a penalty for more complex models, so it helps prevent overfitting. That's why we use the lowest value.

Then there are the smoothing coefficients. These determine how much smoothing is done: the closer to 0, the more smoothing; the closer to 1, the less. If it's 1 you're just taking the last value, whereas if it's closer to 0 you're averaging across the data set and weighting the earlier values more heavily. They correspond to the model components there: alpha is the level, beta is the trend, and gamma is the season.

All right, sure. So alpha is for the level, which is essentially where it's going to start the first point in the forecast. Beta is the coefficient for the trend, and gamma is the coefficient for the season. The closer to 0 that is, the more it's going to consider those older values and the smoother it is, and the closer to 1, the more abrupt the changes are going to be. MAPE is, I believe, the mean absolute percentage error. I will point out that in all of the forecast windows there's a Learn More link, and we have excellent documentation online that explains in much more detail what each of those is. So the MAPE is the mean absolute percentage error. MAPE and MASE, which is the mean absolute scaled error, are both very popular in forecasting, probably more so
MAPE than MASE. However, the MASE value is more robust. MAPE doesn't deal with zeros very well, and it puts more weight on extreme values, so the MASE has some advantages over the MAPE.

All right, so if we go in and note the AIC value, you see that it's 597, and the MASE is 0.5. If we go back to Automatic, which allows Tableau to choose the best model, you'll see that this has a lower AIC: the AIC here is 584, and the MASE is 0.71 compared to 0.5. So even though the seasonal model looks better to the eye, the data does not exhibit that level of seasonality. You can see that very clearly if you just rearrange the data. And by the way, forecasting does work with all types of dates, so here we're using date parts and you can see that the forecast still works. But you can see very clearly here that at the monthly level, sometimes October is the lowest and sometimes it's the highest. There is no clear seasonality, so in essence we forced Tableau to show seasonality where there wasn't any. However, if I roll this up to the quarterly level, you will see that Tableau clearly finds seasonality. So that is one tip if Tableau is not sensing seasonality when you think there should be some: oftentimes if you go up to a higher level of aggregation, there will be clear seasonality at that level where there was just too much noise at the lower level.

Okay, and I'd like to finish up by pointing out a few things that could potentially go wrong, because I'm guessing some of you have created forecasts that either show an error, show no forecast, or give you a straight line, and you're not sure how to handle those situations. I ran into a similar issue when I was looking at electricity data that I downloaded from my electricity provider. I would think that with electricity there would
certainly be a lot of seasonality. This data is actually down to the hourly level, but I'm going to go ahead and look at it by month. This October was a partial month, so I excluded that. And October 19, I don't remember if that was a partial month or not, but we'll leave it.

All right, so I'll go ahead and create a forecast, and again, a straight line. Now, I know that this should be seasonal. If I look at it by year and month, it is very clear that there is seasonality here, but Tableau is not detecting it. So why is that? If I go in and try to force it, it will show me an error message. Let's do additive seasonality. It says the seasonal model cannot be computed because the time series is too short. For Tableau to use a seasonal model, it requires two full seasons of data, and with this data set I just happen to be one month short. So I can go back, and on this data set, if I choose not to ignore the last month, that's the month I need, and Tableau shows me a nice seasonal forecast. If I look at Describe Forecast, I can see that the quality is OK. I have a MASE of 0.41, so by that measure I have 41 percent of the error of a naive model. Going off of the MAPE, I have 7 percent. So it's a decent model, and I can see that it used none for the trend and multiplicative for the season. So that looks pretty good.

All right, so in review. Oh, did I kill the PowerPoint? All right, thank you. Okay, so to review: Tableau forecasting uses exponential smoothing, which is a univariate forecasting method, meaning we only look at one measure, one variable, to determine the future trend and forecast. You need a time series. Most often this is a date value, but it can also be an integer, if you're representing a time series with integers. An exponential smoothing model has two primary components: trend, which is the overall long-term direction of the data, and seasonality, which is a repeating component, that
repeating pattern that you find in the data. Both of these can be none, additive, or multiplicative, giving you eight or nine potential models from which Tableau selects the best. To determine the quality of the forecast, you're going to want to go to Describe Forecast and look at the model. On the Summary page it shows you just a qualitative rating of poor, OK, or good, and on the Model page it shows you the actual error metrics that you can use to judge.

I've mentioned some limitations of Tableau forecasting. One is that you can only do univariate forecasting. Another, which I didn't get into much, is that there are many different forecasting algorithms, even univariate ones, and in Tableau you're limited to exponential smoothing. Further, you can't tune it to, for example, select randomly varying seasonal patterns, so some of the customization is not there. If you need a more complex model, that is where integrating Tableau with our external services, R, Python, and MATLAB, comes in, and that is what Josh is going to cover in the second half. Thanks.

Just give me one second, everyone. Okay. Well, let's give it up for Jason. All right, so I want to kick off this next section with a question. Of these activities on the screen, which do you think the most people participated in last year? This is inside the US: playing golf, participating in fantasy football, attending a Major League Soccer game, going to Disney World, or attending an NBA game. Just shout out some thoughts. Football? Football, okay. All right, good guesses. The actual answer is fantasy football. Now, why do I bring this up when we're talking about multivariate forecasting? Well, I love fantasy football. I've been in the same league with some friends from college, co-workers, and some of our spouses for probably the last 14 years now, so quite a while, and I was fortunate
enough, for the first time ever, to actually win the league last year. Thank you. For those of you a little unfamiliar with fantasy football, I participate in what's called a keeper league, and essentially what that means is that you can choose to keep the same players year over year if you want to. Given the fact that I won the league last year, I was like, awesome, I'm going to keep basically my entire team. On the left-hand side here you can see my record: my win or loss, the weekly score, and the actual week. And I thought, what if I were to use a univariate forecast, essentially on total points scored last year, to help determine what my record might be for this year? So, looking at something a little more like this. How many people think that method alone would be a good way to determine what my total points scored, and potentially my win-loss ratio, might be this year? No? I've seen a few head shakes. Yeah, it's not a great method. This year has not been as kind to me. Things happen in fantasy football all the time: players change teams, players get hurt, there are coaching changes, certain quarterbacks forget their job is to throw the ball to people on their own team. So when you have to account for these multiple variables, you sometimes need to use some of these external services, and that's really what we're going to be covering today.

So how does it work? Well, we've actually been doing external services inside of Tableau for quite a while. We started back in version 8.1 with integration with R. In version 10.2 it was Python, in version 10.4 it was MATLAB, and most recently, in version 2019.3, we allow you to add both R and Python scripts to your Tableau Prep flows, and we'll see that a little later. Now, for those of you who may be a little unfamiliar with some of these programming languages, R is really more of a very standard statistical programming language
used for linear and nonlinear modeling. Python tends to be a little newer, but it's very easy to read and very easy to get up to speed on, so it's become very popular as of late. MATLAB tends to be used more for computational analysis, so I see it used by a lot of scientists and engineers for things like thermodynamics. And then, as we discussed, in 2019.3 you can use both R and Python. This is great because, for many of these languages, you don't have to be an expert in the language yourself. Particularly with R and Python, which have great open source communities, you can go out, grab someone else's library or code, reverse engineer it, add your data, and look as smart as they are.

So how does this actually work inside of Tableau? This works exactly the same on Desktop as it does on Server. Inside your workbook you're going to have a script function, and the script function essentially does two things. Number one, it allows you to input code, which will then be used by the external service; we'll talk more about what these external services are in just a bit. From there, you're also going to pull in data from whatever data source you've already connected to inside of Tableau, whether that's a database, SQL, an extract, a flat file, and so on. Using that script function, you then pass both that data and that code over to an external service. An external service is basically something that's just waiting, just listening, to receive this code along with the data. In the case of R, as we'll see today, I'm going to be using what's called Rserve; in the case of Python you'd be using what's called TabPy; and in the case of MATLAB you'd be using what's called MATLAB server. Now, as we'll see, I'm actually going to be running this locally on my machine. I'm running R locally on my machine, but most
often, in an enterprise production environment, you would have an external server set up to run this external service along with that program as well, so it would be something like a TabPy server or an R server. So that data gets sent over to the external service with that code, it gets run in that particular program, and the results get sent back to Tableau in the form of a table calculation. What's really nice about this is that once that data gets back to Tableau, you can play with it just as you would any other data set inside of Tableau.

So let's break down this script function a little. There are essentially four different kinds of script functions you can have: SCRIPT_INT, SCRIPT_STR, SCRIPT_BOOL, and SCRIPT_REAL. Essentially this tells Tableau, whenever data comes back from the external service, this is how I want you to treat it: as an integer, a string value, a true/false value, which would be boolean, or a real number. From there I pass in whatever code, formula, or library I want to use, and inside that code you're going to see different arguments, things like .arg1, .arg2, and so on. These arguments correspond directly to the data that you're sending over from Tableau.

So let's go ahead and see what this looks like. Here I'm looking at Johnson & Johnson's earnings data from 1960 out to roughly 1980, and I just want to create a forecast using one of these external services. So I will drag out my date column, drag out my forecast, and show whether each value is predicted or actual on color. And there we go, that's it, we can all go home, right? Well, it's a little more complicated than that. There's a bit more going on underneath the hood, so let's take a look at
my forecast function. What I'm doing here is I'm using the SCRIPT_REAL function, so I'm expecting a real number back. Inside of R I'm using the forecast library, so that's the library that I've downloaded and installed on my local R machine. I have a couple of print statements here. These aren't really used for the formula itself; they're useful if I'm running R in a debug mode or something like that and I want to see exactly what is getting transmitted back and forth between Tableau and that external service, in this case R, so I can do some troubleshooting if the numbers don't look right. Then below there, this is my actual statement that is getting run by R. So I've got a forecast statement, I've got different arguments in here, I'm accounting for some time series values here, and as you'll notice you have the .arg1 and .arg2, and these correspond, as I mentioned, directly to the data that I'm passing via Tableau: my sum of earnings and my periods to forecast, which is the parameter. This is all made possible because I've set Tableau up to look for that external service connection. So if we go up to Help, then Settings and Performance, then Manage External Service Connection, here I'm running Rserve locally on my machine, so you can see that the server is localhost and the port is 6311. Then on the other side I'm using RStudio, which is just a popular program to run R code. You could use something else, but I've downloaded the Rserve library and I've started running it in debug mode, which is that TRUE statement right there. So now as I go ahead and interact with this parameter here, let's say I want to forecast out sixteen periods, instantly what Tableau is doing is that same thing over again: taking that R code, taking that data, sending it to that external service, it's getting run inside of R, the data gets sent back to Tableau, and now we've got a beautiful
forecast. But some of you who are looking at this might be saying to yourself, well, that's great, Josh, but at the end of the day that's still univariate forecasting; isn't that what Jason just showed us? And you'd be right. So let's go ahead and dig a little bit deeper. In this next example we're looking at data from a call center, and what we can see in the blue lines here are the actual call volumes. We see a spike directly after each holiday, which is signified by the bars right here: so we have a holiday, then we have a spike, and then we have a pretty big drop-off. Now in this first example I'm just using a very simple univariate forecasting function. If I quickly show it to you, again, nothing mind-blowing here: SCRIPT_REAL, calling out the forecast library, this is my actual code that's getting run, using this average call volume as my univariate measure. While it does forecast out into the future, we can see that right here, what we don't see is that spike that we would expect to see. For that we actually need to take into account the holidays, and we need to train our model to better account for them. So for this I'm going to be using what's called an ARIMA model. I'm still calling out the exact same forecast library here, and inside these libraries you can have multiple models or functions, and these correspond to different statistical calculations and methodologies that you want to use. In this case I'm using the ARIMA model, and ARIMA stands for autoregressive integrated moving average. Depending on which ARIMA model you choose, what that allows me to do is better take into account these multiple variables, in this case this holiday seasonality here. So I'm passing in my functions, here's my ARIMA function right there, I'm training my model, I've got my second argument right
here, which is my average holiday, and using this model we actually do see that spike that we would expect. So when you start to use these multiple variables, you can better train your models and get a more accurate prediction. With this we begin to see a marriage between data science on the one hand and ease of use for the business user on the other. Because for many of your business users, while many of them may be experts in their field, many of them might not have a PhD from Data Science University, but they don't need one. And for those of you out there who are data scientists, what this allows you to do is take these more complex models that you've developed, using whatever that programming language is, R, Python, etc., and give those models to your business users so they can begin to do their own kind of what-if analysis. As I like to joke sometimes, a business user doesn't need to describe for me the actual formula for standard deviation, and I don't mean just sigma or the square root of sigma squared, I'm talking about the actual formula, written out. You don't need to be able to tell me that to understand what standard deviation is and to appreciate the impact of it. As we continue on this road, we start to see that business users really want to do their own kind of what-if analysis, and that's what we're seeing here. So in this example I work for a website company. We sell software, and we have people come to our website, they decide to download our trial software, more trial downloads equals more sales, so on and so forth, you get the gist. But times are tough and we're thinking about temporarily suspending a marketing campaign for a brief period of time. I work on the marketing team, I'm using this model that one of my data scientists has built me, and I'm predicting that if we temporarily suspend this marketing campaign
we're actually going to see roughly a ten percent decrease in website hits, so I'll just go ahead and make this negative ten. But to better counteract that, because we can anticipate that that's probably going to happen, I'm going to work with some of our developers and we're going to move that trial download button to different parts of our website, and as a result of that we're estimating that we can probably see about a 10% increase in trial downloads. So I'll go ahead and start to run this. Again, we're predicting a 10% decrease in website visits, and we're trying to counteract that with a 10% increase in trial downloads because we're moving the download button. What we see here is: this is our decrease in website visits, we've picked that up; this is our increase in trial downloads, so good; but we actually still see a decrease in sales. Because I'm an end user and I'm a marketing guy, I don't necessarily need to know the data science behind this. All I need to know is that one line is lower than the other, and so now maybe we want to think about other alternatives as opposed to ending that marketing campaign. Up to this point, however, we've talked a lot about decisions that we're going to make in the future and how we help justify those. But sometimes what you need to do is help justify decisions that you've made in the past. So in this example, again, I work for a company, people come to our website, and we sell sweaters. If anyone's familiar with Chicago, right now it's really cold, so as the temperature goes down, sweater sales go up. On a particular date we decided to run a marketing campaign, and thankfully, because I'm a marketer, we can see that our sales increased, so kudos. However, it's not that simple. In order for me to better justify the cost of this marketing campaign, I need to better predict what would have happened had we not decided to have that marketing campaign. So for this I'm
actually going to be using a causal impact library. If we take a look at this function here, I'm still using the SCRIPT_REAL function. I'm passing multiple arguments in to this causal impact library, but one of the things I'm doing is defining a pre and a post period, so this is before my marketing campaign and after it. So I'm helping to train the model, and then I'm passing in multiple variables: in this case, what were my actual sales pre and post, what were my actual website hits pre and post, and what was the actual temperature pre and post. As a result of that, if we take a look at this model, we can see this nice little red line now, and this red line corresponds to what would have happened had we not run that marketing campaign. If I hover over this line, I get this very rich statistical story about what's actually going on at any given point in that data set. This is not Tableau generating this line; this line was generated via that causal impact report. So if I look at the summary here, which is on my tooltip, you can see that I'm essentially calling the same library and doing a lot of the same analysis, but here I'm pulling out this impact report. I'm doing some regular expression magic, but this is what gives me that nice little tooltip. Now, seeing that line is great, but somebody's going to ask me: yeah, I can see a line, I can see there's a difference, but I need a number. Don't show me the line, I just need a number. What was the final cost of that, or the final delta of that? For that I basically need to run a running total of the delta of that line, and that's what we see here. I'm still using the causal impact analysis, we pull in the causal impact library, passing in much of the same features, I'm defining a pre and post period, and then I'm doing a running total. So this is the beginning of my marketing campaign,
delta, delta, delta, adding this up, and now we can see that as a result of this marketing campaign I can take this back to my business and say that we generated roughly seven hundred and forty-eight thousand dollars more in sales than what we would have generated had we not run this marketing campaign, all because I can do this kind of analysis using these external services. Now I want to take a quick moment and step back and touch on something that Jason talked a little bit about at the beginning. Number one is that all of these are sent as table calculations, and when things are table calculations they're highly dependent upon the level of detail of the view and also the filters that are getting applied, so you need to be aware of that. The other thing, as Jason mentioned, is that we need data for that external service to send something back to. Now if I don't want to union that data, there are a few other methods that I can choose to make this work for me. The first is called date shifting. This is my original data set up here, and what I'm going to do is trim data off the end and shift it to the right, nulling it out, so essentially I have data for that external service to send results back to. What this looks like is, I'm just using a very simple DATEADD function here to add the number of periods onto my forecast at the quarter level. So we'll drag this out at the quarter level, look at my forecast and predicted versus actual, and there we go. But the one thing you'll notice here is that if I hover over this point, I start at 1964; my data originally started in 1960. So as we can see, we've actually trimmed off that data and shifted it to the right. This works very well if you have a very long tail of data that you can afford to cut off, but sometimes you might not have that, or you might just not want to use this method. So another method is using what's called domain completion.
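Before the walkthrough of domain completion, the date-shifting method just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the calculation shown in the session: `shift_quarter`, `naive_forecast`, and `date_shift` are hypothetical names, and `naive_forecast` simply repeats the last value as a stand-in for whatever model the external service would run.

```python
from typing import List, Tuple

Quarter = Tuple[int, int]  # (year, quarter number 1-4)


def shift_quarter(quarter: Quarter, periods: int) -> Quarter:
    """Move a (year, quarter) pair forward by a number of quarters."""
    year, q = quarter
    index = year * 4 + (q - 1) + periods
    return index // 4, index % 4 + 1


def naive_forecast(history: List[float], horizon: int) -> List[float]:
    """Hypothetical stand-in for the external model: repeat the last value."""
    return [history[-1]] * horizon


def date_shift(
    quarters: List[Quarter], values: List[float], horizon: int
) -> List[Tuple[Quarter, float, str]]:
    """Shift every date right by `horizon` quarters and fill the tail with forecasts.

    The row count stays the same, so the first `horizon` actuals drop out of
    the result. That is the cost of this method.
    """
    if not 0 < horizon < len(values):
        raise ValueError("horizon must be between 1 and len(values) - 1")
    shifted = [shift_quarter(q, horizon) for q in quarters]
    series = values[horizon:] + naive_forecast(values, horizon)
    flags = ["actual"] * (len(values) - horizon) + ["forecast"] * horizon
    return list(zip(shifted, series, flags))


# Two years of made-up quarterly data starting in 1960 Q1.
quarters = [(1960 + i // 4, i % 4 + 1) for i in range(8)]
values = [float(v) for v in range(1, 9)]
rows = date_shift(quarters, values, horizon=4)
# The first row is now ((1961, 1), 5.0, "actual"): a year of history was traded
# for a year of forecast slots.
```

The number of rows never changes, which is why the view in the session starts at 1964 instead of 1960 after shifting sixteen quarters.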
And for this method, first I'll just show you what my data looks like originally. So this is my quarter and the sum of my earnings as a bar chart, so everything's there. The very first thing I do is calculate a max date. Just using a FIXED LOD (level of detail) calculation, I'm saying: what is the maximum date, as a quarter, of my data set? From there I say, okay, given that, if the date is equal to the max date, I'm going to add on this number of periods to forecast, else show the quarter. When I drag this out, and I'll show them side by side, we can see that I now have this little area right here where data needs to go. If I drag out my forecast statement, I'm going to put this here and take this one off: no data, no forecast. Okay, well, something went wrong. The trick here, if you're going to use domain completion, is that you need to check here and use this Show Missing Values. What Show Missing Values will do is essentially null out those values, but they're still sitting there waiting for the external service to send something back to. So if I drag out my domain completion forecast here, now I have data all the way out. I can switch this to a nice little line, add my predicted versus actual, and there we go. I just quickly want to touch upon what this might look like if you had to do this in Tableau Prep. This is not a class on Tableau Prep, there are a lot of great classes on Tableau Prep, but what I've done here is I've taken a very simple flow, and I want to do some probability calculations based off some customer satisfaction scores. So I'm bringing in that data here, I'm doing some cleansing, and at any step in my prep flow I can now add a script calculation, which is what we see here. If we take a look at this script calculation, basically what I'm doing is I'm using Rserve, I've connected it to my Rserve server, and this is the actual file. So you need an actual file with
all the code in it to actually pull back data from, and that file looks something like this. So this is the actual code that I'm running, and I'm using this getOutputSchema function to essentially pull data out of that returned value to then use in Tableau Prep, and now I can add this at any point in my flow. So just quickly to recap: we talked a little bit about what we can do using external services, in this case Python, MATLAB, etc. We talked about how we can use parameters to really expand that multivariate analysis to our end users so they can do their own what-if analysis, and then what we can do if we want to do some impact analysis and how we might better understand that. Now we're right at the end here, so join me this afternoon: there are four more sessions related to advanced analytics and forecasting. We've got Get It On Stat: Statistical Analysis Skills in Tableau at 1:15; Are You Ready for Python at 1:15; Tableau plus Python, Python is pretty popular, at 1:15; and then at 2:15, Even More Data Science Applications in Tableau. I know you've probably heard this in all the sessions you've been to: please, please, please go into the mobile app and fill out the session survey. We really do look at every response, and it helps make these sessions better year over year. With that, on behalf of Jason and myself, we just want to say thank you. Thank you for coming to the session. Tableau Conference, for people like us, is really like Christmas, just to come here and geek out about things like Tableau with twenty thousand of our closest friends. So we really do thank you. I know we're right at time here; Jason and I will be up front for a little bit, so if you have any questions, come see us. But thanks, everyone.
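The running total of deltas from the causal impact example can be illustrated with a short Python sketch. It assumes the counterfactual series (what sales would have been without the campaign) has already been produced by the external model; the function name `running_delta` and all numbers below are made up for illustration and are not the figures from the session.

```python
from itertools import accumulate
from typing import List


def running_delta(
    actual: List[float], counterfactual: List[float], post_start: int
) -> List[float]:
    """Cumulative sum of (actual - counterfactual) from the campaign start onward.

    Periods before `post_start` contribute zero, so the running total only
    measures lift that occurred after the campaign began.
    """
    if len(actual) != len(counterfactual):
        raise ValueError("series must have the same length")
    deltas = [
        (a - c) if i >= post_start else 0.0
        for i, (a, c) in enumerate(zip(actual, counterfactual))
    ]
    return list(accumulate(deltas))


# Hypothetical data: the campaign starts at index 2.
actual_sales = [100.0, 110.0, 150.0, 160.0, 170.0]
predicted_without_campaign = [100.0, 108.0, 120.0, 125.0, 130.0]
lift = running_delta(actual_sales, predicted_without_campaign, post_start=2)
# lift[-1] is the single number a stakeholder would ask for: total extra sales.
```

The last element of the running total plays the role of the final dollar figure quoted in the talk; the intermediate values show how that lift accumulated period by period.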
Info
Channel: Tableau Software
Views: 3,384
Rating: 5 out of 5
Keywords: Visual data, Visual analytics, Business analysis, Business analytics, Business analysis tool, Data analytics tool, Data Analytics, Analytics, Analytics platform, Cloud application, Business analytics platform, data analysis, data visualization, business dashboards, business intelligence, tableau, tableau software
Id: aQ8aGKsNH3I
Length: 61min 21sec (3681 seconds)
Published: Mon Nov 18 2019