Forecasting Multiple Time Series with Modeltime | Bonus Auto-Forecast Shiny App [Lab 46]

Captions
Well, welcome everyone. This is another exciting Learning Lab, and a special webinar this week: I'm going to cover some new advancements in modeltime that I've been working on over the past several weeks and months. We'll look at a new application I spent the bulk of last week building, called Nostradamus, and we'll also talk about what's coming soon in modeltime. I think it's pretty exciting — at least I'm really excited about it — so we've got a lot of awesome content. Let's get started.

First up, I want to demonstrate what you can do when you combine modeltime and Shiny, because that combination is something you can provide to your organization. I've developed an application called Nostradamus, and you can see it here. Under the hood it uses modeltime to build a bunch of different models and ensemble them together to produce a forecast. I've already produced a forecast here, 36 periods into the future; you click this button and it runs. It also lists the accuracies of the different models being used underneath the hood: here the weighted ensemble has four models included, with the loadings for each of the four, and it's getting about a 15% boost in accuracy by ensembling over the next-closest model, which I'll call a sub-model. So it's a pretty high-tech app.

The idea is that it works on almost any type of time series. If I pick a different data set — the one we'll work with today is walmart_sales_weekly — and load it, the old forecast gets swept away. There's also a TS Explorer tab: it pulls in walmart_sales_weekly and inspects how many time series and groups exist, and that will be the focus of today — working with time series groups. This particular data set has seven different time series; some have different seasonality, some have outliers, and you can examine the autocorrelations in each series. With the TS Explorer tab you get a good sense of what's in your data and what you may need to look out for.

Back on the Auto Forecast tab, I'll change the horizon to 52 periods, because this is a weekly data set, and click the run-forecast button. You can see it's fitting several sub-models — 21 different models right now. It's basically running modeltime on 21 models, and these become our sub-models; it usually takes a few seconds. Then it selects the best ensemble: it combines all 21 models and refits the final ensemble. This is the prediction we get. The ensemble weighted model is winning out: it's using Prophet with XGBoost at 80%, random forest at 16%, XGBoost at 3%, Earth at 0.64%, and the rest of the models it built aren't being used. We can zoom in on individual time series; I'll deselect the others and zoom in on the first one so we get a sense of what we're looking at.
You can see the forecast has confidence intervals, and it looks like it's doing a pretty good job picking out the different seasonality. Also notice that if I select a different time series, it forecasted all of them together. The nice thing about this modeling approach is that it's very scalable: we could do seven time series or a hundred, and the run time won't increase that much. That's unlike running an ARIMA model through a loop, where if each forecast takes five seconds, ten forecasts take fifty seconds; here it takes about the same amount of time regardless of how many series you forecast. It's a more scalable approach, and it leverages new features I just released in modeltime 0.3.0, which I'll talk about shortly.

And again, it doesn't matter what kind of series you have. That was a weekly time series; I'll load a different data set and click run — this one is hourly. Whether your data is hourly, weekly, monthly, or yearly, when you combine modeltime and Shiny, your algorithms can adapt and automatically produce a pretty decent forecast. But you need to learn the skills and understand how to do this. I'll cover some of the techniques in this Learning Lab, and we'll also talk about the time series course and the Shiny course, which are how you get the skills to build this type of application for your organization. My Learning Labs PRO members will get a light version of this application. So this is an hourly data set; again it's using the top four models, combining them in different proportions, and this is what the forecast looks like — it does a pretty decent job.

OK, that's the demo. That's what you can do when you learn Shiny, modeltime, and forecasting at scale, and it's what every organization I know of needs, so it's very powerful.

Before we jump into this week's content, I want to talk quickly about Learning Labs PRO. This is the 46th lab; all the labs are recorded and go into the membership program. When you join, you get access to all previous labs and any new ones I release, roughly twice per month, covering all sorts of topics: TensorFlow, H2O, Docker, MLflow, Prophet, the Twitter API, the Google Analytics API, and more. Every lab has tons of valuable bonuses. Today's bonus for Learning Labs PRO members is a light version of the Nostradamus app. The light version contains the framework and all of the groundwork, with three models instead of the full version's 21. It doesn't have all the bells and whistles, but it gives you a great foundation to build on for your organization, and when you take the time series course you'll learn the techniques and skills I applied when building the full
Nostradamus app. So definitely check it out — if you want the groundwork, join Learning Labs PRO today.

All right, the agenda for this week: we've got a jam-packed session. I'll talk about the modeltime ecosystem — where we've come from and what new features now power Nostradamus and the ability to scale up our time series forecasting. I'll also cover the roadmap of what's coming next, which you do not want to miss because I've got some exciting stuff coming. Then we'll jump into a full code session on Walmart sales demand forecasting, the same series I just forecasted with Nostradamus. I'll run that again real quick — loading walmart_sales_weekly now — because I want to show you the accuracy we get through Nostradamus; then when I go through the lab I'll show you the accuracy we get there, and you'll see how much we can improve when we learn some of the advanced time series concepts that Nostradamus uses.

So let's start with the modeltime ecosystem. You can see three hex stickers — three R packages now available in the ecosystem. There's standard modeltime, which has recently been upgraded; modeltime.ensemble, a new ensembling package; and modeltime.resample, for time series cross-validation and resampling. It's a growing ecosystem for time series packages, and in my opinion it's top of the line.

So what is modeltime? Modeltime is nothing you cannot do yourself; it just condenses and consolidates the code I was having to write over and over and over again into very few lines. It provides a forecasting workflow: all the steps you need to work through in a forecast. It helps you build all of your models, then consolidate and organize them into a step-by-step process where you make your forecast on the test set, get your forecast accuracies, and then refit on the full data set — once you know how your model is performing, you refit it on all the data and forecast forward. It also unlocks several packages that were not previously available for forecasting. It unlocks the tidymodels ecosystem — to my knowledge there hasn't been anything like this before — which lets us tap into all the primary machine learning algorithms: XGBoost, random forest, support vector machines, elastic net — everything I'm using in Nostradamus to ensemble. We also get access to Prophet and the forecast R package in a format modeltime can easily use to build ARIMA and Prophet models. All right, so what is new that I've just released in modeltime that allows us to do this forecasting?
Modeltime 0.3.0 was released, I believe, a week or so ago, and with some key changes we can now apply machine learning to more than one time series. I call this panel data — it's like the panel data approach, if you've heard that term; that's the way I think about it. It's nothing new for machine learning, but in time series a lot of people haven't experimented with one global model that models all the series at once. The reason they don't is that it can be challenging to get good results: you have to do a number of things to make forecasts that work no matter which time series you use. We'll touch on some of that, but it's really the material in my time series course, which I'll talk about at the end. So modeltime 0.3.0 lets us forecast many, many time series at once: here we've got eight time series, but only one model forecasting all eight. That's the first part of the ecosystem, and it's the big upgrade that enables what I call forecasting at scale.

The second package, modeltime.ensemble, is something else we'll touch on today. In the Nostradamus app there's an "ensemble weighted" model, which I created with modeltime.ensemble. It looks at all the different sub-models and assigns weightings using a proprietary algorithm — a little bit of intellectual property; it took me a while to figure out a good strategy — but basically it weights the models to give us the best odds of forecasting well. I'll run this for 52 weeks, since that's what I'll run in the test today, and we'll see how this model with an RMSE of about 5,000 compares; underneath the hood it's fitting a bunch of sub-models and creating an ensemble. That's what modeltime.ensemble does: it creates these weighted stacks, it does model averaging, and it also does what's called super learners, where a model such as a linear regression, random forest, or XGBoost learns how to combine the other models. It's a very flexible package, it's cutting edge, it's a competition-winning strategy, and it works really well in Nostradamus. It's one of the reasons I'm able to get very high accuracies: on this data set, ensembling gives me about a 14% improvement over the best sub-model. It's definitely a strategy you'll want to learn.
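To make the API concrete, here's a minimal sketch of the ensembling functions just described, assuming you already have a modeltime table of fitted sub-models (called `submodels_tbl` here; one is built later in this session). The loadings are illustrative weights only — not the proprietary Nostradamus algorithm:

```r
library(modeltime.ensemble)

# Fixed-weight ensemble: loadings control each sub-model's influence
# (these weights are made up; Nostradamus derives its own)
ensemble_fit_wt <- submodels_tbl %>%
    ensemble_weighted(loadings = c(4, 3, 2, 1))

# Other options: ensemble_average(type = "mean" or "median"), and
# ensemble_model_spec() for "super learner" stacking with a meta-model
```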
The third package, released relatively recently, is modeltime.resample, and it helps us figure out how our modeling algorithms stand the test of time. This is an example of a time series cross-validation plan, where you break a time series into different training and testing windows and see how your algorithm performs: given this blue portion of data, how well does it predict the red portion; given the next blue portion, how does it predict, and so on. The strategy modeltime.resample uses is to refit your algorithms on each blue portion and predict the corresponding red portion. In this particular example, if you had five models and six training splits, it would refit your models 30 times in total (5 models × 6 splits), predict each red portion, and then show you what your average accuracy is. That tells you things about your time series models: whether they'll stand the test of time and how stable they are. The problem is that most people evaluate only once — they take their training data and test on the most recent window. That's a good approach, but resampling is a belt-and-suspenders approach: it tells you how well you're doing over time, and it makes some nice plots so you can see how all of your algorithms perform across different training windows.
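Here's a minimal sketch of that resampling workflow, assuming a fitted modeltime table (`submodels_tbl`) and a train/test split (`splits`) like the ones built later in this lab; the window sizes are illustrative:

```r
library(modeltime.resample)
library(timetk)

# Build a time series CV plan: several train/test window pairs
resamples_tscv <- training(splits) %>%
    time_series_cv(
        date_var    = Date,
        assess      = "52 weeks",  # each "red" test window
        skip        = "26 weeks",  # shift between slices
        cumulative  = TRUE,        # expanding "blue" training window
        slice_limit = 6
    )

# Refit every model on every training window (e.g. 5 models x 6 slices = 30 fits)
resample_results <- submodels_tbl %>%
    modeltime_fit_resamples(resamples = resamples_tscv)

# Average accuracy per model across slices, plus a plot over time
resample_results %>% modeltime_resample_accuracy(summary_fns = mean)
resample_results %>% plot_modeltime_resamples()
```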
So those are the three packages, and that's the ecosystem. They're all taught in my time series course — there's nothing else like it on the market; it's cutting edge, with everything from machine learning to feature engineering to ensembling to forecasting at scale, and more. Now let's talk about what's coming next for the course and for the modeltime ecosystem — I'm really excited about this.

Here's what's coming: modeltime + GluonTS. GluonTS is a Python deep learning framework for time series. If you've never heard of it, that's not shocking, but it's quietly gaining massive popularity, especially in the Python time series ecosystem. Why? Because deep learning has placed very highly in several competitions, most notably the M4 competition; the second-place M5 solution used a deep learning algorithm as part of the solution; and a deep learning algorithm won the Kaggle Wikipedia time series forecasting competition. GluonTS is developed by AWS: Amazon Web Services has a data science team that built MXNet, Gluon, GluonTS, and a bunch of other related packages in the Gluon ecosystem. GluonTS is the time series one I'll focus on, but there are also video and image Gluon packages and so on. What's really nice is that GluonTS is built on top of MXNet, an ultra-fast library written in C++, and it can be used with both CPUs and GPUs, so you can really scale up your analysis.

Why am I so excited about this deep learning framework? Because it opens the door for us to efficiently build deep learning models for our time series. All the benefits you get with modeltime — forecasting made super simple, in very few lines of code — you'll have with GluonTS once it's integrated. Here's an example I was working on today: a function I built in R that taps into the GluonTS library and runs the DeepAR algorithm, and you can see the forecast it makes. I've connected up to an environment that has GluonTS running, so I can set it up for GPUs or CPUs, just like I could with Python, all from the comfort of R — pretty sweet. DeepAR is one of the algorithms, but there will be more. N-BEATS is the one that has subsequently gotten a lot of popularity: it was used in the second-place solution of the M5 competition, and while it wasn't available for the M4 competition, a later run on that competition got much, much better results than even the winning solution. GluonTS has N-BEATS incorporated, and modeltime will subsequently have N-BEATS too. Here's an example of the DeepAR algorithm with five epochs; after running about 20 epochs the predictions get much better. At first it seems to get the trend right, but once you let it run to the sweet spot of about 20 epochs, it really does quite a good job on this series.

All right, that's the roadmap — super pumped about it. Now let's get into the code. For my Learning Labs PRO members, this is the code base everyone gets. These are the folders and files; we'll work primarily out of the 01_walmart_sales_forecasting.R file, where I'll do forecasting at scale — seven time series at once — and we'll also touch on ensemble learning as we go through the tutorial. Learning Labs PRO members also get the app.R file, which is my Shiny app, "Forecast Many Time Series." I'll click Run App, and now Nostradamus Light is running: it has a TS Explorer, and all the groundwork is laid for you to build your own Nostradamus app. So that's the bonus — definitely sign up today.

First things first, though: we're going to walk through the analysis used to create the Nostradamus app. It's a basic analysis — it doesn't go through all the tools and techniques — but it shows you in a few lines of code how to get up and running with modeltime and modeltime.ensemble. The first step is loading libraries. I do have some dependency requirements: it's probably best to use the development versions of modeltime, modeltime.ensemble, modeltime.resample, and timetk. I believe the CRAN versions are up to date, but I'm not 100% sure because I'm using only the development versions; if you use the development versions, you'll definitely be able to run this code. Once those are installed, we load tidymodels for the machine learning libraries, the three modeltime packages, timetk for time series, and the core tidyverse.
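In code, that setup looks like this (a direct transcription of the described library block; install the development versions from GitHub if the CRAN releases lag behind):

```r
# Machine learning ecosystem
library(tidymodels)

# Modeltime ecosystem
library(modeltime)
library(modeltime.ensemble)
library(modeltime.resample)

# Time series utilities
library(timetk)

# Core data wrangling and visualization
library(tidyverse)
```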
Once you've done that, we're working with the walmart_sales_weekly data set, which is 1,001 rows by 17 columns. It has a bunch of time series groups: there's an id column that's a grouping variable, and if we group by id we can see there are seven different groups. We can also plot them using the plot_time_series() function from the timetk package. (I teach all of timetk in the first nine modules — the first part of the forecasting course — so you learn how to do all of this.) We can visualize our time series very quickly and see what each of the groups looks like, and that should look familiar, because the application running right here has timetk built right into it. So definitely learn timetk — it's worth its weight in gold and it will make you super powerful.

Next, now that we've seen our time series and know what we're dealing with, we need to prepare the data. We're doing what I call a panel data analysis: build one model to forecast all seven of these time series at once. First I set a forecasting horizon — how many steps into the future to forecast. This is a weekly data set, and there are about 52 weeks in a year, so I set it to 52 weeks and save it in a FORECAST_HORIZON variable that I'll use throughout the analysis to set up my data appropriately.

So we take walmart_sales_weekly and create what I call the full data set, which includes both the forecasting region — the future data — and the past history containing all the information that will be valuable to our machine learning models. First we pare it down to just the columns needed for this analysis: id (the grouping feature), Date (the timestamp), and Weekly_Sales (the value we're trying to predict). Then I apply some group-wise time series manipulations: I group by id and use the future_frame() function to extend each series. The row count increases from 1,001 by an additional 52 × 7 rows, because the forecast horizon is 52 and each of the seven ids gets 52 extra timestamps into the future. Again, future_frame() is a timetk function — if you don't know timetk, stop what you're doing and start learning it today; it's going to help you with time series. So we're now at 1,365 timestamps: we went from 1,001 up an additional 364. The next thing we do is consolidate the ids by running fct_drop() on the id column, which just
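In code, that data preparation looks roughly like this (column names follow timetk's walmart_sales_weekly; a sketch assembled from the steps just described):

```r
FORECAST_HORIZON <- 52

full_data_tbl <- walmart_sales_weekly %>%
    # Keep the grouping id, the timestamp, and the target
    select(id, Date, Weekly_Sales) %>%
    # Extend each of the 7 series 52 weeks into the future (adds NA rows)
    group_by(id) %>%
    future_frame(
        .date_var   = Date,
        .length_out = FORECAST_HORIZON,
        .bind_data  = TRUE
    ) %>%
    ungroup() %>%
    # Keep only the factor levels actually present in the data
    mutate(id = fct_drop(id))
```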
makes sure the categorical id feature only has the levels that are included in your data set. I save this as full_data_tbl, and now I have the base for the analysis.

Next I split it into a training set and a forecast region. This is something that's a little weird but necessary for time series: once you make your full data set, you split off what I'm calling data_prepared_tbl — every row where Weekly_Sales is not NA. If all goes well, that's the original 1,001 rows. I can run tk_summary_diagnostics() on it to see which series are in there and what their time ranges are; what I'm mostly concerned about is the time range, because I want to make sure all the groups start and end on the same day. Then there's the future table — the rows where Weekly_Sales is NA. These are the 364 rows created when we extended with future_frame(): for each id there's a date, and Weekly_Sales is missing because that's what we need to forecast. I save this as future_tbl and check its summary diagnostics too, making sure all the series start and end on the same day and have the same number of observations — and it looks like they do.

Now that the data is split into data_prepared_tbl and future_tbl, I set the future table aside; we won't need it until we make the future forecast at the end. One side note: the data_prepared_tbl here does not include many of the features my Nostradamus app includes. There are many more features I add in this step, specifically to handle time series data and to do what I call feature engineering, and that's really one of the main strategies that makes Nostradamus so powerful. I only teach those in the time series course, so definitely check it out if you're interested in going deeper.

With data_prepared_tbl in hand, we split it into training and test sets using the time_series_split() function from timetk (again, taught in my time series course). We take data_prepared_tbl — 1,001 × 3 — and split it into an analysis set and an assessment set. When I run it I get a splits object, along with two messages: the data is not ordered by the date variable, so the resamples are being arranged; and overlapping time series were detected, so they're being processed using a sliding window. All seven series in here are processed together, and that's one of the major upgrades I recently made to timetk that really lets us understand this kind of
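data. A sketch of the split just described — time_series_split() holds out the most recent 52 weeks of every series as the assessment set:

```r
# History (known sales) vs. forecast region (NA sales)
data_prepared_tbl <- full_data_tbl %>% filter(!is.na(Weekly_Sales))
future_tbl        <- full_data_tbl %>% filter(is.na(Weekly_Sales))

# Sanity check: all 7 series should share the same start/end dates
data_prepared_tbl %>%
    group_by(id) %>%
    tk_summary_diagnostics(.date_var = Date)

# Train/test split: last 52 weeks held out, everything earlier as training
splits <- data_prepared_tbl %>%
    time_series_split(
        date_var   = Date,
        assess     = FORECAST_HORIZON,
        cumulative = TRUE
    )
```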
Looking at the counts: the assessment set has 364 rows, and 364 / 7 = 52, so we have 52 timestamps for each of the seven groups. The analysis set has 637 rows, and 637 / 7 = 91, so about 91 timestamps of training data per group. I usually like the test set a little smaller — no more than about 20% or so of the data — and right now it's about a third, but for this demo it will work.

Next I create a preprocessor, which adds some of the engineered features that are easy to build for time series in this format. I create a recipe from the training split, then add a time series signature step, which takes the date feature and breaks it into a bunch of calendar features we can use for machine learning. Then we remove the features that aren't important, normalize some of the larger variables (the index.num and year features), convert the week feature into a factor, and then dummy all of the factor variables. That gives me a recipe with several steps that will be applied to my data. If I want to see what the data looks like post-processing — what we call making a design matrix for machine learning — it looks something like this: the timestamp has been converted into a bunch of features like month and week of the year, plus other engineered features; the groups have been converted into ones and zeros so a machine learning algorithm can learn from the data; there's still a date feature in there, which we'll have to handle separately; and the Weekly_Sales target is in there as well.

With that recipe in hand, we need to make one more recipe: I use update_role() to take the date feature and give it a new role called "id", and I save that as recipe 2.
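Here's a sketch of the two recipes; the exact step_rm() pattern and the normalized column names are assumptions based on the description above:

```r
recipe_spec_1 <- recipe(Weekly_Sales ~ ., data = training(splits)) %>%
    # Expand the Date column into calendar features
    step_timeseries_signature(Date) %>%
    # Drop calendar features that aren't useful for weekly data
    step_rm(matches("(.iso$)|(.xts$)|(day)|(hour)|(minute)|(second)|(am.pm)")) %>%
    # Rescale the large-magnitude features
    step_normalize(Date_index.num, Date_year) %>%
    # Week-of-year as a factor, then one-hot encode all factor variables
    step_mutate(Date_week = factor(Date_week, ordered = TRUE)) %>%
    step_dummy(all_nominal(), one_hot = TRUE)

# Same features, but Date demoted from predictor to an "id" role,
# so pure machine learning models ignore it
recipe_spec_2 <- recipe_spec_1 %>%
    update_role(Date, new_role = "id")
```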
Why two recipes? My modeltime algorithms all need a date feature, so anything that comes from the modeltime package — Prophet, Prophet Boost, ARIMA — uses recipe spec 1. But anything from tidymodels, like the parsnip library (XGBoost, glmnet, and so on), needs recipe spec 2, which takes the date and gives it a new role; that's what update_role() is for, and it's really important. Using the summary() function I can see the date feature now has a role of "id", whereas in recipe spec 1 — let me copy this to compare — the date is a predictor. That's what modeltime needs, but for the parsnip algorithms we change the date's role from predictor to "id".

All right, now that we have our preprocessors, let's build some models. We start with a workflow; our first model is Prophet with regressors. We begin with a blank workflow, and we need to add a preprocessor and a model. The model is prophet_reg(), which comes from the modeltime package and applies Prophet with regressors. After running these lines we have prophet_reg() added; the preprocessor still says "None", so next we add a recipe — spec 1, because this model comes from modeltime. Then we fit on the training data, which gives me a fitted workflow. It takes a second to run the Prophet algorithm, and then I have a fitted Prophet-with-regressors model along with the information used to create it. Cool.

Next we need to make more models. It's not good enough to make just one — running one model on your time series is a recipe for crappy models. If you want to do really good things for your organization, you need to experiment; I can't stress this enough, and this is what modeltime really excels at: helping you run and organize your experiments. So we add an XGBoost model, using the same formula: workflow, add a model, add a recipe, fit. And here's another thing I do in Nostradamus: I create many of these models. I don't stop at one XGBoost model, because you don't know which combination is going to work well — you need more than one model, and that's one of the things I teach in the time series course. This boosted tree model uses all the factory defaults, and we give it spec 2 because it comes from parsnip: it's a machine learning model, not a modeltime model, so it needs the recipe where the date's role was changed from predictor to "id". Then we fit, and I've just fitted an XGBoost model — its preprocessor is in place, the model has been fitted, and it's ready to predict on new data.
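A sketch of those two fitted workflows (the object names are mine):

```r
# Prophet with regressors: a modeltime algorithm, so it keeps Date as a predictor
wflw_fit_prophet <- workflow() %>%
    add_model(prophet_reg() %>% set_engine("prophet")) %>%
    add_recipe(recipe_spec_1) %>%
    fit(training(splits))

# XGBoost: a parsnip algorithm, so Date must carry the "id" role
wflw_fit_xgboost <- workflow() %>%
    add_model(boost_tree(mode = "regression") %>% set_engine("xgboost")) %>%
    add_recipe(recipe_spec_2) %>%
    fit(training(splits))
```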
OK, let's do a couple more models — I'll go a little faster. A random forest using recipe spec 2 with engine "ranger", from the ranger package; a support vector machine from the kernlab package, again a machine learning model using spec 2; and Prophet Boost, the XGBoost-ed version of Prophet that I include in my modeltime package, which uses recipe spec 1 because it comes from modeltime.

So we now have five different models — these are our sub-models — and this is where we pick up with modeltime. We create what's called a modeltime table. The advantage of modeltime, and what makes it so special, is that it helps us organize our information: all the models go into one table, and each gets a model id and a description that tells you a little about what's contained in the model. Something simple like that would otherwise take extra lines of code and time to organize; it just does it automatically. I save that as submodels_tbl.

The next step in my modeltime workflow is calibration. These models were all trained on the training set, and I want to calibrate — which tells me how well they do on the testing set. I run modeltime_calibrate() on the testing splits, then use the modeltime_accuracy() function to quickly get the accuracies. A quick survey shows Prophet with XGBoost seems to be our best, and each of the other models does a little differently.

We can then start to visualize. I take the calibrated table and pipe it into modeltime_forecast(), with new_data being the testing data and actual_data being data_prepared_tbl, for comparison purposes. One of the new advances in this function for modeltime 0.3.0 is the keep_data argument, which attaches your original data to the forecast output — so the id grouping variable gets pulled in. Then we can group by id, which makes it really efficient to plot with plot_modeltime_forecast(). By default the series are all stacked on top of each other; if I don't want that, I can set .facet_ncol = 2 — the number of columns in the facets — to make it slightly easier to read. There we go: two columns, and you can see what each of these forecasts produces. I can deselect individual models and see how each one performs: this one looks a little low, this one looks like it's doing a pretty good job, and ranger seems to be predicting a little high on some of the series. So now I understand a little more about my time series — and that's how easy it was to take five different models and predict on seven different data sets. Pretty cool.
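Putting the five sub-models through the modeltime workflow looks roughly like this (the random forest, SVM, and Prophet Boost workflow objects are assumed to be fitted the same way as the two above):

```r
submodels_tbl <- modeltime_table(
    wflw_fit_prophet,
    wflw_fit_xgboost,
    wflw_fit_rf,            # rand_forest()  %>% set_engine("ranger"),  recipe 2
    wflw_fit_svm,           # svm_rbf()      %>% set_engine("kernlab"), recipe 2
    wflw_fit_prophet_boost  # prophet_boost() with recipe 1
)

# Calibrate on the hold-out year, then score each model
calibration_tbl <- submodels_tbl %>%
    modeltime_calibrate(new_data = testing(splits))

calibration_tbl %>% modeltime_accuracy()

# Test-set forecasts for all 7 series, faceted two columns wide
calibration_tbl %>%
    modeltime_forecast(
        new_data    = testing(splits),
        actual_data = data_prepared_tbl,
        keep_data   = TRUE   # new in 0.3.0: keeps the id column for grouping
    ) %>%
    group_by(id) %>%
    plot_modeltime_forecast(.facet_ncol = 2)
```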
The next thing in my modeltime workflow is to take the calibrated sub-models and refit them on the full data set. This is what I normally do: take the model trained on the training data and refit it on the full data set, which usually gives a better forecast. Then, to visualize what that forecast looks like into the future, we run the refitted table through the same process — again with .facet_ncol = 2, and a little wider so we can see the forecast.

And here we see some crazy things going on. This is something I've seen quite frequently from Prophet with regressors: in my opinion it tends to overfit the data. Underneath the hood, Prophet uses a linear regression model to analyze the regressors, and it doesn't do a very good job — unregularized linear regression is a no-no in machine learning, so you can get results like this that are pretty wild. The rest of the models are doing okay; the best, I believe, is Prophet with XGBoost, because it uses XGBoost rather than linear regression to analyze the regressors, and XGBoost regularizes, producing a much better forecast.

OK, cool — we're making progress. Next I want to talk a little about modeltime.ensemble and making these forecasts at scale using ensemble models. I take my submodels_tbl, and since I've recognized I don't want to use Prophet with regressors, I remove it from the modeltime table. Then I use the ensemble_average() function to make a basic ensemble — the most basic one you can do, an average ensemble using type = "mean". This creates a specification for an ensemble of the remaining four sub-models. I save that as ensemble_fit_mean and put it into another modeltime table, and now we see "Ensemble (Mean): 4 models" — the 4 representing the number of models underneath. Modeltime knows you've got an ensemble and starts organizing it for you.

Next we take the ensemble table, combine it with the sub-models table so we can compare them, and run modeltime_accuracy() on the testing set. Just that easily, I have my ensemble and can see how it stacks up against the rest. It looks like it's doing a little worse here: the ensemble gets an RMSE of 8,521 versus about 8,160 for our best sub-model. But keep in mind, I told you I didn't do everything I could in this analysis — there's more to be learned.
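Those ensembling steps, sketched (the .model_id of the Prophet-with-regressors model is assumed to be 1 in the table):

```r
# Drop Prophet w/ regressors, then average the remaining four sub-models
ensemble_fit_mean <- submodels_tbl %>%
    filter(.model_id != 1) %>%   # assumed id of prophet_reg in the table
    ensemble_average(type = "mean")

ensemble_tbl <- modeltime_table(ensemble_fit_mean)

# Compare the ensemble against the individual sub-models on the test set
ensemble_tbl %>%
    combine_modeltime_tables(submodels_tbl) %>%
    modeltime_accuracy(new_data = testing(splits))
```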
And just to prove that to you, look back at the auto-forecast I ran in Nostradamus — the same data set, same 52-week horizon. The RMSE I'm getting from the ensemble weighted model there is around 5,778, versus the 8,521 we just got: 5,778 / 8,521 ≈ 0.68, so subtracting from one, that's about 32% better. So there's a lot more you can do, and I encourage you to try. This is the power of these techniques, and it's why feature engineering is so important and why model experimentation is so important. We only have five models here — there are many more we could use — and the challenge is figuring out which models will get the best results, which combinations, and how to combine them in the ensembling. Once you get into ensembling, another strategy I use is to go beyond the average ensemble and try some cutting-edge techniques; and in the feature engineering area up in the data preparation, I add a lot more features. I teach how to do all of this in the time series course.

OK, next we need to refit on the future data. Our ensemble table is just one row by three columns, with the ensemble in it. I run it through modeltime_refit(); it takes a second because it has to refit all four of the sub-models. Then we make the future forecast — notice I'm finally using the future_tbl I created at the beginning. Again I use modeltime_forecast() with keep_data = TRUE, group by the id feature, and just like that I've made all seven forecasts with one model: my ensemble. OK, cool.
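Sketched, those final steps look like this:

```r
# Refit the ensemble (and its sub-models) on the full history
ensemble_refit_tbl <- ensemble_tbl %>%
    modeltime_refit(data = data_prepared_tbl)

# Forecast the 52 future weeks for all 7 series with one ensemble model
ensemble_refit_tbl %>%
    modeltime_forecast(
        new_data    = future_tbl,
        actual_data = data_prepared_tbl,
        keep_data   = TRUE
    ) %>%
    group_by(id) %>%
    plot_modeltime_forecast(.facet_ncol = 2)
```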
And just like that, you've now learned a little about modeltime — but there's a lot more to learn. The time series course has all of these techniques and much more baked into it. Time series does take hard work and some effort to learn, but once you learn it you become super valuable to your organization. There's a lot I did not cover here, so here's what you still need to add to your arsenal if you want that ~5,700 RMSE: feature engineering — figuring out how to add different features, what will work well, and when to apply new features and when not to. Particularly important features are lags and rolling features for your series; I cover all of these in modules 2 through 7 of the time series course. You also build 30+ models in the course: modules 9 through 12 cover building all the machine learning and ARIMA models and so on, module 13 is hyperparameter tuning and stabilizing your models, and module 14 is building better models with ensembles — stacking with super learners and weighted ensembles, which are more sophisticated and usually give better accuracy than the plain average ensemble I used here.

So here's my challenge to the students in my time series course: take this app and make it better. I want to see who can make the best application by modifying the forecast analysis inside it. The analysis lives in the observer part of the function: you can see we go through the same splitting and preprocessor steps, create three different models, make a sub-model table, and average them together, and so on. This is the light version all of my Learning Labs PRO members are getting today: run it with a 52-week horizon and it forecasts walmart_sales_weekly. When you do this, you should be able to build a better model than what I've got here, and I want to see which of my Learning Labs PRO members can do that — and by how much they can improve their RMSEs.

So that's the demo — definitely check it out. Now let's talk about your transformation, where you can go from here. The idea is pretty simple: businesses need Nostradamus — there's no question about it. If you can provide this application to your organization, your career is going to take off. And I'm not just giving it away: it took me a while to build, and if I simply handed it to you, you wouldn't learn anything. So take the Nostradamus Light application and try to get it as good as you can. I'll give pointers to Learning Labs PRO members and point you to the sections of the time series course you can leverage. That's my challenge to you.

You've seen what modeltime can do, and what happens when you combine modeltime and Shiny: you can become super powerful to your organization just by combining these skill sets. There's a big change going on out there, and everyone needs to recognize it. Data science is blowing up because organizations need to make decisions at scale, and they need applications to do it. Businesses can't scale reports; they can't scale Excel files; there's no other way to scale this stuff. And they now have access to infrastructure — cloud and servers like AWS and Azure — so you can solve these problems for your business and become super valuable in the process. Apps are what they need. Businesses need Nostradamus; they need your help building the applications that let them forecast their data at scale.

So how are you going to get there? Here's the key — the fast track: a program I've been developing over the past three years that will activate your transformation. It begins with the R-Track, and then there's Learning Labs PRO, where this lesson will be stored. The end game is to solidify your foundations, and it's a step-by-step process. If you're on the spectrum where you're just starting out, this is what you need to
hear, because we all start at the beginning, where we don't know a whole lot about data science. We know it's really powerful, and we want to enhance our skill sets to make us more marketable and to make an impact on our organizations. How do you do that? The R-Track is the key, and it's a three-step process. Step one: you start at the beginning of the R-Track with the 101 course, which builds your skill sets by teaching you, repetitively, how to analyze data, do data wrangling, visualize data, and create your stories so you can explain and communicate them to your organization. It's like climbing a hill: you start with projects and work your way up, and once you're at the top you're getting pretty dangerous. Step two: once you've learned data science, you learn how to build applications — the Shiny component. You have the modeltime component; now you move into Shiny, which is about productionalizing your analysis. It's really simple once you learn Shiny: you convert an analysis into an application that leverages that analysis. Then, once you have those steps down, there's the Learning Labs PRO program, which is amazing, especially once you've finished the 101 course — I'd say that's the minimum prerequisite. You get continuous learning on things the core curriculum doesn't touch, like financial analysis, working with APIs, and using Python and R together — things that will keep accelerating you long after you've finished the program. So the system has two parts: the R-Track and Learning Labs PRO. We've already talked about Learning Labs PRO, and you're more than welcome to join, but if you're at the beginning of the spectrum, the R-Track is what I want to focus on right now.

Here are a few of my success stories — students in my program who have gone on to do amazing things. This one is Jennifer Cooper: she recently got a job as VP of Strategic Analytics at JPMorgan Chase, where she's doing amazing things. If you want to know what it took to get in the door and excite them enough to hire her, her testimonial is very telling. During her interview process she used the Business Science Problem Framework to complete a project in R Markdown. They didn't know she was using R Markdown — they just got a PDF as part of the interview process — but she went step by step through my Business Science Problem Framework and gave them a polished, finished product that impressed the hiring manager so much he was willing to fast-track an offer. Jennifer felt she had plateaued a little at her previous organization and was looking for better roles, and she knocked it out of the park, landing a VP of Strategic Analytics role at a very large company in the financial
sector. Great job, Jennifer — that's the way to use the coursework. It's the same with these other stories. Masataki got a job at a top management firm. Z here was able to improve patient care while reducing costs at the same time, using some of the tools from the 101 course. Mohana got a 40% raise just by showing how much more productive he was: his company didn't care whether he used R or Python, but while his Python counterparts were taking so long to analyze the data, he was pumping out project after project, and it really helped accelerate his career. Herb works at Florida Power & Light, and after the first three months of his coursework he built a Shiny app that really impressed his executives and his organization. And finally, Raj: if you want to learn Shiny, there's no better success story — he literally won the 2020 RStudio Shiny contest, using a lot of the tools and techniques he learned in the Shiny Developer course. No joke, the coursework works; you just need to buy into the system, and if you commit, it will help you big time.

Again, here's the trajectory. We all start at the 101 course and move through advanced machine learning and time series. That's where you learn the Business Science Problem Framework — this is how Jennifer got the job at JPMorgan Chase — and then machine learning and time series together, which is really what it takes to build an application like Nostradamus. At the end, you learn Shiny, because it's not enough to learn analytics; you have to know how to deploy it. That's what the two Shiny courses teach: the intermediate Shiny course, then the Advanced Developer course, where you learn AWS, Docker, Git, and how to develop full-stack web applications. That's how I was able to build Nostradamus in under a week — it actually probably took me about three days to build this entire forecasting application. (Don't tell your company that when you're building yours.) It's going to make you super productive.

A little more about the courses, if you want to understand the curriculum. The 101 course has a ton of content: in week six you're doing machine learning — clustering, regression — but the fundamentals are in weeks one through five, where you learn data manipulation (text, time series, categorical data), visualization, and so on. At the end you do two projects, which is really important, because you need to solidify your skills, build reports, and show how you present information to your organization. The 201 course is the one with the Business Science Problem Framework that Jennifer used to get her job: the entire course walks you through a machine learning project where you compute return on investment, do data preparation, understand the data, do storytelling, and use cutting-edge tools your organization can get immediate value from. Then there's the time series course, the newest addition, which teaches all the tools I implemented to build this forecasting app. What you learn in this course: feature engineering,
which is super, super critical; machine learning, where you learn every algorithm under the sun; and deep learning, which is coming soon with modeltime + GluonTS. It's the most cutting-edge time series course on the planet. Then there are the Shiny courses. The first Shiny course has you building two dashboards: a predictive pricing app that uses XGBoost, and a sales dashboard with demand forecasting that also uses XGBoost. The more advanced course is really what it takes to build Nostradamus: if you want to build and deploy this, you need to learn AWS, Docker, Git, Bootstrap, MongoDB, Shiny Server — all the things it takes to handle multiple users in an application — and that's what this one teaches you.

OK, I've talked a lot about the courses. What I'm excited to give you is a 15% off coupon, available right now. If you're interested in joining any of the courses — and you get an additional benefit if you join the R-Track bundle — you can get it for 15% off; David has just provided the course link in the chat. Keep in mind that whether you're ready to dive in fully with a one-time purchase, or you're looking to budget and spread the payments out over time, there's an 18-month plan as well. The idea is that this is an investment: you invest in it and it pays you dividends in career acceleration. If you're looking to transition into data science, to add to your skill sets, or you're tired of where you're at — plateauing and getting three percent raises — this is what will take your career to the next level. This is how Mohana got a 40% raise in the span of a year, and how all of these students are doing good things for their companies, becoming really valuable to their organizations, and getting rewarded for it. So that's the offer — I can't wait to see some of you in the courses. Now, David, I'm ready to handle some questions.

David: All right, let's do it. Just really quick, since Matt was laying out the course details: a couple of people asked about jumping straight to the time series course. We highly recommend that if you're new to R — or even if you think you're intermediate — you take a look at the course descriptions, because there's a lot in each of these courses, and you may think you're ready for the time series course, get in there, and end up frustrated. Each course has prerequisites; the only one without any is the 101 course, and the prerequisite for the time series course is 101 — we lay that out in the first week. You're more than welcome to try it, and you can always go backwards, but if you feel like you're not quite ready for it, pick up the 101 course — you're not going to regret that decision one bit.

Matt: Yeah, and they're made to build on the skills from the previous one, so definitely consider that. OK, cool. Muhammad asks: does modeltime work with irregular time series, like intermittent demand data? Yes — and to prove it, let's take a look at Nostradamus. I'm going to load a data set that has missing
Yeah, there were a few questions on that, so that's a great one. Someone asked: how would modeltime handle grouped or hierarchical time series data? That's a good question. It turns out the data set we've been dealing with, walmart_sales_weekly, is both grouped and hierarchical (and actually I hit 26 observations earlier, but what we did in the lesson was 52). What happens is that when you create one model that works for all of your different time series, your algorithms are able to learn from the other series and pick up correlations between them. That's essentially the situation with a hierarchical data set: you might have overall sales, then sales by department, and under each department sales by product or by area. You can include all of those series as groups, and when you model them all together it handles it just right, because the groups learn from each other. So that's how it handles hierarchical data: you build one model that models the entire data set globally, rather than fitting each series locally, and you end up getting pretty decent results.
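Here is a minimal sketch of that single global-model pattern on walmart_sales_weekly, the grouped data set from the lesson (it ships with timetk). The pared-down feature set and the lone XGBoost model are simplifying assumptions; the point is just that one model is fit across all seven series at once.

```r
library(tidyverse)
library(tidymodels)
library(modeltime)
library(timetk)

# The id column marks the group / hierarchy level for each observation
data <- walmart_sales_weekly %>%
  select(id, Date, Weekly_Sales)

splits <- time_series_split(data, assess = 52, cumulative = TRUE)

rec <- recipe(Weekly_Sales ~ ., training(splits)) %>%
  step_timeseries_signature(Date) %>%
  step_rm(Date) %>%
  step_dummy(all_nominal_predictors(), one_hot = TRUE) %>%  # dummies the id
  step_zv(all_predictors())

wflw_fit <- workflow() %>%
  add_model(boost_tree(mode = "regression") %>% set_engine("xgboost")) %>%
  add_recipe(rec) %>%
  fit(training(splits))

# One calibrated model forecasts every group in the panel at once
modeltime_table(wflw_fit) %>%
  modeltime_calibrate(testing(splits)) %>%
  modeltime_forecast(
    new_data    = testing(splits),
    actual_data = data,
    keep_data   = TRUE
  ) %>%
  group_by(id) %>%
  plot_modeltime_forecast(.facet_ncol = 2)
```

One-hot encoding the id column is what lets a single tree-based model keep the groups distinct while still sharing seasonal and trend structure across them, which is the "learning from the other series" effect described above.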
Wonderful. Moritz asks: do you cover inference in your courses, as well as forecasting? In the time series course, inference is covered in feature engineering, because you can't build good features unless you really understand which features of your time series add information. From an inference standpoint, I show how to build the diagnostic tools you need, like the ACF and seasonality plots, and there's a plot_time_series_regression() function that is absolutely phenomenal for helping you understand what information is in your time series and how you can utilize different features to extract it. It also turns out this helps you explain the time series: if there's a seasonality component, that component will rise to the top in plot_time_series_regression(), and if you include something like an event, the regression coefficients will show whether or not that event helps you. So the forecast becomes more explainable as you go through that process of understanding your features.

Okay, great. William asks: do you have a step to explore model instability or overfitting? That would be modeltime.resample, and that's what I recommend. Check that package out; it's brand new and it's fantastic. What it does is time series cross-validation, so you can slice up, or chunk up, your time series and see how a model performs over your different time regimes and windows.
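A minimal sketch of that resampling workflow follows, assuming models_tbl is an existing modeltime table of fitted workflows and training_tbl is the training data; both names are hypothetical placeholders, and the slice sizes are arbitrary choices for a weekly series.

```r
library(tidymodels)
library(modeltime)
library(modeltime.resample)
library(timetk)

# Build rolling-origin CV slices over the training series
resamples_tscv <- training_tbl %>%
  time_series_cv(
    assess      = 52,    # hold out roughly one year per slice
    skip        = 26,    # shift the window half a year between slices
    cumulative  = TRUE,
    slice_limit = 4
  )

# Refit every model in the table on each slice
resample_results <- models_tbl %>%
  modeltime_fit_resamples(
    resamples = resamples_tscv,
    control   = control_resamples(verbose = FALSE)
  )

# Stable models score consistently across slices; unstable or overfit
# ones swing widely from one window to the next
resample_results %>%
  modeltime_resample_accuracy(summary_fns = list(mean = mean, sd = sd)) %>%
  table_modeltime_accuracy(.interactive = FALSE)

resample_results %>%
  plot_modeltime_resamples(.interactive = FALSE)
```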
Okay, great, let's take one or two more. Let's see: in the Shiny app there's a weighted-models ensemble; how are these weights made? That's a secret. I knew it was a secret, but it's a good question. It's a great question, and yes, it is a weighted ensemble; it's a technique. Take the 203 course and you'll learn how to build ensemble models, and if you go through it and still can't figure this one out on your own, I'll point you in the right direction.

Awesome. Let's see if we can do one more. Matthew asks: is there functionality within modeltime to work with cross-sectional panel data, similar to your example of a global model by country, but with multiple vintages of data? Yes: do the exact same thing I showed in the lesson; we've been working with panel data all day. You add whatever your groups are into the same data set, stack the series on top of each other, and run one model over all of them, exactly like the global-model sketch shown earlier. You do have to make sure you select models that will give you good results, but that's how you analyze panel data.

All right, great. There are still some other questions out there. Sorry, guys, we can't get to all of them, but we really appreciate you coming and sticking around for the whole presentation, and we hope you got some value from it. I'll throw the link to the courses in here one more time if anybody's interested; also check out the details for each course, because if you click on one you'll see all the modules and get a sense of what's involved. We'll be announcing the next Learning Lab in about two weeks, and we hope to see you on that one. Until then, definitely check out modeltime: you will love the package. It's cutting edge, it really has all the bells and whistles coming to it, and we have modeltime.gluonts coming soon, which is pretty exciting. I'll continue to develop it, and all the information you need to learn modeltime is contained in the course, so definitely give the course a look as well; it'll be the best money you've spent. Perfect. All right, thanks again, everyone; we'll see you soon. See ya, guys. Take care.
Info
Channel: Business Science
Views: 4,143
Rating: 5 out of 5
Id: 6RjYIOCnRMk
Length: 75min 15sec (4515 seconds)
Published: Mon Nov 16 2020