Time Series Anomaly Detection with LSTM Autoencoders using Keras & TensorFlow 2 in Python

Captions
Hey guys, today we're going to have a look at one way to detect anomalies in time series data, and we're going to do that using deep neural nets, more specifically LSTM autoencoders. We'll build everything using Keras and TensorFlow 2, and we're going to detect anomalies in the S&P 500 daily closing price. Let's have a look at the data for this tutorial. The dataset comes from Kaggle and is provided by Patrick David (I've linked his Twitter account here, so thanks, Patrick, for providing this dataset). It contains daily closing prices for the S&P 500 index from 1986 to 2018. If you don't know what the S&P 500 index is, you can read this or go to the Wikipedia link. I'll link the tutorial and the source code for this video down in the description, so you can follow along if you like.

I've just opened a new notebook in Google Colaboratory and already enabled the GPU hardware accelerator; as you can see, I have a Tesla P100 here, which is great. Then I'm going to install tensorflow-gpu, which should give us the GPU version of TensorFlow 2, do a bunch of imports, set some seeds for reproducibility, and download the CSV containing the data.

This is where we actually start writing code and using the dataset. The first thing I'm going to do is read the CSV file into a pandas DataFrame; then we'll have a quick look at the data, and then we'll develop our LSTM autoencoder model. I'll also speak briefly about what LSTMs are and, more specifically, what LSTM autoencoders are. Let me start with reading the data. I'm going to use the read_csv function on the downloaded file called spx.csv, parse the dates in the column called date, and set that column as the index as well. If I look at the DataFrame, we see that we have the date and the closing price for each particular day. Let me check the shape of this thing: we have around 8k examples here, which is frankly not a lot, but we should be able to work with that.

The next thing I'm going to do is plot the prices so we can have a look at them. This is a graph of all the closing prices, and as you can see, after 2016 the prices are steadily increasing up until 2018. We're going to take just a part of this data, 95% of it, to train our model, and then we're going to predict on the next year or so.

First I'm going to split the data. I'll define a variable called train_size, which is 95% of the data, and then the test size, which is just the number of rows in the DataFrame minus the train size. Next, I'm going to take a subset of the data for the training set and a subset for the test set using the pandas iloc method: the test set starts at train_size and goes to the length of the DataFrame. This should divide our dataset nicely. Checking the shapes of both sets, we have just below 8k examples for training and about 400 examples for testing, so those are roughly 400 days on which we're going to detect anomalies as our test data.
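Here is a minimal sketch of the setup described so far. It assumes the Kaggle file is named spx.csv and has date and close columns, as in the video; adjust the names if your copy differs.

```python
import numpy as np
import pandas as pd

# Seeds, for reproducibility
RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)

# Read the CSV, parse the `date` column, and use it as the index
df = pd.read_csv('spx.csv', parse_dates=['date'], index_col='date')
print(df.shape)  # around 8k daily closing prices

# 95% of the rows for training, the rest for testing
train_size = int(len(df) * 0.95)
test_size = len(df) - train_size
train, test = df.iloc[0:train_size], df.iloc[train_size:len(df)]
print(train.shape, test.shape)
```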
The next thing we're going to do is actually scale the data. Scaling is really important for most machine learning algorithms, in particular algorithms that use gradient descent or a similar method to find the best parameters for your model. We're going to use the StandardScaler from the scikit-learn library, so I'm going to import that here and define an instance of the scaler (I'm doing the import mostly so that autocomplete works). Then we fit it on the train closing price; I had a typo there. Note that we're fitting only on the training data, and this should always be the case for you: you want to fit your preprocessing only on the training data, because you will of course evaluate only on the test data and you don't want a leakage of information. For example, you want the mean to be computed only on the train set, not on the whole dataset, so that you simulate the real-world situation as closely as you can.

Next, I'm going to apply the transformation to the training closing price and the test closing price, again using only the parameters estimated on the training set. To do that I'm going to use the transform method of the scaler, on the train set and then on the test set. You might receive some warnings here, but this should be all right, at least for now. We've now scaled the closing price in both the training and test data, so we can have a look at some of it; as you can see, there's a minus sign in there, but that's just fine, that's the scaler doing its job.

The next thing we're going to do is create the dataset that we'll use for the LSTM model. Recall that an LSTM is a type of recurrent neural net, so it expects some sort of sequence, and we're going to create a number of sequences that each use 30 days of the closing price. For example, a single training example will be the prices from the 1st to the 30th of January 1986 (or whatever year), the next example will be from the 2nd of January to the 31st of January of the same year, and so on. We're going to do that using a helper function called create_dataset, and this function is actually really generic, so it works pretty well.

Let's create this function. It's named create_dataset, of course, and it takes X, y, and a number of time steps; you'll see why in a second. I'm going to create two lists in here and then iterate over the length of the Xs minus the time steps. Inside the loop I take the current window from X with a bit of indexing: this grabs the values at the time steps we're currently interested in, so it might be, for example, January 1st to January 30th, depending on the number of time steps we've passed in (in our example, 30 days of history). We append that window to the Xs.
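A short sketch of the scaling step, continuing from the split above; writing the scaled values back into the train and test slices is what can trigger the pandas warnings mentioned in the video.

```python
from sklearn.preprocessing import StandardScaler

# Fit on the training closing prices only, so no test-set statistics
# (mean, standard deviation) leak into the model
scaler = StandardScaler()
scaler = scaler.fit(train[['close']])

# Apply the training-set parameters to both splits
train['close'] = scaler.transform(train[['close']])
test['close'] = scaler.transform(test[['close']])
```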
I'm going to do roughly the same thing for the ys, but there I take January 31st and use its value as the label: we're going to predict its value based on the previous thirty days. Let me append that, and finally return those two lists as NumPy arrays. The function should be working now, and we're going to convert our data into sequences that the LSTM autoencoder will understand. I'm going to define the time steps, the number of days we use as history, as 30. We're using only a single feature here, but this is much more generic and you could use multiple features if you had them; currently we have only the closing price. I pass in the time steps and check that the function actually works (yes, it appears to be working), and then do the same thing for the test dataset. Let me run this and check the shape of X_train: as you can see, we have the number of training examples minus 30 sequences, the time steps dimension is equal to 30, and in the last dimension we have just the price. This is the shape we're going to use as the input to our net.

The final piece of the puzzle is the LSTM autoencoder. You should already know how LSTMs work: they're recurrent neural nets and they need some sort of sequence as input. The part that's a bit more interesting, I guess, is the autoencoder. An autoencoder is not a layer in Keras or anything like that; it's a type of architecture, a type of neural network that tries to reconstruct the data it is given. The idea behind all of this is to train a model on this data and, once it's trained, measure its reconstruction error. If that error is above some threshold (or below, depending on what you want to detect, but let's say above the threshold you've specified), then your model can't reconstruct this data well, and it might be an anomaly or a rare event. So, to summarize once more: if the error is larger than some threshold, we'll say the current data point is an anomaly. We're going to apply that to the test set and see whether it works in practice, at least for this dataset.

The model itself is rather easy to code. I start by creating a Sequential model using Keras and add an LSTM layer; it contains 64 units, and I specify the input shape, which is the shape of the sequences I showed you, so we have something like this: the units and then the input shape for this layer. Then I add a Dropout layer, just for regularization, with a rate of about 20 percent. Then comes the interesting part: a RepeatVector, which simply copies the vector we have so that we get a sequence back, and I specify the number of repetitions to be the size of the time dimension, which is 30 here. So again we have a sequence of thirty elements, and I add another LSTM layer to reconstruct the sequence we had so far.
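The helper might look roughly like this; the 30-day window and the [samples, time steps, features] shape match the description above, while the exact variable names are illustrative.

```python
def create_dataset(X, y, time_steps=1):
    # Slide a window of `time_steps` rows over X: each window becomes one
    # input sequence, and the row right after the window becomes its label
    Xs, ys = [], []
    for i in range(len(X) - time_steps):
        v = X.iloc[i:(i + time_steps)].values
        Xs.append(v)
        ys.append(y.iloc[i + time_steps])
    return np.array(Xs), np.array(ys)

TIME_STEPS = 30

# [samples, time_steps, n_features]; n_features is 1, the closing price
X_train, y_train = create_dataset(train[['close']], train.close, TIME_STEPS)
X_test, y_test = create_dataset(test[['close']], test.close, TIME_STEPS)
print(X_train.shape)
```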
For this second LSTM layer I'm going to specify another parameter that you might have heard of or seen: it's called return_sequences, and it's pretty much self-explanatory; it makes the layer return the full output sequence instead of only its last output. Then I add another Dropout layer with the same dropout rate, and finally a TimeDistributed fully connected layer as the output. This layer gives us the same number of features that we passed in as input. So basically what we're doing here is feeding in a sequence of 30 closing prices and asking our model to predict the same sequence. That's a bit different from memorizing the sequence itself: we're severely regularizing the model and forcing it to learn the most important features of the data, and we're then going to take the reconstruction error, threshold it, and use that to decide whether or not we have an anomaly on our hands. Let me add the TimeDistributed layer; the number of units of its Dense layer is the number of features we have, which in our case is just one, the closing price. Finally, I compile the model using the mean absolute error as the loss and the Adam optimizer. This should hopefully work.

Then we train our model. The training is pretty much what you'd expect: I train for ten epochs, specify a batch size of 32, and validate on 10% of the data. The most important thing we need to do when training on time series data is to not shuffle the data, because the data is history-dependent; we don't have the assumption that the examples are independent. Okay, let's start the training. We're training on about 7k examples, and the training should be pretty fast. After the training is complete, I look at the validation loss, and it's decreasing; as you can see, it decreased steadily over just ten epochs. The training curve itself was a bit rough, but after 10 epochs the validation loss has gone down a bit, so it might be the case that our model has learned something. Of course, we don't have a lot of data here; in the real world you'd probably have a lot more and train your model for longer, but we're going to just continue with this model.

Next we take the predictions on the training data (you'll see why in a bit) and calculate the mean absolute error on the training set, using NumPy. Let me have a look at the values: we now have the error for each example in the training set. Now I'm going to show you something really cool. We're plotting the errors over the training examples, and as you can see (let me zoom in), a lot of examples have an error just below 1.5 or so. So we might take this value, or maybe something a bit more extreme, a value like 1.8, and declare anything with an error of more than 1.8 to be an anomaly. Let's continue and do the exact same thing on the test data.
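Here is a sketch of the autoencoder and its training as I read them from the video (64 LSTM units, 20% dropout, a RepeatVector over the 30 time steps, and a TimeDistributed Dense head with one unit); treat the exact arguments as assumptions rather than the author's verbatim code.

```python
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.LSTM(64, input_shape=(X_train.shape[1], X_train.shape[2])),
    keras.layers.Dropout(0.2),
    # Repeat the encoded vector 30 times so the decoder receives a sequence
    keras.layers.RepeatVector(X_train.shape[1]),
    keras.layers.LSTM(64, return_sequences=True),
    keras.layers.Dropout(0.2),
    # One output per time step, with one feature (the closing price)
    keras.layers.TimeDistributed(keras.layers.Dense(X_train.shape[2])),
])
model.compile(loss='mae', optimizer='adam')

history = model.fit(
    X_train, X_train,   # an autoencoder is trained to reconstruct its input
    epochs=10,
    batch_size=32,
    validation_split=0.1,
    shuffle=False,      # keep the time order intact
)

# Reconstruction error for each training sequence
X_train_pred = model.predict(X_train)
train_mae_loss = np.mean(np.abs(X_train_pred - X_train), axis=1).flatten()
```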
Let me just rearrange this. I take the predictions on the test set, and again calculate the test mean absolute error; then I'll show you how you can use that for thresholding. I'm comparing the test mean absolute error against the threshold we picked, let's say 1.9, and let me plot this. Okay, I had a typo in create_dataset, so the fixed version will be in the source code linked down in the description. Let's have a look at the new plot: as you can see, you might now want to pick a threshold of about 0.65. We're picking the threshold based on the training dataset, and I'm roughly saying it should be this value right here, so anything above it is considered an anomaly. Let me plot the values that are above this threshold: we're picking just a few examples, and here I'm plotting the test loss against the threshold value.

The last thing we're going to do is actually show the points at which we're saying there is an anomaly, and for that I'll use a combination of pyplot and seaborn. First I create the anomalies DataFrame, which is just the rows that have anomaly equal to True, so it should be a small dataset; here we have just two anomalies. Let me show you those. These are the anomalies our model is detecting. You can go back and pick a lower value, 0.6 for example, and then these are all the values our model considers to be anomalies; we have four examples now. As you can see, the price goes up, up, up, and then suddenly drops; and then again we have a sudden change here, which suggests the model expected the price to continue going down, but instead it goes up. If we reduce the threshold even further, we observe something more interesting, and anomalies are detected here, here, and here as well. The lower the threshold value, the more anomalies you're going to detect, of course, and depending on your problem, whether you're looking for anomalies in, say, server uptime or in credit card data, you might want to use a different threshold. Selecting this threshold can be an interesting problem from a computational standpoint: you might try different values, or you can read about automatic ways to select it. But this is the basic way you can detect anomalies in time series data.

I hope you've enjoyed this video. Give it a thumbs up, subscribe to this channel, and comment down below if you want some further clarification on anything I showed you here. Have a good New Year's Eve. Bye!
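Finally, a sketch of the thresholding step; the 0.65 cutoff is the value read off the error plot in the video, and the DataFrame columns (loss, threshold, anomaly, close) are illustrative names, not necessarily the author's.

```python
THRESHOLD = 0.65  # picked by eye from the training-error distribution

X_test_pred = model.predict(X_test)
test_mae_loss = np.mean(np.abs(X_test_pred - X_test), axis=1).flatten()

# One row per test sequence, aligned with the dates after the first 30 days
test_score_df = pd.DataFrame(index=test[TIME_STEPS:].index)
test_score_df['loss'] = test_mae_loss
test_score_df['threshold'] = THRESHOLD
test_score_df['anomaly'] = test_score_df.loss > test_score_df.threshold
test_score_df['close'] = test[TIME_STEPS:].close

# The points the model cannot reconstruct well are flagged as anomalies
anomalies = test_score_df[test_score_df.anomaly]
print(anomalies)
```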
Info
Channel: Venelin Valkov
Views: 36,730
Keywords: Machine Learning, TensorFlow, Artificial Intelligence, Data Science, Time Series, Anomaly Detection, Autoencoders, LSTM Autoencoders, Keras, Python
Id: H4J74KstHTE
Length: 29min 39sec (1779 seconds)
Published: Sun Dec 29 2019