Predicting Stock Prices in Python

Video Statistics and Information

Captions
[Music]

What is going on, guys? Welcome back to the NeuralNine channel. Today we're going to learn how to predict stock prices with neural networks in Python, so let us get right into it.

Before we actually get into anything, let me tell you first of all: this video is not investing advice. I'm not just saying this for legal purposes; it is literally not investing advice in any way. You shouldn't use this model to predict stock prices, because it's not going to predict stock prices. It's going to be a recurrent neural network, an LSTM network, and it's going to use the last 60 days, let's say the last n days, to predict one day into the future. It's not going to predict into the far future, 50 days, 60 days, two years; it's just going to predict the next day based on 60 days. It might be better than guessing, but it doesn't have to be, so please don't use this to make investment decisions at all.

This is more about learning how to use recurrent neural networks, about machine learning, neural networks, TensorFlow and Python programming than it is in any way about predicting stock prices. The video is called "Predicting Stock Prices", and we're going to try to predict stock prices, but it's not going to be anything you would want to use in your investment decisions. Of course, if you add a lot more sophistication to the model, it might actually turn out to produce decent results. With what we're going to show in this video, we're going to see that it has some effect: we're going to compare the predictions to the actual stock prices, and we're going to see that sometimes it's quite good, quote-unquote, and sometimes it's not good at all. All in all, this is not something you would actually want to use. This is the first step to learning how to predict stock prices with machine learning, or attempting to predict stock prices with machine learning. It is not in any way investing advice or anything that you would want to use in your portfolio.

So, having said all that, we can start by importing some libraries and loading the data. The first one is going to be import numpy as np, then import matplotlib.pyplot as plt, import pandas as pd, import pandas_datareader as web, and import datetime as dt, which is a core Python module. Then we're also going to say from sklearn.preprocessing import MinMaxScaler, and from tensorflow.keras.models import Sequential, the sequential model. From tensorflow.keras.layers we're going to have Dense layers, Dropout layers and LSTM, long short-term memory, layers, and we need an import here.

If you don't have all of those libraries installed, you need to install them. Open up your command line, no matter if it's a Linux terminal or a Windows command line, and say pip install numpy, pip install matplotlib, pip install pandas, pip install pandas-datareader, pip install tensorflow and also pip install scikit-learn. That should be all of it. If you want to use candlestick charts, you can also install mpl finance; I have a video on that on this channel as well. So those are the imports that we're going to need.

Now we're going to load the data, and we're going to use the pandas data reader for that. First we specify a company that we're interested in, let's say Facebook, "FB". Then we say start = dt.datetime and specify the point in time we want the data to start from. We could take the full data set as well, but we're going to say: I'm interested in the data from the first of January 2012. The end is also a dt.datetime. For the training data we should not go up until now, but use something like the first of January 2020. Then we load the data by saying data = web.DataReader(company, 'yahoo', start, end): the ticker symbol of the company, the Yahoo Finance API, and the start and end date. By the way, if you don't know the ticker symbols, just google "Amazon ticker symbol", "Facebook ticker symbol", Apple, Tesla, whatever, and you will find them.

The next step is to prepare the data for the neural network. For this we create a scaler first: we're going to scale down all the values that we have so that they fit in between 0 and 1. If the lowest price is ten dollars and the highest price is, I don't know, 620 dollars, we're going to press all those values together so that they fit in between zero and one. So we say scaler = MinMaxScaler(feature_range=(0, 1)); this is from the sklearn.preprocessing module. Then we say scaled_data = scaler.fit_transform, and we're not going to transform the whole data frame. We're only interested in the closing price, because we're not going to predict the opening price or the high price; we're going to predict the price after the markets have closed. So we call fit_transform on data['Close'].values, and we reshape them with reshape(-1, 1) because we need a particular format. By the way, here you can also use the adjusted close if you want it to be adjusted for stock splits. I'm going to use Close, because when I look at the data I feel like the closing value is always adjusted for stock splits as well, but I might be wrong on that, so you might want to use Adj Close.

After that we define how many days we want to look into the past to predict the next day. prediction_days is just a number: how many days do I want to base my prediction on, how many days do I want to look back to decide what the price is going to be for the next day? In this case I'm going to pick 60; you can pick more or fewer.

Then we define two empty lists, x_train = [] and y_train = [], so we're going to prepare the training data here. We fill them up by saying for x in range, starting at prediction_days, so at 60, and going up to the length of the scaled data. So we start counting from the 60th index and go until the last index, so to say. This is important, because we now say x_train.append, so we add a value to x_train with each iteration. What we append is 60 values, and then the next value as the training example. Because this is labeled data, we have the first 60 values that we already know, and we also know the 61st value. We need to prepare the whole training data in such a way that we have 60 values and then the next value, so that our model can learn to predict what the next value is going to be. So we say scaled_data[x - prediction_days:x, 0], 60 values; this is why we start at prediction_days, so that we don't get any negative indices here. y_train is then going to be the 61st value, so scaled_data[x, 0].

So we fill up those lists, then we convert them into NumPy arrays: x_train, y_train = np.array(x_train), np.array(y_train). Then we reshape x_train so that it works with the neural network: x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1)). We're just adding one additional dimension here.

Now we can go ahead and actually start building the model. We just say model = Sequential(), a basic neural network, and then we can specify the layers with model.add. We're always going to add one LSTM layer and a dropout layer, then an LSTM layer and a dropout layer, then an LSTM layer and another dropout layer, and in the end a dense layer with just one unit. This one unit is going to be the stock price prediction. So here we say LSTM with units=50. Those are just some numbers; you can change them and experiment. Maybe you're going to get better performance with fewer layers, with more layers, with more units and so on. But keep in mind that the more layers and the more units you add, the longer you're going to have to train, and maybe you're also going to overfit if you use too many layers of sophistication. return_sequences is going to be True for the LSTM, because an LSTM is a recurrent cell: it feeds the information back, it doesn't just feed it forward like an ordinary dense layer. The input shape for this first layer is input_shape=(x_train.shape[1], 1). I hope this is syntactically correct and I'm not doing any nonsense with the parentheses here. After that we just say model.add(Dropout(0.2)). Now we can copy this; we just have to remove the input shape, because we don't need another input shape, and the rest stays the same. Then we add one more LSTM and one more dropout layer, and here we're not going to return the sequences anymore. In the end we have the dense layer, so we say model.add(Dense(units=1)). This is the prediction of the next closing price, or closing value.

Then we compile the model with model.compile. The optimizer that we're going to use is the Adam optimizer, and the loss function is the mean squared error. By the way, if you want to know more about those optimizers and the mathematics of the individual layers, the more theoretical stuff, let me know. It's not going to be exciting, it's not going to be fast scripting and creating some exciting project; it's going to be mathematics. I don't want to explain loss functions and optimizers in this video, I'm just going to show you how to do it. If you're interested in the theory behind it, let me know in the comment section; if enough people want it, I can make more theoretical content as well.

Then we fit the model on the training data, x_train and y_train. We're going to train for 25 epochs, which means that the model is going to see the same data 25 times, and the batch size is going to be 32.
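The training-data preparation and the network described so far can be sketched roughly as follows. This is a reconstruction from the narration, not the code shown on screen: the helper names (min_max_scale, make_windows, build_model) are hypothetical, the prices are synthetic, a small NumPy function stands in for MinMaxScaler, and an explicit Input layer is used instead of the input_shape argument. TensorFlow is imported only inside build_model, so the windowing part runs without it.

```python
import numpy as np


def min_max_scale(values):
    """Scale a 1-D series into [0, 1] (stand-in for sklearn's MinMaxScaler)."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    return (values - lo) / (hi - lo)


def make_windows(scaled, prediction_days):
    """Build (previous n values, next value) pairs from a 1-D scaled series."""
    x_train, y_train = [], []
    for x in range(prediction_days, len(scaled)):
        x_train.append(scaled[x - prediction_days:x])  # the n values before day x
        y_train.append(scaled[x])                      # the value to predict
    x_train, y_train = np.array(x_train), np.array(y_train)
    # LSTM layers expect (samples, timesteps, features): add a feature axis.
    x_train = x_train.reshape(x_train.shape[0], x_train.shape[1], 1)
    return x_train, y_train


def build_model(prediction_days):
    """Three LSTM + Dropout pairs followed by a one-unit Dense layer."""
    # Imported lazily so the data preparation works without TensorFlow.
    from tensorflow.keras.layers import Dense, Dropout, Input, LSTM
    from tensorflow.keras.models import Sequential

    model = Sequential()
    model.add(Input(shape=(prediction_days, 1)))
    # return_sequences=True emits one output per time step, which the
    # following LSTM layer needs as its input sequence.
    model.add(LSTM(units=50, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(units=50, return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(units=50))  # last LSTM returns only its final output
    model.add(Dropout(0.2))
    model.add(Dense(units=1))  # one unit: the next scaled closing price
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


# Synthetic prices between 10 and 620 (illustration only, not market data).
prices = np.linspace(10.0, 620.0, 100)
scaled = min_max_scale(prices)
x_train, y_train = make_windows(scaled, prediction_days=60)
print(x_train.shape, y_train.shape)  # (40, 60, 1) (40,)
```

With TensorFlow installed, training as narrated would then be model = build_model(60) followed by model.fit(x_train, y_train, epochs=25, batch_size=32).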
Now of course you can also tweak those parameters. A batch size of 32 means that the model is going to see 32 units at once all the time. We can optionally also say model.save, and later on load the model again; not model.load, I think load_model is the function from Keras. For this video we're not going to do that; we're just going to fit the model and immediately use it. That's essentially how you define the model, how you build it and how you train it.

Before we get into the actual prediction, we're going to try to figure out how well the model would perform on past data. So we're not just going to predict the next future day that we don't know yet; we're going to see how well this model would perform on data that we already have. If I always look at the last 60 days and predict the next day, what are the chances of this model being right? What's the accuracy? So we start a new section here: test the model accuracy on existing data.

The first thing we do is prepare some test data. We say test_start = dt.datetime. Notice that this has to be data that the model has not seen before. Remember, up here we said we want the training data up until the first of January 2020. Today is the 18th of January 2021, so we can test from the first of January 2020 up until now: test_end = dt.datetime.now(). This is the time range of the test data. We have that data, but the model has never seen it, so we're going to see how well it performs on it. Then we say test_data = web.DataReader with the same company, of course, the Yahoo Finance API, test_start and test_end.

What we need to do with that data is get the prices, scale the prices and concatenate a full data set of the data that we want to predict on. First of all, the actual prices, not the predicted prices but the real prices from the real stock market, are actual_prices = test_data['Close'].values. Then we create a total data set that combines the training data and the test data. For this we say pd.concat, and we concatenate the closing values of the data with the closing values of the test data on axis=0. Notice that this is not scaled yet.

Then we define model_inputs; this is what our model is going to see as an input so that it can predict the next price. We take total_dataset, starting at len(total_dataset) - len(test_data) - prediction_days, because we want to start as soon as possible, and going up until the end; just adding a colon and nothing after it means up until the end. And we take the values. Those model inputs we now reshape, so model_inputs = model_inputs.reshape(-1, 1) again, and then we scale them down with the scaler that we already have: model_inputs = scaler.transform(model_inputs).

So this is how we load the data, and we have now prepared it. Next we need to predict based on that data, data that the model has never seen before, to evaluate how accurate our model is and how well it performs. So let's make some predictions on the test data.

For this we say x_test is an empty list, and we repeat the process from above. We say for x in range, starting again at prediction_days and going up to len(model_inputs). Plus one would only be needed if you also want to have the newest one, I think, so we're not going to do plus one yet. Then we say x_test.append(model_inputs[x - prediction_days:x, 0]); that's again why we start at prediction_days, to not get negative numbers. We don't have y_test here; we do have the actual stock prices, but we don't need to do it like that here. Then x_test = np.array(x_test), and we also need to reshape it so that it has the same format: x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1)), with x_test as the first argument and the shape as a tuple.

Now we can predict based on that x_test data. Since we have the actual prices, let's call this predicted_prices, and we say predicted_prices = model.predict(x_test). What you need to keep in mind is that the predicted prices are scaled, so we need to reverse the scaling; we need to inverse transform them. So we say predicted_prices = scaler.inverse_transform(predicted_prices), using the same scaler, and we're back to the actual predicted prices.

To make things more visual and more interesting, we're going to plot the predictions instead of just looking at numbers, so that we actually see how well the model performs. So we plot the test predictions; this is not the actual prediction yet. We say plt.plot(actual_prices) and plt.plot(predicted_prices), and we're going to have two lines. They're not going to be the same, obviously, because the prediction is not going to be 100% accurate. Let's add colors so that we know which one is which: color="black" is going to be the actual price and color="green" is going to be the prediction. We also add a label as an f-string, f"Actual {company} Price", and for the other line f"Predicted {company} Price". Then we can add a title, plt.title(f"{company} Share Price"). I'm not a design person, so you can care about the case and so on; but if we do it, let's do it properly and use title case. We also add axis labels: plt.xlabel is going to be "Time". We're not going to have dates here, though that's not too important since we're just predicting one day. plt.ylabel is going to be the company share price. Then plt.legend() and plt.show().

We're going to run the script now, but I'm going to have to skip the training process because it takes some time. I'm just going to show you how it starts training, and then we skip to the part where the training is done. I hope I don't have any mistakes here. I get some warnings, but after a while it starts with epoch 1 of 25, and we see the progress, so you can see it works. I'm going to skip ahead here and come back to you once the training is done.

All right, so the training is now done, and the plotting as well. You can see that the results are kind of fine, I would say. They're not accurate, but the model is mostly saying the price is going to be lower. If the model predicts that the price is going to be lower than it actually is, as long as it's not too different, I think that's actually better than if it overestimates. However, again, keep in mind that we're looking at 60 days to predict one day. It's like flipping a coin; maybe you're going to get worse results with that. Actually, flipping a coin is maybe not a good analogy, because we don't have zero or one, we have multiple values to choose from. But again, it's not too impressive to have this accuracy if you're just predicting one day into the future.

We could also change the model so that it feeds its own predictions back into itself: it looks at the last 60 days and predicts one day, then it looks at the last 59 days plus the one predicted day, and after 60 iterations it's no longer looking at actual data but only at its own predicted data. Then the results are not going to be as good as this curve here. Still, again, this is not investing advice; this is programming advice, or programming education. You can see that the curve is kind of fine.

Now we're going to take a look at how we can predict the one day into the future that we don't know yet. So the final part is predicting tomorrow, or rather the next day the stock market is open: predict next day. For this we create a real_data list. Let me scroll down a little bit so I'm not blocking this with the camera. So real_data is model_inputs from len(model_inputs) + 1 - prediction_days up until len(model_inputs) + 1, and then 0. Where was this loop? Actually, I don't think it matters. If you want to plot it you could also say plus one there, I think, but we're only interested in the numerical value. We're not interested in the plot, because it's just one day. If it was 60 days into the future, maybe the plot would be interesting, but one data point is not too interesting to plot.

So real_data is that. Then real_data = np.array(real_data), and real_data is reshaped as well: np.reshape(real_data, (real_data.shape[0], real_data.shape[1], 1)), one additional dimension again. We could print scaler.inverse_transform(real_data) first if we want to, but that's optional; I don't even think we need it, so let's go directly to the prediction. We say prediction = model.predict(real_data), so we use the real data as the input and predict that one next day that we don't know about yet. The actual prediction is then prediction = scaler.inverse_transform(prediction). Then we can print "Prediction:" and the actual prediction; we need an f-string for that. Oh, it's automatically making an f-string out of it, wow.

We can run this again now. I'm going to have to skip the training process again, but I'm going to show you the results in a second.

All right, so after the training we get this as the prediction for tomorrow. Again, don't think that this is the actual stock price of Facebook tomorrow. This is not investing advice; the model just says that it predicts this as tomorrow's price of the Facebook stock. Let's look at Yahoo Finance: it's at 251.36 right now. Markets are currently open, I think; it's Monday, so they should be. So this is the current stock price. It may be the case that tomorrow Facebook is going to be at 252, that's not too unrealistic, but again, I wouldn't bet on it just because the neural network says so.

Last but not least, we're going to look at a comparison between different companies, because I tried to predict the stock price for those companies using this model. Notice, however, that here I only used the data from 2018 up until now, because I didn't want to train the models for too long. You can see, for example, that the Tesla share price is not predicted well at all: the actual stock price is up here, and the prediction is a flat line. Goldman Sachs looks way better, but you should not be misled by that line just because it fits kind of well. We're not doing classification here, and we're not training a polynomial regression so that we get, I don't know, the right clusters; we're actually trying to predict the stock price. So if the green line is kind of correct but comes after the black line, that does not mean that the model is predicting well. It means that the stock price already fell and only then did the model say it's going to fall. So when the stock price is actually down here, the model thinks it's going to be up here. Don't be misled just because the graphs look kind of the same.

If you decrease the 60 days to, I don't know, 10 or 20 days, the model is going to be more sensitive to short-term fluctuations; if you increase it, it's going to look more at the long-term stuff. You can also see that the Twitter share is kind of fine. The Facebook share is not that fine, but still okay. And for the Apple share it's overpredicting: it says the price is going to be here, but it's actually not there yet. This is probably even a pretty good line, because it's always predicting before stuff actually happens, so that's probably the most impressive result here. But again, take it with a grain of salt.

So that's it for this video. If you enjoyed it, I hope you learned something; if so, let me know by hitting the like button and leaving a comment in the comment section down below. And of course, don't forget to subscribe to this channel and hit the notification bell to not miss a single future video for free. Other than that, thank you very much for watching, see you in the next video, and bye.

[Music]
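The test-window construction and the next-day input from the captions can be sketched as below. The data is synthetic and already scaled, the arrays are kept one-dimensional for brevity, and the variable narrated exists only for illustration. One detail is worth checking: the slice as narrated, from len(model_inputs) + 1 - prediction_days to len(model_inputs) + 1, runs past the end of the array and comes back with 59 values; the sketch instead takes the last prediction_days values, which is an assumption about the intent rather than the code from the video.

```python
import numpy as np

prediction_days = 60

# Synthetic, already-scaled closing prices (illustration only).
train_close = np.linspace(0.0, 0.5, 200)
test_close = np.linspace(0.5, 1.0, 30)

# Join both series, then keep the test period plus the prediction_days
# values before it, so that the first test day has a full window.
total = np.concatenate((train_close, test_close))
model_inputs = total[len(total) - len(test_close) - prediction_days:]

# One window per test day: window i ends right before test day i.
x_test = np.array([
    model_inputs[x - prediction_days:x]
    for x in range(prediction_days, len(model_inputs))
])
x_test = x_test.reshape(x_test.shape[0], x_test.shape[1], 1)

# Input for the first day after the data: the most recent window.
real_data = model_inputs[-prediction_days:].reshape(1, prediction_days, 1)

# The slice as narrated is cut short at the end of the array.
narrated = model_inputs[len(model_inputs) + 1 - prediction_days:
                        len(model_inputs) + 1]

print(x_test.shape, real_data.shape, len(narrated))  # (30, 60, 1) (1, 60, 1) 59
```

With a trained model, model.predict(x_test) would give one scaled prediction per test day and model.predict(real_data) a single scaled value for the next day; both would still need the scaler's inverse_transform to get back to prices.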
Info
Channel: NeuralNine
Views: 294,563
Rating: 4.9417477 out of 5
Keywords: python, finance, investing, stocks, chart, charts, stock visualization, data science, machine learning, prediction, stock analysis, stock prediction, predicting stocks, prices, price, stock, python stock prediction, neural networks, RNN, recurrent neural networks, LSTM
Id: PuZY9q-aKLw
Length: 29min 14sec (1754 seconds)
Published: Tue Feb 02 2021