Cryptocurrency-predicting RNN intro - Deep Learning w/ Python, TensorFlow and Keras p.8

Captions
What is going on, everybody, and welcome to another Deep Learning with Python, TensorFlow and Keras tutorial video. In this video and the coming videos, we're going to be talking about how to apply a recurrent neural network to a more realistic example of what you're actually going to have to go through if you want to make a recurrent neural network learn against some sequential data.

In this case we're going to be working with a time series data set, which is prices and volumes for cryptocurrencies. If you wanted, you could follow along and do the exact same thing with stocks. And if you don't like finance at all: any time I do a finance series, people are like, "I don't get it." I'll try to be as simple as possible here. There might be some new terms, but it's pretty basic stuff: what's the price of the thing, and how much of the thing is being traded at any given time.

What we're going to try to do is take basically the four major cryptocurrencies, and that is Bitcoin, Litecoin, Ethereum, and Bitcoin Cash, not necessarily in that order. Can we take those four cryptocurrencies and track their price and their volume as a sequence over time, and can the recurrent net take sequences of all those prices and all those volumes? Let's say we want to predict the price of Litecoin into the future. Can we take the last 60 minutes of Litecoin price and volume, Bitcoin price and volume, Ethereum price and volume, and Bitcoin Cash price and volume (this data is updated once every minute, so that's sequences 60 long of all that data) and predict, say, 3 minutes into the future: will the price of Litecoin rise or fall?

The same thing could be said if maybe you're reading in sensor data from servers and you're trying to predict whether a server is
going to overheat or something like that, or maybe you're trying to predict traffic to a website and you've got time of day and users or something like that. At the end of the day, what we're trying to do is either predict some sort of classification (in this case, is the price going to rise or fall) or do a regression: the activation for the output layer could be a linear activation, and you could literally try to predict price, or maybe a percentage change so it's all normalized.

Anyway, something like this is really complex because you've got to do so many things to what's going to be a typical data set. Your typical data set is not even going to be in sequences. It's going to be one long sequence, and we have to build the sequences. Then, just like everything else, we've got to balance the data. We've got to normalize the data, especially in this case, because the price of Litecoin is just different from the price of Bitcoin, as is the volume that's being traded, so we really want this to be in relative terms. And then we also need to scale the data, and it won't be as simple as it was with image data, where you just divide by 255. So we've got a lot of stuff to think about here. Also, with sequential data, doing out-of-sample is a slightly different challenge. A lot of stuff for us to cover, so let's get into it.

First of all, I've provided the data set. I'll try to remember to put a link in the description, but otherwise go to the text-based version of the tutorial, which definitely will be in the description, and if you scroll down there is a link to it. I could just copy the address. And let me just start a file now: we'll call this crypto RNN tut, a .py file, and let me open that up in Sublime. So it's at this link, just in case I forget for whatever reason; that's where you can find it. Beautiful. So when you download that, that's just [unclear] and
then it wasn't where it needed to be. Anyway, once you've done that, this is what you'll get: you'll get the zip, and then just go ahead and extract it, and you'll get this crypto_data directory. Inside of there you've got these four files; each one is basically just the price in US dollars of Bitcoin Cash, Bitcoin, Ethereum, and Litecoin.

If we open up one of them, we can see here: this is a UNIX timestamp, and then, I honestly forget, let me check my notes real quick... it is the low, high, open, and close, and then volume. You do not need to understand what open, high, low, close is. Just understand that close is the price at the end of that 60-second interval. So we're going to treat the close column as the actual price column that we use, and the last one there is volume.

What we want to do first is just check out the data and read it in, and make sure we've got that working the way we intend. So I'm going to import pandas as pd. If you don't have pandas, open up a terminal or command prompt and run pip install pandas. We're going to say df equals pd.read_csv, and the CSV is in crypto_data, and we're going to go with LTC-USD.csv. If you recall, the columns aren't actually named here; I just spat them out from a database. So we're going to use names equals and specify a list of names: time, and then it went low... it's probably going to go off-screen, huh? Let's do this. Beautiful. Low, high, open... what's going on there? Oh, we didn't close off low. Okay: low, high, open, then close, and the last column is volume. We'll read those in, and then let's just print the head of this data frame. That's shift-enter... okay. I have to fix this, it's off by a few pixels. There we go. So now we've got the data read in and we can kind of see what's
going on. Again, we're really only going to concern ourselves with the close and the volume. But the other thing we need to be able to do: we have all this data in different CSVs, and what we want is the close and the volume for each of them. They all share the same index, which is time; all this data is organized by time, so we can join all these data frames on the shared axis, time.

The way I'm going to do that is the following. We're going to say main_df equals pd.DataFrame(), and it's just an empty data frame for now. There has to be a better way to do this, I just don't really know it. This is the method I've always used, but there's probably a better way to merge data frames than what I'm about to do. So first, ratios: these will be the files that we intend to use. We're going to use BTC-USD, LTC-USD for Litecoin, ETH-USD for Ethereum, and BCH-USD for Bitcoin Cash.

Then we want to iterate over these ratios, so we're going to say for ratio in ratios. I think we're just going to rename ratio here, so ratio will equal ratio... let me think. Hmm, I actually noticed an issue; I have to fix that in my notes, I have it wrong. If we just print ratio... I think what I had done initially was use os.listdir to iterate through the directory, and then later I decided to change that and just manually type the names in. So I think that's what happened there. We were about to do a silly split that didn't make any sense.

Anyway, we're going to say dataset equals, and I've started to use f-strings, so we'll use an f-string: crypto_data, slash, then whatever that ratio is, dot csv. Then we want to read it in with the same names that we had before, so I'm going to copy... oh, so what we need to do is... so
that's our dataset. We're going to say df equals pd.read_csv, we're going to read in the dataset, and the names are the same names as before. In fact, I'm just going to delete the earlier read now, since we don't really need that one anymore. So we'll read in that data, we've got a data frame, and just for good measure let me make sure we got what we expected here. Invalid syntax: we've got two commas. What an amateur. Okay, so we've got the data frame.

This is each singular data frame, and we want to merge them all, so there are a few things we have to do. First of all, we need to set time as the index. Like I said before, we're really only going to focus on the close price and volume, but if we try to join these data frames together, they share identical column names. So what we want to do is give them unique, identifying column names, so that when we join them we don't get a bunch of errors from columns that are named identically. Plus, we want to know which column is which.

We don't need to print the head every time, I don't think. So we'll do df.rename, and I think we can do it with inplace equals True here, so there's no need for df equals. Then columns equals, and it's a dictionary of renames. We want to rename the close column, and again we'll use f-strings: ratio, underscore, close. Then we'll do the exact same thing with volume, so volume becomes an f-string: ratio, underscore, volume. So we've renamed those columns. Oh, and don't forget inplace equals True. That's just so we don't have to redefine the data frame, in case anybody doesn't know what inplace means.

Then what we want to do is set the index: df.set_index, and we're going to make that the actual time column, and again we want inplace equals True.
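The per-file steps described so far (read with explicit column names, rename close and volume, index by time) might look roughly like the sketch below. It is only an illustration: the three sample rows are made up so the block runs without the download, and the hyphenated name LTC-USD is an assumption, since the captions carry no punctuation.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for crypto_data/LTC-USD.csv.
# The numbers are invented; only the column order follows the video.
SAMPLE_CSV = """1528968660,96.58,96.59,96.59,96.58,9.6472
1528968720,96.45,96.67,96.59,96.66,314.3871
1528968780,96.47,96.57,96.57,96.57,77.1294
"""

ratio = "LTC-USD"  # assumed spelling of the file stem

# The files have no header row, so the column names are passed explicitly.
# With the real data this would read f"crypto_data/{ratio}.csv" instead.
df = pd.read_csv(
    io.StringIO(SAMPLE_CSV),
    names=["time", "low", "high", "open", "close", "volume"],
)

# Give close and volume names that are unique per asset, so that frames
# for different assets can later be joined without column clashes.
df.rename(
    columns={"close": f"{ratio}_close", "volume": f"{ratio}_volume"},
    inplace=True,
)

# Index by the UNIX timestamp, which is the key shared by all four files.
df.set_index("time", inplace=True)

print(df.head())
```

Using inplace=True mirrors what is said in the video; assigning the result back (df = df.rename(...)) would work just as well.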
And now we're going to specify that the data frame is actually just these two columns here, so we'll just copy and paste. Okay, so now this data frame is just the close and the volume. It's always a good idea to continually print things out and make sure you're doing things the way you expect. I keep hitting shift-enter because I wrote all my notes in an IPython notebook, so I'm used to shift-enter running it rather than Ctrl+B. As you can see, we've got all the columns that we need.

Now we want to merge them all together. The way we're going to do that is with a simple question: if len(main_df) equals 0, i.e. it's empty, main_df becomes df; otherwise, else, main_df equals main_df.join(df). Then we'll come down here and print main_df.head(). Run that... it truncated all our silly columns. So what we could say is for c in df.columns (it's actually main_df.columns), let's print c, just to make sure we've got all the columns we're expecting. Sure do. Okay, great.

The next thing we need to start thinking about: this is all just sequential data, right? For any supervised machine learning problem we actually need two things. We need the sequences themselves, and we also need targets. So we need to figure out how we're going to go about making those targets, and from there, what the targets are.

We need to specify some starting constants. We're going to need something for a sequence length, something for the future period to predict, and something for the ratio to predict: which one of these four things are we going to try to predict? For now we'll go with Litecoin, so LTC-USD. The future period is how many periods forward, and every period here is one minute, so we're going to
say three minutes, and we're going to use the last 60 minutes of pricing data to try to make that prediction. If you were to look at a graph of the last 60 minutes of pricing yourself, updated once a minute, logically you could probably do pretty well predicting out the next three minutes. But maybe not; it's probably still 50/50, actually, if most people sat down and tried. But it begins to sound reasonable, especially since it's not just the last 60 minutes of the single asset, Litecoin: it's the last 60 minutes of Bitcoin, Litecoin, Ethereum, and Bitcoin Cash, and the volume and all that. So probably a machine could at least do a little better than a human. How much better, we'll see.

Once we have that, we need the actual targets, and basically a rule for the targets. So we're going to define classify, and this will be our classification. What we're going to do is take the current price and the future price, whatever those happen to be, and then the question is pretty simple. If float(current) is greater than float(future)... I don't really like the way I've worded this. Let me do: if float(future) is greater than float(current), so if the price is higher in the future than it is right now in our training data, then we're going to say those features were a 1; else we'll return 0. We'll say 1 is a good thing: 1 means you should buy. We want to train the network that, in general, with this sequence of features the price goes up, and with that sequence of features the price goes down, and so on. We're hoping the model can learn that relationship.

Now we'll come down here to main_df, and what we can say is main_df['future'], because we need to get the future price, equals main_df, and then another f-string: we're looking for the ratio to predict, underscore, close. Right, the future
price is going to be based on that close column, and then what we want to do is shift that close column up by however many periods forward we want to predict. So that's going to be a negative future period predict. Let's print that. Unfortunately I think it's only going to show us like two columns, which is really annoying. What did we screw up here? Oh, okay, fix the curly brace there; that's the variable. Try again.

So that's the future price of Litecoin. Unfortunately we're not showing the Litecoin close, so I'm curious if I can get away with the following: we want to show future, and we want to show this close column right here, at least. We just want to make sure, again, that this stuff... shoot, what have I done? There we go. You want to check like this especially when you're working with data like this. So this should be the price in three periods. So here, and then three rows on, this is the price. Currently the price is this, but three periods in the future it's going to be this, and sure enough, there it is. Same thing here: here's the current price, in the future it's going to be this price, and we can see that is the case.

Again, you just have to always be checking every step of the way; it's so easy to make a mistake. I'll be really surprised if, after I'm completely done with this tutorial and post it up, there isn't a mistake in this code, because it's just so easy to make mistakes doing stuff like this. Anyway, it looks right so far.

So we've got the future price. Now what we want to do is map this function to a new column that we're going to call target. We're going to say main_df['target'] equals list (this just converts the output to a list that we assign as a column), and then we're going to map something. What are we going to map? That classify function, and after the function come each of the parameters. So in this case it's the
current column and the future column. The current column is main_df, this right here, so main_df, boom, the close column. And then we want the future column, so main_df['future']. Then I'm going to cut this here, paste, and let's see if we can get away with adding target. Again, we're just trying to see whether it works as we expected. So here we can see: here's the current price, here's the price three periods in the future, that's less than the current price, therefore the target is zero. Good. Let's see if we can print a little more, so let's do ten values for the head here. Okay, so this is the only one that's a buy: here's the current price, but three periods in the future it's this, 96.47, which is true. So based on the features here, we want the model to know that's a 1.

At this point it looks great, we're ready to make some sequences and train the model. Whoa, whoa, slow down there, Carl. We have a lot more to do. We still have to build the sequences, we still have to balance the data, we still have to normalize the data, because the prices and volumes of Litecoin are very different from Bitcoin's, and we also still have to scale the data, and probably a bunch of other things that I'm forgetting right now, like out-of-sample. So we have a lot of work ahead of us. I'm going to stop it here and carry on in the next tutorial with the next steps.

If you've got questions, comments, concerns, or if you think I've got an error in the code up to this point, feel free to leave those below. Also a shout-out to the most recent sponsors: boom Billy harsh soft, Mark Zuckerberg, JT, and Newcastle geek. I keep seeing Newcastle geek. I don't know if the memberships on YouTube are updated each time, whether you stay a member or not, because I feel like the only person I keep seeing over and over is Newcastle geek. Maybe everyone's becoming members and then they're not; I don't
really know, but I keep seeing that name. Anyway, that's it for now. Thanks again to you guys for becoming members, I appreciate the support, and I will see everybody else in the next video.
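For reference, the join and target steps walked through in this video could be sketched as follows. This is a minimal illustration under stated assumptions, not the tutorial's exact script: the constant names SEQ_LEN, FUTURE_PERIOD_PREDICT and RATIO_TO_PREDICT are assumed spellings of the spoken "sequence length", "future period predict" and "ratio to predict", make_frame is a hypothetical helper, only two assets are used, and all prices and volumes are invented.

```python
import pandas as pd

SEQ_LEN = 60               # minutes of history per sequence (unused in this sketch)
FUTURE_PERIOD_PREDICT = 3  # how many one-minute periods ahead to look
RATIO_TO_PREDICT = "LTC-USD"


def classify(current, future):
    """Return 1 (buy) if the future price is above the current price, else 0."""
    return 1 if float(future) > float(current) else 0


def make_frame(ratio, closes, volumes):
    """Hypothetical helper: a per-asset frame, already renamed and indexed by time."""
    index = pd.Index([60 * i for i in range(len(closes))], name="time")
    return pd.DataFrame(
        {f"{ratio}_close": closes, f"{ratio}_volume": volumes}, index=index
    )


# Invented data for two of the four assets.
frames = [
    make_frame("BTC-USD", [6489.5, 6487.4, 6479.4, 6479.4, 6480.0, 6477.2],
               [0.59, 7.71, 3.09, 1.40, 0.75, 0.51]),
    make_frame("LTC-USD", [96.58, 96.66, 96.57, 96.50, 96.70, 96.47],
               [9.65, 314.39, 77.13, 7.22, 524.54, 68.31]),
]

# Start from an empty frame and join each asset on the shared time index.
main_df = pd.DataFrame()
for df in frames:
    if len(main_df) == 0:
        main_df = df
    else:
        main_df = main_df.join(df)

# A negative shift moves later prices up, so each row sees the close
# FUTURE_PERIOD_PREDICT rows ahead of it.
close_col = f"{RATIO_TO_PREDICT}_close"
main_df["future"] = main_df[close_col].shift(-FUTURE_PERIOD_PREDICT)

# Label each row: 1 if the price is higher in the future, else 0.
# The last FUTURE_PERIOD_PREDICT rows have no future price (NaN) and so
# compare as "not higher"; they would need to be dropped before training.
main_df["target"] = list(map(classify, main_df[close_col], main_df["future"]))

print(main_df[[close_col, "future", "target"]])
```

Printing the current close, the shifted future column and the target side by side is the same sanity check done on screen: each future value should match the close three rows further down.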
Info
Channel: sentdex
Views: 140,217
Keywords: TensorBoard, TensorFlow, recurrent neural network, RNN, Keras, Deep Learning, tutorial, neural network, machine learning
Id: ne-dpRdNReI
Length: 21min 53sec (1313 seconds)
Published: Sat Sep 15 2018