Matplotlib Tutorial (Part 8): Plotting Time Series Data

Captions
Hey there, how's it going everybody? In this video we're going to be looking over time series data and plotting dates in matplotlib. Now, there's a ton of data out there that contains date information, so knowing how to plot this properly is definitely going to be a huge help when creating these graphs. First we're going to look at some basic examples using some dates that I have directly here within my Python code, and then we'll see an example using data from a CSV file. The data within the CSV file are Bitcoin prices over a couple of weeks.

Now, I would like to mention that we do have a sponsor for this series of videos, and that is brilliant.org. So I really want to thank Brilliant for sponsoring this series, and it would be great if you all could go check them out using the link in the description section below and support the sponsors. I'll talk more about their services in just a bit. So with that said, let's go ahead and get started.

OK, so I've got some sample code pulled up here in my script. First we'll look at these time series plots using this list of data directly in my script, and then we'll look at a real-world example with data that I'll load in from a CSV file. Now, if you've been following along with the series then you'll likely recognize a lot of the other matplotlib code that I have here at the moment, but if not, then let me go over all this real quick.

Here at the top we have some imports. We're importing pandas, we're also importing datetime and timedelta from the standard library, we're importing pyplot from matplotlib, and we're also importing dates from matplotlib. We're importing that as mpl_dates because I was afraid I was going to override it with a variable like I did here called dates. Anyway, we are also using a style here: we're using the seaborn style with matplotlib. These are the data that we're going to be using, but I'm going to gloss over this for now and just point out the rest of this code. We have a plt.tight_layout() here
that adds padding to our plot, and plt.show() will just show us our plot. We'll go over this other data once we are actually ready to plot it. As usual, all of this is going to be available for download on my GitHub, and there's a link to that in the description section below if anyone would like to copy and paste this into their editor and follow along with this series.

OK, so for my sample data here I've got a list of seven dates, and I'm using Python's built-in datetime module to create these. These are just seven days back to back, and then below I've got a y variable here for our y-axis, and this is just a list of, say, seven random values. To plot these dates, we can simply say, down here below our y-axis data, plt.plot (oops, that is plot_date), and we want dates to be the x-axis and y to be the y-axis. If I run this, then we can see that it plots those out. Now, if you get some warnings in your output down here then don't worry about that; mine is just warning me about some future change in pandas that will be taking place. But we can see that we get those dates and values plotted out.

Now, I'm not sure why, but by default this plot has markers instead of being connected by a line. We can fix that easily just by saying that we want the linestyle of this plot to be solid. So now if I run that (and I want to make this output small again now that we've seen that warning), we can see that these are now connected by a line. If you wanted to, you could also go ahead and turn off these markers by setting marker to None, but I'm going to leave those here for now.

OK, so now that we have some dates to work with, let's look at some different ways that we can format our plot to make this look a bit better. One way that we can do this is to run the autofmt_xdate method on our figure, and this will rotate our dates so that they fit a bit nicer and change their alignments and things like that. Now, we haven't talked
much about figures and axes in this series yet; that is going to be in the subplots video in a couple more videos. But basically this is going to be a method on our figure and not on this pyplot object that we have been using. So to get the current figure from pyplot we can say plt.gcf(), which is get current figure, and now to run this autoformat method we can just run it on that current figure and say autofmt (for auto format), then underscore xdate. OK, so now if I run this, we can see that these dates are rotated and they have different alignments. That just makes it so that these are, you know, not so bunched together, and it makes it easier to read.

OK, so now that we've got that auto formatting in place, let's also see how we can change the format of our dates. What if, instead of how they're displayed now with the year, month, day, we wanted them to start with the name of the month, then the day, and then the year? To do this we have to use some datetime formatting. I've already imported this line up here at the top, from matplotlib import dates as mpl_dates, and from that imported module we're going to use the DateFormatter class. We're going to be passing in any format string that you could also pass into the strftime method from the datetime class. Now, if you don't know how to format dates, I do have a separate video on the Python datetime module that goes into more detail about that, so I'll leave a link to that video in the description section below if anyone is interested. I'll also leave a link to the Python documentation where you can find the formatting codes for the format that you're looking for.

For this example, let me write out the format that we want. Down here below where we ran that get current figure, I'm going to say date_format is equal to mpl_dates, that's what we imported at the top, and we're going to use the DateFormatter class from that imported module. And now
we're going to pass in our format string. Again, I'm going to leave a link in the description section below to the Python documentation where you can find these formatting codes, but I have mine written down here; I always need to look these up. For the abbreviated name of the month that is %b (lowercase b), then for the day that's %d, and then the year is %Y (capital Y) to do all four digits of the year. And again, I will leave a link to that documentation so that you can look up other formatting codes if you want to change it up.

So now we need to set this as the format for our x-axis. Just like I grabbed the figure to run the autoformat method, I'm going to need to grab the axes to run this format method. To grab the current axes, it's a lot like getting the current figure: we'll say plt.gca(), which is get current axes, and then we can format the x-axis by saying .xaxis.set_major_formatter, and we will pass in our date_format here. So now, if I formatted that correctly, if I run this then we can see that we don't have that year, month, day format that we had before. Now this is formatted to say May 24th, May 25th, May 26th, and so on. So you can format your dates however you like to show up in your chart.

OK, so now that we've seen how to work with datetimes using this simple example, let's look at some data that I have here in a CSV file and see if we can load that in and plot it. Let me remove what we have now, and I'll uncomment this code that I have here at the bottom. I am going to copy these two lines here where we are doing our plot_date and also our auto format, so I'm going to cut those out so that I can paste them in later. And now I am going to remove from dates all the way down to where we set that formatter. I just wanted to show how to format those dates, but I'm not going to format this next example. OK, so now I'm going to uncomment the other code that I have here, and we will explain what this is doing here
in just a sec, but first let me paste in where we were plotting that data and also setting the autoformat date there.

OK, so up here we are loading in a CSV file using pandas, and if you've been following along with this series then this probably looks familiar too, since we've loaded in CSV data a few times in the series so far. But just in case, let me go ahead and show you this CSV data and also go over how we're loading this in. We are loading in the data from data.csv, and I have this pulled open here in the other tab, so this is the data that we're loading in. These are the headers here: the first value is the date, the next value is the open price, the next value is the high price for that day, then low price, close, adjusted close, and volume. Like I was saying, this is just Bitcoin data for about, I think, two weeks or so. I just pulled this offline, and actually this line here at the bottom is not supposed to be there; I'm going to add that in later. I just had that there while I was testing.

OK, so let me go back to the code and explain how we're loading this in. When we read this in, it's loading this in as a pandas DataFrame, and whenever we say price_date is equal to data and then pass in that key of date, what it's doing is setting price_date equal to all of these dates here, so it's basically setting it equal to that date column. Now when we do price_close and set it to data close, I'm grabbing all of the closing prices for those days. So we've got the price_date and the price_close data loaded in from that CSV file. To plot this, it's as easy as passing those into the plot_date method, so I'm going to pass in price_date as the x, which is the first value here, and price_close will be the y value. I will paste that in there, and if we run that, then we can see that we get that data plotted out.

Now, right now this might look OK, but it's not actually plotting out our x-axis as dates; it's actually plotting these out as strings. So to show this,
let me add a line to the end of our CSV file, and I'm going to add it out of order. That's what that line was there for before, whenever I was doing some testing. What I'm going to do is just copy my top line here, and my top line is May 18th, so at the very bottom I'm going to paste in another line and make this May 17th, and I'm just going to leave the prices and everything the same as the first day. So now if I run this, we can see that we don't have a May 17th here at the beginning; it's putting it here at the end, so that doesn't really make any sense. Like I said, the reason it's doing this is because those are being read in as strings and not dates.

To fix this, we're actually going to use some pandas methods to set that to a date, and then we'll also sort that as well. Now, this isn't a pandas tutorial, so I'm not going to go into much detail here, but I just wanted to show this in case anyone is working with dates that are out of order. This is a pretty common thing to do, needing to sort data that you're loading in by date. So to do this, underneath our data here, let me make a couple of blank lines. I'm going to take this data date column and I'm going to set that equal to pandas and then a method called to_datetime, and I want to convert that date column to a datetime. So what we're doing here is converting that date column to datetimes using the to_datetime method from pandas, and then we are just replacing all those values, which were strings, with those converted datetimes.

Now if we want to sort that, we can simply say data.sort, and now that those are datetimes we can just sort by date. I also want this to sort in place, so I'm going to say inplace equals True. In place basically means that it goes ahead and modifies that data instead of us needing to say something like data equals data.sort. So we don't have to do that since we're changing
that in place. So now with those two changes there, if I save that and run it... let me see, I'm getting an error for some reason: date is not defined. Oh, you guys probably caught that as I was typing it, but I said date date; what I meant was data date. Did I make that mistake anywhere else? No. OK, so let me try that and, well, let's see... and I made another mistake here, sorry about that. In a pandas DataFrame that is not sort, that is sort_values. Sorry to confuse you all there; hopefully those are all the mistakes that I made. So now if I rerun that (let me take that output down there a little bit), we can see that our date here at the beginning, which we gave the same value as the next day, is now showing up here at the beginning instead of being put at the end.

So that's how you're going to work with datetimes in pandas using that plot_date method. Like I said, it's a lot like any other line plot, but you're working with dates here, so there are a few different things with how the formatting works and things like that. But basically this is what you do for time series data in matplotlib.

OK, so we're just about finished up here, but before we end I'd like to mention the sponsor of this video, and that is brilliant.org. Brilliant is a problem-solving website that helps you understand underlying concepts by actively working through guided lessons. They have computer science courses ranging from algorithms and data structures to machine learning and neural networks. They even have a coding environment built into their website so that you can run code directly in the browser, and that's a great way to complement watching my tutorials because you can apply what you've learned in their active problem-solving environment, and that helps to solidify that knowledge. Their guided lessons will challenge you, but you also have the ability to get hints or even solutions if you need them. It's really tailored towards understanding that material. So their computer
science material is fantastic and I really like what they're doing. They also have plenty of courses depending on what you're most interested in, so they have courses in different fields of mathematics, or astronomy, solar energy, computational biology, and all kinds of other great content. So to support my channel and learn more about Brilliant, you can go to brilliant.org/cms to sign up for free, and also the first 200 people that go to that link will get 20% off the annual premium subscription. You can find that link in the description section below, and again, that's brilliant.org/cms.

OK, so I think that's going to do it for this video. I hope you feel like you got a good introduction to working with dates in matplotlib and how we'd plot that type of data. In the next video we're going to be learning how to plot live data in real time. Now, these real-time plots can be used in a lot of different applications for monitoring things that are constantly being changed or updated, so that can be data that you're pulling down from an online API, or maybe something that you're reading from a sensor, or something like that. There are a lot of different types of applications for that, so definitely be sure to check that out.

But if anyone has any questions about what we covered in this video, then feel free to ask in the comment section below and I'll do my best to answer those. If you enjoy these tutorials and would like to support them, then there are several ways you can do that. The easiest way is to simply like the video and give it a thumbs up. Also, it's a huge help to share these videos with anyone who you think would find them useful. And if you have the means, you can contribute through Patreon, and there's a link to that page in the description section below. Be sure to subscribe for future videos, and thank you all for watching.
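
The CSV steps described in the captions (convert the date column with pandas, sort by it, plot it, then rotate and format the tick labels) can be sketched roughly as follows. This is not the code from the video. The inline CSV text, its made-up prices, the column names Date and Close, and the exact format string are all assumptions for illustration. The sketch also calls plot rather than plot_date, because plot accepts datetime values directly and plot_date appears to be deprecated in recent Matplotlib releases.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd
from matplotlib import dates as mpl_dates
from matplotlib import pyplot as plt

# Hypothetical stand-in for data.csv. The prices are made up, and the
# last row is deliberately out of order, like the May 17th row in the video.
CSV_TEXT = """Date,Close
2019-05-18,100.0
2019-05-19,110.0
2019-05-20,105.0
2019-05-17,100.0
"""

data = pd.read_csv(io.StringIO(CSV_TEXT))

# The dates are read in as strings, so convert them to real datetimes
# and then sort the rows chronologically.
data["Date"] = pd.to_datetime(data["Date"])
data.sort_values("Date", inplace=True)

price_date = data["Date"]
price_close = data["Close"]

fig, ax = plt.subplots()

# A solid line with markers, matching the look described in the captions.
ax.plot(price_date, price_close, linestyle="solid", marker="o")

# Rotate and align the date labels so they do not bunch together.
fig.autofmt_xdate()

# Abbreviated month name, day, four-digit year (separators are an assumption).
date_format = mpl_dates.DateFormatter("%b %d, %Y")
ax.xaxis.set_major_formatter(date_format)

fig.tight_layout()
```

To see the plot in a window, remove the matplotlib.use("Agg") line and call plt.show() at the end. Without the to_datetime and sort_values lines, the May 17th row would stay at the end of the x-axis, which is the problem the captions describe.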
Info
Channel: Corey Schafer
Views: 107,612
Keywords: matplotlib, python, python matplotlib, data science, data analytics, data visualization, python plotting, python graphing, matplotlib tutorial, time series, python time series, datetime, python plot date, matplotlib time series, matplotlib dates, python (programming language), matplotlib (software), python tutorial, corey schafer, python programming
Id: _LWjaAiKaf8
Length: 17min 9sec (1029 seconds)
Published: Mon Jun 17 2019