Python Pandas Tutorial ( Basics ) - Start Replacing Excel for Python 2021 Series

Video Statistics and Information

Captions
Everyone, Derrick here. I know it's been a minute, but I hope you're all staying happy and healthy wherever you are. In this one, let's jump back into some more Python and pandas tutorial content. I want to show you some of the basics, so if you're just starting out in pandas, maybe this video will be helpful to you. Let's jump right in. [Music]

Starting out, if you're just learning pandas: it's a great tool for analyzing and working with spreadsheet data inside of Python. What that means is we don't have to use an Excel spreadsheet to do a lot of the actions that we were doing before; instead, we can use Python to automatically do a bunch of these actions for us. The cool part about using Python is that we can reuse the code between different workbooks, so if we're doing very repetitive actions inside Excel, we can automate these tasks by using Python.

Let's look at an Excel spreadsheet. I'll link this spreadsheet down below, so if you want to use it as well, it'll be there. All this is is the monthly price of gold. So let's work with this notebook and, like I said, open it up and work along with me if you want to.

Getting started, we'll open up an editor of our choice and then create a new Python file. Now that we have this, if we already have pandas installed, we can just go ahead and import it as pd, so we can say import pandas as pd. If you don't have pandas installed, go ahead and pip install it with the command at the bottom left of the screen.

Now what we can do is read in our CSV file or Excel sheet. We'll call it df gold prices, and this will just be equal to that pandas library, and we'll use a method called read and then whatever we need here. If you're using Excel, this will be read_excel. For us, I believe this sheet is technically a CSV. Yep, monthlygoldprices.csv, so we'll use a method called read_csv. Now we need to pass in the path to that CSV worksheet. For me, since I'm working in the same directory (all of this is on my desktop), I can just use the name of that
file, so monthly goldprices.csv. If this was inside some folder, say your Documents, then it might look something like this. Like I said, I'm in the same directory, so I'll just leave it as this.

Now, to verify that we have this read in, let's look at a few methods of viewing data. We can view data inside of a pandas DataFrame by saying something like print, and then we have a couple of different methods we can use. There's one, head, that we can pass the number of lines at the beginning of our DataFrame that we want to view. There's also tail, so this would view the last 10 lines. Or we could just print the whole thing. For this, let's say tail, and then we'll pull in the last 20 entries. Now we can open up our terminal or PowerShell, depending on your operating system, and execute our code. We'll be sure to save it before we execute, and we get back the result in our terminal. These are the last 10 entries in our CSV, where we have the date and the price of gold. As you can see, a pandas DataFrame looks very similar to an Excel spreadsheet. This is on purpose; that way we can work with this data just like how we would in Excel, except programmatically.

Great, so we have that down. Now let's take a look at a simple operation: selecting a column in our worksheet. To view a column, we can say something like print, access our DataFrame, which is this variable name, and then index it by a key. A key, or a label, in a pandas DataFrame is just what we call the heading of each of our columns. So we have a date and a price. In this one, if we wanted the date, we could just view all this information here. We'll go ahead and run that, and we should only get back the date column now, which is what we get. So let's go ahead and assign this to a variable; that way we can use it later on in our script. We'll say dates will be equal to that column. Prices will be the same thing, so we'll say gold prices and then
we'll reference that column, which I believe is just called Price. It is, so Price.

Now let's take a look at how we can do simple operations on these columns if we need to. Let's do a simple operation of determining a price that we may want to buy gold at. Let's say that we want to buy gold at 90 percent of the value; that way we make a profit. We could say something like df gold prices, so this is that same DataFrame that we're working with, and then what we can do is say buy price. This isn't a column that we already have, so what pandas will do for us is create a new column with this header, buy price. This column, we need to make sure, is the same length as these columns. We can do that very easily by using one of those columns, so we can say prices, and if we want to buy at 90 percent of the value, we could say times 0.9. This is a pretty cool feature because we can do simple math operations very quickly and create new columns. Simple math works just like this: we can do division, subtraction, addition, and a lot more. We'll just leave it as multiplication for right now.

All right, now that we have that, let's look at some more functions. Let's do something simple, so we'll say df gold prices, and let's look at the Price column; we'll make sure that's capitalized. There's a bunch of different functions that we can use. We have things like mean, we have mode; for this one, let's look at the max gold price. Saving that and executing it, we see that we get back the maximum gold price of our data, which, if our data continued, would probably be today, but in our data it looks like the maximum gold price comes from the last entry.

Finally, I want to show you one last tool that you can use, and one of the awesome reasons why I like using pandas: it's very simple to clean your data into whatever format you need. As you can see here, we have 2020-07. Let's see if we can get rid of that dash very simply. We'll say something like df gold prices; we'll access
that date column, which is just the heading, so date right here. Now let's say this will be equal to gold prices date. We created a new column earlier by assigning like this; since we already have this column, all we're doing is overwriting the previous one with some method applied to it. We'll use a string method, replace, and all this is doing is asking: what do you want to replace? For us, we want to replace a dash, and to get rid of it we could use an empty string, a space, or anything else. I'll just leave it as an empty string, so I expect that the final result will be 202007, no spaces and no dash. Finally, going down and saying print df gold prices, we should get back that result, which we do. We also see that we get back that gold purchase price that we set right here.

Great. Since this is a beginning tutorial for anyone getting started in pandas, I wanted to be sure to show you how we can view data, how we can manipulate data, and how we can do some data cleaning. I also think that if you're trying to replace your Excel workload, you should know how to create some graphs from your data. There's a bunch of different libraries that we can use to graph data, including pandas itself. For me, I like Plotly, so I'll import plotly.express as px. The usual way to do this is to create a variable called figure. Then we'll use that Plotly Express library, which we called px, and from that we can say what type of graph we want. In this instance, we have a date and one variable that we want to plot, so that means we should probably use a line plot. Using Plotly is pretty easy: all we have to do, as the first argument, is pass the data that we want to use, which is the DataFrame gold prices. Then, from our DataFrame, we can say what our x value is. In this instance, we've already pulled that column out as dates, so we'll use that variable that we created up here for the x value, and then y will be this
one, so we have y equals prices. Finally, we can set a title if we want to, so we'll say Gold Prices Over Time, and with one line of Python code we've already created our graph. This just creates it, so we need to be sure to show it as well. Our graph has a method, show, that we can use. We'll save this and execute our code, and this graph pulled up over in my browser, so let me pull it over here. This is what the gold prices over time looks like. Briefly taking a look at this graph, one of the reasons why I like it is that we're able to zoom in and have very dynamic graphs that we can play with whenever we create them.

And that's pretty much it for this one. Like I said, these are just beginning techniques, and I hope it shows you enough that you can start using pandas on your own. I'll be coming out with more videos on more advanced techniques in the future. If you have any questions or comments, as always, please let me know and I'll get back to you. Until next time. [Music]
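
The pandas steps walked through in the captions (reading a CSV, viewing rows, selecting columns, deriving a new column, taking a maximum, and stripping the dash from the dates) can be sketched roughly as below. This is an illustration, not the code shown on screen: the variable name `df_gold_prices`, the column labels `Date`, `Price` and `Buy Price`, and the three sample rows are all assumptions, and an in-memory string stands in for the linked CSV file so the sketch runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the video's spreadsheet. The column
# names and every value here are placeholders, not the real gold data.
csv_text = """Date,Price
2020-05,1715.9
2020-06,1732.2
2020-07,1846.5
"""

# read_csv accepts a file path or a file-like object; with the real file in
# the working directory you would pass its file name instead.
df_gold_prices = pd.read_csv(io.StringIO(csv_text))

# View the first and last rows of the DataFrame.
print(df_gold_prices.head(2))
print(df_gold_prices.tail(2))

# Select columns by their header label and keep them for later use.
dates = df_gold_prices["Date"]
prices = df_gold_prices["Price"]

# Assigning to a label that does not exist yet creates a new column.
df_gold_prices["Buy Price"] = prices * 0.9

# Column-wise summary functions such as mean(), mode() and max().
max_price = df_gold_prices["Price"].max()
print(max_price)

# Overwrite the existing column with the dash removed from every entry.
df_gold_prices["Date"] = df_gold_prices["Date"].str.replace("-", "", regex=False)

print(df_gold_prices)
```

Passing `regex=False` makes the replacement a plain substring match, which is all that removing a dash needs.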
Info
Channel: Derrick Sherrill
Views: 18,037
Keywords: Python, Python Automation, Tutorial, How To, Derrick Sherrill, excel, replace, replace excel, python excel, python pandas, pandas, tutorials, beginner, 2021, csv, read csv, csv automation, excel automation, Start replacing Excel for Python, Start Replacing Excel for Python 2021, Series, Pandas, series
Id: F-gDgQ6kuuk
Length: 11min 0sec (660 seconds)
Published: Wed Apr 14 2021