Replace Excel If Function with Python Pandas

Video Statistics and Information

Video
Captions Word Cloud
Reddit Comments

I noticed in your udemy course, some people are waiting for a reply. Love the course though!

👍︎︎ 1 👤︎︎ u/Teslas_Understudy 📅︎︎ Dec 06 2020 🗫︎ replies
Captions
hey everyone welcome back to the channel if this is your first time here my name is derek and i teach python programming tutorials in this one i want to show you how you can replace excel's if statements so i want to show you a way that you can replace these using pandas and python let's jump into a few examples so you can see how it's done [Music] here's the small excel worksheet that i'll be using in this tutorial if you want to follow along completely this excel sheet will be linked in the description below but as you see it's just a small data set that contains a name and test scores so an operation that we might want to do on this data set would be to go through take an average of the test scores return a pass or a fail and then maybe the letter grade so let's do all that in python let's get started starting out we'll import the libraries that we'll be using whenever we're working with anything excel i always use pandas to do these operations we'll also import numpy as np then we'll import pandas spd if you don't have these go ahead and pip install them our first step is to connect our data source to python since we're using pandas this is super simple so we'll say scores df and then we can use pandas dot read excel and then we can pass in the file path i'm using the same directory so i can just read in sample scores.xlsx which is just the name of my excel workbook so if we want to print scoresdf to make sure that we've read the sent successfully we can we'll open up a terminal and then we'll execute our code after reading that in we see that we got the data frame that we expect now let's go ahead and create a new column called average which will take the test 1 and test 2 scores and average them together we can do that by saying scores df and then we can create a column by using square brackets and then the name of the column we'll just say this one will be average then we can apply our calculation here for us we can just do scores df and then we'll take a mean then we can pass in the axis that we want to do this operation along since we want to do it for each individual person we'll say axis is equal to 1. now that we have this average let's talk about how we can replace the excel if statement that we're trying to do there's a few different ways that we can do this the first way is if we only have two conditions the second way is if we potentially have many more conditions with many more desired outputs let's look at a simple case where we only have two conditions first if we only have two conditions we can say something like scores df and then we can create a new column so let's say our two conditions are pass and fail then we can use a numpy method called where what the numpy where method does is it considers a conditional and then it returns a value if it's true and then a separate value if it's false in this example let's say that scores df average is greater than the value of 60 then we'll give that person a value in that column of pass if not then that person will fail so let's see that now we can just print scores df see here that the first argument is just the condition the second argument is if this condition is true then put this value into that column and if it's false then we put in this value so now when we print this again we should see a new column of pass and fail which we do it looks like almost everyone passed except me or you can't win them all all right so now let's talk about if we have multiple conditions let's say for example we wanted to assign a letter grade depending on the average score how we can do this is by creating multiple conditions and then multiple results that we would expect depending on the condition we'll start off with the conditions and then we'll move on to the results we'll say conditions and this will be equal to a list and then we just pass in our conditions to this list so our first condition might look something like this where we say average is greater than or equal to 90. then our second condition might be something along the lines of scores df we'll take that average and we'll say if it's less than 90 and if scores df average is greater than or equal to 80. we can do this for as many conditions as we want so we can go through and do this for each letter grade that we expect great now we have all of our conditions at this point we just need to create a list which has our results so we can do that by saying results is equal to a list and then we'll just pass in each of these corresponding results to the conditions above so for us that would just be the letter grade of each of these sections now we just need to create that new letter grade column that looks at these conditions and assigns these results we'll do that the same way so we'll say scores df letter grade will be your new column and this time we'll use a numpy method called select what select will do is we'll take in a list of conditions and a list of results and then apply the corresponding result to the condition so empty select and now our conditions will be the first argument and then results will be our choice list we can see kai helping us out with the proper argument name so now we have conditions and results if we word to print scores df we should now get back letter grades we'll see the result once it executes center terminal okay so now we've done the letter grades i want to show you one advanced technique that we'll kind of replace this technique here if you're more familiar with python we can use a list comprehension so if you don't want to create that new column using the numpy method of where we can just use a list comprehension the way that we'll do that is we'll create that new column again so scores df and pass fail we'll use a list comprehension which will be square brackets and then we can just pass in the conditional here so we can say pass if our iteration so i'll just say x if x is greater than 60 then we'll include our else statement so else we'll say fail for our iteration so we already named it x in scores df and then we'll pick the column so score is df and this will be the average so now if we were to print scores df what we're doing is we're going through just using a list comprehension and saying if the average which we're creating as x if that x value is greater than 60 then assign pass if not we'll just give them a fail now on print this scores df and just like that we've worked through how we can replace an excel if statement using python pandas and numpy and that's pretty much it for this one this video was a request by one of you so if you have your own requests that you want me to tackle in a future video please let me know and i'll be happy to do it if you have any questions or comments about this video please let me know that too until next time [Music] my [Music]
Info
Channel: Derrick Sherrill
Views: 113,814
Rating: undefined out of 5
Keywords: Python, Python Automation, Tutorial, How To, Derrick Sherrill, Excel, Python Excel, Excel IF, IF statement, conditional, numpy, numpy where, numpy select, list comp, python excel automation
Id: JZXQPEXCjIk
Channel Id: undefined
Length: 8min 4sec (484 seconds)
Published: Sat Oct 10 2020
Related Videos
Note
Please note that this website is currently a work in progress! Lots of interesting data and statistics to come.