Python Pandas Tutorial 9. Merge Dataframes

Video Statistics and Information

Captions
Hey guys, welcome to the codebasics coding tutorial. Today we are going to talk about merge in pandas. As usual, I have launched my Jupyter notebook by running the jupyter notebook command on my command prompt, and it launched my notebook here. What I'm going to do is create a new notebook. To save some time on recording, I have some code snippets which I am going to copy directly.
So I have this first data frame, which contains temperature information on different cities, and then I have the second data frame, which contains data on humidity. Often I want to join these two data frames to create one single data frame which contains both temperature and humidity, and you can do that using the pandas merge function. You can say pd.merge, your first data frame, your second data frame, and the column on which you want to perform this merge. When you print your data frame, it's going to look like this. Now it has temperature and humidity, and you can see that New York has 68 humidity, which is 68 here, and 21 temperature. So it's not just going by the row index; it is truly looking into the value of the city and performing this join. If you are aware of database joins, then this is exactly the same: it's like doing a join between two tables using this column, which is basically city.
Now, here you had all three cities present in both the data frames. What if you have, let's say, one extra city here, Baltimore, where the average temperature is, let's say, 32, and then you don't have Orlando here; instead of that you have San Francisco. So let's say this is how your data frames are set up. When you perform this join, you will notice that the result has only New York and Chicago. Whatever was common between these two data frames, you are seeing here. It's missing Orlando, Baltimore and San Francisco. The reason, as you have probably figured out, is that it did an intersection. If you remember set theory, as you can see in the diagram, intersection is basically taking the common elements between the two sets. Also, in the database world there is this concept of inner join, so that's what this is: inner join, or intersection.
The second kind of join that you have in databases is outer join. If you want to use set theory terminology, it will be a union, taking the union of the two sets. The argument that you would use is how. In how, if you say outer, it's going to be a union of these two tables, so you will see Orlando, Baltimore and San Francisco, and for the columns where it doesn't have information it's going to put NaN. So we saw outer join. Inner join is the default; it is whatever you see when you don't supply the how argument. If you look at the pandas merge documentation, you will see that by default how is inner.
Another kind of join is left join, where, again as you can see in the picture, left will take all the common elements between the two sets and also the remaining elements from the left data frame. You can see that now I have Orlando and Baltimore; these two are present. But I don't have San Francisco, because San Francisco is in my right data frame. By the way, left and right are decided by the order in your merge call, so df1 is left and df2 is right. Now let me do a right join. In a right join you will see the elements from the right data frame, so you can see San Francisco and of course the common elements, but now you don't see Orlando and Baltimore.
Going back to outer join: when you have this result data frame, sometimes you might want to know which data frame these elements are coming from. For example, this row has NaN, so you know it came from either left or right. In order to know that, you can use the indicator flag. By default it is set to False, but if you set it to True you will get this additional _merge column, which says where this data came from. For example, this row was present in both the data frames, so it says both. Then it says left_only, which means Orlando was present only in the left data frame. This could be useful sometimes.
We also have one more argument, called suffixes. Let's say you have common columns in both the data frames. Again, I'm going to copy some code snippet here. Let's say this is your data frame, which has both humidity and temperature, and then you have your second data frame here, which also has humidity and temperature. Now when you join these two, again you do pd.merge, df1, df2, and your on would be city. What you'll notice is that it automatically appends _x and _y, because the humidity column and the temperature column were repeated between these two data frames. When it shows the result, it needs to somehow distinguish these columns, so it appended these suffixes. If you want your own suffixes, you can use the suffixes argument, where you give one suffix for the left data frame and one for the right data frame. By default it uses _x and _y, but when you explicitly supply suffixes it's going to use those, as you can see here.
So that's all I had for this tutorial. Thank you very much for watching. You can find the link to the Jupyter notebook used in this tutorial in the video description below. I will see you in the next tutorial. Thanks again, bye.
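The notebook itself is not reproduced in the captions, so here is a minimal sketch of the calls the video walks through. It is not the author's exact code. The city names, New York's 21 temperature and 68 humidity come from the captions; every other number, the df3/df4 names and the "_left"/"_right" suffix strings are assumptions made up for illustration.

```python
import pandas as pd

# Left data frame: temperature per city.
# Values other than New York's 21 are illustrative assumptions.
df1 = pd.DataFrame({
    "city": ["new york", "chicago", "orlando", "baltimore"],
    "temperature": [21, 14, 35, 32],
})

# Right data frame: humidity per city.
# Values other than New York's 68 are illustrative assumptions.
df2 = pd.DataFrame({
    "city": ["chicago", "new york", "san francisco"],
    "humidity": [65, 68, 71],
})

# Default is how="inner": only cities present in both frames survive.
inner = pd.merge(df1, df2, on="city")

# how="outer" is the union; missing values become NaN.
# indicator=True adds a _merge column: "both", "left_only" or "right_only".
outer = pd.merge(df1, df2, on="city", how="outer", indicator=True)

# Left/right joins keep every row of the left/right frame respectively.
# df1 is "left" and df2 is "right" because of the argument order.
left = pd.merge(df1, df2, on="city", how="left")
right = pd.merge(df1, df2, on="city", how="right")

# Two frames that share non-key column names (hypothetical df3/df4).
df3 = pd.DataFrame({
    "city": ["new york", "chicago"],
    "temperature": [21, 14],
    "humidity": [65, 68],
})
df4 = pd.DataFrame({
    "city": ["chicago", "new york"],
    "temperature": [20, 22],
    "humidity": [60, 70],
})

# Without suffixes, pandas appends _x (left) and _y (right).
default_suffixes = pd.merge(df3, df4, on="city")

# With suffixes, the supplied strings are used instead.
custom_suffixes = pd.merge(df3, df4, on="city", suffixes=("_left", "_right"))

print(inner)
print(outer)
print(custom_suffixes.columns.tolist())
```

Printing each result in a notebook cell shows the same pattern the video describes: two rows for the inner join, five for the outer join, and renamed temperature and humidity columns when both frames carry them.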
Info
Channel: codebasics
Views: 141,286
Rating: 4.9640613 out of 5
Keywords: pandas merge, merge tutorial pandas, pandas merge data frames, pandas python tutorial, pandas dataframe, pandas dataframe tutorial, merging dataframes in pandas, python merge, python pandas marge, data science tutorial, merge pandas, merge pandas python, merge python, merge in pandas, pandas python, how to merge diffrent dataframe, how to murge two data frame in python, join two dataframes pandas, how to join two pandas dataframes, joining and merging dataframes
Id: h4hOPGo4UVU
Length: 7min 40sec (460 seconds)
Published: Sat May 27 2017