Seaborn FacetGrid | How to make Small Multiples with Python Seaborn | Titles, Hue, Legend

Captions
Hi everyone, welcome or welcome back to this Intro to Seaborn series. Today we're talking about the seaborn FacetGrid. Seaborn's FacetGrid serves as the backbone for its catplot, relplot, and the brand new displot, so I'm really excited to show you what the FacetGrid can do. The main idea behind the FacetGrid is that we'll be creating what are called small multiples. This just means that we'll pick out a categorical feature in our data and then create one plot for each and every category, and we're going to be able to do all of this with just a few lines of code.

One of the first examples of small multiples was created by Eadweard Muybridge to show how a horse gallops, contributing to the foundation of motion pictures. Muybridge's work is actually a collection of several different photographs. Today's small multiples still consist of many different figures, for example the market share for company A, B, and C. This allows us to compare trends across various different categories. Now that we know about small multiples, let's check out the seaborn code.

Taking a look at a little bit of seaborn code, I'm first just going to import the seaborn library as well as the pyplot module. Today we're going to be looking at some data about penguins, and this data comes from the seaborn library. I'll set my styling to be white as well. So now we're ready to create our first FacetGrid. To do that, I'm going to reference the seaborn library and then type in FacetGrid. Note that the syntax looks a little bit different here: we do have capital letters on FacetGrid, which is a little bit different than some of the other functions that we've looked at for seaborn. The next thing I need to supply is the data frame that I would like to plot, so I'm going to use the penguins data frame. Alright, so nothing super exciting yet. We basically just have blank x and y axes ready to put some data on, but the first thing I want to show you about the FacetGrid
is the idea of small multiples. What I can also do is either supply a row or column dimension, or both. Let's check these out. Let's say I wanted to supply this column dimension and say that I'm going to break up my small multiples by the island that the penguins live on. Seaborn has created three separate subplots, one for each island that it found in the island column. There are three different islands, and so we have three different plots. One thing to note here about the syntax: because I used the col argument, what that's saying is I want one column for each of those islands.

Another defining feature of the FacetGrid: this is actually going to return an object, so let's save that as g. If we check the type of g, we're going to have a FacetGrid. This is super important because we're actually going to be using methods of that FacetGrid object in order to show data in each of these plots. One more thing to show you before we do that: I used col here, so I had one column for each of the islands, but of course you can also use rows. Let's try that as well. Now I should have three rows of figures because I have three different islands, so that's also possible. You can also combine rows and columns if you have multiple different categories you want to split out on.

Now that we have our FacetGrid set up, we can move on to step two, which is to map some plots onto these axes. Let's do that now. We have g saved, so g is our FacetGrid that we just created. Now all we need to do to add some plots is to use map. The first thing I need to supply to map is which kind of figure I would like to create, so here let's create histplots on each of these figures. I'm going to create a histplot, and then I also need to say which column of the penguins data frame I am interested in. The histplot only requires one dimension, so let's do the flipper length of these penguins. Awesome, so here we go: we have one histplot for each of those three different
islands. What seaborn is doing is actually grouping up all of our data by each of the islands and then creating a histplot for each of those groups. So that's pretty cool: we're able to build all of those small multiples with just a couple lines of code.

The FacetGrid object also has another method called map_dataframe, and this is just a little bit different but accomplishes similar things to map. Let's try this one also. Again, we need to pass in what kind of figure we're trying to make (for us it'll be a histplot), and we need to supply which column from the penguins data frame we are interested in, so for us flipper length. Alright, so it looks like it does the exact same thing as map, but it's slightly different. One of the big differences here is that map_dataframe allows for named variable arguments, so we could say x equals flipper length, or even that y equals flipper length. map_dataframe allows us to name those arguments as we go. map, however, would actually give you an error here; it requires positional arguments. If I tried to say x equals flipper length, I'm going to get a big old error. map does not allow that, whereas map_dataframe does.

Besides creating those small multiples for you, the other cool thing about FacetGrid is that you can actually pass in whatever kind of plot you'd like here. Let's try a different one: how about scatterplot? scatterplot is actually going to require two different arguments, both an x and a y, so let's do that. Again, I'm just referring to the column names of that penguins data frame, so there's one called culmen depth and there's another one called culmen length. This is the other really cool thing about the FacetGrid: you can just alter what kind of figure you'd like, and that will be able to produce many, many different types of seaborn plots.

There are three main steps to produce a FacetGrid with seaborn. First, you'll set up your FacetGrid by referencing the data and which categories will form each facet. Second, we'll
need to specify what kind of plot is going to go on each facet. The third and final step is to then customize your FacetGrid using methods and attributes of the FacetGrid. Let's check that out.

We previously saw that the FacetGrid returns an object which we can save in a Python variable. Here I've called it g, and if we check the type of g, that is a seaborn FacetGrid. This object comes with many different methods and properties, and you can take a look at these by typing g dot and hitting tab. There's lots and lots of things that you can do with this FacetGrid, and we're going to be continuing on styling by using some of these methods. Let's check this one out first: we can set the axis labels through this method called set_axis_labels. This method just accepts string values: whatever you'd like to title that x axis and whatever you'd like to title the y axis. As we see, those labels are then applied to the x and y axes, and we will see that x axis label repeated for each of those small multiples.

Seaborn will try its best to title those small multiples appropriately. For example, we have the various different islands, and seaborn will say island equals Dream or Biscoe, etc. But if you'd like a different title for these small multiples, you can pass in a title template, so let's try that. Again, I'll just reference that FacetGrid object, and I'll use this method called set_titles. Since I want to change what the template looks like for the columns, I will reference the col_template, and the reason why I keep calling this a template will become apparent in just a moment. Let's actually work through this. First I'm going to reference the col_name and then write the word Island. Let's see what this does. Alright, so for each of these different small multiples I do have the string Island, but this col_name, because it's specified in these two curly braces and because I've used this keyword col_name, I actually see the name of each island showing up there. So this
is a really nice way to style those titles and still be able to pull in the name of each of those different islands. Those are again coming from the categories of this island column, which is what we based our FacetGrid on. You'll also see that I can just continue styling the grid however I'd like. First I've added the plots, and then I set the axis labels, I set the title templates, etc., and I can just continue with this until I'm satisfied with the FacetGrid.

We also previously saw that you could instead specify a row dimension for your FacetGrid, and actually you can specify both a column and row dimension, so let's also put the species on the row. Awesome, so now going across, each of the columns represents the various different islands, but each of the rows specifies the various different species of penguins, this last one being Gentoo, and that goes across each of those different islands. Some of these plots are blank because in this data set we don't have any Gentoo penguins from Dream Island, for example. So this is pretty cool: with the FacetGrid we can get one small multiple plot for every single island-species combination. Just like with those column templates for the titles, you can also set a row template, and let's do this one as the row_name. Alright, great, so we have the penguin species and the island for every single small multiple in our FacetGrid.

There's a couple of additional attributes that you can set directly within the FacetGrid call, so let's try a couple of these. One thing to notice: each of these plots right now is sharing this one y axis. If you'd like to have a separate axis for each of your figures, you can just set sharey equal to False. So now each plot has its own axes, but one thing to caution you here: this could potentially change the y limits for each of those plots. Notice how each different facet has its own range for the y, and that can be kind of confusing for comparative purposes among these different islands. So let's
actually set our y limit, and you can do that right here within the FacetGrid. Once we do that, all three of these plots will now share this 20 to 70 y limit.

Seaborn's FacetGrid is highly customizable. You can use hue to specify other categories in your data, as well as many, many other styling options, including custom functions. Let's check it out. Like many other plots, we can use hue to show off one more categorical variable within the FacetGrid. You may think that you should pass hue here to the map_dataframe method, but there's an issue with this. What happens if you pass in hue here? Seaborn will actually create these hues based on the individual figures, and what you'd probably like to have is a hue based on the overall FacetGrid. Just to clarify this a little bit further, we could actually add a legend here, and you'll see that we have some issues. You can see from this legend we definitely have some issues: Adelie penguins are in blue, but then Gentoo and Chinstrap are in orange, and it's very confusing, right? So the problem is where we've placed this hue argument: it actually should go up at the FacetGrid level. Now let's try it again. Awesome, now we have Adelie penguins in blue, Chinstrap penguins in orange, and Gentoo in green, which is exactly what we want: three different colors for our three different penguin species. So if you are going to use hue, you'll want to specify that within this FacetGrid function. You also, of course, have the option to change the various colors that are displayed here, and that's through this palette argument. There's plenty of different palettes to choose from in seaborn. Let's pick prism, and there we go. You can style that however you'd like.

The final option I want to show you is quite advanced, but it does really show off the power of this FacetGrid. This section is all about adding your own custom functions to these FacetGrids. I'm going to come back to this function in just a little bit; for now, let's take a look at this
FacetGrid example. I've specified which quantity I want to plot; that's the body mass. I've created my FacetGrid based off of the penguins data, and I'm splitting that data up by sex and by species, so my two rows of data will be the male and female penguins, and then each column will be the various species of penguins. I've then mapped my plots onto this FacetGrid. Here I'm using the kdeplot, and I'm specifying the quantity of interest that I'd like to plot; here that's the body mass. I also wanted to point out that you can again style these however you'd like, and any of these keywords you pass here are going to be going to the individual plots themselves. So I can turn on shading for the KDE or make that line width a little bit darker. Those keyword arguments are going to be specific to whichever kind of plot you're displaying on this FacetGrid. Then I've also added a title template: all my rows are going to have the row names, and my columns are going to have the column name and then the word penguins.

The way that custom functions come into play here: we can actually add (let's add it right here) one more line that will be mapping the data frame. This map_dataframe method actually accepts any kind of function you'd like, as long as that function has a data argument, so I'm going to use my own custom function at this point. If I take a look at my custom function, here's what it does. It has a data argument that's required. It also wants a keyword argument called var, so that's going to be the variable of interest. If the variable isn't provided, I'll just skip this step, but if the variable is provided, I'm going to compute the mean of the body mass. Here I'll grab my current axes, whichever facet I'm working on at that point, and I'm going to be adding a line at the mean value, a vertical line right at the mean. In this final step, I'm just going to be annotating the mean for that particular group of data. Alright, so let's try it out. That function is called add_mean_line, and it's a
function I created myself. It takes in the data argument, and that data is coming from however I set my FacetGrid up, so it's going to be passing in the penguins data. The other thing that this function needs is a variable, so the var argument here, and that's just going to be our quantity of interest; that's just the body mass. Alright, let's see what it does. Cool, so for each of these different facets we added a little line indicating what the mean is for that particular data, and I'm also annotating the mean: the male Adelie penguins are going to be 4043 grams, etc. So this is super powerful. If you have specific things you'd like to do to these facets, you can absolutely do that in a customized way. You'll just want to write those functions so that they take in a data argument, and they can have other arguments as well. The nice thing about how I've set up this function is I can actually change this quantity if I'd like, let's say to the culmen length, and see how that compares across species and across the penguin sex.

So I hope you enjoyed learning all about the seaborn FacetGrid. Like always, the code is available to you through my GitHub page, and like I mentioned in the beginning, the FacetGrid serves as the backbone for the catplot, relplot, and displot, so we'll be talking even more about the FacetGrid in those videos. See you then.
Info
Channel: Kimberly Fessel
Views: 6,736
Keywords: seaborn facetgrid, facetgrid, python facetgrid, seaborn facetgrid histogram, seaborn facetgrid legend, seaborn facetgrid kdeplot, seaborn facetgrid title, seaborn facetgrid sharey, facetgrid title, facetgrid hue, facetgrid hue legend, seaborn facetgrid hue legend, seaborn facetgrid hue, seaborn small multiples, seaborn facet grid, facet grid, python facet grid, seaborn facetgrid col, seaborn facetgrid map, seaborn facetgrid custom function
Id: YYeqJllXHxM
Length: 15min 46sec (946 seconds)
Published: Mon Jan 11 2021