Seaborn Heatmap - How to Visualise Correlations and Data With Heatmaps in Python

Video Statistics and Information

Captions
Do you often work with tabular data and find it tricky to identify trends and spot the outliers? If so, then the seaborn heatmap may just be for you. Hey friends, welcome back to the channel. In today's video we're going to be looking at the seaborn heatmap.

But first, what are heatmaps? Essentially, heatmaps are just paint by numbers: we take a color and assign it to a value within the table. This can help us understand our data better: we can see where we've got our maximum and minimum values, where we've got possible extreme values or outliers, and we can also identify patterns within the data. In today's video we're going to see how we can use the heatmap for two applications. The first is looking at temperature data over a period of time, and the second is looking at correlation values between variables. So let's go over to our Jupyter notebook and see how we can apply a seaborn heatmap to our data.

The first example we are going to look at uses temperature data from the Rothera station on the Antarctic Peninsula. Heatmaps are a great tool for visualizing this type of data, as we can easily spot trends over long periods of time. The first thing we need to do is import the libraries that we're going to be working with. In this case we're going to be using pandas, which is imported as pd, seaborn, which is imported as sns, and we will also be calling upon matplotlib.pyplot, which is imported as plt.

We can then import the data by using pd.read_csv. I have a few extra arguments in here so that the missing values, which are represented by a hyphen, are converted to NaNs. I'm also setting the months column as the index for the DataFrame. If we run that and then call upon the DataFrame, we can see that we've got the years along the columns and the months along the rows, and within that we have the temperature values. So we can easily convert this to a heatmap using seaborn, and
that is done by calling sns.heatmap and passing in the DataFrame. Once we run this, we get back a heatmap which shows how the temperature varies over the years for each month.

Let's make a few changes to this heatmap. First, we need to increase the figure size so that we can show every year along the x-axis, and we can do that with matplotlib. We call plt.figure, pass in the figsize argument, and set it to a value; in this case I'm going to set it to 15 inches by 8. Then we call our heatmap again with sns.heatmap and pass in our DataFrame. When we run this, we have a much larger figure and we can see each of the individual years within this data set.

At the moment the colors don't really make sense: the colder months are represented by really dark purples and almost black, whereas the warmer ones are represented by very light colors. So let's change this color map to something that's more representative of the temperature changes. We can do this with the cmap argument, passing in coolwarm, and running it. Now we can see the colder temperatures represented in blue and the warmer temperatures represented in red, and this provides a much nicer visualization of the temperature range.

We can see that the temperature during the southern hemisphere's winter months, June through September, has gradually warmed through the years, and the cold season has reduced in span. At the start we had the winter months going from about June to October, which was relatively cold, and if we move over to 2020 and 2021 we can see that this has reduced almost to a single month where temperatures are below zero.

Let's move on to the next way that we can use heatmaps: visualizing correlations between the variables within a data set. Here I've got a well log data set, and I'm going to select certain curves from this data set:
caliper, bulk density, gamma ray, neutron porosity, photoelectric factor, and acoustic compressional. These are commonly acquired curves within the well logging environment, and we often want to see how they correlate with each other. If I run this and then view the well data, we can see that we've got our individual values for each of the logging curves. We don't have depth, as I'm not really interested in that for this particular example; as I say, we're just looking at the relationship between the variables.

Before we do that, we need to call welldata.corr() to calculate the correlation values between each of the variables, and we'll assign that to a variable called corr. If we want to view it, we can call upon the variable and see the correlation values between each of the curves. At the moment we've got caliper, RHOB, et cetera along the column headers, and we've also got them along the row headers. Along the diagonal we've got 1, which is where we're comparing a variable with itself; this is obviously going to be 1, as the two are exactly the same. Then we have the other values.

However, this is where heatmaps can help us understand the data by visualizing it in a better way. At the moment we're just looking at numeric values, which are not as easy to visualize and understand. If I call sns.heatmap and pass in our DataFrame of correlations, we get back a figure with multi-colored patches. It's not very clear what's actually happening, so we need to make some visual adjustments to our figure. The first one is to change the colors. If I take our heatmap and again use the cmap argument, setting it to RdBu, which means red to blue, we get back a figure which looks a little bit better, as we've got a diverging color scheme instead of a sequential one. Here, values up at 1 are shown in blue, which indicates that
we've got a strong positive correlation between the variables, and negative correlations are represented by a deep red. At the moment the color scale goes from a value of 1 down to about -0.8. It works, but we want to make that a little bit neater, so that we're going from +1 to -1 and the colors are assigned accordingly. If I set vmin equal to -1 and vmax equal to 1, we now have the full range of the data.

We can also assist the visualization by using annotations. If I type in annot, which is short for annotations, equal to True and run that, we now have the numeric values in each cell. So we have the same values as in our DataFrame, but now with some added color, so we can see where our features correlate with each other. We've got RHOB strongly negatively correlated with DTC, and it is also strongly negatively correlated with NPHI, which is what we would expect. We also have a strong positive correlation between NPHI and caliper: as caliper increases, we expect the neutron porosity, or NPHI, curve to also increase, as the tool is reading more of the borehole fluid.

We can also style the annotations very easily, and we do that using annot_kws, which is short for annotation keywords. With this we pass in a dictionary. If we want to change the size of these values, we pass in the fontsize key and set its value, say to 11, which is 11 points. We may also want to set the font weight to bold, and we can do that by adding a fontweight key in the same way. That just makes the numbers a little easier to read.

You may notice that the cells within the heatmap are slightly squashed vertically. We can make these true squares by simply passing in another argument,
which is called square, and we set that equal to True. Now each cell is a proper square.

And there we have it: we've seen how to create a simple heatmap very easily with the seaborn library. If you've enjoyed today's video, be sure to give it a thumbs up down below, and if you want to see more content from this channel, be sure to click on that subscribe button and ding that notification bell. Thanks for watching. Until next time, bye for now.
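The notebook code is not part of the captions, so what follows is only a minimal sketch of the two heatmaps described in the video, not the author's actual notebook. The temperature table and the well log curves are made-up placeholder data, since the real CSV and well log files are not available here, and the variable names temps, welldata and corr are assumptions. The arguments themselves (na_values and index_col for pd.read_csv, figsize for plt.figure, and cmap, vmin, vmax, annot, annot_kws and square for sns.heatmap) are the ones mentioned in the video.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# --- Example 1: temperature table (placeholder values, NOT the real station record) ---
# Months down the rows, years across the columns, "-" marking a missing value.
csv_text = """Month,2019,2020,2021
Jan,1.2,1.5,-
Jun,-8.4,-6.1,-5.0
Jul,-10.3,-,-7.2
"""

# na_values="-" turns cells that are exactly "-" into NaN (negative numbers are untouched);
# index_col makes the month column the DataFrame index.
temps = pd.read_csv(io.StringIO(csv_text), na_values="-", index_col="Month")

plt.figure(figsize=(15, 8))                     # wider figure so every year fits on the x-axis
ax_temp = sns.heatmap(temps, cmap="coolwarm")   # blue = colder, red = warmer

# --- Example 2: correlation matrix (synthetic curves standing in for well log data) ---
rng = np.random.default_rng(0)
n = 200
rhob = rng.normal(2.4, 0.1, n)
welldata = pd.DataFrame({
    "RHOB": rhob,
    "NPHI": 0.6 - 0.2 * rhob + rng.normal(0, 0.01, n),  # built to correlate negatively with RHOB
    "GR": rng.normal(60, 15, n),                         # independent noise
})

corr = welldata.corr()  # pairwise correlation matrix, 1.0 on the diagonal

plt.figure()
ax_corr = sns.heatmap(
    corr,
    cmap="RdBu",            # diverging red-to-blue color map
    vmin=-1, vmax=1,        # pin the color scale to the full correlation range
    annot=True,             # write the numeric value in each cell
    annot_kws={"fontsize": 11, "fontweight": "bold"},
    square=True,            # force square cells
)
```

In a Jupyter notebook the matplotlib.use("Agg") line can be dropped, and the figures will display inline after each cell.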
Info
Channel: Andy McDonald
Views: 21,466
Keywords: andy mcdonald, geoscience, seaborn heatmap example, seaborn heatmap palette, seaborn heatmap pandas, seaborn, heatmap, python, seaborn tutorial, data visualization, data science, tutorial, plotting, seaborn heatmap correlation, seaborn heatmap tutorial, heat map, heatmap seaborn, correlation matrix, data analysis, python seaborn, heatmap using seaborn, heatmap tutorial, heatmap python, python data visualization, visualization with seaborn, seaborn heat map, heat map using seaborn
Id: J7cd1-g1O7A
Length: 8min 27sec (507 seconds)
Published: Wed Aug 10 2022