What Is Correlation?

Video Statistics and Information

Captions
Hello everyone, and welcome to this brief overview of correlation. So what is correlation? Correlation is a relationship between variables. By looking at a correlation, we can get an idea of how strong this relationship is and what type of relationship exists between our variables.

So, more about the direction. One feature of a correlation is, of course, its direction. By looking at whether a correlation has a positive or negative sign, we can tell what is happening between the variables. We can also take a look at a graph of a correlation and quickly see what direction our correlation is in.

For example, a positive correlation occurs when both variables are increasing or decreasing at the same time. Our data set will fall in an upward sloping direction when we have a positive correlation. So if we have a data set with points that look like what I'm putting on here now, you can see that if we drew a line through this data, it would be going in an upward sloping direction. What this tells us is that as our variable X increases, our variable Y is also increasing.

Okay, now a negative correlation is what exists when our variables are going in opposite directions, meaning that as one variable increases, the other decreases. So in this case our data set will fall in a downward sloping direction. If we have a data set that looks like this and we draw the line that best fits it, we can see that it is going down. What this means is that as our values for X increase, as we move along the x axis, our values for Y are decreasing. So we can see here that our variables are going in opposite directions: as one is increasing, in this case our X value, Y is decreasing.

Now, if our data set doesn't seem to follow either one of these predictable patterns, then it may be described as having no correlation. No correlation means there is no discernible pattern that can be detected between the variables, meaning that no relationship exists.
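The positive and negative directions described above can be illustrated with a short Python sketch. It assumes the correlation in question is Pearson's r, which the video does not name explicitly, and it uses small made-up data sets; the function names `pearson_r` and `direction` are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

def direction(r):
    """Read the direction of the relationship off the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6]
y_up = [2, 4, 5, 7, 8, 11]    # Y rises as X rises
y_down = [11, 8, 7, 5, 4, 2]  # Y falls as X rises

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(f"upward data:   r = {r_up:.3f} ({direction(r_up)})")
print(f"downward data: r = {r_down:.3f} ({direction(r_down)})")
```

Because the second data set is the first one reversed, the two coefficients have the same size and opposite signs, which matches the upward and downward sloping pictures in the video.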
No correlation can happen in two different cases. The first one I'll present here is where our data set appears to be scattered, where it has no real pattern. So let's say, for example, we had a data set that looks like this. Now if we drew a line through this, like we did for the positive and negative correlations, you can see we would really have a straight line; this would be the best fit here. If we ever see this type of data set, where the best we can draw is a straight horizontal line, then that means that there is no real relationship between X and Y.

Another way that we could have no correlation is if we have something that is curvilinear, which means that maybe our data set looks like this: it seems to curve around. So if we drew a line it would appear to be this way, and we could also have a data set that would go in the opposite direction as well. What this means is that it's not a reliable linear relationship. We can see here that when X is really low, in this case right here, or when it's really high, Y is low. And we can see that drawing a best-fit line through our data set here does not lead to a sloping line signifying a direction.

A positive correlation, again, is signified by a positive sign, and both of our variables are going in the same direction. This can be from 0 to positive 1. A negative correlation, again, means that our correlation is from 0 to negative 1, or has a negative sign, and our variables are going in opposite directions. No correlation will be close to 0, so we'll have no sign, and it means there is no discernible relationship between the variables.

Another feature of a correlation is its strength. The closer the number is to 1, regardless of the sign or the direction, whether it's positive or negative, the stronger the correlation is. Though we would have to look up the critical r for a correlation coefficient, which is affected by the sample size, to determine whether or not the correlation is large enough to be significant, we can get a basic idea of the strength by simply looking at the correlation coefficient itself. We can see here that an r equal to plus or minus 0.8 or higher is typically referred to as a strong correlation, an r equal to plus or minus 0.5 to 0.8 is considered a medium correlation, while an r equal to plus or minus 0.4 or lower is usually referred to as a weak correlation.

Now, we can have this for any of the types of correlations, positive or negative, that we were talking about before. For example, a strong positive correlation would fit a straight line, or would be very close to that straight line. So, very similar to what I had listed up at the top of the screen, if we had a data set like this, where our best-fit line would go right through the middle (of course, that's supposed to be a straight line), then that would be a very strong positive correlation. The closer the points in our data set are to that line, the stronger the relationship and the stronger our correlation: the closer that number will be to 1.

Now we can see this especially in, let's say, a weak positive correlation. Again, it's positive, so our data set will still go in an upward sloping direction, but you can see here that our points are spread out more. They're not packed tightly around the line that we have here, the best-fitting line. So this would represent a weaker correlation overall.

Remember that correlations tell us about a relationship between variables: how strong that relationship is and in what direction those variables are going. However, we cannot infer causality or say that one variable is impacting or affecting the other.
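The strength cut-offs quoted in the video, and the curvilinear case where r comes out near 0, can be sketched in Python as follows. This is an illustration under stated assumptions: r is taken to be Pearson's coefficient, the data sets are made up, and the names `pearson_r` and `strength` are hypothetical. The video gives no label for values between 0.4 and 0.5, so the sketch leaves that range unlabelled instead of guessing.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label the size of r, ignoring its sign, with the video's cut-offs."""
    size = abs(r)
    if size >= 0.8:
        return "strong"
    if size >= 0.5:
        return "medium"
    if size <= 0.4:
        return "weak"
    # Assumption: the video does not say how to label 0.4 < |r| < 0.5.
    return "unlabelled"

# Hypothetical data, invented for illustration only.
x = [-3, -2, -1, 0, 1, 2, 3]
y_tight = [2 * v + 1 for v in x]   # points lie exactly on an upward line
y_loose = [0, 3, -1, 3, 0, 1, 3]   # upward trend, points widely spread
y_curve = [-(v ** 2) for v in x]   # Y is low when X is very low or very high

r_tight = pearson_r(x, y_tight)
r_loose = pearson_r(x, y_loose)
r_curve = pearson_r(x, y_curve)

for name, r in [("tight", r_tight), ("loose", r_loose), ("curve", r_curve)]:
    print(f"{name}: r = {r:.3f} ({strength(r)})")
```

The curved data set has an obvious pattern, yet its r is essentially 0 because the pattern is not a straight line. The labels describe size only; whether a given r is statistically significant still depends on the sample size, as the video notes.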
Info
Channel: DrMaggard
Views: 423,248
Rating: 4.844811 out of 5
Keywords: correlation, statistics
Id: Ypgo4qUBt5o
Length: 6min 56sec (416 seconds)
Published: Fri Jun 05 2009