Polynomial regression in MATLAB

Video Statistics and Information

Captions
Hello, and welcome to the screencast on polynomial regression in MATLAB. In this screencast we're going to learn how MATLAB handles polynomials, about the polyval command and the polyfit command, and then how to plot a polynomial trendline on top of a scatter plot using only the command window or an M-file.

So first of all, just to review: in mathematics, a polynomial is a function that is a combination of power functions with coefficients multiplied in front of them. That's in mathematics terms. In MATLAB we have a special way of representing them: we represent polynomials as vectors containing the coefficients, where the first entry is the leading coefficient. For example, the polynomial above will be represented by this vector. There's a zero there because there is no x to the fourth term.

Now, the command polyval is used in MATLAB to basically plug a number into a polynomial. In mathematics terms, we would simply replace the x with the number we want to plug in. In MATLAB, if I have this as my polynomial, to quote-unquote plug in 4 I would use the command polyval: polyval(P, 4) takes 4 and plugs it into the polynomial P listed above. So in other words, polynomials for MATLAB are a certain kind of data type and more than just a function.

Let's play around with this some more. Here in the MATLAB command window, I'm going to define P to be the polynomial [2 1 -1]. That would be, in math terms, the polynomial 2x squared plus x minus 1. We're going to leave it like that. Now if I want to plug 10 into this polynomial, I would say polyval(P, 10), and I should get 209, which is what you would get if you did it in mathematics. Now the nice thing about polyval is that it's vectorized. That means I can also plug in a list of numbers and have it evaluate them all at the same time. For example, with polyval and P, if I wanted to plug in not just the number 10 but, let's say, all positive integers from 1 through 10, I could type 1:1:10, and it would evaluate all ten of those numbers automatically at the same time. This is very important when it comes to plotting.

One of the reasons MATLAB has special functionality for polynomials specifically is that polynomials are very useful in simple models for data. As we saw in the Basic Fitting tool screencast, polynomials often make very good trend lines to give us lines of best fit, or curves of best fit, through data.

And speaking of data, I have a variable over here in my workspace called JoCo population. I double-click to open it, and what this variable contains are two columns of data. This is like a mini spreadsheet: in the first column are year values and in the second column are population values. Let me say a few things about how I've scaled the data here. The years here are years since the year 1830, so when you see 0, that's the year 1830, this is the year 1840, 1850, and so on, all the way down through the year 2010. And in the second column is the population of Johnson County, Indiana, where I'm currently located, in thousands of people. So in 1830 there were 4,019 people in Johnson County, in 1930 there were 21,706 people, and so on.

I'll close that out right now, and what I'd like to do is look at a scatter plot of these data just to get a sense of the overall trend. I have an M-file open right here that will create such a scatter plot. Just to walk quickly through the code and what it does: the first line here, close all, simply closes all open graphics windows when it executes. The second line takes the first column of the JoCo population variable, which has the years in it, and just calls it year. This third line takes the second column of JoCo population and assigns it to the variable pop. And the rest of the stuff is just trappings that will make the plot look nice. And here's the plot that we get.

So one of the things I would like to do with a scatter plot like this is to put a line of best fit through it. The data don't look linear, but it's a good start for fitting the data just to create a simple line.
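The polyval examples from the start of the screencast can be reproduced with a few lines at the command line. This is only a minimal sketch: the variable names single_value and all_values are chosen here for illustration and do not appear in the screencast.

```matlab
% Represent 2x^2 + x - 1 as a coefficient vector, leading coefficient first.
P = [2 1 -1];

% Plug a single number into the polynomial: 2*10^2 + 10 - 1.
single_value = polyval(P, 10);      % 209

% polyval is vectorized, so a whole list of inputs can be evaluated at once.
all_values = polyval(P, 1:1:10);    % ten values, one per integer from 1 to 10

disp(single_value)
disp(all_values)
```

The last entry of all_values is the same 209 obtained from the single evaluation, since 10 is the last input in the list.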
Now, I could do that with the Basic Fitting tool, but I can actually do that from the command line or from an M-file. Let's go to the command line for a second. There is a command in MATLAB called polyfit, and let me just type in the syntax that I will use: polyfit(year, pop, 1). Let me explain what each of these arguments means. polyfit is going to create a polynomial regression line of best fit. The first argument is the x variable in my data, the second argument is the y variable in my data, and the third argument is the degree of the polynomial I want. By putting in degree one, I am specifying a line, a linear function of best fit. If I wanted a cubic of best fit, I would change that to a three, for example. When I execute polyfit, it will give me back a polynomial. This, as a function, is the function y = 0.5807x - 12.5513.

I'm going to re-execute that command one more time, except I'm going to store the output as a variable, so P is equal to polyfit, and again the syntax for polyfit is: enter your x data here, your y data here, and the degree you want. There's P.
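As a rough illustration of the polyfit syntax just described, here is a small sketch. The data below are made up for the example (they are not the Johnson County figures) and lie exactly on the line y = 0.5x + 2, so the fit should recover a slope near 0.5 and an intercept near 2, up to floating-point rounding. The names x_data and y_data are assumptions for this sketch.

```matlab
% Hypothetical paired data, NOT the Johnson County data from the screencast.
x_data = 0:10:50;            % x values (e.g. years since some start year)
y_data = 0.5 * x_data + 2;   % y values lying exactly on a known line

% Third argument is the degree: 1 asks for a line of best fit.
P = polyfit(x_data, y_data, 1);

% P(1) is the slope (leading coefficient), P(2) is the intercept.
disp(P)

% Use the fitted line to predict y at a new x value.
prediction = polyval(P, 60);
disp(prediction)
```

Changing the 1 to a 3 in the polyfit call would request a cubic instead, and P would then hold four coefficients.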
Now I can do some fairly interesting things just right here. For example, if I wanted to get a prediction from this linear model of what the Census Bureau would say about the population in the year 2020, that for me is the x value of 190. That would be plugging a number into this polynomial, so I would use polyval(P, 190), and I get a predicted population of about ninety-eight thousand people: 97,788 people.

Now what I would like to do is maybe to plot this line of best fit right on top of the scatter plot. To do that I have to do a few things. First I'm going to define PP equal to polyval of P, and here I'm not going to plug in a single value for my time, I'm going to plug in the entire year vector. A little earlier we saw that polyval is vectorized, and if I wanted to evaluate a polynomial at a whole string of numbers I could do that. So when I hit Enter on this it's going to be a lot of numbers, but these are the population values that my linear model would predict for all the years that I have, from 0 to 180. That's a vector, and I can plot that like any other vector.

So I already have my scatter plot over here; I'm going to superimpose a new plot on top of it. I'm going to type hold on to make sure I don't wipe out my current plot, and then I'm going to plot, using the plot command, the data that I just came up with. To do that, remember, I have to have the x variable first, that's year, and then my y variable, which is PP, and I'll format this so it's a solid red line going through the data. Once we click back over to the scatter plot, I see there's my line of best fit. Again, not the greatest fit, but it is the best fit that a line can possibly get.

Now I'm going to go and add all those commands to my M-file here, just because it seems like something I'd want to do over and over and over again. So I'm going to add some space here and just type in the commands we saw a minute ago. P was equal to polyfit of year, population, with degree 1, and then I assign to PP the values of polyval of P and all the year values. Basically I'm plugging in all those year values to P all at once. And then I hold on, and then I would type plot, x variable first, year, PP for the y variable, and then I format it with this. Now run this again; it should give me the same plot as before. There we go.

Now what's nice about this is that in an M-file I can control a lot of things. I could change this: what if I wanted a cubic instead? Well, all I would have to do is change this 1 to a 3. Now let's again review what's going to happen here. Once I click the execute button, it's going to plot the scatter plot and label it like it had before. P is going to be the cubic polynomial, the degree 3 polynomial of best fit through my data. I'm going to take all my year values and plug them into P. Just as when we were creating a basic mathematical plot using the plot command much earlier in the series of screencasts, we had to create a variable for the x's and then create a variable for the y values. That's what we're doing here through the polyval command: we're creating a vector of y values. I tell the graph not to overwrite, and then plot it. And now we run the M-file and I see a very nice plot here. Here's the red line, the cubic, that goes up through the data.

One last thing I might want to do is add that cubic to my legend, and I don't have to do that manually. Just by adding a second string argument here, I can just call this the third degree polynomial, like so. We run that M-file and there we have the scatter plot, the cubic, and it's labeled in the legend as we would like it to be.

So let's recap what we learned in the screencast. We learned that polynomials in MATLAB are actually a data type; we represent them as vectors containing the coefficients. We've learned about the polyval command, which allows me to plug single numbers or entire vectors' worth of numbers into a polynomial. And we've learned about the polyfit command, which generates the nth degree polynomial of best fit to a set of paired data. And if I'd like to plot a polynomial of best fit alongside or on top of a scatter plot of paired data, I can follow this little procedure here: first of all, generate the nth degree polynomial using polyfit; plug all your x values into that polynomial using polyval; and then plot the result basically just like you would any other mathematical plot, by typing hold on so as not to overwrite the scatter plot, and then just plain plot. That's all. Thanks for watching.
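The recap procedure (polyfit, then polyval, then hold on and plot) might look roughly like the following M-file. This is a sketch under stated assumptions, not the author's actual file: the data values are invented for illustration, and the axis labels, marker style, and legend text are guesses at the "trappings" mentioned in the screencast.

```matlab
close all                       % close any open graphics windows

% Hypothetical data: years since a start year, population in thousands.
% These numbers are made up; they are NOT the Johnson County data.
year = 0:10:100;
pop  = [4 6 9 11 13 14 16 19 24 31 40];

% Scatter plot of the raw data.
plot(year, pop, 'o')
xlabel('Years since start year')
ylabel('Population (thousands)')

% Step 1: generate the degree-3 polynomial of best fit.
P = polyfit(year, pop, 3);

% Step 2: plug all the x values into that polynomial.
PP = polyval(P, year);

% Step 3: keep the scatter plot and draw the fitted curve on top of it.
hold on
plot(year, PP, 'r-')            % solid red line
legend('Data', 'Third degree polynomial')
hold off
```

Changing the 3 in the polyfit call back to a 1 would give the straight line of best fit instead; the rest of the script stays the same apart from the legend text.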
Info
Channel: RobertTalbertPhD
Views: 139,993
Rating: 4.9457769 out of 5
Keywords: CamtasiaForMac, polynomial, regression, statistics, stats, MATLAB, POLYVAL, POLYFIT, data, plot, visualization, vizthink, Talbert, cmp150, tutorial, screencast, howto
Id: c2DL_bZlrLs
Length: 9min 51sec (591 seconds)
Published: Wed Feb 23 2011