Survival Analysis Part 9 | Cox Proportional Hazards Model

Video Statistics and Information

Captions
[Music] So we've talked a lot about survival analysis and some of the important ideas surrounding it, and then we started to build up the idea of how regression models can be used to estimate the survival function. We did a big-picture overview, and we looked a bit at the exponential model as kind of the entry-level model to give us the foundation, and now we're easing our way into the Cox proportional hazards model.

The big thing with this, which we've already talked about, is that the Cox proportional hazards model allows the hazard to change over time. We said the exponential model has a constant hazard, and we said we can look at other parametric models, like the Weibull, these accelerated failure time models, that allow the hazard to increase or decrease proportionally with time. This model allows the hazard to fluctuate: it can increase and/or decrease as time goes by. But one thing it does assume is that hazards between groups are proportional, or, another way of saying that, that the hazard ratio is constant.

The example I'm going to use is, say we're calculating the hazard ratio associated with biological sex. So what's the hazard for males relative to the hazard for females? What the Cox proportional hazards model assumes is that this hazard ratio is constant over time, that the hazard for males is proportional to the hazard for females. Just to give it a number, suppose the hazard ratio is 2. That means at any given point in time a male is twice as likely to die as a female, and this model assumes that that's constant over time, so at any point in time males are always twice as likely to die as females.

So just to kind of build the idea of what proportional hazards means, or a constant hazard ratio, what I wanted to do is first go back to ideas we learned in linear regression, then logistic regression, and then how that translates to survival. So we're trying to take some knowledge we've already built and then transfer that into survival to understand what exactly this proportional hazards means. The way I find it easier to think about is that the hazard ratio is constant; that way it's kind of clearer in my brain.

Suppose we're looking at just a generic numeric outcome Y and some X variable, and suppose we have the regression line for males and for females. Remember, this distance here is the difference between the mean for males and the mean for females, and here we're using parallel lines, what we call no effect modification: the difference between a male and a female is the same for any value of X. When there is effect modification, or interaction, we might see something like this: here's the line for males and here it is for females, and in fact the difference between the male and female changes depending on values of X. Here that difference in means is growing; the difference in means depends on where we are for X.

We saw in logistic regression that we model the log odds of the disease as a linear function of X. So again we can think of fitting a line for males and a line for females, and if we fit parallel lines, there's no interaction or no effect modification. The distance between these two is the log odds ratio. We saw that if we took that difference and exponentiated it, it gave us the odds ratio. And again this model assumes the odds ratio, the odds of disease for male versus female, is the same for any value of X; it does not depend on that X value. And then interaction or effect modification might look something like this, where again the difference in log odds for male versus female, which we saw is the log odds ratio, depends on the value of X.

Looking at survival, we're modeling the log hazard as a function of X variables. Remember that the proportional hazards model allows the hazard to change over time; it can fluctuate up and down. But for the sake of pinning on to previous knowledge we've built, I'm going to draw it as a straight line. What we're showing here is that the log hazard is allowed to change over time, and it can actually fluctuate up and down, it doesn't need to be a line, but I want to connect the ideas we've learned before about linear and logistic regression to what we're seeing now. So we fit a line for males and females, and the distance between them here is the log hazard ratio, for the same reason that we've built up previously when we talked about logistic regression or when we worked with regression more generally. What that tells us here is that the coefficient for male versus female tells us how the log hazard changes, and exponentiating that gives us the hazard ratio. We'll just leave that here.

This one is showing proportional hazards: the hazard for a male versus female is proportional over time, or the hazard ratio is constant. Just to go back to what proportional hazards means: if this hazard was 0.5 and this hazard was 0.25, the top is double the bottom. Then the hazards can increase over time. Let's say at time t equals 2 this hazard goes up to 0.6; then the hazard for females would go up to 0.3, so they are proportional. Then suppose at time t equals 3 the hazard for males has increased to 0.9 and the hazard for females to 0.45. So the top and the bottom are proportional, or their ratio is constant. This here is showing the idea of proportional hazards. Again, remember that in the Cox proportional hazards model the hazard doesn't need to be continually increasing or decreasing. These could be moving up and down, but the distance between the two is the same.

If we saw that this is what it looked like for males and this is what it looked like for females, then what we're saying is that the distance between these two, remember that's the log hazard ratio, depends on the time; the hazard ratio is changing over time. The hazard ratio is not constant, so this is going to be non-proportional hazards.

We're going to talk a little bit more about proportional hazards as we progress through things. We're going to look at a data set, we're going to fit a model in R, and talk about what it means to assume proportional hazards. Then we'll learn how we can check if the hazards are proportional over time, and if they're not, for a situation sort of like this, what can we do and how do we address that. So that's all what's coming up.

I guess let me just label some of these here. In the case of linear regression we call this interaction or effect modification, and we say that this effect, or I'll just say y-bar-1 minus y-bar-2, the difference in means, depends on the value of X here. Same with logistic regression: the odds ratio, or here we've depicted the log odds ratio, but if the log odds ratio changes depending on X, the odds ratio changes. So here the odds ratio depends on the value of X. With what we've drawn here, the hazard ratio depends on time, or in other words the variable sex, biological sex, male versus female, interacts with time. If you're thinking ahead, you can probably already think of what some of the solutions are going to be for addressing non-proportional hazards: if the effect of sex changes depending on time, we're going to essentially end up fitting a model that includes a sex-by-time interaction. It's a little bit more complicated than doing exactly that, but that's the idea and concept.

So this is just kind of the intro overview. What we're going to do is start to get into some of the details of all this. You may recall when we started talking about the Cox proportional hazards model we said that essentially we're modeling the log hazard using the log baseline hazard function. Again, we don't need to get too stuck in the details of this right now, but essentially what we said was this baseline hazard function, you can see it's a function of time, so this allows the hazard for the reference group to increase or decrease or fluctuate over time. So it's this, plus b1 x1 plus b2 x2, all the way up to bk xk. The log baseline hazard is sort of acting like what previously was thought of as being the intercept for the regression model, except now this intercept is a function of time: it can increase or decrease, it fluctuates over time.

Or instead we can think of this, rather than on the scale of the log hazard, as modeling the hazard as the baseline hazard. Again this is the hazard for the reference group; it's a function of time, and it's allowed to fluctuate over time. So this is sort of acting like the intercept term, the hazard for the reference group, and it's a function of time, so it can increase or decrease, again fluctuate over time. And then that is multiplied by the exponential function of b1 x1 plus b2 x2, up to bk xk.

So the kind of big innovation with the Cox proportional hazards model is that Cox came up with a way that we can estimate these coefficients, so we can estimate b1, b2, up to bk, the model's coefficients, without having to specify what this baseline hazard function is. In other words, the hazard is allowed to fluctuate over time, increasing or decreasing, and we don't need to specify how it fluctuates over time; we can still estimate all the model's coefficients.

The way I sort of think of that is it's sort of like fitting a regression line. So here's a simple X and Y, and we're fitting a regression line where we estimate the slope without estimating the intercept. So we don't know if the intercept is here or up here, but we do estimate the slope. Here, we've estimated the rest of the coefficients without estimating what acts as this sort of intercept, the baseline hazard. But the assumption there is that these two are equal distances apart. As time goes on, as I mentioned, these may actually bounce around however they want, but the distance between them is going to be the same: proportional on the scale of the hazard, or equidistant on the scale of the log hazard.

So what we're going to do following this video is look at the Stanford heart transplant data set. We'll look at fitting a Cox proportional hazards model in R, what some of the components there are, and so on, and then we'll come back to talk about the Cox proportional hazards model: what exactly are the assumptions (proportional hazards is one of them), how can we check if it's met, and if it's not met, how can we address violations of that. So that's all coming up. First let's just take a look at fitting this model and interpreting some of these components in R. Stick around, guys, there's more to see. [Music]
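
As a rough numeric illustration of the proportional hazards idea described in the captions above, the sketch below checks two things. First, the whiteboard hazards (0.5 vs 0.25, 0.6 vs 0.3, 0.9 vs 0.45) give the same ratio at every time point. Second, a hazard of the form "baseline hazard times exp(b x)" has a constant male-to-female ratio however the baseline moves. This is only a sketch under stated assumptions: the function names, the sine-shaped baseline, the coefficient log(2), the time label t = 1 for the first pair of hazards, and the non-proportional numbers are all made up for illustration and are not from the lecture.

```python
import math

# Hazards from the whiteboard example. The lecture labels t = 2 and t = 3;
# labelling the first pair as t = 1 is an assumption made here.
male_hazard = {1: 0.5, 2: 0.6, 3: 0.9}
female_hazard = {1: 0.25, 2: 0.3, 3: 0.45}

# Hypothetical female hazards that are NOT proportional to the male ones.
female_hazard_nonprop = {1: 0.25, 2: 0.5, 3: 0.3}


def hazard_ratios(numerator, denominator):
    """Hazard ratio numerator/denominator at each shared time point."""
    return {t: numerator[t] / denominator[t] for t in numerator}


def is_proportional(numerator, denominator, rel_tol=1e-9):
    """True if the hazard ratio is the same at every time point."""
    ratios = list(hazard_ratios(numerator, denominator).values())
    return all(math.isclose(r, ratios[0], rel_tol=rel_tol) for r in ratios)


def baseline_hazard(t):
    """Hypothetical baseline hazard that rises and falls over time."""
    return 0.25 + 0.1 * math.sin(t) ** 2


def hazard(t, male, beta=math.log(2)):
    """Hazard at time t: baseline hazard times exp(beta * male).

    male is 1 for males and 0 for females (the reference group).
    beta = log(2) is an assumed coefficient, giving a hazard ratio of 2.
    """
    return baseline_hazard(t) * math.exp(beta * male)


def model_hazard_ratio(t, beta=math.log(2)):
    """Male-to-female hazard ratio at time t; the baseline cancels out."""
    return hazard(t, 1, beta) / hazard(t, 0, beta)


if __name__ == "__main__":
    print("whiteboard ratios:", hazard_ratios(male_hazard, female_hazard))
    print("proportional:", is_proportional(male_hazard, female_hazard))
    print("non-proportional case:",
          is_proportional(male_hazard, female_hazard_nonprop))
    for t in (0.5, 1.0, 2.0, 5.0):
        print(t, round(baseline_hazard(t), 4), round(model_hazard_ratio(t), 4))
```

The baseline hazard changes from one time point to the next, but it appears in both the numerator and the denominator of the ratio, so only exp(beta) is left. That cancellation is why the coefficients can be interpreted without knowing the baseline's shape.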
Info
Channel: MarinStatsLectures-R Programming & Statistics
Views: 42,344
Rating: 4.9190283 out of 5
Keywords: r course for beginners, r programming tutorial, r programming language, statistics with r, Data Science with R, statistics for data science, R programming for beginners, statistics course, statistics 101, statistics crash course, statistics course for Data science
Id: aETMUW_TWV0
Length: 13min 58sec (838 seconds)
Published: Wed Apr 01 2020