Epidemiology: Observational Study Types, Odds Ratio, Relative Risk, Attributable Risk

Captions
There are different types of studies in epidemiology, and this video is about observational studies. There are two different types of studies: observational studies and experimental studies. In observational studies there is no intervention; you have a group of people who are your subjects, and you're just observing them. In experimental studies there is an actual intervention, where you give medications or perform certain procedures, and you see what the effects and benefits of that intervention are.

There are different types of observational studies: case reports, case series, cross-sectional studies, case-control studies, and cohort studies. The first two, the case report and the case series, are a bit different, and they're actually very simple. A case report is something I'm sure everybody has heard of: you have a single case to report, and n equals one, which means that the number of people you're talking about in your case report is just one. That's your subject. The disadvantage is that this is just a single case, and there is absolutely no control group. The other type of study is a case series, where you have multiple similar cases. It's exactly like a case report, except this time you have more than one patient presenting with similar symptoms or a similar diagnosis, and you're just reporting their cases. Again, there is no control group; everybody in your study, the n, which is more than one, is somebody who has the disease or has those symptoms or problems.

The remaining three types of observational studies can best be understood through a timeline. In this case we have this red line, which represents time. This is the present, this is what's going to happen in the future, and this is what has already happened in the past. What I'm trying to say is that we can divide the other observational studies
based on the time period that we are studying. If we're studying the current time period, or a certain point in time, then it would be right here; if we were studying something in the future we would be here, and something in the past would be here. Let's see how this applies.

The first type of study is a cross-sectional study. What is the definition of a cross section? You know how in histopathology you take a cross section of a tissue sample, or even in imaging: you have the whole tissue or sample, and you take a specific area or part of it. That's why it's called a cross section. The same understanding can be applied here: when we have a timeline, we're taking a cross section of that timeline. In that particular time period, like right over here in the present, we're going to study the cases. So you take a cross section of the timeline and you see how many people have a certain disease. In other words, you're measuring the prevalence of a disease.

Let's say you want to know if the risk of lung cancer increases with age. You could take a bunch of people of different age groups and follow them over time, but that would not be a cross-sectional study, because you're following them over time. A cross-sectional study is just like a tissue cross section: you just look at that one time period and you study your sample. So to make it a cross-sectional study, what you would do instead is collect data from people of different age groups at a particular time and see how many people have lung cancer in those age groups. Let's say you notice that there are more people with lung cancer in the age group of 60 to 70 years old than in the age group of 20 to 30 years old. You then conclude that lung cancer rates are higher in older individuals, but you cannot say why lung cancer is higher in
older people; you can just say that it is. In other words, in this type of study you're pretty much just seeing the prevalence of lung cancer, in this case in certain age groups. You can try to link old age to lung cancer, but you cannot say that it was the old age that caused the cancer. All you can comment on is the prevalence of lung cancer in that age group, or whatever you're studying.

The next study is a case-control study. In this study we're looking at the past in the timeline, and as the name implies, we have a control group. This is a retrospective study: we take a bunch of people with the disease and compare them with a bunch of people without the disease to see if there is a link between a certain risk factor and the disease. So you have a bunch of people you selected who have already been diagnosed with a certain disease, and a bunch of people without that disease, and you want to see if the people with the disease share a certain risk factor that is more common among them than among the people in the control group.

There is no measure of incidence in this study, because it is a retrospective study: you're not going to see any new cases developing. If you want to measure incidence, you want the study to be occurring in the future so you can see new cases as they develop. You will not be measuring prevalence either, because the people you're selecting for this study are specifically people who have already been diagnosed with the disease; you're not seeing how prevalent the disease actually is in an entire population. A case-control study can tell you if a certain risk factor caused a disease, so
there is a causal relationship that can be established. Let's say you want to see if there's a link between cigarette smoking and breast cancer. You take a hundred patients already diagnosed with breast cancer and another hundred people without breast cancer, which is your control group. You assess in both groups how many of them were cigarette smokers, comparing the frequency of cigarette smoking in the case group to the control group. That's an example of a case-control study. There is nothing in this study that can tell you the prevalence of breast cancer, because you have specifically selected a hundred patients already diagnosed with breast cancer. Neither can you monitor incidence, because you're not following these patients to see if any new cases of breast cancer develop, say, in the control group; it's all stuff that has taken place in the past.

The advantage of this study is that you can study rare diseases, because all you have to do is find the patients with the disease, breast cancer in this case. You don't need a large sample size either, and that means it is also less time-consuming. The disadvantage of the study is that it's retrospective, so it relies on memory, which means there is the possibility of recall bias. For example, in this study the patients might not be able to properly recall how much they smoked or how long they've been smoking, and that can affect the effectiveness of the study. Again, this can establish a causal relationship; in this case we can link cigarette smoking to the presence of breast cancer.

The last observational study is a cohort study, and it is the best of all the observational studies. This study looks at the future in the timeline. There is a control group in the study, and since it's looking at
the future, it's a prospective study. What we're doing is taking a bunch of people exposed to a certain risk factor and a bunch of people not exposed to that risk factor, which would be your control, and following both groups over time to see how many people in each group develop a certain disease. This is therefore the only observational study that can identify incidence: you took a bunch of people, none of whom has the disease, and you follow them over time to see which of them develop it. So you start getting an incidence rate, because you can see the disease as it develops. It's the best study to establish a causal relationship, relating that risk factor to the development of the disease, because you're actually following the people in real time; you can see the disease as it develops and therefore more accurately relate the risk factor to it.

Here's an example of a cohort study. Let's say you want to know if smoking marijuana is linked to lung cancer. You take 1,000 marijuana smokers and 1,000 people who do not smoke marijuana, which would be the control group, and you follow them over time to see how many in each group develop lung cancer. You're therefore checking the incidence. One thing you can notice is that you need a large sample size, because not all of those people, or even a lot of them, will necessarily develop the disease. The larger the sample size, the easier it is to have an effective cohort study, whereas in a case-control study you don't require as large a sample size, because you're selecting people who have already developed the disease. So that's a disadvantage. It can also be time-consuming and expensive, because you have to spend a lot of time with the patients and
monitor when they start developing the disease. For example, lung cancer is not something that's going to happen in one or two months; it takes time. So this cohort study would take a long time to be effective. The advantages were already discussed in the previous slides.

Now let's look at analyzing observational studies. These are the three main observational studies I talked about: cross-sectional, case-control, and cohort studies. Each one has its own way of analyzing the data. Cross-sectional studies use chi-square, which is something that would be covered in biostatistics; case-control studies use the odds ratio; and cohort studies use relative or attributable risk.

First let's talk about the odds ratio. The odds ratio can be used for case-control studies and also for cohort studies; however, it is more commonly used for case-control studies. There is a difference between probability and odds. Probability is the likelihood of an event occurring over all the possibilities. For example, if you have a league of 20 teams all competing for one championship, and every team is equally likely to win, the probability of one particular team winning that championship would be 1 over 20, because that's one team over all the possibilities, which include that team winning plus all the other teams winning. Odds, on the other hand, is how likely it is for that particular event to occur over how likely it is for that event not to occur. In this case the odds of that particular team winning would be 1 over 19, because the 1 represents the likelihood of that team winning and the 19 represents the likelihood of that team not winning. So the probability would be 1 over 20, and the odds would be 1 over 19.

Now, an odds ratio is a ratio of two odds. The odds ratio is when you compare how likely it is that a
particular event or disease will occur in someone exposed to a certain thing versus how likely it is that the event will occur in someone not exposed to that thing. In other words, you're trying to link risk factors to a disease or to that particular event. You're identifying risk factors, and you want to see which risk factor is most strongly associated with that disease.

Odds ratios can be applied differently. For example, here is one study: from 100 people diagnosed with breast cancer and 100 people without breast cancer, you see how many were smokers in each group. Then you can calculate the odds that someone with breast cancer was a smoker versus the odds that someone without breast cancer was a smoker. This is a retrospective study, so it's a case-control study, and in this case you would use the odds ratio. Here we have another study: if I smoke, what are the odds that I will get lung cancer compared with if I don't smoke? You can make a cohort study for this case; it would be a prospective cohort study, and you can still use the odds ratio here as well.

The use of the odds ratio is for comparison. You could identify different risk factors associated with a certain disease in your study, calculate the odds ratio for each one of those risk factors, and compare those odds ratios. You can then determine which risk factor is most strongly associated with that disease and which one has the weakest association with it.

If you have an odds ratio that is equal to one, that means the exposure does not affect the outcome. If you have an odds ratio that is less than one, it means the exposure negatively affects the outcome; in other words, it decreases the odds of a particular outcome, such as a disease, occurring. So if you have a risk factor that gives
you an odds ratio of less than one, it means that risk factor has a negative effect on that outcome; in other words, you can say that the risk factor is actually protective against the disease. Whereas if you have an odds ratio of more than one, that means the exposure positively affects the outcome; in other words, it increases the odds of the outcome occurring. So if you have an odds ratio that is more than one, you could say that a particular risk factor is more strongly associated with that outcome or disease occurring.

Now we come to calculating the odds ratio. This is an example of a study that I made up completely by myself: a study was conducted among 100 participants to determine the association between cigarette smoking and rhabdomyosarcoma of the heart. Forty of the 50 patients with rhabdomyosarcoma of the heart were known cigarette smokers. Of the 50 people in the control group, five were known cigarette smokers. Determine the association between cigarette smoking and rhabdomyosarcoma of the heart according to this study.

First of all, we want to know what kind of study it is. There are some clues, such as the sample size being just a hundred people, which is something you would not find in a cohort study. Another is rhabdomyosarcoma of the heart, which is an extremely rare disease, so that's also something you would more likely find in a case-control study. The most important thing is that in this study we're looking at people who already have rhabdomyosarcoma of the heart, so they're already diagnosed with a certain disease. In our selection we took 50 people who already have that disease, so this is a case-control study; it is not a cohort study. Since we want to know the association between cigarette smoking and rhabdomyosarcoma of the heart, we want to determine the odds ratio. So now we're going to see what are the
important things that we need to recognize from this question. First of all, we want to know what the risk factor or exposure is, and in this case it would be cigarette smoking. And what is the outcome? In this case it would be rhabdomyosarcoma of the heart. So here we have the disease or outcome, which is rhabdomyosarcoma of the heart, and the risk factor or exposure, which is cigarette smoking. We have 50 people in the disease cases and 50 in the control group. Of the 50 disease cases, 40 are smokers and 10 are non-smokers; of the controls, 5 are smokers and 45 are non-smokers.

Odds, remember, is the likelihood of an event occurring divided by the likelihood of that event not occurring. So if we want the odds of somebody being a smoker among the rhabdomyosarcoma cases, we take the likelihood of being a smoker among those cases divided by the likelihood of not being a smoker among those cases. Of the 50 people who have rhabdomyosarcoma, 40 are smokers and 10 are non-smokers, so that's 40 over 10. The odds in the control group are the likelihood of somebody being a smoker among the controls divided by the likelihood of somebody not being a smoker among the controls, which would be 5 over 45, because 5 are smokers and 45 are non-smokers.

Now we calculate those odds. The odds for the case group would be 4, and the odds in the control group would be about 0.11. Then we take their ratio, 4 divided by 0.11, and we get an odds ratio of 36, which is huge. It's a crazy number because it's just a made-up scenario. Anyway, since the odds ratio is more than 1, that means the exposure or risk factor in this case has a positive relationship to
the disease or outcome, which would be rhabdomyosarcoma of the heart. In other words, we can say that smoking does have an effect on the presence of rhabdomyosarcoma of the heart; the odds of being a smoker are 36 times greater. One way we could conclude this is that if you are a smoker, you are 36 times more likely to develop rhabdomyosarcoma of the heart, but that's incorrect, because we did not take people who are smokers and follow them over time to see how many of them developed rhabdomyosarcoma of the heart. No, we started the other way: we took people with rhabdomyosarcoma of the heart and looked at how many of them were smokers. So the better conclusion would be that the odds of being a smoker are 36 times greater for someone with rhabdomyosarcoma of the heart than for someone without it.

That was just to understand the concept, but on the exam you want to use an easier method. You would make a table. Here we put the presence of the risk factor, and here the absence of that risk factor; here we have the people who are diseased, and here the people in the control group, meaning the ones without the disease or outcome, in this case rhabdomyosarcoma of the heart. So all the people over here are in our case group, and all the people over here are from our control group. Ignore the letters for now; we're just going to put all the numbers in: 40 were smokers and had the disease, 5 were smokers but did not have the disease, 10 were not smokers but still had the disease, and 45 were non-smokers and did not have the disease.

Once you make this table there is a simple formula to use: the odds ratio is equal to a divided by c, over b divided by d. If you make it simpler, it becomes: the odds ratio is
equal to ad, meaning a times d, divided by b times c. That might seem complicated, but you just have to remember this formula: the odds ratio is equal to ad over bc. One way to remember it is the timeline: AD, anno Domini or whatever it's called, and BC, before Christ. So AD over BC.

Now all you have to know is which cell is a, which is b, which is c, and which is d. One way to remember is that the a cell is always in the top left corner, and the d cell is diagonally opposite to it, over here. The other way to remember is that a is the people who have the risk factor and have the disease, and d is at the other end of the spectrum, where they don't have the risk factor and they don't have the disease. b and c are the others; it doesn't matter if you call this one b or this one c, because in the actual equation you're just multiplying them.

So in this case we would have 40 multiplied by 45: risk factor plus disease, which is the 40 cases, multiplied by no risk factor and no disease, which would be 45. That's 40 times 45, divided by the others, which would be 5 times 10. So 40 times 45, divided by 5 times 10, which is 50, and if you calculate that, the odds ratio you get is 36. That's the other way, and it still makes sense, because essentially, if you look at a divided by c, all you're taking is the people from the case group who were smokers over those who were not, the odds of being a smoker, in this case 40 divided by 10. And that is divided by b over d from the control group: 5 were smokers, divided by 45 who were not smokers, which is exactly what we did in the whole scenario over here. So in the end
you don't have to remember anything except, first of all, to recognize that this is a case-control study where you have to determine the odds ratio. Then, as long as you can divide your cases like this, make a table, and remember the formula, you're good to go. You don't even have to remember the table, as long as you remember that you're taking the people who have the risk factor and have the disease, which is 40, multiplied by the people without the risk factor and without the disease, in this case 45, divided by the product of the others, which would be 5 times 10. So 40 times 45, divided by 5 times 10.

The next type of study is a cohort study. This study uses relative risk and attributable risk to analyze its data. The first thing you should remember is that the cohort study is the only study that tells you the incidence, because it's the only one that is a prospective study, meaning it looks into the future. So the two terms are relative risk and attributable risk. Attributable risk is also known as absolute risk reduction, and both of these risk values require incidence data for their calculation; that's why they're limited to cohort studies.

Risk is when you ask yourself, what are the chances something will happen? For example, you might ask, what is my risk of developing IBD? Risks can be compared. For example, somebody might have a 1 in 10 chance of developing IBD; in other words, the risk of developing IBD would be 1 in 10, which is 10 percent. Somebody who has a family history of IBD might have a risk of 5 in 10 of developing IBD. We can compare these two risks: in the first case it's a 10% risk, and in the second case it's 5 in 10, which is a 50% risk of developing IBD. So we can say that the risk is 5 times greater for people with a
strong family history of IBD to develop IBD than for somebody without a family history of IBD. Relative risk and attributable risk are both ways of comparing two risks: in relative risk we divide two risks, and in attributable risk we subtract two risks. I personally like the term absolute risk reduction instead of attributable risk, because it has the word reduction in it, which tells you that you're subtracting two risks.

Relative risk is how much more likely it is for an outcome to occur in someone exposed versus someone not exposed. Remember, in a cohort study you have a bunch of people with a certain exposure and a bunch of people without that exposure, and you monitor both groups to see how many in each develop a certain disease or outcome. For example, you take a bunch of people who smoke and a bunch of people who don't smoke, and you follow them over time to see which of them develop lung cancer. First of all, the outcome, which is developing lung cancer in this case, is basically incidence, because that is what you're monitoring: which ones actually develop the disease.

Let's say you had a hundred people who smoke and a hundred people who don't smoke. Of the hundred people who don't smoke, 10 develop lung cancer over the next, let's say, 50 years, and of the hundred people who smoke, let's say 70 develop lung cancer. We can compare those risks. In the first group we had an incidence of 10 people in 100; that is the incidence rate. In the second group, the smokers, we had an incidence rate of lung cancer of 70 over 100, which is 7 in 10. So we have two incidence rates: 10 in 100 and 70 in 100. Now we can compare those two and decide what the relative risk is. In other words, relative to the people who don't smoke, or the
people who are not exposed in this case, how much greater is the risk for somebody who is exposed, to smoking in this case, of developing a certain outcome, in this case lung cancer? Is it 2 times as likely, 3 times as likely, maybe even half as likely? In this case the relative risk would be 7 times as likely: if we compare the 70 and the 10, we can see that 70 is 7 times 10. That means the group of smokers were 7 times as likely to develop lung cancer as the group who don't smoke.

Attributable risk is when you ask yourself, on how many more occasions did the outcome occur in someone exposed versus someone not exposed? Here you're looking at the actual number of cases you had in those who were exposed versus those who were unexposed. Since we had 70 people who developed lung cancer among the smokers versus 10 among the non-smokers, you could say that we had 60 more cases, six-zero. Therefore your attributable risk would be 60 cases per 100 people. In other words, we can attribute those 60 additional cases of lung cancer to the risk factor or exposure, which would be smoking in this case.

This is the equation for relative risk, and it's quite straightforward. You take the incidence, or the new cases, of the outcome, in this case lung cancer, in the exposed group, in other words among the smokers in our study, and divide it by the incidence of the outcome in the unexposed group, the group that does not smoke. So in our study we would take 70 in 100, which was our incidence of lung cancer among the smokers, and divide that by 10 in 100, which is the incidence of lung cancer in the unexposed group. 70 in 100 is basically 70%, or you can call it 0.7. So it would be 0.7 divided by 0.1, and if you calculate that, we get a relative
risk of seven. The relative risk is more than one, which means the exposure had a positive correlation with the outcome. So again, if you smoke, you are seven times as likely to develop lung cancer compared to if you don't smoke. If the relative risk was one, that would mean there is no difference. Imagine we had 10 people who developed lung cancer among the smokers and 10 who also developed lung cancer among the unexposed. In this case we would have 0.1 divided by 0.1, which is equal to 1, and clearly there would be no association, or no increased relative risk. If the relative risk was less than 1, that would mean a negative correlation. For example, if we noticed that among the people who smoke the incidence was 10 in 100, and among the people who did not smoke the incidence of lung cancer was 70 in 100, then we would have 0.1 divided by 0.7, which would give us a relative risk of 1 over 7, which is less than 1. That would mean that if you smoke, you are 7 times less likely to develop lung cancer.

Now for attributable risk, which you can also call absolute risk reduction. You take the incidence in the exposed group and subtract the incidence in the unexposed group. In this case our incidence in the exposed group was 70 in 100, minus the incidence in the unexposed group, which would be 10 in 100. So 70 minus 10 gives us an attributable risk of 60 in 100. That means, as we discussed earlier, that those 60 cases can be attributed to the exposure. Can attributable risk be negative? Yes, it can; just as the relative risk can fall below one, the attributable risk can fall below zero. Let's say the incidence in the exposed group was 10 and the incidence in the unexposed group
was 70. In this case our attributable risk would be negative 60. Therefore we could say that in the group that was not exposed, in other words the non-smokers, we noticed 60 more cases of lung cancer, or you could say it the other way: in the group that smoked we noticed 60 fewer cases of lung cancer.

The last thing to talk about is number needed to treat. Number needed to treat tells you how many people you need to treat, or do something to, in order to prevent one case of the outcome. The equation for it is 1 divided by the attributable risk. For example, in the previous question our attributable risk was 60 in 100, so 1 divided by 60 over 100, which is basically 1 divided by 0.6, gives you about 1.67. Another way to look at it is that since the absolute risk reduction or attributable risk is 60 in 100, meaning 60 over 100, the inverse of that would be 100 over 60. We just flip it, and it gives you the same answer, about 1.67. That means that for every 1.67 people treated, or for every 1.67 people that you did something to, one case of that outcome will be prevented.

This is a good summary of observational studies. Here we can see the timeline, and we remember that a study can represent the present, look at the future, or look at the past. The study that looks at the present is a cross-sectional study, the one that looks into the future is a prospective study, which is a cohort study, and the one that looks into the past is a case-control study, or a retrospective study. We know that when you look at the present, in a cross-sectional study, you are able to measure prevalence; however, in this study there is no control group and no causality can be assessed. As for data analysis, in this type of study we use chi-square, and that's something
that would be covered in biostatistics. As for the past, the case-control study, we know that there's a control group; the same goes for the cohort study, which also has a control group, and a causal relationship can be assessed in both. The data analysis for case-control studies is the odds ratio, which again is the formula ad divided by bc, whereas for cohort studies you can analyze the data with relative risk and absolute risk, as well as number needed to treat, because these are all calculations that require incidence. That also means that in a cohort study we can measure incidence; it's the only one that can tell you incidence. Another difference between case-control and cohort studies is that for a case-control study you just need a small sample size, since you could just take people already diagnosed with a certain disease, and it therefore requires less money and less time. The cohort study is the opposite and requires more money, time, and sample size.
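
The calculations walked through in the captions can be checked with a short script. This is only a sketch and not part of the video: the function names are made up for illustration, and the inputs are the made-up study numbers from the transcript (the 40/5/10/45 table for the odds ratio, and 70 in 100 versus 10 in 100 for the cohort example).

```python
from fractions import Fraction


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table (hypothetical helper).

    a: exposed, diseased      b: exposed, not diseased
    c: unexposed, diseased    d: unexposed, not diseased
    Computes (a / c) / (b / d), which simplifies to ad / bc.
    """
    return Fraction(a * d, b * c)


def risk(events, total):
    """Incidence: new cases divided by group size."""
    return Fraction(events, total)


def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return risk(exposed_events, exposed_total) / risk(unexposed_events, unexposed_total)


def attributable_risk(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk in the exposed group minus risk in the unexposed group."""
    return risk(exposed_events, exposed_total) - risk(unexposed_events, unexposed_total)


def number_needed(risk_difference):
    """1 divided by the absolute risk difference, as described in the captions."""
    return 1 / abs(risk_difference)


# Case-control example: 40 of 50 cases smoked, 5 of 50 controls smoked.
print(odds_ratio(40, 5, 10, 45))                   # 36

# Cohort example: 70 of 100 smokers and 10 of 100 non-smokers develop lung cancer.
print(relative_risk(70, 100, 10, 100))             # 7
ar = attributable_risk(70, 100, 10, 100)
print(float(ar))                                   # 0.6
print(round(float(number_needed(ar)), 2))          # 1.67
```

Exact fractions are used here so that 5 over 45 is not rounded to 0.11 before dividing, which is why the odds ratio comes out as exactly 36.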
Info
Channel: Medical StudyBuddy
Views: 16,348
Rating: 4.9016395 out of 5
Keywords: epidemiology, Biostatistics, USMLE, Step 2, Step 2 CK, Kaplan, Observational studies, Relative Risk, Attributable Risk, Odds Ratio, Number needed to Treat, Medicine, Education, Tutorial, Cohort, Cross-sectional, Case-control
Id: 9sCFQ1M3L-g
Length: 34min 48sec (2088 seconds)
Published: Wed Jan 11 2017