Repeated measures (afex)

Captions
[Music] This video contains aliens, sniffer dogs, some beeped-out curse words, more aliens, and really bad acting. I mean really bad. Did I mention the aliens?

There is unrest in the universe. A race of evil space lizards known as the Inane Cult are planning to take over Earth. Their leader, Lazink Dirt, is poised for the invasion. His trusted second-in-command, Tunic Lane, boards his spacecraft and heads off with an update for his leader. [Music]

Meanwhile for a step, a brave but exhausted savior of our world is oblivious to Lazink Dirt's plans. He has spent many years protecting the Earth, but he is counting the days until he can pass on his mission and retire to a nice beach. His heroic protégé, Days Flitch, who has been spying on the Inane Cult, hurries across the galaxy and arrives with an important message for us all. [Music]

Horus death, I'm so glad to see you. Days, it's so good to see you. I'm so tired of fighting this war. I don't think my body can take any more. Where have you been? I've been on a long, dangerous, heroic mission where I was very heroic. I have intercepted a transmission from Lazink Dirt, leader of the Inane Cult. It's not good news, I'm afraid. What is it? Let me show you.

Ah, Tunic Lane, the shape-shifting training is going well, I see? Well, yes, it's going very well, except I can't get my hands to become humanoid. I can do the face, the body, but the hands, I can't do the hands. Is the invasion fleet ready? Nearly. The shape-shifting training has been successful on about half of the alien fleet, but the other half just can't get the hang of the shape-shifting. They can't do it. I grow impatient for victory. Those that can't shape-shift into humanoid forms shall be sent in their natural lizard form. The humans will never notice. As you command, mighty leader.

Ah man, that is bad, horror. How can we stop them? Well, while you've been away we have been experimenting with training sniffer dogs to detect aliens, but we need more research. Can you get me some sniffer dogs? I found the best spaniels we
have, but we only have eight of them. Hmm, that's not many. I don't know what I can do with that. I need to consult a higher intelligence. What you need is a repeated measures design.

Well, the all-knowing Professor Hippo has told us that to get some kind of answers with only eight dogs we're going to need to use a repeated measures design, so we'd best get on with it. A repeated measures design we can analyze, again, with a linear model, like all the other designs that we've looked at on this module. We're addressing a scientific question, which is whether we can use sniffer dogs to sniff out aliens. We've got a sample of sniffer dogs, a small sample, only eight of them. We're going to need to test them and collect some data from them. We'll visualize that data and then we'll fit some variant of the general linear model. In doing that we're going to estimate parameters, we could look at confidence intervals of those parameters, and we can test hypotheses: is it the case that sniffer dogs can differentiate humans from aliens? We're going to have to test the assumptions of that model, and we're going to have to fit robust versions of the model if the assumptions aren't met. Everything is the same as what we already know. The only difference is that we have a different kind of design, a repeated measures design. So let's get on and have a look at what that actually means.

Okay, let's think about the simplest experimental design that we could use to test whether our sniffer dog training had worked. Our sniffer dogs have been through a training regime where they are rewarded when they make vocalizations while sniffing the thing that we want them to detect. Essentially, we put an alien in front of them, and if they make a little noise while sniffing that alien we give them a little treat, so they get used to the idea that they get rewarded for making noises when they're smelling an alien smell. And then when they're
smelling a human we don't give them any treats for vocalizing. So hopefully, if the training is successful, they're able to differentiate an alien smell from a human smell. Over the course of the training they'll be learning to make more noise when they're with an alien, because they think that if they make noise when they smell an alien smell they're going to get a treat.

So the simplest design we could use to test whether our training has been successful is to take two groups of dogs that have been through the training and randomly assign them so that some of them sniff humans and some of them sniff aliens, and we could measure how many vocalizations they make while sniffing. This is the same sort of design that we've looked at in previous lectures: we have different participants in different categories of the experimental condition.

There are two types of variance. First, systematic variance: there's variance in vocalizations, in the outcome, that is going to be created by our experimental manipulation. In this case it's going to be created by the fact that some dogs are sniffing humans and some dogs are sniffing aliens, so that difference should create some kind of variability in the vocalizations if the training has been successful. There'll also be unsystematic variation. We've got different dogs in the two groups, and dogs, for example, have natural differences in how much they like vocalizing. Some dogs are very chatty, as it were, and some dogs are not very chatty, so there are going to be some naturally occurring differences between the groups. There may be other factors as well, like when they're tested. Maybe if the dogs are particularly hungry in one of the groups, that could be a confound or something. So there'll be some other variability in vocalizations due to factors that we perhaps haven't measured. We always have these two types of variance, systematic and unsystematic.

Now when we use repeated
measures design what we're doing is using the same participants in all of the experimental conditions. In this case we've got our eight dogs that have been trained (that's all we've trained so far) and we need to test whether they can differentiate humans from aliens. So what we could do is test the same eight dogs: get them to sniff a human and measure how many vocalizations they make, and at a different point in time get them to sniff aliens and see how many vocalizations they make. The design hasn't changed. All that's changed is that we're using the same dogs in each experimental condition.

What this should do, in theory, is reduce the unsystematic variance. By using the same dogs in the two conditions, we know those dogs have exactly the same propensity to vocalize, because it's the same dogs, so there shouldn't be any difference there. We know their noses are going to be equally sensitive, because it's the same dogs. It's not going to be the case that the dogs sniffing the aliens have more sensitive noses than the dogs sniffing the humans, because we're using the same dogs, so across that group they have exactly the same sensitivity of smell. That's probably a good enough way to put it. So by using a repeated measures design, hopefully we are reducing the unsystematic variance: we're controlling a lot of factors across the experimental conditions.

So the benefits of repeated measures designs are, on paper, that they're more sensitive. You should reduce the unsystematic variance, because a lot of things are held constant across experimental conditions when you're using the same participants, so you should get more sensitivity to experimental effects. We should have a more sensitive design to detect whether our sniffer dogs can differentiate aliens from humans if we use the same sniffer dogs. Hooray. The other good thing is that it's an economical way to test experimental hypotheses, because you need fewer participants. So in
this case, I mean, eight participants is still really, really small, but it allows us to at least test this hypothesis. If we had to divide those eight dogs into two groups, a group that sniffed humans and a group that sniffed aliens, we'd only have four dogs in each group. That would be an awfully small number of participants. So it's a more economical design as well, because you need fewer participants. But you do have to be aware that participants can get fatigued. If you're subjecting participants to hundreds and hundreds of experimental conditions they are going to get fatigued, and the quality of their data is going to be poor. That shouldn't be a problem for us, because we're only going to have a very small number of experimental conditions.

Okay, so the design we have settled on, thanks to Professor Hippo's wise advice, to test whether our puppies can sniff out aliens, is this. First off, our outcome measure is self-evidently going to be vocalizations, because that's what we've been training. We're going to let them sniff an entity for one minute and we'll count the number of vocalizations they make during that minute. To test the hypothesis we are going to vary the entity that they sniff. On one occasion they're going to sniff an alien, not in humanoid form. On another occasion they're going to sniff a human, which is a control for whether they can detect aliens versus humans. We're also going to let them sniff a mannequin, which is a control for them sniffing something in humanoid form that's not a human. And we're also going to get them to sniff a shape-shifting alien, that is, an alien that is able to transform itself into humanoid form. So across these four different entities, the dog gets to sniff each one of them for a minute and we count the number of vocalizations. Each dog will sniff them in a different order, and that's to get rid of any potential confound of the order in which they sniff the entities. We
also are going to have a variable to indicate the dog's name and, as you already know, there are eight dogs in total.

Here are the data. We have our eight dogs: Milton, Woofy, Ramsay, Mr Snifficus III, Willock, the Venerable Dr Waggy, Lord Senecal and Professor Nose. Each score in each column represents the number of vocalizations they made while sniffing each of the four entities. So, for example, Milton made eight vocalizations while sniffing an alien, seven while sniffing a human, one while sniffing a mannequin and six while sniffing a shape-shifter.

Now, a couple of things I want to note. First of all, if we look at the means for each of the entities, these are the things that our model is predicting. Essentially these are the predicted values from the model. So we've got about eight vocalizations where an alien was sniffed, about four for a human on average, about four for a mannequin on average, and just under six for a shape-shifter on average. They're the predicted values from the model, exactly the same as in a between-subjects or independent design. So that's what we're testing: we're testing whether those values are different. That represents our experimental effect.

But I also want you to note that the different dogs vary in the effect the experiment has had. For example, if we look at the Venerable Dr Waggy, he has scores of seven, five, six and seven. So basically there's almost no experimental effect there. It doesn't matter what he's sniffing, he gives between five and seven vocalizations, so it looks like for him the training hasn't worked. We can tell that because if we look at the variance in those four scores, the variance across seven, five, six and seven, it's pretty small: 0.92. So it looks like training hasn't had very much effect on the Venerable Dr Waggy. Now if we take another dog like Professor Nose, who perhaps got his name because he's a very good sniffer, we can see that there is a lot of variability
in the vocalizations that that dog makes across the entities. When it's an alien he does 12 vocalizations, when it's a human he only does six, when it's a shape-shifter he does one, and when it's a mannequin he does eight. So the variance across those scores is huge in comparison to the Venerable Dr Waggy. For Professor Nose it looks like he really has a lot of variability in his vocalizations depending on what he's sniffing.

So the point here is that there are individual differences in the effect that the experiment has had. There is variability in the extent to which the training has been successful. For a dog like the Venerable Dr Waggy the training hasn't been very successful: his vocalizations are really pretty similar across the four entities. Whereas for, say, Professor Nose it looks like the training might have had more effect: there's a lot more variability in the vocalizations that that dog makes across the entities. So the experimental effect is kind of embedded within the individual, and there are individual differences in the extent to which the experimental manipulation has had an effect.

Okay, so this is what the data look like in R, just so that you're not confused by the previous slide. We have it in long format. You can see we have dog name as a column, and our first dog, Milton, takes up four rows. Those rows distinguish the different entities that were sniffed, which is coded in a different column, and the final column, for vocalizations, shows the number of vocalizations he made for each entity. This is known as tidy data format or long data format. This is just to illustrate how the data are laid out. We could scroll down and see some of the other dogs listed there, and you can see each dog takes up four rows because they sniffed four entities.

So when we do a repeated measures design and we want to use the linear model, we have the same participants in all conditions, and so the scores across those
conditions are going to correlate. As we've seen, the experimental effect is kind of embedded within the individual. Thinking back to the lecture on the beast of bias, we also know that there's an assumption that the residuals in the linear model are independent. Now clearly this kind of design, other things being equal, violates that assumption, because we would expect the errors now to actually be correlated. The errors when we're predicting scores for an alien being sniffed or a mannequin being sniffed are related to each other because they come from the same dogs. They can't be fully independent, because how much a dog vocalizes when sniffing one entity is going to affect the number of vocalizations they make when they sniff another, because dogs will vary in their natural vocalizations. So we'd expect, other things being equal, that the assumption of independent residuals is violated.

So we need to adjust the model slightly to estimate this dependency, and how we end up doing that is by essentially making the parameters a bit more complex than they were before. Let's simplify the model and assume that entity is a single predictor. In actual fact entity will be split into three dummy variables or contrast variables, but let's pretend that's not the case, just to keep things simple. So let's imagine we're just predicting vocalizations from a single predictor of entity, as we've seen loads and loads of times before. Ordinarily we'd have an intercept (what's the level of vocalizations when the predictor is zero?) and we'd also have a parameter estimate attached to the predictor variable of entity. Now all that's changing in this kind of design is that we potentially can allow either of these parameters to vary across individuals. So notice that instead of being called beta zero it's called beta
zero j, and the j represents the different dogs. So what beta zero j is saying is that beta zero can be different for each dog. It's not a single value, it's potentially a value that can vary from dog to dog. The way that ends up panning out is that beta zero j is actually made up of two terms. One is our regular beta zero, our regular overall level of vocalizations, but we also have this extra term in the model, which we've labeled u, which estimates the variability in the intercept across different dogs.

Now if none of that makes any sense, it doesn't really matter. All I want you to grasp is the fact that we can let our intercept incorporate information about the fact that data come from the same participants. In particular, what we're doing here is allowing the overall level of vocalizations to be different across different dogs. We're saying beta zero is not a single value for all dogs, it's a value that varies from dog to dog, and in doing so we accommodate the repeated measures design.

We can also take a further step, although we don't have to. Remember earlier on I said that it looks like the experimental effect differs for different dogs: for Professor Nose it looked like the training had had quite an effect, whereas for the Venerable Dr Waggy not so much. We can also incorporate this into the model. We can incorporate the fact that the effect of entity might be different for different dogs, and that's what this beta one j represents. We don't always fit this more complicated version of the model, but we can if we have enough data. So again the parameter estimate attached to entity is now made up of two components. It's made up of the component that we've worked with all term long, which is a beta one that applies to the entire sample and is a single value, but again we have this u value, which represents the variability in that parameter estimate across different dogs or
different individuals, if you were testing humans. So we've got a beta one that now incorporates the fact that beta one can vary from individual to individual. Again, it's factoring in that the experimental effect can be different for different dogs. Like I said, you don't particularly have to understand or process any of that to run the model, I'm just explaining how the repeated measures element of the design gets factored into the linear model.

So there are a few approaches to repeated measures designs. There's a simpler approach, which is the approach that we go for on this module, and there's a more complex approach. The simple approach is to assume something called sphericity. This is an added constraint on the model, and if we can assume this, estimate it and correct for it, then we can just run a model as we've run other models and forget about the fact we've got a repeated measures design, really. The other alternative is more akin to what I was just explaining on the previous slide, which is that we fit a slightly different kind of model known as a multi-level growth model, or just a multi-level model if you're not looking at things over time. What this does is allow us to explicitly model different kinds of dependency in the error term of the model. Again, we don't teach you the multi-level model approach, which is the more complicated and more flexible approach. We just teach you the "let's assume sphericity, estimate it and correct for it" approach.

So what is this sphericity thing? Well, sphericity is all about the differences between conditions. What we have here is a table of differences. The first column, alien minus human, is the difference in the number of vocalizations when sniffing an alien compared to when sniffing a human. If we look, for example, at Milton, there's a difference of one. So what we're saying here is when we look at the difference
between how many vocalizations Milton made when sniffing an alien versus sniffing a human, there was a difference of one. If we look back to previous slides, we can see he made eight vocalizations for an alien and seven for a human, so that is indeed a difference of one. That's where that difference comes from. We can also see, if we look at alien compared to mannequin, eight for an alien and one for a mannequin, that's a difference of seven. Alien versus shape-shifter: eight for an alien, six for a shape-shifter, so he has a difference of two there.

Coming back to this slide, then, we saw that Milton has a difference of one between alien and human, a difference of seven for alien and mannequin, and a difference of two for alien and shape-shifter. We can do the same for the other conditions. Look at the difference between the vocalizations for human and mannequin: that's six. For human and shape-shifter, that's one. For mannequin and shape-shifter, that's minus five. So all these columns represent difference scores: the difference in the number of vocalizations between pairs of conditions, or pairs of entities. When we've done that for every single dog, we get this lovely table of numbers that I'm sure is exciting everyone.

Now, if within each of these columns we worked out the variance in these differences, so if we worked out the variance across these scores, we find it's 5.27. If we do the same for the differences between alien and mannequin, what's the variance of these scores? It's 4.29. The assumption of sphericity is the assumption that these variances are the same, or roughly the same. So the variance of the differences between conditions should be more or less the same. For the two columns I've picked, it looks like sphericity does kind of hold: for alien versus human we've got a variance of 5.27, for alien versus mannequin we've got a variance of 4.29. They're roughly the same. However, if we now come on to look at alien versus shape-shifter
there, when we look at the variance of those differences it's huge in comparison. We get a value of about 25, so that's about five times more than the previous two columns. This would be an example of sphericity not being true, because the variance for alien versus shape-shifter is five times greater than the variance for alien versus human. So that's what sphericity is all about. It's basically saying that if we take these differences in scores and we calculate the variance for each of those columns of differences, all these values should be roughly the same. Hopefully you can see that for these data that is definitely not the case, so these data violate this assumption of sphericity.

So, the assumption of sphericity. We've just seen what it is: it's that the differences between pairs of conditions should have equal variances. If we work out the differences between scores in different pairs of experimental conditions and work out the variances, they should all be equal. We can estimate sphericity using a couple of methods: the Greenhouse-Geisser estimate and the Huynh-Feldt estimate (which I'm probably pronouncing terribly, terribly wrong). They are two different ways of estimating the degree to which you have sphericity, but in both cases, if the value is one you have perfect sphericity. In other words, the assumption is perfectly met. Values less than one mean that sphericity is violated to some degree or other: the further away from one, the bigger the violation. So we can use the Greenhouse-Geisser estimate, or the other one, to work out the degree to which we've violated the assumption of sphericity. Happy days.

So how do we do that? Well, this is the equation for the Greenhouse-Geisser estimate and, as you can see, it is basically horrible. We don't really want to be doing that by hand. The reason I put this up here is just to show you that when you're in your darkest, darkest moments, struggling with R and wondering why the hell you are bothering to do it, then
look at this equation and think, well, that's why I'm doing it: because it means I don't have to work out equations like this by hand. So you can estimate sphericity with some hideous equation that will take you through a space-time vortex into the bowels of hell, or you can let R do it for you and go and have a piña colada instead. So let's all choose that option. I mean, it doesn't have to be a piña colada, it can be a cup of tea if you prefer.

So, having estimated sphericity for us, what R can do, if we let it, is multiply the degrees of freedom for the F statistic by one of the two estimates. In general we use the Greenhouse-Geisser estimate because it's the more conservative of the two. And that's it, really. We estimate how much we've violated sphericity by, and then we correct the degrees of freedom by that amount. Given that the estimate of sphericity quantifies how much we have deviated from perfect sphericity, what we're effectively doing is correcting the degrees of freedom by the degree to which we have a problem. Now remember that when sphericity is perfect these estimates will be one, which means the degrees of freedom don't get corrected at all. When you don't have perfect sphericity, the degrees of freedom will get smaller, because they'll be multiplied by a value less than one, and by getting smaller it means it's harder for the test statistic, the F statistic, to be significant. So essentially you're making your test slightly more conservative.

So the bottom line here is that when you have a repeated measures design (you can probably ignore the last few slides in their entirety, but I've got to do something to fill up the 50 minutes) you should just routinely apply the Greenhouse-Geisser correction and forget about sphericity, basically. I mean, you can't forget about it entirely, because I might put a question about it in the exam, but other things being equal, apply Greenhouse-Geisser and forget about it. Sphericity
really melts my brain, but I have Greenhouse-Geisser to help me on my way. Adjust the degrees of freedom and sphericity goes away.

So how do we fit the model, then? Well, just like in the previous week, we can use the afex package and the aov_4 function, and we use it in pretty much the same way. The only difference is that with a between-groups design, or an independent design, we specified a term here where the repeated measures predictor was actually just the value one. So we're just changing one little part of how we specify the model, which is this part down here where we basically list the repeated measures predictors, and we have to also list the variable that represents the ID of the participant. In this case the ID of the participant is the dog name, so we put that in there, and the repeated measures predictor (we've only got one of them on this occasion) was entity, so that's what goes in there. The rest of how we specify the model is exactly the same as when we used this function in a previous lecture. So we're predicting vocalizations from the variable entity, and that's it. It'll automatically set contrasts, and you can get a built-in interaction plot, which we looked at in the previous lecture, but you don't get parameter estimates, you don't get diagnostic plots, and you can't do robust methods.

We don't teach this, so fall asleep for the next 30 seconds, but if you're interested, if you want a more flexible approach, you can use the lme4 package and a function called lmer. It's a trickier option, and we don't teach it to you because it's very easy to get into very sticky territory very quickly, but I just want to point out this is akin to using lm, which we've used for other linear models. So you would manually set your contrasts, you can get parameter estimates, you can get diagnostic plots, and you can, with a bit of difficulty, get some robust methods as well. But we're going to stick with afex for the purpose of this module.

So we
can set some contrasts if we want to. Although afex will automatically set contrasts for you, we can override them. So basically what I'm saying here is we can override the automatic contrasts. If the dog training has been successful, then what we would expect to happen is that the sniffer dogs should make more vocalizations when sniffing aliens versus non-aliens. So our first contrast could very straightforwardly just look at the two alien conditions versus the two non-alien ones. We'd lump together alien and shape-shifter, because they're both aliens but one's in humanoid form and one's not, and we could mash together human and mannequin, because they are things in humanoid form that are not aliens. So that could be our first contrast: alien versus non-alien.

Then we'd have to subdivide these two chunks. First of all, we've got a chunk with alien and shape-shifter, so we would need to split that apart. We'd have a second contrast comparing alien and shape-shifter. Then we also need to split apart the human and mannequin group, so we could do that with a third contrast, which ignores alien and shape-shifter and compares human to mannequin. So that's the set of contrasts that would fit this experiment.

Using the rules for coding, which we covered in the lecture on contrast coding, we would get contrasts something like this. Contrast one compares the aliens, so alien and shape-shifter get the same contrast value, against human and mannequin, which have minus values because they're being compared against the other two. In contrast two we're comparing alien against shape-shifter, so notice one has a positive weight, one has a negative weight, and human and mannequin are ignored, so they get zeros. For contrast three, alien and shape-shifter are ignored, so they get zeros, and we set weights for human and mannequin. Again notice one's positive and one's negative, which
means they're being compared. You can look at where those weights come from by revising the contrast lecture.

So, in terms of the overall model summary, this is what we get out of afex. We get our effect of entity. These are our degrees of freedom, which you can see I have reported down here. This is our F statistic. We get an effect size, the generalized eta squared, and a p value of 0.063, which is technically not significant if we use the usual criterion. So we'd conclude that the entity sniffed did not have a significant effect on the number of vocalizations by our sniffer dogs. But this is a really small sample, so this would have been a massively underpowered study, and if we look at our effect size, our generalized eta squared, it's telling us that the effect is reasonably big, because in fact the entity sniffed accounts for about a third of the variance in vocalizations. That's a pretty big amount. So even though we had a small sample and a non-significant result, it looks like the effect is actually quite strong. I mean, we wouldn't want to make too many great claims based on a sample of eight, but there's something worth investigating there, perhaps in another study.

So how would we interpret this? Well, with the non-significant main effect, if we were just going to go with p values and ignore the effect size, we're basically saying that these four means are, statistically speaking at any rate, the same. We're saying there's no difference between this pattern of means that we see there.

If we look at the contrasts, we've got three contrasts. Again, if you were going to take the p value at face value on its own, you probably shouldn't look at these contrasts, because you shouldn't break down a non-significant effect. But if you're placing some weight on the effect size, you might still be interested in seeing what the effect sizes are for these contrasts. And again it looks like,
if we kind of look at the contrast that compares vocalizations to aliens versus non-aliens the estimate itself is 2.75 and that is significant so you know with again massive caveats on this it does look like there's something going on when dogs sniff aliens versus when they sniff non-aliens when we compare their vocalizations between aliens and shape-shifters it's not a significant effect when we look at it between humans and mannequins the effect is basically zero highly highly non-significant so it looks like there might if we were going to replicate the study uh with a bigger sample what we'd be looking at is the fact that it seems like dogs can maybe distinguish between aliens and non-aliens they can't distinguish between an alien and a shape-shifting alien you know because they probably smell the same they just look different so that's reasonable and they can't distinguish between a human and a mannequin because remember they've been they've been trained to sniff aliens so they've not been trained to sniff uh to kind of vocalize when they sniff humans they've been trained to vocalize when they're sniffing aliens so that's a condition where they weren't sniffing any aliens so it's not that surprising that they gave the same number of vocalizations if we'd had no prior hypotheses we could also look at post-hoc tests again you generally wouldn't do this when you've got a non-significant initial result but just to show you that you can and this compares all different conditions to each other so alien versus human and alien versus mannequin they give us significant differences in the number of vocalizations that seems to suggest that dogs vocalize more when they're sniffing an alien compared to a human and a mannequin when we compare aliens with shapeshifters we get non-significant difference when we compare humans with mannequins and humans with shape shifters we get non-significant differences and when we compare vocalizations when sniffing a mannequin and a shape-shifter 
we get a non-significant difference so again the pattern here seems to suggest that sniffer dogs do make more vocalizations when sniffing an alien compared to a human and a mannequin but not when they're sniffing any other kind of combination of things now when we use afex we can't get diagnostic plots so we don't really know whether our model uh you know meets the assumptions or not but we can routinely apply a robust test instead we can do this with a function known as rmanova in the WRS2 package when we do this we basically get a robust f statistic with associated degrees of freedom and a p-value which turns out to be non-significant so the robust test backs up the idea that the overall test is non-significant again with the caveat that this p is based on an enormously small sample we can also get robust post hoc tests which basically we're looking at whether the p value is smaller than the p critical value and if it is then that test is significant helpfully this final column just tells us whether the significance is true or false so if it says true it means significant difference if it says false means non-significant difference and for all of these robust post-hoc tests it's saying non-significant difference which suggests the training hasn't been that successful but again massive caveat over the small sample days you're back good news the sniffer dogs did well although the results were not significant the sample size was you know pretty small and the effect sizes are looking strong i'm optimistic we have intercepted another transmission it's more bad news i'm afraid oh no show me lazink dirt mighty leader i have returned ah tunic lane how's the shape-shifting training going exciting news i can do the hands look i've got human hands but i'm having a bit of difficulty with the face now i can't stop my face from being green why are you here ah well i have intercepted a report from the humanoids take a look the humans plan to use sniffer dogs to detect us but 
their early research shows non-significant effects good good given our natural beautiful odor of putrefying bog water i am surprised by this but a p-value never lies we can make things harder for those disgustingly cute fluffy dogs by masking our smells with human pheromones and the scent of foxes dogs love foxes quick quick tunic find me some humans and foxes that we might rub our bodies against them as you command mighty leader i told you not good news they don't like dogs what kind of evil creatures are we dealing with hmm masking their alien smells with scents interesting but i'm not sure it's going to fool our sniffer dogs let's do some more research okay so the aliens are planning to use scents to make it harder for our sniffer dogs to sniff them out now if we were going to do some research to to look at whether this was you know actually a viable plan well first of all since our last experiment we've managed to train a lot more sniffer dogs so we've now got 50 of them instead of eight so that's a slightly better sample and what we're going to do now is we're going to get them again to predict uh sorry to sniff different entities but this time we're just going to use three we're going to use a human a shape-shifting alien and you know a lizard form alien and we're also gonna vary whether the entity was masked with no scent whether their smell was masked with human pheromones whether their uh scent was masked with fox pheromones so the dogs sniff all combinations of entities and scent masks so each dog will sniff three humans one human has no masking scent one has a human masking scent and the other one has a fox masking scent they'll sniff three shape shifters one with no masking scent one with a human scent one with a fox scent and three aliens one with no masking scent one with a human and one with a fox masking scent so each sniffer dog sniffs nine entities in all again we'd randomize or counterbalance the order in which they sniff them and again 
the outcome is a number of vocalizations during the one minute sniff so this is a factorial design we've now got two predictors the entity they sniff and the scent mask that was used now the model here gets really really complicated uh feel free to fall asleep at this point again if we simplify things by ignoring the fact that both entity and scent mask will be represented by dummy variables in the actual model so entity we've got three groups that'll end up there are three conditions that'll end up being two dummy variables and scent mask also three group conditions so that would end up being represented by two dummy variables as well and for the interaction you'd have both you'd have the entity double uh the entity dummy variables multiplied by the scent dummy variables so you'd end up with four dummy variables representing the interaction but let's ignore that for ease if we pretend that entity is represented by a single predictor and scent is represented by a single predictor and the interaction is represented by a single predictor just to make things slightly easier we're basically saying the same thing as before in that in its most complex form we can say that the intercept is made up of two terms it's made up of an intercept like an overall intercept that kind of applies to the whole group but also a term representing the variability in intercepts across different dogs across the 50 different dogs we can also say that the effect so the parameter estimate associated with entity so you know the the effect of entity again there's like an overall group level effect but we can factor in that there's variability in that effect across the different dogs we can do that for scent as well we can say there's an overall effect of scent but we're going to model the fact that the effect of scent might be variable across different dogs and we can also do for the interaction so we can say the effect of the interaction there's an overall kind of effect but we can also 
model the fact that there's variability in the effect of the interaction across dogs this is a very very complicated model you wouldn't necessarily fit this version of the model because it's got a lot of terms in it uh but at a bare minimum if you've got repeated measures design what we're effectively saying is that the intercept can vary across different dogs so we're just acknowledging the overall levels of vocalizations can vary from dog to dog and that's a kind of a kind of a basic way of factoring in the repeated measures nature of the design so here are the data they look very similar to before so we've got our first dog takes up nine rows this time because each dog contributes nine scores so these are the nine vocalizations for that first dog the first three of them relate to when they were sniffing aliens the first one was an alien with a fox scent applied the second one was an alien with a human scent applied and third was an alien with no scent applied next three scores represent when that same dog was sniffing shapeshifters again with a fox scent human scent and no scent and the final three scores for that dog represent when the dog was sniffing a human again once with the fox scent once with the human scent once with no scent so each dog's contributing nine scores and you know you can we can scroll through these data to see there are like different dogs they all contribute nine scores this is what the data look like so when there is no scent no masking scent at all the pattern we see is the number of vocalizations look quite similar when they're sniffing a uh an alien so a shape shifter or a lizard alien and they're quite a lot higher than when they're sniffing a human when a human scent is applied to the entity then again it looks like there's a lot more vocalizations when they're sniffing some kind of alien than when they're sniffing a human but when a fox scent is applied the vocalizations are really quite similar for all three of these entities so it 
looks like what's going on here potentially is that this the the human scent doesn't really make any difference at all looks to give the same pattern of results to when no scent is applied but the fox scent looks like it might make a difference but it mainly makes a difference in um creating more kind of vocalizations for uh when they're sniffing a human which you know is kind of not what the aliens want what the aliens want is they want the dogs to make fewer vocalizations when they're sniffing an alien not more vocalizations when they're sniffing a human so the fox scent when it's applied to an alien doesn't seem to actually make much difference but let's see how that's borne out by the model so now we've got a factorial design so we have a main effect of entity a main effect of scent mask and the interaction between entity and scent mask all of these effects are significant we've corrected all of them using the greenhouse-geisser method and you can see all the p-values very small we've also again got these generalized uh eta squared values as well which give us an idea of the size of each of the effects so how do we interpret all of these well first of all repeat the mantra from previous lecture it's not sensible to interpret main effects in the presence of a significant interaction so the fact we have a significant interaction means we can to all intents and purposes ignore the effect of entity and ignore the effect of scent mask because they're not that interesting given that we know that each of those effects is moderated by the other so what we can say from this is well basically report the interaction so we can say the interaction effect suggests that the effect of entity on vocalizations was significantly moderated by what scent the entity was wearing and we've got our degrees of freedom for that interaction effect we've got our f statistic for that interaction effect and we've got our p value we've also got an effect size of generalized eta squared so the 
effect size suggests the interaction accounts for about 24% of the variance in vocalizations not accounted for by the other variables so and that's a pretty decent amount so we saw when we looked at factorial designs that we can use a simple effects analysis to break down this interaction so first off let's have a look at the effect of entity within the type of scent so if we break it down and look at the effect of entity when no scent was used when human scent was used when fox scent was used what's the pattern of significance that we find what's the pattern of results that we find and in fact unhelpfully we find that when no mask was when no scent was used there's a highly significant effect of entity when a human mask was used scent mask was used highly significant effects of entity and when a fox scent mask was used also a highly significant effect of entity so basically the effect of entity was significant in all three masking conditions so that doesn't help us too much what we can you know what we could see from the graph is that the the uh the effect of entity was smaller when a fox scent was used but it's still significant still highly highly highly significant so this isn't really that helpful so let's do the simple effects the other way around maybe that will be more helpful so what's the effect of scent within entity so we're flipping this around to say do we get an effect of scent mask uh when a human is sniffed do we get an effect of scent mask when shapeshifter is sniffed and when alien is sniffed so when a human is sniffed the scent that was worn has a highly significant effect when a shape shifter was sniffed the effect of scent is highly significant and when an alien was sniffed the effect of scent was highly significant so again it's not tremendously helpful because basically um the the scent worn had a highly significant effect for all of the entities so what do we you know it's not it's not a particularly helpful finding so what do we do in this kind of 
situation what we can do we could do some post hoc tests across the interaction now the trouble is we've got nine uh different means and if we look at each combination of pairs of those nine means so if we compared every mean to every other mean we'd end up with 36 post hoc tests and it would be a bewildering befuddle of tests that's quite hard to uh to kind of unpick so we can be more selective with the tests that we do so for example we might choose to kind of look within each scent so do post hoc tests within each scent so within uh within no scent being used we do post hoc tests across these three means so we'll end up getting three tests here human versus shapeshifter human versus alien shape-shifter versus alien then we could move on to masking scent and do the same thing so look at the effect the post hoc tests for entity within when a human scent was worn and then look at post hoc tests uh for entity when fox uh fox was the scent worn so then we get another three tests here another three tests here so we'd end up with nine post hoc tests rather than 36 which is more manageable and more interpretable we'd have to do less harsh kind of corrections for the number of tests that we've done so that is one strategy you could adopt when we do this this is exactly what i've done here so we're looking at when no scent was applied what are the differences what are the differences in the vocalizations between the different entities so when we compare human and shape shifter significant difference human and alien significant difference shapeshifter and alien significant difference so in other words all the means are significantly different when no scent is worn dogs make more vocalizations uh when they're sniffing aliens significantly more when they're sniffing aliens compared to shapeshifters significantly more when they're sniffing aliens compared to humans and significantly more when they're um sniffing shape-shifters compared to humans so when no scent is 
worn dogs can discriminate reliably between aliens shape shifters and humans what about when a human scent is worn again comparing human shapeshifter highly significant human and alien highly significant shape-shifter and alien highly significant so when a human smell is worn to mask the scent again the dogs can still very well discriminate between the entities so they give more vocalizations significantly more vocalizations when sniffing an alien compared to a shape-shifter when sniffing a shape-shifter compared to a human and when sniffing an alien compared to a human so again they're very successfully discriminating what about when fox scent is worn well when fox scent is worn the human shapeshifter comparison is highly significant so that suggests that they give significantly more vocalizations when they're sniffing a shape-shifter than when they're sniffing a human i'm getting the direction of these effects from the graph by the way and also the uh sign of the estimate when they're sniffing a human compared to an alien there's also a highly significant effect so they give more vocalizations when sniffing an alien than a human however when they're sniffing aliens and shape-shifters there's no significant difference so all the fox odor does the fox scent does is it stops the dogs being able to differentiate aliens from shape shifters but it doesn't stop them being able to differentiate shape shifters from humans and aliens from humans so they're still going to be very successful at discriminating between aliens and humans they're just not going to be able to to tell whether it's a shape-shifting alien or a lizard alien so again this is not really what the aliens want basically our sniffer dog training has been very very successful when we look at it in a bigger sample so looking at that profile of effects we can say that when no scent is worn mean vocalizations differ between all entities aliens elicit significantly more vocalizations than both 
shape-shifters and humans and shape-shifters elicit significantly more vocalizations than humans happy days this pattern's exactly the same when a human scent is worn happy days when fox scent is worn things change a bit there are still significantly more vocalizations when sniffing aliens and shape-shifters compared to humans but the difference between shape-shifters and aliens is not significant so it messes up the dog's ability to distinguish space lizards from humanoid space lizards so to sum up the scents don't distract the sniffer dogs from detecting aliens compared to humans but it does confuse them when they're trying to distinguish a space lizard in humanoid form from a space lizard in space lizard form now if you want to test uh look at diagnostic plots and robust models in a factorial repeated measures design well there is the face of a sad spaniel on your screen because you can't do it uh and so that is the face that when you're trying to do that you're going i want some diagnostic plots now you're going to end up looking like a sad spaniel who is full of sadness because can't be done lazink dirt bad news bad news our invasion has been detected how is that possible the p-value was greater than 0.05 and we smeared ourselves with human sweat and fox poo apparently the effect size was large the sniffer dogs were better at detecting us than we realized and the scents they didn't work and the dogs were so cute that we wanted to cuddle them and live in harmony with the puppies and the humans cute cute but puppies are hairy and tonguey and licky and disgusting oh but they're so nice to cuddle why don't you try one yes hmm i like its ears perhaps we should try to assimilate rather than invade but how can we as a species we are socially awkward aloof uncool logically minded hermits we cannot exist undetected in the humanoid world they are so sociable well i think there might be a way the humans have a shortage of people that have all of our qualities they're 
aloof socially awkward hermit-like logically-minded they're called statistics lecturers
Info
Channel: Andy Field
Views: 3,623
Rating: 4.88 out of 5
Keywords: statistics, repeated measures, GLM, ANOVA, simple effects, post hoc tests, afex, rstats
Id: UjXqZiA2XWc
Length: 59min 50sec (3590 seconds)
Published: Fri Nov 13 2020