How to Design and Analyze Experiments Using an Augmented Design

Hello and welcome to the Plant Breeding and Genomics community of practice webinar, An Introduction to the Augmented Experimental Design. My name is Heather Merk, and I'm the content coordinator for PBG and your host today. For those of you unfamiliar with the Plant Breeding and Genomics community of practice, a.k.a. PBG, I invite you to explore our training resources at www.extension.org/plant_breeding_genomics. I also invite you to subscribe to our newsletter, PBG News, to stay up to date with PBG webinars and other news; you can sign up at pbgworks.org. During today's webinar, Dr. Jennifer Kling of Oregon State University will provide an introduction to augmented designs and perform sample analyses using data from her meadowfoam breeding program. Please note that today's webinar is intended to be an introduction to the augmented design; we are planning a future webinar focused on more complex analyses. Today's presentation, as well as Dr. Kling's dataset and SAS program, are available at extension.org/pages/60430. Please now join me in welcoming today's presenter, Dr. Jennifer Kling. Dr. Kling is a plant breeder who teaches experimental design to graduate students in agriculture at Oregon State University. She has considerable experience in the application of recurrent selection methods to improve yield, stress tolerance, and disease and insect resistance in cross-pollinating crops. She enjoys analyzing data and was the data curator for the Barley Coordinated Agricultural Project. Dr.
Kling received her bachelor's degree in crop science from Oregon State University and her master's in agronomy from the University of Nebraska-Lincoln. She obtained her PhD in genetics, with a minor in statistics, from North Carolina State University. I'll now turn things over to you, Jenny. Yes, thank you for that introduction, Heather. I'll start by giving an overview of the presentation. Today I'll first cover the essential features of augmented designs and when we use them in plant breeding. We'll talk about some of the design options, and, as Heather mentioned, today we'll look more closely at a simple case where we have one-way control of heterogeneity. I'll give a specific example from my breeding program; we'll talk about how to do the randomization and field plan, how to analyze the data with SAS, and how to interpret results. I'll also give a brief overview of variations on the basic design that you may hear about in future webinars, and finally we'll talk about some software that's available and further references. Augmented designs were first introduced by Dr.
Federer in 1956. The essential features are that check varieties, the controls, are replicated in a standard experimental design, and you generally have two or more controls. You also have a number of new treatments, in our case genotypes, that are not replicated or have fewer replicates than the checks, and so they augment the standard design. It's helpful to look at what was in place before augmented designs were widely adopted, so I've chosen an example here from breeding for Striga resistance in sorghum at ICRISAT. This is from a publication in 1983; they then used a systematic arrangement of checks. Striga in sorghum is particularly notable for its variation; it's quite influenced by local soil conditions, and so it's essential to have some sort of local environmental control in the field. In the initial stages, in the observation nurseries that were unreplicated, they would include a susceptible check that was repeated at constant intervals throughout the field. After that they would go to replicated trials, where they would have one particular susceptible check surrounded by eight test plots, and they would use that susceptible check to adjust the observations on the test plots. In the final, more advanced stage of screening they would actually use a susceptible check in every other plot. You can see that this is quite resource intensive, in that a lot of the area in the field is devoted to check plots that you cannot use for your selection. So, the advantages of the augmented design: one is that you can use fewer check plots than would be required for designs with systematic repetition of a single check. Additionally, you get an estimate of standard error, since you have several checks in the trial, that can be used for comparisons; you can make comparisons among new genotypes or between your new genotypes and check varieties. You also have a means to adjust your new genotypes for heterogeneity in the field. There are some
disadvantages: you are still spending some resources on production and processing of your control plots; you still need those control plots. Also, you have relatively few degrees of freedom for your experimental error, which means that you may not have sufficient power to detect differences among treatments if it's a trait that's not highly heritable. Also, unreplicated experiments are inherently imprecise, no matter how sophisticated the design. So when would we use them? In early generations, when seed is limited, we may have too many new entries to replicate, or it may be difficult to maintain homogeneity within our blocks because we have so many entries to look at. They can also be used in on-farm research: growers may prefer to grow a single replication, and they may not be able to accommodate all of the entries in the experiment. So basically you have to make several decisions. You choose a design that's appropriate for controlling the heterogeneity in the experimental area. You can choose to use one-way blocking with a randomized complete block design, or you may choose to use an incomplete block design, or you may choose to control heterogeneity in two directions using a Latin square or an incomplete two-way blocking design such as a Youden square or a row-column design. So you have a number of design options, and it's important to remember that the underlying design refers to the assignment of checks to the experimental units; all augmented designs are incomplete with respect to the new entries. I think in some cases people may believe they need to use, say, a lattice design because they have a lot of new entries, but actually the important point is to consider how many checks you have in the experiment. It's also possible to replicate these designs in different environments. At that point, once you've committed to doing replication, you may need to consider other options such as lattice designs and compare the relative
efficiency compared to those designs, and also consider your particular objectives. Okay, so I'm going to discuss one particular augmented block design example, the very simplest case, where we're only controlling heterogeneity in one direction. The checks occur once in every block, and the new entries occur just once in the experiment, so this would be called an augmented randomized complete block design. The example I'm going to give is from the crop that I breed, which is called meadowfoam. This is a native plant that was first produced as a crop in Oregon in 1980. It's grown for its seed oil, which has very novel long-chain fatty acids that give it exceptional oxidative stability. Presently it's used in cosmetics, but it also has potential uses as a fuel additive, as an additive in vehicle lubricants, or in pharmaceutical products. You can see on the right there that the seeds are quite small. Meadowfoam is a very good rotation crop in the Willamette Valley of Oregon. It's a winter annual, so its production cycle fits in very well with the weather and rainfall patterns that we have here. The plant and seed meal are high in glucosinolates (it's a member of the Brassicales), and the glucosinolates are thought to have some phytosanitary properties. Another advantage is that it fits very well into the grass seed production systems that are predominant here; you can see in the picture below the windrows of meadowfoam being harvested using the same equipment that you would use for grass seed production. There are a number of pollinators that can be used for meadowfoam. Commercially, the growers would rent honey bee hives, but for the breeding program we've recently figured out that we have a couple of different options. We can also use blue orchard bees, which work very well in small isolations or in small cages. We can also use bluebottle flies in the greenhouse to do small recombinations or, as you see here in this picture, to self large numbers of plants. The
particular germplasm I was working with: this was in 2005, and I had inherited some diverse breeding populations from a retired breeder. We regenerated the seed of those populations in the greenhouse and selfed a large number of plants. After two generations we had S2 lines; we transplanted those lines to the field, and they were allowed to outcross. This was the first time that we had tried using this testcross system in meadowfoam, so I didn't have all the details worked out, and we didn't have sufficient seed for replicated progeny trials. Our goal was essentially to form a broad-based population for recurrent selection, so we wanted to screen the S2 testcrosses and then select the best S2 parents for recombination. We actually did a couple of different experiments that year, but I'll talk about one of them. One of the augmented designs included 50 new S2 testcross families that had never been tested before. We had three check varieties. The first two you see there, Ross and OMF183, came from our elite population, OMF58, and represented two different cycles of selection of that elite population. We also had a variety, Starlight, which had been commercially released and had been derived from the same germplasm pool that we were evaluating, so we wanted to be sure to compare our new entries to Starlight to ensure that they would be superior to what was already available. So, as I said, we decided to block in one direction. We had these three checks, so we played around a little bit and chose to use six blocks, realizing that would give us 10 degrees of freedom for error, which is pretty much the minimum that you need to be able to detect differences unless you have a very highly heritable trait. We had the 50 new entries, so the total number of plots, including the replicated checks and the new entries, was 68. This came out to about 12 entries per block, but since we only had the 50 new entries, the last block only had 8 plots, and that was okay because one nice thing
about the augmented designs is that you don't have to have equal numbers of new entries in every block. So here's our statistical model. I initially ran this as an ANOVA and included a residual plot here to make a particular point. You can see along the zero axis there that these are our new entries, and they actually have no residual; they're contributing nothing to error. The other points that you see represent the check plots. Essentially, the estimate of error you're getting would be the same if you didn't include the new entries at all. You're essentially doing a design on the check entries and then using the information there to adjust for the block effects and adjust the values of your new entries. So in your field plan you need to be sure to include a sufficient number of checks and replicates to get a reasonable estimate of experimental error, so you can detect differences among your varieties. As in any design, you'd want to arrange your blocks along the field gradient to get the most possible variation among blocks and minimize the variation within blocks. You have to assign each of your checks at random within each block, and after that you would assign your new entries at random to the remaining plots. There's also an online tool to generate a randomization. I just use a random number generator in Excel, which is not hard to do, but that online tool is available. Okay, so this shows the field layout. We have our six blocks along this dimension here, and within each block we have 12 plots, which I usually number in a serpentine fashion. In block six we didn't have the full 12; we had eight plots. So the first thing you would do is go ahead and assign your checks at random to those experimental units: you put all three checks one time in each block, and then you would randomly assign your remaining entries to the
remaining plots. And this is a picture showing what a typical meadowfoam progeny trial looks like; this is just shortly after flowering. For data collection, we collected information on flowering date, plant height, disease resistance, lodging, seed yield, and thousand-seed weight. We didn't have a capability at that time for large-scale screening of oil content; we do have that now, and that's a routine measurement. Of course, it's our primary interest. For this example I will just look at the variable thousand-seed weight. It has a relatively high heritability, and it's positively correlated with both oil content and oil yield. Okay, one important decision that has to be made before you begin the analysis is whether your genotypes are going to be fixed or random. It's generally assumed that our blocks are random, because they represent a larger group of potential blocks and we want to be able to make inferences beyond the particular sample of blocks in the experiment. For the genotypes, we know that the checks are fixed effects; we want to make specific comparisons with those varieties. However, the new entries could be fixed or random. Most commonly they're considered to be random because they've never been evaluated before. However, in this case you could argue that they would be fixed: you want to make specific comparisons and select the winners, and also they didn't come from a common population. So we'll go ahead and look at the analysis under both of these assumptions. Okay, so let's start out just considering our genotypes to be fixed. This is showing the format you could use for data entry. This is the simplest way, where you actually put the data directly into your program. So in SAS you would need to assign a name to that data set, and then you would use an INPUT statement to indicate the order in which your data values occur. You'll notice that we have a dollar sign with the name variable because that is a character
variable, and then you would simply list your data lines. There are other options for data input in SAS, including using the Import Wizard; you could use INFILE statements, or you could set up a SAS library, but this is just the simplest approach. So for the first analysis I'm going to use the MIXED procedure in SAS, assuming that our entries are fixed. We need a CLASS statement to indicate that block and entry are both class variables. In the MODEL statement with PROC MIXED you only include your fixed effects, so the MODEL statement is saying that thousand-seed weight equals entry, and then in this case we have blocks as our random effect. You'll notice that in this case, since the entries are all fixed, we don't really need to distinguish, at least at this point, between our checks and our new entries. If you're going to do all possible comparisons among your entries, you need to take some kind of precaution to ensure that you don't inflate type I error, so in my LSMEANS statement there I've used a Tukey adjustment to allow for all possible comparisons. Another option, since we are primarily interested in comparing the new entries to Starlight, which was entry number 91, is that you could use the control-upper option, and in that case I'm essentially doing a Dunnett's test, because I want to compare everything to a control. So to control the experimentwise type I error I used a Dunnett adjustment. SAS is quite convenient in that you can sort your output in any way that you want to. I am quite comfortable with Excel spreadsheets, so what I generally do is simply create new data sets and then export those data sets to Excel using the Export Wizard, and then I sort them or make selections or do whatever I want to from there. So the ODS OUTPUT statement is essentially creating new data sets for you. I'm creating one that includes the adjusted means and then another one that shows the results from the multiple comparison tests. So here's the result for that particular
analysis. You see up at the top there the estimates of the variance components for blocks and residuals, the random effects, and then below that is the log likelihood. In the sample SAS program that I provided, for example, if you wanted to find out if blocks were important, you could run a reduced model without blocks and then take the difference between the log likelihoods from the two models and use that to test the significance of the block variance, to see if it's greater than zero. Generally speaking, we would want to make adjustments to our means if we get any kind of estimate for our block variance that's greater than zero, so we will go ahead and do that. We have a test of fixed effects here for entries, which shows that there are highly significant differences among our entries, and at this point we don't know if that's due to differences among our checks or among our new entries. So this is just showing that, after I've exported the multiple comparison test results to an Excel file, I can open that up and look. For example, the first line here shows the comparison between entry 1 and Starlight, which is entry 91, so this would be the difference between those observations, and then over on the far right here is the probability value for the Dunnett test. Now, this is quite a conservative test, so if you are more concerned about making a type II error, which is throwing out a winner, than you are about making a type I error, you also have here in the output a column that shows results from a standard LSD type of test. We see, for example, that one entry down here is significantly better than Starlight using a Dunnett test, and then there would be others that we could consider using just a standard LSD test. I think it's important to make the point that there are different standard errors when you're comparing, say, one check versus another check (that's the most precise comparison), new
entries versus a check, or new entries with each other; if you're comparing entries in the same block, that comparison will be made with more precision than if you're looking at new entries in different blocks. Okay, so for the second analysis we want to consider our new entries to be random effects. There I used the SAS code from this reference here, Wolfinger et al., 1997. You could also just do this manually in a spreadsheet. Essentially, you're creating an indicator variable called new, to which we give a value of 1 when we have a new entry and a value of 0 when we have a check. Then we also create another variable, which I'm calling entryc here, which collectively is 999 for all of the new entries but maintains the actual entry number for the check varieties; for example, OMF183 is entry number 90. This allows us to make comparisons among our checks and then a collective comparison between the mean for the new entries and any of the checks. If you didn't want to do that manually, you could use this little routine here to create a new data set in that format. Initially, then, I just ran this to get an ANOVA, using the GLM procedure. The difference between GLM and PROC MIXED is that in GLM you include all of the factors in your analysis in your MODEL statement, whereas in PROC MIXED you would only include your fixed effects in your MODEL statement and your random effects in your RANDOM statement. So in this case I'm just doing an ANOVA; the SOLUTION option here indicates that I want a solution for the fixed effects. The output from that analysis, the ANOVA, is shown below here, and we can see that we do have significant variation among our blocks. The checks actually were fairly comparable to each other, so there's not significant variation among the checks, but the probability value for the new × entry term here indicates that we have significant variation among our new entries.
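The recoding step described in the talk can be sketched in a few lines. This is a minimal Python illustration only, not the SAS routine from Wolfinger et al. (1997) that the presenter used. The function name `recode_entry`, the constant name `NEW_ENTRY_CODE`, and the entry number 89 for the third check are assumptions made for the example; the talk itself only states that OMF183 is entry 90, Starlight is entry 91, and all new entries share the code 999.

```python
# Sketch of the recoding step: add an indicator `new` and a collapsed
# entry code `entryc`, so checks stay distinct and new entries are pooled.
NEW_ENTRY_CODE = 999  # collective code for every new entry (value from the talk)


def recode_entry(entry, check_entries):
    """Return (new, entryc) for a single entry number.

    new    = 0 for a check, 1 for a new entry
    entryc = the entry's own number for a check, 999 for a new entry
    """
    if entry in check_entries:
        return 0, entry
    return 1, NEW_ENTRY_CODE


# Entries 90 (OMF183) and 91 (Starlight) are named in the talk;
# 89 for the third check is an assumed number, used only for illustration.
checks = {89, 90, 91}

for entry in (1, 50, 90, 91):
    new, entryc = recode_entry(entry, checks)
    print(entry, new, entryc)
```

Pooling the new entries under one code gives the fixed-effect term one level per check plus one level for the new entries as a group, which is consistent with the three degrees of freedom mentioned for the test of fixed effects.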
Then we go on to use the MIXED procedure to analyze the entries as random effects. The analysis would look like this: we'd have our CLASS statement here with block, entry, and then the new fixed-effect variable there, and in the MODEL statement we would include only our fixed effect, that being entryc, and then we could request the solution for the fixed effects. In the RANDOM statement we would include the two random components and also request a solution for the random effects. Then we would request means for our check entries, and, as I did before, I would create a new data set from the output, essentially the solutions for the random effects; I'm calling them the eBLUPs. You could then use the Export Wizard to export the eBLUPs to an Excel file or CSV file or whatever you want to use. So here are the results for the second analysis. You see we have the estimates of our variance components at the top, for blocks, the new entries, and then the residual, and there is a reasonable variance among our new entries. Again we then have the fit statistics, which we could use if we wanted to compare one model against another, and at the bottom there we have the tests of fixed effects. This includes all three controls and also the average of all of the new entries, and that's why we have three degrees of freedom. You can see here that there actually isn't any significant variation among the controls. So this is looking at the Excel file I created by outputting the eBLUPs. At the bottom there you see all of the estimates for the new entries. For example, the 0.2991: these are deviations from the midpoint, so that one was better; it had a larger thousand-seed weight than the average, and so on. If you would prefer to look at these on more of the original scale, you could go back up to your solution for fixed effects and use your
intercept value; you would simply add all of these estimates on to your intercept value to get the BLUP estimates that you could use for selection, for example. Okay, now there are a number of variations that we don't have time to go into today, but I will give you sort of a lead, and there'll be more discussion of this in future webinars. For example, if you wanted to control variation in two dimensions, you could use a modified Youden square; the references are given there. You could also use a row-column design if you wanted to. One modification of interest is the modified augmented designs you see at the bottom. These actually use a systematic placement of controls, which has some desirable features. The type 2 modification was developed particularly for the case where you have long rectangular plots, as you see in the picture below. Those are some barley test plots, and the barley breeding program here is using these MAD type 2 designs in the preliminary stages of their breeding program. Other possibilities would include an augmented split block. Say, for example, you wanted to test not only new entries but also look at adaptation to different stresses, so you wanted to have a well-watered and a drought-stressed treatment; those types of factorial combinations of treatments could be analyzed with the augmented split block. If you had a large number of checks you wanted to include in the experiment, you could use an augmented lattice square or an alpha lattice, and references are provided there. An interesting paper was published recently on augmented p-rep designs. The feature of this is that you would actually replicate some but not all of your new entries, and then you might do this across multiple sites. So, say, at one site you might replicate the first ten entries, and then at another site the next ten entries, and so on, so that by the time you collect the information from all
sites, everything would have been replicated at least a second time. Then, because you have replication of the entries, it's not necessary to include all of the controls that we typically include in augmented designs. There's also information published on how to combine information across sites, as you can see in the Federer, Reynolds, and Crossa reference there. Additionally, some of you may be interested in combining trend analysis with augmented designs, and I would refer you to the supplemental list of references, to a book that came out in 2003 with a couple of articles by Federer that show how to combine trend analysis and augmented designs. If you're actually going to use augmented designs across locations, you also have the option to use lattice designs, in which case you could consider your sites to be complete replications and then have one rep of a lattice design within each location. The advantage of this is that you're then using information from all of your plots to adjust means for field effects. I don't know this with certainty, but it seems likely that you could have greater precision to detect differences among your entries in that situation. However, there are other situations where you might prefer to go ahead and use augmented designs across locations. One desirable feature is that you have a very flexible arrangement of new entries in your field plans. I read one paper on participatory plant breeding, for example, where you could actually have growers recommending different augmented varieties at different sites. You could also obtain an estimate of experimental error from each location, so you could use this to assess your site quality or to do analysis of G×E interactions, and at the end of the day you have more flexibility in deciding how you want to do your across-site analyses and which sites you want to combine together. I
searched around for software available for analyzing augmented designs. Agrobase has a nice feature for looking at the MAD type 2 designs, and that's what the barley program here is using. There was some software available from the Indian Agricultural Research Institute, and they also had some published SAS macros. CIMMYT had a SAS macro available online; I couldn't get it to work, as some of the SAS syntax was outdated, but if you were going to develop your own macro, that would be a starting point. As I said, I have provided a full list of references in the supplement; these are the two references that I used most extensively for this presentation. I'd also like to acknowledge contributions: the financial support from USDA-CSREES (we had a special grant for meadowfoam research at the time this experiment was conducted) and the OMG Meadowfoam Oil Seed Growers Cooperative. We also get funding through the Paul C. Berger Professorship Endowment. I'd like to thank the Crop and Soil Science department here and the meadowfoam staff who helped with our research. Okay, I'm happy to take some questions. Thank you very much, Jenny, and again, I very much appreciate you being able to soldier on. It probably went a little faster than it would have if I'd had the updated presentation. No problem. So we have about 15 to 20 minutes to answer your questions today. For those of you who missed the beginning of the presentation, you can use the question box on your screen to type in questions and hit return. If the question box is closed, you can click the small plus sign next to the word "Question" to open it up. We'll be reading the questions out loud and will answer as many as we have time for. As Jenny mentioned, and as was mentioned during the webinar, the presentation, dataset, and SAS program, as well as the full list of references and resources that Jenny mentioned towards the end, will all be available online at extension.org, and at this
point in time the resources, except for the webinar recording, should all be there, so please feel free to make use of them. We're making them publicly available so that they are there for anyone to use, and we hope that you find them useful. So the first question, Jenny, is one that you and I have talked about a little bit, and I'll just touch on it briefly: whether there's an R package available for augmented designs. But I could let you go first. Yeah, I'm not aware of one, but you've looked into that more than I have, so why don't you go ahead and answer that question. Yes, I was going to say that to my knowledge there is not an R package available for augmented designs. One of our future objectives is to develop an R script based on Jenny's presentation today, and we will make that publicly available as well. At Plant Breeding and Genomics we have a lot of interest in R, so we would like to share whatever resources are available. We've got a few more questions here. One question, Jenny, and I'm sure you'll have to qualify this, is: what's the minimum required number of checks? Yeah, that really depends on what crop you're working with, what traits you're working with, and how heritable they are. The rule of thumb is that you need to have 10 degrees of freedom for error, so you have to balance your number of blocks and number of checks to get that value of 10, but that's just a minimum value. For example, in corn, I know from having worked a lot as a corn breeder that you need a certain minimum plot size, and you also need a certain number of reps to be able to pick up differences among families, say half-sib or full-sib families, for yield. So I think even if you had quite a lot of checks, you would potentially not be able to pick up differences for yield in corn with a single rep. In that case there's no way to really account for the fact that you don't have multiple
observations to estimate your means for the families. So I guess the ballpark would be 10, but in some cases that won't be sufficient, and in other cases, where you have a highly heritable trait, you almost wouldn't even need to have ten degrees of freedom. Great, thank you, Jenny. We've had a couple more requests for the link to the presentation, so I've typed that into the chat box. Okay, and there's one comment here suggesting that genetic reps and large families may help make up, I believe, for some... Yeah, I think perhaps they're thinking about the case where you'd be taking data on individual plants, and that would then be constituting your family. For meadowfoam we're looking at a plot-mean basis, or else I'm just not quite understanding the comment. Yeah, sorry. Okay, so the actual comment is that genetic replication in large families, that is, genetic replicates, can substitute for technical replicates in detecting trait differences and increasing... Okay, a suggestion. All right, I'll have to think about that. Again, that may be referring to the case where you're taking measurements on individual plants within each family. Sure, yeah. My apologies, I got us on a little bit of a digression there. We had another nice comment from someone saying that they know of a package in R called DiGGer for designing p-rep trials. Well, that would be of interest. That's still an augmented design, but basically without the checks, because you're replicating some of the entries so that they give you the estimate of standard error that you need. I haven't worked with those, but they looked very interesting when I looked at that paper. So thank you very much for that comment. Another question is: are you aware of other researchers at OSU using augmented experimental designs in their breeding
work at OSU?

Well, the barley program is the one that I know about. I know, for example, that when I was working in Africa they were using augmented designs for cassava. In that particular case there was a cassava mosaic virus epidemic in East Africa, and they needed to move some resistant varieties from West Africa over to East Africa. But the new varieties had very different quality and growth properties, and in East Africa the farming systems were quite different, so they used augmented designs to evaluate potential new introductions in that area. That's one case I know where augmented designs were used very effectively. They weren't limited in seed, which is one of the reasons that we use augmented designs. What they wanted was to test in as many different farming systems and environments as they could. So that was one very nice application of augmented designs. I don't know of any other breeders here who are using augmented designs, but there may be some.

Great, thank you, Jenny. We've got some more technical questions here for you that you can do your best with. A nice question here: if we are testing a large number of entries at several locations, do we need the same checks across all locations, or can we use some common checks and some checks for specific locations?

I think you would have options. You're going to get an independent estimate of error within each location, so it would kind of make sense to do that. In fact, if you wanted to use additional checks within each location that had specific adaptation to that location, you could use those to get a better adjustment of your block effects within that environment. But you might have some broadly adapted varieties that you would use across all of your locations, and that would then be a nice way to tie together the information from the different sites. So that seems to me to
be a very reasonable approach.

Great. And then a question here about your breeding program: what acreage is required for your meadowfoam research?

We're a very small-scale program, because the current production of meadowfoam in Oregon is only about 5,000 acres. This is a crop that's been used because of its potential, and the real constraint for us is developing the markets for the crop. As I said, it's currently used in cosmetics, and the use of meadowfoam in cosmetic products is gradually increasing, but presently the other industrial applications are not in practice. So with limited acreage we have a very small-scale project. I'm usually at only maybe three acres or so per year, plus all of my isolated seed increases and so on.

Great. And we've got a nice comment here from the wheat breeding group at the University of Kentucky. They say that they have been using an augmented design to test yield in early generations, when they don't have large quantities of seed.

Right. In the most current version I had wanted to make that comparison. As I said, in corn I would rather have, say, 200 families with two reps per site than 400 families with a single replicate, just because I know that I wouldn't have very much chance of picking up detectable differences among my families with just a single rep. But in a clonally propagated crop, or a self-pollinating crop where you're using some sort of pedigree selection system, when you put your initial material out in the field that's just the first go at it. For example, in Oregon here we have a lot of difficulty with stripe rust in barley, so in the early generations the main objective may be simply to get rid of stripe rust-susceptible lines. But then they know they're going to advance that
material another generation and can do more in-depth screening in subsequent years. So that's really an ideal situation where, in order to test as many genotypes as possible, you might want to use a single rep in the first year of evaluation. So it depends on the crop and also on the heritability of the traits of interest.

Sure. And you've got a couple of nice thank-yous, from grad students in the Moose lab at Illinois as well as from the wheat breeding group at Kentucky. And we have another question here asking about outliers: does it make sense to remove outliers based on the residuals of unreplicated entries across multiple locations?

I don't know how you would do that, because, if you're doing an ANOVA analysis at least, you only have the residuals estimated for the check entries. The new entries don't have residuals; they're all zero. So I guess you'd have to have some means other than a typical residual plot to figure out what really is just an improbable observation. I really don't know what else to say about that, except that if you have something that's completely out of the ballpark then you may just have to throw it out. But I'm not quite sure what criteria you would use to decide that.

Great. Well, we have a little bit more time for questions if anyone has things that are still pressing. I will share the links again. It may also be a good idea to check which version of the presentation was posted.

Yes, and I did check. After we had the mix-up, when he was still on, I did update the presentation.

OK, so if they printed it out before the presentation they probably had the older version.

Yes, they would want to go back.

And we have one more question here. I don't know if you want to comment specifically on wheat and the number of checks that you may be looking at.
We've talked some about how it depends on the trait and the heritability of the trait. It would also depend on the block size that you want to use for different crops, and also on whether you have some sense of the environments that you're working in, the field heterogeneity that you have to account for. Once you've been working in an area you kind of get a sense of, well, if I go above, say, 20 entries in this particular crop at this particular site, I'm likely to have too large a block size to get good comparisons within blocks. So it would also depend on how big your blocks are and how you need to have checks representing that area.

OK, another thank-you here to you, and also a question about using augmented designs for QTL or association analysis. Are you aware of anything in that area?

Well, in the Barley CAP program some of the data that we received from the Oregon State breeding program was based on augmented designs. And the objective of the Hordeum Toolbox, the database that was developed through that project, was to do association analysis, to look for marker-trait associations. So I guess that would be one example. I don't know specifically of other examples, but that would be one case.

Sure. And you have a thanks from a former OSU barley grad who joined us today. Here's another possible suggestion for the number of checks that we may want to use: this person has read that you should have at least the square root of the number of new entries.

Yeah, I've seen that too. I think in the information from the Indian Agricultural Research Center, where I gave a reference for their software and so on, they do have the capability in their little package to help you estimate that using that criterion.

Great. And coming back to the question regarding QTL or association analysis, we have a comment here that
Fusarium head blight was scored using an augmented design for association or QTL analysis.

All right.

And another nice comment here, that augmented designs help us to increase our genotypic replication at the expense of our line replication.

Mm-hmm.

So thinking of perhaps an F2 family, having numbers of individuals from that F2 family to represent it, rather than the individual lines coming from the population. And another question: could an augmented design be used for selecting lines from wild accessions with small seed amounts, compared with checks?

Yeah, I think that's one possibility. In fact, again, in the materials online from the Indian research center, I forget the specific name, they also mentioned that as being one of the areas: where you're getting in new accessions from a gene bank or something like that and you don't have much available. So that would be a very appropriate use of augmented designs.

And one question that relates back to the design in our field space. The question is: across locations, field size restrictions may apply. How would you take that into consideration in your blocks?

So you're saying that maybe at one particular site you have a smaller area available than at other sites? I'm just trying to think off the top of my head here. I suppose if you're going to combine information across sites it would be desirable to have the same plot size, to have a common variance, but I'm not entirely sure about that.

Sure. I would think off the top of my head as well that you would want to maintain, say, the same plant spacing across locations.

I do remember from one of Federer's papers that the common variance was not essential, but that's all that I could say.

Sure.

So there's one reference there, that 2001 paper I think it is, where they combine information across sites, and that was a comment he
had, that that would not be essential, but I guess I don't quite understand why that is.

OK, and we have one more question here regarding the square-root guide. If you use the square-root guide, is it the number of reps of a check variety or the number of different genotypes?

Yeah, that's a good question. I don't actually know which one that refers to for that particular rule of thumb. I prefer to use the minimum number of degrees of freedom for your ANOVA as the criterion, in which case you can adjust your numbers of checks and reps to obtain that minimum number. So I'm not sure, for the other criterion, which one that refers to.

Sure. And it looks like we are just about out of time for questions today. Please join me again in thanking Jenny for a great presentation. I invite you all to join us for future webinars; we are planning to provide a part two on the augmented design in the future, hopefully in the next several months. If you'd like to stay up to date with webinars put on by the Plant Breeding and Genomics community of practice, I would encourage you to sign up for PBG News, our newsletter. You can sign up at pbgworks.org. And if you still have any questions that are particularly burning, you may email them to me at merk.9@osu.edu. Thank you so much for joining us today, and please don't forget to fill out the survey evaluation.
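The two rules of thumb raised in the Q&A (a minimum number of error degrees of freedom, and "at least the square root of the number of new entries") can be put into numbers with a short sketch. This is only an illustration and was not shown in the webinar. It assumes the simple augmented randomized complete block case, where each of c checks appears once in each of b blocks, so the ANOVA error has (c - 1)(b - 1) degrees of freedom. The function names and the default target of 10 degrees of freedom are hypothetical choices for the example, and the square-root function deliberately leaves open whether the result counts check varieties or check plots, since that was not resolved in the discussion.

```python
import math


def error_df(num_checks: int, num_blocks: int) -> int:
    """Error degrees of freedom when every check appears once in every block.

    Assumes an augmented randomized complete block layout: (c - 1)(b - 1).
    """
    if num_checks < 1 or num_blocks < 1:
        raise ValueError("need at least one check and one block")
    return (num_checks - 1) * (num_blocks - 1)


def min_blocks_for_df(num_checks: int, target_df: int = 10) -> int:
    """Smallest number of blocks giving at least target_df error df.

    The default of 10 is the ballpark mentioned in the Q&A, not a fixed rule.
    """
    if num_checks < 2:
        raise ValueError("at least two checks are needed to estimate error")
    # Solve (c - 1)(b - 1) >= target_df for the smallest integer b.
    return math.ceil(target_df / (num_checks - 1)) + 1


def sqrt_guide(num_new_entries: int) -> int:
    """Square root of the number of new entries, rounded up.

    Whether this counts check varieties or check plots is left open here.
    """
    if num_new_entries < 1:
        raise ValueError("need at least one new entry")
    # Integer ceiling of the square root, avoiding floating-point rounding.
    return math.isqrt(num_new_entries - 1) + 1


if __name__ == "__main__":
    for checks in (2, 3, 4, 5):
        blocks = min_blocks_for_df(checks)
        print(checks, "checks:", blocks, "blocks,", error_df(checks, blocks), "error df")
    print("square-root guide for 50 new entries:", sqrt_guide(50))
```

Working from the degrees-of-freedom side, as preferred in the answer above, lets the number of checks and the number of blocks be traded against each other, whereas the square-root guide returns a single number whose interpretation still has to be settled.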
Info
Channel: National Association of Plant Breeders
Views: 10,467
Rating: 4.8730159 out of 5
Keywords: PBG, eXension, statistics, augmented design, meadowfoam, plant breeding
Id: YyAPeYGOQEE
Length: 57min 56sec (3476 seconds)
Published: Thu Nov 10 2011