Publication Date:
2014-06-10
Description:
Background: The statistical genetics phenomenon of epistasis is widely acknowledged to confound disease etiology. In order to evaluate strategies for detecting these complex multi-locus disease associations, simulation studies are required. The development of the GAMETES software for the generation of complex genetic models has provided the means to randomly generate an architecturally diverse population of epistatic models that are both pure and strict, i.e. all n loci, but no fewer, are predictive of phenotype. Previous theoretical work characterizing complex genetic models has yet to examine pure, strict epistasis, which should be the most challenging to detect. This study addresses three goals: (1) Classify and characterize pure, strict, two-locus epistatic models, (2) Investigate the effect of model 'architecture' on detection difficulty, and (3) Explore how adjusting GAMETES constraints influences diversity in the generated models.
Results: In this study we utilized a geometric approach to classify pure, strict, two-locus epistatic models by 'shape'. In total, 33 unique shape symmetry classes were identified. Using a detection difficulty metric, we found that model shape was consistently a significant predictor of model detection difficulty. Additionally, after categorizing shape classes by the number of edges in their shape projections, we found that this edge number was also significantly predictive of detection difficulty. Analysis of constraints within GAMETES indicated that increasing model population size can expand model class coverage but does little to change the range of observed difficulty metric scores. A variable population prevalence significantly increased the range of observed difficulty metric scores and, for certain constraints, also improved model class coverage.
Conclusions: These analyses further our theoretical understanding of epistatic relationships and uncover guidelines for the effective generation of complex models using GAMETES. Specifically, (1) we have characterized 33 shape classes by edge number, detection difficulty, and observed frequency, (2) our results support the claim that model architecture directly influences detection difficulty, and (3) we found that GAMETES will generate a maximally diverse set of models with a variable population prevalence and a larger model population size. However, a model population size as small as 1,000 is likely to be sufficient.
Electronic ISSN:
1756-0381
Topics:
Biology, Computer Science