Publication Date:
2020-08-25
Description:
Motivation: Ancestral haplotype maps provide useful information about genomic variation and biological processes. Reconstructing the descendent haplotype structure of homologous chromosomes, particularly for large numbers of individuals, can help with characterizing the recombination landscape, elucidating genotype-to-phenotype relationships, improving genomic predictions and more. Inferring haplotype maps from sparse genotype data is an efficient approach to whole-genome haplotyping, but it is a non-trivial problem. A standardized approach is needed to validate whether haplotype reconstruction software, conceived population designs and existing data for a given population provide accurate haplotype information for further inference.
Results: We introduce SPEARS, a pipeline for the simulation-based appraisal of genome-wide haplotype maps constructed from sparse genotype data. Using a specified pedigree, the pipeline generates virtual genotypes (known data) with genotyping errors and missing data structure. It then mimics analysis in practice, capturing sources of error due to genotyping error, imputation and haplotype inference. Standard metrics allow researchers to assess different population designs and to determine which features of haplotype structure or regions of the genome are sufficiently accurate for analysis. Haplotype maps for 1,000 outcross progeny from a multi-parent population of maize are used to demonstrate SPEARS.
Availability: Freely available on the web at https://github.com/maizeatlas/spears
Supplementary information: Supplementary data are available at Bioinformatics online.
Print ISSN:
1367-4803
Electronic ISSN:
1460-2059
Topics:
Biology, Computer Science, Medicine