Misconceptions on Missing Data in RAD-seq Phylogenetics with a Deep-scale Example from Flowering Plants.
Eaton, Deren A R; Spriggs, Elizabeth L; Park, Brian; Donoghue, Michael J.
Affiliation
  • Eaton DAR; Department of Ecology and Evolutionary Biology, Yale University, PO Box 208106, New Haven, CT, 06520, USA.
  • Spriggs EL; Department of Ecology and Evolutionary Biology, Yale University, PO Box 208106, New Haven, CT, 06520, USA.
  • Park B; Department of Ecology and Evolutionary Biology, Yale University, PO Box 208106, New Haven, CT, 06520, USA.
  • Donoghue MJ; Department of Ecology and Evolutionary Biology, Yale University, PO Box 208106, New Haven, CT, 06520, USA.
Syst Biol; 66(3): 399-412, 2017 May 01.
Article in En | MEDLINE | ID: mdl-27798402
Restriction-site associated DNA (RAD) sequencing and related methods rely on the conservation of enzyme recognition sites to isolate homologous DNA fragments for sequencing, with the consequence that mutations disrupting these sites lead to missing information. There is thus a clear expectation for how missing data should be distributed, with fewer loci recovered between more distantly related samples. This observation has led to a related expectation: that RAD-seq data are insufficiently informative for resolving deeper scale phylogenetic relationships. Here we investigate the relationship between missing information among samples at the tips of a tree and information at edges within it. We re-analyze and review the distribution of missing data across ten RAD-seq data sets and carry out simulations to determine expected patterns of missing information. We also present new empirical results for the angiosperm clade Viburnum (Adoxaceae, with a crown age >50 Ma) for which we examine phylogenetic information at different depths in the tree and with varied sequencing effort. The total number of loci, the proportion that are shared, and phylogenetic informativeness varied dramatically across the examined RAD-seq data sets. Insufficient or uneven sequencing coverage accounted for similar proportions of missing data as dropout from mutation-disruption. Simulations reveal that mutation-disruption, which results in phylogenetically distributed missing data, can be distinguished from the more stochastic patterns of missing data caused by low sequencing coverage. In Viburnum, doubling sequencing coverage nearly doubled the number of parsimony informative sites, and increased by >10X the number of loci with data shared across >40 taxa. Our analysis leads to a set of practical recommendations for maximizing phylogenetic information in RAD-seq studies. 
[hierarchical redundancy; phylogenetic informativeness; quartet informativeness; Restriction-site associated DNA (RAD) sequencing; sequencing coverage; Viburnum.].
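The contrast the abstract draws between the two sources of missing data can be illustrated with a toy simulation. This is a minimal sketch, not the authors' simulation framework: the four-tip tree, the per-edge and per-tip dropout probabilities, and all function names are hypothetical choices made for illustration. Dropout from mutation-disruption is modeled as a loss on an edge that removes the locus from every tip below it, while dropout from low coverage is modeled as an independent loss at each tip.

```python
import random

def tips(node):
    """Return tip names under a node (a tip is a str, a clade is a tuple)."""
    if isinstance(node, str):
        return [node]
    return [t for child in node for t in tips(child)]

def disruption_dropout(node, p_edge, rng):
    """Tips missing one locus when each edge disrupts the cut site with
    probability p_edge (assumed model: a loss removes the whole clade below)."""
    missing = set()
    if isinstance(node, str):
        return missing
    for child in node:
        if rng.random() < p_edge:
            missing.update(tips(child))  # clade-wide, phylogenetically structured
        else:
            missing.update(disruption_dropout(child, p_edge, rng))
    return missing

def coverage_dropout(node, p_tip, rng):
    """Tips missing one locus when each sample fails independently
    with probability p_tip (assumed model of low sequencing coverage)."""
    return {t for t in tips(node) if rng.random() < p_tip}

def sister_co_missing(loci, a, b):
    """Among loci missing in a or b, the fraction missing in both."""
    either = [m for m in loci if a in m or b in m]
    if not either:
        return 0.0
    return sum(1 for m in either if a in m and b in m) / len(either)

if __name__ == "__main__":
    rng = random.Random(42)
    tree = (("A", "B"), ("C", "D"))  # hypothetical four-tip tree
    n_loci = 5000
    structured = [disruption_dropout(tree, 0.2, rng) for _ in range(n_loci)]
    # 0.36 matches the per-tip loss rate implied by p_edge = 0.2 on this tree
    stochastic = [coverage_dropout(tree, 0.36, rng) for _ in range(n_loci)]
    print("disruption:", sister_co_missing(structured, "A", "B"))
    print("coverage:  ", sister_co_missing(stochastic, "A", "B"))
```

In this sketch, sister tips share missing loci more often under the disruption model than under independent per-tip dropout at the same overall loss rate, which is the kind of phylogenetic signal in missing data that the abstract describes.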
Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Phylogeny / Magnoliopsida / Models, Biological Type of study: Guideline Language: En Journal: Syst Biol Journal subject: BIOLOGY Year of publication: 2017 Document type: Article Affiliation country: United States