Automated Identification of Germline de novo Mutations in Family Trios: A Consensus-Based Informatic Approach.
Shadrina, Mariya; Kalay, Özem; Demirkaya-Budak, Sinem; LeDuc, Charles A; Chung, Wendy K; Turgut, Deniz; Budak, Gungor; Arslan, Elif; Semenyuk, Vladimir; Davis-Dusenbery, Brandi; Seidman, Christine E; Yost, H Joseph; Jain, Amit; Gelb, Bruce D.
Affiliation
  • Shadrina M; Mindich Child Health and Development Institute and the Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York, NY, USA.
  • Kalay Ö; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
  • Demirkaya-Budak S; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
  • LeDuc CA; Department of Pediatrics, Columbia University, New York, NY, USA.
  • Chung WK; Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA.
  • Turgut D; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
  • Budak G; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
  • Arslan E; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
  • Semenyuk V; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
  • Davis-Dusenbery B; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
  • Seidman CE; Division of Cardiovascular Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
  • Yost HJ; Howard Hughes Medical Institute, Chevy Chase, MD, USA.
  • Jain A; Molecular Medicine Program, University of Utah, Salt Lake City, UT, USA.
  • Gelb BD; Velsera Inc, 529 Main St, Suite 6610, Charlestown, MA, USA.
bioRxiv; 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38559260
ABSTRACT
Accurate identification of germline de novo variants (DNVs) remains challenging despite rapid advances in sequencing technologies and in methods for analyzing the data they generate; putative solutions often involve ad hoc filters and visual inspection of identified variants. Here, we present a purely informatic method for identifying DNVs by analyzing short-read genome sequencing data from proband-parent trios. Our method evaluates variant calls generated by three genome sequence analysis pipelines that use different algorithms (GATK HaplotypeCaller, DeepTrio and Velsera GRAF), exploring the assumption that a requirement of consensus can serve as an effective filter for high-quality DNVs. We assessed the efficacy of our method by testing DNVs identified using a previously established, highly accurate classification procedure that partially relied on manual inspection and used Sanger sequencing to validate a DNV subset comprising less confident calls. The results show that our method is highly precise and that applying a force-calling procedure to putative variants further removes false-positive calls, increasing the precision of the workflow to 99.6%. Our method also identified novel DNVs, 87% of which were validated, indicating that it offers a higher recall rate without compromising accuracy. We have implemented this method as an automated bioinformatics workflow suitable for large-scale analyses without the need for manual intervention.
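
The consensus requirement described in the abstract can be illustrated with a minimal sketch. This is not the authors' workflow: the abstract does not specify how calls are normalized, compared, or force-called, so the `Variant` tuple, the `consensus_dnvs` function, the `min_callers` parameter, and the toy coordinates below are all hypothetical. The sketch only shows the core idea of keeping a candidate DNV when every caller (or at least a chosen number of callers) reports it.

```python
from collections import Counter
from typing import Dict, NamedTuple, Optional, Set


class Variant(NamedTuple):
    """Hypothetical key for one candidate DNV in the proband."""
    chrom: str
    pos: int   # 1-based position
    ref: str
    alt: str


def consensus_dnvs(
    calls_by_caller: Dict[str, Set[Variant]],
    min_callers: Optional[int] = None,
) -> Set[Variant]:
    """Keep candidates reported by at least `min_callers` callers.

    By default every caller must agree (strict consensus). Each caller
    contributes a set, so a variant is counted at most once per caller.
    """
    if not calls_by_caller:
        return set()
    required = len(calls_by_caller) if min_callers is None else min_callers
    support = Counter(v for calls in calls_by_caller.values() for v in calls)
    return {v for v, n in support.items() if n >= required}


# Toy input: invented coordinates, for illustration only.
shared = Variant("chr1", 1000, "A", "G")
two_of_three = Variant("chr2", 2000, "C", "T")
single = Variant("chr3", 3000, "G", "A")

toy_calls = {
    "GATK HaplotypeCaller": {shared, two_of_three},
    "DeepTrio": {shared, two_of_three, single},
    "Velsera GRAF": {shared},
}

strict = consensus_dnvs(toy_calls)                  # all three callers agree
relaxed = consensus_dnvs(toy_calls, min_callers=2)  # any two callers agree
```

With the toy input, `strict` contains only the variant reported by all three callers, while `relaxed` also keeps the one reported by two. In practice, variant representations would need to be normalized before comparison so that equivalent calls from different pipelines match.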

Full text: 1 Collection: 01-international Database: MEDLINE Language: En Journal: BioRxiv Year: 2024 Document type: Article Country of affiliation: United States
