Blacklisting variants common in private cohorts but not in public databases optimizes human exome analysis.
Maffucci, Patrick; Bigio, Benedetta; Rapaport, Franck; Cobat, Aurélie; Borghesi, Alessandro; Lopez, Marie; Patin, Etienne; Bolze, Alexandre; Shang, Lei; Bendavid, Matthieu; Scott, Eric M; Stenson, Peter D; Cunningham-Rundles, Charlotte; Cooper, David N; Gleeson, Joseph G; Fellay, Jacques; Quintana-Murci, Lluis; Casanova, Jean-Laurent; Abel, Laurent; Boisson, Bertrand; Itan, Yuval.
Affiliation
  • Maffucci P; St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065.
  • Bigio B; Immunology Institute, Graduate School, Icahn School of Medicine at Mount Sinai, New York, NY 10029.
  • Rapaport F; Department of Medicine, Division of Clinical Immunology, Icahn School of Medicine at Mount Sinai, New York, NY 10029.
  • Cobat A; St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065.
  • Borghesi A; Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, 75015 Paris, France.
  • Lopez M; Imagine Institute, Paris Descartes University, 75015 Paris, France.
  • Patin E; St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065.
  • Bolze A; Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM U1163, Necker Hospital for Sick Children, 75015 Paris, France.
  • Shang L; Imagine Institute, Paris Descartes University, 75015 Paris, France.
  • Bendavid M; School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland.
  • Scott EM; Human Evolutionary Genetics Unit, Pasteur Institute, 75015 Paris, France.
  • Stenson PD; CNRS UMR2000, 75015 Paris, France.
  • Cunningham-Rundles C; Center of Bioinformatics, Biostatistics and Integrative Biology, Pasteur Institute, 75015 Paris, France.
  • Cooper DN; Human Evolutionary Genetics Unit, Pasteur Institute, 75015 Paris, France.
  • Gleeson JG; CNRS UMR2000, 75015 Paris, France.
  • Fellay J; Center of Bioinformatics, Biostatistics and Integrative Biology, Pasteur Institute, 75015 Paris, France.
  • Quintana-Murci L; Helix, San Carlos, CA 94070.
  • Casanova JL; St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065.
  • Abel L; St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065.
  • Boisson B; Rady Children's Institute for Genomic Medicine, Department of Neurosciences, University of California, San Diego, La Jolla, CA 92093.
  • Itan Y; Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff CF14 4XW, United Kingdom.
Proc Natl Acad Sci U S A; 116(3): 950-959, 2019 Jan 15.
Article in English | MEDLINE | ID: mdl-30591557
ABSTRACT
Computational analyses of human patient exomes aim to filter out as many nonpathogenic genetic variants (NPVs) as possible, without removing the true disease-causing mutations. This involves comparing the patient's exome with public databases to remove reported variants inconsistent with disease prevalence, mode of inheritance, or clinical penetrance. However, variants frequent in a given exome cohort, but absent or rare in public databases, have also been reported and treated as NPVs, without rigorous exploration. We report the generation of a blacklist of variants frequent within an in-house cohort of 3,104 exomes. This blacklist did not remove known pathogenic mutations from the exomes of 129 patients and decreased the number of NPVs remaining in the 3,104 individual exomes by a median of 62%. We validated this approach by testing three other independent cohorts of 400, 902, and 3,869 exomes. The blacklist generated from any given cohort removed a substantial proportion of NPVs (11-65%). We analyzed the blacklisted variants computationally and experimentally. Most of the blacklisted variants corresponded to false signals generated by incomplete reference genome assembly, location in low-complexity regions, bioinformatic misprocessing, or limitations inherent to cohort-specific private alleles (e.g., due to sequencing kits, and genetic ancestries). Finally, we provide our precalculated blacklists, together with ReFiNE, a program for generating customized blacklists from any medium-sized or large in-house cohort of exome (or other next-generation sequencing) data via a user-friendly public web server. This work demonstrates the power of extracting variant blacklists from private databases as a specific in-house but broadly applicable tool for optimizing exome analysis.
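The core idea in the abstract, flagging variants that are frequent in an in-house cohort yet absent or rare in public databases, can be illustrated with a minimal sketch. This is not the ReFiNE implementation: the function names, the variant-key format, and both thresholds (`min_cohort_freq`, `max_public_af`) are assumptions chosen for illustration, since the abstract does not state the cutoffs used. Cohort frequency is computed here as the fraction of exomes carrying the variant, which is also an assumption.

```python
from collections import Counter


def build_blacklist(cohort_exomes, public_af,
                    min_cohort_freq=0.01, max_public_af=0.01):
    """Return variants frequent in the cohort but absent/rare publicly.

    cohort_exomes: list of sets, one set of variant keys per exome.
    public_af: dict mapping variant key -> public allele frequency;
               a missing key is treated as "absent" (frequency 0).
    Both thresholds are hypothetical defaults, not values from the paper.
    """
    n_exomes = len(cohort_exomes)
    if n_exomes == 0:
        return set()
    # Count how many exomes carry each variant.
    carriers = Counter(v for exome in cohort_exomes for v in exome)
    return {
        variant
        for variant, count in carriers.items()
        if count / n_exomes >= min_cohort_freq
        and public_af.get(variant, 0.0) < max_public_af
    }


def apply_blacklist(exome, blacklist):
    """Drop blacklisted variants from one exome's candidate set."""
    return exome - blacklist


# Toy data: variant keys as (chrom, pos, ref, alt) tuples.
v_a = ("1", 1000, "A", "G")   # in 3 of 4 exomes, absent publicly
v_b = ("2", 2000, "C", "T")   # in 1 of 4 exomes, common publicly
v_c = ("3", 3000, "G", "A")   # in 1 of 4 exomes, absent publicly
cohort = [{v_a, v_b}, {v_a, v_c}, {v_a}, set()]
public = {v_b: 0.20}

blacklist = build_blacklist(cohort, public, min_cohort_freq=0.5)
remaining = apply_blacklist({v_a, v_c}, blacklist)
```

In this toy cohort only `v_a` meets both criteria, so it is the single blacklisted variant and `v_c` survives filtering. In practice a check that known pathogenic mutations are not removed, as the authors report performing on 129 patients, would be needed before trusting any chosen thresholds.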

Full text: 1 | Collection: 01-international | Database: MEDLINE | Main subject: Genetic Variation / Software / Genome, Human / Sequence Analysis, DNA / Databases, Nucleic Acid / Exome | Type of study: Etiology_studies / Incidence_studies / Observational_studies / Risk_factors_studies | Limits: Female / Humans / Male | Language: En | Journal: Proc Natl Acad Sci U S A | Year: 2019 | Document type: Article
...