Search | PAHO/WHO Uruguay

1.

A genomic mutational constraint map using variation in 76,156 human genomes.

Chen, Siwei; Francioli, Laurent C; Goodrich, Julia K; Collins, Ryan L; Kanai, Masahiro; Wang, Qingbo; Alföldi, Jessica; Watts, Nicholas A; Vittal, Christopher; Gauthier, Laura D; Poterba, Timothy; Wilson, Michael W; Tarasova, Yekaterina; Phu, William; Grant, Riley; Yohannes, Mary T; Koenig, Zan; Farjoun, Yossi; Banks, Eric; Donnelly, Stacey; Gabriel, Stacey; Gupta, Namrata; Ferriera, Steven; Tolonen, Charlotte; Novod, Sam; Bergelson, Louis; Roazen, David; Ruano-Rubio, Valentin; Covarrubias, Miguel; Llanwarne, Christopher; Petrillo, Nikelle; Wade, Gordon; Jeandet, Thibault; Munshi, Ruchi; Tibbetts, Kathleen; O'Donnell-Luria, Anne; Solomonson, Matthew; Seed, Cotton; Martin, Alicia R; Talkowski, Michael E; Rehm, Heidi L; Daly, Mark J; Tiao, Grace; Neale, Benjamin M; MacArthur, Daniel G; Karczewski, Konrad J.

Nature ; 625(7993): 92-100, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders [1-4], but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.

Subject(s)

Human Genome, Genomics, Genetic Models, Mutation, Humans, Access to Information, Genetic Databases, Datasets as Topic, Gene Frequency, Human Genome/genetics, Mutation/genetics, Genetic Selection
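
The abstract above describes detecting depletions of variation relative to a mutational model. A minimal sketch of that idea follows, assuming a depletion score of the form (expected - observed) / sqrt(expected). The abstract does not state the formula, so this form, the function name `depletion_z`, and the window labels and counts are all illustrative assumptions rather than the published method.

```python
import math


def depletion_z(observed: int, expected: float) -> float:
    """Return a z-like depletion score (assumed form, not taken from the paper).

    Positive values mean fewer variants were observed than the mutational
    model expected, which is the signature of constraint.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (expected - observed) / math.sqrt(expected)


# Hypothetical genomic windows: (label, observed variants, expected variants).
windows = [
    ("window_a", 60, 100.0),
    ("window_b", 98, 100.0),
    ("window_c", 130, 100.0),
]

scores = {label: depletion_z(obs, exp) for label, obs, exp in windows}

# The window with the largest positive score is the most depleted of variation.
most_constrained = max(scores, key=scores.get)
```

Under this assumed form, a window with far fewer observed than expected variants scores highest, while a window with an excess of variants scores negative.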

2.

Unsupervised removal of systematic background noise from droplet-based single-cell experiments using CellBender.

Fleming, Stephen J; Chaffin, Mark D; Arduini, Alessandro; Akkad, Amer-Denis; Banks, Eric; Marioni, John C; Philippakis, Anthony A; Ellinor, Patrick T; Babadi, Mehrtash.

Nat Methods ; 20(9): 1323-1335, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37550580

ABSTRACT

Droplet-based single-cell assays, including single-cell RNA sequencing (scRNA-seq), single-nucleus RNA sequencing (snRNA-seq) and cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), generate considerable background noise counts, the hallmark of which is nonzero counts in cell-free droplets and off-target gene expression in unexpected cell types. Such systematic background noise can lead to batch effects and spurious differential gene expression results. Here we develop a deep generative model based on the phenomenology of noise generation in droplet-based assays. The proposed model accurately distinguishes cell-containing droplets from cell-free droplets, learns the background noise profile and provides noise-free quantification in an end-to-end fashion. We implement this approach in the scalable and robust open-source software package CellBender. Analysis of simulated data demonstrates that CellBender operates near the theoretically optimal denoising limit. Extensive evaluations using real datasets and experimental benchmarks highlight enhanced concordance between droplet-based single-cell data and established gene expression patterns, while the learned background noise profile provides evidence of degraded or uncaptured cell types.

Subject(s)

Small Nuclear RNA, Software, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Gene Expression Profiling/methods

3.

A structural variation reference for medical and population genetics.

Collins, Ryan L; Brand, Harrison; Karczewski, Konrad J; Zhao, Xuefang; Alföldi, Jessica; Francioli, Laurent C; Khera, Amit V; Lowther, Chelsea; Gauthier, Laura D; Wang, Harold; Watts, Nicholas A; Solomonson, Matthew; O'Donnell-Luria, Anne; Baumann, Alexander; Munshi, Ruchi; Walker, Mark; Whelan, Christopher W; Huang, Yongqing; Brookings, Ted; Sharpe, Ted; Stone, Matthew R; Valkanas, Elise; Fu, Jack; Tiao, Grace; Laricchia, Kristen M; Ruano-Rubio, Valentin; Stevens, Christine; Gupta, Namrata; Cusick, Caroline; Margolin, Lauren; Taylor, Kent D; Lin, Henry J; Rich, Stephen S; Post, Wendy S; Chen, Yii-Der Ida; Rotter, Jerome I; Nusbaum, Chad; Philippakis, Anthony; Lander, Eric; Gabriel, Stacey; Neale, Benjamin M; Kathiresan, Sekar; Daly, Mark J; Banks, Eric; MacArthur, Daniel G; Talkowski, Michael E.

Nature ; 581(7809): 444-451, 2020 May.

Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage [6]. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings [7]. This SV resource is freely distributed via the gnomAD browser [8] and will have broad utility in population genetics, disease-association studies, and diagnostic screening.

Subject(s)

Disease/genetics, Genetic Variation, Medical Genetics/standards, Population Genetics/standards, Human Genome/genetics, Female, Genetic Testing, Genotyping Techniques, Humans, Male, Middle Aged, Mutation, Single Nucleotide Polymorphism/genetics, Racial Groups/genetics, Reference Standards, Genetic Selection, Whole Genome Sequencing

4.

The mutational constraint spectrum quantified from variation in 141,456 humans.

Karczewski, Konrad J; Francioli, Laurent C; Tiao, Grace; Cummings, Beryl B; Alföldi, Jessica; Wang, Qingbo; Collins, Ryan L; Laricchia, Kristen M; Ganna, Andrea; Birnbaum, Daniel P; Gauthier, Laura D; Brand, Harrison; Solomonson, Matthew; Watts, Nicholas A; Rhodes, Daniel; Singer-Berk, Moriel; England, Eleina M; Seaby, Eleanor G; Kosmicki, Jack A; Walters, Raymond K; Tashman, Katherine; Farjoun, Yossi; Banks, Eric; Poterba, Timothy; Wang, Arcturus; Seed, Cotton; Whiffin, Nicola; Chong, Jessica X; Samocha, Kaitlin E; Pierce-Hoffman, Emma; Zappala, Zachary; O'Donnell-Luria, Anne H; Minikel, Eric Vallabh; Weisburd, Ben; Lek, Monkol; Ware, James S; Vittal, Christopher; Armean, Irina M; Bergelson, Louis; Cibulskis, Kristian; Connolly, Kristen M; Covarrubias, Miguel; Donnelly, Stacey; Ferriera, Steven; Gabriel, Stacey; Gentry, Jeff; Gupta, Namrata; Jeandet, Thibault; Kaplan, Diane; Llanwarne, Christopher.

Nature ; 581(7809): 434-443, 2020 May.

Article in English | MEDLINE | ID: mdl-32461654

ABSTRACT

Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes [1]. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.

Subject(s)

Exome/genetics, Essential Genes/genetics, Genetic Variation/genetics, Human Genome/genetics, Adult, Brain/metabolism, Cardiovascular Diseases/genetics, Cohort Studies, Genetic Databases, Female, Genetic Predisposition to Disease/genetics, Genome-Wide Association Study, Humans, Loss of Function Mutation/genetics, Male, Mutation Rate, Proprotein Convertase 9/genetics, Messenger RNA/genetics, Reproducibility of Results, Exome Sequencing, Whole Genome Sequencing

5.

Mitochondrial DNA variation across 56,434 individuals in gnomAD.

Laricchia, Kristen M; Lake, Nicole J; Watts, Nicholas A; Shand, Megan; Haessly, Andrea; Gauthier, Laura; Benjamin, David; Banks, Eric; Soto, Jose; Garimella, Kiran; Emery, James; Rehm, Heidi L; MacArthur, Daniel G; Tiao, Grace; Lek, Monkol; Mootha, Vamsi K; Calvo, Sarah E.

Genome Res ; 32(3): 569-582, 2022 Mar.

Article in English | MEDLINE | ID: mdl-35074858

ABSTRACT

Genomic databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, databases such as the Genome Aggregation Database (gnomAD) have focused on nuclear DNA and have ignored the mitochondrial genome (mtDNA). Here, we present a pipeline to call mtDNA variants that addresses three technical challenges: (1) detecting homoplasmic and heteroplasmic variants, present, respectively, in all or a fraction of mtDNA molecules; (2) circular mtDNA genome; and (3) misalignment of nuclear sequences of mitochondrial origin (NUMTs). We observed that mtDNA copy number per cell varied across gnomAD cohorts and influenced the fraction of NUMT-derived false-positive variant calls, which can account for the majority of putative heteroplasmies. To avoid false positives, we excluded contaminated samples, cell lines, and samples prone to NUMT misalignment due to few mtDNA copies. Furthermore, we report variants with heteroplasmy ≥10%. We applied this pipeline to 56,434 whole-genome sequences in the gnomAD v3.1 database that includes individuals of European (58%), African (25%), Latino (10%), and Asian (5%) ancestry. Our gnomAD v3.1 release contains population frequencies for 10,850 unique mtDNA variants at more than half of all mtDNA bases. Importantly, we report frequencies within each nuclear ancestral population and mitochondrial haplogroup. Homoplasmic variants account for most variant calls (98%) and unique variants (85%). We observed that 1/250 individuals carry a pathogenic mtDNA variant with heteroplasmy above 10%. These mtDNA population allele frequencies are freely accessible and will aid in diagnostic interpretation and research studies.

Subject(s)

Mitochondrial DNA, Mitochondrial Genome, Cell Nucleus/genetics, Mitochondrial DNA/genetics, Gene Frequency, Genome, Humans, Mitochondria/genetics, DNA Sequence Analysis
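
The abstract above distinguishes homoplasmic variants (present in all mtDNA molecules) from heteroplasmic ones (present in a fraction) and reports only variants with heteroplasmy of at least 10%. The sketch below illustrates that classification from read counts. The 10% reporting threshold comes from the abstract; the 95% homoplasmy cutoff, the function name `classify_mt_variant`, and the read counts are assumptions made for illustration, not the pipeline's actual code.

```python
from typing import Optional

REPORT_MIN = 0.10       # from the abstract: variants reported at heteroplasmy >= 10%
HOMOPLASMIC_MIN = 0.95  # assumption: fraction at or above which a call counts as homoplasmic


def classify_mt_variant(alt_reads: int, total_reads: int) -> Optional[str]:
    """Classify an mtDNA variant call by the fraction of reads carrying it.

    Returns None when the variant falls below the reporting threshold.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    fraction = alt_reads / total_reads
    if fraction < REPORT_MIN:
        return None  # too low to report; may be noise or NUMT-derived
    if fraction >= HOMOPLASMIC_MIN:
        return "homoplasmic"
    return "heteroplasmic"


# Hypothetical calls: (alternate-allele reads, total reads at the site).
calls = [(5, 100), (40, 100), (99, 100)]
labels = [classify_mt_variant(alt, total) for alt, total in calls]
```

Treating sub-threshold calls as unreported rather than as errors mirrors the abstract's point that low-fraction calls are where NUMT-derived false positives concentrate.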

6.

Author Correction: A genomic mutational constraint map using variation in 76,156 human genomes.

Chen, Siwei; Francioli, Laurent C; Goodrich, Julia K; Collins, Ryan L; Kanai, Masahiro; Wang, Qingbo; Alföldi, Jessica; Watts, Nicholas A; Vittal, Christopher; Gauthier, Laura D; Poterba, Timothy; Wilson, Michael W; Tarasova, Yekaterina; Phu, William; Grant, Riley; Yohannes, Mary T; Koenig, Zan; Farjoun, Yossi; Banks, Eric; Donnelly, Stacey; Gabriel, Stacey; Gupta, Namrata; Ferriera, Steven; Tolonen, Charlotte; Novod, Sam; Bergelson, Louis; Roazen, David; Ruano-Rubio, Valentin; Covarrubias, Miguel; Llanwarne, Christopher; Petrillo, Nikelle; Wade, Gordon; Jeandet, Thibault; Munshi, Ruchi; Tibbetts, Kathleen; O'Donnell-Luria, Anne; Solomonson, Matthew; Seed, Cotton; Martin, Alicia R; Talkowski, Michael E; Rehm, Heidi L; Daly, Mark J; Tiao, Grace; Neale, Benjamin M; MacArthur, Daniel G; Karczewski, Konrad J.

Nature ; 626(7997): E1, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38225470

7.

Analysis of protein-coding genetic variation in 60,706 humans.

Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M.

Nature ; 536(7616): 285-91, 2016 Aug 18.

Article in English | MEDLINE | ID: mdl-27535533

ABSTRACT

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

Subject(s)

Exome/genetics, Genetic Variation/genetics, DNA Mutational Analysis, Datasets as Topic, Humans, Phenotype, Proteome/genetics, Rare Diseases/genetics, Sample Size

8.

Addendum: The mutational constraint spectrum quantified from variation in 141,456 humans.

Gudmundsson, Sanna; Karczewski, Konrad J; Francioli, Laurent C; Tiao, Grace; Cummings, Beryl B; Alföldi, Jessica; Wang, Qingbo; Collins, Ryan L; Laricchia, Kristen M; Ganna, Andrea; Birnbaum, Daniel P; Gauthier, Laura D; Brand, Harrison; Solomonson, Matthew; Watts, Nicholas A; Rhodes, Daniel; Singer-Berk, Moriel; England, Eleina M; Seaby, Eleanor G; Kosmicki, Jack A; Walters, Raymond K; Tashman, Katherine; Farjoun, Yossi; Banks, Eric; Poterba, Timothy; Wang, Arcturus; Seed, Cotton; Whiffin, Nicola; Chong, Jessica X; Samocha, Kaitlin E; Pierce-Hoffman, Emma; Zappala, Zachary; O'Donnell-Luria, Anne H; Minikel, Eric Vallabh; Weisburd, Ben; Lek, Monkol; Ware, James S; Vittal, Christopher; Armean, Irina M; Bergelson, Louis; Cibulskis, Kristian; Connolly, Kristen M; Covarrubias, Miguel; Donnelly, Stacey; Ferriera, Steven; Gabriel, Stacey; Gentry, Jeff; Gupta, Namrata; Jeandet, Thibault; Kaplan, Diane.

Nature ; 597(7874): E3-E4, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34373650

9.

Author Correction: The mutational constraint spectrum quantified from variation in 141,456 humans.

Karczewski, Konrad J; Francioli, Laurent C; Tiao, Grace; Cummings, Beryl B; Alföldi, Jessica; Wang, Qingbo; Collins, Ryan L; Laricchia, Kristen M; Ganna, Andrea; Birnbaum, Daniel P; Gauthier, Laura D; Brand, Harrison; Solomonson, Matthew; Watts, Nicholas A; Rhodes, Daniel; Singer-Berk, Moriel; England, Eleina M; Seaby, Eleanor G; Kosmicki, Jack A; Walters, Raymond K; Tashman, Katherine; Farjoun, Yossi; Banks, Eric; Poterba, Timothy; Wang, Arcturus; Seed, Cotton; Whiffin, Nicola; Chong, Jessica X; Samocha, Kaitlin E; Pierce-Hoffman, Emma; Zappala, Zachary; O'Donnell-Luria, Anne H; Minikel, Eric Vallabh; Weisburd, Ben; Lek, Monkol; Ware, James S; Vittal, Christopher; Armean, Irina M; Bergelson, Louis; Cibulskis, Kristian; Connolly, Kristen M; Covarrubias, Miguel; Donnelly, Stacey; Ferriera, Steven; Gabriel, Stacey; Gentry, Jeff; Gupta, Namrata; Jeandet, Thibault; Kaplan, Diane; Llanwarne, Christopher.

Nature ; 590(7846): E53, 2021 Feb.

Article in English | MEDLINE | ID: mdl-33536625

10.

Author Correction: A structural variation reference for medical and population genetics.

Collins, Ryan L; Brand, Harrison; Karczewski, Konrad J; Zhao, Xuefang; Alföldi, Jessica; Francioli, Laurent C; Khera, Amit V; Lowther, Chelsea; Gauthier, Laura D; Wang, Harold; Watts, Nicholas A; Solomonson, Matthew; O'Donnell-Luria, Anne; Baumann, Alexander; Munshi, Ruchi; Walker, Mark; Whelan, Christopher W; Huang, Yongqing; Brookings, Ted; Sharpe, Ted; Stone, Matthew R; Valkanas, Elise; Fu, Jack; Tiao, Grace; Laricchia, Kristen M; Ruano-Rubio, Valentin; Stevens, Christine; Gupta, Namrata; Cusick, Caroline; Margolin, Lauren; Taylor, Kent D; Lin, Henry J; Rich, Stephen S; Post, Wendy S; Chen, Yii-Der Ida; Rotter, Jerome I; Nusbaum, Chad; Philippakis, Anthony; Lander, Eric; Gabriel, Stacey; Neale, Benjamin M; Kathiresan, Sekar; Daly, Mark J; Banks, Eric; MacArthur, Daniel G; Talkowski, Michael E.

Nature ; 590(7846): E55, 2021 Feb.

Article in English | MEDLINE | ID: mdl-33536627

11.

Lean and deep models for more accurate filtering of SNP and INDEL variant calls.

Friedman, Sam; Gauthier, Laura; Farjoun, Yossi; Banks, Eric.

Bioinformatics ; 36(7): 2060-2067, 2020 Apr 01.

Article in English | MEDLINE | ID: mdl-31830260

ABSTRACT

SUMMARY: We investigate convolutional neural networks (CNNs) for filtering small genomic variants in short-read DNA sequence data. Errors created during sequencing and library preparation make variant calling a difficult task. Encoding the reference genome and aligned reads covering sites of genetic variation as numeric tensors allows us to leverage CNNs for variant filtration. Convolutions over these tensors learn to detect motifs useful for classifying variants. Variant filtering models are trained to classify variants as artifacts or real variation. Visualizing the learned weights of the CNN confirmed it detects familiar DNA motifs known to correlate with real variation, like homopolymers and short tandem repeats (STR). After confirmation of the biological plausibility of the learned features we compared our model to current state-of-the-art filtration methods like Gaussian Mixture Models, Random Forests and CNNs designed for image classification, like DeepVariant. We demonstrate improvements in both sensitivity and precision. The tensor encoding was carefully tailored for processing genomic data, respecting the qualitative differences in structure between DNA and natural images. Ablation tests quantitatively measured the benefits of our tensor encoding strategy. Bayesian hyper-parameter optimization confirmed our notion that architectures designed with DNA data in mind outperform off-the-shelf image classification models. Our cross-generalization analysis identified idiosyncrasies in truth resources pointing to the need for new methods to construct genomic truth data. Our results show that models trained on heterogeneous data types and diverse truth resources generalize well to new datasets, negating the need to train separate models for each data type. AVAILABILITY AND IMPLEMENTATION: This work is available in the Genome Analysis Toolkit (GATK) with the tool name CNNScoreVariants (https://github.com/broadinstitute/gatk).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Genomics, INDEL Mutation, Bayes Theorem, High-Throughput Nucleotide Sequencing, Computer Neural Networks, Sequence Analysis
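
The abstract above describes encoding the reference genome and aligned reads at a variant site as numeric tensors. The sketch below shows a deliberately simplified version of that idea using plain one-hot encoding of bases. The abstract does not specify the tensor layout, so this layout and the names `one_hot` and `encode_site` are hypothetical and do not reflect the CNNScoreVariants implementation.

```python
BASES = "ACGT"


def one_hot(sequence: str) -> list[list[int]]:
    """Encode a DNA string as a (length x 4) matrix.

    Bases outside A/C/G/T (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]


def encode_site(reference: str, reads: list[str]) -> list[list[list[int]]]:
    """Stack the reference window and each aligned read into a 3-D tensor.

    The result has shape (1 + number of reads) x window length x 4.
    Assumes every read has already been aligned and padded to the window.
    """
    if any(len(read) != len(reference) for read in reads):
        raise ValueError("every read must span the full reference window")
    return [one_hot(reference)] + [one_hot(read) for read in reads]


# Hypothetical 4-base window with three aligned reads; the second read
# carries a C>T mismatch and the third has an uncalled base.
tensor = encode_site("ACGT", ["ACGT", "ATGT", "ANGT"])
```

A convolution sliding along the window axis of such a tensor can then pick up sequence motifs, which is the intuition the abstract gives for why the encoding suits CNNs.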

12.

De novo mutations in schizophrenia implicate synaptic networks.

Fromer, Menachem; Pocklington, Andrew J; Kavanagh, David H; Williams, Hywel J; Dwyer, Sarah; Gormley, Padhraig; Georgieva, Lyudmila; Rees, Elliott; Palta, Priit; Ruderfer, Douglas M; Carrera, Noa; Humphreys, Isla; Johnson, Jessica S; Roussos, Panos; Barker, Douglas D; Banks, Eric; Milanova, Vihra; Grant, Seth G; Hannon, Eilis; Rose, Samuel A; Chambert, Kimberly; Mahajan, Milind; Scolnick, Edward M; Moran, Jennifer L; Kirov, George; Palotie, Aarno; McCarroll, Steven A; Holmans, Peter; Sklar, Pamela; Owen, Michael J; Purcell, Shaun M; O'Donovan, Michael C.

Nature ; 506(7487): 179-84, 2014 Feb 13.

Article in English | MEDLINE | ID: mdl-24463507

ABSTRACT

Inherited alleles account for most of the genetic risk for schizophrenia. However, new (de novo) mutations, in the form of large chromosomal copy number changes, occur in a small fraction of cases and disproportionally disrupt genes encoding postsynaptic proteins. Here we show that small de novo mutations, affecting one or a few nucleotides, are overrepresented among glutamatergic postsynaptic proteins comprising activity-regulated cytoskeleton-associated protein (ARC) and N-methyl-d-aspartate receptor (NMDAR) complexes. Mutations are additionally enriched in proteins that interact with these complexes to modulate synaptic strength, namely proteins regulating actin filament dynamics and those whose messenger RNAs are targets of fragile X mental retardation protein (FMRP). Genes affected by mutations in schizophrenia overlap those mutated in autism and intellectual disability, as do mutation-enriched synaptic pathways. Aligning our findings with a parallel case-control study, we demonstrate reproducible insights into aetiological mechanisms for schizophrenia and reveal pathophysiology shared with other neurodevelopmental disorders.

Subject(s)

Neurological Models, Mutation/genetics, Nerve Net/metabolism, Neural Pathways/metabolism, Schizophrenia/genetics, Schizophrenia/physiopathology, Synapses/metabolism, Pervasive Child Development Disorders/genetics, Cytoskeletal Proteins/metabolism, Exome/genetics, Fragile X Mental Retardation Protein/metabolism, Humans, Intellectual Disability/genetics, Mutation Rate, Nerve Net/physiopathology, Nerve Tissue Proteins/metabolism, Neural Pathways/physiopathology, Phenotype, Messenger RNA/genetics, Messenger RNA/metabolism, N-Methyl-D-Aspartate Receptors/metabolism, Schizophrenia/metabolism, Substrate Specificity

13.

A polygenic burden of rare disruptive mutations in schizophrenia.

Purcell, Shaun M; Moran, Jennifer L; Fromer, Menachem; Ruderfer, Douglas; Solovieff, Nadia; Roussos, Panos; O'Dushlaine, Colm; Chambert, Kimberly; Bergen, Sarah E; Kähler, Anna; Duncan, Laramie; Stahl, Eli; Genovese, Giulio; Fernández, Esperanza; Collins, Mark O; Komiyama, Noboru H; Choudhary, Jyoti S; Magnusson, Patrik K E; Banks, Eric; Shakir, Khalid; Garimella, Kiran; Fennell, Tim; DePristo, Mark; Grant, Seth G N; Haggarty, Stephen J; Gabriel, Stacey; Scolnick, Edward M; Lander, Eric S; Hultman, Christina M; Sullivan, Patrick F; McCarroll, Steven A; Sklar, Pamela.

Nature ; 506(7487): 185-90, 2014 Feb 13.

Article in English | MEDLINE | ID: mdl-24463508

ABSTRACT

Schizophrenia is a common disease with a complex aetiology, probably involving multiple and heterogeneous genetic factors. Here, by analysing the exome sequences of 2,536 schizophrenia cases and 2,543 controls, we demonstrate a polygenic burden primarily arising from rare (less than 1 in 10,000), disruptive mutations distributed across many genes. Particularly enriched gene sets include the voltage-gated calcium ion channel and the signalling complex formed by the activity-regulated cytoskeleton-associated scaffold protein (ARC) of the postsynaptic density, sets previously implicated by genome-wide association and copy-number variation studies. Similar to reports in autism, targets of the fragile X mental retardation protein (FMRP, product of FMR1) are enriched for case mutations. No individual gene-based test achieves significance after correction for multiple testing and we do not detect any alleles of moderately low frequency (approximately 0.5 to 1 per cent) and moderately large effect. Taken together, these data suggest that population-based exome sequencing can discover risk alleles and complements established gene-mapping paradigms in neuropsychiatric disease.

Subject(s)

Multifactorial Inheritance/genetics, Mutation/genetics, Schizophrenia/genetics, Autistic Disorder/genetics, Calcium Channels/genetics, Cytoskeletal Proteins/genetics, DNA Copy Number Variations/genetics, Disks Large Homolog 4 Protein, Female, Fragile X Mental Retardation Protein/metabolism, Genome-Wide Association Study, Humans, Intellectual Disability/genetics, Intracellular Signaling Peptides and Proteins/genetics, Male, Membrane Proteins/genetics, Nerve Tissue Proteins/genetics, N-Methyl-D-Aspartate Receptors/genetics

14.

Patterns and rates of exonic de novo mutations in autism spectrum disorders.

Neale, Benjamin M; Kou, Yan; Liu, Li; Ma'ayan, Avi; Samocha, Kaitlin E; Sabo, Aniko; Lin, Chiao-Feng; Stevens, Christine; Wang, Li-San; Makarov, Vladimir; Polak, Paz; Yoon, Seungtai; Maguire, Jared; Crawford, Emily L; Campbell, Nicholas G; Geller, Evan T; Valladares, Otto; Schafer, Chad; Liu, Han; Zhao, Tuo; Cai, Guiqing; Lihm, Jayon; Dannenfelser, Ruth; Jabado, Omar; Peralta, Zuleyma; Nagaswamy, Uma; Muzny, Donna; Reid, Jeffrey G; Newsham, Irene; Wu, Yuanqing; Lewis, Lora; Han, Yi; Voight, Benjamin F; Lim, Elaine; Rossin, Elizabeth; Kirby, Andrew; Flannick, Jason; Fromer, Menachem; Shakir, Khalid; Fennell, Tim; Garimella, Kiran; Banks, Eric; Poplin, Ryan; Gabriel, Stacey; DePristo, Mark; Wimbish, Jack R; Boone, Braden E; Levy, Shawn E; Betancur, Catalina; Sunyaev, Shamil.

Nature ; 485(7397): 242-5, 2012 Apr 04.

Article in English | MEDLINE | ID: mdl-22495311

ABSTRACT

Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant, and the overall rate of mutation is only modestly higher than the expected rate. In contrast, the proteins encoded by genes that harboured de novo missense or nonsense mutations showed a higher degree of connectivity among themselves and to previous ASD genes as indexed by protein-protein interaction screens. The small increase in the rate of de novo events, when taken together with the protein interaction results, is consistent with an important but limited role for de novo point mutations in ASD, similar to that documented for de novo copy number variants. Genetic models incorporating these data indicate that most of the observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (that is, not necessarily sufficient for disease). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favour of CHD8 and KATNAL2 as genuine autism risk factors.

Subject(s)

Autistic Disorder/genetics, DNA-Binding Proteins/genetics, Exons/genetics, Genetic Predisposition to Disease/genetics, Mutation/genetics, Transcription Factors/genetics, Case-Control Studies, Exome/genetics, Family Health, Humans, Genetic Models, Multifactorial Inheritance/genetics, Phenotype, Poisson Distribution, Protein Interaction Maps

15.

Analysis of rare, exonic variation amongst subjects with autism spectrum disorders and population controls.

Liu, Li; Sabo, Aniko; Neale, Benjamin M; Nagaswamy, Uma; Stevens, Christine; Lim, Elaine; Bodea, Corneliu A; Muzny, Donna; Reid, Jeffrey G; Banks, Eric; Coon, Hillary; Depristo, Mark; Dinh, Huyen; Fennel, Tim; Flannick, Jason; Gabriel, Stacey; Garimella, Kiran; Gross, Shannon; Hawes, Alicia; Lewis, Lora; Makarov, Vladimir; Maguire, Jared; Newsham, Irene; Poplin, Ryan; Ripke, Stephan; Shakir, Khalid; Samocha, Kaitlin E; Wu, Yuanqing; Boerwinkle, Eric; Buxbaum, Joseph D; Cook, Edwin H; Devlin, Bernie; Schellenberg, Gerard D; Sutcliffe, James S; Daly, Mark J; Gibbs, Richard A; Roeder, Kathryn.

PLoS Genet ; 9(4): e1003443, 2013 Apr.

Article in English | MEDLINE | ID: mdl-23593035

ABSTRACT

We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analysis of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD.

Subject(s)

Child Development Disorders, Pervasive/genetics , Exome , Genome-Wide Association Study , Case-Control Studies , Child , Child Development Disorders, Pervasive/physiopathology , Genetic Predisposition to Disease , Genetic Variation , Humans , Population Control , Sequence Analysis, DNA , Software

16.

The distribution and mutagenesis of short coding INDELs from 1,128 whole exomes.

Challis, Danny; Antunes, Lilian; Garrison, Erik; Banks, Eric; Evani, Uday S; Muzny, Donna; Poplin, Ryan; Gibbs, Richard A; Marth, Gabor; Yu, Fuli.

BMC Genomics ; 16: 143, 2015 Feb 28.

Article in English | MEDLINE | ID: mdl-25765891

ABSTRACT

BACKGROUND: Identifying insertion/deletion polymorphisms (INDELs) with high confidence has been intrinsically challenging in short-read sequencing data. Here we report our approach for improving INDEL calling accuracy by using a machine learning algorithm to combine call sets generated with three independent methods, and by leveraging the strengths of each individual pipeline. Utilizing this approach, we generated a consensus exome INDEL call set from a large dataset generated by the 1000 Genomes Project (1000G), maximizing both the sensitivity and the specificity of the calls. RESULTS: This consensus exome INDEL call set features 7,210 INDELs, from 1,128 individuals across 13 populations included in the 1000 Genomes Phase 1 dataset, with a false discovery rate (FDR) of about 7.0%. CONCLUSIONS: In our study we further characterize the patterns and distributions of these exonic INDELs with respect to density, allele length, and site frequency spectrum, as well as the potential mutagenic mechanisms of coding INDELs in humans.

Subject(s)

Exome/genetics , INDEL Mutation/genetics , Mutagenesis , Computational Biology , Genome, Human , High-Throughput Nucleotide Sequencing , Human Genome Project , Humans , Machine Learning

17.

Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth.

Fromer, Menachem; Moran, Jennifer L; Chambert, Kimberly; Banks, Eric; Bergen, Sarah E; Ruderfer, Douglas M; Handsaker, Robert E; McCarroll, Steven A; O'Donovan, Michael C; Owen, Michael J; Kirov, George; Sullivan, Patrick F; Hultman, Christina M; Sklar, Pamela; Purcell, Shaun M.

Am J Hum Genet ; 91(4): 597-607, 2012 Oct 05.

Article in English | MEDLINE | ID: mdl-23040492

ABSTRACT

Sequencing of gene-coding regions (the exome) is increasingly used for studying human disease, for which copy-number variants (CNVs) are a critical genetic component. However, detecting copy number from exome sequencing is challenging because of the noncontiguous nature of the captured exons. This is compounded by the complex relationship between read depth and copy number; this results from biases in targeted genomic hybridization, sequence factors such as GC content, and batching of samples during collection and sequencing. We present a statistical tool (exome hidden Markov model [XHMM]) that uses principal-component analysis (PCA) to normalize exome read depth and a hidden Markov model (HMM) to discover exon-resolution CNV and genotype variation across samples. We evaluate performance on 90 schizophrenia trios and 1,017 case-control samples. XHMM detects a median of two rare (<1%) CNVs per individual (one deletion and one duplication) and has 79% sensitivity to similarly rare CNVs overlapping three or more exons discovered with microarrays. With sensitivity similar to state-of-the-art methods, XHMM achieves higher specificity by assigning quality metrics to the CNV calls to filter out bad ones, as well as to statistically genotype the discovered CNV in all individuals, yielding a trio call set with Mendelian-inheritance properties highly consistent with expectation. We also show that XHMM breakpoint quality scores enable researchers to explicitly search for novel classes of structural variation. For example, we apply XHMM to extract those CNVs that are highly likely to disrupt (delete or duplicate) only a portion of a gene.

Subject(s)

DNA Copy Number Variations , Exome , Exons , Genome-Wide Association Study/methods , High-Throughput Nucleotide Sequencing/methods , Case-Control Studies , Genotype , Genotyping Techniques/methods , Humans , Models, Genetic , Nucleic Acid Hybridization/methods , Oligonucleotide Array Sequence Analysis/methods

18.

Detection of Copy-Number Variation in Circulating Cell-Free DNA in Patients With Uveal Melanoma.

Sato, Takuto; Montazeri, Kamaneh; Gragoudas, Evangelos S; Lane, Anne Marie; Aronow, Mary Beth; Cohen, Justine V; Boland, Genevieve M; Banks, Eric; Kachulis, Christopher; Fleharty, Mark; Cibulskis, Carrie; Lawless, Aleigha; Adalsteinsson, Viktor A; Sullivan, Ryan J; Kim, Ivana K.

JCO Precis Oncol ; 8: e2300368, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38237100

ABSTRACT

PURPOSE: Somatic chromosomal alterations, particularly monosomy 3 and 8q gains, have been associated with metastatic risk in uveal melanoma (UM). Whole genome-scale evaluation of detectable alterations in cell-free DNA (cfDNA) in UM could provide valuable prognostic information. Our pilot study evaluates the correlation between genomic information using ultra-low-pass whole-genome sequencing (ULP-WGS) of cfDNA in UM and associated clinical outcomes. MATERIALS AND METHODS: ULP-WGS of cfDNA was performed on 29 plasma samples from 16 patients, 14 metastatic UM (mUM) and two non-metastatic, including pre- and post-treatment mUM samples from 10 patients treated with immunotherapy and one with liver-directed therapy. We estimated tumor fraction (TFx) and detected copy-number alterations (CNAs) using ichorCNA. Presence of 8q amplification was further analyzed using the likelihood ratio test (LRT). RESULTS: Eleven of the 14 patients with mUM (17 samples) had detectable circulating tumor DNA (ctDNA). 8q gain was detected in all 17 samples, whereas monosomy 3 was detectable in 10 of 17 samples. TFx generally correlated with disease status, showing an increase at the time of disease progression (PD). 8q gain detection sensitivity appeared greater with the LRT than with ichorCNA at lower TFxs. The only patient with mUM with partial response on treatment had a high pretreatment TFx and undetectable on-treatment ctDNA, correlating with her profound response and durable survival. CONCLUSION: ctDNA can be detected in mUM using ULP-WGS, and the TFx correlates with disease status. 8q gain was consistently detectable in mUM, in line with previous studies indicating 8q gains early in primary UM and higher amplification with PD. Our work suggests that detection of CNAs by ULP-WGS, particularly focusing on 8q gain, could be a valuable blood biomarker to monitor PD in UM.

Subject(s)

Circulating Tumor DNA , Melanoma , Uveal Neoplasms , Female , Humans , Pilot Projects , Melanoma/genetics , Melanoma/diagnosis , Monosomy , Circulating Tumor DNA/genetics

19.

High-throughput RNA isoform sequencing using programmed cDNA concatenation.

Al'Khafaji, Aziz M; Smith, Jonathan T; Garimella, Kiran V; Babadi, Mehrtash; Popic, Victoria; Sade-Feldman, Moshe; Gatzen, Michael; Sarkizova, Siranush; Schwartz, Marc A; Blaum, Emily M; Day, Allyson; Costello, Maura; Bowers, Tera; Gabriel, Stacey; Banks, Eric; Philippakis, Anthony A; Boland, Genevieve M; Blainey, Paul C; Hacohen, Nir.

Nat Biotechnol ; 42(4): 582-586, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37291427

ABSTRACT

Full-length RNA-sequencing methods using long-read technologies can capture complete transcript isoforms, but their throughput is limited. We introduce multiplexed arrays isoform sequencing (MAS-ISO-seq), a technique for programmably concatenating complementary DNAs (cDNAs) into molecules optimal for long-read sequencing, increasing the throughput >15-fold to nearly 40 million cDNA reads per run on the Sequel IIe sequencer. When applied to single-cell RNA sequencing of tumor-infiltrating T cells, MAS-ISO-seq demonstrated a 12- to 32-fold increase in the discovery of differentially spliced genes.

Subject(s)

High-Throughput Nucleotide Sequencing , RNA Isoforms , DNA, Complementary/genetics , RNA Isoforms/genetics , High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics , Sequence Analysis, RNA/methods , Transcriptome , Gene Expression Profiling/methods , RNA/genetics

20.

Empowering the biomedical research community: Innovative SAS deployment on the All of Us Researcher Workbench.

Humes, Izabelle; Shyr, Cathy; Dillon, Moira; Liu, Zhongjie; Peterson, Jennifer; Jeor, Chris St; Malkes, Jacqueline; Master, Hiral; Mapes, Brandy; Azuine, Romuladus; Mack, Nakia; Abdelbary, Bassent; Gamble-George, Joyonna; Goldmann, Emily; Cook, Stephanie; Choupani, Fatemeh; Baskir, Rubin; McMaster, Sydney; Lunt, Chris; Watson, Karriem; Lee, Minnkyong; Schwartz, Sophie; Munshi, Ruchi; Glazer, David; Banks, Eric; Philippakis, Anthony; Basford, Melissa; Roden, Dan; Harris, Paul A.

J Am Med Inform Assoc ; 2024 Aug 12.

Article in English | MEDLINE | ID: mdl-39135439

ABSTRACT

OBJECTIVES: The All of Us Research Program is a precision medicine initiative aimed at establishing a vast, diverse biomedical database accessible through a cloud-based data analysis platform, the Researcher Workbench (RW). Our goal was to empower the research community by co-designing the implementation of SAS in the RW alongside researchers to enable broader use of All of Us data. MATERIALS AND METHODS: Researchers from various fields and with different SAS experience levels participated in co-designing the SAS implementation through user experience interviews. RESULTS: Feedback and lessons learned from user testing informed the final design of the SAS application. DISCUSSION: The co-design approach is critical for reducing technical barriers, broadening All of Us data use, and enhancing the user experience for data analysis on the RW. CONCLUSION: Our co-design approach successfully tailored the implementation of the SAS application to researchers' needs. This approach may inform future software implementations on the RW.
