1.
Nature ; 625(7993): 92-100, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38057664

ABSTRACT

The depletion of disruptive variation caused by purifying natural selection (constraint) has been widely used to investigate protein-coding genes underlying human disorders, but attempts to assess constraint for non-protein-coding regions have proved more difficult. Here we aggregate, process and release a dataset of 76,156 human genomes from the Genome Aggregation Database (gnomAD), the largest public open-access human genome allele frequency reference dataset, and use it to build a genomic constraint map for the whole genome (genomic non-coding constraint of haploinsufficient variation (Gnocchi)). We present a refined mutational model that incorporates local sequence context and regional genomic features to detect depletions of variation. As expected, the average constraint for protein-coding sequences is stronger than that for non-coding regions. Within the non-coding genome, constrained regions are enriched for known regulatory elements and variants that are implicated in complex human diseases and traits, facilitating the triangulation of biological annotation, disease association and natural selection to non-coding DNA analysis. More constrained regulatory elements tend to regulate more constrained protein-coding genes, which in turn suggests that non-coding constraint can aid the identification of constrained genes that are as yet unrecognized by current gene constraint metrics. We demonstrate that this genome-wide constraint map improves the identification and interpretation of functional human genetic variation.


Subject(s)
Genome, Human , Genomics , Models, Genetic , Mutation , Humans , Access to Information , Databases, Genetic , Datasets as Topic , Gene Frequency , Genome, Human/genetics , Mutation/genetics , Selection, Genetic
2.
Cell ; 159(4): 800-13, 2014 Nov 06.
Article in English | MEDLINE | ID: mdl-25417157

ABSTRACT

We sequenced the MSY (male-specific region of the Y chromosome) of the C57BL/6J strain of the laboratory mouse Mus musculus. In contrast to theories that Y chromosomes are heterochromatic and gene poor, the mouse MSY is 99.9% euchromatic and contains about 700 protein-coding genes. Only 2% of the MSY derives from the ancestral autosomes that gave rise to the mammalian sex chromosomes. Instead, all but 45 of the MSY's genes belong to three acquired, massively amplified gene families that have no homologs on primate MSYs but do have acquired, amplified homologs on the mouse X chromosome. The complete mouse MSY sequence brings to light dramatic forces in sex chromosome evolution: lineage-specific convergent acquisition and amplification of X-Y gene families, possibly fueled by antagonism between acquired X-Y homologs. The mouse MSY sequence presents opportunities for experimental studies of a sex-specific chromosome in its entirety, in a genetically tractable model organism.


Subject(s)
Biological Evolution , Chromosomes, Mammalian , Mice, Inbred C57BL/genetics , Sequence Analysis, DNA , Y Chromosome , Animals , Centromere , Chromosomes, Artificial, Bacterial/genetics , Female , Humans , Male , Phylogeny , Primates/genetics , X Chromosome
3.
Nature ; 581(7809): 459-464, 2020 May.
Article in English | MEDLINE | ID: mdl-32461653

ABSTRACT

Naturally occurring human genetic variants that are predicted to inactivate protein-coding genes provide an in vivo model of human gene inactivation that complements knockout studies in cells and model organisms. Here we report three key findings regarding the assessment of candidate drug targets using human loss-of-function variants. First, even essential genes, in which loss-of-function variants are not tolerated, can be highly successful as targets of inhibitory drugs. Second, in most genes, loss-of-function variants are sufficiently rare that genotype-based ascertainment of homozygous or compound heterozygous 'knockout' humans will await sample sizes that are approximately 1,000 times those presently available, unless recruitment focuses on consanguineous individuals. Third, automated variant annotation and filtering are powerful, but manual curation remains crucial for removing artefacts, and is a prerequisite for recall-by-genotype efforts. Our results provide a roadmap for human knockout studies and should guide the interpretation of loss-of-function variants in drug development.


Subject(s)
Genes, Essential/drug effects , Genes, Essential/genetics , Loss of Function Mutation/genetics , Molecular Targeted Therapy , Artifacts , Automation , Consanguinity , Exons/genetics , Gain of Function Mutation/genetics , Gene Frequency , Gene Knockdown Techniques , Heterozygote , Homozygote , Humans , Huntingtin Protein/genetics , Leucine-Rich Repeat Serine-Threonine Protein Kinase-2/genetics , Neurodegenerative Diseases/genetics , Prion Proteins/genetics , Reproducibility of Results , Sample Size , tau Proteins/genetics
4.
Nature ; 581(7809): 452-458, 2020 May.
Article in English | MEDLINE | ID: mdl-32461655

ABSTRACT

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD), we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype Tissue Expression (GTEx) project and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.


Subject(s)
Disease/genetics , Haploinsufficiency/genetics , Loss of Function Mutation/genetics , Molecular Sequence Annotation , Transcription, Genetic , Transcriptome/genetics , Autism Spectrum Disorder/genetics , Datasets as Topic , Developmental Disabilities/genetics , Exons/genetics , Female , Genotype , Humans , Intellectual Disability/genetics , Male , Molecular Sequence Annotation/standards , Poisson Distribution , RNA, Messenger/analysis , RNA, Messenger/genetics , Rare Diseases/diagnosis , Rare Diseases/genetics , Reproducibility of Results , Exome Sequencing
5.
Nature ; 587(7833): 246-251, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33177663

ABSTRACT

New genome assemblies have been arriving at a rapidly increasing pace, thanks to decreases in sequencing costs and improvements in third-generation sequencing technologies. For example, the number of vertebrate genome assemblies currently in the NCBI (National Center for Biotechnology Information) database increased by more than 50% to 1,485 assemblies in the year from July 2018 to July 2019. In addition to this influx of assemblies from different species, new human de novo assemblies are being produced, which enable the analysis of not only small polymorphisms, but also complex, large-scale structural differences between human individuals and haplotypes. This coming era and its unprecedented amount of data offer the opportunity to uncover many insights into genome evolution but also present challenges in how to adapt current analysis methods to meet the increased scale. Cactus, a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequences. Here we describe progressive extensions to Cactus to create Progressive Cactus, which enables the reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We describe results from an alignment of more than 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment created so far.


Subject(s)
Genome/genetics , Genomics/methods , Sequence Alignment/methods , Software , Vertebrates/genetics , Amnion , Animals , Computer Simulation , Genomics/standards , Haplotypes , Humans , Quality Control , Sequence Alignment/standards , Software/standards
6.
Nature ; 581(7809): 444-451, 2020 May.
Article in English | MEDLINE | ID: mdl-32461652

ABSTRACT

Structural variants (SVs) rearrange large segments of DNA and can have profound consequences in evolution and human disease. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) have become integral in the interpretation of single-nucleotide variants (SNVs). However, there are no reference maps of SVs from high-coverage genome sequencing comparable to those for SNVs. Here we present a reference of sequence-resolved SVs constructed from 14,891 genomes across diverse global populations (54% non-European) in gnomAD. We discovered a rich and complex landscape of 433,371 SVs, from which we estimate that SVs are responsible for 25-29% of all rare protein-truncating events per genome. We found strong correlations between natural selection against damaging SNVs and rare SVs that disrupt or duplicate protein-coding sequence, which suggests that genes that are highly intolerant to loss-of-function are also sensitive to increased dosage. We also uncovered modest selection against noncoding SVs in cis-regulatory elements, although selection against protein-truncating SVs was stronger than all noncoding effects. Finally, we identified very large (over one megabase), rare SVs in 3.9% of samples, and estimate that 0.13% of individuals may carry an SV that meets the existing criteria for clinically important incidental findings. This SV resource is freely distributed via the gnomAD browser and will have broad utility in population genetics, disease-association studies, and diagnostic screening.


Subject(s)
Disease/genetics , Genetic Variation , Genetics, Medical/standards , Genetics, Population/standards , Genome, Human/genetics , Female , Genetic Testing , Genotyping Techniques , Humans , Male , Middle Aged , Mutation , Polymorphism, Single Nucleotide/genetics , Racial Groups/genetics , Reference Standards , Selection, Genetic , Whole Genome Sequencing
7.
Nature ; 581(7809): 434-443, 2020 May.
Article in English | MEDLINE | ID: mdl-32461654

ABSTRACT

Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.


Subject(s)
Exome/genetics , Genes, Essential/genetics , Genetic Variation/genetics , Genome, Human/genetics , Adult , Brain/metabolism , Cardiovascular Diseases/genetics , Cohort Studies , Databases, Genetic , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Loss of Function Mutation/genetics , Male , Mutation Rate , Proprotein Convertase 9/genetics , RNA, Messenger/genetics , Reproducibility of Results , Exome Sequencing , Whole Genome Sequencing
14.
Proc Natl Acad Sci U S A ; 116(51): 25745-25755, 2019 Dec 17.
Article in English | MEDLINE | ID: mdl-31772017

ABSTRACT

Venom systems are key adaptations that have evolved throughout the tree of life and typically facilitate predation or defense. Despite venoms being model systems for studying a variety of evolutionary and physiological processes, many taxonomic groups remain understudied, including venomous mammals. Within the order Eulipotyphla, multiple shrew species and solenodons have oral venom systems. Despite morphological variation of their delivery systems, it remains unclear whether venom represents the ancestral state in this group or is the result of multiple independent origins. We investigated the origin and evolution of venom in eulipotyphlans by characterizing the venom system of the endangered Hispaniolan solenodon (Solenodon paradoxus). We constructed a genome to underpin proteomic identifications of solenodon venom toxins, before undertaking evolutionary analyses of those constituents, and functional assessments of the secreted venom. Our findings show that solenodon venom consists of multiple paralogous kallikrein 1 (KLK1) serine proteases, which cause hypotensive effects in vivo, and seem likely to have evolved to facilitate vertebrate prey capture. Comparative analyses provide convincing evidence that the oral venom systems of solenodons and shrews have evolved convergently, with the 4 independent origins of venom in eulipotyphlans outnumbering all other venom origins in mammals. We find that KLK1s have been independently coopted into the venom of shrews and solenodons following their divergence during the late Cretaceous, suggesting that evolutionary constraints may be acting on these genes. Consequently, our findings represent a striking example of convergent molecular evolution and demonstrate that distinct structural backgrounds can yield equivalent functions.


Subject(s)
Eutheria , Evolution, Molecular , Genome/genetics , Shrews , Venoms/genetics , Animals , Eutheria/classification , Eutheria/genetics , Eutheria/physiology , Gene Duplication , Male , Phylogeny , Proteomics , Shrews/classification , Shrews/genetics , Shrews/physiology , Tissue Kallikreins/genetics
15.
Am J Hum Genet ; 102(6): 1204-1211, 2018 Jun 07.
Article in English | MEDLINE | ID: mdl-29861106

ABSTRACT

There is a limited understanding about the impact of rare protein-truncating variants across multiple phenotypes. We explore the impact of this class of variants on 13 quantitative traits and 10 diseases using whole-exome sequencing data from 100,296 individuals. Protein-truncating variants in genes intolerant to this class of mutations increased risk of autism, schizophrenia, bipolar disorder, intellectual disability, and ADHD. In individuals without these disorders, there was an association with shorter height, lower education, increased hospitalization, and reduced age at enrollment. Gene sets implicated from GWASs did not show a significant protein-truncating variants burden beyond what was captured by established Mendelian genes. In conclusion, we provide a thorough investigation of the impact of rare deleterious coding variants on complex traits, suggesting widespread pleiotropic risk.


Subject(s)
Mutation/genetics , Open Reading Frames/genetics , Databases, Genetic , Ethnicity/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Phenotype , Proteins/genetics
16.
Proc Natl Acad Sci U S A ; 115(11): E2566-E2574, 2018 Mar 13.
Article in English | MEDLINE | ID: mdl-29483247

ABSTRACT

Elephantids are the world's most iconic megafaunal family, yet there is no comprehensive genomic assessment of their relationships. We report a total of 14 genomes, including 2 from the American mastodon, which is an extinct elephantid relative, and 12 spanning all three extant and three extinct elephantid species including an ∼120,000-y-old straight-tusked elephant, a Columbian mammoth, and woolly mammoths. Earlier genetic studies modeled elephantid evolution via simple bifurcating trees, but here we show that interspecies hybridization has been a recurrent feature of elephantid evolution. We found that the genetic makeup of the straight-tusked elephant, previously placed as a sister group to African forest elephants based on lower coverage data, in fact comprises three major components. Most of the straight-tusked elephant's ancestry derives from a lineage related to the ancestor of African elephants while its remaining ancestry consists of a large contribution from a lineage related to forest elephants and another related to mammoths. Columbian and woolly mammoths also showed evidence of interbreeding, likely following a latitudinal cline across North America. While hybridization events have shaped elephantid history in profound ways, isolation also appears to have played an important role. Our data reveal nearly complete isolation between the ancestors of the African forest and savanna elephants for ∼500,000 y, providing compelling justification for the conservation of forest and savanna elephants as separate species.


Subject(s)
Elephants/genetics , Mammoths/genetics , Mastodons/genetics , Animals , Elephants/classification , Evolution, Molecular , Extinction, Biological , Fossils , Gene Flow , Genome , Genomics/history , History, Ancient , Mammoths/classification , Mastodons/classification , Phylogeny
17.
Nature ; 508(7497): 494-9, 2014 Apr 24.
Article in English | MEDLINE | ID: mdl-24759411

ABSTRACT

The human X and Y chromosomes evolved from an ordinary pair of autosomes, but millions of years ago genetic decay ravaged the Y chromosome, and only three per cent of its ancestral genes survived. We reconstructed the evolution of the Y chromosome across eight mammals to identify biases in gene content and the selective pressures that preserved the surviving ancestral genes. Our findings indicate that survival was nonrandom, and in two cases, convergent across placental and marsupial mammals. We conclude that the gene content of the Y chromosome became specialized through selection to maintain the ancestral dosage of homologous X-Y gene pairs that function as broadly expressed regulators of transcription, translation and protein stability. We propose that beyond its roles in testis determination and spermatogenesis, the Y chromosome is essential for male viability, and has unappreciated roles in Turner's syndrome and in phenotypic differences between the sexes in health and disease.


Subject(s)
Evolution, Molecular , Gene Dosage/genetics , Mammals/genetics , Y Chromosome/genetics , Animals , Chromosomes, Human, X/genetics , Chromosomes, Human, Y/genetics , Disease , Female , Gene Expression Regulation , Health , Humans , Male , Marsupialia/genetics , Molecular Sequence Annotation , Molecular Sequence Data , Protein Biosynthesis/genetics , Protein Stability , Selection, Genetic/genetics , Sequence Homology , Sex Characteristics , Spermatogenesis/genetics , Testis/metabolism , Transcription, Genetic/genetics , Turner Syndrome/genetics , X Chromosome/genetics
18.
Nature ; 513(7518): 375-381, 2014 Sep 18.
Article in English | MEDLINE | ID: mdl-25186727

ABSTRACT

Cichlid fishes are famous for large, diverse and replicated adaptive radiations in the Great Lakes of East Africa. To understand the molecular mechanisms underlying cichlid phenotypic diversity, we sequenced the genomes and transcriptomes of five lineages of African cichlids: the Nile tilapia (Oreochromis niloticus), an ancestral lineage with low diversity; and four members of the East African lineage: Neolamprologus brichardi/pulcher (older radiation, Lake Tanganyika), Metriaclima zebra (recent radiation, Lake Malawi), Pundamilia nyererei (very recent radiation, Lake Victoria), and Astatotilapia burtoni (riverine species around Lake Tanganyika). We found an excess of gene duplications in the East African lineage compared to tilapia and other teleosts, an abundance of non-coding element divergence, accelerated coding sequence evolution, expression divergence associated with transposable element insertions, and regulation by novel microRNAs. In addition, we analysed sequence data from sixty individuals representing six closely related species from Lake Victoria, and show genome-wide diversifying selection on coding and regulatory variants, some of which were recruited from ancient polymorphisms. We conclude that a number of molecular mechanisms shaped East African cichlid genomes, and that amassing of standing variation during periods of relaxed purifying selection may have been important in facilitating subsequent evolutionary diversification.


Subject(s)
Cichlids/classification , Cichlids/genetics , Evolution, Molecular , Genetic Speciation , Genome/genetics , Africa, Eastern , Animals , DNA Transposable Elements/genetics , Gene Duplication/genetics , Gene Expression Regulation/genetics , Genomics , Lakes , MicroRNAs/genetics , Phylogeny , Polymorphism, Genetic/genetics
19.
Proc Natl Acad Sci U S A ; 114(52): E11257-E11266, 2017 Dec 26.
Article in English | MEDLINE | ID: mdl-29229813

ABSTRACT

The CRISPR-Cas9 nuclease system holds enormous potential for therapeutic genome editing of a wide spectrum of diseases. Large efforts have been made to further understanding of on- and off-target activity to assist the design of CRISPR-based therapies with optimized efficacy and safety. However, current efforts have largely focused on the reference genome or the genome of cell lines to evaluate guide RNA (gRNA) efficiency, safety, and toxicity. Here, we examine the effect of human genetic variation on both on- and off-target specificity. Specifically, we utilize 7,444 whole-genome sequences to examine the effect of variants on the targeting specificity of ∼3,000 gRNAs across 30 therapeutically implicated loci. We demonstrate that human genetic variation can alter the off-target landscape genome-wide including creating and destroying protospacer adjacent motifs (PAMs). Furthermore, single-nucleotide polymorphisms (SNPs) and insertions/deletions (indels) can result in altered on-target sites and novel potent off-target sites, which can predispose patients to treatment failure and adverse effects, respectively; however, these events are rare. Taken together, these data highlight the importance of considering individual genomes for therapeutic genome-editing applications for the design and evaluation of CRISPR-based therapies to minimize risk of treatment failure and/or adverse outcomes.


Subject(s)
CRISPR-Cas Systems , Genetic Loci , Genetic Therapy , Polymorphism, Single Nucleotide , RNA, Guide, Kinetoplastida/genetics , Humans
20.
Nature ; 496(7445): 311-6, 2013 Apr 18.
Article in English | MEDLINE | ID: mdl-23598338

ABSTRACT

The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction. Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.


Subject(s)
Biological Evolution , Fishes/classification , Fishes/genetics , Genome/genetics , Animals , Animals, Genetically Modified , Chick Embryo , Conserved Sequence/genetics , Enhancer Elements, Genetic/genetics , Evolution, Molecular , Extremities/anatomy & histology , Extremities/growth & development , Fishes/anatomy & histology , Fishes/physiology , Genes, Homeobox/genetics , Genomics , Immunoglobulin M/genetics , Mice , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Sequence Alignment , Sequence Analysis, DNA , Vertebrates/anatomy & histology , Vertebrates/genetics , Vertebrates/physiology