Results 1 - 15 of 15
2.
Nucleic Acids Res ; 43(15): e101, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-26001969

ABSTRACT

Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in PubMed as three commonly-used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied 'gold standard' haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants.


Subject(s)
Haploinsufficiency , Animals , Disease/genetics , Gene Regulatory Networks , Genomics/methods , Humans , Mice , Support Vector Machine
3.
PLoS Genet ; 11(3): e1005012, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25781962

ABSTRACT

Readily-accessible and standardised capture of genotypic variation has revolutionised our understanding of the genetic contribution to disease. Unfortunately, the corresponding systematic capture of patient phenotypic variation needed to fully interpret the impact of genetic variation has lagged far behind. Exploiting deep and systematic phenotyping of a cohort of 197 patients presenting with heterogeneous developmental disorders and whose genomes harbour de novo CNVs, we systematically applied a range of commonly-used functional genomics approaches to identify the underlying molecular perturbations and their phenotypic impact. Grouping patients into 408 non-exclusive patient-phenotype groups, we identified a functional association amongst the genes disrupted in 209 (51%) groups. We find evidence for a significant number of molecular interactions amongst the association-contributing genes, including a single highly-interconnected network disrupted in 20% of patients with intellectual disability, and show using microcephaly how these molecular networks can be used as baits to identify additional members whose genes are variant in other patients with the same phenotype. Exploiting the systematic phenotyping of this cohort, we observe phenotypic concordance amongst patients whose variant genes contribute to the same functional association but note that (i) this relationship shows significant variation across the different approaches used to infer a commonly perturbed molecular pathway, and (ii) that the phenotypic similarities detected amongst patients who share the same inferred pathway perturbation result from these patients sharing many distinct phenotypes, rather than sharing a more specific phenotype, inferring that these pathways are best characterized by their pleiotropic effects.


Subject(s)
DNA Copy Number Variations/genetics , Developmental Disabilities/genetics , Metabolic Networks and Pathways/genetics , Protein Interaction Maps/genetics , Animals , Developmental Disabilities/metabolism , Developmental Disabilities/pathology , Gene Expression , Genetic Association Studies , Human Genome , Genotype , Humans , Mice , Phenotype , Protein Interaction Mapping
4.
PLoS Comput Biol ; 10(8): e1003815, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25166029

ABSTRACT

Groupwise functional analysis of gene variants is becoming standard in next-generation sequencing studies. As the function of many genes is unknown and their classification to pathways is scant, functional associations between genes are often inferred from large-scale omics data. Such data types, including protein-protein interactions and gene co-expression networks, are used to examine the interrelations of the implicated genes. Statistical significance is assessed by comparing the interconnectedness of the mutated genes with that of random gene sets. However, interconnectedness can be affected by confounding bias, potentially resulting in false positive findings. We show that genes implicated through de novo sequence variants are biased in their coding-sequence length and longer genes tend to cluster together, which leads to exaggerated p-values in functional studies; we present here an integrative method that addresses these biases. To discern molecular pathways relevant to complex disease, we have inferred functional associations between human genes from diverse data types and assessed them with a novel phenotype-based method. Examining the functional association between de novo gene variants, we control for the heretofore unexplored confounding bias in coding-sequence length. We test different data types and networks and find that the disease-associated genes cluster more significantly in an integrated phenotypic-linkage network than in other gene networks. We present a tool of superior power to identify functional associations among genes mutated in the same disease even after accounting for significant sequencing study bias and demonstrate the suitability of this method to functionally cluster variant genes underlying polygenic disorders.
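
The comparison against random gene sets described in this abstract can be illustrated with a small permutation test. The sketch below is not the authors' tool: it assumes a network given as a list of gene pairs, a hypothetical dictionary `cds_length` of coding-sequence lengths, and a simple binning scheme in which each query gene is replaced by a random gene of similar length. All function names and parameters are illustrative.

```python
import random

def connectivity(genes, edges):
    """Count network edges whose two endpoints are both in `genes`."""
    gene_set = set(genes)
    return sum(1 for a, b in edges if a in gene_set and b in gene_set)

def build_length_bins(cds_length, n_bins):
    """Group all genes into bins of similar coding-sequence length."""
    ranked = sorted(cds_length, key=cds_length.get)
    bin_size = max(1, len(ranked) // n_bins)
    bin_of = {gene: rank // bin_size for rank, gene in enumerate(ranked)}
    members = {}
    for gene, b in bin_of.items():
        members.setdefault(b, []).append(gene)
    return bin_of, members

def length_matched_p_value(genes, edges, cds_length, n_perm=1000, n_bins=10, seed=0):
    """Empirical p-value for connectivity against length-matched random sets."""
    rng = random.Random(seed)
    bin_of, members = build_length_bins(cds_length, n_bins)
    observed = connectivity(genes, edges)
    hits = 0
    for _ in range(n_perm):
        # Replace each query gene by a random gene from its own length bin.
        # Sampling is with replacement, so a draw may repeat a gene.
        sample = [rng.choice(members[bin_of[g]]) for g in genes]
        if connectivity(sample, edges) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Drawing the random sets from the same length bins as the query genes is one simple way to keep gene length from inflating the apparent significance; an unmatched test would draw from all genes regardless of length.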


Subject(s)
Cluster Analysis , Genetic Variation/genetics , Genomics/methods , Animals , Genetic Databases , Epilepsy/genetics , Gene Expression Profiling , Humans , Mental Disorders/genetics , Mice , Phenotype
5.
PLoS Genet ; 10(7): e1004525, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25057982

ABSTRACT

Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25-0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase 1 hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1-5.0). From extrapolations we estimate that 8.2% (7.1-9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
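
To make the quoted half-lives concrete, the following sketch converts a half-life d1/2 into the fraction of constrained sequence expected to remain shared at a given divergence. It assumes simple exponential turnover, which is an assumption of this illustration and not necessarily the model fitted in the paper; the function and constant names are hypothetical.

```python
def retained_fraction(divergence, half_life):
    """Fraction of constrained sequence still shared after `divergence`,
    assuming simple exponential turnover (an assumption of this sketch)."""
    return 0.5 ** (divergence / half_life)

# Half-life ranges quoted in the abstract, in units of divergence time.
NONCODING_HALF_LIFE = (0.25, 0.31)
CODING_HALF_LIFE = (2.1, 5.0)

def retained_range(divergence, half_life_range):
    """Retained fraction at the lower and upper end of a half-life range."""
    low, high = half_life_range
    return retained_fraction(divergence, low), retained_fraction(divergence, high)
```

Under this assumption, at a divergence equal to one half-life the retained fraction is one half by construction, and the much longer coding half-lives leave most coding constraint shared at divergences where noncoding constraint has largely turned over.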


Subject(s)
Conserved Sequence/genetics , Molecular Evolution , Human Genome , Sequence Deletion/genetics , Animals , Base Sequence , Hominidae , Humans , Mice , Open Reading Frames , Sequence Alignment , Species Specificity
6.
Gigascience ; 3(1): 27, 2014.
Article in English | MEDLINE | ID: mdl-25671092

ABSTRACT

BACKGROUND: Penguins are flightless aquatic birds widely distributed in the Southern Hemisphere. The distinctive morphological and physiological features of penguins allow them to live an aquatic life, and some of them have successfully adapted to the hostile environments in Antarctica. To study the phylogenetic and population history of penguins and the molecular basis of their adaptations to Antarctica, we sequenced the genomes of the two Antarctic-dwelling penguin species, the Adélie penguin [Pygoscelis adeliae] and emperor penguin [Aptenodytes forsteri]. RESULTS: Phylogenetic dating suggests that early penguins arose ~60 million years ago, coinciding with a period of global warming. Analysis of effective population sizes reveals that the two penguin species experienced population expansions from ~1 million years ago to ~100 thousand years ago, but responded differently to the climatic cooling of the last glacial period. Comparative genomic analyses with other available avian genomes identified molecular changes in genes related to epidermal structure, phototransduction, lipid metabolism, and forelimb morphology. CONCLUSIONS: Our sequencing and initial analyses of the first two penguin genomes provide insights into the timing of penguin origin, fluctuations in effective population sizes of the two penguin species over the past 10 million years, and the potential associations between these biological patterns and global climate change. The molecular changes compared with other avian genomes reflect both shared and diverse adaptations of the two penguin species to the Antarctic environment.

7.
Hum Mutat ; 34(12): 1679-87, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24038936

ABSTRACT

Copy-number variations (CNVs) are a common cause of intellectual disability and/or multiple congenital anomalies (ID/MCA). However, the clinical interpretation of CNVs remains challenging, especially for inherited CNVs. Well-phenotyped patients (5,531) with ID/MCA were screened for rare CNVs using a 250K single-nucleotide polymorphism array platform in order to improve the understanding of the contribution of CNVs to a patient's phenotype. We detected 1,663 rare CNVs in 1,388 patients (25.1%; range 0-5 per patient) of which 437 occurred de novo and 638 were inherited. The detected CNVs were analyzed for various characteristics, gene content, and genotype-phenotype correlations. Patients with severe phenotypes, including organ malformations, had more de novo CNVs (P < 0.001), whereas patient groups with milder phenotypes, such as facial dysmorphisms, were enriched for both de novo and inherited CNVs (P < 0.001), indicating that not only de novo but also inherited CNVs can be associated with a clinically relevant phenotype. Moreover, patients with multiple CNVs presented with a more severe phenotype than patients with a single CNV (P < 0.001), pointing to a combinatorial effect of the additional CNVs. In addition, we identified 20 de novo single-gene CNVs that directly indicate novel genes for ID/MCA, including ZFHX4, ANKH, DLG2, MPP7, CEP89, TRIO, ASTN2, and PIK3C3.


Subject(s)
DNA Copy Number Variations , Genetic Association Studies , Multiple Abnormalities/diagnosis , Multiple Abnormalities/genetics , Adolescent , Child , Preschool Child , Chromosome Mapping , Computational Biology/methods , Female , Humans , Intellectual Disability/diagnosis , Intellectual Disability/genetics , Male , Phenotype , Single Nucleotide Polymorphism
8.
PLoS Genet ; 9(6): e1003523, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23754953

ABSTRACT

Autism Spectrum Disorders (ASD) are highly heritable and characterised by impairments in social interaction and communication, and restricted and repetitive behaviours. Considering four sets of de novo copy number variants (CNVs) identified in 181 individuals with autism and exploiting mouse functional genomics and known protein-protein interactions, we identified a large and significantly interconnected interaction network. This network contains 187 genes affected by CNVs drawn from 45% of the patients we considered and 22 genes previously implicated in ASD, of which 192 form a single interconnected cluster. On average, those patients with copy number changed genes from this network possess changes in 3 network genes, suggesting that epistasis mediated through the network is extensive. Correspondingly, genes that are highly connected within the network, and thus whose copy number change is predicted by the network to be more phenotypically consequential, are significantly enriched among patients that possess only a single ASD-associated network copy number changed gene (p = 0.002). Strikingly, deleted or disrupted genes from the network are significantly enriched in GO-annotated positive regulators (2.3-fold enrichment, corrected p = 2×10⁻⁵), whereas duplicated genes are significantly enriched in GO-annotated negative regulators (2.2-fold enrichment, corrected p = 0.005). The direction of copy change is highly informative in the context of the network, providing the means through which perturbations arising from distinct deletions or duplications can yield a common outcome. These findings reveal an extensive ASD-associated molecular network, whose topology indicates ASD-relevant mutational deleteriousness and that mechanistically details how convergent aetiologies can result extensively from CNVs affecting pathways causally implicated in ASD.
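
The fold enrichments and p-values quoted in this abstract are the kind of quantity an over-representation test produces. The abstract does not state which test was used, so the sketch below assumes a one-sided hypergeometric test as a common choice; the function names and argument layout are illustrative, and no counts from the paper are reproduced.

```python
from math import comb

def fold_enrichment(k, n, K, N):
    """Ratio of the annotated fraction in the gene set (k of n)
    to the annotated fraction in the background (K of N)."""
    return (k / n) / (K / N)

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which carry the annotation (one-sided enrichment p-value)."""
    total = comb(N, n)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return favourable / total
```

A p-value computed this way is uncorrected; when many annotation terms are tested, as with GO categories, a multiple-testing correction would still need to be applied to obtain a corrected p-value.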


Subject(s)
Pervasive Child Development Disorders/genetics , Gene Dosage , Gene Regulatory Networks , Protein Interaction Maps/genetics , Animals , Child , Gene Deletion , Gene Duplication , Genetic Predisposition to Disease , Human Genome , Humans , Mice
9.
BMC Genomics ; 14: 95, 2013 Feb 12.
Article in English | MEDLINE | ID: mdl-23402223

ABSTRACT

BACKGROUND: A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin's (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2-3 million years and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of the genome of the large ground finch, Geospiza magnirostris. RESULTS: 13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages, suggesting that historical effective population size values have been similar in both lineages. 21 otherwise highly conserved genes were identified that each show evidence for positive selection on amino acid changes in the Darwin's finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin's finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia-related functions, and may be examples of adaptively evolving reproductive proteins. CONCLUSIONS: These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin's finches.


Subject(s)
Molecular Evolution , Genomics , Passeriformes/genetics , Physiological Adaptation , Animals , Population Genetics , Genetic Models , Passeriformes/physiology , Nucleic Acid Sequence Homology
10.
Nature ; 486(7404): 527-31, 2012 Jun 28.
Article in English | MEDLINE | ID: mdl-22722832

ABSTRACT

Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.


Subject(s)
Molecular Evolution , Genetic Variation/genetics , Human Genome/genetics , Genome/genetics , Pan paniscus/genetics , Pan troglodytes/genetics , Animals , DNA Transposable Elements/genetics , Gene Duplication/genetics , Genotype , Humans , Molecular Sequence Data , Phenotype , Phylogeny , Species Specificity
11.
Nature ; 483(7388): 169-75, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22398555

ABSTRACT

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.


Subject(s)
Molecular Evolution , Genetic Speciation , Genome/genetics , Gorilla gorilla/genetics , Animals , Female , Gene Expression Regulation , Genetic Variation/genetics , Genomics , Humans , Macaca mulatta/genetics , Molecular Sequence Data , Pan troglodytes/genetics , Phylogeny , Pongo/genetics , Proteins/genetics , Sequence Alignment , Species Specificity , Genetic Transcription
12.
Annu Rev Genomics Hum Genet ; 12: 275-99, 2011.
Article in English | MEDLINE | ID: mdl-21721940

ABSTRACT

The amount of a genome's sequence that is functional has been surprisingly difficult to estimate accurately. This has severely hindered analyses asking whether the amount of functional genomic sequence correlates with organismal complexity. Most studies estimate these amounts by considering nucleotide substitution rates within aligned sequences. These approaches show reduced power to identify sequence that is aligned, functional, and constrained only within narrowly defined phyla. The neutral indel model exploits insertions or deletions (indels) rather than substitutions in predicting functional sequence. Surprisingly, this method indicates that half of all functional sequence is specific to individual eutherian lineages. This review considers the rates at which coding or noncoding and functional or nonfunctional sequence changes among mammalian genomes. In contrast to the slow rate at which protein-coding sequence changes, functional noncoding sequence appears to change or be turned over at rapid rates in mammals.


Subject(s)
Molecular Evolution , Human Genome , Genome , Animals , Humans , INDEL Mutation , Mammals/genetics
13.
Nature ; 469(7331): 529-33, 2011 Jan 27.
Article in English | MEDLINE | ID: mdl-21270892

ABSTRACT

'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.


Subject(s)
Genetic Variation , Genome/genetics , Pongo abelii/genetics , Pongo pygmaeus/genetics , Animals , Centromere/genetics , Cerebrosides/metabolism , Chromosomes , Molecular Evolution , Female , Gene Rearrangement/genetics , Genetic Speciation , Population Genetics , Humans , Male , Phylogeny , Population Density , Population Dynamics , Species Specificity
14.
Genome Res ; 20(10): 1335-43, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20693480

ABSTRACT

Despite the availability of dozens of animal genome sequences, two key questions remain unanswered: First, what fraction of any species' genome confers biological function, and second, are apparent differences in organismal complexity reflected in an objective measure of genomic complexity? Here, we address both questions by applying, across the mammalian phylogeny, an evolutionary model that estimates the amount of functional DNA that is shared between two species' genomes. Our main findings are, first, that as the divergence between mammalian species increases, the predicted amount of pairwise shared functional sequence drops off dramatically. We show by simulations that this is not an artifact of the method, but rather indicates that functional (and mostly noncoding) sequence is turning over at a very high rate. We estimate that between 200 and 300 Mb (∼6.5%-10%) of the human genome is under functional constraint, which includes five to eight times as many constrained noncoding bases as bases that code for protein. In contrast, in D. melanogaster we estimate only 56-66 Mb to be constrained, implying a ratio of noncoding to coding constrained bases of about 2. This suggests that, rather than genome size or protein-coding gene complement, it is the number of functional bases that might best mirror our naïve preconceptions of organismal complexity.
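
The figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses two hypothetical helper functions; the only inputs are the numbers quoted above, and the pairing of range endpoints (200 Mb with a ratio of 5, 300 Mb with a ratio of 8) is an assumption made for illustration.

```python
def implied_genome_size_mb(constrained_mb, percent_of_genome):
    """Genome size implied by a constrained amount and its percentage."""
    return constrained_mb * 100 / percent_of_genome

def coding_constrained_mb(total_constrained_mb, noncoding_to_coding_ratio):
    """Constrained coding sequence, given the total constrained sequence
    and the ratio of constrained noncoding to constrained coding bases."""
    return total_constrained_mb / (1 + noncoding_to_coding_ratio)

# Human: 200 Mb at 6.5% and 300 Mb at 10% imply a genome of roughly 3.0-3.1 Gb.
human_genome_estimates = (
    implied_genome_size_mb(200, 6.5),
    implied_genome_size_mb(300, 10),
)

# Human: either endpoint pairing gives about 33 Mb of constrained coding sequence.
human_coding = (coding_constrained_mb(200, 5), coding_constrained_mb(300, 8))

# D. melanogaster: 56-66 Mb constrained with a noncoding-to-coding ratio of about 2.
fly_coding = (coding_constrained_mb(56, 2), coding_constrained_mb(66, 2))
```

Under these assumptions the constrained coding sequence comes out at about 33 Mb for human and about 19-22 Mb for D. melanogaster, so most of the difference between the two species lies in the noncoding constrained fraction.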


Subject(s)
Base Sequence , Molecular Evolution , Human Genome/genetics , Genome/genetics , Mammals/genetics , Genetic Models , Animals , Base Sequence/genetics , Base Sequence/physiology , DNA/genetics , Drosophila melanogaster/genetics , Humans , Invertebrates/genetics , Mice , Rats , Species Specificity , Vertebrates/genetics
15.
Genome Res ; 20(5): 675-84, 2010 May.
Article in English | MEDLINE | ID: mdl-20305016

ABSTRACT

We describe a statistical and comparative-genomic approach for quantifying error rates of genome sequence assemblies. The method exploits not substitutions but the pattern of insertions and deletions (indels) in genome-scale alignments for closely related species. Using two- or three-way alignments, the approach estimates the amount of aligned sequence containing clusters of nucleotides that were wrongly inserted or deleted during sequencing or assembly. Thus, the method is well-suited to assessing fine-scale sequence quality within single assemblies, between different assemblies of a single set of reads, and between genome assemblies for different species. When applying this approach to four primate genome assemblies, we found that average gap error rates per base varied considerably, by up to sixfold. As expected, bacterial artificial chromosome (BAC) sequences contained lower, but still substantial, predicted numbers of errors, arguing for caution in regarding BACs as the epitome of genome fidelity. We then mapped short reads, at approximately 10-fold statistical coverage, from a Bornean orangutan onto the Sumatran orangutan genome assembly originally constructed from capillary reads. This resulted in a reduced gap error rate and a separation of error-prone from high-fidelity sequence. Over 5000 predicted indel errors in protein-coding sequence were corrected in a hybrid assembly. Our approach contributes a new fine-scale quality metric for assemblies that should facilitate development of improved genome sequencing and assembly strategies.


Subject(s)
Chromosome Mapping , Genomics/methods , INDEL Mutation , Genetic Models , Primates , Animals , Base Sequence , Genetic Variation , Genome , Human Genome , Humans , Pan troglodytes/classification , Pan troglodytes/genetics , Pongo abelii/classification , Pongo abelii/genetics , Pongo pygmaeus/classification , Pongo pygmaeus/genetics , Primates/classification , Primates/genetics , Sequence Alignment , DNA Sequence Analysis , Species Specificity
...