2.
Nature ; 486(7404): 527-31, 2012 Jun 28.
Article in English | MEDLINE | ID: mdl-22722832

ABSTRACT

Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other.


Subjects
Molecular Evolution, Genetic Variation/genetics, Human Genome/genetics, Genome/genetics, Pan paniscus/genetics, Pan troglodytes/genetics, Animals, DNA Transposable Elements/genetics, Gene Duplication/genetics, Genotype, Humans, Molecular Sequence Data, Phenotype, Phylogeny, Species Specificity
3.
Nature ; 483(7388): 169-75, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22398555

ABSTRACT

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.


Subjects
Molecular Evolution, Genetic Speciation, Genome/genetics, Gorilla gorilla/genetics, Animals, Female, Gene Expression Regulation, Genetic Variation/genetics, Genomics, Humans, Macaca mulatta/genetics, Molecular Sequence Data, Pan troglodytes/genetics, Phylogeny, Pongo/genetics, Proteins/genetics, Sequence Alignment, Species Specificity, Genetic Transcription
4.
PLoS Genet ; 11(3): e1005012, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25781962

ABSTRACT

Readily-accessible and standardised capture of genotypic variation has revolutionised our understanding of the genetic contribution to disease. Unfortunately, the corresponding systematic capture of patient phenotypic variation needed to fully interpret the impact of genetic variation has lagged far behind. Exploiting deep and systematic phenotyping of a cohort of 197 patients presenting with heterogeneous developmental disorders and whose genomes harbour de novo CNVs, we systematically applied a range of commonly-used functional genomics approaches to identify the underlying molecular perturbations and their phenotypic impact. Grouping patients into 408 non-exclusive patient-phenotype groups, we identified a functional association amongst the genes disrupted in 209 (51%) groups. We find evidence for a significant number of molecular interactions amongst the association-contributing genes, including a single highly-interconnected network disrupted in 20% of patients with intellectual disability, and show using microcephaly how these molecular networks can be used as baits to identify additional members whose genes are variant in other patients with the same phenotype. Exploiting the systematic phenotyping of this cohort, we observe phenotypic concordance amongst patients whose variant genes contribute to the same functional association but note that (i) this relationship shows significant variation across the different approaches used to infer a commonly perturbed molecular pathway, and (ii) that the phenotypic similarities detected amongst patients who share the same inferred pathway perturbation result from these patients sharing many distinct phenotypes, rather than sharing a more specific phenotype, inferring that these pathways are best characterized by their pleiotropic effects.


Subjects
DNA Copy Number Variations/genetics, Developmental Disabilities/genetics, Metabolic Networks and Pathways/genetics, Protein Interaction Maps/genetics, Animals, Developmental Disabilities/metabolism, Developmental Disabilities/pathology, Gene Expression, Genetic Association Studies, Human Genome, Genotype, Humans, Mice, Phenotype, Protein Interaction Mapping
5.
Nature ; 469(7331): 529-33, 2011 Jan 27.
Article in English | MEDLINE | ID: mdl-21270892

ABSTRACT

'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. 
Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.


Subjects
Genetic Variation, Genome/genetics, Pongo abelii/genetics, Pongo pygmaeus/genetics, Animals, Centromere/genetics, Cerebrosides/metabolism, Chromosomes, Molecular Evolution, Female, Gene Rearrangement/genetics, Genetic Speciation, Population Genetics, Humans, Male, Phylogeny, Population Density, Population Dynamics, Species Specificity
6.
Nucleic Acids Res ; 43(15): e101, 2015 Sep 03.
Article in English | MEDLINE | ID: mdl-26001969

ABSTRACT

Any given human individual carries multiple genetic variants that disrupt protein-coding genes, through structural variation, as well as nucleotide variants and indels. Predicting the phenotypic consequences of a gene disruption remains a significant challenge. Current approaches employ information from a range of biological networks to predict which human genes are haploinsufficient (meaning two copies are required for normal function) or essential (meaning at least one copy is required for viability). Using recently available study gene sets, we show that these approaches are strongly biased towards providing accurate predictions for well-studied genes. By contrast, we derive a haploinsufficiency score from a combination of unbiased large-scale high-throughput datasets, including gene co-expression and genetic variation in over 6000 human exomes. Our approach provides a haploinsufficiency prediction for over twice as many genes currently unassociated with papers listed in Pubmed as three commonly-used approaches, and outperforms these approaches for predicting haploinsufficiency for less-studied genes. We also show that fine-tuning the predictor on a set of well-studied 'gold standard' haploinsufficient genes does not improve the prediction for less-studied genes. This new score can readily be used to prioritize gene disruptions resulting from any genetic variant, including copy number variants, indels and single-nucleotide variants.


Subjects
Haploinsufficiency, Animals, Disease/genetics, Gene Regulatory Networks, Genomics/methods, Humans, Mice, Support Vector Machine
7.
PLoS Genet ; 10(7): e1004525, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25057982

ABSTRACT

Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25-0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase 1 hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1-5.0). From extrapolations we estimate that 8.2% (7.1-9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. 
These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.


Subjects
Conserved Sequence/genetics, Molecular Evolution, Human Genome, Sequence Deletion/genetics, Animals, Base Sequence, Hominidae, Humans, Mice, Open Reading Frames, Sequence Alignment, Species Specificity
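
The half-life figures quoted in the abstract of record 7 (d1/2 = 0.25-0.31 for noncoding constrained sequence, d1/2 = 2.1-5.0 for protein-coding sequence) can be made concrete with a small calculation. The sketch below assumes plain exponential decay, which is what a half-life implies but is not necessarily the model the authors fitted; the function name `fraction_retained` and the illustrative divergence value are hypothetical.

```python
def fraction_retained(divergence: float, half_life: float) -> float:
    """Fraction of constrained sequence still shared after `divergence`.

    Assumes simple exponential turnover (an assumption, not the paper's
    stated model). Both arguments are in the same units of divergence time.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (divergence / half_life)


# Half-life ranges as quoted in the abstract (units of divergence time).
NONCODING_D_HALF = (0.25, 0.31)
CODING_D_HALF = (2.1, 5.0)

if __name__ == "__main__":
    d = 0.25  # illustrative divergence, equal to the lower noncoding half-life
    for label, (low, high) in (("noncoding", NONCODING_D_HALF),
                               ("coding", CODING_D_HALF)):
        print(label,
              round(fraction_retained(d, low), 3),
              round(fraction_retained(d, high), 3))
```

Under this assumption, a divergence equal to one half-life leaves half of the constrained sequence shared, and the much larger coding half-lives leave most coding sequence shared at the same divergence.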
8.
PLoS Genet ; 9(6): e1003523, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23754953

ABSTRACT

Autism Spectrum Disorders (ASD) are highly heritable and characterised by impairments in social interaction and communication, and restricted and repetitive behaviours. Considering four sets of de novo copy number variants (CNVs) identified in 181 individuals with autism and exploiting mouse functional genomics and known protein-protein interactions, we identified a large and significantly interconnected interaction network. This network contains 187 genes affected by CNVs drawn from 45% of the patients we considered and 22 genes previously implicated in ASD, of which 192 form a single interconnected cluster. On average, those patients with copy number changed genes from this network possess changes in 3 network genes, suggesting that epistasis mediated through the network is extensive. Correspondingly, genes that are highly connected within the network, and thus whose copy number change is predicted by the network to be more phenotypically consequential, are significantly enriched among patients that possess only a single ASD-associated network copy number changed gene (p = 0.002). Strikingly, deleted or disrupted genes from the network are significantly enriched in GO-annotated positive regulators (2.3-fold enrichment, corrected p = 2×10⁻⁵), whereas duplicated genes are significantly enriched in GO-annotated negative regulators (2.2-fold enrichment, corrected p = 0.005). The direction of copy change is highly informative in the context of the network, providing the means through which perturbations arising from distinct deletions or duplications can yield a common outcome. These findings reveal an extensive ASD-associated molecular network, whose topology indicates ASD-relevant mutational deleteriousness and that mechanistically details how convergent aetiologies can result extensively from CNVs affecting pathways causally implicated in ASD.


Subjects
Pervasive Child Development Disorders/genetics, Gene Dosage, Gene Regulatory Networks, Protein Interaction Maps/genetics, Animals, Child, Gene Deletion, Gene Duplication, Genetic Predisposition to Disease, Human Genome, Humans, Mice
9.
PLoS Comput Biol ; 10(8): e1003815, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25166029

ABSTRACT

Groupwise functional analysis of gene variants is becoming standard in next-generation sequencing studies. As the function of many genes is unknown and their classification to pathways is scant, functional associations between genes are often inferred from large-scale omics data. Such data types, including protein-protein interactions and gene co-expression networks, are used to examine the interrelations of the implicated genes. Statistical significance is assessed by comparing the interconnectedness of the mutated genes with that of random gene sets. However, interconnectedness can be affected by confounding bias, potentially resulting in false positive findings. We show that genes implicated through de novo sequence variants are biased in their coding-sequence length and longer genes tend to cluster together, which leads to exaggerated p-values in functional studies; we present here an integrative method that addresses these biases. To discern molecular pathways relevant to complex disease, we have inferred functional associations between human genes from diverse data types and assessed them with a novel phenotype-based method. Examining the functional association between de novo gene variants, we control for the heretofore unexplored confounding bias in coding-sequence length. We test different data types and networks and find that the disease-associated genes cluster more significantly in an integrated phenotypic-linkage network than in other gene networks. We present a tool of superior power to identify functional associations among genes mutated in the same disease even after accounting for significant sequencing study bias and demonstrate the suitability of this method to functionally cluster variant genes underlying polygenic disorders.


Subjects
Cluster Analysis, Genetic Variation/genetics, Genomics/methods, Animals, Genetic Databases, Epilepsy/genetics, Gene Expression Profiling, Humans, Mental Disorders/genetics, Mice, Phenotype
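
The abstract of record 9 describes assessing significance by comparing the interconnectedness of mutated genes with that of random gene sets, and notes that coding-sequence length confounds this comparison. A minimal sketch of one generic way to control for such a confounder is a permutation test whose random sets are matched to the observed set by length bin. This is an illustration under stated assumptions, not the authors' method: the function names, the equal-sized binning scheme, and the edge-count statistic are all hypothetical choices.

```python
import random
from itertools import combinations


def count_links(genes, edges):
    """Number of network edges joining two genes within `genes`."""
    return sum(1 for a, b in combinations(genes, 2)
               if frozenset((a, b)) in edges)


def length_matched_pvalue(observed, lengths, edges,
                          n_bins=2, n_perm=1000, seed=0):
    """Empirical p-value for the interconnectedness of `observed`.

    Random gene sets are drawn so that each contains the same number of
    genes per coding-length bin as the observed set (hypothetical scheme).
    `edges` is a set of frozenset({gene_a, gene_b}) pairs.
    """
    rng = random.Random(seed)
    # Rank genes by coding-sequence length and cut into equal-sized bins.
    ranked = sorted(lengths, key=lengths.get)
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    bins = {}
    for gene, b in bin_of.items():
        bins.setdefault(b, []).append(gene)
    # How many observed genes fall in each bin.
    quota = {b: sum(1 for g in observed if bin_of[g] == b) for b in bins}

    obs_links = count_links(observed, edges)
    hits = 0
    for _ in range(n_perm):
        sample = []
        for b, members in bins.items():
            sample.extend(rng.sample(members, quota[b]))
        if count_links(sample, edges) >= obs_links:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)
```

With `n_bins=1` the test reduces to the unmatched comparison against arbitrary random gene sets; increasing `n_bins` tightens the length matching at the cost of fewer distinct random sets per bin.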
10.
Annu Rev Genomics Hum Genet ; 12: 275-99, 2011.
Article in English | MEDLINE | ID: mdl-21721940

ABSTRACT

The amount of a genome's sequence that is functional has been surprisingly difficult to estimate accurately. This has severely hindered analyses asking whether the amount of functional genomic sequence correlates with organismal complexity. Most studies estimate these amounts by considering nucleotide substitution rates within aligned sequences. These approaches show reduced power to identify sequence that is aligned, functional, and constrained only within narrowly defined phyla. The neutral indel model exploits insertions or deletions (indels) rather than substitutions in predicting functional sequence. Surprisingly, this method indicates that half of all functional sequence is specific to individual eutherian lineages. This review considers the rates at which coding or noncoding and functional or nonfunctional sequence changes among mammalian genomes. In contrast to the slow rate at which protein-coding sequence changes, functional noncoding sequence appears to change or be turned over at rapid rates in mammals.


Subjects
Molecular Evolution, Human Genome, Genome, Animals, Humans, INDEL Mutation, Mammals/genetics
11.
Hum Mutat ; 34(12): 1679-87, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24038936

ABSTRACT

Copy-number variations (CNVs) are a common cause of intellectual disability and/or multiple congenital anomalies (ID/MCA). However, the clinical interpretation of CNVs remains challenging, especially for inherited CNVs. Well-phenotyped patients (5,531) with ID/MCA were screened for rare CNVs using a 250K single-nucleotide polymorphism array platform in order to improve the understanding of the contribution of CNVs to a patient's phenotype. We detected 1,663 rare CNVs in 1,388 patients (25.1%; range 0-5 per patient) of which 437 occurred de novo and 638 were inherited. The detected CNVs were analyzed for various characteristics, gene content, and genotype-phenotype correlations. Patients with severe phenotypes, including organ malformations, had more de novo CNVs (P < 0.001), whereas patient groups with milder phenotypes, such as facial dysmorphisms, were enriched for both de novo and inherited CNVs (P < 0.001), indicating that not only de novo but also inherited CNVs can be associated with a clinically relevant phenotype. Moreover, patients with multiple CNVs presented with a more severe phenotype than patients with a single CNV (P < 0.001), pointing to a combinatorial effect of the additional CNVs. In addition, we identified 20 de novo single-gene CNVs that directly indicate novel genes for ID/MCA, including ZFHX4, ANKH, DLG2, MPP7, CEP89, TRIO, ASTN2, and PIK3C3.


Subjects
DNA Copy Number Variations, Genetic Association Studies, Multiple Abnormalities/diagnosis, Multiple Abnormalities/genetics, Adolescent, Child, Preschool Child, Chromosome Mapping, Computational Biology/methods, Female, Humans, Intellectual Disability/diagnosis, Intellectual Disability/genetics, Male, Phenotype, Single Nucleotide Polymorphism
12.
BMC Genomics ; 14: 95, 2013 Feb 12.
Article in English | MEDLINE | ID: mdl-23402223

ABSTRACT

BACKGROUND: A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin's (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2-3 million years and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of the genome of the large ground finch, Geospiza magnirostris. RESULTS: 13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages, suggesting that historical effective population size values have been similar in both lineages. 21 otherwise highly conserved genes were identified that each show evidence for positive selection on amino acid changes in the Darwin's finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin's finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia-related functions, and may be examples of adaptively evolving reproductive proteins. CONCLUSIONS: These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin's finches.


Subjects
Molecular Evolution, Genomics, Passeriformes/genetics, Physiological Adaptation, Animals, Population Genetics, Genetic Models, Passeriformes/physiology, Nucleic Acid Sequence Homology
13.
Genome Res ; 20(10): 1335-43, 2010 Oct.
Article in English | MEDLINE | ID: mdl-20693480

ABSTRACT

Despite the availability of dozens of animal genome sequences, two key questions remain unanswered: First, what fraction of any species' genome confers biological function, and second, are apparent differences in organismal complexity reflected in an objective measure of genomic complexity? Here, we address both questions by applying, across the mammalian phylogeny, an evolutionary model that estimates the amount of functional DNA that is shared between two species' genomes. Our main findings are, first, that as the divergence between mammalian species increases, the predicted amount of pairwise shared functional sequence drops off dramatically. We show by simulations that this is not an artifact of the method, but rather indicates that functional (and mostly noncoding) sequence is turning over at a very high rate. We estimate that between 200 and 300 Mb (∼6.5%-10%) of the human genome is under functional constraint, which includes five to eight times as many constrained noncoding bases as bases that code for protein. In contrast, in D. melanogaster we estimate only 56-66 Mb to be constrained, implying a ratio of noncoding to coding constrained bases of about 2. This suggests that, rather than genome size or protein-coding gene complement, it is the number of functional bases that might best mirror our naïve preconceptions of organismal complexity.


Subjects
Base Sequence, Molecular Evolution, Human Genome/genetics, Genome/genetics, Mammals/genetics, Genetic Models, Animals, Base Sequence/genetics, Base Sequence/physiology, DNA/genetics, Drosophila melanogaster/genetics, Humans, Invertebrates/genetics, Mice, Rats, Species Specificity, Vertebrates/genetics
14.
Genome Res ; 20(5): 675-84, 2010 May.
Article in English | MEDLINE | ID: mdl-20305016

ABSTRACT

We describe a statistical and comparative-genomic approach for quantifying error rates of genome sequence assemblies. The method exploits not substitutions but the pattern of insertions and deletions (indels) in genome-scale alignments for closely related species. Using two- or three-way alignments, the approach estimates the amount of aligned sequence containing clusters of nucleotides that were wrongly inserted or deleted during sequencing or assembly. Thus, the method is well-suited to assessing fine-scale sequence quality within single assemblies, between different assemblies of a single set of reads, and between genome assemblies for different species. When applying this approach to four primate genome assemblies, we found that average gap error rates per base varied considerably, by up to sixfold. As expected, bacterial artificial chromosome (BAC) sequences contained lower, but still substantial, predicted numbers of errors, arguing for caution in regarding BACs as the epitome of genome fidelity. We then mapped short reads, at approximately 10-fold statistical coverage, from a Bornean orangutan onto the Sumatran orangutan genome assembly originally constructed from capillary reads. This resulted in a reduced gap error rate and a separation of error-prone from high-fidelity sequence. Over 5000 predicted indel errors in protein-coding sequence were corrected in a hybrid assembly. Our approach contributes a new fine-scale quality metric for assemblies that should facilitate development of improved genome sequencing and assembly strategies.


Subjects
Chromosome Mapping, Genomics/methods, INDEL Mutation, Genetic Models, Primates, Animals, Base Sequence, Genetic Variation, Genome, Human Genome, Humans, Pan troglodytes/classification, Pan troglodytes/genetics, Pongo abelii/classification, Pongo abelii/genetics, Pongo pygmaeus/classification, Pongo pygmaeus/genetics, Primates/classification, Primates/genetics, Sequence Alignment, DNA Sequence Analysis, Species Specificity
15.
Gigascience ; 3(1): 27, 2014.
Article in English | MEDLINE | ID: mdl-25671092

ABSTRACT

BACKGROUND: Penguins are flightless aquatic birds widely distributed in the Southern Hemisphere. The distinctive morphological and physiological features of penguins allow them to live an aquatic life, and some of them have successfully adapted to the hostile environments in Antarctica. To study the phylogenetic and population history of penguins and the molecular basis of their adaptations to Antarctica, we sequenced the genomes of the two Antarctic dwelling penguin species, the Adélie penguin [Pygoscelis adeliae] and emperor penguin [Aptenodytes forsteri]. RESULTS: Phylogenetic dating suggests that early penguins arose ~60 million years ago, coinciding with a period of global warming. Analysis of effective population sizes reveals that the two penguin species experienced population expansions from ~1 million years ago to ~100 thousand years ago, but responded differently to the climatic cooling of the last glacial period. Comparative genomic analyses with other available avian genomes identified molecular changes in genes related to epidermal structure, phototransduction, lipid metabolism, and forelimb morphology. CONCLUSIONS: Our sequencing and initial analyses of the first two penguin genomes provide insights into the timing of penguin origin, fluctuations in effective population sizes of the two penguin species over the past 10 million years, and the potential associations between these biological patterns and global climate change. The molecular changes compared with other avian genomes reflect both shared and diverse adaptations of the two penguin species to the Antarctic environment.
