ABSTRACT
Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ~5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.
Subject(s)
Genome/genetics, Hylobates/classification, Hylobates/genetics, Karyotype, Phylogeny, Animals, Molecular Evolution, Hominidae/classification, Hominidae/genetics, Humans, Molecular Sequence Data, Retroelements/genetics, Genetic Selection, Genetic Transcription Termination
ABSTRACT
The unparalleled efficiency of next-generation sequencing (NGS) has prompted widespread adoption, but significant problems remain in the use of NGS data for whole genome assembly. We explore the advantages and disadvantages of chicken genome assemblies generated using a variety of sequencing and assembly methodologies. NGS assemblies are equivalent in some ways to a Sanger-based assembly yet deficient in others. Nonetheless, these assemblies are sufficient for the identification of the majority of genes and can reveal novel sequences when compared to existing assembly references.
Subject(s)
Chickens/genetics, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis, Animals, Computational Biology, Complementary DNA, Female, Genomics, High-Throughput Nucleotide Sequencing/economics, Quality Control, Reproducibility of Results, Software, Transcriptome
ABSTRACT
'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. 
Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.
Subject(s)
Genetic Variation, Genome/genetics, Pongo abelii/genetics, Pongo pygmaeus/genetics, Animals, Centromere/genetics, Cerebrosides/metabolism, Chromosomes, Molecular Evolution, Female, Gene Rearrangement/genetics, Genetic Speciation, Population Genetics, Humans, Male, Phylogeny, Population Density, Population Dynamics, Species Specificity
ABSTRACT
BACKGROUND: To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components. RESULTS: We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these are expressed in tissues other than the venom gland, and at least three of these families (those with homology to toxins from distant invertebrates) may play non-toxin roles. Thus, further functional testing is required to confirm venom activity. However, the presence of similar putative toxins in such widely divergent species provides further evidence for the hypothesis that there are certain protein families that are selected preferentially during evolution to become venom peptides. We have also used homology with known proteins to speculate on the contributions of each venom component to the symptoms of platypus envenomation. CONCLUSIONS: This study represents a step towards fully characterizing the first mammal venom transcriptome. We have found similarities between putative platypus toxins and those of a number of unrelated species, providing insight into the evolution of mammalian venom.
Subject(s)
Platypus/genetics, Platypus/metabolism, Proteomics, Venoms/genetics, Venoms/metabolism, Amino Acid Sequence, Animals, Base Sequence, Gene Expression Profiling, Gene Library, Metalloproteases/genetics, Peptide Hydrolases/genetics, Peptides/genetics, Protease Inhibitors, Protein Conformation, Protein Sorting Signals, Proteins/genetics, DNA Sequence Analysis, Protein Sequence Analysis
ABSTRACT
Massively parallel DNA sequencing technologies provide an unprecedented ability to screen entire genomes for genetic changes associated with tumour progression. Here we describe the genomic analyses of four DNA samples from an African-American patient with basal-like breast cancer: peripheral blood, the primary tumour, a brain metastasis and a xenograft derived from the primary tumour. The metastasis contained two de novo mutations and a large deletion not present in the primary tumour, and was significantly enriched for 20 shared mutations. The xenograft retained all primary tumour mutations and displayed a mutation enrichment pattern that resembled the metastasis. Two overlapping large deletions, encompassing CTNNA1, were present in all three tumour samples. The differential mutation frequencies and structural variation patterns in metastasis and xenograft compared with the primary tumour indicate that secondary tumours may arise from a minority of cells within the primary tumour.
Subject(s)
Brain Neoplasms/genetics, Brain Neoplasms/secondary, Breast Neoplasms/genetics, Human Genome/genetics, Mutation/genetics, Neoplasm Transplantation, Adult, Breast Neoplasms/pathology, DNA Copy Number Variations/genetics, DNA Mutational Analysis, Disease Progression, Female, Gene Frequency/genetics, Genomics, Humans, Genetic Translocation/genetics, Heterologous Transplantation, alpha Catenin/genetics
ABSTRACT
The human Y chromosome began to evolve from an autosome hundreds of millions of years ago, acquiring a sex-determining function and undergoing a series of inversions that suppressed crossing over with the X chromosome. Little is known about the recent evolution of the Y chromosome because only the human Y chromosome has been fully sequenced. Prevailing theories hold that Y chromosomes evolve by gene loss, the pace of which slows over time, eventually leading to a paucity of genes, and stasis. These theories have been buttressed by partial sequence data from newly emergent plant and animal Y chromosomes, but they have not been tested in older, highly evolved Y chromosomes such as that of humans. Here we finished sequencing of the male-specific region of the Y chromosome (MSY) in our closest living relative, the chimpanzee, achieving levels of accuracy and completion previously reached for the human MSY. By comparing the MSYs of the two species we show that they differ radically in sequence structure and gene content, indicating rapid evolution during the past 6 million years. The chimpanzee MSY contains twice as many massive palindromes as the human MSY, yet it has lost large fractions of the MSY protein-coding genes and gene families present in the last common ancestor. We suggest that the extraordinary divergence of the chimpanzee and human MSYs was driven by four synergistic factors: the prominent role of the MSY in sperm production, 'genetic hitchhiking' effects in the absence of meiotic crossing over, frequent ectopic recombination within the MSY, and species differences in mating behaviour. Although genetic decay may be the principal dynamic in the evolution of newly emergent Y chromosomes, wholesale renovation is the paramount theme in the continuing evolution of chimpanzee, human and perhaps other older MSYs.
Subject(s)
Human Y Chromosomes/genetics, Genes/genetics, Nucleic Acid Conformation, Pan troglodytes/genetics, Y Chromosome/genetics, Animals, Human Chromosome Pair 21/genetics, DNA/chemistry, DNA/genetics, Humans, Male, Molecular Sequence Data, Nucleic Acid Sequence Homology
ABSTRACT
BACKGROUND: The full complement of DNA mutations that are responsible for the pathogenesis of acute myeloid leukemia (AML) is not yet known. METHODS: We used massively parallel DNA sequencing to obtain a very high level of coverage (approximately 98%) of a primary, cytogenetically normal, de novo genome for AML with minimal maturation (AML-M1) and a matched normal skin genome. RESULTS: We identified 12 acquired (somatic) mutations within the coding sequences of genes and 52 somatic point mutations in conserved or regulatory portions of the genome. All mutations appeared to be heterozygous and present in nearly all cells in the tumor sample. Four of the 64 mutations occurred in at least 1 additional AML sample in 188 samples that were tested. Mutations in NRAS and NPM1 had been identified previously in patients with AML, but two other mutations had not been identified. One of these mutations, in the IDH1 gene, was present in 15 of 187 additional AML genomes tested and was strongly associated with normal cytogenetic status; it was present in 13 of 80 cytogenetically normal samples (16%). The other was a nongenic mutation in a genomic region with regulatory potential and conservation in higher mammals; we detected it in one additional AML tumor. The AML genome that we sequenced contains approximately 750 point mutations, of which only a small fraction are likely to be relevant to pathogenesis. CONCLUSIONS: By comparing the sequences of tumor and skin genomes of a patient with AML-M1, we have identified recurring mutations that may be relevant for pathogenesis.
Subject(s)
Isocitrate Dehydrogenase/genetics, Acute Myeloid Leukemia/genetics, Mutation, Adult, DNA Mutational Analysis, Female, Gene Frequency, Human Genome, Humans, Male, Middle Aged, Nucleophosmin, Point Mutation, DNA Sequence Analysis/methods
ABSTRACT
Detection and characterization of genomic structural variation are important for understanding the landscape of genetic variation in human populations and in complex diseases such as cancer. Recent studies demonstrate the feasibility of detecting structural variation using next-generation, short-insert, paired-end sequencing reads. However, the utility of these reads is not entirely clear, nor are the analysis methods with which accurate detection can be achieved. The algorithm BreakDancer predicts a wide variety of structural variants including insertion-deletions (indels), inversions and translocations. We examined BreakDancer's performance in simulation, in comparison with other methods and in analyses of a sample from an individual with acute myeloid leukemia and of samples from the 1,000 Genomes trio individuals. BreakDancer sensitively and accurately detected indels ranging from 10 base pairs to 1 megabase pair that are difficult to detect via a single conventional approach.
Subject(s)
DNA/genetics, Genetic Variation, Genomics/methods, DNA Sequence Analysis/methods, Algorithms, Base Sequence, Computer Simulation, Human Genome, Humans, Acute Myeloid Leukemia/genetics
ABSTRACT
We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.
Subject(s)
Molecular Evolution, Genome/genetics, Platypus/genetics, Animals, Base Composition, Dentition, Female, Genomic Imprinting/genetics, Humans, Immunity/genetics, Male, Mammals/genetics, MicroRNAs/genetics, Milk Proteins/genetics, Phylogeny, Platypus/immunology, Platypus/physiology, Odorant Receptors/genetics, Repetitive Nucleic Acid Sequences/genetics, Reptiles/genetics, DNA Sequence Analysis, Spermatozoa/metabolism, Venoms/genetics, Zona Pellucida/metabolism
ABSTRACT
Studies of copy-number variation and linkage disequilibrium (LD) have typically excluded complex regions of the genome that are rich in duplications and prone to rearrangement. In an attempt to assess the heritability and LD of copy-number polymorphisms (CNPs) in duplication-rich regions of the genome, we profiled copy-number variation in 130 putative "rearrangement hotspot regions" among 269 individuals of European, Yoruba, Chinese, and Japanese ancestry analyzed by the International HapMap Consortium. Eighty-four hotspot regions, corresponding to 257 bacterial artificial chromosome (BAC) probes, showed evidence of copy-number differences. Despite a predisposing genetic architecture, no polymorphism was ever observed in the remaining 46 "rearrangement hotspots," and we suggest these represent excellent candidate sites for pathogenic rearrangements. We used a combination of BAC-based and high-density customized oligonucleotide arrays to resolve the molecular basis of structural rearrangements. For common variants (frequency >10%), we observed a distinct bias against copy-number losses, suggesting that deletions are subject to purifying selection. Heritability estimates did not differ significantly from 1.0 among the majority (30 of 34) of loci analyzed, consistent with normal Mendelian inheritance. Some of the CNPs in duplication-rich regions showed strong LD with nearby single-nucleotide polymorphisms (SNPs) and were observed to segregate on ancestral SNP haplotypes. However, LD with the best available SNP markers was weaker than has been reported for deletion polymorphisms in less complex regions of the genome. These observations may be accounted for by a low density of SNP data in duplicated regions, challenges in mapping and typing the CNPs, and the possibility that CNPs in these regions have rearranged on multiple haplotype backgrounds. Our results underscore the need for complete maps of genetic variation in duplication-rich regions of the genome.
Subject(s)
Gene Dosage, Gene Duplication, Human Genome, Linkage Disequilibrium, Genetic Polymorphism, Repetitive Nucleic Acid Sequences, Gene Rearrangement, Humans, Single Nucleotide Polymorphism
ABSTRACT
Here we present a finished sequence of human chromosome 15, together with a high-quality gene catalogue. As chromosome 15 is one of seven human chromosomes with a high rate of segmental duplication, we have carried out a detailed analysis of the duplication structure of the chromosome. Segmental duplications in chromosome 15 are largely clustered in two regions, on proximal and distal 15q; the proximal region is notable because recombination among the segmental duplications can result in deletions causing Prader-Willi and Angelman syndromes. Sequence analysis shows that the proximal and distal regions of 15q share extensive ancient similarity. Using a simple approach, we have been able to reconstruct many of the events by which the current duplication structure arose. We find that most of the intrachromosomal duplications seem to share a common ancestry. Finally, we demonstrate that some remaining gaps in the genome sequence are probably due to structural polymorphisms between haplotypes; this may explain a significant fraction of the gaps remaining in the human genome.
Subject(s)
Human Chromosome Pair 15/genetics, Molecular Evolution, Gene Duplication, Animals, Conserved Sequence/genetics, Genes, Human Genome, Haplotypes/genetics, Humans, Macaca mulatta/genetics, Molecular Sequence Data, Multigene Family/genetics, Phylogeny, Genetic Polymorphism/genetics, DNA Sequence Analysis, Synteny/genetics
ABSTRACT
The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. 
Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders.
Subject(s)
Gene Dosage, Genetic Variation, Human Genome, Repetitive Nucleic Acid Sequences, Bacterial Artificial Chromosomes, Humans, Nucleic Acid Hybridization/methods, Oligonucleotide Array Sequence Analysis/methods, Genetic Polymorphism, Genetic Recombination, Reproducibility of Results
ABSTRACT
BACKGROUND: Pericentric inversions are the most common euchromatic chromosomal differences among humans and the great apes. The human and chimpanzee karyotype differs by nine such events, in addition to several constitutive heterochromatic increases and one chromosomal fusion event. Reproductive isolation and subsequent speciation are thought to be the potential result of pericentric inversions, as reproductive boundaries form as a result of hybrid sterility. RESULTS: Here we employed a comparative fluorescence in situ hybridization approach, using probes selected from a combination of physical mapping, genomic sequence, and segmental duplication analyses to narrow the breakpoint interval of a pericentric inversion in chimpanzee involving the orthologous human 15q11-q13 region. We have refined the inversion breakpoint of this chimpanzee-specific rearrangement to a 600 kilobase (kb) interval of the human genome consisting of entirely duplicated material. Detailed analysis of the underlying sequence indicated that this region comprises multiple segmental duplications, including a previously characterized duplication of the alpha7 neuronal nicotinic acetylcholine receptor subunit gene (CHRNA7) in 15q13.3 and several Golgin-linked-to-PML, or LCR15, duplications. CONCLUSIONS: We conclude that, on the basis of experimental data excluding the CHRNA7 duplicon as the site of inversion, and sequence analysis of regional duplications, the most likely rearrangement site is within a GLP/LCR15 duplicon. This study further exemplifies the genomic plasticity due to the presence of segmental duplications and highlights their importance for a complete understanding of genome evolution.
Subject(s)
Chromosome Breakage/genetics, Chromosome Inversion, Gene Duplication, Multigene Family/genetics, Pan troglodytes/genetics, Animals, Human Chromosome Pair 15/genetics, Gorilla gorilla, Humans, Fluorescence In Situ Hybridization/methods, Pan paniscus, Pongo pygmaeus, Nucleic Acid Sequence Homology
ABSTRACT
Large-scale genomic rearrangements are a major force of evolutionary change and the ascertainment of such events between the human and great ape genomes is fundamental to a complete understanding of the genetic history and evolution of our species. Here, we present the results of an evolutionary analysis utilizing array comparative genomic hybridization (array CGH), measuring copy-number gains and losses among these species. Using an array of 2460 human bacterial artificial chromosomes (BACs) (12% of the genome), we identified a total of 63 sites of putative DNA copy-number variation between humans and the great apes (chimpanzee, bonobo, gorilla, and orangutan). Detailed molecular characterization of a subset of these sites confirmed rearrangements ranging from 40 to at least 175 kb in size. Surprisingly, the majority of variant sites differentiating great ape and human genomes were found within interstitial euchromatin. These data suggest that such large-scale events are not restricted solely to subtelomeric or pericentromeric regions, but also occur within genic regions. In addition, 5/9 of the verified variant sites localized to areas of intrachromosomal segmental duplication within the human genome. On the basis of the frequency of duplication in humans, this represents a 14-fold positional bias. In contrast to previous cytogenetic and comparative mapping studies, these results indicate extensive local repatterning of hominoid chromosomes in euchromatic regions through a duplication-driven mechanism of genome evolution.
Subject(s)
Genetic Variation/genetics, Human Genome, Genome, Hominidae/genetics, Oligonucleotide Array Sequence Analysis/methods, Animals, Chromosomes/genetics, Human Chromosomes/genetics, Gorilla gorilla/genetics, Humans, Nucleic Acid Hybridization/methods, Pan paniscus/genetics, Pan troglodytes/genetics, Pongo pygmaeus/genetics
ABSTRACT
The nucleocapsid (N) protein genes from 24 Newcastle disease virus (NDV) isolates representing various pathotypes with different geographical and chronological origins were cloned and sequenced. The N-terminal region of the N protein to residue 401 was highly conserved among isolates with several conservative substitutions occurring that correlated with phylogenetic relationships. Variability of the N protein was detected in the C-terminal portion similar to what has been reported for other members of the Paramyxovirinae. Amino acids previously identified as invariant or highly conserved in N proteins of other paramyxoviruses were also present in the NDV protein. Phylogenetic analysis of N gene coding sequences among NDV isolates again demonstrated the existence of two major groups. One clade contained viruses that included vaccine and virulent strains isolated in the USA prior to 1970 while a second clade included vaccine and virulent viruses isolated worldwide. Comparison of N protein amino acid sequences among members of the Paramyxoviridae resulted in NDV and avian paramyxovirus 6 separating as a cluster distinct from the Rubulavirus genus. This provides further support for avian paramyxoviruses being considered for their own genus among the Paramyxovirinae.