ABSTRACT
Using comparative sequencing approaches, we investigated the evolutionary history of the European-enriched 17q21.31 MAPT inversion polymorphism. We present a detailed, BAC-based sequence assembly of the inverted human H2 haplotype and compare it to the sequence structure and genetic variation of the corresponding 1.5-Mb region for the noninverted H1 human haplotype and that of chimpanzee and orangutan. We found that inversion of the MAPT region is similarly polymorphic in other great ape species, and we present evidence that the inversions occurred independently in chimpanzees and humans. In humans, the inversion breakpoints correspond to core duplications with the LRRC37 gene family. Our analysis favors the H2 configuration and sequence haplotype as the likely great ape and human ancestral state, with inversion recurrences during primate evolution. We show that the H2 architecture has evolved more extensive sequence homology, perhaps explaining its tendency to undergo microdeletion associated with mental retardation in European populations.
Subjects
Chromosome Inversion , Chromosomes, Human, Pair 17 , Evolution, Molecular , Polymorphism, Genetic , tau Proteins/genetics , Animals , Base Sequence , Gene Duplication , Humans , Models, Biological , Molecular Sequence Data , Pan troglodytes/genetics , Phylogeny , Pongo pygmaeus/genetics , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: Clostridium bolteae and Clostridium clostridioforme, previously included in the complex C. clostridioforme in the group Clostridium XIVa, remain difficult to distinguish by phenotypic methods. These bacteria, prevailing in the human intestinal microbiota, are opportunistic pathogens with various drug susceptibility patterns. In order to better characterize the two species and to obtain information on their antibiotic resistance genes, we analyzed the genomes of six strains of C. bolteae and six strains of C. clostridioforme, isolated from human infection. RESULTS: The genome length of C. bolteae varied from 6159 to 6398 kb, and 5719 to 6059 CDSs were detected. The genomes of C. clostridioforme were smaller, between 5467 and 5927 kb, and contained 5231 to 5916 CDSs. The two species display different metabolic pathways. The genomes of C. bolteae contained lactose operons involving a PTS system and complex regulation, which contribute to phenotypic differentiation from C. clostridioforme. The acetyl-CoA pathway, similar to that of Faecalibacterium prausnitzii, a major butyrate producer in the human gut, was found only in C. clostridioforme. The two species have also developed diverse flagellar motility systems contributing to gut colonization. Their genomes harboured many CDSs involved in resistance to beta-lactams, glycopeptides, macrolides, chloramphenicol, lincosamides, rifampin, linezolid, bacitracin, aminoglycosides and tetracyclines. Overall, antimicrobial resistance genes were similar within a species, but strain-specific resistance genes were found. We discovered a new group of genes coding for rifampin resistance in C. bolteae. C. bolteae 90B3 was resistant to phenicols and linezolid by producing a 23S rRNA methyltransferase. C. clostridioforme 90A8 contained the VanB-type Tn1549 operon conferring vancomycin resistance. We also detected numerous genes encoding proteins related to efflux pump systems. CONCLUSION: Genomic comparison of C. bolteae and C. clostridioforme revealed functional differences in butyrate pathways and in flagellar systems, which play a critical role within the human microbiota. Most of the resistance genes detected in both species were previously characterized in other bacterial species. A few of them were related to antibiotics inactive against Clostridium spp. Some were part of mobile genetic elements, suggesting that these commensals of the human microbiota act as a reservoir of antimicrobial resistance.
Subjects
Anti-Bacterial Agents/pharmacology , Clostridium/drug effects , Clostridium/genetics , Drug Resistance, Bacterial/genetics , Genome, Bacterial , Genomics , Biosynthetic Pathways , Butyrates/metabolism , Clostridium/classification , Clostridium/metabolism , Genomics/methods , Humans , Phylogeny
ABSTRACT
Fusobacterium nucleatum is a strictly anaerobic, Gram-negative bacterial species that has been associated with dental infections, pre-term labor, appendicitis, inflammatory bowel disease, and, more recently, colorectal cancer. The species is unusual in its phenotypic and genotypic heterogeneity, with some strains demonstrating a more virulent phenotype than others; however, the genetic basis for these differences is not yet understood. Bacteriophages are known to contribute to the virulence phenotype of several bacterial species. In this work, we set out to characterize the bacteriophage associated with F. nucleatum subsp. animalis strain 7-1, a highly invasive isolate from the human gastrointestinal tract. We also used computational approaches to predict and compare bacteriophage signatures across available sequenced F. nucleatum genomes.
Subjects
Bacteriophages/genetics , Fusobacterium nucleatum/virology , Genome, Viral , Genomics , Bacteriophages/classification , Bacteriophages/ultrastructure , Cluster Analysis , Computational Biology/methods , DNA, Viral , Genomics/methods , Humans , Molecular Sequence Annotation , Sequence Analysis, DNA
ABSTRACT
Exceptionally accurate genome reference sequences have proven to be of great value to microbial researchers. Thus, to date, about 1800 bacterial genome assemblies have been "finished" at great expense with the aid of manual laboratory and computational processes that typically iterate over a period of months or even years. By applying a new laboratory design and new assembly algorithm to 16 samples, we demonstrate that assemblies exceeding finished quality can be obtained from whole-genome shotgun data and automated computation. Cost and time requirements are thus dramatically reduced.
Subjects
Bacteria/genetics , Genome, Bacterial , Genomic Library , Sequence Analysis, DNA/methods , Algorithms
ABSTRACT
The degree to which molecular epidemiology reveals information about the sources and transmission patterns of an outbreak depends on the resolution of the technology used and the samples studied. Isolates of Escherichia coli O104:H4 from the outbreak centered in Germany in May-July 2011, and the much smaller outbreak in southwest France in June 2011, were indistinguishable by standard tests. We report a molecular epidemiological analysis using multiplatform whole-genome sequencing and analysis of multiple isolates from the German and French outbreaks. Isolates from the German outbreak showed remarkably little diversity, with only two single nucleotide polymorphisms (SNPs) found in isolates from four individuals. Surprisingly, we found much greater diversity (19 SNPs) in isolates from seven individuals infected in the French outbreak. The German isolates form a clade within the more diverse French outbreak strains. Moreover, five isolates derived from a single infected individual from the French outbreak had extremely limited diversity. The striking difference in diversity between the German and French outbreak samples is consistent with several hypotheses, including a bottleneck that purged diversity in the German isolates, variation in mutation rates in the two E. coli outbreak populations, or uneven distribution of diversity in the seed populations that led to each outbreak.
Subjects
Disease Outbreaks/statistics & numerical data , Escherichia coli Infections/epidemiology , Escherichia coli Infections/microbiology , Escherichia coli/genetics , Escherichia coli/isolation & purification , Escherichia coli Infections/genetics , Europe/epidemiology , Humans , Models, Genetic , Phylogeny , Polymorphism, Single Nucleotide/genetics
ABSTRACT
We have sequenced the genomes of 18 isolates of the closely related human pathogenic fungi Coccidioides immitis and Coccidioides posadasii to more clearly elucidate population genomic structure, bringing the total number of sequenced genomes for each species to 10. Our data confirm earlier microsatellite-based findings that these species are genetically differentiated, but our population genomics approach reveals that hybridization and genetic introgression have recently occurred between the two species. The directionality of introgression is primarily from C. posadasii to C. immitis, and we find more than 800 genes exhibiting strong evidence of introgression in one or more sequenced isolates. We performed PCR-based sequencing of one region exhibiting introgression in 40 C. immitis isolates to confirm and better define the extent of gene flow between the species. We find more coding sequence than expected by chance in the introgressed regions, suggesting that natural selection may play a role in the observed genetic exchange. We find notable heterogeneity in repetitive sequence composition among the sequenced genomes and present the first detailed genome-wide profile of a repeat-induced point mutation (RIP) process distinctly different from what has been observed in Neurospora. We identify promiscuous HLA-I and HLA-II epitopes in both proteomes and discuss the possible implications of introgression and population genomic data for public health and vaccine candidate prioritization. This study highlights the importance of population genomic data for detecting subtle but potentially important phenomena such as introgression.
Subjects
Coccidioides/genetics , DNA Transposable Elements/physiology , Gene Expression Regulation, Fungal/genetics , Hybridization, Genetic/genetics , Base Sequence , California , Evolution, Molecular , Genetic Variation , Genome, Fungal , Metagenomics , Molecular Sequence Data , Mutagenesis, Insertional/physiology , Polymorphism, Single Nucleotide , Repetitive Sequences, Nucleic Acid/genetics , Sequence Analysis, DNA
ABSTRACT
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome, mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome. It is also enriched in segmental duplications, ranking third in density among the autosomes. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution, the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
Subjects
Chromosomes, Human, Pair 17/genetics , Evolution, Molecular , Animals , Base Composition , Gene Duplication , Humans , Long Interspersed Nucleotide Elements/genetics , Mice , Sequence Analysis, DNA , Short Interspersed Nucleotide Elements/genetics , Synteny/genetics
ABSTRACT
Here we present a finished sequence of human chromosome 15, together with a high-quality gene catalogue. As chromosome 15 is one of seven human chromosomes with a high rate of segmental duplication, we have carried out a detailed analysis of the duplication structure of the chromosome. Segmental duplications in chromosome 15 are largely clustered in two regions, on proximal and distal 15q; the proximal region is notable because recombination among the segmental duplications can result in deletions causing Prader-Willi and Angelman syndromes. Sequence analysis shows that the proximal and distal regions of 15q share extensive ancient similarity. Using a simple approach, we have been able to reconstruct many of the events by which the current duplication structure arose. We find that most of the intrachromosomal duplications seem to share a common ancestry. Finally, we demonstrate that some remaining gaps in the genome sequence are probably due to structural polymorphisms between haplotypes; this may explain a significant fraction of the gaps remaining in the human genome.
Subjects
Chromosomes, Human, Pair 15/genetics , Evolution, Molecular , Gene Duplication , Animals , Conserved Sequence/genetics , Genes , Genome, Human , Haplotypes/genetics , Humans , Macaca mulatta/genetics , Molecular Sequence Data , Multigene Family/genetics , Phylogeny , Polymorphism, Genetic/genetics , Sequence Analysis, DNA , Synteny/genetics
ABSTRACT
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Subjects
Dogs/genetics , Evolution, Molecular , Genome/genetics , Genomics , Haplotypes/genetics , Animals , Conserved Sequence/genetics , Dog Diseases/genetics , Dogs/classification , Female , Humans , Hybridization, Genetic , Male , Mice , Mutagenesis/genetics , Polymorphism, Single Nucleotide/genetics , Rats , Short Interspersed Nucleotide Elements/genetics , Synteny/genetics
ABSTRACT
Chromosome 18 appears to have the lowest gene density of any human chromosome and is one of only three chromosomes for which trisomic individuals survive to term. There are also a number of genetic disorders stemming from chromosome 18 trisomy and aneuploidy. Here we report the finished sequence and gene annotation of human chromosome 18, which will allow a better understanding of the normal and disease biology of this chromosome. Despite the low density of protein-coding genes on chromosome 18, we find that the proportion of non-protein-coding sequences evolutionarily conserved among mammals is close to the genome-wide average. Extending this analysis to the entire human genome, we find that the density of conserved non-protein-coding sequences is largely uncorrelated with gene density. This has important implications for the nature and roles of non-protein-coding sequence elements.
Subjects
Chromosomes, Human, Pair 18/genetics , DNA/genetics , Aneuploidy , Animals , Conserved Sequence/genetics , CpG Islands/genetics , Exons/genetics , Expressed Sequence Tags , Genes/genetics , Genome, Human , Humans , Introns/genetics , Molecular Sequence Data , Sequence Analysis, DNA , Synteny
ABSTRACT
To understand the evolutionary pathways that lead to emerging infections of vertebrates, here we explore the genomic innovations that allow free-living chytrid fungi to adapt to and colonize amphibian hosts. Sequencing and comparing the genomes of two pathogenic species of Batrachochytrium to those of close saprophytic relatives reveals that pathogenicity is associated with remarkable expansions of protease and cell wall gene families, while divergent infection strategies are linked to radiations of lineage-specific gene families. By comparing the host-pathogen response to infection for both pathogens, we illuminate the traits that underpin a strikingly different immune response within a shared host species. Our results show that, despite commonalities that promote infection, specific gene-family radiations contribute to distinct infection strategies. The breadth and evolutionary novelty of candidate virulence factors that we discover underscore the urgent need to halt the advance of pathogenic chytrids and prevent incipient loss of biodiversity.
Subjects
Cell Wall/genetics , Chytridiomycota/genetics , Communicable Diseases, Emerging/microbiology , Dermatomycoses/microbiology , Peptide Hydrolases/genetics , Salamandridae/microbiology , Virulence Factors/genetics , Animals , Biodiversity , Chytridiomycota/pathogenicity , Evolution, Molecular , Genomics
ABSTRACT
Pneumocystis jirovecii is a major cause of life-threatening pneumonia in immunosuppressed patients including transplant recipients and those with HIV/AIDS, yet surprisingly little is known about the biology of this fungal pathogen. Here we report near complete genome assemblies for three Pneumocystis species that infect humans, rats and mice. Pneumocystis genomes are highly compact relative to other fungi, with substantial reductions of ribosomal RNA genes, transporters, transcription factors and many metabolic pathways, but contain expansions of surface proteins, especially a unique and complex surface glycoprotein superfamily, as well as proteases and RNA processing proteins. Unexpectedly, the key fungal cell wall components chitin and outer chain N-mannans are absent, based on genome content and experimental validation. Our findings suggest that Pneumocystis has developed unique mechanisms of adaptation to life exclusively in mammalian hosts, including dependence on the lungs for gas and nutrients and highly efficient strategies to escape both host innate and acquired immune defenses.
Subjects
Adaptation, Biological , Genome, Fungal , Host-Pathogen Interactions/genetics , Pneumocystis carinii/genetics , Animals , Cell Wall/metabolism , Humans , Lung/microbiology , Metabolic Networks and Pathways/genetics , Mice , Multigene Family , Pneumocystis carinii/metabolism , Rats , Synteny
ABSTRACT
Cercospora arachidicola, causal agent of early leaf spot, is an economically important peanut pathogen. Lack of genetic information about this fungus prevents understanding the role that potentially diverse genotypes may have in peanut breeding programs. Here, we report for the first time a draft genome sequence of C. arachidicola.
ABSTRACT
Sporothrix schenckii is a pathogenic dimorphic fungus that grows as a yeast and as mycelia. This species is the causative agent of sporotrichosis, typically a skin infection. We report the genome sequence of S. schenckii, which will facilitate the study of this fungus and of the Sporothrix schenckii group.
ABSTRACT
The domestic dog, Canis familiaris, is a well-established model system for mapping trait and disease loci. While the original draft sequence was of good quality, gaps were abundant, particularly in promoter regions of the genome, negatively impacting the annotation and study of candidate genes. Here, we present an improved genome build, canFam3.1, which includes 85 MB of novel sequence and now covers 99.8% of the euchromatic portion of the genome. We also present multiple RNA-Sequencing data sets from 10 different canine tissues to catalog ~175,000 expressed loci. While about 90% of the coding genes previously annotated by EnsEMBL have measurable expression in at least one sample, the number of transcript isoforms detected by our data expands the EnsEMBL annotations by a factor of four. Syntenic comparison with the human genome revealed an additional ~3,000 loci that are characterized as protein coding in human and were also expressed in the dog, suggesting that those were previously not annotated in the EnsEMBL canine gene set. In addition to ~20,700 high-confidence protein coding loci, we found ~4,600 antisense transcripts overlapping exons of protein coding genes, ~7,200 intergenic multi-exon transcripts without coding potential, likely candidates for long intergenic non-coding RNAs (lincRNAs), and ~11,000 transcripts that were reported by two different library construction methods but did not fit any of the above categories. Of the lincRNAs, about 6,000 have no annotated orthologs in human or mouse. Functional analysis of two novel transcripts with shRNA in a mouse kidney cell line altered cell morphology and motility. All in all, we provide a much-improved annotation of the canine genome and suggest regulatory functions for several of the novel non-coding transcripts.