Results 1 - 20 of 41
1.
Cell ; 173(2): 305-320.e10, 2018 04 05.
Article in English | MEDLINE | ID: mdl-29625049

ABSTRACT

The Cancer Genome Atlas (TCGA) has catalyzed systematic characterization of diverse genomic alterations underlying human cancers. At this historic junction marking the completion of genomic characterization of over 11,000 tumors from 33 cancer types, we present our current understanding of the molecular processes governing oncogenesis. We illustrate our insights into cancer through synthesis of the findings of the TCGA PanCancer Atlas project on three facets of oncogenesis: (1) somatic driver mutations, germline pathogenic variants, and their interactions in the tumor; (2) the influence of the tumor genome and epigenome on transcriptome and proteome; and (3) the relationship between tumor and the microenvironment, including implications for drugs targeting driver events and immunotherapies. These results will anchor future characterization of rare and common tumor types, primary and relapsed tumors, and cancers across ancestry groups and will guide the deployment of clinical genomic sequencing.


Subject(s)
Carcinogenesis/genetics , Genomics , Neoplasms/pathology , DNA Repair/genetics , Databases, Genetic , Genes, Neoplasm , Humans , Metabolic Networks and Pathways/genetics , Microsatellite Instability , Mutation , Neoplasms/genetics , Neoplasms/immunology , Transcriptome , Tumor Microenvironment/genetics
2.
Cell ; 173(2): 371-385.e18, 2018 04 05.
Article in English | MEDLINE | ID: mdl-29625053

ABSTRACT

Identifying molecular cancer drivers is critical for precision oncology. Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors.


Subject(s)
Neoplasms/pathology , Algorithms , B7-H1 Antigen/genetics , Computational Biology , Databases, Genetic , Entropy , Humans , Microsatellite Instability , Mutation , Neoplasms/genetics , Neoplasms/immunology , Principal Component Analysis , Programmed Cell Death 1 Receptor/genetics
4.
Nat Methods ; 19(4): 429-440, 2022 04.
Article in English | MEDLINE | ID: mdl-35396482

ABSTRACT

Evaluating metagenomic software is key for optimizing metagenome interpretation and is the focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains were still challenging for assembly and genome recovery through binning, as was assembly quality for the latter. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks, but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including top performers with other metrics. The results identify challenges and guide researchers in selecting methods for analyses.


Subject(s)
Metagenome , Metagenomics , Archaea/genetics , Metagenomics/methods , Reproducibility of Results , Sequence Analysis, DNA , Software
5.
Genet Med ; 24(6): 1316-1327, 2022 06.
Article in English | MEDLINE | ID: mdl-35311657

ABSTRACT

PURPOSE: Retrospective interpretation of sequenced data in light of the current literature is a major concern of the field. Such reinterpretation is manual, and both human resources and variable operating procedures are the main bottlenecks. METHODS: The Genome Alert! method automatically reports changes with potential clinical significance in variant classification between releases of the ClinVar database. Using ClinVar submissions across time, this method assigns a validity category to gene-disease associations. RESULTS: Between July 2017 and December 2019, the retrospective analysis of ClinVar submissions revealed a monthly median of 1247 changes in variant classification with potential clinical significance and 23 new gene-disease associations. Re-examination of 4929 targeted sequencing files highlighted 45 changes in variant classification, and of these classifications, 89% were expert validated, leading to 4 additional diagnoses. The Genome Alert! gene-disease association catalog provided 75 high-confidence associations not available in the OMIM morbid list, of which 20% became available in the OMIM morbid list. For more than 356 negative exome sequencing data that were reannotated for variants in these 75 genes, this elective approach led to a new diagnosis. CONCLUSION: Genome Alert! (https://genomealert.univ-grenoble-alpes.fr/) enables systematic and reproducible reinterpretation of acquired sequencing data in a clinical routine with limited human resource effect.
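
The release-to-release comparison described under METHODS can be illustrated with a minimal sketch. It assumes each ClinVar release has already been reduced to a mapping from a variant identifier to its classification. The mapping format, the function name `classification_changes`, and the example identifiers are all hypothetical and are not taken from Genome Alert! itself.

```python
def classification_changes(old_release, new_release):
    """Return {variant_id: (old_class, new_class)} for variants whose class differs.

    Both arguments are hypothetical dicts mapping a variant identifier to a
    classification string; a variant absent from a release is treated as None.
    """
    changes = {}
    for variant_id in sorted(set(old_release) | set(new_release)):
        old_class = old_release.get(variant_id)
        new_class = new_release.get(variant_id)
        if old_class != new_class:
            changes[variant_id] = (old_class, new_class)
    return changes


# Made-up example data, for illustration only.
previous = {"var1": "Uncertain significance", "var2": "Benign"}
current = {"var1": "Likely pathogenic", "var2": "Benign", "var3": "Pathogenic"}
changes = classification_changes(previous, current)
```

A real pipeline would also have to decide which transitions count as having potential clinical significance; that filtering step is not sketched here.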


Subject(s)
Databases, Genetic , Genetic Variation , Genetic Variation/genetics , Genome, Human/genetics , Genomics , Humans , Phenotype , Retrospective Studies
6.
BMC Genomics ; 22(1): 389, 2021 May 26.
Article in English | MEDLINE | ID: mdl-34039264

ABSTRACT

BACKGROUND: Whole genome sequencing of cultured pathogens is the state of the art public health response for the bioinformatic source tracking of illness outbreaks. Quasimetagenomics can substantially reduce the amount of culturing needed before a high quality genome can be recovered. Highly accurate short read data is analyzed for single nucleotide polymorphisms and multi-locus sequence types to differentiate strains but cannot span many genomic repeats, resulting in highly fragmented assemblies. Long reads can span repeats, resulting in much more contiguous assemblies, but have lower accuracy than short reads. RESULTS: We evaluated the accuracy of Listeria monocytogenes assemblies from enrichments (quasimetagenomes) of naturally-contaminated ice cream using long read (Oxford Nanopore) and short read (Illumina) sequencing data. Accuracy of ten assembly approaches, over a range of sequencing depths, was evaluated by comparing sequence similarity of genes in assemblies to a complete reference genome. Long read assemblies reconstructed a circularized genome as well as a 71 kbp plasmid after 24 h of enrichment; however, high error rates prevented high fidelity gene assembly, even at 150X depth of coverage. Short read assemblies accurately reconstructed the core genes after 28 h of enrichment but produced highly fragmented genomes. Hybrid approaches demonstrated promising results but had biases based upon the initial assembly strategy. Short read assemblies scaffolded with long reads accurately assembled the core genes after just 24 h of enrichment, but were highly fragmented. Long read assemblies polished with short reads reconstructed a circularized genome and plasmid and assembled all the genes after 24 h enrichment but with less fidelity for the core genes than the short read assemblies. CONCLUSION: The integration of long and short read sequencing of quasimetagenomes expedited the reconstruction of a high quality pathogen genome compared to either platform alone. A new and more complete level of information about genome structure, gene order and mobile elements can be added to the public health response by incorporating long read analyses with the standard short read WGS outbreak response.


Subject(s)
Listeria monocytogenes , Nanopores , Genomics , High-Throughput Nucleotide Sequencing , Listeria monocytogenes/genetics , Sequence Analysis, DNA , Whole Genome Sequencing
7.
J Antimicrob Chemother ; 76(5): 1299-1302, 2021 04 13.
Article in English | MEDLINE | ID: mdl-33417711

ABSTRACT

OBJECTIVES: To estimate the transmission rate of carbapenemase-producing Enterobacteriaceae (CPE) in households with recently hospitalized CPE carriers. METHODS: We conducted a prospective case-ascertained cohort study. We identified the presence of CPE in stool samples from index subjects, household contacts and companion animals and environmental samples at regular intervals. Linked transmissions were identified by WGS. A Markov model was constructed to estimate the household transmission potential of CPE. RESULTS: Ten recently hospitalized index patients and 14 household contacts were included. There were seven households with one contact, two households with two contacts, and one household with three contacts. Index patients were colonized with blaOXA-48-like (n = 4), blaKPC-2 (n = 3), blaIMP (n = 2), and blaNDM-1 (n = 1), distributed among divergent species of Enterobacteriaceae. After a cumulative follow-up time of 9.0 years, three family members (21.4%, 3/14) acquired four different types of CPE in the community (hazard rate of 0.22/year). The probability of CPE transmission from an index patient to a household contact was 10% (95% CI 4%-26%). CONCLUSIONS: We observed limited transmission of CPE from an index patient to household contacts. Larger studies are needed to understand the factors associated with household transmission of CPE and identify preventive strategies.


Subject(s)
Carbapenem-Resistant Enterobacteriaceae , Enterobacteriaceae Infections , Bacterial Proteins/genetics , Carbapenem-Resistant Enterobacteriaceae/genetics , Cohort Studies , Enterobacteriaceae Infections/epidemiology , Humans , Prospective Studies , beta-Lactamases/genetics
8.
Nature ; 522(7555): 173-8, 2015 Jun 11.
Article in English | MEDLINE | ID: mdl-26040716

ABSTRACT

Stem cells of the gastrointestinal tract, pancreas, liver and other columnar epithelia collectively resist cloning in their elemental states. Here we demonstrate the cloning and propagation of highly clonogenic, 'ground state' stem cells of the human intestine and colon. We show that derived stem-cell pedigrees sustain limited copy number and sequence variation despite extensive serial passaging and display exquisitely precise, cell-autonomous commitment to epithelial differentiation consistent with their origins along the intestinal tract. This developmentally patterned and epigenetically maintained commitment of stem cells is likely to enforce the functional specificity of the adult intestinal tract. Using clonally derived colonic epithelia, we show that toxins A or B of the enteric pathogen Clostridium difficile recapitulate the salient features of pseudomembranous colitis. The stability of the epigenetic commitment programs of these stem cells, coupled with their unlimited replicative expansion and maintained clonogenicity, suggests certain advantages for their use in disease modelling and regenerative medicine.


Subject(s)
Intestines/cytology , Stem Cells/cytology , Stem Cells/metabolism , Bacterial Toxins/pharmacology , Cell Differentiation/drug effects , Cell Lineage , Cells, Cultured , Clone Cells/cytology , Clone Cells/metabolism , Clostridioides difficile/physiology , Colon/cytology , Colon/drug effects , Enterocolitis, Pseudomembranous/microbiology , Enterocolitis, Pseudomembranous/pathology , Epigenesis, Genetic/genetics , Epithelium/drug effects , Epithelium/metabolism , Fetus/cytology , Genomic Instability/genetics , Humans , Intestine, Small/cytology , Intestines/drug effects , Organoids/cytology , Organoids/growth & development
9.
Emerg Infect Dis ; 26(9): 2182-2185, 2020 09.
Article in English | MEDLINE | ID: mdl-32818397

ABSTRACT

To determine the duration of carbapenemase-producing Enterobacteriaceae (CPE) carriage, we studied 21 CPE carriers for ≈1 year. Mean carriage duration was 86 days; probability of decolonization in 1 year was 98.5%, suggesting that CPE-carriers' status can be reviewed yearly. Prolonged carriage was associated with use of antimicrobial drugs.


Subject(s)
Carbapenem-Resistant Enterobacteriaceae , Enterobacteriaceae Infections , Bacterial Proteins/genetics , Enterobacteriaceae Infections/epidemiology , Hospitals , Humans , beta-Lactamases/genetics
10.
Bioinformatics ; 34(22): 3907-3914, 2018 11 15.
Article in English | MEDLINE | ID: mdl-29868820

ABSTRACT

Motivation: As we move toward an era of precision medicine, the ability to predict patient-specific drug responses in cancer based on molecular information such as gene expression data represents both an opportunity and a challenge. In particular, methods are needed that can accommodate the high-dimensionality of data to learn interpretable models capturing drug response mechanisms, as well as providing robust predictions across datasets. Results: We propose a method based on ideas from 'recommender systems' (CaDRReS) that predicts cancer drug responses for unseen cell-lines/patients based on learning projections for drugs and cell-lines into a latent 'pharmacogenomic' space. Comparisons with other proposed approaches for this problem based on large public datasets (CCLE and GDSC) show that CaDRReS provides consistently good models and robust predictions even across unseen patient-derived cell-line datasets. Analysis of the pharmacogenomic spaces inferred by CaDRReS also suggests that they can be used to understand drug mechanisms, identify cellular subtypes and further characterize drug-pathway associations. Availability and implementation: Source code and datasets are available at https://github.com/CSB5/CaDRReS. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Antineoplastic Agents/therapeutic use , Neoplasms , Humans , Neoplasms/drug therapy , Pharmacogenetics , Precision Medicine , Software
11.
Eur Respir J ; 52(1)2018 07.
Article in English | MEDLINE | ID: mdl-29880655

ABSTRACT

Understanding the composition and clinical importance of the fungal mycobiome was recently identified as a key topic in a "research priorities" consensus statement for bronchiectasis. Patients were recruited as part of the CAMEB study: an international multicentre cross-sectional Cohort of Asian and Matched European Bronchiectasis patients. The mycobiome was determined in 238 patients by targeted amplicon shotgun sequencing of the 18S-28S rRNA internally transcribed spacer regions ITS1 and ITS2. Specific quantitative PCR for detection of and conidial quantification for a range of airway Aspergillus species was performed. Sputum galactomannan, Aspergillus specific IgE, IgG and TARC (thymus and activation regulated chemokine) levels were measured systemically and associated to clinical outcomes. The bronchiectasis mycobiome is distinct and characterised by specific fungal genera, including Aspergillus, Cryptococcus and Clavispora. Aspergillus fumigatus (in Singapore/Kuala Lumpur) and Aspergillus terreus (in Dundee) dominated profiles, the latter associating with exacerbations. High frequencies of Aspergillus-associated disease including sensitisation and allergic bronchopulmonary aspergillosis were detected. Each revealed distinct mycobiome profiles, and associated with more severe disease, poorer pulmonary function and increased exacerbations. The pulmonary mycobiome is of clinical relevance in bronchiectasis. Screening for Aspergillus-associated disease should be considered even in apparently stable patients.


Subject(s)
Bronchiectasis/complications , Fungi/classification , Mycobiome , Pulmonary Aspergillosis/complications , Adult , Aged , Antibodies, Fungal/blood , Aspergillus , Bronchiectasis/immunology , Bronchiectasis/microbiology , Cohort Studies , Cross-Sectional Studies , Disease Progression , Female , Humans , Immunoglobulin Isotypes/blood , Malaysia , Male , Middle Aged , Pulmonary Aspergillosis/immunology , Singapore , Sputum/microbiology , United Kingdom
12.
Genome Res ; 24(10): 1559-71, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25186909

ABSTRACT

Chromosomal structural variations play an important role in determining the transcriptional landscape of human breast cancers. To assess the nature of these structural variations, we analyzed eight breast tumor samples with a focus on regions of gene amplification using mate-pair sequencing of long-insert genomic DNA with matched transcriptome profiling. We found that tandem duplications appear to be early events in tumor evolution, especially in the genesis of amplicons. In a detailed reconstruction of events on chromosome 17, we found large unpaired inversions and deletions connect a tandemly duplicated ERBB2 with neighboring 17q21.3 amplicons while simultaneously deleting the intervening BRCA1 tumor suppressor locus. This series of events appeared to be unusually common when examined in larger genomic data sets of breast cancers albeit using approaches with lesser resolution. Using siRNAs in breast cancer cell lines, we showed that the 17q21.3 amplicon harbored a significant number of weak oncogenes that appeared consistently coamplified in primary tumors. Down-regulation of BRCA1 expression augmented the cell proliferation in ERBB2-transfected human normal mammary epithelial cells. Coamplification of other functionally tested oncogenic elements in other breast tumors examined, such as RIPK2 and MYC on chromosome 8, also parallel these findings. Our analyses suggest that structural variations efficiently orchestrate the gain and loss of cancer gene cassettes that engage many oncogenic pathways simultaneously and that such oncogenic cassettes are favored during the evolution of a cancer.


Subject(s)
BRCA1 Protein/genetics , Breast Neoplasms/genetics , Chromosome Aberrations , Chromosomes, Human, Pair 17/genetics , Receptor, ErbB-2/genetics , Base Sequence , Cell Line, Tumor , Female , Gene Amplification , Gene Duplication , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , MCF-7 Cells , Molecular Sequence Data , Sequence Analysis, DNA
13.
Nucleic Acids Res ; 43(7): e44, 2015 Apr 20.
Article in English | MEDLINE | ID: mdl-25572314

ABSTRACT

Extensive and multi-dimensional data sets generated from recent cancer omics profiling projects have presented new challenges and opportunities for unraveling the complexity of cancer genome landscapes. In particular, distinguishing the unique complement of genes that drive tumorigenesis in each patient from a sea of passenger mutations is necessary for translating the full benefit of cancer genome sequencing into the clinic. We address this need by presenting a data integration framework (OncoIMPACT) to nominate patient-specific driver genes based on their phenotypic impact. Extensive in silico and in vitro validation helped establish OncoIMPACT's robustness, improved precision over competing approaches and verifiable patient and cell line specific predictions (2/2 and 6/7 true positives and negatives, respectively). In particular, we computationally predicted and experimentally validated the gene TRIM24 as a putative novel amplified driver in a melanoma patient. Applying OncoIMPACT to more than 1000 tumor samples, we generated patient-specific driver gene lists in five different cancer types to identify modes of synergistic action. We also provide the first demonstration that computationally derived driver mutation signatures can be overall superior to single gene and gene expression based signatures in enabling patient stratification and prognostication. Source code and executables for OncoIMPACT are freely available from http://sourceforge.net/projects/oncoimpact.


Subject(s)
Melanoma/genetics , Algorithms , Humans , Melanoma/physiopathology , Mutation , Risk Assessment , Survival Analysis
14.
Nucleic Acids Res ; 40(22): 11189-201, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23066108

ABSTRACT

The study of cell-population heterogeneity in a range of biological systems, from viruses to bacterial isolates to tumor samples, has been transformed by recent advances in sequencing throughput. While the high-coverage afforded can be used, in principle, to identify very rare variants in a population, existing ad hoc approaches frequently fail to distinguish true variants from sequencing errors. We report a method (LoFreq) that models sequencing run-specific error rates to accurately call variants occurring in <0.05% of a population. Using simulated and real datasets (viral, bacterial and human), we show that LoFreq has near-perfect specificity, with significantly improved sensitivity compared with existing methods and can efficiently analyze deep Illumina sequencing datasets without resorting to approximations or heuristics. We also present experimental validation for LoFreq on two different platforms (Fluidigm and Sequenom) and its application to call rare somatic variants from exome sequencing datasets for gastric cancer. Source code and executables for LoFreq are freely available at http://sourceforge.net/projects/lofreq/.
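
As a rough illustration of testing observed variant reads against modelled sequencing error, the sketch below computes the probability of seeing at least `k` erroneous bases at one position when each read has its own error probability (a Poisson-binomial tail). It assumes the per-base error probabilities come from Phred quality scores; the abstract does not give LoFreq's actual model, thresholds, or multiple-testing correction, so the function names and the example numbers here are hypothetical.

```python
def phred_to_error_prob(quality):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-quality / 10)


def poisson_binomial_tail(error_probs, k):
    """Return P(X >= k), where X counts errors across independent bases.

    Each base i is wrong with probability error_probs[i]. The distribution
    of X is built up one base at a time by dynamic programming.
    """
    n = len(error_probs)
    # dist[j] = probability of exactly j errors among the bases seen so far
    dist = [1.0] + [0.0] * n
    for p in error_probs:
        # Go from high j to low j so dist[j - 1] is still the previous value.
        for j in range(n, 0, -1):
            dist[j] = dist[j] * (1 - p) + dist[j - 1] * p
        dist[0] *= 1 - p
    return sum(dist[k:])


# Made-up example: 1000 reads at Phred 30, of which 10 show a non-reference base.
error_probs = [phred_to_error_prob(30)] * 1000
p_value = poisson_binomial_tail(error_probs, 10)
```

A small `p_value` means the observed non-reference count is unlikely under error alone; what cut-off to apply is a separate design decision that this sketch leaves open.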


Subject(s)
Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Computer Simulation , Dengue Virus/genetics , Escherichia coli/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Mutation , Sensitivity and Specificity , Stomach Neoplasms/genetics , Viral Proteins/chemistry , Viral Proteins/genetics
15.
Eur J Cancer ; 202: 113978, 2024 May.
Article in English | MEDLINE | ID: mdl-38471290

ABSTRACT

BACKGROUND: The PAOLA-1/ENGOT-ov25 trial showed that maintenance olaparib plus bevacizumab increases survival of advanced ovarian cancer patients with homologous recombination deficiency (HRD). However, decentralized solutions to test for HRD in clinical routine are scarce. The goal of this study was to retrospectively validate the decentralized SeqOne assay on tumor samples from the PAOLA-1 trial. The assay relies on shallow Whole Genome Sequencing (sWGS) to capture genomic instability and targeted sequencing to determine BRCA status. METHODS: The study comprised 368 patients from the PAOLA-1 trial. The SeqOne assay was compared to the Myriad MyChoice HRD test (Myriad Genetics), and results were analyzed with respect to Progression-Free Survival (PFS). RESULTS: We found a 95% concordance between the HRD status of the two tests (95% Confidence Interval (CI); 92%-97%). The Positive Percentage Agreement (PPA) of the sWGS test was 95% (95% CI; 91%-97%), as was its Negative Percentage Agreement (NPA) (95% CI; 89%-98%). In patients with HRD-positive tumors treated with olaparib plus bevacizumab, the PFS Hazard Ratio (HR) was 0.38 (95% CI; 0.26-0.54) with the SeqOne assay and 0.32 (95% CI; 0.22-0.45) with the Myriad assay. In patients with HRD-negative tumors, HR was 0.99 (95% CI; 0.68-1.42) and 1.05 (95% CI; 0.70-1.57) with the SeqOne and Myriad assays, respectively. Among patients with BRCA-wildtype tumors, those with HRD-positive tumors benefited from olaparib plus bevacizumab maintenance, with HR of 0.48 (95% CI; 0.29-0.79) and of 0.38 (95% CI; 0.23-0.63) with the SeqOne and Myriad assays, respectively. CONCLUSION: The SeqOne assay offers a clinically validated approach to detect HRD.
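
For readers unfamiliar with the agreement metrics reported above, the following sketch shows how overall concordance, PPA, and NPA are conventionally computed from a 2x2 comparison of a new test against a reference test. The function name and the counts are invented for illustration; they are not the trial's data, and the abstract does not state how its confidence intervals were derived.

```python
def agreement(both_pos, ref_pos_only, new_pos_only, both_neg):
    """Compute agreement metrics between a new test and a reference test.

    both_pos:     samples positive by both tests
    ref_pos_only: samples positive by the reference test only
    new_pos_only: samples positive by the new test only
    both_neg:     samples negative by both tests
    """
    total = both_pos + ref_pos_only + new_pos_only + both_neg
    return {
        # Share of reference-positive samples that the new test also calls positive.
        "ppa": both_pos / (both_pos + ref_pos_only),
        # Share of reference-negative samples that the new test also calls negative.
        "npa": both_neg / (both_neg + new_pos_only),
        # Share of all samples on which the two tests agree.
        "overall": (both_pos + both_neg) / total,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = agreement(both_pos=90, ref_pos_only=10, new_pos_only=5, both_neg=95)
```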


Subject(s)
Ovarian Neoplasms , Humans , Female , Bevacizumab/therapeutic use , Retrospective Studies , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/genetics , Carcinoma, Ovarian Epithelial , Homologous Recombination
16.
Clin Transl Med ; 14(6): e1723, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38877653

ABSTRACT

BACKGROUND: Cholangiocarcinoma (CCA) is a fatal cancer of the bile duct with a poor prognosis owing to limited therapeutic options. The incidence of intrahepatic CCA (iCCA) is increasing worldwide, and its molecular basis is emerging. Environmental factors may contribute to regional differences in the mutation spectrum of European patients with iCCA, which are underrepresented in systematic genomic and transcriptomic studies of the disease. METHODS: We describe an integrated whole-exome sequencing and transcriptomic study of 37 iCCA patients in Germany. RESULTS: The most frequently mutated genes were ARID1A (14%), followed by IDH1, BAP1, TP53, KRAS, and ATM in 8% of patients. We identified FGFR2::BICC1 fusions in two tumours, and FGFR2::KCTD1 and TMEM106B::ROS1 as novel fusions with potential therapeutic implications in iCCA and confirmed oncogenic properties of TMEM106B::ROS1 in vitro. Using a data integration framework, we identified PBX1 as a novel central regulatory gene in iCCA. We performed extended screening by targeted sequencing of an additional 40 CCAs. In the joint analysis, IDH1 (13%), BAP1 (10%), TP53 (9%), KRAS (7%), ARID1A (7%), NF1 (5%), and ATM (5%) were the most frequently mutated genes, and we found PBX1 to show copy gain in 20% of the tumours. According to other studies, amplifications of PBX1 tend to occur in European iCCAs in contrast to liver fluke-associated Asian iCCAs. CONCLUSIONS: By analyzing an additional European cohort of iCCA patients, we found that PBX1 protein expression was a marker of poor prognosis. Overall, our findings provide insight into key molecular alterations in iCCA, reveal new targetable fusion genes, and suggest that PBX1 is a novel modulator of this disease.


Subject(s)
Cholangiocarcinoma , Pre-B-Cell Leukemia Transcription Factor 1 , Proto-Oncogene Proteins , Humans , Cholangiocarcinoma/genetics , Pre-B-Cell Leukemia Transcription Factor 1/genetics , Male , Proto-Oncogene Proteins/genetics , Female , Prognosis , Middle Aged , Aged , Bile Duct Neoplasms/genetics , Germany/epidemiology , Biomarkers, Tumor/genetics , Adult , Genomics/methods , Protein-Tyrosine Kinases
17.
BMC Ecol Evol ; 22(1): 106, 2022 09 03.
Article in English | MEDLINE | ID: mdl-36057769

ABSTRACT

BACKGROUND: The transient and fragmented nature of the deep-sea hydrothermal environment made of ridge subduction, plate collision and the emergence of new rifts is currently acting to separate vent populations, promoting local adaptation and contributing to bursts of speciation and species specialization. The tube-dwelling worm Alvinella pompejana, called the Pompeii worm, and its sister species A. caudata live syntopically on the hottest part of deep-sea hydrothermal chimneys along the East Pacific Rise. They are exposed to extreme thermal and chemical gradients, which vary greatly in space and time, and thus represent ideal candidates for understanding the evolutionary mechanisms at play in the vent fauna evolution. RESULTS: We explored genomic patterns of divergence in the early and late stages of speciation of these emblematic worms using transcriptome assemblies and the first draft genome to better understand the relative role of geographic isolation and habitat preference in their genome evolution. Analyses were conducted on allopatric populations of Alvinella pompejana (early stage of separation) and between A. pompejana and its syntopic species Alvinella caudata (late stage of speciation). We first identified divergent genomic regions and targets of selection as well as their position in the genome over collections of orthologous genes and, then, described the speciation dynamics by documenting the annotation of the most divergent and/or positively selected genes involved in the isolation process. Gene mapping clearly indicated that divergent genes associated with the early stage of speciation, although accounting for nearly 30% of genes, are highly scattered in the genome without any island of divergence and not involved in gamete recognition or mito-nuclear incompatibilities. By contrast, genomes of A. pompejana and A. caudata are clearly separated with nearly all genes (96%) exhibiting high divergence. This congealing effect however seems to be linked to habitat specialization and still allows positive selection on genes involved in gamete recognition, as a possible long-duration process of species reinforcement. CONCLUSION: Our analyses highlight the non-negligible role of natural selection on both the early and late stages of speciation in the iconic thermophilic worms living on the walls of deep-sea hydrothermal chimneys. They shed light on the evolution of gene divergence during the process of speciation and species specialization over a very long period of time.


Subject(s)
Polychaeta , Acclimatization , Adaptation, Physiological , Animals , Genomics , Polychaeta/genetics , Selection, Genetic
18.
Nat Commun ; 13(1): 6044, 2022 10 13.
Article in English | MEDLINE | ID: mdl-36229545

ABSTRACT

Despite extensive efforts to address it, the vastness of uncharacterized 'dark matter' microbial genetic diversity can impact short-read sequencing based metagenomic studies. Population-specific biases in genomic reference databases can further compound this problem. Leveraging advances in hybrid assembly (using short and long reads) and Hi-C technologies in a cross-sectional survey, we deeply characterized 109 gut microbiomes from three ethnicities in Singapore to comprehensively reconstruct 4497 medium and high-quality metagenome assembled genomes, 1708 of which were missing in short-read only analysis and with >28× N50 improvement. Species-level clustering identified 70 (>10% of total) novel gut species out of 685, improved reference genomes for 363 species (53% of total), and discovered 3413 strains unique to these populations. Among the top 10 most abundant gut bacteria in our study, one of the species and >80% of strains were unrepresented in existing databases. Annotation of biosynthetic gene clusters (BGCs) uncovered more than 27,000 BGCs with a large fraction (36-88%) unrepresented in current databases, and with several unique clusters predicted to produce bacteriocins that could significantly alter microbiome community structure. These results reveal significant uncharacterized gut microbial diversity in Southeast Asian populations and highlight the utility of hybrid metagenomic references for bioprospecting and disease-focused studies.


Subject(s)
Bacteriocins , Microbiota , Asian People/genetics , Bacteriocins/genetics , Cross-Sectional Studies , Genome, Human , Humans , Metagenome/genetics , Metagenomics/methods , Microbiota/genetics
19.
Nat Microbiol ; 7(10): 1516-1524, 2022 10.
Article in English | MEDLINE | ID: mdl-36109646

ABSTRACT

Long-term colonization of the gut microbiome by carbapenemase-producing Enterobacteriaceae (CPE) is a growing area of public health concern as it can lead to community transmission and rapid increase in cases of life-threatening CPE infections. Here, leveraging the observation that many subjects are decolonized without interventions within a year, we used longitudinal shotgun metagenomics (up to 12 timepoints) for detailed characterization of ecological and evolutionary dynamics in the gut microbiome of a cohort of CPE-colonized subjects and family members (n = 46; 361 samples). Subjects who underwent decolonization exhibited a distinct ecological shift marked by recovery of microbial diversity, key commensals and anti-inflammatory pathways. In addition, colonization was marked by elevated but unstable Enterobacteriaceae abundances, which exhibited distinct strain-level dynamics for different species (Escherichia coli and Klebsiella pneumoniae). Finally, comparative analysis with whole-genome sequencing data from CPE isolates (n = 159) helped identify substrain variation in key functional genes and the presence of highly similar E. coli and K. pneumoniae strains with variable resistance profiles and plasmid sharing. These results provide an enhanced view into how colonization by multi-drug-resistant bacteria associates with altered gut ecology and can enable transfer of resistance genes, even in the absence of overt infection and antibiotic usage.


Subject(s)
Carbapenem-Resistant Enterobacteriaceae , Gastrointestinal Microbiome , Anti-Bacterial Agents/pharmacology , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Carbapenem-Resistant Enterobacteriaceae/genetics , Escherichia coli/genetics , Humans , Klebsiella pneumoniae/genetics , beta-Lactamases/genetics , beta-Lactamases/metabolism
20.
BMC Bioinformatics ; 12 Suppl 9: S2, 2011 Oct 05.
Article in English | MEDLINE | ID: mdl-22152029

ABSTRACT

BACKGROUND: Tandemly Arrayed Gene (TAG) clusters are groups of paralogous genes that are found adjacent on a chromosome. TAGs represent an important repertoire of genes in eukaryotes. In addition to tandem duplication events, TAG clusters are affected during their evolution by other mechanisms, such as inversion and deletion events, that affect the order and orientation of genes. The DILTAG algorithm developed in [1] makes it possible to infer a set of optimal evolutionary histories explaining the evolution of a single TAG cluster, from an ancestral single gene, through tandem duplications (simple or multiple, direct or inverted), deletions and inversion events. RESULTS: We present a general methodology, which is an extension of DILTAG, for the study of the evolutionary history of a set of orthologous TAG clusters in multiple species. In addition to the speciation events reflected by the phylogenetic tree of the considered species, the evolutionary events that are taken into account are simple or multiple tandem duplications, direct or inverted, simple or multiple deletions, and inversions. We analysed the performance of our algorithm on simulated data sets and we applied it to the protocadherin gene clusters of human, chimpanzee, mouse and rat. CONCLUSIONS: Our results obtained on simulated data sets showed a good performance in inferring the total number and size distribution of duplication events. A limitation of the algorithm is however in dealing with multiple gene deletions, as the algorithm is highly exponential in this case, and becomes quickly intractable.


Subject(s)
Algorithms , Evolution, Molecular , Gene Duplication , Multigene Family , Animals , Gene Deletion , Humans , Mice , Phylogeny , Rats