Results 1 - 20 of 65
1.
PLoS Biol ; 20(2): e3001536, 2022 02.
Article in English | MEDLINE | ID: mdl-35167588

ABSTRACT

The importance of sampling from globally representative populations has been well established in human genomics. In human microbiome research, however, we lack a full understanding of the global distribution of sampling in research studies. This information is crucial to better understand global patterns of microbiome-associated diseases and to extend the health benefits of this research to all populations. Here, we analyze the country of origin of all 444,829 human microbiome samples that are available from the world's 3 largest genomic data repositories, including the Sequence Read Archive (SRA). The samples are from 2,592 studies of 19 body sites, including 220,017 samples of the gut microbiome. We show that more than 71% of samples with a known origin come from Europe, the United States, and Canada, including 46.8% from the US alone, despite the country representing only 4.3% of the global population. We also find that central and southern Asia is the most underrepresented region: Countries such as India, Pakistan, and Bangladesh account for more than a quarter of the world population but make up only 1.8% of human microbiome samples. These results demonstrate a critical need to ensure more global representation of participants in microbiome studies.


Subject(s)
Gastrointestinal Microbiome/genetics , Genomics/methods , Metagenome/genetics , Metagenomics/methods , Microbiota/genetics , Asia , Bangladesh , Canada , Developed Countries , Europe , Genomics/statistics & numerical data , Geography , Humans , India , Metagenomics/statistics & numerical data , Pakistan , United States
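
The over- and underrepresentation reported in this abstract can be expressed as a ratio of sample share to population share. The following is a minimal sketch using only the percentages quoted above; the function name `representation_ratio` is hypothetical, and the 25% population share for central and southern Asia is a lower bound taken from "more than a quarter", so the second ratio is an upper bound rather than an exact figure.

```python
def representation_ratio(sample_share_pct: float, population_share_pct: float) -> float:
    """Share of samples divided by share of world population.

    A value above 1 means a region is overrepresented among samples,
    a value below 1 means it is underrepresented.
    """
    if population_share_pct <= 0:
        raise ValueError("population share must be positive")
    return sample_share_pct / population_share_pct


# United States: 46.8% of samples with known origin, 4.3% of world population.
us_ratio = representation_ratio(46.8, 4.3)

# Central and southern Asia: 1.8% of samples, more than 25% of world population.
# Using 25% (the lower bound on population share) gives an upper bound on the ratio.
asia_ratio_upper = representation_ratio(1.8, 25.0)

print(f"US sample share is about {us_ratio:.1f}x its population share")
print(f"Central/southern Asia ratio is below {asia_ratio_upper:.3f}")
```

Under these figures, the US is sampled at roughly eleven times its population share, while central and southern Asia is sampled at less than a tenth of its population share.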
2.
J Comput Biol ; 29(2): 106-120, 2022 02.
Article in English | MEDLINE | ID: mdl-35020412

ABSTRACT

High-throughput chromosome conformation capture (Hi-C) has recently been applied to natural microbial communities and revealed great potential to study multiple genomes simultaneously. Several extraneous factors may influence chromosomal contacts, rendering the normalization of Hi-C contact maps essential for downstream analyses. However, the current paucity of metagenomic Hi-C normalization methods and the neglect of spurious interspecies contacts weaken the interpretability of the data. Here, we report on two types of biases in metagenomic Hi-C experiments, explicit biases and implicit biases, and introduce HiCzin, a parametric model to correct both types of biases and remove spurious interspecies contacts. We demonstrate that metagenomic Hi-C contact maps normalized by HiCzin result in lower biases, higher capability to detect spurious contacts, and better performance in metagenomic contig clustering.


Subject(s)
Metagenomics/statistics & numerical data , Algorithms , Bias , Chromosomes/genetics , Computational Biology , High-Throughput Nucleotide Sequencing/statistics & numerical data , Linear Models , Logistic Models , Metagenome , Microbiota/genetics , Regression Analysis , Software , Yeasts/genetics
3.
Pediatr Infect Dis J ; 41(2): 166-171, 2022 02 01.
Article in English | MEDLINE | ID: mdl-34845152

ABSTRACT

BACKGROUND: Plasma metagenomic next-generation sequencing (mNGS) has the potential to detect thousands of different organisms with a single test. There are limited data on the real-world impact of mNGS and even less guidance on the types of patients and clinical scenarios in which mNGS testing is beneficial. METHODS: A retrospective review of patients who had mNGS testing as part of routine clinical care at Texas Children's Hospital from June 2018-August 2019 was performed. Medical records were reviewed for pertinent data. An expert panel of infectious disease physicians adjudicated each unique organism identified by mNGS for clinical impact. RESULTS: There were 169 patients with at least one mNGS test. mNGS identified a definitive, probable or possible infection in 49.7% of patients. mNGS led to no clinical impact in 139 patients (82.2%), a positive impact in 21 patients (12.4%), and a negative impact in 9 patients (5.3%). mNGS identified a plausible cause for infection more often in immunocompromised patients than in immunocompetent patients (55.8% vs. 30.0%, P = 0.006). Positive clinical impact was highest in patients with multiple indications for testing (37.5%, P = 0.006) with deep-seated infections, overall, being most often associated with a positive impact. CONCLUSION: mNGS testing has a limited real-world clinical impact when ordered indiscriminately. Immunocompromised patients with well-defined deep-seated infections are likely to benefit most from testing. Further studies are needed to evaluate the full spectrum of clinical scenarios for which mNGS testing is impactful.


Subject(s)
High-Throughput Nucleotide Sequencing/statistics & numerical data , Metagenomics/statistics & numerical data , Adolescent , Anti-Infective Agents/therapeutic use , Child , Child, Preschool , Female , Humans , Immunocompromised Host , Infant , Male , Retrospective Studies , Sepsis/blood , Sepsis/diagnosis , Sepsis/microbiology , Sepsis/virology
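
The clinical-impact percentages in the RESULTS section above follow directly from the reported patient counts. As a quick arithmetic check (a sketch only; the dictionary keys are illustrative labels, not terms from the study):

```python
# Patient counts by adjudicated clinical impact, as reported in the abstract.
impact_counts = {"none": 139, "positive": 21, "negative": 9}

total_patients = sum(impact_counts.values())

# Percentage of patients in each category, rounded to one decimal place.
impact_shares = {
    label: round(100 * count / total_patients, 1)
    for label, count in impact_counts.items()
}

print(total_patients)
print(impact_shares)
```

The three categories sum to the 169 patients tested, and the rounded shares match the 82.2%, 12.4% and 5.3% quoted in the abstract.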
4.
Nat Commun ; 12(1): 6826, 2021 11 24.
Article in English | MEDLINE | ID: mdl-34819495

ABSTRACT

The genus Listeria comprises two pathogenic species, L. monocytogenes (Lm) and L. ivanovii, and non-pathogenic species. All can thrive as saprophytes, whereas only pathogenic species cause systemic infections. Identifying Listeria species' respective biotopes is critical to understand the ecological contribution of Listeria virulence. In order to investigate the prevalence and abundance of Listeria species in various sources, we retrieved and analyzed 16S rRNA datasets from the MG-RAST metagenomic database. 26% of datasets contain Listeria sensu stricto sequences, and Lm is the most prevalent species, most abundant in soil and host-associated environments, including 5% of human stools. Lm is also detected in 10% of human stool samples from an independent cohort of 900 healthy asymptomatic donors. A specific microbiota signature is associated with Lm faecal carriage, both in humans and experimentally inoculated mice, in which it precedes Lm faecal carriage. These results indicate that Lm faecal carriage is common and depends on the gut microbiota, and suggest that Lm faecal carriage is a crucial yet overlooked consequence of its virulence.


Subject(s)
Carrier State/epidemiology , Gastrointestinal Microbiome/genetics , Listeria monocytogenes/isolation & purification , Animals , Carrier State/diagnosis , Carrier State/microbiology , DNA, Bacterial/isolation & purification , Datasets as Topic , Disease Models, Animal , Feces/microbiology , Humans , Listeria monocytogenes/genetics , Listeria monocytogenes/pathogenicity , Male , Metagenomics/statistics & numerical data , Mice , Phylogeny , RNA, Ribosomal, 16S/genetics , Virulence
5.
Comput Math Methods Med ; 2021: 7238495, 2021.
Article in English | MEDLINE | ID: mdl-34790254

ABSTRACT

OBJECTIVE: To evaluate the value of metagenomic next-generation sequencing (mNGS) for detecting pathogens in bronchoalveolar lavage fluid (BALF) and sputum samples. METHODS: In total, 32 patients with pulmonary infection were included. Pathogens in BALF and sputum samples were tested simultaneously by routine microbial culture and mNGS. The main infecting pathogens (bacteria, fungi, and viruses) and their distribution in BALF and sputum samples were analyzed. Moreover, the diagnostic performance of mNGS in paired BALF and sputum samples was assessed. RESULTS: The pathogen culture results were positive in 9 patients and negative in 13 patients. No statistically significant differences were found in the sensitivity (78.94% vs. 63.15%, p = 0.283) or specificity (62.50% vs. 75.00%, p = 0.375) of mNGS for diagnosing bacterial and fungal infection between the two sample types. By mNGS, both samples were positive in 10 patients and both were negative in 13 patients; 7 patients were positive only in BALF samples, and 2 only in sputum samples. The main viruses detected by mNGS were EB virus, human adenovirus 5, herpes simplex virus type 1, and human cytomegalovirus. Kappa agreement analysis indicated that mNGS results were significantly consistent between the two sample types for bacteria (p < 0.001), fungi (p = 0.026), and viruses (p = 0.008). CONCLUSION: mNGS showed no statistically significant differences in sensitivity and specificity of pathogen detection between BALF and sputum samples. Under certain conditions, sputum samples might be more suitable for pathogen detection because BALF sampling is invasive.


Subject(s)
Bronchoalveolar Lavage Fluid/microbiology , Bronchoalveolar Lavage Fluid/virology , High-Throughput Nucleotide Sequencing/methods , Metagenomics/methods , Pneumonia/microbiology , Pneumonia/virology , Sputum/microbiology , Sputum/virology , Adult , Computational Biology , Female , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Male , Metagenomics/statistics & numerical data , Microbiological Techniques , Middle Aged , Pneumonia/diagnosis , Retrospective Studies , Sensitivity and Specificity , Sequence Analysis, DNA
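
The abstract above reports kappa agreement between paired BALF and sputum results. As an illustration of how Cohen's kappa is computed from a 2x2 paired table, the sketch below uses the overall mNGS counts given in the RESULTS section, reading the 2 sputum-positive patients as sputum-only so that the counts sum to the 32 enrolled patients. The function name `cohen_kappa` is hypothetical, and this overall value is not one of the per-pathogen-class kappa statistics the authors tested.

```python
def cohen_kappa(both_pos: int, both_neg: int, only_a: int, only_b: int) -> float:
    """Cohen's kappa for two binary ratings of the same subjects.

    both_pos / both_neg: subjects where rating A and rating B agree.
    only_a / only_b: subjects positive by A only, or by B only.
    """
    n = both_pos + both_neg + only_a + only_b
    if n == 0:
        raise ValueError("table is empty")

    observed = (both_pos + both_neg) / n

    # Agreement expected by chance, from each rating's marginal positive rate.
    a_pos = both_pos + only_a
    b_pos = both_pos + only_b
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# A = BALF, B = sputum: 10 both positive, 13 both negative,
# 7 BALF-only positive, 2 sputum-only positive (assumed reading, see lead-in).
kappa = cohen_kappa(both_pos=10, both_neg=13, only_a=7, only_b=2)
print(f"kappa = {kappa:.3f}")
```

With these counts the raw agreement is 23/32, and the chance-corrected kappa is about 0.45, which is conventionally described as moderate agreement.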
6.
Comput Math Methods Med ; 2021: 8008731, 2021.
Article in English | MEDLINE | ID: mdl-34812271

ABSTRACT

Human health status can be assessed through research on and analysis of the human microbiome. Acne is a common skin disease whose morbidity increases year by year. The lipids that influence acne to a large extent have been studied by metagenomic methods in recent years. In this paper, machine learning methods are used to analyze metagenomic sequencing data of acne, i.e., all kinds of lipids in the facial skin. First, lipid data from the diseased skin (DS) samples and the healthy skin (HS) samples of acne patients and from the normal control (NC) samples of healthy persons are each analyzed using principal component analysis (PCA) and kernel principal component analysis (KPCA). Then, the lipids that have the main influence on each kind of sample are obtained. In addition, a multiset canonical correlation analysis (MCCA) is used to identify lipids that can differentiate the facial skin of the above three sample types. The experimental results show that the machine learning methods can effectively analyze metagenomic sequencing data of acne. According to the results, lipids that influence only one of the three sample types, or lipids that influence all three to different degrees, can be used as indicators to judge skin status.


Subject(s)
Acne Vulgaris/genetics , Acne Vulgaris/microbiology , Machine Learning , Metagenome , Acne Vulgaris/metabolism , Canonical Correlation Analysis , Case-Control Studies , Computational Biology , Face/microbiology , Humans , Lipids/analysis , Lipids/genetics , Metagenomics/statistics & numerical data , Microbiota/genetics , Principal Component Analysis , Skin/chemistry , Skin/microbiology
7.
Int J Mol Sci ; 22(10)2021 May 18.
Article in English | MEDLINE | ID: mdl-34069990

ABSTRACT

The taxonomic composition of microbial communities can be assessed using universal marker amplicon sequencing. The most common taxonomic markers are the 16S rDNA for bacterial communities and the internal transcribed spacer (ITS) region for fungal communities, but various other markers are used for barcoding eukaryotes. A crucial step in the bioinformatic analysis of amplicon sequences is the identification of representative sequences. This can be achieved using a clustering approach or by denoising raw sequencing reads. DADA2 is a widely adopted algorithm, released as an R library, that denoises marker-specific amplicons from next-generation sequencing and produces a set of representative sequences referred to as 'Amplicon Sequence Variants' (ASV). Here, we present Dadaist2, a modular pipeline, providing a complete suite for the analysis that ranges from raw sequencing reads to the statistics of numerical ecology. Dadaist2 implements a new approach that is specifically optimised for amplicons with variable lengths, such as the fungal ITS. The pipeline focuses on streamlining the data flow from the command line to R, with multiple options for statistical analysis and plotting, both interactive and automatic.


Subject(s)
DNA Barcoding, Taxonomic/statistics & numerical data , Metagenomics/statistics & numerical data , Microbiota/genetics , Software , Algorithms , Cluster Analysis , Computational Biology/methods , Data Interpretation, Statistical , High-Throughput Nucleotide Sequencing , Metadata , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA
8.
PLoS Comput Biol ; 17(6): e1009089, 2021 06.
Article in English | MEDLINE | ID: mdl-34143768

ABSTRACT

The advent of high-throughput metagenomic sequencing has prompted the development of efficient taxonomic profiling methods that measure the presence, abundance and phylogeny of organisms in a wide range of environmental samples. Multivariate sequence-derived abundance data further has the potential to enable inference of ecological associations between microbial populations, but several technical issues need to be accounted for, like the compositional nature of the data, its extreme sparsity and overdispersion, as well as the frequent need to operate in under-determined regimes. The ecological network reconstruction problem is frequently cast into the paradigm of Gaussian Graphical Models (GGMs) for which efficient structure inference algorithms are available, like the graphical lasso and neighborhood selection. Unfortunately, GGMs or variants thereof cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros (as opposed to sampling zeros) corresponding to true absences of biological signals fail to be properly handled by most statistical methods. We present here a zero-inflated log-normal graphical model (available at https://github.com/vincentprost/Zi-LN) specifically aimed at handling such "biological" zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.


Subject(s)
Microbiota , Models, Biological , Algorithms , Computational Biology , Computer Simulation , Metagenome , Metagenomics/statistics & numerical data , Microbial Consortia/genetics , Microbial Consortia/physiology , Microbiota/genetics , Microbiota/physiology , Multivariate Analysis , Normal Distribution , Synthetic Biology
9.
mSphere ; 6(2)2021 04 21.
Article in English | MEDLINE | ID: mdl-33883262

ABSTRACT

Nucleocytoplasmic large DNA viruses (NCLDVs) are highly diverse and abundant in marine environments. However, the knowledge of their hosts is limited because only a few NCLDVs have been isolated so far. Taking advantage of the recent large-scale marine metagenomics census, in silico host prediction approaches are expected to fill the gap and further expand our knowledge of virus-host relationships for unknown NCLDVs. In this study, we built co-occurrence networks of NCLDVs and eukaryotic taxa to predict virus-host interactions using Tara Oceans sequencing data. Using the positive likelihood ratio to assess the performance of host prediction for NCLDVs, we benchmarked several co-occurrence approaches and demonstrated an increase in the odds ratio of predicting true positive relationships 4-fold compared to random host predictions. To further refine host predictions from high-dimensional co-occurrence networks, we developed a phylogeny-informed filtering method, Taxon Interaction Mapper, and showed it further improved the prediction performance by 12-fold. Finally, we inferred virophage-NCLDV networks to corroborate that co-occurrence approaches are effective for predicting interacting partners of NCLDVs in marine environments. IMPORTANCE: NCLDVs can infect a wide range of eukaryotes, although their life cycle is less dependent on hosts compared to other viruses. However, our understanding of NCLDV-host systems is highly limited because few of these viruses have been isolated so far. Co-occurrence information has been assumed to be useful to predict virus-host interactions. In this study, we quantitatively show the effectiveness of co-occurrence inference for NCLDV host prediction. We also improve the prediction performance with a phylogeny-guided method, which leads to a concise list of candidate host lineages for three NCLDV families. Our results underpin the usage of co-occurrence approaches for the metagenomic exploration of the ecology of this diverse group of viruses.


Subject(s)
DNA Viruses/classification , DNA Viruses/genetics , Genome, Viral , Host Microbial Interactions/genetics , Phylogeny , Host Microbial Interactions/physiology , Humans , Metagenomics/statistics & numerical data
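
The abstract above uses the positive likelihood ratio to score host predictions. The standard definition is sensitivity divided by the false positive rate, LR+ = [TP / (TP + FN)] / [FP / (FP + TN)]. The sketch below shows that calculation; the function name and the counts are hypothetical and chosen only for illustration, since the abstract does not report a confusion matrix, and the authors' exact benchmarking procedure may differ.

```python
def positive_likelihood_ratio(tp: int, fn: int, fp: int, tn: int) -> float:
    """LR+ = sensitivity / (1 - specificity).

    Rearranged as tp * (fp + tn) / (fp * (tp + fn)) so that the only
    division happens once, on integer products.
    """
    if tp + fn == 0:
        raise ValueError("no known positive pairs to evaluate against")
    if fp == 0:
        raise ValueError("LR+ is undefined when there are no false positives")
    return (tp * (fp + tn)) / (fp * (tp + fn))


# Hypothetical counts: 8 of 10 known virus-host pairs recovered,
# 10 of 100 known non-interacting pairs wrongly predicted.
lr_plus = positive_likelihood_ratio(tp=8, fn=2, fp=10, tn=90)
print(lr_plus)  # 8.0: a predicted edge is 8x more likely for a true pair than a false one
```

A value of 1 corresponds to random prediction, which is the baseline the fold improvements in the abstract are measured against.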
10.
BJOG ; 128(6): 976-982, 2021 05.
Article in English | MEDLINE | ID: mdl-32970908

ABSTRACT

OBJECTIVE: To determine the presence and identity of extracellular bacteriophage (phage) families, genera and species in the vagina of pregnant women. DESIGN: Descriptive, observational cohort study. SETTING: São Paulo, Brazil. POPULATION: Pregnant women at 21-24 weeks' gestation. METHODS: Vaginal samples from 107 women whose vaginal microbiome and pregnancy outcomes were previously determined were analysed for phages by metagenomic sequencing. MAIN OUTCOME MEASURES: Identification of phage families, genera and species. RESULTS: Phages were detected in 96 (89.7%) of the samples. Six different phage families were identified: Siphoviridae in 69.2%, Myoviridae in 49.5%, Microviridae in 37.4%, Podoviridae in 20.6%, Herelleviridae in 10.3% and Inviridae in 1.9% of the women. Four different phage families were present in 14 women (13.1%), three families in 20 women (18.7%), two families in 31 women (29.1%) and one family in 31 women (29.1%). The most common phage species detected were Bacillus phages in 48 (43.6%), Escherichia phages in 45 (40.9%), Staphylococcus phages in 40 (36.4%), Gokushovirus in 33 (30.0%) and Lactobacillus phages in 29 (26.4%) women. In a preliminary exploratory analysis, there were no associations between a particular phage family, the number of phage families present in the vagina or any particular phage species and either gestational age at delivery or the bacterial community state type present in the vagina. CONCLUSIONS: Multiple phages are present in the vagina of most mid-trimester pregnant women. TWEETABLE ABSTRACT: Bacteriophages are present in the vagina of most pregnant women.


Subject(s)
Bacteriophages , Microbiota/physiology , Vagina/microbiology , Adult , Bacteriophages/classification , Bacteriophages/genetics , Bacteriophages/isolation & purification , Brazil , Female , Gestational Age , Humans , Metagenome , Metagenomics/methods , Metagenomics/statistics & numerical data , Pregnancy , Pregnancy Outcome/epidemiology
11.
PLoS One ; 15(12): e0243161, 2020.
Article in English | MEDLINE | ID: mdl-33259541

ABSTRACT

BACKGROUND: Tuberculous meningitis (TBM) is a severe form of extrapulmonary tuberculosis, and its early diagnosis is very difficult, so patients often present with severe disability or die. The current study aimed to assess the accuracy of metagenomic next generation sequencing (mNGS) for TBM, and to identify a new test for the early diagnosis of TBM. METHODS: We searched Embase, PubMed, Cochrane Library, China National Knowledge Infrastructure, and Wanfang Data up to June 30, 2020 for studies that assessed the efficacy of mNGS for the diagnosis of TBM. Then, the accuracy of mNGS against a composite reference standard (CRS) in these articles was compared using the meta-analysis approach. RESULTS: Four independent studies with 342 samples comparing mNGS and a CRS were included in this study. The sensitivity of mNGS for TBM diagnosis ranged from 27% to 84%. The combined sensitivity of mNGS was 61%, and the I² value was 92%. Moreover, the specificity of mNGS for TBM diagnosis ranged from 96% to 100%. The combined specificity of mNGS was 98%, and the I² value was 74%. The heterogeneity between studies in terms of sensitivity and specificity was significant. The area under the curve (AUC) of the summary receiver operating characteristic curve (SROC) of mNGS for TBM was 0.98. CONCLUSIONS: The sensitivity of mNGS for TBM diagnosis was moderate. Furthermore, the specificity was extremely high, and the AUC of the SROC indicated a very good diagnostic efficacy. mNGS could be used as an early diagnostic method for TBM; however, the results should be treated with caution because the heterogeneity between studies was highly significant. SYSTEMATIC REVIEW REGISTRATION: INPLASY202070100.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Metagenomics/methods , Tuberculosis, Meningeal/diagnosis , China , Early Diagnosis , High-Throughput Nucleotide Sequencing/standards , High-Throughput Nucleotide Sequencing/statistics & numerical data , Humans , Metagenome , Metagenomics/standards , Metagenomics/statistics & numerical data , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/isolation & purification , ROC Curve , Reference Standards , Sensitivity and Specificity , Tuberculosis, Meningeal/microbiology
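
The I² values reported in this abstract quantify between-study heterogeneity. A common definition derives I² from Cochran's Q statistic as I² = max(0, (Q - df) / Q) x 100%, where df is the number of studies minus one. The sketch below applies that formula; the function name is hypothetical, and the Q value of 37.5 is not reported in the abstract but is back-calculated here purely to illustrate what an I² of 92% would imply for four studies.

```python
def i_squared(cochran_q: float, n_studies: int) -> float:
    """Higgins' I² (in percent) from Cochran's Q and the number of studies."""
    if n_studies < 2:
        raise ValueError("heterogeneity needs at least two studies")
    if cochran_q <= 0:
        return 0.0
    degrees_of_freedom = n_studies - 1
    # Negative values are truncated to zero by convention.
    return max(0.0, (cochran_q - degrees_of_freedom) / cochran_q) * 100.0


# Hypothetical Q for illustration only (not taken from the paper):
# with 4 studies (df = 3), Q = 37.5 corresponds to I² of about 92%.
sensitivity_i2 = i_squared(cochran_q=37.5, n_studies=4)
print(f"I² = {sensitivity_i2:.0f}%")
```

Values above roughly 75% are usually read as considerable heterogeneity, which is consistent with the caution expressed in the abstract's conclusions.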
12.
Nat Commun ; 11(1): 4661, 2020 09 16.
Article in English | MEDLINE | ID: mdl-32938925

ABSTRACT

Recent years have seen a growing number of studies investigating evolutionary questions using ancient DNA. To address these questions, one of the most frequently used methods is principal component analysis (PCA). When PCA is applied to temporal samples, the sample dates are, however, ignored during analysis, leading to imperfect representations of samples in PC plots. Here, we present a factor analysis (FA) method in which individual scores are corrected for the effect of allele frequency drift over time. We obtained exact solutions for the estimates of corrected factors, and we provided a fast algorithm for their computation. Using computer simulations and ancient European samples, we compared geometric representations obtained from FA with PCA and with ancestry estimation programs. In admixture analyses, FA estimates agreed with tree-based statistics, and they were more accurate than those obtained from PCA projections and from ancestry estimation programs. A great advantage of FA over existing approaches is to improve descriptive analyses of ancient DNA samples without requiring inclusion of outgroup or present-day samples.


Subject(s)
DNA, Ancient/analysis , Factor Analysis, Statistical , Genome, Human , Metagenomics/statistics & numerical data , Algorithms , England , Europe , Gene Frequency , Genetic Drift , Genetics, Population/statistics & numerical data , Humans , Models, Genetic , Principal Component Analysis
13.
Lancet Infect Dis ; 20(10): e251-e260, 2020 10.
Article in English | MEDLINE | ID: mdl-32768390

ABSTRACT

The term metagenomics refers to the use of sequencing methods to simultaneously identify genomic material from all organisms present in a sample, with the advantage of greater taxonomic resolution than culture or other methods. Applications include pathogen detection and discovery, species characterisation, antimicrobial resistance detection, virulence profiling, and study of the microbiome and microecological factors affecting health. However, metagenomics involves complex and multistep processes and there are important technical and methodological challenges that require careful consideration to support valid inference. We co-ordinated a multidisciplinary, international expert group to establish reporting guidelines that address specimen processing, nucleic acid extraction, sequencing platforms, bioinformatics considerations, quality assurance, limits of detection, power and sample size, confirmatory testing, causality criteria, cost, and ethical issues. The guidance recognises that metagenomics research requires pragmatism and caution in interpretation, and that this field is rapidly evolving.


Subject(s)
Metagenomics/methods , Metagenomics/statistics & numerical data , Computational Biology , Humans , Molecular Epidemiology , Research Design/standards
15.
Future Cardiol ; 15(6): 411-424, 2019 11.
Article in English | MEDLINE | ID: mdl-31691592

ABSTRACT

Aim: To explore potential utility of metagenomic sequencing for improving etiologic diagnosis of infective endocarditis (IE) caused by fastidious bacteria. Materials & methods: Plasma and heart valves of two patients, who were diagnosed with IE caused by Bartonella quintana and Propionibacterium species, were sequenced by using Illumina MiSeq and Nanopore MinION. Results: For patient 1, B. quintana was detected in the plasma pool collected 4 days before valvular replacement surgery. For patient 2, Propionibacterium sp. oral taxon 193 was detected in the plasma sample collected on hospital day 1. Nearly complete bacterial genomes (>98%) were retrieved from resected heart valves of both patients, enabling detection of antibiotic resistance-associated features. Real-time sequencing of heart valves identified both pathogens within the first 16 min of sequencing runs. Conclusion: Metagenomic sequencing may be a helpful supplement to IE diagnostic workflow, especially when conventional tests fail to yield a diagnosis.


Subject(s)
Bacteria/genetics , DNA, Bacterial/analysis , Endocarditis, Bacterial/diagnosis , Heart Valves/microbiology , Metagenomics/statistics & numerical data , Bacteria/isolation & purification , Humans , Metagenomics/methods , Polymerase Chain Reaction
16.
Genes (Basel) ; 10(9)2019 08 29.
Article in English | MEDLINE | ID: mdl-31470675

ABSTRACT

Metagenomic next-generation sequencing (mNGS) can capture the full spectrum of viral pathogens in a specimen and has the potential to become an all-in-one solution for virus diagnostics. To date, clinical application is still in an early phase and limitations remain. Here, we evaluated the impact of viral mNGS for cases analyzed over two years in a tertiary diagnostics unit. High throughput mNGS was performed upon request by the treating clinician in cases where the etiology of infection remained unknown or the initial differential diagnosis was very broad. The results were compared to conventional routine testing regarding outcome and workload. In total, 163 specimens from 105 patients were sequenced. The main sample types were cerebrospinal fluid (34%), blood (33%) and throat swabs (10%). In the majority of the cases, viral encephalitis/meningitis or respiratory infection was suspected. In parallel, conventional virus diagnostic tests were performed (mean 18.5 individually probed targets/patients). mNGS detected viruses in 34 cases (32%). While often confirmatory, in multiple cases, the identified viruses were not included in the selected routine diagnostic tests. Two years of mNGS in a tertiary diagnostics unit demonstrated the advantages of a single, untargeted approach for comprehensive, rapid and efficient virus diagnostics, confirming the utility of mNGS in complementing current routine tests.


Subject(s)
Metagenome , Metagenomics/methods , Molecular Diagnostic Techniques/methods , Sequence Analysis, DNA/methods , Tertiary Care Centers/statistics & numerical data , Virus Diseases/virology , Blood/virology , Cerebrospinal Fluid/virology , Genome, Viral , Humans , Metagenomics/statistics & numerical data , Molecular Diagnostic Techniques/statistics & numerical data , Mouth Mucosa/virology , Sequence Analysis, DNA/statistics & numerical data , Virus Diseases/diagnosis , Virus Diseases/epidemiology
17.
PLoS Comput Biol ; 15(7): e1007208, 2019 07.
Article in English | MEDLINE | ID: mdl-31335917

ABSTRACT

Horizontal gene transfer (HGT) has changed the way we regard evolution. Instead of waiting for the next generation to establish new traits, especially bacteria are able to take a shortcut via HGT that enables them to pass on genes from one individual to another, even across species boundaries. The tool Daisy offers the first HGT detection approach based on read mapping that provides complementary evidence compared to existing methods. However, Daisy relies on the acceptor and donor organism involved in the HGT being known. We introduce DaisyGPS, a mapping-based pipeline that is able to identify acceptor and donor reference candidates of an HGT event based on sequencing reads. Acceptor and donor identification is akin to species identification in metagenomic samples based on sequencing reads, a problem addressed by metagenomic profiling tools. However, acceptor and donor references have certain properties such that these methods cannot be directly applied. DaisyGPS uses MicrobeGPS, a metagenomic profiling tool tailored towards estimating the genomic distance between organisms in the sample and the reference database. We enhance the underlying scoring system of MicrobeGPS to account for the sequence patterns in terms of mapping coverage of an acceptor and donor involved in an HGT event, and report a ranked list of reference candidates. These candidates can then be further evaluated by tools like Daisy to establish HGT regions. We successfully validated our approach on both simulated and real data, and show its benefits in an investigation of an outbreak involving Methicillin-resistant Staphylococcus aureus data.


Subject(s)
Evolution, Molecular , Gene Transfer, Horizontal , Metagenome , Metagenomics/methods , Models, Genetic , Computational Biology , Computer Simulation , Databases, Genetic/statistics & numerical data , Disease Outbreaks/statistics & numerical data , Genetic Variation , Genome, Bacterial , Helicobacter pylori/genetics , Humans , Metagenomics/statistics & numerical data , Methicillin-Resistant Staphylococcus aureus/genetics , Mutation , Staphylococcal Infections/epidemiology , Staphylococcal Infections/microbiology
18.
Pac Symp Biocomput ; 24: 236-247, 2019.
Article in English | MEDLINE | ID: mdl-30864326

ABSTRACT

Microbiome research is going through an evolutionary transition from focusing on the characterization of reference microbiomes associated with different environments/hosts to translational applications, including using the microbiome for disease diagnosis, improving the efficacy of cancer treatments, and prevention of diseases (e.g., using probiotics). Microbial markers have been identified from microbiome data derived from cohorts of patients with different diseases, treatment responsiveness, etc., and often predictors based on these markers were built for predicting host phenotype given a microbiome dataset (e.g., to predict if a person has type 2 diabetes given his or her microbiome data). Unfortunately, these microbial markers and predictors are often not published and so are not reusable by others. In this paper, we report the curation of a repository of microbial marker genes and predictors built from these markers for microbiome-based prediction of host phenotype, and a computational pipeline called Mi2P (from Microbiome to Phenotype) for using the repository. As an initial effort, we focus on microbial marker genes related to two diseases, type 2 diabetes and liver cirrhosis, and immunotherapy efficacy for two types of cancer, non-small-cell lung cancer (NSCLC) and renal cell carcinoma (RCC). We characterized the marker genes from metagenomic data using our recently developed subtractive assembly approach. We showed that predictors built from these microbial marker genes can provide fast and reasonably accurate prediction of host phenotype given microbiome data. As understanding and making use of microbiome data (our second genome) is becoming vital as we move forward in this age of precision health and precision medicine, we believe that such a repository will be useful for enabling translational applications of microbiome data.


Subject(s)
Genes, Microbial , Host Microbial Interactions/genetics , Microbiota/genetics , Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/microbiology , Carcinoma, Non-Small-Cell Lung/therapy , Carcinoma, Renal Cell/genetics , Carcinoma, Renal Cell/microbiology , Carcinoma, Renal Cell/therapy , Computational Biology/methods , Databases, Genetic , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/microbiology , Genetic Markers , Humans , Immunotherapy , Kidney Neoplasms/genetics , Kidney Neoplasms/microbiology , Kidney Neoplasms/therapy , Liver Cirrhosis/genetics , Liver Cirrhosis/microbiology , Lung Neoplasms/genetics , Lung Neoplasms/microbiology , Lung Neoplasms/therapy , Machine Learning , Metagenomics/methods , Metagenomics/statistics & numerical data , Phenotype , Translational Research, Biomedical
19.
J Psychiatr Res ; 113: 90-99, 2019 06.
Article in English | MEDLINE | ID: mdl-30927646

ABSTRACT

BACKGROUND: To probe the differences in gut microbiota among major depressive disorder (MDD), bipolar disorder with current major depressive episode (BPD) and healthy participants. METHODS: Thirty-one MDD patients, thirty BPD patients, and thirty healthy controls (HCs) were recruited. All the faecal samples were analyzed by shotgun metagenomics sequencing. In addition to routine analyses of alpha diversity, we specially designed a new indicator, the Gm coefficient, to evaluate the inequality of relative abundances of microbiota for each participant. RESULTS: The Gm coefficients were significantly decreased in both the MDD and BPD groups. The relative abundances of the phyla Firmicutes and Actinobacteria were significantly increased, and that of Bacteroidetes was significantly decreased, in the MDD and BPD groups. At the genus level, four of the top five enriched genera (Bacteroides, Clostridium, Bifidobacterium, Oscillibacter and Streptococcus) were found to be significantly increased in the MDD and BPD groups compared with HCs. The genera Escherichia and Klebsiella showed significant changes in abundance only between the BPD and HC groups. At the species level, compared with BPD patients, MDD patients had a higher abundance of Prevotellaceae including Prevotella denticola F0289, Prevotella intermedia 17, Prevotella ruminicola, and Prevotella intermedia. Furthermore, the abundances of Fusobacteriaceae, Escherichia blattae DSM 4481 and Klebsiella oxytoca were significantly increased, whereas Bifidobacterium longum subsp. infantis ATCC 15697 = JCM 1222 was significantly reduced in the BPD group compared with the MDD group. CONCLUSIONS: Our study suggests that gut microbiota may be involved in the pathogenesis of both MDD and BPD, and that subtle differences in bacterial composition may have potential as biomarkers of MDD and BPD.
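
The abstract does not give the formula for the Gm coefficient. As a rough illustration of what "inequality of relative abundances" can mean, the sketch below computes the standard Gini coefficient over one participant's abundance vector. This is an assumed stand-in, not the authors' definition; the function name `gini` and the example values are hypothetical.

```python
def gini(abundances):
    """Gini coefficient of a list of non-negative abundances.

    Illustrative stand-in only: the Gm coefficient itself is not
    defined in the abstract. 0 means every taxon is equally abundant;
    values near 1 mean a few taxa dominate.
    """
    values = sorted(abundances)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive abundance")
    if values[0] < 0:
        raise ValueError("abundances must be non-negative")
    # Rank-weighted sum over the ascending-sorted values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


# Hypothetical relative abundances for two participants.
even_profile = [0.25, 0.25, 0.25, 0.25]    # perfectly even community
skewed_profile = [0.0, 0.0, 0.0, 1.0]      # one taxon holds everything

even_score = gini(even_profile)
skewed_score = gini(skewed_profile)
```

Under this stand-in, a lower value indicates a more even distribution of relative abundances within a participant.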


Subject(s)
Bipolar Disorder/microbiology , Depressive Disorder, Major/microbiology , Gastrointestinal Microbiome/physiology , Metagenomics/methods , Adult , Feces/microbiology , Female , Humans , Male , Metagenomics/statistics & numerical data
20.
Microbes Infect ; 21(7): 273-277, 2019.
Article in English | MEDLINE | ID: mdl-30836173

ABSTRACT

Clinical metagenomics (CMg), defined as the application of metagenomic sequencing to clinical samples in order to recover clinically relevant information, has been rapidly evolving in recent years. Following this trend, we held the third International Conference on Clinical Metagenomics (ICCMg3) in Geneva in October 2018. During the two days of the conference, several aspects of CMg were addressed, which we summarize in the present manuscript. At ICCMg3, we continued to follow the progress achieved worldwide in clinical metagenomics and, this year, also in clinical genomics. In addition, the use of metagenomics in cancer diagnosis and management was addressed. Some new challenges were also raised, such as how to report clinical (meta)genomics output to clinicians and the pivotal place of ethics in this expanding field.


Subject(s)
Clinical Laboratory Techniques , Communicable Diseases/diagnosis , Metagenomics , Clinical Laboratory Techniques/standards , Clinical Laboratory Techniques/trends , Communicable Diseases/microbiology , Computational Biology/standards , Computational Biology/trends , High-Throughput Nucleotide Sequencing/standards , High-Throughput Nucleotide Sequencing/trends , Humans , Metagenome/genetics , Metagenomics/standards , Metagenomics/statistics & numerical data , Microbiota/genetics