1.
Article in English | MEDLINE | ID: mdl-38888585

ABSTRACT

With the continued evolution of DNA sequencing technologies, genome sequence data have become more integral to the classification and identification of Bacteria and Archaea. Six years after introducing EzBioCloud, an integrated platform representing the taxonomic hierarchy of Bacteria and Archaea through quality-controlled 16S rRNA gene and genome sequences, we present an updated version that further refines and expands its capabilities. The current update recognizes the growing need for accurate taxonomic information as defining a species increasingly relies on genome sequence comparisons. We also incorporated an advanced strategy for addressing underrepresented or less studied lineages, bolstering the comprehensiveness and accuracy of our database. Our rigorous quality control protocols remain in place: whole-genome assemblies from the NCBI Assembly Database undergo stringent screening to remove low-quality sequence data. These are then passed through our enhanced identification bioinformatics pipeline, which initiates a 16S rRNA gene similarity search and then calculates the average nucleotide identity (ANI). For genome sequences lacking a 16S rRNA sequence and without a closely related genomic representative for ANI calculation, we apply a different ANI approach using bacterial core genes for improved taxonomic placement (core gene ANI, cgANI). Because of the increase in genome sequences available in NCBI and our newly introduced cgANI method, EzBioCloud now encompasses a total of 109 835 species, of which 21 964 have validly published names. Another 47 896 are candidate species identified either through 16S rRNA sequence similarity (phylotypes) or through whole genome ANI (genomospecies), and the remaining 39 975 were positioned in the taxonomic tree by cgANI (species clusters). Our EzBioCloud database is accessible at www.ezbiocloud.net/db.


Subject(s)
Archaea , Bacteria , Genome, Bacterial , Microbiota , RNA, Ribosomal, 16S , RNA, Ribosomal, 16S/genetics , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification , Archaea/genetics , Archaea/classification , Phylogeny , Databases, Genetic , Genome, Archaeal , Sequence Analysis, DNA , Computational Biology/methods
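
The routing rule and the species breakdown in the abstract above can be illustrated with a minimal sketch. The `Genome` record, its field names, and the `placement_route` function are hypothetical and are not part of EzBioCloud; only the rule itself (cgANI when a genome has neither a 16S rRNA sequence nor a close genomic representative) and the three counts come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Genome:
    """Hypothetical record; field names are illustrative, not EzBioCloud's."""
    has_16s: bool              # a 16S rRNA gene was found in the assembly
    has_close_reference: bool  # a closely related genome exists for ANI


def placement_route(genome: Genome) -> str:
    """Pick the identification route described in the abstract."""
    # cgANI applies only when BOTH the 16S gene and a close reference are missing.
    if not genome.has_16s and not genome.has_close_reference:
        return "cgANI"
    return "16S/ANI"


# Species counts quoted in the abstract, by category.
species_counts = {
    "validly published names": 21_964,
    "candidate species (phylotypes, genomospecies)": 47_896,
    "species clusters (cgANI)": 39_975,
}
total_species = sum(species_counts.values())

print(placement_route(Genome(has_16s=False, has_close_reference=False)))  # cgANI
print(total_species)  # 109835
```

The three categories sum to the 109 835 species reported in the abstract.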
2.
PLoS One ; 17(8): e0272354, 2022.
Article in English | MEDLINE | ID: mdl-35913976

ABSTRACT

Recent advances in massively parallel sequencing have enabled accurate microbiome profiling at a dramatically lower cost. As a result, the human microbiome has been the subject of intensive investigation in public health and medicine. Meanwhile, researchers have developed many microbiome data analysis methods, protocols, and tools. Among these, web platforms stand out for their user-friendly interfaces and streamlined protocols covering a long sequence of analytic procedures. However, existing web platforms can handle only a categorical trait of interest, a cross-sectional study design, and analysis with no covariate adjustment. We therefore introduce a unified web platform, named MiCloud, for a binary or continuous trait of interest, a cross-sectional or longitudinal/family-based study design, and analysis with or without covariate adjustment. MiCloud handles all such types of analyses for both ecological measures (i.e., alpha and beta diversity indices) and microbial taxa in relative abundance at different taxonomic levels (i.e., phylum, class, order, family, genus and species). Importantly, MiCloud also provides a unified analytic protocol that streamlines data inputs, quality controls, data transformations, statistical methods and visualizations, with vastly extended utility and flexibility suited to microbiome data analysis. We illustrate the use of MiCloud through the United Kingdom twin study on the association between the gut microbiome and body mass index, adjusting for age. MiCloud can be run on either the web server (http://micloud.kr) or the user's computer (https://github.com/wg99526/micloudgit).


Subject(s)
Gastrointestinal Microbiome , Microbiota , Cross-Sectional Studies , Data Analysis , Gastrointestinal Microbiome/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Microbiota/genetics
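
To make the "alpha and beta diversity indices" mentioned in the abstract above concrete, here is a minimal sketch of one index of each kind. The abstract does not say which indices MiCloud computes, so the choice of the Shannon index (alpha) and Bray-Curtis dissimilarity (beta) is an assumption, and the function names and sample counts are illustrative only.

```python
import math
from typing import Sequence


def shannon_index(counts: Sequence[float]) -> float:
    """Alpha diversity: Shannon index (natural log) of one sample's taxon counts."""
    total = sum(counts)
    # Taxa with zero counts contribute nothing and are skipped.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Beta diversity: Bray-Curtis dissimilarity between two samples' taxon counts."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator


# Made-up counts for three taxa in two samples (illustrative, not real data).
sample_a = [10, 10, 0]
sample_b = [0, 10, 10]

print(shannon_index(sample_a))
print(bray_curtis(sample_a, sample_b))
```

Alpha diversity summarizes a single sample, while beta diversity compares a pair of samples, which is why the two functions take one and two count vectors respectively.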
3.
Front Microbiol ; 13: 912853, 2022.
Article in English | MEDLINE | ID: mdl-35983325

ABSTRACT

An association between the vaginal microbiome and preterm birth has been reported. However, in practice, it is difficult to predict premature birth using the microbiome because the vaginal microbial community varies highly among individuals, and the prediction rate is very low. The purpose of this study was to use machine learning to select, from among various vaginal microbiota, markers that improve predictive power, and to develop a prediction algorithm with better predictive power that combines clinical information. In this multicenter case-control study of 150 Korean pregnant women (54 in the preterm delivery group and 96 in the full-term delivery group), cervicovaginal fluid was collected during mid-pregnancy. Their demographic profiles (age, BMI, education level, and PTB history), white blood cell count, and cervical length were recorded, and the microbiome profiles of the cervicovaginal fluid were analyzed. The subjects were randomly divided into a training set (n = 101) and a test set (n = 49) in a two-to-one ratio. When training ML models using selected markers, five-fold cross-validation was performed on the training set. A univariate analysis was performed to select markers using seven statistical tests, including the Wilcoxon rank-sum test. Using the selected markers, including Lactobacillus spp., Gardnerella vaginalis, Ureaplasma parvum, Atopobium vaginae, Prevotella timonensis, and Peptoniphilus grossensis, machine learning models (logistic regression, random forest, extreme gradient boosting, support vector machine, and GUIDE) were used to build prediction models. The test area under the curve of the logistic regression model was 0.72 when it was trained with the 17 selected markers. When white blood cell count and cervical length were combined with the seven vaginal microbiome markers, the random forest model showed the highest test area under the curve, 0.84. GUIDE, a single-tree model, provided a more reasonable biological interpretation: using the 10 selected markers (A. vaginae, G. vaginalis, Lactobacillus crispatus, Lactobacillus fornicalis, Lactobacillus gasseri, Lactobacillus iners, Lactobacillus jensenii, Peptoniphilus grossensis, P. timonensis, and U. parvum) and the covariates, it produced a tree with a test area under the curve of 0.77. It was confirmed that the association with preterm birth increased when P. timonensis and U. parvum increased (AUC = 0.77), which could also be explained by the fact that as the number of Peptoniphilus lacrimalis increased, the association with preterm birth was high (AUC = 0.77). Our study demonstrates that several candidate bacteria could be used as potential predictors for preterm birth, and that the predictive rate can be increased through a machine learning model employing a combination of cervical length and white blood cell count information.
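
The "test area under the curve" values reported in the abstract above can be read as the probability that a randomly chosen preterm case receives a higher predicted score than a randomly chosen full-term case. The sketch below computes AUC directly from that definition; it is a generic illustration, not the study's code, and the function name and scores are made up.

```python
from typing import Sequence


def auc_from_scores(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is the same pairwise statistic
    that underlies the Wilcoxon rank-sum (Mann-Whitney) test.
    """
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))


# Made-up predicted risks for two preterm and two full-term cases.
preterm_scores = [0.9, 0.3]
full_term_scores = [0.5, 0.1]

print(auc_from_scores(preterm_scores, full_term_scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for the reported values of 0.72, 0.77, and 0.84.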
