1.
Bioinformatics ; 39(3)2023 03 01.
Article in English | MEDLINE | ID: mdl-36929933

ABSTRACT

SUMMARY: Microbial secondary metabolites exhibit potential medicinal value. A large number of secondary metabolite biosynthetic gene clusters (BGCs) in the human gut microbiome, which exhibit essential biological activity in microbe-microbe and microbe-host interactions, have not been adequately characterized, making it difficult to prioritize these BGCs for experimental characterization. Here, we present sBGC-hm, an atlas of secondary metabolite BGCs that allows researchers to explore the potential therapeutic benefits of these natural products. One of its key features is the ability to assist in optimizing the BGC structure by utilizing the gene co-occurrence matrix obtained from Human Microbiome Project data. Results are viewable online and can be downloaded as spreadsheets. AVAILABILITY AND IMPLEMENTATION: The database is openly available at https://www.wzubio.com/sbgc. The website is powered by an Apache 2 server with PHP and MariaDB.


Subject(s)
Gastrointestinal Microbiome , Microbiota , Humans , Gastrointestinal Microbiome/genetics , Multigene Family , Biosynthetic Pathways/genetics
2.
BMC Cancer ; 19(1): 127, 2019 Feb 07.
Article in English | MEDLINE | ID: mdl-30732570

ABSTRACT

BACKGROUND: miRNA isoforms (isomiRs) have been suggested to regulate the same pathways as the canonical miRNAs and to play an important biological role in miRNA-mediated gene regulation. Recently, a study demonstrated that the presence or absence of all isomiRs could efficiently discriminate amongst 32 TCGA cancer types. In addition, an effective reduction of the distinguishing isomiR features for multiclass tumor discrimination would have a major impact on our understanding of the disease and on the treatment of cancer. METHODS: In this study, we combined genetic algorithms (GA) with Random Forest (RF) algorithms to detect reliable sets of cancer-associated 5'isomiRs from TCGA isomiR expression data for multiclass tumor classification. RESULTS: We obtained 100 sets of optimal predictive features, each of which comprised 50 5'isomiRs and could classify samples from 32 different tumor types with an average sensitivity of 92%. We calculated the frequency with which a 5'isomiR was found in these sets as a measure of its importance for tumor classification. Many highly frequent 5'isomiRs with 5' loci different from those of the canonical miRNAs were detected in these sets, supporting the view that isomiRs play a significant role in multiclass tumor classification. Further functional enrichment analysis showed that the target genes of the 10 most frequently appearing 5'isomiRs were involved in transcription activator activity, protein kinase activity and cell-cell adhesion. CONCLUSIONS: The findings of the present study indicated that 5'isomiRs might be employed for multiclass tumor classification and suggested that the GA/RF model could perform effective tumor classification with a series of largely independent optimal predictor 5'isomiR sets.
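The frequency-based importance measure described in the RESULTS can be illustrated with a minimal Python sketch. It assumes each optimal feature set is available as a collection of 5'isomiR names. The names below are hypothetical placeholders, and this is not the authors' GA/RF code.

```python
from collections import Counter


def feature_frequency(feature_sets):
    """Return, for each feature, the fraction of feature sets that contain it."""
    counts = Counter()
    for features in feature_sets:
        counts.update(set(features))  # count each feature at most once per set
    n_sets = len(feature_sets)
    return {name: count / n_sets for name, count in counts.items()}


# Hypothetical feature sets; the study reports 100 sets of 50 5'isomiRs each.
example_sets = [
    {"isomiR_A", "isomiR_B"},
    {"isomiR_A", "isomiR_C"},
    {"isomiR_A", "isomiR_B"},
    {"isomiR_D"},
]

frequency = feature_frequency(example_sets)
# Rank by descending frequency, breaking ties alphabetically.
ranked = sorted(frequency.items(), key=lambda item: (-item[1], item[0]))
```

Features that recur across many independently selected sets rank first, which is the sense in which frequency serves as an importance score here.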


Subject(s)
Biomarkers, Tumor , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , MicroRNAs/genetics , Neoplasms/genetics , RNA Interference , Transcriptome , Algorithms , Chromosome Mapping , Computational Biology/methods , Humans , Models, Biological , Neoplasms/diagnosis
3.
J Theor Biol ; 398: 1-8, 2016 06 07.
Article in English | MEDLINE | ID: mdl-27000773

ABSTRACT

BACKGROUND: 16S rRNA genes have been widely used for phylogenetic reconstruction and the quantification of microbial diversity through the application of next-generation sequencing technology. However, long-read sequencing is still costly, while short-read sequencing carries less information for complex microbial community profiling; therefore, the application of high-throughput sequencing platforms remains challenging in microbial community reconstruction analysis. RESULTS: Here, we developed a method to investigate the profile of aligned 16S rRNA gene sequences and to measure the proper region for microbial community reconstruction, as a step in creating a more efficient way to detect microorganisms at the genus level. We found that each genus has its own preferential genus-specific amplicons for genus assignment, which are not always located in hypervariable regions (HVRs). It was also noted that the rare genera should contribute less than dominant ones to the common profile of the aligned 16S rRNA sequences and have lower affinity to the common universal primer. CONCLUSIONS: Therefore, using multiple 16S rRNA regions rather than one "universal" region can significantly improve the ability of microbial community reconstruction. In addition, we found that a short fragment is suitable for identifying most genera, and the proper conserved regions used for primer design are larger than before.


Subject(s)
Microbiota/genetics , RNA, Ribosomal, 16S/genetics , Base Sequence , Conserved Sequence/genetics , Nucleic Acid Conformation , RNA, Ribosomal, 16S/chemistry
4.
Nucleic Acids Res ; 41(6): e74, 2013 Apr 01.
Article in English | MEDLINE | ID: mdl-23335781

ABSTRACT

Thousands of novel transcripts have been identified using deep transcriptome sequencing. This discovery of large and 'hidden' transcriptome rejuvenates the demand for methods that can rapidly distinguish between coding and noncoding RNA. Here, we present a novel alignment-free method, Coding Potential Assessment Tool (CPAT), which rapidly recognizes coding and noncoding transcripts from a large pool of candidates. To this end, CPAT uses a logistic regression model built with four sequence features: open reading frame size, open reading frame coverage, Fickett TESTCODE statistic and hexamer usage bias. CPAT software outperformed (sensitivity: 0.96, specificity: 0.97) other state-of-the-art alignment-based software such as Coding-Potential Calculator (sensitivity: 0.99, specificity: 0.74) and Phylo Codon Substitution Frequencies (sensitivity: 0.90, specificity: 0.63). In addition to high accuracy, CPAT is approximately four orders of magnitude faster than Coding-Potential Calculator and Phylo Codon Substitution Frequencies, enabling its users to process thousands of transcripts within seconds. The software accepts input sequences in either FASTA- or BED-formatted data files. We also developed a web interface for CPAT that allows users to submit sequences and receive the prediction results almost instantly.
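Two of the four sequence features named above, open reading frame size and open reading frame coverage, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions (forward strand only, ORF defined as ATG through the first in-frame stop codon). The function names are hypothetical and the code is not taken from CPAT.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq):
    """Length in nucleotides of the longest ATG..stop ORF in the three forward frames.

    The stop codon is included in the reported length.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # close the ORF and look for the next ATG
    return best


def orf_coverage(seq):
    """Fraction of the transcript covered by its longest ORF."""
    return longest_orf_length(seq) / len(seq) if seq else 0.0
```

For the toy transcript `CCATGAAATTTTAGGG`, the longest ORF is `ATGAAATTTTAG` (12 nt), so the coverage is 12/16. In a model such as the one described, features like these would be combined with the other two in a logistic regression.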


Subject(s)
Open Reading Frames , RNA, Untranslated/chemistry , Software , Logistic Models , Sequence Analysis, RNA
5.
Archaea ; 2014: 671059, 2014.
Article in English | MEDLINE | ID: mdl-24948879

ABSTRACT

Growing evidence indicates that miRNA genes exist in archaeal genomes, though the functional role of such noncoding RNAs remains unclear. Here, we integrated the phylogenetic information of available archaeal genomes to predict miRNA seeds (typically defined as nucleotides 2-8 of mature miRNAs) on the genomic scale. We found 2649 candidate seeds with a significant conservation signal. Eleven of 29 unique seeds from a previous study support our result (P value <0.01), which demonstrates that the pipeline is suitable for predicting experimentally detectable miRNA seeds. The statistical significance of the overlap between the detected archaeal seeds and known eukaryotic seeds suggests that miRNAs may have evolved before the divergence of these two domains of cellular life. In addition, miRNA targets are enriched for genes involved in transcriptional regulation, which is consistent with the situation in eukaryotes. Our research will enhance regulatory network analysis in Archaea.
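The seed definition used above (nucleotides 2-8 of the mature miRNA) maps directly onto a string slice. The following Python sketch shows only that definition, not the authors' prediction pipeline. The function name is hypothetical and the example sequence is human let-7a-5p rather than an archaeal miRNA.

```python
def mirna_seed(mature_sequence):
    """Return nucleotides 2-8 (1-based, inclusive) of a mature miRNA sequence."""
    if len(mature_sequence) < 8:
        raise ValueError("mature miRNA sequence must be at least 8 nt long")
    # 1-based positions 2..8 correspond to 0-based indices 1..7.
    return mature_sequence[1:8].upper()


# Human let-7a-5p, used here only as a familiar example.
let7a = "UGAGGUAGUAGGUUGUAUAGUU"
seed = mirna_seed(let7a)
```

Here `seed` is the 7-nt string `GAGGUAG`.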


Subject(s)
Archaea/genetics , Genes, Archaeal , Genome, Archaeal , MicroRNAs/genetics , Computational Biology/methods
6.
Bioinformatics ; 28(16): 2184-5, 2012 Aug 15.
Article in English | MEDLINE | ID: mdl-22743226

ABSTRACT

MOTIVATION: RNA-seq has been extensively used for transcriptome study. Quality control (QC) is critical to ensure that RNA-seq data are of high quality and suitable for subsequent analyses. However, QC is a time-consuming and complex task, due to the massive size and versatile nature of RNA-seq data. Therefore, a convenient and comprehensive QC tool to assess RNA-seq quality is sorely needed. RESULTS: We developed the RSeQC package to comprehensively evaluate different aspects of RNA-seq experiments, such as sequence quality, GC bias, polymerase chain reaction bias, nucleotide composition bias, sequencing depth, strand specificity, coverage uniformity and read distribution over the genome structure. RSeQC takes both SAM and BAM files as input, which can be produced by most RNA-seq mapping tools as well as BED files, which are widely used for gene models. Most modules in RSeQC take advantage of R scripts for visualization, and they are notably efficient in dealing with large BAM/SAM files containing hundreds of millions of alignments. AVAILABILITY AND IMPLEMENTATION: RSeQC is written in Python and C. Source code and a comprehensive user's manual are freely available at: http://code.google.com/p/rseqc/.


Subject(s)
Sequence Analysis, RNA/methods , Software , Transcriptome , Computational Biology/methods , Quality Control , RNA/genetics
7.
Genes (Basel) ; 14(10)2023 Oct 21.
Article in English | MEDLINE | ID: mdl-37895318

ABSTRACT

Sargassum horneri, a prevalent species of brown algae found along the coast of the northwest Pacific Ocean, holds significant importance as a valuable source of bioactive compounds. However, its rapid growth can lead to the formation of a destructive "golden tide", causing severe damage to the local economy and coastal ecosystems. In this study, we carried out de novo whole-genome sequencing of S. horneri using next-generation sequencing to unravel the genetic information of this alga. By utilizing a reference-guided de novo assembly pipeline with a closely related species, we successfully established a final assembled genome with a total length of 385 Mb. Repetitive sequences made up approximately 30.6% of this genome. Among the identified putative genes, around 87.03% showed homology with entries in the NCBI non-redundant protein database, with Ectocarpus siliculosus being the most closely related species for approximately one-third of these genes. One gene encoding an alkaline phosphatase family protein was found to exhibit positive selection, which could give a clue for the formation of S. horneri golden tides. Additionally, we characterized putative genes involved in fucoidan biosynthesis metabolism, a significant pathway in S. horneri. This study represents the first genome-wide characterization of a S. horneri species, providing crucial insights for future investigations, such as ecological genomic analyses.


Subject(s)
Sargassum , Seaweed , Seaweed/genetics , Sargassum/genetics , Sargassum/metabolism , Ecosystem , Pacific Ocean
8.
Front Microbiol ; 14: 1272605, 2023.
Article in English | MEDLINE | ID: mdl-38029096

ABSTRACT

Introduction: Dormitory washbasins can breed microorganisms that produce odorous gases, polluting the indoor environment. Methods: We utilized metagenome sequencing to analyze the microbiota of 40 samples from the drain pipes of dormitory washbasins. Our study aimed to investigate the microbial community structure, antibiotic resistance genes, and virulence factors, and to identify potential influencing factors such as gender, hometown, frequency of hand sanitizer usage, and number of dormitory residents. Results: The analysis revealed 12 phyla and 147 genera, with Proteobacteria and Actinobacteria being the dominant phyla, and Mycobacterium and Nakamurella being the dominant genera. We found that the factors influencing the microbial community structure of the dormitory washbasin drain pipe are complex. The investigated factors have a slight influence on the drain pipe microbial community, with gender exerting a discernible influence. The annotation results revealed the presence of various virulence factors, pathogenic toxins and antibiotic resistance genes, including 246 different toxin types and 30 different types of antibiotic resistance genes. In contrast to the observed differences in microbial composition among samples, the distribution of resistance genes varies relatively little among samples. Antibiotics are likely a contributing factor in the overall increase of antibiotic resistance genes in drain pipes. Discussion: Overall, our study provides important insights into the community structure and function of microorganisms in dormitory drainage systems, and can guide efforts to prevent and control microbial pollution.

9.
BMC Genomics ; 13: 43, 2012 Jan 25.
Article in English | MEDLINE | ID: mdl-22276739

ABSTRACT

BACKGROUND: Multiplexing has become the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples. RESULTS: Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. CONCLUSIONS: By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing throughput demands.
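The combinatorial idea behind pair barcoding (4 forward barcodes combined with 8 reverse barcodes give 32 distinguishable libraries) can be sketched in Python as follows. The barcode sequences and labels are hypothetical, base-space sequences are assumed for simplicity, and exact matching is used. The abstract specifies none of these details.

```python
from itertools import product

# Hypothetical barcode sequences; the abstract does not list the real ones.
FORWARD_BARCODES = {"ACGT": "F1", "CATG": "F2", "GTAC": "F3", "TGCA": "F4"}
REVERSE_BARCODES = {
    "AACC": "R1", "AAGG": "R2", "CCAA": "R3", "CCTT": "R4",
    "GGAA": "R5", "GGTT": "R6", "TTCC": "R7", "TTGG": "R8",
}


def all_libraries():
    """Every (forward, reverse) label pair: 4 x 8 = 32 libraries."""
    return list(product(FORWARD_BARCODES.values(), REVERSE_BARCODES.values()))


def assign_library(forward_seq, reverse_seq):
    """Assign a read to a library only if both barcodes are recognized."""
    forward = FORWARD_BARCODES.get(forward_seq)
    reverse = REVERSE_BARCODES.get(reverse_seq)
    if forward is None or reverse is None:
        return None
    return (forward, reverse)


def assigned_fraction(reads):
    """Fraction of (forward_seq, reverse_seq) reads assigned to some library."""
    if not reads:
        return 0.0
    assigned = sum(1 for fwd, rev in reads if assign_library(fwd, rev) is not None)
    return assigned / len(reads)
```

Because a read counts only when both of its barcodes are recognized, `assigned_fraction` corresponds to the kind of statistic reported above as the share of reads assigned to both barcodes.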


Subject(s)
Sequence Analysis, DNA/methods , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Databases, Factual , Electronic Data Processing , Female , Humans , MicroRNAs/genetics , Transcriptome
10.
PLoS One ; 17(12): e0278503, 2022.
Article in English | MEDLINE | ID: mdl-36459525

ABSTRACT

P-nitrophenol (PNP) is a carcinogenic, teratogenic, and mutagenic compound that can cause serious harm to the environment. Pseudomonas putida strain DLL-E4 can efficiently degrade PNP in a complex process that is influenced by many factors. Previous studies showed that the expression level of pnpA, a key gene involved in PNP degradation, was upregulated significantly and the degradation of PNP was clearly accelerated in the presence of glucose. In addition, the expression of crc, crcY, and crcZ, key genes involved in catabolite repression, was downregulated, upregulated, and upregulated, respectively. To investigate the effect of the carbon catabolite repression (CCR) system on PNP degradation, the crc, crcY, and crcZ genes were successfully knocked out by conjugation experiments. Our results showed that the knockout of crc accelerated PNP degradation but slowed cell growth. However, the knockout of crcY or crcZ alone accelerated PNP degradation when PNP was the sole carbon source, but slowed PNP degradation when glucose was added. The results indicate that the CCR system is involved in the regulation of PNP degradation, and further work is required to determine the details of the specific regulatory mechanism.


Subject(s)
Catabolite Repression , Craniocerebral Trauma , Pseudomonas putida , Humans , Catabolite Repression/genetics , Pseudomonas putida/genetics , Gene Knockout Techniques , Glucose
11.
J Agric Food Chem ; 69(48): 14643-14649, 2021 Dec 08.
Article in English | MEDLINE | ID: mdl-34812623

ABSTRACT

A type III polyketide synthase (SfuPKS1) from the edible seaweed Sargassum fusiforme was molecularly cloned and biochemically characterized. The recombinant SfuPKS1 catalyzed the condensation of fatty acyl-CoA with two or three malonyl-CoA using lactone-type intramolecular cyclization to produce tri- and/or tetraketides. Moreover, it can also utilize phenylpropanoyl-CoA to synthesize phloroglucinol derivatives through Claisen-type cyclization, exhibiting broad substrate and catalysis specificity. Furthermore, the catalytic efficiency (kcat/KM) for acetyl-CoA was 11.8-fold higher than that for 4-coumaroyl-CoA. A pathway for the synthesis of naringenin involving SfuPKS1 was also constructed in Escherichia coli by recombinant means, resulting in 4.9 mg of naringenin per liter.


Subject(s)
Sargassum , Seaweed , Acyltransferases , Catalysis , Kinetics , Substrate Specificity
12.
Mitochondrial DNA B Resour ; 5(3): 3752-3753, 2020 Nov 13.
Article in English | MEDLINE | ID: mdl-33367087

ABSTRACT

The mitochondrial genome of Pseudoxenodon stejnegeri (Squamata: Colubridae: Pseudoxenodontinae) from Taishun County, Zhejiang Province, China, is 18,475 bp in length and contains 25 tRNAs (including two extra tRNA-Tyr genes and one extra tRNA-Met gene), two rRNAs, 13 protein-coding genes and two identical control regions. The overall AT content of the mitogenome is 59.6% (A = 32.6%, T = 27%, C = 27%, G = 13.4%). In BI and ML phylogenetic analyses, the monophyly of the family Colubridae was well supported and P. stejnegeri was a basal clade of Colubridae.
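The reported base composition is internally consistent: A (32.6%) plus T (27%) gives the stated AT content of 59.6%. A minimal Python sketch of how such percentages can be computed from a sequence is shown below. The function names are hypothetical, and ambiguous bases such as N are simply ignored, which is an assumption rather than a documented convention of the study.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq):
    """AT content in percent."""
    composition = base_composition(seq)
    return composition["A"] + composition["T"]
```

For example, `at_content("AATTGCGC")` gives 50.0, since four of the eight bases are A or T.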

13.
Mitochondrial DNA B Resour ; 5(1): 576-577, 2020 Jan 14.
Article in English | MEDLINE | ID: mdl-33366654

ABSTRACT

The complete chloroplast genome sequence of Sargassum fusiforme is presented here. Circular mapping revealed that the complete chloroplast DNA sequence of S. fusiforme was 124,298 bp in length, had an overall AT content of 69.57%, and included 137 protein-coding genes, 2 open reading frames, 28 transfer RNA genes, and 6 ribosomal RNA genes. The phylogenetic tree based on Bayesian inference shows that the Phaeophyceae species were clustered into two monophyletic groups.

14.
Mitochondrial DNA B Resour ; 5(1): 830-831, 2020 Jan 24.
Article in English | MEDLINE | ID: mdl-33366771

ABSTRACT

We describe the complete mitochondrial genome sequence of Sargassum fusiforme. This mitogenome is a circular molecule of 34,695 bp in length with an overall GC content of 37.54%. Gene annotation identified 35 protein-coding genes, 2 open reading frames, 25 transfer RNA genes, and 3 ribosomal RNA genes. The phylogenetic tree based on Bayesian inference shows that S. fusiforme belongs to the genus Sargassum, supporting the current taxonomic system.

15.
Langmuir ; 25(23): 13448-55, 2009 Dec 01.
Article in English | MEDLINE | ID: mdl-19863074

ABSTRACT

This Article describes a facile method to prepare smooth and homogeneous polymer brush surfaces of variable grafting density from a solid surface by combining Langmuir-Blodgett (LB) deposition with surface-initiated atom transfer radical polymerization (SI-ATRP). This method is successfully demonstrated by the preparation of thermoresponsive poly(N-isopropylacrylamide) (PNIPAM) brush surfaces on smooth silicon and quartz substrates. With the custom-synthesized inert diluent whose chemical structure, except end-functionality, is the same as that of the reactive initiator, smooth and chemically homogeneous mixed monolayers of initiators and inert diluents are immobilized on a solid surface by LB deposition, allowing the further variation of the grafting density of PNIPAM brushes grafted from the initiator monolayers of varied initiator coverage. With the optimized molar ratio of deactivator, Cu(II) in the Cu(I)-ligand catalyst complex, the brush thickness of PNIPAM brushes at varied grafting density is controlled to grow nearly linearly with reaction time while smoothness and chemical homogeneity of PNIPAM brushes are achieved. For the demonstrated PNIPAM brush surfaces, the thermoresponsive characteristics of PNIPAM brushes are also verified. This combined LB-ATRP method can be applied to graft a variety of polymer brushes, including polyelectrolytes and block copolymers, from different solid substrates.

16.
Genomics ; 92(1): 60-4, 2008 Jul.
Article in English | MEDLINE | ID: mdl-18472393

ABSTRACT

Adenylate cyclases, guanylate cyclases, cyclic nucleotide phosphodiesterases, and cyclic nucleotide-binding proteins constitute the core of cAMP and cGMP signaling components. Using a combination of BLAST and profile search methods, we found that cyclic nucleotide-binding proteins exhibited diverse domain architectures. In addition to the domain architectures involved in the characterized functional groups, a cyclic nucleotide-binding domain was also fused to various domains involved in pyridine nucleotide-disulfide oxidoreductase, acetyltransferase, thioredoxin reductase, glutaminase, rhodanese, ferredoxin, and diguanylate cyclase, implying the versatile functions of cyclic nucleotide-binding proteins. We constructed the CSCDB database to accumulate the components of cAMP and cGMP signaling pathways in the complete genomes. User-friendly interfaces were created for easier browsing, searching, and downloading of the data. Besides harboring the sequence itself, each entry provides detailed annotation information, such as sequence features, chromosomal localization, functional domains, transmembrane region, and sequence similarity against several major databases. Currently, CSCDB contains 4234 entries covering 466 organisms, including 35 eukaryotes, 382 bacteria, and 29 archaea. CSCDB is freely accessible on the web at http://cscdb.com.cn.


Subject(s)
Cyclic AMP/genetics , Cyclic AMP/metabolism , Cyclic GMP/genetics , Cyclic GMP/metabolism , Databases, Protein , Enzymes/genetics , Animals , Archaea/genetics , Bacteria/genetics , Enzymes/chemistry , Enzymes/metabolism , Internet , Protein Structure, Tertiary , Signal Transduction , Software
17.
Genomics ; 91(1): 102-7, 2008 Jan.
Article in English | MEDLINE | ID: mdl-18035520

ABSTRACT

Identification of all the transcription factors (TFs) encoded in a given genome is a prerequisite for understanding transcriptional regulatory networks. Archaea are prokaryotes that constitute one of the three main branches of organisms with an astounding diversity of habitats. In this report, we establish the ArchaeaTF database to provide an integrated information resource about TFs in Archaea, such as basic characteristics, domain architectures, and sequence similarities against the linked databases. Through its Web interface, ArchaeaTF provides three different ways for users to retrieve the data: simple browse, keyword search, and BLAST search. Moreover, ArchaeaTF can serve as a useful platform for comparative genomics analysis of archaeal TFs since it implements a series of tools, including MUSCLE for multiple sequence alignments of the DNA-binding domains, QuickTree for phylogenetic tree construction, and OrthoMCL for ortholog identification. The released ArchaeaTF 1.0 contains 2135 putative TFs from 37 completed archaeal genomes. In conclusion, we believe that ArchaeaTF will be a useful resource and convenient platform for researchers working on TFs and transcriptional regulatory networks to retrieve information from TFs in Archaea rapidly. ArchaeaTF is accessible at http://bioinformatics.zj.cn/archaeatf.


Subject(s)
Archaea/genetics , Archaeal Proteins/genetics , DNA-Binding Proteins/genetics , Databases, Protein , Phylogeny , Transcription Factors/genetics , Internet , Protein Structure, Tertiary/genetics
18.
Biomed Res Int ; 2019: 6361320, 2019.
Article in English | MEDLINE | ID: mdl-31309109

ABSTRACT

Obesity is intrinsically linked with the gut microbiome, and studies have identified several obesity-associated microbes. The microbe-microbe interactions can alter the composition of the microbial community and influence host health by producing secondary metabolites (SMs). However, the contribution of these SMs in the prevention and treatment of obesity has been largely ignored. We identified several SM-encoding biosynthetic gene clusters (BGCs) from the metagenomic data of lean and obese individuals and found significant association between some BGCs, including those that produce hitherto unknown SM, and obesity. In addition, the mean abundance of BGCs was positively correlated with obesity, consistent with the lower taxonomic diversity in the gut microbiota of obese individuals. By comparing the BGCs of known SM between obese and nonobese samples, we found that menaquinone produced by Enterobacter cloacae showed the highest correlation with BMI, in agreement with a recent study on human adipose tissue composition. Furthermore, an obesity-related nonribosomal peptide synthetase (NRPS) was negatively associated with Bacteroidetes, indicating that the SMs produced by intestinal microbes in obese individuals can change the microbiome structure. This is the first systemic study of the association between gut microbiome BGCs and obesity and provides new insights into the causes of obesity.


Subject(s)
Bacteroidetes/genetics , Gastrointestinal Microbiome/genetics , Metagenome , Microbial Interactions/genetics , Multigene Family , Obesity , Bacteroidetes/metabolism , Humans , Obesity/genetics , Obesity/microbiology
19.
Comput Biol Chem ; 78: 165-169, 2019 Feb.
Article in English | MEDLINE | ID: mdl-30530297

ABSTRACT

Secondary metabolites are a range of bioactive compounds produced by bacteria, fungi, plants and other organisms. The published archaeal genomic data provide the opportunity for efficient identification of secondary metabolite biosynthetic gene clusters (BGCs) by genome mining. However, studies of secondary metabolites in archaea are still rare. Using antiSMASH, we found two main types of putative secondary metabolite BGCs, bacteriocin and terpene, in 203 archaeal genomes. Compared with the genomes of Euryarchaeota, which usually live in less complex environments, the genomes of Crenarchaeota usually contained more abundant bacteriocin BGCs. In these archaeal genomes, we also found a positive correlation between the abundance of bacteriocin and the abundance of CRISPR spacers, suggesting that bacteriocin might be a crucial component of the innate immune system that defends against microbes living in the same environment. The structural analysis of the bacteriocin gene clusters suggested that the assisting genes located at the edge of clusters evolved faster than the core biosynthetic genes. To the best of our knowledge, we are the first to systematically explore the distribution of secondary metabolites in archaea, and the investigation of the relationship between BGCs and CRISPR spacers expands our understanding of the evolutionary dynamics of these functional molecules.


Subject(s)
Archaea/genetics , Bacteriocins/biosynthesis , Multigene Family , Archaea/metabolism , Bacteriocins/chemistry , Bacteriocins/genetics
20.
Neurosci Lett ; 696: 93-98, 2019 03 23.
Article in English | MEDLINE | ID: mdl-30572101

ABSTRACT

The microbiota of individuals with Parkinson's disease (PD) has been the focus of research in recent years. However, the mechanisms underlying the interactions between the gut microbiome and the brain, as well as its role in PD pathogenesis, remain to be elucidated. In this study, we used a systematic approach to predict putative biosynthetic gene clusters (BGCs) from the raw metagenomic data of the gut microbiome, and identified 43 BGCs that were significantly enriched in the PD patients. Fourteen of these clusters originated from microbes that were not increased in the patients, and the most significantly enriched one encoded a putative efflux protein and a radical SAM protein, indicating a potential role in PD. Based on a random forest classifier, these BGCs can be used to correctly discriminate between PD patients and healthy controls, with a cross-validated AUC of 0.91 from the 31 early-stage PD patients and 28 healthy controls. Our study provides an alternative method to analyze the microbiota of PD patients, and further increases our understanding of this disease.
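The AUC quoted above summarizes how well classifier scores separate patients from controls. As a rough illustration of what that number measures, the Python sketch below computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. It assumes labels coded as 1 (patient) and 0 (control), uses made-up scores, and does not reproduce the study's random forest or its cross-validation.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for two patients (label 1) and two controls (label 0).
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In this toy example three of the four patient-control pairs are ordered correctly, so `example_auc` is 0.75. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.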


Subject(s)
Gastrointestinal Microbiome/genetics , Gastrointestinal Tract/microbiology , Multigene Family/genetics , Parkinson Disease/genetics , Brain/pathology , Case-Control Studies , Feces , Gastrointestinal Tract/metabolism , Humans