1.
World J Gastroenterol ; 30(27): 3336-3355, 2024 Jul 21.
Article in English | MEDLINE | ID: mdl-39086748

ABSTRACT

BACKGROUND: Colorectal polyps that develop via the conventional adenoma-carcinoma sequence [e.g., tubular adenoma (TA)] often progress to malignancy and are closely associated with changes in the composition of the gut microbiome. There is limited research concerning the microbial functions and gut microbiomes associated with colorectal polyps that arise through the serrated polyp pathway, such as hyperplastic polyps (HP). Exploration of microbiome alterations associated with HP and TA would improve the understanding of mechanisms by which specific microbes and their metabolic pathways contribute to colorectal carcinogenesis. AIM: To investigate gut microbiome signatures, microbial associations, and microbial functions in HP and TA patients. METHODS: Full-length 16S rRNA sequencing was used to characterize the gut microbiome in stool samples from control participants without polyps [control group (CT), n = 40], patients with HP (n = 52), and patients with TA (n = 60). Significant differences in gut microbiome composition and functional mechanisms were identified between the CT group and patients with HP or TA. Analytical techniques in this study included differential abundance analysis, co-occurrence network analysis, and differential pathway analysis. RESULTS: Colorectal cancer (CRC)-associated bacteria, including Streptococcus gallolyticus (S. gallolyticus), Bacteroides fragilis, and Clostridium symbiosum, were identified as characteristic microbial species in TA patients. Mediterraneibacter gnavus, associated with dysbiosis and gastrointestinal diseases, was significantly differentially abundant in the HP and TA groups. Functional pathway analysis revealed that HP patients exhibited enrichment in the sulfur oxidation pathway exclusively, whereas TA patients showed dominance in pathways related to secondary metabolite biosynthesis (e.g., mevalonate); S. gallolyticus was a major contributor. 
Co-occurrence network and dynamic network analyses revealed co-occurrence of dysbiosis-associated bacteria in HP patients, whereas TA patients exhibited co-occurrence of CRC-associated bacteria. Furthermore, the co-occurrence of short-chain fatty acid (SCFA)-producing bacteria was lower in TA patients than in HP patients. CONCLUSION: This study revealed distinct gut microbiome signatures associated with pathways of colorectal polyp development, providing insights concerning the roles of microbial species, functional pathways, and microbial interactions in colorectal carcinogenesis.


Subject(s)
Colonic Polyps, Colorectal Neoplasms, Feces, Gastrointestinal Microbiome, 16S Ribosomal RNA, Humans, Female, Male, Middle Aged, Colonic Polyps/microbiology, Colonic Polyps/pathology, Colorectal Neoplasms/microbiology, Colorectal Neoplasms/pathology, 16S Ribosomal RNA/genetics, Aged, Feces/microbiology, Thailand/epidemiology, Adult, Adenoma/microbiology, Bacteria/isolation & purification, Bacteria/genetics, Bacteria/classification, Hyperplasia/microbiology, Case-Control Studies, Dysbiosis/microbiology, Southeast Asian People
2.
Front Genet ; 13: 883766, 2022.
Article in English | MEDLINE | ID: mdl-35571042

ABSTRACT

Hypertension, or elevated blood pressure, is a serious medical condition that affects people worldwide and significantly increases the risks of cardiovascular disease, heart disease, diabetes, stroke, kidney disease, and other health problems. Thus, hypertension is one of the major global causes of premature death. For preventing and treating hypertension with few or no side effects, antihypertensive peptides (AHTPs) obtained from natural sources might be useful as nutraceuticals. Therefore, the search for alternative or novel AHTPs in food or natural sources has received much attention, as AHTPs may be functional agents for human health. AHTPs have been observed in diverse organisms, although many of them remain underinvestigated. Identifying peptides with antihypertensive activity in the laboratory is time- and resource-consuming. Alternatively, computational methods based on robust machine learning can identify or screen potential AHTP candidates prior to experimental verification. In this paper, we propose Ensemble-AHTPpred, an ensemble machine learning algorithm composed of a random forest (RF), a support vector machine (SVM), and extreme gradient boosting (XGB), with the aim of integrating diverse heterogeneous algorithms to enhance the robustness of the final predictive model. The selected feature set includes various computed features, such as physicochemical properties, amino acid compositions (AACs), transitions, n-grams, and secondary structure-related information; these features provide more information for analyzing or explaining the characteristics of the predicted peptide. In addition, the tool is integrated with a newly proposed composite feature (generated based on a logistic regression function) that combines various feature aspects to enable improved AHTP characterization. Our tool, Ensemble-AHTPpred, achieved an overall accuracy above 90% on independent test data. Additionally, the approach was applied to novel experimentally validated AHTPs, obtained from recent studies, which did not overlap with the training and test datasets, and the tool could precisely predict these AHTPs.
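The abstract says the composite feature is generated from a logistic regression function that combines several feature aspects. As a minimal sketch of that idea, the Python snippet below collapses a few descriptors into one value between 0 and 1. The feature names, weights, and bias are hypothetical placeholders chosen for illustration; they are not taken from Ensemble-AHTPpred.

```python
import math

# Hypothetical, illustrative weights; a real model would learn these by
# fitting a logistic regression on labeled AHTP / non-AHTP peptides.
WEIGHTS = {"hydrophobicity": 1.2, "net_charge": -0.4, "helix_fraction": 0.8}
BIAS = -0.5


def composite_feature(features: dict[str, float]) -> float:
    """Collapse several descriptors into one score via the logistic function."""
    z = BIAS + sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


# Made-up descriptor values for a single peptide.
peptide = {"hydrophobicity": 0.6, "net_charge": 1.0, "helix_fraction": 0.3}
score = composite_feature(peptide)
print(f"composite score = {score:.3f}")
```

The resulting score could then be appended to the other features before training the ensemble members, which is one plausible way to "integrate" such a feature.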

3.
Biology (Basel) ; 10(9), 2021 Sep 08.
Article in English | MEDLINE | ID: mdl-34571762

ABSTRACT

Cost-effective microbial lipid production is a prerequisite for the oleochemical sector. In this work, genome-wide transcriptional responses to the utilization of xylose and glucose in oleaginous Aspergillus oryzae were studied in relation to growth and lipid phenotypic traits. Comparative analysis of the active growth (t1) and lipid-accumulating (t2) stages showed that the C5 (xylose) cultures efficiently consumed carbon sources for biomass and lipid production, at levels comparable to the C6 (glucose) cultures. By pairwise comparison, 599 and 917 differentially expressed genes (DEGs) were identified in the t1 and t2 groups, respectively; the consensus DEGs were categorized into polysaccharide-degrading enzymes, membrane transport, and cellular processes. Distinct transcriptional responses within the DEG sets were also found in various metabolic genes, mostly in carbohydrate, amino acid, lipid, and cofactor and vitamin metabolism. Although central carbohydrate metabolism was shared between the C5 and C6 cultures, the metabolic functions in acetyl-CoA and NADPH generation and in the biosynthesis of terpenoid backbones, fatty acids, sterols, and amino acids were allocated to support biomass and lipid production, at least through transcriptional control. This study revealed robust metabolic networks underlying the oleaginicity of A. oryzae that govern glucose/xylose flux toward lipid biosynthesis, providing meaningful hints for further process development of microbial lipid production using cellulosic sugar feedstocks.

4.
Life (Basel) ; 11(4), 2021 Mar 30.
Article in English | MEDLINE | ID: mdl-33808227

ABSTRACT

The accurate prediction of protein localization is a critical step in any functional genome annotation process. This paper proposes an improved strategy for protein subcellular localization prediction in plants based on multiple classifiers, to improve prediction results in terms of both accuracy and reliability. The prediction of plant protein subcellular localization is challenging because the underlying problem is not only a multiclass but also a multilabel problem. Generally, plant proteins can be found in 10-14 locations/compartments. The number of proteins in some compartments (nucleus, cytoplasm, and mitochondria) is generally much greater than that in other compartments (vacuole, peroxisome, Golgi, and cell wall), so the problem of imbalanced data usually arises. To address this, we propose an ensemble machine learning method based on average voting among heterogeneous classifiers. We first extracted various types of features suitable for each type of protein localization to form a total of 479 feature spaces. Then, feature selection methods were used to reduce the dimensions of the features into smaller informative feature subsets. This reduced feature subset was then used to train three different individual models. The outputs of these three classifiers were combined by average voting to return the final probability prediction. The method could predict both single- and multilabel subcellular localizations based on the voting probability. Experimental results indicated that the proposed ensemble method achieved an overall accuracy of 84.58% for 11 compartments on the testing dataset.
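Average voting, as described in this abstract, can be sketched in a few lines of Python. The example below averages per-compartment probabilities from three classifiers and then reports every compartment whose averaged probability reaches a threshold, which allows multilabel output. The compartment names, probabilities, and the 0.5 threshold are illustrative assumptions, not values from the paper.

```python
# Class probabilities from three hypothetical classifiers for one protein.
# The compartment names and numbers are illustrative only.
predictions = [
    {"nucleus": 0.70, "cytoplasm": 0.65, "vacuole": 0.05},
    {"nucleus": 0.80, "cytoplasm": 0.45, "vacuole": 0.10},
    {"nucleus": 0.60, "cytoplasm": 0.55, "vacuole": 0.15},
]


def average_vote(per_model):
    """Average each compartment's probability over all models."""
    labels = per_model[0].keys()
    return {lab: sum(p[lab] for p in per_model) / len(per_model) for lab in labels}


def assign_locations(avg, threshold=0.5):
    """Return every compartment at or above the threshold (multilabel);
    fall back to the single best compartment if none qualifies."""
    chosen = [lab for lab, prob in avg.items() if prob >= threshold]
    return chosen or [max(avg, key=avg.get)]


avg = average_vote(predictions)
locations = assign_locations(avg)
print(avg, locations)
```

The fallback to the single most probable compartment is a design choice made here so that every protein receives at least one label; the paper may handle that case differently.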

5.
Genes (Basel) ; 12(2), 2021 Jan 21.
Article in English | MEDLINE | ID: mdl-33494403

ABSTRACT

Antimicrobial peptides (AMPs) are natural peptides possessing antimicrobial activities. These peptides are important components of the innate immune system. They are found in various organisms. AMP screening and identification by experimental techniques are laborious and time-consuming tasks. Alternatively, computational methods based on machine learning have been developed to screen potential AMP candidates prior to experimental verification. Although various AMP prediction programs are available, there is still a need for improvement to reduce false positives (FPs) and to increase the predictive accuracy. In this work, several well-known single and ensemble machine learning approaches have been explored and evaluated based on balanced training datasets and two large testing datasets. We have demonstrated that the developed program with various predictive models has high performance in differentiating between AMPs and non-AMPs. Thus, we describe the development of a program for the prediction and recognition of AMPs using MaxProbVote, which is an ensemble model. Moreover, to increase prediction efficiency, the ensemble model was integrated with a new hybrid feature based on logistic regression. The ensemble model integrated with the hybrid feature can effectively increase the prediction sensitivity of the developed program called Ensemble-AMPPred, resulting in overall improvements in terms of both sensitivity and specificity compared to those of currently available programs.
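The abstract names MaxProbVote as the ensemble model but does not define it. One plausible reading, assumed here purely for illustration, is that the ensemble adopts the call of whichever member model is most confident. The Python sketch below implements that reading; the function name, labels, and probabilities are hypothetical and should not be taken as the behavior of Ensemble-AMPPred.

```python
def max_prob_vote(model_probs):
    """model_probs: list of P(AMP) values, one per member model.

    Returns (label, confidence) taken from the most confident model, where
    a model's confidence is max(p, 1 - p).
    """
    best = max(model_probs, key=lambda p: max(p, 1.0 - p))
    label = "AMP" if best >= 0.5 else "non-AMP"
    return label, max(best, 1.0 - best)


# Illustrative probabilities from three hypothetical member models.
# Two models lean weakly toward AMP, one is strongly against.
label, confidence = max_prob_vote([0.55, 0.10, 0.60])
print(label, confidence)
```

Under this rule a single strongly confident model can override two weakly confident ones, which differs from majority or average voting.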


Subject(s)
Antimicrobial Cationic Peptides/pharmacology, Genetic Databases, Machine Learning, Software, Algorithms, Antimicrobial Cationic Peptides/chemistry, Reproducibility of Results, Sensitivity and Specificity
6.
Sci Rep ; 10(1): 10241, 2020 Jun 24.
Article in English | MEDLINE | ID: mdl-32581273

ABSTRACT

The safety of microbial cultures utilized for consumption is vital for public health and should be thoroughly assessed. Although general aspects of the safety assessment of microbial cultures have been suggested, neither methodological details nor procedural guidelines have been published. Herein, we propose a detailed protocol for microbial strain safety assessment via whole-genome sequence analysis. Lactobacillus plantarum BCC9546, a starter culture employed in the production of nham, a traditional fermented pork product, was used as an example. The strain's whole genome was sequenced using several next-generation sequencing techniques. Incomplete plasmid information from the PacBio sequencing platform and a shorter chromosome size from the hybrid Oxford Nanopore-Illumina platform were noted. Methods for 1) unambiguous species identification using the 16S rRNA gene and average nucleotide identity, 2) determination of virulence factors and undesirable genes, 3) determination of antimicrobial resistance properties and their possibility of transfer, and 4) determination of the strain's capability to produce antimicrobial drugs were provided in detail. The applicability of the search tools and the limitations of the databases were discussed. Finally, a procedural guideline for the safety assessment of microbial strains via whole-genome analysis was proposed.


Subject(s)
Fermented Foods/microbiology, Lactobacillus plantarum/classification, Lactobacillus plantarum/growth & development, Whole Genome Sequencing/methods, Bacteriological Techniques, Food Safety, Genome Size, Bacterial Genome, High-Throughput Nucleotide Sequencing, Lactobacillus plantarum/genetics, Plasmids/genetics, 16S Ribosomal RNA/genetics
7.
Genes (Basel) ; 11(4), 2020 Mar 28.
Article in English | MEDLINE | ID: mdl-32231066

ABSTRACT

Long non-coding RNAs (lncRNAs) play important roles in the regulation of complex cellular processes, including transcriptional and post-transcriptional regulation of gene expression relevant for development and stress response, among others. Compared with other important crops, little is known about cassava lncRNAs and their roles in abiotic stress adaptation. In this study, we performed a genome-wide study of ncRNAs in cassava, integrating genomics- and transcriptomics-based approaches. In total, 56,840 putative ncRNAs were identified, and approximately half of them were verified using expression data or previously known ncRNAs. Among these were 2,229 potential novel lncRNA transcripts with unmatched sequences, 250 of which were differentially expressed under cold or drought conditions relative to controls. We showed that lncRNAs might be involved in post-transcriptional regulation of stress-induced transcription factors (TFs) such as the zinc-finger, WRKY, and nuclear factor Y gene families. These findings deepen our knowledge of cassava lncRNAs and shed light on their stress-responsive roles.


Subject(s)
Droughts, Plant Gene Expression Regulation, Plant Genome, Manihot/genetics, Plant Proteins/genetics, Long Noncoding RNA/genetics, Physiological Stress, Transcriptome, Genome-Wide Association Study, Manihot/physiology
8.
Gene ; 741: 144559, 2020 May 30.
Article in English | MEDLINE | ID: mdl-32169630

ABSTRACT

Fungi in the order Mortierellales are attractive producers of long-chain polyunsaturated fatty acids (PUFAs). Here, the genome of a novel strain, Mortierella sp. BCC40632, was sequenced and assembled, yielding 65 contigs spanning 49,964,116 bases in total, with 12,149 predicted protein-coding genes. We focused on acetyl-CoA and the metabolic pathways derived from it for the biosynthesis of macromolecules with biological functions, including PUFAs, eicosanoids, and carotenoids. By comparative genome analysis between Mortierellales and Mucorales, the signature genetic characteristics of arachidonic acid-producing strains, including Δ5-desaturase and GLELO-like elongase, were also identified in strain BCC40632. Remarkably, this fungal strain contained only the n-6 pathway of PUFA biosynthesis owing to the absence of a Δ15-desaturase or ω3-desaturase gene, in contrast to other Mortierella species. Four putative enzyme sequences in the eicosanoid biosynthetic pathways were identified in strain BCC40632 and other Mortierellales fungi but were not detected in the Mucorales. Another unique metabolic trait of the Mortierellales was the inability to synthesize carotenoids, resulting from the lack of phytoene synthase and phytoene desaturase genes. The findings provide a perspective on strain optimization for the production of tailor-made products with industrial applications.


Subject(s)
Acetyl Coenzyme A/biosynthesis, Arachidonic Acid/genetics, Fungal Genome/genetics, Mortierella/metabolism, Acetyl Coenzyme A/genetics, Arachidonic Acid/biosynthesis, Biosynthetic Pathways/genetics, Fatty Acid Desaturases/genetics, Fatty Acid Elongases/genetics, Unsaturated Fatty Acids/genetics, Unsaturated Fatty Acids/metabolism, Mortierella/genetics, Mucorales/genetics, Mucorales/metabolism, gamma-Linolenic Acid/genetics, gamma-Linolenic Acid/metabolism
9.
Biomed Res Int ; 2019: 5617153, 2019.
Article in English | MEDLINE | ID: mdl-31886228

ABSTRACT

Several computational approaches for predicting subcellular localization have been developed and proposed. These approaches provide diverse performance because of their different combinations of protein features, training datasets, training strategies, and computational machine learning algorithms. In some cases, these tools may yield inconsistent and conflicting prediction results. It is important to consider such conflicting or contradictory predictions from multiple prediction programs during protein annotation, especially in the case of a multiclass classification problem such as subcellular localization. Hence, to address this issue, this work proposes the use of the particle swarm optimization (PSO) algorithm to combine the prediction outputs from multiple different subcellular localization predictors with the aim of integrating diverse prediction models to enhance the final predictions. Herein, we present PSO-LocBact, a consensus classifier based on PSO that can be used to combine the strengths of several preexisting protein localization predictors specially designed for bacteria. Our experimental results indicate that the proposed method can resolve inconsistency problems in subcellular localization prediction for both Gram-negative and Gram-positive bacterial proteins. The average accuracy achieved on each test dataset is over 98%, higher than that achieved with any individual predictor.
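To illustrate how particle swarm optimization can combine several predictors, the Python sketch below searches for per-predictor weights that maximize the accuracy of a weighted vote on a toy dataset. It is a generic textbook PSO under stated assumptions: the labels, predictor outputs, swarm parameters, and fitness function are invented for illustration, and the abstract does not describe how PSO-LocBact actually encodes or scores its particles.

```python
import random

# Toy data: labels predicted by three hypothetical predictors for six
# proteins, plus the true labels. All values are made up for illustration.
TRUE = ["cyto", "memb", "cyto", "extr", "memb", "cyto"]
PREDS = [
    ["cyto", "memb", "memb", "extr", "cyto", "cyto"],  # predictor A
    ["cyto", "cyto", "cyto", "extr", "memb", "memb"],  # predictor B
    ["memb", "memb", "cyto", "cyto", "memb", "cyto"],  # predictor C
]


def consensus(weights, i):
    """Weighted vote of the predictors for protein i."""
    score = {}
    for w, pred in zip(weights, PREDS):
        score[pred[i]] = score.get(pred[i], 0.0) + w
    return max(score, key=score.get)


def accuracy(weights):
    """Fraction of proteins whose weighted-vote label matches the truth."""
    hits = sum(consensus(weights, i) == t for i, t in enumerate(TRUE))
    return hits / len(TRUE)


def pso(n_particles=10, n_iter=30, inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search weights in [0, 1]^3 that maximize consensus accuracy."""
    rng = random.Random(seed)
    dim = len(PREDS)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_fit = [accuracy(p) for p in pos]     # personal best fitness
    g = max(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]  # global best
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                vel[k][d] = (inertia * vel[k][d]
                             + c1 * rng.random() * (best_pos[k][d] - pos[k][d])
                             + c2 * rng.random() * (g_pos[d] - pos[k][d]))
                # Clamp each weight to the allowed range.
                pos[k][d] = min(1.0, max(0.0, pos[k][d] + vel[k][d]))
            fit = accuracy(pos[k])
            if fit > best_fit[k]:
                best_pos[k], best_fit[k] = pos[k][:], fit
            if fit > g_fit:
                g_pos, g_fit = pos[k][:], fit
    return g_pos, g_fit


weights, fit = pso()
print(weights, fit)
```

In this toy set each predictor alone labels four of the six proteins correctly, while an equal-weight vote labels all six correctly, so the search has room to improve on any single predictor.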


Subject(s)
Bacterial Proteins/classification, Computational Biology/methods, Intracellular Space/chemistry, Machine Learning, Protein Sequence Analysis/methods, Algorithms, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, Consensus
10.
Cancers (Basel) ; 11(7)2019 Jul 12.
Article in English | MEDLINE | ID: mdl-31336886

ABSTRACT

Colorectal adenomas are precursor lesions of colorectal adenocarcinoma. The transition from adenoma to carcinoma in patients with colorectal cancer (CRC) has been associated with an accumulation of genetic aberrations. However, criteria that can screen for adenoma progression to adenocarcinoma are still lacking. The present study is the first attempt to identify genetic aberrations, such as somatic mutations, copy number variations (CNVs), and high-frequency mutated genes, in Thai patients. We characterized the genomic abnormalities of two sample groups: the first comprised five cases with matched normal, colorectal adenoma, and colorectal adenocarcinoma tissues, and the second comprised six cases with matched normal and colorectal adenoma tissues. Whole-exome sequencing was performed for both groups, and the genetic aberrations of the two groups were compared. In the comparisons of normal tissue with both colorectal adenoma and colorectal adenocarcinoma, somatic mutations were observed in the tumor suppressor gene APC (Adenomatous polyposis coli) in eight out of ten patients. In the comparison of normal tissue with colorectal adenoma tissue, somatic mutations were also detected in the Catenin Beta 1 (CTNNB1), Family With Sequence Similarity 123B (FAM123B), F-Box And WD Repeat Domain Containing 7 (FBXW7), Sex-Determining Region Y-Box 9 (SOX9), Low-Density Lipoprotein Receptor-Related Protein 5 (LRP5), Frizzled Class Receptor 10 (FZD10), and AT-Rich Interaction Domain 1A (ARID1A) genes, which are involved in the Wingless-related integration site (Wnt) signaling pathway. In the comparison of normal tissue with colorectal adenocarcinoma tissue, somatic mutations were found in Kirsten rat sarcoma viral oncogene homolog (KRAS), which belongs to the receptor tyrosine kinase-RAS (RTK-RAS) signaling pathway, and in Tumor Protein 53 (TP53) and Ataxia-Telangiectasia Mutated (ATM), which belong to the p53 signaling pathway. These results suggest that APC and TP53 may act as potential screening markers for colorectal adenoma and early-stage CRC. This preliminary study may help identify patients with adenoma and early-stage CRC and may aid in establishing prevention and surveillance strategies to reduce the incidence of CRC.

11.
Biomed Res Int ; 2019: 2019846, 2019.
Article in English | MEDLINE | ID: mdl-31321230

ABSTRACT

MicroRNAs (miRNAs) are small noncoding RNAs involved in the regulation of many cellular processes in plants. Hundreds of miRNAs have been identified in cassava by various techniques, yet these identifications were constrained by a lack of miRNA templates and the narrow range of conditions covered by transcriptome studies. In this research, we conducted a genome-wide identification analysis in which miRNAs from the cassava genome were thoroughly screened using a bioinformatics approach independent of predefined templates and studied conditions. Our work provides a catalog of putative mature miRNAs and explores the landscape of the miRNAome in cassava. These putative miRNAs were validated using statistical analysis as well as available cassava expression data. We showed that the crowded locations of cassava miRNAs are consistent with those in other plants and animals, and we hypothesize that they have the same evolutionary origin. At least 10 conserved miRNAs were identified in cassava based on a comparative study of miRNA conservation. Finally, investigation of the relationships between miRNAs and their target genes enabled us to envisage the complexity of cellular regulatory systems modulated at the posttranscriptional level.


Subject(s)
Computational Biology, Manihot/genetics, MicroRNAs/genetics, Physiological Stress/genetics, Gene Expression Profiling/methods, Plant Gene Expression Regulation/genetics, Plant Genome/genetics, Manihot/growth & development, Transcriptome/genetics
12.
Curr Microbiol ; 75(1): 57-70, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28865010

ABSTRACT

The robust fungus Aspergillus oryzae strain BCC7051 is of interest for the biotechnological production of lipid-derived products owing to its capability to accumulate large amounts of intracellular lipids using various sugars and agro-industrial substrates. Here, we report the genome sequence of the oleaginous A. oryzae BCC7051. The obtained reads were assembled de novo into 25 scaffolds spanning 38,550,958 bp, with 11,456 predicted protein-coding genes. By synteny mapping, a large rearrangement was found in two scaffolds of A. oryzae BCC7051 compared with the reference strain RIB40. The genetic relationship between BCC7051 and other strains of A. oryzae in terms of aflatoxin production was investigated, indicating that A. oryzae BCC7051 belongs to the group 2 non-aflatoxin-producing strains. Moreover, a comparative analysis of structural genes involved in lipid metabolism among oleaginous yeasts and fungi revealed the presence of multiple isoforms of metabolic enzymes responsible for fatty acid synthesis in BCC7051. Alternative routes of acetyl-CoA generation, an oleaginous feature, and the malate/citrate/pyruvate shuttle were also identified in this A. oryzae strain. The genome sequence generated in this work is a dedicated resource for expanding genome-wide studies of microbial lipids at the systems level and for developing a fungal-based platform for the production of diversified lipids with commercial relevance.


Subject(s)
Aspergillus oryzae/genetics, Aspergillus oryzae/metabolism, Fungal Genome, Lipids/biosynthesis, Fungal Proteins/genetics, Fungal Proteins/metabolism, Malates/metabolism, Synteny
13.
Adv Biochem Eng Biotechnol ; 160: 121-141, 2017.
Article in English | MEDLINE | ID: mdl-27783133

ABSTRACT

To understand how biological processes work, it is necessary to explore the systematic regulation governing the behavior of those processes. Beyond driving the normal behavior of organisms, systematic regulation evidently underlies their temporal responses to surrounding environments (dynamics) and long-term phenotypic adaptation (evolution). This regulation arises, in effect, from regulatory components that work together as a network. In the drive to decipher this code of life, a spectrum of technologies has been continuously developed in the post-genomic era. With current advances, high-throughput sequencing technologies are tremendously powerful for facilitating genomics and systems biology studies that attempt to understand system regulation inside cells. The ability to explore the relevant regulatory components, from which the transcriptional and signaling regulation driving core cellular processes can be inferred, is thus enhanced. This chapter reviews high-throughput sequencing technologies, including second- and third-generation sequencing technologies, which support the investigation of genomics and transcriptomics data. The use of these high-throughput data to build virtual networks of systems regulation, particularly transcriptional regulatory networks, is explained. Analysis of the resulting regulatory networks could lead to an understanding of cellular systems regulation at the mechanistic and dynamic levels. Finally, the great contribution of the biological network approach to envisaging systems regulation is demonstrated through a broad range of examples.


Subject(s)
Gene Expression Regulation/genetics, Gene Regulatory Networks/genetics, Genome/genetics, High-Throughput Nucleotide Sequencing/methods, Genetic Models, Proteome/genetics, Animals, Computational Biology/methods, Computer Simulation, Humans
14.
World J Microbiol Biotechnol ; 32(7): 122, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27263017

ABSTRACT

Lipid-degrading or lipolytic enzymes have gained enormous attention in academic and industrial sectors. Several efforts are underway to discover new lipase enzymes from a variety of microorganisms with particular catalytic properties to be used for extensive applications. In addition, various tools and strategies have been implemented to unravel the functional relevance of the versatile lipid-degrading enzymes for special purposes. This review highlights the study of microbial lipid-degrading enzymes through an integrative computational approach. The identification of putative lipase genes from microbial genomes and metagenomic libraries using homology-based mining is discussed, with an emphasis on sequence analysis of conserved motifs and enzyme topology. Molecular modelling of three-dimensional structure on the basis of sequence similarity is shown to be a potential approach for exploring the structural and functional relationships of candidate lipase enzymes. The perspectives on a discriminative framework of cutting-edge tools and technologies, including bioinformatics, computational biology, functional genomics and functional proteomics, intended to facilitate rapid progress in understanding lipolysis mechanism and to discover novel lipid-degrading enzymes of microorganisms are discussed.


Subject(s)
Computational Biology/methods, Lipase/metabolism, Lipid Metabolism/genetics, Lipolysis/genetics, Animals, Bacteria/enzymology, Bacteria/genetics, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Factual Databases, Fungal Proteins/genetics, Fungal Proteins/metabolism, Fungi/enzymology, Fungi/genetics, Microbial Genome, Humans, Lipase/chemistry, Lipase/genetics, Metagenomics/methods, Nucleic Acid Sequence Homology
15.
Microbiology (Reading) ; 161(8): 1613-1626, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26271808

ABSTRACT

Lipases are interesting enzymes that play important roles in maintaining lipid homeostasis and cellular metabolism. Using available genome data, seven lipase families of oleaginous and non-oleaginous yeasts and fungi were categorized based on the similarity of their amino acid sequences and conserved structural domains. Of these, the triacylglycerol lipase (patatin-domain-containing protein) and steryl ester hydrolase (abhydro_lipase-domain-containing protein) families were ubiquitous, being found in all species studied. These two essential lipases showed signature characteristics of integral membrane proteins that might be targeted to lipid monolayer particles. At least one extracellular lipase family existed in each species of yeast and fungi. We found that the diversity of lipase families and the number of genes in individual families were greater in oleaginous strains than in non-oleaginous species, which might play a role in nutrient acquisition from surrounding hydrophobic substrates and contribute to their obese phenotype. The gene/enzyme catalogue and related data on the lipases provided by this study are not only valuable toolboxes for investigating the biological roles of these lipases but also hold potential for various industrial applications.


Subject(s)
Fungal Proteins/genetics, Fungi/enzymology, Fungal Genome, Lipase/genetics, Fungal Proteins/chemistry, Fungal Proteins/metabolism, Fungi/chemistry, Fungi/genetics, Industrial Microbiology, Lipase/chemistry, Lipase/metabolism, Molecular Sequence Data, Tertiary Protein Structure, Sequence Alignment
16.
Nucleic Acids Res ; 42(11): e93, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24771344

ABSTRACT

To identify non-coding RNA (ncRNA) signals within genomic regions, a classification tool was developed based on a hybrid random forest (RF) with a logistic regression model to efficiently discriminate short ncRNA sequences as well as long complex ncRNA sequences. This RF-based classifier was trained on a well-balanced dataset with a discriminative set of features and achieved an accuracy, sensitivity, and specificity of 92.11%, 90.7%, and 93.5%, respectively. The selected feature set includes a newly proposed feature, SCORE. This feature is generated based on a logistic regression function that combines five significant features (structure, sequence, modularity, structural robustness, and coding potential) to enable improved characterization of long ncRNA (lncRNA) elements. The use of SCORE improved the performance of the RF-based classifier in the identification of Rfam lncRNA families. A genome-wide ncRNA classification framework was applied to a wide variety of organisms, with an emphasis on those of economic, social, public health, environmental, and agricultural significance, such as various bacterial genomes, the Arthrospira (Spirulina) genome, and rice and human genomic regions. Our framework was able to identify known ncRNAs with sensitivities of greater than 90% and 77.7% for prokaryotic and eukaryotic sequences, respectively. Our classifier is available at http://ncrna-pred.com/HLRF.htm.


Subject(s)
Algorithms, Long Noncoding RNA/genetics, Small Untranslated RNA/genetics, Classification/methods, Bacterial Genome, Genomics, Humans, Logistic Models, Untranslated RNA/classification, Untranslated RNA/genetics
17.
Microbiology (Reading) ; 159(Pt 12): 2548-2557, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24065718

ABSTRACT

Malic enzyme (ME) is one of the important enzymes for furnishing the cofactor NAD(P)H for the biosynthesis of fatty acids and sterols. Due to the existence of multiple ME isoforms in a range of oleaginous microbes, a molecular basis for the evolutionary relationships amongst the enzymes in oleaginous fungi was investigated using sequence analysis and structural modelling. Evolutionary distance and structural characteristics were used to discriminate the MEs of yeasts and fungi into several groups. Interestingly, the NADP(+)-dependent MEs of Mucoromycotina had an unusual insertion region (FLxxPG) that was not found in other fungi. However, the subcellular compartment of the Mucoromycotina enzyme could not be clearly identified by an analysis of signal peptide sequences. A constructed structural model of the ME of Mucor circinelloides suggested that the insertion region is located at the N-terminus of the enzyme (aa 159-163). In addition, it is presumably part of the dimer interface region of the enzyme, which might provide a continuously positively charged pocket for the efficient binding of negatively charged effector molecules. The discovery of the unique structure of the Mucoromycotina ME suggests the insertion region could be involved in particular kinetics of this enzyme, which may indicate its involvement in the lipogenesis of industrially important oleaginous microbes.


Subject(s)
Molecular Evolution , Fungi/enzymology , Malate Dehydrogenase (NADP+)/genetics , Fungi/genetics , Malate Dehydrogenase (NADP+)/chemistry , Malate Dehydrogenase (NADP+)/classification , Molecular Models , Sequence Alignment , Amino Acid Sequence Homology
18.
Int J Data Min Bioinform ; 7(2): 118-34, 2013.
Article in English | MEDLINE | ID: mdl-23777171

ABSTRACT

Non-coding RNAs (ncRNAs) have important biological functions in living cells that depend on their conserved secondary structures. Here, we focus on computational RNA secondary structure prediction by exploring primary sequences and complementary base pair interactions using the Conditional Random Fields (CRFs) model, which treats RNA prediction as a sequence labelling problem. To extract suitable features from known RNA secondary structures, we developed a feature extraction method based on the loop and stem characteristics of natural RNAs. Our CRFs models can predict the secondary structures of the test RNAs with optimal F-scores between 56.61% and 98.20% for different RNA families.


Subject(s)
Nucleic Acid Conformation , RNA/chemistry , Base Pairing , Computational Biology , Untranslated RNA/chemistry , Sequence Alignment , RNA Sequence Analysis
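
To illustrate what treating structure prediction as a sequence labelling problem can look like, the sketch below turns a dot-bracket structure into one label per base and scores a prediction against a reference. The two-label scheme ("S" for stem, "L" for loop) and the base-pair-level F-score are assumptions made for illustration; the abstract does not specify the label set or how its F-scores were computed.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(structure):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return pairs


def labels_from_dot_bracket(structure):
    """Label each position 'S' (paired, stem) or 'L' (unpaired, loop)."""
    pairs_from_dot_bracket(structure)  # raises if brackets are unbalanced
    return ["S" if ch in "()" else "L" for ch in structure]


def f_score(predicted, reference):
    """Harmonic mean of precision and recall over predicted base pairs."""
    pred = pairs_from_dot_bracket(predicted)
    ref = pairs_from_dot_bracket(reference)
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A sequence labeller such as a CRF would be trained to emit the per-base labels from the primary sequence; the helper functions here only cover the label encoding and the evaluation step.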
19.
Nucleic Acids Res ; 41(1): e21, 2013 Jan 07.
Article in English | MEDLINE | ID: mdl-23012261

ABSTRACT

An ensemble classifier approach for microRNA precursor (pre-miRNA) classification was proposed based upon combining a set of heterogeneous algorithms including support vector machine (SVM), k-nearest neighbors (kNN) and random forest (RF), then aggregating their predictions through a voting system. In addition to the proposed algorithm, the classification performance was also improved using discriminative features, self-containment and its derivatives, which have shown unique structural robustness characteristics of pre-miRNAs. These are applicable across different species. Applying preprocessing methods, namely a correlation-based feature selection (CFS) with genetic algorithm (GA) search method and a modified Synthetic Minority Oversampling Technique (SMOTE) bagging rebalancing method, further improved the performance of this ensemble. The overall prediction accuracy obtained via 10 runs of 5-fold cross validation (CV) was 96.54%, with sensitivity of 94.8% and specificity of 98.3%; this is a better trade-off between sensitivity and specificity than that of other state-of-the-art methods. The ensemble model was applied to animal, plant and virus pre-miRNA and achieved high accuracy, >93%. Exploiting the discriminative set of selected features also suggests that pre-miRNAs possess high intrinsic structural robustness as compared with other stem loops. Our heterogeneous ensemble method gave a relatively more reliable prediction than those using single classifiers. Our program is available at http://ncrna-pred.com/premiRNA.html.


Subject(s)
Algorithms , MicroRNAs/classification , RNA Precursors/classification , Base Pairing , Humans , MicroRNAs/chemistry , RNA Precursors/chemistry , Plant RNA/chemistry , Plant RNA/classification , Viral RNA/chemistry , Viral RNA/classification , Sensitivity and Specificity
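
The voting step and the reported metrics in the abstract above can be sketched as follows. This is a minimal illustration, assuming plain majority voting over hard labels from three classifiers; the abstract says only that predictions are aggregated "through a voting system", so the exact voting rule, the function names and the toy labels here are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Aggregate per-classifier label lists into one label per sample.

    `predictions` holds one list of labels per classifier, all of equal
    length. With an odd number of classifiers and two classes there are
    no ties.
    """
    return [
        Counter(sample_votes).most_common(1)[0][0]
        for sample_votes in zip(*predictions)
    ]


def evaluate(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity).

    Assumes y_true contains at least one positive and one negative sample.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Toy labels standing in for SVM, kNN and RF outputs (1 = pre-miRNA).
svm_pred = [1, 1, 0, 0]
knn_pred = [1, 0, 0, 1]
rf_pred = [1, 1, 1, 0]
ensemble_pred = majority_vote([svm_pred, knn_pred, rf_pred])
accuracy, sensitivity, specificity = evaluate([1, 1, 0, 1], ensemble_pred)
```

In a cross-validation setting like the one reported, these three metrics would be computed per fold and averaged over the repeated runs.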
20.
Stand Genomic Sci ; 6(1): 43-53, 2012 Mar 19.
Article in English | MEDLINE | ID: mdl-22675597

ABSTRACT

Arthrospira platensis is a cyanobacterium that is extensively cultivated outdoors on a large commercial scale for consumption as a food for humans and animals. It can be grown in monoculture under highly alkaline conditions, making it attractive for industrial production. Here we describe the complete genome sequence of A. platensis C1 strain and its annotation. The A. platensis C1 genome contains 6,089,210 bp including 6,108 protein-coding genes and 45 RNA genes, and no plasmids. The genome information has been used for further comparative analysis, particularly of metabolic pathways, photosynthetic efficiency and barriers to gene transfer.

...