1.
mBio ; 15(1): e0146423, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38117035

ABSTRACT

IMPORTANCE: Our study reveals the potential of precision-cut lung slices as an ex vivo platform to study the growth/survival of Pneumocystis spp. that can facilitate the development of new anti-fungal drugs.


Subjects
Anti-Infective Agents , Pneumocystis , Pneumocystis Pneumonia , Lung/microbiology , Pneumocystis Pneumonia/microbiology
2.
Commun Biol ; 6(1): 1265, 2023 12 13.
Article in English | MEDLINE | ID: mdl-38092883

ABSTRACT

SARS-CoV-2 infection can cause persistent respiratory sequelae. However, the underlying mechanisms remain unclear. Here we report that sub-lethally infected K18-human ACE2 mice show patchy pneumonia associated with histiocytic inflammation and collagen deposition at 21 and 45 days post infection (DPI). Transcriptomic analyses revealed that, compared to influenza-infected mice, SARS-CoV-2-infected mice had reduced interferon-gamma/alpha responses at 4 DPI and failed to induce keratin 5 (Krt5), a marker of nascent pulmonary progenitor cells, in the lung at 6 DPI. Histologically, influenza- but not SARS-CoV-2-infected mice showed extensive Krt5+ "pod" structures, co-stained with the stem cell markers Trp63/NGFR, that proliferated in the pulmonary consolidation area at both 7 and 14 DPI, with regression at 21 DPI. These Krt5+ "pod" structures were not observed in the lungs of SARS-CoV-2-infected humans or nonhuman primates. These results suggest that SARS-CoV-2 infection fails to induce nascent Krt5+ cell proliferation in consolidated regions, leading to incomplete repair of the injured lung.


Subjects
COVID-19 , Human Influenza , Mice , Humans , Animals , SARS-CoV-2 , Lung , Gene Expression Profiling
3.
mSphere ; 8(5): e0037523, 2023 10 24.
Article in English | MEDLINE | ID: mdl-37737611

ABSTRACT

Single-cell RNA-seq has been used to characterize human COVID-19. To determine whether preclinical models successfully mimic the cell-intrinsic and -extrinsic effects of severe disease, and to assess whether dissemination of viral RNA in lung cells tracks pathology and results in cell-intrinsic and -extrinsic transcriptomic changes in COVID-19, we conducted a meta-analysis of single-cell data across five model species, analyzing six publicly available scRNA-seq data sets. We used dual mapping (host and virus) and differential gene expression analyses to compare viral+ and viral- cell populations. We conducted a principal component analysis to identify successful models of human COVID-19. We found expression of viral RNA in many non-epithelial cell types. Fibroblasts, macrophages, and endothelial cells exhibit clear evidence of viral-intrinsic and -extrinsic effects on host gene expression. Using viral RNA expression, we found that K18-hACE2 mice most closely modeled severe human COVID-19, followed by hamsters. Ferrets and macaques are poor models of human disease due to the low presence of viral RNA. Moreover, we found that increased transcripts of certain key inflammatory genes such as IL1B, IL18, and CXCL10 are not restricted to virally infected cells, suggesting these genes are regulated in a paracrine or autocrine fashion. These data affirm widespread dissemination of viral RNA in the lung, which may be key in the pathogenesis of severe COVID-19, and demonstrate that ferrets and rhesus macaques are poor models of human COVID-19. IMPORTANCE We conducted a high-resolution meta-analysis of scRNA-seq data from humans and five animal models of COVID-19. This study reports viral RNA dissemination in several cell types in human data as well as in some of the pre-clinical models. Using this metric, the K18-hACE2 mouse model, followed by the hamster model, most closely resembled human COVID-19. We observed clear evidence of viral-intrinsic effects within cells (e.g., IRF5 expression) as well as viral-extrinsic cytokine modulation (e.g., IL1B, IL18, CXCL10). We observed proinflammatory chemokine expression in cells devoid of viral RNA expression, suggesting autocrine/paracrine interferon regulation. This report serves as a resource, synthesizing data from humans with COVID-19 and from animal models and suggesting improvements for relevant pre-clinical models that may aid future diagnostic and therapeutic development projects.


Subjects
COVID-19 , Viral RNA , Cricetinae , Humans , Animals , Mice , Viral RNA/genetics , SARS-CoV-2/genetics , Endothelial Cells , Ferrets , Interleukin-18 , Macaca mulatta
4.
Int J Mol Sci ; 24(16), 2023 Aug 17.
Article in English | MEDLINE | ID: mdl-37629052

ABSTRACT

Within arterial plaque, HIV infection creates a state of inflammation and immune activation, triggering NLRP3/caspase-1 inflammasome, tissue damage, and monocyte/macrophage infiltration. Previously, we documented that caspase-1 activation in myeloid cells was linked with HIV-associated atherosclerosis in mice and people with HIV. Here, we mechanistically examined the direct effect of caspase-1 on HIV-associated atherosclerosis. Caspase-1-deficient (Casp-1-/-) mice were crossed with HIV-1 transgenic (Tg26+/-) mice with an atherogenic ApoE-deficient (ApoE-/-) background to create global caspase-1-deficient mice (Tg26+/-/ApoE-/-/Casp-1-/-). Caspase-1-sufficient (Tg26+/-/ApoE-/-/Casp-1+/+) mice served as the controls. Next, we created chimeric hematopoietic cell-deficient mice by reconstituting irradiated ApoE-/- mice with bone marrow cells transplanted from Tg26+/-/ApoE-/-/Casp-1-/- (BMT Casp-1-/-) or Tg26+/-/ApoE-/-/Casp-1+/+ (BMT Casp-1+/+) mice. Global caspase-1 knockout in mice suppressed plaque deposition in the thoracic aorta, serum IL-18 levels, and ex vivo foam cell formation. The deficiency of caspase-1 in hematopoietic cells resulted in reduced atherosclerotic plaque burden in the whole aorta and aortic root, which was associated with reduced macrophage infiltration. Transcriptomic analyses of peripheral mononuclear cells and splenocytes indicated that caspase-1 deficiency inhibited caspase-1 pathway-related genes. These results document the critical atherogenic role of caspase-1 in chronic HIV infection and highlight the implication of this pathway and peripheral immune activation in HIV-associated atherosclerosis.


Subjects
Atherosclerosis , HIV Infections , HIV-1 , Atherosclerotic Plaque , Animals , Mice , Apolipoproteins E/genetics , Atherosclerosis/genetics , Caspase 1/genetics , HIV Infections/complications , HIV Infections/genetics , Atherosclerotic Plaque/genetics
5.
J Immunol ; 211(2): 252-260, 2023 07 15.
Article in English | MEDLINE | ID: mdl-37265402

ABSTRACT

SARS-CoV-2 has caused an estimated 7 million deaths worldwide to date. A secreted SARS-CoV-2 accessory protein, known as open reading frame 8 (ORF8), elicits inflammatory pulmonary cytokine responses and is associated with disease severity in COVID-19 patients. Recent reports proposed that ORF8 mediates downstream signals in macrophages and monocytes through the IL-17 receptor complex (IL-17RA, IL-17RC). However, IL-17 signals are generally found to be restricted to the nonhematopoietic compartment, thought to be due to rate-limiting expression of IL-17RC. Accordingly, we revisited the capacity of IL-17 and ORF8 to induce cytokine gene expression in mouse and human macrophages and monocytes. In SARS-CoV-2-infected human and mouse lungs, IL17RC mRNA was undetectable in monocyte/macrophage populations. In cultured mouse and human monocytes and macrophages, ORF8 but not IL-17 led to elevated expression of target cytokines. ORF8-induced signaling was fully preserved in the presence of anti-IL-17RA/RC neutralizing Abs and in Il17ra-/- cells. ORF8 signaling was also operative in Il1r1-/- bone marrow-derived macrophages. However, the TLR/IL-1R family adaptor MyD88, which is dispensable for IL-17R signaling, was required for ORF8 activity. Thus, we conclude that ORF8 transduces inflammatory signaling in monocytes and macrophages via MyD88 independently of the IL-17R.


Subjects
COVID-19 , Open Reading Frames , SARS-CoV-2 , Animals , Humans , Mice , COVID-19/immunology , COVID-19/virology , Cytokines/metabolism , Macrophages/metabolism , Monocytes/metabolism , Myeloid Differentiation Factor 88/genetics , Myeloid Differentiation Factor 88/metabolism , Interleukin-17 Receptors/genetics , Interleukin-17 Receptors/metabolism , SARS-CoV-2/metabolism
6.
J Fungi (Basel) ; 9(6), 2023 May 24.
Article in English | MEDLINE | ID: mdl-37367538

ABSTRACT

Pneumocystis jirovecii is the most common cause of fungal pneumonia in children under the age of 2 years. However, the inability to culture and propagate this organism has hampered the acquisition of a fungal genome as well as the development of recombinant antigens to conduct seroprevalence studies. In this study, we performed proteomics on Pneumocystis-infected mice and used the recent P. murina and P. jirovecii genomes to prioritize antigens for recombinant protein expression. We focused on a fungal glucanase due to its conservation among fungal species. We found evidence of maternal IgG to this antigen, followed by a nadir in pediatric samples between 1 and 3 months of age, followed by an increase in prevalence over time consistent with the known epidemiology of Pneumocystis exposure. Moreover, there was a strong concordance of anti-glucanase responses and IgG against another Pneumocystis antigen, PNEG_01454. Taken together, these antigens may be useful tools for Pneumocystis seroprevalence and seroconversion studies.

8.
Curr Protein Pept Sci ; 23(11): 744-756, 2022.
Article in English | MEDLINE | ID: mdl-35762552

ABSTRACT

Lysine succinylation is a protein post-translational modification (PTM) in which a succinyl group (-CO-CH2-CH2-CO2H) is added to a lysine residue, reversing lysine's positive charge to a negative charge and leading to significant changes in protein structure and function. It occurs on a wide range of proteins and plays an important role in various cellular and biological processes in both eukaryotes and prokaryotes. Beyond experimental identification of succinylation sites, many studies have developed sequence-based prediction using machine learning approaches, because it promises to be extremely time-saving, accurate, robust, and cost-effective. Despite these benefits of computational prediction of lysine succinylation sites for different species, a number of issues need to be addressed in the design and development of succinylation site predictors. Although many studies have used different statistical and machine learning tools, only a few have examined these bioinformatics issues in depth. Therefore, this comprehensive comparative review presents the latest advances in prediction models, datasets, and online resources, as well as their obstacles and limits, to provide a useful guideline for developing more suitable and effective succinylation site prediction tools.


Subjects
Lysine , Proteins , Lysine/metabolism , Amino Acid Sequence , Proteins/chemistry , Computational Biology , Post-Translational Protein Processing
9.
Sci Rep ; 12(1): 2632, 2022 02 16.
Article in English | MEDLINE | ID: mdl-35173235

ABSTRACT

Serine phosphorylation is a type of protein post-translational modification (PTM) that plays an essential role in various cellular processes and disease pathogenesis. Numerous methods are used for the prediction of phosphorylation sites. However, traditional wet-lab experimental approaches are time-consuming, laborious, and expensive. In this work, a computational predictor was proposed to predict serine phosphorylation sites in Schizosaccharomyces pombe (SP) by fusing three encoding schemes, namely composition of k-spaced amino acid pairs (CKSAAP), binary encoding, and amino acid composition (AAC), with the random forest (RF) classifier. So far, this is the first method developed to predict serine phosphorylation sites for SP. Both the training and independent test performance scores were used to investigate the success of the proposed RF-based fusion prediction model compared to others. We also investigated their performance by 5-fold cross-validation (CV). In all cases, the recommended predictor achieved the largest scores of true positive rate (TPR), true negative rate (TNR), accuracy (ACC), Matthews correlation coefficient (MCC), area under the ROC curve (AUC), and partial AUC (pAUC) at a false positive rate (FPR) of 0.20. Thus, the prediction performance discussed in this paper indicates that the proposed approach may be a beneficial and motivating computational resource for predicting serine phosphorylation sites in fungi. The online interface of the software for the proposed prediction model is publicly available at http://mollah-bioinformaticslab-stat.ru.ac.bd/PredSPS/ .


Subjects
Computational Biology/methods , Post-Translational Protein Processing , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism , Serine/metabolism , Amino Acid Sequence , Amino Acids/metabolism , Area Under Curve , Phosphorylation
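
The sequence encodings named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window sequence, the window length, and the choice of gaps k = 0..2 are assumptions made for illustration, and the binary encoding and random forest steps are omitted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(window):
    """Amino acid composition: fraction of each of the 20 standard residues."""
    residues = [r for r in window if r in AMINO_ACIDS]
    n = len(residues) or 1
    return [residues.count(a) / n for a in AMINO_ACIDS]


def cksaap(window, k_max=2):
    """Composition of residue pairs separated by k = 0..k_max other residues."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        # Normalise each gap's 400 pair counts by the number of valid pairs.
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


# Hypothetical 11-residue window centred on a candidate serine.
window = "AKRLSSPEDGT"
vector = aac(window) + cksaap(window, k_max=2)
print(len(vector))  # 20 AAC values + 3 gaps * 400 pairs = 1220
```

A feature vector built this way would then be passed to a classifier; the fusion of several encodings amounts to concatenating or combining their outputs.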
10.
Curr Med Chem ; 29(5): 865-880, 2022.
Article in English | MEDLINE | ID: mdl-34348604

ABSTRACT

MicroRNAs (miRNAs) are central players that regulate the post-transcriptional processes of gene expression. Binding of miRNAs to target mRNAs can repress their expression by inducing degradation or inhibiting translation of the target mRNAs. High-throughput experimental approaches for miRNA target identification are costly and time-consuming, depending on various factors. It is therefore vitally important to develop bioinformatics methods for accurately predicting miRNA targets. With the increase of RNA sequences in the post-genomic era, bioinformatics methods are being developed for miRNA studies, especially for miRNA target prediction. This review summarizes the current development of state-of-the-art bioinformatics tools for miRNA target prediction and their working principles, and points out the progress and limitations of the available miRNA databases. Finally, we discuss the caveats and perspectives of next-generation algorithms for the prediction of miRNA targets.


Subjects
MicroRNAs , Algorithms , Computational Biology/methods , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Messenger RNA/genetics
11.
Brief Bioinform ; 22(3), 2021 05 20.
Article in English | MEDLINE | ID: mdl-32910169

ABSTRACT

DNA N6-methyladenine (6mA) is an important epigenetic modification that is responsible for various cellular processes. The accurate identification of 6mA sites is a challenging task in genome analysis and leads to an understanding of their biological functions. To date, several species-specific machine learning (ML)-based models have been proposed, but most of them were not tested on other species. Hence, their practical application to other plant species is quite limited. In this study, we explored 10 different feature encoding schemes, with the goal of capturing key characteristics around 6mA sites. We selected five feature encoding schemes based on physicochemical and position-specific information that possess high discriminative capability. The resultant feature sets were input to six commonly used ML methods (random forest, support vector machine, extremely randomized tree, logistic regression, naïve Bayes and AdaBoost). The Rosaceae genome was employed to train the above classifiers, which generated 30 baseline models. To integrate their individual strengths, Meta-i6mA was proposed, which combines the baseline models using the meta-predictor approach. In extensive independent tests, Meta-i6mA showed high Matthews correlation coefficient values of 0.918, 0.827 and 0.635 on Rosaceae, rice and Arabidopsis thaliana, respectively, and outperformed the existing predictors. We anticipate that Meta-i6mA can be applied across different plant species. Furthermore, we developed a user-friendly web server, which is available at http://kurata14.bio.kyutech.ac.jp/Meta-i6mA/.


Subjects
Adenosine/analogs & derivatives , Computational Biology/methods , Plant DNA/genetics , Genetic Epigenesis/genetics , Plant Genome/genetics , Machine Learning , Adenosine/metabolism , Algorithms , Arabidopsis/genetics , Arabidopsis/metabolism , Base Sequence , Plant DNA/metabolism , Internet , Genetic Models , Oryza/genetics , Oryza/metabolism , Rosaceae/genetics , Rosaceae/metabolism , Species Specificity , Support Vector Machine
12.
Curr Genomics ; 21(6): 454-463, 2020 Sep.
Article in English | MEDLINE | ID: mdl-33093807

ABSTRACT

Protein-protein interactions (PPIs) are the physical connections between two or more proteins via electrostatic forces or hydrophobic effects. Identification of PPIs is pivotal, as it contributes to the understanding of many biological processes, including protein function, disease incidence, and therapy design. The experimental identification of PPIs via high-throughput technology is time-consuming and expensive. Bioinformatics approaches are expected to overcome these limitations. In this review, our main goal is to provide an inclusive view of the existing sequence-based computational prediction of PPIs. Initially, we briefly introduce the currently available PPI databases and then review the state-of-the-art bioinformatics approaches, their working principles, and their performance. Finally, we discuss the caveats and future perspectives of next-generation algorithms for the prediction of PPIs.

13.
Genomics Proteomics Bioinformatics ; 18(5): 593-600, 2020 10.
Article in English | MEDLINE | ID: mdl-33099033

ABSTRACT

Linear B-cell epitopes are critically important for immunological applications, such as vaccine design, immunodiagnostic tests, and antibody production, as well as disease diagnosis and therapy. The accurate identification of linear B-cell epitopes remains challenging despite several decades of research. In this work, we have developed a novel predictor, Identification of Linear B-cell Epitope (iLBE), by integrating evolutionary and sequence-based features. The successive feature vectors were optimized by a Wilcoxon rank-sum test. Then the random forest (RF) algorithm using the optimal consecutive feature vectors was applied to predict linear B-cell epitopes. We combined the RF scores by logistic regression to enhance the prediction accuracy. iLBE yielded an area under the curve score of 0.809 on the training dataset and outperformed other prediction models on a comprehensive independent dataset. iLBE is a powerful computational tool to identify linear B-cell epitopes and would help to develop penetrating diagnostic tests. A web application with curated datasets for iLBE is freely accessible at http://kurata14.bio.kyutech.ac.jp/iLBE/.


Subjects
Computational Biology , B-Lymphocyte Epitopes , Algorithms , Logistic Models
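
One plausible reading of feature optimization by a Wilcoxon rank-sum test is a per-feature filter that keeps features whose values differ between epitope and non-epitope samples. The sketch below is a self-contained illustration under that assumption; the feature names, the sample values, and the 0.05 cutoff are hypothetical, and the p-value uses the normal approximation without a tie correction.

```python
from math import erfc, sqrt


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation)."""
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over a run of tied values and give them the average rank.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for t in range(i, j + 1):
            ranks[t] = (i + j) / 2 + 1
        i = j + 1
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return erfc(abs(z) / sqrt(2))


# Hypothetical feature values for positive and negative samples.
positives = {"f1": [0.9, 0.8, 0.85, 0.95, 0.7, 0.88],
             "f2": [0.5, 0.4, 0.6, 0.45, 0.55, 0.52]}
negatives = {"f1": [0.1, 0.2, 0.15, 0.3, 0.25, 0.05],
             "f2": [0.48, 0.42, 0.58, 0.47, 0.53, 0.51]}

ALPHA = 0.05  # assumed significance cutoff
selected = [name for name in positives
            if rank_sum_p(positives[name], negatives[name]) < ALPHA]
print(selected)  # ['f1']
```

Only `f1`, whose two groups are fully separated, passes the filter; `f2` has interleaved values and is dropped.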
14.
J Comput Aided Mol Des ; 34(12): 1229-1236, 2020 12.
Article in English | MEDLINE | ID: mdl-32964284

ABSTRACT

A proinflammatory peptide (PIP) is a type of signaling molecule secreted from immune cells that contributes to the first line of defense against invading pathogens. Numerous experiments have shown that PIPs play an important role in human physiology and in applications such as vaccines and immunotherapeutic drugs. Because high-throughput laboratory methods are time-consuming and costly, effective computational methods are in great demand to identify PIPs quickly and accurately. Thus, in this study, we proposed a computational model with a multiple feature representation, called ProIn-Fuse, to improve the performance of PIP identification. Specifically, a feature representation learning model was used to generate probabilistic scores from random forest models employing eight sequence encoding schemes. Finally, ProIn-Fuse was constructed by linearly combining the resultant eight probabilistic scores. Evaluated on an independent test, ProIn-Fuse yielded an accuracy of 0.746, which was 10% higher than those obtained by the state-of-the-art PIP predictors. The proposed ProIn-Fuse can facilitate faster and broader applications of PIPs in drug design and development. The web server, datasets and online instructions are freely accessible at http://kurata14.bio.kyutech.ac.jp/ProIn-Fuse .


Subjects
Algorithms , Computational Biology/methods , Computer Simulation , Inflammation Mediators/metabolism , Machine Learning , Peptide Fragments/metabolism , Humans , Inflammation Mediators/immunology , Peptide Fragments/immunology
15.
Comput Struct Biotechnol J ; 18: 906-912, 2020.
Article in English | MEDLINE | ID: mdl-32322372

ABSTRACT

N4-methylcytosine (4mC) is one of the most important DNA modifications and is involved in regulating cell differentiation and gene expression. The accurate identification of 4mC sites is necessary to understand various biological functions. In this work, we developed a new computational predictor called i4mC-Mouse to identify 4mC sites in the mouse genome. Herein, six encoding schemes, namely k-space nucleotide composition (KSNC), k-mer nucleotide composition (Kmer), mononucleotide binary encoding (MBE), dinucleotide binary encoding, electron-ion interaction pseudo potentials (EIIP) and dinucleotide physicochemical composition, were explored; together they cover different characteristics of DNA sequence information. Subsequently, we built six RF-based encoding models and then linearly combined their probability scores to construct the final predictor. Among the six RF-based models, the Kmer, KSNC, MBE, and EIIP encodings were sufficient, contributing 10%, 45%, 25%, and 20% of the prediction performance, respectively. On the independent test, i4mC-Mouse predicted 4mC sites with an accuracy and MCC of 0.816 and 0.633, respectively, which were approximately 2.5% and 5% higher than those of the existing method (4mCpred-EL). For experimental biologists, a freely available web application was implemented at http://kurata14.bio.kyutech.ac.jp/i4mC-Mouse/.
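
The linear combination of per-encoding probability scores can be sketched as a weighted sum. Treating the reported contributions (10%, 45%, 25%, 20%) as the combination weights is an assumption made here for illustration; the per-site scores and the 0.5 decision threshold are hypothetical.

```python
# Contributions reported in the abstract, read here as combination weights
# (an assumption; the abstract does not give the exact formula).
WEIGHTS = {"Kmer": 0.10, "KSNC": 0.45, "MBE": 0.25, "EIIP": 0.20}


def fuse(scores, weights=WEIGHTS):
    """Weighted sum of per-encoding random forest probability scores."""
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical random forest probabilities for one candidate site.
site_scores = {"Kmer": 0.60, "KSNC": 0.80, "MBE": 0.70, "EIIP": 0.50}
combined = fuse(site_scores)
is_4mc = combined >= 0.5  # the 0.5 threshold is an assumption
print(round(combined, 3), is_4mc)  # 0.695 True
```

Because the weights sum to 1, the combined value stays in the same 0 to 1 range as the individual scores.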

16.
Plant Mol Biol ; 103(1-2): 225-234, 2020 May.
Article in English | MEDLINE | ID: mdl-32140819

ABSTRACT

DNA N6-methyladenine (6mA) is one of the most vital epigenetic modifications and is involved in controlling various gene expression levels. With the avalanche of DNA sequences generated in numerous databases, the accurate identification of 6mA plays an essential role in understanding molecular mechanisms. Because the experimental approaches are time-consuming and costly, it is desirable to develop a computational model for rapidly and accurately identifying 6mA. To the best of our knowledge, we are the first to propose a computational model, named i6mA-Fuse, to predict 6mA sites from the Rosaceae genomes, especially in Rosa chinensis and Fragaria vesca. We implemented five encoding schemes, i.e., mononucleotide binary, dinucleotide binary, k-space spectral nucleotide, k-mer, and electron-ion interaction pseudo potential compositions, to build five single-encoding random forest (RF) models. i6mA-Fuse uses a linear regression model to combine the predicted probability scores of the five single-encoding RF models. The resultant species-specific i6mA-Fuse achieved remarkably high performance, with AUCs of 0.982 and 0.978 and MCCs of 0.869 and 0.858 on the independent datasets of Rosa chinensis and Fragaria vesca, respectively. In the F. vesca-specific i6mA-Fuse, MBE and EIIP contributed 75% and 25% of the total prediction; in the R. chinensis-specific i6mA-Fuse, Kmer, MBE, and EIIP contributed 15%, 65%, and 20% of the total prediction. To assist high-throughput prediction for DNA 6mA identification, i6mA-Fuse is publicly accessible at https://kurata14.bio.kyutech.ac.jp/i6mA-Fuse/.


Subjects
Adenine/analogs & derivatives , Plant DNA/metabolism , Rosaceae/metabolism , Adenine/metabolism , Algorithms , Binding Sites , Computational Biology , Datasets as Topic , Machine Learning , Genetic Models , Rosaceae/genetics
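
Two of the encodings listed in this abstract, mononucleotide binary encoding and k-mer composition, can be sketched as follows. This is an illustrative version only: the one-hot bit order, the choice of k = 2, and the example sequence are assumptions, not details taken from the paper.

```python
from itertools import product

# One-hot code per base; the bit order A, C, G, T is an assumed convention.
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
           "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}


def mbe(seq):
    """Mononucleotide binary encoding: 4 bits per position."""
    out = []
    for base in seq:
        out.extend(ONE_HOT.get(base, (0, 0, 0, 0)))  # unknown base -> zeros
    return out


def kmer_composition(seq, k=2):
    """Frequency of every possible k-mer over the sliding windows of seq."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = len(seq) - k + 1
    for i in range(windows):
        sub = seq[i:i + k]
        if sub in counts:
            counts[sub] += 1
    return [counts[m] / windows if windows > 0 else 0.0 for m in kmers]


seq = "GATTACA"  # hypothetical short sequence around a candidate site
print(len(mbe(seq)), len(kmer_composition(seq)))  # 28 16
```

The binary encoding preserves position information, while the k-mer composition summarizes local sequence content regardless of position, which is one reason such encodings are often combined.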
17.
Comput Biol Chem ; 85: 107238, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32114285

ABSTRACT

Among the protein post-translational modifications (PTMs), ubiquitination is considered one of the most significant processes that can regulate cellular functions and various diseases. Identification of ubiquitination sites is important for understanding the mechanisms of ubiquitination-related biological processes. Both experimental and computational approaches are available for identifying ubiquitination sites based on protein sequences of different species. The experimental approaches are time-consuming, laborious and costly. In silico prediction is an alternative approach for identifying ubiquitination sites that is time-saving, easier and cost-effective. Moreover, the sequence patterns around the ubiquitination sites are not similar across species, which demands species-specific predictors. Therefore, in this study, we have proposed a novel computational method for identifying ubiquitination sites based on protein sequences of A. thaliana that is also robust against outlying observations. Through a comparative study of two encoding schemes and three classifiers, the random forest (RF) based predictor was selected as the best predictor under the CKSAAP encoding scheme with a 1:1 ratio of positive and negative samples (i.e. ubiquitinated and non-ubiquitinated) in the training dataset. The proposed predictor produced area under the ROC curve (AUC) scores of 0.91 and 0.86 for the 5-fold cross-validation test with the training dataset and for the independent test dataset of A. thaliana, respectively. The proposed RF based predictor also performed much better than the other existing ubiquitination site predictors for A. thaliana.


Subjects
Arabidopsis Proteins/metabolism , Arabidopsis/genetics , Computational Biology , Amino Acid Sequence , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , ROC Curve
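
A 1:1 ratio of positive and negative training samples is commonly obtained by randomly undersampling the larger class. The abstract does not state how the ratio was reached, so the sketch below is an assumption; the sample identifiers and the seed are hypothetical.

```python
import random


def undersample_1_to_1(positives, negatives, seed=0):
    """Randomly draw equal numbers of positive and negative samples."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    n = min(len(positives), len(negatives))
    return rng.sample(positives, n), rng.sample(negatives, n)


# Hypothetical identifiers for ubiquitinated and non-ubiquitinated windows.
positives = [f"pos_window_{i}" for i in range(5)]
negatives = [f"neg_window_{i}" for i in range(40)]

train_pos, train_neg = undersample_1_to_1(positives, negatives)
print(len(train_pos), len(train_neg))  # 5 5
```

Balancing the classes this way keeps a classifier from favoring the majority class, at the cost of discarding some negative samples.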
18.
Int J Biol Macromol ; 157: 752-758, 2020 Aug 15.
Article in English | MEDLINE | ID: mdl-31805335

ABSTRACT

One of the most important epigenetic modifications is N4-methylcytosine, which regulates many biological processes including DNA replication and chromosome stability. Identification of N4-methylcytosine sites is pivotal to understand specific biological functions. Herein, we developed the first bioinformatics tool called i4mC-ROSE for identifying N4-methylcytosine sites in the genomes of Fragaria vesca and Rosa chinensis in the Rosaceae, which utilizes a random forest classifier with six encoding methods that cover various aspects of DNA sequence information. The i4mC-ROSE predictor achieves area under the curve scores of 0.883 and 0.889 for the two genomes during cross-validation. Moreover, the i4mC-ROSE outperforms other classifiers tested in this study when objectively evaluated on the independent datasets. The proposed i4mC-ROSE tool can serve users' demand for the prediction of 4mC sites in the Rosaceae genome. The i4mC-ROSE predictor and utilized datasets are publicly accessible at http://kurata14.bio.kyutech.ac.jp/i4mC-ROSE/.


Subjects
Computational Biology/methods , Cytosine , DNA Methylation , Genetic Epigenesis , Epigenomics/methods , Plant Genome , Rosaceae/genetics , Algorithms , Cytosine/metabolism , Genetic Databases , Machine Learning , ROC Curve , Reproducibility of Results , Web Browser
19.
Mol Omics ; 15(6): 451-458, 2019 12 02.
Article in English | MEDLINE | ID: mdl-31710075

ABSTRACT

Cysteine S-nitrosylation is a type of reversible post-translational modification of proteins, which controls diverse biological processes. It is associated with redox-based cellular signaling to protect against oxidative stress. The identification of S-nitrosylation sites is an important step to reveal the function of proteins; however, experimental identification of S-nitrosylation is expensive and time-consuming work. Hence, sequence-based computational prediction of potential S-nitrosylation sites is highly sought before experimentation. Herein, a novel predictor PreSNO has been developed that integrates multiple encoding schemes by the support vector machine and random forest algorithms. The PreSNO achieved an accuracy and Matthews correlation coefficient value of 0.752 and 0.252 respectively in classifying between SNO and non-SNO sites when evaluated on the independent dataset, outperforming the existing methods. The web application of the PreSNO and its associated datasets are freely available at http://kurata14.bio.kyutech.ac.jp/PreSNO/.


Subjects
Computational Biology/methods , Cysteine/metabolism , Post-Translational Protein Processing , Software , Support Vector Machine , Algorithms
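
Accuracy and the Matthews correlation coefficient (MCC), the two metrics reported in this abstract, can diverge sharply when classes are imbalanced. The sketch below computes both from a confusion matrix; the counts are hypothetical and are not PreSNO's results.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 if undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced test set: 100 positive and 900 negative sites.
tp, fn, fp, tn = 30, 70, 150, 750

print(accuracy(tp, tn, fp, fn))  # 0.78
print(mcc(tp, tn, fp, fn))
```

With these counts, accuracy is dominated by the many true negatives, while MCC stays close to 0.1 because most positive sites are missed.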
20.
Sci Rep ; 9(1): 8258, 2019 06 04.
Article in English | MEDLINE | ID: mdl-31164681

ABSTRACT

Protein phosphorylation on serine (S) and threonine (T) has emerged as a key device in the control of many biological processes. Recently, phosphorylation in microbial organisms has attracted much attention for its critical roles in various cellular processes such as cell growth and cell division. Here a novel machine learning predictor, MPSite (Microbial Phosphorylation Site predictor), was developed to identify microbial phosphorylation sites using the enhanced characteristics of sequence features. The final feature vectors were optimized via a Wilcoxon rank-sum test. A random forest classifier was then trained using the optimum features to build the predictor. Benchmarking using 5-fold cross-validation and independent dataset tests showed that MPSite is able to achieve robust performance on S- and T-phosphorylation site prediction. It also outperformed other existing methods on the comprehensive independent datasets. We anticipate that MPSite is a powerful tool for proteome-wide prediction of microbial phosphorylation sites and facilitates hypothesis-driven functional interrogation of phosphorylation proteins. A web application with the curated datasets is freely available at http://kurata14.bio.kyutech.ac.jp/MPSite/ .


Subjects
Bacteria/genetics , Phosphorylation/genetics , Proteome/genetics , Software , Algorithms , Bacteria/metabolism , Computational Biology , Humans , Machine Learning , Post-Translational Protein Processing/genetics