Results 1 - 20 of 56
1.
Int J Biol Macromol ; 257(Pt 2): 128802, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38101670

ABSTRACT

Heat shock proteins (HSPs) are crucial cellular stress proteins that react to environmental cues, ensuring the preservation of cellular functions. They also play pivotal roles in orchestrating the immune response and participate in processes associated with cancer. Consequently, the classification of HSPs is important for understanding their biological functions and their roles in various diseases. However, computational methods for identifying and classifying HSPs still face challenges related to accuracy and interpretability. In this study, we introduced MulCNN-HSP, a novel deep learning model based on multi-scale convolutional neural networks, for identifying and classifying HSPs. Comparative results showed that MulCNN-HSP outperforms or matches existing models in the identification and classification of HSPs. Furthermore, MulCNN-HSP can extract and analyze essential features for the prediction task, enhancing its interpretability. To facilitate its accessibility, we have made MulCNN-HSP available at http://cbcb.cdutcm.edu.cn/HSP/. We hope that MulCNN-HSP will contribute to advancing the study of HSPs and their roles in various biological processes and diseases.


Subjects
Deep Learning , Neoplasms , Humans , Heat-Shock Proteins/metabolism , HSP70 Heat-Shock Proteins/metabolism
2.
Hortic Res ; 10(9): uhad139, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37671073

ABSTRACT

Polygala tenuifolia is a perennial medicinal plant that has been widely used in traditional Chinese medicine for treating mental diseases. However, the lack of genomic resources limits insight into its evolutionary and biological characterization. In the present work, we reported the P. tenuifolia genome, the first genome assembly of the Polygalaceae family. We sequenced and assembled this genome by a combination of Illumina, PacBio HiFi, and Hi-C mapping. The assembly includes 19 pseudochromosomes covering ~92.68% of the assembled genome (~769.62 Mb). There are 36 463 protein-coding genes annotated in this genome. Detailed comparative genome analysis revealed that P. tenuifolia experienced two rounds of whole genome duplication that occurred ~39-44 and ~18-20 million years ago, respectively. Accordingly, we systematically reconstructed ancestral chromosomes of P. tenuifolia and inferred its chromosome evolution trajectories from the common ancestor of core eudicots to the present species. Based on the transcriptomics data, enzyme genes and transcription factors involved in the synthesis of triterpenoid saponin in P. tenuifolia were identified. Further analysis demonstrated that whole-genome duplications and tandem duplications play critical roles in the expansion of P450 and UGT gene families, which contributed to the synthesis of triterpenoid saponins. The genome and transcriptome data will not only provide valuable resources for comparative and functional genomic research on Polygalaceae, but also shed light on the synthesis of triterpenoid saponin.

3.
Int J Biol Macromol ; 242(Pt 2): 124761, 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37156312

ABSTRACT

O-linked glycosylation is one of the most complex post-translational modifications (PTM) of human proteins, modulating various cellular metabolic and signaling pathways. Unlike N-glycosylation, O-glycosylation has non-specific sequence features and an unstable glycan core structure, which makes identification of O-glycosites more challenging by either experimental or computational methods. Biochemical experiments to identify O-glycosites in batches are technically and economically demanding. Therefore, development of computation-based methods is greatly warranted. This study constructed a prediction model based on feature fusion for O-glycosites linked to the threonine residues in Homo sapiens. To train the model, we collected and curated high-quality human protein data with O-linked threonine glycosites. Seven feature coding methods were fused to represent the sample sequence. By comparison of different algorithms, random forest was selected as the final classifier to construct the classification model. Through 5-fold cross-validation, the proposed model, namely O-GlyThr, performed satisfactorily on both the training set (AUC: 0.9308) and the independent validation dataset (AUC: 0.9323). Compared with previously published predictors, O-GlyThr achieved the highest ACC of 0.8475 on the independent test dataset. These results demonstrated the high competency of our predictor in identifying O-glycosites on threonine residues. Furthermore, a user-friendly webserver named O-GlyThr (http://cbcb.cdutcm.edu.cn/O-GlyThr/) was developed to assist glycobiologists in research associated with glycosylation structure and function.


Subjects
Protein Processing, Post-Translational , Threonine , Humans , Glycosylation , Algorithms , Computational Biology/methods
4.
Mol Ther Nucleic Acids ; 32: 28-35, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-36908648

ABSTRACT

The global pandemic of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has generated tremendous concern and poses a serious threat to international public health. Phosphorylation is a common post-translational modification affecting many essential cellular processes and is inextricably linked to SARS-CoV-2 infection. Hence, accurate identification of phosphorylation sites will be helpful to understand the mechanisms of SARS-CoV-2 infection and mitigate the ongoing COVID-19 pandemic. In the present study, an attention-based bidirectional gated recurrent unit network, called IPs-GRUAtt, was proposed to identify phosphorylation sites in SARS-CoV-2-infected host cells. Comparative results demonstrated that IPs-GRUAtt surpassed both state-of-the-art machine-learning methods and existing models for identifying phosphorylation sites. Moreover, the attention mechanism made IPs-GRUAtt able to extract the key features from protein sequences. These results demonstrated that the IPs-GRUAtt is a powerful tool for identifying phosphorylation sites. For facilitating its academic use, a freely available online web server for IPs-GRUAtt is provided at http://cbcb.cdutcm.edu.cn/phosphory/.

5.
Comput Struct Biotechnol J ; 20: 6244-6249, 2022.
Article in English | MEDLINE | ID: mdl-36420165

ABSTRACT

Dynamic RNA modifications are orchestrated by a series of enzymes, namely "writers", "readers" and "erasers", which can install, recognize and remove the modifications, respectively. However, only a very small number of experimentally validated RNA modification enzymes have been identified and reported. Therefore, there is an urgent need to develop a database to deposit RNA modification enzymes. In the present work, we developed the RNAME database (https://chenweilab.cn/rname/) to provide a comprehensive resource for RNA modification enzymes. The current version of RNAME deposits more than 21,000 manually curated RNA modification enzymes, which are from 456 species and cover the 7 common kinds of RNA modifications (i.e., adenosine to inosine, N1-methyladenosine, N6-methyladenosine, 5-methylcytidine, N7-methylguanosine, mRNA cap modification, and pseudouridine). The 3D structures, domains, subcellular locations, and biological functions of these enzymes were also integrated in RNAME. It is anticipated that RNAME will facilitate research on RNA modifications.

6.
Front Genet ; 13: 854531, 2022.
Article in English | MEDLINE | ID: mdl-35360870

ABSTRACT

Background: Prostate cancer (PCa) is an epithelial malignant tumor that occurs in the urinary system with high incidence and is the second most common cancer among men in the world. Thus, it is important to screen out potential key biomarkers for the pathogenesis and prognosis of PCa. The present study aimed to identify potential biomarkers to reveal the underlying molecular mechanisms. Methods: Differentially expressed genes (DEGs) between PCa tissues and matched normal tissues from The Cancer Genome Atlas Prostate Adenocarcinoma (TCGA-PRAD) dataset were screened out by R software. Weighted gene co-expression network analysis was performed primarily to identify statistically significant genes for clinical manifestations. Protein-protein interaction (PPI) network analysis and network screening were performed based on the STRING database in conjunction with Cytoscape software. Hub genes were then screened out by Cytoscape in conjunction with stepwise algorithm and multivariate Cox regression analysis to construct a risk model. Gene expression in different clinical manifestations and survival analysis correlated with the expression of hub genes were performed. Moreover, the protein expression of hub genes was validated by the Human Protein Atlas database. Results: A total of 1,621 DEGs (870 downregulated genes and 751 upregulated genes) were identified from the TCGA-PRAD dataset. Eight prognostic genes [BUB1, KIF2C, CCNA2, CDC20, CCNB2, PBK, RRM2, and CDC45] and four hub genes (BUB1, KIF2C, CDC20, and PBK) potentially correlated with the pathogenesis of PCa were identified. A prognostic model with good predictive power for survival was constructed and was validated by the dataset in GSE21032. The survival analysis demonstrated that the expression of RRM2 was statistically significant to the prognosis of PCa, indicating that RRM2 may potentially play an important role in the PCa progression. 
Conclusion: The present study implied that RRM2 was associated with prognosis and could be used as a potential therapeutic target for PCa clinical treatment.

7.
Methods ; 203: 28-31, 2022 07.
Article in English | MEDLINE | ID: mdl-33882361

ABSTRACT

The 5-methyluridine (m5U) modification plays important roles in a series of biological processes. Accurate identification of m5U sites will be helpful to decode its biological functions. Although experimental techniques have been proposed to detect m5U, they are still expensive and time consuming. In the present work, a support vector machine based method, called iRNA-m5U, was developed to identify the m5U sites in the Saccharomyces cerevisiae transcriptome. The performance of iRNA-m5U was validated based on different datasets. The accuracies obtained by iRNA-m5U are promising, indicating that it holds the potential to become a useful tool for the identification of m5U sites.


Subjects
Saccharomyces cerevisiae , Support Vector Machine , Computational Biology/methods , Saccharomyces cerevisiae/genetics , Transcriptome , Uridine/analogs & derivatives
8.
Int J Biol Macromol ; 167: 1575-1578, 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33212104

ABSTRACT

Small heat shock proteins (sHSPs) form a superfamily of molecular chaperones found from archaea to humans. Recent research has demonstrated that sHSPs participate in a series of biological processes and are even closely associated with serious diseases. Since sHSP is a very large superfamily and members from different subfamilies exhibit distinct functions, accurate classification of the subfamily of sHSP will be helpful for revealing its functions. In the present work, a support vector machine-based method was proposed to classify the subfamily of sHSPs. In the 10-fold cross validation test, an overall accuracy of 93.25% was obtained for classifying the subfamily of sHSPs. The superiority of the proposed method was also demonstrated by comparing it with the other methods. It is anticipated that the proposed method will become a useful tool for classifying the subfamily of sHSPs.


Subjects
Computational Biology/methods , Dipeptides/classification , Heat-Shock Proteins, Small/classification , Machine Learning , Amino Acid Sequence , Animals , Databases, Protein , Dipeptides/chemistry , Dipeptides/genetics , Heat-Shock Proteins, Small/chemistry , Heat-Shock Proteins, Small/genetics , Humans , Proteomics/methods , Sequence Alignment
9.
Curr Pharm Des ; 27(9): 1219-1229, 2021.
Article in English | MEDLINE | ID: mdl-33167827

ABSTRACT

BACKGROUND: N6-methyladenosine (m6A) plays critical roles in a broad range of biological processes. Knowledge about the precise location of m6A sites in the transcriptome is vital for deciphering its biological functions. Although experimental techniques have made substantial contributions to identifying m6A, they are still labor intensive and time consuming. As a complement to experimental methods, a series of computational approaches have been proposed in the past few years to identify m6A sites. METHODS: To help researchers select appropriate methods for identifying m6A sites, it is necessary to conduct a comprehensive review and comparison of existing methods. RESULTS: Since research on m6A in Saccharomyces cerevisiae is relatively clear, in this review, we summarized recent progress in the computational prediction of m6A sites in S. cerevisiae and assessed the performance of existing computational methods. Finally, future directions of computationally identifying m6A sites are presented. CONCLUSION: Taken together, we anticipate that this review will serve as an important guide for computational analysis of m6A modifications.


Subjects
Computational Biology , Saccharomyces cerevisiae , Adenosine/analogs & derivatives , Humans , Transcriptome
10.
Brief Bioinform ; 22(4), 2021 07 20.
Article in English | MEDLINE | ID: mdl-33169141

ABSTRACT

MOTIVATION: N7-methylguanosine (m7G) is an important epigenetic modification, playing an essential role in gene expression regulation. Therefore, accurate identification of m7G modifications will help reveal and understand in depth their potential functional mechanisms. Although high-throughput experimental methods are capable of precisely locating m7G sites, they are still cost ineffective. Therefore, it is necessary to develop new methods to identify m7G sites. RESULTS: In this work, by using the iterative feature representation algorithm, we developed a machine learning based method, namely m7G-IFL, to identify m7G sites. To demonstrate its superiority, m7G-IFL was evaluated and compared with existing predictors. The results demonstrate that our predictor outperforms existing predictors in terms of accuracy for identifying m7G sites. By analyzing and comparing the features used in the predictors, we found that the positive and negative samples in our feature space were more separated than in the existing feature space. This result demonstrates that our features extracted more discriminative information via the iterative feature learning process, and thus contributed to the predictive performance improvement.


Subjects
DNA Methylation , Epigenesis, Genetic , Guanosine/analogs & derivatives , Support Vector Machine , Guanosine/genetics , Guanosine/metabolism , HeLa Cells , Hep G2 Cells , Humans
11.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32608478

ABSTRACT

RNA modifications are involved in various kinds of cellular biological processes. Accumulating evidence has demonstrated that the functions of RNA modifications are determined by the effectors that can catalyze, recognize and remove RNA modifications. They are called 'writers', 'readers' and 'erasers'. The identification of RNA modification effectors will be helpful for understanding the regulatory mechanisms and biological functions of RNA modifications. In this work, we developed a database called RNAWRE that specially deposits RNA modification effectors. The current version of RNAWRE stores 2045 manually curated writers, readers and erasers for the six major kinds of RNA modifications, namely Cap, m1A, m6A, m5C, ψ and Poly A. The main modules of RNAWRE not only allow browsing and downloading the RNA modification effectors but also support the BLAST search of the potential RNA modification effectors in other species. We hope that RNAWRE will be helpful for research on RNA modifications. Database URL: http://rnawre.bio2db.com.


Subjects
Computational Biology/methods , RNA Processing, Post-Transcriptional , RNA , Software , Databases, Nucleic Acid , RNA Editing , User-Computer Interface
12.
Curr Drug Metab ; 21(10): 804-809, 2020.
Article in English | MEDLINE | ID: mdl-32682368

ABSTRACT

Antioxidants are molecules that can prevent damage to cells caused by free radicals. Recent studies also demonstrated that antioxidants play roles in preventing diseases. However, the number of known molecules with antioxidant activity is very small. Therefore, it is necessary to identify antioxidants from various resources. In the past several years, a series of computational methods have been proposed to identify antioxidants. In this review, we briefly summarized recent advances in computationally identifying antioxidants. The challenges and future perspectives for identifying antioxidants were also discussed. We hope this review will provide insights into research on antioxidant identification.


Subjects
Antioxidants , Machine Learning
13.
Int J Biol Macromol ; 162: 931-934, 2020 Nov 01.
Article in English | MEDLINE | ID: mdl-32599233

ABSTRACT

Pattern recognition receptors (PRRs) play crucial roles in the innate immune system, and are able to identify pathogen-associated molecular patterns and damage-associated molecular patterns. Accurate identification of PRRs is essential for understanding their functions. In the present work, a random forest based method was proposed to identify PRRs, in which the sequences were formulated by using the optimal features. In the 10-fold cross validation test, an accuracy of 80.95% was obtained in identifying PRRs. We hope that the proposed method will become a useful tool, or at least play a complementary role to the existing predictors for identifying PRRs.


Subjects
Receptors, Pattern Recognition/chemistry , Sequence Analysis, Protein , Amino Acid Sequence , Animals , Humans , Immunity, Innate , Receptors, Pattern Recognition/immunology
15.
Med Chem ; 16(5): 620-625, 2020.
Article in English | MEDLINE | ID: mdl-31339073

ABSTRACT

BACKGROUND: Tuberculosis is one of the biggest threats to human health. Recent studies have demonstrated that anti-tubercular peptides are promising candidates for the discovery of new anti-tubercular drugs. Since experimental methods are still labor intensive, it is highly desirable to develop automatic computational methods to identify anti-tubercular peptides from the huge number of natural and synthetic peptides. Hence, accurate and fast computational methods are greatly needed. METHODS AND RESULTS: In this study, a support vector machine based method was proposed to identify anti-tubercular peptides, in which the peptides were encoded by using the optimal g-gap dipeptide compositions. Comparative results demonstrated that our method outperforms existing methods on the same benchmark dataset. For the convenience of the scientific community, a freely accessible web-server was built, which is available at http://lin-group.cn/server/iATP. CONCLUSION: It is anticipated that the proposed method will become a useful tool for identifying anti-tubercular peptides.


Subjects
Antitubercular Agents/analysis , Computational Biology , Peptides/analysis , Support Vector Machine , Databases, Protein , Humans
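
Note: the abstract of entry 15 mentions encoding peptides by g-gap dipeptide composition. As a rough illustration only, the sketch below computes that encoding under its usual definition in the literature: the frequency of each ordered residue pair separated by g intervening positions. The function name, the 20-letter alphabet constant, and the normalization by the number of counted pairs are assumptions made for this example; the abstract does not give the authors' implementation, and their step of selecting "optimal" features is not reproduced here.

```python
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g):
    """Return 400 frequencies of residue pairs separated by g positions.

    g = 0 gives the ordinary (adjacent) dipeptide composition.
    Pairs containing non-standard residues are skipped.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    # The pair starting at index i ends at index i + g + 1.
    for i in range(len(sequence) - g - 1):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    # Normalize so the vector sums to 1 (all zeros if no pair was counted).
    return [counts[p] / total if total else 0.0 for p in pairs]
```

A vector like this (for one or several values of g) would typically be passed to a classifier such as a support vector machine; which values of g and which features were retained in the original work is not stated in the abstract.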
16.
Mol Ther Nucleic Acids ; 18: 269-274, 2019 Dec 06.
Article in English | MEDLINE | ID: mdl-31581051

ABSTRACT

As an essential post-transcriptional modification, N7-methylguanosine (m7G) regulates nearly every step of the life cycle of mRNA. Accurate identification of the m7G site in the transcriptome will provide insights into its biological functions and mechanisms. Although the m7G-methylated RNA immunoprecipitation sequencing (MeRIP-seq) method has been proposed in this regard, it is still cost-ineffective for detecting the m7G site. Therefore, it is urgent to develop new methods to identify the m7G site. In this work, we developed the first computational predictor called iRNA-m7G to identify m7G sites in the human transcriptome. The feature fusion strategy was used to integrate both sequence- and structure-based features. In the jackknife test, iRNA-m7G obtained an accuracy of 89.88%. The superiority of iRNA-m7G for identifying m7G sites was also demonstrated by comparing with other methods. We hope that iRNA-m7G can become a useful tool to identify m7G sites. A user-friendly web server for iRNA-m7G is freely accessible at http://lin-group.cn/server/iRNA-m7G/.

17.
Bioinformatics ; 35(23): 4922-4929, 2019 12 01.
Article in English | MEDLINE | ID: mdl-31077296

ABSTRACT

MOTIVATION: Dihydrouridine (D) is a common RNA post-transcriptional modification found in eukaryotes, bacteria and a few archaea. The modification can promote the conformational flexibility of individual nucleotide bases, and its levels are increased in cancerous tissues. Therefore, it is necessary to detect D in RNA to further understand its functional roles. Since wet-experimental techniques for this purpose are time-consuming and laborious, it is urgent to develop computational models to identify D modification sites in RNA. RESULTS: We constructed a predictor, called iRNAD, for identifying D modification sites in RNA sequences. In this predictor, the RNA samples derived from five species were encoded by nucleotide chemical property and nucleotide density. A support vector machine was used to perform the classification. The final model achieved an overall accuracy of 96.18% with an area under the receiver operating characteristic curve of 0.9839 in the jackknife cross-validation test. Furthermore, we performed a series of validations from several aspects and demonstrated the robustness and reliability of the proposed model. AVAILABILITY AND IMPLEMENTATION: A user-friendly web-server called iRNAD is freely accessible at http://lin-group.cn/server/iRNAD, which provides convenience and guidance to users for further studying D modification.


Subjects
Support Vector Machine , Base Sequence , Computational Biology , Nucleotides , RNA , Reproducibility of Results
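
Note: the abstract of entry 17 says RNA samples were encoded by nucleotide chemical property and nucleotide density. The sketch below shows one common formulation of that encoding from the RNA-modification prediction literature, offered as an assumption rather than as the authors' code: each base gets three binary flags (purine ring, amino group, weak hydrogen bonding) plus the running density of that base up to its position. The function name and the tuple layout are hypothetical.

```python
# Assumed chemical-property flags per base: (purine ring, amino group, weak H-bond).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode_ncp_nd(rna):
    """Encode an RNA string as a list of 4-tuples: 3 property flags + density.

    The density at position i (1-based) is the number of times the base at
    position i has occurred in rna[:i], divided by i.
    Raises KeyError for characters other than A, C, G, U.
    """
    seen = {}
    encoded = []
    for position, base in enumerate(rna, start=1):
        seen[base] = seen.get(base, 0) + 1
        density = seen[base] / position
        encoded.append(NCP[base] + (density,))
    return encoded
```

Flattening the returned list gives a fixed-length numeric vector for fixed-length sequence windows, which is the form a support vector machine expects.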
18.
Molecules ; 24(3), 2019 Jan 22.
Article in English | MEDLINE | ID: mdl-30678171

ABSTRACT

As an abundant post-transcriptional modification, dihydrouridine (D) has been found in transfer RNA (tRNA) from bacteria, eukaryotes, and archaea. Nonetheless, knowledge of the exact biochemical roles of dihydrouridine in mediating tRNA function is still limited. Accurate identification of the position of D sites is essential for understanding their functions. Therefore, it is desirable to develop novel methods to identify D sites. In this study, an ensemble classifier was proposed for the detection of D modification sites in the Saccharomyces cerevisiae transcriptome by using heterogeneous features. The jackknife test results demonstrate that the proposed predictor is promising for the identification of D modification sites. It is anticipated that the proposed method can be widely used for identifying D modification sites in tRNA.


Subjects
RNA, Transfer/chemistry , Saccharomyces cerevisiae/chemistry , Support Vector Machine , Uridine/chemistry , Algorithms , Chemical Phenomena , Nucleic Acid Conformation , Reproducibility of Results , Uridine/analogs & derivatives
19.
Genomics ; 111(1): 96-102, 2019 01.
Article in English | MEDLINE | ID: mdl-29360500

ABSTRACT

N6-methyladenine (6mA) is one kind of post-replication modification (PTM or PTRM) occurring in a wide range of DNA sequences. Accurate identification of its sites will be very helpful for revealing the biological functions of 6mA, but it is time-consuming and expensive to determine them by experiments alone. Unfortunately, so far, no bioinformatics tool is available to do so. To fill in such an empty area, we have proposed a novel predictor called iDNA6mA-PseKNC that is established by incorporating nucleotide physicochemical properties into Pseudo K-tuple Nucleotide Composition (PseKNC). It has been observed via rigorous cross-validations that the predictor's sensitivity (Sn), specificity (Sp), accuracy (Acc), and stability (MCC) are 93%, 100%, 96%, and 0.93, respectively. For the convenience of most experimental scientists, a user-friendly web server for iDNA6mA-PseKNC has been established at http://lin-group.cn/server/iDNA6mA-PseKNC, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved.


Subjects
Adenosine/analogs & derivatives , Computational Biology , Nucleotides/chemistry , Adenosine/analysis , Adenosine/chemistry , Algorithms , Animals , Base Sequence , DNA/chemistry , Data Accuracy , Databases, Genetic , Genome, Bacterial , Genome, Helminth , Genome, Plant , Sensitivity and Specificity , Software , Software Validation
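
Note: the abstract of entry 19 reports sensitivity (Sn), specificity (Sp), accuracy (Acc) and the Matthews correlation coefficient (MCC). For readers unfamiliar with these metrics, the sketch below computes them from a binary confusion matrix using their standard definitions. The function name and the example counts in the usage note are hypothetical and are not taken from any of the listed papers.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return Sn, Sp, Acc and MCC from confusion-matrix counts.

    Assumes at least one positive and one negative sample, so the
    Sn and Sp denominators are non-zero.
    """
    sn = tp / (tp + fn)                      # sensitivity (recall on positives)
    sp = tn / (tn + fp)                      # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}
```

For instance, with made-up counts `binary_metrics(tp=8, fn=2, tn=9, fp=1)` the function gives Sn = 0.8, Sp = 0.9 and Acc = 0.85. Unlike Acc, MCC uses all four cells of the confusion matrix, which makes it more informative on imbalanced datasets.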
20.
IEEE/ACM Trans Comput Biol Bioinform ; 16(4): 1309-1312, 2019.
Article in English | MEDLINE | ID: mdl-28212093

ABSTRACT

Antimicrobial peptides are crucial components of the innate host defense system of most living organisms and promising candidates for antimicrobial agents. Accurate classification of antimicrobial peptides will be helpful to the discovery of new therapeutic targets. In this work, the Increment of Diversity with Quadratic Discriminant analysis (IDQD) was presented to classify antifungal and antibacterial peptides based on primary sequence information. In the jackknife test, the proposed IDQD model yields an accuracy of 86.02 percent with a sensitivity of 74.31 percent and a specificity of 92.79 percent for identifying antimicrobial peptides, which is superior to other state-of-the-art methods. This result suggests that the proposed IDQD model can be efficiently used for antimicrobial peptide classification.


Subjects
Amino Acids/chemistry , Antimicrobial Cationic Peptides/chemistry , Dipeptides/chemistry , Discriminant Analysis , Drug Discovery/methods , Algorithms , Anti-Bacterial Agents/chemistry , Antifungal Agents/chemistry , Bacteria/drug effects , Computational Biology/methods , Machine Learning , Models, Molecular , Reproducibility of Results , Sensitivity and Specificity