1.
Int J Biol Macromol ; 265(Pt 1): 130659, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38462114

ABSTRACT

Understanding the subcellular localization of lncRNAs is crucial for comprehending their regulatory activities. Conventional detection of lncRNA subcellular location usually relies on in situ detection techniques, which are resource intensive. Some machine learning-based algorithms have been proposed for lncRNA subcellular location prediction in mammals. However, because lncRNA sequences are poorly conserved, the performance of cross-species models remains unsatisfactory. In this study, we curated a novel dataset containing subcellular location information of lncRNAs in Homo sapiens. Subsequently, based on the BERT pre-trained language model, we developed a model for lncRNA subcellular location prediction. Our model achieved a micro-average area under the receiver operating characteristic curve (AUROC) of 0.791 on the training set and an AUROC of 0.700 on the testing nucleus set. Additionally, we conducted cross-species validation and motif discovery to further investigate underlying patterns. In summary, our study provides valuable guidance and computational analysis tools for exploring the mechanisms of lncRNA subcellular localization and the dynamic spatial changes of RNA in abnormal physiological states.


Subject(s)
RNA, Long Noncoding ; Animals ; Humans ; RNA, Long Noncoding/genetics ; Algorithms ; Machine Learning ; Computational Biology/methods ; Mammals/genetics
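
Note on the metric in record 1: the abstract reports a micro-average AUROC but does not say how it was computed. The sketch below shows one common definition, assumed here: flatten the per-class label indicators and scores into a single binary problem, then compute the AUROC as the probability that a positive outranks a negative. The function names `auroc` and `micro_average_auroc` are illustrative and are not taken from the paper.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def micro_average_auroc(label_matrix, score_matrix):
    """Micro-average: pool every (sample, class) cell into one binary task."""
    flat_labels = [y for row in label_matrix for y in row]
    flat_scores = [s for row in score_matrix for s in row]
    return auroc(flat_labels, flat_scores)


# Two samples, two hypothetical locations (columns); rows are samples.
labels = [[1, 0], [0, 1]]
scores = [[0.3, 0.5], [0.1, 0.9]]
print(micro_average_auroc(labels, scores))
```

The pairwise loop is quadratic in the number of cells, which is fine for illustration; a rank-based computation would be preferable for large datasets.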
2.
IET Syst Biol ; 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38530028

ABSTRACT

Pancreatic ductal adenocarcinoma (PDAC) accounts for 95% of all pancreatic cancer cases, posing grave challenges to its diagnosis and treatment. Timely diagnosis is pivotal for improving patient survival, necessitating the discovery of precise biomarkers. An innovative approach was introduced to identify gene markers for precision PDAC detection. The core idea of the method is to discover gene pairs that display consistent opposite relative expression and differential co-expression patterns between PDAC and normal samples. Reversal gene pair analysis and differential partial correlation analysis were performed to determine reversal differential partial correlation (RDC) gene pairs. Using incremental feature selection, the authors refined the selected gene set and constructed a machine-learning model for PDAC recognition. As a result, the approach identified 10 RDC gene pairs, and the model achieved an accuracy of 96.1% during cross-validation, surpassing gene expression-based models. An experiment on independent validation data confirmed the model's performance. Enrichment analysis revealed the involvement of these genes in essential biological processes and shed light on their potential roles in PDAC pathogenesis. Overall, the findings highlight the potential of these 10 RDC gene pairs as effective diagnostic markers for early PDAC detection, bringing hope for improving patient prognosis and survival.

3.
Front Microbiol ; 14: 1170785, 2023.
Article in English | MEDLINE | ID: mdl-37125199

ABSTRACT

Promoters are genomic regions upstream of genes that are bound by RNA polymerase to initiate gene transcription. Because the promoter is the most critical element of gene expression, recognizing promoters is crucial for understanding the regulation of gene expression. This study aimed to develop a machine learning-based model to predict promoters in Agrobacterium tumefaciens (A. tumefaciens) strain C58. In the model, promoter sequences were encoded by three kinds of feature descriptors, namely accumulated nucleotide frequency, k-mer nucleotide composition, and binary encoding. The obtained features were optimized by using correlation and the mRMR-based algorithm. These optimized features were fed into a random forest (RF) classifier to discriminate promoter sequences from non-promoter sequences in A. tumefaciens strain C58. Ten-fold cross-validation showed that the proposed model yielded an overall accuracy of 0.837. This model will aid the study of promoters in A. tumefaciens strain C58.
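
Note on the descriptors in record 3: of the three encodings the abstract names, k-mer nucleotide composition is the simplest to illustrate. The sketch below is a minimal version, assuming overlapping windows and normalization by the number of valid windows; the abstract does not state the value of k or the normalization, and the function name `kmer_composition` is hypothetical.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Return the relative frequency of every possible DNA k-mer in seq.

    Assumption: overlapping windows; windows containing characters other
    than A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}


features = kmer_composition("ATGCGATATA", k=2)
print(len(features))  # 4**k entries, one per dinucleotide
```

A fixed-length vector of 4^k values per sequence is what makes this kind of descriptor usable as input to a classifier such as a random forest.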

4.
Comput Struct Biotechnol J ; 21: 2253-2261, 2023.
Article in English | MEDLINE | ID: mdl-37035551

ABSTRACT

Hormone binding proteins (HBPs) belong to the group of soluble carrier proteins. These proteins selectively and non-covalently interact with hormones and promote growth hormone signaling in humans and other animals. HBPs are useful in many medical and commercial fields, so identifying them is important because it can reveal more details about these proteins. However, experimental methods for recognizing HBPs are time-consuming and expensive. Computational prediction methods that use sequence information and machine learning (ML) algorithms have played significant roles in the correct recognition of HBPs. In this review, we compared and assessed the implementation of ML-based tools for recognizing HBPs. We hope that this study will provide useful knowledge for research on HBPs.

5.
Int J Biol Macromol ; 228: 706-714, 2023 Feb 15.
Article in English | MEDLINE | ID: mdl-36584777

ABSTRACT

CRISPR-Cas, as a tool for gene editing, has received extensive attention in recent years. Anti-CRISPR (Acr) proteins can inactivate the CRISPR-Cas defense system during the interference phase and can be used as a potential tool for regulating gene editing. In-depth study of Acr proteins is therefore of great significance for the implementation of gene editing. In this study, we developed a high-accuracy prediction model based on a two-step model fusion strategy, called AcrPred, which produced an AUC of 0.952 in independent dataset validation. To further validate the proposed model, we compared it with published tools and correctly identified 9 of 10 new Acr proteins, indicating the strong generalization ability of our model. Finally, for the convenience of wet-lab researchers, a user-friendly web server, AcrPred (Anti-CRISPR proteins Prediction), was established at http://lin-group.cn/server/AcrPred, with which users can easily identify potential Acr proteins.


Subject(s)
CRISPR-Cas Systems ; Gene Editing ; CRISPR-Cas Systems/genetics ; Algorithms ; Machine Learning ; Viral Proteins/genetics
6.
Methods ; 208: 42-47, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36341922

ABSTRACT

Adaptor proteins play a crucial role in regulating lymphocyte activation. Rapid and efficient identification of adaptor proteins is essential for understanding their functions. However, biochemical methods are expensive and require long experimental cycles and additional personnel. Therefore, a computational method that can accurately identify adaptor proteins is urgently needed. To address this issue, we developed a classifier that combines the support vector machine (SVM) with the composition of k-spaced amino acid pairs (CKSAAP) and the amino acid composition (AAC) to identify adaptor proteins. Analysis of variance (ANOVA) was used to select the optimized features that generate the maximum prediction performance. On independent data, the 447 optimized features achieved an accuracy of 92.39% with an AUC of 0.9766, demonstrating the capability of our model. We hope that the proposed model will provide more clues for studying adaptor proteins.


Subject(s)
Computational Biology ; Support Vector Machine ; Computational Biology/methods ; Amino Acids/metabolism ; Analysis of Variance
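
Note on the descriptors in record 6: the two encodings named in the abstract, AAC and CKSAAP, can be sketched as follows. This is a minimal illustration under stated assumptions: the 20 standard amino acids, gaps k from 0 to `k_max`, and normalization by the number of valid residue pairs at each gap. The abstract gives neither the gap range nor the normalization used, and the names `aac` and `cksaap` are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino acid composition: relative frequency of each standard residue."""
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def cksaap(seq, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    For each gap k, residues at positions i and i + k + 1 form a pair.
    Returns 400 frequencies per gap, concatenated in order of k.
    """
    pairs = [a + b for a, b in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        n_pairs = 0
        for i in range(len(seq) - k - 1):
            pair = seq[i] + seq[i + k + 1]
            if pair in counts:  # skip pairs with non-standard residues
                counts[pair] += 1
                n_pairs += 1
        features.extend(counts[p] / n_pairs if n_pairs else 0.0 for p in pairs)
    return features


vector = aac("MKTAYIAKQR") + cksaap("MKTAYIAKQR", k_max=2)
print(len(vector))  # 20 + 3 * 400
```

Concatenating the two vectors gives a fixed-length representation from which a feature-selection step can pick a subset.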
7.
Brief Bioinform ; 23(5), 2022 Sep 20.
Article in English | MEDLINE | ID: mdl-36070864

ABSTRACT

The location of microRNAs (miRNAs) in cells determines their function in regulatory activity. Studies have shown that miRNAs are stable in the extracellular environment that mediates cell-to-cell communication and are located in the intracellular region that responds to cellular stress and environmental stimuli. Although in situ detection techniques have made great contributions to the study of the localization and distribution of miRNAs, research on miRNA subcellular localization and its role is still in progress. Recently, some machine learning-based algorithms have been designed for miRNA subcellular location prediction, but their performance is still far from satisfactory. Here, we present a new data partitioning strategy that categorizes functionally similar locations for the precise and instructive prediction of miRNA subcellular location in Homo sapiens. To characterize the localization signals, we adopted one-hot encoding with post-padding to represent whole miRNA sequences and proposed a deep bidirectional long short-term memory network with multi-head self-attention to model them. The algorithm showed high selectivity in distinguishing extracellular miRNAs from intracellular miRNAs. Moreover, a series of motif analyses were performed to explore the mechanism of miRNA subcellular localization. To make the model convenient to use, a user-friendly web server named iLoc-miRNA was established (http://iLoc-miRNA.lin-group.cn/).


Subject(s)
Computational Biology ; MicroRNAs ; Algorithms ; Computational Biology/methods ; Humans ; Machine Learning ; MicroRNAs/genetics
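
Note on the encoding in record 7: one-hot encoding with post-padding can be sketched as below. The alphabet order (A, C, G, U), the all-zero padding rows, the truncation of over-long sequences, and the function name `one_hot_post_pad` are assumptions made for illustration; the abstract does not specify them.

```python
def one_hot_post_pad(seq, max_len):
    """One-hot encode an RNA sequence and pad with zero rows at the END.

    Returns a max_len x 4 list of lists with columns ordered A, C, G, U.
    DNA-style T is read as U; sequences longer than max_len are truncated.
    Unknown bases become all-zero rows.
    """
    alphabet = "ACGU"
    seq = seq.upper().replace("T", "U")[:max_len]
    rows = [[1 if base == letter else 0 for letter in alphabet] for base in seq]
    # Post-padding: append zero rows after the sequence, never before it.
    rows.extend([0, 0, 0, 0] for _ in range(max_len - len(rows)))
    return rows


for row in one_hot_post_pad("UGAG", max_len=6):
    print(row)
```

Padding at the end keeps the first nucleotide of every sequence in the same position, which is the usual reason for choosing post-padding with a fixed-size sequence model input.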
8.
Comput Struct Biotechnol J ; 20: 4942-4951, 2022.
Article in English | MEDLINE | ID: mdl-36147670

ABSTRACT

Ion binding proteins (IBPs) can selectively and non-covalently interact with ions. IBPs in phages also play an important role in biological processes. Therefore, accurate identification of IBPs is necessary for understanding their biological functions and the molecular mechanisms that involve binding to ions. Because experimental molecular biology methods for identifying IBPs remain labor-intensive and costly, it is helpful to develop computational methods that identify IBPs quickly and efficiently. In this work, a random forest (RF)-based model was constructed to quickly identify IBPs. Based on protein sequence information and the physicochemical properties of residues, the dipeptide composition combined with the physicochemical correlation between two residues was proposed for feature extraction. A feature selection technique, analysis of variance (ANOVA), was used to exclude redundant information. By comparing it with other classification methods, we demonstrated that our method can identify IBPs accurately. Based on the model, a Python package named IBPred was built; its source code can be accessed at https://github.com/ShishiYuan/IBPred.
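
Note on the feature selection in record 8: ANOVA-based selection typically ranks each feature by its one-way F statistic, the ratio of between-class to within-class variance. The sketch below computes that statistic for a single feature. It is an illustration of the general technique only, not the IBPred implementation, and the function name `anova_f_score` is hypothetical.

```python
def anova_f_score(values, labels):
    """One-way ANOVA F statistic of one feature across class labels.

    F = (between-group sum of squares / (k - 1))
        / (within-group sum of squares / (n - k))
    where k is the number of classes and n the number of samples.
    """
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more samples than classes")
    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}
    ss_between = sum(len(g) * (means[label] - grand_mean) ** 2
                     for label, g in groups.items())
    ss_within = sum((v - means[label]) ** 2
                    for label, g in groups.items() for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        return float("inf") if ms_between > 0 else 0.0
    return ms_between / ms_within


# A feature that separates the two classes well gets a high score.
print(anova_f_score([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
```

Features would then be sorted by descending F score and added to the model one at a time until performance stops improving.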

9.
Front Microbiol ; 13: 790063, 2022.
Article in English | MEDLINE | ID: mdl-35273581

ABSTRACT

Thermophilic proteins have important applications in biotechnology and industrial processes. The correct identification of thermophilic proteins provides important information for applying these proteins in engineering. Biochemistry-based identification of thermophilic proteins is laborious, time-consuming, and costly. Therefore, there is an urgent need for a fast and accurate method to identify thermophilic proteins. To address this need, we constructed a reliable benchmark dataset containing 1,368 thermophilic and 1,443 non-thermophilic proteins. A multi-layer perceptron (MLP) model based on a multi-feature fusion strategy was proposed to discriminate thermophilic proteins from non-thermophilic proteins. On an independent dataset, the proposed model achieved an accuracy of 96.26%, which demonstrates that the model has good application prospects. To make the model convenient to use, a user-friendly software package called iThermo was established and can be freely accessed at http://lin-group.cn/server/iThermo/index.html. The high accuracy of the model and the practicality of the software package indicate that this study can accelerate the discovery and engineering application of thermally stable proteins.

10.
Front Biosci (Landmark Ed) ; 27(3): 84, 2022 Mar 05.
Article in English | MEDLINE | ID: mdl-35345316

ABSTRACT

BACKGROUND: Lipocalins belong to the calycin family, and their sequence length is generally between 165 and 200 residues. They are mainly stable and multifunctional extracellular proteins. Lipocalins play an important role in several stress responses and allergic inflammations. Because the accurate identification of lipocalins could provide significant evidence for the study of their function, it is necessary to develop a machine learning-based model to recognize lipocalins. METHODS: In this study, we constructed a prediction model to identify lipocalins. Their sequences were encoded by six types of features, namely amino acid composition (AAC), composition of k-spaced amino acid pairs (CKSAAP), pseudo amino acid composition (PseAAC), Geary correlation (GD), normalized Moreau-Broto autocorrelation (NMBroto), and composition/transition/distribution (CTD). Subsequently, these features were optimized by using feature selection techniques. A classifier based on random forest (RF) was trained on the optimal features. RESULTS: The results of 10-fold cross-validation showed that our computational model classified lipocalins with an accuracy of 95.03% and an area under the curve of 0.987. On the independent dataset, our computational model produced an accuracy of 89.90%, which was 4.17% higher than that of the existing model. CONCLUSIONS: In this work, we developed an advanced computational model to discriminate lipocalin proteins from non-lipocalin proteins. In the proposed model, protein sequences were encoded by six descriptors. Then, feature selection was performed to pick out the best features, those producing the maximum accuracy. On the basis of the best feature subset, the RF-based classifier obtained the best prediction results.


Subject(s)
Artificial Intelligence ; Lipocalins ; Amino Acids ; Computational Biology ; Lipocalins/chemistry ; Machine Learning ; Proteins/chemistry
11.
Comput Math Methods Med ; 2022: 7493834, 2022.
Article in English | MEDLINE | ID: mdl-35069791

ABSTRACT

Helicobacter pylori (H. pylori) is the most common risk factor for gastric cancer worldwide. The membrane proteins of H. pylori are involved in bacterial adherence and play a vital role in drug discovery. Thus, an accurate and cost-effective computational model is needed to predict the uncharacterized membrane proteins of H. pylori. In this study, a reliable benchmark dataset consisting of 114 membrane and 219 nonmembrane proteins was constructed based on UniProt. A support vector machine (SVM)-based model was developed for discriminating H. pylori membrane proteins from nonmembrane proteins by using sequence information. Cross-validation showed that our method achieved good performance with an accuracy of 91.29%. It is anticipated that the proposed model will be useful for the annotation of H. pylori membrane proteins and the development of new anti-H. pylori agents.


Subject(s)
Bacterial Proteins/genetics ; Helicobacter pylori/genetics ; Membrane Proteins/genetics ; Amino Acid Sequence ; Amino Acids/analysis ; Bacterial Proteins/chemistry ; Computational Biology ; Databases, Protein/statistics & numerical data ; Helicobacter pylori/chemistry ; Helicobacter pylori/pathogenicity ; Host Microbial Interactions ; Humans ; Membrane Proteins/chemistry ; Support Vector Machine
12.
Curr Med Chem ; 29(5): 789-806, 2022.
Article in English | MEDLINE | ID: mdl-34514982

ABSTRACT

Protein-ligand interactions are necessary for the majority of protein functions. Adenosine-5'-triphosphate (ATP) is one such ligand that plays a vital role as a coenzyme in providing energy for cellular activities, catalyzing biological reactions, and signaling. Knowing the ATP binding residues of proteins is helpful for the annotation of protein function and for drug design. However, because of the huge influx of protein sequences into databases in the post-genome era, experimentally identifying ATP binding residues is cost-ineffective and time-consuming. To address this problem, computational methods have been developed to predict ATP binding residues. In this review, we briefly summarized the application of machine learning methods in detecting ATP binding residues of proteins. We expect this review will be helpful for further research.


Subject(s)
Computational Biology ; Proteins ; Adenosine Triphosphate/metabolism ; Amino Acid Sequence ; Binding Sites ; Computational Biology/methods ; Databases, Protein ; Humans ; Machine Learning ; Protein Binding ; Proteins/metabolism
13.
Transl Lung Cancer Res ; 10(5): 2172-2192, 2021 May.
Article in English | MEDLINE | ID: mdl-34164268

ABSTRACT

BACKGROUND: In recent years, immunotherapy has made great progress, and the regulatory role of epigenetics has been verified. However, the role of 5-methylcytosine (m5C) in the tumor microenvironment (TME) and immunotherapy response remains unclear. METHODS: Based on 11 m5C regulators, we evaluated the m5C modification patterns of 572 lung adenocarcinoma (LUAD) patients. The m5C score was constructed by principal component analysis (PCA) algorithms in order to quantify the m5C modification pattern of individual LUAD patients. RESULTS: Two m5C methylation modification patterns were identified according to 11 m5C regulators. The two patterns had a remarkably distinct TME immune cell infiltration characterization. Next, 226 differentially expressed genes (DEGs) related to the m5C phenotype were screened. Patients were divided into three different gene cluster subtypes based on these genes, which had different TME immune cell infiltration and prognosis characteristics. The m5C score was constructed to quantify the m5C modification pattern of individual LUAD patients. We found that the high m5C score group had a better prognosis. The role of the m5C score in predicting prognosis was also verified in the dataset GSE31210. CONCLUSIONS: Our study revealed that m5C modification played a significant role in TME regulation of LUAD. Investigation of the m5C regulation mode may have some implications for tumor immunotherapy in the future.

14.
Cancer Manag Res ; 13: 3229-3234, 2021.
Article in English | MEDLINE | ID: mdl-33880065

ABSTRACT

PURPOSE: Intensity-modulated radiotherapy (IMRT) can improve the prognosis of patients with esophageal cancer. This study aimed to evaluate clinical factors relevant to the prognosis of patients with esophageal cancer who received IMRT alone. PATIENTS AND METHODS: Data of 103 patients with pathologically confirmed esophageal cancer who were admitted to our hospital between October 2011 and November 2017 were retrospectively reviewed. All patients had squamous cell carcinoma and received IMRT. Patients with stage I-IVA tumors were included to represent real-world clinical practice. We performed univariate and multivariate analyses to identify prognostic factors for overall survival (OS) and progression-free survival (PFS). In univariate analyses, the Kaplan-Meier method was used to estimate OS and PFS for various subgroups. In multivariate analyses, hazard ratios were calculated. RESULTS: Single-factor analysis revealed that T stage (P = 0.019), N stage (P = 0.047), and lesion length (P = 0.000) were associated with the prognosis of esophageal cancer patients who received IMRT. Cox regression analysis revealed that T stage (odds ratio [OR] = 4.68; P < 0.05), N stage (OR = 0.28; P < 0.05), and lesion length (OR = 0.09; P < 0.05) were independent factors relevant to prognosis. CONCLUSION: T stage, N stage, and lesion length influenced the long-term curative effects of IMRT for esophageal cancer and were prognostic factors for patients with esophageal cancer receiving definitive radiotherapy alone. The higher the stage and the longer the tumor, the lower the survival rate.

15.
Comput Math Methods Med ; 2021: 6664362, 2021.
Article in English | MEDLINE | ID: mdl-33505515

ABSTRACT

Bioluminescent proteins (BLPs) are a class of proteins that are widely distributed in many living organisms and emit light through various mechanisms, including bioluminescence and chemiluminescence. Bioluminescence has been commonly used in various analytical research methods for cellular processes, such as gene expression analysis, drug discovery, cellular imaging, and toxicity determination. However, the identification of BLPs is challenging because they share poor sequence similarity. In this paper, we briefly reviewed the development of the computational identification of BLPs and subsequently proposed a novel framework for identifying BLPs based on the eXtreme gradient boosting algorithm (XGBoost) and sequence-derived features. To train the models, we collected BLP data from bacteria, eukaryotes, and archaea. Then, to obtain more effective prediction models, we examined the performance of different feature extraction methods and their combinations as well as classification algorithms. Finally, based on the optimal model, a novel predictor named iBLP was constructed to identify BLPs. The robustness of iBLP was demonstrated by experiments on training and independent datasets. Comparison with other published methods further demonstrated that the proposed method is powerful and provides good performance for BLP identification. The web server and software package for BLP identification are freely available at http://lin-group.cn/server/iBLP.


Subject(s)
Algorithms ; Luminescent Proteins ; Amino Acid Sequence ; Chemical Phenomena ; Computational Biology ; Databases, Protein ; Drug Discovery ; Luminescence ; Luminescent Proteins/chemistry ; Luminescent Proteins/genetics ; Luminescent Proteins/metabolism ; Machine Learning ; Software
16.
Brief Bioinform ; 22(1): 526-535, 2021 Jan 18.
Article in English | MEDLINE | ID: mdl-31994694

ABSTRACT

Messenger RNAs (mRNAs) shoulder the special responsibility of transmitting the genetic code from DNA to discrete locations in the cytoplasm. The localization of mRNA might provide spatial and temporal regulation of mRNA and protein functions. In situ hybridization and quantitative transcriptomics analysis can provide detailed information about mRNA subcellular localization; however, they are time-consuming and expensive. It is highly desirable to develop computational tools for predicting mRNA subcellular location quickly and effectively. In this work, by using the binomial distribution and one-way analysis of variance, the optimal nonamer composition was obtained to represent mRNA sequences. Subsequently, a predictor based on the support vector machine was developed to identify mRNA subcellular localization. In 5-fold cross-validation, the accuracy was 90.12% for Homo sapiens (H. sapiens). The predictor may provide a reference for the study of mRNA localization mechanisms and mRNA translocation strategies. An online web server based on our models was established and is available at http://lin-group.cn/server/iLoc-mRNA/.


Subject(s)
Computational Biology/methods ; RNA Transport ; RNA, Messenger/metabolism ; Humans ; RNA, Messenger/chemistry ; Sequence Analysis, RNA/methods ; Software
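
Note on the feature ranking in record 16: the abstract says the binomial distribution was used to choose nonamers but gives no formula. One formulation that appears in the k-mer feature-selection literature, assumed here, asks how likely it is that a k-mer would occur at least as often as observed in a class purely by chance. A small tail probability (high confidence level) marks the k-mer as class-specific. All names and the example counts below are hypothetical.

```python
from math import comb


def binomial_tail(n_ij, n_i, q_j):
    """P(X >= n_ij) for X ~ Binomial(n_i, q_j).

    n_ij: occurrences of k-mer i in class j
    n_i:  occurrences of k-mer i across all classes
    q_j:  fraction of all k-mer occurrences that fall in class j
    """
    return sum(comb(n_i, m) * q_j ** m * (1 - q_j) ** (n_i - m)
               for m in range(n_ij, n_i + 1))


def confidence_level(n_ij, n_i, q_j):
    """Higher values mean the k-mer is less likely to be in class j by chance."""
    return 1.0 - binomial_tail(n_ij, n_i, q_j)


# Hypothetical counts: a k-mer seen 30 times overall, 25 of them in a class
# that holds 40% of all k-mer occurrences.
print(confidence_level(25, 30, 0.4))
```

K-mers would then be ranked by confidence level and the top-ranked ones kept as the feature set.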
17.
Article in English | MEDLINE | ID: mdl-32292778

ABSTRACT

Hepatocellular carcinoma (HCC) is a serious cancer that ranks fourth in cancer-related deaths worldwide. Hence, more accurate diagnostic models are urgently needed to aid early HCC diagnosis in clinical scenarios and thus improve HCC treatment and survival. Several conventional methods have been used for discriminating HCC from cirrhosis tissues in patients without HCC (CwoHCC). However, the recognition success rates are still far from satisfactory. In this study, we applied a machine learning-based computational approach to a set of microarray data generated from 1091 HCC samples and 242 CwoHCC samples. The within-sample relative expression orderings (REOs) method was used to extract numerical descriptors from gene expression profile datasets. After removing irrelevant features by using minimum redundancy maximum relevance (mRMR) with incremental feature selection, we obtained an "11-gene-pair" signature that produced outstanding results. We further investigated the discriminative capability of the "11-gene-pair" signature for HCC recognition on several independent datasets. Strong results were obtained, demonstrating that the selected gene pairs can serve as a signature for HCC. The proposed computational model can discriminate HCC and adjacent non-cancerous tissues from CwoHCC even for minimal biopsy specimens and inaccurately sampled specimens, which makes it practical and effective for aiding early HCC diagnosis at the individual level.
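
Note on the descriptors in record 17: within-sample relative expression orderings turn one sample's expression profile into binary features, one per gene pair, recording only which gene of the pair is expressed more highly. The sketch below assumes a strict "greater than" comparison and uses hypothetical gene names; the abstract does not describe tie handling or the exact encoding.

```python
from itertools import combinations


def reo_features(expression, genes):
    """Encode one sample as within-sample relative expression orderings.

    expression: dict mapping gene name -> expression value in ONE sample
    genes:      ordered list of genes to pair up
    Returns a dict mapping (gene_a, gene_b) -> 1 if gene_a is expressed
    more highly than gene_b in this sample, else 0.
    """
    return {(a, b): int(expression[a] > expression[b])
            for a, b in combinations(genes, 2)}


# Hypothetical sample with three made-up genes.
sample = {"G1": 5.0, "G2": 2.0, "G3": 7.0}
print(reo_features(sample, ["G1", "G2", "G3"]))
```

Because each feature depends only on the ordering inside a single sample, the encoding is unaffected by any monotonic rescaling of that sample's measurements, which is the usual argument for using it on data from different batches or platforms.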

18.
Mol Ther Nucleic Acids ; 17: 337-346, 2019 Sep 06.
Article in English | MEDLINE | ID: mdl-31299595

ABSTRACT

A promoter is a fundamental DNA element located around the transcription start site (TSS) that can regulate gene transcription. Promoter recognition is of great significance in determining transcription units, studying gene structure, analyzing gene regulation mechanisms, and annotating gene functional information. Many models have already been proposed to predict promoters. However, the performance of these methods still needs to be improved. In this work, we combined pseudo k-tuple nucleotide composition (PseKNC) with a position-correlation scoring function (PCSF) to formulate promoter sequences of Homo sapiens (H. sapiens), Drosophila melanogaster (D. melanogaster), Caenorhabditis elegans (C. elegans), Bacillus subtilis (B. subtilis), and Escherichia coli (E. coli). The minimum redundancy maximum relevance (mRMR) algorithm and an incremental feature selection strategy were then adopted to find the optimal feature subsets. A support vector machine (SVM) was used to distinguish between promoters and non-promoters. In the 10-fold cross-validation test, accuracies of 93.3%, 93.9%, 95.7%, 95.2%, and 93.1% were obtained for H. sapiens, D. melanogaster, C. elegans, B. subtilis, and E. coli, with areas under the receiver operating characteristic curves (AUCs) of 0.974, 0.975, 0.981, 0.988, and 0.976, respectively. Comparative results demonstrated that our method outperforms existing methods for identifying promoters. An online web server was established that can be freely accessed (http://lin-group.cn/server/iProEP/).

19.
Bioinformatics ; 35(9): 1469-1477, 2019 May 01.
Article in English | MEDLINE | ID: mdl-30247625

ABSTRACT

MOTIVATION: Transcription termination is an important regulatory step of gene expression. Without a terminator in a gene, transcription cannot stop, which results in abnormal gene expression. Detecting terminators can help determine the operon structure in bacterial organisms and improve genome annotation. Thus, accurate identification of transcriptional terminators is essential in research on transcription regulation. RESULTS: In this study, we developed a new predictor called 'iTerm-PseKNC' based on the support vector machine to identify transcription terminators. The binomial distribution approach was used to pick out the optimal feature subset derived from pseudo k-tuple nucleotide composition (PseKNC). The 5-fold cross-validation test results showed that our proposed method achieved an accuracy of 95%. To further evaluate the generalization ability of 'iTerm-PseKNC', the model was examined on independent datasets of experimentally confirmed Rho-independent terminators in the Escherichia coli and Bacillus subtilis genomes. As a result, all the terminators in E. coli and 87.5% of the terminators in B. subtilis were correctly identified, suggesting that the proposed model could become a powerful tool for bacterial terminator recognition. AVAILABILITY AND IMPLEMENTATION: For the convenience of wet-lab researchers, the web server for 'iTerm-PseKNC' was established at http://lin-group.cn/server/iTerm-PseKNC/, with which users can easily obtain their desired results without needing to work through the detailed mathematical equations involved.


Subject(s)
Transcription, Genetic ; Bacillus subtilis ; Escherichia coli ; Nucleotides ; Operon ; Software
20.
Curr Gene Ther ; 18(5): 257-267, 2018.
Article in English | MEDLINE | ID: mdl-30209997

ABSTRACT

Lectins are proteins with at least one carbohydrate recognition domain that can identify and reversibly interact with the glycan moiety of glycoconjugates or with a soluble carbohydrate. Lectins have been shown to play vital roles in mediating signal transduction, cell-cell recognition and interaction, immune defense, and so on. Most organisms can synthesize and secrete lectins. A subset of lectins closely related to diverse cancers, called cancerlectins, is involved in tumor initiation, growth, and recrudescence. Cancerlectins have been investigated for their applications in laboratory study, clinical diagnosis and therapy, and drug delivery and targeting of cancers. The identification of cancerlectin genes among the many lectins is helpful for dissecting cancers. Several cancerlectin prediction tools based on machine learning approaches have been established and have become an excellent complement to experimental methods. In this review, we comprehensively summarize and expound the indispensable materials for implementing cancerlectin prediction models. We hope that this review will contribute to understanding cancerlectins and provide valuable clues for their study. Novel systems for cancerlectin gene identification are expected to be developed for clinical applications and gene therapy.


Subject(s)
Lectins/immunology ; Machine Learning ; Neoplasms/therapy ; Signal Transduction/immunology ; Genetic Therapy/methods ; Humans ; Lectins/genetics ; Lectins/metabolism ; Neoplasms/genetics ; Neoplasms/metabolism ; Signal Transduction/genetics ; Surveys and Questionnaires