Results 1 - 5 of 5
1.
Bioinformatics ; 38(10): 2705-2711, 2022 05 13.
Article in English | MEDLINE | ID: mdl-35561183

ABSTRACT

MOTIVATION: Protein structure can be severely disrupted by frameshift and nonsense mutations at specific positions in the protein sequence. Frameshift and nonsense mutations are also found in healthy individuals. A method to distinguish neutral from potentially disease-associated frameshift and nonsense mutations is of practical and fundamental importance. It would allow researchers to rapidly screen potentially pathogenic sites from a large number of mutated genes and then use these sites as drug targets to speed up diagnosis and improve access to treatment. The problem of how to distinguish between neutral and potentially disease-associated frameshift and nonsense mutations remains under-researched. RESULTS: We built a Transformer-based neural network model, named TransPPMP, to predict the pathogenicity of frameshift and nonsense mutations from protein features. The feature matrix of contextual sequences computed by the ESM pre-training model, the type of the mutated residue, and auxiliary features, including structure and function information, are combined as input features, and a focal loss function is used to address sample imbalance during training. In 10-fold cross-validation and on an independent blind test set, TransPPMP performed robustly and outperformed four other advanced methods, namely ENTPRISE-X, VEST-indel, DDIG-in and CADD, on all evaluation metrics. In addition, we demonstrate the usefulness of the multi-head attention mechanism in the Transformer for predicting the pathogenicity of mutations: not only can multiple self-attention heads learn local and global interactions, but functional sites with a large influence on the mutated residue can also be captured by attention focus. These could offer useful clues for studying the pathogenicity mechanisms of complex human diseases, for which traditional machine learning methods fall short.
AVAILABILITY AND IMPLEMENTATION: TransPPMP is available at https://github.com/lennylv/TransPPMP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
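
The abstract above mentions a focal loss for handling sample imbalance but does not give its form. As a rough illustration only, the following sketch assumes the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function names and the default values of `gamma` and `alpha` are assumptions for illustration, not taken from TransPPMP.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for a single example (hypothetical helper, not from TransPPMP).

    p: predicted probability of the positive class, in [0, 1]
    y: true label, 0 or 1
    gamma, alpha: assumed defaults; the abstract does not state them
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # alpha_t weights the positive class by alpha and the negative by 1 - alpha.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0 - eps)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for a single example, for comparison."""
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, eps), 1.0 - eps)
    return -math.log(p_t)
```

With `gamma=0` the modulating factor equals 1 and the loss reduces to cross-entropy scaled by `alpha_t`. Larger `gamma` values down-weight examples the model already classifies confidently, which is why this kind of loss is often chosen for imbalanced data.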


Subject(s)
Frameshift Mutation , Software , Humans , Mutation , Neural Networks, Computer
2.
Article in English | MEDLINE | ID: mdl-32976105

ABSTRACT

Deep learning has been successfully applied to surprisingly different domains. Researchers and practitioners are employing trained deep learning models to enrich our knowledge. Transcription factors (TFs) are essential for regulating gene expression in all organisms by binding to specific DNA sequences. Here, we designed a deep learning model named SemanticCS (Semantic ChIP-seq) to predict TF binding specificities. We trained our model on an ensemble of ChIP-seq datasets (Multi-TF-cell) to learn useful intermediate features across multiple TFs and cells. To interpret these feature vectors, visualization analysis was used. Our results indicate that these learned representations can be used to train shallow machines for other tasks. Using diverse experimental data and evaluation metrics, we show that SemanticCS outperforms other popular methods. In addition, using experimental data, SemanticCS can help to identify the substitutions that cause regulatory abnormalities and to evaluate the effect of substitutions on the binding affinity for the RXR transcription factor. The online server for SemanticCS is freely available at http://qianglab.scst.suda.edu.cn/semanticCS/.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Transcription Factors , Base Sequence , Binding Sites/genetics , Protein Binding , Transcription Factors/genetics , Transcription Factors/metabolism
3.
IEEE J Biomed Health Inform ; 25(7): 2811-2819, 2021 07.
Article in English | MEDLINE | ID: mdl-33571101

ABSTRACT

The coordinated expression of genes is primarily regulated by the interactions between transcription factors (TFs) and their DNA binding sites, which are an integral part of transcriptional regulatory networks. There are many computational tools focused on determining whether a TF binds to a DNA sequence. However, tools that further determine the relative preference of such binding are also needed. Here, we propose a deep learning regression model, called SemanticBI, to predict intensities of TF-DNA binding. SemanticBI is a convolutional neural network (CNN)-recurrent neural network (RNN) architecture model that was trained on an ensemble of protein binding microarray data sets that covered multiple TFs. Using this approach, SemanticBI exhibited superior accuracy in predicting binding intensities compared to other popular methods. Moreover, SemanticBI uncovered vectorized sequence-oriented features using its CNN-RNN architecture, which are abstract representations of the original DNA sequences. Additionally, the use of SemanticBI raises the question of whether motifs are necessary for computational models of TF binding. The online SemanticBI service can be accessed at http://qianglab.scst.suda.edu.cn/semantic/.
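
The abstract above does not say how DNA sequences are presented to the CNN-RNN model. As a hedged illustration, the sketch below shows one-hot encoding, a common input representation for sequence-based neural networks. Both the function name `one_hot_dna` and the assumption that SemanticBI uses this encoding are hypothetical.

```python
def one_hot_dna(seq):
    """Encode a DNA string as a list of 4-element rows in A, C, G, T order.

    Hypothetical helper for illustration; not taken from SemanticBI.
    Any character other than A, C, G or T (for example N) becomes all zeros.
    """
    table = {
        "A": (1, 0, 0, 0),
        "C": (0, 1, 0, 0),
        "G": (0, 0, 1, 0),
        "T": (0, 0, 0, 1),
    }
    unknown = (0, 0, 0, 0)
    # Build a fresh list per base so rows can be modified independently.
    return [list(table.get(base, unknown)) for base in seq.upper()]
```

Each sequence of length L becomes an L x 4 matrix, which a convolutional layer can scan along the sequence axis.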


Subject(s)
Algorithms , Computational Biology , Binding Sites , DNA/genetics , Humans , Protein Binding , Transcription Factors/genetics
4.
Oncotarget ; 8(38): 63382-63391, 2017 Sep 08.
Article in English | MEDLINE | ID: mdl-28968998

ABSTRACT

Gastric intestinal metaplasia (GIM) is a precancerous lesion of gastric carcinoma (GC) with pivotal roles in carcinogenesis. CD24, LGR5 and Ki67 are expressed in GIM; we previously demonstrated that aquaporin 3 (AQP3) is expressed in goblet cells and is positively correlated with GIM severity. However, the relationships of AQP3 with GIM classification and with other proteins, and their roles in the transition from GIM to GC, remain unknown. Sixteen patients with intestinal-type GC were enrolled in this study. GIM was determined according to the updated Sydney system; GIM classification was determined via HID-AB staining, and AQP3, CD24, LGR5 and Ki67 expression were determined by immunohistochemistry. Type III GIM was more prevalent around the GC and displayed a positive association with GIM severity. CD24 was found in GIM, but LGR5 and Ki67 were found in tissues regardless of GIM. AQP3 expression correlated significantly with type III GIM. CD24 expression was correlated with marked GIM and incomplete GIM, while LGR5 expression decreased with GIM aggravation and had no relationship with GIM classification. Ki67, however, showed no association with GIM grade or classification. These observations identify AQP3 and CD24 as biomarkers for carcinogenesis of GIM, and may provide a precise strategy for screening at-risk candidates with GIM.

5.
Oncotarget ; 8(14): 23817-23830, 2017 Apr 04.
Article in English | MEDLINE | ID: mdl-28423604

ABSTRACT

Alpha-fetoprotein-producing gastric cancer (AFPGC) accounts for 1.5%-7.1% of all gastric cancer cases. Compared with other types of gastric cancer, AFPGC is more aggressive and prone to liver and lymph node (LN) metastasis, with extremely poor prognosis. To improve understanding of AFPGC, we reviewed a consecutive series of 82 AFPGC patients and investigated the prognostic factors. The incidence of AFPGC among our gastric cancer patients was 1.95%, and 29.27% of AFPGCs were diagnosed with metastasis at the time of presentation, mainly liver metastasis. The serum AFP level of patients with AFPGC was significantly associated with tumor differentiation. Histologically, the AFPGC cases comprised 34.55% hepatoid type, 58.18% fetal gastrointestinal type, 9.09% yolk sac tumor-like type, and 14.55% mixed type. Patient gender, tumor differentiation, Lauren classification, and number of metastatic lymph nodes showed significant differences among these four subtypes. The overall survival time was 42.02 months and the 3-year cumulative survival rate was 53.13%. Age, American Joint Committee on Cancer (AJCC) TNM staging classification (TNM stage), serum AFP level, and surgery were prognostic factors for overall survival; however, TNM stage was the only independent risk factor for prognosis of AFPGC. In short, AFPGC is a rare, unique, and heterogeneous entity, and its proper identification and treatment remain a challenge. More attention should be paid to AFPGC to improve patient care and the dismal prognosis.


Subject(s)
Stomach Neoplasms/metabolism , alpha-Fetoproteins/biosynthesis , Adult , Aged , Female , Humans , Male , Middle Aged , Prognosis , Risk Factors , Stomach Neoplasms/mortality , Stomach Neoplasms/pathology , Survival Analysis , Young Adult