Results 1 - 20 of 458
1.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38348746

ABSTRACT

The prediction of molecular interactions is vital for drug discovery. Existing methods often focus on individual prediction tasks and overlook the relationships between them. Additionally, certain tasks encounter limitations due to insufficient data availability, resulting in limited performance. To overcome these limitations, we propose KGE-UNIT, a unified framework that combines knowledge graph embedding (KGE) and multi-task learning to predict drug-target interactions (DTIs) and drug-drug interactions (DDIs) simultaneously and to enhance the performance of each task, even when data availability is limited. Via KGE, we extract heterogeneous features from the drug knowledge graph to enhance the structural features of drug and protein nodes, thereby improving the quality of features. Additionally, employing multi-task learning, we introduce an innovative predictor that comprises a task-aware Convolutional Neural Network-based (CNN-based) encoder and a task-aware attention decoder, which can better fuse multimodal features, capture the contextual interactions of molecular tasks and enhance task awareness, leading to improved performance. Experiments on two imbalanced datasets for DTIs and DDIs demonstrate the superiority of KGE-UNIT, which achieves high areas under the receiver operating characteristic curve (AUROCs) (0.942, 0.987) and areas under the precision-recall curve (AUPRs) (0.930, 0.980) for DTIs and high AUROCs (0.975, 0.989) and AUPRs (0.966, 0.988) for DDIs. Notably, on the LUO dataset, where the data were more limited, KGE-UNIT exhibited a more pronounced improvement, with increases of 4.32% in AUROC and 3.56% in AUPR for DTIs and 6.56% in AUROC and 8.17% in AUPR for DDIs. The scalability of KGE-UNIT is demonstrated through its extension to protein-protein interaction prediction; ablation studies and case studies further validate its effectiveness.
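The AUROC values reported in this abstract measure how often a model scores a true interaction above a non-interaction. As a minimal sketch (not the authors' evaluation code), AUROC can be computed directly from that pairwise definition. The function name `auroc` and the toy labels and scores below are illustrative assumptions.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for an interacting pair, 0 for a non-interacting pair.
    scores: predicted interaction scores, higher means more likely.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two interacting pairs, three non-interacting pairs.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
toy_auroc = auroc(labels, scores)  # 5 of the 6 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; sorting-based implementations are the usual choice for large benchmarks.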


Subject(s)
Learning , Pattern Recognition, Automated , Drug Discovery , Area Under Curve , Neural Networks, Computer , Drug Interactions
2.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975895

ABSTRACT

Spatial transcriptomics provides valuable insights into gene expression within the native tissue context, effectively merging molecular data with spatial information to uncover intricate cellular relationships and tissue organizations. In this context, deciphering cellular spatial domains becomes essential for revealing complex cellular dynamics and tissue structures. However, current methods encounter challenges in seamlessly integrating gene expression data with spatial information, resulting in less informative representations of spots and suboptimal accuracy in spatial domain identification. We introduce stCluster, a novel method that integrates graph contrastive learning with multi-task learning to refine informative representations for spatial transcriptomic data, consequently improving spatial domain identification. stCluster first leverages graph contrastive learning technology to obtain discriminative representations capable of recognizing spatially coherent patterns. Through jointly optimizing multiple tasks, stCluster further fine-tunes the representations to be able to capture complex relationships between gene expression and spatial organization. Benchmarked against six state-of-the-art methods, the experimental results reveal its proficiency in accurately identifying complex spatial domains across various datasets and platforms, spanning tissue, organ, and embryo levels. Moreover, stCluster can effectively denoise the spatial gene expression patterns and enhance the spatial trajectory inference. The source code of stCluster is freely available at https://github.com/hannshu/stCluster.


Subject(s)
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods , Computational Biology/methods , Algorithms , Humans , Animals , Software , Machine Learning
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38300515

ABSTRACT

Accurate cell type annotation in single-cell RNA-sequencing data is essential for advancing biological and medical research, particularly in understanding disease progression and tumor microenvironments. However, existing methods are constrained by single feature extraction approaches, lack of adaptability to immune cell types with similar molecular profiles but distinct functions and a failure to account for the impact of cell label noise on model accuracy, all of which compromise the precision of annotation. To address these challenges, we developed a supervised approach called scMMT. We proposed a novel feature extraction technique to uncover more valuable information. Additionally, we constructed a multi-task learning framework based on the GradNorm method to enhance the recognition of challenging immune cells and reduce the impact of label noise by facilitating mutual reinforcement between cell type annotation and protein prediction tasks. Furthermore, we introduced logarithmic weighting and label smoothing mechanisms to enhance the recognition ability of rare cell types and prevent model overconfidence. Through comprehensive evaluations on multiple public datasets, scMMT has demonstrated state-of-the-art performance in various aspects including cell type annotation, rare cell identification, dropout and label noise resistance, protein expression prediction and low-dimensional embedding representation.
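The abstract names label smoothing and logarithmic weighting as safeguards against overconfidence and rare-class neglect. The sketch below shows one plausible form of each. The abstract does not give the exact formulas, so `log_class_weights` (weight = 1 + ln(total / count)) and all function names are assumptions made for illustration.

```python
import math


def smooth_labels(true_index, num_classes, epsilon=0.1):
    """Label smoothing: spread `epsilon` of the probability mass uniformly over all classes."""
    off = epsilon / num_classes
    return [1.0 - epsilon + off if k == true_index else off for k in range(num_classes)]


def log_class_weights(class_counts):
    """Hypothetical logarithmic weighting: rarer classes get larger, but only log-scaled, weights."""
    total = sum(class_counts)
    return [1.0 + math.log(total / count) for count in class_counts]


def weighted_smoothed_cross_entropy(probs, true_index, weights, epsilon=0.1):
    """Cross-entropy against the smoothed target, scaled by the weight of the true class."""
    target = smooth_labels(true_index, len(probs), epsilon)
    cross_entropy = -sum(t * math.log(p) for t, p in zip(target, probs))
    return weights[true_index] * cross_entropy


# Toy example: three cell types with 900, 90 and 10 training cells.
weights = log_class_weights([900, 90, 10])
loss_rare = weighted_smoothed_cross_entropy([0.1, 0.2, 0.7], true_index=2, weights=weights)
```

Because the weights grow with the logarithm of the class imbalance rather than linearly, a class that is 90 times rarer is up-weighted by a factor of about five here instead of 90.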


Subject(s)
Biomedical Research , Deep Learning , Humans , Molecular Sequence Annotation , Single-Cell Gene Expression Analysis , Disease Progression
4.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37539835

ABSTRACT

Enhancers are crucial cis-regulatory elements that control gene expression in a cell-type-specific manner. Despite extensive genetic and computational studies, accurately predicting enhancer activity in different cell types remains a challenge, and the grammar of enhancers is still poorly understood. Here, we present HEAP (high-resolution enhancer activity prediction), an explainable deep learning framework for predicting enhancers and exploring enhancer grammar. The framework includes three modules that use grammar-based reasoning for enhancer prediction. The algorithm can incorporate DNA sequences and epigenetic modifications to obtain better accuracy. We use a novel two-step multi-task learning method, task adaptive parameter sharing (TAPS), to efficiently predict enhancers in different cell types. We first train a shared model with all cell-type datasets. Then we adapt to specific tasks by adding several task-specific subset layers. Experiments demonstrate that HEAP outperforms published methods and showcases the effectiveness of the TAPS, especially for those with limited training samples. Notably, the explainable framework HEAP utilizes post-hoc interpretation to provide insights into the prediction mechanisms from three perspectives: data, model architecture and algorithm, leading to a better understanding of model decisions and enhancer grammar. To the best of our knowledge, HEAP will be a valuable tool for insight into the complex mechanisms of enhancer activity.


Subject(s)
Deep Learning , Enhancer Elements, Genetic , Algorithms , Base Sequence , Epigenesis, Genetic
5.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37287133

ABSTRACT

MicroRNAs (miRNAs) are a family of non-coding RNA molecules with vital roles in regulating gene expression. Although researchers have recognized the importance of miRNAs in the development of human diseases, it is very resource-consuming to use experimental methods for identifying which dysregulated miRNA is associated with a specific disease. To reduce the cost of human effort, a growing body of studies has leveraged computational methods for predicting the potential miRNA-disease associations. However, the extant computational methods usually ignore the crucial mediating role of genes and suffer from the data sparsity problem. To address this limitation, we introduce the multi-task learning technique and develop a new model called MTLMDA (Multi-Task Learning model for predicting potential MicroRNA-Disease Associations). Different from existing models that only learn from the miRNA-disease network, our MTLMDA model exploits both miRNA-disease and gene-disease networks for improving the identification of miRNA-disease associations. To evaluate model performance, we compare our model with competitive baselines on a real-world dataset of experimentally supported miRNA-disease associations. Empirical results show that our model performs best using various performance metrics. We also examine the effectiveness of model components via ablation study and further showcase the predictive power of our model for six types of common cancers. The data and source code are available from https://github.com/qwslle/MTLMDA.


Subject(s)
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Algorithms , Computational Biology/methods , Neoplasms/genetics , Software
6.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36932656

ABSTRACT

Post- and co-transcriptional RNA modifications are found to play various roles in regulating essential biological processes at all stages of RNA life. Precise identification of RNA modification sites is thus crucial for understanding the related molecular functions and specific regulatory circuitry. To date, a number of computational approaches have been developed for in silico identification of RNA modification sites; however, most of them require learning from base-resolution epitranscriptome datasets, which are generally scarce and available only for a limited number of experimental conditions, and predict only a single modification, even though there are multiple inter-related RNA modification types available. In this study, we proposed AdaptRM, a multi-task computational method for synergetic learning of multi-tissue, type and species RNA modifications from both high- and low-resolution epitranscriptome datasets. By taking advantage of adaptive pooling and multi-task learning, the newly proposed AdaptRM approach outperformed the state-of-the-art computational models (WeakRM and TS-m6A-DL) and two other deep-learning architectures based on Transformer and ConvMixer in three different case studies for both high-resolution and low-resolution prediction tasks, demonstrating its effectiveness and generalization ability. In addition, by interpreting the learned models, we unveiled for the first time the potential association between different tissues in terms of epitranscriptome sequence patterns. AdaptRM is available as a user-friendly web server from http://www.rnamd.org/AdaptRM together with all the codes and data used in this project.
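Adaptive pooling is what lets a single model accept both base-resolution sites and longer low-resolution regions: whatever the input length, the pooled output has a fixed size. The following is a minimal one-dimensional sketch using a common floor/ceil binning convention. It is not taken from the AdaptRM code, and the function name and toy inputs are assumptions.

```python
def adaptive_max_pool_1d(values, output_size):
    """Pool a variable-length sequence into exactly `output_size` bins, keeping the max of each bin."""
    n = len(values)
    if n == 0 or output_size <= 0:
        raise ValueError("need a non-empty sequence and a positive output size")
    pooled = []
    for i in range(output_size):
        start = (i * n) // output_size              # floor(i * n / output_size)
        end = -((-(i + 1) * n) // output_size)      # ceil((i + 1) * n / output_size)
        pooled.append(max(values[start:end]))
    return pooled


# Two regions of different length map to the same fixed-size representation.
short_region = adaptive_max_pool_1d([1, 5, 2, 8, 3, 7], 3)
long_region = adaptive_max_pool_1d([1, 5, 2, 8, 3, 7, 4], 3)
```

When the input length is not a multiple of the output size, neighbouring bins overlap by one position, which keeps every bin non-empty even for inputs shorter than the requested output.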


Subject(s)
Computational Biology , RNA , RNA/genetics , Methylation , Sequence Analysis, RNA/methods , Computational Biology/methods
7.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36562724

ABSTRACT

Drug combinations could trigger pharmacological therapeutic effects (TEs) and adverse effects (AEs). Many computational methods have been developed to predict TEs, e.g. the therapeutic synergy scores of anti-cancer drug combinations, or AEs from drug-drug interactions. However, most of the methods treated the AEs and TEs predictions as two separate tasks, ignoring the potential mechanistic commonalities shared between them. Based on previous clinical observations, we hypothesized that by learning the shared mechanistic commonalities between AEs and TEs, we could learn the underlying MoAs (mechanisms of actions) and ultimately improve the accuracy of TE predictions. To test our hypothesis, we formulated the TE prediction problem as a multi-task heterogeneous network learning problem that performed TE and AE learning tasks simultaneously. To solve this problem, we proposed Muthene (multi-task heterogeneous network embedding) and evaluated it on our collected drug-drug interaction dataset with both TEs and AEs indications. Our experimental results showed that, by including the AE prediction as an auxiliary task, Muthene generated more accurate TE predictions than standard single-task learning methods, which supports our hypothesis. Using a drug pair Vincristine-Dasatinib as a case study, we demonstrated that our method not only provides a novel way of TE predictions but also helps us gain a deeper understanding of the MoAs of drug combinations.
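Treating adverse-effect (AE) prediction as an auxiliary task usually means adding its loss, scaled by a weight, to the loss of the main therapeutic-effect (TE) task. The sketch below illustrates that objective only. The choice of mean squared error for TE scores, binary cross-entropy for AE labels, and the weight `aux_weight` are assumptions, not details taken from the Muthene paper.

```python
import math


def mean_squared_error(pred, target):
    """Regression loss for the main task (e.g. a synergy score)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Classification loss for the auxiliary task (e.g. presence of an adverse effect)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def multitask_loss(te_pred, te_true, ae_prob, ae_true, aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary-task loss."""
    return mean_squared_error(te_pred, te_true) + aux_weight * binary_cross_entropy(ae_prob, ae_true)


# Toy batch of two drug pairs (hypothetical values).
loss = multitask_loss([1.0, 2.0], [1.5, 2.5], [0.5, 0.5], [1, 0], aux_weight=0.5)
```

Setting `aux_weight` to zero recovers single-task training, which gives a direct baseline for checking whether the auxiliary task helps.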


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Humans , Drug Interactions , Drug Combinations , Machine Learning
8.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37903413

ABSTRACT

Accurate prediction of drug-target affinity (DTA) is of vital importance in early-stage drug discovery, facilitating the identification of drugs that can effectively interact with specific targets and regulate their activities. While wet experiments remain the most reliable method, they are time-consuming and resource-intensive, resulting in limited data availability that poses challenges for deep learning approaches. Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue. To overcome this challenge, we present the Semi-Supervised Multi-task training (SSM) framework for DTA prediction, which incorporates three simple yet highly effective strategies: (1) A multi-task training approach that combines DTA prediction with masked language modeling using paired drug-target data. (2) A semi-supervised training method that leverages large-scale unpaired molecules and proteins to enhance drug and target representations. This approach differs from previous methods that only employed molecules or proteins in pre-training. (3) The integration of a lightweight cross-attention module to improve the interaction between drugs and targets, further enhancing prediction accuracy. Through extensive experiments on benchmark datasets such as BindingDB, DAVIS and KIBA, we demonstrate the superior performance of our framework. Additionally, we conduct case studies on specific drug-target binding activities, virtual screening experiments, drug feature visualizations and real-world applications, all of which showcase the significant potential of our work. In conclusion, our proposed SSM-DTA framework addresses the data limitation challenge in DTA prediction and yields promising results, paving the way for more efficient and accurate drug discovery processes.
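Masked language modeling, the auxiliary objective in strategy (1), hides a fraction of input tokens and trains the model to recover them. A minimal sketch of the masking step follows. The per-character tokenization of a SMILES string, the `[MASK]` symbol and the masking rate are simplifying assumptions (a real tokenizer would keep multi-character atoms such as `Cl` together), and this is not the SSM-DTA implementation.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Randomly replace tokens with MASK and record the originals as prediction targets."""
    rng = random.Random(seed)
    masked, targets = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(token)   # the model must recover this token
        else:
            masked.append(token)
            targets.append(None)    # position ignored by the masked-LM loss
    return masked, targets


# Aspirin as a SMILES string, split per character for simplicity.
smiles_tokens = list("CC(=O)Oc1ccccc1C(=O)O")
masked_tokens, mlm_targets = mask_tokens(smiles_tokens, mask_prob=0.3, seed=1)
```

The same routine applies to protein sequences by passing a list of amino-acid letters, which is how unpaired molecules and proteins could both feed the semi-supervised objective.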


Subject(s)
Benchmarking , Drug Discovery , Drug Delivery Systems
9.
Methods ; 226: 71-77, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38641084

ABSTRACT

Biomedical Named Entity Recognition (BioNER) is one of the most basic tasks in biomedical text mining; it aims to automatically identify and classify biomedical entities in text. Recently, deep learning-based methods have been applied to BioNER and have shown encouraging results. However, many biological entities are polysemous and ambiguous, which is one of the main obstacles to the task. Deep learning methods also require large amounts of training data, so the lack of data further affects recognition performance. To address both polysemous words and insufficient data in BioNER, we propose a multi-task learning framework fused with a language model and based on the BiLSTM-CRF architecture. Our model uses a language model to design a differential encoding of the context, which yields dynamic word vectors that distinguish words in different datasets. Moreover, we use a multi-task learning method to share the dynamic word vectors across different types of entities, improving the recognition performance for each type of entity. Experimental results show that our model reduces the false positives caused by polysemous words through differentiated coding, and improves the performance of each subtask by sharing information between different entity data. Compared with other state-of-the-art methods, our model achieved superior results on four typical training sets and the best F1 values.
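F1 for named entity recognition is normally computed over whole entities rather than individual tokens: a prediction counts only if its boundaries and type both match. The sketch below assumes BIO tagging and exact-match scoring, which the abstract does not spell out. The tag names are illustrative.

```python
def bio_to_spans(tags):
    """Convert a list of BIO tags into a set of (start, end_exclusive, entity_type) spans."""
    spans, start, entity_type = set(), None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and entity_type == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, entity_type))
            start, entity_type = None, None
        if tag.startswith("B-"):
            start, entity_type = i, tag[2:]
    return spans


def entity_f1(gold_tags, pred_tags):
    """Exact-match entity-level F1 between gold and predicted tag sequences."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


gold = ["B-Gene", "I-Gene", "O", "B-Disease", "O"]
pred = ["B-Gene", "I-Gene", "O", "B-Gene", "O"]   # second entity found but mistyped
f1 = entity_f1(gold, pred)
```

In this toy case the mistyped entity is both a false positive and a false negative, which is the kind of error a polysemous term tends to produce.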


Subject(s)
Data Mining , Deep Learning , Data Mining/methods , Humans , Natural Language Processing , Neural Networks, Computer , Language
10.
Methods ; 222: 41-50, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38157919

ABSTRACT

Predicting the therapeutic effect of anti-cancer drugs on tumors based on the characteristics of tumors and patients is one of the central tasks of precision oncology. Existing computational methods treat drug response prediction as either a classification or a regression task. However, few of them leverage the relationship between the two tasks. In this work, we propose a Multi-task Interaction Graph Convolutional Network (MTIGCN) for anti-cancer drug response prediction. MTIGCN first uses a graph convolutional network-based model to produce embeddings for both cell lines and drugs. After that, the model employs multi-task learning to predict anti-cancer drug response, training on three tasks simultaneously: the main task of classifying a drug as sensitive or resistant, and two auxiliary tasks of regression prediction and similarity network reconstruction. By sharing parameters and optimizing the losses of different tasks simultaneously, MTIGCN enhances the feature representation and reduces overfitting. The results of the experiments on two in vitro datasets demonstrated that MTIGCN outperformed seven state-of-the-art baseline methods. Moreover, the model trained on the in vitro dataset GDSC exhibited good performance when applied to predict drug responses in the in vivo datasets PDX and TCGA. The case study confirmed the model's ability to discover unknown drug responses in cell lines.
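A graph convolutional layer produces node embeddings by averaging each node's features with those of its neighbours and then applying a learned linear map. The sketch below implements the widely used symmetric-normalization form, ReLU(D^-1/2 (A + I) D^-1/2 X W), in plain Python. Whether MTIGCN uses exactly this propagation rule is an assumption, and the two-node graph is a toy example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adjacency, features, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    normalized = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)] for i in range(n)]
    propagated = matmul(matmul(normalized, features), weight)
    return [[max(0.0, value) for value in row] for row in propagated]


# Toy graph: node 0 is a cell line, node 1 is a drug, linked by one known response.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0]]
identity_weight = [[1.0, 0.0], [0.0, 1.0]]
embeddings = gcn_layer(adjacency, features, identity_weight)
```

After one layer both nodes carry a blend of cell-line and drug features, which is the information a downstream sensitivity classifier or regression head would consume.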


Subject(s)
Antineoplastic Agents , Neoplasms , Humans , Neoplasms/drug therapy , Precision Medicine , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Medical Oncology , Cell Line
11.
J Comput Chem ; 45(23): 2001-2023, 2024 Sep 05.
Article in English | MEDLINE | ID: mdl-38713612

ABSTRACT

The proteins within the human epidermal growth factor receptor (EGFR) family, members of the tyrosine kinase receptor family, play a pivotal role in the molecular mechanisms driving the development of various tumors. Tyrosine kinase inhibitors, key compounds in targeted therapy, encounter challenges in cancer treatment due to emerging drug resistance mutations. Consequently, machine learning has undergone significant evolution to address the challenges of cancer drug discovery related to EGFR family proteins. However, the application of deep learning in this area is hindered by inherent difficulties associated with small-scale data, particularly the risk of overfitting. Moreover, the design of a model architecture that facilitates learning through multi-task and transfer learning, coupled with appropriate molecular representation, poses substantial challenges. In this study, we introduce GraphEGFR, a deep learning regression model designed to enhance molecular representation and model architecture for predicting the bioactivity of inhibitors against both wild-type and mutant EGFR family proteins. GraphEGFR integrates a graph attention mechanism for molecular graphs with deep and convolutional neural networks for molecular fingerprints. We observed that GraphEGFR models employing multi-task and transfer learning strategies generally achieve predictive performance comparable to existing competitive methods. The integration of molecular graphs and fingerprints adeptly captures relationships between atoms and enables both global and local pattern recognition. We further validated potential multi-targeted inhibitors for wild-type and mutant HER1 kinases, exploring key amino acid residues through molecular dynamics simulations to understand molecular interactions. This predictive model offers a robust strategy that could significantly contribute to overcoming the challenges of developing deep learning models for drug discovery with limited data and exploring new frontiers in multi-targeted kinase drug discovery for EGFR family proteins.


Subject(s)
Deep Learning , ErbB Receptors , Protein Kinase Inhibitors , ErbB Receptors/antagonists & inhibitors , ErbB Receptors/metabolism , ErbB Receptors/chemistry , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/chemistry , Humans , Machine Learning , Drug Discovery , Neural Networks, Computer
12.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36274238

ABSTRACT

More than one-third of the proteins in the Protein Data Bank contain metal ions. Correct identification of metal ion-binding residues is important for understanding protein functions and designing novel drugs. Due to the small size and high versatility of metal ions, it remains challenging to computationally predict their binding sites from protein sequence. Existing sequence-based methods have low accuracy because they lack structural information, and are time-consuming because they rely on multi-sequence alignment. Here, we propose LMetalSite, an alignment-free sequence-based predictor for binding sites of the four most frequently seen metal ions in BioLiP (Zn2+, Ca2+, Mg2+ and Mn2+). LMetalSite leverages a pretrained language model to rapidly generate informative sequence representations and employs a transformer to capture long-range dependencies. Multi-task learning is adopted to compensate for the scarcity of training data and capture the intrinsic similarities between different metal ions. LMetalSite was shown to surpass state-of-the-art structure-based methods by more than 19.7, 14.4, 36.8 and 12.6% in area under the precision-recall curve on the four independent tests, respectively. Further analyses indicated that the self-attention modules are effective at learning the structural contexts of residues from protein sequence. We provide the data sets, source codes and trained models of LMetalSite at https://github.com/biomed-AI/LMetalSite.
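Binding residues are a small minority of all residues, which is why the comparison above is reported as area under the precision-recall curve. A common step-wise estimate of that area is average precision: the mean of the precision values at each rank where a true binding residue is retrieved. The sketch below is illustrative only. The function name and toy data are assumptions, and the paper may integrate the curve differently.

```python
def average_precision(labels, scores):
    """Average precision: mean precision at the rank of each true positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank residues from the highest to the lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, index in enumerate(order, start=1):
        if labels[index] == 1:
            hits += 1
            total += hits / rank  # precision when this positive is recalled
    return total / n_pos


# Toy protein with five residues, three of which bind a metal ion (hypothetical).
residue_labels = [0, 1, 1, 0, 1]
residue_scores = [0.95, 0.90, 0.60, 0.50, 0.40]
ap = average_precision(residue_labels, residue_scores)
```

Unlike AUROC, this score drops sharply when a confident false positive is ranked first, as in the toy data, so it is more sensitive to errors on the rare positive class.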


Subject(s)
Language , Proteins , Protein Conformation , Protein Binding , Binding Sites , Proteins/chemistry , Metals/chemistry , Metals/metabolism , Ions/chemistry
13.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34643232

ABSTRACT

Cancer is thought to be caused by the accumulation of driver genetic mutations. Therefore, identifying cancer driver genes plays a crucial role in understanding the molecular mechanism of cancer and developing precision therapies and biomarkers. In this work, we propose a Multi-Task learning method, called MTGCN, based on the Graph Convolutional Network to identify cancer driver genes. First, we augment gene features by introducing their features on the protein-protein interaction (PPI) network. After that, the multi-task learning framework propagates and aggregates node and graph features from the input to the next layer to learn node embedding features, simultaneously optimizing the node prediction task and the link prediction task. Finally, we use a Bayesian task weight learner to balance the two tasks automatically. The outputs of MTGCN assign each gene a probability of being a cancer driver gene. Our method and four existing methods are applied to predict cancer drivers for pan-cancer and some single cancer types. The experimental results show that our model outperforms the state-of-the-art methods in terms of the area under the Receiver Operating Characteristic (ROC) curve and the area under the precision-recall curve. MTGCN is freely available via https://github.com/weiba/MTGCN.
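One well-known way to balance task losses automatically is to give each task a learnable log-variance s and minimize the sum of exp(-s) * L + s over tasks, so that noisier tasks are down-weighted. Whether MTGCN's Bayesian task weight learner takes exactly this form is an assumption. The sketch below only illustrates the idea, and the function names are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine task losses as sum(exp(-s_i) * L_i + s_i), with s_i a learnable log-variance."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_variances))


def optimal_log_variance(loss):
    """For a fixed loss L, d/ds [exp(-s) * L + s] = 0 gives s = log(L)."""
    return math.log(loss)


# Toy losses for the node prediction and link prediction tasks (hypothetical values).
losses = [2.0, 0.5]
equal_weighting = uncertainty_weighted_loss(losses, [0.0, 0.0])
learned = [optimal_log_variance(loss) for loss in losses]
balanced = uncertainty_weighted_loss(losses, learned)
```

With both log-variances at zero the objective reduces to a plain sum of the losses. The `+ s` term stops the optimizer from driving every weight to zero by inflating the variances.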


Subject(s)
Neoplasms , Protein Interaction Maps , Bayes Theorem , Humans , Learning , Neoplasms/genetics , Oncogenes
14.
BMC Neurosci ; 25(1): 27, 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38872076

ABSTRACT

Autism Spectrum Disorders (ASD) are neurodevelopmental disorders that cause difficulties in social interaction and communication. Identifying ASD patients from resting-state functional magnetic resonance imaging (rs-fMRI) data is a promising diagnostic approach, but it is challenging because of the complex and unclear etiology of autism. It is also difficult to identify ASD patients effectively from a single data source (single task). To address this challenge, we propose a novel multi-task learning framework for ASD identification based on rs-fMRI data, which leverages useful information from multiple related tasks to improve the generalization performance of the model. In addition, we adopt an attention mechanism to extract ASD-related features from each rs-fMRI dataset, which enhances the feature representation and interpretability of the model. The results show that our method outperforms state-of-the-art methods in terms of accuracy, sensitivity and specificity. This work provides a new perspective and solution for ASD identification based on rs-fMRI data using multi-task learning. It also demonstrates the potential and value of machine learning for advancing neuroscience research and clinical practice.


Subject(s)
Autism Spectrum Disorder , Brain , Magnetic Resonance Imaging , Neural Networks, Computer , Humans , Autism Spectrum Disorder/diagnostic imaging , Autism Spectrum Disorder/physiopathology , Autism Spectrum Disorder/diagnosis , Magnetic Resonance Imaging/methods , Brain/diagnostic imaging , Brain/physiopathology , Male , Female , Adult , Machine Learning , Young Adult , Child , Adolescent
15.
Biotechnol Bioeng ; 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38831695

ABSTRACT

Mammalian cells are commonly used as hosts in cell culture for biologics production in the pharmaceutical industry. Structured mechanistic models of metabolism have been used to capture complex cellular mechanisms that contribute to varying metabolic shifts in different cell lines. However, little research has focused on the impact of temporal changes in enzyme abundance and activity on the modeling of cell metabolism. In this work, we present a framework for constructing mechanistic models of metabolism that integrate growth-signaling control of enzyme activity and transcript dynamics. The proposed approach is applied to build models for three Chinese hamster ovary (CHO) cell lines using fed-batch culture data and time-series transcript profiles. Leveraging information from the transcriptome data, we develop a parameter estimation approach based on multi-cell-line (MCL) learning, which combines data sets from different cell lines and trains the individual cell-line models jointly to improve model accuracy. The computational results demonstrate the important role of growth signaling and transcript variability in metabolic models as well as the virtue of the MCL approach for constructing cell-line models with a limited amount of data. The resulting models exhibit a high level of accuracy in predicting distinct metabolic behaviors in the different cell lines; these models can potentially be used to accelerate the process and cell-line development for the biomanufacturing of new protein therapeutics.

16.
Int J Legal Med ; 138(4): 1741-1757, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38467754

ABSTRACT

Sex and chronological age estimation are crucial in forensic investigations and research on individual identification. Although manual methods for sex and age estimation have been proposed, these processes are labor-intensive, time-consuming, and error-prone. The purpose of this study was to estimate sex and chronological age from panoramic radiographs automatically and robustly using a multi-task deep learning network (ForensicNet). ForensicNet consists of a backbone and separate sex and age attention branches that learn anatomical context features of sex and chronological age from panoramic radiographs, enabling the multi-task estimation of sex and chronological age in an end-to-end manner. To mitigate bias in the data distribution, our dataset was built using 13,200 images, with 100 images for each sex at each year of age from 15 to 80 years. ForensicNet with EfficientNet-B3 exhibited superior estimation performance, with a mean absolute error of 2.93 ± 2.61 years and a coefficient of determination of 0.957 for chronological age, and achieved accuracy, specificity, and sensitivity values of 0.992, 0.993, and 0.990, respectively, for sex prediction. The network demonstrated that the proposed sex and age attention branches with a convolutional block attention module significantly improved the estimation performance for both sex and chronological age from panoramic radiographs of elderly patients. Consequently, we expect that ForensicNet will contribute to the automatic and accurate estimation of both sex and chronological age from panoramic radiographs.
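The evaluation in this abstract combines regression metrics for age (mean absolute error, coefficient of determination) with classification metrics for sex (accuracy, sensitivity, specificity). As a minimal sketch of how these quantities are defined, with hypothetical toy data and function names and no connection to the ForensicNet code:

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ages."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (recall on positives) and specificity (recall on negatives)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy predictions for four subjects (ages in years) and six subjects (sex coded 0/1).
age_mae = mean_absolute_error([20, 40, 60, 80], [22, 38, 63, 79])
age_r2 = r_squared([20, 40, 60, 80], [22, 38, 63, 79])
sex_scores = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

Which sex is coded as the positive class determines which of sensitivity and specificity refers to it, so the coding should be stated alongside the reported values.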


Subject(s)
Deep Learning , Radiography, Panoramic , Sex Determination by Skeleton , Humans , Male , Adult , Aged , Female , Adolescent , Middle Aged , Aged, 80 and over , Young Adult , Republic of Korea , Sex Determination by Skeleton/methods , Age Determination by Teeth/methods
17.
Methods ; 218: 48-56, 2023 10.
Article in English | MEDLINE | ID: mdl-37516260

ABSTRACT

Drug repurposing, which typically applies the procedure of drug-disease associations (DDAs) prediction, is a feasible solution to drug discovery. Compared with traditional methods, drug repurposing can reduce the cost and time for drug development and advance the success rate of drug discovery. Although many methods for drug repurposing have been proposed and the obtained results are relatively acceptable, there is still some room for improving the predictive performance, since those methods fail to consider fully the issue of sparseness in known drug-disease associations. In this paper, we propose a novel multi-task learning framework based on graph representation learning to identify DDAs for drug repurposing. In our proposed framework, a heterogeneous information network is first constructed by combining multiple biological datasets. Then, a module consisting of multiple layers of graph convolutional networks is utilized to learn low-dimensional representations of nodes in the constructed heterogeneous information network. Finally, two types of auxiliary tasks are designed to help to train the target task of DDAs prediction in the multi-task learning framework. Comprehensive experiments are conducted on real data and the results demonstrate the effectiveness of the proposed method for drug repurposing.


Subject(s)
Drug Development , Drug Repositioning , Drug Discovery
18.
J Biomed Inform ; 149: 104545, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37992791

ABSTRACT

Liver transplantation is a life-saving procedure for patients with end-stage liver disease. There are two main challenges in liver transplant: finding the best matching patient for a donor and ensuring transplant equity among different subpopulations. The current MELD scoring system evaluates a patient's mortality risk if not receiving an organ within 90 days. However, donor-patient matching should also consider post-transplant risk factors, such as cardiovascular disease and chronic rejection, which are common complications after transplant. Accurate prediction of these risk scores remains a significant challenge. In this study, we used predictive models to address the above challenges. Specifically, we proposed a deep learning model to predict multiple risk factors after a liver transplant. By formulating it as a multi-task learning problem, the proposed deep neural network was trained to simultaneously predict the five post-transplant risks and achieve equally good performance across them by exploiting task-balancing techniques. We also proposed a novel fairness-achieving algorithm to ensure prediction fairness across different subpopulations. We used electronic health records of 160,360 liver transplant patients, including demographic information, clinical variables, and laboratory values, collected from the liver transplant records of the United States from 1987 to 2018. The model's performance was evaluated using various performance metrics such as AUROC and AUPRC. Our experimental results highlighted the success of our multi-task model in achieving task balance while maintaining accuracy. The model significantly reduced the task discrepancy by 39%. Further application of the fairness-achieving algorithm substantially reduced fairness disparity among all sensitive attributes (gender, age group, and race/ethnicity) in each risk factor. These results underline the value of integrating fairness considerations into the task-balancing framework, ensuring robust and fair predictions across multiple tasks and diverse demographic groups.


Subject(s)
Deep Learning , Liver Transplantation , Humans , United States , Tissue Donors , Neural Networks, Computer , Risk Factors
19.
J Biomed Inform ; 150: 104599, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38272433

ABSTRACT

OBJECTIVE: Event extraction plays a crucial role in natural language processing. However, in the biomedical domain, nested events make event extraction more complex than it is for single events, and these events usually have strong semantic relationships and constraints. Previous approaches ignored the binding connections between these complex nested events. This study aims to develop a unified framework based on event constraint information that jointly extracts biomedical event triggers and arguments and enhances the performance of nested biomedical event extraction. MATERIAL AND METHODS: We propose a multi-task learning framework based on constraint information, called CMBEE, for the task of biomedical event extraction. The N-tuple form of event patterns is used to represent the constraint information, which is integrated into the role detection and event type classification tasks. The framework uses attention and gating mechanisms to fuse information from multiple tuples, and uses local and global constraint information fusion methods to further explore the connections between events. RESULTS: Experimental results demonstrate that our proposed method achieves the highest F1 score on the multilevel event extraction (MLEE) biomedical corpus and performs favorably on the biomedical natural language processing shared task 2013 Genia event corpus (GE 13). CONCLUSIONS: The experimental results indicate that modeling event patterns and constraints for multi-event extraction tasks is effective for complex biomedical event extraction. The fusion strategy proposed in this study, which incorporates different constraint information, helps to better express semantic information.


Subject(s)
Machine Learning , Natural Language Processing , Semantics , Data Mining/methods
20.
J Biomed Inform ; 151: 104603, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38331081

ABSTRACT

BACKGROUND: An adverse drug event (ADE) is any unfavorable effect that occurs due to the use of a drug. Extracting ADEs from unstructured clinical notes is essential to biomedical text extraction research because it helps with pharmacovigilance and patient medication studies. OBJECTIVE: From the considerable amount of clinical narrative text, natural language processing (NLP) researchers have developed methods for extracting ADEs and their related attributes. This work presents a systematic review of current methods. METHODOLOGY: Two biomedical databases, PubMed and Medline, were searched from June 2022 until December 2023 for publications relevant to this review. We also searched the multi-disciplinary databases IEEE Xplore, Scopus, ScienceDirect, and the ACL Anthology. In conducting this review, we followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement guidelines and recommendations for reporting systematic reviews. Initially, the searches across these databases returned 5,537 articles published between 2015 and 2023. Based on predefined inclusion and exclusion criteria for article selection, 100 publications underwent full-text review, of which we considered 82 for our analysis. RESULTS: We determined the general pattern for extracting ADEs from clinical notes, with named entity recognition (NER) and relation extraction (RE) being the dual tasks considered. Researchers who tackled both NER and RE simultaneously approached ADE extraction as a "pipeline extraction" problem (n = 22), as a "joint task extraction" problem (n = 7), or as a "multi-task learning" problem (n = 6), while others tackled only NER (n = 27) or RE (n = 20). We further grouped the reviewed publications by their approach to data extraction, namely rule-based (n = 8), machine learning (n = 11), deep learning (n = 32), comparison of two or more approaches (n = 11), hybrid (n = 12), and large language models (n = 8). The most used datasets are MADE 1.0, TAC 2017, and n2c2 2018. CONCLUSION: Extracting ADEs is crucial, especially for pharmacovigilance studies and patient medications. This survey showcases advances in ADE extraction research, approaches, datasets, and state-of-the-art performance on those datasets. Challenges and future research directions are highlighted. We hope this review will guide researchers in gaining background knowledge and developing more innovative ways to address the challenges.
