Results 1 - 20 of 109
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38678587

ABSTRACT

Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.


Subject(s)
Tumor Biomarkers , Deep Learning , Local Neoplasm Recurrence , Humans , Tumor Biomarkers/metabolism , Tumor Biomarkers/genetics , Local Neoplasm Recurrence/metabolism , Local Neoplasm Recurrence/genetics , Computational Biology/methods , Neoplasms/genetics , Neoplasms/metabolism , Neoplasms/pathology , Genomics/methods , Multiomics
2.
Article in English | MEDLINE | ID: mdl-38607720

ABSTRACT

CircRNA has been shown to be involved in the occurrence of many diseases. Several computational frameworks have been proposed to identify circRNA-disease associations. Although existing computational methods have achieved considerable success, they still need improvement, as their performance may degrade due to data sparsity and memory overflow. We develop a novel computational framework called LGCDA that predicts circRNA-disease associations by fusing local and global features to address these problems. First, we construct closed local subgraphs using k-hop closed subgraphs and label them to obtain rich graph pattern information. Then, local features are extracted using a graph neural network (GNN). In addition, we fuse the Gaussian interaction profile (GIP) kernel and cosine similarity to obtain global features. Finally, the score of circRNA-disease associations is predicted using a multilayer perceptron (MLP) based on the local and global features. We perform five-fold cross validation on five datasets for model evaluation, and our model surpasses other advanced methods. The code is available at https://github.com/lanbiolab/LGCDA.
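
The global-feature step in the abstract above combines a Gaussian interaction profile (GIP) kernel with cosine similarity. The following is a minimal sketch of that idea, not the LGCDA implementation: the toy association matrix, the bandwidth convention (gamma normalised by the mean squared profile norm) and the averaging rule in `fuse` are all assumptions made for illustration.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel similarity between the rows of a binary association matrix."""
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / len(profiles)
    # Assumed convention: bandwidth scaled by the mean squared norm of the profiles.
    gamma = gamma_prime / mean_sq if mean_sq > 0 else 0.0
    n = len(profiles)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]

def cosine_similarity(profiles):
    """Cosine similarity between rows; all-zero rows get similarity 0."""
    n = len(profiles)
    norms = [math.sqrt(sum(v * v for v in row)) for row in profiles]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] > 0 and norms[j] > 0:
                dot = sum(a * b for a, b in zip(profiles[i], profiles[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

def fuse(sim_a, sim_b):
    """Element-wise average of two similarity matrices (assumed fusion rule)."""
    return [[(x + y) / 2 for x, y in zip(ra, rb)] for ra, rb in zip(sim_a, sim_b)]

# Hypothetical data: rows are circRNAs, columns are diseases, 1 = known association.
associations = [[1, 0, 1], [1, 0, 0], [0, 1, 0]]
gip = gip_kernel(associations)
cos = cosine_similarity(associations)
fused = fuse(gip, cos)
```

Applying the same functions to the transposed matrix would give disease-disease similarities.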

3.
Article in English | MEDLINE | ID: mdl-38607719

ABSTRACT

By generating massive gene transcriptome data and analyzing transcriptomic variations at the cell level, single-cell RNA-sequencing (scRNA-seq) technology has provided a new way to explore cellular heterogeneity and functionality. Clustering scRNA-seq data can uncover the hidden diversity and complexity of cell populations, which can aid the identification of disease mechanisms and biomarkers. In this paper, a novel method (DSINMF) is presented for clustering single-cell RNA sequencing data using deep matrix factorization. Our proposed method comprises four steps: first, feature selection is used to remove irrelevant features. Then, dropout imputation is used to handle the missing value problem. Further, dimension reduction is employed to preserve data characteristics and reduce noise effects. Finally, deep matrix factorization with bi-stochastic graph regularization is used to obtain cluster results from scRNA-seq data. We compare DSINMF with other state-of-the-art algorithms on nine datasets, and the results show that our method outperforms the other methods.

4.
Methods ; 226: 89-101, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38642628

ABSTRACT

Obtaining an accurate segmentation of the pulmonary nodules in computed tomography (CT) images is challenging. This is due to: (1) the heterogeneous nature of the lung nodules; (2) comparable visual characteristics between the nodules and their surroundings. A robust multi-scale feature extraction mechanism that can effectively obtain multi-scale representations at a granular level can improve segmentation accuracy. As the most commonly used network in lung nodule segmentation, UNet, its variants, and other image segmentation methods lack this robust feature extraction mechanism. In this study, we propose a multi-stride residual 3D UNet (MRUNet-3D) to improve the segmentation accuracy of lung nodules in CT images. It incorporates a multi-slide Res2Net block (MSR), which replaces the simple sequence of convolution layers in each encoder stage to effectively extract multi-scale features at a granular level from different receptive fields and resolutions while conserving the strengths of 3D UNet. The proposed method has been extensively evaluated on the publicly available LUNA16 dataset. Experimental results show that it achieves competitive segmentation performance with an average dice similarity coefficient of 83.47 % and an average surface distance of 0.35 mm on the dataset. More notably, our method has proven to be robust to the heterogeneity of lung nodules. It has also proven to perform better at segmenting small lung nodules. Ablation studies have shown that the proposed MSR and RFIA modules are fundamental to improving the performance of the proposed model.
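
The Dice similarity coefficient reported above measures the overlap between a predicted and a reference mask. The sketch below shows the standard definition on flattened binary masks; the two toy masks are hypothetical and unrelated to LUNA16.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total

# Hypothetical flattened masks (1 = nodule voxel, 0 = background).
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
dice = dice_coefficient(predicted, reference)
```

Here two voxels overlap and each mask has three foreground voxels, so the score is 2 * 2 / 6, about 0.67.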


Subject(s)
Three-Dimensional Imaging , Lung Neoplasms , X-Ray Computed Tomography , Humans , X-Ray Computed Tomography/methods , Lung Neoplasms/diagnostic imaging , Three-Dimensional Imaging/methods , Solitary Pulmonary Nodule/diagnostic imaging , Algorithms , Computer-Assisted Radiographic Image Interpretation/methods , Lung/diagnostic imaging
5.
Article in English | MEDLINE | ID: mdl-37962997

ABSTRACT

Multivariate time-series anomaly detection is critically important in many applications, including retail, transportation, power grid, and water treatment plants. Existing approaches for this problem mostly employ either statistical models, which cannot capture nonlinear relations well, or conventional deep learning (DL) models, e.g., convolutional neural network (CNN) and long short-term memory (LSTM), that do not explicitly learn the pairwise correlations among variables. To overcome these limitations, we propose a novel method, correlation-aware spatial-temporal graph learning (termed CST-GL), for time-series anomaly detection. CST-GL explicitly captures the pairwise correlations via a correlation learning (MTCL) module based on which a spatial-temporal graph neural network (STGNN) can be developed. Then, by employing a graph convolution network (GCN) that exploits one- and multihop neighbor information, our STGNN component can encode rich spatial information from complex pairwise dependencies between variables. With a temporal module that consists of dilated convolutional functions, the STGNN can further capture long-range dependence over time. A novel anomaly scoring component is further integrated into CST-GL to estimate the degree of an anomaly in a purely unsupervised manner. Experimental results demonstrate that CST-GL can detect and diagnose anomalies effectively in general settings as well as enable early detection across different time delays. Our code is available at https://github.com/huankoh/CST-GL.
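
The temporal module described above relies on dilated convolutions to reach far back in time with few layers. The following sketch illustrates that mechanism only; it is not taken from the CST-GL code, and the kernel, dilation rates and input series are hypothetical.

```python
def dilated_conv1d(series, kernel, dilation):
    """Causal 1D convolution without padding: each tap looks `dilation` steps further back."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * series[t - k * dilation] for k, w in enumerate(kernel))
        for t in range(span, len(series))
    ]

def receptive_field(kernel_size, dilations):
    """Number of input time steps seen by one output of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Hypothetical univariate series and a two-tap summing kernel.
series = [1, 2, 3, 4, 5, 6, 7, 8]
out = dilated_conv1d(series, kernel=[1, 1], dilation=2)
field = receptive_field(kernel_size=2, dilations=[1, 2, 4])
```

With dilation 2, each output adds the current value to the value two steps earlier. Stacking three two-tap layers with dilations 1, 2 and 4 covers 8 time steps, whereas three undilated layers would cover only 4.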

6.
Comput Biol Med ; 164: 107274, 2023 09.
Article in English | MEDLINE | ID: mdl-37506451

ABSTRACT

Tumour heterogeneity is one of the critical confounding aspects in decoding tumour growth. Malignant cells display variations in their gene transcription profiles and mutation spectra even when originating from a single progenitor cell. Single-cell and spatial transcriptomics sequencing have recently emerged as key technologies for unravelling tumour heterogeneity. Single-cell sequencing promotes individual cell-type identification through transcriptome-wide gene expression measurements of each cell. Spatial transcriptomics facilitates identification of cell-cell interactions and the structural organization of heterogeneous cells within a tumour tissue through associating spatial RNA abundance of cells at distinct spots in the tissue section. However, extracting features and analyzing single-cell and spatial transcriptomics data poses challenges. Single-cell transcriptome data is extremely noisy and its sparse nature and dropouts can lead to misinterpretation of gene expression and the misclassification of cell types. Deep learning predictive power can overcome data challenges, provide high-resolution analysis and enhance precision oncology applications that involve early cancer prognosis, diagnosis, patient survival estimation and anti-cancer therapy planning. In this paper, we provide a background to and review of the recent progress of deep learning frameworks to investigate tumour heterogeneity using both single-cell and spatial transcriptomics data types.


Subject(s)
Deep Learning , Neoplasms , Humans , Transcriptome/genetics , Precision Medicine , Gene Expression Profiling
7.
Comput Biol Med ; 156: 106700, 2023 04.
Article in English | MEDLINE | ID: mdl-36871338

ABSTRACT

Accurate prediction of the trajectory of Alzheimer's disease (AD) from an early stage is of substantial value for treatment and planning to delay the onset of AD. We propose a novel attention transfer method to train a 3D convolutional neural network to predict which patients with mild cognitive impairment (MCI) will progress to AD within 3 years. A model is first trained on a separate but related source task (the task we are transferring information from) to automatically learn regions of interest (ROI) from a given image. Next, we train a model to simultaneously classify progressive MCI (pMCI) and stable MCI (sMCI) (the target task we want to solve) and the ROIs learned from the source task. The predicted ROIs are then used to focus the model's attention on certain areas of the brain when classifying pMCI versus sMCI. Thus, in contrast to traditional transfer learning, we transfer attention maps instead of transferring model weights from a source task to the target classification task. Our method outperformed all methods tested, including traditional transfer learning and methods that used expert knowledge to define ROI. Furthermore, the attention map transferred from the source task highlights known Alzheimer's pathology.


Subject(s)
Alzheimer Disease , Cognitive Dysfunction , Humans , Magnetic Resonance Imaging/methods , Computer Neural Networks , Brain/pathology , Attention
8.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36611256

ABSTRACT

Accumulating evidence demonstrates that circular RNA (circRNA) plays an important role in human diseases. Identification of circRNA-disease associations can aid the diagnosis of human diseases, but the traditional approach based on biological experiments is time-consuming. To address this limitation, a series of computational methods have been proposed in recent years. However, few works have summarized these methods or compared their performance. In this paper, we divide the existing methods into three categories: information propagation, traditional machine learning and deep learning. Then, the baseline methods in each category are introduced in detail. Further, 5 different datasets are collected, and 14 representative methods of each category are selected and compared in 5-fold and 10-fold cross-validation and in the de novo experiment. To further evaluate the effectiveness of these methods, six common cancers are selected to compare the number of correctly identified circRNA-disease associations in the top-10, top-20, top-50, top-100 and top-200 predictions. In addition, based on the results, observations about the robustness and characteristics of these methods are summarized. Finally, future directions and challenges are discussed.
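
The top-k comparison described above counts how many of the k highest-scoring candidates for a disease are confirmed associations. A minimal sketch of that count follows; the circRNA names, scores and confirmed set are hypothetical placeholders, not data from the benchmark.

```python
def top_k_hits(scores, confirmed, k):
    """Count confirmed associations among the k highest-scoring candidates."""
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return sum(1 for candidate in ranked if candidate in confirmed)

# Hypothetical predicted scores for one disease and a hypothetical confirmed set.
predicted_scores = {"circA": 0.9, "circB": 0.8, "circC": 0.4, "circD": 0.1}
confirmed_associations = {"circA", "circC"}
hits_top2 = top_k_hits(predicted_scores, confirmed_associations, k=2)
hits_top3 = top_k_hits(predicted_scores, confirmed_associations, k=3)
```

Repeating the count for k = 10, 20, 50, 100 and 200 for each method gives the kind of comparison the abstract describes.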


Subject(s)
Neoplasms , Circular RNA , Humans , Circular RNA/genetics , Benchmarking , Machine Learning , Neoplasms/genetics , Computational Biology/methods
9.
Artif Intell Med ; 136: 102475, 2023 02.
Article in English | MEDLINE | ID: mdl-36710063

ABSTRACT

The growing prevalence of neurological disorders, e.g., Autism Spectrum Disorder (ASD), demands robust computer-aided diagnosis (CAD) due to the diverse symptoms, which require early intervention, particularly in young children. The absence of benchmark neuroimaging diagnostics paves the way to study transitions in the brain's anatomical structure and neurological patterns associated with ASD. Existing CADs take advantage of the large-scale baseline dataset from the Autism Brain Imaging Data Exchange (ABIDE) repository to improve diagnostic performance, but the involvement of multisite data also amplifies the variabilities and heterogeneities that hinder satisfactory results. To resolve this problem, we propose a Deep Multimodal Neuroimaging Framework (DeepMNF) that employs Functional Magnetic Resonance Imaging (fMRI) and Structural Magnetic Resonance Imaging (sMRI) to integrate cross-modality spatiotemporal information by exploiting 2-dimensional time-series data along with 3-dimensional images. The purpose is to fuse complementary information that increases group differences and homogeneities. To the best of our knowledge, our DeepMNF achieves better validation performance than the best reported result on the ABIDE-1 repository involving datasets from all available screening sites. In this work, we also demonstrate the performance of the studied modalities in a single model as well as their possible combinations to develop the multimodal framework.


Subject(s)
Autism Spectrum Disorder , Autistic Disorder , Child , Humans , Preschool Child , Autism Spectrum Disorder/diagnostic imaging , Brain/diagnostic imaging , Magnetic Resonance Imaging/methods , Brain Mapping/methods
10.
PLoS One ; 17(10): e0276509, 2022.
Article in English | MEDLINE | ID: mdl-36288359

ABSTRACT

OBJECTIVE(S): To use machine learning (ML) to predict short-term requirements for invasive ventilation in patients with COVID-19 admitted to Australian intensive care units (ICUs). DESIGN: A machine learning study within a national ICU COVID-19 registry in Australia. PARTICIPANTS: Adult patients who were spontaneously breathing and admitted to participating ICUs with laboratory-confirmed COVID-19 from 20 February 2020 to 7 March 2021. Patients intubated on day one of their ICU admission were excluded. MAIN OUTCOME MEASURES: Six machine learning models predicted the requirement for invasive ventilation by day three of ICU admission from variables recorded on the first calendar day of ICU admission; (1) random forest classifier (RF), (2) decision tree classifier (DT), (3) logistic regression (LR), (4) K neighbours classifier (KNN), (5) support vector machine (SVM), and (6) gradient boosted machine (GBM). Cross-validation was used to assess the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity of machine learning models. RESULTS: 300 ICU admissions collected from 53 ICUs across Australia were included. The median [IQR] age of patients was 59 [50-69] years, 109 (36%) were female and 60 (20%) required invasive ventilation on day two or three. Random forest and Gradient boosted machine were the best performing algorithms, achieving mean (SD) AUCs of 0.69 (0.06) and 0.68 (0.07), and mean sensitivities of 77 (19%) and 81 (17%), respectively. CONCLUSION: Machine learning can be used to predict subsequent ventilation in patients with COVID-19 who were spontaneously breathing and admitted to Australian ICUs.
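
The outcome measures above (AUC, sensitivity and specificity) can be computed from a model's predicted scores on a held-out fold. The sketch below uses the rank-based definition of AUC and a fixed decision threshold; the labels, scores and the 0.5 threshold are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity at a fixed threshold; assumes both classes are present."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical held-out fold: 1 = required invasive ventilation, 0 = did not.
fold_labels = [0, 0, 1, 1]
fold_scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(fold_labels, fold_scores)
sensitivity, specificity = sensitivity_specificity(fold_labels, fold_scores)
```

In a cross-validated study these values would be computed once per fold and then summarised as a mean and standard deviation, which is the form in which the abstract reports them.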


Subject(s)
COVID-19 , Noninvasive Ventilation , Adult , Humans , Middle Aged , Aged , COVID-19/epidemiology , COVID-19/therapy , Critical Illness/therapy , Australia/epidemiology , Machine Learning
11.
PLoS One ; 17(5): e0267931, 2022.
Article in English | MEDLINE | ID: mdl-35507629

ABSTRACT

BACKGROUND: Predicting reduced health-related quality of life (HRQoL) after resection of a benign or low-grade brain tumour provides the opportunity for early intervention, and targeted expenditure of scarce supportive care resources. We aimed to develop, and evaluate the performance of, machine learning (ML) algorithms to predict HRQoL outcomes in this patient group. METHODS: Using a large prospective dataset of HRQoL outcomes in patients surgically treated for low grade glioma, acoustic neuroma and meningioma, we investigated the capability of ML to predict a) HRQoL-impacting symptoms persisting between 12 and 60 months from tumour resection and b) a decline in global HRQoL by more than the minimum clinically important difference below a normative population mean within 12 and 60 months after resection. Ten-fold cross-validation was used to measure the area under the receiver operating characteristic curve (AUC), area under the precision-recall curve (PR-AUC), sensitivity, and specificity of models. Six ML algorithms were explored per outcome: Random Forest Classifier, Decision Tree Classifier, Logistic Regression, K Neighbours Classifier, Support Vector Machine, and Gradient Boosting Machine. RESULTS: The final cohort included 262 patients. Outcome measures for which AUC>0.9 were Appetite loss, Constipation, Nausea and vomiting, Diarrhoea, Dyspnoea and Fatigue. AUC was between 0.8 and 0.9 for global HRQoL and Financial difficulty. Pain and Insomnia achieved AUCs below 0.8. PR-AUCs were similar overall to the AUC of each respective classifier. CONCLUSIONS: ML algorithms based on routine demographic and perioperative data show promise in their ability to predict HRQoL outcomes in patients with low grade and benign brain tumours between 12 and 60 months after surgery.


Subject(s)
Glioma , Meningeal Neoplasms , Meningioma , Acoustic Neuroma , Glioma/pathology , Glioma/surgery , Humans , Machine Learning , Meningeal Neoplasms/surgery , Meningioma/surgery , Acoustic Neuroma/surgery , Prospective Studies , Quality of Life , Retrospective Studies
12.
Int J Intell Syst ; 37(3): 2371-2392, 2022 Mar.
Article in English | MEDLINE | ID: mdl-37520859

ABSTRACT

Coronavirus disease 2019 (COVID-19) was declared a global pandemic by the World Health Organization in March 2020. Effective testing is crucial to slow the spread of the pandemic. Artificial intelligence and machine learning techniques can help detect COVID-19 using various clinical symptom data. While the deep learning (DL) approach, which requires centralized data, is susceptible to a high risk of data privacy breaches, the federated learning (FL) approach, which rests on decentralized data, can preserve data privacy, a critical factor in the health domain. This paper reviews recent advances in applying DL and FL techniques for COVID-19 detection, with a focus on the latter. A model use case of FL implementation in health systems, COVID-19 detection using chest X-ray image data sets, is studied. We have also reviewed applications of previously published FL experiments for COVID-19 research to demonstrate the applicability of FL in tackling health research issues. Last, several challenges in FL implementation in the healthcare domain are discussed in terms of potential future work.

13.
IEEE/ACM Trans Comput Biol Bioinform ; 19(3): 1715-1723, 2022.
Article in English | MEDLINE | ID: mdl-33125333

ABSTRACT

It has been shown that long noncoding RNA (lncRNA) plays critical roles in many human diseases. Therefore, inferring associations between lncRNAs and diseases can contribute to disease diagnosis, prognosis and treatment. To overcome the limitations of traditional experimental methods, which are expensive and time-consuming, several computational methods have been proposed to predict lncRNA-disease associations by fusing different biological data. However, the prediction performance of lncRNA-disease association identification needs to be improved. In this study, we propose a computational model (named LDICDL) to identify lncRNA-disease associations based on collaborative deep learning. It uses an autoencoder to denoise multiple lncRNA feature information and multiple disease feature information, respectively. Then, the matrix decomposition algorithm is employed to predict the potential lncRNA-disease associations. In addition, to overcome the limitation of matrix decomposition, a hybrid model is developed to predict associations between new lncRNAs (or diseases) and diseases (or lncRNAs). Ten-fold cross validation and a de novo test are applied to evaluate the performance of the method. The experimental results show that LDICDL outperforms other state-of-the-art methods in prediction performance.


Subject(s)
Computational Biology , Deep Learning , Long Noncoding RNA , Algorithms , Computational Biology/methods , Humans , Long Noncoding RNA/genetics
14.
IEEE/ACM Trans Comput Biol Bioinform ; 19(6): 3530-3538, 2022.
Article in English | MEDLINE | ID: mdl-34506289

ABSTRACT

Accumulating evidence has shown that circRNA plays an important role in human diseases. It can be used as a potential biomarker for the diagnosis and treatment of disease. Although some computational methods have been proposed to predict circRNA-disease associations, their performance still needs to be improved. In this paper, we propose a new computational model based on Improved Graph convolutional network and Negative Sampling to predict CircRNA-Disease Associations (IGNSCDA). Our method constructs a heterogeneous network based on known circRNA-disease associations. Then, an improved graph convolutional network is designed to obtain the feature vectors of circRNA and disease. Further, a multi-layer perceptron is employed to predict circRNA-disease associations based on the feature vectors of circRNA and disease. In addition, a negative sampling method is employed to reduce the effect of noisy samples; it selects negative samples based on circRNA expression profile similarity and Gaussian Interaction Profile kernel similarity. 5-fold cross validation is used to evaluate the performance of the method. The results show that IGNSCDA outperforms other state-of-the-art methods in prediction performance. Moreover, the case study shows that IGNSCDA is an effective tool for predicting potential circRNA-disease associations.


Subject(s)
Computer Neural Networks , Circular RNA , Humans , Circular RNA/genetics , Circular RNA/metabolism , Algorithms , Computational Biology/methods
15.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34864877

ABSTRACT

Increasing evidence shows that circRNA plays a significant role in the development of many diseases. In addition, many studies have shown that circRNA can be considered a potential biomarker for the clinical diagnosis and treatment of disease. Some computational methods have been proposed to predict circRNA-disease associations. However, the performance of these methods is limited by the sparsity of low-order interaction information. In this paper, we propose a new computational method (KGANCDA) to predict circRNA-disease associations based on a knowledge graph attention network. The circRNA-disease knowledge graphs are constructed by collecting multiple relationship data among circRNA, disease, miRNA and lncRNA. Then, the knowledge graph attention network is designed to obtain embeddings of each entity by distinguishing the importance of information from neighbors. Besides the low-order neighbor information, it can also capture high-order neighbor information from multisource associations, which alleviates the problem of data sparsity. Finally, a multilayer perceptron is applied to predict the affinity score of circRNA-disease associations based on the embeddings of circRNA and disease. The experimental results show that KGANCDA outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, the case study demonstrates that KGANCDA is an effective tool for predicting potential circRNA-disease associations.


Subject(s)
MicroRNAs , Circular RNA , Computational Biology/methods , MicroRNAs/genetics , Computer Neural Networks , Automated Pattern Recognition
17.
IEEE Trans Neural Netw Learn Syst ; 32(11): 4770-4780, 2021 11.
Article in English | MEDLINE | ID: mdl-34546931

ABSTRACT

The coronavirus disease 2019 (COVID-19) has continued to spread worldwide since late 2019. To expedite the process of providing treatment to those who have contracted the disease and to ensure the accessibility of effective drugs, numerous strategies have been implemented to find potential anti-COVID-19 drugs in a short span of time. Motivated by this critical global challenge, in this review, we detail approaches that have been used for drug repurposing for COVID-19 and suggest improvements to the existing deep learning (DL) approach to identify and repurpose drugs to treat this complex disease. By optimizing hyperparameter settings, deploying suitable activation functions, and designing optimization algorithms, the improved DL approach will be able to perform feature extraction from quality big data, turning the traditional DL approach, referred to as a "black box," which generalizes and learns the transmitted data, into a "glass box" that will have the interpretability of its rationale while maintaining a high level of prediction accuracy. When adopted for drug repurposing for COVID-19, this improved approach will create a new generation of DL approaches that can establish a cause and effect relationship as to why the repurposed drugs are suitable for treating COVID-19. Its ability can also be extended to repurpose drugs for other complex diseases, develop appropriate treatment strategies for new diseases, and provide precision medical treatment to patients, thus paving the way to discover new drugs that can potentially be effective for treating COVID-19.


Subject(s)
COVID-19 Drug Treatment , Deep Learning/trends , Drug Repositioning/methods , Drug Repositioning/trends , Computer Neural Networks , Antiviral Agents/administration & dosage , COVID-19/epidemiology , Drug Discovery/methods , Drug Discovery/trends , Humans
18.
Artif Intell Med ; 118: 102129, 2021 08.
Article in English | MEDLINE | ID: mdl-34412846

ABSTRACT

Leukocytes are key cellular elements of the innate immune system in all vertebrates and play a crucial role in defending organisms against invading pathogens. Tracking these highly migratory and amorphous cells in in vivo models such as zebrafish embryos is a challenging task in cellular immunology. As temporal and spatial analysis of these imaging datasets by a human operator is quite laborious, an automated cell tracking method is in high demand. Despite the remarkable advances in cell detection, this field still lacks powerful algorithms to accurately associate the detected cells across time frames. The cell association challenge is mostly related to the amorphous nature of cells and their complicated motion profile along their migratory paths. To tackle the cell association challenge, we propose a novel deep-learning-based object linkage method. To this end, we trained the 3D cell association learning network (3D-CALN) with a sufficient number of manually labelled paired 3D images of single fluorescent zebrafish neutrophils from two consecutive frames. Our experimental results show that deep learning is well suited to cell linkage, particularly for tracking highly mobile and amorphous leukocytes. A comparison of our tracking accuracy with other available tracking algorithms shows that our approach performs well in addressing cell tracking problems.


Subject(s)
Association Learning , Zebrafish , Algorithms , Animals , Humans , Leukocytes , Time-Lapse Imaging
19.
JMIR Med Inform ; 9(4): e25000, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33792549

ABSTRACT

BACKGROUND: Cardiovascular disease (CVD) is the greatest health problem in Australia, which kills more people than any other disease and incurs enormous costs for the health care system. In this study, we present a benchmark comparison of various artificial intelligence (AI) architectures for predicting the mortality rate of patients with CVD using structured medical claims data. Compared with other research in the clinical literature, our models are more efficient because we use a smaller number of features, and this study could help health professionals accurately choose AI models to predict mortality among patients with CVD using only claims data before a clinic visit. OBJECTIVE: This study aims to support health clinicians in accurately predicting mortality among patients with CVD using only claims data before a clinic visit. METHODS: The data set was obtained from the Medicare Benefits Scheme and Pharmaceutical Benefits Scheme service information in the period between 2004 and 2014, released by the Department of Health Australia in 2016. It included 346,201 records, corresponding to 346,201 patients. A total of five AI algorithms, including four classical machine learning algorithms (logistic regression [LR], random forest [RF], extra trees [ET], and gradient boosting trees [GBT]) and a deep learning algorithm, which is a densely connected neural network (DNN), were developed and compared in this study. In addition, because of the minority of deceased patients in the data set, a separate experiment using the Synthetic Minority Oversampling Technique (SMOTE) was conducted to enrich the data. RESULTS: Regarding model performance, in terms of discrimination, GBT and RF were the models with the highest area under the receiver operating characteristic curve (97.8% and 97.7%, respectively), followed by ET (96.8%) and LR (96.4%), whereas DNN was the least discriminative (95.3%). 
In terms of reliability, LR predictions were the least calibrated compared with the other four algorithms. In this study, despite increasing the training time, SMOTE was proven to further improve the model performance of LR, whereas other algorithms, especially GBT and DNN, worked well with class imbalanced data. CONCLUSIONS: Compared with other research in the clinical literature involving AI models using claims data to predict patient health outcomes, our models are more efficient because we use a smaller number of features but still achieve high performance. This study could help health professionals accurately choose AI models to predict mortality among patients with CVD using only claims data before a clinic visit.
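
The SMOTE experiment mentioned above enriches the minority class by interpolating between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that interpolation step, not the study's pipeline; the function name `smote_sketch`, the two-feature toy points and the parameter defaults are assumptions.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points on segments between minority samples and near neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic

# Hypothetical minority-class records with two numeric features each.
minority_samples = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]]
new_points = smote_sketch(minority_samples, n_new=4)
```

Every synthetic point lies on a segment between two real minority records, so it stays inside the region they span. Oversampling of this kind is normally applied to the training folds only, never to the evaluation data.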

20.
Comput Biol Med ; 133: 104361, 2021 06.
Article in English | MEDLINE | ID: mdl-33872968

ABSTRACT

It is a well-known fact that there are often side effects to the long-term use of certain medications. These side effects can vary from mild dizziness to, at its most serious, death. The main factors that cause these side effects are the chemical composition, the mode of treatment, and the dose. The dynamics that govern the reaction of a drug heavily depend on its structural composition. The structural composition of a drug is defined by the structural arrangement of the corresponding basic chemical functional groups. Hence, it is essential to investigate the effect of chemical functional groups on the side effects to synthesize drugs with minimal side effects. To support this process, we developed a framework named MedFused (Medical Functional Group Side Effects Database), which is composed of drugs (International Union of Pure and Applied Chemistry: IUPAC nomenclature), functional groups, and the side effects along with other valuable information such as STITCH (search tool for interactions of chemicals) compound ID, and the Unified Medical Language System (UMLS) concept ID. We develop a web framework that functions on the MedFused system database on top of the Django web framework. Our web server supports functionalities such as exploring the database and descriptive graph tools, which provide additional exploration capabilities to the framework. These descriptive tools include histograms, pie charts, and association charts, which further explore the system. Above these basic tools, MedFused includes functionality to discover the drug's "chemical functional group" impact on "side effects". The method conducts an association rule analysis on the relationships by considering the MedFused database as a collection of transactions. A specific transaction has a list of the functional groups of a drug and one side effect. Hence, a drug that has more than one side effect forms multiple transactions. 
Next, we generate a binary feature matrix based on the transactions and introduce a pruning mechanism to consider only the potential functional groups and side effects based on their support (frequencies), subjected to a predefined threshold (which can be changed accordingly). As the current version of the MedFused database has a limited number of side effects (hence low support), we restricted the analysis to identify the functional groups which have the most potential of causing a particular side effect, based on a confidence value of 1. Our framework can be further extended with more functions and tools as it supports the model view controller (MVC) architecture, which is inherited from the Django Python web framework.
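
The association rule analysis described above treats each (functional groups, side effect) pair as a transaction, prunes by support, and keeps rules with a confidence of 1. A minimal sketch of that calculation follows, assuming single functional groups as rule antecedents; the function `mine_rules`, the toy transactions and the support threshold are hypothetical and not taken from MedFused.

```python
from collections import Counter

def mine_rules(transactions, min_support=0.2, min_confidence=1.0):
    """Return (group, effect, support, confidence) for rules 'group -> effect' passing both thresholds."""
    n = len(transactions)
    group_counts = Counter()
    pair_counts = Counter()
    for groups, effect in transactions:
        for group in set(groups):
            group_counts[group] += 1
            pair_counts[(group, effect)] += 1
    rules = []
    for (group, effect), count in pair_counts.items():
        support = count / n                    # share of all transactions with both items
        confidence = count / group_counts[group]  # share of the group's transactions with this effect
        if support >= min_support and confidence >= min_confidence:
            rules.append((group, effect, support, confidence))
    return sorted(rules)

# Hypothetical transactions: one per (drug, side effect) pair.
transactions = [
    ({"hydroxyl", "amine"}, "nausea"),
    ({"hydroxyl"}, "nausea"),
    ({"amine", "carboxyl"}, "dizziness"),
    ({"carboxyl"}, "dizziness"),
    ({"amine"}, "nausea"),
]
rules = mine_rules(transactions)
```

In this toy data `amine` appears with both side effects, so its confidence is below 1 and it is dropped, while `hydroxyl` and `carboxyl` each map to a single side effect and are kept.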


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Pharmaceutical Preparations , Factual Databases , Humans , Unified Medical Language System
...