Results 1 - 11 of 11
1.
J Biomed Inform; 144: 104440, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37429511

ABSTRACT

The imputation of missing values in multivariate time series (MTS) data is critical in ensuring data quality and producing reliable data-driven predictive models. Apart from many statistical approaches, a few recent studies have proposed state-of-the-art deep learning methods to impute missing values in MTS data. However, the evaluation of these deep methods is limited to one or two data sets, low missing rates, and completely random missing value types. This survey performs six data-centric experiments to benchmark state-of-the-art deep imputation methods on five time series health data sets. Our extensive analysis reveals that no single imputation method outperforms the others on all five data sets. The imputation performance depends on data types, individual variable statistics, missing value rates, and types. Deep learning methods that jointly perform cross-sectional (across variables) and longitudinal (across time) imputations of missing values in time series data yield statistically better data quality than traditional imputation methods. Although computationally expensive, deep learning methods are practical given the current availability of high-performance computing resources, especially when data quality and sample size are of paramount importance in healthcare informatics. Our findings highlight the importance of data-centric selection of imputation methods to optimize data-driven predictive models.
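The masking-and-scoring protocol behind such data-centric benchmarks can be sketched in a few lines. The following is an illustrative Python example, not the paper's code: it masks entries completely at random, imputes with a simple column-mean baseline, and scores RMSE only on the artificially masked entries (all function names are assumptions for illustration).

```python
import numpy as np

def mcar_mask(X, rate, rng):
    """Mask entries completely at random (MCAR) at the given rate."""
    mask = rng.random(X.shape) < rate
    Xm = X.copy()
    Xm[mask] = np.nan
    return Xm, mask

def mean_impute(Xm):
    """Baseline: fill each variable's missing entries with its observed mean."""
    col_means = np.nanmean(Xm, axis=0)
    return np.where(np.isnan(Xm), col_means, Xm)

def imputation_rmse(X, Xi, mask):
    """RMSE computed only on the artificially masked entries."""
    return float(np.sqrt(np.mean((X[mask] - Xi[mask]) ** 2)))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))          # stand-in for a multivariate time series matrix
Xm, mask = mcar_mask(X, rate=0.3, rng=rng)
print(imputation_rmse(X, mean_impute(Xm), mask))
```

Any competing imputer (statistical or deep) can be dropped in place of `mean_impute` and scored under the same masks, which is what makes method comparisons across missing rates and types possible.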


Subject(s)
Benchmarking, Research Design, Time Factors, Cross-Sectional Studies, Surveys and Questionnaires
2.
Knowl Based Syst; 249, 2022 Aug 05.
Article in English | MEDLINE | ID: mdl-36159738

ABSTRACT

Missing values in tabular data restrict the use and performance of machine learning, requiring the imputation of missing values. The most popular imputation algorithm is arguably multiple imputation by chained equations (MICE), which estimates missing values by linear conditioning on observed values. This paper proposes methods to improve both the imputation accuracy of MICE and the classification accuracy of imputed data by replacing MICE's linear regressors with ensemble learning and deep neural networks (DNN). The imputation accuracy is further improved by characterizing individual samples with cluster labels (CISCL) obtained from the training data. Our extensive analyses involving six tabular data sets, up to 80% missing values, and three missing types (missing completely at random, missing at random, missing not at random) reveal that ensemble or deep learning within MICE is superior to the baseline MICE (b-MICE), both of which are consistently outperformed by CISCL. Results show that CISCL + b-MICE outperforms b-MICE for all percentages and types of missingness. Our proposed DNN-based MICE and gradient boosting MICE plus CISCL (GB-MICE-CISCL) outperform seven state-of-the-art imputation algorithms in most experimental cases. The classification accuracy of GB-MICE-imputed data is further improved by our proposed GB-MICE-CISCL imputation method across all missingness percentages. Results also reveal a shortcoming of the MICE framework at high missingness (>50%) and when the missing type is not random. This paper provides a generalized approach to identifying the best imputation model for a data set given its missingness percentage and type.

3.
Neural Netw; 156: 160-169, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36270199

ABSTRACT

Fully connected deep neural networks (DNN) often include redundant weights leading to overfitting and high memory requirements. Additionally, in tabular data classification, DNNs are challenged by the often superior performance of traditional machine learning models. This paper proposes periodic perturbations (prune and regrow) of DNN weights, especially at the self-supervised pre-training stage of deep autoencoders. The proposed weight perturbation strategy outperforms dropout learning or weight regularization (L1 or L2) for four out of six tabular data sets in downstream classification tasks. Unlike dropout learning, the proposed weight perturbation routine additionally achieves 15% to 40% sparsity across six tabular data sets, resulting in compressed pretrained models. The proposed pretrained model compression improves the accuracy of downstream classification, unlike traditional weight pruning methods that trade off performance for model compression. Our experiments reveal that a pretrained deep autoencoder with weight perturbation can outperform traditional machine learning in tabular data classification, whereas baseline fully-connected DNNs yield the worst classification accuracy. However, traditional machine learning models are superior to any deep model when a tabular data set contains uncorrelated variables. Therefore, the performance of deep models with tabular data is contingent on the types and statistics of constituent variables.
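A minimal NumPy sketch of one prune-and-regrow perturbation step follows; the magnitude-based pruning criterion, the 40% prune fraction, and the 50% regrow fraction are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def perturb(W, prune_frac, rng):
    """One prune-and-regrow step: zero out the smallest-magnitude weights,
    then randomly reinitialize ('regrow') half of the pruned positions."""
    flat = np.abs(W).ravel()
    k = int(prune_frac * flat.size)
    threshold = np.partition(flat, k)[k]          # magnitude cutoff
    mask = np.abs(W) >= threshold                 # binary keep-mask
    Wp = W * mask
    pruned_idx = np.flatnonzero(~mask.ravel())
    regrow = rng.choice(pruned_idx, size=len(pruned_idx) // 2, replace=False)
    Wp.ravel()[regrow] = rng.normal(scale=0.01, size=len(regrow))
    return Wp

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))                     # stand-in for one layer's weights
Wp = perturb(W, prune_frac=0.4, rng=rng)
print(round(float(np.mean(Wp == 0)), 2))          # residual sparsity, roughly 0.2
```

Applied periodically during pretraining, the weights left at zero accumulate into the 15% to 40% sparsity the abstract reports, yielding a compressed model as a side effect.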


Subject(s)
Data Compression, Neural Networks, Computer, Machine Learning, Physical Phenomena
4.
IEEE Int Conf Healthc Inform; 2022: 104-111, 2022 Jun.
Article in English | MEDLINE | ID: mdl-36148026

ABSTRACT

The unpredictability and unknowns surrounding the ongoing coronavirus disease (COVID-19) pandemic have had unprecedented consequences, taking a heavy toll on the lives and economies of all countries. There have been efforts to predict COVID-19 case counts (CCC) using epidemiological data and numerical tokens online, which may allow early preventive measures to slow the spread of the disease. In this paper, we use state-of-the-art natural language processing (NLP) algorithms to numerically encode COVID-19-related tweets originating from eight cities in the United States and predict city-specific CCC up to eight days in the future. A city-embedding is proposed to obtain a time series representation of daily tweets posted from a city, which is then used to predict case counts using a custom long short-term memory (LSTM) model. The universal sentence encoder yields the best normalized root mean squared error (NRMSE) of 0.090 (0.039), averaged across all cities, in predicting CCC six days in the future. The R² scores in predicting CCC are more than 0.70 and often over 0.8, which suggests a strong correlation between the actual and model-predicted CCC values. Our analyses show that the NRMSE and R² scores are consistently robust across different cities and different numbers of time steps in time series data. Results show that the LSTM model can learn the mapping between the NLP-encoded tweet semantics and the case counts, which implies that social media text can be directly mined to identify the future course of the pandemic.
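NRMSE can be normalized in several ways; one common convention, assumed here purely for illustration, divides the RMSE by the range of the observed values (the toy case counts below are made up).

```python
import numpy as np

def nrmse(y_true, y_pred):
    """Root mean squared error normalized by the range of the true values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return float(rmse / (np.max(y_true) - np.min(y_true)))

cases_actual = np.array([120, 150, 200, 260, 300], dtype=float)
cases_pred   = np.array([110, 160, 190, 270, 310], dtype=float)
print(round(nrmse(cases_actual, cases_pred), 3))   # 0.056
```

Range normalization makes the error comparable across cities whose absolute case counts differ by orders of magnitude.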

5.
Article in English | MEDLINE | ID: mdl-36157884

ABSTRACT

The concept of weight pruning has shown success in neural network model compression with marginal loss in classification performance. However, similar concepts have not been well recognized in improving unsupervised learning. To the best of our knowledge, this paper proposes one of the first studies of weight pruning in unsupervised autoencoder models using non-imaging data points. We adapt the weight pruning concept to investigate the dynamic behavior of weights while reconstructing data using an autoencoder and propose a deterministic model perturbation algorithm based on the weight statistics. The model perturbation at periodic intervals resets a percentage of weight values using a binary weight mask. Experiments across eight non-imaging data sets ranging from gene sequence to swarm behavior data show that only a few periodic perturbations of weights improve the data reconstruction accuracy of autoencoders and additionally introduce model compression. All data sets yield a small portion (<5%) of weights that are substantially higher than the mean weight value. These weights are found to be much more informative than the substantial portion (>90%) of weights with negative values. In general, the perturbation of low or negative weight values at periodic intervals improves the data reconstruction loss for most data sets when compared to the case without perturbation. The proposed approach may help explain and correct the dynamic behavior of neural network models in a deterministic way for data reconstruction, yielding a more accurate representation of latent variables using autoencoders.
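One way to realize such a statistics-driven binary mask is sketched below; flagging all weights below the mean and resetting them to small random values is an illustrative choice, not the paper's exact criterion.

```python
import numpy as np

def periodic_reset(W, rng, scale=0.01):
    """One perturbation step: build a binary mask from the weight statistics
    (here, weights below the mean) and reset the flagged weights to small
    random values, leaving the rest untouched."""
    mask = W < W.mean()                     # binary weight mask from weight statistics
    W_new = np.where(mask, rng.normal(scale=scale, size=W.shape), W)
    return W_new, float(mask.mean())        # new weights, fraction reset

rng = np.random.default_rng(0)
W = rng.normal(size=(32, 32))               # stand-in for an autoencoder layer
W_new, frac = periodic_reset(W, rng)
print(round(frac, 2))                       # roughly half the weights fall below the mean
```

Because the mask is a deterministic function of the current weight statistics, the perturbation is reproducible, unlike stochastic dropout.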

6.
IEEE Int Conf Healthc Inform; 2021: 48-52, 2021 Aug.
Article in English | MEDLINE | ID: mdl-36168324

ABSTRACT

Deep transfer learning is a popular choice for classifying monochromatic medical images using models that are pretrained on natural images with color channels. This choice may introduce unnecessarily redundant model complexity that can limit explanations of such model behavior and outcomes in the context of medical imaging. To investigate this hypothesis, we develop a configurable deep convolutional neural network (CNN) to classify four macular disease conditions using retinal optical coherence tomography (OCT) images. Our proposed non-transfer deep CNN model (acc: 97.9%) outperforms existing transfer learning models such as ResNet-50 (acc: 89.0%), ResNet-101 (acc: 96.7%), VGG-19 (acc: 93.3%), and Inception-V3 (acc: 95.8%) in the same retinal OCT image classification task. We perform post-hoc analysis of the trained model and model-extracted image features, which reveals that only eight out of 256 filter kernels are active at our final convolutional layer. The convolutional responses of these eight selective filters yield image features that efficiently separate the four macular disease classes even when projected onto two-dimensional principal component space. Our findings suggest that many deep learning parameters and their computations are redundant and expensive for retinal OCT image classification, a redundancy expected to be even greater when using transfer learning. Additionally, we provide clinical interpretations of our misclassified test images, identifying manifest artifacts, shadowing of useful texture, false texture representing fluids, and other confounding factors. These clinical explanations, along with model optimization via kernel selection, can improve the classification accuracy, computational cost, and explainability of model outcomes.
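The notion of an "active" filter can be operationalized, for example, by thresholding each filter's mean absolute response over the data set; the criterion and the synthetic responses below are assumptions for illustration, not the paper's analysis code.

```python
import numpy as np

def active_filters(responses, threshold=1e-3):
    """Given final-layer convolutional responses of shape
    (n_images, n_filters, h, w), return indices of filters whose mean
    absolute activation over the data set exceeds a small threshold."""
    mean_act = np.mean(np.abs(responses), axis=(0, 2, 3))
    return np.flatnonzero(mean_act > threshold)

rng = np.random.default_rng(0)
responses = np.zeros((16, 256, 7, 7))               # 256 filters, mostly silent
responses[:, :8] = rng.normal(size=(16, 8, 7, 7))   # only 8 of 256 filters respond
print(len(active_filters(responses)))               # 8
```

Keeping only the active filter indices is the kernel-selection step the abstract suggests for cutting computational cost without hurting accuracy.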

7.
J Med Imaging (Bellingham); 6(2): 024501, 2019 Apr.
Article in English | MEDLINE | ID: mdl-31037246

ABSTRACT

A glioma grading method using conventional structural magnetic resonance imaging (MRI) and molecular data from patients is proposed. The noninvasive grading of glioma tumors is obtained using multiple radiomic texture features, including dynamic texture analysis, multifractal detrended fluctuation analysis, and multiresolution fractional Brownian motion, in structural MRI. The proposed method is evaluated using two multicenter MRI datasets: (1) the brain tumor segmentation (BRATS-2017) challenge for high-grade versus low-grade (LG) glioma grading and (2) the cancer imaging archive (TCIA) repository for glioblastoma (GBM) versus LG glioma grading. The grading performance using MRI is compared with that of digital pathology (DP) images in the cancer genome atlas (TCGA) data repository. The results show that the mean area under the receiver operating characteristic curve (AUC) is 0.88 for the BRATS dataset. The classification of tumor grades using MRI and DP images in TCIA/TCGA yields mean AUCs of 0.90 and 0.93, respectively. This work further proposes and compares tumor grading performance using molecular alterations (IDH1/2 mutations) along with MRI and DP data, following the most recent World Health Organization grading criteria. The overall grading performance demonstrates the efficacy of the proposed noninvasive glioma grading approach using structural MRI.
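The AUC reported here and in several of the following entries can be computed without plotting an ROC curve, via the Mann-Whitney formulation; the toy grading scores below are illustrative.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic: the
    probability that a random positive case scores above a random negative."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# toy example: higher score = more likely high-grade tumor
labels = np.array([1, 1, 1, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.4, 0.5, 0.3, 0.1])
print(round(float(auc(labels, scores)), 2))   # 0.89
```

An AUC of 0.88 thus means a randomly chosen high-grade case outranks a randomly chosen low-grade case 88% of the time.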

8.
JACC Cardiovasc Imaging; 12(4): 681-689, 2019 Apr.
Article in English | MEDLINE | ID: mdl-29909114

ABSTRACT

OBJECTIVES: The goal of this study was to use machine learning to more accurately predict survival after echocardiography. BACKGROUND: Predicting patient outcomes (e.g., survival) following echocardiography is primarily based on ejection fraction (EF) and comorbidities. However, there may be significant predictive information within additional echocardiography-derived measurements combined with clinical electronic health record data. METHODS: Mortality was studied in 171,510 unselected patients who underwent 331,317 echocardiograms in a large regional health system. The authors investigated the predictive performance of nonlinear machine learning models compared with that of linear logistic regression models using 3 different inputs: 1) clinical variables, including 90 cardiovascular-relevant International Classification of Diseases, Tenth Revision, codes, and age, sex, height, weight, heart rate, blood pressures, low-density lipoprotein, high-density lipoprotein, and smoking; 2) clinical variables plus physician-reported EF; and 3) clinical variables and EF, plus 57 additional echocardiographic measurements. Missing data were imputed with the multivariate imputation by chained equations (MICE) algorithm. The authors compared the models against each other and against baseline clinical scoring systems using the mean area under the curve (AUC) over 10 cross-validation folds and across 10 survival durations (6 to 60 months). RESULTS: Machine learning models achieved significantly higher prediction accuracy (all AUC >0.82) than common clinical risk scores (AUC = 0.61 to 0.79), with the nonlinear random forest models outperforming logistic regression (p < 0.01). The random forest model including all echocardiographic measurements yielded the highest prediction accuracy (p < 0.01 across all models and survival durations). Only 10 variables were needed to achieve 96% of the maximum prediction accuracy, with 6 of these variables being derived from echocardiography. Tricuspid regurgitation velocity was more predictive of survival than left ventricular EF. In a subset of studies with complete data for the top 10 variables, MICE imputation yielded slightly reduced predictive accuracy (difference in AUC of 0.003) compared with the original data. CONCLUSIONS: Machine learning can fully utilize large combinations of disparate input variables to predict survival after echocardiography with superior accuracy.
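The model comparison described in METHODS can be schematized on synthetic stand-in features (not the clinical data set), contrasting linear logistic regression with a nonlinear random forest under cross-validated AUC; all settings below are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for combined clinical + echocardiographic variables
X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           random_state=0)

results = {}
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=200, random_state=0)):
    aucs = cross_val_score(model, X, y, cv=10, scoring="roc_auc")
    results[type(model).__name__] = float(aucs.mean())   # mean AUC over 10 folds
print(results)
```

Averaging AUC over cross-validation folds, as the study does, guards against a single lucky train/test split inflating the comparison.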


Subject(s)
Data Mining/methods, Databases, Factual, Echocardiography, Electronic Health Records, Heart Diseases/diagnostic imaging, Machine Learning, Heart Diseases/mortality, Humans, Predictive Value of Tests, Prognosis, Reproducibility of Results, Retrospective Studies, Risk Assessment, Risk Factors, Time Factors
9.
IEEE Trans Neural Syst Rehabil Eng; 26(2): 353-361, 2018 Feb.
Article in English | MEDLINE | ID: mdl-29432106

ABSTRACT

Autism spectrum disorder (ASD) is a neurodevelopmental disability with atypical traits in behavioral and physiological responses. These atypical traits in individuals with ASD may be too subtle and subjective to measure visually using tedious scoring methods. Alternatively, the use of intrusive sensors to measure psychophysical responses in individuals with ASD is likely to cause inhibition and bias. This paper proposes a novel experimental protocol for non-intrusive sensing and analysis of facial expression, visual scanning, and eye-hand coordination to investigate behavioral markers for ASD. An institutional review board-approved pilot study was conducted to collect response data from two groups of subjects (ASD and control) while they engaged in tasks of visualization, recognition, and manipulation. For the first time in the ASD literature, the facial action coding system is used to classify spontaneous facial responses. Statistical analyses reveal a significantly (p < 0.01) higher prevalence of smile expression in the group with ASD, with eye-gaze significantly averted (p < 0.05) from viewing the face in the visual stimuli. This uncontrolled manifestation of smiling without proper visual engagement suggests impairment in reciprocal social communication, e.g., social smile. The group with ASD also shows poor correlation between eye-gaze and hand movement data, suggesting deficits in motor coordination while performing a dynamic manipulation task. The simultaneous sensing and analysis of multimodal response data may provide useful quantitative insights into ASD to facilitate early detection of symptoms for effective intervention planning.


Subject(s)
Autistic Disorder/psychology, Behavior, Facial Expression, Movement, Psychomotor Performance, Adolescent, Algorithms, Autistic Disorder/diagnosis, Biomarkers, Child, Feasibility Studies, Female, Fixation, Ocular, Humans, Male, Photic Stimulation, Pilot Projects, Social Behavior, Young Adult
10.
Eur Heart J Cardiovasc Imaging; 19(7): 730-738, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29538684

ABSTRACT

Aims: Previous studies using regression analyses have failed to identify which patients with repaired tetralogy of Fallot (rTOF) are at risk for deterioration in ventricular size and function, despite using common clinical and cardiac function parameters as well as cardiac mechanics (strain and dyssynchrony). This study used a machine learning pipeline to comprehensively investigate the predictive value of baseline variables derived from cardiac magnetic resonance (CMR) imaging and provide models for identifying patients at risk for deterioration. Methods and results: Longitudinal deterioration for 153 patients with rTOF was categorized as 'none', 'minor', or 'major' based on changes in ventricular size and ejection fraction between two CMR scans at least 6 months apart (median 2.7 years). Baseline variables were measured at the time of the first CMR. An exhaustive variable search with a support vector machine classifier and five-fold cross-validation was used to predict deterioration and identify the most useful variables. For predicting any deterioration (minor or major) vs. no deterioration, the mean area under the curve (AUC) was 0.82 ± 0.06. For predicting major deterioration vs. minor or no deterioration, the AUC was 0.77 ± 0.07. Baseline left ventricular (LV) ejection fraction, LV circumferential strain, and pulmonary regurgitation were most useful for achieving accurate predictions. Conclusion: For the prediction of deterioration in patients with rTOF, a machine learning pipeline uncovered the utility of baseline variables that was previously lost to regression analyses. The predictive models may be useful for planning early interventions in high-risk patients.
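The exhaustive variable search with an SVM and five-fold cross-validation can be sketched as follows; the synthetic data, the subset sizes searched, and all other settings are illustrative assumptions, not the study's pipeline.

```python
from itertools import combinations

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for baseline CMR-derived variables (153 patients, 6 variables)
X, y = make_classification(n_samples=153, n_features=6, n_informative=3,
                           random_state=0)

best_auc, best_vars = 0.0, None
for k in (1, 2, 3):                               # exhaustively try small variable subsets
    for subset in combinations(range(X.shape[1]), k):
        clf = make_pipeline(StandardScaler(), SVC())
        aucs = cross_val_score(clf, X[:, subset], y, cv=5, scoring="roc_auc")
        if aucs.mean() > best_auc:
            best_auc, best_vars = float(aucs.mean()), subset

print(best_vars, round(best_auc, 2))
```

The winning subset plays the role of the "most useful variables" the study reports; exhaustive search stays tractable here because the candidate variable pool is small.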


Subject(s)
Cardiac Surgical Procedures/adverse effects, Tetralogy of Fallot/surgery, Ventricular Dysfunction, Left/diagnostic imaging, Area Under Curve, Cardiac Surgical Procedures/methods, Child, Child, Preschool, Cohort Studies, Databases, Factual, Electrocardiography/methods, Female, Follow-Up Studies, Hospitals, Pediatric, Humans, Infant, Machine Learning, Magnetic Resonance Imaging, Cine/methods, Male, Predictive Value of Tests, Retrospective Studies, Stroke Volume/physiology, Tetralogy of Fallot/diagnostic imaging, Treatment Outcome, Ventricular Dysfunction, Left/physiopathology, Ventricular Function, Left/physiology
11.
Article in English | MEDLINE | ID: mdl-21095684

ABSTRACT

In robot-assisted surgery, it may be important to provide force feedback to the hand of the surgeon. Here we examine how force feedback from each degree of freedom (DOF) of a hand controller affects the motion accuracy of a surgical tool. We studied the motion accuracy of a needle-shaped tool in performing a robot-assisted tracing task. In a virtual simulation of the tool and the neuroArm robot, human participants manipulated a hand controller to move the tool attached to the end-effector of the robot. They used the tool to trace a line on pipes (mimicking blood vessels) along three orthogonal directions, corresponding to three DOF of the hand controller. We observed that force feedback from each DOF of the hand controller had a significant effect on the motion accuracy of the tool during tracing. Varying the force conditions yielded no significant difference in motion accuracy. These results indicate a need to revise the hand controller to achieve improved motion accuracy in robot-assisted tasks.


Subject(s)
Feedback, Robotics/instrumentation, Surgery, Computer-Assisted/instrumentation, Adult, Algorithms, Equipment Design, Humans, Motion, Reproducibility of Results, Robotics/methods, Software, Surgery, Computer-Assisted/methods, User-Computer Interface, Vision, Ocular