Results 1 - 20 of 34
1.
ArXiv ; 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38313201

ABSTRACT

Traumatic Brain Injury (TBI) presents a broad spectrum of clinical presentations and outcomes due to its inherent heterogeneity, leading to diverse recovery trajectories and varied therapeutic responses. While many studies have delved into TBI phenotyping for distinct patient populations, identifying TBI phenotypes that consistently generalize across various settings and populations remains a critical research gap. Our research addresses this by employing multivariate time-series clustering to unveil TBI's dynamic intricacies. Utilizing a Self-supervised Learning-based Approach to Clustering multivariate Time-series data with missing values (SLAC-Time), we analyzed both the research-centric TRACK-TBI and the real-world MIMIC-IV datasets. Remarkably, the optimal hyperparameters of SLAC-Time and the ideal number of clusters remained consistent across these datasets, underscoring SLAC-Time's stability across heterogeneous datasets. Our analysis revealed three generalizable TBI phenotypes (α, β, and γ), each exhibiting distinct non-temporal features during emergency department visits, and temporal feature profiles throughout ICU stays. Specifically, phenotype α represents mild TBI with a remarkably consistent clinical presentation. In contrast, phenotype β signifies severe TBI with diverse clinical manifestations, and phenotype γ represents a moderate TBI profile in terms of severity and clinical diversity. Age is a significant determinant of TBI outcomes, with older cohorts recording higher mortality rates. Importantly, while certain features varied by age, the core characteristics of TBI manifestations tied to each phenotype remain consistent across diverse populations.

2.
J Biomed Inform ; 144: 104438, 2023 08.
Article in English | MEDLINE | ID: mdl-37414368

ABSTRACT

Unpacking and comprehending how black-box machine learning algorithms (such as deep learning models) make decisions has been a persistent challenge for researchers and end-users. Explaining time-series predictive models is useful for clinical applications with high stakes to understand the behavior of prediction models, e.g., to determine how different variables and time points influence the clinical outcome. However, existing approaches to explain such models are frequently unique to architectures and data where the features do not have a time-varying component. In this paper, we introduce WindowSHAP, a model-agnostic framework for explaining time-series classifiers using Shapley values. We intend for WindowSHAP to mitigate the computational complexity of calculating Shapley values for long time-series data as well as improve the quality of explanations. WindowSHAP is based on partitioning a sequence into time windows. Under this framework, we present three distinct algorithms of Stationary, Sliding and Dynamic WindowSHAP, each evaluated against baseline approaches, KernelSHAP and TimeSHAP, using perturbation and sequence analysis metrics. We applied our framework to clinical time-series data from both a specialized clinical domain (Traumatic Brain Injury - TBI) as well as a broad clinical domain (critical care medicine). The experimental results demonstrate that, based on the two quantitative metrics, our framework is superior at explaining clinical time-series classifiers, while also reducing the complexity of computations. We show that for time-series data with 120 time steps (hours), merging 10 adjacent time points can reduce the CPU time of WindowSHAP by 80% compared to KernelSHAP. We also show that our Dynamic WindowSHAP algorithm focuses more on the most important time steps and provides more understandable explanations. As a result, WindowSHAP not only accelerates the calculation of Shapley values for time-series data, but also delivers more understandable explanations with higher quality.


Subjects
Algorithms , Traumatic Brain Injuries , Humans , Time Factors , Benchmarking , Traumatic Brain Injuries/diagnosis , Machine Learning
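
The windowing idea in the WindowSHAP abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy model, and the all-zeros baseline are assumptions made for illustration, and the Shapley values are computed by exact enumeration over windows rather than by the sampling-based KernelSHAP estimator. The point is only that treating each fixed-length window as one "player" shrinks the problem (for example, 120 hourly steps merged in groups of 10 give 12 players instead of 120).

```python
from itertools import combinations
from math import factorial


def stationary_windows(n_steps, window_len):
    """Split time indices 0..n_steps-1 into consecutive fixed-length windows."""
    return [range(s, min(s + window_len, n_steps))
            for s in range(0, n_steps, window_len)]


def window_shapley(model, x, baseline, window_len):
    """Exact Shapley values where each time window (not each step) is a player."""
    windows = stationary_windows(len(x), window_len)
    n = len(windows)

    def value(coalition):
        # Steps inside the coalition's windows keep their observed value;
        # every other step is replaced by the baseline value.
        z = list(baseline)
        for w in coalition:
            for t in windows[w]:
                z[t] = x[t]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def toy_model(series):
    """Hypothetical stand-in for a time-series classifier's output score."""
    return sum(series[-3:]) / 3 + 0.5 * max(series)


x = [float(t) for t in range(1, 13)]   # 12 time steps
baseline = [0.0] * 12                  # assumed reference series
phi = window_shapley(toy_model, x, baseline, window_len=3)  # 4 players
```

Because the enumeration is exact, the window attributions in `phi` sum to `toy_model(x) - toy_model(baseline)`, which is a convenient sanity check. Exact enumeration is only feasible for a handful of windows; longer sequences would need a sampling-based estimator.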
3.
J Biomed Inform ; 143: 104401, 2023 07.
Article in English | MEDLINE | ID: mdl-37225066

ABSTRACT

Self-supervised learning approaches provide a promising direction for clustering multivariate time-series data. However, real-world time-series data often include missing values, and the existing approaches require imputing missing values before clustering, which may cause extensive computations and noise and result in invalid interpretations. To address these challenges, we present a Self-supervised Learning-based Approach to Clustering multivariate Time-series data with missing values (SLAC-Time). SLAC-Time is a Transformer-based clustering method that uses time-series forecasting as a proxy task for leveraging unlabeled data and learning more robust time-series representations. This method jointly learns the neural network parameters and the cluster assignments of the learned representations. It iteratively clusters the learned representations with the K-means method and then utilizes the subsequent cluster assignments as pseudo-labels to update the model parameters. To evaluate our proposed approach, we applied it to clustering and phenotyping Traumatic Brain Injury (TBI) patients in the Transforming Research and Clinical Knowledge in Traumatic Brain Injury (TRACK-TBI) study. Clinical data associated with TBI patients are often measured over time and represented as time-series variables characterized by missing values and irregular time intervals. Our experiments demonstrate that SLAC-Time outperforms the baseline K-means clustering algorithm in terms of silhouette coefficient, Calinski-Harabasz index, Dunn index, and Davies-Bouldin index. We identified three TBI phenotypes that are distinct from one another in terms of clinically significant variables as well as clinical outcomes, including the Extended Glasgow Outcome Scale (GOSE) score, Intensive Care Unit (ICU) length of stay, and mortality rate. The experiments show that the TBI phenotypes identified by SLAC-Time can be potentially used for developing targeted clinical trials and therapeutic strategies.


Subjects
Traumatic Brain Injuries , Humans , Traumatic Brain Injuries/diagnosis , Cluster Analysis , Time Factors , Intensive Care Units , Supervised Machine Learning
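
The SLAC-Time abstract above compares clusterings using the silhouette coefficient, among other indices. As a rough illustration of what that metric measures, the following sketch computes the standard mean silhouette over Euclidean distances. The function name and the toy points are assumptions for illustration and are not taken from the study; the inputs stand in for learned representations and their cluster assignments.

```python
from math import dist


def silhouette_coefficient(points, labels):
    """Mean silhouette over all points, using Euclidean distance.

    For each point: a = mean distance to the other members of its cluster,
    b = smallest mean distance to the members of any other cluster,
    s = (b - a) / max(a, b). Points in singleton clusters score 0.
    """
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # dist(p, p) is 0, so dividing by len(own) - 1 averages over the others.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Two tight, well-separated groups (toy stand-ins for patient embeddings).
embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_coefficient(embeddings, [0, 0, 1, 1])
bad = silhouette_coefficient(embeddings, [0, 1, 0, 1])
```

Values near 1 indicate compact, well-separated clusters, while values near or below 0 suggest points assigned to the wrong cluster, which is why the `good` labeling scores much higher than the `bad` one.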
4.
IEEE Trans Cybern ; 53(4): 2124-2136, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34546938

ABSTRACT

Electronic health records (EHRs) have been heavily used in modern healthcare systems for recording patients' admission information to health facilities. Many data-driven approaches employ temporal features in EHR for predicting specific diseases, readmission times, and diagnoses of patients. However, most existing predictive models cannot fully utilize EHR data, due to an inherent lack of labels in supervised training for some temporal events. Moreover, it is hard for the existing methods to simultaneously provide generic and personalized interpretability. To address these challenges, we propose Sherbet, a self-supervised graph learning framework with hyperbolic embeddings for temporal health event prediction. We first propose a hyperbolic embedding method with information flow to pretrain medical code representations in a hierarchical structure. We incorporate these pretrained representations into a graph neural network (GNN) to detect disease complications and design a multilevel attention method to compute the contributions of particular diseases and admissions, thus enhancing personalized interpretability. We present a new hierarchy-enhanced historical prediction proxy task in our self-supervised learning framework to fully utilize EHR data and exploit medical domain knowledge. We conduct a comprehensive set of experiments on widely used publicly available EHR datasets to verify the effectiveness of our model. Our results demonstrate the proposed model's strengths in both predictive tasks and interpretable abilities.


Subjects
Electronic Health Records , Computer Neural Networks , Humans
5.
AMIA Annu Symp Proc ; 2023: 379-388, 2023.
Article in English | MEDLINE | ID: mdl-38222366

ABSTRACT

Determining clinically relevant physiological states from multivariate time-series data with missing values is essential for providing appropriate treatment for acute conditions such as Traumatic Brain Injury (TBI), respiratory failure, and heart failure. Utilizing non-temporal clustering or data imputation and aggregation techniques may lead to loss of valuable information and biased analyses. In our study, we apply the SLAC-Time algorithm, an innovative self-supervision-based approach that maintains data integrity by avoiding imputation or aggregation, offering a more useful representation of acute patient states. By using SLAC-Time to cluster data in a large research dataset, we identified three distinct TBI physiological states and their specific feature profiles. We employed various clustering evaluation metrics and incorporated input from a clinical domain expert to validate and interpret the identified physiological states. Further, we discovered how specific clinical events and interventions can influence patient states and state transitions.


Subjects
Traumatic Brain Injuries , Humans , Traumatic Brain Injuries/diagnosis , Algorithms , Cluster Analysis , Time Factors , Benchmarking
6.
Sci Rep ; 12(1): 10748, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35750878

ABSTRACT

Developing prediction models for emerging infectious diseases from relatively small numbers of cases is a critical need for improving pandemic preparedness. Using COVID-19 as an exemplar, we propose a transfer learning methodology for developing predictive models from multi-modal electronic healthcare records by leveraging information from more prevalent diseases with shared clinical characteristics. Our novel hierarchical, multi-modal model ([Formula: see text]) integrates baseline risk factors from the natural language processing of clinical notes at admission, time-series measurements of biomarkers obtained from laboratory tests, and discrete diagnostic, procedure and drug codes. We demonstrate the alignment of [Formula: see text]'s predictions with well-established clinical knowledge about COVID-19 through univariate and multivariate risk factor driven sub-cohort analysis. [Formula: see text]'s superior performance over state-of-the-art methods shows that leveraging patient data across modalities and transferring prior knowledge from similar disorders is critical for accurate prediction of patient outcomes, and this approach may serve as an important tool in the early response to future pandemics.


Subjects
COVID-19 , Pandemics , COVID-19/epidemiology , Humans , Machine Learning , Natural Language Processing , Prognosis
7.
AMIA Annu Symp Proc ; 2022: 815-824, 2022.
Article in English | MEDLINE | ID: mdl-37128424

ABSTRACT

A longstanding challenge surrounding deep learning algorithms is unpacking and understanding how they make their decisions. Explainable Artificial Intelligence (XAI) offers methods to provide explanations of internal functions of algorithms and reasons behind their decisions in ways that are interpretable and understandable to human users. Numerous XAI approaches have been developed thus far, and a comparative analysis of these strategies seems necessary to discern their relevance to clinical prediction models. To this end, we first implemented two prediction models for short- and long-term outcomes of traumatic brain injury (TBI) utilizing structured tabular as well as time-series physiologic data, respectively. Six different interpretation techniques were used to describe both prediction models at the local and global levels. We then performed a critical analysis of merits and drawbacks of each strategy, highlighting the implications for researchers who are interested in applying these methodologies. The implemented methods were compared to one another in terms of several XAI characteristics such as understandability, fidelity, and stability. Our findings show that SHAP is the most stable with the highest fidelity but falls short of understandability. Anchors, on the other hand, is the most understandable approach, but it is only applicable to tabular data and not time series data.


Subjects
Artificial Intelligence , Traumatic Brain Injuries , Humans , Algorithms , Research Personnel , Time Factors
8.
Sci Rep ; 11(1): 19826, 2021 10 06.
Article in English | MEDLINE | ID: mdl-34615894

ABSTRACT

Medical images are difficult to comprehend for a person without expertise. Because medical practitioners are scarce across the globe, they often face physical and mental fatigue due to the high number of cases, which induces human errors during diagnosis. In such scenarios, having an additional opinion can be helpful in boosting the confidence of the decision maker. Thus, it becomes crucial to have a reliable visual question answering (VQA) system to provide a 'second opinion' on medical cases. However, most of the VQA systems that work today cater to real-world problems and are not specifically tailored for handling medical images. Moreover, the VQA system for medical images needs to consider a limited amount of training data available in this domain. In this paper, we develop MedFuseNet, an attention-based multimodal deep learning model, for VQA on medical images taking the associated challenges into account. Our MedFuseNet aims at maximizing the learning with minimal complexity by breaking the problem statement into simpler tasks and predicting the answer. We tackle two types of answer prediction: categorization and generation. We conducted an extensive set of quantitative and qualitative analyses to evaluate the performance of MedFuseNet. Our experiments demonstrate that MedFuseNet outperforms the state-of-the-art VQA methods, and that visualization of the captured attentions showcases the interpretability of our model's predicted results.


Subjects
Attention , Deep Learning , Diagnostic Imaging , Computer-Assisted Image Interpretation/methods , Computer-Assisted Image Processing/methods , Software , Algorithms , Humans , User-Computer Interface
9.
AMIA Annu Symp Proc ; 2021: 900-909, 2021.
Article in English | MEDLINE | ID: mdl-35309007

ABSTRACT

We developed a prognostic model for longer-term outcome prediction in traumatic brain injury (TBI) using an attention-based recurrent neural network (RNN). The model was trained on admission and time series data obtained from a multi-site, longitudinal, observational study of TBI patients. We included 110 clinical variables as model input and Glasgow Outcome Score Extended (GOSE) at six months after injury as the outcome variable. Designed to handle missing values in time series data, the RNN model was compared to an existing TBI prognostic model using 10-fold cross validation. The area under receiver operating characteristic curve (AUC) for the RNN model is 0.86 (95% CI 0.83-0.89) for binary outcomes, whereas the AUC of the comparison model is 0.69 (95% CI 0.67-0.71). We demonstrated that including time series data into prognostic models for TBI can boost the discriminative ability of prediction models with either binary or ordinal outcomes.


Subjects
Traumatic Brain Injuries , Traumatic Brain Injuries/diagnosis , Humans , Computer Neural Networks , Prognosis , ROC Curve , Time Factors
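
The prognostic-model abstract above reports discrimination as the area under the ROC curve (AUC) for binary outcomes. As a small worked illustration of that metric, the sketch below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name and the four example scores are assumptions for illustration, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 outcomes; scores: predicted risk per case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Each (positive, negative) pair contributes 1 if ranked correctly, 0.5 if tied.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the 0.86 versus 0.69 comparison in the abstract. The pairwise form is quadratic in the number of cases, so a rank-based implementation would be preferable for large cohorts.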
10.
JMIR Biomed Eng ; 6(1): e24698, 2021 Feb 02.
Article in English | MEDLINE | ID: mdl-38907379

ABSTRACT

BACKGROUND: With advances in digital health technologies and proliferation of biomedical data in recent years, applications of machine learning in health care and medicine have gained considerable attention. While inpatient settings are equipped to generate rich clinical data from patients, there is a dearth of actionable information that can be used for pursuing secondary research for specific clinical conditions. OBJECTIVE: This study focused on applying unsupervised machine learning techniques for traumatic brain injury (TBI), which is the leading cause of death and disability among children and adults aged less than 44 years. Specifically, we present a case study to demonstrate the feasibility and applicability of subspace clustering techniques for extracting patterns from data collected from TBI patients. METHODS: Data for this study were obtained from the Progesterone for Traumatic Brain Injury, Experimental Clinical Treatment-Phase III (PROTECT III) trial, which included a cohort of 882 TBI patients. We applied subspace-clustering methods (density-based, cell-based, and clustering-oriented methods) to this data set and compared the performance of the different clustering methods. RESULTS: The analyses showed the following three clusters of laboratory physiological data: (1) international normalized ratio (INR), (2) INR, chloride, and creatinine, and (3) hemoglobin and hematocrit. While all subclustering algorithms had a reasonable accuracy in classifying patients by mortality status, the density-based algorithm had a higher F1 score and coverage. CONCLUSIONS: Clustering approaches serve as an important step for phenotype definition and validation in clinical domains such as TBI, where patient and injury heterogeneity are among the major reasons for failure of clinical trials. The results from this study provide a foundation to develop scalable clustering algorithms for further research and validation.

11.
IEEE Trans Neural Netw Learn Syst ; 31(7): 2469-2489, 2020 07.
Article in English | MEDLINE | ID: mdl-31425057

ABSTRACT

In recent times, sequence-to-sequence (seq2seq) models have gained a lot of popularity and provide state-of-the-art performance in a wide variety of tasks, such as machine translation, headline generation, text summarization, speech-to-text conversion, and image caption generation. The underlying framework for all these models is usually a deep neural network comprising an encoder and a decoder. Although simple encoder-decoder models produce competitive results, many researchers have proposed additional improvements over these seq2seq models, e.g., using an attention-based model over the input, pointer-generation models, and self-attention models. However, such seq2seq models suffer from two common problems: 1) exposure bias and 2) inconsistency between train/test measurement. Recently, a completely novel point of view has emerged in addressing these two problems in seq2seq models, leveraging methods from reinforcement learning (RL). In this survey, we consider seq2seq problems from the RL point of view and provide a formulation combining the power of RL methods in decision-making with seq2seq models that enable remembering long-term memories. We present some of the most recent frameworks that combine the concepts from RL and deep neural networks. Our work aims to provide insights into some of the problems that inherently arise with current approaches and how we can address them with better RL models. We also provide the source code for implementing most of the RL models discussed in this paper to support the complex task of abstractive text summarization and provide some targeted experiments for these RL models, both in terms of performance and training time.

12.
IEEE Access ; 7: 78421-78433, 2019.
Article in English | MEDLINE | ID: mdl-32661495

ABSTRACT

This paper presents a Speech Enhancement (SE) technique based on multi-objective learning convolutional neural network to improve the overall quality of speech perceived by Hearing Aid (HA) users. The proposed method is implemented on a smartphone as an application that performs real-time SE. This arrangement works as an assistive tool to HA. A multi-objective learning architecture including primary and secondary features uses a mapping-based convolutional neural network (CNN) model to remove noise from a noisy speech spectrum. The algorithm is computationally fast and has a low processing delay which enables it to operate seamlessly on a smartphone. The steps and the detailed analysis of real-time implementation are discussed. The proposed method is compared with existing conventional and neural network-based SE techniques through speech quality and intelligibility metrics in various noisy speech conditions. The key contribution of this paper includes the realization of CNN SE model on a smartphone processor that works seamlessly with HA. The experimental results demonstrate significant improvements over the state-of-the-art techniques and reflect the usability of the developed SE application in noisy environments.

13.
Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 5503-5506, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30441583

ABSTRACT

In this paper, we present a Speech Enhancement (SE) technique to improve the intelligibility of speech perceived by Hearing Aid (HA) users using a smartphone as an assistive device. We use the formant frequency information to improve the overall quality and intelligibility of the speech. The proposed SE method is based on a new super Gaussian joint maximum a posteriori (SGJMAP) estimator. Using the a priori information of formant frequency locations, the derived gain function has "tradeoff" factors that allow the smartphone user to customize perceptual preference, by controlling the amount of noise suppression and speech distortion in real-time. The formant frequency information helps the hearing aid user to control the gains over the non-formant frequency band, allowing the HA users to attain more noise suppression while maintaining the speech intelligibility using a smartphone application. Objective intelligibility measures and subjective results reflect the usability of the developed SE application in noisy real-world acoustic environments.


Subjects
Hearing Aids , Smartphone , Speech Perception , Noise , Speech Intelligibility
14.
Annu Int Conf IEEE Eng Med Biol Soc ; 2018: 417-420, 2018 Jul.
Article in English | MEDLINE | ID: mdl-30440422

ABSTRACT

This paper presents the minimum variance distortionless response (MVDR) beamformer combined with a Speech Enhancement (SE) gain function as a real-time application running on smartphones that work as an assistive device to Hearing Aids. It has been shown that beamforming techniques improve the Signal to Noise Ratio (SNR) in noisy conditions. In the proposed algorithm, MVDR beamformer is used as an SNR booster for the SE method. The proposed SE gain is based on the Log-Spectral Amplitude estimator to improve the speech quality in the presence of different background noises. Objective evaluation and intelligibility measures support the theoretical analysis and show significant improvements of the proposed method in comparison with existing methods. Subjective test results show the effectiveness of the application in real-world noisy conditions at SNR levels of -5 dB, 0 dB, and 5 dB.


Subjects
Algorithms , Hearing Aids , Smartphone , Software , Humans , Noise , Self-Help Devices , Signal-to-Noise Ratio , Speech Intelligibility , Speech Perception
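
The abstract above uses an MVDR beamformer as an SNR booster. The textbook MVDR weights are w = R⁻¹d / (dᴴR⁻¹d), where R is the noise covariance matrix and d is the steering vector toward the desired source. The following minimal sketch evaluates that formula for a two-microphone array at a single frequency bin. The covariance values, the phase delay, and the function name are illustrative assumptions; the paper's actual smartphone implementation is not described in the abstract.

```python
import cmath


def mvdr_weights_2mic(R, d):
    """Textbook MVDR weights w = R^{-1} d / (d^H R^{-1} d) for two microphones.

    R: 2x2 Hermitian noise covariance as nested lists of complex numbers.
    d: steering vector (two complex entries) toward the desired source.
    """
    (a, b), (c, e) = R
    det = a * e - b * c
    if abs(det) < 1e-12:
        raise ValueError("noise covariance is singular")
    # Closed-form inverse of a 2x2 matrix.
    R_inv = [[e / det, -b / det],
             [-c / det, a / det]]
    Rinv_d = [R_inv[0][0] * d[0] + R_inv[0][1] * d[1],
              R_inv[1][0] * d[0] + R_inv[1][1] * d[1]]
    denom = d[0].conjugate() * Rinv_d[0] + d[1].conjugate() * Rinv_d[1]
    return [v / denom for v in Rinv_d]


# Assumed values for one frequency bin: correlated noise and a source whose
# wavefront reaches the second microphone with a 0.7 rad phase lag.
noise_cov = [[1.0 + 0j, 0.3 + 0.1j],
             [0.3 - 0.1j, 1.5 + 0j]]
steering = [1.0 + 0j, cmath.exp(-0.7j)]
weights = mvdr_weights_2mic(noise_cov, steering)

# Distortionless constraint: the beamformer response toward the source is 1.
response = (weights[0].conjugate() * steering[0]
            + weights[1].conjugate() * steering[1])
```

The check on `response` reflects the "distortionless" part of the name: the weights minimize output noise power while passing the target direction with unit gain. In practice this would be repeated per frequency bin with a covariance estimated from noise-only frames.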
15.
PLoS One ; 13(2): e0193259, 2018.
Article in English | MEDLINE | ID: mdl-29474481

ABSTRACT

An Acute Hypotensive Episode (AHE) is the sudden onset of a sustained period of low blood pressure and is among the most critical conditions in Intensive Care Units (ICU). Without timely medical care, it can lead to irreversible organ damage and death. By identifying patients at risk for AHE early, adequate medical intervention can save lives and improve patient outcomes. In this paper, we design a novel dual-boundary classification-based approach for identifying patients at risk for AHE. Our algorithm uses only simple summary statistics of past blood pressure measurements and can be used in an online environment facilitating real-time updates and prediction. We perform extensive experiments with more than 4,500 patient records and demonstrate that our method outperforms the previous best approaches of AHE prediction. Our method can identify AHE patients two hours in advance of the onset, giving sufficient time for appropriate clinical intervention with nearly 80% sensitivity and at 95% specificity, thus having very few false positives.


Subjects
Blood Pressure , Critical Care/methods , Hypotension , Computerized Medical Records Systems , Cardiovascular Models , Female , Humans , Hypotension/diagnosis , Hypotension/physiopathology , Male , Predictive Value of Tests
16.
Annu Int Conf IEEE Eng Med Biol Soc ; 2017: 3660-3663, 2017 Jul.
Article in English | MEDLINE | ID: mdl-29060692

ABSTRACT

Clinical time series, comprising repeated clinical measurements, provide valuable information about the trajectory of patients' condition. Linear dynamical systems (LDS) are used extensively in science and engineering for modeling time series data. The observation and state variables in LDS are assumed to be uniformly sampled in time with a fixed sampling rate. The observation sequence for clinical time series is often irregularly sampled and LDS do not model such data well. In this paper, we develop two LDS-based models for irregularly sampled data. The key idea is to incorporate a temporal difference variable within the state equations of LDS whose parameters are estimated using observed data. Our models are evaluated on prediction and imputation tasks using real irregularly sampled clinical time series data and are found to outperform state-of-the-art techniques.


Subjects
Linear Models
17.
IEEE Signal Process Lett ; 24(11): 1601-1605, 2017 Nov.
Article in English | MEDLINE | ID: mdl-29353988

ABSTRACT

In this letter, we derive a new super Gaussian Joint Maximum a Posteriori (SGJMAP) based single microphone speech enhancement gain function. The developed Speech Enhancement method is implemented on a smartphone, and this arrangement functions as an assistive device to hearing aids. We introduce a "tradeoff" parameter in the derived gain function that allows the smartphone user to customize their listening preference, by controlling the amount of noise suppression and speech distortion in real-time based on their level of hearing comfort perceived in noisy real world acoustic environment. Objective quality and intelligibility measures show the effectiveness of the proposed method in comparison to benchmark techniques considered in this paper. Subjective results reflect the usefulness of the developed Speech Enhancement application in real-world noisy conditions at signal to noise ratio levels of -5 dB, 0 dB and 5 dB.

18.
Health Innov Point Care Conf ; 2017: 32-35, 2017 Nov.
Article in English | MEDLINE | ID: mdl-32705090

ABSTRACT

In this paper, we present a Speech Enhancement (SE) method implemented on a smartphone, and this arrangement functions as an assistive device to hearing aids (HA). Many benchmark single channel SE algorithms implemented on HAs provide considerable improvement in speech quality, while speech intelligibility improvement still remains a prime challenge. The proposed SE method, based on a Log spectral amplitude estimator, improves speech intelligibility in noisy real-world acoustic environments using the a priori information of formant frequency locations. The formant frequency information allows us to control the amount of speech distortion in these frequency bands. We introduce a 'scaling' parameter for the SE gain function, which controls the gains over the non-formant frequency band, allowing the HA users to customize the playback speech using a smartphone application to their listening preference. Objective intelligibility measures show the effectiveness of the proposed SE method. Subjective results reflect the suitability of the developed Speech Enhancement application in real-world noisy conditions at SNR levels of -5 dB, 0 dB and 5 dB.

19.
Knowl Inf Syst ; 48(1): 201-228, 2016 Jul.
Article in English | MEDLINE | ID: mdl-27378821

ABSTRACT

A fundamental problem in data mining is to effectively build robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for skewed distribution datasets. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data is not readily available, the development of a representative classification algorithm becomes even more difficult due to the unequal distribution between classes. We provide a unified framework that will potentially take advantage of auxiliary data using a transfer learning mechanism and simultaneously build a robust classifier to tackle this imbalance issue in the presence of few training samples in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are not sufficient and in this paper we will develop a method that is optimized to simultaneously augment the training data and induce balance into skewed datasets. We propose a novel boosting based instance-transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply to healthcare and text classification applications.

20.
Inf Sci (N Y) ; 330: 245-259, 2016 Feb 10.
Article in English | MEDLINE | ID: mdl-26681811

ABSTRACT

In recent years, electronic health records (EHRs) have been widely adopted at many healthcare facilities in an attempt to improve the quality of patient care and increase the productivity and efficiency of healthcare delivery. These EHRs can support accurate disease diagnosis if utilized appropriately. While the EHRs can potentially resolve many of the existing problems associated with disease diagnosis, one of the main obstacles in effectively using them is the patient privacy and sensitivity of the medical information available in the EHR. Due to these concerns, even if the EHRs are available for storage and retrieval purposes, sharing of the patient records between different healthcare facilities has become a major concern and has hampered some of the effective advantages of using EHRs. Due to this lack of data sharing, most of the facilities aim at building clinical decision support systems using a limited amount of patient data from their own EHR systems to provide important diagnosis related decisions. It becomes quite infeasible for a newly established healthcare facility to build a robust decision making system due to the lack of sufficient patient records. However, to make effective decisions from clinical data, it is indispensable to have large amounts of data to train the decision models. In this regard, there are conflicting objectives of preserving patient privacy and having sufficient data for modeling and decision making. To handle such disparate goals, we develop two adaptive distributed privacy-preserving algorithms based on a distributed ensemble strategy. The basic idea of our approach is to build an elegant model for each participating facility to accurately learn the data distribution, and then transfer the useful healthcare knowledge acquired on their data from these participators in the form of their own decision models without revealing and sharing the patient-level sensitive data, thus protecting patient privacy. We demonstrate that our approach can successfully build accurate and robust prediction models, under privacy constraints, using the healthcare data collected from different geographical locations. We demonstrate the performance of our method using the Type-2 diabetes EHRs accumulated from multiple sources from all fifty states in the U.S. Our method was evaluated on diagnosing diabetes in the presence of an insufficient number of patient records from certain regions without revealing the actual patient data from other regions. Using the proposed approach, we also discovered the important biomarkers, both universal and region-specific, and validated the selected biomarkers using the biomedical literature.
