Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 9 de 9
Filtrar
1.
J Biomed Inform ; 148: 104547, 2023 Dec.
Artículo en Inglés | MEDLINE | ID: mdl-37984547

RESUMEN

OBJECTIVE: Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and clinical data quality. This paper develops a methodological pipeline to estimate unmeasured physiological parameters and produce high-fidelity, personalized phenotypes anchored to physiological mechanics from electronic health record (EHR). METHODS: A methodological phenotyping pipeline is developed that computes new phenotypes defined with unmeasurable computational biomarkers quantifying specific physiological properties in real time. Working within the inverse problem framework, this pipeline is applied to the glucose-insulin system for ICU patients using data assimilation to estimate an established mathematical physiological model with stochastic optimization. This produces physiological model parameter vectors of clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance, estimated for individual patient. These physiological parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. These phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated with external EHR data for consistency and reliability and with clinician face validation. RESULTS: The phenotype computation was performed on a cohort of 109 ICU patients who received no or short-acting insulin therapy, rendering continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance on time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of discrete phenotypes showed coherence and consistency in clinically observable differences based on laboratory measurements and ICD 9/10 codes and clinical concordance from face validity. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83%±27%. CONCLUSION: The new physiological phenotypes computed with individual patient ICU data and defined by estimates of mechanistic model parameters have high physiological fidelity, are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door for discovering deeper physiological information to personalize medical care.


Asunto(s)
Algoritmos , Registros Electrónicos de Salud , Humanos , Reproducibilidad de los Resultados , Fenotipo , Biomarcadores , Unidades de Cuidados Intensivos
2.
J Biomed Inform ; 78: 87-101, 2018 02.
Artículo en Inglés | MEDLINE | ID: mdl-29369797

RESUMEN

We study the question of how to represent or summarize raw laboratory data taken from an electronic health record (EHR) using parametric model selection to reduce or cope with biases induced through clinical care. It has been previously demonstrated that the health care process (Hripcsak and Albers, 2012, 2013), as defined by measurement context (Hripcsak and Albers, 2013; Albers et al., 2012) and measurement patterns (Albers and Hripcsak, 2010, 2012), can influence how EHR data are distributed statistically (Kohane and Weber, 2013; Pivovarov et al., 2014). We construct an algorithm, PopKLD, which is based on information criterion model selection (Burnham and Anderson, 2002; Claeskens and Hjort, 2008), is intended to reduce and cope with health care process biases and to produce an intuitively understandable continuous summary. The PopKLD algorithm can be automated and is designed to be applicable in high-throughput settings; for example, the output of the PopKLD algorithm can be used as input for phenotyping algorithms. Moreover, we develop the PopKLD-CAT algorithm that transforms the continuous PopKLD summary into a categorical summary useful for applications that require categorical data such as topic modeling. We evaluate our methodology in two ways. First, we apply the method to laboratory data collected in two different health care contexts, primary versus intensive care. We show that the PopKLD preserves known physiologic features in the data that are lost when summarizing the data using more common laboratory data summaries such as mean and standard deviation. Second, for three disease-laboratory measurement pairs, we perform a phenotyping task: we use the PopKLD and PopKLD-CAT algorithms to define high and low values of the laboratory variable that are used for defining a disease state. We then compare the relationship between the PopKLD-CAT summary disease predictions and the same predictions using empirically estimated mean and standard deviation to a gold standard generated by clinical review of patient records. We find that the PopKLD laboratory data summary is substantially better at predicting disease state. The PopKLD or PopKLD-CAT algorithms are not meant to be used as phenotyping algorithms, but we use the phenotyping task to show what information can be gained when using a more informative laboratory data summary. In the process of evaluation our method we show that the different clinical contexts and laboratory measurements necessitate different statistical summaries. Similarly, leveraging the principle of maximum entropy we argue that while some laboratory data only have sufficient information to estimate a mean and standard deviation, other laboratory data captured in an EHR contain substantially more information than can be captured in higher-parameter models.


Asunto(s)
Algoritmos , Técnicas de Laboratorio Clínico/estadística & datos numéricos , Minería de Datos/métodos , Registros Electrónicos de Salud/estadística & datos numéricos , Ensayos Analíticos de Alto Rendimiento/métodos , Humanos , Modelos Estadísticos , Fenotipo
3.
medRxiv ; 2023 Aug 25.
Artículo en Inglés | MEDLINE | ID: mdl-37662404

RESUMEN

Objective: Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and clinical data quality. This paper develops a methodological pipeline to estimate unmeasured physiological parameters and produce high-fidelity, personalized phenotypes anchored to physiological mechanics from electronic health record (EHR). Methods: A methodological phenotyping pipeline is developed that computes new phenotypes defined with unmeasurable computational biomarkers quantifying specific physiological properties in real time. Working within the inverse problem framework, this pipeline is applied to the glucose-insulin system for ICU patients using data assimilation to estimate an established mathematical physiological model with stochastic optimization. This produces physiological model parameter vectors of clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance, estimated for individual patient. These physiological parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. These phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated with external EHR data for consistency and reliability and with clinician face validation. Results: The phenotype computation was performed on a cohort of 109 ICU patients who received no or short-acting insulin therapy, rendering continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance on time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of discrete phenotypes showed coherence and consistency in clinically observable differences based on laboratory measurements and ICD 9/10 codes and clinical concordance from face validity. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83% ± 27%. Conclusion: The new physiological phenotypes computed with individual patient ICU data and defined by estimates of mechanistic model parameters have high physiological fidelity, are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door for discovering deeper physiological information to personalize medical care.

4.
Chaos ; 22(1): 013111, 2012 Mar.
Artículo en Inglés | MEDLINE | ID: mdl-22462987

RESUMEN

This paper addresses how to calculate and interpret the time-delayed mutual information (TDMI) for a complex, diversely and sparsely measured, possibly non-stationary population of time-series of unknown composition and origin. The primary vehicle used for this analysis is a comparison between the time-delayed mutual information averaged over the population and the time-delayed mutual information of an aggregated population (here, aggregation implies the population is conjoined before any statistical estimates are implemented). Through the use of information theoretic tools, a sequence of practically implementable calculations are detailed that allow for the average and aggregate time-delayed mutual information to be interpreted. Moreover, these calculations can also be used to understand the degree of homo or heterogeneity present in the population. To demonstrate that the proposed methods can be used in nearly any situation, the methods are applied and demonstrated on the time series of glucose measurements from two different subpopulations of individuals from the Columbia University Medical Center electronic health record repository, revealing a picture of the composition of the population as well as physiological features.


Asunto(s)
Almacenamiento y Recuperación de la Información/métodos , Modelos Biológicos , Modelos Estadísticos , Dinámica Poblacional , Simulación por Computador , Humanos
5.
Chaos Solitons Fractals ; 45(6): 853-860, 2012 Jun 01.
Artículo en Inglés | MEDLINE | ID: mdl-22536009

RESUMEN

A method to estimate the time-dependent correlation via an empirical bias estimate of the time-delayed mutual information for a time-series is proposed. In particular, the bias of the time-delayed mutual information is shown to often be equivalent to the mutual information between two distributions of points from the same system separated by infinite time. Thus intuitively, estimation of the bias is reduced to estimation of the mutual information between distributions of data points separated by large time intervals. The proposed bias estimation techniques are shown to work for Lorenz equations data and glucose time series data of three patients from the Columbia University Medical Center database.

6.
Phys Lett A ; 374(9): 1159-1164, 2010 Feb 15.
Artículo en Inglés | MEDLINE | ID: mdl-20544004

RESUMEN

Statistical physics and information theory is applied to the clinical chemistry measurements present in a patient database containing 2.5 million patients' data over a 20-year period. Despite the seemingly naive approach of aggregating all patients over all times (with respect to particular clinical chemistry measurements), both a diurnal signal in the decay of the time-delayed mutual information and the presence of two sub-populations with differing health are detected. This provides a proof in principle that the highly fragmented data in electronic health records has potential for being useful in defining disease and human phenotypes.

7.
Phys Rev E Stat Nonlin Soft Matter Phys ; 74(5 Pt 2): 057201, 2006 Nov.
Artículo en Inglés | MEDLINE | ID: mdl-17280024

RESUMEN

An extensive statistical survey of universal approximators shows that as the dimension of a typical dissipative dynamical system is increased, the number of positive Lyapunov exponents increases monotonically and the number of parameter windows with periodic behavior decreases. A subset of parameter space remains where noncatastrophic topological change induced by a small parameter variation becomes inevitable. A geometric mechanism depending on dimension and an associated conjecture depict why topological change is expected but not catastrophic, thus providing an explanation of how and why deterministic chaos persists in high dimensions.

8.
PLoS One ; 9(6): e96443, 2014.
Artículo en Inglés | MEDLINE | ID: mdl-24933368

RESUMEN

Using glucose time series data from a well measured population drawn from an electronic health record (EHR) repository, the variation in predictability of glucose values quantified by the time-delayed mutual information (TDMI) was explained using a mechanistic endocrine model and manual and automated review of written patient records. The results suggest that predictability of glucose varies with health state where the relationship (e.g., linear or inverse) depends on the source of the acuity. It was found that on a fine scale in parameter variation, the less insulin required to process glucose, a condition that correlates with good health, the more predictable glucose values were. Nevertheless, the most powerful effect on predictability in the EHR subpopulation was the presence or absence of variation in health state, specifically, in- and out-of-control glucose versus in-control glucose. Both of these results are clinically and scientifically relevant because the magnitude of glucose is the most commonly used indicator of health as opposed to glucose dynamics, thus providing for a connection between a mechanistic endocrine model and direct insight to human health via clinically collected data.


Asunto(s)
Biomarcadores/análisis , Sistema Endocrino/fisiología , Glucosa/análisis , Fenotipo , Simulación por Computador , Registros Electrónicos de Salud , Humanos , Modelos Biológicos , Modelos Estadísticos , Evaluación de Resultado en la Atención de Salud
9.
PLoS One ; 7(12): e48058, 2012.
Artículo en Inglés | MEDLINE | ID: mdl-23272040

RESUMEN

Studying physiology and pathophysiology over a broad population for long periods of time is difficult primarily because collecting human physiologic data can be intrusive, dangerous, and expensive. One solution is to use data that have been collected for a different purpose. Electronic health record (EHR) data promise to support the development and testing of mechanistic physiologic models on diverse populations and allow correlation with clinical outcomes, but limitations in the data have thus far thwarted such use. For example, using uncontrolled population-scale EHR data to verify the outcome of time dependent behavior of mechanistic, constructive models can be difficult because: (i) aggregation of the population can obscure or generate a signal, (ii) there is often no control population with a well understood health state, and (iii) diversity in how the population is measured can make the data difficult to fit into conventional analysis techniques. This paper shows that it is possible to use EHR data to test a physiological model for a population and over long time scales. Specifically, a methodology is developed and demonstrated for testing a mechanistic, time-dependent, physiological model of serum glucose dynamics with uncontrolled, population-scale, physiological patient data extracted from an EHR repository. It is shown that there is no observable daily variation the normalized mean glucose for any EHR subpopulations. In contrast, a derived value, daily variation in nonlinear correlation quantified by the time-delayed mutual information (TDMI), did reveal the intuitively expected diurnal variation in glucose levels amongst a random population of humans. Moreover, in a population of continuously (tube) fed patients, there was no observable TDMI-based diurnal signal. These TDMI-based signals, via a glucose insulin model, were then connected with human feeding patterns. In particular, a constructive physiological model was shown to correctly predict the difference between the general uncontrolled population and a subpopulation whose feeding was controlled.


Asunto(s)
Registros Electrónicos de Salud , Sistema Endocrino/fisiología , Estudios de Cohortes , Simulación por Computador , Recolección de Datos , Interpretación Estadística de Datos , Glucosa/metabolismo , Humanos , Insulina/metabolismo , Modelos Estadísticos , Modelos Teóricos , Evaluación de Resultado en la Atención de Salud , Investigación , Proyectos de Investigación , Programas Informáticos , Estados Unidos
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA