1.
J Am Heart Assoc ; 6(4)2017 Apr 24.
Article in English | MEDLINE | ID: mdl-28438733

ABSTRACT

BACKGROUND: Clinicians who are using the Framingham Risk Score (FRS) or the American College of Cardiology/American Heart Association Pooled Cohort Equations (PCE) to estimate risk for their patients based on electronic health data (EHD) face 4 questions. (1) Do published risk scores applied to EHD yield accurate estimates of cardiovascular risk? (2) Are FRS risk estimates, which are based on data that are up to 45 years old, valid for a contemporary patient population seeking routine care? (3) Do the PCE make the FRS obsolete? (4) Does refitting the risk score using EHD improve the accuracy of risk estimates? METHODS AND RESULTS: Data were extracted from the EHD of 84 116 adults aged 40 to 79 years who received care at a large healthcare delivery and insurance organization between 2001 and 2011. We assessed calibration and discrimination for 4 risk scores: published versions of FRS and PCE and versions obtained by refitting models using a subset of the available EHD. The published FRS was well calibrated (calibration statistic K=9.1, miscalibration ranging from 0% to 17% across risk groups), but the PCE displayed modest evidence of miscalibration (calibration statistic K=43.7, miscalibration from 9% to 31%). Discrimination was similar in both models (C-index=0.740 for FRS, 0.747 for PCE). Refitting the published models using EHD did not substantially improve calibration or discrimination. CONCLUSIONS: We conclude that published cardiovascular risk models can be successfully applied to EHD to estimate cardiovascular risk; the FRS remains valid and is not obsolete; and model refitting does not meaningfully improve the accuracy of risk estimates.
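The calibration and discrimination checks described above can be sketched in a few lines. The snippet below uses simulated stand-in data and generic measures (risk-decile calibration and a binary-outcome C-statistic); it is an illustration of the general idea only, not the paper's calibration statistic K or its exact C-index computation.

```python
# Illustrative sketch only (not the paper's code): risk-decile calibration
# and a C-statistic on simulated stand-in data. Censoring is ignored here,
# so the C-statistic reduces to the AUC for a binary outcome.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
pred_risk = rng.uniform(0.01, 0.40, size=5000)   # stand-in predicted risk
event = rng.binomial(1, pred_risk)               # stand-in observed events

df = pd.DataFrame({"pred_risk": pred_risk, "event": event})
df["decile"] = pd.qcut(df["pred_risk"], 10, labels=False)

# Calibration: mean predicted risk vs. observed event rate within each decile.
calibration = df.groupby("decile").agg(
    mean_predicted=("pred_risk", "mean"),
    observed_rate=("event", "mean"),
)
print(calibration)

# Discrimination: C-statistic (equivalent to the AUC with binary outcomes).
print("C-statistic:", roc_auc_score(df["event"], df["pred_risk"]))
```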


Subject(s)
Cardiovascular Diseases/epidemiology, Electronic Health Records, Risk Assessment, Acute Coronary Syndrome/epidemiology, Acute Coronary Syndrome/mortality, Adult, Aged, Cardiovascular Diseases/mortality, Coronary Disease/epidemiology, Coronary Disease/mortality, Female, Heart Failure/epidemiology, Heart Failure/mortality, Humans, Male, Middle Aged, Myocardial Infarction/epidemiology, Myocardial Infarction/mortality, Peripheral Arterial Disease/epidemiology, Peripheral Arterial Disease/mortality, Stroke/epidemiology, Stroke/mortality
2.
J Biomed Inform ; 61: 119-31, 2016 06.
Article in English | MEDLINE | ID: mdl-26992568

ABSTRACT

Models for predicting the probability of experiencing various health outcomes or adverse events over a certain time frame (e.g., having a heart attack in the next 5 years) based on individual patient characteristics are important tools for managing patient care. Electronic health data (EHD) are appealing sources of training data because they provide access to large amounts of rich individual-level data from present-day patient populations. However, because EHD are derived by extracting information from administrative and clinical databases, some fraction of subjects will not be under observation for the entire time frame over which one wants to make predictions; this loss to follow-up is often due to disenrollment from the health system. For subjects without complete follow-up, whether or not they experienced the adverse event is unknown, and in statistical terms the event time is said to be right-censored. Most machine learning approaches to the problem have been relatively ad hoc; for example, common approaches for handling observations in which the event status is unknown include (1) discarding those observations, (2) treating them as non-events, and (3) splitting those observations into two observations: one where the event occurs and one where the event does not. In this paper, we present a general-purpose approach to account for right-censored outcomes using inverse probability of censoring weighting (IPCW). We illustrate how IPCW can easily be incorporated into a number of existing machine learning algorithms used to mine big health care data including Bayesian networks, k-nearest neighbors, decision trees, and generalized additive models. We then show that our approach leads to better calibrated predictions than the three ad hoc approaches when applied to predicting the 5-year risk of experiencing a cardiovascular adverse event, using EHD from a large U.S. Midwestern healthcare system.
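A minimal sketch of the IPCW idea follows, assuming simulated data, a Kaplan-Meier model of the censoring distribution, and a logistic-regression learner as a stand-in for the algorithms named above; the variable names and the 5-year horizon are assumptions, not the authors' implementation.

```python
# Minimal IPCW sketch (an illustration of the general idea, not the paper's code).
import numpy as np
from lifelines import KaplanMeierFitter
from sklearn.linear_model import LogisticRegression

tau = 5.0  # prediction horizon in years (assumed)

# Simulated stand-in data: follow-up time, event indicator, one covariate.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
event_time = rng.exponential(scale=8.0 / np.exp(0.5 * x))
censor_time = rng.uniform(0.5, 10.0, size=n)
time = np.minimum(event_time, censor_time)
event = (event_time <= censor_time).astype(int)

# Estimate the censoring survival function S_C(t) by Kaplan-Meier,
# treating censoring as the "event".
kmf_c = KaplanMeierFitter()
kmf_c.fit(time, event_observed=1 - event)

# Keep subjects whose 5-year status is known: events before tau, or
# follow-up past tau. Subjects censored event-free before tau are dropped
# (implicitly receiving weight 0).
known = ((event == 1) & (time <= tau)) | (time >= tau)
y = ((event == 1) & (time <= tau)).astype(int)[known]
eval_time = np.minimum(time[known], tau)
weights = 1.0 / kmf_c.survival_function_at_times(eval_time).to_numpy()

# Any learner that accepts per-observation weights can be plugged in here;
# logistic regression stands in for the Bayesian networks, k-NN, trees,
# and GAMs discussed in the abstract.
clf = LogisticRegression()
clf.fit(x[known].reshape(-1, 1), y, sample_weight=weights)
print("5-year risk for x=0:", clf.predict_proba([[0.0]])[0, 1])
```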


Subject(s)
Cluster Analysis, Electronic Health Records, Machine Learning, Algorithms, Bayes Theorem, Humans, Probability
3.
Stat Med ; 34(21): 2941-57, 2015 Sep 20.
Article in English | MEDLINE | ID: mdl-25980520

ABSTRACT

Predicting an individual's risk of experiencing a future clinical outcome is a statistical task with important consequences for both practicing clinicians and public health experts. Modern observational databases such as electronic health records provide an alternative to the longitudinal cohort studies traditionally used to construct risk models, bringing with them both opportunities and challenges. Large sample sizes and detailed covariate histories enable the use of sophisticated machine learning techniques to uncover complex associations and interactions, but observational databases are often 'messy', with high levels of missing data and incomplete patient follow-up. In this paper, we propose an adaptation of the well-known Naive Bayes machine learning approach to time-to-event outcomes subject to censoring. We compare the predictive performance of our method with the Cox proportional hazards model which is commonly used for risk prediction in healthcare populations, and illustrate its application to prediction of cardiovascular risk using an electronic health record dataset from a large Midwest integrated healthcare system.
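The Cox proportional hazards baseline mentioned above can be fitted as follows with the lifelines library on simulated stand-in data; the covariates are hypothetical, and the paper's censored Naive Bayes adaptation itself is not reproduced here.

```python
# Sketch of a Cox proportional hazards baseline on simulated data
# (illustrative only; not the authors' dataset or model specification).
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n = 1000
age = rng.normal(60, 10, n)
sbp = rng.normal(130, 15, n)
risk = 0.03 * (age - 60) + 0.02 * (sbp - 130)
event_time = rng.exponential(scale=20.0 / np.exp(risk))
censor_time = rng.uniform(1.0, 15.0, n)

df = pd.DataFrame({
    "time": np.minimum(event_time, censor_time),
    "event": (event_time <= censor_time).astype(int),
    "age": age,
    "sbp": sbp,
})

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.print_summary()

# Predicted 5-year event risk = 1 - S(5 | covariates).
surv5 = cph.predict_survival_function(df[["age", "sbp"]], times=[5.0])
print("mean predicted 5-year risk:", (1 - surv5.loc[5.0]).mean())
```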


Subject(s)
Bayes Theorem, Biometry/methods, Proportional Hazards Models, Risk Assessment/methods, Cardiovascular Diseases/epidemiology, Computer Simulation, Databases, Factual, Delivery of Health Care, Integrated, Electronic Health Records, Humans, Longitudinal Studies, Machine Learning, Midwestern United States/epidemiology, Risk, Space-Time Clustering
4.
Nat Methods ; 7(12): 1017-24, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21076421

ABSTRACT

Global quantitative analysis of genetic interactions is a powerful approach for deciphering the roles of genes and mapping functional relationships among pathways. Using colony size as a proxy for fitness, we developed a method for measuring fitness-based genetic interactions from high-density arrays of yeast double mutants generated by synthetic genetic array (SGA) analysis. We identified several experimental sources of systematic variation and developed normalization strategies to obtain accurate single- and double-mutant fitness measurements, which rival the accuracy of other high-resolution studies. We applied the SGA score to examine the relationship between physical and genetic interaction networks, and we found that positive genetic interactions connect across functionally distinct protein complexes, revealing a network of genetic suppression among loss-of-function alleles.
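A toy illustration of the multiplicative definition of a genetic interaction score (epsilon = f_ab - f_a * f_b, with colony size standing in for fitness) is shown below; the published SGA score also includes several normalization steps for systematic variation that this sketch omits.

```python
# Toy illustration of a multiplicative genetic interaction score:
# epsilon = f_ab - f_a * f_b, with colony size used as a proxy for fitness.
# The full SGA score also corrects for plate, position, and batch effects,
# which this sketch does not attempt.

def interaction_score(double_fitness, fitness_a, fitness_b):
    """Negative epsilon suggests a synthetic sick/lethal interaction;
    positive epsilon suggests suppression or alleviation."""
    return double_fitness - fitness_a * fitness_b

# Hypothetical single- and double-mutant fitness values (wild type = 1.0).
f_a, f_b = 0.8, 0.9
print(round(interaction_score(0.40, f_a, f_b), 2))  # -0.32: negative interaction
print(round(interaction_score(0.72, f_a, f_b), 2))  # ~0: no interaction
print(round(interaction_score(0.85, f_a, f_b), 2))  # 0.13: positive interaction
```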


Subject(s)
Genetic Fitness, Genome, Fungal, Yeasts/genetics, Algorithms, Gene Expression Regulation, Fungal, Genome-Wide Association Study/methods, Mutagenesis, Mutation, Oligonucleotide Array Sequence Analysis/methods, Ultraviolet Rays, Yeasts/radiation effects
5.
BMC Evol Biol ; 10: 357, 2010 Nov 18.
Article in English | MEDLINE | ID: mdl-21087504

ABSTRACT

BACKGROUND: Gene duplication can lead to genetic redundancy, which masks the function of mutated genes in genetic analyses. Methods to increase sensitivity in identifying genetic redundancy can improve the efficiency of reverse genetics and lend insights into the evolutionary outcomes of gene duplication. Machine learning techniques are well suited to classifying gene family members into redundant and non-redundant gene pairs in model species where sufficient genetic and genomic data are available, such as Arabidopsis thaliana, the test case used here. RESULTS: Machine learning techniques that combine multiple attributes led to a dramatic improvement in predicting genetic redundancy over single trait classifiers alone, such as BLAST E-values or expression correlation. In withholding analysis, one of the methods used here, Support Vector Machines, was two-fold more precise than single attribute classifiers, reaching a level where the majority of redundant calls were correctly labeled. Using this higher confidence in identifying redundancy, machine learning predicts that about half of all genes in Arabidopsis showed the signature of predicted redundancy with at least one but typically fewer than three other family members. Interestingly, a large proportion of predicted redundant gene pairs were relatively old duplications (e.g., Ks > 1), suggesting that redundancy is stable over long evolutionary periods. CONCLUSIONS: Machine learning predicts that most genes will have a functionally redundant paralog but will exhibit redundancy with relatively few genes within a family. The predictions and gene pair attributes for Arabidopsis provide a new resource for research in genetics and genome evolution. These techniques can now be applied to other organisms.
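For orientation, the sketch below trains a support vector machine on several simulated gene-pair attributes and compares it with a single-attribute threshold rule; the feature names and data are illustrative assumptions, not the attributes or Arabidopsis dataset used in the paper.

```python
# Illustrative comparison: multi-attribute SVM vs. a single-attribute rule
# for calling gene pairs redundant. All features and data are simulated.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import precision_score

rng = np.random.default_rng(3)
n = 1500
blast_log_evalue = rng.normal(-20, 10, n)   # log10 BLAST E-value (assumed feature)
expr_correlation = rng.uniform(-1, 1, n)    # expression correlation (assumed feature)
ks = rng.uniform(0, 3, n)                   # synonymous substitution rate (assumed)
logit = -0.08 * blast_log_evalue + 1.5 * expr_correlation - 0.5 * ks - 1.0
redundant = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([blast_log_evalue, expr_correlation, ks])
X_tr, X_te, y_tr, y_te = train_test_split(X, redundant, random_state=0)

# Multi-attribute SVM classifier.
svm = make_pipeline(StandardScaler(), SVC())
svm.fit(X_tr, y_tr)
print("SVM precision:", precision_score(y_te, svm.predict(X_te)))

# Single-attribute baseline: call a pair redundant if expression correlation
# exceeds a fixed threshold.
baseline_pred = (X_te[:, 1] > 0.5).astype(int)
print("single-attribute precision:", precision_score(y_te, baseline_pred))
```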


Subject(s)
Artificial Intelligence, Gene Duplication, Algorithms, Arabidopsis/genetics, Bayes Theorem, Gene Expression Regulation, Plant, Genes, Plant, Genome, Plant, Logistic Models, Multigene Family, ROC Curve