ABSTRACT
Heritability is essential for understanding the biological causes of disease but requires laborious patient recruitment and phenotype ascertainment. Electronic health records (EHRs) passively capture a wide range of clinically relevant data and provide a resource for studying the heritability of traits that are not typically accessible. EHRs contain next-of-kin information collected via patient emergency contact forms, but until now, these data have gone unused in research. We mined emergency contact data at three academic medical centers and identified 7.4 million familial relationships while maintaining patient privacy. Identified relationships were consistent with genetically derived relatedness. We used EHR data to compute heritability estimates for 500 disease phenotypes. Overall, estimates were consistent with the literature and between sites. Inconsistencies were indicative of limitations and opportunities unique to EHR research. These analyses provide a validation of the use of EHRs for genetics and disease research.
Subjects
Electronic Health Records; Genetic Diseases, Inborn/genetics; Algorithms; Databases, Factual; Family Relations; Genetic Diseases, Inborn/pathology; Genotype; Humans; Pedigree; Phenotype; Quantitative Trait, Heritable
ABSTRACT
PURPOSE: The U.S. Food and Drug Administration's Sentinel Initiative "modular programs" have been shown to replicate findings from conventional protocol-driven, custom-programmed studies. One such parallel assessment (dabigatran and warfarin and selected outcomes) produced concordant findings for three of four study outcomes. The effect estimates and confidence intervals for the fourth outcome, acute myocardial infarction, varied more than those for the other outcomes. This paper evaluates the potential sources of the variability that led to this unexpected divergence in findings. METHODS: We systematically compared the two studies and evaluated programming differences and their potential impact using a different dataset that allowed more granular data access for investigation. We reviewed the output at each of five main processing steps common to both study programs: cohort identification, propensity score estimation, propensity score matching, patient follow-up, and risk estimation. RESULTS: Our findings point to several design features that warrant greater investigator attention when performing observational database studies: (a) treatment of recorded events (e.g., diagnoses, procedures, and dispensings) co-occurring on the index date of study drug dispensing in cohort eligibility criteria and propensity score estimation and (b) construction of treatment episodes for study drugs of interest that have more complex dispensing patterns. CONCLUSIONS: More precise and unambiguous operational definitions of all study parameters will increase transparency and reproducibility in observational database studies.
Subjects
Dabigatran/therapeutic use; Myocardial Infarction/epidemiology; Pharmacoepidemiology/standards; Product Surveillance, Postmarketing/statistics & numerical data; Warfarin/therapeutic use; Cohort Studies; Dabigatran/administration & dosage; Data Interpretation, Statistical; Databases, Factual; Myocardial Infarction/prevention & control; Pharmacoepidemiology/statistics & numerical data; Propensity Score; Reproducibility of Results; United States; United States Food and Drug Administration; Warfarin/administration & dosage
ABSTRACT
Networks of constellations of longitudinal observational databases, often electronic medical records or transactional insurance claims or both, are increasingly being used for studying the effects of medicinal products in real-world use. Such databases are frequently configured as distributed networks. That is, patient-level data are kept behind firewalls and are not communicated outside of the data vendor other than in aggregate form. Instead, data are standardized across the network, queries of the network are executed locally by data partners, and summary results are provided to one or more central research partners for amalgamation, aggregation, and summarization. Such networks can be huge, covering years of data on upwards of 100 million patients. Examples of such networks include the FDA Sentinel Network, ASPEN, CNODES, and EU-ADR. As this is an emerging field, we note in this paper the conceptual similarities and differences between the analysis of distributed networks and the now well-established field of meta-analysis of randomized clinical trials (RCTs). We recommend applying lessons from meta-analysis, wherever appropriate, to help guide the development of distributed network analyses of longitudinal observational databases.
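As a rough illustration of the distributed-network pattern described in this abstract, the following Python sketch runs a query locally at each data partner and sends only aggregate counts to a central step. The record fields (`event`, `years`), the `SiteSummary` type, the site names, and the crude pooling rule are all hypothetical choices made for this example; they are not taken from Sentinel, ASPEN, CNODES, or EU-ADR.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SiteSummary:
    """Aggregate results that are allowed to leave a data partner."""
    site: str
    events: int
    person_years: float


def run_local_query(site: str, records: List[Dict]) -> SiteSummary:
    """Executed behind the partner's firewall; patient rows never leave."""
    events = sum(r["event"] for r in records)
    person_years = sum(r["years"] for r in records)
    return SiteSummary(site, events, person_years)


def pool(summaries: List[SiteSummary]) -> float:
    """Central step: crude pooled incidence rate per 1,000 person-years."""
    total_events = sum(s.events for s in summaries)
    total_py = sum(s.person_years for s in summaries)
    return 1000 * total_events / total_py


# Hypothetical patient-level data held separately at two sites.
site_a = [{"event": 1, "years": 2.0}, {"event": 0, "years": 3.0}]
site_b = [{"event": 1, "years": 4.0}, {"event": 1, "years": 1.0}]

summaries = [run_local_query("site_a", site_a), run_local_query("site_b", site_b)]
pooled_rate = pool(summaries)
print(pooled_rate)
```

Crude pooling of this kind ignores between-site heterogeneity; a meta-analytic approach would instead weight site-level estimates, which is the kind of borrowing from meta-analysis that the abstract recommends.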
Subjects
Computer Communication Networks/statistics & numerical data; Data Mining/statistics & numerical data; Databases, Factual/statistics & numerical data; Meta-Analysis as Topic; Observational Studies as Topic/statistics & numerical data; Randomized Controlled Trials as Topic/statistics & numerical data; Research Design/statistics & numerical data; Adverse Drug Reaction Reporting Systems/statistics & numerical data; Angioedema/chemically induced; Angioedema/diagnosis; Angioedema/epidemiology; Angiotensin-Converting Enzyme Inhibitors/adverse effects; Data Accuracy; Data Interpretation, Statistical; Data Mining/methods; Humans; Observational Studies as Topic/methods; Randomized Controlled Trials as Topic/methods; Risk Assessment; Risk Factors
ABSTRACT
Approaches to comparing safety and efficacy of interventions include analyzing data from randomized controlled trials (RCTs), registries and observational databases (ODBs). RCTs are regarded as the gold standard but data from such trials are sometimes unavailable because a disease is uncommon, because the intervention is uncommon, because of structural limitations or because randomization cannot be done for practical or (seemingly) ethical reasons. There are many examples of an unproved intervention being so widely believed to be effective that clinical trialists and potential subjects decline randomization. Often, when an RCT is finally done the intervention is proved ineffective or even harmful. These situations are termed medical reversals and are not uncommon [1,2]. There is also the dilemma of seemingly similar RCTs reporting discordant conclusions. Data from high-quality registries and especially ODBs can be used when data from RCTs are unavailable, but they also have limitations. Biases and confounding co-variates may be unknown, difficult or impossible to identify and/or difficult to adjust for adequately. However, ODBs sometimes have large numbers of diverse subjects and often give answers more useful to clinicians than RCTs. Side-by-side comparisons suggest analyses from high-quality ODBs often give conclusions similar to those from high-quality RCTs. Meta-analyses combining data from RCTs, registries and ODBs are sometimes appropriate. We suggest increased use of registries and ODBs to compare efficacy of interventions.
Subjects
Randomized Controlled Trials as Topic; Humans; Registries; Databases as Topic
ABSTRACT
PURPOSE: Atrial fibrillation/flutter (AF) is frequently associated with cardiovascular comorbidities. Observational health care databases are commonly used for research purposes in studies of quality of care, health economics, outcomes research, drug safety, and epidemiology. This retrospective cohort study applied a common data model to administrative claims data (Truven Health Analytics MarketScan® claims databases [MS-Claims]) and electronic medical records data (Geisinger Health System's MedMining electronic medical record database [MG-EMR]) to examine the risk of cardiovascular hospitalization and all-cause mortality in relation to clinical risk factors in recent-onset AF and to assess the consistency of analyses for each data source. METHODS: Cohorts of patients with newly diagnosed AF (n=105,262 [MS-Claims] and n=3,919 [MG-EMR]) and demographically similar patients without AF (n=105,262 [MS-Claims] and n=3,872 [MG-EMR]) were followed from the qualifying AF diagnosis until cardiovascular hospitalization, death, database disenrollment, or study completion. A common data model standardized the data in structure, format, content, and nomenclature to allow for systematic assessment and comparison of outcomes from two disparate data sets. RESULTS: In both databases, AF patients had greater overall baseline comorbidity and higher incidence rates of cardiovascular hospitalization (threefold higher) and all-cause mortality (46% higher) than non-AF patients. For AF patients, incidence rates of cardiovascular hospitalization and all-cause mortality were increased by the concomitant presence of coronary disease, chronic obstructive pulmonary disease, and stroke at baseline. Overall, the pattern of cardiovascular hospitalization in the MS-Claims database was similar to that in the MG-EMR database. Compared with the MS-Claims database, the use of cardiovascular medications and the capture of certain comorbidities among AF patients appeared to be higher in the MG-EMR data set. CONCLUSION: Similar standardized analyses across EMR and Claims databases were consistent in the association of AF with acute morbidity and an increased risk of all-cause mortality. Areas of inconsistency were due to differences in underlying population demographics and cardiovascular risks and completeness of certain data fields.
ABSTRACT
Observational healthcare databases represent a valuable resource for health economics, outcomes research, quality of care, drug safety, epidemiology and comparative effectiveness research. The methods used to identify a study population with the drug exposures of interest in an observational healthcare database are complex and are neither consistent nor apparent in the published literature. Our research evaluates three drug classification systems and their impact on prevalence in the analysis of observational healthcare databases, using opioids as a case in point. The standard terminologies compiled in the Observational Medical Outcomes Partnership's Common Data Model vocabulary were used to facilitate the identification of populations with opioid exposures. This study analyzed three distinct observational healthcare databases and identified patients with at least one exposure to an opioid as defined by drug codes derived through the application of three classification systems. Opioid code sets were created for each of the three classification systems and the number of identified codes was summarized. We estimated the prevalence of opioid exposure in three observational healthcare databases using the three defined code sets. In addition, we compared the number of drug codes and distinct ingredients that were identified using these classification systems. We found substantial variation in the prevalence of opioid exposure identified using an individual classification system versus a composite method using multiple classification systems. To ensure transparent and reproducible research, publications should include a description of the process used to develop code sets and the complete code set used in studies.
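The comparison in this abstract, prevalence under each individual code set versus a composite of all three, can be sketched in a few lines of Python. Everything concrete here is a made-up assumption for illustration: the classification system names (`system_a`, `system_b`, `system_c`), the drug codes, and the five patients are not from the study, and the composite is assumed to be the simple union of the three code sets.

```python
from typing import Dict, Set


def exposure_prevalence(patient_codes: Dict[str, Set[str]], code_set: Set[str]) -> float:
    """Fraction of patients with at least one drug code in code_set."""
    if not patient_codes:
        return 0.0
    exposed = sum(1 for codes in patient_codes.values() if codes & code_set)
    return exposed / len(patient_codes)


# Hypothetical opioid code sets, one per classification system.
code_sets: Dict[str, Set[str]] = {
    "system_a": {"D001", "D002"},
    "system_b": {"D002", "D003"},
    "system_c": {"D004"},
}
# Assumed composite method: union of all three code sets.
composite: Set[str] = set().union(*code_sets.values())

# Hypothetical patients mapped to the drug codes on their records.
patients: Dict[str, Set[str]] = {
    "p1": {"D001"},
    "p2": {"D003", "X999"},
    "p3": {"X100"},
    "p4": {"D004"},
    "p5": set(),
}

prevalence = {name: exposure_prevalence(patients, cs) for name, cs in code_sets.items()}
prevalence["composite"] = exposure_prevalence(patients, composite)
print(prevalence)
```

In this toy data each individual system finds one exposed patient in five, while the composite finds three, which mirrors the direction of the variation the abstract reports without reproducing its magnitude.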