Results 1 - 20 of 64
2.
J Am Med Inform Assoc ; 31(1): 139-153, 2023 Dec 22.
Article in English | MEDLINE | ID: mdl-37885303

ABSTRACT

OBJECTIVE: The All of Us Research Program (All of Us) aims to recruit over a million participants to further precision medicine. Essential to the verification of biobanks is a replication of known associations to establish validity. Here, we evaluated how well All of Us data replicated known cigarette smoking associations. MATERIALS AND METHODS: We defined smoking exposure as follows: (1) an EHR Smoking exposure that used International Classification of Diseases codes; (2) participant provided information (PPI) Ever Smoking; and (3) PPI Current Smoking, both from the lifestyle survey. We performed a phenome-wide association study (PheWAS) for each smoking exposure measurement type. For each, we compared the effect sizes derived from the PheWAS to published meta-analyses from PubMed that studied cigarette smoking. We defined two levels of replication of meta-analyses: (1) nominally replicated, which required agreement of direction of effect size, and (2) fully replicated, which required overlap of confidence intervals. RESULTS: PheWASes with EHR Smoking, PPI Ever Smoking, and PPI Current Smoking revealed 736, 492, and 639 phenome-wide significant associations, respectively. We identified 165 meta-analyses representing 99 distinct phenotypes that could be matched to EHR phenotypes. At P < .05, 74 were nominally replicated and 55 were fully replicated. At P < 2.68 × 10⁻⁵ (Bonferroni threshold), 58 were nominally replicated and 40 were fully replicated. DISCUSSION: Most phenotypes found in published meta-analyses associated with smoking were nominally replicated in All of Us. Both survey and EHR definitions for smoking produced similar results. CONCLUSION: This study demonstrated the feasibility of studying common exposures using All of Us data.


Subjects
Genome-Wide Association Study; Population Health; Humans; Genome-Wide Association Study/methods; Phenotype; Polymorphism, Single Nucleotide; Smoking
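
The two replication levels and the Bonferroni threshold described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `Estimate` class, the function names, and the example odds ratios are hypothetical, and "fully replicated" is implemented only as confidence-interval overlap, exactly as the abstract words it. The figure of 1,866 tests is inferred from the arithmetic 0.05 / 2.68 × 10⁻⁵ and is not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """An odds ratio with its confidence interval (hypothetical container)."""
    odds_ratio: float
    ci_low: float
    ci_high: float


def nominally_replicated(phewas: Estimate, published: Estimate) -> bool:
    # Same direction of effect: both odds ratios above 1 or both below 1.
    return (phewas.odds_ratio - 1.0) * (published.odds_ratio - 1.0) > 0


def fully_replicated(phewas: Estimate, published: Estimate) -> bool:
    # The two confidence intervals share at least one point.
    return phewas.ci_low <= published.ci_high and published.ci_low <= phewas.ci_high


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    # Per-test significance level that keeps the family-wise error rate at alpha.
    return alpha / n_tests


# Illustrative values only; they are not taken from the study.
phewas_hit = Estimate(odds_ratio=1.8, ci_low=1.5, ci_high=2.2)
meta_analysis = Estimate(odds_ratio=2.4, ci_low=2.0, ci_high=2.9)
print(nominally_replicated(phewas_hit, meta_analysis))
print(fully_replicated(phewas_hit, meta_analysis))
print(bonferroni_threshold(0.05, 1866))
```

Keeping the two checks as separate functions mirrors the abstract's two-level definition and makes it easy to count each level independently.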
3.
Genet Med ; 25(12): 100966, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37622442

ABSTRACT

PURPOSE: Automated use of electronic health records may aid in decreasing the diagnostic delay for rare diseases. The phenotype risk score (PheRS) is a weighted aggregate of syndromically related phenotypes that measures the similarity between an individual's conditions and features of a disease. For some diseases, there are individuals without a diagnosis of that disease who have scores similar to diagnosed patients. These individuals may have that disease but not yet be diagnosed. METHODS: We calculated the PheRS for cystic fibrosis (CF) for 965,626 subjects in the Vanderbilt University Medical Center electronic health record. RESULTS: Of the 400 subjects with the highest PheRS for CF, 248 (62%) had been diagnosed with CF. Twenty-six of the remaining participants, those who were alive and had DNA available in the linked DNA biobank, underwent clinical review and sequencing analysis of CFTR and SERPINA1. This uncovered a potential diagnosis for 2 subjects, 1 with CF and 1 with alpha-1-antitrypsin deficiency. An additional 7 subjects had pathogenic or likely pathogenic variants, 2 in CFTR and 5 in SERPINA1. CONCLUSION: These findings may be clinically actionable for the providers caring for these patients. Importantly, this study highlights feasibility and challenges for future implications of this approach.


Subjects
Cystic Fibrosis Transmembrane Conductance Regulator; Cystic Fibrosis; Humans; Cystic Fibrosis Transmembrane Conductance Regulator/genetics; Electronic Health Records; Delayed Diagnosis; Cystic Fibrosis/diagnosis; Cystic Fibrosis/genetics; Cystic Fibrosis/pathology; DNA; Mutation
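
The abstract above describes the PheRS as a weighted aggregate of disease-related phenotypes. A minimal sketch of that idea follows. The weighting scheme (negative log of each phenotype's prevalence, so that rarer features count more) is an assumption made for illustration, since the abstract does not specify the weights; the phenotype labels and prevalences are hypothetical.

```python
import math
from typing import Dict, Iterable, Set


def phenotype_risk_score(
    patient_phenotypes: Set[str],
    disease_phenotypes: Iterable[str],
    prevalence: Dict[str, float],
) -> float:
    """Sum the weights of disease-related phenotypes present in the patient.

    Assumed weight: -log(prevalence), so rare phenotypes contribute more.
    """
    score = 0.0
    for phenotype in disease_phenotypes:
        if phenotype in patient_phenotypes:
            score += -math.log(prevalence[phenotype])
    return score


# Hypothetical feature list and cohort prevalences, for illustration only.
disease_features = ["bronchiectasis", "pancreatic_insufficiency", "nasal_polyps"]
cohort_prevalence = {
    "bronchiectasis": 0.005,
    "pancreatic_insufficiency": 0.002,
    "nasal_polyps": 0.01,
}
patient = {"bronchiectasis", "nasal_polyps", "hypertension"}
print(phenotype_risk_score(patient, disease_features, cohort_prevalence))
```

Ranking every record by this score and reviewing the top of the list is the kind of screening step the abstract reports for the 400 highest-scoring subjects.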
4.
PLoS One ; 18(5): e0283553, 2023.
Article in English | MEDLINE | ID: mdl-37196047

ABSTRACT

OBJECTIVE: Diverticular disease (DD) is one of the most prevalent conditions encountered by gastroenterologists, affecting ~50% of Americans before the age of 60. Our aim was to identify genetic risk variants and clinical phenotypes associated with DD, leveraging multiple electronic health record (EHR) data sources of 91,166 multi-ancestry participants with a Natural Language Processing (NLP) technique. MATERIALS AND METHODS: We developed an NLP-enriched phenotyping algorithm that incorporated colonoscopy or abdominal imaging reports to identify patients with diverticulosis and diverticulitis from multicenter EHRs. We performed genome-wide association studies (GWAS) of DD in European, African and multi-ancestry participants, followed by phenome-wide association studies (PheWAS) of the risk variants to identify their potential comorbid/pleiotropic effects in clinical phenotypes. RESULTS: Our developed algorithm showed a significant improvement in patient classification performance for DD analysis (algorithm PPVs ≥ 0.94), with up to a 3.5-fold increase in the number of identified patients compared with the traditional method. Ancestry-stratified analyses of diverticulosis and diverticulitis of the identified subjects replicated the well-established associations between ARHGAP15 loci and DD, showing overall intensified GWAS signals in diverticulitis patients compared to diverticulosis patients. Our PheWAS analyses identified significant associations between the DD GWAS variants and circulatory system, genitourinary, and neoplastic EHR phenotypes. DISCUSSION: As the first multi-ancestry GWAS-PheWAS study, we showcased that heterogeneous EHR data can be mapped through an integrative analytical pipeline and reveal significant genotype-phenotype associations with clinical interpretation.
CONCLUSION: A systematic framework to process unstructured EHR data with NLP could advance a deep and scalable phenotyping for better patient identification and facilitate etiological investigation of a disease with multilayered data.


Subjects
Diverticular Diseases; Diverticulitis; Diverticulum; Humans; Electronic Health Records; Genome-Wide Association Study/methods; Natural Language Processing; Phenotype; Algorithms; Polymorphism, Single Nucleotide
6.
Patterns (N Y) ; 3(8): 100570, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-36033590

ABSTRACT

The All of Us Research Program seeks to engage at least one million diverse participants to advance precision medicine and improve human health. We describe here the cloud-based Researcher Workbench that uses a data passport model to democratize access to analytical tools and participant information including survey, physical measurement, and electronic health record (EHR) data. We also present validation study findings for several common complex diseases to demonstrate use of this novel platform in 315,000 participants, 78% of whom are from groups historically underrepresented in biomedical research, including 49% self-reporting non-White races. Replication findings include medication usage pattern differences by race in depression and type 2 diabetes, validation of known cancer associations with smoking, and calculation of cardiovascular risk scores by reported race effects. The cloud-based Researcher Workbench represents an important advance in enabling secure access for a broad range of researchers to this large resource and analytical tools.

7.
BMC Genomics ; 23(1): 385, 2022 May 19.
Article in English | MEDLINE | ID: mdl-35590255

ABSTRACT

BACKGROUND: As genomic sequencing moves closer to clinical implementation, there has been an increasing acceptance of returning incidental findings to research participants and patients for mutations in highly penetrant, medically actionable genes. A curated list of genes has been recommended by the American College of Medical Genetics and Genomics (ACMG) for return of incidental findings. However, the pleiotropic effects of these genes are not fully known. Such effects could complicate genetic counseling when returning incidental findings. In particular, there has been no systematic evaluation of psychiatric manifestations associated with rare variation in these genes. RESULTS: Here, we leveraged a targeted sequence panel and real-world electronic health records from the eMERGE network to assess the burden of rare variation in the ACMG-56 genes and two psychiatric-associated genes (CACNA1C and TCF4) across common mental health conditions in 15,181 individuals of European descent. As a positive control, we showed that this approach replicated the established association between rare mutations in LDLR and hypercholesterolemia with no visible inflation from population stratification. However, we did not identify any genes significantly enriched with rare deleterious variants that confer risk for common psychiatric disorders after correction for multiple testing. Suggestive associations were observed between depression and rare coding variation in PTEN (P = 1.5 × 10⁻⁴), LDLR (P = 3.6 × 10⁻⁴), and CACNA1S (P = 5.8 × 10⁻⁴). We also observed nominal associations between rare variants in KCNQ1 and substance use disorders (P = 2.4 × 10⁻⁴), and APOB and tobacco use disorder (P = 1.1 × 10⁻³). CONCLUSIONS: Our results do not support an association between psychiatric disorders and incidental findings in medically actionable gene mutations, but power was limited with the available sample sizes. Given the phenotypic and genetic complexity of psychiatric phenotypes, future work will require a much larger sequencing dataset to determine whether incidental findings in these genes have implications for risk of psychopathology.


Subjects
Exome; Genetic Testing; Genetic Testing/methods; Genetic Variation; Genomics/methods; Humans; Mutation; Phenotype
8.
JAMA Oncol ; 8(6): 835-844, 2022 Jun 01.
Article in English | MEDLINE | ID: mdl-35446370

ABSTRACT

Importance: Knowledge about the spectrum of diseases associated with hereditary cancer syndromes may improve disease diagnosis and management for patients and help to identify high-risk individuals. Objective: To identify phenotypes associated with hereditary cancer genes through a phenome-wide association study. Design, Setting, and Participants: This phenome-wide association study used health data from participants in 3 cohorts. The Electronic Medical Records and Genomics Sequencing (eMERGEseq) data set recruited predominantly healthy individuals from 10 US medical centers from July 16, 2016, through February 18, 2018, with a mean follow-up through electronic health records (EHRs) of 12.7 (7.4) years. The UK Biobank (UKB) cohort recruited participants from March 15, 2006, through August 1, 2010, with a mean (SD) follow-up of 12.4 (1.0) years. The Hereditary Cancer Registry (HCR) recruited patients undergoing clinical genetic testing at Vanderbilt University Medical Center from May 1, 2012, through December 31, 2019, with a mean (SD) follow-up through EHRs of 8.8 (6.5) years. Exposures: Germline variants in 23 hereditary cancer genes. Pathogenic and likely pathogenic variants for each gene were aggregated for association analyses. Main Outcomes and Measures: Phenotypes in the eMERGEseq and HCR cohorts were derived from the linked EHRs. Phenotypes in UKB were from multiple sources of health-related data. Results: A total of 214 020 participants were identified, including 23 544 in eMERGEseq cohort (mean [SD] age, 47.8 [23.7] years; 12 611 women [53.6%]), 187 234 in the UKB cohort (mean [SD] age, 56.7 [8.1] years; 104 055 [55.6%] women), and 3242 in the HCR cohort (mean [SD] age, 52.5 [15.5] years; 2851 [87.9%] women). All 38 established gene-cancer associations were replicated, and 19 new associations were identified. These included the following 7 associations with neoplasms: CHEK2 with leukemia (odds ratio [OR], 3.81 [95% CI, 2.64-5.48]) and plasma cell neoplasms (OR, 3.12 [95% CI, 1.84-5.28]), ATM with gastric cancer (OR, 4.27 [95% CI, 2.35-7.44]) and pancreatic cancer (OR, 4.44 [95% CI, 2.66-7.40]), MUTYH (biallelic) with kidney cancer (OR, 32.28 [95% CI, 6.40-162.73]), MSH6 with bladder cancer (OR, 5.63 [95% CI, 2.75-11.49]), and APC with benign liver/intrahepatic bile duct tumors (OR, 52.01 [95% CI, 14.29-189.29]). The remaining 12 associations with nonneoplastic diseases included BRCA1/2 with ovarian cysts (OR, 3.15 [95% CI, 2.22-4.46] and 3.12 [95% CI, 2.36-4.12], respectively), MEN1 with acute pancreatitis (OR, 33.45 [95% CI, 9.25-121.02]), APC with gastritis and duodenitis (OR, 4.66 [95% CI, 2.61-8.33]), and PTEN with chronic gastritis (OR, 15.68 [95% CI, 6.01-40.92]). Conclusions and Relevance: The findings of this genetic association study analyzing the EHRs of 3 large cohorts suggest that these new phenotypes associated with hereditary cancer genes may facilitate early detection and better management of cancers. This study highlights the potential benefits of using EHR data in genomic medicine.


Subjects
Gastritis; Neoplastic Syndromes, Hereditary; Pancreatitis; Acute Disease; Female; Genetic Predisposition to Disease; Germ-Line Mutation; Humans; Male
9.
Int J Epidemiol ; 51(6): 1931-1942, 2022 Dec 13.
Article in English | MEDLINE | ID: mdl-35218343

ABSTRACT

BACKGROUND: Sex hormone-binding globulin (SHBG), testosterone and oestradiol have been associated with many diseases in observational studies; however, the causality of associations remains unestablished. METHODS: A phenome-wide Mendelian randomization (MR) association study was performed to explore disease outcomes associated with genetically proxied circulating SHBG, testosterone and oestradiol levels by using updated genetic instruments in 339 197 unrelated White British individuals (54% female) in the UK Biobank. Two-sample MR analyses with data from large genetic studies were conducted to replicate identified associations in phenome-wide MR analyses. Multivariable MR analyses were performed to investigate mediation effects of hormone-related biomarkers in observed associations with diseases. RESULTS: Phenome-wide MR analyses examined associations of genetically predicted SHBG, testosterone and oestradiol levels with 1211 disease outcomes, and identified 28 and 13 distinct phenotypes associated with genetically predicted SHBG and testosterone, respectively; 22 out of 28 associations for SHBG and 10 out of 13 associations for testosterone were replicated in two-sample MR analyses. Higher genetically predicted SHBG levels were associated with a reduced risk of hypertension, type 2 diabetes, diabetic complications, coronary atherosclerotic outcomes, gout and benign and malignant neoplasm of uterus, but an increased risk of varicose veins and fracture (mainly in females). Higher genetically predicted testosterone levels were associated with a lower risk of type 2 diabetes, coronary atherosclerotic outcomes, gout and coeliac disease mainly in males, but an increased risk of cholelithiasis in females. CONCLUSIONS: These findings suggest that sex hormones may causally affect risk of several health outcomes.


Subjects
Estradiol; Sex Hormone-Binding Globulin; Testosterone; Female; Humans; Male; Estradiol/blood; Estradiol/genetics; Gonadal Steroid Hormones; Mendelian Randomization Analysis; Sex Hormone-Binding Globulin/genetics; Testosterone/blood; Testosterone/genetics
10.
J Clin Invest ; 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-34283806

ABSTRACT

Both epidemiologic and cellular studies in the context of autoimmune diseases have established that protein tyrosine phosphatase non-receptor type 22 (PTPN22) is a key regulator of T cell receptor (TCR) signaling. However, its mechanism of action in tumors and its translatability as a target for cancer immunotherapy have not been established. Here we show that a germline variant of PTPN22, rs2476601, portended a lower likelihood of cancer in patients. PTPN22 expression was also associated with markers of immune regulation in multiple cancer types. In mice, lack of PTPN22 augmented antitumor activity with greater infiltration and activation of macrophages, natural killer (NK) cells, and T cells. Notably, we generated a novel small molecule inhibitor of PTPN22, named L-1, that phenocopied the antitumor effects seen in genotypic PTPN22 knockout. PTPN22 inhibition promoted activation of CD8+ T cells and macrophage subpopulations toward MHC-II expressing M1-like phenotypes, both of which were necessary for successful antitumor efficacy. Increased PD1-PDL1 axis in the setting of PTPN22 inhibition could be further leveraged with PD1 inhibition to augment antitumor effects. Similarly, cancer patients with the rs2476601 variant responded significantly better to checkpoint inhibitor immunotherapy. Our findings suggest that PTPN22 is a druggable systemic target for cancer immunotherapy.

11.
BMC Biol ; 18(1): 83, 2020 Jul 03.
Article in English | MEDLINE | ID: mdl-32620114

ABSTRACT

BACKGROUND: Experimental reproducibility in mouse models is impacted by both genetics and environment. The generation of reproducible data is critical for the biomedical enterprise and has become a major concern for the scientific community and funding agencies alike. Among the factors that impact reproducibility in experimental mouse models is the variable composition of the microbiota in mice supplied by different commercial vendors. Less attention has been paid to how the microbiota of mice supplied by a particular vendor might change over time. RESULTS: In the course of conducting a series of experiments in a mouse model of malaria, we observed a profound and lasting change in the severity of malaria in mice infected with Plasmodium yoelii; while for several years mice obtained from a specific production suite of a specific commercial vendor were able to clear the parasites effectively in a relatively short time, mice subsequently shipped from the same unit suffered much more severe disease. Gut microbiota analysis of frozen cecal samples identified a distinct and lasting shift in bacteria populations that coincided with the altered response of the later shipments of mice to infection with malaria parasites. Germ-free mice colonized with cecal microbiota from mice within the same production suite before and after this change followed by Plasmodium infection provided a direct demonstration that the change in gut microbiota profoundly impacted the severity of malaria. Moreover, spatial changes in gut microbiota composition were also shown to alter the acute bacterial burden following Salmonella infection, and tumor burden in a lung tumorigenesis model. 
CONCLUSION: These changes in gut bacteria may have impacted the experimental reproducibility of diverse research groups and highlight the need for both laboratory animal providers and researchers to collaborate in determining the methods and criteria needed to stabilize the gut microbiota of animal breeding colonies and research cohorts, and to develop a microbiota solution to increase experimental rigor and reproducibility.


Subjects
Disease Models, Animal; Gastrointestinal Microbiome; Malaria/physiopathology; Plasmodium yoelii/physiology; Animals; Female; Mice; Mice, Inbred C57BL; Spatio-Temporal Analysis
12.
JNCI Cancer Spectr ; 4(3): pkaa021, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32596635

ABSTRACT

BACKGROUND: Genome-wide association studies have identified common genetic risk variants in many loci associated with multiple cancers. We sought to systematically evaluate the utility of these risk variants in identifying high-risk individuals for eight common cancers. METHODS: We constructed polygenic risk scores (PRS) using genome-wide association studies-identified risk variants for each cancer. Using data from 400 812 participants of European descent in a population-based cohort study, UK Biobank, we estimated hazard ratios associated with PRS using Cox proportional hazard models and evaluated the performance of the PRS in cancer risk prediction and their ability to identify individuals at more than a twofold elevated risk, a risk level comparable to a moderate-penetrance mutation in known cancer predisposition genes. RESULTS: During a median follow-up of 5.8 years, 14 584 incident case patients of cancers were identified (ranging from 358 epithelial ovarian cancer case patients to 4430 prostate cancer case patients). Compared with those at an average risk, individuals among the highest 5% of the PRS had a two- to threefold elevated risk for cancer of the prostate, breast, pancreas, colorectal, or ovary, and an approximately 1.5-fold elevated risk of cancer of the lung, bladder, or kidney. The areas under the curve ranged from 0.567 to 0.662. Using PRS, 40.4% of the study participants can be classified as having more than a twofold elevated risk for at least one site-specific cancer. CONCLUSIONS: A large proportion of the general population can be identified at an elevated cancer risk by PRS, supporting the potential clinical utility of PRS for personalized cancer risk prediction.
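
A polygenic risk score of the kind described above is commonly computed as a weighted sum of risk-allele counts, and the "highest 5%" group is then defined by a score cutoff. The sketch below illustrates both steps under stated assumptions: the weights are taken to be per-allele log odds ratios, the variant identifiers and numbers are hypothetical, and the cutoff is a simple rank-based rule that may differ from the one used in the study.

```python
from typing import Dict, List


def polygenic_risk_score(dosages: Dict[str, float], log_odds: Dict[str, float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant).

    Variants without a weight are ignored.
    """
    return sum(log_odds[v] * d for v, d in dosages.items() if v in log_odds)


def top_fraction_cutoff(scores: List[float], fraction: float = 0.05) -> float:
    """Smallest score that still falls within the top `fraction` of the cohort."""
    ranked = sorted(scores, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]


# Hypothetical variants and weights, for illustration only.
weights = {"variant_a": 0.10, "variant_b": 0.20, "variant_c": -0.05}
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
print(polygenic_risk_score(person, weights))
```

Individuals whose score is at or above `top_fraction_cutoff(all_scores)` would form the high-risk group that is compared with the average-risk group.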

13.
J Clin Endocrinol Metab ; 105(6), 2020 Jun 01.
Article in English | MEDLINE | ID: mdl-31917831

ABSTRACT

CONTEXT: As many as 75% of patients with polycystic ovary syndrome (PCOS) are estimated to be unidentified in clinical practice. OBJECTIVE: Utilizing polygenic risk prediction, we aim to identify the phenome-wide comorbidity patterns characteristic of PCOS to improve accurate diagnosis and preventive treatment. DESIGN, PATIENTS, AND METHODS: Leveraging the electronic health records (EHRs) of 124 852 individuals, we developed a PCOS risk prediction algorithm by combining polygenic risk scores (PRS) with PCOS component phenotypes into a polygenic and phenotypic risk score (PPRS). We evaluated its predictive capability across different ancestries and performed a PRS-based phenome-wide association study (PheWAS) to assess the phenomic expression of the heightened risk of PCOS. RESULTS: The integrated polygenic prediction improved the average performance (pseudo-R²) for PCOS detection by 0.228 (61.5-fold), 0.224 (58.8-fold), and 0.211 (57.0-fold) over the null model across European, African, and multi-ancestry participants, respectively. The subsequent PRS-powered PheWAS identified a high level of shared biology between PCOS and a range of metabolic and endocrine outcomes, especially with obesity and diabetes: "morbid obesity", "type 2 diabetes", "hypercholesterolemia", "disorders of lipid metabolism", "hypertension", and "sleep apnea" reaching phenome-wide significance. CONCLUSIONS: Our study has expanded the methodological utility of PRS in patient stratification and risk prediction, especially in a multifactorial condition like PCOS, across different genetic origins. By utilizing the individual genome-phenome data available from the EHR, our approach also demonstrates that polygenic prediction by PRS can provide valuable opportunities to discover the pleiotropic phenomic network associated with PCOS pathogenesis.


Subjects
Algorithms; Genome-Wide Association Study; Multifactorial Inheritance/genetics; Phenomics/methods; Phenotype; Polycystic Ovary Syndrome/diagnosis; Adolescent; Aged; Case-Control Studies; Child; Electronic Health Records; Female; Follow-Up Studies; Genetic Predisposition to Disease; Humans; Middle Aged; Polycystic Ovary Syndrome/epidemiology; Polycystic Ovary Syndrome/genetics; Prognosis; Risk Factors
14.
World J Surg ; 44(1): 84-94, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31605180

ABSTRACT

BACKGROUND: The extent to which obesity and genetics determine postoperative complications is incompletely understood. METHODS: We performed a retrospective study using two population cohorts with electronic health record (EHR) data. The first included 736,726 adults with body mass index (BMI) recorded between 1990 and 2017 at Vanderbilt University Medical Center. The second cohort consisted of 65,174 individuals from 12 institutions contributing EHR and genome-wide genotyping data to the Electronic Medical Records and Genomics (eMERGE) Network. Pairwise logistic regression analyses were used to measure the association of BMI categories with postoperative complications derived from International Classification of Diseases-9 codes, including postoperative infection, incisional hernia, and intestinal obstruction. A genetic risk score was constructed from 97 obesity-risk single-nucleotide polymorphisms for a Mendelian randomization study to determine the association of genetic risk of obesity with postoperative complications. Logistic regression analyses were adjusted for sex, age, site, and race/principal components. RESULTS: Individuals with overweight or obese BMI (≥25 kg/m²) had increased risk of incisional hernia (odds ratio [OR] 1.7-5.5, p < 3.1 × 10⁻²⁰), and people with obesity (BMI ≥ 30 kg/m²) had increased risk of postoperative infection (OR 1.2-2.3, p < 2.5 × 10⁻⁵). In the eMERGE cohort, genetically predicted BMI was associated with incisional hernia (OR 2.1 [95% CI 1.8-2.5], p = 1.4 × 10⁻⁶) and postoperative infection (OR 1.6 [95% CI 1.4-1.9], p = 3.1 × 10⁻⁶). Association findings were similar after limitation of the cohorts to those who underwent abdominal procedures. CONCLUSIONS: Clinical and Mendelian randomization studies suggest that obesity, as measured by BMI, is associated with the development of postoperative incisional hernia and infection.


Subjects
Mendelian Randomization Analysis/methods; Obesity/complications; Postoperative Complications/genetics; Adult; Body Mass Index; Female; Humans; Logistic Models; Male; Middle Aged; Polymorphism, Single Nucleotide; Postoperative Complications/etiology; Retrospective Studies; Risk Factors
15.
BMC Genomics ; 20(1): 805, 2019 Nov 04.
Article in English | MEDLINE | ID: mdl-31684865

ABSTRACT

BACKGROUND: The growth of DNA biobanks linked to data from electronic health records (EHRs) has enabled the discovery of numerous associations between genomic variants and clinical phenotypes. Nonetheless, although clinical data are generally longitudinal, standard approaches for detecting genotype-phenotype associations in such linked data, notably logistic regression, do not naturally account for variation in the period of follow-up or the time at which an event occurs. Here we explored the advantages of quantifying associations using Cox proportional hazards regression, which can account for the age at which a patient first visited the healthcare system (left truncation) and the age at which a patient either last visited the healthcare system or acquired a particular phenotype (right censoring). RESULTS: In comprehensive simulations, we found that, compared to logistic regression, Cox regression had greater power at equivalent Type I error. We then scanned for genotype-phenotype associations using logistic regression and Cox regression on 50 phenotypes derived from the EHRs of 49,792 genotyped individuals. Consistent with the findings from our simulations, Cox regression had approximately 10% greater relative sensitivity for detecting known associations from the NHGRI-EBI GWAS Catalog. In terms of effect sizes, the hazard ratios estimated by Cox regression were strongly correlated with the odds ratios estimated by logistic regression. CONCLUSIONS: As longitudinal health-related data continue to grow, Cox regression may improve our ability to identify the genetic basis for a wide range of human phenotypes.


Subjects
Electronic Health Records; Genomics; Genotype; Phenotype; Proportional Hazards Models; Genome-Wide Association Study; Humans; Neoplasms/genetics
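
The abstract above contrasts logistic regression, which sees only whether a phenotype ever occurred, with Cox regression, which also uses the age at first visit (left truncation) and the age at diagnosis or last visit (right censoring). The sketch below shows one way an EHR timeline could be turned into the (entry, exit, event) record a Cox model consumes. It is a data-preparation illustration only: the class and function names are hypothetical, no model is fitted, and the handling of cases diagnosed at or before the first visit is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SurvivalRecord:
    entry_age: float  # age at first visit: start of observation (left truncation)
    exit_age: float   # age at first diagnosis, or at last visit if never diagnosed
    event: bool       # True if the phenotype was observed, False if censored


def to_survival_record(
    first_visit_age: float,
    last_visit_age: float,
    first_diagnosis_age: Optional[float] = None,
) -> SurvivalRecord:
    """Build the time-to-event record for one patient and one phenotype."""
    if first_diagnosis_age is not None:
        # Follow-up ends at the first occurrence of the phenotype.
        return SurvivalRecord(first_visit_age, first_diagnosis_age, True)
    # No diagnosis: the patient is right-censored at the last visit.
    return SurvivalRecord(first_visit_age, last_visit_age, False)


case = to_survival_record(first_visit_age=40.0, last_visit_age=65.0, first_diagnosis_age=58.0)
control = to_survival_record(first_visit_age=40.0, last_visit_age=65.0)
print(case)
print(control)
```

A logistic model would reduce both patients to the `event` flag alone, whereas the full record preserves the 18 and 25 years of follow-up that distinguish them.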
16.
J Am Med Inform Assoc ; 26(12): 1437-1447, 2019 Dec 01.
Article in English | MEDLINE | ID: mdl-31609419

ABSTRACT

OBJECTIVE: The Phenotype Risk Score (PheRS) is a method to detect Mendelian disease patterns using phenotypes from the electronic health record (EHR). We compared the performance of different approaches mapping EHR phenotypes to Mendelian disease features. MATERIALS AND METHODS: PheRS utilizes Mendelian disease descriptions annotated with Human Phenotype Ontology (HPO) terms. In previous work, we presented a map linking phecodes (based on International Classification of Diseases [ICD]-Ninth Revision) to HPO terms. For this study, we integrated ICD-Tenth Revision codes and lab data. We also created a new map between HPO terms using customized groupings of ICD codes. We compared the performance with cases and controls for 16 Mendelian diseases using 2.5 million de-identified medical records. RESULTS: PheRS effectively distinguished cases from controls for all 15 positive controls and all approaches tested (P < 4 × 10⁻¹⁶). Adding lab data led to a statistically significant improvement for 4 of 14 diseases. The custom ICD groupings improved specificity, leading to an average 8% increase for precision at 100 (-2% to 22%). Eight of 10 adults with cystic fibrosis tested had PheRS in the 95th percentile prior to diagnosis. DISCUSSION: Both phecodes and custom ICD groupings were able to detect differences between affected cases and controls at the population level. The ICD map showed better precision for the highest scoring individuals. Adding lab data improved performance at detecting population-level differences. CONCLUSIONS: PheRS is a scalable method to study Mendelian disease at the population level using electronic health record data and can potentially be used to find patients with undiagnosed Mendelian disease.


Subjects
Data Mining/methods; Electronic Health Records; Genetic Diseases, Inborn/diagnosis; Phenotype; Adult; Child; Cystic Fibrosis; Genetic Diseases, Inborn/genetics; Humans; International Classification of Diseases; Risk Factors
18.
Stud Health Technol Inform ; 264: 1041-1045, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31438083

ABSTRACT

Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated phenotypic information extraction from narratives in electronic health records (EHRs) such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting comprehensive types of cancer-related information in pathology reports (e.g., tumor size, tumor stage, and biomarkers), by leveraging the existing CLAMP system, which provides user-friendly interfaces for building customized NLP solutions for individual needs. Evaluation using annotated data at Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that we can quickly build a customized NLP system with comparable performance with an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.


Subjects
Information Storage and Retrieval; Neoplasms; Electronic Health Records; Humans; Natural Language Processing; Research Report
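
The F-measures quoted above (0.80-0.98) summarize extraction quality as a balance of precision and recall. As a reminder of how such a number is obtained, the sketch below computes the F-measure from counts of true positives, false positives and false negatives. The function name and the counts are hypothetical, and the balanced F1 form (beta = 1) is an assumption, since the abstract does not say which variant was reported.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int, beta: float = 1.0) -> float:
    """F-beta score from extraction counts; returns 0.0 when undefined."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for one entity type (e.g. tumor size mentions).
print(f_measure(true_pos=90, false_pos=5, false_neg=10))
```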
19.
Front Genet ; 10: 511, 2019.
Article in English | MEDLINE | ID: mdl-31249589

ABSTRACT

Uterine fibroids affect up to 77% of women by menopause and account for up to $34 billion in healthcare costs each year. Although fibroid risk is heritable, genetic risk for fibroids is not well understood. We conducted a two-stage case-control meta-analysis of genetic variants in European and African ancestry women with and without fibroids classified by a previously published algorithm requiring pelvic imaging or confirmed diagnosis. Women from seven electronic Medical Records and Genomics (eMERGE) network sites (3,704 imaging-confirmed cases and 5,591 imaging-confirmed controls) and women of African and European ancestry from UK Biobank (UKB, 5,772 cases and 61,457 controls) were included in the discovery genome-wide association study (GWAS) meta-analysis. Variants showing evidence of association in Stage I GWAS (P < 1 × 10⁻⁵) were targeted in an independent replication sample of African and European ancestry individuals from the UKB (Stage II) (12,358 cases and 138,477 controls). Logistic regression models were fit with genetic markers imputed to a 1000 Genomes reference and adjusted for principal components for each race- and site-specific dataset, followed by fixed-effects meta-analysis. Final analysis with 21,804 cases and 205,525 controls identified 326 genome-wide significant variants in 11 loci, with three novel loci at chromosome 1q24 (sentinel-SNP rs14361789; P = 4.7 × 10⁻⁸), chromosome 16q12.1 (sentinel-SNP rs4785384; P = 1.5 × 10⁻⁹) and chromosome 20q13.1 (sentinel-SNP rs6094982; P = 2.6 × 10⁻⁸). Our statistically significant findings further support previously reported loci including SNPs near WT1, TNRC6B, SYNE1, BET1L, and CDC42/WNT4. We report evidence of ancestry-specific findings for sentinel-SNP rs10917151 in the CDC42/WNT4 locus (P = 1.76 × 10⁻²⁴). Ancestry-specific effect-estimates for rs10917151 were in opposite directions (P-Het-between-groups = 0.04) for predominantly African (OR = 0.84) and predominantly European women (OR = 1.16). Genetically-predicted gene expression of several genes including LUZP1 in vagina (P = 4.6 × 10⁻⁸), OBFC1 in esophageal mucosa (P = 8.7 × 10⁻⁸), NUDT13 in multiple tissues including subcutaneous adipose tissue (P = 3.3 × 10⁻⁶), and HEATR3 in skeletal muscle tissue (P = 5.8 × 10⁻⁶) were associated with fibroids. The finding for HEATR3 was supported by SNP-based summary Mendelian randomization analysis. Our study suggests that fibroid risk variants act through regulatory mechanisms affecting gene expression and are comprised of alleles that are both ancestry-specific and shared across continental ancestries.

20.
JCO Clin Cancer Inform ; 3: 1-9, 2019 May.
Article in English | MEDLINE | ID: mdl-31141421

ABSTRACT

PURPOSE: Drug development is becoming increasingly expensive and time-consuming. Drug repurposing is one potential solution to accelerate drug discovery. However, limited research exists on the use of electronic health record (EHR) data for drug repurposing, and most published studies have been conducted in a hypothesis-driven manner that requires a predefined hypothesis about drugs and new indications. Whether EHRs can be used to detect drug repurposing signals is not clear. We aimed to demonstrate the feasibility of mining large, longitudinal EHRs for drug repurposing by detecting candidate noncancer drugs that can potentially be used for the treatment of cancer. PATIENTS AND METHODS: By linking cancer registry data to EHRs, we identified 43,310 patients with cancer treated at Vanderbilt University Medical Center (VUMC) and 98,366 treated at the Mayo Clinic. We assessed the effect of 146 noncancer drugs on cancer survival using VUMC EHR data and sought to replicate significant associations (false discovery rate < .1) using the identical approach with Mayo Clinic EHR data. To evaluate replicated signals further, we reviewed the biomedical literature and clinical trials on cancers for corroborating evidence. RESULTS: We identified 22 drugs from six drug classes (statins, proton pump inhibitors, angiotensin-converting enzyme inhibitors, β-blockers, nonsteroidal anti-inflammatory drugs, and α-1 blockers) associated with improved overall cancer survival (false discovery rate < .1) from VUMC; nine of the 22 drug associations were replicated at the Mayo Clinic. Literature and cancer clinical trial evaluations also showed very strong evidence to support the repurposing signals from EHRs. CONCLUSION: Mining of EHRs for drug exposure-mediated survival signals is feasible and identifies potential candidates for antineoplastic repurposing. This study sets up a new model of mining EHRs for drug repurposing signals.


Subjects
Drug Repositioning , Electronic Health Records , Neoplasms/epidemiology , Clinical Trials as Topic , Data Mining , Drug Development , Humans , Neoplasms/drug therapy , Neoplasms/mortality , Prognosis , Registries , Reproducibility of Results , Treatment Outcome