Results 1 - 20 of 55
1.
medRxiv ; 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38645167

ABSTRACT

Apart from ancestry, personal or environmental covariates may contribute to differences in polygenic score (PGS) performance. We analyzed effects of covariate stratification and interaction on body mass index (BMI) PGS (PGSBMI) across four cohorts of European (N=491,111) and African (N=21,612) ancestry. Stratifying on binary covariates and quintiles for continuous covariates, 18/62 covariates had significant and replicable R² differences among strata. Covariates with the largest differences included age, sex, blood lipids, physical activity, and alcohol consumption, with R² being nearly double between best and worst performing quintiles for certain covariates. 28 covariates had significant PGSBMI-covariate interaction effects, modifying PGSBMI effects by nearly 20% per standard deviation change. We observed overlap between covariates that had significant R² differences among strata and interaction effects - across all covariates, their main effects on BMI were correlated with their maximum R² differences and interaction effects (0.56 and 0.58, respectively), suggesting high-PGSBMI individuals have highest R² and increase in PGS effect. Using quantile regression, we show the effect of PGSBMI increases as BMI itself increases, and that these differences in effects are directly related to differences in R² when stratifying by different covariates. Given significant and replicable evidence for context-specific PGSBMI performance and effects, we investigated ways to increase model performance taking into account non-linear effects. Machine learning models (neural networks) increased relative model R² (mean 23%) across datasets. Finally, creating PGSBMI directly from GxAge GWAS effects increased relative R² by 7.8%.
These results demonstrate that certain covariates, especially those most associated with BMI, significantly affect both PGSBMI performance and effects across diverse cohorts and ancestries, and we provide avenues to improve model performance that consider these effects.
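The covariate stratification described in this abstract can be illustrated with a minimal sketch. It assumes R² is the plain squared Pearson correlation between the PGS and BMI within each covariate quintile; the study's actual models and adjustments are not given here. The function names and the simulated data are hypothetical.

```python
import random

def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def r2_by_quintile(pgs, bmi, covariate):
    """R^2 of PGS vs. BMI inside each covariate quintile, lowest quintile first."""
    order = sorted(range(len(covariate)), key=lambda i: covariate[i])
    n = len(order)
    results = []
    for q in range(5):
        idx = order[q * n // 5:(q + 1) * n // 5]
        results.append(pearson_r2([pgs[i] for i in idx], [bmi[i] for i in idx]))
    return results

# Hypothetical simulated data: the PGS effect on BMI grows with the covariate,
# which is one way a PGS-covariate interaction produces R^2 differences by stratum.
rng = random.Random(0)
n_people = 5000
covariate = [rng.gauss(0, 1) for _ in range(n_people)]
pgs = [rng.gauss(0, 1) for _ in range(n_people)]
bmi = [(0.3 + 0.1 * c) * g + rng.gauss(0, 1) for c, g in zip(covariate, pgs)]

quintile_r2 = r2_by_quintile(pgs, bmi, covariate)
for q, value in enumerate(quintile_r2, start=1):
    print(f"quintile {q}: R^2 = {value:.3f}")
```

In this toy setup the top quintile shows a higher R² than the bottom one only because the simulated effect size was made to depend on the covariate; it says nothing about the magnitudes reported in the study.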

2.
Br J Nutr ; 131(1): 156-162, 2024 01 14.
Article in English | MEDLINE | ID: mdl-37519237

ABSTRACT

Though diet quality is widely recognised as linked to risk of chronic disease, health systems have been challenged to find a user-friendly, efficient way to obtain information about diet. The Penn Healthy Diet (PHD) survey was designed to fill this void. The purposes of this pilot project were to assess the patient experience with the PHD, to validate the accuracy of the PHD against related items in a diet recall and to explore scoring algorithms with relationship to the Healthy Eating Index (HEI)-2015 computed from the recall data. A convenience sample of participants in the Penn Health BioBank was surveyed with the PHD, the Automated Self-Administered 24-hour recall (ASA24) and experience questions. Kappa scores and Spearman correlations were used to compare related questions in the PHD to the ASA24. Numerical scoring, regression tree and weighted regressions were computed for scoring. Participants assessed the PHD as easy to use and were willing to repeat the survey at least annually. The three scoring algorithms were strongly associated with HEI-2015 scores using National Health and Nutrition Examination Survey 2017-2018 data from which the PHD was developed and moderately associated with the pilot replication data. The PHD is acceptable to participants and at least moderately correlated with the HEI-2015. Further validation in a larger sample will enable the selection of the strongest scoring approach.


Subjects
Diet, Healthy; Diet; Humans; Nutrition Surveys; Pilot Projects; Diet Surveys
3.
Pac Symp Biocomput ; 29: 594-610, 2024.
Article in English | MEDLINE | ID: mdl-38160309

ABSTRACT

Access to safe and effective antiretroviral therapy (ART) is a cornerstone in the global response to the HIV pandemic. Among people living with HIV, there is considerable interindividual variability in absolute CD4 T-cell recovery following initiation of virally suppressive ART. The contribution of host genetics to this variability is not well understood. We explored the contribution of a polygenic score which was derived from large, publicly available summary statistics for absolute lymphocyte count from individuals in the general population (PGSlymph) due to a lack of publicly available summary statistics for CD4 T-cell count. We explored associations with baseline CD4 T-cell count prior to ART initiation (n=4959) and change from baseline to week 48 on ART (n=3274) among treatment-naïve participants in prospective, randomized ART studies of the AIDS Clinical Trials Group. We separately examined an African-ancestry-derived and a European-ancestry-derived PGSlymph, and evaluated their performance across all participants, and also in the African and European ancestral groups separately. Multivariate models that included PGSlymph, baseline plasma HIV-1 RNA, age, sex, and 15 principal components (PCs) of genetic similarity explained ∼26-27% of variability in baseline CD4 T-cell count, but PGSlymph accounted for <1% of this variability. Models that also included baseline CD4 T-cell count explained ∼7-9% of variability in CD4 T-cell count increase on ART, but PGSlymph accounted for <1% of this variability. In univariate analyses, PGSlymph was not significantly associated with baseline or change in CD4 T-cell count. Among individuals of African ancestry, the African PGSlymph term in the multivariate model was significantly associated with change in CD4 T-cell count while not significant in the univariate model. 
When applied to lymphocyte count in a general medical biobank population (Penn Medicine BioBank), PGSlymph explained ∼6-10% of variability in multivariate models (including age, sex, and PCs) but only ∼1% in univariate models. In summary, a lymphocyte count PGS derived from the general population was not consistently associated with CD4 T-cell recovery on ART. Nonetheless, adjusting for clinical covariates is quite important when estimating such polygenic effects.


Subjects
Anti-HIV Agents; HIV Infections; Humans; CD4-Positive T-Lymphocytes; Prospective Studies; Anti-HIV Agents/therapeutic use; Computational Biology; HIV Infections/drug therapy; HIV Infections/genetics; CD4 Lymphocyte Count; Viral Load
4.
Am J Hum Genet ; 110(4): 575-591, 2023 04 06.
Article in English | MEDLINE | ID: mdl-37028392

ABSTRACT

Leveraging linkage disequilibrium (LD) patterns as representative of population substructure enables the discovery of additive association signals in genome-wide association studies (GWASs). Standard GWASs are well-powered to interrogate additive models; however, new approaches are required for investigating other modes of inheritance such as dominance and epistasis. Epistasis, or non-additive interaction between genes, exists across the genome but often goes undetected because of a lack of statistical power. Furthermore, the adoption of LD pruning as customary in standard GWASs excludes detection of sites that are in LD but might underlie the genetic architecture of complex traits. We hypothesize that uncovering long-range interactions between loci with strong LD due to epistatic selection can elucidate genetic mechanisms underlying common diseases. To investigate this hypothesis, we tested for associations between 23 common diseases and 5,625,845 epistatic SNP-SNP pairs (determined by Ohta's D statistics) in long-range LD (>0.25 cM). Across five disease phenotypes, we identified one significant and four near-significant associations that replicated in two large genotype-phenotype datasets (UK Biobank and eMERGE). The genes that were most likely involved in the replicated associations were (1) members of highly conserved gene families with complex roles in multiple pathways, (2) essential genes, and/or (3) genes that were associated in the literature with complex traits that display variable expressivity. These results support the highly pleiotropic and conserved nature of variants in long-range LD under epistatic selection. Our work supports the hypothesis that epistatic interactions regulate diverse clinical mechanisms and might especially be driving factors in conditions with a wide range of phenotypic outcomes.


Subjects
Epistasis, Genetic; Genome-Wide Association Study; Linkage Disequilibrium/genetics; Genotype; Biological Specimen Banks; United Kingdom; Polymorphism, Single Nucleotide/genetics
5.
Clin Pharmacol Ther ; 113(5): 1036-1047, 2023 05.
Article in English | MEDLINE | ID: mdl-36350094

ABSTRACT

Pharmacogenomics (PGx) investigates the genetic influence on drug response and is an integral part of precision medicine. While PGx testing is becoming more common in clinical practice and may be reimbursed by Medicare/Medicaid and commercial insurance, interpreting PGx testing results for clinical decision support is still a challenge. The Pharmacogenomics Clinical Annotation Tool (PharmCAT) has been designed to tackle the need for transparent, automatic interpretations of patient genetic data. PharmCAT incorporates a patient's genotypes, annotates PGx information (allele, genotype, and phenotype), and generates a report with PGx guideline recommendations from the Clinical Pharmacogenetics Implementation Consortium (CPIC) and/or the Dutch Pharmacogenetics Working Group (DPWG). PharmCAT has introduced new features in the last 2 years, including a variant call format (VCF) Preprocessor, the inclusion of DPWG guidelines, and functionalities for PGx research. For example, researchers can use the VCF Preprocessor to prepare biobank-scale data for PharmCAT. In addition, PharmCAT enables the assessment of novel partial and combination alleles that are composed of known PGx variants and can call CYP2D6 genotypes based on single and deletions in the input VCF file. This tutorial provides materials and detailed step-by-step instructions for how to use PharmCAT in a versatile way that can be tailored to users' individual needs.


Subjects
Medicare; Pharmacogenetics; Aged; United States; Humans; Pharmacogenetics/methods; Precision Medicine/methods; Genotype; Phenotype
6.
J Pers Med ; 12(12)2022 Nov 29.
Article in English | MEDLINE | ID: mdl-36556195

ABSTRACT

The Penn Medicine BioBank (PMBB) is an electronic health record (EHR)-linked biobank at the University of Pennsylvania (Penn Medicine). A large variety of health-related information, ranging from diagnosis codes to laboratory measurements, imaging data and lifestyle information, is integrated with genomic and biomarker data in the PMBB to facilitate discoveries and translational science. To date, 174,712 participants have been enrolled into the PMBB, including approximately 30% of participants of non-European ancestry, making it one of the most diverse medical biobanks. There is a median of seven years of longitudinal data in the EHR available on participants, who also consent to permission to recontact. Herein, we describe the operations and infrastructure of the PMBB, summarize the phenotypic architecture of the enrolled participants, and use body mass index (BMI) as a proof-of-concept quantitative phenotype for PheWAS, LabWAS, and GWAS. The major representation of African-American participants in the PMBB addresses the essential need to expand the diversity in genetic and translational research. There is a critical need for a "medical biobank consortium" to facilitate replication, increase power for rare phenotypes and variants, and promote harmonized collaboration to optimize the potential for biological discovery and precision medicine.

7.
J Transl Med ; 20(1): 550, 2022 11 28.
Article in English | MEDLINE | ID: mdl-36443877

ABSTRACT

BACKGROUND: Pharmacogenomics (PGx) aims to utilize a patient's genetic data to enable safer and more effective prescribing of medications. The Clinical Pharmacogenetics Implementation Consortium (CPIC) provides guidelines with strong evidence for 24 genes that affect 72 medications. Despite strong evidence linking PGx alleles to drug response, there is a large gap in the implementation and return of actionable pharmacogenetic findings to patients in standard clinical practice. In this study, we evaluated opportunities for genetically guided medication prescribing in a diverse health system and determined the frequencies of actionable PGx alleles in an ancestrally diverse biobank population. METHODS: A retrospective analysis of the Penn Medicine electronic health records (EHRs), which includes ~ 3.3 million patients between 2012 and 2020, provides a snapshot of the trends in prescriptions for drugs with genotype-based prescribing guidelines ('CPIC level A or B') in the Penn Medicine health system. The Penn Medicine BioBank (PMBB) consists of a diverse group of 43,359 participants whose EHRs are linked to genome-wide SNP array and whole exome sequencing (WES) data. We used the Pharmacogenomics Clinical Annotation Tool (PharmCAT) to annotate PGx alleles from PMBB variant call format (VCF) files and identify samples with actionable PGx alleles. RESULTS: We identified ~ 316,000 unique patients that were prescribed at least 2 drugs with CPIC Level A or B guidelines. Genetic analysis in PMBB identified that 98.9% of participants carry one or more PGx actionable alleles where treatment modification would be recommended. After linking the genetic data with prescription data from the EHR, 14.2% of participants (n = 6157) were prescribed medications that could be impacted by their genotype (as indicated by their PharmCAT report).
For example, 856 participants received clopidogrel who carried CYP2C19 reduced function alleles, placing them at increased risk for major adverse cardiovascular events. When we stratified by genetic ancestry, we found disparities in PGx allele frequencies and clinical burden. Clopidogrel users of Asian ancestry in PMBB had significantly higher rates of CYP2C19 actionable alleles than European ancestry users of clopidogrel (p < 0.0001, OR = 3.68). CONCLUSIONS: Clinically actionable PGx alleles are highly prevalent in our health system and many patients were prescribed medications that could be affected by PGx alleles. These results illustrate the potential utility of preemptive genotyping for tailoring of medications and implementation of PGx into routine clinical care.


Subjects
Biological Specimen Banks; Pharmacogenetics; Humans; Alleles; Cytochrome P-450 CYP2C19; Clopidogrel; Retrospective Studies
8.
JCO Oncol Pract ; 17(12): e1879-e1886, 2021 12.
Article in English | MEDLINE | ID: mdl-34133219

ABSTRACT

PURPOSE: Multiple studies have demonstrated the negative impact of cancer care delays during the COVID-19 pandemic, and transmission mitigation techniques are imperative for continued cancer care delivery. We aimed to gauge the effectiveness of these measures at the University of Pennsylvania. METHODS: We conducted a longitudinal study of SARS-CoV-2 antibody seropositivity and seroconversion in patients presenting to infusion centers for cancer-directed therapy between May 21, 2020, and October 8, 2020. Participants completed questionnaires and had up to five serial blood collections. RESULTS: Of 124 enrolled patients, only two (1.6%) had detectable SARS-CoV-2 antibodies on initial blood draw, and no initially seronegative patients developed newly detectable antibodies on subsequent blood draw(s), corresponding to a seroconversion rate of 0% (95% CI, 0.0 to 4.1%) over 14.8 person-years of follow up, with a median of 13 health care visits per patient. CONCLUSION: These results suggest that patients with cancer receiving in-person care at a facility with aggressive mitigation efforts have an extremely low likelihood of COVID-19 infection.


Subjects
COVID-19; Neoplasms; Humans; Longitudinal Studies; Neoplasms/therapy; Pandemics; SARS-CoV-2; Seroconversion
9.
Nat Genet ; 53(7): 972-981, 2021 07.
Article in English | MEDLINE | ID: mdl-34140684

ABSTRACT

Plasma lipids are known heritable risk factors for cardiovascular disease, but increasing evidence also supports shared genetics with diseases of other organ systems. We devised a comprehensive three-phase framework to identify new lipid-associated genes and study the relationships among lipids, genotypes, gene expression and hundreds of complex human diseases from the Electronic Medical Records and Genomics (347 traits) and the UK Biobank (549 traits). Aside from 67 new lipid-associated genes with strong replication, we found evidence for pleiotropic SNPs/genes between lipids and diseases across the phenome. These include discordant pleiotropy in the HLA region between lipids and multiple sclerosis and putative causal paths between triglycerides and gout, among several others. Our findings give insights into the genetic basis of the relationship between plasma lipids and diseases on a phenome-wide scale and can provide context for future prevention and treatment strategies.


Subjects
Biomarkers; Disease Susceptibility; Electronic Health Records; Lipids/blood; Alleles; Biological Specimen Banks; Genetic Association Studies; Genetic Predisposition to Disease; Humans; Polymorphism, Single Nucleotide; Public Health Surveillance; Quantitative Trait, Heritable; United Kingdom
10.
medRxiv ; 2021 Jan 16.
Article in English | MEDLINE | ID: mdl-33469597

ABSTRACT

Multiple studies have demonstrated the negative impact of cancer care delays during the COVID-19 pandemic, and transmission mitigation techniques are imperative for continued cancer care delivery. To gauge the effectiveness of these measures at the University of Pennsylvania, we conducted a longitudinal study of SARS-CoV-2 antibody seropositivity and seroconversion in patients presenting to infusion centers for cancer-directed therapy between 5/21/2020 and 10/8/2020. Participants completed questionnaires and had up to five serial blood collections. Of 124 enrolled patients, only two (1.6%) had detectable SARS-CoV-2 antibodies on initial blood draw, and no initially seronegative patients developed newly detectable antibodies on subsequent blood draw(s), corresponding to a seroconversion rate of 0% (95%CI 0.0-4.1%) over 14.8 person-years of follow up, with a median of 13 healthcare visits per patient. These results suggest that cancer patients receiving in-person care at a facility with aggressive mitigation efforts have an extremely low likelihood of COVID-19 infection.

11.
Pharmacogenomics J ; 19(2): 178-190, 2019 04.
Article in English | MEDLINE | ID: mdl-29795408

ABSTRACT

Identifying genetic variants associated with chemotherapeutic induced toxicity is an important step towards personalized treatment of cancer patients. However, annotating and interpreting the associated genetic variants remains challenging because each associated variant is a surrogate for many other variants in the same region. The issue is further complicated when investigating patterns of associated variants with multiple drugs. In this study, we used biological knowledge to annotate and compare genetic variants associated with cellular sensitivity to mechanistically distinct chemotherapeutic drugs, including platinating agents (cisplatin, carboplatin), capecitabine, cytarabine, and paclitaxel. The most significantly associated SNPs from genome wide association studies of cellular sensitivity to each drug in lymphoblastoid cell lines derived from populations of European (CEU) and African (YRI) descent were analyzed for their enrichment in biological pathways and processes. We annotated genetic variants using higher-level biological annotations in efforts to group variants into more interpretable biological modules. Using the higher-level annotations, we observed distinct biological modules associated with cell line populations as well as classes of chemotherapeutic drugs. We also integrated genetic variants and gene expression variables to build predictive models for chemotherapeutic drug cytotoxicity and prioritized the network models based on the enrichment of DNA regulatory data. Several biological annotations, often encompassing different SNPs, were replicated in independent datasets. By using biological knowledge and DNA regulatory information, we propose a novel approach for jointly analyzing genetic variants associated with multiple chemotherapeutic drugs.


Subjects
Genetic Variation/genetics; Genome-Wide Association Study/methods; Neoplasms/drug therapy; Pharmacogenetics/methods; Black People/genetics; Capecitabine/adverse effects; Capecitabine/therapeutic use; Carboplatin/adverse effects; Carboplatin/therapeutic use; Cell Line; Cisplatin/adverse effects; Cisplatin/therapeutic use; Gene Expression Regulation, Neoplastic/drug effects; Genome, Human/genetics; Humans; Molecular Sequence Annotation; Neoplasms/genetics; Paclitaxel/adverse effects; Paclitaxel/therapeutic use; Polymorphism, Single Nucleotide/genetics; White People/genetics
12.
PLoS One ; 14(12): e0226771, 2019.
Article in English | MEDLINE | ID: mdl-31891604

ABSTRACT

We performed a hypothesis-generating phenome-wide association study (PheWAS) to identify and characterize cross-phenotype associations, where one SNP is associated with two or more phenotypes, between thousands of genetic variants assayed on the Metabochip and hundreds of phenotypes in 5,897 African Americans as part of the Population Architecture using Genomics and Epidemiology (PAGE) I study. The PAGE I study was a National Human Genome Research Institute-funded collaboration of four study sites accessing diverse epidemiologic studies genotyped on the Metabochip, a custom genotyping chip that has dense coverage of regions in the genome previously associated with cardio-metabolic traits and outcomes in mostly European-descent populations. Here we focus on identifying novel phenome-genome relationships, where SNPs are associated with more than one phenotype. To do this, we performed a PheWAS, testing each SNP on the Metabochip for an association with up to 273 phenotypes in the participating PAGE I study sites. We identified 133 putative pleiotropic variants, defined as SNPs associated at an empirically derived p-value threshold of p<0.01 in two or more PAGE study sites for two or more phenotype classes. We further annotated these PheWAS-identified variants using publicly available functional data and local genetic ancestry. Amongst our novel findings is SPARC rs4958487, associated with increased glucose levels and hypertension. SPARC has been implicated in the pathogenesis of diabetes and is also known to have a potential role in fibrosis, a common consequence of multiple conditions including hypertension. The SPARC example and others highlight the potential that PheWAS approaches have in improving our understanding of complex disease architecture by identifying novel relationships between genetic variants and an array of common human phenotypes.
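The pleiotropy definition quoted in this abstract (p<0.01 in two or more study sites, for two or more phenotype classes) can be read as a two-stage filter. The sketch below is one plausible reading, not the authors' pipeline: it assumes a phenotype class counts for a SNP only when the threshold is met in at least two sites for that same class. The record layout and all identifiers are hypothetical.

```python
from collections import defaultdict

P_THRESHOLD = 0.01  # threshold quoted in the abstract

def pleiotropic_snps(results, p_threshold=P_THRESHOLD):
    """Return SNPs that pass the threshold in >=2 sites for >=2 phenotype classes.

    `results` is an iterable of (snp, site, phenotype_class, p_value) tuples;
    this layout is a hypothetical one chosen for illustration.
    """
    # Stage 1: for each (SNP, phenotype class), collect sites passing the threshold.
    sites_passing = defaultdict(set)
    for snp, site, phenotype_class, p_value in results:
        if p_value < p_threshold:
            sites_passing[(snp, phenotype_class)].add(site)

    # Stage 2: keep classes seen in two or more sites, then count classes per SNP.
    replicated_classes = defaultdict(set)
    for (snp, phenotype_class), sites in sites_passing.items():
        if len(sites) >= 2:
            replicated_classes[snp].add(phenotype_class)

    return sorted(snp for snp, classes in replicated_classes.items() if len(classes) >= 2)

# Hypothetical toy records.
toy_results = [
    ("snpA", "site1", "glucose", 0.004),
    ("snpA", "site2", "glucose", 0.008),
    ("snpA", "site1", "blood_pressure", 0.002),
    ("snpA", "site3", "blood_pressure", 0.009),
    ("snpB", "site1", "glucose", 0.001),
    ("snpB", "site2", "glucose", 0.003),
    ("snpB", "site1", "blood_pressure", 0.200),
]
putative = pleiotropic_snps(toy_results)
print(putative)
```

Here only `snpA` qualifies, because `snpB` replicates across sites for a single phenotype class.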


Subjects
Atherosclerosis/genetics; Black or African American/genetics; Genetic Pleiotropy; Metagenomics; Phenomics; Aged; Epidemiologic Studies; Female; Genome-Wide Association Study; Humans; Male; Middle Aged; Polymorphism, Single Nucleotide
13.
BioData Min ; 11: 5, 2018.
Article in English | MEDLINE | ID: mdl-29713383

ABSTRACT

BACKGROUND: Machine learning methods have gained popularity and practicality in identifying linear and non-linear effects of variants associated with complex disease/traits. Detection of epistatic interactions still remains a challenge due to the large number of features and relatively small sample size as input, thus leading to the so-called "short fat data" problem. The efficiency of machine learning methods can be increased by limiting the number of input features. Thus, it is very important to perform variable selection before searching for epistasis. Many methods have been evaluated and proposed to perform feature selection, but no single method works best in all scenarios. We demonstrate this by conducting two separate simulation analyses to evaluate the proposed collective feature selection approach. RESULTS: Through our simulation study we propose a collective feature selection approach to select features that are in the "union" of the best performing methods. We explored various parametric, non-parametric, and data mining approaches to perform feature selection. We choose our top performing methods to select the union of the resulting variables based on a user-defined percentage of variants selected from each method to take to downstream analysis. Our simulation analysis shows that non-parametric data mining approaches, such as MDR, may work best under one simulation criteria for the high effect size (penetrance) datasets, while non-parametric methods designed for feature selection, such as Ranger and Gradient boosting, work best under other simulation criteria. Thus, using a collective approach proves to be more beneficial for selecting variables with epistatic effects also in low effect size datasets and different genetic architectures. 
Following this, we applied our proposed collective feature selection approach to select the top 1% of variables to identify potential interacting variables associated with Body Mass Index (BMI) in ~ 44,000 samples obtained from Geisinger's MyCode Community Health Initiative (on behalf of DiscovEHR collaboration). CONCLUSIONS: In this study, we were able to show that selecting variables using a collective feature selection approach could help in selecting true positive epistatic variables more frequently than applying any single method for feature selection via simulation studies. We were able to demonstrate the effectiveness of collective feature selection along with a comparison of many methods in our simulation analysis. We also applied our method to identify non-linear networks associated with obesity.
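The "union" idea in this abstract can be sketched briefly: each method ranks the variables, a user-defined fraction is taken from the top of each ranking, and the selections are merged. The sketch assumes each method yields one numeric importance score per variable, with higher meaning more important. The method names, variable names, and scores are hypothetical, and the paper's actual implementation is not shown here.

```python
import math

def top_fraction(scores, fraction):
    """Names of the top `fraction` of variables by score (at least one variable)."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def collective_selection(method_scores, fraction):
    """Union of the top-ranked variables across all feature selection methods."""
    selected = set()
    for scores in method_scores.values():
        selected |= top_fraction(scores, fraction)
    return selected

# Hypothetical scores from two feature selection methods on four variables.
method_scores = {
    "method_a": {"v1": 0.90, "v2": 0.10, "v3": 0.30, "v4": 0.20},
    "method_b": {"v1": 0.05, "v2": 0.15, "v3": 0.80, "v4": 0.10},
}
selected = collective_selection(method_scores, fraction=0.25)
print(sorted(selected))
```

Taking the union rather than the intersection keeps a variable that only one method ranks highly, which matches the abstract's point that no single method works best in all scenarios.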

14.
BMC Bioinformatics ; 19(1): 120, 2018 04 04.
Article in English | MEDLINE | ID: mdl-29618318

ABSTRACT

BACKGROUND: Phenome-wide association studies (PheWAS) are a high-throughput approach to evaluate comprehensive associations between genetic variants and a wide range of phenotypic measures. PheWAS has varying sample sizes for quantitative traits, and variable numbers of cases and controls for binary traits across the many phenotypes of interest, which can affect the statistical power to detect associations. The motivation of this study is to investigate the various parameters which affect the estimation of statistical power in PheWAS, including sample size, case-control ratio, minor allele frequency, and disease penetrance. RESULTS: We performed a PheWAS simulation study, where we investigated variations in statistical power based on different parameters, such as overall sample size, number of cases, case-control ratio, minor allele frequency, and disease penetrance. The simulation was performed on both binary and quantitative phenotypic measures. Our simulation on binary traits suggests that the number of cases has more impact on statistical power than the case to control ratio; also, we found that a sample size of 200 cases or more maintains the statistical power to identify associations for common variants. For quantitative traits, a sample size of 1000 or more individuals performed best in the power calculations. We focused on common genetic variants (MAF > 0.01) in this study; however, in future studies, we will be extending this effort to perform similar simulations on rare variants. CONCLUSIONS: This study provides a series of PheWAS simulation analyses that can be used to estimate statistical power for some potential scenarios. These results can be used to provide guidelines for appropriate study design for future PheWAS analyses.
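A simulation-based power estimate of the general kind described here can be sketched as follows. This is a simplified stand-in, not the study's procedure: it swaps in an allelic chi-square test on a 2x2 table and specifies the effect as an odds ratio rather than a penetrance model. All parameter values are hypothetical.

```python
import random

CHI2_CRIT_1DF = 3.841  # chi-square critical value for alpha = 0.05 with 1 degree of freedom

def minor_allele_count(n_people, allele_freq, rng):
    """Number of minor alleles drawn for n_people diploid individuals."""
    return sum(1 for _ in range(2 * n_people) if rng.random() < allele_freq)

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def estimate_power(n_cases, n_controls, maf_controls, odds_ratio, n_sims=200, seed=1):
    """Fraction of simulated studies in which the allelic test is significant."""
    rng = random.Random(seed)
    # Convert the allelic odds ratio into the minor allele frequency among cases.
    odds = odds_ratio * maf_controls / (1 - maf_controls)
    maf_cases = odds / (1 + odds)
    hits = 0
    for _ in range(n_sims):
        case_minor = minor_allele_count(n_cases, maf_cases, rng)
        ctrl_minor = minor_allele_count(n_controls, maf_controls, rng)
        stat = chi2_2x2(case_minor, 2 * n_cases - case_minor,
                        ctrl_minor, 2 * n_controls - ctrl_minor)
        hits += stat > CHI2_CRIT_1DF
    return hits / n_sims

# Hypothetical scenario: same controls and effect size, different numbers of cases.
power_few_cases = estimate_power(20, 1000, maf_controls=0.2, odds_ratio=1.5)
power_many_cases = estimate_power(200, 1000, maf_controls=0.2, odds_ratio=1.5)
print(power_few_cases, power_many_cases)
```

Varying `n_cases` while holding `n_controls` fixed is one way to explore the abstract's observation that the number of cases matters more than the case to control ratio.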


Subjects
Computer Simulation; Disease/genetics; Genetic Association Studies; Genome-Wide Association Study; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Algorithms; Humans
15.
Nat Commun ; 8(1): 1167, 2017 10 27.
Article in English | MEDLINE | ID: mdl-29079728

ABSTRACT

Genome-wide, imputed, sequence, and structural data are now available for exceedingly large sample sizes. The needs for data management, handling population structure and related samples, and performing associations have largely been met. However, the infrastructure to support analyses involving complexity beyond genome-wide association studies is not standardized or centralized. We provide the PLatform for the Analysis, Translation, and Organization of large-scale data (PLATO), a software tool equipped to handle multi-omic data for hundreds of thousands of samples to explore complexity using genetic interactions, environment-wide association studies and gene-environment interactions, phenome-wide association studies, as well as copy number and rare variant analyses. Using the data from the Marshfield Personalized Medicine Research Project, a site in the electronic Medical Records and Genomics Network, we apply each feature of PLATO to type 2 diabetes and demonstrate how PLATO can be used to uncover the complex etiology of common traits.


Subjects
Computational Biology; Genome, Human; Genome-Wide Association Study; Alcohol Drinking; Alleles; Databases, Genetic; Diabetes Mellitus, Type 2/genetics; Diet; Epistasis, Genetic; Gene Deletion; Gene Dosage; Gene-Environment Interaction; Genomics; Genotype; Glutamate Decarboxylase/genetics; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide; Programming Languages; Recurrence; Sequence Analysis, DNA; Software; Surveys and Questionnaires
16.
BioData Min ; 10: 25, 2017.
Article in English | MEDLINE | ID: mdl-28770004

ABSTRACT

BACKGROUND: The genetic etiology of human lipid quantitative traits is not fully elucidated, and interactions between variants may play a role. We performed a gene-centric interaction study for four different lipid traits: low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), total cholesterol (TC), and triglycerides (TG). RESULTS: Our analysis consisted of a discovery phase using a merged dataset of five different cohorts (n = 12,853 to n = 16,849 depending on lipid phenotype) and a replication phase with ten independent cohorts totaling up to 36,938 additional samples. Filters are often applied before interaction testing to correct for the burden of testing all pairwise interactions. We used two different filters: 1. A filter that tested only single nucleotide polymorphisms (SNPs) with a main effect of p < 0.001 in a previous association study. 2. A filter that only tested interactions identified by Biofilter 2.0. Pairwise models that reached an interaction significance level of p < 0.001 in the discovery dataset were tested for replication. We identified thirteen SNP-SNP models that were significant in more than one replication cohort after accounting for multiple testing. CONCLUSIONS: These results may reveal novel insights into the genetic etiology of lipid levels. Furthermore, we developed a pipeline to perform a computationally efficient interaction analysis with multi-cohort replication.
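The first filter mentioned in this abstract (testing only SNPs with a main effect of p < 0.001) shrinks the pairwise search space before interaction testing. A minimal sketch, assuming main-effect p-values are available as a simple mapping; the SNP names and p-values are hypothetical, and Biofilter 2.0 is not modelled.

```python
from itertools import combinations

def candidate_pairs(main_effect_p, threshold=0.001):
    """All SNP pairs in which both SNPs pass the main-effect p-value filter."""
    kept = sorted(snp for snp, p in main_effect_p.items() if p < threshold)
    return list(combinations(kept, 2))

# Hypothetical main-effect p-values from a previous association study.
main_effect_p = {"snp1": 1e-5, "snp2": 0.2, "snp3": 5e-4, "snp4": 9e-4}

pairs = candidate_pairs(main_effect_p)
all_pairs = list(combinations(sorted(main_effect_p), 2))
print(len(pairs), "of", len(all_pairs), "possible pairs retained")
```

Because the number of pairs grows quadratically with the number of SNPs, removing even a modest share of SNPs up front cuts the multiple-testing burden substantially.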

17.
J Am Med Inform Assoc ; 24(3): 577-587, 2017 May 01.
Article in English | MEDLINE | ID: mdl-28040685

ABSTRACT

It is common that cancer patients have different molecular signatures even though they have similar clinical features, such as histology, due to the heterogeneity of tumors. To overcome this variability, we previously developed a new approach incorporating prior biological knowledge that identifies knowledge-driven genomic interactions associated with outcomes of interest. However, no systematic approach has been proposed to identify interaction models between pathways based on multi-omics data. Here we have proposed such a novel methodological framework, called metadimensional knowledge-driven genomic interactions (MKGIs). To test the utility of the proposed framework, we applied it to an ovarian cancer dataset including multi-omics profiles from The Cancer Genome Atlas to predict grade, stage, and survival outcome. We found that each knowledge-driven genomic interaction model, based on different genomic datasets, contains different sets of pathway features, which suggests that each genomic data type may contribute to outcomes in ovarian cancer via a different pathway. In addition, MKGI models significantly outperformed the single knowledge-driven genomic interaction model. From the MKGI models, many interactions between pathways associated with outcomes were found, including the mitogen-activated protein kinase (MAPK) signaling pathway and the gonadotropin-releasing hormone (GnRH) signaling pathway, which are known to play important roles in cancer pathogenesis. The beauty of incorporating biological knowledge into the model based on multi-omics data is the ability to improve diagnosis and prognosis and provide better interpretability. Thus, determining variability in molecular signatures based on these interactions between pathways may lead to better diagnostic/treatment strategies for better precision medicine.


Subjects
Genomics/methods; Models, Genetic; Ovarian Neoplasms/genetics; Adult; Aged; Aged, 80 and over; Datasets as Topic; Female; Gene Expression; Humans; Middle Aged; Ovarian Neoplasms/diagnosis; Prognosis
18.
BioData Min ; 9: 18, 2016.
Article in English | MEDLINE | ID: mdl-27168765

ABSTRACT

BACKGROUND: The future of medicine is moving towards the phase of precision medicine, with the goal to prevent and treat diseases by taking inter-individual variability into account. A large part of the variability lies in our genetic makeup. With the fast-paced improvement of high-throughput methods for genome sequencing, a tremendous amount of genetic data has already been generated. The next hurdle for precision medicine is to have sufficient computational tools for analyzing large sets of data. Genome-Wide Association Studies (GWAS) have been the primary method to assess the relationship between single nucleotide polymorphisms (SNPs) and disease traits. While GWAS is sufficient in finding individual SNPs with strong main effects, it does not capture potential interactions among multiple SNPs. In many traits, a large proportion of variation remains unexplained by using main effects alone, leaving the door open for exploring the role of genetic interactions. However, identifying genetic interactions in large-scale genomics data poses a challenge even for modern computing. RESULTS: For this study, we present a new algorithm, Grammatical Evolution Bayesian Network (GEBN), that utilizes Bayesian Networks to identify interactions in the data and, at the same time, uses an evolutionary algorithm to reduce the computational cost associated with network optimization. GEBN excelled in simulation studies where the data contained main effects and interaction effects. We also applied GEBN to a Type 2 diabetes (T2D) dataset obtained from the Marshfield Personalized Medicine Research Project (PMRP). We were able to identify genetic interactions for T2D cases and controls and use information from those interactions to classify T2D samples. We obtained an average testing area under the curve (AUC) of 86.8%. We also identified several interacting genes such as INADL and LPP that are known to be associated with T2D. CONCLUSIONS: Developing the computational tools to explore genetic associations beyond main effects remains a critically important challenge in human genetics. Methods, such as GEBN, demonstrate the utility of considering genetic interactions, as they likely explain some of the missing heritability.
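
For readers unfamiliar with the AUC metric reported above, the sketch below computes it for a binary classifier as the fraction of case-control pairs in which the case receives the higher score, with ties counted as one half. The labels and scores are hypothetical toy values; this illustrates the metric only and is not part of GEBN.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case ranked above control
            elif case_score == control_score:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Hypothetical test-set labels (1 = case, 0 = control) and classifier scores.
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(example_auc)  # 3 of the 4 case-control pairs are ranked correctly: 0.75
```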

19.
Bioinformatics ; 32(15): 2361-3, 2016 Aug 01.
Article in English | MEDLINE | ID: mdl-27153576

ABSTRACT

MOTIVATION: We present an update to the pathway enrichment analysis tool 'Pathway Analysis by Randomization Incorporating Structure (PARIS)' that determines aggregated association signals generated from genome-wide association study results. Pathway-based analyses highlight biological pathways associated with phenotypes. PARIS uses a unique permutation strategy to evaluate the genomic structure of interrogated pathways, through permutation testing of genomic features, thus eliminating many of the over-testing concerns arising with other pathway analysis approaches. RESULTS: We have updated PARIS to incorporate expanded pathway definitions through the incorporation of new expert knowledge from multiple database sources, through customized user provided pathways, and other improvements in user flexibility and functionality. AVAILABILITY AND IMPLEMENTATION: PARIS is freely available to all users at https://ritchielab.psu.edu/software/paris-download CONTACT: jnc43@case.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
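
The general idea of a permutation-based pathway test can be sketched as below. This is a deliberately simplified illustration and not the PARIS algorithm: the abstract says PARIS permutes genomic features in a way that respects genomic structure, whereas this sketch simply draws random feature sets of the same size. All feature names, p-values, and the 0.05 cutoff are assumptions made for the example.

```python
import random


def pathway_statistic(features, p_values, alpha=0.05):
    """Count features in the set whose association p-value is below alpha."""
    return sum(1 for f in features if p_values[f] < alpha)


def permutation_p_value(pathway, p_values, n_perm=1000, seed=0):
    """Empirical p-value: how often a random feature set of the same size
    has at least as many significant features as the real pathway."""
    rng = random.Random(seed)
    all_features = sorted(p_values)
    observed = pathway_statistic(pathway, p_values)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        sample = rng.sample(all_features, len(pathway))
        if pathway_statistic(sample, p_values) >= observed:
            at_least_as_extreme += 1
    # The +1 terms keep the estimate strictly above zero.
    return (1 + at_least_as_extreme) / (1 + n_perm)


# Hypothetical genome of 200 features; only the first 10 are significant.
p_values = {f"feat{i}": (0.001 if i < 10 else 0.5) for i in range(200)}
enriched = [f"feat{i}" for i in range(10)]         # all significant
null_like = [f"feat{i}" for i in range(100, 110)]  # none significant

p_enriched = permutation_p_value(enriched, p_values)
p_null = permutation_p_value(null_like, p_values)
print(p_enriched, p_null)
```

Because the null distribution is built from the data themselves, a pathway is judged against feature sets of comparable size rather than against a fixed per-feature threshold.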


Subjects
Databases, Factual; Genome-Wide Association Study; Genomics; Humans; Software
20.
Neurobiol Aging ; 38: 141-150, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26827652

ABSTRACT

Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis.


Subjects
Alzheimer Disease/genetics; Datasets as Topic; Epistasis, Genetic/genetics; Genetic Association Studies; ATP Binding Cassette Transporter, Subfamily B/genetics; Cadherin Related Proteins; Cadherins/genetics; Calcium Channels, L-Type/genetics; Disease Progression; Female; Humans; Male; Models, Genetic; Phosphatidylethanolamine Binding Protein/genetics; Polymorphism, Single Nucleotide; Receptors, Adrenergic, alpha-1/genetics; Receptors, N-Methyl-D-Aspartate/genetics; Risk; Ryanodine Receptor Calcium Release Channel/genetics; Saposins/genetics; Sirtuin 1/genetics