Results 1 - 5 of 5
1.
Curr Protoc ; 2(11): e603, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36441943

ABSTRACT

Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of many complex diseases. Regardless of the context, the practical utility of this information ultimately depends upon the quality of the data used for statistical analyses. Quality control (QC) procedures for GWAS are constantly evolving. Here, we enumerate some of the challenges in QC of genotyped GWAS data and describe the approaches involving genotype imputation of a sample dataset along with post-imputation quality assurance, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of the GWAS data (genotyped and imputed), including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We provide detailed guidelines along with a sample dataset to suggest current best practices and discuss areas of ongoing and future research. © 2022 Wiley Periodicals LLC.
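The marker-quality step mentioned above can be illustrated with a minimal sketch. It assumes genotypes coded as 0, 1, or 2 copies of an allele with `None` for a missing call; the thresholds (`min_call_rate`, `min_maf`) and the marker names are hypothetical and are not taken from the protocol itself.

```python
from typing import Dict, List, Optional, Sequence

# Assumed coding: 0/1/2 copies of the counted allele, None = missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing call at one marker."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the less common allele among called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    allele_freq = sum(called) / (2 * len(called))
    return min(allele_freq, 1 - allele_freq)


def passes_marker_qc(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.98,  # illustrative threshold (assumption)
    min_maf: float = 0.01,        # illustrative threshold (assumption)
) -> bool:
    """Keep a marker only if it is well called and not near-monomorphic."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )


# Hypothetical toy data: three markers across ten samples.
markers: Dict[str, List[Genotype]] = {
    "snp_a": [0, 1, 2, 1, 0, 1, 0, 2, 1, 0],           # well called, common
    "snp_b": [0, 0, None, None, 0, 1, 0, 0, None, 0],  # many missing calls
    "snp_c": [0] * 10,                                 # monomorphic
}

kept = [name for name, g in markers.items() if passes_marker_qc(g)]
print(kept)
```

Real pipelines typically add further filters (for example Hardy-Weinberg tests and sample-level checks), which this sketch leaves out.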


Subjects
Genome-Wide Association Study , Research Design , Humans , Quality Control , Genotype , Sex Chromosome Aberrations
2.
Spine (Phila Pa 1976) ; 46(11): E625-E631, 2021 Jun 01.
Article in English | MEDLINE | ID: mdl-33332786

ABSTRACT

STUDY DESIGN: A case-control genome-wide association study (GWAS) on spondylosis. OBJECTIVE: Leveraging the multimodal dataset of Geisinger's MyCode initiative, we aimed to identify genetic associations with degenerative spine disease. SUMMARY OF BACKGROUND DATA: Degenerative spine conditions are a leading cause of global disability; however, the genetic underpinnings of these conditions remain under-investigated. Previous studies using a candidate-gene approach suggest a genetic risk for degenerative spine conditions, but large-scale GWASs are lacking. METHODS: Using ICD diagnosis codes, we identified 4434 patients with a diagnosis of spondylosis and available genotype data. We identified a population-based control group of 12,522 patients who did not have any diagnosis of osteoarthritis. A linear mixed, additive genetic model was employed to perform the genetic association tests, adjusting for age, sex, and genetic principal components to account for population structure and relatedness. Gene-based association tests were performed, and heritability and genetic correlations with other traits were investigated. RESULTS: We identified a genome-wide significant locus at rs12190551 (odds ratio = 1.034, 95% confidence interval 1.022-1.046, P = 8.5 × 10^-9, minor allele frequency = 36.9%) located in an intron of BMP6. Additionally, NIPAL1 and CNGA1 achieved Bonferroni significance in the gene-based association tests. The estimated heritability was 7.19%. Furthermore, significant genetic correlations with pain, depression, lumbar spine bone mineral density, and osteoarthritis were identified. CONCLUSION: We demonstrated the use of a massive database of genotypes combined with electronic health record data to identify a novel and significant association with spondylosis. We also identified significant genetic correlations with pain, depression, bone mineral density, and osteoarthritis, suggesting shared genetic etiology and molecular pathways with these phenotypes. Level of Evidence: N/A.


Subjects
Bone Morphogenetic Protein 6/genetics , Cation Transport Proteins/genetics , Cyclic Nucleotide-Gated Cation Channels/genetics , Spondylosis , Case-Control Studies , Female , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Humans , Male , Spondylosis/epidemiology , Spondylosis/genetics
3.
J Clin Med ; 10(1), 2020 Dec 30.
Article in English | MEDLINE | ID: mdl-33396741

ABSTRACT

BACKGROUND: The imputation of missing values is a key step in Electronic Health Records (EHR) mining, as it can significantly affect the conclusions derived from the downstream analysis in translational medicine. Laboratory values in EHR are not missing at random, yet imputation techniques tend to disregard this key distinction. Consequently, the development of an adaptive imputation strategy designed specifically for EHR is an important step in improving the data imbalance and enhancing the predictive power of modeling tools for healthcare applications. METHOD: We analyzed the laboratory measures derived from Geisinger's EHR on patients in three distinct cohorts: patients tested for Clostridioides difficile (Cdiff) infection, patients with a diagnosis of inflammatory bowel disease (IBD), and patients with a diagnosis of hip or knee osteoarthritis (OA). We extracted Logical Observation Identifiers Names and Codes (LOINC) codes and excluded those with 75% or more missingness. The comorbidities, primary or secondary diagnoses, and active problem lists were also extracted. The adaptive imputation strategy was designed based on a hybrid approach. The comorbidity patterns of patients were transformed into latent patterns and then clustered. Imputation was performed on a cluster of patients for each cohort independently to show the generalizability of the method. The results were compared with imputation applied to the complete dataset without incorporating the information from comorbidity patterns. RESULTS: We analyzed a total of 67,445 patients (11,230 IBD patients, 10,000 OA patients, and 46,215 patients tested for C. difficile infection). We extracted 495 LOINC codes and 11,230 diagnosis codes for the IBD cohort, 8160 diagnosis codes for the Cdiff cohort, and 2042 diagnosis codes for the OA cohort based on the primary/secondary diagnosis and active problem list in the EHR. Overall, the greatest improvement from this strategy was observed when the laboratory measures had a higher level of missingness. The best root mean square error (RMSE) difference for each dataset was recorded as -35.5 for the Cdiff, -8.3 for the IBD, and -11.3 for the OA dataset. CONCLUSIONS: An adaptive imputation strategy designed specifically for EHR that uses complementary information from the clinical profile of the patient can be used to improve the imputation of missing laboratory values, especially when laboratory codes with high levels of missingness are included in the analysis.
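The comparison behind the reported RMSE differences can be sketched roughly as follows. This is only an illustration under stated assumptions: simple mean imputation stands in for whatever imputation method the study used, the cluster labels and lab values are hypothetical, and the difference is taken as cluster-wise RMSE minus whole-dataset RMSE, so that a negative value means the cluster-wise strategy did better.

```python
from math import sqrt
from statistics import mean


def rmse(predicted, actual):
    """Root mean square error between imputed and true values."""
    return sqrt(mean((p - a) ** 2 for p, a in zip(predicted, actual)))


# Hypothetical observed lab values, tagged with a comorbidity cluster label.
observed = [
    ("A", 10.0), ("A", 12.0), ("A", 11.0),
    ("B", 30.0), ("B", 32.0), ("B", 31.0),
]

# Hypothetical held-out values, masked to simulate missingness.
held_out = [("A", 11.5), ("B", 30.5)]
truth = [value for _, value in held_out]

# Baseline: impute every masked value with the mean of the whole dataset.
global_mean = mean(value for _, value in observed)
global_pred = [global_mean for _ in held_out]

# Cluster-wise: impute with the mean of the patient's own cluster.
clusters = {label for label, _ in observed}
cluster_means = {
    c: mean(value for label, value in observed if label == c) for c in clusters
}
cluster_pred = [cluster_means[label] for label, _ in held_out]

rmse_global = rmse(global_pred, truth)
rmse_cluster = rmse(cluster_pred, truth)

# Assumed sign convention: negative means cluster-wise imputation is better.
rmse_difference = rmse_cluster - rmse_global
print(rmse_global, rmse_cluster, rmse_difference)
```

In this toy data the two clusters have very different typical values, which is the situation where using cluster information helps most.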

4.
Pac Symp Biocomput ; 23: 365-376, 2018.
Article in English | MEDLINE | ID: mdl-29218897

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disorder with few biomarkers, even though it affects a relatively large portion of the population and is predicted to affect significantly more individuals in the future. Neuroimaging has been used in concert with genetic information to improve our understanding of how AD arises and how it can potentially be diagnosed. Additionally, evidence suggests synonymous variants can have a functional impact on gene regulatory mechanisms, including those related to AD. Some synonymous codons are preferred over others, leading to a codon bias. The bias can arise with respect to codons that are more or less frequently used in the genome. A bias can also result from optimal and non-optimal codons, which have stronger and weaker codon-anticodon interactions, respectively. Although association tests have been used before to identify genes associated with AD, it remains unclear what role codon bias plays and whether it can improve rare variant analysis. In this work, rare variants from whole-genome sequencing in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort were binned into genes using BioBin. An association analysis of the genes with an AD-related neuroimaging biomarker was performed using SKAT-O. While we did not identify any genome-wide significant associations using all synonymous variants, using only synonymous variants that affected codon frequency we identified several genes as significantly associated with the imaging phenotype. Additionally, significant associations were found using only rare variants that contain an optimal codon in the minor allele and a non-optimal codon in the major allele. These results suggest that codon bias may play a role in AD and that it can be used to improve detection power in rare variant association analysis.
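The idea of binning rare variants into genes can be pictured with a small sketch. It is not BioBin's implementation or interface; the variant records, gene names, and the `maf_threshold` default are hypothetical and serve only to show the grouping step that precedes a gene-level test.

```python
from collections import defaultdict
from typing import Dict, List


def bin_rare_variants(
    variants: List[dict],
    maf_threshold: float = 0.01,  # illustrative rarity cutoff (assumption)
) -> Dict[str, List[str]]:
    """Group variant IDs by gene, keeping only variants below the cutoff."""
    bins: Dict[str, List[str]] = defaultdict(list)
    for variant in variants:
        if variant["maf"] < maf_threshold:
            bins[variant["gene"]].append(variant["id"])
    return dict(bins)


# Hypothetical annotated variants.
variants = [
    {"id": "var1", "gene": "GENE_A", "maf": 0.004},
    {"id": "var2", "gene": "GENE_A", "maf": 0.200},  # common, excluded
    {"id": "var3", "gene": "GENE_B", "maf": 0.009},
    {"id": "var4", "gene": "GENE_A", "maf": 0.001},
]

gene_bins = bin_rare_variants(variants)
print(gene_bins)
```

Each resulting bin would then be passed to a gene-level association test; restricting the input list (for example to synonymous variants that change codon frequency) changes which variants end up in each bin.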


Subjects
Alzheimer Disease/diagnostic imaging , Alzheimer Disease/genetics , Codon/genetics , Genetic Variation , Aged , Aged, 80 and over , Biomarkers , Computational Biology/methods , Female , Genetic Association Studies , Humans , Magnetic Resonance Imaging , Male , Neuroimaging , Polymorphism, Single Nucleotide , Whole Genome Sequencing
5.
BMC Med Genomics ; 11(Suppl 3): 76, 2018 Sep 14.
Article in English | MEDLINE | ID: mdl-30255815

ABSTRACT

BACKGROUND: Alzheimer's disease (AD) is one of the most common neurodegenerative diseases and causes problems related to brain function. To some extent it is understood on a molecular level how AD arises; however, there is a lack of biomarkers that can be used for early diagnosis. Two popular methods to identify AD-related biomarkers use genetics and neuroimaging. Genes and neuroimaging phenotypes have provided some insights into potential AD biomarkers. While the field of imaging-genomics has identified genetic features associated with structural and functional neuroimaging phenotypes, it remains unclear how variants that affect splicing could be important for understanding the genetic etiology of AD. METHODS: In this study, rare variants (minor allele frequency < 0.01) in splicing regulatory element (SRE) loci from whole genome sequencing (WGS) in the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort were used to identify genes that are associated with global brain cortical glucose metabolism in AD measured by FDG-PET scans. Gene-based association analyses of rare variants were performed using the program BioBin and the optimal Sequence Kernel Association Test (SKAT-O). RESULTS: The gene EXOC3L4 was identified as significantly associated with global cortical glucose metabolism (false discovery rate (FDR) corrected p < 0.05) using SRE coding variants only. Three loci that may affect splicing within EXOC3L4 contribute to the association. CONCLUSION: Based on sequence homology, EXOC3L4 is likely a part of the exocyst complex. Our results suggest the possibility that variants which affect proper splicing of EXOC3L4 via SREs may impact vesicle transport, giving rise to AD-related phenotypes. Overall, by utilizing WGS and functional neuroimaging we have identified a gene significantly associated with an AD-related endophenotype, potentially through a mechanism that involves splicing.
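The FDR-corrected threshold mentioned in the results can be illustrated with a short sketch. The abstract does not say which FDR procedure was applied; the Benjamini-Hochberg adjustment below is an assumption chosen because it is the most common one, and the gene names and p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene-level p-values from a gene-based association test.
gene_p = {"GENE_A": 0.0001, "GENE_B": 0.02, "GENE_C": 0.04, "GENE_D": 0.5}

adjusted = dict(zip(gene_p, benjamini_hochberg(list(gene_p.values()))))
significant = [gene for gene, q in adjusted.items() if q < 0.05]
print(adjusted)
print(significant)
```

Note that GENE_C has a raw p-value below 0.05 but does not survive the adjustment, which is the practical effect of correcting across all tested genes.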


Subjects
Alzheimer Disease/genetics , Alzheimer Disease/metabolism , Brain/metabolism , Glucose/metabolism , Polymorphism, Single Nucleotide , RNA Splicing/genetics , Regulatory Sequences, Nucleic Acid , Vesicular Transport Proteins/genetics , Alzheimer Disease/diagnostic imaging , Biomarkers/analysis , Brain/pathology , Cohort Studies , Computational Biology/methods , Humans , Neuroimaging/methods , Phenotype , Whole Genome Sequencing/methods