1.
PLoS Genet ; 20(3): e1011192, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38517939

ABSTRACT

The HostSeq initiative recruited 10,059 Canadians infected with SARS-CoV-2 between March 2020 and March 2023, obtained clinical information on their disease experience and whole genome sequenced (WGS) their DNA. We analyzed the WGS data for genetic contributors to severe COVID-19 (considering 3,499 hospitalized cases and 4,975 non-hospitalized after quality control). We investigated the evidence for replication of loci reported by the International Host Genetics Initiative (HGI); analyzed the X chromosome; conducted rare variant gene-based analysis and polygenic risk score testing. Population stratification was adjusted for using meta-analysis across ancestry groups. We replicated two loci identified by the HGI for COVID-19 severity: the LZTFL1/SLC6A20 locus on chromosome 3 and the FOXP4 locus on chromosome 6 (the latter with a variant significant at P < 5E-8). We found novel significant associations with MRAS and WDR89 in gene-based analyses, and constructed a polygenic risk score that explained 1.01% of the variance in severe COVID-19. This study provides independent evidence confirming the robustness of previously identified COVID-19 severity loci by the HGI and identifies novel genes for further investigation.


Subjects
COVID-19; North American People; Humans; COVID-19/genetics; SARS-CoV-2/genetics; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide; Canada/epidemiology; Genome-Wide Association Study; Membrane Transport Proteins; Forkhead Transcription Factors
2.
Stat Med ; 42(13): 2134-2161, 2023 06 15.
Article in English | MEDLINE | ID: mdl-36964996

ABSTRACT

INTRODUCTION: When a study sample includes a large proportion of long-term survivors, mixture cure (MC) models that separately assess biomarker associations with long-term recurrence-free survival and time to disease recurrence are preferred to proportional-hazards models. However, in samples with few recurrences, standard maximum likelihood can be biased. OBJECTIVE AND METHODS: We extend Firth-type penalized likelihood (FT-PL) developed for bias reduction in the exponential family to the Weibull-logistic MC, using the Jeffreys invariant prior. Via simulation studies based on a motivating cohort study, we compare parameter estimates of the FT-PL method to those by ML, as well as type 1 error (T1E) and power obtained using likelihood ratio statistics. RESULTS: In samples with relatively few events, the Firth-type penalized likelihood estimates (FT-PLEs) have mean bias closer to zero and smaller mean squared error than maximum likelihood estimates (MLEs), and can be obtained in samples where the MLEs are infinite. Under similar T1E rates, FT-PL consistently exhibits higher statistical power than ML in samples with few events. In addition, we compare FT-PL estimation with two other penalization methods (a log-F prior method and a modified Firth-type method) based on the same simulations. DISCUSSION: Consistent with findings for logistic and Cox regressions, FT-PL under MC regression yields finite estimates under stringent conditions, and better bias-and-variance balance than the other two penalizations. The practicality and strength of FT-PL for MC analysis are illustrated in a cohort study of breast cancer prognosis with long-term follow-up for recurrence-free survival.


Subjects
Neoplasm Recurrence, Local; Humans; Cohort Studies; Likelihood Functions; Computer Simulation; Proportional Hazards Models
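
For readers unfamiliar with the terms in the abstract above, the general form of a mixture cure model and of a Firth-type penalty can be written out as follows. This is a sketch under assumptions: the symbols (pi, gamma, beta, lambda, alpha) and the proportional-hazards Weibull parametrization are chosen here for illustration and may differ from the notation used in the article. The penalty term itself, half the log-determinant of the Fisher information I(theta), is the standard form that corresponds to the Jeffreys invariant prior.

```latex
% Mixture cure model: \pi(z) is the probability of being susceptible (not cured).
% The logistic and Weibull forms below are an assumed parametrization.
S(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x), \qquad
\pi(z) = \frac{\exp(z^\top \gamma)}{1 + \exp(z^\top \gamma)}, \qquad
S_u(t \mid x) = \exp\!\left\{ -(\lambda t)^{\alpha} \exp(x^\top \beta) \right\}

% Firth-type penalized log-likelihood (Jeffreys invariant prior as the penalty).
\ell^{*}(\theta) = \ell(\theta) + \tfrac{1}{2} \log \det I(\theta), \qquad
\theta = (\gamma, \beta, \lambda, \alpha)
```

Because the penalty pulls estimates away from the boundary of the parameter space, the maximizer of the penalized log-likelihood can remain finite in samples where the unpenalized maximum likelihood estimate diverges, which matches the behaviour the abstract reports.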
3.
Genet Epidemiol ; 44(4): 368-381, 2020 06.
Article in English | MEDLINE | ID: mdl-32237178

ABSTRACT

Next generation sequencing technologies have made it possible to investigate the role of rare variants (RVs) in disease etiology. Because RVs associated with disease susceptibility tend to be enriched in families with affected individuals, study designs based on affected sib pairs (ASP) can be more powerful than case-control studies. We construct tests of RV-set association in ASPs for single genomic regions as well as for multiple regions. Single-region tests can efficiently detect a gene region harboring susceptibility variants, while multiple-region extensions are meant to capture signals dispersed across a biological pathway, potentially as a result of locus heterogeneity. Within ascertained ASPs, the test statistics contrast the frequencies of duplicate rare alleles (usually appearing on a shared haplotype) against frequencies of a single rare allele copy (appearing on a nonshared haplotype); we call these allelic parity tests. Incorporation of minor allele frequency estimates from reference populations can markedly improve test efficiency. Under various genetic penetrance models, application of the tests in simulated ASP data sets demonstrates good type I error properties as well as power gains over approaches that regress ASP rare allele counts on sharing state, especially in small samples. We discuss robustness of the allelic parity methods to the presence of genetic linkage, misspecification of reference population allele frequencies, sequencing error and de novo mutations, and population stratification. As proof of principle, we apply single- and multiple-region tests in a motivating study data set consisting of whole exome sequencing of sisters ascertained with early onset breast cancer.


Subjects
Genetic Variation; Models, Genetic; Alleles; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Chromosomes, Human, Pair 1; Female; Gene Frequency; Genetic Heterogeneity; Genetic Linkage; Haplotypes; High-Throughput Nucleotide Sequencing/methods; Humans; Proportional Hazards Models
4.
Biostatistics ; 21(3): 518-530, 2020 07 01.
Article in English | MEDLINE | ID: mdl-30590388

ABSTRACT

In this work, we propose a single nucleotide polymorphism set association test for survival phenotypes in the presence of a non-susceptible fraction. We consider a mixture model with a logistic regression for the susceptibility indicator and a proportional hazards regression to model survival in the susceptible group. We propose a joint test to assess the significance of the genetic variant in both logistic and survival regressions simultaneously. We adopt the spirit of SKAT and conduct a variance-component test treating the genetic effects of multiple variants as random. We derive score-type test statistics, and we investigate several approaches to compute their p-values. The finite-sample properties of the proposed tests are assessed and compared to existing approaches by simulations and their use is illustrated through an application to ovarian cancer data from the Consortium of Investigators of Modifiers of BRCA1 and BRCA2.


Subjects
Disease Susceptibility; Models, Genetic; Models, Statistical; Survival Analysis; BRCA2 Protein/genetics; Female; Humans; Ovarian Neoplasms/genetics; Ovarian Neoplasms/mortality; Polymorphism, Single Nucleotide; Ubiquitin-Protein Ligases/genetics
5.
Stat Med ; 40(30): 6792-6817, 2021 12 30.
Article in English | MEDLINE | ID: mdl-34596256

ABSTRACT

Post-GWAS analysis, in many cases, focuses on fine-mapping targeted genetic regions discovered at GWAS-stage; that is, the aim is to pinpoint potential causal variants and susceptibility genes for complex traits and disease outcomes using next-generation sequencing (NGS) technologies. Large-scale GWAS cohorts are necessary to identify target regions given the typically modest genetic effect sizes. In this context, two-phase sampling design and analysis is a cost-reduction technique that utilizes data collected during phase 1 GWAS to select an informative subsample for phase 2 sequencing. The main goal is to make inference for genetic variants measured via NGS by efficiently combining data from phases 1 and 2. We propose two approaches for selecting a phase 2 design under a budget constraint. The first method identifies sampling fractions that select a phase 2 design yielding an asymptotic variance covariance matrix with certain optimal characteristics, for example, smallest trace, via Lagrange multipliers (LM). The second relies on a genetic algorithm (GA) with a defined fitness function to identify exactly a phase 2 subsample. We perform comprehensive simulation studies to evaluate the empirical properties of the proposed designs for a genetic association study of a quantitative trait. We compare our methods against two ranked designs: residual-dependent sampling and a recently identified optimal design. Our findings demonstrate that the proposed designs, GA in particular, can render competitive power in combined phase 1 and 2 analysis compared with alternative designs while preserving type 1 error control. These results are especially evident under the more practical scenario where design values need to be defined a priori and are subject to misspecification. We illustrate the proposed methods in a study of triglyceride levels in the North Finland Birth Cohort of 1966. R code to reproduce our results is available at github.com/egosv/TwoPhase_postGWAS.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Genetic Association Studies; Genotype; Humans; Phenotype
6.
Bioinformatics ; 35(21): 4419-4421, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31070701

ABSTRACT

SUMMARY: For the analysis of high-throughput genomic data produced by next-generation sequencing (NGS) technologies, researchers need to identify linkage disequilibrium (LD) structure in the genome. In this work, we developed an R package gpart which provides clustering algorithms to define LD blocks or analysis units consisting of SNPs. The visualization tool in gpart can display the LD structure and gene positions for up to 20 000 SNPs in one image. The gpart functions facilitate construction of LD blocks and SNP partitions for vast amounts of genome sequencing data within reasonable time and memory limits in personal computing environments. AVAILABILITY AND IMPLEMENTATION: The R package is available at https://bioconductor.org/packages/gpart. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome, Human; Polymorphism, Single Nucleotide; Haplotypes; Humans; Linkage Disequilibrium; Software
7.
J Am Soc Nephrol ; 30(10): 2000-2016, 2019 10.
Article in English | MEDLINE | ID: mdl-31537649

ABSTRACT

BACKGROUND: Although diabetic kidney disease demonstrates both familial clustering and single nucleotide polymorphism heritability, the specific genetic factors influencing risk remain largely unknown. METHODS: To identify genetic variants predisposing to diabetic kidney disease, we performed genome-wide association study (GWAS) analyses. Through collaboration with the Diabetes Nephropathy Collaborative Research Initiative, we assembled a large collection of type 1 diabetes cohorts with harmonized diabetic kidney disease phenotypes. We used a spectrum of ten diabetic kidney disease definitions based on albuminuria and renal function. RESULTS: Our GWAS meta-analysis included association results for up to 19,406 individuals of European descent with type 1 diabetes. We identified 16 genome-wide significant risk loci. The variant with the strongest association (rs55703767) is a common missense mutation in the collagen type IV alpha 3 chain (COL4A3) gene, which encodes a major structural component of the glomerular basement membrane (GBM). Mutations in COL4A3 are implicated in heritable nephropathies, including the progressive inherited nephropathy Alport syndrome. The rs55703767 minor allele (Asp326Tyr) is protective against several definitions of diabetic kidney disease, including albuminuria and ESKD, and demonstrated a significant association with GBM width; protective allele carriers had thinner GBM before any signs of kidney disease, and its effect was dependent on glycemia. Three other loci are in or near genes with known or suggestive involvement in this condition (BMP7) or renal biology (COLEC11 and DDR1). CONCLUSIONS: The 16 diabetic kidney disease-associated loci may provide novel insights into the pathogenesis of this condition and help identify potential biologic targets for prevention and treatment.


Subjects
Autoantigens/genetics; Collagen Type IV/genetics; Diabetes Mellitus, Type 1/genetics; Diabetic Nephropathies/genetics; Genome-Wide Association Study; Glomerular Basement Membrane; Mutation; Cohort Studies; Female; Humans; Male
8.
Genet Epidemiol ; 42(1): 104-116, 2018 02.
Article in English | MEDLINE | ID: mdl-29239496

ABSTRACT

We evaluate two-phase designs to follow-up findings from genome-wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation-maximization-based inference under a semiparametric maximum likelihood formulation tailored for post-GWAS inference. A GWAS-SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT-SNP-dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme-QT strata yields significant power improvements compared to marginal QT- or SNP-based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure.


Subjects
Genome-Wide Association Study; Genotype; Likelihood Functions; Quantitative Trait, Heritable; Sequence Analysis, DNA; Algorithms; Blood Pressure/genetics; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide/genetics
9.
Bioinformatics ; 34(3): 388-397, 2018 02 01.
Article in English | MEDLINE | ID: mdl-29028986

ABSTRACT

Motivation: Linkage disequilibrium (LD) block construction is required for research in population genetics and genetic epidemiology, including specification of sets of single nucleotide polymorphisms (SNPs) for analysis of multi-SNP based association and identification of haplotype blocks in high density sequencing data. Existing methods based on a narrow sense definition do not allow intermediate regions of low LD between strongly associated SNP pairs and tend to split high density SNP data into small blocks having high between-block correlation. Results: We present Big-LD, a block partition method based on interval graph modeling of LD bins which are clusters of strong pairwise LD SNPs, not necessarily physically consecutive. Big-LD uses an agglomerative approach that starts by identifying small communities of SNPs, i.e. the SNPs in each LD bin region, and proceeds by merging these communities. We determine the number of blocks using a method to find maximum-weight independent set. Big-LD produces larger LD blocks compared to existing methods such as MATILDE, Haploview, MIG++, or S-MIG++ and the LD blocks better agree with recombination hotspot locations determined by sperm-typing experiments. The observed average runtime of Big-LD for 13 288 240 non-monomorphic SNPs from 1000 Genomes Project autosome data (286 East Asians) is about 5.83 h, which is a significant improvement over the existing methods. Availability and implementation: Source code and documentation are available for download at http://github.com/sunnyeesl/BigLD. Contact: yyoo@snu.ac.kr. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Genetics, Population/methods; Genome, Human; Haplotypes; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Algorithms; Asian People/genetics; Humans; Linkage Disequilibrium; Models, Genetic
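
The abstract above builds its blocks from "strong pairwise LD" between SNPs. As a minimal sketch of what such a pairwise measure can look like, the snippet below computes the common r² statistic for two biallelic SNPs from phased haplotypes. It does not reproduce Big-LD's interval-graph clustering, and it is an assumption that r² (rather than another LD measure) is the relevant statistic; the function name `ld_r_squared` and the 0/1 allele coding are hypothetical choices made here.

```python
def ld_r_squared(haplotypes):
    """Pairwise LD (r^2) between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased haplotype,
    with each allele coded 0 or 1 (illustrative coding, assumed here).
    """
    n = len(haplotypes)
    # Allele frequencies of the "1" allele at each SNP.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    # Frequency of haplotypes carrying the "1" allele at both SNPs.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D measures departure from independence of the two loci.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom
```

For example, four haplotypes in which the two alleles always travel together, `[(0, 0), (1, 1), (0, 0), (1, 1)]`, are in complete LD, while a set containing each of the four allele combinations once shows none.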
10.
Diabetologia ; 61(5): 1098-1111, 2018 05.
Article in English | MEDLINE | ID: mdl-29404672

ABSTRACT

AIMS/HYPOTHESIS: The aim of this study was to identify genetic variants associated with beta cell function in type 1 diabetes, as measured by serum C-peptide levels, through meta-genome-wide association studies (meta-GWAS). METHODS: We performed a meta-GWAS to combine the results from five studies in type 1 diabetes with cross-sectionally measured stimulated, fasting or random C-peptide levels, including 3479 European participants. The p values across studies were combined, taking into account sample size and direction of effect. We also performed separate meta-GWAS for stimulated (n = 1303), fasting (n = 2019) and random (n = 1497) C-peptide levels. RESULTS: In the meta-GWAS for stimulated/fasting/random C-peptide levels, a SNP on chromosome 1, rs559047 (Chr1:238753916, T>A, minor allele frequency [MAF] 0.24-0.26), was associated with C-peptide (p = 4.13 × 10⁻⁸), meeting the genome-wide significance threshold (p < 5 × 10⁻⁸). In the same meta-GWAS, a locus in the MHC region (rs9260151) was close to the genome-wide significance threshold (Chr6:29911030, C>T, MAF 0.07-0.10, p = 8.43 × 10⁻⁸). In the stimulated C-peptide meta-GWAS, rs61211515 (Chr6:30100975, T/-, MAF 0.17-0.19) in the MHC region was associated with stimulated C-peptide (β [SE] = -0.39 [0.07], p = 9.72 × 10⁻⁸). rs61211515 was also associated with the rate of stimulated C-peptide decline over time in a subset of individuals (n = 258) with annual repeated measures for up to 6 years (p = 0.02). In the meta-GWAS of random C-peptide, another MHC region, SNP rs3135002 (Chr6:32668439, C>A, MAF 0.02-0.06), was associated with C-peptide (p = 3.49 × 10⁻⁸). Conditional analyses suggested that the three identified variants in the MHC region were independent of each other. rs9260151 and rs3135002 have been associated with type 1 diabetes, whereas rs559047 and rs61211515 have not been associated with a risk of developing type 1 diabetes.
CONCLUSIONS/INTERPRETATION: We identified a locus on chromosome 1 and multiple variants in the MHC region, at least some of which were distinct from type 1 diabetes risk loci, that were associated with C-peptide, suggesting partly non-overlapping mechanisms for the development and progression of type 1 diabetes. These associations need to be validated in independent populations. Further investigations could provide insights into mechanisms of beta cell loss and opportunities to preserve beta cell function.


Subjects
C-Peptide/blood; Chromosomes, Human, Pair 1/genetics; Diabetes Mellitus, Type 1/genetics; Genome-Wide Association Study; Histocompatibility Antigens Class I/genetics; Adolescent; Adult; Alleles; Cross-Sectional Studies; Diabetes Mellitus, Type 1/blood; Female; Gene Frequency; Genetic Predisposition to Disease; Genotype; Humans; Insulin-Secreting Cells/metabolism; Male; Polymorphism, Single Nucleotide; Young Adult
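
The methods in the abstract above combine p values across studies "taking into account sample size and direction of effect". One widely used scheme with that property is the sample-size-weighted Z-score (Stouffer) method, sketched below. That the study used exactly this scheme is an assumption; the function name `weighted_z_meta`, its arguments, and the square-root-of-n weights are illustrative choices made here, not details taken from the article.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def weighted_z_meta(p_values, effect_signs, sample_sizes):
    """Combine two-sided p-values into one meta-analysis z-score and p-value.

    p_values: two-sided p-value from each study, each in (0, 1].
    effect_signs: +1 or -1, the direction of the effect in each study.
    sample_sizes: number of participants in each study.
    """
    numerator = 0.0
    weight_sq_sum = 0.0
    for p, sign, n in zip(p_values, effect_signs, sample_sizes):
        # Turn the two-sided p-value into a signed z-score.
        z = sign * -_STD_NORMAL.inv_cdf(p / 2.0)
        # Weight each study by the square root of its sample size (assumed weights).
        w = sqrt(n)
        numerator += w * z
        weight_sq_sum += w * w
    z_meta = numerator / sqrt(weight_sq_sum)
    # Two-sided p-value for the combined z-score.
    p_meta = 2.0 * _STD_NORMAL.cdf(-abs(z_meta))
    return z_meta, p_meta
```

Under this scheme, two equally sized studies with the same p value and the same direction of effect reinforce each other, whereas opposite directions cancel to a combined z-score of zero.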
11.
Genet Epidemiol ; 41(2): 108-121, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27885705

ABSTRACT

By jointly analyzing multiple variants within a gene, instead of one at a time, gene-based multiple regression can improve power, robustness, and interpretation in genetic association analysis. We investigate multiple linear combination (MLC) test statistics for analysis of common variants under realistic trait models with linkage disequilibrium (LD) based on HapMap Asian haplotypes. MLC is a directional test that exploits LD structure in a gene to construct clusters of closely correlated variants recoded such that the majority of pairwise correlations are positive. It combines variant effects within the same cluster linearly, and aggregates cluster-specific effects in a quadratic sum of squares and cross-products, producing a test statistic with reduced degrees of freedom (df) equal to the number of clusters. By simulation studies of 1000 genes from across the genome, we demonstrate that MLC is a well-powered and robust choice among existing methods across a broad range of gene structures. Compared to minimum P-value, variance-component, and principal-component methods, the mean power of MLC is never much lower than that of other methods, and can be higher, particularly with multiple causal variants. Moreover, the variation in gene-specific MLC test size and power across 1000 genes is less than that of other methods, suggesting it is a complementary approach for discovery in genome-wide analysis. The cluster construction of the MLC test statistics helps reveal within-gene LD structure, allowing interpretation of clustered variants as haplotypic effects, while multiple regression helps to distinguish direct and indirect associations.


Subjects
Genetic Markers/genetics; Haplotypes/genetics; Linear Models; Linkage Disequilibrium; Models, Genetic; Polymorphism, Single Nucleotide/genetics; Humans; Phenotype; Quantitative Trait Loci
12.
BMC Cancer ; 18(1): 750, 2018 Jul 20.
Article in English | MEDLINE | ID: mdl-30029633

ABSTRACT

BACKGROUND: We previously observed that T-bet+ tumor-infiltrating T lymphocytes (T-bet+ TILs) in primary breast tumors were associated with adverse clinicopathological features, yet favorable clinical outcome. We identified BRD4 (Bromodomain-Containing Protein 4), a member of the Bromodomain and Extra Terminal domain (BET) family, as a gene that distinguished T-bet+/high and T-bet-/low tumors. In clinical studies, BET inhibitors have been shown to suppress inflammation in various cancers, suggesting a potential link between BRD4 and immune infiltration in cancer. Hence, we examined the BRD4 expression and clinicopathological features of breast cancer. METHODS: The cohort consisted of a prospectively ascertained consecutive series of women with axillary node-negative breast cancer with long follow-up. Gene expression microarray data were used to detect mRNAs differentially expressed between T-bet+/high (n = 6) and T-bet-/low (n = 41) tumors. Tissue microarrays (TMAs) constructed from tumors of 612 women were used to quantify expression of BRD4 by immunohistochemistry, which was analyzed for its association with T-bet+ TILs, Jagged1, clinicopathological features, and disease-free survival. RESULTS: Microarray analysis indicated that BRD4 mRNA expression was up to 44-fold higher in T-bet+/high tumors compared to T-bet-/low tumors (p = 5.38E-05). Immunohistochemical expression of BRD4 in cancer cells was also shown to be associated with T-bet+ TILs (p = 0.0415) as well as with Jagged1 mRNA and protein expression (p = 0.0171, 0.0010 respectively). BRD4 expression correlated with larger tumor size (p = 0.0049), pre-menopausal status (p = 0.0018), and high Ki-67 proliferative index (p = 0.0009). Women with high tumoral BRD4 expression in the absence of T-bet+ TILs exhibited a significantly poorer outcome (log rank test p = 0.0165) relative to other subgroups.
CONCLUSIONS: The association of BRD4 expression with T-bet+ TILs, and T-bet+ TIL-dependent disease-free survival suggests a potential link between BRD4-mediated tumor development and tumor immune surveillance, possibly through BRD4's regulation of Jagged1 signaling pathways. Further understanding BRD4's role in different immune contexts may help to identify an appropriate subset of breast cancer patients who may benefit from BET inhibitors without the risk of diminishing the anti-tumoral immune activity.


Subjects
Breast Neoplasms/mortality; Lymphocytes, Tumor-Infiltrating/immunology; Nuclear Proteins/physiology; T-Box Domain Proteins/analysis; Transcription Factors/physiology; Breast Neoplasms/immunology; Breast Neoplasms/pathology; Cell Cycle Proteins; Disease-Free Survival; Female; Humans; Immunohistochemistry; Jagged-1 Protein/physiology; Lymph Nodes/pathology; Nuclear Proteins/analysis; Nuclear Proteins/genetics; Prospective Studies; Transcription Factors/analysis; Transcription Factors/genetics
13.
Genet Epidemiol ; 39(7): 518-28, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26411674

ABSTRACT

The "winner's curse" is a subtle and difficult problem in interpretation of genetic association, in which association estimates from large-scale gene detection studies are larger in magnitude than those from subsequent replication studies. This is practically important because use of a biased estimate from the original study will yield an underestimate of sample size requirements for replication, leaving the investigators with an underpowered study. Motivated by investigation of the genetics of type 1 diabetes complications in a longitudinal cohort of participants in the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) Genetics Study, we apply a bootstrap resampling method in analysis of time to nephropathy under a Cox proportional hazards model, examining 1,213 single-nucleotide polymorphisms (SNPs) in 201 candidate genes custom genotyped in 1,361 white probands. Among 15 top-ranked SNPs, bias reduction in log hazard ratio estimates ranges from 43.1% to 80.5%. In simulation studies based on the observed DCCT/EDIC genotype data, genome-wide bootstrap estimates for false-positive SNPs and for true-positive SNPs with low-to-moderate power are closer to the true values than uncorrected naïve estimates, but tend to overcorrect SNPs with high power. This bias-reduction technique is generally applicable for complex trait studies including quantitative, binary, and time-to-event traits.


Subjects
Genome-Wide Association Study/methods; Bias; Diabetes Mellitus, Type 1/complications; Diabetes Mellitus, Type 1/epidemiology; Diabetes Mellitus, Type 1/genetics; Diabetes Mellitus, Type 1/therapy; False Positive Reactions; Female; Genotype; Humans; Kidney Diseases/complications; Kidney Diseases/genetics; Kidney Diseases/pathology; Male; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide/genetics; Proportional Hazards Models; Risk; Sample Size; Time Factors
14.
PLoS Genet ; 9(8): e1003609, 2013.
Article in English | MEDLINE | ID: mdl-23950724

ABSTRACT

Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website.


Subjects
Genome-Wide Association Study/methods; High-Throughput Nucleotide Sequencing/methods; Models, Theoretical; Polymorphism, Single Nucleotide/genetics; Breast Neoplasms/genetics; Female; Genotype; Humans; Male; Prostatic Neoplasms/genetics; Sample Size
15.
Genet Epidemiol ; 38(7): 599-609, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25132153

ABSTRACT

In focused studies designed to follow up associations detected in a genome-wide association study (GWAS), investigators can proceed to fine-map a genomic region by targeted sequencing or dense genotyping of all variants in the region, aiming to identify a functional sequence variant. For the analysis of a quantitative trait, we consider a Bayesian approach to fine-mapping study design that incorporates stratification according to a promising GWAS tag SNP in the same region. Improved cost-efficiency can be achieved when the fine-mapping phase incorporates a two-stage design, with identification of a smaller set of more promising variants in a subsample taken in stage 1, followed by their evaluation in an independent stage 2 subsample. To avoid the potential negative impact of genetic model misspecification on inference we incorporate genetic model selection based on posterior probabilities for each competing model. Our simulation study shows that, compared to simple random sampling that ignores genetic information from GWAS, tag-SNP-based stratified sample allocation methods reduce the number of variants continuing to stage 2 and are more likely to promote the functional sequence variant into confirmation studies.


Subjects
Genome-Wide Association Study; Bayes Theorem; Chromosome Mapping; Computer Simulation; Genome, Human; Genotype; Humans; Models, Genetic; Phenotype; Polymorphism, Single Nucleotide; Probability
16.
Hum Genet ; 134(2): 247-57, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25487307

ABSTRACT

We investigated the association of signals from previous GWAS and candidate gene meta-analyses for diabetic retinopathy (DR) or nephropathy (DN), as well as an EPO variant in meta-analyses of severe (SDR) and mild diabetic retinopathy (MDR). Meta-analyses of SDR (≥severe non-proliferative diabetic retinopathy (NPDR) or history of panretinal photocoagulation) and MDR (≥mild NPDR), defined based on seven-field stereoscopic fundus photographs, were performed in two well-characterized type 1 diabetes (T1D) cohorts: the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC, n = 1,304) and Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR, n = 603). Among 34 previous signals for DR, after controlling for multiple testing, no association was replicated in our meta-analyses. rs1571942 and rs12219125 at the PLXDC2 locus showed nominally significant (P < 0.05) association with SDR in the same direction as the previous report, as did rs1801282 in the PPARG gene with MDR. Among 55 loci previously associated with DN, three showed suggestive associations with SDR in our study without maintaining significance after correction for multiple testing. Of particular interest, rs1617640 (EPO) was not significantly associated with DR status, combined SDR-DN phenotype, time to SDR or time to DN (all P > 0.05). Lack of replication of previous DR hits and EPO despite reasonable statistical power implies that many of these may be false positives. Consistent with pleiotropy, we provide suggestive collective evidence for association between DR and variants previously associated with DN without reaching statistical significance at any single locus.


Subjects
Diabetes Mellitus, Type 1/genetics , Diabetic Retinopathy/genetics , Erythropoietin/genetics , Genetic Loci , Polymorphism, Genetic , Receptors, Cell Surface/genetics , Clinical Trials as Topic , Female , Humans , Male
17.
BMC Cancer ; 15: 483, 2015 Jun 27.
Article in English | MEDLINE | ID: mdl-26112005

ABSTRACT

BACKGROUND: Menacalc is an immunofluorescence-based, quantitative method in which expression of the non-invasive Mena protein isoform (Mena11a) is subtracted from total Mena protein expression. Previous work has found a significant positive association between Menacalc and risk of death from breast cancer. Our goal was to determine if Menacalc could be used as an independent prognostic marker for axillary node-negative (ANN) breast cancer. METHODS: Analysis of the association of Menacalc with overall survival (death from any cause) was performed for 403 ANN tumors using Kaplan-Meier survival curves and the univariate Cox proportional hazards (PH) model with the log-rank or the likelihood ratio test. Cox PH models were used to estimate hazard ratios (HRs) for the association of Menacalc with risk of death after adjustment for HER2 status and clinicopathological tumor features. RESULTS: High Menacalc was associated with increased risk of death from any cause (P=0.0199, HR (CI)=2.18 (1.19, 4.00)). A similarly elevated risk of death was found in the subset of the Menacalc cohort that did not receive hormone therapy or chemotherapy (n=142) (P=0.0052, HR (CI)=3.80 (1.58, 9.97)). There was a trend toward increased risk of death with relatively high Menacalc in the HER2, basal and luminal molecular subtypes. CONCLUSIONS: Menacalc may serve as an independent prognostic biomarker for the ANN breast cancer patient population.


Subjects
Biomarkers, Tumor/biosynthesis , Breast Neoplasms/genetics , Microfilament Proteins/biosynthesis , Aged , Biomarkers, Tumor/genetics , Breast Neoplasms/mortality , Breast Neoplasms/pathology , Female , Gene Expression Regulation, Neoplastic , Humans , Kaplan-Meier Estimate , Microfilament Proteins/genetics , Middle Aged , Neoplasm Metastasis , Prognosis , Protein Isoforms/biosynthesis , Protein Isoforms/genetics , Receptor, ErbB-2/genetics
18.
Mod Pathol ; 27(4): 554-61, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24051696

ABSTRACT

The objectives of this study were to determine the prognostic significance of subgrouping estrogen receptor (ER)-positive breast tumors into low- and high-risk luminal categories using Ki67 index, TP53, or progesterone receptor (PR) status. The study group comprised 540 patients with lymph node-negative, invasive breast carcinoma. Luminal A subtype was defined as being ER positive, HER2 negative, and Ki67 low (<14% cells positive) and luminal B subtype as being ER positive, HER2 negative, and Ki67 high (≥14% cells positive). Luminal tumors were also subgrouped into risk categories based on the PR and TP53 status. Survival analysis was performed. Patients with luminal B tumors (n=173) had significantly worse disease-free survival compared to those with luminal A tumors (n=186) (log rank P-value=0.0164; univariate Cox regression relative risk 2.00; 95% CI, 1.12-3.58; P=0.0187). Luminal subtype remained an independent prognostic indicator on multivariate analysis including traditional prognostic factors (relative risk 2.12; 95% CI, 1.16-3.88; P=0.0151). Using TP53 status or PR negativity rather than Ki67 to classify ER-positive luminal tumors gave similar outcome results to those obtained using the proliferation index. However, the combination of the three markers proved the most powerful prognostically. Ki67 index, TP53 status, or PR negativity can be used to segregate ER-positive, HER2-negative tumors into prognostically meaningful subgroups with significantly different clinical outcomes. These biomarkers, particularly in combination, may potentially be used clinically to guide patient management.
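The luminal A/B definition quoted in this abstract can be written as a small decision rule. The following is a minimal sketch only: the function name, argument names, and the `None` return for tumors outside the luminal definition are illustrative assumptions, not part of the study; only the ER/HER2 conditions and the 14% Ki67 cutoff come from the abstract.

```python
from typing import Optional

# Ki67 cutoff (percent of positive cells) as stated in the abstract.
KI67_CUTOFF_PERCENT = 14.0


def luminal_subtype(er_positive: bool, her2_positive: bool,
                    ki67_percent: float) -> Optional[str]:
    """Classify a tumor as 'luminal A' or 'luminal B' (hypothetical helper).

    Returns None when the tumor is not ER positive and HER2 negative,
    since the abstract defines the luminal subtypes only for that group.
    """
    if not er_positive or her2_positive:
        return None
    # Ki67 low (<14%) is luminal A; Ki67 high (>=14%) is luminal B.
    if ki67_percent < KI67_CUTOFF_PERCENT:
        return "luminal A"
    return "luminal B"
```

A tumor exactly at the cutoff falls in luminal B, matching the "≥14%" wording; the PR- and TP53-based groupings mentioned in the abstract are not specified in enough detail to sketch here.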


Subjects
Breast Neoplasms/chemistry , Carcinoma/chemistry , Ki-67 Antigen/analysis , Receptors, Progesterone/analysis , Tumor Suppressor Protein p53/analysis , Breast Neoplasms/classification , Breast Neoplasms/mortality , Breast Neoplasms/pathology , Breast Neoplasms/therapy , Carcinoma/classification , Carcinoma/mortality , Carcinoma/pathology , Carcinoma/therapy , Chi-Square Distribution , Diagnosis, Differential , Disease-Free Survival , Female , Humans , Immunohistochemistry , Kaplan-Meier Estimate , Multivariate Analysis , Neoplasm Invasiveness , Ontario , Predictive Value of Tests , Proportional Hazards Models , Prospective Studies , Receptor, ErbB-2/analysis , Receptors, Estrogen/analysis , Risk Factors , Time Factors
19.
Rheumatology (Oxford) ; 53(2): 233-9, 2014 Feb.
Article in English | MEDLINE | ID: mdl-24185760

ABSTRACT

OBJECTIVES: We conducted a case-control study to determine the association between KIR2D and KIR3D gene polymorphisms and their interaction with HLA alleles in PsA. METHODS: A total of 678 subjects with PsA and 688 healthy controls were studied. Differences between cases and controls in the frequency of individual KIR polymorphisms were tested for significance by an asymptotic χ² test and Fisher's exact test. Trends for increasing susceptibility to PsA from combined genotypes (HLA-KIR and HLA) were evaluated by the Cochran-Armitage trend test. Multigene logistic regression analysis was conducted to identify independent associations and interactions. RESULTS: In univariate analyses, KIR2DL2 and KIR2DS2 polymorphisms were significantly associated with PsA. Only KIR2DS2 was associated with PsA compared with healthy controls in multivariate analysis [odds ratio (OR) 1.25, 95% CI 1.01, 1.54, P = 0.044]. The presence of HLA-C group 2 alleles was associated with a higher risk of PsA (trend test P = 0.006). The risk of PsA is higher when KIR2DS2 is present with the HLA-C ligands (C group 1) for the corresponding inhibitory KIRs, and is highest when KIR2DS2 is present in the absence of HLA-C ligands for homologous inhibitory KIRs, compared with the state when KIR2DS2 is absent (trend test P = 0.027). The presence of HLA-C alleles that have high cell surface expression was also associated with a higher risk of PsA (trend test P < 0.001). HLA-B Bw4 and HLA-B Bw4 80ile allele groups were associated with a higher PsA risk (trend test P < 0.0001 for both analyses). CONCLUSION: This study confirms the association of the KIR2DS gene, especially KIR2DS2, with PsA.
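For readers unfamiliar with the Cochran-Armitage trend test named in the methods, here is a minimal sketch of the textbook statistic for ordered groups, using a normal approximation for the two-sided P value. The function name, the choice of scores, and the example counts are hypothetical; the abstract does not report the group counts or the scoring the authors used.

```python
import math
from typing import Sequence, Tuple


def cochran_armitage_trend(cases: Sequence[int], totals: Sequence[int],
                           scores: Sequence[float]) -> Tuple[float, float]:
    """Return (z, two-sided p) for a trend in case proportion across groups.

    cases[i]  : number of cases in ordered group i
    totals[i] : number of subjects (cases + controls) in group i
    scores[i] : numeric score assigned to group i (e.g. 0, 1, 2)
    """
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # overall case proportion under H0

    # Score-weighted sum of observed minus expected case counts.
    u = sum(t * (r - n * p_bar) for r, n, t in zip(cases, totals, scores))

    # Variance of u under the null hypothesis of no trend.
    sum_nt = sum(n * t for n, t in zip(totals, scores))
    sum_nt2 = sum(n * t * t for n, t in zip(totals, scores))
    variance = p_bar * (1.0 - p_bar) * (sum_nt2 - sum_nt ** 2 / n_total)

    z = u / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Hypothetical counts for three ordered genotype groups (not study data).
z, p = cochran_armitage_trend(cases=[10, 20, 30], totals=[100, 100, 100],
                              scores=[0, 1, 2])
```

A positive `z` indicates that the case proportion rises with the group score, which is the direction of effect the abstract reports as increasing susceptibility.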


Subjects
Arthritis, Psoriatic/genetics , Receptors, KIR2DL2/genetics , Receptors, KIR/genetics , Adult , Alleles , Case-Control Studies , Female , Genetic Predisposition to Disease/genetics , Genotype , HLA-C Antigens/genetics , Humans , Male , Middle Aged , Multivariate Analysis , Polymorphism, Genetic , Regression Analysis
20.
Genet Epidemiol ; 36(4): 320-32, 2012 May.
Article in English | MEDLINE | ID: mdl-22460746

ABSTRACT

By systematic examination of common tag single-nucleotide polymorphisms (SNPs) across the genome, the genome-wide association study (GWAS) has proven to be a successful approach to identify genetic variants that are associated with complex diseases and traits. Although the per base pair cost of sequencing has dropped dramatically with the advent of the next-generation technologies, it may still only be feasible to obtain DNA sequence data for a portion of available study subjects due to financial constraints. Two-phase sampling designs have been used frequently in large-scale surveys and epidemiological studies where certain variables are too costly to be measured on all subjects. We consider two-phase stratified sampling designs for genetic association, in which tag SNPs for candidate genes or regions are genotyped on all subjects in phase 1, and a proportion of subjects are selected into phase 2 based on genotypes at one or more tag SNPs. Deep sequencing in the region is then applied to genotype phase 2 subjects at sequence SNPs. We investigate alternative sampling designs for selection of phase 2 subjects within strata defined by tag SNP genotypes and develop methods of inference for sequence SNP variant associations using data from both phases. In comparison to methods that use data from phase 2 alone, the combined analysis improves efficiency.
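The phase 2 selection step described in this abstract, sampling subjects within strata defined by a tag SNP genotype, might be sketched as follows. This is an illustration under stated assumptions: the function name, the 0/1/2 genotype coding, and the fixed per-stratum sample sizes are hypothetical, and the abstract's actual sampling designs and inference methods are not reproduced.

```python
import random
from collections import defaultdict
from typing import Dict, List, Mapping


def select_phase2(tag_genotypes: Mapping[str, int],
                  per_stratum: Mapping[int, int],
                  seed: int = 0) -> List[str]:
    """Randomly pick phase 2 subjects within tag SNP genotype strata.

    tag_genotypes : subject ID -> phase 1 tag SNP genotype (assumed 0/1/2)
    per_stratum   : genotype -> number of subjects to sequence in phase 2
    """
    rng = random.Random(seed)

    # Group phase 1 subjects into strata by their tag SNP genotype.
    strata: Dict[int, List[str]] = defaultdict(list)
    for subject_id, genotype in tag_genotypes.items():
        strata[genotype].append(subject_id)

    selected: List[str] = []
    for genotype, members in sorted(strata.items()):
        # Never request more subjects than the stratum contains.
        k = min(per_stratum.get(genotype, 0), len(members))
        selected.extend(rng.sample(sorted(members), k))
    return selected
```

Requesting equal numbers per stratum oversamples rare genotypes relative to their phase 1 frequency; any analysis of the phase 2 data would need to account for that selection, which is the role of the combined two-phase inference the abstract refers to.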


Subjects
Sequence Analysis, DNA/methods , Algorithms , Chromosome Mapping/methods , Computational Biology/methods , Computer Simulation , Genome, Human , Genome-Wide Association Study , Genotype , Humans , Models, Statistical , Molecular Epidemiology , Polymorphism, Single Nucleotide , Probability , Regression Analysis