1.
Am J Hum Genet ; 93(3): 545-54, 2013 Sep 05.
Article in English | MEDLINE | ID: mdl-23972371

ABSTRACT

High blood pressure (BP) is more prevalent and contributes to more severe manifestations of cardiovascular disease (CVD) in African Americans than in any other United States ethnic group. Several small African-ancestry (AA) BP genome-wide association studies (GWASs) have been published, but their findings have failed to replicate to date. We report on a large AA BP GWAS meta-analysis that includes 29,378 individuals from 19 discovery cohorts and subsequent replication in additional samples of AA (n = 10,386), European ancestry (EA) (n = 69,395), and East Asian ancestry (n = 19,601). Five loci (EVX1-HOXA, ULK4, RSPO3, PLEKHG1, and SOX6) reached genome-wide significance (p < 1.0 × 10⁻⁸) for either systolic or diastolic BP in a transethnic meta-analysis after correction for multiple testing. Three of these BP loci (EVX1-HOXA, RSPO3, and PLEKHG1) lack previous associations with BP. We also identified one independent signal in a known BP locus (SOX6) and provide evidence for fine mapping in four additional validated BP loci. We also demonstrate that validated EA BP GWAS loci, considered jointly, show significant effects in AA samples. Consequently, these findings suggest that BP loci might have universal effects across studied populations, demonstrating that multiethnic samples are an essential component in identifying, fine mapping, and understanding their trait variability.
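
The abstract above pools per-cohort association results in a meta-analysis but does not state the pooling method. As a minimal sketch, assuming a fixed-effect inverse-variance scheme (a common choice, not confirmed by the abstract), per-cohort effect estimates and standard errors could be combined as follows. The function name and the example numbers are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of per-cohort estimates.

    betas: per-cohort effect estimates for one SNP
    ses:   matching standard errors
    Returns (pooled beta, pooled SE, z score, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]          # precision weights
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical effect sizes (mmHg per allele) from three cohorts.
beta, se, z, p = inverse_variance_meta([0.5, 0.7, 0.6], [0.20, 0.25, 0.15])
genome_wide_significant = p < 1.0e-8  # threshold quoted in the abstract
```

The pooled standard error is always smaller than the smallest cohort standard error, which is why combining cohorts can reach thresholds that no single cohort reaches.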


Subjects
Black People/genetics , Blood Pressure/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Heritable Quantitative Trait , Africa , Cohort Studies , Genetic Databases , Genetic Loci/genetics , Humans , Single Nucleotide Polymorphism/genetics , Reproducibility of Results
2.
Bioinformatics ; 31(11): 1805-15, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-25618864

ABSTRACT

MOTIVATION: Identifying microRNAs associated with diseases (disease miRNAs) is helpful for exploring the pathogenesis of diseases. Because miRNAs function by regulating their target genes and because the current number of experimentally validated targets is insufficient, some existing methods have inferred potential disease miRNAs based on the predicted targets. It is difficult for these methods to achieve excellent performance due to the high false-positive and false-negative rates of the target prediction results. Alternatively, several methods have constructed a network composed of miRNAs based on their associated diseases and have exploited the information within the network to predict the disease miRNAs. However, these methods have failed to take into account the prior information regarding the network nodes and the respective local topological structures of the different categories of nodes. Therefore, it is essential to develop a method that exploits more of the useful information to predict reliable disease miRNA candidates. RESULTS: miRNAs with similar functions are normally associated with similar diseases and vice versa. Therefore, the functional similarity between a pair of miRNAs is calculated based on their associated diseases to construct a miRNA network. We present a new prediction method based on random walk on the network. For the diseases with some known related miRNAs, the network nodes are divided into labeled nodes and unlabeled nodes, and the transition matrices are established for the two categories of nodes. Furthermore, different categories of nodes have different transition weights. In this way, the prior information of nodes can be completely exploited. Simultaneously, the various ranges of topologies around the different categories of nodes are integrated. In addition, restarting the walk controls how far the walker can move away from the labeled nodes, which helps relieve the negative effect of noisy data. For the diseases without any known related miRNAs, we extend the walking on a miRNA-disease bilayer network. During the prediction process, the similarity between diseases, the similarity between miRNAs, the known miRNA-disease associations and the topology information of the bilayer network are exploited. Moreover, the importance of information from different layers of network is considered. Our method achieves superior performance for 18 human diseases with AUC values ranging from 0.786 to 0.945. Moreover, case studies on breast neoplasms, lung neoplasms, prostatic neoplasms and 32 diseases further confirm the ability of our method to discover potential disease miRNAs. AVAILABILITY AND IMPLEMENTATION: A web service for the prediction and analysis of disease miRNAs is available at http://bioinfolab.stx.hk/midp/.
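
The method above builds on random walk with restart over a miRNA similarity network. The sketch below shows only the basic, single-matrix form of that idea. It does not reproduce the paper's separate transition matrices for labeled and unlabeled nodes or the bilayer extension, and the function name, restart probability and toy network are assumptions made for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Basic random walk with restart on a weighted, undirected network.

    adj:     square list-of-lists of non-negative edge weights (similarities)
    seeds:   set of node indices known to be disease-related (labeled nodes)
    restart: probability of jumping back to the seed set at each step
    Returns one steady-state score per node; higher means closer to the seeds.
    """
    n = len(adj)
    # Column-normalise so that column j holds the transition probabilities
    # for a walker leaving node j.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
             for i in range(n)]
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * p0[i]
               for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy network: a chain of four miRNAs, 0 - 1 - 2 - 3, with node 0 labeled.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(chain, {0})
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
```

A larger restart probability keeps the walker nearer the labeled nodes, which matches the abstract's remark that restarting limits how far the walker can move away from them.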


Subjects
Disease/genetics , Gene Regulatory Networks , Humans , MicroRNAs/genetics , Statistical Models , Neoplasms/genetics
3.
Genet Epidemiol ; 35(5): 350-9, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21484862

ABSTRACT

Large-scale genome-wide association studies (GWAS) have become feasible recently because of the development of bead and chip technology. However, the success of GWAS partially depends on the statistical methods that are able to manage and analyze this sort of large-scale data. Currently, the commonly used tests for GWAS include the Cochran-Armitage trend test, the allelic χ² test, the genotypic χ² test, the haplotypic χ² test, and the multi-marker genotypic χ² test among others. From a methodological point of view, it is a great challenge to improve the power of commonly used tests, since these tests are commonly used precisely because they are already among the most powerful tests. In this article, we propose an improved score test that is uniformly more powerful than the score test based on the generalized linear model. Since the score test based on the generalized linear model includes the aforementioned commonly used tests as its special cases, our proposed improved score test is thus uniformly more powerful than these commonly used tests. We evaluate the performance of the improved score test by simulation studies and application to a real data set. Our results show that the power increases of the improved score test over the score test cannot be neglected in most cases.
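
For orientation, the Cochran-Armitage trend test named above can be sketched in a few lines. This is the standard textbook statistic for a 2 × 3 genotype table with additive scores (0, 1, 2), not the improved score test that the article proposes. The function name and the example counts are hypothetical.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for a 2 x 3 case-control genotype table.

    cases, controls: genotype counts ordered by number of risk alleles
    scores:          per-genotype scores; (0, 1, 2) encodes an additive model
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    col_totals = [r + s for r, s in zip(cases, controls)]
    n_total = sum(col_totals)
    n_cases = sum(cases)
    n_controls = n_total - n_cases
    sx_cases = sum(x * r for x, r in zip(scores, cases))
    sx_all = sum(x * m for x, m in zip(scores, col_totals))
    sxx_all = sum(x * x * m for x, m in zip(scores, col_totals))
    numerator = n_total * (n_total * sx_cases - n_cases * sx_all) ** 2
    denominator = n_cases * n_controls * (n_total * sxx_all - sx_all ** 2)
    chi2 = numerator / denominator
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical counts for genotypes aa, Aa, AA.
chi2, p = cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10])
```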


Subjects
Genome-Wide Association Study/statistics & numerical data , Alleles , Amyotrophic Lateral Sclerosis/genetics , Case-Control Studies , Computer Simulation , Genetic Databases , Gene Frequency , Genetic Markers , Genetic Predisposition to Disease , Genotype , Haplotypes , Humans , Linear Models , Genetic Models , Statistical Models
4.
Genet Epidemiol ; 35 Suppl 1: S92-100, 2011.
Article in English | MEDLINE | ID: mdl-22128066

ABSTRACT

Group 14 of Genetic Analysis Workshop 17 examined several issues related to analysis of complex traits using DNA sequence data. These issues included novel methods for analyzing rare genetic variants in an aggregated manner (often termed collapsing rare variants), evaluation of various study designs to increase power to detect effects of rare variants, and the use of machine learning approaches to model highly complex heterogeneous traits. Various published and novel methods for analyzing traits with extreme locus and allelic heterogeneity were applied to the simulated quantitative and disease phenotypes. Overall, we conclude that power is (as expected) dependent on locus-specific heritability or contribution to disease risk, large samples will be required to detect rare causal variants with small effect sizes, extreme phenotype sampling designs may increase power for smaller laboratory costs, methods that allow joint analysis of multiple variants per gene or pathway are more powerful in general than analyses of individual rare variants, population-specific analyses can be optimal when different subpopulations harbor private causal mutations, and machine learning methods may be useful for selecting subsets of predictors for follow-up in the presence of extreme locus heterogeneity and large numbers of potential predictors.
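
The "collapsing" idea mentioned above can be illustrated with a small sketch: rare variants within a gene are merged into one carrier indicator per individual, which is then tested in place of the individual variants. The abstract does not specify any particular collapsing rule, so the frequency threshold, function name and toy genotypes below are assumptions.

```python
def collapse_rare_variants(genotypes, maf_threshold=0.01):
    """Collapse the rare variants of one gene into a per-individual indicator.

    genotypes:     one list per individual of minor-allele counts (0, 1 or 2),
                   one entry per variant in the gene
    maf_threshold: variants with 0 < MAF < threshold are treated as rare
    Returns (indices of rare variants, carrier indicator per individual).
    """
    n_ind = len(genotypes)
    n_var = len(genotypes[0])
    # Sample minor-allele frequency of each variant.
    maf = [sum(g[j] for g in genotypes) / (2.0 * n_ind) for j in range(n_var)]
    rare = [j for j in range(n_var) if 0.0 < maf[j] < maf_threshold]
    # 1 if the individual carries at least one rare allele in the gene.
    burden = [1 if any(g[j] > 0 for j in rare) else 0 for g in genotypes]
    return rare, burden


# Toy data: four individuals, three variants; threshold raised for the tiny sample.
toy = [[0, 1, 0], [0, 1, 0], [1, 2, 0], [0, 0, 0]]
rare, burden = collapse_rare_variants(toy, maf_threshold=0.2)
```

The resulting indicator can be fed to any single-predictor association test, which is one way to pool signal across variants that are individually too rare to test.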


Subjects
Genetic Predisposition to Disease/genetics , Molecular Epidemiology/methods , Single Nucleotide Polymorphism/genetics , Regression Analysis , Artificial Intelligence , Statistical Data Interpretation , Data Mining , Exome , Genetic Variation , Human Genome Project , Humans , Meta-Analysis as Topic , Sequence Analysis
5.
Ann Hum Genet ; 74(5): 406-15, 2010 Sep 01.
Article in English | MEDLINE | ID: mdl-20636464

ABSTRACT

In this paper, we propose a two-stage approach based on 17 biologically plausible models to search for two-locus combinations that have significant joint effects on the disease status in genome-wide association (GWA) studies. In the two-stage analyses, we only test two-locus joint effects of SNPs that show modest marginal effects. We use simulation studies to compare the power of our two-stage analysis with a single-marker analysis and a two-stage analysis by using a full model. We find that for most plausible interaction effects, our two-stage analysis can dramatically increase the power to identify two-locus joint effects compared to a single-marker analysis and a two-stage analysis based on the full model. We also compare two-stage methods with one-stage methods. Our simulation results indicate that two-stage methods are more powerful than one-stage methods. We applied our two-stage approach to a GWA study for identifying genetic factors that might be relevant in the pathogenesis of sporadic Amyotrophic Lateral Sclerosis (ALS). Our proposed two-stage approach found that two SNPs have a significant joint effect on sporadic ALS, while the single-marker analysis and the two-stage analysis based on the full model did not find any significant results.


Subjects
Amyotrophic Lateral Sclerosis/genetics , Genome-Wide Association Study , Genetic Models , Single Nucleotide Polymorphism , Genetic Predisposition to Disease , Humans
6.
BMC Med Genet ; 10: 86, 2009 Sep 09.
Article in English | MEDLINE | ID: mdl-19740415

ABSTRACT

BACKGROUND: Amyotrophic lateral sclerosis (ALS) is a fatal, degenerative neuromuscular disease characterized by a progressive loss of voluntary motor activity. About 95% of ALS patients have the "sporadic form", meaning their disease is not associated with a family history of the disease. To date, the genetic factors of the sporadic form of ALS are poorly understood. METHODS: We proposed a two-stage approach based on seventeen biologically plausible models to search for two-locus combinations that have significant joint effects on the disease in a genome-wide association study (GWAS). We used a two-stage strategy to reduce the computational burden associated with performing an exhaustive two-locus search across the genome. In the first stage, all SNPs were screened using a single-marker test. In the second stage, all pairs made from the 1000 SNPs with the lowest p-values from the first stage were evaluated under each of the 17 two-locus models. RESULTS: We performed the two-stage approach on a GWAS data set of sporadic ALS from the SNP Database at the NINDS Human Genetics Resource Center DNA and Cell Line Repository (http://ccr.coriell.org/ninds/). Our two-locus analysis showed that two two-locus combinations were significantly associated with sporadic ALS: rs4363506 (SNP1) with rs3733242 (SNP2), and rs4363506 with rs16984239 (SNP3). After adjusting for multiple tests and multiple models, the combination of SNP1 and SNP2 had a p-value of 0.032 under the Dom ∩ Dom epistatic model; SNP1 and SNP3 had a p-value of 0.042 under the Dom × Dom multiplicative model. CONCLUSION: The proposed two-stage analytical method can be used to search for joint effects of genes in GWAS. The two-stage strategy decreased the computational time and the multiple-testing burden associated with GWAS. We have also observed that the loci identified by our two-stage strategy cannot be detected by single-locus tests.
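
The two-stage search described in METHODS can be outlined as a short skeleton: screen every SNP with a single-marker test, keep the K SNPs with the lowest p-values, then evaluate every pair of survivors under each two-locus model. The sketch below takes the statistical tests as caller-supplied functions, since the abstract does not define them. The Bonferroni adjustment is an assumption (the abstract says only that results were adjusted for multiple tests and multiple models), and all names and toy p-values are hypothetical.

```python
from itertools import combinations


def two_stage_search(snp_ids, single_test, pair_tests, top_k=1000):
    """Two-stage two-locus search.

    snp_ids:     iterable of SNP identifiers
    single_test: function snp -> p-value (stage 1 single-marker screen)
    pair_tests:  dict model name -> function (snp_a, snp_b) -> p-value
    top_k:       number of SNPs carried into stage 2
    Returns a list of (adjusted p-value, snp_a, snp_b, model), best first.
    """
    # Stage 1: rank by single-marker p-value and keep the top_k smallest.
    selected = sorted(snp_ids, key=single_test)[:top_k]

    # Stage 2: every pair of selected SNPs under every two-locus model.
    results = []
    for snp_a, snp_b in combinations(selected, 2):
        for model, test in pair_tests.items():
            results.append((test(snp_a, snp_b), snp_a, snp_b, model))
    results.sort()

    # Bonferroni over all stage-2 tests (assumed adjustment, see lead-in).
    n_tests = len(results)
    return [(min(1.0, p * n_tests), a, b, m) for p, a, b, m in results]


# Toy example with made-up p-values.
stage1_p = {"s1": 0.001, "s2": 0.002, "s3": 0.01, "s4": 0.9}
models = {"m1": lambda a, b: 0.01 if {a, b} == {"s1", "s2"} else 0.5}
hits = two_stage_search(stage1_p, stage1_p.get, models, top_k=3)
```

With K = 1000 and 17 models, stage 2 runs 17 × 499,500 tests, which is far fewer than an exhaustive genome-wide pair scan would require.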


Subjects
Amyotrophic Lateral Sclerosis/genetics , Genome-Wide Association Study , Genetic Models , Single Nucleotide Polymorphism , Alleles , Case-Control Studies , Genetic Predisposition to Disease , Humans , Likelihood Functions , Odds Ratio
7.
Front Genet ; 10: 416, 2019.
Article in English | MEDLINE | ID: mdl-31130990

ABSTRACT

Many studies have indicated that aberrant expression of long non-coding RNA genes (lncRNAs) is closely related to human diseases. Identifying disease-related lncRNAs (disease lncRNAs) is critical for understanding the pathogenesis and etiology of diseases. Most previous methods focus on prioritizing potential disease lncRNAs using shallow learning methods. These methods fail to extract the deep and complex feature representations of lncRNA-disease associations. Furthermore, nearly all of them ignore the discriminative contributions of the similarity, association, and interaction relationships among lncRNAs, diseases, and miRNAs to association prediction. We present a method based on dual convolutional neural networks with attention mechanisms for predicting candidate disease lncRNAs, referred to as CNNLDA. CNNLDA deeply integrates multiple data sources, including the lncRNA similarities, the disease similarities, the lncRNA-disease associations, the lncRNA-miRNA interactions, and the miRNA-disease associations. The diverse biological premises about lncRNAs, miRNAs, and diseases are combined to construct the feature matrix from the biological perspectives. A novel framework based on the dual convolutional neural networks is developed to learn the global and attention representations of the lncRNA-disease associations. The left part of the framework exploits the various information contained in the feature matrix to learn the global representation of lncRNA-disease associations. The different connection relationships among the lncRNA, miRNA, and disease nodes and the different features of these nodes make discriminative contributions to the association prediction. Hence we present attention mechanisms at the relationship level and the feature level, respectively, and the right part of the framework learns the attention representation of associations. The experimental results based on cross validation indicate that CNNLDA outperforms several state-of-the-art methods. Case studies on stomach cancer, lung cancer, and colon cancer further demonstrate CNNLDA's ability to discover the potential disease lncRNAs.

8.
BMC Genet ; 8: 65, 2007 Sep 25.
Article in English | MEDLINE | ID: mdl-17894890

ABSTRACT

BACKGROUND: Complex diseases are believed to be the results of many genes and environmental factors. Hence, multi-marker methods that can use the information of markers from different genes are appropriate for mapping complex disease genes. Several multi-marker methods have already been proposed for case-control studies. In this article, we propose a multi-marker test called a Multi-marker Pedigree Disequilibrium Test (MPDT) to analyze family data from genome-wide association studies. If the parental phenotypes are available, we also propose a two-stage test in which a genomic screening test is used to select SNPs, and then the MPDT is used to test the association of the selected SNPs. RESULTS: We use simulation studies to evaluate the performance of the MPDT and the two-stage approach. The results show that the MPDT consistently outperforms the single-marker transmission/disequilibrium test (TDT). Which of the two-stage and one-stage approaches is more powerful depends on the prevalence: when the prevalence is no less than 10%, the two-stage approach may be more powerful than the one-stage approach; otherwise, the one-stage approach is more powerful. CONCLUSION: The proposed MPDT is more powerful than the single-marker TDT. When the parental phenotypes are available and the prevalence is no less than 10%, the proposed two-stage approach is more powerful than the one-stage approach.
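
The single-marker TDT used as the comparator above has a simple closed form. The sketch below implements that classical McNemar-type statistic for a biallelic marker; it is not the MPDT proposed in the article. The function name and the example counts are hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Classical single-marker transmission/disequilibrium test.

    transmitted:   times heterozygous parents passed the candidate allele
                   to an affected child
    untransmitted: times heterozygous parents passed the other allele
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical counts from 100 heterozygous parents.
chi2, p = tdt(60, 40)
```

Only heterozygous parents contribute, which is what makes the test robust to population stratification and also what limits its power when few parents are informative.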


Subjects
Genetic Markers/genetics , Genetic Predisposition to Disease/genetics , Human Genome , Genetic Models , Algorithms , Genotype , Humans , Linkage Disequilibrium , Pedigree , Phenotype
9.
Circ Cardiovasc Genet ; 8(1): 122-30, 2015 Feb.
Article in English | MEDLINE | ID: mdl-25561047

ABSTRACT

BACKGROUND: Numerous experimental studies suggest that B-type natriuretic peptide (BNP) is cardioprotective; however, in clinical studies, higher plasma BNP concentrations have been associated with incident cardiovascular disease and higher left ventricular mass. Genetic association studies may allow us to determine the true causal directions without confounding by compensatory mechanisms. METHODS AND RESULTS: We performed a meta-analysis of 2 genome-wide association results from a total of 2790 blacks. We assumed an additive genetic model in an association analysis of imputed 2.5 million single-nucleotide polymorphism dosages with residuals generated from multivariable-adjusted logarithmically transformed BNP controlling for relevant covariates and population stratification. Two loci were genome-wide significant, a candidate gene locus NPPB (rs198389, P = 1.18 × 10⁻⁹) and a novel missense variant in the KLKB1 locus (rs3733402, P = 1.75 × 10⁻¹¹) that explained 0.4% and 1.9% of variation in log BNP concentration, respectively. The observed increase in BNP concentration was proportional to the number of effect allele copies, and an average of 8.1 pg/mL increase was associated with 2 allele copies. In a companion study, single-nucleotide polymorphisms at these loci were cross-checked with genome-wide association results for the aldosterone/renin ratio in individuals of European ancestry, and rs3733402 was genome-wide significant (P < 5.0 × 10⁻⁸), suggesting possible shared genetic architecture for these 2 pathways. Other statistically significant relations for these single-nucleotide polymorphisms included the following: rs198389 with systolic blood pressure in blacks (COGENT consortium) and rs198389 and rs3733402 with left ventricular mass in whites (EchoGEN consortium). CONCLUSIONS: These findings improve our knowledge of the genetic basis of BNP variation in blacks, demonstrate a possible shared allelic architecture for BNP with aldosterone-renin ratio, and motivate further studies of underlying mechanisms.


Subjects
Black or African American/genetics , Cardiovascular Diseases , Genetic Loci , Genome-Wide Association Study , Brain Natriuretic Peptide , Single Nucleotide Polymorphism , Adult , Aged , Aldosterone/blood , Aldosterone/genetics , Cardiovascular Diseases/blood , Cardiovascular Diseases/ethnology , Cardiovascular Diseases/genetics , Female , Humans , Male , Middle Aged , Mississippi , Brain Natriuretic Peptide/blood , Brain Natriuretic Peptide/genetics , Renin/blood , Renin/genetics
10.
PLoS One ; 6(7): e21957, 2011.
Article in English | MEDLINE | ID: mdl-21799758

ABSTRACT

In family-based data, association information can be partitioned into the between-family information and the within-family information. Based on this observation, Steen et al. (Nature Genetics. 2005, 683-691) proposed an interesting two-stage test for genome-wide association (GWA) studies under family-based designs which performs genomic screening and replication using the same data set. In the first stage, a screening test based on the between-family information is used to select markers. In the second stage, an association test based on the within-family information is used to test association at the selected markers. However, we learn from the results of case-control studies (Skol et al. Nature Genetics. 2006, 209-213) that this two-stage approach may not be optimal. In this article, we propose a novel two-stage joint analysis for GWA studies under family-based designs. For this joint analysis, we first propose a new screening test that is based on the between-family information and is robust to population stratification. This new screening test is used in the first stage to select markers. Then, a joint test that combines the between-family information and within-family information is used in the second stage to test association at the selected markers. By extensive simulation studies, we demonstrate that the joint analysis always results in increased power to detect genetic association and is robust to population stratification.


Subjects
Genome-Wide Association Study/methods , Pedigree , Female , Genetic Markers/genetics , Humans , Male
11.
BMC Proc ; 5 Suppl 9: S112, 2011 Nov 29.
Article in English | MEDLINE | ID: mdl-22373188

ABSTRACT

We develop statistical methods for detecting rare variants that are associated with quantitative traits. We propose two strategies and their combination for this purpose: the iterative regression strategy and the extreme values strategy. In the iterative regression strategy, we use iterative regression on residuals and a multimarker association test to identify a group of significant variants. In the extreme values strategy, we use individuals with extreme trait values to select candidate genes and then test only these candidate genes. These two strategies are integrated into a hybrid approach through a weighting technology. We apply the proposed methods to analyze the Genetic Analysis Workshop 17 data set. The results show that the hybrid approach is the most powerful approach. Using the hybrid approach, the average power to detect causal genes for Q1 is about 40% and the powers to detect FLT1 and KDR are 100% and 68% for Q1, respectively. The powers to detect VNN3 and BCHE are 34% and 30% for Q2, respectively.

12.
BMC Proc ; 3 Suppl 7: S26, 2009 Dec 15.
Article in English | MEDLINE | ID: mdl-20018016

ABSTRACT

The goal of this paper is to search for two-locus combinations that are jointly associated with rheumatoid arthritis using the data set of Genetic Analysis Workshop 16 Problem 1. We use a two-stage strategy to reduce the computational burden associated with performing an exhaustive two-locus search across the genome. In the first stage, the full set of 531,689 single-nucleotide polymorphisms was screened using univariate testing. In the second stage, all pairs made from the 500 single-nucleotide polymorphisms with the lowest p-values from the first stage were evaluated under each of 17 two-locus models. Our analyses identified a two-locus combination - rs6939589 and rs11634386 - that proved to be significantly associated with rheumatoid arthritis under a Rec x Rec model (p-value = 0.045 after adjusting for multiple tests and multiple models).
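
To make the computational saving concrete, the counts implied by the figures in this abstract can be worked out directly. The ratio shown is derived here for illustration and is not stated in the abstract.

```latex
\begin{align*}
\text{exhaustive pairs:}\quad
  \binom{531{,}689}{2} &= \frac{531{,}689 \times 531{,}688}{2}
    = 141{,}346{,}330{,}516 \approx 1.41 \times 10^{11},\\
\text{stage-2 pairs:}\quad
  \binom{500}{2} &= \frac{500 \times 499}{2} = 124{,}750,\\
\text{stage-2 tests (17 models):}\quad
  17 \times 124{,}750 &= 2{,}120{,}750.
\end{align*}
```

Screening down to 500 SNPs therefore cuts the number of pairs to evaluate by a factor of roughly $1.1 \times 10^{6}$, before the 17 models are taken into account.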

13.
Genet Epidemiol ; 32(4): 285-300, 2008 May.
Article in English | MEDLINE | ID: mdl-18205210

ABSTRACT

Complex diseases are presumed to be the results of interactions of several genes and environmental factors, with each gene only having a small effect on the disease. Thus, the methods that can account for gene-gene interactions to search for a set of marker loci in different genes or across the genome and to analyze these loci jointly are critical. In this article, we propose an ensemble learning approach (ELA) to detect a set of loci whose main and interaction effects jointly have a significant association with the trait. In the ELA, we first search for "base learners" and then combine the effects of the base learners by a linear model. Each base learner represents a main effect or an interaction effect. The result of the ELA is easy to interpret. When the ELA is applied to analyze a data set, we can get a final model, an overall P-value of the association test between the set of loci involved in the final model and the trait, and an importance measure for each base learner and each marker involved in the final model. The final model is a linear combination of some base learners. We know which base learner represents a main effect and which one represents an interaction effect. The importance measure of each base learner or marker can tell us the relative importance of the base learner or marker in the final model. We used intensive simulation studies as well as a real data set to evaluate the performance of the ELA. Our simulation studies demonstrated that the ELA is more powerful than the single-marker test in all the simulation scenarios. The ELA also outperformed the other three existing multi-locus methods in almost all cases. In an application to a large-scale case-control study for Type 2 diabetes, the ELA identified 11 single nucleotide polymorphisms that have a significant multi-locus effect (P-value = 0.01), while none of the single nucleotide polymorphisms showed significant marginal effects and none of the two-locus combinations showed significant two-locus interaction effects.


Subjects
Genetic Models , Algorithms , Artificial Intelligence , Biometry , Type 2 Diabetes Mellitus/genetics , Epidemiologic Methods , Genetic Epistasis , Genetic Predisposition to Disease , Humans , Logistic Models , Single Nucleotide Polymorphism , Regression Analysis
14.
BMC Proc ; 1 Suppl 1: S140, 2007.
Article in English | MEDLINE | ID: mdl-18466484

ABSTRACT

Multiple testing is a problem in genome-wide or region-wide association studies. In this report, we consider a study design given by the Genetic Analysis Workshop 15 (GAW15) Problem 3 - nuclear families (parents with their affected children) and unrelated controls. Based on this design, we propose three two-stage approaches to deal with the problem of multiple testing. The tests in the first stage, statistically independent of the association test used in the second stage, are used to screen or select single-nucleotide polymorphisms (SNPs). Then, in the second stage, a family-based association test is performed on a much smaller set of selected SNPs. Thus, the problem of multiple testing is much less severe. Our simulation studies and application to the dense SNP data of chromosome 6 in the GAW15 Problem 3 show that the two-stage methods are more powerful than the one-stage method (using the family-based association test only).

15.
Genet Epidemiol ; 31 Suppl 1: S118-23, 2007.
Article in English | MEDLINE | ID: mdl-18046769

ABSTRACT

In this summary paper, we describe the contributions included in the Multistage Design group (Group 14) at the Genetic Analysis Workshop 15, which was held during November 12-14, 2006. Our group contrasted and compared different approaches to reducing complexity in a genetic study through implementation of staged designs. Most groups used the simulated dataset (problem 3), which provided ample opportunities for evaluating various staged designs. A wide range of multistage designs that targeted different aspects of complexity were explored. We categorized these approaches as reducing phenotypic complexity, model complexity, analytic complexity or genetic complexity. In general we learned that: (1) when staged designs are carefully planned and implemented, the power loss compared to a single-stage analysis can be minimized and study cost is greatly reduced; (2) a joint analysis of the results from each stage is generally more powerful than treating the second stage as a replication analysis.


Subjects
Human Genome , Humans , Genetic Models , Phenotype , Research Design