1.
PLoS Comput Biol ; 18(6): e1010172, 2022 06.
Article in English | MEDLINE | ID: mdl-35653402

ABSTRACT

Gene-based association analysis is an effective gene-mapping tool. Many gene-based methods have been proposed recently. However, their power depends on the underlying genetic architecture, which is rarely known for complex traits, so a combination of such methods is likely to serve as a universal approach. Several frameworks combining different gene-based methods have been developed. However, they all imply a fixed set of methods, weights and functional annotations. Moreover, most of them use individual phenotypes and genotypes as input data. Here, we introduce sumSTAAR, a framework for gene-based association analysis using summary statistics obtained from genome-wide association studies (GWAS). It is an extended and modified version of the STAAR framework proposed by Li and colleagues in 2020. The sumSTAAR framework offers a wider range of gene-based methods to combine. It allows the user to arbitrarily define a set of these methods, weighting functions and probabilities of genetic variants being causal. The methods used in the framework were adapted to analyse genes with a large number of SNPs to decrease the running time. The framework includes the polygene pruning procedure to guard against the influence of strong GWAS signals outside the gene. We also present new improved matrices of correlations between the genotypes of variants within genes. These matrices, estimated on a sample of 265,000 individuals, are a state-of-the-art replacement for the widely used matrices based on the 1000 Genomes Project data.


Subjects
Genome-Wide Association Study; Quantitative Trait Loci; Genetic Association Studies; Genome-Wide Association Study/methods; Phenotype; Polymorphism, Single Nucleotide/genetics
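
The sumSTAAR abstract above does not state how the p-values of the individual gene-based tests are merged. As an illustration only, the sketch below uses the Cauchy combination rule, one widely used way to combine possibly correlated p-values. The function name `cauchy_combine` and the equal default weights are assumptions made here and are not taken from the sumSTAAR software.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (illustrative sketch).

    Each p-value is mapped to a standard Cauchy variate, the variates are
    averaged with non-negative weights, and the average is mapped back to a
    p-value. Inputs must lie strictly between 0 and 1.
    """
    if weights is None:
        weights = [1.0] * len(p_values)  # assumption: equal weights by default
    if not p_values or len(weights) != len(p_values):
        raise ValueError("need at least one p-value and one weight per p-value")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive number")
    # Weighted mean of tan((0.5 - p) * pi), the Cauchy quantile of 1 - p.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values)) / total
    # Upper tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi
```

A call such as `cauchy_combine([p_burden, p_kernel, p_flm])` would return a single gene-level p-value; the three argument names are placeholders for whichever tests the user chooses to combine.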
2.
Int J Mol Sci ; 24(5), 2023 Mar 06.
Article in English | MEDLINE | ID: mdl-36902492

ABSTRACT

Every week, 1-2 breeds of farm animals, including local cattle, disappear in the world. As the keepers of rare allelic variants, native breeds potentially expand the range of genetic solutions to possible problems of the future, which means that the study of the genetic structure of these breeds is an urgent task. Providing nomadic herders with valuable resources necessary for life, domestic yaks have also become an important object of study. In order to determine the population genetic characteristics and clarify the phylogenetic relationships of modern representatives of 155 cattle populations from different regions of the world, we collected a large set of STR data (10,250 individuals), including unique native cattle, 12 yak populations from Russia, Mongolia and Kyrgyzstan, as well as zebu breeds. Estimation of the main population genetic parameters, phylogenetic analysis, principal component analysis and Bayesian cluster analysis allowed us to refine the genetic structure and provided insights into the relationships among native populations, transboundary breeds and populations of domestic yak. Our results can find practical application in conservation programs for endangered breeds, as well as become the basis for future fundamental research.


Subjects
Genetic Structures; Animals; Cattle; Phylogeny; Bayes Theorem; Russia
3.
Bioinformatics ; 35(19): 3701-3708, 2019 10 01.
Article in English | MEDLINE | ID: mdl-30860568

ABSTRACT

MOTIVATION: A huge number of genome-wide association studies (GWAS) summary statistics freely available in databases provide new material for gene-based association analysis aimed at identifying rare genetic variants. Only a few of the many popular gene-based methods developed for individual genotype and phenotype data have been adapted for practical use with GWAS summary statistics as input. RESULTS: We analytically prove and numerically illustrate that all popular powerful methods developed for gene-based association analysis of individual phenotype and genotype data can be modified to utilize GWAS summary statistics. We have modified and implemented all of the popular methods, including burden and kernel machine-based tests, multiple and functional linear regression, principal components analysis and others, in the R package sumFREGAT. Using real summary statistics for coronary artery disease, we show that the new package is able to detect genes not found by the existing packages. AVAILABILITY AND IMPLEMENTATION: The R package sumFREGAT is freely and publicly available at: https://CRAN.R-project.org/package=sumFREGAT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome-Wide Association Study; Software; Genotype; Linear Models; Phenotype
4.
Bioinformatics ; 32(15): 2392-3, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27153598

ABSTRACT

Several approaches to the region-based association analysis of quantitative traits have recently been developed and successively applied. However, no software package has been developed that implements all of these approaches for either independent or structured samples. Here we introduce FREGAT (Family REGional Association Tests), an R package that can handle family and population samples and implements a wide range of region-based association methods including burden tests, functional linear models, and kernel machine-based regression. FREGAT can be used in genome/exome-wide region-based association studies of quantitative traits and candidate gene analysis. FREGAT offers many useful options to empower its users and increase the effectiveness and applicability of region-based association analysis. AVAILABILITY AND IMPLEMENTATION: https://cran.r-project.org/web/packages/FREGAT/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: belon@bionet.nsc.ru.


Subjects
Exome; Linear Models; Software; Humans
5.
Genes (Basel) ; 14(10), 2023 10 19.
Article in English | MEDLINE | ID: mdl-37895311

ABSTRACT

Back pain (BP) is a major contributor to disability worldwide, with heritability estimated at 40-60%. However, less than half of the heritability is explained by common genetic variants identified by genome-wide association studies. More powerful methods and rare and ultra-rare variant analysis may offer additional insight. This study utilized exome sequencing data from the UK Biobank to perform a multi-trait gene-based association analysis of three BP-related phenotypes: chronic back pain, dorsalgia, and intervertebral disc disorder. We identified the SLC13A1 gene as a contributor to chronic back pain via loss-of-function (LoF) and missense variants. This gene has been previously detected in two studies. A multi-trait approach uncovered the novel FSCN3 gene and its impact on back pain through LoF variants. This gene deserves attention because it is only the second gene shown to have an effect on back pain due to LoF variants and represents a promising drug target for back pain therapy.


Subjects
Exome; Genome-Wide Association Study; Humans; Exome/genetics; Genetic Predisposition to Disease; Phenotype; Back Pain/genetics
6.
Animals (Basel) ; 12(3), 2022 Jan 18.
Article in English | MEDLINE | ID: mdl-35158545

ABSTRACT

Mongolian goats are of great interest for studying ancient migration routes and domestication, and also represent a good model of adaptability to harsh environments. Recent climatic disasters and uncontrolled massive breeding have endangered the valuable genetic resources of Mongolian goats and raised the question of their conservation status. Meanwhile, Mongolian goats have never been studied on a genomic scale. We used the Illumina Goat SNP50 array to estimate genetic risks in five Mongolian goat breeds (Buural, Ulgii Red, Gobi GS, Erchim, Dorgon) and explored phylogenetic relationships among these populations and in the context of other breeds. Various clustering methods showed that Mongolian goats grouped with other Asian breeds and were especially close to some neighboring Russian and Chinese breeds. The Buural breed showed the lowest estimates of inbreeding and exhibited the shortest genetic distances to the other Mongolian breeds, especially to Ulgii Red and Gobi GS. These three breeds formed a single core group, being weakly differentiated from each other. Among them, Gobi GS displayed obvious signs of inbreeding, probably resulting from artificial selection pressure. Dorgon and especially Erchim goats stand apart from the other Mongolian breeds according to various types of analyses, and bear unique features pointing to different breeding histories or distinct origins of these breeds. All populations showed a strong decline in effective population size. However, none of them met the formal criteria to be considered endangered breeds. The SNP data obtained in this study improve the knowledge of Mongolian goat breeds and could be used in future management decisions in order to preserve their genetic diversity.

7.
Genes (Basel) ; 13(10), 2022 Sep 21.
Article in English | MEDLINE | ID: mdl-36292579

ABSTRACT

We propose a novel, effective framework for analyzing the shared genetic background of a set of genetically correlated traits using SNP-level GWAS summary statistics. This framework, called SHAHER, is based on the construction of a linear combination of traits by maximizing the proportion of its genetic variance explained by the shared genetic factors. SHAHER requires only full GWAS summary statistics and matrices of genetic and phenotypic correlations between traits as inputs. Our framework allows both shared and unshared genetic factors to be effectively analyzed. We tested our framework using simulation studies, compared it with previous developments, and assessed its performance using three real datasets: anthropometric traits, psychiatric conditions and lipid concentrations. SHAHER is versatile and applicable to summary statistics from GWASs with arbitrary sample sizes and sample overlaps, allows for the incorporation of different GWAS models (Cox, linear and logistic), and is computationally fast.


Subjects
Genome-Wide Association Study; Polymorphism, Single Nucleotide; Polymorphism, Single Nucleotide/genetics; Phenotype; Genetic Background; Lipids
8.
Animals (Basel) ; 10(9), 2020 Aug 24.
Article in English | MEDLINE | ID: mdl-32846979

ABSTRACT

We report the genetic analysis of 18 population samples of animals, which were taken from cattle (Bos taurus) breeds of European and Asian origins. The main strength of our study is the use of rare and ancient native cattle breeds: the Altai, Ukrainian Grey, Tagil, and Buryat ones. The cattle samples studied have different production purposes, belong to various eco-geographic regions, and consequently have distinct farming conditions. In order to clarify the genetic diversity, phylogenetic relationships and historical origin of the studied breeds, we analyzed the genetic variation of 14 highly variable microsatellite loci in 1168 genotyped animals. High levels of heterozygosity and allelic richness were identified in four of the ancient local breeds, namely the Kalmyk, Tagil, Kyrgyz native, and Buryat breeds. The greatest phylogenetic distances from a common ancestor were observed for the Yakut and Ukrainian Grey breeds, while the Tagil breed showed the smallest difference. By using clustering approaches, we found that the Altai cattle is genetically close to the Kyrgyz one. Moreover, both the Altai and Kyrgyz breeds exhibited genetic divergence from other representatives of the Turano-Mongolian type and genetic relationships with the Brown Swiss and Kostroma breeds. This phenomenon can be explained by the extensive use of the Brown Swiss and Kostroma breeds in the breeding and improvement processes for the Kyrgyz breeds, which have been involved in the process of keeping the Altai cattle. Our results can be valuable for conservation and management purposes.

9.
Sci Rep ; 9(1): 5461, 2019 04 02.
Article in English | MEDLINE | ID: mdl-30940856

ABSTRACT

Here I propose a fundamentally new flexible model to reveal the association between a trait and a set of genetic variants in a genomic region/gene. This model was developed for the situation when original individual-level phenotype and genotype data are not available, but the researcher possesses the results of statistical analyses conducted on these data (namely, SNP-level summary Z score statistics and SNP-by-SNP correlations). The new model was analytically derived from the classical multiple linear regression model applied for the region-based association analysis of individual-level phenotype and genotype data by using the linear compression of data, where the SNP-by-SNP correlations are among the explanatory variables, and the summary Z score statistics are categorized as the response variables. I analytically show that the regional association analysis methods developed within the framework of the classical multiple linear regression model with additive effects of genetic variants can be reformulated in terms of the new model without the loss of information. The results obtained from the regional association analysis utilizing the classical model and those derived using the proposed model are identical when SNP-by-SNP correlations and SNP-level statistics are estimated from the same genetic data.


Subjects
Genome-Wide Association Study/methods; Polymorphism, Single Nucleotide; Computational Biology/methods; Data Interpretation, Statistical; Genotype; Humans; Linear Models; Models, Genetic; Phenotype
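
To make concrete what working from SNP-level Z scores and SNP-by-SNP correlations alone can look like, the sketch below computes a simple weighted burden statistic for one region. It is a standard textbook construction, not the new model derived in the abstract above; the function name `burden_from_summary` and the equal default weights are assumptions.

```python
import math


def burden_from_summary(z_scores, corr, weights=None):
    """Weighted burden test from summary data (illustrative sketch).

    z_scores : list of SNP-level Z scores for the region
    corr     : square list-of-lists with SNP-by-SNP correlations
    weights  : optional per-SNP weights (assumption: all 1.0 if omitted)

    Under the null the Z scores are treated as multivariate normal with
    covariance `corr`, so w'z / sqrt(w' R w) is standard normal.
    """
    m = len(z_scores)
    if weights is None:
        weights = [1.0] * m
    if len(weights) != m or len(corr) != m or any(len(row) != m for row in corr):
        raise ValueError("z_scores, weights and corr must have matching sizes")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    variance = sum(weights[i] * corr[i][j] * weights[j]
                   for i in range(m) for j in range(m))
    if variance <= 0.0:
        raise ValueError("w' R w must be positive")
    z_burden = numerator / math.sqrt(variance)
    # Two-sided p-value of a standard normal statistic.
    p_value = math.erfc(abs(z_burden) / math.sqrt(2.0))
    return z_burden, p_value
```

The correlation matrix enters only through the denominator, which is why two perfectly correlated SNPs contribute no more evidence than one.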
10.
PLoS One ; 13(1): e0190486, 2018.
Article in English | MEDLINE | ID: mdl-29309409

ABSTRACT

Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18 × 10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.


Subjects
Genome-Wide Association Study/methods; Models, Genetic; Humans; Regression Analysis
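
The abstract above says that weights were defined by allele frequencies via the beta distribution but gives no parameters. The sketch below evaluates a beta density at the minor allele frequency. The default shape parameters (1, 25), which up-weight rare variants, are a common choice in the rare-variant literature and are an assumption here, as is the function name `beta_weight`.

```python
import math


def beta_weight(maf, a=1.0, b=25.0):
    """Beta-density weight for a variant (illustrative sketch).

    maf  : minor allele frequency, strictly between 0 and 1
    a, b : shape parameters (assumed defaults, not taken from the paper)
    """
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must lie strictly between 0 and 1")
    # log of the normalising constant Gamma(a + b) / (Gamma(a) * Gamma(b))
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_density = (log_norm
                   + (a - 1.0) * math.log(maf)
                   + (b - 1.0) * math.log1p(-maf))
    return math.exp(log_density)
```

With these defaults a variant with frequency 0.01 receives a much larger weight than one with frequency 0.2, while `a = b = 1` gives every variant the same weight.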
11.
Biol Psychiatry ; 81(8): 702-707, 2017 04 15.
Article in English | MEDLINE | ID: mdl-27745872

ABSTRACT

BACKGROUND: Despite high heritability, little success was achieved in mapping genetic determinants of depression-related traits by means of genome-wide association studies. METHODS: To identify genes associated with depressive symptomology, we performed a gene-based association analysis of nonsynonymous variation captured using exome-sequencing and exome-chip genotyping in a genetically isolated population from the Netherlands (n = 1999). Finally, we reproduced our significant findings in an independent population-based cohort (n = 1604). RESULTS: We detected significant association of depressive symptoms with the gene NKPD1 (p = 3.7 × 10⁻⁸). Nonsynonymous variants in the gene explained 0.9% of sex- and age-adjusted variance of depressive symptoms in the discovery study, which translates into 3.8% of the total estimated heritability (h² = 0.24). Significant association of depressive symptoms with NKPD1 was also observed (n = 1604; p = 1.5 × 10⁻³) in the independent replication sample despite little overlap with the discovery cohort in the set of nonsynonymous genetic variants observed in the NKPD1 gene. Meta-analysis of the discovery and replication studies improved the association signal (p = 1.0 × 10⁻⁹). CONCLUSIONS: Our study suggests that nonsynonymous variation in the gene NKPD1 affects depressive symptoms in the general population. NKPD1 is predicted to be involved in the de novo synthesis of sphingolipids, which have been implicated in the pathogenesis of depression.


Subjects
Depression/genetics; Depressive Disorder, Major/genetics; Nucleoside-Triphosphatase/genetics; Exome; Female; Genetic Variation; Genome-Wide Association Study; Humans; Male; Membrane Proteins/genetics; Middle Aged; Nerve Tissue Proteins/genetics; Netherlands; Polymorphism, Single Nucleotide; White People/genetics
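
The heritability share quoted in the NKPD1 abstract above can be checked from the two reported figures (0.9% of variance explained and a heritability of 0.24). The only assumption is that the share is the simple ratio of the two:

```latex
\[
\frac{\text{variance explained by nonsynonymous NKPD1 variants}}{h^{2}}
  = \frac{0.009}{0.24}
  = 0.0375
  \approx 3.8\%
\]
```

The small gap between 3.75% and the reported 3.8% is consistent with rounding of the published inputs.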
12.
PLoS One ; 10(6): e0128999, 2015.
Article in English | MEDLINE | ID: mdl-26111046

ABSTRACT

Region-based association analysis is a more powerful tool for gene mapping than testing of individual genetic variants, particularly for rare genetic variants. The most powerful methods for regional mapping are based on the functional data analysis approach, which assumes that the regional genome of an individual may be considered as a continuous stochastic function that contains information about both linkage and linkage disequilibrium. Here, we extend this powerful approach, earlier applied only to independent samples, to samples of related individuals. To this end, we additionally include random polygenic effects in the functional linear model used for testing association between quantitative traits and multiple genetic variants in the region. We compare the statistical power of different methods using Genetic Analysis Workshop 17 mini-exome family data and a wide range of simulation scenarios. Our method increases the power of regional association analysis of quantitative traits compared with burden-based and kernel-based methods for the majority of the scenarios. In addition, we estimate the statistical power of our method using regions with a small number of genetic variants, and show that our method retains its advantage over burden-based and kernel-based methods in this case as well. The new method is implemented as the R function 'famFLM' using two types of basis functions: the B-spline and Fourier bases. We compare the properties of the new method using models that differ from each other in the type of their function basis. The models based on the Fourier basis functions have an advantage in terms of speed and power over the models that use the B-spline basis functions and those that combine B-spline and Fourier basis functions. The 'famFLM' function is distributed under the GPLv3 license and is freely available at http://mga.bionet.nsc.ru/soft/famFLM/.


Subjects
Genetic Association Studies/methods; Linear Models; Genetic Variation; Genome, Human; Humans; Linkage Disequilibrium
13.
PLoS One ; 9(6): e99407, 2014.
Article in English | MEDLINE | ID: mdl-24905468

ABSTRACT

Kernel machine-based regression is an efficient approach to region-based association analysis aimed at identification of rare genetic variants. However, this method is computationally complex. The running time of kernel-based association analysis becomes especially long for samples with genetic (sub)structures, thus increasing the need to develop new and effective methods, algorithms, and software packages. We have developed a new R package called fast family-based sequence kernel association test (FFBSKAT) for analysis of quantitative traits in samples of related individuals. This software implements a score-based variance component test to assess the association of a given set of single nucleotide polymorphisms with a continuous phenotype. We compared the performance of our software with that of two existing software packages for family-based sequence kernel association testing, namely, ASKAT and famSKAT, using the Genetic Analysis Workshop 17 family sample. Results demonstrate that FFBSKAT is several times faster than other available programs. In addition, the calculations of the three compared packages were similarly accurate. With respect to the available analysis modes, we combined the advantages of both ASKAT and famSKAT and added new options to empower FFBSKAT users. The FFBSKAT package is fast, user-friendly, and provides an easy-to-use method to perform whole-exome kernel machine-based regression association analysis of quantitative traits in samples of related individuals. The FFBSKAT package, along with its manual, is available for free download at http://mga.bionet.nsc.ru/soft/FFBSKAT/.


Subjects
Exome; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable; Sequence Analysis, DNA/methods; Software
14.
PLoS One ; 8(6): e65395, 2013.
Article in English | MEDLINE | ID: mdl-23799013

ABSTRACT

Region-based association analysis, instead of individual testing of each SNP, was introduced in genome-wide association studies to increase the power of gene mapping, especially for rare genetic variants. For regional association tests, the kernel machine-based regression approach was recently proposed as a more powerful alternative to collapsing-based methods. However, the vast majority of existing algorithms and software for kernel machine-based regression are applicable only to unrelated samples. In this paper, we present a new method for the kernel machine-based regression association analysis of quantitative traits in samples of related individuals. The method is based on the GRAMMAR+ transformation of phenotypes of related individuals, followed by use of existing kernel machine-based regression software for unrelated samples. We compared the performance of kernel-based association analysis on the Genetic Analysis Workshop 17 family sample and real human data by using our transformation, the original untransformed trait, and environmental residuals. We demonstrated that only the GRAMMAR+ transformation produced type I errors close to the nominal value and that this method had the highest empirical power. The new method can be applied to analysis of related samples by using existing software for kernel-based association analysis developed for unrelated samples.


Subjects
Quantitative Trait Loci; Algorithms; Humans; Phenotype; Polymorphism, Single Nucleotide
15.
Nat Genet ; 44(10): 1166-70, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22983301

ABSTRACT

The variance component tests used in genome-wide association studies (GWAS) with large sample sizes become computationally expensive when the number of genetic markers is over a few hundred thousand. We present an extremely fast variance components-based two-step method, GRAMMAR-Gamma, developed as an analytical approximation within a framework of the score test approach. Using simulated and real human GWAS data sets, we show that this method provides unbiased estimates of the SNP effect and has a power close to that of the likelihood ratio test-based method. The computational complexity of our method is close to its theoretical minimum, that is, to the complexity of the analysis that ignores genetic structure. The running time of our method depends linearly on sample size, whereas this dependency is quadratic for other existing methods. Simulations suggest that GRAMMAR-Gamma may be used for association testing in whole-genome resequencing studies of large human cohorts.


Subjects
Computer Simulation; Genome-Wide Association Study; Models, Genetic; Algorithms; Arabidopsis/genetics; Genetic Markers; Humans; Likelihood Functions; Linear Models; Normal Distribution; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait Loci; Sequence Analysis, DNA