Search | PAHO/WHO Uruguay

Testing for associations between loci and environmental gradients using latent factor mixed models.

Frichot, Eric; Schoville, Sean D; Bouchard, Guillaume; François, Olivier.

Mol Biol Evol ; 30(7): 1687-99, 2013 Jul.

Article in English | MEDLINE | ID: mdl-23543094

ABSTRACT

Adaptation to local environments often occurs through natural selection acting on a large number of loci, each having a weak phenotypic effect. One way to detect these loci is to identify genetic polymorphisms that exhibit high correlation with environmental variables used as proxies for ecological pressures. Here, we propose new algorithms based on population genetics, ecological modeling, and statistical learning techniques to screen genomes for signatures of local adaptation. Implemented in the computer program "latent factor mixed model" (LFMM), these algorithms employ an approach in which population structure is introduced using unobserved variables. These fast and computationally efficient algorithms detect correlations between environmental and genetic variation while simultaneously inferring background levels of population structure. Comparing these new algorithms with related methods provides evidence that LFMM can efficiently estimate random effects due to population history and isolation-by-distance patterns when computing gene-environment correlations, and decrease the number of false-positive associations in genome scans. We then apply these models to plant and human genetic data, identifying several genes with functions related to development that exhibit strong correlations with climatic gradients.
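As a reading aid, the approach described in this abstract can be written as a regression in which population structure enters through unobserved factors. The notation below (G, X, U, V, K and the Gaussian residual) is an assumed convention for illustration, not text quoted from this record.

```latex
% Sketch of a latent factor mixed model (notation assumed for illustration).
% G_{i\ell} : genotype of individual i at locus \ell
% X_i       : environmental variable(s) measured for individual i
% \beta_\ell: effect of the environment on locus \ell (the quantity tested)
% U_i, V_\ell: K latent factors and their loadings, modeling population structure
G_{i\ell} = \mu_\ell + \beta_\ell^{\top} X_i + U_i^{\top} V_\ell + \epsilon_{i\ell},
\qquad \epsilon_{i\ell} \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, a locus is a candidate for local adaptation when its environmental effect is significantly different from zero after the latent term has absorbed background structure.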

Subject(s)

Physiological Adaptation/genetics, Population Genetics, Genetic Polymorphism, Genetic Selection/genetics, Algorithms, Ecology, Environment, Gene-Environment Interaction, Genetic Variation, Humans, Theoretical Models

Genome scan methods against more complex models: when and how much should we trust them?

de Villemereuil, Pierre; Frichot, Éric; Bazin, Éric; François, Olivier; Gaggiotti, Oscar E.

Mol Ecol ; 23(8): 2006-19, 2014 Apr.

Article in English | MEDLINE | ID: mdl-24611968

ABSTRACT

The recent availability of next-generation sequencing (NGS) has made possible the use of dense genetic markers to identify regions of the genome that may be under the influence of selection. Several statistical methods have been developed recently for this purpose. Here, we present the results of an individual-based simulation study investigating the power and error rate of popular or recent genome scan methods: linear regression, Bayescan, BayEnv and LFMM. Contrary to previous studies, we focus on complex, hierarchical population structure and on polygenic selection. Additionally, we use a false discovery rate (FDR)-based framework, which provides a unified testing framework across frequentist and Bayesian methods. Finally, we investigate the influence of population allele frequencies versus individual genotype data specification for LFMM and the linear regression. The relative ranking between the methods is impacted by the consideration of polygenic selection, compared to a monogenic scenario. For strongly hierarchical scenarios with confounding effects between demography and environmental variables, the power of the methods can be very low. Except for one scenario, Bayescan exhibited moderate power and error rate. BayEnv performance was good under nonhierarchical scenarios, while LFMM provided the best compromise between power and error rate across scenarios. We found that it is possible to greatly reduce error rates by considering the results of all three methods when identifying outlier loci.
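To illustrate what FDR control means for a frequentist genome scan, here is a minimal sketch of the Benjamini-Hochberg step-up procedure applied to per-locus p-values. This is one standard FDR procedure and is an assumption made for illustration; the record does not state which procedure the authors used, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the locus is declared an outlier.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank k such that p_(k) <= k * alpha / m, and reject the k smallest.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the loci, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value falls under its rank-specific threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for ten loci.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, alpha=0.05))
```

Because the rule is step-up, a locus whose own p-value exceeds its threshold can still be rejected when a higher-ranked locus passes, which is why the cutoff is searched over all ranks rather than stopping at the first failure.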

Subject(s)

Bayes Theorem, Population Genetics/methods, Genetic Models, Computer Simulation, Statistical Data Interpretation, Gene Frequency, Gene-Environment Interaction, Genotype, Linear Models, Single Nucleotide Polymorphism

Fast and efficient estimation of individual ancestry coefficients.

Frichot, Eric; Mathieu, François; Trouillon, Théo; Bouchard, Guillaume; François, Olivier.

Genetics ; 196(4): 973-83, 2014 Apr.

Article in English | MEDLINE | ID: mdl-24496008

ABSTRACT

Inference of individual ancestry coefficients, which is important for population genetic and association studies, is commonly performed using computer-intensive likelihood algorithms. With the availability of large population genomic data sets, fast versions of likelihood algorithms have attracted considerable attention. Reducing the computational burden of estimation algorithms remains, however, a major challenge. Here, we present a fast and efficient method for estimating individual ancestry coefficients based on sparse nonnegative matrix factorization algorithms. We implemented our method in the computer program sNMF and applied it to human and plant data sets. The performances of sNMF were then compared to the likelihood algorithm implemented in the computer program ADMIXTURE. Without loss of accuracy, sNMF computed estimates of ancestry coefficients with runtimes ∼10-30 times shorter than those of ADMIXTURE.
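The idea of estimating ancestry by nonnegative matrix factorization can be sketched as a constrained least-squares problem. The symbols below (X, Q, F, K) are assumed notation for illustration, and the sparsity penalty that gives the method its name is deliberately left out because this record does not specify its form.

```latex
% Sketch only (notation assumed): X is the encoded genotype matrix,
% Q holds the ancestry coefficients q_{ik} of individual i in ancestral
% population k, and F holds ancestral genotype frequencies.
\min_{Q \ge 0,\; F \ge 0} \; \lVert X - Q F \rVert_F^{2}
\quad \text{subject to} \quad \sum_{k=1}^{K} q_{ik} = 1 \ \text{for every individual } i
```

Read this way, each row of Q is a vector of nonnegative proportions summing to one, which is what allows it to be interpreted as individual ancestry.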

Subject(s)

Algorithms, Population Genetics, Software, Computational Biology/methods, Gene Frequency, Genetic Association Studies, Genotype, Humans, Likelihood Functions, Plants/genetics, Population Groups, Time Factors

Correcting principal component maps for effects of spatial autocorrelation in population genetic data.

Frichot, Eric; Schoville, Sean; Bouchard, Guillaume; François, Olivier.

Front Genet ; 3: 254, 2012.

Article in English | MEDLINE | ID: mdl-23181073

ABSTRACT

In many species, spatial genetic variation displays patterns of "isolation-by-distance." Characterized by locally correlated allele frequencies, these patterns are known to create periodic shapes in geographic maps of principal components which confound signatures of specific migration events and influence interpretations of principal component analyses (PCA). In this study, we introduced models combining probabilistic PCA and kriging models to infer population genetic structure from genetic data while correcting for effects generated by spatial autocorrelation. The corresponding algorithms are based on singular value decomposition and low rank approximation of the genotypic data. As their complexity is close to that of PCA, these algorithms scale with the dimensions of the data. To illustrate the utility of these new models, we simulated isolation-by-distance patterns and broad-scale geographic variation using spatial coalescent models. Our methods remove the horseshoe patterns usually observed in PC maps and simplify interpretations of spatial genetic variation. We demonstrate our approach by analyzing single nucleotide polymorphism data from the Human Genome Diversity Panel, and provide comparisons with other recently introduced methods.
