Results 1 - 19 of 19
1.
bioRxiv ; 2024 Jan 20.
Article in English | MEDLINE | ID: mdl-38313273

ABSTRACT

All published methods for learning about demographic history make the simplifying assumption that the genome evolves neutrally, and do not seek to account for the effects of natural selection on patterns of variation. This is a major concern, as ample work has demonstrated the pervasive effects of natural selection and in particular background selection (BGS) on patterns of genetic variation in diverse species. Simulations and theoretical work have shown that methods to infer changes in effective population size over time (Ne(t)) become increasingly inaccurate as the strength of linked selection increases. Here, we introduce an extension to the Pairwise Sequentially Markovian Coalescent (PSMC) algorithm, PSMC+, which explicitly co-models demographic history and natural selection. We benchmark our method using forward-in-time simulations with BGS and find that our approach improves the accuracy of effective population size inference. Leveraging a high resolution map of BGS in humans, we infer considerable changes in the magnitude of inferred effective population size relative to previous reports. Finally, we separately infer Ne(t) on the X chromosome and on the autosomes in diverse great apes without making a correction for selection, and find that the inferred ratio fluctuates substantially through time in a way that differs across species, showing that uncorrected selection may be an important driver of signals of genetic difference on the X chromosome and autosomes.

2.
medRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38106023

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp); because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results. We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb, due to alternative splicing) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common super-enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.
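
The gap between SNP-heritability and the sum of causal effect size variances can be illustrated with a two-SNP toy calculation. This is only a sketch under assumptions that are not taken from the abstract: genotypes x_1, x_2 are standardized to unit variance with LD correlation r, and the causal effects beta_1, beta_2 are treated as random with mean zero, variances sigma_1^2 and sigma_2^2, and correlation rho.

```latex
% Genetic variance for fixed effects, with Var(x_1) = Var(x_2) = 1 and Cov(x_1, x_2) = r:
\operatorname{Var}_x(\beta_1 x_1 + \beta_2 x_2) = \beta_1^2 + \beta_2^2 + 2 r \beta_1 \beta_2
% Averaging over random effects with E[\beta_i^2] = \sigma_i^2 and E[\beta_1 \beta_2] = \rho \sigma_1 \sigma_2:
\mathbb{E}_\beta\!\left[\operatorname{Var}_x(\beta_1 x_1 + \beta_2 x_2)\right]
  = \sigma_1^2 + \sigma_2^2 + 2 r \rho\, \sigma_1 \sigma_2
```

Under these assumptions the cross term is negative whenever r and rho have opposite signs, which matches the direction of the reported estimates (negative effect correlation for positive-LD pairs, positive effect correlation for negative-LD pairs), so the expected genetic variance falls below sigma_1^2 + sigma_2^2.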

3.
Nat Genet ; 55(11): 1854-1865, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37814053

ABSTRACT

The analysis of longitudinal data from electronic health records (EHRs) has the potential to improve clinical diagnoses and enable personalized medicine, motivating efforts to identify disease subtypes from patient comorbidity information. Here we introduce an age-dependent topic modeling (ATM) method that provides a low-rank representation of longitudinal records of hundreds of distinct diseases in large EHR datasets. We applied ATM to 282,957 UK Biobank samples, identifying 52 diseases with heterogeneous comorbidity profiles; analyses of 211,908 All of Us samples produced concordant results. We defined subtypes of the 52 heterogeneous diseases based on their comorbidity profiles and compared genetic risk across disease subtypes using polygenic risk scores (PRSs), identifying 18 disease subtypes whose PRS differed significantly from other subtypes of the same disease. We further identified specific genetic variants with subtype-dependent effects on disease risk. In conclusion, ATM identifies disease subtypes with differential genome-wide and locus-specific genetic risk profiles.


Subjects
Genetic Predisposition to Disease , Population Health , Humans , Biological Specimen Banks , Genome-Wide Association Study/methods , Risk Factors , Comorbidity , Multifactorial Inheritance/genetics , United Kingdom/epidemiology
4.
medRxiv ; 2023 Sep 23.
Article in English | MEDLINE | ID: mdl-37790574

ABSTRACT

The role of gene-environment (GxE) interaction in disease and complex trait architectures is widely hypothesized, but currently unknown. Here, we apply three statistical approaches to quantify and distinguish three different types of GxE interaction for a given disease/trait and E variable. First, we detect locus-specific GxE interaction by testing for genetic correlation (rg) < 1 across E bins. Second, we detect genome-wide effects of the E variable on genetic variance by leveraging polygenic risk scores (PRS) to test for significant PRSxE in a regression of phenotypes on PRS, E, and PRSxE, together with differences in SNP-heritability across E bins. Third, we detect genome-wide proportional amplification of genetic and environmental effects as a function of the E variable by testing for significant PRSxE with no differences in SNP-heritability across E bins. Simulations show that these approaches achieve high sensitivity and specificity in distinguishing these three GxE scenarios. We applied our framework to 33 UK Biobank diseases/traits (average N=325K) and 10 E variables spanning lifestyle, diet, and other environmental exposures. First, we identified 19 trait-E pairs with rg significantly < 1 (FDR<5%) (average rg=0.95); for example, white blood cell count had rg=0.95 (s.e. 0.01) between smokers and non-smokers. Second, we identified 28 trait-E pairs with significant PRSxE and significant SNP-heritability differences across E bins; for example, type 2 diabetes had a significant PRSxE for alcohol consumption (P=1e-13) with 4.2x larger SNP-heritability in the largest versus smallest quintiles of alcohol consumption (P<1e-16). Third, we identified 15 trait-E pairs with significant PRSxE with no SNP-heritability differences across E bins; for example, triglyceride levels had a significant PRSxE effect for composite diet score (P=4e-5) with no SNP-heritability differences. 
Analyses using biological sex as the E variable produced additional significant findings in each of the three scenarios. Overall, we infer a substantial contribution of GxE and GxSex effects to disease and complex trait variance.
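
The PRSxE test described above is a regression of phenotype on PRS, E, and their product. The following is a minimal sketch on simulated data, not the authors' pipeline: the effect sizes, sample size, and function names (`solve_linear_system`, `ols`, `simulate_and_fit`) are all illustrative assumptions, and the fit uses plain least squares via the normal equations.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate_and_fit(n=5000, interaction=0.3, seed=1):
    """Simulate phenotype = 0.5*PRS + 0.2*E + interaction*PRS*E + noise, then fit.

    All coefficients here are hypothetical values chosen for illustration.
    Returns [intercept, b_PRS, b_E, b_PRSxE].
    """
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        prs = rng.gauss(0, 1)
        e = rng.gauss(0, 1)
        pheno = 0.5 * prs + 0.2 * e + interaction * prs * e + rng.gauss(0, 1)
        rows.append([1.0, prs, e, prs * e])
        y.append(pheno)
    return ols(rows, y)


if __name__ == "__main__":
    intercept, b_prs, b_e, b_prsxe = simulate_and_fit()
    print(f"PRS={b_prs:.2f}  E={b_e:.2f}  PRSxE={b_prsxe:.2f}")
```

A nonzero PRSxE coefficient alone does not separate the second and third scenarios in the abstract; per the text, that distinction additionally requires comparing SNP-heritability across E bins, which this sketch does not attempt.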

5.
Elife ; 12, 2023 03 20.
Article in English | MEDLINE | ID: mdl-36939312

ABSTRACT

The genetic variants introduced into the ancestors of modern humans from interbreeding with Neanderthals have been suggested to contribute an unexpected extent to complex human traits. However, testing this hypothesis has been challenging due to the idiosyncratic population genetic properties of introgressed variants. We developed rigorous methods to assess the contribution of introgressed Neanderthal variants to heritable trait variation and applied these methods to analyze 235,592 introgressed Neanderthal variants and 96 distinct phenotypes measured in about 300,000 unrelated white British individuals in the UK Biobank. Introgressed Neanderthal variants make a significant contribution to trait variation (explaining 0.12% of trait variation on average). However, the contribution of introgressed variants tends to be significantly depleted relative to modern human variants matched for allele frequency and linkage disequilibrium (about 59% depletion on average), consistent with purifying selection on introgressed variants. Different from previous studies (McArthur et al., 2021), we find no evidence for elevated heritability across the phenotypes examined. We identified 348 independent significant associations of introgressed Neanderthal variants with 64 phenotypes. Previous work (Skov et al., 2020) has suggested that a majority of such associations are likely driven by statistical association with nearby modern human variants that are the true causal variants. Applying a customized fine-mapping led us to identify 112 regions across 47 phenotypes containing 4303 unique genetic variants where introgressed variants are highly likely to have a phenotypic effect. Examination of these variants reveals their substantial impact on genes that are important for the immune system, development, and metabolism.


Subjects
Hominidae , Neanderthals , Animals , Humans , Neanderthals/genetics , Multifactorial Inheritance , Hominidae/genetics , Gene Frequency , Population Genetics , Human Genome
6.
Mol Biol Evol ; 40(1), 2023 01 04.
Article in English | MEDLINE | ID: mdl-36617238

ABSTRACT

Adaptive introgression (AI) facilitates local adaptation in a wide range of species. Many state-of-the-art methods detect AI with ad-hoc approaches that identify summary statistic outliers or intersect scans for positive selection with scans for introgressed genomic regions. Although widely used, approaches intersecting outliers are vulnerable to a high false-negative rate as the power of different methods varies, especially for complex introgression events. Moreover, population genetic processes unrelated to AI, such as background selection or heterosis, may create similar genomic signals to AI, compromising the reliability of methods that rely on neutral null distributions. In recent years, machine learning (ML) methods have been increasingly applied to population genetic questions. Here, we present a ML-based method called MaLAdapt for identifying AI loci from genome-wide sequencing data. Using an Extra-Trees Classifier algorithm, our method combines information from a large number of biologically meaningful summary statistics to capture a powerful composite signature of AI across the genome. In contrast to existing methods, MaLAdapt is especially well-powered to detect AI with mild beneficial effects, including selection on standing archaic variation, and is robust to non-AI selective sweeps, heterosis from deleterious mutations, and demographic misspecification. Furthermore, MaLAdapt outperforms existing methods for detecting AI based on the analysis of simulated data and the validation of empirical signals through visual inspection of haplotype patterns. We apply MaLAdapt to the 1000 Genomes Project human genomic data and discover novel AI candidate regions in non-African populations, including genes that are enriched in functionally important biological pathways regulating metabolism and immune responses.


Subjects
Neanderthals , Humans , Animals , Neanderthals/genetics , Reproducibility of Results , Population Genetics , Physiological Adaptation , Genetic Selection , Human Genome
7.
Res Sq ; 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38168385

ABSTRACT

The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp); because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results. We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb, due to alternative splicing) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common super-enhancer SNPs), even though these quantities are widely assumed to be equal.
We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.

8.
Bioinformatics ; 37(Suppl_1): i142-i150, 2021 07 12.
Article in English | MEDLINE | ID: mdl-34252951

ABSTRACT

MOTIVATION: Admixture, the interbreeding between previously distinct populations, is a pervasive force in evolution. The evolutionary history of populations in the presence of admixture can be modeled by augmenting phylogenetic trees with additional nodes that represent admixture events. While enabling a more faithful representation of evolutionary history, admixture graphs present formidable inferential challenges, and there is an increasing need for methods that are accurate, fully automated and computationally efficient. One key challenge arises from the size of the space of admixture graphs. Given that exhaustively evaluating all admixture graphs can be prohibitively expensive, heuristics have been developed to enable efficient search over this space. One heuristic, implemented in the popular method TreeMix, consists of adding edges to a starting tree while optimizing a suitable objective function. RESULTS: Here, we present a demographic model (with one admixed population incident to a leaf) where TreeMix and any other starting-tree-based maximum likelihood heuristic using its likelihood function is guaranteed to get stuck in a local optimum and return an incorrect network topology. To address this issue, we propose a new search strategy that we term maximum likelihood network orientation (MLNO). We augment TreeMix with an exhaustive search for an MLNO, referring to this approach as OrientAGraph. In evaluations including previously published admixture graphs, OrientAGraph outperformed TreeMix on 4/8 models (there are no differences in the other cases). Overall, OrientAGraph found graphs with higher likelihood scores and topological accuracy while remaining computationally efficient. Lastly, our study reveals several directions for improving maximum likelihood admixture graph estimation. AVAILABILITY AND IMPLEMENTATION: OrientAGraph is available on Github (https://github.com/sriramlab/OrientAGraph) under the GNU General Public License v3.0. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Software , Humans , Likelihood Functions , Phylogeny , Population Groups
9.
Am J Hum Genet ; 108(4): 620-631, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33691092

ABSTRACT

Phenotype prediction is a key goal for medical genetics. Unfortunately, most genome-wide association studies are done in European populations, which reduces the accuracy of predictions via polygenic scores in non-European populations. Here, we use population genetic models to show that human demographic history and negative selection on complex traits can result in population-specific genetic architectures. For traits where alleles with the largest effect on the trait are under the strongest negative selection, approximately half of the heritability can be accounted for by variants in Europe that are absent from Africa, leading to poor performance in phenotype prediction across these populations. Further, under such a model, individuals in the tails of the genetic risk distribution may not be identified via polygenic scores generated in another population. We empirically test these predictions by building a model to stratify heritability between European-specific and shared variants and applied it to 37 traits and diseases in the UK Biobank. Across these phenotypes, ∼30% of the heritability comes from European-specific variants. We conclude that genetic association studies need to include more diverse populations to enable the utility of phenotype prediction in all populations.


Subjects
Genetic Predisposition to Disease , Population Genetics , Genetic Models , Multifactorial Inheritance/genetics , Phenotype , Genetic Selection/genetics , Africa/ethnology , Computer Simulation , Datasets as Topic , Europe/ethnology , Genetic Variation/genetics , Humans , Population Growth , United Kingdom
10.
Science ; 371(6527): 415-419, 2021 01 22.
Article in English | MEDLINE | ID: mdl-33479156

ABSTRACT

Metabolic pathways differ across species but are expected to be similar within a species. We discovered two functional, incompatible versions of the galactose pathway in Saccharomyces cerevisiae. We identified a three-locus genetic interaction for growth in galactose, and used precisely engineered alleles to show that it arises from variation in the galactose utilization genes GAL2, GAL1/10/7, and phosphoglucomutase (PGM1), and that the reference allele of PGM1 is incompatible with the alternative alleles of the other genes. Multilocus balancing selection has maintained the two incompatible versions of the pathway for millions of years. Strains with alternative alleles are found primarily in galactose-rich dairy environments, and they grow faster in galactose but slower in glucose, revealing a trade-off on which balancing selection may have acted.


Subjects
Galactose/metabolism , Metabolic Networks and Pathways/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Genetic Selection , Alleles , Galactokinase/genetics , Monosaccharide Transport Proteins/genetics , Phosphoglucomutase/genetics , Trans-Activators/genetics
11.
Elife ; 9, 2020 06 23.
Article in English | MEDLINE | ID: mdl-32573438

ABSTRACT

The explosion in population genomic data demands ever more complex modes of analysis, and increasingly, these analyses depend on sophisticated simulations. Recent advances in population genetic simulation have made it possible to simulate large and complex models, but specifying such models for a particular simulation engine remains a difficult and error-prone task. Computational genetics researchers currently re-implement simulation models independently, leading to inconsistency and duplication of effort. This situation presents a major barrier to empirical researchers seeking to use simulations for power analyses of upcoming studies or sanity checks on existing genomic data. Population genetics, as a field, also lacks standard benchmarks by which new tools for inference might be measured. Here, we describe a new resource, stdpopsim, that attempts to rectify this situation. Stdpopsim is a community-driven open source project, which provides easy access to a growing catalog of published simulation models from a range of organisms and supports multiple simulation engine backends. This resource is available as a well-documented python library with a simple command-line interface. We share some examples demonstrating how stdpopsim can be used to systematically compare demographic inference methods, and we encourage a broader community of developers to contribute to this growing resource.


Subjects
Population Genetics , Genomic Library , Genetic Models , Animals , Arabidopsis/genetics , Dogs/genetics , Drosophila melanogaster/genetics , Escherichia coli/genetics , Population Genetics/methods , Population Genetics/organization & administration , Genome/genetics , Human Genome/genetics , Humans , Pongo abelii/genetics
12.
Sci Adv ; 6(7): eaax5097, 2020 02.
Article in English | MEDLINE | ID: mdl-32095519

ABSTRACT

While introgression from Neanderthals and Denisovans has been documented in modern humans outside Africa, the contribution of archaic hominins to the genetic variation of present-day Africans remains poorly understood. We provide complementary lines of evidence for archaic introgression into four West African populations. Our analyses of site frequency spectra indicate that these populations derive 2 to 19% of their genetic ancestry from an archaic population that diverged before the split of Neanderthals and modern humans. Using a method that can identify segments of archaic ancestry without the need for reference archaic genomes, we built genome-wide maps of archaic ancestry in the Yoruba and the Mende populations. Analyses of these maps reveal segments of archaic ancestry at high frequency in these populations that represent potential targets of adaptive introgression. Our results reveal the substantial contribution of archaic ancestry in shaping the gene pool of present-day West African populations.
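
For readers unfamiliar with the site frequency spectrum (SFS) mentioned above, here is a minimal sketch of how an unfolded SFS is tabulated from phased haplotypes. It assumes a 0/1 matrix in which 1 marks the derived allele; the function name and toy data are illustrative, and the model-fitting the authors performed on such spectra is not shown.

```python
from collections import Counter


def site_frequency_spectrum(haplotypes):
    """Return the unfolded SFS for a list of equal-length 0/1 haplotypes.

    Entry k-1 of the result is the number of sites at which exactly k of the
    n sampled haplotypes carry the derived allele, for k = 1 .. n-1.
    Sites that are fixed (k = n) or invariant (k = 0) are excluded.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    derived_counts = Counter(
        sum(hap[site] for hap in haplotypes) for site in range(n_sites)
    )
    return [derived_counts.get(k, 0) for k in range(1, n)]


if __name__ == "__main__":
    # Four haplotypes, five sites (toy data).
    haps = [
        [0, 1, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 1, 0, 1, 0],
    ]
    print(site_frequency_spectrum(haps))
```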


Subjects
Black People/genetics , Population Genetics , Ethnicity/genetics , Gene Frequency , Humans , Phylogeny
13.
PLoS Genet ; 15(5): e1008175, 2019 05.
Article in English | MEDLINE | ID: mdl-31136573

ABSTRACT

Statistical analyses of genomic data from diverse human populations have demonstrated that archaic hominins, such as Neanderthals and Denisovans, interbred or admixed with the ancestors of present-day humans. Central to these analyses are methods for inferring archaic ancestry along the genomes of present-day individuals (archaic local ancestry). Methods for archaic local ancestry inference rely on the availability of reference genomes from the ancestral archaic populations for accurate inference. However, several instances of archaic admixture lack reference archaic genomes, making it difficult to characterize these events. We present a statistical method that combines diverse population genetic summary statistics to infer archaic local ancestry without access to an archaic reference genome. We validate the accuracy and robustness of our method in simulations. When applied to genomes of European individuals, our method recovers segments that are substantially enriched for Neanderthal ancestry, even though our method did not have access to any Neanderthal reference genomes.


Subjects
Population Genetics/methods , Genomics/methods , Hominidae/genetics , Animals , Human Genome/genetics , Humans , Statistical Models , Neanderthals/genetics
14.
Nat Commun ; 9(1): 2750, 2018 07 16.
Article in English | MEDLINE | ID: mdl-30013096

ABSTRACT

Dominance is a fundamental concept in molecular genetics and has implications for understanding patterns of genetic variation, evolution, and complex traits. However, despite its importance, the degree of dominance in natural populations is poorly quantified. Here, we leverage multiple mating systems in natural populations of Arabidopsis to co-estimate the distribution of fitness effects and dominance coefficients of new amino acid changing mutations. We find that more deleterious mutations are more likely to be recessive than less deleterious mutations. Further, this pattern holds across gene categories, but varies with the connectivity and expression patterns of genes. Our work argues that dominance arises as a consequence of the functional importance of genes and their optimal expression levels.


Subjects
Arabidopsis/genetics , Molecular Evolution , Dominant Genes , Plant Genes , Genetic Models , Arabidopsis/metabolism , Gene Expression , Recessive Genes , Genetic Fitness , Genetic Variation , Population Genetics , Mutation , Genetic Selection
15.
Mol Biol Evol ; 35(5): 1190-1209, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29688543

ABSTRACT

Pigmentation is often used to understand how natural selection affects genetic variation in wild populations since it can have a simple genetic basis, and can affect a variety of fitness-related traits (e.g., camouflage, thermoregulation, and sexual display). In gray wolves, the K locus, a β-defensin gene, causes black coat color via a dominantly inherited KB allele. The allele is derived from dog-wolf hybridization and is at high frequency in North American wolf populations. We designed a DNA capture array to probe the geographic origin, age, and number of introgression events of the KB allele in a panel of 331 wolves and 20 dogs. We found low diversity in KB, but not ancestral ky, wolf haplotypes consistent with a selective sweep of the black haplotype across North America. Further, North American wolf KB haplotypes are monophyletic, suggesting that a single adaptive introgression from dogs to wolves most likely occurred in the Northwest Territories or Yukon. We use a new analytical approach to date the origin of the KB allele in Yukon wolves to between 1,598 and 7,248 years ago, suggesting that introgression with early Native American dogs was the source. Using population genetic simulations, we show that the K locus is undergoing natural selection in four wolf populations. We find evidence for balancing selection, specifically in Yellowstone wolves, which could be a result of selection for enhanced immunity in response to distemper. With these data, we demonstrate how the spread of an adaptive variant may have occurred across a species' geographic range.


Subjects
Hair Color/genetics , Genetic Selection , Wolves/genetics , beta-Defensins/genetics , Animals , Computer Simulation , Dogs , Gene Frequency , Genetic Variation , Haplotypes , Homozygote , North America
16.
Science ; 360(6389): 656-660, 2018 May 11.
Article in English | MEDLINE | ID: mdl-29674434

ABSTRACT

To investigate the consequences of hybridization between species, we studied three replicate hybrid populations that formed naturally between two swordtail fish species, estimating their fine-scale genetic map and inferring ancestry along the genomes of 690 individuals. In all three populations, ancestry from the "minor" parental species is more common in regions of high recombination and where there is linkage to fewer putative targets of selection. The same patterns are apparent in a reanalysis of human and archaic admixture. These results support models in which ancestry from the minor parental species is more likely to persist when rapidly uncoupled from alleles that are deleterious in hybrids. Our analyses further indicate that selection on swordtail hybrids stems predominantly from deleterious combinations of epistatically interacting alleles.


Subjects
Chimera/genetics , Genetic Epistasis , Molecular Evolution , Genetic Recombination , Genetic Selection , Alleles , Animals , Fishes , Genetic Hybridization
17.
Proc Natl Acad Sci U S A ; 114(20): 5213-5218, 2017 05 16.
Article in English | MEDLINE | ID: mdl-28473417

ABSTRACT

Over the past 20 y, many studies have examined the history of the plant ecological and molecular model, Arabidopsis thaliana, in Europe and North America. Although these studies informed us about the recent history of the species, the early history has remained elusive. In a large-scale genomic analysis of African A. thaliana, we sequenced the genomes of 78 modern and herbarium samples from Africa and analyzed these together with over 1,000 previously sequenced Eurasian samples. In striking contrast to expectations, we find that all African individuals sampled are native to this continent, including those from sub-Saharan Africa. Moreover, we show that Africa harbors the greatest variation and represents the deepest history in the A. thaliana lineage. Our results also reveal evidence that selfing, a major defining characteristic of the species, evolved in a single geographic region, best represented today within Africa. Demographic inference supports a model in which the ancestral A. thaliana population began to split by 120-90 kya, during the last interglacial and Abbassia pluvial, and Eurasian populations subsequently separated from one another at around 40 kya. This bears striking similarities to the patterns observed for diverse species, including humans, implying a key role for climatic events during interglacial and pluvial periods in shaping the histories and current distributions of a wide range of species.


Subjects
Arabidopsis/genetics , Genomics/methods , Africa , Africa South of the Sahara , Base Sequence , Biological Evolution , Europe , Molecular Evolution , Genetic Variation/genetics , Population Genetics/methods , Plant Genome/genetics , Haplotypes/genetics , Phylogeny , Principal Component Analysis
18.
Mol Ecol Resour ; 16(6): 1449-1454, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27480660

ABSTRACT

High-throughput sequencing has changed many aspects of population genetics, molecular ecology and related fields, affecting both experimental design and data analysis. The software package angsd allows users to perform a number of population genetic analyses on high-throughput sequencing data. angsd uses probabilistic approaches which can directly make use of genotype likelihoods; thus, SNP calling is not required for comparative analyses. This takes advantage of all the sequencing data and produces more accurate results for samples with low sequencing depth. Here, we present angsd-wrapper, a set of wrapper scripts that provides a user-friendly interface for running angsd and visualizing results. angsd-wrapper supports multiple types of analyses including estimates of nucleotide sequence diversity, neutrality tests, principal component analysis, estimation of admixture proportions for individual samples and calculation of statistics that quantify recent introgression. angsd-wrapper also provides interactive graphing of angsd results to enhance data exploration. We demonstrate the usefulness of angsd-wrapper by analysing resequencing data from populations of wild and domesticated Zea. angsd-wrapper is freely available from https://github.com/mojaveazure/angsd-wrapper.
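
To make the idea of a genotype likelihood concrete, the sketch below computes log-likelihoods for the three diploid genotypes at one biallelic site from the observed read bases. It uses a deliberately simplified textbook error model (each read samples one of the two alleles with probability 1/2, and a sequencing error yields any of the three other bases with equal probability). This is an assumption made for illustration and is not claimed to be the model angsd or angsd-wrapper implements; the function name and inputs are hypothetical.

```python
import math


def genotype_log_likelihoods(bases, error_rates, ref="A", alt="G"):
    """Log-likelihood of each diploid genotype given read bases at one site.

    bases: observed base for each read covering the site.
    error_rates: per-read probability that the base call is wrong.
    Returns a dict keyed by genotype string, e.g. {"AA": ..., "AG": ..., "GG": ...}.
    """

    def p_base_given_allele(base, allele, err):
        # Simplified error model: a miscall is equally likely to be any other base.
        return 1.0 - err if base == allele else err / 3.0

    result = {}
    for genotype in ((ref, ref), (ref, alt), (alt, alt)):
        log_lik = 0.0
        for base, err in zip(bases, error_rates):
            p = 0.5 * p_base_given_allele(base, genotype[0], err) \
                + 0.5 * p_base_given_allele(base, genotype[1], err)
            log_lik += math.log(p)
        result["".join(genotype)] = log_lik
    return result


if __name__ == "__main__":
    # Low depth: two reads, both showing the reference base.
    low_depth = genotype_log_likelihoods(["A", "A"], [0.01, 0.01])
    for genotype, log_lik in low_depth.items():
        print(genotype, round(log_lik, 3))
```

With only two reads, the heterozygote remains far more plausible than the alternate homozygote even though no alternate base was seen, which is the kind of uncertainty that is lost when a hard genotype call is made at low depth.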


Subjects
Computational Biology/methods , Population Genetics/methods , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis/methods , Genetic Variation , Software , Zea mays/classification , Zea mays/genetics
19.
Nat Plants ; 2: 16084, 2016 06 13.
Article in English | MEDLINE | ID: mdl-27294617

ABSTRACT

Genetic diversity is shaped by the interaction of drift and selection, but the details of this interaction are not well understood. The impact of genetic drift in a population is largely determined by its demographic history, typically summarized by its long-term effective population size (Ne). Rapidly changing population demographics complicate this relationship, however. To better understand how changing demography impacts selection, we used whole-genome sequencing data to investigate patterns of linked selection in domesticated and wild maize (teosinte). We produce the first whole-genome estimate of the demography of maize domestication, showing that maize was reduced to approximately 5% the population size of teosinte before it experienced rapid expansion post-domestication to population sizes much larger than its ancestor. Evaluation of patterns of nucleotide diversity in and near genes shows little evidence of selection on beneficial amino acid substitutions, and that the domestication bottleneck led to a decline in the efficiency of purifying selection in maize. Young alleles, however, show evidence of much stronger purifying selection in maize, reflecting the much larger effective size of present-day populations. Our results demonstrate that recent demographic change, a hallmark of many species including both humans and crops, can have immediate and wide-ranging impacts on diversity that conflict with expectations based on long-term Ne alone.
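
One way to see why a single long-term Ne can mislead under rapid demographic change is the standard textbook approximation that long-term Ne under fluctuating size is the harmonic mean of the per-generation sizes. The sketch below applies it to a hypothetical bottleneck-then-expansion history; the sizes and durations are invented for illustration (only the 5% bottleneck ratio echoes the abstract), and this is not the inference method used in the paper.

```python
def long_term_ne(sizes):
    """Harmonic mean of per-generation population sizes."""
    return len(sizes) / sum(1.0 / n for n in sizes)


if __name__ == "__main__":
    # Hypothetical history: ancestral size, a bottleneck to 5% of it,
    # then expansion to ten times the ancestral size (10 generations each).
    history = [10_000] * 10 + [500] * 10 + [100_000] * 10
    arithmetic_mean = sum(history) / len(history)
    print(f"harmonic mean Ne:  {long_term_ne(history):.0f}")
    print(f"arithmetic mean N: {arithmetic_mean:.0f}")
    print(f"present-day N:     {history[-1]}")
```

The harmonic mean is dominated by the bottleneck generations, so it sits far below the present-day size; in this toy history, selection acting on recent mutations would experience the large current population rather than the small long-term average.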


Subjects
Molecular Evolution , Plant Genome , Genetic Selection , Zea mays/genetics , Agricultural Crops/genetics , Domestication , Whole Genome Sequencing