Results 1 - 16 of 16
1.
Elife ; 12, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37342968

ABSTRACT

Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. 
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.


Subject(s)
Genome; Software; Computer Simulation; Genetics, Population; Genomics
2.
Am J Hum Genet ; 110(2): 326-335, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36610402

ABSTRACT

Local ancestry is the source ancestry at each point in the genome of an admixed individual. Inferred local ancestry is used for admixture mapping and population genetic analyses. We present FLARE (fast local ancestry estimation), a method for local ancestry inference. FLARE achieves high accuracy through the use of an extended Li and Stephens model, and it achieves exceptional computational performance through incorporation of computational techniques developed for genotype imputation. Memory requirements are reduced through on-the-fly compression of reference haplotypes and stored checkpoints. Computation time is reduced through the use of composite reference haplotypes. These techniques allow FLARE to scale to datasets with hundreds of thousands of sequenced individuals and to provide superior accuracy on large-scale data. FLARE is open source and available at https://github.com/browning-lab/flare.


Subject(s)
Genetics, Population; Genome, Human; Humans; Ethnicity; Genotype; Haplotypes/genetics
3.
HGG Adv ; 3(4): 100118, 2022 Oct 13.
Article in English | MEDLINE | ID: mdl-36267056

ABSTRACT

The common Arctic-specific LDLR p.G137S variant was recently shown to be associated with elevated lipid levels. Motivated by this, we aimed to investigate the effect of p.G137S on metabolic health and cardiovascular disease risk among Greenlanders to quantify its impact on the population. In a population-based Greenlandic cohort (n = 5,063), we tested for associations between the p.G137S variant and metabolic health traits as well as cardiovascular disease risk based on registry data. In addition, we explored the variant's impact on plasma NMR measured lipoprotein concentration and composition in another Greenlandic cohort (n = 1,629); 29.5% of the individuals in the cohort carried at least one copy of the p.G137S risk allele. Furthermore, 25.4% of the heterozygous and 54.7% of the homozygous carriers had high levels (>4.9 mmol/L) of serum LDL cholesterol, which is above the diagnostic level for familial hypercholesterolemia (FH). Moreover, p.G137S was associated with an overall atherosclerotic lipid profile, and increased risk of ischemic heart disease (HR [95% CI], 1.51 [1.18-1.92], p = 0.00096), peripheral artery disease (1.69 [1.01-2.82], p = 0.046), and coronary operations (1.78 [1.21-2.62], p = 0.0035). Due to its high frequency and large effect sizes, p.G137S has a marked population-level impact, increasing the risk of FH and cardiovascular disease for up to 30% of the Greenlandic population. Thus, p.G137S is a potential marker for early intervention in Arctic populations.

4.
Mol Ecol Resour ; 22(2): 503-518, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34351073

ABSTRACT

In genomic-scale data sets, loci are closely packed within chromosomes and hence provide correlated information. Averaging across loci as if they were independent creates pseudoreplication, which reduces the effective degrees of freedom (df') compared to the nominal degrees of freedom, df. This issue has been known for some time, but consequences have not been systematically quantified across the entire genome. Here, we measured pseudoreplication (quantified by the ratio df'/df) for a common metric of genetic differentiation (FST) and a common measure of linkage disequilibrium between pairs of loci (r²). Based on data simulated using models (SLiM and msprime) that allow efficient forward-in-time and coalescent simulations while precisely controlling population pedigrees, we estimated df' and df'/df by measuring the rate of decline in the variance of mean FST and mean r² as more loci were used. For both indices, df' increases with Ne and genome size, as expected. However, even for large Ne and large genomes, df' for mean r² plateaus after a few thousand loci, and a variance components analysis indicates that the limiting factor is uncertainty associated with sampling individuals rather than genes. Pseudoreplication is less extreme for FST, but df'/df ≤ 0.01 can occur in data sets using tens of thousands of loci. Commonly used block-jackknife methods consistently overestimated var(FST), producing very conservative confidence intervals. Predicting df' based on our modelling results as a function of Ne, L, S, and genome size provides a robust way to quantify precision associated with genomic-scale data sets.
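As a rough illustration of why the effective degrees of freedom can plateau as loci are added, the sketch below uses an equal-correlation toy model that is an assumption made here, not the simulation design of the study. If every pair of loci shares the same correlation rho, the variance of a mean over L loci is sigma² (1 + (L - 1) rho) / L, so dividing the single-locus variance by that quantity gives an effective number of independent loci, used here as a stand-in for df'. The function name and the value of rho are hypothetical.

```python
def effective_loci(n_loci: int, rho: float) -> float:
    """Effective number of independent loci under an equal-correlation toy model.

    Assumes (for illustration only) that every pair of loci has the same
    correlation rho, so Var(mean) = sigma^2 * (1 + (n_loci - 1) * rho) / n_loci.
    """
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return n_loci / (1.0 + (n_loci - 1) * rho)


if __name__ == "__main__":
    rho = 0.001  # hypothetical pairwise correlation between loci
    for n in (100, 1_000, 10_000, 100_000):
        eff = effective_loci(n, rho)
        # The ratio eff / n plays the role of df'/df in this toy model.
        print(f"loci={n:>7}  effective={eff:9.1f}  ratio={eff / n:.4f}")
```

In this toy model the effective number can never exceed 1/rho, however many loci are added, which mirrors the qualitative plateau described in the abstract.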


Subject(s)
Genomics; Models, Genetic; Genome Size; Linkage Disequilibrium; Pedigree; Population Density
5.
Gastroenterology ; 162(4): 1171-1182.e3, 2022 04.
Article in English | MEDLINE | ID: mdl-34914943

ABSTRACT

BACKGROUND & AIMS: The sucrase-isomaltase (SI) c.273_274delAG loss-of-function variant is common in Arctic populations and causes congenital sucrase-isomaltase deficiency, which is an inability to break down and absorb sucrose and isomaltose. Children with this condition experience gastrointestinal symptoms when dietary sucrose is introduced. We aimed to describe the health of adults with sucrase-isomaltase deficiency. METHODS: The association between c.273_274delAG and phenotypes related to metabolic health was assessed in 2 cohorts of Greenlandic adults (n = 4922 and n = 1629). A sucrase-isomaltase knockout (Sis-KO) mouse model was used to further elucidate the findings. RESULTS: Homozygous carriers of the variant had a markedly healthier metabolic profile than the remaining population, including lower body mass index (β [standard error], -2.0 [0.5] kg/m²; P = 3.1 × 10⁻⁵), body weight (-4.8 [1.4] kg; P = 5.1 × 10⁻⁴), fat percentage (-3.3% [1.0%]; P = 3.7 × 10⁻⁴), fasting triglyceride (-0.27 [0.07] mmol/L; P = 2.3 × 10⁻⁶), and remnant cholesterol (-0.11 [0.03] mmol/L; P = 4.2 × 10⁻⁵). Further analyses suggested that this was likely mediated partly by higher circulating levels of acetate observed in homozygous carriers (β [standard error], 0.056 [0.002] mmol/L; P = 2.1 × 10⁻²⁶), and partly by reduced sucrose uptake, but not lower caloric intake. These findings were verified in Sis-KO mice, which, compared with wild-type mice, were leaner on a sucrose-containing diet, despite similar caloric intake, had significantly higher plasma acetate levels in response to a sucrose gavage, and had lower plasma glucose level in response to a sucrose-tolerance test. CONCLUSIONS: These results suggest that sucrase-isomaltase constitutes a promising drug target for improvement of metabolic health, and that the health benefits are mediated by reduced dietary sucrose uptake and possibly also by higher levels of circulating acetate.


Subject(s)
Dietary Sucrose; Sucrase-Isomaltase Complex; Acetates; Animals; Carbohydrate Metabolism, Inborn Errors; Dietary Sucrose/adverse effects; Humans; Mice; Oligo-1,6-Glucosidase; Sucrase-Isomaltase Complex/deficiency; Sucrase-Isomaltase Complex/genetics; Sucrase-Isomaltase Complex/metabolism
6.
Curr Biol ; 31(10): 2214-2219.e4, 2021 05 24.
Article in English | MEDLINE | ID: mdl-33711251

ABSTRACT

The Inuit ancestors of the Greenlandic people arrived in Greenland close to 1,000 years ago [1]. Since then, Europeans from many different countries have been present in Greenland. Consequently, the present-day Greenlandic population has ∼25% of its genetic ancestry from Europe [2]. In this study, we investigated to what extent different European countries have contributed to this genetic ancestry. We combined dense SNP chip data from 3,972 Greenlanders and 8,275 Europeans from 14 countries and inferred the ancestry contribution from each of these 14 countries using haplotype-based methods. Due to the rapid increase in population size in Greenland over the past ∼100 years, we hypothesized that earlier European interactions, such as pre-colonial Dutch whalers and early German and Danish-Norwegian missionaries, as well as the later Danish colonists and post-colonial immigrants, all contributed European genetic ancestry. However, we found that the European ancestry is almost entirely Danish and that a substantial fraction is from admixture that took place within the last few generations.


Subject(s)
Genetics, Population; Inuit/genetics; White People; Denmark; Greenland; Haplotypes; Humans; Polymorphism, Single Nucleotide; White People/genetics
7.
Curr Biol ; 31(9): 1862-1871.e5, 2021 05 10.
Article in English | MEDLINE | ID: mdl-33636121

ABSTRACT

Large carnivores are generally sensitive to ecosystem changes because their specialized diet and position at the top of the trophic pyramid is associated with small population sizes. Accordingly, low genetic diversity at the whole-genome level has been reported for all big cat species, including the widely distributed leopard. However, all previous whole-genome analyses of leopards are based on the Far Eastern Amur leopards that live at the extremity of the species' distribution and therefore are not necessarily representative of the whole species. We sequenced 53 whole genomes of African leopards. Strikingly, we found that the genomic diversity in the African leopard is 2- to 5-fold higher than in other big cats, including the Amur leopard, likely because of an exceptionally high effective population size maintained by the African leopard throughout the Pleistocene. Furthermore, we detected ongoing gene flow and very low population differentiation within African leopards compared with those of other big cats. We corroborated this by showing a complete absence of an otherwise ubiquitous equatorial forest barrier to gene flow. This sets the leopard apart from most other widely distributed large African mammals, including lions. These results revise our understanding of trophic sensitivity and highlight the remarkable resilience of the African leopard, likely because of its extraordinary habitat versatility and broad dietary niche.


Subject(s)
Ecosystem; Genetic Variation; Panthera/anatomy & histology; Panthera/genetics; Africa; Animals; Female; Gene Flow; Male; Panthera/classification; Population Density
8.
Mol Ecol ; 28(1): 35-48, 2019 01.
Article in English | MEDLINE | ID: mdl-30462358

ABSTRACT

Knowledge of how individuals are related is important in many areas of research, and numerous methods for inferring pairwise relatedness from genetic data have been developed. However, the majority of these methods were not developed for situations where data are limited. Specifically, most methods rely on the availability of population allele frequencies, the relative genomic position of variants and accurate genotype data. But in studies of non-model organisms or ancient samples, such data are not always available. Motivated by this, we present a new method for pairwise relatedness inference, which requires neither allele frequency information nor information on genomic position. Furthermore, it can be applied not only to accurate genotype data but also to low-depth sequencing data from which genotypes cannot be accurately called. We evaluate it using data from a range of human populations and show that it can be used to infer close familial relationships with a similar accuracy as a widely used method that relies on population allele frequencies. Additionally, we show that our method is robust to SNP ascertainment and applicable to low-depth sequencing data generated using different strategies, including resequencing and RADseq, which is important for application to a diverse range of populations and species.


Subject(s)
Gene Frequency/genetics; Genome, Human/genetics; Genomics; Algorithms; Genotype; High-Throughput Nucleotide Sequencing/statistics & numerical data; Humans; Models, Genetic; Pedigree; Polymorphism, Single Nucleotide/genetics; Software
9.
Mol Ecol Resour ; 18(3): 570-579, 2018 May.
Article in English | MEDLINE | ID: mdl-29394521

ABSTRACT

Whole-genome duplications have occurred in the recent ancestors of many plants, fish and amphibians. Signals of these whole-genome duplications still exist in the form of paralogous loci. Recent advances have allowed reliable identification of paralogs in genotyping-by-sequencing (GBS) data such as that generated from restriction-site-associated DNA sequencing (RADSeq); however, excluding paralogs from analyses is still routine due to difficulties in genotyping. This exclusion of paralogs may filter a large fraction of loci, including loci that may be adaptively important or informative for population genetic analyses. We present a maximum-likelihood method for inferring allele dosage in paralogs and assess its accuracy using simulated GBS, empirical RADSeq and amplicon sequencing data from Chinook salmon. We accurately infer allele dosage for some paralogs from a RADSeq data set and show how accuracy is dependent upon both read depth and allele frequency. The amplicon sequencing data set, using RADSeq-derived markers, achieved sufficient depth to infer allele dosage for all paralogs. This study demonstrates that RADSeq locus discovery combined with amplicon sequencing of targeted loci is an effective method for incorporating paralogs into population genetic analyses.
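To make the idea of maximum-likelihood dosage inference concrete, here is a minimal sketch. It is not the authors' likelihood: it assumes reads at a four-copy (tetrasomic) locus are sampled binomially, with the expected alternate-read fraction equal to dosage/4 adjusted by a fixed symmetric sequencing error rate. The function names, the `ploidy` default and the `error` value are all hypothetical.

```python
import math


def log_binomial(k: int, n: int, p: float) -> float:
    """Log of the binomial probability of k successes in n trials."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1.0 - p)
    )


def ml_dosage(alt_reads: int, total_reads: int, ploidy: int = 4,
              error: float = 0.01) -> int:
    """Return the allele dosage (0..ploidy) with the highest likelihood.

    Assumes binomial read sampling and a symmetric per-read error rate,
    both of which are simplifications chosen for this illustration.
    """
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    if not 0.0 < error < 0.5:
        raise ValueError("error must lie strictly between 0 and 0.5")
    best_dosage, best_ll = 0, -math.inf
    for dosage in range(ploidy + 1):
        frac = dosage / ploidy
        # Expected alternate-read fraction after allowing for read errors.
        p_alt = frac * (1.0 - error) + (1.0 - frac) * error
        ll = log_binomial(alt_reads, total_reads, p_alt)
        if ll > best_ll:
            best_dosage, best_ll = dosage, ll
    return best_dosage


if __name__ == "__main__":
    for alt, total in ((25, 100), (50, 100), (3, 8)):
        print(alt, total, ml_dosage(alt, total))
```

At low depth the likelihoods of neighbouring dosage classes are close together, which is one way to see why accuracy depends on read depth.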


Subject(s)
Gene Dosage; Genetic Variation; Genotyping Techniques; Salmon/genetics; Animals; Datasets as Topic; Gene Frequency; Genetics, Population/methods; Genome; Likelihood Functions; Sequence Analysis, DNA/methods
10.
Evol Appl ; 10(2): 146-160, 2017 02.
Article in English | MEDLINE | ID: mdl-28127391

ABSTRACT

Effective population size (Ne) is among the most important metrics in evolutionary biology. In natural populations, it is often difficult to collect adequate demographic data to calculate Ne directly. Consequently, genetic methods to estimate Ne have been developed. Two Ne estimators based on sibship reconstruction using multilocus genotype data have been developed in recent years: sibship assignment and parentage analysis without parents. In this study, we evaluated the accuracy of sibship reconstruction using a large empirical dataset from five hatchery steelhead populations with known pedigrees and using 95 single nucleotide polymorphism (SNP) markers. We challenged the software COLONY with 2,599,961 known relationships and demonstrated that reconstruction of full-sib and unrelated pairs was greater than 95% and 99% accurate, respectively. However, reconstruction of half-sib pairs was poor (<5% accurate). Despite poor half-sib reconstruction, both estimators provided accurate estimates of the effective number of breeders (Nb) when sample sizes were near or greater than the true Nb and when assuming a monogamous mating system. We further demonstrated that both methods provide roughly equivalent estimates of Nb. Our results indicate that sibship reconstruction and current SNP panels provide promise for estimating Nb in steelhead populations in the region.

11.
Mol Ecol Resour ; 17(4): 656-669, 2017 Jul.
Article in English | MEDLINE | ID: mdl-27762098

ABSTRACT

Whole-genome duplications have occurred in the recent ancestors of many plants, fish, and amphibians, resulting in a pervasiveness of paralogous loci and the potential for both disomic and tetrasomic inheritance in the same genome. Paralogs can be difficult to reliably genotype and are often excluded from genotyping-by-sequencing (GBS) analyses; however, removal requires paralogs to be identified which is difficult without a reference genome. We present a method for identifying paralogs in natural populations by combining two properties of duplicated loci: (i) the expected frequency of heterozygotes exceeds that for singleton loci, and (ii) within heterozygotes, observed read ratios for each allele in GBS data will deviate from the 1:1 expected for singleton (diploid) loci. These deviations are often not apparent within individuals, particularly when sequence coverage is low; but, we postulated that summing allele reads for each locus over all heterozygous individuals in a population would provide sufficient power to detect deviations at those loci. We identified paralogous loci in three species: Chinook salmon (Oncorhynchus tshawytscha) which retains regions with ongoing residual tetrasomy on eight chromosome arms following a recent whole-genome duplication, mountain barberry (Berberis alpina) which has a large proportion of paralogs that arose through an unknown mechanism, and dusky parrotfish (Scarus niger) which has largely rediploidized following an ancient whole-genome duplication. Importantly, this approach only requires the genotype and allele-specific read counts for each individual, information which is readily obtained from most GBS analysis pipelines.
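The read-ratio property, point (ii) above, can be sketched as a simple test statistic. The example below assumes that reads in heterozygotes at a singleton locus are drawn binomially with probability 0.5 per allele, pools reads over all heterozygous individuals at one locus, and returns a z-score for the departure from 1:1. This is an illustrative simplification, not the published implementation, and the function name and example counts are hypothetical.

```python
import math
from typing import Sequence


def read_ratio_z(ref_reads: Sequence[int], alt_reads: Sequence[int]) -> float:
    """Z-score for deviation from a 1:1 allele read ratio at one locus.

    ref_reads and alt_reads hold per-individual read counts for the
    heterozygous individuals only; counts are summed across individuals
    before testing, under an assumed binomial(n, 0.5) null.
    """
    if len(ref_reads) != len(alt_reads):
        raise ValueError("read count lists must have the same length")
    total_ref = sum(ref_reads)
    total = total_ref + sum(alt_reads)
    if total == 0:
        raise ValueError("no reads available for this locus")
    expected = 0.5 * total
    std_dev = math.sqrt(0.25 * total)  # binomial sd with p = 0.5
    return (total_ref - expected) / std_dev


if __name__ == "__main__":
    # Balanced reads: consistent with a singleton (diploid) locus.
    print(read_ratio_z([10, 12, 8], [10, 8, 12]))
    # Roughly 3:1 reads in every heterozygote: candidate paralog.
    print(read_ratio_z([30, 30, 30], [10, 10, 10]))
```

Pooling before testing is the point of the design: a 3:1 skew in a single low-coverage individual is weak evidence, but the same skew summed over many heterozygotes produces a large z-score.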


Subject(s)
Berberis/genetics; Genetic Loci; Genetics, Population; Genotyping Techniques; Salmon/genetics; Animals; Chromosome Mapping; Databases, Genetic; Genotype; Heterozygote
12.
G3 (Bethesda) ; 5(11): 2463-73, 2015 Sep 18.
Article in English | MEDLINE | ID: mdl-26384769

ABSTRACT

Meiotic recombination is fundamental for generating new genetic variation and for securing proper disjunction. Further, recombination plays an essential role during the rediploidization process of polyploid-origin genomes because crossovers between pairs of homeologous chromosomes retain duplicated regions. A better understanding of how recombination affects genome evolution is crucial for interpreting genomic data; unfortunately, current knowledge mainly originates from a few model species. Salmonid fishes provide a valuable system for studying the effects of recombination in nonmodel species. Salmonid females generally produce thousands of embryos, providing large families for conducting inheritance studies. Further, salmonid genomes are currently rediploidizing after a whole genome duplication and can serve as models for studying the role of homeologous crossovers on genome evolution. Here, we present a detailed interrogation of recombination patterns in sockeye salmon (Oncorhynchus nerka). First, we use RAD sequencing of haploid and diploid gynogenetic families to construct a dense linkage map that includes paralogous loci and location of centromeres. We find a nonrandom distribution of paralogs that mainly cluster in extended regions distally located on 11 different chromosomes, consistent with ongoing homeologous recombination in these regions. We also estimate the strength of interference across each chromosome; results reveal strong interference and crossovers are mostly limited to one per arm. Interference was further shown to continue across centromeres, but metacentric chromosomes generally had at least one crossover on each arm. We discuss the relevance of these findings for both mapping and population genomic studies.


Subject(s)
Chromosomes/genetics; Genetic Linkage; Genome; Recombination, Genetic; Salmon/genetics; Animals
13.
J Hered ; 105(6): 741-51, 2014.
Article in English | MEDLINE | ID: mdl-25292170

ABSTRACT

A species' genetic diversity bears the marks of evolutionary processes that have occurred throughout its history. However, robust detection of selection in wild populations is difficult and often impeded by lack of replicate tests. Here, we investigate selection in pink salmon (Oncorhynchus gorbuscha) using genome scans coupled with inference from a haploid-assisted linkage map. Pink salmon have a strict 2-year semelparous life history which has resulted in temporally isolated (allochronic) lineages that remain sympatric through sharing of spawning habitats in alternate years. The lineages differ in a range of adaptive traits, suggesting different genetic backgrounds. We used genotyping by sequencing of haploids to generate a high-density linkage map with 7035 loci and screened an existing panel of 8036 loci for signatures of selection. The linkage map enabled identification of novel genomic regions displaying signatures of parallel selection shared between lineages. Furthermore, 24 loci demonstrated divergent selection and differences in genetic diversity between lineages, suggesting that adaptation in the 2 lineages has arisen from different pools of standing genetic variation. Findings have implications for understanding asynchronous population abundances as well as predicting future ecosystem impacts from lineage-specific responses to climate change.


Subject(s)
Adaptation, Physiological/genetics; Genetic Linkage; Genetic Variation; Genetics, Population; Salmon/genetics; Animals; Chromosome Mapping; Climate Change; Female; Genetic Loci; Genotype; Haploidy; Male
14.
Evol Appl ; 7(3): 355-69, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24665338

ABSTRACT

Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure.

15.
Evol Appl ; 6(2): 266-78, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23798976

ABSTRACT

The future spread and impact of an introduced species will depend on how it adapts to the abiotic and biotic conditions encountered in its new range, so the potential for rapid evolution subsequent to species introduction is a critical, evolutionary dimension of invasion biology. Using a resurrection approach, we provide a direct test for change over time within populations in a species' introduced range, in the Asian shade annual Polygonum cespitosum. We document, over an 11-year period, the evolution of increased reproductive output as well as greater physiological and root-allocational plasticity in response to the more open, sunny conditions found in the North American range in which the species has become invasive. These findings show that extremely rapid adaptive modifications to ecologically-important traits and plastic expression patterns can evolve subsequent to a species' introduction, within populations established in its introduced range. This study is one of the first to directly document evolutionary change in adaptive plasticity. Such rapid evolutionary changes can facilitate the spread of introduced species into novel habitats and hence contribute to their invasive success in a new range. The data also reveal how evolutionary trajectories can differ among populations in ways that can influence invasion dynamics.

16.
Mol Ecol Resour ; 11 Suppl 1: 162-71, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21429172

ABSTRACT

An important use of genetic parentage analysis is the ability to directly calculate the number of offspring produced by each parent (k_i) and hence effective population size, Ne. But what if parental genotypes are not available? In theory, given enough markers, it should be possible to reconstruct parental genotypes based entirely on a sample of progeny, and if so the vector of parental k_i values. However, this would provide information only about parents that actually contributed offspring to the sample. How would ignoring the 'null' parents (those that produced no offspring) affect an estimate of Ne? The surprising answer is that null parents have no effect at all. We show that: (i) The standard formula for inbreeding Ne can be rewritten so that it is a function only of sample size and ∑k_i²; it is not necessary to know the total number of parents (N). This same relationship does not hold for variance Ne. (ii) This novel formula provides an unbiased estimate of Ne even if only a subset of progeny is available, provided the parental contributions are accurately determined, in which case precision is also high compared to other single-sample estimators of Ne. (iii) It is not necessary to actually reconstruct parental genotypes; from a matrix of pairwise relationships (as can be estimated by some current software programs), it is possible to construct the vector of k_i values and estimate Ne. The new method based on parentage analysis without parents (PwoP) can potentially be useful as a single-sample estimator of contemporary Ne, provided that either (i) relationships can be accurately determined, or (ii) ∑k_i² can be estimated directly.
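The claim that null parents have no effect can be checked with a short worked sketch. It assumes, as a starting point not quoted from the abstract, the Crow and Denniston style expression for inbreeding effective size, Ne = (k̄N - 2) / (k̄ - 1 + V_k/k̄), with k̄ the mean and V_k the population variance (dividing by N) of offspring counts per parent. Substituting k̄ = ∑k_i/N and V_k = ∑k_i²/N - k̄² gives Ne = (∑k_i - 2) / (∑k_i²/∑k_i - 1), in which N no longer appears. The exact constants in the published formula may differ, and the function name is hypothetical.

```python
from typing import Sequence


def inbreeding_ne(offspring_counts: Sequence[int]) -> float:
    """Inbreeding effective size from per-parent offspring counts k_i.

    Uses the rearranged form (sum k - 2) / (sum k^2 / sum k - 1), derived
    under the assumptions stated in the accompanying text. Parents with
    k_i = 0 add nothing to either sum, so they cannot change the result.
    """
    sum_k = sum(offspring_counts)
    sum_k2 = sum(k * k for k in offspring_counts)
    if sum_k == 0:
        raise ValueError("at least one parent must have offspring")
    denominator = sum_k2 / sum_k - 1.0
    if denominator <= 0.0:
        raise ValueError("undefined when no parent has more than one offspring")
    return (sum_k - 2) / denominator


if __name__ == "__main__":
    contributing = [3, 1, 4, 2]
    with_nulls = contributing + [0, 0, 0]
    print(inbreeding_ne(contributing), inbreeding_ne(with_nulls))
```

For diploid offspring that each have two parents, ∑k_i equals twice the number of sampled offspring, so under these assumptions the estimate depends only on sample size and ∑k_i².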


Subject(s)
Inbreeding; Models, Genetic; Animals; Genotype; Polymorphism, Single Nucleotide; Population Density; Software