Results 1 - 20 of 34
1.
Cell ; 186(5): 923-939.e14, 2023 03 02.
Article in English | MEDLINE | ID: mdl-36868214

ABSTRACT

We conduct high coverage (>30×) whole-genome sequencing of 180 individuals from 12 indigenous African populations. We identify millions of unreported variants, many predicted to be functionally important. We observe that the ancestors of southern African San and central African rainforest hunter-gatherers (RHG) diverged from other populations >200 kya and maintained a large effective population size. We observe evidence for ancient population structure in Africa and for multiple introgression events from "ghost" populations with highly diverged genetic lineages. Although currently geographically isolated, we observe evidence for gene flow between eastern and southern Khoesan-speaking hunter-gatherer populations lasting until ∼12 kya. We identify signatures of local adaptation for traits related to skin color, immune response, height, and metabolic processes. We identify a positively selected variant in the lightly pigmented San that influences pigmentation in vitro by regulating the enhancer activity and gene expression of PDPK1.


Subject(s)
Acclimatization , Skin Pigmentation , Humans , Whole Genome Sequencing , Population Density , Africa , 3-Phosphoinositide-Dependent Protein Kinases
2.
Am J Hum Genet ; 109(7): 1286-1297, 2022 07 07.
Article in English | MEDLINE | ID: mdl-35716666

ABSTRACT

Despite the growing number of genome-wide association studies (GWASs), it remains unclear to what extent gene-by-gene and gene-by-environment interactions influence complex traits in humans. The magnitude of genetic interactions in complex traits has been difficult to quantify because GWASs are generally underpowered to detect individual interactions of small effect. Here, we develop a method to test for genetic interactions that aggregates information across all trait-associated loci. Specifically, we test whether SNPs in regions of European ancestry shared between European American and admixed African American individuals have the same causal effect sizes. We hypothesize that in African Americans, the presence of genetic interactions will drive the causal effect sizes of SNPs in regions of European ancestry to be more similar to those of SNPs in regions of African ancestry. We apply our method to two traits: gene expression in 296 African Americans and 482 European Americans in the Multi-Ethnic Study of Atherosclerosis (MESA) and low-density lipoprotein cholesterol (LDL-C) in 74K African Americans and 296K European Americans in the Million Veteran Program (MVP). We find significant evidence for genetic interactions in our analysis of gene expression; for LDL-C, we observe a similar point estimate, although this is not significant, most likely due to lower statistical power. These results suggest that gene-by-gene or gene-by-environment interactions modify the effect sizes of causal variants in human complex traits.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Cholesterol, LDL , Gene Expression , Humans , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , White People/genetics
3.
Bioinformatics ; 40(8), 2024 08 02.
Article in English | MEDLINE | ID: mdl-39185959

ABSTRACT

SUMMARY: Pool sequencing is an efficient method for capturing genome-wide allele frequencies from multiple individuals, with broad applications such as studying adaptation in Evolve-and-Resequence experiments, monitoring of genetic diversity in wild populations, and genotype-to-phenotype mapping. Here, we present grenedalf, a command line tool written in C++ that implements common population genetic statistics such as θ, Tajima's D, and FST for Pool sequencing. It is orders of magnitude faster than current tools, and is focused on providing usability and scalability, while also offering a plethora of input file formats and convenience options. AVAILABILITY AND IMPLEMENTATION: grenedalf is published under the GPL-3, and freely available at github.com/lczech/grenedalf.


Subject(s)
Genetics, Population , High-Throughput Nucleotide Sequencing , Software , High-Throughput Nucleotide Sequencing/methods , Genetics, Population/methods , Gene Frequency , Sequence Analysis, DNA/methods , Genetic Variation , Humans
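The statistics named in the abstract above (θ, Tajima's D) can be illustrated with a minimal Python sketch. It uses the classical individual-sample definitions from Tajima (1989), not the pool-sequencing estimators that grenedalf implements, and it is not grenedalf's code; the function name `tajimas_d` and the 0/1 haplotype-string input format are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Return (theta_pi, theta_w, D) for equal-length haplotype strings.

    Illustrative only: classical estimators for individually sequenced
    samples, without any pool-sequencing correction.
    """
    n = len(haplotypes)
    length = len(haplotypes[0])

    # Segregating sites: columns in which more than one allele is observed.
    seg = sum(1 for j in range(length) if len({h[j] for h in haplotypes}) > 1)

    # theta_pi: mean number of pairwise differences between haplotypes.
    pairs = list(combinations(haplotypes, 2))
    theta_pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Watterson's theta: segregating sites scaled by the harmonic number a1.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = seg / a1

    if seg == 0:
        # D is undefined without segregating sites; report 0.0 by convention here.
        return theta_pi, theta_w, 0.0

    # Variance constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    d = (theta_pi - theta_w) / sqrt(e1 * seg + e2 * seg * (seg - 1))
    return theta_pi, theta_w, d
```

For example, `tajimas_d(["000", "100", "010", "001"])` describes a sample in which every variant is a singleton, so the pairwise estimate falls below Watterson's estimate and D is negative.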
4.
Nature ; 538(7624): 201-206, 2016 Oct 13.
Article in English | MEDLINE | ID: mdl-27654912

ABSTRACT

Here we report the Simons Genome Diversity Project data set: high quality genomes from 300 individuals from 142 diverse populations. These genomes include at least 5.8 million base pairs that are not present in the human reference genome. Our analysis reveals key features of the landscape of human genome variation, including that the rate of accumulation of mutations has accelerated by about 5% in non-Africans compared to Africans since divergence. We show that the ancestors of some pairs of present-day human populations were substantially separated by 100,000 years ago, well before the archaeologically attested onset of behavioural modernity. We also demonstrate that indigenous Australians, New Guineans and Andamanese do not derive substantial ancestry from an early dispersal of modern humans; instead, their modern human ancestry is consistent with coming from the same source as that of other non-Africans.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Genomics , Mutation Rate , Phylogeny , Racial Groups/genetics , Animals , Australia , Black People/genetics , Datasets as Topic , Genetics, Population , History, Ancient , Human Migration/history , Humans , Native Hawaiian or Other Pacific Islander/genetics , Neanderthals/genetics , New Guinea , Sequence Analysis, DNA , Species Specificity , Time Factors
5.
Proc Natl Acad Sci U S A ; 116(34): 17115-17120, 2019 08 20.
Article in English | MEDLINE | ID: mdl-31387977

ABSTRACT

There has been much interest in analyzing genome-scale DNA sequence data to infer population histories, but inference methods developed hitherto are limited in model complexity and computational scalability. Here we present an efficient, flexible statistical method, diCal2, that can use whole-genome sequence data from multiple populations to infer complex demographic models involving population size changes, population splits, admixture, and migration. Applying our method to data from Australian, East Asian, European, and Papuan populations, we find that the population ancestral to Australians and Papuans started separating from East Asians and Europeans about 100,000 y ago, and that the separation of East Asians and Europeans started about 50,000 y ago, with pervasive gene flow between all pairs of populations.


Subject(s)
Gene Flow , Genome-Wide Association Study , Human Migration , Models, Genetic , Native Hawaiian or Other Pacific Islander/genetics , Whole Genome Sequencing , Australia , Genetics, Population , History, Ancient , Humans , Native Hawaiian or Other Pacific Islander/history
6.
Mol Ecol ; 27(19): 3873-3888, 2018 10.
Article in English | MEDLINE | ID: mdl-29603507

ABSTRACT

Genetic evidence has revealed that the ancestors of modern human populations outside Africa and their hominin sister groups, notably Neanderthals, exchanged genetic material in the past. The distribution of these introgressed sequence tracts along modern-day human genomes provides insight into the selective forces acting on them and the role of introgression in the evolutionary history of hominins. Studying introgression patterns on the X-chromosome is of particular interest, as sex chromosomes are thought to play a special role in speciation. Recent studies have developed methods to localize introgressed ancestries, reporting long regions that are depleted of Neanderthal introgression and enriched in genes, suggesting negative selection against the Neanderthal variants. On the other hand, enriched Neanderthal ancestry in hair- and skin-related genes suggests that some introgressed variants facilitated adaptation to new environments. Here, we present a model-based introgression detection method called dical-admix. We demonstrate its efficiency and accuracy through extensive simulations and apply it to detect tracts of Neanderthal introgression in modern human individuals from the 1000 Genomes Project. Our findings are largely concordant with previous studies, consistent with weak selection against Neanderthal ancestry. We find evidence that selection against Neanderthal ancestry was due to higher genetic load in Neanderthals resulting from small effective population size, rather than widespread Dobzhansky-Müller incompatibilities (DMIs) that could contribute to reproductive isolation. Moreover, we confirm the previously reported low level of introgression on the X-chromosome, but find little evidence that DMIs contributed to this pattern.


Subject(s)
Genetics, Population , Genome, Human , Models, Genetic , Neanderthals/genetics , Animals , Chromosomes, Human, X/genetics , Computer Simulation , Genetic Load , Humans , Hybridization, Genetic , Markov Chains , Population Density , Selection, Genetic
7.
Ann Appl Stat ; 18(1): 858-881, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38784669

ABSTRACT

In scientific studies involving analyses of multivariate data, basic but important questions often arise for the researcher: Is the sample exchangeable, meaning that the joint distribution of the sample is invariant to the ordering of the units? Are the features independent of one another, or perhaps the features can be grouped so that the groups are mutually independent? In statistical genomics, these considerations are fundamental to downstream tasks such as demographic inference and the construction of polygenic risk scores. We propose a non-parametric approach, which we call the V test, to address these two questions, namely, a test of sample exchangeability given dependency structure of features, and a test of feature independence given sample exchangeability. Our test is conceptually simple, yet fast and flexible. It controls the Type I error across realistic scenarios, and handles data of arbitrary dimensions by leveraging large-sample asymptotics. Through extensive simulations and a comparison against unsupervised tests of stratification based on random matrix theory, we find that our test compares favorably in various scenarios of interest. We apply the test to data from the 1000 Genomes Project, demonstrating how it can be employed to assess exchangeability of the genetic sample, or find optimal linkage disequilibrium (LD) splits for downstream analysis. For exchangeability assessment, we find that removing rare variants can substantially increase the p-value of the test statistic. For optimal LD splitting, the V test reports different optimal splits than previous approaches not relying on hypothesis testing. Software for our methods is available in R (CRAN: flintyR) and Python (PyPI: flintyPy).

8.
bioRxiv ; 2024 Sep 21.
Article in English | MEDLINE | ID: mdl-39345369

ABSTRACT

As genetic sequencing costs have plummeted, datasets with sizes previously unthinkable have begun to appear. Such datasets present new opportunities to learn about evolutionary history, particularly via rare alleles that record the very recent past. However, beyond the computational challenges inherent in the analysis of many large-scale datasets, large population-genetic datasets present theoretical problems. In particular, the majority of population-genetic tools require the assumption that each mutant allele in the sample is the result of a single mutation (the "infinite sites" assumption), which is violated in large samples. Here, we present DR EVIL, a method for estimating mutation rates and recent demographic history from very large samples. DR EVIL avoids the infinite-sites assumption by using a diffusion approximation to a branching-process model with recurrent mutation. The branching-process approach limits the method to rare alleles, but, along with recent results, renders tractable likelihoods with recurrent mutation. We show that DR EVIL performs well in simulations and apply it to rare-variant data from a million haploid samples, identifying a signal of mutation-rate heterogeneity within commonly analyzed classes and predicting that in modern sample sizes, most rare variants at sites with high mutation rates represent the descendants of multiple mutation events.

9.
bioRxiv ; 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-37292653

ABSTRACT

Measures of selective constraint on genes have been used for many applications including clinical interpretation of rare coding variants, disease gene discovery, and studies of genome evolution. However, widely-used metrics are severely underpowered at detecting constraint for the shortest ~25% of genes, potentially causing important pathogenic mutations to be overlooked. We developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease, and other phenotypes, especially for short genes. Our new estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve estimation of many gene-level properties, such as rare variant burden or gene expression differences.

10.
Nat Genet ; 56(8): 1632-1643, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38977852

ABSTRACT

Measures of selective constraint on genes have been used for many applications, including clinical interpretation of rare coding variants, disease gene discovery and studies of genome evolution. However, widely used metrics are severely underpowered at detecting constraints for the shortest ~25% of genes, potentially causing important pathogenic mutations to be overlooked. Here we developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease and other phenotypes, especially for short genes. Our estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve the estimation of many gene-level properties, such as rare variant burden or gene expression differences.


Subject(s)
Bayes Theorem , Evolution, Molecular , Genetics, Population , Models, Genetic , Humans , Genetics, Population/methods , Machine Learning , Selection, Genetic , Mutation , Phenotype
11.
bioRxiv ; 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39005431

ABSTRACT

Gene regulatory networks (GRNs) govern many core developmental and biological processes underlying human complex traits. Even with broad-scale efforts to characterize the effects of molecular perturbations and interpret gene coexpression, it remains challenging to infer the architecture of gene regulation in a precise and efficient manner. Key properties of GRNs, like hierarchical structure, modular organization, and sparsity, provide both challenges and opportunities for this objective. Here, we seek to better understand properties of GRNs using a new approach to simulate their structure and model their function. We produce realistic network structures with a novel generating algorithm based on insights from small-world network theory, and we model gene expression regulation using stochastic differential equations formulated to accommodate modeling molecular perturbations. With these tools, we systematically describe the effects of gene knockouts within and across GRNs, finding a subset of networks that recapitulate features of a recent genome-scale perturbation study. With deeper analysis of these exemplar networks, we consider future avenues to map the architecture of gene expression regulation using data from cells in perturbed and unperturbed states, finding that while perturbation data are critical to discover specific regulatory interactions, data from unperturbed cells may be sufficient to reveal regulatory programs.

12.
bioRxiv ; 2024 Oct 22.
Article in English | MEDLINE | ID: mdl-39484505

ABSTRACT

Genetic diversity within species is the basis for evolutionary adaptive capacity and has recently been included as a target for protection in the United Nations' Global Biodiversity Framework (GBF). However, there is a lack of reliable large-scale predictive frameworks to quantify how much genetic diversity has already been lost, let alone to quantitatively predict future losses under different conservation scenarios in the 21st century. Combining spatio-temporal population genetic theory with population genomic data of 18 plant and animal species, we studied the dynamics of genetic diversity after habitat area losses. We show genetic diversity reacts slowly to habitat area and population declines, but lagged losses will continue for many decades even after habitats are fully protected. To understand the magnitude of this problem, we combined our predictive method with species' habitat area and population monitoring reported in the Living Planet Index, the Red List, and new GBF indicators. We then project genetic diversity loss in 13,808 species with a short-term genetic diversity loss of 13 - 22% and long-term loss of 42 - 48% with substantial deviations depending on the level of habitat fragmentation. These results highlight that protection of only current habitats is insufficient to ensure the genetic health of species and that continuous genetic monitoring alone likely underestimates long term impacts. We provide an area-based spatio-temporal predictive framework to develop quantitative scenarios of global genetic biodiversity.

13.
medRxiv ; 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39132491

ABSTRACT

The human leukocyte antigen (HLA) region plays an important role in human health through involvement in immune cell recognition and maturation. While genetic variation in the HLA region is associated with many diseases, the pleiotropic patterns of these associations have not been systematically investigated. Here, we developed a haplotype approach to investigate disease associations phenome-wide for 412,181 Finnish individuals and 2,459 traits. Across the 1,035 diseases with a GWAS association, we found a 17-fold average per-SNP enrichment of hits in the HLA region. Altogether, we identified 7,649 HLA associations across 647 traits, including 1,750 associations uncovered by haplotype analysis. We find some haplotypes show trade-offs between diseases, while others consistently increase risk across traits, indicating a complex pleiotropic landscape involving a range of diseases. This study highlights the extensive impact of HLA variation on disease risk, and underscores the importance of classical and non-classical genes, as well as non-coding variation.

14.
medRxiv ; 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39132496

ABSTRACT

Background: Genetic factors play an important role in prostate cancer (PCa) development with polygenic risk scores (PRS) predicting disease risk across genetic ancestries. However, there are few convincing modifiable factors for PCa and little is known about their potential interaction with genetic risk. We analyzed incident PCa cases (n=6,155) and controls (n=98,257) of European and African ancestry from the UK Biobank (UKB) cohort to evaluate the role of neighborhood socioeconomic status (nSES), and how it may interact with PRS, on PCa risk. Methods: We evaluated a multi-ancestry PCa PRS containing 269 genetic variants to understand the association of germline genetics with PCa in UKB. Using the English Indices of Deprivation, a set of validated metrics that quantify lack of resources within geographical areas, we performed logistic regression to investigate the main effects and interactions between nSES deprivation, PCa PRS, and PCa. Results: The PCa PRS was strongly associated with PCa (OR=2.04; 95%CI=2.00-2.09; P<0.001). Additionally, nSES deprivation indices were inversely associated with PCa: employment (OR=0.91; 95%CI=0.86-0.96; P<0.001), education (OR=0.94; 95%CI=0.83-0.98; P<0.001), health (OR=0.91; 95%CI=0.86-0.96; P<0.001), and income (OR=0.91; 95%CI=0.86-0.96; P<0.001). The PRS effects showed little heterogeneity across nSES deprivation indices, except for the Townsend Index (P=0.03). Conclusions: We reaffirmed genetics as a risk factor for PCa and identified nSES deprivation domains that influence PCa detection and are potentially correlated with environmental exposures that are a risk factor for PCa. These findings also suggest that nSES and genetic risk factors for PCa act independently.

15.
Cell Genom ; 4(9): 100629, 2024 Sep 11.
Article in English | MEDLINE | ID: mdl-39111318

ABSTRACT

With hundreds of copies of rDNA, it is unknown whether they possess sequence variations that form different types of ribosomes. Here, we developed an algorithm for long-read variant calling, termed RGA, which revealed that variations in human rDNA loci are predominantly insertion-deletion (indel) variants. We developed full-length rRNA sequencing (RIBO-RT) and in situ sequencing (SWITCH-seq), which showed that translating ribosomes possess variation in rRNA. Over 1,000 variants are lowly expressed. However, tens of variants are abundant and form distinct rRNA subtypes with different structures near indels as revealed by long-read rRNA structure probing coupled to dimethyl sulfate sequencing. rRNA subtypes show differential expression in endoderm/ectoderm-derived tissues, and in cancer, low-abundance rRNA variants can become highly expressed. Together, this study identifies the diversity of ribosomes at the level of rRNA variants, their chromosomal location, and unique structure as well as the association of ribosome variation with tissue-specific biology and cancer.


Subject(s)
RNA, Ribosomal , Ribosomes , Humans , Ribosomes/metabolism , Ribosomes/genetics , RNA, Ribosomal/genetics , Neoplasms/genetics , Neoplasms/classification , Genetic Variation , INDEL Mutation , Algorithms , DNA, Ribosomal/genetics
16.
bioRxiv ; 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38948697

ABSTRACT

Natural selection on complex traits is difficult to study in part due to the ascertainment inherent to genome-wide association studies (GWAS). The power to detect a trait-associated variant in GWAS is a function of frequency and effect size - but for traits under selection, the effect size of a variant determines the strength of selection against it, constraining its frequency. To account for GWAS ascertainment, we propose studying the joint distribution of allele frequencies across populations, conditional on the frequencies in the GWAS cohort. Before considering these conditional frequency spectra, we first characterized the impact of selection and non-equilibrium demography on allele frequency dynamics forwards and backwards in time. We then used these results to understand conditional frequency spectra under realistic human demography. Finally, we investigated empirical conditional frequency spectra for GWAS variants associated with 106 complex traits, finding compelling evidence for either stabilizing or purifying selection. Our results provide insight into polygenic score portability and other properties of variants ascertained with GWAS, highlighting the utility of conditional frequency spectra.

17.
Elife ; 13, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38288729

ABSTRACT

Ancient DNA research in the past decade has revealed that European population structure changed dramatically in the prehistoric period (14,000-3000 years before present, YBP), reflecting the widespread introduction of Neolithic farmer and Bronze Age Steppe ancestries. However, little is known about how population structure changed from the historical period onward (3000 YBP - present). To address this, we collected whole genomes from 204 individuals from Europe and the Mediterranean, many of which are the first historical period genomes from their region (e.g. Armenia and France). We found that most regions show remarkable inter-individual heterogeneity. At least 7% of historical individuals carry ancestry uncommon in the region where they were sampled, some indicating cross-Mediterranean contacts. Despite this high level of mobility, overall population structure across western Eurasia is relatively stable through the historical period up to the present, mirroring geography. We show that, under standard population genetics models with local panmixia, the observed level of dispersal would lead to a collapse of population structure. Persistent population structure thus suggests a lower effective migration rate than indicated by the observed dispersal. We hypothesize that this phenomenon can be explained by extensive transient dispersal arising from drastically improved transportation networks and the Roman Empire's mobilization of people for trade, labor, and military. This work highlights the utility of ancient DNA in elucidating finer scale human population dynamics in recent history.


Subject(s)
DNA, Ancient , Genome, Human , Humans , Europe , France , Genetics, Population , Population Dynamics , Human Migration
18.
Nat Genet ; 55(11): 1866-1875, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37857933

ABSTRACT

Most signals in genome-wide association studies (GWAS) of complex traits implicate noncoding genetic variants with putative gene regulatory effects. However, currently identified regulatory variants, notably expression quantitative trait loci (eQTLs), explain only a small fraction of GWAS signals. Here, we show that GWAS and cis-eQTL hits are systematically different: eQTLs cluster strongly near transcription start sites, whereas GWAS hits do not. Genes near GWAS hits are enriched in key functional annotations, are under strong selective constraint and have complex regulatory landscapes across different tissue/cell types, whereas genes near eQTLs are depleted of most functional annotations, show relaxed constraint, and have simpler regulatory landscapes. We describe a model to understand these observations, including how natural selection on complex traits hinders discovery of functionally relevant eQTLs. Our results imply that GWAS and eQTL studies are systematically biased toward different types of variant, and support the use of complementary functional approaches alongside the next generation of eQTL studies.


Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Gene Expression Regulation/genetics , Quantitative Trait Loci/genetics , Gene Expression , Polymorphism, Single Nucleotide/genetics
19.
Res Sq ; 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37398424

ABSTRACT

Measures of selective constraint on genes have been used for many applications including clinical interpretation of rare coding variants, disease gene discovery, and studies of genome evolution. However, widely-used metrics are severely underpowered at detecting constraint for the shortest ~25% of genes, potentially causing important pathogenic mutations to be overlooked. We developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease, and other phenotypes, especially for short genes. Our new estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve estimation of many gene-level properties, such as rare variant burden or gene expression differences.

20.
Genetics ; 225(3), 2023 11 01.
Article in English | MEDLINE | ID: mdl-37724741

ABSTRACT

The discrete-time Wright-Fisher (DTWF) model and its diffusion limit are central to population genetics. These models can describe the forward-in-time evolution of allele frequencies in a population resulting from genetic drift, mutation, and selection. Computing likelihoods under the diffusion process is feasible, but the diffusion approximation breaks down for large samples or in the presence of strong selection. Existing methods for computing likelihoods under the DTWF model do not scale to current exome sequencing sample sizes in the hundreds of thousands. Here, we present a scalable algorithm that approximates the DTWF model with provably bounded error. Our approach relies on two key observations about the DTWF model. The first is that transition probabilities under the model are approximately sparse. The second is that transition distributions for similar starting allele frequencies are extremely close as distributions. Together, these observations enable approximate matrix-vector multiplication in linear (as opposed to the usual quadratic) time. We prove similar properties for Hypergeometric distributions, enabling fast computation of likelihoods for subsamples of the population. We show theoretically and in practice that this approximation is highly accurate and can scale to population sizes in the tens of millions, paving the way for rigorous biobank-scale inference. Finally, we use our results to estimate the impact of larger samples on estimating selection coefficients for loss-of-function variants. We find that increasing sample sizes beyond existing large exome sequencing cohorts will provide essentially no additional information except for genes with the most extreme fitness effects.


Subject(s)
Biological Specimen Banks , Genetics, Population , Gene Frequency , Genetic Drift , Probability , Models, Genetic , Selection, Genetic
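The first observation in the abstract above, that DTWF transition probabilities are approximately sparse, can be seen in a small Python sketch. It assumes the simplest setting, a neutral haploid population without mutation, in which the next-generation allele count is binomially distributed. The names `dtwf_row` and `step` are hypothetical, and the dense quadratic-time update shown is the baseline the abstract contrasts with, not the authors' linear-time algorithm.

```python
from math import comb


def dtwf_row(n_pop, i):
    """Neutral haploid Wright-Fisher row: P(next count = j | current count = i).

    Assumes no mutation or selection, so the next count is Binomial(n_pop, i / n_pop).
    """
    p = i / n_pop
    return [comb(n_pop, j) * p**j * (1 - p) ** (n_pop - j) for j in range(n_pop + 1)]


def step(dist, n_pop):
    """Advance a distribution over allele counts by one generation.

    Dense vector-matrix product: quadratic in n_pop, for illustration only.
    """
    rows = [dtwf_row(n_pop, i) for i in range(n_pop + 1)]
    return [
        sum(dist[i] * rows[i][j] for i in range(n_pop + 1))
        for j in range(n_pop + 1)
    ]
```

Counting the entries of `dtwf_row(200, 20)` that exceed a small threshold such as `1e-10` shows that most of the row is numerically negligible, which is the kind of structure an approximate sparse update can exploit.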