Results 1 - 20 of 40
1.
Cell ; 175(3): 848-858.e6, 2018 Oct 18.
Article in English | MEDLINE | ID: mdl-30318150

ABSTRACT

In familial searching in forensic genetics, a query DNA profile is tested against a database to determine whether it represents a relative of a database entrant. We examine the potential for using linkage disequilibrium to identify pairs of profiles as belonging to relatives when the query and database rely on nonoverlapping genetic markers. Considering data on individuals genotyped with both microsatellites used in forensic applications and genome-wide SNPs, we find that ∼30%-32% of parent-offspring pairs and ∼35%-36% of sib pairs can be identified from the SNPs of one member of the pair and the microsatellites of the other. The method suggests the possibility of performing familial searches of microsatellite databases using query SNP profiles, or vice versa. It also reveals that privacy concerns arising from computations across multiple databases that share no genetic markers in common entail risks, not only for database entrants, but for their close relatives as well.


Subjects
Family; Forensic Genetics/methods; Genetics, Population/methods; Genotyping Techniques/methods; Polymorphism, Single Nucleotide; Female; Humans; Linkage Disequilibrium; Male; Microsatellite Repeats; Models, Genetic; Models, Statistical; Pedigree
2.
Proc Natl Acad Sci U S A ; 121(12): e2319496121, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38470926

ABSTRACT

Without the ability to control or randomize environments (or genotypes), it is difficult to determine the degree to which observed phenotypic differences between two groups of individuals are due to genetic vs. environmental differences. However, some have suggested that these concerns may be limited to pathological cases, and methods have appeared that seem to give, directly or indirectly, some support to claims that aggregate heritable variation within groups can be related to heritable variation among groups. We consider three families of approaches: the "between-group heritability" sometimes invoked in behavior genetics, the statistic PST used in empirical work in evolutionary quantitative genetics, and methods based on variation in ancestry in an admixed population, used in anthropological and statistical genetics. We take up these examples to show mathematically that information on within-group genetic and phenotypic information in the aggregate cannot separate among-group differences into genetic and environmental components, and we provide simulation results that support our claims. We discuss these results in terms of the long-running debate on this topic.


Subjects
Biological Evolution; Genetics, Population; Humans; Phenotype; Genotype; Computer Simulation; Genetic Variation
3.
Am J Hum Genet ; 110(12): 2077-2091, 2023 Dec 07.
Article in English | MEDLINE | ID: mdl-38065072

ABSTRACT

Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide association studies (GWASs) are a powerful way to find genetic loci associated with phenotypes. GWASs are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history. One way to model this shared history is through the ancestral recombination graph (ARG), which encodes a series of local coalescent trees. Recent computational and methodological breakthroughs have made it feasible to estimate approximate ARGs from large-scale samples. Here, we explore the potential of an ARG-based approach to quantitative-trait locus (QTL) mapping, echoing existing variance-components approaches. We propose a framework that relies on the conditional expectation of a local genetic relatedness matrix (local eGRM) given the ARG. Simulations show that our method is especially beneficial for finding QTLs in the presence of allelic heterogeneity. By framing QTL mapping in terms of the estimated ARG, we can also facilitate the detection of QTLs in understudied populations. We use local eGRM to analyze two chromosomes containing known body size loci in a sample of Native Hawaiians. Our investigations can provide intuition about the benefits of using estimated ARGs in population- and statistical-genetic methods in general.


Subjects
Genetics, Population; Genome-Wide Association Study; Quantitative Trait Loci; Humans; Chromosome Mapping/methods; Models, Genetic; Phenotype; Quantitative Trait Loci/genetics; Native Hawaiian or Other Pacific Islander/genetics
4.
Trends Genet ; 38(2): 113-115, 2022 Feb.
Article in English | MEDLINE | ID: mdl-34740452

ABSTRACT

Advocates of transparency in science often point to the benefits of open practices for the scientific process. Here, we focus on a possibly underappreciated effect of standards for transparency: their influence on non-scientific decisions. As a case study, we consider the current state of probabilistic genotyping software in forensics.


Subjects
Forensic Genetics; Forensic Sciences; Humans
5.
Am J Phys Anthropol ; 175(2): 406-421, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33772750

ABSTRACT

OBJECTIVES: In genetic admixture processes, source groups for an admixed population possess distinct patterns of genotype and phenotype at the onset of admixture. Particularly in the context of recent and ongoing admixture, such differences are sometimes taken to serve as markers of ancestry for individuals; that is, phenotypes initially associated with the ancestral background in one source population are assumed to continue to reflect ancestry in that population. Such phenotypes might possess ongoing significance in social categorizations of individuals, owing in part to perceived continuing correlations with ancestry. However, genotypes or phenotypes initially associated with ancestry in one specific source population have been seen to decouple from overall admixture levels, so that they no longer serve as proxies for genetic ancestry. Here, we aim to develop an understanding of the joint dynamics of admixture levels and phenotype distributions in an admixed population. METHODS: We devise a mechanistic model, consisting of an admixture model, a quantitative trait model, and a mating model. We analyze the behavior of the mechanistic model in relation to the model parameters. RESULTS: We find that it is possible for the decoupling of genetic ancestry and phenotype to proceed quickly, and that it occurs faster if the phenotype is driven by fewer loci. Positive assortative mating attenuates the process of dissociation relative to a scenario in which mating is random with respect to genetic admixture and with respect to phenotype. CONCLUSIONS: The mechanistic framework suggests that in an admixed population, a trait that initially differed between source populations might serve as a reliable proxy for ancestry for only a short time, especially if the trait is determined by few loci. It follows that a social categorization based on such a trait is increasingly uninformative about genetic ancestry and about other traits that differed between source populations at the onset of admixture.


Subjects
Gene Frequency/genetics; Genetics, Population; Anthropology, Physical; Female; Gene Flow/genetics; Genome, Human/genetics; Genotype; Humans; Male; Phenotype; Skin Pigmentation/genetics
6.
Proc Natl Acad Sci U S A ; 114(22): 5671-5676, 2017 May 30.
Article in English | MEDLINE | ID: mdl-28507140

ABSTRACT

Combining genotypes across datasets is central in facilitating advances in genetics. Data aggregation efforts often face the challenge of record matching: the identification of dataset entries that represent the same individual. We show that records can be matched across genotype datasets that have no shared markers based on linkage disequilibrium between loci appearing in different datasets. Using two datasets for the same 872 people (one with 642,563 genome-wide SNPs and the other with 13 short tandem repeats (STRs) used in forensic applications), we find that 90-98% of forensic STR records can be connected to corresponding SNP records and vice versa. Accuracy increases to 99-100% when ∼30 STRs are used. Our method expands the potential of data aggregation, but it also suggests privacy risks intrinsic in maintenance of databases containing even small numbers of markers, including databases of forensic significance.


Subjects
Forensic Genetics/methods; Genetic Markers/genetics; Genomics/methods; Linkage Disequilibrium/genetics; Microsatellite Repeats/genetics; Data Collection; Genome, Human/genetics; Humans; Polymorphism, Single Nucleotide/genetics
8.
Hum Hered ; 82(3-4): 87-102, 2016.
Article in English | MEDLINE | ID: mdl-28910803

ABSTRACT

OBJECTIVES: Recent studies have highlighted the potential of analyses of genomic sharing to produce insight into the demographic processes affecting human populations. We study runs of homozygosity (ROH) in 18 Jewish populations, examining these groups in relation to 123 non-Jewish populations sampled worldwide. METHODS: By sorting ROH into 3 length classes (short, intermediate, and long), we evaluate the impact of demographic processes on genomic patterns in Jewish populations. RESULTS: We find that the portion of the genome appearing in long ROH - the length class most directly related to recent consanguinity - closely accords with data gathered from interviews during the 1950s on frequencies of consanguineous unions in various Jewish groups. CONCLUSION: The high correlation between 1950s consanguinity levels and coverage by long ROH explains differences across populations in ROH patterns. The dissection of ROH into length classes and the comparison to consanguinity data assist in understanding a number of additional phenomena, including similarities of Jewish populations to Middle Eastern, European, and Central and South Asian non-Jewish populations in short ROH patterns, relative lengths of identity-by-descent tracts in different Jewish groups, and the "population isolate" status of the Ashkenazi Jews.

10.
Hum Biol ; 87(4): 313-337, 2015 Oct.
Article in English | MEDLINE | ID: mdl-27737590

ABSTRACT

Models that examine genetic differences between populations alongside a genotype-phenotype map can provide insight about phenotypic variation among groups. We generalize a simple model of a completely heritable, additive, selectively neutral quantitative trait to examine the relationship between single-locus genetic differentiation and phenotypic differentiation on quantitative traits. In agreement with similar efforts using different models, we show that the expected degree to which two groups differ on a neutral quantitative trait is not strongly affected by the number of genetic loci that influence the trait: neutral trait differences are expected to have a magnitude comparable to the genetic differences at a single neutral locus. We discuss this result with respect to population differences in disease phenotypes, arguing that although neutral genetic differences between populations can contribute to specific differences between populations in health outcomes, systematic patterns of difference that run in the same direction for many genetically independent health conditions are unlikely to be explained by neutral genetic differentiation.


Subjects
Genetic Variation/genetics; Phenotype; Quantitative Trait Loci/genetics; Algorithms; Gene Frequency/genetics; Genetic Drift; Genetics, Population; Health Status Disparities; Humans; Models, Genetic; Selection, Genetic/genetics
11.
Theor Popul Biol ; 97: 20-34, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25132646

ABSTRACT

FST is one of the most frequently used indices of genetic differentiation among groups. Though FST takes values between 0 and 1, authors going back to Wright have noted that under many circumstances, FST is constrained to be less than 1. Recently, we showed that at a genetic locus with an unspecified number of alleles, FST for two subpopulations is strictly bounded from above by functions of both the frequency of the most frequent allele (M) and the homozygosity of the total population (HT). In the two-subpopulation case, FST can equal one only when the frequency of the most frequent allele and the total homozygosity are 1/2. Here, we extend this work by deriving strict bounds on FST for two subpopulations when the number of alleles at the locus is specified to be I. We show that restricting to I alleles produces the same upper bound on FST over much of the allowable domain for M and HT, and we derive more restrictive bounds in the windows M ∈ [1/I, 1/(I-1)) and HT ∈ [1/I, I/(I^2-1)). These results extend our understanding of the behavior of FST in relation to other population-genetic statistics.


Subjects
Gene Frequency; Genetics, Population; Homozygote; Models, Genetic; Biostatistics
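
The FST abstract above (item 11) states that, for two subpopulations, FST can reach 1 only when both M and HT equal 1/2. A minimal sketch of that behavior follows. It assumes two equally weighted subpopulations and the homozygosity-based form FST = (HS - HT) / (1 - HT), where HS is the mean within-subpopulation homozygosity; the abstract does not spell out this formula, and the function names and allele frequencies are hypothetical.

```python
from typing import Sequence


def homozygosity(freqs: Sequence[float]) -> float:
    """Sum of squared allele frequencies at one locus."""
    return sum(p * p for p in freqs)


def fst_two_subpops(p1: Sequence[float], p2: Sequence[float]) -> float:
    """F_ST at one locus for two equally weighted subpopulations.

    Assumed definition (not taken from the abstract):
    F_ST = (H_S - H_T) / (1 - H_T), with H_S the mean subpopulation
    homozygosity and H_T the homozygosity of the pooled frequencies.
    """
    if len(p1) != len(p2):
        raise ValueError("both subpopulations need the same allele list")
    pooled = [(a + b) / 2 for a, b in zip(p1, p2)]
    h_s = (homozygosity(p1) + homozygosity(p2)) / 2
    h_t = homozygosity(pooled)
    if h_t == 1.0:
        raise ValueError("monomorphic locus: F_ST is undefined")
    return (h_s - h_t) / (1 - h_t)


# Hypothetical allele frequencies, chosen only for illustration.
fixed_for_different_alleles = fst_two_subpops([1.0, 0.0], [0.0, 1.0])
no_shared_alleles_three = fst_two_subpops([1.0, 0.0, 0.0], [0.0, 0.5, 0.5])
```

In the first toy case M = 1/2 and HT = 1/2, and FST is 1. In the second, the subpopulations share no alleles and M is still 1/2, but HT is 0.375, so FST stays below 1.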
12.
bioRxiv ; 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-37986815

ABSTRACT

Without the ability to control or randomize environments (or genotypes), it is difficult to determine the degree to which observed phenotypic differences between two groups of individuals are due to genetic vs. environmental differences. However, some have suggested that these concerns may be limited to pathological cases, and methods have appeared that seem to give, directly or indirectly, some support to claims that aggregate heritable variation within groups can be related to heritable variation among groups. We consider three families of approaches: the "between-group heritability" sometimes invoked in behavior genetics, the statistic PST used in empirical work in evolutionary quantitative genetics, and methods based on variation in ancestry in an admixed population, used in anthropological and statistical genetics. We take up these examples to show mathematically that information on within-group genetic and phenotypic information in the aggregate cannot separate among-group differences into genetic and environmental components, and we provide simulation results that support our claims. We discuss these results in terms of the long-running debate on this topic.

13.
bioRxiv ; 2024 May 30.
Article in English | MEDLINE | ID: mdl-38854009

ABSTRACT

Scalable methods for estimating marginal coalescent trees across the genome present new opportunities for studying evolution and have generated considerable excitement, with new methods extending scalability to thousands of samples. Benchmarking of the available methods has revealed general tradeoffs between accuracy and scalability, but performance in downstream applications has not always been easily predictable from general performance measures, suggesting that specific features of the ARG may be important for specific downstream applications of estimated ARGs. To exemplify this point, we benchmark ARG estimation methods with respect to a specific set of methods for estimating the historical time course of a population-mean polygenic score (PGS) using the marginal coalescent trees encoded by the ancestral recombination graph (ARG). Here we examine the performance in simulation of six ARG estimation methods: ARGweaver, RENT+, Relate, tsinfer+tsdate, ARG-Needle/ASMC-clust, and SINGER, using their estimated coalescent trees and examining bias, mean squared error (MSE), confidence interval coverage, and Type I and II error rates of the downstream methods. Although it does not scale to the sample sizes attainable by other new methods, SINGER produced the most accurate estimated PGS histories in many instances, even when Relate, tsinfer+tsdate, and ARG-Needle/ASMC-clust used samples ten times as large as those used by SINGER. In general, the best choice of method depends on the number of samples available and the historical time period of interest. In particular, the unprecedented sample sizes allowed by Relate, tsinfer+tsdate, and ARG-Needle/ASMC-clust are of greatest importance when the recent past is of interest; further back in time, most of the tree has coalesced, and differences in contemporary sample size are less salient.

14.
bioRxiv ; 2024 May 30.
Article in English | MEDLINE | ID: mdl-38854110

ABSTRACT

With advances in sequencing technology, forensic workers can access genetic information from increasingly challenging samples. A recently published computational approach, IBDGem, analyzes sequencing reads, including from low-coverage samples, in order to arrive at likelihood ratios for tests of identity. Here, we show that likelihood ratios produced by IBDGem test a null hypothesis different from the traditional one used in a forensic genetics context. In particular, rather than testing the hypothesis that the sample comes from a person unrelated to the person of interest, IBDGem tests the hypothesis that the sample comes from an individual who is included in the reference sample used to run the method. This null hypothesis is not generally of forensic interest, because the defense hypothesis is not that the evidence comes from an individual included in a reference panel. Further, it does not take into account genetic variation outside the reference panel, and as a result, the computed likelihood ratios can be much larger than likelihood ratios computed for the standard forensic null hypothesis, often by many orders of magnitude, thus potentially creating an impression of stronger evidence for identity than is warranted. We lay out this result and illustrate it with examples, giving suggestions for directions that might lead to likelihood ratios that have the traditional interpretation.

15.
bioRxiv ; 2024 Mar 07.
Article in English | MEDLINE | ID: mdl-38496530

ABSTRACT

In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred. To help bridge this divide, we derive a general model describing the covariance between the genetic contributions to the quantitative phenotypes of different individuals. Taking this approach shows that standard models in both statistical genetics (e.g., Genome-Wide Association Studies; GWAS) and phylogenetic comparative biology (e.g., phylogenetic regression) can be interpreted as special cases of this more general quantitative-genetic model. The fact that these models share the same core architecture means that we can build a unified understanding of the strengths and limitations of different methods for controlling for genetic structure when testing for associations. We develop intuition for why and when spurious correlations may occur using analytical theory and conduct population-genetic and phylogenetic simulations of quantitative traits. The structural similarity of problems in statistical genetics and phylogenetics enables us to take methodological advances from one field and apply them in the other. We demonstrate this by showing how a standard GWAS technique, namely including both the genetic relatedness matrix (GRM) and its leading eigenvectors (corresponding to the principal components of the genotype matrix) in a regression model, can mitigate spurious correlations in phylogenetic analyses. As a case study of this, we re-examine an analysis testing for co-evolution of expression levels between genes across a fungal phylogeny, and show that including covariance matrix eigenvectors as covariates decreases the false positive rate while simultaneously increasing the true positive rate. More generally, this work provides a foundation for more integrative approaches for understanding the genetic architecture of phenotypes and how evolutionary processes shape it.
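
As a rough illustration of the two ingredients named in the abstract above, the sketch below builds a GRM from a toy genotype matrix and extracts its leading eigenvector, which could then be entered as a covariate in a regression. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy genotypes, centering each SNP without variance scaling, and the use of power iteration in place of a library eigendecomposition. The regression step itself is not shown.

```python
from typing import List

Matrix = List[List[float]]


def centered_grm(genotypes: Matrix) -> Matrix:
    """GRM computed as X X^T / m, with each SNP column of X mean-centered.

    Rows are individuals and columns are SNPs. Variance scaling of SNPs,
    common in practice, is omitted to keep the sketch short.
    """
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    return [
        [sum(x[a][j] * x[b][j] for j in range(m)) / m for b in range(n)]
        for a in range(n)
    ]


def leading_eigenvector(mat: Matrix, iters: int = 200) -> List[float]:
    """Leading eigenvector of a symmetric PSD matrix by power iteration."""
    n = len(mat)
    v = [1.0] + [0.0] * (n - 1)  # deterministic start vector
    for _ in range(iters):
        w = [sum(mat[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            raise ValueError("start vector lies in the null space")
        v = [c / norm for c in w]
    return v


# Hypothetical genotypes (0/1/2 allele counts): two individuals from each
# of two divergent groups, three SNPs.
TOY_GENOTYPES = [[2, 2, 0], [2, 1, 0], [0, 0, 2], [0, 1, 2]]
grm = centered_grm(TOY_GENOTYPES)
pc1 = leading_eigenvector(grm)  # one value per individual
```

In this toy case the entries of `pc1` take opposite signs in the two groups, which is the kind of structure a principal-component covariate is meant to absorb.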

16.
Cell Genom ; 3(5): 100297, 2023 May 10.
Article in English | MEDLINE | ID: mdl-37228747

ABSTRACT

Sex differences in complex traits are suspected to be in part due to widespread gene-by-sex interactions (GxSex), but empirical evidence has been elusive. Here, we infer the mixture of ways in which polygenic effects on physiological traits covary between males and females. We find that GxSex is pervasive but acts primarily through systematic sex differences in the magnitude of many genetic effects ("amplification") rather than in the identity of causal variants. Amplification patterns account for sex differences in trait variance. In some cases, testosterone may mediate amplification. Finally, we develop a population-genetic test linking GxSex to contemporary natural selection and find evidence of sexually antagonistic selection on variants affecting testosterone levels. Our results suggest that amplification of polygenic effects is a common mode of GxSex that may contribute to sex differences and fuel their evolution.

17.
bioRxiv ; 2023 Apr 08.
Article in English | MEDLINE | ID: mdl-37066144

ABSTRACT

Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide Association Studies (GWAS) are a powerful way to find genetic loci associated with phenotypes. GWAS are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history. One way to model this shared history is through the ancestral recombination graph (ARG), which encodes a series of local coalescent trees. Recent computational and methodological breakthroughs have made it feasible to estimate approximate ARGs from large-scale samples. Here, we explore the potential of an ARG-based approach to quantitative-trait locus (QTL) mapping, echoing existing variance-components approaches. We propose a framework that relies on the conditional expectation of a local genetic relatedness matrix given the ARG (local eGRM). Simulations show that our method is especially beneficial for finding QTLs in the presence of allelic heterogeneity. By framing QTL mapping in terms of the estimated ARG, we can also facilitate the detection of QTLs in understudied populations. We use local eGRM to identify a large-effect BMI locus, the CREBRF gene, in a sample of Native Hawaiians in which it was not previously detectable by GWAS because of a lack of population-specific imputation resources. Our investigations can provide intuition about the benefits of using estimated ARGs in population- and statistical-genetic methods in general.

18.
bioRxiv ; 2023 Mar 09.
Article in English | MEDLINE | ID: mdl-36945578

ABSTRACT

The 20 short tandem repeat (STR) markers of the combined DNA index system (CODIS) are the basis of the vast majority of forensic genetics in the United States. One argument for permissive rules about the collection of CODIS genotypes is that the CODIS markers are thought to contain information relevant to identification only (such as a human fingerprint would), with little information about ancestry or traits. However, in the past 20 years, a quickly growing field has identified hundreds of thousands of genotype-trait associations. Here we conduct a survey of the landscape of such associations surrounding the CODIS loci as compared with non-CODIS STRs. We find that the regions around the CODIS markers are enriched for both known pathogenic variants (>90th percentile) and for SNPs identified as trait-associated in genome-wide association studies (GWAS) (≥95th percentile in 10kb and 100kb flanking regions), compared with other random sets of autosomal tetranucleotide-repeat STRs. Although it is not obvious how much phenotypic information CODIS would need to convey to strain the "DNA fingerprint" analogy, the CODIS markers, considered as a set, are in regions unusually dense with variants with known phenotypic associations.

19.
bioRxiv ; 2023 Oct 13.
Article in English | MEDLINE | ID: mdl-37873208

ABSTRACT

The demographic history of a population drives the pattern of genetic variation and is encoded in the gene-genealogical trees of the sampled alleles. However, existing methods to infer demographic history from genetic data tend to use relatively low-dimensional summaries of the genealogy, such as allele frequency spectra. As a step toward capturing more of the information encoded in the genome-wide sequence of genealogical trees, here we propose a novel framework called the genealogical likelihood (gLike), which derives the full likelihood of a genealogical tree under any hypothesized demographic history. Employing a graph-based structure, gLike summarizes across independent trees the relationships among all lineages in a tree with all possible trajectories of population memberships through time and efficiently computes the exact marginal probability under a parameterized demographic model. Through extensive simulations and empirical applications on populations that have experienced multiple admixtures, we showed that gLike can accurately estimate dozens of demographic parameters when the true genealogy is known, including ancestral population sizes, admixture timing, and admixture proportions. Moreover, when using genealogical trees inferred from genetic data, we showed that gLike outperformed conventional demographic inference methods that leverage only the allele-frequency spectrum and yielded parameter estimates that align with established historical knowledge of the past demographic histories for populations like Latino Americans and Native Hawaiians. Furthermore, our framework can trace ancestral histories by analyzing a sample from the admixed population without proxies for its source populations, removing the need to sample ancestral populations that may no longer exist. Taken together, our proposed gLike framework harnesses underutilized genealogical information to offer exceptional sensitivity and accuracy in inferring complex demographies for humans and other species, particularly as estimation of genome-wide genealogies improves.

20.
iScience ; 26(10): 107992, 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37841589

ABSTRACT

The 20 short tandem repeat (STR) loci of the combined DNA index system (CODIS) are the basis of the vast majority of forensic genetics in the United States. One argument for permissive rules about the collection of CODIS genotypes is that the CODIS loci are thought to contain little information about ancestry or traits. However, in the past 20 years, a growing field has identified hundreds of thousands of genotype-trait associations. Here, we conduct a survey of the landscape of such associations surrounding the CODIS loci as compared with non-CODIS STRs. Although this study cannot establish or quantify associations between CODIS genotypes and phenotypes, we find that the regions around the CODIS loci are enriched for both known pathogenic variants (> 90th percentile) and for trait-associated SNPs identified in genome-wide association studies (GWAS) (≥ 95th percentile in 10kb and 100kb flanking regions), compared with other random sets of autosomal tetranucleotide-repeat STRs.
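
The percentile comparisons in the abstract above place one observed count within counts obtained for random sets of comparison STRs. A minimal sketch of that kind of empirical percentile follows. The function name, the counts, and the convention of counting ties as "less than or equal" are assumptions for illustration; the abstract does not describe the exact procedure.

```python
from typing import Sequence


def empirical_percentile(observed: float, null_values: Sequence[float]) -> float:
    """Percent of null-set values that are less than or equal to the observed value."""
    if not null_values:
        raise ValueError("need at least one null value")
    n_le = sum(1 for v in null_values if v <= observed)
    return 100.0 * n_le / len(null_values)


# Hypothetical counts of trait-associated SNPs near each of ten random
# STR sets, and near the set of interest. These are not data from the study.
random_set_counts = [3, 5, 7, 9, 11, 12, 13, 4, 6, 8]
observed_count = 12
percentile = empirical_percentile(observed_count, random_set_counts)
```

With these made-up counts, nine of the ten random sets have a count no larger than the observed one.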
