Results 1 - 20 of 70
1.
PLoS Genet ; 20(2): e1011133, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38412146

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pgen.1010871.].

2.
Proc Natl Acad Sci U S A ; 121(19): e2315780121, 2024 May 07.
Article in English | MEDLINE | ID: mdl-38687793

ABSTRACT

Measuring inbreeding and its consequences on fitness is central for many areas in biology including human genetics and the conservation of endangered species. However, there is no consensus on the best method, either for quantifying inbreeding itself or for modeling its effect on specific traits. We simulated traits based on simulated genomes from a large pedigree and empirical whole-genome sequences of human data from populations with various sizes and structures (from the 1000 Genomes Project). We compare the ability of various inbreeding coefficients (F) to quantify the strength of inbreeding depression: allele-sharing, two versions of the correlation of uniting gametes, which differ in the weight they attribute to each locus, and two identical-by-descent segment-based estimators. We also compare two models: the standard linear model and a linear mixed model (LMM) including a genetic relatedness matrix (GRM) as random effect to account for the nonindependence of observations. We find LMMs give better results in scenarios with population or family structure. Within the LMM, we compare three different GRMs and show that in homogeneous populations, there is little difference among the different F and GRM for inbreeding depression quantification. However, as soon as a strong population or family structure is present, the strength of inbreeding depression can be most efficiently estimated only if i) the phenotypes are regressed on F based on a weighted version of the correlation of uniting gametes, giving more weight to common alleles and ii) with the GRM obtained from an allele-sharing relatedness estimator.


Subjects
Inbreeding Depression; Models, Genetic; Humans; Pedigree; Genetics, Population/methods; Inbreeding; Alleles
3.
PLoS Genet ; 19(11): e1010871, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38011288

ABSTRACT

Being able to properly quantify genetic differentiation is key to understanding the evolutionary potential of a species. One central parameter in this context is F_ST, the mean coancestry within populations relative to the mean coancestry between populations. Researchers have been estimating F_ST globally or between pairs of populations for a long time. More recently, it has been proposed to estimate population-specific F_ST values, and population-pair mean relative coancestry. Here, we review the several definitions and estimation methods of F_ST, and stress that they provide values relative to a reference population. We show the good statistical properties of an allele-sharing, method-of-moments estimator of F_ST (global, population-specific and population-pair) under a very general model of population structure. We point to the limitation of existing likelihood and Bayesian estimators when the populations are not independent. Last, we show that recent attempts to estimate absolute, rather than relative, mean coancestry fail to do so.


Subjects
Biological Evolution; Models, Genetic; Alleles; Bayes Theorem; Genetic Drift; Genetics, Population
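
The abstract of this record describes F_ST as within-population coancestry measured relative to between-population coancestry. The following is a minimal Python sketch of that idea using allele sharing, not the authors' implementation. It assumes biallelic loci, at least two populations, and sample allele frequencies used directly with no small-sample correction; the function name `allele_sharing_fst` and the input layout are hypothetical.

```python
from itertools import combinations


def allele_sharing_fst(freqs):
    """Population-specific and global F_ST from allele sharing.

    freqs[i][l] is the sample frequency of one allele in population i
    at biallelic locus l (assumed input layout, not from the paper).
    """
    n_pops = len(freqs)
    n_loci = len(freqs[0])

    # Mean probability that two alleles drawn from the same population match.
    within = [
        sum(p * p + (1 - p) * (1 - p) for p in pop) / n_loci
        for pop in freqs
    ]

    # Mean probability that two alleles from different populations match,
    # averaged over all pairs of populations.
    pairs = list(combinations(range(n_pops), 2))
    between = sum(
        sum(p * q + (1 - p) * (1 - q) for p, q in zip(freqs[i], freqs[j])) / n_loci
        for i, j in pairs
    ) / len(pairs)

    # Sharing within each population, relative to sharing between populations.
    # Undefined when between == 1 (no variation among populations at all).
    per_pop = [(m - between) / (1 - between) for m in within]
    global_fst = sum(per_pop) / n_pops
    return per_pop, global_fst
```

In this sketch, two populations fixed for different alleles at every locus give a value of 1, and populations with identical allele frequencies give 0, which illustrates that the values are relative to the between-population reference rather than absolute.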
5.
Heredity (Edinb) ; 128(1): 1-10, 2022 01.
Article in English | MEDLINE | ID: mdl-34824382

ABSTRACT

The two alleles an individual carries at a locus are identical by descent (ibd) if they have descended from a single ancestral allele in a reference population, and the probability of such identity is the inbreeding coefficient of the individual. Inbreeding coefficients can be predicted from pedigrees with founders constituting the reference population, but estimation from genetic data is not possible without data from the reference population. Most inbreeding estimators that make explicit use of sample allele frequencies as estimates of allele probabilities in the reference population are confounded by average kinships with other individuals. This means that the ranking of those estimates depends on the scope of the study sample and we show the variation in rankings for common estimators applied to different subdivisions of 1000 Genomes data. Allele-sharing estimators of within-population inbreeding relative to average kinship in a study sample, however, do have invariant rankings across all studies including those individuals. They are unbiased with a large number of SNPs. We discuss how allele sharing estimates are the relevant quantities for a range of empirical applications.


Subjects
Inbreeding; Polymorphism, Single Nucleotide; Alleles; Gene Frequency; Humans; Models, Genetic; Pedigree
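
The abstract of this record refers to allele-sharing estimators of inbreeding relative to the average kinship in the study sample. One form such an estimator can take is sketched below in Python, as an illustration and not as the authors' code. The assumptions are mine: genotypes are coded as dosages 0, 1 or 2 at biallelic SNPs with no missing data, and the function names are hypothetical.

```python
from itertools import combinations


def within_sharing(dosages):
    """Share of loci where the individual's two alleles match (homozygous)."""
    return sum(1.0 if x != 1 else 0.0 for x in dosages) / len(dosages)


def between_sharing(d1, d2):
    """Mean allele sharing between two individuals.

    Per locus, (1 + (x1 - 1) * (x2 - 1)) / 2 is the proportion of allele
    pairs (one allele from each individual) that match.
    """
    return sum((1 + (a - 1) * (b - 1)) / 2 for a, b in zip(d1, d2)) / len(d1)


def allele_sharing_inbreeding(genotypes):
    """Inbreeding of each individual relative to average sample kinship.

    genotypes[j] is the list of dosages for individual j (assumed layout).
    """
    pairs = list(combinations(range(len(genotypes)), 2))
    # Average sharing over all pairs of distinct individuals: the reference.
    a_s = sum(between_sharing(genotypes[i], genotypes[j]) for i, j in pairs) / len(pairs)
    return [(within_sharing(g) - a_s) / (1 - a_s) for g in genotypes]
```

Because every individual is rescaled by the same sample-wide reference `a_s`, estimates of this form can be negative for individuals who are less inbred than the average pair of individuals is related.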
6.
Proc Natl Acad Sci U S A ; 114(32): 8602-8607, 2017 08 08.
Article in English | MEDLINE | ID: mdl-28747529

ABSTRACT

Quantifying the effects of inbreeding is critical to characterizing the genetic architecture of complex traits. This study highlights through theory and simulations the strengths and shortcomings of three SNP-based inbreeding measures commonly used to estimate inbreeding depression (ID). We demonstrate that heterogeneity in linkage disequilibrium (LD) between causal variants and SNPs biases ID estimates, and we develop an approach to correct this bias using LD and minor allele frequency stratified inference (LDMS). We quantified ID in 25 traits measured in [Formula: see text] participants of the UK Biobank, using LDMS, and confirmed previously published ID for 4 traits. We find unique evidence of ID for handgrip strength, waist/hip ratio, and visual and auditory acuity (ID between -2.3 and -5.2 phenotypic SDs for complete inbreeding; [Formula: see text]). Our results illustrate that a careful choice of the measure of inbreeding combined with LDMS stratification improves both detection and quantification of ID using SNP data.


Subjects
Consanguinity; Databases, Nucleic Acid; Linkage Disequilibrium; Models, Genetic; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable; Female; Humans; Male
7.
Genet Epidemiol ; 42(1): 34-48, 2018 02.
Article in English | MEDLINE | ID: mdl-29071737

ABSTRACT

Standard statistical tests for equality of allele frequencies in males and females and tests for Hardy-Weinberg equilibrium are tightly linked by their assumptions. Tests for equality of allele frequencies assume Hardy-Weinberg equilibrium, whereas the usual chi-square or exact test for Hardy-Weinberg equilibrium assumes equality of allele frequencies in the sexes. In this paper, we break this interdependence in the assumptions of the two tests by proposing an omnibus exact test that can test both hypotheses jointly, as well as a likelihood ratio approach that permits these phenomena to be tested both jointly and separately. The tests are illustrated with data from the 1000 Genomes project.


Subjects
Alleles; Gene Frequency; Genetic Markers/genetics; Models, Genetic; Consanguinity; Female; Genome, Human/genetics; Genomics; Humans; Male
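
For context on the "usual chi-square" test mentioned in the abstract of this record, here is a minimal Python sketch of the textbook one-degree-of-freedom Hardy-Weinberg test for an autosomal biallelic marker. It is not the omnibus exact test or the likelihood ratio approach proposed in the paper, and it pools the sexes, which is exactly the assumption the paper sets out to relax. The function name is hypothetical, and both alleles are assumed to be observed in the sample.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p

    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

With counts of 25, 50 and 25 the statistic is 0, while a sample of 50, 0 and 50 (no heterozygotes) gives a statistic of 100.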
8.
Am J Hum Genet ; 98(1): 127-48, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26748516

ABSTRACT

Genealogical inference from genetic data is essential for a variety of applications in human genetics. In genome-wide and sequencing association studies, for example, accurate inference on both recent genetic relatedness, such as family structure, and more distant genetic relatedness, such as population structure, is necessary for protection against spurious associations. Distinguishing familial relatedness from population structure with genotype data, however, is difficult because both manifest as genetic similarity through the sharing of alleles. Existing approaches for inference on recent genetic relatedness have limitations in the presence of population structure, where they either (1) make strong and simplifying assumptions about population structure, which are often untenable, or (2) require correct specification of and appropriate reference population panels for the ancestries in the sample, which might be unknown or not well defined. Here, we propose PC-Relate, a model-free approach for estimating commonly used measures of recent genetic relatedness, such as kinship coefficients and IBD sharing probabilities, in the presence of unspecified structure. PC-Relate uses principal components calculated from genome-screen data to partition genetic correlations among sampled individuals due to the sharing of recent ancestors and more distant common ancestry into two separate components, without requiring specification of the ancestral populations or reference population panels. In simulation studies with population structure, including admixture, we demonstrate that PC-Relate provides accurate estimates of genetic relatedness and improved relationship classification over widely used approaches. We further demonstrate the utility of PC-Relate in applications to three ancestrally diverse samples that vary in both size and genealogical complexity.


Subjects
Models, Genetic; Humans; Los Angeles; Mexican Americans/genetics
9.
Am J Hum Genet ; 98(1): 165-84, 2016 Jan 07.
Article in English | MEDLINE | ID: mdl-26748518

ABSTRACT

US Hispanic/Latino individuals are diverse in genetic ancestry, culture, and environmental exposures. Here, we characterized and controlled for this diversity in genome-wide association studies (GWASs) for the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). We simultaneously estimated population-structure principal components (PCs) robust to familial relatedness and pairwise kinship coefficients (KCs) robust to population structure, admixture, and Hardy-Weinberg departures. The PCs revealed substantial genetic differentiation within and among six self-identified background groups (Cuban, Dominican, Puerto Rican, Mexican, and Central and South American). To control for variation among groups, we developed a multi-dimensional clustering method to define a "genetic-analysis group" variable that retains many properties of self-identified background while achieving substantially greater genetic homogeneity within groups and including participants with non-specific self-identification. In GWASs of 22 biomedical traits, we used a linear mixed model (LMM) including pairwise empirical KCs to account for familial relatedness, PCs for ancestry, and genetic-analysis groups for additional group-associated effects. Including the genetic-analysis group as a covariate accounted for significant trait variation in 8 of 22 traits, even after we fit 20 PCs. Additionally, genetic-analysis groups had significant heterogeneity of residual variance for 20 of 22 traits, and modeling this heteroscedasticity within the LMM reduced genomic inflation for 19 traits. Furthermore, fitting an LMM that utilized a genetic-analysis group rather than a self-identified background group achieved higher power to detect previously reported associations. We expect that the methods applied here will be useful in other studies with multiple ethnic groups, admixture, and relatedness.


Subjects
Genetic Variation; Hispanic or Latino/genetics; Genome-Wide Association Study; Humans; United States
10.
Am J Hum Genet ; 98(2): 229-42, 2016 Feb 04.
Article in English | MEDLINE | ID: mdl-26805783

ABSTRACT

Platelets play an essential role in hemostasis and thrombosis. We performed a genome-wide association study of platelet count in 12,491 participants of the Hispanic Community Health Study/Study of Latinos by using a mixed-model method that accounts for admixture and family relationships. We discovered and replicated associations with five genes (ACTN1, ETV7, GABBR1-MOG, MEF2C, and ZBTB9-BAK1). Our strongest association was with Amerindian-specific variant rs117672662 (p value = 1.16 × 10⁻²⁸) in ACTN1, a gene implicated in congenital macrothrombocytopenia. rs117672662 exhibited allelic differences in transcriptional activity and protein binding in hematopoietic cells. Our results underscore the value of diverse populations to extend insights into the allelic architecture of complex traits.


Subjects
Genetic Association Studies/methods; Genetic Loci; Hispanic or Latino/genetics; Platelet Count; Actinin/genetics; Adolescent; Adult; Aged; Alleles; Gene Frequency; Genotype; Genotyping Techniques; Humans; MEF2 Transcription Factors/genetics; Membrane Proteins/genetics; Middle Aged; Phenotype; Polymorphism, Single Nucleotide; Receptors, GABA-B/genetics; Young Adult
11.
Theor Popul Biol ; 128: 19-26, 2019 08.
Article in English | MEDLINE | ID: mdl-31145877

ABSTRACT

The linkage disequilibrium coefficient r² is a measure of statistical dependence of the alleles possessed by an individual at different genetic loci. It is widely used in association studies to search for the locations of disease-causing genes on chromosomes. Most studies to date treat r² as a fixed property of two loci in a finite population, and investigate the sampling distribution of estimators due to the statistical sampling of individuals from the population. Here, we instead consider the distribution of r² itself under a process of genetic sampling through the generations. Using a classical two-locus model for genetic drift, mutation, and recombination, we investigate the probability density function of r² at stationarity. This density function provides a tool for inference on evolutionary parameters such as mutation and recombination rates. We reconstruct the approximate stationary density of r² by calculating a finite sequence of the distribution's moments and applying the maximum entropy principle. Our approach is based on the diffusion approximation, under which we demonstrate that for certain models in population genetics, moments of the stationary distribution can be obtained without knowing the probability distribution itself. To illustrate our approach, we show how the stationary probability density of r² can be used in a maximum likelihood framework to estimate mutation and recombination rates from sample data of r².


Subjects
Linkage Disequilibrium; Models, Statistical; Algorithms; Alleles; Genetic Loci; Genetics, Population
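
As a reminder of the quantity whose distribution this record studies, the sketch below computes the standard sample value of the squared linkage disequilibrium correlation from phased haplotype counts at two biallelic loci. It illustrates the definition only and does not model drift, mutation or recombination as the paper does. The function name and argument order are hypothetical, and both loci are assumed to be polymorphic in the sample.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Squared LD correlation from counts of the four two-locus haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n            # frequency of haplotype AB
    p_A = (n_AB + n_Ab) / n    # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / n    # frequency of allele B at the second locus

    # D: departure of the haplotype frequency from independence.
    d = p_AB - p_A * p_B
    return d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
```

Counts of 50, 0, 0 and 50 give complete association (a value of 1), and equal counts of all four haplotypes give 0.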
12.
Am J Hum Genet ; 96(3): 377-85, 2015 Mar 05.
Article in English | MEDLINE | ID: mdl-25683123

ABSTRACT

For human complex traits, non-additive genetic variation has been invoked to explain "missing heritability," but its discovery is often neglected in genome-wide association studies. Here we propose a method of using SNP data to partition and estimate the proportion of phenotypic variance attributed to additive and dominance genetic variation at all SNPs (h²_SNP and δ²_SNP) in unrelated individuals based on an orthogonal model where the estimate of h²_SNP is independent of that of δ²_SNP. With this method, we analyzed 79 quantitative traits in 6,715 unrelated European Americans. The estimate of δ²_SNP averaged across all the 79 quantitative traits was 0.03, approximately a fifth of that for additive variation (average h²_SNP = 0.15). There were a few traits that showed substantial estimates of δ²_SNP, none of which were replicated in a larger sample of 11,965 individuals. We further performed genome-wide association analyses of the 79 quantitative traits and detected SNPs with genome-wide significant dominance effects only at the ABO locus for factor VIII and von Willebrand factor. All these results suggest that dominance variation at common SNPs explains only a small fraction of phenotypic variation for human complex traits and contributes little to the missing narrow-sense heritability problem.


Subjects
Genome-Wide Association Study/methods; Phenotype; Polymorphism, Single Nucleotide; Quantitative Trait, Heritable; Cohort Studies; Evaluation Studies as Topic; Female; Humans; Linear Models; Male; Models, Genetic; White People/genetics
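
The abstract of this record mentions an orthogonal model in which the additive and dominance estimates are independent, without giving the coding. The Python sketch below shows one genotype coding with that property under Hardy-Weinberg proportions. It is an assumption on my part that a coding of this kind is what is meant, and the function names are hypothetical.

```python
def additive_dominance_codes(x, p):
    """Centered additive and dominance covariates for one genotype.

    x is the count (0, 1 or 2) of the allele whose frequency is p.
    """
    additive = x - 2 * p
    dominance = {
        0: -2 * p * p,
        1: 2 * p * (1 - p),
        2: -2 * (1 - p) ** 2,
    }[x]
    return additive, dominance


def hwe_moments(p):
    """Mean of each covariate and mean of their product under Hardy-Weinberg."""
    q = 1 - p
    probs = {0: q * q, 1: 2 * p * q, 2: p * p}
    mean_a = mean_d = mean_ad = 0.0
    for x, prob in probs.items():
        a, d = additive_dominance_codes(x, p)
        mean_a += prob * a
        mean_d += prob * d
        mean_ad += prob * a * d
    return mean_a, mean_d, mean_ad
```

For any allele frequency strictly between 0 and 1, `hwe_moments` returns three values that are zero up to rounding, meaning both covariates are centered and uncorrelated with each other, which is what allows the two variance components to be estimated independently in a model of this kind.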
13.
Bioinformatics ; 33(15): 2251-2257, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28334390

ABSTRACT

MOTIVATION: Whole-genome sequencing (WGS) data are being generated at an unprecedented rate. Analysis of WGS data requires a flexible data format to store the different types of DNA variation. Variant call format (VCF) is a general text-based format developed to store variant genotypes and their annotations. However, VCF files are large and data retrieval is relatively slow. Here we introduce a new WGS variant data format implemented in the R/Bioconductor package 'SeqArray' for storing variant calls in an array-oriented manner which provides the same capabilities as VCF, but with multiple high compression options and data access using high-performance parallel computing. RESULTS: Benchmarks using 1000 Genomes Phase 3 data show file sizes are 14.0 Gb (VCF), 12.3 Gb (BCF, binary VCF), 3.5 Gb (BGT) and 2.6 Gb (SeqArray) respectively. Reading genotypes with the SeqArray package is two to three times faster than with the htslib C library using BCF files. For the allele frequency calculation, the implementation in the SeqArray package is over 5 times faster than PLINK v1.9 with VCF and BCF files, and over 16 times faster than vcftools. When used in conjunction with R/Bioconductor packages, the SeqArray package provides users a flexible, feature-rich, high-performance programming environment for analysis of WGS variant data. AVAILABILITY AND IMPLEMENTATION: http://www.bioconductor.org/packages/SeqArray. CONTACT: zhengx@u.washington.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Data Compression/methods; Genetic Variation; Software; Whole Genome Sequencing/methods; Genome, Human; Genomics/methods; Humans
14.
Mol Ecol ; 27(20): 4121-4135, 2018 10.
Article in English | MEDLINE | ID: mdl-30107060

ABSTRACT

The concept of kinship permeates many domains of fundamental and applied biology ranging from social evolution to conservation science to quantitative and human genetics. Until recently, pedigrees were the gold standard to infer kinship, but the advent of next-generation sequencing and the availability of dense genetic markers in many species make it a good time to (re)evaluate the usefulness of genetic markers in this context. Using three published data sets where both pedigrees and markers are available, we evaluate two common and a new genetic estimator of kinship. We show discrepancies between pedigree values and marker estimates of kinship and explore via simulations the possible reasons for these. We find these discrepancies are attributable to two main sources: pedigree errors and heterogeneity in the origin of founders. We also show that our new marker-based kinship estimator has very good statistical properties and behaviour and is particularly well suited for situations where the source population is of small size, as will often be the case in conservation biology, and where high levels of kinship are expected, as is typical in social evolution studies.


Subjects
Genetics, Population/methods; Pedigree; Genetic Markers; Humans; Models, Genetic
15.
Theor Popul Biol ; 107: 65-76, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26482676

ABSTRACT

Principal component analysis (PCA) is widely used in genome-wide association studies (GWAS), and the principal component axes often represent perpendicular gradients in geographic space. The explanation of PCA results is of major interest for geneticists to understand fundamental demographic parameters. Here, we provide an interpretation of PCA based on relatedness measures, which are described by the probability that sets of genes are identical-by-descent (IBD). An approximately linear transformation between ancestral proportions (AP) of individuals with multiple ancestries and their projections onto the principal components is found. In addition, a new method of eigenanalysis "EIGMIX" is proposed to estimate individual ancestries. EIGMIX is a method of moments with computational efficiency suitable for millions of SNP data, and it is not subject to the assumption of linkage equilibrium. With the assumptions of multiple ancestries and their surrogate ancestral samples, EIGMIX is able to infer ancestral proportions (APs) of individuals. The methods were applied to the SNP data from the HapMap Phase 3 project and the Human Genome Diversity Panel. The APs of individuals inferred by EIGMIX are consistent with the findings of the program ADMIXTURE. In conclusion, EIGMIX can be used to detect population structure and estimate genome-wide ancestral proportions with a relatively high accuracy.


Subjects
Consanguinity; Genetic Variation; Genetics, Population/methods; Principal Component Analysis; Bayes Theorem; Gene Frequency/genetics; HapMap Project; Humans; Models, Genetic; Population Groups/genetics
17.
Nat Rev Genet ; 10(9): 639-50, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19687804

ABSTRACT

Wright's F-statistics, and especially F_ST, provide important insights into the evolutionary processes that influence the structure of genetic variation within and among populations, and they are among the most widely used descriptive statistics in population and evolutionary genetics. Estimates of F_ST can identify regions of the genome that have been the target of selection, and comparisons of F_ST from different parts of the genome can provide insights into the demographic history of populations. For these reasons and others, F_ST has a central role in population and evolutionary genetics and has wide applications in fields that range from disease association mapping to forensic science. This Review clarifies how F_ST is defined, how it should be estimated, how it is related to similar statistics and how estimates of F_ST should be interpreted.


Subjects
Data Interpretation, Statistical; Genetics, Population/statistics & numerical data; Population Groups/genetics; Animals; Gene Frequency; Genetic Variation; Genetics, Population/methods; Geography; Humans; Population Groups/statistics & numerical data; Statistics as Topic
18.
PLoS Genet ; 8(2): e1002469, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22346758

ABSTRACT

With the expansion of offender/arrestee DNA profile databases, genetic forensic identification has become commonplace in the United States criminal justice system. Implementation of familial searching has been proposed to extend forensic identification to family members of individuals with profiles in offender/arrestee DNA databases. In familial searching, a partial genetic profile match between a database entrant and a crime scene sample is used to implicate genetic relatives of the database entrant as potential sources of the crime scene sample. In addition to concerns regarding civil liberties, familial searching poses unanswered statistical questions. In this study, we define confidence intervals on estimated likelihood ratios for familial identification. Using these confidence intervals, we consider familial searching in a structured population. We show that relatives and unrelated individuals from population samples with lower gene diversity over the loci considered are less distinguishable. We also consider cases where the most appropriate population sample for individuals considered is unknown. We find that as a less appropriate population sample, and thus allele frequency distribution, is assumed, relatives and unrelated individuals become more difficult to distinguish. In addition, we show that relationship distinguishability increases with the number of markers considered, but decreases for more distant genetic familial relationships. All of these results indicate that caution is warranted in the application of familial searching in structured populations, such as in the United States.


Subjects
Biometric Identification/methods; DNA Fingerprinting/methods; Forensic Genetics; Population/genetics; Alleles; Confidence Intervals; Crime; Criminals; Data Interpretation, Statistical; Databases, Nucleic Acid; Family; Gene Frequency/genetics; Humans; Likelihood Functions; United States
19.
Forensic Sci Int Genet ; 69: 103009, 2024 03.
Article in English | MEDLINE | ID: mdl-38237274

ABSTRACT

Population data have become available for sequence data to aid forensic investigations and prepare the forensic community in the move towards implementing NGS methods. This comes with a need for updated population genetic parameters estimates to allow DNA evidence evaluations using sequence data. Initial work has been done on a small sample and here we expand this work by providing estimates of population structure and relatedness for autosomal STR data generated by sequencing technologies. We also discuss the effect of inbreeding on forensic calculations and discuss why the use of genotypic-based estimates may be preferred over allelic-based estimates.


Subjects
Forensic Genetics; Inbreeding; Humans; Forensic Genetics/methods; Microsatellite Repeats; Genotype; DNA/genetics; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; DNA Fingerprinting/methods
20.
Am J Hum Genet ; 86(5): 674-85, 2010 May 14.
Article in English | MEDLINE | ID: mdl-20381007

ABSTRACT

Coevolving interacting genes undergo complementary mutations to maintain their interaction. Distinct combinations of alleles in coevolving genes interact differently, conferring varying degrees of fitness. If this fitness differential is adequately large, the resulting selection for allele matching could maintain allelic association, even between physically unlinked loci. Allelic association is often observed in a population with the use of gametic linkage disequilibrium. However, because the coevolving genes are not necessarily in physical linkage, this is not an appropriate measure of coevolution-induced allelic association. Instead, we propose using both composite linkage disequilibrium (CLD) and a measure of association between genotypes, which we call genotype association (GA). Using a simple selective model, we simulated loci and calculated power for tests of CLD and GA, showing that the tests can detect the allelic association expected under realistic selective pressure. We apply CLD and GA tests to the polymorphic, physically unlinked, and putatively coevolving human gamete-recognition genes ZP3 and ZP3R. We observe unusual allelic association, not attributable to population structure, between ZP3 and ZP3R. This study shows that selection for allele matching can drive allelic association between unlinked loci in a contemporary human population, and that selection can be detected with the use of CLD and GA tests. The observation of this selection is surprising, but reasonable in the highly selected system of fertilization. If confirmed, this sort of selection provides an exception to the paradigm of chromosomal independent assortment.


Subjects
Alleles; Egg Proteins/genetics; Linkage Disequilibrium; Membrane Glycoproteins/genetics; Receptors, Cell Surface/genetics; Genotype; Humans; Polymorphism, Genetic; Zona Pellucida Glycoproteins
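
Composite linkage disequilibrium, which the abstract of this record proposes in place of gametic linkage disequilibrium, can be computed from unphased genotypes. The Python sketch below relies on the identity that the composite coefficient equals half the covariance of the allele counts at the two loci. It uses the simple divisor n rather than any bias correction, the function name is hypothetical, and it implements neither the significance tests nor the genotype association (GA) measure described in the paper.

```python
def composite_ld(dosage_a, dosage_b):
    """Composite LD between two biallelic loci from unphased genotypes.

    dosage_a[i] and dosage_b[i] are the counts (0, 1 or 2) of a chosen
    allele at each locus in individual i (assumed input layout).
    """
    n = len(dosage_a)
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n

    # Covariance of allele counts across individuals (divisor n).
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b)) / n
    return cov / 2
```

Because no haplotype phase is needed, a measure of this kind stays well defined for loci on different chromosomes, which is the setting of the ZP3 and ZP3R analysis described in the abstract.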