1.
Am J Hum Genet ; 109(8): 1405-1420, 2022 08 04.
Article in English | MEDLINE | ID: mdl-35908549

ABSTRACT

Population genetic analyses of local ancestry tracts routinely assume that the ancestral admixture process is identical for both parents of an individual, an assumption that may be invalid when considering recent admixture. Here, we present Parental Admixture Proportion Inference (PAPI), a Bayesian tool for inferring the admixture proportions and admixture times for each parent of a single admixed individual. PAPI analyzes unphased local ancestry tracts and has two components: a binomial model that leverages genome-wide ancestry fractions to infer parental admixture proportions and a hidden Markov model (HMM) that infers admixture times from tract lengths. Crucially, the HMM accounts for unobserved within-ancestry recombination by approximating the pedigree crossover dynamics, enabling inference of parental admixture times. In simulations, we find that PAPI's admixture proportion estimates deviate from the truth by 0.047 on average, outperforming ANCESTOR and PedMix by 46.0% and 57.6%, respectively. Moreover, PAPI's admixture time estimates are strongly correlated with the truth (R = 0.76) but have an average downward bias of 1.01 generations that is partly attributable to inaccuracies in local ancestry inference. As an illustration of its utility, we ran PAPI on African American genotypes from the PAGE study (N = 5,786) and found strong evidence of assortative mating by ancestry proportion: couples' ancestry proportions are highly correlated (R = 0.87) and are closer to each other than expected under random mating (p < 10⁻⁶). We anticipate that PAPI will be useful in studying the population dynamics of admixture and will also be of interest to individuals seeking to learn about their personal genealogies.


Subjects
Black or African American; Genetics, Population; Bayes Theorem; Humans; Parents; Pedigree
2.
Am J Hum Genet ; 108(1): 68-83, 2021 01 07.
Article in English | MEDLINE | ID: mdl-33385324

ABSTRACT

The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported and current inference methods typically detect only the degree of relatedness of sample pairs and not pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identity by descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences in sex-specific genetic maps to classify pairs as maternally or paternally related (e.g., paternal half-siblings) using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5%-100% of grandparent-grandchild (GP) pairs, 80.0%-97.5% of avuncular (AV) pairs, and 75.5%-98.5% of half-sibling (HS) pairs, compared to PADRE's rates of 38.5%-76.0% of GP, 60.5%-92.0% of AV, and 73.0%-95.0% of HS pairs. Turning to the real 20,032-sample Generation Scotland (GS) dataset, CREST identified seven pedigrees with incorrect relationship types or maternal/paternal parent sexes, five of which we confirmed as mistakes, and two with uncertain relationships. After correcting these, CREST correctly determines relationship types for 93.5% of GP, 97.7% of AV, and 92.2% of HS pairs that have sufficient mutual relative data; the parent sex in 100% of HS and 99.6% of GP pairs; and it completes this analysis in 2.8 h including IBD detection in eight threads.


Subjects
Genome, Human/genetics; Female; Genetic Linkage/genetics; Genotype; Humans; Male; Models, Genetic; Pedigree; Scotland
3.
Am J Hum Genet ; 108(9): 1792-1806, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34411538

ABSTRACT

The Finnish population is a unique example of a genetic isolate affected by a recent founder event. Previous studies have suggested that the ancestors of Finnic-speaking Finns and Estonians reached the circum-Baltic region by the 1st millennium BC. However, high linguistic similarity points to a more recent split of their languages. To study genetic connectedness between Finns and Estonians directly, we first assessed the efficacy of imputation of low-coverage ancient genomes by sequencing a medieval Estonian genome to high depth (23×) and evaluated the performance of its down-sampled replicas. We find that ancient genomes imputed from >0.1× coverage can be reliably used in principal-component analyses without projection. By searching for long shared allele intervals (LSAIs; similar to identity-by-descent segments) in unphased data for >143,000 present-day Estonians, 99 Finns, and 14 imputed ancient genomes from Estonia, we find unexpectedly high levels of individual connectedness between Estonians and Finns for the last eight centuries in contrast to their clear differentiation by allele frequencies. High levels of sharing of these segments between Estonians and Finns predate the demographic expansion and late settlement process of Finland. One plausible source of this extensive sharing is the 8th-10th centuries AD migration event from North Estonia to Finland that has been proposed to explain uniquely shared linguistic features between the Finnish language and the northern dialect of Estonian and shared Christianity-related loanwords from Slavic. These results suggest that LSAI detection provides a computationally tractable way to detect fine-scale structure in large cohorts.


Subjects
Alleles; DNA, Ancient/analysis; Genome, Human; Human Migration/history; Pedigree; Estonia; Female; Finland; Gene Frequency; Genealogy and Heraldry; High-Throughput Nucleotide Sequencing; History, 21st Century; History, Ancient; History, Medieval; Humans; Language/history; Male
4.
Bioinformatics ; 39(3), 2023 03 01.
Article in English | MEDLINE | ID: mdl-36847450

ABSTRACT

SUMMARY: Leveraging local ancestry and haplotype information in genome-wide association studies and downstream analyses can improve the utility of genomics for individuals from diverse and recently admixed ancestries. However, most existing simulation, visualization and variant analysis frameworks are based on variant-level analysis and do not automatically handle these features. We present haptools, an open-source toolkit for performing local ancestry aware and haplotype-based analysis of complex traits. Haptools supports fast simulation of admixed genomes, visualization of admixture tracks, simulation of haplotype- and local ancestry-specific phenotype effects and a variety of file operations and statistics computed in a haplotype-aware manner. AVAILABILITY AND IMPLEMENTATION: Haptools is freely available at https://github.com/cast-genomics/haptools. DOCUMENTATION: Detailed documentation is available at https://haptools.readthedocs.io. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome-Wide Association Study; Software; Haplotypes; Genomics; Genome
5.
Am J Hum Genet ; 106(4): 453-466, 2020 04 02.
Article in English | MEDLINE | ID: mdl-32197076

ABSTRACT

Identity-by-descent (IBD) segments are a useful tool for applications ranging from demographic inference to relationship classification, but most detection methods rely on phasing information and therefore require substantial computation time. As genetic datasets grow, methods for inferring IBD segments that scale well will be critical. We developed IBIS, an IBD detector that locates long regions of allele sharing between unphased individuals, and benchmarked it with Refined IBD, GERMLINE, and TRUFFLE on 3,000 simulated individuals. Phasing these with Beagle 5 takes 4.3 CPU days, followed by either Refined IBD or GERMLINE segment detection in 2.9 or 1.1 h, respectively. By comparison, IBIS finishes in 6.8 min or 7.8 min with IBD2 functionality enabled: speedups of 805-946× including phasing time. TRUFFLE takes 2.6 h, corresponding to IBIS speedups of 20.2-23.3×. IBIS is also accurate, inferring ≥7 cM IBD segments at quality comparable to Refined IBD and GERMLINE. With these segments, IBIS classifies first through third degree relatives in real Mexican American samples at rates meeting or exceeding other methods tested and identifies fourth through sixth degree pairs at rates within 0.0%-2.0% of the top method. While allele frequency-based approaches that do not detect segments can infer relationship degrees faster than IBIS, the fastest are biased in admixed samples, with KING inferring 30.8% fewer fifth degree Mexican American relatives correctly compared with IBIS. Finally, we ran IBIS on chromosome 2 of the UK Biobank dataset and estimate its runtime on the autosomes to be 3.3 days parallelized across 128 cores.


Subjects
Sequence Analysis/methods; Alleles; Chromosomes, Human, Pair 2/genetics; Gene Frequency/genetics; Genome, Human/genetics; Humans; Models, Genetic; Polymorphism, Single Nucleotide/genetics
6.
PLoS Genet ; 16(8): e1008895, 2020 08.
Article in English | MEDLINE | ID: mdl-32760067

ABSTRACT

The sequencing of Neanderthal and Denisovan genomes has yielded many new insights about interbreeding events between extinct hominins and the ancestors of modern humans. While much attention has been paid to the relatively recent gene flow from Neanderthals and Denisovans into modern humans, other instances of introgression leave more subtle genomic evidence and have received less attention. Here, we present a major extension of the ARGweaver algorithm, called ARGweaver-D, which can infer local genetic relationships under a user-defined demographic model that includes population splits and migration events. This Bayesian algorithm probabilistically samples ancestral recombination graphs (ARGs) that specify not only tree topologies and branch lengths along the genome, but also indicate migrant lineages. The sampled ARGs can therefore be parsed to produce probabilities of introgression along the genome. We show that this method is well powered to detect the archaic migration into modern humans, even with only a few samples. We then show that the method can also detect introgressed regions stemming from older migration events, or from unsampled populations. We apply it to human, Neanderthal, and Denisovan genomes, looking for signatures of older proposed migration events, including ancient humans into Neanderthal, and unknown archaic hominins into Denisovans. We identify 3% of the Neanderthal genome that is putatively introgressed from ancient humans, and estimate that the gene flow occurred between 200 and 300 kya. We find no convincing evidence that negative selection acted against these regions. Finally, we predict that 1% of the Denisovan genome was introgressed from an unsequenced, but highly diverged, archaic hominin ancestor. About 15% of these "super-archaic" regions, comprising at least about 4 Mb, were, in turn, introgressed into modern humans and continue to exist in the genomes of people alive today.


Subjects
Gene Flow; Models, Genetic; Neanderthals/genetics; Population/genetics; Recombination, Genetic; Animals; Evolution, Molecular; Human Migration; Humans
7.
PLoS Genet ; 15(12): e1007979, 2019 12.
Article in English | MEDLINE | ID: mdl-31860654

ABSTRACT

Simulations of close relatives and identical-by-descent (IBD) segments are common in genetic studies, yet most past efforts have utilized sex-averaged genetic maps and ignored crossover interference, thus omitting features known to affect the breakpoints of IBD segments. We developed Ped-sim, a method for simulating relatives that can utilize either sex-specific or sex-averaged genetic maps and also either a model of crossover interference or the traditional Poisson model for inter-crossover distances. To characterize the impact of previously ignored mechanisms, we simulated data for all four combinations of these factors. We found that modeling crossover interference decreases the standard deviation of pairwise IBD proportions by 10.4% on average in full siblings through second cousins. By contrast, sex-specific maps increase this standard deviation by 4.2% on average, and also impact the number of segments relatives share. Most notably, using sex-specific maps, the number of segments half-siblings share is bimodal; and when combined with interference modeling, the probability that sixth cousins have non-zero IBD sharing ranges from 9.0% to 13.1%, depending on the sexes of the individuals through which they are related. We present new analytical results for the distributions of IBD segments under these models and show they match results from simulations. Finally, we compared IBD sharing rates between simulated and real relatives and found that the combination of sex-specific maps and interference modeling most accurately captures IBD rates in real data. Ped-sim is open source and available from https://github.com/williamslab/ped-sim.
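
Illustrative note: the "traditional Poisson model for inter-crossover distances" mentioned in this abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not Ped-sim's implementation: the function name `simulate_crossovers`, the 1.5 Morgan chromosome length, and the seed are all hypothetical, and neither sex-specific maps nor crossover interference is modeled here.

```python
import random

def simulate_crossovers(length_morgans, rng):
    """Return sorted crossover positions (in Morgans) for one meiosis.

    Under the Poisson (no-interference) model, distances between successive
    crossovers are independent exponential draws with a mean of 1 Morgan.
    """
    positions = []
    pos = rng.expovariate(1.0)  # distance to the first crossover
    while pos < length_morgans:
        positions.append(pos)
        pos += rng.expovariate(1.0)  # distance to the next crossover
    return positions

# Hypothetical example: many meioses on a 1.5 Morgan chromosome.
rng = random.Random(42)
counts = [len(simulate_crossovers(1.5, rng)) for _ in range(20000)]
mean_count = sum(counts) / len(counts)
print(f"mean crossovers per meiosis: {mean_count:.2f}")
```

Because the crossover count under this model is Poisson with a mean equal to the genetic length, the printed average should be close to 1.5. An interference model would replace the exponential draws with a distribution that makes closely spaced crossovers less likely.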


Subjects
Chromosome Mapping/methods; Computer Simulation; Sex Characteristics; Female; Genetic Variation; Genetics, Population; Genome, Human; Humans; Male; Models, Genetic; Pedigree; Poisson Distribution
8.
Am J Hum Genet ; 103(1): 30-44, 2018 07 05.
Article in English | MEDLINE | ID: mdl-29937093

ABSTRACT

As genetic datasets increase in size, the fraction of samples with one or more close relatives grows rapidly, resulting in sets of mutually related individuals. We present DRUID (deep relatedness utilizing identity by descent), a method that works by inferring the identical-by-descent (IBD) sharing profile of an ungenotyped ancestor of a set of close relatives. Using this IBD profile, DRUID infers relatedness between unobserved ancestors and more distant relatives, thereby combining information from multiple samples to remove one or more generations between the deep relationships to be identified. DRUID constructs sets of close relatives by detecting full siblings and also uses an approach to identify the aunts/uncles of two or more siblings, recovering 92.2% of real aunts/uncles with zero false positives. In real and simulated data, DRUID correctly infers up to 10.5% more relatives than PADRE when using data from two sets of distantly related siblings, and 10.7%-31.3% more relatives given two sets of siblings and their aunts/uncles. DRUID frequently infers relationships either correctly or within one degree of the truth, with PADRE classifying 43.3%-58.3% of tenth degree relatives in this way compared to 79.6%-96.7% using DRUID.


Subjects
Genome, Human/genetics; Polymorphism, Single Nucleotide/genetics; Female; Genetics, Population/methods; Humans; Male; Pedigree; Siblings
9.
Proc Natl Acad Sci U S A ; 115(2): 379-384, 2018 01 09.
Article in English | MEDLINE | ID: mdl-29279374

ABSTRACT

A major challenge in evaluating the contribution of rare variants to complex disease is identifying enough copies of the rare alleles to permit informative statistical analysis. To investigate the contribution of rare variants to the risk of type 2 diabetes (T2D) and related traits, we performed deep whole-genome analysis of 1,034 members of 20 large Mexican-American families with high prevalence of T2D. If rare variants of large effect accounted for much of the diabetes risk in these families, our experiment was powered to detect association. Using gene expression data on 21,677 transcripts for 643 pedigree members, we identified evidence for large-effect rare-variant cis-expression quantitative trait loci that could not be detected in population studies, validating our approach. However, we did not identify any rare variants of large effect associated with T2D, or the related traits of fasting glucose and insulin, suggesting that large-effect rare variants account for only a modest fraction of the genetic risk of these traits in this sample of families. Reliable identification of large-effect rare variants will require larger samples of extended pedigrees or different study designs that further enrich for such variants.


Subjects
Diabetes Mellitus, Type 2/genetics; Genetic Predisposition to Disease/genetics; Genetic Variation; Mexican Americans/genetics; Diabetes Mellitus, Type 2/ethnology; Diabetes Mellitus, Type 2/pathology; Family Health; Female; Gene Frequency; Genetic Predisposition to Disease/ethnology; Genome-Wide Association Study/methods; Genotype; Humans; Male; Pedigree; Phenotype; Quantitative Trait Loci/genetics; Whole Genome Sequencing/methods
10.
Nature ; 506(7486): 97-101, 2014 Feb 06.
Article in English | MEDLINE | ID: mdl-24390345

ABSTRACT

Performing genetic studies in multiple human populations can identify disease risk alleles that are common in one population but rare in others, with the potential to illuminate pathophysiology, health disparities, and the population genetic origins of disease alleles. Here we analysed 9.2 million single nucleotide polymorphisms (SNPs) in each of 8,214 Mexicans and other Latin Americans: 3,848 with type 2 diabetes and 4,366 non-diabetic controls. In addition to replicating previous findings, we identified a novel locus associated with type 2 diabetes at genome-wide significance spanning the solute carriers SLC16A11 and SLC16A13 (P = 3.9 × 10⁻¹³; odds ratio (OR) = 1.29). The association was stronger in younger, leaner people with type 2 diabetes, and replicated in independent samples (P = 1.1 × 10⁻⁴; OR = 1.20). The risk haplotype carries four amino acid substitutions, all in SLC16A11; it is present at ~50% frequency in Native American samples and ~10% in east Asian, but is rare in European and African samples. Analysis of an archaic genome sequence indicated that the risk haplotype introgressed into modern humans via admixture with Neanderthals. The SLC16A11 messenger RNA is expressed in liver, and V5-tagged SLC16A11 protein localizes to the endoplasmic reticulum. Expression of SLC16A11 in heterologous cells alters lipid metabolism, most notably causing an increase in intracellular triacylglycerol levels. Despite type 2 diabetes having been well studied by genome-wide association studies in other populations, analysis in Mexican and Latin American individuals identified SLC16A11 as a novel candidate gene for type 2 diabetes with a possible role in triacylglycerol metabolism.


Subjects
Diabetes Mellitus, Type 2/genetics; Genetic Predisposition to Disease/genetics; Monocarboxylic Acid Transporters/genetics; Polymorphism, Single Nucleotide/genetics; Alleles; Animals; Asian People/genetics; Black People/genetics; Cohort Studies; Endoplasmic Reticulum/genetics; Female; Genome-Wide Association Study; Haplotypes/genetics; HeLa Cells; Humans; Indians, North American/genetics; Lipid Metabolism/genetics; Liver/cytology; Liver/metabolism; Male; Mexico; Neanderthals/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Triglycerides/metabolism; White People/genetics
11.
BMC Bioinformatics ; 19(1): 478, 2018 Dec 12.
Article in English | MEDLINE | ID: mdl-30541436

ABSTRACT

BACKGROUND: Researchers typically sequence a given individual multiple times, either re-sequencing the same DNA sample (technical replication) or sequencing different DNA samples collected on the same individual (biological replication) or both. Before merging the data from these replicate sequence runs, it is important to verify that no errors, such as DNA contamination or mix-ups, occurred during the data collection pipeline. Methods to detect such errors exist but are often ad hoc, cannot handle missing data, and several require phased data. Because they require some combination of genotype calling, imputation, and haplotype phasing, these methods are unsuitable for error detection in low- to moderate-depth sequence data where such tasks are difficult to perform accurately. Additionally, because most existing methods employ a pairwise-comparison approach for error detection rather than joint analysis of the putative replicates, results may be difficult to interpret. RESULTS: We introduce a new method for error detection suitable for shallow-, moderate-, and high-depth sequence data. Using Bayes' theorem, we calculate the posterior probability distribution over the set of relations describing the putative replicates and infer which of the samples originated from an identical genotypic source. CONCLUSIONS: Our method addresses key limitations of existing approaches and produced highly accurate results in simulation experiments. Our method is implemented as an R package called BIGRED (Bayes Inferred Genotype Replicate Error Detector), which is freely available for download: https://github.com/ac2278/BIGRED.


Subjects
Databases, Nucleic Acid/standards; Sequence Analysis, DNA/methods; Humans
12.
Am J Hum Genet ; 91(2): 238-51, 2012 Aug 10.
Article in English | MEDLINE | ID: mdl-22883141

ABSTRACT

Haplotypes are an important resource for a large number of applications in human genetics, but computationally inferred haplotypes are subject to switch errors that decrease their utility. The accuracy of computationally inferred haplotypes increases with sample size, and although ever larger genotypic data sets are being generated, the fact that existing methods require substantial computational resources limits their applicability to data sets containing tens or hundreds of thousands of samples. Here, we present HAPI-UR (haplotype inference for unrelated samples), an algorithm that is designed to handle unrelated and/or trio and duo family data, that has accuracy comparable to or greater than existing methods, and that is computationally efficient and can be applied to 100,000 samples or more. We use HAPI-UR to phase a data set with 58,207 samples and show that it achieves practical runtime and that switch errors decrease with sample size even with the use of samples from multiple ethnicities. Using a data set with 16,353 samples, we compare HAPI-UR to Beagle, MaCH, IMPUTE2, and SHAPEIT and show that HAPI-UR runs 18× faster than all methods and has a lower switch-error rate than do other methods except for Beagle; with the use of consensus phasing, running HAPI-UR three times gives a slightly lower switch-error rate than Beagle does and is more than six times faster. We demonstrate results similar to those from Beagle on another data set with a higher marker density. Lastly, we show that HAPI-UR has better runtime scaling properties than does Beagle so that for larger data sets, HAPI-UR will be practical and will have an even larger runtime advantage. HAPI-UR is available online (see Web Resources).


Subjects
Algorithms; Computational Biology/methods; Genetics; Haplotypes/genetics; Software; Humans; Internet; Research Design; Sample Size
13.
Regul Toxicol Pharmacol ; 73(1): 378-90, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26239692

ABSTRACT

In its review of the U.S. Environmental Protection Agency's toxicological review of inorganic arsenic (iAs), the National Academy of Sciences identified carcinogenic endpoints among the highest priority health effects of concern and stated the need to consider evidence that early life exposures may increase the risk of adverse health effects. Recent studies in mice suggest that in utero exposure to arsenic increases susceptibility to cancer later in life. These data are striking in light of the general lack of evidence for carcinogenicity in rodents exposed to iAs. To evaluate the transplacental carcinogenic potential of iAs, a detailed analysis of the toxicology literature evaluating the role of in utero arsenic exposure in carcinogenesis was conducted. Bladder, lung, and skin tumors, which are the tumor types most consistently reported in humans exposed to high arsenic levels, were not consistently increased in mouse studies. There was also a lack of concordance across studies for other tumor types not typically reported in humans. Therefore, we considered methodological and other critical issues that may have contributed to variable results and we suggest additional studies to address these issues. It was concluded that the available data do not provide evidence of a causal link between in utero arsenic exposure and cancer or indicate early life-stage susceptibility to arsenic-induced cancer, particularly at environmentally relevant doses.


Subjects
Arsenic/toxicity; Carcinogens/toxicity; Neoplasms/chemically induced; Prenatal Exposure Delayed Effects/chemically induced; Animals; Female; Humans; Maternal-Fetal Exchange/physiology; Mice; Pregnancy
14.
Regul Toxicol Pharmacol ; 73(3): 754-7, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26550933

ABSTRACT

Recently Bergman et al. (2015) took issue with our comments (Lamb et al., 2014) on the WHO-UNEP report entitled the "State of the Science of Endocrine Disrupting Chemicals - 2012" (WHO 2013a). We find several key differences between their view and ours regarding the selection of studies and presentation of data related to endocrine disrupting chemicals (EDCs) under the WHO-IPCS definition (2002). In this response we address the factors that we think are most important: (1) the difference between hazard and risk; (2) the different approaches for hazard identification (weight of the evidence [WOE] vs. emphasizing positive findings over null results); and (3) the lack of a justification for conceptual or practical differences between EDCs and other groups of agents.


Subjects
Endocrine Disruptors/toxicity; Animals; Humans
15.
Regul Toxicol Pharmacol ; 69(1): 22-40, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24530840

ABSTRACT

Early in 2013, the World Health Organization (WHO) released a 2012 update to the 2002 State of the Science of Endocrine Disrupting Chemicals. Several significant concerns have been identified that raise questions about conclusions reached in this report regarding endocrine disruption. First, the report is not a state-of-the-science review and does not follow the 2002 WHO recommended weight-of-evidence approach. Second, endocrine disruption is often presumed to occur based on exposure or a potential mechanism despite a lack of evidence to show that chemicals are causally established as endocrine disruptors. Additionally, causation is often inferred by the presentation of a series of unrelated facts, which collectively do not demonstrate causation. Third, trends in disease incidence or prevalence are discussed without regard to known causes or risk factors; endocrine disruption is implicated as the reason for such trends in the absence of evidence. Fourth, dose and potency are ignored for most chemicals discussed. Finally, controversial topics (i.e., low dose effects, non-monotonic dose response) are presented in a one-sided manner, even though these topics are important to understanding endocrine disruption. Overall, the 2012 report does not provide a balanced perspective, nor does it accurately reflect the state of the science on endocrine disruption.


Subjects
Endocrine Disruptors/toxicity; Animals; Environmental Pollutants/toxicity; Humans; Risk Assessment; World Health Organization
16.
JAMA ; 311(22): 2305-14, 2014 Jun 11.
Article in English | MEDLINE | ID: mdl-24915262

ABSTRACT

IMPORTANCE: Latino populations have one of the highest prevalences of type 2 diabetes worldwide. OBJECTIVES: To investigate the association between rare protein-coding genetic variants and prevalence of type 2 diabetes in a large Latino population and to explore potential molecular and physiological mechanisms for the observed relationships. DESIGN, SETTING, AND PARTICIPANTS: Whole-exome sequencing was performed on DNA samples from 3756 Mexican and US Latino individuals (1794 with type 2 diabetes and 1962 without diabetes) recruited from 1993 to 2013. One variant was further tested for allele frequency and association with type 2 diabetes in large multiethnic data sets of 14,276 participants and characterized in experimental assays. MAIN OUTCOME AND MEASURES: Prevalence of type 2 diabetes. Secondary outcomes included age of onset, body mass index, and effect on protein function. RESULTS: A single rare missense variant (c.1522G>A [p.E508K]) was associated with type 2 diabetes prevalence (odds ratio [OR], 5.48; 95% CI, 2.83-10.61; P = 4.4 × 10⁻⁷) in hepatocyte nuclear factor 1-α (HNF1A), the gene responsible for maturity onset diabetes of the young type 3 (MODY3). This variant was observed in 0.36% of participants without type 2 diabetes and 2.1% of participants with it. In multiethnic replication data sets, the p.E508K variant was seen only in Latino patients (n = 1443 with type 2 diabetes and 1673 without it) and was associated with type 2 diabetes (OR, 4.16; 95% CI, 1.75-9.92; P = .0013). In experimental assays, HNF-1A protein encoding the p.E508K mutant demonstrated reduced transactivation activity of its target promoter compared with a wild-type protein. In our data, carriers and noncarriers of the p.E508K mutation with type 2 diabetes had no significant differences in compared clinical characteristics, including age at onset. The mean (SD) age for carriers was 45.3 years (11.2) vs 47.5 years (11.5) for noncarriers (P = .49) and the mean (SD) BMI for carriers was 28.2 (5.5) vs 29.3 (5.3) for noncarriers (P = .19). CONCLUSIONS AND RELEVANCE: Using whole-exome sequencing, we identified a single low-frequency variant in the MODY3-causing gene HNF1A that is associated with type 2 diabetes in Latino populations and may affect protein function. This finding may have implications for screening and therapeutic modification in this population, but additional studies are required.


Subjects
Diabetes Mellitus, Type 2/genetics; Hepatocyte Nuclear Factor 1-alpha/genetics; Adult; Age of Onset; Aged; Female; Genotype; Hispanic or Latino/genetics; Humans; Male; Mexico; Middle Aged; Mutation, Missense; Sequence Analysis, DNA; United States
17.
bioRxiv ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38766004

ABSTRACT

Haplotype phasing, the process of determining which genetic variants are physically located on the same chromosome, is crucial for various genetic analyses. In this study, we first benchmark SHAPEIT and Beagle, two state-of-the-art phasing methods, on two large datasets: >8 million diverse, research-consented 23andMe, Inc. customers and the UK Biobank (UKB). We find that both perform exceptionally well. Beagle's median switch error rate (SER) (after excluding single SNP switches) in white British trios from UKB is 0.026% compared to 0.00% for European ancestry 23andMe research participants; 55.6% of European ancestry 23andMe research participants have zero non-single SNP switches, compared to 42.4% of white British trios. South Asian ancestry 23andMe research participants have the highest median SER amongst the 23andMe populations, but it is still remarkably low at 0.46%. We also investigate the relationship between identity-by-descent (IBD) and SER, finding that switch errors tend to occur in regions of little or no IBD segment coverage. SHAPEIT and Beagle excel at 'intra-chromosomal' phasing, but lack the ability to phase across chromosomes, motivating us to develop an inter-chromosomal phasing method, called HAPTIC (HAPlotype TIling and Clustering), that assigns paternal and maternal variants discretely genome-wide. Our approach uses IBD segments to phase blocks of variants on different chromosomes. HAPTIC represents the segments a focal individual shares with their relatives as nodes in a signed graph and performs bipartite clustering on the signed graph using spectral clustering. We test HAPTIC on 1022 UKB trios, yielding a median phase error of 0.08% in regions covered by IBD segments (33.5% of sites). We also ran HAPTIC in the 23andMe database and found a median phase error rate (the rate of mismatching alleles between the inferred and true phase) of 0.92% in Europeans (93.8% of sites) and 0.09% in admixed Africans (92.7% of sites). HAPTIC's precision depends heavily on data from relatives, so it will increase as datasets grow larger and more diverse. HAPTIC enables analyses that require the parent-of-origin of variants, such as association studies and ancestry inference of untyped parents.
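
The signed-graph step described in this abstract can be illustrated with a small sketch. It is not HAPTIC's implementation: it assumes a common textbook formulation in which the graph is bipartitioned by the sign of the eigenvector belonging to the smallest eigenvalue of the signed Laplacian (absolute-degree matrix minus signed adjacency). The function name `signed_bipartition` and the toy adjacency matrix are hypothetical; positive weights stand for segments believed to come from the same parent and negative weights for segments believed to come from opposite parents.

```python
import numpy as np

def signed_bipartition(adj):
    """Split the nodes of a signed graph into two groups (hypothetical helper).

    adj: symmetric matrix; adj[i, j] > 0 suggests nodes i and j belong
    together, adj[i, j] < 0 suggests they belong apart.
    Returns an array of 0/1 labels, oriented so that node 0 has label 0.
    """
    adj = np.asarray(adj, dtype=float)
    # Signed Laplacian: diagonal of absolute degrees minus signed adjacency.
    degree = np.diag(np.abs(adj).sum(axis=1))
    laplacian = degree - adj
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    smallest = eigvecs[:, 0]
    labels = (smallest >= 0).astype(int)
    # The eigenvector sign is arbitrary, so fix the orientation.
    if labels[0] == 1:
        labels = 1 - labels
    return labels

# Toy example (assumed data): nodes 0-1 agree, nodes 2-3 agree,
# and every edge between the two pairs is negative.
toy_adj = [
    [0, 1, -1, -1],
    [1, 0, -1, -1],
    [-1, -1, 0, 1],
    [-1, -1, 1, 0],
]
toy_labels = signed_bipartition(toy_adj)
```

In this toy graph the two groups are perfectly consistent with the edge signs, so the split recovers them exactly; with noisy real data the eigenvector only approximates the best partition, and which group is paternal or maternal would still need outside information.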

18.
bioRxiv ; 2024 May 14.
Article in English | MEDLINE | ID: mdl-38798596

ABSTRACT

Reconstructing the DNA of ancestors from their descendants has the potential to empower phenotypic analyses (including association and genetic nurture studies), improve pedigree reconstruction, and shed light on the ancestral population and phenotypes of ancestors. We developed HAPI-RECAP, a method that reconstructs the DNA of parents from full siblings and their relatives. This tool leverages the output of HAPI2, a new phasing approach that applies to siblings (and optionally one or both parents) and reliably infers parent haplotypes but does not link the ungenotyped parents' DNA across chromosomes or between segments flanking ambiguities. By combining IBD between the reconstructed parents and the relatives, HAPI-RECAP resolves the source parent of these segments. Moreover, the method exploits crossovers the children inherited and sex-specific genetic maps to infer the reconstructed parents' sexes. We validated these methods on research participants from both 23andMe, Inc. and the San Antonio Mexican American Family Studies. Given data for one parent, HAPI2 reconstructs large fractions of the missing parent's DNA, between 77.6% and 99.97% among all families, and 90.3% on average in three- and four-child families. When reconstructing both parents, HAPI-RECAP inferred between 33.2% and 96.6% of the parents' genotypes, averaging 70.6% in four-child families. Reconstructed genotypes have average error rates < 10⁻³, comparable to those from direct genotyping. HAPI-RECAP inferred the parent sexes 100% correctly given IBD-linked segments and can also reconstruct parents without any IBD. As datasets grow in size, more families will be implicitly collected; HAPI-RECAP holds promise to enable high quality parent genotype reconstruction.

19.
Crit Rev Toxicol ; 43(2): 79-95, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23286529

ABSTRACT

The herbicide glyphosate has undergone multiple safety tests for developmental toxicity in rats and rabbits. The European Commission's 2002 review of available glyphosate data discusses specific heart defects observed in several individual rabbit developmental toxicity studies, but describes the evidence for a potential causal relationship as equivocal. The present assessment was undertaken to analyze the current body of information generated from seven unpublished rabbit studies in order to determine if glyphosate poses a risk for cardiovascular malformations. In addition, the results of six unpublished developmental toxicity studies in rats were considered. Five of the seven rabbit studies (dose range: 10-500 mg/kg/day) were GLP- and testing guideline-compliant for the era in which the studies were performed; a sixth study predated testing and GLP guidelines, but generally adhered to these principles. The seventh study was judged inadequate. In each of the adequate studies, offspring effects occurred only at doses that also caused maternal toxicity. An integrated evaluation of the six adequate studies, using conservative assumptions, demonstrated that neither the overall malformation rate nor the incidence of cardiovascular malformations increased with dose up to the point where severe maternal toxicity was observed (generally ≥150 mg/kg/day). Random occurrences of cardiovascular malformations were observed across all dose groups (including controls) and did not exhibit a dose-response relationship. In the six rat studies (dose range: 30-3500 mg/kg/day), a low incidence of sporadic cardiovascular malformations was reported that was clearly not related to treatment. In summary, assessment of the entire body of the developmental toxicity data reviewed fails to support a potential risk for increased cardiovascular defects as a result of glyphosate exposure during pregnancy.


Subjects
Cardiovascular Abnormalities/etiology , Embryonic Development/drug effects , Fetal Development/drug effects , Glycine/analogs & derivatives , Herbicides/toxicity , Animals , Cardiovascular Abnormalities/epidemiology , Drug Dose-Response Relationship , Mammalian Embryo/drug effects , Female , Glycine/toxicity , Humans , Maternal Exposure , No-Observed-Adverse-Effect Level , Pregnancy , Rabbits , Rats , Risk Assessment , Toxicity Tests , Glyphosate
20.
bioRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-38106003

ABSTRACT

Local ancestry inference (LAI) is an indispensable component of a variety of analyses in medical and population genetics, from admixture mapping to characterizing demographic history. However, the accuracy of LAI depends on factors such as phase quality (for phase-based LAI methods) and the time since admixture of the population under study. Here we present an empirical analysis of four LAI methods using simulated individuals of mixed African and European ancestry, examining the impact of variable phase quality and a range of demographic scenarios. We found that regardless of phasing options, calls from LAI methods that operate on unphased genotypes (phase-free LAI) have 2.6-4.6% higher Pearson correlation with the ground truth than methods that operate on phased genotypes (phase-based LAI). Applying the TRACTOR phase-correction algorithm led to modest improvements in phase-based LAI, but despite this, the Pearson correlation of phase-free LAI remained 2.4-3.8% higher than phase-corrected phase-based approaches (considering the best performing methods in each category). Phase-free and phase-based LAI accuracy differences can dramatically impact downstream analyses: estimates of the time since admixture using phase-based LAI tracts are upwardly biased by ≈10 generations using our highest quality phased data but have virtually no bias using phase-free LAI calls. Our study underscores the strong dependence of phase-based LAI accuracy on phase quality and highlights the merits of LAI approaches that analyze unphased genetic data.
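
The accuracy measure named in this abstract, Pearson correlation between LAI calls and the ground truth, could be computed along the following lines. This is only a sketch under stated assumptions: calls are encoded as per-site ancestry dosages (0, 1, or 2 copies of one ancestry), the function name `lai_pearson` is hypothetical, and the arrays are made-up illustrative values, not data from the study. The study's own encoding and averaging scheme may differ.

```python
import numpy as np

def lai_pearson(true_dosage, inferred_dosage):
    """Pearson correlation between true and inferred ancestry dosages.

    Both inputs are equal-length sequences of per-site dosages
    (assumed encoding: number of haplotypes, 0-2, from one ancestry).
    """
    true_dosage = np.asarray(true_dosage, dtype=float)
    inferred_dosage = np.asarray(inferred_dosage, dtype=float)
    if true_dosage.shape != inferred_dosage.shape:
        raise ValueError("dosage arrays must have the same shape")
    # corrcoef returns the 2x2 correlation matrix; take the off-diagonal.
    return float(np.corrcoef(true_dosage, inferred_dosage)[0, 1])

# Illustrative values only: one site is miscalled (2 called as 1).
truth = [0, 1, 2, 2, 1, 0]
calls = [0, 1, 2, 1, 1, 0]
r_perfect = lai_pearson(truth, truth)
r_one_error = lai_pearson(truth, calls)
```

Because dosage is unphased, this metric can be applied in the same way to phase-free and phase-based calls once the latter are summed across the two haplotypes.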
