Results 1 - 20 of 139
1.
Am J Hum Genet ; 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39362217

ABSTRACT

Recent positive selection can result in an excess of long identity-by-descent (IBD) haplotype segments overlapping a locus. The statistical methods that we propose here address three major objectives in studying selective sweeps: scanning for regions of interest, identifying possible sweeping alleles, and estimating a selection coefficient s. First, we implement a selection scan to locate regions with excess IBD rates. Second, we estimate the allele frequency and location of an unknown sweeping allele by aggregating over variants that are more abundant in an inferred outgroup with excess IBD rate versus the rest of the sample. Third, we propose an estimator for the selection coefficient and quantify uncertainty using the parametric bootstrap. Comparing against state-of-the-art methods in extensive simulations, we show that our methods are more precise at estimating s when s≥0.015. We also show that our 95% confidence intervals contain s in nearly 95% of our simulations. We apply these methods to study positive selection in European ancestry samples from the Trans-Omics for Precision Medicine project. We analyze eight loci where IBD rates are more than four standard deviations above the genome-wide median, including LCT where the maximum IBD rate is 35 standard deviations above the genome-wide median. Overall, we present robust and accurate approaches to study recent adaptive evolution without knowing the identity of the causal allele or using time series data.
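
The scanning step described above (flagging loci whose IBD rate is more than four standard deviations above the genome-wide median) can be sketched as follows. This is only an illustration under stated assumptions: the function name `scan_excess_ibd`, the use of the plain population standard deviation, and the toy input are hypothetical, and the authors' actual standardization may differ.

```python
from statistics import median, pstdev


def scan_excess_ibd(rates, n_sd=4.0):
    """Return indices of positions whose IBD rate exceeds median + n_sd * SD.

    `rates` is a sequence of per-position IBD rates (hypothetical input).
    The SD here is the population standard deviation of all rates; the
    published method may estimate it differently.
    """
    med = median(rates)
    sd = pstdev(rates)
    threshold = med + n_sd * sd
    return [i for i, rate in enumerate(rates) if rate > threshold]


if __name__ == "__main__":
    # Toy genome: a flat background with one strongly elevated position.
    toy_rates = [1.0] * 99 + [10.0]
    print(scan_excess_ibd(toy_rates))
```

A median-centered threshold is less sensitive to a few extreme loci than a mean-centered one, which is presumably why the abstract reports deviations from the median.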

2.
Am J Hum Genet ; 109(12): 2178-2184, 2022 12 01.
Article in English | MEDLINE | ID: mdl-36370709

ABSTRACT

We provide a method for estimating the genome-wide mutation rate from sequence data on unrelated individuals by using segments of identity by descent (IBD). The length of an IBD segment indicates the time to shared ancestor of the segment, and mutations that have occurred since the shared ancestor result in discordances between the two IBD haplotypes. Previous methods for IBD-based estimation of mutation rate have required the use of family data for accurate phasing of the genotypes. This has limited the scope of application of IBD-based mutation rate estimation. Here, we develop an IBD-based method for mutation rate estimation from population data, and we apply it to whole-genome sequence data on 4,166 European American individuals from the TOPMed Framingham Heart Study, 2,996 European American individuals from the TOPMed My Life, Our Future study, and 1,586 African American individuals from the TOPMed Hypertension Genetic Epidemiology Network study. Although mutation rates may differ between populations as a result of genetic factors, demographic factors such as average parental age, and environmental exposures, our results are consistent with equal genome-wide average mutation rates across these three populations. Our overall estimate of the average genome-wide mutation rate per 10^8 base pairs per generation for single-nucleotide variants is 1.24 (95% CI 1.18-1.33).


Subject(s)
Human Genome, Mutation Rate, Humans, Human Genome/genetics, Single Nucleotide Polymorphism/genetics, Haplotypes, Genotype
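
To make the link between mutation rate and IBD discordances in record 2 concrete, here is a minimal sketch. It assumes the textbook approximation that two haplotypes sharing an ancestor `generations` ago are separated by twice that many meioses, so the expected number of discordances is 2 × rate × generations × length. The function name and the example segment are hypothetical, and the published estimator also has to handle phasing and genotype error, which this ignores.

```python
def expected_discordances(mu_per_bp, generations, length_bp):
    """Expected mutational differences between two IBD haplotypes.

    Assumes mutations accumulate independently on both lineages since the
    shared ancestor, hence the factor of 2. Illustrative only.
    """
    return 2.0 * mu_per_bp * generations * length_bp


if __name__ == "__main__":
    # 1.24 per 10^8 bp per generation, as reported in the abstract.
    mu = 1.24e-8
    # Hypothetical segment: 2 Mb long, shared ancestor 50 generations ago.
    print(expected_discordances(mu, generations=50, length_bp=2_000_000))
```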
3.
Proc Natl Acad Sci U S A ; 119(25): e2119281119, 2022 06 21.
Article in English | MEDLINE | ID: mdl-35696575

ABSTRACT

Haplotype-based analyses have recently been leveraged to interrogate the fine-scale structure in specific geographic regions, notably in Europe, although an equivalent haplotype-based understanding across the whole of Europe with these tools is lacking. Furthermore, study of identity-by-descent (IBD) sharing in a large sample of haplotypes across Europe would allow a direct comparison between different demographic histories of different regions. The UK Biobank (UKBB) is a population-scale dataset of genotype and phenotype data collected from the United Kingdom, with established sampling of worldwide ancestries. The exact content of these non-UK ancestries is largely uncharacterized, where study could highlight valuable intracontinental ancestry references with deep phenotyping within the UKBB. In this context, we sought to investigate the sample of European ancestry captured in the UKBB. We studied the haplotypes of 5,500 UKBB individuals with a European birthplace; investigated the population structure and demographic history in Europe, showing in parallel the variety of footprints of demographic history in different genetic regions around Europe; and expand knowledge of the genetic landscape of the east and southeast of Europe. Providing an updated map of European genetics, we leverage IBD-segment sharing to explore the extent of population isolation and size across the continent. In addition to building and expanding upon previous knowledge in Europe, our results show the UKBB as a source of diverse ancestries beyond Britain. These worldwide ancestries sampled in the UKBB may complement and inform researchers interested in specific communities or regions not limited to Britain.


Subject(s)
Haplotypes, Population, Genetic Databases, Datasets as Topic, Demography, Europe, Genetic Variation, Population/genetics
4.
Proc Natl Acad Sci U S A ; 119(13): e2111533119, 2022 03 29.
Article in English | MEDLINE | ID: mdl-35312358

ABSTRACT

Significance: California supports a high cultural and linguistic diversity of Indigenous peoples. In a partnership of researchers with the Muwekma Ohlone tribe, we studied genomes of eight present-day tribal members and 12 ancient individuals from two archaeological sites in the San Francisco Bay Area, spanning ∼2,000 y. We find that compared to genomes of Indigenous individuals from throughout the Americas, the 12 ancient individuals are most genetically similar to ancient individuals from Southern California, and that despite spanning a large time period, they share distinctive ancestry. This ancestry is also shared with present-day tribal members, providing evidence of genetic continuity between past and present Indigenous individuals in the region, in contrast to some popular reconstructions based on archaeological and linguistic information.


Subject(s)
Genomics, Indigenous Peoples, Archaeology, Ancient DNA, Population Genetics, Ancient History, Humans, Linguistics, San Francisco
5.
Am J Hum Genet ; 108(10): 1981-2005, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34582790

ABSTRACT

Neurodevelopmental disorders (NDDs) are clinically and genetically heterogenous; many such disorders are secondary to perturbation in brain development and/or function. The prevalence of NDDs is > 3%, resulting in significant sociocultural and economic challenges to society. With recent advances in family-based genomics, rare-variant analyses, and further exploration of the Clan Genomics hypothesis, there has been a logarithmic explosion in neurogenetic "disease-associated genes" molecular etiology and biology of NDDs; however, the majority of NDDs remain molecularly undiagnosed. We applied genome-wide screening technologies, including exome sequencing (ES) and whole-genome sequencing (WGS), to identify the molecular etiology of 234 newly enrolled subjects and 20 previously unsolved Turkish NDD families. In 176 of the 234 studied families (75.2%), a plausible and genetically parsimonious molecular etiology was identified. Out of 176 solved families, deleterious variants were identified in 218 distinct genes, further documenting the enormous genetic heterogeneity and diverse perturbations in human biology underlying NDDs. We propose 86 candidate disease-trait-associated genes for an NDD phenotype. Importantly, on the basis of objective and internally established variant prioritization criteria, we identified 51 families (51/176 = 28.9%) with multilocus pathogenic variation (MPV), mostly driven by runs of homozygosity (ROHs) - reflecting genomic segments/haplotypes that are identical-by-descent. Furthermore, with the use of additional bioinformatic tools and expansion of ES to additional family members, we established a molecular diagnosis in 5 out of 20 families (25%) who remained undiagnosed in our previously studied NDD cohort emanating from Turkey.


Subject(s)
Genomics/methods, Mutation, Neurodevelopmental Disorders/epidemiology, Phenotype, Adolescent, Adult, Child, Preschool Child, Cohort Studies, Female, Humans, Infant, Newborn Infant, Male, Middle Aged, Neurodevelopmental Disorders/genetics, Neurodevelopmental Disorders/pathology, Pedigree, Prevalence, Turkey/epidemiology, Exome Sequencing, Young Adult
6.
Am J Hum Genet ; 108(11): 2099-2111, 2021 11 04.
Article in English | MEDLINE | ID: mdl-34678161

ABSTRACT

The integration of genomic data into health systems offers opportunities to identify genomic factors underlying the continuum of rare and common disease. We applied a population-scale haplotype association approach based on identity-by-descent (IBD) in a large multi-ethnic biobank to a spectrum of disease outcomes derived from electronic health records (EHRs) and uncovered a risk locus for liver disease. We used genome sequencing and in silico approaches to fine-map the signal to a non-coding variant (c.2784-12T>C) in the gene ABCB4. In vitro analysis confirmed the variant disrupted splicing of the ABCB4 pre-mRNA. Four of five homozygotes had evidence of advanced liver disease, and there was a significant association with liver disease among heterozygotes, suggesting the variant is linked to increased risk of liver disease in an allele dose-dependent manner. Population-level screening revealed the variant to be at a carrier rate of 1.95% in Puerto Rican individuals, likely as the result of a Puerto Rican founder effect. This work demonstrates that integrating EHR and genomic data at a population scale can facilitate strategies for understanding the continuum of genomic risk for common diseases, particularly in populations underrepresented in genomic medicine.


Subject(s)
Delivery of Health Care/organization & administration, Genetic Predisposition to Disease, Liver Diseases/genetics, ATP-Binding Cassette Transporter Subfamily B/genetics, Electronic Health Records, Haplotypes, Heterozygote, Hispanic or Latino/genetics, Homozygote, Humans, Puerto Rico
7.
Am J Hum Genet ; 108(1): 68-83, 2021 01 07.
Article in English | MEDLINE | ID: mdl-33385324

ABSTRACT

The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported and current inference methods typically detect only the degree of relatedness of sample pairs and not pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identity by descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences in sex-specific genetic maps to classify pairs as maternally or paternally related (e.g., paternal half-siblings) using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5%-100% of grandparent-grandchild (GP) pairs, 80.0%-97.5% of avuncular (AV) pairs, and 75.5%-98.5% of half-siblings (HS) pairs compared to PADRE's rates of 38.5%-76.0% of GP, 60.5%-92.0% of AV, 73.0%-95.0% of HS pairs. Turning to the real 20,032 sample Generation Scotland (GS) dataset, CREST identified seven pedigrees with incorrect relationship types or maternal/paternal parent sexes, five of which we confirmed as mistakes, and two with uncertain relationships. After correcting these, CREST correctly determines relationship types for 93.5% of GP, 97.7% of AV, and 92.2% of HS pairs that have sufficient mutual relative data; the parent sex in 100% of HS and 99.6% of GP pairs; and it completes this analysis in 2.8 h including IBD detection in eight threads.


Subject(s)
Human Genome/genetics, Female, Genetic Linkage/genetics, Genotype, Humans, Male, Genetic Models, Pedigree, Scotland
8.
Am J Hum Genet ; 108(9): 1792-1806, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34411538

ABSTRACT

The Finnish population is a unique example of a genetic isolate affected by a recent founder event. Previous studies have suggested that the ancestors of Finnic-speaking Finns and Estonians reached the circum-Baltic region by the 1st millennium BC. However, high linguistic similarity points to a more recent split of their languages. To study genetic connectedness between Finns and Estonians directly, we first assessed the efficacy of imputation of low-coverage ancient genomes by sequencing a medieval Estonian genome to high depth (23×) and evaluated the performance of its down-sampled replicas. We find that ancient genomes imputed from >0.1× coverage can be reliably used in principal-component analyses without projection. By searching for long shared allele intervals (LSAIs; similar to identity-by-descent segments) in unphased data for >143,000 present-day Estonians, 99 Finns, and 14 imputed ancient genomes from Estonia, we find unexpectedly high levels of individual connectedness between Estonians and Finns for the last eight centuries in contrast to their clear differentiation by allele frequencies. High levels of sharing of these segments between Estonians and Finns predate the demographic expansion and late settlement process of Finland. One plausible source of this extensive sharing is the 8th-10th centuries AD migration event from North Estonia to Finland that has been proposed to explain uniquely shared linguistic features between the Finnish language and the northern dialect of Estonian and shared Christianity-related loanwords from Slavic. These results suggest that LSAI detection provides a computationally tractable way to detect fine-scale structure in large cohorts.


Subject(s)
Alleles, Ancient DNA/analysis, Human Genome, Human Migration/history, Pedigree, Estonia, Female, Finland, Gene Frequency, Genealogy and Heraldry, High-Throughput Nucleotide Sequencing, 21st Century History, Ancient History, Medieval History, Humans, Language/history, Male
9.
Mol Genet Genomics ; 299(1): 37, 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38494535

ABSTRACT

Identity by descent (IBD) segments, uninterrupted DNA segments derived from the same ancestral chromosomes, are widely used as indicators of relationships in genetics. A great deal of research focuses on IBD segments between related pairs, while statistical analyses of segments in unrelated individuals are rare. In this study, we investigated the basic informative features of IBD segments in unrelated pairs in Chinese populations from the 1000 Genomes Project. A total of 5922 IBD segments in Chinese interpopulation unrelated individual pairs were detected via IBIS, and the average IBD segment length was 3.71 Mb. It was found that 17.86% of unrelated pairs shared at least one IBD segment in the Chinese cohort. Furthermore, a total of 49 chromosomal regions where IBD segments clustered in high abundance were identified, which might be sharing hotspots in the human genome. Such regions could also be observed in other ancestry populations, which implies that similar IBD backgrounds also exist. Altogether, these results demonstrated the distribution of common background IBD segments, which helps improve the accuracy in pedigree studies based on IBD analysis.


Subject(s)
Asian People, Human Genome, Humans, Asian People/genetics, Human Genome/genetics, Pedigree, Research Design, China
10.
Mol Ecol ; 33(6): e17299, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38380534

ABSTRACT

Additive and dominance genetic variances underlying the expression of quantitative traits are important quantities for predicting short-term responses to selection, but they are notoriously challenging to estimate in most non-model wild populations. Specifically, large-sized or panmictic populations may be characterized by low variance in genetic relatedness among individuals which, in turn, can prevent accurate estimation of quantitative genetic parameters. We used estimates of genome-wide identity-by-descent (IBD) sharing from autosomal SNP loci to estimate quantitative genetic parameters for ecologically important traits in nine-spined sticklebacks (Pungitius pungitius) from a large, outbred population. Using empirical and simulated datasets, with varying sample sizes and pedigree complexity, we assessed the performance of different crossing schemes in estimating additive genetic variance and heritability for all traits. We found that low variance in relatedness characteristic of wild outbred populations with high migration rate can impair the estimation of quantitative genetic parameters and bias heritability estimates downwards. On the other hand, the use of a half-sib/full-sib design allowed precise estimation of genetic variance components and revealed significant additive variance and heritability for all measured traits, with negligible dominance contributions. Genome-partitioning and QTL mapping analyses revealed that most traits had a polygenic basis and were controlled by genes at multiple chromosomes. Furthermore, different QTL contributed to variation in the same traits in different populations suggesting heterogeneous underpinnings of parallel evolution at the phenotypic level. Our results provide important guidelines for future studies aimed at estimating adaptive potential in the wild, particularly for those conducted in outbred large-sized populations.


Subject(s)
Genome, Multifactorial Inheritance, Humans, Genome/genetics, Chromosome Mapping, Phenotype, Genetic Models, Single Nucleotide Polymorphism/genetics
11.
Theor Popul Biol ; 158: 150-169, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38880430

ABSTRACT

The coalescent is a stochastic process representing ancestral lineages in a population undergoing neutral genetic drift. Originally defined for a well-mixed population, the coalescent has been adapted in various ways to accommodate spatial, age, and class structure, along with other features of real-world populations. To further extend the range of population structures to which coalescent theory applies, we formulate a coalescent process for a broad class of neutral drift models with arbitrary - but fixed - spatial, age, sex, and class structure, haploid or diploid genetics, and any fixed mating pattern. Here, the coalescent is represented as a random sequence of mappings [Formula: see text] from a finite set G to itself. The set G represents the "sites" (in individuals, in particular locations and/or classes) at which these alleles can live. The state of the coalescent, C_t: G → G, maps each site g ∈ G to the site containing g's ancestor, t time-steps into the past. Using this representation, we define and analyze coalescence time, coalescence branch length, mutations prior to coalescence, and stationary probabilities of identity-by-descent and identity-by-state. For low mutation, we provide a recipe for computing identity-by-descent and identity-by-state probabilities via the coalescent. Applying our results to a diploid population with arbitrary sex ratio r, we find that measures of genetic dissimilarity, among any set of sites, are scaled by 4r(1-r) relative to the even sex ratio case.


Subject(s)
Genetic Drift, Population Genetics, Genetic Models, Mutation, Stochastic Processes, Humans, Diploidy
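
The 4r(1-r) factor at the end of record 11 is easy to tabulate. The sketch below simply evaluates it, using a hypothetical helper name `sex_ratio_scaling`. The same factor appears in the classical effective-size formula for unequal sex ratios, N_e = 4 N_m N_f / (N_m + N_f), which equals 4r(1-r)N when a fraction r of N breeders is of one sex; how the paper derives the factor for general sets of sites is not reproduced here.

```python
def sex_ratio_scaling(r):
    """Scaling factor 4r(1-r) relative to an even sex ratio (r = 0.5)."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("sex ratio r must lie in [0, 1]")
    return 4.0 * r * (1.0 - r)


if __name__ == "__main__":
    # The factor peaks at 1 for an even ratio and shrinks as the ratio skews.
    for r in (0.5, 0.25, 0.1):
        print(r, sex_ratio_scaling(r))
```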
12.
Am J Med Genet A ; : e63712, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38757552

ABSTRACT

Chromosomal microarrays (CMA) incorporate single nucleotide polymorphisms to enable the detection of regions of homozygosity (ROH). Here, we retrospectively analyzed 6288 prenatal cases that underwent CMA to explore the clinical implications of large ROH in prenatal diagnosis. We analyzed cases with ROH larger than 10 megabases and reviewed the ultrasound findings, karyotype results, and pregnancy follow-up data. Cases with possible imprinting disorders were assessed by methylation-specific multiplex ligation-dependent probe amplification. In total, we identified 50 cases with large ROH, and chromosomes 1 and 2 were the most affected. About 59.18% of the ROH cases had ultrasound abnormalities, with the most common findings being ultrasound soft-marker abnormalities. Seven fetuses had ROH that covered almost the entire chromosome and four had terminal ROH that involved almost the entire long arm of the chromosome, which indicated uniparental disomy (UPD), of which 70% showed abnormal ultrasound findings. Ten cases with multiple ROH on different chromosomes indicated the third to fifth degree of consanguinity. In this study, we highlighted the clinical relevance of large ROH related to UPD. The analysis of ROH allowed us to gain further understanding of complex cytogenetic and disease mechanisms in prenatal diagnosis.

13.
Am J Hum Genet ; 106(3): 371-388, 2020 03 05.
Article in English | MEDLINE | ID: mdl-32142644

ABSTRACT

The population of the United States is shaped by centuries of migration, isolation, growth, and admixture between ancestors of global origins. Here, we assemble a comprehensive view of recent population history by studying the ancestry and population structure of more than 32,000 individuals in the US using genetic, ancestral birth origin, and geographic data from the National Geographic Genographic Project. We identify migration routes and barriers that reflect historical demographic events. We also uncover the spatial patterns of relatedness in subpopulations through the combination of haplotype clustering, ancestral birth origin analysis, and local ancestry inference. Examples of these patterns include substantial substructure and heterogeneity in Hispanics/Latinos, isolation-by-distance in African Americans, elevated levels of relatedness and homozygosity in Asian immigrants, and fine-scale structure in European descents. Taken together, our results provide detailed insights into the genetic structure and demographic history of the diverse US population.


Subject(s)
Emigration and Immigration, Population Genetics, Haplotypes, Cluster Analysis, Demography, Humans, United States
14.
Am J Hum Genet ; 106(4): 426-437, 2020 04 02.
Article in English | MEDLINE | ID: mdl-32169169

ABSTRACT

Segments of identity by descent (IBD) are used in many genetic analyses. We present a method for detecting identical-by-descent haplotype segments in phased genotype data. Our method, called hap-IBD, combines a compressed representation of haplotype data, the positional Burrows-Wheeler transform, and multi-threaded execution to produce very fast analysis times. An attractive feature of hap-IBD is its simplicity: the input parameters clearly and precisely define the IBD segments that are reported, so that program correctness can be confirmed by users. We evaluate hap-IBD and four state-of-the-art IBD segment detection methods (GERMLINE, iLASH, RaPID, and TRUFFLE) using UK Biobank chromosome 20 data and simulated sequence data. We show that hap-IBD detects IBD segments faster and more accurately than competing methods, and that hap-IBD is the only method that can rapidly and accurately detect short 2-4 centiMorgan (cM) IBD segments in the full UK Biobank data. Analysis of 485,346 UK Biobank samples through the use of hap-IBD with 12 computational threads detects 231.5 billion autosomal IBD segments with length ≥2 cM in 24.4 h.


Subject(s)
Human Genome/genetics, DNA Sequence Analysis/methods, Alleles, Chromosomes/genetics, Computer Simulation, Data Analysis, Genetic Markers/genetics, Population Genetics/methods, Genotype, Haplotypes/genetics, Humans, Single Nucleotide Polymorphism/genetics, Software
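
Record 14 names the positional Burrows-Wheeler transform (PBWT) as one ingredient of hap-IBD. The sketch below shows only the generic PBWT construction of the positional prefix array and divergence array for binary haplotypes. It is not hap-IBD's code: the function name and toy data are hypothetical, and hap-IBD's compressed representation, segment extension, and multi-threading are not shown.

```python
def pbwt_arrays(haplotypes):
    """Build the PBWT prefix and divergence arrays over all sites.

    `haplotypes` is a list of equal-length lists of 0/1 alleles.
    Returns (ppa, div): `ppa` orders haplotype indices by their reversed
    prefixes; `div[i]` is the first site of the longest match, ending at
    the last site, between the haplotypes at sorted positions i and i - 1.
    A value equal to the number of sites means no match.
    """
    n_hap = len(haplotypes)
    n_sites = len(haplotypes[0]) if n_hap else 0
    ppa = list(range(n_hap))
    div = [0] * n_hap
    for k in range(n_sites):
        zeros, ones = [], []          # haplotypes carrying allele 0 / 1 at site k
        div_zeros, div_ones = [], []  # their updated match start positions
        p = q = k + 1                 # running match starts for each allele class
        for idx, match_start in zip(ppa, div):
            p = max(p, match_start)
            q = max(q, match_start)
            if haplotypes[idx][k] == 0:
                zeros.append(idx)
                div_zeros.append(p)
                p = 0
            else:
                ones.append(idx)
                div_ones.append(q)
                q = 0
        # Stable partition: allele-0 haplotypes first, then allele-1.
        ppa = zeros + ones
        div = div_zeros + div_ones
    return ppa, div


if __name__ == "__main__":
    toy = [[0, 1, 0], [0, 1, 0], [1, 0, 1]]
    print(pbwt_arrays(toy))
```

Because haplotypes with long shared suffixes end up adjacent in `ppa`, candidate matching segments can be found by scanning neighbors rather than comparing all pairs, which is what makes PBWT-based methods scale.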
15.
Am J Hum Genet ; 107(5): 895-910, 2020 11 05.
Article in English | MEDLINE | ID: mdl-33053335

ABSTRACT

Most methods for fast detection of identity by descent (IBD) segments report identity by state segments without any quantification of the uncertainty in the endpoints and lengths of the IBD segments. We present a method for determining the posterior probability distribution of IBD segment endpoints. Our approach accounts for genotype errors, recent mutations, and gene conversions which disrupt DNA sequence identity within IBD segments, and it can be applied to large cohorts with whole-genome sequence or SNP array data. We find that our method's estimates of uncertainty are well calibrated for homogeneous samples. We quantify endpoint uncertainty for 77.7 billion IBD segments from 408,883 individuals of white British ancestry in the UK Biobank, and we use these IBD segments to find regions showing evidence of recent natural selection. We show that many spurious selection signals are eliminated by the use of unbiased estimates of IBD segment endpoints and a pedigree-based genetic map. Eleven of the twelve regions with the greatest evidence for recent selection in our scan have been identified as selected in previous analyses using different approaches. Our computationally efficient method for quantifying IBD segment endpoint uncertainty is implemented in the open source ibd-ends software package.


Subject(s)
Biometric Identification/methods, Chromosome Mapping/statistics & numerical data, Human Genome, Inheritance Patterns, Statistical Models, Single Nucleotide Polymorphism, Biological Specimen Banks, Family, Female, Genome-Wide Association Study, Humans, Male, Pedigree, Software, Uncertainty, United Kingdom
16.
Am J Hum Genet ; 106(4): 453-466, 2020 04 02.
Article in English | MEDLINE | ID: mdl-32197076

ABSTRACT

Identity-by-descent (IBD) segments are a useful tool for applications ranging from demographic inference to relationship classification, but most detection methods rely on phasing information and therefore require substantial computation time. As genetic datasets grow, methods for inferring IBD segments that scale well will be critical. We developed IBIS, an IBD detector that locates long regions of allele sharing between unphased individuals, and benchmarked it with Refined IBD, GERMLINE, and TRUFFLE on 3,000 simulated individuals. Phasing these with Beagle 5 takes 4.3 CPU days, followed by either Refined IBD or GERMLINE segment detection in 2.9 or 1.1 h, respectively. By comparison, IBIS finishes in 6.8 min or 7.8 min with IBD2 functionality enabled: speedups of 805-946× including phasing time. TRUFFLE takes 2.6 h, corresponding to IBIS speedups of 20.2-23.3×. IBIS is also accurate, inferring ≥7 cM IBD segments at quality comparable to Refined IBD and GERMLINE. With these segments, IBIS classifies first through third degree relatives in real Mexican American samples at rates meeting or exceeding other methods tested and identifies fourth through sixth degree pairs at rates within 0.0%-2.0% of the top method. While allele frequency-based approaches that do not detect segments can infer relationship degrees faster than IBIS, the fastest are biased in admixed samples, with KING inferring 30.8% fewer fifth degree Mexican American relatives correctly compared with IBIS. Finally, we ran IBIS on chromosome 2 of the UK Biobank dataset and estimate its runtime on the autosomes to be 3.3 days parallelized across 128 cores.


Subject(s)
Sequence Analysis/methods, Alleles, Human Chromosome Pair 2/genetics, Gene Frequency/genetics, Human Genome/genetics, Humans, Genetic Models, Single Nucleotide Polymorphism/genetics
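
The speedup figures in record 16 can be roughly recomputed from the timings quoted in the abstract. The sketch below does so under the assumption that "speedup" means total baseline time (phasing plus detection) divided by IBIS run time. The variable names are hypothetical, and because the published timings are rounded, the recomputed ratios agree with the reported ranges (805-946× and 20.2-23.3×) only approximately.

```python
MIN_PER_HOUR = 60.0
MIN_PER_DAY = 24 * MIN_PER_HOUR


def speedup(baseline_minutes, ibis_minutes):
    """Ratio of a baseline run time to the IBIS run time."""
    return baseline_minutes / ibis_minutes


# Timings as quoted (rounded) in the abstract, converted to minutes.
phasing = 4.3 * MIN_PER_DAY                           # Beagle 5 phasing
refined_ibd_total = phasing + 2.9 * MIN_PER_HOUR      # phasing + Refined IBD
germline_total = phasing + 1.1 * MIN_PER_HOUR         # phasing + GERMLINE
truffle_total = 2.6 * MIN_PER_HOUR                    # TRUFFLE (no phasing)
ibis_default, ibis_with_ibd2 = 6.8, 7.8               # IBIS run times

if __name__ == "__main__":
    baselines = [
        ("Refined IBD", refined_ibd_total),
        ("GERMLINE", germline_total),
        ("TRUFFLE", truffle_total),
    ]
    for name, total in baselines:
        fast = speedup(total, ibis_default)
        slow = speedup(total, ibis_with_ibd2)
        print(f"{name}: {slow:.1f}x to {fast:.1f}x")
```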
17.
Am J Hum Genet ; 107(2): 265-277, 2020 08 06.
Article in English | MEDLINE | ID: mdl-32707084

ABSTRACT

According to historical records of transatlantic slavery, traders forcibly deported an estimated 12.5 million people from ports along the Atlantic coastline of Africa between the 16th and 19th centuries, with global impacts reaching to the present day, more than a century and a half after slavery's abolition. Such records have fueled a broad understanding of the forced migration from Africa to the Americas yet remain underexplored in concert with genetic data. Here, we analyzed genotype array data from 50,281 research participants, which, combined with historical shipping documents, illustrate that the current genetic landscape of the Americas is largely concordant with expectations derived from documentation of slave voyages. For instance, genetic connections between people in slave trading regions of Africa and disembarkation regions of the Americas generally mirror the proportion of individuals forcibly moved between those regions. While some discordances can be explained by additional records of deportations within the Americas, other discordances yield insights into variable survival rates and timing of arrival of enslaved people from specific regions of Africa. Furthermore, the greater contribution of African women to the gene pool compared to African men varies across the Americas, consistent with literature documenting regional differences in slavery practices. This investigation of the transatlantic slave trade, which is broad in scope in terms of both datasets and analyses, establishes genetic links between individuals in the Americas and populations across Atlantic Africa, yielding a more comprehensive understanding of the African roots of peoples of the Americas.


Subject(s)
Black People/genetics, Single Nucleotide Polymorphism/genetics, Africa, Americas, Enslaved Persons, Europe, Female, Humans, Male
18.
Mol Ecol ; 32(15): 4348-4361, 2023 08.
Article in English | MEDLINE | ID: mdl-37271855

ABSTRACT

Speciation, the continuous process by which new species form, is often investigated by looking at the variation of nucleotide diversity and differentiation across the genome (hereafter genomic landscapes). A key challenge lies in how to determine the main evolutionary forces at play shaping these patterns. One promising strategy, albeit little used to date, is to comparatively investigate these genomic landscapes as progression through time by using a series of species pairs along a divergence gradient. Here, we resequenced 201 whole-genomes from eight closely related Populus species, with pairs of species at different stages along the divergence gradient to learn more about speciation processes. Using population structure and ancestry analyses, we document extensive introgression between some species pairs, especially those with parapatric distributions. We further investigate genomic landscapes, focusing on within-species (i.e. nucleotide diversity and recombination rate) and among-species (i.e. relative and absolute divergence) summary statistics of diversity and divergence. We observe relatively conserved patterns of genomic divergence across species pairs. Independent of the stage across the divergence gradient, we find support for signatures of linked selection (i.e. the interaction between natural selection and genetic linkage) in shaping these genomic landscapes, along with gene flow and standing genetic variation. We highlight the importance of investigating genomic patterns on multiple species across a divergence gradient and discuss prospects to better understand the evolutionary forces shaping the genomic landscapes of diversity and differentiation.


Subject(s)
Populus, Populus/classification, Populus/genetics, Genetic Selection, Genetic Speciation, Gene Flow, Biological Evolution
19.
Electrophoresis ; 44(17-18): 1435-1445, 2023 09.
Article in English | MEDLINE | ID: mdl-37501329

ABSTRACT

Distant genetic relatives can be linked to a crime scene sample by computing identity-by-state (IBS) and identity-by-descent (IBD) shared by individuals. To test the methods of genetic genealogy estimation and optimize the parameters for forensic investigation, a family-based genetic genealogy analysis was performed using a dataset of 262 Han Chinese individuals from 11 families. The dataset covered relative pairs from the 1st to the 14th degree, but the 7th degree is the most distant kinship to be fully investigated, and each individual has ∼200 relatives within the 7th degree. The KING algorithm, which calculates IBS and IBD statistics, can correctly discriminate the first-degree relationships of monozygotic twins, parent-offspring, and full siblings. The inferred relationship was reliable within the fifth degree, with a false positive rate <1.8%. The IBD segment algorithm, GERMLINE + ERSA, could provide reliable inference out to the eighth degree. Analysis of IBD segments produced false negatives (<27.4%) rather than false positives (0%) within the eighth-degree inferences. We studied different minimum IBD segment threshold settings (changed from >0 to 6 cM); the inferred results did not differ much. In distant relative analysis, genetically undetectable relationships begin to occur from the sixth degree (second cousin once removed), which means the offspring after seven meiotic divisions may share no ancestral IBD segment at all. KING and GERMLINE + ERSA worked complementarily to ensure accurate inference from the first degree to the eighth degree. Using simulated low call rate data, the KING algorithm shows better tolerance to marker decrease compared with the GERMLINE + ERSA segment algorithm.


Subject(s)
East Asian People, Forensic Genetics, Single Nucleotide Polymorphism, Humans, Algorithms, Pedigree
20.
J Hered ; 114(5): 504-512, 2023 08 23.
Article in English | MEDLINE | ID: mdl-37381815

ABSTRACT

Several methods exist for detecting genetic relatedness or identity by comparing DNA information. These methods generally require genotype calls, either single-nucleotide polymorphisms or short tandem repeats, at the sites used for comparison. For some DNA samples, like those obtained from bone fragments or single rootless hairs, there is often not enough DNA present to generate genotype calls that are accurate and complete enough for these comparisons. Here, we describe IBDGem, a fast and robust computational procedure for detecting genomic regions of identity-by-descent by comparing low-coverage shotgun sequence data against genotype calls from a known query individual. At less than 1× genome coverage, IBDGem reliably detects segments of relatedness and can make high-confidence identity detections with as little as 0.01× genome coverage.


Subject(s)
Genome, Genomics, Genotype, DNA Sequence Analysis, DNA, Single Nucleotide Polymorphism, High-Throughput Nucleotide Sequencing/methods