Search | Secretaria de Estado da Saúde

1.

Estimating the genome-wide mutation rate from thousands of unrelated individuals.

Tian, Xiaowen; Cai, Ruoyi; Browning, Sharon R.

Am J Hum Genet ; 109(12): 2178-2184, 2022 12 01.

Article in English | MEDLINE | ID: mdl-36370709

ABSTRACT

We provide a method for estimating the genome-wide mutation rate from sequence data on unrelated individuals by using segments of identity by descent (IBD). The length of an IBD segment indicates the time to shared ancestor of the segment, and mutations that have occurred since the shared ancestor result in discordances between the two IBD haplotypes. Previous methods for IBD-based estimation of mutation rate have required the use of family data for accurate phasing of the genotypes. This has limited the scope of application of IBD-based mutation rate estimation. Here, we develop an IBD-based method for mutation rate estimation from population data, and we apply it to whole-genome sequence data on 4,166 European American individuals from the TOPMed Framingham Heart Study, 2,996 European American individuals from the TOPMed My Life, Our Future study, and 1,586 African American individuals from the TOPMed Hypertension Genetic Epidemiology Network study. Although mutation rates may differ between populations as a result of genetic factors, demographic factors such as average parental age, and environmental exposures, our results are consistent with equal genome-wide average mutation rates across these three populations. Our overall estimate of the average genome-wide mutation rate per 10^8 base pairs per generation for single-nucleotide variants is 1.24 (95% CI 1.18-1.33).
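As a rough illustration of the reasoning in this abstract (not the authors' estimator): if two haplotypes share a segment of L base pairs inherited from a common ancestor g generations ago, they are separated by 2g meioses, so the expected number of mutational discordances is about 2 x mu x g x L. The Python sketch below uses hypothetical function names and made-up example inputs to evaluate that expectation and invert it. It ignores genotype error, gene conversion, and uncertainty in g, all of which a real analysis would have to handle.

```python
def expected_discordances(mu_per_bp, generations, length_bp):
    """Expected mutations separating two IBD haplotypes.

    Two lineages, each `generations` long, over `length_bp` base pairs.
    """
    return 2 * mu_per_bp * generations * length_bp


def naive_mutation_rate(discordances, generations, length_bp):
    """Invert the expectation above (toy moment estimator, not the published method)."""
    return discordances / (2 * generations * length_bp)


if __name__ == "__main__":
    mu = 1.24e-8  # 1.24 per 10^8 bp per generation, the estimate quoted in the abstract
    # Hypothetical segment: 2 Mb long, common ancestor 25 generations ago.
    print(expected_discordances(mu, 25, 2_000_000))
    print(naive_mutation_rate(1.24, 25, 2_000_000))
```

Under these assumed inputs the expectation is on the order of one discordance per segment, which suggests why pooling many segments matters for a stable estimate.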

Subjects

Human Genome, Mutation Rate, Humans, Human Genome/genetics, Single Nucleotide Polymorphism/genetics, Haplotypes, Genotype

2.

Revealing the recent demographic history of Europe via haplotype sharing in the UK Biobank.

Gilbert, Edmund; Shanmugam, Ashwini; Cavalleri, Gianpiero L.

Proc Natl Acad Sci U S A ; 119(25): e2119281119, 2022 06 21.

Article in English | MEDLINE | ID: mdl-35696575

ABSTRACT

Haplotype-based analyses have recently been leveraged to interrogate the fine-scale structure in specific geographic regions, notably in Europe, although an equivalent haplotype-based understanding across the whole of Europe with these tools is lacking. Furthermore, study of identity-by-descent (IBD) sharing in a large sample of haplotypes across Europe would allow a direct comparison between different demographic histories of different regions. The UK Biobank (UKBB) is a population-scale dataset of genotype and phenotype data collected from the United Kingdom, with established sampling of worldwide ancestries. The exact content of these non-UK ancestries is largely uncharacterized, where study could highlight valuable intracontinental ancestry references with deep phenotyping within the UKBB. In this context, we sought to investigate the sample of European ancestry captured in the UKBB. We studied the haplotypes of 5,500 UKBB individuals with a European birthplace; investigated the population structure and demographic history in Europe, showing in parallel the variety of footprints of demographic history in different genetic regions around Europe; and expand knowledge of the genetic landscape of the east and southeast of Europe. Providing an updated map of European genetics, we leverage IBD-segment sharing to explore the extent of population isolation and size across the continent. In addition to building and expanding upon previous knowledge in Europe, our results show the UKBB as a source of diverse ancestries beyond Britain. These worldwide ancestries sampled in the UKBB may complement and inform researchers interested in specific communities or regions not limited to Britain.

Subjects

Haplotypes, Population, Genetic Databases, Datasets as Topic, Demography, Europe, Genetic Variation, Population/genetics

3.

Ancient and modern genomics of the Ohlone Indigenous population of California.

Severson, Alissa L; Byrd, Brian F; Mallott, Elizabeth K; Owings, Amanda C; DeGiorgio, Michael; de Flamingh, Alida; Nijmeh, Charlene; Arellano, Monica V; Leventhal, Alan; Rosenberg, Noah A; Malhi, Ripan S.

Proc Natl Acad Sci U S A ; 119(13): e2111533119, 2022 03 29.

Article in English | MEDLINE | ID: mdl-35312358

ABSTRACT

Significance: California supports a high cultural and linguistic diversity of Indigenous peoples. In a partnership of researchers with the Muwekma Ohlone tribe, we studied genomes of eight present-day tribal members and 12 ancient individuals from two archaeological sites in the San Francisco Bay Area, spanning ~2,000 y. We find that compared to genomes of Indigenous individuals from throughout the Americas, the 12 ancient individuals are most genetically similar to ancient individuals from Southern California, and that despite spanning a large time period, they share distinctive ancestry. This ancestry is also shared with present-day tribal members, providing evidence of genetic continuity between past and present Indigenous individuals in the region, in contrast to some popular reconstructions based on archaeological and linguistic information.

Subjects

Genomics, Indigenous Peoples, Archaeology, Ancient DNA, Population Genetics, Ancient History, Humans, Linguistics, San Francisco

4.

Distinguishing pedigree relationships via multi-way identity by descent sharing and sex-specific genetic maps.

Qiao, Ying; Sannerud, Jens G; Basu-Roy, Sayantani; Hayward, Caroline; Williams, Amy L.

Am J Hum Genet ; 108(1): 68-83, 2021 01 07.

Article in English | MEDLINE | ID: mdl-33385324

ABSTRACT

The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported and current inference methods typically detect only the degree of relatedness of sample pairs and not pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identity by descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences in sex-specific genetic maps to classify pairs as maternally or paternally related (e.g., paternal half-siblings) using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5%-100% of grandparent-grandchild (GP) pairs, 80.0%-97.5% of avuncular (AV) pairs, and 75.5%-98.5% of half-sibling (HS) pairs, compared to PADRE's rates of 38.5%-76.0% of GP, 60.5%-92.0% of AV, and 73.0%-95.0% of HS pairs. Turning to the real 20,032-sample Generation Scotland (GS) dataset, CREST identified seven pedigrees with incorrect relationship types or maternal/paternal parent sexes, five of which we confirmed as mistakes, and two with uncertain relationships. After correcting these, CREST correctly determines relationship types for 93.5% of GP, 97.7% of AV, and 92.2% of HS pairs that have sufficient mutual relative data; the parent sex in 100% of HS and 99.6% of GP pairs; and it completes this analysis in 2.8 h including IBD detection in eight threads.

Subjects

Human Genome/genetics, Female, Genetic Linkage/genetics, Genotype, Humans, Male, Genetic Models, Pedigree, Scotland

5.

High prevalence of multilocus pathogenic variation in neurodevelopmental disorders in the Turkish population.

Mitani, Tadahiro; Isikay, Sedat; Gezdirici, Alper; Gulec, Elif Yilmaz; Punetha, Jaya; Fatih, Jawid M; Herman, Isabella; Akay, Gulsen; Du, Haowei; Calame, Daniel G; Ayaz, Akif; Tos, Tulay; Yesil, Gozde; Aydin, Hatip; Geckinli, Bilgen; Elcioglu, Nursel; Candan, Sukru; Sezer, Ozlem; Erdem, Haktan Bagis; Gul, Davut; Demiral, Emine; Elmas, Muhsin; Yesilbas, Osman; Kilic, Betul; Gungor, Serdal; Ceylan, Ahmet C; Bozdogan, Sevcan; Ozalp, Ozge; Cicek, Salih; Aslan, Huseyin; Yalcintepe, Sinem; Topcu, Vehap; Bayram, Yavuz; Grochowski, Christopher M; Jolly, Angad; Dawood, Moez; Duan, Ruizhi; Jhangiani, Shalini N; Doddapaneni, Harsha; Hu, Jianhong; Muzny, Donna M; Marafi, Dana; Akdemir, Zeynep Coban; Karaca, Ender; Carvalho, Claudia M B; Gibbs, Richard A; Posey, Jennifer E; Lupski, James R; Pehlivan, Davut.

Am J Hum Genet ; 108(10): 1981-2005, 2021 10 07.

Article in English | MEDLINE | ID: mdl-34582790

ABSTRACT

Neurodevelopmental disorders (NDDs) are clinically and genetically heterogenous; many such disorders are secondary to perturbation in brain development and/or function. The prevalence of NDDs is > 3%, resulting in significant sociocultural and economic challenges to society. With recent advances in family-based genomics, rare-variant analyses, and further exploration of the Clan Genomics hypothesis, there has been a logarithmic explosion in neurogenetic "disease-associated genes" molecular etiology and biology of NDDs; however, the majority of NDDs remain molecularly undiagnosed. We applied genome-wide screening technologies, including exome sequencing (ES) and whole-genome sequencing (WGS), to identify the molecular etiology of 234 newly enrolled subjects and 20 previously unsolved Turkish NDD families. In 176 of the 234 studied families (75.2%), a plausible and genetically parsimonious molecular etiology was identified. Out of 176 solved families, deleterious variants were identified in 218 distinct genes, further documenting the enormous genetic heterogeneity and diverse perturbations in human biology underlying NDDs. We propose 86 candidate disease-trait-associated genes for an NDD phenotype. Importantly, on the basis of objective and internally established variant prioritization criteria, we identified 51 families (51/176 = 28.9%) with multilocus pathogenic variation (MPV), mostly driven by runs of homozygosity (ROHs) - reflecting genomic segments/haplotypes that are identical-by-descent. Furthermore, with the use of additional bioinformatic tools and expansion of ES to additional family members, we established a molecular diagnosis in 5 out of 20 families (25%) who remained undiagnosed in our previously studied NDD cohort emanating from Turkey.

Subjects

Genomics/methods, Mutation, Neurodevelopmental Disorders/epidemiology, Phenotype, Adolescent, Adult, Child, Preschool Child, Cohort Studies, Female, Humans, Infant, Newborn Infant, Male, Middle Aged, Neurodevelopmental Disorders/genetics, Neurodevelopmental Disorders/pathology, Pedigree, Prevalence, Turkey/epidemiology, Exome Sequencing, Young Adult

6.

Patterns of genetic connectedness between modern and medieval Estonian genomes reveal the origins of a major ancestry component of the Finnish population.

Kivisild, Toomas; Saag, Lehti; Hui, Ruoyun; Biagini, Simone Andrea; Pankratov, Vasili; D'Atanasio, Eugenia; Pagani, Luca; Saag, Lauri; Rootsi, Siiri; Mägi, Reedik; Metspalu, Ene; Valk, Heiki; Malve, Martin; Irdt, Kadri; Reisberg, Tuuli; Solnik, Anu; Scheib, Christiana L; Seidman, Daniel N; Williams, Amy L; Tambets, Kristiina; Metspalu, Mait.

Am J Hum Genet ; 108(9): 1792-1806, 2021 09 02.

Article in English | MEDLINE | ID: mdl-34411538

ABSTRACT

The Finnish population is a unique example of a genetic isolate affected by a recent founder event. Previous studies have suggested that the ancestors of Finnic-speaking Finns and Estonians reached the circum-Baltic region by the 1st millennium BC. However, high linguistic similarity points to a more recent split of their languages. To study genetic connectedness between Finns and Estonians directly, we first assessed the efficacy of imputation of low-coverage ancient genomes by sequencing a medieval Estonian genome to high depth (23×) and evaluated the performance of its down-sampled replicas. We find that ancient genomes imputed from >0.1× coverage can be reliably used in principal-component analyses without projection. By searching for long shared allele intervals (LSAIs; similar to identity-by-descent segments) in unphased data for >143,000 present-day Estonians, 99 Finns, and 14 imputed ancient genomes from Estonia, we find unexpectedly high levels of individual connectedness between Estonians and Finns for the last eight centuries in contrast to their clear differentiation by allele frequencies. High levels of sharing of these segments between Estonians and Finns predate the demographic expansion and late settlement process of Finland. One plausible source of this extensive sharing is the 8th-10th centuries AD migration event from North Estonia to Finland that has been proposed to explain uniquely shared linguistic features between the Finnish language and the northern dialect of Estonian and shared Christianity-related loanwords from Slavic. These results suggest that LSAI detection provides a computationally tractable way to detect fine-scale structure in large cohorts.

Subjects

Alleles, Ancient DNA/analysis, Human Genome, Human Migration/history, Pedigree, Estonia, Female, Finland, Gene Frequency, Genealogy and Heraldry, High-Throughput Nucleotide Sequencing, 21st Century History, Ancient History, Medieval History, Humans, Language/history, Male

7.

Leveraging health systems data to characterize a large effect variant conferring risk for liver disease in Puerto Ricans.

Belbin, Gillian M; Rutledge, Stephanie; Dodatko, Tetyana; Cullina, Sinead; Turchin, Michael C; Kohli, Sumita; Torre, Denis; Yee, Muh-Ching; Gignoux, Christopher R; Abul-Husn, Noura S; Houten, Sander M; Kenny, Eimear E.

Am J Hum Genet ; 108(11): 2099-2111, 2021 11 04.

Article in English | MEDLINE | ID: mdl-34678161

ABSTRACT

The integration of genomic data into health systems offers opportunities to identify genomic factors underlying the continuum of rare and common disease. We applied a population-scale haplotype association approach based on identity-by-descent (IBD) in a large multi-ethnic biobank to a spectrum of disease outcomes derived from electronic health records (EHRs) and uncovered a risk locus for liver disease. We used genome sequencing and in silico approaches to fine-map the signal to a non-coding variant (c.2784-12T>C) in the gene ABCB4. In vitro analysis confirmed the variant disrupted splicing of the ABCB4 pre-mRNA. Four of five homozygotes had evidence of advanced liver disease, and there was a significant association with liver disease among heterozygotes, suggesting the variant is linked to increased risk of liver disease in an allele dose-dependent manner. Population-level screening revealed the variant to be at a carrier rate of 1.95% in Puerto Rican individuals, likely as the result of a Puerto Rican founder effect. This work demonstrates that integrating EHR and genomic data at a population scale can facilitate strategies for understanding the continuum of genomic risk for common diseases, particularly in populations underrepresented in genomic medicine.

Subjects

Delivery of Health Care/organization & administration, Genetic Predisposition to Disease, Liver Diseases/genetics, ATP Binding Cassette Transporter Subfamily B/genetics, Electronic Health Records, Haplotypes, Heterozygote, Hispanic or Latino/genetics, Homozygote, Humans, Puerto Rico

8.

Characterizing identity by descent segments in Chinese interpopulation unrelated individual pairs.

Ji, Qiqi; Yao, Yining; Li, Zhimin; Zhou, Zhihan; Qian, Jinglei; Tang, Qiqun; Xie, Jianhui.

Mol Genet Genomics ; 299(1): 37, 2024 Mar 18.

Article in English | MEDLINE | ID: mdl-38494535

ABSTRACT

Identity by descent (IBD) segments, uninterrupted DNA segments derived from the same ancestral chromosomes, are widely used as indicators of relationships in genetics. A great deal of research focuses on IBD segments between related pairs, while statistical analyses of segments in unrelated individuals are rare. In this study, we investigated the basic informative features of IBD segments in unrelated pairs in Chinese populations from the 1000 Genomes Project. A total of 5922 IBD segments in Chinese interpopulation unrelated individual pairs were detected via IBIS, and the average segment length was 3.71 Mb. It was found that 17.86% of unrelated pairs shared at least one IBD segment in the Chinese cohort. Furthermore, a total of 49 chromosomal regions where IBD segments clustered in high abundance were identified, which might be sharing hotspots in the human genome. Such regions could also be observed in other ancestry populations, which implies that similar IBD backgrounds also exist. Altogether, these results demonstrated the distribution of common background IBD segments, which helps improve the accuracy of pedigree studies based on IBD analysis.

Subjects

Asian People, Human Genome, Humans, Asian People/genetics, Human Genome/genetics, Pedigree, Research Design, China

9.

Dissecting the genetic architecture of quantitative traits using genome-wide identity-by-descent sharing.

Fraimout, Antoine; Guillaume, Frédéric; Li, Zitong; Sillanpää, Mikko J; Rastas, Pasi; Merilä, Juha.

Mol Ecol ; 33(6): e17299, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38380534

ABSTRACT

Additive and dominance genetic variances underlying the expression of quantitative traits are important quantities for predicting short-term responses to selection, but they are notoriously challenging to estimate in most non-model wild populations. Specifically, large-sized or panmictic populations may be characterized by low variance in genetic relatedness among individuals which, in turn, can prevent accurate estimation of quantitative genetic parameters. We used estimates of genome-wide identity-by-descent (IBD) sharing from autosomal SNP loci to estimate quantitative genetic parameters for ecologically important traits in nine-spined sticklebacks (Pungitius pungitius) from a large, outbred population. Using empirical and simulated datasets, with varying sample sizes and pedigree complexity, we assessed the performance of different crossing schemes in estimating additive genetic variance and heritability for all traits. We found that low variance in relatedness characteristic of wild outbred populations with high migration rate can impair the estimation of quantitative genetic parameters and bias heritability estimates downwards. On the other hand, the use of a half-sib/full-sib design allowed precise estimation of genetic variance components and revealed significant additive variance and heritability for all measured traits, with negligible dominance contributions. Genome-partitioning and QTL mapping analyses revealed that most traits had a polygenic basis and were controlled by genes at multiple chromosomes. Furthermore, different QTL contributed to variation in the same traits in different populations suggesting heterogeneous underpinnings of parallel evolution at the phenotypic level. Our results provide important guidelines for future studies aimed at estimating adaptive potential in the wild, particularly for those conducted in outbred large-sized populations.

Subjects

Genome, Multifactorial Inheritance, Humans, Genome/genetics, Chromosome Mapping, Phenotype, Genetic Models, Single Nucleotide Polymorphism/genetics

10.

The coalescent in finite populations with arbitrary, fixed structure.

Allen, Benjamin; McAvoy, Alex.

Theor Popul Biol ; 158: 150-169, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38880430

ABSTRACT

The coalescent is a stochastic process representing ancestral lineages in a population undergoing neutral genetic drift. Originally defined for a well-mixed population, the coalescent has been adapted in various ways to accommodate spatial, age, and class structure, along with other features of real-world populations. To further extend the range of population structures to which coalescent theory applies, we formulate a coalescent process for a broad class of neutral drift models with arbitrary - but fixed - spatial, age, sex, and class structure, haploid or diploid genetics, and any fixed mating pattern. Here, the coalescent is represented as a random sequence of mappings [Formula: see text] from a finite set G to itself. The set G represents the "sites" (in individuals, in particular locations and/or classes) at which these alleles can live. The state of the coalescent, C_t: G → G, maps each site g∈G to the site containing g's ancestor, t time-steps into the past. Using this representation, we define and analyze coalescence time, coalescence branch length, mutations prior to coalescence, and stationary probabilities of identity-by-descent and identity-by-state. For low mutation, we provide a recipe for computing identity-by-descent and identity-by-state probabilities via the coalescent. Applying our results to a diploid population with arbitrary sex ratio r, we find that measures of genetic dissimilarity, among any set of sites, are scaled by 4r(1-r) relative to the even sex ratio case.
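The mapping representation described in this abstract can be sketched in a few lines of Python. This is a minimal illustration only, assuming a well-mixed haploid population in which every site picks its parent site uniformly at random at each step; the paper's framework covers far more general structures, and all function names here are hypothetical. The last function simply evaluates the 4r(1-r) scaling factor quoted in the abstract.

```python
import random


def step_map(sites, rng):
    """One time-step into the past: each site picks a parent site uniformly (assumed model)."""
    return {g: rng.choice(sites) for g in sites}


def compose(c_t, parent_of):
    """Extend C_t by one step: the ancestor of g at t+1 is the parent of C_t(g)."""
    return {g: parent_of[a] for g, a in c_t.items()}


def coalescence_time(sites, rng, max_steps=10_000):
    """Number of steps until all sites map to one common ancestral site, or None."""
    c = {g: g for g in sites}  # C_0 is the identity map
    for t in range(1, max_steps + 1):
        c = compose(c, step_map(sites, rng))
        if len(set(c.values())) == 1:
            return t
    return None


def sex_ratio_scaling(r):
    """Scaling factor 4r(1-r) from the abstract; equals 1 at the even sex ratio r = 0.5."""
    return 4 * r * (1 - r)


if __name__ == "__main__":
    print(coalescence_time(list(range(5)), random.Random(0)))
    print(sex_ratio_scaling(0.5), sex_ratio_scaling(0.25))
```

Storing C_t as a dictionary makes the composition rule explicit: once every site maps to the same ancestor, every later map does too, so the first such t is the coalescence time of the whole set.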

Subjects

Genetic Drift, Population Genetics, Genetic Models, Mutation, Stochastic Processes, Humans, Diploidy

11.

Large regions of homozygosity in prenatal diagnosis.

Ma, Di; Ye, Mei; Hu, Wenlong; Gao, Hui; Wang, Lijuan; Song, Yaqin; Nie, Rui; Hu, Zhiyang; Guo, Hui.

Am J Med Genet A ; : e63712, 2024 May 17.

Article in English | MEDLINE | ID: mdl-38757552

ABSTRACT

Chromosomal microarrays (CMA) incorporate single nucleotide polymorphisms to enable the detection of regions of homozygosity (ROH). Here, we retrospectively analyzed 6288 prenatal cases that underwent CMA to explore the clinical implications of large ROH in prenatal diagnosis. We analyzed cases with ROH larger than 10 megabases and reviewed the ultrasound findings, karyotype results, and pregnancy follow-up data. Cases with possible imprinting disorders were assessed by methylation-specific multiplex ligation-dependent probe amplification. In total, we identified 50 cases with large ROH, and chromosomes 1 and 2 were the most affected. About 59.18% of the ROH cases had ultrasound abnormalities, with the most common findings being ultrasound soft-marker abnormalities. Seven fetuses had ROH that covered almost the entire chromosome and four had terminal ROH that involved almost the entire long arm of the chromosome, which indicated uniparental disomy (UPD), of which 70% showed abnormal ultrasound findings. Ten cases with multiple ROH on different chromosomes indicated the third to fifth degree of consanguinity. In this study, we highlighted the clinical relevance of large ROH related to UPD. The analysis of ROH allowed us to gain further understanding of complex cytogenetic and disease mechanisms in prenatal diagnosis.

12.

A Fast and Simple Method for Detecting Identity-by-Descent Segments in Large-Scale Data.

Zhou, Ying; Browning, Sharon R; Browning, Brian L.

Am J Hum Genet ; 106(4): 426-437, 2020 04 02.

Article in English | MEDLINE | ID: mdl-32169169

ABSTRACT

Segments of identity by descent (IBD) are used in many genetic analyses. We present a method for detecting identical-by-descent haplotype segments in phased genotype data. Our method, called hap-IBD, combines a compressed representation of haplotype data, the positional Burrows-Wheeler transform, and multi-threaded execution to produce very fast analysis times. An attractive feature of hap-IBD is its simplicity: the input parameters clearly and precisely define the IBD segments that are reported, so that program correctness can be confirmed by users. We evaluate hap-IBD and four state-of-the-art IBD segment detection methods (GERMLINE, iLASH, RaPID, and TRUFFLE) using UK Biobank chromosome 20 data and simulated sequence data. We show that hap-IBD detects IBD segments faster and more accurately than competing methods, and that hap-IBD is the only method that can rapidly and accurately detect short 2-4 centiMorgan (cM) IBD segments in the full UK Biobank data. Analysis of 485,346 UK Biobank samples through the use of hap-IBD with 12 computational threads detects 231.5 billion autosomal IBD segments with length ≥2 cM in 24.4 h.
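To make concrete what it means for input parameters to "clearly and precisely define" the reported segments, here is a deliberately naive Python sketch. It is not the hap-IBD algorithm (it uses no positional Burrows-Wheeler transform, and it handles neither gaps nor genotype errors); it only scans two phased haplotypes and reports maximal runs of identical alleles whose genetic length meets a minimum threshold. The function name, the `min_cm` parameter, and the toy data are all assumptions made for illustration.

```python
def ibs_segments(hap_a, hap_b, cm_pos, min_cm=2.0):
    """Return (start_index, end_index) of maximal identical runs at least min_cm long.

    hap_a, hap_b: allele sequences for two phased haplotypes at the same markers.
    cm_pos: genetic position (cM) of each marker, in increasing order.
    """
    if not (len(hap_a) == len(hap_b) == len(cm_pos)):
        raise ValueError("inputs must have the same number of markers")

    segments = []
    start = None  # index where the current identical run began
    for i in range(len(hap_a)):
        if hap_a[i] == hap_b[i]:
            if start is None:
                start = i
        elif start is not None:
            # Run ended at marker i-1; keep it only if it is long enough.
            if cm_pos[i - 1] - cm_pos[start] >= min_cm:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and cm_pos[-1] - cm_pos[start] >= min_cm:
        segments.append((start, len(hap_a) - 1))
    return segments


if __name__ == "__main__":
    a = [0, 1, 1, 0, 1, 0]
    b = [0, 1, 1, 1, 1, 0]
    pos = [0.0, 1.0, 2.5, 3.0, 3.5, 4.0]
    print(ibs_segments(a, b, pos, min_cm=2.0))
```

Because the output depends only on the allele sequences and the threshold, a user can check any reported segment by hand, which is the kind of verifiability the abstract emphasizes. A pairwise scan like this costs time quadratic in the number of haplotypes, so it would not scale to biobank-sized data.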

Subjects

Human Genome/genetics, DNA Sequence Analysis/methods, Alleles, Chromosomes/genetics, Computer Simulation, Data Analysis, Genetic Markers/genetics, Population Genetics/methods, Genotype, Haplotypes/genetics, Humans, Single Nucleotide Polymorphism/genetics, Software

13.

Probabilistic Estimation of Identity by Descent Segment Endpoints and Detection of Recent Selection.

Browning, Sharon R; Browning, Brian L.

Am J Hum Genet ; 107(5): 895-910, 2020 11 05.

Article in English | MEDLINE | ID: mdl-33053335

ABSTRACT

Most methods for fast detection of identity by descent (IBD) segments report identity by state segments without any quantification of the uncertainty in the endpoints and lengths of the IBD segments. We present a method for determining the posterior probability distribution of IBD segment endpoints. Our approach accounts for genotype errors, recent mutations, and gene conversions which disrupt DNA sequence identity within IBD segments, and it can be applied to large cohorts with whole-genome sequence or SNP array data. We find that our method's estimates of uncertainty are well calibrated for homogeneous samples. We quantify endpoint uncertainty for 77.7 billion IBD segments from 408,883 individuals of white British ancestry in the UK Biobank, and we use these IBD segments to find regions showing evidence of recent natural selection. We show that many spurious selection signals are eliminated by the use of unbiased estimates of IBD segment endpoints and a pedigree-based genetic map. Eleven of the twelve regions with the greatest evidence for recent selection in our scan have been identified as selected in previous analyses using different approaches. Our computationally efficient method for quantifying IBD segment endpoint uncertainty is implemented in the open source ibd-ends software package.

Subjects

Biometric Identification/methods, Chromosome Mapping/statistics & numerical data, Human Genome, Inheritance Patterns, Statistical Models, Single Nucleotide Polymorphism, Biological Specimen Banks, Family, Female, Genome-Wide Association Study, Humans, Male, Pedigree, Software, Uncertainty, United Kingdom

14.

Rapid, Phase-free Detection of Long Identity-by-Descent Segments Enables Effective Relationship Classification.

Seidman, Daniel N; Shenoy, Sushila A; Kim, Minsoo; Babu, Ramya; Woods, Ian G; Dyer, Thomas D; Lehman, Donna M; Curran, Joanne E; Duggirala, Ravindranath; Blangero, John; Williams, Amy L.

Am J Hum Genet ; 106(4): 453-466, 2020 04 02.

Article in English | MEDLINE | ID: mdl-32197076

ABSTRACT

Identity-by-descent (IBD) segments are a useful tool for applications ranging from demographic inference to relationship classification, but most detection methods rely on phasing information and therefore require substantial computation time. As genetic datasets grow, methods for inferring IBD segments that scale well will be critical. We developed IBIS, an IBD detector that locates long regions of allele sharing between unphased individuals, and benchmarked it with Refined IBD, GERMLINE, and TRUFFLE on 3,000 simulated individuals. Phasing these with Beagle 5 takes 4.3 CPU days, followed by either Refined IBD or GERMLINE segment detection in 2.9 or 1.1 h, respectively. By comparison, IBIS finishes in 6.8 min or 7.8 min with IBD2 functionality enabled: speedups of 805-946× including phasing time. TRUFFLE takes 2.6 h, corresponding to IBIS speedups of 20.2-23.3×. IBIS is also accurate, inferring ≥7 cM IBD segments at quality comparable to Refined IBD and GERMLINE. With these segments, IBIS classifies first through third degree relatives in real Mexican American samples at rates meeting or exceeding other methods tested and identifies fourth through sixth degree pairs at rates within 0.0%-2.0% of the top method. While allele frequency-based approaches that do not detect segments can infer relationship degrees faster than IBIS, the fastest are biased in admixed samples, with KING inferring 30.8% fewer fifth degree Mexican American relatives correctly compared with IBIS. Finally, we ran IBIS on chromosome 2 of the UK Biobank dataset and estimate its runtime on the autosomes to be 3.3 days parallelized across 128 cores.

Subjects

Sequence Analysis/methods, Alleles, Human Chromosome Pair 2/genetics, Gene Frequency/genetics, Human Genome/genetics, Humans, Genetic Models, Single Nucleotide Polymorphism/genetics

15.

Population Histories of the United States Revealed through Fine-Scale Migration and Haplotype Analysis.

Dai, Chengzhen L; Vazifeh, Mohammad M; Yeang, Chen-Hsiang; Tachet, Remi; Wells, R Spencer; Vilar, Miguel G; Daly, Mark J; Ratti, Carlo; Martin, Alicia R.

Am J Hum Genet ; 106(3): 371-388, 2020 03 05.

Article in English | MEDLINE | ID: mdl-32142644

ABSTRACT

The population of the United States is shaped by centuries of migration, isolation, growth, and admixture between ancestors of global origins. Here, we assemble a comprehensive view of recent population history by studying the ancestry and population structure of more than 32,000 individuals in the US using genetic, ancestral birth origin, and geographic data from the National Geographic Genographic Project. We identify migration routes and barriers that reflect historical demographic events. We also uncover the spatial patterns of relatedness in subpopulations through the combination of haplotype clustering, ancestral birth origin analysis, and local ancestry inference. Examples of these patterns include substantial substructure and heterogeneity in Hispanics/Latinos, isolation-by-distance in African Americans, elevated levels of relatedness and homozygosity in Asian immigrants, and fine-scale structure in European descents. Taken together, our results provide detailed insights into the genetic structure and demographic history of the diverse US population.

Subjects

Emigration and Immigration, Population Genetics, Haplotypes, Cluster Analysis, Demography, Humans, United States

16.

Genetic Consequences of the Transatlantic Slave Trade in the Americas.

Micheletti, Steven J; Bryc, Kasia; Ancona Esselmann, Samantha G; Freyman, William A; Moreno, Meghan E; Poznik, G David; Shastri, Anjali J; Beleza, Sandra; Mountain, Joanna L.

Am J Hum Genet ; 107(2): 265-277, 2020 08 06.

Article in English | MEDLINE | ID: mdl-32707084

ABSTRACT

According to historical records of transatlantic slavery, traders forcibly deported an estimated 12.5 million people from ports along the Atlantic coastline of Africa between the 16th and 19th centuries, with global impacts reaching to the present day, more than a century and a half after slavery's abolition. Such records have fueled a broad understanding of the forced migration from Africa to the Americas yet remain underexplored in concert with genetic data. Here, we analyzed genotype array data from 50,281 research participants, which, combined with historical shipping documents, illustrate that the current genetic landscape of the Americas is largely concordant with expectations derived from documentation of slave voyages. For instance, genetic connections between people in slave trading regions of Africa and disembarkation regions of the Americas generally mirror the proportion of individuals forcibly moved between those regions. While some discordances can be explained by additional records of deportations within the Americas, other discordances yield insights into variable survival rates and timing of arrival of enslaved people from specific regions of Africa. Furthermore, the greater contribution of African women to the gene pool compared to African men varies across the Americas, consistent with literature documenting regional differences in slavery practices. This investigation of the transatlantic slave trade, which is broad in scope in terms of both datasets and analyses, establishes genetic links between individuals in the Americas and populations across Atlantic Africa, yielding a more comprehensive understanding of the African roots of peoples of the Americas.

Subjects

Black People/genetics; Polymorphism, Single Nucleotide/genetics; Africa; Americas; Enslaved Persons; Europe; Female; Humans; Male

17.

Drivers of genomic landscapes of differentiation across a Populus divergence gradient.

Shang, Huiying; Field, David L; Paun, Ovidiu; Rendón-Anaya, Martha; Hess, Jaqueline; Vogl, Claus; Liu, Jianquan; Ingvarsson, Pär K; Lexer, Christian; Leroy, Thibault.

Mol Ecol ; 32(15): 4348-4361, 2023 08.

Article in English | MEDLINE | ID: mdl-37271855

ABSTRACT

Speciation, the continuous process by which new species form, is often investigated by looking at the variation of nucleotide diversity and differentiation across the genome (hereafter genomic landscapes). A key challenge lies in determining the main evolutionary forces that shape these patterns. One promising strategy, albeit little used to date, is to compare these genomic landscapes as a progression through time by using a series of species pairs along a divergence gradient. Here, we resequenced 201 whole genomes from eight closely related Populus species, with pairs of species at different stages along the divergence gradient, to learn more about speciation processes. Using population structure and ancestry analyses, we document extensive introgression between some species pairs, especially those with parapatric distributions. We further investigate genomic landscapes, focusing on within-species (i.e. nucleotide diversity and recombination rate) and among-species (i.e. relative and absolute divergence) summary statistics of diversity and divergence. We observe relatively conserved patterns of genomic divergence across species pairs. Independent of the stage across the divergence gradient, we find support for signatures of linked selection (i.e. the interaction between natural selection and genetic linkage) in shaping these genomic landscapes, along with gene flow and standing genetic variation. We highlight the importance of investigating genomic patterns on multiple species across a divergence gradient and discuss prospects to better understand the evolutionary forces shaping the genomic landscapes of diversity and differentiation.
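
The within-species and among-species summary statistics named in this abstract can be illustrated with a minimal sketch. It assumes aligned, equal-length haplotype strings with no missing data; the function names and the toy sequences are hypothetical and are not taken from the study, which works on windows of whole-genome data.

```python
from itertools import combinations


def pairwise_differences(a: str, b: str) -> int:
    """Count the sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nucleotide_diversity(seqs: list[str]) -> float:
    """Within-population diversity (pi): mean pairwise differences per site."""
    n_sites = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * n_sites)


def absolute_divergence(pop1: list[str], pop2: list[str]) -> float:
    """Between-population divergence (dxy): mean differences per site
    over all pairs that take one sequence from each population."""
    n_sites = len(pop1[0])
    total = sum(pairwise_differences(a, b) for a in pop1 for b in pop2)
    return total / (len(pop1) * len(pop2) * n_sites)


# Toy haplotypes (illustrative only).
pop_a = ["AAAA", "AAAT"]
pop_b = ["TTAA", "TTAT"]
pi_a = nucleotide_diversity(pop_a)
dxy_ab = absolute_divergence(pop_a, pop_b)
```

In a genome scan these quantities would typically be computed per window along each chromosome, which is what produces a "landscape".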

Subjects

Populus; Populus/classification; Populus/genetics; Selection, Genetic; Genetic Speciation; Gene Flow; Biological Evolution

18.

Testing of two SNP array-based genealogy algorithms using extended Han Chinese pedigrees and recommendations for improved performances in forensic practice.

Liu, Jing; Wei, Yi-Liang; Yang, Lan; Jiang, Li; Zhao, Wen-Ting; Li, Cai-Xia.

Electrophoresis ; 44(17-18): 1435-1445, 2023 09.

Article in English | MEDLINE | ID: mdl-37501329

ABSTRACT

Distant genetic relatives can be linked to a crime scene sample by computing the identity-by-state (IBS) and identity-by-descent (IBD) shared between individuals. To test methods of genetic genealogy estimation and optimize their parameters for forensic investigation, a family-based genetic genealogy analysis was performed using a dataset of 262 Han Chinese individuals from 11 families. The dataset covered relative pairs from the 1st to the 14th degree, but the 7th degree was the most distant kinship to be fully investigated, and each individual has ~200 relatives within the 7th degree. The KING algorithm, which calculates IBS and IBD statistics, can correctly discriminate the first-degree relationships of monozygotic twin, parent-offspring and full sibling. Its inferred relationships were reliable within the fifth degree, with a false positive rate <1.8%. The IBD segment algorithm, GERMLINE + ERSA, could provide reliable inference out to the eighth degree. Within the eighth degree, errors from the analysis of IBD segments were false negatives (<27.4%) rather than false positives (0%). We studied different minimum IBD segment threshold settings (from >0 to 6 cM); the inferred results did not differ much. In distant relative analysis, genetically undetectable relationships begin to occur from the sixth degree (second cousin once removed), which means that relatives separated by seven meiotic divisions may share no ancestral IBD segment at all. KING and GERMLINE + ERSA worked complementarily to ensure accurate inference from the first degree to the eighth degree. Using simulated low call rate data, the KING algorithm shows better tolerance to a reduced number of markers than the GERMLINE + ERSA segment algorithm.
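
To make the degrees of relationship in this abstract concrete, here is a minimal sketch of the standard expectation that relatives of degree d share a fraction 2^-d of their genome identical by descent on average. The function name is hypothetical, and the sketch is not part of KING, GERMLINE or ERSA. Actual sharing varies around this expectation, which is why some distant pairs share no detectable segment.

```python
def expected_ibd_fraction(degree: int) -> float:
    """Expected fraction of the genome shared IBD by relatives of a given
    degree (coefficient of relatedness), using the rule 2 ** -degree."""
    if degree < 0:
        raise ValueError("degree must be non-negative")
    return 0.5 ** degree


# Expected sharing for the first through eighth degrees.
for d in range(1, 9):
    print(f"degree {d}: expected IBD fraction = {expected_ibd_fraction(d):.5f}")
```

Under this rule the expectation at the sixth degree is 1/64 of the genome, small enough that random variation in recombination and segregation can leave a pair with no shared segment.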

Subjects

East Asian People; Forensic Genetics; Polymorphism, Single Nucleotide; Humans; Algorithms; Pedigree

19.

A computational approach for positive genetic identification and relatedness detection from low-coverage shotgun sequencing data.

Nguyen, Remy; Kapp, Joshua D; Sacco, Samuel; Myers, Steven P; Green, Richard E.

J Hered ; 114(5): 504-512, 2023 08 23.

Article in English | MEDLINE | ID: mdl-37381815

ABSTRACT

Several methods exist for detecting genetic relatedness or identity by comparing DNA information. These methods generally require genotype calls, either single-nucleotide polymorphisms or short tandem repeats, at the sites used for comparison. For some DNA samples, like those obtained from bone fragments or single rootless hairs, there is often not enough DNA present to generate genotype calls that are accurate and complete enough for these comparisons. Here, we describe IBDGem, a fast and robust computational procedure for detecting genomic regions of identity-by-descent by comparing low-coverage shotgun sequence data against genotype calls from a known query individual. At less than 1× genome coverage, IBDGem reliably detects segments of relatedness and can make high-confidence identity detections with as little as 0.01× genome coverage.

Subjects

Genome; Genomics; Genotype; Sequence Analysis, DNA; DNA; Polymorphism, Single Nucleotide; High-Throughput Nucleotide Sequencing/methods

20.

Heritability estimates and predictive ability for pig meat quality traits using identity-by-state and identity-by-descent relationships in an F₂ population.

Angarita Barajas, Belcy Karine; Cantet, Rodolfo J C; Steibel, Juan P; Schrauf, Matias F; Forneris, Natalia S.

J Anim Breed Genet ; 140(1): 13-27, 2023 Jan.

Article in English | MEDLINE | ID: mdl-36300585

ABSTRACT

Genomic relationships can be computed with dense genome-wide genotypes through different methods, either based on identity-by-state (IBS) or identity-by-descent (IBD). The latter has been shown to increase the accuracy of both estimated relationships and predicted breeding values. However, it is not clear whether an IBD approach would achieve greater heritability ($h^2$) and predictive ability ($\hat{r}_{y,\hat{y}}$) than its IBS counterpart for data with low-depth pedigrees. Here, we compare both approaches in terms of the estimates of $h^2$ and $\hat{r}_{y,\hat{y}}$, using data on meat quality and carcass traits recorded in experimental crossbred pigs, with a pedigree constrained to only three generations. Three animal models were fitted, which differed in the relationship matrix: an IBS model ($G_{IBS}$), an IBD (defined within the known pedigree) model ($G_{IBD}$), and a pedigree model ($A_{22}$). In 9 of 20 traits, the estimates of $\sigma_u^2$ and $h^2$ were 1.2-2.9 times greater with the $G_{IBS}$ and $G_{IBD}$ models than with $A_{22}$, whereas for all traits both parameters were similar between the genomic models. The $\hat{r}_{y,\hat{y}}$ of the genomic models was higher than that of $A_{22}$. A slight increase in $\hat{r}_{y,\hat{y}}$ was found with $G_{IBS}$ when compared to $G_{IBD}$, most likely due to the former recovering sizeable relationships among founder F0 animals.
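
As an illustration of the quantities compared in this abstract, the sketch below builds an IBS-type genomic relationship matrix and computes heritability from variance components. It assumes the widely used VanRaden construction, G = ZZ' / (2 Σ p(1 - p)), with genotypes coded 0/1/2 and allele frequencies estimated from the sample. The abstract does not state how its IBS matrix was built, so this choice, the function names, and the toy data are all assumptions.

```python
import numpy as np


def ibs_relationship_matrix(genotypes) -> np.ndarray:
    """IBS-type genomic relationship matrix (VanRaden-style, assumed here).

    genotypes: individuals x markers array of allele counts (0, 1 or 2).
    """
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0           # sample allele frequency per marker
    z = m - 2.0 * p                    # centre each marker column
    scale = 2.0 * np.sum(p * (1.0 - p))
    if scale == 0.0:
        raise ValueError("all markers are monomorphic")
    return z @ z.T / scale


def heritability(var_u: float, var_e: float) -> float:
    """Heritability as additive variance over total variance:
    h2 = var_u / (var_u + var_e)."""
    return var_u / (var_u + var_e)


# Toy data: 3 individuals x 3 markers (illustrative only).
toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
g_ibs = ibs_relationship_matrix(toy)
h2 = heritability(var_u=2.0, var_e=6.0)
```

In an animal model the additive variance would be estimated with the chosen relationship matrix in place of the pedigree matrix, so differences between matrices feed directly into the heritability estimate.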

Subjects

Pork Meat; Animals; Swine/genetics; Genomics
