Results 1 - 20 of 5,798
1.
Hum Genomics ; 18(1): 44, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50% of cases, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics: a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance were highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria are needed.
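The abstract above names a maximum F-measure over all EPCR values but does not spell out how it was computed. The following is a minimal sketch of one common reading, assuming the F-measure is the F1 score (harmonic mean of precision and recall) and that every distinct EPCR value is tried as a cutoff. The function name `max_f_measure` and the input layout are hypothetical and are not taken from the challenge itself.

```python
from typing import Iterable, List, Tuple


def max_f_measure(predictions: Iterable[Tuple[float, bool]], n_causal: int) -> float:
    """Return the largest F1 obtained over all score cutoffs.

    predictions: (score, is_causal) pairs, one per submitted variant
                 (score plays the role of an EPCR value; assumed layout).
    n_causal:    total number of true causal variants, including any
                 that the model never submitted.
    """
    ranked: List[Tuple[float, bool]] = sorted(
        predictions, key=lambda p: p[0], reverse=True
    )
    best = 0.0
    true_pos = 0
    for i, (score, is_causal) in enumerate(ranked):
        true_pos += int(is_causal)
        # Evaluate only at the end of a run of tied scores, so that a
        # cutoff never separates variants sharing the same score.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        predicted = i + 1
        precision = true_pos / predicted
        recall = true_pos / n_causal if n_causal else 0.0
        if precision + recall > 0:
            f1 = 2 * precision * recall / (precision + recall)
            best = max(best, f1)
    return best


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = [(0.9, True), (0.8, False), (0.7, True), (0.1, False)]
    print(max_f_measure(toy, n_causal=2))
```

Passing `n_causal` separately means that causal variants a model never submitted still lower recall, which seems the natural choice for a ranking challenge, though the abstract does not confirm it.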


Subject(s)
Rare Diseases , Humans , Rare Diseases/genetics , Rare Diseases/diagnosis , Genome, Human/genetics , Genetic Variation/genetics , Computational Biology/methods , Phenotype
2.
Database (Oxford) ; 2024, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38602506

ABSTRACT

Short Tandem Repeats (STRs) are genetic markers made up of repeating DNA sequences. Variations in STRs are widely studied in forensic analysis, population studies and genetic testing for a variety of neuromuscular disorders. Understanding polymorphic STR variation and its cause is crucial for deciphering genetic information and finding links to various disorders. In this paper, we present STRIDE-DB, a novel and unique platform to explore STR Instability and its Phenotypic Relevance, and a comprehensive database of STRs in the human genome. We utilized RepeatMasker to identify all the STRs in the human genome (hg19) and combined them with frequency data from the 1000 Genomes Project. STRIDE-DB, a user-friendly resource, plays a pivotal role in investigating the relationship between STR variation, instability and phenotype. By harnessing data from genome-wide association studies (GWAS), the ClinVar database, Alu loci, haploblocks in the genome and conservation of the STRs, it serves as an important tool for researchers exploring the variability of STRs in the human genome and its direct impact on phenotypes. STRIDE-DB has broad applicability and significance in research domains such as forensic science and repeat expansion disorders. Database URL: https://stridedb.igib.res.in.


Subject(s)
Genome, Human , Genome-Wide Association Study , Humans , Genome, Human/genetics , Phenotype , Microsatellite Repeats/genetics , Databases, Factual
3.
PLoS Genet ; 20(3): e1011144, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38507461

ABSTRACT

Across the human genome, there are large-scale fluctuations in genetic diversity caused by the indirect effects of selection. This "linked selection signal" reflects the impact of selection according to the physical placement of functional regions and recombination rates along chromosomes. Previous work has shown that purifying selection acting against the steady influx of new deleterious mutations at functional portions of the genome shapes patterns of genomic variation. To date, statistical efforts to estimate purifying selection parameters from linked selection models have relied on classic Background Selection theory, which is only applicable when new mutations are so deleterious that they cannot fix in the population. Here, we develop a statistical method based on a quantitative genetics view of linked selection, that models how polygenic additive fitness variance distributed along the genome increases the rate of stochastic allele frequency change. By jointly predicting the equilibrium fitness variance and substitution rate due to both strong and weakly deleterious mutations, we estimate the distribution of fitness effects (DFE) and mutation rate across three geographically distinct human samples. While our model can accommodate weaker selection, we find evidence of strong selection operating similarly across all human samples. Although our quantitative genetic model of linked selection fits better than previous models, substitution rates of the most constrained sites disagree with observed divergence levels. We find that a model incorporating selective interference better predicts observed divergence in conserved regions, but overall our results suggest uncertainty remains about the processes generating fitness variation in humans.


Subject(s)
Models, Genetic , Selection, Genetic , Humans , Evolution, Molecular , Gene Frequency/genetics , Mutation , Genome, Human/genetics , Genetic Variation , Genetic Fitness
4.
Biotechniques ; 76(5): 216-223, 2024 May.
Article in English | MEDLINE | ID: mdl-38530148

ABSTRACT

Ancient DNA (aDNA) obtained from human remains is typically fragmented and present in relatively low amounts. Here we investigate a set of optimal methods for producing aDNA data by comparing silica-based DNA extraction and aDNA library preparation protocols. We also test the efficiency of whole-genome enrichment (WGC) on ancient human samples by modifying a number of parameter combinations. We find that the Dabney extraction protocol performs significantly better than alternatives. We further observed a positive trend with the BEST library protocol indicating lower clonality. Notably, our results suggest that WGC is effective at retrieving endogenous DNA, particularly from poorly-preserved human samples, by increasing human endogenous proportions by 5x. Thus, aDNA studies will be most likely to benefit from our results.


Subject(s)
DNA, Ancient , Genome, Human , DNA, Ancient/analysis , DNA, Ancient/isolation & purification , Humans , Genome, Human/genetics , Gene Library , Sequence Analysis, DNA/methods , Silicon Dioxide/chemistry
5.
Mol Genet Genomics ; 299(1): 37, 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38494535

ABSTRACT

Identity by descent (IBD) segments, uninterrupted DNA segments derived from the same ancestral chromosomes, are widely used as indicators of relationships in genetics. A great deal of research focuses on IBD segments between related pairs, while statistical analyses of segments in unrelated individuals are rare. In this study, we investigated the basic informative features of IBD segments in unrelated pairs in Chinese populations from the 1000 Genomes Project. A total of 5922 IBD segments in Chinese interpopulation unrelated individual pairs were detected via IBIS, and the average IBD segment length was 3.71 Mb. It was found that 17.86% of unrelated pairs shared at least one IBD segment in the Chinese cohort. Furthermore, a total of 49 chromosomal regions where IBD segments clustered in high abundance were identified, which might be sharing hotspots in the human genome. Such regions could also be observed in other ancestry populations, which implies that similar IBD backgrounds also exist. Altogether, these results demonstrated the distribution of common background IBD segments, which helps improve the accuracy in pedigree studies based on IBD analysis.


Subject(s)
Asian People , Genome, Human , Humans , Asian People/genetics , Genome, Human/genetics , Pedigree , Research Design , China
6.
Pharmacogenet Genomics ; 34(4): 130-134, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38359167

ABSTRACT

The use of genome-wide genotyping arrays in pharmacogenomics (PGx) research and clinical implementation applications is increasing, but it is unclear which arrays are best suited for these applications. Here, we conduct a comparative coverage analysis of PGx alleles included on genome-wide genotyping arrays, with an emphasis on alleles in genes with PGx-based prescribing guidelines. Genomic manifest files for seven arrays, including the Axiom Precision Medicine Diversity Array (PMDA), Axiom PMDA Plus, Axiom PangenomiX, Axiom PangenomiX Plus, Infinium Global Screening Array, Infinium Global Diversity Array (GDA) and Infinium GDA with enhanced PGx (GDA-PGx) Array, were evaluated for coverage of 523 star alleles across 19 pharmacogenes included in prescribing guidelines developed by the Clinical Pharmacogenetics Implementation Consortium and Dutch Pharmacogenomics Working Group. Specific attention was given to coverage of the Association for Molecular Pathology's Tier 1 and Tier 2 allele sets for CYP2C9, CYP2C19, CYP2D6, CYP3A4, CYP3A5, NUDT15, TPMT and VKORC1. Coverage of the examined PGx alleles was highest for the Infinium GDA-PGx (88%), Axiom PangenomiX Plus (77%), Axiom PangenomiX (72%) and Axiom PMDA Plus (70%). Three arrays (Infinium GDA-PGx, Axiom PangenomiX Plus and Axiom PMDA Plus) fully covered the Tier 1 alleles and the Axiom PangenomiX array provided full coverage of Tier 2 alleles. In conclusion, PGx allele coverage varied by gene and array. A superior array for all PGx applications was not identified. Future comparative analyses of genotype data produced by these arrays are needed to determine the robustness of the reported coverage estimates.


Subject(s)
Alleles , Pharmacogenetics , Humans , Pharmacogenetics/methods , Genotype , Genotyping Techniques/methods , Genome-Wide Association Study/methods , Genome, Human/genetics , Oligonucleotide Array Sequence Analysis , Precision Medicine/methods
7.
Nature ; 627(8003): 340-346, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38374255

ABSTRACT

Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics [1-4]. The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health [5,6]. Here we describe the programme's genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.


Subject(s)
Datasets as Topic , Genetics, Medical , Genetics, Population , Genome, Human , Genomics , Minority Groups , Racial Groups , Humans , Access to Information , Black People/genetics , Electronic Health Records , Ethnicity/genetics , European People/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome, Human/genetics , Longitudinal Studies , Racial Groups/genetics , Reproducibility of Results , Research Personnel , Time Factors , Vulnerable Populations
8.
Nature ; 627(8004): 586-593, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38355797

ABSTRACT

Over half of hepatocellular carcinoma (HCC) cases diagnosed worldwide are in China [1-3]. However, whole-genome analysis of hepatitis B virus (HBV)-associated HCC in Chinese individuals is limited [4-8], with current analyses of HCC mainly from non-HBV-enriched populations [9,10]. Here we initiated the Chinese Liver Cancer Atlas (CLCA) project and performed deep whole-genome sequencing (average depth, 120×) of 494 HCC tumours. We identified 6 coding and 28 non-coding previously undescribed driver candidates. Five previously undescribed mutational signatures were found, including aristolochic-acid-associated indel and doublet base signatures, and a single-base-substitution signature that we termed SBS_H8. Pentanucleotide context analysis and experimental validation confirmed that SBS_H8 was distinct from the aristolochic-acid-associated SBS22. Notably, HBV integrations could take the form of extrachromosomal circular DNA, resulting in elevated copy numbers and gene expression. Our high-depth data also enabled us to characterize subclonal clustered alterations, including chromothripsis, chromoplexy and kataegis, suggesting that these catastrophic events could also occur in late stages of hepatocarcinogenesis. Pathway analysis of all classes of alterations further linked non-coding mutations to dysregulation of liver metabolism. Finally, we performed in vitro and in vivo assays to show that fibrinogen alpha chain (FGA), determined as both a candidate coding and non-coding driver, regulates HCC progression and metastasis. Our CLCA study depicts a detailed genomic landscape and evolutionary history of HCC in Chinese individuals, providing important clinical implications.


Subject(s)
Carcinoma, Hepatocellular , Genome, Human , High-Throughput Nucleotide Sequencing , Liver Neoplasms , Mutation , Whole Genome Sequencing , Humans , Aristolochic Acids/metabolism , Carcinogenesis , Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/virology , China , Chromothripsis , Disease Progression , DNA, Circular/genetics , East Asian People/genetics , Evolution, Molecular , Genome, Human/genetics , Hepatitis B virus/genetics , INDEL Mutation/genetics , Liver/metabolism , Liver Neoplasms/genetics , Liver Neoplasms/virology , Mutation/genetics , Neoplasm Metastasis/genetics , Open Reading Frames/genetics , Reproducibility of Results
9.
Signal Transduct Target Ther ; 9(1): 47, 2024 Feb 26.
Article in English | MEDLINE | ID: mdl-38409199

ABSTRACT

Precise genome-editing platforms are versatile tools for generating specific, site-directed DNA insertions, deletions, and substitutions. The continuous enhancement of these tools has led to a revolution in the life sciences, which promises to deliver novel therapies for genetic disease. Precise genome-editing can be traced back to the 1950s with the discovery of DNA's double-helix and, after 70 years of development, has evolved from crude in vitro applications to a wide range of sophisticated capabilities, including in vivo applications. Nonetheless, precise genome-editing faces constraints such as modest efficiency, delivery challenges, and off-target effects. In this review, we explore precise genome-editing, with a focus on introduction of the landmark events in its history, various platforms, delivery systems, and applications. First, we discuss the landmark events in the history of precise genome-editing. Second, we describe the current state of precise genome-editing strategies and explain how these techniques offer unprecedented precision and versatility for modifying the human genome. Third, we introduce the current delivery systems used to deploy precise genome-editing components through DNA, RNA, and RNPs. Finally, we summarize the current applications of precise genome-editing in labeling endogenous genes, screening genetic variants, molecular recording, generating disease models, and gene therapy, including ex vivo therapy and in vivo therapy, and discuss potential future advances.


Subject(s)
CRISPR-Cas Systems , Gene Editing , Humans , CRISPR-Cas Systems/genetics , Genetic Therapy/methods , Genome, Human/genetics , DNA
10.
PLoS One ; 19(2): e0292479, 2024.
Article in English | MEDLINE | ID: mdl-38349923

ABSTRACT

Recombinase enzymes are extremely efficient at integrating very large DNA fragments into target genomes. However, intrinsic sequence specificities curtail their use to DNA sequences with sufficient homology to endogenous target motifs. Extensive engineering is therefore required to broaden applicability and robustness. Here, we describe the directed evolution of novel lambda integrase variants capable of editing exogenous target sequences identified in the diatom Phaeodactylum tricornutum and the alga Nannochloropsis oceanica. These microorganisms hold great promise as conduits for green biomanufacturing and carbon sequestration. The evolved enzyme variants show a >1000-fold switch in specificity towards the non-natural target sites when assayed in vitro. A single-copy target motif in the human genome with homology to the Nannochloropsis oceanica site can also be efficiently targeted using an engineered integrase, both in vitro and in human cells. The developed integrase variants represent useful additions to the DNA editing toolbox, with particular application for targeted genomic insertion of large DNA cargos.


Subject(s)
Diatoms , Stramenopiles , Humans , Integrases/genetics , Genome, Human/genetics , DNA , Genomics , Diatoms/genetics , Stramenopiles/genetics , Gene Editing
11.
Nature ; 625(7994): 329-337, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38200294

ABSTRACT

Major migration events in Holocene Eurasia have been characterized genetically at broad regional scales [1-4]. However, insights into the population dynamics in the contact zones are hampered by a lack of ancient genomic data sampled at high spatiotemporal resolution [5-7]. Here, to address this, we analysed shotgun-sequenced genomes from 100 skeletons spanning 7,300 years of the Mesolithic period, Neolithic period and Early Bronze Age in Denmark and integrated these with proxies for diet (¹³C and ¹⁵N content), mobility (⁸⁷Sr/⁸⁶Sr ratio) and vegetation cover (pollen). We observe that Danish Mesolithic individuals of the Maglemose, Kongemose and Ertebølle cultures form a distinct genetic cluster related to other Western European hunter-gatherers. Despite shifts in material culture they displayed genetic homogeneity from around 10,500 to 5,900 calibrated years before present, when Neolithic farmers with Anatolian-derived ancestry arrived. Although the Neolithic transition was delayed by more than a millennium relative to Central Europe, it was very abrupt and resulted in a population turnover with limited genetic contribution from local hunter-gatherers. The succeeding Neolithic population, associated with the Funnel Beaker culture, persisted for only about 1,000 years before immigrants with eastern Steppe-derived ancestry arrived. This second and equally rapid population replacement gave rise to the Single Grave culture with an ancestry profile more similar to present-day Danes. In our multiproxy dataset, these major demographic events are manifested as parallel shifts in genotype, phenotype, diet and land use.


Subject(s)
Genome, Human , Genomics , Human Migration , Scandinavians and Nordic People , Humans , Denmark/ethnology , Emigrants and Immigrants/history , Genotype , Scandinavians and Nordic People/genetics , Scandinavians and Nordic People/history , Human Migration/history , Genome, Human/genetics , History, Ancient , Pollen , Diet/history , Hunting/history , Farmers/history , Culture , Phenotype , Datasets as Topic
12.
Nature ; 625(7994): 312-320, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38200293

ABSTRACT

The Holocene (beginning around 12,000 years ago) encompassed some of the most significant changes in human evolution, with far-reaching consequences for the dietary, physical and mental health of present-day populations. Using a dataset of more than 1,600 imputed ancient genomes [1], we modelled the selection landscape during the transition from hunting and gathering to farming and pastoralism across West Eurasia. We identify key selection signals related to metabolism, including that selection at the FADS cluster began earlier than previously reported and that selection near the LCT locus predates the emergence of the lactase persistence allele by thousands of years. We also find strong selection in the HLA region, possibly due to increased exposure to pathogens during the Bronze Age. Using ancient individuals to infer local ancestry tracts in over 400,000 samples from the UK Biobank, we identify widespread differences in the distribution of Mesolithic, Neolithic and Bronze Age ancestries across Eurasia. By calculating ancestry-specific polygenic risk scores, we show that height differences between Northern and Southern Europe are associated with differential Steppe ancestry, rather than selection, and that risk alleles for mood-related phenotypes are enriched for Neolithic farmer ancestry, whereas risk alleles for diabetes and Alzheimer's disease are enriched for Western hunter-gatherer ancestry. Our results indicate that ancient selection and migration were large contributors to the distribution of phenotypic diversity in present-day Europeans.


Subject(s)
Asian , European People , Genome, Human , Selection, Genetic , Humans , Affect , Agriculture/history , Alleles , Alzheimer Disease/genetics , Asia/ethnology , Asian/genetics , Diabetes Mellitus/genetics , Europe/ethnology , European People/genetics , Farmers/history , Genetic Loci/genetics , Genetic Predisposition to Disease , Genome, Human/genetics , History, Ancient , Human Migration , Hunting/history , Multigene Family/genetics , Phenotype , UK Biobank , Multifactorial Inheritance/genetics
13.
Cell Genom ; 4(2): 100497, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38295789

ABSTRACT

Growing evidence indicates that transposable elements (TEs) play important roles in evolution by providing genomes with coding and non-coding sequences. Identification of TE-derived functional elements, however, has relied on TE annotations in individual species, which limits its scope to relatively intact TE sequences. Here, we report a novel approach to uncover previously unannotated degenerate TEs (degTEs) by probing multiple ancestral genomes reconstructed from hundreds of species. We applied this method to the human genome and achieved a 10.8% increase in coverage over the most recent annotation. Further, we discovered that degTEs contribute to various cis-regulatory elements and transcription factor binding sites, including those of a known TE-controlling family, the KRAB zinc-finger proteins. We also report unannotated chimeric transcripts between degTEs and human genes expressed in embryos. This study provides a novel methodology and a freely available resource that will facilitate the investigation of TE co-option events on a full scale.


Subject(s)
DNA Transposable Elements , Regulatory Sequences, Nucleic Acid , Humans , DNA Transposable Elements/genetics , Genome, Human/genetics
14.
Genet Med ; 26(5): 101076, 2024 May.
Article in English | MEDLINE | ID: mdl-38258669

ABSTRACT

PURPOSE: Genome sequencing (GS)-specific diagnostic rates in prospective tightly ascertained exome sequencing (ES)-negative intellectual disability (ID) cohorts have not been reported extensively. METHODS: ES, GS, epigenetic signatures, and long-read sequencing diagnoses were assessed in 74 trios with at least moderate ID. RESULTS: The ES diagnostic yield was 42 of 74 (57%). GS diagnoses were made in 9 of 32 (28%) ES-unresolved families. Repeated ES with a contemporary pipeline on the GS-diagnosed families identified 8 of 9 single-nucleotide variations/copy-number variations undetected in older ES, confirming a GS-unique diagnostic rate of 1 in 32 (3%). Episignatures contributed diagnostic information in 9%, with GS corroboration in 1 of 32 (3%) and diagnostic clues in 2 of 32 (6%). A genetic etiology for ID was detected in 51 of 74 (69%) families. Twelve candidate disease genes were identified. Contemporary ES followed by GS cost US$4976 (95% CI: $3704; $6969) per diagnosis, and first-line GS cost $7062 (95% CI: $6210; $8475) per diagnosis. CONCLUSION: Performing GS only in ID trios would be cost equivalent to ES if GS were available at $2435, about a 60% reduction from current prices. This study demonstrates that first-line GS achieves a higher diagnostic rate than contemporary ES but at a higher cost.
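The counts and percentages quoted in the abstract above can be cross-checked with simple arithmetic. The short sketch below uses only the counts stated in the abstract. The helper `pct` and the variable names are hypothetical, introduced here for illustration, and the rounding to whole percentages is an assumption about how the figures were reported.

```python
def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


trios = 74          # trios with at least moderate ID
es_diagnosed = 42   # families diagnosed by the original exome sequencing
gs_diagnosed = 9    # further families diagnosed by genome sequencing
gs_unique = 1       # GS diagnosis still missed by contemporary ES

# Families left unsolved after ES, i.e. the denominator for GS yields.
es_unresolved = trios - es_diagnosed
# Families with a genetic etiology after ES plus GS.
total_diagnosed = es_diagnosed + gs_diagnosed

if __name__ == "__main__":
    print(f"ES yield:        {es_diagnosed}/{trios} = {pct(es_diagnosed, trios)}%")
    print(f"GS yield:        {gs_diagnosed}/{es_unresolved} = {pct(gs_diagnosed, es_unresolved)}%")
    print(f"GS-unique yield: {gs_unique}/{es_unresolved} = {pct(gs_unique, es_unresolved)}%")
    print(f"Overall yield:   {total_diagnosed}/{trios} = {pct(total_diagnosed, trios)}%")
```

The derived denominator of 32 and the overall total of 51 both match the figures given in the abstract, so the reported yields are internally consistent.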


Subject(s)
Exome Sequencing , Exome , Intellectual Disability , Humans , Intellectual Disability/genetics , Intellectual Disability/diagnosis , Male , Female , Exome/genetics , Exome Sequencing/economics , Cohort Studies , Genetic Testing/economics , Genetic Testing/methods , Whole Genome Sequencing/economics , Child , Genome, Human/genetics , DNA Copy Number Variations/genetics , Polymorphism, Single Nucleotide/genetics , Child, Preschool
15.
Proc Natl Acad Sci U S A ; 121(2): e2316242120, 2024 Jan 09.
Article in English | MEDLINE | ID: mdl-38165936

ABSTRACT

The genome of an individual from an admixed population consists of segments originating from different ancestral populations. Most existing ancestry inference approaches focus on calling these segments for the extant individual. In this paper, we present a general ancestry inference approach for inferring recent ancestors from an extant genome. Given the genome of an individual from a recently admixed population, our method can estimate the proportions of the genomes of the recent ancestors of this individual that originated from each ancestral population. The key step of our method is the inference of ancestors (called founders) right after the formation of an admixed population. The inferred founders can then be used to infer the ancestry of recent ancestors of an extant individual. Our method is implemented in a computer program called PedMix2. To the best of our knowledge, there is no existing method that can practically infer ancestors beyond grandparents from an extant individual's genome. Results on both simulated and real data show that PedMix2 performs well in ancestry inference.


Subject(s)
Genetics, Population , Grandparents , Humans , Software , Genome, Human/genetics
16.
17.
Life Sci Alliance ; 7(3), 2024 Mar.
Article in English | MEDLINE | ID: mdl-38167611

ABSTRACT

Bulky DNA damages block transcription and compromise genome integrity and function. The cellular response to these damages includes global transcription shutdown. Still, active transcription is necessary for transcription-coupled repair and for induction of damage-response genes. To uncover common features of a general bulky DNA damage response, and to identify response-related transcripts that are expressed despite damage, we performed a systematic RNA-seq study comparing the transcriptional response to three independent damage-inducing agents: UV, the chemotherapy cisplatin, and benzo[a]pyrene, a component of cigarette smoke. Reduction in gene expression after damage was associated with higher damage rates, longer gene length, and low GC content. We identified genes with relatively higher expression after all three damage treatments, including NR4A2, a potential novel damage-response transcription factor. Up-regulated genes exhibit higher exon content that is associated with preferential repair, which could enable rapid damage removal and transcription restoration. The attenuated response to BPDE highlights that not all bulky damages elicit the same response. These findings frame gene architecture as a major determinant of the transcriptional response that is hardwired into the human genome.


Subject(s)
DNA Damage , DNA Repair , Humans , DNA Repair/genetics , DNA Damage/genetics , Benzo(a)pyrene/pharmacology , Benzo(a)pyrene/metabolism , Gene Expression Regulation/genetics , Genome, Human/genetics
18.
Nature ; 626(7999): 565-573, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38297123

ABSTRACT

Genomic research that targets large-scale, prospective birth cohorts constitutes an essential strategy for understanding the influence of genetics and environment on human health [1]. Nonetheless, such studies remain scarce, particularly in Asia. Here we present the phase I genome study of the Born in Guangzhou Cohort Study [2] (BIGCS), which encompasses the sequencing and analysis of 4,053 Chinese individuals, primarily composed of trios or mother-infant duos residing in South China. Our analysis reveals novel genetic variants, a high-quality reference panel, and fine-scale local genetic structure within BIGCS. Notably, we identify previously unreported East Asian-specific genetic associations with maternal total bile acid, gestational weight gain and infant cord blood traits. Additionally, we observe prevalent age-specific genetic effects on lipid levels in mothers and infants. In an exploratory intergenerational Mendelian randomization analysis, we estimate the maternal putatively causal and fetal genetic effects of seven adult phenotypes on seven fetal growth-related measurements. These findings illuminate the genetic links between maternal and early-life traits in an East Asian population and lay the groundwork for future research into the intricate interplay of genetics, intrauterine exposures and early-life experiences in shaping long-term health.


Subject(s)
Cohort Studies , Gene-Environment Interaction , Genetic Variation , Genome, Human , Phenotype , Prenatal Exposure Delayed Effects , Adult , Female , Humans , Infant , Infant, Newborn , Bile Acids and Salts/metabolism , China/ethnology , Cordocentesis , Fetus/embryology , Gestational Weight Gain , Lipids/blood , Maternal Exposure , Parturition , Prospective Studies , Genome, Human/genetics , Genetic Variation/genetics
19.
Nature ; 625(7996): 813-821, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38172637

ABSTRACT

Although the impact of host genetics on gut microbial diversity and the abundance of specific taxa is well established [1-6], little is known about how host genetics regulates the genetic diversity of gut microorganisms. Here we conducted a meta-analysis of associations between human genetic variation and gut microbial structural variation in 9,015 individuals from four Dutch cohorts. Strikingly, the presence rate of a structural variation segment in Faecalibacterium prausnitzii that harbours an N-acetylgalactosamine (GalNAc) utilization gene cluster is higher in individuals who secrete the type A oligosaccharide antigen terminating in GalNAc, a feature that is jointly determined by human ABO and FUT2 genotypes, and we could replicate this association in a Tanzanian cohort. In vitro experiments demonstrated that GalNAc can be used as the sole carbohydrate source for F. prausnitzii strains that carry the GalNAc-metabolizing pathway. Further in silico and in vitro studies demonstrated that other ABO-associated species can also utilize GalNAc, particularly Collinsella aerofaciens. The GalNAc utilization genes are also associated with the host's cardiometabolic health, particularly in individuals with mucosal A-antigen. Together, the findings of our study demonstrate that genetic associations across the human genome and bacterial metagenome can provide functional insights into the reciprocal host-microbiome relationship.


Subject(s)
Bacteria , Gastrointestinal Microbiome , Host Microbial Interactions , Metagenome , Humans , Acetylgalactosamine/metabolism , Bacteria/classification , Bacteria/genetics , Bacteria/isolation & purification , Cohort Studies , Computer Simulation , Faecalibacterium prausnitzii/genetics , Gastrointestinal Microbiome/genetics , Genome, Human/genetics , Genotype , Host Microbial Interactions/genetics , In Vitro Techniques , Metagenome/genetics , Multigene Family , Netherlands , Tanzania
20.
Nat Biotechnol ; 42(4): 663-673, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37165083

ABSTRACT

Pangenome references address biases of reference genomes by storing a representative set of diverse haplotypes and their alignment, usually as a graph. Alternate alleles determined by variant callers can be used to construct pangenome graphs, but advances in long-read sequencing are leading to widely available, high-quality phased assemblies. Constructing a pangenome graph directly from assemblies, as opposed to variant calls, leverages the graph's ability to represent variation at different scales. Here we present the Minigraph-Cactus pangenome pipeline, which creates pangenomes directly from whole-genome alignments, and demonstrate its ability to scale to 90 human haplotypes from the Human Pangenome Reference Consortium. The method builds graphs containing all forms of genetic variation while allowing use of current mapping and genotyping tools. We measure the effect of the quality and completeness of reference genomes used for analysis within the pangenomes and show that using the CHM13 reference from the Telomere-to-Telomere Consortium improves the accuracy of our methods. We also demonstrate construction of a Drosophila melanogaster pangenome.


Subject(s)
Drosophila melanogaster , High-Throughput Nucleotide Sequencing , Humans , Animals , Drosophila melanogaster/genetics , Haplotypes/genetics , High-Throughput Nucleotide Sequencing/methods , Alleles , Sequence Analysis, DNA , Genome, Human/genetics