1.
PLoS Genet ; 20(7): e1011092, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38959269

ABSTRACT

Haplotype estimation, or phasing, has gained significant traction in large-scale projects due to its valuable contributions to population genetics, variant analysis, and the creation of reference panels for imputation and phasing of new samples. To scale with the growing number of samples, haplotype estimation methods designed for population scale rely on highly optimized statistical models to phase genotype data, and usually ignore read-level information. Statistical methods excel at resolving common variants; however, they still struggle with rare variants because too little statistical information is available. In this study we introduce SAPPHIRE, a new method that leverages whole-genome sequencing data to enhance the precision of haplotype calls produced by statistical phasing. SAPPHIRE achieves this by refining haplotype estimates through the realignment of sequencing reads, particularly targeting low-confidence phase calls. Our findings demonstrate that SAPPHIRE significantly enhances the accuracy of haplotypes obtained from state-of-the-art methods and also provides the subset of phase calls that are validated by sequencing reads. Finally, we show that our method scales to large data sets by successfully applying it to the 3.6 petabytes of sequencing data in the latest UK Biobank release of 200,031 samples.
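The idea of checking a low-confidence statistical phase call against sequencing reads can be illustrated with a minimal sketch. This is not the SAPPHIRE implementation: the function name `refine_phase`, the `min_reads` parameter, and the simple majority vote are all assumptions made for illustration. Each read that spans two heterozygous sites is reduced to the pair of alleles it carries (0 = reference, 1 = alternate); a read supports "cis" when both alleles match and "trans" when they differ.

```python
def refine_phase(statistical_phase, read_alleles, min_reads=2):
    """Confirm or flip a phase call between two heterozygous sites.

    statistical_phase: "cis" (alternate alleles on the same haplotype) or "trans".
    read_alleles: list of (allele_at_site_1, allele_at_site_2) tuples, one per
        read spanning both sites, with 0 = reference and 1 = alternate.
    min_reads: hypothetical minimum number of spanning reads needed to trust
        the read evidence (an assumption, not a published parameter).

    Returns (phase, read_validated).
    """
    if statistical_phase not in ("cis", "trans"):
        raise ValueError("statistical_phase must be 'cis' or 'trans'")

    # A read carrying matching alleles (ref/ref or alt/alt) supports cis;
    # a read carrying differing alleles supports trans.
    cis_support = sum(1 for a, b in read_alleles if a == b)
    trans_support = len(read_alleles) - cis_support

    # Too little evidence, or a tie: keep the statistical call, unvalidated.
    if len(read_alleles) < min_reads or cis_support == trans_support:
        return statistical_phase, False

    read_phase = "cis" if cis_support > trans_support else "trans"
    return read_phase, True


# Three spanning reads all disagree with a statistical "cis" call.
print(refine_phase("cis", [(1, 0), (0, 1), (1, 0)]))
```

A production method would also have to weigh base and mapping qualities and handle reads that match neither allele; the sketch ignores both.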


Subject(s)
Population Genetics , Haplotypes , Whole Genome Sequencing , Whole Genome Sequencing/methods , Humans , Population Genetics/methods , Human Genome , Single Nucleotide Polymorphism/genetics , Genome-Wide Association Study/methods , Algorithms
2.
Nat Comput Sci ; 4(5): 360-366, 2024 May.
Article in English | MEDLINE | ID: mdl-38745108

ABSTRACT

For many genome-wide association studies, imputing genotypes from a haplotype reference panel is a necessary step. Over the past 15 years, reference panels have become larger and more diverse, leading to improvements in imputation accuracy. However, the latest generation of reference panels is subject to restrictions on data sharing due to concerns about privacy, limiting their usefulness for genotype imputation. In this context, we propose RESHAPE, a method that employs a recombination Poisson process on a reference panel to simulate the genomes of hypothetical descendants after multiple generations. This data transformation helps to protect against re-identification threats and preserves data attributes, such as linkage disequilibrium patterns and, to some degree, identity-by-descent sharing, allowing for genotype imputation. Our experiments on gold-standard datasets show that descendants simulated for up to eight generations can serve as reference panels without substantially reducing genotype imputation accuracy.
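A recombination Poisson process applied to a haplotype panel can be sketched as follows. This is an illustrative toy, not the RESHAPE code: the function names, the random pairing of haplotypes, and the rate of one crossover per Morgan are assumptions. Crossover positions along the genetic map are drawn by accumulating exponential gaps, which is one standard way to simulate a Poisson process. Each pair yields two complementary mosaic children, so the panel size and per-site allele counts stay unchanged.

```python
import random


def crossover_points(map_length_morgans, rng):
    """Draw crossover positions from a Poisson process (rate 1 per Morgan)."""
    points = []
    pos = rng.expovariate(1.0)
    while pos < map_length_morgans:
        points.append(pos)
        pos += rng.expovariate(1.0)
    return points


def recombine_pair(hap_a, hap_b, gen_pos, rng):
    """Return two complementary mosaics of hap_a and hap_b.

    gen_pos holds the cumulative genetic position (in Morgans) of each site,
    in increasing order.
    """
    points = crossover_points(gen_pos[-1], rng)
    child_1, child_2 = [], []
    swapped = False
    next_idx = 0
    for site, pos in enumerate(gen_pos):
        # Every crossover located at or before this site flips the source.
        while next_idx < len(points) and points[next_idx] <= pos:
            swapped = not swapped
            next_idx += 1
        src_1, src_2 = (hap_b, hap_a) if swapped else (hap_a, hap_b)
        child_1.append(src_1[site])
        child_2.append(src_2[site])
    return child_1, child_2


def simulate_descendants(haplotypes, gen_pos, generations, seed=0):
    """Shuffle, pair and recombine the panel once per generation."""
    rng = random.Random(seed)
    panel = [list(h) for h in haplotypes]  # copy so the input is not modified
    for _ in range(generations):
        rng.shuffle(panel)
        next_panel = []
        for i in range(0, len(panel) - 1, 2):
            next_panel.extend(recombine_pair(panel[i], panel[i + 1], gen_pos, rng))
        if len(panel) % 2:
            # An unpaired haplotype is carried over unchanged.
            next_panel.append(panel[-1])
        panel = next_panel
    return panel


# Toy panel: 4 haplotypes, 5 sites spread over a 2-Morgan map.
haps = [[0, 1, 0, 1, 1], [1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [1, 0, 1, 0, 1]]
gen_pos = [0.0, 0.5, 1.0, 1.5, 2.0]
descendants = simulate_descendants(haps, gen_pos, generations=8, seed=1)
print(len(descendants), "simulated haplotypes")
```

Because the children are complementary, allele frequencies at every site are identical before and after the transformation, while the mapping from any output haplotype back to a single input individual becomes progressively blurred with each generation.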


Subject(s)
Genome-Wide Association Study , Genotype , Humans , Genome-Wide Association Study/methods , Linkage Disequilibrium , Haplotypes/genetics , Single Nucleotide Polymorphism/genetics , Information Dissemination/methods , Computer Simulation , Genetic Models , Algorithms , Human Genome/genetics , Poisson Distribution
3.
Sci Rep ; 14(1): 6227, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38486065

ABSTRACT

Low-coverage imputation is becoming increasingly common in ancient DNA (aDNA) studies. Imputation pipelines commonly used for present-day genomes have been shown to yield accurate results when applied to ancient genomes. However, post-mortem damage (PMD), in the form of C-to-T substitutions at read termini, and contamination with DNA from closely related species can potentially affect imputation performance in aDNA. In this study, we evaluated imputation performance (i) when using a genotype caller designed for aDNA, ATLAS, compared to bcftools, and (ii) when contamination is present. We evaluated imputation performance with principal component analyses and by calculating imputation error rates. With a particular focus on differently imputed sites, we found that using ATLAS prior to imputation substantially improved imputed genotypes for a very damaged ancient genome (42% PMD). Trimming the ends of the sequencing reads led to similar improvements in imputation accuracy. For the remaining genomes, ATLAS brought limited gains. Finally, to examine the effect of contamination on imputation, we added various amounts of reads from two present-day genomes to a previously downsampled high-coverage ancient genome. We observed that imputation accuracy decreased drastically for contamination rates above 5%. In conclusion, we recommend (i) accounting for PMD by either trimming sequencing reads or using a genotype caller such as ATLAS before imputing highly damaged genomes and (ii) only imputing genomes with up to 5% contamination.
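Two quantities in this evaluation lend themselves to a small worked sketch: an imputation error rate and the number of contaminant reads needed to reach a target contamination rate. Both definitions below are assumptions, since the abstract does not specify them: the error rate is taken as the fraction of compared sites whose imputed genotype differs from a trusted genotype, and the contamination rate as the contaminant share of all reads after mixing. The function names are hypothetical.

```python
def imputation_error_rate(truth, imputed):
    """Fraction of compared sites where the imputed genotype differs.

    Genotypes are alternate-allele counts (0, 1 or 2); None marks a missing
    call, and sites missing in either list are skipped.
    """
    compared = 0
    errors = 0
    for true_gt, imputed_gt in zip(truth, imputed):
        if true_gt is None or imputed_gt is None:
            continue
        compared += 1
        if true_gt != imputed_gt:
            errors += 1
    if compared == 0:
        raise ValueError("no sites with both genotypes available")
    return errors / compared


def contaminant_reads_needed(endogenous_reads, contamination_rate):
    """Reads to add so contaminants form contamination_rate of the total.

    Solves c = x / (endogenous_reads + x) for x and rounds to a whole read.
    """
    if not 0 <= contamination_rate < 1:
        raise ValueError("contamination_rate must be in [0, 1)")
    return round(endogenous_reads * contamination_rate / (1 - contamination_rate))


# One mismatch out of four compared sites.
print(imputation_error_rate([0, 1, 2, 1], [0, 1, 1, 1]))
# Reads to add to 950 endogenous reads for 5% contamination.
print(contaminant_reads_needed(950, 0.05))
```

Under these definitions, the recommended 5% ceiling corresponds to roughly one contaminant read for every 19 endogenous reads.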


Subject(s)
Ancient DNA , Genome , Genotype , Genome-Wide Association Study/methods , Single Nucleotide Polymorphism
4.
Front Immunol ; 14: 1305856, 2023.
Article in English | MEDLINE | ID: mdl-38146367

ABSTRACT

Introduction: We have reanalyzed the genomic data of the International Collaboration for the Genomics of HIV (ICGH), centering on HIV-1 Elite Controllers. Methods: We performed a genome-wide association study comparing 543 HIV Elite Controllers with 3,272 uninfected controls of European descent. Using the latest database for imputation, we analyzed 35,552 single nucleotide polymorphisms (SNPs) within the Major Histocompatibility Complex (MHC) region. Results: Our analysis identified 2,626 SNPs significantly associated (p < 5 × 10⁻⁸) with elite control of HIV-1 infection, including well-established MHC signals such as the rs2395029-G allele, which tags HLA-B*57:01. A thorough investigation of SNPs in linkage disequilibrium with rs2395029 revealed an extensive haploblock spanning 1.9 megabases in the MHC region tagging HLA-B*57:01, comprising 379 SNP alleles impacting 72 genes. This haploblock contains damaging variations in proteins like NOTCH4 and DXO and is also associated with a strong differential pattern of expression of multiple MHC genes such as HLA-B, MICB, and ZBTB12. The study was expanded to include two cohorts of seropositive African-American individuals, where a haploblock tagging the HLA-B*57:03 allele was similarly associated with control of viral load. The mRNA expression profile of this haploblock in African Americans closely mirrored that in the European cohort. Discussion: These findings suggest that additional molecular mechanisms beyond the conventional antigen-presenting role of class I HLA molecules may contribute to the observed influence of HLA-B*57:01/B*57:03 alleles on HIV-1 elite control. Overall, this study has uncovered a large haploblock associated with HLA-B*57 alleles, providing novel insights into their massive effect on HIV-1 elite control.


Subject(s)
HIV Seropositivity , HIV-1 , Humans , HIV-1/genetics , Alleles , Genome-Wide Association Study , HLA-B Antigens/genetics , Major Histocompatibility Complex , HIV Seropositivity/genetics , DNA-Binding Proteins/genetics , Transcription Factors/genetics