Search | BVS CLAP/SMR-OPAS/OMS

1.

The genomic history of the Aegean palatial civilizations.

Clemente, Florian; Unterländer, Martina; Dolgova, Olga; Amorim, Carlos Eduardo G; Coroado-Santos, Francisco; Neuenschwander, Samuel; Ganiatsou, Elissavet; Cruz Dávalos, Diana I; Anchieri, Lucas; Michaud, Frédéric; Winkelbach, Laura; Blöcher, Jens; Arizmendi Cárdenas, Yami Ommar; Sousa da Mota, Bárbara; Kalliga, Eleni; Souleles, Angelos; Kontopoulos, Ioannis; Karamitrou-Mentessidi, Georgia; Philaniotou, Olga; Sampson, Adamantios; Theodorou, Dimitra; Tsipopoulou, Metaxia; Akamatis, Ioannis; Halstead, Paul; Kotsakis, Kostas; Urem-Kotsou, Dushka; Panagiotopoulos, Diamantis; Ziota, Christina; Triantaphyllou, Sevasti; Delaneau, Olivier; Jensen, Jeffrey D; Moreno-Mayar, J Víctor; Burger, Joachim; Sousa, Vitor C; Lao, Oscar; Malaspinas, Anna-Sapfo; Papageorgopoulou, Christina.

Cell ; 184(10): 2565-2586.e21, 2021 05 13.

Article in English | MEDLINE | ID: mdl-33930288

ABSTRACT

The Cycladic, the Minoan, and the Helladic (Mycenaean) cultures define the Bronze Age (BA) of Greece. Urbanism, complex social structures, craft and agricultural specialization, and the earliest forms of writing characterize this iconic period. We sequenced six Early to Middle BA whole genomes, along with 11 mitochondrial genomes, sampled from the three BA cultures of the Aegean Sea. The Early BA (EBA) genomes are homogeneous and derive most of their ancestry from Neolithic Aegeans, contrary to earlier hypotheses that the Neolithic-EBA cultural transition was due to massive population turnover. EBA Aegeans were shaped by relatively small-scale migration from East of the Aegean, as evidenced by the Caucasus-related ancestry also detected in Anatolians. In contrast, Middle BA (MBA) individuals of northern Greece differ from EBA populations in showing ~50% Pontic-Caspian Steppe-related ancestry, dated at ca. 2,600-2,000 BCE. Such gene flow events during the MBA contributed toward shaping present-day Greek genomes.

Subjects

Civilization/history , Human Genome , Mitochondrial Genome , Human Migration/history , Ancient DNA , Ancient Greece , Ancient History , Humans

2.

The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease.

Astle, William J; Elding, Heather; Jiang, Tao; Allen, Dave; Ruklisa, Dace; Mann, Alice L; Mead, Daniel; Bouman, Heleen; Riveros-Mckay, Fernando; Kostadima, Myrto A; Lambourne, John J; Sivapalaratnam, Suthesh; Downes, Kate; Kundu, Kousik; Bomba, Lorenzo; Berentsen, Kim; Bradley, John R; Daugherty, Louise C; Delaneau, Olivier; Freson, Kathleen; Garner, Stephen F; Grassi, Luigi; Guerrero, Jose; Haimel, Matthias; Janssen-Megens, Eva M; Kaan, Anita; Kamat, Mihir; Kim, Bowon; Mandoli, Amit; Marchini, Jonathan; Martens, Joost H A; Meacham, Stuart; Megy, Karyn; O'Connell, Jared; Petersen, Romina; Sharifi, Nilofar; Sheard, Simon M; Staley, James R; Tuna, Salih; van der Ent, Martijn; Walter, Klaudia; Wang, Shuang-Yin; Wheeler, Eleanor; Wilder, Steven P; Iotchkova, Valentina; Moore, Carmel; Sambrook, Jennifer; Stunnenberg, Hendrik G; Di Angelantonio, Emanuele; Kaptoge, Stephen.

Cell ; 167(5): 1415-1429.e19, 2016 11 17.

Article in English | MEDLINE | ID: mdl-27863252

ABSTRACT

Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.

Subjects

Genetic Variation , Genome-Wide Association Study , Hematopoietic Stem Cells/metabolism , Immune System Diseases/genetics , Alleles , Cell Differentiation , Genetic Predisposition to Disease , Hematopoietic Stem Cells/pathology , Humans , Immune System Diseases/pathology , Single Nucleotide Polymorphism , Quantitative Trait Loci , White People/genetics

3.

Population Variation and Genetic Control of Modular Chromatin Architecture in Humans.

Waszak, Sebastian M; Delaneau, Olivier; Gschwind, Andreas R; Kilpinen, Helena; Raghav, Sunil K; Witwicki, Robert M; Orioli, Andrea; Wiederkehr, Michael; Panousis, Nikolaos I; Yurovsky, Alisa; Romano-Palumbo, Luciana; Planchon, Alexandra; Bielser, Deborah; Padioleau, Ismael; Udin, Gilles; Thurnheer, Sarah; Hacker, David; Hernandez, Nouria; Reymond, Alexandre; Deplancke, Bart; Dermitzakis, Emmanouil T.

Cell ; 162(5): 1039-50, 2015 Aug 27.

Article in English | MEDLINE | ID: mdl-26300124

ABSTRACT

Chromatin state variation at gene regulatory elements is abundant across individuals, yet we understand little about the genetic basis of this variability. Here, we profiled several histone modifications, the transcription factor (TF) PU.1, RNA polymerase II, and gene expression in lymphoblastoid cell lines from 47 whole-genome sequenced individuals. We observed that distinct cis-regulatory elements exhibit coordinated chromatin variation across individuals in the form of variable chromatin modules (VCMs) at sub-Mb scale. VCMs were associated with thousands of genes and preferentially cluster within chromosomal contact domains. We mapped strong proximal and weak, yet more ubiquitous, distal-acting chromatin quantitative trait loci (cQTL) that frequently explain this variation. cQTLs were associated with molecular activity at clusters of cis-regulatory elements and mapped preferentially within TF-bound regions. We propose that local, sequence-independent chromatin variation emerges as a result of genetic perturbations in cooperative interactions between cis-regulatory elements that are located within the same genomic domain.

Subjects

Chromatin/chemistry , Gene Expression Regulation , Genetic Variation , Human Genome , Chromatin/metabolism , Human Chromosomes/chemistry , Population Genetics , Humans , Quantitative Trait Loci , Nucleic Acid Regulatory Sequences , Transcription Factors/metabolism

4.

Improving population scale statistical phasing with whole-genome sequencing data.

Wertenbroek, Rick; Hofmeister, Robin J; Xenarios, Ioannis; Thoma, Yann; Delaneau, Olivier.

PLoS Genet ; 20(7): e1011092, 2024 Jul 03.

Article in English | MEDLINE | ID: mdl-38959269

ABSTRACT

Haplotype estimation, or phasing, has gained significant traction in large-scale projects due to its valuable contributions to population genetics, variant analysis, and the creation of reference panels for imputation and phasing of new samples. To scale with the growing number of samples, haplotype estimation methods designed for population scale rely on highly optimized statistical models to phase genotype data, and usually ignore read-level information. Statistical methods excel at resolving common variants; however, they still struggle with rare variants due to the lack of statistical information. In this study we introduce SAPPHIRE, a new method that leverages whole-genome sequencing data to enhance the precision of haplotype calls produced by statistical phasing. SAPPHIRE achieves this by refining haplotype estimates through the realignment of sequencing reads, particularly targeting low-confidence phase calls. Our findings demonstrate that SAPPHIRE significantly enhances the accuracy of haplotypes obtained from state-of-the-art methods and also provides the subset of phase calls that are validated by sequencing reads. Finally, we show that our method scales to large data sets by its successful application to the extensive 3.6 Petabytes of sequencing data of the last UK Biobank 200,031 sample release.

5.

Mapache: a flexible pipeline to map ancient DNA.

Neuenschwander, Samuel; Cruz Dávalos, Diana I; Anchieri, Lucas; Sousa da Mota, Bárbara; Bozzi, Davide; Rubinacci, Simone; Delaneau, Olivier; Rasmussen, Simon; Malaspinas, Anna-Sapfo.

Bioinformatics ; 39(2)2023 02 03.

Article in English | MEDLINE | ID: mdl-36637197

ABSTRACT

SUMMARY: We introduce mapache, a flexible, robust and scalable pipeline to map, quantify and impute ancient and present-day DNA in a reproducible way. Mapache is implemented in the workflow manager Snakemake and is optimized for low-space consumption, allowing efficient (re)mapping of large datasets (such as reference panels and multiple extracts and libraries per sample) to one or several genomes. Mapache can easily be customized or combined with other Snakemake tools. AVAILABILITY AND IMPLEMENTATION: Mapache is freely available on GitHub (https://github.com/sneuensc/mapache). An extensive manual is provided at https://github.com/sneuensc/mapache/wiki. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Ancient DNA , Software , Genome , Workflow

6.

The UK Biobank resource with deep phenotyping and genomic data.

Bycroft, Clare; Freeman, Colin; Petkova, Desislava; Band, Gavin; Elliott, Lloyd T; Sharp, Kevin; Motyer, Allan; Vukcevic, Damjan; Delaneau, Olivier; O'Connell, Jared; Cortes, Adrian; Welsh, Samantha; Young, Alan; Effingham, Mark; McVean, Gil; Leslie, Stephen; Allen, Naomi; Donnelly, Peter; Marchini, Jonathan.

Nature ; 562(7726): 203-209, 2018 10.

Article in English | MEDLINE | ID: mdl-30305743

ABSTRACT

The UK Biobank project is a prospective cohort study with deep genetic and phenotypic data collected on approximately 500,000 individuals from across the United Kingdom, aged between 40 and 69 at recruitment. The open resource is unique in its size and scope. A rich variety of phenotypic and health-related information is available on each participant, including biological measurements, lifestyle indicators, biomarkers in blood and urine, and imaging of the body and brain. Follow-up information is provided by linking health and medical records. Genome-wide genotype data have been collected on all participants, providing many opportunities for the discovery of new genetic associations and the genetic bases of complex traits. Here we describe the centralized analysis of the genetic data, including genotype quality, properties of population structure and relatedness of the genetic data, and efficient phasing and genotype imputation that increases the number of testable variants to around 96 million. Classical allelic variation at 11 human leukocyte antigen genes was imputed, resulting in the recovery of signals with known associations between human leukocyte antigen alleles and many diseases.

Subjects

Factual Databases , Genomics , Phenotype , Adult , Aged , Alleles , Biomarkers/blood , Biomarkers/urine , Body Height/genetics , Brain/diagnostic imaging , Cohort Studies , Genetic Databases , Electronic Health Records , Family , Female , Genome-Wide Association Study , Haplotypes/genetics , Humans , Life Style , Major Histocompatibility Complex/genetics , Male , Middle Aged , Quality Control , Racial Groups/genetics , United Kingdom

7.

GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing.

Valls-Margarit, Jordi; Galván-Femenía, Iván; Matías-Sánchez, Daniel; Blay, Natalia; Puiggròs, Montserrat; Carreras, Anna; Salvoro, Cecilia; Cortés, Beatriz; Amela, Ramon; Farre, Xavier; Lerga-Jaso, Jon; Puig, Marta; Sánchez-Herrero, Jose Francisco; Moreno, Victor; Perucho, Manuel; Sumoy, Lauro; Armengol, Lluís; Delaneau, Olivier; Cáceres, Mario; de Cid, Rafael; Torrents, David.

Nucleic Acids Res ; 50(5): 2464-2479, 2022 03 21.

Article in English | MEDLINE | ID: mdl-35176773

ABSTRACT

The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we contribute to fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high coverage (30x) whole-genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SVs analysis is shown through an imputed rare Alu element located in a new locus associated with Mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include the SVs into genome-wide genetic studies.

Subjects

Human Genome , Haplotypes , INDEL Mutation , Acyltransferases , Europe , High-Throughput Nucleotide Sequencing , Humans , Lipase , Single Nucleotide Polymorphism , Whole Genome Sequencing/methods

8.

XSI-a genotype compression tool for compressive genomics in large biobanks.

Wertenbroek, Rick; Rubinacci, Simone; Xenarios, Ioannis; Thoma, Yann; Delaneau, Olivier.

Bioinformatics ; 38(15): 3778-3784, 2022 08 02.

Article in English | MEDLINE | ID: mdl-35748697

ABSTRACT

MOTIVATION: Generation of genotype data has been growing exponentially over the last decade. With the large size of recent datasets comes a storage and computational burden with ever increasing costs. To reduce this burden, we propose XSI, a file format with reduced storage footprint that also allows computation on the compressed data and we show how this can improve future analyses. RESULTS: We show that xSqueezeIt (XSI) allows for a file size reduction of 4-20× compared with compressed BCF and demonstrate its potential for 'compressive genomics' on the UK Biobank whole-genome sequencing genotypes with 8× faster loading times, 5× faster run of homozygosity computation, 30× faster dot products computation and 280× faster allele counts. AVAILABILITY AND IMPLEMENTATION: The XSI file format specifications, API and command line tool are released under an open-source (MIT) license and are available at https://github.com/rwk-unil/xSqueezeIt. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Data Compression , Software , Biological Specimen Banks , Genomics , Genotype

9.

Genotype imputation using the Positional Burrows Wheeler Transform.

Rubinacci, Simone; Delaneau, Olivier; Marchini, Jonathan.

PLoS Genet ; 16(11): e1009049, 2020 11.

Article in English | MEDLINE | ID: mdl-33196638

ABSTRACT

Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. In the last 10 years reference panels have increased in size by more than 100 fold. Increasing reference panel size improves accuracy of markers with low minor allele frequencies but poses ever increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. This method continues to refine the observation made in the IMPUTE2 method, that accuracy is optimized via use of a custom subset of haplotypes when imputing each individual. It achieves fast, accurate, and memory-efficient imputation by selecting haplotypes using the Positional Burrows Wheeler Transform (PBWT). By using the PBWT data structure at genotyped markers, IMPUTE5 identifies locally best matching haplotypes and long identical by state segments. The method then uses the selected haplotypes as conditioning states within the IMPUTE model. Using the HRC reference panel, which has ~65,000 haplotypes, we show that IMPUTE5 is up to 30x faster than MINIMAC4 and up to 3x faster than BEAGLE5.1, and uses less memory than both these methods. Using simulated reference panels we show that IMPUTE5 scales sub-linearly with reference panel size. For example, keeping the number of imputed markers constant, increasing the reference panel size from 10,000 to 1 million haplotypes requires less than twice the computation time. As the reference panel increases in size IMPUTE5 is able to utilize a smaller number of reference haplotypes, thus reducing computational cost.

Subjects

Computational Biology/methods , Genome-Wide Association Study/methods , Haplotypes/genetics , Alleles , Forecasting/methods , Gene Frequency/genetics , Genotype , Humans , Theoretical Models , Single Nucleotide Polymorphism/genetics
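
The haplotype selection described in the IMPUTE5 abstract above relies on the ordering that the PBWT maintains at each marker. A minimal Python sketch of that ordering is given here for illustration only. The function name `pbwt_orders` and the toy panel are hypothetical, alleles are assumed to be coded 0/1, and this is not IMPUTE5's implementation.

```python
from typing import List, Sequence


def pbwt_orders(haplotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return, for every site k, the haplotype indices sorted by their
    reversed prefix ending at k (the PBWT positional prefix order).

    Haplotypes that end up adjacent in the order at site k share the
    longest matches ending at k, which is what makes the structure
    useful for picking locally best matching reference haplotypes.
    """
    n_haps = len(haplotypes)
    n_sites = len(haplotypes[0]) if n_haps else 0
    order = list(range(n_haps))
    orders = []
    for k in range(n_sites):
        # Stable partition: haplotypes carrying allele 0 keep their
        # relative order and come first, then those carrying allele 1.
        zeros = [h for h in order if haplotypes[h][k] == 0]
        ones = [h for h in order if haplotypes[h][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


if __name__ == "__main__":
    # Hypothetical toy panel: 4 haplotypes, 4 biallelic sites.
    panel = [
        [0, 1, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 1, 0],
        [0, 1, 0, 0],
    ]
    for site, order in enumerate(pbwt_orders(panel)):
        print(site, order)
```

In the toy panel, haplotypes 0 and 1 agree at the last three sites and therefore sit next to each other in the order at the final site. Each site costs one linear pass over the panel, so the whole construction is linear in the number of haplotypes times the number of sites.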

10.

Expression estimation and eQTL mapping for HLA genes with a personalized pipeline.

Aguiar, Vitor R C; César, Jônatas; Delaneau, Olivier; Dermitzakis, Emmanouil T; Meyer, Diogo.

PLoS Genet ; 15(4): e1008091, 2019 04.

Article in English | MEDLINE | ID: mdl-31009447

ABSTRACT

The HLA (Human Leukocyte Antigens) genes are well-documented targets of balancing selection, and variation at these loci is associated with many disease phenotypes. Variation in expression levels also influences disease susceptibility and resistance, but little information exists about the regulation and population-level patterns of expression. This results from the difficulty in mapping short reads originating from these highly polymorphic loci, and in accounting for the existence of several paralogues. We developed a computational pipeline to accurately estimate expression for HLA genes based on RNA-seq, improving both locus-level and allele-level estimates. First, reads are aligned to all known HLA sequences in order to infer HLA genotypes, then quantification of expression is carried out using a personalized index. We use simulations to show that expression estimates obtained in this way are not biased due to divergence from the reference genome. We applied our pipeline to the GEUVADIS dataset, and compared the quantifications to those obtained with the reference transcriptome. Although the personalized pipeline recovers more reads, we found that using the reference transcriptome produces estimates similar to the personalized pipeline (r ≥ 0.87) with the exception of HLA-DQA1. We describe the impact of the HLA-personalized approach on downstream analyses for nine classical HLA loci (HLA-A, HLA-C, HLA-B, HLA-DRA, HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DPA1, HLA-DPB1). Although the influence of the HLA-personalized approach is modest for eQTL mapping, the p-values and the causality of the eQTLs obtained are better than when the reference transcriptome is used. We investigate how the eQTLs we identified explain variation in expression among lineages of HLA alleles. Finally, we discuss possible causes underlying differences between expression estimates obtained using RNA-seq, antibody-based approaches and qPCR.

Subjects

Chromosome Mapping , Gene Expression , HLA Antigens/genetics , Quantitative Trait Loci , Alleles , Computational Biology/methods , Gene Frequency , Genotype , Haplotypes , Humans , Transcriptome

11.

Biased allelic expression in human primary fibroblast single cells.

Borel, Christelle; Ferreira, Pedro G; Santoni, Federico; Delaneau, Olivier; Fort, Alexandre; Popadin, Konstantin Y; Garieri, Marco; Falconnet, Emilie; Ribaux, Pascale; Guipponi, Michel; Padioleau, Ismael; Carninci, Piero; Dermitzakis, Emmanouil T; Antonarakis, Stylianos E.

Am J Hum Genet ; 96(1): 70-80, 2015 Jan 08.

Article in English | MEDLINE | ID: mdl-25557783

ABSTRACT

The study of gene expression in mammalian single cells via genomic technologies now provides the possibility to investigate the patterns of allelic gene expression. We used single-cell RNA sequencing to detect the allele-specific mRNA level in 203 single human primary fibroblasts over 133,633 unique heterozygous single-nucleotide variants (hetSNVs). We observed that at the snapshot of analyses, each cell contained mostly transcripts from one allele from the majority of genes; indeed, 76.4% of the hetSNVs displayed stochastic monoallelic expression in single cells. Remarkably, adjacent hetSNVs exhibited a haplotype-consistent allelic ratio; in contrast, distant sites located in two different genes were independent of the haplotype structure. Moreover, the allele-specific expression in single cells correlated with the abundance of the cellular transcript. We observed that genes expressing both alleles in the majority of the single cells at a given time point were rare and enriched with highly expressed genes. The relative abundance of each allele in a cell was controlled by some regulatory mechanisms given that we observed related single-cell allelic profiles according to genes. Overall, these results have direct implications in cellular phenotypic variability.

Subjects

Alleles , Fibroblasts/cytology , Human Genome , RNA Sequence Analysis , Complementary DNA/genetics , Complementary DNA/metabolism , Haplotypes , Heterozygote , Humans , Phenotype , Messenger RNA/genetics , Messenger RNA/metabolism , Single-Cell Analysis

12.

MBV: a method to solve sample mislabeling and detect technical bias in large combined genotype and sequencing assay datasets.

Fort, Alexandre; Panousis, Nikolaos I; Garieri, Marco; Antonarakis, Stylianos E; Lappalainen, Tuuli; Dermitzakis, Emmanouil T; Delaneau, Olivier.

Bioinformatics ; 33(12): 1895-1897, 2017 Jun 15.

Article in English | MEDLINE | ID: mdl-28186259

ABSTRACT

MOTIVATION: Large genomic datasets combining genotype and sequence data, such as for expression quantitative trait loci (eQTL) detection, require perfect matching between both data types. RESULTS: We describe here MBV (Match BAM to VCF), a method to quickly solve sample mislabeling and detect cross-sample contamination and PCR amplification bias. AVAILABILITY AND IMPLEMENTATION: MBV is implemented in C++ as an independent component of the QTLtools software package; the binary and source codes are freely available at https://qtltools.github.io/qtltools/. CONTACT: olivier.delaneau@unige.ch or emmanouil.dermitzakis@unige.ch. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Genotyping Techniques/methods , Quantitative Trait Loci , DNA Sequence Analysis/methods , Software , Bias , Genomics/methods , Genomics/standards , Genotyping Techniques/standards , Humans , DNA Sequence Analysis/standards

13.

Phasing for medical sequencing using rare variants and large haplotype reference panels.

Sharp, Kevin; Kretzschmar, Warren; Delaneau, Olivier; Marchini, Jonathan.

Bioinformatics ; 32(13): 1974-80, 2016 07 01.

Article in English | MEDLINE | ID: mdl-27153703

ABSTRACT

MOTIVATION: There is growing recognition that estimating haplotypes from high coverage sequencing of single samples in clinical settings is an important problem. At the same time very large datasets consisting of tens and hundreds of thousands of high-coverage sequenced samples will soon be available. We describe a method that takes advantage of these huge human genetic variation resources and rare variant sharing patterns to estimate haplotypes on single sequenced samples. Sharing rare variants between two individuals is more likely to arise from a recent common ancestor and, hence, also more likely to indicate similar shared haplotypes over a substantial flanking region of sequence. RESULTS: Our method exploits this idea to select a small set of highly informative copying states within a Hidden Markov Model (HMM) phasing algorithm. Using rare variants in this way allows us to avoid iterative MCMC methods to infer haplotypes. Compared to other approaches that do not explicitly use rare variants we obtain significant gains in phasing accuracy, less variation over phasing runs and improvements in speed. For example, using a reference panel of 7420 haplotypes from the UK10K project, we are able to reduce switch error rates by up to 50% when phasing samples sequenced at high-coverage. In addition, a single step rephasing of the UK10K panel, using rare variant information, has a downstream impact on phasing performance. These results represent a proof of concept that rare variant sharing patterns can be utilized to phase large high-coverage sequencing studies such as the 100 000 Genomes Project dataset. AVAILABILITY AND IMPLEMENTATION: A webserver that includes an implementation of this new method and allows phasing of high-coverage clinical samples is available at https://phasingserver.stats.ox.ac.uk/ CONTACT: marchini@stats.ox.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Computational Biology/methods , Genetic Variation , Haplotypes , Algorithms , Alleles , Genotype , Humans

14.

Fast and efficient QTL mapper for thousands of molecular phenotypes.

Ongen, Halit; Buil, Alfonso; Brown, Andrew Anand; Dermitzakis, Emmanouil T; Delaneau, Olivier.

Bioinformatics ; 32(10): 1479-85, 2016 05 15.

Article in English | MEDLINE | ID: mdl-26708335

ABSTRACT

MOTIVATION: In order to discover quantitative trait loci, multi-dimensional genomic datasets combining DNA-seq and ChIP-/RNA-seq require methods that rapidly correlate tens of thousands of molecular phenotypes with millions of genetic variants while appropriately controlling for multiple testing. RESULTS: We have developed FastQTL, a method that implements a popular cis-QTL mapping strategy in a user- and cluster-friendly tool. FastQTL also proposes an efficient permutation procedure to control for multiple testing. The outcome of permutations is modeled using beta distributions trained from a few permutations and from which adjusted P-values can be estimated at any level of significance with little computational cost. The Geuvadis & GTEx pilot datasets can now be easily analyzed an order of magnitude faster than previous approaches. AVAILABILITY AND IMPLEMENTATION: Source code, binaries and comprehensive documentation of FastQTL are freely available to download at http://fastqtl.sourceforge.net/ CONTACT: emmanouil.dermitzakis@unige.ch or olivier.delaneau@unige.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Quantitative Trait Loci , Genomics , Phenotype , Software , Statistical Distributions
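
As an illustration of the permutation scheme summarized in the FastQTL abstract above, the following Python sketch fits a beta distribution to a handful of null P-values by the method of moments. The function name `fit_beta_moments` and the input values are hypothetical. The method of moments is a stand-in chosen for brevity, since the abstract does not say how FastQTL itself trains its beta distributions.

```python
import statistics
from typing import Sequence, Tuple


def fit_beta_moments(null_pvalues: Sequence[float]) -> Tuple[float, float]:
    """Estimate the two shape parameters of a beta distribution from
    P-values obtained under permutation, using the method of moments.

    For a Beta(a, b) variable with mean m and variance v:
        a = m * (m * (1 - m) / v - 1)
        b = (1 - m) * (m * (1 - m) / v - 1)
    """
    mean = statistics.fmean(null_pvalues)
    var = statistics.pvariance(null_pvalues)
    if not 0.0 < mean < 1.0 or var <= 0.0:
        raise ValueError("need P-values in (0, 1) with non-zero spread")
    common = mean * (1.0 - mean) / var - 1.0
    if common <= 0.0:
        raise ValueError("variance too large for a beta distribution")
    return mean * common, (1.0 - mean) * common


if __name__ == "__main__":
    # Hypothetical best nominal P-values from a few permutations.
    shape_a, shape_b = fit_beta_moments([0.1, 0.2, 0.3])
    print(shape_a, shape_b)
```

With the fitted shape parameters, an adjusted P-value would be the beta cumulative distribution function evaluated at the observed nominal P-value. That step needs a statistics library and is not shown here.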

15.

A general approach for haplotype phasing across the full spectrum of relatedness.

O'Connell, Jared; Gurdasani, Deepti; Delaneau, Olivier; Pirastu, Nicola; Ulivi, Sheila; Cocca, Massimiliano; Traglia, Michela; Huang, Jie; Huffman, Jennifer E; Rudan, Igor; McQuillan, Ruth; Fraser, Ross M; Campbell, Harry; Polasek, Ozren; Asiki, Gershim; Ekoru, Kenneth; Hayward, Caroline; Wright, Alan F; Vitart, Veronique; Navarro, Pau; Zagury, Jean-Francois; Wilson, James F; Toniolo, Daniela; Gasparini, Paolo; Soranzo, Nicole; Sandhu, Manjinder S; Marchini, Jonathan.

PLoS Genet ; 10(4): e1004234, 2014 Apr.

Article in English | MEDLINE | ID: mdl-24743097

ABSTRACT

Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally 'unrelated' individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors, detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.

Subjects

Haplotypes/genetics , Chromosome Mapping/methods , Cohort Effect , Family , Genotype , Humans , Genetic Models , Pedigree , Phenotype , Genetic Recombination/genetics
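
The switch error rate used as the accuracy measure in the abstract above can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the evaluation code of the study: the function name `switch_error_rate` and the toy genotypes are hypothetical, and each site is assumed to be given as an ordered pair of alleles (first haplotype, second haplotype).

```python
from typing import List, Sequence, Tuple

Site = Tuple[int, int]


def switch_error_rate(truth: Sequence[Site], estimate: Sequence[Site]) -> float:
    """Fraction of consecutive heterozygous sites whose relative phase
    in `estimate` disagrees with `truth`.

    Homozygous sites carry no phase information and are skipped.
    """
    orientations: List[bool] = []
    for (t0, t1), (e0, e1) in zip(truth, estimate):
        if t0 == t1:
            continue  # homozygous in the truth set: not informative
        if (e0, e1) == (t0, t1):
            orientations.append(True)   # same orientation as truth
        elif (e0, e1) == (t1, t0):
            orientations.append(False)  # flipped relative to truth
        else:
            raise ValueError("estimate and truth genotypes disagree")
    if len(orientations) < 2:
        return 0.0
    # A switch occurs whenever the orientation changes between
    # two neighbouring heterozygous sites.
    switches = sum(a != b for a, b in zip(orientations, orientations[1:]))
    return switches / (len(orientations) - 1)


if __name__ == "__main__":
    # Hypothetical example: 5 sites, the 4th is homozygous.
    true_haps = [(0, 1), (1, 0), (0, 1), (1, 1), (0, 1)]
    est_haps = [(0, 1), (1, 0), (1, 0), (1, 1), (1, 0)]
    print(switch_error_rate(true_haps, est_haps))
```

Counting changes of orientation rather than mismatched sites means that a single wrong switch followed by a long, internally consistent stretch is penalized once, which is why the measure is commonly reported alongside the distance between switch errors.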

16.

Haplotype estimation using sequencing reads.

Delaneau, Olivier; Howie, Bryan; Cox, Anthony J; Zagury, Jean-François; Marchini, Jonathan.

Am J Hum Genet ; 93(4): 687-96, 2013 Oct 03.

Article in English | MEDLINE | ID: mdl-24094745

ABSTRACT

High-throughput sequencing technologies produce short sequence reads that can contain phase information if they span two or more heterozygote genotypes. This information is not routinely used by current methods that infer haplotypes from genotype data. We have extended the SHAPEIT2 method to use phase-informative sequencing reads to improve phasing accuracy. Our model incorporates the read information in a probabilistic model through base quality scores within each read. The method is primarily designed for high-coverage sequence data or data sets that already have genotypes called. One important application is phasing of single samples sequenced at high coverage for use in medical sequencing and studies of rare diseases. Our method can also use existing panels of reference haplotypes. We tested the method by using a mother-father-child trio sequenced at high-coverage by Illumina together with the low-coverage sequence data from the 1000 Genomes Project (1000GP). We found that use of phase-informative reads increases the mean distance between switch errors by 22% from 274.4 kb to 328.6 kb. We also used male chromosome X haplotypes from the 1000GP samples to simulate sequencing reads with varying insert size, read length, and base error rate. When using short 100 bp paired-end reads, we found that using mixtures of insert sizes produced the best results. When using longer reads with high error rates (5-20 kb read with 4%-15% error per base), phasing performance was substantially improved.

Subjects

Human Genome , Haplotypes/genetics , DNA Sequence Analysis/methods , Child , Fathers , Female , Genotype , Humans , Male , Genetic Models , Mothers , Single Nucleotide Polymorphism

17.

Association study of common genetic variants and HIV-1 acquisition in 6,300 infected cases and 7,200 controls.

McLaren, Paul J; Coulonges, Cédric; Ripke, Stephan; van den Berg, Leonard; Buchbinder, Susan; Carrington, Mary; Cossarizza, Andrea; Dalmau, Judith; Deeks, Steven G; Delaneau, Olivier; De Luca, Andrea; Goedert, James J; Haas, David; Herbeck, Joshua T; Kathiresan, Sekar; Kirk, Gregory D; Lambotte, Olivier; Luo, Ma; Mallal, Simon; van Manen, Daniëlle; Martinez-Picado, Javier; Meyer, Laurence; Miro, José M; Mullins, James I; Obel, Niels; O'Brien, Stephen J; Pereyra, Florencia; Plummer, Francis A; Poli, Guido; Qi, Ying; Rucart, Pierre; Sandhu, Manj S; Shea, Patrick R; Schuitemaker, Hanneke; Theodorou, Ioannis; Vannberg, Fredrik; Veldink, Jan; Walker, Bruce D; Weintrob, Amy; Winkler, Cheryl A; Wolinsky, Steven; Telenti, Amalio; Goldstein, David B; de Bakker, Paul I W; Zagury, Jean-François; Fellay, Jacques.

PLoS Pathog ; 9(7): e1003515, 2013.

Article in English | MEDLINE | ID: mdl-23935489

ABSTRACT

Multiple genome-wide association studies (GWAS) have been performed in HIV-1 infected individuals, identifying common genetic influences on viral control and disease course. Similarly, common genetic correlates of acquisition of HIV-1 after exposure have been interrogated using GWAS, although in generally small samples. Under the auspices of the International Collaboration for the Genomics of HIV, we have combined the genome-wide single nucleotide polymorphism (SNP) data collected by 25 cohorts, studies, or institutions on HIV-1 infected individuals and compared them to carefully matched population-level data sets (a list of all collaborators appears in Note S1 in Text S1). After imputation using the 1,000 Genomes Project reference panel, we tested approximately 8 million common DNA variants (SNPs and indels) for association with HIV-1 acquisition in 6,334 infected patients and 7,247 population samples of European ancestry. Initial association testing identified the SNP rs4418214, the C allele of which is known to tag the HLA-B*57:01 and B*27:05 alleles, as genome-wide significant (p = 3.6 × 10⁻¹¹). However, restricting analysis to individuals with a known date of seroconversion suggested that this association was due to the frailty bias in studies of lethal diseases. Further analyses including testing recessive genetic models, testing for bulk effects of non-genome-wide significant variants, stratifying by sexual or parenteral transmission risk and testing previously reported associations showed no evidence for genetic influence on HIV-1 acquisition (with the exception of CCR5Δ32 homozygosity). Thus, these data suggest that genetic influences on HIV acquisition are either rare or have smaller effects than can be detected by this sample size.

Subjects

HIV Infections/genetics , HIV-1/physiology , Host-Pathogen Interactions , Polymorphism, Single Nucleotide , Case-Control Studies , Cohort Studies , Genetic Predisposition to Disease , Genome-Wide Association Study , HIV Infections/virology , Humans , White People

18.

Evidence after imputation for a role of MICA variants in nonprogression and elite control of HIV type 1 infection.

Le Clerc, Sigrid; Delaneau, Olivier; Coulonges, Cédric; Spadoni, Jean-Louis; Labib, Taoufik; Laville, Vincent; Ulveling, Damien; Noirel, Josselin; Montes, Matthieu; Schächter, François; Caillat-Zucman, Sophie; Zagury, Jean-François.

J Infect Dis ; 210(12): 1946-50, 2014 Dec 15.

Article in English | MEDLINE | ID: mdl-24939907

ABSTRACT

Past genome-wide association studies (GWAS) involving individuals with AIDS have mainly identified associations in the HLA region. Using the latest software, we imputed 7 million single-nucleotide polymorphisms (SNPs)/indels of the 1000 Genomes Project from the GWAS-determined genotypes of individuals in the Genomics of Resistance to Immunodeficiency Virus AIDS nonprogression cohort and compared them with those of control cohorts. The strongest signals were in MICA, the gene encoding major histocompatibility class I polypeptide-related sequence A (P = 3.31 × 10⁻¹²), with a particular exonic deletion (P = 1.59 × 10⁻⁸) in full linkage disequilibrium with the reference HCP5 rs2395029 SNP. Haplotype analysis also revealed an additive effect between HLA-C, HLA-B, and MICA variants. These data suggest a role for MICA in progression and elite control of human immunodeficiency virus type 1 infection.

Subjects

Disease Resistance , HIV Infections/immunology , HIV-1/immunology , Histocompatibility Antigens Class I/genetics , Adult , Cohort Studies , Female , Genetic Association Studies , HIV Infections/virology , Haplotypes , Humans , Linkage Disequilibrium , Major Histocompatibility Complex/genetics , Male , Middle Aged , Polymorphism, Single Nucleotide , RNA, Long Noncoding , RNA, Untranslated , Young Adult

19.

A linear complexity phasing method for thousands of genomes.

Delaneau, Olivier; Marchini, Jonathan; Zagury, Jean-François.

Nat Methods ; 9(2): 179-81, 2011 Dec 04.

Article in English | MEDLINE | ID: mdl-22138821

ABSTRACT

Human-disease etiology can be better understood with phase information about diploid sequences. We present a method for estimating haplotypes, using genotype data from unrelated samples or small nuclear families, that leads to improved accuracy and speed compared to several widely used methods. The method, segmented haplotype estimation and imputation tool (SHAPEIT), scales linearly with the number of haplotypes used in each iteration and can be run efficiently on whole chromosomes.

Subjects

Genome , Haplotypes

20.

A resampling-based approach to share reference panels.

Cavinato, Théo; Rubinacci, Simone; Malaspinas, Anna-Sapfo; Delaneau, Olivier.

Nat Comput Sci ; 4(5): 360-366, 2024 May.

Article in English | MEDLINE | ID: mdl-38745108

ABSTRACT

For many genome-wide association studies, imputing genotypes from a haplotype reference panel is a necessary step. Over the past 15 years, reference panels have become larger and more diverse, leading to improvements in imputation accuracy. However, the latest generation of reference panels is subject to restrictions on data sharing due to concerns about privacy, limiting their usefulness for genotype imputation. In this context, here we propose RESHAPE, a method that employs a recombination Poisson process on a reference panel to simulate the genomes of hypothetical descendants after multiple generations. This data transformation helps to protect against re-identification threats and preserves data attributes, such as linkage disequilibrium patterns and, to some degree, identity-by-descent sharing, allowing for genotype imputation. Our experiments on gold-standard datasets show that simulated descendants up to eight generations can serve as reference panels without substantially reducing genotype imputation accuracy.

Subjects

Genome-Wide Association Study , Genotype , Humans , Genome-Wide Association Study/methods , Linkage Disequilibrium , Haplotypes/genetics , Polymorphism, Single Nucleotide/genetics , Information Dissemination/methods , Computer Simulation , Models, Genetic , Algorithms , Genome, Human/genetics , Poisson Distribution
