Search | VHL Regional Portal

1.

Patient-derived organoid biobank identifies epigenetic dysregulation of intestinal epithelial MHC-I as a novel mechanism in severe Crohn's Disease.

Dennison, Thomas W; Edgar, Rachel D; Payne, Felicity; Nayak, Komal M; Ross, Alexander D B; Cenier, Aurelie; Glemas, Claire; Giachero, Federica; Foster, April R; Harris, Rebecca; Kraiczy, Judith; Salvestrini, Camilla; Stavrou, Georgia; Torrente, Franco; Brook, Kimberley; Trayers, Claire; Elmentaite, Rasa; Youssef, Gehad; Tél, Bálint; Winton, Douglas James; Skoufou-Papoutsaki, Nefeli; Adler, Sam; Bufler, Philip; Azabdaftari, Aline; Jenke, Andreas; G, Natasha; Thomas, Natasha; Miele, Erasmo; Al-Mohammad, Abdulrahman; Guarda, Greta; Kugathasan, Subra; Venkateswaran, Suresh; Clatworthy, Menna R; Castro-Dopico, Tomas; Suchanek, Ondrej; Strisciuglio, Caterina; Gasparetto, Marco; Lee, Seokjun; Xu, Xingze; Bello, Erica; Han, Namshik; Zerbino, Daniel R; Teichmann, Sarah A; Nys, Josquin; Heuschkel, Robert; Perrone, Francesca; Zilbauer, Matthias.

Gut ; 2024 Jun 10.

Article in English | MEDLINE | ID: mdl-38857990

ABSTRACT

OBJECTIVE: Epigenetic mechanisms, including DNA methylation (DNAm), have been proposed to play a key role in Crohn's disease (CD) pathogenesis. However, the specific cell types and pathways affected as well as their potential impact on disease phenotype and outcome remain unknown. We set out to investigate the role of intestinal epithelial DNAm in CD pathogenesis. DESIGN: We generated 312 intestinal epithelial organoids (IEOs) from mucosal biopsies of 168 patients with CD (n=72), UC (n=23) and healthy controls (n=73). We performed genome-wide molecular profiling including DNAm, bulk as well as single-cell RNA sequencing. Organoids were subjected to gene editing and the functional consequences of DNAm changes evaluated using an organoid-lymphocyte coculture and a nucleotide-binding oligomerisation domain, leucine-rich repeat and CARD domain containing 5 (NLRC5) dextran sulphate sodium (DSS) colitis knock-out mouse model. RESULTS: We identified highly stable, CD-associated loss of DNAm at major histocompatibility complex (MHC) class 1 loci including NLRC5 and cognate gene upregulation. Single-cell RNA sequencing of primary mucosal tissue and IEOs confirmed the role of NLRC5 as transcriptional transactivator in the intestinal epithelium. Increased mucosal MHC-I and NLRC5 expression in adult and paediatric patients with CD was validated in additional cohorts and the functional role of MHC-I highlighted by demonstrating a relative protection from DSS-mediated mucosal inflammation in NLRC5-deficient mice. MHC-I DNAm in IEOs showed a significant correlation with CD disease phenotype and outcomes. Application of machine learning approaches enabled the development of a disease prognostic epigenetic molecular signature. CONCLUSIONS: Our study has identified epigenetically regulated intestinal epithelial MHC-I as a novel mechanism in CD pathogenesis.

2.

Culture-Associated DNA Methylation Changes Impact on Cellular Function of Human Intestinal Organoids.

Edgar, Rachel D; Perrone, Francesca; Foster, April R; Payne, Felicity; Lewis, Sophia; Nayak, Komal M; Kraiczy, Judith; Cenier, Aurélie; Torrente, Franco; Salvestrini, Camilla; Heuschkel, Robert; Hensel, Kai O; Harris, Rebecca; Jones, D Leanne; Zerbino, Daniel R; Zilbauer, Matthias.

Cell Mol Gastroenterol Hepatol ; 14(6): 1295-1310, 2022.

Article in English | MEDLINE | ID: mdl-36038072

ABSTRACT

BACKGROUND & AIMS: Human intestinal epithelial organoids (IEOs) are a powerful tool to model major aspects of intestinal development, health, and diseases because patient-derived cultures retain many features found in vivo. A necessary aspect of the organoid model is the requirement to expand cultures in vitro through several rounds of passaging. This is of concern because the passaging of cells has been shown to affect cell morphology, ploidy, and function. METHODS: Here, we analyzed 173 human IEO lines derived from the small and large bowel and examined the effect of culture duration on DNA methylation (DNAm). Furthermore, we tested the potential impact of DNAm changes on gene expression and cellular function. RESULTS: Our analyses show a reproducible effect of culture duration on DNAm in a large discovery cohort as well as 2 publicly available validation cohorts generated in different laboratories. Although methylation changes were seen in only approximately 8% of tested cytosine-phosphate-guanine dinucleotides (CpGs) and global cellular function remained stable, a subset of methylation changes correlated with altered gene expression at baseline as well as in response to inflammatory cytokine exposure and withdrawal of Wnt agonists. Importantly, epigenetic changes were found to be enriched in genomic regions associated with colonic cancer and distant to the site of replication, indicating similarities to malignant transformation. CONCLUSIONS: Our study shows distinct culture-associated epigenetic changes in mucosa-derived human IEOs, some of which appear to impact gene transcription and cellular function. These findings highlight the need for future studies in this area and the importance of considering passage number as a potentially confounding factor.

Subject(s)

DNA Methylation , Organoids , Humans , Intestines , Epigenesis, Genetic , Intestinal Mucosa

3.

Author Correction: Perspectives on ENCODE.

Snyder, Michael P; Gingeras, Thomas R; Moore, Jill E; Weng, Zhiping; Gerstein, Mark B; Ren, Bing; Hardison, Ross C; Stamatoyannopoulos, John A; Graveley, Brenton R; Feingold, Elise A; Pazin, Michael J; Pagan, Michael; Gilchrist, Daniel A; Hitz, Benjamin C; Cherry, J Michael; Bernstein, Bradley E; Mendenhall, Eric M; Zerbino, Daniel R; Frankish, Adam; Flicek, Paul; Myers, Richard M.

Nature ; 605(7909): E4, 2022 May.

Article in English | MEDLINE | ID: mdl-35474002

4.

The Ensembl COVID-19 resource: ongoing integration of public SARS-CoV-2 data.

De Silva, Nishadi H; Bhai, Jyothish; Chakiachvili, Marc; Contreras-Moreira, Bruno; Cummins, Carla; Frankish, Adam; Gall, Astrid; Genez, Thiago; Howe, Kevin L; Hunt, Sarah E; Martin, Fergal J; Moore, Benjamin; Ogeh, Denye; Parker, Anne; Parton, Andrew; Ruffier, Magali; Sakthivel, Manoj Pandian; Sheppard, Dan; Tate, John; Thormann, Anja; Thybert, David; Trevanion, Stephen J; Winterbottom, Andrea; Zerbino, Daniel R; Finn, Robert D; Flicek, Paul; Yates, Andrew D.

Nucleic Acids Res ; 50(D1): D765-D770, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34634797

ABSTRACT

The COVID-19 pandemic has seen unprecedented use of SARS-CoV-2 genome sequencing for epidemiological tracking and identification of emerging variants. Understanding the potential impact of these variants on the infectivity of the virus and the efficacy of emerging therapeutics and vaccines has become a cornerstone of the fight against the disease. To support the maximal use of genomic information for SARS-CoV-2 research, we launched the Ensembl COVID-19 browser; the first virus to be encompassed within the Ensembl platform. This resource incorporates a new Ensembl gene set, multiple variant sets, and annotation from several relevant resources aligned to the reference SARS-CoV-2 assembly. Since the first release in May 2020, the content has been regularly updated using our new rapid release workflow, and tools such as the Ensembl Variant Effect Predictor have been integrated. The Ensembl COVID-19 browser is freely available at https://covid-19.ensembl.org.

Subject(s)

COVID-19/virology , Databases, Genetic , SARS-CoV-2/genetics , Web Browser , Coronaviridae/genetics , Genetic Variation , Genome, Viral , Humans , Molecular Sequence Annotation

5.

The gene regulation knowledge commons: the action area of GREEKC.

Kuiper, Martin; Bonello, Joseph; Fernández-Breis, Jesualdo T; Bucher, Philipp; Futschik, Matthias E; Gaudet, Pascale; Kulakovskiy, Ivan V; Licata, Luana; Logie, Colin; Lovering, Ruth C; Makeev, Vsevolod J; Orchard, Sandra; Panni, Simona; Perfetto, Livia; Sant, David; Schulz, Stefan; Vercruysse, Steven; Zerbino, Daniel R; Lægreid, Astrid.

Biochim Biophys Acta Gene Regul Mech ; 1865(1): 194768, 2022 01.

Article in English | MEDLINE | ID: mdl-34757206

ABSTRACT

As computational modeling becomes more essential to analyze and understand biological regulatory mechanisms, governance of the many databases and knowledge bases that support this domain is crucial to guarantee reliability and interoperability of resources. To address this, the COST Action Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC, CA15205, www.greekc.org) organized nine workshops in a four-year period, starting September 2016. The workshops brought together a wide range of experts from all over the world working on various steps in the knowledge management process that focuses on understanding gene regulatory mechanisms. The discussions between ontologists, curators, text miners, biologists, bioinformaticians, philosophers and computational scientists spawned a host of activities aimed to standardize and update existing knowledge management workflows and involve end-users in the process of designing the Gene Regulation Knowledge Commons (GRKC). Here the GREEKC consortium describes its main achievements in improving this GRKC.

Subject(s)

Gene Expression Regulation , Reproducibility of Results

6.

A compendium of uniformly processed human gene expression and splicing quantitative trait loci.

Kerimov, Nurlan; Hayhurst, James D; Peikova, Kateryna; Manning, Jonathan R; Walter, Peter; Kolberg, Liis; Samovica, Marija; Sakthivel, Manoj Pandian; Kuzmin, Ivan; Trevanion, Stephen J; Burdett, Tony; Jupp, Simon; Parkinson, Helen; Papatheodorou, Irene; Yates, Andrew D; Zerbino, Daniel R; Alasoo, Kaur.

Nat Genet ; 53(9): 1290-1299, 2021 09.

Article in English | MEDLINE | ID: mdl-34493866

ABSTRACT

Many gene expression quantitative trait locus (eQTL) studies have published their summary statistics, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and co-localization. However, technical differences between these datasets are a barrier to their widespread use. Consequently, target genes for most genome-wide association study (GWAS) signals have still not been identified. In the present study, we present the eQTL Catalogue ( https://www.ebi.ac.uk/eqtl ), a resource of quality-controlled, uniformly re-computed gene expression and splicing QTLs from 21 studies. We find that, for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies. Although most QTLs were shared between most bulk tissues, we identified a greater diversity of cell-type-specific QTLs from purified cell types, a subset of which also manifested as new disease co-localizations. Our summary statistics are freely available to enable the systematic interpretation of human GWAS associations across many cell types and tissues.

Subject(s)

Databases, Genetic , Gene Expression Regulation/genetics , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , CD4-Positive T-Lymphocytes/cytology , Datasets as Topic , Genome-Wide Association Study , Humans , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics

7.

Transcription and DNA Methylation Patterns of Blood-Derived CD8⁺ T Cells Are Associated With Age and Inflammatory Bowel Disease But Do Not Predict Prognosis.

Gasparetto, Marco; Payne, Felicity; Nayak, Komal; Kraiczy, Judith; Glemas, Claire; Philip-McKenzie, Yosef; Ross, Alexander; Edgar, Rachel D; Zerbino, Daniel R; Salvestrini, Camilla; Torrente, Franco; Ventham, Nicholas T; Kalla, Rahul; Satsangi, Jack; Sarkies, Peter; Heuschkel, Robert; Zilbauer, Matthias.

Gastroenterology ; 160(1): 232-244.e7, 2021 01.

Article in English | MEDLINE | ID: mdl-32814113

ABSTRACT

BACKGROUND & AIMS: Gene expression patterns of CD8+ T cells have been reported to correlate with clinical outcomes of adults with inflammatory bowel diseases (IBD). We aimed to validate these findings in independent patient cohorts. METHODS: We obtained peripheral blood samples from 112 children with a new diagnosis of IBD (71 with Crohn's disease and 41 with ulcerative colitis) and 19 children without IBD (controls) and recorded medical information on disease activity and outcomes. CD8+ T cells were isolated from blood samples by magnetic bead sorting at the point of diagnosis and during the course of disease. Genome-wide transcription (n = 192) and DNA methylation (n = 66) profiles were generated using Affymetrix and Illumina arrays, respectively. Publicly available transcriptomes and DNA methylomes of CD8+ T cells from 3 adult patient cohorts with and without IBD were included in data analyses. RESULTS: Previously reported CD8+ T-cell prognostic expression and exhaustion signatures were only found in the original adult IBD patient cohort. These signatures could not be detected in either a pediatric or a second adult IBD cohort. In contrast, an association between CD8+ T-cell gene expression with age and sex was detected across all 3 cohorts. CD8+ gene transcription was clearly associated with IBD in the 2 cohorts that included non-IBD controls. Lastly, DNA methylation profiles of CD8+ T cells from children with Crohn's disease correlated with age but not with disease outcome. CONCLUSIONS: We were unable to validate previously reported findings of an association between CD8+ T-cell gene transcription and disease outcome in IBD. Our findings reveal the challenges of developing prognostic biomarkers for patients with IBD and the importance of their validation in large, independent cohorts before clinical application.

Subject(s)

CD8-Positive T-Lymphocytes/physiology , Inflammatory Bowel Diseases/diagnosis , Inflammatory Bowel Diseases/etiology , Adolescent , Adult , Age Factors , Case-Control Studies , Child , Child, Preschool , DNA Methylation , Female , Humans , Male , Predictive Value of Tests , Prognosis , Transcription, Genetic , Young Adult

8.

Perspectives on ENCODE.

Snyder, Michael P; Gingeras, Thomas R; Moore, Jill E; Weng, Zhiping; Gerstein, Mark B; Ren, Bing; Hardison, Ross C; Stamatoyannopoulos, John A; Graveley, Brenton R; Feingold, Elise A; Pazin, Michael J; Pagan, Michael; Gilchrist, Daniel A; Hitz, Benjamin C; Cherry, J Michael; Bernstein, Bradley E; Mendenhall, Eric M; Zerbino, Daniel R; Frankish, Adam; Flicek, Paul; Myers, Richard M.

Nature ; 583(7818): 693-698, 2020 07.

Article in English | MEDLINE | ID: mdl-32728248

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.

Subject(s)

Databases, Genetic , Genome/genetics , Genomics , Molecular Sequence Annotation , Animals , Binding Sites , Chromatin/genetics , Chromatin/metabolism , DNA Methylation , Databases, Genetic/standards , Databases, Genetic/trends , Gene Expression Regulation/genetics , Genome, Human/genetics , Genomics/standards , Genomics/trends , Histones/metabolism , Humans , Mice , Molecular Sequence Annotation/standards , Quality Control , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism

9.

Progress, Challenges, and Surprises in Annotating the Human Genome.

Zerbino, Daniel R; Frankish, Adam; Flicek, Paul.

Annu Rev Genomics Hum Genet ; 21: 55-79, 2020 08 31.

Article in English | MEDLINE | ID: mdl-32421357

ABSTRACT

Our understanding of the human genome has continuously expanded since its draft publication in 2001. Over the years, novel assays have allowed us to progressively overlay layers of knowledge above the raw sequence of A's, T's, G's, and C's. The reference human genome sequence is now a complex knowledge base maintained under the shared stewardship of multiple specialist communities. Its complexity stems from the fact that it is simultaneously a template for transcription, a record of evolution, a vehicle for genetics, and a functional molecule. In short, the human genome serves as a frame of reference at the intersection of a diversity of scientific fields. In recent years, the progressive fall in sequencing costs has given increasing importance to the quality of the human reference genome, as hundreds of thousands of individuals are being sequenced yearly, often for clinical applications. Also, novel sequencing-based assays shed light on novel functions of the genome, especially with respect to gene expression regulation. Keeping the human genome annotation up to date and accurate is therefore an ongoing partnership between reference annotation projects and the greater community worldwide.

Subject(s)

Genome, Human , Molecular Sequence Annotation/methods , Molecular Sequence Annotation/standards , Humans

10.

Sequence tube maps: making graph genomes intuitive to commuters.

Beyer, Wolfgang; Novak, Adam M; Hickey, Glenn; Chan, Jeffrey; Tan, Vanessa; Paten, Benedict; Zerbino, Daniel R.

Bioinformatics ; 35(24): 5318-5320, 2019 12 15.

Article in English | MEDLINE | ID: mdl-31368484

ABSTRACT

MOTIVATION: Compared to traditional haploid reference genomes, graph genomes are an efficient and compact data structure for storing multiple genomic sequences, for storing polymorphisms or for mapping sequencing reads with greater sensitivity. Further, graphs are well-studied computer science objects that can be efficiently analyzed. However, their adoption in genomic research is slow, in part because of the cognitive difficulty in interpreting graphs. RESULTS: We present an intuitive graphical representation for graph genomes that re-uses well-honed techniques developed to display public transport networks, and demonstrate it as a web tool. AVAILABILITY AND IMPLEMENTATION: Code: https://github.com/vgteam/sequenceTubeMap. DEMONSTRATION: https://vgteam.github.io/sequenceTubeMap/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Algorithms , Genome , Software , Genomics , Sequence Analysis, DNA

11.

Ensembl 2018.

Zerbino, Daniel R; Achuthan, Premanand; Akanni, Wasiu; Amode, M Ridwan; Barrell, Daniel; Bhai, Jyothish; Billis, Konstantinos; Cummins, Carla; Gall, Astrid; Girón, Carlos García; Gil, Laurent; Gordon, Leo; Haggerty, Leanne; Haskell, Erin; Hourlier, Thibaut; Izuogu, Osagie G; Janacek, Sophie H; Juettemann, Thomas; To, Jimmy Kiang; Laird, Matthew R; Lavidas, Ilias; Liu, Zhicheng; Loveland, Jane E; Maurel, Thomas; McLaren, William; Moore, Benjamin; Mudge, Jonathan; Murphy, Daniel N; Newman, Victoria; Nuhn, Michael; Ogeh, Denye; Ong, Chuang Kee; Parker, Anne; Patricio, Mateus; Riat, Harpreet Singh; Schuilenburg, Helen; Sheppard, Dan; Sparrow, Helen; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Walts, Brandon; Zadissa, Amonida; Frankish, Adam; Hunt, Sarah E; Kostadima, Myrto; Langridge, Nicholas; Martin, Fergal J; Muffato, Matthieu; Perry, Emily.

Nucleic Acids Res ; 46(D1): D754-D761, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29155950

ABSTRACT

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.

Subject(s)

Databases, Genetic , Datasets as Topic , Genome , Information Dissemination , Animals , Epigenomics , Genome, Human , Genome-Wide Association Study , Genomics , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation , Vertebrates/genetics , Web Browser

12.

Ensembl 2017.

Aken, Bronwen L; Achuthan, Premanand; Akanni, Wasiu; Amode, M Ridwan; Bernsdorff, Friederike; Bhai, Jyothish; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E; Janacek, Sophie H; Juettemann, Thomas; Keenan, Stephen; Laird, Matthew R; Lavidas, Ilias; Maurel, Thomas; McLaren, William; Moore, Benjamin; Murphy, Daniel N; Nag, Rishi; Newman, Victoria; Nuhn, Michael; Ong, Chuang Kee; Parker, Anne; Patricio, Mateus; Riat, Harpreet Singh; Sheppard, Daniel; Sparrow, Helen; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Walts, Brandon; Wilder, Steven P; Zadissa, Amonida; Kostadima, Myrto; Martin, Fergal J; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Staines, Daniel M; Trevanion, Stephen J; Cunningham, Fiona; Yates, Andrew; Zerbino, Daniel R; Flicek, Paul.

Nucleic Acids Res ; 45(D1): D635-D642, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899575

ABSTRACT

Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.

Subject(s)

Computational Biology/methods , Databases, Genetic , Genomics/methods , Search Engine , Software , Web Browser , Animals , Data Mining , Evolution, Molecular , Gene Expression Regulation , Genetic Variation , Genome, Human , Humans , Molecular Sequence Annotation , Species Specificity , Vertebrates

13.

Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters.

Javierre, Biola M; Burren, Oliver S; Wilder, Steven P; Kreuzhuber, Roman; Hill, Steven M; Sewitz, Sven; Cairns, Jonathan; Wingett, Steven W; Várnai, Csilla; Thiecke, Michiel J; Burden, Frances; Farrow, Samantha; Cutler, Antony J; Rehnström, Karola; Downes, Kate; Grassi, Luigi; Kostadima, Myrto; Freire-Pritchett, Paula; Wang, Fan; Stunnenberg, Hendrik G; Todd, John A; Zerbino, Daniel R; Stegle, Oliver; Ouwehand, Willem H; Frontini, Mattia; Wallace, Chris; Spivakov, Mikhail; Fraser, Peter.

Cell ; 167(5): 1369-1384.e19, 2016 11 17.

Article in English | MEDLINE | ID: mdl-27863249

ABSTRACT

Long-range interactions between regulatory elements and gene promoters play key roles in transcriptional regulation. The vast majority of interactions are uncharted, constituting a major missing link in understanding genome control. Here, we use promoter capture Hi-C to identify interacting regions of 31,253 promoters in 17 human primary hematopoietic cell types. We show that promoter interactions are highly cell type specific and enriched for links between active promoters and epigenetically marked enhancers. Promoter interactomes reflect lineage relationships of the hematopoietic tree, consistent with dynamic remodeling of nuclear architecture during differentiation. Interacting regions are enriched in genetic variants linked with altered expression of genes they contact, highlighting their functional role. We exploit this rich resource to connect non-coding disease variants to putative target promoters, prioritizing thousands of disease-candidate genes and implicating disease pathways. Our results demonstrate the power of primary cell promoter interactomes to reveal insights into genomic regulatory mechanisms underlying common diseases.

Subject(s)

Blood Cells/cytology , Disease/genetics , Promoter Regions, Genetic , Cell Lineage , Cell Separation , Chromatin , Enhancer Elements, Genetic , Epigenomics , Genetic Predisposition to Disease , Genome-Wide Association Study , Hematopoiesis , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci

14.

Representing and decomposing genomic structural variants as balanced integer flows on sequence graphs.

Zerbino, Daniel R; Ballinger, Tracy; Paten, Benedict; Hickey, Glenn; Haussler, David.

BMC Bioinformatics ; 17(1): 400, 2016 Sep 29.

Article in English | MEDLINE | ID: mdl-27687569

ABSTRACT

BACKGROUND: The study of genomic variation has provided key insights into the functional role of mutations. Predominantly, studies have focused on single nucleotide variants (SNV), which are relatively easy to detect and can be described with rich mathematical models. However, it has been observed that genomes are highly plastic, and that whole regions can be moved, removed or duplicated in bulk. These structural variants (SV) have been shown to have significant impact on phenotype, but their study has been held back by the combinatorial complexity of the underlying models. RESULTS: We describe here a general model of structural variation that encompasses both balanced rearrangements and arbitrary copy-number variants (CNV). CONCLUSIONS: In this model, we show that the space of possible evolutionary histories that explain the structural differences between any two genomes can be sampled ergodically.

15.

Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences.

Poznik, G David; Xue, Yali; Mendez, Fernando L; Willems, Thomas F; Massaia, Andrea; Wilson Sayres, Melissa A; Ayub, Qasim; McCarthy, Shane A; Narechania, Apurva; Kashin, Seva; Chen, Yuan; Banerjee, Ruby; Rodriguez-Flores, Juan L; Cerezo, Maria; Shao, Haojing; Gymrek, Melissa; Malhotra, Ankit; Louzada, Sandra; Desalle, Rob; Ritchie, Graham R S; Cerveira, Eliza; Fitzgerald, Tomas W; Garrison, Erik; Marcketta, Anthony; Mittelman, David; Romanovitch, Mallory; Zhang, Chengsheng; Zheng-Bradley, Xiangqun; Abecasis, Gonçalo R; McCarroll, Steven A; Flicek, Paul; Underhill, Peter A; Coin, Lachlan; Zerbino, Daniel R; Yang, Fengtang; Lee, Charles; Clarke, Laura; Auton, Adam; Erlich, Yaniv; Handsaker, Robert E; Bustamante, Carlos D; Tyler-Smith, Chris.

Nat Genet ; 48(6): 593-9, 2016 06.

Article in English | MEDLINE | ID: mdl-27111036

ABSTRACT

We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.

Subject(s)

Chromosomes, Human, Y , Demography , Haplotypes , Humans , Male , Mutation , Phylogeny , Polymorphism, Single Nucleotide

16.

Ensembl regulation resources.

Zerbino, Daniel R; Johnson, Nathan; Juetteman, Thomas; Sheppard, Dan; Wilder, Steven P; Lavidas, Ilias; Nuhn, Michael; Perry, Emily; Raffaillac-Desfosses, Quentin; Sobral, Daniel; Keefe, Damian; Gräf, Stefan; Ahmed, Ikhlak; Kinsella, Rhoda; Pritchard, Bethan; Brent, Simon; Amode, Ridwan; Parker, Anne; Trevanion, Steven; Birney, Ewan; Dunham, Ian; Flicek, Paul.

Database (Oxford) ; 2016.

Article in English | MEDLINE | ID: mdl-26888907

ABSTRACT

New experimental techniques in epigenomics allow researchers to assay a diversity of highly dynamic features such as histone marks, DNA modifications or chromatin structure. The study of their fluctuations should provide insights into gene expression regulation, cell differentiation and disease. The Ensembl project collects and maintains the Ensembl regulation data resources on epigenetic marks, transcription factor binding and DNA methylation for human and mouse, as well as microarray probe mappings and annotations for a variety of chordate genomes. From this data, we produce a functional annotation of the regulatory elements along the human and mouse genomes with plans to expand to other species as data becomes available. Starting from well-studied cell lines, we will progressively expand our library of measurements to a greater variety of samples. Ensembl's regulation resources provide a central and easy-to-query repository for reference epigenomes. As with all Ensembl data, it is freely available at http://www.ensembl.org, from the Perl and REST APIs and from the public Ensembl MySQL database server at ensembldb.ensembl.org. Database URL: http://www.ensembl.org.
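As an illustration of the REST access route mentioned in the abstract above, the following minimal sketch builds a request for regulatory features overlapping a genomic region. The host `https://rest.ensembl.org`, the `overlap/region` endpoint with `feature=regulatory`, and the example coordinates are assumptions drawn from the public Ensembl REST documentation, not from this record, and should be checked against the current documentation before use. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumed public REST host; it is not stated in the abstract above.
REST_SERVER = "https://rest.ensembl.org"


def regulatory_overlap_url(species, region):
    """Build the URL for regulatory features overlapping `region`.

    `region` uses the chromosome:start-end form, e.g. "7:140424943-140624564".
    The endpoint name and the `feature` value are assumptions from the REST docs.
    """
    query = urlencode({"feature": "regulatory"})
    return f"{REST_SERVER}/overlap/region/{species}/{region}?{query}"


def fetch_regulatory_features(species, region):
    """Fetch and decode the JSON response (requires network access)."""
    request = urllib.request.Request(
        regulatory_overlap_url(species, region),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical example region on human chromosome 7; prints the URL only.
    print(regulatory_overlap_url("human", "7:140424943-140624564"))
```

Only the URL builder runs offline; `fetch_regulatory_features` contacts the server and returns whatever JSON it sends back, so its output is not shown here.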

Subject(s)

Computational Biology/methods , DNA/analysis , Databases, Genetic , Amino Acid Motifs , Animals , DNA Methylation , Epigenesis, Genetic , Epigenomics , Genome , Genome, Human , Genomics , Histones/chemistry , Humans , Mice , Molecular Sequence Annotation , Oligonucleotide Array Sequence Analysis

17.

Ensembl 2016.

Yates, Andrew; Akanni, Wasiu; Amode, M Ridwan; Barrell, Daniel; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Fitzgerald, Stephen; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E; Janacek, Sophie H; Johnson, Nathan; Juettemann, Thomas; Keenan, Stephen; Lavidas, Ilias; Martin, Fergal J; Maurel, Thomas; McLaren, William; Murphy, Daniel N; Nag, Rishi; Nuhn, Michael; Parker, Anne; Patricio, Mateus; Pignatelli, Miguel; Rahtz, Matthew; Riat, Harpreet Singh; Sheppard, Daniel; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Wilder, Steven P; Zadissa, Amonida; Birney, Ewan; Harrow, Jennifer; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Spudich, Giulietta; Trevanion, Stephen J; Cunningham, Fiona; Aken, Bronwen L; Zerbino, Daniel R; Flicek, Paul.

Nucleic Acids Res ; 44(D1): D710-6, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26687719

ABSTRACT

The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated website (http://grch37.ensembl.org). Our tools, in particular the VEP, have been improved significantly through integration of additional third party data. REST is now capable of larger-scale analysis and our regulatory data BioMart can deliver faster results. The website is now capable of displaying long-range interactions such as those found in cis-regulated datasets. Finally we have launched a website optimized for mobile devices providing views of genes, variants and phenotypes. Our data is made available without restriction and all code is available from our GitHub organization site (http://github.com/Ensembl) under an Apache 2.0 license.

Subject(s)

Databases, Genetic , Genomics , Molecular Sequence Annotation , Animals , Genes , Genetic Variation , Humans , Internet , Mice , Proteins/genetics , Rats , Regulatory Sequences, Nucleic Acid , Software

18.

The Ensembl regulatory build.

Zerbino, Daniel R; Wilder, Steven P; Johnson, Nathan; Juettemann, Thomas; Flicek, Paul R.

Genome Biol ; 16: 56, 2015 Mar 24.

Article in English | MEDLINE | ID: mdl-25887522

ABSTRACT

Most genomic variants associated with phenotypic traits or disease do not fall within gene coding regions, but in regulatory regions, rendering their interpretation difficult. We collected public data on epigenetic marks and transcription factor binding in human cell types and used it to construct an intuitive summary of regulatory regions in the human genome. We verified it against independent assays for sensitivity. The Ensembl Regulatory Build will be progressively enriched when more data is made available. It is freely available on the Ensembl browser, from the Ensembl Regulation MySQL database server and in a dedicated track hub.

Subject(s)

Databases, Genetic , Genomics , Software , Transcription Factors/genetics , Computational Biology , Epigenesis, Genetic/genetics , Humans , Internet , User-Computer Interface

19.

Building a pan-genome reference for a population.

Nguyen, Ngan; Hickey, Glenn; Zerbino, Daniel R; Raney, Brian; Earl, Dent; Armstrong, Joel; Kent, W James; Haussler, David; Paten, Benedict.

J Comput Biol ; 22(5): 387-401, 2015 May.

Article in English | MEDLINE | ID: mdl-25565268

ABSTRACT

A reference genome is a high quality individual genome that is used as a coordinate system for the genomes of a population, or genomes of closely related subspecies. Given a set of genomes partitioned by homology into alignment blocks we formalize the problem of ordering and orienting the blocks such that the resulting ordering maximally agrees with the underlying genomes' ordering and orientation, creating a pan-genome reference ordering. We show this problem is NP-hard, but also demonstrate, empirically and within simulations, the performance of heuristic algorithms based upon a cactus graph decomposition to find locally maximal solutions. We describe an extension of our Cactus software to create a pan-genome reference for whole genome alignments, and demonstrate how it can be used to create novel genome browser visualizations using human variation data as a test. In addition, we test the use of a pan-genome for describing variations and as a reference for read mapping.

Subject(s)

Algorithms , Genetics, Population/standards , Genome, Human , Software , Computer Graphics , Evolution, Molecular , Genetics, Population/statistics & numerical data , Humans , Reference Standards , Sequence Alignment , Sequence Analysis, DNA

20.

Ensembl 2015.

Cunningham, Fiona; Amode, M Ridwan; Barrell, Daniel; Beal, Kathryn; Billis, Konstantinos; Brent, Simon; Carvalho-Silva, Denise; Clapham, Peter; Coates, Guy; Fitzgerald, Stephen; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E; Janacek, Sophie H; Johnson, Nathan; Juettemann, Thomas; Kähäri, Andreas K; Keenan, Stephen; Martin, Fergal J; Maurel, Thomas; McLaren, William; Murphy, Daniel N; Nag, Rishi; Overduin, Bert; Parker, Anne; Patricio, Mateus; Perry, Emily; Pignatelli, Miguel; Riat, Harpreet Singh; Sheppard, Daniel; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Wilder, Steven P; Zadissa, Amonida; Aken, Bronwen L; Birney, Ewan; Harrow, Jennifer; Kinsella, Rhoda; Muffato, Matthieu; Ruffier, Magali; Searle, Stephen M J; Spudich, Giulietta; Trevanion, Stephen J; Yates, Andy; Zerbino, Daniel R; Flicek, Paul.

Nucleic Acids Res ; 43(Database issue): D662-9, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25352552

ABSTRACT

Ensembl (http://www.ensembl.org) is a genomic interpretation system providing the most up-to-date annotations, querying tools and access methods for chordates and key model organisms. This year we released updated annotation (gene models, comparative genomics, regulatory regions and variation) on the new human assembly, GRCh38, although we continue to support researchers using the GRCh37.p13 assembly through a dedicated site (http://grch37.ensembl.org). Our Regulatory Build has been revamped to identify regulatory regions of interest and to efficiently highlight their activity across disparate epigenetic data sets. A number of new interfaces allow users to perform large-scale comparisons of their data against our annotations. The REST server (http://rest.ensembl.org), which allows programs written in any language to query our databases, has moved to a full service alongside our upgraded website tools. Our online Variant Effect Predictor tool has been updated to process more variants and calculate summary statistics. Lastly, the WiggleTools package enables users to summarize large collections of data sets and view them as single tracks in Ensembl. The Ensembl code base itself is more accessible: it is now hosted on our GitHub organization page (https://github.com/Ensembl) under an Apache 2.0 open source license.

Subject(s)

Databases, Nucleic Acid , Genomics , Animals , Epigenesis, Genetic , Genetic Variation , Genome, Human , Humans , Internet , Mice , Molecular Sequence Annotation , Regulatory Sequences, Nucleic Acid , Software
