Results 1 - 17 of 17
1.
Nat Cell Biol ; 23(11): 1129-1135, 2021 11.
Article in English | MEDLINE | ID: mdl-34750578

ABSTRACT

Massive single-cell profiling efforts have accelerated our discovery of the cellular composition of the human body while at the same time raising the need to formalize this new knowledge. Here, we discuss current efforts to harmonize and integrate different sources of annotations of cell types and states into a reference cell ontology. We illustrate with examples how a unified ontology can consolidate and advance our understanding of cell types across scientific communities and biological domains.


Subject(s)
Atlases as Topic , Cell Biology , Cell Lineage , Cells/classification , Single-Cell Analysis , Biological Ontologies , Biomarkers/metabolism , Cells/metabolism , Cells/pathology , Data Mining , Disease , Gene Ontology , Genomics , High-Throughput Nucleotide Sequencing , Humans , Phenotype , Systems Integration , Transcriptome
2.
Nat Commun ; 11(1): 3400, 2020 07 07.
Article in English | MEDLINE | ID: mdl-32636365

ABSTRACT

The Pan-Cancer Analysis of Whole Genomes (PCAWG) project generated a vast amount of whole-genome cancer sequencing resource data. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we provide a user's guide to the five publicly available online data exploration and visualization tools introduced in the PCAWG marker paper. These tools are ICGC Data Portal, UCSC Xena, Chromothripsis Explorer, Expression Atlas, and PCAWG-Scout. We detail use cases and analyses for each tool, show how they incorporate outside resources from the larger genomics ecosystem, and demonstrate how the tools can be used together to understand the biology of cancers more deeply. Together, the tools enable researchers to query the complex genomic PCAWG data dynamically and integrate external information, enabling and enhancing interpretation.


Subject(s)
Computational Biology/methods , Genome, Human , Neoplasms/genetics , Chromothripsis , Data Analysis , Databases, Genetic , Genomics , Humans , Internet , Mutation , Software , User-Computer Interface , Whole Genome Sequencing
3.
Nucleic Acids Res ; 46(D1): D1181-D1189, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29165610

ABSTRACT

Gramene (http://www.gramene.org) is a knowledgebase for comparative functional analysis in major crops and model plant species. The current release, #54, includes over 1.7 million genes from 44 reference genomes, most of which were organized into 62,367 gene families through orthologous and paralogous gene classification, whole-genome alignments, and synteny. Additional gene annotations include ontology-based protein structure and function; genetic, epigenetic, and phenotypic diversity; and pathway associations. Gramene's Plant Reactome provides a knowledgebase of cellular-level plant pathway networks. Specifically, it uses curated rice reference pathways to derive pathway projections for an additional 66 species based on gene orthology, and facilitates display of gene expression, gene-gene interactions, and user-defined omics data in the context of these pathways. As a community portal, Gramene integrates best-of-class software and infrastructure components including the Ensembl genome browser, Reactome pathway browser, and Expression Atlas widgets, and undergoes periodic data and software upgrades. Via powerful, intuitive search interfaces, users can easily query across various portals and interactively analyze search results by clicking on diverse features such as genomic context, highly augmented gene trees, gene expression anatomograms, associated pathways, and external informatics resources. All data in Gramene are accessible through both visual and programmatic interfaces.


Subject(s)
Databases, Genetic , Gene Expression Regulation, Plant , Genomics/methods , Knowledge Bases , Plants/genetics , Epigenesis, Genetic , Gene Ontology , Genetic Research , Genetic Variation , Genome, Plant , Metabolic Networks and Pathways/genetics , Molecular Sequence Annotation , Plants/metabolism , Software , User-Computer Interface
4.
Nucleic Acids Res ; 46(D1): D246-D251, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29165655

ABSTRACT

Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions.


Subject(s)
Databases, Genetic , Animals , Gene Expression Profiling , Humans , Mammals/genetics , Mammals/metabolism , Oligonucleotide Array Sequence Analysis , Plants/genetics , Plants/metabolism , Proteomics , Sequence Analysis, RNA , Species Specificity , User-Computer Interface
6.
Bioinformatics ; 33(14): 2218-2220, 2017 Jul 15.
Article in English | MEDLINE | ID: mdl-28369191

ABSTRACT

MOTIVATION: The exponential growth of publicly available RNA-sequencing (RNA-Seq) data poses an increasing challenge to researchers wishing to discover, analyse and store such data, particularly those based in institutions with limited computational resources. EMBL-EBI is in an ideal position to address these challenges and to allow the scientific community easy access to not just raw, but also processed RNA-Seq data. We present a Web service to access the results of a systematically and continually updated standardized alignment as well as gene and exon expression quantification of all public bulk (and in the near future also single-cell) RNA-Seq runs in 264 species in European Nucleotide Archive, using Representational State Transfer. RESULTS: The RNASeq-er API (Application Programming Interface) enables ontology-powered search for and retrieval of CRAM, bigwig and bedGraph files, gene and exon expression quantification matrices (Fragments Per Kilobase Of Exon Per Million Fragments Mapped, Transcripts Per Million, raw counts) as well as sample attributes annotated with ontology terms. To date over 270 000 RNA-Seq runs in nearly 10 000 studies (1PB of raw FASTQ data) in 264 species in ENA have been processed and made available via the API. AVAILABILITY AND IMPLEMENTATION: The RNASeq-er API can be accessed at http://www.ebi.ac.uk/fg/rnaseq/api. The commands used to analyse the data are available in supplementary materials and at https://github.com/nunofonseca/irap/wiki/iRAP-single-library. CONTACT: rnaseq@ebi.ac.uk; rpetry@ebi.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Eukaryota/genetics , Sequence Analysis, RNA/methods , Software , Transcriptome , Animals , Databases, Genetic , Gene Expression , Gene Ontology , Humans , Internet
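The abstract of record 6 names three quantification units (raw counts, Fragments Per Kilobase Of Exon Per Million Fragments Mapped, and Transcripts Per Million) without defining them. The sketch below shows how the latter two are commonly derived from raw counts and feature lengths. It uses the textbook definitions only; it is not taken from the RNASeq-er or iRAP code, and the function names `fpkm` and `tpm` and their parameters are hypothetical.

```python
from typing import List, Sequence


def fpkm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """FPKM per feature: count scaled by feature length (kb) and library size (millions)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("library has no mapped fragments")
    # count / (length / 1e3) / (total / 1e6) == count * 1e9 / (length * total)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]


def tpm(counts: Sequence[float], lengths_bp: Sequence[float]) -> List[float]:
    """TPM per feature: length-normalised rates rescaled so they sum to one million."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    total_rate = sum(rates)
    if total_rate <= 0:
        raise ValueError("library has no mapped fragments")
    return [r / total_rate * 1e6 for r in rates]


# Illustrative (made-up) counts and lengths for three features.
example_counts = [100, 200, 700]
example_lengths = [1000, 2000, 7000]
example_fpkm = fpkm(example_counts, example_lengths)
example_tpm = tpm(example_counts, example_lengths)
```

Because TPM values always sum to one million within a sample, they are often easier to compare across samples than FPKM values, whose sum depends on the length distribution of the expressed features.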
8.
Nucleic Acids Res ; 45(D1): D985-D994, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899665

ABSTRACT

We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.


Subject(s)
Computational Biology/methods , Molecular Targeted Therapy , Search Engine , Software , Databases, Factual , Humans , Molecular Targeted Therapy/methods , Reproducibility of Results , Web Browser , Workflow
9.
Nucleic Acids Res ; 45(D1): D1029-D1039, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27799469

ABSTRACT

Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX.


Subject(s)
Computational Biology/methods , Databases, Genetic , Plants/genetics , Plants/metabolism , Search Engine , Genomics/methods , Metabolic Networks and Pathways , Signal Transduction , Systems Biology/methods , User-Computer Interface , Web Browser
10.
PLoS One ; 11(7): e0158724, 2016.
Article in English | MEDLINE | ID: mdl-27438017

ABSTRACT

Recent estimates of the global burden of fungal disease suggest that their incidence has been drastically underestimated and that mortality may rival that of malaria or tuberculosis. Azoles are the principal class of antifungal drug and the only available oral treatment for fungal disease. Recent occurrence and increase in azole resistance is a major concern worldwide. Known azole resistance mechanisms include over-expression of efflux pumps and mutation of the gene encoding the target protein cyp51a; however, for one of the most important fungal pathogens of humans, Aspergillus fumigatus, much of the observed azole resistance does not appear to involve such mechanisms. Here we present evidence that azole resistance in A. fumigatus can arise through mutation of components of mitochondrial complex I. Gene deletions of the 29.9 KD subunit of this complex are azole resistant, less virulent and exhibit dysregulation of secondary metabolite gene clusters in a manner analogous to deletion mutants of the secondary metabolism regulator, LaeA. Additionally, we observe that a mutation leading to an E180D amino acid change in the 29.9 KD subunit is strongly associated with clinical azole resistant A. fumigatus isolates. Evidence presented in this paper suggests that complex I may play a role in the hypoxic response and that one possible mechanism for cell death during azole treatment is a dysfunctional hypoxic response that may be restored by dysregulation of complex I. Both deletion of the 29.9 KD subunit of complex I and azole treatment alone profoundly change expression of gene clusters involved in secondary metabolism and immunotoxin production, raising potential concerns about long term azole therapy.


Subject(s)
Aspergillosis/drug therapy , Drug Resistance, Fungal/genetics , Electron Transport Complex I/genetics , Mitochondria/genetics , Antifungal Agents/therapeutic use , Aspergillosis/genetics , Aspergillosis/microbiology , Aspergillus fumigatus/drug effects , Aspergillus fumigatus/pathogenicity , Azoles/therapeutic use , Cytochrome P-450 Enzyme System/genetics , Electron Transport Complex I/drug effects , Fungal Proteins/genetics , Gene Deletion , Humans , Microbial Sensitivity Tests , Mitochondria/drug effects , Mutation , Secondary Metabolism/drug effects
11.
Nucleic Acids Res ; 44(D1): D746-52, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26481351

ABSTRACT

Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons, estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: 'enrichment' in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Plants/metabolism , Proteins/metabolism , Proteomics , Animals , Cell Line, Tumor , Humans , Plants/genetics , User-Computer Interface
12.
Nucleic Acids Res ; 44(D1): D1133-40, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26553803

ABSTRACT

Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ~200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials.


Subject(s)
Databases, Genetic , Genome, Plant , Plants/metabolism , Gene Expression , Genetic Variation , Genomics , Internet , Metabolic Networks and Pathways , Molecular Sequence Annotation , Plants/genetics
13.
Curr Plant Biol ; 7-8: 10-15, 2016 Nov.
Article in English | MEDLINE | ID: mdl-28713666

ABSTRACT

Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationships to enrich the annotation of genomic data and provides tools to perform powerful comparative analyses across a wide spectrum of plant species. It consists of an integrated portal for querying, visualizing and analyzing data for 44 plant reference genomes, genetic variation data sets for 12 species, expression data for 16 species, curated rice pathways and orthology-based pathway projections for 66 plant species including various crops. Here we briefly describe the functions and uses of the Gramene database.

14.
Nucleic Acids Res ; 43(Database issue): D1113-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25361974

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42,000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Genomics , High-Throughput Nucleotide Sequencing , Internet , Software
15.
Nucleic Acids Res ; 42(Database issue): D926-32, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24304889

ABSTRACT

Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of 'baseline' expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful 'contrasts', i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets, as well as genes and a more powerful search interface designed to maximize the biological value provided to the user.


Subject(s)
Databases, Genetic , Gene Expression Profiling , Genomics , Humans , Internet , Oligonucleotide Array Sequence Analysis , Proteins/genetics , Proteins/metabolism , RNA Isoforms/metabolism , Sequence Analysis, RNA
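The abstract of record 15 describes a 'contrast' as a differential pairwise comparison between two sets of biological replicates. As a rough illustration of that idea, the sketch below computes a log2 fold change between the mean expression of a test group and a reference group for one gene. It is not the statistical method used by Expression Atlas, which the abstract does not specify; the function name `log2_fold_change` and the `pseudocount` parameter are hypothetical, and the replicate values are made up.

```python
import math
from statistics import mean
from typing import Sequence


def log2_fold_change(
    test: Sequence[float],
    reference: Sequence[float],
    pseudocount: float = 1.0,
) -> float:
    """Log2 ratio of mean expression in the test group over the reference group.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a gene is not detected in the reference replicates.
    """
    if not test or not reference:
        raise ValueError("each group needs at least one replicate")
    return math.log2((mean(test) + pseudocount) / (mean(reference) + pseudocount))


# One gene, three replicates per group (illustrative values only).
treated = [7.0, 7.0, 7.0]
untreated = [1.0, 1.0, 1.0]
lfc = log2_fold_change(treated, untreated)
```

A positive value indicates higher expression in the test group and a negative value lower expression. A real analysis would also model the variance between replicates to attach a significance estimate to each contrast.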
16.
Nucleic Acids Res ; 41(Database issue): D987-90, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193272

ABSTRACT

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.


Subject(s)
Databases, Genetic , Genomics , Microarray Analysis , Databases, Genetic/statistics & numerical data , Databases, Genetic/trends , High-Throughput Nucleotide Sequencing , Internet , Software , User-Computer Interface
17.
Mol Ecol ; 20(17): 3617-30, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21801259

ABSTRACT

The tempo and mode of evolution of loci with a large effect on adaptation and reproductive isolation will influence the rate of evolutionary divergence and speciation. Desaturase loci are involved in key biochemical changes in long-chain fatty acids. In insects, these have been shown to influence adaptation to starvation or desiccation resistance and in some cases act as important pheromones. The desaturase gene family of Drosophila is known to have evolved by gene duplication and diversification, and at least one locus shows rapid evolution of sex-specific expression variation. Here, we examine the evolution of the gene family in species representing the Drosophila phylogeny. We find that the family includes more loci than have been previously described. Most are represented as single-copy loci, but we also find additional examples of duplications in loci which influence pheromone blends. Most loci show patterns of variation associated with purifying selection, but there are strong signatures of diversifying selection in new duplicates. In the case of a new duplicate of desat1 in the obscura group species, we show that strong selection on the coding sequence is associated with the evolution of sex-specific expression variation. It seems likely that both sexual selection and ecological adaptation have influenced the evolution of this gene family in Drosophila.


Subject(s)
Drosophila Proteins/genetics , Drosophila/genetics , Fatty Acid Desaturases/genetics , Genes, Insect , Multigene Family , Selection, Genetic , Animals , Drosophila/classification , Drosophila Proteins/metabolism , Evolution, Molecular , Fatty Acid Desaturases/metabolism , Gene Duplication , Gene Expression , Genetic Loci , Genetic Variation , Molecular Sequence Data , Phylogeny , Reproductive Isolation , Sequence Analysis, DNA