1.
Int J Mol Sci ; 25(10), 2024 May 20.
Article in English | MEDLINE | ID: mdl-38791593

ABSTRACT

Epidemiological evidence suggests existing comorbidity between postmenopausal osteoporosis (OP) and cardiovascular disease (CVD), but identification of possible shared genes is lacking. The skeletal global transcriptomes were analyzed in trans-iliac bone biopsies (n = 84) from clinically well-characterized postmenopausal women (50 to 86 years) without clinical CVD using microchips and RNA sequencing. One thousand transcripts highly correlated with areal bone mineral density (aBMD) were further analyzed using bioinformatics, and common genes overlapping with CVD and associated biological mechanisms, pathways and functions were identified. Fifty genes (45 mRNAs, 5 miRNAs) were discovered with established roles in oxidative stress, inflammatory response, endothelial function, fibrosis, dyslipidemia and osteoblastogenesis/calcification. These pleiotropic genes with possible CVD comorbidity functions were also present in transcriptomes of microvascular endothelial cells and cardiomyocytes and were differentially expressed between healthy and osteoporotic women with fragility fractures. The results were supported by a genetic pleiotropy-informed conditional False Discovery Rate approach identifying any overlap in single nucleotide polymorphisms (SNPs) within several genes encoding aBMD- and CVD-associated transcripts. The study provides transcriptional and genomic evidence for genes of importance for both BMD regulation and CVD risk in a large collection of postmenopausal bone biopsies. Most of the transcripts identified in the CVD risk categories have no previously recognized roles in OP pathogenesis and provide novel avenues for exploring the mechanistic basis for the biological association between CVD and OP.


Subject(s)
Bone Density ; Cardiovascular Diseases ; Osteoporosis, Postmenopausal ; Polymorphism, Single Nucleotide ; Transcriptome ; Humans ; Female ; Osteoporosis, Postmenopausal/genetics ; Osteoporosis, Postmenopausal/pathology ; Aged ; Middle Aged ; Cardiovascular Diseases/genetics ; Cardiovascular Diseases/pathology ; Aged, 80 and over ; Bone Density/genetics ; Gene Expression Profiling ; RNA, Messenger/genetics ; RNA, Messenger/metabolism ; MicroRNAs/genetics
2.
Nucleic Acids Res ; 52(D1): D174-D182, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37962376

ABSTRACT

JASPAR (https://jaspar.elixir.no/) is a widely-used open-access database presenting manually curated high-quality and non-redundant DNA-binding profiles for transcription factors (TFs) across taxa. In this 10th release and 20th-anniversary update, the CORE collection has expanded with 329 new profiles. We updated three existing profiles and provided orthogonal support for 72 profiles from the previous release's UNVALIDATED collection. Altogether, the JASPAR 2024 update provides a 20% increase in CORE profiles from the previous release. A trimming algorithm enhanced profiles by removing low information content flanking base pairs, which were likely uninformative (within the capacity of the PFM models) for TFBS predictions and modelling TF-DNA interactions. This release includes enhanced metadata, featuring a refined classification for plant TFs' structural DNA-binding domains. The new JASPAR collections prompt updates to the genomic tracks of predicted TF binding sites (TFBSs) in 8 organisms, with human and mouse tracks available as native tracks in the UCSC Genome browser. All data are available through the JASPAR web interface and programmatically through its API and the updated Bioconductor and pyJASPAR packages. Finally, a new TFBS extraction tool enables users to retrieve predicted JASPAR TFBSs intersecting their genomic regions of interest.
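
The trimming of low-information flanking positions mentioned in the abstract above can be illustrated with a minimal sketch. This is not the JASPAR algorithm itself: the uniform background, the pseudocount, the `min_ic` threshold of 0.5 bits and the function names are all assumptions chosen for illustration. Each column of a position frequency matrix (PFM) is scored by its information content, and leading or trailing columns below the threshold are dropped while internal columns are always kept.

```python
import math


def column_information(counts, pseudocount=0.01):
    """Information content (bits) of one PFM column [A, C, G, T],
    assuming a uniform background (maximum is 2 bits)."""
    total = sum(counts) + 4 * pseudocount
    ic = 2.0
    for c in counts:
        p = (c + pseudocount) / total
        ic += p * math.log2(p)
    return ic


def trim_flanks(pfm, min_ic=0.5):
    """Drop leading and trailing columns whose information content is
    below min_ic (hypothetical threshold). Internal columns are kept."""
    ics = [column_information(col) for col in pfm]
    start = 0
    while start < len(pfm) and ics[start] < min_ic:
        start += 1
    end = len(pfm)
    while end > start and ics[end - 1] < min_ic:
        end -= 1
    return pfm[start:end]
```

A PFM whose first and last columns are close to uniform, for example `[5, 5, 5, 5]`, would lose those columns, while a weakly informative column between two strong ones would survive.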


Subject(s)
Databases, Genetic ; Protein Binding ; Transcription Factors ; Animals ; Humans ; Mice ; Databases, Genetic/standards ; Databases, Genetic/trends ; Transcription Factors/genetics ; Transcription Factors/metabolism ; Plants/genetics
3.
Int J Cancer ; 153(10): 1819-1828, 2023 Nov 15.
Article in English | MEDLINE | ID: mdl-37551617

ABSTRACT

Genome-scale screening experiments in cancer produce long lists of candidate genes that require extensive interpretation for biological insight and prioritization for follow-up studies. Interrogation of gene lists frequently represents a significant and time-consuming undertaking, in which experimental biologists typically combine results from a variety of bioinformatics resources in an attempt to portray and understand cancer relevance. As a means to simplify and strengthen the support for this endeavor, we have developed oncoEnrichR, a flexible bioinformatics tool that allows cancer researchers to comprehensively interrogate a given gene list along multiple facets of cancer relevance. oncoEnrichR differs from general gene set analysis frameworks through the integration of an extensive set of prior knowledge specifically relevant for cancer, including ranked gene-tumor type associations, literature-supported proto-oncogene and tumor suppressor gene annotations, target druggability data, regulatory interactions, synthetic lethality predictions, as well as prognostic associations, gene aberrations and co-expression patterns across tumor types. The software produces a structured and user-friendly analysis report as its main output, where versions of all underlying data resources are explicitly logged, the latter being a critical component for reproducible science. We demonstrate the usefulness of oncoEnrichR through interrogation of two candidate lists from proteomic and CRISPR screens. oncoEnrichR is freely available as a web-based service hosted by the Galaxy platform (https://oncotools.elixir.no), and can also be accessed as a stand-alone R package (https://github.com/sigven/oncoEnrichR).


Subject(s)
Neoplasms ; Proteomics ; Humans ; Computational Biology/methods ; Software ; Neoplasms/genetics
4.
PLoS One ; 18(7): e0286330, 2023.
Article in English | MEDLINE | ID: mdl-37467208

ABSTRACT

Many high-throughput sequencing datasets can be represented as objects with coordinates along a reference genome. Currently, biological investigations often involve a large number of such datasets, for example representing different cell types or epigenetic factors. Drawing overall conclusions from a large collection of results for individual datasets may be challenging and time-consuming. Meaningful interpretation often requires the results to be aggregated according to metadata that represents biological characteristics of interest. In this light, we here propose the hierarchical Genomic Suite HyperBrowser (hGSuite), an open-source extension to the GSuite HyperBrowser platform, which aims to provide a means for extracting key results from an aggregated collection of high-throughput DNA sequencing data. The hGSuite utilizes a metadata-informed data cube to calculate various statistics across the multiple dimensions of the datasets. With this work, we show that the hGSuite and its associated data cube methodology offer a quick and accessible way for exploratory analysis of large genomic datasets. The web-based toolkit named hGSuite HyperBrowser is available at https://hyperbrowser.uio.no/hgsuite under a GPLv3 license.
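
As a rough illustration of aggregating per-dataset results along metadata dimensions, the sketch below groups records by one or more metadata keys and averages a value within each group. It is not the hGSuite implementation; the record layout, the field names (`cell_type`, `mark`, `value`) and the choice of the mean as the summary statistic are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate(records, dims, value_key="value"):
    """Group records by the metadata fields in dims and return the mean
    of value_key per group, keyed by a tuple of the metadata values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[d] for d in dims)
        groups[key].append(record[value_key])
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical per-dataset results, one record per genomic track.
records = [
    {"cell_type": "K562", "mark": "H3K4me3", "value": 10},
    {"cell_type": "K562", "mark": "H3K27ac", "value": 20},
    {"cell_type": "GM12878", "mark": "H3K4me3", "value": 30},
]

by_cell_type = aggregate(records, ["cell_type"])
by_cell_and_mark = aggregate(records, ["cell_type", "mark"])
```

Passing a different list of dimensions gives a different slice of the same results, which is the basic idea behind summarising a collection by its metadata.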


Subject(s)
Metadata ; Software ; Genomics/methods ; Genome ; Internet
5.
F1000Res ; 10, 2021.
Article in English | MEDLINE | ID: mdl-34249331

ABSTRACT

Background: Many types of data from genomic analyses can be represented as genomic tracks, i.e. features linked to the genomic coordinates of a reference genome. Examples of such data are epigenetic DNA methylation data, ChIP-seq peaks, germline or somatic DNA variants, as well as RNA-seq expression levels. Researchers often face difficulties in locating, accessing and combining relevant tracks from external sources, as well as locating the raw data, reducing the value of the generated information. Description of work: We propose to advance the application of FAIR data principles (Findable, Accessible, Interoperable, and Reusable) to produce searchable metadata for genomic tracks. Findability and Accessibility of metadata can then be ensured by a track search service that integrates globally identifiable metadata from various track hubs in the Track Hub Registry and other relevant repositories. Interoperability and Reusability need to be ensured by the specification and implementation of a basic set of recommendations for metadata. We have tested this concept by developing such a specification in a JSON Schema, called FAIRtracks, and have integrated it into a novel track search service, called TrackFind. We demonstrate practical usage by importing datasets through TrackFind into existing examples of relevant analytical tools for genomic tracks: EPICO and the GSuite HyperBrowser. Conclusion: We here provide a first iteration of a draft standard for genomic track metadata, as well as the accompanying software ecosystem. It can easily be adapted or extended to future needs of the research community regarding data, methods and tools, balancing the requirements of both data submitters and analytical end-users.


Subject(s)
Ecosystem ; Metadata ; Genome ; Genomics ; Software
6.
Nat Mach Intell ; 3(11): 936-944, 2021 Nov.
Article in English | MEDLINE | ID: mdl-37396030

ABSTRACT

Adaptive immune receptor repertoires (AIRR) are key targets for biomedical research as they record past and ongoing adaptive immune responses. The capacity of machine learning (ML) to identify complex discriminative sequence patterns renders it an ideal approach for AIRR-based diagnostic and therapeutic discovery. To date, widespread adoption of AIRR ML has been inhibited by a lack of reproducibility, transparency, and interoperability. immuneML (immuneml.uio.no) addresses these concerns by implementing each step of the AIRR ML process in an extensible, open-source software ecosystem that is based on fully specified and shareable workflows. To facilitate widespread user adoption, immuneML is available as a command-line tool and through an intuitive Galaxy web interface, and extensive documentation of workflows is provided. We demonstrate the broad applicability of immuneML by (i) reproducing a large-scale study on immune state prediction, (ii) developing, integrating, and applying a novel deep learning method for antigen specificity prediction, and (iii) showcasing streamlined interpretability-focused benchmarking of AIRR ML.

7.
Bioinformatics ; 35(9): 1615-1624, 2019 May 01.
Article in English | MEDLINE | ID: mdl-30307532

ABSTRACT

MOTIVATION: Many high-throughput methods produce sets of genomic regions as one of their main outputs. Scientists often use genomic colocalization analysis to interpret such region sets, for example to identify interesting enrichments and to understand the interplay between the underlying biological processes. Although widely used, there is little standardization in how these analyses are performed. Different practices can substantially affect the conclusions of colocalization analyses. RESULTS: Here, we describe the different approaches and provide recommendations for performing genomic colocalization analysis, while also discussing common methodological challenges that may influence the conclusions. As illustrated by concrete example cases, careful attention to analysis details is needed in order to meet these challenges and to obtain a robust and biologically meaningful interpretation of genomic region set data. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome ; Genomics
8.
F1000Res ; 7, 2018.
Article in English | MEDLINE | ID: mdl-30271575

ABSTRACT

The Norwegian e-Infrastructure for Life Sciences (NeLS) has been developed by ELIXIR Norway to provide its users with a system enabling data storage, sharing, and analysis in a project-oriented fashion. The system is available through easy-to-use web interfaces, including the Galaxy workbench for data analysis and workflow execution. Users confident with a command-line interface and programming may also access it through Secure Shell (SSH) and application programming interfaces (APIs). NeLS has been in production since 2015, with training and support provided by the help desk of ELIXIR Norway. Through collaboration with NorSeq, the national consortium for high-throughput sequencing, an integrated service is offered so that sequencing data generated in a research project is provided to the involved researchers through NeLS. Sensitive data, such as individual genomic sequencing data, are handled using the TSD (Services for Sensitive Data) platform provided by Sigma2 and the University of Oslo. NeLS integrates national e-infrastructure storage and computing resources, and is also integrated with the SEEK platform in order to store large data files produced by experiments described in SEEK. In this article, we outline the architecture of NeLS and discuss possible directions for further development.


Subject(s)
Biological Science Disciplines ; Database Management Systems ; High-Throughput Nucleotide Sequencing ; Humans ; Information Dissemination/methods ; Information Storage and Retrieval/methods ; Norway
9.
Nucleic Acids Res ; 46(W1): W186-W193, 2018 Jul 02.
Article in English | MEDLINE | ID: mdl-29873782

ABSTRACT

Functional genomics assays produce sets of genomic regions as one of their main outputs. To biologically interpret such region-sets, researchers often use colocalization analysis, where the statistical significance of colocalization (overlap, spatial proximity) between two or more region-sets is tested. Existing colocalization analysis tools vary in the statistical methodology and analysis approaches, thus potentially providing different conclusions for the same research question. As the findings of colocalization analysis are often the basis for follow-up experiments, it is helpful to use several tools in parallel and to compare the results. We developed the Coloc-stats web service to facilitate such analyses. Coloc-stats provides a unified interface to perform colocalization analysis across various analytical methods and method-specific options (e.g. colocalization measures, resolution, null models). Coloc-stats helps the user to find a method that supports their experimental requirements and allows for a straightforward comparison across methods. Coloc-stats is implemented as a web server with a graphical user interface that assists users with configuring their colocalization analyses. Coloc-stats is freely available at https://hyperbrowser.uio.no/coloc-stats/.
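
To make the notion of testing colocalization against a null model concrete, here is a minimal sketch of one possible combination: base-pair overlap as the colocalization measure and a circular shift of one region set as the null model. It does not reproduce any method offered by Coloc-stats; the function names, the single linear genome of length `genome_len`, the half-open interval convention and the number of permutations are assumptions.

```python
import random


def overlap_bp(a, b):
    """Base pairs shared by two sorted lists of non-overlapping,
    half-open (start, end) intervals."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def circular_shift(regions, genome_len, offset):
    """Shift all regions by offset, wrapping around the genome end.
    Region lengths and spacing are preserved."""
    shifted = []
    for start, end in regions:
        s = (start + offset) % genome_len
        e = s + (end - start)
        if e <= genome_len:
            shifted.append((s, e))
        else:
            # Region crosses the end of the genome: split it in two.
            shifted.append((s, genome_len))
            shifted.append((0, e - genome_len))
    return sorted(shifted)


def permutation_p_value(a, b, genome_len, n_perm=1000, seed=0):
    """Empirical one-sided p-value for the observed overlap of a and b."""
    rng = random.Random(seed)
    observed = overlap_bp(a, b)
    hits = 0
    for _ in range(n_perm):
        shifted = circular_shift(a, genome_len, rng.randrange(genome_len))
        if overlap_bp(shifted, b) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Swapping in a different measure (for example, a count of overlapping regions) or a different null model (for example, uniform random placement) can change the resulting p-value, which is the kind of method dependence the abstract describes.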


Subject(s)
Genomics/methods ; Software ; Chromatin Immunoprecipitation ; GATA1 Transcription Factor/metabolism ; Internet ; Sequence Analysis, DNA ; User-Computer Interface
10.
Gigascience ; 6(7): 1-12, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28459977

ABSTRACT

Background: Recent large-scale undertakings such as ENCODE and Roadmap Epigenomics have generated experimental data mapped to the human reference genome (as genomic tracks) representing a variety of functional elements across a large number of cell types. Despite the high potential value of these publicly available data for a broad variety of investigations, little attention has been given to the analytical methodology necessary for their widespread utilisation. Findings: We here present a first principled treatment of the analysis of collections of genomic tracks. We have developed novel computational and statistical methodology to permit comparative and confirmatory analyses across multiple and disparate data sources. We delineate a set of generic questions that are useful across a broad range of investigations and discuss the implications of choosing different statistical measures and null models. Examples include contrasting analyses across different tissues or diseases. The methodology has been implemented in a comprehensive open-source software system, the GSuite HyperBrowser. To make the functionality accessible to biologists, and to facilitate reproducible analysis, we have also developed a web-based interface providing an expertly guided and customizable way of utilizing the methodology. With this system, many novel biological questions can flexibly be posed and rapidly answered. Conclusions: Through a combination of streamlined data acquisition, interoperable representation of dataset collections, and customizable statistical analysis with guided setup and interpretation, the GSuite HyperBrowser represents a first comprehensive solution for integrative analysis of track collections across the genome and epigenome. The software is available at: https://hyperbrowser.uio.no.


Subject(s)
Datasets as Topic/standards ; Epigenesis, Genetic ; Epigenomics/methods ; Genome, Human ; Software ; Whole Genome Sequencing/methods ; Epigenomics/standards ; Humans ; Whole Genome Sequencing/standards
11.
PLoS One ; 10(7): e0133280, 2015.
Article in English | MEDLINE | ID: mdl-26208222

ABSTRACT

Strict control of tissue-specific gene expression plays a pivotal role during lineage commitment. The transcription factor c-Myb has an essential role in adult haematopoiesis and functions as an oncogene when rearranged in human cancers. Here we have exploited digital genomic footprinting analysis to obtain a global picture of c-Myb occupancy in the genome of six different haematopoietic cell-types. We have biologically validated several c-Myb footprints using c-Myb knockdown data, reporter assays and DamID analysis. We show that our predicted conserved c-Myb footprints are highly dependent on the haematopoietic cell type, but that there is a group of gene targets common to all cell-types analysed. Furthermore, we find that c-Myb footprints co-localise with active histone mark H3K4me3 and are significantly enriched at exons. We analysed co-localisation of c-Myb footprints with 104 chromatin regulatory factors in K562 cells, and identified nine proteins that are enriched together with c-Myb footprints on genes positively regulated by c-Myb and one protein enriched on negatively regulated genes. Our data suggest that c-Myb is a transcription factor with multifaceted target regulation depending on cell type.


Subject(s)
Binding Sites ; Chromatin/genetics ; Chromatin/metabolism ; Hematopoiesis/genetics ; Proto-Oncogene Proteins c-myb/metabolism ; Cell Differentiation/genetics ; DNA Footprinting ; Gene Expression Regulation ; Genome-Wide Association Study ; Histones/metabolism ; Humans ; K562 Cells ; Protein Binding ; Transcription Factors/metabolism ; Transcription, Genetic
12.
Bioinformatics ; 30(11): 1620-2, 2014 Jun 01.
Article in English | MEDLINE | ID: mdl-24511080

ABSTRACT

Recently developed methods that couple next-generation sequencing with chromosome conformation capture-based techniques, such as Hi-C and ChIA-PET, allow for characterization of genome-wide chromatin 3D structure. Understanding the organization of chromatin in three dimensions is a crucial next step in the unraveling of global gene regulation, and methods for analyzing such data are needed. We have developed HiBrowse, a user-friendly web-tool consisting of a range of hypothesis-based and descriptive statistics, using realistic assumptions in null-models. AVAILABILITY AND IMPLEMENTATION: HiBrowse is supported by all major browsers, and is freely available at http://hyperbrowser.uio.no/3d. Software is implemented in Python, and source code is available for download by following instructions on the main site.


Subject(s)
Chromatin/chemistry ; Software ; Data Interpretation, Statistical ; Genome ; Genomics/methods ; High-Throughput Nucleotide Sequencing
13.
Nucleic Acids Res ; 41(Web Server issue): W133-41, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23632163

ABSTRACT

The immense increase in availability of genomic scale datasets, such as those provided by the ENCODE and Roadmap Epigenomics projects, presents unprecedented opportunities for individual researchers to pose novel falsifiable biological questions. With this opportunity, however, researchers are faced with the challenge of how to best analyze and interpret their genome-scale datasets. A powerful way of representing genome-scale data is as feature-specific coordinates relative to reference genome assemblies, i.e. as genomic tracks. The Genomic HyperBrowser (http://hyperbrowser.uio.no) is an open-ended web server for the analysis of genomic track data. Through the provision of several highly customizable components for processing and statistical analysis of genomic tracks, the HyperBrowser enables a range of genomic investigations related to, e.g., gene regulation, disease association or epigenetic modifications of the genome.


Subject(s)
Genomics/methods ; Software ; Data Interpretation, Statistical ; Genome ; Internet
14.
BMC Genomics ; 12: 353, 2011 Jul 07.
Article in English | MEDLINE | ID: mdl-21736759

ABSTRACT

BACKGROUND: Transcription factors in disease-relevant pathways represent potential drug targets, by impacting a distinct set of pathways that may be modulated through gene regulation. The influence of transcription factors is typically studied on a per disease basis, and no current resources provide a global overview of the relations between transcription factors and disease. Furthermore, existing pipelines for related large-scale analysis are tailored for particular sources of input data, and there is a need for generic methodology for integrating complementary sources of genomic information. RESULTS: We here present a large-scale analysis of multiple diseases versus multiple transcription factors, with a global map of over- and under-representation of 446 transcription factors in 1010 diseases. This map, referred to as the differential disease regulome, provides a first global statistical overview of the complex interrelationships between diseases, genes and controlling elements. The map is visualized using the Google map engine, due to its very large size, and provides a range of detailed information in a dynamic presentation format. The analysis is achieved through a novel methodology that performs a pairwise, genome-wide comparison on the cartesian product of two distinct sets of annotation tracks, e.g. all combinations of one disease and one TF. The methodology was also used to produce further maps using alternative data sets related to transcription and disease, as well as data sets related to Gene Ontology classification and histone modifications. We provide a web-based interface that allows users to generate other custom maps, which could be based on precisely specified subsets of transcription factors and diseases, or, in general, on any categorical genome annotation tracks as they are improved or become available. CONCLUSION: We have created a first resource that provides a global overview of the complex relations between transcription factors and disease. As the accuracy of the disease regulome depends mainly on the quality of the input data, forthcoming ChIP-seq based binding data for many TFs will provide improved maps. We further believe our approach to genome analysis could allow an advance from the current typical situation of one-time integrative efforts to reproducible and upgradable integrative analysis. The differential disease regulome and its associated methodology is available at http://hyperbrowser.uio.no.
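
The idea of a pairwise comparison over the cartesian product of two sets of tracks can be sketched as follows. This is an illustration only, not the statistic used for the differential disease regulome: tracks are simplified to sets of genome bin indices, the score is a plain observed-to-expected overlap ratio under independence, and all names (`enrichment`, `pairwise_map`, the example track labels) are hypothetical.

```python
from itertools import product


def enrichment(a, b, n_bins):
    """Observed / expected number of shared bins for two tracks given as
    sets of bin indices, assuming independent placement over n_bins.
    Values above 1 suggest over-representation, below 1 under-representation."""
    if not a or not b:
        return float("nan")
    expected = len(a) * len(b) / n_bins
    return len(a & b) / expected


def pairwise_map(rows, cols, n_bins):
    """Score every (row track, column track) combination."""
    return {
        (r, c): enrichment(rows[r], cols[c], n_bins)
        for r, c in product(rows, cols)
    }


# Hypothetical toy tracks over a genome divided into 16 bins.
diseases = {"D1": {1, 2, 3, 4}, "D2": {7, 8}}
tfs = {"TF1": {1, 2}, "TF2": {8, 9, 10, 11, 12}}
regulome = pairwise_map(diseases, tfs, n_bins=16)
```

The resulting dictionary has one entry per disease and TF combination, so its size grows as the product of the two collections, which is why a map of 446 by 1010 entries calls for a dedicated visualisation.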


Subject(s)
Disease/genetics ; Genomics/methods ; Transcription Factors/genetics ; Transcription Factors/metabolism ; Computer Graphics ; Humans ; Internet ; Molecular Sequence Annotation
15.
BMC Bioinformatics ; 12: 494, 2011 Dec 30.
Article in English | MEDLINE | ID: mdl-22208806

ABSTRACT

BACKGROUND: With the recent advances and availability of various high-throughput sequencing technologies, data on many molecular aspects, such as gene regulation, chromatin dynamics, and the three-dimensional organization of DNA, are rapidly being generated in an increasing number of laboratories. The variation in biological context, and the increasingly dispersed mode of data generation, imply a need for precise, interoperable and flexible representations of genomic features through formats that are easy to parse. A host of alternative formats are currently available and in use, complicating analysis and tool development. The issue of whether and how the multitude of formats reflects varying underlying characteristics of data has to our knowledge not previously been systematically treated. RESULTS: We here identify intrinsic distinctions between genomic features, and argue that the distinctions imply that a certain variation in the representation of features as genomic tracks is warranted. Four core informational properties of tracks are discussed: gaps, lengths, values and interconnections. From this we delineate fifteen generic track types. Based on the track type distinctions, we characterize major existing representational formats and find that the track types are not adequately supported by any single format. We also find, in contrast to the XML formats, that none of the existing tabular formats are conveniently extendable to support all track types. We thus propose two unified formats for track data, an improved XML format, BioXSD 1.1, and a new tabular format, GTrack 1.0. CONCLUSIONS: The defined track types are shown to capture relevant distinctions between genomic annotation tracks, resulting in varying representational needs and analysis possibilities. The proposed formats, GTrack 1.0 and BioXSD 1.1, cater to the identified track distinctions and emphasize preciseness, flexibility and parsing convenience.


Subject(s)
Genomics/methods ; Sequence Analysis, DNA ; Genome ; Genome, Human ; Humans ; Regulatory Sequences, Nucleic Acid ; Software
16.
Genome Biol ; 11(12): R121, 2010.
Article in English | MEDLINE | ID: mdl-21182759

ABSTRACT

The immense increase in the generation of genomic scale data poses an unmet analytical challenge, due to a lack of established methodology with the required flexibility and power. We propose a first principled approach to statistical analysis of sequence-level genomic information. We provide a growing collection of generic biological investigations that query pairwise relations between tracks, represented as mathematical objects, along the genome. The Genomic HyperBrowser implements the approach and is available at http://hyperbrowser.uio.no.


Subject(s)
Computational Biology/methods ; Genome ; Genomics/methods ; Sequence Analysis/methods ; Software ; Base Pairing ; Exons ; Gene Expression ; Histones/metabolism ; Models, Biological ; Nucleic Acid Denaturation ; Polymorphism, Single Nucleotide