Results 1 - 20 of 40
1.
Nucleic Acids Res ; 52(D1): D1210-D1217, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38183204

ABSTRACT

The Catalogue Of Somatic Mutations In Cancer (COSMIC), https://cancer.sanger.ac.uk/cosmic, is an expert-curated knowledgebase providing data on somatic variants in cancer, supported by a comprehensive suite of tools for interpreting genomic data, discerning the impact of somatic alterations on disease, and facilitating translational research. The catalogue is accessed and used by thousands of cancer researchers and clinicians daily, allowing them to quickly access information from an immense pool of data curated from over 29 thousand scientific publications and large studies. Within the last 4 years, COSMIC has substantially expanded its utility by adding new resources: the Mutational Signatures catalogue, the Cancer Mutation Census, and Actionability. To improve data accessibility and interoperability, somatic variants have received stable genomic identifiers that are associated with their genomic coordinates in GRCh37 and GRCh38, and new export files with reduced data redundancy have been made available for download.


Subject(s)
Databases, Genetic ; Genomics ; Neoplasms ; Humans ; Databases, Factual ; Knowledge Bases ; Mutation ; Neoplasms/genetics ; Databases, Genetic/trends ; Internet
2.
PLoS Biol ; 22(1): e3002477, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38271296

ABSTRACT

Curated scientific databases catalogue and amplify research findings to maximize their reach. Authors should write their papers with this in mind, ensuring that data are accurate, easy to extract, and presented in standardized formats.


Subject(s)
Writing ; Databases, Factual
3.
Sci Data ; 10(1): 632, 2023 09 16.
Article in English | MEDLINE | ID: mdl-37717042

ABSTRACT

Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to a disease prediction. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repositioning models or as a valuable resource for training such models.


Subject(s)
Benchmarking ; Drug Development ; Databases, Factual ; Drug Repositioning ; Knowledge
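
The idea in the abstract above, a drug mechanism described as a path through a knowledge graph, can be sketched in a few lines. This is only an illustration under stated assumptions: the node identifiers, relation names, and the `mechanism_path` helper are hypothetical and are not taken from DrugMechDB's schema or records.

```python
from collections import deque

# Hypothetical example: one drug indication expressed as typed edges.
# Node names and relations are illustrative, not DrugMechDB records.
EDGES = [
    ("Drug:example_drug", "decreases activity of", "Protein:example_target"),
    ("Protein:example_target", "participates in", "Pathway:example_pathway"),
    ("Pathway:example_pathway", "contributes to", "Disease:example_disease"),
    ("Drug:example_drug", "has side effect", "Phenotype:example_effect"),
]


def mechanism_path(edges, drug, disease):
    """Return the shortest chain of (source, relation, target) triples
    linking drug to disease, or None if no such chain exists."""
    # Index outgoing edges by source node for fast traversal.
    outgoing = {}
    for source, relation, target in edges:
        outgoing.setdefault(source, []).append((relation, target))

    # Breadth-first search, carrying the path taken so far.
    queue = deque([(drug, [])])
    visited = {drug}
    while queue:
        node, path = queue.popleft()
        if node == disease:
            return path
        for relation, target in outgoing.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [(node, relation, target)]))
    return None


path = mechanism_path(EDGES, "Drug:example_drug", "Disease:example_disease")
for source, relation, target in path:
    print(f"{source} --[{relation}]--> {target}")
```

A repositioning model could be scored against such paths by checking whether the reasoning chain it proposes for a drug-disease pair overlaps the curated one; how that overlap is measured is a design choice left open here.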
4.
bioRxiv ; 2023 May 03.
Article in English | MEDLINE | ID: mdl-37205439

ABSTRACT

Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to disease predictions. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repurposing models or as a valuable resource for training such models.

5.
Nucleic Acids Res ; 49(D1): D1302-D1310, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33196847

ABSTRACT

The Open Targets Platform (https://www.targetvalidation.org/) provides users with a queryable knowledgebase and user interface to aid systematic target identification and prioritisation for drug discovery based upon underlying evidence. It is publicly available and the underlying code is open source. Since our last update two years ago, we have had 10 releases to maintain and continuously improve evidence for target-disease relationships from 20 different data sources. In addition, we have integrated new evidence from key datasets, including prioritised targets identified from genome-wide CRISPR knockout screens in 300 cancer models (Project Score), and GWAS/UK BioBank statistical genetic analysis evidence from the Open Targets Genetics Portal. We have evolved our evidence scoring framework to improve target identification. To aid the prioritisation of targets and inform on the potential impact of modulating a given target, we have added evaluation of post-marketing adverse drug reactions and new curated information on target tractability and safety. We have also developed the user interface and backend technologies to improve performance and usability. In this article, we describe the latest enhancements to the Platform, to address the fundamental challenge that developing effective and safe drugs is difficult and expensive.


Subject(s)
Antineoplastic Agents/therapeutic use ; Drugs, Investigational/therapeutic use ; Knowledge Bases ; Molecular Targeted Therapy/methods ; Neoplasms/drug therapy ; Software ; Antineoplastic Agents/chemistry ; Databases, Factual ; Datasets as Topic ; Drug Discovery/methods ; Drugs, Investigational/chemistry ; Humans ; Internet ; Neoplasms/classification ; Neoplasms/genetics ; Neoplasms/pathology
6.
Nucleic Acids Res ; 49(D1): D1311-D1320, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33045747

ABSTRACT

Open Targets Genetics (https://genetics.opentargets.org) is an open-access integrative resource that aggregates human GWAS and functional genomics data including gene expression, protein abundance, chromatin interaction and conformation data from a wide range of cell types and tissues to make robust connections between GWAS-associated loci, variants and likely causal genes. This enables systematic identification and prioritisation of likely causal variants and genes across all published trait-associated loci. In this paper, we describe the public resources we aggregate, the technology and analyses we use, and the functionality that the portal offers. Open Targets Genetics can be searched by variant, gene or study/phenotype. It offers tools that enable users to prioritise causal variants and genes at disease-associated loci and access systematic cross-disease and disease-molecular trait colocalization analysis across 92 cell types and tissues including the eQTL Catalogue. Data visualizations such as Manhattan-like plots, regional plots, credible sets overlap between studies and PheWAS plots enable users to explore GWAS signals in depth. The integrated data is made available through the web portal, for bulk download and via a GraphQL API, and the software is open source. Applications of this integrated data include identification of novel targets for drug discovery and drug repurposing.


Subject(s)
Databases, Genetic ; Genome, Human ; Inflammatory Bowel Diseases/genetics ; Molecular Targeted Therapy/methods ; Quantitative Trait Loci ; Software ; Chromatin/chemistry ; Chromatin/metabolism ; Datasets as Topic ; Drug Discovery/methods ; Drug Repositioning/methods ; Genome-Wide Association Study ; Genotype ; Humans ; Inflammatory Bowel Diseases/drug therapy ; Inflammatory Bowel Diseases/metabolism ; Inflammatory Bowel Diseases/pathology ; Internet ; Phenotype ; Quantitative Trait, Heritable
7.
PLoS Comput Biol ; 16(5): e1007854, 2020 05.
Article in English | MEDLINE | ID: mdl-32437350

ABSTRACT

Everything we do today is becoming more and more reliant on the use of computers. The field of biology is no exception; but most biologists receive little or no formal preparation for the increasingly computational aspects of their discipline. In consequence, informal training courses are often needed to plug the gaps; and the demand for such training is growing worldwide. To meet this demand, some training programs are being expanded, and new ones are being developed. Key to both scenarios is the creation of new course materials. Rather than starting from scratch, however, it's sometimes possible to repurpose materials that already exist. Yet finding suitable materials online can be difficult: They're often widely scattered across the internet or hidden in their home institutions, with no systematic way to find them. This is a common problem for all digital objects. The scientific community has attempted to address this issue by developing a set of rules (which have been called the Findable, Accessible, Interoperable and Reusable [FAIR] principles) to make such objects more findable and reusable. Here, we show how to apply these rules to help make training materials easier to find, (re)use, and adapt, for the benefit of all.


Subject(s)
Computer-Assisted Instruction/standards ; Guidelines as Topic ; Biology/education ; Computational Biology ; Humans ; Information Storage and Retrieval
8.
F1000Res ; 9, 2020.
Article in English | MEDLINE | ID: mdl-34367618

ABSTRACT

Copy number variations (CNVs) are major causative contributors to the genesis of both genetic diseases and human neoplasias. While high-throughput sequencing technologies are increasingly becoming the primary choice for genomic screening analysis, their ability to efficiently detect CNVs is still heterogeneous and remains to be developed. The aim of this white paper is to provide a guiding framework for the future contributions of ELIXIR's recently established human CNV Community, with implications beyond human disease diagnostics and population genomics. This white paper is the direct result of a strategy meeting that took place in September 2018 in Hinxton (UK) and involved representatives of 11 ELIXIR Nodes. The meeting led to the definition of priority objectives and tasks, to address a wide range of CNV-related challenges ranging from detection and interpretation to sharing and training. Here, we provide suggestions on how to align these tasks within the ELIXIR Platforms strategy, and on how to frame the activities of this new ELIXIR Community in the international context.


Subject(s)
Computational Biology ; DNA Copy Number Variations ; DNA Copy Number Variations/genetics ; High-Throughput Nucleotide Sequencing ; Humans
9.
Nucleic Acids Res ; 47(D1): D1056-D1065, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30462303

ABSTRACT

The Open Targets Platform integrates evidence from genetics, genomics, transcriptomics, drugs, animal models and scientific literature to score and rank target-disease associations for drug target identification. The associations are displayed in an intuitive user interface (https://www.targetvalidation.org), and are available through a REST-API (https://api.opentargets.io/v3/platform/docs/swagger-ui) and a bulk download (https://www.targetvalidation.org/downloads/data). In addition to target-disease associations, we also aggregate and display data at the target and disease levels to aid target prioritisation. Since our first publication two years ago, we have made eight releases, added new data sources for target-disease associations, started including causal genetic variants from non genome-wide targeted arrays, added new target and disease annotations, launched new visualisations and improved existing ones and released a new web tool for batch search of up to 200 targets. We have a new URL for the Open Targets Platform REST-API, new REST endpoints and also removed the need for authorisation for API fair use. Here, we present the latest developments of the Open Targets Platform, expanding the evidence and target-disease associations with new and improved data sources, refining data quality, enhancing website usability, and increasing our user base with our training workshops, user support, social media and bioinformatics forum engagement.


Subject(s)
Computational Biology/methods ; Databases, Genetic ; Genomics/methods ; Information Storage and Retrieval/methods ; Molecular Targeted Therapy/methods ; Computational Biology/trends ; Gene Expression Profiling/methods ; Genomics/trends ; Humans ; Information Storage and Retrieval/trends ; Internet ; Reproducibility of Results ; Software
12.
Drug Discov Today ; 23(6): 1169-1174, 2018 06.
Article in English | MEDLINE | ID: mdl-29337199

ABSTRACT

We discuss how we designed the Open Targets Platform (www.targetvalidation.org), an intuitive application for bench scientists working in early drug discovery. To meet the needs of our users, we applied lean user experience (UX) design methods: we started engaging with users very early and carried out research, design and evaluation activities within an iterative development process. We also emphasize the collaborative nature of applying lean UX design, which we believe is a foundation for success in this and many other scientific projects.


Subject(s)
Drug Discovery ; Internet ; Cooperative Behavior ; Humans ; Research Personnel
13.
Nucleic Acids Res ; 46(D1): D802-D808, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29092050

ABSTRACT

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including genome sequence, gene models, transcript sequence, genetic variation, and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments and expansions. These include the incorporation of almost 20 000 additional genome sequences and over 35 000 tracks of RNA-Seq data, which have been aligned to genomic sequence and made available for visualization. Other advances since 2015 include the release of the database in Resource Description Framework (RDF) format, a large increase in community-derived curation, a new high-performance protein sequence search, additional cross-references, improved annotation of non-protein-coding genes, and the launch of pre-release and archival sites. Collectively, these changes are part of a continuing response to the increasing quantity of publicly-available genome-scale data, and the consequent need to archive, integrate, annotate and disseminate these using automated, scalable methods.


Subject(s)
Archaea/genetics ; Bacteria/genetics ; Databases, Genetic ; Databases, Protein ; Eukaryota/genetics ; Genomics ; Amino Acid Sequence ; Animals ; Base Sequence ; Data Mining ; Forecasting ; Genome ; Molecular Sequence Annotation ; RNA/genetics ; User-Computer Interface
14.
Nucleic Acids Res ; 45(D1): D635-D642, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899575

ABSTRACT

Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.


Subject(s)
Computational Biology/methods ; Databases, Genetic ; Genomics/methods ; Search Engine ; Software ; Web Browser ; Animals ; Data Mining ; Evolution, Molecular ; Gene Expression Regulation ; Genetic Variation ; Genome, Human ; Humans ; Molecular Sequence Annotation ; Species Specificity ; Vertebrates
15.
Nucleic Acids Res ; 45(D1): D985-D994, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899665

ABSTRACT

We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.


Subject(s)
Computational Biology/methods ; Molecular Targeted Therapy ; Search Engine ; Software ; Databases, Factual ; Humans ; Molecular Targeted Therapy/methods ; Reproducibility of Results ; Web Browser ; Workflow
16.
Nucleic Acids Res ; 44(D1): D574-80, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26578574

ABSTRACT

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces.


Subject(s)
Databases, Genetic ; Genome, Bacterial ; Genome, Fungal ; Genome, Plant ; Invertebrates/genetics ; Animals ; Diploidy ; Eukaryota/genetics ; Genetic Variation ; Genome ; Polyploidy ; Sequence Alignment
17.
Genome Res ; 26(1): 130-9, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26560630

ABSTRACT

We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes--both single copy and amplified--on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution.


Subject(s)
Chromosomes, Mammalian/genetics ; Evolution, Molecular ; Swine/genetics ; X Chromosome/genetics ; Y Chromosome/genetics ; Animals ; Base Sequence ; Cats/genetics ; Dogs/genetics ; Female ; Gene Conversion ; Gene Expression ; Gene Library ; Gene Order ; Humans ; Male ; Molecular Sequence Data ; Sequence Alignment ; Sequence Analysis, DNA
18.
Nucleic Acids Res ; 44(D1): D710-6, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26687719

ABSTRACT

The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated website (http://grch37.ensembl.org). Our tools, in particular the VEP, have been improved significantly through integration of additional third party data. REST is now capable of larger-scale analysis and our regulatory data BioMart can deliver faster results. The website is now capable of displaying long-range interactions such as those found in cis-regulated datasets. Finally we have launched a website optimized for mobile devices providing views of genes, variants and phenotypes. Our data is made available without restriction and all code is available from our GitHub organization site (http://github.com/Ensembl) under an Apache 2.0 license.


Subject(s)
Databases, Genetic ; Genomics ; Molecular Sequence Annotation ; Animals ; Genes ; Genetic Variation ; Humans ; Internet ; Mice ; Proteins/genetics ; Rats ; Regulatory Sequences, Nucleic Acid ; Software
19.
Nucleic Acids Res ; 43(Database issue): D662-9, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25352552

ABSTRACT

Ensembl (http://www.ensembl.org) is a genomic interpretation system providing the most up-to-date annotations, querying tools and access methods for chordates and key model organisms. This year we released updated annotation (gene models, comparative genomics, regulatory regions and variation) on the new human assembly, GRCh38, although we continue to support researchers using the GRCh37.p13 assembly through a dedicated site (http://grch37.ensembl.org). Our Regulatory Build has been revamped to identify regulatory regions of interest and to efficiently highlight their activity across disparate epigenetic data sets. A number of new interfaces allow users to perform large-scale comparisons of their data against our annotations. The REST server (http://rest.ensembl.org), which allows programs written in any language to query our databases, has moved to a full service alongside our upgraded website tools. Our online Variant Effect Predictor tool has been updated to process more variants and calculate summary statistics. Lastly, the WiggleTools package enables users to summarize large collections of data sets and view them as single tracks in Ensembl. The Ensembl code base itself is more accessible: it is now hosted on our GitHub organization page (https://github.com/Ensembl) under an Apache 2.0 open source license.


Subject(s)
Databases, Nucleic Acid ; Genomics ; Animals ; Epigenesis, Genetic ; Genetic Variation ; Genome, Human ; Humans ; Internet ; Mice ; Molecular Sequence Annotation ; Regulatory Sequences, Nucleic Acid ; Software
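
The abstract above notes that the REST server lets programs written in any language query the Ensembl databases. A minimal Python sketch of such a query follows. It rests on assumptions not stated in the abstract: the `lookup/symbol/:species/:symbol` endpoint and the `Content-Type` header follow the public REST documentation as commonly described and may have changed, the server is addressed over HTTPS rather than the `http://` address quoted in the abstract, and the helper names `build_lookup_url` and `lookup_symbol` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: HTTPS form of the server address quoted in the abstract.
REST_SERVER = "https://rest.ensembl.org"


def build_lookup_url(species, symbol):
    """Build a URL for the (assumed) lookup-by-symbol endpoint."""
    # Percent-encode both path segments so spaces or slashes cannot
    # alter the structure of the request path.
    return (
        f"{REST_SERVER}/lookup/symbol/"
        f"{quote(species, safe='')}/{quote(symbol, safe='')}"
    )


def lookup_symbol(species, symbol, timeout=10):
    """Fetch the gene record as a dict. Requires network access."""
    request = Request(
        build_lookup_url(species, symbol),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network; the fetch is left to the caller.
print(build_lookup_url("homo_sapiens", "BRCA2"))
```

Calling `lookup_symbol("homo_sapiens", "BRCA2")` would perform the request itself; the fields of the returned record are not assumed here and should be checked against the server's current documentation.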
20.
PLoS One ; 9(3): e91534, 2014.
Article in English | MEDLINE | ID: mdl-24614536

ABSTRACT

The greater Himalayan region demarcates two of the most prominent linguistic phyla in Asia: Tibeto-Burman and Indo-European. Previous genetic surveys, mainly using Y-chromosome polymorphisms and/or mitochondrial DNA polymorphisms suggested a substantially reduced geneflow between populations belonging to these two phyla. These studies, however, have mainly focussed on populations residing far to the north and/or south of this mountain range, and have not been able to study geneflow patterns within the greater Himalayan region itself. We now report a detailed, linguistically informed, genetic survey of Tibeto-Burman and Indo-European speakers from the Himalayan countries Nepal and Bhutan based on autosomal microsatellite markers and compare these populations with surrounding regions. The genetic differentiation between populations within the Himalayas seems to be much higher than between populations in the neighbouring countries. We also observe a remarkable genetic differentiation between the Tibeto-Burman speaking populations on the one hand and Indo-European speaking populations on the other, suggesting that language and geography have played an equally large role in defining the genetic composition of present-day populations within the Himalayas.


Subject(s)
Chromosomes, Human/genetics ; Genetics, Population ; Linguistics ; Microsatellite Repeats/genetics ; Asia ; Gene Flow ; Genotyping Techniques ; Humans