1.
Am J Hum Genet ; 109(6): 1038-1054, 2022 06 02.
Article in English | MEDLINE | ID: mdl-35568032

ABSTRACT

Metabolite levels measured in the human population are endophenotypes for biological processes. We combined sequencing data for 3,924 (whole-exome sequencing, WES, discovery) and 2,805 (whole-genome sequencing, WGS, replication) donors from a prospective cohort of blood donors in England. We used multiple approaches to select and aggregate rare genetic variants (minor allele frequency [MAF] < 0.1%) in protein-coding regions and tested their associations with 995 metabolites measured in plasma by using ultra-high-performance liquid chromatography-tandem mass spectrometry. We identified 40 novel associations implicating rare coding variants (27 genes and 38 metabolites), of which 28 (15 genes and 28 metabolites) were replicated. We developed algorithms to prioritize putative driver variants at each locus and used mediation and Mendelian randomization analyses to test directionality at associations of metabolite and protein levels at the ACY1 locus. Overall, 66% of reported associations implicate gene targets of approved drugs or bioactive drug-like compounds, contributing to drug target validation efforts.


Subject(s)
Exome , Exome/genetics , Gene Frequency/genetics , Humans , Prospective Studies , Exome Sequencing/methods , Whole Genome Sequencing
2.
Cell ; 143(3): 367-78, 2010 Oct 29.
Article in English | MEDLINE | ID: mdl-21029860

ABSTRACT

ATRX is an X-linked gene of the SWI/SNF family, mutations in which cause syndromal mental retardation and downregulation of α-globin expression. Here we show that ATRX binds to tandem repeat (TR) sequences in both telomeres and euchromatin. Genes associated with these TRs can be dysregulated when ATRX is mutated, and the change in expression is determined by the size of the TR, producing skewed allelic expression. This reveals the characteristics of the affected genes, explains the variable phenotypes seen with identical ATRX mutations, and illustrates a new mechanism underlying variable penetrance. Many of the TRs are G rich and predicted to form non-B DNA structures (including G-quadruplex) in vivo. We show that ATRX binds G-quadruplex structures in vitro, suggesting a mechanism by which ATRX may play a role in various nuclear processes and how this is perturbed when ATRX is mutated.


Subject(s)
DNA Helicases/metabolism , Nuclear Proteins/metabolism , Animals , Cells, Cultured , Chromatin Immunoprecipitation , Chromosomes, Mammalian/metabolism , CpG Islands , DNA Helicases/genetics , DNA, Ribosomal/metabolism , G-Quadruplexes , Gene Expression , Genome-Wide Association Study , Histones/metabolism , Humans , Mice , Minisatellite Repeats , Mutation , Nuclear Proteins/genetics , Telomere/metabolism , X-linked Nuclear Protein
3.
Nucleic Acids Res ; 51(D1): D1353-D1359, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36399499

ABSTRACT

The Open Targets Platform (https://platform.opentargets.org/) is an open source resource to systematically assist drug target identification and prioritisation using publicly available data. Since our last update, we have reimagined, redesigned, and rebuilt the Platform in order to streamline data integration and harmonisation, expand the ways in which users can explore the data, and improve the user experience. The gene-disease causal evidence has been enhanced and expanded to better capture disease causality across rare, common, and somatic diseases. For target and drug annotations, we have incorporated new features that help assess target safety and tractability, including genetic constraint, PROTACtability assessments, and AlphaFold structure predictions. We have also introduced new machine learning applications for knowledge extraction from the published literature, clinical trial information, and drug labels. The new technologies and frameworks introduced since the last update will ease the introduction of new features and the creation of separate instances of the Platform adapted to user requirements. Our new Community forum, expanded training materials, and outreach programme support our users in a range of use cases.

4.
Nucleic Acids Res ; 49(D1): D1302-D1310, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33196847

ABSTRACT

The Open Targets Platform (https://www.targetvalidation.org/) provides users with a queryable knowledgebase and user interface to aid systematic target identification and prioritisation for drug discovery based upon underlying evidence. It is publicly available and the underlying code is open source. Since our last update two years ago, we have had 10 releases to maintain and continuously improve evidence for target-disease relationships from 20 different data sources. In addition, we have integrated new evidence from key datasets, including prioritised targets identified from genome-wide CRISPR knockout screens in 300 cancer models (Project Score), and GWAS/UK BioBank statistical genetic analysis evidence from the Open Targets Genetics Portal. We have evolved our evidence scoring framework to improve target identification. To aid the prioritisation of targets and inform on the potential impact of modulating a given target, we have added evaluation of post-marketing adverse drug reactions and new curated information on target tractability and safety. We have also developed the user interface and backend technologies to improve performance and usability. In this article, we describe the latest enhancements to the Platform, to address the fundamental challenge that developing effective and safe drugs is difficult and expensive.


Subject(s)
Antineoplastic Agents/therapeutic use , Drugs, Investigational/therapeutic use , Knowledge Bases , Molecular Targeted Therapy/methods , Neoplasms/drug therapy , Software , Antineoplastic Agents/chemistry , Databases, Factual , Datasets as Topic , Drug Discovery/methods , Drugs, Investigational/chemistry , Humans , Internet , Neoplasms/classification , Neoplasms/genetics , Neoplasms/pathology
5.
Nucleic Acids Res ; 49(D1): D1311-D1320, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33045747

ABSTRACT

Open Targets Genetics (https://genetics.opentargets.org) is an open-access integrative resource that aggregates human GWAS and functional genomics data including gene expression, protein abundance, chromatin interaction and conformation data from a wide range of cell types and tissues to make robust connections between GWAS-associated loci, variants and likely causal genes. This enables systematic identification and prioritisation of likely causal variants and genes across all published trait-associated loci. In this paper, we describe the public resources we aggregate, the technology and analyses we use, and the functionality that the portal offers. Open Targets Genetics can be searched by variant, gene or study/phenotype. It offers tools that enable users to prioritise causal variants and genes at disease-associated loci and access systematic cross-disease and disease-molecular trait colocalization analysis across 92 cell types and tissues including the eQTL Catalogue. Data visualizations such as Manhattan-like plots, regional plots, credible sets overlap between studies and PheWAS plots enable users to explore GWAS signals in depth. The integrated data is made available through the web portal, for bulk download and via a GraphQL API, and the software is open source. Applications of this integrated data include identification of novel targets for drug discovery and drug repurposing.


Subject(s)
Databases, Genetic , Genome, Human , Inflammatory Bowel Diseases/genetics , Molecular Targeted Therapy/methods , Quantitative Trait Loci , Software , Chromatin/chemistry , Chromatin/metabolism , Datasets as Topic , Drug Discovery/methods , Drug Repositioning/methods , Genome-Wide Association Study , Genotype , Humans , Inflammatory Bowel Diseases/drug therapy , Inflammatory Bowel Diseases/metabolism , Inflammatory Bowel Diseases/pathology , Internet , Phenotype , Quantitative Trait, Heritable
6.
Bioinformatics ; 36(9): 2936-2937, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31930349

ABSTRACT

MOTIVATION: Genome-wide association studies (GWAS) are a powerful method to detect even weak associations between variants and phenotypes; however, many of the identified associated variants are in non-coding regions, and presumably influence gene expression regulation. Identifying potential drug targets, i.e. causal protein-coding genes, therefore, requires crossing the genetics results with functional data. RESULTS: We present a novel data integration pipeline that analyses GWAS results in the light of experimental epigenetic and cis-regulatory datasets, such as ChIP-Seq, Promoter-Capture Hi-C or eQTL, and presents them in a single report, which can be used for inferring likely causal genes. This pipeline was then fed into an interactive data resource. AVAILABILITY AND IMPLEMENTATION: The analysis code is available at www.github.com/Ensembl/postgap and the interactive data browser at postgwas.opentargets.io.


Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Phenotype , Quantitative Trait Loci/genetics , Software
7.
PLoS Biol ; 16(9): e3000034, 2018 09.
Article in English | MEDLINE | ID: mdl-30256779

ABSTRACT

Determining the functions of human genes is a key objective for understanding disease and enabling development of new therapeutic approaches. A number of recent studies have shown that the amount of attention the research community gives to each of the more than 20,000 human genes is dramatically skewed toward specific, well-known genes. In this issue, Stoeger and colleagues uncover the factors that explain this bias and offer a way ahead to move more genes into the research limelight.


Subject(s)
Genes , Disease/genetics , Drug Discovery , Genome , Genome-Wide Association Study , Humans , National Institutes of Health (U.S.) , Research , United States
8.
Nat Rev Genet ; 16(10): 561-2, 2015 10.
Article in English | MEDLINE | ID: mdl-26370900

ABSTRACT

Jeffrey Barrett, Ian Dunham and Ewan Birney discuss the initiatives of the newly founded Centre for Therapeutic Target Validation, including a range of approaches to use human genetics to inform drug discovery and make better medicines.


Subject(s)
Drug Discovery/methods , Genetics, Medical/methods , Clustered Regularly Interspaced Short Palindromic Repeats , Crohn Disease/genetics , Genetic Variation , Genome-Wide Association Study , Humans , Hydroxymethylglutaryl CoA Reductases/genetics , International Cooperation , Smad7 Protein/genetics
9.
Nucleic Acids Res ; 47(D1): D1056-D1065, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30462303

ABSTRACT

The Open Targets Platform integrates evidence from genetics, genomics, transcriptomics, drugs, animal models and scientific literature to score and rank target-disease associations for drug target identification. The associations are displayed in an intuitive user interface (https://www.targetvalidation.org), and are available through a REST-API (https://api.opentargets.io/v3/platform/docs/swagger-ui) and a bulk download (https://www.targetvalidation.org/downloads/data). In addition to target-disease associations, we also aggregate and display data at the target and disease levels to aid target prioritisation. Since our first publication two years ago, we have made eight releases, added new data sources for target-disease associations, started including causal genetic variants from non genome-wide targeted arrays, added new target and disease annotations, launched new visualisations and improved existing ones and released a new web tool for batch search of up to 200 targets. We have a new URL for the Open Targets Platform REST-API, new REST endpoints and also removed the need for authorisation for API fair use. Here, we present the latest developments of the Open Targets Platform, expanding the evidence and target-disease associations with new and improved data sources, refining data quality, enhancing website usability, and increasing our user base with our training workshops, user support, social media and bioinformatics forum engagement.


Subject(s)
Computational Biology/methods , Databases, Genetic , Genomics/methods , Information Storage and Retrieval/methods , Molecular Targeted Therapy/methods , Computational Biology/trends , Gene Expression Profiling/methods , Genomics/trends , Humans , Information Storage and Retrieval/trends , Internet , Reproducibility of Results , Software
10.
Bioinformatics ; 35(22): 4767-4769, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31161210

ABSTRACT

SUMMARY: The Illumina Infinium EPIC BeadChip is a new high-throughput array for DNA methylation analysis, extending the earlier 450k array by over 400 000 new sites. Previously, a method named eFORGE was developed to provide insights into cell type-specific and cell-composition effects for 450k data. Here, we present a significantly updated and improved version of eFORGE that can analyze both EPIC and 450k array data. New features include analysis of chromatin states, transcription factor motifs and DNase I footprints, providing tools for epigenome-wide association study interpretation and epigenome editing. AVAILABILITY AND IMPLEMENTATION: eFORGE v2.0 is implemented as a web tool available from https://eforge.altiusinstitute.org and https://eforge-tf.altiusinstitute.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Methylation , Epigenomics , Chromatin , CpG Islands , Deoxyribonuclease I , Oligonucleotide Array Sequence Analysis , Software
11.
BMC Bioinformatics ; 19(1): 345, 2018 Oct 01.
Article in English | MEDLINE | ID: mdl-30285606

ABSTRACT

BACKGROUND: The Open Targets Platform integrates different data sources in order to facilitate identification of potential therapeutic drug targets to treat human diseases. It currently provides evidence for nearly 2.6 million potential target-disease pairs. G-protein coupled receptors are a drug target class of high interest because of the number of successful drugs being developed against them over many years. Here we describe a systematic approach utilizing the Open Targets Platform data to uncover and prioritize potential new disease indications for the G-protein coupled receptors and their ligands. RESULTS: Utilizing the data available in the Open Targets platform, potential G-protein coupled receptor and endogenous ligand disease association pairs were systematically identified. Intriguing examples such as GPR35 for inflammatory bowel disease and CXCR4 for viral infection are used as illustrations of how a systematic approach can aid in the prioritization of interesting drug discovery hypotheses. Combining evidence for G-protein coupled receptors and their corresponding endogenous peptidergic ligands increases confidence and provides supportive evidence for potential new target-disease hypotheses. Comparing such hypotheses to the global pharma drug discovery pipeline to validate the approach showed that more than 93% of G-protein coupled receptor-disease pairs with a high overall Open Targets score involved receptors with an existing drug discovery program. CONCLUSIONS: The Open Targets gene-disease score can be used to prioritize potential G-protein coupled receptors-indication hypotheses. In addition, availability of multiple different evidence types markedly increases confidence as does combining evidence from known receptor-ligand pairs. Comparing the top-ranked hypotheses to the current global pharma pipeline serves as validation of our approach and identifies and prioritizes new therapeutic opportunities.


Subject(s)
Disease/genetics , Drug Discovery/methods , Ligands , Protein Binding/physiology , Receptors, G-Protein-Coupled/metabolism , Humans
13.
PLoS Biol ; 13(7): e1002216, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26225775

ABSTRACT

The last few decades have utterly transformed genetics and genomics, but what might the next ten years bring? PLOS Biology asked eight leaders spanning a range of related areas to give us their predictions. Without exception, the predictions are for more data on a massive scale and of more diverse types. All are optimistic and predict enormous positive impact on scientific understanding, while a recurring theme is the benefit of such data for the transformation and personalization of medicine. Several also point out that the biggest changes will very likely be those that we don't foresee, even now.


Subject(s)
Genomics/trends , Forecasting
15.
Nat Methods ; 11(3): 294-6, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24487584

ABSTRACT

Identifying functionally relevant variants against the background of ubiquitous genetic variation is a major challenge in human genetics. For variants in protein-coding regions, our understanding of the genetic code and splicing allows us to identify likely candidates, but interpreting variants outside genic regions is more difficult. Here we present genome-wide annotation of variants (GWAVA), a tool that supports prioritization of noncoding variants by integrating various genomic and epigenomic annotations.


Subject(s)
Molecular Sequence Annotation , Untranslated Regions/genetics , Algorithms , Computer Simulation , Genetic Variation , Humans
16.
J Transl Med ; 15(1): 182, 2017 08 29.
Article in English | MEDLINE | ID: mdl-28851378

ABSTRACT

BACKGROUND: Target identification and validation is a pressing challenge in the pharmaceutical industry, with many of the programmes that fail for efficacy reasons showing poor association between the drug target and the disease. Computational prediction of successful targets could have a considerable impact on attrition rates in the drug discovery pipeline by significantly reducing the initial search space. Here, we explore whether gene-disease association data from the Open Targets platform is sufficient to predict therapeutic targets that are actively being pursued by pharmaceutical companies or are already on the market. METHODS: To test our hypothesis, we train four different classifiers (a random forest, a support vector machine, a neural network and a gradient boosting machine) on partially labelled data and evaluate their performance using nested cross-validation and testing on an independent set. We then select the best performing model and use it to make predictions on more than 15,000 genes. Finally, we validate our predictions by mining the scientific literature for proposed therapeutic targets. RESULTS: We observe that the data types with the best predictive power are animal models showing a disease-relevant phenotype, differential expression in diseased tissue and genetic association with the disease under investigation. On a test set, the neural network classifier achieves over 71% accuracy with an AUC of 0.76 when predicting therapeutic targets in a semi-supervised learning setting. We use this model to gain insights into current and failed programmes and to predict 1431 novel targets, of which a highly significant proportion has been independently proposed in the literature. CONCLUSIONS: Our in silico approach shows that data linking genes and diseases is sufficient to predict novel therapeutic targets effectively and confirms that this type of evidence is essential for formulating or strengthening hypotheses in the target discovery process. Ultimately, more rapid and automated target prioritisation holds the potential to reduce both the costs and the development times associated with bringing new medicines to patients.


Subject(s)
Computer Simulation , Genetic Predisposition to Disease , Molecular Targeted Therapy , Algorithms , Area Under Curve , Data Mining , Drug Discovery , Neural Networks, Computer , Reproducibility of Results
17.
PLoS Genet ; 10(11): e1004798, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25411781

ABSTRACT

Associating genetic variation with quantitative measures of gene regulation offers a way to bridge the gap between genotype and complex phenotypes. In order to identify quantitative trait loci (QTLs) that influence the binding of a transcription factor in humans, we measured binding of the multifunctional transcription and chromatin factor CTCF in 51 HapMap cell lines. We identified thousands of QTLs in which genotype differences were associated with differences in CTCF binding strength, hundreds of them confirmed by directly observable allele-specific binding bias. The majority of QTLs were either within 1 kb of the CTCF binding motif, or in linkage disequilibrium with a variant within 1 kb of the motif. On the X chromosome we observed three classes of binding sites: a minority class bound only to the active copy of the X chromosome, the majority class bound to both the active and inactive X, and a small set of female-specific CTCF sites associated with two non-coding RNA genes. In sum, our data reveal extensive genetic effects on CTCF binding, both direct and indirect, and identify a diversity of patterns of CTCF binding on the X chromosome.


Subject(s)
Chromosomes, Human, X/genetics , Quantitative Trait Loci , Repressor Proteins/genetics , Alleles , CCCTC-Binding Factor , Female , Humans , Linkage Disequilibrium , Protein Binding , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Repressor Proteins/metabolism
18.
Proc Natl Acad Sci U S A ; 111(17): 6131-8, 2014 Apr 29.
Article in English | MEDLINE | ID: mdl-24753594

ABSTRACT

With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.


Subject(s)
DNA/genetics , Genome, Human/genetics , Biological Evolution , Disease/genetics , Humans , Regulatory Sequences, Nucleic Acid/genetics , Software
19.
Nucleic Acids Res ; 41(2): 827-41, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23221638

ABSTRACT

The ENCODE Project has generated a wealth of experimental information mapping diverse chromatin properties in several human cell lines. Although each such data track is independently informative toward the annotation of regulatory elements, their interrelations contain much richer information for the systematic annotation of regulatory elements. To uncover these interrelations and to generate an interpretable summary of the massive datasets of the ENCODE Project, we apply unsupervised learning methodologies, converting dozens of chromatin datasets into discrete annotation maps of regulatory regions and other chromatin elements across the human genome. These methods rediscover and summarize diverse aspects of chromatin architecture, elucidate the interplay between chromatin activity and RNA transcription, and reveal that a large proportion of the genome lies in a quiescent state, even across multiple cell types. The resulting annotations of non-coding regulatory elements correlate strongly with mammalian evolutionary constraint, and provide an unbiased approach for evaluating metrics of evolutionary constraint in human. Lastly, we use the regulatory annotations to revisit previously uncharacterized disease-associated loci, resulting in focused, testable hypotheses through the lens of the chromatin landscape.


Subject(s)
Chromatin/chemistry , Genome, Human , Molecular Sequence Annotation , Regulatory Elements, Transcriptional , Enhancer Elements, Genetic , Genome-Wide Association Study , Humans , Insulator Elements , Promoter Regions, Genetic , Proteins/genetics , Terminator Regions, Genetic , Transcription, Genetic
20.
Nucleic Acids Res ; 41(Database issue): D48-55, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203987

ABSTRACT

The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidence-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools and programmatic interfaces.


Subject(s)
Databases, Genetic , Genomics , Animals , Gene Expression Regulation , Genetic Variation , Humans , Internet , Mice , Molecular Sequence Annotation , Rats , Software , Zebrafish/genetics