1.
Cell; 171(6): 1437-1452.e17, 2017 Nov 30.
Article in English | MEDLINE | ID: mdl-29195078

ABSTRACT

We previously piloted the concept of a Connectivity Map (CMap), whereby genes, drugs, and disease states are connected by virtue of common gene-expression signatures. Here, we report more than a 1,000-fold scale-up of the CMap as part of the NIH LINCS Consortium, made possible by a new, low-cost, high-throughput reduced representation expression profiling method that we term L1000. We show that L1000 is highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts. We further show that the expanded CMap can be used to discover the mechanisms of action of small molecules, functionally annotate genetic variants of disease genes, and inform clinical trials. The 1.3 million L1000 profiles described here, as well as tools for their analysis, are available at https://clue.io.


Subject(s)
Gene Expression Profiling/methods; Cell Line, Tumor; Drug Resistance, Neoplasm; Gene Expression Profiling/economics; Humans; Neoplasms/drug therapy; Organ Specificity; Pharmaceutical Preparations/metabolism; Sequence Analysis, RNA/economics; Sequence Analysis, RNA/methods; Small Molecule Libraries
2.
Bioinformatics; 35(8): 1427-1429, 2019 Apr 15.
Article in English | MEDLINE | ID: mdl-30203022

ABSTRACT

MOTIVATION: Facilitated by technological improvements, pharmacologic and genetic perturbational datasets have grown in recent years to include millions of experiments. Sharing and publicly distributing these diverse data creates many opportunities for discovery, but the unprecedented size of the data generated and their complex associated metadata have also created data storage and integration challenges. RESULTS: We present the GCTx file format and a suite of open-source packages for the efficient storage, serialization and analysis of dense two-dimensional matrices. We have extensively used the format in the Connectivity Map to assemble and share massive datasets currently comprising 1.3 million experiments, and we anticipate that the format's generalizability, paired with code libraries that we provide, will lower barriers for integrated cross-assay analysis and algorithm development. AVAILABILITY AND IMPLEMENTATION: Software packages (available in Python, R, Matlab and Java) are freely available at https://github.com/cmap. Additional instructions, tutorials and datasets are available at clue.io/code. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Metadata; Software; Algorithms; Information Storage and Retrieval
3.
J Comput Aided Mol Des; 27(5): 455-468, 2013 May.
Article in English | MEDLINE | ID: mdl-23585218

ABSTRACT

Integration of flexible data-analysis tools with cheminformatics methods is a prerequisite for successful identification and validation of "hits" in high-throughput screening (HTS) campaigns. We have designed, developed, and implemented a suite of robust yet flexible cheminformatics tools to support HTS activities at the Broad Institute, three of which are described herein. The "hit-calling" tool allows a researcher to set a hit threshold that can be varied during downstream analysis. The results from the hit-calling exercise are reported to a database for record keeping and further data analysis. The "cherry-picking" tool enables creation of an optimized list of hits for confirmatory and follow-up assays from an HTS hit list. This tool allows filtering by computed chemical property and by substructure. In addition, similarity searches can be performed on hits of interest and sets of related compounds can be selected. The third tool, an "S/SAR viewer," has been designed specifically for the Broad Institute's diversity-oriented synthesis (DOS) collection. The compounds in this collection are rich in chiral centers, and the full complement of possible stereoisomers of a given compound is present in the collection. The S/SAR viewer allows rapid identification of both structure/activity relationships and stereo-structure/activity relationships present in HTS data from the DOS collection. Together, these tools enable the prioritization and analysis of hits from diverse compound collections, and enable informed decisions for follow-up biology and chemistry efforts.
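
The idea of a hit threshold that can be varied during downstream analysis can be illustrated with a minimal Python sketch. This is not the Broad Institute tool itself: the `WellResult` record, the `call_hits` function, the compound identifiers, and the activity values are all hypothetical, and the sketch assumes a single normalized activity score per compound where higher means more active.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WellResult:
    compound_id: str  # hypothetical compound identifier
    activity: float   # hypothetical normalized activity score (higher = more active)


def call_hits(results, threshold):
    """Return IDs of compounds whose activity meets or exceeds the threshold."""
    return [r.compound_id for r in results if r.activity >= threshold]


# Illustrative values only; not taken from any real screen.
results = [
    WellResult("CPD-001", 12.5),
    WellResult("CPD-002", 67.0),
    WellResult("CPD-003", 48.3),
    WellResult("CPD-004", 91.2),
]

# The stored results are unchanged; only the threshold varies between calls.
strict_hits = call_hits(results, threshold=60.0)
relaxed_hits = call_hits(results, threshold=40.0)
```

Keeping the raw activities and applying the threshold at query time is what would let a hit list be revisited later without re-running the screen.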


Subject(s)
Drug Design; High-Throughput Screening Assays; Structure-Activity Relationship; Algorithms; Combinatorial Chemistry Techniques; Databases, Factual; Humans
4.
Sci Data; 8(1): 226, 2021 Aug 25.
Article in English | MEDLINE | ID: mdl-34433823

ABSTRACT

While gene expression profiling has traditionally been the method of choice for large-scale perturbational profiling studies, proteomics has emerged as an effective tool in this context for directly monitoring cellular responses to perturbations. We previously reported a pilot library containing 3400 profiles of multiple perturbations across diverse cellular backgrounds in the reduced-representation phosphoproteome (P100) and chromatin space (Global Chromatin Profiling, GCP). Here, we expand our original dataset to include profiles from a new set of cardiotoxic compounds and from astrocytes, an additional neural cell model, totaling 5300 proteomic signatures. We describe filtering criteria and quality control metrics used to assess and validate the technical quality and reproducibility of our data. To demonstrate the power of the library, we present two case studies where data are queried using the concept of "connectivity" to obtain biological insight. All data presented in this study have been deposited to the ProteomeXchange Consortium with identifiers PXD017458 (P100) and PXD017459 (GCP) and can be queried at https://clue.io/proteomics.


Subject(s)
Antineoplastic Agents/toxicity; Astrocytes/drug effects; Astrocytes/metabolism; Cardiotoxins/toxicity; Protein Kinase Inhibitors/toxicity; Proteomics; Cell Line, Tumor; Humans; Phosphorylation/drug effects; Protein Processing, Post-Translational/drug effects; Proteome
5.
Cell Syst; 6(4): 424-443.e7, 2018 Apr 25.
Article in English | MEDLINE | ID: mdl-29655704

ABSTRACT

Although the value of proteomics has been demonstrated, cost and scale are typically prohibitive, and gene expression profiling remains dominant for characterizing cellular responses to perturbations. However, high-throughput sentinel assays provide an opportunity for proteomics to contribute at a meaningful scale. We present a systematic library resource (90 drugs × 6 cell lines) of proteomic signatures that measure changes in the reduced-representation phosphoproteome (P100) and changes in epigenetic marks on histones (GCP). A majority of these drugs elicited reproducible signatures, but notable cell line- and assay-specific differences were observed. Using the "connectivity" framework, we compared signatures across cell types and integrated data across assays, including a transcriptional assay (L1000). Consistent connectivity among cell types revealed cellular responses that transcended lineage, and consistent connectivity among assays revealed unexpected associations between drugs. We further leveraged the resource against public data to formulate hypotheses for treatment of multiple myeloma and acute lymphocytic leukemia. This resource is publicly available at https://clue.io/proteomics.


Subject(s)
Databases, Factual; Phosphoproteins/drug effects; Algorithms; Cell Line; Chromatography, Liquid; Datasets as Topic; Gene Expression Regulation; Histone Code; Humans; Mass Spectrometry; Pharmacological and Toxicological Phenomena; Phosphoproteins/metabolism; Proteomics; Signal Transduction; Software