A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles.

Subramanian, Aravind; Narayan, Rajiv; Corsello, Steven M; Peck, David D; Natoli, Ted E; Lu, Xiaodong; Gould, Joshua; Davis, John F; Tubelli, Andrew A; Asiedu, Jacob K; Lahr, David L; Hirschman, Jodi E; Liu, Zihan; Donahue, Melanie; Julian, Bina; Khan, Mariya; Wadden, David; Smith, Ian C; Lam, Daniel; Liberzon, Arthur; Toder, Courtney; Bagul, Mukta; Orzechowski, Marek; Enache, Oana M; Piccioni, Federica; Johnson, Sarah A; Lyons, Nicholas J; Berger, Alice H; Shamji, Alykhan F; Brooks, Angela N; Vrcic, Anita; Flynn, Corey; Rosains, Jacqueline; Takeda, David Y; Hu, Roger; Davison, Desiree; Lamb, Justin; Ardlie, Kristin; Hogstrom, Larson; Greenside, Peyton; Gray, Nathanael S; Clemons, Paul A; Silver, Serena; Wu, Xiaoyun; Zhao, Wen-Ning; Read-Button, Willis; Wu, Xiaohua; Haggarty, Stephen J; Ronco, Lucienne V; Boehm, Jesse S.

Cell ; 171(6): 1437-1452.e17, 2017 Nov 30.

Article in English | MEDLINE | ID: mdl-29195078

ABSTRACT

We previously piloted the concept of a Connectivity Map (CMap), whereby genes, drugs, and disease states are connected by virtue of common gene-expression signatures. Here, we report more than a 1,000-fold scale-up of the CMap as part of the NIH LINCS Consortium, made possible by a new, low-cost, high-throughput reduced representation expression profiling method that we term L1000. We show that L1000 is highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts. We further show that the expanded CMap can be used to discover mechanism of action of small molecules, functionally annotate genetic variants of disease genes, and inform clinical trials. The 1.3 million L1000 profiles described here, as well as tools for their analysis, are available at https://clue.io.

Subject(s)

Gene Expression Profiling/methods , Cell Line, Tumor , Drug Resistance, Neoplasm , Gene Expression Profiling/economics , Humans , Neoplasms/drug therapy , Organ Specificity , Pharmaceutical Preparations/metabolism , Sequence Analysis, RNA/economics , Sequence Analysis, RNA/methods , Small Molecule Libraries

The GCTx format and cmap{Py, R, M, J} packages: resources for optimized storage and integrated traversal of annotated dense matrices.

Enache, Oana M; Lahr, David L; Natoli, Ted E; Litichevskiy, Lev; Wadden, David; Flynn, Corey; Gould, Joshua; Asiedu, Jacob K; Narayan, Rajiv; Subramanian, Aravind.

Bioinformatics ; 35(8): 1427-1429, 2019 04 15.

Article in English | MEDLINE | ID: mdl-30203022

ABSTRACT

MOTIVATION: Facilitated by technological improvements, pharmacologic and genetic perturbational datasets have grown in recent years to include millions of experiments. Sharing and publicly distributing these diverse data creates many opportunities for discovery, but in recent years the unprecedented size of data generated and its complex associated metadata have also created data storage and integration challenges. RESULTS: We present the GCTx file format and a suite of open-source packages for the efficient storage, serialization and analysis of dense two-dimensional matrices. We have extensively used the format in the Connectivity Map to assemble and share massive datasets currently comprising 1.3 million experiments, and we anticipate that the format's generalizability, paired with code libraries that we provide, will lower barriers for integrated cross-assay analysis and algorithm development. AVAILABILITY AND IMPLEMENTATION: Software packages (available in Python, R, Matlab and Java) are freely available at https://github.com/cmap. Additional instructions, tutorials and datasets are available at clue.io/code. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Metadata , Software , Algorithms , Information Storage and Retrieval

Evaluation of RNAi and CRISPR technologies by large-scale gene expression profiling in the Connectivity Map.

Smith, Ian; Greenside, Peyton G; Natoli, Ted; Lahr, David L; Wadden, David; Tirosh, Itay; Narayan, Rajiv; Root, David E; Golub, Todd R; Subramanian, Aravind; Doench, John G.

PLoS Biol ; 15(11): e2003213, 2017 Nov.

Article in English | MEDLINE | ID: mdl-29190685

ABSTRACT

The application of RNA interference (RNAi) to mammalian cells has provided the means to perform phenotypic screens to determine the functions of genes. Although RNAi has revolutionized loss-of-function genetic experiments, it has been difficult to systematically assess the prevalence and consequences of off-target effects. The Connectivity Map (CMAP) represents an unprecedented resource to study the gene expression consequences of expressing short hairpin RNAs (shRNAs). Analysis of signatures for over 13,000 shRNAs applied in 9 cell lines revealed that microRNA (miRNA)-like off-target effects of RNAi are far stronger and more pervasive than generally appreciated. We show that mitigating off-target effects is feasible in these datasets via computational methodologies to produce a consensus gene signature (CGS). In addition, we compared RNAi technology to clustered regularly interspaced short palindromic repeat (CRISPR)-based knockout by analysis of 373 single guide RNAs (sgRNAs) in 6 cell lines and show that the on-target efficacies are comparable, but CRISPR technology is far less susceptible to systematic off-target effects. These results will help guide the proper use and analysis of loss-of-function reagents for the determination of gene function.

Subject(s)

Clustered Regularly Interspaced Short Palindromic Repeats , Gene Expression Profiling , Gene Regulatory Networks/genetics , Genomics/methods , RNA Interference/physiology , Cells, Cultured , Gene Expression Regulation, Neoplastic , Genomics/standards , HT29 Cells , Hep G2 Cells , Humans , MCF-7 Cells , RNA, Small Interfering/genetics , Transcriptome

An informatic pipeline for managing high-throughput screening experiments and analyzing data from stereochemically diverse libraries.

Mulrooney, Carol A; Lahr, David L; Quintin, Michael J; Youngsaye, Willmen; Moccia, Dennis; Asiedu, Jacob K; Mulligan, Evan L; Akella, Lakshmi B; Marcaurelle, Lisa A; Montgomery, Philip; Bittker, Joshua A; Clemons, Paul A; Brudz, Stephen; Dandapani, Sivaraman; Duvall, Jeremy R; Tolliday, Nicola J; De Souza, Andrea.

J Comput Aided Mol Des ; 27(5): 455-68, 2013 May.

Article in English | MEDLINE | ID: mdl-23585218

ABSTRACT

Integration of flexible data-analysis tools with cheminformatics methods is a prerequisite for successful identification and validation of "hits" in high-throughput screening (HTS) campaigns. We have designed, developed, and implemented a suite of robust yet flexible cheminformatics tools to support HTS activities at the Broad Institute, three of which are described herein. The "hit-calling" tool allows a researcher to set a hit threshold that can be varied during downstream analysis. The results from the hit-calling exercise are reported to a database for record keeping and further data analysis. The "cherry-picking" tool enables creation of an optimized list of hits for confirmatory and follow-up assays from an HTS hit list. This tool allows filtering by computed chemical property and by substructure. In addition, similarity searches can be performed on hits of interest and sets of related compounds can be selected. The third tool, an "S/SAR viewer," has been designed specifically for the Broad Institute's diversity-oriented synthesis (DOS) collection. The compounds in this collection are rich in chiral centers and the full complement of all possible stereoisomers of a given compound is present in the collection. The S/SAR viewer allows rapid identification of both structure/activity relationships and stereo-structure/activity relationships present in HTS data from the DOS collection. Together, these tools enable the prioritization and analysis of hits from diverse compound collections, and enable informed decisions for follow-up biology and chemistry efforts.

Subject(s)

Drug Design , High-Throughput Screening Assays , Structure-Activity Relationship , Algorithms , Combinatorial Chemistry Techniques , Databases, Factual , Humans

A Library of Phosphoproteomic and Chromatin Signatures for Characterizing Cellular Responses to Drug Perturbations.

Litichevskiy, Lev; Peckner, Ryan; Abelin, Jennifer G; Asiedu, Jacob K; Creech, Amanda L; Davis, John F; Davison, Desiree; Dunning, Caitlin M; Egertson, Jarrett D; Egri, Shawn; Gould, Joshua; Ko, Tak; Johnson, Sarah A; Lahr, David L; Lam, Daniel; Liu, Zihan; Lyons, Nicholas J; Lu, Xiaodong; MacLean, Brendan X; Mungenast, Alison E; Officer, Adam; Natoli, Ted E; Papanastasiou, Malvina; Patel, Jinal; Sharma, Vagisha; Toder, Courtney; Tubelli, Andrew A; Young, Jennie Z; Carr, Steven A; Golub, Todd R; Subramanian, Aravind; MacCoss, Michael J; Tsai, Li-Huei; Jaffe, Jacob D.

Cell Syst ; 6(4): 424-443.e7, 2018 Apr 25.

Article in English | MEDLINE | ID: mdl-29655704

ABSTRACT

Although the value of proteomics has been demonstrated, cost and scale are typically prohibitive, and gene expression profiling remains dominant for characterizing cellular responses to perturbations. However, high-throughput sentinel assays provide an opportunity for proteomics to contribute at a meaningful scale. We present a systematic library resource (90 drugs × 6 cell lines) of proteomic signatures that measure changes in the reduced-representation phosphoproteome (P100) and changes in epigenetic marks on histones (GCP). A majority of these drugs elicited reproducible signatures, but notable cell line- and assay-specific differences were observed. Using the "connectivity" framework, we compared signatures across cell types and integrated data across assays, including a transcriptional assay (L1000). Consistent connectivity among cell types revealed cellular responses that transcended lineage, and consistent connectivity among assays revealed unexpected associations between drugs. We further leveraged the resource against public data to formulate hypotheses for treatment of multiple myeloma and acute lymphocytic leukemia. This resource is publicly available at https://clue.io/proteomics.

Subject(s)

Databases, Factual , Phosphoproteins/drug effects , Algorithms , Cell Line , Chromatography, Liquid , Datasets as Topic , Gene Expression Regulation , Histone Code , Humans , Mass Spectrometry , Pharmacological and Toxicological Phenomena , Phosphoproteins/metabolism , Proteomics , Signal Transduction , Software

High-throughput Phenotyping of Lung Cancer Somatic Mutations.

Berger, Alice H; Brooks, Angela N; Wu, Xiaoyun; Shrestha, Yashaswi; Chouinard, Candace; Piccioni, Federica; Bagul, Mukta; Kamburov, Atanas; Imielinski, Marcin; Hogstrom, Larson; Zhu, Cong; Yang, Xiaoping; Pantel, Sasha; Sakai, Ryo; Watson, Jacqueline; Kaplan, Nathan; Campbell, Joshua D; Singh, Shantanu; Root, David E; Narayan, Rajiv; Natoli, Ted; Lahr, David L; Tirosh, Itay; Tamayo, Pablo; Getz, Gad; Wong, Bang; Doench, John; Subramanian, Aravind; Golub, Todd R; Meyerson, Matthew; Boehm, Jesse S.

Cancer Cell ; 30(2): 214-228, 2016 08 08.

Article in English | MEDLINE | ID: mdl-27478040

ABSTRACT

Recent genome sequencing efforts have identified millions of somatic mutations in cancer. However, the functional impact of most variants is poorly understood. Here we characterize 194 somatic mutations identified in primary lung adenocarcinomas. We present an expression-based variant-impact phenotyping (eVIP) method that uses gene expression changes to distinguish impactful from neutral somatic mutations. eVIP identified 69% of mutations analyzed as impactful and 31% as functionally neutral. A subset of the impactful mutations induces xenograft tumor formation in mice and/or confers resistance to cellular EGFR inhibition. Among these impactful variants are rare somatic, clinically actionable variants including EGFR S645C, ARAF S214C and S214F, ERBB2 S418T, and multiple BRAF variants, demonstrating that rare mutations can be functionally important in cancer.

Subject(s)

Adenocarcinoma/genetics , High-Throughput Nucleotide Sequencing/methods , Lung Neoplasms/genetics , Mutation , Adenocarcinoma of Lung , Animals , Cell Line, Tumor , Gene Expression Profiling , Heterografts , Humans , Mice , Oncogenes , Phenotype

An Overview of the Challenges in Designing, Integrating, and Delivering BARD: A Public Chemical-Biology Resource and Query Portal for Multiple Organizations, Locations, and Disciplines.

de Souza, Andrea; Bittker, Joshua A; Lahr, David L; Brudz, Steve; Chatwin, Simon; Oprea, Tudor I; Waller, Anna; Yang, Jeremy J; Southall, Noel; Guha, Rajarshi; Schürer, Stephan C; Vempati, Uma D; Southern, Mark R; Dawson, Eric S; Clemons, Paul A; Chung, Thomas D Y.

J Biomol Screen ; 19(5): 614-27, 2014 Jun.

Article in English | MEDLINE | ID: mdl-24441647

ABSTRACT

Recent industry-academic partnerships involve collaboration among disciplines, locations, and organizations using publicly funded "open-access" and proprietary commercial data sources. These require the effective integration of chemical and biological information from diverse data sources, which presents key informatics, personnel, and organizational challenges. The BioAssay Research Database (BARD) was conceived to address these challenges and serve as a community-wide resource and intuitive web portal for public-sector chemical-biology data. Its initial focus is to enable scientists to more effectively use the National Institutes of Health Roadmap Molecular Libraries Program (MLP) data generated from the 3-year pilot and 6-year production phases of the Molecular Libraries Probe Production Centers Network (MLPCN), which is currently in its final year. BARD evolves the current data standards through structured assay and result annotations that leverage BioAssay Ontology and other industry-standard ontologies, and a core hierarchy of assay definition terms and data standards defined specifically for small-molecule assay data. We initially focused on migrating the highest-value MLP data into BARD and bringing it up to this new standard. We review the technical and organizational challenges overcome by the interdisciplinary BARD team, veterans of public- and private-sector data-integration projects, who are collaborating to describe (functional specifications), design (technical specifications), and implement this next-generation software solution.

Subject(s)

Databases, Chemical , Access to Information , Biochemistry , Chemistry, Pharmaceutical/methods , Data Collection , Drug Discovery , Drug Industry , Internet , National Institutes of Health (U.S.) , Small Molecule Libraries/chemistry , Software , United States

High-throughput Phenotyping of Lung Cancer Somatic Mutations.

Cancer Cell ; 32(6): 884, 2017 12 11.

Article in English | MEDLINE | ID: mdl-29232558

Catalyzed CO oxidation at 70 K on an extended Au/Ni surface alloy.

Lahr, David L; Ceyer, Sylvia T.

J Am Chem Soc ; 128(6): 1800-1, 2006 Feb 15.

Article in English | MEDLINE | ID: mdl-16464073

ABSTRACT

A Au/Ni(111) surface alloy catalyzes the oxidation of CO at low temperature by at least three distinct mechanisms. At the lowest temperature of 70 K, molecularly adsorbed O₂, spectroscopically characterized as peroxo or superoxo species bound at multiple sites with vibrational frequencies of 865 and 950 cm⁻¹, is the reactant with CO. Between 105 and 125 K, CO₂ production coincides with O₂ dissociation, suggesting a "hot atom" mechanism. Above 125 K, adsorbed CO reacts with atomically adsorbed O atoms. These results show that nanosize Au clusters bound to oxide supports are not a necessary condition for Au-catalyzed, low-temperature CO oxidation.
