ABSTRACT
The identification of genetic and chemical perturbations with similar impacts on cell morphology can elucidate compounds' mechanisms of action or novel regulators of genetic pathways. Research on methods for identifying such similarities has lagged due to a lack of carefully designed and well-annotated image sets of cells treated with chemical and genetic perturbations. Here we create such a Resource dataset, CPJUMP1, in which each perturbed gene's product is a known target of at least two chemical compounds in the dataset. We systematically explore the directionality of correlations among perturbations that target the same protein encoded by a given gene, and we find that identifying matches between chemical and genetic perturbations is a challenging task. Our dataset and baseline analyses provide a benchmark for evaluating methods that measure perturbation similarities and impact, and more generally, learn effective representations of cellular state from microscopy images. Such advancements would accelerate the applications of image-based profiling of cellular states, such as uncovering drug mode of action or probing functional genomics.
Subject(s)
Image Processing, Computer-Assisted , Humans , Image Processing, Computer-Assisted/methods , Microscopy/methods
ABSTRACT
Aminoacyl-tRNA synthetases (aaRS) catalyze both chemical steps that translate the universal genetic code. Rodin and Ohno offered an explanation for the existence of two aaRS classes, observing that codons for the most highly conserved Class I active-site residues are anticodons for corresponding Class II active-site residues. They proposed that the two classes arose simultaneously, by translation of opposite strands from the same gene. We have characterized wild-type 46-residue peptides containing ATP-binding sites of Class I and II synthetases and those coded by a gene designed by Rosetta to encode the corresponding peptides on opposite strands. Catalysis by WT and designed peptides is saturable, and the designed peptides are sensitive to active-site residue mutation. All have comparable apparent second-order rate constants, 2.9-7.0 × 10⁻³ M⁻¹ s⁻¹, or ~750,000-1,300,000 times the uncatalyzed rate. The activities of the two complementary peptides demonstrate that the unique information in a gene can have two functional interpretations, one from each complementary strand. The peptides contain phylogenetic signatures of longer, more sophisticated catalysts we call Urzymes and are short enough to bridge the gap between them and simpler uncoded peptides. Thus, they directly substantiate the sense/antisense coding ancestry of Class I and II aaRS. Furthermore, designed 46-mers achieve similar catalytic proficiency to wild-type 46-mers by significant increases in both kcat and Km values, supporting suggestions that the earliest peptide catalysts activated ATP for biosynthetic purposes.
Subject(s)
Adenosine Triphosphate/chemistry , Amino Acyl-tRNA Synthetases/chemistry , Codon/chemistry , Genetic Code , Peptides/chemistry , Adenosine Triphosphate/metabolism , Amino Acid Sequence , Amino Acyl-tRNA Synthetases/genetics , Amino Acyl-tRNA Synthetases/metabolism , Aminoacylation , Biocatalysis , Catalytic Domain , Codon/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Evolution, Molecular , Gene Expression , Kinetics , Molecular Sequence Data , Mutation , Peptides/genetics , Peptides/metabolism , Protein Binding , Recombinant Proteins/chemistry , Recombinant Proteins/genetics , Recombinant Proteins/metabolism
ABSTRACT
We previously showed (Li, L., and Carter, C. W., Jr. (2013) J. Biol. Chem. 288, 34736-34745) that increased specificity for tryptophan versus tyrosine by contemporary Bacillus stearothermophilus tryptophanyl-tRNA synthetase (TrpRS) over that of TrpRS Urzyme results entirely from coupling between the anticodon-binding domain and an insertion into the Rossmann fold known as Connecting Peptide 1. We show that this effect is closely related to a long-range catalytic effect, in which side chain repacking in a region called the D1 Switch accounts fully for the catalytic contribution of the catalytic Mg²⁺ ion. We report intrinsic and higher-order interaction effects on the specificity ratio, (kcat/Km)Trp/(kcat/Km)Tyr, of 15 combinatorial mutants from a previous study (Weinreb, V., Li, L., and Carter, C. W., Jr. (2012) Structure 20, 128-138) of the catalytic role of the D1 Switch. Unexpectedly, the same four-way interaction both activates catalytic assist by Mg²⁺ ion and contributes -4.4 kcal/mol to the free energy of the specificity ratio. A minimum action path computed for the induced-fit and catalytic conformation changes shows that repacking of the four residues precedes a decrease in the volume of the tryptophan-binding pocket. We suggest that previous efforts to alter amino acid specificities of TrpRS and glutaminyl-tRNA synthetase (GlnRS) by mutagenesis without extensive, modular substitution failed because mutations were incompatible with interdomain motions required for catalysis.
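The link between a specificity ratio and a free energy can be illustrated with a short calculation. The sketch below assumes the standard relation ΔG = -RT ln(ratio) and a temperature of 298.15 K; neither the sign convention nor the temperature is stated in the abstract, and the function names are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def specificity_ratio(kcat_km_trp, kcat_km_tyr):
    """Return (kcat/Km)Trp / (kcat/Km)Tyr for two measured efficiencies."""
    return kcat_km_trp / kcat_km_tyr


def ratio_to_free_energy(ratio, temperature_k=298.15):
    """Convert a specificity ratio to a free energy in kcal/mol.

    Assumes dG = -RT ln(ratio); 298.15 K is an assumed temperature.
    """
    return -R_KCAL * temperature_k * math.log(ratio)


def free_energy_to_ratio(delta_g, temperature_k=298.15):
    """Inverse of ratio_to_free_energy: fold change implied by dG."""
    return math.exp(-delta_g / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Fold change in specificity implied by the reported -4.4 kcal/mol,
    # under the assumed convention and temperature.
    print(free_energy_to_ratio(-4.4))
```

Under these assumptions, a contribution of -4.4 kcal/mol corresponds to a change in the specificity ratio of more than three orders of magnitude.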
Subject(s)
Bacterial Proteins/chemistry , Geobacillus stearothermophilus/enzymology , Tryptophan-tRNA Ligase/chemistry , Amino Acid Motifs , Amino Acyl-tRNA Synthetases/chemistry , Amino Acyl-tRNA Synthetases/genetics , Amino Acyl-tRNA Synthetases/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Catalysis , Geobacillus stearothermophilus/genetics , Protein Structure, Tertiary , Tryptophan-tRNA Ligase/genetics , Tryptophan-tRNA Ligase/metabolism
ABSTRACT
We tested the idea that ancestral class I and II aminoacyl-tRNA synthetases arose on opposite strands of the same gene. We assembled excerpted 94-residue Urgenes for class I tryptophanyl-tRNA synthetase (TrpRS) and class II histidyl-tRNA synthetase (HisRS) from a diverse group of species, by identifying and catenating three blocks coding for secondary structures that position the most highly conserved active-site residues. The codon middle-base pairing frequency was 0.35 ± 0.0002 in all-by-all sense/antisense alignments for 211 TrpRS and 207 HisRS sequences, compared with frequencies between 0.22 ± 0.0009 and 0.27 ± 0.0005 for eight different representations of the null hypothesis. Clustering algorithms demonstrate further that profiles of middle-base pairing in the synthetase antisense alignments are correlated along the sequences from one species-pair to another, whereas this is not the case for similar operations on sets representing the null hypothesis. Most probable reconstructed sequences for ancestral nodes of maximum likelihood trees show that middle-base pairing frequency increases to approximately 0.42 ± 0.002 as bacterial trees approach their roots; ancestral nodes from trees including archaeal sequences show a less pronounced increase. Thus, contemporary and reconstructed sequences all validate important bioinformatic predictions based on descent from opposite strands of the same ancestral gene. They further provide novel evidence for the hypothesis that bacteria lie closer than archaea to the origin of translation. Moreover, the inverse polarity of genetic coding, together with a priori α-helix propensities, suggests that in-frame coding on opposite strands leads to similar secondary structures with opposite polarity, as observed in TrpRS and HisRS crystal structures.
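The codon middle-base pairing frequency reported above can be illustrated with a minimal sketch. It assumes two in-frame coding sequences of equal length, aligned antiparallel so that codon i of one gene faces codon n-1-i of the other, and it counts Watson-Crick complementarity at the second codon position. The function names are hypothetical and this is not the authors' alignment pipeline.

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def codon_middle_bases(seq):
    """Return the second base of every codon in an in-frame sequence."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i + 1] for i in range(0, len(seq), 3)]


def middle_base_pairing_frequency(sense, antisense_partner):
    """Fraction of codon pairs whose middle bases are complementary.

    Both inputs are read 5'->3' in their own coding frame; the partner
    is paired antiparallel, so its codon order is reversed.
    """
    mids_a = codon_middle_bases(sense.upper())
    mids_b = codon_middle_bases(antisense_partner.upper())
    if len(mids_a) != len(mids_b):
        raise ValueError("sequences must contain the same number of codons")
    paired = sum(
        1 for a, b in zip(mids_a, reversed(mids_b)) if COMPLEMENT.get(a) == b
    )
    return paired / len(mids_a)


if __name__ == "__main__":
    gene = "ATGGCTAAA"
    # A gene paired with its own reverse complement pairs at every codon.
    print(middle_base_pairing_frequency(gene, reverse_complement(gene)))
```

For unrelated sequences with uniform base composition, the expected frequency is 0.25, which is in line with the 0.22 to 0.27 range the abstract reports for its null models.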
Subject(s)
Amino Acyl-tRNA Synthetases/genetics , Evolution, Molecular , Histidine-tRNA Ligase/genetics , Tryptophan-tRNA Ligase/genetics , Bacteria/genetics , Base Sequence , Catalytic Domain , Codon , Protein Structure, Secondary
ABSTRACT
Mechanistic studies of Geobacillus stearothermophilus tryptophanyl-tRNA synthetase (TrpRS) afford an unusually detailed description (the escapement mechanism) for the distinct steps coupling catalysis to domain motion, efficiently converting the free energy of ATP hydrolysis into biologically useful alternative forms of information and work. Further elucidation of the escapement mechanism requires understanding thermodynamic linkages between domain configuration and conformational stability. To that end, we compare experimental thermal melting of fully liganded and apo TrpRS with a computational simulation of the melting of its fully liganded form. The simulation also provides important structural cameos at successively higher temperatures, enabling more confident interpretation. Experimental and simulated melting both proceed through a succession of three transitions at successively higher temperature. The low-temperature transition occurs at approximately the growth temperature of the organism and so may be functionally relevant but remains too subtle to characterize structurally. Structural metrics from the simulation imply that the two higher-temperature transitions entail forming a molten globular state followed by unfolding of secondary structures. Ligands that stabilize the enzyme in a pre-transition (PreTS) state compress the temperature range over which these transitions occur and sharpen the transitions to the molten globule and fully denatured states, while broadening the low-temperature transition. The experimental enthalpy changes provide a key parameter necessary to convert changes in melting temperature of combinatorial mutants into mutationally induced conformational free energy changes. The TrpRS urzyme, an excerpted model representing an early ancestral form, containing virtually the entire catalytic apparatus, remains largely intact at the highest simulated temperatures.
ABSTRACT
Technological advances in high-throughput microscopy have facilitated the acquisition of cell images at a rapid pace, and data pipelines can now extract and process thousands of image-based features from microscopy images. These features represent valuable single-cell phenotypes that contain information about cell state and biological processes. The use of these features for biological discovery is known as image-based or morphological profiling. However, these raw features need processing before use and image-based profiling lacks scalable and reproducible open-source software. Inconsistent processing across studies makes it difficult to compare datasets and processing steps, further delaying the development of optimal pipelines, methods, and analyses. To address these issues, we present Pycytominer, an open-source software package with a vibrant community that establishes an image-based profiling standard. Pycytominer has a simple, user-friendly Application Programming Interface (API) that implements image-based profiling functions for processing high-dimensional morphological features extracted from microscopy images of cells. Establishing Pycytominer as a standard image-based profiling toolkit ensures consistent data processing pipelines with data provenance, therefore minimizing potential inconsistencies and enabling researchers to confidently derive accurate conclusions and discover novel insights from their data, thus driving progress in our field.
ABSTRACT
We present the nELISA, a high-throughput, high-fidelity, and high-plex protein profiling platform. DNA oligonucleotides are used to pre-assemble antibody pairs on spectrally encoded microparticles and perform displacement-mediated detection. Spatial separation between non-cognate antibodies prevents the rise of reagent-driven cross-reactivity, while read-out is performed cost-efficiently and at high throughput using flow cytometry. We assembled an inflammatory panel of 191 targets that were multiplexed without cross-reactivity or impact on performance versus 1-plex signals, with sensitivities as low as 0.1 pg/mL and measurements spanning 7 orders of magnitude. We then performed a large-scale secretome perturbation screen of peripheral blood mononuclear cells (PBMCs), with cytokines as both perturbagens and read-outs, measuring 7,392 samples and generating ~1.5M protein datapoints in under a week, a significant advance in throughput compared to other highly multiplexed immunoassays. We uncovered 447 significant cytokine responses, including multiple putatively novel ones, that were conserved across donors and stimulation conditions. We also validated the nELISA's use in phenotypic screening and propose its application to drug discovery.
ABSTRACT
In image-based profiling, software extracts thousands of morphological features of cells from multi-channel fluorescence microscopy images, yielding single-cell profiles that can be used for basic research and drug discovery. Powerful applications have been proven, including clustering chemical and genetic perturbations on the basis of their similar morphological impact, identifying disease phenotypes by observing differences in profiles between healthy and diseased cells and predicting assay outcomes by using machine learning, among many others. Here, we provide an updated protocol for the most popular assay for image-based profiling, Cell Painting. Introduced in 2013, it uses six stains imaged in five channels and labels eight diverse components of the cell: DNA, cytoplasmic RNA, nucleoli, actin, Golgi apparatus, plasma membrane, endoplasmic reticulum and mitochondria. The original protocol was updated in 2016 on the basis of several years' experience running it at two sites, after optimizing it by visual stain quality. Here, we describe the work of the Joint Undertaking for Morphological Profiling Cell Painting Consortium to improve upon the assay via quantitative optimization by measuring the assay's ability to detect morphological phenotypes and group similar perturbations together. The assay gives very robust outputs despite various changes to the protocol, and two vendors' dyes work equivalently well. We present Cell Painting version 3, in which some steps are simplified and several stain concentrations can be reduced, saving costs. Cell culture and image acquisition take 1-2 weeks for typically sized batches of ≤20 plates; feature extraction and data analysis take an additional 1-2 weeks. This protocol is an update to Nat. Protoc. 11, 1757-1774 (2016): https://doi.org/10.1038/nprot.2016.105.
Subject(s)
Cell Culture Techniques , Image Processing, Computer-Assisted , Image Processing, Computer-Assisted/methods , Microscopy, Fluorescence , Mitochondria , Software
ABSTRACT
Morphological and gene expression profiling can cost-effectively capture thousands of features in thousands of samples across perturbations by disease, mutation, or drug treatments, but it is unclear to what extent the two modalities capture overlapping versus complementary information. Here, using both the L1000 and Cell Painting assays to profile gene expression and cell morphology, respectively, we perturb human A549 lung cancer cells with 1,327 small molecules from the Drug Repurposing Hub across six doses, providing a data resource including dose-response data from both assays. The two assays capture both shared and complementary information for mapping cell state. Cell Painting profiles from compound perturbations are more reproducible and show more diversity but measure fewer distinct groups of features. Applying unsupervised and supervised methods to predict compound mechanisms of action (MOAs) and gene targets, we find that the two assays provide not only a partially shared but also a complementary view of drug mechanisms. Given the numerous applications of profiling in biology, our analyses provide guidance for planning experiments that profile cells for detecting distinct cell types, disease phenotypes, and response to chemical or genetic perturbations.
Subject(s)
Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Phenotype
ABSTRACT
Image-based profiling is a maturing strategy by which the rich information present in biological images is reduced to a multidimensional profile, a collection of extracted image-based features. These profiles can be mined for relevant patterns, revealing unexpected biological activity that is useful for many steps in the drug discovery process. Such applications include identifying disease-associated screenable phenotypes, understanding disease mechanisms and predicting a drug's activity, toxicity or mechanism of action. Several of these applications have been recently validated and have moved into production mode within academia and the pharmaceutical industry. Some of these have yielded disappointing results in practice but are now of renewed interest due to improved machine-learning strategies that better leverage image-based information. Although challenges remain, novel computational technologies such as deep learning and single-cell methods that better capture the biological information in images hold promise for accelerating drug discovery.
Subject(s)
Drug Discovery/methods , Drug Industry/methods , Image Processing, Computer-Assisted/methods , Machine Learning , Animals , Computational Biology/methods , Computational Biology/trends , Drug Discovery/trends , Drug Industry/trends , High-Throughput Screening Assays/methods , High-Throughput Screening Assays/trends , Humans , Image Processing, Computer-Assisted/trends , Machine Learning/trends
ABSTRACT
PATH algorithms for identifying conformational transition states provide computational parameters (time to the transition state, conformational free energy differences, and transition state activation energies) for comparison to experimental data and can be carried out sufficiently rapidly to use in "high throughput" mode. These advantages are especially useful for interpreting results from combinatorial mutagenesis experiments. This report updates the previously published algorithm with enhancements that improve correlations between PATH convergence parameters derived from virtual variant structures generated by RosettaBackrub and previously published kinetic data for a complete, four-way combinatorial mutagenesis of a conformational switch in tryptophanyl-tRNA synthetase.
ABSTRACT
We measured and cross-validated the energetics of networks in Bacillus stearothermophilus tryptophanyl-tRNA synthetase (TrpRS) using both multi-mutant and modular thermodynamic cycles. Multi-dimensional combinatorial mutagenesis showed that four side chains from this "molecular switch" move coordinately with the active-site Mg²⁺ ion as the active site preorganizes to stabilize the transition state for amino acid activation. A modular thermodynamic cycle consisting of full-length TrpRS, its Urzyme, and the Urzyme plus each of the two domains deleted in the Urzyme gives similar energetics. These dynamic linkages, although unlikely to stabilize the transition state directly, consign the active-site preorganization to domain motion, assuring coupled vectorial behavior.
ABSTRACT
PATH rapidly computes a path and a transition state between crystal structures by minimizing the Onsager-Machlup action. It requires input parameters whose range of values can generate different transition-state structures that cannot be uniquely compared with those generated by other methods. We outline modifications to estimate these input parameters to circumvent these difficulties and validate the PATH transition states by showing consistency between transition states derived by different algorithms for unrelated protein systems. Although functional protein conformational change trajectories are to a degree stochastic, they nonetheless pass through a well-defined transition state whose detailed structural properties can rapidly be identified using PATH.
ABSTRACT
Mammalian gastric lipases are stable and active under acidic conditions and also in the duodenal lumen. There has been considerable interest in acid-stable lipases owing to their potential application in the treatment of pancreatic exocrine insufficiency. To gain insights into the domain movements of these enzymes, molecular dynamics simulations of human gastric lipase were performed at an acidic pH and under neutral conditions. For comparative studies, a simulation of dog gastric lipase was also performed at an acidic pH. Analyses show that, in addition to the lid region, there is another region of high mobility in these lipases. The potential role of this novel region is discussed.