Results 1 - 13 of 13
1.
Nucleic Acids Res ; 48(D1): D1153-D1163, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31665479

ABSTRACT

ProteomicsDB (https://www.ProteomicsDB.org) started as a protein-centric in-memory database for the exploration of large collections of quantitative mass spectrometry-based proteomics data. The data types and contents grew over time to include RNA-Seq expression data, drug-target interactions and cell line viability data. In this manuscript, we summarize new developments since the previous update that was published in Nucleic Acids Research in 2017. Over the past two years, we have enriched the data content with additional datasets and extended the platform to support protein turnover data. Another important new addition is that ProteomicsDB now supports the storage and visualization of data collected from other organisms, exemplified by Arabidopsis thaliana. Due to the generic design of ProteomicsDB, all analytical features available for the original human resource seamlessly transfer to other organisms. Furthermore, we introduce a new service in ProteomicsDB which allows users to upload their own expression datasets and analyze them alongside data stored in ProteomicsDB. Initially, users will be able to make use of this feature in the interactive heat map functionality as well as the drug sensitivity prediction, but ultimately will be able to use all analytical features of ProteomicsDB in this way.


Subjects
Biological Science Disciplines , Computational Biology/methods , Databases, Protein , Proteomics/methods , Research , Drug Discovery , Software , User-Computer Interface , Web Browser
2.
J Chem Inf Model ; 59(6): 2560-2571, 2019 06 24.
Article in English | MEDLINE | ID: mdl-31120751

ABSTRACT

Molecular patterns are widely used for compound filtering in molecular design endeavors. They describe structural properties that are connected with unwanted physical or chemical properties like reactivity or toxicity. With filter sets comprising hundreds of structural filters, an analytic approach to compare those patterns is needed. Here we present a novel approach to solve the generic pattern comparison problem. We introduce chemically inspired fingerprints for pattern nodes and edges to derive an easy-to-compare pattern representation. On two annotated pattern graphs we apply a maximum common subgraph algorithm enabling the calculation of pattern inclusion and similarity. The resulting algorithm can be used in many different ways. We can automatically derive pattern hierarchies or search in large pattern collections for more general or more specific patterns. To the best of our knowledge, the presented algorithm is the first of its kind enabling these types of chemical pattern analytics. Our new tool named SMARTScompare is an implementation of the approach for the SMARTS language, which is the quasi-standard for structural filters. We demonstrate the capabilities of SMARTScompare on a large collection of SMARTS patterns from real applications.


Subjects
Small Molecule Libraries/chemistry , Software , Algorithms , Cheminformatics/methods , Pattern Recognition, Automated/methods
3.
Nat Methods ; 16(6): 509-518, 2019 06.
Article in English | MEDLINE | ID: mdl-31133760

ABSTRACT

In mass-spectrometry-based proteomics, the identification and quantification of peptides and proteins heavily rely on sequence database searching or spectral library matching. The lack of accurate predictive models for fragment ion intensities impairs the realization of the full potential of these approaches. Here, we extended the ProteomeTools synthetic peptide library to 550,000 tryptic peptides and 21 million high-quality tandem mass spectra. We trained a deep neural network, termed Prosit, resulting in chromatographic retention time and fragment ion intensity predictions that exceed the quality of the experimental data. Integrating Prosit into database search pipelines led to more identifications at >10× lower false discovery rates. We show the general applicability of Prosit by predicting spectra for proteases other than trypsin, generating spectral libraries for data-independent acquisition and improving the analysis of metaproteomes. Prosit is integrated into ProteomicsDB, allowing search result re-scoring and custom spectral library generation for any organism on the basis of peptide sequence alone.


Subjects
Deep Learning , Neural Networks, Computer , Peptide Fragments/analysis , Peptide Library , Proteome/analysis , Software , Tandem Mass Spectrometry/methods , Animals , Caenorhabditis elegans/metabolism , Databases, Protein , Drosophila melanogaster/metabolism , HEK293 Cells , Humans , Peptide Fragments/metabolism , Proteome/metabolism , Saccharomyces cerevisiae/metabolism
4.
Nucleic Acids Res ; 46(D1): D1271-D1281, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106664

ABSTRACT

ProteomicsDB (https://www.ProteomicsDB.org) is a protein-centric in-memory database for the exploration of large collections of quantitative mass spectrometry-based proteomics data. ProteomicsDB was first released in 2014 to enable the interactive exploration of the first draft of the human proteome. To date, it contains quantitative data from 78 projects totalling over 19k LC-MS/MS experiments. A standardized analysis pipeline enables comparisons between multiple datasets to facilitate the exploration of protein expression across hundreds of tissues, body fluids and cell lines. We recently extended the data model to enable the storage and integrated visualization of other quantitative omics data. This includes transcriptomics data from e.g. NCBI GEO, protein-protein interaction information from STRING, functional annotations from KEGG, drug-sensitivity/selectivity data from several public sources and reference mass spectra from the ProteomeTools project. The extended functionality transforms ProteomicsDB into a multi-purpose resource connecting quantification and meta-data for each protein. The rich user interface helps researchers to navigate all data sources in either a protein-centric or multi-protein-centric manner. Several options are available to download data manually, while our application programming interface enables accessing quantitative data systematically.


Subjects
Databases, Protein , Tandem Mass Spectrometry , Cell Survival , Data Display , Humans , Internet , Pharmaceutical Preparations/metabolism , Protein Interaction Maps , Proteins/chemistry , Proteins/metabolism , Proteomics
5.
Science ; 358(6367), 2017 12 01.
Article in English | MEDLINE | ID: mdl-29191878

ABSTRACT

Kinase inhibitors are important cancer therapeutics. Polypharmacology is commonly observed, requiring thorough target deconvolution to understand drug mechanism of action. Using chemical proteomics, we analyzed the target spectrum of 243 clinically evaluated kinase drugs. The data revealed previously unknown targets for established drugs, offered a perspective on the "druggable" kinome, highlighted (non)kinase off-targets, and suggested potential therapeutic applications. Integration of phosphoproteomic data refined drug-affected pathways, identified response markers, and strengthened rationale for combination treatments. We exemplify translational value by discovering SIK2 (salt-inducible kinase 2) inhibitors that modulate cytokine production in primary cells, by identifying drugs against the lung cancer survival marker MELK (maternal embryonic leucine zipper kinase), and by repurposing cabozantinib to treat FLT3-ITD-positive acute myeloid leukemia. This resource, available via the ProteomicsDB database, should facilitate basic, clinical, and drug discovery research and aid clinical decision-making.


Subjects
Antineoplastic Agents/pharmacology , Drug Discovery/methods , Molecular Targeted Therapy , Protein Kinase Inhibitors/pharmacology , Proteomics/methods , Animals , Antineoplastic Agents/chemistry , Cell Line, Tumor , Cytokines/metabolism , Humans , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/enzymology , Lung Neoplasms/drug therapy , Lung Neoplasms/enzymology , Mice , Protein Kinase Inhibitors/chemistry , Protein Serine-Threonine Kinases/antagonists & inhibitors , Xenograft Model Antitumor Assays , fms-Like Tyrosine Kinase 3/antagonists & inhibitors
6.
Nat Methods ; 14(3): 259-262, 2017 03.
Article in English | MEDLINE | ID: mdl-28135259

ABSTRACT

We describe ProteomeTools, a project building molecular and digital tools from the human proteome to facilitate biomedical research. Here we report the generation and multimodal liquid chromatography-tandem mass spectrometry analysis of >330,000 synthetic tryptic peptides representing essentially all canonical human gene products, and we exemplify the utility of these data in several applications. The resource (available at http://www.proteometools.org) will be extended to >1 million peptides, and all data will be shared with the community via ProteomicsDB and ProteomeXchange.


Subjects
Chromatography, Liquid/methods , Proteome/analysis , Proteomics/methods , Tandem Mass Spectrometry/methods , Databases, Protein , Genome, Human/genetics , Humans
7.
Nat Methods ; 13(9): 741-8, 2016 08 30.
Article in English | MEDLINE | ID: mdl-27575624

ABSTRACT

High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.


Subjects
Computational Biology/methods , Electronic Data Processing , Mass Spectrometry/methods , Proteomics/methods , Software , Aging/blood , Blood Proteins/chemistry , Humans , Molecular Sequence Annotation , Proteogenomics/methods , Workflow
8.
Bioinformatics ; 30(24): 3484-90, 2014 Dec 15.
Article in English | MEDLINE | ID: mdl-25028727

ABSTRACT

MOTIVATION: The landscape of structural variation (SV) including complex duplication and translocation patterns is far from resolved. SV detection tools usually exhibit low agreement, are often geared toward certain types or size ranges of variation and struggle to correctly classify the type and exact size of SVs. RESULTS: We present Gustaf (Generic mUlti-SpliT Alignment Finder), a sound generic multi-split SV detection tool that detects and classifies deletions, inversions, dispersed duplications and translocations of ≥ 30 bp. Our approach is based on a generic multi-split alignment strategy that can identify SV breakpoints with base pair resolution. We show that Gustaf correctly identifies SVs, especially in the range from 30 to 100 bp, which we call the next-generation sequencing (NGS) twilight zone of SVs, as well as larger SVs >500 bp. Gustaf performs better than similar tools in our benchmark and is furthermore able to correctly identify size and location of dispersed duplications and translocations, which otherwise might be wrongly classified, for example, as large deletions.


Subjects
Genomic Structural Variation , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Humans , Sequence Alignment , Sequence Deletion , Software , Translocation, Genetic
9.
J Chem Inf Model ; 53(7): 1676-88, 2013 Jul 22.
Article in English | MEDLINE | ID: mdl-23751070

ABSTRACT

Retrieving molecules with specific structural features is a fundamental requirement of today's molecular database technologies. Estimates claim the chemical space relevant for drug discovery to be around 10⁶⁰ molecules. This figure is many orders of magnitude larger than the number of molecules conventional databases retain today and will store in the future. An elegant description of such a large chemical space is provided by the concept of fragment spaces. A fragment space comprises fragments, which are molecules with open valences, and rules for how to connect these fragments to products. Due to the combinatorial nature of fragment spaces, a complete enumeration of their products is intractable. We present an algorithm to search fragment spaces for generic chemical patterns as present in the SMARTS chemical pattern language. Our method allows specification of the chemical surrounding of an atom in a query and, therefore, enables a chemically intuitive search. During the search, the costly enumeration of products is avoided. The result is a fragment space that exactly describes all possible molecules that contain the user-defined pattern. We evaluated the algorithm in three different drug development use-cases and performed a large-scale statistical analysis with 738 SMARTS patterns on three publicly available fragment spaces. Our results show the ability of the algorithm to explore the chemical space around known active molecules, to analyze fragment spaces for the presence of likely toxic molecules, and to identify complex macromolecular structures under additional structural constraints. By searching the fragment space in its nonenumerated form, spaces covering up to 10¹⁹ molecules can be examined in times ranging between 47 s and 19 min depending on the complexity of the query pattern.


Subjects
Algorithms , Drug Discovery/methods
10.
J Med Chem ; 56(5): 2016-28, 2013 Mar 14.
Article in English | MEDLINE | ID: mdl-23379567

ABSTRACT

Crystal structure databases offer ample opportunities to derive small molecule conformation preferences, but the derived knowledge is not systematically applied in drug discovery research. We address this gap by a comprehensive and extendable expert system enabling quick assessment of the probability of a given conformation to occur. It is based on a hierarchical system of torsion patterns that cover a large part of druglike chemical space. Each torsion pattern has associated frequency histograms generated from CSD and PDB data and, derived from the histograms, traffic-light rules for frequently observed, rare, and highly unlikely torsion ranges. Structures imported into the corresponding software are annotated according to these rules. We present the concept behind the tree of torsion patterns, the design of an intuitive user interface for the management and usage of the torsion library, and we illustrate how the system helps analyze and understand conformation properties of substructures widely used in medicinal chemistry.
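
The traffic-light idea in this abstract can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the published method: the 10-degree bin width, the two frequency thresholds, and the function name `classify_torsion` are all hypothetical, and the rule here is based only on the relative frequency of the histogram bin that contains the angle.

```python
from typing import List

BIN_WIDTH = 10  # degrees per histogram bin (assumed, not from the paper)


def classify_torsion(angle_deg: float, counts: List[int],
                     frequent: float = 0.05, rare: float = 0.01) -> str:
    """Classify a torsion angle as 'green', 'yellow' or 'red'.

    counts holds 360 / BIN_WIDTH histogram counts covering [-180, 180).
    The thresholds are hypothetical fractions of all observations.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    # Wrap the angle into [0, 360) relative to -180 and locate its bin.
    wrapped = (angle_deg + 180.0) % 360.0
    index = int(wrapped // BIN_WIDTH) % len(counts)
    fraction = counts[index] / total
    if fraction >= frequent:
        return "green"   # frequently observed
    if fraction >= rare:
        return "yellow"  # rare
    return "red"         # highly unlikely


# Toy histogram: most observations near 0 degrees, a few near -180.
toy_counts = [0] * 36
toy_counts[18] = 95  # bin [0, 10)
toy_counts[0] = 4    # bin [-180, -170)

print(classify_torsion(5.0, toy_counts))
print(classify_torsion(-175.0, toy_counts))
print(classify_torsion(90.0, toy_counts))
```

With the toy histogram, an angle in the dominant bin is flagged green, one in the sparsely populated bin yellow, and one in an empty bin red.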


Subjects
Drug Design , Molecular Conformation , Torsion, Mechanical , Databases, Factual , Drug Discovery , Models, Molecular , Rotation , Software
11.
J Chem Inf Model ; 52(12): 3181-9, 2012 Dec 21.
Article in English | MEDLINE | ID: mdl-23205736

ABSTRACT

A common task in drug development is the selection of compounds fulfilling specific structural features from a large data pool. While several methods that iteratively search through such data sets exist, their application is limited compared to the infinite character of molecular space. The introduction of the concept of fragment spaces (FSs), which are composed of molecular fragments and their connection rules, made the representation of large combinatorial data sets feasible. At the same time, search algorithms face the problem of structural features spanning over multiple fragments. Due to the combinatorial nature of FSs, an enumeration of all products is impossible. In order to overcome these time and storage issues, we present a method that is able to find substructures in FSs without explicit product enumeration. This is accomplished by splitting substructures into subsubstructures and mapping them onto fragments with respect to fragment connectivity rules. The method has been evaluated on three different drug discovery scenarios considering the exploration of a molecule class, the elaboration of decoration patterns for a molecular core, and the exhaustive query for peptides in FSs. FSs can be searched in seconds, and found products contain novel compounds not present in the PubChem database which may serve as hints for new lead structures.


Subjects
Drug Discovery/methods , Pharmaceutical Preparations/chemistry , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology
12.
J Cheminform ; 4(1): 13, 2012 Jul 31.
Article in English | MEDLINE | ID: mdl-22849361

ABSTRACT

BACKGROUND: Searching for substructures in molecules is among the most elementary tasks in cheminformatics and is nowadays part of virtually every cheminformatics software. The underlying algorithms, used over several decades, are designed for application to general graphs. Little effort has been spent on characterizing their performance on molecular graphs. Therefore, it is not clear how current substructure search algorithms behave on such special graphs. One of the main reasons why such an evaluation was not performed in the past was the absence of appropriate data sets. RESULTS: In this paper, we present a systematic evaluation of Ullmann's and the VF2 subgraph isomorphism algorithms on molecular data. The benchmark set consists of a collection of 1235 SMARTS substructure expressions and selected molecules from the ZINC database. The benchmark evaluates substructure search times for complete database scans as well as individual substructure-molecule pairs. In detail, we focus on the influence of substructure formulation and size, the impact of molecule size, and the ability of both algorithms to be used on multiple cores. CONCLUSIONS: The results show a clear superiority of the VF2 algorithm in all test scenarios. In general, both algorithms solve most instances in less than one millisecond, which we consider to be acceptable. Still, in direct comparison, VF2 is most often several-fold faster than Ullmann's algorithm. Additionally, Ullmann's algorithm shows a surprising number of run time outliers.
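
To make the substructure search task concrete, here is a minimal Python sketch of a plain backtracking subgraph matcher on atom-labelled graphs. It is neither Ullmann's algorithm nor VF2, both of which add stronger pruning, and the graph encoding (adjacency sets plus element labels) and the name `find_substructure` are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets


def find_substructure(pattern: Graph, pattern_labels: Dict[int, str],
                      target: Graph, target_labels: Dict[int, str]
                      ) -> Optional[Dict[int, int]]:
    """Return one mapping pattern node -> target node, or None if absent."""
    # Try highly connected pattern nodes first to fail early.
    order: List[int] = sorted(pattern, key=lambda n: -len(pattern[n]))
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def compatible(p: int, t: int) -> bool:
        if pattern_labels[p] != target_labels[t]:
            return False
        if len(pattern[p]) > len(target[t]):
            return False
        # Each already-mapped pattern neighbour must map to a neighbour of t.
        return all(mapping[q] in target[t] for q in pattern[p] if q in mapping)

    def extend(i: int) -> bool:
        if i == len(order):
            return True
        p = order[i]
        for t in target:
            if t not in used and compatible(p, t):
                mapping[p] = t
                used.add(t)
                if extend(i + 1):
                    return True
                del mapping[p]  # backtrack
                used.discard(t)
        return False

    return dict(mapping) if extend(0) else None


# Toy target: the heavy atoms of ethanol, C-C-O.
ethanol: Graph = {0: {1}, 1: {0, 2}, 2: {1}}
ethanol_labels = {0: "C", 1: "C", 2: "O"}

# Toy patterns: a C-O bond, and a lone nitrogen that cannot match.
c_o: Graph = {0: {1}, 1: {0}}
c_o_labels = {0: "C", 1: "O"}
nitrogen: Graph = {0: set()}
nitrogen_labels = {0: "N"}

print(find_substructure(c_o, c_o_labels, ethanol, ethanol_labels))
print(find_substructure(nitrogen, nitrogen_labels, ethanol, ethanol_labels))
```

The matcher checks only atom labels and connectivity; real SMARTS matching also has to handle bond types, logical atom expressions, and recursive environments.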

13.
J Chem Inf Model ; 50(9): 1529-35, 2010 Sep 27.
Article in English | MEDLINE | ID: mdl-20795706

ABSTRACT

The intuitive way of chemists to communicate molecules is via two-dimensional structure diagrams. The straightforward visual representations are mostly preferred to the often complicated systematic chemical names. For chemical patterns, however, no comparable visualization standards have evolved so far. Chemical patterns denoting descriptions of chemical features are needed whenever a set of molecules is filtered for certain properties. The currently available representations are constrained to linear molecular pattern languages which are hardly human readable and therefore keep chemists without computational background from systematically formulating patterns. Therefore, we introduce a new visualization concept for chemical patterns. The common standard concept of structure diagrams is extended to account for property descriptions and logic combinations of chemical features in patterns. As a first application of the new concept, we developed the SMARTSviewer, a tool that converts chemical patterns encoded in SMARTS strings to a visual representation. The graphic pattern depiction provides an overview of the specified chemical features, variations, and similarities without needing to decode the often cryptic linear expressions. Taking recent chemical publications from various fields, we demonstrate the wide application range of a graphical chemical pattern language.


Subjects
Molecular Structure