1.
Bioinformatics ; 37(1): 89-96, 2021 Apr 09.
Article in English | MEDLINE | ID: mdl-33416858

ABSTRACT

MOTIVATION: One avenue to address the paucity of clinically testable targets is to reinvestigate the druggable genome by tackling complicated types of targets such as Protein-Protein Interactions (PPIs). Given the challenge to target those interfaces with small chemical compounds, it has become clear that learning from successful examples of PPI modulation is a powerful strategy. Freely accessible databases of PPI modulators that provide the community with tractable chemical and pharmacological data, as well as powerful tools to query them, are therefore essential to stimulate new drug discovery projects on PPI targets. RESULTS: Here, we present the new version of iPPI-DB, our manually curated database of PPI modulators. In this completely redesigned version of the database, we introduce a new web interface relying on crowdsourcing for the maintenance of the database. This interface was created to enable community contributions, whereby external experts can suggest new database entries. Moreover, the data model, the graphical interface, and the tools to query the database have been completely modernized and improved. We added new PPI modulators, new PPI targets and extended our focus to stabilizers of PPIs as well. AVAILABILITY AND IMPLEMENTATION: The iPPI-DB server is available at https://ippidb.pasteur.fr. The source code for this server is available at https://gitlab.pasteur.fr/ippidb/ippidb-web/ and is distributed under the GPL licence (http://www.gnu.org/licences/gpl). Queries can be shared through persistent links according to the FAIR data standards. Data can be downloaded from the website as CSV files. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2.
Methods Mol Biol ; 2075: 265-283, 2020.
Article in English | MEDLINE | ID: mdl-31584169

ABSTRACT

We present a computational method to identify conjugative systems in plasmids and chromosomes using the CONJscan module of MacSyFinder. The method relies on the identification of the protein components of the system using hidden Markov model profiles and then checking that the composition and genetic organization of the system are consistent with those expected from a conjugative system. The method can be accessed online using the Galaxy workflow or locally using standalone software. The latter version allows users to modify the models of the module (i.e., to change the expected components, their number, and their organization). CONJscan identifies conjugative systems, but when the mobile genetic element is integrative (ICE), one often also wants to delimit it from the chromosome. We present a method, with a script, to use the results of CONJscan and comparative genomics to delimit ICEs in chromosomes. The method provides a visual representation of the ICE location. Together, these methods facilitate the identification of conjugative elements in bacterial genomes.


Subjects
Computational Biology/methods , Genetic Conjugation , Horizontal Gene Transfer , Plasmids/genetics , Software , DNA Transposable Elements , Bacterial Genome , Genomic Islands , Genomics
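
The two-step logic described in the abstract above (find component hits, then check composition and genetic organization) can be illustrated with a short Python sketch. This is not MacSyFinder or CONJscan code: the hit format, the component names, the `max_gap` parameter and both function names are hypothetical, and the profile search itself is assumed to have been done already.

```python
from typing import List, Set, Tuple

# Hypothetical hit format: (gene position along the replicon, component name).
Hit = Tuple[int, str]


def cluster_hits(hits: List[Hit], max_gap: int) -> List[List[Hit]]:
    """Group hits whose gene positions are at most max_gap genes apart."""
    clusters: List[List[Hit]] = []
    for hit in sorted(hits):
        if clusters and hit[0] - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append(hit)   # close enough: extend the current cluster
        else:
            clusters.append([hit])     # too far away: start a new cluster
    return clusters


def find_systems(hits: List[Hit], mandatory: Set[str], max_gap: int = 5) -> List[List[Hit]]:
    """Keep only the clusters that contain every mandatory component."""
    return [
        cluster
        for cluster in cluster_hits(hits, max_gap)
        if mandatory <= {name for _, name in cluster}
    ]


# Illustrative component names only; real models define their own sets.
example_hits = [(10, "relaxase"), (12, "T4CP"), (13, "VirB4"), (80, "VirB4")]
systems = find_systems(example_hits, {"relaxase", "T4CP", "VirB4"})
```

In this toy input the isolated hit at position 80 forms its own cluster and is rejected because it lacks the other mandatory components.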
3.
Nucleic Acids Res ; 47(W1): W260-W265, 2019 07 02.
Article in English | MEDLINE | ID: mdl-31028399

ABSTRACT

Phylogeny.fr, created in 2008, has been designed to facilitate the execution of phylogenetic workflows, and is nowadays widely used. However, since its development, user needs have evolved, new tools and workflows have been published, and the number of jobs has increased dramatically, thus promoting new practices, which motivated its refactoring. We developed NGPhylogeny.fr to be more flexible in terms of tools and workflows, easily installable, and more scalable. It integrates numerous tools in their latest version (e.g. TNT, FastME, MrBayes) as well as new ones designed in the last ten years (e.g. PhyML, SMS, FastTree, trimAl, BOOSTER). These tools cover a large range of usage (sequence searching, multiple sequence alignment, model selection, tree inference and tree drawing) and a large panel of standard methods (distance, parsimony, maximum likelihood and Bayesian). They are integrated in workflows, which have already been configured ('One click'), can be customized ('Advanced'), or are built from scratch ('A la carte'). Workflows are managed and run by an underlying Galaxy workflow system, which makes workflows more scalable in terms of number of jobs and size of data. NGPhylogeny.fr is deployable on any server or personal computer, and is freely accessible at https://ngphylogeny.fr.


Subjects
Factual Databases , Internet , Phylogeny , Software
4.
Cell Syst ; 6(6): 752-758.e1, 2018 06 27.
Article in English | MEDLINE | ID: mdl-29953864

ABSTRACT

The primary problem with the explosion of biomedical datasets is not the data, not computational resources, and not the required storage space, but the general lack of trained and skilled researchers to manipulate and analyze these data. Eliminating this problem requires development of comprehensive educational resources. Here we present a community-driven framework that enables modern, interactive teaching of data analytics in life sciences and facilitates the development of training materials. The key feature of our system is that it is not a static but a continuously improved collection of tutorials. By coupling tutorials with a web-based analysis framework, biomedical researchers can learn by performing computation themselves through a web browser without the need to install software or search for example datasets. Our ultimate goal is to expand the breadth of training materials to include fundamental statistical and data science topics and to precipitate a complete re-engineering of undergraduate and graduate curricula in life sciences. This project is accessible at https://training.galaxyproject.org.


Subjects
Computational Biology/education , Computational Biology/methods , Research Personnel/education , Curriculum , Data Analysis , Distance Education/methods , Distance Education/trends , Humans , Software
5.
BMC Genomics ; 18(1): 553, 2017 07 21.
Article in English | MEDLINE | ID: mdl-28732463

ABSTRACT

BACKGROUND: While eukaryotic noncoding RNAs have recently received intense scrutiny, it is becoming clear that bacterial transcription is at least as pervasive. Bacterial small RNAs and antisense RNAs (sRNAs) are often assumed to be noncoding, due to their lack of long open reading frames (ORFs). However, there are numerous examples of sRNAs encoding small proteins, whether or not they also have a regulatory role at the RNA level. METHODS: Here, we apply flexible machine learning techniques based on sequence features and comparative genomics to quantify the prevalence of sRNA ORFs under natural selection to maintain protein-coding function in 14 phylogenetically diverse bacteria. Importantly, we quantify uncertainty in our predictions, and follow up on them using mass spectrometry proteomics and comparison to datasets including ribosome profiling. RESULTS: A majority of annotated sRNAs have at least one ORF between 10 and 50 amino acids long, and we conservatively predict that 409±191.7 unannotated sRNA ORFs are under selection to maintain coding (mean estimate and 95% confidence interval), an average of 29 per species considered here. This implies that overall at least 10.3±0.5% of sRNAs have a coding ORF, and in some species around 20% do. 165±69 of these novel coding ORFs have some antisense overlap to annotated ORFs. As experimental validation, many of our predictions are translated in published ribosome profiling data and are identified via mass spectrometry shotgun proteomics. B. subtilis sRNAs with coding ORFs are enriched for high expression in biofilms and confluent growth, and S. pneumoniae sRNAs with coding ORFs are involved in virulence. sRNA coding ORFs are enriched for transmembrane domains and many are predicted novel components of type I toxin/antitoxin systems. CONCLUSIONS: We predict over two dozen new protein-coding genes per bacterial species, but crucially also quantify the uncertainty in this estimate. Our predictions for sRNA coding ORFs, along with predicted novel type I toxins and tools for sorting and visualizing genomic context, are freely available in a user-friendly format at http://disco-bac.web.pasteur.fr. We expect these easily accessible predictions to be a valuable tool for the study not only of bacterial sRNAs and type I toxin-antitoxin systems, but also of bacterial genetics and genomics.


Subjects
Bacteria/genetics , Peptides/genetics , Phylogeny , Bacterial RNA/genetics , Small Untranslated RNA/genetics , Antitoxins/genetics , Bacterial Toxins/genetics , Internet , Machine Learning , Molecular Sequence Annotation , Open Reading Frames/genetics , Ribosomes/genetics
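
To make concrete what "an ORF between 10 and 50 amino acids long" means in the abstract above, here is a minimal Python sketch that enumerates candidate small ORFs in a nucleotide sequence. It is not the authors' machine learning method, which scores ORFs for selection; it only lists candidates. The simplifying assumptions are that only `ATG` is treated as a start codon, only the forward strand is scanned, and the function name and coordinate convention are chosen here for illustration.

```python
from typing import Iterator, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def small_orfs(seq: str, min_aa: int = 10, max_aa: int = 50) -> Iterator[Tuple[int, int, int]]:
    """Yield (start, end, length_in_aa) for forward-strand ORFs.

    start and end are 0-based and end-exclusive; the span includes the stop
    codon, while length_in_aa counts the codons before the stop (ATG included).
    """
    seq = seq.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3        # codons between start and stop
                if min_aa <= n_aa <= max_aa:
                    yield (start, i + 3, n_aa)
                start = None                   # look for the next ORF in this frame


# A toy sequence: ATG followed by 11 alanine codons and a stop codon.
toy = "ATG" + "GCT" * 11 + "TAA"
candidates = list(small_orfs(toy))
```

The reported average also follows directly from the figures in the abstract: 409 predicted ORFs over 14 species is about 29 per species.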
6.
Gigascience ; 6(6): 1-4, 2017 06 01.
Article in English | MEDLINE | ID: mdl-28402416

ABSTRACT

Background: Bioinformaticians routinely use multiple software tools and data sources in their day-to-day work and have been guided in their choices by a number of cataloguing initiatives. The ELIXIR Tools and Data Services Registry (bio.tools) aims to provide a central information point, independent of any specific scientific scope within bioinformatics or technological implementation. Meanwhile, efforts to integrate bioinformatics software in workbench and workflow environments have accelerated to enable the design, automation, and reproducibility of bioinformatics experiments. One such popular environment is the Galaxy framework, with currently more than 80 publicly available Galaxy servers around the world. In the context of a generic registry for bioinformatics software, such as bio.tools, Galaxy instances constitute a major source of valuable content. Yet there has been, to date, no convenient mechanism to register such services en masse. We present ReGaTE (Registration of Galaxy Tools in Elixir), a software utility that automates the process of registering the services available in a Galaxy instance. This utility uses the BioBlend application program interface to extract service metadata from a Galaxy server, enhance the metadata with the scientific information required by bio.tools, and push it to the registry. ReGaTE provides a fast and convenient way to publish Galaxy services in bio.tools. By doing so, service providers may increase the visibility of their services while enriching the software discovery function that bio.tools provides for its users. The source code of ReGaTE is freely available on GitHub at https://github.com/C3BI-pasteur-fr/ReGaTE.


Subjects
Computational Biology/methods , Automation , Computer Systems , Internet , Reproducibility of Results , Software , User-Computer Interface , Workflow
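
The extract, enhance and push pipeline described in the abstract above can be pictured with a small Python sketch of the middle step. This is not ReGaTE code. It assumes the tool descriptions have already been fetched from a Galaxy server (for example with BioBlend) as dictionaries carrying `id`, `name`, `version` and `description` keys. The output record layout, the homepage link pattern and both function names are hypothetical and do not reproduce the bio.tools schema.

```python
from typing import Dict, List


def to_registry_entry(tool: Dict[str, str], server_url: str) -> Dict[str, str]:
    """Map one Galaxy tool description to a simplified, hypothetical registry record."""
    return {
        "name": tool["name"],
        "version": tool.get("version", ""),
        "description": tool.get("description", ""),
        # Assumed link pattern pointing at the tool form on the Galaxy server.
        "homepage": f"{server_url.rstrip('/')}/root?tool_id={tool['id']}",
    }


def build_entries(tools: List[Dict[str, str]], server_url: str) -> List[Dict[str, str]]:
    """Convert tool descriptions, skipping incomplete or duplicate ones."""
    seen = set()
    entries = []
    for tool in tools:
        tool_id = tool.get("id")
        if not tool_id or not tool.get("name") or tool_id in seen:
            continue                      # nothing usable to register
        seen.add(tool_id)
        entries.append(to_registry_entry(tool, server_url))
    return entries


# Made-up tool descriptions and server URL, for illustration only.
example_tools = [
    {"id": "example_tool/1.0", "name": "Example tool", "version": "1.0", "description": "Demo"},
    {"id": "example_tool/1.0", "name": "Example tool", "version": "1.0", "description": "Demo"},
    {"name": "Tool without an id"},
]
entries = build_entries(example_tools, "https://galaxy.example.org/")
```

Keeping the mapping as a pure function separates it from the network calls, so it can be checked without a Galaxy server or registry credentials.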
7.
Nucleic Acids Res ; 44(D1): D38-47, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26538599

ABSTRACT

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.


Subjects
Computational Biology , Registries , Data Curation , Software
8.
F1000Res ; 4: 86, 2015.
Article in English | MEDLINE | ID: mdl-28451381

ABSTRACT

The detection and characterization of emerging infectious agents have been a continuing public health concern. High Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies have proven to be promising approaches for efficient and unbiased detection of pathogens in complex biological samples, providing access to comprehensive analyses. As NGS approaches typically yield millions of putatively representative reads per sample, efficient data management and visualization resources have become mandatory. Most often, those resources are implemented through a dedicated Laboratory Information Management System (LIMS), solely to provide perspective regarding the available information. We developed an easily deployable web interface, facilitating management and bioinformatics analysis of metagenomic data samples. It was engineered to run associated and dedicated Galaxy workflows for the detection and, eventually, classification of pathogens. The web application allows easy interaction with existing Galaxy metagenomic workflows and facilitates the organization, exploration and aggregation of the most relevant sample-specific sequences among millions of genomic sequences, allowing users to determine their relative abundance and associate them with the most closely related organism or pathogen. The user-friendly Django-based interface associates the users' input data and its metadata through a bio-IT provided set of resources (a Galaxy instance, and both sufficient storage and grid computing power). Galaxy is used to handle and analyze the user's input data, from loading through indexing, mapping, assembly and database searches. Interaction between our application and Galaxy is ensured by the BioBlend library, which gives API-based access to Galaxy's main features. Metadata about samples and runs, as well as the workflow results, are stored in the LIMS. For metagenomic classification and exploration purposes, we show, as a proof of concept, that integration of intuitive exploratory tools, like Krona for representation of taxonomic classification, can be achieved very easily. In the spirit of Galaxy, the interface enables the sharing of scientific results with fellow team members.

9.
Protein Sci ; 19(4): 847-67, 2010 Apr.
Article in English | MEDLINE | ID: mdl-20162627

ABSTRACT

Ligand-protein interactions are essential for biological processes, and precise characterization of protein binding sites is crucial to understand protein functions. MED-SuMo is a powerful technology to localize similar local regions on protein surfaces. Its heuristic is based on a 3D representation of macromolecules using specific surface chemical features associating chemical characteristics with geometrical properties. MED-SMA is an automated and fast method to classify binding sites. It is based on MED-SuMo technology, which builds a similarity graph, and it uses the Markov Clustering algorithm. Purine binding sites are well studied as drug targets. Here, purine binding sites of the Protein DataBank (PDB) are classified. Proteins potentially inhibited or activated through the same mechanism are gathered. Results are analyzed according to PROSITE annotations and to carefully refined functional annotations extracted from the PDB. As expected, binding sites associated with related mechanisms are gathered, for example, the Small GTPases. Nevertheless, protein kinases from different Kinome families are also found together, for example, Aurora-A and CDK2 proteins, which are inhibited by the same drugs. Representative examples of different clusters are presented. The effectiveness of the MED-SMA approach is demonstrated as it gathers binding sites of proteins with similar structure-activity relationships. Moreover, an efficient new protocol associates structures lacking cocrystallized ligands with the purine clusters, enabling those structures to be associated with a specific binding mechanism. Applications of this classification by binding mode similarity include target-based drug design and prediction of cross-reactivity and therefore potential toxic side effects.


Subjects
Carrier Proteins/classification , Purines/metabolism , Software , Algorithms , Binding Sites , Carrier Proteins/chemistry , Protein Databases , Ligands , Molecular Models , Protein Conformation , Purines/chemistry , Structure-Activity Relationship
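
The abstract above names the Markov Clustering algorithm as the step that turns a similarity graph into groups of binding sites. The following Python sketch shows the general idea of that algorithm (alternating expansion and inflation of a column-stochastic matrix) on a toy adjacency matrix. It is not MED-SMA code and does not use MED-SuMo similarities. The inflation value, the iteration count, the threshold used to read out clusters and all function names are assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def normalize_columns(m: Matrix) -> Matrix:
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def markov_cluster(adj: Matrix, inflation: float = 2.0, iterations: int = 30) -> List[List[int]]:
    """Cluster the nodes of a similarity graph given as an adjacency matrix."""
    n = len(adj)
    # Add self-loops, then turn similarities into transition probabilities.
    m = normalize_columns(
        [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    )
    for _ in range(iterations):
        m = matmul(m, m)                                                      # expansion
        m = normalize_columns([[x ** inflation for x in row] for row in m])   # inflation
    # Simplified read-out: every non-empty row lists the members of one cluster.
    clusters = []
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two triangles with no edge between them.
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
toy_clusters = markov_cluster(toy_graph)
```

Expansion spreads flow along longer paths, while inflation strengthens strong connections and weakens weak ones; a higher inflation value generally produces more, smaller clusters.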
10.
Drug Des Devel Ther ; 3: 59-72, 2009 Sep 21.
Article in English | MEDLINE | ID: mdl-19920922

ABSTRACT

Three-dimensional structural information is critical for understanding functional protein properties and the precise mechanisms of protein functions implicated in physiological and pathological processes. Comparison and detection of protein binding sites are key steps for annotating structures with functional predictions and are extremely valuable steps in a drug design process. In this research area, MED-SuMo is a powerful technology to detect and characterize similar local regions on protein surfaces. Each amino acid residue's potential chemical interactions are represented by specific surface chemical features (SCFs). The MED-SuMo heuristic is based on the representation of binding sites by a graph structure suitable for exploration by an efficient comparison algorithm. We use this approach to analyze one particular SCOP superfamily which includes HSP90 chaperone, MutL/DNA topoisomerase, histidine kinases, and alpha-ketoacid dehydrogenase kinase C (BCK). They share a common fold and a common region for ATP-binding. To analyze both similar and differing features of this fold, we use a novel classification method, the MED-SuMo multi approach (MED-SMA). We highlight common and distinct features of these proteins. The different clusters created by MED-SMA yield interesting observations. For instance, one cluster gathers three types of proteins (HSP90, topoisomerase VI, and BCK) which all bind the drug radicicol.

11.
J Comput Aided Mol Des ; 23(8): 571-82, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19533373

ABSTRACT

Eg5, a mitotic kinesin exclusively involved in the formation and function of the mitotic spindle, has attracted interest as an anticancer drug target. Eg5 is co-crystallized with several inhibitors bound to its allosteric binding pocket. Each of these occupies a pocket formed by loop 5/helix alpha2 (L5/alpha2). Recently designed inhibitors additionally occupy a hydrophobic pocket of this site. The goal of the present study was to explore this hydrophobic pocket with our MED-SuMo fragment-based protocol, and thus discover novel chemical structures that might bind as inhibitors. The MED-SuMo software is able to compare and superimpose similar interaction surfaces across the whole Protein Data Bank (PDB). In a fragment-based protocol, MED-SuMo retrieves MED-Portions that encode protein-fragment binding sites and are derived from cross-mining protein-ligand structures with libraries of small molecules. Furthermore, we have excluded intra-family MED-Portions derived from Eg5 ligands that occupy the hydrophobic pocket and predicted new potential ligands by hybridization that would simultaneously fill both pockets. Some of the latter, having original scaffolds and substituents in the hydrophobic pocket, are identified in libraries of synthetically accessible molecules by the MED-Search software.


Subjects
Drug Discovery , Kinesins/chemistry , Ligands , Small Molecule Libraries/chemistry , Allosteric Site , Computer-Aided Design , Humans , Hydrophobic and Hydrophilic Interactions , Kinesins/antagonists & inhibitors , Magnetic Resonance Spectroscopy , Protein Binding , Tertiary Protein Structure , Small Molecule Libraries/therapeutic use , Software , Spindle Apparatus/chemistry , Structure-Activity Relationship
12.
Infect Disord Drug Targets ; 9(3): 344-57, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19519487

ABSTRACT

Resolved three-dimensional protein structures are a major source of information for understanding protein functional properties. The current explosive growth of publicly available protein structures is producing large volumes of data for computational modelling and drug design methods. Target-based in silico drug design tools aid the design and optimization of compounds that bind to specific targets. MED-SuMo is a powerful technology for comparing local regions on protein surfaces, allowing similarities to be discovered and explored. This is a target-based tool that can exploit all available macromolecule structures. Its computational efficiency differentiates its approach from widely used methods such as docking and scoring, or map-based methods. As a result, MED-SuMo contributes to a large variety of real-world drug discovery applications. We review specific applications where MED-SuMo played a significant role. These examples include functional annotation, pocket profiling, structural superposition, and functional binding site classification. We also review cases where MED-SuMo provided an innovative solution to frequent undertakings of the medicinal chemist and molecular modeller during lead discovery and lead optimization. These further cases include drug repurposing and fragment-based drug design.


Subjects
Drug Design , Drug Discovery , Protein Conformation , Software , Binding Sites , Computational Biology/methods , Computer Simulation , Protein Databases , Molecular Models , Molecular Structure , Protein Binding , Structure-Activity Relationship
13.
J Chem Inf Model ; 49(2): 280-94, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19434830

ABSTRACT

The large volume of protein-ligand structures now available enables innovative and efficient protocols in computational FBDD (Fragment-Based Drug Design) to be proposed based on experimental data. In this work, we build a database of MED-Portions, where a MED-Portion is a new structural object encoding protein-fragment binding sites. MED-Portions are derived from mining all available protein-ligand structures with any library of small molecules. Combined with the MED-SuMo software to superpose similar protein interaction surfaces, pools of matching MED-Portions can be retrieved from any binding surface query. The rapidity of this technology allows its application to a diverse set of 107 protein binding sites. The selectivity of the protocol is shown by a qualitative correlation between the average hydrophobicity of the pools of MED-Portions and that of the binding sites. To generate hitlike molecules, MED-Portions are combined in 3D with the MED-Hybridise toolkit. Our MED-Portion/MED-SuMo/MED-Hybridise protocol is applied to two targets that represent important protein superfamilies in drug design: a protein kinase and a G-Protein Coupled Receptor (GPCR). We retrieved active molecules from PubChem bioassays for the two targets. The results show the potential for finding relevant leads from any protein 3D structure since the occurrence of interfamily MED-Portions is 25% for protein kinase and almost 100% for the GPCR.


Subjects
Protein Databases , Peptide Fragments/chemistry , Proteins/chemistry , Ligands , Molecular Models , Protein Binding