1.
Nucleic Acids Res ; 50(D1): D622-D631, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34986597

ABSTRACT

The Human Metabolome Database or HMDB (https://hmdb.ca) has been providing comprehensive reference information about human metabolites and their associated biological, physiological and chemical properties since 2007. Over the past 15 years, the HMDB has grown and evolved significantly to meet the needs of the metabolomics community and respond to continuing changes in internet and computing technology. This year's update, HMDB 5.0, brings a number of important improvements and upgrades to the database. These should make the HMDB more useful and more appealing to a larger cross-section of users. In particular, these improvements include: (i) a significant increase in the number of metabolite entries (from 114 100 to 217 920 compounds); (ii) enhancements to the quality and depth of metabolite descriptions; (iii) the addition of new structure, spectral and pathway visualization tools; (iv) the inclusion of many new and much more accurately predicted spectral data sets, including predicted NMR spectra, more accurately predicted MS spectra, predicted retention indices and predicted collision cross section data and (v) enhancements to the HMDB's search functions to facilitate better compound identification. Many other minor improvements and updates to the content, the interface, and general performance of the HMDB website have also been made. Overall, we believe these upgrades and updates should greatly enhance the HMDB's ease of use and its potential applications not only in human metabolomics but also in exposomics, lipidomics, nutritional science, biochemistry and clinical chemistry.


Subjects
Databases, Genetic , Metabolome/genetics , Metabolomics/classification , Humans , Lipidomics/classification , Mass Spectrometry , User-Computer Interface
2.
J Opt Soc Am A Opt Image Sci Vis ; 39(12): C133-C142, 2022 Dec 01.
Article in English | MEDLINE | ID: mdl-36520751

ABSTRACT

Astronomical instruments to detect exoplanets require extreme wavefront stability. For these missions to succeed, comprehensive and precise modeling is required to design and analyze suitable coronagraphs and their wavefront control systems. In this paper, we describe techniques for integrated modeling at scale that is, to the best of our knowledge, 1000 times faster than previously published works. We show how this capability has been used to validate performance and perform uncertainty quantification for the Roman Coronagraph instrument. Finally, we show how this modeling capacity may be necessary to design and build the next generation of space-based coronagraph instruments.

3.
Anal Chem ; 93(34): 11692-11700, 2021 08 31.
Article in English | MEDLINE | ID: mdl-34403256

ABSTRACT

In the field of metabolomics, mass spectrometry (MS) is the method most commonly used for identifying and annotating metabolites. As this typically involves matching a given MS spectrum against an experimentally acquired reference spectral library, this approach is limited by the coverage and size of such libraries (which typically number in the thousands). These experimental libraries can be greatly extended by predicting the MS spectra of known chemical structures (which number in the millions) to create computational reference spectral libraries. To facilitate the generation of predicted spectral reference libraries, we developed CFM-ID, a computer program that can accurately predict the ESI-MS/MS spectrum for a given compound structure. CFM-ID is one of the best-performing methods for compound-to-mass-spectrum prediction and also one of the top tools for in silico mass-spectrum-to-compound identification. This work improves CFM-ID's ability to predict ESI-MS/MS spectra from compounds by (1) learning parameters from features based on the molecular topology, (2) adding a new approach to ring cleavage that models such cleavage as a sequence of simple chemical bond dissociations, and (3) expanding its hand-written rule-based predictor to cover more chemical classes, including acylcarnitines, acylcholines, flavonols, flavones, flavanones, and flavonoid glycosides. We demonstrate that this new version of CFM-ID (version 4.0) is significantly more accurate than previous CFM-ID versions in terms of both ESI-MS/MS spectral prediction and compound identification. CFM-ID 4.0 is available at http://cfmid4.wishartlab.com/ as a web server and Docker images can be downloaded at https://hub.docker.com/r/wishartlab/cfmid.


Subjects
Flavones , Tandem Mass Spectrometry , Computer Simulation , Metabolomics , Software
4.
Brief Bioinform ; 20(4): 1560-1567, 2019 07 19.
Article in English | MEDLINE | ID: mdl-29028989

ABSTRACT

PHAST (PHAge Search Tool) and its successor PHASTER (PHAge Search Tool - Enhanced Release) have become two of the most widely used web servers for identifying putative prophages in bacterial genomes. Here we review the main capabilities of these web resources, provide some practical guidance regarding their use and discuss possible future improvements. PHAST, which was first described in 2011, made its debut just as whole bacterial genome sequencing was becoming inexpensive and relatively routine. PHAST quickly gained popularity among bacterial genome researchers because of its web accessibility, its ease of use along with its enhanced accuracy and rapid processing times. PHASTER, which appeared in 2016, provided a number of much-needed enhancements to the PHAST server, including greater processing speed (to cope with very large submission volumes), increased database sizes, a more modern user interface, improved graphical displays and support for metagenomic submissions. Continuing developments in the field, along with increased interest in automated phage and prophage finding, have already led to several improvements to the PHASTER server and will soon lead to the development of a successor to PHASTER (to be called PHASTEST).


Subjects
Genome, Bacterial , Prophages/genetics , Software , Computational Biology , Data Mining/trends , Databases, Genetic , Internet , Metagenomics , Search Engine/trends , Software/trends , User-Computer Interface
5.
Nucleic Acids Res ; 46(D1): D608-D617, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29140435

ABSTRACT

The Human Metabolome Database or HMDB (www.hmdb.ca) is a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, metabolic pathways, and reference spectra. First described in 2007, the HMDB is now considered the standard metabolomic resource for human metabolic studies. Over the past decade the HMDB has continued to grow and evolve in response to emerging needs for metabolomics researchers and continuing changes in web standards. This year's update, HMDB 4.0, represents the most significant upgrade to the database in its history. For instance, the number of fully annotated metabolites has increased by nearly threefold, the number of experimental spectra has grown by almost fourfold and the number of illustrated metabolic pathways has grown by a factor of almost 60. Significant improvements have also been made to the HMDB's chemical taxonomy, chemical ontology, spectral viewing, and spectral/text searching tools. A great deal of brand new data has also been added to HMDB 4.0. This includes large quantities of predicted MS/MS and GC-MS reference spectral data as well as predicted (physiologically feasible) metabolite structures to facilitate novel metabolite identification. Additional information on metabolite-SNP interactions and the influence of drugs on metabolite levels (pharmacometabolomics) has also been added. Many other important improvements in the content, the interface, and the performance of the HMDB website have been made and these should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science.


Subjects
Databases, Factual , Metabolome , Databases, Chemical , Gas Chromatography-Mass Spectrometry , Humans , Metabolic Networks and Pathways , Metabolomics , Nuclear Magnetic Resonance, Biomolecular , Tandem Mass Spectrometry , User-Computer Interface
6.
Nucleic Acids Res ; 45(D1): D440-D445, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899612

ABSTRACT

YMDB or the Yeast Metabolome Database (http://www.ymdb.ca/) is a comprehensive database containing extensive information on the genome and metabolome of Saccharomyces cerevisiae. Initially released in 2012, the YMDB has gone through a significant expansion and a number of improvements over the past 4 years. This manuscript describes the most recent version of YMDB (YMDB 2.0). More specifically, it provides an updated description of the database that was previously described in the 2012 NAR Database Issue and it details many of the additions and improvements made to the YMDB over that time. Some of the most important changes include a 7-fold increase in the number of compounds in the database (from 2007 to 16 042), a 430-fold increase in the number of metabolic and signaling pathway diagrams (from 66 to 28 734), a 16-fold increase in the number of compounds linked to pathways (from 742 to 12 733), a 17-fold increase in the numbers of compounds with nuclear magnetic resonance or MS spectra (from 783 to 13 173) and an increase in both the number of data fields and the number of links to external databases. In addition to these database expansions, a number of improvements to YMDB's web interface and its data visualization tools have been made. These additions and improvements should greatly improve the ease, the speed and the quantity of data that can be extracted, searched or viewed within YMDB. Overall, we believe these improvements should not only improve the understanding of the metabolism of S. cerevisiae, but also allow more in-depth exploration of its extensive metabolic networks, signaling pathways and biochemistry.
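
The fold increases quoted above are ratios of the new count to the old one. As a minimal sketch, the ratios can be recomputed from the before/after counts given in the abstract; the dictionary keys and the `fold_increase` helper are names chosen here for illustration, and the abstract's own figures may reflect a different rounding convention.

```python
# Before/after counts as quoted in the YMDB 2.0 abstract.
# The labels are illustrative names, not identifiers from YMDB itself.
counts = {
    "compounds": (2007, 16042),
    "pathway diagrams": (66, 28734),
    "compounds linked to pathways": (742, 12733),
    "compounds with NMR or MS spectra": (783, 13173),
}


def fold_increase(before: int, after: int) -> float:
    """Return after / before, i.e. how many times larger the new count is."""
    if before <= 0:
        raise ValueError("the earlier count must be positive")
    return after / before


fold = {name: fold_increase(b, a) for name, (b, a) in counts.items()}

for name, value in fold.items():
    print(f"{name}: {value:.1f}-fold")
```

Working from the raw counts rather than the rounded fold figures makes it easier to compare releases on a consistent basis.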


Subjects
Computational Biology/methods , Databases, Factual , Metabolome , Metabolomics , Software , Yeasts/metabolism , Metabolic Networks and Pathways , Metabolomics/methods , Web Browser
7.
J Biomol NMR ; 70(1): 33-51, 2018 01.
Article in English | MEDLINE | ID: mdl-29196969

ABSTRACT

Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations behind using NMR chemical shifts for protein threading lie in the fact that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often < 10 min/structure) and to significantly outperform other shift-based or threading-based structure determination methods (in terms of top template model accuracy), with an average TM-score performance of 0.68 (vs. 0.50-0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination, using only NMR chemical shifts, is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.


Subjects
Amino Acid Sequence , Protein Structure, Secondary , Proteins/chemistry , Nuclear Magnetic Resonance, Biomolecular/methods , Protein Conformation , Time Factors
8.
Nucleic Acids Res ; 44(W1): W16-21, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27141966

ABSTRACT

PHASTER (PHAge Search Tool - Enhanced Release) is a significant upgrade to the popular PHAST web server for the rapid identification and annotation of prophage sequences within bacterial genomes and plasmids. Although the steps in the phage identification pipeline in PHASTER remain largely the same as in the original PHAST, numerous software improvements and significant hardware enhancements have now made PHASTER faster, more efficient, more visually appealing and much more user friendly. In particular, PHASTER is now 4.3× faster than PHAST when analyzing a typical bacterial genome. More specifically, software optimizations have made the backend of PHASTER 2.7× faster than PHAST, while the addition of 80 CPUs to the PHASTER compute cluster is responsible for the remaining speed-up. PHASTER can now process a typical bacterial genome in 3 min from the raw sequence alone, or in 1.5 min when given a pre-annotated GenBank file. A number of other optimizations have also been implemented, including automated algorithms to reduce the size and redundancy of PHASTER's databases, improvements in handling multiple (metagenomic) queries and higher user traffic, along with the ability to perform automated look-ups against 14 000 previously PHAST/PHASTER annotated bacterial genomes (which can lead to complete phage annotations in seconds as opposed to minutes). PHASTER's web interface has also been entirely rewritten. A new graphical genome browser has been added, gene/genome visualization tools have been improved, and the graphical interface is now more modern, robust and user-friendly. PHASTER is available online at www.phaster.ca.
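
The abstract gives the overall speed-up and the software share but not the hardware share. A rough sketch of that remaining factor follows, assuming (this is not stated in the abstract) that the two speed-ups are independent and combine multiplicatively; the variable names are illustrative.

```python
# Figures quoted in the PHASTER abstract.
total_speedup = 4.3     # PHASTER vs. PHAST on a typical bacterial genome
software_speedup = 2.7  # backend software optimizations alone

# Assumption: independent speed-ups multiply, so
# total = software * hardware  =>  hardware = total / software.
hardware_speedup = total_speedup / software_speedup

print(f"Implied hardware speed-up: {hardware_speedup:.2f}x")
```

Under this assumption the added CPUs would account for a factor of roughly 1.6; if the contributions were not cleanly separable, the estimate would not hold.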


Subjects
Bacteria/genetics , Bacteriophages/genetics , DNA, Viral/genetics , Genome, Bacterial , Software , Algorithms , Bacteria/virology , Computer Graphics , Databases, Genetic , Gene Ontology , Molecular Sequence Annotation , Plasmids/chemistry , Plasmids/metabolism , Search Engine , Time Factors
9.
Nucleic Acids Res ; 44(W1): W147-53, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27190236

ABSTRACT

Heatmapper is a freely available web server that allows users to interactively visualize their data in the form of heat maps through an easy-to-use graphical interface. Unlike existing non-commercial heat map packages, which either lack graphical interfaces or are specialized for only one or two kinds of heat maps, Heatmapper is a versatile tool that allows users to easily create a wide variety of heat maps for many different data types and applications. More specifically, Heatmapper allows users to generate, cluster and visualize: (i) expression-based heat maps from transcriptomic, proteomic and metabolomic experiments; (ii) pairwise distance maps; (iii) correlation maps; (iv) image overlay heat maps; (v) latitude and longitude heat maps and (vi) geopolitical (choropleth) heat maps. Heatmapper offers a number of simple and intuitive customization options for facile adjustments to each heat map's appearance and plotting parameters. Heatmapper also allows users to interactively explore their numeric data values by hovering their cursor over each heat map cell, or by using a searchable/sortable data table view. Heat map data can be easily uploaded to Heatmapper in text, Excel or tab delimited formatted tables and the resulting heat map images can be easily downloaded in common formats including PNG, JPG and PDF. Heatmapper is designed to appeal to a wide range of users, including molecular biologists, structural biologists, microbiologists, epidemiologists, environmental scientists, agriculture/forestry scientists, fish and wildlife biologists, climatologists, geologists, educators and students. Heatmapper is available at http://www.heatmapper.ca.


Subjects
Body Surface Potential Mapping/methods , Chromosome Mapping/methods , Geographic Mapping , Protein Interaction Mapping/methods , Thermography/methods , User-Computer Interface , Animals , Computer Graphics , Gene Regulatory Networks , Humans , Information Storage and Retrieval , Internet , Metabolome , Proteome , Transcriptome
10.
Nucleic Acids Res ; 43(W1): W370-7, 2015 Jul 01.
Article in English | MEDLINE | ID: mdl-25979265

ABSTRACT

The Chemical Shift Index or CSI 3.0 (http://csi3.wishartlab.com) is a web server designed to accurately identify the location of secondary and super-secondary structures in protein chains using only nuclear magnetic resonance (NMR) backbone chemical shifts and their corresponding protein sequence data. Unlike earlier versions of CSI, which only identified three types of secondary structure (helix, β-strand and coil), CSI 3.0 now identifies a total of 11 types of secondary and super-secondary structures, including helices, β-strands, coil regions, five common β-turns (type I, II, I', II' and VIII), β hairpins as well as interior and edge β-strands. CSI 3.0 accepts experimental NMR chemical shift data in multiple formats (NMR Star 2.1, NMR Star 3.1 and SHIFTY) and generates colorful CSI plots (bar graphs) and secondary/super-secondary structure assignments. The output can be readily used as constraints for structure determination and refinement or the images may be used for presentations and publications. CSI 3.0 uses a pipeline of several well-tested, previously published programs to identify the secondary and super-secondary structures in protein chains. Comparisons with secondary and super-secondary structure assignments made via standard coordinate analysis programs such as DSSP, STRIDE and VADAR on high-resolution protein structures solved by X-ray and NMR show >90% agreement with the assignments made by CSI 3.0.


Subjects
Nuclear Magnetic Resonance, Biomolecular , Protein Structure, Secondary , Software , Algorithms , Internet
11.
Nucleic Acids Res ; 43(Database issue): D928-34, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25378312

ABSTRACT

The exposome is defined as the totality of all human environmental exposures from conception to death. It is often regarded as the complement to the genome, with the interaction between the exposome and the genome ultimately determining one's phenotype. The 'toxic exposome' is the complete collection of chronically or acutely toxic compounds to which humans can be exposed. Considerable interest in defining the toxic exposome has been spurred on by the realization that most human injuries, deaths and diseases are directly or indirectly caused by toxic substances found in the air, water, food, home or workplace. The Toxin-Toxin-Target Database (T3DB--www.t3db.ca) is a resource that was specifically designed to capture information about the toxic exposome. Originally released in 2010, the first version of T3DB contained data on nearly 2900 common toxic substances along with detailed information on their chemical properties, descriptions, targets, toxic effects, toxicity thresholds, sequences (for both targets and toxins), mechanisms and references. To more closely align itself with the needs of epidemiologists, toxicologists and exposome scientists, the latest release of T3DB has been substantially upgraded to include many more compounds (>3600), targets (>2000) and gene expression datasets (>15,000 genes). It now includes extensive data on 'normal' toxic compound concentrations in human biofluids as well as detailed chemical taxonomies, informative chemical ontologies and a large number of referential NMR, MS/MS and GC-MS spectra. This manuscript describes the most recent update to the T3DB, which was previously featured in the 2010 NAR Database Issue.


Subjects
Databases, Chemical , Environmental Exposure , Hazardous Substances/chemistry , Hazardous Substances/toxicity , Humans , Internet
12.
Nucleic Acids Res ; 42(Database issue): D1091-7, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24203711

ABSTRACT

DrugBank (http://www.drugbank.ca) is a comprehensive online database containing extensive biochemical and pharmacological information about drugs, their mechanisms and their targets. Since it was first described in 2006, DrugBank has rapidly evolved, both in response to user requests and in response to changing trends in drug research and development. Previous versions of DrugBank have been widely used to facilitate drug and in silico drug target discovery. The latest update, DrugBank 4.0, has been further expanded to contain data on drug metabolism, absorption, distribution, metabolism, excretion and toxicity (ADMET) and other kinds of quantitative structure activity relationships (QSAR) information. These enhancements are intended to facilitate research in xenobiotic metabolism (both prediction and characterization), pharmacokinetics, pharmacodynamics and drug design/discovery. For this release, >1200 drug metabolites (including their structures, names, activity, abundance and other detailed data) have been added along with >1300 drug metabolism reactions (including metabolizing enzymes and reaction types) and dozens of drug metabolism pathways. Another 30 predicted or measured ADMET parameters have been added to each DrugCard, bringing the average number of quantitative ADMET values for Food and Drug Administration-approved drugs close to 40. Referential nuclear magnetic resonance and MS spectra have been added for almost 400 drugs as well as spectral and mass matching tools to facilitate compound identification. This expanded collection of drug information is complemented by a number of new or improved search tools, including one that provides a simple analysis of drug-target, -enzyme and -transporter associations to provide insight on drug-drug interactions.


Subjects
Databases, Chemical , Drug Discovery , Pharmacokinetics , Internet , Pharmaceutical Preparations/chemistry , Quantitative Structure-Activity Relationship
13.
Nucleic Acids Res ; 42(Database issue): D478-84, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24203708

ABSTRACT

The Small Molecule Pathway Database (SMPDB, http://www.smpdb.ca) is a comprehensive, colorful, fully searchable and highly interactive database for visualizing human metabolic, drug action, drug metabolism, physiological activity and metabolic disease pathways. SMPDB contains >600 pathways with nearly 75% of its pathways not found in any other database. All SMPDB pathway diagrams are extensively hyperlinked and include detailed information on the relevant tissues, organs, organelles, subcellular compartments, protein cofactors, protein locations, metabolite locations, chemical structures and protein quaternary structures. Since its last release in 2010, SMPDB has undergone substantial upgrades and significant expansion. In particular, the total number of pathways in SMPDB has grown by >70%. Additionally, every previously entered pathway has been completely redrawn, standardized, corrected, updated and enhanced with additional molecular or cellular information. Many SMPDB pathways now include transporter proteins as well as much more physiological, tissue, target organ and reaction compartment data. Thanks to the development of a standardized pathway drawing tool (called PathWhiz) all SMPDB pathways are now much more easily drawn and far more rapidly updated. PathWhiz has also allowed all SMPDB pathways to be saved in a BioPAX format. Significant improvements to SMPDB's visualization interface now make the browsing, selection, recoloring and zooming of pathways far easier and far more intuitive. Because of its utility and breadth of coverage, SMPDB is now integrated into several other databases including HMDB and DrugBank.


Subjects
Databases, Chemical , Metabolic Networks and Pathways , Computer Graphics , Humans , Internet , Metabolic Diseases/metabolism , Pharmaceutical Preparations/metabolism , Proteins/chemistry , Proteins/metabolism
14.
J Biomol NMR ; 63(3): 255-64, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26345175

ABSTRACT

Over the past decade, a number of methods have been developed to determine the approximate structure of proteins using minimal NMR experimental information such as chemical shifts alone, sparse NOEs alone or a combination of comparative modeling data and chemical shifts. However, there have been relatively few methods that allow these approximate models to be substantively refined or improved using the available NMR chemical shift data. Here, we present a novel method, called Chemical Shift driven Genetic Algorithm for biased Molecular Dynamics (CS-GAMDy), for the robust optimization of protein structures using experimental NMR chemical shifts. The method incorporates knowledge-based scoring functions and structural information derived from NMR chemical shifts via a unique combination of multi-objective MD biasing, a genetic algorithm, and the widely used XPLOR molecular modelling language. Using this approach, we demonstrate that CS-GAMDy is able to refine and/or fold models that are as much as 10 Å (RMSD) away from the correct structure using only NMR chemical shift data. CS-GAMDy is also able to refine a wide range of approximate or mildly erroneous protein structures to more closely match the known/correct structure and the known/correct chemical shifts. We believe CS-GAMDy will allow protein models generated by sparse restraint or chemical-shift-only methods to achieve sufficiently high quality to be considered fully refined and "PDB worthy". The CS-GAMDy algorithm is explained in detail and its performance is compared over a range of refinement scenarios with several commonly used protein structure refinement protocols. The program has been designed to be easily installed and easily used and is available at http://www.gamdy.ca.


Subjects
Algorithms , Models, Molecular , Nuclear Magnetic Resonance, Biomolecular , Protein Conformation , Proteins/chemistry , Nuclear Magnetic Resonance, Biomolecular/methods
15.
J Biomol NMR ; 62(3): 387-401, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26078090

ABSTRACT

Accessible surface area (ASA) is the surface area of an atom, amino acid or biomolecule that is exposed to solvent. The calculation of a molecule's ASA requires three-dimensional coordinate data and the use of a "rolling ball" algorithm to both define and calculate the ASA. For polymers such as proteins, the ASA for individual amino acids is closely related to the hydrophobicity of the amino acid as well as its local secondary and tertiary structure. For proteins, ASA is a structural descriptor that can often be as informative as secondary structure. Consequently there has been considerable effort over the past two decades to try to predict ASA from protein sequence data and to use ASA information (derived from chemical modification studies) as a structure constraint. Recently it has become evident that protein chemical shifts are also sensitive to ASA. Given the potential utility of ASA estimates as structural constraints for NMR we decided to explore this relationship further. Using machine learning techniques (specifically a boosted tree regression model) we developed an algorithm called "ShiftASA" that combines chemical-shift and sequence derived features to accurately estimate per-residue fractional ASA values of water-soluble proteins. This method showed a correlation coefficient between predicted and experimental values of 0.79 when evaluated on a set of 65 independent test proteins, which was an 8.2 % improvement over the next best performing (sequence-only) method. On a separate test set of 92 proteins, ShiftASA reported a mean correlation coefficient of 0.82, which was 12.3 % better than the next best performing method. ShiftASA is available as a web server ( http://shiftasa.wishartlab.com ) for submitting input queries for fractional ASA calculation.
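
The evaluation above reports a correlation coefficient between predicted and experimental per-residue fractional ASA values. As a minimal sketch of that kind of comparison, the function below computes a Pearson correlation; the abstract does not say which correlation measure was used, so Pearson is an assumption here, and the sample values are made up for illustration and are not ShiftASA output.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (dev_x * dev_y)


# Hypothetical per-residue fractional ASA values (0 = buried, 1 = exposed).
experimental = [0.05, 0.40, 0.75, 0.10, 0.60, 0.90]
predicted = [0.10, 0.35, 0.70, 0.20, 0.55, 0.80]

print(f"r = {pearson(experimental, predicted):.2f}")
```

In a real benchmark the coefficient would be computed per protein and then averaged over the test set, which is one plausible reading of the "mean correlation coefficient" mentioned for the 92-protein set.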


Subjects
Nuclear Magnetic Resonance, Biomolecular/methods , Proteins/chemistry , Algorithms , Internet , Machine Learning , Software , Surface Properties
16.
Nucleic Acids Res ; 41(Database issue): D801-7, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23161693

ABSTRACT

The Human Metabolome Database (HMDB) (www.hmdb.ca) is a resource dedicated to providing scientists with the most current and comprehensive coverage of the human metabolome. Since its first release in 2007, the HMDB has been used to facilitate research for nearly 1000 published studies in metabolomics, clinical biochemistry and systems biology. The most recent release of HMDB (version 3.0) has been significantly expanded and enhanced over the 2009 release (version 2.0). In particular, the number of annotated metabolite entries has grown from 6500 to more than 40,000 (a 600% increase). This enormous expansion is a result of the inclusion of both 'detected' metabolites (those with measured concentrations or experimental confirmation of their existence) and 'expected' metabolites (those for which biochemical pathways are known or human intake/exposure is frequent but the compound has yet to be detected in the body). The latest release also has greatly increased the number of metabolites with biofluid or tissue concentration data, the number of compounds with reference spectra and the number of data fields per entry. In addition to this expansion in data quantity, new database visualization tools and new data content have been added or enhanced. These include better spectral viewing tools, more powerful chemical substructure searches, an improved chemical taxonomy and better, more interactive pathway maps. This article describes these enhancements to the HMDB, which was previously featured in the 2009 NAR Database Issue. (Note to referees, HMDB 3.0 will go live on 18 September 2012.).


Subjects
Databases, Chemical , Metabolome , Metabolomics , Humans , Internet , Mass Spectrometry , Nuclear Magnetic Resonance, Biomolecular , User-Computer Interface
17.
Nucleic Acids Res ; 40(Web Server issue): W88-95, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22645318

ABSTRACT

With recent improvements in DNA sequencing and sample extraction techniques, the quantity and quality of metagenomic data are now growing exponentially. This abundance of richly annotated metagenomic data and bacterial census information has spawned a new branch of microbiology called comparative metagenomics. Comparative metagenomics involves the comparison of bacterial populations between different environmental samples, different culture conditions or different microbial hosts. However, in order to do comparative metagenomics, one typically requires a sophisticated knowledge of multivariate statistics and/or advanced software programming skills. To make comparative metagenomics more accessible to microbiologists, we have developed a freely accessible, easy-to-use web server for comparative metagenomic analysis called METAGENassist. Users can upload their bacterial census data from a wide variety of common formats, using either amplified 16S rRNA data or shotgun metagenomic data. Metadata concerning environmental, culture, or host conditions can also be uploaded. During the data upload process, METAGENassist also performs an automated taxonomic-to-phenotypic mapping. Phenotypic information covering nearly 20 functional categories such as GC content, genome size, oxygen requirements, energy sources and preferred temperature range is automatically generated from the taxonomic input data. Using this phenotypically enriched data, users can then perform a variety of multivariate and univariate data analyses including fold change analysis, t-tests, PCA, PLS-DA, clustering and classification. To facilitate data processing, users are guided through a step-by-step analysis workflow using a variety of menus, information hyperlinks and check boxes. METAGENassist also generates colorful, publication quality tables and graphs that can be downloaded and used directly in the preparation of scientific papers. METAGENassist is available at http://www.metagenassist.ca.


Subjects
Bacteria/classification , Metagenomics/methods , Software , Bacteria/genetics , Statistical Data Interpretation , Internet , Phenotype
18.
Nucleic Acids Res ; 36(Web Server issue): W202-9, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18483082

ABSTRACT

PROTEUS2 is a web server designed to support comprehensive protein structure prediction and structure-based annotation. PROTEUS2 accepts either single sequences (for directed studies) or multiple sequences (for whole proteome annotation) and predicts the secondary and, if possible, tertiary structure of the query protein(s). Unlike most other tools or servers, PROTEUS2 bundles signal peptide identification, transmembrane helix prediction, transmembrane beta-strand prediction, secondary structure prediction (for soluble proteins) and homology modeling (i.e. 3D structure generation) into a single prediction pipeline. Using a combination of progressive multi-sequence alignment, structure-based mapping, hidden Markov models, multi-component neural nets and up-to-date databases of known secondary structure assignments, PROTEUS2 is able to achieve among the highest reported levels of predictive accuracy for signal peptides (Q2 = 94%), membrane spanning helices (Q2 = 87%) and secondary structure (Q3 score of 81.3%). PROTEUS2's homology modeling services also provide high quality 3D models that compare favorably with those generated by SWISS-MODEL and 3D JigSaw (within 0.2 Å RMSD). The average PROTEUS2 prediction takes approximately 3 min per query sequence. The PROTEUS2 server, along with source code for many of its modules, is accessible at http://wishart.biology.ualberta.ca/proteus2.
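The Q3 score quoted above is conventionally the percentage of residues whose three-state secondary structure label (helix, strand or coil) is predicted correctly. The following is a minimal sketch of that calculation, not code from PROTEUS2; the function name `q3_score`, the H/E/C label alphabet and the example strings are assumptions made for illustration.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the 3-state labels agree.

    Both arguments are strings over a hypothetical H/E/C alphabet
    (helix, strand, coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("inputs must not be empty")
    # Count residues whose predicted label equals the observed label.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Made-up example: 13 residues, one of which is mispredicted.
observed = "CHHHHHCCEEEEC"
predicted = "CHHHHCCCEEEEC"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")
```

A Q2 score can be computed the same way when only two states are distinguished (for example, membrane-spanning versus not).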


Subjects
Protein Secondary Structure , Software , Protein Structural Homology , Algorithms , Internet , Membrane Proteins/chemistry , Molecular Models , Protein Sorting Signals , Protein Sequence Analysis
19.
Nucleic Acids Res ; 36(Web Server issue): W496-502, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18515350

ABSTRACT

CS23D (chemical shift to 3D structure) is a web server for rapidly generating accurate 3D protein structures using only assigned nuclear magnetic resonance (NMR) chemical shifts and sequence data as input. Unlike conventional NMR methods, CS23D requires no NOE and/or J-coupling data to perform its calculations. CS23D accepts chemical shift files in either SHIFTY or BMRB formats, and produces a set of PDB coordinates for the protein in about 10-15 min. CS23D uses a pipeline of several preexisting programs or servers to calculate the actual protein structure. Depending on the sequence similarity (or lack thereof), CS23D uses either (i) maximal subfragment assembly (a form of homology modeling), (ii) chemical shift threading or (iii) shift-aided de novo structure prediction (via Rosetta) followed by chemical shift refinement to generate and/or refine protein coordinates. Tests conducted on more than 100 proteins from the BioMagResBank indicate that CS23D converges (i.e. finds a solution) for >95% of protein queries. These chemical shift generated structures were found to be within 0.2-2.8 Å RMSD of the NMR structure generated using conventional NOE-based NMR methods or conventional X-ray methods. The performance of CS23D is dependent on the completeness of the chemical shift assignments and the similarity of the query protein to known 3D folds. CS23D is accessible at http://www.cs23d.ca.
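The RMSD figures above measure how far apart two sets of matched atomic coordinates are. The following is a minimal sketch of the calculation, not code from CS23D; it assumes the two structures have already been superimposed and that atoms are listed in the same order, and the function name `rmsd` and the coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) points.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    # Sum the squared Euclidean distance between each matched atom pair.
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Made-up coordinates in angstroms for two three-atom fragments.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.1, 0.0), (1.5, -0.1, 0.0), (3.0, 0.0, 0.2)]
print(f"RMSD = {rmsd(reference, model):.3f}")
```

In practice an optimal superposition (for example, a least-squares fit) is applied before the deviation is computed; that step is omitted here to keep the sketch short.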


Subjects
Biomolecular Nuclear Magnetic Resonance , Protein Conformation , Software , Algorithms , Protein Databases , Internet , Protein Sequence Analysis
20.
Nucleic Acids Res ; 36(Database issue): D222-9, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17916570

ABSTRACT

The protein property prediction and testing database (PPT-DB) is a database housing nearly 30 carefully curated databases, each of which contains commonly predicted protein property information. These properties include both structural (e.g. secondary structure, contact order, disulfide pairing) and dynamic (e.g. order parameters, B-factors, folding rates) features that have been measured, derived or tabulated from a variety of sources. PPT-DB is designed to serve two purposes. First, it is intended to serve as a centralized, up-to-date, freely downloadable and easily queried repository of predictable or 'derived' protein property data. In this role, PPT-DB can serve as a one-stop, fully standardized repository for developers to obtain the training, testing and validation data needed for almost any kind of protein property prediction program they may wish to create. Second, PPT-DB can serve as a tool for homology-based protein property prediction. Users may query PPT-DB with a sequence of interest and have a specific property predicted using a sequence similarity search against PPT-DB's extensive collection of proteins with known properties. PPT-DB exploits the well-known fact that protein structure and dynamic properties are highly conserved between homologous proteins. Predictions derived from PPT-DB's similarity searches are typically 85-95% correct (for categorical predictions, such as secondary structure) or exhibit correlations of >0.80 (for numeric predictions, such as accessible surface area). This performance is 10-20% better than what is typically obtained from standard 'ab initio' predictions. PPT-DB, its prediction utilities and all of its contents are available at http://www.pptdb.ca.


Subjects
Protein Databases , Protein Databases/standards , Internet , Protein Conformation , Proteins/chemistry , Quality Control , Amino Acid Sequence Homology , Software