Results 1 - 20 of 38
1.
Nucleic Acids Res ; 51(D1): D1373-D1380, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36305812

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a popular chemical information resource that serves a wide range of use cases. In the past two years, a number of changes were made to PubChem. Data from more than 120 data sources was added to PubChem. Some major highlights include: the integration of Google Patents data into PubChem, which greatly expanded the coverage of the PubChem Patent data collection; the creation of the Cell Line and Taxonomy data collections, which provide quick and easy access to chemical information for a given cell line and taxon, respectively; and the update of the bioassay data model. In addition, new functionalities were added to the PubChem programmatic access protocols, PUG-REST and PUG-View, including support for target-centric data download for a given protein, gene, pathway, cell line, and taxon and the addition of the 'standardize' option to PUG-REST, which returns the standardized form of an input chemical structure. A significant update was also made to PubChemRDF. The present paper provides an overview of these changes.
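The 'standardize' option mentioned above can be illustrated with a small URL builder. This is a minimal sketch, assuming the PUG-REST path layout `/rest/pug/standardize/smiles/<SMILES>/<format>`; the function name `standardize_url` is hypothetical, and the exact path should be checked against the current PUG-REST documentation before use. No request is sent here.

```python
from urllib.parse import quote

# Base URL of PUG-REST (assumed to be unchanged since the paper was published).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def standardize_url(smiles: str, output: str = "SDF") -> str:
    """Build a URL asking PUG-REST for the standardized form of a SMILES string.

    The path layout is an assumption for illustration, not a verified contract.
    """
    # Percent-encode the SMILES so characters such as '#' are not read as URL syntax.
    encoded = quote(smiles, safe="")
    return f"{PUG_REST_BASE}/standardize/smiles/{encoded}/{output}"


print(standardize_url("CCCC"))
```

Structures containing characters that are awkward in a URL path may be better sent in a POST body; that variant is not shown here.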


Subjects
Databases, Chemical , Drug Discovery , Drug Discovery/methods , Biological Assay , Proteins , Cheminformatics
2.
Environ Sci Technol ; 58(9): 4181-4192, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38373301

ABSTRACT

Alzheimer's disease (AD) is a complex and multifactorial neurodegenerative disease, which is currently diagnosed via clinical symptoms and nonspecific biomarkers (such as Aβ1-42, t-Tau, and p-Tau) measured in cerebrospinal fluid (CSF), which alone do not provide sufficient insights into disease progression. In this pilot study, these biomarkers were complemented with small-molecule analysis using non-target high-resolution mass spectrometry coupled with liquid chromatography (LC) on the CSF of three groups: AD, mild cognitive impairment (MCI) due to AD, and a non-demented (ND) control group. An open-source cheminformatics pipeline based on MS-DIAL and patRoon was enhanced using CSF- and AD-specific suspect lists to assist in data interpretation. Chemical Similarity Enrichment Analysis revealed a significant increase of hydroxybutyrates in AD, including 3-hydroxybutanoic acid, which was found at higher levels in AD compared to MCI and ND. Furthermore, a highly sensitive target LC-MS method was used to quantify 35 bile acids (BAs) in the CSF, revealing several statistically significant differences including higher dehydrolithocholic acid levels and decreased conjugated BA levels in AD. This work provides several promising small-molecule hypotheses that could be used to help track the progression of AD in CSF samples.


Subjects
Alzheimer Disease , Cognitive Dysfunction , Neurodegenerative Diseases , Humans , Alzheimer Disease/cerebrospinal fluid , Alzheimer Disease/diagnosis , Alzheimer Disease/psychology , tau Proteins/cerebrospinal fluid , Amyloid beta-Peptides/cerebrospinal fluid , Pilot Projects , Cognitive Dysfunction/cerebrospinal fluid , Cognitive Dysfunction/diagnosis , Cognitive Dysfunction/psychology , Biomarkers , Disease Progression
3.
Glycobiology ; 33(6): 454-463, 2023 06 21.
Article in English | MEDLINE | ID: mdl-37129482

ABSTRACT

The GlyCosmos Glycoscience Portal (https://glycosmos.org) and PubChem (https://pubchem.ncbi.nlm.nih.gov/) are major portals for glycoscience and chemistry, respectively. GlyCosmos is a portal for glycan-related repositories, including GlyTouCan, GlycoPOST, and UniCarb-DR, as well as for glycan-related data resources that have been integrated from a variety of 'omics databases. Glycogenes, glycoproteins, lectins, pathways, and disease information related to glycans are accessible from GlyCosmos. PubChem, on the other hand, is a chemistry-based portal at the National Center for Biotechnology Information. PubChem provides information not only on chemicals, but also genes, proteins, pathways, as well as patents, bioassays, and more, from hundreds of data resources from around the world. In this work, these 2 portals have made substantial efforts to integrate their complementary data to allow users to cross between these 2 domains. In addition to glycan structures, key information, such as glycan-related genes, relevant diseases, glycoproteins, and pathways, was integrated and cross-linked with one another. The interfaces were designed to enable users to easily find, access, download, and reuse data of interest across these resources. Use cases are described illustrating and highlighting the type of content that can be investigated. In total, these integrations provide life science researchers improved awareness and enhanced access to glycan-related information.


Subjects
Databases, Chemical , Polysaccharides , Glycosylation , Workflow , Informatics , Polysaccharides/chemistry , Glycoconjugates/chemistry
4.
Nucleic Acids Res ; 49(D1): D1388-D1395, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33151290

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a popular chemical information resource that serves the scientific community as well as the general public, with millions of unique users per month. In the past two years, PubChem made substantial improvements. Data from more than 100 new data sources were added to PubChem, including chemical-literature links from Thieme Chemistry, chemical and physical property links from SpringerMaterials, and patent links from the World Intellectual Property Organization (WIPO). PubChem's homepage and individual record pages were updated to help users find desired information faster. This update involved a data model change for the data objects used by these pages as well as by programmatic users. Several new services were introduced, including the PubChem Periodic Table and Element pages, Pathway pages, and Knowledge panels. Additionally, in response to the coronavirus disease 2019 (COVID-19) outbreak, PubChem created a special data collection that contains PubChem data related to COVID-19 and the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).


Subjects
COVID-19/prevention & control , Databases, Chemical , Information Storage and Retrieval/statistics & numerical data , SARS-CoV-2/isolation & purification , User-Computer Interface , COVID-19/epidemiology , COVID-19/virology , Drug Discovery/statistics & numerical data , Epidemics , Humans , Information Storage and Retrieval/methods , Internet , Public Health/statistics & numerical data , SARS-CoV-2/physiology , Software
5.
Anal Bioanal Chem ; 414(25): 7399-7419, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35829770

ABSTRACT

Parkinson's disease (PD) is the second most prevalent neurodegenerative disease, with an increasing incidence in recent years due to the aging population. Genetic mutations alone only explain <10% of PD cases, while environmental factors, including small molecules, may play a significant role in PD. In the present work, 22 plasma (11 PD, 11 control) and 19 feces samples (10 PD, 9 control) were analyzed by non-target high-resolution mass spectrometry (NT-HRMS) coupled to two liquid chromatography (LC) methods (reversed-phase (RP) and hydrophilic interaction liquid chromatography (HILIC)). A cheminformatics workflow was optimized using open software (MS-DIAL and patRoon) and open databases (all public MSP-formatted spectral libraries for MS-DIAL, PubChemLite for Exposomics, and the LITMINEDNEURO list for patRoon). Furthermore, five disease-specific databases and three suspect lists (on PD and related disorders) were developed, using PubChem functionality to identify relevant unknown chemicals. The results showed that non-target screening with the larger databases generally provided better results compared with smaller suspect lists. However, two suspect screening approaches with patRoon were also good options to study specific chemicals in PD. The combination of chromatographic methods (RP and HILIC) as well as two ionization modes (positive and negative) enhanced the coverage of chemicals in the biological samples. While most metabolomics studies in PD have focused on blood and cerebrospinal fluid, we found a higher number of relevant features in feces, such as alanine betaine or nicotinamide, which can be directly metabolized by gut microbiota. This highlights the potential role of gut dysbiosis in PD development.
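The core idea of suspect screening, matching a measured m/z against a list of expected compounds, can be sketched as follows. This is an illustrative simplification, not the patRoon or MS-DIAL workflow used in the study (those tools also use retention time, isotope patterns, and MS/MS spectra). The 5 ppm tolerance, the [M+H]+ adduct, and the function names are assumptions chosen for the example.

```python
PROTON_MASS = 1.007276  # Da; added to a neutral monoisotopic mass to model [M+H]+


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def screen_suspects(measured_mz: float, suspects: dict, tolerance_ppm: float = 5.0) -> list:
    """Return names of suspects whose [M+H]+ m/z lies within the tolerance."""
    hits = []
    for name, neutral_mass in suspects.items():
        theoretical = neutral_mass + PROTON_MASS
        if abs(ppm_error(measured_mz, theoretical)) <= tolerance_ppm:
            hits.append(name)
    return sorted(hits)


# One-entry suspect list: nicotinamide (C6H6N2O), neutral monoisotopic mass in Da.
suspects = {"nicotinamide": 122.048013}
print(screen_suspects(123.0554, suspects))
```

A larger suspect list increases the chance of a match but also of false positives at the same tolerance, which is one reason real workflows add orthogonal evidence.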


Subjects
Exposome , Neurodegenerative Diseases , Parkinson Disease , Aged , Alanine , Betaine , Cheminformatics , Humans , Metabolome , Metabolomics/methods , Niacinamide , Pilot Projects
6.
Nucleic Acids Res ; 48(D1): D1093-D1103, 2020 01 08.
Article in English | MEDLINE | ID: mdl-31680153

ABSTRACT

Plant Reactome (https://plantreactome.gramene.org) is an open-source, comparative plant pathway knowledgebase of the Gramene project. It uses Oryza sativa (rice) as a reference species for manual curation of pathways and extends pathway knowledge to another 82 plant species via gene-orthology projection using the Reactome data model and framework. It currently hosts 298 reference pathways, including metabolic and transport pathways, transcriptional networks, hormone signaling pathways, and plant developmental processes. In addition to browsing plant pathways, users can upload and analyze their omics data, such as the gene-expression data, and overlay curated or experimental gene-gene interaction data to extend pathway knowledge. The curation team actively engages researchers and students on gene and pathway curation by offering workshops and online tutorials. The Plant Reactome supports, implements and collaborates with the wider community to make data and tools related to genes, genomes, and pathways Findable, Accessible, Interoperable and Re-usable (FAIR).


Subjects
Computational Biology/methods , Databases, Genetic , Genomics , Metabolomics , Plants/genetics , Plants/metabolism , Proteomics , Gene Regulatory Networks , Genomics/methods , Humans , Metabolic Networks and Pathways , Metabolomics/methods , Proteomics/methods , Signal Transduction , Web Browser
7.
Glycobiology ; 31(11): 1510-1519, 2021 12 18.
Article in English | MEDLINE | ID: mdl-34314492

ABSTRACT

Glycans play a vital role in health, disease, bioenergy, biomaterials and bio-therapeutics. As a result, there is keen interest in identifying and increasing glycan data in bioinformatics databases like ChEBI and PubChem, and in connecting them to resources at the EMBL-EBI and NCBI to facilitate access to important annotations at a global level. GlyTouCan is a comprehensive archival database that contains glycans obtained primarily through batch upload from glycan repositories, glycoprotein databases and individual laboratories. In many instances, the glycan structures deposited in GlyTouCan may not be fully defined or have supporting experimental evidence and citations. Databases like ChEBI and PubChem were designed to accommodate complete atomistic structures with well-defined chemical linkages. As a result, they cannot easily accommodate the structural ambiguity inherent in glycan databases. Consequently, there is a need to improve the organization of glycan data coherently to enhance connectivity across the major NCBI, EMBL-EBI and glycoscience databases. This paper outlines a workflow developed in collaboration between GlyGen, ChEBI and PubChem to improve the visibility and connectivity of glycan data across these resources. GlyGen hosts a subset of glycans (~29,000) from the GlyTouCan database and has submitted valuable glycan annotations to the PubChem database and integrated over 10,500 (including ambiguously defined) glycans into the ChEBI database. The integrated glycans were prioritized based on links to PubChem and connectivity to glycoprotein data. The pipeline provides a blueprint for how glycan data can be harmonized between different resources. The current PubChem, ChEBI and GlyTouCan mappings can be downloaded from GlyGen (https://data.glygen.org).


Subjects
Databases, Chemical , Glycoproteins/chemistry , Polysaccharides/chemistry , Software , Carbohydrate Conformation , Glycomics
8.
Nucleic Acids Res ; 47(D1): D1102-D1109, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30371825

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a key chemical information resource for the biomedical research community. Substantial improvements were made in the past few years. New data content was added, including spectral information, scientific articles mentioning chemicals, and information for food and agricultural chemicals. PubChem released new web interfaces, such as PubChem Target View page, Sources page, Bioactivity dyad pages and Patent View page. PubChem also released a major update to PubChem Widgets and introduced a new programmatic access interface, called PUG-View. This paper describes these new developments in PubChem.


Subjects
Computational Biology/methods , Databases, Chemical , Pharmaceutical Preparations/chemistry , Small Molecule Libraries/chemistry , Animals , Biological Assay/methods , Drug Discovery/methods , High-Throughput Screening Assays/methods , Humans , Information Storage and Retrieval/methods , Internet , Molecular Structure , Patents as Topic , Structure-Activity Relationship
9.
Nucleic Acids Res ; 46(W1): W563-W570, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29718389

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is one of the largest open chemical information resources available. It currently receives millions of unique users per month on average, serving as a key resource for many research fields such as cheminformatics, chemical biology, medicinal chemistry, and drug discovery. PubChem provides multiple programmatic access routes to its data and services. One of them is PUG-REST, a Representational State Transfer (REST)-like web service interface to PubChem. On average, PUG-REST receives more than a million requests per day from tens of thousands of unique users. The present paper provides an update on PUG-REST since our previous paper published in 2015. This includes access to new kinds of data (e.g. concise bioactivity data, table of contents headings, etc.), full implementation of synchronous fast structure search, support for assay data retrieval using accession identifiers in response to the deprecation of NCBI's GI numbers, data exchange between PUG-REST and NCBI's E-Utilities through the List Gateway, implementation of dynamic traffic control through throttling, and enhanced usage policies. In addition, example Perl scripts are provided, which the user can easily modify, run, or translate into another scripting language.
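The dynamic throttling mentioned above can be respected on the client side by reading the status that PubChem reports with each response. The sketch below assumes the status arrives in an `X-Throttling-Control` header shaped like `Request Count status: Green (0%), ...`; the header format, the colour-to-delay mapping, and the 0.2 s base delay (about five requests per second) are all assumptions made for illustration and should be checked against the current usage policy. The abstract's own examples are Perl scripts; Python is used here only for illustration.

```python
import re

# Matches fragments such as "Request Time status: Yellow (55%)" (assumed format).
STATUS_PATTERN = re.compile(
    r"(Request Count|Request Time|Service) status:\s*(\w+)\s*\((\d+)%\)"
)

# Hypothetical back-off multipliers; not taken from PubChem documentation.
_BACKOFF = {"green": 1, "yellow": 2, "red": 5}


def parse_throttle_header(value: str) -> dict:
    """Turn the header text into {status name: (colour, percent)}."""
    return {name: (colour, int(pct)) for name, colour, pct in STATUS_PATTERN.findall(value)}


def suggested_delay(statuses: dict, base_delay: float = 0.2) -> float:
    """Pick a pause before the next request based on the worst reported colour."""
    worst = 1
    for colour, _pct in statuses.values():
        if colour.lower() == "black":
            return 60.0  # treat "Black" as blocked: wait a full minute
        worst = max(worst, _BACKOFF.get(colour.lower(), 5))
    return base_delay * worst


header = "Request Count status: Green (0%), Request Time status: Yellow (55%), Service status: Green (20%)"
print(suggested_delay(parse_throttle_header(header)))
```

In a real client the returned delay would be passed to `time.sleep` between requests.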


Subjects
Chemistry, Pharmaceutical/methods , Drug Discovery/methods , Programming Languages , User-Computer Interface , Databases, Chemical , Humans , Information Storage and Retrieval/methods , Internet , Small Molecule Libraries/pharmacology
10.
Nucleic Acids Res ; 45(D1): D955-D963, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899599

ABSTRACT

PubChem's BioAssay database (https://pubchem.ncbi.nlm.nih.gov) has served as a public repository for small-molecule and RNAi screening data since 2004, providing open access to its data content to the community. PubChem accepts data submissions from researchers worldwide in academia, industry and government agencies. PubChem also collaborates with other chemical biology database stakeholders through data exchange. After more than a decade of development, it has become an important information resource supporting drug discovery and chemical biology research. To facilitate data discovery, PubChem is integrated with all other databases at NCBI. In this work, we provide an update for the PubChem BioAssay database describing several recent developments, including added sources of research data, a redesigned BioAssay record page, a new BioAssay classification browser and new features in the Upload system that facilitate data sharing.


Subjects
Databases, Chemical , Databases, Nucleic Acid , RNA Interference , Search Engine , Small Molecule Libraries , Drug Discovery , Gene Expression Regulation/drug effects , Humans , Software , User-Computer Interface , Web Browser
11.
Bioinformatics ; 33(11): 1621-1629, 2017 Jun 01.
Article in English | MEDLINE | ID: mdl-28158543

ABSTRACT

MOTIVATION: Genetic variants in drug targets and metabolizing enzymes often have important functional implications, including altering the efficacy and toxicity of drugs. Identifying single nucleotide variants (SNVs) that contribute to differences in drug response and understanding their underlying mechanisms are fundamental to successful implementation of the precision medicine model. This work reports an effort to collect, classify and analyze SNVs that may affect the optimal response to currently approved drugs. RESULTS: An integrated approach was taken involving data mining across multiple information resources including databases containing drugs, drug targets, chemical structures, protein-ligand structure complexes, genetic and clinical variations as well as protein sequence alignment tools. We obtained 2640 SNVs of interest, most of which occur rarely in populations (minor allele frequency < 0.01). Clinical significance of only 9.56% of the SNVs is known in ClinVar, although 79.02% are predicted as deleterious. The examples here demonstrate that even if the mapped SNVs predicted as deleterious may not result in significant structural modifications, they can plausibly modify the protein-drug interactions, affecting selectivity and drug-binding affinity. Our analysis identifies potentially deleterious SNVs present on drug-binding residues that are relevant for further studies in the context of precision medicine. AVAILABILITY AND IMPLEMENTATION: Data are available from Supplementary information file. CONTACT: yanli.wang@nih.gov. SUPPLEMENTARY INFORMATION: Supplementary Tables S1-S5 are available at Bioinformatics online.
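The rarity criterion quoted above (minor allele frequency < 0.01) can be made concrete with a short sketch. The variant identifiers and allele counts below are invented for illustration and do not come from the study; the function names are hypothetical.

```python
def minor_allele_frequency(ref_count: int, alt_count: int) -> float:
    """Frequency of the less common of two alleles at a biallelic site."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no alleles observed")
    return min(ref_count, alt_count) / total


def rare_variants(variants: dict, threshold: float = 0.01) -> list:
    """Return IDs of variants whose minor allele frequency is below the threshold."""
    return [
        variant_id
        for variant_id, (ref, alt) in variants.items()
        if minor_allele_frequency(ref, alt) < threshold
    ]


# Made-up allele counts: (reference alleles, alternate alleles).
example = {"snv_a": (9990, 10), "snv_b": (600, 400)}
print(rare_variants(example))
```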


Subjects
Data Mining/methods , Polymorphism, Single Nucleotide , Protein Binding/genetics , Sequence Analysis, Protein/methods , Binding Sites , Gene Frequency , Humans , Precision Medicine/methods , Sequence Analysis, DNA/methods
12.
Nucleic Acids Res ; 42(Database issue): D1075-82, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24198245

ABSTRACT

PubChem's BioAssay database (http://pubchem.ncbi.nlm.nih.gov) is a public repository for archiving biological tests of small molecules generated through high-throughput screening experiments, medicinal chemistry studies, chemical biology research and drug discovery programs. In addition, the BioAssay database contains data from high-throughput RNA interference screening aimed at identifying critical genes responsible for a biological process or disease condition. The mission of PubChem is to serve the community by providing free and easy access to all deposited data. To this end, PubChem BioAssay is integrated into the National Center for Biotechnology Information retrieval system, making its records searchable by Entrez queries and cross-linked to other biomedical information archived at the National Center for Biotechnology Information. Moreover, PubChem BioAssay provides web-based and programmatic tools allowing users to search, access and analyze bioassay test results and metadata. In this work, we provide an update for the PubChem BioAssay resource, such as information content growth, new developments supporting data integration and search, and the recently deployed PubChem Upload to streamline chemical structure and bioassay submissions.


Subjects
Databases, Chemical , High-Throughput Screening Assays , RNA Interference , Drug Discovery , Genes , Humans , Internet , Proteins/genetics , Small Molecule Libraries , Systems Integration
13.
J Chem Inf Model ; 54(2): 407-18, 2014 Feb 24.
Article in English | MEDLINE | ID: mdl-24460210

ABSTRACT

Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public databases including the MMDB, PubChem BioAssay, GEO DataSets, and the BioSystems databases. Entrez E-Utilities were applied, and in-house Ruby scripts were developed for data retrieval and pathway analysis to identify and evaluate relevant pathways common to the retrieved drug targets. Pathways pertinent to clinical uses or MOAs were obtained for most drugs. Interestingly, for some drugs the analysis identified pathways responsible for diseases other than their current therapeutic uses, and these pathways were verified retrospectively by in vitro tests, in vivo tests, or clinical trials. The pathway enrichment analysis based on drug target information from public databases could provide a novel approach for elucidating drug MOAs and repositioning, therefore benefiting the discovery of new therapeutic treatments for diseases.
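Pathway enrichment of a drug's target set is commonly scored with a hypergeometric tail probability. The abstract does not say which statistic the in-house Ruby scripts used, so the sketch below is an assumption about one plausible approach, written in Python rather than Ruby, with hypothetical parameter names.

```python
from math import comb


def hypergeom_enrichment_p(population: int, pathway_size: int,
                           sample_size: int, overlap: int) -> float:
    """P(X >= overlap) when drawing sample_size genes from population,
    of which pathway_size belong to the pathway of interest."""
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes in total, 3 in the pathway, 3 drug targets, all 3 in the pathway.
print(hypergeom_enrichment_p(10, 3, 3, 3))
```

When many pathways are tested for one drug, the resulting p-values would normally be corrected for multiple testing; that step is omitted here.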


Subjects
Data Mining/methods , Databases, Pharmaceutical , Drug Repositioning/methods , Antineoplastic Agents/pharmacology , Humans , Receptors, Cytoplasmic and Nuclear/metabolism
14.
Bioinformatics ; 28(21): 2851-2, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-22942017

ABSTRACT

SUMMARY: The FSelector package contains a comprehensive list of feature selection algorithms for supporting bioinformatics and machine learning research. FSelector primarily collects and implements the filter type of feature selection techniques, which are computationally efficient for mining large datasets. In particular, FSelector allows ensemble feature selection that takes advantage of multiple feature selection algorithms to yield more robust results. FSelector also provides many useful auxiliary tools, including normalization, discretization and missing data imputation. AVAILABILITY: FSelector, written in the Ruby programming language, is free and open-source software that runs on all Ruby supporting platforms, including Windows, Linux and Mac OS X. FSelector is available from https://rubygems.org/gems/fselector and can be installed like a breeze via the command gem install fselector. The source code is available (https://github.com/need47/fselector) and is fully documented (http://rubydoc.info/gems/fselector/frames).
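Ensemble feature selection, as described above, combines the output of several filter-type scorers into one ranking. The sketch below shows one common way to do that, averaging per-method ranks; it is written in Python and does not use or reproduce the FSelector Ruby API. The function names and the example scores are hypothetical.

```python
from statistics import mean


def rank_scores(scores: dict) -> dict:
    """Map each feature to its rank, where rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {feature: position + 1 for position, feature in enumerate(ordered)}


def ensemble_rank(score_tables: list) -> list:
    """Order features by their mean rank across several filter scorers."""
    ranks = [rank_scores(table) for table in score_tables]
    features = list(ranks[0])
    average = {f: mean(r[f] for r in ranks) for f in features}
    return sorted(average, key=average.get)


# Scores from two imaginary filter methods over the same three features.
method_a = {"f1": 0.9, "f2": 0.5, "f3": 0.1}
method_b = {"f1": 0.2, "f2": 0.8, "f3": 0.5}
print(ensemble_rank([method_a, method_b]))
```

Rank averaging sidesteps the fact that different filter methods produce scores on different scales, which is one reason it is a popular aggregation choice.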


Subjects
Algorithms , Artificial Intelligence , Computational Biology/methods , Software , Programming Languages
15.
J Mol Biol ; 434(11): 167514, 2022 06 15.
Article in English | MEDLINE | ID: mdl-35227770

ABSTRACT

PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public chemical database at the U.S. National Institutes of Health. Visited by millions of users every month, it plays a role as a key chemical information resource for biomedical research communities. Data in PubChem is from hundreds of contributors and organized into multiple collections by record type. Among these are the Protein, Gene, Pathway, and Taxonomy data collections. Records in these collections contain information on chemicals related to a given biological target (i.e., protein, gene, pathway, or taxon), helping users to analyze and interpret the biological activity data of molecules. In addition, annotations about the biological targets are collected from authoritative or curated data sources and integrated into the four collections. The content can be programmatically accessed through PubChem's web service interfaces (including PUG View). A machine-readable representation of this content is also provided within PubChemRDF.
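Programmatic access to these target-centric records through PUG View can be sketched as a URL builder. The path layout `/rest/pug_view/data/<record type>/<identifier>/<format>` and the record-type names used below are assumptions for illustration; the function name is hypothetical, and no request is sent.

```python
# Base URL of PUG View data endpoints (assumed layout).
PUG_VIEW_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug_view/data"

# Record types named after the four collections in the abstract (assumed spellings).
TARGET_RECORD_TYPES = {"protein", "gene", "pathway", "taxonomy"}


def pug_view_url(record_type: str, identifier, output: str = "JSON") -> str:
    """Build a PUG View URL for one record in a target-centric collection."""
    if record_type not in TARGET_RECORD_TYPES:
        raise ValueError(f"unsupported record type: {record_type}")
    return f"{PUG_VIEW_BASE}/{record_type}/{identifier}/{output}"


# Example: an NCBI Gene ID used as the identifier.
print(pug_view_url("gene", 1956))
```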


Subjects
Databases, Chemical , Biology , Drug Discovery , Proteins/genetics
16.
Methods Mol Biol ; 2443: 511-525, 2022.
Article in English | MEDLINE | ID: mdl-35037224

ABSTRACT

Plant Reactome (https://plantreactome.gramene.org) and PubChem (https://pubchem.ncbi.nlm.nih.gov) are two reference data portals and resources for curated plant pathways, small molecules, metabolites, gene products, and macromolecular interactions. The Plant Reactome knowledgebase, a conceptual plant pathway network, is built by biocuration and by integrating (bio)chemical entities, gene products, and macromolecular interactions. It provides manually curated pathways for the reference species Oryza sativa (rice) and gene orthology-based projections that extend pathway knowledge to 106 plant species. Currently, it hosts 320 reference pathways for plant metabolism, hormone signaling, transport, genetic regulation, plant organ development and differentiation, and biotic and abiotic stress responses. In addition to the pathway browsing and search functions, the Plant Reactome provides analysis tools for pathway comparison between reference and projected species, pathway enrichment in gene expression data, and overlay of gene-gene interaction data on pathways. PubChem, a popular reference database of (bio)chemical entities, provides information on small molecules and other types of chemical entities, such as siRNAs, miRNAs, lipids, carbohydrates, and chemically modified nucleotides. The data in PubChem is collected from hundreds of data sources, including Plant Reactome. This chapter provides a brief overview of the Plant Reactome and the PubChem knowledgebases, their association to other public resources providing accessory information, and how users can readily access the contents.


Subjects
Knowledge Bases , Metabolic Networks and Pathways , Databases, Factual , Plants/genetics , Plants/metabolism , Proteins/metabolism
17.
Front Mol Biosci ; 9: 831740, 2022.
Article in English | MEDLINE | ID: mdl-35252351

ABSTRACT

iCn3D was initially developed as a web-based 3D molecular viewer. It then evolved from visualization into full-featured interactive structural analysis software. It became a collaborative research instrument through the sharing of permanent, shortened URLs that encapsulate not only annotated visual molecular scenes, but also all underlying data and analysis scripts in a FAIR manner. More recently, with the growth of structural databases, the need to analyze large structural datasets systematically led us to use Python scripts and convert the code to be used in Node.js scripts. We showed a few examples of Python scripts at https://github.com/ncbi/icn3d/tree/master/icn3dpython to export secondary structures or PNG images from iCn3D. Users just need to replace the URL in the Python scripts to export other annotations from iCn3D. Furthermore, any interactive iCn3D feature can be converted into a Node.js script to be run in batch mode, enabling an interactive analysis performed on one or a handful of protein complexes to be scaled up to analysis features of large ensembles of structures. Example Node.js analysis scripts are currently available at https://github.com/ncbi/icn3d/tree/master/icn3dnode. This development will enable ensemble analyses on growing structural databases such as AlphaFold or RoseTTAFold on one hand and Electron Microscopy on the other. In this paper, we also review new features such as DelPhi electrostatic potential, 3D view of mutations, alignment of multiple chains, assembly of multiple structures by realignment, dynamic symmetry calculation, 2D cartoons at different levels, interactive contact maps, and use of iCn3D in Jupyter Notebook as described at https://pypi.org/project/icn3dpy.

18.
Environ Sci Eur ; 34(1): 104, 2022.
Article in English | MEDLINE | ID: mdl-36284750

ABSTRACT

Background: The NORMAN Association (https://www.norman-network.com/) initiated the NORMAN Suspect List Exchange (NORMAN-SLE; https://www.norman-network.com/nds/SLE/) in 2015, following the NORMAN collaborative trial on non-target screening of environmental water samples by mass spectrometry. Since then, this exchange of information on chemicals that are expected to occur in the environment, along with the accompanying expert knowledge and references, has become a valuable knowledge base for "suspect screening" lists. The NORMAN-SLE now serves as a FAIR (Findable, Accessible, Interoperable, Reusable) chemical information resource worldwide. Results: The NORMAN-SLE contains 99 separate suspect list collections (as of May 2022) from over 70 contributors around the world, totalling over 100,000 unique substances. The substance classes include per- and polyfluoroalkyl substances (PFAS), pharmaceuticals, pesticides, natural toxins, high production volume substances covered under the European REACH regulation (EC: 1272/2008), priority contaminants of emerging concern (CECs) and regulatory lists from NORMAN partners. Several lists focus on transformation products (TPs) and complex features detected in the environment with various levels of provenance and structural information. Each list is available for separate download. The merged, curated collection is also available as the NORMAN Substance Database (NORMAN SusDat). Both the NORMAN-SLE and NORMAN SusDat are integrated within the NORMAN Database System (NDS). The individual NORMAN-SLE lists receive digital object identifiers (DOIs) and traceable versioning via a Zenodo community (https://zenodo.org/communities/norman-sle), with a total of > 40,000 unique views, > 50,000 unique downloads and 40 citations (May 2022). 
NORMAN-SLE content is progressively integrated into large open chemical databases such as PubChem (https://pubchem.ncbi.nlm.nih.gov/) and the US EPA's CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/), enabling further access to these lists, along with the additional functionality and calculated properties these resources offer. PubChem has also integrated significant annotation content from the NORMAN-SLE, including a classification browser (https://pubchem.ncbi.nlm.nih.gov/classification/#hid=101). Conclusions: The NORMAN-SLE offers a specialized service for hosting suspect screening lists of relevance for the environmental community in an open, FAIR manner that allows integration with other major chemical resources. These efforts foster the exchange of information between scientists and regulators, supporting the paradigm shift to the "one substance, one assessment" approach. New submissions are welcome via the contacts provided on the NORMAN-SLE website (https://www.norman-network.com/nds/SLE/). Supplementary Information: The online version contains supplementary material available at 10.1186/s12302-022-00680-6.

19.
Bioinformatics ; 26(22): 2881-8, 2010 Nov 15.
Article in English | MEDLINE | ID: mdl-20947527

ABSTRACT

MOTIVATION: Most of the previous data mining studies based on the NCI-60 dataset, due to its intrinsic cell-based nature, can hardly provide insights into the molecular targets for screened compounds. On the other hand, the abundant information on compound-target associations in PubChem can offer extensive experimental evidence of molecular targets for tested compounds. Therefore, by taking advantage of the data from both public repositories, one may investigate the correlations between the bioactivity profiles of small molecules from the NCI-60 dataset (cellular level) and their patterns of interactions with relevant protein targets from PubChem (molecular level) simultaneously. RESULTS: We investigated a set of 37 small molecules by providing links among their bioactivity profiles, protein targets and chemical structures. Hierarchical clustering of compounds was carried out based on their bioactivity profiles. We found that compounds were clustered into groups with similar modes of action, which strongly correlated with chemical structures. Furthermore, we observed that compounds similar in bioactivity profiles also shared similar patterns of interactions with relevant protein targets, especially when chemical structures were related. The current work presents a new strategy for combining and data mining the NCI-60 dataset and PubChem. This analysis shows that bioactivity profile comparison can provide insights into the modes of action at the molecular level, and thus will facilitate the knowledge-based discovery of novel compounds with desired pharmacological properties. AVAILABILITY: The bioactivity profiling data and the target annotation information are publicly available in the PubChem BioAssay database (ftp://ftp.ncbi.nlm.nih.gov/pubchem/Bioassay/).


Subjects
Data Mining/methods , Pharmaceutical Preparations/chemistry , Databases, Factual , Knowledge Bases , National Library of Medicine (U.S.) , United States
20.
J Chem Inf Model ; 51(9): 2440-8, 2011 Sep 26.
Article in English | MEDLINE | ID: mdl-21834535

ABSTRACT

Molecular target identification is of central importance to drug discovery. Here, we developed a computational approach, named bioactivity profile similarity search (BASS), for associating targets to small molecules by using the known target annotations of related compounds from public databases. To evaluate BASS, a bioactivity profile database was constructed using 4296 compounds that were commonly tested in the US National Cancer Institute 60 human tumor cell line anticancer drug screen (NCI-60). Each compound was used as a query to search against the entire bioactivity profile database, and reference compounds with similar bioactivity profiles above a threshold of 0.75 were considered as neighbor compounds of the query. Potential targets were subsequently linked to the identified neighbor compounds by using the known targets of the query compound. About 45% of the predicted compound-target associations were successfully verified retrospectively, suggesting the possible application of BASS in identifying the targets of uncharacterized compounds and thus providing insight into the study of promiscuity and polypharmacology. Furthermore, BASS identified a significant fraction of structurally diverse compounds with similar bioactivities, indicating its feasibility of "scaffold hopping" in searching novel molecules against the target of interest.
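The search step described above can be sketched in a few lines: compare a query's bioactivity profile with every other profile, keep those above the 0.75 threshold as neighbors, and link the query's known targets to them. The abstract does not state which similarity measure BASS uses, so Pearson correlation here is an assumption; the compound IDs, profile values, and target name are invented for illustration.

```python
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def neighbors(query_id: str, profiles: dict, threshold: float = 0.75) -> list:
    """IDs of compounds whose profile similarity to the query exceeds the threshold."""
    query = profiles[query_id]
    return sorted(
        cid for cid, profile in profiles.items()
        if cid != query_id and pearson(query, profile) > threshold
    )


def propose_associations(query_id: str, profiles: dict, known_targets: dict,
                         threshold: float = 0.75) -> dict:
    """Link the query's known targets to each neighbor compound."""
    targets = known_targets.get(query_id, set())
    return {cid: set(targets) for cid in neighbors(query_id, profiles, threshold)}


# Invented four-cell-line profiles and a made-up target annotation.
profiles = {"q": [1, 2, 3, 4], "a": [2, 4, 6, 8.5], "b": [4, 3, 2, 1]}
print(propose_associations("q", profiles, {"q": {"TARGET_X"}}))
```

A real profile in this setting would span the full NCI-60 panel rather than four values, and missing measurements would need handling before any correlation is computed.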


Subjects
Databases, Factual , Information Storage and Retrieval , Cell Line, Tumor , Drug Screening Assays, Antitumor , Humans , Retrospective Studies , Small Molecule Libraries