1.
Environ Microbiome ; 19(1): 26, 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38671539

ABSTRACT

Castellaniella species have been isolated from a variety of mixed-waste environments including the nitrate and multiple metal-contaminated subsurface at the Oak Ridge Reservation (ORR). Previous studies examining microbial community composition and nitrate removal at ORR during biostimulation efforts reported increased abundances of members of the Castellaniella genus concurrent with increased denitrification rates. Thus, we asked how genomic and abiotic factors control the Castellaniella biogeography at the site to understand how these factors may influence nitrate transformation in an anthropogenically impacted setting. We report the isolation and characterization of several Castellaniella strains from the ORR subsurface. Five of these isolates match at 100% identity (at the 16S rRNA gene V4 region) to two Castellaniella amplicon sequence variants (ASVs), ASV1 and ASV2, that have persisted in the ORR subsurface for at least 2 decades. However, ASV2 has consistently higher relative abundance in samples taken from the site and was also the dominant blooming denitrifier population during a prior biostimulation effort. We found that the ASV2 representative strain has greater resistance to mixed metal stress than the ASV1 representative strains. We attribute this resistance, in part, to the large number of unique heavy metal resistance genes identified on a genomic island in the ASV2 representative genome. Additionally, we suggest that the relatively lower fitness of ASV1 may be connected to the loss of the nitrous oxide reductase (nos) operon (and associated nitrous oxide reductase activity) due to the insertion at this genomic locus of a mobile genetic element carrying copper resistance genes. This study demonstrates the value of integrating genomic, environmental, and phenotypic data to characterize the biogeography of key microorganisms in contaminated sites.
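
The 100% identity criterion mentioned in the abstract above (isolate 16S rRNA V4 sequence versus ASV) can be illustrated with a minimal Python sketch. The toy sequences, the function names `percent_identity` and `match_isolates`, and the assumption that sequences are already aligned and of equal length are all hypothetical; the abstract does not describe the study's actual matching procedure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def match_isolates(isolates: dict[str, str], asvs: dict[str, str]) -> dict[str, list[str]]:
    """Map each isolate to the ASVs it matches at exactly 100% identity."""
    return {
        name: [asv for asv, ref in asvs.items() if percent_identity(seq, ref) == 100.0]
        for name, seq in isolates.items()
    }


# Hypothetical toy sequences, far shorter than a real V4 amplicon.
asvs = {"ASV1": "ACGTACGTAC", "ASV2": "ACGTTCGTAC"}
isolates = {
    "isolate_A": "ACGTACGTAC",  # identical to ASV1
    "isolate_B": "ACGTTCGTAC",  # identical to ASV2
    "isolate_C": "ACGTTCGTAA",  # one mismatch against ASV2, so no 100% hit
}

hits = match_isolates(isolates, asvs)
print(hits)
```

A single mismatch is enough to drop an isolate below the 100% threshold, which is why exact-match assignment to an ASV is a strict criterion.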

2.
Front Microbiol ; 14: 1095191, 2023.
Article in English | MEDLINE | ID: mdl-37065130

ABSTRACT

Sulfate-reducing bacteria (SRB) are obligate anaerobes that can couple their growth to the reduction of sulfate. Despite the importance of SRB to global nutrient cycles and their damage to the petroleum industry, our molecular understanding of their physiology remains limited. To systematically provide new insights into SRB biology, we generated a randomly barcoded transposon mutant library in the model SRB Desulfovibrio vulgaris Hildenborough (DvH) and used this genome-wide resource to assay the importance of its genes under a range of metabolic and stress conditions. In addition to defining the essential gene set of DvH, we identified a conditional phenotype for 1,137 non-essential genes. Through examination of these conditional phenotypes, we were able to make a number of novel insights into our molecular understanding of DvH, including how this bacterium synthesizes vitamins. For example, we identified DVU0867 as an atypical L-aspartate decarboxylase required for the synthesis of pantothenic acid, provided the first experimental evidence that biotin synthesis in DvH occurs via a specialized acyl carrier protein and without methyl esters, and demonstrated that the uncharacterized dehydrogenase DVU0826:DVU0827 is necessary for the synthesis of pyridoxal phosphate. In addition, we used the mutant fitness data to identify genes involved in the assimilation of diverse nitrogen sources and gained insights into the mechanism of inhibition of chlorate and molybdate. Our large-scale fitness dataset and RB-TnSeq mutant library are community-wide resources that can be used to generate further testable hypotheses into the gene functions of this environmentally and industrially important group of bacteria.

3.
Gigascience ; 11, 2022 10 17.
Article in English | MEDLINE | ID: mdl-36251274

ABSTRACT

BACKGROUND: Many organizations face challenges in managing and analyzing data, especially when relevant datasets arise from multiple sources and methods. Analyzing heterogeneous datasets and additional derived data requires rigorous tracking of their interrelationships and provenance. This task has long been a Grand Challenge of data science and has more recently been formalized in the FAIR principles: that all data objects be Findable, Accessible, Interoperable, and Reusable, both for machines and for people. Adherence to these principles is necessary for proper stewardship of information, for testing regulatory compliance, for measuring the efficiency of processes, and for facilitating reuse of data-analytical frameworks. FINDINGS: We present the Contextual Ontology-based Repository Analysis Library (CORAL), a platform that greatly facilitates adherence to all 4 of the FAIR principles, including the especially difficult challenge of making heterogeneous datasets Interoperable and Reusable across all parts of a large, long-lasting organization. To achieve this, CORAL's data model requires that data generators extensively document the context for all data, and our tools maintain that context throughout the entire analysis pipeline. CORAL also features a web interface for data generators to upload and explore data, as well as a Jupyter notebook interface for data analysts, both backed by a common API. CONCLUSIONS: CORAL enables organizations to build FAIR data types on the fly as they are needed, avoiding the expense of bespoke data modeling. CORAL provides a uniquely powerful platform to enable integrative cross-dataset analyses, generating deeper insights than are possible using traditional analysis tools.


Subjects
Anthozoa, Data Analysis, Animals
4.
Environ Microbiol ; 24(11): 5546-5560, 2022 11.
Article in English | MEDLINE | ID: mdl-36053980

ABSTRACT

Bacillus cereus strain CPT56D-587-MTF (CPTF) was isolated from the highly contaminated Oak Ridge Reservation (ORR) subsurface. This site is contaminated with high levels of nitric acid and multiple heavy metals. Amplicon sequencing of the 16S rRNA genes (V4 region) in sediment from this area revealed an amplicon sequence variant (ASV) with 100% identity to the CPTF 16S rRNA sequence. Notably, this CPTF-matching ASV had the highest relative abundance in this community survey, with a median relative abundance of 3.77%, and comprised 20%-40% of reads in some samples. Pangenomic analysis revealed that strain CPTF has expanded genomic content compared to other B. cereus species, largely due to plasmid acquisition and expansion of transposable elements. This suggests that these features are important for rapid adaptation to native environmental stressors. We connected genotype to phenotype in the context of the unique geochemistry of the site. These analyses revealed that certain genes (e.g. nitrate reductase, heavy metal efflux pumps) that allow this strain to successfully occupy the geochemically heterogeneous microniches of its native site are characteristic of the B. cereus species, while others, such as those for acid tolerance, are associated with mobile genetic elements and are generally unique to strain CPTF.


Subjects
Bacillus cereus, Heavy Metals, 16S Ribosomal RNA/genetics, Bacillus cereus/genetics, Genomics, Phylogeny
5.
Microbiol Resour Announc ; 11(5): e0014522, 2022 May 19.
Article in English | MEDLINE | ID: mdl-35475637

ABSTRACT

Bacillus cereus strain CPT56D-587-MTF was isolated from nitrate- and toxic metal-contaminated subsurface sediment at the Oak Ridge Reservation (ORR) (Oak Ridge, TN, USA). Here, we report the complete genome sequence of this strain to provide genomic insight into its strategies for survival at this mixed-waste site.

6.
Nucleic Acids Res ; 50(D1): D553-D559, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34850923

ABSTRACT

The Structural Classification of Proteins-extended (SCOPe, https://scop.berkeley.edu) knowledgebase aims to provide an accurate, detailed, and comprehensive description of the structural and evolutionary relationships amongst the majority of proteins of known structure, along with resources for analyzing the protein structures and their sequences. Structures from the PDB are divided into domains and classified using a combination of manual curation and highly precise automated methods. In the current release of SCOPe, 2.08, we have developed search and display tools for analysis of genetic variants we mapped to structures classified in SCOPe. In order to improve the utility of SCOPe to automated methods such as deep learning classifiers that rely on multiple alignment of sequences of homologous proteins, we have introduced new machine-parseable annotations that indicate aberrant structures as well as domains that are distinguished by a smaller repeat unit. We also classified structures from 74 of the largest Pfam families not previously classified in SCOPe, and we improved our algorithm to remove N- and C-terminal cloning, expression and purification sequences from SCOPe domains. SCOPe 2.08-stable classifies 106 976 PDB entries (about 60% of PDB entries).


Subjects
Computational Biology, Protein Databases, Proteins/classification, Algorithms, Chemical Databases, Gene Expression Regulation/genetics, Machine Learning, Proteins/genetics
7.
mSystems ; : e0053721, 2021 Jun 29.
Article in English | MEDLINE | ID: mdl-34184913

ABSTRACT

Viruses are ubiquitous microbiome components, shaping ecosystems via strain-specific predation, horizontal gene transfer and redistribution of nutrients through host lysis. Viral impacts are important in groundwater ecosystems, where microbes drive many nutrient fluxes and metabolic processes; however, little is known about the diversity of viruses in these environments. We analyzed four groundwater plasmidomes (the entire plasmid content of an environment) and identified 200 viral sequences, which clustered into 41 genus-level viral clusters (approximately equivalent to viral genera) including 9 known and 32 putative new genera. We used publicly available bacterial whole-genome sequences (WGS) and WGS from 261 bacterial isolates from this groundwater environment to identify potential viral hosts. We linked 76 of the 200 viral sequences to a range of bacterial phyla, the majority associated with Proteobacteria, followed by Firmicutes, Bacteroidetes, and Actinobacteria. The publicly available WGS enabled mapping bacterial hosts to several viral sequences. The WGS of groundwater isolates increased the depth of host prediction by allowing host identification at the strain level. The latter included 4 viruses that were almost entirely (>99% query coverage, >99% identity) identified as integrated in the genomes of Pseudomonas, Acidovorax, and Castellaniella strains, resulting in high-confidence host assignments. Lastly, 21 of these viruses carried putative auxiliary metabolite genes for metal and antibiotic resistance, which might drive their infection cycles and/or provide selective advantage to infected hosts. Exploring the groundwater virome provides a necessary foundation for integration of viruses into ecosystem models where they are key players in microbial adaptation to environmental stress.
IMPORTANCE To our knowledge, this is the first study to identify the bacteriophage distribution in a groundwater ecosystem, shedding light on their prevalence and distribution across metal-contaminated and background sites. Our study is uniquely based on selective sequencing of solely the extrachromosomal elements of a microbiome followed by analysis for viral signatures, thus establishing a more focused approach for phage identifications. Using this method, we detected several novel phage genera along with those previously established. Our approach of using the whole-genome sequences of hundreds of bacterial isolates from the same site enabled us to make host assignments with high confidence, several at strain levels. Certain phage genes suggest that they provide an environment-specific selective advantage to their bacterial hosts. Our study lays the foundation for future research on directed phage isolations using specific bacterial host strains to further characterize groundwater phages, their life cycles, and their effects on groundwater microbiome and biogeochemistry.

9.
mSystems ; 6(1), 2021 02 23.
Article in English | MEDLINE | ID: mdl-33622857

ABSTRACT

Microbiome samples are inherently defined by the environment in which they are found. Therefore, data that provide context and enable interpretation of measurements produced from biological samples, often referred to as metadata, are critical. Important contributions have been made in the development of community-driven metadata standards; however, these standards have not been uniformly embraced by the microbiome research community. To understand how these standards are being adopted, or the barriers to adoption, across research domains, institutions, and funding agencies, the National Microbiome Data Collaborative (NMDC) hosted a workshop in October 2019. This report provides a summary of discussions that took place throughout the workshop, as well as outcomes of the working groups initiated at the workshop.

10.
Front Microbiol ; 11: 587127, 2020.
Article in English | MEDLINE | ID: mdl-33193240

ABSTRACT

A nitrate- and metal-contaminated site at the Oak Ridge Reservation (ORR) was previously shown to contain the metal molybdenum (Mo) at picomolar concentrations. This potentially limits microbial nitrate reduction, as Mo is required by the enzyme nitrate reductase, which catalyzes the first step of nitrate removal. Enrichment for anaerobic nitrate-reducing microbes from contaminated sediment at the ORR yielded Bacillus strain EB106-08-02-XG196. This bacterium not only grows in the presence of multiple metals (Cd, Ni, Cu, Co, Mn, and U) but also grows better under low molybdate concentrations (<1 nM) than control strains, including Pseudomonas fluorescens N2E2, which was isolated from a pristine ORR environment. Molybdate is taken up by the molybdate binding protein, ModA, of the molybdate ATP-binding cassette transporter. ModA of XG196 is phylogenetically distinct from other characterized ModA proteins. The genes encoding ModA from XG196, P. fluorescens N2E2 and Escherichia coli K12 were expressed in E. coli and the recombinant proteins were purified. Isothermal titration calorimetry analysis showed that XG196 ModA has a higher affinity for molybdate than other ModA proteins, with a molybdate binding constant (KD) of 2.2 nM, about one order of magnitude lower than those of P. fluorescens N2E2 (27.0 nM) and E. coli K12 (25.0 nM). XG196 ModA also showed a fivefold higher affinity for molybdate than for tungstate (11 nM), whereas the ModA proteins from P. fluorescens N2E2 [KD(Mo) 27.0 nM, KD(W) 26.7 nM] and E. coli K12 [KD(Mo) 25.0 nM, KD(W) 23.8 nM] had similar affinities for the two oxyanions. We propose that high molybdate affinity coupled with resistance to multiple metals gives strain XG196 a competitive advantage in Mo-limited environments contaminated with high concentrations of metals and nitrate, as found at ORR.
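
The fold differences quoted in the abstract above ("about one order of magnitude", "fivefold") follow directly from the reported KD values. The short Python sketch below reproduces that arithmetic; the dictionary layout and the helper names `molybdate_fold_vs_xg196` and `tungstate_over_molybdate` are illustrative assumptions, and only the numeric values come from the abstract.

```python
# Binding constants (KD, in nM) as reported in the abstract.
KD_NM = {
    "XG196": {"Mo": 2.2, "W": 11.0},
    "P. fluorescens N2E2": {"Mo": 27.0, "W": 26.7},
    "E. coli K12": {"Mo": 25.0, "W": 23.8},
}


def molybdate_fold_vs_xg196(strain: str) -> float:
    """How many times larger a strain's KD(Mo) is than that of XG196.

    A larger KD means weaker binding, so a value near 10 corresponds to
    roughly one order of magnitude lower affinity than XG196.
    """
    return KD_NM[strain]["Mo"] / KD_NM["XG196"]["Mo"]


def tungstate_over_molybdate(strain: str) -> float:
    """Ratio KD(W) / KD(Mo); values above 1 indicate a preference for molybdate."""
    return KD_NM[strain]["W"] / KD_NM[strain]["Mo"]


for strain in KD_NM:
    print(
        f"{strain}: KD(Mo) is {molybdate_fold_vs_xg196(strain):.1f}x that of XG196, "
        f"KD(W)/KD(Mo) = {tungstate_over_molybdate(strain):.2f}"
    )
```

Ratios close to 1 for the two control strains correspond to the "similar affinities for the two oxyanions" noted in the abstract.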

11.
Microbiol Resour Announc ; 9(44), 2020 Oct 29.
Article in English | MEDLINE | ID: mdl-33122416

ABSTRACT

Bacillus sp. strain EB106-08-02-XG196 was isolated from a high-nitrate- and heavy metal-contaminated site at the Oak Ridge Reservation in Tennessee. We report the draft genome sequence of this strain to provide insights into the genomic basis for surviving in this unique environment.

12.
mBio ; 10(1), 2019 02 26.
Article in English | MEDLINE | ID: mdl-30808697

ABSTRACT

Naturally occurring plasmids constitute a major category of mobile genetic elements responsible for harboring and transferring genes important in survival and fitness. A targeted evaluation of plasmidomes can reveal unique adaptations required by microbial communities. We developed a model system to optimize plasmid DNA isolation procedures targeted to groundwater samples which are typically characterized by low cell density (and likely variations in the plasmid size and copy numbers). The optimized method resulted in successful identification of several hundred circular plasmids, including some large plasmids (11 plasmids more than 50 kb in size, with the largest being 1.7 Mb in size). Several interesting observations were made from the analysis of plasmid DNA isolated in this study. The plasmid pool (plasmidome) was more conserved than the corresponding microbiome distribution (16S rRNA based). The circular plasmids were diverse as represented by the presence of seven plasmid incompatibility groups. The genes carried on these groundwater plasmids were highly enriched in metal resistance. Results from this study confirmed that traits such as metal, antibiotic, and phage resistance along with toxin-antitoxin systems are encoded on abundant circular plasmids, all of which could confer novel and advantageous traits to their hosts. This study confirms the ecological role of the plasmidome in maintaining the latent capacity of a microbiome, enabling rapid adaptation to environmental stresses.
IMPORTANCE Plasmidomes have been typically studied in environments abundant in bacteria, and this is the first study to explore plasmids from an environment characterized by low cell density. We specifically target groundwater, a significant source of water for human/agriculture use. We used samples from a well-studied site and identified hundreds of circular plasmids, including one of the largest sizes reported in plasmidome studies. The striking similarity of the plasmid-borne ORFs in terms of taxonomical and functional classifications across several samples suggests a conserved plasmid pool, in contrast to the observed variability in the 16S rRNA-based microbiome distribution. Additionally, the stress response to environmental factors has stronger conservation via plasmid-borne genes as marked by abundance of metal resistance genes. Last, identification of novel and diverse plasmids enriches the existing plasmid database(s) and serves as a paradigm to increase the repertoire of biological parts that are available for modifying novel environmental strains.


Subjects
Bacterial Drug Resistance, Bacterial Genes, Groundwater/microbiology, Metals/toxicity, Plasmids/analysis, Plasmids/chemistry, Bacteria/classification, Bacteria/genetics, Cluster Analysis, Ribosomal DNA/chemistry, Ribosomal DNA/genetics, Genetic Variation, Phylogeny, 16S Ribosomal RNA/genetics, DNA Sequence Analysis
13.
Nucleic Acids Res ; 47(D1): D475-D481, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30500919

ABSTRACT

The SCOPe (Structural Classification of Proteins-extended, https://scop.berkeley.edu) database hierarchically classifies domains from the majority of proteins of known structure according to their structural and evolutionary relationships. SCOPe also incorporates and updates the ASTRAL compendium, which provides multiple databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. Protein structures are classified using a combination of manual curation and highly precise automated methods. In the current release of SCOPe, 2.07, we have focused our manual curation efforts on larger protein structures, including the spliceosome, proteasome and RNA polymerase I, as well as many other Pfam families that had not previously been classified. Domains from these large protein complexes are distinctive in several ways: novel non-globular folds are more common, and domains from previously observed protein families often have N- or C-terminal extensions that were disordered or not present in previous structures. The current monthly release update, SCOPe 2.07-2018-10-18, classifies 90 992 PDB entries (about two thirds of PDB entries).


Subjects
Protein Databases, Protein Domains, Multiprotein Complexes/chemistry, Proteasome Endopeptidase Complex/chemistry, Spliceosomes/chemistry
15.
Hum Mutat ; 38(9): 1155-1168, 2017 09.
Article in English | MEDLINE | ID: mdl-28397312

ABSTRACT

The CAGI-4 Hopkins clinical panel challenge was an attempt to assess state-of-the-art methods for clinical phenotype prediction from DNA sequence. Participants were provided with exonic sequences of 83 genes for 106 patients from the Johns Hopkins DNA Diagnostic Laboratory. Five groups participated in the challenge, predicting both the probability that each patient had each of the 14 possible classes of disease, as well as one or more causal variants. In cases where the Hopkins laboratory reported a variant, at least one predictor correctly identified the disease class in 36 of the 43 patients (84%). Even in cases where the Hopkins laboratory did not find a variant, at least one predictor correctly identified the class in 39 of the 63 patients (62%). Each prediction group correctly diagnosed at least one patient that was not successfully diagnosed by any other group. We discuss the causal variant predictions by different groups and their implications for further development of methods to assess variants of unknown significance. Our results suggest that clinically relevant variants may be missed when physicians order small panels targeted on a specific phenotype. We also quantify the false-positive rate of DNA-guided analysis in the absence of prior phenotypic indication.


Subjects
Computational Biology/methods, DNA Sequence Analysis/methods, Genetic Databases, Genetic Predisposition to Disease, Genetic Testing, Humans, Phenotype
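
The success rates in the CAGI-4 Hopkins clinical panel abstract above can be checked with a few lines of Python. This is only an arithmetic sketch: the helper name `percent` and the variable names are hypothetical, and the combined rate across all patients is derived here rather than stated in the abstract.

```python
def percent(hits: int, total: int) -> int:
    """Success rate rounded to the nearest whole percent."""
    return round(100 * hits / total)


# Counts taken from the abstract: (patients with a correct disease class, patients in group).
variant_reported = (36, 43)      # Hopkins laboratory reported a variant
no_variant_reported = (39, 63)   # Hopkins laboratory did not find a variant
total_patients = 106

# The two groups should partition the full cohort.
group_sizes_sum = variant_reported[1] + no_variant_reported[1]

print(percent(*variant_reported))     # rate when a variant was reported
print(percent(*no_variant_reported))  # rate when no variant was found

# Derived figure, not stated in the abstract: the rate over all patients.
combined_hits = variant_reported[0] + no_variant_reported[0]
print(percent(combined_hits, total_patients))
```

The two group sizes add up to the 106 patients in the challenge, so the reported percentages cover the whole cohort.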
16.
Environ Sci Technol ; 51(5): 2879-2889, 2017 03 07.
Article in English | MEDLINE | ID: mdl-28112946

ABSTRACT

Temporal variability complicates testing the influences of environmental variability on microbial community structure and thus function. An in-field bioreactor system was developed to assess oxic versus anoxic manipulations on in situ groundwater communities. Each sample was sequenced (16S SSU rRNA genes, average 10,000 reads), and biogeochemical parameters were monitored by quantifying 53 metals, 12 organic acids, 14 anions, and 3 sugars. Changes in dissolved oxygen (DO), pH, and other variables were similar across bioreactors. Sequencing revealed a complex community that fluctuated in step with the groundwater community and responded to DO. This also directly influenced the pH, and so the biotic impacts of DO and pH shifts are correlated. A null model demonstrated that bioreactor communities were driven in part not only by experimental conditions but also by stochastic variability and did not accurately capture alterations in diversity during perturbations. We identified two groups of abundant OTUs important to this system; one was abundant in high DO and pH and contained heterotrophs and oxidizers of iron, nitrite, and ammonium, whereas the other was abundant in low DO with the capability to reduce nitrate. In-field bioreactors are a powerful tool for capturing natural microbial community responses to alterations in geochemical factors beyond the bulk phase.


Subjects
Bacteria/genetics, Bioreactors, Groundwater/chemistry, Nitrites, 16S Ribosomal RNA/genetics
17.
J Mol Biol ; 429(3): 348-355, 2017 02 03.
Article in English | MEDLINE | ID: mdl-27914894

ABSTRACT

SCOPe (Structural Classification of Proteins-extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now separated into their own class, in order to distinguish them from the homology-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP.


Subjects
Protein Databases, Mutation, Proteins/classification, Artifacts, Molecular Cloning, Computational Biology, Tertiary Protein Structure, Proteins/chemistry
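
The statement in the SCOPe 2.06 abstract above that the release contains "double" the structures classified in SCOP can be verified with a small Python sketch. The variable names are hypothetical; the two counts are those quoted in the abstract, and the difference is a derived figure.

```python
# Entry counts quoted in the abstract.
scope_2_06_entries = 77_439   # Protein Data Bank entries in SCOPe 2.06
scop_entries = 38_221         # structures classified in SCOP

# Growth factor of SCOPe 2.06 relative to SCOP.
growth_factor = scope_2_06_entries / scop_entries

# Derived figure, not stated in the abstract: entries beyond the SCOP total.
additional_entries = scope_2_06_entries - scop_entries

print(f"growth factor: {growth_factor:.2f}")
print(f"additional entries: {additional_entries}")
```

A factor just above 2 matches the abstract's description of the release as double the size of SCOP.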
18.
Mol Cell Proteomics ; 15(6): 2186-202, 2016 06.
Article in English | MEDLINE | ID: mdl-27099342

ABSTRACT

Identifying protein-protein interactions (PPIs) at an acceptable false discovery rate (FDR) is challenging. Previously we identified several hundred PPIs from affinity purification-mass spectrometry (AP-MS) data for the bacteria Escherichia coli and Desulfovibrio vulgaris. These two interactomes have lower FDRs than any of the nine interactomes proposed previously for bacteria and are more enriched in PPIs validated by other data than the nine earlier interactomes. To more thoroughly determine the accuracy of our or other interactomes and to discover further PPIs de novo, here we present a quantitative tagless method that employs iTRAQ MS to measure the copurification of endogenous proteins through orthogonal chromatography steps. 5273 fractions from a four-step fractionation of a D. vulgaris protein extract were assayed, resulting in the detection of 1242 proteins. Protein partners from our D. vulgaris and E. coli AP-MS interactomes copurify as frequently as pairs belonging to three benchmark data sets of well-characterized PPIs. In contrast, the protein pairs from the nine other bacterial interactomes copurify two- to 20-fold less often. We also identify 200 high confidence D. vulgaris PPIs based on tagless copurification and colocalization in the genome. These PPIs are as strongly validated by other data as our AP-MS interactomes and overlap with our AP-MS interactome for D. vulgaris within 3% of expectation, once FDRs and false negative rates are taken into account. Finally, we reanalyzed data from two quantitative tagless screens of human cell extracts. We estimate that the novel PPIs reported in these studies have an FDR of at least 85% and find that less than 7% of the novel PPIs identified in each screen overlap. Our results establish that a quantitative tagless method can be used to validate and identify PPIs, but that such data must be analyzed carefully to minimize the FDR.


Subjects
Bacterial Proteins/metabolism, Desulfovibrio vulgaris/metabolism, Escherichia coli/metabolism, Proteomics/methods, Affinity Chromatography/methods, Mass Spectrometry/methods, Protein Interaction Mapping/methods, Protein Interaction Maps
19.
Mol Cell Proteomics ; 15(5): 1539-55, 2016 05.
Article in English | MEDLINE | ID: mdl-26873250

ABSTRACT

Numerous affinity purification-mass spectrometry (AP-MS) and yeast two-hybrid screens have each defined thousands of pairwise protein-protein interactions (PPIs), most of which are between functionally unrelated proteins. The accuracy of these networks, however, is under debate. Here, we present an AP-MS survey of the bacterium Desulfovibrio vulgaris together with a critical reanalysis of nine published bacterial yeast two-hybrid and AP-MS screens. We have identified 459 high confidence PPIs from D. vulgaris and 391 from Escherichia coli. Compared with the nine published interactomes, our two networks are smaller, are much less highly connected, and have significantly lower false discovery rates. In addition, our interactomes are much more enriched in protein pairs that are encoded in the same operon, have similar functions, and are reproducibly detected in other physical interaction assays than the pairs reported in prior studies. Our work establishes more stringent benchmarks for the properties of protein interactomes and suggests that bona fide PPIs much more frequently involve protein partners that are annotated with similar functions or that can be validated in independent assays than earlier studies suggested.


Subjects
Bacterial Proteins/metabolism, Computational Biology/methods, Desulfovibrio vulgaris/metabolism, Escherichia coli/metabolism, Affinity Chromatography, Protein Databases, Mass Spectrometry, Protein Interaction Mapping, Protein Interaction Maps, Proteomics/methods, Two-Hybrid System Techniques
20.
Proteins ; 83(11): 2025-38, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26313554

ABSTRACT

The Structural Classification of Proteins (SCOP) and Class, Architecture, Topology, Homology (CATH) databases have been valuable resources for protein structure classification for over 20 years. Development of SCOP (version 1) concluded in June 2009 with SCOP 1.75. The SCOPe (SCOP-extended) database offers continued development of the classic SCOP hierarchy, adding over 33,000 structures. We have attempted to assess the impact of these two-decade-old resources and guide future development. To this end, we surveyed recent articles to learn how structure classification data are used. Of 571 articles published in 2012-2013 that cite SCOP, 439 actually use data from the resource. We found that the type of use was fairly evenly distributed among four top categories: A) study protein structure or evolution (27% of articles), B) train and/or benchmark algorithms (28% of articles), C) augment non-SCOP datasets with SCOP classification (21% of articles), and D) examine the classification of one protein/a small set of proteins (22% of articles). Most articles described computational research, although 11% described purely experimental research, and a further 9% included both. We examined how CATH and SCOP were used in 158 articles that cited both databases: while some studies used only one dataset, the majority used data from both resources. Protein structure classification remains highly relevant for a diverse range of problems and settings.


Subjects
Proteins/chemistry, Proteins/classification, Algorithms, Computational Biology, Protein Databases, Protein Conformation
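
The survey figures in the SCOP/CATH usage abstract above lend themselves to a quick consistency check. The Python sketch below is illustrative only: the variable names are hypothetical, the counts and percentages are those quoted in the abstract, and the "remaining share" is a derived value that assumes the four categories do not overlap.

```python
# Counts quoted in the abstract.
citing_articles = 571   # articles from 2012-2013 that cite SCOP
using_articles = 439    # articles that actually use SCOP data

# Share of citing articles that use data from the resource, in percent.
usage_share = 100 * using_articles / citing_articles

# Percentages reported for the four top categories of use.
top_categories = {
    "A) study protein structure or evolution": 27,
    "B) train and/or benchmark algorithms": 28,
    "C) augment non-SCOP datasets": 21,
    "D) examine one protein / a small set": 22,
}
top_total = sum(top_categories.values())

# Derived under the assumption of non-overlapping categories:
# the share left for all other kinds of use.
remaining_share = 100 - top_total

print(f"{usage_share:.1f}% of citing articles use SCOP data")
print(f"top four categories cover {top_total}%, leaving {remaining_share}%")
```

The four categories sit within a few percentage points of one another, consistent with the abstract's description of a fairly even distribution.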