Results 1 - 20 of 29
1.
J Mol Evol ; 89(8): 544-553, 2021 10.
Article in English | MEDLINE | ID: mdl-34328525

ABSTRACT

The native subcellular location (also referred to as localization or cellular compartment) of a protein is the one in which it acts most frequently; it is one aspect of protein function. Do ten eukaryotic model organisms differ in their location spectrum, i.e., the fraction of each proteome in each of seven major cellular compartments? As experimental annotations of locations remain biased and incomplete, we need prediction methods to answer this question. After systematic bias corrections, the complete but faulty prediction methods appeared to be more appropriate to compare location spectra between species than the incomplete more accurate experimental data. This work compared the location spectra for ten eukaryotes: Homo sapiens (human), Gorilla gorilla (gorilla), Pan troglodytes (chimpanzee), Mus musculus (mouse), Rattus norvegicus (rat), Drosophila melanogaster (fruit/vinegar fly), Anopheles gambiae (African malaria mosquito), Caenorhabditis elegans (nematode), Saccharomyces cerevisiae (baker's yeast), and Schizosaccharomyces pombe (fission yeast). The two largest classes were predicted to be the nucleus and the cytoplasm together accounting for 47-62% of all proteins, while 7-21% of the proteins were predicted in the plasma membrane and 4-15% to be secreted. Overall, the predicted location spectra were largely similar. However, in detail, the differences sufficed to plot trees (UPGMA) and 2D (PCA) maps relating the ten organisms using a simple Euclidean distance in seven states (location classes). The relations based on the simple predicted location spectra captured aspects of cross-species comparisons usually revealed only by much more detailed evolutionary comparisons. Most interestingly, known phylogenetic relations were reproduced better by paralog-only than by ortholog-only trees.


Subject(s)
Drosophila melanogaster , Proteome , Animals , Drosophila , Drosophila melanogaster/genetics , Mice , Phylogeny , Proteome/genetics , Rats , Saccharomyces cerevisiae/genetics
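
The abstract above relates organisms through a Euclidean distance over seven location classes. The following is a minimal sketch of that distance only, assuming each organism is represented by a list of seven fractions; the function names and any input values are hypothetical and are not taken from the paper, and the UPGMA and PCA steps are not shown.

```python
import math


def euclidean_distance(spectrum_a, spectrum_b):
    """Euclidean distance between two location spectra of equal length."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must have the same number of classes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spectrum_a, spectrum_b)))


def pairwise_distances(spectra):
    """Map each unordered pair of organism names to its spectrum distance.

    `spectra` is a dict {name: [fraction per location class]}; names are
    sorted so every pair appears once as (earlier_name, later_name).
    """
    names = sorted(spectra)
    return {
        (p, q): euclidean_distance(spectra[p], spectra[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }
```

A distance table of this kind is the usual input to a hierarchical clustering routine; which implementation the authors used is not stated in the abstract.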
2.
Nucleic Acids Res ; 49(W1): W535-W540, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33999203

ABSTRACT

Since 1992 PredictProtein (https://predictprotein.org) has been a one-stop online resource for protein sequence analysis with its main site hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions. It pioneered combining evolutionary information and machine learning. Given a protein sequence as input, the server outputs multiple sequence alignments, predictions of protein structure in 1D and 2D (secondary structure, solvent accessibility, transmembrane segments, disordered regions, protein flexibility, and disulfide bridges) and predictions of protein function (functional effects of sequence variation or point mutations, Gene Ontology (GO) terms, subcellular localization, and protein-, RNA-, and DNA binding). PredictProtein's infrastructure has moved to the LCSB increasing throughput; the use of MMseqs2 sequence search reduced runtime five-fold (apparently without lowering performance of prediction methods); user interface elements improved usability, and new prediction methods were added. PredictProtein recently included predictions from deep learning embeddings (GO and secondary structure) and a method for the prediction of proteins and residues binding DNA, RNA, or other proteins. PredictProtein.org aspires to provide reliable predictions to computational and experimental biologists alike. All scripts and methods are freely available for offline execution in high-throughput settings.


Subject(s)
Protein Conformation , Software , Binding Sites , Coronavirus Nucleocapsid Proteins/chemistry , DNA-Binding Proteins/chemistry , Phosphoproteins/chemistry , Protein Structure, Secondary , Proteins/chemistry , Proteins/physiology , RNA-Binding Proteins/chemistry , Sequence Alignment , Sequence Analysis, Protein
3.
Curr Protoc Bioinformatics ; 69(1): e97, 2020 03.
Article in English | MEDLINE | ID: mdl-32150354

ABSTRACT

Visualizing protein data remains a challenging and stimulating task. Useful and intuitive visualization tools may help advance biomolecular and medical research; unintuitive tools may bar important breakthroughs. This protocol describes two use cases for the CellMap (http://cellmap.protein.properties) web tool. The tool allows researchers to visualize human protein-protein interaction data constrained by protein subcellular localizations. In the simplest form, proteins are visualized on cell images that also show protein-protein interactions (PPIs) through lines (edges) connecting the proteins across the compartments. At a glance, this simultaneously highlights spatial constraints that proteins are subject to in their physical environment and visualizes PPIs against these localizations. Visualizing two realities helps in decluttering the protein interaction visualization from "hairball" phenomena that arise when single proteins or groups thereof interact with hundreds of partners. © 2019 The Authors. Basic Protocol 1: Visualizing proteins and their interactions on cell images Basic Protocol 2: Displaying all interaction partners for a protein.


Subject(s)
Cells/metabolism , Imaging, Three-Dimensional , Protein Interaction Mapping , Software , Humans , Proteins/metabolism , Subcellular Fractions/metabolism
4.
BMC Bioinformatics ; 20(1): 727, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31861997

ABSTRACT

Following publication of the original article [1], the author reported that an incorrect figure has been published as Figure 2. The correct Figure 2 is shown below.

5.
BMC Bioinformatics ; 20(1): 205, 2019 Apr 23.
Article in English | MEDLINE | ID: mdl-31014229

ABSTRACT

BACKGROUND: Sub-nuclear structures or locations are associated with various nuclear processes. Proteins localized in these substructures are important to understand the interior nuclear mechanisms. Despite advances in high-throughput methods, experimental protein annotations remain limited. Predictions of cellular compartments have become very accurate, largely at the expense of leaving out substructures inside the nucleus making a fine-grained analysis impossible. RESULTS: Here, we present a new method (LocNuclei) that predicts nuclear substructures from sequence alone. LocNuclei used a string-based Profile Kernel with Support Vector Machines (SVMs). It distinguishes sub-nuclear localization in 13 distinct substructures and distinguishes between nuclear proteins confined to the nucleus and those that are also native to other compartments (traveler proteins). High performance was achieved by implicitly leveraging a large biological knowledge-base in creating predictions by homology-based inference through BLAST. Using this approach, the performance reached AUC = 0.70-0.74 and Q13 = 59-65%. Travelling proteins (nucleus and other) were identified at Q2 = 70-74%. A Gene Ontology (GO) analysis of the enrichment of biological processes revealed that the predicted sub-nuclear compartments matched the expected functionality. Analysis of protein-protein interactions (PPIs) shows that the formation of compartments and the functionality of proteins in these compartments rely heavily on interactions between proteins. This suggested that the LocNuclei predictions carry important information about function. The source code and data sets are available through GitHub: https://github.com/Rostlab/LocNuclei. CONCLUSIONS: LocNuclei predicts subnuclear compartments and traveler proteins accurately. These predictions carry important information about functionality and PPIs.


Subject(s)
Cell Nucleus/chemistry , Computational Biology/methods , Nuclear Proteins , Sequence Analysis, Protein/methods , Nuclear Proteins/chemistry , Nuclear Proteins/classification , Nuclear Proteins/physiology , Proteins/chemistry , Proteins/classification , Proteins/physiology , Support Vector Machine
6.
Cancer Med ; 8(1): 128-136, 2019 01.
Article in English | MEDLINE | ID: mdl-30561851

ABSTRACT

BACKGROUND: For Glioblastoma (GBM), various prognostic nomograms have been proposed. This study aims to evaluate machine learning models to predict patients' overall survival (OS) and progression-free survival (PFS) on the basis of clinical, pathological, semantic MRI-based, and FET-PET/CT-derived information. Finally, the value of adding treatment features was evaluated. METHODS: One hundred and eighty-nine patients were retrospectively analyzed. We assessed clinical, pathological, and treatment information. The VASARI set of semantic imaging features was determined on MRIs. Metabolic information was retained from preoperative FET-PET/CT images. We generated multiple random survival forest prediction models on a patient training set and performed internal validation. Single feature class models were created including "clinical," "pathological," "MRI-based," and "FET-PET/CT-based" models, as well as combinations. Treatment features were combined with all other features. RESULTS: Of all single feature class models, the MRI-based model had the highest prediction performance on the validation set for OS (C-index: 0.61 [95% confidence interval: 0.51-0.72]) and PFS (C-index: 0.61 [0.50-0.72]). The combination of all features did increase performance above all single feature class models up to C-indices of 0.70 (0.59-0.84) and 0.68 (0.57-0.78) for OS and PFS, respectively. Adding treatment information further increased prognostic performance up to C-indices of 0.73 (0.62-0.84) and 0.71 (0.60-0.81) on the validation set for OS and PFS, respectively, allowing significant stratification of patient groups for OS. CONCLUSIONS: MRI-based features were the most relevant feature class for prognostic assessment. Combining clinical, pathological, and imaging information increased predictive power for OS and PFS. A further increase was achieved by adding treatment features.


Subject(s)
Brain Neoplasms/classification , Glioblastoma/classification , Machine Learning , Models, Theoretical , Adult , Aged , Aged, 80 and over , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Brain Neoplasms/radiotherapy , Chemotherapy, Adjuvant , Female , Glioblastoma/diagnostic imaging , Glioblastoma/pathology , Glioblastoma/radiotherapy , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Multimodal Imaging , Positron Emission Tomography Computed Tomography , Prognosis , Survival Analysis , Young Adult
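
The survival models in the abstract above are scored by C-index. As a reading aid, here is a small sketch of a concordance index in the style of Harrell's C for right-censored data. It is a generic textbook form written for illustration, not the study's actual evaluation code; the function name and the convention that a higher risk score means earlier expected failure are assumptions, and pairs with tied times are simply skipped.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    times  -- follow-up time per patient
    events -- 1/True if the event was observed, 0/False if censored
    risks  -- predicted risk; higher is assumed to mean earlier failure
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only when the patient with the strictly
            # shorter time actually had the event (was not censored).
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under this definition 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.61 to 0.73.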
7.
Phys Med ; 48: 27-36, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29728226

ABSTRACT

PURPOSE: Noticing the fast growing translation of artificial intelligence (AI) technologies to medical image analysis this paper emphasizes the future role of the medical physicist in this evolving field. Specific challenges are addressed when implementing big data concepts with high-throughput image data processing like radiomics and machine learning in a radiooncology environment to support clinical decisions. METHODS: Based on the experience of our interdisciplinary radiomics working group, techniques for processing minable data, extracting radiomics features and associating this information with clinical, physical and biological data for the development of prediction models are described. A special emphasis was placed on the potential clinical significance of such an approach. RESULTS: Clinical studies demonstrate the role of radiomics analysis as an additional independent source of information with the potential to influence the radiooncology practice, i.e. to predict patient prognosis, treatment response and underlying genetic changes. Extending the radiomics approach to integrate imaging, clinical, genetic and dosimetric data ('panomics') challenges the medical physicist as member of the radiooncology team. CONCLUSIONS: The new field of big data processing in radiooncology offers opportunities to support clinical decisions, to improve predicting treatment outcome and to stimulate fundamental research on radiation response both of tumor and normal tissue. The integration of physical data (e.g. treatment planning, dosimetric, image guidance data) demands an involvement of the medical physicist in the radiomics approach of radiooncology. To cope with this challenge national and international organizations for medical physics should organize more training opportunities in artificial intelligence technologies in radiooncology.


Subject(s)
Artificial Intelligence , Diagnostic Imaging , Image Processing, Computer-Assisted/methods , Neoplasms/diagnostic imaging , Physics , Humans
8.
Strahlenther Onkol ; 194(9): 824-834, 2018 09.
Article in English | MEDLINE | ID: mdl-29557486

ABSTRACT

BACKGROUND AND PURPOSE: Current prognostic models for soft tissue sarcoma (STS) patients are solely based on staging information. Treatment-related data have not been included to date. Including such information, however, could help to improve these models. MATERIALS AND METHODS: A single-center retrospective cohort of 136 STS patients treated with radiotherapy (RT) was analyzed for patients' characteristics, staging information, and treatment-related data. Therapeutic imaging studies and pathology reports of neoadjuvantly treated patients were analyzed for signs of response. Random forest machine learning-based models were used to predict patients' death and disease progression at 2 years. Pre-treatment and treatment models were compared. RESULTS: The prognostic models achieved high performances. Using treatment features improved the overall performance for all three classification types: prediction of death, and of local and systemic progression (area under the receiver operating characteristic curve (AUC) of 0.87, 0.88, and 0.84, respectively). Overall, RT-related features, such as the planning target volume and total dose, had preeminent importance for prognostic performance. Therapy response features were selected for prediction of disease progression. CONCLUSIONS: A machine learning-based prognostic model combining known prognostic factors with treatment- and response-related information showed high accuracy for individualized risk assessment. This model could be used for adjustments of follow-up procedures.


Subject(s)
Machine Learning , Proportional Hazards Models , Sarcoma/pathology , Sarcoma/radiotherapy , Adult , Aged , Aged, 80 and over , Cohort Studies , Disease Progression , Female , Humans , Male , Middle Aged , Neoadjuvant Therapy , Neoplasm Staging , Prognosis , Retrospective Studies , Risk Assessment , Sarcoma/mortality , Survival Rate
9.
BMC Bioinformatics ; 19(1): 15, 2018 01 17.
Article in English | MEDLINE | ID: mdl-29343218

ABSTRACT

BACKGROUND: The subcellular localization of a protein is an important aspect of its function. However, the experimental annotation of locations is not even complete for well-studied model organisms. Text mining might aid database curators to add experimental annotations from the scientific literature. Existing extraction methods have difficulty distinguishing relationships between proteins and cellular locations co-mentioned in the same sentence. RESULTS: LocText was created as a new method to extract protein locations from abstracts and full texts. LocText learned patterns from syntax parse trees and was trained and evaluated on a newly improved LocTextCorpus. Combined with an automatic named-entity recognizer, LocText achieved high precision (P = 86%±4). After completing development, we mined the latest research publications for three organisms: human (Homo sapiens), budding yeast (Saccharomyces cerevisiae), and thale cress (Arabidopsis thaliana). Examining 60 novel, text-mined annotations, we found that 65% (human), 85% (yeast), and 80% (cress) were correct. Of all validated annotations, 40% were completely novel, i.e. appeared neither in the annotations nor in the text descriptions of Swiss-Prot. CONCLUSIONS: LocText provides a cost-effective, semi-automated workflow to assist database curators in identifying novel protein localization annotations. The annotations suggested through text-mining would be verified by experts to guarantee high-quality standards of manually-curated databases such as Swiss-Prot.


Subject(s)
Data Mining , Databases, Protein , Proteins/metabolism , Software , Gene Ontology , Humans , Molecular Sequence Annotation
10.
Nucleic Acids Res ; 46(D1): D503-D508, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106588

ABSTRACT

NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/.


Subject(s)
Active Transport, Cell Nucleus/genetics , Databases, Genetic , Molecular Sequence Annotation , Nuclear Export Signals/genetics , Nuclear Localization Signals/chemistry , User-Computer Interface , Amino Acid Sequence , Animals , Arabidopsis/genetics , Arabidopsis/metabolism , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Cell Nucleus/metabolism , Datasets as Topic , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Eukaryotic Cells/metabolism , Humans , Internet , Mice , Nuclear Localization Signals/genetics , Nuclear Localization Signals/metabolism , Oryza/genetics , Oryza/metabolism , Rats , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism
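
The NLSdb abstract above describes a specificity check (step 1: no known non-nuclear protein may match a motif) and reports coverage of nuclear proteins. The sketch below illustrates only that check, assuming motifs are written as regular expressions and proteins as plain amino-acid strings; the function names are hypothetical, and the sensitivity step and the iteration between steps are not shown.

```python
import re


def evaluate_motif(pattern, nuclear, non_nuclear):
    """Return (coverage of nuclear proteins, number of non-nuclear hits)."""
    regex = re.compile(pattern)
    true_hits = sum(1 for seq in nuclear if regex.search(seq))
    false_hits = sum(1 for seq in non_nuclear if regex.search(seq))
    coverage = true_hits / len(nuclear) if nuclear else 0.0
    return coverage, false_hits


def keep_specific(patterns, nuclear, non_nuclear):
    """Keep only candidate motifs that match no non-nuclear protein."""
    return [
        p for p in patterns
        if evaluate_motif(p, nuclear, non_nuclear)[1] == 0
    ]
```

In this framing, the reported result corresponds to the retained motif set reaching 35% coverage of the nuclear proteins with zero hits among the non-nuclear ones.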
11.
PLoS One ; 12(9): e0184119, 2017.
Article in English | MEDLINE | ID: mdl-28902868

ABSTRACT

In the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined at two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure. All intergenic open reading frames potentially encoding a protein of ≥ 30 amino acids were investigated with regard to coverage by transcription and translation signals and their translatability expressed by the ribosomal coverage value. This led to discovery of 465 unique, putative novel genes not yet annotated in this E. coli strain, which are evenly distributed over both DNA strands of the genome. For 255 of the novel genes, annotated homologs in other bacteria were found, and a machine-learning algorithm, trained on small protein-coding E. coli genes, predicted that 89% of these translated open reading frames represent bona fide genes. The remaining 210 putative novel genes without annotated homologs were compared to the 255 novel genes with homologs and to 250 short annotated genes of this E. coli strain. All three groups turned out to be similar with respect to their translatability distribution, fractions of differentially regulated genes, secondary structure composition, and the distribution of evolutionary constraint, suggesting that both novel groups represent legitimate genes. However, the machine-learning algorithm only recognized a small fraction of the 210 genes without annotated homologs. It is possible that these genes represent a novel group of genes, which have unusual features dissimilar to the genes of the machine-learning algorithm training set.


Subject(s)
DNA, Intergenic/genetics , Escherichia coli O157/genetics , Genes, Bacterial , Genome, Bacterial , Conserved Sequence , DNA, Bacterial/genetics , Genetic Association Studies , High-Throughput Nucleotide Sequencing , Open Reading Frames/genetics , RNA, Bacterial/genetics , Transcriptome
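
The abstract above investigates open reading frames that could encode proteins of at least 30 amino acids on both DNA strands. The sketch below shows one simple way such candidates could be enumerated, under stated assumptions: only ATG is treated as a start codon (bacteria also use alternatives such as GTG and TTG), the input contains only A, C, G and T, and no intergenic filtering or expression evidence is applied. The function names are hypothetical and this is not the authors' pipeline.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=30):
    """List (strand, start, end) for ORFs encoding at least min_aa residues.

    Coordinates are 0-based, end-exclusive, include the stop codon, and are
    given relative to the scanned strand (the reverse complement for "-").
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(s) - 2, 3):
                codon = s[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOPS:
                    n_aa = (pos - start) // 3  # codons before the stop
                    if n_aa >= min_aa:
                        orfs.append((strand, start, pos + 3))
                    start = None
    return orfs
```

Candidates found this way would still need the transcription and translation coverage checks described in the abstract before being called putative genes.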
12.
Environ Microbiol Rep ; 9(5): 537-549, 2017 10.
Article in English | MEDLINE | ID: mdl-28618195

ABSTRACT

Desulfoluna spongiiphila strain AA1 is an organohalide respiring bacterium, isolated from the marine sponge Aplysina aerophoba, that can use brominated and iodinated phenols, in addition to sulfate and thiosulfate as terminal electron acceptors. The genome of Desulfoluna spongiiphila strain AA1 is approximately 6.5 Mb. Three putative reductive dehalogenase (rdhA) genes involved in respiratory metabolism of organohalides were identified within the sequence. Conserved motifs found in respiratory reductive dehalogenases (a twin arginine translocation signal sequence and two iron-sulfur clusters) were present in all three putative AA1 rdhA genes. Transcription of one of the three rdhA genes was significantly upregulated during respiration of 2,6-dibromophenol and sponge extracts. Strain AA1 appears to have the ability to synthesize cobalamin, the key cofactor of most characterized reductive dehalogenase enzymes. The genome contains genes involved in cobalamin synthesis and uptake and can grow without cobalamin supplementation. Identification of this target gene associated with debromination lays the foundation for understanding how dehalogenating bacteria control the fate of organohalide compounds in sponges and their role in a symbiotic organobromine cycle. In the sponge environment, D. spongiiphila strain AA1 may thus take advantage of both brominated compounds and sulfate as electron acceptors for respiration.


Subject(s)
Deltaproteobacteria/enzymology , Oxidoreductases/metabolism , Porifera/microbiology , Animals , Corrinoids/biosynthesis , Deltaproteobacteria/classification , Deltaproteobacteria/genetics , Deltaproteobacteria/metabolism , Genes, Bacterial , Genome, Bacterial , Genomics/methods , Multigene Family , Oxidoreductases/genetics , Phylogeny
13.
F1000Res ; 6: 1824, 2017.
Article in English | MEDLINE | ID: mdl-29497493

ABSTRACT

Many tools visualize protein-protein interaction (PPI) networks. The tool introduced here, CellMap, adds one crucial novelty by visualizing PPI networks in the context of subcellular localization, i.e. the location in the cell or cellular component in which a PPI happens. Users can upload images of cells and define areas of interest against which PPIs for selected proteins are displayed (by default on a cartoon of a cell). Annotations of localization are provided by the user or through our in-house database. The visualizer and server are written in JavaScript, making CellMap easy to customize and to extend by researchers and developers.

14.
Sci Rep ; 6: 34516, 2016 10 07.
Article in English | MEDLINE | ID: mdl-27713481

ABSTRACT

Type III secretion system is a key bacterial symbiosis and pathogenicity mechanism responsible for a variety of infectious diseases, ranging from food-borne illnesses to the bubonic plague. In many Gram-negative bacteria, the type III secretion system transports effector proteins into host cells, converting resources to bacterial advantage. Here we introduce a computational method that identifies type III effectors by combining homology-based inference with de novo predictions, reaching up to 3-fold higher performance than existing tools. Our work reveals that signals for recognition and transport of effectors are distributed over the entire protein sequence instead of being confined to the N-terminus, as was previously thought. Our scan of hundreds of prokaryotic genomes identified previously unknown effectors, suggesting that type III secretion may have evolved prior to the archaea/bacteria split. Crucially, our method performs well for short sequence fragments, facilitating evaluation of microbial communities and rapid identification of bacterial pathogenicity - no genome assembly required. pEffect and its data sets are available at http://services.bromberglab.org/peffect.


Subject(s)
Bacterial Proteins/metabolism , Computational Biology/methods , Type III Secretion Systems/metabolism , Bacterial Proteins/genetics , Genome, Bacterial , Proteomics
15.
Bioinformatics ; 32(22): 3501-3503, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27412096

ABSTRACT

The MSAViewer is a quick and easy visualization and analysis JavaScript component for Multiple Sequence Alignment data of any size. Core features include interactive navigation through the alignment, application of popular color schemes, sorting, selecting and filtering. The MSAViewer is 'web ready': written entirely in JavaScript, compatible with modern web browsers and does not require any specialized software. The MSAViewer is part of the BioJS collection of components. AVAILABILITY AND IMPLEMENTATION: The MSAViewer is released as open source software under the Boost Software License 1.0. Documentation, source code and the viewer are available at http://msa.biojs.net/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: msa@bio.sh.


Subject(s)
Sequence Alignment , Software , Programming Languages , Web Browser
17.
BMC Genomics ; 17: 133, 2016 Feb 24.
Article in English | MEDLINE | ID: mdl-26911138

ABSTRACT

BACKGROUND: Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). RESULTS: Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. CONCLUSIONS: These findings demonstrate that ribosomal footprinting can be used to detect novel protein coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.


Subject(s)
Escherichia coli O157/genetics , Evolution, Molecular , Genes, Bacterial , Proteome/genetics , Transcriptome , Animals , Cattle , Computational Biology , Escherichia coli Proteins/genetics , Mass Spectrometry , Phenotype , RNA, Bacterial/genetics , Sequence Analysis, RNA
18.
Nucleic Acids Res ; 44(D1): D38-47, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26538599

ABSTRACT

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.


Subject(s)
Computational Biology , Registries , Data Curation , Software
19.
F1000Res ; 4: 1222, 2015.
Article in English | MEDLINE | ID: mdl-26673203

ABSTRACT

Recent experiments established that a culture of Saccharomyces cerevisiae (baker's yeast) survives sudden high temperatures by specifically duplicating the entire chromosome III and two chromosomal fragments (from IV and XII). Heat shock proteins (HSPs) are not significantly over-abundant in the duplication. In contrast, we suggest a simple algorithm to "postdict" the experimental results: Find a small enough chromosome with minimal protein disorder and duplicate this region. This algorithm largely explains all observed duplications. In particular, all regions duplicated in the experiment reduced the overall content of protein disorder. The differential analysis of the functional makeup of the duplication remained inconclusive. Gene Ontology (GO) enrichment suggested over-representation in processes related to reproduction and nutrient uptake. Analyzing the protein-protein interaction network (PPI) revealed that few network-central proteins were duplicated. The predictive hypothesis hinges upon the concept of reducing proteins with long regions of disorder in order to become less sensitive to heat shock attack.

20.
Elife ; 4, 2015 Jul 08.
Article in English | MEDLINE | ID: mdl-26153621

ABSTRACT

BioJS is an open source software project that develops visualization tools for different types of biological data. Here we report on the factors that influenced the growth of the BioJS user and developer community, and outline our strategy for building on this growth. The lessons we have learned on BioJS may also be relevant to other open source software projects.


Subject(s)
Biological Science Disciplines/methods , Computational Biology/methods , Software