Results 1 - 20 of 29
1.
Nucleic Acids Res ; 49(W1): W535-W540, 2021 07 02.
Article in English | MEDLINE | ID: mdl-33999203

ABSTRACT

Since 1992, PredictProtein (https://predictprotein.org) has been a one-stop online resource for protein sequence analysis, with its main site hosted at the Luxembourg Centre for Systems Biomedicine (LCSB) and queried monthly by over 3,000 users in 2020. PredictProtein was the first Internet server for protein predictions. It pioneered combining evolutionary information and machine learning. Given a protein sequence as input, the server outputs multiple sequence alignments, predictions of protein structure in 1D and 2D (secondary structure, solvent accessibility, transmembrane segments, disordered regions, protein flexibility, and disulfide bridges) and predictions of protein function (functional effects of sequence variation or point mutations, Gene Ontology (GO) terms, subcellular localization, and protein-, RNA-, and DNA binding). PredictProtein's infrastructure has moved to the LCSB, increasing throughput; the use of MMseqs2 sequence search reduced runtime five-fold (apparently without lowering performance of prediction methods); user interface elements improved usability; and new prediction methods were added. PredictProtein recently included predictions from deep learning embeddings (GO and secondary structure) and a method for the prediction of proteins and residues binding DNA, RNA, or other proteins. PredictProtein.org aspires to provide reliable predictions to computational and experimental biologists alike. All scripts and methods are freely available for offline execution in high-throughput settings.


Subjects
Protein Conformation , Software , Binding Sites , Coronavirus Nucleocapsid Proteins/chemistry , DNA-Binding Proteins/chemistry , Phosphoproteins/chemistry , Secondary Protein Structure , Proteins/chemistry , Proteins/physiology , RNA-Binding Proteins/chemistry , Sequence Alignment , Protein Sequence Analysis
2.
J Mol Evol ; 89(8): 544-553, 2021 10.
Article in English | MEDLINE | ID: mdl-34328525

ABSTRACT

The native subcellular location (also referred to as localization or cellular compartment) of a protein is the one in which it acts most frequently; it is one aspect of protein function. Do ten eukaryotic model organisms differ in their location spectrum, i.e., the fraction of their proteome in each of seven major cellular compartments? As experimental annotations of locations remain biased and incomplete, we need prediction methods to answer this question. After systematic bias corrections, the complete but faulty prediction methods appeared to be more appropriate to compare location spectra between species than the incomplete but more accurate experimental data. This work compared the location spectra for ten eukaryotes: Homo sapiens (human), Gorilla gorilla (gorilla), Pan troglodytes (chimpanzee), Mus musculus (mouse), Rattus norvegicus (rat), Drosophila melanogaster (fruit/vinegar fly), Anopheles gambiae (African malaria mosquito), Caenorhabditis elegans (nematode), Saccharomyces cerevisiae (baker's yeast), and Schizosaccharomyces pombe (fission yeast). The two largest classes were predicted to be the nucleus and the cytoplasm, together accounting for 47-62% of all proteins, while 7-21% of the proteins were predicted in the plasma membrane and 4-15% to be secreted. Overall, the predicted location spectra were largely similar. However, in detail, the differences sufficed to plot trees (UPGMA) and 2D (PCA) maps relating the ten organisms using a simple Euclidean distance in seven states (location classes). The relations based on the simple predicted location spectra captured aspects of cross-species comparisons usually revealed only by much more detailed evolutionary comparisons. Most interestingly, known phylogenetic relations were reproduced better by paralog-only than by ortholog-only trees.
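
The distance underlying the trees and maps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the organism names and the seven-class spectra are hypothetical placeholder values, and only the Euclidean distance itself is taken from the abstract.

```python
import math

def spectrum_distance(a, b):
    """Euclidean distance between two location spectra.

    Each spectrum is a list of fractions, one per location class,
    given in the same class order for both organisms.
    """
    if len(a) != len(b):
        raise ValueError("spectra must cover the same location classes")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical spectra over seven location classes (illustrative values only).
spectra = {
    "organism_A": [0.30, 0.25, 0.15, 0.10, 0.08, 0.07, 0.05],
    "organism_B": [0.28, 0.27, 0.14, 0.11, 0.08, 0.07, 0.05],
    "organism_C": [0.20, 0.30, 0.10, 0.15, 0.10, 0.10, 0.05],
}

def distance_matrix(spectra):
    """All pairwise distances, keyed by (name, name) tuples."""
    names = sorted(spectra)
    return {
        (p, q): spectrum_distance(spectra[p], spectra[q])
        for p in names
        for q in names
    }

matrix = distance_matrix(spectra)
```

A matrix like this is the usual input to average-linkage clustering (UPGMA) or to a PCA projection; which tools the study used for those steps is not stated in the abstract.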


Subjects
Drosophila melanogaster , Proteome , Animals , Drosophila , Drosophila melanogaster/genetics , Mice , Phylogeny , Proteome/genetics , Rats , Saccharomyces cerevisiae/genetics
3.
Nucleic Acids Res ; 46(D1): D503-D508, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106588

ABSTRACT

NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/.
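
The specificity criterion in step 1 and the coverage figure reported above can be illustrated with a short sketch. It assumes motifs can be written as regular expressions; the sequences and motifs below are invented toy data, and the function names are hypothetical rather than part of NLSdb.

```python
import re

def matches_any(motif, sequences):
    """True if the motif (a regular expression) occurs in any sequence."""
    pattern = re.compile(motif)
    return any(pattern.search(seq) for seq in sequences)

def keep_specific(motifs, non_nuclear):
    """Step 1 criterion: keep only motifs matching no known non-nuclear protein."""
    return [m for m in motifs if not matches_any(m, non_nuclear)]

def coverage(motifs, sequences):
    """Fraction of sequences matched by at least one motif."""
    if not sequences:
        return 0.0
    patterns = [re.compile(m) for m in motifs]
    hits = sum(1 for seq in sequences if any(p.search(seq) for p in patterns))
    return hits / len(sequences)

# Toy data, invented for illustration only.
nuclear = ["MAPKKKRKVLE", "MSTRKRPAAAG", "MGGSLLAAA"]
non_nuclear = ["MKKLLAGG", "MSTAAKRG"]
candidates = ["KK.RK", "KR", "RKRP"]

kept = keep_specific(candidates, non_nuclear)   # "KR" is dropped
nuclear_coverage = coverage(kept, nuclear)      # 2 of 3 nuclear sequences matched
```

Relaxing a kept motif to match more families (step 2) and re-running `keep_specific` would mirror the iteration the abstract describes.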


Subjects
Cell Nucleus Active Transport/genetics , Genetic Databases , Molecular Sequence Annotation , Nuclear Export Signals/genetics , Nuclear Localization Signals/chemistry , User-Computer Interface , Amino Acid Sequence , Animals , Arabidopsis/genetics , Arabidopsis/metabolism , Caenorhabditis elegans/genetics , Caenorhabditis elegans/metabolism , Cell Nucleus/metabolism , Datasets as Topic , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Eukaryotic Cells/metabolism , Humans , Internet , Mice , Nuclear Localization Signals/genetics , Nuclear Localization Signals/metabolism , Oryza/genetics , Oryza/metabolism , Rats , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Schizosaccharomyces/genetics , Schizosaccharomyces/metabolism
4.
BMC Bioinformatics ; 20(1): 727, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31861997

ABSTRACT

Following publication of the original article [1], the author reported that an incorrect figure has been published as Figure 2. The correct Figure 2 is shown below.

5.
BMC Bioinformatics ; 20(1): 205, 2019 Apr 23.
Article in English | MEDLINE | ID: mdl-31014229

ABSTRACT

BACKGROUND: Sub-nuclear structures or locations are associated with various nuclear processes. Proteins localized in these substructures are important to understand the interior nuclear mechanisms. Despite advances in high-throughput methods, experimental protein annotations remain limited. Predictions of cellular compartments have become very accurate, largely at the expense of leaving out substructures inside the nucleus, making a fine-grained analysis impossible. RESULTS: Here, we present a new method (LocNuclei) that predicts nuclear substructures from sequence alone. LocNuclei used a string-based Profile Kernel with Support Vector Machines (SVMs). It distinguishes sub-nuclear localization in 13 distinct substructures and distinguishes between nuclear proteins confined to the nucleus and those that are also native to other compartments (traveler proteins). High performance was achieved by implicitly leveraging a large biological knowledge-base in creating predictions by homology-based inference through BLAST. Using this approach, the performance reached AUC = 0.70-0.74 and Q13 = 59-65%. Travelling proteins (nucleus and other) were identified at Q2 = 70-74%. A Gene Ontology (GO) analysis of the enrichment of biological processes revealed that the predicted sub-nuclear compartments matched the expected functionality. Analysis of protein-protein interactions (PPI) showed that formation of compartments and functionality of proteins in these compartments rely heavily on interactions between proteins. This suggested that the LocNuclei predictions carry important information about function. The source code and data sets are available through GitHub: https://github.com/Rostlab/LocNuclei. CONCLUSIONS: LocNuclei predicts subnuclear compartments and traveler proteins accurately. These predictions carry important information about functionality and PPIs.


Subjects
Cell Nucleus/chemistry , Computational Biology/methods , Nuclear Proteins , Protein Sequence Analysis/methods , Nuclear Proteins/chemistry , Nuclear Proteins/classification , Nuclear Proteins/physiology , Proteins/chemistry , Proteins/classification , Proteins/physiology , Support Vector Machine
6.
BMC Bioinformatics ; 19(1): 15, 2018 01 17.
Article in English | MEDLINE | ID: mdl-29343218

ABSTRACT

BACKGROUND: The subcellular localization of a protein is an important aspect of its function. However, the experimental annotation of locations is not even complete for well-studied model organisms. Text mining might aid database curators to add experimental annotations from the scientific literature. Existing extraction methods have difficulty distinguishing relationships between proteins and cellular locations co-mentioned in the same sentence. RESULTS: LocText was created as a new method to extract protein locations from abstracts and full texts. LocText learned patterns from syntax parse trees and was trained and evaluated on a newly improved LocTextCorpus. Combined with an automatic named-entity recognizer, LocText achieved high precision (P = 86%±4). After completing development, we mined the latest research publications for three organisms: human (Homo sapiens), budding yeast (Saccharomyces cerevisiae), and thale cress (Arabidopsis thaliana). Examining 60 novel, text-mined annotations, we found that 65% (human), 85% (yeast), and 80% (cress) were correct. Of all validated annotations, 40% were completely novel, i.e., they appeared neither in the annotations nor in the text descriptions of Swiss-Prot. CONCLUSIONS: LocText provides a cost-effective, semi-automated workflow to assist database curators in identifying novel protein localization annotations. The annotations suggested through text-mining would be verified by experts to guarantee high-quality standards of manually-curated databases such as Swiss-Prot.


Subjects
Data Mining , Protein Databases , Proteins/metabolism , Software , Gene Ontology , Humans , Molecular Sequence Annotation
7.
Strahlenther Onkol ; 194(9): 824-834, 2018 09.
Article in English | MEDLINE | ID: mdl-29557486

ABSTRACT

BACKGROUND AND PURPOSE: Current prognostic models for soft tissue sarcoma (STS) patients are solely based on staging information. Treatment-related data have not been included to date. Including such information, however, could help to improve these models. MATERIALS AND METHODS: A single-center retrospective cohort of 136 STS patients treated with radiotherapy (RT) was analyzed for patients' characteristics, staging information, and treatment-related data. Therapeutic imaging studies and pathology reports of neoadjuvantly treated patients were analyzed for signs of response. Random forest machine learning-based models were used to predict patients' death and disease progression at 2 years. Pre-treatment and treatment models were compared. RESULTS: The prognostic models achieved high performances. Using treatment features improved the overall performance for all three classification types: prediction of death, and of local and systemic progression (area under the receiver operating characteristic curve (AUC) of 0.87, 0.88, and 0.84, respectively). Overall, RT-related features, such as the planning target volume and total dose, had preeminent importance for prognostic performance. Therapy response features were selected for prediction of disease progression. CONCLUSIONS: A machine learning-based prognostic model combining known prognostic factors with treatment- and response-related information showed high accuracy for individualized risk assessment. This model could be used for adjustments of follow-up procedures.


Subjects
Machine Learning , Proportional Hazards Models , Sarcoma/pathology , Sarcoma/radiotherapy , Adult , Aged , Aged 80 and over , Cohort Studies , Disease Progression , Female , Humans , Male , Middle Aged , Neoadjuvant Therapy , Neoplasm Staging , Prognosis , Retrospective Studies , Risk Assessment , Sarcoma/mortality , Survival Rate
8.
Nucleic Acids Res ; 44(D1): D38-47, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26538599

ABSTRACT

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR (the European infrastructure for biological information), that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.


Subjects
Computational Biology , Registries , Data Curation , Software
9.
Bioinformatics ; 32(22): 3501-3503, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27412096

ABSTRACT

The MSAViewer is a quick and easy visualization and analysis JavaScript component for Multiple Sequence Alignment data of any size. Core features include interactive navigation through the alignment, application of popular color schemes, sorting, selecting and filtering. The MSAViewer is 'web ready': written entirely in JavaScript, compatible with modern web browsers and does not require any specialized software. The MSAViewer is part of the BioJS collection of components. AVAILABILITY AND IMPLEMENTATION: The MSAViewer is released as open source software under the Boost Software License 1.0. Documentation, source code and the viewer are available at http://msa.biojs.net/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. CONTACT: msa@bio.sh.


Subjects
Sequence Alignment , Software , Programming Languages , Web Browser
10.
Mol Cell Proteomics ; 14(8): 2085-102, 2015 Aug.
Article in English | MEDLINE | ID: mdl-25991687

ABSTRACT

Naive CD4(+) T cells are the common precursors of multiple effector and memory T-cell subsets and possess a high plasticity in terms of differentiation potential. This stem-cell-like character is important for cell therapies aiming at regeneration of specific immunity. Cell surface proteins are crucial for recognition and response to signals mediated by other cells or environmental changes. Knowledge of cell surface proteins of human naive CD4(+) T cells and their changes during the early phase of T-cell activation is urgently needed for a guided differentiation of naive T cells and may support the selection of pluripotent cells for cell therapy. Periodate oxidation and aniline-catalyzed oxime ligation technology was applied with subsequent quantitative liquid chromatography-tandem MS to generate a data set describing the surface proteome of primary human naive CD4(+) T cells and to monitor dynamic changes during the early phase of activation. This led to the identification of 173 N-glycosylated surface proteins. To independently confirm the proteomic data set and to analyze the cell surface by an alternative technique a systematic phenotypic expression analysis of surface antigens via flow cytometry was performed. This screening expanded the previous data set, resulting in 229 surface proteins, which were expressed on naive unstimulated and activated CD4(+) T cells. Furthermore, we generated a surface expression atlas based on transcriptome data, experimental annotation, and predicted subcellular localization, and correlated the proteomics result with this transcriptional data set. This extensive surface atlas provides an overall naive CD4(+) T cell surface resource and will enable future studies aiming at a deeper understanding of mechanisms of T-cell biology allowing the identification of novel immune targets usable for the development of therapeutic treatments.


Subjects
CD4-Positive T-Lymphocytes/metabolism , Lymphocyte Activation/immunology , Proteomics/methods , T-Cell Antigen Receptors/metabolism , Cell Membrane/metabolism , Cluster Analysis , Computer Simulation , Flow Cytometry , Gene Expression Profiling , Gene Ontology , Glycoproteins/metabolism , Humans , Proteome/metabolism , Reproducibility of Results , Transcriptome/genetics
11.
BMC Genomics ; 17: 133, 2016 Feb 24.
Article in English | MEDLINE | ID: mdl-26911138

ABSTRACT

BACKGROUND: Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). RESULTS: Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. CONCLUSIONS: These findings demonstrate that ribosomal footprinting can be used to detect novel protein-coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.


Subjects
Escherichia coli O157/genetics , Molecular Evolution , Bacterial Genes , Proteome/genetics , Transcriptome , Animals , Cattle , Computational Biology , Escherichia coli Proteins/genetics , Mass Spectrometry , Phenotype , Bacterial RNA/genetics , RNA Sequence Analysis
12.
Nucleic Acids Res ; 42(Web Server issue): W337-43, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24799431

ABSTRACT

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.


Subjects
Protein Conformation , Software , Amino Acid Substitution , Binding Sites , Gene Ontology , Internet , Intrinsically Disordered Proteins/chemistry , Membrane Proteins/chemistry , Mutation , Protein Interaction Mapping , Proteins/analysis , Proteins/genetics , Proteins/metabolism , Sequence Alignment , Protein Sequence Analysis , Amino Acid Sequence Homology
13.
Nucleic Acids Res ; 42(Web Server issue): W350-5, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24848019

ABSTRACT

The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18=80±3% for eukaryotes and a six-state accuracy Q6=89±4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome not considering the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3.


Subjects
Proteins/analysis , Software , Archaeal Proteins/analysis , Artificial Intelligence , Bacterial Proteins/analysis , Internet , Amino Acid Sequence Homology
14.
Genomics ; 104(6 Pt B): 496-503, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25458812

ABSTRACT

Protein-protein interaction (PPI) detection is one of the central goals of functional genomics and systems biology. Knowledge about the nature of PPIs can help fill the widening gap between sequence information and functional annotations. Although experimental methods have produced valuable PPI data, they also suffer from significant limitations. Computational PPI prediction methods have attracted tremendous attention. Despite considerable efforts, PPI prediction is still in its infancy in complex multicellular organisms such as humans. Here, we propose a novel ensemble learning method, LocFuse, which is useful in human PPI prediction. This method uses eight different genomic and proteomic features along with four different types of classifiers. The prediction performance of this classifier selection method was found to be considerably better than methods employed hitherto. This confirms the complex nature of the PPI prediction problem and also the necessity of using biological information for classifier fusion. LocFuse is available at: http://lbb.ut.ac.ir/Download/LBBsoft/LocFuse. BIOLOGICAL SIGNIFICANCE: The results revealed that if we divide proteome space according to the cellular localization of proteins, then the utility of some classifiers in PPI prediction can be improved. Therefore, to predict the interaction for any given protein pair, we can select the most accurate classifier with regard to the cellular localization information. Based on the results, we can say that the importance of different features for PPI prediction varies between differently localized proteins; however, in general, our novel features, which were extracted from position-specific scoring matrices (PSSMs), are the most important ones and the Random Forest (RF) classifier performs best in most cases. LocFuse was developed with a user-friendly graphic interface and it is freely available for Linux, Mac OSX and MS Windows operating systems.


Subjects
Metabolomics/methods , Post-Translational Protein Processing , Proteome/metabolism , Proteomics/methods , Software , Artificial Intelligence , Humans , Protein Binding , Protein Transport
15.
BMC Plant Biol ; 14: 329, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25476999

ABSTRACT

BACKGROUND: For most organisms, even if their genome sequence is available, little functional information about individual genes or proteins exists. Several annotation pipelines have been developed for functional analysis based on sequence, 'omics', and literature data. However, researchers encounter little guidance on how well they perform. Here, we used the recently sequenced potato genome as a case study. The potato genome was selected since its genome is newly sequenced and it is a non-model plant even if there is relatively ample information on individual potato genes, and multiple gene expression profiles are available. RESULTS: We show that the automatic gene annotations of potato have low accuracy when compared to a "gold standard" based on experimentally validated potato genes. Furthermore, we evaluate six state-of-the-art annotation pipelines and show that their predictions are markedly dissimilar (Jaccard similarity coefficient of 0.27 between pipelines on average). To overcome this discrepancy, we introduce a simple GO structure-based algorithm that reconciles the predictions of the different pipelines. We show that the integrated annotation covers more genes, increases by over 50% the number of highly co-expressed GO processes, and obtains much higher agreement with the gold standard. CONCLUSIONS: We find that different annotation pipelines produce different results, and show how to integrate them into a unified annotation that is of higher quality than each single pipeline. We offer an improved functional annotation of both PGSC and ITAG potato gene models, as well as tools that can be applied to additional pipelines and improve annotation in other organisms. This will greatly aid future functional analysis of '-omics' datasets from potato and other organisms with newly sequenced genomes. The new potato annotations are available with this paper.
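
The Jaccard similarity coefficient quoted above measures the overlap between two sets as intersection size divided by union size. A minimal sketch follows; the pipeline names and annotation terms are hypothetical placeholders, and averaging over all pipeline pairs is an assumption about how the reported mean of 0.27 might be formed.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_pairwise_jaccard(annotations):
    """Average Jaccard similarity over all pairs of pipelines."""
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two pipelines to compare")
    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)

# Hypothetical annotations of one gene by three pipelines (placeholder terms).
annotations = {
    "pipeline_1": {"term_a", "term_b"},
    "pipeline_2": {"term_b", "term_c"},
    "pipeline_3": {"term_d"},
}

mean_similarity = mean_pairwise_jaccard(annotations)
```

In this toy case only the first two pipelines share a term, so the mean over the three pairs is low, which is the kind of disagreement the abstract reports.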


Subjects
Plant Genome , Molecular Sequence Annotation , Solanum tuberosum/genetics
16.
Bioinformatics ; 28(18): i458-i465, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22962467

ABSTRACT

MOTIVATION: Subcellular localization is one aspect of protein function. Despite advances in high-throughput imaging, localization maps remain incomplete. Several methods accurately predict localization, but many challenges remain to be tackled. RESULTS: In this study, we introduced a framework to predict localization in life's three domains, including globular and membrane proteins (3 classes for archaea; 6 for bacteria and 18 for eukaryota). The resulting method, LocTree2, works well even for protein fragments. It uses a hierarchical system of support vector machines that imitates the cascading mechanism of cellular sorting. The method reaches high levels of sustained performance (eukaryota: Q18=65%, bacteria: Q6=84%). LocTree2 also accurately distinguishes membrane and non-membrane proteins. In our hands, it compared favorably with top methods when tested on new data. AVAILABILITY: Online through PredictProtein (predictprotein.org); as standalone version at http://www.rostlab.org/services/loctree2. CONTACT: localization@rostlab.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
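
The Q18 and Q6 values above are multi-state accuracies: the percentage of proteins whose predicted class equals the observed class. A minimal sketch, with invented labels and a hypothetical function name:

```python
def q_accuracy(observed, predicted):
    """Percentage of proteins predicted in their observed location class."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)

# Toy labels, invented for illustration only.
observed = ["nucleus", "cytoplasm", "secreted", "nucleus"]
predicted = ["nucleus", "cytoplasm", "nucleus", "nucleus"]

q = q_accuracy(observed, predicted)  # three of four correct
```

The number of states (18, 6 or 3) only determines which labels may occur; the formula is the same for each domain of life.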


Subjects
Archaeal Proteins/analysis , Bacterial Proteins/analysis , Membrane Proteins/analysis , Proteins/analysis , Support Vector Machine , Animals , Molecular Sequence Annotation , Protein Sequence Analysis
17.
Curr Protoc Bioinformatics ; 69(1): e97, 2020 03.
Article in English | MEDLINE | ID: mdl-32150354

ABSTRACT

Visualizing protein data remains a challenging and stimulating task. Useful and intuitive visualization tools may help advance biomolecular and medical research; unintuitive tools may bar important breakthroughs. This protocol describes two use cases for the CellMap (http://cellmap.protein.properties) web tool. The tool allows researchers to visualize human protein-protein interaction data constrained by protein subcellular localizations. In the simplest form, proteins are visualized on cell images that also show protein-protein interactions (PPIs) through lines (edges) connecting the proteins across the compartments. At a glance, this simultaneously highlights spatial constraints that proteins are subject to in their physical environment and visualizes PPIs against these localizations. Visualizing two realities helps in decluttering the protein interaction visualization from "hairball" phenomena that arise when single proteins or groups thereof interact with hundreds of partners. © 2019 The Authors. Basic Protocol 1: Visualizing proteins and their interactions on cell images Basic Protocol 2: Displaying all interaction partners for a protein.


Subjects
Cells/metabolism , Three-Dimensional Imaging , Protein Interaction Mapping , Software , Humans , Proteins/metabolism , Subcellular Fractions/metabolism
18.
Cancer Med ; 8(1): 128-136, 2019 01.
Article in English | MEDLINE | ID: mdl-30561851

ABSTRACT

BACKGROUND: For Glioblastoma (GBM), various prognostic nomograms have been proposed. This study aims to evaluate machine learning models to predict patients' overall survival (OS) and progression-free survival (PFS) on the basis of clinical, pathological, semantic MRI-based, and FET-PET/CT-derived information. Finally, the value of adding treatment features was evaluated. METHODS: One hundred and eighty-nine patients were retrospectively analyzed. We assessed clinical, pathological, and treatment information. The VASARI set of semantic imaging features was determined on MRIs. Metabolic information was retained from preoperative FET-PET/CT images. We generated multiple random survival forest prediction models on a patient training set and performed internal validation. Single feature class models were created including "clinical," "pathological," "MRI-based," and "FET-PET/CT-based" models, as well as combinations. Treatment features were combined with all other features. RESULTS: Of all single feature class models, the MRI-based model had the highest prediction performance on the validation set for OS (C-index: 0.61 [95% confidence interval: 0.51-0.72]) and PFS (C-index: 0.61 [0.50-0.72]). The combination of all features did increase performance above all single feature class models up to C-indices of 0.70 (0.59-0.84) and 0.68 (0.57-0.78) for OS and PFS, respectively. Adding treatment information further increased prognostic performance up to C-indices of 0.73 (0.62-0.84) and 0.71 (0.60-0.81) on the validation set for OS and PFS, respectively, allowing significant stratification of patient groups for OS. CONCLUSIONS: MRI-based features were the most relevant feature class for prognostic assessment. Combining clinical, pathological, and imaging information increased predictive power for OS and PFS. A further increase was achieved by adding treatment features.
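
The C-indices reported above summarize how well predicted risk ranks patients by survival time. One common definition is Harrell's concordance index; whether the study used exactly this variant is an assumption, and the data below are invented. The sketch counts, among comparable patient pairs, how often the patient with the earlier observed event also has the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    times       -- follow-up time per patient
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy cohort, invented for illustration only.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported range of 0.61 to 0.73 in context.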


Subjects
Brain Neoplasms/classification , Glioblastoma/classification , Machine Learning , Models, Theoretical , Adult , Aged , Aged, 80 and over , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Brain Neoplasms/radiotherapy , Chemotherapy, Adjuvant , Female , Glioblastoma/diagnostic imaging , Glioblastoma/pathology , Glioblastoma/radiotherapy , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Multimodal Imaging , Positron Emission Tomography Computed Tomography , Prognosis , Survival Analysis , Young Adult
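
The abstract of entry 18 reports model performance as a C-index (concordance index). As a reading aid, the following is a minimal sketch of how a Harrell-style C-index can be computed for right-censored survival data. It is not the authors' code; the function name `concordance_index`, its arguments, and the choice to skip pairs with tied survival times are assumptions made for illustration only.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell-style C-index for right-censored data (illustrative sketch).

    times  -- observed follow-up time per patient
    events -- 1 if the event was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): ignore tied times
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier patient was censored: pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risks[first] == risks[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from the study.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.61 to 0.73 should be read.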
19.
Phys Med ; 48: 27-36, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29728226

ABSTRACT

PURPOSE: In view of the fast-growing translation of artificial intelligence (AI) technologies to medical image analysis, this paper emphasizes the future role of the medical physicist in this evolving field. It addresses the specific challenges of implementing big data concepts with high-throughput image data processing, such as radiomics and machine learning, in a radiooncology environment to support clinical decisions. METHODS: Based on the experience of our interdisciplinary radiomics working group, we describe techniques for processing minable data, extracting radiomics features, and associating this information with clinical, physical, and biological data for the development of prediction models. Special emphasis was placed on the potential clinical significance of such an approach. RESULTS: Clinical studies demonstrate the role of radiomics analysis as an additional independent source of information with the potential to influence radiooncology practice, i.e., to predict patient prognosis, treatment response, and underlying genetic changes. Extending the radiomics approach to integrate imaging, clinical, genetic, and dosimetric data ('panomics') challenges the medical physicist as a member of the radiooncology team. CONCLUSIONS: The new field of big data processing in radiooncology offers opportunities to support clinical decisions, to improve the prediction of treatment outcome, and to stimulate fundamental research on the radiation response of both tumor and normal tissue. The integration of physical data (e.g., treatment planning, dosimetric, and image guidance data) demands the involvement of the medical physicist in the radiomics approach of radiooncology. To cope with this challenge, national and international organizations for medical physics should organize more training opportunities in artificial intelligence technologies in radiooncology.


Subjects
Artificial Intelligence , Diagnostic Imaging , Image Processing, Computer-Assisted/methods , Neoplasms/diagnostic imaging , Physics , Humans
20.
F1000Res ; 6: 1824, 2017.
Article in English | MEDLINE | ID: mdl-29497493

ABSTRACT

Many tools visualize protein-protein interaction (PPI) networks. The tool introduced here, CellMap, adds one crucial novelty by visualizing PPI networks in the context of subcellular localization, i.e. the location in the cell or cellular component in which a PPI happens. Users can upload images of cells and define areas of interest against which PPIs for selected proteins are displayed (by default on a cartoon of a cell). Annotations of localization are provided by the user or through our in-house database. The visualizer and server are written in JavaScript, making CellMap easy to customize and to extend by researchers and developers.
