Results 1 - 11 of 11
1.
Plant J ; 104(3): 812-827, 2020 11.
Article in English | MEDLINE | ID: mdl-32780488

ABSTRACT

Agriculture faces increasing demand for yield, higher plant-derived protein content and diversity while facing pressure to achieve sustainability. Although the genomes of many of the important crops have been sequenced, the subcellular locations of most of the encoded proteins remain unknown or are only predicted. Protein subcellular location is crucial in determining protein function and accumulation patterns in plants, and is critical for targeted improvements in yield and resilience. Integrating location data from over 800 studies for 12 major crop species into the cropPAL2020 data collection showed that while >80% of proteins in most species are not localised by experimental data, combining species data or integrating predictions can help bridge gaps at similar accuracy. The collation and integration of over 61 505 experimental localisations and more than 6 million predictions showed that the relative sizes of the protein catalogues located in different subcellular compartments are comparable between crops and Arabidopsis. A comprehensive cross-species comparison showed that between 50% and 80% of the subcellulomes are conserved across species and that conservation only depends to some degree on the phylogenetic relationship of the species. Protein subcellular locations in major biosynthesis pathways are more often conserved than in metabolic pathways. Underlying this conservation is a clear potential for subcellular diversity in protein location between species by means of gene duplication and alternative splicing. Our cropPAL data set and search platform (https://crop-pal.org) provide a comprehensive subcellular proteomics resource to drive compartmentation-based approaches for improving yield, protein composition and resilience in future crop varieties.


Subject(s)
Crops, Agricultural/metabolism , Databases, Protein , Plant Proteins/metabolism , Cell Compartmentation , Crops, Agricultural/cytology , Plant Breeding , Plant Cells/metabolism , Species Specificity
2.
Adv Exp Med Biol ; 1346: 67-89, 2021.
Article in English | MEDLINE | ID: mdl-35113396

ABSTRACT

In eukaryotic organisms, subcellular protein location is critical in defining protein function and understanding sub-functionalization of gene families. Some proteins have defined locations, whereas others have low specificity targeting and complex accumulation patterns. There is no single approach that can be considered entirely adequate for defining the in vivo location of all proteins. By combining evidence from different approaches, the strengths and weaknesses of different technologies can be estimated, and a location consensus can be built. The Subcellular Location of Proteins in Arabidopsis database ( http://suba.live/ ) combines experimental data sets that have been reported in the literature and is analyzing these data to provide useful tools for biologists to interpret their own data. Foremost among these tools is a consensus classifier (SUBAcon) that computes a proposed location for all proteins based on balancing the experimental evidence and predictions. Further tools analyze sets of proteins to define the abundance of cellular structures. Extending these types of resources to plant crop species has been complex due to polyploidy, gene family expansion and contraction, and the movement of pathways and processes within cells across the plant kingdom. The Crop Proteins of Annotated Location database ( http://crop-pal.org/ ) has developed a range of subcellular location resources including a species-specific voting consensus for 12 plant crop species that offers collated evidence and filters for current crop proteomes akin to SUBA. Comprehensive cross-species comparison of these data shows that the sub-cellular proteomes (subcellulomes) depend only to some degree on phylogenetic relationship and are more conserved in major biosynthesis than in metabolic pathways. Together SUBA and cropPAL created reference subcellulomes for plants as well as species-specific subcellulomes for cross-species data mining. 
These data collections are increasingly used by the research community to provide a subcellular protein location layer, inform models of compartmented cell function and protein-protein interaction networks, guide future molecular crop breeding strategies, or simply answer a specific question: where is my protein of interest inside the cell?


Subject(s)
Arabidopsis , Arabidopsis/genetics , Databases, Protein , Humans , Phylogeny , Proteomics , Species Specificity , Subcellular Fractions
3.
Nucleic Acids Res ; 45(D1): D1064-D1074, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899614

ABSTRACT

The SUBcellular location database for Arabidopsis proteins (SUBA4, http://suba.live) is a comprehensive collection of manually curated published data sets of large-scale subcellular proteomics, fluorescent protein visualization, protein-protein interaction (PPI) as well as subcellular targeting calls from 22 prediction programs. SUBA4 contains an additional 35 568 localizations totalling more than 60 000 experimental protein location claims as well as 37 new suborganellar localization categories. The experimental PPI data has been expanded to 26 327 PPI pairs including 856 PPI localizations from experimental fluorescent visualizations. The new SUBA4 user interface enables users to choose quickly from the filter categories: 'subcellular location', 'protein properties', 'protein-protein interaction' and 'affiliations' to build complex queries. This allows substantial expansion of search parameters into 80 annotation types comprising 1 150 204 new annotations to study metadata associated with subcellular localization. The 'BLAST' tab contains a sequence alignment tool to enable a sequence fragment from any species to find the closest match in Arabidopsis and retrieve data on subcellular location. Using the location consensus SUBAcon, the SUBA4 toolbox delivers three novel data services allowing interactive analysis of user data to provide relative compartmental protein abundances and proximity relationship analysis of PPI and coexpression partners from a submitted list of Arabidopsis gene identifiers.


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Computational Biology/methods , Databases, Protein , Protein Interaction Mapping , Protein Interaction Maps , Intracellular Space/metabolism , Molecular Sequence Annotation , Protein Transport , Proteomics , Software , Web Browser
4.
Plant J ; 92(6): 1202-1217, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29024340

ABSTRACT

Measuring changes in protein or organelle abundance in the cell is an essential, but challenging aspect of cell biology. Frequently-used methods for determining organelle abundance typically rely on detection of a very few marker proteins, so are unsatisfactory. In silico estimates of protein abundances from publicly available protein spectra can provide useful standard abundance values but contain only data from tissue proteomes, and are not coupled to organelle localization data. A new protein abundance score, the normalized protein abundance scale (NPAS), expands on the number of scored proteins and the scoring accuracy of lower-abundance proteins in Arabidopsis. NPAS was combined with subcellular protein localization data, facilitating quantitative estimations of organelle abundance during routine experimental procedures. A suite of targeted proteomics markers for subcellular compartment markers was developed, enabling independent verification of in silico estimates for relative organelle abundance. Estimation of relative organelle abundance was found to be reproducible and consistent over a range of tissues and growth conditions. In silico abundance estimations and localization data have been combined into an online tool, multiple marker abundance profiling, available in the SUBA4 toolbox (http://suba.live).


Subject(s)
Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Proteome , Proteomics , Biomarkers/metabolism , Organelles/metabolism , Protein Transport
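The idea of combining a per-protein abundance score with localization data to estimate relative organelle abundance can be illustrated with a minimal Python sketch. This is not the NPAS or multiple marker abundance profiling implementation; the function name, the input dictionaries and the simple "sum per compartment, divide by total" rule are assumptions made for illustration only.

```python
from collections import defaultdict


def relative_compartment_abundance(abundance, location):
    """Estimate each compartment's share of total scored protein abundance.

    abundance: dict mapping protein ID -> abundance score (hypothetical values)
    location:  dict mapping protein ID -> compartment name (hypothetical calls)
    Proteins without a location call are skipped.
    """
    totals = defaultdict(float)
    for protein, score in abundance.items():
        compartment = location.get(protein)
        if compartment is None:
            continue  # no localization call, so it cannot be assigned
        totals[compartment] += score

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Normalise so that the shares across compartments sum to 1
    return {c: t / grand_total for c, t in totals.items()}
```

Skipping unlocalized proteins is one possible design choice; a real analysis might instead report them as a separate "unknown" fraction.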
5.
Plant Cell Physiol ; 57(1): e9, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26556651

ABSTRACT

Barley, wheat, rice and maize provide the bulk of human nutrition and have extensive industrial use as agricultural products. The genomes of these crops each contain >40,000 genes encoding proteins; however, the major genome databases for these species lack annotation information of protein subcellular location for >80% of these gene products. We address this gap by constructing the compendium of crop protein subcellular locations called crop Proteins with Annotated Locations (cropPAL). Subcellular location is most commonly determined by fluorescent protein tagging of live cells or mass spectrometry detection in subcellular purifications, but can also be predicted from amino acid sequence or protein expression patterns. The cropPAL database collates 556 previously published studies from >300 research institutes in >30 countries, and compiles eight pre-computed subcellular predictions for all Hordeum vulgare, Triticum aestivum, Oryza sativa and Zea mays protein sequences. The data collection including metadata for proteins and published studies can be accessed through a search portal http://crop-PAL.org. The subcellular localization information housed in cropPAL helps to depict plant cells as compartmentalized protein networks that can be investigated for improving crop yield and quality, and developing new biotechnological solutions to agricultural challenges.


Subject(s)
Databases, Genetic , Genome, Plant/genetics , Hordeum/genetics , Oryza/genetics , Triticum/genetics , Zea mays/genetics , Amino Acid Sequence , Computational Biology , Crops, Agricultural , Hordeum/metabolism , Plant Proteins/genetics , Protein Transport
6.
Bioinformatics ; 30(23): 3356-64, 2014 Dec 01.
Article in English | MEDLINE | ID: mdl-25150248

ABSTRACT

MOTIVATION: Knowing the subcellular location of proteins is critical for understanding their function and developing accurate networks representing eukaryotic biological processes. Many computational tools have been developed to predict proteome-wide subcellular location, and abundant experimental data from green fluorescent protein (GFP) tagging or mass spectrometry (MS) are available in the model plant, Arabidopsis. None of these approaches is error-free, and thus, results are often contradictory. RESULTS: To help unify these multiple data sources, we have developed the SUBcellular Arabidopsis consensus (SUBAcon) algorithm, a naive Bayes classifier that integrates 22 computational prediction algorithms, experimental GFP and MS localizations, protein-protein interaction and co-expression data to derive a consensus call and probability. SUBAcon classifies protein location in Arabidopsis more accurately than single predictors. AVAILABILITY: SUBAcon is a useful tool for recovering proteome-wide subcellular locations of Arabidopsis proteins and is displayed in the SUBA3 database (http://suba.plantenergy.uwa.edu.au). The source code and input data are available through the SUBA3 server (http://suba.plantenergy.uwa.edu.au//SUBAcon.html) and the Arabidopsis SUbproteome REference (ASURE) training set can be accessed using the ASURE web portal (http://suba.plantenergy.uwa.edu.au/ASURE).


Subject(s)
Algorithms , Arabidopsis Proteins/analysis , Arabidopsis/chemistry , Proteome/analysis , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Bayes Theorem , Databases, Protein , Green Fluorescent Proteins/genetics , Mass Spectrometry , Membrane Proteins/analysis , Protein Interaction Mapping , Proteome/genetics , Proteome/metabolism , Software
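How a naive Bayes classifier can turn contradictory location calls into a consensus call and probability can be sketched in a few lines of Python. This is a toy illustration, not the SUBAcon source code: the compartment list, the priors, the per-source reliabilities and the assumption that errors are spread evenly over the other compartments are all hypothetical.

```python
import math

# Hypothetical compartments and prior probabilities (illustrative values only)
LOCATIONS = ["plastid", "mitochondrion", "nucleus"]
PRIOR = {"plastid": 0.3, "mitochondrion": 0.2, "nucleus": 0.5}

# Hypothetical probability that each evidence source calls the true location
RELIABILITY = {"predictor_a": 0.6, "predictor_b": 0.7, "gfp": 0.9, "ms": 0.8}


def consensus(calls):
    """Combine independent location calls with naive Bayes.

    calls: dict mapping source name -> called location.
    Returns (best_location, posterior) where posterior maps each
    location to its normalised probability.
    """
    log_post = {}
    for loc in LOCATIONS:
        lp = math.log(PRIOR[loc])
        for source, called in calls.items():
            r = RELIABILITY[source]
            # Assumed error model: a wrong call is equally likely to land
            # on any of the other compartments.
            p = r if called == loc else (1 - r) / (len(LOCATIONS) - 1)
            lp += math.log(p)
        log_post[loc] = lp

    # Normalise in a numerically stable way
    top = max(log_post.values())
    unnorm = {loc: math.exp(lp - top) for loc, lp in log_post.items()}
    total = sum(unnorm.values())
    posterior = {loc: v / total for loc, v in unnorm.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior
```

With these made-up numbers, two experimental calls for the plastid outweigh a single predictor call for the nucleus, which is the kind of evidence balancing the abstract describes.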
7.
Nucleic Acids Res ; 41(Database issue): D1185-91, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23180787

ABSTRACT

The subcellular location database for Arabidopsis proteins (SUBA3, http://suba.plantenergy.uwa.edu.au) combines manual literature curation of large-scale subcellular proteomics, fluorescent protein visualization and protein-protein interaction (PPI) datasets with subcellular targeting calls from 22 prediction programs. More than 14 500 new experimental locations have been added since its first release in 2007. Overall, nearly 650 000 new calls of subcellular location for 35 388 non-redundant Arabidopsis proteins are included (almost six times the information in the previous SUBA version). A re-designed interface makes the SUBA3 site more intuitive and easier to use than earlier versions and provides powerful options to search for PPIs within the context of cell compartmentation. SUBA3 also includes detailed localization information for reference organelle datasets and incorporates green fluorescent protein (GFP) images for many proteins. To determine as objectively as possible where a particular protein is located, we have developed SUBAcon, a Bayesian approach that incorporates experimental localization and targeting prediction data to best estimate a protein's location in the cell. The probabilities of subcellular location for each protein are provided and displayed as a pictographic heat map of a plant cell in SUBA3.


Subject(s)
Arabidopsis Proteins/analysis , Databases, Protein , Internet , Protein Interaction Mapping , Proteomics , Systems Integration , User-Computer Interface
8.
Sci Rep ; 11(1): 1696, 2021 01 18.
Article in English | MEDLINE | ID: mdl-33462256

ABSTRACT

The increased diversity and scale of published biological data has led to a growing appreciation for the applications of machine learning and statistical methodologies to gain new insights. Key to achieving this aim is solving the Relationship Extraction problem which specifies the semantic interaction between two or more biological entities in a published study. Here, we employed two deep neural network natural language processing (NLP) methods, namely: the continuous bag of words (CBOW), and the bi-directional long short-term memory (bi-LSTM). These methods were employed to predict relations between entities that describe protein subcellular localisation in plants. We applied our system to 1700 published Arabidopsis protein subcellular studies from the SUBA manually curated dataset. The system combines pre-processing of full-text articles in a machine-readable format with relevant sentence extraction for downstream NLP analysis. Using the SUBA corpus, the neural network classifier predicted interactions between protein name, subcellular localisation and experimental methodology with an average precision, recall rate, accuracy and F1 scores of 95.1%, 82.8%, 89.3% and 88.4% respectively (n = 30). Comparable scoring metrics were obtained using the CropPAL database as an independent testing dataset that stores protein subcellular localisation in crop species, demonstrating wide applicability of the prediction model. We provide a framework for extracting protein functional features from unstructured text in the literature with high accuracy, improving data dissemination and unlocking the potential of big data text analytics for generating new hypotheses.
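For readers unfamiliar with the four scoring metrics quoted in this abstract, the standard definitions can be written as a short Python sketch. The function name and the confusion-matrix counts are hypothetical; the study's own evaluation code is not shown in the abstract.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, accuracy and F1 from confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives for a binary relation classifier.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }
```

If the reported figures are averages over repeated evaluations, the mean F1 need not equal the F1 computed from the mean precision and mean recall, because the harmonic mean is not linear.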

9.
PLoS One ; 9(11): e112909, 2014.
Article in English | MEDLINE | ID: mdl-25412507

ABSTRACT

Medulloblastoma is the most common form of malignant paediatric brain tumour and is the leading cause of childhood cancer related mortality. The four molecular subgroups of medulloblastoma that have been identified - WNT, SHH, Group 3 and Group 4 - have molecular and topographical characteristics suggestive of different cells of origin. Definitive identification of the cell(s) of origin of the medulloblastoma subgroups, particularly the poorer prognosis Group 3 and Group 4 medulloblastoma, is critical to understand the pathogenesis of the disease, and ultimately for the development of more effective treatment options. To address this issue, the gene expression profiles of normal human neural tissues and cell types representing a broad neuro-developmental continuum, were compared to those of two independent cohorts of primary human medulloblastoma specimens. Clustering, co-expression network, and gene expression analyses revealed that WNT and SHH medulloblastoma may be derived from distinct neural stem cell populations during early embryonic development, while the transcriptional profiles of Group 3 and Group 4 medulloblastoma resemble cerebellar granule neuron precursors at weeks 10-15 and 20-30 of embryogenesis, respectively. Our data indicate that Group 3 medulloblastoma may arise through abnormal neuronal differentiation, whereas deregulation of synaptic pruning-associated apoptosis may be driving Group 4 tumorigenesis. Overall, these data provide significant new insight into the spatio-temporal relationships and molecular pathogenesis of the human medulloblastoma subgroups, and provide an important framework for the development of more refined model systems, and ultimately improved therapeutic strategies.


Subject(s)
Cerebellar Neoplasms/pathology , Gene Expression Profiling/methods , Gene Regulatory Networks , Medulloblastoma/pathology , Neurogenesis , Cells, Cultured , Cerebellar Neoplasms/genetics , Child , Child, Preschool , Gene Expression Regulation, Neoplastic , Humans , Medulloblastoma/genetics , Neural Stem Cells/cytology , Neural Stem Cells/metabolism , Neurons/cytology , Neurons/metabolism
10.
Front Plant Sci ; 5: 396, 2014.
Article in English | MEDLINE | ID: mdl-25161662

ABSTRACT

Sub-functionalization during the expansion of gene families in eukaryotes has occurred in part through specific subcellular localization of different family members. To better understand this process in plants, compiled records of large-scale proteomic and fluorescent protein localization datasets can be explored and bioinformatic predictions for protein localization can be used to predict the gaps in experimental data. This process can be followed by targeted experiments to test predictions. The SUBA3 database is a free web-service at http://suba.plantenergy.uwa.edu.au that helps users to explore reported experimental data and predictions concerning proteins encoded by gene families and to define the experiments required to locate these homologous sets of proteins. Here we show how SUBA3 can be used to explore the subcellular location of the Deg protease family of ATP-independent serine endopeptidases (Deg1-Deg16). Combined data integration and new experiments refined location information for Deg1 and Deg9, confirmed Deg2, Deg5, and Deg8 in plastids and Deg15 in peroxisomes and provided substantial experimental evidence for mitochondrial localized Deg proteases. Two of these, Deg3 and Deg10, additionally localized to the plastid, revealing novel dual-targeted Deg proteases in the plastid and the mitochondrion. SUBA3 is continually updated to ensure that researchers can use the latest published data when planning the experimental steps remaining to localize gene family functions.

11.
J Dermatol Sci ; 67(2): 120-9, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22727730

ABSTRACT

BACKGROUND: Melaleuca alternifolia (tea tree) oil (TTO) applied topically in a dilute (10%) dimethyl sulphoxide (DMSO) formulation exerts a rapid anti-cancer effect after a short treatment protocol. Tumour clearance is associated with skin irritation mediated by neutrophils which quickly and completely resolves upon treatment cessation. OBJECTIVE: To examine the mechanism of action underlying the anti-cancer activity of TTO. METHODS: Immune cell changes in subcutaneous tumour bearing mice in response to topically applied TTO treatments were assessed by flow cytometry and immunohistochemistry. Direct cytotoxicity of TTO on tumour cells in vivo was assessed by transmission electron microscopy. RESULTS: Neutrophils accumulate in the skin following topical 10% TTO/DMSO treatment but are not required for tumour clearance as neutrophil depletion did not abrogate the anti-cancer effect. Topically applied 10% TTO/DMSO, but not neat TTO, induces an accumulation and activation of dendritic cells and an accumulation of T cells. Although topical application of 10% TTO/DMSO appears to activate an immune response, anti-tumour efficacy is mediated by a direct effect on tumour cells in vivo. The direct cytotoxicity of TTO in vivo appears to be associated with TTO penetration. CONCLUSION: Future studies should focus on enhancing the direct cytotoxicity of TTO by increasing penetration through skin to achieve a higher in situ terpene concentration. This coupled with boosting a more specific anti-tumour immune response will likely result in long term clearance of tumours.


Subject(s)
Antineoplastic Agents/pharmacology , Melaleuca/drug effects , Neoplasms/drug therapy , Tea Tree Oil/pharmacology , Administration, Topical , Animals , Cell Line, Tumor , Dimethyl Sulfoxide/chemistry , Female , Flow Cytometry/methods , Immunohistochemistry/methods , Mice , Mice, Inbred C57BL , Microscopy, Electron, Transmission/methods , Neoplasm Transplantation , Neutrophils/cytology , Neutrophils/drug effects , Tea Tree Oil/administration & dosage