Results 1 - 20 of 32
1.
Nat Commun ; 12(1): 4992, 2021 08 17.
Article in English | MEDLINE | ID: mdl-34404777

ABSTRACT

Liquid chromatography-mass spectrometry-based metabolomics studies are increasingly applied to large population cohorts, which run for several weeks or even years in data acquisition. This inevitably introduces unwanted intra- and inter-batch variations over time that can overshadow true biological signals and thus hinder potential biological discoveries. To date, normalisation approaches have struggled to mitigate the variability introduced by technical factors whilst preserving biological variance, especially for protracted acquisitions. Here, we propose a study design framework with an arrangement for embedding biological sample replicates to quantify variance within and between batches and a workflow that uses these replicates to remove unwanted variation in a hierarchical manner (hRUV). We use this design to produce a dataset of more than 1000 human plasma samples run over an extended period of time. We demonstrate significant improvement of hRUV over existing methods in preserving biological signals whilst removing unwanted variation for large-scale metabolomics studies. Our tools not only provide a strategy for large-scale data normalisation, but also provide guidance on the design strategy for large omics studies.
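The replicate idea in this abstract can be illustrated with a minimal sketch. It assumes a purely additive batch shift on log-scale intensities and estimates that shift from samples measured in both batches. It is not the hRUV algorithm, whose workflow is more elaborate; the function names, sample IDs and values are all hypothetical.

```python
from statistics import mean

def replicate_offset(batch_ref, batch_new, replicate_ids):
    """Estimate the additive shift of batch_new relative to batch_ref
    from samples measured in both batches (log-scale intensities)."""
    return mean(batch_new[s] - batch_ref[s] for s in replicate_ids)

def correct_batch(batch_new, offset):
    """Subtract the estimated shift from every sample in the batch."""
    return {s: v - offset for s, v in batch_new.items()}

# Hypothetical log-intensities of one metabolite; rep_a and rep_b are
# biological sample replicates embedded in both batches.
batch_1 = {"s1": 10.0, "s2": 11.0, "rep_a": 10.5, "rep_b": 9.5}
batch_2 = {"s3": 12.2, "s4": 11.8, "rep_a": 11.5, "rep_b": 10.5}

offset = replicate_offset(batch_1, batch_2, ["rep_a", "rep_b"])
batch_2_corrected = correct_batch(batch_2, offset)
print(offset, batch_2_corrected)
```

Because the shift is estimated only from replicates, biological differences between the non-replicate samples are left untouched, which is the property the study design aims to preserve.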


Subjects
Metabolomics/methods , Liquid Chromatography , Humans , Mass Spectrometry/methods , Biological Models , Workflow
2.
BMC Pregnancy Childbirth ; 21(1): 277, 2021 Apr 06.
Article in English | MEDLINE | ID: mdl-33823838

ABSTRACT

BACKGROUND: There is increasing awareness that perinatal psychosocial adversity experienced by mothers, children, and their families may influence health and well-being across the life course. To maximise the impact of population-based interventions for optimising perinatal wellbeing, health services can utilise empirical methods to identify subgroups at highest risk of poor outcomes relative to the overall population. METHODS: This study sought to identify sub-groups using latent class analysis within a population of mothers in Sydney, Australia, based on their differing experience of self-reported indicators of psychosocial adversity. Subgroup differences in antenatal and postnatal depressive symptoms were assessed using the Edinburgh Postnatal Depression Scale. RESULTS: Latent class analysis identified four distinct subgroups within the cohort, who were distinguished empirically on the basis of their native language, current smoking status, previous involvement with Family-and-Community Services (FaCS), history of child abuse, presence of a supportive partner, and a history of intimate partner psychological violence. One group consisted of socially supported 'local' women who speak English as their primary language (Group L), another of socially supported 'migrant' women who speak a language other than English as their primary language (Group M), another of socially stressed 'local' women who speak English as their primary language (Group Ls), and another of socially stressed 'migrant' women who speak a language other than English as their primary language (Group Ms).
Compared to local and not socially stressed residents (Group L), the odds of antenatal depression were nearly three times higher in the socially stressed local group (Ls OR: 2.87, 95%CI 2.10-3.94) and nearly nine times higher in the Ms group (Ms OR: 8.78, 95%CI 5.13-15.03). Antenatal symptoms of depression were also higher in the not socially stressed migrant group (M OR: 1.70, 95%CI 1.47-1.97) compared to non-migrants. In the postnatal period, Group M was 1.5 times more likely, while the Ms group was over five times more likely, to experience suboptimal mental health compared to Group L (OR 1.50, 95%CI 1.22-1.84; and OR 5.28, 95%CI 2.63-10.63, for M and Ms respectively). CONCLUSIONS: The application of empirical subgrouping analysis permits an informed approach to targeted interventions and resource allocation for optimising perinatal maternal wellbeing.
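As a reading aid for the odds ratios reported above, the following minimal sketch shows how an unadjusted odds ratio and its Wald 95% confidence interval are obtained from a 2x2 table. The counts are hypothetical and are not taken from the study, which may have used adjusted models; the function name is also an assumption.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome
    c: unexposed with outcome, d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
odds_ratio, lower, upper = odds_ratio_ci(a=30, b=70, c=100, d=900)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as all intervals quoted in the abstract do, indicates that the difference in odds from the reference group is statistically significant at the 5% level.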


Subjects
Postpartum Depression/prevention & control , Mass Screening/organization & administration , Maternal Health/statistics & numerical data , Mental Health/statistics & numerical data , Adult , Australia/epidemiology , Postpartum Depression/diagnosis , Postpartum Depression/epidemiology , Postpartum Depression/psychology , Electronic Health Records/statistics & numerical data , Female , Health Care Rationing , Humans , Newborn Infant , Latent Class Analysis , Mass Screening/methods , Perinatal Care/methods , Perinatal Care/organization & administration , Pregnancy , Psychiatric Status Rating Scales/statistics & numerical data , Retrospective Studies , Risk Assessment/methods , Self Report/statistics & numerical data , Social Determinants of Health/statistics & numerical data , Young Adult
3.
Mar Genomics ; 59: 100857, 2021 Oct.
Article in English | MEDLINE | ID: mdl-33676872

ABSTRACT

The molecular mechanisms underlying development of the pentameral body of adult echinoderms are poorly understood but are important to solve with respect to evolution of a unique body plan that contrasts with the bilateral body plan of other deuterostomes. As Nodal and BMP2/4 signalling is involved in axis formation in larvae and development of the echinoderm body plan, we used the developmental transcriptome generated for the asterinid seastar Parvulastra exigua to investigate the temporal expression patterns of Nodal and BMP2/4 genes from the embryo and across metamorphosis to the juvenile. For echinoderms, the Asteroidea represents the basal-type body architecture with a distinct (separated) ray structure. Parvulastra exigua has lecithotrophic development forming the juvenile soon after gastrulation providing ready access to the developing adult stage. We identified 39 genes associated with the Nodal and BMP2/4 network in the P. exigua developmental transcriptome. Clustering analysis of these genes resulted in 6 clusters with similar temporal expression patterns across development. A co-expression analysis revealed genes that have similar expression profiles as Nodal and BMP2/4. These results indicated genes that may have a regulatory relationship in patterning morphogenesis of the juvenile seastar. Developmental RNA-seq analyses of Parvulastra exigua show changes in Nodal and BMP2/4 signalling genes across the metamorphic transition. We provide the foundation for detailed analyses of this cascade in the evolution of the unusual pentameral echinoderm body and its deuterostome affinities.

4.
Transplantation ; 105(9): 1914-1915, 2021 Sep 01.
Article in English | MEDLINE | ID: mdl-33534532
5.
Kidney Int ; 99(4): 817-823, 2021 04.
Article in English | MEDLINE | ID: mdl-32916179

ABSTRACT

Kidney transplant recipients and transplant physicians face important clinical questions where machine learning methods may help improve the decision-making process. This mini-review explores potential applications of machine learning methods to key stages of a kidney transplant recipient's journey, from initial waitlisting and donor selection, to personalization of immunosuppression and prediction of post-transplantation events. Both unsupervised and supervised machine learning methods are presented, including k-means clustering, principal components analysis, k-nearest neighbors, and random forests. The various challenges of these approaches are also discussed.


Subjects
Kidney Transplantation , Machine Learning , Humans , Kidney Transplantation/adverse effects , Transplant Recipients
6.
Nat Methods ; 17(8): 799-806, 2020 08.
Article in English | MEDLINE | ID: mdl-32661426

ABSTRACT

Single-cell genomics has transformed our ability to examine cell fate choice. Examining cells along a computationally ordered 'pseudotime' offers the potential to unpick subtle changes in variability and covariation among key genes. We describe an approach, scHOT (single-cell higher-order testing), which provides a flexible and statistically robust framework for identifying changes in higher-order interactions among genes. scHOT can be applied for cells along a continuous trajectory or across space and accommodates various higher-order measurements including variability or correlation. We demonstrate the use of scHOT by studying coordinated changes in higher-order interactions during embryonic development of the mouse liver. Additionally, scHOT identifies subtle changes in gene-gene correlations across space using spatially resolved transcriptomics data from the mouse olfactory bulb. scHOT meaningfully adds to first-order differential expression testing and provides a framework for interrogating higher-order interactions using single-cell data.
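One kind of higher-order measurement mentioned above, a gene-gene correlation that changes along pseudotime, can be sketched as a locally weighted Pearson correlation. This is a minimal illustration under assumed choices (a triangular kernel, a hypothetical `span` parameter, toy data); it does not reproduce the scHOT package or its statistical testing.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation of x and y under non-negative weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)

def local_correlations(x, y, span=3):
    """Weighted correlation centred on each cell.

    Cells are assumed to be already sorted by pseudotime. The triangular
    kernel and `span` are illustrative choices, not scHOT defaults.
    """
    n = len(x)
    out = []
    for i in range(n):
        w = [max(0.0, 1.0 - abs(i - j) / (span + 1)) for j in range(n)]
        out.append(weighted_pearson(x, y, w))
    return out

# Toy data: the two genes rise together early and diverge late.
gene_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
gene_b = [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]
corr = local_correlations(gene_a, gene_b)
print([round(c, 2) for c in corr])
```

A global correlation over all ten cells would blur the early positive and late negative relationship, which is the kind of change a first-order differential expression test cannot see.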


Subjects
Liver/embryology , Single-Cell Analysis/methods , Animals , Computational Biology , Nucleic Acid Databases , Hepatocytes/physiology , Liver/cytology , Mice , Oligonucleotide Array Sequence Analysis , RNA Sequence Analysis , Software
7.
Mol Syst Biol ; 16(6): e9389, 2020 06.
Article in English | MEDLINE | ID: mdl-32567229

ABSTRACT

Automated cell type identification is a key computational challenge in single-cell RNA-sequencing (scRNA-seq) data. To capitalise on the large collection of well-annotated scRNA-seq datasets, we developed scClassify, a multiscale classification framework based on ensemble learning and cell type hierarchies constructed from single or multiple annotated datasets as references. scClassify enables the estimation of sample size required for accurate classification of cell types in a cell type hierarchy and allows joint classification of cells when multiple references are available. We show that scClassify consistently performs better than other supervised cell type classification methods across 114 pairs of reference and testing data, representing a diverse combination of sizes, technologies and levels of complexity, and further demonstrate the unique components of scClassify through simulations and compendia of experimental datasets. Finally, we demonstrate the scalability of scClassify on large single-cell atlases and highlight a novel application of identifying subpopulations of cells from the Tabula Muris data that were unidentified in the original publication. Together, scClassify represents state-of-the-art methodology in automated cell type identification from scRNA-seq data.

8.
Bioinformatics ; 36(14): 4137-4143, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32353146

ABSTRACT

MOTIVATION: Multi-modal profiling of single cells represents one of the latest technological advancements in molecular biology. Among various single-cell multi-modal strategies, cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) allows simultaneous quantification of two distinct species: RNA and cell-surface proteins. Here, we introduce CiteFuse, a streamlined package consisting of a suite of tools for doublet detection, modality integration, clustering, differential RNA and protein expression analysis, antibody-derived tag evaluation, ligand-receptor interaction analysis and interactive web-based visualization of CITE-seq data. RESULTS: We demonstrate the capacity of CiteFuse to integrate the two data modalities and its relative advantage against data generated from single-modality profiling using both simulations and real-world CITE-seq data. Furthermore, we illustrate a novel doublet detection method based on a combined index of cell hashing and transcriptome data. Finally, we demonstrate CiteFuse for predicting ligand-receptor interactions by using multi-modal CITE-seq data. Collectively, we demonstrate the utility and effectiveness of CiteFuse for the integrative analysis of transcriptome and epitope profiles from CITE-seq data. AVAILABILITY AND IMPLEMENTATION: CiteFuse is freely available at http://shiny.maths.usyd.edu.au/CiteFuse/ as an online web service and at https://github.com/SydneyBioX/CiteFuse/ as an R package. CONTACT: pengyi.yang@sydney.edu.au. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software , Transcriptome , Epitopes , Gene Expression Profiling , RNA , RNA Sequence Analysis , Single-Cell Analysis
9.
DNA Res ; 27(1), 2020 Feb 01.
Article in English | MEDLINE | ID: mdl-32339242

ABSTRACT

The Echinodermata is characterized by a secondarily evolved pentameral body plan. While the evolutionary origin of this body plan has been the subject of debate, the molecular mechanisms underlying its development are poorly understood. We assembled a de novo developmental transcriptome from the embryo through metamorphosis in the sea star Parvulastra exigua. We use the asteroid model as it represents the basal-type echinoderm body architecture. Global variation in gene expression distinguished the gastrula profile and showed that metamorphic and juvenile stages were more similar to each other than to the pre-metamorphic stages, pointing to the marked changes that occur during metamorphosis. Differential expression and gene ontology (GO) analyses revealed dynamic changes in gene expression throughout development and the transition to pentamery. Many GO terms enriched during late metamorphosis were related to neurogenesis and signalling. Neural transcription factor genes exhibited clusters with distinct expression patterns. A suite of these genes was up-regulated during metamorphosis (e.g. Pax6, Eya, Hey, NeuroD, FoxD, Mbx, and Otp). In situ hybridization showed expression of neural genes in the CNS and sensory structures. Our results provide a foundation to understand the metamorphic transition in echinoderms and the genes involved in development and evolution of pentamery.


Subjects
Neurogenesis/genetics , Starfish/growth & development , Transcription Factors/metabolism , Animals , Molecular Evolution , Gene Expression Profiling , Developmental Gene Expression Regulation , Starfish/genetics
10.
BMC Genomics ; 20(Suppl 9): 913, 2019 Dec 24.
Article in English | MEDLINE | ID: mdl-31874628

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) is a fast-emerging technology allowing global transcriptome profiling at the single-cell level. Cell type identification from scRNA-seq data is a critical task in a variety of research areas such as developmental biology, cell reprogramming, and cancer. Typically, cell type identification relies on human inspection using a combination of prior biological knowledge (e.g. marker genes and morphology) and computational techniques (e.g. PCA and clustering). Due to the incompleteness of our current knowledge and the subjectivity involved in this process, a small number of cells may be subject to mislabelling. RESULTS: Here, we propose a semi-supervised learning framework, named scReClassify, for 'post hoc' cell type identification from scRNA-seq datasets. Starting from an initial cell type annotation with potentially mislabelled cells, scReClassify first performs dimension reduction using PCA and next applies a semi-supervised learning method to learn and subsequently reclassify cells that are likely mislabelled initially to the most probable cell types. By using both simulated and real-world experimental datasets that profiled various tissues and biological systems, we demonstrate that scReClassify is able to accurately identify and reclassify misclassified cells to their correct cell types. CONCLUSIONS: scReClassify can be used for scRNA-seq data as a post hoc cell type classification tool to fine-tune cell type annotations generated by any cell type classification procedure. It is implemented as an R package and is freely available from https://github.com/SydneyBioX/scReClassify.


Subjects
RNA-Seq/methods , Animals , Humans , Machine Learning , Mice , Single-Cell Analysis/methods , Software
11.
Proteomics ; 19(13): e1900068, 2019 07.
Article in English | MEDLINE | ID: mdl-31099962

ABSTRACT

The increasing role played by liquid chromatography-mass spectrometry (LC-MS)-based proteomics in biological discovery has led to a growing need for quality control (QC) on the LC-MS systems. While numerous quality control tools have been developed to track the performance of LC-MS systems based on a pre-defined set of performance factors (e.g., mass error, retention time), the precise influence and contribution of the performance factors and their generalization property to different biological samples are not as well characterized. Here, a web-based application (QCMAP) is developed for interactive diagnosis and prediction of the performance of LC-MS systems across different biological sample types. Leveraging on a standardized HeLa cell sample run as QC within a multi-user facility, predictive models are trained on a panel of commonly used performance factors to pinpoint the precise conditions to a (un)satisfactory performance in three LC-MS systems. It is demonstrated that the learned model can be applied to predict LC-MS system performance for brain samples generated from an independent study. By compiling these predictive models into our web-application, QCMAP allows users to benchmark the performance of their LC-MS systems using their own samples and identify key factors for instrument optimization. QCMAP is freely available from: http://shiny.maths.usyd.edu.au/QCMAP/.


Subjects
Liquid Chromatography/methods , Proteomics/methods , Quality Control , Tandem Mass Spectrometry/methods , Tumor Cell Line , HeLa Cells , Humans , Internet
12.
Proc Natl Acad Sci U S A ; 116(20): 9775-9784, 2019 05 14.
Article in English | MEDLINE | ID: mdl-31028141

ABSTRACT

Concerted examination of multiple collections of single-cell RNA sequencing (RNA-seq) data promises further biological insights that cannot be uncovered with individual datasets. Here we present scMerge, an algorithm that integrates multiple single-cell RNA-seq datasets using factor analysis of stably expressed genes and pseudoreplicates across datasets. Using a large collection of public datasets, we benchmark scMerge against published methods and demonstrate that it consistently provides improved cell type separation by removing unwanted factors; scMerge can also enhance biological discovery through robust data integration, which we show through the inference of development trajectory in a liver dataset collection.


Subjects
Meta-Analysis as Topic , RNA Sequence Analysis , Single-Cell Analysis , Software , Algorithms , Animals , Embryonic Development , Statistical Factor Analysis , Gene Expression , Humans , Mice
13.
Brief Bioinform ; 20(6): 2316-2326, 2019 11 27.
Article in English | MEDLINE | ID: mdl-30137247

ABSTRACT

Advances in high-throughput sequencing on single-cell gene expressions [single-cell RNA sequencing (scRNA-seq)] have enabled transcriptome profiling on individual cells from complex samples. A common goal in scRNA-seq data analysis is to discover and characterise cell types, typically through clustering methods. The quality of the clustering therefore plays a critical role in biological discovery. While numerous clustering algorithms have been proposed for scRNA-seq data, fundamentally they all rely on a similarity metric for categorising individual cells. Although several studies have compared the performance of various clustering algorithms for scRNA-seq data, currently there is no benchmark of different similarity metrics and their influence on scRNA-seq data clustering. Here, we compared a panel of similarity metrics on clustering a collection of annotated scRNA-seq datasets. Within each dataset, a stratified subsampling procedure was applied and an array of evaluation measures was employed to assess the similarity metrics. This produced a highly reliable and reproducible consensus on their performance assessment. Overall, we found that correlation-based metrics (e.g. Pearson's correlation) outperformed distance-based metrics (e.g. Euclidean distance). To test if the use of correlation-based metrics can benefit the recently published clustering techniques for scRNA-seq data, we modified a state-of-the-art kernel-based clustering algorithm (SIMLR) using Pearson's correlation as a similarity measure and found significant performance improvement over Euclidean distance on scRNA-seq data clustering. These findings demonstrate the importance of similarity metrics in clustering scRNA-seq data and highlight Pearson's correlation as a favourable choice. Further comparison on different scRNA-seq library preparation protocols suggests that they may also affect clustering performance. 
Finally, the benchmarking framework is available at http://www.maths.usyd.edu.au/u/SMS/bioinformatics/software.html.
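The contrast between correlation-based and distance-based metrics can be illustrated with a small sketch. The toy profiles below are hypothetical and only show one plausible reason the two families can disagree: Pearson correlation ignores overall scale, whereas Euclidean distance does not. The benchmark's finding itself is empirical and is not derived from this example.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pearson_distance(u, v):
    """1 - Pearson correlation; 0 means identical profile shape."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return 1.0 - cov / (su * sv)

# Hypothetical expression profiles over four genes.
cell_1 = [1.0, 5.0, 2.0, 8.0]
cell_2 = [3.0, 15.0, 6.0, 24.0]  # same shape as cell_1 at 3x the scale
cell_3 = [4.0, 4.0, 3.0, 5.0]    # closer in magnitude, different shape

for name, other in (("cell_2", cell_2), ("cell_3", cell_3)):
    print(name, euclidean(cell_1, other), pearson_distance(cell_1, other))
```

Under Euclidean distance `cell_3` is the nearer neighbour of `cell_1`, while under the Pearson-based distance `cell_2` is, so the choice of metric changes which cells a clustering algorithm would group together.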


Subjects
RNA Sequence Analysis , Algorithms , Cluster Analysis , Humans
14.
Sci Rep ; 8(1): 1774, 2018 01 29.
Article in English | MEDLINE | ID: mdl-29379070

ABSTRACT

Insulin resistance is a major risk factor for metabolic diseases such as Type 2 diabetes. Although the underlying mechanisms of insulin resistance remain elusive, oxidative stress is a unifying driver by which numerous extrinsic signals and cellular stresses trigger insulin resistance. Consequently, we sought to understand the cellular response to oxidative stress and its role in insulin resistance. Using cultured 3T3-L1 adipocytes, we established a model of physiologically-derived oxidative stress by inhibiting the cycling of glutathione and thioredoxin, which induced insulin resistance as measured by impaired insulin-stimulated 2-deoxyglucose uptake. Using time-resolved transcriptomics, we found > 2000 genes differentially-expressed over 24 hours, with specific metabolic and signalling pathways enriched at different times. We explored this coordination using a knowledge-based hierarchical-clustering approach to generate a temporal transcriptional cascade and identify key transcription factors responding to oxidative stress. This response shared many similarities with changes observed in distinct insulin resistance models. However, an anti-oxidant reversed insulin resistance phenotypically but not transcriptionally, implying that the transcriptional response to oxidative stress is insufficient for insulin resistance. This suggests that the primary site by which oxidative stress impairs insulin action occurs post-transcriptionally, warranting a multi-level 'trans-omic' approach when studying time-resolved responses to cellular perturbations.


Subjects
Adipocytes/metabolism , Insulin Resistance/genetics , Oxidative Stress/genetics , Genetic Transcription/genetics , 3T3-L1 Cells , Animals , Cell Line , Deoxyglucose/metabolism , Type 2 Diabetes Mellitus/genetics , Type 2 Diabetes Mellitus/metabolism , Glucose/metabolism , Insulin/genetics , Mice , Signal Transduction/genetics
15.
Bioinformatics ; 33(13): 1916-1920, 2017 Jul 01.
Article in English | MEDLINE | ID: mdl-28203701

ABSTRACT

Motivation: DNA binding proteins such as chromatin remodellers, transcription factors (TFs), histone modifiers and co-factors often bind cooperatively to activate or repress their target genes in a cell type-specific manner. Nonetheless, the precise role of cooperative binding in defining cell-type identity is still largely uncharacterized. Results: Here, we collected and analyzed 214 public datasets representing chromatin immunoprecipitation followed by sequencing (ChIP-Seq) of 104 DNA binding proteins in embryonic stem cell (ESC) lines. We classified their binding sites into those proximal to gene promoters and those in distal regions, and developed a web resource called Proximal And Distal (PAD) clustering to identify their co-localization at these respective regions. Using this extensive dataset, we discovered an extensive co-localization of BRG1 and CHD7 at distal but not proximal regions. The comparison of co-localization sites to those bound by either BRG1 or CHD7 alone showed an enrichment of ESC master TFs binding and active chromatin architecture at co-localization sites. Most notably, our analysis reveals the co-dependency of BRG1 and CHD7 at distal regions on regulating expression of their common target genes in ESC. This work sheds light on cooperative binding of TF binding proteins in regulating gene expression in ESC, and demonstrates the utility of integrative analysis of a manually curated compendium of genome-wide protein binding profiles in our online resource PAD. Availability and Implementation: PAD is freely available at http://pad.victorchang.edu.au/ and its source code is available via an open source GPL 3.0 license at https://github.com/VCCRI/PAD/. Contact: pengyi.yang@sydney.edu.au or j.ho@victorchang.edu.au. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
DNA Helicases/genetics , DNA-Binding Proteins/genetics , Embryonic Stem Cells/metabolism , Developmental Gene Expression Regulation , Nuclear Proteins/genetics , DNA Sequence Analysis/methods , Software , Transcription Factors/genetics , Animals , Cell Line , Chromatin Immunoprecipitation/methods , Mice
16.
BMC Dev Biol ; 17(1): 4, 2017 02 13.
Article in English | MEDLINE | ID: mdl-28193178

ABSTRACT

BACKGROUND: The molecular mechanisms underlying the development of the unusual echinoderm pentameral body plan and their likeness to mechanisms underlying the development of the bilateral plans of other deuterostomes are of interest in tracing body plan evolution. In this first study of the spatial expression of genes associated with Nodal and BMP2/4 signalling during the transition to pentamery in sea urchins, we investigate Heliocidaris erythrogramma, a species that provides access to the developing adult rudiment within days of fertilization. RESULTS: BMP2/4, and the putative downstream genes, Six1/2, Eya, Tbx2/3 and Msx were expressed in the earliest morphological manifestation of pentamery during development, the five hydrocoele lobes. The formation of the vestibular ectoderm, the specialized region overlying the left coelom that forms adult ectoderm, involved the expression of putative Nodal target genes Chordin, Gsc and BMP2/4 and putative BMP2/4 target genes Dlx, Msx and Tbx. The expression of Nodal, Lefty and Pitx2 in the right ectoderm, and Pitx2 in the right coelom, was as previously observed in other sea urchins. CONCLUSION: That genes associated with Nodal and BMP2/4 signalling are expressed in the hydrocoele lobes, indicates that they have a role in the developmental transition to pentamery, contributing to our understanding of how the most unusual body plan in the Bilateria may have evolved. We suggest that the Nodal and BMP2/4 signalling cascades might have been duplicated or split during the evolution to pentamery.


Subjects
Anthocidaris/growth & development , Anthocidaris/genetics , Body Patterning/genetics , Bone Morphogenetic Proteins/genetics , Developmental Gene Expression Regulation , Nodal Protein/genetics , Animals , Bone Morphogenetic Proteins/metabolism , Ectoderm/metabolism , Nodal Protein/metabolism , Signal Transduction
17.
Methods Mol Biol ; 1558: 459-469, 2017.
Article in English | MEDLINE | ID: mdl-28150252

ABSTRACT

Protein post-translational modifications (PTMs) are crucial for signal transduction in cells. In order to understand key cell signaling events, identification of functionally important PTMs, which are more likely to be evolutionarily conserved, is necessary. In recent times, high-throughput mass spectrometry (MS) has made quantitative datasets in diverse species readily available, which has led to a growing need for tools to facilitate cross-species comparison of PTM data. Cross-species comparison of PTM sites is difficult since they often lie in structurally disordered protein domains. Current tools that address this can only map known PTMs between species based on previously annotated orthologous phosphosites and do not enable cross-species mapping of newly identified modification sites. Here, we describe an automated web-based tool, PhosphOrtholog, that accurately maps annotated and novel orthologous PTM sites from high-throughput MS-based experimental data obtained from different species without relying on existing PTM databases. Identification of conserved PTMs across species from large-scale experimental data increases our knowledgebase of evolutionarily conserved and functional PTM sites that influence most biological processes. In this Chapter, we illustrate with examples how to use PhosphOrtholog to map novel PTM sites from cross-species MS-based phosphoproteomics data.


Subjects
Amino Acids/metabolism , Computational Biology/methods , Protein Databases , Phosphoproteins/metabolism , Post-Translational Protein Processing , Proteomics/methods , Algorithms , Animals , Humans , Proteome , Search Engine , Software , Species Specificity , User-Computer Interface , Web Browser
18.
Proteomics ; 16(13): 1868-71, 2016 07.
Article in English | MEDLINE | ID: mdl-27145998

ABSTRACT

Mass spectrometry (MS)-based quantitative phosphoproteomics has become a key approach for proteome-wide profiling of phosphorylation in tissues and cells. Traditional experimental design often compares a single treatment with a control, whereas increasingly more experiments are designed to compare multiple treatments with respect to a control. To this end, the development of bioinformatic tools that can integrate multiple treatments and visualise kinases and substrates under combinatorial perturbations is vital for dissecting concordant and/or independent effects of each treatment. Here, we propose a hypothesis driven kinase perturbation analysis (KinasePA) to annotate and visualise kinases and their substrates that are perturbed by various combinatorial effects of treatments in phosphoproteomics experiments. We demonstrate the utility of KinasePA through its application to two large-scale phosphoproteomics datasets and show its effectiveness in dissecting kinases and substrates within signalling pathways driven by unique combinations of cellular stimuli and inhibitors. We implemented and incorporated KinasePA as part of the "directPA" R package available from the comprehensive R archive network (CRAN). Furthermore, KinasePA also has an interactive web interface that can be readily applied to annotate user provided phosphoproteomics data (http://kinasepa.pengyiyang.org).


Subjects
Protein Kinases/metabolism , Proteomics/methods , Cell Line , Chromones/pharmacology , Protein Databases , 3-Ring Heterocyclic Compounds/pharmacology , Humans , Insulin/metabolism , Morpholines/pharmacology , Naphthyridines/pharmacology , Phosphorylation , Protein Kinase Inhibitors/pharmacology , Signal Transduction/drug effects , Sirolimus/pharmacology , TOR Serine-Threonine Kinases/antagonists & inhibitors , TOR Serine-Threonine Kinases/metabolism
19.
Comput Biol Chem ; 63: 73-82, 2016 08.
Article in English | MEDLINE | ID: mdl-26935398

ABSTRACT

BACKGROUND: Data made available through large cancer consortia like The Cancer Genome Atlas make for a rich source of information to be studied across and between cancers. In recent years, network approaches have been applied to such data in uncovering the complex interrelationships between mutational and expression profiles, but lack direct testing for expression changes via mutation. In this pan-cancer study we analyze mutation and gene expression information in an integrative manner by considering the networks generated by testing for differences in expression in direct association with specific mutations. We relate our findings among the 19 cancers examined to identify commonalities and differences as well as their characteristics. RESULTS: Using somatic mutation and gene expression information across 19 cancers, we generated mutation-expression networks per cancer. On evaluation we found that our generated networks were significantly enriched for known cancer-related genes, such as skin cutaneous melanoma (p<0.01 using Network of Cancer Genes 4.0). Our framework identified that while different cancers contained commonly mutated genes, there was little concordance between associated gene expression changes among cancers. Comparison between cancers showed a greater overlap of network nodes for cancers with higher overall non-silent mutation load, compared to those with a lower overall non-silent mutation load. CONCLUSIONS: This study offers a framework that explores network information through co-analysis of somatic mutations and gene expression profiles. Our pan-cancer application of this approach suggests that while mutations are frequently common among cancer types, the impact they have on the surrounding networks via gene expression changes varies. 
Despite this finding, there are some cancers for which mutation-associated network behaviour appears to be similar: suggesting a potential framework for uncovering related cancers for which similar therapeutic strategies may be applicable. Our framework for understanding relationships among cancers has been integrated into an interactive R Shiny application, PAn Cancer Mutation Expression Networks (PACMEN), containing dynamic and static network visualization of the mutation-expression networks. PACMEN also features tools for further examination of network topology characteristics among cancers.


Subjects
Mutation , Neoplasms/genetics , Gene Regulatory Networks , Genetic Predisposition to Disease , Humans , Theoretical Models
20.
PLoS Comput Biol ; 11(8): e1004403, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26252020

ABSTRACT

Cell signaling underlies transcription/epigenetic control of a vast majority of cell-fate decisions. A key goal in cell signaling studies is to identify the set of kinases that underlie key signaling events. In a typical phosphoproteomics study, phosphorylation sites (substrates) of active kinases are quantified proteome-wide. By analyzing the activities of phosphorylation sites over a time-course, the temporal dynamics of signaling cascades can be elucidated. Since many substrates of a given kinase have similar temporal kinetics, clustering phosphorylation sites into distinctive clusters can facilitate identification of their respective kinases. Here we present a knowledge-based CLUster Evaluation (CLUE) approach for identifying the most informative partitioning of a given temporal phosphoproteomics dataset. Our approach utilizes prior knowledge, annotated kinase-substrate relationships mined from literature and curated databases, to first generate biologically meaningful partitioning of the phosphorylation sites and then determine key kinases associated with each cluster. We demonstrate the utility of the proposed approach on two time-series phosphoproteomics datasets and identify key kinases associated with human embryonic stem cell differentiation and the insulin signaling pathway. The proposed approach will be a valuable resource in the identification and characterization of signaling networks from phosphoproteomics data.
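The step of linking clusters to kinases through annotated kinase-substrate relationships can be sketched as a one-sided hypergeometric enrichment test. The abstract does not specify CLUE's scoring, so this generic test is an assumption; the function name, site names and annotation are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(cluster, substrates, universe):
    """P(X >= k) for the overlap k between a cluster and a kinase's
    annotated substrates, given all quantified sites (the universe)."""
    universe = set(universe)
    cluster = set(cluster) & universe
    substrates = set(substrates) & universe
    N, K, n = len(universe), len(substrates), len(cluster)
    k = len(cluster & substrates)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / total

# Hypothetical phosphosites and annotation, for illustration only.
universe = [f"site{i}" for i in range(20)]
kinase_substrates = ["site0", "site1", "site2", "site3"]
cluster_a = ["site0", "site1", "site2", "site10", "site11"]
cluster_b = ["site12", "site13", "site14", "site15", "site16"]

p_a = hypergeom_enrichment_p(cluster_a, kinase_substrates, universe)
p_b = hypergeom_enrichment_p(cluster_b, kinase_substrates, universe)
print(p_a, p_b)
```

A partitioning in which clusters show strong enrichment for the substrates of particular kinases would be considered more informative than one in which known substrates are scattered across clusters.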


Subjects
Cell Communication/physiology , Knowledge Bases , Phosphoproteins/metabolism , Proteome/metabolism , Proteomics/methods , Signal Transduction/physiology , Cell Differentiation/physiology , Cell Line , Protein Databases , Embryonic Stem Cells , Humans