1.
Mol Cell Proteomics ; 21(9): 100280, 2022 09.
Article in English | MEDLINE | ID: mdl-35944844

ABSTRACT

Mouse models of Alzheimer's disease (AD) show progression through stages reflective of human pathology. Proteomics identification of temporal and sex-linked factors driving AD-related pathways can be used to dissect initiating and propagating events of AD stages to develop biomarkers or design interventions. In the present study, we conducted label-free proteome measurements of mouse hippocampus tissue with variables of time (3, 6, and 9 months), genetic background (5XFAD versus WT), and sex (equal males and females). These time points are associated with well-defined phenotypes with respect to Aβ42 plaque deposition, memory deficits, and neuronal loss, allowing correlation of proteome-based molecular signatures with the mouse model stages. Our data show that 5XFAD mice exhibit increases in known human AD biomarkers: amyloid-beta peptide, APOE, GFAP, and ITM2B are upregulated across all time points/stages. At the same time, 23 proteins are here newly associated with Alzheimer's pathology, as they are also dysregulated in 5XFAD mice. At the pathway level, the 5XFAD-specific upregulated proteins are significantly enriched for DNA damage and stress-induced senescence at 3 months only, while at 6 months, the AD-specific proteome signature is altered and significantly enriched for membrane trafficking and vesicle-mediated transport protein annotations. By 9 months, AD-specific dysregulation is also characterized by significant neuroinflammation with innate immune system, platelet activation, and hyper-reactive astrocyte-related enrichments. Aside from these temporal changes, analysis of sex-linked differences in proteome signatures uncovered novel sex- and AD-associated proteins. Pathway analysis revealed sex-linked differences in the 5XFAD model to be involved in the regulation of well-known human AD-related processes of amyloid fibril formation, wound healing, lysosome biogenesis, and DNA damage. Verification of the discovery results by Western blot and parallel reaction monitoring confirms the fundamental conclusions of the study and poises the 5XFAD model for further use as a molecular tool for understanding AD.


Subjects
Alzheimer Disease , Alzheimer Disease/metabolism , Amyloid , Amyloid beta-Peptides/metabolism , Animals , Apolipoproteins E/metabolism , Biomarkers , Disease Models, Animal , Female , Humans , Male , Mice , Mice, Transgenic , Proteome
2.
Int J Mol Sci ; 25(12)2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38928221

ABSTRACT

Methionine oxidation to the sulfoxide form (MSox) is a poorly understood post-translational modification of proteins associated with non-specific chemical oxidation from reactive oxygen species (ROS), whose chemistries are linked to various disease pathologies, including neurodegeneration. Emerging evidence shows MSox site occupancy is, in some cases, under enzymatic regulatory control, mediating cellular signaling, including phosphorylation and/or calcium signaling, and raising questions as to the speciation and functional nature of MSox across the proteome. The 5XFAD lineage of the C57BL/6 mouse has well-defined Alzheimer's and aging states. Using this model, we analyzed age-, sex-, and disease-dependent MSox speciation in the mouse hippocampus. In addition, we explored the chemical stability and statistical variance of oxidized peptide signals to understand the needed power for MSox-based proteome studies. Our results identify mitochondrial and glycolytic pathway targets with increases in MSox with age as well as neuroinflammatory targets accumulating MSox with AD in proteome studies of the mouse hippocampus. Further, this paper establishes a foundation for reproducible and rigorous experimental MSox-omics appropriate for novel target identification in biological discovery and for biomarker analysis in ROS and other oxidation-linked diseases.


Subjects
Aging , Alzheimer Disease , Glycolysis , Hippocampus , Methionine , Mice, Inbred C57BL , Mitochondria , Proteomics , Animals , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Hippocampus/metabolism , Mice , Mitochondria/metabolism , Proteomics/methods , Methionine/metabolism , Methionine/analogs & derivatives , Aging/metabolism , Male , Female , Oxidation-Reduction , Proteome/metabolism , Reactive Oxygen Species/metabolism , Disease Models, Animal
3.
Bioinformatics ; 38(15): 3785-3793, 2022 08 02.
Article in English | MEDLINE | ID: mdl-35731218

ABSTRACT

MOTIVATION: Protein phosphorylation is a ubiquitous regulatory mechanism that plays a central role in cellular signaling. According to recent estimates, up to 70% of human proteins can be phosphorylated. Therefore, the characterization of phosphorylation dynamics is critical for understanding a broad range of biological and biochemical processes. Technologies based on mass spectrometry are rapidly advancing to meet the needs for high-throughput screening of phosphorylation. These technologies enable untargeted quantification of thousands of phosphorylation sites in a given sample. Many labs are already utilizing these technologies to comprehensively characterize signaling landscapes by examining perturbations with drugs and knockdown approaches, or by assessing diverse phenotypes in cancers, neurodegenerative diseases, infectious diseases and normal development. RESULTS: We comprehensively investigate the concept of 'co-phosphorylation' (Co-P), defined as the correlated phosphorylation of a pair of phosphosites across various biological states. We integrate nine publicly available phosphoproteomics datasets for various diseases (including breast cancer, ovarian cancer and Alzheimer's disease) and utilize functional data related to sequence, evolutionary histories, kinase annotations and pathway annotations to investigate the functional relevance of Co-P. Our results across a broad range of studies consistently show that functionally associated sites tend to exhibit significant positive or negative Co-P. Specifically, we show that Co-P can be used to predict with high precision the sites that are on the same pathway or that are targeted by the same kinase. Overall, these results establish Co-P as a useful resource for analyzing phosphoproteins in a network context, which can help extend our knowledge on cellular signaling and its dysregulation. AVAILABILITY AND IMPLEMENTATION: github.com/msayati/Cophosphorylation. This research used the publicly available datasets published by other researchers as cited in the manuscript. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
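
The definition of Co-P above (correlated phosphorylation of a pair of phosphosites across biological states) can be illustrated with a minimal sketch. It assumes plain Pearson correlation and NaN-coded missing values; the abstract does not say which correlation measure the authors use, and the function name and the `min_states` parameter are hypothetical.

```python
import numpy as np

def co_phosphorylation(x, y, min_states=3):
    """Pearson correlation of two phosphosite profiles across states.

    Assumption: missing measurements are coded as NaN and are skipped
    pairwise. Returns NaN when fewer than `min_states` shared states
    remain or when either profile is constant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~(np.isnan(x) | np.isnan(y))      # states observed for both sites
    if keep.sum() < min_states:
        return float("nan")
    xc = x[keep] - x[keep].mean()            # center each profile
    yc = y[keep] - y[keep].mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    if denom == 0:
        return float("nan")
    return float((xc * yc).sum() / denom)
```

Values near +1 or -1 correspond to the strong positive or negative Co-P described in the abstract; a robust correlation measure could be substituted without changing the interface.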


Subjects
Phosphoproteins , Proteomics , Humans , Phosphorylation , Proteomics/methods , Phosphoproteins/chemistry , Mass Spectrometry/methods , Phosphotransferases/metabolism
4.
Bioinformatics ; 38(4): 908-917, 2022 01 27.
Article in English | MEDLINE | ID: mdl-34864867

ABSTRACT

MOTIVATION: Genome-wide association studies show that variants in individual genomic loci alone are not sufficient to explain the heritability of complex, quantitative phenotypes. Many computational methods have been developed to address this issue by considering subsets of loci that can collectively predict the phenotype. This problem can be considered a challenging instance of feature selection in which the number of dimensions (loci that are screened) is much larger than the number of samples. While currently available methods can achieve decent phenotype prediction performance, they either do not scale to large datasets or have parameters that require extensive tuning. RESULTS: We propose a fast and simple algorithm, Macarons, to select a small, complementary subset of variants by avoiding redundant pairs that are likely to be in linkage disequilibrium. Our method features two interpretable parameters that control the time/performance trade-off without requiring parameter tuning. In our computational experiments, we show that Macarons consistently achieves similar or better prediction performance than state-of-the-art selection methods while having a simpler premise and being at least two orders of magnitude faster. Overall, Macarons can seamlessly scale to the human genome with ∼10^7 variants in a matter of minutes while taking the dependencies between the variants into account. AVAILABILITY AND IMPLEMENTATION: Macarons is available in Matlab and Python at https://github.com/serhan-yilmaz/macarons. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
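
The general idea of selecting a small, complementary subset of variants while skipping redundant ones can be sketched as a greedy filter. This is not the Macarons algorithm itself: the function name, the squared-correlation proxy for linkage disequilibrium, and the `k` and `max_r2` parameters are assumptions made for illustration.

```python
import numpy as np

def greedy_complementary_select(genotypes, phenotype, k, max_r2=0.5):
    """Pick up to k variants, most phenotype-correlated first, skipping any
    variant whose squared correlation with an already selected variant
    exceeds max_r2 (an assumed stand-in for linkage disequilibrium)."""
    X = np.asarray(genotypes, dtype=float)      # shape: samples x variants
    y = np.asarray(phenotype, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    y_norm = np.sqrt((yc ** 2).sum())
    if y_norm == 0:
        raise ValueError("phenotype is constant")
    x_norm = np.sqrt((Xc ** 2).sum(axis=0))
    x_norm[x_norm == 0] = np.inf                # constant variants score zero
    relevance = np.abs(Xc.T @ yc) / (x_norm * y_norm)
    Z = Xc / x_norm                             # unit-norm centered columns
    selected = []
    for j in np.argsort(-relevance, kind="stable"):
        if len(selected) == k:
            break
        # keep j only if it is not redundant with any selected variant
        if all((Z[:, j] @ Z[:, s]) ** 2 <= max_r2 for s in selected):
            selected.append(int(j))
    return selected
```

In this sketch the redundancy check compares each candidate against every selected variant, which is simple but would not scale to ∼10^7 variants; restricting the check to nearby variants on the same chromosome would be one way to reduce that cost.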


Subjects
Algorithms , Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Phenotype , Linkage Disequilibrium , Genome, Human , Polymorphism, Single Nucleotide
5.
Medicina (Kaunas) ; 59(1)2023 Jan 07.
Article in English | MEDLINE | ID: mdl-36676744

ABSTRACT

Background and Objectives: There is no biomarker to predict lithium response. This study used CellPrint™ enhanced flow cytometry to study 28 proteins representing a spectrum of cellular pathways in monocytes and CD4+ lymphocytes before and after lithium treatment in patients with bipolar disorder (BD). Materials and Methods: Symptomatic patients with BD type I or II received lithium (serum level ≥ 0.6 mEq/L) for 16 weeks. Patients were assessed with standard rating scales and divided into two groups, responders (≥50% improvement from baseline) and non-responders. Twenty-eight intracellular proteins in CD4+ lymphocytes and monocytes were analyzed with CellPrint™, an enhanced flow cytometry procedure. Data were analyzed for differences in protein expression levels. Results: The intent-to-treat sample included 13 lithium-responders (12 blood samples before treatment and 9 after treatment) and 11 lithium-non-responders (11 blood samples before treatment and 4 after treatment). No significant differences in expression between the groups were observed prior to lithium treatment. After treatment, the majority of analytes increased expression in responders and decreased expression in non-responders. Significant increases were seen for PDEB4 and NR3C1 in responders. A significant decrease was seen for NR3C1 in non-responders. Conclusions: Lithium induced divergent directionality of protein expression depending on whether the patient was a responder or non-responder, elucidating molecular characteristics of lithium responsiveness. A subsequent study with a larger sample size is warranted.


Subjects
Bipolar Disorder , Lithium , Humans , Lithium/pharmacology , Lithium/therapeutic use , Bipolar Disorder/drug therapy , Lithium Compounds , Flow Cytometry , Cell Line
6.
Bioinformatics ; 37(23): 4501-4508, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34152393

ABSTRACT

BACKGROUND: Link prediction is an important and well-studied problem in network biology. Recently, graph representation learning methods, including Graph Convolutional Network (GCN)-based node embedding, have drawn increasing attention in link prediction. MOTIVATION: An important component of GCN-based network embedding is the convolution matrix, which is used to propagate features across the network. Existing algorithms use the degree-normalized adjacency matrix for this purpose, as this matrix is closely related to the graph Laplacian, capturing the spectral properties of the network. In parallel, it has been shown that GCNs with a single layer can generate more robust embeddings by reducing the number of parameters. Laplacian-based convolution is not well suited to single-layered GCNs, as it limits the propagation of information to immediate neighbors of a node. RESULTS: Capitalizing on the rich literature on unsupervised link prediction, we propose using node similarity-based convolution matrices in GCNs to compute node embeddings for link prediction. We consider eight representative node-similarity measures (Common Neighbors, Jaccard Index, Adamic-Adar, Resource Allocation, Hub-Depressed Index, Hub-Promoted Index, Sørensen Index and Salton Index) for this purpose. We systematically compare the performance of the resulting algorithms against GCNs that use the degree-normalized adjacency matrix for convolution, as well as other link prediction algorithms. In our experiments, we use three link prediction tasks involving biomedical networks: drug-disease association prediction, drug-drug interaction prediction and protein-protein interaction prediction. Our results show that node similarity-based convolution matrices significantly improve the link prediction performance of GCN-based embeddings. CONCLUSION: As sophisticated machine-learning frameworks are increasingly employed in biological applications, historically well-established methods can be useful in making a head start. AVAILABILITY AND IMPLEMENTATION: Our method, SiGraC, is implemented as a Python library and is freely available at https://github.com/mustafaCoskunAgu/SiGraC.
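
The eight node-similarity measures named above have standard textbook definitions, sketched here for a single node pair in an undirected graph. This is an illustrative sketch rather than the SiGraC code; the function names are hypothetical, and how such scores are assembled into a convolution matrix is left out.

```python
import math

def neighbor_sets(edges):
    """Build an undirected neighbor map from a list of (u, v) edges."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    return nbrs

def similarity_scores(nbrs, u, v):
    """Standard neighborhood-based similarity scores for distinct nodes u, v."""
    nu, nv = nbrs[u], nbrs[v]
    common = nu & nv
    c, du, dv = len(common), len(nu), len(nv)
    return {
        "common_neighbors": c,
        "jaccard": c / len(nu | nv),
        # a common neighbor is adjacent to both u and v, so its degree is >= 2
        "adamic_adar": sum(1.0 / math.log(len(nbrs[w])) for w in common),
        "resource_allocation": sum(1.0 / len(nbrs[w]) for w in common),
        "hub_depressed": c / max(du, dv),
        "hub_promoted": c / min(du, dv),
        "sorensen": 2.0 * c / (du + dv),
        "salton": c / math.sqrt(du * dv),
    }
```

Unlike the adjacency matrix, these scores are nonzero for node pairs that share neighbors without being directly linked, which is consistent with the abstract's point about propagation beyond immediate neighbors in a single layer.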


Subjects
Algorithms , Libraries , Gene Library , Machine Learning
7.
Bioinformatics ; 37(2): 221-228, 2021 04 19.
Article in English | MEDLINE | ID: mdl-32730576

ABSTRACT

MOTIVATION: Protein phosphorylation is a ubiquitous mechanism of post-translational modification that plays a central role in cellular signaling. Phosphorylation is particularly important in the context of cancer, as downregulation of tumor suppressors and upregulation of oncogenes by the dysregulation of associated kinase and phosphatase networks are shown to have key roles in tumor growth and progression. Despite recent advances that enable large-scale monitoring of protein phosphorylation, these data are not fully incorporated into such computational tasks as phenotyping and subtyping of cancers. RESULTS: We develop a network-based algorithm, CoPPNet, to enable unsupervised subtyping of cancers using phosphorylation data. For this purpose, we integrate prior knowledge on evolutionary, structural and functional association of phosphosites, kinase-substrate associations and protein-protein interactions with the correlation of phosphorylation of phosphosites across different tumor samples (a.k.a. co-phosphorylation) to construct a context-specific weighted network of phosphosites. We then mine these networks to identify subnetworks with correlated phosphorylation patterns. We apply the proposed framework to two mass-spectrometry-based phosphorylation datasets for breast cancer (BC), and observe that (i) the phosphorylation patterns of the identified subnetworks are highly correlated with clinically identified subtypes, and (ii) the identified subnetworks are highly reproducible across datasets that are derived from different studies. Our results show that integration of quantitative phosphorylation data with network frameworks can provide mechanistic insights into the differences between the signaling mechanisms that drive BC subtypes. Furthermore, the reproducibility of the identified subnetworks suggests that phosphorylation can provide robust classification of disease response and markers. AVAILABILITY AND IMPLEMENTATION: CoPPNet is available at http://compbio.case.edu/coppnet/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Breast Neoplasms , Breast Neoplasms/genetics , Humans , Phosphorylation , Protein Processing, Post-Translational , Reproducibility of Results , Signal Transduction
8.
Bioinformatics ; 36(12): 3652-3661, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32044914

ABSTRACT

MOTIVATION: Protein phosphorylation is a key regulator of protein function in signal transduction pathways. Kinases are the enzymes that catalyze the phosphorylation of other proteins in a target-specific manner. The dysregulation of phosphorylation is associated with many diseases including cancer. Although the advances in phosphoproteomics enable the identification of phosphosites at the proteome level, most of the phosphoproteome is still in the dark: more than 95% of the reported human phosphosites have no known kinases. Determining which kinase is responsible for phosphorylating a site remains an experimental challenge. Existing computational methods require several examples of known targets of a kinase to make accurate kinase-specific predictions, yet for a large body of kinases, only a few or no target sites are reported. RESULTS: We present DeepKinZero, the first zero-shot learning approach to predict the kinase acting on a phosphosite for kinases with no known phosphosite information. DeepKinZero transfers knowledge from kinases with many known target phosphosites to those kinases with no known sites through a zero-shot learning model. The kinase-specific positional amino acid preferences are learned using a bidirectional recurrent neural network. We show that DeepKinZero achieves significant improvement in accuracy for kinases with no known phosphosites in comparison to the baseline model and other methods available. By expanding our knowledge on understudied kinases, DeepKinZero can help to chart the phosphoproteome atlas. AVAILABILITY AND IMPLEMENTATION: The source codes are available at https://github.com/Tastanlab/DeepKinZero. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Phosphoproteins , Phosphotransferases , Humans , Phosphoproteins/metabolism , Phosphorylation , Proteome , Software
9.
PLoS Comput Biol ; 15(2): e1006678, 2019 02.
Article in English | MEDLINE | ID: mdl-30811403

ABSTRACT

We present CoPhosK to predict kinase-substrate associations for phosphopeptide substrates detected by mass spectrometry (MS). The tool utilizes a Naïve Bayes framework with priors of known kinase-substrate associations (KSAs) to generate its predictions. Through the mining of MS data for the collective dynamic signatures of the kinases' substrates revealed by correlation analysis of phosphopeptide intensity data, the tool infers KSAs in the data for the considerable body of substrates lacking such annotations. We benchmarked the tool against existing approaches for predicting KSAs that rely on static information (e.g. sequences, structures and interactions) using publicly available MS data, including breast, colon, and ovarian cancer models. The benchmarking reveals that co-phosphorylation analysis can significantly improve prediction performance when static information is available (about 35% of sites) while providing reliable predictions for the remainder, thus tripling the KSAs available from the experimental MS data and contributing to a comprehensive and reliable characterization of the landscape of kinase-substrate interactions well beyond current limitations.


Subjects
Computational Biology/methods , Protein Kinases/physiology , Substrate Specificity/physiology , Bayes Theorem , Binding Sites , Databases, Protein , Humans , Mass Spectrometry , Phosphorylation/physiology , Phosphotransferases/physiology , Protein Binding , Protein Interaction Mapping , Proteome , Sequence Analysis, Protein , Software
10.
BMC Womens Health ; 20(1): 269, 2020 12 07.
Article in English | MEDLINE | ID: mdl-33287806

ABSTRACT

BACKGROUND: It is estimated that a majority of intimate partner violence (IPV) victims suffer from blunt force to the head, neck and face area. Injuries to the head and neck are among the major causes of traumatic brain injury (TBI). METHODS: In this interdisciplinary study, we aimed to characterize the key associations between IPV and TBI by mining de-identified electronic health records data with more than 12 M records between 1999 and 2017 from the IBM Explorys platform. For this purpose, we formulated a data-driven analytical framework to identify significant health correlates among IPV, TBI and six control cohorts. Using this framework, we assessed the co-morbidity, shared prevalence, and synergy between pairs of conditions. RESULTS: Our findings suggested that health effects attributed to malnutrition, acquired thrombocytopenia, post-traumatic wound infection, local infection of wound, poisoning by cardiovascular drug, alcoholic cirrhosis, alcoholic fatty liver, and drug-induced cirrhosis were highly significant at the joint presence of IPV and TBI. CONCLUSION: To develop a better understanding of how IPV is related to negative health effects, it is potentially useful to determine the interactions and relationships between symptom categories. Our results can potentially improve the accuracy and confidence of existing clinical screening techniques in determining IPV-induced TBI diagnoses.


Subjects
Brain Injuries, Traumatic , Intimate Partner Violence , Brain Injuries, Traumatic/epidemiology , Data Analysis , Electronic Health Records , Humans , Intimate Partner Violence/statistics & numerical data , Prevalence
11.
BMC Bioinformatics ; 20(Suppl 12): 320, 2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31216985

ABSTRACT

BACKGROUND: As Genome-Wide Association Studies (GWAS) have been increasingly used with data from various populations, it has been observed that data from different populations reveal different sets of Single Nucleotide Polymorphisms (SNPs) that are associated with the same disease. Using Type II Diabetes (T2D) as a test case, we develop measures and methods to characterize the functional overlap of SNPs associated with the same disease across populations. RESULTS: We introduce the notion of an Overlap Matrix as a general means of characterizing the functional overlap between different SNP sets at different genomic and functional granularities. Using SNP-to-gene mapping, functional annotation databases, and functional association networks, we assess the degree of functional overlap across nine populations from Asian and European ethnic origins. We further assess the generalizability of the method by applying it to a dataset for another complex disease - Prostate Cancer. Our results show that more overlap is captured as more functional data is incorporated as we go through the pipeline, starting from SNPs and ending at network overlap analyses. We hypothesize that these observed differences in the functional mechanisms of T2D across populations can also explain the common use of different prescription drugs in different populations. We show that this hypothesis is concordant with the literature on the functional mechanisms of prescription drugs. CONCLUSION: Our results show that although the etiology of a complex disease can be associated with distinct processes that are affected in different populations, network-based annotations can capture more functional overlap across populations. These results support the notion that it can be useful to take ethnicity into account in making personalized treatment decisions for complex diseases.


Subjects
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide/genetics , Asian People , Diabetes Mellitus, Type 2/drug therapy , Ethnicity , Genome, Human , Humans , Male , Prostatic Neoplasms/genetics , Protein Interaction Maps/genetics
12.
Mol Hum Reprod ; 25(7): 408-422, 2019 07 01.
Article in English | MEDLINE | ID: mdl-31211832

ABSTRACT

Parturition involves cellular signaling changes driven by the complex interplay between progesterone (P4), inflammation, and the cyclic adenosine monophosphate (cAMP) pathway. To characterize this interplay, we performed comprehensive transcriptomic studies utilizing eight treatment combinations on myometrial cell lines and tissue samples from pregnant women. We performed genome-wide RNA-sequencing on the hTERT-HM^A/B cell line treated with all combinations of P4, forskolin (FSK) (induces cAMP), and interleukin-1β (IL-1β). We then performed gene set enrichment and regulatory network analyses to identify pathways commonly, differentially, or synergistically regulated by these treatments. Finally, we used tissue similarity index (TSI) to characterize the correspondence between cell lines and tissue phenotypes. We observed that in addition to their individual anti-inflammatory effects, P4 and cAMP synergistically blocked specific inflammatory pathways/regulators including STAT3/6, CEBPA/B, and OCT1/7, but not NFκB. TSI analysis indicated that FSK + P4- and IL-1β-treated cells exhibit transcriptional signatures highly similar to non-laboring and laboring term myometrium, respectively. Our results identify potential therapeutic targets to prevent preterm birth and show that the hTERT-HM^A/B cell line provides an accurate transcriptional model for term myometrial tissue.


Subjects
Cyclic AMP/genetics , Inflammation/genetics , Myometrium/metabolism , Parturition/genetics , Parturition/physiology , Progesterone/genetics , Signal Transduction/physiology , Female , Humans , In Vitro Techniques , Interleukin-1beta/genetics , Labor, Obstetric/metabolism , Pregnancy , RNA-Seq , Signal Transduction/genetics
13.
Nucleic Acids Res ; 45(14): e131, 2017 Aug 21.
Article in English | MEDLINE | ID: mdl-28605458

ABSTRACT

Epistasis is defined as a statistical interaction between two or more genomic loci in terms of their association with a phenotype of interest. Epistatic loci that are identified using data from Genome-Wide Association Studies (GWAS) provide insights into the interplay among multiple genetic factors, with applications including assessment of susceptibility to complex diseases, decision making in precision medicine, and gaining insights into disease mechanisms. Since the number of genomic loci assayed by GWAS is extremely large (usually in the order of millions), identification of epistatic loci is a statistically difficult and computationally intensive problem. Even when only pairwise interactions are considered, the size of the search space ranges from hundreds of millions to billions of locus pairs. The large number of statistical tests performed also makes sufficient type I error correction imperative. Consequently, efficient algorithms are required to filter the tests that are performed and evaluate large GWAS data sets in a reasonable amount of computation time. It has been observed that many pairwise tests are redundant due to correlations in their genotype values across samples, known as linkage disequilibrium. However, algorithms that have been developed for efficient identification of epistatic loci do not systematically exploit linkage disequilibrium. Here, we propose a new algorithm for fast epistasis detection based on hierarchical representation of linkage disequilibrium (LinDen). We utilize redundancies in genotype patterns between neighboring loci to generate a hierarchical structure and execute a branch-and-bound search to prioritize loci testing based on approximations of a test statistic for pairs of locus groups. The hierarchical organization of tests performed by LinDen allows for efficient scaling based on the screened loci. We test LinDen comprehensively on three data sets obtained from the Wellcome Trust Case Control Consortium: type 2 diabetes, psoriasis, and hypertension. Our results show that, as compared to other state-of-the-art tools for fast epistasis detection, LinDen drastically reduces the number of tests performed while discovering statistically significant locus pairs. LinDen is implemented in C++ and is available as open source at http://compbio.case.edu/linden/.
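
The scale of the pairwise search space and the resulting multiple-testing burden can be made concrete with a short calculation. This sketch is not part of LinDen; the function names are hypothetical, and Bonferroni correction is assumed only as the simplest example of type I error control.

```python
import math

def pairwise_test_count(n_loci):
    """Number of unordered locus pairs among n_loci loci: n * (n - 1) / 2."""
    return math.comb(n_loci, 2)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate
    at or below alpha when n_tests tests are performed (Bonferroni)."""
    return alpha / n_tests
```

For instance, `pairwise_test_count(30_000)` gives roughly 4.5 × 10^8 pairs and `pairwise_test_count(100_000)` roughly 5 × 10^9, so an exhaustive scan must clear a per-test threshold many orders of magnitude below the nominal level; this is why filtering the tests that are performed matters.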


Subjects
Computational Biology/methods , Epistasis, Genetic , Genome-Wide Association Study/methods , Linkage Disequilibrium , Algorithms , Diabetes Mellitus, Type 2/genetics , Gene Frequency , Genotype , Humans , Hypertension/genetics , Models, Genetic , Phenotype , Psoriasis/genetics , Reproducibility of Results
14.
Bioinformatics ; 33(9): 1354-1361, 2017 05 01.
Article in English | MEDLINE | ID: mdl-28453667

ABSTRACT

Motivation: In recent years, various network proximity measures have been proposed to facilitate the use of biomolecular interaction data in a broad range of applications. These applications include functional annotation, disease gene prioritization, comparative analysis of biological systems and prediction of new interactions. In such applications, a major task is the scoring or ranking of the nodes in the network in terms of their proximity to a given set of 'seed' nodes (e.g. a group of proteins that are identified to be associated with a disease, or are differentially expressed in a certain condition). Many different network proximity measures are utilized for this purpose, and these measures are quite diverse in terms of the benefits they offer. Results: We propose a unifying framework for characterizing network proximity measures for set-based queries. We observe that many existing measures are linear, in that the proximity of a node to a set of nodes can be represented as an aggregation of its proximity to the individual nodes in the set. Based on this observation, we propose methods for processing of set-based proximity queries that take advantage of sparse local proximity information. In addition, we provide an analytical framework for characterizing the distribution of proximity scores based on reference models that accurately capture the characteristics of the seed set (e.g. degree distribution and biological function). The resulting framework facilitates computation of exact figures for the statistical significance of network proximity scores, enabling assessment of the accuracy of Monte Carlo simulation based estimation methods. Availability and Implementation: Implementations of the methods in this paper are available at https://bioengine.case.edu/crosstalker which includes a robust visualization for results viewing. Contact: stm@case.edu or mxk331@case.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
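
The linearity property described above can be checked numerically. The sketch below assumes random walk with restart as the proximity measure and a small dense adjacency matrix with no isolated nodes; the function names and the `restart` parameter are hypothetical, and this is not the authors' implementation.

```python
import numpy as np

def rwr_matrix(adj, restart=0.3):
    """Random-walk-with-restart proximity for every single-node seed.

    Column s of the result is the proximity vector for seed s, i.e. the
    solution of p = (1 - restart) * W @ p + restart * e_s, where W is the
    column-normalized adjacency matrix. Assumes no isolated nodes.
    """
    A = np.asarray(adj, dtype=float)
    W = A / A.sum(axis=0, keepdims=True)        # column-stochastic transitions
    n = A.shape[0]
    return restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * W)

def set_proximity(P, seeds):
    """Proximity to a seed set with uniform restart over the seeds.

    By linearity this equals the mean of the single-seed columns of P.
    """
    return P[:, list(seeds)].mean(axis=1)
```

Because the set-level score is an average of per-seed columns, per-seed proximity vectors can be computed once and reused across many seed sets, which is the kind of reuse the abstract's set-based query processing relies on.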


Subjects
Computational Biology/methods , Computer Simulation , Monte Carlo Method , Humans
15.
Bioinformatics ; 33(21): 3489-3491, 2017 11 01.
Article in English | MEDLINE | ID: mdl-28655153

ABSTRACT

Summary: Computational characterization of differential kinase activity from phosphoproteomics datasets is critical for correctly inferring cellular circuitry and how signaling cascades are altered in drug treatment and/or disease. Kinase-Substrate Enrichment Analysis (KSEA) offers a powerful approach to estimating changes in a kinase's activity based on the collective phosphorylation changes of its identified substrates. However, KSEA has been limited to programmers who are able to implement the algorithms. Thus, to make it accessible to the larger scientific community, we present a web-based application of this method: the KSEA App. Overall, we expect that this tool will offer a quick and user-friendly way of generating kinase activity estimates from high-throughput phosphoproteomics datasets. Availability and Implementation: the KSEA App is a free online tool: casecpb.shinyapps.io/ksea/. The source code is on GitHub: github.com/casecpb/KSEA/. The application is also available as the R package "KSEAapp" on CRAN: CRAN.R-project.org/package=KSEAapp/. Contact: mark.chance@case.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
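
The idea of estimating a kinase's activity change from the collective phosphorylation changes of its substrates can be sketched with a z-score. The formula below is one commonly cited formulation of kinase-substrate enrichment; whether the KSEA App computes exactly this, and whether it uses the sample standard deviation, are assumptions here, and the function name, parameters, and site identifiers are hypothetical.

```python
import math
from statistics import mean, stdev

def ksea_z_scores(site_log2fc, kinase_substrates, min_substrates=1):
    """Score each kinase by how far the mean log2 fold change of its
    substrates deviates from the dataset-wide mean.

    z = (mean_substrates - mean_all) * sqrt(m) / sd_all,
    where m is the number of the kinase's substrates seen in the data.
    """
    values = list(site_log2fc.values())
    if len(values) < 2:
        return {}
    overall_mean = mean(values)
    overall_sd = stdev(values)                   # sample SD (assumption)
    if overall_sd == 0:
        return {}
    scores = {}
    for kinase, sites in kinase_substrates.items():
        observed = [site_log2fc[s] for s in sites if s in site_log2fc]
        if len(observed) < max(1, min_substrates):
            continue                             # too few quantified substrates
        m = len(observed)
        scores[kinase] = (mean(observed) - overall_mean) * math.sqrt(m) / overall_sd
    return scores
```

A positive score suggests increased kinase activity in the treated condition and a negative score suggests decreased activity; raising `min_substrates` trades coverage of kinases for more stable estimates.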

16.
Proteomics ; 17(22)2017 Nov.
Article in English | MEDLINE | ID: mdl-28961369

ABSTRACT

Activation of protein phosphatase 2A (PP2A) is a promising anticancer therapeutic strategy, as this tumor suppressor has the ability to coordinately downregulate multiple pathways involved in the regulation of cellular growth and proliferation. In order to understand the systems-level perturbations mediated by PP2A activation, we carried out mass spectrometry-based phosphoproteomic analysis of two KRAS mutated non-small cell lung cancer (NSCLC) cell lines (A549 and H358) treated with a novel small molecule activator of PP2A (SMAP). Overall, this permitted quantification of differential signaling across over 1600 phosphoproteins and 3000 phosphosites. Kinase activity assessment and pathway enrichment implicate collective downregulation of RAS and cell cycle kinases in the case of both cell lines upon PP2A activation. However, the effects on RAS-related signaling are attenuated for A549 compared to H358, while the effects on cell cycle-related kinases are noticeably more prominent in A549. Network-based analyses and validation experiments confirm these detailed differences in signaling. These studies reveal the power of phosphoproteomics studies, coupled to computational systems biology, to elucidate global patterns of phosphatase activation and understand the variations in response to PP2A activation across genetically similar NSCLC cell lines.


Subjects
Carcinoma, Non-Small-Cell Lung/metabolism , Lung Neoplasms/metabolism , Phosphoproteins/metabolism , Protein Phosphatase 2/metabolism , Proteomics/methods , Small Molecule Libraries/pharmacology , Cell Cycle , Cell Line, Tumor , Cell Proliferation , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , Mass Spectrometry , Phosphorylation , Signal Transduction
17.
PLoS Comput Biol ; 12(11): e1005195, 2016 Nov.
Article in English | MEDLINE | ID: mdl-27835645

ABSTRACT

Susceptibility loci identified by GWAS generally account for a limited fraction of heritability. Predictive models based on identified loci also have modest success in risk assessment and therefore are of limited practical use. Many methods have been developed to overcome these limitations by incorporating prior biological knowledge. However, most of the information utilized by these methods is at the level of genes, limiting analyses to variants that are in or proximate to coding regions. We propose a new method that integrates protein-protein interaction (PPI) as well as expression quantitative trait loci (eQTL) data to identify sets of functionally related loci that are collectively associated with a trait of interest. We call such sets of loci "population covering locus sets" (PoCos). The contributions of the proposed approach are three-fold: 1) We consider all possible genotype models for each locus, thereby enabling identification of combinatorial relationships between multiple loci. 2) We develop a framework for the integration of PPI and eQTL into a heterogeneous network model, enabling efficient identification of functionally related variants that are associated with the disease. 3) We develop a novel method to integrate the genotypes of multiple loci in a PoCo into a representative genotype to be used in risk assessment. We test the proposed framework in the context of risk assessment for seven complex diseases: type 1 diabetes (T1D), type 2 diabetes (T2D), psoriasis (PS), bipolar disorder (BD), coronary artery disease (CAD), hypertension (HT), and multiple sclerosis (MS). Our results show that the proposed method significantly outperforms individual-variant-based risk assessment models as well as the state-of-the-art polygenic score. We also show that incorporation of eQTL data improves the performance of identified PoCos in risk assessment. We also assess the biological relevance of PoCos for three diseases that have similar biological mechanisms and identify novel candidate genes. The resulting software is publicly available at http://compbio.case.edu/pocos/.
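
For orientation, the polygenic score that the abstract above uses as a comparison baseline is, in its simplest textbook form, a weighted sum of risk-allele counts. The sketch below shows only that baseline, not the PoCo method itself. The variant identifiers and weights are hypothetical, and the choice to skip variants missing from the genotype is an assumption of this sketch.

```python
def polygenic_score(genotype, weights):
    """Weighted sum of risk-allele counts (simple polygenic score sketch).

    genotype: dict mapping variant ID -> risk-allele count (0, 1, or 2).
    weights: dict mapping variant ID -> per-allele effect size.
    Variants without a genotype call are skipped (an assumption of this sketch).
    """
    return sum(w * genotype[v] for v, w in weights.items() if v in genotype)


# Hypothetical variants and effect sizes, for illustration only.
example_genotype = {"var1": 2, "var2": 0, "var3": 1}
example_weights = {"var1": 0.2, "var3": -0.1, "var4": 0.5}
example_score = polygenic_score(example_genotype, example_weights)
```

Here var4 has no genotype call and contributes nothing, so the score is 2 × 0.2 + 1 × (-0.1).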


Subjects
Genetic Association Studies/methods , Genetic Markers/genetics , Genetic Predisposition to Disease/epidemiology , Genetic Predisposition to Disease/genetics , Quantitative Trait Loci/genetics , Risk Assessment/methods , Algorithms , Humans , Prevalence , Reproducibility of Results , Sensitivity and Specificity
18.
BMC Bioinformatics ; 17(1): 453, 2016 Nov 10.
Article in English | MEDLINE | ID: mdl-27829360

ABSTRACT

BACKGROUND: Accurately prioritizing candidate disease genes is an important and challenging problem. Various network-based methods have been developed to predict potential disease genes by utilizing the disease similarity network and molecular networks such as protein interaction or gene co-expression networks. Although successful, a common limitation of the existing methods is that they assume all diseases share the same molecular network, so a single generic molecular network is used to predict candidate genes for all diseases. However, different diseases tend to manifest in different tissues, and the molecular networks in different tissues are usually different. An ideal method should be able to incorporate tissue-specific molecular networks for different diseases. RESULTS: In this paper, we develop a robust and flexible method to integrate tissue-specific molecular networks for disease gene prioritization. Our method allows each disease to have its own tissue-specific network(s). We formulate the problem of candidate gene prioritization as an optimization problem based on network propagation. When there are multiple tissue-specific networks available for a disease, our method can automatically infer the relative importance of each tissue-specific network. It is therefore robust to noisy and incomplete network data. To solve the optimization problem, we develop fast algorithms that have linear time complexity in the number of nodes in the molecular networks. We also provide rigorous theoretical foundations for our algorithms in terms of their optimality and convergence properties. Extensive experimental results show that our method can significantly improve the accuracy of candidate gene prioritization compared with the state-of-the-art methods. CONCLUSIONS: In our experiments, we compare our methods with 7 popular network-based disease gene prioritization algorithms on diseases from the Online Mendelian Inheritance in Man (OMIM) database. The experimental results demonstrate that our methods recover true associations more accurately than other methods in terms of AUC values, and the performance differences are significant (with paired t-test p-values less than 0.05). This validates the importance of integrating tissue-specific molecular networks for disease gene prioritization and shows the superiority of our network models and ranking algorithms for this purpose. The source code and datasets are available at http://nijingchao.github.io/CRstar/.
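
Network propagation, on which the abstract above bases its formulation, is often illustrated with a random walk with restart from known disease genes. The sketch below shows that generic technique on a single network. It is not the CRstar algorithm, does not combine multiple tissue-specific networks, and makes no claim about the authors' complexity results. The function name, the restart value, and the toy gene labels are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic random walk with restart on one weighted network (sketch).

    adjacency: dict node -> dict of neighbor -> positive edge weight.
               Every neighbor must also appear as a key; list both
               directions of each edge for an undirected network.
    seeds: non-empty set of nodes known to be associated with the disease.
    Returns a dict node -> steady-state visiting probability.
    """
    nodes = list(adjacency)
    if not seeds or not set(seeds) <= set(nodes):
        raise ValueError("seeds must be a non-empty subset of the network nodes")

    out_weight = {n: sum(adjacency[n].values()) for n in nodes}
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    current = dict(start)

    for _ in range(max_iter):
        # Restart step: a fraction of the walk jumps back to the seeds.
        updated = {n: restart * start[n] for n in nodes}
        for u in nodes:
            if out_weight[u] == 0:
                # Isolated node: its probability mass stays in place.
                updated[u] += (1 - restart) * current[u]
                continue
            for v, w in adjacency[u].items():
                updated[v] += (1 - restart) * current[u] * w / out_weight[u]
        change = sum(abs(updated[n] - current[n]) for n in nodes)
        current = updated
        if change < tol:
            break
    return current


# Hypothetical path network g1 - g2 - g3 - g4 with g1 as the only seed.
example_network = {
    "g1": {"g2": 1.0},
    "g2": {"g1": 1.0, "g3": 1.0},
    "g3": {"g2": 1.0, "g4": 1.0},
    "g4": {"g3": 1.0},
}
example_ranking = random_walk_with_restart(example_network, {"g1"})
```

On this path network the scores decrease with distance from the seed, which is the behavior that lets propagation rank candidate genes by network proximity to known disease genes.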


Subjects
Disease/genetics , Gene Regulatory Networks , Models, Genetic , Organ Specificity/genetics , Algorithms , Area Under Curve , Computational Biology/methods , Databases, Genetic , Humans , ROC Curve
19.
PLoS Comput Biol ; 11(12): e1004595, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26683094

ABSTRACT

Development of high-throughput monitoring technologies enables interrogation of cancer samples at various levels of cellular activity. Capitalizing on these developments, various public efforts such as The Cancer Genome Atlas (TCGA) generate disparate omic data for large patient cohorts. As demonstrated by recent studies, these heterogeneous data sources provide the opportunity to gain insights into the molecular changes that drive cancer pathogenesis and progression. However, these insights are limited by the vast search space and the resulting low statistical power to make new discoveries. In this paper, we propose methods for integrating disparate omic data using molecular interaction networks, with a view to gaining mechanistic insights into the relationship between molecular changes at different levels of cellular activity. Namely, we hypothesize that genes that play a role in cancer development and progression may be implicated by neither frequent mutation nor differential expression, and that network-based integration of mutation and differential expression data can reveal these "silent players". For this purpose, we utilize network-propagation algorithms to simulate the information flow in the cell at a sample-specific resolution. We then use the propagated mutation and expression signals to identify genes that are not necessarily mutated or differentially expressed but have an essential role in tumor development and patient outcome. We test the proposed method on breast cancer and glioblastoma multiforme data obtained from TCGA. Our results show that the proposed method can identify important proteins that are not readily revealed by molecular data, providing insights beyond what can be gleaned by analyzing different types of molecular data in isolation.


Subjects
Gene Expression Profiling/methods , Genes, Neoplasm/genetics , Genomics/methods , Neoplasm Proteins/genetics , Neoplasms/genetics , Silent Mutation/genetics , Algorithms , Chromosome Mapping/methods , Data Mining/methods , Databases, Genetic , Genetic Association Studies/methods , Genetic Markers/genetics , Humans , Signal Transduction/genetics
20.
J Biomed Inform ; 58: 104-113, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26453823

ABSTRACT

PURPOSE: To date, the standard nosology and prognostic schemes for myeloid neoplasms have been based on morphologic and cytogenetic criteria. We sought to test the hypothesis that a comprehensive, unbiased analysis of somatic mutations may allow for an improved classification of these diseases to predict outcome (overall survival). EXPERIMENTAL DESIGN: We performed whole-exome sequencing (WES) of 274 myeloid neoplasms, including myelodysplastic syndrome (MDS, N=75), myelodysplastic/myeloproliferative neoplasia (MDS/MPN, N=33), and acute myeloid leukemia (AML, N=22), augmenting the resulting mutational data with public WES results from AML (N=144). We fit random survival forests (RSFs) to the patient survival and clinical/cytogenetic data, with and without gene mutation information, to build prognostic classifiers. A targeted sequencing assay was used to sequence predictor genes in an independent cohort of 507 patients, whose accompanying data were used to evaluate performance of the risk classifiers. RESULTS: We show that gene mutations modify the impact of standard clinical variables on patient outcome, and therefore their incorporation hones the accuracy of prediction. The mutation-based classification scheme robustly predicted patient outcome in the validation set (log-rank P = 6.77 × 10^-21; poor prognosis vs. good prognosis categories HR 10.4, 95% CI 3.21-33.6). The RSF-based approach also compares favorably with recently published efforts to incorporate mutational information for MDS prognosis. CONCLUSION: The results presented here support the inclusion of mutational information in prognostic classification of myeloid malignancies. Our classification scheme is implemented in a publicly available web-based tool (http://myeloid-risk.case.edu/).


Subjects
Bone Marrow Neoplasms/genetics , Exome , Bone Marrow Neoplasms/classification , Bone Marrow Neoplasms/physiopathology , Cohort Studies , Prognosis