Results 1 - 20 of 48
1.
PLoS Pathog ; 19(7): e1011492, 2023 07.
Article in English | MEDLINE | ID: mdl-37459363

ABSTRACT

HIV-1 spreads efficiently through direct cell-to-cell transmission at virological synapses (VSs) formed by interactions between HIV-1 envelope proteins (Env) on the surface of infected cells and CD4 receptors on uninfected target cells. Env-CD4 interactions bring the infected and uninfected cellular membranes into close proximity and induce transport of viral and cellular factors to the VS for efficient virion assembly and HIV-1 transmission. Using novel, cell-specific stable isotope labeling and quantitative mass spectrometric proteomics, we identified extensive changes in the levels and phosphorylation states of proteins in HIV-1 infected producer cells upon mixing with CD4+ target cells under conditions inducing VS formation. These coculture-induced alterations involved multiple cellular pathways including transcription, TCR signaling and, unexpectedly, cell cycle regulation, and were dominated by Env-dependent responses. We confirmed the proteomic results using inhibitors targeting regulatory kinases and phosphatases in selected pathways identified by our proteomic analysis. Strikingly, inhibiting the key mitotic regulator Aurora kinase B (AURKB) in HIV-1 infected cells significantly increased HIV activity in cell-to-cell fusion and transmission but had little effect on cell-free infection. Consistent with this, we found that AURKB regulates the fusogenic activity of HIV-1 Env. In the Jurkat T cell line and primary T cells, HIV-1 Env:CD4 interaction also dramatically induced cell cycle-independent AURKB relocalization to the centromere, and this signaling required the long (150 aa) cytoplasmic C-terminal domain (CTD) of Env. These results imply that cytoplasmic/plasma membrane AURKB restricts HIV-1 envelope fusion, and that this restriction is overcome by Env CTD-induced AURKB relocalization. 
Taken together, our data reveal a new signaling pathway regulating HIV-1 cell-to-cell transmission and potential new avenues for therapeutic intervention through targeting the Env CTD and AURKB activity.


Subjects
HIV Infections , HIV-1 , Humans , HIV-1/physiology , Aurora Kinase B/metabolism , Proteomics , CD4-Positive T-Lymphocytes/metabolism , CD4 Antigens/metabolism , HIV Infections/metabolism
2.
PLoS Biol ; 20(12): e3001934, 2022 12.
Article in English | MEDLINE | ID: mdl-36542656

ABSTRACT

Viruses must balance their reliance on host cell machinery for replication while avoiding host defense. Influenza A viruses are zoonotic agents that frequently switch hosts, causing localized outbreaks with the potential for larger pandemics. The host range of influenza virus is limited by the need for successful interactions between the virus and cellular partners. Here we used immunocompetitive capture-mass spectrometry to identify cellular proteins that interact with human- and avian-style viral polymerases. We focused on the proviral activity of heterogenous nuclear ribonuclear protein U-like 1 (hnRNP UL1) and the antiviral activity of mitochondrial enoyl CoA-reductase (MECR). MECR is localized to mitochondria where it functions in mitochondrial fatty acid synthesis (mtFAS). While a small fraction of the polymerase subunit PB2 localizes to the mitochondria, PB2 did not interact with full-length MECR. By contrast, a minor splice variant produces cytoplasmic MECR (cMECR). Ectopic expression of cMECR shows that it binds the viral polymerase and suppresses viral replication by blocking assembly of viral ribonucleoprotein complexes (RNPs). MECR ablation through genome editing or drug treatment is detrimental for cell health, creating a generic block to virus replication. Using the yeast homolog Etr1 to supply the metabolic functions of MECR in MECR-null cells, we showed that specific antiviral activity is independent of mtFAS and is reconstituted by expressing cMECR. Thus, we propose a strategy where alternative splicing produces a cryptic antiviral protein that is embedded within a key metabolic enzyme.


Subjects
Fatty Acid Desaturases , Influenza A virus , Humans , Fatty Acid Desaturases/metabolism , Alternative Splicing/genetics , Mitochondria/metabolism , Influenza A virus/genetics , Protein Isoforms/metabolism , Virus Replication
3.
Proc Natl Acad Sci U S A ; 118(48), 2021 11 30.
Article in English | MEDLINE | ID: mdl-34815338

ABSTRACT

The mapping from protein sequence to function is highly complex, making it challenging to predict how sequence changes will affect a protein's behavior and properties. We present a supervised deep learning framework to learn the sequence-function mapping from deep mutational scanning data and make predictions for new, uncharacterized sequence variants. We test multiple neural network architectures, including a graph convolutional network that incorporates protein structure, to explore how a network's internal representation affects its ability to learn the sequence-function mapping. Our supervised learning approach displays superior performance over physics-based and unsupervised prediction methods. We find that networks that capture nonlinear interactions and share parameters across sequence positions are important for learning the relationship between sequence and function. Further analysis of the trained models reveals the networks' ability to learn biologically meaningful information about protein structure and mechanism. Finally, we demonstrate the models' ability to navigate sequence space and design new proteins beyond the training set. We applied the protein G B1 domain (GB1) models to design a sequence that binds to immunoglobulin G with substantially higher affinity than wild-type GB1.


Subjects
Amino Acid Sequence/genetics , Sequence Analysis, Protein/methods , Algorithms , Amino Acid Sequence/physiology , Biochemical Phenomena , Deep Learning , Machine Learning , Mutation , Neural Networks, Computer , Proteins/metabolism , Structure-Activity Relationship
4.
Bioinformatics ; 38(Suppl 1): i10-i18, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758797

ABSTRACT

SUMMARY: The increasing prevalence and importance of machine learning in biological research have created a need for machine learning training resources tailored towards biological researchers. However, existing resources are often inaccessible, infeasible or inappropriate for biologists because they require significant computational and mathematical knowledge, demand an unrealistic time-investment or teach skills primarily for computational researchers. We created the Machine Learning for Biologists (ML4Bio) workshop, a short, intensive workshop that empowers biological researchers to comprehend machine learning applications and pursue machine learning collaborations in their own research. The ML4Bio workshop focuses on classification and was designed around three principles: (i) emphasizing preparedness over fluency or expertise, (ii) necessitating minimal coding and mathematical background and (iii) requiring low time investment. It incorporates active learning methods and custom open-source software that allows participants to explore machine learning workflows. After multiple sessions to improve workshop design, we performed a study on three workshop sessions. Despite some confusion around identifying subtle methodological flaws in machine learning workflows, participants generally reported that the workshop met their goals, provided them with valuable skills and knowledge and greatly increased their beliefs that they could engage in research that uses machine learning. ML4Bio is an educational tool for biological researchers, and its creation and evaluation provide valuable insight into tailoring educational resources for active researchers in different domains. AVAILABILITY AND IMPLEMENTATION: Workshop materials are available at https://github.com/carpentries-incubator/ml4bio-workshop and the ml4bio software is available at https://github.com/gitter-lab/ml4bio. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Machine Learning , Software , Humans , Workflow
5.
Biometrics ; 79(2): 642-654, 2023 06.
Article in English | MEDLINE | ID: mdl-35165892

ABSTRACT

An important experimental design problem in early-stage drug discovery is how to prioritize available compounds for testing when very little is known about the target protein. Informer-based ranking (IBR) methods address the prioritization problem when the compounds have provided bioactivity data on other potentially relevant targets. An IBR method selects an informer set of compounds, and then prioritizes the remaining compounds on the basis of new bioactivity experiments performed with the informer set on the target. We formalize the problem as a two-stage decision problem and introduce the Bayes Optimal Informer SEt (BOISE) method for its solution. BOISE leverages a flexible model of the initial bioactivity data, a relevant loss function, and effective computational schemes to resolve the two-step design problem. We evaluate BOISE and compare it to other IBR strategies in two retrospective studies, one on protein-kinase inhibition and the other on anticancer drug sensitivity. In both empirical settings BOISE exhibits better predictive performance than available methods. It also behaves well with missing data, where methods that use matrix completion show worse predictive performance.


Subjects
Drug Discovery , Proteins , Bayes Theorem , Retrospective Studies , Drug Discovery/methods
6.
J Chem Inf Model ; 63(17): 5513-5528, 2023 09 11.
Article in English | MEDLINE | ID: mdl-37625010

ABSTRACT

Traditional small-molecule drug discovery is a time-consuming and costly endeavor. High-throughput chemical screening can only assess a tiny fraction of drug-like chemical space. The strong predictive power of modern machine-learning methods for virtual chemical screening enables training models on known active and inactive compounds and extrapolating to much larger chemical libraries. However, there has been limited experimental validation of these methods in practical applications on large commercially available or synthesize-on-demand chemical libraries. Through a prospective evaluation with the bacterial protein-protein interaction PriA-SSB, we demonstrate that ligand-based virtual screening can identify many active compounds in large commercial libraries. We use cross-validation to compare different types of supervised learning models and select a random forest (RF) classifier as the best model for this target. When predicting the activity of more than 8 million compounds from Aldrich Market Select, the RF substantially outperforms a naïve baseline based on chemical structure similarity. 48% of the RF's 701 selected compounds are active. The RF model easily scales to score one billion compounds from the synthesize-on-demand Enamine REAL database. We tested 68 chemically diverse top predictions from Enamine REAL and observed 31 hits (46%), including one with an IC50 value of 1.3 µM.
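The abstract does not say how the structure-similarity baseline was computed. As a rough sketch only, assuming binary fingerprints compared with the Tanimoto coefficient (a common choice, but an assumption here), such a baseline could score each library compound by its closest known active. The compound names and bit sets are made up for illustration.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_baseline(candidates: dict, known_actives: list) -> list:
    """Rank candidates by their highest similarity to any known active."""
    scores = {
        name: max(tanimoto(fp, active) for active in known_actives)
        for name, fp in candidates.items()
    }
    # Highest score first; ties keep the input order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: each set holds the indices of the bits that are on.
known_actives = [frozenset({1, 4, 7, 9}), frozenset({2, 4, 8})]
candidates = {
    "cmpd_A": frozenset({1, 4, 7}),
    "cmpd_B": frozenset({3, 5, 6}),
    "cmpd_C": frozenset({2, 4, 10}),
}

ranking = similarity_baseline(candidates, known_actives)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A supervised model such as the random forest differs from this baseline in that it also learns from the known inactive compounds, which a nearest-active score ignores.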


Subjects
High-Throughput Screening Assays , Small Molecule Libraries , Databases, Factual , Drug Discovery , Supervised Machine Learning
7.
Bioinformatics ; 36(Suppl_2): i822-i830, 2020 12 30.
Article in English | MEDLINE | ID: mdl-33381832

ABSTRACT

MOTIVATION: Cells regulate themselves via dizzyingly complex biochemical processes called signaling pathways. These are usually depicted as a network, where nodes represent proteins and edges indicate their influence on each other. In order to understand diseases and therapies at the cellular level, it is crucial to have an accurate understanding of the signaling pathways at work. Since signaling pathways can be modified by disease, the ability to infer signaling pathways from condition- or patient-specific data is highly valuable. A variety of techniques exist for inferring signaling pathways. We build on past works that formulate signaling pathway inference as a Dynamic Bayesian Network structure estimation problem on phosphoproteomic time course data. We take a Bayesian approach, using Markov Chain Monte Carlo to estimate a posterior distribution over possible Dynamic Bayesian Network structures. Our primary contributions are (i) a novel proposal distribution that efficiently samples sparse graphs and (ii) the relaxation of common restrictive modeling assumptions. RESULTS: We implement our method, named Sparse Signaling Pathway Sampling, in Julia using the Gen probabilistic programming language. Probabilistic programming is a powerful methodology for building statistical models. The resulting code is modular, extensible and legible. The Gen language, in particular, allows us to customize our inference procedure for biological graphs and ensure efficient sampling. We evaluate our algorithm on simulated data and the HPN-DREAM pathway reconstruction challenge, comparing our performance against a variety of baseline methods. Our results demonstrate the vast potential for probabilistic programming, and Gen specifically, for biological network inference. AVAILABILITY AND IMPLEMENTATION: Find the full codebase at https://github.com/gitter-lab/ssps. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
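To make the idea of sampling graph structures with Markov Chain Monte Carlo concrete, here is a toy Metropolis-Hastings sketch in Python. It is not the published method: the authors' implementation is in Julia with Gen, and their proposal distribution is tailored to sparse graphs. The sketch assumes a uniform single-edge toggle proposal and a made-up score (per-edge "support" minus a sparsity penalty) in place of a Dynamic Bayesian Network likelihood. All names are hypothetical.

```python
import math
import random


def log_score(edges, support, sparsity=1.0):
    """Toy log-posterior: per-edge data support minus a penalty for every edge."""
    return sum(support.get(e, -1.0) for e in edges) - sparsity * len(edges)


def sample_graphs(nodes, support, n_steps=5000, seed=0):
    """Estimate posterior edge frequencies by toggling one edge per step."""
    rng = random.Random(seed)
    possible = [(a, b) for a in nodes for b in nodes if a != b]
    current = set()
    current_score = log_score(current, support)
    counts = {e: 0 for e in possible}
    for _ in range(n_steps):
        edge = rng.choice(possible)       # symmetric proposal
        proposal = current ^ {edge}       # add the edge if absent, remove if present
        new_score = log_score(proposal, support)
        accept_prob = math.exp(min(0.0, new_score - current_score))
        if rng.random() < accept_prob:
            current, current_score = proposal, new_score
        for e in current:
            counts[e] += 1
    return {e: c / n_steps for e, c in counts.items()}


# Hypothetical support values: two edges are favored by the (made-up) data.
support = {("A", "B"): 4.0, ("B", "C"): 4.0}
freq = sample_graphs(["A", "B", "C"], support)
for edge, f in sorted(freq.items(), key=lambda item: -item[1]):
    print(edge, round(f, 2))
```

Edges with positive net support end up present in most sampled graphs, while the sparsity penalty keeps unsupported edges rare. A uniform toggle wastes most proposals when the graph is large and sparse, which is the inefficiency a purpose-built proposal distribution is meant to address.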


Subjects
Algorithms , Signal Transduction , Bayes Theorem , Humans , Markov Chains , Monte Carlo Method
8.
BMC Bioinformatics ; 21(1): 21, 2020 01 16.
Article in English | MEDLINE | ID: mdl-31948388

ABSTRACT

BACKGROUND: The similarity or distance measure used for clustering can generate intuitive and interpretable clusters when it is tailored to the unique characteristics of the data. In time series datasets generated with high-throughput biological assays, measurements such as gene expression levels or protein phosphorylation intensities are collected sequentially over time, and the similarity score should capture this special temporal structure. RESULTS: We propose a clustering similarity measure called Lag Penalized Weighted Correlation (LPWC) to group pairs of time series that exhibit closely-related behaviors over time, even if the timing is not perfectly synchronized. LPWC aligns time series profiles to identify common temporal patterns. It down-weights aligned profiles based on the length of the temporal lags that are introduced. We demonstrate the advantages of LPWC versus existing time series and general clustering algorithms. In a simulated dataset based on the biologically-motivated impulse model, LPWC is the only method to recover the true clusters for almost all simulated genes. LPWC also identifies clusters with distinct temporal patterns in our yeast osmotic stress response and axolotl limb regeneration case studies. CONCLUSIONS: LPWC achieves both of its time series clustering goals. It groups time series with correlated changes over time, even if those patterns occur earlier or later in some of the time series. In addition, it refrains from introducing large shifts in time when searching for temporal patterns by applying a lag penalty. The LPWC R package is available at https://github.com/gitter-lab/LPWC and on CRAN under an MIT license.
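The core idea, aligning two profiles and discounting the score by the size of the shift, can be illustrated with a simplified pairwise sketch. This is not the LPWC package: it assumes evenly spaced time points, plain (unweighted) Pearson correlation, and a Gaussian-style penalty exp(-penalty * lag^2). The function names and the penalty value are illustrative choices.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 for a constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def lag_penalized_correlation(x, y, max_lag=2, penalty=0.5):
    """Best penalized correlation over integer lags of y relative to x."""
    n = len(x)
    best_score, best_lag = -math.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[: n - lag], y[lag:]      # y is delayed by `lag` steps
        else:
            xs, ys = x[-lag:], y[: n + lag]     # y leads by `-lag` steps
        if len(xs) < 3:
            continue                            # too few points to correlate
        score = math.exp(-penalty * lag ** 2) * pearson(xs, ys)
        if score > best_score:
            best_score, best_lag = score, lag
    return best_score, best_lag


# Two impulse-like profiles; the second is the first delayed by one time point.
x = [0, 1, 4, 1, 0, 0]
y = [0, 0, 1, 4, 1, 0]
best_score, best_lag = lag_penalized_correlation(x, y)
print(best_lag, round(best_score, 3))
```

Without the penalty, the shifted alignment would score a perfect correlation. With it, a one-step shift is still preferred over no shift, but larger shifts would need a much stronger correlation to win.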


Subjects
Cluster Analysis , Algorithms , Ambystoma mexicanum , Animals , Extremities/physiology , Gene Expression Profiling , Osmotic Pressure , Phosphorylation , Regeneration , Yeasts
9.
PLoS Comput Biol ; 15(6): e1007128, 2019 06.
Article in English | MEDLINE | ID: mdl-31233491

ABSTRACT

Open, collaborative research is a powerful paradigm that can immensely strengthen the scientific process by integrating broad and diverse expertise. However, traditional research and multi-author writing processes break down at scale. We present new software named Manubot, available at https://manubot.org, to address the challenges of open scholarly writing. Manubot adopts the contribution workflow used by many large-scale open source software projects to enable collaborative authoring of scholarly manuscripts. With Manubot, manuscripts are written in Markdown and stored in a Git repository to precisely track changes over time. By hosting manuscript repositories publicly, such as on GitHub, multiple authors can simultaneously propose and review changes. A cloud service automatically evaluates proposed changes to catch errors. Publication with Manubot is continuous: When a manuscript's source changes, the rendered outputs are rebuilt and republished to a web page. Manubot automates bibliographic tasks by implementing citation by identifier, where users cite persistent identifiers (e.g. DOIs, PubMed IDs, ISBNs, URLs), whose metadata is then retrieved and converted to a user-specified style. Manubot modernizes publishing to align with the ideals of open science by making it transparent, reproducible, immediate, versioned, collaborative, and free of charge.
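Citation by identifier means the manuscript text carries keys such as `@doi:...` or `@pmid:...` rather than hand-written references. The sketch below shows one way such keys could be pulled out of Markdown before a metadata lookup. It is an illustration, not Manubot's own code: the regular expression, the prefix list and the function name are assumptions.

```python
import re

# Matches keys such as @doi:10.1371/journal.pcbi.1007128 or @pmid:31233491.
# The prefix list is an illustrative assumption based on the identifier
# types named in the abstract (DOIs, PubMed IDs, ISBNs, URLs).
CITATION_KEY = re.compile(r"@(doi|pmid|isbn|url):([^\s\];,]+)")


def extract_citation_keys(markdown: str) -> list[tuple[str, str]]:
    """Return unique (prefix, identifier) pairs in order of first appearance."""
    seen = []
    for prefix, ident in CITATION_KEY.findall(markdown):
        key = (prefix, ident.rstrip("."))
        if key not in seen:
            seen.append(key)
    return seen


text = (
    "Manubot supports open writing [@doi:10.1371/journal.pcbi.1007128]. "
    "The same work can be cited by PubMed ID "
    "[@pmid:31233491; @doi:10.1371/journal.pcbi.1007128]."
)
keys = extract_citation_keys(text)
for prefix, ident in keys:
    print(prefix, ident)
```

Each unique key would then be resolved to bibliographic metadata once, however many times it is cited in the text.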


Subjects
Publishing , Software , Writing , Humans , Manuscripts, Medical as Topic
10.
PLoS Comput Biol ; 15(8): e1006813, 2019 08.
Article in English | MEDLINE | ID: mdl-31381559

ABSTRACT

Prediction of compounds that are active against a desired biological target is a common step in drug discovery efforts. Virtual screening methods seek some active-enriched fraction of a library for experimental testing. Where data are too scarce to train supervised learning models for compound prioritization, initial screening must provide the necessary data. Commonly, such an initial library is selected on the basis of chemical diversity by some pseudo-random process (for example, the first few plates of a larger library) or by selecting an entire smaller library. These approaches may not produce a sufficient number or diversity of actives. An alternative approach is to select an informer set of screening compounds on the basis of chemogenomic information from previous testing of compounds against a large number of targets. We compare different ways of using chemogenomic data to choose a small informer set of compounds based on previously measured bioactivity data. We develop this Informer-Based-Ranking (IBR) approach using the Published Kinase Inhibitor Sets (PKIS) as the chemogenomic data to select the informer sets. We test the informer compounds on a target that is not part of the chemogenomic data, then predict the activity of the remaining compounds based on the experimental informer data and the chemogenomic data. Through new chemical screening experiments, we demonstrate the utility of IBR strategies in a prospective test on three kinase targets not included in the PKIS.
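The two stages described above (pick informers from prior data, then rank the rest using the informers' results on the new target) can be sketched with a deliberately simple strategy. The abstract does not name the strategies that were compared. Both choices below, variance-based informer selection and inverse-distance weighting of known targets, are illustrative assumptions, and the activity values are made up.

```python
from statistics import pvariance


def choose_informers(activity, n_informers):
    """Pick the compounds whose activity varies most across known targets."""
    rows = list(activity.values())
    n_compounds = len(rows[0])
    variances = [pvariance([row[c] for row in rows]) for c in range(n_compounds)]
    order = sorted(range(n_compounds), key=lambda c: variances[c], reverse=True)
    return order[:n_informers]


def rank_remaining(activity, informers, informer_results):
    """Rank non-informer compounds for the new target.

    Known targets that resemble the new target on the informer set get
    more weight when predicting the untested compounds.
    """
    weights = {}
    for target, row in activity.items():
        dist = sum((row[c] - informer_results[c]) ** 2 for c in informers)
        weights[target] = 1.0 / (1.0 + dist)
    total = sum(weights.values())
    n_compounds = len(next(iter(activity.values())))
    remaining = [c for c in range(n_compounds) if c not in informers]
    predicted = {
        c: sum(weights[t] * activity[t][c] for t in activity) / total
        for c in remaining
    }
    return sorted(remaining, key=lambda c: predicted[c], reverse=True)


# Hypothetical chemogenomic matrix: known target -> activity of compounds 0..4.
activity = {
    "kinase_1": [0.9, 0.1, 0.8, 0.2, 0.1],
    "kinase_2": [0.1, 0.1, 0.2, 0.9, 0.8],
    "kinase_3": [0.8, 0.2, 0.9, 0.1, 0.2],
}
informers = choose_informers(activity, n_informers=2)

# Hypothetical new measurements of the informers on the new target.
informer_results = {0: 0.85, 3: 0.15}
ranking = rank_remaining(activity, informers, informer_results)
print(sorted(informers), ranking)
```

In this toy case the new target behaves like kinase_1 and kinase_3 on the informers, so the compound those two targets share as an active is ranked first.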


Subjects
Drug Discovery/methods , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology , Cheminformatics/methods , Cheminformatics/statistics & numerical data , Computational Biology , Computer Simulation , Databases, Chemical , Databases, Pharmaceutical , Drug Discovery/statistics & numerical data , Drug Evaluation, Preclinical/methods , Drug Evaluation, Preclinical/statistics & numerical data , High-Throughput Screening Assays/methods , High-Throughput Screening Assays/statistics & numerical data , Humans , Prospective Studies , Protein Serine-Threonine Kinases/antagonists & inhibitors , Protozoan Proteins , Structure-Activity Relationship , User-Computer Interface , Viral Proteins/antagonists & inhibitors
11.
PLoS Pathog ; 13(3): e1006256, 2017 03.
Article in English | MEDLINE | ID: mdl-28257516

ABSTRACT

Kaposi's Sarcoma associated Herpesvirus (KSHV), an oncogenic, human gamma-herpesvirus, is the etiological agent of Kaposi's Sarcoma, the most common tumor of AIDS patients worldwide. KSHV is predominantly latent in the main KS tumor cell, the spindle cell, a cell of endothelial origin. KSHV modulates numerous host cell-signaling pathways to activate endothelial cells, including major metabolic pathways involved in lipid metabolism. To identify the underlying cellular mechanisms of KSHV alteration of host signaling and endothelial cell activation, we identified changes in the host proteome, phosphoproteome and transcriptome landscape following KSHV infection of endothelial cells. A Steiner forest algorithm was used to integrate the global data sets and, together with transcriptome based predicted transcription factor activity, cellular networks altered by latent KSHV were predicted. Several interesting pathways were identified, including peroxisome biogenesis. To validate the predictions, we showed that KSHV latent infection increases the number of peroxisomes per cell. Additionally, proteins involved in peroxisomal lipid metabolism of very long chain fatty acids, including ABCD3 and ACOX1, are required for the survival of latently infected cells. In summary, novel cellular pathways altered during herpesvirus latency that could not be predicted by a single systems biology platform were identified by integrated proteomics and transcriptomics data analysis and, when correlated with our metabolomics data, revealed that peroxisome lipid metabolism is essential for KSHV latent infection of endothelial cells.


Subjects
Herpesvirus 8, Human/metabolism , Host-Parasite Interactions/physiology , Lipid Metabolism/physiology , Peroxisomes/metabolism , Virus Activation/physiology , Virus Latency/physiology , Cell Separation , Cells, Cultured , Endothelial Cells/virology , Flow Cytometry , Humans , Mass Spectrometry , Microscopy, Confocal , RNA, Small Interfering , Sarcoma, Kaposi/virology , Systems Biology , Transfection
12.
PLoS Comput Biol ; 14(5): e1006088, 2018 05.
Article in English | MEDLINE | ID: mdl-29738528

ABSTRACT

Cells respond to stressful conditions by coordinating a complex, multi-faceted response that spans many levels of physiology. Much of the response is coordinated by changes in protein phosphorylation. Although the regulators of transcriptome changes during stress are well characterized in Saccharomyces cerevisiae, the upstream regulatory network controlling protein phosphorylation is less well dissected. Here, we developed a computational approach to infer the signaling network that regulates phosphorylation changes in response to salt stress. We developed an approach to link predicted regulators to groups of likely co-regulated phospho-peptides responding to stress, thereby creating new edges in a background protein interaction network. We then use integer linear programming (ILP) to integrate wild type and mutant phospho-proteomic data and predict the network controlling stress-activated phospho-proteomic changes. The network we inferred predicted new regulatory connections between stress-activated and growth-regulating pathways and suggested mechanisms coordinating metabolism, cell-cycle progression, and growth during stress. We confirmed several network predictions with co-immunoprecipitations coupled with mass-spectrometry protein identification and mutant phospho-proteomic analysis. Results show that the cAMP-phosphodiesterase Pde2 physically interacts with many stress-regulated transcription factors targeted by PKA, and that reduced phosphorylation of those factors during stress requires the Rck2 kinase that we show physically interacts with Pde2. Together, our work shows how a high-quality computational network model can facilitate discovery of new pathway interactions during osmotic stress.


Subjects
Proteomics/methods , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/metabolism , Sodium Chloride/chemistry , Cell Cycle , Computational Biology , Computer Simulation , Cyclic AMP-Dependent Protein Kinases/metabolism , Immunoprecipitation , Mass Spectrometry , Models, Biological , Osmotic Pressure , Phosphorylation , Protein Interaction Mapping , Protein Serine-Threonine Kinases/metabolism , Proteome , Signal Transduction , Transcription Factors/metabolism
13.
J Chem Inf Model ; 59(10): 4438-4449, 2019 10 28.
Article in English | MEDLINE | ID: mdl-31518132

ABSTRACT

Empirical testing of chemicals for drug efficacy costs many billions of dollars every year. The ability to predict the action of molecules in silico would greatly increase the speed and decrease the cost of prioritizing drug leads. Here, we asked whether drug function, defined as MeSH "therapeutic use" classes, can be predicted from only a chemical structure. We evaluated two chemical-structure-derived drug classification methods, chemical images with convolutional neural networks and molecular fingerprints with random forests, both of which outperformed previous predictions that used drug-induced transcriptomic changes as chemical representations. This suggests that the structure of a chemical contains at least as much information about its therapeutic use as the transcriptional cellular response to that chemical. Furthermore, because training data based on chemical structure is not limited to a small set of molecules for which transcriptomic measurements are available, our strategy can leverage more training data to significantly improve predictive accuracy to 83-88%. Finally, we explore use of these models for prediction of side effects and drug-repurposing opportunities and demonstrate the effectiveness of this modeling strategy for multilabel classification.


Subjects
Drug Discovery/methods , Computer Simulation , Drug Repositioning , Molecular Structure , Neural Networks, Computer , Structure-Activity Relationship
14.
J Chem Inf Model ; 59(1): 282-293, 2019 01 28.
Article in English | MEDLINE | ID: mdl-30500183

ABSTRACT

Virtual (computational) high-throughput screening provides a strategy for prioritizing compounds for experimental screens, but the choice of virtual screening algorithm depends on the data set and evaluation strategy. We consider a wide range of ligand-based machine learning and docking-based approaches for virtual screening on two protein-protein interactions, PriA-SSB and RMI-FANCM, and present a strategy for choosing which algorithm is best for prospective compound prioritization. Our workflow identifies a random forest as the best algorithm for these targets over more sophisticated neural network-based models. The top 250 predictions from our selected random forest recover 37 of the 54 active compounds from a library of 22,434 new molecules assayed on PriA-SSB. We show that virtual screening methods that perform well on public data sets and synthetic benchmarks, like multi-task neural networks, may not always translate to prospective screening performance on a specific assay of interest.
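The counts reported above can be turned into standard screening metrics with a few lines of arithmetic. The enrichment factor is not quoted in the abstract; it is derived here from the stated numbers (37 actives in the top 250, 54 actives in a library of 22,434). The function name is hypothetical.

```python
def screening_metrics(hits_in_selection, selection_size, total_actives, library_size):
    """Summarize how well a ranked selection recovers the known actives."""
    hit_rate = hits_in_selection / selection_size   # fraction of picks that are active
    base_rate = total_actives / library_size        # fraction active under random picking
    return {
        "recall": hits_in_selection / total_actives,
        "hit_rate": hit_rate,
        "base_rate": base_rate,
        "enrichment_factor": hit_rate / base_rate,
    }


metrics = screening_metrics(
    hits_in_selection=37, selection_size=250, total_actives=54, library_size=22434
)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The selection covers about 1% of the library yet contains roughly two thirds of the actives, which corresponds to a hit rate about 60 times higher than random selection would give.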


Subjects
Drug Evaluation, Preclinical/methods , Machine Learning , Molecular Docking Simulation , Algorithms , Protein Conformation , Proteins/chemistry , Proteins/metabolism , User-Computer Interface
16.
Nat Rev Genet ; 13(8): 552-64, 2012 Jul 18.
Article in English | MEDLINE | ID: mdl-22805708

ABSTRACT

Biological processes are often dynamic, thus researchers must monitor their activity at multiple time points. The most abundant source of information regarding such dynamic activity is time-series gene expression data. These data are used to identify the complete set of activated genes in a biological process, to infer their rates of change, their order and their causal effects and to model dynamic systems in the cell. In this Review we discuss the basic patterns that have been observed in time-series experiments, how these patterns are combined to form expression programs, and the computational analysis, visualization and integration of these data to infer models of dynamic biological systems.


Subjects
Gene Expression Profiling , Gene Expression , Models, Genetic , Animals , Data Interpretation, Statistical , Epigenesis, Genetic/genetics , Humans , Mice , Signal Transduction
17.
PLoS Comput Biol ; 12(4): e1004879, 2016 Apr.
Article in English | MEDLINE | ID: mdl-27096930

ABSTRACT

High-throughput, 'omic' methods provide sensitive measures of biological responses to perturbations. However, inherent biases in high-throughput assays make it difficult to interpret experiments in which more than one type of data is collected. In this work, we introduce Omics Integrator, a software package that takes a variety of 'omic' data as input and identifies putative underlying molecular pathways. The approach applies advanced network optimization algorithms to a network of thousands of molecular interactions to find high-confidence, interpretable subnetworks that best explain the data. These subnetworks connect changes observed in gene expression, protein abundance or other global assays to proteins that may not have been measured in the screens due to inherent bias or noise in measurement. This approach reveals unannotated molecular pathways that would not be detectable by searching pathway databases. Omics Integrator also provides an elegant framework to incorporate not only positive data, but also negative evidence. Incorporating negative evidence allows Omics Integrator to avoid unexpressed genes and avoid being biased toward highly-studied hub proteins, except when they are strongly implicated by the data. The software is comprised of two individual tools, Garnet and Forest, that can be run together or independently to allow a user to perform advanced integration of multiple types of high-throughput data as well as create condition-specific subnetworks of protein interactions that best connect the observed changes in various datasets. It is available at http://fraenkel.mit.edu/omicsintegrator and on GitHub at https://github.com/fraenkel-lab/OmicsIntegrator.


Subjects
Databases, Genetic/statistics & numerical data , Gene Regulatory Networks , High-Throughput Nucleotide Sequencing/statistics & numerical data , Software , Algorithms , Computational Biology , Epigenesis, Genetic , Gene Expression Profiling/statistics & numerical data , Humans , Lung Neoplasms/genetics , Lung Neoplasms/metabolism , Protein Interaction Maps/genetics , Transcription Factors/metabolism
18.
Genome Res ; 23(2): 365-76, 2013 Feb.
Article in English | MEDLINE | ID: mdl-23064748

ABSTRACT

Accurate models of the cross-talk between signaling pathways and transcriptional regulatory networks within cells are essential to understand complex response programs. We present a new computational method that combines condition-specific time-series expression data with general protein interaction data to reconstruct dynamic and causal stress response networks. These networks characterize the pathways involved in the response, their time of activation, and the affected genes. The signaling and regulatory components of our networks are linked via a set of common transcription factors that serve as targets in the signaling network and as regulators of the transcriptional response network. Detailed case studies of stress responses in budding yeast demonstrate the predictive power of our method. Our method correctly identifies the core signaling proteins and transcription factors of the response programs. It further predicts the involvement of additional transcription factors and other proteins not previously implicated in the response pathways. We experimentally verify several of these predictions for the osmotic stress response network. Our approach requires little condition-specific data: only a partial set of upstream initiators and time-series gene expression data, which are readily available for many conditions and species. Consequently, our method is widely applicable and can be used to derive accurate, dynamic response models in several species.


Subjects
Gene Regulatory Networks , Models, Biological , Signal Transduction , Stress, Physiological/physiology , Algorithms , Arabidopsis/genetics , Arabidopsis/immunology , Arabidopsis/metabolism , Gene Expression Regulation/drug effects , Gene Knockout Techniques , Gene Regulatory Networks/drug effects , Osmolar Concentration , Reproducibility of Results , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Signal Transduction/drug effects , Sirolimus/pharmacology , Stress, Physiological/drug effects , Time Factors
19.
PLoS Comput Biol ; 10(12): e1003943, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25522349

ABSTRACT

Reconstructing regulatory and signaling response networks is one of the major goals of systems biology. While several successful methods have been suggested for this task, some integrating large and diverse datasets, these methods have so far been applied to reconstruct a single response network at a time, even when studying and modeling related conditions. To improve network reconstruction, we developed MT-SDREM, a multi-task learning method that jointly models networks for several related conditions. In MT-SDREM, parameters are jointly constrained across the networks while still allowing for condition-specific pathways and regulation. We formulate the multi-task learning problem and discuss methods for optimizing the joint target function. We applied MT-SDREM to reconstruct dynamic human response networks for three flu strains: H1N1, H5N1 and H3N2. Our multi-task learning method was able to identify known and novel factors and genes, improving upon prior methods that model each condition independently. The MT-SDREM networks were also better at identifying proteins whose removal affects viral load, indicating that joint learning can still lead to accurate, condition-specific networks. Supporting website with MT-SDREM implementation: http://sb.cs.cmu.edu/mtsdrem.


Subjects
Gene Regulatory Networks/immunology , Influenza A virus/immunology , Influenza, Human/immunology , Signal Transduction/immunology , Systems Biology/methods , Algorithms , Humans , Machine Learning
20.
Bioinformatics ; 29(13): i227-36, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23812988

ABSTRACT

MOTIVATION: Several types of studies, including genome-wide association studies and RNA interference screens, strive to link genes to diseases. Although these approaches have had some success, genetic variants are often only present in a small subset of the population, and screens are noisy with low overlap between experiments in different labs. Neither provides a mechanistic model explaining how identified genes impact the disease of interest or the dynamics of the pathways those genes regulate. Such mechanistic models could be used to accurately predict downstream effects of knocking down pathway members and allow comprehensive exploration of the effects of targeting pairs or higher-order combinations of genes. RESULTS: We developed methods to model the activation of signaling and dynamic regulatory networks involved in disease progression. Our model, SDREM, integrates static and time series data to link proteins and the pathways they regulate in these networks. SDREM uses prior information about proteins' likelihood of involvement in a disease (e.g. from screens) to improve the quality of the predicted signaling pathways. We used our algorithms to study the human immune response to H1N1 influenza infection. The resulting networks correctly identified many of the known pathways and transcriptional regulators of this disease. Furthermore, they accurately predict RNA interference effects and can be used to infer genetic interactions, greatly improving over other methods suggested for this task. Applying our method to the more pathogenic H5N1 influenza allowed us to identify several strain-specific targets of this infection. AVAILABILITY: SDREM is available from http://sb.cs.cmu.edu/sdrem. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Influenza, Human/genetics , Influenza, Human/metabolism , Protein Interaction Mapping , Signal Transduction , Disease Progression , Gene Expression Profiling , Gene Regulatory Networks , Genome-Wide Association Study , Humans , Influenza A Virus, H1N1 Subtype , Influenza A Virus, H5N1 Subtype , RNA Interference , Transcription Factors/metabolism , Viral Load