Results 1 - 14 of 14
1.
PeerJ ; 11: e16164, 2023.
Article in English | MEDLINE | ID: mdl-37818330

ABSTRACT

Background: Aberrant protein kinase regulation leading to abnormal substrate phosphorylation is associated with several human diseases. Despite the promise of therapies targeting kinases, many human kinases remain understudied. Most existing computational tools predicting phosphorylation cover less than 50% of known human kinases. They utilize local feature selection based on protein sequences, motifs, domains, structures, and/or functions, and do not consider the heterogeneous relationships of the proteins. In this work, we present KSFinder, a tool that predicts kinase-substrate links by capturing the inherent association of proteins in a network comprising 85% of the known human kinases. We also postulate the potential role of two understudied kinases based on their substrate predictions from KSFinder. Methods: KSFinder learns the semantic relationships in a phosphoproteome knowledge graph using a knowledge graph embedding algorithm and represents the nodes in low-dimensional vectors. A multilayer perceptron (MLP) classifier is trained to discern kinase-substrate links using the embedded vectors. KSFinder uses a strategic negative generation approach that eliminates biases in entity representation and combines data from experimentally validated non-interacting protein pairs, proteins from different subcellular locations, and random sampling. We assess KSFinder's generalization capability on four different datasets and compare its performance with other state-of-the-art prediction models. We employ KSFinder to predict substrates of 68 "dark" kinases considered understudied by the Illuminating the Druggable Genome program and use our text-mining tool, RLIMS-P, along with manual curation, to search for literature evidence for the predictions. In a case study, we performed functional enrichment analysis for two dark kinases, HIPK3 and CAMKK1, using their predicted substrates.
Results: KSFinder shows improved performance over other kinase-substrate prediction models and generalized prediction ability on different datasets. We identified literature evidence for 17 novel predictions involving an understudied kinase. All of these 17 predictions had a probability score ≥0.7 (nine at >0.9, six at 0.8-0.9, and two at 0.7-0.8). The evaluation of 93,593 negative predictions (probability ≤0.3) identified four false negatives. The top enriched biological processes of HIPK3 substrates relate to the regulation of extracellular matrix and epigenetic gene expression, while CAMKK1 substrates include lipid storage regulation and glucose homeostasis. Conclusions: KSFinder outperforms the current kinase-substrate prediction tools with higher kinase coverage. The strategically developed negatives provide a superior generalization ability for KSFinder. We predicted substrates of 432 kinases, 68 of which are understudied, and hypothesized the potential functions of two dark kinases using their predicted substrates.


Subjects
Pattern Recognition, Automated; Protein Kinases; Humans; Protein Kinases/genetics; Phosphorylation; Algorithms; Proteome/chemistry
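The abstract of entry 1 describes negatives assembled from three sources: experimentally validated non-interacting pairs, proteins in different subcellular locations, and random sampling. The following is only a minimal sketch of that idea under stated assumptions; it is not KSFinder's code. The function name `build_negatives`, its parameters, and all identifiers in the toy data are hypothetical, and the abstract does not say how the three sources are weighted or how bias in entity representation is controlled.

```python
import random


def build_negatives(positives, validated_non_interacting, location,
                    kinases, substrates, n_random, seed=0):
    """Return a set of (kinase, substrate) negatives drawn from three sources."""
    rng = random.Random(seed)
    known = set(positives)
    negatives = set()

    # Source 1: experimentally validated non-interacting pairs.
    negatives.update(p for p in validated_non_interacting if p not in known)

    # Source 2: pairs whose annotated subcellular locations differ.
    for k in kinases:
        for s in substrates:
            if (k, s) in known:
                continue
            if k in location and s in location and location[k] != location[s]:
                negatives.add((k, s))

    # Source 3: random sampling from the remaining unlabeled pairs.
    remaining = sorted((k, s) for k in kinases for s in substrates
                       if (k, s) not in known and (k, s) not in negatives)
    negatives.update(rng.sample(remaining, min(n_random, len(remaining))))
    return negatives


# Toy, made-up identifiers (not real proteins or real annotations).
positives = [("K1", "S1"), ("K2", "S2")]
validated = [("K1", "S2")]
location = {"K1": "nucleus", "K2": "cytosol", "S1": "nucleus",
            "S2": "cytosol", "S3": "membrane"}  # "S4" has no annotation
negatives = build_negatives(positives, validated, location,
                            ["K1", "K2"], ["S1", "S2", "S3", "S4"],
                            n_random=1)
print(sorted(negatives))
```

Known positives are excluded from every source, so a pair can never carry both labels in the resulting training set.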
2.
Mol Omics ; 18(9): 853-864, 2022 10 31.
Article in English | MEDLINE | ID: mdl-35975455

ABSTRACT

The human proteome contains a vast network of interacting kinases and substrates. Even though some kinases have proven to be immensely useful as therapeutic targets, a majority are still understudied. In this work, we present a novel knowledge graph representation learning approach to predict novel interaction partners for understudied kinases. Our approach uses a phosphoproteomic knowledge graph constructed by integrating data from iPTMnet, Protein Ontology, Gene Ontology and BioKG. The representations of kinases and substrates in this knowledge graph are learned by performing directed random walks on triples coupled with a modified SkipGram or CBOW model. These representations are then used as an input to a supervised classification model to predict novel interactions for understudied kinases. We also present a post-predictive analysis of the predicted interactions and an ablation study of the phosphoproteomic knowledge graph to gain an insight into the biology of the understudied kinases.


Subjects
Pattern Recognition, Automated; Proteome; Humans; Gene Ontology; Substrate Specificity
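To make the "directed random walks on triples" step in the abstract of entry 2 more concrete, here is a small sketch, assuming the walks follow edges only from head to tail and emit alternating entity and relation tokens. The function `directed_walks`, the relation names, and the node identifiers are all hypothetical; the paper's modified SkipGram/CBOW model is not reproduced here, and the resulting token sequences would simply be the input to such a model.

```python
import random
from collections import defaultdict


def directed_walks(triples, walk_length, walks_per_node, seed=0):
    """Generate walks that follow (head, relation, tail) edges head-to-tail."""
    rng = random.Random(seed)
    out_edges = defaultdict(list)
    for head, relation, tail in triples:
        out_edges[head].append((relation, tail))
    out_edges = dict(out_edges)  # freeze: lookups must not create new keys

    walks = []
    for start in sorted(out_edges):
        for _ in range(walks_per_node):
            walk, node = [start], start
            for _ in range(walk_length):
                choices = out_edges.get(node)
                if not choices:  # dead end: node has no outgoing edge
                    break
                relation, node = rng.choice(choices)
                walk += [relation, node]
            walks.append(walk)
    return walks


# Toy triples with made-up relation and node names.
triples = [
    ("K1", "phosphorylates", "S1"),
    ("K1", "member_of", "family_X"),
    ("S1", "annotated_with", "term_A"),
]
walks = directed_walks(triples, walk_length=2, walks_per_node=2)
for walk in walks:
    print(" ".join(walk))
```

Each walk is a sentence-like token list, which is the form that word-embedding models of the SkipGram or CBOW family expect.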
3.
Methods Mol Biol ; 2499: 187-204, 2022.
Article in English | MEDLINE | ID: mdl-35696082

ABSTRACT

iPTMnet is a resource that combines rich information about protein post-translational modifications (PTM) from curated databases as well as text mining tools. Researchers can use the iPTMnet website to query, analyze and download the PTM data. In this chapter we describe the iPTMnet RESTful API, which provides a way to streamline the integration of iPTMnet data into an automated data analysis workflow. In the first section, we give an overview of the architecture of the API. In the second section, we describe the various functions defined by the API and provide detailed examples of using these functions.


Subjects
Data Mining; Protein Processing, Post-Translational; Databases, Protein; Proteins/metabolism; Workflow
4.
Bioinformatics ; 37(23): 4597-4598, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34613368

ABSTRACT

SUMMARY: The global response to the COVID-19 pandemic has led to a rapid increase of scientific literature on this deadly disease. Extracting knowledge from biomedical literature and integrating it with relevant information from curated biological databases is essential to gain insight into COVID-19 etiology, diagnosis and treatment. We used Semantic Web technology RDF to integrate COVID-19 knowledge mined from literature by iTextMine, PubTator and SemRep with relevant biological databases and formalized the knowledge in a standardized and computable COVID-19 Knowledge Graph (KG). We published the COVID-19 KG via a SPARQL endpoint to support federated queries on the Semantic Web and developed a knowledge portal with browsing and searching interfaces. We also developed a RESTful API to support programmatic access and provided RDF dumps for download. AVAILABILITY AND IMPLEMENTATION: The COVID-19 Knowledge Graph is publicly available under CC-BY 4.0 license at https://research.bioinformatics.udel.edu/covid19kg/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19; Semantics; Humans; Pandemics; Pattern Recognition, Automated; Databases, Factual
5.
Cancer Res ; 81(11): 3051-3066, 2021 06 01.
Article in English | MEDLINE | ID: mdl-33727228

ABSTRACT

Lung cancer is the leading cause of cancer mortality worldwide. The treatment of patients with lung cancer harboring mutant EGFR with orally administered EGFR tyrosine kinase inhibitors (TKI) has been a paradigm shift. Osimertinib and rociletinib are third-generation irreversible EGFR TKIs targeting the EGFR T790M mutation. Osimertinib is the current standard of care for patients with EGFR mutations due to increased efficacy, lower side effects, and enhanced brain penetrance. Unfortunately, all patients develop resistance. Genomic approaches have primarily been used to interrogate resistance mechanisms. Here we characterized the proteome and phosphoproteome of a series of isogenic EGFR-mutant lung adenocarcinoma cell lines that are either sensitive or resistant to these drugs, comprising the most comprehensive proteomic dataset resource to date to investigate third-generation EGFR TKI resistance in lung adenocarcinoma. Unbiased global quantitative mass spectrometry uncovered alterations in signaling pathways, revealed a proteomic signature of epithelial-mesenchymal transition, and identified kinases and phosphatases with altered expression and phosphorylation in TKI-resistant cells. Decreased tyrosine phosphorylation of key sites in the phosphatase SHP2 suggests its inhibition, resulting in subsequent inhibition of RAS/MAPK and activation of PI3K/AKT pathways. Anticorrelation analyses of this phosphoproteomic dataset with published drug-induced P100 phosphoproteomic datasets from the Library of Integrated Network-Based Cellular Signatures program predicted drugs with the potential to overcome EGFR TKI resistance. The PI3K/mTOR inhibitor dactolisib in combination with osimertinib overcame resistance both in vitro and in vivo. Taken together, this study reveals global proteomic alterations upon third-generation EGFR TKI resistance and highlights potential novel approaches to overcome resistance.
SIGNIFICANCE: Global quantitative proteomics reveals changes in the proteome and phosphoproteome in lung cancer cells resistant to third-generation EGFR TKIs, identifying the PI3K/mTOR inhibitor dactolisib as a potential approach to overcome resistance.


Subjects
Adenocarcinoma of Lung/drug therapy; Drug Resistance, Neoplasm; Imidazoles/pharmacology; Phosphoproteins/metabolism; Protein Kinase Inhibitors/pharmacology; Proteome/metabolism; Quinolines/pharmacology; Adenocarcinoma of Lung/metabolism; Adenocarcinoma of Lung/pathology; Antineoplastic Agents/pharmacology; Apoptosis; Cell Proliferation; ErbB Receptors/antagonists & inhibitors; Humans; Lung Neoplasms/drug therapy; Lung Neoplasms/metabolism; Lung Neoplasms/pathology; Phosphatidylinositol 3-Kinases/chemistry; Phosphoproteins/analysis; Proteome/analysis; TOR Serine-Threonine Kinases/antagonists & inhibitors; Tumor Cells, Cultured
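The "anticorrelation analyses" mentioned in the abstract of entry 5 can be pictured as scoring each drug-induced signature against the resistance signature and ranking drugs by the most negative score. The sketch below assumes Pearson correlation over a shared, ordered set of phosphosites; the abstract does not state the actual statistic or preprocessing, and the function names, drug labels, and numbers are invented for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_by_anticorrelation(resistance_signature, drug_signatures):
    """Sort drugs so the most anticorrelated signature comes first."""
    scores = {name: pearson(resistance_signature, signature)
              for name, signature in drug_signatures.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Toy values on four shared phosphosites (made up, not measured data).
resistance = [2.0, -1.0, 0.5, -0.5]
drugs = {
    "drug_A": [-2.0, 1.0, -0.5, 0.5],   # mirrors the resistance pattern
    "drug_B": [2.0, -1.0, 0.5, -0.5],   # mimics the resistance pattern
    "drug_C": [0.5, 0.5, -1.0, 0.0],    # unrelated pattern
}
ranking = rank_by_anticorrelation(resistance, drugs)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

A drug whose signature reverses the resistance-associated changes ranks first, which is the intuition behind treating anticorrelated signatures as candidates for overcoming resistance.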
6.
Sci Data ; 7(1): 337, 2020 10 12.
Article in English | MEDLINE | ID: mdl-33046717

ABSTRACT

The Protein Ontology (PRO) provides an ontological representation of protein-related entities, ranging from protein families to proteoforms to complexes. Protein Ontology Linked Open Data (LOD) exposes, shares, and connects knowledge about protein-related entities on the Semantic Web using Resource Description Framework (RDF), thus enabling integration with other Linked Open Data for biological knowledge discovery. For example, proteins (or variants thereof) can be retrieved on the basis of specific disease associations. As a community resource, we strive to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles, disseminate regular updates of our data, support multiple methods for accessing, querying and downloading data in various formats, and provide documentation both for scientists and programmers. PRO Linked Open Data can be browsed via a faceted browser interface and queried using SPARQL via YASGUI. RDF data dumps are also available for download. Additionally, we developed RESTful APIs to support programmatic data access. We also provide W3C HCLS specification-compliant metadata descriptions for our data. The PRO Linked Open Data is available at https://lod.proconsortium.org/.


Subjects
Knowledge Discovery; Proteins/chemistry; Semantic Web; Datasets as Topic; Software
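The abstract of entry 6 notes that proteins can be retrieved on the basis of specific disease associations. As a rough, library-free illustration of what such a lookup amounts to, the sketch below matches a single subject-predicate-object pattern against an in-memory list of triples, which is conceptually what one SPARQL triple pattern such as `?protein ex:associatedWith ex:diseaseA` does. The `ex:` identifiers and the `match` helper are hypothetical and are not the PRO vocabulary or API.

```python
def match(triples, pattern):
    """Return triples matching an (s, p, o) pattern; None is a wildcard."""
    return [
        triple for triple in triples
        if all(want is None or want == have
               for want, have in zip(pattern, triple))
    ]


# Made-up triples using a placeholder "ex:" namespace.
triples = [
    ("ex:protein1", "ex:associatedWith", "ex:diseaseA"),
    ("ex:protein2", "ex:associatedWith", "ex:diseaseB"),
    ("ex:protein1", "rdfs:label", "protein one"),
]

# Which subjects are associated with ex:diseaseA?
hits = match(triples, (None, "ex:associatedWith", "ex:diseaseA"))
proteins = [subject for subject, _, _ in hits]
print(proteins)
```

A real SPARQL endpoint additionally joins several such patterns and resolves full IRIs, which this sketch leaves out.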
7.
Adv Biosyst ; 4(9): e2000119, 2020 09.
Article in English | MEDLINE | ID: mdl-32603024

ABSTRACT

Late recurrences of breast cancer are hypothesized to originate from disseminated tumor cells that re-activate after a long period of dormancy, ≥5 years for estrogen-receptor positive (ER+) tumors. An outstanding question remains as to what the key microenvironment interactions are that regulate this complex process, and well-defined human model systems are needed for probing this. Here, a robust, bioinspired 3D ER+ dormancy culture model is established and utilized to probe the effects of matrix properties for common sites of late recurrence on breast cancer cell dormancy. Formation of dormant micrometastases over several weeks is examined for ER+ cells (T47D, BT474), where the timing of entry into dormancy versus persistent growth depends on matrix composition and cell type. In contrast, triple negative cells (MDA-MB-231), associated with early recurrence, are not observed to undergo long-term dormancy. Bioinformatic analyses quantitatively support an increased "dormancy score" gene signature for ER+ cells (T47D) and reveal differential expression of genes associated with different biological processes based on matrix composition. Further, these analyses support a link between dormancy and autophagy, a potential survival mechanism. This robust model system will allow systematic investigations of other cell-microenvironment interactions in dormancy and evaluation of therapeutics for preventing late recurrence.


Subjects
Breast Neoplasms; Cell Culture Techniques/methods; Models, Biological; Receptors, Estrogen/metabolism; Tumor Microenvironment/physiology; Autophagy; Breast Neoplasms/chemistry; Breast Neoplasms/metabolism; Breast Neoplasms/physiopathology; Cell Line, Tumor; Extracellular Matrix/metabolism; Female; Humans; Synthetic Biology
8.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32395768

ABSTRACT

iPTMnet is a bioinformatics resource that integrates protein post-translational modification (PTM) data from text mining and curated databases and ontologies to aid in knowledge discovery and scientific study. The current iPTMnet website can be used for querying and browsing rich PTM information but does not support automated iPTMnet data integration with other tools. Hence, we have developed a RESTful API utilizing the latest developments in cloud technologies to facilitate the integration of iPTMnet into existing tools and pipelines. We have packaged iPTMnet API software in Docker containers and published it on DockerHub for easy redistribution. We have also developed Python and R packages that allow users to integrate iPTMnet for scientific discovery, as demonstrated in a use case that connects PTM sites to kinase signaling pathways.


Subjects
Computational Biology; Software; Data Mining; Protein Processing, Post-Translational; Proteins/genetics
9.
APL Bioeng ; 3(1): 016101, 2019 Mar.
Article in English | MEDLINE | ID: mdl-31069334

ABSTRACT

The extracellular matrix (ECM) is thought to play a critical role in the progression of breast cancer. In this work, we have designed a photopolymerizable, biomimetic synthetic matrix for the controlled, 3D culture of breast cancer cells and, in combination with imaging and bioinformatics tools, utilized this system to investigate the breast cancer cell response to different matrix cues. Specifically, hydrogel-based matrices of different densities and modified with receptor-binding peptides derived from ECM proteins [fibronectin/vitronectin (RGDS), collagen (GFOGER), and laminin (IKVAV)] were synthesized to mimic key aspects of the ECM of different soft tissue sites. To assess the breast cancer cell response, the morphology and growth of breast cancer cells (MDA-MB-231 and T47D) were monitored in three dimensions over time, and differences in their transcriptome were assayed using next generation sequencing. We observed increased growth in response to GFOGER and RGDS, whether individually or in combination with IKVAV, where binding of integrin β1 was key. Importantly, in matrices with GFOGER, increased growth was observed with increasing matrix density for MDA-MB-231s. Further, transcriptomic analyses revealed increased gene expression and enrichment of biological processes associated with cell-matrix interactions, proliferation, and motility in matrices rich in GFOGER relative to IKVAV. In sum, a new approach for investigating breast cancer cell-matrix interactions was established with insights into how microenvironments rich in collagen promote breast cancer growth, a hallmark of disease progression in vivo, with opportunities for future investigations that harness the multidimensional property control afforded by this photopolymerizable system.

10.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576489

ABSTRACT

Numerous efforts have been made to develop text-mining tools that extract information from biomedical text automatically. They have assisted in many biological tasks, such as database curation and hypothesis generation. Text-mining tools usually differ from each other in terms of programming language, system dependency and input/output format. Few previous works address the integration of different text-mining tools and their results from large-scale text processing. In this paper, we describe the iTextMine system with an automated workflow to run multiple text-mining tools on large-scale text for knowledge extraction. We employ parallel processing with dockerized text-mining tools with a standardized JSON output format and implement a text alignment algorithm to solve the text discrepancy for result integration. iTextMine presently integrates four relation extraction tools, which have been used to process all the Medline abstracts and PMC open access full-length articles. The website allows users to browse the text evidence and view integrated results for knowledge discovery through a network view. We demonstrate the utility of iTextMine with two use cases involving the gene PTEN and breast cancer and the gene SATB1.


Subjects
Abstracting and Indexing/methods; Data Mining/methods; Publications; Software; Algorithms
11.
Sci Rep ; 8(1): 16094, 2018 10 31.
Article in English | MEDLINE | ID: mdl-30382141

ABSTRACT

Oviductosomes (OVS) are nano-sized extracellular vesicles secreted in the oviductal luminal fluid by oviductal epithelial cells and known to be involved in sperm capacitation and fertility. Although they have been shown to transfer encapsulated proteins to sperm, cargo constituents other than proteins have not been identified. Using next-generation sequencing, we demonstrate that OVS are carriers of microRNAs (miRNAs), with 272 detected throughout the estrous cycle. Of the 50 most abundant, 6 (12%) and 2 (4%) were expressed at significantly higher levels (P < 0.05) at metestrus/diestrus and proestrus/estrus, respectively. RT-qPCR showed that selected miRNAs are present in oviductal epithelial cells in significantly (P < 0.05) lower abundance than in OVS, indicating selective miRNA packaging. The majority (64%) of the top 25 OVS miRNAs are present in sperm. These miRNAs' potential target list is enriched with transcription factors, transcription regulators, and protein kinases, and it includes several genes related to embryonic development. Importantly, OVS can deliver miRNAs to sperm, including miR-34c-5p, which is essential for the first cleavage and is solely sperm-derived in the zygote. Z-stack of confocal images of sperm co-incubated with OVS loaded with labeled miRNAs showed the intracellular location of the delivered miRNAs. Interestingly, individual miRNAs were predominantly localized in specific head compartments, with miR-34c-5p being highly concentrated at the centrosome where it is known to function. These results, for the first time, demonstrate OVS' ability to contribute to the sperm's miRNA repertoire (an important role for solely sperm-derived zygotic miRNAs) and the physiological relevance of an OVS-borne miRNA that is delivered to sperm.


Subjects
Centrosome/metabolism; Estrous Cycle/genetics; Extracellular Vesicles/metabolism; Gene Expression Profiling; MicroRNAs/metabolism; Oviducts/metabolism; Spermatozoa/metabolism; Animals; Cell Proliferation; Centrosome/ultrastructure; Embryonic Development; Endocytosis; Extracellular Vesicles/ultrastructure; Female; Gene Expression Regulation; Gene Ontology; Male; Mice; MicroRNAs/genetics; Oviducts/embryology; Oviducts/ultrastructure; Reproducibility of Results
12.
Nucleic Acids Res ; 46(D1): D542-D550, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29145615

ABSTRACT

Protein post-translational modifications (PTMs) play a pivotal role in numerous biological processes by modulating regulation of protein function. We have developed iPTMnet (http://proteininformationresource.org/iPTMnet) for PTM knowledge discovery, employing an integrative bioinformatics approach that combines text mining, data mining, and ontological representation to capture rich PTM information, including PTM enzyme-substrate-site relationships, PTM-specific protein-protein interactions (PPIs) and PTM conservation across species. iPTMnet encompasses data from (i) our PTM-focused text mining tools, RLIMS-P and eFIP, which extract phosphorylation information from full-scale mining of PubMed abstracts and full-length articles; (ii) a set of curated databases with experimentally observed PTMs; and (iii) Protein Ontology, which organizes proteins and PTM proteoforms, enabling their representation, annotation and comparison within and across species. Presently covering eight major PTM types (phosphorylation, ubiquitination, acetylation, methylation, glycosylation, S-nitrosylation, sumoylation and myristoylation), iPTMnet knowledgebase contains more than 654 500 unique PTM sites in over 62 100 proteins, along with more than 1200 PTM enzymes and over 24 300 PTM enzyme-substrate-site relations. The website supports online search, browsing, retrieval and visual analysis for scientific queries. Several examples, including functional interpretation of phosphoproteomic data, demonstrate iPTMnet as a gateway for visual exploration and systematic analysis of PTM networks and conservation, thereby enabling PTM discovery and hypothesis generation.


Subjects
Databases, Protein; Knowledge Bases; Protein Processing, Post-Translational; Animals; Computational Biology; Data Mining; Enzymes/metabolism; Humans; Internet; Phosphorylation; Protein Interaction Maps; Sequence Alignment
13.
Nucleic Acids Res ; 45(D1): D339-D346, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899649

ABSTRACT

The Protein Ontology (PRO; http://purl.obolibrary.org/obo/pr) formally defines and describes taxon-specific and taxon-neutral protein-related entities in three major areas: proteins related by evolution; proteins produced from a given gene; and protein-containing complexes. PRO thus serves as a tool for referencing protein entities at any level of specificity. To enhance this ability, and to facilitate the comparison of such entities described in different resources, we developed a standardized representation of proteoforms using UniProtKB as a sequence reference and PSI-MOD as a post-translational modification reference. We illustrate its use in facilitating an alignment between PRO and Reactome protein entities. We also address issues of scalability, describing our first steps into the use of text mining to identify protein-related entities, the large-scale import of proteoform information from expert curated resources, and our ability to dynamically generate PRO terms. Web views for individual terms are now more informative about closely-related terms, including for example an interactive multiple sequence alignment. Finally, we describe recent improvement in semantic utility, with PRO now represented in OWL and as a SPARQL endpoint. These developments will further support the anticipated growth of PRO and facilitate discoverability of and allow aggregation of data relating to protein entities.


Subjects
Computational Biology/methods; Databases, Genetic; Proteins; Animals; Humans; Proteins/chemistry; Proteins/genetics; Web Browser
14.
Nucleic Acids Res ; 42(Database issue): D415-21, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24270789

ABSTRACT

The Protein Ontology (PRO; http://proconsortium.org) formally defines protein entities and explicitly represents their major forms and interrelations. Protein entities represented in PRO corresponding to single amino acid chains are categorized by level of specificity into family, gene, sequence and modification metaclasses, and there is a separate metaclass for protein complexes. All metaclasses also have organism-specific derivatives. PRO complements established sequence databases such as UniProtKB, and interoperates with other biomedical and biological ontologies such as the Gene Ontology (GO). PRO relates to UniProtKB in that PRO's organism-specific classes of proteins encoded by a specific gene correspond to entities documented in UniProtKB entries. PRO relates to the GO in that PRO's representations of organism-specific protein complexes are subclasses of the organism-agnostic protein complex terms in the GO Cellular Component Ontology. The past few years have seen growth and changes to the PRO, as well as new points of access to the data and new applications of PRO in immunology and proteomics. Here we describe some of these developments.


Subjects
Biological Ontologies; Databases, Protein; Proteins/classification; Animals; Humans; Internet; Mice; Proteins/chemistry