Results 1 - 20 of 117
1.
PeerJ ; 11: e16164, 2023.
Article in English | MEDLINE | ID: mdl-37818330

ABSTRACT

Background: Aberrant protein kinase regulation leading to abnormal substrate phosphorylation is associated with several human diseases. Despite the promise of therapies targeting kinases, many human kinases remain understudied. Most existing computational tools predicting phosphorylation cover less than 50% of known human kinases. They utilize local feature selection based on protein sequences, motifs, domains, structures, and/or functions, and do not consider the heterogeneous relationships of the proteins. In this work, we present KSFinder, a tool that predicts kinase-substrate links by capturing the inherent association of proteins in a network comprising 85% of the known human kinases. We also postulate the potential role of two understudied kinases based on their substrate predictions from KSFinder. Methods: KSFinder learns the semantic relationships in a phosphoproteome knowledge graph using a knowledge graph embedding algorithm and represents the nodes in low-dimensional vectors. A multilayer perceptron (MLP) classifier is trained to discern kinase-substrate links using the embedded vectors. KSFinder uses a strategic negative generation approach that eliminates biases in entity representation and combines data from experimentally validated non-interacting protein pairs, proteins from different subcellular locations, and random sampling. We assess KSFinder's generalization capability on four different datasets and compare its performance with other state-of-the-art prediction models. We employ KSFinder to predict substrates of 68 "dark" kinases considered understudied by the Illuminating the Druggable Genome program and use our text-mining tool, RLIMS-P along with manual curation, to search for literature evidence for the predictions. In a case study, we performed functional enrichment analysis for two dark kinases - HIPK3 and CAMKK1 using their predicted substrates. 
Results: KSFinder shows improved performance over other kinase-substrate prediction models and generalized prediction ability on different datasets. We identified literature evidence for 17 novel predictions involving an understudied kinase. All of these 17 predictions had a probability score ≥0.7 (nine at >0.9, six at 0.8-0.9, and two at 0.7-0.8). The evaluation of 93,593 negative predictions (probability ≤0.3) identified four false negatives. The top enriched biological processes of HIPK3 substrates relate to the regulation of extracellular matrix and epigenetic gene expression, while CAMKK1 substrates include lipid storage regulation and glucose homeostasis. Conclusions: KSFinder outperforms the current kinase-substrate prediction tools with higher kinase coverage. The strategically developed negatives provide a superior generalization ability for KSFinder. We predicted substrates of 432 kinases, 68 of which are understudied, and hypothesized the potential functions of two dark kinases using their predicted substrates.
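
The random-sampling component of the negative generation described in the Methods can be illustrated with a minimal Python sketch. The function name, the toy kinase and substrate identifiers, and the uniform sampling scheme are assumptions made for illustration; this is not KSFinder's actual implementation, which also draws on experimentally validated non-interacting pairs and subcellular location data.

```python
import random


def sample_negative_pairs(kinases, substrates, positives, n, seed=0):
    """Draw n kinase-substrate pairs that are absent from the known positives.

    Hypothetical helper: it covers uniform random sampling only.
    """
    positives = set(positives)
    # Every kinase-substrate combination that is not a known positive link.
    candidates = [
        (k, s)
        for k in kinases
        for s in substrates
        if (k, s) not in positives and k != s
    ]
    if n > len(candidates):
        raise ValueError("not enough candidate pairs to sample from")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return rng.sample(candidates, n)  # sampling without replacement


# Toy identifiers, made up for the example.
kinases = ["K1", "K2"]
substrates = ["S1", "S2", "S3"]
positives = [("K1", "S1"), ("K2", "S3")]
negatives = sample_negative_pairs(kinases, substrates, positives, n=3)
```

The sampled pairs would then be labeled as negatives alongside the known positive links when training a classifier on the embedded vectors.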


Subject(s)
Pattern Recognition, Automated , Protein Kinases , Humans , Protein Kinases/genetics , Phosphorylation , Algorithms , Proteome/chemistry
2.
PLoS One ; 18(4): e0274042, 2023.
Article in English | MEDLINE | ID: mdl-37022994

ABSTRACT

Chinese hamster ovary (CHO) cells are widely used for mass production of therapeutic proteins in the pharmaceutical industry. With the growing need to optimize the performance of producer CHO cell lines, research on CHO cell line development and bioprocess has continued to increase in recent decades. Bibliographic mapping and classification of relevant research studies will be essential for identifying research gaps and trends in the literature. To qualitatively and quantitatively understand the CHO literature, we conducted topic modeling using a CHO bioprocess bibliome manually compiled in 2016, and compared the topics uncovered by the Latent Dirichlet Allocation (LDA) models with the human labels of the CHO bibliome. The results show a significant overlap between the manually selected categories and computationally generated topics, and reveal the machine-generated topic-specific characteristics. To identify relevant CHO bioprocessing papers from new scientific literature, we developed supervised models using Logistic Regression to identify specific article topics and evaluated the results using three CHO bibliome datasets: the Bioprocessing set, the Glycosylation set, and the Phenotype set. The use of top terms as features supports the explainability of document classification results to yield insights on new CHO bioprocessing papers.


Subject(s)
Data Mining , Cricetinae , Animals , Humans , CHO Cells , Cricetulus , Phenotype , Glycosylation
3.
JAMA Netw Open ; 6(3): e233012, 2023 03 01.
Article in English | MEDLINE | ID: mdl-36920393

ABSTRACT

Importance: The association between degree of neighborhood deprivation and primary hypertension diagnosis in youth remains understudied. Objective: To assess the association between neighborhood measures of deprivation and primary hypertension diagnosis in youth. Design, Setting, and Participants: This cross-sectional study included 65 452 Delaware Medicaid-insured youths aged 8 to 18 years between January 1, 2014, and December 31, 2019. Residence was geocoded by national area deprivation index (ADI). Exposures: Higher area deprivation. Main Outcomes and Measures: The main outcome was primary hypertension diagnosis based on International Classification of Diseases, Ninth Revision and Tenth Revision codes. Data were analyzed between September 1, 2021, and December 31, 2022. Results: A total of 65 452 youths were included in the analysis, including 64 307 (98.3%) without a hypertension diagnosis (30 491 [47%] female and 33 813 [53%] male; mean [SD] age, 12.5 [3.1] years; 12 500 [19%] Hispanic, 25 473 [40%] non-Hispanic Black, 24 565 [38%] non-Hispanic White, and 1769 [3%] other race or ethnicity; 13 029 [20%] with obesity; and 31 548 [49%] with an ADI ≥50) and 1145 (1.7%) with a diagnosis of primary hypertension (mean [SD] age, 13.3 [2.8] years; 464 [41%] female and 681 [59%] male; 271 [24%] Hispanic, 460 [40%] non-Hispanic Black, 396 [35%] non-Hispanic White, and 18 [2%] of other race or ethnicity; 705 [62%] with obesity; and 614 [54%] with an ADI ≥50). The mean (SD) duration of full Medicaid benefit coverage was 61 (16) months for those with a diagnosis of primary hypertension and 46.0 (24.3) months for those without. By multivariable logistic regression, residence within communities with ADI greater than or equal to 50 was associated with 60% greater odds of a hypertension diagnosis (odds ratio [OR], 1.61; 95% CI, 1.04-2.51).
Older age (OR per year, 1.16; 95% CI, 1.14-1.18), an obesity diagnosis (OR, 5.16; 95% CI, 4.54-5.85), and longer duration of full Medicaid benefit coverage (OR, 1.03; 95% CI, 1.03-1.04) were associated with greater odds of primary hypertension diagnosis, whereas female sex was associated with lower odds (OR, 0.68; 95% CI, 0.61-0.77). Model fit including a Medicaid-by-ADI interaction term was significant for the interaction and revealed slightly greater odds of hypertension diagnosis for youths with ADI less than 50 (OR, 1.03; 95% CI, 1.03-1.04) vs ADI ≥50 (OR, 1.02; 95% CI, 1.02-1.03). Race and ethnicity were not associated with primary hypertension diagnosis. Conclusions and Relevance: In this cross-sectional study, higher childhood neighborhood ADI, obesity, age, sex, and duration of Medicaid benefit coverage were associated with a primary hypertension diagnosis in youth. Screening algorithms and national guidelines may consider the importance of ADI when assessing for the presence and prevalence of primary hypertension in youth.


Subject(s)
Hypertension , Medicaid , United States/epidemiology , Humans , Male , Adolescent , Female , Child , Cross-Sectional Studies , Delaware/epidemiology , Obesity , Hypertension/diagnosis , Hypertension/epidemiology , Essential Hypertension
4.
Sci Rep ; 13(1): 1200, 2023 01 21.
Article in English | MEDLINE | ID: mdl-36681715

ABSTRACT

Chinese hamster ovary (CHO) cell lines are widely used to manufacture biopharmaceuticals. However, CHO cells are not an optimal expression host due to the intrinsic plasticity of the CHO genome. Genome plasticity can lead to chromosomal rearrangements, transgene exclusion, and phenotypic drift. A poorly understood genomic element in CHO cell line instability is extrachromosomal circular DNA (eccDNA) and its role in gene expression and regulation. EccDNA can facilitate ultra-high gene expression and is found within many eukaryotes including humans, yeast, and plants. EccDNA confers genetic heterogeneity, providing selective advantages to individual cells in response to dynamic environments. In CHO cell cultures, maintaining genetic homogeneity is critical to ensuring consistent productivity and product quality. Understanding eccDNA structure, function, and microevolutionary dynamics under various culture conditions could reveal potential engineering targets for cell line optimization. In this study, eccDNA sequences were investigated at the beginning and end of two-week fed-batch cultures in an ambr®250 bioreactor under control and lactate-stressed conditions. This work characterized the structure and function of eccDNA in a CHO-K1 clone. Gene annotation identified 1551 unique eccDNA genes including cancer driver genes and genes involved in protein production. Furthermore, RNA-seq data were integrated to identify transcriptionally active eccDNA genes.


Subject(s)
Batch Cell Culture Techniques , Lactic Acid , Cricetinae , Animals , Humans , Cricetulus , CHO Cells , Genome , DNA
5.
Nucleic Acids Res ; 51(D1): D418-D427, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350672

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. Here, we report recent developments with InterPro (version 90.0) and its associated software, including updates to data content and to the website. These developments extend and enrich the information provided by InterPro, and provide more user-friendly access to the data. Additionally, we have worked on adding Pfam website features to the InterPro website, as the Pfam website will be retired in late 2022. We also show that InterPro's sequence coverage has kept pace with the growth of UniProtKB. Moreover, we report the development of a card game as a method of engaging the non-scientific community. Finally, we discuss the benefits and challenges brought by the use of artificial intelligence for protein structure prediction.


Subject(s)
Databases, Protein , Humans , Amino Acid Sequence , Artificial Intelligence , Internet , Proteins/chemistry , Software
6.
Mol Omics ; 18(9): 853-864, 2022 10 31.
Article in English | MEDLINE | ID: mdl-35975455

ABSTRACT

The human proteome contains a vast network of interacting kinases and substrates. Even though some kinases have proven to be immensely useful as therapeutic targets, a majority are still understudied. In this work, we present a novel knowledge graph representation learning approach to predict novel interaction partners for understudied kinases. Our approach uses a phosphoproteomic knowledge graph constructed by integrating data from iPTMnet, Protein Ontology, Gene Ontology and BioKG. The representations of kinases and substrates in this knowledge graph are learned by performing directed random walks on triples coupled with a modified SkipGram or CBOW model. These representations are then used as input to a supervised classification model to predict novel interactions for understudied kinases. We also present a post-predictive analysis of the predicted interactions and an ablation study of the phosphoproteomic knowledge graph to gain insight into the biology of the understudied kinases.
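
A directed random walk over knowledge graph triples can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the toy triples, and the choice to emit relation labels as tokens are hypothetical and are not taken from the paper, whose walk strategy and modified SkipGram/CBOW model are not detailed in the abstract.

```python
import random
from collections import defaultdict


def build_adjacency(triples):
    """Index (head, relation, tail) triples by head so walks follow edge direction."""
    adjacency = defaultdict(list)
    for head, relation, tail in triples:
        adjacency[head].append((relation, tail))
    return adjacency


def directed_walk(adjacency, start, max_hops, rng):
    """Return one walk as an alternating node/relation token sequence."""
    walk = [start]
    node = start
    for _ in range(max_hops):
        edges = adjacency.get(node)
        if not edges:  # dead end: no outgoing edges from this node
            break
        relation, node = rng.choice(edges)
        walk.extend([relation, node])
    return walk


# Toy triples, made up for the example.
triples = [
    ("KinaseA", "phosphorylates", "SubstrateB"),
    ("SubstrateB", "annotated_with", "GO_term_X"),
    ("KinaseA", "member_of", "FamilyY"),
]
adjacency = build_adjacency(triples)
walk = directed_walk(adjacency, "KinaseA", max_hops=2, rng=random.Random(0))
```

Token sequences of this kind could be passed as "sentences" to a word-embedding model such as SkipGram or CBOW, so that each entity receives a vector shaped by its graph neighborhood.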


Subject(s)
Pattern Recognition, Automated , Proteome , Humans , Gene Ontology , Substrate Specificity
7.
Methods Mol Biol ; 2499: 187-204, 2022.
Article in English | MEDLINE | ID: mdl-35696082

ABSTRACT

iPTMnet is a resource that combines rich information about protein post-translational modifications (PTMs) from curated databases as well as text mining tools. Researchers can use the iPTMnet website to query, analyze and download the PTM data. In this chapter we describe the iPTMnet RESTful API, which provides a way to streamline the integration of iPTMnet data into an automated data analysis workflow. In the first section, we give an overview of the architecture of the API. In the second section, we describe the various functions defined by the API and provide detailed examples of using these functions.


Subject(s)
Data Mining , Protein Processing, Post-Translational , Databases, Protein , Proteins/metabolism , Workflow
8.
BMC Plant Biol ; 22(1): 107, 2022 Mar 08.
Article in English | MEDLINE | ID: mdl-35260072

ABSTRACT

BACKGROUND: Sustainable production of high-quality feedstock has been of great interest in bioenergy research. Despite the economic importance of switchgrass, high temperatures and water deficit limit its successful cultivation in semi-arid areas. There are limited reports on the molecular basis of combined abiotic stress tolerance in switchgrass, particularly the combination of drought and heat stress. We used transcriptomic approaches to elucidate the changes in the response of switchgrass to simultaneous drought and high temperature. RESULTS: We applied drought treatment alone to switchgrass (Alamo AP13) plants by withholding water after 45 days of growth. For the combined drought and heat treatment, heat (35 °C/25 °C day/night) was imposed 72 h after the initiation of drought. Samples were collected at 0 h, 72 h, 96 h, 120 h, 144 h, and 168 h after treatment imposition, total RNA was extracted, and RNA-Seq was conducted. Out of a total of 32,190 genes, we identified 3912 as drought (DT) responsive genes, and 2339 and 4635 as heat (HT) and drought and heat (DTHT) responsive genes, respectively. There were 209, 106, and 220 transcription factors (TFs) differentially expressed under DT, HT and DTHT, respectively. Gene ontology annotation identified the metabolic process as the significant term enriched in DTHT genes. Other biological processes identified in DTHT responsive genes included response to water, photosynthesis, oxidation-reduction processes, and response to stress. KEGG pathway enrichment analysis on DT and DTHT responsive genes revealed that TFs and genes controlling phenylpropanoid pathways were important for individual as well as combined stress response. For example, hydroxycinnamoyl-CoA shikimate/quinate hydroxycinnamoyl transferase (HCT) from the phenylpropanoid pathway was induced by DT alone and by combined DTHT stress.
CONCLUSION: Through RNA-Seq analysis, we have identified unique and overlapping genes in response to DT and combined DTHT stress in switchgrass. The combination of DT and HT stress may affect the photosynthetic machinery and phenylpropanoid pathway of switchgrass which negatively impacts lignin synthesis and biomass production of switchgrass. The biological function of genes identified particularly in response to DTHT stress could further be confirmed by techniques such as single point mutation or RNAi.


Subject(s)
Adaptation, Physiological/genetics , Dehydration/genetics , Heat-Shock Response/genetics , Panicum/genetics , Transcriptome , Gene Expression Profiling , Gene Expression Regulation, Plant , Genes, Plant
9.
PLoS Biol ; 19(12): e3001464, 2021 12.
Article in English | MEDLINE | ID: mdl-34871295

ABSTRACT

The UniProt knowledgebase is a public database for protein sequence and function, covering the tree of life and over 220 million protein entries. Now, the whole community can use a new crowdsourcing annotation system to help scale up UniProt curation and receive proper attribution for their biocuration work.


Subject(s)
Crowdsourcing/methods , Data Curation/methods , Molecular Sequence Annotation/methods , Amino Acid Sequence/genetics , Computational Biology/methods , Databases, Protein/trends , Humans , Literature , Proteins/metabolism , Stakeholder Participation
10.
Bioinformatics ; 37(23): 4597-4598, 2021 12 07.
Article in English | MEDLINE | ID: mdl-34613368

ABSTRACT

SUMMARY: The global response to the COVID-19 pandemic has led to a rapid increase of scientific literature on this deadly disease. Extracting knowledge from biomedical literature and integrating it with relevant information from curated biological databases is essential to gain insight into COVID-19 etiology, diagnosis and treatment. We used Semantic Web technology RDF to integrate COVID-19 knowledge mined from literature by iTextMine, PubTator and SemRep with relevant biological databases and formalized the knowledge in a standardized and computable COVID-19 Knowledge Graph (KG). We published the COVID-19 KG via a SPARQL endpoint to support federated queries on the Semantic Web and developed a knowledge portal with browsing and searching interfaces. We also developed a RESTful API to support programmatic access and provided RDF dumps for download. AVAILABILITY AND IMPLEMENTATION: The COVID-19 Knowledge Graph is publicly available under CC-BY 4.0 license at https://research.bioinformatics.udel.edu/covid19kg/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
COVID-19 , Semantics , Humans , Pandemics , Pattern Recognition, Automated , Databases, Factual
11.
Nucleic Acids Res ; 49(D1): D344-D354, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33156333

ABSTRACT

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.


Subject(s)
Databases, Protein , Proteins/chemistry , Amino Acid Sequence , COVID-19/metabolism , Internet , Molecular Sequence Annotation , Protein Domains , Protein Interaction Maps , SARS-CoV-2/metabolism , Sequence Alignment
12.
Sci Data ; 7(1): 337, 2020 10 12.
Article in English | MEDLINE | ID: mdl-33046717

ABSTRACT

The Protein Ontology (PRO) provides an ontological representation of protein-related entities, ranging from protein families to proteoforms to complexes. Protein Ontology Linked Open Data (LOD) exposes, shares, and connects knowledge about protein-related entities on the Semantic Web using Resource Description Framework (RDF), thus enabling integration with other Linked Open Data for biological knowledge discovery. For example, proteins (or variants thereof) can be retrieved on the basis of specific disease associations. As a community resource, we strive to follow the Findability, Accessibility, Interoperability, and Reusability (FAIR) principles, disseminate regular updates of our data, support multiple methods for accessing, querying and downloading data in various formats, and provide documentation both for scientists and programmers. PRO Linked Open Data can be browsed via a faceted browser interface and queried using SPARQL via YASGUI. RDF data dumps are also available for download. Additionally, we developed RESTful APIs to support programmatic data access. We also provide a W3C HCLS specification-compliant metadata description for our data. The PRO Linked Open Data is available at https://lod.proconsortium.org/.


Subject(s)
Knowledge Discovery , Proteins/chemistry , Semantic Web , Datasets as Topic , Software
13.
Adv Biosyst ; 4(9): e2000119, 2020 09.
Article in English | MEDLINE | ID: mdl-32603024

ABSTRACT

Late recurrences of breast cancer are hypothesized to originate from disseminated tumor cells that re-activate after a long period of dormancy, ≥5 years for estrogen-receptor positive (ER+) tumors. An outstanding question remains as to what the key microenvironment interactions are that regulate this complex process, and well-defined human model systems are needed for probing this. Here, a robust, bioinspired 3D ER+ dormancy culture model is established and utilized to probe the effects of matrix properties for common sites of late recurrence on breast cancer cell dormancy. Formation of dormant micrometastases over several weeks is examined for ER+ cells (T47D, BT474), where the timing of entry into dormancy versus persistent growth depends on matrix composition and cell type. In contrast, triple negative cells (MDA-MB-231), associated with early recurrence, are not observed to undergo long-term dormancy. Bioinformatic analyses quantitatively support an increased "dormancy score" gene signature for ER+ cells (T47D) and reveal differential expression of genes associated with different biological processes based on matrix composition. Further, these analyses support a link between dormancy and autophagy, a potential survival mechanism. This robust model system will allow systematic investigations of other cell-microenvironment interactions in dormancy and evaluation of therapeutics for preventing late recurrence.


Subject(s)
Breast Neoplasms , Cell Culture Techniques/methods , Models, Biological , Receptors, Estrogen/metabolism , Tumor Microenvironment/physiology , Autophagy , Breast Neoplasms/chemistry , Breast Neoplasms/metabolism , Breast Neoplasms/physiopathology , Cell Line, Tumor , Extracellular Matrix/metabolism , Female , Humans , Synthetic Biology
14.
Database (Oxford) ; 2020, 2020 01 01.
Article in English | MEDLINE | ID: mdl-32395768

ABSTRACT

iPTMnet is a bioinformatics resource that integrates protein post-translational modification (PTM) data from text mining and curated databases and ontologies to aid in knowledge discovery and scientific study. The current iPTMnet website can be used for querying and browsing rich PTM information but does not support automated iPTMnet data integration with other tools. Hence, we have developed a RESTful API utilizing the latest developments in cloud technologies to facilitate the integration of iPTMnet into existing tools and pipelines. We have packaged iPTMnet API software in Docker containers and published it on DockerHub for easy redistribution. We have also developed Python and R packages that allow users to integrate iPTMnet for scientific discovery, as demonstrated in a use case that connects PTM sites to kinase signaling pathways.


Subject(s)
Computational Biology , Software , Data Mining , Protein Processing, Post-Translational , Proteins/genetics
15.
APL Bioeng ; 3(1): 016101, 2019 Mar.
Article in English | MEDLINE | ID: mdl-31069334

ABSTRACT

The extracellular matrix (ECM) is thought to play a critical role in the progression of breast cancer. In this work, we have designed a photopolymerizable, biomimetic synthetic matrix for the controlled, 3D culture of breast cancer cells and, in combination with imaging and bioinformatics tools, utilized this system to investigate the breast cancer cell response to different matrix cues. Specifically, hydrogel-based matrices of different densities and modified with receptor-binding peptides derived from ECM proteins [fibronectin/vitronectin (RGDS), collagen (GFOGER), and laminin (IKVAV)] were synthesized to mimic key aspects of the ECM of different soft tissue sites. To assess the breast cancer cell response, the morphology and growth of breast cancer cells (MDA-MB-231 and T47D) were monitored in three dimensions over time, and differences in their transcriptome were assayed using next generation sequencing. We observed increased growth in response to GFOGER and RGDS, whether individually or in combination with IKVAV, where binding of integrin β1 was key. Importantly, in matrices with GFOGER, increased growth was observed with increasing matrix density for MDA-MB-231s. Further, transcriptomic analyses revealed increased gene expression and enrichment of biological processes associated with cell-matrix interactions, proliferation, and motility in matrices rich in GFOGER relative to IKVAV. In sum, a new approach for investigating breast cancer cell-matrix interactions was established with insights into how microenvironments rich in collagen promote breast cancer growth, a hallmark of disease progression in vivo, with opportunities for future investigations that harness the multidimensional property control afforded by this photopolymerizable system.

16.
Database (Oxford) ; 2019, 2019 01 01.
Article in English | MEDLINE | ID: mdl-30805646

ABSTRACT

Methods focused on predicting 'global' annotations for proteins (such as molecular function, biological process and presence of domains or membership in a family) have reached a relatively mature stage. Methods to provide fine-grained 'local' annotation of functional sites (at the level of individual amino acids) are now coming to the forefront, especially in light of the rapid accumulation of genetic variant data. We have developed a computational method and workflow that predicts functional sites within proteins using position-specific conditional template annotation rules (namely PIR Site Rules, or PIRSRs for short). Such rules are curated by structural biologists through review of known protein structural and other experimental data, and are used to generate high-quality annotations for the UniProt Knowledgebase (UniProtKB) unreviewed section. To share the PIRSR functional site prediction method with the broader scientific community, we have streamlined our workflow and developed a stand-alone Java software package named PIRSitePredict. We demonstrate the use of PIRSitePredict for functional annotation of de novo assembled genomes/transcriptomes by annotating uncharacterized proteins from Trinity RNA-seq assemblies of embryonic transcriptomes of the following three cartilaginous fishes: Leucoraja erinacea (Little Skate), Scyliorhinus canicula (Small-spotted Catshark) and Callorhinchus milii (Elephant Shark). On average, about 1200 lines of annotations were predicted for each species.


Subject(s)
Databases, Protein , Molecular Sequence Annotation , Amino Acid Sequence , Animals , Embryo, Nonmammalian/metabolism , Fishes/embryology , Fishes/genetics , Genome , Software , Transcriptome/genetics
17.
Epigenetics Chromatin ; 12(1): 11, 2019 02 08.
Article in English | MEDLINE | ID: mdl-30736855

ABSTRACT

BACKGROUND: Epithelial to mesenchymal transition (EMT) plays a crucial role in cancer propagation. It can be orchestrated by the activation of multiple signaling pathways, which have been found to be highly coordinated with many epigenetic regulators. Although the mechanism of EMT has been studied for decades, cross talk between signaling and epigenetic regulation is not fully understood. RESULTS: Here, we present a time-resolved multi-omics strategy, which featured the identification of the correlation between the dynamics of protein changes (proteome), signaling pathways (phosphoproteome) and chromatin modulation (histone modifications) during TGF-β-induced EMT. Our data revealed that Erk signaling was activated after 5 min of stimulation and structural proteins involved in cytoskeleton rearrangement were regulated after 1 day of treatment, constituting a detailed map of systematic changes. The comprehensive profiling of histone post-translational modifications identified H3K27me3 as the most significantly up-regulated mark. We thus speculated and confirmed that a combined inhibition of Erk signaling and Ezh2 (H3K27me3 methyltransferase) was more effective in blocking EMT progress than individual inhibitions. CONCLUSIONS: In summary, our data provided a more detailed map of cross talk between signaling pathways and chromatin regulation compared with previous EMT studies. Our findings point to a promising therapeutic strategy for EMT-related diseases by combining an Erk inhibitor (signaling pathway) and an Ezh2 inhibitor (epigenetic regulation).


Subject(s)
Epigenesis, Genetic , Epithelial-Mesenchymal Transition , MAP Kinase Signaling System , Transforming Growth Factor beta/metabolism , Animals , Cell Line , Enhancer of Zeste Homolog 2 Protein/metabolism , Histone Code , Mice
18.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576489

ABSTRACT

Numerous efforts have been made for developing text-mining tools to extract information from biomedical text automatically. They have assisted in many biological tasks, such as database curation and hypothesis generation. Text-mining tools are usually different from each other in terms of programming language, system dependency and input/output format. There are few previous works that concern the integration of different text-mining tools and their results from large-scale text processing. In this paper, we describe the iTextMine system with an automated workflow to run multiple text-mining tools on large-scale text for knowledge extraction. We employ parallel processing with dockerized text-mining tools with a standardized JSON output format and implement a text alignment algorithm to solve the text discrepancy for result integration. iTextMine presently integrates four relation extraction tools, which have been used to process all the Medline abstracts and PMC open access full-length articles. The website allows users to browse the text evidence and view integrated results for knowledge discovery through a network view. We demonstrate the utilities of iTextMine with two use cases involving the gene PTEN and breast cancer and the gene SATB1.


Subject(s)
Abstracting and Indexing/methods , Data Mining/methods , Publications , Software , Algorithms
19.
BMC Med Inform Decis Mak ; 18(Suppl 5): 119, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30526566

ABSTRACT

BACKGROUND: The Gene Ontology (GO) is a resource that supplies information about gene product function using ontologies to represent biological knowledge. These ontologies cover three domains: Cellular Component (CC), Molecular Function (MF), and Biological Process (BP). GO annotation is a process which assigns gene functional information using GO terms to relevant genes in the literature. It is a common task among the Model Organism Database (MOD) groups. Manual GO annotation relies on human curators assigning gene functional information using GO terms by reading the biomedical literature. This process is very time-consuming and labor-intensive. As a result, many MODs can afford to curate only a fraction of relevant articles. METHODS: GO terms from the CC domain can be essentially divided into two sub-hierarchies: subcellular location terms and protein complex terms. We cast the task of gene annotation using GO terms from the CC domain as relation extraction between genes and other entities: (1) extract cases where a protein is found to be in a subcellular location, and (2) extract cases where a protein is a subunit of a protein complex. For each relation extraction task, we use an approach based on triggers and syntactic dependencies to extract the desired relations among entities. RESULTS: We tested our approach on the BC4GO test set, a publicly available corpus for GO annotation. Our approach obtains an F1-score of 71%, a precision of 91% and a recall of 58% for predicting GO terms from the CC domain for given genes. CONCLUSIONS: We have described a novel approach of treating gene annotation with GO terms from the CC domain as two relation extraction subtasks. Evaluation results show that our approach achieves an F1-score of 71% for predicting GO terms for given genes. Our approach can thereby be used to accelerate the process of GO annotation for bio-annotators.
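
The reported F1-score follows from the reported precision and recall, since F1 is their harmonic mean. The short Python check below is a sketch added for illustration; the function name is an assumption, and the inputs are the rounded percentages quoted in the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (defined as 0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision 91% and recall 58%, as quoted in the abstract.
f1 = f1_score(0.91, 0.58)
print(f"F1 = {f1:.2f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, the F1-score sits closer to the 58% recall than to the 91% precision.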


Subject(s)
Computational Biology , Gene Ontology , Molecular Sequence Annotation , Natural Language Processing , Humans
20.
BMC Genomics ; 19(1): 695, 2018 Sep 21.
Article in English | MEDLINE | ID: mdl-30241500

ABSTRACT

BACKGROUND: Although hatching is perhaps the most abrupt and profound metabolic challenge that a chicken must undergo, there have been no attempts to functionally map the metabolic pathways induced in liver during the embryo-to-hatchling transition. Furthermore, we know very little about the metabolic and regulatory factors that control lipid metabolism in late embryos or newly hatched chicks. In the present study, we examined hepatic transcriptomes of 12 embryos and 12 hatchling chicks during the peri-hatch period, the metabolic switch from chorioallantoic to pulmonary respiration. RESULTS: Initial hierarchical clustering revealed two distinct, albeit opposing, patterns of hepatic gene expression. Cluster A genes are largely lipolytic and highly expressed in embryos, whereas Cluster B genes are lipogenic/thermogenic and mainly controlled by the lipogenic transcription factor THRSPA. Using pairwise comparisons of embryo and hatchling ages, we found 1272 genes that were differentially expressed between embryos and hatchling chicks, including 24 transcription factors and 284 genes that regulate lipid metabolism. The three most differentially expressed transcripts found in liver of embryos were MOGAT1, DIO3 and PDK4, whereas THRSPA, FASN and DIO2 were highest in hatchlings. An unusual finding was the "ectopic" and extremely high differential expression of seven feather keratin transcripts in liver of 16-day embryos, which coincides with engorgement of liver with yolk lipids. Gene interaction networks show several transcription factors, transcriptional co-activators/co-inhibitors and their downstream genes that exert a 'yin-yang' action on lipid metabolism during the embryo-to-hatchling transition. These upstream regulators include ligand-activated transcription factors, sirtuins and Kruppel-like factors.
CONCLUSIONS: Our genome-wide transcriptional analysis has greatly expanded the hepatic repertoire of regulatory and metabolic genes involved in the embryo-to-hatchling transition. New knowledge was gained on interactive transcriptional networks and metabolic pathways that enable the abrupt switch from ectothermy (embryo) to endothermy (hatchling) in the chicken. Several transcription factors and their co-activators/co-inhibitors appear to exert opposing actions on lipid metabolism, leading to the predominance of lipolysis in embryos and lipogenesis in hatchlings. Our analysis of hepatic transcriptomes has enabled discovery of opposing, interconnected and interdependent transcriptional regulators that provide precise yin-yang or homeorhetic regulation of lipid metabolism during the critical embryo-to-hatchling transition.


Subject(s)
Chickens/growth & development , Chickens/metabolism , Gene Expression Regulation, Developmental , Liver/metabolism , Animals , Breeding , Chick Embryo/growth & development , Chick Embryo/metabolism , Embryonic Development , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Liver/embryology , Liver/growth & development , Transcriptome