Results 1 - 20 of 22
1.
J Mater Chem B ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38949411

ABSTRACT

Most existing hydrogels, even recently developed injectable hydrogels that undergo a reversible sol-gel phase transition in response to external stimuli, are designed to gel immediately before or after implantation/injection to prevent the free diffusion of materials and drugs; however, the property of immediate gelation leads to a very weak tumour-targeting ability, limiting their application in anticancer therapy. Therefore, the development of tumour-specific responsive hydrogels for anticancer therapy is imperative because tumour-specific responses improve their tumour-targeting efficacy, increase therapeutic effects, and decrease toxicity and side effects. In this review, we introduce the following three types of tumour-responsive hydrogels: (1) hydrogels that gel specifically at the tumour site; (2) hydrogels that decompose specifically at the tumour site; and (3) hydrogels that react specifically with tumours. For each type, their compositions, the mechanisms of tumour-specific responsiveness and their applications in anticancer treatment are comprehensively discussed.

2.
Appl Environ Microbiol ; : e0054524, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38899887

ABSTRACT

White-rot fungi differentially express laccases when they encounter aromatic compounds. However, the underlying mechanisms are still being explored. Here, proteomics analysis revealed that in addition to increased laccase activity, proteins involved in sphingolipid metabolism and toluene degradation as well as some cytochrome P450s (CYP450s) were differentially expressed and significantly enriched during 48 h of o-toluidine exposure in Trametes hirsuta AH28-2. Two Zn2Cys6-type transcription factors (TFs), TH8421 and TH4300, were upregulated. Bioinformatics docking and isothermal titration calorimetry assays showed that each of them could bind directly to o-toluidine and another aromatic monomer, guaiacol. Binding to aromatic compounds promoted the formation of TH8421/TH4300 heterodimers. TH8421 and TH4300 silencing in T. hirsuta AH28-2 led to decreased transcriptional levels and activities of LacA and LacB upon o-toluidine and guaiacol exposure. EMSA and ChIP-qPCR analysis further showed that TH8421 and TH4300 bound directly with the promoter regions of lacA and lacB containing CGG or CCG motifs. Furthermore, the two TFs were involved in direct and positive regulation of the transcription of some CYP450s. Together, TH8421 and TH4300, two key regulators found in T. hirsuta AH28-2, function as heterodimers to simultaneously trigger the expression of downstream laccases and intracellular enzymes. Monomeric aromatic compounds act as ligands to promote heterodimer formation and enhance the transcriptional activities of the two TFs. IMPORTANCE: White-rot fungi differentially express laccase isoenzymes when exposed to aromatic compounds. Clarification of the molecular mechanisms underlying differential laccase expression is essential to elucidate how white-rot fungi respond to the environment. Our study shows that two Zn2Cys6-type transcription factors form heterodimers, interact with the promoters of laccase genes, and positively regulate laccase transcription in Trametes hirsuta AH28-2. Aromatic monomer addition induces faster heterodimer formation and rate of activity. These findings not only identify two new transcription factors involved in fungal laccase transcription but also deepen our understanding of the mechanisms underlying the response to aromatics exposure in white-rot fungi.

3.
Comput Struct Biotechnol J ; 24: 322-333, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38690549

ABSTRACT

Data curation for a hospital-based cancer registry heavily relies on the labor-intensive manual abstraction process by cancer registrars to identify cancer-related information from free-text electronic health records. To streamline this process, a natural language processing system incorporating a hybrid of deep learning-based and rule-based approaches for identifying lung cancer registry-related concepts, along with a symbolic expert system that generates registry coding based on weighted rules, was developed. The system is integrated with the hospital information system at a medical center to provide cancer registrars with a patient journey visualization platform. The embedded system offers a comprehensive view of patient reports annotated with significant registry concepts to facilitate the manual coding process and elevate overall quality. Extensive evaluations, including comparisons with state-of-the-art methods, were conducted using a lung cancer dataset comprising 1428 patients from the medical center. The experimental results illustrate the effectiveness of the developed system, consistently achieving F1-scores of 0.85 and 1.00 across 30 coding items. Registrar feedback highlights the system's reliability as a tool for assisting and auditing the abstraction. By presenting key registry items along the timeline of a patient's reports with accurate code predictions, the system improves the quality of registrar outcomes and reduces the labor resources and time required for data abstraction. Our study highlights advancements in cancer registry coding practices, demonstrating that the proposed hybrid weighted neural-symbolic cancer registry system is reliable and efficient for assisting cancer registrars in the coding workflow and contributing to clinical outcomes.

4.
J Med Chem ; 67(6): 4904-4915, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38499004

ABSTRACT

A selective tumor-penetrating strategy generally exploits tumor-targeted ligands to modify drugs so that the conjugate preferentially enters tumors and subsequently undergoes transcellular transport to penetrate tumors. However, this process shields ligands from their corresponding targets on the cell surface, possibly inducing an off-target effect during drug penetration at the tumor-normal interface. Herein, we first describe a selective tumor-penetrating drug (R11-phalloidin conjugates) for intravesical therapy of bladder cancer. The intravesical conjugates rapidly translocated across the mucus layer, specifically bound to tumors, and infiltrated throughout the tumor via direct intercellular transfer. Notably, direct transfer from normal cells to tumor cells was unidirectional because the pathways required for direct transfer, termed F-actin-rich tunneling nanotubes, were more unidirectionally extended from normal cells to tumor cells. Moreover, the intravesical conjugates displayed strong anticancer activity and well-tolerated biosafety in murine orthotopic bladder tumor models. Our study demonstrated the potential of a selective tumor-penetrating conjugate for effective intravesical anticancer therapy.


Subject(s)
Urinary Bladder Neoplasms , Mice , Animals , Administration, Intravesical , Urinary Bladder Neoplasms/pathology
5.
J Med Internet Res ; 25: e48145, 2023 12 06.
Article in English | MEDLINE | ID: mdl-38055317

ABSTRACT

BACKGROUND: Electronic health records (EHRs) in unstructured formats are valuable sources of information for research in both the clinical and biomedical domains. However, before such records can be used for research purposes, sensitive health information (SHI) must be removed in several cases to protect patient privacy. Rule-based and machine learning-based methods have been shown to be effective in deidentification. However, very few studies investigated the combination of transformer-based language models and rules. OBJECTIVE: The objective of this study is to develop a hybrid deidentification pipeline for Australian EHR text notes using rules and transformers. The study also aims to investigate the impact of pretrained word embedding and transformer-based language models. METHODS: In this study, we present a hybrid deidentification pipeline called OpenDeID, which is developed using an Australian multicenter EHR-based corpus called OpenDeID Corpus. The OpenDeID corpus consists of 2100 pathology reports with 38,414 SHI entities from 1833 patients. The OpenDeID pipeline incorporates a hybrid approach of associative rules, supervised deep learning, and pretrained language models. RESULTS: The OpenDeID achieved a best F1-score of 0.9659 by fine-tuning the Discharge Summary BioBERT model and incorporating various preprocessing and postprocessing rules. The OpenDeID pipeline has been deployed at a large tertiary teaching hospital and has processed over 8000 unstructured EHR text notes in real time. CONCLUSIONS: The OpenDeID pipeline is a hybrid deidentification pipeline to deidentify SHI entities in unstructured EHR text notes. The pipeline has been evaluated on a large multicenter corpus. External validation will be undertaken as part of our future work to evaluate the effectiveness of the OpenDeID pipeline.
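As a rough illustration of how rules and a learned tagger can be combined in a hybrid deidentification pipeline like the one described above, the following minimal sketch merges spans from two sources and redacts them. It is not the OpenDeID implementation: the regular expressions, the `stub_model` function (a stand-in for a fine-tuned transformer tagger), the labels, and the sample note are all hypothetical.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, label)

# Hypothetical postprocessing rules; a real system would use many more.
RULES = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{4} \d{3} \d{3}\b"),
}

def rule_spans(text: str) -> List[Span]:
    """Return every span matched by one of the hand-written rules."""
    return [(m.start(), m.end(), label)
            for label, pattern in RULES.items()
            for m in pattern.finditer(text)]

def stub_model(text: str) -> List[Span]:
    """Hypothetical stand-in for a trained tagger: tags the word after 'Dr '."""
    return [(m.start(), m.end(), "NAME")
            for m in re.finditer(r"(?<=Dr )[A-Z][a-z]+", text)]

def merge_spans(model: List[Span], rules: List[Span]) -> List[Span]:
    """Keep all model spans; add rule spans that do not overlap any kept span."""
    merged = list(model)
    for span in rules:
        if not any(span[0] < kept[1] and kept[0] < span[1] for kept in merged):
            merged.append(span)
    return sorted(merged)

def redact(text: str, spans: List[Span]) -> str:
    """Replace spans from right to left so earlier offsets stay valid."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

note = "Seen by Dr Lee on 03/12/2021, ph 0412 345 678."
deidentified = redact(note, merge_spans(stub_model(note), rule_spans(note)))
print(deidentified)
```

Giving model spans priority over rule spans is only one possible design; a pipeline could equally let high-precision rules override the model.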


Subject(s)
Data Anonymization , Electronic Health Records , Humans , Australia , Algorithms , Hospitals, Teaching
6.
Microbiol Spectr ; 11(4): e0076823, 2023 08 17.
Article in English | MEDLINE | ID: mdl-37395668

ABSTRACT

The function of seryl-tRNA synthetase in fungi during gene transcription regulation beyond translation has not been reported. Here, we report a seryl-tRNA synthetase, ThserRS, which can negatively regulate laccase lacA transcription in Trametes hirsuta AH28-2 upon exposure to copper ions. ThserRS was obtained through yeast one-hybrid screening using a bait sequence of the lacA promoter (-502 to -372 bp). ThserRS decreased while lacA increased at the transcription level in T. hirsuta AH28-2 in the first 36 h upon CuSO4 induction. Then, ThserRS was upregulated, and lacA was downregulated. ThserRS overexpression in T. hirsuta AH28-2 resulted in a decrease in lacA transcription and LacA activity. By comparison, ThserRS silencing led to increased lacA transcripts and LacA activity. A DNA fragment of at least 32 bp containing two putative xenobiotic response elements could interact with ThserRS, with a dissociation constant of 919.9 nM. ThserRS localized in the cell cytoplasm and nucleus in T. hirsuta AH28-2 and was heterologously expressed in yeast. ThserRS overexpression also enhanced mycelial growth and oxidative stress resistance. The transcriptional level of several intracellular antioxidative enzymes in T. hirsuta AH28-2 was upregulated. Our results demonstrate a noncanonical activity of SerRS that acts as a transcriptional regulation factor to upregulate laccase expression at an early stage after exposure to copper ions. IMPORTANCE: Seryl-tRNA synthetase is well known for the attachment of serine to the corresponding cognate tRNA during protein translation. In contrast, its functions beyond translation in microorganisms are underexplored. We performed in vitro and cell experiments to show that the seryl-tRNA synthetase in fungi with no UNE-S domain at the carboxyl terminus can enter the nucleus, directly interact with the promoter of the laccase gene, and negatively regulate fungal laccase transcription early upon copper ion induction. Our study deepens our understanding of the noncanonical activities of seryl-tRNA synthetase in microorganisms. It also demonstrates a new transcription factor for fungal laccase transcription.


Subject(s)
Saccharomyces cerevisiae , Serine-tRNA Ligase , Saccharomyces cerevisiae/metabolism , Trametes/genetics , Trametes/metabolism , Serine-tRNA Ligase/metabolism , Laccase/genetics , Laccase/metabolism , Copper/metabolism , Ions
7.
Database (Oxford) ; 2023, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36734300

ABSTRACT

This study presents the outcomes of the shared task competition BioCreative VII (Task 3) focusing on the extraction of medication names from a Twitter user's publicly available tweets (the user's 'timeline'). In general, detecting health-related tweets is notoriously challenging for natural language processing tools. The main challenge, aside from the informality of the language used, is that people tweet about any and all topics, and most of their tweets are not related to health. Thus, finding those tweets in a user's timeline that mention specific health-related concepts such as medications requires addressing extreme imbalance. Task 3 called for detecting tweets in a user's timeline that mentions a medication name and, for each detected mention, extracting its span. The organizers made available a corpus consisting of 182 049 tweets publicly posted by 212 Twitter users with all medication mentions manually annotated. The corpus exhibits the natural distribution of positive tweets, with only 442 tweets (0.2%) mentioning a medication. This task was an opportunity for participants to evaluate methods that are robust to class imbalance beyond the simple lexical match. A total of 65 teams registered, and 16 teams submitted a system run. This study summarizes the corpus created by the organizers and the approaches taken by the participating teams for this challenge. The corpus is freely available at https://biocreative.bioinformatics.udel.edu/tasks/biocreative-vii/track-3/. The methods and the results of the competing systems are analyzed with a focus on the approaches taken for learning from class-imbalanced data.
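The extreme imbalance described above can be made concrete with a little arithmetic. The sketch below uses only the two corpus figures quoted in the abstract (182 049 tweets, 442 positives); the "label everything negative" baseline is a hypothetical illustration, not a system from the shared task. It shows why accuracy is uninformative on such data while the F1-score is not.

```python
# Corpus figures quoted in the abstract above.
TOTAL_TWEETS = 182_049
POSITIVE_TWEETS = 442

def positive_rate(positives: int, total: int) -> float:
    """Fraction of tweets that mention a medication."""
    return positives / total

def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator

# Hypothetical baseline: predict "no medication" for every tweet.
tp, fp = 0, 0
fn = POSITIVE_TWEETS
tn = TOTAL_TWEETS - POSITIVE_TWEETS

accuracy = (tp + tn) / TOTAL_TWEETS
baseline_f1 = f1_score(tp, fp, fn)

print(f"positive rate:         {positive_rate(POSITIVE_TWEETS, TOTAL_TWEETS):.4%}")
print(f"all-negative accuracy: {accuracy:.4%}")
print(f"all-negative F1:       {baseline_f1:.2f}")
```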


Subject(s)
Data Mining , Natural Language Processing , Humans , Data Mining/methods
8.
Zhen Ci Yan Jiu ; 46(9): 782-8, 2021 Sep 25.
Article in Chinese | MEDLINE | ID: mdl-34558245

ABSTRACT

OBJECTIVE: To explore the molecular mechanism by which the locus coeruleus (LC) is involved in the anti-myocardial ischemia effect of electroacupuncture (EA). METHODS: Twenty-four SD rats were randomly divided into sham-operation, model, EA and EA + lesion groups, with 6 rats in each group. The acute myocardial ischemia (AMI) model was established by ligation of the left anterior descending branch of the coronary artery. EA (2 Hz/15 Hz, 1 mA) was applied to bilateral "Shenmen" (HT7) - "Tongli" (HT5) and the midpoint between HT7 and HT5 for 30 min, once daily for 3 days. For rats of the EA + lesion group, the virus (300 nL) was injected into the bilateral LC before EA treatment. Serum aspartate aminotransferase (AST) was detected by ELISA. The gene expression profiles of rat hearts were detected by transcriptome sequencing, the differentially expressed genes were screened, and Gene Ontology (GO) functional classification and Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathway enrichment analysis were performed. RESULTS: Compared with the sham-operation group, serum AST content was significantly increased in the model group (P<0.01). Following the intervention, serum AST was significantly reduced in the EA group (P<0.01), while the serum AST in the EA + lesion group was significantly higher compared with the EA group (P<0.05). Differential expression analysis identified 1 138 differentially expressed genes between the model group and the sham-operation group, 1 330 between the model and EA groups, and 804 between the EA and EA + lesion groups. Among them, 218 differentially expressed genes were involved in the regulation of EA anti-myocardial ischemia in the LC. GO functional classification analysis showed that, among biological processes, these differentially expressed genes were mainly involved in cellular processes, metabolic processes and biological regulation. KEGG pathway analysis showed that these differentially expressed genes were enriched in sulfur relay system, thiamine metabolism, glutathione metabolism, C5 branch dicarboxylic acid metabolism, cell adhesion molecules and Th1 and Th2 cell differentiation. CONCLUSION: EA intervention has a positive effect against myocardial ischemia, which may be related to the sulfur relay system, thiamine metabolism, glutathione metabolism, C5 branch dicarboxylic acid metabolism, cell adhesion molecules and Th1 and Th2 cell differentiation involved in the LC.


Subject(s)
Electroacupuncture , Myocardial Ischemia , Acupuncture Points , Animals , Locus Coeruleus , Myocardial Ischemia/genetics , Myocardial Ischemia/therapy , Rats , Rats, Sprague-Dawley , Transcriptome
9.
Zhongguo Zhong Yao Za Zhi ; 46(5): 1084-1093, 2021 Mar.
Article in Chinese | MEDLINE | ID: mdl-33787101

ABSTRACT

To enrich the transcriptome data of Fagopyrum dibotrys, analyze the genes encoding key enzymes involved in the flavonoid biosynthesis pathway, and mine its functional genes, we performed RNA sequencing analysis of the rhizomes, roots, flowers, leaves and stems of F. dibotrys on the BGISEQ-500 sequencing platform. After de novo assembly of transcripts, a total of 205 619 unigenes were generated, and 132 372 unigenes were annotated in seven public databases, of which 81 327 unigenes were mapped to the GO database; most of these unigenes were annotated in cellular process, biological regulation, binding and catalytic activity. In addition, 86 922 unigenes were enriched in 136 pathways using the KEGG database, and we identified 82 unigenes that encode key enzymes involved in flavonoid biosynthesis. Comparing rhizome with root, flower, leaf or stem in F. dibotrys, 27 962 co-expressed differentially expressed genes (DEGs) were obtained. Among them, 23 515 rhizome-specific DEGs were enriched in 132 pathways, and 13 unigenes were significantly enriched in flavone and flavonol biosynthesis. We also identified 3 427 unigenes encoding 60 transcription factor (TF) families, as well as four unigenes encoding bHLH TFs that were enriched in flavonoid biosynthesis. Our results greatly enriched the transcriptome database of plants, provided a reference for the analysis of key enzymes involved in flavonoid biosynthesis in plants, and will facilitate the study of the functions and regulatory mechanisms of key enzymes involved in flavonoid biosynthesis in F. dibotrys at the genetic level.


Subject(s)
Fagopyrum , Biosynthetic Pathways/genetics , Flavonoids , Flowers , Gene Expression Profiling , Gene Expression Regulation, Plant , Humans , Transcriptome/genetics
10.
PeerJ ; 9: e10885, 2021.
Article in English | MEDLINE | ID: mdl-33665027

ABSTRACT

BACKGROUND: Pueraria lobata (Willd.) Ohwi is a valuable herb used in traditional Chinese medicine. Isoflavonoids are the major bioactive compounds in P. lobata, namely puerarin, daidzin, glycitin, genistin, daidzein, and glycitein, which have pharmacological activity against cardiovascular disease, hypertension, inflammation, and arrhythmia. METHODS: To characterize the corresponding genes of the compounds in the isoflavonoid pathway, RNA sequencing (RNA-Seq) analyses of roots, stems, and leaves of P. lobata were carried out on the BGISEQ-500 sequencing platform. RESULTS: We identified 140,905 unigenes in total, of which 109,687 were annotated in public databases, after assembling the transcripts from all three tissues. Multiple genes encoding key enzymes, such as IF7GT, and transcription factors associated with isoflavonoid biosynthesis were identified and then further analyzed. Quantitative real-time PCR (qRT-PCR) results of some genes encoding key enzymes were consistent with our RNA-Seq analysis. Differentially expressed genes (DEGs) were determined by analyzing the expression profiles of roots compared with other tissues (leaves and stems). This analysis revealed numerous DEGs that were either uniquely expressed or up-regulated in the roots. Finally, quantitative analyses of isoflavonoid metabolites occurring in the three P. lobata tissue types were done via high-performance liquid chromatography and tandem mass spectrometry (HPLC-MS/MS). Our comprehensive transcriptome investigation substantially expands the genomic resources of P. lobata and provides valuable knowledge on both gene expression regulation and promising candidate genes that are involved in plant isoflavonoid pathways.

11.
Zhongguo Zhong Yao Za Zhi ; 45(12): 2847-2857, 2020 Jun.
Article in Chinese | MEDLINE | ID: mdl-32627459

ABSTRACT

Steroidal saponins, which are the characteristic and main active constituents of Polygonatum, exhibit a broad range of pharmacological functions, such as regulating blood sugar, preventing cardiovascular and cerebrovascular diseases, and exerting anti-tumor effects. In this study, we performed RNA sequencing (RNA-Seq) analysis of the flowers, leaves, roots, and rhizomes of Polygonatum cyrtonema using the BGISEQ-500 platform to understand the biosynthesis pathway of steroidal saponins and study their key enzyme genes. The assembly of transcripts from the four tissues generated 129 989 unigenes, of which 88 958 were mapped to several public databases for functional annotation; 22 813 unigenes were assigned to 53 subcategories and 64 877 unigenes were annotated to 136 pathways in the KEGG database. Furthermore, 502 unigenes involved in the biosynthesis pathway of steroidal saponins were identified, of which 97 unigenes encoded 12 key enzymes. Cycloartenol synthase, the first key enzyme in the phytosterol biosynthesis pathway, showed a conserved catalytic domain and substrate-binding domain based on sequence analysis and homology modeling. Differentially expressed genes (DEGs) were identified in rhizomes as compared to other tissues (flowers, leaves or roots). The 2 437 unigenes annotated by KEGG showed rhizome-specific expression, of which 35 unigenes were involved in the biosynthesis of steroidal saponins. Our results greatly extend the public transcriptome dataset of Polygonatum and provide valuable information for the identification of candidate genes involved in the biosynthesis of steroidal saponins and other important secondary metabolites.


Subject(s)
Polygonatum , Saponins , Biosynthetic Pathways , Gene Expression Profiling , Sequence Analysis, RNA , Transcriptome
12.
Gene ; 744: 144626, 2020 Jun 20.
Article in English | MEDLINE | ID: mdl-32224272

ABSTRACT

Polygonatum odoratum (Mill.) Druce is a well-known traditional Chinese herb. Polysaccharides are major bioactive components of Polygonatum odoratum, which can improve immunity, and are used to treat rheumatic heart disease, cardiovascular disease, and diabetes. This study identified potential genes and transcription factors (TFs) that regulate polysaccharide synthesis in Polygonatum odoratum (Mill.) Druce using RNA sequencing data from leaf, stem, and rhizome tissues. A total of 76,714 unigenes were annotated in public databases. Analysis of KEGG annotations identified 18 key enzymes responsible for polysaccharide biosynthesis, and most of the upregulated unigenes were enriched in rhizome tissue compared with leaf or stem tissue. In total, 73 TFs involved in polysaccharide synthesis were predicted. In addition, key enzyme genes were verified by quantitative real-time PCR. This study substantially enlarged the public transcriptome datasets of this species, and provided insight into the detection of novel genes involved in the synthesis of polysaccharides and other secondary metabolites.


Subject(s)
Polygonatum/genetics , Polysaccharides/biosynthesis , Transcriptome , Gene Expression , Genes, Plant , Plant Leaves/genetics , Plant Leaves/metabolism , Plant Stems/genetics , Plant Stems/metabolism , Polygonatum/enzymology , Polygonatum/metabolism , Polysaccharides/metabolism , RNA-Seq , Rhizome/genetics , Rhizome/metabolism , Secondary Metabolism/genetics , Transcription Factors/metabolism , beta-Fructofuranosidase/chemistry
13.
BMC Genomics ; 21(1): 49, 2020 Jan 15.
Article in English | MEDLINE | ID: mdl-31941462

ABSTRACT

BACKGROUND: Clinopodium gracile (Benth.) Matsum (C. gracile) is an annual herb with pharmacological properties effective in the treatment of various diseases, including hepatic carcinoma. Triterpenoid saponins are crucial bioactive compounds in C. gracile. However, the molecular understanding of the triterpenoid saponin biosynthesis pathway remains unclear. RESULTS: In this study, we performed RNA sequencing (RNA-Seq) analysis of the flowers, leaves, roots, and stems of C. gracile plants using the BGISEQ-500 platform. The assembly of transcripts from all four types of tissues generated 128,856 unigenes, of which 99,020 were mapped to several public databases for functional annotation. Differentially expressed genes (DEGs) were identified via the comparison of gene expression levels between leaves and other tissues (flowers, roots, and stems). Multiple genes encoding pivotal enzymes, such as squalene synthase (SS), or transcription factors (TFs) related to triterpenoid saponin biosynthesis were identified and further analyzed. The expression levels of unigenes encoding important enzymes were verified by quantitative real-time PCR (qRT-PCR). Different chemical constituents of triterpenoid saponins were identified by Ultra-Performance Liquid Chromatography coupled with quadrupole time-of-flight mass spectrometry (UPLC/Q-TOF-MS). CONCLUSIONS: Our results greatly extend the public transcriptome dataset of C. gracile and provide valuable information for the identification of candidate genes involved in the biosynthesis of triterpenoid saponins and other important secondary metabolites.


Subject(s)
Magnoliopsida/genetics , Saponins/biosynthesis , Transcriptome , Triterpenes/metabolism , Biosynthetic Pathways/genetics , Farnesyl-Diphosphate Farnesyltransferase/chemistry , Magnoliopsida/enzymology , Magnoliopsida/metabolism , RNA-Seq , Real-Time Polymerase Chain Reaction , Saponins/chemistry , Secondary Metabolism/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Triterpenes/chemistry
14.
Front Psychiatry ; 11: 533949, 2020.
Article in English | MEDLINE | ID: mdl-33584354

ABSTRACT

The introduction of pre-trained language models in natural language processing (NLP) based on deep learning and the availability of electronic health records (EHRs) presents a great opportunity to transfer the "knowledge" learned from data in the general domain to enable the analysis of unstructured textual data in clinical domains. This study explored the feasibility of applying NLP to a small EHR dataset to investigate the power of transfer learning to facilitate the process of patient screening in psychiatry. A total of 500 patients were randomly selected from a medical center database. Three annotators with clinical experience reviewed the notes to make diagnoses for major/minor depression, bipolar disorder, schizophrenia, and dementia to form a small and highly imbalanced corpus. Several state-of-the-art NLP methods based on deep learning along with pre-trained models based on shallow or deep transfer learning were adapted to develop models to classify the aforementioned diseases. We hypothesized that the models that rely on transferred knowledge would be expected to outperform the models learned from scratch. The experimental results demonstrated that the models with the pre-trained techniques outperformed the models without transferred knowledge by micro-avg. and macro-avg. F-scores of 0.11 and 0.28, respectively. Our results also suggested that the use of the feature dependency strategy to build multi-labeling models instead of problem transformation is superior considering its higher performance and simplicity in the training process.
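The abstract above reports both micro- and macro-averaged F-scores, which can diverge sharply on an imbalanced corpus. The sketch below shows how the two averages are computed from per-class counts. The class names echo the diagnoses listed in the abstract, but every count is invented for illustration and does not come from the study.

```python
from typing import Dict, Tuple

Counts = Tuple[int, int, int]  # (true positives, false positives, false negatives)

def f1(tp: int, fp: int, fn: int) -> float:
    """Per-class F1; defined as 0.0 when the class has no predictions or labels."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0

def micro_f1(per_class: Dict[str, Counts]) -> float:
    """Pool the counts over all classes, then compute a single F1."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    return f1(tp, fp, fn)

def macro_f1(per_class: Dict[str, Counts]) -> float:
    """Compute F1 per class, then take the unweighted mean."""
    return sum(f1(*c) for c in per_class.values()) / len(per_class)

# Hypothetical counts: one frequent class and one rare class.
counts = {
    "depression": (90, 10, 10),
    "dementia": (1, 3, 4),
}

micro = micro_f1(counts)
macro = macro_f1(counts)
print(f"micro-avg F1: {micro:.4f}")
print(f"macro-avg F1: {macro:.4f}")
```

Because the macro average weights the rare class as heavily as the frequent one, it falls well below the micro average in this toy case, which is why gains on minority classes show up more strongly in macro-averaged scores.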

15.
Int J Med Inform ; 129: 122-132, 2019 09.
Article in English | MEDLINE | ID: mdl-31445246

ABSTRACT

BACKGROUND: Nowadays, social media are often used by the general public to create and share public messages related to their health. With the global increase in social media usage, there is a trend of posting information related to adverse drug reactions (ADR). Mining the social media data for this type of information will be helpful for pharmacological post-marketing surveillance and monitoring. Although the concept of using social media to facilitate pharmacovigilance is convincing, construction of automatic ADR detection systems remains a challenge because the corpora compiled from social media tend to be highly imbalanced, posing a major obstacle to the development of classifiers with reliable performance. METHODS: Several methods have been proposed to address the challenge of imbalanced corpora. However, we are not aware of any studies that investigated the effectiveness of the strategies for dealing with the problem of imbalanced data in the context of ADR detection from social media. In light of this, we evaluated a variety of imbalanced techniques and proposed a novel word embedding-based synthetic minority over-sampling technique (WESMOTE), which synthesizes new training examples from the sentence representation based on word embeddings. We compared the performance of all methods on two large imbalanced datasets released for the purpose of detecting ADR posts. RESULTS: In comparison with the state-of-the-art approaches, the classifiers that incorporated imbalanced classification techniques achieved comparable or better F-scores. All of our best performing configurations combined random under-sampling with techniques including the proposed WESMOTE, boosting and ensemble, implying that an integration of these approaches with under-sampling provides a reliable solution for large imbalanced social media datasets. Furthermore, ensemble-based methods like vote-based under-sampling (VUE) and random under-sampling boosting can be alternatives for the hybrid synthetic methods because both methods increase the diversity of the created weak classifiers, leading to better recall and overall F-scores for the minority classes. CONCLUSIONS: Data collected from social media are usually very large and highly imbalanced. In order to maximize the performance of a classifier trained on such data, applications of imbalanced strategies are required. We considered several practical methods for handling imbalanced Twitter data along with their performance on the binary classification task with respect to ADRs. In conclusion, the following practical insights are gained: 1) When dealing with text classification, the proposed word embedding-based synthetic minority over-sampling technique is more effective than traditional synthetic-based over-sampling methods. 2) In cases where large amounts of training data are available, the imbalanced strategies combined with under-sampling techniques are preferred. 3) Finally, employment of advanced methods does not guarantee better performance than simpler ones such as VUE, which achieved high performance with advantages like faster building time and ease of development.
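The abstract does not spell out how WESMOTE builds its synthetic examples, so the sketch below shows only the generic SMOTE idea applied to averaged word embeddings: pick a minority example, pick one of its nearest minority neighbours, and interpolate between the two. The toy two-dimensional embeddings, the token lists, and all function names are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List

Vector = List[float]

def sentence_vector(tokens: List[str], embeddings: Dict[str, Vector]) -> Vector:
    """Average the embeddings of the tokens that have one."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("no token has an embedding")
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]

def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))

def smote(minority: List[Vector], n_new: int, k: int, rng: random.Random) -> List[Vector]:
    """Generic SMOTE: interpolate between a sample and one of its k nearest neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (v for v in minority if v is not base),
            key=lambda v: squared_distance(base, v),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([x + gap * (y - x) for x, y in zip(base, neighbour)])
    return synthetic

# Hypothetical toy embeddings and minority-class (ADR) posts.
toy_embeddings = {
    "headache": [1.0, 0.0],
    "nausea": [0.0, 1.0],
    "after": [0.5, 0.5],
    "pill": [1.0, 1.0],
}
posts = [["headache", "after", "pill"], ["nausea", "after", "pill"], ["headache", "nausea"]]
minority = [sentence_vector(p, toy_embeddings) for p in posts]

synthetic = smote(minority, n_new=5, k=2, rng=random.Random(0))
for vec in synthetic:
    print([round(x, 3) for x in vec])
```

Each synthetic vector lies on a line segment between two real minority vectors, so it stays inside the region the minority class already occupies in embedding space.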


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Social Media , Awareness , Pharmacovigilance
16.
Plant Methods ; 15: 65, 2019.
Article in English | MEDLINE | ID: mdl-31289459

ABSTRACT

BACKGROUND: Polygonatum cyrtonema Hua (P. cyrtonema) is one of the most important herbs in traditional Chinese medicine. Polysaccharides in P. cyrtonema plants comprise a class of important secondary metabolites and exhibit a broad range of pharmacological functions. RESULTS: In order to identify genes involved in polysaccharide biosynthesis, we performed RNA sequencing analysis of leaf, root, and rhizome tissues of P. cyrtonema. A total of 164,573 unigenes were obtained by assembling transcripts from all three tissues and 86,063 of these were annotated in public databases. Differentially expressed genes (DEGs) were determined based on expression profile analysis, and DEG levels in rhizome tissues were then compared with their counterparts in leaf and root tissues. This analysis revealed numerous genes that were either up-regulated or uniquely expressed in the rhizome. Multiple genes encoding important enzymes, such as UDP glycosyltransferases (UGTs), or transcription factors involved in polysaccharide biosynthesis were identified and further analyzed, while a few genes encoding key enzymes were experimentally validated using quantitative real-time PCR. CONCLUSION: Our results substantially expand the public transcriptome dataset of P. cyrtonema and provide valuable clues for the identification of candidate genes involved in metabolic pathways.

17.
Zhongguo Zhong Yao Za Zhi ; 44(9): 1799-1807, 2019 May.
Article in Chinese | MEDLINE | ID: mdl-31342705

ABSTRACT

Chalcone synthase (CHS) and chalcone isomerase (CHI) are key enzymes in the flavonoid biosynthesis pathway. In this study, unigenes for CHS and CHI were screened from the transcriptome database of Arisaema heterophyllum. The open reading frames (ORFs) of chalcone synthase (AhCHS) and chalcone isomerase (AhCHI) were cloned from the plant by RT-PCR. The physicochemical properties, expression and structural characteristics of the encoded proteins AhCHS and AhCHI were analyzed. The ORFs of AhCHS and AhCHI were 1,176 and 630 bp in length and encoded 392 and 209 amino acids, respectively. AhCHS functioned as a symmetric homodimer. The N-terminal helix of one monomer entwined with the corresponding helix of the other monomer. Each CHS monomer consisted of two structural domains. In particular, four conserved residues defined the active site. The tertiary structure of AhCHI revealed a novel open-faced β-sandwich fold. A large β-sheet (β4-β11) and a layer of α-helices (α1-α7) comprised the core structure. The residues spanning β4, β5, α4, and α6 in the three-dimensional structure were conserved among CHIs from different species. Notably, these structural elements formed the active site on the protein surface, and the topology of the active-site cleft defined the stereochemistry of the cyclization reaction. The homology comparison showed that AhCHS had the highest similarity to the CHS of Anthurium andraeanum, while AhCHI had the highest similarity to the CHI of Paeonia delavayi. This study provides a basis for the functional study of AhCHS and AhCHI and for further study of the plant flavonoid biosynthesis pathway.


Subject(s)
Acyltransferases/genetics , Arisaema/enzymology , Intramolecular Lyases/genetics , Plant Proteins/genetics , Acyltransferases/chemistry , Arisaema/genetics , Cloning, Molecular , Intramolecular Lyases/chemistry , Plant Proteins/chemistry
18.
Int J Mol Sci ; 20(11), 2019 May 29.
Article in English | MEDLINE | ID: mdl-31146369

ABSTRACT

Clinopodium chinense (Benth.) O. Kuntze (C. chinense) is an important herb in traditional Chinese medicine. Triterpenoid saponins are a major class of active compounds in C. chinense with broad pharmacological activities and hemostatic, antitumor, and anti-hyperglycemic effects. To identify genes involved in triterpenoid saponin biosynthesis, transcriptomic analyses of leaves, stems, and roots from C. chinense were performed. A total of 135,968 unigenes were obtained by assembling the leaf, stem, and root transcripts, of which 102,154 were annotated in public databases. Differentially expressed genes were determined based on expression profile analysis and analyzed for differential expression of unique genes related to triterpenoid saponin biosynthesis. Multiple unigenes encoding crucial enzymes or transcription factors involved in triterpenoid saponin synthesis were identified and analyzed. The expression levels of unigenes encoding enzymes were experimentally validated using quantitative real-time PCR. This study greatly broadens the public transcriptome database for this species and provides a valuable resource for identifying candidate genes involved in the biosynthesis of triterpenoid saponins and other secondary metabolites.


Subject(s)
Genes, Plant , Lamiales/genetics , Saponins/biosynthesis , Transcriptome , Lamiales/metabolism , Saponins/genetics
19.
Database (Oxford) ; 2019, 2019 Jan 01.
Article in English | MEDLINE | ID: mdl-30809637

ABSTRACT

The detection of microRNA (miRNA) mentions in scientific literature helps researchers find relevant literature through queries formulated with miRNA information. Because most published biological studies describe signal transduction pathways or genetic regulatory information in figure captions, extracting miRNAs from both the main content and the figure captions of a manuscript is useful for aggregate and comparative analysis of published studies. In this study, we present a statistical principle-based miRNA recognition and normalization method to identify miRNAs and link them to identifiers in the Rfam database. As one of the core components in the text mining pipeline of the database miRTarBase, the proposed method combines the advantages of previous works relying on patterns, dictionaries and supervised learning and provides an integrated solution to the problem of miRNA identification. Furthermore, the knowledge learned from the training data is organized in a human-interpretable manner, making it possible to understand why the system considers a span of text to be a miRNA mention, and the represented knowledge can be further complemented by domain experts. We studied the ambiguity level of miRNA nomenclature to connect the miRNA mentions to the Rfam database and evaluated the performance of our approach on two datasets: the BioCreative VI Bio-ID corpus and the miRNA interaction corpus, extending the latter with additional Rfam normalization information. Our study highlights, and offers a better understanding of, the challenges associated with miRNA identification and normalization in scientific literature and the research gap that needs to be explored in future studies.


Subject(s)
MicroRNAs/metabolism , Publications , Statistics as Topic , Algorithms , Databases, Genetic , Internet , MicroRNAs/genetics , Molecular Sequence Annotation
20.
Database (Oxford) ; 2019, 2019 Jan 01.
Article in English | MEDLINE | ID: mdl-30689846

ABSTRACT

The Precision Medicine Initiative is a multicenter effort aimed at formulating personalized treatments by leveraging individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease association studies, electronic health records and other data types. The biomedical literature provides a rich foundation for populating these KBs, reporting genetic and molecular interactions that provide the scaffold for the cellular regulatory systems and detailing the influence of genetic variants on these interactions. The goal of the BioCreative VI Precision Medicine Track was to extract this particular type of information. The track was organized in two tasks: (i) a document triage task, focused on identifying scientific literature containing experimentally verified protein-protein interactions (PPIs) affected by genetic mutations, and (ii) a relation extraction task, focused on extracting the affected interactions (protein pairs). To assist system developers and task participants, a large-scale corpus of PubMed documents was manually annotated for this task. Ten teams worldwide contributed 22 distinct text-mining models for the document triage task, and six teams worldwide contributed 14 different text-mining systems for the relation extraction task. When comparing the text-mining system predictions with human annotations, for the triage task, the best F-score was 69.06%, the best precision was 62.89%, the best recall was 98.0% and the best average precision was 72.5%. For the relation extraction task, when taking homologous genes into account, the best F-score was 37.73%, the best precision was 46.5% and the best recall was 54.1%. Submitted systems explored a wide range of methods, from traditional rule-based, statistical and machine learning systems to state-of-the-art deep learning methods.
Given the level of participation and the individual team results, we find the precision medicine track to have been successful in engaging the text-mining research community. The track also produced a manually annotated corpus of 5509 PubMed documents, developed by BioGRID curators and relevant for precision medicine. The data set is freely available to the community, and the specific interactions have been integrated into the BioGRID data set. In addition, this challenge provided the first results on automatically identifying PubMed articles that describe PPIs affected by mutations, as well as on extracting the affected relations from those articles. Still, much progress is needed for computer-assisted precision medicine text mining to become mainstream. Future work should focus on addressing the remaining technical challenges and incorporating the practical benefits of text-mining tools into real-world precision medicine curation.


Subject(s)
Data Mining/methods , Databases, Protein , Mutation , Precision Medicine/methods , Protein Interaction Maps , Software , Computational Biology/methods , Humans , Mutation/genetics , Mutation/physiology , Protein Interaction Mapping , Protein Interaction Maps/genetics , Protein Interaction Maps/physiology
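
The triage results in the abstract above are reported as separate "best" precision, recall and F-score values. A short sketch of the standard F-score formula helps in reading them. It assumes the balanced F1 measure (the abstract does not state which F-score variant was used), and the function name is hypothetical.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # conventional value when both terms are zero
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Best triage precision and recall quoted in the abstract, as fractions.
best_precision = 0.6289
best_recall = 0.98

# F1 that a single system would reach if it achieved both values at once.
combined = f_score(best_precision, best_recall)
```

Under the F1 assumption, `combined` comes out at roughly 0.77, above the reported best F-score of 69.06%. This suggests that the best precision and the best recall were obtained by different submissions, so the three "best" figures should not be read as the profile of one system.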