1.
J Biomed Inform ; 155: 104657, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38772443

ABSTRACT

The increasing prevalence of overcrowding in Emergency Departments (EDs) threatens the effective delivery of urgent healthcare. Mitigation strategies include the deployment of monitoring systems capable of tracking and managing patient disposition to facilitate appropriate and timely care, which subsequently reduces patient revisits, optimizes resource allocation, and enhances patient outcomes. This study used ∼250,000 emergency department visit records from Taipei Medical University-Shuang Ho Hospital to develop a natural language processing model using BlueBERT, a biomedical domain-specific pre-trained language model, to predict patient disposition status and unplanned readmissions. Data preprocessing and the integration of both structured and unstructured data were central to our approach. BlueBERT outperformed the other models owing to its pre-training on a diverse range of medical literature, enabling it to better comprehend the specialized terminology, relationships, and context present in ED data. We found that translating Chinese-English clinical narratives into English and textualizing numerical data into categorical representations significantly improved the prediction of patient disposition (AUROC = 0.9014) and 72-hour unscheduled return visits (AUROC = 0.6475). The study concludes that the BlueBERT-based model demonstrated superior prediction capabilities, surpassing the performance of prior patient disposition predictive models, thus offering promising applications in the realm of ED clinical practice.
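
The "textualizing numerical data into categorical representations" step mentioned in the abstract can be pictured with a minimal sketch. The abstract does not give the bins, so the field names, cut-offs, and labels below are hypothetical placeholders chosen only to show the idea of turning numbers into words that a language model can read alongside the narrative.

```python
from bisect import bisect_right

# Hypothetical bins for illustration only; these are NOT the cut-offs used in
# the study. Each entry: field -> (display name, bin edges, labels per bin).
BINS = {
    "temperature_c": ("body temperature", [36.0, 38.0], ["low", "normal", "high"]),
    "heart_rate_bpm": ("heart rate", [60, 100], ["low", "normal", "high"]),
    "spo2_percent": ("oxygen saturation", [90, 95], ["low", "borderline", "normal"]),
}


def textualize(record):
    """Turn numeric fields into short categorical phrases.

    A value equal to a bin edge falls into the upper bin (bisect_right).
    """
    phrases = []
    for field, value in record.items():
        display, edges, labels = BINS[field]
        phrases.append(f"{display} {labels[bisect_right(edges, value)]}")
    return "; ".join(phrases)


visit = {"temperature_c": 38.5, "heart_rate_bpm": 110, "spo2_percent": 93}
print(textualize(visit))
```

Under this assumption, the resulting phrase string would be concatenated with the translated clinical narrative before tokenization, so that structured and unstructured inputs share one text channel.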


Subject(s)
Emergency Service, Hospital , Natural Language Processing , Patient Readmission , Emergency Service, Hospital/statistics & numerical data , Humans , Patient Readmission/statistics & numerical data , Female , Male , Adult , Middle Aged , Electronic Health Records , Narration , Aged
2.
Int J Nurs Stud ; 156: 104797, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38788263

ABSTRACT

BACKGROUND: ICU readmissions and post-discharge mortality pose significant challenges. Previous studies used EHRs and machine learning models, but mostly focused on structured data. Nursing records contain crucial unstructured information, but their utilization is challenging. Natural language processing (NLP) can extract structured features from clinical text. This study proposes the Crucial Nursing Description Extractor (CNDE) to predict post-ICU discharge mortality rates and identify high-risk patients for unplanned readmission by analyzing electronic nursing records. OBJECTIVE: To develop a deep neural network (NurnaNet) able to perceive nursing records, combined with a bio-clinical pre-trained language model (BioClinicalBERT), to analyze the electronic health records (EHRs) in the MIMIC-III dataset and predict patients' risk of death within six months and two years. DESIGN: A cohort and system development design was used. SETTING(S): The results were analyzed based on data extracted from MIMIC-III, a database of critically ill patients in the US between 2001 and 2012. PARTICIPANTS: We calculated patients' age using admission time and date of birth information from the MIMIC dataset. Patients under 18 or over 89 years old, or who died in the hospital, were excluded. We analyzed 16,973 nursing records from patients' ICU stays. METHODS: We have developed a technology called the Crucial Nursing Description Extractor (CNDE), which extracts key content from text. We use the logarithmic likelihood ratio to extract keywords and combine them with BioClinicalBERT. We predict the survival of discharged patients after six months and two years and evaluate the performance of the model using precision, recall, the F1-score, the receiver operating characteristic curve (ROC curve), the area under the curve (AUC), and the precision-recall curve (PR curve).
RESULTS: The research findings indicate that NurnaNet achieved good F1-scores (0.67030, 0.70874) within six months and two years. Compared to using BioClinicalBERT alone, there was an improvement in performance of 2.05 % and 1.08 % for predictions within six months and two years, respectively. CONCLUSIONS: CNDE can effectively reduce long-form records and extract key content. NurnaNet has a good F1-score in analyzing the data of nursing records, which helps to identify the risk of death of patients after leaving the hospital and adjust the regular follow-up and treatment plan of relevant medical care as soon as possible.
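
The abstract says keywords are extracted with a log-likelihood ratio but does not spell out the formulation. One widely used choice is the G² statistic on a 2x2 contingency table; the sketch below assumes that form, and the counts in the usage lines are invented for illustration.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table.

    k11: occurrences of the term in the target records
    k12: occurrences of all other terms in the target records
    k21: occurrences of the term in the reference records
    k22: occurrences of all other terms in the reference records
    """
    observed = ((k11, k12), (k21, k22))
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    total = rows[0] + rows[1]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# Hypothetical counts: a term that is far more frequent in the target records
# scores high; a term spread proportionally across both sets scores zero.
print(log_likelihood_ratio(30, 70, 5, 895))
print(log_likelihood_ratio(10, 20, 30, 60))
```

Ranking terms by this score and keeping the top ones is one plausible way to shorten long nursing notes before passing them to a language model; whether CNDE does exactly this is not stated in the abstract.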


Subject(s)
Neural Networks, Computer , Patient Discharge , Humans , Patient Discharge/statistics & numerical data , Nursing Records , Electronic Health Records , Middle Aged , Female , Aged , Male , Risk Assessment/methods , Natural Language Processing , Cohort Studies
3.
J Dent Sci ; 19(1): 542-549, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38303893

ABSTRACT

Background/purpose: Producing tooth crowns through dental technology is a basic function of dentistry. The morphology of tooth crowns is the most important parameter for evaluating its acceptability. The procedures were divided into four steps: tooth collection, scanning skills, use of mathematical methods and software, and machine learning calculation. Materials and methods: Dental plaster rods were prepared. The effective data collected were to classify 121 teeth (15th tooth position), 342 teeth (16th tooth position), 69 teeth (21st tooth position), and 89 teeth (43rd tooth position), for a total of 621 teeth. The procedures are divided into four steps: tooth collection, scanning skills, use of mathematical methods and software, and machine learning calculation. Results: The area under the curve (AUC) value was 0, 0.5, and 0.72 in this study. The precision rate and recall rate of micro-averaging/macro-averaging were 0.75/0.73 and 0.75/0.72. If we took a newly carved tooth picture into the program, the current effectiveness of machine learning was about 70%-75% to evaluate the quality of tooth morphology. Through the calculation and analysis of the two different concepts of micro-average/macro-average and AUC, similar values could be obtained. Conclusion: This study established a set of procedures that can judge the quality of hand-carved plaster sticks and teeth, and the accuracy rate is about 70%-75%. It is expected that this process can be used to assist dental technicians in judging the pros and cons of hand-carved plaster sticks and teeth, so as to help dental technicians to learn the tooth morphology more effectively.

4.
J Tissue Eng ; 14: 20417314231196212, 2023.
Article in English | MEDLINE | ID: mdl-37661967

ABSTRACT

Current clinical treatments for lymphedema provide promising results, but also result in donor site morbidities. The establishment of a microenvironment optimized for lymphangiogenesis can be an alternative way to enhance lymphatic tissue formation. Hemodynamic flow stimuli have been confirmed to have an influential effect on angiogenesis in tissue engineering, but not on lymphatic vessel formation. Here, three in vivo scaffolds exposed to different blood flow stimuli (in the subcutaneous layer, in the flow-through pedicle, and in an arterio-venous (AV) loop model) were created to investigate the lymphangiogenic potential of scaffolds containing lymphatic endothelial cells (LECs). Our results indicated that the AV loop model displayed better lymphangiogenesis than the other two models, which had slower flow or no stimuli. In addition to hemodynamic force, the supplement of LECs is required for lymphatic vessel regeneration. The in vivo scaffold generated from the AV loop model provides an effective approach for engineering lymphatic tissue in the clinical treatment of lymphedema.

5.
Database (Oxford) ; 2023, 2023 03 07.
Article in English | MEDLINE | ID: mdl-36882099

ABSTRACT

The BioCreative National Library of Medicine (NLM)-Chem track calls for a community effort to fine-tune automated recognition of chemical names in the biomedical literature. Chemicals are one of the most searched biomedical entities in PubMed, and, as highlighted during the coronavirus disease 2019 pandemic, their identification may significantly advance research in multiple biomedical subfields. While previous community challenges focused on identifying chemical names mentioned in titles and abstracts, the full text contains valuable additional detail. We, therefore, organized the BioCreative NLM-Chem track as a community effort to address automated chemical entity recognition in full-text articles. The track consisted of two tasks: (i) chemical identification and (ii) chemical indexing. The chemical identification task required predicting all chemicals mentioned in recently published full-text articles, both span [i.e. named entity recognition (NER)] and normalization (i.e. entity linking), using Medical Subject Headings (MeSH). The chemical indexing task required identifying which chemicals reflect topics for each article and should therefore appear in the listing of MeSH terms for the document in the MEDLINE article indexing. This manuscript summarizes the BioCreative NLM-Chem track and post-challenge experiments. We received a total of 85 submissions from 17 teams worldwide. The highest performance achieved for the chemical identification task was 0.8672 F-score (0.8759 precision and 0.8587 recall) for strict NER performance and 0.8136 F-score (0.8621 precision and 0.7702 recall) for strict normalization performance. The highest performance achieved for the chemical indexing task was 0.6073 F-score (0.7417 precision and 0.5141 recall).
This community challenge demonstrated that (i) the current substantial achievements in deep learning technologies can be utilized to improve automated prediction accuracy further and (ii) the chemical indexing task is substantially more challenging. We look forward to further developing biomedical text-mining methods to respond to the rapid growth of biomedical literature. The NLM-Chem track dataset and other challenge materials are publicly available at https://ftp.ncbi.nlm.nih.gov/pub/lu/BC7-NLM-Chem-track/. Database URL https://ftp.ncbi.nlm.nih.gov/pub/lu/BC7-NLM-Chem-track/.
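
The reported F-scores can be checked against the reported precision and recall. The sketch below assumes the track's F-score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the precision, recall, and F-score figures are copied from the abstract.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) as given in the abstract.
reported = {
    "strict NER": (0.8759, 0.8587, 0.8672),
    "strict normalization": (0.8621, 0.7702, 0.8136),
    "chemical indexing": (0.7417, 0.5141, 0.6073),
}

for task, (p, r, f) in reported.items():
    print(f"{task}: computed {f_score(p, r):.4f}, reported {f:.4f}")
```

Under that assumption the three computed values agree with the reported ones to four decimal places, and the gap between precision and recall on the indexing task shows why its F-score is so much lower.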


Subject(s)
COVID-19 , United States , Humans , National Library of Medicine (U.S.) , Data Mining , Databases, Factual , MEDLINE
6.
Medicina (Kaunas) ; 59(2), 2023 Jan 18.
Article in English | MEDLINE | ID: mdl-36837398

ABSTRACT

Background and Objectives. Anxiety and depressive disorders are the most prevalent mental disorders, and due to the COVID-19 pandemic, more people are suffering from anxiety and depressive disorders, and a considerable fraction of COVID-19 survivors have a variety of persistent neuropsychiatric problems after the initial infection. Traditional Chinese Medicine (TCM) offers a different perspective on mental disorders from Western biomedicine. Effective management of mental disorders has become an increasing concern in recent decades due to the high social and economic costs involved. This study attempts to express and ontologize the relationships between different mental disorders and physical organs from the perspective of TCM, so as to bridge the gap between the unique terminology used in TCM and a medical professional. Materials and Methods. Natural language processing (NLP) is introduced to quantify the importance of different mental disorder descriptions relative to the five depots and two palaces, stomach and gallbladder, through the classical medical text Huangdi Neijing and construct a mental disorder ontology based on the TCM classic text. Results. The results demonstrate that our proposed framework integrates NLP and data visualization, enabling clinicians to gain insights into mental health, in addition to biomedicine. According to the results of the relationship analysis of mental disorders, depots, palaces, and symptoms, the organ/depot most related to mental disorders is the heart, and the two most important emotion factors associated with mental disorders are anger and worry & think. The mental disorders described in TCM are related to more than one organ (depot/palace). Conclusion. This study complements recent research delving into co-relations or interactions between mental status and other organs and systems.


Subject(s)
COVID-19 , Mental Disorders , Humans , Medicine, Chinese Traditional/methods , Data Visualization , Pandemics , Data Mining
7.
J Biomed Inform ; 138: 104284, 2023 02.
Article in English | MEDLINE | ID: mdl-36632861

ABSTRACT

Since early identification of potential critical patients in the Emergency Department (ED) can lower mortality and morbidity, this study seeks to develop a machine learning model capable of predicting possible critical outcomes based on the history and vital signs routinely collected at triage. We compare emergency physicians and the predictive performance of the machine learning model. Predictors including patients' chief complaints, present illness, past medical history, vital signs, and demographic data of adult patients (aged ≥ 18 years) visiting the ED at Shuang-Ho Hospital in New Taipei City, Taiwan, are extracted from the hospital's electronic health records. Critical outcomes are defined as in-hospital cardiac arrest (IHCA) or intensive care unit (ICU) admission. A clinical narrative-aware deep neural network was developed to handle the text-intensive data and standardized numerical data, which is compared against other machine learning models. After this, emergency physicians were asked to predict possible clinical outcomes of thirty visits that were extracted randomly from our dataset, and their results were further compared to our machine learning model. A total of 4,308 (2.5 %) out of the 171,275 adult visits to the ED included in this study resulted in critical outcomes. The area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) of our proposed prediction model is 0.874 and 0.207, respectively, which not only outperforms the other machine learning models, but even has better sensitivity (0.95 vs 0.41) and accuracy (0.90 vs 0.67) as compared to the emergency physicians. This model is sensitive and accurate in predicting critical outcomes and highlights the potential to use predictive analytics to support post-triage decision-making.


Subject(s)
Emergency Service, Hospital , Hospitalization , Adult , Humans , Prognosis , Neural Networks, Computer , Machine Learning , Retrospective Studies
8.
Database (Oxford) ; 2022, 2022 08 31.
Article in English | MEDLINE | ID: mdl-36043400

ABSTRACT

The coronavirus disease 2019 (COVID-19) pandemic has been severely impacting global society since December 2019. The related findings such as vaccine and drug development have been reported in biomedical literature at a rate of about 10 000 articles on COVID-19 per month. Such rapid growth significantly challenges manual curation and interpretation. For instance, LitCovid is a literature database of COVID-19-related articles in PubMed, which has accumulated more than 200 000 articles with millions of accesses each month by users worldwide. One primary curation task is to assign up to eight topics (e.g. Diagnosis and Treatment) to the articles in LitCovid. The annotated topics have been widely used for navigating the COVID literature, rapidly locating articles of interest and other downstream studies. However, annotating the topics has been the bottleneck of manual curation. Despite the continuing advances in biomedical text-mining methods, few have been dedicated to topic annotations in COVID-19 literature. To close the gap, we organized the BioCreative LitCovid track to call for a community effort to tackle automated topic annotation for COVID-19 literature. The BioCreative LitCovid dataset, consisting of over 30 000 articles with manually reviewed topics, was created for training and testing. It is one of the largest multi-label classification datasets in biomedical scientific literature. Nineteen teams worldwide participated and made 80 submissions in total. Most teams used hybrid systems based on transformers. The highest performing submissions achieved 0.8875, 0.9181 and 0.9394 for macro-F1-score, micro-F1-score and instance-based F1-score, respectively. Notably, these scores are substantially higher (e.g. 12% higher for macro F1-score) than the corresponding scores of the state-of-the-art multi-label classification method. The level of participation and results demonstrate a successful track and help close the gap between dataset curation and method development.
The dataset is publicly available via https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/ for benchmarking and further development. Database URL https://ftp.ncbi.nlm.nih.gov/pub/lu/LitCovid/biocreative/.
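
The three averages named in the abstract (macro, micro, and instance-based F1) weight errors differently in a multi-label setting. The sketch below uses common textbook definitions, which may differ in detail from the track's official scorer; the four toy articles and their topic sets are invented for illustration.

```python
def f1_from_counts(tp, fp, fn):
    """F1 expressed with counts: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_f1(gold, pred):
    """Return (macro, micro, instance-based) F1 for lists of label sets."""
    labels = sorted(set().union(*gold, *pred))
    counts = {label: [0, 0, 0] for label in labels}  # tp, fp, fn per label
    instance_scores = []
    for g, p in zip(gold, pred):
        for label in labels:
            if label in g and label in p:
                counts[label][0] += 1
            elif label in p:
                counts[label][1] += 1
            elif label in g:
                counts[label][2] += 1
        instance_scores.append(f1_from_counts(len(g & p), len(p - g), len(g - p)))
    # Macro: average the per-label F1 scores, so rare labels count equally.
    macro = sum(f1_from_counts(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts over all labels, so frequent labels dominate.
    totals = [sum(c[i] for c in counts.values()) for i in range(3)]
    micro = f1_from_counts(*totals)
    # Instance-based: average the per-article F1 scores.
    instance = sum(instance_scores) / len(instance_scores)
    return macro, micro, instance


# Hypothetical gold and predicted topic sets for four articles.
gold = [{"Treatment", "Diagnosis"}, {"Prevention"}, {"Treatment"}, {"Diagnosis"}]
pred = [{"Treatment"}, {"Prevention", "Treatment"}, {"Treatment"}, {"Diagnosis", "Prevention"}]
macro, micro, instance = multilabel_f1(gold, pred)
print(f"macro={macro:.4f} micro={micro:.4f} instance={instance:.4f}")
```

Because macro averaging gives every topic the same weight, it is usually the hardest of the three to push up when some topics are rare, which is consistent with the macro score being the lowest of the three reported values.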


Subject(s)
COVID-19 , COVID-19/epidemiology , Data Mining/methods , Databases, Factual , Humans , PubMed , Publications
9.
Database (Oxford) ; 2022, 2022 07 15.
Article in English | MEDLINE | ID: mdl-35849027

ABSTRACT

In this research, we explored various state-of-the-art biomedical-specific pre-trained Bidirectional Encoder Representations from Transformers (BERT) models for the National Library of Medicine - Chemistry (NLM CHEM) and LitCovid tracks in the BioCreative VII Challenge, and proposed a BERT-based ensemble learning approach to integrate the advantages of various models to improve the system's performance. The experimental results of the NLM-CHEM track demonstrate that our method can achieve remarkable performance, with F1-scores of 85% and 91.8% in strict and approximate evaluations, respectively. Moreover, the proposed Medical Subject Headings identifier (MeSH ID) normalization algorithm is effective in entity normalization, achieving an F1-score of about 80% in both strict and approximate evaluations. For the LitCovid track, the proposed method is also effective in detecting topics in the Coronavirus disease 2019 (COVID-19) literature, outperforming the compared methods and achieving state-of-the-art performance on the LitCovid corpus. Database URL: https://www.ncbi.nlm.nih.gov/research/coronavirus/.


Subject(s)
COVID-19 , Data Mining , Data Mining/methods , Humans , Machine Learning , Medical Subject Headings , PubMed
10.
JMIR Public Health Surveill ; 8(7): e34583, 2022 07 13.
Article in English | MEDLINE | ID: mdl-35830225

ABSTRACT

BACKGROUND: Globalization and environmental changes have intensified the emergence or re-emergence of infectious diseases worldwide, such as outbreaks of dengue fever in Southeast Asia. Collaboration on region-wide infectious disease surveillance systems is therefore critical but difficult to achieve because of the different transparency levels of health information systems in different countries. Although the Program for Monitoring Emerging Diseases (ProMED)-mail is the most comprehensive international expert-curated platform providing rich disease outbreak information on humans, animals, and plants, the unstructured text content of the reports makes analysis for further application difficult. OBJECTIVE: To make monitoring the epidemic situation in Southeast Asia more efficient, this study aims to develop an automatic summary of the alert articles from ProMED-mail, a huge textual data source. In this paper, we proposed a text summarization method that uses natural language processing technology to automatically extract important sentences from alert articles in ProMED-mail emails to generate summaries. Using our method, we can quickly capture crucial information to help make important decisions regarding epidemic surveillance. METHODS: Our data, which span a period from 1994 to 2019, come from the ProMED-mail website. We analyzed the collected data to establish a unique Taiwan dengue corpus that was validated with professionals' annotations to achieve almost perfect agreement (Cohen κ=90%). To generate a ProMED-mail summary, we developed a dual-channel bidirectional long short-term memory with attention mechanism with infused latent syntactic features to identify key sentences from the alerting article. RESULTS: Our method is superior to many well-known machine learning and neural network approaches in identifying important sentences, achieving a macroaverage F1 score of 93%. 
Moreover, it can successfully extract the relevant correct information on dengue fever from a ProMED-mail alerting article, which can help researchers or general users to quickly understand the essence of the alerting article at first glance. In addition to verifying the model, we also recruited 3 professional experts and 2 students from related fields to participate in a satisfaction survey on the generated summaries, and the results show that 84% (63/75) of the summaries received high satisfaction ratings. CONCLUSIONS: The proposed approach successfully fuses latent syntactic features into a deep neural network to analyze the syntactic, semantic, and contextual information in the text. It then exploits the derived information to identify crucial sentences in the ProMED-mail alerting article. The experiment results show that the proposed method is not only effective but also outperforms the compared methods. Our approach also demonstrates the potential for case summary generation from ProMED-mail alerting articles. In terms of practical application, when a new alerting article arrives, our method can quickly identify the relevant case information, which is the most critical part, to use as a reference or for further analysis.
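
The annotation agreement reported in the abstract is Cohen's κ, which corrects raw agreement for the agreement expected by chance. The sketch below is a minimal implementation of the standard two-rater formula; the sentence labels in the usage lines are invented and do not come from the study's corpus.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:  # both raters always use the same single label
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels: 1 = key sentence, 0 = not a key sentence.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(round(cohen_kappa(annotator_1, annotator_2), 4))
```

In this toy case the raters agree on 9 of 10 sentences, yet κ is noticeably lower than 0.9 because much of that agreement would be expected by chance when most sentences are labeled 0.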


Subject(s)
Communicable Diseases , Dengue , Algorithms , Animals , Communicable Diseases/epidemiology , Dengue/epidemiology , Humans , Linguistics , Memory, Short-Term , Postal Service
11.
J Pers Med ; 12(3), 2022 Mar 08.
Article in English | MEDLINE | ID: mdl-35330417

ABSTRACT

Radiology report generation through chest radiography interpretation is a time-consuming task that involves the interpretation of images by expert radiologists. It is common for fatigue-induced diagnostic error to occur, and especially difficult in areas of the world where radiologists are not available or lack diagnostic expertise. In this research, we proposed a multi-objective deep learning model called CT2Rep (Computed Tomography to Report) for generating lung radiology reports by extracting semantic features from lung CT scans. A total of 458 CT scans were used in this research, from which 107 radiomics features and 6 slices of segmentation related nodule features were extracted for the input of our model. The CT2Rep can simultaneously predict position, margin, and texture, which are three important indicators of lung cancer, and achieves remarkable performance with an F1-score of 87.29%. We conducted a satisfaction survey for estimating the practicality of CT2Rep, and the results show that 95% of the reports received satisfactory ratings. The results demonstrate the great potential in this model for the production of robust and reliable quantitative lung diagnosis reports. Medical personnel can obtain important indicators simply by providing the lung CT scan to the system, which can bring about the widespread application of the proposed framework.

12.
Int J Clin Pharmacol Ther ; 60(1): 46-51, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34647866

ABSTRACT

Tumor necrosis factor (TNF) α inhibitors are widely used to treat inflammatory bowel disease (IBD); however, some patients have unexpected inflammatory episodes during anti-TNF therapy. The objective of our research was to highlight a paradoxical case of anti-TNF-agent-induced Sweet syndrome compared with Sweet syndrome treated by anti-TNF agents. We describe a 62-year-old male with a history of ulcerative colitis presenting with multiple polymorphic indurated skin macules and plaques after 2 months of adalimumab therapy. Neutrophilic dermatosis was diagnosed based on the clinical presentation and skin biopsy and may have resulted from extraintestinal manifestations of a flare-up of IBD or been induced by adalimumab therapy. We conclude that when facing this dilemma, adalimumab should be discontinued, and the dose of prednisolone should be increased before determining the definitive cause. Based on drug hypersensitivity syndrome (DHS) risk assessment in the 10-D assessment system, this case was classified as grade 1 (no risk). Finally, we review the molecular and cellular mechanisms connecting cytokine dysregulation to Sweet syndrome.


Subject(s)
Colitis, Ulcerative , Inflammatory Bowel Diseases , Sweet Syndrome , Adalimumab/adverse effects , Colitis, Ulcerative/chemically induced , Colitis, Ulcerative/drug therapy , Humans , Infliximab , Male , Middle Aged , Sweet Syndrome/chemically induced , Sweet Syndrome/diagnosis , Tumor Necrosis Factor Inhibitors , Tumor Necrosis Factor-alpha
13.
Microsurgery ; 41(8): 762-771, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34617323

ABSTRACT

INTRODUCTION: The medical demand for lymphedema treatment is huge since the disease mechanism remains unclear and management is difficult. Our purpose was to develop a reliable lymphedema model that mimics the clinical scenario and allows a microsurgical approach. MATERIALS AND METHODS: Male Lewis rats weighing 400 to 450 g were used to create lymphedema with groin and popliteal lymph node dissection and creation of a 5 mm circumferential skin defect (n = 6). A skin incision was made and closed primarily for the control group (n = 5). Evaluation included indocyanine green (ICG) lymphangiography 1 and 2 months postoperatively, the volume difference between bilateral hindlimbs measured using micro-CT, and histological evaluation of skin harvested 2 months postoperatively. RESULTS: Larger volume differences were present in the lymphedema group (17.50 ± 7.76 vs. 3.73 ± 2.66%, p < .05). ICG lymphangiography indicated dermal backflow only in the lymphedema group. Increased thickness of the epidermis was noted in the lymphedema group (28.50 ± 12.61 µm vs. 15.10 ± 5.41 µm, p < .0001). More CD45+ (35.6 ± 26.68 vs. 2.8 ± 4.23 cells/high power field [HPF], p < .0001), CD3+ (38.39 ± 20.17 vs. 9.73 ± 8.62 cells/HPF, p < .0001), and CD4+ cell infiltration (11.7 ± 7.71 vs. 2.0 ± 2.67 cells/HPF, p < .0001) were observed in the lymphedema group. Collagen type I deposition was greater in the lymphedema group (0.15 ± 0.06 vs. 0.07 ± 0.03, p < .0005). CONCLUSIONS: A rat lymphedema model was successfully established. The model can be applied in lymphedema-related research.


Subject(s)
Lymphedema , Animals , Lymph Node Excision , Lymph Nodes , Lymphedema/etiology , Lymphedema/surgery , Lymphography , Male , Rats , Rats, Inbred Lew
14.
Diagnostics (Basel) ; 11(6), 2021 Jun 09.
Article in English | MEDLINE | ID: mdl-34207578

ABSTRACT

We aimed to develop and validate a model for predicting mortality in patients with angina across the spectrum of dysglycemia. A total of 1479 patients admitted for coronary angiography due to angina were enrolled. All-cause mortality served as the primary endpoint. The models were validated with five-fold cross validation to predict long-term mortality. The features selected by least absolute shrinkage and selection operator (LASSO) were age, heart rate, plasma glucose levels at 30 min and 120 min during an oral glucose tolerance test (OGTT), the use of angiotensin II receptor blockers, the use of diuretics, and smoking history. This best performing model was built using a random survival forest with selected features. It had a good discriminative ability (Harrell's C-index: 0.829) and acceptable calibration (Brier score: 0.08) for predicting long-term mortality. Among patients with obstructive coronary artery disease confirmed by angiography, our model outperformed the Global Registry of Acute Coronary Events discharge score for mortality prediction (Harrell's C-index: 0.829 vs. 0.739, p < 0.001). In conclusion, we developed a machine learning model to predict long-term mortality among patients with angina. With the integration of OGTT, the model could help to identify a high risk of mortality across the spectrum of dysglycemia.
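
Harrell's C-index, the discrimination measure quoted in the abstract, is the fraction of comparable patient pairs in which the patient who died earlier was assigned the higher risk. The sketch below is a simplified version that skips pairs with tied follow-up times; the follow-up data and risk scores are invented for illustration and do not come from the study.

```python
def harrell_c_index(times, events, risks):
    """Simplified Harrell's C-index.

    times:  follow-up time for each patient
    events: 1 if death was observed, 0 if the patient was censored
    risks:  model risk score (higher means higher predicted risk)

    A pair (i, j) is comparable when patient i has the shorter follow-up time
    and an observed event. Pairs with tied times are skipped in this sketch.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (years), event indicators, and risk scores.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.5, 0.1]
print(harrell_c_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for comparing the 0.829 and 0.739 figures reported in the abstract.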

16.
Nutrients ; 13(6), 2021 May 23.
Article in English | MEDLINE | ID: mdl-34071009

ABSTRACT

Dining is an essential part of human life. In pursuit of a healthier self, more and more people enjoy homemade cuisine. Consequently, the number of recipe websites has increased significantly. These online recipes represent different cultures and cooking methods from various regions, and provide important indications of nutritional content. In recent years, the development of data science has made data mining a popular research area. However, only a few studies in Taiwan have applied data mining to recipes and nutrients. Therefore, this work aims to use machine learning models to discover health-related insights from recipes on social media. First, we collected over 15,000 Chinese recipes from the largest recipe website in Taiwan to build a recipe database. We then extracted information from this dataset through natural language processing methodologies so as to better understand the characteristics of various cuisines and ingredients. This allowed us to establish a classification model for the automatic categorization of recipes. We further performed cluster analysis for grouping nutrients to recognize the nutritional differences for each cluster and each cuisine type. The results showed that the support vector machine (SVM) model can successfully classify recipes with an average F-score of 82%. We also analyzed the nutritional value of different cuisine categories and the possible health effects they may bring to consumers. Our methods and findings can assist future work on extracting essential nutritional information from recipes and promoting healthier diets.


Subject(s)
Cookbooks as Topic , Data Mining , Diet , Nutritive Value , Social Media , Cluster Analysis , Cooking , Data Management , Diet, Healthy , Humans , Machine Learning , Models, Theoretical , Nutrients , Nutritional Status , Taiwan
18.
Bioinformatics ; 37(3): 404-412, 2021 04 20.
Article in English | MEDLINE | ID: mdl-32810217

ABSTRACT

MOTIVATION: Natural Language Processing techniques are constantly being advanced to accommodate the influx of data as well as to provide exhaustive and structured knowledge dissemination. Within the biomedical domain, relation detection between bio-entities known as the Bio-Entity Relation Extraction (BRE) task has a critical function in knowledge structuring. Although recent advances in deep learning-based biomedical domain embedding have improved BRE predictive analytics, these works are often task selective or use external knowledge-based pre-/post-processing. In addition, deep learning-based models do not account for local syntactic contexts, which have improved data representation in many kernel classifier-based models. In this study, we propose a universal BRE model, i.e. LBERT, which is a Lexically aware Transformer-based Bidirectional Encoder Representation model, and which explores both local and global contexts representations for sentence-level classification tasks. RESULTS: This article presents one of the most exhaustive BRE studies ever conducted over five different bio-entity relation types. Our model outperforms state-of-the-art deep learning models in protein-protein interaction (PPI), drug-drug interaction and protein-bio-entity relation classification tasks by 0.02%, 11.2% and 41.4%, respectively. LBERT representations show a statistically significant improvement over BioBERT in detecting true bio-entity relation for large corpora like PPI. Our ablation studies clearly indicate the contribution of the lexical features and distance-adjusted attention in improving prediction performance by learning additional local semantic context along with bi-directionally learned global context. AVAILABILITY AND IMPLEMENTATION: Github. https://github.com/warikoone/LBERT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Knowledge Bases , Natural Language Processing , Language , Research Design , Semantics
19.
PLoS One ; 14(10): e0223317, 2019.
Article in English | MEDLINE | ID: mdl-31647844

ABSTRACT

As user-generated content increasingly proliferates through social networking sites, our lives are bombarded with ever more information, which in turn has inspired the rapid evolution of new technologies and tools to process these vast amounts of data. Semantic and sentiment analysis of these social multimedia have become key research topics in many areas in society, e.g., in shopping malls to help policymakers predict market trends and discover potential customers. In this light, this study proposes a novel method to analyze the emotional aspects of Chinese vocabulary and then to assess the mass comments of movie reviews. The experiment results show that our method 1. can improve the machine learning model by providing more refined emotional information to enhance the effectiveness of movie recommendation systems, and 2. performs significantly better than the other commonly used methods of emotional analysis.


Subject(s)
Emotions , Models, Theoretical , Social Media , Algorithms , Humans
20.
J Am Med Inform Assoc ; 26(11): 1227-1236, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31390470

ABSTRACT

OBJECTIVE: In this era of digitized health records, there has been a marked interest in using de-identified patient records for conducting various health related surveys. To assist in this research effort, we developed a novel clinical data representation model entitled medical knowledge-infused convolutional neural network (MKCNN), which is used for learning the clinical trial criteria eligibility status of patients to participate in cohort studies. MATERIALS AND METHODS: In this study, we propose a clinical text representation infused with medical knowledge (MK). First, we isolate the noise from the relevant data using a medically relevant description extractor; then we utilize log-likelihood ratio based weights from selected sentences to highlight "met" and "not-met" knowledge-infused representations in a bichannel setting for each instance. The combined medical knowledge-infused representation (MK) from these modules helps identify significant clinical criteria semantics, which in turn renders effective learning when used with a convolutional neural network architecture. RESULTS: MKCNN outperforms other Medical Knowledge (MK) relevant learning architectures by approximately 3%, notably the SVM and XGBoost implementations developed in this study. MKCNN scored 86.1% on the F1 metric, a gain of 6% above the average performance assessed from the submissions for the n2c2 task. Although pattern/rule-based methods show a higher average performance for the n2c2 clinical data set, MKCNN significantly improves performance of machine learning implementations for clinical datasets. CONCLUSION: MKCNN scored 86.1% on the F1 score metric. In contrast to many of the rule-based systems introduced during the n2c2 challenge workshop, our system presents a model that heavily draws on machine-based learning. In addition, the MK representations add more value to clinical comprehension and interpretation of natural texts.


Subject(s)
Clinical Trials as Topic/methods , Data Mining/methods , Deep Learning , Neural Networks, Computer , Patient Selection , Humans , Natural Language Processing , Support Vector Machine