Results 1 - 20 of 67
1.
BMC Med Res Methodol ; 24(1): 108, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38724903

ABSTRACT

OBJECTIVE: Systematic literature reviews (SLRs) are critical for life-science research. However, the manual selection and retrieval of relevant publications can be a time-consuming process. This study aims to (1) develop two disease-specific annotated corpora, one for human papillomavirus (HPV) associated diseases and the other for pneumococcal-associated pediatric diseases (PAPD), and (2) optimize machine- and deep-learning models to facilitate automation of the SLR abstract screening. METHODS: This study constructed two disease-specific SLR screening corpora for HPV and PAPD, which contained citation metadata and corresponding abstracts. Performance was evaluated using precision, recall, accuracy, and F1-score of multiple combinations of machine- and deep-learning algorithms and features such as keywords and MeSH terms. RESULTS AND CONCLUSIONS: The HPV corpus contained 1697 entries, with 538 relevant and 1159 irrelevant articles. The PAPD corpus included 2865 entries, with 711 relevant and 2154 irrelevant articles. Adding features beyond title and abstract improved the performance (measured in accuracy) of machine learning models by 3% for the HPV corpus and 2% for the PAPD corpus. Transformer-based deep learning models consistently outperformed conventional machine learning algorithms, highlighting the strength of domain-specific pre-trained language models for SLR abstract screening. This study provides a foundation for the development of more intelligent SLR systems.


Subjects
Machine Learning , Papillomavirus Infections , Humans , Papillomavirus Infections/diagnosis , Economics, Medical , Algorithms , Outcome Assessment, Health Care/methods , Deep Learning , Abstracting and Indexing/methods
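
The four screening metrics named in the abstract above (precision, recall, accuracy, F1-score) can all be derived from confusion-matrix counts. The following is a minimal sketch, not the study's code; the function name `screening_metrics` and the example counts are hypothetical.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall, accuracy and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Hypothetical counts for illustration only (not taken from the study).
metrics = screening_metrics(tp=80, fp=20, fn=20, tn=180)
```

Because relevant articles are the minority class in both corpora, recall and F1 are usually more informative than accuracy alone for screening tasks of this kind.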
2.
Angew Chem Int Ed Engl ; : e202409296, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38923710

ABSTRACT

Among the various types of materials with intrinsic porosity, porous organic cages (POCs) are distinctive as discrete molecules that possess intrinsic cavities and extrinsic channels capable of facilitating molecular sieving. However, the fabrication of POC membranes remains highly challenging owing to weak noncovalent intermolecular interactions, and most reported POCs are powders. In this study, we constructed crystalline free-standing porous organic cage membranes by fortifying intermolecular interactions through the induction of intramolecular hydrogen bonds, which was confirmed by single-crystal X-ray analysis. To elucidate the underlying driving forces, a series of terephthalaldehyde building blocks containing different substituents were reacted with a flexible triamine under different conditions via interfacial polymerization (IP). Furthermore, density functional theory (DFT) calculations suggest that intramolecular hydrogen bonding can significantly boost the intermolecular interactions. The resulting membranes exhibited fast solvent permeance and high rejection of dyes not only in water, but also in organic solvents. In addition, the membrane demonstrated excellent performance in precise molecular sieving in organic solvents. This work opens an avenue to designing and fabricating free-standing membranes composed of porous organic materials for efficient molecular sieving.

3.
BMC Bioinformatics ; 24(Suppl 3): 477, 2023 Dec 15.
Article in English | MEDLINE | ID: mdl-38102593

ABSTRACT

BACKGROUND: As more clinical trials offer optional participation in the collection of bio-specimens for biobanking, the requirements of informed consent forms are becoming increasingly complex. The aim of this study is to develop an automatic natural language processing (NLP) tool to annotate informed consent documents to promote biorepository data regulation, sharing, and decision support. We collected informed consent documents from several publicly available sources, then manually annotated them, covering sentences containing permission information about the sharing of either bio-specimens or donor data, or conducting genetic research or future research using bio-specimens or donor data. RESULTS: We evaluated a variety of machine learning algorithms including random forest (RF) and support vector machine (SVM) for the automatic identification of these sentences. 120 informed consent documents containing 29,204 sentences were annotated, of which 1250 sentences (4.28%) provide answers to a permission question. A support vector machine (SVM) model achieved an F1 score of 0.95 on classifying the sentences when using a gold standard, which is a prefiltered corpus containing all relevant sentences. CONCLUSIONS: This study demonstrates the feasibility of using machine learning tools to classify permission-related sentences in informed consent documents.


Subjects
Biological Specimen Banks , Consent Forms , Machine Learning , Algorithms , Natural Language Processing
4.
BMC Bioinformatics ; 23(Suppl 6): 407, 2022 Sep 30.
Article in English | MEDLINE | ID: mdl-36180861

ABSTRACT

BACKGROUND: To date, there are no effective treatments for most neurodegenerative diseases. Knowledge graphs can provide comprehensive and semantic representation for heterogeneous data, and have been successfully leveraged in many biomedical applications including drug repurposing. Our objective is to construct a knowledge graph from literature to study the relations between Alzheimer's disease (AD) and chemicals, drugs and dietary supplements in order to identify opportunities to prevent or delay neurodegenerative progression. We collected biomedical annotations and extracted their relations using SemRep via SemMedDB. We used both a BERT-based classifier and rule-based methods during data preprocessing to exclude noise while preserving most AD-related semantic triples. The 1,672,110 filtered triples were used to train knowledge graph completion algorithms (i.e., TransE, DistMult, and ComplEx) to predict candidates that might be helpful for AD treatment or prevention. RESULTS: Among the three knowledge graph completion models, TransE outperformed the other two (MR = 10.53, Hits@1 = 0.28). We leveraged the time-slicing technique to further evaluate the prediction results. We found supporting evidence for most highly ranked candidates predicted by our model, which indicates that our approach can inform reliable new knowledge. CONCLUSION: This paper shows that our graph mining model can predict reliable new relationships between AD and other entities (i.e., dietary supplements, chemicals, and drugs). The knowledge graph constructed can facilitate data-driven knowledge discoveries and the generation of novel hypotheses.


Subjects
Alzheimer Disease , Semantics , Alzheimer Disease/drug therapy , Drug Repositioning , Humans , Knowledge , Pattern Recognition, Automated
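
To make the knowledge graph completion metrics in the abstract above concrete, the sketch below shows a TransE-style score and the two ranking metrics. It assumes that MR denotes mean rank and that the score uses the L2 norm (TransE is also commonly used with L1); all vectors, ranks, and function names are hypothetical and unrelated to the study's data.

```python
import math


def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between head + relation and tail.

    A triple (h, r, t) is considered plausible when h + r lies close to t,
    so scores nearer to zero indicate more plausible triples.
    """
    return -math.sqrt(sum((h + r - t) ** 2
                          for h, r, t in zip(head, relation, tail)))


def mean_rank(ranks):
    """Average rank of the correct entity (lower is better)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test triples whose correct entity is ranked in the top k."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks of the true tail entity for five test triples.
example_ranks = [1, 3, 2, 10, 1]
mr = mean_rank(example_ranks)
hits1 = hits_at_k(example_ranks, 1)
```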
5.
J Biomed Inform ; 115: 103671, 2021 03.
Article in English | MEDLINE | ID: mdl-33387683

ABSTRACT

OBJECTIVES: Patient representation learning refers to learning a dense mathematical representation of a patient that encodes meaningful information from Electronic Health Records (EHRs). This is generally performed using advanced deep learning methods. This study presents a systematic review of this field and provides both qualitative and quantitative analyses from a methodological perspective. METHODS: We identified studies developing patient representations from EHRs with deep learning methods from MEDLINE, EMBASE, Scopus, the Association for Computing Machinery (ACM) Digital Library, and the Institute of Electrical and Electronics Engineers (IEEE) Xplore Digital Library. After screening 363 articles, 49 papers were included for a comprehensive data collection. RESULTS: Publications developing patient representations almost doubled each year from 2015 until 2019. We noticed a typical workflow starting with feeding raw data, applying deep learning models, and ending with clinical outcome predictions as evaluations of the learned representations. Specifically, learning representations from structured EHR data was dominant (37 out of 49 studies). Recurrent Neural Networks were widely applied as the deep learning architecture (Long short-term memory: 13 studies, Gated recurrent unit: 11 studies). Learning was mainly performed in a supervised manner (30 studies) optimized with cross-entropy loss. Disease prediction was the most common application and evaluation (31 studies). Benchmark datasets were mostly unavailable (28 studies) due to privacy concerns of EHR data, and code availability was assured in 20 studies. DISCUSSION & CONCLUSION: The existing predictive models mainly focus on the prediction of single diseases, rather than considering the complex mechanisms of patients from a holistic review. We show the importance and feasibility of learning comprehensive representations of patient EHR data through a systematic review. 
Advances in patient representation learning techniques will be essential for powering patient-level EHR analyses. Future work will still be devoted to leveraging the richness and potential of available EHR data. Reproducibility and transparency of reported results will hopefully improve. Knowledge distillation and advanced learning techniques will be exploited to assist the capability of learning patient representation further.


Subjects
Deep Learning , Electronic Health Records , Humans , Neural Networks, Computer , Prognosis , Reproducibility of Results
6.
Pharmacoepidemiol Drug Saf ; 30(5): 602-609, 2021 05.
Article in English | MEDLINE | ID: mdl-33533072

ABSTRACT

PURPOSE: Severe adverse events (AEs), such as Guillain-Barré syndrome (GBS), occur rarely after influenza vaccination. We identify AEs highly associated with GBS and develop prediction models for GBS using the US Vaccine Adverse Event Reporting System (VAERS) reports following trivalent influenza vaccination (FLU3). METHODS: This study analyzed 80 059 reports from the US VAERS between 1990 and 2017. Several AEs were identified as highly associated with GBS and were used to develop the prediction model. Some common and mild AEs that were suspected to be underreported when GBS occurred simultaneously were removed from the final model. The analyses were validated using European influenza vaccine AE data from EudraVigilance. RESULTS: Of the 80 059 reports, 1185 (1.5%) were annotated as GBS related. Twenty-four AEs were identified as having a strong association with GBS. The full prediction model, using age, sex, and all 24 AEs, achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 85.4% (90% CI: [83.8%, 86.9%]). After excluding the nine likely underreported AEs (e.g., pruritus, rash, injection site pain), the final AUC became 77.5% (90% CI: [75.5%, 79.6%]). Two hundred and one (0.25%) reports were predicted to be at high risk of GBS (predicted probability >25%), and 84 actually developed GBS. CONCLUSION: The prediction performance demonstrated the potential of developing risk-prediction models utilizing the VAERS cohort. Excluding the likely underreported AEs sacrificed some prediction power but made the model more interpretable and feasible. The high absolute risk of even a small number of AE combinations suggests the promise of GBS prediction within the VAERS dataset.


Subjects
Adverse Drug Reaction Reporting Systems/statistics & numerical data , Guillain-Barre Syndrome , Influenza Vaccines/adverse effects , Influenza, Human/prevention & control , Female , Guillain-Barre Syndrome/chemically induced , Guillain-Barre Syndrome/diagnosis , Guillain-Barre Syndrome/epidemiology , Humans , Influenza Vaccines/administration & dosage , Male , United States/epidemiology , Vaccination/adverse effects
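
The proportions quoted in the abstract above follow directly from the reported counts. The short calculation below re-derives them. The last quantity, the share of high-risk predictions that actually developed GBS, is not stated as a percentage in the abstract and is computed here only as an illustration; the variable names are hypothetical.

```python
# Counts as reported in the abstract.
total_reports = 80059
gbs_reports = 1185
high_risk_predicted = 201
high_risk_with_gbs = 84

# Share of all reports annotated as GBS related (abstract: 1.5%).
gbs_share = gbs_reports / total_reports

# Share of all reports flagged as high risk (abstract: 0.25%).
flagged_share = high_risk_predicted / total_reports

# Derived here, not quoted in the abstract: fraction of flagged reports
# that actually developed GBS (a positive predictive value).
ppv_among_flagged = high_risk_with_gbs / high_risk_predicted

print(f"GBS share: {gbs_share:.2%}")
print(f"Flagged share: {flagged_share:.2%}")
print(f"GBS among flagged: {ppv_among_flagged:.1%}")
```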
7.
J Med Internet Res ; 23(8): e26478, 2021 08 05.
Article in English | MEDLINE | ID: mdl-34383667

ABSTRACT

BACKGROUND: The rapid growth of social media as an information channel has made it possible to quickly spread inaccurate or false vaccine information, thus creating obstacles for vaccine promotion. OBJECTIVE: The aim of this study is to develop and evaluate an intelligent automated protocol for identifying and classifying human papillomavirus (HPV) vaccine misinformation on social media using machine learning (ML)-based methods. METHODS: Reddit posts (from 2007 to 2017, N=28,121) that contained keywords related to HPV vaccination were compiled. A random subset (2200/28,121, 7.82%) was manually labeled for misinformation and served as the gold standard corpus for evaluation. A total of 5 ML-based algorithms, including a support vector machine, logistic regression, extremely randomized trees, a convolutional neural network, and a recurrent neural network designed to identify vaccine misinformation, were evaluated for identification performance. Topic modeling was applied to identify the major categories associated with HPV vaccine misinformation. RESULTS: A convolutional neural network model achieved the highest area under the receiver operating characteristic curve of 0.7943. Of the 28,121 Reddit posts, 7207 (25.63%) were classified as vaccine misinformation, with discussions about general safety issues identified as the leading type of misinformed posts (2666/7207, 36.99%). CONCLUSIONS: ML-based approaches are effective in the identification and classification of HPV vaccine misinformation on Reddit and may be generalizable to other social media platforms. ML-based methods may provide the capacity and utility to meet the challenge involved in intelligent automated monitoring and classification of public health misinformation on social media platforms. The timely identification of vaccine misinformation on the internet is the first step in misinformation correction and vaccine promotion.


Subjects
Papillomavirus Vaccines , Social Media , Communication , Humans , Machine Learning , Public Health
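
The area under the ROC curve reported in the abstract above can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; it is illustrative only, and the function name, labels, and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is quadratic in the sample size; it is meant for
    exposition, not for large datasets.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores: 1 = misinformation, 0 = not misinformation.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)
```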
8.
J Med Internet Res ; 22(7): e16981, 2020 07 31.
Article in English | MEDLINE | ID: mdl-32735224

ABSTRACT

BACKGROUND: Asthma exacerbation is an acute or subacute episode of progressive worsening of asthma symptoms and can have a significant impact on patients' quality of life. However, efficient methods that can help identify personalized risk factors and make early predictions are lacking. OBJECTIVE: This study aims to use advanced deep learning models to better predict the risk of asthma exacerbations and to explore potential risk factors involved in progressive asthma. METHODS: We proposed a novel time-sensitive, attentive neural network to predict asthma exacerbation using clinical variables from large electronic health records. The clinical variables were collected from the Cerner Health Facts database between 1992 and 2015, including 31,433 adult patients with asthma. Interpretations on both patient and cohort levels were investigated based on the model parameters. RESULTS: The proposed model obtained an area under the curve value of 0.7003 through a five-fold cross-validation, which outperformed the baseline methods. The results also demonstrated that the addition of elapsed time embeddings considerably improved the prediction performance. Further analysis revealed diverse distributions of contributing factors across patients as well as some possible cohort-level risk factors, such as respiratory diseases and esophageal reflux, for which supporting evidence could be found in peer-reviewed literature. CONCLUSIONS: The proposed neural network model performed better than previous methods for the prediction of asthma exacerbation. We believe that personalized risk scores and analyses of contributing factors can help clinicians better assess the individual's level of disease progression and afford the opportunity to adjust treatment, prevent exacerbation, and improve outcomes.


Subjects
Asthma/physiopathology , Deep Learning/standards , Neural Networks, Computer , Quality of Life/psychology , Disease Progression , Female , Humans , Male , Retrospective Studies , Risk Assessment , Risk Factors
9.
BMC Med Inform Decis Mak ; 20(Suppl 1): 73, 2020 04 30.
Article in English | MEDLINE | ID: mdl-32349758

ABSTRACT

BACKGROUND: Capturing sentence semantics plays a vital role in a range of text mining applications. Despite continuous efforts on the development of related datasets and models in the general domain, both datasets and models are limited in biomedical and clinical domains. The BioCreative/OHNLP2018 organizers have made the first attempt to annotate 1068 sentence pairs from clinical notes and have called for a community effort to tackle the Semantic Textual Similarity (BioCreative/OHNLP STS) challenge. METHODS: We developed models using traditional machine learning and deep learning approaches. For the post challenge, we focused on two models: the Random Forest and the Encoder Network. We applied sentence embeddings pre-trained on PubMed abstracts and MIMIC-III clinical notes and updated the Random Forest and the Encoder Network accordingly. RESULTS: The official results demonstrated that our best submission was the ensemble of eight models. It achieved a Pearson correlation coefficient of 0.8328 - the highest performance among 13 submissions from 4 teams. For the post challenge, the performance of both the Random Forest and the Encoder Network was improved; in particular, the correlation of the Encoder Network was improved by ~13%. During the challenge task, no end-to-end deep learning models had better performance than machine learning models that take manually-crafted features. In contrast, with the sentence embeddings pre-trained on biomedical corpora, the Encoder Network now achieves a correlation of ~0.84, which is higher than the original best model. The ensembled model taking the improved versions of the Random Forest and Encoder Network as inputs further increased performance to 0.8528. CONCLUSIONS: Deep learning models with sentence embeddings pre-trained on biomedical corpora achieve the highest performance on the test set. 
Through error analysis, we find that end-to-end deep learning models and traditional machine learning models with manually-crafted features complement each other by finding different types of sentences. We suggest a combination of these models can better find similar sentences in practice.


Subjects
Deep Learning , Electronic Health Records , Information Storage and Retrieval/methods , Machine Learning , Data Mining , Humans , Language , PubMed
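
The semantic textual similarity results above are reported as Pearson correlation coefficients between predicted and gold similarity scores. The following is a minimal sketch of that statistic; the function name and the example scores are hypothetical and do not come from the challenge data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical gold and predicted similarity scores for five sentence pairs.
gold_scores = [0.0, 1.5, 3.0, 4.5, 5.0]
predicted_scores = [0.4, 1.2, 2.9, 4.1, 4.8]
r = pearson_r(gold_scores, predicted_scores)
```

Pearson correlation rewards predictions that are linearly related to the gold scores, so a model can score well even if its outputs are systematically shifted or rescaled.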
10.
BMC Genomics ; 20(Suppl 1): 82, 2019 Feb 04.
Article in English | MEDLINE | ID: mdl-30712510

ABSTRACT

BACKGROUND: Existing functional descriptions of genes are categorical, discrete, and mostly produced through manual processes. In this work, we explore the idea of gene embedding, a distributed representation of genes, in the spirit of word embedding. RESULTS: In a purely data-driven fashion, we trained a 200-dimension vector representation of all human genes, using gene co-expression patterns in 984 data sets from the GEO databases. These vectors capture functional relatedness of genes in terms of recovering known pathways - the average inner product (similarity) of genes within a pathway is 1.52X greater than that of random genes. Using t-SNE, we produced a gene co-expression map that shows local concentrations of tissue-specific genes. We also illustrated the usefulness of the embedded gene vectors, laden with rich information on gene co-expression patterns, in tasks such as gene-gene interaction prediction. CONCLUSIONS: We proposed a machine learning method that utilizes transcriptome-wide gene co-expression to generate a distributed representation of genes. We further demonstrated the utility of our distributed representation by predicting gene-gene interaction based solely on gene names. The distributed representation of genes could be useful for more bioinformatics applications.


Subjects
Computational Biology/methods , Software , Algorithms , Computational Biology/standards , Epistasis, Genetic , Gene Expression Profiling/methods , Gene Expression Regulation , Humans , ROC Curve , Transcriptome , User-Computer Interface
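
The pathway comparison in the abstract above rests on the average pairwise inner product of gene vectors. The sketch below shows how such a within-group average and the resulting ratio could be computed. It is not the authors' pipeline; the two-dimensional toy vectors (the paper uses 200 dimensions) and all names are hypothetical.

```python
from itertools import combinations


def inner_product(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_pairwise_inner(vectors):
    """Average inner product over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(inner_product(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy embeddings: genes in one pathway point in similar
# directions, while randomly chosen genes do not.
pathway_genes = [[1.0, 0.2], [0.9, 0.1], [0.8, 0.3]]
random_genes = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

# A ratio above 1 means pathway members are more similar than random genes.
similarity_ratio = (mean_pairwise_inner(pathway_genes)
                    / mean_pairwise_inner(random_genes))
```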
11.
BMC Med Inform Decis Mak ; 19(Suppl 4): 147, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31391106

ABSTRACT

BACKGROUND: Hepatitis C affects about 3% of the world's population. In the United States, about 3.5 million people have chronic hepatitis C, and it is the leading cause of liver cancer and the most common indication for liver transplantation. In the last decades, new advances in therapy have substantially increased the cure rate of hepatitis C to more than 95% with the use of antiviral agents. However, drug safety of the new treatments remains one of the major concerns. Data from the US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) and the Electronic Health Record (EHR) systems provide crucial post-market information to evaluate drug safety. Currently, quantitative evidence of drug safety of hepatitis C treatments based on post-market data is still limited, and there is also a lack of a standard statistical procedure to systematically compare drug safety across multiple drugs using FAERS and EHR. METHOD: In this study, we presented a statistical procedure to compare the difference in adverse events (AE) across multiple hepatitis C drugs using data from FAERS and EHR, and to assess the consistency of results from the two databases. Through three major steps, including descriptive comparison, testing for difference among groups, and quantification of association, the proposed method can provide a quantitative comparison of the safety of multiple drugs. Specifically, we compared drugs that were approved by the FDA to treat hepatitis C before 2011 versus those approved after 2013. We used spontaneous AE reports submitted between 2004 and 2015 from the FAERS database and medical records between 1999 and 2015 from the Cerner Health Facts database to estimate and compare the rate of AE after drug use. RESULT: We studied the 30 most frequently reported AEs after treatment of hepatitis C, comparing the difference between drugs approved before 2011 versus those approved after 2013. 
Our results showed that there was a difference in the rate of AE between the two groups of treatment. We reported the AEs that have a statistically significant difference, and estimated the difference attributable to variation of age and gender between the two groups of drug users. Our findings are consistent with results in existing literature. Moreover, we compared the results obtained from FAERS data and EHR data, and evaluated the consistency of evidence. CONCLUSION: The proposed procedure is a general and standardized pipeline that can be used to compare and visualize drug safety among multiple drugs to support regulatory decision-making using post-market data. We showed that there was a statistically significant difference in AE rates between the new and old therapies for hepatitis C. We showed that both FAERS and EHR contained a large amount of information for research on post-market drug safety, but each has its own strengths and limitations. Caution should be taken when combining evidence from the two data resources, and there is a need for more sophisticated informatics and statistical tools for evidence synthesis.


Subjects
Antiviral Agents/adverse effects , Drug-Related Side Effects and Adverse Reactions/epidemiology , Hepatitis C/drug therapy , Adverse Drug Reaction Reporting Systems , Databases, Factual , Humans , Product Surveillance, Postmarketing , United States , United States Food and Drug Administration
12.
J Med Internet Res ; 20(7): e236, 2018 07 09.
Article in English | MEDLINE | ID: mdl-29986843

ABSTRACT

BACKGROUND: Timely understanding of public perceptions allows public health agencies to provide up-to-date responses to health crises such as infectious disease outbreaks. Social media such as Twitter provide an unprecedented way for the prompt assessment of the large-scale public response. OBJECTIVE: The aims of this study were to develop a scheme for a comprehensive public perception analysis of a measles outbreak based on Twitter data and demonstrate the superiority of the convolutional neural network (CNN) models (compared with conventional machine learning methods) on measles outbreak-related tweet classification tasks with a relatively small and highly unbalanced gold standard training set. METHODS: We first designed a comprehensive scheme for the analysis of public perception of measles based on tweets, including 3 dimensions: discussion themes, emotions expressed, and attitude toward vaccination. All 1,154,156 tweets containing the word "measles" posted between December 1, 2014, and April 30, 2015, were purchased and downloaded from DiscoverText.com. Two expert annotators curated a gold standard of 1151 tweets (approximately 0.1% of all tweets) based on the 3-dimensional scheme. Next, a tweet classification system based on the CNN framework was developed. We compared the performance of the CNN models to those of 4 conventional machine learning models and another neural network model. We also compared the impact of different word embedding configurations for the CNN models: (1) Stanford GloVe embedding trained on billions of tweets in the general domain, (2) measles-specific embedding trained on our 1 million measles-related tweets, and (3) a combination of the 2 embeddings. RESULTS: Cohen kappa intercoder reliability values for the annotation were 0.78, 0.72, and 0.80 on the 3 dimensions, respectively. Class distributions within the gold standard were highly unbalanced for all dimensions. 
The CNN models performed better on all classification tasks than k-nearest neighbors, naïve Bayes, support vector machines, or random forest. Detailed comparison between support vector machines and the CNN models showed that the major contributor to the overall superiority of the CNN models is the improvement on recall, especially for classes with low occurrence. The CNN model with the 2 embedding combination led to better performance on discussion themes and emotions expressed (microaveraging F1 scores of 0.7811 and 0.8592, respectively), while the CNN model with Stanford embedding achieved best performance on attitude toward vaccination (microaveraging F1 score of 0.8642). CONCLUSIONS: The proposed scheme can successfully classify the public's opinions and emotions in multiple dimensions, which would facilitate the timely understanding of public perceptions during the outbreak of an infectious disease. Compared with conventional machine learning methods, our CNN models showed superiority on measles-related tweet classification tasks with a relatively small and highly unbalanced gold standard. With the success of these tasks, our proposed scheme and CNN-based tweets classification system is expected to be useful for the analysis of tweets about other infectious diseases such as influenza and Ebola.


Subjects
Disease Outbreaks/statistics & numerical data , Measles/epidemiology , Neural Networks, Computer , Social Media/trends , History, 21st Century , Humans , Measles/pathology , Perception , Public Opinion , Reproducibility of Results
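
The intercoder reliability values in the abstract above are Cohen kappa scores, which correct the raw agreement between two annotators for the agreement expected by chance. The sketch below is a minimal two-annotator implementation; the labels and the function name are hypothetical and do not reflect the study's annotation data.

```python
from collections import Counter


def cohen_kappa(annotator_a, annotator_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(annotator_a) != len(annotator_b) or not annotator_a:
        raise ValueError("annotations must be non-empty and of equal length")
    n = len(annotator_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(1 for a, b in zip(annotator_a, annotator_b) if a == b) / n
    # Chance agreement: product of each annotator's label frequencies.
    counts_a, counts_b = Counter(annotator_a), Counter(annotator_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical attitude labels from two annotators for four tweets.
labels_a = ["pro", "pro", "anti", "anti"]
labels_b = ["pro", "pro", "anti", "pro"]
kappa = cohen_kappa(labels_a, labels_b)
```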
13.
BMC Med Inform Decis Mak ; 18(Suppl 2): 43, 2018 07 23.
Article in English | MEDLINE | ID: mdl-30066665

ABSTRACT

BACKGROUND: Suicide has been one of the leading causes of death in the United States. One major cause of suicide is psychiatric stressors. The detection of psychiatric stressors in an at-risk population will facilitate the early prevention of suicidal behaviors and suicide. In recent years, the widespread popularity and real-time information sharing flow of social media have allowed potential early intervention in a large-scale population. However, few automated approaches have been proposed to extract psychiatric stressors from Twitter. The goal of this study was to investigate techniques for recognizing suicide-related psychiatric stressors from Twitter using deep learning based methods and a transfer learning strategy that leverages an existing annotation dataset from clinical text. METHODS: First, a dataset of suicide-related tweets was collected from Twitter streaming data with a multiple-step pipeline including keyword-based retrieving, filtering and further refining using an automated binary classifier. Specifically, a convolutional neural network (CNN) based algorithm was used to build the binary classifier. Next, psychiatric stressors were annotated in the suicide-related tweets. The stressor recognition problem is conceptualized as a typical named entity recognition (NER) task and tackled using recurrent neural network (RNN) based methods. Moreover, to reduce the annotation cost and improve the performance, a transfer learning strategy was adopted by leveraging existing annotation from clinical text. RESULTS & CONCLUSIONS: To the best of our knowledge, this is the first effort to extract psychiatric stressors from Twitter data using deep learning based approaches. Comparison to traditional machine learning algorithms shows the superiority of deep learning based approaches. CNN leads the performance at identifying suicide-related tweets with a precision of 78% and an F-1 measure of 83%, outperforming Support Vector Machine (SVM), Extra Trees (ET), etc. 
RNN based psychiatric stressors recognition obtains the best F-1 measure of 53.25% by exact match and 67.94% by inexact match, outperforming Conditional Random Fields (CRF). Moreover, transfer learning from clinical notes for the Twitter corpus outperforms the training with Twitter corpus only with an F-1 measure of 54.9% by exact match. The results indicate the advantages of deep learning based methods for the automated stressors recognition from social media.


Subjects
Deep Learning , Social Media , Stress, Psychological , Suicide Prevention , Algorithms , Humans , Neural Networks, Computer
14.
BMC Med Inform Decis Mak ; 17(Suppl 2): 69, 2017 Jul 05.
Article in English | MEDLINE | ID: mdl-28699569

ABSTRACT

BACKGROUND: As one of the serious public health issues, vaccination refusal has been attracting more and more attention, especially for newly approved human papillomavirus (HPV) vaccines. Understanding public opinion towards HPV vaccines, especially concerns on social media, is of significant importance for HPV vaccination promotion. METHODS: In this study, we leveraged a hierarchical machine learning based sentiment analysis system to extract public opinions towards HPV vaccines from Twitter. English tweets containing HPV vaccine-related keywords were collected from November 2, 2015 to March 28, 2016. Manual annotation was done to evaluate the performance of the system on the unannotated tweets corpus. Time series analysis was then applied to this corpus to track the trends of machine-deduced sentiments and their associations with different days of the week. RESULTS: The evaluation on the unannotated tweets corpus showed that the micro-averaged F score reached 0.786. The learning system deduced the sentiment labels for 184,214 tweets in the collected unannotated tweets corpus. Time series analysis identified a coincidence between mainstream outcome and Twitter contents. A weak trend was found for "Negative" tweets, which first decreased and later began to increase; an opposite trend was identified for "Positive" tweets. Tweets that contain worries about the efficacy of HPV vaccines showed a relatively significant decreasing trend. Strong associations were found between some sentiments ("Positive", "Negative", "Negative-Safety" and "Negative-Others") and different days of the week. CONCLUSIONS: Our efforts on sentiment analysis for newly approved HPV vaccines provide us with an automatic and instant way to extract public opinion and understand the concerns on Twitter. Our approaches can provide feedback to public health professionals to monitor online public response, examine the effectiveness of their HPV vaccination promotion strategies and adjust their promotion plans.
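
The micro-averaged F score reported in the abstract above pools true positives, false positives and false negatives over all sentiment classes before computing precision and recall. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `micro_f1` and the per-label counts are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def micro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-label (tp, fp, fn) counts.

    The counts are pooled across labels first, so frequent labels
    weigh more than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        # No correct predictions at all: define the score as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two sentiment labels (not from the study).
example_counts = {
    "Positive": (8, 2, 1),
    "Negative": (6, 1, 3),
}
score = micro_f1(example_counts)
```

With these illustrative counts the pooled totals are 14 true positives, 3 false positives and 4 false negatives.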


Subjects
Health Knowledge, Attitudes, Practice , Machine Learning , Papillomavirus Infections/prevention & control , Papillomavirus Vaccines , Public Opinion , Social Media , Vaccination/psychology , Humans
15.
BMC Med Inform Decis Mak ; 17(Suppl 2): 76, 2017 Jul 05.
Article in English | MEDLINE | ID: mdl-28699543

ABSTRACT

BACKGROUND: Identifying safety signals by manual review of individual reports in large surveillance databases is time-consuming; such an approach is very unlikely to reveal complex relationships between medications and adverse events. Since the late 1990s, efforts have been made to develop data mining tools to systematically and automatically search for safety signals in surveillance databases. Influenza vaccines present special challenges to safety surveillance because the vaccine changes every year in response to the influenza strains predicted to be prevalent that year. Therefore, it may be expected that reporting rates of adverse events following flu vaccines (number of reports for a specific vaccine-event combination/number of reports for all vaccine-event combinations) may vary substantially across reporting years. Current surveillance methods seldom consider these variations in signal detection, and reports from different years are typically collapsed together to conduct safety analyses. However, merging reports from different years ignores the potential heterogeneity of reporting rates across years and may miss important safety signals. METHOD: Reports of adverse events from 1990 to 2013 were extracted from the Vaccine Adverse Event Reporting System (VAERS) database and formatted into a three-dimensional data array with types of vaccine, groups of adverse events and reporting time as the three dimensions. We propose a random effects model to test the heterogeneity of reporting rates for a given vaccine-event combination across reporting years. The proposed method provides a rigorous statistical procedure to detect differences in reporting rates among years. We also introduce a new visualization tool to summarize the results of the proposed method when applied to multiple vaccine-adverse event combinations. RESULT: We applied the proposed method to detect safety signals of FLU3, an influenza vaccine containing three flu strains, in the VAERS database. We showed that it had high statistical power to detect the variation in reporting rates across years. The identified vaccine-event combinations with significantly different reporting rates over years suggested potential safety issues due to changes in vaccines, which require further investigation. CONCLUSION: We developed a statistical model to detect safety signals arising from heterogeneity of reporting rates of a given vaccine-event combination across reporting years. This method detects variation in reporting rates over years with high power. The temporal trend of reporting rate across years may reveal the impact of vaccine updates on the occurrence of adverse events and provide evidence for further investigations.
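
The reporting rate defined in the abstract above (reports for one vaccine-event combination divided by reports for all vaccine-event combinations) can be computed per reporting year before any heterogeneity test is run. The sketch below illustrates only that ratio, not the authors' random effects model; the `(year, vaccine, event)` tuple layout, the function name `reporting_rates` and the sample records are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumed record layout: (reporting year, vaccine type, adverse event group).
Report = Tuple[int, str, str]


def reporting_rates(reports: Iterable[Report], vaccine: str, event: str) -> Dict[int, float]:
    """Per-year reporting rate of one vaccine-event combination.

    Rate for a year = reports of (vaccine, event) in that year
                      / all reports in that year.
    """
    totals: Counter = Counter()
    hits: Counter = Counter()
    for year, v, e in reports:
        totals[year] += 1
        if v == vaccine and e == event:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Hypothetical records (not VAERS data).
sample_reports = [
    (2010, "FLU3", "fever"),
    (2010, "FLU3", "rash"),
    (2010, "MMR", "fever"),
    (2010, "FLU3", "fever"),
    (2011, "FLU3", "fever"),
    (2011, "MMR", "rash"),
    (2011, "MMR", "fever"),
    (2011, "MMR", "rash"),
]
rates = reporting_rates(sample_reports, "FLU3", "fever")
```

Collapsing all years into a single ratio would hide the year-to-year difference that this per-year view makes visible, which is the motivation stated in the abstract.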


Subjects
Adverse Drug Reaction Reporting Systems , Data Mining , Patient Safety , Vaccines/adverse effects , Humans , Signal Detection, Psychological
16.
J Hazard Mater ; 476: 134956, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38917630

ABSTRACT

Covalent organic frameworks (COFs) are a type of novel organic catalysts which show great potential in the treatment of environmental contaminations. Herein, we synthesized three isoreticular halogen-functionalized (F, Cl and Br) porphyrin COFs for visible-light (420 nm ≤ λ ≤ 780 nm) photocatalytic reduction of Cr(VI) to Cr(III). Halogen substituents with tunable electronegativity can regulate the band structure and modulate the charge carrier kinetics of COFs. In the absence of any sacrificial reagent, the isoreticular COFs exhibited good photocatalytic reduction activity of Cr(VI). Particularly, the TAPP-2F showed nearly 100 % conversion efficiency and the highest reaction rate constants (k) on account of the strong electronegativity of F substituent. Experimental results and theoretical calculations showed that the conduction band (CB) potentials of COFs became more negative and charge carrier separation increased with the enhancement of electronegativity (Br < Cl < F), which could provide sufficient driving force for the photoreduction of Cr(VI) to Cr(III). The halogen substituents strategy for regulating the electronic structure of COFs can provide opportunities for designing efficient photocatalysts for environmental remediation. Meanwhile, the mechanistic insights reported in this study help to understand the photocatalytic degradation pathways of heavy metals.

17.
ACS Appl Mater Interfaces ; 16(3): 4283-4294, 2024 Jan 24.
Article in English | MEDLINE | ID: mdl-38206114

ABSTRACT

Traditional piperazine-based polyamide membranes usually suffer from the intrinsic trade-off relationship between selectivity and permeance. The development of macrocycle membranes with customized nanoscale pores is expected to address this challenge. Herein, we introduce 1,4-diazacyclohexane (2N), 1,4,7-triazacyclononane (3N), and 1,4,8,11-tetraazacyclotetradecane (4N) as molecular building blocks to construct the nanoarchitectonics of polyamide membranes prepared from interfacial polymerization (IP). The permeance of covalent organic network membranes follows the trend of 4N-TMC > 3N-TMC > 2N-TMC, while the molecular weight cutoff (MWCO) also follows the same trend of 4N-TMC > 3N-TMC > 2N-TMC, according to their nanopore size of the membranes. The microporosity, orientation, and surface chemistry of covalent organic network membranes can be rationally designed by macrocycle building units. The ordered nanoarchitectonics allows the membranes to attain an excellent performance in graded molecular sieving. Importantly, the novel covalent organic network membranes with tunable nanoarchitectonics prepared from macrocycle building units exhibited high water permeance (32.5 LMH/bar) and retained long-term stability after 100 h of test and bovine serum albumin fouling. These results reveal the enormous potential of 3N-TMC and 4N-TMC membranes in saline textile wastewater treatments and precise molecular sieving.

18.
Article in English | MEDLINE | ID: mdl-38281112

ABSTRACT

IMPORTANCE: The study highlights the potential of large language models, specifically GPT-3.5 and GPT-4, in processing complex clinical data and extracting meaningful information with minimal training data. By developing and refining prompt-based strategies, we can significantly enhance the models' performance, making them viable tools for clinical NER tasks and possibly reducing the reliance on extensive annotated datasets. OBJECTIVES: This study quantifies the capabilities of GPT-3.5 and GPT-4 for clinical named entity recognition (NER) tasks and proposes task-specific prompts to improve their performance. MATERIALS AND METHODS: We evaluated these models on 2 clinical NER tasks: (1) to extract medical problems, treatments, and tests from clinical notes in the MTSamples corpus, following the 2010 i2b2 concept extraction shared task, and (2) to identify nervous system disorder-related adverse events from safety reports in the vaccine adverse event reporting system (VAERS). To improve the GPT models' performance, we developed a clinical task-specific prompt framework that includes (1) baseline prompts with task description and format specification, (2) annotation guideline-based prompts, (3) error analysis-based instructions, and (4) annotated samples for few-shot learning. We assessed each prompt's effectiveness and compared the models to BioClinicalBERT. RESULTS: Using baseline prompts, GPT-3.5 and GPT-4 achieved relaxed F1 scores of 0.634, 0.804 for MTSamples and 0.301, 0.593 for VAERS. Additional prompt components consistently improved model performance. When all 4 components were used, GPT-3.5 and GPT-4 achieved relaxed F1 scores of 0.794, 0.861 for MTSamples and 0.676, 0.736 for VAERS, demonstrating the effectiveness of our prompt framework. Although these results trail BioClinicalBERT (F1 of 0.901 for the MTSamples dataset and 0.802 for the VAERS dataset), they are very promising considering that few training samples are needed. DISCUSSION: The study's findings suggest a promising direction in leveraging LLMs for clinical NER tasks. However, while the performance of GPT models improved with task-specific prompts, there's a need for further development and refinement. LLMs like GPT-4 show potential in achieving close performance to state-of-the-art models like BioClinicalBERT, but they still require careful prompt engineering and understanding of task-specific knowledge. The study also underscores the importance of evaluation schemas that accurately reflect the capabilities and performance of LLMs in clinical settings. CONCLUSION: While direct application of GPT models to clinical NER tasks falls short of optimal performance, our task-specific prompt framework, incorporating medical knowledge and training samples, significantly enhances GPT models' feasibility for potential clinical applications.

19.
JMIR Med Inform ; 12: e57164, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38904984

ABSTRACT

BACKGROUND: Vaccines serve as a crucial public health tool, although vaccine hesitancy continues to pose a significant threat to full vaccine uptake and, consequently, community health. Understanding and tracking vaccine hesitancy is essential for effective public health interventions; however, traditional survey methods present various limitations. OBJECTIVE: This study aimed to create a real-time, natural language processing (NLP)-based tool to assess vaccine sentiment and hesitancy across 3 prominent social media platforms. METHODS: We mined and curated discussions in English from Twitter (subsequently rebranded as X), Reddit, and YouTube social media platforms posted between January 1, 2011, and October 31, 2021, concerning human papillomavirus; measles, mumps, and rubella; and unspecified vaccines. We tested multiple NLP algorithms to classify vaccine sentiment into positive, neutral, or negative and to classify vaccine hesitancy using the World Health Organization's (WHO) 3Cs (confidence, complacency, and convenience) hesitancy model, conceptualizing an online dashboard to illustrate and contextualize trends. RESULTS: We compiled over 86 million discussions. Our top-performing NLP models displayed accuracies ranging from 0.51 to 0.78 for sentiment classification and from 0.69 to 0.91 for hesitancy classification. Explorative analysis on our platform highlighted variations in online activity about vaccine sentiment and hesitancy, suggesting unique patterns for different vaccines. CONCLUSIONS: Our innovative system performs real-time analysis of sentiment and hesitancy on 3 vaccine topics across major social networks, providing crucial trend insights to assist campaigns aimed at enhancing vaccine uptake and public health.

20.
Adv Mater ; : e2405744, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38861297

ABSTRACT

The precise manipulation of the microstructure (pore size, free volume distribution, and connectivity of the free-volume elements), thickness, and mechanical characteristics of membranes holds paramount significance in facilitating the effective utilization of self-standing membranes. In this contribution, the synthesis of two innovative ester-linked covalent-organic framework (COF) membranes is first reported, which are generated through the selection of plant-derived ellagic acid and quercetin phenolic monomers in conjunction with terephthaloyl chloride as a building block. The optimization of the microstructure of these two COF membranes is systematically achieved through the application of three different interfacial electric field systems: electric neutrality, positive electricity, and negative electricity. It is observed that the positively charged system facilitates a record increase in the rate of membrane formation, resulting in a denser membrane with a uniform pore size and enhanced flexibility. In addition, a correlation is identified wherein an increase in the alkyl chain length of the surfactants leads to a more uniform pore size and a decrease in the molecular weight cutoff of the COF membrane. The resulting COF membrane exhibits an unprecedented combination of high water permeance, superior sieving capability, robust mechanical strength, and chemical robustness, which is promising for membrane-based separation science and technology.
