Results 1 - 20 of 85
1.
Invest Educ Enferm ; 42(2)2024 Jun.
Article in English | MEDLINE | ID: mdl-39083839

ABSTRACT

Objective: This work sought to identify the academic communities that have shown interest and participation in the journal Research and Education in Nursing and analyze the scientific impact generated by said journal. Methods: A bibliometric analysis was carried out, together with social network analysis and natural language processing techniques. The data were gathered and analyzed over a specific study period: 2010-2020 for articles published in the journal, and 2010-2022 for articles that cited the journal within Scopus. These methods permitted an exhaustive evaluation of the journal's influence and reach in diverse academic and geographic contexts. Results: During the analysis, it was noted that the journal Research and Education in Nursing has had significant influence in academic and scientific communities, both nationally and internationally. Collaboration networks were detected among diverse institutions and countries, which indicates active interaction in the field of nursing research. In addition, trends and emerging patterns were identified in this field, providing a more complete view of the discipline's evolution. Conclusion: Based on the results obtained, it is concluded that the journal Research and Education in Nursing has played a fundamental role in disseminating knowledge and promoting research in nursing. The combination of bibliometric metrics, social network analysis, and natural language processing permitted a more complete understanding of its impact in the scientific and academic community globally.


Subject(s)
Bibliometrics , Natural Language Processing , Nursing Research , Periodicals as Topic , Humans , Periodicals as Topic/statistics & numerical data , Social Network Analysis , Education, Nursing
2.
Braz J Psychiatry ; 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39074334

ABSTRACT

OBJECTIVE: Verbal communication carries key information for mental health evaluation. Researchers have linked psychopathology phenomena to some of their counterparts in natural language processing (NLP). We study the characterization of subtle impairments presented in early stages of psychosis, developing new analysis techniques and a comprehensive map associating NLP features with the full range of clinical presentation. METHODS: We used NLP to assess elicited and free speech of 60 individuals in at-risk mental states (ARMS) and 73 controls, screened from 4,500 quota-sampled Portuguese-speaking citizens in Sao Paulo, Brazil. Psychotic symptoms were independently assessed with the Structured Interview for Psychosis-Risk Syndromes (SIPS). Speech features (e.g., sentiments, semantic coherence), including novel ones, were correlated with psychotic traits (Spearman's ρ) and ARMS status (general linear models and machine-learning ensembles). RESULTS: NLP features were informative inputs for classification, which presented 86% balanced accuracy. The NLP features brought forth (e.g., semantic laminarity as 'perseveration', semantic recurrence time as 'circumstantiality', average centrality in word repetition graphs) carried most information and also presented direct correlations with psychotic symptoms. Out of the standard measures, grammatical tagging (e.g., use of adjectives) was the most relevant. CONCLUSION: Subtle speech impairments can be grasped by sensitive methods and used for ARMS screening. We sketch a blueprint for speech-based evaluation, pairing features to standard thought disorder psychometric items.
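
The correlations mentioned in this abstract use Spearman's ρ, which is the Pearson correlation computed on ranks. A minimal sketch in Python follows, assuming hypothetical per-participant values; the variable names and numbers are illustrative and do not come from the study.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical data: one speech feature and one symptom score per participant.
semantic_coherence = [0.91, 0.85, 0.78, 0.60, 0.55]
symptom_score = [1, 2, 2, 4, 5]
rho = spearman_rho(semantic_coherence, symptom_score)
print(round(rho, 4))
```

With these made-up values, lower coherence goes with higher symptom scores, so ρ is strongly negative. Ties are handled with average ranks, which is one common convention and an assumption here.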

3.
Radiol Bras ; 57: e20230096en, 2024.
Article in English | MEDLINE | ID: mdl-38993952

ABSTRACT

Objective: To develop a natural language processing application capable of automatically identifying benign gallbladder diseases that require surgery, from radiology reports. Materials and Methods: We developed a text classifier to classify reports as describing benign diseases of the gallbladder that do or do not require surgery. We randomly selected 1,200 reports describing the gallbladder from our database, including different modalities. Four radiologists classified the reports as describing benign disease that should or should not be treated surgically. Two deep learning architectures were trained for classification: a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. In order to represent words in vector form, the models included a Word2Vec representation, with dimensions of 300 or 1,000. The models were trained and evaluated by dividing the dataset into training, validation, and test subsets (80/10/10). Results: The CNN and BiLSTM performed well in both dimensional spaces. For the 300- and 1,000-dimensional spaces, respectively, the F1-scores were 0.95945 and 0.95302 for the CNN model, compared with 0.96732 and 0.96732 for the BiLSTM model. Conclusion: Our models achieved high performance, regardless of the architecture and dimensional space employed.


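
The 80/10/10 division into training, validation, and test subsets can be sketched as below. This is a minimal illustration, not the authors' code: the function name, the fixed seed, and the use of integer report IDs are assumptions, and the abstract does not say whether the split was stratified by class.

```python
import random

def split_80_10_10(items, seed=0):
    """Shuffle a copy of items and cut it into train/validation/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the cut points.
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical stand-in for the 1,200 labeled reports.
report_ids = list(range(1200))
train, val, test = split_80_10_10(report_ids)
print(len(train), len(val), len(test))
```

The validation part would typically guide model selection, while the test part is held back for the final F1-score.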

4.
BMC Med Inform Decis Mak ; 24(1): 204, 2024 Jul 24.
Article in English | MEDLINE | ID: mdl-39049027

ABSTRACT

Despite the high creation cost, annotated corpora are indispensable for robust natural language processing systems. In the clinical field, in addition to annotating medical entities, corpus creators must also remove personally identifiable information (PII). This has become increasingly important in the era of large language models where unwanted memorization can occur. This paper presents a corpus annotated to anonymize personally identifiable information in 1,787 anamneses of work-related accidents and diseases in Spanish. Additionally, we applied a previously released model for Named Entity Recognition (NER) trained on referrals from primary care physicians to identify diseases, body parts, and medications in this work-related text. We analyzed the differences between the models and the gold standard curated by a physician in detail. Moreover, we compared the performance of the NER model on the original narratives, in narratives where personal information has been masked, and in texts where the personal data is replaced by another similar surrogate value (pseudonymization). Within this publication, we share the annotation guidelines and the annotated corpus.


Subject(s)
Natural Language Processing , Humans , Spain , Occupational Health , Narration
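
The two de-identification variants compared in this abstract, masking and pseudonymization, differ only in what replaces each annotated span. A minimal sketch under stated assumptions: the example sentence, the character offsets, the labels, and the surrogate table are all hypothetical, and the corpus's real annotation format may differ.

```python
def replace_spans(text, spans, make_replacement):
    """Rewrite each (start, end, label) span of text, working right to left
    so that earlier offsets stay valid while the string changes length."""
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + make_replacement(label) + text[end:]
    return text

def mask(text, spans):
    """Masking: replace each PII span with a placeholder naming its type."""
    return replace_spans(text, spans, lambda label: f"[{label}]")

def pseudonymize(text, spans, surrogates):
    """Pseudonymization: replace each PII span with a surrogate of the same type."""
    return replace_spans(text, spans, lambda label: surrogates[label])

# Hypothetical note and annotations (end offsets are exclusive).
note = "Juan Perez fell at work in Sevilla."
pii_spans = [(0, 10, "NAME"), (27, 34, "CITY")]
surrogates = {"NAME": "Luis Gomez", "CITY": "Bilbao"}

masked = mask(note, pii_spans)
pseudonymized = pseudonymize(note, pii_spans, surrogates)
print(masked)
print(pseudonymized)
```

Pseudonymized text keeps the surface form of a name or place, which is one plausible reason an NER model might behave differently on it than on masked text.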
5.
Invest. educ. enferm ; 42(2): 163-178, 20240722. ilus, tab, graf
Article in English | LILACS, BDENF - Nursing, COLNAL | ID: biblio-1570366

ABSTRACT

Objectives. This work sought to identify the academic communities that have shown interest and participation in the journal Research and Education in Nursing and analyze the scientific impact generated by said journal. Methods. A bibliometric analysis was carried out, together with social network analysis and natural language processing techniques. The data were gathered and analyzed over a specific study period: 2010-2020 for articles published in the journal, and 2010-2022 for articles that cited the journal within Scopus. These methods permitted an exhaustive evaluation of the journal's influence and reach in diverse academic and geographic contexts. Results. During the analysis, it was noted that the journal Research and Education in Nursing has had significant influence in academic and scientific communities, both nationally and internationally. Collaboration networks were detected among diverse institutions and countries, which indicates active interaction in the field of nursing research. In addition, trends and emerging patterns were identified in this field, providing a more complete view of the discipline's evolution. Conclusion. Based on the results obtained, it is concluded that the journal Research and Education in Nursing has played a fundamental role in disseminating knowledge and promoting research in nursing. The combination of bibliometric metrics, social network analysis, and natural language processing permitted a more complete understanding of its impact in the scientific and academic community globally.




Subject(s)
Humans , Male , Female , Research , Education , Social Network Analysis , Natural Language Processing
6.
Heliyon ; 10(7): e27516, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38560155

ABSTRACT

The importance of radiology in modern medicine is acknowledged for its non-invasive diagnostic capabilities, yet the manual formulation of unstructured medical reports poses time constraints and error risks. This study addresses the common limitation of Artificial Intelligence applications in medical image captioning, which typically focus on classification problems, lacking detailed information about the patient's condition. Despite advancements in AI-generated medical reports that incorporate descriptive details from X-ray images, which are essential for comprehensive reports, the challenge persists. The proposed solution involves a multimodal model utilizing Computer Vision for image representation and Natural Language Processing for textual report generation. A notable contribution is the innovative use of the Swin Transformer as the image encoder, enabling hierarchical mapping and enhanced model perception without a surge in parameters or computational costs. The model incorporates GPT-2 as the textual decoder, integrating cross-attention layers and bilingual training with datasets in Portuguese PT-BR and English. Promising results are noted in the proposed database with ROUGE-L 0.748, METEOR 0.741, and NIH CHEST X-ray with ROUGE-L 0.404 and METEOR 0.393.
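
ROUGE-L, one of the metrics reported in this abstract, scores a generated report by the longest common subsequence (LCS) it shares with the reference. The sketch below is a simplified version under stated assumptions: whitespace tokenization, lowercasing, and an equally weighted F-measure. Published implementations differ in these choices, and the example sentences are invented.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(reference, candidate):
    """LCS-based precision and recall combined into an F1 score."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)
    precision = lcs / len(cand)
    return 2 * precision * recall / (precision + recall)

# Hypothetical reference and generated report sentences.
reference = "no acute cardiopulmonary abnormality"
candidate = "no acute abnormality seen"
score = rouge_l_f1(reference, candidate)
print(score)
```

Because the LCS respects word order without requiring adjacency, the candidate is credited for "no acute ... abnormality" even though a word is missing in between.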

7.
Front Artif Intell ; 7: 1336071, 2024.
Article in English | MEDLINE | ID: mdl-38576460

ABSTRACT

Introduction: Antibiotic-resistant Acinetobacter baumannii is a very important nosocomial pathogen worldwide. Thousands of studies have been conducted about this pathogen. However, there has not been any attempt to use all this information to highlight the research trends concerning this pathogen. Methods: Here we use unsupervised learning and natural language processing (NLP), two areas of Artificial Intelligence, to analyse the most extensive database of articles created (5,500+ articles, from 851 different journals, published over 3 decades). Results: K-means clustering found 113 theme clusters and these were defined with representative terms automatically obtained with topic modelling, summarising different research areas. The biggest clusters, all with over 100 articles, are biased toward multidrug resistance, carbapenem resistance, clinical treatment, and nosocomial infections. However, we also found that some research areas, such as ecology and non-human infections, have received very little attention. This approach allowed us to study research themes over time unveiling those of recent interest, such as the use of Cefiderocol (a recently approved antibiotic) against A. baumannii. Discussion: In a broader context, our results show that unsupervised learning, NLP and topic modelling can be used to describe and analyse the research themes for important infectious diseases. This strategy should be very useful to analyse other ESKAPE pathogens or any other pathogens relevant to Public Health.
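
The pipeline described in this abstract, clustering articles and then labelling each cluster with representative terms, can be illustrated on a toy scale. This sketch makes several simplifying assumptions that are not in the paper: documents are raw word-count vectors, the initial centroids are fixed rather than random, and the highest-weighted centroid terms stand in for topic modelling. The titles are invented.

```python
from collections import Counter

def vectorize(docs):
    """Turn each document into a word-count vector over a shared vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        vectors.append([counts[w] for w in vocab])
    return vocab, vectors

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, init_indices, iterations=10):
    """Plain k-means; initial centroids are the vectors at init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def top_terms(vocab, centroid, n=2):
    """Highest-weighted vocabulary terms of one centroid (ties broken A-Z)."""
    ranked = sorted(zip(centroid, vocab), key=lambda p: (-p[0], p[1]))
    return [w for _, w in ranked[:n]]

# Hypothetical article titles.
titles = [
    "carbapenem resistance genes",
    "carbapenem resistance outbreak",
    "cefiderocol treatment outcomes",
    "cefiderocol treatment trial",
]
vocab, vectors = vectorize(titles)
labels, centroids = kmeans(vectors, init_indices=[0, 2])
for c, centroid in enumerate(centroids):
    print(c, top_terms(vocab, centroid))
```

A real analysis of thousands of abstracts would more likely use weighted term vectors and a library implementation; the point here is only the assign-then-update loop and the idea of reading themes off the centroids.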

8.
Assessment ; 31(2): 502-517, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37042304

ABSTRACT

Data aggregation in mental health is complicated by the use of different questionnaires, and little is known about the impact of item harmonization strategies on measurement precision. Therefore, we aimed to assess the impact of various item harmonization strategies for a target and proxy questionnaire using correlated and bifactor models. Data were obtained from the Brazilian High-Risk Study for Mental Conditions (BHRCS) and the Healthy Brain Network (HBN; N = 6,140, ages 5-22 years, 39.6% females). We tested six item-wise harmonization strategies and compared them based on several indices. The one-by-one (1:1) expert-based semantic item harmonization was the best strategy, as it was the only one that resulted in scalar-invariant models for both samples and factor models. The between-questionnaires factor correlation, reliability, and factor score difference in using a proxy instead of a target measure improved little when all other harmonization strategies were compared with a completely at-random strategy. However, for bifactor models, between-questionnaire specific factor correlation increased from 0.05-0.19 (random item harmonization) to 0.43-0.60 (expert-based 1:1 semantic harmonization) in BHRCS and HBN samples, respectively. Therefore, item harmonization strategies are relevant for specific factors from bifactor models and had little impact on p-factors and first-order correlated factors when the Child Behavior Checklist (CBCL) and Strengths and Difficulties Questionnaire (SDQ) were harmonized.


Subject(s)
Mental Disorders , Psychopathology , Child , Female , Humans , Adolescent , Male , Reproducibility of Results , Psychometrics , Mental Health , Surveys and Questionnaires , Mental Disorders/diagnosis , Mental Disorders/psychology
9.
Radiol. bras ; 57: e20230096en, 2024. tab, graf
Article in English | LILACS-Express | LILACS | ID: biblio-1564998

ABSTRACT

Objective: To develop a natural language processing application capable of automatically identifying benign gallbladder diseases that require surgery, from radiology reports. Materials and Methods: We developed a text classifier to classify reports as describing benign diseases of the gallbladder that do or do not require surgery. We randomly selected 1,200 reports describing the gallbladder from our database, including different modalities. Four radiologists classified the reports as describing benign disease that should or should not be treated surgically. Two deep learning architectures were trained for classification: a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network. In order to represent words in vector form, the models included a Word2Vec representation, with dimensions of 300 or 1,000. The models were trained and evaluated by dividing the dataset into training, validation, and test subsets (80/10/10). Results: The CNN and BiLSTM performed well in both dimensional spaces. For the 300- and 1,000-dimensional spaces, respectively, the F1-scores were 0.95945 and 0.95302 for the CNN model, compared with 0.96732 and 0.96732 for the BiLSTM model. Conclusion: Our models achieved high performance, regardless of the architecture and dimensional space employed.



10.
Rev. invest. clín ; 75(6): 309-317, Nov.-Dec. 2023. graf
Article in English | LILACS-Express | LILACS | ID: biblio-1560116

ABSTRACT

Artificial intelligence (AI) generative models driven by the integration of AI and natural language processing technologies, such as OpenAI's chatbot generative pre-trained transformer large language model (LLM), are receiving much public attention and have the potential to transform personalized medicine. Dialysis patients are highly dependent on technology, and their treatment generates a challengingly large volume of data that has to be analyzed for knowledge extraction. We argue that, by integrating the data acquired from hemodialysis treatments with the powerful conversational capabilities of LLMs, nephrologists could personalize treatments adapted to patients' lifestyles and preferences. We also argue that this new conversational AI integrated with a personalized patient-computer interface will enhance patients' engagement and self-care by providing them with a more personalized experience. However, generative AI models require continuous and accurate data updates and expert supervision, and must address potential biases and limitations. Dialysis patients can also benefit from other emerging technologies such as Digital Twins, with which patients' care can also be addressed from a personalized medicine perspective. In this paper, we review LLMs' potential strengths in terms of their contribution to personalized medicine and, in particular, their potential impact and limitations in nephrology. Nephrologists' collaboration with AI academia and companies to develop algorithms and models that are more transparent, understandable, and trustworthy will be crucial for the next generation of dialysis patients. The combination of technology, patient-specific data, and AI should contribute to creating a more personalized and interactive dialysis process, improving patients' quality of life.

11.
Rev. cuba. inform. méd ; 15(2)dic. 2023.
Article in Spanish | LILACS-Express | LILACS | ID: biblio-1536285

ABSTRACT

Introduction: Current advances in the field of ICTs have allowed an important boost in the development of systems that allow translating plain text in Spanish into pictograms. However, the current solutions cannot be understood by a person with language difficulties in Cuba because some terminologies are not present in everyday language. Objective: To develop the Pictobana model for the semantic analysis of a Pictotranslator that integrates the semantics of the Cuban language. Methods: The model was developed by applying natural language processing techniques. A linguistic analysis was carried out with the aim of providing the best possible representations of the texts in pictograms. Results: The model is implemented in a web application that provides a tool that helps promote communication skills and abilities for people with speech difficulties and their families in Cuba. Conclusions: The tests carried out through experiments and expert criteria show that the developed analyzer increases the adjustability of the pictograms to the context and the semantics, reducing the incoherence and semantic ambiguity of the future system.

12.
Rev. cuba. inform. méd ; 15(2)dic. 2023.
Article in Spanish | LILACS-Express | LILACS | ID: biblio-1536297

ABSTRACT

The objective of this study was to describe the perceptions of Facebook users who commented on posts made by the official account of the Ministry of Health of Peru (MINSA) regarding the HPV vaccination campaign. We analyzed 2748 comments in Python with natural language processing. With this process we obtained keywords that were then interpreted manually. We mainly found four types of discourse among them: a) support for the post about the HPV vaccine; b) rejection of the HPV vaccine; c) HPV vaccine in children; d) doubts about the HPV vaccine. For the most part, users who expressed a position against this vaccine relied on links to online news stories that presented an event supposedly attributed to vaccination or immunization but lacked a reliable and/or verifiable source of information.

13.
Sci Justice ; 63(6): 689-723, 2023 Nov.
Article in English | MEDLINE | ID: mdl-38030340

ABSTRACT

Cocaine trafficking threatens countries' national security and is a major public health challenge. Cocaine is transported from producer countries to consumer markets using various routes, methods, and transportation means. These routes develop in the geographical environment, are carefully planned and are geo-strategic objects that respond to the opportunities that drug trafficking organisations (DTOs) find to reduce the risks of interdiction. In this sense, individual drug seizure data (IDS) become essential indicators for identifying trends and understanding trafficking flows associated with drug trafficking routes. However, due to the illicit nature of DTOs, the availability of these data is considerably limited, hindering the ability to analyse and identify trends. This study presents a methodology for collecting and processing data from open-source information reported by Brazil's federal government news website. Using geospatial intelligence and natural language processing methods, we created a dataset with 939 records and 44 variables related to cocaine seizures in Brazil in 2022. We applied geospatial analysis techniques from this dataset to identify trends and potential cocaine trafficking flows. The results were broadly consistent with existing literature on drug trafficking. They demonstrated the potential of open-source information for environmental scanning and knowledge generation through geographic information science. The approach proposed in our research provides tools that can be used to complement drug trafficking monitoring and formulate public policies to strengthen prevention and enforcement strategies.


Subject(s)
Cocaine , Drug Trafficking , Humans , Brazil , Natural Language Processing
14.
Healthc Inform Res ; 29(4): 286-300, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37964451

ABSTRACT

OBJECTIVES: A substantial portion of the data contained in Electronic Health Records (EHR) is unstructured, often appearing as free text. This format restricts its potential utility in clinical decision-making. Named entity recognition (NER) methods address the challenge of extracting pertinent information from unstructured text. The aim of this study was to outline the current NER methods and trace their evolution from 2011 to 2022. METHODS: We conducted a methodological literature review of NER methods, with a focus on distinguishing the classification models, the types of tagging systems, and the languages employed in various corpora. RESULTS: Several methods have been documented for automatically extracting relevant information from EHRs using natural language processing techniques such as NER and relation extraction (RE). These methods can automatically extract concepts, events, attributes, and other data, as well as the relationships between them. Most NER studies conducted thus far have utilized corpora in English or Chinese. Additionally, Bidirectional Encoder Representations from Transformers (BERT) combined with the BIO tagging scheme is the most frequently reported classification approach. We discovered a limited number of papers on the implementation of NER or RE tasks in EHRs within a specific clinical domain. CONCLUSIONS: EHRs play a pivotal role in gathering clinical information and could serve as the primary source for automated clinical decision support systems. However, the creation of new corpora from EHRs in specific clinical domains is essential to facilitate the swift development of NER and RE models applied to EHRs for use in clinical practice.
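
In the BIO tagging scheme mentioned in this abstract, each token is tagged B-TYPE (beginning of an entity), I-TYPE (inside the same entity), or O (outside any entity). The sketch below shows how such tags can be decoded back into entities. The sentence, the label names DISEASE and DRUG, and the policy of dropping stray I- tags are illustrative assumptions, not taken from the reviewed papers.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, text) pairs from parallel token and BIO tag lists."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always opens a new entity, closing any open one.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag of the same type continues the open entity.
            current_tokens.append(token)
        else:
            # "O" closes any open entity; an I- tag with no matching
            # open entity is treated the same way and its token is dropped.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans

# Hypothetical tagged sentence.
tokens = ["Patient", "with", "type", "2", "diabetes", "takes", "metformin", "."]
tags = ["O", "O", "B-DISEASE", "I-DISEASE", "I-DISEASE", "O", "B-DRUG", "O"]
entities = bio_to_spans(tokens, tags)
print(entities)
```

A token classifier such as a fine-tuned BERT model would produce the tag sequence; this decoding step is independent of which model assigned the tags.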

15.
Data Brief ; 51: 109720, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37965606

ABSTRACT

The COVID-19 pandemic has underlined the need for reliable information for clinical decision-making and public health policies. As such, evidence-based medicine (EBM) is essential in identifying and evaluating scientific documents pertinent to novel diseases, and the accurate classification of biomedical text is integral to this process. Given this context, we introduce a comprehensive, curated dataset composed of COVID-19-related documents. This dataset includes 20,047 labeled documents that were meticulously classified into five distinct categories: systematic reviews (SR), primary study randomized controlled trials (PS-RCT), primary study non-randomized controlled trials (PS-NRCT), broad synthesis (BS), and excluded (EXC). The documents, labeled by collaborators from the Epistemonikos Foundation, incorporate information such as document type, title, abstract, and metadata, including PubMed id, authors, journal, and publication date. Uniquely, this dataset has been curated by the Epistemonikos Foundation and is not readily accessible through conventional web-scraping methods, thereby attesting to its distinctive value in this field of research. In addition to this, the dataset also includes a vast evidence repository comprising 427,870 non-COVID-19 documents, also categorized into SR, PS-RCT, PS-NRCT, BS, and EXC. This additional collection can serve as a valuable benchmark for subsequent research. The comprehensive nature of this open-access dataset and its accompanying resources is poised to significantly advance evidence-based medicine and facilitate further research in the domain.

16.
Bioengineering (Basel) ; 10(9)2023 Sep 19.
Article in English | MEDLINE | ID: mdl-37760200

ABSTRACT

The automatic generation of descriptions for medical images has sparked increasing interest in the healthcare field due to its potential to assist professionals in the interpretation and analysis of clinical exams. This study explores the development and evaluation of a generalist generative model for medical images. Gaps were identified in the literature, such as the lack of studies that explore the performance of specific models for medical description generation and the need for objective evaluation of the quality of generated descriptions. Additionally, existing models generalize poorly to different image modalities and medical conditions. To address these issues, a methodological strategy was adopted that combines natural language processing with feature extraction from medical images and feeds the result into a generative model based on neural networks. The goal was to achieve model generalization across various image modalities and medical conditions. The results were promising, with an accuracy of 0.7628 and a BLEU-1 score of 0.5387. However, the quality of the generated descriptions may still be limited, exhibiting semantic errors or lacking relevant details. These limitations could be attributed to the availability and representativeness of the data, as well as the techniques used.
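
For readers unfamiliar with the BLEU-1 score reported above, the sketch below computes it in its simplest form: clipped unigram precision multiplied by a brevity penalty. It assumes a single reference description and whitespace tokenization, and the example sentences are invented. The abstract does not say how the authors computed their score, so this is not their evaluation pipeline.

```python
import math
from collections import Counter


def bleu1(candidate, reference):
    """BLEU-1 for one candidate against one reference (whitespace tokens)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    cand_counts = Counter(cand)
    ref_counts = Counter(ref)
    # Clip each candidate word count by its count in the reference,
    # so repeating a correct word does not inflate the score.
    clipped = sum(min(count, ref_counts[word]) for word, count in cand_counts.items())
    precision = clipped / len(cand)
    # Brevity penalty: candidates shorter than the reference are penalized.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * precision


# Hypothetical generated description and reference description.
score = bleu1("no acute fracture is seen", "no acute fracture is identified")
print(round(score, 4))
```

Because BLEU-1 only counts word overlap, a description can score well and still contain the semantic errors the abstract mentions.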

17.
medRxiv ; 2023 Dec 04.
Article in English | MEDLINE | ID: mdl-37693571

ABSTRACT

Background: Atopic dermatitis (AD) is a chronic skin condition that millions of people around the world live with each day. Research into the causes and treatment of this disease has great potential to benefit these individuals. However, AD clinical trial recruitment is a non-trivial task because diagnostic precision and phenotypic definitions vary among clinicians, and because clinicians must spend time finding, recruiting, and enrolling patients as study subjects. Thus, there is a need for automatic and effective patient phenotyping for cohort recruitment. Objective: Our study aims to present an approach for identifying patients whose electronic health records suggest that they may have AD. Methods: We created a vectorized representation of each patient and trained various supervised machine learning methods to classify whether a patient has AD. Each patient is represented by a vector of either probabilities or binary values, where each value indicates whether the patient meets a different criterion for AD diagnosis. Results: The most accurate AD classifier, built with XGBoost (Extreme Gradient Boosting), achieved a class-balanced accuracy of 0.8036, a precision of 0.8400, and a recall of 0.7500. Conclusions: Creating an automated approach for identifying patient cohorts has the potential to accelerate, standardize, and automate the process of patient recruitment for AD studies, thereby reducing clinician burden and informing knowledge discovery of better treatment options for AD.
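
The three metrics reported above can be related through a confusion matrix. The sketch below assumes that "class-balanced accuracy" means the mean of recall (sensitivity) and specificity, which is the usual definition but is not stated in the abstract. The confusion-matrix counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, specificity, and balanced accuracy from counts."""
    recall = tp / (tp + fn)            # share of true AD patients found
    specificity = tn / (tn + fp)       # share of non-AD patients correctly rejected
    precision = tp / (tp + fp)         # share of predicted AD patients who have AD
    balanced_accuracy = (recall + specificity) / 2
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=30, fp=5, tn=45, fn=10)
print(metrics)

# Under the same assumed definition, the reported balanced accuracy (0.8036)
# and recall (0.7500) would imply a specificity of about 0.8572.
implied_specificity = 2 * 0.8036 - 0.7500
print(round(implied_specificity, 4))
```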

18.
Rev Invest Clin ; 75(6): 309-317, 2023 12 18.
Article in English | MEDLINE | ID: mdl-37734067

ABSTRACT

Artificial intelligence (AI) generative models driven by the integration of AI and natural language processing technologies, such as OpenAI's chatbot generative pre-trained transformer large language model (LLM), are receiving much public attention and have the potential to transform personalized medicine. Dialysis patients are highly dependent on technology, and their treatment generates a challengingly large volume of data that has to be analyzed for knowledge extraction. We argue that, by integrating the data acquired from hemodialysis treatments with the powerful conversational capabilities of LLMs, nephrologists could personalize treatments adapted to patients' lifestyles and preferences. We also argue that this new conversational AI, integrated with a personalized patient-computer interface, will enhance patients' engagement and self-care by providing them with a more personalized experience. However, generative AI models require continuous and accurate data updates and expert supervision, and they must address potential biases and limitations. Dialysis patients can also benefit from other emerging technologies such as Digital Twins, with which patients' care can also be addressed from a personalized medicine perspective. In this paper, we review the potential strengths of LLMs in terms of their contribution to personalized medicine and, in particular, their potential impact and limitations in nephrology. Nephrologists' collaboration with AI academia and companies to develop algorithms and models that are more transparent, understandable, and trustworthy will be crucial for the next generation of dialysis patients. The combination of technology, patient-specific data, and AI should contribute to creating a more personalized and interactive dialysis process, improving patients' quality of life.


Subject(s)
Artificial Intelligence , Quality of Life , Humans , Algorithms , Software , Renal Dialysis
19.
Diagnostics (Basel) ; 13(13)2023 Jun 25.
Article in English | MEDLINE | ID: mdl-37443557

ABSTRACT

Parkinson's disease (PD) is the second most prevalent neurodegenerative disorder in the world, and it is characterized by the production of different motor and non-motor symptoms which negatively affect speech and language production. For decades, the research community has been working on methodologies to automatically model these biomarkers to detect and monitor the disease; however, although speech impairments have been widely explored, language remains underexplored despite being a valuable source of information, especially to assess cognitive impairments associated with non-motor symptoms. This study proposes the automatic assessment of PD patients using different methodologies to model speech and language biomarkers. One-dimensional and two-dimensional convolutional neural networks (CNNs), along with pre-trained models such as Wav2Vec 2.0, BERT, and BETO, were considered to classify PD patients vs. Healthy Control (HC) subjects. The first approach consisted of modeling speech and language independently. Then, the best representations from each modality were combined following early, joint, and late fusion strategies. The results show that the speech modality yielded an accuracy of up to 88%, thus outperforming all language representations, including the multi-modal approach. These results suggest that speech representations better discriminate PD patients and HC subjects than language representations. When analyzing the fusion strategies, we observed that changes in the time span of the multi-modal representation could produce a significant loss of information in the speech modality, which was likely linked to a decrease in accuracy in the multi-modal experiments. Further experiments are necessary to validate this claim with other fusion methods using different time spans.
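
To make the difference between the fusion strategies named above concrete, the sketch below contrasts early fusion (concatenating the speech and language representations before classification) with late fusion (combining the outputs of two separately trained classifiers). The function names, feature vectors, probabilities, and equal weighting are hypothetical. Joint fusion, in which both encoders are trained together, is not shown because it requires a trainable model.

```python
def early_fusion(speech_features, text_features):
    """Early fusion: concatenate both modalities into one input vector."""
    return list(speech_features) + list(text_features)


def late_fusion(p_speech, p_text, speech_weight=0.5):
    """Late fusion: weighted average of per-modality PD probabilities."""
    return speech_weight * p_speech + (1 - speech_weight) * p_text


# Hypothetical embeddings for one recording and its transcript.
speech_vec = [0.2, 0.7, 0.1]
text_vec = [0.9, 0.4]
fused_vec = early_fusion(speech_vec, text_vec)   # a single classifier sees this vector

# Hypothetical probabilities from two independently trained classifiers.
p_pd = late_fusion(p_speech=0.8, p_text=0.6)
label = "PD" if p_pd >= 0.5 else "HC"
print(fused_vec, round(p_pd, 2), label)
```

Plain concatenation presumes that both vectors describe the same stretch of data. When the modalities cover different time spans they must first be aligned or pooled, which is the step the abstract links to a loss of information in the speech modality.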

20.
BMC Bioinformatics ; 24(1): 242, 2023 Jun 08.
Article in English | MEDLINE | ID: mdl-37291492

ABSTRACT

BACKGROUND: Although the development of sequencing technologies has provided a large number of protein sequences, analyzing the function of each one is still difficult because of the effort that laboratory methods require, which makes computational methods necessary to close this gap. As the main source of information available about proteins is their sequences, approaches that can use this information, such as classification based on amino acid patterns and inference based on sequence similarity using alignment tools, are able to predict functions for a large collection of proteins. The methods available in the literature that use this type of feature can achieve good results; however, they restrict the length of the proteins that can be given as input to their models. In this work, we present a new method, called TEMPROT, based on the fine-tuning and extraction of embeddings from an available architecture pre-trained on protein sequences. We also describe TEMPROT+, an ensemble of TEMPROT and BLASTp, a local alignment tool that analyzes sequence similarity, which improves the results of our former approach. RESULTS: Our proposed classifiers were evaluated against the literature approaches on our dataset, which was derived from the CAFA3 challenge database. Both TEMPROT and TEMPROT+ achieved competitive results on the [Formula: see text], [Formula: see text], AuPRC and IAuPRC metrics on the Biological Process (BP), Cellular Component (CC) and Molecular Function (MF) ontologies compared to state-of-the-art models, with the main results equal to 0.581, 0.692 and 0.662 of [Formula: see text] on BP, CC and MF, respectively. CONCLUSIONS: The comparison with the literature showed that our model presented competitive results compared with the state-of-the-art approaches considering amino acid sequence pattern recognition and homology analysis. Our model also presented improvements in the input size that it can use for training compared to the literature methods.


Subject(s)
Amino Acids , Proteins , Proteins/chemistry , Molecular Sequence Annotation , Amino Acid Sequence , Amines