Results 1 - 20 of 560
1.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37068306

ABSTRACT

Determining the interacting proteins in multiprotein complexes can be technically challenging. An emerging biochemical approach to this end is based on the 'thermal proximity co-aggregation' (TPCA) phenomenon. Accordingly, when two or more proteins interact to form a complex, they tend to co-aggregate when subjected to heat-induced denaturation and thus exhibit similar melting curves. Here, we explore the potential of leveraging TPCA for determining protein interactions. We demonstrate that dissimilarity measure-based information retrieval applied to melting curves tends to rank a protein-of-interest's interactors higher than its non-interactors, as shown in the context of pull-down assay results. Consequently, such rankings can reduce the number of confirmatory biochemical experiments needed to find bona fide protein-protein interactions. In general, rankings based on dissimilarity measures generated through metric learning further reduce the required number of experiments compared to those based on standard dissimilarity measures such as Euclidean distance. When a protein mixture's melting curves are obtained in two conditions, we propose a scoring function that uses melting curve data to inform how likely a protein pair is to interact in one condition but not another. We show that ranking protein pairs by their scores is an effective approach for determining condition-specific protein-protein interactions. By contrast, clustering melting curve data generally does not inform about the interacting proteins in multiprotein complexes. In conclusion, we report improved methods for dissimilarity measure-based computation of melting curves data that can greatly enhance the determination of interacting proteins in multiprotein complexes.
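
The ranking idea in this abstract can be illustrated with a minimal sketch. It assumes each protein's melting curve is a list of soluble fractions measured on a shared temperature grid, and it uses plain Euclidean distance, the standard baseline the abstract mentions, not the learned metric. The protein names and curve values are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two melting curves on the same temperature grid."""
    if len(a) != len(b):
        raise ValueError("curves must share the same temperature grid")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_candidates(curves, bait):
    """Return (name, distance) pairs sorted by increasing distance to the bait curve."""
    target = curves[bait]
    others = [(name, euclidean(target, curve))
              for name, curve in curves.items() if name != bait]
    return sorted(others, key=lambda item: item[1])

# Hypothetical soluble fractions at five increasing temperatures.
curves = {
    "BAIT": [1.00, 0.90, 0.60, 0.20, 0.05],
    "P1":   [1.00, 0.88, 0.62, 0.22, 0.05],
    "P2":   [1.00, 0.99, 0.95, 0.80, 0.40],
    "P3":   [1.00, 0.70, 0.30, 0.10, 0.02],
}

ranking = rank_candidates(curves, "BAIT")
for name, distance in ranking:
    print(f"{name}: {distance:.3f}")
```

Candidates near the top of such a ranking would be tested first in confirmatory experiments. A learned dissimilarity measure could replace `euclidean` without changing the rest of the sketch.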


Subject(s)
Multiprotein Complexes , Proteins
2.
Bioinformatics ; 2024 Aug 22.
Article in English | MEDLINE | ID: mdl-39171832

ABSTRACT

MOTIVATION: Integrating information from data sources representing different study designs has the potential to strengthen evidence in population health research. However, this concept of evidence "triangulation" presents a number of challenges for systematically identifying and integrating relevant information. These include the harmonization of heterogeneous evidence with common semantic concepts and properties, as well as the prioritization of the retrieved evidence for triangulation with the question of interest. RESULTS: We present ASQ (Annotated Semantic Queries), a natural language query interface to the integrated biomedical entities and epidemiological evidence in EpiGraphDB, which enables users to extract "claims" from a piece of unstructured text, and then investigate the evidence that could support or contradict the claims, or offer additional information to the query. This approach has the potential to support the rapid review of preprints, grant applications, conference abstracts and articles submitted for peer review. ASQ implements strategies to harmonize biomedical entities in different taxonomies and evidence from different sources, to facilitate evidence triangulation and interpretation. AVAILABILITY AND IMPLEMENTATION: ASQ is openly available at https://asq.epigraphdb.org and its source code is available at https://github.com/mrcieu/epigraphdb-asq under the GPL-3.0 license. SUPPLEMENTARY INFORMATION: Further information can be found in the Supplementary Materials as well as on the ASQ platform via https://asq.epigraphdb.org/docs.

3.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35596953

ABSTRACT

Coronavirus disease 2019 (COVID-19) has infected hundreds of millions of people and killed millions of them. As an RNA virus, COVID-19 is more susceptible to variation than other viruses. Many problems involved in this epidemic have made biosafety and biosecurity (hereafter collectively referred to as 'biosafety') a popular and timely topic globally. Biosafety research covers a broad and diverse range of topics, and it is important to quickly identify hotspots and trends in biosafety research through big data analysis. However, the data-driven literature on biosafety research discovery is quite scant. We developed a novel topic model based on latent Dirichlet allocation, affinity propagation clustering and the PageRank algorithm (LDAPR) to extract knowledge from biosafety research publications from 2011 to 2020. Then, we conducted hotspot and trend analysis with LDAPR and carried out further studies, including annual hot topic extraction, a 10-year keyword evolution trend analysis, topic map construction, hot region discovery and fine-grained correlation analysis of interdisciplinary research topic trends. These analyses revealed valuable information that can guide epidemic prevention work: (1) the research enthusiasm over a certain infectious disease not only is related to its epidemic characteristics but also is affected by the progress of research on other diseases, and (2) infectious diseases are not only strongly related to their corresponding microorganisms but also potentially related to other specific microorganisms. The detailed experimental results and our code are available at https://github.com/KEAML-JLU/Biosafety-analysis.
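
One of the three components named in this abstract, PageRank, can be sketched on its own as a generic power iteration. This is not the LDAPR pipeline: the abstract does not say how the graph is built, so the topic graph below and its edges are hypothetical, and the damping factor of 0.85 is the conventional default, not a value taken from the paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of outgoing nodes."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank uniformly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical topic graph: an edge means "topic A refers to topic B".
topic_links = {
    "vaccines": ["coronavirus"],
    "biosecurity": ["coronavirus"],
    "coronavirus": ["vaccines"],
}

topic_rank = pagerank(topic_links)
hottest = max(topic_rank, key=topic_rank.get)
print(hottest, round(topic_rank[hottest], 3))
```

In a hotspot analysis, nodes with the highest scores would be candidate hot topics. How edges and weights are defined is the part that depends on the actual model.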


Subject(s)
COVID-19 , Biosecurity , COVID-19/epidemiology , Containment of Biohazards/methods , Humans , Machine Learning , RNA
4.
BMC Med Imaging ; 24(1): 86, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38600525

ABSTRACT

Medical imaging AI systems and big data analytics have attracted much attention from researchers in industry and academia. The application of medical imaging AI systems and big data analytics plays an important role in the development of content-based remote sensing (CBRS) technology. Environmental data, information, and analysis have been produced promptly using remote sensing (RS). The method for creating a useful digital map from an image data set is called image information extraction. Image information extraction depends on target recognition (shape and color). For low-level image attributes like texture, Classifier-based Retrieval (CR) techniques are ineffective since they categorize the input images and only return images from the determined classes of RS. The issues mentioned earlier cannot be handled by the existing expertise based on a keyword/metadata remote sensing data service model. To overcome these restrictions, Fuzzy Class Membership-based Image Extraction (FCMIE), a technology developed for Content-Based Remote Sensing (CBRS), is suggested. The compensation fuzzy neural network (CFNN) is used to calculate the category label and fuzzy category membership of the query image, and a basic, balanced weighted distance metric is used. Feature information extraction (FIE) enhances remote sensing image processing and autonomous information retrieval of visual content based on time-frequency meaning, such as color, texture and shape attributes of images. A hierarchical nested structure and a cyclic similarity measure produce faster queries when searching. The experiment's findings indicate that the proposed model achieves more favorable outcomes than the existing CR model on assessment measures including Ratio of Coverage, average means precision, recall, and retrieval efficiency.
In the areas of feature tracking, climate forecasting, background noise reduction, and simulating nonlinear functional behaviors, CFNN has a wide range of RS applications. The proposed method CFNN-FCMIE achieves a minimum range of 4-5% for all three feature vectors, sample mean and comparison precision-recall ratio, which gives better results than the existing classifier-based retrieval model. This work provides an important reference for medical imaging artificial intelligence systems and big data analysis.


Subject(s)
Artificial Intelligence , Remote Sensing Technology , Humans , Data Science , Information Storage and Retrieval , Neural Networks, Computer
5.
J Med Internet Res ; 26: e58764, 2024 Jul 31.
Article in English | MEDLINE | ID: mdl-39083765

ABSTRACT

Evidence-based medicine (EBM), which emphasizes the integration of the best research evidence with clinical expertise and patient values, emerged from McMaster University in the 1980s and 1990s. The Health Information Research Unit (HiRU) was created at McMaster University in 1985 to support EBM. Early on, digital health informatics took the form of teaching clinicians how to search MEDLINE with modems and phone lines. Searching and retrieval of published articles were transformed as electronic platforms provided greater access to clinically relevant studies, systematic reviews, and clinical practice guidelines, with PubMed playing a pivotal role. In the early 2000s, the HiRU introduced Clinical Queries (validated search filters derived from the curated, gold-standard, human-appraised Hedges dataset) to enhance the precision of searches, allowing clinicians to hone their queries based on study design, population, and outcomes. Currently, almost 1 million articles are added to PubMed annually. To filter through this volume of heterogeneous publications for clinically important articles, the HiRU team and other researchers have been applying classical machine learning, deep learning, and, increasingly, large language models (LLMs). These approaches are built upon the foundation of gold-standard annotated datasets and humans in the loop for active machine learning. In this viewpoint, we explore the evolution of health informatics in supporting evidence search and retrieval processes over the past 25+ years within the HiRU, including the evolving roles of LLMs and responsible artificial intelligence, as we continue to facilitate the dissemination of knowledge, enabling clinicians to integrate the best available evidence into their clinical practice.


Subject(s)
Evidence-Based Medicine , Medical Informatics , Medical Informatics/methods , Medical Informatics/trends , Humans , History, 20th Century , History, 21st Century , Machine Learning
6.
Sensors (Basel) ; 24(7)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38610412

ABSTRACT

Classical machine learning techniques have dominated Music Emotion Recognition. However, improvements have slowed down due to the complex and time-consuming task of handcrafting new emotionally relevant audio features. Deep learning methods have recently gained popularity in the field because of their ability to automatically learn relevant features from spectral representations of songs, eliminating such necessity. Nonetheless, there are limitations, such as the need for large amounts of quality labeled data, a common problem in MER research. To understand the effectiveness of these techniques, a comparison study using various classical machine learning and deep learning methods was conducted. The results showed that using an ensemble of a Dense Neural Network and a Convolutional Neural Network architecture resulted in a state-of-the-art 80.20% F1 score, an improvement of around 5% considering the best baseline results, concluding that future research should take advantage of both paradigms, that is, combining handcrafted features with feature learning.


Subject(s)
Deep Learning , Music , Data Accuracy , Emotions , Machine Learning
7.
Sensors (Basel) ; 24(11)2024 May 22.
Article in English | MEDLINE | ID: mdl-38894095

ABSTRACT

The revolution of the Internet of Things (IoT) and the Web of Things (WoT) has brought new opportunities and challenges for the information retrieval (IR) field. The exponential number of interconnected physical objects and real-time data acquisition requires new approaches and architectures for IR systems. Research and prototypes can be crucial in designing and developing new systems and refining architectures for IR in the WoT. This paper proposes a unified and holistic approach for IR in the WoT, called IR.WoT. The proposed system contemplates the critical indexing, scoring, and presentation stages applied to some smart cities' use cases and scenarios. Overall, this paper describes the research, architecture, and vision for advancing the field of IR in the WoT and addresses some of the remaining challenges and opportunities in this exciting area. The article also describes the design considerations, cloud implementation, and experimentation based on a simulated collection of synthetic XML documents with technical efficiency measures. The experimentation results show promising outcomes, whereas further studies are required to improve IR.WoT effectiveness, considering the WoT dynamic characteristics and, more importantly, the heterogeneity and divergence of WoT modeling proposals in the IR domain.

8.
J Med Libr Assoc ; 112(1): 13-21, 2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38911524

ABSTRACT

Objective: To evaluate the ability of DynaMedex, an evidence-based drug and disease Point of Care Information (POCI) resource, in answering clinical queries using keyword searches. Methods: Real-world disease-related questions compiled from clinicians at an academic medical center, DynaMedex search query data, and medical board review resources were categorized into five clinical categories (complications & prognosis, diagnosis & clinical presentation, epidemiology, prevention & screening/monitoring, and treatment) and six specialties (cardiology, endocrinology, hematology-oncology, infectious disease, internal medicine, and neurology). A total of 265 disease-related questions were evaluated by pharmacist reviewers based on if an answer was found (yes, no), whether the answer was relevant (yes, no), difficulty in finding the answer (easy, not easy), cited best evidence available (yes, no), clinical practice guidelines included (yes, no), and level of detail provided (detailed, limited details). Results: An answer was found for 259/265 questions (98%). Both reviewers found an answer for 241 questions (91%), neither found the answer for 6 questions (2%), and only one reviewer found an answer for 18 questions (7%). Both reviewers found a relevant answer 97% of the time when an answer was found. Of all relevant answers found, 68% were easy to find, 97% cited best quality of evidence available, 72% included clinical guidelines, and 95% were detailed. Recommendations for areas of resource improvement were identified. Conclusions: The resource enabled reviewers to answer most questions easily with the best quality of evidence available, providing detailed answers and clinical guidelines, with a high level of replication of results across users.
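
The percentages reported in the Results can be cross-checked from the stated counts. The short sketch below uses only numbers given in the abstract, and the variable names are illustrative.

```python
total_questions = 265
both_found = 241      # both reviewers found an answer
one_found = 18        # only one reviewer found an answer
neither_found = 6     # neither reviewer found an answer

# The three outcomes should partition the full question set.
assert both_found + one_found + neither_found == total_questions

def pct(part, whole):
    """Percentage rounded to the nearest whole number, as reported in the abstract."""
    return round(100 * part / whole)

any_found = both_found + one_found  # at least one reviewer found an answer

summary = {
    "answer found by at least one reviewer": pct(any_found, total_questions),
    "answer found by both reviewers": pct(both_found, total_questions),
    "answer found by neither reviewer": pct(neither_found, total_questions),
    "answer found by only one reviewer": pct(one_found, total_questions),
}

for label, value in summary.items():
    print(f"{label}: {value}%")
```

The 259 questions with an answer correspond to the 241 found by both reviewers plus the 18 found by only one.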


Subject(s)
Point-of-Care Systems , Humans , Evidence-Based Medicine
9.
Med Ref Serv Q ; 43(1): 15-25, 2024.
Article in English | MEDLINE | ID: mdl-38237019

ABSTRACT

This study sought to provide a protocol for searching complex medical cases presented at grand rounds. A clinical informationist was embedded in gastroenterology grand rounds to use comprehensive search strategies and summarize patients' information through concept mapping. The proposed protocol comprises three parts: (1) a general search strategy, (2) a protocol for searching for evidence about rare diseases, and (3) identification of resources beyond routine medical databases. Unlike previous studies, which focused on usual ward rounds, this novel approach facilitates evidence-based decision-making by providing a simplified, comprehensive summary view of complex medical cases.


Subject(s)
Data Management , Hospitals
10.
Health Info Libr J ; 41(1): 76-83, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37574776

ABSTRACT

BACKGROUND: Latin American and Caribbean Health Sciences Literature (LILACS) is the main reference database in the region; however, the way in which this resource is used in Cochrane systematic reviews has not been studied. OBJECTIVES: To assess the search methods of Cochrane reviews that used LILACS as a source of information and explore the Cochrane community's perceptions about this resource. METHODS: We identified all Cochrane reviews of interventions published during 2019, which included LILACS as a source of information, and analysed their search methods and also ran a survey through the Cochrane Community. RESULTS: We found 133 Cochrane reviews that reported the full search strategies, identifying heterogeneity in search details. The respondents to our survey highlighted many areas for improvement in the use of LILACS, including the usability of the search platform for this purpose. DISCUSSION: The use and reporting of LILACS in Cochrane reviews demonstrate inconsistencies, as evidenced by the analysis of search reports from systematic reviews and surveys conducted among members of the Cochrane community. CONCLUSION: With better guidance on how LILACS database is structured, information specialists working on Cochrane reviews should be able to make more effective use of this unique resource.


Subject(s)
Information Services , Medicine , Humans , Publications , Surveys and Questionnaires
11.
Behav Res Methods ; 56(4): 3560-3577, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38286947

ABSTRACT

Selecting appropriate musical stimuli to induce specific emotions represents a recurring challenge in music and emotion research. Most existing stimuli have been categorized according to taxonomies derived from general emotion models (e.g., basic emotions, affective circumplex), have been rated for perceived emotions, and are rarely defined in terms of interrater agreement. To redress these limitations, we present research that served in the development of a new interactive online database, including an initial set of 364 music excerpts from three different genres (classical, pop, and hip/hop) that were rated for felt emotion using the Geneva Emotion Music Scale (GEMS), a music-specific emotion scale. The sample comprised 517 English- and German-speaking participants and each excerpt was rated by an average of 28.76 participants (SD = 7.99). Data analyses focused on research questions that are of particular relevance for musical database development, notably the number of raters required to obtain stable estimates of emotional effects of music and the adequacy of the GEMS as a tool for describing music-evoked emotions across three prominent music genres. Overall, our findings suggest that 10-20 raters are sufficient to obtain stable estimates of emotional effects of music excerpts in most cases, and that the GEMS shows promise as a valid and comprehensive annotation tool for music databases.


Subject(s)
Databases, Factual , Emotions , Music , Humans , Music/psychology , Emotions/physiology , Female , Male , Adult , Young Adult , Adolescent , Middle Aged , Acoustic Stimulation/methods , Internet
12.
Entropy (Basel) ; 26(3)2024 Mar 10.
Article in English | MEDLINE | ID: mdl-38539757

ABSTRACT

We introduce the problem of deceptive information retrieval (DIR), in which a user wishes to download a required file out of multiple independent files stored in a system of databases while deceiving the databases by making the databases' predictions on the user-required file index incorrect with high probability. Conceptually, DIR is an extension of private information retrieval (PIR). In PIR, a user downloads a required file without revealing its index to any of the databases. The metric of deception is defined as the probability of error of the databases' prediction on the user-required file, minus the corresponding probability of error in PIR. The problem is defined on time-sensitive data that keep updating from time to time. In the proposed scheme, the user deceives the databases by sending real queries to download the required file at the time of the requirement and dummy queries at multiple distinct future time instances. This manipulates the probabilities of sending each query for each file requirement, which the databases use to make their predictions on the user-required file index. The proposed DIR scheme is based on a capacity-achieving probabilistic PIR scheme, and achieves rates lower than the PIR capacity due to the additional downloads made to deceive the databases. When the required level of deception is zero, the proposed scheme achieves the PIR capacity.
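
For readers unfamiliar with PIR, the baseline that DIR extends, the following sketch shows the classical two-server XOR construction. It is not the probabilistic capacity-achieving scheme or the deception mechanism of the paper. It assumes two non-colluding databases holding identical copies of equal-length files, and the file contents are hypothetical.

```python
import secrets

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def server_answer(files, query):
    """A database XORs together the files whose indices appear in the query."""
    answer = bytes(len(files[0]))  # all-zero block of the file length
    for index in sorted(query):
        answer = xor_bytes(answer, files[index])
    return answer

def make_queries(num_files, wanted):
    """Build one query per database; each one alone is a uniformly random subset."""
    subset = {i for i in range(num_files) if secrets.randbits(1)}
    # The second query differs from the first only in the wanted index.
    return subset, subset ^ {wanted}

# Hypothetical replicated database of four equal-length files.
files = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]

query_1, query_2 = make_queries(len(files), wanted=2)
recovered = xor_bytes(server_answer(files, query_1),
                      server_answer(files, query_2))
print(recovered)
```

Every file other than the wanted one appears in both answers or in neither, so it cancels under XOR. Each database sees only a random subset and learns nothing about the index on its own. This simple construction downloads two file-sized answers per file, so it is not rate-optimal.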

13.
BMC Bioinformatics ; 24(1): 3, 2023 Jan 03.
Article in English | MEDLINE | ID: mdl-36597033

ABSTRACT

PURPOSE: This manuscript proposes a hybrid algorithm combining an improved BM25 algorithm, k-means clustering, and the BioBERT model to better retrieve biomedical articles from the PubMed database, so that more of the retrieved articles contain information closely related to a query about a specific disease. DESIGN/METHODOLOGY/APPROACH: In the paper, a two-stage information retrieval method is proposed to conduct an improved Text-Rank algorithm. The first stage employs the improved BM25 algorithm to assign scores to biomedical articles in the database and identify the 1000 publications with the highest scores. The second stage employs a method called cluster-based abstract extraction to reduce the number of article abstracts to match the input constraints of the BioBERT model, and then the BioBERT-based document similarity matching method is used to obtain the most similar search outcomes between the document and the retrieved morphemes. For reproducibility, the code is available at https://github.com/zzc1991/TREC_Precision_Medicine_Track. FINDINGS: The experimental study uses the TREC2017 and TREC2018 data sets to train the proposed model, and the TREC2019 data as a validation set, confirming the effectiveness and practicability of the proposed algorithm, which could be implemented for clinical decision support in precision medicine with a generalizability feature. ORIGINALITY/VALUE: This research integrates multiple machine learning and text processing methods to devise a hybrid method applicable to domains of specific medical literature retrieval. The proposed algorithm provides a 3% increase in P@10 over the state-of-the-art algorithm in TREC 2019.
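
The first-stage scoring step can be sketched with the standard Okapi BM25 formula. This is not the improved variant the abstract refers to, whose modifications are not described here. The whitespace tokenizer, the parameters `k1=1.2` and `b=0.75`, and the three-document corpus are illustrative assumptions. In the paper the cutoff is the 1000 highest-scoring publications.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document against the query with standard Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        scores.append(score)
    return scores

def top_k(query, docs, k):
    """Indices of the k highest-scoring documents, best first."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Hypothetical mini-corpus standing in for PubMed abstracts.
docs = [
    "melanoma BRAF mutation targeted therapy",
    "diabetes insulin therapy outcomes",
    "BRAF inhibitor response in melanoma patients",
]

first_stage = top_k("melanoma BRAF", docs, k=2)
print(first_stage)
```

The shortlisted documents would then be passed to a second-stage re-ranker. The length normalization explains why the shorter of the two matching documents scores higher here.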


Subject(s)
Decision Support Systems, Clinical , Precision Medicine , Reproducibility of Results , Algorithms , Machine Learning
14.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33866351

ABSTRACT

In this letter, we explain how intuitive and explainable methods inspired by human physiology and computational biology can serve to simplify and improve the way we process and generate knowledge resources.


Subject(s)
Artificial Intelligence , Computational Biology , Algorithms , Humans
15.
Brief Bioinform ; 22(2): 781-799, 2021 03 22.
Article in English | MEDLINE | ID: mdl-33279995

ABSTRACT

More than 50 000 papers have been published about COVID-19 since the beginning of 2020 and several hundred new papers continue to be published every day. This incredible rate of scientific productivity leads to information overload, making it difficult for researchers, clinicians and public health officials to keep up with the latest findings. Automated text mining techniques for searching, reading and summarizing papers are helpful for addressing information overload. In this review, we describe the many resources that have been introduced to support text mining applications over the COVID-19 literature; specifically, we discuss the corpora, modeling resources, systems and shared tasks that have been introduced for COVID-19. We compile a list of 39 systems that provide functionality such as search, discovery, visualization and summarization over the COVID-19 literature. For each system, we provide a qualitative description and assessment of the system's performance, unique data or user interface features and modeling decisions. Many systems focus on search and discovery, though several systems provide novel features, such as the ability to summarize findings over multiple documents or linking between scientific articles and clinical trials. We also describe the public corpora, models and shared tasks that have been introduced to help reduce repeated effort among community members; some of these resources (especially shared tasks) can provide a basis for comparing the performance of different systems. Finally, we summarize promising results and open challenges for text mining the COVID-19 literature.


Subject(s)
COVID-19/epidemiology , Data Mining/methods , COVID-19/virology , Humans , SARS-CoV-2/isolation & purification
16.
J Biomed Inform ; 144: 104444, 2023 08.
Article in English | MEDLINE | ID: mdl-37451494

ABSTRACT

INTRODUCTION: Clinical trials (CTs) often fail due to inadequate patient recruitment. Finding eligible patients involves comparing the patient's information with the CT eligibility criteria. Automated patient matching offers the promise of improving the process, yet the main difficulties of CT retrieval lie in the semantic complexity of matching unstructured patient descriptions with semi-structured, multi-field CT documents and in capturing the meaning of negation coming from the eligibility criteria. OBJECTIVES: This paper tackles the challenges of CT retrieval by presenting an approach that addresses the patient-to-trials paradigm. Our approach involves two key components in a pipeline-based model: (i) a data enrichment technique for enhancing both queries and documents during the first retrieval stage, and (ii) a novel re-ranking schema that uses a Transformer network in a setup adapted to this task by leveraging the structure of the CT documents. METHODS: We use named entity recognition and negation detection in both patient description and the eligibility section of CTs. We further classify patient descriptions and CT eligibility criteria into current, past, and family medical conditions. This extracted information is used to boost the importance of disease and drug mentions in both query and index for lexical retrieval. Furthermore, we propose a two-step training schema for the Transformer network used to re-rank the results from the lexical retrieval. The first step focuses on matching patient information with the descriptive sections of trials, while the second step aims to determine eligibility by matching patient information with the criteria section. RESULTS: Our findings indicate that the inclusion criteria section of the CT has a great influence on the relevance score in lexical models, and that the enrichment techniques for queries and documents improve the retrieval of relevant trials. 
The re-ranking strategy, based on our training schema, consistently enhances CT retrieval and shows improved performance by 15% in terms of precision at retrieving eligible trials. CONCLUSION: The results of our experiments suggest the benefit of making use of extracted entities. Moreover, our proposed re-ranking schema shows promising effectiveness compared to larger neural models, even with limited training data. These findings offer valuable insights for improving methods for retrieval of clinical documents.
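
The abstract reports its gain in terms of precision at retrieving eligible trials. A minimal sketch of precision at k, comparing a first-stage ranking with a re-ranked list, is given below. The trial identifiers, the eligibility labels, and both rankings are hypothetical and do not reproduce the paper's data or its 15% figure.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    return sum(1 for item in top if item in relevant_ids) / k

# Hypothetical trials the patient is actually eligible for.
eligible = {"trial_c", "trial_e"}

# Hypothetical output of the lexical first stage and of the re-ranker.
lexical_ranking = ["trial_a", "trial_b", "trial_c", "trial_d", "trial_e"]
reranked = ["trial_c", "trial_e", "trial_a", "trial_b", "trial_d"]

p_lexical = precision_at_k(lexical_ranking, eligible, k=2)
p_reranked = precision_at_k(reranked, eligible, k=2)
print(p_lexical, p_reranked)
```

Re-ranking does not change which trials were retrieved, only their order, so any improvement in this metric comes from moving eligible trials toward the top.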


Subject(s)
Information Storage and Retrieval , Semantics , Humans
17.
Skin Res Technol ; 29(3): e13286, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36973976

ABSTRACT

BACKGROUND: Cutaneous malignant melanoma (MM) is potentially aggressive, and numerous clinically suspicious pigmented skin lesions are excised, causing unnecessary mutilation for patients at high healthcare costs, but without histopathological evidence of MM. The high number of excisions may be lowered by using more accurate diagnostics. Tape stripping (TS) of clinically suspicious lesions is a non-invasive diagnostic test of MM that can potentially lower the number needed to biopsy/excise. MATERIALS AND METHODS: The aim is to determine the diagnostic accuracy of TS in detecting MM in clinically suspicious pigmented skin lesions. This systematic review following PRISMA guidelines searched PubMed, Web of Science, and Embase (September 2022) using melanoma combined with tape stripping, adhesive patch(es), pigmented lesion assay, or epidermal genetic information retrieval. RESULTS: Ten studies were included. Sensitivity ranged from 68.8% (95% confidence interval [CI] 51.5, 82.1) to 100% (95% CI 91.0, 100). Specificity ranged from 69.1% (95% CI 63.8, 74.0) to 100% (95% CI 78.5, 100). A pooled analysis of five studies testing the RNA markers LINC00518 and PRAME found a sensitivity of 86.9% (95% CI 81.7, 90.8) and a specificity of 82.4% (95% CI 80.8, 83.9). CONCLUSION: Overall quality of studies was low, and the reliability of sensitivity and specificity is questionable. However, TS may supplement well-established diagnostic methods as pooled analysis of five studies indicates a moderate sensitivity. Future studies are needed to obtain more reliable data as independent studies with no conflict of interest.
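
Sensitivity, specificity, and their 95% confidence intervals, as reported above, can be computed from a 2x2 table as sketched below. The abstract does not state which interval method the included studies used, so the Wilson score interval here is an assumption. The counts are hypothetical and do not come from any included study.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half

def sensitivity(tp, fn):
    """Proportion of true melanomas that the test flags as positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of benign lesions that the test reports as negative."""
    return tn / (tn + fp)

# Hypothetical 2x2 counts for a tape-stripping test against histopathology.
tp, fn, tn, fp = 45, 5, 80, 20

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
sens_ci = wilson_interval(tp, tp + fn)
spec_ci = wilson_interval(tn, tn + fp)

print(f"sensitivity {sens:.1%} (95% CI {sens_ci[0]:.1%}, {sens_ci[1]:.1%})")
print(f"specificity {spec:.1%} (95% CI {spec_ci[0]:.1%}, {spec_ci[1]:.1%})")
```

Small studies give wide intervals, which is one reason the ranges reported across the ten included studies are so broad.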


Subject(s)
Biopsy , Melanoma , Skin Neoplasms , Surgical Tape , Humans , Antigens, Neoplasm/genetics , Biopsy/methods , Melanoma/pathology , Melanoma/surgery , Pigmentation Disorders/pathology , Reproducibility of Results , Sensitivity and Specificity , Skin Neoplasms/pathology , Skin Neoplasms/surgery , Melanoma, Cutaneous Malignant
18.
J Med Internet Res ; 25: e49771, 2023 12 14.
Article in English | MEDLINE | ID: mdl-38096014

ABSTRACT

BACKGROUND: The COVID-19 pandemic, caused by the SARS-CoV-2 virus, has necessitated reliable and authoritative information for public guidance. The World Health Organization (WHO) has been a primary source of such information, disseminating it through a question and answer format on its official website. Concurrently, ChatGPT 3.5 and 4.0, deep learning-based natural language generation systems, have shown potential in generating diverse text types based on user input. OBJECTIVE: This study evaluates the accuracy of COVID-19 information generated by ChatGPT 3.5 and 4.0, assessing their potential as supplementary public information sources during the pandemic. METHODS: We extracted 487 COVID-19-related questions from the WHO's official website and used ChatGPT 3.5 and 4.0 to generate corresponding answers. These generated answers were then compared against the official WHO responses for evaluation. Two clinical experts scored the generated answers on a scale of 0-5 across 4 dimensions (accuracy, comprehensiveness, relevance, and clarity), with higher scores indicating better performance in each dimension. The WHO responses served as the reference for this assessment. Additionally, we used the BERT (Bidirectional Encoder Representations from Transformers) model to generate similarity scores (0-1) between the generated and official answers, providing a dual validation mechanism. RESULTS: The mean (SD) scores for ChatGPT 3.5-generated answers were 3.47 (0.725) for accuracy, 3.89 (0.719) for comprehensiveness, 4.09 (0.787) for relevance, and 3.49 (0.809) for clarity. For ChatGPT 4.0, the mean (SD) scores were 4.15 (0.780), 4.47 (0.641), 4.56 (0.600), and 4.09 (0.698), respectively. All differences were statistically significant (P<.001), with ChatGPT 4.0 outperforming ChatGPT 3.5. The BERT model verification showed mean (SD) similarity scores of 0.83 (0.07) for ChatGPT 3.5 and 0.85 (0.07) for ChatGPT 4.0 compared with the official WHO answers. 
CONCLUSIONS: ChatGPT 3.5 and 4.0 can generate accurate and relevant COVID-19 information to a certain extent. However, compared with official WHO responses, gaps and deficiencies exist. Thus, users of ChatGPT 3.5 and 4.0 should also reference other reliable information sources to mitigate potential misinformation risks. Notably, ChatGPT 4.0 outperformed ChatGPT 3.5 across all evaluated dimensions, a finding corroborated by BERT model validation.


Subject(s)
COVID-19, Humans, SARS-CoV-2, Pandemics, Language, World Health Organization
19.
J Med Internet Res ; 25: e45013, 2023 08 28.
Article in English | MEDLINE | ID: mdl-37639292

ABSTRACT

BACKGROUND: Thorough data stewardship is a key enabler of comprehensive health research. Processes such as data collection, storage, access, sharing, and analytics require researchers to follow elaborate data management strategies properly and consistently. Studies have shown that findable, accessible, interoperable, and reusable (FAIR) data leads to improved data sharing in different scientific domains. OBJECTIVE: This scoping review identifies and discusses concepts, approaches, implementation experiences, and lessons learned in FAIR initiatives in health research data. METHODS: The Arksey and O'Malley stage-based methodological framework for scoping reviews was applied. PubMed, Web of Science, and Google Scholar were searched to access relevant publications. Articles written in English, published between 2014 and 2020, and addressing FAIR concepts or practices in the health domain were included. Records from the 3 data sources were deduplicated using reference management software. In total, 2 independent authors reviewed the eligibility of each article based on defined inclusion and exclusion criteria. A charting tool was used to extract information from the full-text papers. The results were reported using the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) guidelines. RESULTS: A total of 2.18% (34/1561) of the screened articles were included in the final review. The authors reported FAIRification approaches, which include interpolation, inclusion of comprehensive data dictionaries, repository design, semantic interoperability, ontologies, data quality, linked data, and requirement gathering for FAIRification tools. Challenges associated with FAIRification, such as high setup costs, data politics, technical and administrative issues, privacy concerns, and difficulties in sharing health data given its sensitive nature, were also reported, along with mitigation strategies.
We found various workflows, tools, and infrastructures designed by different groups worldwide to facilitate the FAIRification of health research data. We also uncovered a wide range of problems and questions that researchers are trying to address by using the different workflows, tools, and infrastructures. Although the concept of FAIR data stewardship in the health research domain is relatively new, almost all continents have been reached by at least one network trying to achieve health data FAIRness. Documented outcomes of FAIRification efforts include peer-reviewed publications, improved data sharing, facilitated data reuse, return on investment, and new treatments. Successful FAIRification of data has informed the management and prognosis of various diseases such as cancer, cardiovascular diseases, and neurological diseases. Efforts to FAIRify data on a wider variety of diseases have been ongoing since the COVID-19 pandemic. CONCLUSIONS: This work summarises projects, tools, and workflows for the FAIRification of health research data. The comprehensive review shows that implementing the FAIR concept in health data stewardship carries the promise of improved research data management and transparency in the era of big data and open research publishing. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.2196/22505.


Subject(s)
COVID-19, Cardiovascular Diseases, Humans, Pandemics, Big Data, Data Accuracy
20.
J Med Internet Res ; 25: e46571, 2023 09 01.
Article in English | MEDLINE | ID: mdl-37656502

ABSTRACT

BACKGROUND: Genetic testing has become an integrated part of health care for patients with breast or ovarian cancer, and the increasing demand for genetic testing is accompanied by an increasing need for easy access to reliable genetic information for patients. Therefore, we developed a chatbot app (Rosa) that is able to perform humanlike digital conversations about genetic BRCA testing. OBJECTIVE: Before implementing this new information service in daily clinical practice, we wanted to explore 2 aspects of chatbot use: the perceived utility of and trust in chatbot technology among healthy patients at risk of hereditary cancer, and how interaction with a chatbot regarding sensitive information about hereditary cancer influences patients. METHODS: Overall, 175 healthy individuals at risk of hereditary breast and ovarian cancer were invited to test the chatbot, Rosa, before and after genetic counseling. To secure a varied sample, participants were recruited from all cancer genetic clinics in Norway, and the selection was based on age, gender, and risk of having a BRCA pathogenic variant. Among the 34.9% (61/175) of participants who consented to an individual interview, a selected subgroup (16/61, 26%) shared their experience through in-depth interviews via video. The semistructured interviews covered the following topics: usability, perceived usefulness, trust in the information received via the chatbot, how Rosa influenced the user, and thoughts about future use of digital tools in health care. The transcripts were analyzed using the stepwise-deductive inductive approach. RESULTS: The overall finding was that the chatbot was well received by the participants. They appreciated the 24/7 availability wherever they were and the possibility to use it to prepare for genetic counseling and, afterward, to revisit and ask questions about what had been said. As Rosa was created by health care professionals, they also valued the information they received as being medically correct.
Rosa was referred to as being better than Google because it provided specific and reliable answers to their questions. The findings were summed up in 3 concepts: "Anytime, anywhere"; "In addition, not instead"; and "Trustworthy and true." All participants (16/16) denied increased worry after reading about genetic testing and hereditary breast and ovarian cancer in Rosa. CONCLUSIONS: Our results indicate that a genetic information chatbot has the potential to contribute to easy access to uniform information for patients at risk of hereditary breast and ovarian cancer, regardless of geographical location. The 24/7 availability of quality-assured information, tailored to the specific situation, had a reassuring effect on our participants. Across concepts, participants consistently described Rosa as a tool for preparation and repetition; however, none of the participants (0/16) agreed that Rosa could replace genetic counseling if hereditary cancer was confirmed. This indicates that a chatbot can be a well-suited digital companion to genetic counseling.


Subject(s)
Ovarian Neoplasms, Rosa, Humans, Female, Genetic Predisposition to Disease, Ovarian Neoplasms/genetics, Genetic Testing, Qualitative Research