Results 1 - 20 of 26
1.
Br J Dermatol ; 190(6): 789-797, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38330217

ABSTRACT

The field of dermatology is experiencing the rapid deployment of artificial intelligence (AI), from mobile applications (apps) for skin cancer detection to large language models like ChatGPT that can answer generalist or specialist questions about skin diagnoses. With these new applications, ethical concerns have emerged. In this scoping review, we aimed to identify the applications of AI to the field of dermatology and to understand their ethical implications. We used a multifaceted search approach, searching PubMed, MEDLINE, Cochrane Library and Google Scholar for primary literature, following the PRISMA Extension for Scoping Reviews guidance. Our advanced query included terms related to dermatology, AI and ethical considerations. Our search yielded 202 papers. After initial screening, 68 studies were included. Thirty-two were related to clinical image analysis and raised ethical concerns about misdiagnosis, data security, privacy violations and replacement of dermatologist jobs. Seventeen discussed limited skin of colour representation in datasets leading to potential misdiagnosis in the general population. Nine articles about teledermatology raised ethical concerns, including the exacerbation of health disparities, lack of standardized regulations, informed consent for AI use and privacy challenges. Seven addressed inaccuracies in the responses of large language models. Seven examined attitudes toward and trust in AI, with most patients requesting supplemental assessment by a physician to ensure reliability and accountability. Benefits of AI integration into clinical practice include increased patient access, improved clinical decision-making and greater efficiency, among others. However, safeguards must be put in place to ensure the ethical application of AI.


The use of artificial intelligence (AI) in dermatology is rapidly increasing, with applications in dermatopathology, medical dermatology, cutaneous surgery, microscopy/spectroscopy and the identification of prognostic biomarkers (characteristics that provide information on likely patient health outcomes). However, with the rise of AI in dermatology, ethical concerns have emerged. We reviewed the existing literature to identify applications of AI in the field of dermatology and understand the ethical implications. Our search initially identified 202 papers, and after we went through them (screening), 68 were included in our review. We found that ethical concerns are related to the use of AI in the areas of clinical image analysis, teledermatology, natural language processing models, privacy, skin of colour representation, and patient and provider attitudes toward AI. We identified nine ethical principles to facilitate the safe use of AI in dermatology. These ethical principles include fairness, inclusivity, transparency, accountability, security, privacy, reliability, informed consent and conflict of interest. Although there are many benefits of integrating AI into clinical practice, our findings highlight how safeguards must be put in place to reduce rising ethical concerns.


Subjects
Artificial Intelligence, Dermatology, Humans, Artificial Intelligence/ethics, Dermatology/ethics, Dermatology/methods, Telemedicine/ethics, Informed Consent/ethics, Confidentiality/ethics, Diagnostic Errors/ethics, Diagnostic Errors/prevention & control, Computer Security/ethics, Skin Diseases/diagnosis, Skin Diseases/therapy, Mobile Applications/ethics
2.
J Biomed Inform ; 157: 104702, 2024 Jul 29.
Article in English | MEDLINE | ID: mdl-39084480

ABSTRACT

Although rare diseases individually have a low prevalence, they collectively affect nearly 400 million individuals around the world. On average, it takes five years to reach an accurate rare disease diagnosis, and many patients remain undiagnosed or misdiagnosed. As machine learning technologies have been used to aid diagnostics in the past, this study tests ChatGPT's suitability for rare disease diagnostic support when enhanced with Retrieval Augmented Generation (RAG). RareDxGPT, our enhanced ChatGPT model, supplies ChatGPT with information about 717 rare diseases from an external knowledge resource, the RareDis Corpus, through RAG. When a query is entered into RareDxGPT, the three documents most relevant to the query in the RareDis Corpus are retrieved. These documents, along with the query, are then passed to ChatGPT to produce a diagnosis. For evaluation, phenotypes for thirty different diseases were extracted from free text in PubMed Case Reports. They were each entered with three different prompt types: "prompt", "prompt + explanation" and "prompt + role play". The accuracy of ChatGPT and RareDxGPT with each prompt was then measured. With "Prompt", RareDxGPT had 40% accuracy, while ChatGPT 3.5 got 37% of the cases correct. With "Prompt + Explanation", RareDxGPT had 43% accuracy, while ChatGPT 3.5 got 23% of the cases correct. With "Prompt + Role Play", RareDxGPT had 40% accuracy, while ChatGPT 3.5 got 23% of the cases correct. In conclusion, ChatGPT, especially when supplied with extra domain-specific knowledge, demonstrates early potential for rare disease diagnosis with appropriate adjustments.
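As a rough illustration of the retrieve-then-prompt pattern described here, the sketch below retrieves the top three corpus documents for a query and builds an augmented prompt. It is a minimal sketch under stated assumptions: TF-IDF stands in for whatever retriever RareDxGPT actually uses, the disease snippets are invented placeholders rather than the RareDis Corpus, and the prompt wording is not the study's.

```python
# Minimal retrieve-then-prompt sketch; not the actual RareDxGPT pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder disease descriptions standing in for the RareDis Corpus.
rare_disease_docs = [
    "Fabry disease: angiokeratomas, acroparesthesia, corneal opacities.",
    "Gaucher disease: hepatosplenomegaly, bone pain, thrombocytopenia.",
    "Wilson disease: Kayser-Fleischer rings, dystonia, liver failure.",
]

def retrieve_top_k(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Rank corpus documents by TF-IDF cosine similarity to the query."""
    matrix = TfidfVectorizer(stop_words="english").fit_transform(docs + [query])
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    return [docs[i] for i in scores.argsort()[::-1][:k]]

query = "Patient with angiokeratomas, burning hand pain, and proteinuria."
context = "\n\n".join(retrieve_top_k(query, rare_disease_docs))

# The augmented prompt would then be sent to the chat model (call omitted).
prompt = (
    "Using the reference material below, suggest the most likely rare "
    f"disease diagnosis.\n\nReferences:\n{context}\n\nCase: {query}"
)
print(prompt)
```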

3.
J Biomed Inform ; 155: 104659, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38777085

ABSTRACT

OBJECTIVE: This study aims to promote interoperability in precision medicine and translational research by aligning the Observational Medical Outcomes Partnership (OMOP) and Phenopackets data models. Phenopackets is an expert knowledge-driven schema designed to facilitate the storage and exchange of multimodal patient data, and to support downstream analysis. The first goal of this paper is to explore model alignment by characterizing the common data models using a newly developed data transformation process and evaluation method. Second, using OMOP-normalized clinical data, we evaluate the mapping of real-world patient data to Phenopackets and assess the suitability of Phenopackets as a patient data representation for real-world clinical cases. METHODS: We identified mappings between OMOP and Phenopackets and applied them to a real patient dataset to assess the transformation's success. We analyzed gaps between the models and identified key considerations for transforming data between them. Further, to resolve ambiguous alignments, we incorporated Unified Medical Language System (UMLS) semantic type-based filtering to direct individual concepts to their most appropriate domain and conducted a domain-expert evaluation of the mapping's clinical utility. RESULTS: The OMOP to Phenopacket transformation pipeline was executed for 1,000 Alzheimer's disease patients and successfully mapped all required entities. However, due to missing values in OMOP for required Phenopacket attributes, 10.2% of records were lost. The use of UMLS semantic type filtering for ambiguous alignment of individual concepts resulted in 96% agreement with clinical thinking, up from 68% when mapping exclusively by domain correspondence. CONCLUSION: This study presents a pipeline to transform data from OMOP to Phenopackets. We identified considerations for the transformation to ensure data quality, handling restrictions for successful Phenopacket validation and discrepant data formats. We identified unmappable Phenopacket attributes that focus on specialty use cases, such as genomics or oncology, which OMOP does not currently support. We introduce UMLS semantic type filtering to resolve ambiguous alignments to the Phenopacket entities most appropriate for real-world interpretation. We provide a systematic approach to aligning the OMOP and Phenopackets schemas. Our work facilitates future use of Phenopackets in clinical applications by addressing key barriers to interoperability when deriving a Phenopacket from real-world patient data.
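The sketch below illustrates, under loose assumptions, the two ideas at the core of this pipeline: transforming an OMOP condition row into a Phenopacket-style block, and routing ambiguous concepts to a target entity by UMLS semantic type. The field names only approximate the public schemas, and the routing table and example row are invented, not the study's mappings.

```python
# Simplified OMOP row -> Phenopacket-style block, with semantic-type routing.
# Concept IDs, dates, and the routing table are illustrative placeholders.
OMOP_ROW = {
    "person_id": 1001,
    "condition_concept_id": 378419,     # a condition concept (illustrative)
    "condition_start_date": "2019-03-07",
    "umls_semantic_type": "T047",       # Disease or Syndrome
}

# Illustrative routing: UMLS semantic type -> Phenopacket entity.
SEMANTIC_TYPE_TO_ENTITY = {
    "T047": "diseases",                 # Disease or Syndrome
    "T184": "phenotypicFeatures",       # Sign or Symptom
    "T121": "medicalActions",           # Pharmacologic Substance
}

def to_phenopacket_block(row: dict) -> tuple[str, dict]:
    """Return (target entity, block); raise if a required field is missing."""
    for required in ("condition_concept_id", "condition_start_date"):
        if row.get(required) is None:
            # Rows missing required Phenopacket attributes cannot be
            # transformed -- the source of the 10.2% record loss noted above.
            raise ValueError(f"missing required field: {required}")
    entity = SEMANTIC_TYPE_TO_ENTITY.get(row["umls_semantic_type"], "diseases")
    block = {
        "term": {"id": f"OMOP:{row['condition_concept_id']}"},
        "onset": {"timestamp": row["condition_start_date"]},
    }
    return entity, block

entity, block = to_phenopacket_block(OMOP_ROW)
print(entity, block)
```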


Subjects
Unified Medical Language System, Humans, Semantics, Electronic Health Records, Precision Medicine/methods, Translational Biomedical Research, Medical Informatics/methods, Natural Language Processing, Alzheimer Disease
4.
J Biomed Inform ; 156: 104663, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38838949

ABSTRACT

OBJECTIVE: This study aims to investigate the association between social determinants of health (SDoH) and clinical research recruitment outcomes and recommends evidence-based strategies to enhance equity. MATERIALS AND METHODS: Data were collected from the internal clinical study manager database, clinical data warehouse, and clinical research registry. Study characteristics (e.g., study phase) and sociodemographic information were extracted. Median neighborhood income, distance from the study location, and Area Deprivation Index (ADI) were calculated. Mixed-effects generalized regression was used to account for clustering effects, with false discovery rate adjustment applied for multiple testing. A stratified analysis was performed to examine the impact in distinct medical departments. RESULTS: The study sample consisted of 3,962 individuals, with a mean age of 61.5 years; 53.6% were male, 54.2% White, and 49.1% non-Hispanic or Latino. Study characteristics revealed a variety of protocols across different departments, with cardiology having the highest percentage of participants (46.4%). Industry funding was the most common (74.5%), and digital advertising and personal outreach were the main recruitment methods (58.9% and 90.8%, respectively). DISCUSSION: The analysis demonstrated significant associations between participant characteristics and research participation, including biological sex, age, ethnicity, and language. The stratified analysis revealed further significant associations for recruitment strategies. SDoH are crucial to clinical research recruitment, and this study presents evidence-based solutions for equity and inclusivity. Researchers can tailor recruitment strategies to overcome barriers and increase participant diversity by identifying participant characteristics and research involvement status. CONCLUSION: The findings underscore clinical research inequities and the need for equitable representation of historically underrepresented populations. We need to improve recruitment strategies to promote diversity and inclusivity in research.
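For readers who want the analytic pattern in code, here is a hedged sketch of a clustering-aware model with multiplicity correction: a binomial generalized estimating equation grouped by study, followed by Benjamini-Hochberg false-discovery-rate adjustment. The data are simulated and the variable names are illustrative, not the study's schema.

```python
# GEE for clustered recruitment outcomes + FDR adjustment; simulated data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "enrolled": rng.integers(0, 2, n),   # recruitment outcome (0/1)
    "age": rng.normal(61.5, 12, n),
    "male": rng.integers(0, 2, n),
    "adi": rng.uniform(1, 100, n),       # Area Deprivation Index
    "study_id": rng.integers(0, 25, n),  # clustering unit (study protocol)
})

# Binomial GEE accounts for correlation among participants of the same study.
model = smf.gee(
    "enrolled ~ age + male + adi",
    groups="study_id",
    data=df,
    family=sm.families.Binomial(),
).fit()

# Benjamini-Hochberg FDR adjustment across predictor p-values.
pvals = model.pvalues.drop("Intercept")
reject, p_adj, _, _ = multipletests(pvals, method="fdr_bh")
print(pd.DataFrame({"p_raw": pvals, "p_fdr": p_adj, "significant": reject}))
```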


Subjects
Biomedical Research, Social Determinants of Health, Humans, Male, Middle Aged, Female, Patient Selection, Aged, Health Equity, Adult
5.
J Biomed Inform ; 154: 104649, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38697494

ABSTRACT

OBJECTIVE: Automated identification of eligible patients is a bottleneck of clinical research. We propose Criteria2Query (C2Q) 3.0, a system that leverages GPT-4 for the semi-automatic transformation of clinical trial eligibility criteria text into executable clinical database queries. MATERIALS AND METHODS: C2Q 3.0 integrated three GPT-4 prompts for concept extraction, SQL query generation, and reasoning. Each prompt was designed and evaluated separately. The concept extraction prompt was benchmarked against manual annotations from 20 clinical trials by two evaluators, who later also measured SQL generation accuracy and identified errors in GPT-generated SQL queries from 5 clinical trials. The reasoning prompt was assessed by three evaluators on four metrics: readability, correctness, coherence, and usefulness, using corrected SQL queries and an open-ended feedback questionnaire. RESULTS: Across 518 concepts from 20 clinical trials, GPT-4 achieved an F1-score of 0.891 in concept extraction. For SQL generation, 29 errors spanning seven categories were detected, with logic errors being the most common (n = 10; 34.48%). Reasoning evaluations yielded a high mean coherence rating (4.70) but relatively lower readability (3.95); mean scores for correctness and usefulness were 3.97 and 4.37, respectively. CONCLUSION: GPT-4 significantly improves the accuracy of extracting clinical trial eligibility criteria concepts in C2Q 3.0. Continued research is warranted to ensure the reliability of large language models.
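A minimal sketch of the three-stage prompt chain might look like the following; the prompt wording is invented and `call_gpt4` is a stand-in echo rather than a real API client, so this only shows how the stages feed one another.

```python
# Illustrative three-stage prompt chain; the real C2Q 3.0 prompts are richer.
def call_gpt4(prompt: str) -> str:
    """Stand-in for a chat-completion API call; echoes for demonstration."""
    return f"<model response to: {prompt[:60]}...>"

CRITERION = "Adults aged 50-80 with type 2 diabetes and HbA1c > 7%"

# Stage 1: concept extraction.
concepts = call_gpt4(
    "Extract eligibility concepts (entity, attribute, value) as JSON from: "
    f"{CRITERION}"
)

# Stage 2: SQL generation over an OMOP-style database.
sql_query = call_gpt4(
    "Write a SQL query over an OMOP CDM database selecting person_id for "
    f"patients matching these concepts: {concepts}"
)

# Stage 3: reasoning, i.e. a plain-language explanation of the query.
explanation = call_gpt4(
    "Explain step by step what this query does and why it implements the "
    f"criterion. Query:\n{sql_query}"
)
print(explanation)
```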


Subjects
Clinical Trials as Topic, Humans, Natural Language Processing, Software, Patient Selection
6.
J Biomed Inform ; 153: 104640, 2024 May.
Article in English | MEDLINE | ID: mdl-38608915

ABSTRACT

Evidence-based medicine promises to improve the quality of healthcare by empowering medical decisions and practices with the best available evidence. The rapid growth of medical evidence, which can be obtained from various sources, poses a challenge for collecting, appraising, and synthesizing evidential information. Recent advancements in generative AI, exemplified by large language models, hold promise for facilitating this arduous task. However, developing accountable, fair, and inclusive models remains a complicated undertaking. In this perspective, we discuss the trustworthiness of generative AI in the context of automated summarization of medical evidence.


Subjects
Artificial Intelligence, Evidence-Based Medicine, Humans, Trust, Natural Language Processing
7.
BMC Med Inform Decis Mak ; 22(Suppl 2): 348, 2024 Mar 03.
Article in English | MEDLINE | ID: mdl-38433189

ABSTRACT

BACKGROUND: Systemic lupus erythematosus (SLE) is a rare autoimmune disorder characterized by an unpredictable course of flares and remission with diverse manifestations. Lupus nephritis, one of the major disease manifestations of SLE for organ damage and mortality, is a key component of lupus classification criteria. Accurately identifying lupus nephritis in electronic health records (EHRs) would therefore benefit large cohort observational studies and clinical trials, where characterization of the patient population is critical for recruitment, study design, and analysis. Lupus nephritis can be recognized through procedure codes and structured data, such as laboratory tests. However, other critical information documenting lupus nephritis, such as histologic reports from kidney biopsies and prior medical history narratives, requires sophisticated text processing to mine information from pathology reports and clinical notes. In this study, we developed algorithms to identify lupus nephritis with and without natural language processing (NLP) using EHR data from the Northwestern Medicine Enterprise Data Warehouse (NMEDW). METHODS: We developed five algorithms: a rule-based algorithm using only structured data (baseline algorithm) and four algorithms using different NLP models. The first NLP model applied simple regular expressions for keyword search combined with structured data. The other three NLP models were based on regularized logistic regression and used different sets of features, including positive mention of concept unique identifiers (CUIs), number of appearances of CUIs, and a mixture of three components (i.e., a curated list of CUIs, regular expression concepts, and structured data), respectively. The baseline algorithm and the best-performing NLP algorithm were externally validated on a dataset from Vanderbilt University Medical Center (VUMC). RESULTS: Our best-performing NLP model incorporated features from structured data, regular expression concepts, and mapped concept unique identifiers (CUIs), and showed an improved F-measure in both the NMEDW (0.41 vs 0.79) and VUMC (0.52 vs 0.93) datasets compared to the baseline lupus nephritis algorithm. CONCLUSION: Our NLP MetaMap mixed model greatly improved the F-measure compared to the structured-data-only algorithm in both internal and external validation datasets. The NLP algorithms can serve as powerful tools to accurately identify the lupus nephritis phenotype in EHRs for clinical research and better-targeted therapies.
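The best-performing model family described above can be sketched as regularized logistic regression over a mixture of structured-data flags and counts of NLP-extracted CUIs. Everything below, CUI strings, flags, and labels, is fabricated for illustration; real features would come from MetaMap output over pathology reports and clinical notes.

```python
# Regularized logistic regression over CUI counts plus structured flags.
# All identifiers and labels below are fabricated for illustration.
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Each "document" is the bag of CUIs the NLP step found in one patient's notes.
patient_cuis = [
    "C0000001 C0000002 C0000002",
    "C0000003",
    "C0000001 C0000002",
    "C0000004 C0000003",
]
structured_flags = [[1, 1], [1, 0], [1, 1], [0, 0]]  # e.g., biopsy code, urine lab
labels = [1, 0, 1, 0]                                # chart-review gold standard

cui_counts = CountVectorizer().fit_transform(patient_cuis)  # CUI appearance counts
X = hstack([cui_counts, csr_matrix(structured_flags)])      # mixed feature set

clf = LogisticRegression(penalty="l2", C=1.0, max_iter=1000).fit(X, labels)
print(clf.predict(X))
```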


Subjects
Systemic Lupus Erythematosus, Lupus Nephritis, Humans, Lupus Nephritis/diagnosis, Electronic Health Records, Natural Language Processing, Phenotype, Rare Diseases
8.
AMIA Jt Summits Transl Sci Proc ; 2024: 670-678, 2024.
Article in English | MEDLINE | ID: mdl-38827089

ABSTRACT

Topic modeling performs poorly on short phrases or sentences and on ever-changing slang, both of which are common on social media platforms such as X, formerly known as Twitter. This study investigates whether concept annotation tools such as MetaMap can enable topic modeling at the semantic level. Using tweets mentioning "hydroxychloroquine" as a case study, we extracted 56,017 tweets posted between 03/01/2020 and 12/31/2021. The tweets were run through MetaMap to encode concepts with UMLS Concept Unique Identifiers (CUIs), and we then used Latent Dirichlet Allocation (LDA) to identify the optimal model for two datasets: 1) tweets with the original text and 2) tweets with CUIs replacing the original text. We found that the MetaMap LDA models outperformed the non-MetaMap models in terms of coherence and representativeness and identified topics that were timely and relevant to social and political discussions. We conclude that integrating MetaMap to standardize tweets through UMLS concepts improved semantic topic modeling performance amidst noise in the text.
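The pipeline reduces to two steps that are easy to sketch: replace medical spans in each tweet with their CUIs, then fit LDA on the CUI-substituted token lists. In the sketch below the tweets and CUI substitutions are invented placeholders; in the study the substitution came from MetaMap over the 56,017 real tweets.

```python
# LDA over CUI-substituted tweets; CUIs and tweets are invented placeholders.
from gensim import corpora, models

tweets_as_cuis = [
    ["C0000001", "C0000002", "works", "miracle"],
    ["C0000001", "C0000003", "trial", "stopped"],
    ["C0000002", "C0000004", "mandate", "debate"],
    ["C0000001", "C0000002", "side", "effects"],
]

dictionary = corpora.Dictionary(tweets_as_cuis)
bow_corpus = [dictionary.doc2bow(doc) for doc in tweets_as_cuis]

# Fit a small topic model on the CUI-level tokens.
lda = models.LdaModel(bow_corpus, num_topics=2, id2word=dictionary,
                      random_state=42, passes=10)
for topic_id, terms in lda.print_topics():
    print(topic_id, terms)
```

Model selection over topic counts could then use gensim's CoherenceModel, mirroring the coherence comparison reported above.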

9.
J Am Med Inform Assoc ; 31(9): 2065-2075, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38787964

ABSTRACT

OBJECTIVES: To automatically construct a drug indication taxonomy from drug labels using generative artificial intelligence (AI), represented by the Large Language Model (LLM) GPT-4, and real-world evidence (RWE). MATERIALS AND METHODS: We extracted indication terms from 46,421 free-text drug labels using GPT-4, iteratively and recursively generated indication concepts and inferred indication concept-to-concept and concept-to-term subsumption relations by integrating GPT-4 with RWE, and created a drug indication taxonomy. Quantitative and qualitative evaluations involving domain experts were performed for cardiovascular (CVD), endocrine, and genitourinary system diseases. RESULTS: 2909 drug indication terms were extracted and assigned into 24 high-level indication categories (i.e., the initially generated concepts), each of which was expanded into a sub-taxonomy. For example, the CVD sub-taxonomy contains 242 concepts, spanning a depth of 11, with 170 being leaf nodes. It collectively covers a total of 234 indication terms associated with 189 distinct drugs. The accuracy of GPT-4 in determining the drug indication hierarchy exceeded 0.7 with "good to very good" inter-rater reliability. However, the accuracy of the concept-to-term subsumption relation checking varied greatly, with "fair to moderate" reliability. DISCUSSION AND CONCLUSION: We successfully used generative AI and RWE to create a taxonomy with drug indications adequately consistent with domain expert expectations. We show that LLMs are good at deriving their own concept hierarchies but still fall short in determining the subsumption relations between concepts and terms in the unregulated language of free-text drug labels, a task that is equally hard for human experts.


Subjects
Artificial Intelligence, Drug Labeling, Natural Language Processing, Humans, Classification/methods
10.
bioRxiv ; 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38234802

ABSTRACT

Objective: We aim to develop a novel method for rare disease concept normalization by fine-tuning Llama 2, an open-source large language model (LLM), using a domain-specific corpus sourced from the Human Phenotype Ontology (HPO). Methods: We developed an in-house template-based script to generate two corpora for fine-tuning. The first (NAME) contains standardized HPO names, sourced from the HPO vocabularies, along with their corresponding identifiers. The second (NAME+SYN) includes HPO names and half of each concept's synonyms, as well as identifiers. Subsequently, we fine-tuned Llama 2 (Llama2-7B) on each sentence set and conducted an evaluation using a range of sentence prompts and various phenotype terms. Results: When the phenotype terms for normalization were included in the fine-tuning corpora, both models demonstrated nearly perfect performance, averaging over 99% accuracy. In comparison, ChatGPT-3.5 had only ~20% accuracy in identifying HPO IDs for phenotype terms. When single-character typos were introduced into the phenotype terms, the accuracy of NAME and NAME+SYN dropped to 10.2% and 36.1%, respectively, but increased to 61.8% (NAME+SYN) with additional typo-specific fine-tuning. For terms sourced from HPO vocabularies as unseen synonyms, the NAME model achieved 11.2% accuracy, while the NAME+SYN model achieved 92.7% accuracy. Conclusion: Our fine-tuned models demonstrate the ability to normalize phenotype terms unseen in the fine-tuning corpus, including misspellings, synonyms, terms from other ontologies, and laymen's terms. Our approach provides a solution for using LLMs to identify named medical entities in clinical narratives while successfully normalizing them to standard concepts in a controlled vocabulary.
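The template-based corpus generation in the methods can be sketched as follows; the two-concept mini-ontology, sentence templates, and prompt/completion format are all illustrative assumptions, not the study's actual script or data layout.

```python
# Template-based fine-tuning corpus generation from HPO names (+ synonyms).
import json

# Tiny illustrative slice of the ontology; real runs would read the full HPO.
HPO_SAMPLE = {
    "HP:0001250": {"name": "Seizure",
                   "synonyms": ["Epileptic seizure", "Fits"]},
    "HP:0001263": {"name": "Global developmental delay",
                   "synonyms": ["Developmental delay", "Psychomotor delay"]},
}

TEMPLATES = [
    "The patient presents with {term}.",
    "Clinical examination revealed {term}.",
]

def build_corpus(include_synonyms: bool) -> list[dict]:
    """NAME corpus uses names only; NAME+SYN adds half of each synonym list."""
    examples = []
    for hpo_id, entry in HPO_SAMPLE.items():
        terms = [entry["name"]]
        if include_synonyms:
            terms += entry["synonyms"][: len(entry["synonyms"]) // 2 or 1]
        for term in terms:
            for template in TEMPLATES:
                examples.append({
                    "prompt": template.format(term=term),
                    "completion": hpo_id,   # normalization target
                })
    return examples

print(json.dumps(build_corpus(include_synonyms=True)[:3], indent=2))
```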

11.
J Am Med Inform Assoc ; 31(9): 2076-2083, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38829731

ABSTRACT

OBJECTIVE: We aim to develop a novel method for rare disease concept normalization by fine-tuning Llama 2, an open-source large language model (LLM), using a domain-specific corpus sourced from the Human Phenotype Ontology (HPO). METHODS: We developed an in-house template-based script to generate two corpora for fine-tuning. The first (NAME) contains standardized HPO names, sourced from the HPO vocabularies, along with their corresponding identifiers. The second (NAME+SYN) includes HPO names and half of each concept's synonyms, as well as identifiers. Subsequently, we fine-tuned Llama 2 (Llama2-7B) on each sentence set and conducted an evaluation using a range of sentence prompts and various phenotype terms. RESULTS: When the phenotype terms for normalization were included in the fine-tuning corpora, both models demonstrated nearly perfect performance, averaging over 99% accuracy. In comparison, ChatGPT-3.5 had only ∼20% accuracy in identifying HPO IDs for phenotype terms. When single-character typos were introduced into the phenotype terms, the accuracy of NAME and NAME+SYN dropped to 10.2% and 36.1%, respectively, but increased to 61.8% (NAME+SYN) with additional typo-specific fine-tuning. For terms sourced from HPO vocabularies as unseen synonyms, the NAME model achieved 11.2% accuracy, while the NAME+SYN model achieved 92.7% accuracy. CONCLUSION: Our fine-tuned models demonstrate the ability to normalize phenotype terms unseen in the fine-tuning corpus, including misspellings, synonyms, terms from other ontologies, and laymen's terms. Our approach provides a solution for using LLMs to identify named medical entities in clinical narratives while successfully normalizing them to standard concepts in a controlled vocabulary.


Subjects
Biological Ontologies, Natural Language Processing, Phenotype, Rare Diseases, Controlled Vocabulary, Humans
12.
J Am Med Inform Assoc ; 31(5): 1163-1171, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38471120

ABSTRACT

OBJECTIVES: Extracting PICO (Populations, Interventions, Comparison, and Outcomes) entities is fundamental to evidence retrieval. We present a novel method, PICOX, to extract overlapping PICO entities. MATERIALS AND METHODS: PICOX first identifies entities by assessing whether a word marks the beginning or conclusion of an entity. It then uses a multi-label classifier to assign one or more PICO labels to a span candidate. PICOX was evaluated using one of the best-performing baselines, EBM-NLP, and three more datasets, i.e., PICO-Corpus and randomized controlled trial publications on Alzheimer's disease (AD) or COVID-19, using entity-level precision, recall, and F1 scores. RESULTS: PICOX achieved superior precision, recall, and F1 scores across the board, with the micro F1 score improving from 45.05 to 50.87 (P ≪ .01). On the PICO-Corpus, PICOX obtained higher recall and F1 scores than the baseline and improved the micro recall score from 56.66 to 67.33. On the COVID-19 dataset, PICOX also outperformed the baseline and improved the micro F1 score from 77.10 to 80.32. On the AD dataset, PICOX demonstrated comparable F1 scores with higher precision when compared to the baseline. CONCLUSION: PICOX excels at identifying overlapping entities and consistently surpasses a leading baseline across multiple datasets. Ablation studies reveal that its data augmentation strategy effectively minimizes false positives and improves precision.
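A schematic sketch of the second stage, where a multi-label classifier lets one span carry several PICO labels and thereby supports overlap, is shown below; the spans, labels, and TF-IDF features are toy stand-ins for the transformer-based components an approach like this would actually use.

```python
# Multi-label classification of span candidates; toy data and features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Span candidates (from the boundary-detection stage) with PICO labels;
# one span can legitimately carry more than one label, enabling overlap.
spans = [
    "adults with type 2 diabetes",
    "metformin 500 mg twice daily",
    "placebo",
    "elderly patients receiving metformin",   # Population AND Intervention
    "HbA1c reduction at 24 weeks",
]
span_labels = [
    ["Population"], ["Intervention"], ["Comparator"],
    ["Population", "Intervention"], ["Outcome"],
]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(span_labels)   # binary indicator matrix, one column/label

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
).fit(spans, Y)

pred = clf.predict(["patients on insulin therapy"])
print(mlb.inverse_transform(pred))
```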


Subjects
Alzheimer Disease, COVID-19, Humans, Natural Language Processing
13.
ArXiv ; 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38562452

ABSTRACT

Phenotype-driven gene prioritization is a critical process in the diagnosis of rare genetic disorders, identifying and ranking potential disease-causing genes based on observed physical traits or phenotypes. While traditional approaches rely on curated knowledge graphs with phenotype-gene relations, recent advancements in large language models (LLMs) have opened the door to AI-based prediction through extensive training on diverse corpora and complex models. This study conducted a comprehensive evaluation of five large language models, including two from the Generative Pre-trained Transformer (GPT) series and three from the Llama 2 series, assessing their performance across three key metrics: task completeness, gene prediction accuracy, and adherence to required output structures. Various experiments explored combinations of models, prompts, input types, and task difficulty levels. Our findings reveal that even the best-performing LLM, GPT-4, achieved an accuracy of only 16.0%, which still lags behind traditional bioinformatics tools. Prediction accuracy increased with parameter/model size. A similar increasing trend was observed for the task completion rate: complicated prompts were more likely to increase task completeness in models smaller than GPT-4 but also more likely to decrease the structure compliance rate, whereas neither prompt effect was observed for GPT-4. Compared with HPO term-based input, the LLMs still achieved better-than-random prediction accuracy on free-text input, though slightly lower than with the HPO input. Bias analysis showed that certain genes, such as MECP2, CDKL5, and SCN1A, are more likely to be top-ranked, potentially explaining the variance observed across different datasets. This study provides valuable insights into the integration of LLMs within genomic analysis, contributing to the ongoing discussion on the use of advanced LLMs in clinical workflows.

14.
AMIA Jt Summits Transl Sci Proc ; 2024: 515-524, 2024.
Article in English | MEDLINE | ID: mdl-38827062

ABSTRACT

Clinical notes are full of ambiguous medical abbreviations. Contextual knowledge has been leveraged by recent learning-based approaches for sense disambiguation. Previous findings indicated that structural elements of clinical notes entail useful characteristics for informing different interpretations of abbreviations, yet they have remained underutilized and have not been fully investigated. To the best of our knowledge, the only study exploring note structures simply enumerated the headers in the notes, a representation that is not semantically meaningful. This paper describes a learning-based approach using the note structure represented by the semantic types predefined in the Unified Medical Language System (UMLS). We evaluated this representation, in addition to the widely used N-grams, with three learning models on two different datasets. Experiments indicate that our feature augmentation consistently improved model performance for abbreviation disambiguation, with an optimal F1 score of 0.93.
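The feature augmentation reduces to concatenating context N-grams with a one-hot encoding of the note structure, here represented by a UMLS semantic type per section. The contexts, semantic types, and sense labels below are invented for illustration.

```python
# Context N-grams + one-hot note-structure features; examples are invented.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import OneHotEncoder

contexts = [
    "patient denies cp on exertion stress test negative",
    "chest xray shows no acute findings cp resolved",
    "cp managed with physical therapy referral",
]
# UMLS semantic type of each note section: T033 Finding,
# T060 Diagnostic Procedure, T061 Therapeutic or Preventive Procedure.
section_semtypes = [["T033"], ["T060"], ["T061"]]
senses = ["chest pain", "chest pain", "cerebral palsy"]

ngrams = CountVectorizer(ngram_range=(1, 2)).fit_transform(contexts)
sections = OneHotEncoder().fit_transform(section_semtypes)

X = hstack([ngrams, sections])  # N-gram features augmented with note structure
clf = LogisticRegression(max_iter=1000).fit(X, senses)
print(clf.predict(X))
```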

15.
Clin Dermatol ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38925444

ABSTRACT

Nonmelanoma skin cancers (NMSCs) are among the top five most common cancers globally. NMSC is an area with great potential for novel applications of diagnostic tools, including artificial intelligence (AI). In this scoping review, we aimed to describe the applications of AI in the diagnosis and treatment of NMSC. Twenty-nine publications described AI applications to dermatopathology, including lesion classification and margin assessment. Twenty-five publications discussed AI use in clinical image analysis, showing that algorithms are not superior to dermatologists and may rely on unbalanced, nonrepresentative, and nontransparent training data sets. Sixteen publications described the use of AI in cutaneous surgery for NMSC, including use in margin assessment during excisions and Mohs surgery, as well as predicting procedural complexity. Eleven publications discussed spectroscopy, confocal microscopy, thermography, and the AI algorithms that analyze and interpret their data. Ten publications pertained to AI applications for the discovery and use of NMSC biomarkers. Eight publications discussed the use of smartphones and AI, specifically how they enable clinicians and patients to have increased access to instant dermatologic assessments, though with varying accuracies. Five publications discussed large language models and NMSC, including how they may facilitate or hinder patient education and medical decision-making. Three publications pertaining to skin of color and AI for NMSC discussed concerns regarding limited diverse data sets for training convolutional neural networks. AI demonstrates tremendous potential to improve diagnosis, patient and clinician education, and management of NMSC. Despite excitement regarding AI, data sets are often not transparently reported, may include low-quality images, and may not include diverse skin types, limiting generalizability. AI may serve as a tool to increase access to dermatology services for patients in rural areas and save health care dollars. These benefits can only be achieved, however, with consideration of potential ethical costs.

16.
Patterns (N Y) ; 5(1): 100887, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38264716

ABSTRACT

To enhance phenotype recognition in clinical notes of genetic diseases, we developed two models, PhenoBCBERT and PhenoGPT, for expanding the vocabularies of Human Phenotype Ontology (HPO) terms. While HPO offers a standardized vocabulary for phenotypes, existing tools often fail to capture the full scope of phenotypes due to limitations of traditional heuristic or rule-based approaches. Our models leverage large language models to automate the detection of phenotype terms, including those not in the current HPO. We compared these models with PhenoTagger, another HPO recognition tool, and found that our models identify a wider range of phenotype concepts, including previously uncharacterized ones. Our models also show strong performance in case studies on biomedical literature. We evaluate the strengths and weaknesses of BERT- and GPT-based models in aspects such as architecture and accuracy. Overall, our models enhance automated phenotype detection from clinical texts, improving downstream analyses of human diseases.

17.
J Am Med Inform Assoc ; 31(5): 1062-1073, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38447587

ABSTRACT

BACKGROUND: Alzheimer's disease and related dementias (ADRD) affect over 55 million people globally. Current clinical trials suffer from low recruitment rates, a challenge potentially addressable via natural language processing (NLP) technologies that help researchers effectively identify eligible clinical trial participants. OBJECTIVE: This study investigates the sociotechnical feasibility of NLP-driven tools for ADRD research prescreening and analyzes the effect of the tools' cognitive complexity on usability to identify cognitive support strategies. METHODS: A randomized experiment was conducted with 60 clinical research staff using three prescreening tools (Criteria2Query, Informatics for Integrating Biology and the Bedside [i2b2], and Leaf). Cognitive task analysis was employed to analyze the usability of each tool using the Health Information Technology Usability Evaluation Scale. Data analysis involved calculating descriptive statistics, interrater agreement via the intraclass correlation coefficient, cognitive complexity, and Generalized Estimating Equations models. RESULTS: Leaf scored highest for usability, followed by Criteria2Query and i2b2. Cognitive complexity was affected by age, computer literacy, and number of criteria, but was not significantly associated with usability. DISCUSSION: Adopting NLP for ADRD prescreening demands careful task delegation, comprehensive training, precise translation of eligibility criteria, and increased research accessibility. The study highlights the relevance of these factors in enhancing the usability and efficacy of NLP-driven tools in clinical research prescreening. CONCLUSION: User-modifiable NLP-driven prescreening tools were favorably received, with system type, evaluation sequence, and users' computer literacy influencing usability more than cognitive complexity. The study emphasizes NLP's potential to improve recruitment for clinical trials, endorsing a mixed-methods approach for future system evaluation and enhancements.


Subjects
Alzheimer Disease, Medical Informatics, Humans, Natural Language Processing, Feasibility Studies, Eligibility Determination
18.
HGG Adv ; 5(2): 100281, 2024 Apr 11.
Article in English | MEDLINE | ID: mdl-38414240

ABSTRACT

Research on polygenic risk scores (PRSs) for common, genetically complex chronic diseases aims to improve health-related predictions, tailor risk-reducing interventions, and improve health outcomes. Yet, the study and use of PRSs in clinical settings raise equity, clinical, and regulatory challenges that can be greater for individuals from historically marginalized racial, ethnic, and other minoritized communities. As part of the National Human Genome Research Institute-funded Electronic Medical Records and Genomics IV Network, we conducted online focus groups with patients/community members, clinicians, and members of institutional review boards to explore their views on key issues, including PRS research, return of PRS results, clinical translation, and barriers and facilitators to health behavioral changes in response to PRS results. Across stakeholder groups, our findings indicate support for PRS development and a strong interest in having PRS results returned to research participants. However, we also found multi-level barriers and significant differences in stakeholders' views about what is needed and possible for successful implementation. These include researcher-participant interaction formats, health and genomic literacy, and a range of structural barriers, such as financial instability, insurance coverage, and the absence of health-supporting infrastructure and affordable healthy food options in poorer neighborhoods. Our findings highlight the need to revisit and implement measures in PRS studies (e.g., incentives and resources for follow-up care), as well as system-level policies to promote equity in genomic research and health outcomes.


Subjects
Electronic Health Records, Genetic Risk Stratification, Humans, Focus Groups
19.
Appl Clin Inform ; 15(2): 306-312, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38442909

ABSTRACT

OBJECTIVES: Large language models (LLMs) such as ChatGPT, a generative pre-trained transformer, are powerful algorithms that have been shown to produce human-like text from input data. Several potential clinical applications of this technology have been proposed and evaluated by biomedical informatics experts. However, few have surveyed health care providers for their opinions about whether the technology is fit for use. METHODS: We distributed a validated mixed-methods survey to gauge practicing clinicians' comfort with LLMs for a breadth of tasks in clinical practice, research, and education, which were selected from the literature. RESULTS: A total of 30 clinicians fully completed the survey. Of the 23 tasks, 16 were rated positively by more than 50% of the respondents. Based on our qualitative analysis, health care providers considered LLMs to have excellent synthesis skills and efficiency. However, our respondents had concerns that LLMs could generate false information and propagate training data bias. They were most comfortable with scenarios that allow LLMs to function in an assistive role, like a physician extender or trainee. CONCLUSION: In a mixed-methods survey of clinicians about LLM use, health care providers were encouraging of having LLMs in health care for many tasks, especially in assistive roles. There is a need for continued human-centered development of both LLMs and artificial intelligence in general.


Subjects
Algorithms, Artificial Intelligence, Humans, Health Facilities, Health Personnel, Language
20.
JAMIA Open ; 7(1): ooae021, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38455840

ABSTRACT

Objective: To automate scientific claim verification using PubMed abstracts. Materials and Methods: We developed CliVER, an end-to-end scientific Claim VERification system that leverages retrieval-augmented techniques to automatically retrieve relevant clinical trial abstracts, extract pertinent sentences, and use the PICO framework to support or refute a scientific claim. We also created an ensemble of three state-of-the-art deep learning models to classify rationales as support, refute, or neutral. We then constructed CoVERt, a new COVID VERification dataset comprising 15 PICO-encoded drug claims accompanied by 96 manually selected and labeled clinical trial abstracts that either support or refute each claim. We used CoVERt and SciFact (a public scientific claim verification dataset) to assess CliVER's performance in predicting labels. Finally, we compared CliVER to clinicians in the verification of 19 claims from 6 disease domains, using 189,648 PubMed abstracts extracted from January 2010 to October 2021. Results: In the evaluation of label prediction accuracy on CoVERt, CliVER achieved a notable F1 score of 0.92, highlighting the efficacy of the retrieval-augmented models. The ensemble model outperformed each individual state-of-the-art model by an absolute 3% to 11% increase in the F1 score. Moreover, when compared with four clinicians, CliVER achieved a precision of 79.0% for abstract retrieval, 67.4% for sentence selection, and 63.2% for label prediction. Conclusion: CliVER demonstrates early potential to automate scientific claim verification using retrieval-augmented strategies to harness the wealth of clinical trial abstracts in PubMed. Future studies are warranted to further test its clinical utility.
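The label-ensembling step can be sketched as a simple majority vote over the three classifiers' outputs; the per-model predictions and the NEUTRAL tie-break below are illustrative assumptions, not CliVER's documented aggregation rule.

```python
# Majority-vote label ensemble over three rationale classifiers (sketch).
from collections import Counter

def ensemble_label(votes: list[str]) -> str:
    """Majority vote with a NEUTRAL fallback on a three-way tie (assumed)."""
    label, n = Counter(votes).most_common(1)[0]
    return label if n > 1 else "NEUTRAL"

# Placeholder outputs for one (claim, abstract) pair.
model_outputs = {
    "model_a": "SUPPORT",
    "model_b": "SUPPORT",
    "model_c": "NEUTRAL",
}
print(ensemble_label(list(model_outputs.values())))  # -> SUPPORT
```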
