Results 1 - 19 of 19

1.
Ann Acad Med Singap ; 53(3): 187-207, 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38920245

ABSTRACT

Introduction: Automated machine learning (autoML) removes technical and technological barriers to building artificial intelligence models. We aimed to summarise the clinical applications of autoML, assess the capabilities of utilised platforms, evaluate the quality of the evidence trialling autoML, and gauge the performance of autoML platforms relative to conventionally developed models, as well as to each other. Method: This review adhered to a prospectively registered protocol (PROSPERO identifier CRD42022344427). The Cochrane Library, Embase, MEDLINE and Scopus were searched from inception to 11 July 2022. Two researchers screened abstracts and full texts, extracted data and conducted quality assessment. Disagreement was resolved through discussion and, if required, arbitration by a third researcher. Results: There were 26 distinct autoML platforms featured in 82 studies. Brain and lung disease were the most common fields of study among the 22 specialties represented. AutoML exhibited variable performance: area under the receiver operating characteristic curve (AUCROC) 0.35-1.00, F1-score 0.16-0.99, area under the precision-recall curve (AUPRC) 0.51-1.00. AutoML exhibited the highest AUCROC in 75.6% of trials, the highest F1-score in 42.3% of trials, and the highest AUPRC in 83.3% of trials. In autoML platform comparisons, AutoPrognosis and Amazon Rekognition performed strongest with unstructured and structured data, respectively. Quality of reporting was poor, with a median DECIDE-AI score of 14 of 27. Conclusion: A myriad of autoML platforms have been applied in a variety of clinical contexts. The performance of autoML compares well to bespoke computational and clinical benchmarks. Further work is required to improve the quality of validation studies. AutoML may facilitate a transition to data-centric development, and integration with large language models may enable AI to build itself to fulfil user-defined goals.
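The headline metrics used across these trials can be reproduced for any binary classifier; below is a minimal sketch using scikit-learn, with synthetic labels and scores standing in for real validation data.

```python
# Illustrative sketch: computing AUCROC, F1-score, and AUPRC for a
# binary classifier. Labels and scores are synthetic stand-ins for
# real validation data.
import numpy as np
from sklearn.metrics import average_precision_score, f1_score, roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)                          # ground-truth labels
y_score = np.clip(0.6 * y_true + 0.5 * rng.random(200), 0, 1)  # predicted probabilities

aucroc = roc_auc_score(y_true, y_score)              # area under the ROC curve
auprc = average_precision_score(y_true, y_score)     # area under the precision-recall curve
f1 = f1_score(y_true, (y_score >= 0.5).astype(int))  # F1-score at a 0.5 threshold

print(f"AUCROC={aucroc:.2f}  F1={f1:.2f}  AUPRC={auprc:.2f}")
```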


Subjects
Machine Learning, Humans, Lung Diseases/diagnosis, ROC Curve, Brain Diseases/diagnosis, Area Under Curve
2.
NPJ Digit Med ; 7(1): 131, 2024 May 18.
Article in English | MEDLINE | ID: mdl-38762669

ABSTRACT

Subjectivity and ambiguity in visual field classification limit the accuracy and reliability of glaucoma diagnosis, prognostication, and management decisions. Standardised rules for classifying glaucomatous visual field defects exist, but these are labour-intensive and therefore impractical for day-to-day clinical work. Here, a web application, Glaucoma Field Defect Classifier (GFDC), for automatic application of the Hodapp-Parrish-Anderson classification, is presented and validated in a cross-sectional study. GFDC exhibits perfect accuracy in classifying mild, moderate, and severe glaucomatous field defects. GFDC may thereby improve the accuracy and fairness of clinical decision-making in glaucoma. The application and its source code are freely hosted online for clinicians and researchers to use with glaucoma patients.
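The full GFDC source code is hosted online; purely as an illustration of rule-based staging, a deliberately simplified sketch of the mean-deviation component of the Hodapp-Parrish-Anderson criteria follows. The published criteria also count depressed pattern-deviation points and defects near fixation, which are omitted here.

```python
# Simplified, illustrative sketch of Hodapp-Parrish-Anderson staging.
# Only the mean deviation (MD) thresholds are modelled; the published
# criteria additionally consider the number of depressed points and
# defects near fixation, which a real classifier must include.
def hpa_stage(mean_deviation_db: float) -> str:
    """Stage a glaucomatous field defect from Humphrey 24-2 MD (dB)."""
    if mean_deviation_db > -6.0:
        return "mild"
    if mean_deviation_db >= -12.0:
        return "moderate"
    return "severe"

assert hpa_stage(-3.2) == "mild"
assert hpa_stage(-8.5) == "moderate"
assert hpa_stage(-15.0) == "severe"
```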

3.
PLOS Digit Health ; 3(4): e0000341, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38630683

ABSTRACT

Large language models (LLMs) underlie remarkable recent advances in natural language processing, and they are beginning to be applied in clinical contexts. We aimed to evaluate the clinical potential of state-of-the-art LLMs in ophthalmology using a more robust benchmark than raw examination scores. We trialled GPT-3.5 and GPT-4 on 347 ophthalmology questions before GPT-3.5, GPT-4, PaLM 2, LLaMA, expert ophthalmologists, and doctors in training were trialled on a mock examination of 87 questions. Performance was analysed with respect to question subject and type (first-order recall and higher-order reasoning). Masked ophthalmologists graded the accuracy, relevance, and overall preference of GPT-3.5 and GPT-4 responses to the same questions. The performance of GPT-4 (69%) was superior to that of GPT-3.5 (48%), LLaMA (32%), and PaLM 2 (56%). GPT-4 compared favourably with expert ophthalmologists (median 76%, range 64-90%), ophthalmology trainees (median 59%, range 57-63%), and unspecialised junior doctors (median 43%, range 41-44%). Low agreement between LLMs and doctors reflected idiosyncratic differences in knowledge and reasoning, with overall consistency across subjects and types (p>0.05). All ophthalmologists preferred GPT-4 responses over GPT-3.5 and rated the accuracy and relevance of GPT-4 higher (p<0.05). LLMs are approaching expert-level knowledge and reasoning skills in ophthalmology. In view of their comparable or superior performance relative to trainee-grade ophthalmologists and unspecialised junior doctors, state-of-the-art LLMs such as GPT-4 may provide useful medical advice and assistance where access to expert ophthalmologists is limited. Clinical benchmarks provide useful assays of LLM capabilities in healthcare before clinical trials can be designed and conducted.
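The benchmarking approach generalises to any chat-based LLM. A minimal sketch of trialling a model on multiple-choice questions and scoring its accuracy follows; the model name, prompt wording, and example question are illustrative assumptions rather than the study's actual protocol.

```python
# Sketch: scoring a chat-based LLM on multiple-choice exam questions.
# Client usage follows the OpenAI Python SDK; the question, prompt
# wording, and model choice are illustrative assumptions only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

questions = [
    {"stem": "Which retinal layer contains the photoreceptor nuclei?",
     "options": {"A": "Ganglion cell layer", "B": "Outer nuclear layer",
                 "C": "Inner plexiform layer", "D": "Nerve fibre layer"},
     "answer": "B"},
]

correct = 0
for q in questions:
    prompt = q["stem"] + "\n" + "\n".join(f"{k}. {v}" for k, v in q["options"].items())
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system",
                   "content": "Answer with the single letter of the best option."},
                  {"role": "user", "content": prompt}],
    )
    answer = reply.choices[0].message.content.strip().upper()
    correct += answer.startswith(q["answer"])  # bool counts as 0 or 1

print(f"Accuracy: {correct / len(questions):.0%}")
```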

4.
JMIR Form Res ; 8: e51770, 2024 Jan 25.
Article in English | MEDLINE | ID: mdl-38271088

ABSTRACT

BACKGROUND: Approximately 80% of primary school children in the United States and Europe experience glue ear, which may impair hearing at a critical time for speech acquisition and social development. A web-based app, DigiBel, has been developed primarily to identify individuals with conductive hearing impairment who may benefit from the temporary use of bone-conduction assistive technology in the community. OBJECTIVE: This preliminary study aims to determine the screening accuracy and usability of DigiBel self-assessed air-conduction (AC) pure tone audiometry in adult volunteers with simulated hearing impairment prior to formal clinical validation. METHODS: Healthy adults, each with 1 ear plugged, underwent automated AC pure tone audiometry (reference test) and DigiBel audiometry in quiet community settings. Threshold measurements were compared across 6 tone frequencies, and DigiBel test-retest reliability was calculated. The accuracy of DigiBel for detecting more than 20 dB of hearing impairment was assessed. A total of 30 adults (30 unplugged ears and 30 plugged ears) completed both audiometry tests. RESULTS: DigiBel had 100% sensitivity (95% CI 87.23-100) and 72.73% specificity (95% CI 54.48-86.70) in detecting hearing impairment. Threshold mean bias was insignificant except at 4000 and 8000 Hz, where a small but significant overestimation of threshold measurement was identified. All 24 participants who provided feedback rated the DigiBel test as good or excellent, and 21 (88%) agreed or strongly agreed that they would be able to do the test at home without help. CONCLUSIONS: This study supports the potential use of DigiBel as a screening tool for hearing impairment. The findings will be used to improve the software further prior to undertaking a formal clinical trial of AC and bone-conduction audiometry in individuals with suspected conductive hearing impairment.
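Screening accuracy figures of this kind follow from a standard two-by-two table; the sketch below computes sensitivity and specificity with exact (Clopper-Pearson) confidence intervals using statsmodels. The counts are assumed for illustration and are not the study's raw data.

```python
# Sketch: sensitivity and specificity with 95% confidence intervals.
# Counts are illustrative assumptions, not DigiBel's raw data; the
# "beta" method in statsmodels is the exact Clopper-Pearson interval.
from statsmodels.stats.proportion import proportion_confint

tp, fn = 30, 0   # impaired (plugged) ears correctly flagged / missed
tn, fp = 22, 8   # normal ears correctly passed / incorrectly flagged

sens = tp / (tp + fn)
spec = tn / (tn + fp)
sens_lo, sens_hi = proportion_confint(tp, tp + fn, alpha=0.05, method="beta")
spec_lo, spec_hi = proportion_confint(tn, tn + fp, alpha=0.05, method="beta")

print(f"Sensitivity {sens:.1%} (95% CI {sens_lo:.1%}-{sens_hi:.1%})")
print(f"Specificity {spec:.1%} (95% CI {spec_lo:.1%}-{spec_hi:.1%})")
```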

5.
Biomed J ; : 100679, 2023 Dec 02.
Article in English | MEDLINE | ID: mdl-38048990

ABSTRACT

The Metaverse has gained wide attention as the application interface for the next generation of the Internet. The potential of the Metaverse is growing as Web 3.0 development and adoption continue to advance medicine and healthcare. We define the next generation of the interoperable healthcare ecosystem in the Metaverse. We examine the existing literature regarding the Metaverse, explain the technology framework required to deliver an immersive experience, and provide a technical comparison of legacy and novel Metaverse platforms that are publicly released and in active use. The potential applications of different features of the Metaverse, including avatar-based meetings, immersive simulations, and social interactions, are examined for different roles, from patients to healthcare providers and healthcare organizations. Present challenges in the development of the Metaverse healthcare ecosystem are discussed, along with potential solutions, including capabilities requiring technological innovation, use cases requiring regulatory supervision, and sound governance. This proposed concept and framework of the Metaverse could potentially redefine the traditional healthcare system and enhance digital transformation in healthcare. As with AI technology at the beginning of this decade, real-world development and implementation of these capabilities are relatively nascent. Further pragmatic research is needed for the development of an interoperable healthcare ecosystem in the Metaverse.

6.
J Med Internet Res ; 25: e51603, 2023 12 05.
Article in English | MEDLINE | ID: mdl-38051572

ABSTRACT

Large language models (LLMs) are exhibiting remarkable performance in clinical contexts, with exemplar results ranging from expert-level attainment in medical examination questions to superior accuracy and relevance, compared with real doctors, when responding to patient queries on social media. The deployment of LLMs in conventional health care settings is yet to be reported, and there remains an open question as to what evidence should be required before such deployment is warranted. Early validation studies use unvalidated surrogate variables to represent clinical aptitude, and it may be necessary to conduct prospective randomized controlled trials to justify the use of an LLM for clinical advice or assistance, as potential pitfalls and pain points cannot be exhaustively predicted. This viewpoint argues that as LLMs continue to revolutionize the field, there is an opportunity to improve the rigor of artificial intelligence (AI) research to reward innovation, conferring real benefits to real patients.


Assuntos
Aptidão , Inteligência Artificial , Competência Clínica , Humanos , Idioma , Dor , Estudos Prospectivos
7.
J Med Internet Res ; 25: e49949, 2023 10 12.
Article in English | MEDLINE | ID: mdl-37824185

ABSTRACT

Deep learning-based clinical imaging analysis underlies diagnostic artificial intelligence (AI) models, which can match or even exceed the performance of clinical experts and have the potential to revolutionize clinical practice. A wide variety of automated machine learning (autoML) platforms lower the technical barrier to entry to deep learning, extending AI capabilities to clinicians with limited technical expertise and even to autonomous foundation models such as multimodal large language models. Here, we provide a technical overview of autoML with descriptions of how autoML may be applied in education, research, and clinical practice. Each stage of the process of conducting an autoML project is outlined, with an emphasis on ethical and technical best practices. Specifically, data acquisition, data partitioning, model training, model validation, analysis, and model deployment are considered. The strengths and limitations of available code-free, code-minimal, and code-intensive autoML platforms are considered. AutoML has great potential to democratize AI in medicine, improving AI literacy by enabling "hands-on" education. AutoML may serve as a useful adjunct in research by facilitating rapid testing and benchmarking before significant computational resources are committed. AutoML may also be applied in clinical contexts, provided regulatory requirements are met. The abstraction by autoML of arduous aspects of AI engineering promotes prioritization of data set curation, supporting the transition from conventional model-driven approaches to data-centric development. To fulfill its potential, clinicians must be educated on how to apply these technologies ethically, rigorously, and effectively; this tutorial represents a comprehensive summary of relevant considerations.
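As a concrete illustration of the partitioning and validation stages outlined above, the sketch below splits a dataset into training, validation, and test partitions before fitting and evaluating a model. The dataset and model are synthetic placeholders; an autoML platform would automate the model selection and tuning shown here manually.

```python
# Sketch: data partitioning, model training, and validation stages of
# a model development project. Synthetic data and a simple model stand
# in for a real clinical dataset and an autoML-selected architecture.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Hold out a test set first, then carve a validation set from the rest.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_dev, y_dev, test_size=0.25, stratify=y_dev, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Validation AUCROC:", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
print("Held-out test AUCROC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```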


Assuntos
Inteligência Artificial , Aprendizado de Máquina , Humanos , Processamento de Imagem Assistida por Computador , Escolaridade , Benchmarking
8.
Ophthalmol Sci ; 3(4): 100394, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37885755

ABSTRACT

The rapid progress of large language models (LLMs) driving generative artificial intelligence applications heralds a wealth of opportunities in health care. We conducted a review up to April 2023 on Google Scholar, Embase, MEDLINE, and Scopus using the following terms: "large language models," "generative artificial intelligence," "ophthalmology," "ChatGPT," and "eye," selecting articles based on relevance to this review. From a clinical viewpoint specific to ophthalmologists, we explore the potential LLM applications in education, research, and clinical domains specific to ophthalmology from the perspectives of different stakeholders, including patients, physicians, and policymakers. We also highlight the foreseeable challenges of LLM implementation in clinical practice, including concerns about accuracy, interpretability, perpetuation of bias, and data security. As LLMs continue to mature, it is essential for stakeholders to jointly establish standards for best practices to safeguard patient safety. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.

9.
Cell Rep Med ; 4(10): 101230, 2023 10 17.
Article in English | MEDLINE | ID: mdl-37852174

ABSTRACT

Current and future healthcare professionals are generally not trained to cope with the proliferation of artificial intelligence (AI) technology in healthcare. To design a curriculum that caters to variable baseline knowledge and skills, clinicians may be conceptualized as "consumers", "translators", or "developers". The changes required of medical education because of AI innovation are linked to those brought about by evidence-based medicine (EBM). We outline a core curriculum for AI education of future consumers, translators, and developers, emphasizing the links between AI and EBM, with suggestions for how teaching may be integrated into existing curricula. We consider the key barriers to implementation of AI in the medical curriculum: time, resources, variable interest, and knowledge retention. By improving AI literacy rates and fostering a translator- and developer-enriched workforce, innovation may be accelerated for the benefit of patients and practitioners.


Subjects
Artificial Intelligence, Medical Education, Humans, Curriculum, Evidence-Based Medicine/education
10.
Nat Med ; 29(8): 1930-1940, 2023 08.
Article in English | MEDLINE | ID: mdl-37460753

ABSTRACT

Large language models (LLMs) can respond to free-text queries without being specifically trained in the task in question, causing excitement and concern about their use in healthcare settings. ChatGPT is a generative artificial intelligence (AI) chatbot produced through sophisticated fine-tuning of an LLM, and other tools are emerging through similar developmental processes. Here we outline how LLM applications such as ChatGPT are developed, and we discuss how they are being leveraged in clinical settings. We consider the strengths and limitations of LLMs and their potential to improve the efficiency and effectiveness of clinical, educational and research work in medicine. LLM chatbots have already been deployed in a range of biomedical contexts, with impressive but mixed results. This review acts as a primer for interested clinicians, who will determine if and how LLM technology is used in healthcare for the benefit of patients and practitioners.


Subjects
Artificial Intelligence, Medicine, Humans, Language, Software, Technology
11.
PLoS One ; 18(6): e0281847, 2023.
Article in English | MEDLINE | ID: mdl-37347757

ABSTRACT

BACKGROUND: Remote self-administered visual acuity (VA) tests have the potential to allow patients and non-specialists to assess vision without eye health professional input. Validation in pragmatic trials is necessary to demonstrate the accuracy and reliability of tests in relevant settings to justify deployment. Here, published pragmatic trials of these tests were synthesised to summarise the effectiveness of available options and appraise the quality of their supporting evidence. METHODS: A systematic review was undertaken in accordance with a preregistered protocol (CRD42022385045). The Cochrane Library, Embase, MEDLINE, and Scopus were searched. Screening was conducted according to the following criteria: (1) English language; (2) primary research article; (3) visual acuity test conducted outside an eye clinic; (4) no clinical administration of the remote test; (5) accuracy or reliability of the remote test analysed. There were no restrictions on trial participants. Quality assessment was conducted with QUADAS-2. RESULTS: Of 1227 identified reports, 10 studies were ultimately included. One study was at high risk of bias and two studies exhibited concerning features of bias; all studies were applicable. Three trials (of DigiVis, iSight Professional, and Peek Acuity) from two studies suggested that the accuracy of the remote tests is comparable to clinical assessment. All other trials exhibited inferior accuracy, including conflicting results from a pooled study of iSight Professional and Peek Acuity. Two studies evaluated test-retest agreement; one trial provided evidence that DigiVis is as reliable as clinical assessment. The three most accurate tests required access to digital devices. Reporting was inconsistent and often incomplete, particularly with regard to describing methods and conducting statistical analysis. CONCLUSIONS: Remote self-administered VA tests appear promising, but further pragmatic trials are indicated to justify deployment in carefully defined contexts to facilitate patient- or non-specialist-led assessment. Deployment could augment teleophthalmology, non-specialist eye assessment, pre-consultation triage, and autonomous long-term monitoring of vision.


Assuntos
Oftalmologia , Telemedicina , Humanos , Reprodutibilidade dos Testes , Acuidade Visual
13.
JMIR Med Educ ; 9: e46599, 2023 Apr 21.
Article in English | MEDLINE | ID: mdl-37083633

ABSTRACT

BACKGROUND: Large language models exhibiting human-level performance in specialized tasks are emerging; examples include Generative Pretrained Transformer 3.5, which underlies the processing of ChatGPT. Rigorous trials are required to understand the capabilities of emerging technology, so that innovation can be directed to benefit patients and practitioners. OBJECTIVE: Here, we evaluated the strengths and weaknesses of ChatGPT in primary care using the Membership of the Royal College of General Practitioners Applied Knowledge Test (AKT) as a medium. METHODS: AKT questions were sourced from a web-based question bank and 2 AKT practice papers. In total, 674 unique AKT questions were inputted to ChatGPT, with the model's answers recorded and compared to correct answers provided by the Royal College of General Practitioners. Each question was inputted twice in separate ChatGPT sessions, with answers on repeated trials compared to gauge consistency. Subject difficulty was gauged by referring to examiners' reports from 2018 to 2022. Novel explanations from ChatGPT-defined as information provided that was not inputted within the question or multiple answer choices-were recorded. Performance was analyzed with respect to subject, difficulty, question source, and novel model outputs to explore ChatGPT's strengths and weaknesses. RESULTS: Average overall performance of ChatGPT was 60.17%, which is below the mean passing mark in the last 2 years (70.42%). Accuracy differed between sources (P=.04 and .06). ChatGPT's performance varied with subject category (P=.02 and .02), but variation did not correlate with difficulty (Spearman ρ=-0.241 and -0.238; P=.19 and .20). The proclivity of ChatGPT to provide novel explanations did not affect accuracy (P>.99 and .23). CONCLUSIONS: Large language models are approaching human expert-level performance, although further development is required to match the performance of qualified primary care physicians in the AKT. Validated high-performance models may serve as assistants or autonomous clinical tools to ameliorate the general practice workforce crisis.
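The difficulty analysis reported above is a standard rank correlation; a minimal sketch follows, with made-up per-subject figures standing in for the study's data.

```python
# Sketch: testing whether model accuracy tracks question difficulty,
# as in the AKT analysis. Values are illustrative, not study data.
from scipy.stats import spearmanr

subject_accuracy   = [0.72, 0.55, 0.61, 0.48, 0.66, 0.59]  # model accuracy per subject
subject_difficulty = [0.35, 0.52, 0.44, 0.58, 0.40, 0.47]  # examiner-reported difficulty

rho, p = spearmanr(subject_accuracy, subject_difficulty)
print(f"Spearman rho={rho:.3f}, p={p:.2f}")  # non-significant => no clear link
```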

14.
Health Care Sci ; 2(4): 255-263, 2023 Aug.
Article in English | MEDLINE | ID: mdl-38939520

ABSTRACT

Recently, the emergence of ChatGPT, an artificial intelligence chatbot developed by OpenAI, has attracted significant attention due to its exceptional language comprehension and content generation capabilities, highlighting the immense potential of large language models (LLMs). LLMs have become a burgeoning hotspot across many fields, including health care. Within health care, LLMs may be classified into LLMs for the biomedical domain and LLMs for the clinical domain based on the corpora used for pre-training. In the last 3 years, these domain-specific LLMs have demonstrated exceptional performance on multiple natural language processing tasks, surpassing the performance of general LLMs. This not only emphasizes the significance of developing dedicated LLMs for specific domains, but also raises expectations for their applications in health care. We believe that LLMs may be used widely in preconsultation, diagnosis, and management, with appropriate development and supervision. Additionally, LLMs hold tremendous promise in assisting with medical education, medical writing, and other related applications. At the same time, health care systems must recognize and address the challenges posed by LLMs.

16.
Eye (Lond) ; 36(10): 2057-2061, 2022 10.
Article in English | MEDLINE | ID: mdl-34462579

ABSTRACT

BACKGROUND/OBJECTIVES: Ophthalmic disorders cause 8% of hospital clinic attendances, the highest of any specialty. The fundamental need for a distance visual acuity (VA) measurement constrains remote consultation. A web application, DigiVis, facilitates self-assessment of VA using two internet-connected devices. This prospective validation study aimed to establish its accuracy, reliability, usability, and acceptability. SUBJECTS/METHODS: In total, 120 patients aged 5-87 years (median 27) self-tested their vision twice using DigiVis in addition to their standard clinical assessment. Eyes with VA worse than +0.80 logMAR were excluded. Accuracy and test-retest (TRT) variability were compared using Bland-Altman analysis and intraclass correlation coefficients (ICC). Patient feedback was analysed. RESULTS: Bias between VA tests was insignificant at -0.001 (95% CI -0.017 to 0.015) logMAR. The upper limit of agreement (LOA) was 0.173 (95% CI 0.146 to 0.201) logMAR and the lower LOA -0.175 (95% CI -0.202 to -0.147) logMAR. The ICC was 0.818 (95% CI 0.748 to 0.869). DigiVis TRT mean bias was similarly insignificant at 0.001 (95% CI -0.011 to 0.013) logMAR; the upper LOA was 0.124 (95% CI 0.103 to 0.144) logMAR and the lower LOA -0.121 (95% CI -0.142 to -0.101) logMAR. The ICC was 0.922 (95% CI 0.887 to 0.946). Of the subjects, 95% were willing to use DigiVis to monitor vision at home. CONCLUSIONS: Self-tested distance VA using DigiVis is accurate, reliable, and well accepted by patients. The app has potential to facilitate home monitoring, triage, and remote consultation, but wide-scale implementation will require integration with NHS databases and secure patient data storage.
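The agreement statistics above follow the standard Bland-Altman construction: the bias is the mean of the paired differences, and the 95% limits of agreement sit 1.96 standard deviations either side of it. A minimal sketch follows, with simulated paired logMAR measurements standing in for the study data.

```python
# Sketch: Bland-Altman bias and 95% limits of agreement for paired
# logMAR measurements (app vs clinical chart). Data are simulated.
import numpy as np

rng = np.random.default_rng(1)
clinic = rng.normal(0.2, 0.25, size=120)         # chart-measured logMAR
app = clinic + rng.normal(0.0, 0.09, size=120)   # app-measured logMAR

diff = app - clinic
bias = diff.mean()                               # mean difference (bias)
sd = diff.std(ddof=1)                            # SD of differences
loa_lower, loa_upper = bias - 1.96 * sd, bias + 1.96 * sd

print(f"Bias {bias:+.3f} logMAR, 95% LOA [{loa_lower:+.3f}, {loa_upper:+.3f}]")
```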


Assuntos
Software , Testes Visuais , Humanos , Reprodutibilidade dos Testes , Visão Ocular , Acuidade Visual
18.
BMJ Open Ophthalmol ; 6(1): e000801, 2021.
Article in English | MEDLINE | ID: mdl-34651083

ABSTRACT

OBJECTIVE: The difficulty of accurately assessing distance visual acuity (VA) at home limits the usefulness of remote consultation in ophthalmology. A novel web application, DigiVis, enables automated VA self-assessment using standard digital devices. This study aimed to compare its accuracy and reliability in children with clinical assessment by a healthcare professional. METHODS AND ANALYSIS: Children aged 4-10 years were recruited from a paediatric ophthalmology service. Those with VA worse than +0.8 logMAR (logarithm of the minimum angle of resolution) or with cognitive impairment were excluded. Bland-Altman statistics were used to analyse both the accuracy and repeatability of VA self-testing. User feedback was collected by questionnaire. RESULTS: The left eyes of 89 children (median age 7 years) were tested. VA self-testing showed a mean bias of 0.023 logMAR, with a limit of agreement (LOA) of ±0.195 logMAR and an intraclass correlation coefficient (ICC) of 0.816. A second test was possible in 80 (90%) children. Test-retest comparison showed a mean bias of 0.010 logMAR, with an LOA of ±0.179 logMAR, an ICC of 0.815, and a repeatability coefficient of 0.012. The test was rated good or excellent by 96% of children and 99% of their parents. CONCLUSION: Digital self-testing gave distance VA assessments comparable to clinical testing in children and was well accepted. Since DigiVis self-testing can be performed under direct supervision using medical video consultation software, it may be a useful tool to enable a proportion of paediatric eye clinic attendances to be moved online, reducing time off school and releasing face-to-face clinical capacity for those who need it.
