Results 1 - 17 of 17
1.
Nat Commun ; 15(1): 1906, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38503774

ABSTRACT

Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose TacticAI, an AI football tactics assistant developed and evaluated in close collaboration with domain experts from Liverpool FC. We focus on analysing corner kicks, as they offer coaches the most direct opportunities for interventions and improvements. TacticAI incorporates both a predictive and a generative component, allowing the coaches to effectively sample and explore alternative player setups for each corner kick routine and to select those with the highest predicted likelihood of success. We validate TacticAI on a number of relevant benchmark tasks: predicting receivers and shot attempts and recommending player position adjustments. The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC. We show that TacticAI's model suggestions are not only indistinguishable from real tactics, but also favoured over existing tactics 90% of the time, and that TacticAI offers an effective corner kick retrieval system. TacticAI achieves these results despite the limited availability of gold-standard data, achieving data efficiency through geometric deep learning.


Subject(s)
Athletic Performance , Athletic Performance/physiology , Qualitative Research , Soccer
2.
BMJ Open ; 14(3): e079105, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38490661

ABSTRACT

INTRODUCTION: For artificial intelligence (AI) to help improve mental healthcare, the design of data-driven technologies needs to be fair, safe, and inclusive. Participatory design can play a critical role in empowering marginalised communities to take an active role in constructing research agendas and outputs. Given the unmet needs of the LGBTQI+ (Lesbian, Gay, Bisexual, Transgender, Queer and Intersex) community in mental healthcare, there is a pressing need for participatory research to include a range of diverse queer perspectives on issues of data collection and use (in routine clinical care as well as for research) as well as AI design. Here we propose a protocol for a Delphi consensus process for the development of PARticipatory Queer AI Research for Mental Health (PARQAIR-MH) practices, aimed at informing digital health practices and policy. METHODS AND ANALYSIS: The development of PARQAIR-MH is comprised of four stages. In stage 1, a review of recent literature and fact-finding consultation with stakeholder organisations will be conducted to define a terms-of-reference for stage 2, the Delphi process. Our Delphi process consists of three rounds, where the first two rounds will iterate and identify items to be included in the final Delphi survey for consensus ratings. Stage 3 consists of consensus meetings to review and aggregate the Delphi survey responses, leading to stage 4 where we will produce a reusable toolkit to facilitate participatory development of future bespoke LGBTQI+-adapted data collection, harmonisation, and use for data-driven AI applications specifically in mental healthcare settings. ETHICS AND DISSEMINATION: PARQAIR-MH aims to deliver a toolkit that will help to ensure that the specific needs of LGBTQI+ communities are accounted for in mental health applications of data-driven technologies. The study is expected to run from June 2024 through January 2025, with the final outputs delivered in mid-2025. 
Participants in the Delphi process will be recruited by snowball and opportunistic sampling via professional networks and social media (but not by direct approach to healthcare service users, patients, specific clinical services, or via clinicians' caseloads). Participants will not be required to share personal narratives and experiences of healthcare or treatment for any condition. Before agreeing to participate, people will be given information about the issues considered to be in-scope for the Delphi (eg, developing best practices and methods for collecting and harmonising sensitive characteristics data; developing guidelines for data use/reuse) alongside specific risks of unintended harm from participating that can be reasonably anticipated. Outputs will be made available in open-access peer-reviewed publications, blogs, social media, and on a dedicated project website for future reuse.


Subject(s)
Mental Health , Sexual and Gender Minorities , Female , Humans , Delphi Technique , Artificial Intelligence , Data Collection , Review Literature as Topic
4.
Nature ; 620(7972): 172-180, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37438534

ABSTRACT

Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model [1] (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM [2], on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA [3], MedMCQA [4], PubMedQA [5] and Measuring Massive Multitask Language Understanding (MMLU) clinical topics [6]), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.


Subject(s)
Benchmarking , Computer Simulation , Knowledge , Medicine , Natural Language Processing , Bias , Clinical Competence , Comprehension , Datasets as Topic , Licensure , Medicine/methods , Medicine/standards , Patient Safety , Physicians
5.
Nat Commun ; 14(1): 4314, 2023 Jul 18.
Article in English | MEDLINE | ID: mdl-37463884

ABSTRACT

Machine learning (ML) holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities. An important step is to characterize the (un)fairness of ML models (their tendency to perform differently across subgroups of the population) and to understand its underlying mechanisms. One potential driver of algorithmic unfairness, shortcut learning, arises when ML models base predictions on improper correlations in the training data. Diagnosing this phenomenon is difficult as sensitive attributes may be causally linked with disease. Using multitask learning, we propose a method to directly test for the presence of shortcut learning in clinical ML systems and demonstrate its application to clinical tasks in radiology and dermatology. Finally, our approach reveals instances when shortcutting is not responsible for unfairness, highlighting the need for a holistic approach to fairness mitigation in medical AI.


Subject(s)
Health Facilities , Machine Learning
6.
Nat Biomed Eng ; 7(6): 756-779, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37291435

ABSTRACT

Machine-learning models for medical tasks can match or surpass the performance of clinical experts. However, in settings differing from those of the training dataset, the performance of a model can deteriorate substantially. Here we report a representation-learning strategy for machine-learning models applied to medical-imaging tasks that mitigates such 'out-of-distribution' performance problems and that improves model robustness and training efficiency. The strategy, which we named REMEDIS (for 'Robust and Efficient Medical Imaging with Self-supervision'), combines large-scale supervised transfer learning on natural images and intermediate contrastive self-supervised learning on medical images and requires minimal task-specific customization. We show the utility of REMEDIS in a range of diagnostic-imaging tasks covering six imaging domains and 15 test datasets, and by simulating three realistic out-of-distribution scenarios. REMEDIS improved in-distribution diagnostic accuracies up to 11.5% with respect to strong supervised baseline models, and in out-of-distribution settings required only 1-33% of the data for retraining to match the performance of supervised models retrained using all available data. REMEDIS may accelerate the development lifecycle of machine-learning models for medical imaging.


Subject(s)
Machine Learning , Supervised Machine Learning , Diagnostic Imaging
8.
Proc Natl Acad Sci U S A ; 119(47): e2206625119, 2022 Nov 22.
Article in English | MEDLINE | ID: mdl-36375061

ABSTRACT

We analyze the knowledge acquired by AlphaZero, a neural network engine that learns chess solely by playing against itself yet becomes capable of outperforming human chess players. Although the system trains without access to human games or guidance, it appears to learn concepts analogous to those used by human chess players. We provide two lines of evidence. Linear probes applied to AlphaZero's internal state enable us to quantify when and where such concepts are represented in the network. We also describe a behavioral analysis of opening play, including qualitative commentary by a former world chess champion.


Subject(s)
Neural Networks, Computer , Recreation , Humans , Learning
10.
Nature ; 600(7887): 70-74, 2021 Dec.
Article in English | MEDLINE | ID: mdl-34853458

ABSTRACT

The practice of mathematics involves discovering patterns and using these to formulate and prove conjectures, resulting in theorems. Since the 1960s, mathematicians have used computers to assist in the discovery of patterns and formulation of conjectures [1], most famously in the Birch and Swinnerton-Dyer conjecture [2], a Millennium Prize Problem [3]. Here we provide examples of new fundamental results in pure mathematics that have been discovered with the assistance of machine learning, demonstrating a method by which machine learning can aid mathematicians in discovering new conjectures and theorems. We propose a process of using machine learning to discover potential patterns and relations between mathematical objects, understanding them with attribution techniques and using these observations to guide intuition and propose conjectures. We outline this machine-learning-guided framework and demonstrate its successful application to current research questions in distinct areas of pure mathematics, in each case showing how it led to meaningful mathematical contributions on important open problems: a new connection between the algebraic and geometric structure of knots, and a candidate algorithm predicted by the combinatorial invariance conjecture for symmetric groups [4]. Our work may serve as a model for collaboration between the fields of mathematics and artificial intelligence (AI) that can achieve surprising results by leveraging the respective strengths of mathematicians and machine learning.

11.
BMJ Open ; 11(6): e047709, 2021 Jun 28.
Article in English | MEDLINE | ID: mdl-34183345

ABSTRACT

INTRODUCTION: Standards for Reporting of Diagnostic Accuracy Study (STARD) was developed to improve the completeness and transparency of reporting in studies investigating diagnostic test accuracy. However, its current form, STARD 2015, does not address the issues and challenges raised by artificial intelligence (AI)-centred interventions. As such, we propose an AI-specific version of the STARD checklist (STARD-AI), which focuses on the reporting of AI diagnostic test accuracy studies. This paper describes the methods that will be used to develop STARD-AI. METHODS AND ANALYSIS: The development of the STARD-AI checklist can be distilled into six stages. (1) A project organisation phase has been undertaken, during which a Project Team and a Steering Committee were established; (2) An item generation process has been completed following a literature review, a patient and public involvement and engagement exercise and an online scoping survey of international experts; (3) A three-round modified Delphi consensus methodology is underway, which will culminate in a teleconference consensus meeting of experts; (4) Thereafter, the Project Team will draft the initial STARD-AI checklist and the accompanying documents; (5) A piloting phase among expert users will be undertaken to identify items which are either unclear or missing. This process, consisting of surveys and semistructured interviews, will contribute towards the explanation and elaboration document; and (6) On finalisation of the manuscripts, the group's efforts turn towards an organised dissemination and implementation strategy to maximise end-user adoption. ETHICS AND DISSEMINATION: Ethical approval has been granted by the Joint Research Compliance Office at Imperial College London (reference number: 19IC5679). A dissemination strategy will be aimed towards five groups of stakeholders: (1) academia, (2) policy, (3) guidelines and regulation, (4) industry and (5) public and non-specific stakeholders.
We anticipate that dissemination will take place in Q3 of 2021.


Subject(s)
Artificial Intelligence , Diagnostic Tests, Routine , Humans , London , Research Design , Research Report
12.
J Am Med Inform Assoc ; 28(9): 1936-1946, 2021 Aug 13.
Article in English | MEDLINE | ID: mdl-34151965

ABSTRACT

OBJECTIVE: Multitask learning (MTL) using electronic health records allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however, it often suffers from negative transfer - impaired learning if tasks are not appropriately selected. We introduce a sequential subnetwork routing (SeqSNR) architecture that uses soft parameter sharing to find related tasks and encourage cross-learning between them. MATERIALS AND METHODS: Using the MIMIC-III (Medical Information Mart for Intensive Care-III) dataset, we train deep neural network models to predict the onset of 6 endpoints including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single-task (ST) models with naive multitask and SeqSNR in terms of discriminative performance and label efficiency. RESULTS: SeqSNR showed a modest yet statistically significant performance boost across 4 of 6 tasks compared with ST and naive multitasking. When the size of the training dataset was reduced for a given task (label efficiency), SeqSNR outperformed ST for all cases showing an average area under the precision-recall curve boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively. CONCLUSIONS: The SeqSNR architecture shows superior label efficiency compared with ST and naive multitasking, suggesting utility in scenarios in which endpoint labels are difficult to ascertain.


Subject(s)
Machine Learning , Multiple Organ Failure , Electronic Health Records , Humans , Intensive Care Units , Neural Networks, Computer
13.
Nat Protoc ; 16(6): 2765-2787, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33953393

ABSTRACT

Early prediction of patient outcomes is important for targeting preventive care. This protocol describes a practical workflow for developing deep-learning risk models that can predict various clinical and operational outcomes from structured electronic health record (EHR) data. The protocol comprises five main stages: formal problem definition, data pre-processing, architecture selection, calibration and uncertainty, and generalizability evaluation. We have applied the workflow to four endpoints (acute kidney injury, mortality, length of stay and 30-day hospital readmission). The workflow can enable continuous (e.g., triggered every 6 h) and static (e.g., triggered at 24 h after admission) predictions. We also provide an open-source codebase that illustrates some key principles in EHR modeling. This protocol can be used by interdisciplinary teams with programming and clinical expertise to build deep-learning prediction models with alternate data sources and prediction tasks.


Subject(s)
Deep Learning , Electronic Health Records , Research Design , Risk Assessment/methods , Humans , Software , Workflow
14.
Nat Commun ; 11(1): 2468, 2020 May 18.
Article in English | MEDLINE | ID: mdl-32424119

ABSTRACT

Advances in machine learning (ML) and artificial intelligence (AI) present an opportunity to build better tools and solutions to help address some of the world's most pressing challenges, and deliver positive social impact in accordance with the priorities outlined in the United Nations' 17 Sustainable Development Goals (SDGs). The AI for Social Good (AI4SG) movement aims to establish interdisciplinary partnerships centred around AI applications towards SDGs. We provide a set of guidelines for establishing successful long-term collaborations between AI researchers and application-domain experts, relate them to existing AI4SG projects and identify key opportunities for future AI applications targeted towards social good.

15.
Nature ; 572(7767): 116-119, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31367026

ABSTRACT

The early prediction of deterioration could have an important role in supporting healthcare professionals, as an estimated 11% of deaths in hospital follow a failure to promptly recognize and treat deteriorating patients [1]. To achieve this goal requires predictions of patient risk that are continuously updated and accurate, and delivered at an individual level with sufficient context and enough time to act. Here we develop a deep learning approach for the continuous risk prediction of future deterioration in patients, building on recent work that models adverse events from electronic health records [2-17] and using acute kidney injury (a common and potentially life-threatening condition [18]) as an exemplar. Our model was developed on a large, longitudinal dataset of electronic health records that cover diverse clinical environments, comprising 703,782 adult patients across 172 inpatient and 1,062 outpatient sites. Our model predicts 55.8% of all inpatient episodes of acute kidney injury, and 90.2% of all acute kidney injuries that required subsequent administration of dialysis, with a lead time of up to 48 h and a ratio of 2 false alerts for every true alert. In addition to predicting future acute kidney injury, our model provides confidence assessments and a list of the clinical features that are most salient to each prediction, alongside predicted future trajectories for clinically relevant blood tests [9]. Although the recognition and prompt treatment of acute kidney injury is known to be challenging, our approach may offer opportunities for identifying patients at risk within a time window that enables early treatment.


Subject(s)
Acute Kidney Injury/diagnosis , Clinical Laboratory Techniques/methods , Acute Kidney Injury/complications , Adolescent , Adult , Aged , Aged, 80 and over , Computer Simulation , Datasets as Topic , False Positive Reactions , Female , Humans , Male , Middle Aged , Pulmonary Disease, Chronic Obstructive/complications , ROC Curve , Risk Assessment , Uncertainty , Young Adult
16.
Nat Med ; 24(9): 1342-1350, 2018 Sep.
Article in English | MEDLINE | ID: mdl-30104768

ABSTRACT

The volume and complexity of diagnostic imaging is increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting.


Subject(s)
Deep Learning , Referral and Consultation , Retinal Diseases/diagnosis , Aged , Clinical Decision-Making , Female , Humans , Male , Middle Aged , Retina/diagnostic imaging , Retina/pathology , Retinal Diseases/diagnostic imaging , Tomography, Optical Coherence
17.
F1000Res ; 5: 1573, 2016.
Article in English | MEDLINE | ID: mdl-27830057

ABSTRACT

There are almost two million people in the United Kingdom living with sight loss, including around 360,000 people who are registered as blind or partially sighted. Sight-threatening diseases, such as diabetic retinopathy and age-related macular degeneration, have contributed to the 40% increase in outpatient attendances in the last decade but are amenable to early detection and monitoring. With early and appropriate intervention, blindness may be prevented in many cases. Ophthalmic imaging provides a way to diagnose and objectively assess the progression of a number of pathologies including neovascular ("wet") age-related macular degeneration (wet AMD) and diabetic retinopathy. Two methods of imaging are commonly used: digital photographs of the fundus (the 'back' of the eye) and Optical Coherence Tomography (OCT, a modality that uses light waves in a similar way to how ultrasound uses sound waves). Changes in population demographics and expectations and the changing pattern of chronic diseases create a rising demand for such imaging. Meanwhile, interrogation of such images is time-consuming, costly, and prone to human error. The application of novel analysis methods may provide a solution to these challenges. This research will focus on applying novel machine learning algorithms to automatic analysis of both digital fundus photographs and OCT in Moorfields Eye Hospital NHS Foundation Trust patients. Through analysis of the images used in ophthalmology, along with relevant clinical and demographic information, DeepMind Health will investigate the feasibility of automated grading of digital fundus photographs and OCT and provide novel quantitative measures for specific disease features and for monitoring the therapeutic success.
