ABSTRACT
Screening mammography aims to identify breast cancer at earlier stages of the disease, when treatment can be more successful [1]. Despite the existence of screening programmes worldwide, the interpretation of mammograms is affected by high rates of false positives and false negatives [2]. Here we present an artificial intelligence (AI) system that is capable of surpassing human experts in breast cancer prediction. To assess its performance in the clinical setting, we curated a large representative dataset from the UK and a large enriched dataset from the USA. We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers: the area under the receiver operating characteristic curve (AUC-ROC) for the AI system was greater than the AUC-ROC for the average radiologist by an absolute margin of 11.5%. We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%. This robust assessment of the AI system paves the way for clinical trials to improve the accuracy and efficiency of breast cancer screening.
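The workload-reduction logic of such a double-reading simulation can be sketched in a few lines: the AI acts as the second reader, and a human second read is requested only when the AI and the first reader disagree. This is a minimal illustrative sketch, not the study's protocol; the synthetic reads, the 12% disagreement rate, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cases = 1000

# Synthetic reads: 1 = recall, 0 = no recall (purely illustrative data).
first_reader = rng.integers(0, 2, n_cases)
ai_second_reader = first_reader.copy()
flip = rng.random(n_cases) < 0.12      # assume ~12% AI/first-reader disagreement
ai_second_reader[flip] ^= 1            # flip the AI's opinion on those cases

# In the simulated workflow, the human second reader is consulted only
# when the first reader and the AI disagree.
disagreement = first_reader != ai_second_reader
second_read_fraction = disagreement.mean()   # fraction still needing a human second read
workload_reduction = 1.0 - second_read_fraction
```

With a low disagreement rate, the human second reader is needed for only a small fraction of cases, which is how a large workload reduction arises in this kind of simulation.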
Subject(s)
Artificial Intelligence/standards , Breast Neoplasms/diagnostic imaging , Early Detection of Cancer/methods , Early Detection of Cancer/standards , Female , Humans , Mammography/standards , Reproducibility of Results , United Kingdom , United States
ABSTRACT
Background Developing deep learning models for radiology requires large data sets and substantial computational resources. Data set size limitations can be further exacerbated by distribution shifts, such as rapid changes in patient populations and standard of care during the COVID-19 pandemic. A common partial mitigation is transfer learning by pretraining a "generic network" on a large nonmedical data set and then fine-tuning on a task-specific radiology data set. Purpose To reduce data set size requirements for chest radiography deep learning models by using an advanced machine learning approach (supervised contrastive [SupCon] learning) to generate chest radiography networks. Materials and Methods SupCon helped generate chest radiography networks from 821 544 chest radiographs from India and the United States. The chest radiography networks were used as a starting point for further machine learning model development for 10 prediction tasks (eg, airspace opacity, fracture, tuberculosis, and COVID-19 outcomes) by using five data sets comprising 684 955 chest radiographs from India, the United States, and China. Three model development setups were tested (linear classifier, nonlinear classifier, and fine-tuning the full network) with data set sizes ranging from 8 to 85. Results Across a majority of tasks, compared with transfer learning from a nonmedical data set, SupCon reduced label requirements up to 688-fold and improved the area under the receiver operating characteristic curve (AUC) at matching data set sizes. At the extreme low-data regimen, training small nonlinear models by using only 45 chest radiographs yielded an AUC of 0.95 (noninferior to radiologist performance) in classifying microbiology-confirmed tuberculosis in external validation. At a more moderate data regimen, training small nonlinear models by using only 528 chest radiographs yielded an AUC of 0.75 in predicting severe COVID-19 outcomes.
Conclusion Supervised contrastive learning enabled performance comparable to state-of-the-art deep learning models in multiple clinical tasks by using as few as 45 images and is a promising method for predictive modeling with use of small data sets and for predicting outcomes in shifting patient populations. © RSNA, 2022. Online supplemental material is available for this article.
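The supervised contrastive (SupCon) objective behind this approach pulls embeddings with the same class label together and pushes others apart. The following is a minimal NumPy sketch of the standard SupCon loss, not the paper's training code; the temperature value and the toy embeddings are illustrative assumptions.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of embeddings (NumPy sketch)."""
    # L2-normalize, then compute temperature-scaled cosine similarities.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    logits = (z @ z.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp_logits = np.exp(logits)
    np.fill_diagonal(exp_logits, 0.0)                    # exclude self from denominator
    log_prob = logits - np.log(exp_logits.sum(axis=1, keepdims=True))
    # Positives share the anchor's label, excluding the anchor itself.
    positives = (labels[:, None] == labels[None, :]).astype(float)
    np.fill_diagonal(positives, 0.0)
    per_anchor = -(positives * log_prob).sum(axis=1) / np.maximum(positives.sum(axis=1), 1.0)
    return float(per_anchor.mean())

labels = np.array([0, 0, 1, 1])
clustered = np.array([[1., 0.], [1., 0.], [0., 1.], [0., 1.]])  # same-class points together
mixed = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])      # same-class points apart
loss_clustered = supcon_loss(clustered, labels)
loss_mixed = supcon_loss(mixed, labels)
```

A representation that groups same-class points yields a lower loss than one that scatters them, which is the signal that lets a small downstream classifier work with very few labeled examples.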
Subject(s)
COVID-19 , Deep Learning , Humans , Radiography, Thoracic/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Pandemics , COVID-19/diagnostic imaging , Retrospective Studies , Radiography , Machine Learning
ABSTRACT
Chest radiography (CXR) is the most widely used thoracic clinical imaging modality and is crucial for guiding the management of cardiothoracic conditions. The detection of specific CXR findings has been the main focus of several artificial intelligence (AI) systems. However, the wide range of possible CXR abnormalities makes it impractical to detect every possible condition by building multiple separate systems, each of which detects one or more pre-specified conditions. In this work, we developed and evaluated an AI system to classify CXRs as normal or abnormal. For training and tuning the system, we used a de-identified dataset of 248,445 patients from a multi-city hospital network in India. To assess generalizability, we evaluated our system using 6 international datasets from India, China, and the United States. Of these datasets, 4 focused on diseases that the AI was not trained to detect: 2 datasets with tuberculosis and 2 datasets with coronavirus disease 2019. Our results suggest that the AI system trained using a large dataset containing a diverse array of CXR abnormalities generalizes to new patient populations and unseen diseases. In a simulated workflow where the AI system prioritized abnormal cases, the turnaround time for abnormal cases was reduced by 7-28%. These results represent an important step towards evaluating whether AI can be safely used to flag cases in a general setting where previously unseen abnormalities exist. Lastly, to facilitate the continued development of AI models for CXR, we release our collected labels for the publicly available dataset.
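The turnaround-time gain from AI prioritization can be illustrated with a toy reading queue: abnormal (AI-flagged) cases are moved to the front of the worklist and cases are read sequentially. Everything here, including the case labels, the fixed read time, and the tiny worklist, is an illustrative assumption, not the study's simulation.

```python
def turnaround_times(reading_order, read_minutes=5):
    """Turnaround time for each case when read sequentially in the given order."""
    clock, tat = 0, {}
    for case in reading_order:
        clock += read_minutes
        tat[case] = clock
    return tat

# 'a*' cases are AI-flagged abnormal, 'n*' are normal (labels are illustrative).
worklist = ["n1", "a1", "n2", "n3", "a2"]
prioritized = sorted(worklist, key=lambda c: not c.startswith("a"))  # abnormal first

fifo = turnaround_times(worklist)
prio = turnaround_times(prioritized)
mean_abnormal_fifo = (fifo["a1"] + fifo["a2"]) / 2
mean_abnormal_prio = (prio["a1"] + prio["a2"]) / 2
```

Because the sort is stable, flagged cases keep their relative arrival order but jump ahead of unflagged ones, so the mean turnaround for abnormal cases drops while total reading time is unchanged.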
Subject(s)
COVID-19/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Tuberculosis/diagnostic imaging , Adult , Aged , Algorithms , Case-Control Studies , China , Deep Learning , Female , Humans , India , Male , Middle Aged , Radiography, Thoracic , United States
ABSTRACT
Background and study aims Storage of full-length endoscopic procedures is becoming increasingly popular. To facilitate large-scale machine learning (ML) focused on clinical outcomes, these videos must be merged with the patient-level data in the electronic health record (EHR). Our aim was to present a method of accurately linking patient-level EHR data with cloud stored colonoscopy videos. Methods This study was conducted at a single academic medical center. Most procedure videos are automatically uploaded to the cloud server but are identified only by procedure time and procedure room. We developed and then tested an algorithm to match recorded videos with corresponding exams in the EHR based upon procedure time and room and subsequently extract frames of interest. Results Among 28,611 total colonoscopies performed over the study period, 21,170 colonoscopy videos in 20,420 unique patients (54.2 % male, median age 58) were matched to EHR data. Of 100 randomly sampled videos, appropriate matching was manually confirmed in all. In total, these videos represented 489,721 minutes of colonoscopy performed by 50 endoscopists (median 214 colonoscopies per endoscopist). The most common procedure indications were polyp screening (47.3 %), surveillance (28.9 %) and inflammatory bowel disease (9.4 %). From these videos, we extracted procedure highlights (identified by image capture; mean 8.5 per colonoscopy) and surrounding frames. Conclusions We report the successful merging of a large database of endoscopy videos stored with limited identifiers to rich patient-level data in a highly accurate manner. This technique facilitates the development of ML algorithms based upon relevant patient outcomes.
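The core of the matching algorithm, linking videos identified only by procedure room and start time to EHR exams, can be sketched as a room-plus-time join that accepts only unambiguous matches. The field names, the 15-minute tolerance, and the toy records below are illustrative assumptions, not the study's actual implementation.

```python
from datetime import datetime, timedelta

def match_videos_to_exams(videos, exams, tolerance_minutes=15):
    """Link cloud-stored videos (room + start time only) to EHR exam records."""
    matches = {}
    for vid in videos:
        candidates = [
            e for e in exams
            if e["room"] == vid["room"]
            and abs(e["start"] - vid["start"]) <= timedelta(minutes=tolerance_minutes)
        ]
        if len(candidates) == 1:  # accept only unambiguous room+time matches
            matches[vid["video_id"]] = candidates[0]["exam_id"]
    return matches

videos = [{"video_id": "v1", "room": "EU-3", "start": datetime(2021, 5, 4, 9, 2)}]
exams = [
    {"exam_id": "e1", "room": "EU-3", "start": datetime(2021, 5, 4, 9, 0)},
    {"exam_id": "e2", "room": "EU-1", "start": datetime(2021, 5, 4, 9, 0)},
]
linked = match_videos_to_exams(videos, exams)
```

Discarding ambiguous matches (two exams in the same room within the tolerance window) trades a small loss of coverage for the high matching accuracy the study reports.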
ABSTRACT
Cardiac MRI (CMR) techniques offer non-invasive visualizations of cardiac morphology and function. However, imaging can be time-consuming and complex. Seismocardiography (SCG) measures physical vibrations transmitted through the chest from the beating heart and pulsatile blood flow. SCG signals can be acquired quickly and easily, with inexpensive electronics. This study investigates relationships between CMR metrics of function and SCG signal features. Same-day CMR and SCG data were collected from 28 healthy adults and 6 subjects with aortic valve disease history. Correlation testing and statistical median/decile calculations were performed with data from the healthy cohort. MR-quantified flow and function parameters in the healthy cohort correlated with particular SCG energy levels, such as peak aortic velocity with low-frequency SCG (coefficient 0.43, significance 0.02) and peak flow with high-frequency SCG (coefficient 0.40, significance 0.03). Valve disease-induced flow abnormalities in patients were visualized with MRI, and corresponding abnormalities in SCG signals were identified. This investigation found significant cross-modality correlations between cardiac function metrics and SCG signal features in healthy subjects. Additionally, through comparison to normative ranges derived from healthy subjects, it observed correspondences between pathological flow and abnormal SCG signals. This may support development of an easy clinical test used to identify potential aortic flow abnormalities.
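The two statistical steps described above, correlation testing between CMR metrics and SCG features, and median/decile normative ranges from the healthy cohort, can be sketched as follows. All numbers, the decile choice, and the function names are illustrative assumptions, not the study's data or code.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def normative_band(healthy_values, lo_pct=10, hi_pct=90):
    """Lower decile, median, upper decile from the healthy cohort."""
    v = np.asarray(healthy_values, float)
    return (float(np.percentile(v, lo_pct)),
            float(np.median(v)),
            float(np.percentile(v, hi_pct)))

def flags_abnormal(value, band):
    """Flag a patient's SCG feature that falls outside the healthy decile band."""
    lo, _, hi = band
    return value < lo or value > hi

# Illustrative healthy-cohort SCG energy values (arbitrary units).
healthy_scg_energy = [1.0, 1.1, 0.9, 1.2, 1.05, 0.95, 1.15, 1.0]
band = normative_band(healthy_scg_energy)
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])  # perfectly linear toy example
```

Comparing a patient's SCG feature against the healthy decile band is the mechanism by which pathological flow could be flagged in the simple clinical test the abstract envisions.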