Results 1 - 16 of 16
1.
Nat Commun ; 15(1): 7465, 2024 Aug 29.
Article in English | MEDLINE | ID: mdl-39198519

ABSTRACT

A core motivation for the use of artificial intelligence (AI) in medicine is to reduce existing healthcare disparities. Yet, recent studies have demonstrated two distinct findings: (1) AI models can show performance biases in underserved populations, and (2) these same models can be directly trained to recognize patient demographics, such as predicting self-reported race from medical images alone. Here, we investigate how these findings may be related, with an end goal of reducing a previously identified underdiagnosis bias. Using two popular chest x-ray datasets, we first demonstrate that technical parameters related to image acquisition and processing influence AI models trained to predict patient race, where these results partly reflect underlying biases in the original clinical datasets. We then find that mitigating the observed differences through a demographics-independent calibration strategy reduces the previously identified bias. While many factors likely contribute to AI bias and demographics prediction, these results highlight the importance of carefully considering data acquisition and processing parameters in AI development and healthcare equity more broadly.
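The role of image acquisition and processing parameters can be illustrated with a minimal sketch, assuming (hypothetically) that sites differ in the display window applied to raw pixel values; harmonizing the window/level mapping before model input is one example of a demographics-independent preprocessing step. Function and parameter names are illustrative, not the paper's actual calibration strategy.

```python
# Sketch: standardize display windowing before model input, so that
# site-to-site differences in processing parameters (here, window
# center/width) do not leak into downstream models.

def apply_window(pixels, center, width):
    """Map raw pixel values to [0, 1] with a linear window."""
    lo, hi = center - width / 2, center + width / 2
    out = []
    for p in pixels:
        if p <= lo:
            out.append(0.0)
        elif p >= hi:
            out.append(1.0)
        else:
            out.append((p - lo) / (hi - lo))
    return out

raw = [1000, 2000, 3000]                      # same tissue values at two sites
a = apply_window(raw, center=2000, width=4000)  # harmonized settings, site A
b = apply_window(raw, center=2000, width=4000)  # harmonized settings, site B
print(a, b)
```

In practice the relevant parameters would come from the image metadata (e.g., the DICOM window center/width attributes), with a single harmonized mapping fixed across sites.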


Subjects
Artificial Intelligence; Bias; Humans; Radiography, Thoracic; Male; Female; Racial Groups; Healthcare Disparities; Middle Aged; Adult
2.
J Med Screen ; : 9691413241262960, 2024 Aug 11.
Article in English | MEDLINE | ID: mdl-39129395

ABSTRACT

Artificial intelligence (AI) algorithms have been retrospectively evaluated as replacement for one radiologist in screening mammography double-reading; however, methods for resolving discordance between radiologists and AI in the absence of 'real-world' arbitration may underestimate cancer detection rate (CDR) and recall. In 108,970 consecutive screens from a population screening program (BreastScreen WA, Western Australia), 20,120 were radiologist/AI discordant without real-world arbitration. Recall probabilities were randomly assigned for these screens in 1000 simulations. Recall thresholds for screen-detected and interval cancers (sensitivity) and no cancer (false-positive proportion, FPP) were varied to calculate mean CDR and recall rate for the entire cohort. Assuming 100% sensitivity, the maximum CDR was 7.30 per 1000 screens. To achieve >95% probability that the mean CDR exceeded the screening program CDR (6.97 per 1000), interval cancer sensitivities ≥63% (at 100% screen-detected sensitivity) and ≥91% (at 80% screen-detected sensitivity) were required. Mean recall rate was relatively constant across sensitivity assumptions, but varied by FPP. FPP > 6.5% resulted in recall rates that exceeded the program estimate (3.38%). CDR improvements depend on a majority of interval cancers being detected in radiologist/AI discordant screens. Such improvements are likely to increase recall, requiring careful monitoring where AI is deployed for screen-reading.
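The simulation design above (randomly assigned recall for radiologist/AI discordant screens under assumed sensitivities and a false-positive proportion) can be sketched as follows. All counts and probabilities below are illustrative stand-ins, not the study's data.

```python
import random

random.seed(0)

# Hypothetical counts loosely patterned on the design: discordant screens
# contain screen-detected cancers, interval cancers, and non-cancers;
# recall among them is simulated under assumed sensitivities and FPP.
N_TOTAL = 108_970          # all screens in the cohort
CONCORDANT_CANCERS = 600   # cancers detected regardless of the simulation

def simulate(n_sd, n_ic, n_neg, sens_sd, sens_ic, fpp, n_sims=200):
    """Mean CDR (per 1000 screens) and mean discordant-screen recall
    (% of all screens) over the recall simulations."""
    det, rec = [], []
    for _ in range(n_sims):
        tp = sum(random.random() < sens_sd for _ in range(n_sd)) \
           + sum(random.random() < sens_ic for _ in range(n_ic))
        fp = sum(random.random() < fpp for _ in range(n_neg))
        det.append(CONCORDANT_CANCERS + tp)
        rec.append(tp + fp)
    mean_cdr = sum(det) / n_sims / N_TOTAL * 1000
    mean_recall = sum(rec) / n_sims / N_TOTAL * 100
    return mean_cdr, mean_recall

cdr, recall_pct = simulate(n_sd=150, n_ic=45, n_neg=19_925,
                           sens_sd=1.0, sens_ic=0.63, fpp=0.05)
print(round(cdr, 2), round(recall_pct, 2))
```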

3.
Lancet Oncol ; 25(8): 1053-1069, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39025103

ABSTRACT

BACKGROUND: Understanding co-occurrence patterns and prognostic implications of immune-related adverse events is crucial for immunotherapy management. However, previous studies have been limited by sample size and generalisability. In this study, we leveraged a multi-institutional cohort and a population-level database to investigate co-occurrence patterns of and survival outcomes after multi-organ immune-related adverse events among recipients of immune checkpoint inhibitors. METHODS: In this retrospective study, we identified individuals who received immune checkpoint inhibitors between May 31, 2015, and June 29, 2022, from the Massachusetts General Hospital, Brigham and Women's Hospital, and Dana-Farber Cancer Institute (Boston, MA, USA; MGBD cohort), and between April 30, 2010, and Oct 11, 2021, from the independent US population-based TriNetX network. We identified recipients from all datasets using medication codes and names of seven common immune checkpoint inhibitors, and patients were excluded from our analysis if they had incomplete information (eg, diagnosis and medication records) or if they initiated immune checkpoint inhibitor therapy after Oct 11, 2021. Eligible patients from the MGBD cohort were then propensity score matched with recipients of immune checkpoint inhibitors from the TriNetX database (1:2) based on demographic, cancer, and immune checkpoint inhibitor characteristics to facilitate cohort comparability. We applied immune-related adverse event identification rules to identify patients who did and did not have immune-related adverse events in the matched cohorts. To reduce the likelihood of false positives, patients diagnosed with suspected immune-related adverse events within 3 months after chemotherapy were excluded. We performed pairwise correlation analyses, non-negative matrix factorisation, and hierarchical clustering to identify co-occurrence patterns in the MGBD cohort. 
We conducted landmark overall survival analyses for patient clusters based on predominant immune-related adverse event factors and calculated accompanying hazard ratios (HRs) and 95% CIs, focusing on the 6-month landmark time for primary analyses. We validated our findings using the TriNetX cohort. FINDINGS: We identified 15 246 recipients of immune checkpoint inhibitors from MGBD and 50 503 from TriNetX, of whom 13 086 from MGBD and 26 172 from TriNetX were included in our propensity score-matched cohort. Median follow-up durations were 317 days (IQR 113-712) in patients from MGBD and 249 days (91-616) in patients from TriNetX. After applying immune-related adverse event identification rules, 8704 recipients of immune checkpoint inhibitors were retained from MGBD, of whom 3284 (37·7%) had and 5420 (62·3%) did not have immune-related adverse events, and 18 162 recipients were retained from TriNetX, of whom 5538 (30·5%) had and 12 624 (69·5%) did not have immune-related adverse events. In both cohorts, positive pairwise correlations of immune-related adverse events were commonly observed. Co-occurring immune-related adverse events were decomposed into seven factors across organs, revealing seven distinct patient clusters (endocrine, cutaneous, respiratory, gastrointestinal, hepatic, musculoskeletal, and neurological). In the MGBD cohort, the patient clusters that predominantly had endocrine (HR 0·53 [95% CI 0·40-0·70], p<0·0001) and cutaneous (0·61 [0·46-0·81], p=0·0007) immune-related adverse events had favourable overall survival outcomes at the 6-month landmark timepoint, while the other clusters either had unfavourable (respiratory: 1·60 [1·25-2·03], p=0·0001) or neutral survival outcomes (gastrointestinal: 0·86 [0·67-1·10], p=0·23; musculoskeletal: 0·97 [0·78-1·21], p=0·78; hepatic: 1·20 [0·91-1·59], p=0·19; and neurological: 1·30 [0·97-1·74], p=0·074). 
Similar results were found in the TriNetX cohort (endocrine: HR 0·75 [95% CI 0·60-0·93], p=0·0078; cutaneous: 0·62 [0·48-0·82], p=0·0007; respiratory: 1·21 [1·00-1·46], p=0·044), except for the neurological cluster having unfavourable (rather than neutral) survival outcomes (1·30 [1·06-1·59], p=0·013). INTERPRETATION: Reliably identifying the immune-related adverse event cluster to which a patient belongs can provide valuable clinical information for prognosticating outcomes of immunotherapy. These insights can be leveraged to counsel patients on the clinical impact of their individual constellation of immune-related adverse events and ultimately develop more personalised surveillance and mitigation strategies. FUNDING: US National Institutes of Health.
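A minimal sketch of the pairwise correlation step described above: the phi coefficient between binary indicators of two immune-related adverse events across patients. The data are toy vectors, not cohort records.

```python
import math

def phi(x, y):
    """Phi (Pearson) correlation between two binary 0/1 vectors."""
    n = len(x)
    n11 = sum(1 for a, b in zip(x, y) if a and b)
    n10 = sum(1 for a, b in zip(x, y) if a and not b)
    n01 = sum(1 for a, b in zip(x, y) if not a and b)
    n00 = n - n11 - n10 - n01
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

# 1 = patient experienced the event, 0 = did not (toy example)
thyroiditis = [1, 1, 0, 0, 1, 0, 0, 1]
dermatitis  = [1, 1, 0, 0, 1, 0, 1, 1]
print(round(phi(thyroiditis, dermatitis), 3))
```

In the study, matrices of such pairwise associations feed the non-negative matrix factorisation and hierarchical clustering that yield the seven organ-based factors.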


Subjects
Immune Checkpoint Inhibitors; Neoplasms; Humans; Immune Checkpoint Inhibitors/adverse effects; Retrospective Studies; Female; Male; Middle Aged; Aged; Neoplasms/drug therapy; Neoplasms/immunology
4.
JCO Clin Cancer Inform ; 8: e2300269, 2024 May.
Article in English | MEDLINE | ID: mdl-38810206

ABSTRACT

PURPOSE: Eastern Cooperative Oncology Group (ECOG) performance status (PS) is a key clinical variable for cancer treatment and research, but it is usually only recorded in unstructured form in the electronic health record. We investigated whether natural language processing (NLP) models can impute ECOG PS using unstructured note text. MATERIALS AND METHODS: Medical oncology notes were identified from all patients with cancer at our center from 1997 to 2023 and divided at the patient level into training (approximately 80%), tuning/validation (approximately 10%), and test (approximately 10%) sets. Regular expressions were used to extract explicitly documented PS. Extracted PS labels were used to train NLP models to impute ECOG PS (0-1 v 2-4) from the remainder of the notes (with regular expression-extracted PS documentation removed). We assessed associations between imputed PS and overall survival (OS). RESULTS: ECOG PS was extracted using regular expressions from 495,862 notes, corresponding to 79,698 patients. A Transformer-based Longformer model imputed PS with high discrimination (test set area under the receiver operating characteristic curve 0.95, area under the precision-recall curve 0.73). Imputed poor PS was associated with worse OS, including among notes with no explicit documentation of PS detected (OS hazard ratio, 11.9; 95% CI, 11.1 to 12.8). CONCLUSION: NLP models can be used to impute performance status from unstructured oncologist notes at scale. This may aid the annotation of oncology data sets for clinical outcomes research and cancer care delivery.
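The regular-expression extraction step can be sketched as below. The pattern is an illustrative assumption, not the study's actual expression, and the labels it produces would then train the NLP model on notes with this explicit documentation removed.

```python
import re

# Hypothetical pattern for explicitly documented ECOG PS values (0-4).
ECOG_RE = re.compile(
    r"\bECOG(?:\s+(?:PS|performance\s+status))?\s*(?:of|is|:|=)?\s*([0-4])\b",
    re.IGNORECASE)

def extract_ecog(note):
    """Return the first documented ECOG PS in a note, or None."""
    m = ECOG_RE.search(note)
    return int(m.group(1)) if m else None

notes = [
    "Patient doing well. ECOG PS 1. Plan: continue therapy.",
    "ECOG performance status: 3, largely bedbound.",
    "No performance status documented today.",
]
labels = [extract_ecog(n) for n in notes]
print(labels)  # binarized downstream as 0-1 vs 2-4
```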


Subjects
Electronic Health Records; Medical Oncology; Natural Language Processing; Neoplasms; Humans; Female; Male; Medical Oncology/methods; Middle Aged; Aged
5.
Cancer Discov ; 14(5): 711-726, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38597966

ABSTRACT

Artificial intelligence (AI) in oncology is advancing beyond algorithm development to integration into clinical practice. This review describes the current state of the field, with a specific focus on clinical integration. AI applications are structured according to cancer type and clinical domain, focusing on the four most common cancers and tasks of detection, diagnosis, and treatment. These applications encompass various data modalities, including imaging, genomics, and medical records. We conclude with a summary of existing challenges, evolving solutions, and potential future directions for the field. SIGNIFICANCE: AI is increasingly being applied to all aspects of oncology, where several applications are maturing beyond research and development to direct clinical integration. This review summarizes the current state of the field through the lens of clinical translation along the clinical care continuum. Emerging areas are also highlighted, along with common challenges, evolving solutions, and potential future directions for the field.


Subjects
Artificial Intelligence; Medical Oncology; Neoplasms; Humans; Medical Oncology/methods; Medical Oncology/trends; Neoplasms/genetics; Neoplasms/therapy; Neoplasms/diagnosis
6.
NPJ Digit Med ; 7(1): 80, 2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38531952

ABSTRACT

As applications of AI in medicine continue to expand, there is an increasing focus on integration into clinical practice. An underappreciated aspect of this clinical translation is where the AI fits into the clinical workflow, and in turn, the outputs generated by the AI to facilitate clinician interaction in this workflow. For instance, in the canonical use case of AI for medical image interpretation, the AI could prioritize cases before clinician review or even autonomously interpret the images without clinician review. A related aspect is explainability - does the AI generate outputs to help explain its predictions to clinicians? While many clinical AI workflows and explainability techniques have been proposed, a summative assessment of the current scope in clinical practice is lacking. Here, we evaluate the current state of FDA-cleared AI devices for medical image interpretation assistance in terms of intended clinical use, outputs generated, and types of explainability offered. We create a curated database focused on these aspects of the clinician-AI interface, where we find a high frequency of "triage" devices, notable variability in output characteristics across products, and often limited explainability of AI predictions. Altogether, we aim to increase transparency of the current landscape of the clinician-AI interface and highlight the need to rigorously assess which strategies ultimately lead to the best clinical outcomes.

7.
Radiol Artif Intell ; 6(2): e230137, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38323914

ABSTRACT

Purpose To evaluate performance improvements of general radiologists and breast imaging specialists when interpreting a set of diverse digital breast tomosynthesis (DBT) examinations with the aid of a custom-built categorical artificial intelligence (AI) system. Materials and Methods A fully balanced multireader, multicase reader study was conducted to compare the performance of 18 radiologists (nine general radiologists and nine breast imaging specialists) reading 240 retrospectively collected screening DBT mammograms (mean patient age, 59.8 years ± 11.3 [SD]; 100% women), acquired between August 2016 and March 2019, with and without the aid of a custom-built categorical AI system. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity across general radiologists and breast imaging specialists reading with versus without AI were assessed. Reader performance was also analyzed as a function of breast cancer characteristics and patient subgroups. Results Every radiologist demonstrated improved interpretation performance when reading with versus without AI, with an average AUC of 0.93 versus 0.87, demonstrating a difference in AUC of 0.06 (95% CI: 0.04, 0.08; P < .001). Improvement in AUC was observed for both general radiologists (difference of 0.08; P < .001) and breast imaging specialists (difference of 0.04; P < .001) and across all cancer characteristics (lesion type, lesion size, and pathology) and patient subgroups (race and ethnicity, age, and breast density) examined. Conclusion A categorical AI system helped improve overall radiologist interpretation performance of DBT screening mammograms for both general radiologists and breast imaging specialists and across various patient subgroups and breast cancer characteristics. 
Keywords: Computer-aided Diagnosis, Screening Mammography, Digital Breast Tomosynthesis, Breast Cancer, Screening, Convolutional Neural Network (CNN), Artificial Intelligence Supplemental material is available for this article. © RSNA, 2024.
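Reader-level AUC in a study like this can be computed from per-case suspicion scores via the Mann-Whitney statistic (the probability a cancer case is scored above a non-cancer case). A minimal sketch with toy scores; the study itself used a fully balanced multireader, multicase analysis.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the Mann-Whitney win rate."""
    wins = ties = 0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1
            elif p == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))

# Toy suspicion scores for one reader, with and without AI assistance
without_ai = auc([0.7, 0.6, 0.9], [0.2, 0.65, 0.4])
with_ai = auc([0.8, 0.75, 0.95], [0.2, 0.5, 0.4])
print(round(without_ai, 3), round(with_ai, 3))
```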


Subjects
Breast Neoplasms; Female; Humans; Middle Aged; Breast Neoplasms/diagnostic imaging; Mammography/methods; Retrospective Studies; Artificial Intelligence; Early Detection of Cancer/methods; Radiologists
9.
EBioMedicine ; 90: 104498, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36863255

ABSTRACT

BACKGROUND: Artificial intelligence (AI) has been proposed to reduce false-positive screens, increase cancer detection rates (CDRs), and address resourcing challenges faced by breast screening programs. We compared the accuracy of AI versus radiologists in real-world population breast cancer screening, and estimated potential impacts on CDR, recall and workload for simulated AI-radiologist reading. METHODS: External validation of a commercially-available AI algorithm in a retrospective cohort of 108,970 consecutive mammograms from a population-based screening program, with ascertained outcomes (including interval cancers by registry linkage). Area under the ROC curve (AUC), sensitivity and specificity for AI were compared with radiologists who interpreted the screens in practice. CDR and recall were estimated for simulated AI-radiologist reading (with arbitration) and compared with program metrics. FINDINGS: The AUC for AI was 0.83 compared with 0.93 for radiologists. At a prospective threshold, sensitivity for AI (0.67; 95% CI: 0.64-0.70) was comparable to radiologists (0.68; 95% CI: 0.66-0.71) with lower specificity (0.81 [95% CI: 0.81-0.81] versus 0.97 [95% CI: 0.97-0.97]). Recall rate for AI-radiologist reading (3.14%) was significantly lower than for the BSWA program (3.38%) (-0.25%; 95% CI: -0.31 to -0.18; P < 0.001). CDR was also lower (6.37 versus 6.97 per 1000) (-0.61; 95% CI: -0.77 to -0.44; P < 0.001); however, AI detected interval cancers that were not found by radiologists (0.72 per 1000; 95% CI: 0.57-0.90). AI-radiologist reading increased arbitration but decreased overall screen-reading volume by 41.4% (95% CI: 41.2-41.6). INTERPRETATION: Replacement of one radiologist by AI (with arbitration) resulted in lower recall and overall screen-reading volume. There was a small reduction in CDR for AI-radiologist reading. 
AI detected interval cases that were not identified by radiologists, suggesting potentially higher CDR if radiologists were unblinded to AI findings. These results indicate AI's potential role as a screen-reader of mammograms, but prospective trials are required to determine whether CDR could improve if AI detection was actioned in double-reading with arbitration. FUNDING: National Breast Cancer Foundation (NBCF), National Health and Medical Research Council (NHMRC).
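The screen-reading volume arithmetic can be sketched as follows: replacing one human reader with AI converts double-reading (two human reads per screen) into single-reading plus human arbitration of radiologist/AI discordance. The discordance rate below is an assumed figure chosen to reproduce the reported 41.4% reduction, not a number taken from the study.

```python
def human_read_volume(n_screens, discordance_rate):
    """Human reads under double-reading vs AI-assisted single-reading
    with arbitration of discordant screens."""
    double_reading = 2 * n_screens
    ai_assisted = n_screens + discordance_rate * n_screens
    reduction = 1 - ai_assisted / double_reading
    return double_reading, ai_assisted, reduction

dbl, ai, red = human_read_volume(108_970, discordance_rate=0.172)
print(dbl, int(ai), round(100 * red, 1))
```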


Subjects
Breast Neoplasms; Humans; Female; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/epidemiology; Artificial Intelligence; Retrospective Studies; Prospective Studies; Cohort Studies; Mass Screening/methods; Early Detection of Cancer/methods; Mammography/methods
10.
JAMA Netw Open ; 5(11): e2242343, 2022 11 01.
Article in English | MEDLINE | ID: mdl-36409497

ABSTRACT

Importance: With a shortfall in fellowship-trained breast radiologists, mammography screening programs are looking toward artificial intelligence (AI) to increase efficiency and diagnostic accuracy. External validation studies provide an initial assessment of how promising AI algorithms perform in different practice settings. Objective: To externally validate an ensemble deep-learning model using data from a high-volume, distributed screening program of an academic health system with a diverse patient population. Design, Setting, and Participants: In this diagnostic study, an ensemble learning method, which reweights outputs of the 11 highest-performing individual AI models from the Digital Mammography Dialogue on Reverse Engineering Assessment and Methods (DREAM) Mammography Challenge, was used to predict the cancer status of an individual using a standard set of screening mammography images. This study was conducted using retrospective patient data collected between 2010 and 2020 from women aged 40 years and older who underwent a routine breast screening examination and participated in the Athena Breast Health Network at the University of California, Los Angeles (UCLA). Main Outcomes and Measures: Performance of the challenge ensemble method (CEM) and the CEM combined with radiologist assessment (CEM+R) were compared with diagnosed ductal carcinoma in situ and invasive cancers within a year of the screening examination using performance metrics, such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). Results: Evaluated on 37 317 examinations from 26 817 women (mean [SD] age, 58.4 [11.5] years), individual model AUROC estimates ranged from 0.77 (95% CI, 0.75-0.79) to 0.83 (95% CI, 0.81-0.85). The CEM model achieved an AUROC of 0.85 (95% CI, 0.84-0.87) in the UCLA cohort, lower than the performance achieved in the Kaiser Permanente Washington (AUROC, 0.90) and Karolinska Institute (AUROC, 0.92) cohorts. 
The CEM+R model achieved a sensitivity (0.813 [95% CI, 0.781-0.843] vs 0.826 [95% CI, 0.795-0.856]; P = .20) and specificity (0.925 [95% CI, 0.916-0.934] vs 0.930 [95% CI, 0.929-0.932]; P = .18) similar to the radiologist performance. The CEM+R model had significantly lower sensitivity (0.596 [95% CI, 0.466-0.717] vs 0.850 [95% CI, 0.766-0.923]; P < .001) and specificity (0.803 [95% CI, 0.734-0.861] vs 0.945 [95% CI, 0.936-0.954]; P < .001) than the radiologist in women with a prior history of breast cancer and Hispanic women (0.894 [95% CI, 0.873-0.910] vs 0.926 [95% CI, 0.919-0.933]; P = .004). Conclusions and Relevance: This study found that the high performance of an ensemble deep-learning model for automated screening mammography interpretation did not generalize to a more diverse screening cohort, suggesting that the model experienced underspecification. This study suggests the need for model transparency and fine-tuning of AI models for specific target populations prior to their clinical adoption.
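The ensemble step described above (reweighting the outputs of the top individual models) can be sketched as a weighted average of per-model scores. The weights here are illustrative, not the published challenge-ensemble weights.

```python
def ensemble_score(model_scores, weights):
    """Weighted average of individual model cancer scores."""
    assert len(model_scores) == len(weights)
    total = sum(weights)
    return sum(s * w for s, w in zip(model_scores, weights)) / total

scores = [0.2, 0.8, 0.6]    # outputs of three of the individual models
weights = [1.0, 2.0, 1.0]   # larger weight for a stronger model
print(ensemble_score(scores, weights))
```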


Subjects
Breast Neoplasms; Mammography; Humans; Female; Adult; Middle Aged; Artificial Intelligence; Breast Neoplasms/diagnostic imaging; Retrospective Studies; Early Detection of Cancer
11.
J Am Coll Radiol ; 19(10): 1098-1110, 2022 10.
Article in English | MEDLINE | ID: mdl-35970474

ABSTRACT

BACKGROUND: Artificial intelligence (AI) may improve cancer detection and risk prediction during mammography screening, but radiologists' preferences regarding its characteristics and implementation are unknown. PURPOSE: To quantify how different attributes of AI-based cancer detection and risk prediction tools affect radiologists' intentions to use AI during screening mammography interpretation. MATERIALS AND METHODS: Through qualitative interviews with radiologists, we identified five primary attributes for AI-based breast cancer detection and four for breast cancer risk prediction. We developed a discrete choice experiment based on these attributes and invited 150 US-based radiologists to participate. Each respondent made eight choices for each tool between three alternatives: two hypothetical AI-based tools versus screening without AI. We analyzed samplewide preferences using random parameters logit models and identified subgroups with latent class models. RESULTS: Respondents (n = 66; 44% response rate) were from six diverse practice settings across eight states. Radiologists were more interested in AI for cancer detection when sensitivity and specificity were balanced (94% sensitivity with <25% of examinations marked) and AI markup appeared at the end of the hanging protocol after radiologists complete their independent review. For AI-based risk prediction, radiologists preferred AI models using both mammography images and clinical data. Overall, 46% to 60% intended to adopt any of the AI tools presented in the study; 26% to 33% approached AI enthusiastically but were deterred if the features did not align with their preferences. CONCLUSION: Although most radiologists want to use AI-based decision support, short-term uptake may be maximized by implementing tools that meet the preferences of dissuadable users.
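In a discrete choice experiment like this one, the standard conditional (random parameters) logit models choice probabilities as a softmax over alternative-specific utilities. A minimal sketch with made-up utilities for the three alternatives each respondent faced; this is the generic model form, not the study's fitted estimates.

```python
import math

def choice_probs(utilities):
    """Conditional-logit choice probabilities: softmax over utilities."""
    exps = [math.exp(u) for u in utilities]
    z = sum(exps)
    return [e / z for e in exps]

# alternatives: [AI tool A, AI tool B, screening without AI]
probs = choice_probs([1.2, 0.8, 0.5])
print([round(p, 3) for p in probs])
```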


Subjects
Breast Neoplasms; Mammography; Artificial Intelligence; Breast Neoplasms/diagnostic imaging; Early Detection of Cancer/methods; Female; Humans; Mammography/methods; Mass Screening; Radiologists
12.
BMJ Open ; 12(1): e054005, 2022 Jan 03.
Article in English | MEDLINE | ID: mdl-34980622

ABSTRACT

INTRODUCTION: Artificial intelligence (AI) algorithms for interpreting mammograms have the potential to improve the effectiveness of population breast cancer screening programmes if they can detect cancers, including interval cancers, without contributing substantially to overdiagnosis. Studies suggesting that AI has comparable or greater accuracy than radiologists commonly employ 'enriched' datasets in which cancer prevalence is higher than in population screening. Routine screening outcome metrics (cancer detection and recall rates) cannot be estimated from these datasets, and accuracy estimates may be subject to spectrum bias which limits generalisability to real-world screening. We aim to address these limitations by comparing the accuracy of AI and radiologists in a cohort of consecutive women attending a real-world population breast cancer screening programme. METHODS AND ANALYSIS: A retrospective, consecutive cohort of digital mammography screens from 109 000 distinct women was assembled from BreastScreen WA (BSWA), Western Australia's biennial population screening programme, from November 2016 to December 2017. The cohort includes 761 screen-detected and 235 interval cancers. Descriptive characteristics and results of radiologist double-reading will be extracted from BSWA outcomes data collection. Mammograms will be reinterpreted by a commercial AI algorithm (DeepHealth). AI accuracy will be compared with that of radiologist single-reading based on the difference in the area under the receiver operating characteristic curve. Cancer detection and recall rates for combined AI-radiologist reading will be estimated by pairing the first radiologist read per screen with the AI algorithm, and compared with estimates for radiologist double-reading. ETHICS AND DISSEMINATION: This study has ethical approval from the Women and Newborn Health Service Ethics Committee (EC00350) and the Curtin University Human Research Ethics Committee (HRE2020-0316).
Findings will be published in peer-reviewed journals and presented at national and international conferences. Results will also be disseminated to stakeholders in Australian breast cancer screening programmes and policy makers in population screening.


Subjects
Breast Neoplasms; Early Detection of Cancer; Artificial Intelligence; Australia; Breast Neoplasms/diagnostic imaging; Cohort Studies; Early Detection of Cancer/methods; Female; Humans; Infant, Newborn; Mammography/methods; Mass Screening; Retrospective Studies
13.
Nat Med ; 27(2): 244-249, 2021 02.
Article in English | MEDLINE | ID: mdl-33432172

ABSTRACT

Breast cancer remains a global challenge, causing over 600,000 deaths in 2018 (ref. 1). To achieve earlier cancer detection, health organizations worldwide recommend screening mammography, which is estimated to decrease breast cancer mortality by 20-40% (refs. 2,3). Despite the clear value of screening mammography, significant false positive and false negative rates along with non-uniformities in expert reader availability leave opportunities for improving quality and access4,5. To address these limitations, there has been much recent interest in applying deep learning to mammography6-18, and these efforts have highlighted two key difficulties: obtaining large amounts of annotated training data and ensuring generalization across populations, acquisition equipment and modalities. Here we present an annotation-efficient deep learning approach that (1) achieves state-of-the-art performance in mammogram classification, (2) successfully extends to digital breast tomosynthesis (DBT; '3D mammography'), (3) detects cancers in clinically negative prior mammograms of patients with cancer, (4) generalizes well to a population with low screening rates and (5) outperforms five out of five full-time breast-imaging specialists with an average increase in sensitivity of 14%. By creating new 'maximum suspicion projection' (MSP) images from DBT data, our progressively trained, multiple-instance learning approach effectively trains on DBT exams using only breast-level labels while maintaining localization-based interpretability. Altogether, our results demonstrate promise towards software that can improve the accuracy of and access to screening mammography worldwide.
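The 'maximum suspicion projection' idea above can be sketched as collapsing a DBT slice stack into one 2D image by keeping, per pixel, the most suspicious value across slices. Here a plain per-pixel maximum over toy suspicion maps stands in for the paper's model-guided projection.

```python
def msp(volume):
    """Per-pixel maximum over a stack of 2D slices (lists of rows)."""
    n_rows, n_cols = len(volume[0]), len(volume[0][0])
    return [[max(sl[r][c] for sl in volume) for c in range(n_cols)]
            for r in range(n_rows)]

stack = [
    [[0.1, 0.2], [0.0, 0.3]],   # slice 1 suspicion map (toy values)
    [[0.4, 0.1], [0.2, 0.1]],   # slice 2
    [[0.0, 0.5], [0.1, 0.9]],   # slice 3
]
print(msp(stack))
```

The projected image can then be handled by a 2D mammography classifier, which is what lets the approach train on DBT exams with only breast-level labels.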


Subjects
Breast Neoplasms/diagnosis; Breast/diagnostic imaging; Deep Learning; Early Detection of Cancer; Adult; Breast/pathology; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/epidemiology; Breast Neoplasms/pathology; Female; Humans; Mammography/trends; Middle Aged
14.
JAMA Netw Open ; 3(3): e200265, 2020 03 02.
Article in English | MEDLINE | ID: mdl-32119094

ABSTRACT

Importance: Mammography screening currently relies on subjective human interpretation. Artificial intelligence (AI) advances could be used to increase mammography screening accuracy by reducing missed cancers and false positives. Objective: To evaluate whether AI can overcome human mammography interpretation limitations with a rigorous, unbiased evaluation of machine learning algorithms. Design, Setting, and Participants: In this diagnostic accuracy study conducted between September 2016 and November 2017, an international, crowdsourced challenge was hosted to foster AI algorithm development focused on interpreting screening mammography. More than 1100 participants comprising 126 teams from 44 countries participated. Analysis began November 18, 2016. Main Outcomes and Measurements: Algorithms used images alone (challenge 1) or combined images, previous examinations (if available), and clinical and demographic risk factor data (challenge 2) and output a score that translated to cancer yes/no within 12 months. Algorithm accuracy for breast cancer detection was evaluated using area under the curve and algorithm specificity compared with radiologists' specificity with radiologists' sensitivity set at 85.9% (United States) and 83.9% (Sweden). An ensemble method aggregating top-performing AI algorithms and radiologists' recall assessment was developed and evaluated. Results: Overall, 144 231 screening mammograms from 85 580 US women (952 cancer positive ≤12 months from screening) were used for algorithm training and validation. A second independent validation cohort included 166 578 examinations from 68 008 Swedish women (780 cancer positive). The top-performing algorithm achieved an area under the curve of 0.858 (United States) and 0.903 (Sweden) and 66.2% (United States) and 81.2% (Sweden) specificity at the radiologists' sensitivity, lower than community-practice radiologists' specificity of 90.5% (United States) and 98.5% (Sweden). 
Combining top-performing algorithms and US radiologist assessments resulted in a higher area under the curve of 0.942 and achieved a significantly improved specificity (92.0%) at the same sensitivity. Conclusions and Relevance: While no single AI algorithm outperformed radiologists, an ensemble of AI algorithms combined with radiologist assessment in a single-reader screening environment improved overall accuracy. This study underscores the potential of using machine learning methods for enhancing mammography screening interpretation.
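The evaluation setup above (algorithm specificity compared at the radiologists' sensitivity) amounts to choosing the score threshold that attains a target sensitivity on cancer cases, then measuring specificity on non-cancer cases at that threshold. A minimal sketch with toy scores, not challenge data.

```python
def threshold_at_sensitivity(pos_scores, target_sens):
    """Score threshold capturing the target fraction of cancer cases."""
    s = sorted(pos_scores, reverse=True)
    k = max(1, int(round(target_sens * len(s))))  # cases to capture
    return s[k - 1]  # recall everything scoring >= this threshold

def specificity(neg_scores, thr):
    return sum(1 for x in neg_scores if x < thr) / len(neg_scores)

pos = [0.9, 0.8, 0.7, 0.6, 0.2]    # cancer-case scores
neg = [0.1, 0.3, 0.5, 0.65, 0.05]  # non-cancer scores
thr = threshold_at_sensitivity(pos, 0.8)  # capture 4 of 5 cancers
print(thr, specificity(neg, thr))
```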


Subjects
Breast Neoplasms/diagnostic imaging; Deep Learning; Image Interpretation, Computer-Assisted/methods; Mammography/methods; Radiologists; Adult; Aged; Algorithms; Artificial Intelligence; Early Detection of Cancer; Female; Humans; Middle Aged; Radiology; Sensitivity and Specificity; Sweden; United States
15.
Nat Mach Intell ; 2(4): 210-219, 2020 Apr.
Article in English | MEDLINE | ID: mdl-34291193

ABSTRACT

Recent work has shown that convolutional neural networks (CNNs) trained on image recognition tasks can serve as valuable models for predicting neural responses in primate visual cortex. However, these models typically require biologically-infeasible levels of labeled training data, so this similarity must at least arise via different paths. In addition, most popular CNNs are solely feedforward, lacking a notion of time and recurrence, whereas neurons in visual cortex produce complex time-varying responses, even to static inputs. Towards addressing these inconsistencies with biology, here we study the emergent properties of a recurrent generative network that is trained to predict future video frames in a self-supervised manner. Remarkably, the resulting model is able to capture a wide variety of seemingly disparate phenomena observed in visual cortex, ranging from single-unit response dynamics to complex perceptual motion illusions, even when subjected to highly impoverished stimuli. These results suggest potentially deep connections between recurrent predictive neural network models and computations in the brain, providing new leads that can enrich both fields.

16.
Proc Natl Acad Sci U S A ; 115(35): 8835-8840, 2018 08 28.
Article in English | MEDLINE | ID: mdl-30104363

ABSTRACT

Making inferences from partial information constitutes a critical aspect of cognition. During visual perception, pattern completion enables recognition of poorly visible or occluded objects. We combined psychophysics, physiology, and computational models to test the hypothesis that pattern completion is implemented by recurrent computations and present three pieces of evidence that are consistent with this hypothesis. First, subjects robustly recognized objects even when they were rendered <15% visible, but recognition was largely impaired when processing was interrupted by backward masking. Second, invasive physiological responses along the human ventral cortex exhibited visually selective responses to partially visible objects that were delayed compared with whole objects, suggesting the need for additional computations. These physiological delays were correlated with the effects of backward masking. Third, state-of-the-art feed-forward computational architectures were not robust to partial visibility. However, recognition performance was recovered when the model was augmented with attractor-based recurrent connectivity. The recurrent model was able to predict which images of heavily occluded objects were easier or harder for humans to recognize, could capture the effect of introducing a backward mask on recognition behavior, and was consistent with the physiological delays along the human ventral visual stream. These results provide a strong argument of plausibility for the role of recurrent computations in making visual inferences from partial information.
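The attractor-based recurrent connectivity described above can be illustrated with a tiny Hopfield-style network: it stores a pattern in its recurrent weights and recovers the whole pattern from a partially visible (masked) input by iterating the recurrent update. This is an illustrative stand-in, not the paper's augmented model.

```python
def train_hopfield(patterns, n):
    """Hebbian weights storing +/-1 patterns; no self-connections."""
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def complete(w, state, steps=5):
    """Iterate the synchronous recurrent update from a partial state."""
    n = len(state)
    s = list(state)
    for _ in range(n and steps):
        s = [1 if sum(w[i][j] * s[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return s

stored = [1, 1, -1, -1, 1, -1]
w = train_hopfield([stored], 6)
occluded = [1, 1, 0, 0, 0, 0]   # 0 = unobserved units ("occluded")
print(complete(w, occluded))    # recurrent dynamics fill in the rest
```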


Subjects
Computer Simulation; Models, Neurological; Pattern Recognition, Visual/physiology; Adolescent; Adult; Female; Humans; Male