ABSTRACT
BACKGROUND: Few studies have investigated the collaborative potential between artificial intelligence (AI) and pulmonologists for diagnosing pulmonary disease. We hypothesised that collaboration between a pulmonologist and AI with explanations (explainable AI (XAI)) is superior to the pulmonologist without support in the diagnostic interpretation of pulmonary function tests (PFTs). METHODS: The study was conducted in two phases, a monocentre study (phase 1) and a multicentre intervention study (phase 2). Each phase utilised two different sets of 24 PFT reports of patients with a clinically validated gold standard diagnosis. Each PFT was interpreted without (control) and with XAI's suggestions (intervention). Pulmonologists provided a differential diagnosis consisting of a preferential diagnosis and optionally up to three additional diagnoses. The primary end-point compared accuracy of preferential and additional diagnoses between control and intervention. Secondary end-points were the number of diagnoses in the differential diagnosis, diagnostic confidence and inter-rater agreement. We also analysed how XAI influenced pulmonologists' decisions. RESULTS: In phase 1 (n=16 pulmonologists), mean preferential and differential diagnostic accuracy significantly increased by 10.4% and 9.4%, respectively, between control and intervention (p<0.001). Improvements were somewhat lower but highly significant (p<0.0001) in phase 2 (5.4% and 8.7%, respectively; n=62 pulmonologists). In both phases, the number of diagnoses in the differential diagnosis did not decrease, but diagnostic confidence and inter-rater agreement significantly increased during intervention. Pulmonologists updated their decisions with XAI's feedback and consistently improved their baseline performance if AI provided correct predictions. CONCLUSION: A collaboration between a pulmonologist and XAI is better at interpreting PFTs than individual pulmonologists reading without XAI support or XAI alone.
Subject(s)
Artificial Intelligence , Lung Diseases , Humans , Pulmonologists , Respiratory Function Tests , Lung Diseases/diagnosis
ABSTRACT
RATIONALE: Estimating the causal effect of an intervention at the individual level, also called the individual treatment effect (ITE), may help in identifying response prior to the intervention. OBJECTIVES: We aimed to develop machine learning (ML) models which estimate the ITE of an intervention using data from randomised controlled trials and illustrate this approach with prediction of ITE on annual chronic obstructive pulmonary disease (COPD) exacerbation rates. METHODS: We used data from 8151 patients with COPD of the Study to Understand Mortality and MorbidITy in COPD (SUMMIT) trial (NCT01313676) to address the ITE of fluticasone furoate/vilanterol (FF/VI) versus control (placebo) on exacerbation rate and developed a novel metric, Q-score, for assessing the power of causal inference models. We then validated the methodology on 5990 subjects from the InforMing the PAthway of COPD Treatment (IMPACT) trial (NCT02164513) to estimate the ITE of FF/umeclidinium/VI (FF/UMEC/VI) versus UMEC/VI on exacerbation rate. We used Causal Forest as the causal inference model. RESULTS: In SUMMIT, Causal Forest was optimised on the training set (n=5705) and tested on 2446 subjects (Q-score 0.61). In IMPACT, Causal Forest was optimised on 4193 subjects in the training set and tested on 1797 individuals (Q-score 0.21). In both trials, the quantiles of patients with the strongest ITE consistently demonstrated the largest reductions in observed exacerbation rates (0.54 and 0.53, p<0.001). Poor lung function and blood eosinophils, respectively, were the strongest predictors of ITE. CONCLUSIONS: This study shows that ML models for causal inference can be used to identify individual response to different COPD treatments and highlight treatment traits. Such models could become clinically useful tools for individual treatment decisions in COPD.
Subject(s)
Lung , Chronic Obstructive Pulmonary Disease , Humans , Administration by Inhalation , Chronic Obstructive Pulmonary Disease/drug therapy , Androstadienes/therapeutic use , Androstadienes/pharmacology , Benzyl Alcohols/therapeutic use , Benzyl Alcohols/pharmacology , Chlorobenzenes/therapeutic use , Chlorobenzenes/pharmacology , Bronchodilator Agents/therapeutic use , Drug Combinations , Double-Blind Method , Treatment Outcome , Randomized Controlled Trials as Topic
ABSTRACT
Rationale: Acquiring high-quality spirometry data in clinical trials is important, particularly when using forced expiratory volume in 1 s or forced vital capacity as primary end-points. In addition to quantitative criteria, the American Thoracic Society (ATS)/European Respiratory Society (ERS) standards include subjective evaluation, which introduces inter-rater variability and potential mistakes. We explored the value of artificial intelligence (AI)-based software (ArtiQ.QC) to assess spirometry quality and compared it to traditional over-reading control. Methods: A random sample of 2000 sessions (8258 curves) was selected from Chiesi COPD and asthma trials (n=1000 per disease). Acceptability using the 2005 ATS/ERS standards was determined by over-reader review and by ArtiQ.QC. Additionally, three respiratory physicians jointly reviewed a subset of curves (n=150). Results: The majority of curves (n=7267, 88%) were of good quality. The AI agreed with over-readers in 91% of cases, with 97% sensitivity and 93% positive predictive value. Performance was significantly better in the asthma group. In the revised subset, n=50 curves were repeated to assess intra-rater reliability (κ=0.83, 0.86 and 0.80 for each of the three reviewers). All reviewers agreed on 63% of 100 unique tests (κ=0.5). When reviewers set the consensus (gold standard), individual agreement with it was 88%, 94% and 70%. The agreement between AI and "gold-standard" was 73%; over-reader agreement was 46%. Conclusion: AI-based software can be used to measure spirometry data quality with accuracy comparable to that of experts. The assessment is a subjective exercise, with intra- and inter-rater variability even when the criteria are defined very precisely and objectively. By providing consistent results and immediate feedback to the sites, AI may benefit clinical trial conduct and variability reduction.
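The intra-rater reliability figures above are κ statistics. As a rough illustration of how such a value is obtained for one reviewer grading the same curves twice, the sketch below computes Cohen's kappa from two lists of labels. The abstract does not state which kappa variant or software was used, so the function name `cohen_kappa` and the toy labels are assumptions for illustration only, not the study's data or method.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label twice.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal label frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readings use one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: one reviewer grades ten curves on two occasions.
first_read = ["ok", "ok", "ok", "bad", "bad", "ok", "bad", "ok", "ok", "bad"]
second_read = ["ok", "ok", "bad", "bad", "bad", "ok", "ok", "ok", "ok", "bad"]
kappa = cohen_kappa(first_read, second_read)
print(round(kappa, 2))
```

In this toy case the two readings agree on 8 of 10 curves, but part of that agreement is expected by chance, which is why kappa is lower than the raw agreement.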
ABSTRACT
BACKGROUND: Parameters from maximal expiratory flow-volume curves (MEFVC) have been linked to CT-based parameters of COPD. However, the association between MEFVC shape and phenotypes like emphysema, small airways disease (SAD) and bronchial wall thickening (BWT) has not been investigated. RESEARCH QUESTION: We analyzed if the shape of MEFVC can be linked to CT-determined emphysema, SAD and BWT in a large cohort of COPDGene participants. STUDY DESIGN AND METHODS: In the COPDGene cohort, we used principal component analysis (PCA) to extract patterns from MEFVC shape and performed multiple linear regression to assess the association of these patterns with CT parameters over the COPD spectrum, in mild and moderate-severe COPD. RESULTS: Over the entire spectrum, in mild and moderate-severe COPD, principal components of MEFVC were important predictors for the continuous CT parameters. Their contribution to the prediction of emphysema diminished when classical pulmonary function test parameters were added. For SAD, the components remained very strong predictors. The adjusted R2 was higher in moderate-severe COPD, while in mild COPD, the adjusted R2 for all CT outcomes was low; 0.28 for emphysema, 0.21 for SAD and 0.19 for BWT. INTERPRETATION: The shape of the maximal expiratory flow-volume curve as analyzed with PCA is not an appropriate screening tool for early disease phenotypes identified by CT scan. However, it contributes to assessing emphysema and SAD in moderate-severe COPD.
Subject(s)
Emphysema , Chronic Obstructive Pulmonary Disease , Pulmonary Emphysema , Humans , Principal Component Analysis , Smoking , Chronic Obstructive Pulmonary Disease/diagnosis , Chronic Obstructive Pulmonary Disease/genetics , Spirometry , Phenotype , Forced Expiratory Volume
ABSTRACT
It is a challenge to keep abreast of all the clinical and scientific advances in the field of respiratory medicine. This article contains an overview of laboratory-based science, randomised controlled trials and qualitative research that were presented during the 2021 European Respiratory Society International Congress within the sessions from the five groups of the Assembly 1 - Respiratory clinical care and physiology. Selected presentations are summarised from a wide range of topics: clinical problems, rehabilitation and chronic care, general practice and primary care, electronic/mobile health (e-health/m-health), clinical respiratory physiology, exercise and functional imaging.
ABSTRACT
RATIONALE: While American Thoracic Society (ATS)/European Respiratory Society (ERS) quality control criteria for spirometry include several quantitative limits, they also require manual visual inspection. The current approach is time consuming and leads to high intertechnician variability. We propose a deep-learning approach, the convolutional neural network (CNN), to standardise spirometric manoeuvre acceptability and usability. METHODS: In 36 873 curves from the National Health and Nutritional Examination Survey USA 2011-2012, technicians labelled 54% of curves as meeting ATS/ERS 2005 acceptability criteria with satisfactory start and end of test, but identified 93% of curves with a usable forced expiratory volume in 1 s. We processed raw data into images of maximal expiratory flow-volume curve (MEFVC), calculated ATS/ERS quantifiable criteria and developed CNNs to determine manoeuvre acceptability and usability on 90% of the curves. The models were tested on the remaining 10% of curves. We calculated Shapley values to interpret the models. RESULTS: In the test set (n=3738), CNN showed an accuracy of 87% for acceptability and 92% for usability, with the latter demonstrating a high sensitivity (92%) and specificity (96%). They were significantly superior (p<0.0001) to ATS/ERS quantifiable rule-based models. Shapley interpretation revealed MEFVC<1 s (MEFVC pattern within first second of exhalation) and plateau in volume-time were most important in determining acceptability, while MEFVC<1 s entirely determined usability. CONCLUSION: The CNNs identified relevant attributes in spirometric curves to standardise ATS/ERS manoeuvre acceptability and usability recommendations, and further provide individual manoeuvre feedback. Our algorithm combines the visual experience of skilled technicians and ATS/ERS quantitative rules in automating the critical phase of spirometry quality control.
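The accuracy, sensitivity and specificity quoted above are standard confusion-matrix summaries of a binary classifier. The sketch below shows how they are derived from technician labels and model predictions. The helper name `binary_metrics` and the ten toy labels are hypothetical and stand in for the study's test set, which is not reproduced here.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and PPV for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical labels: 1 = curve judged usable, 0 = not usable.
technician = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
model = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(technician, model)
print(metrics)
```

Because usable curves are far more common than unusable ones in the survey data described above, sensitivity and specificity are more informative than accuracy alone.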
Subject(s)
Deep Learning , Algorithms , Exhalation , Forced Expiratory Volume , Humans , Spirometry , United States , Vital Capacity
ABSTRACT
The past 5 years have seen an explosion of interest in the use of artificial intelligence (AI) and machine learning techniques in medicine. This has been driven by the development of deep neural networks (DNNs), complex networks residing in silico but loosely modelled on the human brain, that can process complex input data such as a chest radiograph image and output a classification such as 'normal' or 'abnormal'. DNNs are 'trained' using large banks of images or other input data that have been assigned the correct labels. DNNs have shown the potential to equal or even surpass the accuracy of human experts in pattern recognition tasks such as interpreting medical images or biosignals. Within respiratory medicine, the main applications of AI and machine learning thus far have been the interpretation of thoracic imaging, lung pathology slides and physiological data such as pulmonary function tests. This article surveys progress in this area over the past 5 years, as well as highlighting the current limitations of AI and machine learning and the potential for future developments.
Subject(s)
Artificial Intelligence , Machine Learning , Pulmonary Medicine , Humans
ABSTRACT
Spirometry is the current gold standard for diagnosing and monitoring the progression of Chronic Obstructive Pulmonary Disease (COPD). However, many current and former smokers who do not meet established spirometric criteria for the diagnosis of this disease have symptoms and clinical courses similar to those with diagnosed COPD. Large longitudinal observational studies following individuals at risk of developing COPD offer us additional insight into spirometric patterns of disease development and progression. Analysis of forced expiratory maneuver changes over time may allow us to better understand early changes predictive of progressive disease. This review discusses the theoretical ability of spirometry to capture fine pathophysiologic changes in early airway disease, highlights the shortcomings of current diagnostic criteria, and reviews existing evidence for spirometric measures which may be used to better detect early airflow impairment.
Subject(s)
Forced Expiratory Volume , Chronic Obstructive Pulmonary Disease/diagnosis , Chronic Obstructive Pulmonary Disease/physiopathology , Spirometry , Vital Capacity , Risk
ABSTRACT
BACKGROUND: Severe hyperinflation causes detrimental effects such as dyspnea and reduced exercise capacity and is an independent predictor of mortality in COPD patients. Static lung volumes are required to diagnose severe hyperinflation, which are not always accessible in primary care. Several studies have shown that the area under the forced expiratory flow-volume loop (AreaFE) is highly sensitive to bronchodilator response and is correlated with residual volume/total lung capacity (RV/TLC), a common index of air trapping. In this study, we investigate the role of AreaFE% (AreaFE expressed as a percentage of reference value) and conventional spirometry parameters in indicating severe hyperinflation. MATERIALS AND METHODS: We used a cohort of 215 individuals with COPD. The presence of severe hyperinflation was defined as elevated air trapping (RV/TLC >60%) or reduced inspiratory fraction (inspiratory capacity [IC]/TLC <25%) measured using body plethysmography. AreaFE% was calculated by integrating the maximal expiratory flow-volume loop with the trapezoidal rule and expressing it as a percentage of the reference value estimated using predicted values of FVC, peak expiratory flow and forced expiratory flow at 25%, 50% and 75% of FVC. Receiver operating characteristics (ROC) curve analysis was used to identify cut-offs that were used to indicate severe hyperinflation, which were then validated in a separate group of 104 COPD subjects. RESULTS: ROC analysis identified cut-offs of 15% and 20% for AreaFE% in indicating RV/TLC >60% and IC/TLC <25%, respectively (N=215). On validation (N=104), these cut-offs consistently registered the highest accuracy (80% each), sensitivity (68% and 75%) and specificity (83% and 80%) among conventional parameters in both criteria of severe hyperinflation. 
CONCLUSION: AreaFE% consistently provides a superior estimation of severe hyperinflation using different indices, and may provide a convenient way to refer COPD patients for body plethysmography to address static lung volumes.
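The methods above describe AreaFE% as the trapezoidal-rule integral of the expiratory flow-volume loop, expressed as a percentage of a reference area. A minimal sketch of that arithmetic follows. The abstract does not give the formula for the reference area (it is said to be built from predicted FVC, peak expiratory flow and forced expiratory flows), so here `reference_area` is simply passed in as an assumed input, and the sample points are invented for illustration.

```python
def trapezoid_area(volumes_l, flows_l_per_s):
    """Area under a flow-volume curve (L^2/s) by the trapezoidal rule."""
    if len(volumes_l) != len(flows_l_per_s) or len(volumes_l) < 2:
        raise ValueError("need at least two matching volume/flow samples")
    area = 0.0
    for i in range(1, len(volumes_l)):
        dv = volumes_l[i] - volumes_l[i - 1]
        if dv < 0:
            raise ValueError("expired volume must be non-decreasing")
        # Each segment contributes mean flow times the volume step.
        area += 0.5 * (flows_l_per_s[i] + flows_l_per_s[i - 1]) * dv
    return area


def area_fe_percent(measured_area, reference_area):
    """Measured area as a percentage of an externally supplied reference area."""
    if reference_area <= 0:
        raise ValueError("reference area must be positive")
    return 100.0 * measured_area / reference_area


# Hypothetical samples: expired volume (L) and flow (L/s) from start to end of test.
volumes = [0.0, 0.5, 1.0, 2.0, 3.0]
flows = [0.0, 6.0, 4.0, 2.0, 0.0]
measured = trapezoid_area(volumes, flows)
# Assumed reference area of 20 L^2/s, chosen only to make the arithmetic visible.
percent = area_fe_percent(measured, 20.0)
print(measured, percent)
```

A value produced this way would then be compared against the cut-offs reported in the results (15% and 20%); real spirometers sample the curve far more densely than this five-point example.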
Subject(s)
Lung/physiopathology , Maximal Midexpiratory Flow Rate , Chronic Obstructive Pulmonary Disease/diagnosis , Spirometry , Aged , Area Under Curve , Female , Forced Expiratory Volume , Humans , Male , Middle Aged , Whole Body Plethysmography , Predictive Value of Tests , Chronic Obstructive Pulmonary Disease/physiopathology , ROC Curve , Residual Volume , Severity of Illness Index , Total Lung Capacity , Vital Capacity
ABSTRACT
The interpretation of pulmonary function tests (PFTs) to diagnose respiratory diseases is built on expert opinion that relies on the recognition of patterns and the clinical context for detection of specific diseases. In this study, we aimed to explore the accuracy and interrater variability of pulmonologists when interpreting PFTs compared with artificial intelligence (AI)-based software that was developed and validated in more than 1500 historical patient cases. 120 pulmonologists from 16 European hospitals evaluated 50 cases with PFT and clinical information, resulting in 6000 independent interpretations. The AI software examined the same data. American Thoracic Society/European Respiratory Society guidelines were used as the gold standard for PFT pattern interpretation. The gold standard for diagnosis was derived from clinical history, PFT and all additional tests. The pattern recognition of PFTs by pulmonologists (senior 73%, junior 27%) matched the guidelines in 74.4±5.9% of the cases (range 56-88%). The interrater variability of κ=0.67 pointed to a common agreement. Pulmonologists made correct diagnoses in 44.6±8.7% of the cases (range 24-62%) with a large interrater variability (κ=0.35). The AI-based software perfectly matched the PFT pattern interpretations (100%) and assigned a correct diagnosis in 82% of all cases (p<0.0001 for both measures). The interpretation of PFTs by pulmonologists leads to marked variations and errors. AI-based software provides more accurate interpretations and may serve as a powerful decision support tool to improve clinical practice.
Subject(s)
Artificial Intelligence , Pulmonary Medicine , Respiratory Function Tests , Adult , Aged , Aged 80 and over , Female , Humans , Male , Middle Aged , Prospective Studies , Software
ABSTRACT
PURPOSE OF REVIEW: The application of artificial intelligence in the diagnosis of obstructive lung diseases is an exciting phenomenon. Artificial intelligence algorithms work by finding patterns in data obtained from diagnostic tests, which can be used to predict clinical outcomes or to detect obstructive phenotypes. The purpose of this review is to describe the latest trends and to discuss the future potential of artificial intelligence in the diagnosis of obstructive lung diseases. RECENT FINDINGS: Machine learning has been successfully used in automated interpretation of pulmonary function tests for differential diagnosis of obstructive lung diseases. Deep learning models such as convolutional neural networks are state of the art for obstructive pattern recognition in computed tomography. Machine learning has also been applied in other diagnostic approaches such as forced oscillation test, breath analysis, lung sound analysis and telemedicine with promising results in small-scale studies. SUMMARY: Overall, the application of artificial intelligence has produced encouraging results in the diagnosis of obstructive lung diseases. However, large-scale studies are still required to validate current findings and to boost its adoption by the medical community.