ABSTRACT
PURPOSE: Accurate segmentation (separating diseased portions of the lung from normal-appearing lung) is a challenge in radiomic studies of non-neoplastic diseases such as pulmonary tuberculosis (PTB). In this study, we developed a segmentation method for chest X-rays (CXR) that eliminates the need for precise disease delineation and is effective for constructing radiomic models for automatic PTB cavity classification.
METHODS: This retrospective study used a dataset of 266 posteroanterior CXRs of patients with laboratory-confirmed PTB. The lungs were segmented using an in-house U-Net-based automatic segmentation model. A secondary segmentation was then produced by superimposing a sliding window on the primary lung segmentation. PyRadiomics was used to extract features from every window. This increased the dimensionality of the data but allowed us to capture the spread of each feature across the lung. Two separate measures (standard deviation and variance) were used to consolidate the per-window features. Pearson's correlation analysis (with a 0.8 cut-off value) was then applied for dimensionality reduction, followed by the construction of Random Forest radiomic models.
RESULTS: The two consolidation measures yielded two almost identical radiomic signatures of 10 texture features each (9 shared features plus 1 feature unique to each signature). A well-performing Random Forest model was constructed from each signature. The standard-deviation model (AUC = 0.9444; 95% CI, 0.8762 to 0.9814) performed marginally better than the variance model (AUC = 0.9288; 95% CI, 0.9046 to 0.9843).
CONCLUSION: Introducing a secondary sliding-window segmentation on CXR could eliminate the need for disease delineation in pulmonary radiomic studies. Because the developed radiomic models correctly distinguish CXRs with cavities from normal CXRs, the approach could also improve the accuracy of CXR reporting, which is regaining prominence as a high-volume screening tool.
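
The sliding-window steps described under METHODS can be illustrated with a minimal Python sketch. It is not the study's implementation and rests on several assumptions: three simple intensity statistics stand in for the PyRadiomics texture features, the window size, stride, and minimum lung fraction are arbitrary placeholders, all function names are hypothetical, and the greedy filter is only one common way to apply a 0.8 Pearson cut-off (the abstract does not specify the procedure).

```python
import numpy as np


def window_features(image, mask, size=32, stride=32, min_lung_fraction=0.5):
    """Slide a square window over the image and compute features per window.

    `mask` is a boolean lung segmentation with the same shape as `image`.
    Windows covering too little lung are skipped. The three statistics below
    are placeholders for real radiomic features (assumption).
    """
    rows = []
    height, width = image.shape
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            m = mask[top:top + size, left:left + size]
            if m.mean() < min_lung_fraction:
                continue
            patch = image[top:top + size, left:left + size][m]
            rows.append([patch.mean(), patch.max() - patch.min(), np.median(patch)])
    # Shape: (n_windows, n_features); stays 2-D even when no window qualifies.
    return np.asarray(rows, dtype=float).reshape(-1, 3)


def consolidate(per_window, measure="std"):
    """Collapse per-window features into one value per feature for the image.

    The spread of each feature across the lung is summarised by its
    standard deviation or its variance.
    """
    if measure == "std":
        return per_window.std(axis=0, ddof=1)
    if measure == "var":
        return per_window.var(axis=0, ddof=1)
    raise ValueError(f"unknown measure: {measure!r}")


def drop_correlated(X, cutoff=0.8):
    """Return indices of feature columns to keep.

    X has one row per image and one column per consolidated feature.
    A column is kept only if its absolute Pearson correlation with every
    previously kept column does not exceed the cut-off (greedy, assumed).
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= cutoff for k in keep):
            keep.append(j)
    return keep
```

In such a pipeline, `window_features` and `consolidate` would run once per CXR to produce one feature vector per image, and `drop_correlated` would run once on the stacked matrix of all images before a classifier such as a Random Forest is fitted on the retained columns.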