Búsqueda | Biblioteca Virtual en Salud Fronteriza

1.

Can Deep Adult Lung Segmentation Models Generalize to the Pediatric Population?

Rajaraman, Sivaramakrishnan; Yang, Feng; Zamzmi, Ghada; Xue, Zhiyun; Antani, Sameer.

Expert Syst Appl ; 229(Pt A)2023 Nov 01.

Artículo en Inglés | MEDLINE | ID: mdl-37397242

RESUMEN

Lung segmentation in chest X-rays (CXRs) is an important prerequisite for improving the specificity of diagnoses of cardiopulmonary diseases in a clinical decision support system. Current deep learning models for lung segmentation are trained and evaluated on CXR datasets in which the radiographic projections are captured predominantly from the adult population. However, the shape of the lungs is reported to be significantly different across the developmental stages from infancy to adulthood. This might result in age-related data domain shifts that would adversely impact lung segmentation performance when the models trained on the adult population are deployed for pediatric lung segmentation. In this work, our goal is to (i) analyze the generalizability of deep adult lung segmentation models to the pediatric population and (ii) improve performance through a stage-wise, systematic approach consisting of CXR modality-specific weight initializations, stacked ensembles, and an ensemble of stacked ensembles. To evaluate segmentation performance and generalizability, novel evaluation metrics consisting of mean lung contour distance (MLCD) and average hash score (AHS) are proposed in addition to the multi-scale structural similarity index measure (MS-SSIM), the intersection of union (IoU), Dice score, 95% Hausdorff distance (HD95), and average symmetric surface distance (ASSD). Our results showed a significant improvement (p < 0.05) in cross-domain generalization through our approach. This study could serve as a paradigm to analyze the cross-domain generalizability of deep segmentation models for other medical imaging modalities and applications.

2.

The development of "automated visual evaluation" for cervical cancer screening: The promise and challenges in adapting deep-learning for clinical testing: Interdisciplinary principles of automated visual evaluation in cervical screening.

Desai, Kanan T; Befano, Brian; Xue, Zhiyun; Kelly, Helen; Campos, Nicole G; Egemen, Didem; Gage, Julia C; Rodriguez, Ana-Cecilia; Sahasrabuddhe, Vikrant; Levitz, David; Pearlman, Paul; Jeronimo, Jose; Antani, Sameer; Schiffman, Mark; de Sanjosé, Silvia.

Int J Cancer ; 150(5): 741-752, 2022 03 01.

Artículo en Inglés | MEDLINE | ID: mdl-34800038

RESUMEN

There is limited access to effective cervical cancer screening programs in many resource-limited settings, resulting in continued high cervical cancer burden. Human papillomavirus (HPV) testing is increasingly recognized to be the preferable primary screening approach if affordable due to superior long-term reassurance when negative and adaptability to self-sampling. Visual inspection with acetic acid (VIA) is an inexpensive but subjective and inaccurate method widely used in resource-limited settings, either for primary screening or for triage of HPV-positive individuals. A deep learning (DL)-based automated visual evaluation (AVE) of cervical images has been developed to help improve the accuracy and reproducibility of VIA as assistive technology. However, like any new clinical technology, rigorous evaluation and proof of clinical effectiveness are required before AVE is implemented widely. In the current article, we outline essential clinical and technical considerations involved in building a validated DL-based AVE tool for broad use as a clinical test.

Asunto(s)

Aprendizaje Profundo , Detección Precoz del Cáncer/métodos , Neoplasias del Cuello Uterino/diagnóstico , Algoritmos , Femenino , Humanos , Papillomaviridae/aislamiento & purificación , Reproducibilidad de los Resultados , Neoplasias del Cuello Uterino/virología

3.

A demonstration of automated visual evaluation of cervical images taken with a smartphone camera.

Xue, Zhiyun; Novetsky, Akiva P; Einstein, Mark H; Marcus, Jenna Z; Befano, Brian; Guo, Peng; Demarco, Maria; Wentzensen, Nicolas; Long, Leonard Rodney; Schiffman, Mark; Antani, Sameer.

Int J Cancer ; 147(9): 2416-2423, 2020 11 01.

Artículo en Inglés | MEDLINE | ID: mdl-32356305

RESUMEN

We examined whether automated visual evaluation (AVE), a deep learning computer application for cervical cancer screening, can be used on cervix images taken by a contemporary smartphone camera. A large number of cervix images acquired by the commercial MobileODT EVA system were filtered for acceptable visual quality and then 7587 filtered images from 3221 women were annotated by a group of gynecologic oncologists (so the gold standard is an expert impression, not histopathology). We tested and analyzed on multiple random splits of the images using two deep learning, object detection networks. For all the receiver operating characteristics curves, the area under the curve values for the discrimination of the most likely precancer cases from least likely cases (most likely controls) were above 0.90. These results showed that AVE can classify cervix images with confidence scores that are strongly associated with expert evaluations of severity for the same images. The results on a small subset of images that have histopathologic diagnoses further supported the capability of AVE for predicting cervical precancer. We examined the associations of AVE severity score with gynecologic oncologist impression at all regions where we had a sufficient number of cases and controls, and the influence of a woman's age. The method was found generally resilient to regional variation in the appearance of the cervix. This work suggests that using AVE on smartphones could be a useful adjunct to health-worker visual assessment with acetic acid, a cervical cancer screening method commonly used in low- and middle-resource settings.

Asunto(s)

Cuello del Útero/diagnóstico por imagen , Procesamiento de Imagen Asistido por Computador/métodos , Tamizaje Masivo/instrumentación , Teléfono Inteligente/economía , Displasia del Cuello del Útero/diagnóstico , Neoplasias del Cuello Uterino/prevención & control , Biopsia , Cuello del Útero/patología , Conjuntos de Datos como Asunto , Aprendizaje Profundo , Diagnóstico Diferencial , Femenino , Humanos , Tamizaje Masivo/economía , Tamizaje Masivo/métodos , Curva ROC , Displasia del Cuello del Útero/patología , Neoplasias del Cuello Uterino/diagnóstico , Neoplasias del Cuello Uterino/patología

4.

Feature Selection for Automatic Tuberculosis Screening in Frontal Chest Radiographs.

Vajda, Szilárd; Karargyris, Alexandros; Jaeger, Stefan; Santosh, K C; Candemir, Sema; Xue, Zhiyun; Antani, Sameer; Thoma, George.

J Med Syst ; 42(8): 146, 2018 Jun 29.

Artículo en Inglés | MEDLINE | ID: mdl-29959539

RESUMEN

To detect pulmonary abnormalities such as Tuberculosis (TB), an automatic analysis and classification of chest radiographs can be used as a reliable alternative to more sophisticated and technologically demanding methods (e.g. culture or sputum smear analysis). In target areas like Kenya TB is highly prevalent and often co-occurring with HIV combined with low resources and limited medical assistance. In these regions an automatic screening system can provide a cost-effective solution for a large rural population. Our completely automatic TB screening system is processing the incoming CXRs (chest X-ray) by applying image preprocessing techniques to enhance the image quality followed by an adaptive segmentation based on model selection. The delineated lung regions are described by a multitude of image features. These characteristics are than optimized by a feature selection strategy to provide the best description for the classifier, which will later decide if the analyzed image is normal or abnormal. Our goal is to find the optimal feature set from a larger pool of generic image features, -used originally for problems such as object detection, image retrieval, etc. For performance evaluation measures such as under the curve (AUC) and accuracy (ACC) were considered. Using a neural network classifier on two publicly available data collections, -namely the Montgomery and the Shenzhen dataset, we achieved the maximum area under the curve and accuracy of 0.99 and 97.03%, respectively. Further, we compared our results with existing state-of-the-art systems and to radiologists' decision.

Asunto(s)

Algoritmos , Radiografía , Tuberculosis/diagnóstico por imagen , Automatización , Humanos , Tamizaje Masivo , Esputo

5.

Multi-feature based Benchmark for Cervical Dysplasia Classification Evaluation.

Xu, Tao; Zhang, Han; Xin, Cheng; Kim, Edward; Long, L Rodney; Xue, Zhiyun; Antani, Sameer; Huang, Xiaolei.

Pattern Recognit ; 63: 468-475, 2017 Mar.

Artículo en Inglés | MEDLINE | ID: mdl-28603299

RESUMEN

Cervical cancer is one of the most common types of cancer in women worldwide. Most deaths due to the disease occur in less developed areas of the world. In this work, we introduce a new image dataset along with expert annotated diagnoses for evaluating image-based cervical disease classification algorithms. A large number of Cervigram® images are selected from a database provided by the US National Cancer Institute. For each image, we extract three complementary pyramid features: Pyramid histogram in L*A*B* color space (PLAB), Pyramid Histogram of Oriented Gradients (PHOG), and Pyramid histogram of Local Binary Patterns (PLBP). Other than hand-crafted pyramid features, we investigate the performance of convolutional neural network (CNN) features for cervical disease classification. Our experimental results demonstrate the effectiveness of both our hand-crafted and our deep features. We intend to release this multi-feature dataset and our extensive evaluations using seven classic classifiers can serve as the baseline.

6.

Metabolic disorder: the dark side of ovarian aging.

Xue, Zhiyun; Chen, Xiuying; Li, Jin.

Trends Mol Med ; 2024 Jun 05.

Artículo en Inglés | MEDLINE | ID: mdl-38845328

RESUMEN

Ovarian aging plays an important role in the aging process of the whole body. It has been reported that metabolic disorder may significantly contribute to ovarian aging. This article highlights recent advances in metabolic regulation of ovarian aging and highlights key issues in the field.

7.

Uncovering the effects of model initialization on deep model generalization: A study with adult and pediatric chest X-ray images.

Rajaraman, Sivaramakrishnan; Zamzmi, Ghada; Yang, Feng; Liang, Zhaohui; Xue, Zhiyun; Antani, Sameer.

PLOS Digit Health ; 3(1): e0000286, 2024 Jan.

Artículo en Inglés | MEDLINE | ID: mdl-38232121

RESUMEN

Model initialization techniques are vital for improving the performance and reliability of deep learning models in medical computer vision applications. While much literature exists on non-medical images, the impacts on medical images, particularly chest X-rays (CXRs) are less understood. Addressing this gap, our study explores three deep model initialization techniques: Cold-start, Warm-start, and Shrink and Perturb start, focusing on adult and pediatric populations. We specifically focus on scenarios with periodically arriving data for training, thereby embracing the real-world scenarios of ongoing data influx and the need for model updates. We evaluate these models for generalizability against external adult and pediatric CXR datasets. We also propose novel ensemble methods: F-score-weighted Sequential Least-Squares Quadratic Programming (F-SLSQP) and Attention-Guided Ensembles with Learnable Fuzzy Softmax to aggregate weight parameters from multiple models to capitalize on their collective knowledge and complementary representations. We perform statistical significance tests with 95% confidence intervals and p-values to analyze model performance. Our evaluations indicate models initialized with ImageNet-pretrained weights demonstrate superior generalizability over randomly initialized counterparts, contradicting some findings for non-medical images. Notably, ImageNet-pretrained models exhibit consistent performance during internal and external testing across different training scenarios. Weight-level ensembles of these models show significantly higher recall (p<0.05) during testing compared to individual models. Thus, our study accentuates the benefits of ImageNet-pretrained weight initialization, especially when used with weight-level ensembles, for creating robust and generalizable deep learning solutions.

8.

Semantically redundant training data removal and deep model classification performance: A study with chest X-rays.

Rajaraman, Sivaramakrishnan; Zamzmi, Ghada; Yang, Feng; Liang, Zhaohui; Xue, Zhiyun; Antani, Sameer.

Comput Med Imaging Graph ; 115: 102379, 2024 Jul.

Artículo en Inglés | MEDLINE | ID: mdl-38608333

RESUMEN

Deep learning (DL) has demonstrated its innate capacity to independently learn hierarchical features from complex and multi-dimensional data. A common understanding is that its performance scales up with the amount of training data. However, the data must also exhibit variety to enable improved learning. In medical imaging data, semantic redundancy, which is the presence of similar or repetitive information, can occur due to the presence of multiple images that have highly similar presentations for the disease of interest. Also, the common use of augmentation methods to generate variety in DL training could limit performance when indiscriminately applied to such data. We hypothesize that semantic redundancy would therefore tend to lower performance and limit generalizability to unseen data and question its impact on classifier performance even with large data. We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data and demonstrate using the publicly available NIH chest X-ray dataset that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set, during both internal (recall: 0.7164 vs 0.6597, p<0.05) and external testing (recall: 0.3185 vs 0.2589, p<0.05). Our findings emphasize the importance of information-oriented training sample selection as opposed to the conventional practice of using all available training data.

Asunto(s)

Aprendizaje Profundo , Radiografía Torácica , Semántica , Humanos

9.

Cleaning and Harmonizing Medical Image Data for Reliable AI: Lessons Learned from Longitudinal Oral Cancer Natural History Study Data.

Xue, Zhiyun; Oguguo, Tochi; Yu, Kelly J; Chen, Tseng-Cheng; Hua, Chun-Hung; Kang, Chung Jan; Chien, Chih-Yen; Tsai, Ming-Hsui; Wang, Cheng-Ping; Chaturvedi, Anil K; Antani, Sameer.

Proc SPIE Int Soc Opt Eng ; 129312024 Feb.

Artículo en Inglés | MEDLINE | ID: mdl-38774479

RESUMEN

For deep learning-based machine learning, not only are large and sufficiently diverse data crucial but their good qualities are equally important. However, in real-world applications, it is very common that raw source data may contain incorrect, noisy, inconsistent, improperly formatted and sometimes missing elements, particularly, when the datasets are large and sourced from many sites. In this paper, we present our work towards preparing and making image data ready for the development of AI-driven approaches for studying various aspects of the natural history of oral cancer. Specifically, we focus on two aspects: 1) cleaning the image data; and 2) extracting the annotation information. Data cleaning includes removing duplicates, identifying missing data, correcting errors, standardizing data sets, and removing personal sensitive information, toward combining data sourced from different study sites. These steps are often collectively referred to as data harmonization. Annotation information extraction includes identifying crucial or valuable texts that are manually entered by clinical providers related to the image paths/names and standardizing of the texts of labels. Both are important for the successful deep learning algorithm development and data analyses. Specifically, we provide details on the data under consideration, describe the challenges and issues we observed that motivated our work, present specific approaches and methods that we used to clean and standardize the image data and extract labelling information. Further, we discuss the ways to increase efficiency of the process and the lessons learned. Research ideas on automating the process with ML-driven techniques are also presented and discussed. Our intent in reporting and discussing such work in detail is to help provide insights in automating or, minimally, increasing the efficiency of these critical yet often under-reported processes.

10.

Deep Cervix Model Development from Heterogeneous and Partially Labeled Image Datasets.

Pal, Anabik; Xue, Zhiyun; Antani, Sameer.

Front ICT Healthc (2002) ; 519: 679-688, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-37396668

RESUMEN

Cervical cancer is a significant disease affecting women worldwide. Regular cervical examination with gynecologists is important for early detection and treatment planning for women with precancers. Precancer is the direct precursor to cervical cancer. However, there is a scarcity of experts and the experts' assessments are subject to variations in interpretation. In this scenario, the development of a robust automated cervical image classification system is important to augment the experts' limitations. Ideally, for such a system the class label prediction will vary according to the cervical inspection objectives. Hence, the labeling criteria may not be the same in the cervical image datasets. Moreover, due to the lack of confirmatory test results and inter-rater labeling variation, many images are left unlabeled. Motivated by these challenges, we propose to develop a pretrained cervix model from heterogeneous and partially labeled cervical image datasets. Self-supervised Learning (SSL) is employed to build the cervical model. Further, considering data-sharing restrictions, we show how federated self-supervised learning (FSSL) can be employed to develop a cervix model without sharing the cervical images. The task-specific classification models are developed by fine-tuning the cervix model. Two partially labeled cervical image datasets labeled with different classification criteria are used in this study. According to our experimental study, the cervix model prepared with dataset-specific SSL boosts classification accuracy by 2.5%↑ than ImageNet pretrained model. The classification accuracy is further boosted by 1.5%↑ when images from both datasets are combined for SSL. We see that in comparison with the dataset-specific cervix model developed with SSL, the FSSL is performing better.

11.

Assessing the Impact of Image Resolution on Deep Learning for TB Lesion Segmentation on Frontal Chest X-rays.

Rajaraman, Sivaramakrishnan; Yang, Feng; Zamzmi, Ghada; Xue, Zhiyun; Antani, Sameer.

Diagnostics (Basel) ; 13(4)2023 Feb 16.

Artículo en Inglés | MEDLINE | ID: mdl-36832235

RESUMEN

Deep learning (DL) models are state-of-the-art in segmenting anatomical and disease regions of interest (ROIs) in medical images. Particularly, a large number of DL-based techniques have been reported using chest X-rays (CXRs). However, these models are reportedly trained on reduced image resolutions for reasons related to the lack of computational resources. Literature is sparse in discussing the optimal image resolution to train these models for segmenting the tuberculosis (TB)-consistent lesions in CXRs. In this study, we investigated the performance variations with an Inception-V3 UNet model using various image resolutions with/without lung ROI cropping and aspect ratio adjustments and identified the optimal image resolution through extensive empirical evaluations to improve TB-consistent lesion segmentation performance. We used the Shenzhen CXR dataset for the study, which includes 326 normal patients and 336 TB patients. We proposed a combinatorial approach consisting of storing model snapshots, optimizing segmentation threshold and test-time augmentation (TTA), and averaging the snapshot predictions, to further improve performance with the optimal resolution. Our experimental results demonstrate that higher image resolutions are not always necessary; however, identifying the optimal image resolution is critical to achieving superior performance.

12.

Does image resolution impact chest X-ray based fine-grained Tuberculosis-consistent lesion segmentation?

Rajaraman, Sivaramakrishnan; Yang, Feng; Zamzmi, Ghada; Xue, Zhiyun; Antani, Sameer.

ArXiv ; 2023 Jan 27.

Artículo en Inglés | MEDLINE | ID: mdl-36789135

RESUMEN

Deep learning (DL) models are state-of-the-art in segmenting anatomical and disease regions of interest (ROIs) in medical images. Particularly, a large number of DL-based techniques have been reported using chest X-rays (CXRs). However, these models are reportedly trained on reduced image resolutions for reasons related to the lack of computational resources. Literature is sparse in discussing the optimal image resolution to train these models for segmenting the Tuberculosis (TB)-consistent lesions in CXRs. In this study, we investigated the performance variations using an Inception-V3 UNet model using various image resolutions with/without lung ROI cropping and aspect ratio adjustments, and (ii) identified the optimal image resolution through extensive empirical evaluations to improve TB-consistent lesion segmentation performance. We used the Shenzhen CXR dataset for the study which includes 326 normal patients and 336 TB patients. We proposed a combinatorial approach consisting of storing model snapshots, optimizing segmentation threshold and test-time augmentation (TTA), and averaging the snapshot predictions, to further improve performance with the optimal resolution. Our experimental results demonstrate that higher image resolutions are not always necessary, however, identifying the optimal image resolution is critical to achieving superior performance.

13.

Data Characterization for Reliable AI in Medicine.

Rajaraman, Sivaramakrishnan; Zamzmi, Ghada; Yang, Feng; Xue, Zhiyun; Antani, Sameer K.

Recent Trends Image Process Pattern Recogn (2022) ; 1704: 3-11, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-36780238

RESUMEN

Research in Artificial Intelligence (AI)-based medical computer vision algorithms bear promises to improve disease screening, diagnosis, and subsequently patient care. However, these algorithms are highly impacted by the characteristics of the underlying data. In this work, we discuss various data characteristics, namely Volume, Veracity, Validity, Variety, and Velocity, that impact the design, reliability, and evolution of machine learning in medical computer vision. Further, we discuss each characteristic and the recent works conducted in our research lab that informed our understanding of the impact of these characteristics on the design of medical decision-making algorithms and outcome reliability.

14.

Cross Dataset Analysis of Domain Shift in CXR Lung Region Detection.

Xue, Zhiyun; Yang, Feng; Rajaraman, Sivaramakrishnan; Zamzmi, Ghada; Antani, Sameer.

Diagnostics (Basel) ; 13(6)2023 Mar 11.

Artículo en Inglés | MEDLINE | ID: mdl-36980375

RESUMEN

Domain shift is one of the key challenges affecting reliability in medical imaging-based machine learning predictions. It is of significant importance to investigate this issue to gain insights into its characteristics toward determining controllable parameters to minimize its impact. In this paper, we report our efforts on studying and analyzing domain shift in lung region detection in chest radiographs. We used five chest X-ray datasets, collected from different sources, which have manual markings of lung boundaries in order to conduct extensive experiments toward this goal. We compared the characteristics of these datasets from three aspects: information obtained from metadata or an image header, image appearance, and features extracted from a pretrained model. We carried out experiments to evaluate and compare model performances within each dataset and across datasets in four scenarios using different combinations of datasets. We proposed a new feature visualization method to provide explanations for the applied object detection network on the obtained quantitative results. We also examined chest X-ray modality-specific initialization, catastrophic forgetting, and model repeatability. We believe the observations and discussions presented in this work could help to shed some light on the importance of the analysis of training data for medical imaging machine learning research, and could provide valuable guidance for domain shift analysis.

15.

Semantically Redundant Training Data Removal and Deep Model Classification Performance: A Study with Chest X-rays.

Rajaraman, Sivaramakrishnan; Zamzmi, Ghada; Yang, Feng; Liang, Zhaohui; Xue, Zhiyun; Antani, Sameer.

ArXiv ; 2023 Sep 18.

Artículo en Inglés | MEDLINE | ID: mdl-37986725

RESUMEN

Deep learning (DL) has demonstrated its innate capacity to independently learn hierarchical features from complex and multi-dimensional data. A common understanding is that its performance scales up with the amount of training data. Another data attribute is the inherent variety. It follows, therefore, that semantic redundancy, which is the presence of similar or repetitive information, would tend to lower performance and limit generalizability to unseen data. In medical imaging data, semantic redundancy can occur due to the presence of multiple images that have highly similar presentations for the disease of interest. Further, the common use of augmentation methods to generate variety in DL training may be limiting performance when applied to semantically redundant data. We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data. We demonstrate using the publicly available NIH chest X-ray dataset that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set, during both internal (recall: 0.7164 vs 0.6597, p<0.05) and external testing (recall: 0.3185 vs 0.2589, p<0.05). Our findings emphasize the importance of information-oriented training sample selection as opposed to the conventional practice of using all available training data.

16.

Automatic Quantification of COVID-19 Pulmonary Edema by Self-supervised Contrastive Learning.

Liang, Zhaohui; Xue, Zhiyun; Rajaraman, Sivaramakrishnan; Feng, Yang; Antani, Sameer.

Med Image Learn Ltd Noisy Data (2023) ; 14307: 128-137, 2023 Oct.

Artículo en Inglés | MEDLINE | ID: mdl-38415180

RESUMEN

We proposed a self-supervised machine learning method to automatically rate the severity of pulmonary edema in the frontal chest X-ray radiographs (CXR) which could be potentially related to COVID-19 viral pneumonia. For this we use the modified radiographic assessment of lung edema (mRALE) scoring system. The new model was first optimized with the simple Siamese network (SimSiam) architecture where a ResNet-50 pretrained by ImageNet database was used as the backbone. The encoder projected a 2048-dimension embedding as representation features to a downstream fully connected deep neural network for mRALE score prediction. A 5-fold cross-validation with 2,599 frontal CXRs was used to examine the new model's performance with comparison to a non-pretrained SimSiam encoder and a ResNet-50 trained from scratch. The mean absolute error (MAE) of the new model is 5.05 (95%CI 5.03-5.08), the mean squared error (MSE) is 66.67 (95%CI 66.29-67.06), and the Spearman's correlation coefficient (Spearman ρ) to the expert-annotated scores is 0.77 (95%CI 0.75-0.79). All the performance metrics of the new model are superior to the two comparators (P<0.01), and the scores of MSE and Spearman ρ of the two comparators have no statistical difference (P>0.05). The model also achieved a prediction probability concordance of 0.811 and a quadratic weighted kappa of 0.739 with the medical expert annotations in external validation. We conclude that the self-supervised contrastive learning method is an effective strategy for mRALE automated scoring. It provides a new approach to improve machine learning performance and minimize the expert knowledge involvement in quantitative medical image pattern learning.

17.

Assessing Inter-Annotator Agreement for Medical Image Segmentation.

Yang, Feng; Zamzmi, Ghada; Angara, Sandeep; Rajaraman, Sivaramakrishnan; Aquilina, André; Xue, Zhiyun; Jaeger, Stefan; Papagiannakis, Emmanouil; Antani, Sameer K.

IEEE Access ; 11: 21300-21312, 2023.

Artículo en Inglés | MEDLINE | ID: mdl-37008654

RESUMEN

Artificial Intelligence (AI)-based medical computer vision algorithm training and evaluations depend on annotations and labeling. However, variability between expert annotators introduces noise in training data that can adversely impact the performance of AI algorithms. This study aims to assess, illustrate and interpret the inter-annotator agreement among multiple expert annotators when segmenting the same lesion(s)/abnormalities on medical images. We propose the use of three metrics for the qualitative and quantitative assessment of inter-annotator agreement: 1) use of a common agreement heatmap and a ranking agreement heatmap; 2) use of the extended Cohen's kappa and Fleiss' kappa coefficients for a quantitative evaluation and interpretation of inter-annotator reliability; and 3) use of the Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm, as a parallel step, to generate ground truth for training AI models and compute Intersection over Union (IoU), sensitivity, and specificity to assess the inter-annotator reliability and variability. Experiments are performed on two datasets, namely cervical colposcopy images from 30 patients and chest X-ray images from 336 tuberculosis (TB) patients, to demonstrate the consistency of inter-annotator reliability assessment and the importance of combining different metrics to avoid bias assessment.

18.

Unsupervised Deep Learning Registration of Uterine Cervix Sequence Images.

Guo, Peng; Xue, Zhiyun; Angara, Sandeep; Antani, Sameer K.

Cancers (Basel) ; 14(10)2022 May 13.

Artículo en Inglés | MEDLINE | ID: mdl-35626005

RESUMEN

During a colposcopic examination of the uterine cervix for cervical cancer prevention, one or more digital images are typically acquired after the application of diluted acetic acid. An alternative approach is to acquire a sequence of images at fixed intervals during an examination before and after applying acetic acid. This approach is asserted to be more informative as it can capture dynamic pixel intensity variations on the cervical epithelium during the aceto-whitening reaction. However, the resulting time sequence images may not be spatially aligned due to the movement of the cervix with respect to the imaging device. Disease prediction using automated visual evaluation (AVE) techniques using multiple images could be adversely impacted without correction for this misalignment. The challenge is that there is no registration ground truth to help train a supervised-learning-based image registration algorithm. We present a novel unsupervised registration approach to align a sequence of digital cervix color images. The proposed deep-learning-based registration network consists of three branches and processes the red, green, and blue (RGB, respectively) channels of each input color image separately using an unsupervised strategy. Each network branch consists of a convolutional neural network (CNN) unit and a spatial transform unit. To evaluate the registration performance on a dataset that has no ground truth, we propose an evaluation strategy that is based on comparing automatic cervix segmentation masks in the registered sequence and the original sequence. The compared segmentation masks are generated by a fine-tuned transformer-based object detection model (DeTr). The segmentation model achieved Dice/IoU scores of 0.917/0.870 and 0.938/0.885, which are comparable to the performance of our previous model in two datasets. By comparing our segmentation on both original and registered time sequence images, we observed an average improvement in Dice scores of 12.62% following registration. Further, our approach achieved higher Dice and IoU scores and maintained full image integrity compared to a non-deep learning registration method on the same dataset.

19.

Analysis of digital noise and reduction methods on classifiers used in automated visual evaluation in cervical cancer screening.

Xue, Zhiyun; Angara, Sandeep; Levitz, David; Antani, Sameer.

Proc SPIE Int Soc Opt Eng ; 119502022.

Artículo en Inglés | MEDLINE | ID: mdl-35529321

RESUMEN

The burden of cervical cancer disproportionately falls on low- and middle-income countries (LMICs). Automated visual evaluation (AVE) is a technology being considered as an adjunct tool for the management of HPV-positive women. AVE involves analysis of a white light illuminated cervical image using machine learning classifiers. It is of importance to analyze various impacts of different kinds of image degradations on AVE. In this paper, we report our work regarding the impact of one type of image degradation, Gaussian noise, and one of its remedies we have been exploring. The images, originated from the Natural History Study (NHS) and ASCUS-LSIL Triage Study (ALTS), were modified by the addition of white Gaussian noise at different levels. The AVE pipeline used in the experiments consists of two deep learning components: a cervix locator which uses RetinaNet (an object detection network), and a binary pathology classifier that uses the ResNeSt network. Our findings indicate that Gaussian noise, which frequently appears in low light conditions, is a key factor in degrading the AVE's performance. A blind image denoising technique which uses Variational Denoising Network (VDNet) was tested on a set of 345 digitized cervigram images (115 positives) and evaluated both visually and quantitatively. AVE performances on both the synthetically generated noisy images and the corresponding denoised images were examined and compared. In addition, the denoising technique was evaluated on several real noisy cervix images captured by a camera-based imaging device used for AVE that have no histology confirmation. The comparison between the AVE performances on images with and without denoising shows that denoising can be effective at mitigating classification performance degradation.

20.

A Deep Modality-Specific Ensemble for Improving Pneumonia Detection in Chest X-rays.

Rajaraman, Sivaramakrishnan; Guo, Peng; Xue, Zhiyun; Antani, Sameer K.

Diagnostics (Basel) ; 12(6)2022 Jun 11.

Artículo en Inglés | MEDLINE | ID: mdl-35741252

RESUMEN

Pneumonia is an acute respiratory infectious disease caused by bacteria, fungi, or viruses. Fluid-filled lungs due to the disease result in painful breathing difficulties and reduced oxygen intake. Effective diagnosis is critical for appropriate and timely treatment and improving survival. Chest X-rays (CXRs) are routinely used to screen for the infection. Computer-aided detection methods using conventional deep learning (DL) models for identifying pneumonia-consistent manifestations in CXRs have demonstrated superiority over traditional machine learning approaches. However, their performance is still inadequate to aid in clinical decision-making. This study improves upon the state of the art as follows. Specifically, we train a DL classifier on large collections of CXR images to develop a CXR modality-specific model. Next, we use this model as the classifier backbone in the RetinaNet object detection network. We also initialize this backbone using random weights and ImageNet-pretrained weights. Finally, we construct an ensemble of the best-performing models resulting in improved detection of pneumonia-consistent findings. Experimental results demonstrate that an ensemble of the top-3 performing RetinaNet models outperformed individual models in terms of the mean average precision (mAP) metric (0.3272, 95% CI: (0.3006,0.3538)) toward this task, which is markedly higher than the state of the art (mAP: 0.2547). This performance improvement is attributed to the key modifications in initializing the weights of classifier backbones and constructing model ensembles to reduce prediction variance compared to individual constituent models.

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

RESUMEN

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA