Results 1 - 20 of 133
1.
Article in English | MEDLINE | ID: mdl-38774479

ABSTRACT

For deep learning-based machine learning, data must be not only large and sufficiently diverse but also of good quality. However, in real-world applications, raw source data commonly contain incorrect, noisy, inconsistent, improperly formatted, and sometimes missing elements, particularly when the datasets are large and sourced from many sites. In this paper, we present our work toward preparing image data for the development of AI-driven approaches to studying various aspects of the natural history of oral cancer. Specifically, we focus on two aspects: 1) cleaning the image data; and 2) extracting the annotation information. Data cleaning includes removing duplicates, identifying missing data, correcting errors, standardizing data sets, and removing personally sensitive information, toward combining data sourced from different study sites. These steps are often collectively referred to as data harmonization. Annotation information extraction includes identifying crucial or valuable text manually entered by clinical providers in the image paths/names and standardizing the label text. Both are important for successful deep learning algorithm development and data analysis. Specifically, we provide details on the data under consideration, describe the challenges and issues we observed that motivated our work, and present the specific approaches and methods we used to clean and standardize the image data and extract labeling information. Further, we discuss ways to increase the efficiency of the process and the lessons learned. Research ideas on automating the process with ML-driven techniques are also presented and discussed. Our intent in reporting and discussing this work in detail is to help provide insights into automating or, minimally, increasing the efficiency of these critical yet often under-reported processes.

2.
Comput Med Imaging Graph ; 115: 102379, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38608333

ABSTRACT

Deep learning (DL) has demonstrated its innate capacity to independently learn hierarchical features from complex and multi-dimensional data. A common understanding is that its performance scales up with the amount of training data. However, the data must also exhibit variety to enable improved learning. In medical imaging data, semantic redundancy, which is the presence of similar or repetitive information, can occur due to the presence of multiple images that have highly similar presentations for the disease of interest. Also, the common use of augmentation methods to generate variety in DL training could limit performance when indiscriminately applied to such data. We hypothesize that semantic redundancy would therefore tend to lower performance and limit generalizability to unseen data, and we question its impact on classifier performance even with large data. We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data and demonstrate, using the publicly available NIH chest X-ray dataset, that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set, during both internal (recall: 0.7164 vs 0.6597, p<0.05) and external testing (recall: 0.3185 vs 0.2589, p<0.05). Our findings emphasize the importance of information-oriented training sample selection as opposed to the conventional practice of using all available training data.
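The abstract does not spell out the scoring rule itself; a minimal sketch of one plausible reading, in which each image is scored by the Shannon entropy of the model's softmax output and only the highest-entropy (most informative) fraction is kept, might look like the following. The function names and the `keep_fraction` parameter are illustrative assumptions, not the authors' code.

```python
import math

def prediction_entropy(probs):
    """Shannon entropy of a model's softmax output for one image.
    Low entropy suggests the model is already very sure, i.e. the
    sample adds little new information to the training set."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_informative(samples, keep_fraction=0.7):
    """Rank samples by the entropy of their predictions and keep the
    highest-entropy fraction. `samples` is a list of
    (sample_id, softmax_probs) pairs."""
    scored = sorted(samples, key=lambda s: prediction_entropy(s[1]), reverse=True)
    k = max(1, int(len(scored) * keep_fraction))
    return [sid for sid, _ in scored[:k]]
```

In practice the probabilities would come from a model pass over the training set; here the ranking step is the point.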

3.
PLOS Digit Health ; 3(1): e0000286, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38232121

ABSTRACT

Model initialization techniques are vital for improving the performance and reliability of deep learning models in medical computer vision applications. While much literature exists on non-medical images, the impacts on medical images, particularly chest X-rays (CXRs), are less understood. Addressing this gap, our study explores three deep model initialization techniques: Cold-start, Warm-start, and Shrink and Perturb start, focusing on adult and pediatric populations. We specifically focus on scenarios with periodically arriving data for training, thereby embracing the real-world scenarios of ongoing data influx and the need for model updates. We evaluate these models for generalizability against external adult and pediatric CXR datasets. We also propose novel ensemble methods: F-score-weighted Sequential Least-Squares Quadratic Programming (F-SLSQP) and Attention-Guided Ensembles with Learnable Fuzzy Softmax, to aggregate weight parameters from multiple models to capitalize on their collective knowledge and complementary representations. We perform statistical significance tests with 95% confidence intervals and p-values to analyze model performance. Our evaluations indicate models initialized with ImageNet-pretrained weights demonstrate superior generalizability over randomly initialized counterparts, contradicting some findings for non-medical images. Notably, ImageNet-pretrained models exhibit consistent performance during internal and external testing across different training scenarios. Weight-level ensembles of these models show significantly higher recall (p<0.05) during testing compared to individual models. Thus, our study accentuates the benefits of ImageNet-pretrained weight initialization, especially when used with weight-level ensembles, for creating robust and generalizable deep learning solutions.

4.
J Natl Cancer Inst ; 116(1): 26-33, 2024 01 10.
Article in English | MEDLINE | ID: mdl-37758250

ABSTRACT

Novel screening and diagnostic tests based on artificial intelligence (AI) image recognition algorithms are proliferating. Some initial reports claim outstanding accuracy followed by disappointing lack of confirmation, including our own early work on cervical screening. This is a presentation of lessons learned, organized as a conceptual step-by-step approach to bridge the gap between the creation of an AI algorithm and clinical efficacy. The first fundamental principle is specifying rigorously what the algorithm is designed to identify and what the test is intended to measure (eg, screening, diagnostic, or prognostic). The second is designing the AI algorithm to minimize the most clinically important errors. For example, many equivocal cervical images cannot yet be labeled because the borderline between cases and controls is blurred. To avoid a misclassified case-control dichotomy, we have isolated the equivocal cases and formally included an intermediate, indeterminate class (severity order of classes: case>indeterminate>control). The third principle is evaluating AI algorithms like any other test, using clinical epidemiologic criteria. Repeatability of the algorithm at the borderline, for indeterminate images, has proven extremely informative. Distinguishing between internal and external validation is also essential. Linking the AI algorithm results to clinical risk estimation is the fourth principle. Absolute risk (not relative) is the critical metric for translating a test result into clinical use. Finally, generating risk-based guidelines for clinical use that match local resources and priorities is the last principle in our approach. We are particularly interested in applications to lower-resource settings to address health disparities. We note that similar principles apply to other domains of AI-based image analysis for medical diagnostic testing.


Subject(s)
Artificial Intelligence , Uterine Cervical Neoplasms , Female , Humans , Early Detection of Cancer , Uterine Cervical Neoplasms/diagnosis , Algorithms , Image Processing, Computer-Assisted
5.
Sci Rep ; 13(1): 21772, 2023 12 08.
Article in English | MEDLINE | ID: mdl-38066031

ABSTRACT

Cervical cancer is a leading cause of cancer mortality, with approximately 90% of the 250,000 deaths per year occurring in low- and middle-income countries (LMIC). Secondary prevention with cervical screening involves detecting and treating precursor lesions; however, scaling screening efforts in LMIC has been hampered by infrastructure and cost constraints. Recent work has supported the development of an artificial intelligence (AI) pipeline on digital images of the cervix to achieve an accurate and reliable diagnosis of treatable precancerous lesions. In particular, WHO guidelines emphasize visual triage of women testing positive for human papillomavirus (HPV) as the primary screen, and AI could assist in this triage task. In this work, we implemented a comprehensive deep-learning model selection and optimization study on a large, collated, multi-geography, multi-institution, and multi-device dataset of 9462 women (17,013 images). We evaluated relative portability, repeatability, and classification performance. The top performing model, when combined with HPV type, achieved an area under the Receiver Operating Characteristics (ROC) curve (AUC) of 0.89 within our study population of interest, and a limited total extreme misclassification rate of 3.4%, on held-aside test sets. Our model also produced reliable and consistent predictions, achieving a strong quadratic weighted kappa (QWK) of 0.86 and a minimal percent 2-class disagreement (% 2-Cl. D.) of 0.69%, between image pairs across women. Our work is among the first efforts at designing a robust, repeatable, accurate and clinically translatable deep-learning model for cervical screening.


Subject(s)
Papillomavirus Infections , Uterine Cervical Neoplasms , Humans , Female , Cervix Uteri/pathology , Papillomavirus Infections/epidemiology , Artificial Intelligence , Early Detection of Cancer/methods , Mass Screening/methods , Neural Networks, Computer
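The quadratic weighted kappa (QWK) reported above (0.86) penalizes disagreements between two ratings of the same items by the squared distance between ordinal classes. A self-contained sketch of the standard computation (not the authors' code) over two rating lists:

```python
def quadratic_weighted_kappa(a, b, n_classes):
    """QWK between two ratings of the same items, e.g. a model's
    predictions on paired images of the same woman."""
    n = len(a)
    # Observed confusion matrix between the two ratings.
    O = [[0.0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        O[x][y] += 1
    # Marginal class histograms give the chance-expected matrix.
    hist_a = [a.count(i) for i in range(n_classes)]
    hist_b = [b.count(i) for i in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2  # quadratic weight
            E = hist_a[i] * hist_b[j] / n
            num += w * O[i][j]
            den += w * E
    return 1.0 - num / den
```

Perfect agreement yields 1.0; agreement no better than chance yields 0; systematic disagreement goes negative.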
6.
ArXiv ; 2023 Sep 18.
Article in English | MEDLINE | ID: mdl-37986725

ABSTRACT

Deep learning (DL) has demonstrated its innate capacity to independently learn hierarchical features from complex and multi-dimensional data. A common understanding is that its performance scales up with the amount of training data. Another data attribute is the inherent variety. It follows, therefore, that semantic redundancy, which is the presence of similar or repetitive information, would tend to lower performance and limit generalizability to unseen data. In medical imaging data, semantic redundancy can occur due to the presence of multiple images that have highly similar presentations for the disease of interest. Further, the common use of augmentation methods to generate variety in DL training may be limiting performance when applied to semantically redundant data. We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data. We demonstrate using the publicly available NIH chest X-ray dataset that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set, during both internal (recall: 0.7164 vs 0.6597, p<0.05) and external testing (recall: 0.3185 vs 0.2589, p<0.05). Our findings emphasize the importance of information-oriented training sample selection as opposed to the conventional practice of using all available training data.

7.
Infect Agent Cancer ; 18(1): 61, 2023 Oct 16.
Article in English | MEDLINE | ID: mdl-37845724

ABSTRACT

BACKGROUND: WHO has recommended HPV testing for cervical screening where it is practical and affordable. If used, it is important to both clarify and implement the clinical management of positive results. We estimated the performance in Lusaka, Zambia of a novel screening/triage approach combining HPV typing with visual assessment assisted by a deep-learning approach called automated visual evaluation (AVE). METHODS: In this well-established cervical cancer screening program nested inside public sector primary care health facilities, experienced nurses examined women with high-quality digital cameras; the magnified illuminated images permit inspection of the surface morphology of the cervix and expert telemedicine quality assurance. Emphasizing sensitive criteria to avoid missing precancer/cancer, ~ 25% of women screen positive, reflecting partly the high HIV prevalence. Visual screen-positive women are treated in the same visit by trained nurses using either ablation (~ 60%) or LLETZ excision, or referred for LLETZ or more extensive surgery as needed. We added research elements (which did not influence clinical care) including collection of HPV specimens for testing and typing with BD Onclarity™ with a five channel output (HPV16, HPV18/45, HPV31/33/52/58, HPV35/39/51/56/59/66/68, human DNA control), and collection of triplicate cervical images with a Samsung Galaxy J8 smartphone camera™ that were analyzed using AVE, an AI-based algorithm pre-trained on a large NCI cervical image archive. The four HPV groups and three AVE classes were crossed to create a 12-level risk scale, ranking participants in order of predicted risk of precancer. We evaluated the risk scale and assessed how well it predicted the observed diagnosis of precancer/cancer. RESULTS: HPV type, AVE classification, and the 12-level risk scale all were strongly associated with degree of histologic outcome. 
The AVE classification showed good reproducibility between replicates, and added finer predictive accuracy to each HPV type group. Women living with HIV had higher prevalence of precancer/cancer; the HPV-AVE risk categories strongly predicted diagnostic findings in these women as well. CONCLUSIONS: These results support the theoretical efficacy of HPV-AVE-based risk estimation for cervical screening. If HPV testing can be made affordable, cost-effective and point of care, this risk-based approach could be one management option for HPV-positive women.
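Crossing the four HPV channel groups with the three AVE classes into a 12-level risk scale can be sketched as a simple ranking. The abstract does not give the paper's exact ordering of the combined categories; this sketch assumes a plain lexicographic order (HPV group first, then AVE class, with 0 the most severe in each) purely for illustration.

```python
def risk_rank(hpv_group, ave_class):
    """Combine an HPV risk group (0 = HPV16 ... 3 = lowest-risk channel)
    and an AVE severity class (0 = most severe ... 2 = least severe)
    into one of 12 ordered risk levels, 0 = highest predicted risk.
    The ordering is an illustrative assumption, not the paper's rule."""
    assert 0 <= hpv_group < 4 and 0 <= ave_class < 3
    return hpv_group * 3 + ave_class
```

The point of the construction is that every (HPV, AVE) pair maps to a distinct rank, so participants can be ordered by predicted risk of precancer.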

8.
Mil Med ; 2023 Oct 20.
Article in English | MEDLINE | ID: mdl-37864817

ABSTRACT

The success of deep-learning algorithms in analyzing complex structured and unstructured multidimensional data has caused an exponential increase in the amount of research devoted to the applications of artificial intelligence (AI) in medicine in the past decade. The public release of large language models like ChatGPT in the past year has generated an unprecedented storm of excitement and rumors of machine intelligence finally reaching or even surpassing human capability in detecting meaningful signals in complex multivariate data. Such enthusiasm, however, is met with an equal degree of both skepticism and fear over the social, legal, and moral implications of such powerful technology with relatively few safeguards or regulations on its development. The question remains in medicine of how to harness the power of AI to improve patient outcomes by increasing the diagnostic accuracy and treatment precision provided by medical professionals. Military medicine, given its unique mission and resource constraints, can benefit immensely from such technology. However, reaping such benefits hinges on the ability of the rising generations of military medical professionals to understand AI algorithms and their applications. Additionally, they should strongly consider working with AI as an adjunct decision-maker, viewing it as a colleague for accessing and harnessing relevant information rather than something to be feared. The ideas expressed in this commentary were formulated by a military medical student during a two-month research elective working on a multidisciplinary team of computer scientists and clinicians at the National Library of Medicine advancing the state of the art of AI in medicine. A motivation to incorporate AI in the Military Health System is provided, including examples of applications in military medicine. Rationale is then given for the inclusion of AI in education starting in medical school, as well as a prudent implementation of these algorithms in a clinical workflow during graduate medical education. Finally, barriers to implementation are addressed along with potential solutions. The end state is not that rising military physicians are technical experts in AI, but rather that they understand how they can leverage its rapidly evolving capabilities to prepare for a future where AI will have a significant role in clinical care. The overall goal is to develop trained clinicians who can leverage these technologies to improve the Military Health System.

9.
Int J Cardiovasc Imaging ; 39(12): 2437-2450, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37682418

ABSTRACT

Current noninvasive estimation of right atrial pressure (RAP) by inferior vena cava (IVC) measurement during echocardiography may have significant inter-rater variability due to different levels of observers' experience. Therefore, there is a need to develop new approaches to decrease the variability of IVC analysis and RAP estimation. This study aims to develop a fully automated artificial intelligence (AI)-based system for automated IVC analysis and RAP estimation. We presented a multi-stage AI system to identify the IVC view, select good quality images, delineate the IVC region and quantify its thickness, enabling temporal tracking of its diameter and collapsibility changes. The automated system was trained and tested on expert manual IVC and RAP reference measurements obtained from 255 patients during routine clinical workflow. The performance was evaluated using Pearson correlation and Bland-Altman analysis for IVC values, as well as macro accuracy and chi-square test for RAP values. Our results show an excellent agreement (r=0.96) between automatically computed and manually measured IVC values, and Bland-Altman analysis showed a small bias of [Formula: see text]0.33 mm. Further, there is an excellent agreement ([Formula: see text]) between automatically estimated and manually derived RAP values, with a macro accuracy of 0.85. The proposed AI-based system accurately quantified the IVC diameter and collapsibility index, both of which are used for RAP estimation. This automated system could serve as a paradigm to perform IVC analysis in routine echocardiography and support various cardiac diagnostic applications.


Subject(s)
Artificial Intelligence , Atrial Pressure , Humans , Predictive Value of Tests , Echocardiography , Heart , Vena Cava, Inferior/diagnostic imaging
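The collapsibility index that feeds such RAP estimation is a simple ratio of diameters. The bracketing below follows common ASE-style guidance (a 21 mm diameter threshold and 50% collapse), which is an assumption for illustration; the paper's exact decision rule is not stated in the abstract, and the function names are hypothetical.

```python
def collapsibility_index(d_max, d_min):
    """IVC collapsibility index: fractional decrease in diameter
    between expiration (max) and inspiration (min)."""
    return (d_max - d_min) / d_max

def estimate_rap(d_max_mm, ci):
    """Guideline-style RAP bracketing (mmHg) from IVC diameter and
    collapse. Thresholds (21 mm, 50%) follow common ASE guidance,
    assumed here; not necessarily the paper's rule."""
    if d_max_mm <= 21 and ci > 0.5:
        return 3    # normal range, ~0-5 mmHg
    if d_max_mm > 21 and ci < 0.5:
        return 15   # elevated, ~10-20 mmHg
    return 8        # intermediate
```

The automated system's contribution is producing `d_max`/`d_min` reproducibly from video; the downstream arithmetic is this simple.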
10.
Front ICT Healthc (2002) ; 519: 679-688, 2023.
Article in English | MEDLINE | ID: mdl-37396668

ABSTRACT

Cervical cancer is a significant disease affecting women worldwide. Regular cervical examination with gynecologists is important for early detection and treatment planning for women with precancers. Precancer is the direct precursor to cervical cancer. However, there is a scarcity of experts, and the experts' assessments are subject to variations in interpretation. In this scenario, the development of a robust automated cervical image classification system is important to augment the experts' limitations. Ideally, for such a system the class label prediction will vary according to the cervical inspection objectives. Hence, the labeling criteria may not be the same across cervical image datasets. Moreover, due to the lack of confirmatory test results and inter-rater labeling variation, many images are left unlabeled. Motivated by these challenges, we propose to develop a pretrained cervix model from heterogeneous and partially labeled cervical image datasets. Self-supervised learning (SSL) is employed to build the cervical model. Further, considering data-sharing restrictions, we show how federated self-supervised learning (FSSL) can be employed to develop a cervix model without sharing the cervical images. The task-specific classification models are developed by fine-tuning the cervix model. Two partially labeled cervical image datasets, labeled with different classification criteria, are used in this study. According to our experimental study, the cervix model prepared with dataset-specific SSL boosts classification accuracy by 2.5% over the ImageNet-pretrained model. The classification accuracy is further boosted by 1.5% when images from both datasets are combined for SSL. In comparison with the dataset-specific cervix model developed with SSL, the FSSL model performs better.

11.
Expert Syst Appl ; 229(Pt A)2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37397242

ABSTRACT

Lung segmentation in chest X-rays (CXRs) is an important prerequisite for improving the specificity of diagnoses of cardiopulmonary diseases in a clinical decision support system. Current deep learning models for lung segmentation are trained and evaluated on CXR datasets in which the radiographic projections are captured predominantly from the adult population. However, the shape of the lungs is reported to be significantly different across the developmental stages from infancy to adulthood. This might result in age-related data domain shifts that would adversely impact lung segmentation performance when the models trained on the adult population are deployed for pediatric lung segmentation. In this work, our goal is to (i) analyze the generalizability of deep adult lung segmentation models to the pediatric population and (ii) improve performance through a stage-wise, systematic approach consisting of CXR modality-specific weight initializations, stacked ensembles, and an ensemble of stacked ensembles. To evaluate segmentation performance and generalizability, novel evaluation metrics consisting of mean lung contour distance (MLCD) and average hash score (AHS) are proposed in addition to the multi-scale structural similarity index measure (MS-SSIM), the intersection over union (IoU), Dice score, 95% Hausdorff distance (HD95), and average symmetric surface distance (ASSD). Our results showed a significant improvement (p < 0.05) in cross-domain generalization through our approach. This study could serve as a paradigm to analyze the cross-domain generalizability of deep segmentation models for other medical imaging modalities and applications.
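Of the metrics listed, Dice and IoU are the most standard. A minimal sketch over binary masks represented as sets of pixel coordinates (a simplification of the usual array form; the function name is illustrative):

```python
def dice_and_iou(pred, truth):
    """Dice score and IoU between two binary masks given as sets of
    (row, col) foreground-pixel coordinates."""
    inter = len(pred & truth)
    union = len(pred | truth)
    dice = 2 * inter / (len(pred) + len(truth)) if (pred or truth) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou
```

Dice weights the overlap against the two mask sizes, IoU against their union; Dice is always at least as large as IoU for the same pair of masks.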

12.
JACC Cardiovasc Imaging ; 16(9): 1209-1223, 2023 09.
Article in English | MEDLINE | ID: mdl-37480904

ABSTRACT

Artificial intelligence (AI) promises to revolutionize many fields, but its clinical implementation in cardiovascular imaging is still rare despite increasing research. We sought to facilitate discussion across several fields and across the lifecycle of research, development, validation, and implementation to identify challenges and opportunities to further translation of AI in cardiovascular imaging. Furthermore, it seemed apparent that a multidisciplinary effort across institutions would be essential to overcome these challenges. This paper summarizes the proceedings of the National Heart, Lung, and Blood Institute-led workshop, creating consensus around needs and opportunities for institutions at several levels to support and advance research in this field and support future translation.


Subject(s)
Artificial Intelligence , Cardiovascular System , United States , Humans , National Heart, Lung, and Blood Institute (U.S.) , Predictive Value of Tests , Patient Care
13.
Front Big Data ; 6: 1173038, 2023.
Article in English | MEDLINE | ID: mdl-37139170

ABSTRACT

Data integration is a well-motivated problem in the clinical data science domain. Availability of patient data, reference clinical cases, and datasets for research have the potential to advance the healthcare industry. However, the unstructured (text, audio, or video) and heterogeneous nature of the data, the variety of data standards and formats, and patient privacy constraints make data interoperability and integration a challenge. Clinical text is further categorized into different semantic groups and may be stored in different files and formats. Even the same organization may store cases in different data structures, making data integration more challenging. With such inherent complexity, domain experts and domain knowledge are often necessary to perform data integration. However, expert human labor is time- and cost-prohibitive. To overcome the variability in the structure, format, and content of the different data sources, we map the text into common categories and compute similarity within those. In this paper, we present a method to categorize and merge clinical data by considering the underlying semantics behind the cases and use reference information about the cases to perform data integration. Evaluation shows that we were able to merge 88% of clinical data from five different sources.

14.
IEEE Access ; 11: 21300-21312, 2023.
Article in English | MEDLINE | ID: mdl-37008654

ABSTRACT

Artificial Intelligence (AI)-based medical computer vision algorithm training and evaluation depend on annotations and labeling. However, variability between expert annotators introduces noise in training data that can adversely impact the performance of AI algorithms. This study aims to assess, illustrate, and interpret the inter-annotator agreement among multiple expert annotators when segmenting the same lesion(s)/abnormalities on medical images. We propose the use of three metrics for the qualitative and quantitative assessment of inter-annotator agreement: 1) use of a common agreement heatmap and a ranking agreement heatmap; 2) use of the extended Cohen's kappa and Fleiss' kappa coefficients for a quantitative evaluation and interpretation of inter-annotator reliability; and 3) use of the Simultaneous Truth and Performance Level Estimation (STAPLE) algorithm, as a parallel step, to generate ground truth for training AI models and compute Intersection over Union (IoU), sensitivity, and specificity to assess the inter-annotator reliability and variability. Experiments are performed on two datasets, namely cervical colposcopy images from 30 patients and chest X-ray images from 336 tuberculosis (TB) patients, to demonstrate the consistency of inter-annotator reliability assessment and the importance of combining different metrics to avoid biased assessment.
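STAPLE iteratively weights each annotator by estimated sensitivity and specificity; as a much simpler stand-in for forming a single reference mask from multiple annotations, a pixel-wise majority vote can be sketched as follows (the function name and default threshold are illustrative, not from the paper):

```python
from collections import Counter

def consensus_mask(masks, threshold=0.5):
    """Keep a pixel if more than `threshold` of annotators marked it.
    Masks are sets of (row, col) foreground pixels. This is a plain
    majority vote; STAPLE additionally estimates and weights each
    annotator's sensitivity/specificity."""
    votes = Counter(px for m in masks for px in m)
    need = threshold * len(masks)
    return {px for px, v in votes.items() if v > need}
```

The same vote counts, normalized by the number of annotators, also give the "common agreement heatmap" view mentioned in the abstract.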

15.
Res Sq ; 2023 Mar 03.
Article in English | MEDLINE | ID: mdl-36909463

ABSTRACT

Cervical cancer is a leading cause of cancer mortality, with approximately 90% of the 250,000 deaths per year occurring in low- and middle-income countries (LMIC). Secondary prevention with cervical screening involves detecting and treating precursor lesions; however, scaling screening efforts in LMIC has been hampered by infrastructure and cost constraints. Recent work has supported the development of an artificial intelligence (AI) pipeline on digital images of the cervix to achieve an accurate and reliable diagnosis of treatable precancerous lesions. In particular, WHO guidelines emphasize visual triage of women testing positive for human papillomavirus (HPV) as the primary screen, and AI could assist in this triage task. Published AI reports have exhibited overfitting, lack of portability, and unrealistic, near-perfect performance estimates. To surmount recognized issues, we implemented a comprehensive deep-learning model selection and optimization study on a large, collated, multi-institutional dataset of 9,462 women (17,013 images). We evaluated relative portability, repeatability, and classification performance. The top performing model, when combined with HPV type, achieved an area under the Receiver Operating Characteristics (ROC) curve (AUC) of 0.89 within our study population of interest, and a limited total extreme misclassification rate of 3.4%, on held-aside test sets. Our work is among the first efforts at designing a robust, repeatable, accurate and clinically translatable deep-learning model for cervical screening.

16.
Diagnostics (Basel) ; 13(6)2023 Mar 11.
Article in English | MEDLINE | ID: mdl-36980375

ABSTRACT

Domain shift is one of the key challenges affecting reliability in medical imaging-based machine learning predictions. It is of significant importance to investigate this issue to gain insights into its characteristics toward determining controllable parameters to minimize its impact. In this paper, we report our efforts on studying and analyzing domain shift in lung region detection in chest radiographs. We used five chest X-ray datasets, collected from different sources, which have manual markings of lung boundaries in order to conduct extensive experiments toward this goal. We compared the characteristics of these datasets from three aspects: information obtained from metadata or an image header, image appearance, and features extracted from a pretrained model. We carried out experiments to evaluate and compare model performances within each dataset and across datasets in four scenarios using different combinations of datasets. We proposed a new feature visualization method to provide explanations for the applied object detection network on the obtained quantitative results. We also examined chest X-ray modality-specific initialization, catastrophic forgetting, and model repeatability. We believe the observations and discussions presented in this work could help to shed some light on the importance of the analysis of training data for medical imaging machine learning research, and could provide valuable guidance for domain shift analysis.

17.
Article in English | MEDLINE | ID: mdl-36780238

ABSTRACT

Research in Artificial Intelligence (AI)-based medical computer vision algorithms bears promise to improve disease screening, diagnosis, and, subsequently, patient care. However, these algorithms are highly impacted by the characteristics of the underlying data. In this work, we discuss various data characteristics, namely Volume, Veracity, Validity, Variety, and Velocity, that impact the design, reliability, and evolution of machine learning in medical computer vision. Further, we discuss each characteristic and the recent works conducted in our research lab that informed our understanding of the impact of these characteristics on the design of medical decision-making algorithms and outcome reliability.

18.
Diagnostics (Basel) ; 13(4)2023 Feb 16.
Article in English | MEDLINE | ID: mdl-36832235

ABSTRACT

Deep learning (DL) models are state-of-the-art in segmenting anatomical and disease regions of interest (ROIs) in medical images. Particularly, a large number of DL-based techniques have been reported using chest X-rays (CXRs). However, these models are reportedly trained on reduced image resolutions for reasons related to the lack of computational resources. Literature is sparse in discussing the optimal image resolution to train these models for segmenting the tuberculosis (TB)-consistent lesions in CXRs. In this study, we investigated the performance variations with an Inception-V3 UNet model using various image resolutions with/without lung ROI cropping and aspect ratio adjustments and identified the optimal image resolution through extensive empirical evaluations to improve TB-consistent lesion segmentation performance. We used the Shenzhen CXR dataset for the study, which includes 326 normal patients and 336 TB patients. We proposed a combinatorial approach consisting of storing model snapshots, optimizing segmentation threshold and test-time augmentation (TTA), and averaging the snapshot predictions, to further improve performance with the optimal resolution. Our experimental results demonstrate that higher image resolutions are not always necessary; however, identifying the optimal image resolution is critical to achieving superior performance.
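The snapshot-averaging step of the combinatorial approach can be read as averaging per-pixel foreground probabilities across saved model snapshots (and, analogously, across TTA variants) before applying the separately optimized threshold. A minimal sketch under that reading, with illustrative function names:

```python
def average_snapshots(snapshot_probs):
    """Average per-pixel foreground probabilities across snapshots.
    `snapshot_probs` is a list of 2-D probability maps (nested lists),
    one per model snapshot or TTA variant."""
    n = len(snapshot_probs)
    rows, cols = len(snapshot_probs[0]), len(snapshot_probs[0][0])
    return [[sum(s[r][c] for s in snapshot_probs) / n for c in range(cols)]
            for r in range(rows)]

def binarize(prob_map, threshold=0.5):
    """Apply the (separately optimized) segmentation threshold."""
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]
```

Averaging probabilities before thresholding, rather than voting on already-binarized masks, preserves each snapshot's confidence information.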

19.
ArXiv ; 2023 Jan 27.
Article in English | MEDLINE | ID: mdl-36789135

ABSTRACT

Deep learning (DL) models are state-of-the-art in segmenting anatomical and disease regions of interest (ROIs) in medical images. Particularly, a large number of DL-based techniques have been reported using chest X-rays (CXRs). However, these models are reportedly trained on reduced image resolutions for reasons related to the lack of computational resources. Literature is sparse in discussing the optimal image resolution to train these models for segmenting the Tuberculosis (TB)-consistent lesions in CXRs. In this study, we (i) investigated the performance variations using an Inception-V3 UNet model with various image resolutions, with/without lung ROI cropping and aspect ratio adjustments, and (ii) identified the optimal image resolution through extensive empirical evaluations to improve TB-consistent lesion segmentation performance. We used the Shenzhen CXR dataset for the study, which includes 326 normal patients and 336 TB patients. We proposed a combinatorial approach consisting of storing model snapshots, optimizing segmentation threshold and test-time augmentation (TTA), and averaging the snapshot predictions, to further improve performance with the optimal resolution. Our experimental results demonstrate that higher image resolutions are not always necessary; however, identifying the optimal image resolution is critical to achieving superior performance.

20.
Med Image Learn Ltd Noisy Data (2023) ; 14307: 128-137, 2023 Oct.
Article in English | MEDLINE | ID: mdl-38415180

ABSTRACT

We proposed a self-supervised machine learning method to automatically rate the severity of pulmonary edema in frontal chest X-ray radiographs (CXRs), which could be potentially related to COVID-19 viral pneumonia. For this, we used the modified radiographic assessment of lung edema (mRALE) scoring system. The new model was first optimized with the simple Siamese network (SimSiam) architecture, where a ResNet-50 pretrained on the ImageNet database was used as the backbone. The encoder projected a 2048-dimension embedding as representation features to a downstream fully connected deep neural network for mRALE score prediction. A 5-fold cross-validation with 2,599 frontal CXRs was used to examine the new model's performance in comparison to a non-pretrained SimSiam encoder and a ResNet-50 trained from scratch. The mean absolute error (MAE) of the new model is 5.05 (95%CI 5.03-5.08), the mean squared error (MSE) is 66.67 (95%CI 66.29-67.06), and the Spearman's correlation coefficient (Spearman ρ) to the expert-annotated scores is 0.77 (95%CI 0.75-0.79). All the performance metrics of the new model are superior to the two comparators (P<0.01), and the MSE and Spearman ρ scores of the two comparators show no statistically significant difference (P>0.05). The model also achieved a prediction probability concordance of 0.811 and a quadratic weighted kappa of 0.739 with the medical expert annotations in external validation. We conclude that the self-supervised contrastive learning method is an effective strategy for automated mRALE scoring. It provides a new approach to improve machine learning performance and minimize the expert knowledge involvement in quantitative medical image pattern learning.
