ABSTRACT
OBJECTIVES: MAchine Learning In MyelomA Response (MALIMAR) is an observational clinical study combining "real-world" and clinical trial data, both retrospective and prospective. Images were acquired on three MRI scanners over a 10-year window at two institutions, leading to a need for extensive curation. METHODS: Curation involved image aggregation, pseudonymisation, allocation between project phases, data cleaning, upload to an XNAT repository visible from multiple sites, annotation, incorporation of machine learning research outputs and quality assurance using programmatic methods. RESULTS: A total of 796 whole-body MR imaging sessions from 462 subjects were curated. A major change in scan protocol part way through the retrospective window meant that approximately 30% of available imaging sessions had properties that differed significantly from the remainder of the data. Issues were found with a vendor-supplied clinical algorithm for "composing" whole-body images from multiple imaging stations. Historic weaknesses in a digital video disk (DVD) research archive (already addressed by the mid-2010s) were highlighted by incomplete datasets, some of which could not be completely recovered. The final dataset contained 736 imaging sessions for 432 subjects. Software was written to clean and harmonise data. Implications for the subsequent machine learning activity are considered. CONCLUSIONS: MALIMAR exemplifies the vital role that curation plays in machine learning studies that use real-world data. A research repository such as XNAT facilitates day-to-day management, ensures robustness and consistency and enhances the value of the final dataset. The types of process described here will be vital for future large-scale multi-institutional and multi-national imaging projects. 
CRITICAL RELEVANCE STATEMENT: This article showcases innovative data curation methods using a state-of-the-art image repository platform; such tools will be vital for managing the large multi-institutional datasets required to train and validate generalisable ML algorithms and future foundation models in medical imaging. KEY POINTS: • Heterogeneous data in the MALIMAR study required the development of novel curation strategies. • Correction of multiple problems affecting the real-world data was successful, but implications for machine learning are still being evaluated. • Modern image repositories have rich application programming interfaces enabling data enrichment and programmatic QA, making them much more than simple "image marts".
ABSTRACT
Purpose: XNAT is an informatics software platform to support imaging research, particularly in the context of large, multicentre studies of the type that are essential to validate quantitative imaging biomarkers. XNAT provides import, archiving, processing and secure distribution facilities for image and related study data. Until recently, however, modern data visualisation and annotation tools were lacking on the XNAT platform. We describe the background to, and implementation of, an integration of the Open Health Imaging Foundation (OHIF) Viewer into the XNAT environment. We explain the challenges overcome and discuss future prospects for quantitative imaging studies. Materials and methods: The OHIF Viewer adopts an approach based on the DICOM web protocol. To allow operation in an XNAT environment, a data-routing methodology was developed to overcome the mismatch between the DICOM and XNAT information models and a custom viewer panel created to allow navigation within the viewer between different XNAT projects, subjects and imaging sessions. Modifications to the development environment were made to allow developers to test new code more easily against a live XNAT instance. Major new developments focused on the creation and storage of regions-of-interest (ROIs) and included: ROI creation and editing tools for both contour- and mask-based regions; a "smart CT" paintbrush tool; the integration of NVIDIA's Artificial Intelligence Assisted Annotation (AIAA); the ability to view surface meshes, fractional segmentation maps and image overlays; and a rapid image reader tool aimed at radiologists. We have incorporated the OHIF microscopy extension and, in parallel, introduced support for microscopy session types within XNAT for the first time. Results: Integration of the OHIF Viewer within XNAT has been highly successful and numerous additional and enhanced tools have been created in a programme started in 2017 that is still ongoing. 
The software has been downloaded more than 3700 times during the course of the development work reported here, demonstrating the impact of the work. Conclusions: The OHIF open-source, zero-footprint web viewer has been incorporated into the XNAT platform and is now used at many institutions worldwide. Further innovations are envisaged in the near future.
Subjects
Artificial Intelligence; Diagnostic Imaging; Archives; Humans; Software
ABSTRACT
The National Cancer Institute (NCI) Cancer Research Data Commons (CRDC) aims to establish a national cloud-based data science infrastructure. Imaging Data Commons (IDC) is a new component of CRDC supported by the Cancer Moonshot. The goal of IDC is to enable a broad spectrum of cancer researchers, with and without imaging expertise, to easily access and explore the value of deidentified imaging data and to support integrated analyses with nonimaging data. We achieve this goal by colocating versatile imaging collections with cloud-based computing resources and data exploration, visualization, and analysis tools. The IDC pilot was released in October 2020 and is being continuously populated with radiology and histopathology collections. IDC provides access to curated imaging collections, accompanied by documentation, a user forum, and a growing number of analysis use cases that aim to demonstrate the value of a data commons framework applied to cancer imaging research. SIGNIFICANCE: This study introduces NCI Imaging Data Commons, a new repository of the NCI Cancer Research Data Commons, which will support cancer imaging research on the cloud.
Subjects
Diagnostic Imaging/methods; National Cancer Institute (U.S.); Neoplasms/diagnostic imaging; Neoplasms/genetics; Biomedical Research/trends; Cloud Computing; Computational Biology/methods; Computer Graphics; Computer Security; Data Interpretation, Statistical; Databases, Factual; Diagnostic Imaging/standards; Humans; Image Processing, Computer-Assisted; Pilot Projects; Programming Languages; Radiology/methods; Radiology/standards; Reproducibility of Results; Software; United States; User-Computer Interface
ABSTRACT
PURPOSE: Zero-footprint Web architecture enables imaging applications to be deployed on premise or in the cloud without requiring installation of custom software on the user's computer. Benefits include decreased costs and information technology support requirements, as well as improved accessibility across sites. The Open Health Imaging Foundation (OHIF) Viewer is an extensible platform developed to leverage these benefits and address the demand for open-source Web-based imaging applications. The platform can be modified to support site-specific workflows and accommodate evolving research requirements. MATERIALS AND METHODS: The OHIF Viewer provides basic image review functionality (eg, image manipulation and measurement) as well as advanced visualization (eg, multiplanar reformatting). It is written as a client-only, single-page Web application that can easily be embedded into third-party applications or hosted as a standalone Web site. The platform provides extension points for software developers to include custom tools and adapt the system for their workflows. It is standards compliant and relies on DICOMweb for data exchange and OpenID Connect for authentication, but it can be configured to use any data source or authentication flow. Additionally, the user interface components are provided in a standalone component library so that developers can create custom extensions. RESULTS: The OHIF Viewer and its underlying components have been widely adopted and integrated into multiple clinical research platforms (eg, Precision Imaging Metrics, XNAT, LabCAS, ISB-CGC) and commercial applications (eg, Osirix). It has also been used to build custom imaging applications (eg, ProstateCancer.ai, Crowds Cure Cancer [presented as a case study]). CONCLUSION: The OHIF Viewer provides a flexible framework for building applications to support imaging research. 
Its adoption could reduce redundancies in software development for National Cancer Institute-funded projects, including Informatics Technology for Cancer Research and the Quantitative Imaging Network.
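As an illustration of the DICOMweb data exchange mentioned above, the following minimal sketch builds the two kinds of request URL a DICOMweb client typically issues: a QIDO-RS study search and a WADO-RS instance retrieval. It is not taken from the OHIF code base. The base URL, the UIDs and the helper names `qido_studies_url` and `wado_instance_url` are hypothetical; only the `/studies`, `/series` and `/instances` path layout follows the DICOMweb standard.

```python
from urllib.parse import quote, urlencode


def qido_studies_url(base_url, **filters):
    """Build a QIDO-RS study search URL (hypothetical helper).

    Filters are DICOM attribute keywords such as PatientID; they are
    sorted so that the resulting URL is deterministic.
    """
    url = f"{base_url.rstrip('/')}/studies"
    query = urlencode(sorted(filters.items()))
    return f"{url}?{query}" if query else url


def wado_instance_url(base_url, study_uid, series_uid, instance_uid):
    """Build a WADO-RS URL addressing one instance (hypothetical helper)."""
    parts = [
        "studies", quote(study_uid, safe=""),
        "series", quote(series_uid, safe=""),
        "instances", quote(instance_uid, safe=""),
    ]
    return f"{base_url.rstrip('/')}/" + "/".join(parts)


if __name__ == "__main__":
    base = "https://example.org/dicomweb"  # placeholder endpoint, an assumption
    print(qido_studies_url(base, PatientID="P001", ModalitiesInStudy="MR"))
    print(wado_instance_url(base, "1.2.3", "4.5", "6.7"))
```

No network request is made here; a real client would send these URLs with an appropriate `Accept` header and an access token obtained through whatever authentication flow the deployment uses.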
Subjects
Neoplasms; User-Computer Interface; Diagnostic Imaging; Humans; Information Storage and Retrieval; Internet; Neoplasms/diagnostic imaging; Software
ABSTRACT
BACKGROUND: Textural features extracted from MRI potentially provide prognostic information additional to volume for influencing surgical management of cervical cancer. PURPOSE: To identify textural features that differ between cervical tumors above and below the volume threshold of eligibility for trachelectomy and determine their value in predicting recurrence in patients with low-volume tumors. METHODS: Of 378 patients with Stage 1-2 cervical cancer imaged prospectively (3T, endovaginal coil), 125 had well-defined, histologically-confirmed squamous or adenocarcinomas with >100 voxels (>0.07 cm³) suitable for radiomic analysis. Regions-of-interest outlined the whole tumor on T2-W images and apparent diffusion coefficient (ADC) maps. Textural features based on grey-level co-occurrence matrices were compared (Mann-Whitney test with Bonferroni correction) between tumors greater (n = 46) or less (n = 79) than 4.19 cm³. Clustering eliminated correlated variables. Significantly different features were used to predict recurrence (regression modelling) in surgically-treated patients with low-volume tumors and compared with a model using clinico-pathological features. RESULTS: Textural features (Dissimilarity, Energy, ClusterProminence, ClusterShade, InverseVariance, Autocorrelation) in 6 of 10 clusters from T2-W and ADC data differed between high-volume (mean ± SD 15.3 ± 11.7 cm³) and low-volume (mean ± SD 1.3 ± 1.2 cm³) tumors (p < 0.02). In low-volume tumors, recurrence was predicted by: Dissimilarity, Energy (ADC-radiomics, AUC = 0.864); Dissimilarity, ClusterProminence, InverseVariance (T2-W-radiomics, AUC = 0.808); Volume, Depth of Invasion, LymphoVascular Space Invasion (clinico-pathological features, AUC = 0.794). Combining ADC-radiomic (but not T2-radiomic) and clinico-pathological features improved prediction of recurrence compared to the clinico-pathological model (AUC = 0.916, p = 0.006). Findings were supported by bootstrap re-sampling (n = 1000).
CONCLUSION: Textural features from ADC maps and T2-W images differ between high- and low-volume tumors and potentially predict recurrence in low-volume tumors.
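To make the grey-level co-occurrence matrix (GLCM) features named above more concrete, here is a minimal pure-Python sketch that computes a normalised GLCM for a single pixel offset on a small 2D array and derives two of the listed features. It is not the study's pipeline. The function names, the single offset, the symmetric counting and the toy image are assumptions, and "energy" is taken here to be the sum of squared co-occurrence probabilities (some libraries report the square root of that sum instead).

```python
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalised co-occurrence probabilities for one offset (dx, dy).

    `image` is a list of rows of integer grey levels. Returns a dict
    mapping (i, j) grey-level pairs to probabilities.
    """
    counts = Counter()
    rows = len(image)
    for r in range(rows):
        for c in range(len(image[r])):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < len(image[r2]):
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[(j, i)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


def dissimilarity(p):
    """Mean absolute grey-level difference between neighbouring pixels."""
    return sum(abs(i - j) * v for (i, j), v in p.items())


def energy(p):
    """Sum of squared probabilities (assumed definition, see lead-in)."""
    return sum(v * v for v in p.values())


if __name__ == "__main__":
    toy = [[0, 1], [0, 1]]  # toy image, not study data
    p = glcm(toy)
    print(dissimilarity(p), energy(p))
```

A perfectly uniform patch gives a dissimilarity of 0 and an energy of 1, while more heterogeneous patches push dissimilarity up and energy down. In practice such features are computed over a 3D region of interest, after grey-level discretisation, and averaged over several offsets.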