ABSTRACT
Purpose To evaluate the performance of an artificial intelligence (AI) model in detecting overall and clinically significant prostate cancer (csPCa)-positive lesions on paired external and in-house biparametric MRI (bpMRI) scans and assess performance differences between each dataset. Materials and Methods This single-center retrospective study included patients who underwent prostate MRI at an external institution and were rescanned at the authors' institution between May 2015 and May 2022. A genitourinary radiologist performed prospective readouts on in-house MRI scans following the Prostate Imaging Reporting and Data System (PI-RADS) version 2.0 or 2.1 and retrospective image quality assessments for all scans. A subgroup of patients underwent an MRI/US fusion-guided biopsy. A bpMRI-based lesion detection AI model previously developed using a completely separate dataset was tested on both MRI datasets. Detection rates were compared between external and in-house datasets with use of the paired comparison permutation tests. Factors associated with AI detection performance were assessed using multivariable generalized mixed-effects models, incorporating features selected through forward stepwise regression based on the Akaike information criterion. Results The study included 201 male patients (median age, 66 years [IQR, 62-70 years]; prostate-specific antigen density, 0.14 ng/mL² [IQR, 0.10-0.22 ng/mL²]) with a median interval between external and in-house MRI scans of 182 days (IQR, 97-383 days). For intraprostatic lesions, AI detected 39.7% (149 of 375) on external and 56.0% (210 of 375) on in-house MRI scans (P < .001). For csPCa-positive lesions, AI detected 61% (54 of 89) on external and 79% (70 of 89) on in-house MRI scans (P < .001).
On external MRI scans, better overall lesion detection was associated with a higher PI-RADS score (odds ratio [OR] = 1.57; P = .005), larger lesion diameter (OR = 3.96; P < .001), better diffusion-weighted MRI quality (OR = 1.53; P = .02), and fewer lesions at MRI (OR = 0.78; P = .045). Better csPCa detection was associated with a shorter MRI interval between external and in-house scans (OR = 0.58; P = .03) and larger lesion size (OR = 10.19; P < .001). Conclusion The AI model exhibited modest performance in identifying both overall and csPCa-positive lesions on external bpMRI scans. Keywords: MR Imaging, Urinary, Prostate Supplemental material is available for this article. © RSNA, 2024.
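The paired comparison of detection rates described above can be illustrated with a sign-flip permutation test. This is a minimal sketch, assuming each lesion contributes one binary outcome per dataset (1 = detected, 0 = missed); the function name, the toy data, and the sign-flip scheme are illustrative assumptions, not the authors' implementation, and the sketch ignores clustering of lesions within patients.

```python
import random


def paired_permutation_test(external, in_house, n_permutations=10000, seed=0):
    """Two-sided sign-flip permutation test for paired binary outcomes.

    `external` and `in_house` are hypothetical equal-length lists of 0/1
    detection flags for the same lesions on the two scans.
    Returns (observed mean difference, Monte Carlo p-value).
    """
    if len(external) != len(in_house) or not external:
        raise ValueError("inputs must be non-empty and of equal length")
    rng = random.Random(seed)
    # Per-lesion difference: +1 detected in-house only, -1 external only, 0 same.
    diffs = [b - a for a, b in zip(external, in_house)]
    n = len(diffs)
    observed = sum(diffs) / n
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the two labels of a pair are exchangeable,
        # so each paired difference keeps or flips its sign with equal chance.
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(permuted) >= abs(observed) - 1e-12:
            extreme += 1
    # The +1 correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Toy, made-up detection flags for eight lesions (not study data).
toy_external = [0, 0, 1, 0, 1, 0, 0, 1]
toy_in_house = [1, 0, 1, 1, 1, 0, 1, 1]
toy_difference, toy_p = paired_permutation_test(toy_external, toy_in_house)
```

Only discordant pairs (lesions detected on one scan but not the other) influence the result, which is why a paired test is preferred to comparing the two proportions as if they came from independent samples.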
Subject(s)
Deep Learning; Magnetic Resonance Imaging; Prostatic Neoplasms; Humans; Male; Prostatic Neoplasms/diagnostic imaging; Retrospective Studies; Aged; Magnetic Resonance Imaging/methods; Middle Aged; Algorithms; Prostate/diagnostic imaging; Prostate/pathology; Image Interpretation, Computer-Assisted/methods; Image-Guided Biopsy/methods
ABSTRACT
RATIONALE AND OBJECTIVES: The increasing use of focal therapy (FT) in localized prostate cancer (PCa) management requires a standardized MRI interpretation system to detect recurrent clinically significant PCa (csPCa). This pilot study evaluates the novel Transatlantic Recommendations for Prostate Gland Evaluation with MRI after Focal Therapy (TARGET) and compares its performance to that of the Prostate Imaging after Focal Ablation (PI-FAB) system. MATERIALS AND METHODS: This retrospective study included 38 patients who underwent primary FT for localized PCa, with follow-up multiparametric MRI (mpMRI) and biopsy. Two radiologists assessed the mpMRIs using both PI-FAB and TARGET independently. Diagnostic performance metrics and area under the receiver operating characteristic curve (AUC) were calculated. Inter-reader and intrareader agreement were assessed using Cohen's κ and Kendall's τ. RESULTS: 14 patients had recurrent csPCa. PI-FAB showed high sensitivity (92.9% for both readers) and NPV (reader 1: 93.8%, reader 2: 92.9%) but moderate specificity (reader 1: 62.5%, reader 2: 54.2%). TARGET demonstrated lower sensitivity for one reader (reader 1: 78.6%, reader 2: 92.9%) but higher specificity (reader 1: 79.2%, reader 2: 62.5%) for both readers. Both systems displayed moderate inter-reader agreement (κ = 0.56 for PI-FAB, 0.57 for TARGET). CONCLUSION: PI-FAB and TARGET exhibit similar performances in post-FT MRI. While PI-FAB had consistently high sensitivity, TARGET offered higher specificity for one reader. Moderate agreement levels demonstrate the viability of these systems in clinical settings and a promise for improvement.
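The diagnostic metrics reported above follow from a 2×2 table of MRI calls against biopsy results. The sketch below is a worked example under an assumption: the counts are back-calculated from the reported percentages (14 patients with recurrent csPCa and therefore 24 without, out of 38) and are not stated in the abstract. The function name is illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV, and NPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts, inferred from the reported percentages (not given in the text):
# PI-FAB, reader 1: 13 of 14 recurrences called positive, 15 of 24 negatives called negative.
pifab_reader1 = diagnostic_metrics(tp=13, fn=1, tn=15, fp=9)
# TARGET, reader 1: 11 of 14 recurrences called positive, 19 of 24 negatives called negative.
target_reader1 = diagnostic_metrics(tp=11, fn=3, tn=19, fp=5)

for name, metrics in (("PI-FAB reader 1", pifab_reader1), ("TARGET reader 1", target_reader1)):
    summary = ", ".join(f"{key} {value:.1%}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

With these assumed counts, the sensitivity, specificity, and NPV agree with the reported values for reader 1 to within rounding. With only 14 positive cases, a single reclassified patient shifts sensitivity by about 7 percentage points.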
ABSTRACT
With the ongoing revolution of artificial intelligence (AI) in medicine, the impact of AI in radiology is more pronounced than ever. An increasing number of technical and clinical AI-focused studies are published each day. As these tools inevitably affect patient care and physician practices, it is crucial that radiologists become more familiar with the leading strategies and underlying principles of AI. Multimodal AI models can combine both imaging and clinical metadata and are quickly becoming a popular approach that is being integrated into the medical ecosystem. This narrative review covers major concepts of multimodal AI through the lens of recent literature. We discuss emerging frameworks, including graph neural networks, which allow for explicit learning from non-Euclidean relationships, and transformers, which allow for parallel computation that scales, highlighting existing literature and advocating for a focus on emerging architectures. We also identify key pitfalls in current studies, including issues with taxonomy, data scarcity, and bias. By informing radiologists and biomedical AI experts about existing practices and challenges, we hope to guide the next wave of imaging-based multimodal AI research.
ABSTRACT
BACKGROUND: To date, no standardized, evidence-based follow-up schemes exist for the monitoring of patients who have undergone focal therapy (FT), and expert centers rely mainly on their own experience and/or institutional protocols. We aimed to perform a comprehensive review of the most advantageous follow-up strategies and their rationale after FT for prostate cancer (PCa). METHODS: A narrative review of the literature was conducted to investigate different follow-up protocols of FT for PCa. Outcomes of interest were post-ablation oncological and functional outcomes and complications. RESULTS: Oncological success after FT was generally defined as the biopsy-confirmed absence of clinically significant PCa in the treated zone. De novo PCa in the untreated area usually reflects inaccurate patient selection and should be treated as primary PCa. During follow-up, oncological outcomes should be evaluated with periodic PSA, multiparametric MRI (mpMRI), and prostate biopsy. The use of PSA derivatives and new biomarkers is still controversial and therefore not recommended. The first MRI after FT should be performed between 6 and 12 months to avoid ablation-related artifacts and diagnostic delay in case of FT failure. Other imaging modalities, such as PSMA PET/CT scan, are promising but still need to be validated in the post-FT setting. A 12-month "for-protocol" prostate biopsy, including targeted and systematic biopsy, was generally considered the preferred biopsy method to rule out tumor persistence/recurrence. Subsequent mpMRIs and biopsies should follow a risk-adapted approach depending on the clinical scenario. Functional outcomes should be periodically assessed using validated questionnaires within the first year, when they typically recover to a new baseline. Complications, although uncommon, should be strictly monitored, mainly in the first month. CONCLUSIONS: FT follow-up is a multifaceted process involving clinical, radiological, and histological assessment.
Studies evaluating the impact of different follow-up strategies and ideal timings are needed to produce standardized protocols following FT.
ABSTRACT
PURPOSE: This is a phase I trial with the primary objective of identifying the most compressed dose schedule (DS) tolerable using risk-volume-adapted, hypofractionated, post-operative radiotherapy (PORT) for biochemically recurrent prostate cancer. Secondary endpoints included biochemical progression free survival (bPFS) and quality of life (QOL). METHODS: Patients were treated with one of 3 isoeffective dose schedules (DS1: 20 fractions, DS2: 15 fractions, DS3: 10 fractions) that escalated dose to the imaging-defined local recurrence (73Gy3 EQD2) and de-escalated dose to the remainder of the prostate bed (48Gy3 EQD2). Escalation followed a standard 3+3 design with a 6-patient expansion at the maximally tolerated hypofractionated dose schedule (MTHDS). Dose limiting toxicity (DLT) was defined as CTCAE v.4.0 grade (G) 3 toxicity lasting >4 days within 21 days of PORT completion or grade 4 gastrointestinal (GI) or genitourinary (GU) toxicities thereafter. QOL was assessed longitudinally through 24 months with the EPIC-26. RESULTS: Between 01/2018 and 12/2023, 15 patients were treated (3 with DS1, 3 with DS2, and 9 with DS3). The median follow-up was 48 months. No DLTs were observed on any DS, and, thus, expansion occurred at DS3. The cumulative incidence of G3 GI and GU toxicity was 7% and 9% at 24 months, respectively, with no G4 events observed. Transient, acute G2+ GI toxicity was most common. QOL worsened transiently during study follow-up in urinary incontinence, GI, and sexual subdomains but was similar to baseline by 24 months. The bPFS was 91% at both 24- and 60-months. CONCLUSIONS: The maximally tolerated hypofractionated dose schedule for hypofractionated, risk-volume-adapted PORT was determined to be DS3 (36.4Gy to the prostate bed and 47.1Gy to the imaging-defined recurrence in 10 daily fractions). No >G3 events were observed. Transient declines in QOL did not persist through 24 months.
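The relationship between the DS3 physical doses and the stated isoeffective targets can be checked with the standard linear-quadratic conversion, EQD2 = D × (d + α/β) / (2 + α/β), where D is the total dose and d the dose per fraction. This is a minimal sketch, assuming the "Gy3" notation denotes α/β = 3 Gy; the function name is illustrative and the calculation is not taken from the trial protocol.

```python
def eqd2(total_dose_gy, n_fractions, alpha_beta_gy=3.0):
    """Equivalent dose in 2-Gy fractions under the linear-quadratic model."""
    dose_per_fraction = total_dose_gy / n_fractions
    return total_dose_gy * (dose_per_fraction + alpha_beta_gy) / (2.0 + alpha_beta_gy)


# DS3 from the abstract: 10 daily fractions.
prostate_bed_eqd2 = eqd2(36.4, 10)  # 36.4 Gy to the prostate bed
recurrence_eqd2 = eqd2(47.1, 10)    # 47.1 Gy to the imaging-defined recurrence

print(f"Prostate bed: {prostate_bed_eqd2:.1f} Gy EQD2")
print(f"Imaging-defined recurrence: {recurrence_eqd2:.1f} Gy EQD2")
```

Under this assumption, the two values come out at roughly 48 Gy and 73 Gy, in line with the de-escalated and escalated targets quoted in the methods.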
ABSTRACT
PURPOSE: Osteosarcoma is an aggressive bone cancer lacking robust biomarkers for personalized treatment. Despite its scarcity in humans, it is relatively common in adult pet dogs. This study aimed to analyze clinically annotated bulk tumor transcriptomic datasets of canine and human osteosarcoma patients to identify potentially conserved patterns of disease progression. EXPERIMENTAL DESIGN: Bulk transcriptomic data from 245 pet dogs with treatment-naïve appendicular osteosarcoma were analyzed using deconvolution to characterize the tumor microenvironment (TME). The TME of both primary and metastatic tumors derived from the same dog was compared, and its impact on canine survival was assessed. A machine learning model was developed to classify the TME based on its inferred composition using canine tumor data. This model was applied to 8 independent human osteosarcoma datasets to assess its generalizability and prognostic value. RESULTS: This study found three distinct TME subtypes of canine osteosarcoma based on cell type composition of bulk tumor samples: Immune Enriched (IE), Immune Enriched Dense Extra-Cellular Matrix-like (IE-ECM), and Immune Desert (ID). These three TME-based subtypes of canine osteosarcomas were conserved in humans and could predict progression-free survival outcomes of human patients, independent of conventional prognostic factors such as percent tumor necrosis post standard of care chemotherapy treatment and disease stage at diagnosis. CONCLUSIONS: These findings demonstrate the potential of leveraging data from naturally occurring cancers in canines to model the complexity of the human osteosarcoma TME, offering a promising avenue for the discovery of novel biomarkers and developing more effective precision oncology treatments.
ABSTRACT
BACKGROUND/OBJECTIVES: Apparent Diffusion Coefficient (ADC) maps in prostate MRI can reveal tumor characteristics, but their accuracy can be compromised by artifacts related to patient motion or rectal gas-associated distortions. To address these challenges, we propose a novel approach that utilizes a Generative Adversarial Network to synthesize ADC maps from T2-weighted magnetic resonance images (T2W MRI). METHODS: By leveraging contrastive learning, our model accurately maps axial T2W MRI to ADC maps within the cropped region of the prostate organ boundary, capturing subtle variations and intricate structural details by learning similar and dissimilar pairs from two imaging modalities. We trained our model on a comprehensive dataset of unpaired T2-weighted images and ADC maps from 506 patients. In evaluating our model, named AI-ADC, we compared it against three state-of-the-art methods: CycleGAN, CUT, and StyTr2. RESULTS: Our model demonstrated a higher mean Structural Similarity Index (SSIM) of 0.863 on a test dataset of 3240 2D MRI slices from 195 patients, compared to values of 0.855, 0.797, and 0.824 for CycleGAN, CUT, and StyTr2, respectively. Similarly, our model achieved a significantly lower Fréchet Inception Distance (FID) value of 31.992, compared to values of 43.458, 179.983, and 58.784 for the other three models, indicating its superior performance in generating ADC maps. Furthermore, we evaluated our model on 147 patients from the publicly available ProstateX dataset, where it demonstrated a higher SSIM of 0.647 and a lower FID of 113.876 compared to the other three models. CONCLUSIONS: These results highlight the efficacy of our proposed model in generating ADC maps from T2W MRI, showcasing its potential for enhancing clinical diagnostics and radiological workflows.
ABSTRACT
Prostate cancer is one of the most prevalent malignancies in the world. While deep learning has potential to further improve computer-aided prostate cancer detection on MRI, its efficacy hinges on the exhaustive curation of manually annotated images. We propose a novel methodology of semisupervised learning (SSL) guided by automatically extracted clinical information, specifically the lesion locations in radiology reports, allowing for use of unannotated images to reduce the annotation burden. By leveraging lesion locations, we refined pseudo labels, which were then used to train our location-based SSL model. We show that our SSL method can improve prostate lesion detection by utilizing unannotated images, with more substantial impacts being observed when larger proportions of unannotated images are used.
ABSTRACT
OBJECTIVES: To evaluate MRI-based measurements of androgen-sensitive perineal/pelvic muscles in men with prostate cancer before and after androgen deprivation therapy (ADT) as a novel imaging marker for end-organ effects of hypogonadism. Diagnosing hypogonadism or testosterone deficiency (TD) requires both low serum testosterone and clinical symptoms, such as erectile dysfunction and reduced libido. However, the non-specific nature of many TD symptoms makes it challenging to initiate therapy. Objective markers of TD help to better identify patients who may benefit from testosterone supplementation; however, current markers, such as low bone mineral density, lack sensitivity. Previous studies suggest that decreased bulbocavernosus-muscle (BCM) thickness may be associated with TD, although it remains unclear if this is a correlative relationship. METHODS: Data was prospectively collected for patients with intermediate/high-risk localized prostate cancer enrolled in a phase II trial (NCT02430480). Patients received ADT before prostatectomy and underwent prostate MRI pre-/post-ADT. BCM, ischiocavernosus-muscle (ICM), and levator-ani-muscle (LAM) measurements were made using T2W-MRI. Paired t-tests evaluated changes in BCM/ICM/LAM width, and linear regression analyses evaluated relationships between changes in testosterone and muscle width. RESULTS: Thirty-eight consecutive patients with pre-/post-ADT MRIs were analyzed. Baseline testosterone was 286.5ng/dl, and 36/38 patients had post-ADT testosterone <50ng/dL. Pre-ADT and post-ADT measurements of the bilateral BCM/ICM/LAM width were 7.16mm/7.95mm/5.53mm and 5.68mm/6.71mm/4.89mm, respectively (p<0.001). Decreases in testosterone predicted reduction in combined perineal muscle (BCM+ICM) width (p=0.032). CONCLUSIONS: Androgen deprivation led to significant and relatively rapid decreases in BCM/ICM/LAM thickness. 
This objective biomarker of low testosterone states may help identify patients who will potentially benefit from testosterone replacement.
ABSTRACT
BACKGROUND: As artificial intelligence (AI) tools become widely accessible, more patients and medical professionals will turn to them for medical information. Large language models (LLMs), a subset of AI, excel in natural language processing tasks and hold considerable promise for clinical use. Fields such as oncology, in which clinical decisions are highly dependent on a continuous influx of new clinical trial data and evolving guidelines, stand to gain immensely from such advancements. It is therefore of critical importance to benchmark these models and describe their performance characteristics to guide their safe application to clinical oncology. Accordingly, the primary objectives of this work were to conduct comprehensive evaluations of LLMs in the field of oncology and to identify and characterize strategies that medical professionals can use to bolster their confidence in a model's response. METHODS: This study tested five publicly available LLMs (LLaMA 1, PaLM 2, Claude-v1, generative pretrained transformer 3.5 [GPT-3.5], and GPT-4) on a comprehensive battery of 2044 oncology questions, including topics from medical oncology, surgical oncology, radiation oncology, medical statistics, medical physics, and cancer biology. Model prompts were presented independently of each other, and each prompt was repeated three times to assess output consistency. For each response, models were instructed to provide a self-appraised confidence score (from 1 to 4). Model performance was also evaluated against a novel validation set comprising 50 oncology questions curated to eliminate any risk of overlap with the data used to train the LLMs. RESULTS: There was significant heterogeneity in performance between models (analysis of variance, P<0.001). Relative to a human benchmark (2013 and 2014 examination results), GPT-4 was the only model to perform above the 50th percentile. 
Overall, model performance varied as a function of subject area across all models, with worse performance observed in clinical oncology subcategories compared with foundational topics (medical statistics, medical physics, and cancer biology). Within the clinical oncology subdomain, worse performance was observed in female-predominant malignancies. A combination of model selection, prompt repetition, and confidence self-appraisal allowed for the identification of high-performing subgroups of questions with observed accuracies of 81.7 and 81.1% in the Claude-v1 and GPT-4 models, respectively. Evaluation of the novel validation question set produced similar trends in model performance while also highlighting improved performance in newer, centrally hosted models (GPT-4 Turbo and Gemini 1.0 Ultra) and local models (Mixtral 8×7B and LLaMA 2). CONCLUSIONS: Of the models tested on a standardized set of oncology questions, GPT-4 was observed to have the highest performance. Although this performance is impressive, all LLMs continue to have clinically significant error rates, including examples of overconfidence and consistent inaccuracies. Given the enthusiasm to integrate these new implementations of AI into clinical practice, continued standardized evaluations of the strengths and limitations of these products will be critical to guide both patients and medical professionals. (Funded by the National Institutes of Health Clinical Center for Research and the Intramural Research Program of the National Institutes of Health; Z99 CA999999.).
ABSTRACT
Glioblastoma (GBM) is the most aggressive and the most common primary brain tumor, defined by nearly uniform rapid progression despite the current standard of care involving maximal surgical resection followed by radiation therapy (RT) and temozolomide (TMZ) or concurrent chemoirradiation (CRT), with an overall survival (OS) of less than 30% at 2 years. The diagnosis of tumor progression in the clinic is based on clinical assessment and the interpretation of MRI of the brain using Response Assessment in Neuro-Oncology (RANO) criteria, which suffer from several limitations, including a paucity of precise measures of progression. Imaging is the primary modality in the standard of care for GBM and generates the most quantitative data capable of capturing change over time, which makes it pivotal in optimizing and advancing response criteria, particularly given the lack of biomarkers in this space. In this study, we employed artificial intelligence (AI)-derived MRI volumetric parameters using the segmentation mask output of the nnU-Net to arrive at four classes (background, edema, non-contrast enhancing tumor (NET), and contrast-enhancing tumor (CET)) to determine if dynamic changes in AI volumes detected throughout therapy can be linked to progression-free survival (PFS) and clinical features. We identified associations between MR imaging AI-generated volumes and PFS independently of tumor location, MGMT methylation status, and the extent of resection while validating that CET and edema are the most linked to PFS, with patient subpopulations separated by distinct rates of change throughout the disease. The current study provides valuable insights for risk stratification, future RT treatment planning, and treatment monitoring in neuro-oncology.
ABSTRACT
The Gleason score is an important predictor of prognosis in prostate cancer. However, its subjective nature can result in over- or under-grading. Our objective was to train an artificial intelligence (AI)-based algorithm to grade prostate cancer in specimens from patients who underwent radical prostatectomy (RP) and to assess the correlation of AI-estimated proportions of different Gleason patterns with biochemical recurrence-free survival (RFS), metastasis-free survival (MFS), and overall survival (OS). Training and validation of algorithms for cancer detection and grading were completed with three large datasets containing a total of 580 whole-mount prostate slides from 191 RP patients at two centers and 6218 annotated needle biopsy slides from the publicly available Prostate Cancer Grading Assessment dataset. A cancer detection model was trained using MobileNetV3 on 0.5 mm × 0.5 mm cancer areas (tiles) captured at 10× magnification. For cancer grading, a Gleason pattern detector was trained on tiles using a ResNet50 convolutional neural network and a selective CutMix training strategy involving a mixture of real and artificial examples. This strategy resulted in improved model generalizability in the test set compared with three different control experiments when evaluated on both needle biopsy slides and whole-mount prostate slides from different centers. In an additional test cohort of RP patients who were clinically followed over 30 years, quantitative Gleason pattern AI estimates achieved concordance indexes of 0.69, 0.72, and 0.64 for predicting RFS, MFS, and OS times, outperforming the control experiments and International Society of Urological Pathology system (ISUP) grading by pathologists. Finally, unsupervised clustering of test RP patient specimens into low-, medium-, and high-risk groups based on AI-estimated proportions of each Gleason pattern resulted in significantly improved RFS and MFS stratification compared with ISUP grading.
In summary, deep learning-based quantitative Gleason scoring using a selective CutMix training strategy may improve prognostication after prostate cancer surgery.
ABSTRACT
PURPOSE: To provide a comprehensive review of the means by which to optimize target volume definition for the purposes of treatment planning for patients with intact prostate cancer with a specific emphasis on focal boost volume definition. METHODS: Here we conduct a narrative review of the available literature summarizing the current state of knowledge on optimizing target volume definition for the treatment of localized prostate cancer. RESULTS: Historically, the treatment of prostate cancer included a uniform prescription dose administered to the entire prostate with or without coverage of all or part of the seminal vesicles. The development of prostate magnetic resonance imaging (MRI) and positron emission tomography (PET) using prostate-specific radiotracers has ushered in an era in which radiation oncologists are able to localize and focally dose-escalate high-risk volumes in the prostate gland. Recent phase 3 data has demonstrated that incorporating focal dose escalation to high-risk subvolumes of the prostate improves biochemical control without significantly increasing toxicity. Still, several fundamental questions remain regarding the optimal target volume definition and prescription strategy to implement this technique. Given the remaining uncertainty, a knowledge of the pathological correlates of radiographic findings and the anatomic patterns of tumor spread may help inform clinical judgement for the definition of clinical target volumes. CONCLUSION: Advanced imaging has the ability to improve outcomes for patients with prostate cancer in multiple ways, including by enabling focal dose escalation to high-risk subvolumes. However, many questions remain regarding the optimal target volume definition and prescription strategy to implement this practice, and key knowledge gaps remain. 
A detailed understanding of the pathological correlates of radiographic findings and the patterns of local tumor spread may help inform clinical judgement for target volume definition given the current state of uncertainty.
ABSTRACT
OBJECTIVE: To assess impact of image quality on prostate cancer extraprostatic extension (EPE) detection on MRI using a deep learning-based AI algorithm. MATERIALS AND METHODS: This retrospective, single institution study included patients who were imaged with mpMRI and subsequently underwent radical prostatectomy from June 2007 to August 2022. One genitourinary radiologist prospectively evaluated each patient using the NCI EPE grading system. Each T2WI was classified as low- or high-quality by a previously developed AI algorithm. Fisher's exact tests were performed to compare EPE detection metrics between low- and high-quality images. Univariable and multivariable analyses were conducted to assess the predictive value of image quality for pathological EPE. RESULTS: A total of 773 consecutive patients (median age 61 [IQR 56-67] years) were evaluated. At radical prostatectomy, 23% (180/773) of patients had EPE at pathology, and 41% (131/318) of positive EPE calls on mpMRI were confirmed to have EPE. The AI algorithm classified 36% (280/773) of T2WIs as low-quality and 64% (493/773) as high-quality. For EPE grade ≥ 1, high-quality T2WI significantly improved specificity for EPE detection (72% [95% CI 67-76%] vs. 63% [95% CI 56-69%], P = 0.03), but did not significantly affect sensitivity (72% [95% CI 62-80%] vs. 75% [95% CI 63-85%]), positive predictive value (44% [95% CI 39-49%] vs. 38% [95% CI 32-43%]), or negative predictive value (89% [95% CI 86-92%] vs. 89% [95% CI 85-93%]). Sensitivity, specificity, PPV, and NPV for EPE grades ≥ 2 and ≥ 3 did not show significant differences attributable to imaging quality. For NCI EPE grade 1, high-quality images (OR 3.05, 95% CI 1.54-5.86; P < 0.001) demonstrated a stronger association with pathologic EPE than low-quality images (OR 1.76, 95% CI 0.63-4.24; P = 0.24). 
CONCLUSION: Our study successfully employed a deep learning-based AI algorithm to classify image quality of prostate MRI and demonstrated that better quality T2WI was associated with more accurate prediction of EPE at final pathology.
Subject(s)
Deep Learning; Magnetic Resonance Imaging; Prostatectomy; Prostatic Neoplasms; Humans; Male; Prostatic Neoplasms/diagnostic imaging; Prostatic Neoplasms/pathology; Prostatic Neoplasms/surgery; Middle Aged; Retrospective Studies; Aged; Magnetic Resonance Imaging/methods; Algorithms; Image Interpretation, Computer-Assisted/methods; Neoplasm Grading
ABSTRACT
Initial imaging evaluation of hydronephrosis of unknown etiology is a complex subject and is dependent on clinical context. In asymptomatic patients, it is often best conducted via CT urography (CTU) without and with contrast, MR urography (MRU) without and with contrast, or scintigraphic evaluation with mercaptoacetyltriglycine (MAG3) imaging. For symptomatic patients, CTU without and with contrast, MRU without and with contrast, MAG3 scintigraphy, or ultrasound of the kidneys and bladder with Doppler imaging are all viable initial imaging studies. In asymptomatic pregnant patients, nonionizing imaging with US of the kidneys and bladder with Doppler imaging is preferred. Similarly, in symptomatic pregnant patients, US of the kidneys and bladder with Doppler imaging or MRU without contrast is the imaging study of choice, as both ionizing radiation and gadolinium contrast are avoided in pregnancy. The American College of Radiology Appropriateness Criteria are evidence-based guidelines for specific clinical conditions that are reviewed annually by a multidisciplinary expert panel. The guideline development and revision process support the systematic analysis of the medical literature from peer reviewed journals. Established methodology principles such as Grading of Recommendations Assessment, Development, and Evaluation or GRADE are adapted to evaluate the evidence. The RAND/UCLA Appropriateness Method User Manual provides the methodology to determine the appropriateness of imaging and treatment procedures for specific clinical scenarios. In those instances where peer reviewed literature is lacking or equivocal, experts may be the primary evidentiary source available to formulate a recommendation.
Subject(s)
Evidence-Based Medicine; Hydronephrosis; Societies, Medical; Humans; Hydronephrosis/diagnostic imaging; United States; Female; Pregnancy; Diagnostic Imaging/methods; Contrast Media
ABSTRACT
Introduction: This study explores the use of the latest You Only Look Once (YOLO V7) object detection method to enhance kidney detection in medical imaging by training and testing a modified YOLO V7 on medical image formats. Methods: The study included 878 patients with various subtypes of renal cell carcinoma (RCC) and 206 patients with normal kidneys. A total of 5657 MRI scans from 1084 patients were retrieved. A total of 326 patients with 1034 tumors were recruited from a retrospectively maintained database, and bounding boxes were drawn around their tumors. A primary model was trained on 80% of annotated cases, with 20% saved for testing (primary test set). The best primary model was then used to identify tumors in the remaining 861 patients, and bounding box coordinates were generated on their scans using the model. Ten benchmark training sets were created with generated coordinates on non-segmented patients. The final model was used to predict the kidney in the primary test set. We reported the positive predictive value (PPV), sensitivity, and mean average precision (mAP). Results: The primary training set showed an average PPV of 0.94 ± 0.01, sensitivity of 0.87 ± 0.04, and mAP of 0.91 ± 0.02. The best primary model yielded a PPV of 0.97, sensitivity of 0.92, and mAP of 0.95. The final model demonstrated an average PPV of 0.95 ± 0.03, sensitivity of 0.98 ± 0.004, and mAP of 0.95 ± 0.01. Conclusion: Using a semi-supervised approach with a medical image library, we developed a high-performing model for kidney detection. Further external validation is required to assess the model's generalizability.