ABSTRACT
The choice to postpone treatment while awaiting genetic testing can result in significant delay in definitive therapies in patients with severe pancytopenia. Conversely, the misdiagnosis of inherited bone marrow failure (BMF) can expose patients to ineffectual and expensive therapies, toxic transplant conditioning regimens, and inappropriate use of an affected family member as a stem cell donor. To predict the likelihood of patients having acquired or inherited BMF, we developed a 2-step data-driven machine-learning model using 25 clinical and laboratory variables typically recorded at the initial clinical encounter. For model development, patients were labeled as having acquired or inherited BMF depending on their genomic data. Data sets were unbiasedly clustered, and an ensemble model was trained with cases from the largest cluster of a training cohort (n = 359) and validated with an independent cohort (n = 127). Cluster A, the largest group, was mostly immune or inherited aplastic anemia, whereas cluster B comprised underrepresented BMF phenotypes and was not included in the next step of data modeling because of a small sample size. The ensemble cluster A-specific model was accurate (89%) to predict BMF etiology, correctly predicting inherited and likely immune BMF in 79% and 92% of cases, respectively. Our model represents a practical guide for BMF diagnosis and highlights the importance of clinical and laboratory variables in the initial evaluation, particularly telomere length. Our tool can be potentially used by general hematologists and health care providers not specialized in BMF, and in under-resourced centers, to prioritize patients for genetic testing or for expeditious treatment.
Subjects
Aplastic Anemia, Bone Marrow Diseases, Pancytopenia, Humans, Bone Marrow Diseases/diagnosis, Bone Marrow Diseases/genetics, Bone Marrow Diseases/therapy, Differential Diagnosis, Aplastic Anemia/diagnosis, Aplastic Anemia/genetics, Aplastic Anemia/therapy, Bone Marrow Failure Disorders/diagnosis, Pancytopenia/diagnosis
ABSTRACT
Background GPT-4V (GPT-4 with vision, ChatGPT; OpenAI) has shown impressive performance in several medical assessments. However, few studies have assessed its performance in interpreting radiologic images. Purpose To assess and compare the accuracy of GPT-4V in assessing radiologic cases with both images and textual context to that of radiologists and residents, to assess if GPT-4V assistance improves human accuracy, and to assess and compare the accuracy of GPT-4V with that of image-only or text-only inputs. Materials and Methods Seventy-two Case of the Day questions at the RSNA 2023 Annual Meeting were curated in this observer study. Answers from GPT-4V were obtained between November 26 and December 10, 2023, with the following inputs for each question: image only, text only, and both text and images. Five radiologists and three residents also answered the questions in an "open book" setting. For the artificial intelligence (AI)-assisted portion, the radiologists and residents were provided with the outputs of GPT-4V. The accuracy of radiologists and residents, both with and without AI assistance, was analyzed using a mixed-effects linear model. The accuracies of GPT-4V with different input combinations were compared by using the McNemar test. P < .05 was considered to indicate a significant difference. Results The accuracy of GPT-4V was 43% (31 of 72; 95% CI: 32, 55). Radiologists and residents did not significantly outperform GPT-4V in either imaging-dependent (59% and 56% vs 39%; P = .31 and .52, respectively) or imaging-independent (76% and 63% vs 70%; both P = .99) cases. With access to GPT-4V responses, there was no evidence of improvement in the average accuracy of the readers. The accuracy obtained by GPT-4V with text-only and image-only inputs was 50% (35 of 70; 95% CI: 39, 61) and 38% (26 of 69; 95% CI: 27, 49), respectively. Conclusion The radiologists and residents did not significantly outperform GPT-4V. 
Assistance from GPT-4V did not help human raters. GPT-4V relied on the textual context for its outputs. © RSNA, 2024 Supplemental material is available for this article. See also the editorial by Katz in this issue.
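The accuracy figures above are reported with 95% confidence intervals, but the abstract does not say how those intervals were computed. As a minimal sketch, assuming a Wilson score interval (an assumption, not the authors' stated method), the interval for 31 correct answers out of 72 can be reproduced approximately as follows. The function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its CI method; Wilson is used
    here only because it is a common choice for proportions.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p + z ** 2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return center - half_width, center + half_width


# Counts taken from the abstract: 31 of 72 questions answered correctly.
low, high = wilson_interval(31, 72)
print(f"accuracy = {31 / 72:.0%}, 95% CI: {low:.0%}, {high:.0%}")
```

Under this assumption the bounds should land close to the reported 32% and 55%; a different interval method would shift them slightly.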
Subjects
Radiology, Humans, Clinical Competence, Artificial Intelligence, Medical Societies, Internship and Residency
ABSTRACT
OBJECTIVE: To assess the diagnostic performance of post-contrast CT for predicting moderate hepatic steatosis in an older adult cohort undergoing a uniform CT protocol, utilizing hepatic and splenic attenuation values. MATERIALS AND METHODS: A total of 1676 adults (mean age, 68.4 ± 10.2 years; 1045M/631F) underwent a CT urothelial protocol that included unenhanced, portal venous, and 10-min delayed phases through the liver and spleen. Automated hepatosplenic segmentation for attenuation values (in HU) was performed using a validated deep-learning tool. Unenhanced liver attenuation < 40.0 HU, corresponding to > 15% MRI-based proton density fat, served as the reference standard for moderate steatosis. RESULTS: The prevalence of moderate or severe steatosis was 12.9% (216/1676). The diagnostic performance of portal venous liver HU in predicting moderate hepatic steatosis (AUROC = 0.943) was significantly better than the liver-spleen HU difference (AUROC = 0.814) (p < 0.001). Portal venous phase liver thresholds of 80 and 90 HU had a sensitivity/specificity for moderate steatosis of 85.6%/89.6%, and 94.9%/74.7%, respectively, whereas a liver-spleen difference of -40 HU and -10 HU had a sensitivity/specificity of 43.5%/90.0% and 92.1%/52.5%, respectively. Furthermore, livers with moderate-severe steatosis demonstrated significantly less post-contrast enhancement (mean, 35.7 HU vs 47.3 HU; p < 0.001). CONCLUSION: Moderate steatosis can be reliably diagnosed on standard portal venous phase CT using liver attenuation values alone. Consideration of splenic attenuation appears to add little value. Moderate steatosis not only has intrinsically lower pre-contrast liver attenuation values (< 40 HU), but also enhances less, typically resulting in post-contrast liver attenuation values of 80 HU or less. CLINICAL RELEVANCE STATEMENT: Moderate steatosis can be reliably diagnosed on post-contrast CT using liver attenuation values alone. 
Livers with at least moderate steatosis enhance less than those with mild or no steatosis, which combines with the lower intrinsic attenuation to improve detection. KEY POINTS: The liver-spleen attenuation difference is frequently utilized in routine practice but appears to have performance limitations. The liver-spleen attenuation difference is less effective than liver attenuation for moderate steatosis. Moderate and severe steatosis can be identified on standard portal venous phase CT using liver attenuation alone.
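The thresholding logic in this abstract can be sketched in a few lines. The sketch below assumes a "threshold of 80 HU" means "80 HU or less" (following the wording of the conclusion) and uses made-up attenuation values purely to show how raising the portal venous threshold trades specificity for sensitivity. Function names and the toy data are illustrative, not from the study.

```python
def reference_moderate_steatosis(unenhanced_liver_hu):
    """Reference standard from the abstract: unenhanced liver attenuation < 40 HU."""
    return unenhanced_liver_hu < 40.0


def flag_moderate_steatosis(portal_venous_liver_hu, threshold_hu=80.0):
    """Assumption: a liver at or below the threshold is flagged as steatotic."""
    return portal_venous_liver_hu <= threshold_hu


def sensitivity_specificity(portal_venous_hu, has_steatosis, threshold_hu):
    """Return (sensitivity, specificity) of the threshold on labeled cases."""
    tp = fn = tn = fp = 0
    for hu, truth in zip(portal_venous_hu, has_steatosis):
        flagged = flag_moderate_steatosis(hu, threshold_hu)
        if truth and flagged:
            tp += 1
        elif truth:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical values, not study data).
toy_hu = [62, 75, 85, 95, 110, 120, 88, 78]
toy_truth = [True, True, True, False, False, False, False, False]

for threshold in (80.0, 90.0):
    sens, spec = sensitivity_specificity(toy_hu, toy_truth, threshold)
    print(f"threshold {threshold:.0f} HU: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

On the toy data the higher threshold catches every steatotic liver but flags more normal ones, which is the same direction of trade-off the abstract reports for 80 HU versus 90 HU.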
Subjects
Contrast Media, Fatty Liver, Spleen, X-Ray Computed Tomography, Humans, Male, Female, Aged, X-Ray Computed Tomography/methods, Spleen/diagnostic imaging, Fatty Liver/diagnostic imaging, Sensitivity and Specificity, Liver/diagnostic imaging, Middle Aged, Retrospective Studies
ABSTRACT
OBJECTIVES: To evaluate the utility of CT-based abdominal fat measures for predicting the risk of death and cardiometabolic disease in an asymptomatic adult screening population. METHODS: Fully automated AI tools quantifying abdominal adipose tissue (L3 level visceral [VAT] and subcutaneous [SAT] fat area, visceral-to-subcutaneous fat ratio [VSR], VAT attenuation), muscle attenuation (L3 level), and liver attenuation were applied to non-contrast CT scans in asymptomatic adults undergoing CT colonography (CTC). Longitudinal follow-up documented subsequent deaths, cardiovascular events, and diabetes. ROC and time-to-event analyses were performed to generate AUCs and hazard ratios (HR) binned by octile. RESULTS: A total of 9223 adults (mean age, 57 years; 4071:5152 M:F) underwent screening CTC from April 2004 to December 2016. A total of 549 patients died on follow-up (median, nine years). Fat measures outperformed BMI for predicting mortality risk: 5-year AUCs for muscle attenuation, VSR, and BMI were 0.721, 0.661, and 0.499, respectively. Higher visceral, muscle, and liver fat were associated with increased mortality risk: VSR > 1.53, HR = 3.1; muscle attenuation < 15 HU, HR = 5.4; liver attenuation < 45 HU, HR = 2.3. Higher VAT area and VSR were associated with increased cardiovascular event and diabetes risk: VSR > 1.59, HR = 2.6 for cardiovascular event; VAT area > 291 cm², HR = 6.3 for diabetes (p < 0.001). A U-shaped association was observed for SAT with a higher risk of death for very low and very high SAT. CONCLUSION: Fully automated CT-based measures of abdominal fat are predictive of mortality and cardiometabolic disease risk in asymptomatic adults and uncover trends that are not reflected in anthropometric measures. CLINICAL RELEVANCE STATEMENT: Fully automated CT-based measures of abdominal fat soundly outperform anthropometric measures for mortality and cardiometabolic risk prediction in asymptomatic patients. 
KEY POINTS: Abdominal fat depots associated with metabolic dysregulation and cardiovascular disease can be derived from abdominal CT. Fully automated AI body composition tools can measure factors associated with increased mortality and cardiometabolic risk. CT-based abdominal fat measures uncover trends in mortality and cardiometabolic risk not captured by BMI in asymptomatic outpatients.
ABSTRACT
Background: The long-acting glucagon-like peptide-1 receptor agonist semaglutide is used to treat type 2 diabetes or obesity in adults. Clinical trials have observed associations of semaglutide with weight loss, improved diabetic control, and cardiovascular risk reduction. Objective: To evaluate intrapatient changes in body composition after initiation of semaglutide therapy by applying an automated suite of CT-based artificial intelligence (AI) body composition tools. Methods: This retrospective study included adult patients with semaglutide treatment who underwent abdominopelvic CT both within 5 years before and within 5 years after semaglutide initiation, between January 2016 and November 2023. An automated suite of previously validated CT-based AI body composition tools was applied to pre-semaglutide and post-semaglutide scans to quantify visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT) area, skeletal muscle area and attenuation, intermuscular adipose tissue (IMAT) area, liver volume and attenuation, and trabecular bone mineral density (BMD). Patients with ≥5-kg weight loss and ≥5-kg weight gain between scans were compared. Results: The study included 241 patients (mean age, 60.4±12.4 years; 151 women, 90 men). In the weight-loss group (n=67), the post-semaglutide scan, versus pre-semaglutide scan, showed decrease in VAT area (341.1 vs 309.4 cm2, p<.001), SAT area (371.4 vs 410.7 cm2, p<.001), muscle area (179.2 vs 193.0, p<.001), and liver volume (2379.0 vs 2578 HU, p=.009), and increase in liver attenuation (74.5 vs 67.6 HU, p=.03). In the weight-gain group (n=48), the post-semaglutide scan, versus pre-semaglutide scan, showed increase in VAT area (334.0 vs 312.8, p=.002), SAT area (485.8 vs 488.8 cm2, p=.01), and IMAT area (48.4 vs 37.6, p=.009), and decrease in muscle attenuation (5.9 vs 13.1, p<.001). Other comparisons were not significant (p>.05). 
Conclusion: Patients using semaglutide who lost versus gained weight demonstrated distinct patterns of changes in CT-based body composition measures. Those with weight loss exhibited overall favorable shifts in measures related to cardiometabolic risk. Muscle attenuation decrease in those with weight gain is consistent with decreased muscle quality. Clinical Impact: Automated CT-based AI tools provide biomarkers of body composition changes in patients using semaglutide beyond that which is evident by standard clinical measures.
ABSTRACT
Background Large language models (LLMs) such as ChatGPT, though proficient in many text-based tasks, are not suitable for use with radiology reports due to patient privacy constraints. Purpose To test the feasibility of using an alternative LLM (Vicuna-13B) that can be run locally for labeling radiography reports. Materials and Methods Chest radiography reports from the MIMIC-CXR and National Institutes of Health (NIH) data sets were included in this retrospective study. Reports were examined for 13 findings. Outputs reporting the presence or absence of the 13 findings were generated by Vicuna by using a single-step or multistep prompting strategy (prompts 1 and 2, respectively). Agreements between Vicuna outputs and CheXpert and CheXbert labelers were assessed using Fleiss κ. Agreement between Vicuna outputs from three runs under a hyperparameter setting that introduced some randomness (temperature, 0.7) was also assessed. The performance of Vicuna and the labelers was assessed in a subset of 100 NIH reports annotated by a radiologist with use of area under the receiver operating characteristic curve (AUC). Results A total of 3269 reports from the MIMIC-CXR data set (median patient age, 68 years [IQR, 59-79 years]; 161 male patients) and 25 596 reports from the NIH data set (median patient age, 47 years [IQR, 32-58 years]; 1557 male patients) were included. Vicuna outputs with prompt 2 showed, on average, moderate to substantial agreement with the labelers on the MIMIC-CXR (κ median, 0.57 [IQR, 0.45-0.66] with CheXpert and 0.64 [IQR, 0.45-0.68] with CheXbert) and NIH (κ median, 0.52 [IQR, 0.41-0.65] with CheXpert and 0.55 [IQR, 0.41-0.74] with CheXbert) data sets, respectively. Vicuna with prompt 2 performed at par (median AUC, 0.84 [IQR, 0.74-0.93]) with both labelers on nine of 11 findings. 
Conclusion In this proof-of-concept study, outputs of the LLM Vicuna reporting the presence or absence of 13 findings on chest radiography reports showed moderate to substantial agreement with existing labelers. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Cai in this issue.
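Agreement between Vicuna and the two labelers is summarized with Fleiss κ. The following is a minimal sketch of the standard Fleiss κ computation for one finding, written from the textbook definition rather than from the study's code. The label vectors are hypothetical, and the helper names `to_counts` and `fleiss_kappa` are illustrative.

```python
def to_counts(rater_labels, categories=(0, 1)):
    """Turn per-rater label lists into an items x categories count table."""
    n_items = len(rater_labels[0])
    if any(len(labels) != n_items for labels in rater_labels):
        raise ValueError("every rater must label the same items")
    return [
        [sum(1 for labels in rater_labels if labels[i] == c) for c in categories]
        for i in range(n_items)
    ]


def fleiss_kappa(counts):
    """Fleiss kappa; counts[i][j] = number of raters placing item i in category j."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    # Observed agreement, averaged over items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters) for j in range(n_categories)]
    p_exp = sum(p * p for p in p_cat)
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical presence (1) / absence (0) labels for one finding on six reports.
vicuna = [1, 0, 1, 1, 0, 0]
chexpert = [1, 0, 1, 0, 0, 0]
chexbert = [1, 0, 1, 1, 0, 1]
print(round(fleiss_kappa(to_counts([vicuna, chexpert, chexbert])), 3))
```

In the study the statistic would be computed per finding and then summarized as a median with IQR across findings; the sketch covers only the per-finding step.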
Subjects
New World Camelids, Radiology, United States, Humans, Male, Animals, Aged, Middle Aged, Privacy, Feasibility Studies, Retrospective Studies, Language
ABSTRACT
Background Body composition data have been limited to adults with disease or older age. The prognostic impact in otherwise asymptomatic adults is unclear. Purpose To use artificial intelligence-based body composition metrics from routine abdominal CT scans in asymptomatic adults to clarify the association between obesity, liver steatosis, myopenia, and myosteatosis and the risk of mortality. Materials and Methods In this retrospective single-center study, consecutive adult outpatients undergoing routine colorectal cancer screening from April 2004 to December 2016 were included. Using a U-Net algorithm, the following body composition metrics were extracted from low-dose, noncontrast, supine multidetector abdominal CT scans: total muscle area, muscle density, subcutaneous and visceral fat area, and volumetric liver density. Abnormal body composition was defined by the presence of liver steatosis, obesity, muscle fatty infiltration (myosteatosis), and/or low muscle mass (myopenia). The incidence of death and major adverse cardiovascular events were recorded during a median follow-up of 8.8 years. Multivariable analyses were performed accounting for age, sex, smoking status, myosteatosis, liver steatosis, myopenia, type 2 diabetes, obesity, visceral fat, and history of cardiovascular events. Results Overall, 8982 consecutive outpatients (mean age, 57 years ± 8 [SD]; 5008 female, 3974 male) were included. Abnormal body composition was found in 86% (434 of 507) of patients who died during follow-up. Myosteatosis was found in 278 of 507 patients (55%) who died (15.5% absolute risk at 10 years). Myosteatosis, obesity, liver steatosis, and myopenia were associated with increased mortality risk (hazard ratio [HR]: 4.33 [95% CI: 3.63, 5.16], 1.27 [95% CI: 1.06, 1.53], 1.86 [95% CI: 1.56, 2.21], and 1.75 [95% CI: 1.43, 2.14], respectively). 
In 8303 patients (excluding 679 patients without complete data), after multivariable adjustment, myosteatosis remained associated with increased mortality risk (HR, 1.89 [95% CI: 1.52, 2.35]; P < .001). Conclusion Artificial intelligence-based profiling of body composition from routine abdominal CT scans identified myosteatosis as a key predictor of mortality risk in asymptomatic adults. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Tong and Magudia in this issue.
Subjects
Cardiovascular Diseases, Type 2 Diabetes Mellitus, Fatty Liver, Sarcopenia, Humans, Male, Adult, Female, Middle Aged, Retrospective Studies, Type 2 Diabetes Mellitus/complications, Artificial Intelligence, Body Composition, Obesity/pathology, Cardiovascular Diseases/complications, Fatty Liver/complications, X-Ray Computed Tomography/methods, Skeletal Muscle/pathology, Sarcopenia/complications
ABSTRACT
Background CT-based body composition measures derived from fully automated artificial intelligence tools are promising for opportunistic screening. However, body composition thresholds associated with adverse clinical outcomes are lacking. Purpose To determine population and sex-specific thresholds for muscle, abdominal fat, and abdominal aortic calcium measures at abdominal CT for predicting risk of death, adverse cardiovascular events, and fragility fractures. Materials and Methods In this retrospective single-center study, fully automated algorithms for quantifying skeletal muscle (L3 level), abdominal fat (L3 level), and abdominal aortic calcium were applied to noncontrast abdominal CT scans from asymptomatic adults screened from 2004 to 2016. Longitudinal follow-up documented subsequent death, adverse cardiovascular events (myocardial infarction, cerebrovascular event, and heart failure), and fragility fractures. Receiver operating characteristic (ROC) curve analysis was performed to derive thresholds for body composition measures to achieve optimal ROC curve performance and high specificity (90%) for 10-year risks. Results A total of 9223 asymptomatic adults (mean age, 57 years ± 7 [SD]; 5152 women and 4071 men) were evaluated (median follow-up, 9 years). Muscle attenuation and aortic calcium had the highest diagnostic performance for predicting death, with areas under the ROC curve of 0.76 for men (95% CI: 0.72, 0.79) and 0.72 for women (95% CI: 0.69, 0.76) for muscle attenuation. Sex-specific thresholds were higher in men than women (P < .001 for muscle attenuation for all outcomes). The highest-performing markers for risk of death were muscle attenuation in men (31 HU; 71% sensitivity [164 of 232 patients]; 72% specificity [1114 of 1543 patients]) and aortic calcium in women (Agatston score, 167; 70% sensitivity [152 of 218 patients]; 70% specificity [1427 of 2034 patients]). 
Ninety-percent specificity thresholds for muscle attenuation for both risk of death and fragility fractures were 23 HU (men) and 13 HU (women). For aortic calcium and risk of death and adverse cardiovascular events, 90% specificity Agatston score thresholds were 1475 (men) and 735 (women). Conclusion Sex-specific thresholds for automated abdominal CT-based body composition measures can be used to predict risk of death, adverse cardiovascular events, and fragility fractures. © RSNA, 2022 Online supplemental material is available for this article. See also the editorial by Ohliger in this issue.
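The sensitivity and specificity percentages in this abstract follow directly from the bracketed patient counts, and the 90%-specificity thresholds can be read as simple sex-specific cutoffs. The sketch below recomputes the percentages and applies the cutoffs. It assumes that muscle attenuation below the threshold and an Agatston score above the threshold count as high risk; the abstract gives the threshold values but not the inequality direction or strictness, so both are assumptions here.

```python
def rate(numerator, denominator):
    """Proportion as a fraction, e.g. sensitivity = true positives / all positives."""
    return numerator / denominator


# Counts copied from the abstract.
REPORTED = {
    "muscle attenuation, men (31 HU)": {"sensitivity": (164, 232), "specificity": (1114, 1543)},
    "aortic calcium, women (Agatston 167)": {"sensitivity": (152, 218), "specificity": (1427, 2034)},
}

# 90%-specificity thresholds from the abstract.
MUSCLE_HU_90 = {"male": 23.0, "female": 13.0}
AGATSTON_90 = {"male": 1475.0, "female": 735.0}


def high_risk_flags(sex, muscle_attenuation_hu, agatston_score):
    """Assumption: low muscle attenuation and high aortic calcium mark high risk."""
    return {
        "low_muscle_attenuation": muscle_attenuation_hu < MUSCLE_HU_90[sex],
        "high_aortic_calcium": agatston_score > AGATSTON_90[sex],
    }


for marker, metrics in REPORTED.items():
    summary = ", ".join(f"{name} {rate(*counts):.0%}" for name, counts in metrics.items())
    print(f"{marker}: {summary}")

print(high_risk_flags("female", muscle_attenuation_hu=11.0, agatston_score=300.0))
```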
Subjects
Cardiovascular Diseases, Bone Fractures, Male, Adult, Humans, Female, Middle Aged, Retrospective Studies, Calcium, Artificial Intelligence, Abdominal Muscles, X-Ray Computed Tomography/methods, Body Composition
ABSTRACT
Radiologic tests often contain rich imaging data not relevant to the clinical indication. Opportunistic screening refers to the practice of systematically leveraging these incidental imaging findings. Although opportunistic screening can apply to imaging modalities such as conventional radiography, US, and MRI, most attention to date has focused on body CT by using artificial intelligence (AI)-assisted methods. Body CT represents an ideal high-volume modality whereby a quantitative assessment of tissue composition (eg, bone, muscle, fat, and vascular calcium) can provide valuable risk stratification and help detect unsuspected presymptomatic disease. The emergence of "explainable" AI algorithms that fully automate these measurements could eventually lead to their routine clinical use. Potential barriers to widespread implementation of opportunistic CT screening include the need for buy-in from radiologists, referring providers, and patients. Standardization of acquiring and reporting measures is needed, in addition to expanded normative data according to age, sex, and race and ethnicity. Regulatory and reimbursement hurdles are not insurmountable but pose substantial challenges to commercialization and clinical use. Through demonstration of improved population health outcomes and cost-effectiveness, these opportunistic CT-based measures should be attractive to both payers and health care systems as value-based reimbursement models mature. If highly successful, opportunistic screening could eventually justify a practice of standalone "intended" CT screening.
Subjects
Artificial Intelligence, Radiology, Humans, Algorithms, Radiologists, Mass Screening/methods, Radiology/methods
ABSTRACT
Background Pre-liver transplant (LT) sarcopenia is associated with poor survival. Methods exist for measuring body composition with use of CT scans; however, it is unclear which components best predict post-LT outcomes. Purpose To quantify the association between abdominal CT-based body composition measurements and post-LT mortality in a large North American cohort. Materials and Methods This was a retrospective cohort of adult first-time deceased-donor LT recipients from 2009 to 2018 who underwent pre-LT abdominal CT scans, including at the L3 vertebral level, at Johns Hopkins Hospital. Measurements included sarcopenia (skeletal muscle index [SMI] <50 in men and <39 in women), sarcopenic obesity, myosteatosis (skeletal muscle CT attenuation <41 mean HU for body mass index [BMI] <25 and <33 mean HU for BMI ≥25), visceral adipose tissue (VAT), subcutaneous adipose tissue (SAT), and VAT/SAT ratio. Covariates in the adjusted models were selected with use of least absolute shrinkage and selection operator regression with lambda chosen by means of 10-fold cross-validation. Cox proportional hazards models were used to quantify associations with post-LT mortality. Model discrimination was quantified using the Harrell C-statistic. Results A total of 454 recipients (median age, 57 years [IQR, 50-62 years]; 294 men) were evaluated. In the adjusted model, pre-LT sarcopenia was associated with a higher hazard ratio (HR) of post-LT mortality (HR, 1.6 [95% CI: 1.1, 2.4]; C-statistic, 0.64; P = .02). SMI was significantly negatively associated with survival after adjustment for covariates. There was no evidence that myosteatosis was associated with mortality (HR, 1.3 [95% CI: 0.86, 2.1]; C-statistic, 0.64; P = .21). There was no evidence that BMI (HR, 1.2 [95% CI: 0.95, 1.4]), VAT (HR, 1.0 [95% CI: 0.98, 1.1]), SAT (HR, 1.0 [95% CI: 0.97, 1.0]), and VAT/SAT ratio (HR, 1.1 [95% CI: 0.90, 1.4]) were associated with mortality (P = .15-.77). 
Conclusions Sarcopenia, as assessed on routine pre-liver transplant (LT) abdominal CT scans, was the only factor significantly associated with post-LT mortality. © RSNA, 2022 See also the editorial by Ruehm in this issue.
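The sarcopenia and myosteatosis definitions in the Materials and Methods are rule-based and can be written out directly. The sketch below encodes the cutoffs quoted in the abstract. The SMI formula (L3 skeletal muscle area in cm² divided by height in m squared) is the conventional definition and is an assumption here, since the abstract does not spell it out; function names are illustrative.

```python
def skeletal_muscle_index(muscle_area_cm2, height_m):
    """Conventional SMI in cm^2/m^2 (assumed; not stated in the abstract)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return muscle_area_cm2 / height_m ** 2


def has_sarcopenia(smi, sex):
    """Abstract cutoffs: SMI < 50 in men, < 39 in women."""
    cutoff = {"male": 50.0, "female": 39.0}[sex]
    return smi < cutoff


def has_myosteatosis(mean_muscle_hu, bmi):
    """Abstract cutoffs: < 41 HU when BMI < 25, < 33 HU when BMI >= 25."""
    cutoff = 41.0 if bmi < 25 else 33.0
    return mean_muscle_hu < cutoff


# Hypothetical patient: 150 cm^2 of L3 muscle, 1.75 m tall, 35 HU, BMI 24.
smi = skeletal_muscle_index(150.0, 1.75)
print(round(smi, 1), has_sarcopenia(smi, "male"), has_myosteatosis(35.0, bmi=24.0))
```

Note that the same muscle attenuation can be classified differently depending on BMI, because the myosteatosis cutoff is lower for patients with BMI of 25 or more.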
Subjects
Liver Transplantation, Sarcopenia, Adult, Male, Humans, Female, Middle Aged, Sarcopenia/complications, Sarcopenia/diagnostic imaging, Retrospective Studies, Living Donors, Body Composition, Skeletal Muscle, X-Ray Computed Tomography/methods
ABSTRACT
BACKGROUND. Clinically usable artificial intelligence (AI) tools analyzing imaging studies should be robust to expected variations in study parameters. OBJECTIVE. The purposes of this study were to assess the technical adequacy of a set of automated AI abdominal CT body composition tools in a heterogeneous sample of external CT examinations performed outside of the authors' hospital system and to explore possible causes of tool failure. METHODS. This retrospective study included 8949 patients (4256 men, 4693 women; mean age, 55.5 ± 15.9 years) who underwent 11,699 abdominal CT examinations performed at 777 unique external institutions with 83 unique scanner models from six manufacturers with images subsequently transferred to the local PACS for clinical purposes. Three independent automated AI tools were deployed to assess body composition (bone attenuation, amount and attenuation of muscle, amount of visceral and subcutaneous fat). One axial series per examination was evaluated. Technical adequacy was defined as tool output values within empirically derived reference ranges. Failures (i.e., tool output outside of reference range) were reviewed to identify possible causes. RESULTS. All three tools were technically adequate in 11,431 of 11,699 (97.7%) examinations. At least one tool failed in 268 (2.3%) of the examinations. Individual adequacy rates were 97.8% for the bone tool, 99.1% for the muscle tool, and 98.9% for the fat tool. A single type of image processing error (anisometry error, due to incorrect DICOM header voxel dimension information) accounted for 81 of 92 (88.0%) examinations in which all three tools failed, and all three tools failed whenever this error occurred. Anisometry error was the most common specific cause of failure of all tools (bone, 31.6%; muscle, 81.0%; fat, 62.8%). A total of 79 of 81 (97.5%) anisometry errors occurred on scanners from a single manufacturer; 80 of 81 (98.8%) occurred on the same scanner model. 
No cause of failure was identified for 59.4% of failures of the bone tool, 16.0% of failures of the muscle tool, or 34.9% of failures of the fat tool. CONCLUSION. The automated AI body composition tools had high technical adequacy rates in a heterogeneous sample of external CT examinations, supporting the generalizability of the tools and their potential for broad use. CLINICAL IMPACT. Certain causes of AI tool failure related to technical factors may be largely preventable through use of proper acquisition and reconstruction protocols.
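Technical adequacy is defined above as a tool output falling inside an empirically derived reference range. The abstract does not list those ranges, so the sketch below uses clearly hypothetical placeholder ranges only to show the shape of such a check. All names and numeric limits are assumptions.

```python
# Hypothetical placeholder ranges; the study derived its own empirically.
REFERENCE_RANGES = {
    "bone_attenuation_hu": (0.0, 500.0),
    "muscle_area_cm2": (30.0, 400.0),
    "muscle_attenuation_hu": (-30.0, 90.0),
    "visceral_fat_area_cm2": (1.0, 900.0),
    "subcutaneous_fat_area_cm2": (1.0, 1200.0),
}


def failed_measures(outputs):
    """List measures that are missing or fall outside their reference range."""
    failures = []
    for name, (low, high) in REFERENCE_RANGES.items():
        value = outputs.get(name)
        if value is None or not (low <= value <= high):
            failures.append(name)
    return failures


def is_technically_adequate(outputs):
    """An examination is adequate when no measure fails its range check."""
    return not failed_measures(outputs)


# Hypothetical examination with an implausible muscle area, as might result
# from wrong voxel dimensions in the image header.
exam = {
    "bone_attenuation_hu": 160.0,
    "muscle_area_cm2": 5000.0,
    "muscle_attenuation_hu": 32.0,
    "visceral_fat_area_cm2": 140.0,
    "subcutaneous_fat_area_cm2": 210.0,
}
print(is_technically_adequate(exam), failed_measures(exam))
```

Because area measures scale with voxel size, a header that misstates voxel dimensions would push several outputs out of range at once, which is consistent with the report that all three tools failed whenever the anisometry error occurred.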
Subjects
Artificial Intelligence, X-Ray Computed Tomography, Male, Humans, Female, Adult, Middle Aged, Aged, X-Ray Computed Tomography/methods, Retrospective Studies, Computer-Assisted Image Processing, Body Composition
ABSTRACT
BACKGROUND. Splenomegaly historically has been assessed on imaging by use of potentially inaccurate linear measurements. Prior work tested a deep learning artificial intelligence (AI) tool that automatically segments the spleen to determine splenic volume. OBJECTIVE. The purpose of this study is to apply the deep learning AI tool in a large screening population to establish volume-based splenomegaly thresholds. METHODS. This retrospective study included a primary (screening) sample of 8901 patients (4235 men, 4666 women; mean age, 56 ± 10 [SD] years) who underwent CT colonography (n = 7736) or renal donor CT (n = 1165) from April 2004 to January 2017 and a secondary sample of 104 patients (62 men, 42 women; mean age, 56 ± 8 years) with end-stage liver disease who underwent contrast-enhanced CT performed as part of evaluation for potential liver transplant from January 2011 to May 2013. The automated deep learning AI tool was used for spleen segmentation, to determine splenic volumes. Two radiologists independently reviewed a subset of segmentations. Weight-based volume thresholds for splenomegaly were derived using regression analysis. Performance of linear measurements was assessed. Frequency of splenomegaly in the secondary sample was determined using weight-based volumetric thresholds. RESULTS. In the primary sample, both observers confirmed splenectomy in 20 patients with an automated splenic volume of 0 mL; confirmed incomplete splenic coverage in 28 patients with a tool output error; and confirmed adequate segmentation in 21 patients with low volume (< 50 mL), 49 patients with high volume (> 600 mL), and 200 additional randomly selected patients. In 8853 patients included in analysis of splenic volumes (i.e., excluding a value of 0 mL or error values), the mean automated splenic volume was 216 ± 100 [SD] mL. 
The weight-based volumetric threshold (expressed in milliliters) for splenomegaly was calculated as (3.01 × weight [expressed as kilograms]) + 127; for weight greater than 125 kg, the splenomegaly threshold was constant (503 mL). Sensitivity and specificity for volume-defined splenomegaly were 13% and 100%, respectively, at a true craniocaudal length of 13 cm, and 78% and 88% for a maximum 3D length of 13 cm. In the secondary sample, both observers identified segmentation failure in one patient. The mean automated splenic volume in the 103 remaining patients was 796 ± 457 mL; 84% (87/103) of patients met the weight-based volume-defined splenomegaly threshold. CONCLUSION. We derived a weight-based volumetric threshold for splenomegaly using an automated AI-based tool. CLINICAL IMPACT. The AI tool could facilitate large-scale opportunistic screening for splenomegaly.
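The weight-based threshold stated above is a piecewise linear rule, which can be sketched as a small function. The coefficients (3.01, 127, 125 kg, 503 mL) come from the abstract; treating a volume "at or above" the threshold as splenomegaly is an assumption, as are the function names.

```python
def splenomegaly_threshold_ml(weight_kg):
    """Volume threshold in mL: 3.01 * weight + 127, constant 503 mL above 125 kg."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg > 125:
        return 503.0
    return 3.01 * weight_kg + 127


def is_splenomegaly(splenic_volume_ml, weight_kg):
    """Assumption: a volume at or above the threshold meets the definition."""
    return splenic_volume_ml >= splenomegaly_threshold_ml(weight_kg)


# Hypothetical patients: an 80 kg patient with the screening-sample mean volume
# (216 mL) and an 80 kg patient with the liver-disease-sample mean (796 mL).
for volume in (216.0, 796.0):
    print(volume, round(splenomegaly_threshold_ml(80.0), 1), is_splenomegaly(volume, 80.0))
```

At exactly 125 kg the linear term gives 503.25 mL, so the reported constant of 503 mL for heavier patients is effectively the linear rule capped at its 125 kg value.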
ABSTRACT
BACKGROUND. CT examinations contain opportunistic body composition data with potential prognostic utility. Previous studies have primarily used manual or semiautomated tools to evaluate body composition in patients with colorectal cancer (CRC). OBJECTIVE. The purpose of this article is to assess the utility of fully automated body composition measures derived from pretreatment CT examinations in predicting survival in patients with CRC. METHODS. This retrospective study included 1766 patients (mean age, 63.7 ± 14.4 [SD] years; 862 men, 904 women) diagnosed with CRC between January 2001 and September 2020 who underwent pretreatment abdominal CT. A panel of fully automated artificial intelligence-based algorithms was applied to portal venous phase images to quantify skeletal muscle attenuation at the L3 lumbar level, visceral adipose tissue (VAT) area and subcutaneous adipose tissue (SAT) area at L3, and abdominal aorta Agatston score (aortic calcium). The electronic health record was reviewed to identify patients who died of any cause (n = 848). ROC analyses and logistic regression analyses were used to identify predictors of survival, with attention to highest- and lowest-risk quartiles. RESULTS. Patients who died, compared with patients who survived, had lower median muscle attenuation (19.2 vs 26.2 HU, p < .001) and SAT area (168.4 cm² vs 197.6 cm², p < .001) and higher median aortic calcium (620 vs 182, p < .001). Measures with highest 5-year AUCs for predicting survival in patients without (n = 1303) and with (n = 463) metastatic disease were muscle attenuation (0.666 and 0.701, respectively) and aortic calcium (0.677 and 0.689, respectively). A combination of muscle attenuation, SAT area, and aortic calcium yielded 5-year AUCs of 0.758 and 0.732 in patients without and with metastases, respectively. 
Risk of death was increased (p < .05) in patients in the lowest quartile for muscle attenuation (hazard ratio [HR] = 1.55) and SAT area (HR = 1.81) and in the highest quartile for aortic calcium (HR = 1.37) and decreased (p < .05) in patients in the highest quartile for VAT area (HR = 0.79) and SAT area (HR = 0.76). In 423 patients with available BMI, BMI did not significantly predict death (p = .75). CONCLUSION. Fully automated CT-based body composition measures including muscle attenuation, SAT area, and aortic calcium predict survival in patients with CRC. CLINICAL IMPACT. Routine pretreatment body composition evaluation could improve initial risk stratification of patients with CRC.
Subjects
Artificial Intelligence , Colorectal Neoplasms , Male , Humans , Female , Middle Aged , Aged , Retrospective Studies , Calcium , Tomography, X-Ray Computed/methods , Body Composition , Colorectal Neoplasms/pathology
ABSTRACT
Background CT biomarkers both inside and outside the pancreas can potentially be used to diagnose type 2 diabetes mellitus. Previous studies on this topic have shown significant results but were limited by manual methods and small study samples. Purpose To investigate abdominal CT biomarkers for type 2 diabetes mellitus in a large clinical data set using fully automated deep learning. Materials and Methods For external validation, noncontrast abdominal CT images were retrospectively collected from consecutive patients who underwent routine colorectal cancer screening with CT colonography from 2004 to 2016. The pancreas was segmented using a deep learning method that outputs measurements of interest, including CT attenuation, volume, fat content, and pancreas fractal dimension. Additional biomarkers assessed included visceral fat, atherosclerotic plaque, liver and muscle CT attenuation, and muscle volume. Univariable and multivariable analyses were performed, separating patients into groups based on time between type 2 diabetes diagnosis and CT date and including clinical factors such as sex, age, body mass index (BMI), BMI greater than 30 kg/m², and height. The best set of predictors for type 2 diabetes was determined using multinomial logistic regression. Results A total of 8992 patients (mean age, 57 years ± 8 [SD]; 5009 women) were evaluated in the test set, of whom 572 had type 2 diabetes mellitus. The deep learning model had a mean Dice similarity coefficient for the pancreas of 0.69 ± 0.17, similar to the interobserver Dice similarity coefficient of 0.69 ± 0.09 (P = .92). The univariable analysis showed that patients with diabetes had, on average, lower pancreatic CT attenuation (mean, 18.74 HU ± 16.54 vs 29.99 HU ± 13.41; P < .0001) and greater visceral fat volume (mean, 235.0 mL ± 108.6 vs 130.9 mL ± 96.3; P < .0001) than those without diabetes.
Patients with diabetes also showed a progressive decrease in pancreatic attenuation with greater duration of disease. The final multivariable model showed pairwise areas under the receiver operating characteristic curve (AUCs) of 0.81 and 0.85 between patients without and patients with diabetes who were diagnosed 0-2499 days before and after undergoing CT, respectively. In the multivariable analysis, adding clinical data did not improve upon CT-based AUC performance (AUC = 0.67 for the CT-only model vs 0.68 for the CT and clinical model). The best predictors of type 2 diabetes mellitus included intrapancreatic fat percentage, pancreatic fractal dimension, plaque severity between the L1 and L4 vertebra levels, average liver CT attenuation, and BMI. Conclusion The diagnosis of type 2 diabetes mellitus was associated with abdominal CT biomarkers, especially measures of pancreatic CT attenuation and visceral fat. © RSNA, 2022 Online supplemental material is available for this article.
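For readers unfamiliar with the Dice similarity coefficient reported above, the following is a minimal sketch of its standard definition: twice the overlap of two segmentations divided by the sum of their sizes. Representing each segmentation as a set of voxel indices, and the name `dice_coefficient`, are assumptions made for illustration; this is not the study's code.

```python
def dice_coefficient(mask_a: set, mask_b: set) -> float:
    """Dice similarity coefficient between two segmentations.

    Each mask is a set of voxel indices (a hypothetical representation).
    Returns 2 * |A & B| / (|A| + |B|), or 1.0 when both masks are empty.
    """
    total = len(mask_a) + len(mask_b)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * len(mask_a & mask_b) / total


# Toy example: two 3-voxel masks that share 2 voxels.
automated = {(0, 0, 1), (0, 0, 2), (0, 0, 3)}
manual = {(0, 0, 2), (0, 0, 3), (0, 0, 4)}
print(round(dice_coefficient(automated, manual), 3))  # prints 0.667
```

A value of 1.0 means identical masks and 0.0 means no overlap, so the reported 0.69 indicates partial agreement, comparable to the agreement between human observers.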
Subjects
Deep Learning , Diabetes Mellitus, Type 2 , Biomarkers , Diabetes Mellitus, Type 2/diagnostic imaging , Female , Humans , Middle Aged , Retrospective Studies , Tomography, X-Ray Computed/methods
ABSTRACT
Background Imaging assessment for hepatomegaly is not well defined and currently uses suboptimal, unidimensional measures. Liver volume provides a more direct measure for organ enlargement. Purpose To determine organ volume and to establish thresholds for hepatomegaly with use of a validated deep learning artificial intelligence tool that automatically segments the liver. Materials and Methods In this retrospective study, liver volumes were successfully derived with use of a deep learning tool for asymptomatic outpatient adults who underwent multidetector CT for colorectal cancer screening (unenhanced) or renal donor evaluation (contrast-enhanced) at a single medical center between April 2004 and December 2016. The performance of the craniocaudal and maximal three-dimensional (3D) linear measures was assessed. The manual liver volume results were compared with the automated results in a subset of renal donors in which the entire liver was included at both precontrast and postcontrast CT. Unenhanced liver volumes were standardized to a postcontrast equivalent, reflecting a correction of 3.6%. Linear regression analysis was performed to assess the major patient-specific determinant or determinants of liver volume among age, sex, height, weight, and body surface area. Results A total of 3065 patients (mean age ± standard deviation, 54 years ± 12; 1639 women) underwent multidetector CT for colorectal screening (n = 1960) or renal donor evaluation (n = 1105). The mean standardized automated liver volume ± standard deviation was 1533 mL ± 375 and demonstrated a normal distribution. Patient weight was the major determinant of liver volume and demonstrated a linear relationship. From this result, a linear weight-based upper limit of normal hepatomegaly threshold volume was derived: hepatomegaly (mL) = 14.0 × (weight [kg]) + 979. 
A craniocaudal threshold of 19 cm was 71% sensitive (49 of 69 patients) and 86% specific (887 of 1030 patients) for hepatomegaly, and a maximal 3D linear threshold of 24 cm was 78% sensitive (54 of 69) and 66% specific (678 of 1030). In the subset of 189 patients, the median difference in hepatic volume between the deep learning tool and the semiautomated or manual method was 2.3% (38 mL). Conclusion A simple weight-based threshold for hepatomegaly derived by using a fully automated CT-based liver volume segmentation based on deep learning provided an objective and more accurate assessment of liver size than linear measures. © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Sosna in this issue.
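The weight-based threshold and the reported sensitivity and specificity figures can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and treating a volume strictly greater than the threshold as hepatomegaly is an assumption, since the abstract does not say how a volume exactly at the threshold is handled.

```python
def hepatomegaly_threshold_ml(weight_kg: float) -> float:
    """Upper limit of normal liver volume (mL) from the abstract's formula:
    threshold = 14.0 * weight [kg] + 979."""
    return 14.0 * weight_kg + 979.0


def is_hepatomegaly(liver_volume_ml: float, weight_kg: float) -> bool:
    """Assumption: volumes strictly above the threshold count as enlarged."""
    return liver_volume_ml > hepatomegaly_threshold_ml(weight_kg)


def percent(numerator: int, denominator: int) -> int:
    """Proportion rounded to a whole percentage, as in the abstract."""
    return round(100 * numerator / denominator)


# Threshold for a hypothetical 70 kg patient.
print(hepatomegaly_threshold_ml(70))  # prints 1959.0

# Linear-measure performance, using the counts given in the abstract.
print(percent(49, 69), percent(887, 1030))  # craniocaudal 19 cm: 71 86
print(percent(54, 69), percent(678, 1030))  # maximal 3D 24 cm: 78 66
```

For that 70 kg patient the mean standardized volume of 1533 mL falls below the 1959 mL threshold, so `is_hepatomegaly(1533, 70)` returns `False`.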
Subjects
Deep Learning , Hepatomegaly/diagnostic imaging , Organ Size , Tomography, X-Ray Computed/methods , Female , Humans , Male , Middle Aged , Retrospective Studies
ABSTRACT
BACKGROUND. Sarcopenia is associated with adverse clinical outcomes. CT-based skeletal muscle measurements for sarcopenia assessment are most commonly performed at the L3 vertebral level. OBJECTIVE. The purpose of this article is to compare the utility of fully automated deep learning CT-based muscle quantitation at the L1 versus L3 level for predicting future hip fractures and death. METHODS. This retrospective study included 9223 asymptomatic adults (mean age, 57 ± 8 [SD] years; 4071 men, 5152 women) who underwent unenhanced low-dose abdominal CT. A previously validated fully automated deep learning tool was used to assess muscle for myosteatosis (by mean attenuation) and myopenia (by cross-sectional area) at the L1 and L3 levels. Performance for predicting hip fractures and death was compared between L1 and L3 measures. Performance for predicting hip fractures and death was also evaluated using the established clinical risk scores from the fracture risk assessment tool (FRAX) and Framingham risk score (FRS), respectively. RESULTS. Median clinical follow-up interval after CT was 8.8 years (interquartile range, 5.1-11.6 years), yielding hip fractures and death in 219 (2.4%) and 549 (6.0%) patients, respectively. L1-level and L3-level muscle attenuation measurements were not different in 2-, 5-, or 10-year AUC for hip fracture (p = .18-.98) or death (p = .19-.95). For hip fracture, 5-year AUCs for L1-level muscle attenuation, L3-level muscle attenuation, and FRAX score were 0.717, 0.709, and 0.708, respectively. For death, 5-year AUCs for L1-level muscle attenuation, L3-level muscle attenuation, and FRS were 0.737, 0.721, and 0.688, respectively. Lowest quartile hazard ratios (HRs) for hip fracture were 2.20 (L1 attenuation), 2.45 (L3 attenuation), and 2.53 (FRAX score), and for death were 3.25 (L1 attenuation), 3.58 (L3 attenuation), and 2.82 (FRS). 
CT-based muscle cross-sectional area measurements at L1 and L3 were less predictive for hip fracture and death (5-year AUC ≤ 0.571; HR ≤ 1.56). CONCLUSION. Automated CT-based measurements of muscle attenuation for myosteatosis at the L1 level compare favorably with previously established L3-level measurements and clinical risk scores for predicting hip fracture and death. Assessment for myopenia was less predictive of outcomes at both levels. CLINICAL IMPACT. Alternative use of the L1 rather than L3 level for CT-based muscle measurements allows sarcopenia assessment using both chest and abdominal CT scans, greatly increasing the potential yield of opportunistic CT screening.
Subjects
Deep Learning , Muscle, Skeletal/diagnostic imaging , Radiographic Image Interpretation, Computer-Assisted/methods , Sarcopenia/diagnostic imaging , Tomography, X-Ray Computed/methods , Female , Follow-Up Studies , Humans , Male , Middle Aged , Muscle, Skeletal/pathology , Predictive Value of Tests , Reproducibility of Results , Retrospective Studies , Risk Assessment , Sarcopenia/pathology , Spine/diagnostic imaging
ABSTRACT
BACKGROUND. Calibrated CT fat fraction (FFCT) measurements derived from un-enhanced abdominal CT reliably reflect liver fat content, allowing large-scale population-level investigations of steatosis prevalence and associations. OBJECTIVE. The purpose of this study was to compare the prevalence of hepatic steatosis, as assessed by calibrated CT measurements, between population-based Chinese and U.S. cohorts, and to investigate in these populations the relationship of steatosis with age, sex, and body mass index (BMI). METHODS. This retrospective study included 3176 adults (1985 women and 1191 men) from seven Chinese provinces and 8748 adults (4834 women and 3914 men) from a single U.S. medical center, all drawn from previous studies. All participants were at least 40 years old and had undergone unenhanced abdominal CT in previous studies. Liver fat content measurements on CT were cross-calibrated to MRI proton density fat fraction measurements using phantoms and expressed as adjusted FFCT measurements. Mild, moderate, and severe steatosis were defined as adjusted FFCT of 5.0-14.9%, 15.0-24.9%, and 25.0% or more, respectively. The two cohorts were compared. RESULTS. In the Chinese and U.S. cohorts, the median adjusted FFCT for women was 4.7% and 4.8%, respectively, and that for men was 5.8% and 6.2%, respectively. In the Chinese and U.S. cohorts, steatosis prevalence for women was 46.3% and 48.7%, respectively, whereas that for men was 58.9% and 61.9%, respectively. Severe steatosis prevalence was 0.9% and 1.8% for women and 0.2% and 2.6% for men in the Chinese and U.S. cohorts, respectively. Adjusted FFCT did not vary across age decades among women or men in the Chinese cohort, although it increased across age decades among women and men in the U.S. cohort. Adjusted FFCT and BMI exhibited weak correlation (r = 0.312-0.431). Among participants with normal BMI, 36.8% and 38.5% of those in the Chinese and U.S. 
cohorts, respectively, had mild steatosis, and 3.0% and 1.5% of those in the Chinese and U.S. cohorts, respectively, had moderate or severe steatosis. Among U.S. participants with a BMI of 40.0 or greater, 17.7% had normal liver content. CONCLUSION. Steatosis and severe steatosis had higher prevalence in the U.S. cohort than in the Chinese cohort in both women and men. BMI did not reliably predict steatosis. CLINICAL IMPACT. The findings provide new information on the dependence of hepatic steatosis on age, sex, and BMI.
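The steatosis grades defined in the methods can be written as a small lookup. This is a sketch under stated assumptions: the function name is hypothetical, and values below 5.0% are labeled "normal" by inference from the abstract's reference to normal liver content rather than from an explicit definition.

```python
def steatosis_grade(adjusted_ffct_percent: float) -> str:
    """Grade steatosis from adjusted CT fat fraction (%), using the
    cut points stated in the abstract: 5.0-14.9 mild, 15.0-24.9 moderate,
    25.0 or more severe. Below 5.0 is treated as normal (an assumption)."""
    if adjusted_ffct_percent >= 25.0:
        return "severe"
    if adjusted_ffct_percent >= 15.0:
        return "moderate"
    if adjusted_ffct_percent >= 5.0:
        return "mild"
    return "normal"


# Median adjusted FFCT values reported for the two cohorts.
for label, value in [("Chinese women", 4.7), ("U.S. women", 4.8),
                     ("Chinese men", 5.8), ("U.S. men", 6.2)]:
    print(label, steatosis_grade(value))
```

The medians for women fall just under the 5.0% cut point and those for men just over it, which is consistent with the reported steatosis prevalence being below 50% in women and above 50% in men.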
Subjects
Fatty Liver , Tomography, X-Ray Computed , Adult , Body Mass Index , China/epidemiology , Fatty Liver/complications , Fatty Liver/diagnostic imaging , Fatty Liver/epidemiology , Female , Humans , Male , Prevalence , Retrospective Studies , Tomography, X-Ray Computed/methods
ABSTRACT
PURPOSE: To systematically investigate the influence of various data consistency layers and regularization networks with respect to variations in the training and test data domain, for sensitivity-encoded accelerated parallel MR image reconstruction. THEORY AND METHODS: Magnetic resonance (MR) image reconstruction is formulated as a learned unrolled optimization scheme with a down-up network as regularization and varying data consistency layers. The proposed networks are compared to other state-of-the-art approaches on the publicly available fastMRI knee and neuro dataset and tested for stability across different training configurations regarding anatomy and number of training samples. RESULTS: Data consistency layers and expressive regularization networks, such as the proposed down-up networks, form the cornerstone for robust MR image reconstruction. Physics-based reconstruction networks outperform post-processing methods substantially for R = 4 in all cases and for R = 8 when the training and test data are aligned. At R = 8, aligning training and test data is more important than architectural choices. CONCLUSION: In this work, we study how dataset sizes affect single-anatomy and cross-anatomy training of neural networks for MRI reconstruction. The study provides insights into the robustness, properties, and acceleration limits of state-of-the-art networks, and our proposed down-up networks. These key insights provide essential aspects to successfully translate learning-based MRI reconstruction to clinical practice, where we are confronted with limited datasets and various imaged anatomies.
Subjects
Image Processing, Computer-Assisted , Neurology , Acceleration , Magnetic Resonance Imaging , Neural Networks, Computer
ABSTRACT
OBJECTIVE: Metabolic syndrome describes a constellation of reversible cardiometabolic abnormalities associated with cardiovascular risk and diabetes. The present study investigates the use of fully automated abdominal CT-based biometric measures for opportunistic identification of metabolic syndrome in adults without symptoms. MATERIALS AND METHODS: International Diabetes Federation criteria were applied to a cohort of 9223 adults without symptoms who underwent unenhanced abdominal CT. After patients with insufficient clinical data for diagnosis were excluded, the final cohort consisted of 7785 adults (mean age, 57.0 years; 4361 women and 3424 men). Previously validated and fully automated CT-based algorithms for quantifying muscle, visceral and subcutaneous fat, liver fat, and abdominal aortic calcification were applied to this final cohort. RESULTS: A total of 738 subjects (9.5% of all subjects; mean age, 56.7 years; 372 women and 366 men) met the clinical criteria for metabolic syndrome. Subsequent major cardiovascular events occurred more frequently in the cohort with metabolic syndrome (p < 0.001). Significant differences were observed between the two groups for all CT-based biomarkers (p < 0.001). Univariate L1-level total abdominal fat (area under the ROC curve [AUROC] = 0.909; odds ratio [OR] = 27.2), L3-level skeletal muscle index (AUROC = 0.776; OR = 5.8), and volumetric liver attenuation (AUROC = 0.738; OR = 5.1) performed well when compared with abdominal aortic calcification scoring (AUROC = 0.578; OR = 1.6). An L1-level total abdominal fat threshold of 460.6 cm² was 80.1% sensitive and 85.4% specific for metabolic syndrome. For women, the AUROC was 0.930 when fat and muscle measures were combined. CONCLUSION: Fully automated quantitative tissue measures of fat, muscle, and liver derived from abdominal CT scans can help identify individuals who are at risk for metabolic syndrome.
These visceral measures can be opportunistically applied to CT scans obtained for other clinical indications, and they may ultimately provide a more direct and useful definition of metabolic syndrome.
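To show how the reported sensitivity and specificity interact with the cohort's prevalence, the sketch below applies Bayes' rule to estimate a positive predictive value for the 460.6 cm² fat threshold. The abstract does not report this value; the result is an illustration that assumes the stated sensitivity and specificity apply to the whole final cohort, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """PPV via Bayes' rule: true positives / (true + false positives)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures taken from the abstract: 738 of 7785 adults met the criteria;
# the L1 total abdominal fat threshold was 80.1% sensitive, 85.4% specific.
prevalence = 738 / 7785
ppv = positive_predictive_value(0.801, 0.854, prevalence)
print(f"prevalence = {prevalence:.3f}, estimated PPV = {ppv:.3f}")
```

Under these assumptions, roughly one in three adults above the threshold would meet the clinical criteria, which is typical for a screening measure applied at a prevalence near 10%.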
Subjects
Metabolic Syndrome/diagnostic imaging , Radiography, Abdominal , Tomography, X-Ray Computed , Adult , Aged , Body Composition , Cohort Studies , Female , Humans , Male , Mass Screening , Middle Aged , Sensitivity and Specificity
ABSTRACT
BACKGROUND. Hepatic attenuation at unenhanced CT is linearly correlated with the MRI proton density fat fraction (PDFF). Liver fat quantification at contrast-enhanced CT is more challenging. OBJECTIVE. The purpose of this article is to evaluate liver steatosis categorization on contrast-enhanced CT using a fully automated deep learning volumetric hepatosplenic segmentation algorithm and unenhanced CT as the reference standard. METHODS. A fully automated volumetric hepatosplenic segmentation algorithm using 3D convolutional neural networks was applied to unenhanced and contrast-enhanced series from a sample of 1204 healthy adults (mean age, 45.2 years; 726 women, 478 men) undergoing CT evaluation for renal donation. The mean volumetric attenuation was computed from all designated liver and spleen voxels. PDFF was estimated from unenhanced CT attenuation and served as the reference standard. Contrast-enhanced attenuations were evaluated for detecting PDFF thresholds of 5% (mild steatosis), 10%, and 15% (moderate steatosis); PDFF less than 5% was considered normal. RESULTS. Using unenhanced CT as reference, estimated PDFF was ≥ 5% (mild steatosis), ≥ 10%, and ≥ 15% (moderate steatosis) in 50.1% (n = 603), 12.5% (n = 151), and 4.8% (n = 58) of patients, respectively. ROC AUC values for predicting PDFF thresholds of 5%, 10%, and 15% using contrast-enhanced liver attenuation were 0.669, 0.854, and 0.962, respectively, and using contrast-enhanced liver-spleen attenuation difference were 0.662, 0.866, and 0.986, respectively. A total of 96.8% (90/93) of patients with contrast-enhanced liver attenuation less than 90 HU had steatosis (PDFF ≥ 5%); this threshold of less than 90 HU achieved sensitivity of 75.9% and specificity of 95.7% for moderate steatosis (PDFF ≥ 15%). Liver attenuation less than 100 HU achieved sensitivity of 34.0% and specificity of 94.2% for any steatosis (PDFF ≥ 5%).
A total of 93.8% (30/32) of patients with contrast-enhanced liver-spleen attenuation difference 10 HU or less had moderate steatosis (PDFF ≥ 15%); a liver-spleen difference less than 5 HU achieved sensitivity of 91.4% and specificity of 95.0% for moderate steatosis. Liver-spleen difference less than 10 HU achieved sensitivity of 29.5% and specificity of 95.5% for any steatosis (PDFF ≥ 5%). CONCLUSION. Contrast-enhanced volumetric hepatosplenic attenuation derived using a fully automated deep learning CT tool may allow objective categoric assessment of hepatic steatosis. Accuracy was better for moderate than mild steatosis. Further confirmation using different scanning protocols and vendors is warranted. CLINICAL IMPACT. If these results are confirmed in independent patient samples, this automated approach could prove useful for both individualized and population-based steatosis assessment.
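The contrast-enhanced cut points quoted above can be collected into a simple flagging function. This is a sketch, not a clinical tool or the authors' method: the function and key names are hypothetical, the liver-spleen difference is assumed to be liver minus spleen attenuation, and each threshold is reported separately rather than combined into a single rule.

```python
def steatosis_flags(liver_hu: float, spleen_hu: float) -> dict:
    """Apply the abstract's contrast-enhanced CT thresholds to mean
    volumetric liver and spleen attenuation (HU)."""
    difference = liver_hu - spleen_hu  # assumed sign convention
    return {
        # Liver attenuation thresholds
        "liver_below_100_any_steatosis": liver_hu < 100,
        "liver_below_90_moderate_steatosis": liver_hu < 90,
        # Liver-spleen difference thresholds
        "difference_below_10_any_steatosis": difference < 10,
        "difference_below_5_moderate_steatosis": difference < 5,
    }


# Two hypothetical patients.
print(steatosis_flags(liver_hu=85, spleen_hu=110))   # all flags True
print(steatosis_flags(liver_hu=120, spleen_hu=105))  # all flags False
```

All four thresholds were reported with specificity of about 94% to 96%, so a raised flag is informative, while the low sensitivity for any steatosis (34.0% and 29.5%) means an absent flag does not exclude mild disease.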