ABSTRACT
STUDY DESIGN: This is a retrospective, cross-sectional, population-based study in which facet joint (FJ) angles were measured automatically from T2-weighted axial magnetic resonance images (MRIs) of the lumbar spine using deep learning (DL). OBJECTIVE: This work aimed to introduce a semiautomatic framework that measures FJ angles using DL and to study facet tropism (FT) in a large Finnish population-based cohort. SUMMARY OF DATA: T2-weighted axial MRIs of the lumbar spine (L3/4 through L5/S1) for participants (n=1288) in the NFBC1966 Finnish population-based cohort were used for this study. MATERIALS AND METHODS: A DL model was developed and trained on the MRIs of 430 participants. The authors computed FJ angles from the model's predictions for each level, that is, L3/4 through L5/S1, for the male and female subgroups. Inter-rater and intrarater reliability were analyzed for 60 participants using annotations made by two radiologists and a musculoskeletal researcher. With the developed method, the authors examined FT in the entire NFBC1966 cohort, adopting the literature definitions of FT thresholds at 7° and 10°. Rater agreement was evaluated both for the annotations and for the FJ angles computed from the annotations. FJ asymmetry (the difference between the left and right FJ angles) was used to evaluate the agreement and correlation between the raters. Bland-Altman analysis was used to assess the agreement and systematic bias in the FJ asymmetry. The authors used the Dice score as the metric to compare the annotations between the raters. The authors evaluated the model predictions on the independent test set and compared them against the ground truth annotations. RESULTS: The model achieved a Dice score of 92.7±0.1 and an intersection over union of 87.1±0.2, aggregated across all the regions of interest, that is, vertebral body (VB), FJs, and posterior arch (PA). The mean FJ angles measured for the male and female subgroups were in agreement with the literature findings.
Intrarater reliability was high, with Dice scores of 97.3 (VB), 82.5 (FJ), and 90.3 (PA). Inter-rater reliability was better between the two radiologists, with Dice scores of 96.4 (VB), 75.5 (FJ), and 85.8 (PA), than between the radiologists and the musculoskeletal researcher. The prevalence of FT was higher in the male subgroup, and L4/5 was the most affected level. CONCLUSION: The authors developed a DL-based framework that enabled the study of FT in a large cohort. Using the proposed method, the authors present the prevalence of FT in a Finnish population-based cohort.
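
The quantities named in this abstract (Dice score, intersection over union, FT at a fixed angle threshold, and Bland-Altman bias with limits of agreement) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of boolean masks, the strict "greater than" comparison at the threshold, and the 1.96 multiplier for the limits of agreement are all assumptions made for illustration.

```python
import numpy as np


def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks (assumed 1.0 if both are empty)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def iou_score(mask_a, mask_b):
    """Intersection over union between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return np.logical_and(a, b).sum() / union


def has_facet_tropism(left_deg, right_deg, threshold_deg=7.0):
    """Flag FT when the absolute left-right angle difference exceeds the threshold.

    Assumption: a strict '>' comparison; the abstract only names the
    7 and 10 degree thresholds, not the exact inequality.
    """
    return abs(left_deg - right_deg) > threshold_deg


def bland_altman(values_a, values_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumption: conventional limits of agreement, bias +/- 1.96 * SD of
    the paired differences (sample SD, ddof=1).
    """
    diff = np.asarray(values_a, dtype=float) - np.asarray(values_b, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

For example, `has_facet_tropism(40.0, 50.0, threshold_deg=7.0)` flags tropism, while the same pair is not flagged at `threshold_deg=10.0` under the strict comparison assumed here.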