ABSTRACT
In this paper, we address a complex but practical scenario in Active Learning (AL) known as open-set AL, where the unlabeled data consists of both in-distribution (ID) and out-of-distribution (OOD) samples. Standard AL methods fail in this scenario because OOD samples are highly likely to be regarded as uncertain samples, leading to their selection and a wasted annotation budget. Existing methods focus on selecting samples that are highly likely to be ID, which tend to be easy and less informative. To this end, we introduce two criteria, namely contrastive confidence and historical divergence, which respectively measure the likelihood that a sample is ID and its hardness. By balancing the two proposed criteria, as many highly informative ID samples as possible can be selected. Furthermore, unlike previous methods that require additional neural networks to detect OOD samples, we propose a contrastive clustering framework that endows the classifier with the ability to identify OOD samples and further enhances the network's representation learning. The experimental results demonstrate that the proposed method achieves state-of-the-art performance on several benchmark datasets.
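As a rough sketch of how such a balanced acquisition score could look in code (this is an illustration only, not the authors' implementation; the normalization and the trade-off weight `lambda_` are assumptions):

```python
# Minimal sketch: combine an ID-likelihood score and a hardness score to rank
# unlabeled samples for annotation. Score names and weighting are illustrative.
import numpy as np

def select_queries(contrastive_confidence, historical_divergence, budget, lambda_=0.5):
    """contrastive_confidence: (N,) higher = more likely in-distribution (ID).
    historical_divergence:  (N,) higher = harder / more informative.
    budget: number of samples to query."""
    # Normalize both criteria to [0, 1] so they are comparable.
    c = (contrastive_confidence - contrastive_confidence.min()) / (np.ptp(contrastive_confidence) + 1e-8)
    h = (historical_divergence - historical_divergence.min()) / (np.ptp(historical_divergence) + 1e-8)
    # Favor samples that are both likely ID and hard; lambda_ balances the two.
    score = lambda_ * c + (1.0 - lambda_) * h
    return np.argsort(-score)[:budget]   # indices of the selected samples

# Toy usage: query 100 samples from 10,000 unlabeled candidates with random scores.
rng = np.random.default_rng(0)
idx = select_queries(rng.random(10000), rng.random(10000), budget=100)
```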
ABSTRACT
Graph Neural Networks (GNNs) have gained momentum in graph representation learning and boosted the state of the art in a variety of areas, such as data mining (e.g., social network analysis and recommender systems), computer vision (e.g., object detection and point cloud learning), and natural language processing (e.g., relation extraction and sequence learning), to name a few. With the emergence of Transformers in natural language processing and computer vision, graph Transformers embed a graph structure into the Transformer architecture to overcome the limitations of local neighborhood aggregation while avoiding strict structural inductive biases. In this paper, we present a comprehensive review of GNNs and graph Transformers in computer vision from a task-oriented perspective. Specifically, we divide their applications in computer vision into five categories according to the modality of input data, i.e., 2D natural images, videos, 3D data, vision + language, and medical images. In each category, we further divide the applications according to a set of vision tasks. Such a task-oriented taxonomy allows us to examine how each task is tackled by different GNN-based approaches and how well these approaches perform. Based on the necessary preliminaries, we provide the definitions and challenges of the tasks, in-depth coverage of the representative approaches, as well as discussions regarding insights, limitations, and future directions.
ABSTRACT
Reconstructing a 3D shape based on a single sketch image is challenging due to the inherent sparsity and ambiguity present in sketches. Existing methods lose fine details when extracting features to predict 3D objects from sketches. Upon analyzing the 3D-to-2D projection process, we observe that the density map, characterizing the distribution of 2D point clouds, can serve as a proxy to facilitate the reconstruction process. In this work, we propose a novel sketch-based 3D reconstruction model named SketchSampler. It initiates the process by translating a sketch through an image translation network into a more informative 2D representation, which is then used to generate a density map. Subsequently, a two-stage probabilistic sampling process is employed to reconstruct a 3D point cloud: firstly, recovering the 2D points (i.e., the x and y coordinates) by sampling the density map; and secondly, predicting the depth (i.e., the z coordinate) by sampling the depth values along the ray determined by each 2D point. Additionally, we convert the reconstructed point cloud into a 3D mesh for wider applications. To reduce ambiguity, we incorporate hidden lines in sketches. Experimental results demonstrate that our proposed approach significantly outperforms other baseline methods.
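A toy Python sketch of the two-stage sampling idea described above (tensor shapes, the `depth_logits` representation, and plain multinomial sampling are illustrative assumptions, not the paper's exact procedure):

```python
import torch

def sample_point_cloud(density_map, depth_logits, depth_bins, n_points=2048):
    """density_map: (H, W) non-negative 2D density.
    depth_logits: (H, W, B) unnormalized scores over B depth bins per pixel.
    depth_bins:   (B,) candidate depth values along each ray.
    Returns an (n_points, 3) point cloud in (x, y, z)."""
    H, W = density_map.shape
    # Stage 1: sample 2D locations (x, y) proportionally to the density map.
    probs_2d = (density_map / density_map.sum()).reshape(-1)
    flat_idx = torch.multinomial(probs_2d, n_points, replacement=True)
    ys, xs = flat_idx // W, flat_idx % W
    # Stage 2: for each sampled pixel, sample a depth z along its viewing ray.
    logits = depth_logits[ys, xs]                               # (n_points, B)
    z_idx = torch.multinomial(torch.softmax(logits, -1), 1).squeeze(-1)
    zs = depth_bins[z_idx]
    return torch.stack([xs.float(), ys.float(), zs], dim=-1)

# Toy usage with random predictions on a 64x64 map and 32 depth bins.
pc = sample_point_cloud(torch.rand(64, 64), torch.randn(64, 64, 32),
                        torch.linspace(0.5, 2.0, 32))
```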
ABSTRACT
Geometry- and appearance-controlled full-body human image generation is an interesting but challenging task. Existing solutions are either unconditional or dependent on coarse conditions (e.g., pose, text), thus lacking explicit geometry and appearance control of body and garment. Sketching offers such editing ability and has been adopted in various sketch-based face generation and editing solutions. However, directly adapting sketch-based face generation to full-body generation often fails to produce high-fidelity and diverse results due to the high complexity and diversity in the pose, body shape, and garment shape and texture. Recent geometrically controllable diffusion-based methods mainly rely on prompts to generate appearance. It is hard to balance the realism and the faithfulness of their results to the sketch when the input is coarse. This work presents Sketch2Human, the first system for controllable full-body human image generation guided by a semantic sketch (for geometry control) and a reference image (for appearance control). Our solution is based on the latent space of StyleGAN-Human with inverted geometry and appearance latent codes as input. Specifically, we present a sketch encoder trained with a large synthetic dataset sampled from StyleGAN-Human's latent space and directly supervised by sketches rather than real images. Considering the entangled information of partial geometry and texture in StyleGAN-Human and the absence of disentangled datasets, we design a novel training scheme that creates geometry-preserved and appearance-transferred training data to tune a generator to achieve disentangled geometry and appearance control. Although our method is trained with synthetic data, it can also handle hand-drawn sketches. Qualitative and quantitative evaluations demonstrate the superior performance of our method to state-of-the-art methods.
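For intuition, a heavily simplified sketch of style-mixing-style latent control, assuming a StyleGAN-like W+ space; the split point and the coarse/fine assignment of geometry versus appearance are illustrative assumptions rather than the paper's actual scheme:

```python
import torch

def mix_codes(w_geometry, w_appearance, split_layer=8):
    """w_geometry, w_appearance: (n_layers, 512) latent codes in W+.
    Coarse (early) layers mostly control layout/geometry, fine (late) layers
    mostly control texture/appearance, so each part is taken accordingly."""
    return torch.cat([w_geometry[:split_layer], w_appearance[split_layer:]], dim=0)

w_mix = mix_codes(torch.randn(18, 512), torch.randn(18, 512))  # fed to the generator
```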
ABSTRACT
BACKGROUND: This study aimed to investigate the positional consistency between the guidewire and the screw in spinal internal fixation surgery. METHODS: This study involved 64 patients who underwent robot-assisted thoracic or lumbar pedicle screw fixation surgery. Guidewires were inserted with the assistance of the TiRobot. Either cannulated screws or solid screws were inserted. Guidewire and screw accuracy was measured on CT images based on the Gertzbein and Robbins scale. The positional consistency between the guidewire and the screw was evaluated on fused CT images, which can graphically and quantitatively demonstrate the consistency. Consistency was graded using a system based on the maximum distance and angulation between the centerline of the guidewire and that of the screw within the pedicle region. RESULTS: A total of 322 screws were placed, including 206 cannulated and 116 solid screws. Based on the Gertzbein and Robbins scale, 97.5% of the guidewires were grade A, and 94.1% of the screws were grade A. Based on our guidewire-screw consistency scale, 85% of screws in the cannulated group and 69.8% in the solid group were grade A. Both solid and cannulated screws may deviate from the guidewire trajectory. Positional accuracy and guidewire-screw consistency in the solid screw group were significantly worse than in the cannulated screw group. The cortical bone of the pedicle has a positive guiding effect on both solid and cannulated screws. CONCLUSION: Pedicle screws may deviate from the planned trajectory despite the guidance of the guidewires. Solid screws show worse positional accuracy and guidewire-screw consistency than cannulated screws. TRIAL REGISTRATION: The study was retrospectively registered and approved by our center's institutional review board.
Subject(s)
Pedicle Screws; Robotic Surgical Procedures; Robotics; Surgery, Computer-Assisted; Humans; Robotic Surgical Procedures/methods; Spine; Surgery, Computer-Assisted/methods
ABSTRACT
Respiratory motion-induced vertebral movement can adversely affect intraoperative spine surgery, resulting in inaccurate positional information of the target region and unexpected damage during the operation. In this paper, we propose a novel deep learning architecture for respiratory motion prediction that can adapt to different patients. The proposed method utilizes an attention-based LSTM autoencoder (LSTM-AE) network that can be trained on few-shot datasets during the operation. To ensure real-time performance, a dimension reduction method based on the respiration-induced physical movement of the spine vertebral bodies is introduced. Data were collected from prone-positioned patients under general anaesthesia to validate the prediction accuracy and time efficiency of the LSTM-AE-based motion prediction method. The experimental results demonstrate that the presented method (RMSE: 4.39%) outperforms other methods in terms of accuracy within a learning time of 2 min. The maximum prediction errors under a latency of 333 ms with respect to the x, y, and z axes of the optical camera system were 0.13, 0.07, and 0.10 mm, respectively, within a motion range of 2 mm.
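A minimal PyTorch sketch of an attention-augmented LSTM autoencoder for this kind of short-horizon motion prediction (layer sizes, the single-step decoder, and dot-product attention are illustrative choices, not the exact architecture used in the study):

```python
import torch
import torch.nn as nn

class LSTMAEAttention(nn.Module):
    def __init__(self, n_features=3, hidden=32):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_features)

    def forward(self, x):                       # x: (B, T, 3) past displacements
        enc_out, (h, c) = self.encoder(x)       # enc_out: (B, T, H)
        dec_out, _ = self.decoder(h.transpose(0, 1), (h, c))   # one decoding step
        # Dot-product attention of the decoder state over all encoder states.
        attn = torch.softmax(torch.bmm(dec_out, enc_out.transpose(1, 2)), dim=-1)
        context = torch.bmm(attn, enc_out)      # (B, 1, H)
        return self.out(torch.cat([dec_out, context], dim=-1)).squeeze(1)  # (B, 3)

model = LSTMAEAttention()
pred = model(torch.randn(8, 50, 3))   # predict the next 3D displacement
```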
ABSTRACT
Neural radiance fields (NeRF) have shown great success in novel view synthesis. However, recovering high-quality details from real-world scenes is still challenging for the existing NeRF-based approaches, due to potentially imperfect calibration information and scene representation inaccuracy. Even with high-quality training frames, the synthetic novel views produced by NeRF models still suffer from notable rendering artifacts, such as noise and blur. To address this, we propose NeRFLiX, a general NeRF-agnostic restorer paradigm that learns a degradation-driven inter-viewpoint mixer. Specifically, we design a NeRF-style degradation modeling approach and construct large-scale training data, making it possible for deep neural networks to effectively remove NeRF-native rendering artifacts. Moreover, beyond degradation removal, we propose an inter-viewpoint aggregation framework that fuses highly related high-quality training images, pushing the performance of cutting-edge NeRF models to entirely new levels and producing highly photo-realistic synthetic views. Based on this paradigm, we further present NeRFLiX++ with a stronger two-stage NeRF degradation simulator and a faster inter-viewpoint mixer, achieving superior performance with significantly improved computational efficiency. Notably, NeRFLiX++ is capable of restoring photo-realistic ultra-high-resolution outputs from noisy low-resolution NeRF-rendered views. Extensive experiments demonstrate the excellent restoration ability of NeRFLiX++ on various novel view synthesis benchmarks.
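A toy sketch of the general idea of simulating render-like degradations on clean frames to build paired training data for a restorer (the blur-plus-noise model and its parameters are assumptions; the actual NeRF-style degradation modeling is far more elaborate):

```python
import torch
import torchvision.transforms.functional as TF

def degrade(frame, blur_sigma=1.5, noise_std=0.03):
    """frame: (3, H, W) clean image in [0, 1]; returns a pseudo NeRF-rendered view."""
    blurred = TF.gaussian_blur(frame, kernel_size=7, sigma=blur_sigma)  # rendering blur
    noisy = blurred + noise_std * torch.randn_like(blurred)             # rendering noise
    return noisy.clamp(0.0, 1.0)

clean = torch.rand(3, 256, 256)
degraded = degrade(clean)   # (degraded, clean) pair used to train the restorer
```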
ABSTRACT
OBJECTIVES: To develop an automatic computer-based method that can help clinicians assess spine growth potential based on EOS radiographs. METHODS: We developed a deep learning-based (DL) algorithm that mimics the human judgment process to automatically determine spine growth potential and the Risser sign from full-length spine EOS radiographs. A total of 3383 EOS cases were collected and used for training and testing of the algorithm. The completed DL algorithm then underwent clinical validation on an additional 440 cases and was compared with the evaluations of four clinicians. RESULTS: For the Risser sign, the weighted kappa value of our DL algorithm was 0.933, while that of the four clinicians ranged from 0.909 to 0.930. In the assessment of spine growth potential, the kappa value of our DL algorithm was 0.944, while the kappa values of the four clinicians were 0.916, 0.934, 0.911, and 0.920, respectively. Furthermore, our DL algorithm obtained a slightly higher accuracy (0.973) and Youden index (0.952) than the best values achieved by the four clinicians. In addition, the speed of our DL algorithm was 15.2 ± 0.3 s/40 cases, much faster than the inference speeds of the clinicians, which ranged from 177.2 ± 28.0 s/40 cases to 241.2 ± 64.1 s/40 cases. CONCLUSIONS: Our algorithm demonstrated comparable or even better performance than clinicians in assessing spine growth potential. This stable, efficient, and convenient algorithm is a promising approach to assist doctors in clinical practice and deserves further study. CLINICAL RELEVANCE STATEMENT: This method can quickly ascertain spine growth potential from EOS radiographs and holds promise for assisting busy doctors in certain clinical scenarios. KEY POINTS: • In the clinic, there is no available computer-based method that can automatically assess spine growth potential. • We developed a deep learning-based method that can automatically ascertain spine growth potential. • Compared with the clinicians, our algorithm achieved comparable results.
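For reference, a small Python sketch of the two headline metrics on made-up labels (the quadratic weighting for the weighted kappa is an assumption, as the study's weighting scheme is not stated):

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score, confusion_matrix

# Weighted kappa on an ordinal scale (e.g., Risser grades 0-5).
risser_true = np.array([0, 2, 3, 5, 4, 1, 2, 5])
risser_pred = np.array([0, 2, 4, 5, 4, 1, 2, 5])
kappa = cohen_kappa_score(risser_true, risser_pred, weights="quadratic")

# Youden index on the binary growth-potential decision (1 = growth potential present).
growth_true = np.array([1, 1, 0, 0, 1, 0, 1, 1])
growth_pred = np.array([1, 1, 0, 0, 1, 0, 0, 1])
tn, fp, fn, tp = confusion_matrix(growth_true, growth_pred).ravel()
youden = tp / (tp + fn) + tn / (tn + fp) - 1   # sensitivity + specificity - 1
print(f"weighted kappa={kappa:.3f}, Youden index={youden:.3f}")
```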
ABSTRACT
Transformer, the model of choice for natural language processing, has drawn scant attention from the medical imaging community. Given their ability to exploit long-term dependencies, transformers are promising for helping convolutional neural networks learn more contextualized visual representations. However, most recently proposed transformer-based segmentation approaches simply treat transformers as assisting modules that help encode global context into convolutional representations. To address this issue, we introduce nnFormer (i.e., not-another transFormer), a 3D transformer for volumetric medical image segmentation. nnFormer not only exploits the combination of interleaved convolution and self-attention operations, but also introduces local and global volume-based self-attention mechanisms to learn volume representations. Moreover, nnFormer uses skip attention to replace the traditional concatenation/summation operations in the skip connections of U-Net-like architectures. Experiments show that nnFormer significantly outperforms previous transformer-based counterparts by large margins on three public datasets. Compared to nnUNet, the most widely recognized convnet-based 3D medical segmentation model, nnFormer produces significantly lower HD95 and is much more computationally efficient. Furthermore, we show that nnFormer and nnUNet are highly complementary to each other in model ensembling. Code and models of nnFormer are available at https://git.io/JSf3i.
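A greatly simplified sketch of an interleaved convolution/self-attention block on volumetric features; for brevity the attention here is global over all voxels rather than nnFormer's local and global volume-based windows, and all sizes are illustrative:

```python
import torch
import torch.nn as nn

class ConvAttnBlock3D(nn.Module):
    def __init__(self, channels=32, heads=4):
        super().__init__()
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x):                     # x: (B, C, D, H, W)
        x = self.conv(x)                      # local features via convolution
        B, C, D, H, W = x.shape
        tokens = x.flatten(2).transpose(1, 2) # (B, D*H*W, C) voxel tokens
        t = self.norm(tokens)
        attn_out, _ = self.attn(t, t, t)      # self-attention across voxels
        tokens = tokens + attn_out            # residual connection
        return tokens.transpose(1, 2).reshape(B, C, D, H, W)

y = ConvAttnBlock3D()(torch.randn(2, 32, 8, 16, 16))
```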
ABSTRACT
Modeling 3D avatars benefits various application scenarios such as AR/VR, gaming, and filming. Character faces contribute significant diversity and vividness as a vital component of avatars. However, building 3D character face models usually requires a heavy workload with commercial tools, even for experienced artists. Various existing sketch-based tools fail to support amateurs in modeling diverse facial shapes and rich geometric details. In this paper, we present SketchMetaFace - a sketching system targeting amateur users to model high-fidelity 3D faces in minutes. We carefully design both the user interface and the underlying algorithm. First, curvature-aware strokes are adopted to better support the controllability of carving facial details. Second, to address the key problem of mapping a 2D sketch map to a 3D model, we develop a novel learning-based method termed "Implicit and Depth Guided Mesh Modeling" (IDGMM). It fuses the advantages of mesh, implicit, and depth representations to achieve high-quality results with high efficiency. In addition, to further support usability, we present a coarse-to-fine 2D sketching interface design and a data-driven stroke suggestion tool. User studies demonstrate the superiority of our system over existing modeling tools in terms of ease of use and the visual quality of results. Experimental analyses also show that IDGMM reaches a better trade-off between accuracy and efficiency.
ABSTRACT
The purpose of this study is to develop an automated method for identifying the menarche status of adolescents based on EOS radiographs. We designed a deep-learning-based algorithm that contains a region-of-interest detection network and a classification network. The algorithm was trained and tested on a retrospective dataset of 738 adolescent EOS cases using a five-fold cross-validation strategy and was subsequently tested on a clinical validation set of 259 adolescent EOS cases. On the clinical validation set, our algorithm achieved an accuracy of 0.942, a macro precision of 0.933, a macro recall of 0.938, and a macro F1-score of 0.935. The algorithm showed almost perfect performance in distinguishing between males and females, with the main classification errors occurring in females aged 12 to 14 years. Specifically, for females, the algorithm had an accuracy of 0.910, a sensitivity of 0.943, and a specificity of 0.855 in estimating menarche status, with an area under the curve of 0.959. The kappa value between the algorithm's predictions and the actual menarche status was 0.806, indicating strong agreement. This method can efficiently analyze EOS radiographs and identify the menarche status of adolescents. It is expected to become a routine clinical tool and to provide references for doctors' decisions under specific clinical conditions.
ABSTRACT
Delayed neurocognitive recovery (dNCR) is a common complication that occurs post-surgery, especially in elderly individuals. The soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) complex plays an essential role in various membrane fusion events, such as synaptic vesicle exocytosis and autophagosome-lysosome fusion. Although SNARE complex dysfunction has been observed in several neurodegenerative disorders, the causal link between SNARE-mediated membrane fusion and dNCR remains unclear. We previously demonstrated that surgical stimuli caused cognitive impairment in aged rats by inducing α-synuclein accumulation, inhibiting autophagy, and disrupting neurotransmitter release in hippocampal synaptosomes. Here, we evaluated the effects of propofol anesthesia plus surgery on learning and memory and investigated levels of SNARE proteins and chaperones in hippocampal synaptosomes. Aged rats that received propofol anesthesia and surgery exhibited learning and memory impairments in a Morris water maze test and decreased levels of synaptosome-associated protein 25, synaptobrevin/vesicle-associated membrane protein 2, and syntaxin 1. Levels of SNARE chaperones, including mammalian uncoordinated-18, complexins 1 and 2, cysteine string protein-α, and N-ethylmaleimide-sensitive factor, were all significantly decreased following anesthesia with surgical stress. However, the synaptic vesicle marker synaptophysin was unaffected. The autophagy-enhancer rapamycin attenuated structural and functional disturbances of the SNARE complex and ameliorated disrupted neurotransmitter release. Our results indicate that perturbations of SNARE proteins in hippocampal synaptosomes may underlie the occurrence of dNCR. Moreover, the protective effect of rapamycin may partially occur through recovery of SNARE structural and functional abnormalities. Our findings provide insight into the molecular mechanisms underlying dNCR.
ABSTRACT
We present a deep reinforcement learning method of progressive view inpainting for colored semantic point cloud scene completion under volume guidance, achieving high-quality scene reconstruction from only a single RGB-D image with severe occlusion. Our approach is end-to-end and consists of three modules: 3D scene volume reconstruction, 2D RGB-D and segmentation image inpainting, and multi-view selection for completion. Given a single RGB-D image, our method first predicts its semantic segmentation map and goes through the 3D volume branch to obtain a volumetric scene reconstruction as a guide for the subsequent view inpainting step, which fills in the missing information; the third step involves projecting the volume under the same view as the input, concatenating them to complete the current-view RGB-D and segmentation map, and integrating all RGB-D and segmentation maps into the point cloud. Since the occluded areas are unavailable, we resort to an A3C network to glance around and progressively pick the next best view for large-hole completion until the scene is adequately reconstructed while guaranteeing validity. All steps are learned jointly to achieve robust and consistent results. We perform qualitative and quantitative evaluations with extensive experiments on the 3D-FUTURE dataset, obtaining better results than the state of the art.
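A minimal sketch of an actor-critic head that picks the next view from a fixed candidate set given an embedding of the current reconstruction state (the state encoding, the candidate count, and the omitted A3C training loop are assumptions):

```python
import torch
import torch.nn as nn

class ViewPolicy(nn.Module):
    def __init__(self, state_dim=256, n_views=20):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU())
        self.policy_head = nn.Linear(128, n_views)   # actor: distribution over views
        self.value_head = nn.Linear(128, 1)          # critic: value of current state

    def forward(self, state):
        h = self.backbone(state)
        return torch.softmax(self.policy_head(h), dim=-1), self.value_head(h)

policy, value = ViewPolicy()(torch.randn(1, 256))
next_view = torch.multinomial(policy, 1)   # sample the next view to inpaint
```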
ABSTRACT
Large-scale datasets and deep generative models have enabled impressive progress in human face reenactment. Existing solutions for face reenactment have focused on processing real face images through facial landmarks with generative models. Unlike real human faces, artistic human faces (e.g., those in paintings and cartoons) often involve exaggerated shapes and various textures. Therefore, directly applying existing solutions to artistic faces often fails to preserve the characteristics of the original artistic faces (e.g., face identity and decorative lines along face contours) due to the domain gap between real and artistic faces. To address these issues, we present ReenactArtFace, the first effective solution for transferring the poses and expressions from human videos to various artistic face images. We achieve artistic face reenactment in a coarse-to-fine manner. First, we perform 3D artistic face reconstruction, which reconstructs a textured 3D artistic face through a 3D morphable model (3DMM) and a 2D parsing map from an input artistic image. The 3DMM can not only rig the expressions better than facial landmarks but also robustly render images under different poses and expressions as coarse reenactment results. However, these coarse results suffer from self-occlusions and lack contour lines. Second, we therefore perform artistic face refinement using a personalized conditional generative adversarial network (cGAN) fine-tuned on the input artistic image and the coarse reenactment results. For high-quality refinement, we propose a contour loss to supervise the cGAN to faithfully synthesize contour lines. Quantitative and qualitative experiments demonstrate that our method achieves better results than the existing solutions.
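An illustrative sketch of a contour loss based on Sobel edge maps; the edge extractor and the L1 distance are assumptions, not necessarily the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def contour_loss(generated, target):
    """generated, target: (B, 1, H, W) grayscale images in [0, 1]."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    def edges(img):
        gx = F.conv2d(img, kx, padding=1)        # horizontal gradient
        gy = F.conv2d(img, ky, padding=1)        # vertical gradient
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)
    return F.l1_loss(edges(generated), edges(target))

loss = contour_loss(torch.rand(2, 1, 128, 128), torch.rand(2, 1, 128, 128))
```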
ABSTRACT
BACKGROUND: This study aimed to evaluate the safety and efficacy of robot-assisted percutaneous pars-pedicle screw fixation surgery for treating Hangman's fracture. METHODS: The study involved 33 patients with Hangman's fracture who underwent robot-assisted fixation surgery using cannulated pars-pedicle screws through a percutaneous approach. The primary parameter evaluated was the accuracy of the screws according to the Gertzbein-Robbins scale, using postoperative CT images. Secondary parameters included the duration of surgery, intraoperative blood loss, postoperative hospital stay, and neurovascular injury. RESULTS: A total of 60 pars-pedicle screws were placed in 33 patients. Based on the Levine and Edwards classification, the patients included 12 cases of type I, 15 cases of type II, five cases of type IIa, and one atypical case. The average operative time was 92.4 ± 37.4 min, and the average blood loss was 22.4 ± 17.9 ml. Fifty-five of 60 screws were successfully placed within the bone. No screw-related neurovascular injury was observed, and satisfactory reduction was achieved in all cases. CONCLUSION: Robot-assisted percutaneous pars-pedicle screw fixation is a safe and feasible method for treating Hangman's fracture. TRIAL REGISTRATION: The study was retrospectively registered and approved by our center's institutional review board.
Subject(s)
Fractures, Bone; Pedicle Screws; Robotics; Spinal Fractures; Humans; Spinal Fractures/diagnostic imaging; Spinal Fractures/surgery; Fracture Fixation, Internal/methods; Fractures, Bone/diagnostic imaging; Fractures, Bone/surgery; Retrospective Studies
ABSTRACT
Chronic glaucoma is an eye disease with progressive optic nerve damage. It is the second leading cause of blindness after cataract and the leading cause of irreversible blindness. Glaucoma forecasting can predict the future eye state of a patient by analyzing historical fundus images, which is helpful for the early detection and intervention of potential patients and for avoiding the outcome of blindness. In this paper, we propose a GLaucoma forecast transformer based on Irregularly saMpled fundus images, named GLIM-Net, to predict the probability of developing glaucoma in the future. The main challenge is that existing fundus images are often sampled at irregular times, making it difficult to accurately capture the subtle progression of glaucoma over time. We therefore introduce two novel modules, namely time positional encoding and time-sensitive MSA (multi-head self-attention) modules, to address this challenge. Unlike many existing works that focus on prediction for an unspecified future time, we also propose an extended model that is further capable of prediction conditioned on a specific future time. The experimental results on the benchmark dataset SIGF show that the accuracy of our method outperforms the state-of-the-art models. In addition, ablation experiments confirm the effectiveness of the two proposed modules, which can provide a good reference for the optimization of Transformer models.
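One simple way to realize a time positional encoding for irregularly sampled visits is to drive a sinusoidal encoding with continuous timestamps rather than integer indices; the sketch below is illustrative and may differ from GLIM-Net's actual module:

```python
import torch

def time_positional_encoding(times, d_model=64):
    """times: (B, T) continuous timestamps; returns (B, T, d_model) encodings."""
    i = torch.arange(0, d_model, 2, dtype=torch.float32)
    freqs = 1.0 / (10000.0 ** (i / d_model))                # (d_model/2,)
    angles = times.unsqueeze(-1) * freqs                    # (B, T, d_model/2)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

# Irregularly sampled visits at 0, 3, 7.5, and 20 months for one patient.
pe = time_positional_encoding(torch.tensor([[0.0, 3.0, 7.5, 20.0]]))
```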
Subject(s)
Glaucoma; Humans; Glaucoma/diagnostic imaging; Fundus Oculi; Blindness
ABSTRACT
INTRODUCTION: The application of robotic navigation during spine surgery has advanced rapidly over the past two decades, especially in the last 5 years. Robotic systems in spine surgery may offer potential advantages for both patients and surgeons. This article serves as an update to our previous review and explores the current status of spine surgery robots in clinical settings. AREAS COVERED: We evaluated the literature published from 2020 to 2022 on the outcomes of robotics-assisted spine surgery, including accuracy and its influencing factors, radiation exposure, and follow-up results. EXPERT OPINION: The application of robotics in spine surgery has driven spine surgery into a new era of precision treatment through a form of artificial intelligence assistance that compensates for the limitations of human abilities. Modularized robot configurations, intelligent alignment and planning incorporating multimodal images, efficient and simple human-machine interaction, accurate surgical status monitoring, and safe control strategies are the main technical features for the development of orthopedic surgical robots. The use of robotics-assisted decompression, osteotomies, and decision-making warrants further study. Future investigations should focus on patients' needs while continuing to explore in-depth medical-industrial collaborative development innovations that improve the overall utilization of artificial intelligence and sophistication in disease treatment.
Subject(s)
Robotic Surgical Procedures; Robotics; Surgery, Computer-Assisted; Humans; Artificial Intelligence; Robotic Surgical Procedures/methods; Spine/surgery; Surgery, Computer-Assisted/methods
ABSTRACT
Domain Adaptive Object Detection (DAOD) focuses on improving the generalization ability of object detectors via knowledge transfer. Recent advances in DAOD strive to shift the emphasis of the adaptation process from global to local by virtue of fine-grained feature alignment methods. However, both global and local alignment approaches fail to capture the topological relations among different foreground objects, as the explicit dependencies and interactions between and within domains are neglected. In this case, only seeking one-vs-one alignment does not necessarily ensure precise knowledge transfer. Moreover, conventional alignment-based approaches may be vulnerable to catastrophic overfitting on less transferable regions (e.g., backgrounds) due to the accumulation of inaccurate localization results in the target domain. To remedy these issues, we first formulate DAOD as an open-set domain adaptation problem, in which the foregrounds and backgrounds are seen as the "known classes" and the "unknown class," respectively. Accordingly, we propose a new and general framework for DAOD, named Foreground-aware Graph-based Relational Reasoning (FGRR), which incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations in both pixel and semantic spaces, thereby endowing the DAOD model with the capability of relational reasoning beyond the popular alignment-based paradigm. FGRR first identifies the foreground pixels and regions by searching for reliable correspondence and by cross-domain similarity regularization, respectively. The inter-domain visual and semantic correlations are hierarchically modeled via bipartite graph structures, and the intra-domain relations are encoded via graph attention mechanisms. Through message passing, each node aggregates semantic and contextual information from the same and the opposite domain to substantially enhance its expressive power. Empirical results demonstrate that the proposed FGRR exceeds state-of-the-art performance on four DAOD benchmarks.
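A toy sketch of cross-domain message passing over foreground node features via a bipartite attention layer (feature sizes and the single-layer design are illustrative; FGRR's pixel- and semantic-level graphs are more involved):

```python
import torch
import torch.nn as nn

class CrossDomainAttention(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, target_nodes, source_nodes):
        """target_nodes: (Nt, D), source_nodes: (Ns, D) foreground node features."""
        scores = self.q(target_nodes) @ self.k(source_nodes).t()       # (Nt, Ns) bipartite edges
        attn = torch.softmax(scores / target_nodes.shape[-1] ** 0.5, dim=-1)
        return target_nodes + attn @ self.v(source_nodes)              # aggregate messages

updated = CrossDomainAttention()(torch.randn(12, 256), torch.randn(15, 256))
```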
ABSTRACT
Recently, deep neural networks, which require a large amount of annotated samples, have been widely applied to nuclei instance segmentation of H&E-stained pathology images. However, it is inefficient and unnecessary to label all pixels for a dataset of nuclei images, which usually contain similar and redundant patterns. Although unsupervised and semi-supervised learning methods have been studied for nuclei segmentation, very few works have delved into the selective labeling of samples to reduce the annotation workload. Thus, in this paper, we propose a novel full nuclei segmentation framework that chooses only a few image patches to be annotated, augments the training set from the selected samples, and achieves nuclei segmentation in a semi-supervised manner. In the proposed framework, we first develop a novel consistency-based patch selection method to determine which image patches are the most beneficial for training. We then introduce a conditional single-image GAN with a component-wise discriminator to synthesize more training samples. Lastly, our proposed framework trains an existing segmentation model with the augmented samples above. The experimental results show that our proposed method can achieve the same level of performance as a fully supervised baseline by annotating less than 5% of the pixels on some benchmarks.
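A rough sketch of consistency-based patch selection: score each unlabeled patch by the variance of predictions across perturbed copies and pick the least consistent ones (the perturbation, the variance criterion, and the stand-in model are assumptions, not the paper's method):

```python
import torch

@torch.no_grad()
def select_patches(model, patches, n_select=10, n_aug=4, noise_std=0.05):
    """patches: (N, C, H, W) unlabeled image patches; returns indices to annotate."""
    scores = []
    for patch in patches:
        preds = []
        for _ in range(n_aug):
            aug = patch + noise_std * torch.randn_like(patch)   # simple perturbation
            preds.append(torch.sigmoid(model(aug.unsqueeze(0))))
        preds = torch.stack(preds)                # (n_aug, 1, 1, H, W)
        scores.append(preds.var(dim=0).mean())    # high variance = low consistency
    return torch.argsort(torch.stack(scores), descending=True)[:n_select]

seg_model = torch.nn.Conv2d(3, 1, kernel_size=3, padding=1)   # stand-in segmentation model
idx = select_patches(seg_model, torch.rand(50, 3, 64, 64))
```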
Subject(s)
Cell Nucleus; Neural Networks, Computer; Supervised Machine Learning
ABSTRACT
BACKGROUND: To evaluate the accuracy of screw placement using the TiRobot surgical robot in the Harms procedure and to assess the clinical outcomes of this technique. METHODS: This retrospective study included 21 patients with atlantoaxial instability treated by posterior atlantoaxial internal fixation (Harms procedure) using the TiRobot surgical robot between March 2016 and June 2021. The precision of screw placement, perioperative parameters and clinical outcomes were recorded. Screw placement was assessed based on intraoperative guiding pin accuracy measurements on intraoperative C-arm cone-beam computed tomography (CT) images using overlay technology and the incidence of screw encroachment identified on CT images. RESULTS: Among the 21 patients, the mean age was 44.8 years, and the causes of atlantoaxial instability were os odontoideum (n = 11), rheumatoid arthritis (n = 2), unknown pathogenesis (n = 3), and type II odontoid fracture (n = 5). A total of 82 screws were inserted with robotic assistance. From intraoperative guiding pin accuracy measurements, the average translational and angular deviations were 1.52 ± 0.35 mm (range 1.14-2.25 mm) and 2.25° ± 0.45° (range 1.73°-3.20°), respectively. Screw placement was graded as A for 80.5% of screws, B for 15.9%, and C for 3.7%. No complications related to screw misplacement were observed. After the 1-year follow-up, all patients with a neurological deficit experienced neurological improvement based on Nurick Myelopathy Scale scores, and all patients with preoperative neck pain reported improvement based on Visual Analog Scale scores. CONCLUSIONS: Posterior atlantoaxial internal fixation using the Harms technique assisted by a 3D-based navigation robot is safe, accurate, and effective for treating atlantoaxial instability.