ABSTRACT
Recent whole-brain mapping projects are collecting large-scale three-dimensional images using modalities such as serial two-photon tomography, fluorescence micro-optical sectioning tomography, light-sheet fluorescence microscopy, volumetric imaging with synchronous on-the-fly scan and readout, or magnetic resonance imaging. Registration of these multi-dimensional whole-brain images onto a standard atlas is essential for characterizing neuron types and constructing brain wiring diagrams. However, cross-modal image registration is challenging due to intrinsic variations of brain anatomy and artifacts resulting from different sample preparation methods and imaging modalities. We introduce a cross-modal registration method, mBrainAligner, which uses coherent landmark mapping and deep neural networks to align whole mouse brain images to the standard Allen Common Coordinate Framework atlas. We built a brain atlas for the fluorescence micro-optical sectioning tomography modality to facilitate single-cell mapping, and used our method to generate a whole-brain map of three-dimensional single-neuron morphology and neuron cell types.
Subject(s)
Brain/cytology, Brain/diagnostic imaging, Imaging, Three-Dimensional/methods, Algorithms, Animals, Deep Learning, Magnetic Resonance Imaging, Male, Mice, Inbred C57BL, Workflow
ABSTRACT
BACKGROUND: For medical diagnosis, clinicians typically begin with a patient's chief concerns, followed by questions about symptoms and medical history, physical examinations, and requests for necessary auxiliary examinations to gather comprehensive medical information. This complex medical investigation process has yet to be modeled by existing artificial intelligence (AI) methodologies. OBJECTIVE: The aim of this study was to develop an AI-driven medical inquiry assistant for clinical diagnosis that provides inquiry recommendations by simulating clinicians' medical investigating logic via reinforcement learning. METHODS: We compiled multicenter, deidentified outpatient electronic health records from 76 hospitals in Shenzhen, China, spanning the period from July to November 2021. These records consisted of both unstructured textual information and structured laboratory test results. We first performed feature extraction and standardization using natural language processing techniques and then used a reinforcement learning actor-critic framework to explore the rational and effective inquiry logic. To align the inquiry process with actual clinical practice, we segmented the inquiry into 4 stages: inquiring about symptoms and medical history, conducting physical examinations, requesting auxiliary examinations, and terminating the inquiry with a diagnosis. External validation was conducted to validate the inquiry logic of the AI model. RESULTS: This study focused on 2 retrospective inquiry-and-diagnosis tasks in the emergency and pediatrics departments. The emergency departments provided records of 339,020 consultations including mainly children (median age 5.2, IQR 2.6-26.1 years) with various types of upper respiratory tract infections (250,638/339,020, 73.93%). The pediatrics department provided records of 561,659 consultations, mainly of children (median age 3.8, IQR 2.0-5.7 years) with various types of upper respiratory tract infections (498,408/561,659, 88.73%). When conducting its own inquiries in both scenarios, the AI model demonstrated high diagnostic performance, with areas under the receiver operating characteristic curve of 0.955 (95% CI 0.953-0.956) and 0.943 (95% CI 0.941-0.944), respectively. When the AI model was used in a simulated collaboration with physicians, it notably reduced the average number of physicians' inquiries to 46% (6.037/13.26; 95% CI 6.009-6.064) and 43% (6.245/14.364; 95% CI 6.225-6.269) while achieving areas under the receiver operating characteristic curve of 0.972 (95% CI 0.970-0.973) and 0.968 (95% CI 0.967-0.969) in the scenarios. External validation revealed a normalized Kendall τ distance of 0.323 (95% CI 0.301-0.346), indicating the inquiry consistency of the AI model with physicians. CONCLUSIONS: This retrospective analysis of predominantly respiratory pediatric presentations in emergency and pediatrics departments demonstrated that an AI-driven diagnostic assistant had high diagnostic performance both in stand-alone use and in simulated collaboration with clinicians. Its investigation process was found to be consistent with the clinicians' medical investigation logic. These findings highlight the diagnostic assistant's promise in assisting the decision-making processes of health care professionals.
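The staged inquiry policy described above can be sketched with a minimal advantage actor-critic setup in which a stage-dependent action mask restricts the agent to the current inquiry stage (symptoms and history, physical examination, auxiliary examination, or diagnosis). All names, network sizes, and the reward handling below are hypothetical placeholders used only to illustrate the general mechanism, not the authors' implementation.

    # Minimal, illustrative actor-critic sketch for staged inquiry recommendation.
    import torch
    import torch.nn as nn

    N_FINDINGS, N_DIAGNOSES = 200, 30   # assumed sizes: inquiry items and terminal diagnoses

    class InquiryActorCritic(nn.Module):
        def __init__(self, n_obs=N_FINDINGS, n_act=N_FINDINGS + N_DIAGNOSES, hidden=256):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(n_obs, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden), nn.ReLU())
            self.actor = nn.Linear(hidden, n_act)   # which item to ask next, or a diagnosis
            self.critic = nn.Linear(hidden, 1)      # state value

        def forward(self, obs, stage_mask):
            # obs: (B, N_FINDINGS) findings gathered so far; stage_mask: (B, n_act) bool
            # marking the actions allowed in the current inquiry stage.
            h = self.body(obs)
            logits = self.actor(h).masked_fill(~stage_mask, -1e9)
            return torch.distributions.Categorical(logits=logits), self.critic(h).squeeze(-1)

    def a2c_update(model, optimizer, obs, stage_mask, action, reward, next_value, gamma=0.99):
        dist, value = model(obs, stage_mask)
        advantage = reward + gamma * next_value - value
        loss = (-dist.log_prob(action) * advantage.detach()
                + 0.5 * advantage.pow(2) - 0.01 * dist.entropy()).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()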
Subject(s)
Artificial Intelligence, Electronic Health Records, Humans, Electronic Health Records/statistics & numerical data, Algorithms, China, Retrospective Studies, Emergency Service, Hospital/statistics & numerical data
ABSTRACT
SUMMARY: Recent whole-brain mapping projects are collecting increasingly large sets of high-resolution brain images using a variety of imaging, labeling and sample preparation techniques. Both mining and analysis of these data require reliable and robust cross-modal registration tools. We recently developed mBrainAligner, a pipeline for performing cross-modal registration of the whole mouse brain. However, using this tool requires scripting or command-line skills to assemble and configure the different modules of mBrainAligner to accommodate different registration requirements and platform settings. In this application note, we present mBrainAligner-Web, a web server with a user-friendly interface that allows users to configure and run mBrainAligner locally or remotely across platforms. AVAILABILITY AND IMPLEMENTATION: mBrainAligner-Web is available at http://mbrainaligner.ahu.edu.cn/ with source code at https://github.com/reaneyli/mBrainAligner-web. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Computers, Software, Animals, Mice, Brain/diagnostic imaging
ABSTRACT
OBJECTIVES: To assess the feasibility of deep learning-based diagnostic models for detecting and assessing lower-extremity fatigue fracture severity on plain radiographs. METHODS: This retrospective study enrolled 1151 X-ray images (tibiofibula/foot: 682/469) of fatigue fractures and 2842 X-ray images (tibiofibula/foot: 2000/842) without abnormal presentations from two clinical centers. After labeling the lesions, images from one center (tibiofibula/foot: 2539/1180) were allocated at 7:1:2 for model construction, and the remaining images from the other center (tibiofibula/foot: 143/131) were reserved for external validation. A ResNet-50 and a triplet branch network were adopted to construct diagnostic models for detection and grading. The detection models were evaluated with sensitivity, specificity, and area under the receiver operating characteristic curve (AUC), while the grading models were evaluated with accuracy from the confusion matrix. Visual estimations by radiologists were performed for comparison with the models. RESULTS: For the detection model on tibiofibula, a sensitivity of 95.4%/85.5%, a specificity of 80.1%/77.0%, and an AUC of 0.965/0.877 were achieved in the internal testing/external validation set. The detection model on foot reached a sensitivity of 96.4%/90.8%, a specificity of 76.0%/66.7%, and an AUC of 0.947/0.911. The detection models showed performance superior to the junior radiologist and comparable to the intermediate or senior radiologist. The overall accuracy of the grading model was 78.5%/62.9% for tibiofibula and 74.7%/61.1% for foot in the internal testing/external validation set. CONCLUSIONS: The deep learning-based models could be applied to the radiological diagnosis of plain radiographs to assist in the detection and grading of fatigue fractures of the tibiofibula and foot. KEY POINTS: • Fatigue fractures on radiographs are relatively difficult to detect and apt to be misdiagnosed. • Detection and grading models based on deep learning were constructed on a large cohort of radiographs with lower-extremity fatigue fractures. • The detection model with high sensitivity would help to reduce the misdiagnosis of lower-extremity fatigue fractures.
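As a rough illustration of the detection branch, a two-class ResNet-50 classifier evaluated with ROC AUC could be assembled as below; the weights, preprocessing, and function names are assumptions for the sketch, not the paper's actual configuration.

    # Sketch of a fracture-vs-normal detector with ROC AUC evaluation.
    import torch
    import torch.nn as nn
    import torchvision.models as models
    from sklearn.metrics import roc_auc_score

    def build_detector(num_classes=2):
        net = models.resnet50(weights=None)   # or ImageNet weights for transfer learning
        net.fc = nn.Linear(net.fc.in_features, num_classes)
        return net

    @torch.no_grad()
    def evaluate_auc(net, loader, device="cpu"):
        net.eval()
        scores, labels = [], []
        for images, targets in loader:        # targets: 1 = fatigue fracture, 0 = normal
            probs = torch.softmax(net(images.to(device)), dim=1)[:, 1]
            scores.extend(probs.cpu().tolist())
            labels.extend(targets.tolist())
        return roc_auc_score(labels, scores)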
Subject(s)
Deep Learning, Fractures, Stress, Humans, Retrospective Studies, Radiography, Extremities
ABSTRACT
Supervised machine learning methods have been widely developed for segmentation tasks in recent years. However, the quality of labels has high impact on the predictive performance of these algorithms. This issue is particularly acute in the medical image domain, where both the cost of annotation and the inter-observer variability are high. Different human experts contribute estimates of the "actual" segmentation labels in a typical label acquisition process, influenced by their personal biases and competency levels. The performance of automatic segmentation algorithms is limited when these noisy labels are used as the expert consensus label. In this work, we use two coupled CNNs to jointly learn, from purely noisy observations alone, the reliability of individual annotators and the expert consensus label distributions. The separation of the two is achieved by maximally describing the annotator's "unreliable behavior" (we call it "maximally unreliable") while achieving high fidelity with the noisy training data. We first create a toy segmentation dataset using MNIST and investigate the properties of the proposed algorithm. We then use three public medical imaging segmentation datasets to demonstrate our method's efficacy, including both simulated (where necessary) and real-world annotations: 1) ISBI2015 (multiple-sclerosis lesions); 2) BraTS (brain tumors); 3) LIDC-IDRI (lung abnormalities). Finally, we create a real-world multiple sclerosis lesion dataset (QSMSC at UCL: Queen Square Multiple Sclerosis Center at UCL, UK) with manual segmentations from 4 different annotators (3 radiologists with different level skills and 1 expert to generate the expert consensus label). In all datasets, our method consistently outperforms competing methods and relevant baselines, especially when the number of annotations is small and the amount of disagreement is large. The studies also reveal that the system is capable of capturing the complicated spatial characteristics of annotators' mistakes.
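One simplified way to see the coupling is to give each annotator a learned, row-stochastic confusion matrix that maps the network's consensus label distribution to that annotator's noisy label distribution, and to add a trace penalty so that as much disagreement as possible is attributed to the annotators rather than to the consensus. The sketch below is a global (non-spatial) simplification with hypothetical names; the method summarized above learns spatially varying annotator characteristics with a second, coupled CNN.

    # Simplified coupling of a consensus segmentation network with per-annotator
    # confusion matrices; illustrative only.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AnnotatorNoiseModel(nn.Module):
        def __init__(self, n_classes, n_annotators):
            super().__init__()
            # unconstrained parameters turned into row-stochastic confusion matrices by softmax
            self.cm_logits = nn.Parameter(torch.eye(n_classes).repeat(n_annotators, 1, 1) * 5.0)

        def forward(self, true_probs, annotator_id):
            cm = torch.softmax(self.cm_logits[annotator_id], dim=-1)  # (C, C), rows sum to 1
            # noisy class probabilities at every pixel: p_noisy(d) = sum_c p_true(c) * CM[c, d]
            noisy_probs = torch.einsum("bchw,cd->bdhw", true_probs, cm)
            return noisy_probs, cm

    def noisy_label_loss(seg_logits, noisy_labels, noise_model, annotator_id, trace_weight=0.1):
        true_probs = torch.softmax(seg_logits, dim=1)                 # (B, C, H, W)
        noisy_probs, cm = noise_model(true_probs, annotator_id)
        nll = F.nll_loss(torch.log(noisy_probs.clamp_min(1e-7)), noisy_labels)
        return nll + trace_weight * torch.diagonal(cm).sum()          # fit noise + trace penalty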
ABSTRACT
OBJECTIVES: The molecular subtyping of diffuse gliomas is important. The aim of this study was to establish predictive models based on preoperative multiparametric MRI. METHODS: A total of 1016 diffuse glioma patients were retrospectively collected from Beijing Tiantan Hospital. Patients were randomly divided into the training (n = 780) and validation (n = 236) sets. According to the 2016 WHO classification, the molecular subtyping of diffuse gliomas was formulated as four binary classification tasks (tasks I-IV). Predictive models based on radiomics and a deep convolutional neural network (DCNN) were developed respectively, and their performances were compared with receiver operating characteristic (ROC) curves. Additionally, the radiomics and DCNN features were visualized and compared with the t-distributed stochastic neighbor embedding technique and Spearman's correlation test. RESULTS: In the training set, areas under the curves (AUCs) of the DCNN models (ranging from 0.99 to 1.00) outperformed the radiomics models in all tasks, and the accuracies of the DCNN models (ranging from 0.90 to 0.94) outperformed the radiomics models in tasks I, II, and III. In the independent validation set, the accuracies of the DCNN models outperformed the radiomics models in all tasks (0.74-0.83), and the AUCs of the DCNN models (0.85-0.89) outperformed the radiomics models in tasks I, II, and III. The DCNN features demonstrated superior discriminative capability to the radiomics features in the feature visualization analysis, and their general correlations were weak. CONCLUSIONS: Both the radiomics and DCNN models could preoperatively predict the molecular subtypes of diffuse gliomas, and the latter performed better in most circumstances. KEY POINTS: • The molecular subtypes of diffuse gliomas could be predicted with MRI. • Deep learning features tend to outperform radiomics features in large cohorts. • The correlation between the radiomics features and DCNN features was low.
Subject(s)
Deep Learning, Glioma, Glioma/diagnostic imaging, Humans, Magnetic Resonance Imaging, Magnetic Resonance Spectroscopy, Retrospective Studies
ABSTRACT
OBJECTIVES: Most countries have adopted public activity intervention policies to control the coronavirus disease 2019 (COVID-19) pandemic. Nevertheless, empirical evidence of the effectiveness of different interventions on the containment of the epidemic was inconsistent. METHODS: We retrieved time-series intervention policy data for 145 countries from the Oxford COVID-19 Government Response Tracker from December 31, 2019, to July 1, 2020, which included 8 containment and closure policies. We investigated the association of timeliness, stringency, and duration of intervention with cumulative infections per million population on July 1, 2020. We introduced a novel counterfactual estimator to estimate the effects of these interventions on the COVID-19 time-varying reproduction number (Rt). RESULTS: There is some evidence that earlier implementation, longer duration, and greater stringency of intervention policies at the early but not the middle stage were associated with reduced COVID-19 infections. The counterfactual model proved to have controlled for unobserved time-varying confounders and established a valid causal relationship between policy intervention and Rt reduction. The average intervention effect revealed that all interventions significantly decreased Rt after their implementation; Rt decreased by 30% (22%-41%) 25 to 32 days after policy intervention. Among the 8 interventions, school closing, workplace closing, and public event cancellation demonstrated the strongest and most consistent evidence of association. CONCLUSIONS: Our study provides more reliable evidence of the quantitative effects of policy interventions on the COVID-19 epidemic and suggests that stricter public activity interventions should be implemented at the early stage of the epidemic for improved containment.
Subject(s)
COVID-19, Influenza, Human, COVID-19/epidemiology, COVID-19/prevention & control, Health Policy, Humans, Influenza, Human/epidemiology, Pandemics/prevention & control, Schools
ABSTRACT
BACKGROUND: Colposcopy diagnosis and directed biopsy are the key components in cervical cancer screening programs. However, their performance is limited by the requirement for experienced colposcopists. This study aimed to develop and validate a Colposcopic Artificial Intelligence Auxiliary Diagnostic System (CAIADS) for grading colposcopic impressions and guiding biopsies. METHODS: Anonymized digital records of 19,435 patients were obtained from six hospitals across China. These records included colposcopic images, clinical information, and pathological results (gold standard). The data were randomly assigned (7:1:2) to a training and a tuning set for developing CAIADS and to a validation set for evaluating performance. RESULTS: The agreement between CAIADS-graded colposcopic impressions and pathology findings was higher than that of colposcopies interpreted by colposcopists (82.2% versus 65.9%, kappa 0.750 versus 0.516, p < 0.001). For detecting pathological high-grade squamous intraepithelial lesion or worse (HSIL+), CAIADS showed higher sensitivity than the use of colposcopies interpreted by colposcopists at either biopsy threshold (low-grade or worse 90.5%, 95% CI 88.9-91.4% versus 83.5%, 81.5-85.3%; high-grade or worse 71.9%, 69.5-74.2% versus 60.4%, 57.9-62.9%; all p < 0.001), whereas the specificities were similar (low-grade or worse 51.8%, 49.8-53.8% versus 52.0%, 50.0-54.1%; high-grade or worse 93.9%, 92.9-94.9% versus 94.9%, 93.9-95.7%; all p > 0.05). The CAIADS also demonstrated a superior ability in predicting biopsy sites, with a median mean-intersection-over-union (mIoU) of 0.758. CONCLUSIONS: The CAIADS has potential in assisting beginners and for improving the diagnostic quality of colposcopy and biopsy in the detection of cervical precancer/cancer.
Subject(s)
Artificial Intelligence, Carcinoma, Squamous Cell/diagnosis, Colposcopy/methods, Early Detection of Cancer/methods, Uterine Cervical Neoplasms/diagnosis, Adult, Aged, Biopsy/methods, Biopsy/statistics & numerical data, Carcinoma, Squamous Cell/pathology, Carcinoma, Squamous Cell/prevention & control, China/epidemiology, Colposcopy/statistics & numerical data, Data Accuracy, Diagnostic Tests, Routine/methods, Early Detection of Cancer/statistics & numerical data, Female, Humans, Middle Aged, Neoplasm Grading/methods, Predictive Value of Tests, Pregnancy, Reproducibility of Results, Uterine Cervical Neoplasms/pathology, Uterine Cervical Neoplasms/prevention & control, Young Adult
ABSTRACT
Rare diseases are characterized by low prevalence and are often chronically debilitating or life-threatening. Imaging phenotype classification of rare diseases is challenging due to the severe shortage of training examples. Few-shot learning (FSL) methods tackle this challenge by extracting generalizable prior knowledge from a large base dataset of common diseases and normal controls and transferring the knowledge to rare diseases. Yet, most existing methods require the base dataset to be labeled and do not make full use of the precious examples of rare diseases. In addition, the extremely small size of the training samples may result in inter-class performance imbalance due to insufficient sampling of the true distributions. To this end, we propose in this work a novel hybrid approach to rare disease imaging phenotype classification, featuring three key novelties targeted at the above drawbacks. First, we adopt unsupervised representation learning (URL) based on a self-supervised contrastive loss, thereby eliminating the overhead of labeling the base dataset. Second, we integrate the URL with pseudo-label supervised classification for effective self-distillation of the knowledge about the rare diseases, composing a hybrid approach that takes advantage of both unsupervised and (pseudo-) supervised learning on the base dataset. Third, we use the feature dispersion to assess the intra-class diversity of training samples, to alleviate the inter-class performance imbalance via dispersion-aware correction. Experimental results of imaging phenotype classification of both simulated (skin lesions and cervical smears) and real clinical rare diseases (retinal diseases) show that our hybrid approach substantially outperforms existing FSL methods (including those using a fully supervised base dataset) via effective integration of the URL, pseudo-label driven self-distillation, and dispersion-aware imbalance correction, thus establishing a new state of the art.
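The unsupervised representation learning step could, for instance, use a generic SimCLR-style contrastive loss on two augmented views of each unlabeled base-dataset image; the loss below (NT-Xent) and its temperature are stand-ins, not necessarily the exact self-supervised objective used in this work.

    # Generic NT-Xent contrastive loss over two augmented views of the same batch.
    import torch
    import torch.nn.functional as F

    def nt_xent_loss(z1, z2, temperature=0.5):
        """z1, z2: (N, D) projected features of two augmented views of the same N images."""
        z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)     # (2N, D)
        sim = z @ z.t() / temperature                          # scaled cosine similarities
        n = z1.size(0)
        sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device), float("-inf"))
        # the positive for view i in z1 is view i in z2, and vice versa
        targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
        return F.cross_entropy(sim, targets)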
Subject(s)
Rare Diseases, Retinal Diseases, Humans, Phenotype, Diagnostic Imaging
ABSTRACT
BACKGROUND: View planning for the acquisition of cardiac magnetic resonance (CMR) imaging remains a demanding task in clinical practice. PURPOSE: Existing approaches to its automation relied either on an additional volumetric image not typically acquired in clinical routine, or on laborious manual annotations of cardiac structural landmarks. This work presents a clinic-compatible, annotation-free system for automatic CMR view planning. METHODS: The system mines the spatial relationship between the target planes and source views (more specifically, it locates their intersecting lines) and trains U-Net-based deep networks to regress heatmaps defined by distances from the intersecting lines. On the one hand, the intersecting lines are the prescription lines prescribed by the technologists at the time of image acquisition using cardiac landmarks, and they can be identified retrospectively from the spatial relationship. On the other hand, as the spatial relationship is self-contained in properly stored data, for example, in the DICOM format, the need for additional manual annotation is eliminated. In addition, the interplay of the multiple target planes predicted in a source view is utilized in a stacked hourglass architecture consisting of repeated U-Net-style building blocks to gradually improve the regression. Then, a multiview planning strategy is proposed to aggregate information from the predicted heatmaps for all the source views of a target plane, for a globally optimal prescription, mimicking the strategy practiced by skilled human prescribers. For performance evaluation, the retrospectively identified planes prescribed by the technologists are used as the ground truth, and the plane angle differences and localization distances between the planes prescribed by our system and the ground truth are compared. RESULTS: The retrospective experiments include 181 clinical CMR exams, which are randomly split into training, validation, and test sets in the ratio of 64:16:20. Our system yields a mean angular difference of 5.68° and a mean point-to-plane distance of 3.12 mm on the held-out test set. It not only achieves superior accuracy to existing approaches, including conventional atlas-based and newer deep-learning-based methods, in prescribing the four standard CMR planes but also demonstrates prescription of the first cardiac-anatomy-oriented plane(s) from the body-oriented scout. CONCLUSIONS: The proposed system demonstrates accurate automatic CMR view plane prescription based on deep learning on properly archived data, without the need for further manual annotation. This work opens a new direction for automatic view planning of anatomy-oriented medical imaging beyond CMR.
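The regression targets are heatmaps that encode distance to a prescription line in a source view. As a rough sketch, given the 2D trace of such an intersecting line (a point and a direction in image coordinates), a distance-based target could be generated as follows; the Gaussian form, the sigma value, and the function name are illustrative assumptions.

    # Build a heatmap whose intensity decays with perpendicular distance to a line.
    import numpy as np

    def line_heatmap(height, width, point, direction, sigma=5.0):
        """point: (x0, y0) on the intersecting line; direction: (dx, dy), its 2D direction."""
        d = np.asarray(direction, dtype=float)
        d /= np.linalg.norm(d)
        normal = np.array([-d[1], d[0]])                    # unit normal of the line
        xs, ys = np.meshgrid(np.arange(width), np.arange(height))
        dist = (xs - point[0]) * normal[0] + (ys - point[1]) * normal[1]   # signed distance
        return np.exp(-0.5 * (dist / sigma) ** 2)           # (H, W) target in [0, 1]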
Subject(s)
Heart, Magnetic Resonance Imaging, Cine, Humans, Retrospective Studies, Magnetic Resonance Imaging, Cine/methods, Heart/diagnostic imaging, Magnetic Resonance Imaging, Automation
ABSTRACT
X-ray computed tomography (CT) has been broadly adopted in clinical applications for disease diagnosis and image-guided interventions. However, metals within patients always cause unfavorable artifacts in the recovered CT images. Although existing deep-learning-based approaches attain promising reconstruction results for this metal artifact reduction (MAR) task, most of them have limitations. The critical issue is that most of these methods have not fully exploited the important prior knowledge underlying this specific MAR task. Therefore, in this paper, we carefully investigate the inherent characteristics of metal artifacts, which present rotationally symmetrical streaking patterns. We then propose an orientation-shared convolution representation mechanism to adapt to such physical prior structures and utilize Fourier-series-expansion-based filter parametrization for modelling artifacts, which can finely separate metal artifacts from body tissues. By adopting the classical proximal gradient algorithm to solve the model and then utilizing the deep unfolding technique, we easily build the corresponding orientation-shared convolutional network, termed OSCNet. Furthermore, considering that different sizes and types of metals lead to different artifact patterns (e.g., the intensity of the artifacts), to better improve the flexibility of artifact learning and fully exploit the reconstructed results at iterative stages for information propagation, we design a simple-yet-effective sub-network for the dynamic convolution representation of artifacts. By easily integrating the sub-network into the proposed OSCNet framework, we further construct a more flexible network structure, called OSCNet+, which improves the generalization performance. Through extensive experiments conducted on synthetic and clinical datasets, we comprehensively substantiate the effectiveness of our proposed methods. Code will be released at https://github.com/hongwang01/OSCNet.
Subject(s)
Artifacts, Image Processing, Computer-Assisted, Humans, Image Processing, Computer-Assisted/methods, Tomography, X-Ray Computed/methods, Algorithms, Metals, Phantoms, Imaging
ABSTRACT
Few-shot medical image segmentation has achieved great progress in improving the accuracy and efficiency of medical analysis in the biomedical imaging field. However, most existing methods cannot explore inter-class relations among base and novel medical classes to reason about unseen novel classes. Moreover, the same medical class can have large intra-class variations brought by diverse appearances, shapes and scales, causing ambiguous visual characterization that degrades the generalization performance of these existing methods on unseen novel classes. To address the above challenges, in this paper, we propose a Prototype correlation Matching and Class-relation Reasoning (i.e., PMCR) model. The proposed model can effectively mitigate false pixel correlation matches caused by large intra-class variations while reasoning about inter-class relations among different medical classes. Specifically, in order to address false pixel correlation matches brought by large intra-class variations, we propose a prototype correlation matching module to mine representative prototypes that can characterize diverse visual information of different appearances well. We aim to explore prototype-level rather than pixel-level correlation matching between support and query features via an optimal transport algorithm to tackle false matches caused by intra-class variations. Meanwhile, in order to explore inter-class relations, we design a class-relation reasoning module to segment unseen novel medical objects via reasoning about inter-class relations between base and novel classes. Such inter-class relations can be well propagated to the semantic encoding of local query features to improve few-shot segmentation performance. Quantitative comparisons illustrate the large performance improvement of our model over other baseline methods.
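Prototype-level correlation matching via optimal transport can be illustrated with generic entropic-regularized Sinkhorn iterations between support and query prototypes. The cosine cost, uniform marginals, regularization strength, and the function name below are illustrative assumptions rather than the paper's exact formulation.

    # Entropic optimal transport (Sinkhorn) between support and query prototypes.
    import torch
    import torch.nn.functional as F

    def sinkhorn_transport(support_protos, query_protos, eps=0.05, n_iters=50):
        """support_protos: (M, D), query_protos: (N, D); returns an (M, N) transport plan."""
        cost = 1.0 - F.cosine_similarity(support_protos.unsqueeze(1),
                                         query_protos.unsqueeze(0), dim=-1)      # (M, N)
        K = torch.exp(-cost / eps)
        a = torch.full((cost.size(0),), 1.0 / cost.size(0), device=cost.device)  # uniform marginals
        b = torch.full((cost.size(1),), 1.0 / cost.size(1), device=cost.device)
        u = torch.ones_like(a)
        for _ in range(n_iters):
            u = a / (K @ (b / (K.t() @ u)))
        v = b / (K.t() @ u)
        return u.unsqueeze(1) * K * v.unsqueeze(0)   # plan whose row/column sums match a and b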
ABSTRACT
Segmenting prostate from magnetic resonance imaging (MRI) is a critical procedure in prostate cancer staging and treatment planning. Considering the nature of labeled data scarcity for medical images, semi-supervised learning (SSL) becomes an appealing solution since it can simultaneously exploit limited labeled data and a large amount of unlabeled data. However, SSL relies on the assumption that the unlabeled images are abundant, which may not be satisfied when the local institute has limited image collection capabilities. An intuitive solution is to seek support from other centers to enrich the unlabeled image pool. However, this further introduces data heterogeneity, which can impede SSL that works under identical data distribution with certain model assumptions. Aiming at this under-explored yet valuable scenario, in this work, we propose a separated collaborative learning (SCL) framework for semi-supervised prostate segmentation with multi-site unlabeled MRI data. Specifically, on top of the teacher-student framework, SCL exploits multi-site unlabeled data by: (i) Local learning, which advocates local distribution fitting, including the pseudo label learning that reinforces confirmation of low-entropy easy regions and the cyclic propagated real label learning that leverages class prototypes to regularize the distribution of intra-class features; (ii) External multi-site learning, which aims to robustly mine informative clues from external data, mainly including the local-support category mutual dependence learning, which takes the spirit that mutual information can effectively measure the amount of information shared by two variables even from different domains, and the stability learning under strong adversarial perturbations to enhance robustness to heterogeneity. Extensive experiments on prostate MRI data from six different clinical centers show that our method can effectively generalize SSL on multi-site unlabeled data and significantly outperform other semi-supervised segmentation methods. Besides, we validate the extensibility of our method on the multi-class cardiac MRI segmentation task with data from four different clinical centers.
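The teacher-student backbone and the confirmation of low-entropy easy regions can be pictured with two generic building blocks: an exponential-moving-average (EMA) teacher and a pseudo-label loss restricted to confident pixels. This is a sketch of those ingredients only, with assumed names and thresholds; it is not the full multi-site SCL framework.

    # Generic teacher-student ingredients for semi-supervised segmentation.
    import torch
    import torch.nn.functional as F

    @torch.no_grad()
    def update_ema_teacher(teacher, student, momentum=0.99):
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(momentum).add_(s_param, alpha=1.0 - momentum)

    def confident_pseudo_label_loss(student_logits, teacher_logits, conf_threshold=0.9):
        teacher_probs = torch.softmax(teacher_logits, dim=1)        # (B, C, H, W)
        max_probs, pseudo_labels = teacher_probs.max(dim=1)         # (B, H, W)
        loss = F.cross_entropy(student_logits, pseudo_labels, reduction="none")
        mask = (max_probs >= conf_threshold).float()                # keep low-entropy, easy pixels
        return (loss * mask).sum() / mask.sum().clamp_min(1.0)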
Subject(s)
Interdisciplinary Placement, Prostatic Neoplasms, Male, Humans, Prostate/diagnostic imaging, Prostatic Neoplasms/diagnostic imaging, Entropy, Magnetic Resonance Imaging
ABSTRACT
Layer segmentation is important to quantitative analysis of retinal optical coherence tomography (OCT). Recently, deep learning based methods have been developed to automate this task and yield remarkable performance. However, due to the large spatial gap and potential mismatch between the B-scans of an OCT volume, all of them were based on 2D segmentation of individual B-scans, which may lose the continuity and diagnostic information of the retinal layers in 3D space. Besides, most of these methods required dense annotation of the OCT volumes, which is labor-intensive and expertise-demanding. This work presents a novel framework based on hybrid 2D-3D convolutional neural networks (CNNs) to obtain continuous 3D retinal layer surfaces from OCT volumes, which works well with both full and sparse annotations. The 2D features of individual B-scans are extracted by an encoder consisting of 2D convolutions. These 2D features are then used to produce the alignment displacement vectors and layer segmentation by two 3D decoders coupled via a spatial transformer module. Two losses are proposed to utilize the retinal layers' natural property of being smooth for B-scan alignment and layer segmentation, respectively, and are the key to the semi-supervised learning with sparse annotation. The entire framework is trained end-to-end. To the best of our knowledge, this is the first work that attempts 3D retinal layer segmentation in volumetric OCT images based on CNNs. Experiments on a synthetic dataset and three public clinical datasets show that our framework can effectively align the B-scans for potential motion correction, and achieves superior performance to state-of-the-art 2D deep learning methods in terms of both layer segmentation accuracy and cross-B-scan 3D continuity in both fully and semi-supervised settings, thus offering more clinical values than previous works.
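The smoothness prior on retinal layer surfaces can be expressed, for example, as a first-order difference penalty on the predicted surface heights across neighboring A-scans and neighboring B-scans; the exact losses in the paper may differ, and the tensor layout below is an assumption.

    # Total-variation-style smoothness penalty on predicted layer surfaces.
    import torch

    def surface_smoothness_loss(surfaces):
        """surfaces: (B, S, N_bscan, N_ascan) heights of S layer surfaces in an OCT volume."""
        d_ascan = surfaces[..., :, 1:] - surfaces[..., :, :-1]   # within-B-scan differences
        d_bscan = surfaces[..., 1:, :] - surfaces[..., :-1, :]   # cross-B-scan differences
        return d_ascan.abs().mean() + d_bscan.abs().mean()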
Subject(s)
Retina, Tomography, Optical Coherence, Humans, Retina/diagnostic imaging, Neural Networks, Computer, Supervised Machine Learning
ABSTRACT
Self-supervised learning has emerged as a powerful tool for pretraining deep networks on unlabeled data, prior to transfer learning of target tasks with limited annotation. The relevance between the pretraining pretext and target tasks is crucial to the success of transfer learning. Various pretext tasks have been proposed to utilize properties of medical image data (e.g., three dimensionality), which are more relevant to medical image analysis than generic ones for natural images. However, previous work rarely paid attention to data with anatomy-oriented imaging planes, e.g., standard cardiac magnetic resonance imaging views. As these imaging planes are defined according to the anatomy of the imaged organ, pretext tasks effectively exploiting this information can pretrain the networks to gain knowledge on the organ of interest. In this work, we propose two complementary pretext tasks for this group of medical image data based on the spatial relationship of the imaging planes. The first is to learn the relative orientation between the imaging planes and implemented as regressing their intersecting lines. The second exploits parallel imaging planes to regress their relative slice locations within a stack. Both pretext tasks are conceptually straightforward and easy to implement, and can be combined in multitask learning for better representation learning. Thorough experiments on two anatomical structures (heart and knee) and representative target tasks (semantic segmentation and classification) demonstrate that the proposed pretext tasks are effective in pretraining deep networks for remarkably boosted performance on the target tasks, and superior to other recent approaches.
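For the second pretext task, the regression target (the relative slice location within a stack of parallel imaging planes) comes essentially for free from standard DICOM geometry: project each slice position onto the stack normal and normalize. The snippet below is a sketch under the assumption that ImagePositionPatient and ImageOrientationPatient are present and the slices are parallel; the exact target definition in the paper may differ.

    # Derive relative slice locations of a parallel stack from DICOM headers.
    import numpy as np
    import pydicom

    def relative_slice_locations(dicom_paths):
        datasets = [pydicom.dcmread(p, stop_before_pixels=True) for p in dicom_paths]
        iop = np.array(datasets[0].ImageOrientationPatient, dtype=float)  # row/column cosines
        normal = np.cross(iop[:3], iop[3:])                               # stack normal vector
        locs = np.array([np.dot(np.array(ds.ImagePositionPatient, dtype=float), normal)
                         for ds in datasets])
        locs = locs - locs.min()
        return locs / max(locs.max(), 1e-6)   # regression targets normalized to [0, 1]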
Subject(s)
Heart, Knee Joint, Humans, Heart/diagnostic imaging, Semantics, Supervised Machine Learning, Image Processing, Computer-Assisted
ABSTRACT
Medical image segmentation is a critical task for clinical diagnosis and research. However, dealing with highly imbalanced data remains a significant challenge in this domain, where the region of interest (ROI) may exhibit substantial variations across different slices. This presents a significant hurdle to medical image segmentation, as conventional segmentation methods may either overlook the minority class or overly emphasize the majority class, ultimately leading to a decrease in the overall generalization ability of the segmentation results. To overcome this, we propose a novel approach based on multi-step reinforcement learning, which integrates prior knowledge of medical images and pixel-wise segmentation difficulty into the reward function. Our method treats each pixel as an individual agent, utilizing diverse actions to evaluate its relevance for segmentation. To validate the effectiveness of our approach, we conduct experiments on four imbalanced medical datasets, and the results show that our approach surpasses other state-of-the-art methods in highly imbalanced scenarios. These findings hold substantial implications for clinical diagnosis and research.
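A hypothetical pixel-wise reward of the kind gestured at above could scale a correct/incorrect labeling action by a per-pixel difficulty weight, so that hard or minority-class pixels contribute more to the return. The actual reward design in the paper is richer; this only illustrates the weighting idea with assumed names.

    # Hypothetical difficulty-weighted per-pixel reward.
    import numpy as np

    def pixel_rewards(actions, gt_labels, difficulty, base_reward=1.0):
        """actions, gt_labels: (H, W) integer label maps; difficulty: (H, W) weights in [0, 1]."""
        correct = (actions == gt_labels).astype(np.float32)
        weight = 1.0 + difficulty                             # hard pixels weigh up to 2x
        return base_reward * weight * (2.0 * correct - 1.0)   # +w if correct, -w if wrong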
Subject(s)
Algorithms, Imaging, Three-Dimensional, Humans, Imaging, Three-Dimensional/methods, Image Interpretation, Computer-Assisted/methods, Image Processing, Computer-Assisted/methods
ABSTRACT
Radiation therapy treatment planning requires balancing the delivery of the target dose while sparing normal tissues, making it a complex process. To streamline the planning process and enhance its quality, there is a growing demand for knowledge-based planning (KBP). Ensemble learning has shown impressive power in various deep learning tasks, and it has great potential to improve the performance of KBP. However, the effectiveness of ensemble learning heavily depends on the diversity and individual accuracy of the base learners. Moreover, the complexity of model ensembles is a major concern, as it requires maintaining multiple models during inference, leading to increased computational cost and storage overhead. In this study, we propose a novel learning-based ensemble approach named LENAS, which integrates neural architecture search with knowledge distillation for 3-D radiotherapy dose prediction. Our approach starts by exhaustively searching each block from an enormous architecture space to identify multiple architectures that exhibit promising performance and significant diversity. To mitigate the complexity introduced by the model ensemble, we adopt the teacher-student paradigm, leveraging the diverse outputs from multiple learned networks as supervisory signals to guide the training of the student network. Furthermore, to preserve high-level semantic information, we design a hybrid loss to optimize the student network, enabling it to recover the knowledge embedded within the teacher networks. The proposed method has been evaluated on two public datasets: 1) OpenKBP and 2) AIMIS. Extensive experimental results demonstrate the effectiveness of our method and its superior performance to the state-of-the-art methods. Code: github.com/hust-linyi/LENAS.
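Since dose prediction is voxel-wise regression, the teacher-student distillation can be pictured as a hybrid loss that fits the ground-truth dose while also mimicking the averaged predictions of the diverse teacher networks. The L1 terms and the weighting below are illustrative assumptions; the paper's hybrid loss additionally involves higher-level feature terms.

    # Illustrative hybrid distillation loss for voxel-wise dose regression.
    import torch
    import torch.nn.functional as F

    def distillation_dose_loss(student_pred, teacher_preds, gt_dose, alpha=0.5):
        """student_pred: (B, 1, D, H, W); teacher_preds: list of same-shaped teacher outputs."""
        teacher_mean = torch.stack(teacher_preds, dim=0).mean(dim=0)
        supervised = F.l1_loss(student_pred, gt_dose)        # fidelity to the planned dose
        distill = F.l1_loss(student_pred, teacher_mean)      # mimic the teacher ensemble
        return (1.0 - alpha) * supervised + alpha * distill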
Subject(s)
Neural Networks, Computer, Radiotherapy Dosage, Radiotherapy Planning, Computer-Assisted, Humans, Radiotherapy Planning, Computer-Assisted/methods, Algorithms, Deep Learning, Machine Learning
ABSTRACT
The detection head constitutes a pivotal component within object detectors, tasked with executing both classification and localization functions. Regrettably, the commonly used parallel head often lacks omni perceptual capabilities, such as deformation perception (DP), global perception (GP), and cross-task perception (CTP). Despite numerous methods attempting to enhance these abilities from a single aspect, achieving a comprehensive and unified solution remains a significant challenge. In response to this challenge, we develop an innovative detection head, termed UniHead, to unify three perceptual abilities simultaneously. More precisely, our approach: 1) introduces DP, enabling the model to adaptively sample object features; 2) proposes a dual-axial aggregation transformer (DAT) to adeptly model long-range dependencies, thereby achieving GP; and 3) devises a cross-task interaction transformer (CIT) that facilitates interaction between the classification and localization branches, thus aligning the two tasks. As a plug-and-play method, the proposed UniHead can be conveniently integrated with existing detectors. Extensive experiments on the COCO dataset demonstrate that our UniHead can bring significant improvements to many detectors. For instance, the UniHead can obtain + 2.7 AP gains in RetinaNet, + 2.9 AP gains in FreeAnchor, and + 2.1 AP gains in GFL. The code is available at https://github.com/zht8506/UniHead.
ABSTRACT
Radiology report generation (RRG) is crucial for saving radiologists' valuable time in drafting reports, thereby increasing their work efficiency. Compared to typical methods that directly transfer image captioning technologies to RRG, our approach incorporates organ-wise priors into the report generation. Specifically, in this paper, we propose Organ-aware Diagnosis (OaD) to generate diagnostic reports containing descriptions of each physiological organ. During training, we first develop a task distillation (TD) module to extract organ-level descriptions from reports. We then introduce an organ-aware report generation module that provides a specific description for each organ and simulates clinical situations to provide short descriptions for normal cases. Furthermore, we design an auto-balance mask loss to ensure balanced training for normal/abnormal descriptions and various organs simultaneously. Being intuitively reasonable and practically simple, our OaD outperforms SOTA alternatives by large margins on the commonly used IU-Xray and MIMIC-CXR datasets, as evidenced by a 3.4% BLEU-1 improvement on MIMIC-CXR and a 2.0% BLEU-2 improvement on IU-Xray.
ABSTRACT
Federated learning enables multiple hospitals to cooperatively learn a shared model without privacy disclosure. Existing methods often take a common assumption that the data from different hospitals have the same modalities. However, such a setting is difficult to fully satisfy in practical applications, since the imaging guidelines may be different between hospitals, which makes the number of individuals with the same set of modalities limited. To this end, we formulate this practical-yet-challenging cross-modal vertical federated learning task, in which data from multiple hospitals have different modalities with a small amount of multi-modality data collected from the same individuals. To tackle such a situation, we develop a novel framework, namely Federated Consistent Regularization constrained Feature Disentanglement (Fed-CRFD), for boosting MRI reconstruction by effectively exploring the overlapping samples (i.e., same patients with different modalities at different hospitals) and solving the domain shift problem caused by different modalities. Particularly, our Fed-CRFD involves an intra-client feature disentangle scheme to decouple data into modality-invariant and modality-specific features, where the modality-invariant features are leveraged to mitigate the domain shift problem. In addition, a cross-client latent representation consistency constraint is proposed specifically for the overlapping samples to further align the modality-invariant features extracted from different modalities. Hence, our method can fully exploit the multi-source data from hospitals while alleviating the domain shift problem. Extensive experiments on two typical MRI datasets demonstrate that our network clearly outperforms state-of-the-art MRI reconstruction methods. The source code is available at https://github.com/IAMJackYan/FedCRFD.
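The cross-client latent representation consistency constraint for overlapping samples can be illustrated as a simple distance penalty between the modality-invariant features that two hospitals extract from different MRI contrasts of the same patients. The cosine distance and the function name are placeholders for the constraint actually used in Fed-CRFD.

    # Illustrative consistency penalty on modality-invariant features of overlapping patients.
    import torch.nn.functional as F

    def cross_client_consistency(inv_feat_client_a, inv_feat_client_b):
        """(N, D) modality-invariant features of the same N overlapping patients, index-aligned."""
        a = F.normalize(inv_feat_client_a, dim=1)
        b = F.normalize(inv_feat_client_b, dim=1)
        return (1.0 - (a * b).sum(dim=1)).mean()   # mean cosine distance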