Results 1 - 20 of 14,839
1.
Ophthalmol Sci ; 5(1): 100597, 2025.
Article in English | MEDLINE | ID: mdl-39435136

ABSTRACT

Purpose: Pupillary instability is a known risk factor for complications in cataract surgery. This study aims to develop and validate an innovative and reliable computational framework for the automated assessment of pupil morphologic changes during the various phases of cataract surgery. Design: Retrospective surgical video analysis. Subjects: Two hundred forty complete surgical video recordings, among which 190 surgeries were conducted without the use of pupil expansion devices (PEDs) and 50 were performed with the use of a PED. Methods: The proposed framework consists of 3 stages: feature extraction, deep learning (DL)-based anatomy recognition, and obstruction (OB) detection/compensation. In the first stage, surgical video frames undergo noise reduction using a tensor-based wavelet feature extraction method. In the second stage, DL-based segmentation models are trained and employed to segment the pupil, limbus, and palpebral fissure. In the third stage, obstructed visualization of the pupil is detected and compensated for using a DL-based algorithm. A dataset of 5700 intraoperative video frames across 190 cataract surgeries in the BigCat database was collected for validating algorithm performance. Main Outcome Measures: The pupil analysis framework was assessed on the basis of segmentation performance for both obstructed and unobstructed pupils. Classification performance of models utilizing the segmented pupil time series to predict surgeon use of a PED was also assessed. Results: An architecture based on the Feature Pyramid Network model with a Visual Geometry Group 16 backbone, integrated with the adaptive wavelet tensor feature extraction method, demonstrated the highest performance in anatomy segmentation, with a Dice coefficient of 96.52%. Incorporation of an OB compensation algorithm improved performance further (Dice 96.82%).
Downstream analysis of framework output enabled the development of a Support Vector Machine-based classifier that could predict surgeon usage of a PED prior to its placement with 96.67% accuracy and area under the curve of 99.44%. Conclusions: The experimental results demonstrate that the proposed framework (1) provides high accuracy in pupil analysis compared with human-annotated ground truth, (2) substantially outperforms isolated use of a DL segmentation model, and (3) can enable downstream analytics with clinically valuable predictive capacity. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
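The Dice coefficient reported above measures overlap between a predicted mask and its ground-truth annotation. As a rough illustration (a minimal sketch on toy flattened masks, not the authors' implementation):

```python
def dice_coefficient(pred, truth):
    """Dice = 2|A intersect B| / (|A| + |B|) for flat binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * inter / total

# Toy 1D masks standing in for flattened pupil segmentations.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 0]
print(dice_coefficient(pred, truth))  # 2*2/(3+2) = 0.8
```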

2.
Methods Mol Biol ; 2858: 101-111, 2025.
Article in English | MEDLINE | ID: mdl-39433670

ABSTRACT

Of the known risk factors for glaucoma, elevated intraocular pressure (IOP) is the primary one. The key source of IOP regulation lies in the conventional aqueous humor outflow pathway, predominantly in the trabecular meshwork (TM). Studies of outflow have demonstrated that the outflow pathway is not uniform around the circumference of the eye but highly segmental, with regions of relatively high flow (HF), regions of intermediate or medium flow (IF), and regions of low or no flow (LF). Herein we present protocols that we use to study segmental outflow through the conventional outflow pathway, focusing mostly on human eyes; the methods are quite similar for nonhuman primates and other species. These studies are mostly conducted using ex vivo intact globes or perfused anterior segment organ culture. One potential therapy for reducing IOP in those with elevated IOP, and thereby slowing progression of glaucomatous optic nerve damage, would be to increase the HF or IF proportions and reduce the LF proportion.
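The HF/IF/LF proportions described here are, at their simplest, fractions of the outflow circumference carrying each flow label. A hypothetical sketch (the segment labels below are invented for illustration):

```python
def flow_proportions(labels):
    """Fraction of circumferential segments labeled HF, IF, or LF."""
    n = len(labels)
    return {k: labels.count(k) / n for k in ("HF", "IF", "LF")}

# Twelve hypothetical wedge segments around the limbal circumference.
segments = ["HF", "IF", "LF", "LF", "IF", "HF",
            "LF", "IF", "LF", "LF", "IF", "HF"]
props = flow_proportions(segments)
print(props)  # HF 3/12, IF 4/12, LF 5/12
```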


Subject(s)
Aqueous Humor , Intraocular Pressure , Trabecular Meshwork , Aqueous Humor/metabolism , Trabecular Meshwork/metabolism , Intraocular Pressure/physiology , Humans , Animals , Glaucoma/metabolism , Glaucoma/pathology , Organ Culture Techniques/methods
3.
Sci Rep ; 14(1): 24160, 2024 10 15.
Article in English | MEDLINE | ID: mdl-39406923

ABSTRACT

Pest detection is important for crop cultivation, and crop leaves are the main site of pest invasion. Current technologies to detect crop pests have constraints such as low efficiency, high storage demands, and limited precision. Image segmentation is a fast and efficient computer-aided detection technology: high-resolution image capture provides solid support for discerning pests in images, and analytical methods help parse the information those images contain. In this paper, a regional convolutional neural network (R-CNN) architecture is designed in combination with the radial bisymmetric divergence (RBD) method to enhance the efficiency of image segmentation. As a special application of RBD, the hierarchical mask (HM) is produced to support detection and classification of leaf-dwelling pests, offering enhanced efficiency and reduced storage requirements. Moreover, to deal with mislabeled data, a threshold variable is introduced to add a fault-tolerant mechanism to HM, generating a novel threshold-based hierarchical mask (TbHM). Consequently, the hierarchical mask R-CNN (HM-R-CNN) model and the threshold-based hierarchical mask R-CNN (TbHM-R-CNN) model are established to detect various types of healthy and pest-invaded crop leaves and to select the regional image features that are rich in pest information. The simple linear iterative clustering (SLIC) method is then incorporated to finish the image segmentation for classification of pest invasion. The models were tuned, optimized, and validated. The best results came from the TbHM-R-CNN model, with a classification accuracy of 96.2%, a recall of 97.5%, and an F1 score of 0.982. The HM-R-CNN model achieved appreciable results second only to the best model. These results indicate that the proposed methodologies are well suited for training and testing on a plant disease dataset, offering heightened accuracy in pest classification, and that the proposed methods significantly outperform existing techniques.
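Accuracy, recall, and F1 figures like those reported above come from confusion-matrix counts. A minimal sketch with invented counts for a hypothetical pest / no-pest classifier:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall, and F1 score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, f1

# Hypothetical counts for a pest-invaded vs. healthy-leaf classifier.
acc, rec, f1 = classification_metrics(tp=95, fp=3, fn=2, tn=100)
print(acc, rec, f1)
```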


Subject(s)
Crops, Agricultural , Image Processing, Computer-Assisted , Neural Networks, Computer , Plant Leaves , Plant Leaves/parasitology , Crops, Agricultural/parasitology , Image Processing, Computer-Assisted/methods , Animals , Algorithms
4.
Bioinformation ; 20(8): 882-887, 2024.
Article in English | MEDLINE | ID: mdl-39411759

ABSTRACT

The effectiveness of manual calculations versus 3D segmentation techniques in volumetric analysis of maxillary sinuses for gender determination is of interest. Maxillary sinuses, which vary anatomically due to factors like age, ethnicity, and gender, are crucial in forensic and anthropological contexts. Traditional methods, relying on two-dimensional imaging, are often time-consuming and prone to errors, whereas 3D segmentation offers a more precise and efficient approach. This research evaluates both methods in terms of reliability, accuracy, and practical use, potentially influencing their application in clinical and forensic settings. The findings may also enhance understanding of anatomical variations in maxillary sinuses across populations, contributing to more accurate gender determination.
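At its core, volumetric analysis of a segmented sinus reduces to counting labeled voxels and multiplying by the voxel volume. A hypothetical sketch (the mask and voxel size below are illustrative, not study data):

```python
def sinus_volume_ml(mask, voxel_mm3):
    """Volume of a segmented 3D binary mask: voxel count x voxel volume.

    mask is [slice][row][col] with 0/1 entries; 1000 mm^3 = 1 mL.
    """
    n_voxels = sum(v for slice_ in mask for row in slice_ for v in row)
    return n_voxels * voxel_mm3 / 1000.0

# Tiny hypothetical mask (2 slices of 2x3) with 0.5 mm isotropic voxels.
mask = [[[1, 1, 0], [0, 1, 0]],
        [[1, 0, 0], [0, 0, 0]]]
print(sinus_volume_ml(mask, voxel_mm3=0.5 ** 3))  # 4 voxels * 0.125 mm^3
```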

5.
Front Med (Lausanne) ; 11: 1375851, 2024.
Article in English | MEDLINE | ID: mdl-39416869

ABSTRACT

Background: Brain metastases are the most common brain malignancies. Automatic detection and segmentation of brain metastases provide significant assistance to radiologists in locating lesions and making accurate clinical decisions on brain tumor type for precise treatment. Objectives: However, due to the small size of brain metastases, existing segmentation methods produce unsatisfactory results and have not been evaluated on clinical datasets. Methodology: In this work, we propose a new metastasis segmentation method, DRAU-Net, which integrates a new attention mechanism, the multi-branch weighted attention module, and a DResConv module, making the extraction of tumor boundaries more complete. To better evaluate both segmentation quality and the number of detected targets, we propose a novel medical image segmentation evaluation metric, the multi-objective segmentation integrity metric, which effectively improves evaluation on multiple small brain metastases. Results: Experimental results on the BraTS2023 dataset and collected clinical data show that the proposed method achieves excellent performance, with an average Dice coefficient of 0.6858 and a multi-objective segmentation integrity metric of 0.5582. Conclusion: Compared with other methods, our proposed method achieved the best performance in the task of segmenting metastatic tumors.
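Evaluating "the number of targets" requires treating each connected lesion as a separate object. A minimal flood-fill connected-components sketch (a toy analogue, not the paper's metric):

```python
def count_components(grid):
    """Count 4-connected foreground components in a 2D binary grid."""
    h, w = len(grid), len(grid[0])
    seen = set()
    count = 0
    for i in range(h):
        for j in range(w):
            if grid[i][j] and (i, j) not in seen:
                count += 1
                stack = [(i, j)]
                while stack:
                    y, x = stack.pop()
                    if ((y, x) in seen or not (0 <= y < h and 0 <= x < w)
                            or not grid[y][x]):
                        continue
                    seen.add((y, x))
                    stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return count

# Two separate hypothetical metastases in one slice.
slice_ = [[1, 1, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 0, 1]]
print(count_components(slice_))  # 2
```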

6.
Med Biol Eng Comput ; 2024 Oct 17.
Article in English | MEDLINE | ID: mdl-39417962

ABSTRACT

Using echocardiography to assess left ventricular (LV) function is one of the most crucial cardiac examinations in clinical diagnosis, and LV segmentation plays a particularly vital role in medical image processing because many important clinical diagnostic parameters, such as the ejection fraction, are derived from the segmentation results. However, echocardiography typically has low resolution and contains a significant amount of noise and motion artifacts, making accurate segmentation challenging, especially in the region of the cardiac chamber boundary, which significantly restricts the accurate calculation of subsequent clinical parameters. In this paper, our goal is to achieve accurate LV segmentation through a simplified approach by introducing a branch sub-network into the decoder of the traditional U-Net. Specifically, we employ the LV contour features to supervise the branch decoding process and use a cross-attention module to facilitate interaction between the branch and the original decoding process, thereby improving segmentation performance in the region of the LV boundaries. In the experiments, the proposed branch U-Net (BU-Net) demonstrated superior performance on the CAMUS and EchoNet-Dynamic public echocardiography segmentation datasets in comparison to state-of-the-art segmentation models, without the need for complex residual connections or transformer-based architectures. Our codes are publicly available at Anonymous Github https://anonymous.4open.science/r/Anoymous_two-BFF2/ .
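The ejection fraction mentioned above is derived from end-diastolic and end-systolic LV volumes, which in turn come from the segmentation masks. The downstream arithmetic (volumes below are illustrative):

```python
def ejection_fraction(edv_ml, esv_ml):
    """EF (%) = (EDV - ESV) / EDV * 100, from segmented LV volumes."""
    return (edv_ml - esv_ml) / edv_ml * 100.0

# Hypothetical volumes estimated from segmented end-diastole and
# end-systole frames.
print(round(ejection_fraction(edv_ml=120.0, esv_ml=50.0), 1))  # 58.3
```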

7.
Med Image Anal ; 99: 103364, 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39418830

ABSTRACT

Semi-supervised image segmentation has attracted great attention recently. The key is how to leverage unlabeled images in the training process. Most methods maintain consistent predictions of the unlabeled images under variations (e.g., adding noise/perturbations, or creating alternative versions) at the image and/or model level. Medical images often carry prior structural information, which most image-level variations have not explored well. In this paper, we propose novel dual structure-aware image filterings (DSAIF) as the image-level variations for semi-supervised medical image segmentation. Motivated by connected filtering, which simplifies an image via filtering on a structure-aware tree-based image representation, we resort to the dual contrast-invariant Max-tree and Min-tree representations. Specifically, we propose a novel connected filtering that removes topologically equivalent nodes (i.e., connected components) having no siblings in the Max/Min-tree. This results in two filtered images that preserve topologically critical structure. Applying the proposed DSAIF to mutually supervised networks decreases the consensus of their erroneous predictions on unlabeled images. This helps alleviate the confirmation bias of overfitting to noisy pseudo labels of unlabeled images, and thus effectively improves segmentation performance. Extensive experimental results on three benchmark datasets demonstrate that the proposed method significantly and consistently outperforms state-of-the-art methods. The source codes will be publicly available.
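The core tree operation described, removing nodes that have no siblings, can be illustrated on a plain parent-pointer tree. This is a toy analogue only, not the authors' Max/Min-tree code (the node names are invented):

```python
def remove_only_children(parent_of):
    """Drop every node that has no sibling, reattaching its children to
    its parent (toy analogue of sibling-free node filtering)."""
    children = {}
    for node, parent in parent_of.items():
        children.setdefault(parent, []).append(node)
    removed = {n for n, p in parent_of.items()
               if p is not None and len(children[p]) == 1}
    result = {}
    for node, parent in parent_of.items():
        if node in removed:
            continue
        while parent in removed:  # climb past removed ancestors
            parent = parent_of[parent]
        result[node] = parent
    return result

# Hypothetical component tree: "b" is an only child of "a", so it is
# filtered out and its children "c", "d" reattach to "a".
tree = {"a": None, "b": "a", "c": "b", "d": "b"}
print(remove_only_children(tree))  # {'a': None, 'c': 'a', 'd': 'a'}
```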

8.
Med Biol Eng Comput ; 2024 Oct 17.
Article in English | MEDLINE | ID: mdl-39417963

ABSTRACT

The counting and characterization of neurons in primary cultures have long been areas of significant scientific interest due to their multifaceted applications, ranging from neuronal viability assessment to the study of neuronal development. Traditional methods, often relying on fluorescence or colorimetric staining and manual segmentation, are time consuming, labor intensive, and prone to error, raising the need for the development of automated and reliable methods. This paper delves into the evaluation of three pivotal deep learning techniques: semantic segmentation, which allows for pixel-level classification and is solely suited for characterization; object detection, which focuses on counting and locating neurons; and instance segmentation, which amalgamates the features of the other two but employing more intricate structures. The goal of this research is to discern what technique or combination of those techniques yields the optimal results for automatic counting and characterization of neurons in images of neuronal cultures. Following rigorous experimentation, we conclude that instance segmentation stands out, providing superior outcomes for both challenges.

9.
Neuroinformatics ; 2024 Oct 17.
Article in English | MEDLINE | ID: mdl-39417954

ABSTRACT

Characterizing the anatomical structure and connectivity between cortical regions is a critical step towards understanding the information processing properties of the brain and will help provide insight into the nature of neurological disorders. A key feature of the mammalian cerebral cortex is its laminar structure. Identifying these layers in neuroimaging data is important for understanding their global structure and the connectivity patterns of neurons in the brain. We studied Nissl-stained and myelin-stained slice images of the brain of the common marmoset (Callithrix jacchus), a New World monkey that is becoming increasingly popular in the neuroscience community as an object of study. We present a novel computational framework that first acquires cortical labels using AI-based tools, followed by a trained deep learning model that segments the cerebral cortical layers. We obtained a Euclidean distance of 1274.750 ± 156.400 µm for cortical label acquisition, which was within the acceptable range defined by half the average cortical thickness (1800.630 µm). We compared our cortical layer segmentation pipeline with the pipeline proposed by Wagstyl et al. (PLoS Biology, 18(4), e3000678, 2020) adapted to 2D data, obtaining a better mean 95th percentile Hausdorff distance (95HD) of 92.150 µm versus their 94.170 µm. We also compared our pipeline's performance against theirs on their dataset (the BigBrain dataset). The results again showed better segmentation quality: a Jaccard index of 85.318% from our pipeline versus the 83.000% stated in their paper.
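The 95th percentile Hausdorff distance (95HD) used above is the 95th percentile of the pooled point-to-set distances between two contours. A simplified brute-force sketch on tiny invented point sets (real pipelines use optimized distance transforms):

```python
def hd95(a, b):
    """95th-percentile symmetric Hausdorff distance between 2D point sets."""
    def dists(src, dst):
        return [min(((x - u) ** 2 + (y - v) ** 2) ** 0.5 for u, v in dst)
                for x, y in src]
    d = sorted(dists(a, b) + dists(b, a))
    return d[max(0, int(0.95 * len(d)) - 1)]

# Two small hypothetical layer-boundary contours (coordinates in µm).
a = [(0, 0), (1, 0), (2, 0), (3, 0)]
b = [(0, 1), (1, 1), (2, 1), (3, 4)]
print(hd95(a, b))  # ~= 1.414: the single 4.0 outlier is trimmed
```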

10.
Pest Manag Sci ; 2024 Oct 17.
Article in English | MEDLINE | ID: mdl-39420534

ABSTRACT

BACKGROUND: Semantic segmentation of weed and crop images is a key component of, and prerequisite for, automated weed management. Weeds in unmanned aerial vehicle (UAV) images are usually small and easily confused with crops at early growth stages, and existing semantic segmentation models have difficulty extracting sufficiently fine features, which limits their performance in weed and crop segmentation of UAV images. RESULTS: We propose a fine-grained feature-guided UNet, named FG-UNet, for weed and crop segmentation in UAV images. FG-UNet has two branches, a fine-grained feature branch and a UNet branch. In the fine-grained feature branch, a fine feature-aware (FFA) module was designed to mine fine features and enhance the model's ability to segment small objects. In the UNet branch, we used an encoder-decoder structure to extract high-level semantic features from images. In addition, a contextual feature fusion (CFF) module was designed to fuse the fine features with the high-level semantic features, enhancing the feature discrimination capability of the model. The experimental results showed that the proposed FG-UNet achieved state-of-the-art performance compared with other semantic segmentation models, with a mean intersection over union (MIOU) of 88.06% and a mean pixel accuracy (MPA) of 92.37%. CONCLUSION: The proposed method lays a solid foundation for accurate detection and intelligent management of weeds and will have a positive impact on the development of smart agriculture. © 2024 Society of Chemical Industry.
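The MIOU and MPA metrics reported here are per-class averages computed from a pixel confusion matrix. A minimal sketch with an invented two-class (crop vs. weed) matrix:

```python
def miou_mpa(conf):
    """Mean IoU and mean pixel accuracy from a square confusion matrix
    (rows = ground truth, columns = prediction)."""
    n = len(conf)
    ious, accs = [], []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(n)) - tp
        ious.append(tp / (tp + fp + fn))
        accs.append(tp / (tp + fn))
    return sum(ious) / n, sum(accs) / n

# Hypothetical pixel counts: 90 crop pixels correct, 10 mislabeled, etc.
conf = [[90, 10],
        [5, 45]]
miou, mpa = miou_mpa(conf)
print(miou, mpa)
```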

11.
J Comput Biol ; 2024 Oct 18.
Article in English | MEDLINE | ID: mdl-39422580

ABSTRACT

Time-lapse microscopy imaging is a crucial technique in biomedical studies for observing cellular behavior over time, providing essential data on cell numbers, sizes, shapes, and interactions. Manual analysis of hundreds or thousands of cells is impractical, necessitating the development of automated cell segmentation approaches. Traditional image processing methods have made significant progress in this area, but the advent of deep learning methods, particularly those using U-Net-based networks, has further enhanced performance in medical and microscopy image segmentation. However, challenges remain, particularly in accurately segmenting touching cells in images with low signal-to-noise ratios. Existing methods often struggle with effectively integrating features across different levels of abstraction. This can lead to model confusion, particularly when important contextual information is lost or the features are not adequately distinguished. The challenge lies in appropriately combining these features to preserve critical details while ensuring robust and accurate segmentation. To address these issues, we propose a novel framework called RA-SE-ASPP-Net, which incorporates Residual Blocks, Attention Mechanism, Squeeze-and-Excitation connection, and Atrous Spatial Pyramid Pooling to achieve precise and robust cell segmentation. We evaluate our proposed architecture using an induced pluripotent stem cell reprogramming dataset, a challenging dataset that has received limited attention in this field. Additionally, we compare our model with different ablation experiments to demonstrate its robustness. The proposed architecture outperforms the baseline models in all evaluated metrics, providing the most accurate semantic segmentation results. Finally, we applied the watershed method to the semantic segmentation results to obtain precise segmentations with specific information for each cell.
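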

12.
J Neurooncol ; 2024 Oct 18.
Article in English | MEDLINE | ID: mdl-39422813

ABSTRACT

INTRODUCTION: Accurate detection, segmentation, and volumetric analysis of brain lesions are essential in neuro-oncology. Artificial intelligence (AI)-based models have improved the efficiency of these processes. This study evaluated an AI-based module for detecting and segmenting brain metastases, comparing it with manual detection and segmentation. METHODS: MRIs from 51 patients treated with Gamma Knife radiosurgery for brain metastases were analyzed. Manual lesion identification and contouring on Leksell Gamma Plan at the time of treatment served as the gold standard. The same MRIs were processed through an AI-based module (Brainlab Smart Brush), and lesion detection and volumes were compared. Discrepancies were analyzed to identify possible sources of error. RESULTS: Among 51 patients, 359 brain metastases were identified. The AI module achieved a sensitivity of 79.2% and a positive predictive value of 95.6%, compared with a 93.3% sensitivity for manual detection. However, for lesions > 0.1 cc, the AI's sensitivity rose to 97.5%, surpassing manual detection at 93%. Volumetric agreement between AI and manual segmentations was high (Spearman's ρ = 0.997, p < 0.001). Most lesions missed by the AI (53.8%) were near anatomical structures that complicated detection. CONCLUSIONS: The AI module demonstrated higher sensitivity than manual detection for metastases larger than 0.1 cc, with robust volumetric accuracy. However, human expertise remains critical for detecting smaller lesions, especially near complex anatomical areas. AI offers significant potential to enhance neuro-oncology practice by improving the efficiency and accuracy of lesion management.
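Sensitivity and positive predictive value follow directly from detection counts. A sketch with illustrative counts chosen to roughly match the reported figures (these are not the study's raw numbers):

```python
def detection_stats(n_true, n_detected_true, n_false_positive):
    """Sensitivity and positive predictive value for lesion detection."""
    sensitivity = n_detected_true / n_true
    ppv = n_detected_true / (n_detected_true + n_false_positive)
    return sensitivity, ppv

# Illustrative counts only: 359 lesions total, 284 found by the AI,
# 13 spurious AI detections.
sens, ppv = detection_stats(n_true=359, n_detected_true=284,
                            n_false_positive=13)
print(round(sens, 3), round(ppv, 3))
```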

13.
J Xray Sci Technol ; 2024 Oct 12.
Article in English | MEDLINE | ID: mdl-39422983

ABSTRACT

BACKGROUND: UNet has achieved great success in medical image segmentation. However, due to the inherent locality of convolution operations, UNet is deficient in capturing global features and long-range dependencies of polyps, resulting in less accurate polyp recognition for complex morphologies and backgrounds. Transformers, with their sequential operations, are better at perceiving global features but lack low-level details, leading to limited localization ability. If the advantages of both architectures can be effectively combined, the accuracy of polyp segmentation can be further improved. METHODS: In this paper, we propose an attention and convolution-augmented UNet-Transformer network (ACU-TransNet) for polyp segmentation. This network is composed of a comprehensive attention UNet and a Transformer head, sequentially connected by a bridge layer. On the one hand, the comprehensive attention UNet enhances specific feature extraction through deformable convolution and channel attention in the first layer of the encoder and achieves more accurate shape extraction through spatial attention and channel attention in the decoder. On the other hand, the Transformer head supplements fine-grained information through convolutional attention and acquires hierarchical global characteristics from the feature maps. RESULTS: ACU-TransNet could comprehensively learn dataset features and enhance colonoscopy interpretability for polyp detection. CONCLUSION: Experimental results on the CVC-ClinicDB and Kvasir-SEG datasets demonstrate that ACU-TransNet outperforms existing state-of-the-art methods, showcasing its robustness.

14.
Microsc Microanal ; 2024 Oct 15.
Article in English | MEDLINE | ID: mdl-39405188

ABSTRACT

Integrating deep learning into image analysis for transmission electron microscopy (TEM) holds significant promise for advancing materials science and nanotechnology. Deep learning is able to enhance image quality, automate feature detection, and accelerate data analysis, addressing the complex nature of TEM datasets. This capability is crucial for precise and efficient characterization of details on the nano- and microscale, e.g., facilitating more accurate and high-throughput analysis of nanoparticle structures. This study investigates the influence of batch normalization (BN) and instance normalization (IN) on the performance of deep learning models for semantic segmentation of high-resolution TEM images. Using U-Net and ResNet architectures, we trained models on two different datasets. Our results demonstrate that IN consistently outperforms BN, yielding higher Dice scores and Intersection over Union metrics. These findings underscore the necessity of selecting appropriate normalization methods to maximize the performance of deep learning models applied to TEM images.
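The BN/IN distinction is purely about which axes the statistics are pooled over: IN normalizes each (sample, channel) slice independently, while BN pools each channel across the whole batch. A pure-Python sketch on a tiny [N][C][L] tensor (framework implementations also learn scale/shift parameters, omitted here):

```python
def _norm(vals, eps=1e-5):
    m = sum(vals) / len(vals)
    var = sum((v - m) ** 2 for v in vals) / len(vals)
    return [(v - m) / (var + eps) ** 0.5 for v in vals]

def instance_norm(x):
    """Normalize each (sample, channel) slice independently; x is [N][C][L]."""
    return [[_norm(ch) for ch in sample] for sample in x]

def batch_norm(x):
    """Normalize each channel over all samples jointly."""
    n, c = len(x), len(x[0])
    out = [[None] * c for _ in range(n)]
    for j in range(c):
        l = len(x[0][j])
        normed = _norm([v for i in range(n) for v in x[i][j]])
        for i in range(n):
            out[i][j] = normed[i * l:(i + 1) * l]
    return out

# Two samples, one channel, four "pixels" each; the second sample has a
# much larger intensity range, as between differently exposed TEM frames.
x = [[[1.0, 2.0, 3.0, 4.0]], [[10.0, 20.0, 30.0, 40.0]]]
print(instance_norm(x)[0][0])  # each slice normalized on its own
print(batch_norm(x)[0][0])     # channel statistics shared across the batch
```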

15.
Comput Biol Med ; 183: 109272, 2024 Oct 14.
Article in English | MEDLINE | ID: mdl-39405733

ABSTRACT

Lung cancer is a critical health issue that demands swift and accurate diagnosis for effective treatment. In medical imaging, segmentation is crucial for identifying and isolating regions of interest, which is essential for precise diagnosis and treatment planning. Traditional metaheuristic-based segmentation methods often struggle with slow convergence, poorly optimized thresholds, and balancing exploration and exploitation, leading to suboptimal performance in multilevel thresholding segmentation of lung cancer images. This study presents ASG-HMO, an enhanced variant of the Human Memory Optimization (HMO) algorithm, selected for its simplicity, versatility, and minimal parameters. Although HMO has never been applied to multilevel thresholding image segmentation, its characteristics make it well suited to improving the segmentation of pathological lung cancer images. ASG-HMO incorporates four innovative strategies that address key challenges in the segmentation process. First, an enhanced adaptive mutualism phase balances exploration and exploitation to accurately delineate tumor boundaries without getting trapped in suboptimal solutions. Second, a spiral motion strategy adaptively refines segmentation solutions by focusing on both the overall lung structure and the intricate tumor details. Third, a Gaussian mutation strategy introduces diversity into the search process, enabling exploration of a broader range of segmentation thresholds and enhancing the accuracy of the segmented regions. Finally, an adaptive t-distribution disturbance strategy helps the algorithm avoid local optima and refine segmentation in later stages. The effectiveness of ASG-HMO is validated through rigorous testing on the IEEE CEC'17 and CEC'20 benchmark suites, followed by its application to multilevel thresholding segmentation of nine histopathology lung cancer images.
In these experiments, six different numbers of segmentation thresholds were tested, and the algorithm was compared with several classical, recent, and advanced segmentation algorithms. In addition, the proposed ASG-HMO leverages 2D Renyi entropy and 2D histograms to enhance the precision of the segmentation process. Quantitative analysis of pathological lung cancer segmentation showed that ASG-HMO achieved a superior maximum peak signal-to-noise ratio (PSNR) of 31.924, structural similarity index measure (SSIM) of 0.919, feature similarity index measure (FSIM) of 0.990, and probability rand index (PRI) of 0.924. These results indicate that ASG-HMO significantly outperforms existing algorithms in both convergence speed and segmentation accuracy, demonstrating its robustness as a framework for precise segmentation of pathological lung cancer images and its substantial potential for improving clinical diagnostic processes.
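PSNR, one of the quality metrics above, is a simple function of the mean squared error between the original image and its segmented reconstruction. A sketch on an invented grayscale patch:

```python
import math

def psnr(orig, recon, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized images."""
    flat_o = [v for row in orig for v in row]
    flat_r = [v for row in recon for v in row]
    mse = sum((o - r) ** 2 for o, r in zip(flat_o, flat_r)) / len(flat_o)
    return float("inf") if mse == 0 else 10 * math.log10(max_val ** 2 / mse)

# Hypothetical 2x3 grayscale patch and a coarsely thresholded version.
orig = [[52, 120, 199], [30, 88, 210]]
recon = [[0, 128, 255], [0, 128, 255]]
print(round(psnr(orig, recon), 2))
```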

16.
Sensors (Basel) ; 24(19)2024 Sep 25.
Article in English | MEDLINE | ID: mdl-39409237

ABSTRACT

The accurate extraction of buildings from remote sensing images is crucial in fields such as 3D urban planning, disaster detection, and military reconnaissance. In recent years, models based on Transformer have performed well in global information processing and contextual relationship modeling, but suffer from high computational costs and insufficient ability to capture local information. In contrast, convolutional neural networks (CNNs) are very effective in extracting local features, but have a limited ability to process global information. In this paper, an asymmetric network (CTANet), which combines the advantages of CNN and Transformer, is proposed to achieve efficient extraction of buildings. Specifically, CTANet employs ConvNeXt as an encoder to extract features and combines it with an efficient bilateral hybrid attention transformer (BHAFormer) which is designed as a decoder. The BHAFormer establishes global dependencies from both texture edge features and background information perspectives to extract buildings more accurately while maintaining a low computational cost. Additionally, the multiscale mixed attention mechanism module (MSM-AMM) is introduced to learn the multiscale semantic information and channel representations of the encoder features to reduce noise interference and compensate for the loss of information in the downsampling process. Experimental results show that the proposed model achieves the best F1-score (86.7%, 95.74%, and 90.52%) and IoU (76.52%, 91.84%, and 82.68%) compared to other state-of-the-art methods on the Massachusetts building dataset, the WHU building dataset, and the Inria aerial image labeling dataset.

17.
Sensors (Basel) ; 24(19)2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39409368

ABSTRACT

Counting shrimp larvae is an essential part of shrimp farming. Due to their tiny size and high density, this task is exceedingly difficult. Thus, we introduce an algorithm for counting densely packed shrimp larvae utilizing an enhanced You Only Look Once version 5 (YOLOv5) model through a regional segmentation approach. First, the C2f and convolutional block attention modules are used to improve the capabilities of YOLOv5 in recognizing the small shrimp. Moreover, employing a regional segmentation technique can decrease the receptive field area, thereby enhancing the shrimp counter's detection performance. Finally, a strategy for stitching and deduplication is implemented to tackle the problem of double counting across various segments. The findings from the experiments indicate that the suggested algorithm surpasses several other shrimp counting techniques in terms of accuracy. Notably, for high-density shrimp larvae in large quantities, this algorithm attained an accuracy exceeding 98%.


Subject(s)
Algorithms , Larva , Animals , Larva/physiology , Image Processing, Computer-Assisted/methods
18.
Sensors (Basel) ; 24(19)2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39409398

ABSTRACT

Interactive image segmentation greatly accelerates the generation of high-quality annotated image datasets, which are the pillars of deep learning applications. However, existing methods suffer from weak use of interaction information and excessively high optimization costs, resulting in unexpected segmentation outcomes and increased computational burden. To address these issues, this paper focuses on mining interactive information from both the network architecture and the optimization procedure. In terms of network architecture, the issue arises from two sources: the under-representative features of interactive regions in each layer, and interaction information weakened by the network hierarchy. The paper therefore proposes a network called EnNet, which addresses these two issues by employing attention mechanisms to integrate user interaction information across the entire image and by incorporating interaction information twice in a coarse-to-fine design. In terms of optimization, this paper proposes using zero-order optimization during the first four iterations of training, which reduces computational overhead with only a minimal reduction in accuracy. Experimental results on the GrabCut, Berkeley, DAVIS, and SBD datasets validate the effectiveness of the proposed method, with our approach achieving an average NOC@90 that surpasses RITM by 0.35.
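NOC@90, the metric used above, is the average number of user clicks needed before the segmentation first reaches 90% IoU, capped at a click budget. A sketch with invented per-click IoU trajectories:

```python
def noc_at_90(iou_per_click, max_clicks=20):
    """Number of clicks until IoU first reaches 0.90 (max_clicks if never)."""
    for i, iou in enumerate(iou_per_click[:max_clicks], start=1):
        if iou >= 0.90:
            return i
    return max_clicks

# Hypothetical per-click IoU trajectories for three images.
runs = [[0.62, 0.85, 0.91], [0.70, 0.93], [0.55, 0.74, 0.82, 0.90]]
avg = sum(noc_at_90(r) for r in runs) / len(runs)
print(avg)  # (3 + 2 + 4) / 3 = 3.0
```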

19.
Sensors (Basel) ; 24(19)2024 Oct 02.
Article in English | MEDLINE | ID: mdl-39409445

ABSTRACT

High-quality video object segmentation is a challenging visual computing task. Interactive segmentation can improve segmentation results. This paper proposes a multi-round interactive dynamic propagation instance-level video object segmentation network based on click interaction. The network consists of two parts: a user interaction segmentation module and a bidirectional dynamic propagation module. A prior segmentation network was designed in the user interaction segmentation module to better segment objects of different scales that users click on. The dynamic propagation network achieves high-precision video object segmentation through the bidirectional propagation and fusion of segmentation masks obtained from multiple rounds of interaction. Experiments on interactive segmentation datasets and video object segmentation datasets show that our method achieves state-of-the-art segmentation results with fewer click interactions.

20.
Sensors (Basel) ; 24(19)2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39409538

ABSTRACT

Restricted by their metal-enclosed structure, the internal defects of large transformers are difficult to inspect visually. In this paper, a micro-robot is used to visually inspect the interior of a transformer. For the micro-robot to successfully assess the discharge level and insulation degradation trend in the transformer, it is essential to segment carbon traces accurately and rapidly from the complex background. However, the complex edge features and significant size differences of carbon traces pose a serious challenge for accurate segmentation. To this end, we propose the Hadamard Product-Spatial Coordinate Attention-PixelShuffle UNet (HSP-UNet), an innovative architecture specifically designed for carbon trace segmentation. To address the pixel over-concentration and weak contrast of carbon trace images, the Adaptive Histogram Equalization (AHE) algorithm is used for image enhancement. To fuse carbon trace features of different scales effectively and reduce model complexity, a novel grouped Hadamard Product Attention (HPA) module is designed to replace the original convolution module of the UNet. Meanwhile, to improve the activation intensity and segmentation completeness of carbon traces, a Spatial Coordinate Attention (SCA) mechanism is designed to replace the original skip connection. Furthermore, a PixelShuffle up-sampling module is used to improve the parsing of complex boundaries. Compared with UNet, UNet++, UNeXt, MALUNet, and EGE-UNet, HSP-UNet outperformed all the state-of-the-art methods on both carbon trace datasets. For dendritic carbon traces, HSP-UNet improved the mean intersection over union (MIoU), pixel accuracy (PA), and class pixel accuracy (CPA) of the benchmark UNet by 2.13, 1.24, and 4.68 percentage points, respectively. For clustered carbon traces, HSP-UNet improved MIoU, PA, and CPA by 0.98, 0.65, and 0.83 percentage points, respectively. At the same time, HSP-UNet is notably lightweight, with 0.061 M parameters and 0.066 GFLOPs. This study could contribute to the accurate segmentation of discharge carbon traces and the assessment of the insulation condition of oil-immersed transformers.
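The histogram-equalization step used for enhancement spreads a crowded intensity range across the full dynamic range via the image's cumulative histogram. A global sketch of the principle (AHE proper applies a mapping like this per local tile; the patch below is invented):

```python
def equalize(img, levels=256):
    """Global histogram equalization via the cumulative histogram.

    AHE applies the same kind of mapping within local tiles instead of
    over the whole image.
    """
    flat = [v for row in img for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    n = len(flat)
    lut = [round((c - 1) / (n - 1) * (levels - 1)) for c in cdf]
    return [[lut[v] for v in row] for row in img]

# Low-contrast hypothetical patch: values crowded into 100..103 get
# stretched across 0..255.
img = [[100, 100, 101], [102, 103, 103]]
print(equalize(img))
```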
