Results 1 - 20 of 97
1.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-35514186

ABSTRACT

The identification of active binding drugs for target proteins (referred to as drug-target interaction prediction) is the key challenge in virtual screening, which plays an essential role in drug discovery. Although recent deep learning-based approaches achieve better performance than molecular docking, existing models often neglect topological or spatial intermolecular information, hindering prediction performance. We recognize this problem and propose a novel approach called the Intermolecular Graph Transformer (IGT) that employs a dedicated attention mechanism to model intermolecular information with a three-way Transformer-based architecture. IGT outperforms the second-best state-of-the-art (SoTA) approach by 9.1% and 20.5% for binding activity and binding pose prediction, respectively, and generalizes better to unseen receptor proteins than SoTA approaches. Furthermore, IGT exhibits promising drug screening ability against severe acute respiratory syndrome coronavirus 2 by identifying 83.1% of the active drugs that have been validated by wet-lab experiments with near-native predicted binding poses. Source code and datasets are available at https://github.com/microsoft/IGT-Intermolecular-Graph-Transformer.


Subjects
Algorithms; COVID-19; Humans; Molecular Docking Simulation; Proteins/chemistry; Software
2.
Neurocomputing (Amst) ; 534: 161-170, 2023 May 14.
Article in English | MEDLINE | ID: mdl-36923265

ABSTRACT

The mutant strains of COVID-19 caused a global explosion of infections, including in many cities in China. In 2020, Zheng et al. proposed a hybrid AI model that accurately predicted the epidemic in Wuhan. As the main part of the hybrid AI model, the ISI method makes two important assumptions to avoid over-fitting. However, these assumptions cannot be effectively applied to new mutant strains. In this paper, a more general method, named the multi-weight susceptible-infected (MSI) model, is proposed to predict COVID-19 in the Chinese Mainland. First, a Gaussian pre-processing method is proposed to solve the problem of data fluctuation based on the quantity consistency of the cumulative infection number and the trend consistency of the daily infection number. Then, we improve the model in two respects: changing the grouped multi-parameter strategy to the multi-weight strategy, and removing the restriction on the weight distribution of viral infectivity. Experiments on the outbreaks in many places in China from the end of 2021 to May 2022 show that, in China, an individual infected by the Delta or Omicron strains of SARS-CoV-2 can infect others within 3-4 days after becoming infected. In particular, the proposed method effectively predicts the trend of the epidemics in Xi'an, Tianjin, Henan, and Shanghai from December 2021 to May 2022.

3.
Int J Legal Med ; 136(3): 821-831, 2022 May.
Article in English | MEDLINE | ID: mdl-35157129

ABSTRACT

Age estimation can aid in forensic medicine applications, diagnosis, and treatment planning for orthodontics and pediatrics. Existing dental age estimation methods rely heavily on specialized knowledge and are highly subjective, time-consuming, and labor-intensive, problems that machine learning techniques can address. Feature extraction is the key factor affecting the performance of machine learning models, and there are usually two approaches to it: extraction with human interference and autonomous extraction without human interference. However, previous studies have rarely applied both approaches in the same image analysis task. Herein, we present two types of convolutional neural networks (CNNs) for dental age estimation. One is an automated dental stage evaluation model (ADSE model) based on specified manually defined features, and the other is an automated end-to-end dental age estimation model (ADAE model), which autonomously extracts potential features for dental age estimation. Although the mean absolute error (MAE) of the ADSE model for stage classification is 0.17 stages, its accuracy in dental age estimation is unsatisfactory, with the MAE (1.63 years) being only 0.04 years lower than that of the manual dental age estimation method (MDAE model). However, the MAE of the ADAE model is 0.83 years, about half that of the MDAE model. The results show that fully automated feature extraction in a deep learning model without human interference performs better in dental age estimation, markedly increasing accuracy and objectivity. This indicates that, without human interference, machine learning may perform better in medical imaging applications.


Subjects
Machine Learning; Neural Networks, Computer; Child; Humans; Image Processing, Computer-Assisted; Infant; Radiography
4.
J Opt Soc Am A Opt Image Sci Vis ; 39(12): 2298-2306, 2022 Dec 01.
Article in English | MEDLINE | ID: mdl-36520750

ABSTRACT

Automatic detection of thin-cap fibroatheroma (TCFA) is essential to prevent acute coronary syndrome. Hence, in this paper, a method is proposed to detect TCFAs by directly classifying each A-line using multi-view intravascular optical coherence tomography (IVOCT) images. To solve the problem of false positives, a multi-input-output network was developed to implement image-level classification and A-line-based classification at the same time, and a contrastive consistency term was designed to ensure consistency between the two tasks. In addition, to learn spatial and global information and obtain the complete extent of TCFAs, an architecture and a regional connectivity constraint term are proposed to classify each A-line of IVOCT images. Experimental results obtained on the 2017 China Computer Vision Conference IVOCT dataset show that the proposed method achieved state-of-the-art performance with a total score of 88.7±0.88%, overlap rate of 88.64±0.26%, precision rate of 84.34±0.86%, and recall rate of 93.67±2.29%.


Subjects
Plaque, Atherosclerotic; Tomography, Optical Coherence; Humans; Tomography, Optical Coherence/methods; Plaque, Atherosclerotic/diagnostic imaging; Coronary Vessels
5.
J Neurosci Res ; 96(7): 1159-1175, 2018 07.
Article in English | MEDLINE | ID: mdl-29406599

ABSTRACT

Over the past decade, the simultaneous recording of electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) data has garnered growing interest because it may provide an avenue towards combining the strengths of both imaging modalities. Given their pronounced differences in temporal and spatial statistics, the combination of EEG and fMRI data is, however, methodologically challenging. Here, we propose a novel screening approach that relies on a Cross Multivariate Correlation Coefficient (xMCC) framework. This approach accomplishes three tasks: (1) it provides a measure for testing multivariate correlation and multivariate uncorrelation of the two modalities; (2) it provides a criterion for the selection of EEG features; (3) it performs a screening of relevant EEG information by grouping the EEG channels into clusters to improve efficiency and to reduce computational load when searching for the best predictors of the BOLD signal. The present report applies this approach to a data set with concurrent recordings of steady-state visual evoked potentials (ssVEPs) and fMRI, recorded while observers viewed phase-reversing Gabor patches. We test the hypothesis that fluctuations in visuo-cortical mass potentials systematically covary with BOLD fluctuations not only in visual cortical, but also in anterior temporal and prefrontal areas. Results supported the hypothesis and showed that the xMCC-based analysis provides straightforward identification of neurophysiologically plausible brain regions with EEG-fMRI covariance. Furthermore, xMCC converged with other extant methods for EEG-fMRI analysis.


Subjects
Brain Mapping/methods; Brain/diagnostic imaging; Electroencephalography/methods; Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Adult; Brain/physiology; Brain Mapping/statistics & numerical data; Correlation of Data; Data Interpretation, Statistical; Electroencephalography/statistics & numerical data; Evoked Potentials, Visual; Female; Humans; Image Interpretation, Computer-Assisted/methods; Image Interpretation, Computer-Assisted/statistics & numerical data; Image Processing, Computer-Assisted/statistics & numerical data; Magnetic Resonance Imaging/statistics & numerical data; Male; Multimodal Imaging/methods; Multimodal Imaging/statistics & numerical data; Multivariate Analysis
6.
Sensors (Basel) ; 18(5)2018 May 22.
Article in English | MEDLINE | ID: mdl-29789447

ABSTRACT

Inspired by the recent spatio-temporal action localization efforts with tubelets (sequences of bounding boxes), we present a new spatio-temporal action localization detector Segment-tube, which consists of sequences of per-frame segmentation masks. The proposed Segment-tube detector can temporally pinpoint the starting/ending frame of each action category in the presence of preceding/subsequent interference actions in untrimmed videos. Simultaneously, the Segment-tube detector produces per-frame segmentation masks instead of bounding boxes, offering superior spatial accuracy to tubelets. This is achieved by alternating iterative optimization between temporal action localization and spatial action segmentation. Experimental results on three datasets validated the efficacy of the proposed method, including (1) temporal action localization on the THUMOS 2014 dataset; (2) spatial action segmentation on the Segtrack dataset; and (3) joint spatio-temporal action localization on the newly proposed ActSeg dataset. It is shown that our method compares favorably with existing state-of-the-art methods.

7.
Sensors (Basel) ; 18(7)2018 Jun 21.
Article in English | MEDLINE | ID: mdl-29933555

ABSTRACT

Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs). However, effective and efficient methods for incorporation of temporal information into CNNs are still being actively explored in the recent literature. Motivated by the popular recurrent attention models in the research area of natural language processing, we propose the Attention-aware Temporal Weighted CNN (ATW CNN) for action recognition in videos, which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is simply implemented as temporal weighting yet it effectively boosts the recognition performance of video representations. Besides, each stream in the proposed ATW CNN framework is capable of end-to-end training, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Our experimental results on the UCF-101 and HMDB-51 datasets showed that the proposed attention mechanism contributes substantially to the performance gains with the more discriminative snippets by focusing on more relevant video segments.
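
The temporal weighting described in this abstract can be illustrated with a minimal sketch. It assumes, as a simplification, that each video snippet has already produced a vector of class scores and that the attention model reduces to one learnable scalar per snippet normalized by a softmax. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_weighted_scores(snippet_scores, attention_logits):
    """Fuse per-snippet class scores into one video-level score vector.

    snippet_scores: per-snippet class-score lists (hypothetical CNN outputs).
    attention_logits: one scalar per snippet (hypothetical learned weights).
    """
    weights = softmax(attention_logits)
    num_classes = len(snippet_scores[0])
    fused = [0.0] * num_classes
    for weight, scores in zip(weights, snippet_scores):
        for c in range(num_classes):
            fused[c] += weight * scores[c]
    return fused, weights

# Three snippets, two classes; the second snippet has the largest logit,
# so its scores dominate the fused prediction.
scores = [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]]
logits = [0.0, 2.0, 0.0]
fused, weights = temporal_weighted_scores(scores, logits)
```

Because the weights enter the fused score through differentiable operations, they could in principle be trained by gradient descent together with the network parameters, which is consistent with the end-to-end training the abstract mentions.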

8.
Sensors (Basel) ; 17(12)2017 Nov 30.
Article in English | MEDLINE | ID: mdl-29189757

ABSTRACT

Intensified charge-coupled device (ICCD) images are captured by ICCD sensors in extremely low-light conditions. They often contain spatially clustered noise, and general filtering methods do not work well on them. We find that the scale of the clustered noise in ICCD sensing images is often much smaller than that of the true structural information. The clustered noise can therefore be identified by properly down-sampling and then up-sampling the ICCD sensing image and comparing the result to the noisy image. Based on this finding, we present a denoising algorithm to remove the randomly clustered noise in ICCD images. First, we over-segment the ICCD image into a set of flat patches, each of which contains very little structural information. Second, we classify the patches into noisy patches and noise-free patches based on the hypergraph cut method. Then the noise-free patches are easily recovered by the general block-matching and 3D filtering (BM3D) algorithm, since they often do not contain the clustered noise. The noisy patches are recovered by subtracting the identified clustered noise from them. We then assemble the whole recovered ICCD image. Finally, the quality of the recovered ICCD image is further improved by diminishing the remaining sparse noise with robust principal component analysis. Experiments on a set of ICCD images, with comparisons against four existing denoising algorithms, show that the proposed algorithm removes the randomly clustered noise well and preserves the true textural information in the ICCD sensing images.
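
The down-sample, up-sample, and compare idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: block averaging for down-sampling, nearest-neighbour up-sampling, and a fixed threshold on the residual are choices made here, not details given in the abstract, and all function names are hypothetical.

```python
def downsample(image, factor):
    """Average non-overlapping factor x factor blocks (sizes assumed divisible)."""
    rows, cols = len(image), len(image[0])
    small = []
    for r in range(0, rows, factor):
        row = []
        for c in range(0, cols, factor):
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small

def upsample(small, factor):
    """Nearest-neighbour up-sampling back to the original grid."""
    return [[small[r // factor][c // factor]
             for c in range(len(small[0]) * factor)]
            for r in range(len(small) * factor)]

def noise_mask(image, factor, threshold):
    """Flag pixels that differ strongly from the down/up-sampled image."""
    smooth = upsample(downsample(image, factor), factor)
    return [[abs(image[r][c] - smooth[r][c]) > threshold
             for c in range(len(image[0]))]
            for r in range(len(image))]

# Flat 4x4 image with a single bright pixel standing in for small-scale noise.
image = [[10.0] * 4 for _ in range(4)]
image[1][2] = 50.0
mask = noise_mask(image, factor=2, threshold=20.0)
```

Structure that is larger than the sampling block survives the round trip nearly unchanged, so only the small-scale disturbance leaves a large residual.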

9.
Sensors (Basel) ; 17(4)2017 Apr 08.
Article in English | MEDLINE | ID: mdl-28397759

ABSTRACT

Since the Microsoft Kinect was released, depth information has been used in many fields because of its low cost and easy availability. However, the Kinect and Kinect-like RGB-D sensors show limited performance in certain applications that place high demands on the accuracy and robustness of depth information. In this paper, we propose a depth sensing system that contains a laser projector similar to that used in the Kinect, and two infrared cameras, one on each side of the laser projector, to obtain higher spatial resolution depth information. We apply the block-matching algorithm to estimate the disparity. To improve the spatial resolution, we reduce the size of the matching blocks, but smaller matching blocks yield lower matching precision. To address this problem, we combine two matching modes (binocular mode and monocular mode) in the disparity estimation process. Experimental results show that our method can obtain higher spatial resolution depth without loss of quality in the range image, compared with the Kinect. Furthermore, our algorithm is implemented on a low-cost hardware platform, and the system supports a resolution of 1280 × 960 at up to 60 frames per second for depth image sequences.
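
For readers unfamiliar with block matching, the sketch below estimates the disparity of one pixel along a single scanline using the sum of absolute differences (SAD). It is a generic textbook illustration, not the binocular/monocular scheme of the paper; the cost function, the search direction, and all names are assumptions made for the example.

```python
def sad(left_row, right_row, x, d, half):
    """SAD between the block centred at x in the left row and at x - d in the right row."""
    return sum(abs(left_row[x + k] - right_row[x - d + k])
               for k in range(-half, half + 1))

def best_disparity(left_row, right_row, x, half, max_disp):
    """Return the disparity in [0, max_disp] with the lowest SAD cost."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - half < 0:  # block would fall outside the right row
            break
        cost = sad(left_row, right_row, x, d, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# The same intensity pattern appears two pixels further right in the left row.
right = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
left = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0]
d = best_disparity(left, right, x=5, half=1, max_disp=3)
```

The `half` parameter sets the block size: smaller blocks localize depth edges more finely but give the cost function fewer pixels to discriminate with, which is the precision trade-off the abstract refers to.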

10.
Sensors (Basel) ; 17(2)2017 Jan 26.
Article in English | MEDLINE | ID: mdl-28134759

ABSTRACT

An Intensified Charge-Coupled Device (ICCD) image is captured by the ICCD image sensor in extremely low-light conditions. Its noise has two distinctive characteristics: (a) unlike the independent identically distributed (i.i.d.) noise in natural images, the noise in the ICCD sensing image is spatially clustered, which induces unexpected structure information; (b) the pattern of the clustered noise is formed randomly. In this paper, we propose a denoising scheme to remove the randomly clustered noise in the ICCD sensing image. First, we decompose the image into non-overlapping patches and classify them into flat patches and structure patches according to whether real structure information is included. Then, a separate denoising algorithm is designed for each class. For each flat patch, we simulate multiple similar patches in the pseudo-time domain and remove its noise by averaging all the simulated patches, considering that the structure information induced by the noise varies randomly over time. For each structure patch, we design a structure-preserved sparse coding algorithm to reconstruct the real structure information. It reconstructs each patch by describing it as a weighted summation of its neighboring patches and incorporating the weights into the sparse representation of the current patch. Based on all the reconstructed patches, we generate a reconstructed image. After that, we repeat the whole process with different parameters, considering that blocking artifacts exist in a single reconstructed image. Finally, we obtain the final reconstructed image by merging all the generated images into one. Experiments conducted on an ICCD sensing image dataset verify the subjective performance of the scheme in removing the randomly clustered noise and preserving the real structure information in the ICCD sensing image.

11.
Sensors (Basel) ; 16(12)2016 Dec 09.
Article in English | MEDLINE | ID: mdl-27941705

ABSTRACT

Strong demands for accurate non-cooperative target measurement have been arising recently for the tasks of assembling and capturing. Spherical objects are one of the most common targets in these applications. However, the performance of the traditional vision-based reconstruction method was limited for practical use when handling poorly-textured targets. In this paper, we propose a novel multi-sensor fusion system for measuring and reconstructing textureless non-cooperative spherical targets. Our system consists of four simple lasers and a visual camera. This paper presents a complete framework of estimating the geometric parameters of textureless spherical targets: (1) an approach to calibrate the extrinsic parameters between a camera and simple lasers; and (2) a method to reconstruct the 3D position of the laser spots on the target surface and achieve the refined results via an optimized scheme. The experiment results show that our proposed calibration method can obtain a fine calibration result, which is comparable to the state-of-the-art LRF-based methods, and our calibrated system can estimate the geometric parameters with high accuracy in real time.

12.
Sensors (Basel) ; 16(8)2016 Aug 12.
Article in English | MEDLINE | ID: mdl-27529248

ABSTRACT

Lane boundary detection technology has progressed rapidly over the past few decades. However, many challenges that often lead to lane detection unavailability remain to be solved. In this paper, we propose a spatial-temporal knowledge filtering model to detect lane boundaries in videos. To address the challenges of structure variation, large noise and complex illumination, this model incorporates prior spatial-temporal knowledge with lane appearance features to jointly identify lane boundaries. The model first extracts line segments in video frames. Two novel filters, the Crossing Point Filter (CPF) and the Structure Triangle Filter (STF), are proposed to filter out the noisy line segments. The two filters introduce spatial structure constraints and temporal location constraints into lane detection, which represent the spatial-temporal knowledge about lanes. A straight line or curve model determined by a state machine is used to fit the line segments to finally output the lane boundaries. We collected a challenging realistic traffic scene dataset. The experimental results on this dataset and other standard datasets demonstrate the strength of our method. The proposed method has been successfully applied to our autonomous experimental vehicle.

13.
Sensors (Basel) ; 15(6): 13899-916, 2015 Jun 12.
Article in English | MEDLINE | ID: mdl-26076405

ABSTRACT

In this paper, the problem of spatial signature estimation using a uniform linear array (ULA) with unknown sensor gain and phase errors is considered. As is well known, the directions-of-arrival (DOAs) can only be determined within an unknown rotational angle in this array model. However, the phase ambiguity has no impact on the identification of the spatial signature. Two auto-calibration methods are presented for spatial signature estimation. In our methods, the rotational DOAs and model error parameters are firstly obtained, and the spatial signature is subsequently calculated. The first method extracts two subarrays from the ULA to construct an estimator, and the elements of the array can be used several times in one subarray. The other fully exploits multiple invariances in the interior of the sensor array, and a multidimensional nonlinear problem is formulated. A Gauss-Newton iterative algorithm is applied for solving it. The first method can provide excellent initial inputs for the second one. The effectiveness of the proposed algorithms is demonstrated by several simulation results.
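
The claim that the rotational ambiguity of the DOAs does not affect the spatial signature can be checked numerically. The sketch assumes a common narrowband model in which element m of the signature is g_m·exp(jφ_m)·exp(jmω), with ω the electrical angle (for example π·sin θ at half-wavelength spacing). This model, its sign convention, and the numbers below are assumptions for illustration; the abstract does not state them.

```python
import cmath

def spatial_signature(gains, phases, omega):
    """Element m of the signature: g_m * exp(j*phi_m) * exp(j*m*omega)."""
    return [g * cmath.exp(1j * (phi + m * omega))
            for m, (g, phi) in enumerate(zip(gains, phases))]

gains = [1.0, 0.9, 1.1, 1.05]   # hypothetical sensor gains
phases = [0.0, 0.2, -0.1, 0.3]  # hypothetical sensor phase errors (radians)
omega = 0.7                     # hypothetical electrical angle of the source
delta = 0.25                    # unknown rotation shared by all DOAs

# Rotate the electrical angle by -delta and absorb a linear phase m*delta
# into the sensor phase errors: the product, i.e. the signature, is unchanged.
rotated_phases = [phi + m * delta for m, phi in enumerate(phases)]
sig_a = spatial_signature(gains, phases, omega)
sig_b = spatial_signature(gains, rotated_phases, omega - delta)
max_gap = max(abs(a - b) for a, b in zip(sig_a, sig_b))
```

Under this model the array data cannot distinguish the two parameter sets, which is why the DOAs are only known up to a rotation, yet both sets yield the same signature vector.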

14.
Sensors (Basel) ; 14(12): 23398-418, 2014 Dec 05.
Article in English | MEDLINE | ID: mdl-25490597

ABSTRACT

Compressive Sensing Imaging (CSI) is a new framework for image acquisition, which enables the simultaneous acquisition and compression of a scene. Since the characteristics of Compressive Sensing (CS) acquisition are very different from those of traditional image acquisition, general image compression solutions may not work well. In this paper, we propose an efficient lossy compression solution for CS acquisition of images by considering the distinctive features of CSI. First, we design an adaptive compressive sensing acquisition method for images according to the sampling rate, which can achieve better CS reconstruction quality for the acquired image. Second, we develop a universal quantization for the CS measurements obtained from CS acquisition without knowing any a priori information about the captured image. Finally, we apply these two methods in the CSI system for efficient lossy compression of CS acquisition. Simulation results demonstrate that the proposed solution improves the rate-distortion performance by 0.4 to 2 dB compared with the current state of the art, while maintaining a low computational complexity.

15.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3753-3771, 2024 May.
Article in English | MEDLINE | ID: mdl-38145531

ABSTRACT

Monocular depth inference is a fundamental problem for scene perception of robots. Specific robots may be equipped with a camera plus an optional depth sensor of any type and located in various scenes of different scales, whereas recent advances have split the problem into multiple individual sub-tasks. This imposes the additional burden of fine-tuning models for specific robots and thereby makes customization costly in large-scale industrialization. This article investigates a unified task of monocular depth inference, which infers high-quality depth maps from all kinds of input raw data from various robots in unseen scenes. A basic benchmark G2-MonoDepth is developed for this task, which comprises four components: (a) a unified data representation RGB+X to accommodate RGB plus raw depth with diverse scene scale/semantics, depth sparsity ([0%, 100%]) and errors (holes/noises/blurs), (b) a novel unified loss to adapt to diverse depth sparsity/errors of input raw data and diverse scales of output scenes, (c) an improved network to propagate diverse scene scales well from input to output, and (d) a data augmentation pipeline to simulate all types of real artifacts in raw depth maps for training. G2-MonoDepth is applied in three sub-tasks including depth estimation, depth completion with different sparsity, and depth enhancement in unseen scenes, and it always outperforms SOTA baselines on both real-world data and synthetic data.

16.
Int J Neural Syst ; 34(1): 2450002, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38084473

ABSTRACT

Functional MRI (fMRI) is a brain signal with high spatial resolution, and visual cognitive processes and semantic information in the brain can be represented and obtained through fMRI. In this paper, we design single-graphic and matched/unmatched double-graphic visual stimulus experiments and collect 12 subjects' fMRI data to explore the brain's visual perception processes. In the double-graphic stimulus experiment, we focus on the high-level semantic information as "matching", and remove tail-to-tail conjunction by designing a model to screen the matching-related voxels. Then, we perform Bayesian causal learning between fMRI voxels based on the transfer entropy, establish a hierarchical Bayesian causal network (HBcausalNet) of the visual cortex, and use the model for visual stimulus image reconstruction. HBcausalNet achieves an average accuracy of 70.57% and 53.70% in single- and double-graphic stimulus image reconstruction tasks, respectively, higher than HcorrNet and HcasaulNet. The results show that the matching-related voxel screening and causality analysis method in this paper can extract the "matching" information in fMRI, obtain a direct causal relationship between matching information and fMRI, and explore the causal inference process in the brain. It suggests that our model can effectively extract high-level semantic information in brain signals and model effective connections and visual perception processes in the visual cortex of the brain.


Subjects
Brain Mapping; Visual Cortex; Humans; Brain Mapping/methods; Bayes Theorem; Semantics; Brain; Magnetic Resonance Imaging/methods; Visual Cortex/diagnostic imaging
17.
IEEE Trans Pattern Anal Mach Intell ; 46(8): 5306-5324, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38349823

ABSTRACT

Deep Neural Network classifiers are vulnerable to adversarial attacks, where an imperceptible perturbation could result in misclassification. However, the vulnerability of DNN-based image ranking systems remains under-explored. In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations. Specifically, the expected ranking order is first represented as a set of inequalities. Then a triplet-like objective function is designed to obtain the optimal perturbation. Conversely, an anti-collapse triplet defense is proposed to improve the ranking model robustness against all proposed attacks, where the model learns to prevent the adversarial attack from pulling the positive and negative samples close to each other. To comprehensively measure the empirical adversarial robustness of a ranking model with our defense, we propose an empirical robustness score, which involves a set of representative attacks against ranking models. Our adversarial ranking attacks and defenses are evaluated on MNIST, Fashion-MNIST, CUB200-2011, CARS196, and Stanford Online Products datasets. Experimental results demonstrate that our attacks can effectively compromise a typical deep ranking system. Nevertheless, our defense can significantly improve the ranking system's robustness and simultaneously mitigate a wide range of attacks.
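
The idea of turning a desired ranking into inequalities and then into a triplet-like objective can be sketched in a few lines. The hinge form below is an assumption chosen for illustration: to raise a candidate above a set of other items, every inequality d(query, candidate) < d(query, other) should hold, and each violated inequality contributes a penalty. The function names and the toy embeddings are hypothetical; a real attack would compute embeddings with the ranking network and optimize a bounded perturbation of the input image.

```python
import math

def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ranking_violation_loss(query, candidate, others, margin=0.0):
    """Sum of hinge penalties over the inequalities
    d(query, candidate) + margin < d(query, other)."""
    d_c = distance(query, candidate)
    return sum(max(0.0, d_c + margin - distance(query, o)) for o in others)

query = [0.0, 0.0]
candidate = [3.0, 4.0]              # distance 5 from the query
others = [[1.0, 0.0], [6.0, 8.0]]   # distances 1 and 10 from the query
loss = ranking_violation_loss(query, candidate, others)
```

Only the first inequality (5 < 1) is violated, so it alone contributes to the loss; driving the loss to zero would place the candidate ahead of both other items in the ranking.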

18.
Article in English | MEDLINE | ID: mdl-39283785

ABSTRACT

Unsupervised person re-identification (Re-ID) is challenging due to the lack of ground-truth labels. Most existing methods rely on pseudo labels estimated via iterative clustering and thus are highly susceptible to performance penalties incurred by an inaccurately estimated number of clusters. Instead, we use sample pairs with pairwise pseudo labels to guide the feature learning, which avoids the dilemma of determining cluster numbers. In this article, we propose a meta pairwise relationship distillation (MPRD) method that incorporates a graph convolutional network (GCN) to provide high-fidelity pairwise relationships to supervise the model training. A small amount of metadata with high-confidence pairwise relationships and the unlabeled pairs with the provided pseudo pairwise relationships participate in the GCN training. Besides, we introduce a hard sample deduction (HSD) module to promptly mine the sample pairs with error-prone pairwise pseudo labels and thereby mitigate optimization misled by noisy labels. Furthermore, since the features of each positive pair represent the same person, we design a positive pair alignment (PPA) module to reduce the redundant information in each feature, which is achieved by minimizing the difference between each positive pair's feature distributions. Extensive experiments on the Market-1501, DukeMTMC-reID, and MSMT17 datasets show that our method outperforms the state-of-the-art unsupervised methods.

19.
Med Phys ; 51(3): 1775-1797, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37681965

ABSTRACT

BACKGROUND: Atherosclerotic cardiovascular disease is the leading cause of death worldwide. Early detection of carotid atherosclerosis can prevent the progression of cardiovascular disease. Many (semi-) automatic methods have been designed for the segmentation of the carotid vessel wall and the diagnosis of carotid atherosclerosis (i.e., the lumen segmentation, the outer wall segmentation, and the carotid atherosclerosis diagnosis) on black blood magnetic resonance imaging (BB-MRI). However, most of these methods ignore the intrinsic correlation among different tasks on BB-MRI, leading to limited performance.

PURPOSE: Thus, we model the intrinsic correlation among the lumen segmentation, the outer wall segmentation, and the carotid atherosclerosis diagnosis tasks on BB-MRI by using the multi-task learning technique and propose a gated multi-task network (GMT-Net) to perform three related tasks in a single neural network (i.e., carotid artery lumen segmentation, outer wall segmentation, and carotid atherosclerosis diagnosis).

METHODS: In the proposed method, the GMT-Net is composed of three modules, namely the sharing module, the segmentation module, and the diagnosis module, which interact with each other to achieve better learning performance. At the same time, two new adaptive layers, namely the gated exchange layer and the gated fusion layer, are presented to exchange and merge branch features.

RESULTS: The proposed method is applied to the CAREII dataset (i.e., 1057 scans) for the lumen segmentation, the outer wall segmentation, and the carotid atherosclerosis diagnosis. The proposed method can achieve promising segmentation performance (0.9677 Dice for the lumen and 0.9669 Dice for the outer wall) and better diagnosis accuracy of carotid atherosclerosis (0.9516 AUC and 0.9024 Accuracy) in the "CAREII test" dataset (i.e., 106 scans). The results show that the proposed method has statistically significant accuracy and efficiency.

CONCLUSIONS: Even without the reviewer intervention required by previous works, the proposed method automatically segments the lumen and outer wall together and diagnoses carotid atherosclerosis with high performance. The proposed method can be used in clinical trials to relieve radiologists of tedious reading tasks, such as screening reviews to separate normal carotid arteries from atherosclerotic arteries and outlining vessel wall contours.


Subjects
Cardiovascular Diseases; Carotid Artery Diseases; Humans; Cardiovascular Diseases/pathology; Carotid Arteries/diagnostic imaging; Carotid Arteries/pathology; Carotid Artery Diseases/diagnostic imaging; Carotid Artery Diseases/pathology; Magnetic Resonance Angiography/methods; Magnetic Resonance Imaging/methods
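
The Dice scores quoted for the segmentation results in entry 19 follow the standard overlap measure 2|A ∩ B| / (|A| + |B|). The sketch below computes it for two binary masks stored as flat 0/1 lists; the masks are toy data, and returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary (0/1) masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total

pred = [1, 1, 0, 0, 1, 0]    # hypothetical predicted mask
target = [1, 0, 0, 0, 1, 1]  # hypothetical reference mask
score = dice(pred, target)   # 2 * 2 / (3 + 3)
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they share no foreground pixels.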
20.
Med Phys ; 51(8): 5441-5456, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38648676

ABSTRACT

BACKGROUND: Liver lesions mainly occur inside the liver parenchyma, are difficult to locate, and have complicated relationships with essential vessels. Thus, preoperative planning is crucial for the resection of liver lesions. Accurate segmentation of the hepatic and portal veins (PVs) on computed tomography (CT) images is of great importance for preoperative planning. However, manually labeling the mask of vessels is laborious and time-consuming, and the labeling results of different clinicians are prone to inconsistencies. Hence, developing an automatic segmentation algorithm for hepatic and PVs on CT images has attracted the attention of researchers. Unfortunately, existing deep learning based automatic segmentation methods are prone to misclassifying peripheral vessels into wrong categories.

PURPOSE: This study aims to provide a fully automatic and robust semantic segmentation algorithm for hepatic and PVs, guiding subsequent preoperative planning. In addition, to address the deficiency of the public dataset for hepatic and PV segmentation, we revise the annotations of the Medical Segmentation Decathlon (MSD) hepatic vessel segmentation dataset and add the masks of the hepatic veins (HVs) and PVs.

METHODS: We propose a structure with a dual-stream encoder combining convolution and Transformer blocks, named Dual-stream Hepatic Portal Vein segmentation Network, to extract local features and long-distance spatial information, thereby extracting anatomical information of the hepatic and portal veins and avoiding misdivisions of adjacent peripheral vessels. Besides, a multi-scale feature fusion block based on dilated convolution is proposed to extract multi-scale features on expanded perception fields for local features, and a multi-level fusing attention module is introduced for efficient context information extraction. A paired t-test is conducted to evaluate the significance of the difference in Dice between the proposed method and the compared methods.

RESULTS: Two datasets are constructed from the original MSD dataset. For each dataset, 50 cases are randomly selected for model evaluation in the scheme of 5-fold cross-validation. The results show that our method outperforms the state-of-the-art Convolutional Neural Network-based and Transformer-based methods. Specifically, for the first dataset, our model reaches 0.815, 0.830, and 0.807 for overall Dice, precision, and sensitivity. The Dice scores of the hepatic and PVs are 0.835 and 0.796, which also exceed the numeric results of the compared methods. Almost all the p-values of paired t-tests between the proposed approach and the compared approaches are smaller than 0.05. On the second dataset, the proposed algorithm achieves 0.749, 0.762, 0.726, 0.835, and 0.796 for overall Dice, precision, sensitivity, Dice for HV, and Dice for PV, among which the first four numeric results exceed those of the compared methods.

CONCLUSIONS: The proposed method is effective in solving the problem of misclassifying interlaced peripheral veins in the HV and PV segmentation task and outperforms the compared methods on the relabeled dataset.


Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Portal Vein; Tomography, X-Ray Computed; Portal Vein/diagnostic imaging; Image Processing, Computer-Assisted/methods; Humans; Hepatic Veins/diagnostic imaging; Deep Learning; Liver/diagnostic imaging; Liver/blood supply