Results 1 - 20 of 42
1.
Environ Sci Technol ; 58(11): 5014-5023, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38437169

ABSTRACT

Estimates of the land area occupied by wind energy differ by orders of magnitude due to data scarcity and inconsistent methodology. We developed a method that combines machine learning-based imagery analysis and geographic information systems and examined the land area of 318 wind farms (15,871 turbines) in the U.S. portion of the Western Interconnection. We found that prior land use and human modification in the project area are critical for land-use efficiency and land transformation of wind projects. Projects developed in areas with little human modification have a land-use efficiency of 63.8 ± 8.9 W/m2 (mean ±95% confidence interval) and a land transformation of 0.24 ± 0.07 m2/MWh, while values for projects in areas with high human modification are 447 ± 49.4 W/m2 and 0.05 ± 0.01 m2/MWh, respectively. We show that land resources for wind can be quantified consistently with our replicable method, a method that obviates >99% of the workload using machine learning. To quantify the peripheral impact of a turbine, buffered geometry can be used as a proxy for measuring land resources and metrics when a large enough impact radius is assumed (e.g., >4 times the rotor diameter). Our analysis provides a necessary first step toward regionalized impact assessment and improved comparisons of energy alternatives.
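The two reported metrics follow directly from a project's footprint area, nameplate capacity, and lifetime generation. A minimal sketch of that arithmetic; the 100 MW project, 1.6 km² footprint, and 20-year lifetime below are hypothetical figures for illustration, not values from the study:

```python
def land_use_efficiency(capacity_w, area_m2):
    """Capacity-based land-use efficiency in W/m^2."""
    return capacity_w / area_m2

def land_transformation(area_m2, lifetime_gen_mwh):
    """Generation-based land transformation in m^2/MWh."""
    return area_m2 / lifetime_gen_mwh

# Hypothetical 100 MW project occupying 1.6 km^2 that generates
# 6.6 million MWh over a 20-year lifetime:
eff = land_use_efficiency(100e6, 1.6e6)      # 62.5 W/m^2
trans = land_transformation(1.6e6, 6.6e6)    # ~0.24 m^2/MWh
```

Under these assumed figures the project lands near the "little human modification" cohort reported above (63.8 W/m², 0.24 m²/MWh).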


Subject(s)
Energy-Generating Resources; Wind; Humans; Farms; Physical Phenomena
2.
Appl Opt ; 56(3): B191-B197, 2017 Jan 20.
Article in English | MEDLINE | ID: mdl-28157883

ABSTRACT

Most LADAR (laser radar, LIDAR) imaging systems use pixel-basis sampling, where each azimuth and elevation resolution element is uniquely sampled and recorded. We demonstrate and examine alternative sampling and post-detection processing schemes in which measurements are recorded in alternative bases intended to reduce system power consumption and laser emissions. We describe a prototype sensor capable of generating arbitrary illumination beam patterns, rather than relying on spot, line-scanning, or flash techniques, along with computational imaging algorithms that reduce speckle and identify scene objects in a low-dimensional compressed basis rather than in the pixel basis. Such techniques yield considerable energy savings and prove valuable on platforms with severe limitations on sensor size, weight, and power, in particular as part of autonomous systems where image output for human interpretation is unnecessary.

3.
J Opt Soc Am A Opt Image Sci Vis ; 31(5): 1090-103, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24979642

ABSTRACT

In recent years, sparse representation and dictionary-learning-based methods have emerged as powerful tools for efficiently processing data in nontraditional ways. A particular area of promise for these theories is face recognition. In this paper, we review the role of sparse representation and dictionary learning for efficient face identification and verification. Recent face recognition algorithms from still images, videos, and ambiguously labeled imagery are reviewed. In particular, discriminative dictionary learning algorithms as well as methods based on weakly supervised learning and domain adaptation are summarized. Some of the compelling challenges and issues that confront research in face recognition using sparse representations and dictionary learning are outlined.
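The sparse-representation classification (SRC) idea this review covers can be sketched in a few lines: code a probe face over a dictionary of training faces, then assign it to the class whose atoms explain it best. A minimal illustration using a greedy (orthogonal matching pursuit) coder in place of an l1 solver; the dictionary and labels below are synthetic stand-ins, not a real face dataset:

```python
import numpy as np

def omp(D, y, k):
    """Greedy k-sparse coding of y over a column-normalized dictionary D."""
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Re-fit coefficients on the current support (orthogonal step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def src_classify(D, labels, y, k=4):
    """SRC rule: assign y to the class whose atoms best reconstruct it."""
    x = omp(D, y, k)
    best, best_r = None, np.inf
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        r = np.linalg.norm(y - D[:, mask] @ x[mask])
        if r < best_r:
            best, best_r = c, r
    return best
```

The class-wise residual test is what makes the representation discriminative: atoms from the wrong class contribute little to the reconstruction.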

4.
IEEE Trans Med Imaging ; 43(1): 582-593, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37703139

ABSTRACT

Accelerating the magnetic resonance imaging (MRI) reconstruction process is a challenging ill-posed inverse problem due to the excessive under-sampling operation in k-space. In this paper, we propose a recurrent Transformer model, namely ReconFormer, for MRI reconstruction, which can iteratively reconstruct high-fidelity magnetic resonance images from highly under-sampled k-space data (e.g., up to 8× acceleration). In particular, the proposed architecture is built upon Recurrent Pyramid Transformer Layers (RPTLs). The core design of the proposed method is Recurrent Scale-wise Attention (RSA), which jointly exploits intrinsic multi-scale information at every architecture unit as well as the dependencies of the deep feature correlation through recurrent states. Moreover, benefiting from its recurrent nature, ReconFormer is lightweight compared to other baselines and contains only 1.1M trainable parameters. We validate the effectiveness of ReconFormer on multiple datasets with different magnetic resonance sequences and show that it achieves significant improvements over the state-of-the-art methods with better parameter efficiency. The implementation code and pre-trained weights are available at https://github.com/guopengf/ReconFormer.
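ReconFormer's recurrent refinement is not reproduced here, but the standard data-consistency step that k-space reconstruction pipelines of this kind interleave with the network can be sketched directly: re-impose the measured k-space samples on each intermediate image estimate. A minimal sketch under that assumption (array sizes are illustrative):

```python
import numpy as np

def data_consistency(x_pred, y_meas, mask):
    """Re-impose measured k-space samples on an image-domain estimate.

    x_pred: current image estimate (2D array)
    y_meas: measured k-space, valid only where mask == 1
    mask:   binary k-space sampling mask
    """
    k_pred = np.fft.fft2(x_pred)
    k_dc = np.where(mask == 1, y_meas, k_pred)  # keep measured samples
    return np.fft.ifft2(k_dc)
```

With a fully sampled mask this step returns the ground-truth image regardless of the network's prediction, which is why it stabilizes iterative reconstruction.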


Subject(s)
Image Processing, Computer-Assisted; Magnetic Resonance Imaging
5.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 4167-4179, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35816537

ABSTRACT

One-class novelty detectors are trained with examples of a particular class and are tasked with identifying whether a query example belongs to the same known class. Most recent advances adopt a deep auto-encoder style architecture to compute novelty scores for detecting novel class data. Deep networks have been shown to be vulnerable to adversarial attacks, yet little focus is devoted to studying the adversarial robustness of deep novelty detectors. In this article, we first show that existing novelty detectors are susceptible to adversarial examples. We further demonstrate that commonly-used defense approaches for classification tasks have limited effectiveness in one-class novelty detection. Hence, we need a defense specifically designed for novelty detection. To this end, we propose a defense strategy that manipulates the latent space of novelty detectors to improve the robustness against adversarial examples. The proposed method, referred to as Principal Latent Space (PrincipaLS), learns the incrementally-trained cascade principal components in the latent space to robustify novelty detectors. PrincipaLS can purify the latent space against adversarial examples and constrain the latent space to exclusively model the known class distribution. We conduct extensive experiments on eight attacks, five datasets and seven novelty detectors, showing that PrincipaLS consistently enhances the adversarial robustness of novelty detection models.
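The core idea, projecting a (possibly attacked) latent code onto a principal subspace learned from known-class data only, can be sketched with a plain, non-incremental PCA; the incrementally-trained cascade of the paper is not reproduced here:

```python
import numpy as np

def fit_principal_subspace(Z, n_components):
    """Top principal directions of known-class latent vectors Z (n, d)."""
    mu = Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Z - mu, full_matrices=False)
    return mu, Vt[:n_components]

def purify(z, mu, components):
    """Project a latent code onto the known-class principal subspace,
    discarding off-subspace (potentially adversarial) components."""
    return mu + (z - mu) @ components.T @ components
```

Any perturbation orthogonal to the learned subspace is removed exactly, which is the purification effect the defense relies on.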

6.
Article in English | MEDLINE | ID: mdl-37030853

ABSTRACT

Recent advances in deep learning have led to the development of accurate and efficient models for various computer vision applications such as classification, segmentation, and detection. However, learning highly accurate models relies on the availability of large-scale annotated datasets. As a result, model performance drops drastically when evaluated on label-scarce datasets of visually distinct images, a challenge termed the domain adaptation problem. A plethora of works adapt classification and segmentation models to label-scarce target datasets through unsupervised domain adaptation. Considering that detection is a fundamental task in computer vision, many recent works have focused on developing novel domain adaptive detection techniques. Here, we describe in detail the domain adaptation problem for detection and present an extensive survey of the various methods. Furthermore, we highlight the proposed strategies and their associated shortcomings. Subsequently, we identify multiple aspects of the problem that are most promising for future research. We believe that this survey will be valuable to pattern recognition experts working in the fields of computer vision, biometrics, medical imaging, and autonomous navigation by introducing them to the problem and familiarizing them with the current status of progress while providing promising directions for future research.

7.
NPJ Digit Med ; 6(1): 116, 2023 Jun 21.
Article in English | MEDLINE | ID: mdl-37344684

ABSTRACT

Cerebrovascular disease is a leading cause of death globally. Prevention and early intervention are known to be the most effective forms of its management. Non-invasive imaging methods hold great promises for early stratification, but at present lack the sensitivity for personalized prognosis. Resting-state functional magnetic resonance imaging (rs-fMRI), a powerful tool previously used for mapping neural activity, is available in most hospitals. Here we show that rs-fMRI can be used to map cerebral hemodynamic function and delineate impairment. By exploiting time variations in breathing pattern during rs-fMRI, deep learning enables reproducible mapping of cerebrovascular reactivity (CVR) and bolus arrival time (BAT) of the human brain using resting-state CO2 fluctuations as a natural "contrast media". The deep-learning network is trained with CVR and BAT maps obtained with a reference method of CO2-inhalation MRI, which includes data from young and older healthy subjects and patients with Moyamoya disease and brain tumors. We demonstrate the performance of deep-learning cerebrovascular mapping in the detection of vascular abnormalities, evaluation of revascularization effects, and vascular alterations in normal aging. In addition, cerebrovascular maps obtained with the proposed method exhibit excellent reproducibility in both healthy volunteers and stroke patients. Deep-learning resting-state vascular imaging has the potential to become a useful tool in clinical cerebrovascular imaging.
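The quantities being mapped have a simple classical definition that the network is trained to reproduce: CVR as the regression slope of a voxel's BOLD time series on the CO2 trace at the best-fitting lag, and BAT as that lag. A sketch of this reference-style lagged regression (not the paper's deep network; signals below are synthetic):

```python
import numpy as np

def cvr_bat(bold, co2, max_lag, dt):
    """Lagged-regression estimate: CVR as the slope of BOLD on the CO2
    regressor at the best lag, BAT as that lag in seconds."""
    best = None  # (correlation, slope, lag_seconds)
    for lag in range(max_lag + 1):
        x = co2[: len(co2) - lag]      # CO2 trace, shifted
        y = bold[lag:]                 # BOLD response, delayed by `lag`
        xc = x - x.mean()
        slope = (xc @ (y - y.mean())) / (xc @ xc)
        r = np.corrcoef(x, y)[0, 1]
        if best is None or r > best[0]:
            best = (r, slope, lag * dt)
    _, cvr, bat = best
    return cvr, bat
```

Run per voxel, this yields the CVR and BAT maps that serve as training targets when CO2-inhalation data are available.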

8.
IEEE Trans Image Process ; 31: 962-973, 2022.
Article in English | MEDLINE | ID: mdl-34965207

ABSTRACT

Adversarial robustness of deep neural networks has been actively investigated. However, most existing defense approaches are limited to a specific type of adversarial perturbation. Specifically, they often fail to offer resistance to multiple attack types simultaneously, i.e., they lack multi-perturbation robustness. Furthermore, compared to image recognition problems, the adversarial robustness of video recognition models is relatively unexplored. While several studies have proposed how to generate adversarial videos, only a handful of approaches about defense strategies have been published in the literature. In this paper, we propose one of the first defense strategies against multiple types of adversarial videos for video recognition. The proposed method, referred to as MultiBN, performs adversarial training on multiple adversarial video types using multiple independent batch normalization (BN) layers with a learning-based BN selection module. With a multiple BN structure, each BN branch is responsible for learning the distribution of a single perturbation type and thus provides more precise distribution estimations. This mechanism helps in dealing with multiple perturbation types. The BN selection module detects the attack type of an input video and sends it to the corresponding BN branch, making MultiBN fully automatic and allowing end-to-end training. Compared to existing adversarial training approaches, the proposed MultiBN exhibits stronger multi-perturbation robustness against different and even unforeseen adversarial video types, ranging from Lp-bounded attacks to physically realizable attacks. This holds true on different datasets and target models. Moreover, we conduct an extensive analysis to study the properties of the multiple BN structure.
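The mechanism, separate normalization statistics per perturbation type plus a selector that routes each input to one branch, can be sketched outside any deep-learning framework. The nearest-prototype selector below is a simple stand-in for the paper's learned selection module:

```python
import numpy as np

class MultiBN:
    """One set of normalization statistics per adversarial perturbation
    type, selected per input at inference time."""
    def __init__(self, n_branches, dim, eps=1e-5):
        self.stats = [(np.zeros(dim), np.ones(dim)) for _ in range(n_branches)]
        self.eps = eps

    def fit_branch(self, branch, X):
        """Estimate (mean, var) for one perturbation type's branch."""
        self.stats[branch] = (X.mean(axis=0), X.var(axis=0))

    def normalize(self, x, branch):
        mu, var = self.stats[branch]
        return (x - mu) / np.sqrt(var + self.eps)

def select_branch(x, prototypes):
    """Stand-in for the learned selection module: nearest prototype."""
    return int(np.argmin([np.linalg.norm(x - p) for p in prototypes]))
```

Because each branch only ever sees one perturbation type, its statistics stay sharp instead of averaging over several attack distributions.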

9.
IEEE Trans Pattern Anal Mach Intell ; 44(5): 2594-2609, 2022 05.
Article in English | MEDLINE | ID: mdl-33147141

ABSTRACT

We introduce a new large-scale unconstrained crowd counting dataset (JHU-CROWD++) that contains 4,372 images with 1.51 million annotations. In comparison to existing datasets, the proposed dataset is collected under a variety of diverse scenarios and environmental conditions. Specifically, the dataset includes several images with weather-based degradations and illumination variations, making it a very challenging dataset. Additionally, the dataset consists of a rich set of annotations at both the image level and head level. Several recent methods are evaluated and compared on this dataset. The dataset can be downloaded from http://www.crowd-counting.com. Furthermore, we propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation. The proposed method uses VGG16 as the backbone network and employs the density map generated by the final layer as a coarse prediction, which is refined into finer density maps in a progressive fashion using residual learning. Additionally, residual learning is guided by an uncertainty-based confidence weighting mechanism that permits the flow of only high-confidence residuals in the refinement path. The proposed Confidence Guided Deep Residual Counting Network (CG-DRCN) is evaluated on recent complex datasets and achieves significant error reductions.


Subject(s)
Algorithms; Benchmarking; Crowding
10.
Article in English | MEDLINE | ID: mdl-35609091

ABSTRACT

Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. An fPAD model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof images) are not directly shared between data owners due to legal and privacy issues. In this article, with the motivation of circumventing this challenge, we propose a federated face presentation attack detection (FedPAD) framework that simultaneously takes advantage of rich fPAD information available at different data owners while preserving data privacy. In the proposed framework, each data owner (referred to as a data center) locally trains its own fPAD model. A server learns a global fPAD model by iteratively aggregating model updates from all data centers without accessing private data in each of them. Once the learned global model converges, it is used for fPAD inference. To equip the aggregated fPAD model in the server with better generalization ability to unseen attacks from users, following the basic idea of FedPAD, we further propose a federated generalized face presentation attack detection (FedGPAD) framework. A federated domain disentanglement strategy is introduced in FedGPAD, which treats each data center as one domain and decomposes the fPAD model into domain-invariant and domain-specific parts in each data center. The two parts disentangle the domain-invariant and domain-specific features from images in each local data center. The server learns a global fPAD model by aggregating only the domain-invariant parts of the fPAD models from the data centers, and thus a more generalized fPAD model can be aggregated at the server. We introduce an experimental setting to evaluate the proposed FedPAD and FedGPAD frameworks and carry out extensive experiments to provide various insights about federated learning for fPAD.
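The server-side aggregation described here is, at its core, federated averaging: a dataset-size-weighted mean of the clients' parameter tensors. A minimal sketch (parameter names and sizes are illustrative); for the FedGPAD variant, only the domain-invariant part of each client model would be passed in:

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Server-side FedAvg: dataset-size-weighted average of client models.

    client_weights: list of dicts mapping parameter name -> np.ndarray
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    keys = client_weights[0].keys()
    return {
        k: sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in keys
    }
```

The server never sees raw images, only these parameter dictionaries, which is what preserves each data owner's privacy.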

11.
IEEE Trans Med Imaging ; 41(4): 965-976, 2022 04.
Article in English | MEDLINE | ID: mdl-34813472

ABSTRACT

Most methods for medical image segmentation use U-Net or its variants, as they have been successful in most applications. After a detailed analysis of these "traditional" encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This issue can be attributed to the increase in receptive field size as we go deeper into the encoder. The extra focus on learning high-level features causes U-Net based approaches to learn less information about low-level features, which are crucial for detecting small structures. To overcome this issue, we propose using an overcomplete convolutional architecture where we project the input image into a higher dimension such that we constrain the receptive field from increasing in the deep layers of the network. We design a new architecture for image segmentation, KiU-Net, which has two branches: (1) an overcomplete convolutional network, Kite-Net, which learns to capture fine details and accurate edges of the input, and (2) U-Net, which learns high-level features. Furthermore, we also propose KiU-Net 3D, a 3D convolutional architecture for volumetric segmentation. We perform a detailed study of KiU-Net by conducting experiments on five different datasets covering various image modalities. We achieve good performance with the additional benefits of fewer parameters and faster convergence. We also demonstrate that extensions of KiU-Net based on residual blocks and dense blocks result in further performance improvements. Code: https://github.com/jeya-maria-jose/KiU-Net-pytorch.
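The receptive-field argument can be made concrete with standard receptive-field arithmetic: each convolution adds (k-1)·jump pixels of context, and the resampling that follows multiplies the jump. Downsampling (a U-Net-style encoder) compounds the growth, while upsampling (the overcomplete Kite-Net branch) suppresses it. A small illustrative calculation:

```python
def receptive_field(kernel_sizes, scale_factors):
    """Receptive field (in input pixels) of stacked conv blocks, where each
    conv is followed by a resampling step: scale 2 = downsample by 2
    (undercomplete encoder), scale 0.5 = upsample by 2 (overcomplete)."""
    rf, jump = 1.0, 1.0
    for k, s in zip(kernel_sizes, scale_factors):
        rf += (k - 1) * jump  # context added by this conv
        jump *= s             # effective stride seen by the next layer
    return rf

# Three 3x3 conv blocks:
unet_rf = receptive_field([3, 3, 3], [2, 2, 2])        # 15.0 pixels
kite_rf = receptive_field([3, 3, 3], [0.5, 0.5, 0.5])  # 4.5 pixels
```

The overcomplete branch's receptive field stays a few pixels wide even at depth, which is why it retains the fine, low-level detail that the undercomplete branch loses.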


Subject(s)
Image Processing, Computer-Assisted; Neural Networks, Computer; Image Processing, Computer-Assisted/methods
12.
Appl Opt ; 50(10): 1425-33, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21460910

ABSTRACT

We present an automatic target recognition algorithm using the recently developed theory of sparse representations and compressive sensing. We show how sparsity can be helpful for efficient utilization of data for target recognition. We verify the efficacy of the proposed algorithm in terms of the recognition rate and confusion matrices on the well-known Comanche (Boeing-Sikorsky, USA) forward-looking IR data set consisting of ten different military targets at different orientations.

13.
IEEE Trans Image Process ; 30: 6570-6582, 2021.
Article in English | MEDLINE | ID: mdl-34270423

ABSTRACT

Recent CNN-based methods for image deraining have achieved excellent performance in terms of reconstruction error as well as visual quality. However, these methods are limited in the sense that they can be trained only on fully labeled data. Due to various challenges in obtaining real-world fully-labeled image deraining datasets, existing methods are trained only on synthetically generated data and hence generalize poorly to real-world images. The use of real-world data in training image deraining networks is relatively less explored in the literature. We propose a Gaussian Process-based semi-supervised learning framework which enables the network to learn deraining from a synthetic dataset while generalizing better using unlabeled real-world images. More specifically, we model the latent space vectors of unlabeled data using Gaussian Processes, which are then used to compute pseudo-ground-truth for supervising the network on unlabeled data. The pseudo-ground-truth is further used to supervise the network at the intermediate level for the unlabeled data. Through extensive experiments and ablations on several challenging datasets (such as Rain800, Rain200L and DDN-SIRR), we show that the proposed method is able to effectively leverage unlabeled data, thereby resulting in significantly better performance compared to labeled-only training. Additionally, we demonstrate that using unlabeled real-world images in the proposed GP-based framework results in superior performance compared to existing methods. Code is available at: https://github.com/rajeevyasarla/Syn2Real.
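The pseudo-ground-truth step can be sketched as plain Gaussian Process regression: condition on labeled latent/target pairs and take the posterior mean at each unlabeled latent vector. The RBF kernel and the tiny 1-D toy data below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """RBF kernel matrix between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def gp_pseudo_labels(Z_lab, Y_lab, Z_unlab, noise=1e-3):
    """GP posterior mean at unlabeled latents, used as pseudo-ground-truth."""
    K = rbf(Z_lab, Z_lab) + noise * np.eye(len(Z_lab))
    K_star = rbf(Z_unlab, Z_lab)
    return K_star @ np.linalg.solve(K, Y_lab)
```

The posterior mean interpolates the labeled (synthetic) examples smoothly, so unlabeled real-world latents near them receive consistent supervision targets.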

14.
Article in English | MEDLINE | ID: mdl-35444379

ABSTRACT

Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications. In recent years, deep learning-based methods have been shown to produce superior performance on MR image reconstruction. However, these methods require large amounts of data, which are difficult to collect and share due to the high cost of acquisition and medical data privacy regulations. In order to overcome this challenge, we propose a federated learning (FL) based solution in which we take advantage of the MR data available at different institutions while preserving patients' privacy. However, the generalizability of models trained with the FL setting can still be suboptimal due to domain shift, which results from data collected at multiple institutions with different sensors, disease types, and acquisition protocols. With the motivation of circumventing this challenge, we propose a cross-site modeling for MR image reconstruction in which the learned intermediate latent features among different source sites are aligned with the distribution of the latent features at the target site. Extensive experiments are conducted to provide various insights about FL for MR image reconstruction. Experimental results demonstrate that the proposed framework is a promising direction to utilize multi-institutional data without compromising patients' privacy for achieving improved MR image reconstruction. Our code is available at https://github.com/guopengf/FL-MRCM.

15.
IEEE Trans Med Imaging ; 40(10): 2832-2844, 2021 10.
Article in English | MEDLINE | ID: mdl-33351754

ABSTRACT

Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas in neuro-oncology, especially with the help of standard anatomic and advanced molecular MR images. However, data quantity and quality remain a key determinant of, and a significant limit on, the potential applications. In our previous work, we explored the synthesis of anatomic and molecular MR image networks (SAMR) in patients with post-treatment malignant gliomas. In this work, we extend this through a confidence-guided SAMR (CG-SAMR) that synthesizes data from lesion contour information to multi-modal MR images, including T1-weighted (T1w), gadolinium-enhanced T1w (Gd-T1w), T2-weighted (T2w), and fluid-attenuated inversion recovery (FLAIR) images, as well as the molecular amide proton transfer-weighted (APTw) sequence. We introduce a module that guides the synthesis based on a confidence measure of the intermediate results. Furthermore, we extend the proposed architecture to allow training using unpaired data. Extensive experiments on real clinical data demonstrate that the proposed model can perform better than the current state-of-the-art synthesis methods. Our code is available at https://github.com/guopengf/CG-SAMR.


Subject(s)
Glioma; Magnetic Resonance Imaging; Glioma/diagnostic imaging; Humans
16.
Article in English | MEDLINE | ID: mdl-34661201

ABSTRACT

Reconstructing magnetic resonance (MR) images from under-sampled data is a challenging problem due to various artifacts introduced by the under-sampling operation. Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture which captures low-level features at the initial layers and high-level features at the deeper layers. Such networks focus largely on global features, which may not be optimal for reconstructing the fully-sampled image. In this paper, we propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN). The overcomplete branch pays special attention to learning local structures by restraining the receptive field of the network. Combining it with the undercomplete branch leads to a network which focuses more on low-level features without losing out on the global structures. Extensive experiments on two datasets demonstrate that the proposed method achieves significant improvements over compressed sensing and popular deep learning-based methods with fewer trainable parameters.

17.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 2618-2621, 2021 11.
Article in English | MEDLINE | ID: mdl-34891790

ABSTRACT

The global pandemic of the novel coronavirus disease 2019 (COVID-19) has put tremendous pressure on the medical system. Imaging plays a complementary role in the management of patients with COVID-19. Computed tomography (CT) and chest X-ray (CXR) are the two dominant screening tools. However, the difficulty of eliminating the risk of disease transmission, radiation exposure, and limited cost-effectiveness are some of the challenges for CT and CXR imaging. This motivates the use of lung ultrasound (LUS) for evaluating COVID-19, given its practical advantages of noninvasiveness, repeatability, and bedside availability. In this paper, we utilize a deep learning model to perform the classification of COVID-19 from LUS data, which could produce objective diagnostic information for clinicians. Specifically, all LUS images are processed to obtain their corresponding local phase filtered images and radial symmetry transformed images before being fed into a multi-scale residual convolutional neural network (CNN). Combinations of these images are then used as network inputs to explore rich and reliable features. A feature fusion strategy at different levels is adopted to investigate the relationship between the depth of feature aggregation and the classification accuracy. Our proposed method is evaluated on the point-of-care US (POCUS) dataset together with the Italian COVID-19 Lung US database (ICLUS-DB) and shows promising performance for COVID-19 prediction.


Subject(s)
COVID-19; Humans; Lung/diagnostic imaging; Neural Networks, Computer; SARS-CoV-2
18.
Int J Comput Assist Radiol Surg ; 15(9): 1477-1485, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32656685

ABSTRACT

PURPOSE: Real-time, two- (2D) and three-dimensional (3D) ultrasound (US) has been investigated as a potential alternative to fluoroscopy imaging in various surgical and non-surgical orthopedic procedures. However, a low signal-to-noise ratio, imaging artifacts, and bone surfaces appearing several millimeters (mm) thick have hindered the widespread adoption of this safe imaging modality. A limited field of view and manual data collection cause additional problems during US-based orthopedic procedures. In order to overcome these limitations, various bone segmentation and registration methods have been developed. The acoustic bone shadow is an important image artifact used to identify the presence of bone boundaries in the collected US data. Information about the bone shadow region can be used (1) to guide the orthopedic surgeon or clinician to a standardized diagnostic viewing plane with minimal artifacts, and (2) as a prior feature to improve bone segmentation and registration. METHOD: In this work, we propose a computational method, based on a novel generative adversarial network (GAN) architecture, to segment bone shadow images from in vivo US scans in real-time. We also show how these segmented shadow images can be incorporated, as a proxy, into a multi-feature guided convolutional neural network (CNN) architecture for real-time and accurate bone surface segmentation. Quantitative and qualitative evaluation studies are performed on 1235 scans collected from 27 subjects using two different US machines. Finally, we provide qualitative and quantitative comparison results against state-of-the-art GANs. RESULTS: We obtained a mean Dice coefficient (± standard deviation) of [Formula: see text] ([Formula: see text]) for bone shadow segmentation, showing that the method is in close range with manual expert annotation. Statistically significant improvements over state-of-the-art GAN methods (paired t-test, [Formula: see text]) are also obtained. Using the segmented bone shadow features, an average bone localization accuracy of 0.11 mm ([Formula: see text]) was achieved. CONCLUSIONS: The reported accurate and robust results make the proposed method promising for various orthopedic procedures. Although we did not investigate this in the present work, the segmented bone shadow images could also be used as an additional feature to improve the accuracy of US-based registration methods. Further extensive validations are required in order to fully understand the clinical utility of the proposed method.
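The headline metric, the Dice coefficient between a predicted shadow mask and the expert annotation, is straightforward to state. A minimal sketch with toy masks:

```python
import numpy as np

def dice(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

# Half-overlapping toy masks:
p = np.array([1, 1, 0, 0])
g = np.array([1, 0, 1, 0])
score = dice(p, g)  # ~0.5: one shared pixel, two pixels in each mask
```

The small epsilon keeps the score defined (at 1.0) when both masks are empty, a common convention in segmentation evaluation.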


Subject(s)
Bone Diseases/diagnostic imaging; Bones/diagnostic imaging; Diagnosis, Computer-Assisted/methods; Fluoroscopy; Imaging, Three-Dimensional/methods; Neural Networks, Computer; Ultrasonography; Acoustics; Algorithms; Artifacts; Computer Simulation; Humans; Image Processing, Computer-Assisted; Orthopedic Procedures; Orthopedics; Reproducibility of Results
19.
Article in English | MEDLINE | ID: mdl-32365029

ABSTRACT

We propose a novel multi-stream architecture and training methodology that exploits semantic labels for facial image deblurring. The proposed Uncertainty Guided Multi-Stream Semantic Network (UMSN) processes regions belonging to each semantic class independently and learns to combine their outputs into the final deblurred result. Pixel-wise semantic labels are obtained using a segmentation network. A predicted confidence measure is used during training to guide the network towards the challenging regions of the human face, such as the eyes and nose. The entire network is trained in an end-to-end fashion. Comprehensive experiments on three different face datasets demonstrate that the proposed method achieves significant improvements over recent state-of-the-art face deblurring methods. Code is available.

20.
Int J Comput Assist Radiol Surg ; 15(7): 1127-1135, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32430694

ABSTRACT

PURPOSE: Automatic bone surface segmentation is one of the fundamental tasks of ultrasound (US)-guided computer-assisted orthopedic surgery procedures. However, due to various US imaging artifacts, manual operation of the transducer during acquisition, and different machine settings, many existing methods cannot deal with the large variations of bone surface responses in the collected data without manual parameter selection. Even for fully automatic methods, such as deep learning-based methods, the problem of dataset bias causes networks to perform poorly on US data that differ from the training set. METHODS: In this work, an intensity-invariant convolutional neural network (CNN) architecture is proposed for robust segmentation of bone surfaces from US data obtained from two different US machines with varying acquisition settings. The proposed CNN takes a US image as input and simultaneously generates two intermediate output images, denoted as the local phase tensor (LPT) and global context tensor (GCT), from two branches which are invariant to intensity variations. The LPT and GCT are fused to generate the final segmentation map. In the training process, the LPT network branch is supervised by precalculated ground truth without manual annotation. RESULTS: The proposed method is evaluated on 1227 in vivo US scans collected using two US machines, including a portable handheld ultrasound scanner, by scanning various bone surfaces from 28 volunteers. Validation of the proposed method on both US machines not only shows statistically significant improvements in cross-machine segmentation of bone surfaces compared to state-of-the-art methods but also achieves a computation time of 30 milliseconds per image, a [Formula: see text] improvement over the state-of-the-art. CONCLUSION: The encouraging results obtained in this initial study suggest that the proposed method is promising enough for further evaluation. Future work will include extensive validation of the method on new US data collected from various machines using different acquisition settings. We will also evaluate the potential of using the segmented bone surfaces as input to a point set-based registration method.


Subject(s)
Bones/surgery; Image Processing, Computer-Assisted/methods; Surgery, Computer-Assisted; Ultrasonography, Interventional/methods; Artifacts; Bones/diagnostic imaging; Deep Learning; Humans; Young Adult