Results 1 - 20 of 42
1.
Environ Sci Technol ; 58(11): 5014-5023, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38437169

ABSTRACT

Estimates of the land area occupied by wind energy differ by orders of magnitude due to data scarcity and inconsistent methodology. We developed a method that combines machine learning-based imagery analysis and geographic information systems and examined the land area of 318 wind farms (15,871 turbines) in the U.S. portion of the Western Interconnection. We found that prior land use and human modification in the project area are critical for the land-use efficiency and land transformation of wind projects. Projects developed in areas with little human modification have a land-use efficiency of 63.8 ± 8.9 W/m2 (mean ± 95% confidence interval) and a land transformation of 0.24 ± 0.07 m2/MWh, while values for projects in areas with high human modification are 447 ± 49.4 W/m2 and 0.05 ± 0.01 m2/MWh, respectively. We show that land resources for wind can be quantified consistently with our replicable method, which obviates >99% of the manual workload through machine learning. To quantify the peripheral impact of a turbine, buffered geometry can be used as a proxy for measuring land resources and metrics when a sufficiently large impact radius is assumed (e.g., >4 times the rotor diameter). Our analysis provides a necessary first step toward regionalized impact assessment and improved comparisons of energy alternatives.
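
Both reported metrics are simple ratios, and the buffered-geometry proxy is circle arithmetic per turbine. Below is a minimal Python sketch under stated assumptions: the function names and example figures are hypothetical, and the paper's land-use efficiency and land transformation rest on different area definitions (total project area vs. directly transformed area), so each metric takes its own area argument.

    import math

    def land_use_efficiency(capacity_mw, project_area_m2):
        """Installed capacity per unit of project land (W/m2)."""
        return capacity_mw * 1e6 / project_area_m2

    def land_transformation(transformed_area_m2, annual_gen_mwh):
        """Directly transformed land per unit of generation (m2/MWh)."""
        return transformed_area_m2 / annual_gen_mwh

    def buffered_footprint_m2(n_turbines, rotor_diameter_m, radius_factor=4.0):
        """Buffered-geometry proxy: one circular buffer per turbine with an
        impact radius of radius_factor rotor diameters (overlaps ignored)."""
        radius = radius_factor * rotor_diameter_m
        return n_turbines * math.pi * radius ** 2

    # Hypothetical 100 MW project: 1.6 km2 total area, 0.08 km2 transformed,
    # 350 GWh/yr generation, 50 turbines with 120 m rotor diameter.
    print(land_use_efficiency(100, 1.6e6))      # 62.5 W/m2
    print(land_transformation(8.0e4, 3.5e5))    # ~0.23 m2/MWh
    print(buffered_footprint_m2(50, 120))       # ~3.6e7 m2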


Subjects
Energy-Generating Resources; Wind; Humans; Farms; Physical Phenomena
2.
IEEE Trans Med Imaging ; 43(1): 582-593, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37703139

ABSTRACT

Accelerated magnetic resonance imaging (MRI) reconstruction is a challenging ill-posed inverse problem due to the excessive under-sampling operation in k-space. In this paper, we propose a recurrent Transformer model, namely ReconFormer, for MRI reconstruction, which can iteratively reconstruct high-fidelity magnetic resonance images from highly under-sampled k-space data (e.g., up to 8× acceleration). In particular, the proposed architecture is built upon Recurrent Pyramid Transformer Layers (RPTLs). The core design of the proposed method is Recurrent Scale-wise Attention (RSA), which jointly exploits intrinsic multi-scale information at every architecture unit as well as dependencies among deep feature correlations through recurrent states. Moreover, benefiting from its recurrent nature, ReconFormer is lightweight compared to other baselines, containing only 1.1M trainable parameters. We validate the effectiveness of ReconFormer on multiple datasets with different magnetic resonance sequences and show that it achieves significant improvements over state-of-the-art methods with better parameter efficiency. The implementation code and pre-trained weights are available at https://github.com/guopengf/ReconFormer.
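
The RPTL and RSA modules are specific to the paper and are not reproduced here, but the overall iterative scheme follows a familiar pattern: a learned refinement step alternating with a k-space data-consistency step that re-imposes the acquired samples. A minimal PyTorch sketch of that generic loop, with a placeholder convolutional refiner standing in for the Transformer layers, follows.

    import torch
    import torch.nn as nn

    class RecurrentRecon(nn.Module):
        """Generic unrolled recurrent reconstruction loop; the conv stack is a
        placeholder for ReconFormer's Recurrent Pyramid Transformer Layers."""
        def __init__(self, iters=5):
            super().__init__()
            self.iters = iters
            self.refine = nn.Sequential(   # placeholder refinement network
                nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 2, 3, padding=1))

        @staticmethod
        def data_consistency(x, k0, mask):
            """Replace k-space samples of x with the acquired values k0.
            x: (B, 2, H, W) real/imag channels; k0: complex (B, H, W)."""
            cplx = torch.view_as_complex(x.permute(0, 2, 3, 1).contiguous())
            k = torch.fft.fft2(cplx)
            k = torch.where(mask, k0, k)          # keep measured samples
            img = torch.fft.ifft2(k)
            return torch.view_as_real(img).permute(0, 3, 1, 2)

        def forward(self, x, k0, mask):
            for _ in range(self.iters):           # recurrent refinement
                x = x + self.refine(x)            # residual update
                x = self.data_consistency(x, k0, mask)
            return x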


Subjects
Image Processing, Computer-Assisted; Magnetic Resonance Imaging
3.
NPJ Digit Med ; 6(1): 116, 2023 Jun 21.
Article in English | MEDLINE | ID: mdl-37344684

ABSTRACT

Cerebrovascular disease is a leading cause of death globally. Prevention and early intervention are known to be the most effective forms of its management. Non-invasive imaging methods hold great promise for early stratification, but at present lack the sensitivity for personalized prognosis. Resting-state functional magnetic resonance imaging (rs-fMRI), a powerful tool previously used for mapping neural activity, is available in most hospitals. Here we show that rs-fMRI can be used to map cerebral hemodynamic function and delineate impairment. By exploiting time variations in breathing pattern during rs-fMRI, deep learning enables reproducible mapping of cerebrovascular reactivity (CVR) and bolus arrival time (BAT) of the human brain, using resting-state CO2 fluctuations as a natural "contrast medium". The deep-learning network is trained with CVR and BAT maps obtained with a reference method of CO2-inhalation MRI, including data from young and older healthy subjects and from patients with Moyamoya disease and brain tumors. We demonstrate the performance of deep-learning cerebrovascular mapping in the detection of vascular abnormalities, the evaluation of revascularization effects, and the assessment of vascular alterations in normal aging. In addition, cerebrovascular maps obtained with the proposed method exhibit excellent reproducibility in both healthy volunteers and stroke patients. Deep-learning resting-state vascular imaging has the potential to become a useful tool in clinical cerebrovascular imaging.
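
Reference CVR and BAT maps of the kind used here as training targets are conventionally derived from a CO2 regressor by lagged correlation: BAT as the delay maximizing the voxel-regressor correlation, and CVR as the regression slope at that delay. The numpy sketch below illustrates that reference computation only; all names and hyper-parameters are illustrative assumptions, and the paper's deep network replaces this step at inference.

    import numpy as np

    def cvr_bat_reference(bold_ts, co2_ts, tr=2.0, max_lag_s=20.0):
        """bold_ts: (n_voxels, n_timepoints); co2_ts: (n_timepoints,).
        Returns CVR (regression slope at best lag) and BAT (lag in seconds)."""
        n_vox, n_t = bold_ts.shape
        max_lag = int(max_lag_s / tr)
        cvr, bat = np.zeros(n_vox), np.zeros(n_vox)
        for v in range(n_vox):
            best_r, best_lag = -np.inf, 0
            for lag in range(max_lag + 1):        # shift regressor, score fit
                r = np.corrcoef(co2_ts[:n_t - lag], bold_ts[v, lag:])[0, 1]
                if r > best_r:
                    best_r, best_lag = r, lag
            x, y = co2_ts[:n_t - best_lag], bold_ts[v, best_lag:]
            cvr[v] = np.polyfit(x, y, 1)[0]       # signal change per mmHg CO2
            bat[v] = best_lag * tr                # arrival delay in seconds
        return cvr, bat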

4.
Article in English | MEDLINE | ID: mdl-37030853

ABSTRACT

Recent advances in deep learning have led to the development of accurate and efficient models for various computer vision applications such as classification, segmentation, and detection. However, learning highly accurate models relies on the availability of large-scale annotated datasets. As a result, model performance drops drastically when models are evaluated on label-scarce datasets of visually distinct images, a challenge termed the domain adaptation problem. A plethora of works adapt classification and segmentation models to label-scarce target datasets through unsupervised domain adaptation. Considering that detection is a fundamental task in computer vision, many recent works have focused on developing novel domain adaptive detection techniques. Here, we describe the domain adaptation problem for detection in detail and present an extensive survey of the various methods. Furthermore, we highlight the strategies proposed and their associated shortcomings. Subsequently, we identify multiple aspects of the problem that are most promising for future research. We believe that this survey will be valuable to pattern recognition experts working in the fields of computer vision, biometrics, medical imaging, and autonomous navigation by introducing them to the problem, familiarizing them with the current status of the field, and providing promising directions for future research.

5.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 4167-4179, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35816537

ABSTRACT

One-class novelty detectors are trained with examples of a particular class and are tasked with identifying whether a query example belongs to the same known class. Most recent advances adopt a deep auto-encoder style architecture to compute novelty scores for detecting novel class data. Deep networks have been shown to be vulnerable to adversarial attacks, yet little attention has been devoted to the adversarial robustness of deep novelty detectors. In this article, we first show that existing novelty detectors are susceptible to adversarial examples. We further demonstrate that commonly used defense approaches for classification tasks have limited effectiveness in one-class novelty detection; hence, a defense specifically designed for novelty detection is needed. To this end, we propose a defense strategy that manipulates the latent space of novelty detectors to improve robustness against adversarial examples. The proposed method, referred to as Principal Latent Space (PrincipaLS), learns incrementally trained cascade principal components in the latent space to robustify novelty detectors. PrincipaLS can purify the latent space against adversarial examples and constrain it to exclusively model the known class distribution. We conduct extensive experiments on eight attacks, five datasets, and seven novelty detectors, showing that PrincipaLS consistently enhances the adversarial robustness of novelty detection models.
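
The core latent-space operation, projecting a code onto principal components learned from known-class data and reconstructing it, can be sketched compactly. The PyTorch snippet below shows a single (non-incremental, non-cascaded) PCA purification step as a stand-in for PrincipaLS; the fitting call in the comment and all names are assumptions.

    import torch

    def pca_purify(z, components, mean):
        """Project latent codes onto the top-k principal components and
        reconstruct, discarding off-manifold (potentially adversarial)
        directions. components: (k, d) with orthonormal rows."""
        coeffs = (z - mean) @ components.T   # coordinates in the PC basis
        return coeffs @ components + mean    # rank-k reconstruction

    # Illustrative fitting on known-class latents train_z of shape (n, d):
    # mean = train_z.mean(0)
    # U, S, V = torch.pca_lowrank(train_z - mean, q=k)   # V: (d, k)
    # purified = pca_purify(query_z, V.T, mean)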

6.
Article in English | MEDLINE | ID: mdl-35609091

ABSTRACT

Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. An fPAD model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof images) are not directly shared between data owners due to legal and privacy issues. In this article, motivated by circumventing this challenge, we propose a federated face presentation attack detection (FedPAD) framework that simultaneously takes advantage of the rich fPAD information available at different data owners while preserving data privacy. In the proposed framework, each data owner (referred to as a data center) locally trains its own fPAD model. A server learns a global fPAD model by iteratively aggregating model updates from all data centers without accessing the private data in any of them. Once the learned global model converges, it is used for fPAD inference. To equip the aggregated fPAD model in the server with better generalization to unseen attacks, we further propose, following the basic idea of FedPAD, a federated generalized face presentation attack detection (FedGPAD) framework. FedGPAD introduces a federated domain disentanglement strategy that treats each data center as one domain and decomposes the fPAD model into domain-invariant and domain-specific parts in each data center; these two parts disentangle the domain-invariant and domain-specific features from images in each local data center. The server learns a global fPAD model by aggregating only the domain-invariant parts of the fPAD models from the data centers, so a more generalized fPAD model is aggregated in the server. We introduce an experimental setting to evaluate the proposed FedPAD and FedGPAD frameworks and carry out extensive experiments to provide various insights about federated learning for fPAD.
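
The server-side aggregation in such frameworks is typically a parameter average over the locally trained models. A minimal FedAvg-style sketch follows; it is a generic illustration rather than the paper's exact procedure, and a FedGPAD-like variant would average only the keys belonging to the domain-invariant submodule.

    import copy
    import torch

    def federated_average(global_model, center_models, weights=None):
        """Average parameters from data centers without touching their data."""
        states = [m.state_dict() for m in center_models]
        weights = weights or [1.0 / len(states)] * len(states)
        avg = copy.deepcopy(states[0])
        for key in avg:
            if avg[key].dtype.is_floating_point:  # skip e.g. BN step counters
                avg[key] = sum(w * s[key] for w, s in zip(weights, states))
        global_model.load_state_dict(avg)
        return global_model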

7.
IEEE Trans Pattern Anal Mach Intell ; 44(5): 2594-2609, 2022 05.
Article in English | MEDLINE | ID: mdl-33147141

ABSTRACT

We introduce a new large-scale unconstrained crowd counting dataset (JHU-CROWD++) that contains 4,372 images with 1.51 million annotations. In comparison to existing datasets, the proposed dataset is collected under a variety of diverse scenarios and environmental conditions. Specifically, the dataset includes several images with weather-based degradations and illumination variations, making it very challenging. Additionally, the dataset provides a rich set of annotations at both image level and head level. Several recent methods are evaluated and compared on this dataset. The dataset can be downloaded from http://www.crowd-counting.com. Furthermore, we propose a novel crowd counting network that progressively generates crowd density maps via residual error estimation. The proposed method uses VGG16 as the backbone network and employs the density map generated by the final layer as a coarse prediction, which is refined into progressively finer density maps using residual learning. Additionally, the residual learning is guided by an uncertainty-based confidence weighting mechanism that permits only high-confidence residuals to flow through the refinement path. The proposed Confidence Guided Deep Residual Counting Network (CG-DRCN) is evaluated on recent complex datasets and achieves significant improvements in error.
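
The confidence-weighted refinement can be pictured as gating the residual correction pixel by pixel. Below is a hedged PyTorch sketch of that one mechanism; the VGG16 backbone and the exact CG-DRCN heads are not reproduced, and the layer shapes are assumptions.

    import torch
    import torch.nn as nn

    class ConfidenceGuidedRefine(nn.Module):
        """Refine a coarse density map with a residual gated by predicted
        per-pixel confidence, so only trusted residuals flow through."""
        def __init__(self, ch=64):
            super().__init__()
            self.residual = nn.Conv2d(ch, 1, 3, padding=1)
            self.confidence = nn.Sequential(
                nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid())

        def forward(self, feats, coarse_density):
            res = self.residual(feats)            # predicted correction
            conf = self.confidence(feats)         # trust in that correction
            return coarse_density + conf * res    # gated residual update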


Subjects
Algorithms; Benchmarking; Crowding
8.
IEEE Trans Image Process ; 31: 962-973, 2022.
Article in English | MEDLINE | ID: mdl-34965207

ABSTRACT

The adversarial robustness of deep neural networks has been actively investigated. However, most existing defense approaches are limited to a specific type of adversarial perturbation; they often fail to offer resistance to multiple attack types simultaneously, i.e., they lack multi-perturbation robustness. Furthermore, compared to image recognition problems, the adversarial robustness of video recognition models is relatively unexplored. While several studies have proposed ways to generate adversarial videos, only a handful of defense strategies have been published in the literature. In this paper, we propose one of the first defense strategies against multiple types of adversarial videos for video recognition. The proposed method, referred to as MultiBN, performs adversarial training on multiple adversarial video types using multiple independent batch normalization (BN) layers with a learning-based BN selection module. With a multiple-BN structure, each BN branch is responsible for learning the distribution of a single perturbation type and thus provides more precise distribution estimations, which benefits the handling of multiple perturbation types. The BN selection module detects the attack type of an input video and sends it to the corresponding BN branch, making MultiBN fully automatic and allowing end-to-end training. Compared to existing adversarial training approaches, the proposed MultiBN exhibits stronger multi-perturbation robustness against different and even unforeseen adversarial video types, ranging from Lp-bounded attacks to physically realizable attacks. This holds true on different datasets and target models. Moreover, we conduct an extensive analysis to study the properties of the multiple-BN structure.
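
The multiple-BN idea reduces to routing each input through the BatchNorm branch matching its (predicted) perturbation type, so each branch tracks one distribution. A minimal PyTorch sketch follows; the learned selection module is simplified here to an integer index, which is an assumption.

    import torch
    import torch.nn as nn

    class MultiBN2d(nn.Module):
        """One BatchNorm2d branch per perturbation type; the input is routed
        to the branch whose running statistics match its attack type."""
        def __init__(self, channels, n_types):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.BatchNorm2d(channels) for _ in range(n_types))

        def forward(self, x, type_idx):
            return self.branches[type_idx](x)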

9.
IEEE Trans Med Imaging ; 41(4): 965-976, 2022 04.
Article in English | MEDLINE | ID: mdl-34813472

ABSTRACT

Most methods for medical image segmentation use U-Net or its variants because they have been successful in most applications. After a detailed analysis of these "traditional" encoder-decoder based approaches, we observed that they perform poorly in detecting smaller structures and are unable to segment boundary regions precisely. This issue can be attributed to the increase in receptive field size as we go deeper into the encoder: the extra focus on learning high-level features causes U-Net based approaches to learn less about the low-level features that are crucial for detecting small structures. To overcome this issue, we propose an overcomplete convolutional architecture in which the input image is projected into a higher dimension, constraining the receptive field from increasing in the deep layers of the network. We design a new architecture for image segmentation, KiU-Net, which has two branches: (1) an overcomplete convolutional network, Kite-Net, which learns to capture fine details and accurate edges of the input, and (2) U-Net, which learns high-level features. Furthermore, we also propose KiU-Net 3D, a 3D convolutional architecture for volumetric segmentation. We perform a detailed study of KiU-Net through experiments on five different datasets covering various image modalities. We achieve good performance with the additional benefits of fewer parameters and faster convergence. We also demonstrate that extensions of KiU-Net based on residual blocks and dense blocks yield further performance improvements. Code: https://github.com/jeya-maria-jose/KiU-Net-pytorch.
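
The overcomplete idea is that upsampling before convolving keeps the effective receptive field small, so deep layers stay sensitive to fine structures. A minimal PyTorch sketch of such a Kite-Net-style branch is given below; channel counts and depths are assumptions, and the full KiU-Net additionally fuses this branch with a U-Net at each level.

    import torch
    import torch.nn as nn

    class OvercompleteBranch(nn.Module):
        """Project the input to a higher spatial dimension before each conv,
        constraining receptive-field growth and preserving fine detail."""
        def __init__(self, in_ch=1, ch=16):
            super().__init__()
            self.block = nn.Sequential(
                nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())

        def forward(self, x):
            # Spatial size grows 4x while the receptive field stays local.
            return self.block(x)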


Subjects
Image Processing, Computer-Assisted; Neural Networks, Computer; Image Processing, Computer-Assisted/methods
10.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 2618-2621, 2021 11.
Article in English | MEDLINE | ID: mdl-34891790

ABSTRACT

The global pandemic of the novel coronavirus disease 2019 (COVID-19) has put tremendous pressure on the medical system. Imaging plays a complementary role in the management of patients with COVID-19. Computed tomography (CT) and chest X-ray (CXR) are the two dominant screening tools, but the difficulty of eliminating the risk of disease transmission, radiation exposure, and limited cost-effectiveness are among their challenges. This has motivated the use of lung ultrasound (LUS) for evaluating COVID-19, owing to its practical advantages of noninvasiveness, repeatability, and bedside availability. In this paper, we utilize a deep learning model to classify COVID-19 from LUS data, which could provide objective diagnostic information for clinicians. Specifically, all LUS images are processed to obtain their corresponding local phase filtered images and radial symmetry transformed images before being fed into a multi-scale residual convolutional neural network (CNN). Combinations of these images are then used as network inputs to extract rich and reliable features, and feature fusion at different levels is adopted to investigate the relationship between the depth of feature aggregation and classification accuracy. Our proposed method is evaluated on the point-of-care US (POCUS) dataset together with the Italian COVID-19 Lung US database (ICLUS-DB) and shows promising performance for COVID-19 prediction.


Subjects
COVID-19; Humans; Lung/diagnostic imaging; Neural Networks, Computer; SARS-CoV-2
11.
Article in English | MEDLINE | ID: mdl-34661201

ABSTRACT

Reconstructing magnetic resonance (MR) images from under-sampled data is a challenging problem due to the various artifacts introduced by the under-sampling operation. Recent deep learning-based methods for MR image reconstruction usually leverage a generic auto-encoder architecture that captures low-level features at the initial layers and high-level features at the deeper layers. Such networks focus heavily on global features, which may not be optimal for reconstructing the fully-sampled image. In this paper, we propose an Over-and-Under Complete Convolutional Recurrent Neural Network (OUCR), which consists of an overcomplete and an undercomplete Convolutional Recurrent Neural Network (CRNN). The overcomplete branch focuses on learning local structures by restraining the receptive field of the network; combining it with the undercomplete branch leads to a network that attends more to low-level features without losing the global structures. Extensive experiments on two datasets demonstrate that the proposed method achieves significant improvements over compressed sensing and popular deep learning-based methods with fewer trainable parameters.

12.
IEEE Trans Image Process ; 30: 6570-6582, 2021.
Article in English | MEDLINE | ID: mdl-34270423

ABSTRACT

Recent CNN-based methods for image deraining have achieved excellent performance in terms of reconstruction error as well as visual quality. However, these methods can be trained only on fully labeled data. Due to the challenges of obtaining real-world fully-labeled image deraining datasets, existing methods are trained only on synthetically generated data and hence generalize poorly to real-world images. The use of real-world data in training image deraining networks is relatively unexplored in the literature. We propose a Gaussian Process-based semi-supervised learning framework that enables the network to learn deraining from a synthetic dataset while generalizing better through unlabeled real-world images. More specifically, we model the latent space vectors of unlabeled data using Gaussian Processes, which are then used to compute pseudo-ground-truth for supervising the network on unlabeled data. The pseudo ground truth is further used to supervise the network at the intermediate level for the unlabeled data. Through extensive experiments and ablations on several challenging datasets (such as Rain800, Rain200L, and DDN-SIRR), we show that the proposed method effectively leverages unlabeled data, resulting in significantly better performance than labeled-only training. Additionally, we demonstrate that using unlabeled real-world images in the proposed GP-based framework results in superior performance compared to existing methods. Code is available at: https://github.com/rajeevyasarla/Syn2Real.
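
Modeling unlabeled latents with a Gaussian Process amounts to expressing each unlabeled latent vector as a kernel-weighted combination of labeled ones (the GP posterior mean) and treating the result as a pseudo ground truth at the intermediate layer. A compact PyTorch sketch is below; the RBF kernel and hyper-parameters are illustrative assumptions, not the paper's exact settings.

    import torch

    def gp_pseudo_latents(z_unlabeled, z_labeled, sigma=1.0, noise=1e-2):
        """GP posterior mean over labeled latents, used as pseudo ground
        truth for the unlabeled latents. z_*: (n, d) latent vectors."""
        def rbf(a, b):
            return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
        k_ll = rbf(z_labeled, z_labeled) + noise * torch.eye(len(z_labeled))
        k_ul = rbf(z_unlabeled, z_labeled)
        return k_ul @ torch.linalg.solve(k_ll, z_labeled)  # posterior mean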

13.
Article in English | MEDLINE | ID: mdl-35444379

ABSTRACT

Fast and accurate reconstruction of magnetic resonance (MR) images from under-sampled data is important in many clinical applications. In recent years, deep learning-based methods have been shown to produce superior performance on MR image reconstruction. However, these methods require large amounts of data, which are difficult to collect and share due to the high cost of acquisition and medical data privacy regulations. To overcome this challenge, we propose a federated learning (FL) based solution in which we take advantage of the MR data available at different institutions while preserving patients' privacy. However, the generalizability of models trained in the FL setting can still be suboptimal due to domain shift, which results from data collected at multiple institutions with different sensors, disease types, and acquisition protocols. Motivated by circumventing this challenge, we propose cross-site modeling for MR image reconstruction, in which the learned intermediate latent features from different source sites are aligned with the distribution of the latent features at the target site. Extensive experiments are conducted to provide various insights about FL for MR image reconstruction. Experimental results demonstrate that the proposed framework is a promising direction for utilizing multi-institutional data, without compromising patients' privacy, to achieve improved MR image reconstruction. Our code is available at https://github.com/guopengf/FL-MRCM.
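
One simple way to realize "aligning source-site latent features with the target-site distribution" is a distribution-matching penalty between the two sets of latents. The sketch below uses an RBF-kernel maximum mean discrepancy as a hedged stand-in; the paper's actual alignment mechanism may differ (e.g., adversarial), so treat this only as an illustration of the idea.

    import torch

    def mmd_align(z_source, z_target, sigma=1.0):
        """Biased RBF-kernel MMD between source and target latent batches;
        minimizing it pulls the source distribution toward the target's."""
        def k(a, b):
            return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
        return (k(z_source, z_source).mean()
                + k(z_target, z_target).mean()
                - 2 * k(z_source, z_target).mean())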

14.
IEEE Trans Med Imaging ; 40(10): 2832-2844, 2021 10.
Article in English | MEDLINE | ID: mdl-33351754

ABSTRACT

Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas in neuro-oncology, especially with the help of standard anatomic and advanced molecular MR images. However, data quantity and quality remain a key determinant, and a significant limit, of potential applications. In our previous work, we explored the synthesis of anatomic and molecular MR image networks (SAMR) in patients with post-treatment malignant gliomas. In this work, we extend this through a confidence-guided SAMR (CG-SAMR) that synthesizes data from lesion contour information to multi-modal MR images, including T1-weighted (T1w), gadolinium-enhanced T1w (Gd-T1w), T2-weighted (T2w), and fluid-attenuated inversion recovery (FLAIR) images, as well as the molecular amide proton transfer-weighted (APTw) sequence. We introduce a module that guides the synthesis based on a confidence measure of the intermediate results. Furthermore, we extend the proposed architecture to allow training using unpaired data. Extensive experiments on real clinical data demonstrate that the proposed model performs better than current state-of-the-art synthesis methods. Our code is available at https://github.com/guopengf/CG-SAMR.


Subjects
Glioma; Magnetic Resonance Imaging; Glioma/diagnostic imaging; Humans
15.
Article in English | MEDLINE | ID: mdl-33103161

ABSTRACT

The current protocol for Amide Proton Transfer-weighted (APTw) imaging commonly starts with the acquisition of high-resolution T2-weighted (T2w) images, followed by APTw imaging at a particular geometry and locations (i.e., slices) determined by the acquired T2w images. Although many advanced MRI reconstruction methods have been proposed to accelerate MRI, existing methods for APTw MRI lack the capability to take advantage of the structural information in the acquired T2w images for reconstruction. In this paper, we present a novel APTw image reconstruction framework that accelerates APTw imaging by reconstructing APTw images directly from highly undersampled k-space data and the corresponding T2w image at the same location. The proposed framework starts with a novel sparse representation-based slice matching algorithm that aims to find the matched T2w slice given only the undersampled APTw image. A Recurrent Feature Sharing Reconstruction network (RFS-Rec) is designed to utilize intermediate features extracted from the matched T2w image by a Convolutional Recurrent Neural Network (CRNN), so that the missing structural information can be incorporated into the undersampled raw APT image, effectively improving the quality of the reconstructed APTw image. We evaluate the proposed method on two real datasets consisting of brain data from rats and humans. Extensive experiments demonstrate that the proposed RFS-Rec approach outperforms state-of-the-art methods.

16.
Med Image Comput Comput Assist Interv ; 12262: 104-113, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33073265

ABSTRACT

Data-driven automatic approaches have demonstrated their great potential in resolving various clinical diagnostic dilemmas for patients with malignant gliomas in neuro-oncology with the help of conventional and advanced molecular MR images. However, the lack of sufficient annotated MRI data has vastly impeded the development of such automatic methods. Conventional data augmentation approaches, including flipping, scaling, rotation, and distortion, are not capable of generating data with diverse image content. In this paper, we propose a method, called synthesis of anatomic and molecular MR images network (SAMR), which can simultaneously synthesize data from arbitrarily manipulated lesion information on multiple anatomic and molecular MRI sequences, including T1-weighted (T1w), gadolinium-enhanced T1w (Gd-T1w), T2-weighted (T2w), fluid-attenuated inversion recovery (FLAIR), and amide proton transfer-weighted (APTw). The proposed framework consists of a stretch-out up-sampling module, a brain atlas encoder, a segmentation consistency module, and multi-scale label-wise discriminators. Extensive experiments on real clinical data demonstrate that the proposed model performs significantly better than state-of-the-art synthesis methods.

17.
Int J Comput Assist Radiol Surg ; 15(9): 1477-1485, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32656685

ABSTRACT

PURPOSE: Real-time two-dimensional (2D) and three-dimensional (3D) ultrasound (US) has been investigated as a potential alternative to fluoroscopy imaging in various surgical and non-surgical orthopedic procedures. However, low signal-to-noise ratio, imaging artifacts, and bone surfaces appearing several millimeters (mm) thick have hindered the widespread adoption of this safe imaging modality. Limited field of view and manual data collection cause additional problems during US-based orthopedic procedures. To overcome these limitations, various bone segmentation and registration methods have been developed. The acoustic bone shadow is an important image artifact used to identify the presence of bone boundaries in the collected US data. Information about the bone shadow region can be used (1) to guide the orthopedic surgeon or clinician to a standardized diagnostic viewing plane with minimal artifacts, and (2) as a prior feature to improve bone segmentation and registration. METHOD: In this work, we propose a computational method, based on a novel generative adversarial network (GAN) architecture, to segment bone shadow images from in vivo US scans in real time. We also show how these segmented shadow images can be incorporated, as a proxy, into a multi-feature guided convolutional neural network (CNN) architecture for real-time and accurate bone surface segmentation. Quantitative and qualitative evaluation studies are performed on 1235 scans collected from 27 subjects using two different US machines. Finally, we provide qualitative and quantitative comparison results against state-of-the-art GANs. RESULTS: We obtained a mean dice coefficient (± standard deviation) of [Formula: see text] ([Formula: see text]) for bone shadow segmentation, showing that the method is in close range with manual expert annotation. Statistically significant improvements over state-of-the-art GAN methods (paired t-test, [Formula: see text]) are also obtained. Using the segmented bone shadow features, an average bone localization accuracy of 0.11 mm ([Formula: see text]) was achieved. CONCLUSIONS: The reported accurate and robust results make the proposed method promising for various orthopedic procedures. Although not investigated in this work, the segmented bone shadow images could also be used as an additional feature to improve the accuracy of US-based registration methods. Further extensive validation is required to fully understand the clinical utility of the proposed method.
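
The dice coefficient reported above is a straightforward overlap ratio between predicted and expert masks. For reference, a short numpy implementation (names are illustrative):

    import numpy as np

    def dice_coefficient(pred_mask, gt_mask, eps=1e-7):
        """Dice = 2 * |P and G| / (|P| + |G|) for binary masks."""
        p, g = pred_mask.astype(bool), gt_mask.astype(bool)
        inter = np.logical_and(p, g).sum()
        return (2.0 * inter + eps) / (p.sum() + g.sum() + eps)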


Subjects
Bone Diseases/diagnostic imaging; Bone and Bones/diagnostic imaging; Diagnosis, Computer-Assisted/methods; Fluoroscopy; Imaging, Three-Dimensional/methods; Neural Networks, Computer; Ultrasonography; Acoustics; Algorithms; Artifacts; Computer Simulation; Humans; Image Processing, Computer-Assisted; Orthopedic Procedures; Orthopedics; Reproducibility of Results
18.
Int J Comput Assist Radiol Surg ; 15(7): 1127-1135, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32430694

ABSTRACT

PURPOSE: Automatic bone surface segmentation is one of the fundamental tasks of ultrasound (US)-guided computer-assisted orthopedic surgery procedures. However, due to various US imaging artifacts, manual operation of the transducer during acquisition, and different machine settings, many existing methods cannot deal with the large variations of bone surface responses in the collected data without manual parameter selection. Even for fully automatic methods, such as deep learning-based methods, dataset bias causes networks to perform poorly on US data that differ from the training set. METHODS: In this work, an intensity-invariant convolutional neural network (CNN) architecture is proposed for robust segmentation of bone surfaces from US data obtained from two different US machines with varying acquisition settings. The proposed CNN takes a US image as input and simultaneously generates two intermediate output images, denoted as the local phase tensor (LPT) and global context tensor (GCT), from two branches that are invariant to intensity variations. The LPT and GCT are fused to generate the final segmentation map. In the training process, the LPT network branch is supervised by precalculated ground truth without manual annotation. RESULTS: The proposed method is evaluated on 1227 in vivo US scans collected using two US machines, including a portable handheld ultrasound scanner, by scanning various bone surfaces from 28 volunteers. Validation on both US machines not only shows statistically significant improvements in cross-machine segmentation of bone surfaces compared to state-of-the-art methods but also achieves a computation time of 30 milliseconds per image, a [Formula: see text] improvement over the state-of-the-art. CONCLUSION: The encouraging results obtained in this initial study suggest that the proposed method is promising enough for further evaluation. Future work will include extensive validation of the method on new US data collected from various machines using different acquisition settings. We will also evaluate the potential of using the segmented bone surfaces as input to a point set-based registration method.


Subjects
Bone and Bones/surgery; Image Processing, Computer-Assisted/methods; Surgery, Computer-Assisted; Ultrasonography, Interventional/methods; Artifacts; Bone and Bones/diagnostic imaging; Deep Learning; Humans; Young Adult
19.
Article in English | MEDLINE | ID: mdl-32365029

ABSTRACT

We propose a novel multi-stream architecture and training methodology that exploits semantic labels for facial image deblurring. The proposed Uncertainty Guided Multi-Stream Semantic Network (UMSN) processes regions belonging to each semantic class independently and learns to combine their outputs into the final deblurred result. Pixel-wise semantic labels are obtained using a segmentation network. A predicted confidence measure is used during training to guide the network towards the challenging regions of the human face such as the eyes and nose. The entire network is trained in an end-to-end fashion. Comprehensive experiments on three different face datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art face deblurring methods. Code is available at.

20.
Article in English | MEDLINE | ID: mdl-31329118

ABSTRACT

Single image-based crowd counting has recently witnessed increased focus, but many leading methods are far from optimal, especially in highly congested scenes. In this paper, we present the Hierarchical Attention-based Crowd Counting Network (HA-CCN), which employs attention mechanisms at various levels to selectively enhance the features of the network. The proposed method, which is based on the VGG16 network, consists of a spatial attention module (SAM) and a set of global attention modules (GAMs). The SAM enhances low-level features in the network by infusing spatial segmentation information, whereas the GAMs focus on enhancing channel-wise information in the higher-level layers. The proposed method is a single-step training framework, is simple to implement, and achieves state-of-the-art results on different datasets. Furthermore, we extend the proposed counting network by introducing a novel set-up to adapt the network to different scenes and datasets via weak supervision using image-level labels. This new set-up reduces the burden of acquiring labour-intensive point-wise annotations for new datasets while improving cross-dataset performance.
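
Channel-wise ("global") attention of the kind GAM applies is commonly built in squeeze-and-excitation form: pool each channel to a scalar, predict per-channel weights with a small MLP, and rescale the feature map. The PyTorch sketch below shows that pattern as an assumption about the mechanism; HA-CCN's exact modules are not reproduced.

    import torch
    import torch.nn as nn

    class GlobalAttention(nn.Module):
        """Squeeze-and-excitation style channel attention."""
        def __init__(self, ch, reduction=8):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(ch, ch // reduction), nn.ReLU(),
                nn.Linear(ch // reduction, ch), nn.Sigmoid())

        def forward(self, x):                     # x: (B, C, H, W)
            w = self.fc(x.mean(dim=(2, 3)))       # squeeze: global average pool
            return x * w[:, :, None, None]        # excite: channel-wise rescale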
