Results 1 - 20 of 59
1.
Opt Express ; 32(10): 17763-17774, 2024 May 06.
Article in English | MEDLINE | ID: mdl-38858949

ABSTRACT

Terahertz (THz) tomographic imaging based on time-resolved THz signals has attracted significant attention due to its non-invasive, non-destructive, non-ionizing, material-classifying, and ultrafast-frame-rate nature for object exploration and inspection. However, the material and geometric information of the tested objects is inherently embedded in the highly distorted THz time-domain signals, leading to substantial computational complexity and the need for intricate multi-physics models to extract the desired information. To address this challenge, we present a THz multi-dimensional tomographic framework with a multi-scale spatio-spectral fusion Unet (MS3-Unet), capable of fusing and jointly exploiting THz signals across diverse signal domains. MS3-Unet employs multi-scale branches to extract spatio-spectral features, which are subsequently processed through element-wise adaptive filters and fused to achieve high-quality THz image restoration. Evaluated on geometry-variant objects, MS3-Unet outperforms peer methods in PSNR and SSIM. Beyond this superior performance, the proposed framework provides a highly scalable, adjustable, and accessible interface for collaboration with different user-defined models or methods.
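
As a rough illustration of the element-wise adaptive filtering and fusion step described above, the following PyTorch sketch fuses two multi-scale feature maps with a learned per-element gate. Layer sizes and the gating design are assumptions for illustration, not the authors' MS3-Unet code.

```python
# Minimal sketch of element-wise adaptive fusion of two feature maps.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # predicts an element-wise gate from the concatenated branch features
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        g = self.gate(torch.cat([feat_a, feat_b], dim=1))  # (B, C, H, W) weights in [0, 1]
        return g * feat_a + (1.0 - g) * feat_b             # element-wise weighted fusion

fused = AdaptiveFusion(64)(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(fused.shape)  # torch.Size([1, 64, 32, 32])
```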

2.
Neural Netw ; 174: 106218, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38518709

ABSTRACT

In image watermark removal, popular methods depend on given reference non-watermarked images to remove watermarks in a supervised way. However, reference non-watermarked images are difficult to obtain in the real world, and captured images often suffer from noise introduced by digital devices. To resolve these issues, we present a self-supervised network for image denoising and watermark removal (SSNet). SSNet uses a parallel network trained in a self-supervised manner to remove noise and watermarks. Specifically, each sub-network contains two sub-blocks. The upper sub-network uses its first sub-block to remove noise, following the noise-to-noise principle, and its second sub-block to remove watermarks according to the distributions of watermarks. To prevent the loss of important information, the lower sub-network simultaneously learns noise and watermarks in a self-supervised manner. Moreover, the two sub-networks interact via attention to extract more complementary salient information. The proposed method does not depend on paired images to learn a blind denoising and watermark removal model, which is very meaningful for real applications. It is also more effective than popular image watermark removal methods on public datasets. Code can be found at https://github.com/hellloxiaotian/SSNet.
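
The noise-removal sub-block is described as following the noise-to-noise idea; the sketch below shows a minimal noise-to-noise training step in PyTorch, where a denoiser maps one noisy observation to an independently corrupted observation of the same image, so no clean target is needed. The tiny network and data are placeholders, not SSNet.

```python
import torch
import torch.nn as nn

denoiser = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
optimizer = torch.optim.Adam(denoiser.parameters(), lr=1e-4)

clean = torch.rand(4, 3, 64, 64)                       # stand-in batch; never seen by the loss
noisy_in = clean + 0.1 * torch.randn_like(clean)
noisy_target = clean + 0.1 * torch.randn_like(clean)   # independent noise realization

loss = nn.functional.mse_loss(denoiser(noisy_in), noisy_target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```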

3.
IEEE Trans Image Process ; 33: 738-752, 2024.
Article in English | MEDLINE | ID: mdl-38194374

ABSTRACT

Transformer-based methods have demonstrated promising performance in image super-resolution tasks, owing to their long-range and global aggregation capability. However, existing Transformers bring two critical challenges when applied to large-area earth observation scenes: (1) redundant token representation, because most tokens are irrelevant; and (2) single-scale representation, which ignores scale correlation modeling of similar ground observation targets. To this end, this paper proposes to adaptively eliminate the interference of irrelevant tokens for a more compact self-attention calculation. Specifically, we devise a Residual Token Selective Group (RTSG) that grasps the most crucial tokens by dynamically selecting the top-k keys, ranked by score, for each query. For better feature aggregation, a Multi-scale Feed-forward Layer (MFL) is developed to generate an enriched representation of multi-scale feature mixtures during the feed-forward process. Moreover, we also propose a Global Context Attention (GCA) to fully explore the most informative components, thus introducing more inductive bias into the RTSG for accurate reconstruction. In particular, multiple cascaded RTSGs form our final Top-k Token Selective Transformer (TTST) to achieve progressive representation. Extensive experiments on simulated and real-world remote sensing datasets demonstrate that our TTST performs favorably against state-of-the-art CNN-based and Transformer-based methods, both qualitatively and quantitatively. In brief, TTST outperforms the state-of-the-art approach (HAT-L) in PSNR by 0.14 dB on average, while accounting for only 47.26% and 46.97% of its computational cost and parameters, respectively. The code and pre-trained TTST will be available at https://github.com/XY-boy/TTST for validation.
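
The core of the RTSG is top-k key selection per query; the following sketch shows one plausible way to implement top-k sparse self-attention in PyTorch. It illustrates the idea only; consult the linked repository for the actual implementation.

```python
import torch

def topk_attention(q, k, v, top_k=16):
    # q, k, v: (batch, tokens, dim)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # (B, N, N)
    idx = scores.topk(top_k, dim=-1).indices                # keep top-k keys per query
    mask = torch.full_like(scores, float("-inf"))
    mask.scatter_(-1, idx, 0.0)                             # 0 for kept keys, -inf otherwise
    attn = torch.softmax(scores + mask, dim=-1)             # irrelevant tokens get zero weight
    return attn @ v

out = topk_attention(torch.randn(2, 64, 32), torch.randn(2, 64, 32), torch.randn(2, 64, 32))
print(out.shape)  # torch.Size([2, 64, 32])
```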

4.
IEEE Trans Image Process ; 33: 191-204, 2024.
Article in English | MEDLINE | ID: mdl-38060367

ABSTRACT

Convolutional neural networks (CNNs) and self-attention (SA) have demonstrated remarkable success in low-level vision tasks, such as image super-resolution, deraining, and dehazing. The former excels at acquiring local connections with translation equivariance, while the latter is better at capturing long-range dependencies. However, both have individual limitations: CNNs suffer from a limited receptive field and weak diversity of representations, while SA suffers from low efficiency and weak local relation learning. To this end, we propose a multi-scale fusion and decomposition network (MFDNet) for rain perturbation removal, which unifies the merits of these two architectures while maintaining both effectiveness and efficiency. To achieve the decomposition and association of rain and rain-free features, we introduce an asymmetrical scheme designed as a dual-path mutual representation network that enables iterative refinement. Additionally, we incorporate high-efficiency convolutions throughout the network and use resolution rescaling to balance computational complexity with performance. Comprehensive evaluations show that the proposed approach outperforms most of the latest SOTA deraining methods and is versatile and robust in various image restoration tasks, including underwater image enhancement, image dehazing, and low-light image enhancement. The source code and pretrained models are available at https://github.com/qwangg/MFDNet.
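
The dual-path decomposition can be pictured as one path estimating the rain component and the other the rain-free background under a reconstruction constraint; the sketch below is a simplified, assumed structure rather than MFDNet itself.

```python
import torch
import torch.nn as nn

class DualPathBlock(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.rain_path = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(), nn.Conv2d(ch, 3, 3, padding=1))
        self.bg_path = nn.Sequential(nn.Conv2d(6, ch, 3, padding=1), nn.ReLU(), nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, rainy):
        rain = self.rain_path(rainy)                                          # estimated rain component
        background = self.bg_path(torch.cat([rainy, rainy - rain], dim=1))    # refined rain-free estimate
        return background, rain

rainy = torch.rand(1, 3, 64, 64)
background, rain = DualPathBlock()(rainy)
recon_loss = nn.functional.l1_loss(background + rain, rainy)  # enforce rainy ≈ background + rain
```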

5.
IEEE J Biomed Health Inform ; 27(10): 4902-4913, 2023 10.
Article in English | MEDLINE | ID: mdl-37490372

ABSTRACT

Due to the high labor cost of physicians, it is difficult to collect a large amount of manually labeled medical images for developing learning-based computer-aided diagnosis (CADx) systems or segmentation algorithms. To tackle this issue, we reshape the image segmentation task as an image-to-image (I2I) translation problem and propose a retinal vascular segmentation network that achieves good cross-domain generalizability even with a small amount of training data. We devise two main components to facilitate this I2I-based segmentation method: the constraints provided by the proposed gradient-vector-flow (GVF) loss, and a two-stage Unet (2Unet) generator with a skip connection. This configuration makes the first stage of 2Unet play a role similar to that of a conventional Unet, while forcing the second stage to act as a refinement module. Extensive experiments show that, by recasting retinal vessel segmentation as an image-to-image translation problem, our I2I translator-based segmentation subnetwork achieves better cross-domain generalizability than existing segmentation methods. Our model, trained on one dataset, e.g., DRIVE, produces stable segmentation results on datasets of other domains, e.g., CHASE-DB1, STARE, HRF, and DIARETDB1, even in low-shot circumstances.


Subject(s)
Algorithms; Retina; Humans; Retina/diagnostic imaging; Retinal Vessels/diagnostic imaging; Fundus Oculi; Diagnosis, Computer-Assisted; Image Processing, Computer-Assisted/methods
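
A minimal cascade in the spirit of the described 2Unet generator is sketched below: the first stage produces a coarse vessel map and the second stage, connected by a skip, learns a residual refinement. The tiny stand-in networks (and the omission of the GVF loss) are simplifications, not the authors' model.

```python
import torch
import torch.nn as nn

def tiny_net(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, out_ch, 3, padding=1))

class TwoStage(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = tiny_net(3, 1)   # coarse vessel probability map
        self.stage2 = tiny_net(4, 1)   # refinement, sees image + coarse map

    def forward(self, image):
        coarse_logits = self.stage1(image)
        coarse = torch.sigmoid(coarse_logits)
        # skip connection: stage 2 predicts a residual correction to the coarse map
        refined_logits = coarse_logits + self.stage2(torch.cat([image, coarse], dim=1))
        return torch.sigmoid(refined_logits)

print(TwoStage()(torch.rand(1, 3, 64, 64)).shape)  # torch.Size([1, 1, 64, 64])
```
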
6.
Int J Comput Vis ; : 1-20, 2023 Jun 07.
Article in English | MEDLINE | ID: mdl-37363294

ABSTRACT

Terahertz (THz) tomographic imaging has recently attracted significant attention thanks to its non-invasive, non-destructive, non-ionizing, material-classifying, and ultra-fast nature for object exploration and inspection. However, its strong water absorption and low noise tolerance lead to undesired blurs and distortions of reconstructed THz images. The diffraction-limited THz signals highly constrain the performance of existing restoration methods. To address the problem, we propose a novel multi-view Subspace-Attention-guided Restoration Network (SARNet) that fuses multi-view and multi-spectral features of THz images for effective image restoration and 3D tomographic reconstruction. To this end, SARNet uses multi-scale branches to extract intra-view spatio-spectral amplitude and phase features and fuses them via shared subspace projection and self-attention guidance. We then perform inter-view fusion to further improve the restoration of individual views by leveraging the redundancies between neighboring views. Here, we experimentally construct a THz time-domain spectroscopy (THz-TDS) system covering a broad frequency range from 0.1 to 4 THz to build up a temporal/spectral/spatial/material THz database of hidden 3D objects. Complementary to a quantitative evaluation, we demonstrate the effectiveness of our SARNet model on 3D THz tomographic reconstruction applications. Supplementary Information: The online version contains supplementary material available at 10.1007/s11263-023-01812-y.
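
The shared-subspace projection used for fusing amplitude and phase features can be reduced to the following toy PyTorch module, in which both feature sets pass through one shared projection before fusion; the dimensions and layer choices are assumptions, not SARNet code.

```python
import torch
import torch.nn as nn

class SharedSubspaceFusion(nn.Module):
    def __init__(self, dim=64, subspace_dim=16):
        super().__init__()
        self.proj = nn.Linear(dim, subspace_dim)   # shared by both feature domains
        self.back = nn.Linear(subspace_dim, dim)

    def forward(self, amp_feat, phase_feat):
        shared = self.proj(amp_feat) + self.proj(phase_feat)  # both projected into one subspace
        return self.back(shared)

fused = SharedSubspaceFusion()(torch.randn(8, 64), torch.randn(8, 64))
print(fused.shape)  # torch.Size([8, 64])
```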

7.
IEEE Trans Image Process ; 32: 2719-2733, 2023.
Article in English | MEDLINE | ID: mdl-37163393

ABSTRACT

Multi-view action recognition aims to identify action categories from given clues. Existing studies ignore the negative influence of fuzzy views when disentangling view and action, which commonly leads to mistaken recognition results. To this end, we regard the observed image as the composition of a view component and an action component, and exploit the advantages of multiple views via adaptive cooperative representation between these two components, forming a Dual-Recommendation Disentanglement Network (DRDN) for multi-view action recognition. Specifically, 1) for the action, we leverage a multi-level Specific Information Recommendation (SIR) to enhance the interaction between intricate activities and views; SIR offers a more comprehensive representation of activities, balancing the trade-off between global and local information. 2) For the view, we utilize a Pyramid Dynamic Recommendation (PDR) to learn a complete and detailed global representation by transferring features across different views; it is explicitly restricted to resist the influence of fuzzy noise and to focus on positive knowledge from other views. Our DRDN aims for complete action and view representations, where PDR directly guides the action to disentangle from view features and SIR accounts for the mutual exclusivity of view and action clues. Extensive experiments indicate that the proposed DRDN achieves state-of-the-art performance over strong competitors on several standard benchmarks. The code will be available at https://github.com/51cloud/DRDN.

8.
Mol Psychiatry ; 28(5): 1932-1945, 2023 05.
Article in English | MEDLINE | ID: mdl-36882500

ABSTRACT

The BTBR T+Itpr3tf/J (BTBR/J) strain is one of the most valid models of idiopathic autism, serving as a potent forward-genetics tool for dissecting the complexity of autism. We found that a sister strain with an intact corpus callosum, BTBR TF/ArtRbrc (BTBR/R), shows more prominent autism core symptoms but moderate ultrasonic communication and normal hippocampus-dependent memory, which may mimic autism in the high-functioning spectrum. Intriguingly, a disturbed epigenetic silencing mechanism leads to hyperactive endogenous retroviruses (ERVs), mobile genetic elements derived from ancient retroviral infection, which increase de novo copy number variation (CNV) formation in the two BTBR strains. This feature makes the BTBR strain a still-evolving, multiple-loci model of elevated ASD susceptibility. Furthermore, active ERVs, analogous to a viral infection, evade the integrated stress response (ISR) of host defense and hijack the transcriptional machinery during embryonic development in the BTBR strains. These results suggest dual roles of ERVs in the pathogenesis of ASD: driving host genome evolution on a long-term scale and managing cellular pathways in response to viral infection, which has immediate effects on embryonic development. The wild-type Draxin expression in BTBR/R also makes this substrain a more precise model for investigating the core etiology of autism without the interference of impaired forebrain bundles as in BTBR/J.


Subject(s)
Autism Spectrum Disorder; Autistic Disorder; Endogenous Retroviruses; Pregnancy; Female; Humans; Animals; Mice; Endogenous Retroviruses/genetics; DNA Copy Number Variations; Autistic Disorder/etiology; Prosencephalon/metabolism; Corpus Callosum/pathology; Disease Models, Animal; Mice, Inbred C57BL; Autism Spectrum Disorder/genetics; Autism Spectrum Disorder/complications; Mice, Inbred Strains
9.
PLoS One ; 18(3): e0283040, 2023.
Article in English | MEDLINE | ID: mdl-36928100

ABSTRACT

This study investigates age-specific prostate-specific antigen (PSA) distributions in Taiwanese men and recommends reference ranges for this population after comparison with other studies. From January 1999 to December 2016, a total of 213,986 Taiwanese men aged over 19 years without a history of prostate cancer, urinary tract infection, or prostate infection were recruited from the Taiwan MJ cohort, an ongoing prospective cohort of health examinations conducted by the MJ Health Screening Center in Taiwan. Participants were divided into seven age groups. Simple descriptive statistical analyses were carried out, and quartiles and 95th percentiles were calculated for each group as reference ranges for serum PSA in screening for prostate cancer in Taiwanese men. Serum PSA concentration correlated with age (r = 0.274, p < 0.001). The median serum PSA concentration (5th to 95th percentile) ranged from 0.7 ng/ml (0.3 to 1.8) for men 20-29 years old (n = 6,382) to 1.6 ng/ml (0.4 to 8.4) for men over 79 years old (n = 504). The age-specific PSA reference ranges are as follows: 20-29 years, 1.80 ng/ml; 30-39 years, 1.80 ng/ml; 40-49 years, 2.0 ng/ml; 50-59 years, 3.20 ng/ml; 60-69 years, 5.60 ng/ml; 70-79 years, 7.40 ng/ml; over 80 years, 8.40 ng/ml. Almost no change occurred in the median serum PSA value in men 50 years old or younger, while a gradual increase was observed in men over 50. Taiwanese men aged 60 years and above showed higher 95th-percentile serum PSA values than Caucasian men and men in other Asian countries, but values closer to those of Asian American and African American men. The results indicate that PSA levels differ significantly across ethnicities, suggesting that Oesterling's age-specific PSA reference ranges might not be appropriate for Taiwanese men. Our results should be studied further to validate the age-specific PSA reference ranges for Taiwanese men presented here.


Subject(s)
Prostate-Specific Antigen; Prostatic Neoplasms; Adult; Aged; Humans; Male; Middle Aged; Young Adult; Age Distribution; Age Factors; Black or African American; East Asian People; Prospective Studies; Prostate-Specific Antigen/blood; Prostate-Specific Antigen/chemistry; Prostatic Neoplasms/epidemiology; Reference Values; White People
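
The reference ranges reported above are simple order statistics per age band; the snippet below shows how such quartiles and 95th percentiles could be computed with NumPy. The data here are synthetic placeholders, not the Taiwan MJ cohort.

```python
import numpy as np

ages = np.random.randint(20, 90, size=10_000)
psa = np.random.lognormal(mean=0.0, sigma=0.6, size=10_000)  # fake PSA values (ng/ml)

bands = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 120)]
for lo, hi in bands:
    values = psa[(ages >= lo) & (ages <= hi)]
    q25, median, q75, p95 = np.percentile(values, [25, 50, 75, 95])
    print(f"{lo}-{hi}: median {median:.2f}, IQR {q25:.2f}-{q75:.2f}, 95th pct {p95:.2f} ng/ml")
```
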
10.
IEEE Trans Neural Netw Learn Syst ; 34(1): 134-143, 2023 Jan.
Article in English | MEDLINE | ID: mdl-34197327

ABSTRACT

Referring expression comprehension (REC) is an emerging research topic in computer vision that refers to detecting a target region in an image given a text description. Most existing REC methods follow a multistage pipeline, which is computationally expensive and greatly limits the applications of REC. In this article, we propose a one-stage model toward real-time REC, termed the real-time global inference network (RealGIN). RealGIN addresses the issues of expression diversity and expression complexity in REC with two innovative designs: adaptive feature selection (AFS) and Global Attentive ReAsoNing (GARAN). Expression diversity concerns varying expression content, which includes information such as colors, attributes, locations, and fine-grained categories. To address this issue, AFS adaptively fuses features at different semantic levels to handle changes in expression content. In contrast, expression complexity concerns the complex relational conditions in expressions that are used to identify the referent. To this end, GARAN uses the textual feature as a pivot to collect expression-aware visual information from all regions and then diffuses this information back to each region, which provides sufficient context for modeling the relational conditions in expressions. On five benchmark datasets, i.e., RefCOCO, RefCOCO+, RefCOCOg, ReferIT, and Flickr30k, the proposed RealGIN outperforms most existing methods and achieves very competitive performance against the most advanced one, i.e., MAttNet. More importantly, under the same hardware, RealGIN boosts processing speed by 10-20 times over existing methods.
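
Adaptive feature selection (AFS) amounts to fusing features from several semantic levels with expression-dependent weights; the sketch below shows one simple realization with a softmax gate. Shapes and the gating network are illustrative assumptions, not the RealGIN implementation.

```python
import torch
import torch.nn as nn

class AFS(nn.Module):
    def __init__(self, text_dim=256, num_levels=3):
        super().__init__()
        self.gate = nn.Linear(text_dim, num_levels)

    def forward(self, level_feats, text_feat):
        # level_feats: list of (B, C, H, W) maps, one per semantic level (same shape)
        weights = torch.softmax(self.gate(text_feat), dim=-1)        # (B, num_levels)
        stacked = torch.stack(level_feats, dim=1)                    # (B, L, C, H, W)
        return (weights[:, :, None, None, None] * stacked).sum(dim=1)

feats = [torch.randn(2, 64, 16, 16) for _ in range(3)]
print(AFS()(feats, torch.randn(2, 256)).shape)  # torch.Size([2, 64, 16, 16])
```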

11.
IEEE Trans Neural Netw Learn Syst ; 34(11): 9139-9148, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35294359

ABSTRACT

This article focuses on filter-level network pruning. A novel pruning method, termed CLR-RNF, is proposed. We first reveal a "long-tail" pruning problem in magnitude-based weight pruning methods and then propose a computation-aware measurement of individual weight importance, followed by a cross-layer ranking (CLR) of weights to identify and remove the bottom-ranked ones. The resulting per-layer sparsity determines the pruned network structure in our filter pruning. We then introduce a recommendation-based filter selection scheme in which each filter recommends a group of its closest filters. To pick the preserved filters from these recommended groups, we further devise a k-reciprocal nearest filter (RNF) selection scheme in which the selected filters fall into the intersection of these recommended groups. Both the pruned network structure and the filter selection are non-learning processes, which significantly reduces pruning complexity and differentiates our method from existing works. We conduct image classification on CIFAR-10 and ImageNet to demonstrate the superiority of CLR-RNF over the state of the art. For example, on CIFAR-10, CLR-RNF removes 74.1% of FLOPs and 95.0% of parameters from VGGNet-16 with a 0.3% accuracy improvement. On ImageNet, it removes 70.2% of FLOPs and 64.8% of parameters from ResNet-50 with only a 1.7% top-five accuracy drop. Our project is available at https://github.com/lmbxmu/CLR-RNF.
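
The k-reciprocal nearest filter (RNF) rule keeps filters that appear in each other's k-nearest-neighbour lists; the toy function below illustrates that selection rule on flattened filter weights. It is not the full CLR-RNF pruning pipeline.

```python
import torch

def k_reciprocal_pairs(filters: torch.Tensor, k: int = 3):
    # filters: (num_filters, fan_in) flattened filter weights
    dist = torch.cdist(filters, filters)              # pairwise Euclidean distances
    dist.fill_diagonal_(float("inf"))                 # ignore self-matches
    knn = dist.topk(k, largest=False).indices         # (num_filters, k)
    pairs = []
    for i in range(filters.shape[0]):
        for j in knn[i].tolist():
            if i in knn[j].tolist() and i < j:        # reciprocal membership
                pairs.append((i, j))
    return pairs

weights = torch.randn(16, 3 * 3 * 3)                  # e.g. 16 filters of a 3x3 conv over 3 channels
print(k_reciprocal_pairs(weights, k=3))
```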

12.
IEEE Trans Neural Netw Learn Syst ; 34(10): 7946-7955, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35157600

ABSTRACT

Channel pruning has long been studied as a way to compress convolutional neural networks (CNNs), since it significantly reduces overall computation. Prior works implement channel pruning in an unexplainable manner, tending to reduce the final classification error while failing to consider the internal influence of each channel. In this article, we conduct channel pruning in a white box. Through deep visualization of feature maps activated by different channels, we observe that different channels contribute differently to different categories in image classification. Inspired by this, we choose to preserve channels that contribute to most categories. Specifically, to model the contribution of each channel to differentiating categories, we develop a class-wise mask for each channel, implemented in a dynamic training manner with respect to the input image's category. On the basis of the learned class-wise masks, we perform a global voting mechanism to remove channels with less category discrimination. Lastly, a fine-tuning process is conducted to recover the performance of the pruned model. To the best of our knowledge, this is the first time that CNN interpretability theory has been used to guide channel pruning. Extensive experiments on representative image classification tasks demonstrate the superiority of our White-Box over many state-of-the-art methods (SOTAs). For instance, on CIFAR-10, it reduces floating-point operations (FLOPs) by 65.23% with a 0.62% accuracy improvement for ResNet-110. On ILSVRC-2012, White-Box achieves a 45.6% FLOP reduction with only a small loss of 0.83% in top-1 accuracy for ResNet-50. Code is available at https://github.com/zyxxmu/White-Box.
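
The global voting step can be pictured as each category voting for the channels it relies on; the toy snippet below ranks channels by vote count. The saliency matrix is random here, whereas in the paper it comes from the learned class-wise masks.

```python
import torch

num_channels, num_classes = 32, 10
saliency = torch.rand(num_classes, num_channels)        # class-wise channel importance (stand-in)
votes = (saliency > 0.5).sum(dim=0)                     # how many classes need each channel
keep_ratio = 0.75
keep = votes.argsort(descending=True)[: int(keep_ratio * num_channels)]
print(f"keeping {len(keep)} of {num_channels} channels:", sorted(keep.tolist()))
```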

13.
IEEE Trans Pattern Anal Mach Intell ; 45(5): 6277-6288, 2023 May.
Article in English | MEDLINE | ID: mdl-36215372

ABSTRACT

Binary neural networks (BNNs) have attracted broad research interest due to their efficient storage and computational ability. Nevertheless, a significant challenge of BNNs lies in handling discrete constraints while ensuring bit entropy maximization, which typically makes their weight optimization very difficult. Existing methods relax the learning using the sign function, which simply encodes positive weights into +1s and negative weights into -1s. Alternatively, we formulate an angle-alignment objective that constrains the weight binarization to {0, +1} to solve this challenge. In this article, we show that our weight binarization admits an analytical solution that encodes high-magnitude weights into +1s and the rest into 0s. Therefore, a high-quality discrete solution is established in a computationally efficient manner without the sign function. We prove that the learned weights of binarized networks roughly follow a Laplacian distribution, which does not allow entropy maximization, and further demonstrate that this can be effectively remedied by simply removing the L2 regularization during network training. Our method, dubbed sign-to-magnitude network binarization (SiMaN), is evaluated on CIFAR-10 and ImageNet, demonstrating its superiority over sign-based state-of-the-art methods. Our source code, experimental settings, training logs, and binary models are available at https://github.com/lmbxmu/SiMaN.
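
The binarization rule itself is simple to sketch: weights whose magnitude falls in the top half map to +1 and the rest to 0 (the 50% split is an illustrative choice tied to maximizing bit entropy). The snippet below is an illustration, not the SiMaN training code.

```python
import torch

def siman_binarize(w: torch.Tensor) -> torch.Tensor:
    threshold = w.abs().flatten().median()      # top-half magnitudes -> +1
    return (w.abs() >= threshold).float()       # {0, +1} encoding, no sign function

w = torch.randn(4, 4)
print(siman_binarize(w))
```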

14.
Front Cardiovasc Med ; 9: 930443, 2022.
Article in English | MEDLINE | ID: mdl-36545016

ABSTRACT

Background: Pulse pressure (PP) may play a role in the development of cardiovascular disease, and the optimal PP for different ages and sexes is unknown. In a prospective cohort, we studied subjects with favorable cardiovascular health (CVH), proposed the mean PP as the optimal PP value, and demonstrated its relationship with healthy lifestyles. Methods and results: Between 1996 and 2016, a total of 162,636 participants (aged 20 years or above; mean age 34.9 years; 26.4% male; meeting criteria for favorable health) were recruited for a medical examination program. PP in male subjects was 45.6 ± 9.4 mmHg and increased after the age of 50 years. PP in female subjects was 41.8 ± 9.5 mmHg and increased after the age of 40 years, exceeding that of male subjects after the age of 50 years. Except in female subjects with a PP of 40-70 mmHg, a PP increase correlates with both a systolic blood pressure (BP) increase and a diastolic BP decrease. Individuals with mean PP values are more likely to meet health metrics, including body mass index (BMI) < 25 kg/m2 (chi-squared = 9.35, p < 0.01 in male subjects; chi-squared = 208.79, p < 0.001 in female subjects) and BP < 120/80 mmHg (chi-squared = 1,300, p < 0.001 in male subjects; chi-squared = 11,000, p < 0.001 in female subjects). We propose a health score (Hscore) based on the sum of five metrics (BP, BMI, being physically active, non-smoking, and healthy diet), which significantly correlates with the optimal PP. Conclusion: The mean PP (within ±1 standard deviation) can be proposed as the optimal PP in the adult population with favorable CVH. The relationship between health metrics and the optimal PP based on age and sex was further demonstrated to validate the Hscore.
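
As a worked example of the two quantities above, the snippet below computes the optimal-PP band (mean ± 1 SD, using the male-subject figures from the abstract) and a five-item health score for an invented person; the exact scoring used in the paper may differ.

```python
mean_pp, sd_pp = 45.6, 9.4                       # mmHg, male subjects in this cohort
optimal_band = (mean_pp - sd_pp, mean_pp + sd_pp)
print(f"optimal PP band: {optimal_band[0]:.1f}-{optimal_band[1]:.1f} mmHg")

# Hscore: one point per metric met (BP < 120/80, BMI < 25, physically active,
# non-smoking, healthy diet) -- an assumed scoring, with an invented example person.
person = {"bp_ok": True, "bmi_ok": True, "active": False, "non_smoker": True, "healthy_diet": True}
hscore = sum(person.values())
print("Hscore:", hscore)   # 4 out of 5
```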

15.
Article in English | MEDLINE | ID: mdl-36227812

ABSTRACT

Convolutional neural networks (CNNs) have achieved remarkable performance via deep architectures. However, these CNNs often show poor robustness for image super-resolution (SR) under complex scenes. In this article, we present a heterogeneous group SR CNN (HGSRCNN) that leverages structural information of different types to obtain a high-quality image. Specifically, each heterogeneous group block (HGB) of HGSRCNN uses a heterogeneous architecture containing a symmetric group convolutional block and a complementary convolutional block, arranged in parallel, to enhance the internal and external relations of different channels and capture richer low-frequency structural information of different types. To suppress redundant features, a refinement block (RB) with serial signal enhancements is designed to filter out useless information. To prevent the loss of original information, a multilevel enhancement mechanism guides the CNN toward a symmetric architecture, promoting the expressive ability of HGSRCNN. Besides, a parallel up-sampling mechanism is developed to train a blind SR model. Extensive experiments illustrate that the proposed HGSRCNN achieves excellent SR performance in terms of both quantitative and qualitative analysis. Code can be accessed at https://github.com/hellloxiaotian/HGSRCNN.
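
A heterogeneous parallel block of the kind described, one grouped-convolution branch plus one plain-convolution branch whose outputs are merged, can be sketched as follows; channel counts and the merge rule are arbitrary assumptions, not the HGSRCNN layer definitions.

```python
import torch
import torch.nn as nn

class HeterogeneousBlock(nn.Module):
    def __init__(self, ch=64, groups=4):
        super().__init__()
        self.group_branch = nn.Conv2d(ch, ch, 3, padding=1, groups=groups)  # cheap, per-group relations
        self.full_branch = nn.Conv2d(ch, ch, 3, padding=1)                  # complementary, cross-channel
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.group_branch(x) + self.full_branch(x)) + x     # merge branches + residual

print(HeterogeneousBlock()(torch.randn(1, 64, 24, 24)).shape)  # torch.Size([1, 64, 24, 24])
```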

16.
IEEE Trans Image Process ; 31: 6789-6799, 2022.
Article in English | MEDLINE | ID: mdl-36288229

ABSTRACT

Image motion blur results from a combination of object motion and camera shake, and such blurring is generally directional and non-uniform. Previous research attempted to solve non-uniform blur using self-recurrent multi-scale, multi-patch, or multi-temporal architectures with self-attention, obtaining decent results. However, self-recurrent frameworks typically lead to longer inference times, while inter-pixel or inter-channel self-attention may cause excessive memory usage. This paper proposes a Blur-aware Attention Network (BANet) that accomplishes accurate and efficient deblurring in a single forward pass. BANet utilizes region-based self-attention with multi-kernel strip pooling to disentangle blur patterns of different magnitudes and orientations, and cascaded parallel dilated convolutions to aggregate multi-scale content features. Extensive experimental results on the GoPro and RealBlur benchmarks demonstrate that the proposed BANet performs favorably against state-of-the-art methods in blurred image restoration and can provide deblurred results in real time.
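
Multi-kernel strip pooling averages features over horizontal and vertical strips of several sizes so that directional blur cues are captured; the sketch below illustrates the pooling pattern with assumed kernel choices, not the BANet implementation.

```python
import torch
import torch.nn.functional as F

def multi_kernel_strip_pool(x, strip_sizes=(1, 3)):
    # x: (B, C, H, W); returns a list of pooled-and-upsampled feature maps
    outs = []
    h, w = x.shape[-2:]
    for s in strip_sizes:
        horiz = F.adaptive_avg_pool2d(x, (s, w))   # collapse height into s strips
        vert = F.adaptive_avg_pool2d(x, (h, s))    # collapse width into s strips
        outs.append(F.interpolate(horiz, size=(h, w), mode="nearest"))
        outs.append(F.interpolate(vert, size=(h, w), mode="nearest"))
    return outs

pooled = multi_kernel_strip_pool(torch.randn(1, 8, 32, 32))
print([p.shape for p in pooled])
```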

17.
Neural Netw ; 153: 373-385, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35779445

ABSTRACT

CNNs with strong learning abilities are widely chosen to address the super-resolution problem. However, CNNs rely on deeper network architectures to improve image super-resolution performance, which generally increases computational cost. In this paper, we present an enhanced super-resolution group CNN (ESRGCNN) with a shallow architecture that fully fuses deep and wide channel features to extract more accurate low-frequency information, exploiting the correlations of different channels in single-image super-resolution (SISR). A signal enhancement operation in ESRGCNN also helps inherit more long-distance contextual information for resolving long-term dependencies. An adaptive up-sampling operation is integrated into the CNN to obtain an image super-resolution model that handles low-resolution images of different sizes. Extensive experiments show that ESRGCNN surpasses the state of the art in terms of SISR performance, complexity, execution speed, image quality evaluation, and visual effect. Code is available at https://github.com/hellloxiaotian/ESRGCNN.


Subject(s)
Image Processing, Computer-Assisted; Neural Networks, Computer; Image Processing, Computer-Assisted/methods
18.
IEEE Trans Image Process ; 31: 3525-3540, 2022.
Article in English | MEDLINE | ID: mdl-35533162

ABSTRACT

Understanding foggy image sequences of driving scenes is critical for autonomous driving, but it remains a challenging task due to the difficulty of collecting and annotating real-world images of adverse weather. Recently, the self-training strategy has been considered a powerful solution for unsupervised domain adaptation, iteratively adapting the model from the source domain to the target domain by generating target pseudo labels and re-training the model. However, the selection of confident pseudo labels inevitably suffers from a conflict between sparsity and accuracy, both of which lead to suboptimal models. To tackle this problem, we exploit the characteristics of foggy image sequences of driving scenes to densify the confident pseudo labels. Specifically, based on two observations about the sequential image data, local spatial similarity and adjacent temporal correspondence, we propose a novel Target-Domain driven pseudo label Diffusion (TDo-Dif) scheme. It employs superpixels and optical flow to identify spatial similarity and temporal correspondence, respectively, and then diffuses the confident but sparse pseudo labels within a superpixel or a temporally corresponding pair linked by the flow. Moreover, to ensure the feature similarity of the diffused pixels, we introduce a local spatial similarity loss and a temporal contrastive loss in the model re-training stage. Experimental results show that our TDo-Dif scheme helps the adapted model achieve 51.92% and 53.84% mean intersection-over-union (mIoU) on two publicly available natural foggy datasets (Foggy Zurich and Foggy Driving), exceeding state-of-the-art unsupervised domain-adaptive semantic segmentation methods. The proposed method can also be applied to non-sequential images in the target domain by considering only spatial similarity.


Subject(s)
Image Processing, Computer-Assisted; Semantics; Image Processing, Computer-Assisted/methods; Weather
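
The spatial half of the diffusion can be illustrated as a majority vote inside each superpixel, spreading confident labels to unlabeled pixels; the toy NumPy example below uses tiny synthetic arrays and omits the optical-flow (temporal) part of the scheme.

```python
import numpy as np

superpixels = np.array([[0, 0, 1, 1],
                        [0, 0, 1, 1],
                        [2, 2, 3, 3],
                        [2, 2, 3, 3]])
pseudo = np.array([[5, -1, -1, -1],
                   [5, -1, 7, -1],
                   [-1, -1, -1, -1],
                   [6, -1, -1, -1]])          # -1 means "not confident"

diffused = pseudo.copy()
for sp in np.unique(superpixels):
    region = superpixels == sp
    labels = pseudo[region]
    labels = labels[labels >= 0]
    if labels.size:                            # only diffuse if the superpixel has a confident label
        majority = np.bincount(labels).argmax()
        diffused[region & (diffused == -1)] = majority
print(diffused)
```
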
19.
Mol Psychiatry ; 27(8): 3343-3354, 2022 08.
Article in English | MEDLINE | ID: mdl-35491410

ABSTRACT

Immune dysregulation plays a key role in the pathogenesis of autism. Changes occurring at the systemic level, from brain inflammation to disturbed innate/adaptive immunity in the periphery, are frequently observed in patients with autism; however, the intrinsic mechanisms behind them remain elusive. We hypothesize that a common etiology may lie in progenitors of different types underlying widespread immune dysregulation. Using single-cell RNA sequencing (scRNA-seq), we trace the developmental origins of immune dysregulation in a mouse model of idiopathic autism. We find that, in both aorta-gonad-mesonephros (AGM) and yolk sac (YS) progenitors, dysregulation of HDAC1-mediated epigenetic machinery alters definitive hematopoiesis during embryogenesis and downregulates the expression of the AP-1 complex needed for microglia development. Subsequently, these changes result in dysregulation of the immune system, leading to gut dysbiosis and hyperactive microglia in the brain. We further confirm that dysregulated immune profiles are associated with a specific microbiota composition, which may serve as a biomarker to identify autism of immune-dysregulated subtypes. Our findings elucidate a shared mechanism for the origin of immune dysregulation from the brain to the gut in autism and provide new insight into dissecting the heterogeneity of autism, as well as the therapeutic potential of targeting immune-dysregulated autism subtypes.


Subject(s)
Autistic Disorder; Mice; Animals; Autistic Disorder/genetics; Mesonephros; Yolk Sac/physiology; Gonads; Epigenesis, Genetic/genetics; Disease Models, Animal
20.
IEEE Trans Image Process ; 31: 2352-2364, 2022.
Article in English | MEDLINE | ID: mdl-35235507

ABSTRACT

Visible-infrared person re-identification (VI-ReID) is a cross-modality retrieval problem that aims to match the same pedestrian between visible and infrared cameras. Due to pose variation, occlusion, and the huge visual differences between the two modalities, previous studies mainly focus on learning image-level shared features. Since they usually learn a global representation or extract uniformly divided part features, these methods are sensitive to misalignments. In this paper, we propose a structure-aware positional transformer (SPOT) network to learn semantic-aware, sharable modality features by utilizing structural and positional information. It consists of two main components: attended structure representation (ASR) and transformer-based part interaction (TPI). Specifically, ASR models the modality-invariant structure feature for each modality and dynamically selects discriminative appearance regions under the guidance of the structure information. TPI mines part-level appearance and position relations with a transformer to learn discriminative part-level modality features. With a weighted combination of ASR and TPI, the proposed SPOT exploits rich contextual and structural information, effectively reducing cross-modality differences and enhancing robustness against misalignments. Extensive experiments indicate that SPOT is superior to state-of-the-art methods on two cross-modal datasets. Notably, the Rank-1/mAP values on the SYSU-MM01 dataset improve by 8.43%/6.80%.


Subject(s)
Pedestrians; Semantics; Humans