Results 1 - 20 of 117
1.
Sci Total Environ ; : 174668, 2024 Jul 10.
Article in English | MEDLINE | ID: mdl-38997039

ABSTRACT

Understanding the historical variations in organic matter (OM) input to lake sediments and the possible mechanisms regulating this phenomenon is important for studying carbon cycling and burial in lake systems; however, this topic remains poorly addressed for macrophyte-dominated lakes. To bridge these gaps, we analyzed bulk OM and molecular geochemical proxies in a dated sediment core from Lake Liangzi, a typical submerged macrophyte-dominated lake in East China, to infer changes in OM input to sediments over the past 169 years due to the intensification of human activities in the catchment. A relatively primitive OM input pattern was observed in ca. 1841-1965, during which the lowest hydrogen index (HI), short-chain n-alkane abundance, and n-C17/n-C16 alkane ratio indicated minimal input from phytoplankton, whereas the high Paq (proxy of aquatic macrophyte input) and long-chain n-alkane abundance suggested dominant and subordinate inputs from submerged and emergent macrophytes, respectively. OM input transitioned during ca. 1965-1993, with the highest Paq and lowest long-chain n-alkane abundance, indicating an increase of submerged macrophyte input and concurrent decline of emergent macrophyte input, probably caused by hydrological regulation practices and land reclamation in the 1960s, respectively. A further shift in OM input was observed since ca. 1993, characterized by the beginning of an increase in phytoplankton input, as indicated by the greater HI, short-chain n-alkane abundance, and n-C17/n-C16 alkane ratio in sediments. Moreover, a lower Paq and higher abundance of long-chain n-alkanes indicated a decline in input from submerged macrophytes and an elevated input from terrestrial plants. The increase in αβ-hopane abundance and homohopane index value indicated that petroleum-sourced OM was first introduced into the sediments.
The causes of these OM input changes included nutrient influx associated with domestic and industrial discharge, aquaculture within the lake, and widespread deforestation and land clearance in the catchment.
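
The Paq proxy used in this abstract can be illustrated with a short calculation. This is a minimal sketch, assuming the abstract follows the commonly used definition Paq = (C23 + C25) / (C23 + C25 + C29 + C31) (Ficken et al., 2000); the abundances below are hypothetical and are not taken from the study.

```python
def paq(abundance):
    """Paq = (C23 + C25) / (C23 + C25 + C29 + C31).

    `abundance` maps n-alkane carbon number to abundance (any consistent unit).
    """
    mid_chain = abundance[23] + abundance[25]
    long_chain = abundance[29] + abundance[31]
    return mid_chain / (mid_chain + long_chain)


# Hypothetical n-alkane abundances, for illustration only
sample = {23: 3.0, 25: 3.0, 29: 2.0, 31: 2.0}
print(round(paq(sample), 2))  # 0.6
```

Higher values point to a larger share of mid-chain n-alkanes, which the abstract interprets as greater input from submerged macrophytes.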

2.
Article in English | MEDLINE | ID: mdl-39042534

ABSTRACT

Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), which formulate many important real-world applications. Cut selection heavily depends on (P1) which cuts to prefer and (P2) how many cuts to select. Although modern MILP solvers tackle (P1)-(P2) with human-designed heuristics, machine learning has the potential to learn more effective heuristics. However, many existing learning-based methods learn which cuts to prefer while neglecting the importance of learning how many cuts to select. Moreover, we observe that (P3) the order of the selected cuts also significantly impacts the efficiency of MILP solvers. To address these challenges, we propose a novel hierarchical sequence/set model (HEM) to learn cut selection policies. Specifically, HEM is a bi-level model: (1) a higher-level module that learns how many cuts to select, and (2) a lower-level module, which formulates cut selection as a sequence/set-to-sequence learning problem, that learns policies selecting an ordered subset with the cardinality determined by the higher-level module. To the best of our knowledge, HEM is the first data-driven methodology that tackles (P1)-(P3) simultaneously. Experiments demonstrate that HEM significantly improves the efficiency of solving MILPs on eleven challenging MILP benchmarks, including two real problems from Huawei.

3.
Sci Total Environ ; 919: 170938, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38354795

ABSTRACT

Stratigraphic determination of the Anthropocene, the "Great Acceleration", requires more key globally synchronous stratigraphic markers which reflect the significant human impacts on Earth. Lacustrine sediment magnetic characteristics are of considerable importance in Anthropocene studies because they respond sensitively to environmental changes. There are many shallow lakes in the Songnen Plain (SNP) in northeast China, which are conducive to obtaining Anthropocene sedimentary records. This study explored how magnetic materials in lacustrine sediments respond to environmental changes driven by human activities on the SNP by measuring magnetic parameters in dated sediment cores from 5 shallow lakes in the SNP, northeast China. The results revealed that detrital magnetite and hematite dominated the magnetic minerals in lake sediments. The persistently low value of magnetic susceptibility might be caused by the low content of natural ferrimagnetic minerals in Quaternary fluvial deposits and humus-rich black soil in the catchment, and the loss of magnetic materials during the transport process. In Lake Longjiangpao (LJP), the magnetic concentrations significantly responded to regional precipitation, whereas in the other 4 lakes in the center of the plain, the parameters tended to reflect complex human activities. However, the isothermal remanent magnetization ratio (S-300), which is indicative of the ratio of hematite to magnetite, exhibited relatively consistent variations in the 5 studied lakes. After 1950, the "Great Acceleration", the increase of S-300 indicated an increase in the relative proportion of magnetite in sediments, and was positively correlated with the growth of human-activity proxies (Gross Domestic Product (GDP) and population). Thus, this proxy can be regarded as a useful indicator of the beginning of the Anthropocene in the studied region.
This study provides new insights into the estimation of local human activities in history and possible evidence for the global definition of the Anthropocene.

4.
IEEE Trans Image Process ; 33: 1938-1951, 2024.
Article in English | MEDLINE | ID: mdl-38224517

ABSTRACT

Generalized Zero-Shot Learning (GZSL) aims at recognizing images from both seen and unseen classes by constructing correspondences between visual images and semantic embedding. However, existing methods suffer from a strong bias problem, where unseen images in the target domain tend to be recognized as seen classes in the source domain. To address this issue, we propose a Prototype-augmented Self-supervised Generative Network by integrating self-supervised learning and prototype learning into a feature generating model for GZSL. The proposed model enjoys several advantages. First, we propose a Self-supervised Learning Module to exploit inter-domain relationships, where we introduce anchors as a bridge between seen and unseen categories. In the shared space, we pull the distribution of the target domain away from the source domain and obtain domain-aware features. To our best knowledge, this is the first work to introduce self-supervised learning into GZSL as learning guidance. Second, a Prototype Enhancing Module is proposed to utilize class prototypes to model reliable target domain distribution in finer granularity. In this module, a Prototype Alignment mechanism and a Prototype Dispersion mechanism are combined to guide the generation of better target class features with intra-class compactness and inter-class separability. Extensive experimental results on five standard benchmarks demonstrate that our model performs favorably against state-of-the-art GZSL methods.

5.
IEEE Trans Image Process ; 33: 228-240, 2024.
Article in English | MEDLINE | ID: mdl-38064330

ABSTRACT

We tackle the problem of establishing dense correspondences between a pair of images in an efficient way. Most existing dense matching methods use 4D convolutions to filter incorrect matches, but 4D convolutions are highly inefficient due to their quadratic complexity. Besides, these methods learn features with fixed convolutions which cannot make learnt features robust to different challenge scenarios. To deal with these issues, we propose an Efficient Dynamic Correspondence Network (EDCNet) by jointly equipping pre-separate convolution (Psconv) and dynamic convolution (Dyconv) to establish dense correspondences in a coarse-to-fine manner. The proposed EDCNet enjoys several merits. First, two well-designed modules including a neighbourhood aggregation (NA) module and a dynamic feature learning (DFL) module are combined elegantly in the coarse-to-fine architecture, which is efficient and effective to establish both reliable and accurate correspondences. Second, the proposed NA module maintains linear complexity, showing its high efficiency. And our proposed DFL module has better flexibility to learn features robust to different challenges. Extensive experimental results show that our algorithm performs favorably against state-of-the-art methods on three challenging datasets including HPatches, Aachen Day-Night and InLoc.

6.
Article in English | MEDLINE | ID: mdl-37788191

ABSTRACT

Federated learning (FL) is a promising framework for privacy-preserving and distributed training with decentralized clients. However, there exists a large divergence between the collected local updates and the expected global update, which is known as the client drift and mainly caused by heterogeneous data distribution among clients, multiple local training steps, and partial client participation training. Most existing works tackle this challenge based on the empirical risk minimization (ERM) rule, while less attention has been paid to the relationship between the global loss landscape and the generalization ability. In this work, we propose FedGAMMA, a novel FL algorithm with Global sharpness-Aware MiniMizAtion to seek a global flat landscape with high performance. Specifically, in contrast to FedSAM which only seeks the local flatness and still suffers from performance degradation when facing the client-drift issue, we adopt a local varieties control technique to better align each client's local updates to alleviate the client drift and make each client heading toward the global flatness together. Finally, extensive experiments demonstrate that FedGAMMA can substantially outperform several existing FL baselines on various datasets, and it can well address the client-drift issue and simultaneously seek a smoother and flatter global landscape.
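
The sharpness-aware idea this abstract builds on can be sketched on a toy problem. This is a minimal sketch of a generic sharpness-aware minimization (SAM) step on a single machine, not the FedGAMMA algorithm itself; the toy loss, `lr`, and `rho` are illustrative assumptions.

```python
import math


def grad(w):
    # Gradient of the toy loss f(w) = 0.5 * (1 * w0^2 + 10 * w1^2)
    return [1.0 * w[0], 10.0 * w[1]]


def sam_step(w, lr=0.1, rho=0.05):
    g = grad(w)
    norm = math.sqrt(sum(x * x for x in g)) or 1.0  # guard against a zero gradient
    # Ascent step: move to the approximate worst-case point within radius rho
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: update the original weights with the gradient taken at w_adv
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w = [1.0, 1.0]
for _ in range(50):
    w = sam_step(w)
print(w)  # both coordinates end up close to the minimum at the origin
```

In a federated setting each client would run such steps locally; how the abstract's method aligns those local updates is not specified here and is left out of the sketch.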

7.
IEEE Trans Image Process ; 32: 5623-5636, 2023.
Article in English | MEDLINE | ID: mdl-37812538

ABSTRACT

Semi-supervised video object segmentation is the task of segmenting the target in sequential frames given the ground truth mask in the first frame. Modern approaches usually utilize such a mask as pixel-level supervision and typically exploit pixel-to-pixel matching between the reference frame and current frame. However, matching at the pixel level, which overlooks the high-level information beyond local areas, often suffers from confusion caused by similar local appearances. In this paper, we present Prototypical Matching Networks (PMNet), a novel architecture that integrates prototypes into matching-based video object segmentation frameworks as high-level supervision. Specifically, PMNet first divides the foreground and background areas into several parts according to the similarity to the global prototypes. The part-level prototypes and instance-level prototypes are generated by encapsulating the semantic information of identical parts and identical instances, respectively. To model the correlation between prototypes, the prototype representations are propagated to each other by reasoning on a graph structure. Then, PMNet stores both the pixel-level features and prototypes in the memory bank as the target cues. Three affinities, i.e., pixel-to-pixel affinity, prototype-to-pixel affinity, and prototype-to-prototype affinity, are derived to measure the similarity between the query frame and the features in the memory bank. The features aggregated from the memory bank using these affinities provide powerful discrimination from both the pixel-level and prototype-level perspectives. Extensive experiments conducted on four benchmarks demonstrate superior results to state-of-the-art video object segmentation techniques.

8.
Environ Sci Pollut Res Int ; 30(47): 103910-103920, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37691060

ABSTRACT

The abundance and composition of aliphatic hydrocarbon biomarkers were determined in dated sediment cores from Lakes Qijiapao (QJP) and Huoshaoheipao (HSH) in the Songnen Plain, Northeast China, to investigate historical environmental changes in these lakes and identify likely controlling factors. Based on these results, the recent environmental history of the two lakes can be divided into three periods. Before 1950, low Paq values (avg. 0.23 and 0.27, respectively) and middle-chain n-alkane abundances (normalized to total organic carbon, avg. 14.82 and 16.01 µg g⁻¹ TOC, respectively) in both lakes suggested low aquatic productivity and the limited input of submerged macrophyte organic matter (OM). However, the significant increase in the abundance of short-chain n-alkanes in Lake HSH (from 8.34 to 16.68 µg g⁻¹ TOC) indicated the emergence of early nutrient enrichment in the lake. From 1950 to 2000, a marked increase in the abundance of middle-chain n-alkanes (avg. 21.72 and 22.62 µg g⁻¹ TOC in Lakes QJP and HSH, respectively) and Paq values indicated that both lakes had undergone eutrophication because of the population explosion and agricultural intensification. From 2000 to 2013, the abundance of short- and middle-chain n-alkanes in Lake QJP markedly exceeded those in Lake HSH and indicated stronger eutrophication in Lake QJP, which could be caused by the development of ecotourism in Lake HSH and the concomitant increase in aquaculture in Lake QJP in recent years. The highest abundance of C30 αβ-hopane (~10.24 µg g⁻¹ TOC) and the lowest CPIH values in Lake QJP revealed possible petroleum pollution since 2008. Taken together, lake eutrophication in the Songnen Plain accelerated after 1950 and was influenced primarily by agriculture and aquaculture. This is in contrast to lakes in other regions of China (such as the Yangtze River Basin and Yunnan Province), where urbanization and industrialization have exerted a dominant influence on the lake environment.


Subjects
Geologic Sediments; Hydrocarbons; Humans; China; Geologic Sediments/chemistry; Hydrocarbons/analysis; Alkanes/analysis; Eutrophication; Environmental Monitoring/methods
9.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 14404-14419, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37616133

ABSTRACT

Establishing effective correspondences between a pair of images is difficult due to real-world challenges such as illumination, viewpoint and scale variations. Modern detector-based methods typically learn fixed detectors from a given dataset, which makes it hard to extract repeatable and reliable keypoints for various images with extreme appearance changes and weakly textured scenes. To deal with this problem, we propose a novel Dynamic Keypoint Detection Network (DKDNet) for robust image matching via a dynamic keypoint feature learning module and a guided heatmap activator. The proposed DKDNet enjoys several merits. First, the proposed dynamic keypoint feature learning module can generate adaptive keypoint features via the attention mechanism, which is flexibly updated with the current input image and can capture keypoints with different patterns. Second, the guided heatmap activator can effectively fuse multi-group keypoint heatmaps by fully considering the importance of different feature channels, which can realize more robust keypoint detection. Extensive experimental results on four standard benchmarks demonstrate that our DKDNet outperforms state-of-the-art image-matching methods by a large margin. Specifically, our DKDNet can outperform the best image-matching method by 2.1% in AUC@3px on HPatches, 3.74% in AUC@5° on ScanNet, 7.14% in AUC@5° on MegaDepth and 12.32% in AUC@5° on YFCC100M.

10.
Environ Pollut ; 335: 122350, 2023 Oct 15.
Article in English | MEDLINE | ID: mdl-37572845

ABSTRACT

Limited human activities in catchments make remote alpine lakes valuable sites for studying the evolution of lake environments in response to climate change and atmospheric deposition; however, this issue remains rarely studied owing to the scarcity of monitoring data. In this study, water quality evolution in Lake Jiren, a remote alpine lake on the southeastern margin of the Tibetan Plateau, over the past two centuries was reconstructed through geochemical analyses of aliphatic hydrocarbons, major and trace elements, and organic matter (OM) pyrolysis products in a dated sediment core, and the associated drivers were identified by temporally comparing the geochemical results with document records. All geochemical data demonstrated that the lake water remained relatively pure until 1947, after which the n-alkane and αβ-hopane proxies indicated eutrophication and petroleum contamination. The OM pyrolysis proxy hydrocarbon index indicated more eutrophic conditions after 1957. Concurrently, hypolimnetic deoxygenation increased, as indicated by redox-sensitive proxies, such as the enrichment factors (EFs) of molybdenum (Mo). These proxies recorded further intensification of deoxygenation after 1976. The EFs for other trace elements indicated cadmium contamination after 1967. The greater anthropogenic emissions of reactive nitrogen, petroleum products, and heavy metals in East and South Asia since approximately 1950 and the subsequent atmospheric transport of these materials to the lake might be the basic driver of water quality deterioration. Eutrophication induced by nitrogen deposition was responsible for increased hypolimnetic deoxygenation by enhancing phytoplankton productivity and OM input. The further intensification of deoxygenation was attributed to climate warming since the 1970s, as prolonged water column stratification under this condition decreased oxygen input from the epilimnion to the lake bottom.
These findings may be beneficial for understanding the natural and anthropogenic effects on the water quality of alpine lakes and help in the environmental management of Lake Jiren and other alpine lakes.


Subjects
Petroleum; Trace Elements; Water Pollutants, Chemical; Humans; Water Quality; Tibet; Trace Elements/analysis; Environmental Monitoring/methods; Geologic Sediments/analysis; Hydrocarbons/analysis; Petroleum/analysis; Nitrogen/analysis; Water Pollutants, Chemical/analysis; China
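
The enrichment factors (EFs) mentioned in the abstract above can be illustrated with a short calculation. This is a minimal sketch, assuming the widely used form EF = (X/Ref)_sample / (X/Ref)_background with a conservative reference element such as Al; the abstract does not state which reference element or background the authors used, and all concentrations below are hypothetical.

```python
def enrichment_factor(x_sample, ref_sample, x_background, ref_background):
    """EF = (X / Ref)_sample / (X / Ref)_background."""
    return (x_sample / ref_sample) / (x_background / ref_background)


# Hypothetical concentrations in mg/kg: Mo as the element of interest,
# Al as the assumed reference element
ef_mo = enrichment_factor(
    x_sample=2.0, ref_sample=80000.0,
    x_background=1.0, ref_background=80000.0,
)
print(round(ef_mo, 2))  # 2.0
```

An EF near 1 suggests no enrichment relative to the chosen background, while larger values suggest additional input or, for redox-sensitive elements such as Mo, changes in bottom-water oxygenation.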
11.
IEEE Trans Image Process ; 32: 4567-4580, 2023.
Article in English | MEDLINE | ID: mdl-37556339

ABSTRACT

As a crucial application in privacy protection, scene text removal (STR) has received considerable attention in recent years. However, existing approaches that coarsely erase text from images ignore two important properties: the background texture integrity (BI) and the text erasure exhaustivity (EE). These two properties directly determine the erasure performance, and how to maintain them in a single network is the core problem for the STR task. In this paper, we attribute the lack of BI and EE properties to the implicit erasure guidance and imbalanced multi-stage erasure, respectively. To improve these two properties, we propose a new ProgrEssively Region-based scene Text eraser (PERT). There are three key contributions in our study. First, a novel explicit erasure guidance is proposed to enhance the BI property. Different from implicit erasure guidance, which modifies all the pixels in the entire image, our explicit one accurately performs stroke-level modification with only bounding-box level annotations. Second, a new balanced multi-stage erasure is constructed to improve the EE property. By balancing the learning difficulty and network structure among progressive stages, each stage takes an equal step towards the text-erased image to ensure the erasure exhaustivity. Third, we propose two new evaluation metrics called BI-metric and EE-metric, which make up for the shortcomings of current evaluation tools in analyzing BI and EE properties. Compared with previous methods, PERT outperforms them by a large margin in both BI-metric (↑6.13%) and EE-metric (↑1.9%), obtaining SOTA results with high speed (71 FPS) and at least 25% lower parameter complexity. Code will be available at https://github.com/wangyuxin87/PERT.

12.
Health Inf Sci Syst ; 11(1): 26, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37325196

ABSTRACT

Semi-supervised learning (SSL) has attracted increasing attention in medical image segmentation, where the mainstream usually explores perturbation-based consistency as a regularization to leverage unlabelled data. However, unlike directly optimizing segmentation task objectives, consistency regularization is a compromise by incorporating invariance towards perturbations, and inevitably suffers from noise in self-predicted targets. The above issues result in a knowledge gap between supervised guidance and unsupervised regularization. To bridge the knowledge gap, this work proposes a meta-based semi-supervised segmentation framework with the exploitation of label hierarchy. Two main prominent components named Divide and Generalize, and Label Hierarchy, are built in this work. Concretely, rather than merging all knowledge indiscriminately, we dynamically divide consistency regularization from supervised guidance as different domains. Then, a domain generalization technique is introduced with a meta-based optimization objective which ensures the update on supervised guidance should generalize to the consistency regularization, thereby bridging the knowledge gap. Furthermore, to alleviate the negative impact of noise in self-predicted targets, we propose to distill the noisy pixel-level consistency by exploiting label hierarchy and extracting hierarchical consistencies. Comprehensive experiments on two public medical segmentation benchmarks demonstrate the superiority of our framework to other semi-supervised segmentation methods, with new state-of-the-art results.

13.
IEEE Trans Image Process ; 32: 2734-2748, 2023.
Article in English | MEDLINE | ID: mdl-37155387

ABSTRACT

Point cloud shape correspondence aims at accurately mapping one point cloud to another point cloud with various 3D shapes. Since point clouds are usually sparse, disordered, irregular, and with diverse shapes, it is challenging to learn consistent point cloud representations and achieve the accurate matching of different point cloud shapes. To address the above issues, we propose a Hierarchical Shape-consistent TRansformer for unsupervised point cloud shape correspondence (HSTR), including a multi-receptive-field point representation encoder and a shape-consistent constrained module in a unified architecture. The proposed HSTR enjoys several merits. In the multi-receptive-field point representation encoder, we set progressively larger receptive fields in different blocks to simultaneously consider the local structure and the long-range context. In the shape-consistent constrained module, we design two novel shape selective whitening losses, which can complement each other to achieve suppression of features sensitive to shape change. Extensive experimental results on four standard benchmarks demonstrate the superiority and generalization ability of our approach to existing methods at the similar model scale, and our method achieves the new state-of-the-art results.

14.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 9109-9121, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37015535

ABSTRACT

Weakly supervised object localization (WSOL) aims to predict both object locations and categories with only image-level class labels. However, most existing methods rely on class-specific image regions for localization, resulting in incomplete object localization. To alleviate this problem, we propose a novel end-to-end task-aware framework with a transformer encoder-decoder architecture (TAFormer) to learn class-agnostic foreground maps, including a representation encoder, a localization decoder, and a classification decoder. The proposed TAFormer enjoys several merits. First, the designed three modules can effectively perform class-agnostic localization and classification in a task-aware manner, achieving remarkable performance for both tasks. Second, an optimal transport algorithm is proposed to provide pixel-level pseudo labels to online refine foreground maps. To the best of our knowledge, this is the first work by exploring a task-aware framework with a transformer architecture and an optimal transport algorithm to achieve accurate object localization for WSOL. Extensive experiments with four backbones on two standard benchmarks demonstrate that our TAFormer achieves favorable performance against state-of-the-art methods. Furthermore, we show that the proposed TAFormer provides higher robustness against adversarial attacks and noisy labels.

15.
Acad Radiol ; 30 Suppl 2: S114-S126, 2023 09.
Article in English | MEDLINE | ID: mdl-37003874

ABSTRACT

RATIONALE AND OBJECTIVES: This study assessed the role of second-look automated breast ultrasound (ABUS) adjunct to mammography (MAM) versus MAM alone in asymptomatic women and compared it with supplementing handheld ultrasound (HHUS). MATERIALS AND METHODS: Women aged 45 to 64 underwent HHUS, ABUS, and MAM among six hospitals in China from 2018 to 2022. We compared the screening performance of three strategies (MAM alone, MAM plus HHUS, and MAM plus ABUS) stratified by age groups and breast density. McNemar's test was used to assess differences in the cancer detection rate (CDR), the false-positive biopsy rate, sensitivity, and specificity of different strategies. RESULTS: Of 19,171 women analyzed (mean [SD] age, 51.54 [4.61] years), 72 cases of breast cancer (3.76 per 1000) were detected. The detection rates for both HHUS and ABUS combined with MAM were statistically higher than those for MAM alone (all p < 0.001). There was no significant difference in cancer yields between the two integration strategies. The increase in CDR of the integrated strategies was higher in women aged 45-54 years with denser breasts compared with MAM alone (all p < 0.0167). In addition, the false-positive biopsy rate of MAM plus ABUS was lower than that of MAM plus HHUS (p = 0.025). Moreover, the retraction in ABUS was more frequent in cases detected among MAM-negative results. CONCLUSION: Integrating ABUS or HHUS into MAM provided similar CDRs that were significantly higher than those for MAM alone in younger women (45-54 years) with denser breasts. ABUS has the potential to avoid unnecessary biopsies and provides specific image features to distinguish malignant tumors from HHUS.


Subjects
Breast Neoplasms; Ultrasonography, Mammary; Female; Humans; Middle Aged; Ultrasonography, Mammary/methods; Sensitivity and Specificity; Mammography; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/epidemiology; China/epidemiology
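
The cancer detection rate reported in the abstract above, and the kind of paired comparison that McNemar's test performs, can be reproduced in a few lines. This is a minimal sketch: the CDR uses the abstract's own figures (72 cancers among 19,171 women), whereas the exact binomial form of McNemar's test and the discordant-pair counts are assumptions for illustration, since the abstract does not give them.

```python
from math import comb


def cancer_detection_rate(cancers, screened, per=1000):
    """Cancers detected per `per` women screened."""
    return per * cancers / screened


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


print(round(cancer_detection_rate(72, 19171), 2))  # 3.76 per 1000, as reported

# Hypothetical discordant pairs: 1 cancer found only by strategy A,
# 9 cancers found only by strategy B
p_value = mcnemar_exact(1, 9)
print(round(p_value, 4))  # 0.0215
```

Only the discordant pairs enter the test, which is why it suits comparisons where every woman is examined with all strategies.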
16.
ACS Biomater Sci Eng ; 9(5): 2683-2693, 2023 05 08.
Article in English | MEDLINE | ID: mdl-37083337

ABSTRACT

Noninterventional embolization does not require the use of a catheter, and the treatment of solid tumors in combination with thermal ablation can avoid some of the risks of the surgical procedure. Therefore, we developed efficient tumor microenvironment-gelled nanocomposites with poly [(l-glutamic acid-ran-l-tyrosine)-b-l-serine-b-l-cysteine] (PGTSCs) coated-nanoparticles (Fe3O4&Au@PGTSCs), in which the prepared PGTSCs respond to the pH of the acidic tumor microenvironment. In noninterventional embolization treatment, Fe3O4&Au@PGTSCs not only achieved smart targeted drug delivery but also combined with noninvasive multimodal thermal ablation therapy and multimodal imaging of solid tumors via intravenous injection. Notably, in vivo animal experiments demonstrated that Fe3O4&Au@PGTSCs have specific tumor accumulation as well as embolization and thermal ablation effects; at 10 days postinjection, only scars were found at the tumor site. After 20 days, the tumors of model mice completely disappeared. This approach makes it easier to treat solid tumors by exploiting the slightly acidic tumor environment.


Subjects
Hyperthermia, Induced; Nanocomposites; Nanoparticles; Neoplasms; Mice; Animals; Amino Acids; Neoplasms/therapy; Nanoparticles/therapeutic use; Nanoparticles/chemistry; Hyperthermia, Induced/methods; Nanocomposites/therapeutic use; Tumor Microenvironment
17.
World Wide Web ; 26(2): 539-559, 2023.
Article in English | MEDLINE | ID: mdl-35528264

ABSTRACT

Developmental dysplasia of the hip (DDH) is one of the most common diseases in children. Due to the experience-requiring medical image analysis work, online automatic diagnosis of DDH has intrigued the researchers. Traditional implementation of online diagnosis faces challenges with reliability and interpretability. In this paper, we establish an online diagnosis tool based on a multi-task hourglass network, which can accurately extract landmarks to detect the extent of hip dislocation and predict the age of the femoral head. Our method utilizes a multi-task hourglass network, which trains an encoder-decoder network to regress the landmarks and predict the developmental age for online DDH diagnosis. With the support of precise image analysis and fast GPU computing, our method can help overcome the shortage of medical resources and enable telehealth for DDH diagnosis. Applying this approach to a dataset of DDH X-ray images, we demonstrate 4.64 mean pixel error of landmark detection compared to the results of human experts. Moreover, we can improve the accuracy of the age prediction of femoral heads to 89%. Our online automatic diagnosis system has provided service to 112 patients, and the results demonstrate the effectiveness of our method.
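
The "mean pixel error" reported for landmark detection can be illustrated as follows. This is a minimal sketch, assuming the metric is the mean Euclidean distance in pixels between predicted and expert-annotated landmarks; the abstract does not define it further, and the coordinates below are hypothetical.

```python
from math import dist


def mean_pixel_error(predicted, reference):
    """Mean Euclidean distance (in pixels) between paired landmark coordinates."""
    if len(predicted) != len(reference):
        raise ValueError("landmark lists must have the same length")
    return sum(dist(p, r) for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical (x, y) landmark coordinates in pixels
predicted = [(10.0, 10.0), (53.0, 44.0)]
reference = [(10.0, 10.0), (50.0, 40.0)]
print(mean_pixel_error(predicted, reference))  # 2.5
```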

18.
IEEE Trans Cybern ; 53(9): 5631-5640, 2023 Sep.
Article in English | MEDLINE | ID: mdl-35427228

ABSTRACT

Graph convolutional networks (GCNs) have attracted increasing research attention owing to their strong ability to handle graph data, such as citation networks or social networks. Existing models typically use first-order neighborhood information to design specific convolution operations, which aggregate the features of all adjacent nodes. However, such models ignore the high-order spatial relationship among neighboring nodes in noisy data due to the modeling complexity. In this article, we propose a novel robust graph relational network to address this issue by modeling high-order relationships in noisy data for graph convolution. Our key innovation lies in designing a generic relation network layer, which is used to infer the underlying relations among adjacent noisy nodes. Specifically, a fixed number of adjacent nodes for each node is chosen by solving the ridge regression problem, in which the regression coefficients are used to rank the adjacent nodes of each node in a graph. Furthermore, to mine the rich features, we extract high-order information from the nodes to significantly enhance the representation ability of the GCNs for extensive applications. We conduct extensive semisupervised node classification experiments on noisy benchmark datasets, which clearly show that our model is superior to the existing methods and can achieve state-of-the-art performance.

19.
IEEE Trans Pattern Anal Mach Intell ; 45(6): 7123-7141, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36417745

ABSTRACT

Scene text spotting is of great importance to the computer vision community due to its wide variety of applications. Recent methods attempt to introduce linguistic knowledge for challenging recognition rather than pure visual classification. However, how to effectively model the linguistic rules in end-to-end deep networks remains a research challenge. In this paper, we argue that the limited capacity of language models comes from 1) implicit language modeling; 2) unidirectional feature representation; and 3) language model with noise input. Correspondingly, we propose an autonomous, bidirectional and iterative ABINet++ for scene text spotting. First, the autonomous suggests enforcing explicitly language modeling by decoupling the recognizer into vision model and language model and blocking gradient flow between both models. Second, a novel bidirectional cloze network (BCN) as the language model is proposed based on bidirectional feature representation. Third, we propose an execution manner of iterative correction for the language model which can effectively alleviate the impact of noise input. Additionally, based on an ensemble of the iterative predictions, a self-training method is developed which can learn from unlabeled images effectively. Finally, to polish ABINet++ in long text recognition, we propose to aggregate horizontal features by embedding Transformer units inside a U-Net, and design a position and content attention module which integrates character order and content to attend to character features precisely. ABINet++ achieves state-of-the-art performance on both scene text recognition and scene text spotting benchmarks, which consistently demonstrates the superiority of our method in various environments especially on low-quality images. 
In addition, extensive experiments in both English and Chinese show that a text spotter incorporating our language modeling method can significantly improve both accuracy and speed compared with commonly used attention-based recognizers. Code is available at https://github.com/FangShancheng/ABINet-PP.

20.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 5252-5267, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35994544

ABSTRACT

In weakly supervised (WSAL) and unsupervised temporal action localization (UAL), the target is to simultaneously localize temporal boundaries and identify category labels of actions with only video-level category labels (WSAL) or category numbers in a dataset (UAL) during training. Among existing methods, attention-based methods have achieved superior performance in both tasks by highlighting action segments with foreground attention weights. However, without segment-level supervision on the attention weight learning, the quality of the attention weights hinders the performance of these methods. In this paper, we propose a novel Uncertainty Guided Collaborative Training (UGCT) strategy to alleviate this problem, which mainly includes two key designs: (1) The first design is an online pseudo label generation module, in which the RGB and FLOW streams work collaboratively to learn from each other. (2) The second design is an uncertainty aware learning module, which can mitigate the noise in the generated pseudo labels. These two designs work together to promote the model performance effectively and efficiently by exchanging information between RGB and FLOW streams. Extensive experimental results on two benchmark datasets with three attention-based methods demonstrate the effectiveness of the proposed method, e.g., more than 7.0% performance gain for mAP@IoU=0.5 on the THUMOS14 dataset.
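
The IoU threshold in the mAP@IoU=0.5 metric quoted above refers to the temporal overlap between a predicted and a ground-truth action segment. This is a minimal sketch of that overlap computation only, not of the full mAP evaluation; the segments are hypothetical (start, end) times in seconds.

```python
def temporal_iou(a, b):
    """Intersection over union of two (start, end) time segments."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth segments
predicted, truth = (2.0, 8.0), (4.0, 10.0)
iou = temporal_iou(predicted, truth)
print(iou)         # 0.5
print(iou >= 0.5)  # True: counts as a correct localization at IoU=0.5
```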
