Your browser doesn't support javascript.
loading
Language-aware multiple datasets detection pretraining for DETRs.
Hao, Jing; Chen, Song.
Afiliação
  • Hao J; VIS, Baidu Inc., Beijing, 100000, China. Electronic address: isjinghao@gmail.com.
  • Chen S; VIS, Baidu Inc., Beijing, 100000, China. Electronic address: chensong03@baidu.com.
Neural Netw ; 179: 106506, 2024 Jul 04.
Article em En | MEDLINE | ID: mdl-38996689
ABSTRACT
Pretraining on large-scale datasets can boost the performance of object detectors while the annotated datasets for object detection are hard to scale up due to the high labor cost. What we possess are numerous isolated filed-specific datasets, thus, it is appealing to jointly pretrain models across aggregation of datasets to enhance data volume and diversity. In this paper, we propose a strong framework for utilizing Multiple datasets to pretrain DETR-like detectors, termed METR, without the need for manual label spaces integration. It converts the typical multi-classification in object detection into binary classification by introducing a pre-trained language model. Specifically, we design a category extraction module for extracting potential categories involved in an image and assign these categories into different queries by language embeddings. Each query is only responsible for predicting a class-specific object. Besides, to adapt our novel detection paradigm, we propose a Class-wise Bipartite Matching strategy that limits the ground truths to match queries assigned to the same category. Extensive experiments demonstrate that METR achieves extraordinary results on either multi-task joint training or the pretrain & finetune paradigm. Notably, our pre-trained models have high flexible transferability and increase the performance upon various DETR-like detectors on COCO val2017 benchmark. Our code is publicly available at https//github.com/isbrycee/METR.
Palavras-chave

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Idioma: En Revista: Neural Netw Assunto da revista: NEUROLOGIA Ano de publicação: 2024 Tipo de documento: Article

Texto completo: 1 Coleções: 01-internacional Base de dados: MEDLINE Idioma: En Revista: Neural Netw Assunto da revista: NEUROLOGIA Ano de publicação: 2024 Tipo de documento: Article