Results 1 - 5 of 5
1.
IEEE Trans Image Process; 32: 4543-4554, 2023.
Article in English | MEDLINE | ID: mdl-37531308

ABSTRACT

Composing Text and Image to Image Retrieval (CTI-IR) aims to find the target image that matches the query image visually and the query text semantically. However, existing works ignore the fact that the reference text usually serves multiple functions, e.g., modification and auxiliary functions. To address this issue, we put forward a unified solution, namely a Hierarchical Aggregation Transformer incorporated with a Cross Relation Network (CRN). CRN unifies the modification and relevance manners in a single framework. This configuration shows broader applicability, enabling us to model modification text, auxiliary text, or their combination in triplet relationships simultaneously. Specifically, CRN includes: 1) a Cross Relation Network that comprehensively captures the relationships of the various composed retrieval scenarios caused by the two different query text types, allowing a unified retrieval model to designate adaptive combination strategies for flexible applicability; and 2) a Hierarchical Aggregation Transformer that aggregates top-down features with a Multi-layer Perceptron (MLP) to overcome the edge information loss of a window-based multi-stage Transformer. Extensive experiments demonstrate the superiority of the proposed CRN on all three fashion-domain datasets. Code is available at github.com/yan9qu/crn.
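The authors' code is at the linked repository; as a generic illustration of the composed text-and-image retrieval setup only (not CRN's architecture), the sketch below fuses an image embedding with a text embedding through a gated residual combiner and ranks gallery images by cosine similarity. All module names, dimensions, and the fusion rule are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleComposer(nn.Module):
    """Gated residual fusion of a query-image and a query-text embedding.

    Illustrative only: the gate decides how much of the image feature to
    keep versus modify according to the text.
    """
    def __init__(self, dim=512):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.delta = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                   nn.Linear(dim, dim))

    def forward(self, img_emb, txt_emb):
        joint = torch.cat([img_emb, txt_emb], dim=-1)
        g = self.gate(joint)                      # per-dimension keep/modify weight
        composed = g * img_emb + (1 - g) * self.delta(joint)
        return F.normalize(composed, dim=-1)      # unit length for cosine ranking

# Usage: rank gallery images against the composed query (dummy features).
composer = SimpleComposer(dim=512)
img_emb = torch.randn(4, 512)                    # from any image encoder
txt_emb = torch.randn(4, 512)                    # from any text encoder
gallery = F.normalize(torch.randn(100, 512), dim=-1)
scores = composer(img_emb, txt_emb) @ gallery.T  # (4, 100) similarity matrix
top5 = scores.topk(5, dim=-1).indices
```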

2.
IEEE Trans Image Process; 32: 2190-2201, 2023.
Article in English | MEDLINE | ID: mdl-37018096

ABSTRACT

Visual intention understanding is the task of exploring the potential and underlying meaning expressed in images. Simply modeling the objects or backgrounds within the image content leads to unavoidable comprehension bias. To alleviate this problem, this paper proposes Cross-modality Pyramid Alignment with Dynamic optimization (CPAD) to enhance the global understanding of visual intention through hierarchical modeling. The core idea is to exploit the hierarchical relationship between visual content and textual intention labels. For the visual hierarchy, we formulate visual intention understanding as a hierarchical classification problem, capturing multi-granularity features in different layers that correspond to the hierarchical intention labels. For the textual hierarchy, we directly extract semantic representations from the intention labels at different levels, which supplements the visual content modeling without extra manual annotation. Moreover, to further narrow the domain gap between the two modalities, a cross-modality pyramid alignment module is designed to dynamically optimize visual intention understanding in a joint learning manner. Comprehensive experiments demonstrate the superiority of the proposed method, which outperforms existing visual intention understanding methods.
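The hierarchical-classification formulation can be illustrated with a minimal sketch that attaches one classification head per label level to a shared backbone feature and sums the per-level cross-entropy losses. The feature dimension and level sizes below are hypothetical, and the sketch does not reproduce CPAD's pyramid alignment or dynamic optimization.

```python
import torch
import torch.nn as nn

class HierarchicalHead(nn.Module):
    """One classifier per label granularity over a shared backbone feature."""
    def __init__(self, feat_dim=2048, level_sizes=(7, 28)):  # e.g. coarse / fine labels
        super().__init__()
        self.heads = nn.ModuleList([nn.Linear(feat_dim, n) for n in level_sizes])

    def forward(self, feat):
        # One logit vector per hierarchy level.
        return [head(feat) for head in self.heads]

def hierarchical_loss(logits_per_level, labels_per_level):
    """Sum of cross-entropy losses over all hierarchy levels."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(logits, labels)
               for logits, labels in zip(logits_per_level, labels_per_level))

# Usage with dummy backbone features and per-level labels.
feat = torch.randn(8, 2048)
head = HierarchicalHead()
logits = head(feat)
labels = [torch.randint(0, 7, (8,)), torch.randint(0, 28, (8,))]
loss = hierarchical_loss(logits, labels)
loss.backward()
```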

3.
Eur Radiol; 33(6): 4303-4312, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36576543

ABSTRACT

OBJECTIVES: Lymph node (LN) metastasis is a common cause of recurrence in oral cancer; however, the accuracy of distinguishing positive from negative LNs is not ideal. Here, we aimed to develop a deep learning model that can identify, locate, and distinguish LNs in contrast-enhanced CT (CECT) images with higher accuracy. METHODS: The preoperative CECT images and corresponding postoperative pathological diagnoses of 1466 patients with oral cancer from our hospital were retrospectively collected. In stage I, full-layer images (five common anatomical structures) were labeled; in stage II, negative and positive LNs were labeled separately. Following the idea of transfer learning (TL), the stage I model was employed to initialize stage II training and improve its accuracy. The Mask R-CNN instance segmentation framework was selected for model construction and training. The accuracy of the model was compared with that of human observers. RESULTS: A total of 5412 and 5601 images were labeled in stages I and II, respectively. The stage I model achieved an excellent segmentation effect on the test set (AP50 = 0.7249). The positive-LN accuracy of the stage II TL model was similar to that of the radiologist and much higher than that of the surgeons and students (0.7042 vs. 0.7647 (p = 0.243), 0.4216 (p < 0.001), and 0.3629 (p < 0.001)). The clinical accuracy of the model was the highest (0.8509 vs. 0.8000, 0.5500, 0.4500, and 0.6658 for the Radiology Department). CONCLUSIONS: The model was constructed using a deep neural network and had high accuracy in LN localization and metastasis discrimination, which could contribute to accurate diagnosis and customized treatment planning. KEY POINTS: • Lymph node metastasis is not well recognized with modern medical imaging tools. • Transfer learning can improve the prediction accuracy of deep learning models. • Deep learning can aid the accurate identification of lymph node metastasis.
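The two-stage transfer-learning step can be sketched as initializing the stage II network from the stage I checkpoint, keeping only parameters whose names and shapes match. This is a generic illustration using torchvision's Mask R-CNN as a stand-in for the paper's implementation; the checkpoint file name and class counts are hypothetical.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Stage II model: background + negative LN + positive LN (hypothetical class count).
model = maskrcnn_resnet50_fpn(num_classes=3)

# Stage I checkpoint: hypothetically trained on five anatomical structures + background.
stage1_state = torch.load("stage1_anatomy_maskrcnn.pth", map_location="cpu")

# Transfer only tensors that match in name and shape; the class-specific
# prediction heads differ between stages and stay randomly initialized.
model_state = model.state_dict()
transferable = {k: v for k, v in stage1_state.items()
                if k in model_state and v.shape == model_state[k].shape}
model_state.update(transferable)
model.load_state_dict(model_state)
print(f"Transferred {len(transferable)}/{len(model_state)} tensors from stage I.")
```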


Subject(s)
Deep Learning; Mouth Neoplasms; Humans; Retrospective Studies; Lymphatic Metastasis/diagnostic imaging; Mouth Neoplasms/diagnostic imaging; Tomography, X-Ray Computed/methods; Lymph Nodes/diagnostic imaging
4.
J Burn Care Res; 42(4): 755-762, 2021 08 04.
Article in English | MEDLINE | ID: mdl-33336696

ABSTRACT

Burn injuries are a severe problem for humans. Accurate segmentation of burn wounds on the patient's body surface can improve the precision of %TBSA (total burn surface area) calculation, which helps in determining the treatment plan. Recently, deep learning methods have been used to segment wounds automatically. However, owing to the difficulty of collecting relevant images as training data, those methods often cannot achieve fine segmentation. A burn image generation framework is proposed in this paper to generate burn image datasets with annotations automatically; these datasets can be used to increase segmentation accuracy and save annotation time. This paper brings forward an advanced burn image generation framework called Burn-GAN. The framework consists of four parts: generating burn wounds based on the mainstream Style-GAN network; fusing wounds with human skin by Color Adjusted Seamless Cloning (CASC); simulating real burn scenes in three-dimensional space; and acquiring annotated datasets through three-dimensional and local burn coordinate transformations. Using this framework, a large variety of burn image datasets can be obtained. Finally, standard metrics such as precision, Pixel Accuracy (PA), and Dice Coefficient (DC) were used to assess the framework. With the nonsaturating loss with R2 regularization (NSLR2) and CASC, the segmentation network achieves the best results. The framework achieved a precision of 90.75%, a PA of 96.88%, and improved the DC from 84.5% to 89.3%. A burn data generation framework has been built to improve the segmentation network, which can automatically segment burn images with higher accuracy and in less time than traditional methods.
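OpenCV's built-in Poisson blending gives a rough feel for the wound-fusion step, although it is not the paper's CASC method (which additionally performs color adjustment). The file names and the threshold used to build the mask below are placeholders.

```python
import cv2

# Placeholder inputs: a synthetic wound patch (e.g. GAN output) and a skin photo.
wound = cv2.imread("generated_wound.png")   # source patch
skin = cv2.imread("skin_region.png")        # destination image

# Binary mask marking the wound pixels inside the source patch.
gray = cv2.cvtColor(wound, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)

# Paste the wound at the center of the skin image with seamless (Poisson) cloning.
center = (skin.shape[1] // 2, skin.shape[0] // 2)
fused = cv2.seamlessClone(wound, skin, mask, center, cv2.NORMAL_CLONE)
cv2.imwrite("fused_burn_image.png", fused)
```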


Subject(s)
Burns/diagnostic imaging; Image Processing, Computer-Assisted/methods; Algorithms; Burns/pathology; Databases, Factual/statistics & numerical data; Humans; Image Interpretation, Computer-Assisted/statistics & numerical data
5.
Burns Trauma; 7: 6, 2019.
Article in English | MEDLINE | ID: mdl-30859107

ABSTRACT

BACKGROUND: Burns are life-threatening, with high morbidity and mortality. Reliable diagnosis supported by accurate burn area and depth assessment is critical to the success of treatment decisions and, in some cases, can save the patient's life. Current techniques such as the straight-ruler method, the aseptic film trimming method, and digital camera photography are neither repeatable nor comparable, which leads to large differences in the judgment of burn wounds and impedes the establishment of uniform evaluation criteria. Hence, in order to semi-automate the burn diagnosis process, reduce the impact of human error, and improve the accuracy of burn diagnosis, we introduce deep learning technology into the diagnosis of burns. METHOD: This article proposes a novel method employing a state-of-the-art deep learning technique to segment the burn wounds in images. We designed this deep learning segmentation framework based on Mask Regions with Convolutional Neural Network (Mask R-CNN). For training, we labeled 1150 pictures in the format of the Common Objects in Context (COCO) dataset and trained our model on 1000 of them. In the evaluation, we compared different backbone networks in our framework: Residual Network-101 with Atrous Convolution in a Feature Pyramid Network (R101FA), Residual Network-101 with Atrous Convolution (R101A), and InceptionV2-Residual Network with Atrous Convolution (IV2RA). Finally, we used the Dice coefficient (DC) to assess model accuracy. RESULT: The R101FA backbone network achieves the highest accuracy, 84.51%, on 150 pictures. Moreover, we chose pictures of different burn depths to evaluate the three backbone networks. The R101FA backbone gives the best segmentation for superficial, superficial partial-thickness, and deep partial-thickness burns, while the R101A backbone gives the best segmentation for full-thickness burns. CONCLUSION: This deep learning framework shows excellent segmentation of burn wounds and is extremely robust across different burn depths. Moreover, the framework needs only a suitable burn wound image for analysis, making it more convenient and better suited to clinical use than traditional methods. It also contributes to the calculation of the total body surface area (TBSA) burned.
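The Dice coefficient and pixel accuracy reported in the last two abstracts can be computed from binary masks as in this short sketch; it is the standard definition of both metrics, independent of either paper's code, and the dummy masks are placeholders.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |P & T| / (|P| + |T|) for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return (pred == target).mean()

# Usage with dummy 256x256 masks standing in for predicted and annotated wounds.
pred = np.random.rand(256, 256) > 0.5
target = np.random.rand(256, 256) > 0.5
print(f"DC = {dice_coefficient(pred, target):.4f}, "
      f"PA = {pixel_accuracy(pred, target):.4f}")
```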
