Results 1 - 4 of 4
1.
Artif Intell Med ; 143: 102611, 2023 09.
Article in English | MEDLINE | ID: mdl-37673579

ABSTRACT

Medical Visual Question Answering (VQA) combines medical artificial intelligence with the popular VQA challenge. Given a medical image and a clinically relevant question in natural language, a medical VQA system is expected to predict a plausible and convincing answer. Although general-domain VQA has been extensively studied, medical VQA still requires specific investigation and exploration due to its task characteristics. In the first part of this survey, we collect and discuss the publicly available medical VQA datasets to date, covering data source, data quantity, and task features. In the second part, we review the approaches used in medical VQA tasks, summarizing and discussing their techniques, innovations, and potential improvements. In the last part, we analyze some medical-specific challenges for the field and discuss future research directions. Our goal is to provide comprehensive and helpful information for researchers interested in medical visual question answering and to encourage further research in this field.


Subject(s)
Artificial Intelligence
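A minimal sketch of the generic medical VQA pipeline this survey covers: encode the image and the question separately, fuse the two feature vectors, and classify over a fixed answer vocabulary. All names, dimensions, and encoders here are illustrative stand-ins, not taken from any system in the survey.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_image(image):
    """Stand-in for a CNN/ViT image encoder: pool spatial dims to a feature vector."""
    return image.mean(axis=(0, 1))

def encode_question(token_ids, embedding):
    """Stand-in for a text encoder: average the word embeddings."""
    return embedding[token_ids].mean(axis=0)

def answer(image, token_ids, embedding, W):
    """Fuse image and question features, then pick the highest-scoring answer."""
    fused = np.concatenate([encode_image(image),
                            encode_question(token_ids, embedding)])
    logits = W @ fused                     # linear classifier over answer candidates
    return int(np.argmax(logits))

image = rng.normal(size=(32, 32, 16))      # toy "medical image" with 16 channels
embedding = rng.normal(size=(100, 16))     # toy vocabulary of 100 tokens
W = rng.normal(size=(5, 32))               # 5 candidate answers
print(answer(image, [3, 17, 42], embedding, W))
```

Real systems replace the stand-in encoders with pre-trained vision and language models and often use generative decoders instead of a fixed answer set.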
2.
Sci Data ; 10(1): 570, 2023 08 26.
Article in English | MEDLINE | ID: mdl-37634014

ABSTRACT

Many studies have shown that cellular morphology can be used to distinguish spiked-in tumor cells against a blood-sample background. However, most validation experiments included only homogeneous cell lines and inadequately captured the broad morphological heterogeneity of cancer cells. Furthermore, normal, non-blood cells can be erroneously classified as cancer because their morphology differs from that of blood cells. Here, we constructed a dataset of microscopic images of organoid-derived cancer and normal cells with diverse morphology and developed a proof-of-concept deep learning model that can distinguish cancer cells from normal cells within an unlabeled microscopy image. In total, more than 75,000 organoid-derived cells from 3 cholangiocarcinoma patients were collected. The model achieved an area under the receiver operating characteristic curve (AUROC) of 0.78 and can generalize to cell images from an unseen patient. These resources serve as a foundation for an automated, robust platform for circulating tumor cell detection.


Subject(s)
Tumor Cell Line , Neoplasms , Humans , Area Under the Curve , Deep Learning , Microscopy , Tumor Cell Line/classification , Tumor Cell Line/pathology , Neoplasms/diagnostic imaging , Neoplasms/pathology
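The AUROC metric reported in this abstract can be computed from raw classifier scores via the rank-sum (Mann-Whitney U) identity: AUROC is the probability that a randomly chosen positive outscores a randomly chosen negative, counting ties as one half. The data below is a toy illustration, not from the paper.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC via the rank-sum identity: P(score of random positive >
    score of random negative), with ties counted as 1/2."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive score against every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy cancer-vs-normal scores from a hypothetical classifier.
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auroc(y, s))  # -> 8/9 ≈ 0.889
```

An AUROC of 0.78, as reported here, means a random cancer cell outscores a random normal cell about 78% of the time, independent of any decision threshold.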
3.
J Biomed Inform ; 138: 104281, 2023 02.
Article in English | MEDLINE | ID: mdl-36638935

ABSTRACT

Interpreting medical images such as chest X-ray images and retina images is an essential step in diagnosing and treating the relevant diseases. Automatic and reliable medical report generation systems can reduce this time-consuming workload, improve the efficiency of clinical workflows, and decrease variation between clinical professionals. Many recent approaches based on an image-encoder and language-decoder structure have been proposed to tackle this task. However, some technical challenges remain, including the efficacy of fusing language and visual cues and the difficulty of obtaining an effective pre-trained image feature extractor for medical-specific tasks. In this work, we propose the weighted query-key interacting attention module, which includes both second-order and first-order interactions. Compared with conventional scaled dot-product attention, this design yields a stronger fusion mechanism between language and visual signals. In addition, we propose a contrastive pre-training step to reduce the domain gap between the image encoder and the target dataset. To test the generalizability of our learning scheme, we collected and verified our model on the first multi-modality retina report generation dataset, referred to as Retina ImBank, and another large-scale Chinese retina report dataset, referred to as Retina Chinese. These two datasets will be made publicly available to serve as benchmarks and encourage further research in this field. Our experimental results demonstrate that the proposed method outperforms multiple state-of-the-art image captioning and medical report generation methods on the IU X-RAY, MIMIC-CXR, Retina ImBank, and Retina Chinese datasets.


Subject(s)
Benchmarking , Language , Learning , Medical Records , Records
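For context, the conventional scaled dot-product attention this abstract compares against is softmax(QK^T / sqrt(d)) V, where the QK^T product is a second-order (bilinear) interaction between queries and keys. The second function is a hypothetical sketch of what adding first-order query and key terms could look like; the paper's actual weighted query-key interacting attention may be formulated differently.

```python
import numpy as np

def softmax_rows(x):
    """Numerically stable row-wise softmax."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Conventional attention: softmax(QK^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # second-order (bilinear) interaction
    return softmax_rows(scores) @ V

def interacting_attention(Q, K, V, w):
    """Hypothetical sketch: the usual second-order QK^T term plus learned
    first-order terms computed from Q and K alone (weight vector w)."""
    d = Q.shape[-1]
    scores = (Q @ K.T) / np.sqrt(d) + (Q @ w)[:, None] + (K @ w)[None, :]
    return softmax_rows(scores) @ V
```

With w set to zero, the first-order terms vanish and the sketch reduces exactly to conventional scaled dot-product attention.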
4.
Artif Intell Med ; 135: 102462, 2023 01.
Article in English | MEDLINE | ID: mdl-36628784

ABSTRACT

Mitotic count (MC) is an important histological parameter for cancer diagnosis and grading, but the manual process for obtaining MC from whole-slide histopathological images is very time-consuming and prone to error. Therefore, deep learning models have been proposed to facilitate this process. Existing approaches use a two-stage pipeline: a detection stage that identifies the locations of potential mitotic cells, and a classification stage that refines prediction confidences. However, this formulation can lead to inconsistencies in the classification stage due to the poor prediction quality of the detection stage and mismatches in training data distributions between the two stages. In this study, we propose the Refine Cascade Network (ReCasNet), an enhanced deep learning pipeline that mitigates these problems with three improvements. First, window relocation reduces the number of poor-quality false positives generated during the detection stage. Second, object re-cropping with another deep learning model adjusts poorly centered objects. Third, improved data selection strategies during the classification stage reduce the mismatches in training data distributions. ReCasNet was evaluated on two large-scale mitotic figure recognition datasets, canine cutaneous mast cell tumor (CCMCT) and canine mammary carcinoma (CMC), yielding up to 4.8 percentage point improvements in F1 score for mitotic cell detection and a 44.1% reduction in mean absolute percentage error (MAPE) for MC prediction. The techniques underlying ReCasNet generalize to other two-stage object detection pipelines and should help improve the performance of deep learning models in broad digital pathology applications.


Subject(s)
Mitosis , Animals , Dogs
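The two evaluation metrics this abstract reports are standard: F1 score for the detection stage (harmonic mean of precision and recall over detected mitotic figures) and MAPE for the final per-slide mitotic count. The counts below are toy numbers for illustration, not results from the paper.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs(actual - predicted) / actual)

def f1(tp, fp, fn):
    """F1 score from detection counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy per-slide mitotic counts: ground truth vs. model prediction.
print(mape([10, 20, 40], [12, 18, 40]))  # ≈ 10.0
# Toy detection outcome: 80 true positives, 20 false positives, 20 misses.
print(f1(tp=80, fp=20, fn=20))           # ≈ 0.8
```

A "44.1% reduction in MAPE" as reported here is relative: e.g. a baseline MAPE of 10.0% dropping to about 5.6%.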