Results 1 - 20 of 216
1.
Sci Rep ; 14(1): 17900, 2024 08 02.
Article in English | MEDLINE | ID: mdl-39095389

ABSTRACT

Plant diseases pose significant threats to agriculture, impacting both food safety and public health. Traditional plant disease detection systems are typically limited to recognizing disease categories included in the training dataset, rendering them ineffective against new disease types. Although out-of-distribution (OOD) detection methods have been proposed to address this issue, the impact of fine-tuning paradigms on these methods has been overlooked. This paper studies the impact of fine-tuning paradigms on the performance of detecting unknown plant diseases. Currently, fine-tuning on visual tasks is mainly divided into visual-based models and visual-language-based models. We first discuss the limitations of large-scale visual language models in this task: textual prompts are difficult to design. To avoid the side effects of textual prompts, we further explore the effectiveness of purely visual pre-trained models for OOD detection in plant disease tasks. Specifically, we employed five publicly accessible datasets to establish benchmarks for open-set recognition, OOD detection, and few-shot learning in plant disease recognition. Additionally, we comprehensively compared various OOD detection methods, fine-tuning paradigms, and factors affecting OOD detection performance, such as sample quantity. The results show that visual prompt tuning outperforms full fine-tuning and linear probe tuning in out-of-distribution detection performance, especially in few-shot scenarios. Notably, the max-logit method based on visual prompt tuning achieves an AUROC score of 94.8% in the 8-shot setting, which is nearly comparable to full fine-tuning on the full dataset (95.2%), implying that an appropriate fine-tuning paradigm can directly improve OOD detection performance. Finally, we visualized the prediction distributions of different OOD detection methods and discussed the selection of thresholds. Overall, this work lays the foundation for unknown plant disease recognition, providing strong support for the security and reliability of plant disease recognition systems. We will release our code at https://github.com/JiuqingDong/PDOOD to further advance this field.
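
The max-logit scoring idea named in this abstract can be illustrated with a minimal sketch, assuming a classifier that emits one raw logit per known class. The function names, the example logits, and the threshold value are hypothetical; none of them come from the paper or its repository.

```python
from typing import Sequence


def max_logit_score(logits: Sequence[float]) -> float:
    """Return the largest raw logit; higher suggests an in-distribution sample."""
    return max(logits)


def is_out_of_distribution(logits: Sequence[float], threshold: float) -> bool:
    """Flag a sample as OOD when its max-logit score falls below the threshold."""
    return max_logit_score(logits) < threshold


# Hypothetical logits for a 3-class plant disease classifier.
known_sample = [8.2, 1.1, -0.4]    # one class clearly dominates
unknown_sample = [0.9, 1.2, 0.7]   # no class produces a strong response

THRESHOLD = 3.0  # assumed value; in practice it would be chosen on validation data
```

With these toy values, `is_out_of_distribution(unknown_sample, THRESHOLD)` flags the second sample while the first is accepted; how the threshold is selected is the question the abstract says the authors discuss.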


Subjects
Plant Diseases; Algorithms
2.
Sci Rep ; 14(1): 18319, 2024 08 07.
Article in English | MEDLINE | ID: mdl-39112791

ABSTRACT

Accurately assigning standardized diagnosis and procedure codes from clinical text is crucial for healthcare applications. However, this remains challenging due to the complexity of medical language. This paper proposes a novel model that incorporates extreme multi-label classification tasks to enhance International Classification of Diseases (ICD) coding. The model utilizes deformable convolutional neural networks to fuse representations from hidden layer outputs of pre-trained language models and external medical knowledge embeddings fused using a multimodal approach to provide rich semantic encodings for each code. A probabilistic label tree is constructed based on the hierarchical structure existing in ICD labels to incorporate ontological relationships between ICD codes and enable structured output prediction. Experiments on medical code prediction on the MIMIC-III database demonstrate competitive performance, highlighting the benefits of this technique for robust clinical code assignment.


Subjects
International Classification of Diseases; Neural Networks, Computer; Semantics; Humans; Natural Language Processing; Algorithms; Databases, Factual
3.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39177261

ABSTRACT

Large language models (LLMs) are sophisticated AI-driven models trained on vast sources of natural language data. They are adept at generating responses that closely mimic human conversational patterns. One of the most notable examples is OpenAI's ChatGPT, which has been extensively used across diverse sectors. Despite their flexibility, a significant challenge arises as most users must transmit their data to the servers of companies operating these models. Utilizing ChatGPT or similar models online may inadvertently expose sensitive information to the risk of data breaches. Therefore, implementing LLMs that are open source and smaller in scale within a secure local network becomes a crucial step for organizations where ensuring data privacy and protection has the highest priority, such as regulatory agencies. As a feasibility evaluation, we implemented a series of open-source LLMs within a regulatory agency's local network and assessed their performance on specific tasks involving extracting relevant clinical pharmacology information from regulatory drug labels. Our research shows that some models work well in the context of few- or zero-shot learning, achieving performance comparable to, or even better than, that of neural network models that needed thousands of training samples. One of the models was selected to address a real-world issue of finding intrinsic factors that affect drugs' clinical exposure without any training or fine-tuning. In a dataset of over 700 000 sentences, the model showed a 78.5% accuracy rate. Our work pointed to the possibility of implementing open-source LLMs within a secure local network and using these models to perform various natural language processing tasks when large numbers of training examples are unavailable.


Subjects
Natural Language Processing; Humans; Neural Networks, Computer; Machine Learning
4.
Artif Intell Med ; 156: 102949, 2024 Aug 16.
Article in English | MEDLINE | ID: mdl-39178621

ABSTRACT

The lack of annotated medical images limits the performance of deep learning models, which usually need large-scale labelled datasets. Few-shot learning techniques can reduce data scarcity issues and enhance medical image analysis speed and robustness. This systematic review gives a comprehensive overview of few-shot learning methods for medical image analysis, aiming to establish a standard methodological pipeline for future research reference. With a particular emphasis on the role of meta-learning, we analysed 80 relevant articles published from 2018 to 2023, conducting a risk of bias assessment and extracting relevant information, especially regarding the employed learning techniques. From this, we delineated a comprehensive methodological pipeline shared among all studies. In addition, we performed a statistical analysis of the studies' results concerning the clinical task and the meta-learning method employed while also presenting supplemental information such as imaging modalities and model robustness evaluation techniques. We discussed the findings of our analysis, providing a deep insight into the limitations of the state-of-the-art methods and the most promising approaches. Drawing on our investigation, we yielded recommendations on potential future research directions aiming to bridge the gap between research and clinical practice.

5.
ACS Nano ; 18(34): 23489-23496, 2024 Aug 27.
Article in English | MEDLINE | ID: mdl-39137093

ABSTRACT

Ternary content-addressable memory (TCAM) is promising for data-intensive artificial intelligence applications due to its large-scale parallel in-memory computing capabilities. However, it is still challenging to build a reliable TCAM cell from a single circuit component. Here, we demonstrate a single transistor TCAM based on a floating-gate two-dimensional (2D) ambipolar MoTe2 field-effect transistor with graphene contacts. Our bottom graphene contacts scheme enables gate modulation of the contact Schottky barrier heights, facilitating carrier injection for both electrons and holes. The 2D nature of our channel and contact materials provides device scaling potentials beyond silicon. By integration with a floating-gate stack, a highly reliable nonvolatile memory is achieved. Our TCAM cell exhibits a resistance ratio larger than 1000 and symmetrical complementary states, allowing the implementation of large-scale TCAM arrays. Finally, we show through circuit simulations that in-memory Hamming distance computation is readily achievable based on our TCAM with array sizes up to 128 cells.
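
As a purely logical illustration of what a TCAM array computes, the sketch below models stored ternary words in software and counts mismatches against a query, which is the Hamming distance the abstract refers to. It says nothing about the device physics or the authors' circuit simulations; the word format (strings of `0`, `1`, and `X` for "don't care") and all function names are assumptions made for this example.

```python
from typing import List

DONT_CARE = "X"


def ternary_hamming(stored: str, query: str) -> int:
    """Count mismatching positions; a stored 'X' matches both 0 and 1."""
    if len(stored) != len(query):
        raise ValueError("stored word and query must have the same length")
    return sum(1 for s, q in zip(stored, query) if s != DONT_CARE and s != q)


def tcam_search(table: List[str], query: str) -> List[int]:
    """Return the indices of all stored words that match the query exactly."""
    return [i for i, word in enumerate(table) if ternary_hamming(word, query) == 0]


table = ["10X1", "0011", "1XX0"]  # toy 4-bit ternary words
query = "1011"
distances = [ternary_hamming(word, query) for word in table]  # [0, 1, 1]
matches = tcam_search(table, query)                            # [0]
```

In hardware every stored word is compared in parallel; the list comprehension here only mimics the result of that comparison, not its parallelism.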

6.
Quant Imaging Med Surg ; 14(8): 5443-5459, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39144045

ABSTRACT

Background: The automated classification of histological images is crucial for the diagnosis of cancer. The limited availability of well-annotated datasets, especially for rare cancers, poses a significant challenge for deep learning methods due to the small number of relevant images. This has led to the development of few-shot learning approaches, which bear considerable clinical importance, as they are designed to overcome the challenges of data scarcity in deep learning for histological image classification. Traditional methods often ignore the challenges of intraclass diversity and interclass similarities in histological images. To address this, we propose a novel mutual reconstruction network model, aimed at meeting these challenges and improving the few-shot classification performance of histological images. Methods: The key to our approach is the extraction of subtle and discriminative features. We introduce a feature enhancement module (FEM) and a mutual reconstruction module to increase differences between classes while reducing variance within classes. First, we extract features of support and query images using a feature extractor. These features are then processed by the FEM, which uses a self-attention mechanism for self-reconstruction of features, enhancing the learning of detailed features. These enhanced features are then input into the mutual reconstruction module. This module uses enhanced support features to reconstruct enhanced query features and vice versa. The classification of query samples is based on weighted calculations of the distances between query features and reconstructed query features and between support features and reconstructed support features. Results: We extensively evaluated our model using a specially created few-shot histological image dataset. The results showed that in a 5-way 10-shot setup, our model achieved an impressive accuracy of 92.09%. 
This is a 23.59% improvement in accuracy compared to the model-agnostic meta-learning (MAML) method, which does not focus on fine-grained attributes. In the more challenging 5-way 1-shot setting, our model also performed well, demonstrating an 18.52% improvement over ProtoNet, which does not address this challenge. Additional ablation studies indicated the effectiveness and complementary nature of each module and confirmed our method's ability to parse small differences between classes and large variations within classes in histological images. These findings strongly support the superiority of our proposed method in the few-shot classification of histological images. Conclusions: The mutual reconstruction network provides outstanding performance in the few-shot classification of histological images, successfully overcoming the challenges of similarities between classes and diversity within classes. This marks a significant advancement in the automated classification of histological images.

7.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39133096

ABSTRACT

Molecular property prediction (MPP) plays a crucial role in the drug discovery process, providing valuable insights for molecule evaluation and screening. Although deep learning has achieved numerous advances in this area, its success often depends on the availability of substantial labeled data. Few-shot MPP is a more challenging scenario, which aims to predict unseen properties with only a few available molecules. In this paper, we propose an attribute-guided prototype network (APN) to address the challenge. APN first introduces a molecular attribute extractor, which can not only extract three different types of fingerprint attributes (single fingerprint attributes, dual fingerprint attributes, triplet fingerprint attributes) by considering seven circular-based, five path-based, and two substructure-based fingerprints, but also automatically extract deep attributes from self-supervised learning methods. Furthermore, APN designs the Attribute-Guided Dual-channel Attention module to learn the relationship between the molecular graphs and attributes and refine the local and global representation of the molecules. Compared with existing works, APN leverages high-level human-defined attributes and helps the model to explicitly generalize knowledge in molecular graphs. Experiments on benchmark datasets show that APN can achieve state-of-the-art performance in most cases and demonstrate that the attributes are effective for improving few-shot MPP performance. In addition, the strong generalization ability of APN is verified by conducting experiments on data from different domains.


Subjects
Deep Learning; Drug Discovery; Drug Discovery/methods; Humans; Algorithms; Neural Networks, Computer
8.
Med Image Anal ; 98: 103321, 2024 Aug 23.
Article in English | MEDLINE | ID: mdl-39197302

ABSTRACT

Accurate segmentation of the left atrium (LA) from late gadolinium-enhanced cardiac magnetic resonance (LGE CMR) images is crucial for aiding the treatment of patients with atrial fibrillation. Few-shot learning holds significant potential for achieving accurate LA segmentation with low demand on high-cost labeled LGE CMR data and fast generalization across different centers. However, accurate LA segmentation with few-shot learning is a challenging task due to the low-intensity contrast between the LA and other neighboring organs in LGE CMR images. To address this issue, we propose an Adaptive Dynamic Inference Network (ADINet) that explicitly models the differences between the foreground and background. Specifically, ADINet leverages dynamic collaborative inference (DCI) and dynamic reverse inference (DRI) to adaptively allocate semantic-aware and spatial-specific convolution weights and indication information. These allocations are conditioned on the support foreground and background knowledge, utilizing pixel-wise correlations, for different spatial positions of query images. The convolution weights adapt to different visual patterns based on spatial positions, enabling effective encoding of differences between foreground and background regions. Meanwhile, the indication information adapts to the background visual pattern to reversely decode foreground LA regions, leveraging their spatial complementarity. To promote the learning of ADINet, we propose hierarchical supervision, which enforces spatial consistency and differences between the background and foreground regions through pixel-wise semantic supervision and pixel-pixel correlation supervision. We demonstrated the performance of ADINet on three LGE CMR datasets from different centers. Compared to state-of-the-art methods with ten available samples, ADINet yielded better segmentation performance in terms of four metrics.

10.
Neural Netw ; 179: 106536, 2024 Jul 14.
Article in English | MEDLINE | ID: mdl-39089156

ABSTRACT

Cross-domain few-shot learning (CDFSL) is proposed to first pre-train deep models on a source-domain dataset where sufficient data is available, and then generalize the models to target domains to learn from only limited data. However, the gap between the source and target domains greatly hampers generalization and target-domain few-shot fine-tuning. To address this problem, we analyze the domain gap from the perspective of frequency-domain analysis. We find that the domain gap could be reflected by the compositions of source-domain spectra, and that the lack of compositions in the source datasets limits generalization. Therefore, we aim to expand the coverage of spectra composition in the source datasets to help the source domain cover a larger range of possible target-domain information and thereby mitigate the domain gap. To achieve this goal, we propose the Spectral Decomposition and Transformation (SDT) method, which first randomly decomposes the spectrogram of the source datasets into orthogonal bases, and then randomly samples different coordinates in the space formed by these bases. We integrate the above process into a data augmentation module, and further design a two-stream network to handle augmented images and original images respectively. Experimental results show that our method achieves state-of-the-art performance on the CDFSL benchmark dataset.

11.
Front Hum Neurosci ; 18: 1421922, 2024.
Article in English | MEDLINE | ID: mdl-39050382

ABSTRACT

This paper presents a systematic literature review, providing a comprehensive taxonomy of Data Augmentation (DA), Transfer Learning (TL), and Self-Supervised Learning (SSL) techniques within the context of Few-Shot Learning (FSL) for EEG signal classification. EEG signals have shown significant potential in various paradigms, including Motor Imagery, Emotion Recognition, Visual Evoked Potentials, Steady-State Visually Evoked Potentials, Rapid Serial Visual Presentation, Event-Related Potentials, and Mental Workload. However, challenges such as limited labeled data, noise, and inter/intra-subject variability have impeded the effectiveness of traditional machine learning (ML) and deep learning (DL) models. This review methodically explores how FSL approaches, incorporating DA, TL, and SSL, can address these challenges and enhance classification performance in specific EEG paradigms. It also delves into the open research challenges related to these techniques in EEG signal classification. Specifically, the review examines the identification of DA strategies tailored to various EEG paradigms, the creation of TL architectures for efficient knowledge transfer, and the formulation of SSL methods for unsupervised representation learning from EEG data. Addressing these challenges is crucial for enhancing the efficacy and robustness of FSL-based EEG signal classification. By presenting a structured taxonomy of FSL techniques and discussing the associated research challenges, this systematic review offers valuable insights for future investigations in EEG signal classification. The findings aim to guide and inspire researchers, promoting advancements in applying FSL methodologies for improved EEG signal analysis and classification in real-world settings.

13.
J Imaging ; 10(7)2024 Jul 13.
Article in English | MEDLINE | ID: mdl-39057738

ABSTRACT

The limited availability of specialized image databases (particularly in hospitals, where tools vary between providers) makes it difficult to train deep learning models. This paper presents a few-shot learning methodology that uses a pre-trained ResNet integrated with an encoder as a backbone to encode conditional shape information for the classification of neonatal resuscitation equipment from less than 100 natural images. The model is also strengthened by incorporating a reliability score, which enriches the prediction with an estimation of classification reliability. The model, whose performance is cross-validated, reached a median accuracy performance of over 99% (and a lower limit of 73.4% for the least accurate model/fold) using only 87 meta-training images. During the test phase on complex natural images, performance was slightly degraded due to a sub-optimal segmentation strategy (FastSAM) required to maintain the real-time inference phase (median accuracy 87.25%). This methodology proves to be excellent for applying complex classification models to contexts (such as neonatal resuscitation) that are not available in public databases. Improvements to the automatic segmentation strategy prior to the extraction of conditional information will allow a natural application in simulation and hospital settings.

14.
Bioengineering (Basel) ; 11(7)2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39061767

ABSTRACT

Bioacoustic event detection is a demanding endeavor involving recognizing and classifying the sounds animals make in their natural habitats. Traditional supervised learning requires a large amount of labeled data, which are hard to come by in bioacoustics. This paper presents a few-shot learning (FSL) method incorporating transductive inference and data augmentation to address the issues of too few labeled events and small volumes of recordings. Here, transductive inference iteratively alters class prototypes and feature extractors to seize essential patterns, whereas data augmentation applies SpecAugment on Mel spectrogram features to augment training data. The proposed approach is evaluated by using the Detecting and Classifying Acoustic Scenes and Events (DCASE) 2022 and 2021 datasets. Extensive experimental results demonstrate that all components of the proposed method achieve significant F-score improvements of 27% and 10%, for the DCASE-2022 and DCASE-2021 datasets, respectively, compared to recent advanced approaches. Moreover, our method is helpful in FSL tasks because it effectively adapts to sounds from various animal species, recordings, and durations.
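
The SpecAugment-style masking mentioned in this abstract can be sketched as follows. This is a simplified, masking-only version (time warping is omitted) written with the standard library, and it assumes the Mel spectrogram is a list of rows, one row per Mel bin. The function name, parameter names, and mask sizes are hypothetical and are not the authors' settings.

```python
import random
from typing import List

Spectrogram = List[List[float]]  # rows = Mel bins, columns = time frames


def spec_augment(spec: Spectrogram, max_freq_mask: int, max_time_mask: int,
                 rng: random.Random) -> Spectrogram:
    """Return a copy of spec with one frequency band and one time band zeroed."""
    n_mels, n_frames = len(spec), len(spec[0])
    out = [row[:] for row in spec]

    # Frequency masking: zero f consecutive Mel bins starting at f0.
    f = rng.randint(0, max_freq_mask)
    f0 = rng.randint(0, n_mels - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_frames

    # Time masking: zero t consecutive frames starting at t0.
    t = rng.randint(0, max_time_mask)
    t0 = rng.randint(0, n_frames - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = 0.0
    return out


rng = random.Random(0)
mel = [[1.0] * 10 for _ in range(8)]  # toy 8-bin x 10-frame spectrogram
augmented = spec_augment(mel, max_freq_mask=3, max_time_mask=4, rng=rng)
```

Each call produces a differently masked copy, so a small set of labeled recordings can yield many training variants while the input spectrogram itself is left unchanged.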

15.
Interdiscip Sci ; 16(2): 469-488, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38951382

ABSTRACT

Image classification, a fundamental task in computer vision, faces challenges concerning limited data handling, interpretability, improved feature representation, efficiency across diverse image types, and processing noisy data. Conventional architectural approaches have made insufficient progress in addressing these challenges, necessitating architectures capable of fine-grained classification, enhanced accuracy, and superior generalization. Among these, the vision transformer emerges as a noteworthy computer vision architecture. However, its reliance on substantial data for training poses a drawback due to its complexity and high data requirements. To surmount these challenges, this paper proposes an innovative approach, MetaV, integrating meta-learning into a vision transformer for medical image classification. N-way K-shot learning is employed to train the model, drawing inspiration from human learning mechanisms utilizing past knowledge. Additionally, deformational convolution and patch merging techniques are incorporated into the vision transformer model to mitigate complexity and overfitting while enhancing feature representation. Augmentation methods such as perturbation and Grid Mask are introduced to address the scarcity and noise in medical images, particularly for rare diseases. The proposed model is evaluated using diverse datasets including BreakHis, ISIC 2019, SIPaKMeD, and STARE. The achieved accuracies of 89.89%, 87.33%, 94.55%, and 80.22% for BreakHis, ISIC 2019, SIPaKMeD, and STARE, respectively, present evidence validating the superior performance of the proposed model in comparison to conventional models, setting a new benchmark for meta-vision image classification models.
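
N-way K-shot training, as referred to in this abstract, proceeds in episodes: each episode draws N classes, K labeled support examples per class, and a separate set of query examples to evaluate on. A minimal sketch of episode sampling is given below, assuming the dataset is a mapping from class label to image identifiers. The dataset contents and all names are invented for illustration and do not reflect the MetaV implementation.

```python
import random
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (image_id, class_label)


def sample_episode(data: Dict[str, List[str]], n_way: int, k_shot: int,
                   n_query: int, rng: random.Random
                   ) -> Tuple[List[Example], List[Example]]:
    """Sample one N-way K-shot episode: a support set and a disjoint query set."""
    classes = rng.sample(sorted(data), n_way)
    support: List[Example] = []
    query: List[Example] = []
    for label in classes:
        # Draw without replacement so support and query never overlap.
        picked = rng.sample(data[label], k_shot + n_query)
        support += [(item, label) for item in picked[:k_shot]]
        query += [(item, label) for item in picked[k_shot:]]
    return support, query


# Hypothetical dataset: 5 classes with 10 image ids each.
dataset = {f"class_{c}": [f"img_{c}_{i}" for i in range(10)] for c in range(5)}
support_set, query_set = sample_episode(dataset, n_way=3, k_shot=2, n_query=4,
                                        rng=random.Random(42))
```

Here a 3-way 2-shot episode yields 6 support examples and 12 query examples; a meta-learner would be trained over many such episodes.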


Subjects
Image Processing, Computer-Assisted; Humans; Image Processing, Computer-Assisted/methods; Algorithms; Machine Learning; Diagnostic Imaging; Deep Learning
16.
Neuropathol Appl Neurobiol ; 50(4): e12997, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39010256

ABSTRACT

AIMS: Recent advances in artificial intelligence, particularly with large language models like GPT-4Vision (GPT-4V), a derivative feature of ChatGPT, have expanded the potential for medical image interpretation. This study evaluates the accuracy of GPT-4V in image classification tasks of histopathological images and compares its performance with a traditional convolutional neural network (CNN). METHODS: We utilised 1520 images, including haematoxylin and eosin staining and tau immunohistochemistry, from patients with various neurodegenerative diseases, such as Alzheimer's disease (AD), progressive supranuclear palsy (PSP) and corticobasal degeneration (CBD). We assessed GPT-4V's performance using multi-step prompts to determine how textual context influences image interpretation. We also employed few-shot learning to improve GPT-4V's diagnostic performance in classifying three specific tau lesions (astrocytic plaques, neuritic plaques and tufted astrocytes) and compared the outcomes with the CNN model YOLOv8. RESULTS: GPT-4V accurately recognised staining techniques and tissue origin but struggled with specific lesion identification. The interpretation of images was notably influenced by the provided textual context, which sometimes led to diagnostic inaccuracies. For instance, when presented with images of the motor cortex, the diagnosis shifted inappropriately from AD to CBD or PSP. However, few-shot learning markedly improved GPT-4V's diagnostic capabilities, enhancing accuracy from 40% in zero-shot learning to 90% with 20-shot learning, matching the performance of YOLOv8, which required 100-shot learning to achieve the same accuracy. CONCLUSIONS: Although GPT-4V faces challenges in independently interpreting histopathological images, few-shot learning significantly improves its performance. This approach is especially promising for neuropathology, where acquiring extensive labelled datasets is often challenging.


Subjects
Neural Networks, Computer; Neurodegenerative Diseases; Humans; Neurodegenerative Diseases/pathology; Image Interpretation, Computer-Assisted/methods; Alzheimer Disease/pathology
17.
Sensors (Basel) ; 24(13)2024 Jul 08.
Article in English | MEDLINE | ID: mdl-39001199

ABSTRACT

Automatic Modulation Recognition (AMR) is a key technology in the field of cognitive communication, playing a core role in many applications, especially in wireless security issues. Currently, deep learning (DL)-based AMR technology has achieved many research results, greatly promoting the development of AMR technology. However, the few-shot dilemma faced by DL-based AMR methods greatly limits their application in practical scenarios. Therefore, this paper endeavored to address the challenge of AMR with limited data and proposed a novel meta-learning method, the Multi-Level Comparison Relation Network with Class Reconstruction (MCRN-CR). Firstly, the method designs a structure of a multi-level comparison relation network, which involves embedding functions to output their feature maps hierarchically, comprehensively calculating the relation scores between query samples and support samples to determine the modulation category. Secondly, the embedding function integrates a reconstruction module, leveraging an autoencoder for support sample reconstruction, wherein the encoder serves dual purposes as the embedding mechanism. The training regimen incorporates a meta-learning paradigm, harmoniously combining classification and reconstruction losses to refine the model's performance. The experimental results on the RadioML2018 dataset show that our designed method can greatly alleviate the small sample problem in AMR and is superior to existing methods.

18.
PeerJ Comput Sci ; 10: e2080, 2024.
Article in English | MEDLINE | ID: mdl-38983194

ABSTRACT

Poultry farming is an indispensable part of global agriculture, playing a crucial role in food safety and economic development. Managing and preventing diseases is a vital task in the poultry industry, where semantic segmentation technology can significantly enhance the efficiency of traditional manual monitoring methods. Furthermore, traditional semantic segmentation has achieved excellent results on extensively manually annotated datasets, facilitating real-time monitoring of poultry. Nonetheless, the model encounters limitations when exposed to new environments, diverse breeding varieties, or varying growth stages within the same species, necessitating extensive data retraining. Overreliance on large datasets results in higher costs for manual annotations and deployment delays, thus hindering practical applicability. To address this issue, our study introduces HSDNet, an innovative semantic segmentation model based on few-shot learning, for monitoring poultry farms. The HSDNet model adeptly adjusts to new settings or species with a single image input while maintaining substantial accuracy. In the specific context of poultry breeding, characterized by small congregating animals and the inherent complexities of agricultural environments, issues of non-smooth losses arise, potentially compromising accuracy. HSDNet incorporates a Sharpness-Aware Minimization (SAM) strategy to counteract these challenges. Furthermore, by considering the effects of imbalanced loss on convergence, HSDNet mitigates the overfitting issue induced by few-shot learning. Empirical findings underscore HSDNet's proficiency in poultry breeding settings, exhibiting a significant 72.89% semantic segmentation accuracy on single images, which is higher than SOTA's 68.85%.
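
The Sharpness-Aware Minimization (SAM) strategy named in this abstract takes two gradient evaluations per update: one to find a nearby worst-case point, and one at that point to compute the actual descent direction. The sketch below shows a generic SAM step on a toy quadratic loss using plain Python lists. It is not HSDNet's training code; the loss, the learning rate `lr`, and the neighborhood radius `rho` are assumptions chosen only to make the example runnable.

```python
import math
from typing import Callable, List

Vector = List[float]


def sam_step(w: Vector, grad_fn: Callable[[Vector], Vector],
             lr: float, rho: float) -> Vector:
    """One Sharpness-Aware Minimization update on parameters w."""
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return w[:]  # already at a stationary point
    # Step 1: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Step 2: update the original weights with the gradient evaluated
    # at the perturbed point.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def quadratic_grad(w: Vector) -> Vector:
    """Gradient of the toy loss L(w) = sum(w_i ** 2)."""
    return [2.0 * wi for wi in w]


weights = [3.0, -4.0]
for _ in range(50):
    weights = sam_step(weights, quadratic_grad, lr=0.1, rho=0.05)
```

On this convex toy loss the iterates settle very close to the minimum at the origin; the benefit SAM is designed for, a preference for flat minima, only shows up on non-convex losses such as those of deep networks.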

19.
JMIR Med Inform ; 12: e56243, 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39037700

ABSTRACT

BACKGROUND: Understanding the multifaceted nature of health outcomes requires a comprehensive examination of the social, economic, and environmental determinants that shape individual well-being. Among these determinants, behavioral factors play a crucial role, particularly the consumption patterns of psychoactive substances, which have important implications for public health. The Global Burden of Disease Study shows a growing impact in disability-adjusted life years due to substance use. The successful identification of patients' substance use information equips clinical care teams to address substance-related issues more effectively, enabling targeted support and ultimately improving patient outcomes. OBJECTIVE: Traditional natural language processing methods face limitations in accurately parsing diverse clinical language associated with substance use. Large language models offer promise in overcoming these challenges by adapting to diverse language patterns. This study investigates the application of the generative pretrained transformer (GPT) model, specifically GPT-3.5, for extracting tobacco, alcohol, and substance use information from patient discharge summaries in zero-shot and few-shot learning settings. This study contributes to the evolving landscape of health care informatics by showcasing the potential of advanced language models in extracting nuanced information critical for enhancing patient care. METHODS: The main data source for analysis in this paper is the Medical Information Mart for Intensive Care III data set. Among all notes in this data set, we focused on discharge summaries. Prompt engineering was undertaken, involving an iterative exploration of diverse prompts. Leveraging carefully curated examples and refined prompts, we investigate the model's proficiency through zero-shot as well as few-shot prompting strategies.
RESULTS: Results show GPT's varying effectiveness in identifying mentions of tobacco, alcohol, and substance use across learning scenarios. Zero-shot learning showed high accuracy in identifying substance use, whereas few-shot learning reduced accuracy but improved in identifying substance use status, enhancing recall and F1-score at the expense of lower precision. CONCLUSIONS: Excellence of zero-shot learning in precisely extracting text span mentioning substance use demonstrates its effectiveness in situations in which comprehensive recall is important. Conversely, few-shot learning offers advantages when accurately determining the status of substance use is the primary focus, even if it involves a trade-off in precision. The results contribute to enhancement of early detection and intervention strategies, tailor treatment plans with greater precision, and ultimately, contribute to a holistic understanding of patient health profiles. By integrating these artificial intelligence-driven methods into electronic health record systems, clinicians can gain immediate, comprehensive insights into substance use that results in shaping interventions that are not only timely but also more personalized and effective.
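
The difference between the zero-shot and few-shot prompting strategies compared in this abstract comes down to whether worked examples are placed in the prompt before the note to be analysed. The sketch below only assembles the prompt text and makes no model call. The instruction wording, the `Note:`/`Answer:` layout, and the example notes are invented for illustration; they are not the study's prompts and are not drawn from MIMIC-III.

```python
from typing import List, Tuple


def build_prompt(instruction: str, examples: List[Tuple[str, str]], note: str) -> str:
    """Assemble a prompt; an empty examples list yields a zero-shot prompt."""
    parts = [instruction]
    for text, answer in examples:
        parts.append(f"Note: {text}\nAnswer: {answer}")
    parts.append(f"Note: {note}\nAnswer:")
    return "\n\n".join(parts)


INSTRUCTION = "Extract tobacco, alcohol, and substance use mentions from the note."
EXAMPLES = [  # invented examples for illustration only
    ("Patient smokes one pack per day.", "tobacco: current"),
    ("Denies alcohol use.", "alcohol: none"),
]

zero_shot = build_prompt(INSTRUCTION, [], "Quit smoking 5 years ago.")
few_shot = build_prompt(INSTRUCTION, EXAMPLES, "Quit smoking 5 years ago.")
```

Both prompts end with an open `Answer:` slot for the model to complete; the few-shot variant additionally shows the expected answer format, which is one plausible reason status labelling would behave differently between the two settings.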

20.
Biomed Eng Lett ; 14(4): 877-889, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38946819

ABSTRACT

Due to the difficulty in obtaining clinical samples and the high cost of labeling, rare skin diseases are characterized by data scarcity, making training deep neural networks for classification challenging. In recent years, few-shot learning has emerged as a promising solution, enabling models to recognize unseen disease classes from limited labeled samples. However, most existing methods ignore the fine-grained nature of rare skin diseases, resulting in poor performance when generalizing to highly similar classes. Moreover, the distributions learned from limited labeled data are biased, severely impairing the model's generalizability. This paper proposes a self-supervision distribution calibration network (SS-DCN) to address the above issues. Specifically, SS-DCN adopts a multi-task learning framework during pre-training. By introducing self-supervised tasks to aid in supervised learning, the model can learn more discriminative and transferable visual representations. Furthermore, SS-DCN applies an enhanced distribution calibration (EDC) strategy, which utilizes the statistics of base classes with sufficient samples to calibrate the biased distribution of novel classes with few-shot samples. By generating more samples from the calibrated distribution, EDC can provide sufficient supervision for subsequent classifier training. The proposed method is evaluated on three public skin disease datasets (i.e., ISIC2018, Derm7pt, and SD198), achieving significant performance improvements over state-of-the-art methods.
