Results 1 - 20 of 2,439
1.
Hu Li Za Zhi ; 71(5): 7-13, 2024 Oct.
Article in Chinese | MEDLINE | ID: mdl-39350704

ABSTRACT

Artificial intelligence (AI) is driving global change, and the implementation of generative AI in higher education is inevitable. AI language models such as the chat generative pre-trained transformer (ChatGPT) hold the potential to revolutionize the delivery of nursing education in the future. Nurse educators play a crucial role in preparing nursing students for a future technology-integrated healthcare system. While the technology has limitations and potential biases, the emergence of ChatGPT presents both opportunities and challenges. It is critical for faculty to be familiar with the capabilities and limitations of this model to foster effective, ethical, and responsible utilization of AI technology while preparing students in advance for the dynamic and rapidly advancing landscape of nursing and healthcare. Therefore, this article was written to present a strengths, weaknesses, opportunities, and threats (SWOT) analysis of integrating ChatGPT into nursing education, providing a guide for implementing ChatGPT in nursing education and offering a well-rounded assessment to help nurse educators make informed decisions.


Subject(s)
Artificial Intelligence, Nursing Education, Humans
2.
Microsc Res Tech ; 2024 Oct 01.
Article in English | MEDLINE | ID: mdl-39351968

ABSTRACT

Lymph-node status is important in decision-making during early gastric cancer (EGC) treatment. Currently, endoscopic submucosal dissection is the mainstream treatment for EGC. However, it is challenging for even experienced endoscopists to accurately diagnose and treat EGC. Multiphoton microscopy can extract the morphological features of collagen fibers from tissues, and these features can be used to assess lymph-node metastasis status in patients with EGC. First, we compared the accuracy of four deep learning models (VGG16, ResNet34, MobileNetV2, and PVTv2) on preprocessed training images and a test dataset. Next, we integrated the features of the best-performing model, PVTv2, with manual and clinical features to develop a novel model called AutoLNMNet. The prediction accuracy of AutoLNMNet for the no-metastasis (Ly0) and lymph-node metastasis (Ly1) stages reached 0.92, which was 0.3% higher than that of PVTv2. The areas under the receiver operating characteristic curve (AUROC) of AutoLNMNet for the Ly0 and Ly1 stages were both 0.97. Therefore, AutoLNMNet is highly reliable and accurate in detecting lymph-node metastasis, providing an important tool for the early diagnosis and treatment of EGC.
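
As an illustration, a minimal PyTorch sketch of the fusion idea follows. The paper's exact AutoLNMNet architecture is not given here, so the timm backbone name and the 12-dimensional clinical vector below are assumptions.

    import torch
    import torch.nn as nn
    import timm

    class FusionClassifier(nn.Module):
        """Concatenates deep image features with manual/clinical features."""
        def __init__(self, clinical_dim: int = 12, n_classes: int = 2):
            super().__init__()
            # PVTv2 feature extractor; num_classes=0 yields pooled features
            # (set pretrained=True to start from ImageNet weights)
            self.backbone = timm.create_model("pvt_v2_b2", pretrained=False,
                                              num_classes=0)
            self.head = nn.Sequential(
                nn.Linear(self.backbone.num_features + clinical_dim, 256),
                nn.ReLU(),
                nn.Linear(256, n_classes),  # Ly0 vs. Ly1 logits
            )

        def forward(self, images, clinical):
            fused = torch.cat([self.backbone(images), clinical], dim=1)
            return self.head(fused)

    logits = FusionClassifier()(torch.randn(2, 3, 224, 224), torch.randn(2, 12))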

3.
Front Plant Sci ; 15: 1409544, 2024.
Article in English | MEDLINE | ID: mdl-39354942

ABSTRACT

In the current agricultural landscape, a significant portion of tomato plants suffer from leaf diseases, posing a major challenge to manual detection due to the task's extensive scope. Existing detection algorithms struggle to balance speed with accuracy, especially when identifying small-scale leaf diseases across diverse settings. Addressing this need, this study presents FCHF-DETR (Faster-Cascaded-attention-High-feature-fusion-Focaler Detection-Transformer), an innovative, high-precision, and lightweight detection algorithm based on RT-DETR-R18 (Real-Time-Detection-Transformer-ResNet18). The algorithm was developed using a carefully curated dataset of 3147 RGB images, showcasing tomato leaf diseases across a range of scenes and resolutions. FasterNet replaces ResNet18 in the algorithm's backbone network to reduce the model's size and improve memory efficiency. Additionally, replacing the conventional AIFI (Attention-based Intra-scale Feature Interaction) module with Cascaded Group Attention and the original CCFM (CNN-based Cross-scale Feature-fusion Module) with HSFPN (High-Level Screening-feature Fusion Pyramid Networks) in the Efficient Hybrid Encoder significantly enhanced detection accuracy without greatly affecting efficiency. To tackle the challenge of identifying difficult samples, the Focaler-CIoU loss function was incorporated, refining the model's performance throughout the dataset. Empirical results show that FCHF-DETR achieved 96.4% precision, 96.7% recall, 89.1% mAP50-95 (mean average precision), and 97.2% mAP50 on the test set, with a reduction of 9.2G in FLOPs (floating-point operations) and 3.6M in parameters. These findings demonstrate that the proposed method improves detection accuracy and reduces computational complexity, addressing the dual challenges of precision and efficiency in tomato leaf disease detection.
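
As an illustration of the Focaler-IoU idea referenced above, the sketch below linearly remaps IoU over an interval [d, u] so the regression loss concentrates on a chosen difficulty band; the interval endpoints are placeholders, and the CIoU combination follows the published Focaler-IoU formulation as commonly cited.

    import torch

    def focaler_iou(iou: torch.Tensor, d: float = 0.0, u: float = 0.95):
        # IoU below d maps to 0, above u maps to 1, linear in between
        return ((iou - d) / (u - d)).clamp(0.0, 1.0)

    def focaler_ciou_loss(ciou_loss: torch.Tensor, iou: torch.Tensor):
        # Reported combination: L_Focaler-CIoU = L_CIoU + IoU - IoU_focaler
        return ciou_loss + iou - focaler_iou(iou)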

4.
Front Plant Sci ; 15: 1389961, 2024.
Article in English | MEDLINE | ID: mdl-39354950

ABSTRACT

Introduction: In the field of agriculture, automated harvesting of Camellia oleifera fruit has become an important research area. However, accurately detecting Camellia oleifera fruit in natural environments is challenging: factors such as shadows can impede the performance of traditional detection techniques, highlighting the need for more robust methods. Methods: To overcome these challenges, we propose an efficient deep learning method called YOLO-CFruit, specifically designed to accurately detect Camellia oleifera fruits in challenging natural environments. First, we collected images of Camellia oleifera fruits and created a dataset, then used data augmentation to further enhance its diversity. Our YOLO-CFruit model combines a CBAM module for identifying regions of interest in landscapes with Camellia oleifera fruit and a CSP module with Transformer for capturing global information. In addition, we improve YOLO-CFruit by replacing the CIoU loss with the EIoU loss in the original YOLOv5. Results: Testing the trained network shows that the method performs well, achieving an average precision of 98.2%, a recall of 94.5%, an accuracy of 98%, an F1 score of 96.2%, and an average inference time of 19.02 ms per frame. The experimental results show that our method improves the average precision by 1.2% over the conventional YOLOv5s network and achieves the highest accuracy and a higher F1 score than all compared state-of-the-art networks. Discussion: The robust performance of YOLO-CFruit under different real-world conditions, including various light and shading scenarios, signifies its high reliability and lays a solid foundation for the development of automated picking devices.
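
The CBAM module mentioned above is a standard attention block; a compact PyTorch version, channel attention followed by spatial attention, looks roughly like this:

    import torch
    import torch.nn as nn

    class CBAM(nn.Module):
        """Convolutional Block Attention Module: channel then spatial attention."""
        def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
            super().__init__()
            self.mlp = nn.Sequential(  # shared MLP for channel attention
                nn.Conv2d(channels, channels // reduction, 1, bias=False),
                nn.ReLU(),
                nn.Conv2d(channels // reduction, channels, 1, bias=False),
            )
            self.spatial = nn.Conv2d(2, 1, kernel_size,
                                     padding=kernel_size // 2, bias=False)

        def forward(self, x):
            # Channel attention from average- and max-pooled descriptors
            ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True)) +
                               self.mlp(x.amax((2, 3), keepdim=True)))
            x = x * ca
            # Spatial attention from channel-wise mean and max maps
            sa = torch.sigmoid(self.spatial(torch.cat(
                [x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
            return x * sa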

5.
Neural Netw ; 181: 106749, 2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39357266

ABSTRACT

Unsupervised Domain Adaptation aims to leverage a source domain with ample labeled data to tackle tasks on an unlabeled target domain. However, this poses a significant challenge, particularly in scenarios exhibiting significant disparities between the two domains. Prior methods often fall short in challenging domains due to incorrect pseudo-labeling noise and the limits of handcrafted domain alignment rules. In this paper, we propose a novel method called DCST (Dual Cross-Supervision Transformer), which improves upon existing methods in two key aspects. Firstly, a vision transformer is combined with a dual cross-supervision learning strategy to enforce consistency learning across domains. The network accomplishes domain-specific self-training and cross-domain feature alignment in an adaptive manner. Secondly, because of the noise present in challenging domains and the need to reduce the risks of model collapse and overfitting, we propose a Domain Shift Filter. Specifically, this module allows the model to leverage the memory of source-domain features to facilitate a smooth transition, and it can improve the effectiveness of knowledge transfer between domains with significant gaps. We conduct extensive experiments on four benchmark datasets and achieve the best classification results, including 94.3% on Office-31, 86.0% on Office-Home, 89.3% on VisDA-2017, and 48.8% on DomainNet. Code is available at https://github.com/Yislight/DCST.
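
The paper's dual cross-supervision strategy is not detailed in this abstract; a minimal sketch of one common form, two classifier heads exchanging confident pseudo-labels on unlabeled target data, is shown below (the confidence threshold is a placeholder):

    import torch
    import torch.nn.functional as F

    def cross_supervision_loss(logits_a, logits_b, threshold: float = 0.9):
        """Each head learns from the other's confident pseudo-labels."""
        def one_way(teacher, student):
            probs = teacher.softmax(dim=1).detach()  # no gradient through teacher
            conf, pseudo = probs.max(dim=1)
            mask = conf.ge(threshold).float()        # keep confident samples only
            loss = F.cross_entropy(student, pseudo, reduction="none")
            return (loss * mask).sum() / mask.sum().clamp(min=1.0)
        return one_way(logits_b, logits_a) + one_way(logits_a, logits_b)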

6.
Curr Med Imaging ; 2024 Oct 02.
Article in English | MEDLINE | ID: mdl-39360542

ABSTRACT

INTRODUCTION: In this study, we harnessed three cutting-edge algorithms' capabilities to refine the elbow fracture prediction process through X-ray image analysis. Employing the YOLOv8 (You Only Look Once) algorithm, we first identified Regions of Interest (ROI) within the X-ray images, significantly augmenting fracture prediction accuracy. METHODS: Subsequently, we integrated and compared the ResNet, SeResNet (Squeeze-and-Excitation Residual Network), and ViT (Vision Transformer) algorithms to refine our predictive capabilities. Furthermore, to ensure optimal precision, we implemented a series of meticulous refinements. This included recalibrating ROI regions to enable finer-grained identification of diagnostically significant areas within the X-ray images. Additionally, advanced image enhancement techniques were applied to optimize the X-ray images' visual quality and structural clarity. RESULTS: These methodological enhancements synergistically contributed to a substantial improvement in the overall accuracy of our fracture predictions. The dataset used for training, validation, testing, and comprehensive evaluation exclusively comprised elbow X-ray images. ResNet50 achieved an accuracy of 0.97, a precision of 1.00, and a recall of 0.95; SeResNet50 likewise achieved an accuracy of 0.97, a precision of 1.00, and a recall of 0.95; and ViT-B/16 achieved the highest accuracy of 0.99, with the same precision as the other two algorithms and a recall of 0.95. CONCLUSION: This approach has the potential to increase the precision of diagnoses, lessen the burden on radiologists, integrate easily into current medical imaging systems, and assist clinical decision-making, all of which could lead to better patient care and health outcomes overall.
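
A hypothetical two-stage pipeline in the spirit described, with YOLOv8 proposing ROIs and a torchvision classifier scoring each crop, could look like the sketch below; the weight files, input path, and ImageNet head are placeholders, since the study fine-tunes its own detector and fracture classifiers:

    import torch
    from PIL import Image
    from torchvision import transforms
    from torchvision.models import resnet50, ResNet50_Weights
    from ultralytics import YOLO

    detector = YOLO("yolov8n.pt")  # placeholder weights; the study trains its own
    classifier = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2).eval()
    prep = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

    image = Image.open("elbow_xray.jpg").convert("RGB")  # hypothetical input
    for box in detector(image)[0].boxes.xyxy:            # ROI candidates
        x1, y1, x2, y2 = box.int().tolist()
        roi = prep(image.crop((x1, y1, x2, y2))).unsqueeze(0)
        with torch.no_grad():
            logits = classifier(roi)  # a fracture head would replace ImageNet's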

7.
Proteomics ; : e202400210, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39361250

ABSTRACT

N-Linked glycosylation is crucial for various biological processes such as protein folding, immune response, and cellular transport. Traditional experimental methods for determining N-linked glycosylation sites entail substantial time and labor investment, which has led to the development of computational approaches as a more efficient alternative. However, due to the limited availability of 3D structural data, existing prediction methods often struggle to fully utilize structural information and fall short in integrating sequence and structural information effectively. Motivated by the progress of protein pretrained language models (pLMs) and the breakthrough in protein structure prediction, we introduce a high-accuracy model called CoNglyPred. Having compared various pLMs, we opt for the large-scale pLM ESM-2 to extract sequence embeddings, thus mitigating certain limitations associated with manual feature extraction. Meanwhile, our approach employs a graph transformer network to process the 3D protein structures predicted by AlphaFold2. The final graph output and the ESM-2 embedding are integrated through a co-attention mechanism. In a series of comprehensive experiments on the independent test dataset, CoNglyPred outperforms state-of-the-art models and demonstrates exceptional performance in case studies. In addition, we are the first to report the uncertainty of N-linked glycosylation predictors using expected calibration error and expected uncertainty calibration error.
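
A pipeline like CoNglyPred's would start by extracting per-residue ESM-2 embeddings; a sketch using the fair-esm package follows, with a small checkpoint and an invented sequence for brevity (the paper uses a large-scale ESM-2 model):

    import torch
    import esm  # pip install fair-esm

    # Small ESM-2 checkpoint for brevity; the paper uses a large-scale model
    model, alphabet = esm.pretrained.esm2_t12_35M_UR50D()
    batch_converter = alphabet.get_batch_converter()
    model.eval()

    data = [("protein1", "MKTLLLTLVVVTIVCLDLGYT")]  # invented sequence
    _, _, tokens = batch_converter(data)
    with torch.no_grad():
        out = model(tokens, repr_layers=[12])
    # Per-residue embeddings; position 0 is the BOS token, the last is EOS
    residue_embeddings = out["representations"][12][0, 1:-1]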

8.
Front Med (Lausanne) ; 11: 1402457, 2024.
Article in English | MEDLINE | ID: mdl-39359921

ABSTRACT

This study aims to evaluate the feasibility of a large language model (LLM) in answering pathology questions based on pathology reports (PRs) of colorectal cancer (CRC). Four common questions (CQs) and corresponding answers about pathology were retrieved from public webpages. These questions were input as prompts for the Chat Generative Pretrained Transformer (ChatGPT) (gpt-3.5-turbo). The quality indicators (understanding, scientificity, satisfaction) of all answers were evaluated by gastroenterologists. Standard PRs from 5 CRC patients who underwent radical surgery in Shanghai Changzheng Hospital were selected. Six report questions (RQs) and corresponding answers were generated by a gastroenterologist and a pathologist. We developed an interactive PR interpretation system that allows users to upload standard PRs as JPG images, from which ChatGPT's responses to the RQs were generated. The quality indicators of all answers were evaluated by gastroenterologists and outpatients. For the CQs, gastroenterologists rated AI answers similarly to non-AI answers in understanding, scientificity, and satisfaction. For RQ1-3, gastroenterologists and patients rated the AI mean scores higher than the non-AI scores on all quality indicators. However, for RQ4-6, gastroenterologists rated the AI mean scores lower than the non-AI scores in understanding and satisfaction. In RQ4, gastroenterologists rated the AI scores lower than the non-AI scores in scientificity (P = 0.011); patients rated the AI scores lower than the non-AI scores in understanding (P = 0.004) and satisfaction (P = 0.011). In conclusion, the LLM could generate credible answers to common pathology questions and conceptual questions on PRs, and it holds great potential for improving doctor-patient communication.
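
A minimal gpt-3.5-turbo call of the kind such a system would wrap is sketched below. Since gpt-3.5-turbo is text-only, the sketch assumes the report text has already been extracted from the uploaded JPG (e.g., by OCR), and the prompt wording is invented:

    from openai import OpenAI  # pip install openai

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    report_text = "..."  # text extracted from the uploaded pathology-report image
    question = "What does 'pT3N1' mean in this report?"  # invented example RQ

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "You explain colorectal cancer pathology reports "
                        "to patients in plain language."},
            {"role": "user",
             "content": f"Report:\n{report_text}\n\nQuestion: {question}"},
        ],
    )
    print(response.choices[0].message.content)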

9.
Brain Inform ; 11(1): 25, 2024 Oct 03.
Article in English | MEDLINE | ID: mdl-39363122

ABSTRACT

Transformers have dominated the landscape of Natural Language Processing (NLP) and revolutionized generative AI applications. Vision Transformers (VT) have recently become a new state of the art for computer vision applications. Motivated by the success of VTs in capturing short- and long-range dependencies and their ability to handle class imbalance, this paper proposes an ensemble framework of VTs for the efficient classification of Alzheimer's Disease (AD). The framework consists of four vanilla VTs and ensembles formed using hard- and soft-voting approaches. The proposed model was tested using two popular AD datasets: OASIS and ADNI. The ADNI dataset was employed to assess the models' efficacy under imbalanced and data-scarce conditions. The VT ensembles improved accuracy by around 2% compared to the individual models. Furthermore, the results are compared with state-of-the-art and custom-built Convolutional Neural Network (CNN) architectures and Machine Learning (ML) models under varying data conditions. The experimental results demonstrated an overall accuracy gain of 4.14% and 4.72% over the ML and CNN algorithms, respectively. The study also identifies specific limitations and proposes avenues for future research. The code used in the study is publicly available.
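
The two ensembling schemes named above, hard and soft voting over independently trained classifiers, can be sketched as follows:

    import torch

    @torch.no_grad()
    def ensemble_predict(models, x, soft: bool = True):
        """Hard- or soft-voting over independently trained classifiers."""
        probs = torch.stack([m(x).softmax(dim=1) for m in models])  # (M, B, C)
        if soft:
            return probs.mean(dim=0).argmax(dim=1)  # average probabilities
        votes = probs.argmax(dim=2)                 # (M, B) per-model votes
        return votes.mode(dim=0).values             # majority vote per sample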

10.
Front Neurorobot ; 18: 1452019, 2024.
Article in English | MEDLINE | ID: mdl-39381775

ABSTRACT

Introduction: Using machine learning methods for the precise analysis and improvement of swimming techniques holds significant research value and application prospects. Existing machine learning methods have improved the accuracy of action recognition to some extent, but they still face several challenges, such as insufficient data feature extraction, limited model generalization ability, and poor real-time performance. Methods: To address these issues, this paper proposes an innovative approach called Swimtrans Net: a multimodal robotic system for swimming action recognition driven by Swin-Transformer. By leveraging the powerful visual data feature extraction capabilities of Swin-Transformer, Swimtrans Net effectively extracts swimming image information. Additionally, to meet the requirements of multimodal tasks, we integrate the CLIP model into the system. Swin-Transformer serves as the image encoder for CLIP, and through fine-tuning, the CLIP model becomes capable of understanding and interpreting swimming action data, learning relevant features and patterns associated with swimming. Finally, we introduce transfer learning for pre-training to reduce training time and computational resources, thereby providing real-time feedback to swimmers. Results and discussion: Experimental results show that Swimtrans Net achieves a 2.94% improvement over current state-of-the-art methods in swimming motion analysis and prediction. This study introduces an innovative machine learning method that can help coaches and swimmers better understand and improve swimming techniques, ultimately improving swimming performance.

11.
J Environ Manage ; 370: 122742, 2024 Oct 08.
Article in English | MEDLINE | ID: mdl-39383749

ABSTRACT

Sorting plastic waste (PW) out of municipal solid waste (MSW) by material type is crucial for reutilization and pollution reduction. However, current automatic separation methods are costly and inefficient, necessitating an advanced sorting process to ensure high feedstock purity. This study introduces a Swin Transformer-based model for effectively detecting PW in real-world MSW streams, leveraging both morphological and material properties. A dataset comprising 3560 optical images and infrared spectra data was created to support this task. The vision-based system can localize and classify PW into five categories: polypropylene (PP), polyethylene (PE), polyethylene terephthalate (PET), polyvinyl chloride (PVC), and polystyrene (PS). Performance evaluations reveal an accuracy rate of 99.75% and a mean average precision (mAP50) exceeding 91%. Compared to popular convolutional neural network (CNN)-based models, this well-trained Swin Transformer-based model offers enhanced convenience and performance in the five-category PW detection task, maintaining a mAP50 over 80% in real-life deployment. The model's effectiveness is further supported by visualization of detection results on MSW streams and principal component analysis of the classification scores. These results demonstrate the system's effectiveness in both lab-scale and real-life conditions, aligning with global regulations and strategies that promote innovative technologies for plastic recycling, thereby contributing to the development of a sustainable circular economy.

12.
Front Plant Sci ; 15: 1452821, 2024.
Article in English | MEDLINE | ID: mdl-39391778

ABSTRACT

Accurate fruit detection is crucial for automated fruit picking. However, real-world scenarios, influenced by complex environmental factors such as illumination variations, occlusion, and overlap, pose significant challenges to accurate fruit detection. These challenges subsequently impact the commercialization of fruit-harvesting robots. A tomato detection model named YOLO-SwinTF, based on YOLOv7, is proposed to address these challenges. Integrating Swin Transformer (ST) blocks into the backbone network enables the model to capture global information by modeling long-range visual dependencies. Trident Pyramid Networks (TPN) are introduced to overcome the limitations of PANet's focus on communication-based processing. TPN incorporates multiple self-processing (SP) modules within the existing top-down and bottom-up architectures, allowing feature maps to generate new findings for communication. In addition, Focaler-IoU is introduced to reconstruct the original intersection-over-union (IoU) loss so that the loss function can adjust its focus based on the distribution of difficult and easy samples. The proposed model is evaluated on a tomato dataset, and the experimental results demonstrate that its detection recall, precision, F1 score, and AP reach 96.27%, 96.17%, 96.22%, and 98.67%, respectively. These represent improvements of 1.64%, 0.92%, 1.28%, and 0.88% over the original YOLOv7 model. Compared with other state-of-the-art detection methods, this approach achieves superior accuracy while maintaining comparable detection speed. The proposed model also exhibits strong robustness under various lighting and occlusion conditions, demonstrating its significant potential in tomato detection.

13.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39391931

ABSTRACT

Despite advanced diagnostics, 3%-5% of cases remain classified as cancer of unknown primary (CUP). DNA methylation, an important epigenetic feature, is essential for determining the origin of metastatic tumors. We present PathMethy, a novel Transformer model integrated with functional categories and crosstalk of pathways, to accurately trace the origin of tumors in CUP samples based on DNA methylation. PathMethy outperformed seven competing methods in F1-score across nine cancer datasets and accurately predicted the molecular subtypes within nine primary tumor types. It not only excelled at tracing the origins of both primary and metastatic tumors but also demonstrated a high degree of agreement with previously diagnosed sites in cases of CUP. PathMethy provides biological insights by highlighting key pathways, functional categories, and their interactions. Using the functional categories of pathways, we gained a global understanding of the underlying biological processes. For broader access, a user-friendly web server for researchers and clinicians is available at https://cup.pathmethy.com.


Subject(s)
DNA Methylation, Neoplasms, Humans, Neoplasms/genetics, Software, Artificial Intelligence, Computational Biology/methods, Algorithms, Genetic Epigenesis
14.
J Food Sci ; 2024 Oct 09.
Article in English | MEDLINE | ID: mdl-39385405

ABSTRACT

Pinelliae Rhizoma is a key ingredient in botanical supplements and is often adulterated with Rhizoma Pinelliae Pedatisectae, which is similar in appearance but less expensive. Accurate identification of these materials is crucial for both scientific and commercial purposes. Traditional morphological identification relies heavily on expert experience and is subjective, while chemical analysis and molecular biological identification are typically time consuming and labor intensive. This study aims to employ a simpler, faster, and non-invasive image recognition technique to distinguish between these two highly similar plant materials, using the vision transformer (ViT) algorithm, a cutting-edge image recognition technology. All samples were verified using DNA molecular identification before image analysis. The results demonstrate that the ViT algorithm achieves a classification accuracy exceeding 94%, significantly outperforming the convolutional neural network model's 60%-70% accuracy. This highlights the efficiency of the technology in identifying plant materials with similar appearances. This study marks the first application of the ViT algorithm to such a challenging task, showcasing its potential for precise botanical material identification and setting the stage for future advancements in the field.
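
The abstract gives no training details; a minimal torchvision sketch of fine-tuning a pretrained ViT-B/16 for the two-class herb discrimination task might look like this:

    import torch.nn as nn
    from torchvision.models import vit_b_16, ViT_B_16_Weights

    # ImageNet-pretrained ViT-B/16 with a new two-class head
    model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
    model.heads.head = nn.Linear(model.heads.head.in_features, 2)

    # Fine-tune only the new head first; optionally unfreeze the rest later
    for p in model.parameters():
        p.requires_grad = False
    for p in model.heads.head.parameters():
        p.requires_grad = True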

15.
BMC Med Res Methodol ; 24(1): 232, 2024 Oct 07.
Article in English | MEDLINE | ID: mdl-39375589

ABSTRACT

BACKGROUND: Postoperative pain is a prevalent symptom experienced by patients undergoing surgical procedures. This study aims to develop deep learning algorithms for predicting acute postoperative pain using both essential patient details and real-time vital-sign data during surgery. METHODS: Through a retrospective observational approach, we utilized Graph Attention Networks (GAT) and Graph Transformer Networks (GTN) deep learning algorithms to construct the DoseFormer model while incorporating an attention mechanism. This model employed patient information and intraoperative vital signs obtained during video-assisted thoracoscopic surgery (VATS) to anticipate postoperative pain. By categorizing the static and dynamic data, the DoseFormer model performed binary classification to predict the likelihood of postoperative acute pain. RESULTS: A total of 1758 patients were initially included, with 1552 patients remaining after data cleaning. These patients were then divided into a training set (n = 931) and a testing set (n = 621). In the testing set, the DoseFormer model exhibited a significantly higher AUROC (0.98) than classical machine learning algorithms. Furthermore, the DoseFormer model displayed a significantly higher F1 value (0.85) in comparison with other classical machine learning algorithms. Notably, the anesthesiologists' F1 values (attending: 0.49, fellow: 0.43, resident: 0.16) were significantly lower than those of the DoseFormer model in predicting acute postoperative pain. CONCLUSIONS: The deep learning model can predict postoperative acute pain events based on patients' basic information and intraoperative vital signs.


Subject(s)
Deep Learning, Postoperative Pain, Video-Assisted Thoracic Surgery, Humans, Video-Assisted Thoracic Surgery/methods, Video-Assisted Thoracic Surgery/adverse effects, Postoperative Pain/etiology, Postoperative Pain/diagnosis, Retrospective Studies, Female, Male, Middle Aged, Algorithms, Aged, Adult, Acute Pain/diagnosis, Acute Pain/etiology
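
DoseFormer's exact architecture is not given here; a minimal PyTorch Geometric sketch of the GAT component over patient graphs follows, with all dimensions and the graph construction assumed:

    import torch.nn as nn
    from torch_geometric.nn import GATConv, global_mean_pool

    class PainGAT(nn.Module):
        """GAT over graphs built from static patient data and
        intraoperative vital-sign segments (construction assumed)."""
        def __init__(self, in_dim: int, hidden: int = 64):
            super().__init__()
            self.gat1 = GATConv(in_dim, hidden, heads=4)   # concat -> hidden*4
            self.gat2 = GATConv(hidden * 4, hidden, heads=1)
            self.head = nn.Linear(hidden, 1)  # one logit: acute pain vs. none

        def forward(self, x, edge_index, batch):
            h = self.gat1(x, edge_index).relu()
            h = self.gat2(h, edge_index).relu()
            return self.head(global_mean_pool(h, batch))
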
16.
BMC Med Inform Decis Mak ; 24(1): 288, 2024 Oct 07.
Article in English | MEDLINE | ID: mdl-39375719

ABSTRACT

BACKGROUND: Histopathology is the gold standard for cancer diagnosis. It involves extracting tissue specimens from suspicious areas to prepare a glass slide for microscopic examination. However, histological tissue processing procedures introduce artifacts, which are ultimately transferred to the digitized version of glass slides, known as whole slide images (WSIs). Artifacts are diagnostically irrelevant areas and may result in wrong predictions from deep learning (DL) algorithms. Therefore, detecting and excluding artifacts in the computational pathology (CPATH) system is essential for reliable automated diagnosis. METHODS: In this paper, we propose a mixture-of-experts (MoE) scheme for detecting five notable artifacts, including damaged tissue, blur, folded tissue, air bubbles, and histologically irrelevant blood, from WSIs. First, we train independent binary DL models as experts to capture particular artifact morphology. Then, we ensemble their predictions using a fusion mechanism. We apply probabilistic thresholding over the final probability distribution to improve the sensitivity of the MoE. We developed four DL pipelines to evaluate computational and performance trade-offs: two MoEs and two multiclass models based on state-of-the-art deep convolutional neural networks (DCNNs) and vision transformers (ViTs). These DL pipelines are quantitatively and qualitatively evaluated on external and out-of-distribution (OoD) data to assess generalizability and robustness for the artifact detection application. RESULTS: We extensively evaluated the proposed MoE and multiclass models. The DCNN-based and ViT-based MoE schemes outperformed simpler multiclass models and were tested on datasets from different hospitals and cancer types, where the MoE using MobileNet DCNNs yielded the best results. The proposed MoE yields an F1 score of 86.15% and a sensitivity of 97.93% on unseen data, retaining a lower computational cost for inference than the MoE using ViTs. This best performance of the MoEs comes with relatively higher computational trade-offs than the multiclass models. Furthermore, we apply post-processing to create an artifact segmentation mask, a potential artifact-free RoI map, a quality report, and an artifact-refined WSI for further computational analysis. During the qualitative evaluation, field experts assessed the predictive performance of the MoEs over OoD WSIs. They rated artifact detection and artifact-free area preservation, where the highest agreement translated to a Cohen's kappa of 0.82, indicating substantial agreement on the overall diagnostic usability of the DCNN-based MoE scheme. CONCLUSIONS: The proposed artifact detection pipeline will not only ensure reliable CPATH predictions but may also provide quality control. In this work, the best-performing pipeline for artifact detection is the MoE with DCNNs. Our detailed experiments show that there is always a trade-off between performance and computational complexity, and no single DL solution equally suits all types of data and applications. The code and the HistoArtifacts dataset can be found online at GitHub and Zenodo, respectively.


Subject(s)
Artifacts, Deep Learning, Humans, Neoplasms, Computer-Assisted Image Processing/methods, Clinical Pathology/standards, Computer-Assisted Image Interpretation/methods
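
The fusion-plus-thresholding step of the MoE described above can be sketched as follows, with hypothetical expert models and per-artifact thresholds:

    import torch

    @torch.no_grad()
    def moe_artifact_detect(experts: dict, patch: torch.Tensor,
                            thresholds: dict):
        """Runs independent binary artifact experts and applies per-class
        probability thresholds to raise sensitivity."""
        flagged = []
        for name, expert in experts.items():    # e.g. blur, fold, bubble, ...
            p = expert(patch).sigmoid().item()  # artifact probability
            if p >= thresholds.get(name, 0.5):  # class-specific threshold
                flagged.append((name, p))
        return flagged  # empty list -> patch treated as artifact-free
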
17.
Front Comput Neurosci ; 18: 1404623, 2024.
Article in English | MEDLINE | ID: mdl-39380741

ABSTRACT

Introduction: With the great success of Transformers in the field of machine learning, they are also gradually attracting widespread interest in the field of remote sensing (RS). However, research in remote sensing has been hampered by the lack of large labeled datasets and the inconsistency of data modes caused by the diversity of RS platforms. With the rise of self-supervised learning (SSL) algorithms in recent years, RS researchers have begun to pay attention to the application of the "pre-training and fine-tuning" paradigm in RS. However, there is little research on multi-modal data fusion in the remote sensing field; most existing approaches use only one of the modalities or simply concatenate multiple modalities. Method: To develop a more efficient multi-modal data fusion scheme, we propose a multi-modal fusion mechanism based on gated unit control (MGSViT). We pretrain the ViT model on the BigEarthNet dataset by combining two commonly used SSL algorithms, and propose intra-modal and inter-modal gated fusion units for feature learning that combine multispectral (MS) and synthetic aperture radar (SAR) data. Our method can effectively combine different modal data to extract key feature information. Results and discussion: After fine-tuning and comparison experiments, we outperform the most advanced algorithms in all downstream classification tasks, verifying the validity of the proposed method.
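
A generic gated fusion unit of the kind the MGSViT inter-modal gate suggests, a learned sigmoid gate blending MS and SAR features, is sketched below; the paper's exact gating may differ:

    import torch
    import torch.nn as nn

    class GatedFusion(nn.Module):
        """Learned sigmoid gate blending multispectral and SAR features."""
        def __init__(self, dim: int):
            super().__init__()
            self.gate = nn.Linear(2 * dim, dim)

        def forward(self, ms, sar):
            g = torch.sigmoid(self.gate(torch.cat([ms, sar], dim=-1)))
            return g * ms + (1.0 - g) * sar  # convex combination per feature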

18.
Sci Rep ; 14(1): 23239, 2024 Oct 05.
Article in English | MEDLINE | ID: mdl-39369065

ABSTRACT

Networks are an essential tool today, and Intrusion Detection Systems (IDS) can ensure their safe operation. However, with the explosive growth of data, current methods are increasingly struggling, as they often detect based on a single scale, leading to the oversight of potential features in the extensive traffic data, which may degrade performance. In this work, we propose a novel detection model utilizing a multi-scale transformer, namely IDS-MTran. In essence, the collaboration of multi-scale traffic features broadens the pattern coverage of intrusion detection. Firstly, we employ convolution operators with various kernels to generate multi-scale features. Secondly, to enhance the representation of features and the interaction between branches, we propose Patching with Pooling (PwP) to serve as a bridge. Next, we design a multi-scale transformer-based backbone to model the features at diverse scales, extracting potential intrusion trails. Finally, to fully capitalize on these multi-scale branches, we propose Cross Feature Enrichment (CFE) to integrate and enrich features before outputting the results. Extensive experiments show that, compared with other models, the proposed method can distinguish different attack types more effectively. Specifically, accuracy on three common datasets (NSL-KDD, CIC-DDoS 2019, and UNSW-NB15) all exceeds 99%, which is more accurate and stable.
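
The first step described, generating multi-scale features with convolution operators of various kernel sizes, can be sketched as follows (channel sizes are placeholders):

    import torch
    import torch.nn as nn

    class MultiScaleFeatures(nn.Module):
        """Parallel 1-D convolutions with different kernel sizes."""
        def __init__(self, in_ch: int, out_ch: int, kernels=(3, 5, 7)):
            super().__init__()
            self.branches = nn.ModuleList(
                [nn.Conv1d(in_ch, out_ch, k, padding=k // 2) for k in kernels]
            )

        def forward(self, x):
            # One feature map per scale, each feeding its own transformer branch
            return [branch(x).relu() for branch in self.branches]

    feats = MultiScaleFeatures(1, 32)(torch.randn(8, 1, 100))  # 3 x (8, 32, 100)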

19.
JACC Adv ; 3(9): 101196, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39372455

ABSTRACT

Background: Ejection fraction (EF) estimation informs patient plans in the ICU, and low EF can indicate ventricular systolic dysfunction, which increases the risk of adverse events including heart failure. Automated echocardiography models are an attractive solution for high-variance human EF estimation, and key to this goal are echocardiogram vector embeddings, which are a critical resource for computational researchers. Objectives: The authors aimed to extract the vector embeddings from each echocardiogram in the EchoNet dataset using a classifier trained to classify EF as healthy (>50%) or unhealthy (<= 50%) to create an embeddings dataset for computational researchers. Methods: We repurposed an R3D transformer to classify whether patient EF is below or above 50%. Training, validation, and testing were done on the EchoNet dataset of 10,030 echocardiograms, and the resulting model generated embeddings for each of these videos. Results: We extracted 400-dimensional vector embeddings for each of the 10,030 EchoNet echocardiograms using the trained R3D model, which achieved a test AUC of 0.916 and 87.5% accuracy, approaching the performance of comparable studies. Conclusions: We present 10,030 vector embeddings learned by this model as a resource to the cardiology research community, as well as the trained model itself. These vectors enable algorithmic improvements and multimodal applications within automated echocardiography, benefitting the research community and those with ventricular systolic dysfunction (https://github.com/Team-Echo-MIT/r3d-v0-embeddings).
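
Extracting clip-level embeddings from a torchvision R3D-18 by replacing its classification head can be sketched as below; note the penultimate features here are 512-dimensional, so the paper's 400-dimensional vectors presumably come from a different layer or projection:

    import torch
    import torch.nn as nn
    from torchvision.models.video import r3d_18, R3D_18_Weights

    model = r3d_18(weights=R3D_18_Weights.KINETICS400_V1)
    model.fc = nn.Identity()  # expose penultimate features as the embedding
    model.eval()

    clip = torch.randn(1, 3, 16, 112, 112)  # (B, C, T, H, W) echo video clip
    with torch.no_grad():
        embedding = model(clip)  # (1, 512) here; the paper reports 400-d vectors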

20.
Med Image Anal ; 99: 103356, 2024 Sep 30.
Article in English | MEDLINE | ID: mdl-39378568

ABSTRACT

Breast cancer is a significant global public health concern, with various treatment options available based on tumor characteristics. Pathological examination of excision specimens after surgery provides essential information for treatment decisions. However, the manual selection of representative sections for histological examination is laborious and subjective, leading to potential sampling errors and variability, especially in carcinomas that have been previously treated with chemotherapy. Furthermore, the accurate identification of residual tumors presents significant challenges, emphasizing the need for systematic or assisted methods to address this issue. In order to enable the development of deep-learning algorithms for automated cancer detection on radiology images, it is crucial to perform radiology-pathology registration, which ensures the generation of accurately labeled ground truth data. The alignment of radiology and histopathology images plays a critical role in establishing reliable cancer labels for training deep-learning algorithms on radiology images. However, aligning these images is challenging due to their content and resolution differences, tissue deformation, artifacts, and imprecise correspondence. We present a novel deep learning-based pipeline for the affine registration of faxitron images, the x-ray representations of macrosections of ex-vivo breast tissue, and their corresponding histopathology images of tissue segments. The proposed model combines convolutional neural networks and vision transformers, allowing it to effectively capture both local and global information from the entire tissue macrosection as well as its segments. This integrated approach enables simultaneous registration and stitching of image segments, facilitating segment-to-macrosection registration through a puzzling-based mechanism. To address the limitations of multi-modal ground truth data, we tackle the problem by training the model using synthetic mono-modal data in a weakly supervised manner. The trained model demonstrated successful performance in multi-modal registration, yielding registration results with an average landmark error of 1.51 mm (±2.40), and stitching distance of 1.15 mm (±0.94). The results indicate that the model performs significantly better than existing baselines, including both deep learning-based and iterative models, and it is also approximately 200 times faster than the iterative approach. This work bridges the gap in the current research and clinical workflow and has the potential to improve efficiency and accuracy in breast cancer evaluation and streamline pathology workflow.
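
At the core of any such registration model is a differentiable affine warp, available in PyTorch via affine_grid and grid_sample; in the sketch below a registration network would regress the 2x3 matrix from the image pair, and the identity matrix is a placeholder:

    import torch
    import torch.nn.functional as F

    def warp_affine(moving: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
        """Applies a predicted 2x3 affine matrix to a moving image batch."""
        grid = F.affine_grid(theta, moving.size(), align_corners=False)
        return F.grid_sample(moving, grid, align_corners=False)

    moving = torch.randn(1, 1, 256, 256)  # histopathology segment
    theta = torch.tensor([[[1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0]]])  # identity placeholder, (B, 2, 3)
    warped = warp_affine(moving, theta)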
