Results 1-20 of 21
1.
Proc Natl Acad Sci U S A ; 121(27): e2311807121, 2024 Jul 02.
Article in English | MEDLINE | ID: mdl-38913893

ABSTRACT

Machine learning has been proposed as an alternative to theoretical modeling when dealing with complex problems in biological physics. However, in this perspective, we argue that a more successful approach is a proper combination of the two methodologies. We discuss how ideas from the physical modeling of neuronal processing led to early formulations of computational neural networks, e.g., Hopfield networks. We then show how modern learning approaches like Potts models, Boltzmann machines, and the transformer architecture are related to each other, specifically through a shared energy representation. We summarize recent efforts to establish these connections and provide examples of how each of these formulations integrating physical modeling and machine learning has been successful in tackling recent problems in biomolecular structure, dynamics, function, evolution, and design. Instances include protein structure prediction; improvements in the computational complexity and accuracy of molecular dynamics simulations; better inference of the effects of mutations in proteins, leading to improved evolutionary modeling; and, finally, how machine learning is revolutionizing protein engineering and design. Going beyond naturally existing protein sequences, a connection to protein design is discussed in which synthetic sequences are able to fold into naturally occurring motifs, driven by a model rooted in physical principles. We show that this model is "learnable" and propose its future use in generating unique sequences that can fold into a target structure.
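The energy-based view sketched in this abstract can be made concrete with a minimal Hopfield network. The code below is illustrative only (not the authors' implementation): it stores a pattern by Hebbian learning and recalls it by asynchronous updates that never increase the network energy E = -1/2 Σ w_ij s_i s_j.

```python
import random

def train_hopfield(patterns):
    """Hebbian learning: w[i][j] = mean over patterns of s_i * s_j (zero diagonal)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w

def energy(w, s):
    """The shared energy function the perspective emphasizes."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

def recall(w, s, steps=100):
    """Asynchronous updates; each flip can only lower (or keep) the energy."""
    s = list(s)
    n = len(s)
    for _ in range(steps):
        i = random.randrange(n)
        h = sum(w[i][j] * s[j] for j in range(n))
        s[i] = 1 if h >= 0 else -1
    return s

random.seed(0)
pattern = [1, -1, 1, -1, 1, -1]
w = train_hopfield([pattern])
noisy = list(pattern)
noisy[0] = -1  # corrupt one unit
print(recall(w, noisy) == pattern)  # the stored pattern is typically recovered
```

With a single stored pattern, the corrupted state sits in that pattern's energy basin, so recall converges to it.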


Subject(s)
Machine Learning; Neural Networks, Computer; Proteins; Proteins/chemistry; Proteins/metabolism; Protein Engineering/methods; Molecular Dynamics Simulation
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38305453

ABSTRACT

Target enrichment sequencing techniques are gaining widespread use in the field of genomics, prized for their economic efficiency and swift processing times. However, their success depends on the performance of probes and the evenness of sequencing depth across probes. To accurately predict probe coverage depth, a model called Deqformer is proposed in this study. Deqformer utilizes the oligonucleotide sequence of each probe, drawing inspiration from Watson-Crick base pairing and incorporating two BERT encoders to capture the underlying information from the forward and reverse probe strands, respectively. The encoded data are combined with a feed-forward network to make precise predictions of sequencing depth. The performance of Deqformer is evaluated on four different datasets: an SNP panel with 38,200 probes, an lncRNA panel with 2,000 probes, a synthetic panel with 5,899 probes, and an HD-Marker panel for Yesso scallop with 11,000 probes. The SNP and synthetic panels achieve impressive factor-3 accuracy (F3acc) values of 96.24% and 99.66% in 5-fold cross-validation. F3acc rates of over 87.33% and 72.56% are obtained when training on the SNP panel and evaluating performance on the lncRNA and HD-Marker datasets, respectively. Our analysis reveals that Deqformer effectively captures hybridization patterns, making it robust for accurate predictions in various scenarios. Deqformer offers a novel perspective for the probe design pipeline, aiming to enhance efficiency and effectiveness in probe design tasks.
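Deqformer's two-strand design rests on Watson-Crick pairing. A toy sketch of the forward/reverse-strand featurization is shown below; a k-mer count vector stands in for each BERT encoder, and all function names are illustrative, not the paper's code.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Watson-Crick pairing: complement each base and reverse the strand."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def kmer_counts(seq, k=2):
    """Stand-in for a learned encoder: a simple k-mer count vector."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def probe_features(probe):
    """Encode the forward and reverse strands separately, mirroring the
    two-encoder design at toy scale."""
    return kmer_counts(probe), kmer_counts(reverse_complement(probe))

fwd, rev = probe_features("ACGTT")
print(reverse_complement("ACGTT"))  # AACGT
```

In the real model the two encoder outputs feed a feed-forward network that regresses the coverage depth.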


Subject(s)
Deep Learning; RNA, Long Noncoding; DNA Probes/genetics; Nucleic Acid Hybridization; Genomics
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38340092

ABSTRACT

De novo peptide sequencing is a promising approach for novel peptide discovery, as highlighted by the performance improvements of state-of-the-art models. The quality of mass spectra often varies due to the unexpected absence of certain ions, presenting a significant challenge in de novo peptide sequencing. Here, we use a novel concept of complementary spectra to enhance the ion information of the experimental spectrum and demonstrate it through conceptual and practical analyses. We then design suitable encoders to encode the experimental spectrum and the corresponding complementary spectrum and propose a de novo sequencing model, π-HelixNovo, based on the Transformer architecture. We first demonstrate that π-HelixNovo outperforms other state-of-the-art models in a series of comparative experiments. We then utilize π-HelixNovo to de novo sequence gut metaproteome peptides for the first time. The results show that π-HelixNovo increases the identification coverage and accuracy of the gut metaproteome and enhances its taxonomic resolution. We finally train a more powerful π-HelixNovo on a larger training dataset, and, as expected, it achieves unprecedented performance, even for peptide-spectrum matches with never-before-seen peptide sequences. We also use this model to identify antibody peptides and multi-enzyme cleavage peptides, where π-HelixNovo proves highly robust. Our results demonstrate the effectiveness of the complementary spectrum and take a significant step forward in de novo peptide sequencing.
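The complementary-spectrum idea can be sketched numerically: for singly charged b/y fragment pairs of a peptide with neutral monoisotopic mass M, the two m/z values sum to roughly M plus two proton masses, so each observed peak implies a complementary m/z. The snippet below is a conceptual sketch of that relationship, not the π-HelixNovo encoder, and the toy masses are illustrative.

```python
PROTON = 1.007276  # proton mass in Da

def complementary_spectrum(peaks, precursor_neutral_mass):
    """For singly charged b/y pairs, m/z(b_i) + m/z(y_{n-i}) ~= M + 2*proton,
    so each observed peak implies a complementary m/z that may fill in a
    missing ion."""
    total = precursor_neutral_mass + 2 * PROTON
    return [round(total - mz, 4) for mz in peaks]

# toy example: two observed peaks and the complements they imply
peaks = [147.1128, 276.1554]
print(complementary_spectrum(peaks, 500.0))  # [354.9018, 225.8592]
```

The model then encodes the experimental and complementary spectra with separate encoders before decoding the peptide sequence.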


Subject(s)
Sequence Analysis, Protein; Tandem Mass Spectrometry; Tandem Mass Spectrometry/methods; Sequence Analysis, Protein/methods; Peptides; Amino Acid Sequence; Antibodies; Algorithms
4.
J Biomed Inform ; 153: 104630, 2024 May.
Article in English | MEDLINE | ID: mdl-38548007

ABSTRACT

OBJECTIVE: To develop a soft prompt-based learning architecture for large language models (LLMs), examine prompt-tuning using frozen/unfrozen LLMs, and assess their abilities in transfer learning and few-shot learning. METHODS: We developed a soft prompt-based learning architecture and compared 4 strategies: (1) fine-tuning without prompts; (2) hard-prompting with unfrozen LLMs; (3) soft-prompting with unfrozen LLMs; and (4) soft-prompting with frozen LLMs. We evaluated GatorTron, a clinical LLM with up to 8.9 billion parameters, and compared GatorTron with 4 existing transformer models for clinical concept and relation extraction on 2 benchmark datasets for adverse drug events and social determinants of health (SDoH). We evaluated the few-shot learning ability and generalizability for cross-institution applications. RESULTS AND CONCLUSION: When LLMs are unfrozen, GatorTron-3.9B with soft prompting achieves the best strict F1-scores of 0.9118 and 0.8604 for concept extraction, outperforming the traditional fine-tuning and hard prompt-based models by 0.6∼3.1% and 1.2∼2.9%, respectively; GatorTron-345M with soft prompting achieves the best F1-scores of 0.8332 and 0.7488 for end-to-end relation extraction, outperforming the other two models by 0.2∼2% and 0.6∼11.7%, respectively. When LLMs are frozen, small LLMs lag well behind unfrozen models; scaling LLMs up to billions of parameters makes frozen LLMs competitive with unfrozen models. Soft prompting with a frozen GatorTron-8.9B model achieved the best performance for cross-institution evaluation.
We demonstrate that (1) machines can learn soft prompts better than hard prompts composed by humans, (2) frozen LLMs have good few-shot learning ability and generalizability for cross-institution applications, (3) frozen LLMs reduce computing cost to 2.5∼6% of that of previous methods using unfrozen LLMs, and (4) frozen LLMs require large models (e.g., over several billion parameters) for good performance.
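Strategy (4), soft prompting with a frozen LLM, can be illustrated at toy scale: trainable prompt vectors are prepended to the input embeddings and are the only parameters updated. Everything below (the two-token "model", the scorer, the finite-difference training loop) is a hypothetical stand-in, not the GatorTron setup.

```python
# Frozen "LLM": a fixed embedding table and a fixed scoring vector.
EMBED = {"drug": [1.0, 0.0], "rash": [0.0, 1.0]}  # frozen
SCORER = [0.5, -0.5]                              # frozen

def forward(soft_prompt, tokens):
    """Prepend the trainable prompt vectors, mean-pool, and score."""
    seq = soft_prompt + [EMBED[t] for t in tokens]
    pooled = [sum(v[d] for v in seq) / len(seq) for d in range(2)]
    return sum(p * s for p, s in zip(pooled, SCORER))

def tune_prompt(soft_prompt, tokens, target, lr=0.5, steps=200, eps=1e-4):
    """Gradient descent on the prompt only (finite differences for brevity);
    the frozen EMBED and SCORER are never touched."""
    for _ in range(steps):
        for vec in soft_prompt:
            for d in range(2):
                base = (forward(soft_prompt, tokens) - target) ** 2
                vec[d] += eps
                bumped = (forward(soft_prompt, tokens) - target) ** 2
                vec[d] -= eps
                vec[d] -= lr * (bumped - base) / eps
    return soft_prompt

prompt = [[0.0, 0.0]]  # a single trainable prompt vector
tune_prompt(prompt, ["drug", "rash"], target=1.0)
print(round(forward(prompt, ["drug", "rash"]), 2))  # approaches 1.0
```

The compute savings reported above come from this asymmetry: only the tiny prompt matrix accumulates gradients, while the billions of frozen parameters do not.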


Subject(s)
Natural Language Processing; Humans; Machine Learning; Data Mining/methods; Algorithms; Social Determinants of Health; Drug-Related Side Effects and Adverse Reactions
5.
Sensors (Basel) ; 24(14)2024 Jul 22.
Article in English | MEDLINE | ID: mdl-39066157

ABSTRACT

Visual object tracking is an important technology in camera-based sensor networks and has wide applicability in autonomous driving systems. A transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of the input data; it has been widely applied in the field of visual tracking. Unfortunately, the security of the transformer model is unclear, which exposes transformer-based applications to security threats. In this work, the security of the transformer model was investigated through an important component of autonomous driving, i.e., visual tracking. Such deep-learning-based visual tracking is vulnerable to adversarial attacks, and thus adversarial attacks were implemented as the security threat in this investigation. First, adversarial examples were generated on top of video sequences to degrade tracking performance, and the frame-by-frame temporal motion was taken into consideration when generating perturbations over the depicted tracking results. Then, the influence of the perturbations on performance was sequentially investigated and analyzed. Finally, numerous experiments on the OTB100, VOT2018, and GOT-10k datasets demonstrated that the adversarial examples were effective in degrading the performance of transformer-based visual tracking. White-box attacks showed the highest effectiveness, with attack success rates exceeding 90% against transformer-based trackers.
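A generic white-box perturbation of the fast-gradient-sign (FGSM) family, which the paper's temporally aware attack goes beyond, can be sketched as follows; the linear "tracker" and its analytic gradient are toy stand-ins, not the attacked models.

```python
def fgsm_perturb(x, grad, eps=0.03):
    """Fast-gradient-sign-style perturbation: move each pixel by +/-eps in
    the direction that increases the loss (a generic sketch, not the
    paper's temporally aware attack)."""
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# toy "tracker": score = w . x ; attack loss = -score, so d(loss)/dx = -w
w = [0.2, -0.5, 0.1]
x = [0.5, 0.5, 0.5]
grad_loss = [-wi for wi in w]
x_adv = fgsm_perturb(x, grad_loss)
score = lambda v: sum(wi * vi for wi, vi in zip(w, v))
print(score(x_adv) < score(x))  # True: the perturbation lowers the tracker's score
```

The bounded step size (eps) keeps the perturbation visually imperceptible while still degrading the tracker.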

6.
Sensors (Basel) ; 24(2)2024 Jan 16.
Article in English | MEDLINE | ID: mdl-38257657

ABSTRACT

Recently, realistic services like virtual reality and augmented reality have gained popularity. These services require deterministic transmission with end-to-end low latency and high reliability for practical applications. However, for these real-time services to be deterministic, the network core should provide the requisite level of network service. To deliver differentiated services to each real-time service, network service providers can classify applications based on traffic. However, due to the presence of personal information in headers, application classification must be based on encrypted application data. We first collected application traffic from four well-known applications and preprocessed the data to extract the encrypted application data and convert it into model input. We then proposed a lightweight transformer model consisting of an encoder, a global average pooling layer, and a dense layer to categorize applications based on the encrypted payload in a packet. To enhance the performance of the proposed model, we determined hyperparameters through several performance evaluations and compared performance against 1D-CNN and ET-BERT. The proposed transformer model demonstrated good results, with a classification accuracy and F1 score of 96% and 95%, respectively. Its time complexity was higher than that of 1D-CNN, but it performed better in application classification, and it had both lower time complexity and higher classification performance than ET-BERT.
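A plausible sketch of the preprocessing step is shown below: the encrypted payload is turned into a fixed-length, normalized byte vector for the classifier. The input length and scaling are assumptions for illustration; the paper's exact preprocessing may differ.

```python
def payload_to_input(payload: bytes, length: int = 64):
    """Convert the encrypted payload of a packet into a fixed-length,
    normalized byte vector: truncate long payloads, zero-pad short ones,
    and scale each byte to [0, 1]."""
    data = list(payload[:length]) + [0] * max(0, length - len(payload))
    return [b / 255.0 for b in data]

# toy TLS-like record followed by opaque ciphertext bytes
vec = payload_to_input(b"\x17\x03\x03\x00\x45" + b"\xaa" * 80, length=64)
print(len(vec))  # 64
```

Fixed-length normalized vectors like this are what the transformer encoder then embeds and attends over.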

7.
Sensors (Basel) ; 24(6)2024 Mar 16.
Article in English | MEDLINE | ID: mdl-38544168

ABSTRACT

A transformer neural network is employed in the present study to predict Q-values in a simulated environment using reinforcement learning techniques. The goal is to teach an agent to navigate and excel in the Flappy Bird game, which has become a popular benchmark for control in machine learning approaches. Unlike most top existing approaches that use the game's rendered image as input, our main contribution lies in using sensory input from LIDAR, represented by the ray casting method. Specifically, we focus on understanding the temporal context of measurements from a ray casting perspective and optimizing potentially risky behavior by considering the degree of approach to objects identified as obstacles. The agent learned to use the ray casting measurements to avoid collisions with obstacles. Our model substantially outperforms related approaches. Going forward, we aim to apply this approach in real-world scenarios.
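The two ingredients, ray-cast distance sensing and Q-value learning, can be sketched as follows. In the paper a transformer predicts the Q-values rather than a table, and all constants and action names below are illustrative.

```python
import math

def ray_distance(origin, angle, obstacles, max_range=10.0, step=0.1):
    """March a ray from the origin until it comes within 0.2 units of an
    obstacle point -- a minimal stand-in for the LIDAR ray-casting input."""
    d = 0.0
    while d < max_range:
        px = origin[0] + d * math.cos(angle)
        py = origin[1] + d * math.sin(angle)
        if any(math.dist((px, py), ob) < 0.2 for ob in obstacles):
            return d
        d += step
    return max_range

def q_update(q, state, action, reward, next_state,
             actions=("flap", "idle"), alpha=0.1, gamma=0.99):
    """Tabular Q-learning update; in the paper a transformer predicts the
    Q-values instead of looking them up in a table."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

q = {}
d = ray_distance((0.0, 0.0), 0.0, obstacles=[(2.0, 0.0)])
state = round(d, 1)  # discretized distance reading used as the state
q_update(q, state, "idle", reward=1.0, next_state=state)
print(q[(state, "idle")])  # 0.1
```

Feeding a short history of such distance readings to the network is one way to provide the temporal context the abstract emphasizes.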

8.
Int J Mol Sci ; 25(9)2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38731879

ABSTRACT

Since the onset of the coronavirus disease 2019 (COVID-19) pandemic, SARS-CoV-2 variants capable of breakthrough infections have attracted global attention. These variants have significant mutations in the receptor-binding domain (RBD) of the spike protein and the membrane (M) protein, which may imply an enhanced ability to evade immune responses. In this study, an examination of co-mutations within the spike RBD and their potential correlation with mutations in the M protein was conducted. The EVmutation method was utilized to analyze the distribution of the mutations to elucidate the relationship between the mutations in the spike RBD and the alterations in the M protein. Additionally, the Sequence-to-Sequence Transformer Model (S2STM) was employed to establish mapping between the amino acid sequences of the spike RBD and M proteins, offering a novel and efficient approach for streamlined sequence analysis and the exploration of their interrelationship. Certain mutations in the spike RBD, G339D-S373P-S375F and Q493R-Q498R-Y505, are associated with a heightened propensity for inducing mutations at specific sites within the M protein, especially sites 3 and 19/63. These results shed light on the concept of mutational synergy between the spike RBD and M proteins, illuminating a potential mechanism that could be driving the evolution of SARS-CoV-2.


Subject(s)
Coronavirus M Proteins; Machine Learning; Mutation; Protein Domains; SARS-CoV-2; Spike Glycoprotein, Coronavirus; Humans; Amino Acid Sequence; Coronavirus M Proteins/genetics; COVID-19/virology; Protein Binding; Protein Domains/genetics; SARS-CoV-2/genetics; SARS-CoV-2/metabolism; Spike Glycoprotein, Coronavirus/genetics; Spike Glycoprotein, Coronavirus/chemistry
9.
Front Artif Intell ; 7: 1375419, 2024.
Article in English | MEDLINE | ID: mdl-39049961

ABSTRACT

Simplified summaries of scholarly publications are a popular way to convey scientific discoveries to a broader audience. While text summarization aims to shorten long documents, simplification seeks to reduce the complexity of a document. To accomplish both tasks jointly, machine learning methods are needed that shorten and simplify longer texts. This study presents a new Simplification Aware Text Summarization model (SATS) based on future n-gram prediction. The proposed SATS model extends ProphetNet, a text summarization model, by enhancing the objective function with a word frequency lexicon for simplification. We evaluated the performance of SATS on a recently published text summarization and simplification corpus consisting of 5,400 scientific article pairs. Our automatic evaluation results demonstrate that SATS outperforms state-of-the-art models for simplification, summarization, and joint simplification-summarization across two datasets on ROUGE, SARI, and CSS1. We also provide a human evaluation of summaries generated by the SATS model: eight annotators rated 100 summaries for grammar, coherence, consistency, fluency, and simplicity. The average human judgment for all evaluated dimensions lies between 4.0 and 4.5 on a scale from 1 to 5, where 1 means low and 5 means high.

10.
Plants (Basel) ; 13(7)2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38611501

ABSTRACT

In this study, an innovative approach based on multimodal data and the transformer model is proposed to address challenges in agricultural disease detection and question-answering systems. The method effectively integrates image, text, and sensor data, utilizing deep learning technologies to analyze and process complex agriculture-related issues in depth, and provides new perspectives and tools for the development of intelligent agriculture. In the agricultural disease detection task, the proposed method achieved a precision, recall, and accuracy of 0.95, 0.92, and 0.94, respectively, significantly outperforming conventional deep learning models. These results indicate the method's effectiveness in identifying and accurately classifying various agricultural diseases, particularly in handling subtle features and complex data. In the task of generating descriptive text from agricultural images, the method also performed well, with a precision, recall, and accuracy of 0.92, 0.88, and 0.91, respectively, demonstrating that it can both deeply understand the content of agricultural images and generate accurate, rich descriptive texts. The object detection experiment further validated the approach, with a precision, recall, and accuracy of 0.96, 0.91, and 0.94, highlighting the method's capability to accurately locate and identify agricultural targets, especially in complex environments. Overall, the approach demonstrated strong performance across agricultural disease detection, image captioning, and object detection, and showcases the potential of multimodal data and deep learning in intelligent agriculture.

11.
Comput Biol Med ; 170: 107955, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38215618

ABSTRACT

Multi-organ segmentation is vital for clinical diagnosis and treatment. Although CNNs and their extensions are popular in organ segmentation, they suffer from limited local receptive fields. In contrast, multilayer-perceptron-based models (e.g., MLP-Mixer) have a global receptive field. However, these MLP-based models employ fully connected layers with many parameters and tend to overfit on sample-deficient medical image datasets. We therefore propose a Cascaded Spatial Shift Network, CSSNet, for multi-organ segmentation. Specifically, we design a novel cascaded spatial shift block that reduces the number of model parameters and aggregates feature segments in a cascaded way for efficient and effective feature extraction. We then propose a feature refinement network that aggregates multi-scale features with location information and enhances them along the channel and spatial axes to obtain a high-quality feature map. Finally, we employ a self-attention-based fusion strategy to focus on the discriminative feature information for better multi-organ segmentation performance. Experimental results on the Synapse (multiple organs) and LiTS (liver & tumor) datasets demonstrate that CSSNet achieves promising segmentation performance compared with CNN, MLP, and Transformer models. The source code will be available at https://github.com/zkyseu/CSSNet.
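The core spatial-shift operation behind such MLP blocks can be sketched in plain Python: channel groups of a feature map each pull from a different spatial neighbor, giving subsequent MLP layers a cheap form of spatial mixing. The cascading and refinement stages of CSSNet are not reproduced here; this is a generic sketch.

```python
def spatial_shift(x):
    """Shift the 4 channel groups of an H x W x C feature map one step in
    4 different spatial directions (zero padding at the borders) -- the
    basic operation of spatial-shift MLP blocks."""
    h, w, c = len(x), len(x[0]), len(x[0][0])
    shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for k in range(c):
                si, sj = shifts[k * 4 // c]  # which group this channel is in
                ii, jj = i + si, j + sj
                if 0 <= ii < h and 0 <= jj < w:
                    out[i][j][k] = x[ii][jj][k]
    return out

# a 2x2 map with 4 channels; each channel group pulls from a different neighbor
x = [[[float(i * 10 + j)] * 4 for j in range(2)] for i in range(2)]
y = spatial_shift(x)
print(y[0][0])  # [10.0, 0.0, 1.0, 0.0]
```

Because the shift itself has no parameters, spatial mixing comes almost for free, which is how such blocks cut the parameter count relative to plain MLP-Mixer layers.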


Subject(s)
Liver Neoplasms; Humans; Neural Networks, Computer; Software; Image Processing, Computer-Assisted
12.
J Cheminform ; 16(1): 71, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898528

ABSTRACT

Among the various molecular properties and their combinations, obtaining desired molecular properties through theory or experiment is a costly process. Using machine learning to analyze molecular structure features and predict molecular properties is a potentially efficient alternative for accelerating molecular property prediction. In this study, we analyze molecular properties through the molecular structure from the perspective of machine learning. We use SMILES sequences as inputs to an artificial neural network to extract molecular structural features and predict molecular properties. A SMILES sequence comprises symbols representing molecular structures. To address the problem that a SMILES sequence differs from actual molecular structural data, we propose a pretraining model for SMILES sequences based on the BERT model, which is widely used in natural language processing, such that the model learns to extract the molecular structural information contained in the SMILES sequence. In an experiment, we first pretrain the proposed model with 100,000 SMILES sequences and then use the pretrained model to predict molecular properties on 22 datasets and the odor characteristics of molecules (98 types of odor descriptor). The experimental results show that our proposed pretraining model effectively improves the performance of molecular property prediction. SCIENTIFIC CONTRIBUTION: The 2-encoder pretraining is proposed by focusing on the lower dependency of symbols on the contextual environment in a SMILES sequence than in a natural language sentence, and on the correspondence of one compound to multiple SMILES sequences. The model pretrained with the 2-encoder shows higher robustness in molecular property prediction tasks compared to BERT, which is adept at natural language.
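A BERT-style masked-token objective on SMILES strings, the kind of pretraining described above, can be sketched as follows. The tokenizer is deliberately simplified (it only special-cases Cl and Br), and the paper's 2-encoder design is not reproduced.

```python
import random

def tokenize_smiles(smiles):
    """Character-level tokens, keeping two-letter elements together
    (a simplified tokenizer; real pipelines handle more cases)."""
    tokens, i = [], 0
    while i < len(smiles):
        if smiles[i:i + 2] in ("Cl", "Br"):
            tokens.append(smiles[i:i + 2])
            i += 2
        else:
            tokens.append(smiles[i])
            i += 1
    return tokens

def mask_tokens(tokens, rate=0.15, seed=0):
    """BERT-style masking: hide a fraction of symbols so the model must
    recover them from context during pretraining."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for idx, t in enumerate(tokens):
        if rng.random() < rate:
            masked.append("[MASK]")
            targets[idx] = t
        else:
            masked.append(t)
    return masked, targets

tokens = tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
masked, targets = mask_tokens(tokens)
print(len(tokens))  # 21
```

Predicting the hidden symbols forces the encoder to internalize valence and ring-closure regularities, which is the structural information the pretraining aims to capture.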

13.
J Big Data ; 11(1): 25, 2024.
Article in English | MEDLINE | ID: mdl-38321999

ABSTRACT

The transformer model is a well-known natural language processing model proposed by Google in 2017. With the extensive development of deep learning, many natural language processing tasks can now be solved by deep learning methods. After the BERT model was proposed, many pre-trained models such as XLNet, RoBERTa, and ALBERT were also proposed in the research community, and these models perform very well on various natural language processing tasks. In this paper, we describe and compare these well-known models. In addition, we apply several existing well-known models, namely BERT, XLNet, RoBERTa, GPT2, and ALBERT, to different well-known natural language processing tasks and analyze each model based on its performance. Few papers comprehensively compare various transformer models, so we use six well-known tasks, namely sentiment analysis, question answering, text generation, text summarization, named entity recognition, and topic modeling, to compare their performance. In addition, using the existing models, we propose ensemble learning models for the different natural language processing tasks. The results show that our ensemble learning models perform better than a single classifier on specific tasks.
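One simple form of such an ensemble is majority voting over the per-model predictions; the paper's exact combination rule may differ, so the sketch below is illustrative.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label predictions example-by-example with a
    majority vote -- one simple ensemble over transformer classifiers."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]

# hypothetical sentiment labels from three fine-tuned models
bert_preds    = ["pos", "neg", "pos"]
xlnet_preds   = ["pos", "pos", "neg"]
roberta_preds = ["pos", "neg", "pos"]
print(majority_vote([bert_preds, xlnet_preds, roberta_preds]))  # ['pos', 'neg', 'pos']
```

Voting helps when the individual models make uncorrelated mistakes, which is the usual rationale for ensembling diverse pre-trained models.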

14.
Phys Eng Sci Med ; 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-39101991

ABSTRACT

Intensity-modulated radiation therapy (IMRT) has been widely used in treating head and neck tumors. However, due to the complex anatomical structures in the head and neck region, it is challenging for the plan optimizer to rapidly generate clinically acceptable IMRT treatment plans. A novel deep learning multi-scale Transformer (MST) model was developed in the current study, aiming to accelerate IMRT planning for head and neck tumors while generating more precise predictions of the voxel-level dose distribution. The proposed end-to-end MST model employs the shunted Transformer to capture multi-scale features and learn a global dependency, and utilizes 3D deformable convolution bottleneck blocks to extract shape-aware features and compensate for the loss of spatial information in the patch merging layers. Moreover, data augmentation and self-knowledge distillation are used to further improve the prediction performance of the model. The MST model was trained and evaluated on the OpenKBP Challenge dataset. Its prediction accuracy was compared with three previous dose prediction models: C3D, TrDosePred, and TSNet. The predicted dose distributions of our proposed MST model in the tumor region are closest to the original clinical dose distribution. The MST model achieves a dose score of 2.23 Gy and a DVH score of 1.34 Gy on the test dataset, outperforming the other three models by 8%-17%. For clinically relevant DVH dosimetric metrics, the prediction accuracy in terms of mean absolute error (MAE) is 2.04% for D99, 1.54% for D95, 1.87% for D1, 1.87% for Dmean, and 1.89% for D0.1cc, superior to the other three models. The quantitative results demonstrated that the proposed MST model achieved more accurate voxel-level dose prediction than the previous models for head and neck tumors. The MST model has great potential to be applied to other disease sites to further improve the quality and efficiency of radiotherapy planning.
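The DVH metrics reported here (D99, D95, D1, ...) can be computed from a voxel dose array as percentiles of the dose distribution: D_x is the minimum dose received by the hottest x% of the structure's volume. A minimal sketch (indexing conventions vary slightly between planning systems):

```python
def dvh_dose(doses, volume_fraction):
    """D_x: the minimum dose received by the hottest x% of the voxels,
    i.e. the (100 - x)th percentile of the voxel dose distribution."""
    ranked = sorted(doses, reverse=True)
    idx = round(volume_fraction / 100 * len(ranked)) - 1
    idx = max(0, min(len(ranked) - 1, idx))
    return ranked[idx]

doses = [float(d) for d in range(1, 101)]  # toy structure: 100 voxels, 1..100 Gy
print(dvh_dose(doses, 95))  # 6.0 -> 95% of the volume receives at least 6 Gy
```

Comparing such metrics between predicted and clinical dose arrays is what the DVH-score and MAE figures above summarize.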

15.
Patterns (N Y) ; 5(3): 100933, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38487800

ABSTRACT

In cancer research, pathology report text is a largely untapped data source. Pathology reports are routinely generated, more nuanced than structured data, and contain added insight from pathologists. However, there are no publicly available datasets for benchmarking report-based models. Two recent advances suggest the urgent need for a benchmark dataset. First, improved optical character recognition (OCR) techniques will make it possible to access older pathology reports in an automated way, increasing the data available for analysis. Second, recent improvements in natural language processing (NLP) techniques using artificial intelligence (AI) allow more accurate prediction of clinical targets from text. We apply state-of-the-art OCR and customized post-processing to report PDFs from The Cancer Genome Atlas, generating a machine-readable corpus of 9,523 reports. Finally, we perform a proof-of-principle cancer-type classification across 32 tissues, achieving 0.992 average AU-ROC. This dataset will be useful to researchers across specialties, including research clinicians, clinical trial investigators, and clinical NLP researchers.

16.
PeerJ Comput Sci ; 10: e1887, 2024.
Article in English | MEDLINE | ID: mdl-38660197

ABSTRACT

Emotion detection (ED) involves identifying and understanding an individual's emotional state through cues such as facial expressions, voice tone, physiological changes, and behavioral patterns. In this context, behavioral analysis is employed to observe actions and behaviors for emotional interpretation. This work specifically employs behavioral metrics like drawing and handwriting to determine a person's emotional state, recognizing these actions as physical functions that integrate motor and cognitive processes. The study proposes an attention-based transformer model as an innovative approach to identifying emotions from handwriting and drawing samples, thereby extending ED into the domains of fine motor skills and artistic expression. The raw data consist of a set of points corresponding to the handwriting or drawing strokes. Each stroke point is delivered to the attention-based transformer model, which embeds it into a high-dimensional vector space. The model forms a prediction about the emotional state of the person who generated the sample by integrating the most important components and patterns in the input sequence through self-attention. The proposed approach has a distinct advantage in its enhanced capacity to capture long-range correlations compared with conventional recurrent neural networks (RNNs), making it particularly well suited to identifying emotions from handwriting and drawing samples and marking a notable advance in emotion detection. The proposed method achieved state-of-the-art accuracy of 92.64% on the EMOTHAW (Emotion Recognition via Handwriting and Drawing) benchmark dataset.
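The self-attention step over stroke points can be sketched with identity Q/K/V projections; a real model learns those projections, uses multiple heads, and works in a much higher-dimensional space, so everything below is a toy illustration.

```python
import math

def self_attention(seq):
    """Scaled dot-product self-attention over a sequence of stroke-point
    vectors (identity Q/K/V projections for brevity)."""
    d = len(seq[0])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    out = []
    for q in seq:
        scores = [dot(q, k) / math.sqrt(d) for k in seq]
        m = max(scores)                       # subtract max for stability
        exp = [math.exp(s - m) for s in scores]
        z = sum(exp)
        weights = [e / z for e in exp]        # softmax over all positions
        out.append([sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)])
    return out

# three toy (x, y, pressure) stroke points
strokes = [[0.1, 0.2, 0.5], [0.4, 0.1, 0.7], [0.9, 0.8, 0.2]]
mixed = self_attention(strokes)
print(len(mixed), len(mixed[0]))  # 3 3
```

Because every output position is a softmax-weighted mixture of all stroke points, distant strokes can influence each other directly, which is the long-range advantage over RNNs noted above.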

17.
Sci Rep ; 14(1): 6320, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38491085

ABSTRACT

This study explores the application of the Transformer model to Chinese word sense disambiguation, seeking to resolve word sense ambiguity in the Chinese language. The study introduces deep learning and designs a Chinese word sense disambiguation model based on the fusion of the Transformer with the Bi-directional Long Short-Term Memory (BiLSTM) algorithm. By utilizing the self-attention mechanism of the Transformer and the sequence modeling capability of BiLSTM, this model efficiently captures semantic information and context relationships in Chinese sentences, leading to accurate word sense disambiguation. The model is evaluated on the PKU Paraphrase Bank, a Chinese text paraphrase dataset. The results demonstrate that the model achieves a precision rate of 83.71% in Chinese word sense disambiguation, significantly outperforming the Long Short-Term Memory algorithm. Additionally, the root mean squared error of this algorithm is less than 17, with a loss function value remaining around 0.14. This study thus validates that the constructed Transformer-fused BiLSTM-based Chinese word sense disambiguation model exhibits both high accuracy and robustness in identifying word senses in the Chinese language, providing valuable insights for the intelligent development of word sense processing in Chinese language applications.

18.
EClinicalMedicine ; 75: 102772, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39170939

ABSTRACT

Background: Acute respiratory distress syndrome (ARDS) is a life-threatening condition with a high incidence and mortality rate in intensive care unit (ICU) admissions. Early identification of patients at high risk for developing ARDS is crucial for timely intervention and improved clinical outcomes. However, the complex pathophysiology of ARDS makes early prediction challenging. This study aimed to develop an artificial intelligence (AI) model for automated lung lesion segmentation and early prediction of ARDS to facilitate timely intervention in the intensive care unit. Methods: A total of 928 ICU patients with chest computed tomography (CT) scans were included from November 2018 to November 2021 at three centers in China. Patients were divided into a retrospective cohort for model development and internal validation, and three independent cohorts for external validation. A deep learning-based framework using the UNet Transformer (UNETR) model was developed to perform the segmentation of lung lesions and early prediction of ARDS. We employed various data augmentation techniques using the Medical Open Network for AI (MONAI) framework, enhancing the training sample diversity and improving the model's generalization capabilities. The performance of the deep learning-based framework was compared with a Densenet-based image classification network and evaluated in external and prospective validation cohorts. The segmentation performance was assessed using the Dice coefficient (DC), and the prediction performance was assessed using area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. The contributions of different features to ARDS prediction were visualized using Shapley Explanation Plots. This study was registered with the China Clinical Trial Registration Centre (ChiCTR2200058700). Findings: The segmentation task using the deep learning framework achieved a DC of 0.734 ± 0.137 in the validation set. 
For the prediction task, the deep learning-based framework achieved AUCs of 0.916 [0.858-0.961], 0.865 [0.774-0.945], 0.901 [0.835-0.955], and 0.876 [0.804-0.936] in the internal validation cohort, external validation cohort I, external validation cohort II, and prospective validation cohort, respectively. It outperformed the Densenet-based image classification network in terms of prediction accuracy. Moreover, the ARDS prediction model identified lung lesion features and clinical parameters such as C-reactive protein, albumin, bilirubin, platelet count, and age as significant contributors to ARDS prediction. Interpretation: The deep learning-based framework using the UNETR model demonstrated high accuracy and robustness in lung lesion segmentation and early ARDS prediction, and had good generalization ability and clinical applicability. Funding: This study was supported by grants from the Shanghai Renji Hospital Clinical Research Innovation and Cultivation Fund (RJPY-DZX-008) and Shanghai Science and Technology Development Funds (22YF1423300).
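The Dice coefficient (DC) used to score the segmentation task is straightforward to compute from binary masks; the sketch below uses flattened masks and is generic, not the paper's evaluation code.

```python
def dice_coefficient(pred, truth):
    """DC = 2|A ∩ B| / (|A| + |B|) over flattened binary masks, the
    segmentation overlap metric reported above."""
    inter = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * inter / total if total else 1.0  # two empty masks agree perfectly

pred  = [1, 1, 0, 1, 0, 0]
truth = [1, 0, 0, 1, 1, 0]
print(round(dice_coefficient(pred, truth), 3))  # 0.667
```

A DC of 0.734, as reported for the lung-lesion task, means the predicted and reference lesion masks overlap by roughly that fraction under this formula.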

19.
J Am Med Inform Assoc ; 31(9): 1892-1903, 2024 Sep 01.
Article in English | MEDLINE | ID: mdl-38630580

ABSTRACT

OBJECTIVE: To solve major clinical natural language processing (NLP) tasks using a unified text-to-text learning architecture based on a generative large language model (LLM) via prompt tuning.

METHODS: We formulated 7 key clinical NLP tasks as text-to-text learning and solved them using one unified generative clinical LLM, GatorTronGPT, developed using the GPT-3 architecture and trained with up to 20 billion parameters. We adopted soft prompts (ie, trainable vectors) with a frozen LLM, where the LLM parameters were not updated (ie, frozen) and only the soft-prompt vectors were updated, a technique known as prompt tuning. We added the soft prompts as a prefix to the input layer and optimized them during prompt tuning. We evaluated the proposed method on the 7 clinical NLP tasks and compared it with previous task-specific solutions based on Transformer models.

RESULTS AND CONCLUSION: The proposed approach achieved state-of-the-art performance for 5 of the 7 major clinical NLP tasks using one unified generative LLM. Our approach outperformed previous task-specific Transformer models by ∼3% for concept extraction and 7% for relation extraction applied to social determinants of health, 3.4% for clinical concept normalization, 3.4%-10% for clinical abbreviation disambiguation, and 5.5%-9% for natural language inference. Our approach also outperformed a previously developed prompt-based machine reading comprehension (MRC) model, GatorTron-MRC, for clinical concept and relation extraction. The proposed approach can deliver the "one model for all" promise from training to deployment using a unified generative LLM.
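The core idea of prompt tuning described above — trainable soft-prompt vectors prefixed to the input layer of a frozen LLM — can be sketched as follows. All sizes and the stand-in model are hypothetical toy values, not GatorTronGPT's actual dimensions or code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes (the real model has up to 20 billion parameters)
vocab, d_model, n_prompt = 100, 16, 8

# Frozen LLM pieces: token-embedding table and a stand-in for the transformer stack
embed_table = rng.normal(size=(vocab, d_model))      # frozen, never updated

def frozen_llm(hidden: np.ndarray) -> np.ndarray:
    """Placeholder for the frozen GPT stack: maps (seq, d_model) -> (d_model,)."""
    return hidden.mean(axis=0)

# The ONLY trainable parameters in prompt tuning: the soft-prompt vectors
soft_prompt = rng.normal(size=(n_prompt, d_model))

def forward(token_ids: np.ndarray) -> np.ndarray:
    tok_emb = embed_table[token_ids]                 # (seq, d_model), no gradient flows here
    hidden = np.concatenate([soft_prompt, tok_emb])  # prefix soft prompts to the input layer
    return frozen_llm(hidden)

out = forward(np.array([3, 14, 15]))
print(out.shape)  # (16,)
```

During training, gradients would update only `soft_prompt`, so a single frozen LLM can serve many tasks by swapping in a small task-specific prompt, which is what makes the "one model for all" deployment practical.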


Subject(s)
Natural Language Processing , Electronic Health Records , Humans , Machine Learning
20.
Sci Rep ; 14(1): 18506, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39122773

ABSTRACT

This paper aims to improve the target-tracking capability of unmanned aerial vehicles (UAVs). First, a control model based on fuzzy logic is created, which adjusts the UAV's flight attitude in response to the target's motion status and changes in the surrounding environment. Then, an edge computing-based target tracking framework is created: by deploying edge devices around the UAV, the computation for target recognition and position prediction is offloaded from the central processing unit to the edge nodes. Finally, the Vision Transformer model is adopted for target recognition; each image is divided into uniform patches, and an attention mechanism captures the relationships between patches to enable real-time image analysis. For position prediction, a particle filter algorithm combines historical data and sensor inputs to produce a high-precision estimate of the target position. Experimental results across different scenes show that the fuzzy-logic-based algorithm shortens the average target capture time by 20% compared with the traditional proportional-integral-derivative (PID) method, from 5.2 s to 4.2 s, and reduces the average tracking error by 15%, from 0.8 m to 0.68 m. Moreover, under environmental and target-motion changes the algorithm shows better robustness, with a tracking-error fluctuation range only half that of traditional PID. These results demonstrate that fuzzy logic control can be applied effectively to UAV target tracking and improves tracking performance.
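The particle filter used for position prediction can be sketched with a minimal 1-D bootstrap filter: propagate particles through a motion model, weight them by measurement likelihood, estimate the position as the weighted mean, then resample. This is an illustrative sketch with assumed noise parameters, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(42)

def particle_filter(measurements, n_particles=1000, motion_std=0.5, meas_std=1.0):
    """Minimal bootstrap particle filter estimating a 1-D target position."""
    particles = rng.normal(measurements[0], meas_std, size=n_particles)
    estimates = []
    for z in measurements:
        # Predict: propagate particles through a random-walk motion model
        particles = particles + rng.normal(0.0, motion_std, size=n_particles)
        # Update: weight particles by Gaussian measurement likelihood
        weights = np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
        weights /= weights.sum()
        # Estimate: weighted mean of the particle cloud
        estimates.append(float(np.dot(weights, particles)))
        # Resample: draw particles in proportion to their weights
        particles = rng.choice(particles, size=n_particles, p=weights)
    return estimates

true_path = np.linspace(0.0, 10.0, 20)             # target moving at constant speed
noisy = true_path + rng.normal(0.0, 1.0, size=20)  # simulated sensor readings
est = particle_filter(noisy)
print(round(est[-1], 2))
```

Fusing the motion model with each new measurement is what lets the filter smooth out sensor noise; in the paper's setting the "measurements" would come from the Vision Transformer's detections on the edge nodes.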
