ABSTRACT
Single-cell RNA sequencing (scRNA-seq) detects whole transcriptome signals for large amounts of individual cells and is powerful for determining cell-to-cell differences and investigating the functional characteristics of various cell types. scRNA-seq datasets are usually sparse and highly noisy. Many steps in the scRNA-seq analysis workflow, including reasonable gene selection, cell clustering and annotation, as well as discovering the underlying biological mechanisms from such datasets, are difficult. In this study, we proposed an scRNA-seq analysis method based on the latent Dirichlet allocation (LDA) model. The LDA model estimates a series of latent variables, i.e. putative functions (PFs), from the input raw cell-gene data. Thus, we incorporated the 'cell-function-gene' three-layer framework into scRNA-seq analysis, as this framework is capable of discovering latent and complex gene expression patterns via a built-in model approach and obtaining biologically meaningful results through a data-driven functional interpretation process. We compared our method with four classic methods on seven benchmark scRNA-seq datasets. The LDA-based method performed best in the cell clustering test in terms of both accuracy and purity. By analysing three complex public datasets, we demonstrated that our method could distinguish cell types with multiple levels of functional specialization, and precisely reconstruct cell development trajectories. Moreover, the LDA-based method accurately identified the representative PFs and the representative genes for the cell types/cell stages, enabling data-driven cell cluster annotation and functional interpretation. According to the literature, most of the previously reported marker/functionally relevant genes were recognized.
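To make the 'cell-function-gene' idea concrete, here is a minimal sketch of latent Dirichlet allocation fitted to a tiny cell-by-gene count matrix. It rests on assumptions the abstract does not state: a collapsed Gibbs sampler, symmetric Dirichlet priors `alpha` and `beta`, and a made-up 4 x 4 toy matrix. The function name `lda_gibbs` is hypothetical and this is not the authors' implementation.

```python
import random


def lda_gibbs(counts, n_functions, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a cell x gene count matrix.

    Returns (theta, phi): theta[c][k] is the share of putative function k
    in cell c; phi[k][g] is the weight of gene g within function k.
    """
    rng = random.Random(seed)
    n_cells, n_genes = len(counts), len(counts[0])
    # Expand the matrix into one (cell, gene) token per counted transcript.
    tokens = [(c, g) for c, row in enumerate(counts)
              for g, k in enumerate(row) for _ in range(k)]
    # Start from a random function assignment for every token.
    assign = [rng.randrange(n_functions) for _ in tokens]
    cell_fn = [[0] * n_functions for _ in range(n_cells)]
    fn_gene = [[0] * n_genes for _ in range(n_functions)]
    fn_total = [0] * n_functions
    for (c, g), z in zip(tokens, assign):
        cell_fn[c][z] += 1
        fn_gene[z][g] += 1
        fn_total[z] += 1
    for _ in range(n_iter):
        for i, (c, g) in enumerate(tokens):
            # Remove the token from the counts, then resample its function.
            z = assign[i]
            cell_fn[c][z] -= 1
            fn_gene[z][g] -= 1
            fn_total[z] -= 1
            weights = [(cell_fn[c][k] + alpha) * (fn_gene[k][g] + beta)
                       / (fn_total[k] + n_genes * beta)
                       for k in range(n_functions)]
            z = rng.choices(range(n_functions), weights=weights)[0]
            assign[i] = z
            cell_fn[c][z] += 1
            fn_gene[z][g] += 1
            fn_total[z] += 1
    # Posterior mean estimates from the final counts.
    theta = [[(cell_fn[c][k] + alpha) / (sum(cell_fn[c]) + n_functions * alpha)
              for k in range(n_functions)] for c in range(n_cells)]
    phi = [[(fn_gene[k][g] + beta) / (fn_total[k] + n_genes * beta)
            for g in range(n_genes)] for k in range(n_functions)]
    return theta, phi


# Hypothetical toy data: two cells express genes 0-1, two express genes 2-3.
counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 4]]
theta, phi = lda_gibbs(counts, n_functions=2)
```

Each row of `theta` could then feed a clustering step, and the highest-weight genes in each row of `phi` could serve as candidate representative genes for that function.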
Subjects
Gene Expression Profiling , Single-Cell Analysis , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Transcriptome , Cluster Analysis , Algorithms
ABSTRACT
BACKGROUND: Aging is a complex, heterogeneous process with multiple causes. Knowledge of genomic, epigenomic and transcriptomic changes during aging sheds light on the aging mechanism. A recent breakthrough in biotechnology, single-cell RNA-seq, is revolutionizing aging research by providing gene expression profiles of the entire transcriptome of individual cells. Much interesting information can be inferred from this new type of data with the help of novel computational methods. RESULTS: In this manuscript, a novel statistical method, penalized Latent Dirichlet Allocation (pLDA), is applied to an aging mouse blood scRNA-seq data set. A pipeline is built for cell type and aging prediction. The sequence of models in the pipeline takes scRNA-seq expression counts as input, preprocesses the data using pLDA, and predicts the cell type and aging status. CONCLUSIONS: pLDA learns a dimension-reduced representation of the expression profile. This representation allows identification of cell types and is predictive of cell age.
Subjects
Aging , Animals , Mice , Aging/genetics , Single-Cell Analysis/methods , Blood Cells/metabolism , Transcriptome , Gene Expression Profiling/methods , Computational Biology/methods , Algorithms
ABSTRACT
The relative proportion of RNA isoforms expressed for a given gene has been associated with disease states in cancer, retinal diseases, and neurological disorders. Examination of relative isoform proportions can help determine biological mechanisms, but such analyses often require a per-gene investigation of splicing patterns. Leveraging large public data sets produced by genomic consortia as a reference, one can compare splicing patterns in a data set of interest with those of a reference panel in which samples are divided into distinct groups, such as tissue of origin, or disease status. We propose A latent Dirichlet model to Compare expressed isoform proportions TO a Reference panel (ACTOR), a latent Dirichlet model with Dirichlet Multinomial observations to compare expressed isoform proportions in a data set to an independent reference panel. We use a variational Bayes procedure to estimate posterior distributions for the group membership of one or more samples. Using the Genotype-Tissue Expression project as a reference data set, we evaluate ACTOR on simulated and real RNA-seq data sets to determine tissue-type classifications of genes. ACTOR is publicly available as an R package at https://github.com/mccabes292/actor.
Subjects
Bayes Theorem , Humans , Protein Isoforms/genetics , Protein Isoforms/analysis , Protein Isoforms/metabolism , RNA Sequence Analysis/methods
ABSTRACT
Single-cell RNA sequencing trades read-depth for dimensionality, often leading to loss of critical signaling gene information that is typically present in bulk data sets. We introduce DURIAN (Deconvolution and mUltitask-Regression-based ImputAtioN), an integrative method for recovery of gene expression in single-cell data. Through systematic benchmarking, we demonstrate the accuracy, robustness and empirical convergence of DURIAN using both synthetic and published data sets. We show that use of DURIAN improves single-cell clustering, low-dimensional embedding, and recovery of intercellular signaling networks. Our study resolves several inconsistent results of cell-cell communication analysis using single-cell or bulk data independently. The method has broad application in biomarker discovery and cell signaling analysis using single-cell transcriptomics data sets.
Subjects
Bombacaceae , Transcriptome , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Signal Transduction/genetics , Single-Cell Analysis/methods
ABSTRACT
OBJECTIVE: Evaluate the efficacy of guselkumab, an anti-interleukin-23p19-subunit antibody, in patients with active psoriatic arthritis (PsA) and inadequate response to 1-2 tumour necrosis factor inhibitors (TNFi-IR), utilizing composite indices assessing disease activity across disease domains. METHODS: In the Phase IIIb COSMOS trial, 285 adults with TNFi-IR PsA were randomized (2:1) to receive guselkumab 100 mg or placebo at Week (W)0, W4, then every 8 weeks through W44. Patients receiving placebo crossed over to guselkumab at W24. In this post-hoc analysis, composite indices evaluated included the Disease Activity Index for Psoriatic Arthritis (DAPSA), Disease Activity Score 28 (DAS28), Psoriatic Arthritis Response Criteria (PsARC), Psoriatic Arthritis Disease Activity Score (PASDAS), GRAPPA Composite score (GRACE), modified Composite Psoriatic Disease Activity Index (mCPDAI), minimal disease activity (MDA) and very low disease activity (VLDA). Through W24, treatment failure rules were applied. Through W48, non-responder imputation was used for missing data. RESULTS: Greater proportions of guselkumab- than placebo-randomized patients achieved composite index endpoints relating to low disease activity (LDA; 14.8-52.4% vs 3.1-28.1%) or remission (3.7-5.3% vs 0.0-2.1%) at W24. Among guselkumab-randomized patients, LDA rates increased to W48 (DAPSA, 44.4%; DAS28, 47.8%; PASDAS, 34.4%; GRACE, 33.3%; mCPDAI, 40.2%), and 27.0% and 64.0% achieved MDA and a PsARC response, respectively. In the placebo→guselkumab crossover group, W48 response rates were similar to the guselkumab-randomized group. CONCLUSION: Guselkumab treatment provided substantial benefits across multiple disease domains, with increasing proportions of patients achieving LDA/remission over 1 year, highlighting the effectiveness of guselkumab despite previous inadequate response to TNFi.
ABSTRACT
Despite the improvements in forensic DNA quantification methods that allow for the early detection of low template/challenged DNA samples, complicating stochastic effects are not revealed until the final stage of the DNA analysis workflow. An assay that would provide genotyping information at the earlier stage of quantification would allow examiners to make critical adjustments prior to STR amplification allowing for potentially exclusionary information to be immediately reported. Specifically, qPCR instruments often have dissociation curve and/or high-resolution melt curve (HRM) capabilities; this, coupled with statistical prediction analysis, could provide additional information regarding STR genotypes present. Thus, this study aimed to evaluate Qiagen's principal component analysis (PCA)-based ScreenClust® HRM® software and a linear discriminant analysis (LDA)-based technique for their abilities to accurately predict genotypes and similar groups of genotypes from HRM data. Melt curves from single source samples were generated from STR D5S818 and D18S51 amplicons using a Rotor-Gene® Q qPCR instrument and EvaGreen® intercalating dye. When used to predict D5S818 genotypes for unknown samples, LDA analysis outperformed the PCA-based method whether predictions were for individual genotypes (58.92% accuracy) or for geno-groups (81.00% accuracy). However, when a locus with increased heterogeneity was tested (D18S51), PCA-based prediction accuracy rates improved to rates similar to those obtained using LDA (45.10% and 63.46%, respectively). This study provides foundational data documenting the performance of prediction modeling for STR genotyping based on qPCR-HRM data. In order to expand the forensic applicability of this HRM assay, the method could be tested with a more commonly utilized qPCR platform.
Subjects
DNA Fingerprinting , Genotype , Microsatellite Repeats , Principal Component Analysis , Real-Time Polymerase Chain Reaction , Humans , DNA Fingerprinting/methods , Discriminant Analysis , Real-Time Polymerase Chain Reaction/methods , Software
ABSTRACT
Bladder cancer (BC) is a urologic malignancy whose incidence continues to increase each year. Early diagnosis and prognosis monitoring are always significant in clinical practice, especially for distinguishing non-muscle-invasive bladder cancer (NMIBC) from muscle-invasive bladder cancer (MIBC), because different depths of tumor invasion are associated with different therapeutic schedules and recurrence rates. Common diagnostic approaches are either too invasive or insufficiently accurate and specific. In this work, a totally non-invasive and cost-effective method is established by investigating urine samples using surface-enhanced Raman spectroscopy (SERS) and multivariate statistical analysis. The comparison of urine SERS spectra shows that the intensities of characteristic peaks for DNA/RNA, hypoxanthine, albumin, D-(+)-galactosamine, fatty acids, and some amino acids differ with BC occurrence and invasion progression. A PLS-LDA-based two-step binary classification scheme was applied to urine SERS spectra, and the diagnostic accuracies were 97.7% and 96.3% for healthy individuals versus BC patients and NMIBC versus MIBC patients, respectively. Moreover, the impact of the urine SERS spectral range on high-precision recognition of BC is investigated. The results show that the Raman peaks at 803, 893, 1139, 1375, and 1466 cm⁻¹ play an essential role in correctly categorizing healthy controls, NMIBC patients, and MIBC patients, and that SERS spectra ranging from 400 to 1600 cm⁻¹ are sufficient for this identification task. These findings provide a sensitive, label-free, rapid, and totally non-invasive way to assess the invasion depth of BC for early diagnosis and prognosis monitoring, as well as valuable insights for selecting a reasonable spectral range to enhance measurement efficiency, especially in large-scale sample datasets.
ABSTRACT
BACKGROUND: This study reviews the research status of Diagnosis-related groups (DRGs) payment system in China and globally by analyzing topical issues in this field and exploring the evolutionary trends of DRGs in different developmental stages. METHODS: Abstracts of relevant literature in the field of DRGs were extracted from the China National Knowledge Infrastructure (CNKI) database and the Web of Science (WoS) core database and used as text data. A probabilistic distribution-based Latent Dirichlet Allocation (LDA) topic model was applied to mine the text topics. Topical issues were determined by topic intensity, and the cosine similarity of the topics in adjacent stages was calculated to analyze the topic evolution trend. RESULTS: A total of 6,758 English articles and 3,321 Chinese articles were included. Foreign research on DRGs focuses on grouping optimization, implementation effects, and influencing factors, whereas research topics in China focus on grouping and payment mechanism establishment, medical cost change evaluation, medical quality control, and performance management reform exploration. CONCLUSIONS: Currently, the field of DRGs in China is developing rapidly and attracting deepening research. However, the implementation depth of research in China remains insufficient compared with the in-depth research conducted abroad.
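The topic-evolution step described in the methods (cosine similarity between topics in adjacent stages) can be sketched as follows. The topic names, the four-word shared vocabulary and all weights are hypothetical illustrations, not values from the study, and `best_matches` is an assumed helper name.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero topic is treated as similar to nothing
    return dot / (norm_u * norm_v)


def best_matches(stage_a, stage_b):
    """Link each topic in stage_a to its most similar topic in stage_b."""
    matches = {}
    for name_a, vec_a in stage_a.items():
        scored = [(cosine_similarity(vec_a, vec_b), name_b)
                  for name_b, vec_b in stage_b.items()]
        sim, name_b = max(scored)
        matches[name_a] = (name_b, sim)
    return matches


# Hypothetical topic-word weights over a shared four-word vocabulary.
stage_1 = {"grouping": [0.6, 0.3, 0.1, 0.0], "payment": [0.0, 0.1, 0.4, 0.5]}
stage_2 = {"grouping_opt": [0.5, 0.4, 0.1, 0.0], "cost_eval": [0.1, 0.0, 0.3, 0.6]}
links = best_matches(stage_1, stage_2)
```

A high similarity between linked topics suggests the theme carried over from one stage to the next; a low best match hints that the topic faded or split.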
Subjects
Diagnosis-Related Groups , China
ABSTRACT
BACKGROUND: By the end of 2021, the new wave of COVID-19 sparked by the Omicron variant spread rapidly due to its highly contagious nature, affecting more than 170 countries worldwide. Nucleic acid testing became the gold standard for diagnosing novel coronavirus infections. As of July 2022, numerous cities and regions in China have implemented regular nucleic acid testing policies, which have had a significant impact on socioeconomics and people's lives. This policy has garnered widespread attention on social media platforms. OBJECTIVE: This study took the newly issued regular nucleic acid testing policy during the COVID-19 pandemic as an example to explore the sentiment responses and fluctuations of netizens toward new policies during public health emergencies. It aimed to propose strategies for managing public opinion on the internet and provide recommendations for policy making and public opinion control. METHODS: We collected blog posts related to nucleic acid testing on Weibo from April 1, 2022, to July 31, 2022. We used the topic modeling technique latent Dirichlet allocation (LDA) to identify the most common topics posted by users. We used Bidirectional Encoder Representations from Transformers (BERT) to calculate the sentiment score of each post. We used an autoregressive integrated moving average (ARIMA) model to examine the relationship between sentiment scores and changes over time. We compared the differences in sentiment scores across various topics, as well as the changes in sentiment before and after the announcement of the nucleic acid price reduction policy (May 22) and the lifting of the lockdown policy in Shanghai (June 1). RESULTS: We collected a total of 463,566 Weibo posts, with an average of 3799.72 (SD 1296.06) posts published daily. The LDA topic extraction identified 8 topics, with the most numerous being the Shanghai outbreak, nucleic acid testing price, and transportation. 
The average sentiment score of the posts was 0.64 (SD 0.31), indicating a predominance of positive sentiment. For all topics, posts with positive sentiment consistently outnumbered those with negative sentiment (χ²₇=24,844.4, P<.001). The sentiment scores of posts related to "nucleic acid testing price" decreased after May 22 compared with before (t₁₂₀=3.882, P<.001). Similarly, the sentiment scores of posts related to the "Shanghai outbreak" decreased after June 1 compared with before (t₁₂₀=11.943, P<.001). CONCLUSIONS: During public health emergencies, the topics of public concern were diverse. Public sentiment toward the regular nucleic acid testing policy was generally positive, but fluctuations occurred following the announcement of key policies. To understand the primary concerns of the public, the government needs to monitor social media posts by citizens. By promptly sharing information on media platforms and engaging in effective communication, the government can bridge the information gap between the public and government agencies, fostering a positive public opinion environment.
Subjects
COVID-19 , Health Policy , Public Health , Public Opinion , Humans , China , COVID-19/prevention & control , COVID-19/epidemiology , East Asian People , Emergencies , Pandemics , Social Media
ABSTRACT
The potential for rotor component shedding in rotating machinery poses significant risks, necessitating the development of an early and precise fault diagnosis technique to prevent catastrophic failures and reduce maintenance costs. This study introduces a data-driven approach to detect rotor component shedding at its inception, thereby enhancing operational safety and minimizing downtime. Utilizing frequency analysis, this research identifies harmonic amplitudes within rotor vibration data as key indicators of impending faults. The methodology employs principal component analysis (PCA) to orthogonalize and reduce the dimensionality of vibration data from rotor sensors, followed by k-fold cross-validation to select a subset of significant features, ensuring the detection algorithm's robustness and generalizability. These features are then integrated into a linear discriminant analysis (LDA) model, which serves as the diagnostic engine to predict the probability of rotor component shedding. The efficacy of the approach is demonstrated through its application to 16 industrial compressors and turbines, proving its value in providing timely fault warnings and enhancing operational reliability.
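As an illustration of the final diagnostic step only, the sketch below fits a two-class linear discriminant on two features and turns the discriminant score into a fault probability. It assumes the features are already reduced (the PCA and k-fold selection stages are omitted), uses a pooled covariance with priors taken from class sizes, and relies on invented data points. The names `fit_lda_2d` and `fault_probability` are hypothetical.

```python
import math


def fit_lda_2d(class0, class1):
    """Fit a two-class LDA on 2-D points; returns (weights, bias)."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    m0, m1 = mean(class0), mean(class1)
    # Pooled within-class covariance (shared by both classes in LDA).
    sxx = sxy = syy = 0.0
    for points, m in ((class0, m0), (class1, m1)):
        for x, y in points:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    dof = len(class0) + len(class1) - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("pooled covariance is singular")
    # w = inverse(covariance) * (m1 - m0), using the closed-form 2x2 inverse.
    d0, d1 = m1[0] - m0[0], m1[1] - m0[1]
    w = ((syy * d0 - sxy * d1) / det, (-sxy * d0 + sxx * d1) / det)
    # Boundary passes midway between the class means, shifted by the prior odds.
    prior1 = len(class1) / (len(class0) + len(class1))
    mid = ((m0[0] + m1[0]) / 2, (m0[1] + m1[1]) / 2)
    b = -(w[0] * mid[0] + w[1] * mid[1]) + math.log(prior1 / (1 - prior1))
    return w, b


def fault_probability(point, w, b):
    """Posterior probability of the fault class via a stable sigmoid."""
    score = w[0] * point[0] + w[1] * point[1] + b
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# Invented harmonic-amplitude features for healthy and faulty machines.
healthy = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.2)]
faulty = [(3.0, 2.9), (3.2, 3.1), (2.9, 3.3), (3.1, 3.0)]
w, b = fit_lda_2d(healthy, faulty)
p_fault = fault_probability((3.0, 3.0), w, b)
p_ok = fault_probability((1.0, 1.0), w, b)
```

In practice the inputs would be the selected principal-component scores rather than raw two-dimensional points, and a library implementation would handle more features and classes.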
ABSTRACT
The Internet of Things (IoT) is a significant technological advancement that allows for seamless device integration and data flow. The development of the IoT has led to the emergence of several solutions in various sectors. However, rapid popularization also has its challenges, and one of the most serious challenges is the security of the IoT. Security is a major concern, particularly routing attacks in the core network, which may cause severe damage due to information loss. Routing Protocol for Low-Power and Lossy Networks (RPL), a routing protocol used for IoT devices, is faced with selective forwarding attacks. In this paper, we present a federated learning-based detection technique for detecting selective forwarding attacks, termed FL-DSFA. A lightweight model involving the IoT Routing Attack Dataset (IRAD), which comprises Hello Flood (HF), Decreased Rank (DR), and Version Number (VN), is used in this technique to increase the detection efficiency. The attacks on IoT threaten the security of the IoT system since they mainly focus on essential elements of RPL. The components include control messages, routing topologies, repair procedures, and resources within sensor networks. Binary classification approaches have been used to assess the training efficiency of the proposed model. The training step includes the implementation of machine learning algorithms, including logistic regression (LR), K-nearest neighbors (KNN), support vector machine (SVM), and naive Bayes (NB). The comparative analysis illustrates that this study, with SVM and KNN classifiers, exhibits the highest accuracy during training and achieves the most efficient runtime performance. The proposed system demonstrates exceptional performance, achieving a prediction precision of 97.50%, an accuracy of 95%, a recall rate of 98.33%, and an F1 score of 97.01%. It outperforms the current leading research in this field, with its classification results, scalability, and enhanced privacy.
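For readers less familiar with the reported metrics, the sketch below shows how precision, recall, F1 score and accuracy follow from the four cells of a binary confusion matrix. The counts are invented for illustration and `binary_metrics` is a hypothetical helper, not part of FL-DSFA.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Invented counts: attack traffic is the positive class.
metrics = binary_metrics(tp=8, fp=2, fn=2, tn=8)
```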
ABSTRACT
The accuracy of classifying motor imagery (MI) activities is a significant challenge when using brain-computer interfaces (BCIs). BCIs allow people with motor impairments to control external devices directly with their brains using electroencephalogram (EEG) patterns that translate brain activity into control signals. Many researchers have been working to develop MI-based BCI recognition systems using various time-frequency feature extraction and classification approaches. However, existing systems still struggle to achieve satisfactory performance because of the large number of non-discriminative and ineffective features. To address these problems, we propose a multiband decomposition-based feature extraction and classification method, together with a robust feature selection method for MI tasks. Our method starts by splitting the preprocessed EEG signal into four sub-bands. In each sub-band, we then use a common spatial pattern (CSP) technique to extract narrowband-oriented features, which yields a high-dimensional feature vector. Subsequently, we apply an effective feature selection method, Relief-F, to reduce the dimensionality of the final features. Finally, we classify the reduced feature vector using advanced classification techniques. To evaluate the proposed model, we used three different EEG-based MI benchmark datasets, and our proposed model achieved better accuracy than existing systems. Our model's strengths include its ability to effectively reduce feature dimensionality and improve classification accuracy through advanced feature extraction and selection methods.
Subjects
Brain-Computer Interfaces , Electroencephalography , Electroencephalography/methods , Humans , Algorithms , Computer-Assisted Signal Processing , Imagination/physiology , Brain/physiology
ABSTRACT
Effective early fire detection is crucial for preventing damage to people and buildings, especially in fire-prone historic structures. However, due to the infrequent occurrence of fire events throughout a building's lifespan, real-world data for training models are often sparse. In this study, we applied feature representation transfer and instance transfer in the context of early fire detection using multi-sensor nodes. The goal was to investigate whether training data from a small-scale setup (source domain) can be used to identify various incipient fire scenarios in their early stages within a full-scale test room (target domain). In a first step, we employed Linear Discriminant Analysis (LDA) to create a new feature space solely based on the source domain data and predicted four different fire types (smoldering wood, smoldering cotton, smoldering cable and candle fire) in the target domain with a classification rate up to 69% and a Cohen's Kappa of 0.58. Notably, lower classification performance was observed for sensor node positions close to the wall in the full-scale test room. In a second experiment, we applied the TrAdaBoost algorithm as a common instance transfer technique to adapt the model to the target domain, assuming that sparse information from the target domain is available. Boosting the data from 1% to 30% was utilized for individual sensor node positions in the target domain to adapt the model to the target domain. We found that additional boosting improved the classification performance (average classification rate of 73% and an average Cohen's Kappa of 0.63). However, it was noted that excessively boosting the data could lead to overfitting to a specific sensor node position in the target domain, resulting in a reduction in the overall classification performance.
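Because the results are reported as both a classification rate and Cohen's Kappa, a short sketch of how Kappa corrects raw agreement for chance may help. The label sequences below are invented and are not the study's data; `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement between two labelings, corrected for chance."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    # Observed agreement is the plain classification rate.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected if predictions were independent of the truth.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Invented example with the four fire types named in the abstract.
y_true = ["wood", "wood", "cotton", "cotton", "cable", "cable", "candle", "candle"]
y_pred = ["wood", "cotton", "cotton", "cotton", "cable", "cable", "candle", "wood"]
kappa = cohens_kappa(y_true, y_pred)
```

Here 6 of 8 labels agree (a rate of 0.75) while chance agreement is 0.25, so Kappa comes out lower than the raw rate.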
ABSTRACT
With the exacerbation of global climate change and the growing environmental awareness among the general public, the concept of green consumption has gained significant attention across various sectors of society. As a representative example of green consumer products, energy-saving products play a crucial role in the timely realization of dual carbon goals. However, an analysis of online comments regarding energy-saving products reveals that the majority of these products still exhibit shortcomings in terms of efficacy, noise level, cost-effectiveness, and particularly, energy-saving appliances. This study focuses on the user-generated online comments data from the Taobao e-commerce platform for Grade 1 energy-saving refrigerators. By employing text mining techniques, the study aims to extract the essential information and sentiments expressed in the comments, in order to explore the consumption characteristics of Grade 1 energy-saving refrigerators. Moreover, the LBBA (LDA-Bert-BiLSTM-Attention) model is utilized to investigate the consumer topics of interest and emotional features. Initially, the LDA model is adopted to identify the attributes and weights of consumer concerns. Subsequently, the Bert model is pre-trained with the online comment data, and combined with the BiLSTM algorithm and Attention mechanism to predict sentiment categories. Finally, a transfer learning approach is utilized to determine the sentiment inclination of user-generated online comments and to identify the primary driving factors behind each sentiment category. This research employs sentiment analysis on online comments data regarding energy-saving products to uncover consumer sentiment attributes and emotional characteristics. It provides decision-makers with a comprehensive and systematic understanding of public consumption intentions, offering decision support for the efficient operation and management of the energy-saving product market.
Subjects
Algorithms , Climate Change , Humans
ABSTRACT
Varietal volatile compounds are characteristic of each variety of grapes and come from the skins of the grapes. This work focuses on the development of a methodology for the analysis of free compounds in grapes from Trincadeira, Cabernet Sauvignon, Syrah, Castelão and Tinta Barroca from the 2021 and 2022 harvests, using HS-SPME-GC × GC-TOFMS. To achieve this purpose, a previous optimization step of sample preparation was implemented, with the optimized conditions being 4 g of grapes, 2 g of NaCl, and 2 mL of H2O. The extraction conditions were also optimized, and it was observed that performing the extraction for 40 min at 60 °C was the best for identifying more varietal compounds. The fiber used was a triple fiber of carboxen/divinylbenzene/polydimethylsiloxane (CAR/DVB/PDMS). In addition to the sample preparation, the analytical conditions were also optimized, enabling the adequate separation of analytes. Using the optimized methodology, it was possible to identify fifty-two free volatile compounds, including seventeen monoterpenes, twenty-eight sesquiterpenes, and seven C13-norisoprenoids. It was observed that in 2021, more free varietal volatile compounds were identifiable compared to 2022. According to the results obtained through a linear discriminant analysis (LDA), the differences in volatile varietal signature are observed both among different grape varieties and across different years.
ABSTRACT
This study aimed to examine the biological effects of blood plasma exchange in liver tissues of aged and young rats using machine learning methods and spectrochemical and histopathological approaches. Linear Discriminant Analysis (LDA) and Support Vector Machine (SVM) were the machine learning algorithms employed. Young plasma was given to old male rats (24 months), while old plasma was given to young male rats (5 weeks) for thirty days. LDA (95.83-100%) and SVM (87.5-91.67%) detected significant qualitative changes in liver biomolecules. In old rats, young plasma infusion increased fatty acid chain length and triglyceride, lipid carbonyl, and glycogen levels. Nucleic acid concentration and the phosphorylation and carbonylation rates of proteins also increased, whereas protein concentration decreased. Aged plasma decreased protein carbonylation, triglyceride, and lipid carbonyl levels. Young plasma infusion improved hepatic fibrosis and cellular degeneration and reduced hepatic microvesicular steatosis in aged rats. Conversely, old plasma infusion in young rats disrupted cellular organization and caused steatosis and increased fibrosis. Young plasma administration increased liver glycogen accumulation and serum albumin levels. Aged plasma infusion raised serum ALT levels while diminishing ALP concentrations in young rats, suggesting possible liver dysfunction. Young plasma increased serum albumin levels in old rats. The study concluded that young plasma infusion might be associated with reduced liver damage and fibrosis in aged rats, while aged plasma infusion negatively impacted liver health in young rats. These results imply that young blood plasma holds potential as a rejuvenation therapy for liver health and function.
Subjects
Fatty Liver , Plasma Exchange , Male , Rats , Animals , Liver/metabolism , Fatty Liver/metabolism , Triglycerides/metabolism , Triglycerides/pharmacology , Fibrosis , Serum Albumin/metabolism , Serum Albumin/pharmacology
ABSTRACT
BACKGROUND: During aging, cognitive functions and the performance of the muscular and nervous systems decline, making the elderly more susceptible to disease and death. These age-related alterations affect functional performance in both the lower and upper limbs, and consequently human motor function. Objective measurements are important tools for understanding and characterizing the dysfunctions and limitations caused by age-related neuromuscular changes. Therefore, the objective of this study is to verify the difference between groups of young and old individuals through manual movements, and to determine whether a combination of features can produce a linear correlation across the different age groups. METHODS: The study included 99 participants, divided by age into 8 groups. Data were collected using inertial sensors positioned on the back of the hand and on the back of the forearm. First, the participants were divided into young and elderly groups to verify whether the groups could be distinguished through the features alone. The features were then combined using linear discriminant analysis (LDA), yielding a single feature called the LDA-value, which was used to assess the correlation between the different age ranges and the LDA-value. RESULTS: The results demonstrated that 125 features are able to distinguish the young group from the elderly group. The LDA-value yields a linear model of the changes in task performance that occur with advancing age; the correlation obtained, using Pearson's coefficient, was 0.86. CONCLUSION: When only the young and elderly groups are compared, the results indicate that tasks are performed differently by young and elderly individuals. 
When the 8 groups were analyzed, the linear correlation obtained was strong: the LDA-value was effective in producing a linear correlation across the eight groups, demonstrating that although the features alone do not show gradual changes as a function of age, their combination does.
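The idea of collapsing many features into one LDA-value and correlating it with age group can be sketched as below. The discriminant weights, feature vectors and resulting values are all invented, and `lda_value` and `pearson_r` are hypothetical helper names; the sketch shows the arithmetic only, not the study's fitted model.

```python
import math


def lda_value(features, weights, bias=0.0):
    """Project a feature vector onto a discriminant axis (one scalar per subject)."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented discriminant weights and one mean feature vector per age group (1-8).
weights = [0.5, -0.2, 0.8]
group_features = [
    [1.0, 2.0, 0.5], [1.2, 1.9, 0.7], [1.1, 1.8, 1.0], [1.5, 1.7, 1.1],
    [1.6, 1.5, 1.4], [1.9, 1.6, 1.5], [2.0, 1.2, 1.9], [2.3, 1.0, 2.0],
]
age_groups = list(range(1, 9))
values = [lda_value(f, weights) for f in group_features]
r = pearson_r(age_groups, values)
```

A coefficient near 1 would indicate that the combined value rises roughly linearly with age group even when no single feature does.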
Subjects
Aging , Forearm , Humans , Aged , Discriminant Analysis , Linear Models , Algorithms
ABSTRACT
BACKGROUND: Monitoring people's perspectives on the COVID-19 vaccine is crucial for understanding public vaccination hesitancy and developing effective, targeted vaccine promotion strategies. Although this is widely recognized, studies on the evolution of public opinion over the course of an actual vaccination campaign are rare. OBJECTIVE: We aimed to track the evolution of public opinion and sentiment toward COVID-19 vaccines in online discussions over an entire vaccination campaign. Moreover, we aimed to reveal the pattern of gender differences in attitudes and perceptions toward vaccination. METHODS: We collected COVID-19 vaccine-related posts by the general public that appeared on Sina Weibo from January 1, 2021, to December 31, 2021; this period covered the entire vaccination process in China. We identified popular discussion topics using latent Dirichlet allocation. We further examined changes in public sentiment and topics during the 3 stages of the vaccination timeline. Gender differences in perceptions toward vaccination were also investigated. RESULTS: Of 495,229 crawled posts, 96,145 original posts from individual accounts were included. Most posts presented positive sentiments (positive: 65,981/96,145, 68.63%; negative: 23,184/96,145, 24.11%; neutral: 6980/96,145, 7.26%). The average sentiment scores were 0.75 (SD 0.35) for men and 0.67 (SD 0.37) for women. The overall trends in sentiment scores showed a mixed response to the number of new cases and significant events related to vaccine development and important holidays. The sentiment scores showed a weak correlation with new case numbers (R=0.296; P=.03). Significant sentiment score differences were observed between men and women (P<.001). 
Common and distinguishing characteristics were found among frequently discussed topics during the different stages, with significant differences in topic distribution between men and women (January 1, 2021, to March 31, 2021: χ²(3)=3030.9; April 1, 2021, to September 30, 2021: χ²(4)=8893.8; October 1, 2021, to December 31, 2021: χ²(5)=3019.5; P<.001). Women were more concerned with side effects and vaccine effectiveness. In contrast, men reported broader concerns around the global pandemic, the progress of vaccine development, and economics affected by the pandemic. CONCLUSIONS: Understanding public concerns regarding vaccination is essential for reaching vaccine-induced herd immunity. This study tracked the year-long evolution of attitudes and opinions on COVID-19 vaccines according to the different stages of vaccination in China. These findings provide timely information that will enable the government to understand the reasons for low vaccine uptake and promote COVID-19 vaccination nationwide.
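The gender comparison of topic distributions can be illustrated with a small sketch. It assumes a Pearson chi-square test of independence on a gender-by-topic contingency table (the abstract does not name the exact test), and the counts below are hypothetical, not taken from the study. With 2 genders and 4 topics, the degrees of freedom are (2 - 1) × (4 - 1) = 3.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows; each row holds the observed counts per column.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical post counts per topic (columns) for each gender (rows).
counts = [
    [30, 20, 25, 25],  # men
    [20, 30, 25, 25],  # women
]
statistic, dof = chi_square_independence(counts)
print(f"chi2({dof}) = {statistic:.1f}")
```

A P value would then come from the chi-square distribution with `dof` degrees of freedom; that step is omitted here to keep the sketch free of third-party dependencies.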
Subjects
COVID-19 , Social Media , Female , Humans , Public Opinion , COVID-19/prevention & control , COVID-19 Vaccines , SARS-CoV-2 , Infodemiology , Vaccination , China , Attitude
ABSTRACT
BACKGROUND: Artificial intelligence (AI), conceived in the 1950s, has permeated numerous industries, intensifying in tandem with advancements in computing power. Despite the widespread adoption of AI, its integration into medicine trails other sectors. However, medical AI research has experienced substantial growth, attracting considerable attention from researchers and practitioners. OBJECTIVE: In the absence of an existing framework, this study aims to outline the current landscape of medical AI research and provide insights into its future developments by examining all AI-related studies within PubMed over the past 2 decades. We also propose potential data acquisition and analysis methods, developed using Python (version 3.11) and to be executed in Spyder IDE (version 5.4.3), for future analogous research. METHODS: Our dual-pronged approach involved (1) retrieving publication metadata related to AI from PubMed (spanning 2000-2022) via Python, including titles, abstracts, authors, journals, country, and publishing years, followed by keyword frequency analysis and (2) classifying relevant topics using latent Dirichlet allocation, an unsupervised machine learning approach, and defining the research scope of AI in medicine. In the absence of a universal medical AI taxonomy, we used an AI dictionary based on the European Commission Joint Research Centre AI Watch report, which emphasizes 8 domains: reasoning, planning, learning, perception, communication, integration and interaction, service, and AI ethics and philosophy. RESULTS: From 2000 to 2022, a comprehensive analysis of 307,701 AI-related publications from PubMed highlighted a 36-fold increase. The United States emerged as a clear frontrunner, producing 68,502 of these articles. Despite its substantial contribution in terms of volume, China lagged in terms of citation impact. Among the AI domains categorized in the Joint Research Centre AI Watch report, the learning domain emerged as dominant. 
Our classification analysis meticulously traced the nuanced research trajectories across each domain, revealing the multifaceted and evolving nature of AI's application in the realm of medicine. CONCLUSIONS: The research topics have evolved as the volume of AI studies increases annually. Machine learning remains central to medical AI research, with deep learning expected to maintain its fundamental role. Empowered by predictive algorithms, pattern recognition, and imaging analysis capabilities, the future of AI research in medicine is anticipated to concentrate on medical diagnosis, robotic intervention, and disease management. Our topic modeling outcomes provide a clear insight into the focus of AI research in medicine over the past decades and lay the groundwork for predicting future directions. The domains that have attracted considerable research attention, primarily the learning domain, will continue to shape the trajectory of AI in medicine. Given the observed growing interest, the domain of AI ethics and philosophy also stands out as a prospective area of increased focus.
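The dictionary-based keyword frequency step described in the methods can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the term lists are a hypothetical excerpt (the actual AI Watch dictionary is far larger), the sample abstracts are invented, and simple substring matching stands in for whatever matching rule the study used.

```python
from collections import Counter
import re

# Hypothetical excerpt of a domain dictionary (assumption, for illustration only).
DOMAIN_TERMS = {
    "learning": ["machine learning", "deep learning", "neural network"],
    "perception": ["computer vision", "image recognition"],
    "communication": ["natural language processing"],
}


def count_domains(abstracts):
    """Count, per domain, how many abstracts mention at least one of its terms."""
    counts = Counter()
    for text in abstracts:
        # Lowercase and collapse whitespace so multi-word terms match reliably.
        normalized = re.sub(r"\s+", " ", text.lower())
        for domain, terms in DOMAIN_TERMS.items():
            if any(term in normalized for term in terms):
                counts[domain] += 1
    return counts


# Invented sample abstracts standing in for retrieved PubMed metadata.
abstracts = [
    "Deep learning for image recognition in radiology.",
    "A machine learning model predicts sepsis onset.",
    "Natural language processing of clinical notes.",
]
domain_counts = count_domains(abstracts)
print(domain_counts.most_common())
```

Counting documents rather than raw term occurrences keeps a single long abstract from dominating a domain; either choice is defensible depending on the question being asked.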
Subjects
Artificial Intelligence , Robotics , Humans , Algorithms , Bibliometrics , Precision Medicine/methods
ABSTRACT
BACKGROUND: Social media is an important information source for a growing subset of the population and can likely be leveraged to provide insight into the evolving drug overdose epidemic. Twitter can provide valuable insight into trends, colloquial information available to potential users, and how networks and interactivity might influence what people are exposed to and how they engage in communication around drug use. OBJECTIVE: This exploratory study was designed to investigate the ways in which unsupervised machine learning analyses using natural language processing could identify coherent themes for tweets containing substance names. METHODS: This study involved harnessing data from Twitter, including large-scale collection of brand name (N=262,607) and street name (N=204,068) prescription drug-related tweets and use of unsupervised machine learning analyses (ie, natural language processing) of collected data with data visualization to identify pertinent tweet themes. Latent Dirichlet allocation (LDA) with coherence score calculations was performed to compare brand (eg, OxyContin) and street (eg, oxys) name tweets. RESULTS: We found people discussed drug use differently depending on whether a brand name or street name was used. Brand name categories often contained political talking points (eg, border, crime, and political handling of ongoing drug mitigation strategies). In contrast, categories containing street names occasionally referenced drug misuse, though multiple social uses for a term (eg, Sonata) muddled topic clarity. CONCLUSIONS: Content in the brand name corpus reflected discussion about the drug itself and less often reflected personal use. However, content in the street name corpus was notably more diverse and resisted simple LDA categorization. We speculate this may reflect effective use of slang terminology to clandestinely discuss drug-related activity. 
If so, straightforward analyses of digital drug-related communication may be more difficult than previously assumed. This work has the potential to be used for surveillance and detection of harmful drug use information. It also might be used for appropriate education and dissemination of information to persons engaged in drug use content on Twitter.
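To illustrate why an ambiguous term can lower topic clarity, the sketch below computes a coherence score for a topic's top words. It assumes the UMass coherence measure (the abstract does not say which coherence measure was used), and the tokenized tweets are invented for illustration. Words that rarely co-occur in the same document produce a lower score.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence: sum over ordered word pairs of
    log((D(w_m, w_l) + 1) / D(w_l)), where D counts documents containing the words.

    Every word in `top_words` must occur in at least one document.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_freq(top_words[l]))
    return score


# Invented, pre-tokenized tweets (assumption, not study data).
tweets = [
    ["oxycontin", "prescription", "pain"],
    ["oxycontin", "prescription", "doctor"],
    ["sonata", "music", "piano"],
    ["sonata", "sleep", "prescription"],
]

coherent = umass_coherence(["oxycontin", "prescription"], tweets)
mixed = umass_coherence(["sonata", "oxycontin"], tweets)
print(f"coherent topic: {coherent:.3f}, mixed topic: {mixed:.3f}")
```

In practice the top words would come from a fitted LDA model, and scores would be compared across candidate topic counts or corpora rather than across hand-picked word lists.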