Results 1 - 20 of 5,242
1.
Aging Clin Exp Res ; 36(1): 165, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39120630

ABSTRACT

BACKGROUND: We aimed to explore the association of sleep duration with depressive symptoms among rural-dwelling older adults in China, and to estimate the impact of substituting sleep with sedentary behavior (SB) and physical activity (PA) on the association with depressive symptoms. METHODS: This population-based cross-sectional study included 2001 rural-dwelling older adults (age ≥ 60 years, 59.2% female). Sleep duration was assessed using the Pittsburgh Sleep Quality Index. We used accelerometers to assess SB and PA, and the 15-item Geriatric Depression Scale to assess depressive symptoms. Data were analyzed using restricted cubic splines, compositional logistic regression, and isotemporal substitution models. RESULTS: Restricted cubic spline curves showed a U-shaped association between daily sleep duration and the likelihood of depressive symptoms (P-nonlinear < 0.001). Among older adults with sleep duration < 7 h/day, reallocating 60 min/day spent on SB and PA to sleep was associated with multivariable-adjusted odds ratios (ORs) of 0.81 (95% confidence interval [CI] = 0.78-0.84) and 0.79 (0.76-0.82), respectively, for depressive symptoms. Among older adults with sleep duration ≥ 7 h/day, reallocating 60 min/day spent in sleep to SB and PA, and reallocating 60 min/day spent on SB to PA, were associated with multivariable-adjusted ORs of 0.78 (0.74-0.84), 0.73 (0.69-0.78), and 0.94 (0.92-0.96), respectively, for depressive symptoms. CONCLUSIONS: Our study reveals a U-shaped association of sleep duration with depressive symptoms in rural older adults and further shows that replacing SB and PA with sleep or vice versa is associated with reduced likelihoods of depressive symptoms depending on sleep duration.
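
The odds ratios in this abstract are quoted per 60 min/day of reallocated time. The sketch below shows how such a figure could be rescaled to another block length. It assumes a plain isotemporal substitution model in which the log-odds are linear in minutes; the abstract also mentions compositional logistic regression, where that linearity need not hold, so this is an illustration only. The function name `rescale_odds_ratio` is hypothetical.

```python
import math


def rescale_odds_ratio(or_per_block, block_minutes, new_minutes):
    """Rescale an odds ratio to a different reallocation length.

    Assumption (not confirmed by the abstract): log-odds are linear in
    the number of minutes reallocated.
    """
    beta_per_minute = math.log(or_per_block) / block_minutes
    return math.exp(beta_per_minute * new_minutes)


# OR of 0.81 per 60 min/day (SB to sleep, short sleepers), rescaled to 30 min/day.
or_30_min = rescale_odds_ratio(0.81, 60, 30)
```

Under the stated linearity assumption, halving the block takes the square root of the odds ratio, so the 30-minute figure is closer to 1 than the 60-minute one.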


Subject(s)
Depression , Exercise , Rural Population , Sedentary Behavior , Sleep , Humans , Female , Male , Aged , Depression/epidemiology , Cross-Sectional Studies , Exercise/physiology , Middle Aged , Sleep/physiology , China/epidemiology , Aged, 80 and over , Data Analysis
2.
BMC Med Res Methodol ; 24(1): 178, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39117997

ABSTRACT

Statistical regression models are used for predicting outcomes based on the values of some predictor variables or for describing the association of an outcome with predictors. With a data set at hand, a regression model can be easily fit with standard software packages. This bears the risk that data analysts may rush to perform sophisticated analyses without sufficient knowledge of basic properties, associations in and errors of their data, leading to wrong interpretation and presentation of the modeling results that lacks clarity. Ignorance about special features of the data such as redundancies or particular distributions may even invalidate the chosen analysis strategy. Initial data analysis (IDA) is prerequisite to regression analyses as it provides knowledge about the data needed to confirm the appropriateness of or to refine a chosen model building strategy, to interpret the modeling results correctly, and to guide the presentation of modeling results. In order to facilitate reproducibility, IDA needs to be preplanned, an IDA plan should be included in the general statistical analysis plan of a research project, and results should be well documented. Biased statistical inference of the final regression model can be minimized if IDA abstains from evaluating associations of outcome and predictors, a key principle of IDA. We give advice on which aspects to consider in an IDA plan for data screening in the context of regression modeling to supplement the statistical analysis plan. We illustrate this IDA plan for data screening in an example of a typical diagnostic modeling project and give recommendations for data visualizations.
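
A minimal sketch of the kind of data screening this abstract describes, assuming records arrive as a list of dictionaries with `None` marking a missing value. The function name `screen_predictors` and the example columns are hypothetical. In line with the key principle stated above, the outcome column is deliberately left out of the screening.

```python
import statistics


def screen_predictors(rows, outcome):
    """Summarise each predictor column; the outcome is deliberately skipped."""
    columns = sorted({key for row in rows for key in row if key != outcome})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        report[col] = {
            "n_missing": len(values) - len(present),
            "min": min(present) if present else None,
            "median": statistics.median(present) if present else None,
            "max": max(present) if present else None,
        }
    return report


# Hypothetical toy data for a diagnostic modeling project.
rows = [
    {"age": 61, "crp": 3.2, "disease": 1},
    {"age": 74, "crp": None, "disease": 0},
    {"age": 58, "crp": 12.5, "disease": 0},
]
report = screen_predictors(rows, outcome="disease")
```

The report covers missingness and univariate ranges only; no outcome-predictor association is computed at this stage.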


Subject(s)
Models, Statistical , Humans , Regression Analysis , Data Interpretation, Statistical , Multivariate Analysis , Reproducibility of Results , Software , Data Analysis
3.
Sci Rep ; 14(1): 17782, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39090143

ABSTRACT

Previous correlative and modeling approaches indicate influences of environmental factors on COVID-19 spread through atmospheric conditions' impact on virus survival and transmission or host susceptibility. However, causal connections from environmental factors to the pandemic, mediated by human mobility, received less attention. We use the technique of Convergent Cross Mapping to identify the causal connections, beyond correlation at the country level, between pairs of variables associated with weather conditions, human mobility, and the number of COVID-19 cases for 32 European states. Here, we present data-based evidence that the relatively reduced number of cases registered in Northern Europe is related to the causal impact of precipitation on people's decision to spend more time at home and that the relatively large number of cases observed in Southern Europe is linked to people's choice to spend time outdoors during warm days. We also emphasize the channels of the significant impact of the pandemic on human mobility. The weather-human mobility connections inferred here are relevant not only for COVID-19 spread but also for any other virus transmitted through human interactions. These results may help authorities and public health experts contain possible future waves of the COVID-19 pandemic or limit the threats of similar human-to-human transmitted viruses.


Subject(s)
COVID-19 , SARS-CoV-2 , Weather , COVID-19/epidemiology , COVID-19/transmission , COVID-19/virology , Humans , Europe/epidemiology , SARS-CoV-2/isolation & purification , SARS-CoV-2/pathogenicity , Pandemics , Data Analysis
5.
Methods Mol Biol ; 2812: 1-9, 2024.
Article in English | MEDLINE | ID: mdl-39068354

ABSTRACT

In this chapter, we present an established pipeline for analyzing RNA-Seq data, which involves a step-by-step flow starting from raw data obtained from a sequencer and culminating in the identification of differentially expressed genes with their functional characterization. The pipeline is divided into three sections, each addressing crucial stages of the analysis process. The first section covers the initial steps of the pipeline, including downloading of the data of interest and performing quality control assessment. Assessment ensures that the data used for analysis is reliable and suitable for downstream analyses. In the second section, gene-level quantification is performed, which entails quantification of expression levels of genes in the samples. The third and final section is focused on differential expression analysis, which involves comparing gene expression levels between two or more conditions. This step helps identify genes that show significant differences in expression levels under different experimental conditions. To facilitate accessibility and reproducibility, we have provided an online repository containing all scripts and files. Additionally, custom scripts are available, enabling users to modify the pipeline's output for various downstream analyses. By following this pipeline, researchers can effectively analyze RNA-Seq data and gain valuable insights into gene expression patterns and, furthermore, the understanding of biological processes.


Subject(s)
Gene Expression Profiling , RNA-Seq , Software , RNA-Seq/methods , Gene Expression Profiling/methods , Computational Biology/methods , Humans , Sequence Analysis, RNA/methods , High-Throughput Nucleotide Sequencing/methods , Reproducibility of Results , Quality Control , Data Analysis , Transcriptome/genetics
6.
JMIR Ment Health ; 11: e58352, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-39024004

ABSTRACT

BACKGROUND: Emotional clarity has often been assessed with self-report measures, but efforts have also been made to measure it passively, which has advantages such as avoiding potential inaccuracy in responses stemming from social desirability bias or poor insight into emotional clarity. Response times (RTs) to emotion items administered in ecological momentary assessments (EMAs) may be an indirect indicator of emotional clarity. Another proposed indicator is the drift rate parameter, which assumes that, aside from how fast a person responds to emotion items, the measurement of emotional clarity also requires the consideration of how careful participants were in providing responses. OBJECTIVE: This paper aims to examine the reliability and validity of RTs and drift rate parameters from EMA emotion items as indicators of individual differences in emotional clarity. METHODS: Secondary data analysis was conducted on data from 196 adults with type 1 diabetes who completed a 2-week EMA study involving the completion of 5 to 6 surveys daily. If lower RTs and higher drift rates (from EMA emotion items) were indicators of emotional clarity, we hypothesized that greater levels (ie, higher clarity) should be associated with greater life satisfaction; lower levels of neuroticism, depression, anxiety, and diabetes distress; and fewer difficulties with emotion regulation. Because prior literature suggested emotional clarity could be valence specific, EMA items for negative affect (NA) and positive affect were examined separately. RESULTS: Reliability of the proposed indicators of emotional clarity was acceptable with a small number of EMA prompts (ie, 4 to 7 prompts in total or 1 to 2 days of EMA surveys). 
Consistent with expectations, the average drift rate of NA items across multiple EMAs had expected associations with other measures, such as correlations of r=-0.27 (P<.001) with depression symptoms, r=-0.27 (P=.001) with anxiety symptoms, r=-0.15 (P=.03) with emotion regulation difficulties, and r=0.63 (P<.001) with RTs to NA items. People with a higher NA drift rate responded faster to NA emotion items, had greater subjective well-being (eg, fewer depression symptoms), and had fewer difficulties with overall emotion regulation, which are all aligned with the expectation for an emotional clarity measure. Contrary to expectations, the validities of average RTs to NA items, the drift rate of positive affect items, and RTs to positive affect items were not strongly supported by our results. CONCLUSIONS: Study findings provided initial support for the validity of NA drift rate as an indicator of emotional clarity but not for that of other RT-based clarity measures. Evidence was preliminary because the sample size was not sufficient to detect small but potentially meaningful correlations, as the sample size of the diabetes EMA study was chosen for other more primary research questions. Further research on passive emotional clarity measures is needed.


Subject(s)
Ecological Momentary Assessment , Emotions , Humans , Female , Male , Reproducibility of Results , Adult , Middle Aged , Diabetes Mellitus, Type 1/psychology , Reaction Time/physiology , Emotional Regulation/physiology , Data Analysis , Personal Satisfaction , Surveys and Questionnaires , Secondary Data Analysis
7.
SLAS Discov ; 29(5): 100172, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38969289

ABSTRACT

The Cellular Thermal Shift Assay (CETSA) enables the study of protein-ligand interactions in a cellular context. It provides valuable information on the binding affinity and specificity of both small and large molecule ligands in a relevant physiological context, hence forming a unique tool in drug discovery. Though high-throughput lab protocols exist for scaling up CETSA, subsequent data analysis and quality control remain laborious and limit experimental throughput. Here, we introduce a scalable and robust data analysis workflow which allows integration of CETSA into routine high throughput screening (HT-CETSA). This new workflow automates data analysis and incorporates quality control (QC), including outlier detection, sample and plate QC, and result triage. We describe the workflow and show its robustness against typical experimental artifacts, show scaling effects, and discuss the impact of data analysis automation by eliminating manual data processing steps.
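
The abstract does not say which outlier-detection method the workflow uses. As one common possibility, the sketch below flags wells by a robust (median and MAD based) modified z-score. The function name `mad_outliers`, the 3.5 cutoff, and the example readings are all assumptions for illustration, not the authors' method.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    med = statistics.median(values)
    deviations = [abs(v - med) for v in values]
    mad = statistics.median(deviations)
    if mad == 0:
        return []  # no spread, so nothing can be flagged
    # 0.6745 scales the MAD to be comparable with a standard deviation.
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical normalised readings from replicate wells.
wells = [1.02, 0.98, 1.00, 1.01, 0.99, 1.75]
flagged = mad_outliers(wells)
```

A median-based rule is used here because a single extreme well would inflate a mean and standard deviation and could mask itself.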


Subject(s)
High-Throughput Screening Assays , Workflow , High-Throughput Screening Assays/methods , Quality Control , Data Analysis , Automation/methods , Humans , Ligands , Drug Discovery/methods , Protein Binding
8.
Nature ; 631(8022): 924-925, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39039191
9.
Se Pu ; 42(7): 669-680, 2024 Jul.
Article in Chinese | MEDLINE | ID: mdl-38966975

ABSTRACT

Mass spectrometry imaging (MSI) is a promising method for characterizing the spatial distribution of compounds. Given the diversified development of acquisition methods and continuous improvements in the sensitivity of this technology, both the total amount of generated data and the complexity of analysis have increased exponentially, posing growing challenges for data postprocessing, such as large amounts of noise, background signal interference, and image registration deviations caused by sample position changes and scan deviations. Deep learning (DL) is a powerful tool widely used in data analysis and image reconstruction. This tool enables the automatic feature extraction of data by building and training a neural network model, and achieves comprehensive and in-depth analysis of target data through transfer learning, which has great potential for MSI data analysis. This paper reviews the current research status, application progress and challenges of DL in MSI data analysis, focusing on four core stages: data preprocessing, image reconstruction, cluster analysis, and multimodal fusion. The application of a combination of DL and mass spectrometry imaging in the study of tumor diagnosis and subtype classification is also illustrated. This review also discusses trends of development in the future, aiming to promote a better combination of artificial intelligence and mass spectrometry technology.


Subject(s)
Deep Learning , Image Processing, Computer-Assisted , Mass Spectrometry , Mass Spectrometry/methods , Image Processing, Computer-Assisted/methods , Humans , Data Analysis
10.
Article in English | MEDLINE | ID: mdl-38977033

ABSTRACT

PURPOSE: This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under two stopping rules (SEM 0.3 and 0.25) using both real and simulated data in medical examinations in Korea. METHODS: This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated using estimated parameters from a real item bank in R. Outcome variables included the number of examinees passing or failing with SEM values of 0.25 and 0.30, the number of items administered, and the correlation. The consistency of the real CAT results was evaluated by examining the consistency of pass or fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules. RESULTS: Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r = 0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data. CONCLUSION: The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
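
To make the two stopping rules concrete: under the Rasch model an item contributes information p(1 - p) at a given ability, and the standard error of measurement (SEM) is 1 / sqrt(total information). The sketch below counts how many items are needed before the SEM drops to a target. It is a simplification that holds ability fixed, whereas a real CAT re-estimates ability after every response; the function names and the item bank are hypothetical.

```python
import math


def rasch_information(theta, difficulty):
    """Fisher information of one Rasch item at ability theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)


def items_until_sem(theta, difficulties, sem_target):
    """Number of items administered before SEM <= sem_target (None if never)."""
    total_information = 0.0
    for count, difficulty in enumerate(difficulties, start=1):
        total_information += rasch_information(theta, difficulty)
        if 1.0 / math.sqrt(total_information) <= sem_target:
            return count
    return None


# Hypothetical best case: every item is perfectly targeted (difficulty == ability).
bank = [0.0] * 100
n_items_sem_030 = items_until_sem(0.0, bank, 0.30)
n_items_sem_025 = items_until_sem(0.0, bank, 0.25)
```

Even with perfectly targeted items, the stricter rule needs noticeably more items, which is the accuracy versus efficiency trade-off the study evaluates.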


Subject(s)
Educational Measurement , Psychometrics , Students, Medical , Humans , Educational Measurement/methods , Educational Measurement/standards , Republic of Korea , Psychometrics/methods , Computer Simulation , Data Analysis , Education, Medical, Undergraduate/methods , Male , Female
11.
Methods Mol Biol ; 2836: 67-76, 2024.
Article in English | MEDLINE | ID: mdl-38995536

ABSTRACT

Recently, HexNAcQuest was developed to help distinguish peptides modified by HexNAc isomers, more specifically O-linked β-N-acetylglucosamine (O-GlcNAc) and O-linked α-N-acetylgalactosamine (O-GalNAc, Tn antigen). To facilitate its usage (particularly for datasets from glycoproteomics studies), herein we present a detailed protocol. It describes example cases and procedures for which users might need to use HexNAcQuest to distinguish these two modifications.


Subject(s)
Proteomics , Software , Proteomics/methods , Isomerism , Humans , Acetylglucosamine/chemistry , Acetylglucosamine/metabolism , Glycopeptides/chemistry , Glycopeptides/analysis , Glycoproteins/chemistry , Acetylgalactosamine/chemistry , Data Analysis , Peptides/chemistry , Glycosylation
12.
Environ Sci Pollut Res Int ; 31(35): 48497-48522, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39030454

ABSTRACT

Flooding is a major natural hazard worldwide, causing catastrophic damage to communities and infrastructure. Because climate change is exacerbating extreme weather events, robust flood hazard modeling is crucial to support disaster resilience and adaptation. This study uses multi-sourced geospatial datasets to develop an advanced machine learning framework for flood hazard assessment in the Arambag region of West Bengal, India. The flood inventory was constructed through Sentinel-1 SAR analysis and global flood databases. Fifteen flood conditioning factors related to topography, land cover, soil, rainfall, proximity, and demographics were incorporated. Rigorous training and testing of diverse machine learning models, including RF, AdaBoost, rFerns, XGB, DeepBoost, GBM, SDA, BAM, monmlp, and MARS algorithms, were undertaken for categorical flood hazard mapping. Model optimization was achieved through statistical feature selection techniques. Accuracy metrics and advanced model interpretability methods like SHAP and Boruta were implemented to evaluate predictive performance. According to the area under the receiver operating characteristic curve (AUC), the prediction accuracy of the models was above roughly 80%. RF achieves an AUC of 0.847 at resampling factor 5, indicating strong discriminative performance. AdaBoost also consistently exhibits good discriminative ability, with an AUC value of 0.839 at resampling factor 10. Boruta and SHAP analysis indicated precipitation and elevation as the factors contributing most significantly to flood hazard assessment in the study area. Most of the machine learning models pointed out southern portions of the study area as highly susceptible areas. On average, from 17.2 to 18.6% of the study area is highly susceptible to flood hazards.
In the feature selection analysis, various nature-inspired algorithms identified the selected input parameters for flood hazard assessment, i.e., elevation, precipitation, distance to rivers, TWI, geomorphology, lithology, TRI, slope, soil type, curvature, NDVI, distance to roads, and gMIS. As per the Boruta and SHAP analyses, it was found that elevation, precipitation, and distance to rivers play the most crucial roles in the decision-making process for flood hazard assessment. The results indicated that 15.27% of the building footprints are at high or very high risk, with the remainder at very low risk (43.80%), low risk (24.30%), and moderate risk (16.63%). Similarly, the cropland area affected by flooding in this region is categorized into five risk classes: very high (16.85%), high (17.28%), moderate (16.07%), low (16.51%), and very low (33.29%). Overall, this interdisciplinary study contributes significantly towards hydraulic and hydrological modeling for flood hazard management.
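
The models in this abstract are compared by AUC. As a reminder of what that number measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen flooded site receives a higher score than a randomly chosen non-flooded one, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = flooded) and model susceptibility scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
```

This pairwise form is quadratic in the number of samples, which is fine for illustration but slow for large inventories.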


Subject(s)
Floods , Machine Learning , India , Risk Assessment , Data Analysis , Algorithms
14.
J Am Soc Mass Spectrom ; 35(8): 1865-1874, 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-38967378

ABSTRACT

Ion mobility-mass spectrometry (IM-MS) has become a technology deployed across a wide range of structural biology applications despite the challenges in characterizing closely related protein structures. Collision-induced unfolding (CIU) has emerged as a valuable technique for distinguishing closely related, iso-cross-sectional protein and protein complex ions through their distinct unfolding pathways in the gas phase. With the speed and sensitivity of CIU analyses, there has been a rapid growth of CIU-based assays, especially regarding biomolecular targets that remain challenging to assess and characterize with other structural biology tools. With information-rich CIU data, many software tools have been developed to automate laborious data analysis. However, with the recent development of new IM-MS technologies, such as cyclic IM-MS, CIU continues to evolve, necessitating improved data analysis tools to keep pace with new technologies and facilitating the automation of various data processing tasks. Here, we present CIUSuite 3, a software package that contains updated algorithms that support various IM-MS platforms and supports the automation of various data analysis tasks such as peak detection, multidimensional classification, and collision cross section (CCS) calibration. CIUSuite 3 uses local maxima searches along with peak width and prominence filters to detect peaks to automate CIU data extraction. To support both the primary CIU (CIU1) and secondary CIU (CIU2) experiments enabled by cyclic IM-MS, two-dimensional data preprocessing is deployed, which allows multidimensional classification. Our data suggest that additional dimensions in classification improve the overall accuracy of class assignments. CIUSuite 3 also supports CCS calibration for both traveling wave and drift tube IM-MS, and we demonstrate the accuracy of a new single-field CCS calibration method designed for drift tube IM-MS leveraging calibrant CIU data. 
Overall, CIUSuite 3 is positioned to support current and next-generation IM-MS and CIU assay development deployed in an automated format.
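
The abstract describes peak detection by local maxima searches combined with peak width and prominence filters. The sketch below is a plain-Python illustration of that general idea, not CIUSuite 3's actual code: the function name `detect_peaks`, the width definition (samples above half the prominence), and the example trace are assumptions.

```python
def detect_peaks(signal, min_prominence=0.0, min_width=0):
    """Indices of local maxima that pass prominence and width filters."""
    peaks = []
    n = len(signal)
    for i in range(1, n - 1):
        if not (signal[i - 1] < signal[i] >= signal[i + 1]):
            continue  # not a local maximum
        # Prominence: walk outwards until a higher point or the edge;
        # the higher of the two side minima is taken as the peak's base.
        left_min = signal[i]
        j = i - 1
        while j >= 0 and signal[j] <= signal[i]:
            left_min = min(left_min, signal[j])
            j -= 1
        right_min = signal[i]
        j = i + 1
        while j < n and signal[j] <= signal[i]:
            right_min = min(right_min, signal[j])
            j += 1
        prominence = signal[i] - max(left_min, right_min)
        # Width: contiguous samples that stay above half the prominence.
        half_height = signal[i] - prominence / 2
        lo = i
        while lo > 0 and signal[lo - 1] > half_height:
            lo -= 1
        hi = i
        while hi < n - 1 and signal[hi + 1] > half_height:
            hi += 1
        width = hi - lo + 1
        if prominence >= min_prominence and width >= min_width:
            peaks.append(i)
    return peaks


# Hypothetical arrival-time trace: a sharp peak, a noise bump, a broad peak.
trace = [0, 1, 5, 1, 0, 0.4, 0, 2, 6, 7, 6, 2, 0]
all_maxima = detect_peaks(trace)
prominent = detect_peaks(trace, min_prominence=1.0)
prominent_and_wide = detect_peaks(trace, min_prominence=1.0, min_width=2)
```

The prominence filter drops the small noise bump, and the width filter additionally drops the single-sample spike, which is how the two filters complement a bare local-maximum search.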


Subject(s)
Algorithms , Protein Unfolding , Proteins , Software , Proteins/chemistry , Proteins/analysis , Calibration , Gases/chemistry , Ion Mobility Spectrometry/methods , Mass Spectrometry/methods , Data Analysis
15.
PLoS One ; 19(7): e0305038, 2024.
Article in English | MEDLINE | ID: mdl-38985781

ABSTRACT

The meta-learning method proposed in this paper addresses the issue of small-sample regression in the application of engineering data analysis, which is a highly promising direction for research. By integrating traditional regression models with optimization-based data augmentation from meta-learning, the proposed deep neural network demonstrates excellent performance in optimizing glass fiber reinforced plastic (GFRP) for wrapping concrete short columns. When compared with traditional regression models, such as Support Vector Regression (SVR), Gaussian Process Regression (GPR), and Radial Basis Function Neural Networks (RBFNN), the meta-learning method proposed here performs better in modeling small data samples. The success of this approach illustrates the potential of deep learning in dealing with limited amounts of data, offering new opportunities in the field of material data analysis.


Subject(s)
Construction Materials , Deep Learning , Glass , Neural Networks, Computer , Plastics , Data Analysis
16.
BMC Bioinformatics ; 25(1): 232, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982382

ABSTRACT

BACKGROUND: Characterization of microbial growth is of both fundamental and applied interest. Modern platforms can automate collection of high-throughput microbial growth curves, necessitating the development of computational tools to handle and analyze these data to produce insights. RESULTS: To address this need, here I present a newly-developed R package: gcplyr. gcplyr can flexibly import growth curve data in common tabular formats, and reshapes it under a tidy framework that is flexible and extendable, enabling users to design custom analyses or plot data with popular visualization packages. gcplyr can also incorporate metadata and generate or import experimental designs to merge with data. Finally, gcplyr carries out model-free (non-parametric) analyses. These analyses do not require mathematical assumptions about microbial growth dynamics, and gcplyr is able to extract a broad range of important traits, including growth rate, doubling time, lag time, maximum density and carrying capacity, diauxie, area under the curve, extinction time, and more. CONCLUSIONS: gcplyr makes scripted analyses of growth curve data in R straightforward, streamlines common data wrangling and analysis steps, and easily integrates with common visualization and statistical analyses.
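
To illustrate what model-free growth-curve traits look like, the sketch below derives a few of the traits listed above from finite differences and the trapezoid rule. It is written in Python for illustration and does not reproduce gcplyr's R interface; the function name `growth_traits` and the example readings are hypothetical.

```python
import math


def growth_traits(times, density):
    """Model-free traits from one growth curve (no growth model is fitted)."""
    steps = range(len(times) - 1)
    # Per-capita growth rate: slope of log(density) between adjacent readings.
    rates = [
        (math.log(density[k + 1]) - math.log(density[k])) / (times[k + 1] - times[k])
        for k in steps
    ]
    max_rate = max(rates)
    # Area under the curve by the trapezoid rule.
    auc = sum(
        (times[k + 1] - times[k]) * (density[k] + density[k + 1]) / 2
        for k in steps
    )
    return {
        "max_growth_rate": max_rate,
        "doubling_time": math.log(2) / max_rate,
        "max_density": max(density),
        "auc": auc,
    }


# Hypothetical optical-density readings taken hourly.
times = [0, 1, 2, 3, 4]
density = [0.01, 0.02, 0.04, 0.06, 0.06]
traits = growth_traits(times, density)
```

Real plate-reader data are noisy, so raw adjacent-point differences would normally be smoothed or computed over a window before taking the maximum.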


Subject(s)
Software , Computational Biology/methods , Data Analysis
17.
PLoS One ; 19(7): e0297930, 2024.
Article in English | MEDLINE | ID: mdl-38959245

ABSTRACT

Data analysis can be accurate and reliable only if the underlying assumptions of the used statistical method are validated. Any violations of these assumptions can change the outcomes and conclusions of the analysis. In this study, we developed Smart Data Analysis V2 (SDA-V2), an interactive and user-friendly web application, to assist users with limited statistical knowledge in data analysis, and it can be freely accessed at https://jularatchumnaul.shinyapps.io/SDA-V2/. SDA-V2 automatically explores and visualizes data, examines the underlying assumptions associated with the parametric test, and selects an appropriate statistical method for the given data. Furthermore, SDA-V2 can assess the quality of research instruments and determine the minimum sample size required for a meaningful study. However, while SDA-V2 is a valuable tool for simplifying statistical analysis, it does not replace the need for a fundamental understanding of statistical principles. Researchers are encouraged to combine their expertise with the software's capabilities to achieve the most accurate and credible results.


Subject(s)
Software , Humans , Data Analysis , User-Computer Interface , Data Interpretation, Statistical
18.
Eur J Public Health ; 34(Supplement_1): i43-i49, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38946447

ABSTRACT

BACKGROUND: The extensive and continuous reuse of sensitive health data could enhance the role of population health research on public decisions. This paper describes the design principles and the different building blocks that have supported the implementation and deployment of Population Health Information Research Infrastructure (PHIRI), the strengths and challenges of the approach and some future developments. METHODS: The design and implementation of PHIRI have been developed upon: (i) the data visiting principle: data does not move but code moves; (ii) the orchestration of the research question throughout a workflow that ensured legal, organizational, semantic and technological interoperability and (iii) a 'master-worker' federated computational architecture that supported the development of four use cases. RESULTS: Nine participant nodes and 28 Euro-Peristat members completed the deployment of the infrastructure according to the expected outputs. As a consequence, each use case produced and published its own common data model, the analytical pipeline and the corresponding research outputs. All the digital objects were developed and published according to Open Science and FAIR principles. CONCLUSION: PHIRI has successfully supported the development of four use cases in a federated manner, overcoming limitations for the reuse of sensitive health data and providing a methodology to achieve interoperability in multiple research nodes.
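
The data visiting principle can be pictured with a toy example: the same analysis function is shipped to every node, runs against local records, and only aggregate results travel back to the coordinator. The sketch below is a simplified illustration under that reading; the function names and node data are hypothetical and it does not represent PHIRI's actual software.

```python
def local_summary(records):
    """Runs inside a node: returns aggregates only, never the records."""
    return {"n": len(records), "total": sum(records)}


def pooled_mean(summaries):
    """Runs at the coordinator: combines the aggregates from all nodes."""
    n = sum(s["n"] for s in summaries)
    return sum(s["total"] for s in summaries) / n


# Hypothetical per-node measurements that never leave their node.
node_a = [2.0, 4.0]
node_b = [6.0, 8.0, 10.0]
summaries = [local_summary(node_a), local_summary(node_b)]
overall_mean = pooled_mean(summaries)
```

Pooling counts and sums gives the same mean as pooling the raw records would, without any individual-level data being transferred.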


Subject(s)
Data Analysis , Routinely Collected Health Data , Humans
19.
Clin Chim Acta ; 561: 119811, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-38879064

ABSTRACT

BACKGROUND: Patient registries are crucial for rare disease management. However, manual registry construction is labor-intensive and often not user-friendly. Our goal is to establish Hong Kong's first computer-assisted patient identification tool for rare diseases, starting with inborn errors of metabolism (IEM). METHODS: Patient data from 2010 to 2019 was retrieved from electronic databases. Through big data analytics, patient data were filtered based on specific IEM-related biochemical and genetic tests. Clinical notes were analyzed using a rule-based natural language processing technique called regular expression. The algorithm classified each extracted paragraph as "IEM-related" or "not IEM-related." Pathologists reviewed the paragraphs for curation, and the algorithm's performance was evaluated. RESULTS: Out of 46,419 patients with IEM-related tests, the algorithm identified 100 as "IEM-related." After pathologists' validation, 96 cases were confirmed as true IEM, with 1 uncertain case and 3 false positives. A secondary ascertainment yielded a sensitivity of 92.3% compared to our previously published IEM cohort. CONCLUSIONS: Our artificial intelligence approach provides a novel method to identify IEM patients, facilitating the creation of a centralized, computer-assisted rare disease patient registry at the local and national levels. This data can potentially be accessed by multiple stakeholders for collaborative research and to enhance healthcare management for rare diseases.
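
The classification step described above, a regular expression labeling each paragraph, can be sketched as follows. The pattern is a hypothetical stand-in built from a few well-known IEM terms; the abstract does not publish the authors' actual rules, and the function name `classify_paragraph` is an assumption.

```python
import re

# Hypothetical keyword pattern; a real rule set would be far more extensive.
IEM_PATTERN = re.compile(
    r"\b(phenylketonuria"
    r"|maple syrup urine disease"
    r"|urea cycle (?:disorder|defect)"
    r"|inborn errors? of metabolism)\b",
    re.IGNORECASE,
)


def classify_paragraph(text):
    """Label a clinical-note paragraph by keyword match."""
    return "IEM-related" if IEM_PATTERN.search(text) else "not IEM-related"


label_hit = classify_paragraph("Known case of Phenylketonuria, on dietary control.")
label_miss = classify_paragraph("Admitted for community-acquired pneumonia.")
```

A keyword match alone cannot tell "phenylketonuria" from "phenylketonuria excluded", which is one reason a manual review step, like the pathologist curation reported above, remains useful.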


Subject(s)
Big Data , Metabolism, Inborn Errors , Rare Diseases , Registries , Humans , Rare Diseases/diagnosis , Metabolism, Inborn Errors/diagnosis , Algorithms , Data Analysis , Male , Female
20.
JMIR Ment Health ; 11: e55747, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38935419

ABSTRACT

BACKGROUND: Text-based digital media platforms have revolutionized communication and information sharing, providing valuable access to knowledge and understanding in the fields of mental health and suicide prevention. OBJECTIVE: This systematic review aimed to determine how machine learning and data analysis can be applied to text-based digital media data to understand mental health and aid suicide prevention. METHODS: A systematic review of research papers from the following major electronic databases was conducted: Web of Science, MEDLINE, Embase (via MEDLINE), and PsycINFO (via MEDLINE). The database search was supplemented by a hand search using Google Scholar. RESULTS: Overall, 19 studies were included, with five major themes as to how data analysis and machine learning techniques could be applied: (1) as predictors of personal mental health, (2) to understand how personal mental health and suicidal behavior are communicated, (3) to detect mental disorders and suicidal risk, (4) to identify help seeking for mental health difficulties, and (5) to determine the efficacy of interventions to support mental well-being. CONCLUSIONS: Our findings show that data analysis and machine learning can be used to gain valuable insights, such as the following: web-based conversations relating to depression vary among different ethnic groups, teenagers engage in a web-based conversation about suicide more often than adults, and people seeking support in web-based mental health communities feel better after receiving online support. Digital tools and mental health apps are being used successfully to manage mental health, particularly through the COVID-19 epidemic, during which analysis has revealed that there was increased anxiety and depression, and web-based communities played a part in reducing isolation during the pandemic. Predictive analytics were also shown to have potential, and virtual reality shows promising results in the delivery of preventive or curative care. 
Future research efforts could center on optimizing algorithms to enhance the potential of text-based digital media analysis in mental health and suicide prevention. In addressing depression, a crucial step involves identifying the factors that contribute to happiness and using machine learning to forecast these sources of happiness. This could extend to understanding how various activities result in improved happiness across different socioeconomic groups. Using insights gathered from such data analysis and machine learning, there is an opportunity to craft digital interventions, such as chatbots, designed to provide support and address mental health challenges and suicide prevention.


Subject(s)
Machine Learning , Suicide Prevention , Humans , Mental Health , Social Media , Data Analysis