1.
Sci Rep ; 14(1): 17782, 2024 Aug 01.
Article in English | MEDLINE | ID: mdl-39090143

ABSTRACT

Previous correlative and modeling approaches indicate influences of environmental factors on COVID-19 spread through atmospheric conditions' impact on virus survival and transmission or host susceptibility. However, causal connections from environmental factors to the pandemic, mediated by human mobility, received less attention. We use the technique of Convergent Cross Mapping to identify the causal connections, beyond correlation at the country level, between pairs of variables associated with weather conditions, human mobility, and the number of COVID-19 cases for 32 European states. Here, we present data-based evidence that the relatively reduced number of cases registered in Northern Europe is related to the causal impact of precipitation on people's decision to spend more time at home and that the relatively large number of cases observed in Southern Europe is linked to people's choice to spend time outdoors during warm days. We also emphasize the channels of the significant impact of the pandemic on human mobility. The weather-human mobility connections inferred here are relevant not only for COVID-19 spread but also for any other virus transmitted through human interactions. These results may help authorities and public health experts contain possible future waves of the COVID-19 pandemic or limit the threats of similar human-to-human transmitted viruses.


Subject(s)
COVID-19 , SARS-CoV-2 , Weather , COVID-19/epidemiology , COVID-19/transmission , COVID-19/virology , Humans , Europe/epidemiology , SARS-CoV-2/isolation & purification , SARS-CoV-2/pathogenicity , Pandemics , Data Analysis
2.
Eur J Public Health ; 34(Supplement_1): i43-i49, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38946447

ABSTRACT

BACKGROUND: The extensive and continuous reuse of sensitive health data could enhance the role of population health research on public decisions. This paper describes the design principles and the different building blocks that have supported the implementation and deployment of Population Health Information Research Infrastructure (PHIRI), the strengths and challenges of the approach and some future developments. METHODS: The design and implementation of PHIRI have been developed upon: (i) the data visiting principle: data does not move but code moves; (ii) the orchestration of the research question throughout a workflow that ensured legal, organizational, semantic and technological interoperability and (iii) a 'master-worker' federated computational architecture that supported the development of four use cases. RESULTS: Nine participant nodes and 28 Euro-Peristat members completed the deployment of the infrastructure according to the expected outputs. As a consequence, each use case produced and published its own common data model, the analytical pipeline and the corresponding research outputs. All the digital objects were developed and published according to Open Science and FAIR principles. CONCLUSION: PHIRI has successfully supported the development of four use cases in a federated manner, overcoming limitations for the reuse of sensitive health data and providing a methodology to achieve interoperability in multiple research nodes.


Subject(s)
Data Analysis , Routinely Collected Health Data , Humans
3.
Methods Mol Biol ; 2836: 67-76, 2024.
Article in English | MEDLINE | ID: mdl-38995536

ABSTRACT

Recently, HexNAcQuest was developed to help distinguish peptides modified by HexNAc isomers, more specifically O-linked β-N-acetylglucosamine (O-GlcNAc) and O-linked α-N-acetylgalactosamine (O-GalNAc, Tn antigen). To facilitate its usage (particularly for datasets from glycoproteomics studies), herein we present a detailed protocol. It describes example cases and procedures for which users might need to use HexNAcQuest to distinguish these two modifications.


Subject(s)
Proteomics , Software , Proteomics/methods , Isomerism , Humans , Acetylglucosamine/chemistry , Acetylglucosamine/metabolism , Glycopeptides/chemistry , Glycopeptides/analysis , Glycoproteins/chemistry , Acetylgalactosamine/chemistry , Data Analysis , Peptides/chemistry , Glycosylation
4.
BMC Bioinformatics ; 25(1): 232, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982382

ABSTRACT

BACKGROUND: Characterization of microbial growth is of both fundamental and applied interest. Modern platforms can automate collection of high-throughput microbial growth curves, necessitating the development of computational tools to handle and analyze these data to produce insights. RESULTS: To address this need, here I present a newly-developed R package: gcplyr. gcplyr can flexibly import growth curve data in common tabular formats, and reshapes it under a tidy framework that is flexible and extendable, enabling users to design custom analyses or plot data with popular visualization packages. gcplyr can also incorporate metadata and generate or import experimental designs to merge with data. Finally, gcplyr carries out model-free (non-parametric) analyses. These analyses do not require mathematical assumptions about microbial growth dynamics, and gcplyr is able to extract a broad range of important traits, including growth rate, doubling time, lag time, maximum density and carrying capacity, diauxie, area under the curve, extinction time, and more. CONCLUSIONS: gcplyr makes scripted analyses of growth curve data in R straightforward, streamlines common data wrangling and analysis steps, and easily integrates with common visualization and statistical analyses.
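
The model-free traits listed in this abstract can be illustrated with a minimal Python sketch. This is not gcplyr's API (gcplyr is an R package); the function name `growth_traits`, the dictionary keys, and the sample data are hypothetical and chosen only for illustration. It assumes strictly positive densities sampled at increasing time points.

```python
import math

def growth_traits(times, densities):
    """Model-free traits from a single growth curve (hypothetical helper)."""
    n = len(times)
    # Per-capita growth rate between consecutive points: d(ln N)/dt
    rates = [
        (math.log(densities[i + 1]) - math.log(densities[i]))
        / (times[i + 1] - times[i])
        for i in range(n - 1)
    ]
    max_rate = max(rates)
    # Area under the curve by the trapezoidal rule
    auc = sum(
        (times[i + 1] - times[i]) * (densities[i] + densities[i + 1]) / 2
        for i in range(n - 1)
    )
    return {
        "max_percap_rate": max_rate,
        # Doubling time at the fastest observed growth
        "doubling_time": math.log(2) / max_rate,
        "max_density": max(densities),
        "auc": auc,
    }

# Illustrative data: density doubles each hour at first, then slows down
times = [0, 1, 2, 3, 4]
densities = [0.01, 0.02, 0.04, 0.06, 0.07]
traits = growth_traits(times, densities)
```

No growth model is fitted here: every trait is read directly off finite differences of the observed curve, which is the sense of "non-parametric" used in the abstract.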


Subject(s)
Software , Computational Biology/methods , Data Analysis
5.
PLoS One ; 19(7): e0305038, 2024.
Article in English | MEDLINE | ID: mdl-38985781

ABSTRACT

The meta-learning method proposed in this paper addresses the issue of small-sample regression in the application of engineering data analysis, which is a highly promising direction for research. By integrating traditional regression models with optimization-based data augmentation from meta-learning, the proposed deep neural network demonstrates excellent performance in optimizing glass fiber reinforced plastic (GFRP) for wrapping concrete short columns. When compared with traditional regression models, such as Support Vector Regression (SVR), Gaussian Process Regression (GPR), and Radial Basis Function Neural Networks (RBFNN), the meta-learning method proposed here performs better in modeling small data samples. The success of this approach illustrates the potential of deep learning in dealing with limited amounts of data, offering new opportunities in the field of material data analysis.


Subject(s)
Construction Materials , Deep Learning , Glass , Neural Networks, Computer , Plastics , Data Analysis
6.
PLoS One ; 19(7): e0297930, 2024.
Article in English | MEDLINE | ID: mdl-38959245

ABSTRACT

Data analysis can be accurate and reliable only if the underlying assumptions of the used statistical method are validated. Any violations of these assumptions can change the outcomes and conclusions of the analysis. In this study, we developed Smart Data Analysis V2 (SDA-V2), an interactive and user-friendly web application, to assist users with limited statistical knowledge in data analysis, and it can be freely accessed at https://jularatchumnaul.shinyapps.io/SDA-V2/. SDA-V2 automatically explores and visualizes data, examines the underlying assumptions associated with the parametric test, and selects an appropriate statistical method for the given data. Furthermore, SDA-V2 can assess the quality of research instruments and determine the minimum sample size required for a meaningful study. However, while SDA-V2 is a valuable tool for simplifying statistical analysis, it does not replace the need for a fundamental understanding of statistical principles. Researchers are encouraged to combine their expertise with the software's capabilities to achieve the most accurate and credible results.


Subject(s)
Software , Humans , Data Analysis , User-Computer Interface , Data Interpretation, Statistical
7.
Nature ; 631(8022): 924-925, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39039191
8.
SLAS Discov ; 29(5): 100172, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38969289

ABSTRACT

The Cellular Thermal Shift Assay (CETSA) enables the study of protein-ligand interactions in a cellular context. It provides valuable information on the binding affinity and specificity of both small and large molecule ligands in a relevant physiological context, hence forming a unique tool in drug discovery. Though high-throughput lab protocols exist for scaling up CETSA, subsequent data analysis and quality control remain laborious and limit experimental throughput. Here, we introduce a scalable and robust data analysis workflow which allows integration of CETSA into routine high throughput screening (HT-CETSA). This new workflow automates data analysis and incorporates quality control (QC), including outlier detection, sample and plate QC, and result triage. We describe the workflow and show its robustness against typical experimental artifacts, show scaling effects, and discuss the impact of data analysis automation by eliminating manual data processing steps.
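
The abstract does not say which outlier-detection method the workflow uses. As one common possibility, the sketch below flags wells by a robust z-score based on the median absolute deviation (MAD). The function name `flag_outliers`, the 3.5 cutoff, and the signal values are assumptions made for illustration, not details from the paper.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Return indices whose MAD-based robust z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # All values (or most of them) are identical; nothing can be scored
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]

# Hypothetical control-well signals from one plate; one well is clearly off
signals = [100.0, 102.0, 98.0, 101.0, 99.0, 150.0, 100.0, 97.0]
outliers = flag_outliers(signals)
```

Median-based statistics are a natural fit for plate QC because a single aberrant well barely shifts the median or the MAD, whereas it can inflate a mean and standard deviation enough to mask itself.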


Subject(s)
High-Throughput Screening Assays , Workflow , High-Throughput Screening Assays/methods , Quality Control , Data Analysis , Automation/methods , Humans , Ligands , Drug Discovery/methods , Protein Binding
9.
JMIR Ment Health ; 11: e58352, 2024 Jul 18.
Article in English | MEDLINE | ID: mdl-39024004

ABSTRACT

BACKGROUND: Emotional clarity has often been assessed with self-report measures, but efforts have also been made to measure it passively, which has advantages such as avoiding potential inaccuracy in responses stemming from social desirability bias or poor insight into emotional clarity. Response times (RTs) to emotion items administered in ecological momentary assessments (EMAs) may be an indirect indicator of emotional clarity. Another proposed indicator is the drift rate parameter, which assumes that, aside from how fast a person responds to emotion items, the measurement of emotional clarity also requires the consideration of how careful participants were in providing responses. OBJECTIVE: This paper aims to examine the reliability and validity of RTs and drift rate parameters from EMA emotion items as indicators of individual differences in emotional clarity. METHODS: Secondary data analysis was conducted on data from 196 adults with type 1 diabetes who completed a 2-week EMA study involving the completion of 5 to 6 surveys daily. If lower RTs and higher drift rates (from EMA emotion items) were indicators of emotional clarity, we hypothesized that greater levels (ie, higher clarity) should be associated with greater life satisfaction; lower levels of neuroticism, depression, anxiety, and diabetes distress; and fewer difficulties with emotion regulation. Because prior literature suggested emotional clarity could be valence specific, EMA items for negative affect (NA) and positive affect were examined separately. RESULTS: Reliability of the proposed indicators of emotional clarity was acceptable with a small number of EMA prompts (ie, 4 to 7 prompts in total or 1 to 2 days of EMA surveys). Consistent with expectations, the average drift rate of NA items across multiple EMAs had expected associations with other measures, such as correlations of r=-0.27 (P<.001) with depression symptoms, r=-0.27 (P=.001) with anxiety symptoms, r=-0.15 (P=.03) with emotion regulation difficulties, and r=0.63 (P<.001) with RTs to NA items. People with a higher NA drift rate responded faster to NA emotion items, had greater subjective well-being (eg, fewer depression symptoms), and had fewer difficulties with overall emotion regulation, which are all aligned with the expectation for an emotional clarity measure. Contrary to expectations, the validities of average RTs to NA items, the drift rate of positive affect items, and RTs to positive affect items were not strongly supported by our results. CONCLUSIONS: Study findings provided initial support for the validity of NA drift rate as an indicator of emotional clarity but not for that of other RT-based clarity measures. Evidence was preliminary because the sample size was not sufficient to detect small but potentially meaningful correlations, as the sample size of the diabetes EMA study was chosen for other more primary research questions. Further research on passive emotional clarity measures is needed.


Subject(s)
Ecological Momentary Assessment , Emotions , Humans , Female , Male , Reproducibility of Results , Adult , Middle Aged , Diabetes Mellitus, Type 1/psychology , Reaction Time/physiology , Emotional Regulation/physiology , Data Analysis , Personal Satisfaction , Surveys and Questionnaires , Secondary Data Analysis
11.
Methods Mol Biol ; 2812: 1-9, 2024.
Article in English | MEDLINE | ID: mdl-39068354

ABSTRACT

In this chapter, we present an established pipeline for analyzing RNA-Seq data, which involves a step-by-step flow starting from raw data obtained from a sequencer and culminating in the identification of differentially expressed genes with their functional characterization. The pipeline is divided into three sections, each addressing crucial stages of the analysis process. The first section covers the initial steps of the pipeline, including downloading the data of interest and performing quality control assessment. This assessment ensures that the data used for analysis is reliable and suitable for downstream analyses. In the second section, gene-level quantification is performed, which entails quantification of expression levels of genes in the samples. The third and final section is focused on differential expression analysis, which involves comparing gene expression levels between two or more conditions. This step helps identify genes that show significant differences in expression levels under different experimental conditions. To facilitate accessibility and reproducibility, we have provided an online repository containing all scripts and files. Additionally, custom scripts are available, enabling users to modify the pipeline's output for various downstream analyses. By following this pipeline, researchers can effectively analyze RNA-Seq data and gain valuable insights into gene expression patterns and a better understanding of biological processes.
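
The core idea of the third stage, comparing expression levels between conditions, can be shown with a deliberately simplified Python sketch. It is not the chapter's pipeline: it only normalizes raw counts to counts per million (CPM) and computes a log2 fold change, with no replicates and no statistical test. The function names, the pseudocount, and the gene counts are hypothetical.

```python
import math

def cpm(counts):
    """Scale raw gene counts to counts per million for one sample."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}

def log2_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per gene on CPM-normalized values."""
    c, t = cpm(control), cpm(treated)
    # The pseudocount avoids division by zero for unexpressed genes
    return {
        gene: math.log2((t[gene] + pseudocount) / (c[gene] + pseudocount))
        for gene in control
    }

# Hypothetical two-gene samples with equal library sizes
control = {"geneA": 100, "geneB": 900}
treated = {"geneA": 400, "geneB": 600}
lfc = log2_fold_change(control, treated)
```

Here `geneA` comes out at roughly +2 (a fourfold increase) and `geneB` is negative. A real analysis would additionally model variability across replicates and correct for multiple testing before calling a gene differentially expressed.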


Subject(s)
Gene Expression Profiling , RNA-Seq , Software , RNA-Seq/methods , Gene Expression Profiling/methods , Computational Biology/methods , Humans , Sequence Analysis, RNA/methods , High-Throughput Nucleotide Sequencing/methods , Reproducibility of Results , Quality Control , Data Analysis , Transcriptome/genetics
12.
Se Pu ; 42(7): 669-680, 2024 Jul.
Article in Chinese | MEDLINE | ID: mdl-38966975

ABSTRACT

Mass spectrometry imaging (MSI) is a promising method for characterizing the spatial distribution of compounds. Given the diversified development of acquisition methods and continuous improvements in the sensitivity of this technology, both the total amount of generated data and the complexity of analysis have increased exponentially, posing growing challenges for data postprocessing, such as large amounts of noise, background signal interference, and image registration deviations caused by sample position changes and scan deviations. Deep learning (DL) is a powerful tool widely used in data analysis and image reconstruction. This tool enables the automatic feature extraction of data by building and training a neural network model, and achieves comprehensive and in-depth analysis of target data through transfer learning, which has great potential for MSI data analysis. This paper reviews the current research status, application progress and challenges of DL in MSI data analysis, focusing on four core stages: data preprocessing, image reconstruction, cluster analysis, and multimodal fusion. The application of a combination of DL and mass spectrometry imaging in the study of tumor diagnosis and subtype classification is also illustrated. This review also discusses future development trends, aiming to promote a better combination of artificial intelligence and mass spectrometry technology.


Subject(s)
Deep Learning , Image Processing, Computer-Assisted , Mass Spectrometry , Mass Spectrometry/methods , Image Processing, Computer-Assisted/methods , Humans , Data Analysis
13.
Article in English | MEDLINE | ID: mdl-38977033

ABSTRACT

PURPOSE: This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under two stopping rules (SEM 0.3 and 0.25) using both real and simulated data in medical examinations in Korea. METHODS: This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated using estimated parameters from a real item bank in R. Outcome variables included the number of examinees passing or failing with SEM values of 0.25 and 0.30, the number of items administered, and the correlation. The consistency of the real CAT results was evaluated by examining the consistency of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules. RESULTS: Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r = 0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data. CONCLUSION: The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
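
To make the two stopping rules concrete, the following Python sketch computes the standard error of measurement (SEM) under the Rasch model, where each item contributes information p(1 - p) and SEM is the inverse square root of the summed information. It is a simplification, not the study's simulation: ability is held fixed at a known value instead of being re-estimated after each response, items are given in bank order instead of being selected adaptively, and the function names and item bank are hypothetical.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def sem(theta, difficulties):
    """SEM = 1 / sqrt(test information), with item information p * (1 - p)."""
    info = 0.0
    for b in difficulties:
        p = rasch_prob(theta, b)
        info += p * (1.0 - p)
    return 1.0 / math.sqrt(info)

def items_needed(theta, item_bank, sem_target):
    """Items given (in bank order) until SEM <= sem_target; None if never reached."""
    administered = []
    for b in item_bank:
        administered.append(b)
        if sem(theta, administered) <= sem_target:
            return len(administered)
    return None

# Best case: every item is perfectly targeted (difficulty equals ability),
# so each item contributes the maximum information of 0.25
bank = [0.0] * 100
n_030 = items_needed(0.0, bank, 0.30)
n_025 = items_needed(0.0, bank, 0.25)
```

Even with perfectly targeted items, tightening the rule from 0.30 to 0.25 raises the minimum test length from 45 to 64 items, because SEM equals 2 / sqrt(n) in this case. That gap is the efficiency cost that the study weighs against accuracy.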


Subject(s)
Educational Measurement , Psychometrics , Students, Medical , Humans , Educational Measurement/methods , Educational Measurement/standards , Republic of Korea , Psychometrics/methods , Computer Simulation , Data Analysis , Education, Medical, Undergraduate/methods , Male , Female
15.
Sensors (Basel) ; 24(11)2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38894472

ABSTRACT

Human trajectories can be tracked by the internal processing of a camera as an edge device. This work aims to match people's trajectories obtained from cameras to sensor data, such as acceleration and angular velocity, obtained from wearable devices. Since human trajectory and sensor data differ in modality, the matching method is not straightforward. Furthermore, complete trajectory information is unavailable; it is difficult to determine which fragments belong to whom. To solve this problem, we propose the SyncScore model to find the similarity between a unit period trajectory and the corresponding sensor data. We also propose a Likelihood Fusion algorithm that systematically updates the similarity data and integrates it over time while keeping other trajectories in mind. We confirmed that the proposed method can match human trajectories and sensor data with an accuracy, a sensitivity, and an F1 of 0.725. Our models achieved decent results on the UEA dataset.


Subject(s)
Algorithms , Wearable Electronic Devices , Humans , Data Analysis
16.
Methods Mol Biol ; 2817: 177-220, 2024.
Article in English | MEDLINE | ID: mdl-38907155

ABSTRACT

Mass-spectrometry (MS)-based single-cell proteomics (SCP) explores cellular heterogeneity by focusing on the functional effectors of the cells: proteins. However, extracting meaningful biological information from MS data is far from trivial, especially with single cells. Currently, data analysis workflows are substantially different from one research team to another. Moreover, it is difficult to evaluate pipelines as ground truths are missing. Our team has developed the R/Bioconductor package called scp to provide a standardized framework for SCP data analysis. It relies on the widely used QFeatures and SingleCellExperiment data structures. In addition, we used a design containing cell lines mixed in known proportions to generate controlled variability for data analysis benchmarking. In this chapter, we provide a flexible data analysis protocol for SCP data using the scp package together with comprehensive explanations at each step of the processing. Our main steps are quality control on the feature and cell level, aggregation of the raw data into peptides and proteins, normalization, and batch correction. We validate our workflow using our ground truth data set. We illustrate how to use this modular, standardized framework and highlight some crucial steps.


Subject(s)
Mass Spectrometry , Proteomics , Single-Cell Analysis , Software , Workflow , Proteomics/methods , Proteomics/standards , Single-Cell Analysis/methods , Mass Spectrometry/methods , Humans , Computational Biology/methods , Proteome/analysis , Data Analysis
18.
Methods Mol Biol ; 2822: 263-290, 2024.
Article in English | MEDLINE | ID: mdl-38907924

ABSTRACT

RNA-Seq data analysis stands as a vital part of genomics research, turning vast and complex datasets into meaningful biological insights. It is a field marked by rapid evolution and ongoing innovation, necessitating a thorough understanding for anyone seeking to unlock the potential of RNA-Seq data. In this chapter, we describe the intricate landscape of RNA-seq data analysis, elucidating a comprehensive pipeline that navigates through the entirety of this complex process. Beginning with quality control, the chapter underscores the paramount importance of ensuring the integrity of RNA-seq data, as it lays the groundwork for subsequent analyses. Preprocessing is then addressed, where the raw sequence data undergoes necessary modifications and enhancements, setting the stage for the alignment phase. This phase involves mapping the processed sequences to a reference genome, a step pivotal for decoding the origins and functions of these sequences. Venturing into the heart of RNA-seq analysis, the chapter then explores differential expression analysis, the process of identifying genes that exhibit varying expression levels across different conditions or sample groups. Recognizing the biological context of these differentially expressed genes is pivotal; hence, the chapter transitions into functional analysis. Here, methods and tools like Gene Ontology and pathway analyses help contextualize the roles and interactions of the identified genes within broader biological frameworks. However, the chapter does not stop at conventional analysis methods. Embracing the evolving paradigms of data science, it delves into machine learning applications for RNA-seq data, introducing advanced techniques in dimension reduction and both unsupervised and supervised learning. These approaches allow for patterns and relationships to be discerned in the data that might be imperceptible through traditional methods.


Subject(s)
Computational Biology , RNA-Seq , Software , RNA-Seq/methods , Humans , Computational Biology/methods , Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Genomics/methods , Data Analysis , Gene Ontology , High-Throughput Nucleotide Sequencing/methods
19.
Sci Rep ; 14(1): 14158, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898123

ABSTRACT

Genome analysis in cancer has focused mainly on elucidating the function and regulatory mechanisms of genes that exhibit differential expression or mutation in cancer samples compared to normal samples. Recently, transcriptome analysis revealed that abnormal splicing events in cancer samples could contribute to cancer pathogenesis. Moreover, splicing variants in cancer reportedly generate diverse cancer antigens. Although abnormal splicing events are expected to be potential targets in cancer immunotherapy, the exploration of such targets and their biological significance in cancer have not been fully understood. In this study, to explore subtype-specific alternative splicing events, we conducted a comprehensive analysis of splicing events for each breast cancer subtype using large-scale splicing data derived from The Cancer Genome Atlas and found subtype-specific alternative splicing patterns. Analyses indicated that genes that produce subtype-specific alternative splicing events are potential novel targets for immunotherapy against breast cancer. The subtype-specific alternative splicing events identified in this study, which were not identified by mutation or differential expression analysis, bring new significance to previously overlooked splicing events.


Subject(s)
Alternative Splicing , Breast Neoplasms , Gene Expression Regulation, Neoplastic , Humans , Alternative Splicing/genetics , Breast Neoplasms/genetics , Female , Gene Expression Profiling , Mutation , Data Analysis
20.
Medicina (Kaunas) ; 60(6)2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38929556

ABSTRACT

Background and Objectives: Although statins are recommended for secondary prevention of acute ischemic stroke, some population-based studies and clinical evidence suggest that they might be used with an increased risk of intracranial hemorrhage. In this nested case-control study, we used Taiwan's nationwide universal health insurance database to investigate the possible association between statin therapy prescribed to acute ischemic stroke patients and their risk of subsequent intracerebral hemorrhage and all-cause mortality in Taiwan. Materials and Methods: All data were retrospectively obtained from Taiwan's National Health Insurance Research Database. Acute ischemic stroke patients were divided into a cohort receiving statin pharmacotherapy and a control cohort not receiving statin pharmacotherapy. A 1:1 matching for age, gender, and index day, and propensity score matching was conducted, producing 39,366 cases and 39,366 controls. The primary outcomes were long-term subsequent intracerebral hemorrhage and all-cause mortality. The competing risk between subsequent intracerebral hemorrhage and all-cause mortality was estimated using the Fine and Gray regression hazards model. Results: Patients receiving statin pharmacotherapy after an acute ischemic stroke had a significantly lower risk of subsequent intracerebral hemorrhage (p < 0.0001) and lower all-cause mortality rates (p < 0.0001). Low, moderate, and high dosages of statin were associated with significantly decreased risks for subsequent intracerebral hemorrhage (adjusted sHRs 0.82, 0.74, 0.53) and all-cause mortality (adjusted sHRs 0.75, 0.74, 0.74), respectively. Conclusions: Statin pharmacotherapy was found to safely and effectively reduce the risk of subsequent intracerebral hemorrhage and all-cause mortality in acute ischemic stroke patients in Taiwan.


Subject(s)
Big Data , Cerebral Hemorrhage , Hydroxymethylglutaryl-CoA Reductase Inhibitors , Ischemic Stroke , Humans , Taiwan/epidemiology , Hydroxymethylglutaryl-CoA Reductase Inhibitors/therapeutic use , Hydroxymethylglutaryl-CoA Reductase Inhibitors/adverse effects , Female , Male , Cerebral Hemorrhage/mortality , Aged , Middle Aged , Case-Control Studies , Retrospective Studies , Ischemic Stroke/prevention & control , Ischemic Stroke/epidemiology , Aged, 80 and over , Data Analysis , Risk Factors , Propensity Score