ABSTRACT
Adverse childhood experiences have been linked to detrimental mental health outcomes in adulthood. This study investigates a potential neurodevelopmental pathway between adversity and mental health outcomes: brain connectivity. We used data from the prospective, longitudinal Adolescent Brain Cognitive Development (ABCD) study (N ≈ 12,000, participants aged 9-13 years, male and female) and assessed structural brain connectivity using fractional anisotropy (FA) of white matter tracts. The adverse experiences modeled included family conflict and traumatic experiences. K-means clustering and latent basis growth models were used to determine subgroups based on total levels and trajectories of brain connectivity. Multinomial regression was used to determine associations between cluster membership and adverse experiences. The results showed that higher family conflict was associated with higher FA levels across brain tracts (e.g., t(3) = -3.81, β = -0.09, p_bonf = 0.003) and within the corpus callosum (CC), fornix, and anterior thalamic radiations (ATR). A decreasing FA trajectory across two brain imaging timepoints was linked to lower socioeconomic status and neighborhood safety. Socioeconomic status was related to FA across brain tracts (e.g., t(3) = 3.44, β = 0.10, p_bonf = 0.01), the CC and the ATR. Neighborhood safety was associated with FA in the fornix and ATR (e.g., t(1) = 3.48, β = 0.09, p_bonf = 0.01). There is a complex and multifaceted relationship between adverse experiences and brain development, where adverse experiences during early adolescence are related to brain connectivity. These findings underscore the importance of studying adverse experiences beyond early childhood to understand lifespan developmental outcomes.
Subjects
Diffusion Tensor Imaging , White Matter , Humans , Male , Adolescent , Child, Preschool , Female , Prospective Studies , Diffusion Tensor Imaging/methods , Brain/diagnostic imaging , White Matter/diagnostic imaging , Corpus Callosum , Anisotropy
ABSTRACT
Studies have confirmed that the occurrence of many complex diseases in the human body is closely related to the microbial community, and that microbes can affect tumorigenesis and metastasis by regulating the tumor microenvironment. However, there are still large gaps in the clinical observation of the microbiota in disease. Although biological experiments are accurate in identifying disease-associated microbes, they are also time-consuming and expensive. Computational models that effectively identify disease-related microbes can shorten this process and reduce capital and time costs. On this basis, this paper presents a model named DSAE_RF that predicts latent microbe-disease associations by combining multi-source features and deep learning. DSAE_RF calculates four similarities between microbes and diseases, which are then used as feature vectors for the disease-microbe pairs. Next, reliable negative samples are screened by k-means clustering, and a deep sparse autoencoder neural network is used to extract effective features of the disease-microbe pairs. On this foundation, a random forest classifier is used to predict the associations between microbes and diseases. To assess the performance of the model, 10-fold cross-validation is implemented on the same dataset. The AUC and AUPR of the model are 0.9448 and 0.9431, respectively. Furthermore, we conduct a variety of experiments, including a comparison of negative sample selection methods, comparisons with different models and classifiers, the Kolmogorov-Smirnov test and t-test, ablation experiments, robustness analysis, and case studies on COVID-19 and colorectal cancer. The results demonstrate the reliability and availability of our model.
Subjects
COVID-19 , Deep Learning , Microbiota , Humans , Reproducibility of Results , Algorithms , Computational Biology/methods
ABSTRACT
BACKGROUND: Metagene plots provide a visualization of biological signal trends over subsections of the genome and are used to perform high-level analysis of experimental data by aggregating genome-level data to create an average profile. The generation of metagene plots is useful for summarizing the results of many sequencing-based applications. Despite their prevalence and utility, the standard metagene plot is blind to conflicting signals within data. If multiple distinct trends occur, they can interact destructively, creating a plot that does not accurately represent any of the underlying trends. RESULTS: We present MetageneCluster, a Python tool to generate a collection of representative metagene plots based on k-means clustering of genomic regions of interest. Clustering the data by similarity allows us to identify patterns within the features of interest. We are then able to summarize each pattern present in the data, rather than averaging across the entire feature space. We show that our method performs well when used to identify conflicting signals in real-world genome-level data. CONCLUSIONS: Overall, MetageneCluster is a user-friendly tool for the creation of metagene plots that capture distinct patterns in underlying sequence data.
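The destructive averaging described in this abstract can be illustrated with a small sketch. This is not MetageneCluster's code: the toy signal profiles, the function names, and the plain Lloyd's k-means below are assumptions made for illustration only, using the Python standard library.

```python
import random

def mean_profile(profiles):
    """Column-wise average of equal-length profiles (a single metagene profile)."""
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]

def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(profiles, k)]
    labels = [0] * len(profiles)
    for _ in range(iters):
        # Assignment step: nearest centroid for every region
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in profiles]
        # Update step: an empty cluster keeps its previous centroid
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            new_centroids.append(mean_profile(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical signal over 5 bins: half the regions rise, half fall
rising = [[0.0, 1.0, 2.0, 3.0, 4.0] for _ in range(10)]
falling = [[4.0, 3.0, 2.0, 1.0, 0.0] for _ in range(10)]
regions = rising + falling

overall = mean_profile(regions)           # flat line: matches neither trend
labels, centroids = kmeans(regions, k=2)  # one centroid per underlying trend
```

In this toy case the single average is flat, while the two cluster centroids recover the rising and falling trends separately, which is the behavior the abstract attributes to per-cluster metagene plots.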
Subjects
Genome , Genomics , Genomics/methods , Software
ABSTRACT
BACKGROUND: As one of the world's most important beverage crops, tea plants (Camellia sinensis) are renowned for their unique flavors and numerous beneficial secondary metabolites, attracting researchers to investigate the formation of tea quality. With the increasing availability of transcriptome data on tea plants in public databases, conducting large-scale co-expression analyses has become feasible to meet the demand for functional characterization of tea plant genes. However, as the multidimensional noise increases, larger-scale co-expression analyses are not always effective. Analyzing a subset of samples generated by effectively downsampling and reorganizing the global sample set often leads to more accurate results in co-expression analysis. Meanwhile, global-based co-expression analyses are more likely to overlook condition-specific gene interactions, which may be more important and worthy of exploration and research. RESULTS: Here, we employed the k-means clustering method to organize and classify the global samples of tea plants, resulting in clustered samples. Metadata annotations were then performed on these clustered samples to determine the "conditions" represented by each cluster. Subsequently, we conducted gene co-expression network analysis (WGCNA) separately on the global samples and the clustered samples, resulting in global modules and cluster-specific modules. Comparative analyses of global modules and cluster-specific modules have demonstrated that cluster-specific modules exhibit higher accuracy in co-expression analysis. To measure the degree of condition specificity of genes within condition-specific clusters, we introduced the correlation difference value (CDV). By incorporating the CDV into co-expression analyses, we can assess the condition specificity of genes. 
This approach proved instrumental in identifying a series of high CDV transcription factor encoding genes upregulated during sustained cold treatment in Camellia sinensis leaves and buds, and pinpointing a pair of genes that participate in the antioxidant defense system of tea plants under sustained cold stress. CONCLUSIONS: To summarize, downsampling and reorganizing the sample set improved the accuracy of co-expression analysis. Cluster-specific modules were more accurate in capturing condition-specific gene interactions. The introduction of CDV allowed for the assessment of condition specificity in gene co-expression analyses. Using this approach, we identified a series of high CDV transcription factor encoding genes related to sustained cold stress in Camellia sinensis. This study highlights the importance of considering condition specificity in co-expression analysis and provides insights into the regulation of the cold stress in Camellia sinensis.
Subjects
Camellia sinensis , Camellia sinensis/genetics , Camellia sinensis/metabolism , Cluster Analysis , Genes, Plant , Gene Expression Profiling/methods , Data Mining/methods , Transcriptome , Gene Expression Regulation, Plant , Gene Regulatory Networks
ABSTRACT
BACKGROUND: Cardiovascular disease (CVD) is closely associated with the triglyceride glucose (TyG) index and its related indicators, particularly its combination with obesity indices. However, there is limited research on the relationship between changes in TyG-related indices and CVD, as most studies have focused on baseline TyG-related indices. METHODS: The data for this prospective cohort study were obtained from the China Health and Retirement Longitudinal Study. The exposures were changes in TyG-related indices and cumulative TyG-related indices from 2012 to 2015. The K-means algorithm was used to classify changes in each TyG-related index into four classes (Class 1 to Class 4). Multivariate logistic regressions were used to evaluate the associations between the changes in TyG-related indices and the incidence of CVD. RESULTS: In total, 3243 participants were included in this study, of whom 1761 (54.4%) were female, with a mean age of 57.62 years at baseline. Over a 5-year follow-up, 637 (19.6%) participants developed CVD. Fully adjusted logistic regression analyses revealed significant positive associations between changes in TyG-related indices, cumulative TyG-related indices and the incidence of CVD. Among these changes in TyG-related indices, changes in TyG-waist circumference (WC) showed the strongest association with incident CVD. Compared to the participants in Class 1 of changes in TyG-WC, the odds ratio (OR) for participants in Class 2 was 1.41 (95% confidence interval (CI) 1.08-1.84), the OR for participants in Class 3 was 1.54 (95% CI 1.15-2.07), and the OR for participants in Class 4 was 1.94 (95% CI 1.34-2.80). Moreover, cumulative TyG-WC exhibited the strongest association with incident CVD among cumulative TyG-related indices. 
Compared to the participants in Quartile 1 of cumulative TyG-WC, the OR for participants in Quartile 2 was 1.33 (95% CI 1.00-1.76), the OR for participants in Quartile 3 was 1.46 (95% CI 1.09-1.96), and the OR for participants in Quartile 4 was 1.79 (95% CI 1.30-2.47). CONCLUSIONS: Changes in TyG-related indices are independently associated with the risk of CVD. Changes in TyG-WC are expected to become more effective indicators for identifying individuals at a heightened risk of CVD.
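For readers unfamiliar with the indices named in this abstract, the following sketch shows how they are commonly computed. The abstract itself does not state the formulas, so everything here is an assumption: the TyG index uses the widely cited definition ln(TG × FPG / 2) with both values in mg/dL, TyG-WC multiplies it by waist circumference in cm, and the cumulative index is taken as the mean of the two visits multiplied by the years between them. The function names and example values are hypothetical.

```python
import math

def tyg_index(tg_mg_dl, fpg_mg_dl):
    """TyG index: ln(triglycerides [mg/dL] * fasting glucose [mg/dL] / 2)."""
    return math.log(tg_mg_dl * fpg_mg_dl / 2)

def tyg_wc(tg_mg_dl, fpg_mg_dl, waist_cm):
    """TyG-WC: TyG index multiplied by waist circumference in cm."""
    return tyg_index(tg_mg_dl, fpg_mg_dl) * waist_cm

def cumulative_index(value_first, value_second, years_between):
    """Assumed cumulative exposure: mean of the two visits times elapsed years."""
    return (value_first + value_second) / 2 * years_between

# Hypothetical participant measured in 2012 and 2015
first = tyg_wc(tg_mg_dl=150, fpg_mg_dl=100, waist_cm=85)
second = tyg_wc(tg_mg_dl=180, fpg_mg_dl=110, waist_cm=90)
change = second - first                           # input to the change-based classes
cumulative = cumulative_index(first, second, 2015 - 2012)
```

With these inputs the TyG index at the first visit is ln(7500), roughly 8.92. The per-participant pairs of values would then be what a clustering step such as k-means groups into change classes.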
Subjects
Biomarkers , Blood Glucose , Cardiovascular Diseases , Obesity , Triglycerides , Humans , Female , Middle Aged , Male , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/blood , Prospective Studies , Triglycerides/blood , Incidence , Risk Assessment , China/epidemiology , Blood Glucose/metabolism , Obesity/epidemiology , Obesity/diagnosis , Obesity/blood , Aged , Biomarkers/blood , Longitudinal Studies , Time Factors , Prognosis , Heart Disease Risk Factors , Predictive Value of Tests , Risk Factors
ABSTRACT
BACKGROUND: Insulin resistance is linked to an increased risk of frailty, yet the comprehensive relationship between the triglyceride glucose-body mass index (TyG-BMI), which reflects weight, and frailty, remains unclear. This relationship is investigated in this study. METHODS: Data from 9135 participants in the China Health and Retirement Longitudinal Study (2011-2020) were analysed. Baseline TyG-BMI, changes in the TyG-BMI and cumulative TyG-BMI between baseline and 2015, along with the frailty index (FI) over nine years, were calculated. Participants were grouped into different categories based on TyG-BMI changes using K-means clustering. FI trajectories were assessed using a group-based trajectory model. Logistic and Cox regression models were used to analyse the associations between the TyG-BMI and FI trajectory and frail incidence. Nonlinear relationships were explored using restricted cubic splines, and a linear mixed-effects model was used to evaluate FI development speed. Weighted quantile regression was used to identify the primary contributing factors. RESULTS: Four classes of changes in the TyG-BMI and two FI trajectories were identified. Individuals in the third (OR = 1.25, 95% CI: 1.10-1.42) and fourth (OR = 1.83, 95% CI: 1.61-2.09) quartiles of baseline TyG-BMI, those with consistently second to highest (OR = 1.49, 95% CI: 1.32-1.70) and the highest (OR = 2.17, 95% CI: 1.84-2.56) TyG-BMI changes, and those in the third (OR = 1.20, 95% CI: 1.05-1.36) and fourth (OR = 1.94, 95% CI: 1.70-2.22) quartiles of the cumulative TyG-BMI had greater odds of experiencing a rapid FI trajectory. Higher frail risk was noted in those in the fourth quartile of baseline TyG-BMI (HR = 1.42, 95% CI: 1.28-1.58), with consistently second to highest (HR = 1.23, 95% CI: 1.12-1.34) and the highest TyG-BMI changes (HR = 1.58, 95% CI: 1.42-1.77), and those in the third (HR = 1.10, 95% CI: 1.00-1.21) and fourth quartile of cumulative TyG-BMI (HR = 1.46, 95% CI: 1.33-1.60). 
Participants with persistently second-lowest to the highest TyG-BMI changes (β = 0.15, 0.38 and 0.76, respectively) and those in the third and fourth quartiles of cumulative TyG-BMI (β = 0.25 and 0.56, respectively) demonstrated accelerated FI progression. A U-shaped association was observed between TyG-BMI levels and both rapid FI trajectory and higher frailty risk, with BMI being the primary factor. CONCLUSION: A higher TyG-BMI is associated with a rapid FI trajectory and a greater risk of frailty. However, excessively low TyG-BMI levels also appear to contribute to the development of frailty. Maintaining a healthy TyG-BMI, especially a healthy BMI, may help prevent or delay the onset of frailty.
Subjects
Biomarkers , Blood Glucose , Body Mass Index , Frail Elderly , Frailty , Geriatric Assessment , Triglycerides , Humans , Male , Frailty/epidemiology , Frailty/diagnosis , Frailty/blood , Female , Middle Aged , Aged , China/epidemiology , Incidence , Blood Glucose/metabolism , Triglycerides/blood , Risk Factors , Risk Assessment , Longitudinal Studies , Time Factors , Age Factors , Biomarkers/blood , Insulin Resistance , Prognosis , Aged, 80 and over
ABSTRACT
BACKGROUND: The triglyceride-glucose (TyG) index and its combination with obesity indicators can predict cardiovascular diseases (CVD). However, there is limited research on the relationship between changes in the triglyceride glucose-waist height ratio (TyG-WHtR) and CVD. Our study aims to investigate the relationship between the change in the TyG-WHtR and the risk of CVD. METHODS: Participants were from the China Health and Retirement Longitudinal Study (CHARLS). CVD was defined as self-reporting heart disease and stroke. Participants were divided into three groups based on changes in TyG-WHtR using K-means cluster analysis. Multivariable binary logistic regression analysis was used to examine the association between different groups (based on the change of TyG-WHtR) and CVD. A restricted cubic spline (RCS) regression model was used to explore the potential nonlinear association of the cumulative TyG-WHtR and CVD events. RESULTS: During follow-up between 2015 and 2020, 623 (18.8%) of 3312 participants developed CVD. After adjusting for various potential confounders, compared to the participants with consistently low and stable TyG-WHtR, the risk of CVD was significantly higher in participants with moderate and increasing TyG-WHtR (OR 1.28, 95%CI 1.01-1.63) and participants with high TyG-WHtR with a slowly increasing trend (OR 1.58, 95%CI 1.16-2.15). Higher levels of cumulative TyG-WHtR were independently associated with a higher risk of CVD events (per SD, OR 1.27, 95%CI 1.12-1.43). CONCLUSIONS: For middle-aged and older adults, changes in the TyG-WHtR are independently associated with the risk of CVD. Maintaining a favorable TyG index, effective weight management, and a reasonable waist circumference contribute to preventing CVD.
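The odds ratios and confidence intervals reported above follow the usual logistic regression relationship OR = exp(β) with a Wald interval exp(β ± z·SE). The sketch below, with hypothetical function names, shows that relationship and how an approximate coefficient and standard error can be backed out of a published OR and interval. It assumes a symmetric Wald interval on the log-odds scale, which the abstract does not state, and the recovered values are limited by the rounding of the published figures.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def odds_ratio_ci(beta, se, z=Z95):
    """Odds ratio and Wald confidence limits from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def beta_se_from_report(odds_ratio, ci_low, ci_high, z=Z95):
    """Approximate coefficient and standard error from a published OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se

# Per-SD cumulative TyG-WHtR result quoted in the abstract: OR 1.27 (1.12-1.43)
beta, se = beta_se_from_report(1.27, 1.12, 1.43)
or_back, low_back, high_back = odds_ratio_ci(beta, se)
```

The round trip reproduces the reported odds ratio exactly and an interval of the same width on the log scale; small differences in the limits reflect rounding in the published numbers.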
Subjects
Biomarkers , Blood Glucose , Cardiovascular Diseases , Triglycerides , Humans , Female , Male , Middle Aged , China/epidemiology , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/blood , Triglycerides/blood , Aged , Risk Assessment , Blood Glucose/metabolism , Biomarkers/blood , Longitudinal Studies , Waist-Height Ratio , Age Factors , Time Factors , Prognosis , Predictive Value of Tests , Risk Factors , Heart Disease Risk Factors , Incidence , East Asian People
ABSTRACT
The presence of a normal large blood vessel (LBV) in a tumor region can impact the evaluation of quantitative dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) parameters and tumor classification. Hence, there is a need for automatic removal of LBVs from brain tissues including intratumoral regions for achieving an objective assessment of tumors. This retrospective study included 103 histopathologically confirmed brain tumor patients who underwent MRI, including DCE-MRI data acquisition. Quantitative DCE-MRI analysis was performed for computing various parameters such as wash-out slope (Slope-2), relative cerebral blood volume (rCBV), relative cerebral blood flow (rCBF), blood plasma volume fraction (Vp), and volume transfer constant (Ktrans). An approach based on data-clustering algorithm, morphological operations, and quantitative DCE-MRI maps was proposed for the segmentation of normal LBVs in brain tissues, including the tumor region. Here, three widely used data-clustering algorithms were evaluated on two types of quantitative maps: (a) Slope-2, and (b) a new proposed combination of rCBV and Slope-2 maps. Fluid-attenuated inversion recovery-MRI hyperintense lesions were also automatically segmented using deep learning-based architecture. The accuracy of LBV segmentation was qualitatively assessed blindly by two experienced observers, and Likert scoring was also obtained from each individual and compared using Cohen's Kappa test, and multiple statistical features from quantitative DCE-MRI parameters were obtained in the segmented tumor. t-test and receiver operating characteristic (ROC) curve analysis were performed for comparing the effect of removal of LBVs on parameters as well as on tumor grading. k-means clustering exhibited better accuracy and computational efficiency. 
Tumors, in particular high-grade gliomas (HGGs), showed a high contrast compared with normal tissues (relative % difference = 18.5%) on quantitative maps after the removal of LBVs. Statistical features (95th percentile values) of all parameters in the tumor region showed a statistically significant difference (p < 0.05) between with and without LBV maps. Similar results were obtained for the ROC curve analysis for differentiation between low-grade gliomas and HGGs. Moreover, after the removal of LBVs, the rCBV, rCBF, and Vp maps show better visualization of tumor regions.
Subjects
Brain Neoplasms , Contrast Media , Magnetic Resonance Imaging , Humans , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Brain Neoplasms/blood supply , Magnetic Resonance Imaging/methods , Female , Male , Middle Aged , Adult , Aged , Automation , Retrospective Studies , Algorithms , Young Adult , Blood Vessels/diagnostic imaging , Blood Vessels/pathology , Cerebral Blood Volume , Cerebrovascular Circulation
ABSTRACT
BACKGROUND: Continuous glucose monitoring has facilitated the evaluation of dynamic changes in glucose throughout the day and their effect on fetal growth abnormalities in pregnancy. However, studies of multiple continuous glucose monitoring metrics combined and their association with other adverse pregnancy outcomes are limited. OBJECTIVE: This study aimed to (1) use machine learning techniques to identify discrete glucose profiles based on weekly continuous glucose monitoring metrics in pregnant individuals with pregestational diabetes mellitus and (2) investigate their association with adverse pregnancy outcomes. STUDY DESIGN: This study analyzed data from a retrospective cohort study of pregnant patients with type 1 or 2 diabetes mellitus who used Dexcom G6 continuous glucose monitoring and delivered a nonanomalous, singleton pregnancy at a tertiary center between 2019 and 2023. Continuous glucose monitoring data were collapsed into 39 weekly glycemic measures related to centrality, spread, excursions, and circadian cycle patterns. Principal component analysis and k-means clustering were used to identify 4 discrete groups, and patients were assigned to the group that best represented their continuous glucose monitoring patterns during pregnancy. Finally, the association between glucose profile groups and outcomes (preterm birth, cesarean delivery, preeclampsia, large-for-gestational-age neonate, neonatal hypoglycemia, and neonatal intensive care unit admission) was estimated using multivariate logistic regression adjusted for diabetes mellitus type, maternal age, insurance, continuous glucose monitoring use before pregnancy, and parity. RESULTS: Of 177 included patients, 90 (50.8%) had type 1 diabetes mellitus, and 85 (48.3%) had type 2 diabetes mellitus. 
This study identified 4 glucose profiles: (1) well controlled; (2) suboptimally controlled with high variability, fasting hypoglycemia, and daytime hyperglycemia; (3) suboptimally controlled with minimal circadian variation; and (4) poorly controlled with peak hyperglycemia overnight. Compared with the well-controlled profile, the suboptimally controlled profile with high variability had higher odds of a large-for-gestational-age neonate (adjusted odds ratio, 3.34; 95% confidence interval, 1.15-9.89). The suboptimally controlled with minimal circadian variation profile had higher odds of preterm birth (adjusted odds ratio, 2.59; 95% confidence interval, 1.10-6.24), cesarean delivery (adjusted odds ratio, 2.76; 95% confidence interval, 1.09-7.46), and neonatal intensive care unit admission (adjusted odds ratio, 4.08; 95% confidence interval, 1.58-11.40). The poorly controlled profile with peak hyperglycemia overnight had higher odds of preeclampsia (adjusted odds ratio, 2.54; 95% confidence interval, 1.02-6.52), large-for-gestational-age neonate (adjusted odds ratio, 3.72; 95% confidence interval, 1.37-10.4), neonatal hypoglycemia (adjusted odds ratio, 3.53; 95% confidence interval, 1.37-9.71), and neonatal intensive care unit admission (adjusted odds ratio, 3.15; 95% confidence interval, 1.20-9.09). CONCLUSION: Discrete glucose profiles of pregnant individuals with pregestational diabetes mellitus were identified through joint consideration of multiple continuous glucose monitoring metrics. Prolonged exposure to maternal hyperglycemia may be associated with a higher risk of adverse pregnancy outcomes than suboptimal glycemic control characterized by high glucose variability and intermittent hyperglycemia.
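The step of collapsing continuous glucose monitoring data into weekly measures of centrality, spread, and excursions can be sketched as follows. The abstract does not list the 39 measures, so the handful computed here, the function name, and the 63-140 mg/dL target range are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def weekly_metrics(readings_mg_dl, low=63, high=140):
    """Summarize one week of CGM readings (mg/dL) into a few glycemic measures.

    The low/high thresholds are an assumed pregnancy target range,
    not values taken from the study.
    """
    n = len(readings_mg_dl)
    avg = mean(readings_mg_dl)
    sd = pstdev(readings_mg_dl)
    return {
        "mean": avg,                                   # centrality
        "sd": sd,                                      # spread
        "cv_percent": 100 * sd / avg,                  # relative variability
        "time_in_range": sum(low <= g <= high for g in readings_mg_dl) / n,
        "time_above": sum(g > high for g in readings_mg_dl) / n,   # excursions
        "time_below": sum(g < low for g in readings_mg_dl) / n,
    }

# Hypothetical, heavily shortened week of readings
week = [60, 100, 100, 140, 200]
metrics = weekly_metrics(week)
```

In a pipeline like the one described, one such vector per patient-week would typically be standardized, reduced with principal component analysis, and then clustered with k-means; those later steps are not shown here.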
Subjects
Blood Glucose Self-Monitoring , Blood Glucose , Cesarean Section , Diabetes Mellitus, Type 1 , Diabetes Mellitus, Type 2 , Hypoglycemia , Pre-Eclampsia , Pregnancy Outcome , Pregnancy in Diabetics , Premature Birth , Humans , Female , Pregnancy , Adult , Retrospective Studies , Pregnancy in Diabetics/blood , Diabetes Mellitus, Type 1/blood , Hypoglycemia/epidemiology , Blood Glucose/metabolism , Blood Glucose/analysis , Premature Birth/epidemiology , Cesarean Section/statistics & numerical data , Pre-Eclampsia/epidemiology , Infant, Newborn , Diabetes Mellitus, Type 2/blood , Fetal Macrosomia/epidemiology , Machine Learning , Intensive Care Units, Neonatal , Cohort Studies , Intensive Care, Neonatal , Continuous Glucose Monitoring
ABSTRACT
BACKGROUND: Individuals confronting health threats may display an optimistic bias such that judgments of their risk for illness or death are unrealistically positive given their objective circumstances. PURPOSE: We explored optimistic bias for health risks using k-means clustering in the context of COVID-19. We identified risk profiles using subjective and objective indicators of severity and susceptibility risk for COVID-19. METHODS: Between 3/18/2020-4/18/2020, a national probability sample of 6,514 U.S. residents reported both their subjective risk perceptions (e.g., perceived likelihood of illness or death) and objective risk indices (e.g., age, weight, pre-existing conditions) of COVID-19-related susceptibility and severity, alongside other pandemic-related experiences. Six months later, a subsample (N = 5,661) completed a follow-up survey with questions about their frequency of engagement in recommended health protective behaviors (social distancing, mask wearing, risk behaviors, vaccination intentions). RESULTS: The k-means clustering procedure identified five risk profiles in the Wave 1 sample; two of these demonstrated aspects of optimistic bias, representing almost 44% of the sample. In OLS regression models predicting health protective behavior adoption at Wave 2, clusters representing individuals with high perceived severity risk were most likely to report engagement in social distancing, but many individuals who were objectively at high risk for illness and death did not report engaging in self-protective behaviors. CONCLUSIONS: Objective risk of disease severity only inconsistently predicted health protective behavior. Risk profiles may help identify groups that need more targeted interventions to increase their support for public health policy and health enhancing recommendations more broadly.
As we move into an endemic stage of the COVID-19 pandemic, understanding engagement in health behaviors to curb the spread of disease remains critically important to manage COVID-19 and other health threats. However, people's perceptions about their risk of getting sick and having severe outcomes if they do fall ill are subject to bias. We studied a nationally representative probability sample of over 6,500 U.S. residents who completed surveys immediately after the COVID-19 pandemic began and approximately 6 months later. We used a computer processing (i.e., machine learning) approach to categorize participants based on both their actual risk factors for COVID-19 and their subjective understanding of that risk. Our analysis identified groups of individuals whose subjective perceptions of risk did not align with their actual risk characteristics. Specifically, almost 44% of our sample demonstrated an optimistic bias: they did not report higher risk of death from COVID-19 despite having one or more well-known risk factors for poor disease outcomes (e.g., older age, obesity). Six months later, membership in these risk groups prospectively predicted engagement in health protective and risky behaviors, as well as vaccine intentions, demonstrating how early risk perceptions may influence health behaviors over time.
Subjects
COVID-19 , Humans , COVID-19/epidemiology , Health Behavior , Pandemics , Surveys and Questionnaires
ABSTRACT
BACKGROUND AND PURPOSE: Post-stroke fatigue commonly presents alongside several comorbidities. The interaction between comorbidities and their relationship to fatigue is not known. In this study, we focus on physical and mood comorbidities, alongside lesion characteristics. We predict the emergence of distinct fatigue phenotypes with distinguishable physical and mood characteristics. METHODS: In this cross-sectional observational study, in 94 first-time, non-depressed, moderate to minimally impaired chronic stroke survivors, the relationship between measures of motor function (grip strength, nine-hole peg test time), motor cortical excitability (resting motor threshold), Hospital Anxiety and Depression Scale and Fatigue Severity Scale-7 (FSS-7) scores, age, gender and side of stroke was established using Spearman's rank correlation. Mood and motor variables were then entered into a k-means clustering algorithm to identify the number of unique clusters, if any. Post hoc pairwise comparisons followed by corrections for multiple comparisons were performed to characterize differences among clusters in the variables included in k-means clustering. RESULTS: Clustering analysis revealed a four-cluster model to be the best model (average silhouette score of 0.311). There was no significant difference in FSS-7 scores among the four high-fatigue clusters. Two clusters consisted of only left-hemisphere strokes, and the remaining two were exclusively right-hemisphere strokes. Factors that differentiated hemisphere-specific clusters were the level of depressive symptoms and anxiety. Motor characteristics distinguished the low-depressive left-hemisphere from the right-hemisphere clusters. CONCLUSION: The significant differences in side of stroke and the differential relationship between mood and motor function in the four clusters reveal the heterogeneous nature of post-stroke fatigue, which is amenable to categorization. 
Such categorization is critical to an understanding of the interactions between post-stroke fatigue and its presenting comorbid deficits, with significant implications for the development of context-/category-specific interventions.
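The average silhouette score used above to select the four-cluster model can be computed as in the sketch below. It follows the standard definition, s = (b − a) / max(a, b) per point, where a is the mean distance to the other members of the point's own cluster and b is the lowest mean distance to any other cluster. The function name, the Euclidean metric, and the toy data are assumptions; the study's actual variables and software are not given in the abstract.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy data: two tight, well-separated pairs of points
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(points, [0, 0, 1, 1])  # close to 1: compact clusters
bad = silhouette_score(points, [0, 1, 0, 1])   # negative: points mis-grouped
```

Scores range from −1 to 1; a value such as the reported 0.311 indicates clusters that are distinguishable but overlap to some degree. In model selection, the score would be computed for each candidate number of clusters and the highest value retained.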
Subjects
Stroke Rehabilitation , Stroke , Humans , Cross-Sectional Studies , Fatigue/etiology , Stroke/diagnosis , Male , Female
ABSTRACT
In the process of screening for probiotic strains, there are no clearly established bacterial phenotypic markers that could be used to predict their in vivo mechanism of action. In this work, we demonstrate for the first time that machine learning (ML) methods can be used to accurately predict the in vivo immunomodulatory activity of probiotic strains based on their cell surface phenotypic features, using a snail host-microbe interaction model. A broad range of snail gut presumptive probiotics, including 240 new lactic acid bacterial strains (Lactobacillus, Leuconostoc, Lactococcus, and Enterococcus), were isolated and characterized based on their capacity to withstand snails' gastrointestinal defense barriers, such as the pedal mucus, gastric mucus, gastric juices, and acidic pH, in association with their cell surface hydrophobicity, autoaggregation, and biofilm formation ability. The implemented ML pipeline predicted with high accuracy (88%) strains with a strong capacity to enhance chemotaxis and phagocytic activity of snails' hemolymph cells, while also revealing bacterial autoaggregation and cell surface hydrophobicity as the most important parameters that significantly affect host immune responses. The results show that ML approaches may be useful to derive a predictive understanding of host-probiotic interactions, while also highlighting the use of snails as an efficient animal model for screening presumptive probiotic strains in the light of their interaction with cellular innate immune responses.
Subjects
Machine Learning , Probiotics , Probiotics/pharmacology , Animals , Lactobacillales/physiology , Lactobacillales/immunology , Snails/immunology , Snails/microbiology , Helix, Snails/immunology , Helix, Snails/physiology , Immunity, Innate , Immunomodulation
ABSTRACT
Lake and reservoir surface areas are an important proxy for freshwater availability. Advancements in machine learning (ML) techniques and increased accessibility of remote sensing data products have enabled the analysis of waterbody surface area dynamics on broad spatial scales. However, interpreting the ML results remains a challenge. While ML provides important tools for identifying patterns, the resultant models do not include mechanisms. Thus, the "black-box" nature of ML techniques often lacks ecological meaning. Using ML, we characterized temporal patterns in lake and reservoir surface area change from 1984 to 2016 for 103,930 waterbodies in the contiguous United States. We then employed knowledge-guided machine learning (KGML) to classify all waterbodies into seven ecologically interpretable groups representing distinct patterns of surface area change over time. Many waterbodies were classified as having "no change" (43%), whereas the remaining 57% of waterbodies fell into other groups representing both linear and nonlinear patterns. This analysis demonstrates the potential of KGML not only for identifying ecologically relevant patterns of change across time but also for unraveling complex processes that underpin those changes.
Assuntos
Lagos , Aprendizado de Máquina , Estados Unidos
RESUMO
The sewer system, despite being a significant source of methane emissions, has often been overlooked in current greenhouse gas inventories due to the limited availability of quantitative data. Direct monitoring in sewers can be expensive or biased due to access limitations and internal heterogeneity of sewer networks. Fortunately, since methane is almost exclusively biogenic in sewers, we demonstrate in this study that the methanogenic potential can be estimated using known sewer microbiome data. By combining data mining techniques and bioinformatics databases, we developed the first data-driven method to analyze methanogenic potentials using a data set containing 633 observations of 53 variables obtained from literature mining. The methanogenic potential in the sewer sediment was around 250-870% higher than that in the wet biofilm on the pipe and sewage water. Additionally, k-means clustering and principal component analysis linked higher methane emission rates (9.72 ± 51.3 kg CO2 eq m⁻³ d⁻¹) with smaller pipe size, higher water level, and higher potentials of sulfate reduction in the wetted pipe biofilm. These findings demonstrate the possibility of connecting microbiome data with biogenic greenhouse gases, further offering insights into new approaches for understanding greenhouse gas emissions from understudied sources.
RESUMO
OBJECTIVE: Survival analysis is widely utilized in healthcare to predict the timing of disease onset. Traditional methods of survival analysis are usually based on the Cox Proportional Hazards model and assume proportional risk for all subjects. However, this assumption is rarely true for most diseases, as the underlying factors have complex, non-linear, and time-varying relationships. This concern is especially relevant for pregnancy, where the risk for pregnancy-related complications, such as preeclampsia, varies across gestation. Recently, deep learning survival models have shown promise in addressing the limitations of classical models, as the novel models allow for non-proportional risk handling, capturing nonlinear relationships, and navigating complex temporal dynamics. METHODS: We present a methodology to model the temporal risk of preeclampsia during pregnancy and investigate the associated clinical risk factors. We utilized a retrospective dataset including 66,425 pregnant individuals who delivered in two tertiary care centers from 2015 to 2023. We modeled the preeclampsia risk by modifying DeepHit, a deep survival model, which leverages neural network architecture to capture time-varying relationships between covariates in pregnancy. We applied time series k-means clustering to DeepHit's normalized output and investigated interpretability using Shapley values. RESULTS: We demonstrate that DeepHit can effectively handle high-dimensional data and evolving risk hazards over time with performance similar to the Cox Proportional Hazards model, achieving an area under the curve (AUC) of 0.78 for both models. The deep survival model outperformed traditional methodology by identifying time-varied risk trajectories for preeclampsia, providing insights for early and individualized intervention. K-means clustering separated patients into low-risk, early-onset, and late-onset preeclampsia groups, each with distinct risk factors.
CONCLUSION: This work demonstrates a novel application of deep survival analysis in time-varying prediction of preeclampsia risk. Our results highlight the advantage of deep survival models over Cox Proportional Hazards models in providing personalized risk trajectories and demonstrate the potential of deep survival models to generate interpretable and meaningful clinical applications in medicine.
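The clustering step described in the methods can be sketched with a plain Lloyd's k-means over equal-length risk curves. This is a minimal illustration under stated assumptions: the toy curves, the four time windows, the fixed initial centers, and the use of Euclidean distance are all invented for the example, and the study's time series k-means on DeepHit output may differ (for instance in its distance measure).

```python
def kmeans(curves, centers, n_iter=20):
    """Plain Lloyd's k-means with squared Euclidean distance on equal-length curves."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labels = []
    for _ in range(n_iter):
        # Assignment step: each curve goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist2(c, centers[k]))
                  for c in curves]
        # Update step: each center becomes the pointwise mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [c for c, lab in zip(curves, labels) if lab == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        centers = new_centers
    return labels, centers

# Hypothetical normalized risk over four gestational windows (not study data).
curves = [
    [0.01, 0.01, 0.02, 0.02], [0.02, 0.01, 0.01, 0.02],   # flat, low risk
    [0.10, 0.60, 0.20, 0.10], [0.10, 0.50, 0.30, 0.10],   # risk peaks early
    [0.05, 0.10, 0.30, 0.55], [0.05, 0.05, 0.30, 0.60],   # risk peaks late
]
labels, centers = kmeans(curves, centers=[curves[0], curves[2], curves[4]])
print(labels)
```

In practice the initial centers would be chosen at random or with a seeding heuristic, and the number of clusters would need its own justification.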
Assuntos
Pré-Eclâmpsia , Humanos , Pré-Eclâmpsia/mortalidade , Gravidez , Feminino , Análise de Sobrevida , Fatores de Risco , Aprendizado Profundo , Adulto , Estudos Retrospectivos , Modelos de Riscos Proporcionais , Redes Neurais de Computação , Medição de Risco/métodos
RESUMO
The Changzhi Basin in Shanxi is renowned for its extensive mining activities. Understanding the spatial distribution of its water quality and the geochemical factors that influence it is crucial for water security and ecosystem protection. However, the complexity inherent in hydrogeochemical data presents challenges for linear data analysis methods. This study uses a combined approach of self-organizing maps (SOM) and K-means clustering to investigate the hydrogeochemical sources of shallow groundwater in the Changzhi Basin and the associated human health risks. The results showed that the groundwater chemical characteristics were categorized into 48 neurons grouped into six clusters (C1-C6) representing different groundwater types with different contamination characteristics. C1, C3, and C5 represent uncontaminated or minimally contaminated groundwater (Ca-HCO3 type), while C2 signifies mixed-contaminated groundwater (HCO3-Ca type, Mixed Cl-Mg-Ca type, and CaSO4 type). C4 samples exhibit impacts from agricultural activities (Mixed Cl-Mg-Ca), and C6 reflects high Ca and NO3- groundwater. Anthropogenic activities, especially agriculture, have resulted in elevated NO3- levels in shallow groundwater. Notably, heightened non-carcinogenic risks linked to NO3-, Pb, F-, and Mn exposure through drinking water, particularly for children, warrant significant attention. This research contributes insights into sustainable groundwater resource development, pollution mitigation strategies, and effective ecosystem protection within intensive mining regions like the Changzhi Basin. It serves as a reference for similar areas worldwide, offering guidance for groundwater management, pollution prevention, and control.
Assuntos
Monitoramento Ambiental , Água Subterrânea , Mineração , Poluentes Químicos da Água , Água Subterrânea/química , Água Subterrânea/análise , China , Poluentes Químicos da Água/análise , Humanos , Monitoramento Ambiental/métodos , Medição de Risco
RESUMO
BACKGROUND: Numerous studies have affirmed a robust correlation between residual cholesterol (RC) and the occurrence of cardiovascular disease (CVD). However, the current body of literature does not adequately address the link between changes in RC and the occurrence of CVD, because existing studies have focused mainly on individual RC values. Hence, the primary objective of this study is to elucidate the association between the cumulative RC (Cum-RC) and the morbidity of CVD. METHODS: The changes in RC were categorized into a high-level fast-growth group (Class 1) and a low-level slow-growth group (Class 2) by K-means cluster analysis. To investigate the relationship between combined exposure to multiple lipids and CVD risk, a weighted quantile sum (WQS) regression analysis was employed. This analysis involved the calculation of weights for total cholesterol (TC), low-density lipoprotein (LDL), and high-density lipoprotein (HDL), which were used to elucidate the RC. RESULTS: Among the 5,372 participants, 45.94% were male, and the median age was 58. During three years of follow-up, 669 participants (12.45%) developed CVD. Logistic regression analysis revealed that Class 2 individuals had a significantly lower risk of developing CVD than Class 1 individuals. In the analysis of Cum-RC as a continuous variable, the probability of having CVD increased by 13% for every 1-unit increase in the Cum-RC. The restricted cubic spline (RCS) analysis showed that Cum-RC and CVD risk were linearly related (P for nonlinearity = 0.679). The WQS regression showed a positive but nonsignificant association between the WQS index and CVD incidence, with the greatest contribution from TC (weight = 0.652), followed by LDL (weight = 0.348).
CONCLUSION: Cum-RC was positively and strongly related to CVD risk, suggesting that in addition to focusing on traditional lipid markers, early intervention in patients with increased RC may further reduce the incidence of CVD.
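A cumulative exposure measure of this kind can be illustrated with a short sketch. Two assumptions are made here that the abstract does not state: RC is taken as TC minus LDL minus HDL (a common definition of remnant cholesterol), and the cumulative value is computed as the mean of consecutive measurements multiplied by the years between them. The function names and the visit values are hypothetical.

```python
def remnant_cholesterol(tc, ldl, hdl):
    """Assumed definition: RC = TC - LDL - HDL (all in the same units)."""
    return tc - ldl - hdl

def cumulative_exposure(values, years_between):
    """Assumed definition: sum of (mean of consecutive values) x (years between them)."""
    return sum((a + b) / 2 * dt
               for a, b, dt in zip(values, values[1:], years_between))

# Hypothetical (TC, LDL, HDL) values at three visits spaced two years apart.
visits = [(5.0, 3.0, 1.2), (5.4, 3.1, 1.3), (5.8, 3.2, 1.4)]
rc = [remnant_cholesterol(*v) for v in visits]
cum_rc = cumulative_exposure(rc, years_between=[2, 2])
print(round(cum_rc, 2))
```

With this construction, two people with the same latest RC value can have different cumulative values, which is the distinction between single measurements and cumulative exposure that the background section draws.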
Assuntos
Doenças Cardiovasculares , Masculino , Humanos , Doenças Cardiovasculares/epidemiologia , HDL-Colesterol , LDL-Colesterol , Colesterol , Incidência , Fatores de Risco
RESUMO
Variation in forage composition decreases the accuracy of diets delivered to dairy cows. However, variability of forages can be managed using a renewal reward model (RRM) and genetic algorithm (GA) to optimize sampling and monitoring practices for farm conditions. Specifically, use of quality control charts to monitor forage composition can identify changes in composition for which adjustment in the formulated diet will result in a better match of the nutrients delivered to cows. The objectives of this study were (1) to assess the use of a clustering algorithm to estimate the mean time (d) the process is stable or in control (TStable) and the magnitude of the change in forage composition between stable periods (ΔForage) for corn silage and alfalfa-grass silage, which are input parameters for the RRM; (2) to compare optimized farm-specific sampling practices (number of samples, sampling interval, and control limits [ΔLimit]) using previously proposed defaults and our estimates for the TStable and ΔForage input parameters; and (3) to conduct a simulation study to compare the number of recommended diet changes under the proposed sampling and monitoring protocols. We estimated the TStable and ΔForage parameters for corn silage NDF and starch and alfalfa-grass silage NDF and CP using a k-means clustering approach applied to forage samples collected from 8 farms, 3×/wk during a 16-wk period. We compared 4 sampling and monitoring protocols that resulted from the 2 methods for estimating TStable and ΔForage (default values and our proposed method) and either optimizing only the control limit or optimizing the control limits, the number of samples, and the number of days between sampling. We simulated the outcomes of implementing the optimized monitoring protocols using a quality control chart for corn silage and alfalfa-grass silage of each farm. 
Estimates of TStable and ΔForage from the k-means clustering analysis were, respectively, shorter and larger than previously proposed default values. In the simulated quality control monitoring, larger ΔForage estimates increased the optimized ΔLimit, resulting in fewer detected shifts in composition of forages, a lower frequency of false alarms, and a lower quality control cost ($/d). Recommended diet reformulation intervals from the simulated quality control analysis were specific for the type of forage and farm management practices. The median of the diet reformulation intervals for all farms using our optimal protocols was 14 d (quartile [Q]1 = 8, Q3 = 26) for corn silage and 16 d (Q1 = 8, Q3 = 26) for alfalfa-grass silage.
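The monitoring idea, in which a wider control limit yields fewer detected shifts, can be sketched with a simple control-chart rule. This is a minimal illustration, not the study's renewal reward model or genetic algorithm: the function `monitor`, the rule of resetting the reference to the day's mean after a flagged shift, and the NDF values are assumptions made for the example.

```python
from statistics import mean

def monitor(samples_per_day, reference, delta_limit):
    """Return the sampling days on which the day mean leaves reference +/- delta_limit."""
    flagged_days = []
    for day, samples in enumerate(samples_per_day):
        day_mean = mean(samples)
        if abs(day_mean - reference) > delta_limit:
            flagged_days.append(day)
            # Assumed response: reformulate the diet around the new composition.
            reference = day_mean
    return flagged_days

# Hypothetical corn silage NDF (% of DM), two samples per sampling day.
ndf = [[40.1, 39.9], [40.6, 40.2], [43.0, 43.4], [43.1, 42.9], [39.8, 40.0]]
print(monitor(ndf, reference=40.0, delta_limit=2.0))
print(monitor(ndf, reference=40.0, delta_limit=4.0))
```

Running the same series with the wider limit flags no days, which mirrors the direction of the reported effect: a larger limit means fewer detected shifts and fewer recommended diet changes.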
Assuntos
Ração Animal , Dieta , Silagem , Animais , Bovinos , Dieta/veterinária , Ração Animal/análise , Fazendas , Zea mays , Feminino
RESUMO
BACKGROUND: Multiple chronic conditions (multimorbidity) are becoming more prevalent among aging populations. Digital health technologies have the potential to assist in the self-management of multimorbidity, improving the awareness and monitoring of health and well-being, supporting a better understanding of the disease, and encouraging behavior change. OBJECTIVE: The aim of this study was to analyze how 60 older adults (mean age 74, SD 6.4; range 65-92 years) with multimorbidity engaged with digital symptom and well-being monitoring when using a digital health platform over a period of approximately 12 months. METHODS: Principal component analysis and clustering analysis were used to group participants based on their levels of engagement, and the data analysis focused on characteristics (eg, age, sex, and chronic health conditions), engagement outcomes, and symptom outcomes of the different clusters that were discovered. RESULTS: Three clusters were identified: the typical user group, the least engaged user group, and the highly engaged user group. Our findings show that age, sex, and the types of chronic health conditions do not influence engagement. The 3 primary factors influencing engagement were whether the same device was used to submit different health and well-being parameters, the number of manual operations required to take a reading, and the daily routine of the participants. The findings also indicate that higher levels of engagement may improve the participants' outcomes (eg, reduce symptom exacerbation and increase physical activity). CONCLUSIONS: The findings indicate potential factors that influence older adult engagement with digital health technologies for home-based multimorbidity self-management. The least engaged user groups showed decreased health and well-being outcomes related to multimorbidity self-management. 
Addressing the factors highlighted in this study in the design and implementation of home-based digital health technologies may improve symptom management and physical activity outcomes for older adults self-managing multimorbidity.
Assuntos
Saúde Digital , Multimorbidade , Idoso , Humanos , Envelhecimento , Análise por Conglomerados , Confiabilidade dos Dados , Idoso de 80 Anos ou mais
RESUMO
Intense urban development and high urban density cause the thermal environment in urban centers to deteriorate continuously, affecting the quality of the living environment. In this study, 707.49 hectares of land in the central area of Changsha were divided into 121 plots. Eleven microclimate-related morphological indicators were selected, and the K-means method was used for cluster analysis. Then, the relationship between morphological clusters and the thermal environment was explored by simulating the thermal environment of the study area with ENVI-met. First, five spatial types were found to characterize the area: high-level with high floor area ratio, low density, and low greenery; middle-level with high floor area ratio and high density; medium-capacity with high density and small volume; low-level with low density and high greenery; and low floor area ratio, low density, and high greenery. Second, the building windward surface density, sky openness, building density, floor area ratio, and green space rate affect the thermal environment. Third, Cluster 3 had the highest average air temperature (Ta), followed by Cluster 5, whereas Clusters 4, 1, and 2 had relatively low Ta. The indicators with the greatest influence on Ta differed by cluster: the spatial vitality index and green space rate in Cluster 1; the area-weighted building shape index, average building volume, and sky openness in Cluster 2; the green space rate in Cluster 3; the floor area ratio and green space rate, among others, in Cluster 4; and the impervious surface rate and green space rate, among others, in Cluster 5. Fourth, simply increasing the area of green space cannot maximize its cooling effect; constructing an equalized greening network regulates the thermal environment better. Fifth, the results provide a scientific basis for the design and regulation of urban centers.