Results 1 - 20 of 990
1.
Am J Hum Genet ; 110(2): 314-325, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36610401

ABSTRACT

Admixture estimation plays a crucial role in ancestry inference and genome-wide association studies (GWASs). Computer programs such as ADMIXTURE and STRUCTURE are commonly employed to estimate the admixture proportions of sample individuals. However, these programs can be overwhelmed by the computational burdens imposed by the 10^5 to 10^6 samples and millions of markers commonly found in modern biobanks. An attractive strategy is to run these programs on a set of ancestry-informative SNP markers (AIMs) that exhibit substantially different frequencies across populations. Unfortunately, existing methods for identifying AIMs require knowing ancestry labels for a subset of the sample. This supervised learning approach creates a chicken-and-egg scenario. In this paper, we present an unsupervised, scalable framework that seamlessly carries out AIM selection and likelihood-based estimation of admixture proportions. Our simulated and real data examples show that this approach is scalable to modern biobank datasets. OpenADMIXTURE, our Julia implementation of the method, is open source and available for free.


Subjects
Biological Specimen Banks; Genome-Wide Association Study; Humans; Genome-Wide Association Study/methods; Likelihood Functions; Population Groups; Software; Genetics, Population
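
The abstract for this entry describes ancestry-informative markers as SNPs whose frequencies differ substantially across populations. A minimal Python sketch of that idea follows, assuming genotypes coded as 0, 1, or 2 copies of an allele and two groups with known labels. This is the supervised setting the abstract contrasts itself with, not the OpenADMIXTURE method; the function names, data layout, and toy values are hypothetical.

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele; genotypes are 0, 1, or 2 copies per individual."""
    return sum(genotypes) / (2 * len(genotypes))


def rank_markers_by_frequency_gap(pop_a, pop_b):
    """Return marker indices sorted by |freq_A - freq_B|, largest gap first.

    Hypothetical layout: pop_a[m] holds the genotypes of every individual
    in group A at marker m, and likewise for pop_b.
    """
    gaps = [
        abs(allele_frequency(a) - allele_frequency(b))
        for a, b in zip(pop_a, pop_b)
    ]
    order = sorted(range(len(gaps)), key=lambda m: gaps[m], reverse=True)
    return order, gaps


# Toy data: 3 markers, 4 individuals per group (illustrative values only).
group_a = [[2, 2, 1, 2], [1, 0, 1, 1], [0, 1, 0, 0]]
group_b = [[0, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
order, gaps = rank_markers_by_frequency_gap(group_a, group_b)
```

In this toy input only the first marker separates the groups, so it is ranked first; the other two have identical frequencies in both groups and carry no ancestry information.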
2.
J Neurosci ; 44(8)2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38124022

ABSTRACT

Adverse childhood experiences have been linked to detrimental mental health outcomes in adulthood. This study investigates a potential neurodevelopmental pathway between adversity and mental health outcomes: brain connectivity. We used data from the prospective, longitudinal Adolescent Brain Cognitive Development (ABCD) study (N ≈ 12,000, participants aged 9-13 years, male and female) and assessed structural brain connectivity using fractional anisotropy (FA) of white matter tracts. The adverse experiences modeled included family conflict and traumatic experiences. K-means clustering and latent basis growth models were used to determine subgroups based on total levels and trajectories of brain connectivity. Multinomial regression was used to determine associations between cluster membership and adverse experiences. The results showed that higher family conflict was associated with higher FA levels across brain tracts (e.g., t(3) = -3.81, β = -0.09, p_bonf = 0.003) and within the corpus callosum (CC), fornix, and anterior thalamic radiations (ATR). A decreasing FA trajectory across two brain imaging timepoints was linked to lower socioeconomic status and neighborhood safety. Socioeconomic status was related to FA across brain tracts (e.g., t(3) = 3.44, β = 0.10, p_bonf = 0.01), the CC, and the ATR. Neighborhood safety was associated with FA in the fornix and ATR (e.g., t(1) = 3.48, β = 0.09, p_bonf = 0.01). There is a complex and multifaceted relationship between adverse experiences and brain development, where adverse experiences during early adolescence are related to brain connectivity. These findings underscore the importance of studying adverse experiences beyond early childhood to understand lifespan developmental outcomes.


Subjects
Diffusion Tensor Imaging; White Matter; Humans; Male; Adolescent; Child, Preschool; Female; Prospective Studies; Diffusion Tensor Imaging/methods; Brain/diagnostic imaging; White Matter/diagnostic imaging; Corpus Callosum; Anisotropy
3.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37406190

ABSTRACT

Studies have confirmed that the occurrence of many complex diseases in the human body is closely related to the microbial community, and microbes can affect tumorigenesis and metastasis by regulating the tumor microenvironment. However, there are still large gaps in the clinical observation of the microbiota in disease. Although biological experiments are accurate in identifying disease-associated microbes, they are also time-consuming and expensive. Computational models for the effective identification of disease-related microbes can shorten this process and reduce capital and time costs. Based on this, a model named DSAE_RF is presented in this paper to predict latent microbe-disease associations by combining multi-source features and deep learning. DSAE_RF calculates four similarities between microbes and diseases, which are then used as feature vectors for the disease-microbe pairs. Next, reliable negative samples are screened by k-means clustering, and a deep sparse autoencoder neural network is used to extract effective features of the disease-microbe pairs. On this foundation, a random forest classifier is used to predict the associations between microbes and diseases. To assess the performance of the model, 10-fold cross-validation is implemented on the same dataset. The AUC and AUPR of the model are 0.9448 and 0.9431, respectively. Furthermore, we conduct a variety of experiments, including comparison of negative sample selection methods, comparison with different models and classifiers, Kolmogorov-Smirnov test and t-test, ablation experiments, robustness analysis, and case studies on COVID-19 and colorectal cancer. The results fully demonstrate the reliability and availability of our model.


Subjects
COVID-19; Deep Learning; Microbiota; Humans; Reproducibility of Results; Algorithms; Computational Biology/methods
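
The abstract for this entry reports an AUC of 0.9448 under 10-fold cross-validation. As a reminder of what that number measures, here is a minimal Python sketch of the AUC computed directly from its pairwise definition. It is not code from DSAE_RF; the function name and the example scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four microbe-disease pairs.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
```

Here three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75. The quadratic loop is fine for a demonstration; rank-based formulas give the same value more efficiently on large datasets.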
4.
BMC Bioinformatics ; 25(1): 21, 2024 Jan 12.
Article in English | MEDLINE | ID: mdl-38216886

ABSTRACT

BACKGROUND: Metagene plots provide a visualization of biological signal trends over subsections of the genome and are used to perform high-level analysis of experimental data by aggregating genome-level data to create an average profile. The generation of metagene plots is useful for summarizing the results of many sequencing-based applications. Despite their prevalence and utility, the standard metagene plot is blind to conflicting signals within data. If multiple distinct trends occur, they can interact destructively, creating a plot that does not accurately represent any of the underlying trends. RESULTS: We present MetageneCluster, a Python tool to generate a collection of representative metagene plots based on k-means clustering of genomic regions of interest. Clustering the data by similarity allows us to identify patterns within the features of interest. We are then able to summarize each pattern present in the data, rather than averaging across the entire feature space. We show that our method performs well when used to identify conflicting signals in real-world genome-level data. CONCLUSIONS: Overall, MetageneCluster is a user-friendly tool for the creation of metagene plots that capture distinct patterns in underlying sequence data.


Subjects
Genome; Genomics; Genomics/methods; Software
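
The abstract for this entry notes that distinct trends can cancel each other out in a single averaged metagene profile, and that clustering regions first avoids this. The sketch below illustrates that effect with a plain Lloyd's k-means in Python. It is not MetageneCluster's implementation; the function names, the fixed initial centers, and the toy signal values are assumptions made for illustration.

```python
def mean_profile(profiles):
    """Position-wise average of equal-length signal profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, centers, iterations=10):
    """Plain Lloyd's k-means starting from the given initial centers."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in profiles
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            new_centers.append(mean_profile(members) if members else centers[k])
        if new_centers == centers:
            break  # Converged: centers no longer move.
        centers = new_centers
    return labels, centers


# Toy regions (illustrative values): two rise along the feature, two fall.
regions = [[0, 1, 2, 3], [0, 1, 2, 3], [3, 2, 1, 0], [3, 2, 1, 0]]
overall = mean_profile(regions)
labels, centers = kmeans(regions, [regions[0], regions[2]])
```

The overall average is a flat line at 1.5 that matches neither trend, while the two cluster centers recover the rising and falling profiles separately.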
5.
BMC Plant Biol ; 24(1): 373, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38714965

ABSTRACT

BACKGROUND: As one of the world's most important beverage crops, tea plants (Camellia sinensis) are renowned for their unique flavors and numerous beneficial secondary metabolites, attracting researchers to investigate the formation of tea quality. With the increasing availability of transcriptome data on tea plants in public databases, conducting large-scale co-expression analyses has become feasible to meet the demand for functional characterization of tea plant genes. However, as the multidimensional noise increases, larger-scale co-expression analyses are not always effective. Analyzing a subset of samples generated by effectively downsampling and reorganizing the global sample set often leads to more accurate results in co-expression analysis. Meanwhile, global-based co-expression analyses are more likely to overlook condition-specific gene interactions, which may be more important and worthy of exploration and research. RESULTS: Here, we employed the k-means clustering method to organize and classify the global samples of tea plants, resulting in clustered samples. Metadata annotations were then performed on these clustered samples to determine the "conditions" represented by each cluster. Subsequently, we conducted gene co-expression network analysis (WGCNA) separately on the global samples and the clustered samples, resulting in global modules and cluster-specific modules. Comparative analyses of global modules and cluster-specific modules have demonstrated that cluster-specific modules exhibit higher accuracy in co-expression analysis. To measure the degree of condition specificity of genes within condition-specific clusters, we introduced the correlation difference value (CDV). By incorporating the CDV into co-expression analyses, we can assess the condition specificity of genes. 
This approach proved instrumental in identifying a series of high CDV transcription factor encoding genes upregulated during sustained cold treatment in Camellia sinensis leaves and buds, and pinpointing a pair of genes that participate in the antioxidant defense system of tea plants under sustained cold stress. CONCLUSIONS: To summarize, downsampling and reorganizing the sample set improved the accuracy of co-expression analysis. Cluster-specific modules were more accurate in capturing condition-specific gene interactions. The introduction of CDV allowed for the assessment of condition specificity in gene co-expression analyses. Using this approach, we identified a series of high CDV transcription factor encoding genes related to sustained cold stress in Camellia sinensis. This study highlights the importance of considering condition specificity in co-expression analysis and provides insights into the regulation of the cold stress in Camellia sinensis.


Subjects
Camellia sinensis; Camellia sinensis/genetics; Camellia sinensis/metabolism; Cluster Analysis; Genes, Plant; Gene Expression Profiling/methods; Data Mining/methods; Transcriptome; Gene Expression Regulation, Plant; Gene Regulatory Networks
6.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35226073

ABSTRACT

Microbiome research is advancing rapidly, and every new study should definitively be based on updated methods, trends and milestones in this field to avoid the wrong interpretation of results. Most human microbiota surveys rely on data captured from snapshots-single data points from subjects-and have permitted uncovering the recognized interindividual variability and major covariates of such microbial communities. Currently, changes in individualized microbiota profiles are under the spotlight to serve as robust predictors of clinical outcomes (e.g. weight loss via dietary interventions) and disease anticipation. Therefore, novel methods are needed to provide robust evaluation of longitudinal series of microbiota data with the aim of assessing intrapersonally short-term to long-term microbiota changes likely linked to health and disease states. Consequently, we developed microbiota STability ASsessment via Iterative cluStering (µSTASIS)-a multifunction R package to evaluate individual-centered microbiota stability. µSTASIS targets the recognized interindividual variability inherent to microbiota data to stress the tight relationships observed among and characteristic of longitudinal samples derived from a single individual via iteratively growing-partitioned clustering. The algorithms and functions implemented in this framework deal properly with the sparse and compositional nature of microbiota data. Moreover, the resulting metric is intuitive and independent of beta diversity distance methods and correlation coefficients, thus estimating stability for each microbiota sample rather than giving nonconsensus magnitudes that are difficult to interpret within and between datasets. Our method is freely available under GPL-3 licensing. We demonstrate its utility by assessing gut microbiota stability from three independent studies published previously with multiple longitudinal series of multivariate data and respective metadata.


Subjects
Gastrointestinal Microbiome; Microbiota; Cluster Analysis; Humans
7.
J Transl Med ; 22(1): 597, 2024 Jun 27.
Article in English | MEDLINE | ID: mdl-38937754

ABSTRACT

BACKGROUND: Over the last two decades, tumor-derived RNA expression signatures have been developed for the two most commonly diagnosed tumors worldwide, namely prostate and breast tumors, in order to improve both outcome prediction and treatment decision-making. In this context, molecular signatures gained by main components of the tumor microenvironment, such as cancer-associated fibroblasts (CAFs), have been explored as prognostic and therapeutic tools. Nevertheless, a deeper understanding of the significance of CAFs-related gene signatures in breast and prostate cancers still remains to be disclosed. METHODS: RNA sequencing technology (RNA-seq) was employed to profile and compare the transcriptome of CAFs isolated from patients affected by breast and prostate tumors. The differentially expressed genes (DEGs) characterizing breast and prostate CAFs were intersected with data from public datasets derived from bulk RNA-seq profiles of breast and prostate tumor patients. Pathway enrichment analyses allowed us to appreciate the biological significance of the DEGs. K-means clustering was applied to construct CAFs-related gene signatures specific for breast and prostate cancer and to stratify independent cohorts of patients into high and low gene expression clusters. Kaplan-Meier survival curves and log-rank tests were employed to predict differences in the outcome parameters of the clusters of patients. Decision-tree analysis was used to validate the clustering results and boosting calculations were then employed to improve the results obtained by the decision-tree algorithm. RESULTS: Data obtained in breast CAFs allowed us to assess a signature that includes 8 genes (ITGA11, THBS1, FN1, EMP1, ITGA2, FYN, SPP1, and EMP2) belonging to pro-metastatic signaling routes, such as the focal adhesion pathway. Survival analyses indicated that the cluster of breast cancer patients showing a high expression of the aforementioned genes displays worse clinical outcomes. 
Next, we identified a prostate CAFs-related signature that includes 11 genes (IL13RA2, GDF7, IL33, CXCL1, TNFRSF19, CXCL6, LIFR, CXCL5, IL7, TSLP, and TNFSF15) associated with immune responses. A low expression of these genes was predictive of poor survival rates in prostate cancer patients. The results obtained were significantly validated through a two-step approach, based on unsupervised (clustering) and supervised (classification) learning techniques, showing a high prediction accuracy (≥ 90%) in independent RNA-seq cohorts. CONCLUSION: We identified a huge heterogeneity in the transcriptional profile of CAFs derived from breast and prostate tumors. Of note, the two novel CAFs-related gene signatures might be considered as reliable prognostic indicators and valuable biomarkers for a better management of breast and prostate cancer patients.


Subjects
Breast Neoplasms; Cancer-Associated Fibroblasts; Gene Expression Regulation, Neoplastic; Prostatic Neoplasms; Humans; Prostatic Neoplasms/genetics; Prostatic Neoplasms/pathology; Male; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Female; Cancer-Associated Fibroblasts/metabolism; Cancer-Associated Fibroblasts/pathology; Prognosis; Transcriptome/genetics; Gene Expression Profiling; Cluster Analysis; Treatment Outcome; Middle Aged; Kaplan-Meier Estimate
8.
Cardiovasc Diabetol ; 23(1): 192, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844974

ABSTRACT

BACKGROUND: Cardiovascular disease (CVD) is closely associated with the triglyceride glucose (TyG) index and its related indicators, particularly its combination with obesity indices. However, there is limited research on the relationship between changes in TyG-related indices and CVD, as most studies have focused on baseline TyG-related indices. METHODS: The data for this prospective cohort study were obtained from the China Health and Retirement Longitudinal Study. The exposures were changes in TyG-related indices and cumulative TyG-related indices from 2012 to 2015. The K-means algorithm was used to classify changes in each TyG-related index into four classes (Class 1 to Class 4). Multivariate logistic regressions were used to evaluate the associations between the changes in TyG-related indices and the incidence of CVD. RESULTS: In total, 3243 participants were included in this study, of whom 1761 (54.4%) were female, with a mean age of 57.62 years at baseline. Over a 5-year follow-up, 637 (19.6%) participants developed CVD. Fully adjusted logistic regression analyses revealed significant positive associations between changes in TyG-related indices, cumulative TyG-related indices and the incidence of CVD. Among these changes in TyG-related indices, changes in TyG-waist circumference (WC) showed the strongest association with incident CVD. Compared to the participants in Class 1 of changes in TyG-WC, the odds ratio (OR) for participants in Class 2 was 1.41 (95% confidence interval (CI) 1.08-1.84), the OR for participants in Class 3 was 1.54 (95% CI 1.15-2.07), and the OR for participants in Class 4 was 1.94 (95% CI 1.34-2.80). Moreover, cumulative TyG-WC exhibited the strongest association with incident CVD among cumulative TyG-related indices. 
Compared to the participants in Quartile 1 of cumulative TyG-WC, the OR for participants in Quartile 2 was 1.33 (95% CI 1.00-1.76), the OR for participants in Quartile 3 was 1.46 (95% CI 1.09-1.96), and the OR for participants in Quartile 4 was 1.79 (95% CI 1.30-2.47). CONCLUSIONS: Changes in TyG-related indices are independently associated with the risk of CVD. Changes in TyG-WC are expected to become more effective indicators for identifying individuals at a heightened risk of CVD.


Subjects
Biomarkers; Blood Glucose; Cardiovascular Diseases; Obesity; Triglycerides; Humans; Female; Middle Aged; Male; Cardiovascular Diseases/epidemiology; Cardiovascular Diseases/diagnosis; Cardiovascular Diseases/blood; Prospective Studies; Triglycerides/blood; Incidence; Risk Assessment; China/epidemiology; Blood Glucose/metabolism; Obesity/epidemiology; Obesity/diagnosis; Obesity/blood; Aged; Biomarkers/blood; Longitudinal Studies; Time Factors; Prognosis; Heart Disease Risk Factors; Predictive Value of Tests; Risk Factors
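
The abstract for this entry relies on the TyG index and TyG-waist circumference without defining them. The Python sketch below uses the commonly cited definitions, TyG = ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2) and TyG-WC = TyG × waist circumference. These formulas are an assumption here, since the abstract does not state them, and the participant values are hypothetical.

```python
import math


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """TyG = ln(fasting triglycerides [mg/dL] * fasting glucose [mg/dL] / 2)."""
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("concentrations must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2)


def tyg_wc(triglycerides_mg_dl, glucose_mg_dl, waist_cm):
    """TyG-WC = TyG index multiplied by waist circumference (cm)."""
    return tyg_index(triglycerides_mg_dl, glucose_mg_dl) * waist_cm


# Hypothetical participant: TG 150 mg/dL, glucose 100 mg/dL, waist 90 cm.
tyg = tyg_index(150, 100)
tyg_waist = tyg_wc(150, 100, 90)
```

Because both inputs are in mg/dL, values reported in mmol/L would need converting before applying the formula.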
9.
Cardiovasc Diabetol ; 23(1): 247, 2024 Jul 11.
Article in English | MEDLINE | ID: mdl-38992634

ABSTRACT

BACKGROUND: The triglyceride-glucose (TyG) index and its combination with obesity indicators can predict cardiovascular diseases (CVD). However, there is limited research on the relationship between changes in the triglyceride glucose-waist height ratio (TyG-WHtR) and CVD. Our study aims to investigate the relationship between the change in the TyG-WHtR and the risk of CVD. METHODS: Participants were from the China Health and Retirement Longitudinal Study (CHARLS). CVD was defined as self-reporting heart disease and stroke. Participants were divided into three groups based on changes in TyG-WHtR using K-means cluster analysis. Multivariable binary logistic regression analysis was used to examine the association between different groups (based on the change of TyG-WHtR) and CVD. A restricted cubic spline (RCS) regression model was used to explore the potential nonlinear association of the cumulative TyG-WHtR and CVD events. RESULTS: During follow-up between 2015 and 2020, 623 (18.8%) of 3312 participants developed CVD. After adjusting for various potential confounders, compared to the participants with consistently low and stable TyG-WHtR, the risk of CVD was significantly higher in participants with moderate and increasing TyG-WHtR (OR 1.28, 95%CI 1.01-1.63) and participants with high TyG-WHtR with a slowly increasing trend (OR 1.58, 95%CI 1.16-2.15). Higher levels of cumulative TyG-WHtR were independently associated with a higher risk of CVD events (per SD, OR 1.27, 95%CI 1.12-1.43). CONCLUSIONS: For middle-aged and older adults, changes in the TyG-WHtR are independently associated with the risk of CVD. Maintaining a favorable TyG index, effective weight management, and a reasonable waist circumference contribute to preventing CVD.


Subjects
Biomarkers; Blood Glucose; Cardiovascular Diseases; Triglycerides; Humans; Female; Male; Middle Aged; China/epidemiology; Cardiovascular Diseases/epidemiology; Cardiovascular Diseases/diagnosis; Cardiovascular Diseases/blood; Triglycerides/blood; Aged; Risk Assessment; Blood Glucose/metabolism; Biomarkers/blood; Longitudinal Studies; Waist-Height Ratio; Age Factors; Time Factors; Prognosis; Predictive Value of Tests; Risk Factors; Heart Disease Risk Factors; Incidence; East Asian People
10.
NMR Biomed ; : e5218, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39051137

ABSTRACT

The presence of a normal large blood vessel (LBV) in a tumor region can impact the evaluation of quantitative dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) parameters and tumor classification. Hence, there is a need for automatic removal of LBVs from brain tissues including intratumoral regions for achieving an objective assessment of tumors. This retrospective study included 103 histopathologically confirmed brain tumor patients who underwent MRI, including DCE-MRI data acquisition. Quantitative DCE-MRI analysis was performed for computing various parameters such as wash-out slope (Slope-2), relative cerebral blood volume (rCBV), relative cerebral blood flow (rCBF), blood plasma volume fraction (Vp), and volume transfer constant (Ktrans). An approach based on data-clustering algorithm, morphological operations, and quantitative DCE-MRI maps was proposed for the segmentation of normal LBVs in brain tissues, including the tumor region. Here, three widely used data-clustering algorithms were evaluated on two types of quantitative maps: (a) Slope-2, and (b) a new proposed combination of rCBV and Slope-2 maps. Fluid-attenuated inversion recovery-MRI hyperintense lesions were also automatically segmented using deep learning-based architecture. The accuracy of LBV segmentation was qualitatively assessed blindly by two experienced observers, and Likert scoring was also obtained from each individual and compared using Cohen's Kappa test, and multiple statistical features from quantitative DCE-MRI parameters were obtained in the segmented tumor. t-test and receiver operating characteristic (ROC) curve analysis were performed for comparing the effect of removal of LBVs on parameters as well as on tumor grading. k-means clustering exhibited better accuracy and computational efficiency. 
Tumors, in particular high-grade gliomas (HGGs), showed a high contrast compared with normal tissues (relative % difference = 18.5%) on quantitative maps after the removal of LBVs. Statistical features (95th percentile values) of all parameters in the tumor region showed a statistically significant difference (p < 0.05) between with and without LBV maps. Similar results were obtained for the ROC curve analysis for differentiation between low-grade gliomas and HGGs. Moreover, after the removal of LBVs, the rCBV, rCBF, and Vp maps show better visualization of tumor regions.

11.
Am J Obstet Gynecol ; 231(1): 122.e1-122.e9, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38527606

ABSTRACT

BACKGROUND: Continuous glucose monitoring has facilitated the evaluation of dynamic changes in glucose throughout the day and their effect on fetal growth abnormalities in pregnancy. However, studies of multiple continuous glucose monitoring metrics combined and their association with other adverse pregnancy outcomes are limited. OBJECTIVE: This study aimed to (1) use machine learning techniques to identify discrete glucose profiles based on weekly continuous glucose monitoring metrics in pregnant individuals with pregestational diabetes mellitus and (2) investigate their association with adverse pregnancy outcomes. STUDY DESIGN: This study analyzed data from a retrospective cohort study of pregnant patients with type 1 or 2 diabetes mellitus who used Dexcom G6 continuous glucose monitoring and delivered a nonanomalous, singleton pregnancy at a tertiary center between 2019 and 2023. Continuous glucose monitoring data were collapsed into 39 weekly glycemic measures related to centrality, spread, excursions, and circadian cycle patterns. Principal component analysis and k-means clustering were used to identify 4 discrete groups, and patients were assigned to the group that best represented their continuous glucose monitoring patterns during pregnancy. Finally, the association between glucose profile groups and outcomes (preterm birth, cesarean delivery, preeclampsia, large-for-gestational-age neonate, neonatal hypoglycemia, and neonatal intensive care unit admission) was estimated using multivariate logistic regression adjusted for diabetes mellitus type, maternal age, insurance, continuous glucose monitoring use before pregnancy, and parity. RESULTS: Of 177 included patients, 90 (50.8%) had type 1 diabetes mellitus, and 85 (48.3%) had type 2 diabetes mellitus. 
This study identified 4 glucose profiles: (1) well controlled; (2) suboptimally controlled with high variability, fasting hypoglycemia, and daytime hyperglycemia; (3) suboptimally controlled with minimal circadian variation; and (4) poorly controlled with peak hyperglycemia overnight. Compared with the well-controlled profile, the suboptimally controlled profile with high variability had higher odds of a large-for-gestational-age neonate (adjusted odds ratio, 3.34; 95% confidence interval, 1.15-9.89). The suboptimally controlled with minimal circadian variation profile had higher odds of preterm birth (adjusted odds ratio, 2.59; 95% confidence interval, 1.10-6.24), cesarean delivery (adjusted odds ratio, 2.76; 95% confidence interval, 1.09-7.46), and neonatal intensive care unit admission (adjusted odds ratio, 4.08; 95% confidence interval, 1.58-11.40). The poorly controlled profile with peak hyperglycemia overnight had higher odds of preeclampsia (adjusted odds ratio, 2.54; 95% confidence interval, 1.02-6.52), large-for-gestational-age neonate (adjusted odds ratio, 3.72; 95% confidence interval, 1.37-10.4), neonatal hypoglycemia (adjusted odds ratio, 3.53; 95% confidence interval, 1.37-9.71), and neonatal intensive care unit admission (adjusted odds ratio, 3.15; 95% confidence interval, 1.20-9.09). CONCLUSION: Discrete glucose profiles of pregnant individuals with pregestational diabetes mellitus were identified through joint consideration of multiple continuous glucose monitoring metrics. Prolonged exposure to maternal hyperglycemia may be associated with a higher risk of adverse pregnancy outcomes than suboptimal glycemic control characterized by high glucose variability and intermittent hyperglycemia.


Subjects
Blood Glucose Self-Monitoring; Blood Glucose; Cesarean Section; Diabetes Mellitus, Type 1; Diabetes Mellitus, Type 2; Hypoglycemia; Pre-Eclampsia; Pregnancy Outcome; Pregnancy in Diabetics; Premature Birth; Humans; Female; Pregnancy; Adult; Retrospective Studies; Pregnancy in Diabetics/blood; Diabetes Mellitus, Type 1/blood; Hypoglycemia/epidemiology; Blood Glucose/metabolism; Blood Glucose/analysis; Premature Birth/epidemiology; Cesarean Section/statistics & numerical data; Pre-Eclampsia/epidemiology; Infant, Newborn; Diabetes Mellitus, Type 2/blood; Fetal Macrosomia/epidemiology; Machine Learning; Intensive Care Units, Neonatal; Cohort Studies; Intensive Care, Neonatal; Continuous Glucose Monitoring
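
The abstract for this entry describes collapsing continuous glucose monitoring data into weekly measures of centrality, spread, and excursions before clustering, but does not list the 39 measures. The Python sketch below computes a few generic summaries of that kind. The choice of metrics, the function name, the readings, and the 63-140 mg/dL range passed in the example are all assumptions, not the study's definitions.

```python
import statistics


def summarize_glucose(readings_mg_dl, low, high):
    """Summarize one period of CGM readings: centrality, spread, and excursions."""
    n = len(readings_mg_dl)
    mean = statistics.fmean(readings_mg_dl)
    sd = statistics.pstdev(readings_mg_dl)  # population standard deviation
    return {
        "mean": mean,
        "sd": sd,
        "cv_percent": 100 * sd / mean,
        "fraction_in_range": sum(low <= g <= high for g in readings_mg_dl) / n,
        "fraction_above": sum(g > high for g in readings_mg_dl) / n,
        "fraction_below": sum(g < low for g in readings_mg_dl) / n,
    }


# Four hypothetical readings with an assumed target range of 63-140 mg/dL.
week = summarize_glucose([60, 100, 120, 160], low=63, high=140)
```

One such dictionary per patient-week would form a row of the feature matrix that dimensionality reduction and k-means then operate on.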
12.
Ann Behav Med ; 58(4): 242-252, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38413045

ABSTRACT

BACKGROUND: Individuals confronting health threats may display an optimistic bias such that judgments of their risk for illness or death are unrealistically positive given their objective circumstances. PURPOSE: We explored optimistic bias for health risks using k-means clustering in the context of COVID-19. We identified risk profiles using subjective and objective indicators of severity and susceptibility risk for COVID-19. METHODS: Between 3/18/2020-4/18/2020, a national probability sample of 6,514 U.S. residents reported both their subjective risk perceptions (e.g., perceived likelihood of illness or death) and objective risk indices (e.g., age, weight, pre-existing conditions) of COVID-19-related susceptibility and severity, alongside other pandemic-related experiences. Six months later, a subsample (N = 5,661) completed a follow-up survey with questions about their frequency of engagement in recommended health protective behaviors (social distancing, mask wearing, risk behaviors, vaccination intentions). RESULTS: The k-means clustering procedure identified five risk profiles in the Wave 1 sample; two of these demonstrated aspects of optimistic bias, representing almost 44% of the sample. In OLS regression models predicting health protective behavior adoption at Wave 2, clusters representing individuals with high perceived severity risk were most likely to report engagement in social distancing, but many individuals who were objectively at high risk for illness and death did not report engaging in self-protective behaviors. CONCLUSIONS: Objective risk of disease severity only inconsistently predicted health protective behavior. Risk profiles may help identify groups that need more targeted interventions to increase their support for public health policy and health enhancing recommendations more broadly.


As we move into an endemic stage of the COVID-19 pandemic, understanding engagement in health behaviors to curb the spread of disease remains critically important to manage COVID-19 and other health threats. However, peoples' perceptions about their risk of getting sick and having severe outcomes if they do fall ill are subject to bias. We studied a nationally representative probability sample of over 6,500 U.S. residents who completed surveys immediately after the COVID-19 pandemic began and approximately 6 months later. We used a computer processing (i.e., machine learning) approach to categorize participants based on both their actual risk factors for COVID-19 and their subjective understanding of that risk. Our analysis identified groups of individuals whose subjective perceptions of risk did not align with their actual risk characteristics. Specifically, almost 44% of our sample demonstrated an optimistic bias: they did not report higher risk of death from COVID-19 despite having one or more well-known risk factors for poor disease outcomes (e.g., older age, obesity). Six months later, membership in these risk groups prospectively predicted engagement in health protective and risky behaviors, as well as vaccine intentions, demonstrating how early risk perceptions may influence health behaviors over time.


Subjects
COVID-19; Humans; COVID-19/epidemiology; Health Behavior; Pandemics; Surveys and Questionnaires
13.
Eur J Neurol ; 31(3): e16170, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38069662

ABSTRACT

BACKGROUND AND PURPOSE: Post-stroke fatigue commonly presents alongside several comorbidities. The interaction between comorbidities and their relationship to fatigue is not known. In this study, we focus on physical and mood comorbidities, alongside lesion characteristics. We predict the emergence of distinct fatigue phenotypes with distinguishable physical and mood characteristics. METHODS: In this cross-sectional observational study of 94 first-time, non-depressed, moderately to minimally impaired chronic stroke survivors, the relationships among measures of motor function (grip strength, nine-hole peg test time), motor cortical excitability (resting motor threshold), Hospital Anxiety and Depression Scale and Fatigue Severity Scale-7 (FSS-7) scores, age, gender and side of stroke were established using Spearman's rank correlation. Mood and motor variables were then entered into a k-means clustering algorithm to identify the number of unique clusters, if any. Post hoc pairwise comparisons followed by corrections for multiple comparisons were performed to characterize differences among clusters in the variables included in k-means clustering. RESULTS: Clustering analysis revealed a four-cluster model to be the best model (average silhouette score of 0.311). There was no significant difference in FSS-7 scores among the four high-fatigue clusters. Two clusters consisted of only left-hemisphere strokes, and the remaining two were exclusively right-hemisphere strokes. Factors that differentiated hemisphere-specific clusters were the level of depressive symptoms and anxiety. Motor characteristics distinguished the low-depressive left-hemisphere from the right-hemisphere clusters. CONCLUSION: The significant differences in side of stroke and the differential relationship between mood and motor function in the four clusters reveal the heterogeneous nature of post-stroke fatigue, which is amenable to categorization. Such categorization is critical to an understanding of the interactions between post-stroke fatigue and its presenting comorbid deficits, with significant implications for the development of context-/category-specific interventions.


Subjects
Stroke Rehabilitation , Stroke , Humans , Cross-Sectional Studies , Fatigue/etiology , Stroke/diagnosis , Male , Female
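
The abstract above selects the number of clusters by the average silhouette score. As a rough illustration of that idea only, and not the authors' pipeline, the following sketch runs a plain Lloyd's k-means for several candidate values of k and keeps the one with the highest mean silhouette. The function names (`kmeans`, `mean_silhouette`, `choose_k`), the Euclidean distance, the single random start, and the toy data are all assumptions made for this example; the study's variables and preprocessing are not reproduced here.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k data points serve as starting centers
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def mean_silhouette(points, labels):
    """Average of (b - a) / max(a, b) over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, candidates):
    """Return the candidate k with the highest mean silhouette, plus all scores."""
    scores = {k: mean_silhouette(points, kmeans(points, k)) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (hypothetical): two obvious groups on a line.
toy = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
best_k, all_scores = choose_k(toy, [2, 3])
```

In practice a library implementation with several random restarts would usually be preferred, since a single start of Lloyd's algorithm can settle in a poor local optimum.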
14.
Fish Shellfish Immunol ; 152: 109788, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39053586

ABSTRACT

In the process of screening for probiotic strains, there are no clearly established bacterial phenotypic markers which could be used for the prediction of their in vivo mechanism of action. In this work, we demonstrate for the first time that Machine Learning (ML) methods can be used for accurately predicting the in vivo immunomodulatory activity of probiotic strains based on their cell surface phenotypic features using a snail host-microbe interaction model. A broad range of snail gut presumptive probiotics, including 240 new lactic acid bacterial strains (Lactobacillus, Leuconostoc, Lactococcus, and Enterococcus), were isolated and characterized based on their capacity to withstand snails' gastrointestinal defense barriers, such as the pedal mucus, gastric mucus, gastric juices, and acidic pH, in association with their cell surface hydrophobicity, autoaggregation, and biofilm formation ability. The implemented ML pipeline predicted with high accuracy (88%) strains with a strong capacity to enhance chemotaxis and phagocytic activity of snails' hemolymph cells, and also revealed bacterial autoaggregation and cell surface hydrophobicity as the most important parameters that significantly affect host immune responses. The results show that ML approaches may be useful to derive a predictive understanding of host-probiotic interactions, and also highlight the use of snails as an efficient animal model for screening presumptive probiotic strains in the light of their interaction with cellular innate immune responses.

15.
Environ Sci Technol ; 58(11): 5003-5013, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38446785

ABSTRACT

Lake and reservoir surface areas are an important proxy for freshwater availability. Advancements in machine learning (ML) techniques and increased accessibility of remote sensing data products have enabled the analysis of waterbody surface area dynamics on broad spatial scales. However, interpreting the ML results remains a challenge. While ML provides important tools for identifying patterns, the resultant models do not include mechanisms. Thus, the "black-box" nature of ML techniques often lacks ecological meaning. Using ML, we characterized temporal patterns in lake and reservoir surface area change from 1984 to 2016 for 103,930 waterbodies in the contiguous United States. We then employed knowledge-guided machine learning (KGML) to classify all waterbodies into seven ecologically interpretable groups representing distinct patterns of surface area change over time. Many waterbodies were classified as having "no change" (43%), whereas the remaining 57% of waterbodies fell into other groups representing both linear and nonlinear patterns. This analysis demonstrates the potential of KGML not only for identifying ecologically relevant patterns of change across time but also for unraveling complex processes that underpin those changes.


Subjects
Lakes , Machine Learning , United States
16.
Eur J Nutr ; 63(4): 1293-1314, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38403812

ABSTRACT

PURPOSE: The previous studies that examined the effectiveness of unsupervised machine learning methods versus traditional methods in assessing dietary patterns and their association with incident hypertension showed contradictory results. Consequently, our aim is to explore the correlation between the incidence of hypertension and overall dietary patterns that were extracted using unsupervised machine learning techniques. METHODS: Data were obtained from Japanese male participants enrolled in a prospective cohort study between August 2008 and August 2010. A final dataset of 447 male participants was used for analysis. Dimension reduction using uniform manifold approximation and projection (UMAP) and subsequent K-means clustering was used to derive dietary patterns. In addition, multivariable logistic regression was used to evaluate the association between dietary patterns and the incidence of hypertension. RESULTS: We identified four dietary patterns: 'Low-protein/fiber High-sugar,' 'Dairy/vegetable-based,' 'Meat-based,' and 'Seafood and Alcohol.' Compared with 'Seafood and Alcohol' as a reference, the protective dietary patterns for hypertension were 'Dairy/vegetable-based' (OR 0.39, 95% CI 0.19-0.80, P = 0.013) and the 'Meat-based' (OR 0.37, 95% CI 0.16-0.86, P = 0.022) after adjusting for potential confounding factors, including age, body mass index, smoking, education, physical activity, dyslipidemia, and diabetes. An age-matched sensitivity analysis confirmed this finding. CONCLUSION: This study finds that relative to the 'Seafood and Alcohol' pattern, the 'Dairy/vegetable-based' and 'Meat-based' dietary patterns are associated with a lower risk of hypertension among men.


Subjects
Diet , Hypertension , Machine Learning , Humans , Male , Hypertension/epidemiology , Japan/epidemiology , Incidence , Middle Aged , Prospective Studies , Diet/methods , Diet/statistics & numerical data , Cohort Studies , Adult , Risk Factors , Feeding Behavior , Dietary Patterns , East Asian People
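
The odds ratios and 95% confidence intervals reported in the abstract above come from multivariable logistic regression. As a hedged illustration of how such figures relate to a regression coefficient, the sketch below assumes a standard Wald interval, exp(beta ± z·SE); the abstract does not state which interval method was used, and the helper names `odds_ratio_ci` and `se_from_ci` are hypothetical. The only numbers used are the published OR and interval for the 'Dairy/vegetable-based' pattern.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald interval from a log-odds coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Back out the standard error implied by a Wald interval on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Consistency check on the reported 'Dairy/vegetable-based' estimate
# (OR 0.39, 95% CI 0.19-0.80), under the Wald assumption stated above.
implied_se = se_from_ci(0.19, 0.80)
or_, lower, upper = odds_ratio_ci(math.log(0.39), implied_se)
```

A Wald interval is symmetric around the coefficient on the log scale, so the reported OR should sit close to the geometric mean of the interval bounds; the check above only tests that the rounded published values are compatible with that form.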
17.
Eur J Nutr ; 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38512358

ABSTRACT

PURPOSE: This study utilized data mining and machine learning (ML) techniques to identify new patterns and classifications of the associations between nutrient intake and anemia among university students. METHODS: We employed the K-means clustering algorithm and the Decision Tree (DT) technique to identify the association between anemia and vitamin and mineral intakes. We normalized and balanced the data based on anemia weighted clusters to improve the accuracy of the ML models. In addition, t-tests and Analysis of Variance (ANOVA) were performed to identify significant differences between the clusters. We evaluated the models on a balanced dataset of 755 female participants from the Hebron district in Palestine. RESULTS: Our study found that 34.8% of the participants were anemic. The intake of various micronutrients (i.e., folate, Vit A, B5, B6, B12, C, E, Ca, Fe, and Mg) was below RDA/AI values, which indicated an overall unbalanced malnutrition in the present cohort. Anemia was significantly associated with intakes of energy, protein, fat, Vit B1, B5, B6, C, Mg, Cu and Zn. On the other hand, intakes of protein, Vit B2, B5, B6, C, E, choline, folate, phosphorus, Mn and Zn were significantly lower in anemic than in non-anemic subjects. DT classification models for vitamins and minerals (accuracy rate: 82.1%) identified an inverse association between intakes of Vit B2, B3, B5, B6, B12, E, folate, Zn, Mg, Fe and Mn and prevalence of anemia. CONCLUSIONS: Besides the nutrients commonly known to be linked to anemia (such as folate, Vit B6, C, B12, or Fe), the cluster analyses in the present cohort of young female university students also identified choline, Vit E, B2, Zn, Mg, Mn, and phosphorus as additional nutrients that might relate to the development of anemia. Further research is needed to elucidate whether the intake of these nutrients might influence the risk of anemia.

18.
J Biomed Inform ; 156: 104688, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39002866

ABSTRACT

OBJECTIVE: Survival analysis is widely utilized in healthcare to predict the timing of disease onset. Traditional methods of survival analysis are usually based on the Cox Proportional Hazards model and assume proportional risk for all subjects. However, this assumption is rarely true for most diseases, as the underlying factors have complex, non-linear, and time-varying relationships. This concern is especially relevant for pregnancy, where the risk for pregnancy-related complications, such as preeclampsia, varies across gestation. Recently, deep learning survival models have shown promise in addressing the limitations of classical models, as the novel models allow for non-proportional risk handling, capturing nonlinear relationships, and navigating complex temporal dynamics. METHODS: We present a methodology to model the temporal risk of preeclampsia during pregnancy and investigate the associated clinical risk factors. We utilized a retrospective dataset including 66,425 pregnant individuals who delivered in two tertiary care centers from 2015 to 2023. We modeled the preeclampsia risk by modifying DeepHit, a deep survival model, which leverages neural network architecture to capture time-varying relationships between covariates in pregnancy. We applied time series k-means clustering to DeepHit's normalized output and investigated interpretability using Shapley values. RESULTS: We demonstrate that DeepHit can effectively handle high-dimensional data and evolving risk hazards over time with performance similar to the Cox Proportional Hazards model, achieving an area under the curve (AUC) of 0.78 for both models. The deep survival model outperformed traditional methodology by identifying time-varied risk trajectories for preeclampsia, providing insights for early and individualized intervention. K-means clustering separated patients into low-risk, early-onset, and late-onset preeclampsia groups; notably, each group has distinct risk factors. CONCLUSION: This work demonstrates a novel application of deep survival analysis in time-varying prediction of preeclampsia risk. Our results highlight the advantage of deep survival models over Cox Proportional Hazards models in providing personalized risk trajectories and demonstrate the potential of deep survival models to generate interpretable and meaningful clinical applications in medicine.


Subjects
Pre-Eclampsia , Humans , Pre-Eclampsia/mortality , Pregnancy , Female , Survival Analysis , Risk Factors , Deep Learning , Adult , Retrospective Studies , Proportional Hazards Models , Neural Networks, Computer , Risk Assessment/methods
19.
Cereb Cortex ; 33(12): 8056-8065, 2023 06 08.
Article in English | MEDLINE | ID: mdl-37067514

ABSTRACT

Temporal lobe epilepsy (TLE) is the most common epilepsy syndrome that empirically represents a network disorder, which makes graph theory (GT) a practical approach to understand it. Multi-shell diffusion-weighted imaging (DWI) was obtained from 89 TLE and 50 controls. GT measures extracted from harmonized DWI matrices were used as factors in a support vector machine (SVM) analysis to discriminate between groups, and in a k-means algorithm to find intrinsic structural phenotypes within TLE. SVM was able to predict group membership (mean accuracy = 0.70, area under the curve (AUC) = 0.747, Brier score (BS) = 0.264) using 10-fold cross-validation. In addition, k-means clustering identified 2 TLE clusters: 1 similar to controls, and 1 dissimilar. Clusters were significantly different in their distribution of cognitive phenotypes, with the Dissimilar cluster containing the majority of TLE with cognitive impairment (χ² = 6.641, P = 0.036). In addition, cluster membership showed significant correlations between GT measures and clinical variables. Given that SVM classification seemed driven by the Dissimilar cluster, SVM analysis was repeated to classify Dissimilar versus Similar + Controls with a mean accuracy of 0.91 (AUC = 0.957, BS = 0.189). Altogether, the pattern of results shows that GT measures based on connectome DWI could be significant factors in the search for clinical and neurobehavioral biomarkers in TLE.


Subjects
Connectome , Epilepsy, Temporal Lobe , Humans , Epilepsy, Temporal Lobe/diagnostic imaging , Connectome/methods , Diffusion Magnetic Resonance Imaging , Cognition , Magnetic Resonance Imaging/methods
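
The abstract above reports classifier performance as AUC and Brier score under 10-fold cross-validation. The sketch below shows, in a simplified form, what those two metrics compute and how contiguous folds can be formed. The function names (`brier_score`, `roc_auc`, `kfold_indices`) are hypothetical, the classifier itself is omitted, and nothing here reproduces the study's SVM or data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


# Toy labels and predicted probabilities (hypothetical values).
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
```

For orientation, a constant predicted probability of 0.5 yields a Brier score of 0.25 whatever the labels are, so lower values indicate better calibrated and sharper predictions. In real use the data would normally be shuffled, and often stratified by class, before the folds are formed.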
20.
Environ Res ; 252(Pt 2): 118934, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38653438

ABSTRACT

The Changzhi Basin in Shanxi is renowned for its extensive mining activities. It is crucial to comprehend the spatial distribution and geochemical factors influencing its water quality to uphold water security and safeguard the ecosystem. However, the complexity inherent in hydrogeochemical data presents challenges for linear data analysis methods. This study utilizes a combined approach of self-organizing maps (SOM) and K-means clustering to investigate the hydrogeochemical sources of shallow groundwater in the Changzhi Basin and the associated human health risks. The results showed that the groundwater chemical characteristics were categorized into 48 neurons grouped into six clusters (C1-C6) representing different groundwater types with different contamination characteristics. C1, C3, and C5 represent uncontaminated or minimally contaminated groundwater (Ca-HCO₃ type), while C2 signifies mixed-contaminated groundwater (HCO₃-Ca type, Mixed Cl-Mg-Ca type, and CaSO₄ type). C4 samples exhibit impacts from agricultural activities (Mixed Cl-Mg-Ca), and C6 reflects groundwater high in Ca and NO₃⁻. Anthropogenic activities, especially agriculture, have resulted in elevated NO₃⁻ levels in shallow groundwater. Notably, heightened non-carcinogenic risks linked to NO₃⁻, Pb, F⁻, and Mn exposure through drinking water, particularly impacting children, warrant significant attention. This research contributes valuable insights into sustainable groundwater resource development, pollution mitigation strategies, and effective ecosystem protection within intensive mining regions like the Changzhi Basin. It serves as a vital reference for similar areas worldwide, offering guidance for groundwater management, pollution prevention, and control.


Subjects
Environmental Monitoring , Groundwater , Mining , Water Pollutants, Chemical , Groundwater/chemistry , Groundwater/analysis , China , Water Pollutants, Chemical/analysis , Humans , Environmental Monitoring/methods , Risk Assessment