Results 1 - 20 of 237
1.
Stat Med ; 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39007408

ABSTRACT

In this work, we propose methods to examine how the complex interrelationships between clinical symptoms and, separately, brain imaging biomarkers change over time leading up to the diagnosis of a disease in subjects with a known genetic near-certainty of disease. We propose a time-dependent undirected graphical model that ensures temporal and structural smoothness across time-specific networks to examine the trajectories of interactions between markers aligned at the time of disease onset. Specifically, we anchor subjects relative to the time of disease diagnosis (anchoring time) as in a revival process, and we estimate networks at each time point of interest relative to the anchoring time. To use all available data, we apply kernel weights to borrow information across observations that are close to the time of interest. Adaptive lasso weights are introduced to encourage temporal smoothness in edge strength, while a novel elastic fused-$l_0$ penalty removes spurious edges and encourages temporal smoothness in network structure. Our approach can handle practical complications such as unbalanced visit times. We conduct simulation studies to compare our approach with existing methods. We then apply our method to data from PREDICT-HD, a large prospective observational study of pre-manifest Huntington's disease (HD) patients, to identify symptom and imaging network changes that precede clinical diagnosis of HD.
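
As a rough illustration of the kernel-weighting idea in this abstract, the sketch below computes normalized weights for observations around a time of interest measured relative to the anchoring time. The function name `kernel_weights`, the Gaussian kernel, and the bandwidth value are assumptions made for illustration; the abstract does not state which kernel or bandwidth the authors use.

```python
import math

def kernel_weights(visit_times, target_time, bandwidth):
    """Return normalized Gaussian kernel weights, one per visit time.

    visit_times: times relative to the anchoring time (e.g. years before diagnosis).
    target_time: time of interest at which a network would be estimated.
    bandwidth:   hypothetical smoothing parameter (an assumption, not from the paper).
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    raw = [math.exp(-0.5 * ((t - target_time) / bandwidth) ** 2) for t in visit_times]
    total = sum(raw)
    return [w / total for w in raw]

# Unbalanced visit times (years relative to diagnosis); target is two years before onset.
weights = kernel_weights([-5.0, -3.5, -2.1, -0.4], target_time=-2.0, bandwidth=1.0)
```

Observations closest to the time of interest receive the largest weight, so every visit contributes to the estimate at that time point even when visit schedules differ across subjects.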

2.
J Nutr Health Aging ; 28(9): 100324, 2024 Jul 26.
Article in English | MEDLINE | ID: mdl-39067141

ABSTRACT

BACKGROUND: Along with the ageing of society, the absolute prevalence of age-related diseases is expected to rise, leading to a substantial burden on healthcare systems and society. Thus, there is an urgent need to promote healthy ageing. As opposed to chronological age, biological age was introduced to accurately represent the ageing process, as it considers physiological deterioration that is linked to morbidity and mortality risk. Furthermore, biological age responds to various factors, including nutritional factors, which have the potential to mitigate the risk of age-related diseases. As a result, a promising biomarker of biological age known as the epigenetic clock has emerged as a suitable measure to investigate the direct relations between nutritional factors and ageing, thereby identifying potential intervention targets to improve healthy ageing. METHODS: In this study, we analysed data from 3,969 postmenopausal women from the Women's Health Initiative to identify nutrients that are associated with the rate of ageing by using an accurate measure of biological age called the PhenoAge epigenetic clock. We used Copula Graphical Models, a data-driven exploratory analysis tool, to identify direct relationships between nutrient intake and age-acceleration, while correcting for every variable in the dataset. RESULTS: We revealed that increased dietary intakes of coumestrol, beta-carotene and arachidic acid were associated with decelerated epigenetic ageing. In contrast, increased intakes of added sugar, gondoic acid, behenic acid, arachidonic acid, vitamin A and ash were associated with accelerated epigenetic ageing in postmenopausal women. CONCLUSION: Our study discovered direct relations between nutrients and epigenetic ageing, revealing promising areas for follow-up studies to determine the magnitude and causality of our estimated diet-epigenetic relationships.

3.
J Appl Stat ; 51(10): 1843-1860, 2024.
Article in English | MEDLINE | ID: mdl-39071251

ABSTRACT

A growing literature suggests that gene expression can be greatly altered in disease conditions, and identifying those changes will improve the understanding of complex diseases such as cancers or diabetes. A prevailing direction in the analysis of gene expression studies the changes in gene pathways which include sets of related genes. Therefore, introducing structured exploration to differential analysis of gene expression networks may lead to meaningful discoveries. The topic of this paper is differential network analysis, which focuses on capturing the differences between two or more precision matrices. We discuss the connection between the thresholding method and the D-trace loss method on differential network analysis in the case that the precision matrices share the common connected components. Based on this connection, we further propose the cluster D-trace loss method which directly estimates the differential network and achieves model selection consistency. Simulation studies demonstrate its improved performance and computational efficiency. Finally, the usefulness of our proposed estimator is demonstrated by a real-data analysis on non-small cell lung cancer.
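
To make the notion of a differential network concrete, the sketch below takes two already-estimated precision matrices and reports the entries whose difference exceeds a threshold. This is a naive plug-in illustration, not the cluster D-trace loss estimator proposed in the paper (which estimates the difference directly); the function name, matrices, and threshold are hypothetical.

```python
def differential_edges(omega_a, omega_b, threshold):
    """Return (i, j, delta) for i < j where |omega_a[i][j] - omega_b[i][j]| > threshold."""
    p = len(omega_a)
    edges = []
    for i in range(p):
        for j in range(i + 1, p):
            delta = omega_a[i][j] - omega_b[i][j]
            if abs(delta) > threshold:
                edges.append((i, j, delta))
    return edges

# Toy precision matrices for two conditions (illustrative values only).
omega_control = [[2.0, -0.8, 0.0], [-0.8, 2.0, -0.5], [0.0, -0.5, 2.0]]
omega_disease = [[2.0, -0.8, 0.6], [-0.8, 2.0, -0.5], [0.6, -0.5, 2.0]]
changed = differential_edges(omega_control, omega_disease, threshold=0.1)
```

In this toy example only the entry linking variables 0 and 2 differs between the two conditions, so it is the only edge in the differential network.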

4.
Behav Res Methods ; 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39080123

ABSTRACT

We develop a Bayesian method for aggregating partial ranking data using the Thurstone model. Our implementation is a JAGS graphical model that allows each individual to rank any subset of items, and provides an inference about the latent true ranking of the items and the relative expertise of each individual. We demonstrate the method by analyzing data from new experiments that collected partial ranking data. In one experiment, participants were assigned subsets of items to rank; in the other experiment, participants could choose how many and which items they ranked. We show that our method works effectively for both sorts of partial ranking in applications to US city populations and the chronology of US presidents. We discuss the potential of the method for studying the wisdom of the crowd and other research problems that require aggregating incomplete or partial rankings.
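
For readers unfamiliar with the Thurstone model, the sketch below shows its basic pairwise building block under simplifying assumptions: each item has a latent value, an individual perceives each item with independent Gaussian noise of standard deviation `sigma`, and the item with the larger perceived value is ranked first. Treating a smaller `sigma` as greater expertise is an assumption made here for illustration; the paper's JAGS model is not reproduced.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_ranked_above(mu_i, mu_j, sigma):
    """P(item i is perceived as larger than item j) when both percepts
    have independent Gaussian noise with standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return normal_cdf((mu_i - mu_j) / (sigma * math.sqrt(2.0)))

# Same pair of items judged by a precise and a noisy individual (hypothetical values).
p_expert = prob_ranked_above(1.0, 0.0, sigma=0.5)
p_novice = prob_ranked_above(1.0, 0.0, sigma=2.0)
```

The less noisy individual orders the pair correctly with higher probability, which is the sense in which the noise level can stand in for expertise in this sketch.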

5.
BMC Bioinformatics ; 25(1): 209, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38867193

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (sc-RNASeq) data illuminate transcriptomic heterogeneity but also possess a high level of noise, abundant missing entries and sometimes inadequate or no cell type annotations at all. Bulk-level gene expression data lack direct information of cell population composition but are more robust and complete and often better annotated. We propose a modeling framework to integrate bulk-level and single-cell RNASeq data to address the deficiencies and leverage the mutual strengths of each type of data and enable a more comprehensive inference of their transcriptomic heterogeneity. Contrary to the standard approaches of factorizing the bulk-level data with one algorithm and (for some methods) treating single-cell RNASeq data as references to decompose bulk-level data, we employed multiple deconvolution algorithms to factorize the bulk-level data, constructed the probabilistic graphical models of cell-level gene expressions from the decomposition outcomes, and compared the log-likelihood scores of these models in single-cell data. We term this framework backward deconvolution as inference operates from coarse-grained bulk-level data to fine-grained single-cell data. As the abundant missing entries in sc-RNASeq data have a significant effect on log-likelihood scores, we also developed a criterion for inclusion or exclusion of zero entries in log-likelihood score computation. RESULTS: We selected nine deconvolution algorithms and validated backward deconvolution in five datasets. In the in-silico mixtures of mouse sc-RNASeq data, the log-likelihood scores of the deconvolution algorithms were strongly anticorrelated with their errors of mixture coefficients and cell type specific gene expression signatures. In the true bulk-level mouse data, the sample mixture coefficients were unknown but the log-likelihood scores were strongly correlated with accuracy rates of inferred cell types. 
In the data of autism spectrum disorder (ASD) and normal controls, we found that ASD brains possessed higher fractions of astrocytes and lower fractions of NRGN-expressing neurons than normal controls. In datasets of breast cancer and low-grade gliomas (LGG), we compared the log-likelihood scores of three simple hypotheses about the gene expression patterns of the cell types underlying the tumor subtypes. The model that tumors of each subtype were dominated by one cell type persistently outperformed an alternative model that each cell type had elevated expression in one gene group and tumors were mixtures of those cell types. Superiority of the former model is also supported by comparing the real breast cancer sc-RNASeq clusters with those generated by simulated sc-RNASeq data. CONCLUSIONS: The results indicate that backward deconvolution serves as a sensible model selection tool for deconvolution algorithms and facilitates discerning hypotheses about cell type compositions underlying heterogeneous specimens such as tumors.


Subject(s)
Algorithms; Sequence Analysis, RNA; Single-Cell Analysis; Transcriptome; Single-Cell Analysis/methods; Sequence Analysis, RNA/methods; Transcriptome/genetics; Humans; Gene Expression Profiling/methods; Animals; Mice; Single-Cell Gene Expression Analysis
6.
BMC Med Res Methodol ; 24(1): 136, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909216

ABSTRACT

BACKGROUND: Generating synthetic patient data is crucial for medical research, but common approaches build on black-box models which do not allow for expert verification or intervention. We propose a highly available method which enables synthetic data generation from real patient records in a privacy-preserving and compliant fashion, is interpretable and allows for expert intervention. METHODS: Our approach ties together two established tools in medical informatics, namely OMOP as a data standard for electronic health records and Synthea as a data synthetization method. For this study, data pipelines were built which extract data from OMOP, convert them into time series format, learn temporal rules with 2 statistical algorithms (Markov chain, TARM) and 3 causal discovery algorithms (DYNOTEARS, J-PCMCI+, LiNGAM), and map the outputs into Synthea graphs. The graphs are evaluated quantitatively by their individual and relative complexity and qualitatively by medical experts. RESULTS: The algorithms were found to learn qualitatively and quantitatively different graph representations. Whereas the Markov chain results in extremely large graphs, TARM, DYNOTEARS, and J-PCMCI+ were found to reduce the data dimension during learning. The MultiGroupDirect LiNGAM algorithm was found not to be applicable to the problem statement at hand. CONCLUSION: Only TARM and DYNOTEARS are practical algorithms for real-world data in this use case. As causal discovery is a method to debias purely statistical relationships, the gradient-based causal discovery algorithm DYNOTEARS was found to be most suitable.


Subject(s)
Algorithms; Electronic Health Records; Humans; Electronic Health Records/statistics & numerical data; Electronic Health Records/standards; Markov Chains; Medical Informatics/methods; Medical Informatics/statistics & numerical data
7.
Mathematics (Basel) ; 12(9), 2024 May 01.
Article in English | MEDLINE | ID: mdl-38784721

ABSTRACT

While existing research has identified diverse fall risk factors in adults aged 60 and older across various areas, comprehensively examining the interrelationships between all factors can enhance our knowledge of complex mechanisms and ultimately prevent falls. This study employs a novel approach, a mixed undirected graphical model (MUGM), to unravel the interplay between sociodemographics, mental well-being, body composition, self-assessed and performance-based fall risk assessments, and physical activity patterns. Using a parameterized joint probability density, MUGMs specify the higher-order dependence structure and reveal the underlying graphical structure of heterogeneous variables. The MUGM consisting of mixed types of variables (continuous and categorical) has versatile applications that provide innovative and practical insights, as it is equipped to transcend the limitations of traditional correlation analysis and uncover sophisticated interactions within a high-dimensional data set. Our study included 120 elders from central Florida whose 37 fall risk factors were analyzed using an MUGM. Among the identified features, 34 exhibited pairwise relationships, while COVID-19-related factors and housing composition remained conditionally independent from all others. The results from our study serve as a foundational exploration, and future research investigating the longitudinal aspects of these features plays a pivotal role in enhancing our knowledge of the dynamics contributing to fall prevention in this population.

8.
Mol Cell Proteomics ; 23(6): 100785, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38750696

ABSTRACT

The molecular mechanisms that drive the onset and development of osteoarthritis (OA) remain largely unknown. In this exploratory study, we used a proteomic platform (SOMAscan assay) to measure the relative abundance of more than 6000 proteins in synovial fluid (SF) from knees of human donors with healthy or mildly degenerated tissues, and knees with late-stage OA from patients undergoing knee replacement surgery. Using a linear mixed effects model, we estimated the differential abundance of 6251 proteins between the three groups. We found 583 proteins upregulated in the late-stage OA, including MMP1, collagenase 3 and interleukin-6. Further, we selected 760 proteins (800 aptamers) based on absolute fold changes between the healthy and mild degeneration groups. To those, we applied Gaussian Graphical Models (GGMs) to analyze the conditional dependence of proteins and to identify key proteins and subnetworks involved in early OA pathogenesis. After regularization and stability selection, we identified 102 proteins involved in GGM networks. Notably, network complexity was lost in the protein graph for mild degeneration when compared to controls, suggesting a disruption in the regular protein interplay. Furthermore, among our main findings were several downregulated (in mild degeneration versus healthy) proteins with unique interactions in the healthy group, one of which, SLCO5A1, has not previously been associated with OA. Our results suggest that this protein is important for healthy joint function. Further, our data suggests that SF proteomics, combined with GGMs, can reveal novel insights into the molecular pathogenesis and identification of biomarker candidates for early-stage OA.
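
The conditional-dependence reading of a Gaussian graphical model can be sketched with the standard identity that the partial correlation between variables i and j equals -ω_ij / sqrt(ω_ii ω_jj), where ω denotes entries of the precision (inverse covariance) matrix. The matrix below is a toy example, not data from the study, and the regularization and stability-selection steps mentioned in the abstract are not shown.

```python
import math

def partial_correlations(omega):
    """Convert a precision matrix (list of lists) into a partial-correlation matrix."""
    p = len(omega)
    result = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                result[i][j] = 1.0
            else:
                result[i][j] = -omega[i][j] / math.sqrt(omega[i][i] * omega[j][j])
    return result

# Toy precision matrix for three proteins arranged in a chain (illustrative values).
omega = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
pcor = partial_correlations(omega)
```

A zero off-diagonal entry, such as the one between the first and third variables here, corresponds to a missing edge: those two variables are conditionally independent given the remaining one.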


Subject(s)
Protein Interaction Maps; Proteomics; Synovial Fluid; Humans; Synovial Fluid/metabolism; Proteomics/methods; Female; Male; Aged; Middle Aged; Osteoarthritis, Knee/metabolism; Osteoarthritis, Knee/pathology; Osteoarthritis/metabolism; Osteoarthritis/pathology; Interleukin-6/metabolism; Proteome/metabolism; Matrix Metalloproteinase 1/metabolism
9.
Age Ageing ; 53(Suppl 2): ii20-ii29, 2024 05 11.
Article in English | MEDLINE | ID: mdl-38745494

ABSTRACT

BACKGROUND: Heterogeneity in ageing rates drives the need for research into lifestyle secrets of successful agers. Biological age, predicted by epigenetic clocks, has been shown to be a more reliable measure of ageing than chronological age. Dietary habits are known to affect the ageing process. However, much remains to be learnt about specific dietary habits that may directly affect the biological process of ageing. OBJECTIVE: To identify food groups that are directly related to biological ageing, using Copula Graphical Models. METHODS: We performed a preregistered analysis of 3,990 postmenopausal women from the Women's Health Initiative, based in North America. Biological age acceleration was calculated by the epigenetic clock PhenoAge using whole-blood DNA methylation. Copula Graphical Modelling, a powerful data-driven exploratory tool, was used to examine relations between food groups and biological ageing whilst adjusting for an extensive amount of confounders. Two food group-age acceleration networks were established: one based on the MyPyramid food grouping system and another based on item-level food group data. RESULTS: Intake of eggs, organ meat, sausages, cheese, legumes, starchy vegetables, added sugar and lunch meat was associated with biological age acceleration, whereas intake of peaches/nectarines/plums, poultry, nuts, discretionary oil and solid fat was associated with decelerated ageing. CONCLUSION: We identified several associations between specific food groups and biological ageing. These findings pave the way for subsequent studies to ascertain causality and magnitude of these relationships, thereby improving the understanding of biological mechanisms underlying the interplay between food groups and biological ageing.


Subject(s)
Aging; DNA Methylation; Feeding Behavior; Humans; Female; Aged; Middle Aged; Age Factors; Epigenesis, Genetic; Diet/statistics & numerical data; Postmenopause
10.
J Multivar Anal ; 202, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38433779

ABSTRACT

Network estimation has been a critical component of high-dimensional data analysis and can provide an understanding of the underlying complex dependence structures. Among the existing studies, Gaussian graphical models have been highly popular. However, they still have limitations due to the homogeneous distribution assumption and the fact that they are only applicable to small-scale data. For example, cancers have various levels of unknown heterogeneity, and biological networks, which include thousands of molecular components, often differ across subgroups while also sharing some commonalities. In this article, we propose a new joint estimation approach for multiple networks with unknown sample heterogeneity, by decomposing the Gaussian graphical model (GGM) into a collection of sparse regression problems. A reparameterization technique and a composite minimax concave penalty are introduced to effectively accommodate the specific and common information across the networks of multiple subgroups. As a result, the proposed estimator advances significantly beyond existing heterogeneity network analyses based directly on the regularized likelihood of the GGM, and it enjoys scale-invariance, tuning-insensitivity, and optimization convexity properties. The proposed analysis can be effectively realized using parallel computing. The estimation and selection consistency properties are rigorously established. The proposed approach allows the theoretical studies to focus on independent network estimation only and has the significant advantage of being both theoretically and computationally applicable to large-scale data. Extensive numerical experiments with simulated data and the TCGA breast cancer data demonstrate the prominent performance of the proposed approach in both subgroup and network identifications.

11.
Medicina (Kaunas) ; 60(3), 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38541163

ABSTRACT

Background and Objectives: This paper aims to assess the role of laser therapy in periodontitis through an innovative approach involving computational prediction and advanced modeling performed through network analysis (Gaussian graphical models, GGMs) and structural equation modeling (SEM). Materials and Methods: Forty patients, exhibiting periodontal pockets with a minimum depth of 5 mm, were randomly divided into two groups: a control group and a laser group. Four specific indicators were measured for each tooth, namely periodontal pocket depth (PPD), clinical attachment level (CAL), bleeding on probing (BOP), and plaque index (PI), and the mean of six measured values was recorded at five time markers (baseline, 6 months, 1 year, 2 years, and 4 years). The assessment algorithm included enrollment, measurements, and differential non-surgical periodontal treatment, according to the group allocation. Scaling, root planing, and chlorhexidine 1% were conducted for the control group, and scaling, root planing and erbium, chromium:yttrium-scandium-gallium-garnet (Er,Cr:YSGG) laser therapy were conducted for the laser group. Results: The main results highlight that the addition of laser treatment to scaling and root planing led to notable clinical improvements, decreasing the PPD values, reducing the BOP scores, and increasing the CAL. Conclusions: Notable relationships between the specific indicators considered were highlighted by both the GGMs and SEM, thus confirming their suitability as proxies for the success of periodontal treatment.


Subject(s)
Laser Therapy; Low-Level Light Therapy; Periodontitis; Humans; Latent Class Analysis; Periodontitis/radiotherapy; Periodontitis/surgery; Laser Therapy/methods; Root Planing/methods; Follow-Up Studies
12.
J Appl Stat ; 51(1): 114-138, 2024.
Article in English | MEDLINE | ID: mdl-38179161

ABSTRACT

We propose a novel approach to the estimation of multiple Graphical Models to analyse temporal patterns of association among a set of metabolites over different groups of patients. Our motivating application is the Southall And Brent REvisited (SABRE) study, a tri-ethnic cohort study conducted in the UK. We are interested in identifying potential ethnic differences in metabolite levels and associations as well as their evolution over time, with the aim of gaining a better understanding of different risk of cardio-metabolic disorders across ethnicities. Within a Bayesian framework, we employ a nodewise regression approach to infer the structure of the graphs, borrowing information across time as well as across ethnicities. The response variables of interest are metabolite levels measured at two time points and for two ethnic groups, Europeans and South-Asians. We use nodewise regression to estimate the high-dimensional precision matrices of the metabolites, imposing sparsity on the regression coefficients through the dynamic horseshoe prior, thus favouring sparser graphs. We provide the code to fit the proposed model using the software Stan, which performs posterior inference using Hamiltonian Monte Carlo sampling, as well as a detailed description of a block Gibbs sampling scheme.

13.
Syst Biol ; 73(2): 290-307, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-38262741

ABSTRACT

The processes responsible for the formation of Earth's most conspicuous diversity pattern, the latitudinal diversity gradient (LDG), remain unexplored for many clades in the Tree of Life. Here, we present a densely sampled and dated molecular phylogeny for the most speciose clade of damselflies worldwide (Odonata: Coenagrionoidea) and investigate the role of time, macroevolutionary processes, and biome-shift dynamics in shaping the LDG in this ancient insect superfamily. We used process-based biogeographic models to jointly infer ancestral ranges and speciation times and to characterize within-biome dispersal and biome-shift dynamics across the cosmopolitan distribution of Coenagrionoidea. We also investigated temporal and biome-dependent variation in diversification rates. Our results uncover a tropical origin of pond damselflies and featherlegs ~105 Ma, while highlighting the uncertainty of ancestral ranges within the tropics in deep time. Even though diversification rates have declined since the origin of this clade, global climate change and biome-shifts have slowly increased diversity in warm- and cold-temperate areas, where lineage turnover rates have been relatively higher. This study underscores the importance of biogeographic origin and time to diversify as important drivers of the LDG in pond damselflies and their relatives, while diversification dynamics have instead resulted in the formation of ephemeral species in temperate regions. Biome-shifts, although limited by tropical niche conservatism, have been the main factor reducing the steepness of the LDG in the last 30 Myr. With ongoing climate change and increasing northward range expansions of many damselfly taxa, the LDG may become less pronounced. Our results support recent calls to unify biogeographic and macroevolutionary approaches to improve our understanding of how latitudinal diversity gradients are formed and why they vary across time and among taxa.


Subject(s)
Odonata; Phylogeny; Animals; Odonata/classification; Odonata/genetics; Tropical Climate; Animal Distribution; Biodiversity; Phylogeography; Genetic Speciation
14.
Stat Med ; 43(6): 1135-1152, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38197220

ABSTRACT

The prevalence of chronic non-communicable diseases such as obesity has noticeably increased in the last decade. The study of these diseases in early life is of paramount importance in determining their course in adult life and in supporting clinical interventions. Recently, attention has been drawn to approaches that study the alteration of metabolic pathways in obese children. In this work, we propose a novel joint modeling approach for the analysis of growth biomarkers and metabolite associations, to unveil metabolic pathways related to childhood obesity. Within a Bayesian framework, we flexibly model the temporal evolution of growth trajectories and metabolic associations through the specification of a joint nonparametric random effect distribution, with the main goal of clustering subjects, thus identifying risk sub-groups. Growth profiles as well as patterns of metabolic associations determine the clustering structure. Inclusion of risk factors is straightforward through the specification of a regression term. We demonstrate the proposed approach on data from the Growing Up in Singapore Towards healthy Outcomes cohort study, based in Singapore. Posterior inference is obtained via a tailored MCMC algorithm, involving a nonparametric prior with mixed support. Our analysis has identified potential key pathways in obese children that allow for the exploration of possible molecular mechanisms associated with childhood obesity.


Subject(s)
Pediatric Obesity; Adult; Humans; Child; Pediatric Obesity/epidemiology; Cohort Studies; Bayes Theorem; Risk Factors; Biomarkers
15.
Med Image Anal ; 92: 103037, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38056163

ABSTRACT

The preterm phenotype results from the interplay of multiple disorders affecting the brain and cognitive outcomes. Accurately characterising these interactions can reveal prematurity markers. Bayesian Networks (BNs) are powerful tools to disentangle these relationships, as they inherently measure associations between variables while mitigating confounding factors. We present Modified PC-HC (MPC-HC), a Bayesian Network (BN) structural learning algorithm. MPC-HC employs statistical testing and search-and-score techniques to explore equivalent classes. We employ MPC-HC to estimate BNs for extremely preterm (EP) young adults and full-term controls. Using MRI measurements and cognitive performance markers, we investigate predictive relationships and mutual influences through predictions and sensitivity analysis. We assess the confidence in the estimated BN structures using bootstrapping. Furthermore, MPC-HC's validation involves assessing its ability to recover benchmark BN structures. MPC-HC achieves an average prediction accuracy of 72.5% compared to 62.5% of PC, 64.5% of MMHC, and 71.5% of HC, while it outperforms PC, MMHC, and HC algorithms in reconstructing the true structure of benchmark BNs. The sensitivity analysis shows that MRI measurements mainly affect EP cognitive scores. Our work has two key contributions: first, the introduction and validation of a new BN structure learning method. Second, demonstrating the potential of BNs in modelling variable relationships, predicting variables of interest, modelling uncertainty, and evaluating how variables impact each other. Finally, we demonstrate this by characterising complex phenotypes, such as preterm birth, and discovering results consistent with literature findings.


Subject(s)
Infant, Extremely Premature; Premature Birth; Infant, Newborn; Female; Young Adult; Humans; Bayes Theorem; Algorithms; Brain/diagnostic imaging
16.
Comput Struct Biotechnol J ; 24: 12-22, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38144574

ABSTRACT

Machine learning models are increasingly used in the medical domain to study the association between risk factors and diseases to support practitioners in understanding health outcomes. In this paper, we showcase the use of machine-learned staged tree models for investigating complex asymmetric dependence structures in health data. Staged trees are a specific class of generative, probabilistic graphical models that formally model asymmetric conditional independence and non-regular sample spaces. An investigation of the risk factors in invasive fungal infections demonstrates the insights staged trees provide to support medical decision-making.

17.
Transl Pediatr ; 12(11): 2074-2089, 2023 Nov 28.
Article in English | MEDLINE | ID: mdl-38130578

ABSTRACT

Background: Recent research has demonstrated that machine learning (ML) has the potential to improve several aspects of medical application for critical illness, including sepsis. This scoping review aims to evaluate the feasibility of probabilistic graphical model (PGM) methods in pediatric sepsis application and describe the use of pediatric sepsis definition in these studies. Methods: Literature searches were conducted in PubMed, Scopus, Cumulative Index to Nursing and Allied Health Literature (CINAHL+), and Web of Science from 2000-2023. Keywords included "pediatric", "neonates", "infants", "machine learning", "probabilistic graphical model", and "sepsis". Results: A total of 3,244 studies were screened, and 72 were included in this scoping review. Sepsis was defined using positive microbiology cultures in 19 studies (26.4%), followed by the 2005 international pediatric sepsis consensus definition in 11 studies (15.3%), and the Sepsis-3 definition in seven studies (9.7%). Other sepsis definitions included: bacterial infection, the international classification of diseases, clinicians' assessment, and antibiotic administration time. Among the most common ML approaches used were logistic regression (n=27), random forest (n=24), and neural network (n=18). PGMs were used in 13 studies (18.1%), including Bayesian classifiers (n=10) and the Markov model (n=3). When applied to the same dataset, PGMs show relatively inferior performance to other ML models in most cases. Other aspects of explainability and transparency were not examined in these studies. Conclusions: Current studies suggest that the performance of probabilistic graphical models is relatively inferior to other ML methods. However, their explainability and transparency advantages make them a potentially viable method for several pediatric sepsis studies and applications.

18.
N Engl J Stat Data Sci ; 1(2): 283-295, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37817840

ABSTRACT

Graphical models have witnessed significant growth and usage in spatial data science for modeling data referenced over a massive number of spatial-temporal coordinates. Much of this literature has focused on a single or relatively few spatially dependent outcomes. Recent attention has focused upon addressing modeling and inference for substantially large number of outcomes. While spatial factor models and multivariate basis expansions occupy a prominent place in this domain, this article elucidates a recent approach, graphical Gaussian Processes, that exploits the notion of conditional independence among a very large number of spatial processes to build scalable graphical models for fully model-based Bayesian analysis of multivariate spatial data.

19.
Ann Appl Stat ; 17(3): 1958-1983, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37830084

ABSTRACT

Recent advances in biological research have seen the emergence of high-throughput technologies with numerous applications that allow the study of biological mechanisms at an unprecedented depth and scale. A large amount of genomic data is now distributed through consortia like The Cancer Genome Atlas (TCGA), where specific types of biological information on specific types of tissue or cell are available. In cancer research, the challenge is now to perform integrative analyses of high-dimensional multi-omic data with the goal of better understanding genomic processes that correlate with cancer outcomes, e.g., elucidating gene networks that discriminate specific cancer subgroups (cancer sub-typing) or discovering gene networks that overlap across different cancer types (pan-cancer studies). In this paper, we propose a novel mixed graphical model approach to analyze multi-omic data of different types (continuous, discrete and count) and perform model selection by extending the Birth-Death MCMC (BDMCMC) algorithm initially proposed by Stephens (2000) and later developed by Mohammadi and Wit (2015). We compare the performance of our method to the LASSO method and the standard BDMCMC method using simulations and find that our method is superior in terms of both computational efficiency and the accuracy of the model selection results. Finally, an application to the TCGA breast cancer data shows that integrating genomic information at different levels (mutation and expression data) leads to better subtyping of breast cancers.

20.
Article in English | MEDLINE | ID: mdl-37396752

ABSTRACT

A mixture-model of beta distributions framework is introduced to identify significant correlations among P features when P is large. The method relies on theorems in convex geometry, which are used to show how to control the error rate of edge detection in graphical models. The proposed 'betaMix' method does not require any assumptions about the network structure, nor does it assume that the network is sparse. The results hold for a wide class of data-generating distributions that include light-tailed and heavy-tailed spherically symmetric distributions. The results are robust for sufficiently large sample sizes and hold for non-elliptically-symmetric distributions.
