1.
Mol Cell Proteomics ; 23(6): 100785, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38750696

ABSTRACT

The molecular mechanisms that drive the onset and development of osteoarthritis (OA) remain largely unknown. In this exploratory study, we used a proteomic platform (SOMAscan assay) to measure the relative abundance of more than 6000 proteins in synovial fluid (SF) from knees of human donors with healthy or mildly degenerated tissues, and knees with late-stage OA from patients undergoing knee replacement surgery. Using a linear mixed effects model, we estimated the differential abundance of 6251 proteins between the three groups. We found 583 proteins upregulated in the late-stage OA, including MMP1, collagenase 3 and interleukin-6. Further, we selected 760 proteins (800 aptamers) based on absolute fold changes between the healthy and mild degeneration groups. To those, we applied Gaussian Graphical Models (GGMs) to analyze the conditional dependence of proteins and to identify key proteins and subnetworks involved in early OA pathogenesis. After regularization and stability selection, we identified 102 proteins involved in GGM networks. Notably, network complexity was lost in the protein graph for mild degeneration when compared to controls, suggesting a disruption in the regular protein interplay. Furthermore, among our main findings were several downregulated (in mild degeneration versus healthy) proteins with unique interactions in the healthy group, one of which, SLCO5A1, has not previously been associated with OA. Our results suggest that this protein is important for healthy joint function. Further, our data suggests that SF proteomics, combined with GGMs, can reveal novel insights into the molecular pathogenesis and identification of biomarker candidates for early-stage OA.


Subjects
Protein Interaction Maps; Proteomics; Synovial Fluid; Humans; Synovial Fluid/metabolism; Proteomics/methods; Female; Male; Aged; Middle Aged; Osteoarthritis, Knee/metabolism; Osteoarthritis, Knee/pathology; Osteoarthritis/metabolism; Osteoarthritis/pathology; Interleukin-6/metabolism; Proteome/metabolism; Matrix Metalloproteinase 1/metabolism
2.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-36920069

ABSTRACT

The Gaussian graphical model is a strong tool for identifying interactions from metabolomics data based on conditional correlation. However, data may be collected from different stages or subgroups of subjects with heterogeneity or hierarchical structure. Several strategies for integrating graphical models across multi-group data have been proposed, and choosing among them for metabolomics data analysis is challenging. This study aimed to evaluate the performance of several integrative graphical models for multi-group data and to support the choice of strategy for data with similar characteristics. We compared the performance of seven methods in estimating graph structures through a simulation study. We also applied all the methods to breast cancer metabolomics data grouped by stage to illustrate a real-data application. The method of Shaddox et al. achieved the highest average area under the receiver operating characteristic curve and area under the precision-recall curve across most scenarios, and it was the only approach with all indicators ranked at the top. Nevertheless, it also took the most time in all settings. Stochastic search structure learning tends to produce estimates that favor the precision of identified edges, while BEAM, the hierarchical Bayesian approach and birth-death Markov chain Monte Carlo may identify more potential edges. In the real metabolomics data analysis from three stages of breast cancer patients, results were in line with those of the simulation study.


Subjects
Breast Neoplasms; Metabolomics; Humans; Female; Bayes Theorem; Metabolomics/methods; Computer Simulation
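
The abstract above relies on conditional (partial) correlation as the quantity a Gaussian graphical model encodes. As a minimal sketch, not taken from the paper, the partial correlation between variables i and j given all others can be read off the precision matrix Θ (the inverse covariance) as −Θ_ij / sqrt(Θ_ii Θ_jj). The three-variable correlation matrix below is a hypothetical toy chosen so that A and C are correlated only through B; all function and variable names are illustrative assumptions.

```python
import math

def invert_3x3(m):
    """Invert a 3x3 matrix with the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]

def partial_correlations(cov):
    """Partial correlation of each pair given the third variable."""
    prec = invert_3x3(cov)  # precision matrix
    n = len(prec)
    return [
        [
            1.0 if i == j
            else -prec[i][j] / math.sqrt(prec[i][i] * prec[j][j])
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical correlation matrix for three metabolites A, B, C.
# corr(A, C) = corr(A, B) * corr(B, C), so A and C are linked only via B.
corr = [
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.5],
    [0.3, 0.5, 1.0],
]
pcor = partial_correlations(corr)
for row in pcor:
    print([round(v, 3) for v in row])
```

In this toy case the A-C marginal correlation is nonzero while the A-C partial correlation vanishes, which is why a graphical model would draw no edge between A and C. The methods compared in the abstract additionally regularize or place priors on the precision matrix and share information across groups, which this sketch does not attempt.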
3.
Syst Biol ; 73(2): 290-307, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-38262741

ABSTRACT

The processes responsible for the formation of Earth's most conspicuous diversity pattern, the latitudinal diversity gradient (LDG), remain unexplored for many clades in the Tree of Life. Here, we present a densely sampled and dated molecular phylogeny for the most speciose clade of damselflies worldwide (Odonata: Coenagrionoidea) and investigate the role of time, macroevolutionary processes, and biome-shift dynamics in shaping the LDG in this ancient insect superfamily. We used process-based biogeographic models to jointly infer ancestral ranges and speciation times and to characterize within-biome dispersal and biome-shift dynamics across the cosmopolitan distribution of Coenagrionoidea. We also investigated temporal and biome-dependent variation in diversification rates. Our results uncover a tropical origin of pond damselflies and featherlegs ~105 Ma, while highlighting the uncertainty of ancestral ranges within the tropics in deep time. Even though diversification rates have declined since the origin of this clade, global climate change and biome-shifts have slowly increased diversity in warm- and cold-temperate areas, where lineage turnover rates have been relatively higher. This study underscores the importance of biogeographic origin and time to diversify as important drivers of the LDG in pond damselflies and their relatives, while diversification dynamics have instead resulted in the formation of ephemeral species in temperate regions. Biome-shifts, although limited by tropical niche conservatism, have been the main factor reducing the steepness of the LDG in the last 30 Myr. With ongoing climate change and increasing northward range expansions of many damselfly taxa, the LDG may become less pronounced. Our results support recent calls to unify biogeographic and macroevolutionary approaches to improve our understanding of how latitudinal diversity gradients are formed and why they vary across time and among taxa.


Subjects
Odonata; Phylogeny; Animals; Odonata/classification; Odonata/genetics; Tropical Climate; Animal Distribution; Biodiversity; Phylogeography; Genetic Speciation
4.
BMC Bioinformatics ; 25(1): 209, 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38867193

ABSTRACT

BACKGROUND: Single-cell RNA sequencing (sc-RNASeq) data illuminate transcriptomic heterogeneity but also possess a high level of noise, abundant missing entries and sometimes inadequate or no cell type annotations at all. Bulk-level gene expression data lack direct information of cell population composition but are more robust and complete and often better annotated. We propose a modeling framework to integrate bulk-level and single-cell RNASeq data to address the deficiencies and leverage the mutual strengths of each type of data and enable a more comprehensive inference of their transcriptomic heterogeneity. Contrary to the standard approaches of factorizing the bulk-level data with one algorithm and (for some methods) treating single-cell RNASeq data as references to decompose bulk-level data, we employed multiple deconvolution algorithms to factorize the bulk-level data, constructed the probabilistic graphical models of cell-level gene expressions from the decomposition outcomes, and compared the log-likelihood scores of these models in single-cell data. We term this framework backward deconvolution as inference operates from coarse-grained bulk-level data to fine-grained single-cell data. As the abundant missing entries in sc-RNASeq data have a significant effect on log-likelihood scores, we also developed a criterion for inclusion or exclusion of zero entries in log-likelihood score computation. RESULTS: We selected nine deconvolution algorithms and validated backward deconvolution in five datasets. In the in-silico mixtures of mouse sc-RNASeq data, the log-likelihood scores of the deconvolution algorithms were strongly anticorrelated with their errors of mixture coefficients and cell type specific gene expression signatures. In the true bulk-level mouse data, the sample mixture coefficients were unknown but the log-likelihood scores were strongly correlated with accuracy rates of inferred cell types. 
In the data of autism spectrum disorder (ASD) and normal controls, we found that ASD brains possessed higher fractions of astrocytes and lower fractions of NRGN-expressing neurons than normal controls. In datasets of breast cancer and low-grade gliomas (LGG), we compared the log-likelihood scores of three simple hypotheses about the gene expression patterns of the cell types underlying the tumor subtypes. The model that tumors of each subtype were dominated by one cell type persistently outperformed an alternative model that each cell type had elevated expression in one gene group and tumors were mixtures of those cell types. Superiority of the former model is also supported by comparing the real breast cancer sc-RNASeq clusters with those generated by simulated sc-RNASeq data. CONCLUSIONS: The results indicate that backward deconvolution serves as a sensible model selection tool for deconvolution algorithms and facilitates discerning hypotheses about cell type compositions underlying heterogeneous specimens such as tumors.


Subjects
Algorithms; Sequence Analysis, RNA; Single-Cell Analysis; Transcriptome; Single-Cell Analysis/methods; Sequence Analysis, RNA/methods; Transcriptome/genetics; Humans; Gene Expression Profiling/methods; Animals; Mice; Single-Cell Gene Expression Analysis
5.
Stat Med ; 43(21): 4131-4147, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39007408

ABSTRACT

In this work, we propose methods to examine how the complex interrelationships between clinical symptoms and, separately, brain imaging biomarkers change over time leading up to the diagnosis of a disease in subjects with a known genetic near-certainty of disease. We propose a time-dependent undirected graphical model that ensures temporal and structural smoothness across time-specific networks to examine the trajectories of interactions between markers aligned at the time of disease onset. Specifically, we anchor subjects relative to the time of disease diagnosis (anchoring time) as in a revival process, and we estimate networks at each time point of interest relative to the anchoring time. To use all available data, we apply kernel weights to borrow information across observations that are close to the time of interest. Adaptive lasso weights are introduced to encourage temporal smoothness in edge strength, while a novel elastic fused-$l_0$ penalty removes spurious edges and encourages temporal smoothness in network structure. Our approach can handle practical complications such as unbalanced visit times. We conduct simulation studies to compare our approach with existing methods. We then apply our method to data from PREDICT-HD, a large prospective observational study of pre-manifest Huntington's disease (HD) patients, to identify symptom and imaging network changes that precede clinical diagnosis of HD.


Subjects
Computer Simulation; Huntington Disease; Models, Statistical; Neuroimaging; Humans; Neuroimaging/methods; Huntington Disease/diagnostic imaging; Time Factors; Prospective Studies; Brain/diagnostic imaging; Biomarkers
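
The kernel-weighting step mentioned in the abstract above can be illustrated with a small sketch. It assumes a Gaussian kernel and a fixed bandwidth, neither of which the abstract specifies, and it stops at a weighted covariance between two markers; the penalized network estimation itself is not shown. Visit times, marker values, and function names are hypothetical.

```python
import math

def gaussian_kernel_weights(visit_times, target_time, bandwidth):
    """Normalized weights that favor visits close to target_time."""
    raw = [
        math.exp(-0.5 * ((t - target_time) / bandwidth) ** 2)
        for t in visit_times
    ]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_covariance(x, y, weights):
    """Covariance of x and y under normalized observation weights."""
    mean_x = sum(w * a for w, a in zip(weights, x))
    mean_y = sum(w * b for w, b in zip(weights, y))
    return sum(
        w * (a - mean_x) * (b - mean_y)
        for w, a, b in zip(weights, x, y)
    )

# Hypothetical visit times in years relative to diagnosis (time 0).
times = [-4.0, -2.5, -1.0, -0.5, 0.0]
# Hypothetical values of two markers at those visits.
marker_a = [1.0, 2.0, 3.0, 4.0, 5.0]
marker_b = [2.0, 4.0, 6.0, 8.0, 10.0]

# Estimate the association one year before diagnosis.
weights = gaussian_kernel_weights(times, target_time=-1.0, bandwidth=1.0)
cov_ab = weighted_covariance(marker_a, marker_b, weights)
print([round(w, 3) for w in weights], round(cov_ab, 3))
```

Unbalanced visit times need no special handling here, because each observation simply contributes according to its distance from the time of interest.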
6.
Stat Med ; 43(6): 1135-1152, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38197220

ABSTRACT

The prevalence of chronic non-communicable diseases such as obesity has noticeably increased in the last decade. The study of these diseases in early life is of paramount importance in determining their course in adult life and in supporting clinical interventions. Recently, attention has been drawn to approaches that study the alteration of metabolic pathways in obese children. In this work, we propose a novel joint modeling approach for the analysis of growth biomarkers and metabolite associations, to unveil metabolic pathways related to childhood obesity. Within a Bayesian framework, we flexibly model the temporal evolution of growth trajectories and metabolic associations through the specification of a joint nonparametric random effect distribution, with the main goal of clustering subjects, thus identifying risk sub-groups. Growth profiles as well as patterns of metabolic associations determine the clustering structure. Inclusion of risk factors is straightforward through the specification of a regression term. We demonstrate the proposed approach on data from the Growing Up in Singapore Towards healthy Outcomes cohort study, based in Singapore. Posterior inference is obtained via a tailored MCMC algorithm, involving a nonparametric prior with mixed support. Our analysis has identified potential key pathways in obese children that allow for the exploration of possible molecular mechanisms associated with childhood obesity.


Subjects
Pediatric Obesity; Adult; Humans; Child; Pediatric Obesity/epidemiology; Cohort Studies; Bayes Theorem; Risk Factors; Biomarkers
7.
BMC Med Res Methodol ; 24(1): 136, 2024 Jun 22.
Article in English | MEDLINE | ID: mdl-38909216

ABSTRACT

BACKGROUND: Generating synthetic patient data is crucial for medical research, but common approaches build up on black-box models which do not allow for expert verification or intervention. We propose a highly available method which enables synthetic data generation from real patient records in a privacy preserving and compliant fashion, is interpretable and allows for expert intervention. METHODS: Our approach ties together two established tools in medical informatics, namely OMOP as a data standard for electronic health records and Synthea as a data synthetization method. For this study, data pipelines were built which extract data from OMOP, convert them into time series format, learn temporal rules by 2 statistical algorithms (Markov chain, TARM) and 3 algorithms of causal discovery (DYNOTEARS, J-PCMCI+, LiNGAM) and map the outputs into Synthea graphs. The graphs are evaluated quantitatively by their individual and relative complexity and qualitatively by medical experts. RESULTS: The algorithms were found to learn qualitatively and quantitatively different graph representations. Whereas the Markov chain results in extremely large graphs, TARM, DYNOTEARS, and J-PCMCI+ were found to reduce the data dimension during learning. The MultiGroupDirect LiNGAM algorithm was found to not be applicable to the problem statement at hand. CONCLUSION: Only TARM and DYNOTEARS are practical algorithms for real-world data in this use case. As causal discovery is a method to debias purely statistical relationships, the gradient-based causal discovery algorithm DYNOTEARS was found to be most suitable.


Subjects
Algorithms; Electronic Health Records; Humans; Electronic Health Records/statistics & numerical data; Electronic Health Records/standards; Markov Chains; Medical Informatics/methods; Medical Informatics/statistics & numerical data
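
Of the algorithms listed in the abstract above, the Markov chain is the simplest to illustrate. The sketch below estimates first-order transition probabilities from event sequences by counting consecutive pairs. It is an assumption-laden toy: the event names and sequences are invented, and the extraction from OMOP and the mapping into Synthea graphs described in the abstract are not shown.

```python
from collections import Counter, defaultdict

def fit_transition_probs(sequences):
    """Estimate first-order transition probabilities from sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each observed (current state, next state) pair.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {
            nxt: c / sum(successors.values())
            for nxt, c in successors.items()
        }
        for state, successors in counts.items()
    }

# Hypothetical patient event sequences, invented for illustration.
visits = [
    ["checkup", "diagnosis", "treatment"],
    ["checkup", "diagnosis", "checkup"],
    ["checkup", "checkup", "diagnosis", "treatment"],
]
probs = fit_transition_probs(visits)
for state, successors in probs.items():
    print(state, successors)
```

Because every observed transition becomes an edge, a graph built this way grows quickly with the number of distinct events, which is consistent with the abstract's remark that the Markov chain results in extremely large graphs.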
8.
Age Ageing ; 53(Suppl 2): ii20-ii29, 2024 05 11.
Article in English | MEDLINE | ID: mdl-38745494

ABSTRACT

BACKGROUND: Heterogeneity in ageing rates drives the need for research into lifestyle secrets of successful agers. Biological age, predicted by epigenetic clocks, has been shown to be a more reliable measure of ageing than chronological age. Dietary habits are known to affect the ageing process. However, much remains to be learnt about specific dietary habits that may directly affect the biological process of ageing. OBJECTIVE: To identify food groups that are directly related to biological ageing, using Copula Graphical Models. METHODS: We performed a preregistered analysis of 3,990 postmenopausal women from the Women's Health Initiative, based in North America. Biological age acceleration was calculated by the epigenetic clock PhenoAge using whole-blood DNA methylation. Copula Graphical Modelling, a powerful data-driven exploratory tool, was used to examine relations between food groups and biological ageing whilst adjusting for an extensive amount of confounders. Two food group-age acceleration networks were established: one based on the MyPyramid food grouping system and another based on item-level food group data. RESULTS: Intake of eggs, organ meat, sausages, cheese, legumes, starchy vegetables, added sugar and lunch meat was associated with biological age acceleration, whereas intake of peaches/nectarines/plums, poultry, nuts, discretionary oil and solid fat was associated with decelerated ageing. CONCLUSION: We identified several associations between specific food groups and biological ageing. These findings pave the way for subsequent studies to ascertain causality and magnitude of these relationships, thereby improving the understanding of biological mechanisms underlying the interplay between food groups and biological ageing.


Subjects
Aging; DNA Methylation; Feeding Behavior; Humans; Female; Aged; Middle Aged; Age Factors; Epigenesis, Genetic; Diet/statistics & numerical data; Postmenopause
9.
Bioessays ; 44(4): e2100255, 2022 04.
Article in English | MEDLINE | ID: mdl-35212408

ABSTRACT

Bayesian learning theory and evolutionary theory both formalize adaptive competition dynamics in possibly high-dimensional, varying, and noisy environments. What do they have in common and how do they differ? In this paper, we discuss structural and dynamical analogies and their limits, both at a computational and an algorithmic-mechanical level. We point out mathematical equivalences between their basic dynamical equations, generalizing the isomorphism between Bayesian update and replicator dynamics. We discuss how these mechanisms provide analogous answers to the challenge of adapting to stochastically changing environments at multiple timescales. We elucidate an algorithmic equivalence between a sampling approximation, particle filters, and the Wright-Fisher model of population genetics. These equivalences suggest that the frequency distribution of types in replicator populations optimally encodes regularities of a stochastic environment to predict future environments, without invoking the known mechanisms of multilevel selection and evolvability. A unified view of the theories of learning and evolution comes in sight.


Subjects
Biological Evolution; Genetics, Population; Bayes Theorem; Learning
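
The isomorphism between the Bayesian update and replicator dynamics that the abstract above generalizes can be shown in its basic discrete form: hypotheses play the role of types, prior probabilities the role of type frequencies, and likelihoods the role of fitnesses. The following is a minimal sketch with made-up numbers, not code from the paper.

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses: prior times likelihood, renormalized."""
    evidence = sum(p * l for p, l in zip(prior, likelihood))
    return [p * l / evidence for p, l in zip(prior, likelihood)]

def replicator_step(freqs, fitness):
    """Discrete replicator update: frequency times relative fitness."""
    mean_fitness = sum(x * f for x, f in zip(freqs, fitness))
    return [x * f / mean_fitness for x, f in zip(freqs, fitness)]

# Hypothetical numbers: three hypotheses (or types).
prior = [0.5, 0.3, 0.2]
likelihood = [0.1, 0.4, 0.8]  # read as fitness in the evolutionary view

posterior = bayes_update(prior, likelihood)
next_generation = replicator_step(prior, likelihood)
print([round(p, 4) for p in posterior])
print(posterior == next_generation)
```

The two functions differ only in naming: the model evidence in the Bayesian reading is the mean fitness in the evolutionary reading.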
10.
Behav Res Methods ; 56(7): 8091-8104, 2024 Oct.
Article in English | MEDLINE | ID: mdl-39080123

ABSTRACT

We develop a Bayesian method for aggregating partial ranking data using the Thurstone model. Our implementation is a JAGS graphical model that allows each individual to rank any subset of items, and provides an inference about the latent true ranking of the items and the relative expertise of each individual. We demonstrate the method by analyzing data from new experiments that collected partial ranking data. In one experiment, participants were assigned subsets of items to rank; in the other experiment, participants could choose how many and which items they ranked. We show that our method works effectively for both sorts of partial ranking in applications to US city populations and the chronology of US presidents. We discuss the potential of the method for studying the wisdom of the crowd and other research problems that require aggregating incomplete or partial rankings.


Subjects
Bayes Theorem; Humans; Models, Statistical
11.
Medicina (Kaunas) ; 60(3)2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38541163

ABSTRACT

Background and Objectives: This paper aims to assess the role of laser therapy in periodontitis through an innovative approach involving computational prediction and advanced modeling performed through network analysis (Gaussian graphical models, GGMs) and structural equation modeling (SEM). Materials and Methods: Forty patients, exhibiting periodontal pockets with a minimum depth of 5 mm, were randomly divided into two groups: a control group and a laser group. Four specific indicators were measured for each tooth, namely periodontal pocket depth (PPD), clinical attachment level (CAL), bleeding on probing (BOP), and plaque index (PI), and the mean of six measured values was recorded at five time markers (baseline, 6 months, 1 year, 2 years, and 4 years). The assessment algorithm included enrollment, measurements, and differential non-surgical periodontal treatment, according to the group allocation. Scaling, root planing, and chlorhexidine 1% were conducted for the control group, and scaling, root planing and erbium, chromium:yttrium-scandium-gallium-garnet (Er,Cr:YSGG) laser therapy were conducted for the laser group. Results: The main results highlight that the addition of laser treatment to scaling and root planing led to notable clinical improvements, decreasing the PPD values, reducing the BOP scores, and increasing the CAL. Conclusions: Notable relationships between the specific indicators considered were highlighted by both the GGMs and by SEM, thus confirming their suitability as proxies for the success of periodontal treatment.


Subjects
Laser Therapy; Low-Level Light Therapy; Periodontitis; Humans; Latent Class Analysis; Periodontitis/radiotherapy; Periodontitis/surgery; Laser Therapy/methods; Root Planing/methods; Follow-Up Studies
12.
Biostatistics ; 23(3): 825-843, 2022 07 18.
Article in English | MEDLINE | ID: mdl-33527998

ABSTRACT

Functional magnetic resonance imaging (fMRI) data have become increasingly available and are useful for describing functional connectivity (FC), the relatedness of neuronal activity in regions of the brain. This FC of the brain provides insight into certain neurodegenerative diseases and psychiatric disorders, and thus is of clinical importance. To help inform physicians regarding patient diagnoses, unsupervised clustering of subjects based on FC is desired, allowing the data to inform us of groupings of patients based on shared features of connectivity. Since heterogeneity in FC is present even between patients within the same group, it is important to allow subject-level differences in connectivity, while still pooling information across patients within each group to describe group-level FC. To this end, we propose a random covariance clustering model (RCCM) to concurrently cluster subjects based on their FC networks, estimate the unique FC networks of each subject, and to infer shared network features. Although current methods exist for estimating FC or clustering subjects using fMRI data, our novel contribution is to cluster or group subjects based on similar FC of the brain while simultaneously providing group- and subject-level FC network estimates. The competitive performance of RCCM relative to other methods is demonstrated through simulations in various settings, achieving both improved clustering of subjects and estimation of FC networks. Utility of the proposed method is demonstrated with application to a resting-state fMRI data set collected on 43 healthy controls and 61 participants diagnosed with schizophrenia.


Subjects
Magnetic Resonance Imaging; Schizophrenia; Brain/diagnostic imaging; Brain/physiology; Brain Mapping/methods; Cluster Analysis; Humans; Magnetic Resonance Imaging/methods; Schizophrenia/diagnostic imaging
13.
Biostatistics ; 24(1): 161-176, 2022 12 12.
Article in English | MEDLINE | ID: mdl-34520533

ABSTRACT

Single-cell RNA-sequencing (scRNAseq) data contain a high level of noise, especially in the form of zero-inflation, that is, the presence of an excessively large number of zeros. This is largely due to dropout events and amplification biases that occur in the preparation stage of single-cell experiments. Recent scRNAseq experiments have been augmented with unique molecular identifiers (UMI) and External RNA Control Consortium (ERCC) molecules which can be used to account for zero-inflation. However, most of the current methods on graphical models are developed under the assumption of the multivariate Gaussian distribution or its variants, and thus they are not able to adequately account for an excessively large number of zeros in scRNAseq data. In this article, we propose a single-cell latent graphical model (scLGM), a Bayesian hierarchical model for estimating the conditional dependency network among genes using scRNAseq data. Taking advantage of UMI and ERCC data, scLGM explicitly models the two sources of zero-inflation. Our simulation study and real data analysis demonstrate that the proposed approach outperforms several existing methods.


Subjects
RNA; Single-Cell Analysis; Humans; Sequence Analysis, RNA/methods; Bayes Theorem; RNA/genetics; Computer Simulation
14.
BMC Med ; 21(1): 105, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36944999

ABSTRACT

BACKGROUND: When tackling complex public health challenges such as childhood obesity, interventions focused on immediate causes, such as poor diet and physical inactivity, have had limited success, largely because upstream root causes remain unresolved. A priority is to develop new modelling frameworks to infer the causal structure of complex chronic disease networks, allowing disease "on-ramps" to be identified and targeted. METHODS: The system surrounding childhood obesity was modelled as a Bayesian network, using data from The Longitudinal Study of Australian Children. The existence and directions of the dependencies between factors represent possible causal pathways for childhood obesity and were encoded in directed acyclic graphs (DAGs). The posterior distribution of the DAGs was estimated using the Partition Markov chain Monte Carlo. RESULTS: We have implemented structure learning for each dataset at a single time point. For each wave and cohort, socio-economic status was central to the DAGs, implying that socio-economic status drives the system regarding childhood obesity. Furthermore, the causal pathway socio-economic status and/or parental high school levels → parental body mass index (BMI) → child's BMI existed in over 99.99% of posterior DAG samples across all waves and cohorts. For children under the age of 8 years, the most influential proximate causal factors explaining child BMI were birth weight and parents' BMI. After age 8 years, free time activity became an important driver of obesity, while the upstream factors influencing free time activity for boys compared with girls were different. CONCLUSIONS: Childhood obesity is largely a function of socio-economic status, which is manifest through numerous downstream factors. Parental high school levels entangle with socio-economic status, and hence, are on-ramp to childhood obesity. The strong and independent causal relationship between birth weight and childhood BMI suggests a biological link. 
Our study implies that interventions that improve the socio-economic status, including through increasing high school completion rates, may be effective in reducing childhood obesity prevalence.


Subjects
Pediatric Obesity; Male; Female; Child; Humans; Pediatric Obesity/diagnosis; Pediatric Obesity/epidemiology; Longitudinal Studies; Birth Weight; Bayes Theorem; Australia/epidemiology; Body Mass Index
15.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34373890

ABSTRACT

MOTIVATION: Empowered by advanced genomics discovery tools, recent biomedical research has produced a massive amount of genomic data on (post-)transcriptional regulations related to transcription factors, microRNAs, long non-coding RNAs, epigenetic modifications and genetic variations. Computational modeling, as an essential research method, has generated promising testable quantitative models that represent complex interplay among different gene regulatory mechanisms based on these data in many biological systems. However, given the dynamic changes of interactome in chaotic systems such as cancers, and the dramatic growth of heterogeneous data on this topic, such promise has encountered unprecedented challenges in terms of model complexity and scalability. In this study, we introduce a new integrative machine learning approach that can infer multifaceted gene regulations in cancers with a particular focus on microRNA regulation. In addition to new strategies for data integration and graphical model fusion, a supervised deep learning model was integrated to identify conditional microRNA-mRNA interactions across different cancer stages. RESULTS: In a case study of human breast cancer, we have identified distinct gene regulatory networks associated with four progressive stages. The subsequent functional analysis focusing on microRNA-mediated dysregulation across stages has revealed significant changes in major cancer hallmarks, as well as novel pathological signaling and metabolic processes, which shed light on microRNAs' regulatory roles in breast cancer progression. We believe this integrative model can be a robust and effective discovery tool to understand key regulatory characteristics in complex biological systems. AVAILABILITY: http://sbbi-panda.unl.edu/pin/.


Subjects
Breast Neoplasms/genetics; Machine Learning; MicroRNAs/genetics; Breast Neoplasms/pathology; Disease Progression; Female; Gene Regulatory Networks; Humans; Models, Theoretical
16.
Respir Res ; 24(1): 30, 2023 Jan 25.
Article in English | MEDLINE | ID: mdl-36698131

ABSTRACT

BACKGROUND: Chronic obstructive pulmonary disease (COPD) varies significantly in symptomatic and physiologic presentation. Identifying disease subtypes from molecular data, collected from easily accessible blood samples, can help stratify patients and guide disease management and treatment. METHODS: Blood gene expression measured by RNA-sequencing in the COPDGene Study was analyzed using a network perturbation analysis method. Each COPD sample was compared against a learned reference gene network to determine the part that is deregulated. Gene deregulation values were used to cluster the disease samples. RESULTS: The discovery set included 617 former smokers from COPDGene. Four distinct gene network subtypes are identified with significant differences in symptoms, exercise capacity and mortality. These clusters do not necessarily correspond with the levels of lung function impairment and are independently validated in two external cohorts: 769 former smokers from COPDGene and 431 former smokers in the Multi-Ethnic Study of Atherosclerosis (MESA). Additionally, we identify several genes that are significantly deregulated across these subtypes, including DSP and GSTM1, which have been previously associated with COPD through genome-wide association study (GWAS). CONCLUSIONS: The identified subtypes differ in mortality and in their clinical and functional characteristics, underlining the need for multi-dimensional assessment potentially supplemented by selected markers of gene expression. The subtypes were consistent across cohorts and could be used for new patient stratification and disease prognosis.


Subjects
Gene Regulatory Networks; Pulmonary Disease, Chronic Obstructive; Humans; Gene Regulatory Networks/genetics; Smokers; Genome-Wide Association Study/methods; Pulmonary Disease, Chronic Obstructive/diagnosis; Pulmonary Disease, Chronic Obstructive/genetics; Prognosis
17.
Psychol Med ; 53(15): 7385-7394, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37092859

ABSTRACT

BACKGROUND: Depression is associated with metabolic alterations including lipid dysregulation, whereby associations may vary across individual symptoms. Evaluating these associations using a network perspective yields a more complete insight than single outcome-single predictor models. METHODS: We used data from the Netherlands Study of Depression and Anxiety (N = 2498) and leveraged networks capturing associations between 30 depressive symptoms (Inventory of Depressive Symptomatology) and 46 metabolites. Analyses involved 4 steps: creating a network with Mixed Graphical Models; calculating centrality measures; bootstrapping for stability testing; validating central, stable associations by extra covariate-adjustment; and validation using another data wave collected 6 years later. RESULTS: The network yielded 28 symptom-metabolite associations. There were 15 highly-central variables (8 symptoms, 7 metabolites), and 3 stable links involving the symptoms Low energy (fatigue), and Hypersomnia. Specifically, fatigue showed consistent associations with higher mean diameter for VLDL particles and lower estimated degree of (fatty acid) unsaturation. These remained present after adjustment for lifestyle and health-related factors and using another data wave. CONCLUSIONS: The somatic symptoms Fatigue and Hypersomnia and cholesterol and fatty acid measures showed central, stable, and consistent relationships in our network. The present analyses showed how metabolic alterations are more consistently linked to specific symptom profiles.


Subjects
Depression; Disorders of Excessive Somnolence; Humans; Anxiety; Fatigue; Fatty Acids
18.
Stat Med ; 42(13): 2116-2133, 2023 06 15.
Article in English | MEDLINE | ID: mdl-37004994

ABSTRACT

Gaussian graphical models (GGMs) are a popular form of network model in which nodes represent features in multivariate normal data and edges reflect conditional dependencies between these features. GGM estimation is an active area of research. Currently available tools for GGM estimation require investigators to make several choices regarding algorithms, scoring criteria, and tuning parameters. An estimated GGM may be highly sensitive to these choices, and the accuracy of each method can vary based on structural characteristics of the network such as topology, degree distribution, and density. Because these characteristics are a priori unknown, it is not straightforward to establish universal guidelines for choosing a GGM estimation method. We address this problem by introducing SpiderLearner, an ensemble method that constructs a consensus network from multiple estimated GGMs. Given a set of candidate methods, SpiderLearner estimates the optimal convex combination of results from each method using a likelihood-based loss function. K-fold cross-validation is applied in this process, reducing the risk of overfitting. In simulations, SpiderLearner performs better than or comparably to the best candidate methods according to a variety of metrics, including relative Frobenius norm and out-of-sample likelihood. We apply SpiderLearner to publicly available ovarian cancer gene expression data including 2013 participants from 13 diverse studies, demonstrating our tool's potential to identify biomarkers of complex disease. SpiderLearner is implemented as flexible, extensible, open-source code in the R package ensembleGGM at https://github.com/katehoffshutta/ensembleGGM.
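
The idea of a likelihood-weighted convex combination of candidate GGM estimates can be illustrated with a minimal Python sketch. This is not the ensembleGGM implementation and does not reproduce its API. It is a simplified illustration that assumes two fixed candidate precision matrices, a single held-out covariance matrix instead of K folds, and a grid search over the mixing weight. All function and variable names (`log_det_spd`, `gaussian_loglik`, `combine`, `best_weight`, `theta_a`, `theta_b`, `s_holdout`) are hypothetical.

```python
import math


def log_det_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(n))


def gaussian_loglik(theta, s):
    """Gaussian log-likelihood of precision matrix theta given covariance s,
    up to constants and scaling: log det(theta) - trace(s @ theta)."""
    p = len(theta)
    trace = sum(s[i][j] * theta[j][i] for i in range(p) for j in range(p))
    return log_det_spd(theta) - trace


def combine(thetas, weights):
    """Convex combination of precision matrices (weights >= 0, summing to 1)."""
    p = len(thetas[0])
    return [
        [sum(w * t[i][j] for w, t in zip(weights, thetas)) for j in range(p)]
        for i in range(p)
    ]


def best_weight(theta_a, theta_b, s_holdout, steps=100):
    """Grid-search the weight on theta_a that maximizes held-out likelihood."""
    best_w, best_ll = 0.0, -math.inf
    for k in range(steps + 1):
        w = k / steps
        ll = gaussian_loglik(combine([theta_a, theta_b], [w, 1.0 - w]), s_holdout)
        if ll > best_ll:
            best_w, best_ll = w, ll
    return best_w


# Hypothetical candidates: theta_a encodes one edge, theta_b encodes none.
theta_a = [[2.0, -1.0], [-1.0, 2.0]]
theta_b = [[1.0, 0.0], [0.0, 1.0]]
# Held-out covariance chosen as the exact inverse of theta_a.
s_holdout = [[2.0 / 3.0, 1.0 / 3.0], [1.0 / 3.0, 2.0 / 3.0]]

w_a = best_weight(theta_a, theta_b, s_holdout)
print("weight on candidate A:", w_a)
```

Because the held-out covariance here is the exact inverse of `theta_a`, all of the weight goes to that candidate. In a cross-validated setting, the candidates would be re-estimated on each training fold and the weights chosen to maximize the likelihood summed over the held-out folds.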


Assuntos
Algoritmos , Distribuição Normal , Humanos , Funções Verossimilhança , Software , Expressão Gênica , Neoplasias Ovarianas/genética
19.
Article in English | MEDLINE | ID: mdl-37396752

ABSTRACT

A framework based on a mixture model of beta distributions is introduced to identify significant correlations among P features when P is large. The method relies on theorems in convex geometry, which are used to show how to control the error rate of edge detection in graphical models. The proposed 'betaMix' method does not require any assumptions about the network structure, nor does it assume that the network is sparse. The results hold for a wide class of data-generating distributions that include light-tailed and heavy-tailed spherically symmetric distributions. The results are robust for sufficiently large sample sizes and hold for non-elliptically-symmetric distributions.

20.
J Neurosci ; 41(41): 8577-8588, 2021 10 13.
Article in English | MEDLINE | ID: mdl-34413204

ABSTRACT

Neuronal ensembles are groups of neurons with coordinated activity that could represent sensory, motor, or cognitive states. The study of how neuronal ensembles are built, recalled, and involved in the guiding of complex behaviors has been limited by the lack of experimental and analytical tools to reliably identify and manipulate neurons that have the ability to activate entire ensembles. Such pattern completion neurons have also been proposed as key elements of artificial and biological neural networks. Indeed, the relevance of pattern completion neurons is highlighted by growing evidence that targeting them can activate neuronal ensembles and trigger behavior. As a method to reliably detect pattern completion neurons, we use conditional random fields (CRFs), a type of probabilistic graphical model. We apply CRFs to identify pattern completion neurons in ensembles in experiments using in vivo two-photon calcium imaging from primary visual cortex of male mice and confirm the CRF predictions with two-photon optogenetics. To test the broader applicability of CRFs, we also analyze publicly available calcium imaging data (Allen Institute Brain Observatory dataset) and demonstrate that CRFs can reliably identify neurons that predict specific features of visual stimuli. Finally, to explore the scalability of CRFs, we apply them to in silico network simulations and show that CRF-identified pattern completion neurons have increased functional connectivity. These results demonstrate the potential of CRFs to characterize and selectively manipulate neural circuits. SIGNIFICANCE STATEMENT: We describe a graph theory method to identify and optically manipulate neurons with pattern completion capability in mouse cortical circuits. Using calcium imaging and two-photon optogenetics in vivo, we confirm that key neurons identified by this method can recall entire neuronal ensembles. This method could be broadly applied to manipulate neuronal ensemble activity to trigger behavior or for therapeutic applications in brain prostheses.


Subjects
Models, Neurological; Neurons/physiology; Pattern Recognition, Visual/physiology; Probability; Visual Cortex/physiology; Animals; Male; Mice; Mice, Inbred C57BL; Microscopy, Fluorescence, Multiphoton/methods; Neurons/chemistry; Optogenetics/methods; Photic Stimulation/methods; Visual Cortex/chemistry; Visual Cortex/cytology