Results 1 - 20 of 98
1.
Patterns (N Y) ; 5(5): 100986, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38800365

ABSTRACT

Spatially resolved transcriptomics has revolutionized genome-scale transcriptomic profiling by providing high-resolution characterization of transcriptional patterns. Here, we present our spatial transcriptomics analysis framework, MUSTANG (MUlti-sample Spatial Transcriptomics data ANalysis with cross-sample transcriptional similarity Guidance), which is capable of performing multi-sample spatial transcriptomics spot cellular deconvolution by allowing both cross-sample expression-based similarity information sharing as well as spatial correlation in gene expression patterns within samples. Experiments on a semi-synthetic spatial transcriptomics dataset and three real-world spatial transcriptomics datasets demonstrate the effectiveness of MUSTANG in revealing biological insights inherent in the cellular characterization of tissue samples under study.

2.
Front Bioinform ; 4: 1280971, 2024.
Article in English | MEDLINE | ID: mdl-38812660

ABSTRACT

Radiation exposure poses a significant threat to human health. Emerging research indicates that even low-dose radiation, once believed to be safe, may have harmful effects. This perception has spurred a growing interest in investigating the potential risks associated with low-dose radiation exposure across various scenarios. To comprehensively explore the health consequences of low-dose radiation, our study employs a robust statistical framework that examines whether specific groups of genes, belonging to known pathways, exhibit coordinated expression patterns that align with the radiation levels. Notably, our findings reveal the existence of intricate yet consistent signatures that reflect the molecular response to radiation exposure, distinguishing between low-dose and high-dose radiation. Moreover, we leverage a pathway-constrained variational autoencoder to capture the nonlinear interactions within gene expression data. By comparing these two analytical approaches, our study aims to gain valuable insights into the impact of low-dose radiation on gene expression patterns, identify pathways that are differentially affected, and harness the potential of machine learning to uncover hidden activity within biological networks. This comparative analysis contributes to a deeper understanding of the molecular consequences of low-dose radiation exposure.

3.
Cell Host Microbe ; 32(4): 588-605.e9, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38531364

ABSTRACT

Many powerful methods have been employed to elucidate the global transcriptomic, proteomic, or metabolic responses to pathogen-infected host cells. However, the host glycome responses to bacterial infection remain largely unexplored, and hence, our understanding of the molecular mechanisms by which bacterial pathogens manipulate the host glycome to favor infection remains incomplete. Here, we address this gap by performing a systematic analysis of the host glycome during infection by the bacterial pathogen Brucella spp. that cause brucellosis. We discover, surprisingly, that a Brucella effector protein (EP) Rhg1 induces global reprogramming of the host cell N-glycome by interacting with components of the oligosaccharide transferase complex that controls N-linked protein glycosylation, and Rhg1 regulates Brucella replication and tissue colonization in a mouse model of brucellosis, demonstrating that Brucella exploits the EP Rhg1 to reprogram the host N-glycome and promote bacterial intracellular parasitism, thereby providing a paradigm for bacterial control of host cell infection.


Subjects
Brucella, Brucellosis, Animals, Mice, Brucella/physiology, Proteomics, Brucellosis/metabolism, Endoplasmic Reticulum/metabolism
5.
Patterns (N Y) ; 4(11): 100863, 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-38035192

ABSTRACT

Significant acceleration of the future discovery of novel functional materials requires a fundamental shift from the current materials discovery practice, which is heavily dependent on trial-and-error campaigns and high-throughput screening, to one that builds on knowledge-driven advanced informatics techniques enabled by the latest advances in signal processing and machine learning. In this review, we discuss the major research issues that need to be addressed to expedite this transformation along with the salient challenges involved. We especially focus on Bayesian signal processing and machine learning schemes that are uncertainty aware and physics informed for knowledge-driven learning, robust optimization, and efficient objective-driven experimental design.

6.
Patterns (N Y) ; 4(11): 100875, 2023 Nov 10.
Article in English | MEDLINE | ID: mdl-38035191

ABSTRACT

The need for efficient computational screening of molecular candidates that possess desired properties frequently arises in various scientific and engineering problems, including drug discovery and materials design. However, the enormous search space containing the candidates and the substantial computational cost of high-fidelity property prediction models make screening practically challenging. In this work, we propose a general framework for constructing and optimizing a high-throughput virtual screening (HTVS) pipeline that consists of multi-fidelity models. The central idea is to optimally allocate the computational resources to models with varying costs and accuracy to optimize the return on computational investment. Based on both simulated and real-world data, we demonstrate that the proposed optimal HTVS framework can significantly accelerate virtual screening without any degradation in terms of accuracy. Furthermore, it enables an adaptive operational strategy for HTVS, where one can trade accuracy for efficiency.
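
The cascade idea in this abstract can be illustrated with a minimal sketch. It assumes a two-stage pipeline in which a cheap, noisy surrogate filters candidates before an expensive, accurate model evaluates the survivors. The scoring functions, costs, and thresholds below are hypothetical and are not taken from the paper, which optimizes the allocation rather than fixing it by hand.

```python
import random


def cheap_score(x, rng):
    # Hypothetical low-fidelity model: the true property plus sizeable noise.
    return x + rng.gauss(0.0, 0.3)


def expensive_score(x):
    # Hypothetical high-fidelity model: returns the true property exactly.
    return x


def screen(candidates, stage1_threshold, final_threshold, seed=0,
           cheap_cost=1.0, expensive_cost=100.0):
    """Run a two-stage screening cascade and return (hits, total_cost)."""
    rng = random.Random(seed)
    total_cost = 0.0
    survivors = []
    # Stage 1: every candidate pays the cheap cost.
    for x in candidates:
        total_cost += cheap_cost
        if cheap_score(x, rng) >= stage1_threshold:
            survivors.append(x)
    # Stage 2: only survivors pay the expensive cost.
    hits = []
    for x in survivors:
        total_cost += expensive_cost
        if expensive_score(x) >= final_threshold:
            hits.append(x)
    return hits, total_cost


library_rng = random.Random(42)
library = [library_rng.uniform(0.0, 1.0) for _ in range(1000)]
hits, cost = screen(library, stage1_threshold=0.6, final_threshold=0.8)
brute_force_cost = 100.0 * len(library)  # expensive model on every candidate
```

Lowering `stage1_threshold` in this sketch lets more candidates reach the expensive stage, which raises cost but misses fewer true hits; raising it does the opposite.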

8.
J Comput Biol ; 30(7): 751-765, 2023 07.
Article in English | MEDLINE | ID: mdl-36961389

ABSTRACT

TRIMER, Transcription Regulation Integrated with MEtabolic Regulation, is a genome-scale modeling pipeline targeting metabolic engineering applications. Using TRIMER, regulated metabolic reactions can be effectively predicted by integrative modeling of metabolic reactions with a transcription factor-gene regulatory network (TRN), which is modeled through a Bayesian network (BN). In this article, we focus on sensitivity analysis of metabolic flux prediction for uncertainty quantification of BN structures for TRN modeling in TRIMER. We propose a computational strategy to construct the uncertainty class of TRN models based on the inferred regulatory order uncertainty given transcriptomic expression data. With that, we analyze the prediction sensitivity of the TRIMER pipeline for the metabolite yields of interest. The obtained sensitivity analyses can guide optimal experimental design (OED) to help acquire new data that can enhance TRN modeling and achieve specific metabolic engineering objectives, including metabolite yield alterations. We have performed small- and large-scale simulated experiments, demonstrating the effectiveness of our developed sensitivity analysis strategy for BN structure learning to quantify the edge importance in terms of metabolic flux prediction uncertainty reduction and its potential to effectively guide OED.


Subjects
Metabolic Networks and Pathways, Biological Models, Bayes Theorem, Metabolic Networks and Pathways/genetics, Gene Regulatory Networks, Metabolic Flux Analysis
9.
Sci Stud Read ; 27(1): 5-20, 2023.
Article in English | MEDLINE | ID: mdl-36843656

ABSTRACT

Purpose: Researchers have developed a constellation model of decoding-related reading disabilities (RD) to improve RD risk determination. The model's hallmark is its inclusion of various RD indicators to determine RD risk. Classification methods such as logistic regression (LR) might be one way to determine RD risk within the constellation model framework. However, some issues may arise when applying the logistic regression method (e.g., multicollinearity). Machine learning techniques, such as random forest (RF), might assist in overcoming these limitations. They can better deal with complex data relations than traditional approaches. We examined the prediction performance of RF and compared it against LR to determine RD risk. Method: The sample comprised 12,171 students from Florida whose third-grade RD risk was operationalized using the constellation model with one, two, three, or four RD indicators in first and second grade. Results: Results revealed that LR and RF performed on par in accurately predicting RD risk. Regarding predictor importance, reading fluency was consistently the most critical predictor for RD risk. Conclusion: Findings suggest that RF does not outperform LR in RD prediction accuracy in models with multiple linearly related predictors. Findings also highlight the importance of including reading fluency in early identification batteries for later RD determination.

10.
Methods Mol Biol ; 2586: 147-162, 2023.
Article in English | MEDLINE | ID: mdl-36705903

ABSTRACT

TOPAS (TOPological network-based Alignment of Structural RNAs) is a network-based alignment algorithm that predicts structurally sound pairwise alignment of RNAs. In order to take advantage of recent advances in comparative network analysis for efficient structurally sound RNA alignment, TOPAS constructs topological network representations for RNAs, which consist of sequential edges connecting nucleotide bases as well as structural edges reflecting the underlying folding structure. Structural edges are weighted by the estimated base-pairing probabilities. Next, the constructed networks are aligned using probabilistic network alignment techniques, which yield a structurally sound RNA alignment that considers both the sequence similarity and the structural similarity between the given RNAs. Compared to traditional Sankoff-style algorithms, this network-based alignment scheme leads to a significant reduction in the overall computational cost while yielding favorable alignment results. Another important benefit is its capability to handle arbitrary folding structures, which can potentially lead to more accurate alignment for RNAs with pseudoknots.


Subjects
Algorithms, RNA, Base Sequence, Nucleic Acid Conformation, Sequence Alignment, RNA Sequence Analysis/methods, RNA/genetics, RNA/chemistry
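
The network representation described in the TOPAS abstract above can be sketched roughly as follows. This is an illustration only: the function name `build_rna_network`, the unit weight on sequential edges, and the hand-written pairing probabilities are assumptions made here, not details from the paper, and in practice the probabilities would come from a folding tool.

```python
def build_rna_network(sequence, pair_probs, min_prob=0.0):
    """Return a dict {(i, j): weight} of undirected edges with i < j."""
    edges = {}
    # Sequential edges connect neighbouring bases along the backbone.
    # The unit weight is an assumption made for this sketch.
    for i in range(len(sequence) - 1):
        edges[(i, i + 1)] = 1.0
    # Structural edges are weighted by the estimated base-pairing probability.
    for (i, j), p in pair_probs.items():
        a, b = min(i, j), max(i, j)
        if p > min_prob and b - a > 1:
            edges[(a, b)] = p
    return edges


seq = "GGGAAACCC"
# Hypothetical pairing probabilities for a small hairpin.
probs = {(0, 8): 0.9, (1, 7): 0.8, (2, 6): 0.6}
net = build_rna_network(seq, probs)
```

Because structural edges are stored as arbitrary index pairs, crossing pairs such as those in a pseudoknot need no special handling in this representation.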
11.
Patterns (N Y) ; 3(3): 100428, 2022 Mar 11.
Article in English | MEDLINE | ID: mdl-35510184

ABSTRACT

Classification has been a major task for building intelligent systems because it enables decision-making under uncertainty. Classifier design aims at building models from training data for representing feature-label distributions, either explicitly or implicitly. In many scientific or clinical settings, training data are typically limited, which impedes the design and evaluation of accurate classifiers. Although transfer learning can improve the learning in target domains by incorporating data from relevant source domains, it has received little attention for performance assessment, notably in error estimation. Here, we investigate knowledge transferability in the context of classification error estimation within a Bayesian paradigm. We introduce a class of Bayesian minimum mean-square error estimators for optimal Bayesian transfer learning, which enables rigorous evaluation of classification error under uncertainty in small-sample settings. Using Monte Carlo importance sampling, we illustrate the outstanding performance of the proposed estimator for a broad family of classifiers that span diverse learning capabilities.

12.
ACS Appl Mater Interfaces ; 14(22): 25907-25919, 2022 Jun 08.
Article in English | MEDLINE | ID: mdl-35622945

ABSTRACT

Van der Waals (vdW) heterostructures are constructed by different two-dimensional (2D) monolayers vertically stacked and weakly coupled by van der Waals interactions. VdW heterostructures often possess rich physical and chemical properties that are unique to their constituent monolayers. As many 2D materials have been recently identified, the combinatorial configuration space of vdW-stacked heterostructures grows exceedingly large, making it difficult to explore through traditional experimental or computational approaches in a trial-and-error manner. Here, we present a computational framework that combines first-principles electronic structure calculations, 2D material database, and supervised machine learning methods to construct efficient data-driven models capable of predicting electronic and structural properties of vdW heterostructures from their constituent monolayer properties. We apply this approach to predict the band gap, band edges, interlayer distance, and interlayer binding energy of vdW heterostructures. Our data-driven model will open avenues for efficient screening and discovery of low-dimensional vdW heterostructures and moiré superlattices with desired electronic and optical properties for targeted device applications.

13.
Elife ; 11, 2022 05 19.
Article in English | MEDLINE | ID: mdl-35587649

ABSTRACT

The phagocytosis and destruction of pathogens in lysosomes constitute central elements of innate immune defense. Here, we show that Brucella, the causative agent of brucellosis, the most prevalent bacterial zoonosis globally, subverts this immune defense pathway by activating regulated IRE1α-dependent decay (RIDD) of Bloc1s1 mRNA encoding BLOS1, a protein that promotes endosome-lysosome fusion. RIDD-deficient cells and mice harboring a RIDD-incompetent variant of IRE1α were resistant to infection. Inactivation of the Bloc1s1 gene impaired the ability to assemble BLOC-1-related complex (BORC), resulting in differential recruitment of BORC-related lysosome trafficking components, perinuclear trafficking of Brucella-containing vacuoles (BCVs), and enhanced susceptibility to infection. The RIDD-resistant Bloc1s1 variant maintains the integrity of BORC and a higher-level association of BORC-related components that promote centrifugal lysosome trafficking, resulting in enhanced BCV peripheral trafficking and lysosomal destruction, and resistance to infection. These findings demonstrate that host RIDD activity on BLOS1 regulates Brucella intracellular parasitism by disrupting BORC-directed lysosomal trafficking. Notably, coronavirus murine hepatitis virus also subverted the RIDD-BLOS1 axis to promote intracellular replication. Our work establishes BLOS1 as a novel immune defense factor whose activity is hijacked by diverse pathogens.


Subjects
Brucella, Brucellosis, Animals, Brucellosis/metabolism, Brucellosis/microbiology, Endoribonucleases/metabolism, Endosomes/metabolism, Mice, Protein Serine-Threonine Kinases
14.
Data Brief ; 42: 108113, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35434232

ABSTRACT

Transfer learning (TL) techniques can enable effective learning in data-scarce domains by allowing one to re-purpose data or scientific knowledge available in relevant source domains for predictive tasks in a target domain of interest. In this Data in Brief article, we present a synthetic dataset for binary classification in the context of Bayesian transfer learning, which can be used for the design and evaluation of TL-based classifiers. For this purpose, we consider numerous combinations of classification settings, based on which we simulate a diverse set of feature-label distributions with varying learning complexity. For each set of model parameters, we provide a pair of target and source datasets that have been jointly sampled from the underlying feature-label distributions in the target and source domains, respectively. For both target and source domains, the data in a given class and domain are normally distributed, where the distributions across domains are related to each other through a joint prior. To ensure the consistency of the classification complexity across the provided datasets, we have controlled the Bayes error such that it is maintained within a range of predefined values that mimic realistic classification scenarios across different relatedness levels. The provided datasets may serve as useful resources for designing and benchmarking transfer learning schemes for binary classification as well as the estimation of classification error.
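
To make the notion of a controlled Bayes error concrete, the sketch below works through a deliberately simplified case: two equally likely classes, one feature, and a shared standard deviation, for which the Bayes error is Φ(−|μ1 − μ0| / (2σ)). The univariate setting, the function names, and the bisection search are assumptions made for illustration; the dataset itself is generated from a richer model with a joint prior across domains.

```python
import math


def normal_cdf(z):
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bayes_error(mu0, mu1, sigma):
    """Bayes error for two equally likely 1-D Gaussian classes sharing sigma."""
    return normal_cdf(-abs(mu1 - mu0) / (2.0 * sigma))


def separation_for_error(target_error, sigma, iters=60):
    """Bisection search for the mean gap that yields the target Bayes error."""
    lo, hi = 0.0, 20.0 * sigma
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        # The error shrinks as the gap grows, so widen the gap if it is too high.
        if bayes_error(0.0, mid, sigma) > target_error:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


gap = separation_for_error(0.2, sigma=1.0)
```

In this toy setting, choosing the class means `gap` apart fixes the difficulty of the classification problem regardless of how many samples are later drawn.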

15.
STAR Protoc ; 3(1): 101184, 2022 03 18.
Article in English | MEDLINE | ID: mdl-35243375

ABSTRACT

This protocol explains the pipeline for condition-dependent metabolite yield prediction using Transcription Regulation Integrated with MEtabolic Regulation (TRIMER). TRIMER targets metabolic engineering applications via a hybrid model integrating transcription factor (TF)-gene regulatory network (TRN) with a Bayesian network (BN) inferred from transcriptomic expression data to effectively regulate metabolic reactions. For E. coli and yeast, TRIMER achieves reliable knockout phenotype and flux predictions from the deletion of one or more TFs at the genome scale. For complete details on the use and execution of this protocol, please refer to Niu et al. (2021).


Subjects
Escherichia coli, Gene Regulatory Networks, Bayes Theorem, Escherichia coli/genetics, Gene Expression Regulation, Saccharomyces cerevisiae/genetics, Transcription Factors/genetics
16.
Bioinformatics ; 38(4): 1075-1086, 2022 01 27.
Article in English | MEDLINE | ID: mdl-34788368

ABSTRACT

MOTIVATION: Accurate disease diagnosis and prognosis based on omics data rely on the effective identification of robust prognostic and diagnostic markers that reflect the states of the biological processes underlying the disease pathogenesis and progression. In this article, we present GCNCC, a Graph Convolutional Network-based approach for Clustering and Classification, that can identify highly effective and robust network-based disease markers. Based on a geometric deep learning framework, GCNCC learns deep network representations by integrating gene expression data with protein interaction data to identify highly reproducible markers with consistently accurate prediction performance across independent datasets possibly from different platforms. GCNCC identifies these markers by clustering the nodes in the protein interaction network based on latent similarity measures learned by the deep architecture of a graph convolutional network, followed by a supervised feature selection procedure that extracts clusters that are highly predictive of the disease state. RESULTS: By benchmarking GCNCC based on independent datasets from different diseases (psychiatric disorder and cancer) and different platforms (microarray and RNA-seq), we show that GCNCC outperforms other state-of-the-art methods in terms of accuracy and reproducibility. AVAILABILITY AND IMPLEMENTATION: https://github.com/omarmaddouri/GCNCC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Neural Networks (Computer), Protein Interaction Maps, Humans, Reproducibility of Results
17.
iScience ; 24(11): 103218, 2021 Nov 19.
Article in English | MEDLINE | ID: mdl-34761179

ABSTRACT

There has been extensive research in predictive modeling of genome-scale metabolic reaction networks. Living systems involve complex stochastic processes arising from interactions among different biomolecules. For more accurate and robust prediction of target metabolic behavior under different conditions, not only metabolic reactions but also the genetic regulatory relationships involving transcription factors (TFs) affecting these metabolic reactions should be modeled. We have developed a modeling and simulation pipeline enabling the analysis of Transcription Regulation Integrated with Metabolic Regulation: TRIMER. TRIMER utilizes a Bayesian network (BN) inferred from transcriptomes to model the transcription factor regulatory network. TRIMER then infers the probabilities of the gene states relevant to the metabolism of interest, and predicts the metabolic fluxes and their changes that result from the deletion of one or more transcription factors at the genome scale. We demonstrate TRIMER's applicability to both simulated and experimental data and provide performance comparison with other existing approaches.

18.
Article in English | MEDLINE | ID: mdl-34051378

ABSTRACT

CPI-613 is a mitochondrial metabolism disrupter that inhibits tricarboxylic acid (TCA) cycle activity. The consequences of TCA cycle disruption on various metabolic pathways and overall organismal physiology are not fully known. The present study integrates in vivo experimental data with an in silico stoichiometric metabolism model of zebrafish to study the metabolic pathways perturbed under CPI-613 exposure. Embryo-larval life stages of zebrafish (Danio rerio) were exposed to 1 µM CPI-613 for 20 days. Whole-organism respirometry measurements showed an initial suppression of O2 consumption at Day 5 of exposure, followed by recovery comparable to the solvent control (0.01% DMSO) by Day 20. Comparison of whole-transcriptome RNA-sequencing at Day 5 vs. 20 of exposure showed functional categories related to O2 binding and transport, antioxidant activity, FAD binding, and hemoglobin complexes to be commonly represented. Metabolic enzyme gene expression changes and O2 consumption rate were used to parametrize two in silico stoichiometric metabolic models representative of Day 5 or 20 of exposure. Computational simulations predicted impaired ATP synthesis, α-ketoglutarate dehydrogenase (KGDH) activity, and fatty acid β-oxidation at Day 5 vs. 20 of exposure. These results show that the targeted disruption of KGDH may also impact oxidative phosphorylation (ATP synthesis) and fatty acid metabolism (β-oxidation), in turn influencing cellular bioenergetics and the observed reduction in whole-organism O2 consumption rate. The results of this study provide an integrated in vivo and in silico framework to study the impacts of metabolic disruption on organismal physiology.


Subjects
Caprylates/toxicity, Computer Simulation, Nonmammalian Embryo/drug effects, Larva/drug effects, Sulfides/toxicity, Adenosine Triphosphate/metabolism, Animals, Down-Regulation, Developmental Gene Expression Regulation/drug effects, Genome-Wide Association Study, Oxygen Consumption/drug effects, Transcriptome, Up-Regulation, Zebrafish
19.
Bioinformatics ; 37(19): 3212-3219, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33822889

ABSTRACT

MOTIVATION: When learning to subtype complex disease based on next-generation sequencing data, the amount of available data is often limited. Recent works have tried to leverage data from other domains to design better predictors in the target domain of interest, with varying degrees of success. However, they are either limited to cases requiring outcome label correspondence across domains or cannot leverage the label information at all. Moreover, the existing methods usually cannot benefit from other information available a priori, such as gene interaction networks. RESULTS: In this article, we develop a generative optimal Bayesian supervised domain adaptation (OBSDA) model that can integrate RNA sequencing (RNA-Seq) data from different domains along with their labels for improving prediction accuracy in the target domain. Our model can be applied in cases where different domains share the same labels or have different ones. OBSDA is based on a hierarchical Bayesian negative binomial model with parameter factorization, for which the optimal predictor can be derived by marginalization of likelihood over the posterior of the parameters. We first provide an efficient Gibbs sampler for parameter inference in OBSDA. Then, we leverage the gene-gene network prior information and construct an informed and flexible variational family to infer the posterior distributions of model parameters. Comprehensive experiments on real-world RNA-Seq data demonstrate the superior performance of OBSDA, in terms of accuracy in identifying cancer subtypes by utilizing data from different domains. Moreover, we show that by taking advantage of the prior network information we can further improve the performance. AVAILABILITY AND IMPLEMENTATION: The source code for the implementations of OBSDA and SI-OBSDA is available at https://github.com/SHBLK/BSDA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

20.
J Biomed Inform ; 117: 103691, 2021 05.
Article in English | MEDLINE | ID: mdl-33610882

ABSTRACT

Survival data analysis has been leveraged in medical research to study disease morbidity and mortality, and to discover significant bio-markers affecting them. A crucial objective in studying high dimensional medical data is the development of inherently interpretable models that can efficiently capture sparse underlying signals while retaining a high predictive accuracy. Recently developed rule ensemble models have been shown to effectively accomplish this objective; however, they are computationally expensive when applied to survival data and do not account for sparsity in the number of variables included in the generated rules. To address these gaps, we present SURVFIT, a "doubly sparse" rule extraction formulation for survival data. This doubly sparse method can induce sparsity both in the number of rules and in the number of variables involved in the rules. Our method has the computational efficiency needed to realistically solve the problem of rule-extraction from survival data if we consider both rule sparsity and variable sparsity, by adopting a quadratic loss function with an overlapping group regularization. Further, a systematic rule evaluation framework that includes statistical testing, decomposition analysis and sensitivity analysis is provided. We demonstrate the utility of SURVFIT via experiments carried out on a synthetic dataset and a sepsis survival dataset from MIMIC-III.


Subjects
Algorithms, Learning