Results 1 - 20 of 15,795
1.
Cell ; 175(3): 835-847.e25, 2018 10 18.
Article in English | MEDLINE | ID: mdl-30340044

ABSTRACT

How transcriptional bursting relates to gene regulation is a central question that has persisted for more than a decade. Here, we measure nascent transcriptional activity in early Drosophila embryos and characterize the variability in absolute activity levels across expression boundaries. We demonstrate that boundary formation follows a common transcription principle: a single control parameter determines the distribution of transcriptional activity, regardless of gene identity, boundary position, or enhancer-promoter architecture. We infer the underlying bursting kinetics and identify the key regulatory parameter as the fraction of time a gene is in a transcriptionally active state. Unexpectedly, both the rate of polymerase initiation and the switching rates are tightly constrained across all expression levels, predicting synchronous patterning outcomes at all positions in the embryo. These results point to a shared simplicity underlying the apparently complex transcriptional processes of early embryonic patterning and indicate a path to general rules in transcriptional regulation.


Subjects
Body Patterning/genetics; Gene Expression Regulation, Developmental; Transcriptional Activation; Animals; DNA-Directed RNA Polymerases/metabolism; Drosophila melanogaster; Embryo, Nonmammalian/metabolism; Models, Theoretical; Promoter Regions, Genetic
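The control parameter named in the abstract above, the fraction of time a gene spends in the active state, can be illustrated with a two-state (ON/OFF) switching model. The sketch below is only an illustration under that assumption: the function names, the rate symbols `k_on`, `k_off` and `r_init`, and the numerical values are hypothetical and are not taken from the paper.

```python
import random

def active_fraction(k_on, k_off):
    """Steady-state fraction of time in the ON state of a two-state model."""
    return k_on / (k_on + k_off)

def mean_initiation_rate(r_init, k_on, k_off):
    """Mean polymerase initiation rate: ON-state rate times the ON fraction."""
    return r_init * active_fraction(k_on, k_off)

def simulate_fraction_on(k_on, k_off, total_time, seed=0):
    """Estimate the ON fraction by drawing exponential dwell times."""
    rng = random.Random(seed)
    t, on, time_on = 0.0, False, 0.0
    while t < total_time:
        rate = k_off if on else k_on           # rate of leaving the current state
        end = min(t + rng.expovariate(rate), total_time)
        if on:
            time_on += end - t                  # accumulate time spent ON
        t = end
        on = not on                             # switch state
    return time_on / total_time

if __name__ == "__main__":
    print(active_fraction(1.0, 3.0))
    print(simulate_fraction_on(1.0, 3.0, total_time=10_000.0))
```

With the initiation rate held fixed, changing the ON fraction is the only way to change the mean output in this toy model, which is the sense in which a single parameter can set the expression level.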
2.
Cell ; 174(5): 1293-1308.e36, 2018 08 23.
Article in English | MEDLINE | ID: mdl-29961579

ABSTRACT

Knowledge of immune cell phenotypes in the tumor microenvironment is essential for understanding mechanisms of cancer progression and immunotherapy response. We profiled 45,000 immune cells from eight breast carcinomas, as well as matched normal breast tissue, blood, and lymph nodes, using single-cell RNA-seq. We developed a preprocessing pipeline, SEQC, and a Bayesian clustering and normalization method, Biscuit, to address computational challenges inherent to single-cell data. Despite significant similarity between normal and tumor tissue-resident immune cells, we observed continuous phenotypic expansions specific to the tumor microenvironment. Analysis of paired single-cell RNA and T cell receptor (TCR) sequencing data from 27,000 additional T cells revealed the combinatorial impact of TCR utilization on phenotypic diversity. Our results support a model of continuous activation in T cells and do not comport with the macrophage polarization model in cancer. Our results have important implications for characterizing tumor-infiltrating immune cells.


Subjects
Breast Neoplasms/immunology; Gene Expression Regulation, Neoplastic; Receptors, Antigen, T-Cell/metabolism; Sequence Analysis, RNA; Single-Cell Analysis; Tumor Microenvironment/immunology; Bayes Theorem; Breast Neoplasms/pathology; Cluster Analysis; Computational Biology; Female; Gene Expression Profiling; Humans; Immune System; Immunotherapy/methods; Lymph Nodes; Lymphocytes, Tumor-Infiltrating; Macrophages/metabolism; Phenotype; Transcriptome
3.
Annu Rev Neurosci ; 46: 233-258, 2023 07 10.
Article in English | MEDLINE | ID: mdl-36972611

ABSTRACT

Flexible behavior requires the creation, updating, and expression of memories to depend on context. While the neural underpinnings of each of these processes have been intensively studied, recent advances in computational modeling revealed a key challenge in context-dependent learning that had been largely ignored previously: Under naturalistic conditions, context is typically uncertain, necessitating contextual inference. We review a theoretical approach to formalizing context-dependent learning in the face of contextual uncertainty and the core computations it requires. We show how this approach begins to organize a large body of disparate experimental observations, from multiple levels of brain organization (including circuits, systems, and behavior) and multiple brain regions (most prominently the prefrontal cortex, the hippocampus, and motor cortices), into a coherent framework. We argue that contextual inference may also be key to understanding continual learning in the brain. This theory-driven perspective places contextual inference as a core component of learning.


Subjects
Brain; Learning; Hippocampus; Prefrontal Cortex; Computer Simulation
4.
Annu Rev Neurosci ; 44: 449-473, 2021 07 08.
Article in English | MEDLINE | ID: mdl-33882258

ABSTRACT

Adaptive behavior in a complex, dynamic, and multisensory world poses some of the most fundamental computational challenges for the brain, notably inference, decision-making, learning, binding, and attention. We first discuss how the brain integrates sensory signals from the same source to support perceptual inference and decision-making by weighting them according to their momentary sensory uncertainties. We then show how observers solve the binding or causal inference problem, that is, deciding whether signals come from common causes and should hence be integrated or else be treated independently. Next, we describe the multifarious interplay between multisensory processing and attention. We argue that attentional mechanisms are crucial to compute approximate solutions to the binding problem in naturalistic environments when complex time-varying signals arise from myriad causes. Finally, we review how the brain dynamically adapts multisensory processing to a changing world across multiple timescales.


Subjects
Attention; Auditory Perception; Brain; Learning; Visual Perception
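The weighting of signals "according to their momentary sensory uncertainties" described in the abstract above is commonly formalized as inverse-variance (reliability-weighted) averaging of independent Gaussian cues. The sketch below assumes that standard formulation; the function name and the example numbers are illustrative and do not come from the review.

```python
def integrate_cues(estimates, variances):
    """Combine independent Gaussian cue estimates by inverse-variance weighting.

    Returns the combined estimate and its variance. Each cue is weighted by
    its reliability (1 / variance), so less uncertain cues count for more.
    """
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    combined = sum(r * e for r, e in zip(reliabilities, estimates)) / total
    return combined, 1.0 / total

if __name__ == "__main__":
    # Hypothetical visual cue (10.0, variance 1.0) and auditory cue (20.0, variance 4.0)
    print(integrate_cues([10.0, 20.0], [1.0, 4.0]))
```

The combined variance is never larger than that of the most reliable single cue, which is why integration pays off when the signals really do share a common cause.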
5.
Proc Natl Acad Sci U S A ; 121(31): e2404676121, 2024 Jul 30.
Article in English | MEDLINE | ID: mdl-39042681

ABSTRACT

This work establishes a different paradigm on digital molecular spaces and their efficient navigation by exploiting sigma profiles. To do so, the remarkable capability of Gaussian processes (GPs), a type of stochastic machine learning model, to correlate and predict physicochemical properties from sigma profiles is demonstrated, outperforming state-of-the-art neural networks previously published. The amount of chemical information encoded in sigma profiles eases the learning burden of machine learning models, permitting the training of GPs on small datasets which, due to their negligible computational cost and ease of implementation, are ideal models to be combined with optimization tools such as gradient search or Bayesian optimization (BO). Gradient search is used to efficiently navigate the sigma profile digital space, quickly converging to local extrema of target physicochemical properties. While this requires the availability of pretrained GP models on existing datasets, such limitations are eliminated with the implementation of BO, which can find global extrema with a limited number of iterations. A remarkable example of this is that of BO toward boiling temperature optimization. Holding no knowledge of chemistry except for the sigma profile and boiling temperature of carbon monoxide (the worst possible initial guess), BO finds the global maximum of the available boiling temperature dataset (over 1,000 molecules encompassing more than 40 families of organic and inorganic compounds) in just 15 iterations (i.e., 15 property measurements), cementing sigma profiles as a powerful digital chemical space for molecular optimization and discovery, particularly when little to no experimental data is initially available.
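The loop described in the abstract above (fit a Gaussian process to the properties measured so far, then choose the next molecule to measure) can be sketched on a toy problem. Everything here is an assumption made for illustration: candidates are represented by a single number rather than a sigma profile, the kernel is a squared exponential, and the acquisition rule is an upper confidence bound. None of these choices is taken from the paper.

```python
import numpy as np

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """GP posterior mean and standard deviation at x_new (zero prior mean)."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_new)
    mean = Ks.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bayes_opt(candidates, measure, start, n_iter=15, kappa=2.0):
    """Maximize `measure` over a finite candidate set, one measurement per step."""
    tried = [start]
    values = [measure(candidates[start])]
    for _ in range(n_iter):
        y_obs = np.array(values)
        # Center the observations so the zero-mean prior is reasonable.
        mean, std = gp_posterior(candidates[tried], y_obs - y_obs.mean(), candidates)
        ucb = mean + kappa * std        # optimistic score for each candidate
        ucb[tried] = -np.inf            # never re-measure a candidate
        nxt = int(np.argmax(ucb))
        tried.append(nxt)
        values.append(measure(candidates[nxt]))
    best = int(np.argmax(values))
    return tried[best], values[best], tried

if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 50)                 # hypothetical 1-D descriptors
    toy_property = lambda x: -(x - 0.7) ** 2         # hypothetical target property
    idx, val, history = bayes_opt(grid, toy_property, start=0)
    print(grid[idx], val, len(history))
```

Restricting the search to a finite candidate list mirrors the setting in the abstract, where the optimum is sought within an existing dataset of molecules; `n_iter` must stay below the number of candidates.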

6.
Proc Natl Acad Sci U S A ; 121(28): e2302924121, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38950368

ABSTRACT

The human colonization of the Canary Islands represents the sole known expansion of Berber communities into the Atlantic Ocean and is an example of marine dispersal carried out by an African population. While this island colonization shows similarities to the populating of other islands across the world, several questions still need to be answered before this case can be included in wider debates regarding patterns of initial colonization and human settlement, human-environment interactions, and the emergence of island identities. Specifically, the chronology of the first human settlement of the Canary Islands remains disputed due to differing estimates of the timing of its first colonization. This absence of a consensus has resulted in divergent hypotheses regarding the motivations that led early settlers to migrate to the islands, e.g., ecological or demographic. Distinct motivations would imply differences in the strategies and dynamics of colonization; thus, identifying them is crucial to understanding how these populations developed in such environments. In response, the current study assembles a comprehensive dataset of the most reliable radiocarbon dates, which were used for building Bayesian models of colonization. The findings suggest that i) the Romans most likely discovered the islands around the 1st century BCE; ii) Berber groups from western North Africa first set foot on one of the islands closest to the African mainland sometime between the 1st and 3rd centuries CE; iii) Roman and Berber societies did not live simultaneously in the Canary Islands; and iv) the Berber people rapidly spread throughout the archipelago.


Subjects
Human Migration; Humans; Spain; Human Migration/history; Bayes Theorem; History, Ancient; Radiometric Dating
7.
Proc Natl Acad Sci U S A ; 121(17): e2320239121, 2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38630721

ABSTRACT

Collective motion is ubiquitous in nature; groups of animals, such as fish, birds, and ungulates appear to move as a whole, exhibiting a rich behavioral repertoire that ranges from directed movement to milling to disordered swarming. Typically, such macroscopic patterns arise from decentralized, local interactions among constituent components (e.g., individual fish in a school). Preeminent models of this process describe individuals as self-propelled particles, subject to self-generated motion and "social forces" such as short-range repulsion and long-range attraction or alignment. However, organisms are not particles; they are probabilistic decision-makers. Here, we introduce an approach to modeling collective behavior based on active inference. This cognitive framework casts behavior as the consequence of a single imperative: to minimize surprise. We demonstrate that many empirically observed collective phenomena, including cohesion, milling, and directed motion, emerge naturally when considering behavior as driven by active Bayesian inference, without explicitly building behavioral rules or goals into individual agents. Furthermore, we show that active inference can recover and generalize the classical notion of social forces as agents attempt to suppress prediction errors that conflict with their expectations. By exploring the parameter space of the belief-based model, we reveal nontrivial relationships between the individual beliefs and group properties like polarization and the tendency to visit different collective states. We also explore how individual beliefs about uncertainty determine collective decision-making accuracy. Finally, we show how agents can update their generative model over time, resulting in groups that are collectively more sensitive to external fluctuations and encode information more robustly.


Subjects
Mass Behavior; Models, Biological; Animals; Bayes Theorem; Movement; Motion; Fishes; Social Behavior; Behavior, Animal
8.
Proc Natl Acad Sci U S A ; 121(14): e2308814121, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38527194

ABSTRACT

RNA decay is a crucial mechanism for regulating gene expression in response to environmental stresses. In bacteria, RNA-binding proteins (RBPs) are known to be involved in posttranscriptional regulation, but their global impact on RNA half-lives has not been extensively studied. To shed light on the role of the major RBPs ProQ and CspC/E in maintaining RNA stability, we performed RNA sequencing of Salmonella enterica over a time course following treatment with the transcription initiation inhibitor rifampicin (RIF-seq) in the presence and absence of these RBPs. We developed a hierarchical Bayesian model that corrects for confounding factors in rifampicin RNA stability assays and enables us to identify differentially decaying transcripts transcriptome-wide. Our analysis revealed that the median RNA half-life in Salmonella in early stationary phase is less than 1 min, a third of previous estimates. We found that over half of the 500 most long-lived transcripts are bound by at least one major RBP, suggesting a general role for RBPs in shaping the transcriptome. Integrating differential stability estimates with cross-linking and immunoprecipitation followed by RNA sequencing (CLIP-seq) revealed that approximately 30% of transcripts with ProQ binding sites and more than 40% with CspC/E binding sites in coding or 3' untranslated regions decay differentially in the absence of the respective RBP. Analysis of differentially destabilized transcripts identified a role for ProQ in the oxidative stress response. Our findings provide insights into posttranscriptional regulation by ProQ and CspC/E, and the importance of RBPs in regulating gene expression.


Subjects
Gene Expression Profiling; Rifampin; Bayes Theorem; Half-Life; Transcriptome; RNA-Binding Proteins/metabolism; RNA/metabolism; Salmonella/metabolism; RNA Stability/genetics
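For a transcript that decays exponentially once transcription initiation is blocked, the half-life reported in the abstract above relates to the decay rate as half-life = ln(2) / rate. The sketch below estimates that rate with an ordinary least-squares fit to log abundance. It is a deliberately simplified stand-in, not the hierarchical Bayesian model the authors developed, and the time points and abundances are made up.

```python
import math
import numpy as np

def half_life_from_timecourse(times, abundances):
    """Estimate an RNA half-life from a decay time course.

    Assumes pure exponential decay, A(t) = A0 * exp(-rate * t), so that
    log A(t) is linear in t with slope -rate.
    """
    slope, _intercept = np.polyfit(np.asarray(times), np.log(abundances), 1)
    rate = -slope
    return math.log(2) / rate

if __name__ == "__main__":
    t = np.array([0.0, 1.0, 2.0, 4.0, 8.0])      # minutes after rifampicin (hypothetical)
    a = 1000.0 * np.exp(-0.7 * t)                # noiseless synthetic abundances
    print(half_life_from_timecourse(t, a))       # half-life in minutes
```

A plain fit like this ignores the confounding factors that the abstract says rifampicin assays suffer from, which is the motivation the authors give for a model that corrects for them.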
9.
Proc Natl Acad Sci U S A ; 121(37): e2316256121, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39226366

ABSTRACT

Trajectory inference methods are essential for analyzing the developmental paths of cells in single-cell sequencing datasets. They provide insights into cellular differentiation, transitions, and lineage hierarchies, helping unravel the dynamic processes underlying development and disease progression. However, many existing tools lack a coherent statistical model and reliable uncertainty quantification, limiting their utility and robustness. In this paper, we introduce VITAE (Variational Inference for Trajectory by AutoEncoder), a statistical approach that integrates a latent hierarchical mixture model with variational autoencoders to infer trajectories. The statistical hierarchical model enhances the interpretability of our framework, while the posterior approximations generated by our variational autoencoder ensure computational efficiency and provide uncertainty quantification of cell projections along trajectories. Specifically, VITAE enables simultaneous trajectory inference and data integration, improving the accuracy of learning a joint trajectory structure in the presence of biological and technical heterogeneity across datasets. We show that VITAE outperforms other state-of-the-art trajectory inference methods on both real and synthetic data under various trajectory topologies. Furthermore, we apply VITAE to jointly analyze three distinct single-cell RNA sequencing datasets of the mouse neocortex, unveiling comprehensive developmental lineages of projection neurons. VITAE effectively reduces batch effects within and across datasets and uncovers finer structures that might be overlooked in individual datasets. Additionally, we showcase VITAE's efficacy in integrative analyses of multiomic datasets with continuous cell population structures.


Subjects
Deep Learning; Genomics; Single-Cell Analysis; Single-Cell Analysis/methods; Animals; Mice; Genomics/methods; Sequence Analysis, RNA/methods; Humans
10.
Proc Natl Acad Sci U S A ; 121(18): e2312992121, 2024 Apr 30.
Article in English | MEDLINE | ID: mdl-38648479

ABSTRACT

Cortical neurons exhibit highly variable responses over trials and time. Theoretical work posits that this variability may arise from chaotic network dynamics of recurrently connected neurons. Here, we demonstrate that chaotic neural dynamics, formed through synaptic learning, allow networks to perform sensory cue integration in a sampling-based implementation. We show that the emergent chaotic dynamics provide neural substrates for generating samples not only of a static variable but also of a dynamical trajectory, where generic recurrent networks acquire these abilities with a biologically plausible learning rule through trial and error. Furthermore, the networks generalize their experience in the stimulus-evoked samples to inference with partial or no sensory information, which suggests a computational role of spontaneous activity as a representation of the priors as well as a tractable biological computation for marginal distributions. These findings suggest that chaotic neural dynamics may serve brain function as a Bayesian generative model.


Subjects
Models, Neurological; Neurons; Neurons/physiology; Bayes Theorem; Nerve Net/physiology; Nonlinear Dynamics; Humans; Learning/physiology; Animals; Brain/physiology
11.
Hum Mol Genet ; 33(14): 1262-1272, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38676403

ABSTRACT

BACKGROUND: Genetic susceptibility to various chronic diseases has been shown to influence heart failure (HF) risk. However, the underlying biological pathways, particularly the role of leukocyte telomere length (LTL), are largely unknown. We investigated the impact of genetic susceptibility to chronic diseases and various traits on HF risk, and whether LTL mediates or modifies the pathways. METHODS: We conducted prospective cohort analyses on 404 883 European participants from the UK Biobank, including 9989 incident HF cases. Multivariable Cox regression was used to estimate associations between HF risk and 24 polygenic risk scores (PRSs) for various diseases or traits previously generated using a Bayesian approach. We assessed multiplicative interactions between the PRSs and LTL previously measured in the UK Biobank using quantitative PCR. Causal mediation analyses were conducted to estimate the proportion of the total effect of PRSs acting indirectly through LTL, an integrative marker of biological aging. RESULTS: We identified 9 PRSs associated with HF risk, including those for various cardiovascular diseases or traits, rheumatoid arthritis (P = 1.3E-04), and asthma (P = 1.8E-08). Additionally, longer LTL was strongly associated with decreased HF risk (P-trend = 1.7E-08). Notably, LTL strengthened the asthma-HF relationship significantly (P-interaction = 2.8E-03). However, LTL mediated only 1.13% (P < 0.001) of the total effect of the asthma PRS on HF risk. CONCLUSIONS: Our findings shed light on the shared genetic susceptibility between HF risk, asthma, rheumatoid arthritis, and other traits. Longer LTL strengthened the genetic effect of asthma in the pathway to HF. These results support consideration of LTL and PRSs in HF risk prediction.


Subjects
Genetic Predisposition to Disease; Heart Failure; Leukocytes; Telomere; Humans; Heart Failure/genetics; Heart Failure/epidemiology; Female; Leukocytes/metabolism; Male; Middle Aged; Telomere/genetics; Chronic Disease; Aged; Prospective Studies; Telomere Homeostasis/genetics; Risk Factors; Polymorphism, Single Nucleotide; Adult; Multifactorial Inheritance/genetics; Genome-Wide Association Study; White People/genetics; European People
12.
Am J Hum Genet ; 110(11): 1863-1874, 2023 11 02.
Article in English | MEDLINE | ID: mdl-37879338

ABSTRACT

Genome-wide association studies (GWASs) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra-large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N = 420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (p = 2.58E-10) and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest shared etiologies between rheumatoid arthritis and periodontal condition in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWASs.


Subjects
Arthritis, Rheumatoid; Genome-Wide Association Study; Male; Humans; Genome-Wide Association Study/methods; Multifactorial Inheritance; Phenotype; Brain; Arthritis, Rheumatoid/genetics; Polymorphism, Single Nucleotide/genetics; Genetic Pleiotropy
13.
Am J Hum Genet ; 110(5): 741-761, 2023 05 04.
Article in English | MEDLINE | ID: mdl-37030289

ABSTRACT

The advent of large-scale genome-wide association studies (GWASs) has motivated the development of statistical methods for phenotype prediction with single-nucleotide polymorphism (SNP) array data. These polygenic risk score (PRS) methods use a multiple linear regression framework to infer joint effect sizes of all genetic variants on the trait. Among the subset of PRS methods that operate on GWAS summary statistics, sparse Bayesian methods have shown competitive predictive ability. However, most existing Bayesian approaches employ Markov chain Monte Carlo (MCMC) algorithms, which are computationally inefficient and do not scale favorably to higher dimensions, for posterior inference. Here, we introduce variational inference of polygenic risk scores (VIPRS), a Bayesian summary statistics-based PRS method that utilizes variational inference techniques to approximate the posterior distribution for the effect sizes. Our experiments with 36 simulation configurations and 12 real phenotypes from the UK Biobank dataset demonstrated that VIPRS is consistently competitive with the state-of-the-art in prediction accuracy while being more than twice as fast as popular MCMC-based approaches. This performance advantage is robust across a variety of genetic architectures, SNP heritabilities, and independent GWAS cohorts. In addition to its competitive accuracy on the "White British" samples, VIPRS showed improved transferability when applied to other ethnic groups, with up to 1.7-fold increase in R2 among individuals of Nigerian ancestry for low-density lipoprotein (LDL) cholesterol. To illustrate its scalability, we applied VIPRS to a dataset of 9.6 million genetic markers, which conferred further improvements in prediction accuracy for highly polygenic traits, such as height.


Subjects
Genome-Wide Association Study; Multifactorial Inheritance; Humans; Multifactorial Inheritance/genetics; Genome-Wide Association Study/methods; Bayes Theorem; Polymorphism, Single Nucleotide/genetics; Risk Factors; Genetic Predisposition to Disease
14.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581417

ABSTRACT

Untargeted metabolomics based on liquid chromatography-mass spectrometry technology is quickly gaining widespread application, given its ability to depict the global metabolic pattern in biological samples. However, the data are noisy and plagued by the lack of clear identity of data features measured from samples. Multiple potential matchings exist between data features and known metabolites, while the truth can only be one-to-one matches. Some existing methods attempt to reduce the matching uncertainty, but are far from being able to remove the uncertainty for most features. The existence of the uncertainty causes major difficulty in downstream functional analysis. To address these issues, we develop a novel approach for Bayesian Analysis of Untargeted Metabolomics data (BAUM) to integrate previously separate tasks into a single framework, including matching uncertainty inference, metabolite selection and functional analysis. By incorporating the knowledge graph between variables and using relatively simple assumptions, BAUM can analyze datasets with small sample sizes. By allowing different confidence levels of feature-metabolite matching, the method is applicable to datasets in which feature identities are partially known. Simulation studies demonstrate that, compared with other existing methods, BAUM achieves better accuracy in selecting important metabolites that tend to be functionally consistent and assigning confidence scores to feature-metabolite matches. We analyze a COVID-19 metabolomics dataset and a mouse brain metabolomics dataset using BAUM. Even with a very small sample size of 16 mice per group, BAUM is robust and stable. It finds pathways that conform to existing knowledge, as well as novel pathways that are biologically plausible.


Subjects
Metabolomics; Mice; Animals; Bayes Theorem; Sample Size; Uncertainty; Metabolomics/methods; Computer Simulation
15.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39133097

ABSTRACT

Constructing gene regulatory networks is a widely adopted approach for investigating gene regulation, offering diverse applications in biology and medicine. A great deal of research focuses on using time series data or single-cell RNA-sequencing data to infer gene regulatory networks. However, such gene expression data lack either cellular or temporal information. Fortunately, the advent of time-lapse confocal laser microscopy enables biologists to obtain tree-shaped gene expression data of Caenorhabditis elegans, achieving both cellular and temporal resolution. Although such tree-shaped data provide abundant knowledge, they pose challenges such as non-pairwise time series, which lead to inaccuracy in downstream analysis. To address this issue, a comprehensive framework for data integration and a novel Bayesian approach based on a Boolean network with time delay are proposed. A pre-screening process and a Markov chain Monte Carlo algorithm are applied to obtain the parameter estimates. Simulation studies show that our method outperforms existing Boolean network inference algorithms. Leveraging the proposed approach, gene regulatory networks for five subtrees are reconstructed based on the real tree-shaped datasets of Caenorhabditis elegans, where some gene regulatory relationships confirmed in previous genetic studies are recovered. Also, heterogeneity of regulatory relationships in different cell lineage subtrees is detected. Furthermore, the exploration of potential gene regulatory relationships that bear importance in human diseases is undertaken. All source code is available at the GitHub repository https://github.com/edawu11/BBTD.git.


Subjects
Algorithms; Caenorhabditis elegans; Gene Regulatory Networks; Caenorhabditis elegans/genetics; Animals; Bayes Theorem; Computational Biology/methods; Markov Chains; Gene Expression Profiling/methods
16.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38653490

ABSTRACT

Genome-wide Association Studies (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These genetic variants are marginally less effective and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, choosing different kernel functions in KBT can significantly influence the type I error control and power, and selecting the optimal kernel remains a statistically challenging task. A few existing methods suffer from inflated type I errors, limited scalability, inferior power or issues of ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while testing genetic associations simultaneously. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations and controlling type I errors simultaneously. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules and protein complexes, BayesKAT deciphers the complex genetic basis and provides mechanistic insights into human diseases.


Subjects
Bayes Theorem; Genome-Wide Association Study; Polymorphism, Single Nucleotide; Humans; Genome-Wide Association Study/methods; Genetic Predisposition to Disease; Algorithms; Software; Computational Biology/methods; Genetic Association Studies/methods
17.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38653489

ABSTRACT

There is a growing interest in inferring context-specific gene regulatory networks from single-cell RNA sequencing (scRNA-seq) data. This involves identifying the regulatory relationships between transcription factors (TFs) and genes in individual cells, and then characterizing these relationships at the level of specific cell types or cell states. In this study, we introduce scGATE (single-cell gene regulatory gate) as a novel computational tool for inferring TF-gene interaction networks and reconstructing Boolean logic gates involving regulatory TFs using scRNA-seq data. In contrast to current Boolean models, scGATE eliminates the need for individual formulations and likelihood calculations for each Boolean rule (e.g. AND, OR, XOR). By employing a Bayesian framework, scGATE infers the Boolean rule after fitting the model to the data, resulting in significant reductions in time-complexities for logic-based studies. We have applied single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq) data and TF DNA binding motifs to filter out non-relevant TFs in gene regulation. By integrating single-cell clustering with these external cues, scGATE is able to infer context-specific networks. The performance of scGATE is evaluated using synthetic and real single-cell multi-omics data from mouse tissues and human blood, demonstrating its superiority over existing tools for reconstructing TF-gene networks. Additionally, scGATE provides a flexible framework for understanding the complex combinatorial and cooperative relationships among TFs regulating target genes by inferring Boolean logic gates among them.


Subjects
Gene Regulatory Networks; Single-Cell Analysis; Transcription Factors; Single-Cell Analysis/methods; Transcription Factors/metabolism; Transcription Factors/genetics; Animals; Mice; Computational Biology/methods; Bayes Theorem; Humans; Algorithms; Sequence Analysis, RNA/methods; Gene Expression Regulation; Multiomics
18.
Mol Cell ; 71(5): 733-744.e11, 2018 09 06.
Article in English | MEDLINE | ID: mdl-30174289

ABSTRACT

Cell-fate decisions are central to the survival and development of both uni- and multicellular organisms. It remains unclear when and to what degree cells can decide on future fates prior to commitment. This uncertainty stems from experimental and theoretical limitations in measuring and integrating multiple signals at the single-cell level during a decision process. Here, we combine six-color live-cell imaging with the Bayesian method of statistical evidence to study the meiosis/quiescence decision in budding yeast. Integration of multiple upstream metabolic signals predicts individual cell fates with high probability well before commitment. Cells "decide" their fates before birth, well before the activation of pathways characteristic of downstream cell fates. This decision, which remains stable through several cell cycles, occurs when multiple metabolic parameters simultaneously cross cell-fate-specific thresholds. Taken together, our results show that cells can decide their future fates long before commitment mechanisms are activated.


Subjects
Metabolic Networks and Pathways/physiology; Saccharomycetales/metabolism; Saccharomycetales/physiology; Bayes Theorem; Meiosis/physiology
19.
Proc Natl Acad Sci U S A ; 120(20): e2220672120, 2023 05 16.
Article in English | MEDLINE | ID: mdl-37159475

ABSTRACT

The extraordinary number of species in the tropics when compared to the extra-tropics is probably the most prominent and consistent pattern in biogeography, suggesting that overarching processes regulate this diversity gradient. A major challenge to characterizing which processes are at play relies on quantifying how the frequency and determinants of tropical and extra-tropical speciation, extinction, and dispersal events shaped evolutionary radiations. We address this question by developing and applying spatiotemporal phylogenetic and paleontological models of diversification for tetrapod species incorporating paleoenvironmental variation. Our phylogenetic model results show that area, energy, or species richness did not uniformly affect speciation rates across tetrapods and dispute expectations of a latitudinal gradient in speciation rates. Instead, both neontological and fossil evidence coincide in underscoring the role of extra-tropical extinctions and the outflow of tropical species in shaping biodiversity. These diversification dynamics accurately predict present-day levels of species richness across latitudes and uncover temporal idiosyncrasies but spatial generality across the major tetrapod radiations.


Subjects
Biodiversity; Biological Evolution; Phylogeny; Dissent and Disputes; Fossils
20.
Proc Natl Acad Sci U S A ; 120(17): e2220045120, 2023 Apr 25.
Article in English | MEDLINE | ID: mdl-37068251

ABSTRACT

Interpreting the outcome of chemistry experiments consistently is slow and frequently introduces unwanted hidden bias. This difficulty limits the scale of collectable data and often leads to exclusion of negative results, which severely limits progress in the field. What is needed is a way to standardize the discovery process and accelerate the interpretation of high-dimensional data aided by the expert chemist's intuition. We demonstrate a digital Oracle that interprets chemical reactivity using probability. By carrying out >500 reactions covering a large space and retaining both the positive and negative results, the Oracle was able to rediscover eight historically important reactions including the aldol condensation, Buchwald-Hartwig amination, Heck, Mannich, Sonogashira, Suzuki, Wittig, and Wittig-Horner reactions. This paradigm for decoding reactivity validates and formalizes the expert chemist's experience and intuition, providing a quantitative criterion of discovery scalable to all available experimental data.
