Search | BVS Search Portal

Identifying temporal and spatial patterns of variation from multimodal data using MEFISTO.

Velten, Britta; Braunger, Jana M; Argelaguet, Ricard; Arnol, Damien; Wirbel, Jakob; Bredikhin, Danila; Zeller, Georg; Stegle, Oliver.

Nat Methods; 19(2): 179-186, 2022 Feb.

Article in English | MEDLINE | ID: mdl-35027765

ABSTRACT

Factor analysis is a widely used method for dimensionality reduction in genome biology, with applications from personalized health to single-cell biology. Existing factor analysis models assume independence of the observed samples, an assumption that fails in spatio-temporal profiling studies. Here we present MEFISTO, a flexible and versatile toolbox for modeling high-dimensional data when spatial or temporal dependencies between the samples are known. MEFISTO maintains the established benefits of factor analysis for multimodal data, but enables spatio-temporally informed dimensionality reduction, interpolation, and separation of smooth from non-smooth patterns of variation. Moreover, MEFISTO can integrate multiple related datasets by simultaneously identifying and aligning the underlying patterns of variation in a data-driven manner. To illustrate MEFISTO, we apply the model to different datasets with spatial or temporal resolution, including an evolutionary atlas of organ development, a longitudinal microbiome study, a single-cell multi-omics atlas of mouse gastrulation and spatially resolved transcriptomics.

Subjects

Computational Biology/methods; Databases, Factual; Gastrointestinal Microbiome/physiology; Gene Expression Regulation, Developmental; Software; Animals; Evolution, Molecular; Humans; Infant; Longitudinal Studies; Single-Cell Analysis; Spatio-Temporal Analysis

Multi-Omics Factor Analysis: a framework for unsupervised integration of multi-omics data sets.

Argelaguet, Ricard; Velten, Britta; Arnol, Damien; Dietrich, Sascha; Zenz, Thorsten; Marioni, John C; Buettner, Florian; Huber, Wolfgang; Stegle, Oliver.

Mol Syst Biol; 14(6): e8124, 2018 Jun 20.

Article in English | MEDLINE | ID: mdl-29925568

ABSTRACT

Multi-omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi-Omics Factor Analysis (MOFA), a computational method for discovering the principal sources of variation in multi-omics data sets. MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities. The learnt factors enable a variety of downstream analyses, including identification of sample subgroups, data imputation and the detection of outlier samples. We applied MOFA to a cohort of 200 patient samples of chronic lymphocytic leukaemia, profiled for somatic mutations, RNA expression, DNA methylation and ex vivo drug responses. MOFA identified major dimensions of disease heterogeneity, including immunoglobulin heavy-chain variable region status, trisomy of chromosome 12 and previously underappreciated drivers, such as response to oxidative stress. In a second application, we used MOFA to analyse single-cell multi-omics data, identifying coordinated transcriptional and epigenetic changes along cell differentiation.

Subjects

Computational Biology/methods; Datasets as Topic; Antineoplastic Agents/therapeutic use; Computer Simulation; Humans; Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; Models, Statistical; Oxidative Stress; Software; Transcriptome

Identifying the Potential Mechanism of Action of SNPs Associated With Breast Cancer Susceptibility With GVITamIN.

Nguyen, An-Phi; Nicoletti, Paola; Arnol, Damien; Califano, Andrea; Rodríguez Martínez, María.

Front Bioeng Biotechnol; 8: 798, 2020.

Article in English | MEDLINE | ID: mdl-32850701

ABSTRACT

In the last decade, a large number of genome-wide association studies have uncovered many single-nucleotide polymorphisms (SNPs) that are associated with complex traits and confer susceptibility to diseases, such as cancer. However, so far only a few heritable traits with medium-to-high penetrance have been identified. The vast majority of the discovered variants lead to disease only in combination with other, still unknown factors. Furthermore, while many studies have aimed to link the effect of SNPs to changes in molecular phenotypes, the analysis has often focused on testing associations between a single SNP and a transcript, hence disregarding the dysregulation of gene regulatory networks that has been shown to play an essential role in disease onset, notably in cancer. Here we take a systems biology approach and develop GVITamIN (Genetic VarIaTIoN functional analysis tool), a new statistical and computational approach to characterize the effect of a SNP on both genes and transcriptional regulatory programs. GVITamIN exploits a novel statistical approach to combine the usually small effects of disease-susceptibility SNPs, and reveals important potential oncogenic mechanisms, hence taking one step further toward understanding the SNP mechanism of action. We apply GVITamIN to a breast cancer cohort and identify well-known cancer-related transcription factors (TFs), such as CTCF, LEF1, and FOXA1, as TFs dysregulated by breast cancer-associated SNPs. Furthermore, our results reveal that SNPs located on the RAD51B gene are significantly associated with an abnormal regulatory activity, suggesting a pivotal role for homologous recombination repair mechanisms in breast cancer.

MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data.

Argelaguet, Ricard; Arnol, Damien; Bredikhin, Danila; Deloro, Yonatan; Velten, Britta; Marioni, John C; Stegle, Oliver.

Genome Biol; 21(1): 111, 2020 May 11.

Article in English | MEDLINE | ID: mdl-32393329

ABSTRACT

Technological advances have enabled the profiling of multiple molecular layers at single-cell resolution, assaying cells from multiple samples or conditions. Consequently, there is a growing need for computational strategies to analyze data from complex experimental designs that include multiple data modalities and multiple groups of samples. We present Multi-Omics Factor Analysis v2 (MOFA+), a statistical framework for the comprehensive and scalable integration of single-cell multi-modal data. MOFA+ reconstructs a low-dimensional representation of the data using computationally efficient variational inference and supports flexible sparsity constraints, allowing joint modeling of variation across multiple sample groups and data modalities.

Subjects

Factor Analysis, Statistical; Single-Cell Analysis; Animals; DNA Methylation; Embryonic Development; Frontal Lobe/metabolism; Mice; Sequence Analysis, RNA

Modeling Cell-Cell Interactions from Spatial Molecular Data with Spatial Variance Component Analysis.

Arnol, Damien; Schapiro, Denis; Bodenmiller, Bernd; Saez-Rodriguez, Julio; Stegle, Oliver.

Cell Rep; 29(1): 202-211.e6, 2019 Oct 01.

Article in English | MEDLINE | ID: mdl-31577949

ABSTRACT

Technological advances enable multiplexed, spatially resolved profiling of RNA and protein expression in individual cells, thereby capturing molecular variations in physiological contexts. While these methods are increasingly accessible, computational approaches for studying the interplay of the spatial structure of tissues and cell-cell heterogeneity are only beginning to emerge. Here, we present spatial variance component analysis (SVCA), a computational framework for the analysis of spatial molecular data. SVCA enables quantifying different dimensions of spatial variation and in particular quantifies the effect of cell-cell interactions on gene expression. In a breast cancer Imaging Mass Cytometry dataset, our model yields interpretable spatial variance signatures, which reveal cell-cell interactions as a major driver of protein expression heterogeneity. Applied to high-dimensional imaging-derived RNA data, SVCA identifies plausible gene families that are linked to cell-cell interactions. SVCA is available as a free software tool that can be widely applied to spatial data from different technologies.

Subjects

Cell Communication/genetics; Gene Expression/genetics; Analysis of Variance; Breast Neoplasms/genetics; Female; Gene Expression Profiling/methods; Humans; RNA/genetics; Software

Structural rearrangements generate cell-specific, gene-independent CRISPR-Cas9 loss of fitness effects.

Gonçalves, Emanuel; Behan, Fiona M; Louzada, Sandra; Arnol, Damien; Stronach, Euan A; Yang, Fengtang; Yusa, Kosuke; Stegle, Oliver; Iorio, Francesco; Garnett, Mathew J.

Genome Biol; 20(1): 27, 2019 Feb 05.

Article in English | MEDLINE | ID: mdl-30722791

ABSTRACT

BACKGROUND: CRISPR-Cas9 genome editing is widely used to study gene function, from basic biology to biomedical research. Structural rearrangements are a ubiquitous feature of cancer cells and their impact on the functional consequences of CRISPR-Cas9 gene-editing has not yet been assessed. RESULTS: Utilizing CRISPR-Cas9 knockout screens for 250 cancer cell lines, we demonstrate that targeting structurally rearranged regions, in particular tandem or interspersed amplifications, is highly detrimental to cellular fitness in a gene-independent manner. In contrast, amplifications caused by whole chromosomal duplication have little to no impact on fitness. This effect is cell line specific and dependent on the ploidy status. We devise a copy-number ratio metric that substantially improves the detection of gene-independent cell fitness effects in CRISPR-Cas9 screens. Furthermore, we develop a computational tool, called Crispy, to account for these effects on a single sample basis and provide corrected gene fitness effects. CONCLUSION: Our analysis demonstrates the importance of structural rearrangements in mediating the effect of CRISPR-Cas9-induced DNA damage, with implications for the use of CRISPR-Cas9 gene-editing in cancer cells.

Subjects

CRISPR-Cas Systems; Genomic Structural Variation; Genomics/methods; Whole Genome Sequencing; Cell Line, Tumor; Humans; Neoplasms/genetics; Ploidies; Software

Coupled Motions in β2AR:Gαs Conformational Ensembles.

Pachov, Dimitar V; Fonseca, Rasmus; Arnol, Damien; Bernauer, Julie; van den Bedem, Henry.

J Chem Theory Comput; 12(3): 946-56, 2016 Mar 08.

Article in English | MEDLINE | ID: mdl-26756780

ABSTRACT

G protein-coupled receptors (GPCRs) act as conduits in the plasma membrane, facilitating cellular responses to physiological events by activating intracellular signal transduction pathways. Extracellular signaling molecules can induce conformational changes in a GPCR, which allow it to selectively activate intracellular protein partners such as heterotrimeric G proteins. However, a major unsolved problem is how GPCRs and G proteins form complexes and how their interaction results in G protein activation. Here, we show that an inactive, agonist-free β2AR:Gαs complex can collectively sample intermediate states of the receptor on an activation pathway. An in silico conformational ensemble around the inactive state manifests significant conformational coupling between structural elements implicated in G protein activation throughout the complex. While Gαs helix α5 has received much attention as a driver for nucleotide exchange, we also observe interactions between helix αN and intracellular loop 2, which can be transmitted by β1 to facilitate nucleotide exchange by disrupting a salt bridge between the P-loop and Switch I. These interactions are moderated in an active state ensemble. Collectively, our results support an alternative view of G protein activation, in which precoupling can allosterically modulate an agonist-free receptor. Subsequent selective agonist recruitment would result in collective activation of the complex. This alternative view can help us understand how distinct extracellular binding partners result in different but interdependent signaling pathways, with broad implications for GPCR drug discovery.

Subjects

Heterotrimeric GTP-Binding Proteins/chemistry; Movement; Receptors, Adrenergic, beta-2/chemistry; Humans; Models, Molecular; Protein Conformation
