Search | BVS Infectious and Parasitic Diseases

Predicting cellular responses to complex perturbations in high-throughput screens.

Lotfollahi, Mohammad; Klimovskaia Susmelj, Anna; De Donno, Carlo; Hetzel, Leon; Ji, Yuge; Ibarra, Ignacio L; Srivatsan, Sanjay R; Naghipourfar, Mohsen; Daza, Riza M; Martin, Beth; Shendure, Jay; McFaline-Figueroa, Jose L; Boyeau, Pierre; Wolf, F Alexander; Yakubova, Nafissa; Günnemann, Stephan; Trapnell, Cole; Lopez-Paz, David; Theis, Fabian J.

Mol Syst Biol ; 19(6): e11517, 2023 06 12.

Article in English | MEDLINE | ID: mdl-37154091

ABSTRACT

Recent advances in multiplexed single-cell transcriptomics experiments facilitate the high-throughput study of drug and genetic perturbations. However, an exhaustive exploration of the combinatorial perturbation space is experimentally unfeasible. Therefore, computational methods are needed to predict, interpret, and prioritize perturbations. Here, we present the compositional perturbation autoencoder (CPA), which combines the interpretability of linear models with the flexibility of deep-learning approaches for single-cell response modeling. CPA learns to in silico predict transcriptional perturbation response at the single-cell level for unseen dosages, cell types, time points, and species. Using newly generated single-cell drug combination data, we validate that CPA can predict unseen drug combinations while outperforming baseline models. Additionally, the architecture's modularity enables incorporating the chemical representation of the drugs, allowing the prediction of cellular response to completely unseen drugs. Furthermore, CPA is also applicable to genetic combinatorial screens. We demonstrate this by imputing in silico 5,329 missing combinations (97.6% of all possibilities) in a single-cell Perturb-seq experiment with diverse genetic interactions. We envision CPA will facilitate efficient experimental design and hypothesis generation by enabling in silico response prediction at the single-cell level and thus accelerate therapeutic applications using single-cell technologies.

Subjects

Computational Biology; Gene Expression Profiling; High-Throughput Screening Assays; Single-Cell Gene Expression Analysis

Back-to-back regression: Disentangling the influence of correlated factors from multivariate observations.

King, Jean-Rémi; Charton, François; Lopez-Paz, David; Oquab, Maxime.

Neuroimage ; 220: 117028, 2020 10 15.

Article in English | MEDLINE | ID: mdl-32603859

ABSTRACT

Identifying causes solely from observations can be particularly challenging when i) the factors under investigation are difficult to manipulate independently from one another and ii) observations are high-dimensional. To address this issue, we introduce "Back-to-Back" regression (B2B), a linear method designed to efficiently estimate, from a set of correlated factors, those that most plausibly account for multidimensional observations. First, we prove the consistency of B2B, its links to other linear approaches, and show how it can provide a robust, unbiased and interpretable scalar estimate for each factor. Second, we use a variety of simulated data to show that B2B can outperform forward modeling ("encoding"), backward modeling ("decoding") as well as cross-decomposition modeling (i.e. canonical correlation analysis and partial least squares) on causal identification when the factors and the observations are not orthogonal. Finally, we apply B2B to a hundred magneto-encephalography recordings and to a hundred functional Magnetic Resonance Imaging recordings acquired while subjects performed a 1 h reading task. B2B successfully disentangles the respective contributions of collinear factors such as word length and word frequency in the early visual and late associative cortical responses, respectively. B2B compared favorably to other standard techniques on this disentanglement. We discuss how the speed and the generality of B2B set promising foundations to help identify the causal contributions of covarying factors from high-dimensional observations.

Subjects

Cerebral Cortex/diagnostic imaging; Image Processing, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Magnetoencephalography/methods; Brain Mapping/methods; Cerebral Cortex/physiology; Humans; Multivariate Analysis; Reading; Regression Analysis
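
The B2B abstract above does not spell out the estimator, so the following is only a minimal sketch under stated assumptions: a two-stage ridge scheme in which factors are first decoded from the observations on one half of the samples, the decoded factors are then regressed on the true factors on the other half, and the diagonal of the second coefficient matrix is read as one score per factor. The function names (`ridge`, `b2b`, `simulate`), the fixed 50/50 split, the penalty `alpha`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge(A, B, alpha=1e-3):
    """Closed-form ridge regression: W minimizing ||A W - B||^2 + alpha ||W||^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(d), A.T @ B)

def b2b(X, Y, alpha=1e-3):
    """Two-stage sketch (assumed scheme): decode, then re-encode, keep the diagonal."""
    half = X.shape[0] // 2
    X1, Y1 = X[:half], Y[:half]
    X2, Y2 = X[half:], Y[half:]
    G = ridge(Y1, X1, alpha)      # stage 1: predict factors from observations
    H = ridge(X2, Y2 @ G, alpha)  # stage 2: predict decoded factors from true factors
    return np.diag(H)             # one scalar score per factor

def simulate(seed=0, n=2000, d_y=20, noise=1.0):
    """Four correlated factors; only the first two influence the observations."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n, 1))
    X = rng.normal(size=(n, 4)) + 0.8 * shared  # shared term makes factors collinear
    E = np.array([1.0, 1.0, 0.0, 0.0])          # which factors are causal
    F = rng.normal(size=(4, d_y))               # mixing into observation space
    Y = (X * E) @ F + noise * rng.normal(size=(n, d_y))
    return X, Y

X, Y = simulate()
scores = b2b(X, Y)
print(np.round(scores, 2))
```

In this simulation the two causal factors should receive scores well above those of the two non-causal ones, even though all four factors are correlated. Real recordings would additionally need centering, cross-validated penalties, and repeated random splits, none of which are shown here.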

Interpolation consistency training for semi-supervised learning.

Verma, Vikas; Kawaguchi, Kenji; Lamb, Alex; Kannala, Juho; Solin, Arno; Bengio, Yoshua; Lopez-Paz, David.

Neural Netw ; 145: 90-106, 2022 Jan.

Article in English | MEDLINE | ID: mdl-34735894

ABSTRACT

We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density regions of the data distribution. Our experiments show that ICT achieves state-of-the-art performance when applied to standard neural network architectures on the CIFAR-10 and SVHN benchmark datasets. Our theoretical analysis shows that ICT corresponds to a certain type of data-adaptive regularization with unlabeled points which reduces overfitting to labeled points under high confidence values.

Subjects

Neural Networks, Computer; Supervised Machine Learning; Algorithms; Benchmarking
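
The consistency idea stated in the ICT abstract (the prediction at an interpolation of two unlabeled points should match the interpolation of their predictions) can be illustrated with a small NumPy sketch. This is not the training procedure of the paper: the squared-error penalty, the fixed mixing weight `lam`, and the toy predictors `affine` and `square` are assumptions chosen only to make the quantity concrete.

```python
import numpy as np

def mix(a, b, lam):
    """Linear interpolation between a and b with weight lam on a."""
    return lam * a + (1.0 - lam) * b

def ict_consistency_loss(predict, u1, u2, lam):
    """Mean squared gap between predict(mix(u1, u2)) and mix(predict(u1), predict(u2))."""
    pred_at_mix = predict(mix(u1, u2, lam))
    mix_of_preds = mix(predict(u1), predict(u2), lam)
    return float(np.mean((pred_at_mix - mix_of_preds) ** 2))

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
b = rng.normal(size=2)

def affine(x):
    """Toy predictor that is linear between any two inputs."""
    return x @ W + b

def square(x):
    """Toy predictor that bends between inputs."""
    return x ** 2

u1 = np.ones((4, 3))   # stand-ins for two batches of unlabeled points
u2 = np.zeros((4, 3))

loss_affine = ict_consistency_loss(affine, u1, u2, lam=0.5)
loss_square = ict_consistency_loss(square, u1, u2, lam=0.5)
print(loss_affine, loss_square)
```

An affine predictor incurs essentially zero penalty, while the curved one is penalized, which is the sense in which such a term favors predictors that behave linearly between unlabeled samples. In practice this term would be added to a supervised loss and minimized by gradient descent with a deep network in place of the toy predictors.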

An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers.

Vedantam, Ramakrishna; Lopez-Paz, David; Schwab, David J.

Adv Neural Inf Process Syst ; 34: 28131-28143, 2021 Dec.

Article in English | MEDLINE | ID: mdl-36597462

ABSTRACT

Recent work demonstrates that deep neural networks trained using Empirical Risk Minimization (ERM) can generalize under distribution shift, outperforming specialized training algorithms for domain generalization. The goal of this paper is to further understand this phenomenon. In particular, we study the extent to which the seminal domain adaptation theory of Ben-David et al. (2007) explains the performance of ERMs. Perhaps surprisingly, we find that this theory does not provide a tight explanation of the out-of-domain generalization observed across a large number of ERM models trained on three popular domain generalization datasets. This motivates us to investigate other possible measures (which, however, lack theory) that could explain generalization in this setting. Our investigation reveals that measures relating to the Fisher information, predictive entropy, and maximum mean discrepancy are good predictors of the out-of-distribution generalization of ERM models. We hope that our work helps galvanize the community towards building a better understanding of when deep networks trained with ERM generalize out-of-distribution.
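
Of the measures named in the abstract above, maximum mean discrepancy (MMD) is the simplest to write down. The sketch below computes the standard biased estimate of squared MMD between two samples. The RBF kernel, the bandwidth `sigma`, and the Gaussian toy samples are assumptions for illustration; the abstract does not say which kernel or which features the authors used.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def mmd2_biased(X, Y, sigma=1.0):
    """Biased estimate of squared MMD: mean k(X,X) + mean k(Y,Y) - 2 mean k(X,Y)."""
    return float(rbf_kernel(X, X, sigma).mean()
                 + rbf_kernel(Y, Y, sigma).mean()
                 - 2.0 * rbf_kernel(X, Y, sigma).mean())

rng = np.random.default_rng(0)
source = rng.normal(size=(200, 2))         # sample from a "training domain"
same = rng.normal(size=(200, 2))           # fresh sample from the same distribution
shifted = rng.normal(size=(200, 2)) + 3.0  # sample from a shifted "test domain"

mmd_same = mmd2_biased(source, same)
mmd_shifted = mmd2_biased(source, shifted)
print(mmd_same, mmd_shifted)
```

The estimate stays near zero for two samples from the same distribution and grows when one sample is shifted, which is why a quantity of this kind can serve as a rough indicator of how far a test domain lies from the training domains.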

Poincaré maps for analyzing complex hierarchies in single-cell data.

Klimovskaia, Anna; Lopez-Paz, David; Bottou, Léon; Nickel, Maximilian.

Nat Commun ; 11(1): 2966, 2020 06 11.

Article in English | MEDLINE | ID: mdl-32528075

ABSTRACT

The need to understand cell developmental processes spawned a plethora of computational methods for discovering hierarchies from scRNAseq data. However, existing techniques are based on Euclidean geometry, a suboptimal choice for modeling complex cell trajectories with multiple branches. To overcome this fundamental representation issue we propose Poincaré maps, a method that harnesses the power of hyperbolic geometry in the realm of single-cell data analysis. Often understood as a continuous extension of trees, hyperbolic geometry enables the embedding of complex hierarchical data in only two dimensions while preserving the pairwise distances between points in the hierarchy. This enables the use of our embeddings in a wide variety of downstream data analysis tasks, such as visualization, clustering, lineage detection and pseudotime inference. When compared to existing methods (which are unable to address all these important tasks using a single embedding), Poincaré maps produce state-of-the-art two-dimensional representations of cell trajectories on multiple scRNAseq datasets.

Subjects

Computational Biology/methods; Algorithms; Machine Learning; Systems Biology/methods
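
To make the hyperbolic geometry mentioned in the Poincaré maps abstract concrete, the sketch below evaluates the standard geodesic distance in the Poincaré disk model. It assumes, from the method's name only, that the two-dimensional embedding lives in the open unit disk; the function name `poincare_distance` and the example points are illustrative and not taken from the paper.

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit disk.

    d(u, v) = arccosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2) (1 - ||v||^2)))
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

# From the centre to a point halfway to the rim.
d_origin = poincare_distance([0.0, 0.0], [0.5, 0.0])

# Between two points near opposite sides of the rim.
d_rim = poincare_distance([-0.9, 0.0], [0.9, 0.0])

print(d_origin, d_rim)
```

Distances blow up toward the rim: the two points at radius 0.9 are only 1.8 apart in Euclidean terms but far apart hyperbolically. That growth in available room is what lets branching, tree-like structure fit into two dimensions.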
