Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 4 de 4
Filtrar
Más filtros










Base de datos
Intervalo de año de publicación
1.
Bioinformatics ; 39(7)2023 07 01.
Artículo en Inglés | MEDLINE | ID: mdl-37421399

RESUMEN

MOTIVATION: Modality matching in single-cell omics data analysis-i.e. matching cells across datasets collected using different types of genomic assays-has become an important problem, because unifying perspectives across different technologies holds the promise of yielding biological and clinical discoveries. However, single-cell dataset sizes can now reach hundreds of thousands to millions of cells, which remain out of reach for most multimodal computational methods. RESULTS: We propose LSMMD-MA, a large-scale Python implementation of the MMD-MA method for multimodal data integration. In LSMMD-MA, we reformulate the MMD-MA optimization problem using linear algebra and solve it with KeOps, a CUDA framework for symbolic matrix computation in Python. We show that LSMMD-MA scales to a million cells in each modality, two orders of magnitude greater than existing implementations. AVAILABILITY AND IMPLEMENTATION: LSMMD-MA is freely available at https://github.com/google-research/large_scale_mmdma and archived at https://doi.org/10.5281/zenodo.8076311.


Asunto(s)
Genoma , Genómica , Genómica/métodos , Proyectos de Investigación , Análisis de Datos , Análisis de la Célula Individual , Programas Informáticos
2.
Neuroimage ; 220: 116847, 2020 10 15.
Artículo en Inglés | MEDLINE | ID: mdl-32438046

RESUMEN

Magnetoencephalography and electroencephalography (M/EEG) are non-invasive modalities that measure the weak electromagnetic fields generated by neural activity. Estimating the location and magnitude of the current sources that generated these electromagnetic fields is an inverse problem. Although it can be cast as a linear regression, this problem is severely ill-posed as the number of observations, which equals the number of sensors, is small. When considering a group study, a common approach consists in carrying out the regression tasks independently for each subject using techniques such as MNE or sLORETA. An alternative is to jointly localize sources for all subjects taken together, while enforcing some similarity between them. By pooling S subjects in a single joint regression, the number of observations is S times larger, potentially making the problem better posed and offering the ability to identify more sources with greater precision. Here we show how the coupling of the different regression problems can be done through a multi-task regularization that promotes focal source estimates. To take into account intersubject variabilities, we propose the Minimum Wasserstein Estimates (MWE). Thanks to a new joint regression method based on optimal transport (OT) metrics, MWE does not enforce perfect overlap of activation foci for all subjects but rather promotes spatial proximity on the cortical mantle. Besides, by estimating the noise level of each subject, MWE copes with the subject-specific signal-to-noise ratios with only one regularization parameter. On realistic simulations, MWE decreases the localization error by up to 4 â€‹mm per source compared to individual solutions. Experiments on the Cam-CAN dataset show improvements in spatial specificity in population imaging compared to individual models such as dSPM as well as a state-of-the-art Bayesian group level model. Our analysis of a multimodal dataset shows how multi-subject source localization reduces the gap between MEG and fMRI for brain mapping.


Asunto(s)
Mapeo Encefálico/métodos , Encéfalo/fisiología , Electroencefalografía/métodos , Magnetoencefalografía/métodos , Modelos Neurológicos , Humanos , Análisis Multivariante , Relación Señal-Ruido
3.
Neural Comput ; 31(5): 827-848, 2019 05.
Artículo en Inglés | MEDLINE | ID: mdl-30883281

RESUMEN

We propose a new divergence on the manifold of probability distributions, building on the entropic regularization of optimal transportation problems. As Cuturi ( 2013 ) showed, regularizing the optimal transport problem with an entropic term is known to bring several computational benefits. However, because of that regularization, the resulting approximation of the optimal transport cost does not define a proper distance or divergence between probability distributions. We recently tried to introduce a family of divergences connecting the Wasserstein distance and the Kullback-Leibler divergence from an information geometry point of view (see Amari, Karakida, & Oizumi, 2018 ). However, that proposal was not able to retain key intuitive aspects of the Wasserstein geometry, such as translation invariance, which plays a key role when used in the more general problem of computing optimal transport barycenters. The divergence we propose in this work is able to retain such properties and admits an intuitive interpretation.

4.
Neural Netw ; 18(8): 1111-23, 2005 Oct.
Artículo en Inglés | MEDLINE | ID: mdl-16198086

RESUMEN

We propose a new kernel for strings which borrows ideas and techniques from information theory and data compression. This kernel can be used in combination with any kernel method, in particular Support Vector Machines for string classification, with notable applications in proteomics. By using a Bayesian averaging framework with conjugate priors on a class of Markovian models known as probabilistic suffix trees or context-trees, we compute the value of this kernel in linear time and space while only using the information contained in the spectrum of the considered strings. This is ensured through an adaptation of a compression method known as the context-tree weighting algorithm. Encouraging classification results are reported on a standard protein homology detection experiment, showing that the context-tree kernel performs well with respect to other state-of-the-art methods while using no biological prior knowledge.


Asunto(s)
Algoritmos , Inteligencia Artificial , Compresión de Datos/métodos , Redes Neurales de la Computación , Teorema de Bayes , Bases de Datos de Proteínas , Cadenas de Markov , Análisis de Secuencia de Proteína , Homología de Secuencia de Aminoácido
SELECCIÓN DE REFERENCIAS
DETALLE DE LA BÚSQUEDA
...