Search | BVS Regional Portal

Multiset multicover methods for discriminative marker selection.

Hasanaj, Euxhen; Alavi, Amir; Gupta, Anupam; Póczos, Barnabás; Bar-Joseph, Ziv.

Cell Rep Methods ; 2(11): 100332, 2022 Nov 21.

Article in English | MEDLINE | ID: mdl-36452867

ABSTRACT

Markers are increasingly being used for several high-throughput data analysis and experimental design tasks. Examples include the use of markers for assigning cell types in scRNA-seq studies, for deconvolving bulk gene expression data, and for selecting marker proteins in single-cell spatial proteomics studies. Most marker selection methods focus on differential expression (DE) analysis. Although such methods work well for data with a few non-overlapping marker sets, they are not appropriate for large atlas-size datasets where several cell types and tissues are considered. To address this, we define the phenotype cover (PC) problem for marker selection and present algorithms that can improve the discriminative power of marker sets. Analysis of these sets on several marker-selection tasks suggests that these methods can lead to solutions that accurately distinguish different phenotypes in the data.

Subject(s)

Gene Expression Profiling; Single-Cell Analysis; Gene Expression Profiling/methods; Single-Cell Analysis/methods; Cluster Analysis; Algorithms; Phenotype
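The abstract above does not describe the authors' algorithms, so the following is only a rough illustration of the multiset multicover idea named in the title: a textbook greedy heuristic, not the paper's method. The marker names, the `multiplicity` table (how many times a marker covers a phenotype), and the `requirement` table (how many times each phenotype must be covered) are all hypothetical.

```python
def greedy_multiset_multicover(multiplicity, requirement):
    """Pick sets greedily until every element meets its coverage requirement.

    multiplicity: dict mapping set name -> {element: times the set covers it}
    requirement:  dict mapping element -> required coverage count
    """
    residual = dict(requirement)      # coverage still missing per element
    available = set(multiplicity)     # sets not chosen yet
    chosen = []

    def gain(name):
        # Useful coverage only: never count more than what is still missing.
        return sum(min(count, residual.get(element, 0))
                   for element, count in multiplicity[name].items())

    while any(missing > 0 for missing in residual.values()):
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(available), key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("coverage requirements cannot be met")
        chosen.append(best)
        available.remove(best)
        for element, count in multiplicity[best].items():
            if element in residual:
                residual[element] = max(0, residual[element] - count)
    return chosen


# Hypothetical toy instance: three candidate markers, two phenotypes.
markers = {
    "g1": {"A": 2, "B": 0},
    "g2": {"A": 0, "B": 2},
    "g3": {"A": 1, "B": 1},
}
selected = greedy_multiset_multicover(markers, {"A": 2, "B": 2})
```

In this toy instance every marker has the same initial gain, so the alphabetical tie-break decides the first pick; a real marker-selection task would derive the multiplicities from expression data.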

Hierarchical Machine Learning for High-Fidelity 3D Printed Biopolymers.

Bone, Jennifer M; Childs, Christopher M; Menon, Aditya; Póczos, Barnabás; Feinberg, Adam W; LeDuc, Philip R; Washburn, Newell R.

ACS Biomater Sci Eng ; 6(12): 7021-7031, 2020 Dec 14.

Article in English | MEDLINE | ID: mdl-33320614

ABSTRACT

A hierarchical machine learning (HML) framework is presented that uses a small dataset to learn and predict the dominant build parameters necessary to print high-fidelity 3D features of alginate hydrogels. We examine the 3D printing of soft hydrogel forms printed with the freeform reversible embedding of suspended hydrogel method based on a CAD file that isolated the single-strand diameter and shape fidelity of printed alginate. Combinations of system variables, including print speed, flow rate, ink concentration, and nozzle diameter, were systematically varied to generate a small dataset of 48 prints. Prints were imaged and scored according to their dimensional similarity to the CAD file, and high print fidelity was defined as prints with less than 10% error from the CAD file. As a part of the HML framework, statistical inference was performed using the least absolute shrinkage and selection operator to find the dominant variables that drive the error in the final prints. Model fit between the system parameters and print score was elucidated and improved by a parameterized middle layer of variable relationships, which showed good agreement between the predicted and observed data (R² = 0.643). Optimization allowed for the prediction of build parameters that gave rise to high-fidelity prints of the measured features. A trade-off was identified when optimizing for the fidelity of different features printed within the same construct, showing the need for complex predictive design tools. A combination of known and discovered relationships was used to generate process maps for the 3D bioprinting designer that show error minima based on the chosen input variables. Our approach offers a promising pathway toward scaling 3D bioprinting by optimizing print fidelity via learned build parameters that reduce the need for iterative testing.

Subject(s)

Bioprinting; Biopolymers; Hydrogels; Machine Learning; Printing, Three-Dimensional
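The abstract above names the least absolute shrinkage and selection operator (LASSO) as the tool for finding dominant variables. As a minimal sketch of how LASSO zeroes out uninformative inputs, the following uses plain coordinate descent in NumPy. The two-level design matrix, the synthetic response, and the penalty value are assumptions made for illustration; they are not the study's data or settings.

```python
import itertools
import numpy as np


def soft_threshold(value, threshold):
    # Shrink toward zero; values inside [-threshold, threshold] become 0.
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def lasso_coordinate_descent(X, y, penalty, n_sweeps=100):
    # Minimises 0.5 * ||y - X w||^2 + penalty * ||w||_1.
    n_features = X.shape[1]
    w = np.zeros(n_features)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(n_features):
            # Residual with feature j's current contribution added back.
            partial_residual = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ partial_residual
            w[j] = soft_threshold(rho, penalty) / col_sq[j]
    return w


# Hypothetical two-level design over four build parameters (16 runs).
X = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))
# Synthetic response driven by the first parameter only.
y = 3.0 * X[:, 0]
weights = lasso_coordinate_descent(X, y, penalty=4.0)
```

With this orthogonal design only the first coefficient stays non-zero (shrunk from 3 to 2.75), which is the sense in which the L1 penalty selects dominant variables.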

Learning to predict the cosmological structure formation.

He, Siyu; Li, Yin; Feng, Yu; Ho, Shirley; Ravanbakhsh, Siamak; Chen, Wei; Póczos, Barnabás.

Proc Natl Acad Sci U S A ; 116(28): 13825-13832, 2019 Jul 9.

Article in English | MEDLINE | ID: mdl-31235606

ABSTRACT

Matter evolved under the influence of gravity from minuscule density fluctuations. Nonperturbative structure formed hierarchically over all scales and developed non-Gaussian features in the Universe, known as the cosmic web. To fully understand the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and use a large ensemble of computer simulations to compare with the observed data to extract the full information of our own Universe. However, to evolve billions of particles over billions of years, even with the simplest physics, is a daunting task. We build a deep neural network, the Deep Density Displacement Model (D³M), which learns from a set of prerun numerical simulations, to predict the nonlinear large-scale structure of the Universe with the Zel'dovich Approximation (ZA), an analytical approximation based on perturbation theory, as the input. Our extensive analysis demonstrates that D³M outperforms the second-order perturbation theory (2LPT), the commonly used fast-approximate simulation method, in predicting cosmic structure in the nonlinear regime. We also show that D³M is able to accurately extrapolate far beyond its training data and predict structure formation for significantly different cosmological parameters. Our study proves that deep learning is a practical and accurate alternative to approximate 3D simulations of the gravitational structure formation of the Universe.

Predicting enhancer-promoter interaction from genomic sequence with deep neural networks.

Singh, Shashank; Yang, Yang; Póczos, Barnabás; Ma, Jian.

Quant Biol ; 7(2): 122-137, 2019 Jun.

Article in English | MEDLINE | ID: mdl-34113473

ABSTRACT

BACKGROUND: In the human genome, distal enhancers are involved in regulating target genes through proximal promoters by forming enhancer-promoter interactions. Although recently developed high-throughput experimental approaches have allowed us to recognize potential enhancer-promoter interactions genome-wide, it is still largely unclear to what extent the sequence-level information encoded in our genome helps guide such interactions. METHODS: Here we report a new computational method (named "SPEID") using deep learning models to predict enhancer-promoter interactions based on sequence-based features only, when the locations of putative enhancers and promoters in a particular cell type are given. RESULTS: Our results across six different cell types demonstrate that SPEID is effective in predicting enhancer-promoter interactions as compared to state-of-the-art methods that only use information from a single cell type. As a proof-of-principle, we also applied SPEID to identify somatic non-coding mutations in melanoma samples that may have reduced enhancer-promoter interactions in tumor genomes. CONCLUSIONS: This work demonstrates that deep learning models can help reveal that sequence-based features alone are sufficient to reliably predict enhancer-promoter interactions genome-wide.

Subject2Vec: Generative-Discriminative Approach from a Set of Image Patches to a Vector.

Singla, Sumedha; Gong, Mingming; Ravanbakhsh, Siamak; Sciurba, Frank; Poczos, Barnabas; Batmanghelich, Kayhan N.

Med Image Comput Comput Assist Interv ; 11070: 502-510, 2018 Sep.

Article in English | MEDLINE | ID: mdl-30895278

ABSTRACT

We propose an attention-based method that aggregates local image features to a subject-level representation for predicting disease severity. In contrast to classical deep learning that requires a fixed-dimensional input, our method operates on a set of image patches; hence it can accommodate variable-length input images without image resizing. The model learns a clinically interpretable subject-level representation that is reflective of the disease severity. Our model consists of three mutually dependent modules which regulate each other: (1) a discriminative network that learns a fixed-length representation from local features and maps them to disease severity; (2) an attention mechanism that provides interpretability by focusing on the areas of the anatomy that contribute the most to the prediction task; and (3) a generative network that encourages the diversity of the local latent features. The generative term ensures that the attention weights are non-degenerate while maintaining the relevance of the local regions to the disease severity. We train our model end-to-end in the context of a large-scale lung CT study of Chronic Obstructive Pulmonary Disease (COPD). Our model gives state-of-the-art performance in predicting clinical measures of severity for COPD. The distribution of the attention provides the regional relevance of lung tissue to the clinical measurements.

Subject(s)

Image Interpretation, Computer-Assisted; Lung; Tomography, X-Ray Computed; Algorithms; Humans; Lung/diagnostic imaging; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity
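The abstract above describes aggregating a variable-size set of patch features into one fixed-length subject vector through attention. The following is a minimal sketch of that pooling step alone, assuming a single linear scoring vector followed by a softmax. The scoring vector and the feature dimensions are hypothetical, and the paper's discriminative and generative networks are not reproduced here.

```python
import numpy as np


def attention_pool(patch_features, scoring_vector):
    """Collapse (n_patches, d) features into one d-dimensional vector.

    scoring_vector is a hypothetical learned parameter of shape (d,).
    Returns the pooled vector and the per-patch attention weights.
    """
    scores = patch_features @ scoring_vector
    scores = scores - scores.max()            # numerical stability for exp
    weights = np.exp(scores)
    weights = weights / weights.sum()         # softmax over patches
    subject_vector = weights @ patch_features # attention-weighted average
    return subject_vector, weights
```

Because the softmax normalises over however many patches are present, subjects with 5 patches and subjects with 9 patches both map to a vector of the same length, and the weights themselves can be inspected to see which patches contributed most.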

Quantifying Differences and Similarities in Whole-Brain White Matter Architecture Using Local Connectome Fingerprints.

Yeh, Fang-Cheng; Vettel, Jean M; Singh, Aarti; Poczos, Barnabas; Grafton, Scott T; Erickson, Kirk I; Tseng, Wen-Yih I; Verstynen, Timothy D.

PLoS Comput Biol ; 12(11): e1005203, 2016 Nov.

Article in English | MEDLINE | ID: mdl-27846212

ABSTRACT

Quantifying differences or similarities in connectomes has been a challenge due to the immense complexity of global brain networks. Here we introduce a noninvasive method that uses diffusion MRI to characterize whole-brain white matter architecture as a single local connectome fingerprint that allows for a direct comparison between structural connectomes. In four independently acquired data sets with repeated scans (total N = 213), we show that the local connectome fingerprint is highly specific to an individual, allowing for an accurate self-versus-others classification that achieved 100% accuracy across 17,398 identification tests. The estimated classification error was approximately one thousand times smaller than that of fingerprints derived from diffusivity-based measures or region-to-region connectivity patterns for repeat scans acquired within 3 months. The local connectome fingerprint also revealed neuroplasticity within an individual reflected as a decreasing trend in self-similarity across time, whereas this change was not observed in the diffusivity measures. Moreover, the local connectome fingerprint can be used as a phenotypic marker, revealing 12.51% similarity between monozygotic twins, 5.14% between dizygotic twins, and 4.51% between non-twin siblings, relative to differences between unrelated subjects. This novel approach opens a new door for probing the influence of pathological, genetic, social, or environmental factors on the unique configuration of the human connectome.

Subject(s)

Brain/anatomy & histology; Connectome/methods; Diffusion Tensor Imaging/methods; Image Interpretation, Computer-Assisted/methods; Subtraction Technique; White Matter/anatomy & histology; Adult; Algorithms; Female; Humans; Image Enhancement/methods; Male; Pattern Recognition, Automated/methods; Reproducibility of Results; Sensitivity and Specificity; Young Adult

Variance Reduction in Stochastic Gradient Langevin Dynamics.

Dubey, Avinava; Reddi, Sashank J; Póczos, Barnabás; Smola, Alexander J; Xing, Eric P; Williamson, Sinead A.

Adv Neural Inf Process Syst ; 29: 1154-1162, 2016 Dec.

Article in English | MEDLINE | ID: mdl-28713210

ABSTRACT

Stochastic gradient-based Monte Carlo methods such as stochastic gradient Langevin dynamics are useful tools for posterior inference on large scale datasets in many machine learning applications. These methods scale to large datasets by using noisy gradients calculated using a mini-batch or subset of the dataset. However, the high variance inherent in these noisy gradients degrades performance and leads to slower mixing. In this paper, we present techniques for reducing variance in stochastic gradient Langevin dynamics, yielding novel stochastic Monte Carlo methods that improve performance by reducing the variance in the stochastic gradient. We show that our proposed method has better theoretical guarantees on convergence rate than stochastic Langevin dynamics. This is complemented by impressive empirical results obtained on a variety of real world datasets, and on four different machine learning tasks (regression, classification, independent component analysis and mixture modeling). These theoretical and empirical contributions combine to make a compelling case for using variance reduction in stochastic Monte Carlo methods.
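The abstract above does not say which variance-reduction estimator is used, so the following is only a sketch of one common choice: a control variate built from a full gradient computed at an anchor point. It compares that estimator with the plain mini-batch gradient on a toy model. The Gaussian likelihood, the prior variance, the batch size, and all function names are assumptions made for illustration.

```python
import numpy as np


def grad_log_prior(theta, prior_var=100.0):
    # Gradient of log N(theta | 0, prior_var).
    return -theta / prior_var


def grad_log_lik(theta, x):
    # Per-datum gradient of log N(x | theta, 1).
    return x - theta


def sgld_gradient(theta, data, batch_idx):
    # Plain mini-batch estimate of the log-posterior gradient.
    scale = len(data) / len(batch_idx)
    return grad_log_prior(theta) + scale * grad_log_lik(theta, data[batch_idx]).sum()


def vr_gradient(theta, anchor, anchor_lik_grad, data, batch_idx):
    # Control-variate estimate: full likelihood gradient at the anchor
    # plus a mini-batch correction for the move from anchor to theta.
    scale = len(data) / len(batch_idx)
    batch = data[batch_idx]
    correction = grad_log_lik(theta, batch) - grad_log_lik(anchor, batch)
    return grad_log_prior(theta) + anchor_lik_grad + scale * correction.sum()


def langevin_step(theta, grad, step_size, rng):
    # One Langevin update: half-step along the gradient plus Gaussian noise.
    return theta + 0.5 * step_size * grad + np.sqrt(step_size) * rng.normal()


rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=1000)
theta, anchor = 1.5, 1.4
anchor_lik_grad = grad_log_lik(anchor, data).sum()
batches = [rng.choice(len(data), size=10, replace=False) for _ in range(200)]
plain = np.array([sgld_gradient(theta, data, b) for b in batches])
reduced = np.array([vr_gradient(theta, anchor, anchor_lik_grad, data, b)
                    for b in batches])
```

In this linear-Gaussian toy model the correction term is identical for every datum, so the control variate removes the mini-batch noise entirely; for general models it only reduces it, and the anchor must be refreshed periodically at the cost of a full pass over the data.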

Competitive spiking and indirect entropy minimization of rate code: efficient search for hidden components.

Szatmáry, Botond; Póczos, Barnabás; Lorincz, András.

J Physiol Paris ; 98(4-6): 407-16, 2004.

Article in English | MEDLINE | ID: mdl-16289549

ABSTRACT

Our motivation, which originates from the psychological and physiological evidence of component-based representations in the brain, is to find neural methods that can efficiently search for structures. Here, an architecture made of coupled, parallel-working reconstruction subnetworks is presented. Each subnetwork uses a non-negativity constraint on the generative weights and on the internal representation. 'Spikes' are generated within subnetworks via a winner-take-all mechanism. Memory components are modified in order to directly minimize the reconstruction error and to indirectly minimize the entropy of the spike rate distribution, via a combination of a stochastic gradient search and a novel tuning method. This tuning dynamically changes the learning rate: the higher the entropy of the spike rate, the higher the learning rate of the gradient search in the subnetworks. This method effectively reduces the search space and increases the escape probability from high-entropy local minima. We demonstrate that one subnetwork can develop localized and oriented components. Coupled networks can discover and sort components into the subnetworks, a problem subject to combinatorial explosion. Synergy between spike code and rate code is discussed.

Subject(s)

Action Potentials/physiology; Learning/physiology; Models, Neurological; Neural Networks, Computer; Algorithms; Animals; Computer Simulation; Entropy; Humans; Mathematics; Time Factors
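The abstract above states only that a higher spike-rate entropy gives a higher learning rate; it does not give the functional form. As a minimal sketch of such a tuning rule, the following assumes a linear scaling by normalised Shannon entropy. The linear form, the `base_lr` value, and the function names are assumptions.

```python
import math


def spike_rate_entropy(rates):
    # Shannon entropy (in nats) of the normalised spike-rate distribution.
    total = sum(rates)
    probs = [r / total for r in rates if r > 0]
    return -sum(p * math.log(p) for p in probs)


def entropy_scaled_lr(rates, base_lr=0.1):
    # Assumed rule: learning rate proportional to entropy, normalised so
    # that a uniform spike-rate distribution yields exactly base_lr.
    max_entropy = math.log(len(rates))
    return base_lr * spike_rate_entropy(rates) / max_entropy
```

Under this rule a subnetwork whose units fire equally often (no structure found yet) keeps searching with the full step size, while one dominated by a single unit nearly stops updating.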

Cost component analysis.

Lörincz, András; Póczos, Barnabás.

Int J Neural Syst ; 13(3): 183-92, 2003 Jun.

Article in English | MEDLINE | ID: mdl-12884451

ABSTRACT

In optimization, the dimension of the problem may severely, sometimes exponentially, increase optimization time. Parametric function approximators (FAPPs) have been suggested to overcome this problem. Here, a novel FAPP, cost component analysis (CCA), is described. In CCA, the search space is resampled according to the Boltzmann distribution generated by the energy landscape. That is, CCA converts the optimization problem to density estimation. The structure of the induced density is searched by independent component analysis (ICA). The advantage of CCA is that each independent ICA component can be optimized separately. In turn, (i) CCA intends to partition the original problem into subproblems and (ii) separating (partitioning) the original optimization problem into subproblems may serve interpretation. Most importantly, (iii) CCA may give rise to high gains in optimization time. Numerical simulations illustrate the working of the algorithm.

Subject(s)

Algorithms; Costs and Cost Analysis; Artificial Intelligence; Computer Simulation; Neural Networks, Computer; Nonlinear Dynamics
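The abstract above says the search space is resampled according to the Boltzmann distribution generated by the energy landscape. The following is a minimal sketch of that resampling step alone, assuming a finite set of candidate points and a user-chosen temperature; the subsequent ICA step is not shown. The candidate grid, the quadratic energy, and the temperature are hypothetical.

```python
import numpy as np


def boltzmann_resample(points, energy, temperature, n_samples, rng):
    """Draw candidate points with probability proportional to exp(-E / T)."""
    energies = np.array([energy(p) for p in points])
    # Subtracting the minimum energy leaves the distribution unchanged
    # but keeps the exponentials from underflowing.
    logits = -(energies - energies.min()) / temperature
    probs = np.exp(logits)
    probs = probs / probs.sum()
    idx = rng.choice(len(points), size=n_samples, p=probs)
    return points[idx], probs


# Hypothetical one-dimensional landscape with its minimum at 0.
candidates = np.arange(-3.0, 4.0)          # -3, -2, ..., 3
rng = np.random.default_rng(0)
samples, probs = boltzmann_resample(candidates, lambda x: x ** 2,
                                    temperature=1.0, n_samples=500, rng=rng)
```

Low-energy candidates dominate the resampled set, which is what turns the optimization problem into a density-estimation problem: the density's mass sits where the cost is small.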
