Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 28
Filtrar
Mais filtros

Base de dados
Tipo de documento
País de afiliação
Intervalo de ano de publicação
1.
Anal Chem ; 95(48): 17458-17466, 2023 12 05.
Artigo em Inglês | MEDLINE | ID: mdl-37971927

RESUMO

Microfluidics can split samples into thousands or millions of partitions, such as droplets or nanowells. Partitions capture analytes according to a Poisson distribution, and in diagnostics, the analyte concentration is commonly inferred with a closed-form solution via maximum likelihood estimation (MLE). Here, we present a new scalable approach to multiplexing analytes. We generalize MLE with microfluidic partitioning and extend our previously developed Sparse Poisson Recovery (SPoRe) inference algorithm. We also present the first in vitro demonstration of SPoRe with droplet digital PCR (ddPCR) toward infection diagnostics. Digital PCR is intrinsically highly sensitive, and SPoRe helps expand its multiplexing capacity by circumventing its channel limitations. We broadly amplify bacteria with 16S ddPCR and assign barcodes to nine pathogen genera by using five nonspecific probes. Given our two-channel ddPCR system, we measured two probes at a time in multiple groups of droplets. Although individual droplets are ambiguous in their bacterial contents, we recover the concentrations of bacteria in the sample from the pooled data. We achieve stable quantification down to approximately 200 total copies of the 16S gene per sample, enabling a suite of clinical applications given a robust upstream microbial DNA extraction procedure. We develop a new theory that generalizes the application of this framework to many realistic sensing modalities, and we prove scaling rules for system design to achieve further expanded multiplexing. The core principles demonstrated here could impact many biosensing applications with microfluidic partitioning.


Assuntos
Bactérias , Microfluídica , Reação em Cadeia da Polimerase/métodos , Bactérias/genética
2.
Nucleic Acids Res ; 48(10): 5217-5234, 2020 06 04.
Artigo em Inglês | MEDLINE | ID: mdl-32338745

RESUMO

As computational biologists continue to be inundated by ever increasing amounts of metagenomic data, the need for data analysis approaches that keep up with the pace of sequence archives has remained a challenge. In recent years, the accelerated pace of genomic data availability has been accompanied by the application of a wide array of highly efficient approaches from other fields to the field of metagenomics. For instance, sketching algorithms such as MinHash have seen a rapid and widespread adoption. These techniques handle increasingly large datasets with minimal sacrifices in quality for tasks such as sequence similarity calculations. Here, we briefly review the fundamentals of the most impactful probabilistic and signal processing algorithms. We also highlight more recent advances to augment previous reviews in these areas that have taken a broader approach. We then explore the application of these techniques to metagenomics, discuss their pros and cons, and speculate on their future directions.


Assuntos
Algoritmos , Metagenômica/métodos , Probabilidade , Processamento de Sinais Assistido por Computador , Humanos , Metagenoma/genética
3.
IEEE Trans Signal Process ; 70: 2388-2401, 2022.
Artigo em Inglês | MEDLINE | ID: mdl-36082267

RESUMO

Compressed sensing (CS) is a signal processing technique that enables the efficient recovery of a sparse high-dimensional signal from low-dimensional measurements. In the multiple measurement vector (MMV) framework, a set of signals with the same support must be recovered from their corresponding measurements. Here, we present the first exploration of the MMV problem where signals are independently drawn from a sparse, multivariate Poisson distribution. We are primarily motivated by a suite of biosensing applications of microfluidics where analytes (such as whole cells or biomarkers) are captured in small volume partitions according to a Poisson distribution. We recover the sparse parameter vector of Poisson rates through maximum likelihood estimation with our novel Sparse Poisson Recovery (SPoRe) algorithm. SPoRe uses batch stochastic gradient ascent enabled by Monte Carlo approximations of otherwise intractable gradients. By uniquely leveraging the Poisson structure, SPoRe substantially outperforms a comprehensive set of existing and custom baseline CS algorithms. Notably, SPoRe can exhibit high performance even with one-dimensional measurements and high noise levels. This resource efficiency is not only unprecedented in the field of CS but is also particularly potent for applications in microfluidics in which the number of resolvable measurements per partition is often severely limited. We prove the identifiability property of the Poisson model under such lax conditions, analytically develop insights into system performance, and confirm these insights in simulated experiments. Our findings encourage a new approach to biosensing and are generalizable to other applications featuring spatial and temporal Poisson signals.

4.
Biometrics ; 73(1): 10-19, 2017 03.
Artigo em Inglês | MEDLINE | ID: mdl-27163413

RESUMO

In the biclustering problem, we seek to simultaneously group observations and features. While biclustering has applications in a wide array of domains, ranging from text mining to collaborative filtering, the problem of identifying structure in high-dimensional genomic data motivates this work. In this context, biclustering enables us to identify subsets of genes that are co-expressed only within a subset of experimental conditions. We present a convex formulation of the biclustering problem that possesses a unique global minimizer and an iterative algorithm, COBRA, that is guaranteed to identify it. Our approach generates an entire solution path of possible biclusters as a single tuning parameter is varied. We also show how to reduce the problem of selecting this tuning parameter to solving a trivial modification of the convex biclustering problem. The key contributions of our work are its simplicity, interpretability, and algorithmic guarantees-features that arguably are lacking in the current alternative algorithms. We demonstrate the advantages of our approach, which includes stably and reproducibly identifying biclusterings, on simulated and real microarray data.


Assuntos
Análise por Conglomerados , Interpretação Estatística de Dados , Redes Reguladoras de Genes , Algoritmos , Biologia Computacional/métodos , Bases de Dados Genéticas , Perfilação da Expressão Gênica/métodos , Análise de Sequência com Séries de Oligonucleotídeos
5.
J Stat Plan Inference ; 166: 52-66, 2015 Nov.
Artigo em Inglês | MEDLINE | ID: mdl-26500388

RESUMO

We develop a modeling framework for joint factor and cluster analysis of datasets where multiple categorical response items are collected on a heterogeneous population of individuals. We introduce a latent factor multinomial probit model and employ prior constructions that allow inference on the number of factors as well as clustering of the subjects into homogenous groups according to their relevant factors. Clustering, in particular, allows us to borrow strength across subjects, therefore helping in the estimation of the model parameters, particularly when the number of observations is small. We employ Markov chain Monte Carlo techniques and obtain tractable posterior inference for our objectives, including sampling of missing data. We demonstrate the effectiveness of our method on simulated data. We also analyze two real-world educational datasets and show that our method outperforms state-of-the-art methods. In the analysis of the real-world data, we uncover hidden relationships between the questions and the underlying educational concepts, while simultaneously partitioning the students into groups of similar educational mastery.

6.
IEEE Trans Neural Netw Learn Syst ; 35(4): 5014-5026, 2024 Apr.
Artigo em Inglês | MEDLINE | ID: mdl-37104113

RESUMO

The first step toward investigating the effectiveness of a treatment via a randomized trial is to split the population into control and treatment groups then compare the average response of the treatment group receiving the treatment to the control group receiving the placebo. To ensure that the difference between the two groups is caused only by the treatment, it is crucial that the control and the treatment groups have similar statistics. Indeed, the validity and reliability of a trial are determined by the similarity of two groups' statistics. Covariate balancing methods increase the similarity between the distributions of the two groups' covariates. However, often in practice, there are not enough samples to accurately estimate the groups' covariate distributions. In this article, we empirically show that covariate balancing with the standardized means difference (SMD) covariate balancing measure, as well as Pocock and Simon's sequential treatment assignment method, are susceptible to worst case treatment assignments. Worst case treatment assignments are those admitted by the covariate balance measure, but result in highest possible ATE estimation errors. We developed an adversarial attack to find adversarial treatment assignment for any given trial. Then, we provide an index to measure how close the given trial is to the worst case. To this end, we provide an optimization-based algorithm, namely adversarial treatment assignment in treatment effect trials (ATASTREET), to find the adversarial treatment assignments.


Assuntos
Redes Neurais de Computação , Projetos de Pesquisa , Reprodutibilidade dos Testes , Ensaios Clínicos Controlados Aleatórios como Assunto , Simulação por Computador
7.
Artigo em Inglês | MEDLINE | ID: mdl-39190515

RESUMO

DeepTensor is a computationally efficient framework for low-rank decomposition of matrices and tensors using deep generative networks. We decompose a tensor as the product of low-rank tensor factors (e.g., a matrix as the outer product of two vectors), where each low-rank tensor is generated by a deep network (DN) that is trained in a self-supervised manner to minimize the mean-square approximation error. Our key observation is that the implicit regularization inherent in DNs enables them to capture nonlinear signal structures (e.g., manifolds) that are out of the reach of classical linear methods like the singular value decomposition (SVD) and principal components analysis (PCA). Furthermore, in contrast to the SVD and PCA, whose performance deteriorates when the tensor's entries deviate from additive white Gaussian noise, we demonstrate that the performance of DeepTensor is robust to a wide range of distributions. We validate that DeepTensor is a robust and computationally efficient drop-in replacement for the SVD, PCA, nonnegative matrix factorization (NMF), and similar decompositions by exploring a range of real-world applications, including hyperspectral image denoising, 3D MRI tomography, and image classification. In particular, DeepTensor offers a 6dB signal-to-noise ratio improvement over standard denoising methods for signal corrupted by Poisson noise and learns to decompose 3D tensors 60 times faster than a single DN equipped with 3D convolutions.

8.
IEEE Trans Pattern Anal Mach Intell ; 44(2): 1098-1107, 2022 02.
Artigo em Inglês | MEDLINE | ID: mdl-33026983

RESUMO

Inferring appropriate information from large datasets has become important. In particular, identifying relationships among variables in these datasets has far-reaching impacts. In this article, we introduce the uniform information coefficient (UIC), which measures the amount of dependence between two multidimensional variables and is able to detect both linear and non-linear associations. Our proposed UIC is inspired by the maximal information coefficient (MIC) [1].; however, the MIC was originally designed to measure dependence between two one-dimensional variables. Unlike the MIC calculation that depends on the type of association between two variables, we show that the UIC calculation is less computationally expensive and more robust to the type of association between two variables. The UIC achieves this by replacing the dynamic programming step in the MIC calculation with a simpler technique based on the uniform partitioning of the data grid. This computational efficiency comes at the cost of not maximizing the information coefficient as done by the MIC algorithm. We present theoretical guarantees for the performance of the UIC and a variety of experiments to demonstrate its quality in detecting associations.


Assuntos
Algoritmos
9.
IEEE Trans Pattern Anal Mach Intell ; 43(7): 2233-2244, 2021 Jul.
Artigo em Inglês | MEDLINE | ID: mdl-33891546

RESUMO

We introduce a novel video-rate hyperspectral imager with high spatial, temporal and spectral resolutions. Our key hypothesis is that spectral profiles of pixels within each super-pixel tend to be similar. Hence, a scene-adaptive spatial sampling of a hyperspectral scene, guided by its super-pixel segmented image, is capable of obtaining high-quality reconstructions. To achieve this, we acquire an RGB image of the scene, compute its super-pixels, from which we generate a spatial mask of locations where we measure high-resolution spectrum. The hyperspectral image is subsequently estimated by fusing the RGB image and the spectral measurements using a learnable guided filtering approach. Due to low computational complexity of the superpixel estimation step, our setup can capture hyperspectral images of the scenes with little overhead over traditional snapshot hyperspectral cameras, but with significantly higher spatial and spectral resolutions. We validate the proposed technique with extensive simulations as well as a lab prototype that measures hyperspectral video at a spatial resolution of 600 ×900 pixels, at a spectral resolution of 10 nm over visible wavebands, and achieving a frame rate at 18fps.

10.
Artigo em Inglês | MEDLINE | ID: mdl-34746376

RESUMO

Ridge-like regularization often leads to improved generalization performance of machine learning models by mitigating overfitting. While ridge-regularized machine learning methods are widely used in many important applications, direct training via optimization could become challenging in huge data scenarios with millions of examples and features. We tackle such challenges by proposing a general approach that achieves ridge-like regularization through implicit techniques named Minipatch Ridge (MPRidge). Our approach is based on taking an ensemble of coefficients of unregularized learners trained on many tiny, random subsamples of both the examples and features of the training data, which we call minipatches. We empirically demonstrate that MPRidge induces an implicit ridge-like regularizing effect and performs nearly the same as explicit ridge regularization for a general class of predictors including logistic regression, SVM, and robust regression. Embarrassingly parallelizable, MPRidge provides a computationally appealing alternative to inducing ridge-like regularization for improving generalization performance in challenging big-data settings.

11.
PLoS One ; 14(3): e0212508, 2019.
Artigo em Inglês | MEDLINE | ID: mdl-30840653

RESUMO

Open Educational Resources (OER) have been lauded for their ability to reduce student costs and improve equity in higher education. Research examining whether OER provides learning benefits have produced mixed results, with most studies showing null effects. We argue that the common methods used to examine OER efficacy are unlikely to detect positive effects based on predictions of the access hypothesis. The access hypothesis states that OER benefits learning by providing access to critical course materials, and therefore predicts that OER should only benefit students who would not otherwise have access to the materials. Through the use of simulation analysis, we demonstrate that even if there is a learning benefit of OER, standard research methods are unlikely to detect it.


Assuntos
Educação a Distância , Aprendizagem , Estudantes , Adolescente , Adulto , Feminino , Humanos , Masculino
12.
IEEE Trans Image Process ; 17(7): 1069-82, 2008 Jul.
Artigo em Inglês | MEDLINE | ID: mdl-18586616

RESUMO

The dual-tree quaternion wavelet transform (QWT) is a new multiscale analysis tool for geometric image features. The QWT is a near shift-invariant tight frame representation whose coefficients sport a magnitude and three phases: two phases encode local image shifts while the third contains image texture information. The QWT is based on an alternative theory for the 2-D Hilbert transform and can be computed using a dual-tree filter bank with linear computational complexity. To demonstrate the properties of the QWT's coherent magnitude/phase representation, we develop an efficient and accurate procedure for estimating the local geometrical structure of an image. We also develop a new multiscale algorithm for estimating the disparity between a pair of images that is promising for image registration and flow estimation applications. The algorithm features multiscale phase unwrapping, linear complexity, and sub-pixel estimation accuracy.


Assuntos
Algoritmos , Inteligência Artificial , Interpretação de Imagem Assistida por Computador/métodos , Imageamento Tridimensional/métodos , Reconhecimento Automatizado de Padrão/métodos , Processamento de Sinais Assistido por Computador , Tomografia de Coerência Óptica/métodos , Aumento da Imagem/métodos , Reprodutibilidade dos Testes , Sensibilidade e Especificidade
13.
IEEE Trans Image Process ; 16(11): 2752-65, 2007 Nov.
Artigo em Inglês | MEDLINE | ID: mdl-17990752

RESUMO

Multi-input single-output deconvolution (MISO-D) aims to extract a deblurred estimate of a target signal from several blurred and noisy observations. This paper develops a new two step framework--Texas Two-Step--to solve MISO-D problems with known blurs. Texas Two-Step first reduces the MISO-D problem to a related single-input single-output deconvolution (SISO-D) problem by invoking the concept of sufficient statistics (SSs) and then solves the simpler SISO-D problem using an appropriate technique. The two-step framework enables new MISO-D techniques (both optimal and suboptimal) based on the rich suite of existing SISO-D techniques. In fact, the properties of SSs imply that a MISO-D algorithm is mean-squared-error optimal if and only if it can be rearranged to conform to the Texas Two-Step framework. Using this insight, we construct new wavelet- and curvelet-based MISO-D algorithms with asymptotically optimal performance. Simulated and real data experiments verify that the framework is indeed effective.


Assuntos
Algoritmos , Artefatos , Inteligência Artificial , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Análise de Regressão , Reprodutibilidade dos Testes , Sensibilidade e Especificidade
14.
Sci Adv ; 3(12): e1701548, 2017 12.
Artigo em Inglês | MEDLINE | ID: mdl-29226243

RESUMO

Modern biology increasingly relies on fluorescence microscopy, which is driving demand for smaller, lighter, and cheaper microscopes. However, traditional microscope architectures suffer from a fundamental trade-off: As lenses become smaller, they must either collect less light or image a smaller field of view. To break this fundamental trade-off between device size and performance, we present a new concept for three-dimensional (3D) fluorescence imaging that replaces lenses with an optimized amplitude mask placed a few hundred micrometers above the sensor and an efficient algorithm that can convert a single frame of captured sensor data into high-resolution 3D images. The result is FlatScope: perhaps the world's tiniest and lightest microscope. FlatScope is a lensless microscope that is scarcely larger than an image sensor (roughly 0.2 g in weight and less than 1 mm thick) and yet able to produce micrometer-resolution, high-frame rate, 3D fluorescence movies covering a total volume of several cubic millimeters. The ability of FlatScope to reconstruct full 3D images from a single frame of captured sensor data allows us to image 3D volumes roughly 40,000 times faster than a laser scanning confocal microscope while providing comparable resolution. We envision that this new flat fluorescence microscopy paradigm will lead to implantable endoscopes that minimize tissue damage, arrays of imagers that cover large areas, and bendable, flexible microscopes that conform to complex topographies.

15.
IEEE Trans Image Process ; 15(5): 1071-87, 2006 May.
Artigo em Inglês | MEDLINE | ID: mdl-16671289

RESUMO

The wavelet transform provides a sparse representation for smooth images, enabling efficient approximation and compression using techniques such as zerotrees. Unfortunately, this sparsity does not extend to piecewise smooth images, where edge discontinuities separating smooth regions persist along smooth contours. This lack of sparsity hampers the efficiency of wavelet-based approximation and compression. On the class of images containing smooth C2 regions separated by edges along smooth C2 contours, for example, the asymptotic rate-distortion (R-D) performance of zerotree-based wavelet coding is limited to D(R) (< or = 1/R, well below the optimal rate of 1/R2. In this paper, we develop a geometric modeling framework for wavelets that addresses this shortcoming. The framework can be interpreted either as 1) an extension to the "zerotree model" for wavelet coefficients that explicitly accounts for edge structure at fine scales, or as 2) a new atomic representation that synthesizes images using a sparse combination of wavelets and wedgeprints--anisotropic atoms that are adapted to edge singularities. Our approach enables a new type of quadtree pruning for piecewise smooth images, using zerotrees in uniformly smooth regions and wedgeprints in regions containing geometry. Using this framework, we develop a prototype image coder that has near-optimal asymptotic R-D performance D(R) < or = (log R)2 /R2 for piecewise smooth C2/C2 images. In addition, we extend the algorithm to compress natural images, exploring the practical problems that arise and attaining promising results in terms of mean-square error and visual quality.


Assuntos
Algoritmos , Compressão de Dados/métodos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Processamento de Sinais Assistido por Computador , Gráficos por Computador , Análise Numérica Assistida por Computador
16.
IEEE Trans Image Process ; 15(6): 1365-78, 2006 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-16764263

RESUMO

We routinely encounter digital color images that were previously compressed using the Joint Photographic Experts Group (JPEG) standard. En route to the image's current representation, the previous JPEG compression's various settings-termed its JPEG compression history (CH)-are often discarded after the JPEG decompression step. Given a JPEG-decompressed color image, this paper aims to estimate its lost JPEG CH. We observe that the previous JPEG compression's quantization step introduces a lattice structure in the discrete cosine transform (DCT) domain. This paper proposes two approaches that exploit this structure to solve the JPEG Compression History Estimation (CHEst) problem. First, we design a statistical dictionary-based CHEst algorithm that tests the various CHs in a dictionary and selects the maximum a posteriori estimate. Second, for cases where the DCT coefficients closely conform to a 3-D parallelepiped lattice, we design a blind lattice-based CHEst algorithm. The blind algorithm exploits the fact that the JPEG CH is encoded in the nearly orthogonal bases for the 3-D lattice and employs novel lattice algorithms and recent results on nearly orthogonal lattice bases to estimate the CH. Both algorithms provide robust JPEG CHEst performance in practice. Simulations demonstrate that JPEG CHEst can be useful in JPEG recompression; the estimated CH allows us to recompress a JPEG-decompressed image with minimal distortion (large signal-to-noise-ratio) and simultaneously achieve a small file-size.


Assuntos
Cor , Colorimetria/métodos , Gráficos por Computador , Compressão de Dados/métodos , Aumento da Imagem/métodos , Interpretação de Imagem Assistida por Computador/métodos , Processamento de Sinais Assistido por Computador , Algoritmos , Redes de Comunicação de Computadores , Simulação por Computador , Interpretação Estatística de Dados , Modelos Estatísticos
18.
Sci Adv ; 2(9): e1600025, 2016 09.
Artigo em Inglês | MEDLINE | ID: mdl-27704040

RESUMO

Early identification of pathogens is essential for limiting development of therapy-resistant pathogens and mitigating infectious disease outbreaks. Most bacterial detection schemes use target-specific probes to differentiate pathogen species, creating time and cost inefficiencies in identifying newly discovered organisms. We present a novel universal microbial diagnostics (UMD) platform to screen for microbial organisms in an infectious sample, using a small number of random DNA probes that are agnostic to the target DNA sequences. Our platform leverages the theory of sparse signal recovery (compressive sensing) to identify the composition of a microbial sample that potentially contains novel or mutant species. We validated the UMD platform in vitro using five random probes to recover 11 pathogenic bacteria. We further demonstrated in silico that UMD can be generalized to screen for common human pathogens in different taxonomy levels. UMD's unorthodox sensing approach opens the door to more efficient and universal molecular diagnostics.


Assuntos
Bactérias/genética , Sondas de DNA/genética , DNA Bacteriano/genética , Infecções/diagnóstico , Bactérias/isolamento & purificação , Bactérias/patogenicidade , DNA Bacteriano/classificação , Humanos , Infecções/genética , Infecções/microbiologia , Reação em Cadeia da Polimerase
19.
IEEE Trans Image Process ; 12(12): 1449-59, 2003.
Artigo em Inglês | MEDLINE | ID: mdl-18244701

RESUMO

We investigate central issues such as invertibility, stability, synchronization, and frequency characteristics for nonlinear wavelet transforms built using the lifting framework. The nonlinearity comes from adaptively choosing between a class of linear predictors within the lifting framework. We also describe how earlier families of nonlinear filter banks can be extended through the use of prediction functions operating on a causal neighborhood of pixels. Preliminary compression results for model and real-world images demonstrate the promise of our techniques.

20.
IEEE Trans Image Process ; 21(2): 494-504, 2012 Feb.
Artigo em Inglês | MEDLINE | ID: mdl-21859622

RESUMO

Compressive sensing (CS) is an emerging approach for the acquisition of signals having a sparse or compressible representation in some basis. While the CS literature has mostly focused on problems involving 1-D signals and 2-D images, many important applications involve multidimensional signals; the construction of sparsifying bases and measurement systems for such signals is complicated by their higher dimensionality. In this paper, we propose the use of Kronecker product matrices in CS for two purposes. First, such matrices can act as sparsifying bases that jointly model the structure present in all of the signal dimensions. Second, such matrices can represent the measurement protocols used in distributed settings. Our formulation enables the derivation of analytical bounds for the sparse approximation of multidimensional signals and CS recovery performance, as well as a means of evaluating novel distributed measurement schemes.

SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA