Search | BVS CLAP/SMR-OPS/OMS

A scalable and unbiased discordance metric with $H_{+}$.

Dyjack, Nathan; Baker, Daniel N; Braverman, Vladimir; Langmead, Ben; Hicks, Stephanie C.

Biostatistics ; 2022 Sep 05.

Article in English | MEDLINE | ID: mdl-36063544

ABSTRACT

A standard unsupervised analysis is to cluster observations into discrete groups using a dissimilarity measure, such as Euclidean distance. If there does not exist a ground-truth label for each observation necessary for external validity metrics, then internal validity metrics, such as the tightness or separation of the clusters, are often used. However, the interpretation of these internal metrics can be problematic when using different dissimilarity measures as they have different magnitudes and ranges of values that they span. To address this problem, previous work introduced the "scale-agnostic" $G_{+}$ discordance metric; however, this internal metric is slow to calculate for large data. Furthermore, in the setting of unsupervised clustering with $k$ groups, we show that $G_{+}$ varies as a function of the proportion of observations assigned to each of the groups (or clusters), referred to as the group balance, which is an undesirable property. To address this problem, we propose a modification of $G_{+}$, referred to as $H_{+}$, and demonstrate that $H_{+}$ does not vary as a function of group balance using a simulation study and with public single-cell RNA-sequencing data. Finally, we provide scalable approaches to estimate $H_{+}$, which are available in the $\mathtt{fasthplus}$ R package.
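The abstract does not spell out the two metrics, so the following is only a minimal sketch under an assumed pairwise definition: let $s$ be the number of (within-cluster, between-cluster) distance pairs in which the within-cluster distance is the larger one; $G_{+}$ divides $s$ by the number of pairs among all $N_d$ distances, while $H_{+}$ divides it by the number of within-cluster distances times the number of between-cluster distances. The function names are hypothetical, the brute-force loop is quadratic in the number of distances, and nothing here uses or reproduces the $\mathtt{fasthplus}$ package.

```python
from itertools import combinations
import math


def split_distances(points, labels):
    """Return (within, between) lists of Euclidean distances for all point pairs."""
    within, between = [], []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        (within if labels[i] == labels[j] else between).append(d)
    return within, between


def discordance(points, labels):
    """Brute-force G+ and H+ under the assumed definitions stated in the lead-in."""
    within, between = split_distances(points, labels)
    if not within or not between:
        raise ValueError("need at least one within- and one between-cluster pair")
    # s: discordant pairs, i.e. a within-cluster distance exceeding a between-cluster one
    s = sum(1 for w in within for b in between if w > b)
    n_d = len(within) + len(between)
    g_plus = s / (n_d * (n_d - 1) / 2)         # normalised by all pairs of distances
    h_plus = s / (len(within) * len(between))  # normalised by within x between pairs
    return g_plus, h_plus


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(discordance(pts, [0, 0, 1, 1]))  # well-separated labelling
    print(discordance(pts, [0, 1, 0, 1]))  # labelling that cuts across the geometry
```

Under these assumed formulas, the $G_{+}$ denominator also counts within-within and between-between pairs, so the largest value it can reach changes with the group sizes; the $H_{+}$ denominator counts only the pairs that can be discordant.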

Streaming Quantiles Algorithms with Small Space and Update Time.

Ivkin, Nikita; Liberty, Edo; Lang, Kevin; Karnin, Zohar; Braverman, Vladimir.

Sensors (Basel) ; 22(24), 2022 Dec 08.

Article in English | MEDLINE | ID: mdl-36559998

ABSTRACT

Approximating quantiles and distributions over streaming data has been studied for roughly two decades now. Recently, Karnin, Lang, and Liberty proposed the first asymptotically optimal algorithm for doing so. This manuscript complements their theoretical result by providing practical variants of their algorithm with improved constants. For a given sketch size, our techniques provably reduce the upper bound on the sketch error by a factor of two. These improvements are verified experimentally. Our modified quantile sketch also improves latency by reducing the worst-case update time from $O(1/\varepsilon)$ down to $O(\log(1/\varepsilon))$.
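To make the idea of a streaming quantile sketch concrete, here is a simplified compactor-based sketch in Python. It is an illustration only: every level uses the same buffer capacity `k`, which is an assumption made for brevity and is neither the capacity schedule of Karnin, Lang, and Liberty nor any of the variants described in this abstract. The class and method names are hypothetical.

```python
import random


class SimpleQuantileSketch:
    """Stack of buffers; an item stored at level h stands for 2**h stream items."""

    def __init__(self, k=64, seed=0):
        if k < 2:
            raise ValueError("k must be at least 2")
        self.k = k
        self.levels = [[]]
        self.rng = random.Random(seed)

    def update(self, x):
        self.levels[0].append(x)
        h = 0
        # Compact any full buffer; the promoted items may fill the next level in turn.
        while len(self.levels[h]) >= self.k:
            self._compact(h)
            h += 1

    def _compact(self, h):
        buf = sorted(self.levels[h])
        # Keep one item behind when the count is odd so total weight is conserved.
        leftover = [buf.pop()] if len(buf) % 2 else []
        offset = self.rng.randint(0, 1)  # keep even or odd positions at random
        if h + 1 == len(self.levels):
            self.levels.append([])
        self.levels[h + 1].extend(buf[offset::2])
        self.levels[h] = leftover

    def total_weight(self):
        return sum(len(level) * 2 ** h for h, level in enumerate(self.levels))

    def rank(self, x):
        """Estimated number of stream items that are <= x."""
        return sum(2 ** h for h, level in enumerate(self.levels)
                   for y in level if y <= x)

    def quantile(self, q):
        """Estimated q-quantile for 0 <= q <= 1."""
        items = sorted((x, 2 ** h) for h, level in enumerate(self.levels)
                       for x in level)
        if not items:
            raise ValueError("sketch is empty")
        target = q * self.total_weight()
        running = 0
        for x, w in items:
            running += w
            if running >= target:
                return x
        return items[-1][0]


if __name__ == "__main__":
    data = list(range(10_000))
    random.Random(1).shuffle(data)
    sketch = SimpleQuantileSketch(k=128)
    for value in data:
        sketch.update(value)
    print(sketch.quantile(0.5), sum(len(level) for level in sketch.levels))
```

Each compaction halves a sorted buffer and doubles the weight of the survivors, so the stored weights always sum to the stream length while the number of retained items grows only logarithmically with it.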

Subject(s)

Algorithms

A phase Ib/IIa, open-label, multiple ascending-dose trial of domagrozumab in fukutin-related protein limb-girdle muscular dystrophy.

Leung, Doris G; Bocchieri, Alex E; Ahlawat, Shivani; Jacobs, Michael A; Parekh, Vishwa S; Braverman, Vladimir; Summerton, Katherine; Mansour, Jennifer; Stinson, Nikia; Bibat, Genila; Morris, Carl; Marraffino, Shannon; Wagner, Kathryn R.

Muscle Nerve ; 64(2): 172-179, 2021 08.

Article in English | MEDLINE | ID: mdl-33961310

ABSTRACT

INTRODUCTION/AIMS: In this study we report the results of a phase Ib/IIa, open-label, multiple ascending-dose trial of domagrozumab, a myostatin inhibitor, in patients with fukutin-related protein (FKRP)-associated limb-girdle muscular dystrophy. METHODS: Nineteen patients were enrolled and assigned to one of three dosing arms (5, 20, or 40 mg/kg every 4 weeks). After 32 weeks of treatment, participants receiving the lowest dose were switched to the highest dose (40 mg/kg) for an additional 32 weeks. An extension study was also conducted. The primary endpoints were safety and tolerability. Secondary endpoints included muscle strength, timed function testing, pulmonary function, lean body mass, pharmacokinetics, and pharmacodynamics. As an exploratory outcome, muscle fat fractions were derived from whole-body magnetic resonance images. RESULTS: Serum concentrations of domagrozumab increased in a dose-dependent manner and modest levels of myostatin inhibition were observed in both serum and muscle tissue. The most frequently occurring adverse events were injuries secondary to falls. There were no significant between-group differences in the strength, functional, or imaging outcomes studied. DISCUSSION: We conclude that, although domagrozumab was safe in patients with limb-girdle muscular dystrophy type 2I/R9, there was no clear evidence supporting its efficacy in improving muscle strength or function.

Subject(s)

Antibodies, Monoclonal, Humanized/therapeutic use , Muscle Strength/drug effects , Muscular Dystrophies, Limb-Girdle/drug therapy , Adult , Body Composition/drug effects , Female , Humans , Male , Middle Aged , Muscle, Skeletal/drug effects , Muscle, Skeletal/physiopathology , Muscular Dystrophies, Limb-Girdle/physiopathology , Pentosyltransferases/metabolism , Young Adult

Longitudinal functional and imaging outcome measures in FKRP limb-girdle muscular dystrophy.

Leung, Doris G; Bocchieri, Alex E; Ahlawat, Shivani; Jacobs, Michael A; Parekh, Vishwa S; Braverman, Vladimir; Summerton, Katherine; Mansour, Jennifer; Bibat, Genila; Morris, Carl; Marraffino, Shannon; Wagner, Kathryn R.

BMC Neurol ; 20(1): 196, 2020 May 19.

Article in English | MEDLINE | ID: mdl-32429923

ABSTRACT

BACKGROUND: Pathogenic variants in the FKRP gene cause impaired glycosylation of α-dystroglycan in muscle, producing a limb-girdle muscular dystrophy with cardiomyopathy. Despite advances in understanding the pathophysiology of FKRP-associated myopathies, clinical research in the limb-girdle muscular dystrophies has been limited by the lack of normative biomarker data to gauge disease progression. METHODS: Participants in a phase 2 clinical trial were evaluated over a 4-month, untreated lead-in period to evaluate repeatability and to obtain normative data for timed function tests, strength tests, pulmonary function, and body composition using DEXA and whole-body MRI. Novel deep learning algorithms were used to analyze MRI scans and quantify muscle, fat, and intramuscular fat infiltration in the thighs. T-tests and signed rank tests were used to assess changes in these outcome measures. RESULTS: Nineteen participants were observed during the lead-in period for this trial. No significant changes were noted in the strength, pulmonary function, or body composition outcome measures over the 4-month observation period. One timed function measure, the 4-stair climb, showed a statistically significant difference over the observation period. Quantitative estimates of muscle, fat, and intramuscular fat infiltration from whole-body MRI corresponded significantly with DEXA estimates of body composition, strength, and timed function measures. CONCLUSIONS: We describe normative data and repeatability performance for multiple physical function measures in an adult FKRP muscular dystrophy population. Our analysis indicates that deep learning algorithms can be used to quantify healthy and dystrophic muscle seen on whole-body imaging. TRIAL REGISTRATION: This study was retrospectively registered in clinicaltrials.gov (NCT02841267) on July 22, 2016 and data supporting this study has been submitted to this registry.

Subject(s)

Muscular Dystrophies, Limb-Girdle/physiopathology , Pentosyltransferases/genetics , Adult , Aged , Dystroglycans/metabolism , Female , Glycosylation , Humans , Magnetic Resonance Imaging , Male , Middle Aged , Muscle, Skeletal/pathology , Muscular Dystrophies, Limb-Girdle/genetics , Outcome Assessment, Health Care , Young Adult

Automatic Active Lesion Tracking in Multiple Sclerosis Using Unsupervised Machine Learning.

Uwaeze, Jason; Narayana, Ponnada A; Kamali, Arash; Braverman, Vladimir; Jacobs, Michael A; Akhbardeh, Alireza.

Diagnostics (Basel) ; 14(6), 2024 Mar 16.

Article in English | MEDLINE | ID: mdl-38535052

ABSTRACT

BACKGROUND: Identifying active lesions in magnetic resonance imaging (MRI) is crucial for the diagnosis and treatment planning of multiple sclerosis (MS). Active lesions on MRI are identified following the administration of Gadolinium-based contrast agents (GBCAs). However, recent studies have reported that repeated administration of GBCA results in the accumulation of Gd in tissues. In addition, GBCA administration increases health care costs. Thus, reducing or eliminating GBCA administration for active lesion detection is important for improved patient safety and reduced healthcare costs. Current state-of-the-art methods for identifying active lesions in brain MRI without GBCA administration utilize data-intensive deep learning methods. OBJECTIVE: To implement nonlinear dimensionality reduction (NLDR) methods, locally linear embedding (LLE) and isometric feature mapping (Isomap), which are less data-intensive, for automatically identifying active lesions on brain MRI in MS patients, without the administration of contrast agents. MATERIALS AND METHODS: Fluid-attenuated inversion recovery (FLAIR), T2-weighted, proton density-weighted, and pre- and post-contrast T1-weighted images were included in the multiparametric MRI dataset used in this study. Subtracted pre- and post-contrast T1-weighted images were labeled by experts as active lesions (ground truth). Unsupervised methods, LLE and Isomap, were used to reconstruct multiparametric brain MR images into a single embedded image. Active lesions were identified on the embedded images and compared with ground truth lesions. The performance of NLDR methods was evaluated by calculating the Dice similarity (DS) index between the observed and identified active lesions in embedded images. RESULTS: LLE and Isomap were applied to 40 MS patients, achieving median DS scores of 0.74 ± 0.1 and 0.78 ± 0.09, respectively, outperforming current state-of-the-art methods.
CONCLUSIONS: NLDR methods, Isomap and LLE, are viable options for the identification of active MS lesions on non-contrast images, and potentially could be used as a clinical decision tool.
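The Dice similarity index used as the evaluation measure above has a standard definition, $DS = 2|A \cap B| / (|A| + |B|)$ for two binary masks $A$ and $B$. The short sketch below computes it for flattened masks; the function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the study.

```python
def dice_similarity(mask_a, mask_b):
    """Dice index for two equally long flat binary masks (truthy = lesion voxel)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    a = {i for i, v in enumerate(mask_a) if v}
    b = {i for i, v in enumerate(mask_b) if v}
    if not a and not b:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    ground_truth = [1, 1, 0, 0]
    prediction = [1, 0, 1, 0]
    print(dice_similarity(ground_truth, prediction))
```

A value of 1 means the identified lesion voxels coincide exactly with the labeled ones, and 0 means they do not overlap at all.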

Data-Independent Structured Pruning of Neural Networks via Coresets.

Mussay, Ben; Feldman, Dan; Zhou, Samson; Braverman, Vladimir; Osadchy, Margarita.

IEEE Trans Neural Netw Learn Syst ; 33(12): 7829-7841, 2022 12.

Article in English | MEDLINE | ID: mdl-34166205

ABSTRACT

Model compression is crucial for the deployment of neural networks on devices with limited computational and memory resources. Many different methods show comparable accuracy of the compressed model and similar compression rates. However, the majority of the compression methods are based on heuristics and offer no worst-case guarantees on the tradeoff between the compression rate and the approximation error for an arbitrary new sample. We propose the first efficient structured pruning algorithm with a provable tradeoff between its compression rate and the approximation error for any future test sample. Our method is based on the coreset framework, and it approximates the output of a layer of neurons/filters by a coreset of neurons/filters in the previous layer and discards the rest. We apply this framework in a layer-by-layer fashion from the bottom to the top. Unlike previous works, our coreset is data-independent, meaning that it provably guarantees the accuracy of the function for any input, including an adversarial one.

Subject(s)

Data Compression , Neural Networks, Computer , Algorithms , Neurons

Fast and memory-efficient scRNA-seq k-means clustering with various distances.

Baker, Daniel N; Dyjack, Nathan; Braverman, Vladimir; Hicks, Stephanie C; Langmead, Ben.

ACM BCB ; 2021, 2021 Aug.

Article in English | MEDLINE | ID: mdl-34778889

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) analyses typically begin by clustering a gene-by-cell expression matrix to empirically define groups of cells with similar expression profiles. We describe new methods and a new open source library, minicore, for efficient k-means++ center finding and k-means clustering of scRNA-seq data. Minicore works with sparse count data, as it emerges from typical scRNA-seq experiments, as well as with dense data from after dimensionality reduction. Minicore's novel vectorized weighted reservoir sampling algorithm allows it to find initial k-means++ centers for a 4-million cell dataset in 1.5 minutes using 20 threads. Minicore can cluster using Euclidean distance, but also supports a wider class of measures like Jensen-Shannon Divergence, Kullback-Leibler Divergence, and the Bhattacharyya distance, which can be directly applied to count data and probability distributions. Further, minicore produces lower-cost centerings more efficiently than scikit-learn for scRNA-seq datasets with millions of cells. With careful handling of priors, minicore implements these distance measures with only minor (<2-fold) speed differences among all distances. We show that a minicore pipeline consisting of k-means++, localsearch++ and mini-batch k-means can cluster a 4-million cell dataset in minutes, using less than 10 GiB of RAM. This memory-efficiency enables atlas-scale clustering on laptops and other commodity hardware. Finally, we report findings on which distance measures give clusterings that are most consistent with known cell type labels.
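For readers unfamiliar with k-means++ center finding, the following is a plain-Python sketch of the textbook seeding rule with a pluggable dissimilarity. It is not minicore's API or its vectorized weighted reservoir sampling; the function names are hypothetical, and the Jensen-Shannon helper simply skips zero entries rather than applying the priors mentioned in the abstract.

```python
import math
import random


def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (natural log) between two probability vectors."""
    def kl(u, v):
        return sum(ui * math.log(ui / vi) for ui, vi in zip(u, v) if ui > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def kmeanspp_centers(points, k, dist=sq_euclidean, seed=0):
    """Pick up to k initial centers: each new one is sampled proportionally to cost."""
    rng = random.Random(seed)
    centers = [points[rng.randrange(len(points))]]
    # costs[i] is the dissimilarity from points[i] to its closest chosen center
    costs = [dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(costs) == 0:
            break  # every point coincides with an existing center
        idx = rng.choices(range(len(points)), weights=costs, k=1)[0]
        centers.append(points[idx])
        costs = [min(c, dist(p, points[idx])) for c, p in zip(costs, points)]
    return centers


if __name__ == "__main__":
    pts = [(0, 0)] * 5 + [(100, 0)] * 5 + [(0, 100)] * 5
    print(kmeanspp_centers(pts, 3))
    print(jensen_shannon([1.0, 0.0], [0.0, 1.0]))
```

Passing `dist=jensen_shannon` with rows normalized to sum to one swaps the dissimilarity without changing the seeding loop, which is the kind of flexibility the abstract describes at a much larger scale.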

Near Optimal Linear Algebra in the Online and Sliding Window Models.

Braverman, Vladimir; Drineas, Petros; Musco, Cameron; Musco, Christopher; Upadhyay, Jalaj; Woodruff, David P; Zhou, Samson.

Proc Annu Symp Found Comput Sci ; 1: 517-528, 2020.

Article in English | MEDLINE | ID: mdl-34421392

ABSTRACT

We initiate the study of numerical linear algebra in the sliding window model, where only the most recent W updates in a stream form the underlying data set. Although many existing algorithms in the sliding window model use or borrow elements from the smooth histogram framework (Braverman and Ostrovsky, FOCS 2007), we show that many interesting linear-algebraic problems, including spectral and vector-induced matrix norms, generalized regression, and low-rank approximation, are not amenable to this approach in the row-arrival model. To overcome this challenge, we first introduce a unified row-sampling based framework that gives randomized algorithms for spectral approximation, low-rank approximation/projection-cost preservation, and $\ell_1$-subspace embeddings in the sliding window model, which often use nearly optimal space and achieve nearly input sparsity runtime. Our algorithms are based on "reverse online" versions of offline sampling distributions such as (ridge) leverage scores, $\ell_1$ sensitivities, and Lewis weights to quantify both the importance and the recency of a row; our structural results on these distributions may be of independent interest for future algorithmic design. Although our techniques initially address numerical linear algebra in the sliding window model, our row-sampling framework rather surprisingly implies connections to the well-studied online model; our structural results also give the first sample optimal (up to lower order terms) online algorithm for low-rank approximation/projection-cost preservation. Using this powerful primitive, we give online algorithms for column/row subset selection and principal component analysis that resolve the main open question of Bhaskara et al. (FOCS 2019). We also give the first online algorithm for $\ell_1$-subspace embeddings.
We further formalize the connection between the online model and the sliding window model by introducing an additional unified framework for deterministic algorithms using a merge and reduce paradigm and the concept of online coresets, which we define as a weighted subset of rows of the input matrix that can be used to compute a good approximation to some given function on all of its prefixes. Our sampling based algorithms in the row-arrival online model yield online coresets, giving deterministic algorithms for spectral approximation, low-rank approximation/projection-cost preservation, and $\ell_1$-subspace embeddings in the sliding window model that use nearly optimal space.
