Results 1 - 9 of 9
1.
JCI Insight ; 7(13)2022 07 08.
Article in English | MEDLINE | ID: mdl-35801589

ABSTRACT

People with HIV (PWH) on antiretroviral therapy (ART) experience elevated rates of neurological impairment, despite controlling for demographic factors and comorbidities, suggesting viral or neuroimmune etiologies for these deficits. Here, we apply multimodal and cross-compartmental single-cell analyses of paired cerebrospinal fluid (CSF) and peripheral blood in PWH and uninfected controls. We demonstrate that a subset of central memory CD4+ T cells in the CSF produced HIV-1 RNA, despite apparent systemic viral suppression, and that HIV-1-infected cells were more frequently found in the CSF than in the blood. Using cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), we show that the cell surface marker CD204 is a reliable marker for rare microglia-like cells in the CSF, which have been implicated in HIV neuropathogenesis, but which we did not find to contain HIV transcripts. Through a feature selection method for supervised deep learning of single-cell transcriptomes, we find that abnormal CD8+ T cell activation, rather than CD4+ T cell abnormalities, predominated in the CSF of PWH compared with controls. Overall, these findings suggest ongoing CNS viral persistence and compartmentalized CNS neuroimmune effects of HIV infection during ART and demonstrate the power of single-cell studies of CSF to better understand the CNS reservoir during HIV infection.


Subject(s)
HIV Infections , HIV-1 , HIV Infections/drug therapy , HIV Infections/pathology , HIV-1/genetics , Humans , Longitudinal Studies , Microglia/pathology , Viral Transcription
2.
Neural Netw ; 152: 34-43, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35500458

ABSTRACT

Modern datasets often contain large subsets of correlated features and nuisance features, which are unrelated or only loosely related to the main underlying structures of the data. Nuisance features can be identified using the Laplacian score criterion, which evaluates the importance of a given feature via its consistency with the graph Laplacian's leading eigenvectors. We demonstrate that in the presence of large numbers of nuisance features, the Laplacian must be computed on the subset of selected features rather than on the complete feature set. To do this, we propose a fully differentiable approach for unsupervised feature selection, utilizing the Laplacian score criterion to avoid the selection of nuisance features. We employ an autoencoder architecture to cope with correlated features, trained to reconstruct the data from the subset of selected features. We build on the recently proposed concrete layer, which allows controlling the number of selected features via architectural design, simplifying the optimization process. Experimenting on several real-world datasets, we demonstrate that our proposed approach outperforms similar approaches designed to avoid only correlated or nuisance features, but not both. Several state-of-the-art clustering results are reported. Our code is publicly available at https://github.com/jsvir/lscae.
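
To make the Laplacian score criterion mentioned in this abstract concrete, the following is a minimal sketch of the classic (non-differentiable) Laplacian score, not the authors' autoencoder-based method. The function name `laplacian_scores`, the Gaussian affinity, and the `sigma` parameter are assumptions chosen for illustration; lower scores indicate features that vary smoothly over the data graph.

```python
import math


def laplacian_scores(X, sigma=1.0):
    """Classic Laplacian score for each feature (lower = smoother on the graph).

    X is a list of samples, each a list of feature values. The dense Gaussian
    affinity and the sigma default are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    # Affinity matrix S computed from ALL features (the setting the abstract
    # warns about when many nuisance features are present).
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sq = sum((X[i][k] - X[j][k]) ** 2 for k in range(d))
                S[i][j] = math.exp(-sq / (2 * sigma ** 2))
    deg = [sum(row) for row in S]  # diagonal of the degree matrix D
    total = sum(deg)
    scores = []
    for r in range(d):
        f = [X[i][r] for i in range(n)]
        # Remove the degree-weighted mean so constant features are not favoured.
        mean = sum(f[i] * deg[i] for i in range(n)) / total
        f = [v - mean for v in f]
        # f^T L f with L = D - S equals 1/2 * sum_ij S_ij (f_i - f_j)^2.
        num = 0.5 * sum(S[i][j] * (f[i] - f[j]) ** 2
                        for i in range(n) for j in range(n))
        den = sum(deg[i] * f[i] ** 2 for i in range(n))  # f^T D f
        scores.append(num / den if den > 0 else float("inf"))
    return scores
```

On data where one feature separates two clusters and another merely alternates between neighbouring samples, the cluster-separating feature should receive the lower score.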


Subject(s)
Cluster Analysis
3.
Arch Pathol Lab Med ; 146(2): 182-193, 2022 01 02.
Article in English | MEDLINE | ID: mdl-34086849

ABSTRACT

CONTEXT: Large cell transformation (LCT) of indolent B-cell lymphomas, such as follicular lymphoma (FL) and chronic lymphocytic leukemia (CLL), signals a worse prognosis, at which point aggressive chemotherapy is initiated. Although LCT is relatively straightforward to diagnose in lymph nodes, a marrow biopsy is often obtained first given its ease of procedure, low cost, and low morbidity. However, consensus criteria for LCT in bone marrow have not been established. OBJECTIVE: To study the accuracy and reproducibility of a trained convolutional neural network in identifying LCT, in light of promising machine learning tools that may introduce greater objectivity to morphologic analysis. DESIGN: We retrospectively identified patients with a diagnosis of FL or CLL who had undergone bone marrow biopsy for the clinical question of LCT. We scored morphologic criteria and correlated results with clinical disease progression. In addition, whole slide scans were annotated into patches to train convolutional neural networks to discriminate between small and large tumor cells and to predict the patient's probability of transformation. RESULTS: Using morphologic examination, the proportion of large lymphoma cells (≥10% in FL and ≥30% in CLL), chromatin pattern, distinct nucleoli, and proliferation index were significantly correlated with LCT in FL and CLL. Compared to pathologist-derived estimates, machine-generated quantification demonstrated better reproducibility and stronger correlation with final outcome data. CONCLUSIONS: These histologic findings may serve as indications of LCT in bone marrow biopsies. The pathologist augmented with the machine system appeared to be the most predictive, arguing for greater efforts to validate and implement these tools to further enhance physician practice.


Subject(s)
Deep Learning , Leukemia, Lymphocytic, Chronic, B-Cell , Lymphoma, Follicular , Biopsy , Bone Marrow/pathology , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/diagnosis , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Lymphoma, Follicular/diagnosis , Lymphoma, Follicular/pathology , Machine Learning , Reproducibility of Results , Retrospective Studies
4.
Chaos ; 31(4): 043118, 2021 Apr.
Article in English | MEDLINE | ID: mdl-34251227

ABSTRACT

A low-dimensional dynamical system is observed in an experiment as a high-dimensional signal, for example, a video of a chaotic pendulum system. Assuming that we know the dynamical model up to some unknown parameters, can we estimate the underlying system's parameters by measuring its time-evolution only once? The key information for performing this estimation lies in the temporal inter-dependencies between the signal and the model. We propose a kernel-based score to compare these dependencies. Our score generalizes a maximum likelihood estimator for a linear model to a general nonlinear setting in an unknown feature space. We estimate the system's underlying parameters by maximizing the proposed score. We demonstrate the accuracy and efficiency of the method using two chaotic dynamical systems: the double pendulum and the Lorenz '63 model.

5.
Proc Natl Acad Sci U S A ; 118(22)2021 06 01.
Article in English | MEDLINE | ID: mdl-34001664

ABSTRACT

Comprehensive and accurate comparisons of transcriptomic distributions of cells from samples taken from two different biological states, such as healthy versus diseased individuals, are an emerging challenge in single-cell RNA sequencing (scRNA-seq) analysis. Current methods for detecting differentially abundant (DA) subpopulations between samples rely heavily on initial clustering of all cells in both samples. Often, this clustering step is inadequate since the DA subpopulations may not align with a clear cluster structure, and important differences between the two biological states can be missed. Here, we introduce DA-seq, a targeted approach for identifying DA subpopulations not restricted to clusters. DA-seq is a multiscale method that quantifies a local DA measure for each cell, which is computed from its k nearest neighboring cells across a range of k values. Based on this measure, DA-seq delineates contiguous significant DA subpopulations in the transcriptomic space. We apply DA-seq to several scRNA-seq datasets and highlight its improved ability to detect differences between distinct phenotypes in severe versus mildly ill COVID-19 patients, melanomas subjected to immune checkpoint therapy comparing responders to nonresponders, embryonic development at two time points, and young versus aging brain tissue. DA-seq enabled us to detect differences between these phenotypes. Importantly, we find that DA-seq not only recovers the DA cell types as discovered in the original studies but also reveals additional DA subpopulations that were not described before. Analysis of these subpopulations yields biological insights that would otherwise be undetected using conventional computational approaches.
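
As an illustration of a local differential-abundance measure computed from the k nearest neighbours over several values of k, the sketch below scores each cell by how strongly its neighbourhood is enriched for one of two conditions. The abstract does not give the formula, so the normalised difference used here, the function name `local_da_scores`, and the brute-force neighbour search are all assumptions, not the DA-seq implementation.

```python
import math


def local_da_scores(points, labels, ks):
    """Return, for each cell, one enrichment score per k in ks.

    points: list of coordinate tuples; labels: 0/1 condition per cell
    (both conditions must be present). Score = (a - b) / (a + b), where a and b
    are the neighbour counts for label 1 and label 0, each divided by that
    label's overall cell count. This formula is an illustrative assumption.
    """
    n = len(points)
    n1 = sum(labels)
    n0 = n - n1
    scores = []
    for i in range(n):
        # Brute-force ranking of all other cells by Euclidean distance.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        row = []
        for k in ks:
            nbrs = order[:k]
            a = sum(labels[j] for j in nbrs) / n1
            b = sum(1 - labels[j] for j in nbrs) / n0
            row.append((a - b) / (a + b))
        scores.append(row)
    return scores
```

Scores near +1 or -1 across all k mark cells sitting in a region dominated by one condition, while scores near 0 mark well-mixed regions; a full pipeline would then group neighbouring high-scoring cells into subpopulations.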


Subject(s)
Aging/genetics , COVID-19/genetics , Cell Lineage/genetics , Melanoma/genetics , RNA, Small Cytoplasmic/genetics , Skin Neoplasms/genetics , Aging/metabolism , B-Lymphocytes/immunology , B-Lymphocytes/virology , Brain/cytology , Brain/metabolism , COVID-19/immunology , COVID-19/pathology , COVID-19/virology , Cell Lineage/immunology , Cytokines/genetics , Cytokines/immunology , Datasets as Topic , Dendritic Cells/immunology , Dendritic Cells/virology , Gene Expression Profiling , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Humans , Melanoma/immunology , Melanoma/pathology , Monocytes/immunology , Monocytes/virology , Phenotype , RNA, Small Cytoplasmic/immunology , SARS-CoV-2/pathogenicity , Severity of Illness Index , Single-Cell Analysis/methods , Skin Neoplasms/immunology , Skin Neoplasms/pathology , T-Lymphocytes/immunology , T-Lymphocytes/virology , Transcriptome
6.
Nucleic Acids Res ; 49(4): e21, 2021 02 26.
Article in English | MEDLINE | ID: mdl-33330933

ABSTRACT

Following antigenic challenge, activated B cells rapidly expand and undergo somatic hypermutation, yielding groups of clonally related B cells with diversified immunoglobulin receptors. Inference of clonal relationships based on the receptor sequence is an essential step in many adaptive immune receptor repertoire sequencing studies. These relationships are typically identified by a multi-step process that involves: (i) grouping sequences based on shared V and J gene assignments and junction lengths, and (ii) clustering these sequences using a junction-based distance. However, this approach is sensitive to the initial gene assignments, which are error-prone, and fails to identify clonal relatives whose junction length has changed through accumulation of indels. By defining a translation-invariant feature space in which we cluster the sequences, we develop an alignment-free clonal identification method that does not require gene assignments and is not restricted to a fixed junction length. This alignment-free approach has higher sensitivity compared to a typical junction-based distance method without loss of specificity and positive predictive value (PPV). While the alignment-free procedure identifies clones that are broadly consistent with the junction-based distance method, it also identifies clones with characteristics (multiple V or J gene assignments or junction lengths) that are not detectable with the junction-based distance method.
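
One simple way to obtain position-independent sequence features is to count overlapping k-mers, which lets sequences of different lengths be compared without alignment. The abstract does not state which features the authors use, so the k-mer profile, the L1 distance, and the names `kmer_profile` and `profile_distance` below are assumptions for illustration only.

```python
from collections import Counter


def kmer_profile(seq, k=3):
    """Count overlapping k-mers in seq; positional information is discarded."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def profile_distance(p, q):
    """L1 distance between two k-mer count profiles (missing k-mers count as 0)."""
    return sum(abs(p[m] - q[m]) for m in set(p) | set(q))


# Toy sequences (hypothetical, not real junctions): b is a with one inserted base.
a = kmer_profile("AAACCCGGG")
b = kmer_profile("AAACCCTGGG")
c = kmer_profile("TTTTATATAT")
closer_to_indel_variant = profile_distance(a, b) < profile_distance(a, c)
```

Because the profiles do not depend on sequence length, a single-base insertion changes only the few k-mers that overlap it, whereas a junction-based distance defined on equal-length sequences could not compare the two at all.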


Subject(s)
Genes, Immunoglobulin , Sequence Analysis, DNA/methods , Clone Cells , VDJ Exons
7.
Proc Natl Acad Sci U S A ; 117(49): 30918-30927, 2020 12 08.
Article in English | MEDLINE | ID: mdl-33229581

ABSTRACT

We propose a local conformal autoencoder (LOCA) for standardized data coordinates. LOCA is a deep learning-based method for obtaining standardized data coordinates from scientific measurements. Data observations are modeled as samples from an unknown, nonlinear deformation of an underlying Riemannian manifold, which is parametrized by a few normalized, latent variables. We assume a repeated measurement sampling strategy, common in scientific measurements, and present a method for learning an embedding in [Formula: see text] that is isometric to the latent variables of the manifold. The coordinates recovered by our method are invariant to diffeomorphisms of the manifold, making it possible to match between different instrumental observations of the same phenomenon. Our embedding is obtained using LOCA, which is an algorithm that learns to rectify deformations by using a local z-scoring procedure, while preserving relevant geometric information. We demonstrate the isometric embedding properties of LOCA in various model settings and observe that it exhibits promising interpolation and extrapolation capabilities, superior to the current state of the art. Finally, we demonstrate LOCA's efficacy in single-site Wi-Fi localization data and for the reconstruction of three-dimensional curved surfaces from two-dimensional projections.


Subject(s)
Algorithms , Data Analysis , Reference Standards
8.
Data Min Knowl Discov ; 34(6): 1676-1712, 2020.
Article in English | MEDLINE | ID: mdl-32837252

ABSTRACT

Kernel methods play a critical role in many machine learning algorithms. They are useful in manifold learning, classification, clustering and other data analysis tasks. Setting the kernel's scale parameter, also referred to as the kernel's bandwidth, strongly affects performance on the task at hand. We propose to set a scale parameter that is tailored to one of two types of tasks: classification and manifold learning. For manifold learning, we seek a scale which is best at capturing the manifold's intrinsic dimension. For classification, we propose three methods for estimating the scale, which optimize the classification results in different senses. The proposed frameworks are simulated on artificial and on real datasets. The results show a high correlation between optimal classification rates and the estimated scales. Finally, we demonstrate the approach on a seismic event classification task.
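
To show what the scale parameter controls, here is a small sketch of a Gaussian kernel together with the median-distance heuristic, a common generic default for the bandwidth. The heuristic is not one of the task-tailored estimators proposed in this paper; it and the function names are assumptions included only for illustration.

```python
import math
from itertools import combinations
from statistics import median


def gaussian_kernel(x, y, scale):
    """k(x, y) = exp(-||x - y||^2 / (2 * scale^2)); scale is the bandwidth."""
    return math.exp(-math.dist(x, y) ** 2 / (2 * scale ** 2))


def median_heuristic(points):
    """Median pairwise Euclidean distance, a generic default bandwidth."""
    return median(math.dist(p, q) for p, q in combinations(points, 2))
```

A scale much smaller than typical pairwise distances drives all off-diagonal kernel values toward zero, and a scale much larger drives them toward one; in both extremes the kernel carries little information about the data's structure.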

9.
Article in English | MEDLINE | ID: mdl-34504892

ABSTRACT

Word2vec, introduced by Mikolov et al., is a word embedding method that is widely used in natural language processing. Despite its success and frequent use, a strong theoretical justification is still lacking. The main contribution of our paper is to propose a rigorous analysis of the highly nonlinear functional of word2vec. Our results suggest that word2vec may be primarily driven by an underlying spectral method. This insight may open the door to obtaining provable guarantees for word2vec. We support these findings by numerical simulations. One fascinating open question is whether the nonlinear properties of word2vec that are not captured by the spectral method are beneficial and, if so, by what mechanism.
