Results 1 - 20 of 2,229
1.
Cell; 185(1): 184-203.e19, 2022 Jan 06.
Article in English | MEDLINE | ID: mdl-34963056

ABSTRACT

Cancers display significant heterogeneity with respect to tissue of origin, driver mutations, and other features of the surrounding tissue. It is likely that individual tumors engage common patterns of the immune system, here termed "archetypes", creating prototypical non-destructive tumor immune microenvironments (TMEs) and modulating tumor-targeting. To discover the dominant immune system archetypes, the University of California, San Francisco (UCSF) Immunoprofiler Initiative (IPI) processed 364 individual tumors across 12 cancer types using standardized protocols. Computational clustering of flow cytometry and transcriptomic data obtained from cell sub-compartments uncovered dominant patterns of immune composition across cancers. These archetypes were profound insofar as they also differentiated tumors based upon unique immune and tumor gene-expression patterns. They also partitioned well-established classifications of tumor biology. The IPI resource provides a template for understanding cancer immunity as a collection of dominant patterns of immune organization and provides a rational path forward to learn how to modulate these to improve therapy.


Subject(s)
Censuses; Neoplasms/genetics; Neoplasms/immunology; Transcriptome/genetics; Tumor Microenvironment/immunology; Biomarkers, Tumor; Cluster Analysis; Cohort Studies; Computational Biology/methods; Flow Cytometry/methods; Gene Expression Regulation, Neoplastic; Humans; Neoplasms/classification; Neoplasms/pathology; RNA-Seq/methods; San Francisco; Universities
2.
Mol Cell; 78(1): 96-111.e6, 2020 Apr 02.
Article in English | MEDLINE | ID: mdl-32105612

ABSTRACT

Current models suggest that chromosome domains segregate into either an active (A) or inactive (B) compartment. B-compartment chromatin is physically separated from the A compartment and compacted by the nuclear lamina. To examine these models in the developmental context of C. elegans embryogenesis, we undertook chromosome tracing to map the trajectories of entire autosomes. Early embryonic chromosomes organized into an unconventional barbell-like configuration, with two densely folded B compartments separated by a central A compartment. Upon gastrulation, this conformation matured into conventional A/B compartments. We used unsupervised clustering to uncover subpopulations with differing folding properties and variable positioning of compartment boundaries. These conformations relied on tethering to the lamina to stretch the chromosome; detachment from the lamina compacted, and allowed intermingling between, A/B compartments. These findings reveal the diverse conformations of early embryonic chromosomes and uncover a previously unappreciated role for the lamina in systemic chromosome stretching.


Subject(s)
Caenorhabditis elegans/genetics; Chromosomes/chemistry; Nuclear Lamina/physiology; Animals; Caenorhabditis elegans/embryology; Chromosomes/ultrastructure; Embryo, Nonmammalian/ultrastructure; Gastrulation/genetics; In Situ Hybridization, Fluorescence; Molecular Conformation
3.
Proc Natl Acad Sci U S A; 121(33): e2403771121, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39110730

ABSTRACT

Complex systems are typically characterized by intricate internal dynamics that are often hard to elucidate. Ideally, this requires methods that can detect and classify, in an unsupervised way, the microscopic dynamical events occurring in the system. However, decoupling statistically relevant fluctuations from the internal noise is most often nontrivial. Here, we describe "Onion Clustering": a simple, iterative unsupervised clustering method that efficiently detects and classifies statistically relevant fluctuations in noisy time-series data. We demonstrate its efficiency by analyzing simulation and experimental trajectories of various systems with complex internal dynamics, ranging from the atomic to the microscopic scale, in and out of equilibrium. The method is based on an iterative detect-classify-archive approach. Much as peeling the external (evident) layer of an onion reveals the hidden internal ones, the method first detects and classifies the most populated dynamical environment in the system and its characteristic noise. The signal of this dynamical cluster is then removed from the time-series data, and the remaining part, cleared of that noise, is analyzed again. At every iteration, the detection of hidden dynamical subdomains is facilitated by an increasing (and adaptive) relevance-to-noise ratio. The process iterates until no new dynamical domains can be uncovered, revealing, as an output, the number of clusters that can be effectively distinguished and classified in a statistically robust way as a function of the time resolution of the analysis. Onion Clustering is general and benefits from clear-cut physical interpretability. We expect that it will help in analyzing a variety of complex dynamical systems and time-series data.
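
The detect-classify-archive loop described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `peel_clusters`, the parameters `window` and `n_sigma`, and the use of the median and the median absolute deviation (MAD) as stand-ins for "the most populated environment and its characteristic noise" are all assumptions made for illustration. The median only tracks the dominant state while that state holds most of the remaining data.

```python
import statistics


def peel_clusters(series, window=10, n_sigma=2.5, max_iter=20):
    """Toy detect-classify-archive loop on a 1D time series.

    Returns (labels, centers): one label per non-overlapping window
    (-1 means never classified) and the center of each detected cluster.
    """
    # Cut the series into non-overlapping windows (the time resolution).
    windows = [series[i:i + window]
               for i in range(0, len(series) - window + 1, window)]
    labels = [-1] * len(windows)
    remaining = list(range(len(windows)))
    centers = []

    for _ in range(max_iter):
        if not remaining:
            break
        values = [x for w in remaining for x in windows[w]]
        # Detect: median and MAD approximate the dominant state and its
        # noise (an assumption of this sketch, not the published method).
        center = statistics.median(values)
        mad = statistics.median(abs(x - center) for x in values)
        tol = n_sigma * 1.4826 * mad
        # Classify: a window joins the cluster only if all of its points
        # stay inside the noise band around the center.
        hits = [w for w in remaining
                if all(abs(x - center) <= tol for x in windows[w])]
        if not hits:
            break  # nothing new can be resolved at this window size
        # Archive: label those windows and remove them from the data.
        for w in hits:
            labels[w] = len(centers)
        centers.append(center)
        hit_set = set(hits)
        remaining = [w for w in remaining if w not in hit_set]
    return labels, centers
```

Calling `peel_clusters` with different `window` values gives a rough sense of how the number of resolvable clusters depends on the time resolution, which is the output the abstract emphasizes.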

4.
Proc Natl Acad Sci U S A; 121(37): e2319804121, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39226356

ABSTRACT

The rapid growth of large-scale spatial gene expression data demands efficient and reliable computational tools to extract major trends of gene expression in their native spatial context. Here, we used stability-driven unsupervised learning (i.e., staNMF) to identify principal patterns (PPs) of 3D gene expression profiles and understand spatial gene distribution and anatomical localization at the whole mouse brain level. Our subsequent spatial correlation analysis systematically compared the PPs to known anatomical regions and ontology from the Allen Mouse Brain Atlas using spatial neighborhoods. We demonstrate that our stable and spatially coherent PPs, whose linear combinations accurately approximate the spatial gene data, are highly correlated with combinations of expert-annotated brain regions. These PPs yield a brain ontology based purely on spatial gene expression. Our PP identification approach outperforms principal component analysis and typical clustering algorithms on the same task. Moreover, we show that the stable PPs reveal marked regional imbalance of brainwide genetic architecture, leading to region-specific marker genes and gene coexpression networks. Our findings highlight the advantages of stability-driven machine learning for plausible biological discovery from dense spatial gene expression data, streamlining tasks that are infeasible by conventional manual approaches.


Subject(s)
Brain; Animals; Mice; Brain/metabolism; Gene Expression Profiling/methods; Transcriptome; Algorithms; Unsupervised Machine Learning; Gene Ontology; Atlases as Topic; Gene Regulatory Networks; Principal Component Analysis
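
Stability-driven pattern discovery, as described in the abstract of this record, needs some score for how much the learned patterns change between repeated runs. The following is a minimal sketch of one such score, assuming an Amari-type dissimilarity computed on the cross-correlation matrix of two pattern sets. The function names and the use of Pearson correlation are illustrative assumptions; neither is taken from the abstract or from the staNMF software.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0


def dictionary_dissimilarity(d1, d2):
    """Amari-type dissimilarity between two sets of K patterns.

    Close to 0 when each pattern in one set has a perfectly correlated
    partner in the other, regardless of ordering or scaling.
    """
    k = len(d1)
    c = [[pearson(p, q) for q in d2] for p in d1]
    row_best = sum(max(row) for row in c)
    col_best = sum(max(c[i][j] for i in range(k)) for j in range(k))
    return (2 * k - row_best - col_best) / (2 * k)


def instability(runs):
    """Mean pairwise dissimilarity over pattern sets from repeated runs."""
    pairs = [(a, b) for i, a in enumerate(runs) for b in runs[i + 1:]]
    return sum(dictionary_dissimilarity(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions, one would fit the factorization several times for each candidate number of patterns and prefer the number with the lowest `instability`.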
5.
Proc Natl Acad Sci U S A; 121(37): e2400002121, 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39226348

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) data, susceptible to noise arising from biological variability and technical errors, can distort gene expression analysis and impact cell similarity assessments, particularly in heterogeneous populations. Current methods, including deep learning approaches, often struggle to accurately characterize cell relationships due to this inherent noise. To address these challenges, we introduce scAMF (Single-cell Analysis via Manifold Fitting), a framework designed to enhance clustering accuracy and data visualization in scRNA-seq studies. At the heart of scAMF lies the manifold fitting module, which effectively denoises scRNA-seq data by unfolding their distribution in the ambient space. This unfolding aligns the gene expression vector of each cell more closely with its underlying structure, bringing it spatially closer to other cells of the same cell type. To comprehensively assess the impact of scAMF, we compile a collection of 25 publicly available scRNA-seq datasets spanning various sequencing platforms, species, and organ types, forming an extensive RNA data bank. In our comparative studies, benchmarking scAMF against existing scRNA-seq analysis algorithms in this data bank, we consistently observe that scAMF outperforms in terms of clustering efficiency and data visualization clarity. Further experimental analysis reveals that this enhanced performance stems from scAMF's ability to improve the spatial distribution of the data and capture class-consistent neighborhoods. These findings underscore the promising application potential of manifold fitting as a tool in scRNA-seq analysis, signaling a significant enhancement in the precision and reliability of data interpretation in this critical field of study.


Subject(s)
Single-Cell Analysis; Single-Cell Analysis/methods; Cluster Analysis; Humans; Sequence Analysis, RNA/methods; Animals; Algorithms; RNA/genetics; Gene Expression Profiling/methods; RNA-Seq/methods
6.
J Cell Sci; 137(20), 2024 Oct 15.
Article in English | MEDLINE | ID: mdl-38738282

ABSTRACT

Advances in imaging, segmentation and tracking have led to the routine generation of large and complex microscopy datasets. New tools are required to process this 'phenomics' type data. Here, we present 'Cell PLasticity Analysis Tool' (cellPLATO), a Python-based analysis software designed for measurement and classification of cell behaviours based on clustering features of cell morphology and motility. Used after segmentation and tracking, the tool extracts features from each cell per timepoint, using them to segregate cells into dimensionally reduced behavioural subtypes. Resultant cell tracks describe a 'behavioural ID' at each timepoint, and similarity analysis allows the grouping of behavioural sequences into discrete trajectories with assigned IDs. Here, we use cellPLATO to investigate the role of IL-15 in modulating human natural killer (NK) cell migration on ICAM-1 or VCAM-1. We find eight behavioural subsets of NK cells based on their shape and migration dynamics between single timepoints, and four trajectories based on sequences of these behaviours over time. Therefore, by using cellPLATO, we show that IL-15 increases plasticity between cell migration behaviours and that different integrin ligands induce different forms of NK cell migration.


Subject(s)
Cell Movement; Interleukin-15; Killer Cells, Natural; Humans; Killer Cells, Natural/cytology; Killer Cells, Natural/metabolism; Killer Cells, Natural/immunology; Interleukin-15/metabolism; Software; Intercellular Adhesion Molecule-1/metabolism; Vascular Cell Adhesion Molecule-1/metabolism
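
The abstract of this record describes grouping per-timepoint behavioural IDs into trajectories by similarity analysis. A minimal sketch of that idea follows, assuming edit distance as the similarity measure and a simple greedy grouping rule. Both choices, and the names `edit_distance`, `group_tracks` and `max_dist`, are assumptions for illustration and do not describe the cellPLATO API.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of behaviour IDs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute
        prev = cur
    return prev[-1]


def group_tracks(tracks, max_dist=1):
    """Greedy grouping: each track joins the first group whose
    representative (its first member) is within max_dist edits,
    otherwise it starts a new group. Returns one group ID per track.
    """
    reps, ids = [], []
    for track in tracks:
        for g, rep in enumerate(reps):
            if edit_distance(track, rep) <= max_dist:
                ids.append(g)
                break
        else:
            reps.append(track)
            ids.append(len(reps) - 1)
    return ids
```

Greedy grouping depends on track order and is used here only to keep the example short; a real analysis would more likely feed the full distance matrix to a clustering algorithm.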
7.
Brief Bioinform; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38975893

ABSTRACT

The process of drug discovery is widely known to be lengthy and resource-intensive. Artificial Intelligence approaches bring hope for accelerating the identification of molecules with the necessary properties for drug development. Drug-likeness assessment is crucial for the virtual screening of candidate drugs. However, traditional methods like Quantitative Estimation of Drug-likeness (QED) struggle to distinguish between drug and non-drug molecules accurately. Additionally, some deep learning-based binary classification models rely heavily on the selection of negative training sets. To address these challenges, we introduce DrugMetric, a novel unsupervised learning framework for quantitatively assessing drug-likeness based on chemical space distance. DrugMetric blends the powerful learning ability of variational autoencoders with the discriminative ability of the Gaussian Mixture Model. This synergy enables DrugMetric to identify significant differences in drug-likeness across different datasets effectively. Moreover, DrugMetric incorporates principles of ensemble learning to enhance its predictive capabilities. Upon testing over a variety of tasks and datasets, DrugMetric consistently showcases superior scoring and classification performance. It excels in quantifying drug-likeness and accurately distinguishing candidate drugs from non-drugs, surpassing traditional methods including QED. This work highlights DrugMetric as a practical tool for drug-likeness scoring, facilitating the acceleration of virtual drug screening, and has potential applications in other biochemical fields.


Subject(s)
Drug Discovery; Drug Discovery/methods; Pharmaceutical Preparations/chemistry; Pharmaceutical Preparations/classification; Algorithms; Deep Learning; Artificial Intelligence
8.
Brief Bioinform; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38483256

ABSTRACT

Numerous imaging techniques are available for observing and interrogating biological samples, and several of them can be used consecutively to enable correlative analysis of different image modalities with varying resolutions and the inclusion of structural or molecular information. Achieving accurate registration of multimodal images is essential for the correlative analysis process, but it remains a challenging computer vision task with no widely accepted solution. Moreover, supervised registration methods require annotated data produced by experts, which is limited. To address this challenge, we propose a general unsupervised pipeline for multimodal image registration using deep learning. We provide a comprehensive evaluation of the proposed pipeline versus the current state-of-the-art image registration and style transfer methods on four types of biological problems utilizing different microscopy modalities. We found that style transfer of modality domains paired with fully unsupervised training leads to comparable image registration accuracy to supervised methods and, most importantly, does not require human intervention.


Subject(s)
Deep Learning; Humans; Microscopy
9.
Brief Bioinform; 25(4), 2024 May 23.
Article in English | MEDLINE | ID: mdl-38819253

ABSTRACT

Spatially resolved transcriptomics (SRT) has emerged as a powerful tool for investigating gene expression in spatial contexts, providing insights into the molecular mechanisms underlying organ development and disease pathology. However, the expression sparsity poses a computational challenge to integrate other modalities (e.g. histological images and spatial locations) that are simultaneously captured in SRT datasets for spatial clustering and variation analyses. In this study, to meet such a challenge, we propose multi-modal domain adaption for spatial transcriptomics (stMDA), a novel multi-modal unsupervised domain adaptation method, which integrates gene expression and other modalities to reveal the spatial functional landscape. Specifically, stMDA first learns the modality-specific representations from spatial multi-modal data using multiple neural network architectures and then aligns the spatial distributions across modal representations to integrate these multi-modal representations, thus facilitating the integration of global and spatially local information and improving the consistency of clustering assignments. Our results demonstrate that stMDA outperforms existing methods in identifying spatial domains across diverse platforms and species. Furthermore, stMDA excels in identifying spatially variable genes with high prognostic potential in cancer tissues. In conclusion, stMDA as a new tool of multi-modal data integration provides a powerful and flexible framework for analyzing SRT datasets, thereby advancing our understanding of intricate biological systems.


Subject(s)
Gene Expression Profiling; Transcriptome; Humans; Gene Expression Profiling/methods; Cluster Analysis; Computational Biology/methods; Neural Networks, Computer; Neoplasms/genetics; Algorithms
10.
Brief Bioinform; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38349057

ABSTRACT

Efficient and accurate recognition of protein-DNA interactions is vital for understanding the molecular mechanisms of related biological processes and further guiding drug discovery. Although the current experimental protocols are the most precise way to determine protein-DNA binding sites, they tend to be labor-intensive and time-consuming. There is an immediate need to design efficient computational approaches for predicting DNA-binding sites. Here, we proposed ULDNA, a new deep-learning model, to deduce DNA-binding sites from protein sequences. This model leverages an LSTM-attention architecture, embedded with three unsupervised language models that are pre-trained on large-scale sequences from multiple database sources. To prove its effectiveness, ULDNA was tested on 229 protein chains with experimental annotation of DNA-binding sites. Results from computational experiments revealed that ULDNA significantly improves the accuracy of DNA-binding site prediction in comparison with 17 state-of-the-art methods. In-depth data analyses showed that the major strength of ULDNA stems from employing three transformer language models. Specifically, these language models capture complementary feature embeddings with evolution diversity, in which the complex DNA-binding patterns are buried. Meanwhile, the specially crafted LSTM-attention network effectively decodes evolution diversity-based embeddings as DNA-binding results at the residue level. Our findings demonstrated a new pipeline for predicting DNA-binding sites on a large scale with high accuracy from protein sequence alone.


Subject(s)
Data Analysis; Language; Binding Sites; Amino Acid Sequence; Databases, Factual