Results 1 - 20 of 148
1.
Nature ; 626(7998): 367-376, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38092041

ABSTRACT

Implantation of the human embryo begins a critical developmental stage that comprises profound events including axis formation, gastrulation and the emergence of the haematopoietic system [1,2]. Our mechanistic knowledge of this window of human life remains limited due to restricted access to in vivo samples for both technical and ethical reasons [3-5]. Stem cell models of the human embryo have emerged to help unlock the mysteries of this stage [6-16]. Here we present a genetically inducible stem cell-derived embryoid model of early post-implantation human embryogenesis that captures the reciprocal codevelopment of embryonic tissue and the extra-embryonic endoderm and mesoderm niche with early haematopoiesis. This model is produced from induced pluripotent stem cells and shows unanticipated self-organizing cellular programmes similar to those that occur in embryogenesis, including the formation of amniotic cavity and bilaminar disc morphologies as well as the generation of an anterior hypoblast pole and posterior domain. The extra-embryonic layer in these embryoids lacks trophoblast and shows advanced multilineage yolk sac tissue-like morphogenesis that harbours a process similar to distinct waves of haematopoiesis, including the emergence of erythroid-, megakaryocyte-, myeloid- and lymphoid-like cells. This model presents an easy-to-use, high-throughput, reproducible and scalable platform to probe multifaceted aspects of human development and blood formation at the early post-implantation stage. It will provide a tractable human-based model for drug testing and disease modelling.


Subject(s)
Embryonic Development , Germ Layers , Hematopoiesis , Yolk Sac , Humans , Embryo Implantation , Endoderm/cytology , Endoderm/embryology , Germ Layers/cytology , Germ Layers/embryology , Yolk Sac/cytology , Yolk Sac/embryology , Mesoderm/cytology , Mesoderm/embryology , Induced Pluripotent Stem Cells/cytology , Amnion/cytology , Amnion/embryology , Embryoid Bodies/cytology , Cell Lineage , Developmental Biology/methods , Developmental Biology/trends
2.
Nat Rev Genet ; 23(6): 355-368, 2022 06.
Article in English | MEDLINE | ID: mdl-35102309

ABSTRACT

Methods for profiling genes at the single-cell level have revolutionized our ability to study several biological processes and systems including development, differentiation, response programmes and disease progression. In many of these studies, cells are profiled over time in order to infer dynamic changes in cell states and types, sets of expressed genes, active pathways and key regulators. However, time-series single-cell RNA sequencing (scRNA-seq) also raises several new analysis and modelling issues. These issues include determining when and how deeply to profile cells, linking cells within and between time points, learning continuous trajectories, and integrating bulk and single-cell data to reconstruct models of dynamic networks. In this Review, we discuss several approaches for the analysis and modelling of time-series scRNA-seq, highlighting their steps, key assumptions, and the types of data and biological questions they are most appropriate for.


Subject(s)
Single-Cell Analysis , Transcriptome , Cell Differentiation/genetics , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods
3.
Genome Res ; 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38951026

ABSTRACT

mRNA-based vaccines and therapeutics are gaining popularity and usage across a wide range of conditions. One of the critical issues when designing such mRNAs is sequence optimization. Even small proteins or peptides can be encoded by an enormously large number of mRNAs. The actual mRNA sequence can have a large impact on several properties including expression, stability, immunogenicity, and more. To enable the selection of an optimal sequence, we developed CodonBERT, a large language model (LLM) for mRNAs. Unlike prior models, CodonBERT uses codons as inputs which enables it to learn better representations. CodonBERT was trained using more than 10 million mRNA sequences from a diverse set of organisms. The resulting model captures important biological concepts. CodonBERT can also be extended to perform prediction tasks for various mRNA properties. CodonBERT outperforms previous mRNA prediction methods including on a new flu vaccine dataset.
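
The claim that even a short peptide can be encoded by a very large number of mRNAs can be checked with simple arithmetic, and codon-level input can be illustrated in the same sketch. The code below is only an illustration, not CodonBERT's implementation: the function names are hypothetical, the tokenizer assumes non-overlapping in-frame triplets, and the degeneracy table is the standard genetic code.

```python
# Number of codons that encode each amino acid in the standard genetic code.
CODONS_PER_AMINO_ACID = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def tokenize_codons(mrna):
    """Split an in-frame coding sequence into non-overlapping codon tokens.

    Hypothetical helper: assumes the sequence starts in frame and has a
    length that is a multiple of three.
    """
    if len(mrna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [mrna[i:i + 3] for i in range(0, len(mrna), 3)]


def count_synonymous_mrnas(peptide):
    """Count the coding sequences that translate to the given peptide.

    The count is the product of per-residue codon degeneracies
    (stop codons are not included).
    """
    total = 1
    for amino_acid in peptide:
        total *= CODONS_PER_AMINO_ACID[amino_acid]
    return total


if __name__ == "__main__":
    print(tokenize_codons("AUGGCCUAA"))
    # A 10-residue peptide of leucines alone has 6**10 possible encodings.
    print(count_synonymous_mrnas("L" * 10))
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why exhaustive enumeration of candidate sequences is impractical.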

4.
Nat Methods ; 20(8): 1237-1243, 2023 08.
Article in English | MEDLINE | ID: mdl-37429992

ABSTRACT

Spatial transcriptomics promises to greatly improve our understanding of tissue organization and cell-cell interactions. While most current platforms for spatial transcriptomics only offer multi-cellular resolution, with 10-15 cells per spot, recent technologies provide a much denser spot placement leading to subcellular resolution. A key challenge for these newer methods is cell segmentation and the assignment of spots to cells. Traditional image-based segmentation methods are limited and do not make full use of the information profiled by spatial transcriptomics. Here we present subcellular spatial transcriptomics cell segmentation (SCS), which combines imaging data with sequencing data to improve cell segmentation accuracy. SCS assigns spots to cells by adaptively learning the position of each spot relative to the center of its cell using a transformer neural network. SCS was tested on two new subcellular spatial transcriptomics technologies and outperformed traditional image-based segmentation methods. SCS achieved better accuracy, identified more cells and provided more realistic cell size estimation. Subcellular analysis of RNAs using SCS spot assignments provides information on RNA localization and further supports the segmentation results.


Subject(s)
Gene Expression Profiling , Transcriptome , Cell Communication , Cell Size , Learning
5.
Genome Res ; 2022 Jun 28.
Article in English | MEDLINE | ID: mdl-35764397

ABSTRACT

One of the first steps in the analysis of single-cell RNA sequencing (scRNA-seq) data is the assignment of cell types. Although a number of supervised methods have been developed for this, in most cases such assignment is performed by first clustering cells in low-dimensional space and then assigning cell types to different clusters. To overcome noise and to improve cell type assignments, we developed UNIFAN, a neural network method that simultaneously clusters and annotates cells using known gene sets. UNIFAN combines both low-dimensional representation for all genes and cell-specific gene set activity scores to determine the clustering. We applied UNIFAN to human and mouse scRNA-seq data sets from several different organs. We show, by using knowledge about gene sets, that UNIFAN greatly outperforms prior methods developed for clustering scRNA-seq data. The gene sets assigned by UNIFAN to different clusters provide strong evidence for the cell type that is represented by this cluster, making annotations easier.

6.
Nat Methods ; 19(10): 1306-1319, 2022 10.
Article in English | MEDLINE | ID: mdl-36064772

ABSTRACT

Hematopoietic humanized (hu) mice are powerful tools for modeling the action of the human immune system and are widely used for preclinical studies and drug discovery. However, generating a functional human T cell compartment in hu mice remains challenging, primarily due to the species-related differences between human and mouse thymus. While engrafting human fetal thymic tissues can support robust T cell development in hu mice, tissue scarcity and ethical concerns limit their wide use. Here, we describe the tissue engineering of human thymus organoids from inducible pluripotent stem cells (iPSC-thymus) that can support the de novo generation of a diverse population of functional human T cells. T cells of iPSC-thymus-engrafted hu mice could mediate both cellular and humoral immune responses, including mounting robust proinflammatory responses on T cell receptor engagement, inhibiting allogeneic tumor graft growth and facilitating efficient Ig class switching. Our findings indicate that hu mice engrafted with iPSC-thymus can serve as a new animal model to study human T cell-mediated immunity and accelerate the translation of findings from animal studies into the clinic.


Subject(s)
Hematopoietic Stem Cell Transplantation , Induced Pluripotent Stem Cells , Animals , Animal Disease Models , Humans , Mice , SCID Mice , Organoids , T-Lymphocytes , Thymus Gland
7.
Bioinformatics ; 40(Supplement_1): i151-i159, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940139

ABSTRACT

MOTIVATION: Analysis of time series transcriptomics data from clinical trials is challenging. Such studies usually profile very few time points from several individuals with varying response patterns and dynamics. Current methods for these datasets are mainly based on linear, global orderings using visit times which do not account for the varying response rates and subgroups within a patient cohort. RESULTS: We developed a new method that utilizes multi-commodity flow algorithms for trajectory inference in large scale clinical studies. Recovered trajectories satisfy individual-based timing restrictions while integrating data from multiple patients. Testing the method on multiple drug datasets demonstrated an improved performance compared to prior approaches suggested for this task, while identifying novel disease subtypes that correspond to heterogeneous patient response patterns. AVAILABILITY AND IMPLEMENTATION: The source code and instructions to download the data have been deposited on GitHub at https://github.com/euxhenh/Truffle.


Subject(s)
Algorithms , Transcriptome , Humans , Transcriptome/genetics , Gene Expression Profiling/methods , Software
8.
Bioinformatics ; 40(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38810107

ABSTRACT

MOTIVATION: Lipid nanoparticles (LNPs) are the most widely used vehicles for mRNA vaccine delivery. The structure of the lipids composing the LNPs can have a major impact on the effectiveness of the mRNA payload. Several properties should be optimized to improve delivery and expression including biodegradability, synthetic accessibility, and transfection efficiency. RESULTS: To optimize LNPs, we developed and tested models that enable the virtual screening of LNPs with high transfection efficiency. Our best method uses the lipid Simplified Molecular-Input Line-Entry System (SMILES) as inputs to a large language model. Large language model-generated embeddings are then used by a downstream gradient-boosting classifier. As we show, our method can more accurately predict lipid properties, which could lead to higher efficiency and reduced experimental time and costs. AVAILABILITY AND IMPLEMENTATION: Code and data links available at: https://github.com/Sanofi-Public/LipoBART.


Subject(s)
Lipids , Nanoparticles , Transfection , Nanoparticles/chemistry , Lipids/chemistry , Transfection/methods , Messenger RNA/metabolism , Liposomes
9.
Nucleic Acids Res ; 51(7): e38, 2023 04 24.
Article in English | MEDLINE | ID: mdl-36762475

ABSTRACT

Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data, which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods, leading to a more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas-scale single cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.


Subject(s)
Computational Biology , Algorithms , Gene Regulatory Networks , Systems Biology , Single-Cell Analysis , Atlases as Topic
10.
Bioinformatics ; 39(Suppl 1): i140-i148, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387167

ABSTRACT

MOTIVATION: Spatial proteomics data have been used to map cell states and improve our understanding of tissue organization. More recently, these methods have been extended to study the impact of such organization on disease progression and patient survival. However, to date, the majority of supervised learning methods utilizing these data types did not take full advantage of the spatial information, impacting their performance and utilization. RESULTS: Taking inspiration from ecology and epidemiology, we developed novel spatial feature extraction methods for use with spatial proteomics data. We used these features to learn prediction models for cancer patient survival. As we show, using the spatial features led to consistent improvement over prior methods that used the spatial proteomics data for the same task. In addition, feature importance analysis revealed new insights about the cell interactions that contribute to patient survival. AVAILABILITY AND IMPLEMENTATION: The code for this work can be found at gitlab.com/enable-medicine-public/spatsurv.


Subject(s)
Neoplasms , Proteomics , Humans , Neoplasms/diagnostic imaging , Cell Communication , Disease Progression , Survival Analysis
11.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33876191

ABSTRACT

Time-course gene-expression data have been widely used to infer regulatory and signaling relationships between genes. Most of the widely used methods for such analysis were developed for bulk expression data. Single cell RNA-Seq (scRNA-Seq) data offer several advantages including the large number of expression profiles available and the ability to focus on individual cells rather than averages. However, the data also raise new computational challenges. Using a novel encoding for scRNA-Seq expression data, we develop deep learning methods for interaction prediction from time-course data. Our methods use a supervised framework which represents the data as a 3D tensor and trains convolutional and recurrent neural networks for predicting interactions. We tested our time-course deep learning (TDL) models on five different time-series scRNA-Seq datasets. As we show, TDL can accurately identify causal and regulatory gene-gene interactions and can also be used to assign new functions to genes. TDL improves on prior methods for the above tasks and can be generally applied to new time-series scRNA-Seq data.


Subject(s)
Algorithms , Computational Biology/methods , Deep Learning , Gene Expression Profiling/methods , RNA-Seq/methods , Single-Cell Analysis/methods , Animals , Cultured Cells , Genetic Epistasis , Gene Regulatory Networks/genetics , Humans , Mice , Genetic Models , Time Factors
12.
Bioinformatics ; 38(4): 997-1004, 2022 01 27.
Article in English | MEDLINE | ID: mdl-34623423

ABSTRACT

MOTIVATION: Recent advancements in fluorescence in situ hybridization (FISH) techniques enable them to concurrently obtain information on the location and gene expression of single cells. A key question in the initial analysis of such spatial transcriptomics data is the assignment of cell types. To date, most studies used methods that only rely on the expression levels of the genes in each cell for such assignments. To fully utilize the data and to improve the ability to identify novel sub-types, we developed a new method, FICT, which combines both expression and neighborhood information when assigning cell types. RESULTS: FICT optimizes a probabilistic function that we formalize and for which we provide learning and inference algorithms. We used FICT to analyze both simulated and several real spatial transcriptomics datasets. As we show, FICT can accurately identify cell types and sub-types, improving on expression-only methods and other methods proposed for clustering spatial transcriptomics data. Some of the spatial sub-types identified by FICT provide novel hypotheses about new functions for excitatory and inhibitory neurons. AVAILABILITY AND IMPLEMENTATION: FICT is available at: https://github.com/haotianteng/FICT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling , Transcriptome , Fluorescence In Situ Hybridization , Gene Expression Profiling/methods , Algorithms , Cluster Analysis
13.
PLoS Comput Biol ; 18(9): e1010468, 2022 09.
Article in English | MEDLINE | ID: mdl-36095011

ABSTRACT

Studies comparing single cell RNA-Seq (scRNA-Seq) data between conditions mainly focus on differences in the proportion of cell types or on differentially expressed genes. In many cases these differences are driven by changes in cell interactions which are challenging to infer without spatial information. To determine cell-cell interactions that differ between conditions we developed the Cell Interaction Network Inference (CINS) pipeline. CINS combines Bayesian network analysis with regression-based modeling to identify differential cell type interactions and the proteins that underlie them. We tested CINS on a disease case control and on an aging mouse dataset. In both cases CINS correctly identifies cell type interactions and the ligands involved in these interactions improving on prior methods suggested for cell interaction predictions. We performed additional mouse aging scRNA-Seq experiments which further support the interactions identified by CINS.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Animals , Bayes Theorem , Cell Communication , Gene Expression Profiling/methods , Ligands , Mice , RNA Sequence Analysis/methods , Single-Cell Analysis/methods
14.
Bioinformatics ; 37(7): 968-975, 2021 05 17.
Article in English | MEDLINE | ID: mdl-32886099

ABSTRACT

MOTIVATION: Recent technological advances enable the profiling of spatial single-cell expression data. Such data present a unique opportunity to study cell-cell interactions and the signaling genes that mediate them. However, most current methods for the analysis of these data focus on unsupervised descriptive modeling, making it hard to identify key signaling genes and quantitatively assess their impact. RESULTS: We developed a Mixture of Experts for Spatial Signaling genes Identification (MESSI) method to identify active signaling genes within and between cells. The mixture of experts strategy enables MESSI to subdivide cells into subtypes. MESSI relies on multi-task learning using information from neighboring cells to improve the prediction of response genes within a cell. Applying the method to three spatial single-cell expression datasets, we show that MESSI accurately predicts the levels of response genes, improving upon prior methods and providing useful biological insights about key signaling genes and subtypes of excitatory neuron cells. AVAILABILITY AND IMPLEMENTATION: MESSI is available at: https://github.com/doraadong/MESSI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Signal Transduction , Software , Single-Cell Analysis
15.
Bioinformatics ; 37(11): 1535-1543, 2021 Jul 12.
Article in English | MEDLINE | ID: mdl-30768159

ABSTRACT

MOTIVATION: Intra-tumor heterogeneity is one of the key confounding factors in deciphering tumor evolution. Malignant cells exhibit variations in their gene expression, copy numbers and mutations even when originating from a single progenitor cell. Single cell sequencing of tumor cells has recently emerged as a viable option for unmasking the underlying tumor heterogeneity. However, extracting features from single cell genomic data in order to infer their evolutionary trajectory remains computationally challenging due to the extremely noisy and sparse nature of the data. RESULTS: Here we describe 'Dhaka', a variational autoencoder method which transforms single cell genomic data to a reduced dimension feature space that is more efficient in differentiating between (hidden) tumor subpopulations. Our method is general and can be applied to several different types of genomic data including copy number variation from scDNA-Seq and gene expression from scRNA-Seq experiments. We tested the method on synthetic and six single cell cancer datasets where the number of cells ranges from 250 to 6000 for each sample. Analysis of the resulting feature space revealed subpopulations of cells and their marker genes. The features are also able to infer the lineage and/or differentiation trajectory between cells, greatly improving upon prior methods suggested for feature extraction and dimensionality reduction of such data. AVAILABILITY AND IMPLEMENTATION: All the datasets used in the paper are publicly available, and the developed software package and supporting information are available on GitHub at https://github.com/MicrosoftGenomics/Dhaka. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

16.
J Biomed Inform ; 128: 104031, 2022 04.
Article in English | MEDLINE | ID: mdl-35183765

ABSTRACT

Preterm birth affects more than 10% of all births worldwide. Such infants are much more prone to Growth Faltering (GF), an issue that has been unsolved despite the implementation of numerous interventions aimed at optimizing preterm infant nutrition. To improve the ability for early prediction of GF risk for preterm infants we collected a comprehensive, large, and unique clinical and microbiome dataset from 3 different sites in the US and the UK. We use and extend machine learning methods for GF prediction from clinical data. We next extend graphical models to integrate time series clinical and microbiome data. A model that integrates clinical and microbiome data improves on the ability to predict GF when compared to models using clinical data only. Information on a small subset of the taxa is enough to help improve model accuracy and to predict interventions that can improve outcome. We show that a hierarchical classifier that only uses a subset of the taxa for a subset of the infants is both the most accurate and cost-effective method for GF prediction. Further analysis of the best classifiers enables the prediction of interventions that can improve outcome.


Subject(s)
Microbiota , Premature Birth , Humans , Infant , Newborn Infant , Premature Infant , Machine Learning
17.
Proc Natl Acad Sci U S A ; 116(52): 27151-27158, 2019 Dec 26.
Article in English | MEDLINE | ID: mdl-31822622

ABSTRACT

Several methods were developed to mine gene-gene relationships from expression data. Examples include correlation and mutual information methods for coexpression analysis, clustering and undirected graphical models for functional assignments, and directed graphical models for pathway reconstruction. Using an encoding for gene expression data, followed by deep neural networks analysis, we present a framework that can successfully address all of these diverse tasks. We show that our method, convolutional neural network for coexpression (CNNC), improves upon prior methods in tasks ranging from predicting transcription factor targets to identifying disease-related genes to causality inference. CNNC's encoding provides insights about some of the decisions it makes and their biological basis. CNNC is flexible and can easily be extended to integrate additional types of genomics data, leading to further improvements in its performance.
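
The abstract does not spell out the encoding it refers to. As a rough illustration of how a gene pair's expression could be turned into an image-like input for a convolutional network, the sketch below assumes a normalized joint 2D histogram over cells. The function name, the bin count, and the log10(x + 1) transform are all assumptions made for this example, not details taken from the abstract.

```python
import math


def joint_histogram(x, y, bins=8):
    """Encode one gene pair as a bins x bins matrix of joint frequencies.

    x and y hold the expression of two genes across the same cells.
    Entry [i][j] is the fraction of cells whose (x, y) values fall into
    bin i for the first gene and bin j for the second gene.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and equally long")

    # Log-transform with a pseudocount (an assumption of this sketch).
    log_x = [math.log10(v + 1.0) for v in x]
    log_y = [math.log10(v + 1.0) for v in y]

    def bin_index(value, low, high):
        # Map a value to an equal-width bin; the maximum goes to the last bin.
        if high == low:
            return 0
        return min(int((value - low) / (high - low) * bins), bins - 1)

    low_x, high_x = min(log_x), max(log_x)
    low_y, high_y = min(log_y), max(log_y)

    grid = [[0.0] * bins for _ in range(bins)]
    weight = 1.0 / len(log_x)
    for a, b in zip(log_x, log_y):
        grid[bin_index(a, low_x, high_x)][bin_index(b, low_y, high_y)] += weight
    return grid


if __name__ == "__main__":
    for row in joint_histogram([0, 0, 9, 9], [0, 9, 0, 9], bins=2):
        print(row)
```

A fixed-size matrix of this kind is independent of the number of cells profiled, which is one reason a histogram-style encoding is convenient as network input.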

18.
Proc Natl Acad Sci U S A ; 116(24): 11770-11775, 2019 06 11.
Article in English | MEDLINE | ID: mdl-31127043

ABSTRACT

The mechanisms of bacterial chemotaxis have been extensively studied for several decades, but how the physical environment influences the collective migration of bacterial cells remains less understood. Previous models of bacterial chemotaxis have suggested that the movement of migrating bacteria across obstacle-laden terrains may be slower compared with terrains without them. Here, we show experimentally that the size or density of evenly spaced obstacles does not alter the average exit rate of Escherichia coli cells from microchambers in response to external attractants, a function that is dependent on intact cell-cell communication. We also show, both by analyzing a revised theoretical model and by experimentally following single cells, that the reduced exit time in the presence of obstacles is a consequence of reduced tumbling frequency that is adjusted by the E. coli cells in response to the topology of their environment. These findings imply operational short-term memory of bacteria while moving through complex environments in response to chemotactic stimuli and motivate improved algorithms for self-autonomous robotic swarms.


Subject(s)
Chemotaxis/physiology , Escherichia coli/physiology , Cell Communication/physiology , Movement/physiology
19.
Genome Res ; 2018 Jan 09.
Article in English | MEDLINE | ID: mdl-29317474

ABSTRACT

Generating detailed and accurate organogenesis models using single-cell RNA-seq data remains a major challenge. Current methods have relied primarily on the assumption that descendant cells are similar to their parents in terms of gene expression levels. This assumption does not always hold for in vivo studies, which often include infrequently sampled, unsynchronized, and diverse cell populations. Thus, additional information may be needed to determine the correct ordering and branching of progenitor cells and the set of transcription factors (TFs) that are active during advancing stages of organogenesis. To enable such modeling, we have developed a method that learns a probabilistic model that integrates expression similarity with regulatory information to reconstruct the dynamic developmental cell trajectories. When applied to mouse lung developmental data, the method accurately distinguished different cell types and lineages. Existing and new experimental data validated the ability of the method to identify key regulators of cell fate.

20.
PLoS Comput Biol ; 16(10): e1007939, 2020 10.
Article in English | MEDLINE | ID: mdl-33108369

ABSTRACT

Several studies profile similar single cell RNA-Seq (scRNA-Seq) data using different technologies and platforms. A number of alignment methods have been developed to enable the integration and comparison of scRNA-Seq data from such studies. While each performs well on some of the datasets, to date no method has been able to both perform the alignment using the original expression space and generalize to new data. To enable such analysis we developed Single Cell Iterative Point set Registration (SCIPR), which extends methods that were successfully applied to align image data to scRNA-Seq. We discuss the required changes, the resulting optimization function, and algorithms for learning a transformation function for aligning data. We tested SCIPR on several scRNA-Seq datasets. As we show, it successfully aligns data from several different cell types, improving upon prior methods proposed for this task. In addition, we show that the parameters learned by SCIPR can be used to align data not used in training and to identify key cell type-specific genes.
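
The general idea of iterative point set registration can be sketched in a few lines: alternate between matching each source point to its nearest target point and updating a transformation that reduces the distance between the matched pairs. The example below is a deliberately simplified illustration, not SCIPR itself: it learns only a translation, uses plain nearest-neighbour matching, and all function names are hypothetical.

```python
import math


def nearest(point, targets):
    """Return the target point closest to `point` (Euclidean distance)."""
    return min(targets, key=lambda t: math.dist(point, t))


def align_by_translation(source, target, iterations=10):
    """Iteratively shift `source` points toward `target` points.

    Each iteration (1) matches every source point to its nearest target
    and (2) moves all source points by the mean offset of those matches.
    Returns the moved points and the accumulated shift, which can then be
    applied to points that were not used to learn it.
    """
    dims = len(source[0])
    shift = [0.0] * dims
    moved = [tuple(p) for p in source]
    for _ in range(iterations):
        matches = [nearest(p, target) for p in moved]
        step = [
            sum(m[d] - p[d] for p, m in zip(moved, matches)) / len(moved)
            for d in range(dims)
        ]
        shift = [s + st for s, st in zip(shift, step)]
        moved = [tuple(c + st for c, st in zip(p, step)) for p in moved]
    return moved, shift


if __name__ == "__main__":
    target = [(0.0, 0.0), (2.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
    source = [(x + 0.3, y - 0.2) for x, y in target]
    moved, shift = align_by_translation(source, target)
    print(shift)
```

Because the learned shift is an explicit function of the input coordinates, it can be reused on new data, which mirrors the abstract's point about generalizing beyond the training set.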


Subject(s)
Gene Expression Profiling/methods , Small Cytoplasmic RNA/genetics , Sequence Alignment/methods , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Algorithms , Cell Line , Computational Biology , Humans