Results 1 - 20 of 150
1.
Bioinformatics ; 2024 Aug 17.
Article in English | MEDLINE | ID: mdl-39152991

ABSTRACT

MOTIVATION: Spatial transcriptomics allows quantification of mRNA expression within its spatial context. Nonetheless, in-depth analysis of spatial transcriptomics data remains challenging and difficult to scale due to the number of methods and libraries required for that purpose. RESULTS: Here we present SpatialOne, an end-to-end pipeline designed to simplify the analysis of 10x Visium data by combining multiple state-of-the-art computational methods to segment, deconvolve and quantify spatial information; this approach streamlines reproducible analysis of spatial data at scale. AVAILABILITY AND IMPLEMENTATION: SpatialOne source code and execution examples are available at https://github.com/Sanofi-Public/spatialone-pipeline. SpatialOne is distributed as a docker container image. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

2.
Article in English | MEDLINE | ID: mdl-39137087

ABSTRACT

Time series RNASeq studies can enable understanding of the dynamics of disease progression and treatment response in patients. They also provide information on biomarkers, activated and repressed pathways, and more. While useful, data from multiple patients is challenging to integrate due to the heterogeneity in treatment response among patients, and the small number of timepoints that are usually profiled. Due to the heterogeneity among patients, relying on the sampled time points to integrate data across individuals is challenging and does not lead to correct reconstruction of the response patterns. To address these challenges, we developed a new constraint-based pseudotime ordering method for analyzing transcriptomics data in clinical and response studies. Our method allows the assignment of samples to their correct placement on the response curve while respecting the individual patient order. We use polynomials to represent gene expression over the duration of the study and an EM algorithm to determine parameters and locations. Application to three treatment response datasets shows that our method improves on prior methods and leads to accurate orderings that provide new biological insight on the disease and response. Code for the method is available at https://github.com/Sanofi-Public/ RDCS-bulkRNASeq-pseudo ordering.
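The placement step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the per-gene polynomial coefficients are already known (in an EM loop they would be re-estimated after each placement), it restricts pseudotimes to a fixed grid, and it reads "respecting the individual patient order" as requiring strictly increasing pseudotimes across one patient's visits. The function names `poly_eval` and `order_constrained_assignment` are hypothetical.

```python
def poly_eval(coeffs, t):
    """Evaluate a polynomial (coefficients in increasing degree) with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def order_constrained_assignment(samples, gene_polys, grid):
    """Place one patient's samples on a pseudotime grid, preserving visit order.

    samples:    expression vectors, one per visit, listed in visit order
    gene_polys: one coefficient list per gene (assumed given, not fitted here)
    grid:       increasing candidate pseudotimes
    Returns strictly increasing pseudotimes that minimize total squared error.
    """
    n, m = len(samples), len(grid)
    if n == 0 or n > m:
        raise ValueError("need between 1 and len(grid) samples")
    # cost[i][j]: squared error of sample i if placed at grid[j]
    cost = [[sum((x - poly_eval(p, t)) ** 2 for x, p in zip(s, gene_polys))
             for t in grid] for s in samples]
    inf = float("inf")
    best = [[inf] * m for _ in range(n)]   # best[i][j]: optimal cost with sample i at j
    back = [[-1] * m for _ in range(n)]    # back[i][j]: grid index chosen for sample i-1
    best[0] = cost[0][:]
    for i in range(1, n):
        run_min, run_arg = inf, -1         # running minimum of best[i-1][k] over k < j
        for j in range(1, m):
            if best[i - 1][j - 1] < run_min:
                run_min, run_arg = best[i - 1][j - 1], j - 1
            if run_arg >= 0:
                best[i][j] = run_min + cost[i][j]
                back[i][j] = run_arg
    # Trace the optimal path back from the cheapest final placement.
    j = min(range(m), key=lambda k: best[n - 1][k])
    path = [j]
    for i in range(n - 1, 0, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return [grid[j] for j in path]


# One gene whose expression equals pseudotime; the second visit looks "earlier"
# than the first, but the order constraint keeps the visits in sequence.
placement = order_constrained_assignment(
    samples=[[0.3], [0.25], [0.9]],
    gene_polys=[[0.0, 1.0]],
    grid=[0.0, 0.25, 0.5, 0.75, 1.0],
)
```

The dynamic program keeps the search linear in the grid size per visit, which is one simple way to enforce an ordering constraint; the published method may enforce it differently.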

3.
Genome Res ; 34(7): 1027-1035, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-38951026

ABSTRACT

mRNA-based vaccines and therapeutics are gaining popularity and usage across a wide range of conditions. One of the critical issues when designing such mRNAs is sequence optimization. Even small proteins or peptides can be encoded by an enormously large number of mRNAs. The actual mRNA sequence can have a large impact on several properties, including expression, stability, immunogenicity, and more. To enable the selection of an optimal sequence, we developed CodonBERT, a large language model (LLM) for mRNAs. Unlike prior models, CodonBERT uses codons as inputs, which enables it to learn better representations. CodonBERT was trained using more than 10 million mRNA sequences from a diverse set of organisms. The resulting model captures important biological concepts. CodonBERT can also be extended to perform prediction tasks for various mRNA properties. CodonBERT outperforms previous mRNA prediction methods, including on a new flu vaccine data set.
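Two points in this abstract lend themselves to a short illustration: the size of the sequence space and the use of codons, rather than single nucleotides, as model inputs. The sketch below is not CodonBERT's tokenizer; the function names are hypothetical, and the only biological fact used is the number of synonymous codons per amino acid in the standard genetic code.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def codon_tokens(seq):
    """Split a coding sequence into non-overlapping three-nucleotide tokens."""
    rna = seq.upper().replace("T", "U")  # accept DNA-style input as well
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def count_encodings(protein):
    """Count the distinct mRNA coding sequences for a protein (stop codon excluded)."""
    return prod(DEGENERACY[aa] for aa in protein.upper())


tokens = codon_tokens("AUGGCUUUA")   # Met-Ala-Leu
n_ways = count_encodings("MAL")      # 1 * 4 * 6 synonymous combinations
```

Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive search over candidate mRNAs is impractical even for short peptides.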


Subject(s)
Messenger RNA, mRNA Vaccines, Humans, Messenger RNA/genetics, Codon, Algorithms
4.
Bioinformatics ; 40(Supplement_1): i151-i159, 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38940139

ABSTRACT

MOTIVATION: Analysis of time series transcriptomics data from clinical trials is challenging. Such studies usually profile very few time points from several individuals with varying response patterns and dynamics. Current methods for these datasets are mainly based on linear, global orderings using visit times which do not account for the varying response rates and subgroups within a patient cohort. RESULTS: We developed a new method that utilizes multi-commodity flow algorithms for trajectory inference in large scale clinical studies. Recovered trajectories satisfy individual-based timing restrictions while integrating data from multiple patients. Testing the method on multiple drug datasets demonstrated an improved performance compared to prior approaches suggested for this task, while identifying novel disease subtypes that correspond to heterogeneous patient response patterns. AVAILABILITY AND IMPLEMENTATION: The source code and instructions to download the data have been deposited on GitHub at https://github.com/euxhenh/Truffle.


Subject(s)
Algorithms, Transcriptome, Humans, Transcriptome/genetics, Gene Expression Profiling/methods, Software
5.
Bioinformatics ; 40(7)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38810107

ABSTRACT

MOTIVATION: Lipid nanoparticles (LNPs) are the most widely used vehicles for mRNA vaccine delivery. The structure of the lipids composing the LNPs can have a major impact on the effectiveness of the mRNA payload. Several properties should be optimized to improve delivery and expression including biodegradability, synthetic accessibility, and transfection efficiency. RESULTS: To optimize LNPs, we developed and tested models that enable the virtual screening of LNPs with high transfection efficiency. Our best method uses the lipid Simplified Molecular-Input Line-Entry System (SMILES) as inputs to a large language model. Large language model-generated embeddings are then used by a downstream gradient-boosting classifier. As we show, our method can more accurately predict lipid properties, which could lead to higher efficiency and reduced experimental time and costs. AVAILABILITY AND IMPLEMENTATION: Code and data links available at: https://github.com/Sanofi-Public/LipoBART.


Subject(s)
Lipids, Nanoparticles, Transfection, Nanoparticles/chemistry, Lipids/chemistry, Transfection/methods, Messenger RNA/metabolism, Liposomes
6.
bioRxiv ; 2024 Jan 07.
Article in English | MEDLINE | ID: mdl-38260359

ABSTRACT

Direct nanopore-based RNA sequencing can be used to detect post-transcriptional base modifications, such as m6A methylation, based on the electric current signals produced by the distinct chemical structures of modified bases. A key challenge is the scarcity of adequate training data with known methylation modifications. We present Xron, a hybrid encoder-decoder framework that delivers a direct methylation-distinguishing basecaller by training on synthetic RNA data and immunoprecipitation-based experimental data in two steps. First, we generate data with more diverse modification combinations through in silico cross-linking. Second, we use this dataset to train an end-to-end neural network basecaller followed by fine-tuning on immunoprecipitation-based experimental data with label-smoothing. The trained neural network basecaller outperforms existing methylation detection methods on both read-level and site-level prediction scores. Xron is a standalone, end-to-end m6A-distinguishing basecaller capable of detecting methylated bases directly from raw sequencing signals, enabling de novo methylome assembly.
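The label-smoothing step mentioned in this abstract can be sketched generically. The code below shows the standard technique, not Xron's training code: the five-class alphabet (A, C, G, U, m6A), the smoothing weight `eps`, and the function names are all assumptions made for illustration.

```python
import math

CLASSES = ["A", "C", "G", "U", "m6A"]  # hypothetical output alphabet


def smooth_labels(target, num_classes, eps=0.1):
    """Return a soft target: (1 - eps) on the labelled class plus eps spread uniformly."""
    base = eps / num_classes
    return [base + (1.0 - eps) if k == target else base for k in range(num_classes)]


def cross_entropy(probs, soft_targets):
    """Cross-entropy between predicted probabilities and a soft target distribution."""
    return -sum(t * math.log(p) for p, t in zip(probs, soft_targets) if t > 0)


# A base labelled m6A by an imperfect experimental assay.
soft = smooth_labels(CLASSES.index("m6A"), len(CLASSES), eps=0.1)
loss = cross_entropy([0.05, 0.05, 0.05, 0.05, 0.80], soft)
```

Softening the targets limits how strongly any single label can pull the model, which is a common choice when labels come from a noisy assay; the abstract itself does not state the motivation.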

7.
Nature ; 626(7998): 367-376, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38092041

ABSTRACT

Implantation of the human embryo begins a critical developmental stage that comprises profound events including axis formation, gastrulation and the emergence of the haematopoietic system [1,2]. Our mechanistic knowledge of this window of human life remains limited due to restricted access to in vivo samples for both technical and ethical reasons [3-5]. Stem cell models of the human embryo have emerged to help unlock the mysteries of this stage [6-16]. Here we present a genetically inducible stem cell-derived embryoid model of early post-implantation human embryogenesis that captures the reciprocal codevelopment of embryonic tissue and the extra-embryonic endoderm and mesoderm niche with early haematopoiesis. This model is produced from induced pluripotent stem cells and shows unanticipated self-organizing cellular programmes similar to those that occur in embryogenesis, including the formation of amniotic cavity and bilaminar disc morphologies as well as the generation of an anterior hypoblast pole and posterior domain. The extra-embryonic layer in these embryoids lacks trophoblast and shows advanced multilineage yolk sac tissue-like morphogenesis that harbours a process similar to distinct waves of haematopoiesis, including the emergence of erythroid-, megakaryocyte-, myeloid- and lymphoid-like cells. This model presents an easy-to-use, high-throughput, reproducible and scalable platform to probe multifaceted aspects of human development and blood formation at the early post-implantation stage. It will provide a tractable human-based model for drug testing and disease modelling.


Subject(s)
Embryonic Development, Germ Layers, Hematopoiesis, Yolk Sac, Humans, Embryo Implantation, Endoderm/cytology, Endoderm/embryology, Germ Layers/cytology, Germ Layers/embryology, Yolk Sac/cytology, Yolk Sac/embryology, Mesoderm/cytology, Mesoderm/embryology, Induced Pluripotent Stem Cells/cytology, Amnion/cytology, Amnion/embryology, Embryoid Bodies/cytology, Cell Lineage, Developmental Biology/methods, Developmental Biology/trends
8.
bioRxiv ; 2023 Jun 16.
Article in English | MEDLINE | ID: mdl-37398213

ABSTRACT

Spatial transcriptomics promises to greatly improve our understanding of tissue organization and cell-cell interactions. While most current platforms for spatial transcriptomics only offer multi-cellular resolution, with 10-15 cells per spot, recent technologies provide a much denser spot placement leading to sub-cellular resolution. A key challenge for these newer methods is cell segmentation and the assignment of spots to cells. Traditional image-based segmentation methods are limited and do not make full use of the information profiled by spatial transcriptomics. Here we present SCS, which combines imaging data with sequencing data to improve cell segmentation accuracy. SCS assigns spots to cells by adaptively learning the position of each spot relative to the center of its cell using a transformer neural network. SCS was tested on two new sub-cellular spatial transcriptomics technologies and outperformed traditional image-based segmentation methods. SCS achieved better accuracy, identified more cells, and provided more realistic cell size estimation. Sub-cellular analysis of RNAs using SCS spot assignments provides information on RNA localization and further supports the segmentation results.

9.
bioRxiv ; 2023 Jul 24.
Article in English | MEDLINE | ID: mdl-37398391

ABSTRACT

Implantation of the human embryo commences a critical developmental stage that comprises profound morphogenetic alteration of embryonic and extra-embryonic tissues, axis formation, and gastrulation events. Our mechanistic knowledge of this window of human life remains limited due to restricted access to in vivo samples for both technical and ethical reasons. Additionally, human stem cell models of early post-implantation development with both embryonic and extra-embryonic tissue morphogenesis are lacking. Here, we present iDiscoid, produced from human induced pluripotent stem cells via an engineered synthetic gene circuit. iDiscoids exhibit reciprocal co-development of human embryonic tissue and engineered extra-embryonic niche in a model of human post-implantation. They exhibit unanticipated self-organization and tissue boundary formation that recapitulates yolk sac-like tissue specification with extra-embryonic mesoderm and hematopoietic characteristics, the formation of bilaminar disc-like embryonic morphology, the development of an amniotic-like cavity, and acquisition of an anterior-like hypoblast pole and posterior-like axis. iDiscoids offer an easy-to-use, high-throughput, reproducible, and scalable platform to probe multifaceted aspects of human early post-implantation development. Thus, they have the potential to provide a tractable human model for drug testing, developmental toxicology, and disease modeling.

10.
Nat Methods ; 20(8): 1237-1243, 2023 08.
Article in English | MEDLINE | ID: mdl-37429992

ABSTRACT

Spatial transcriptomics promises to greatly improve our understanding of tissue organization and cell-cell interactions. While most current platforms for spatial transcriptomics only offer multi-cellular resolution, with 10-15 cells per spot, recent technologies provide a much denser spot placement leading to subcellular resolution. A key challenge for these newer methods is cell segmentation and the assignment of spots to cells. Traditional image-based segmentation methods are limited and do not make full use of the information profiled by spatial transcriptomics. Here we present subcellular spatial transcriptomics cell segmentation (SCS), which combines imaging data with sequencing data to improve cell segmentation accuracy. SCS assigns spots to cells by adaptively learning the position of each spot relative to the center of its cell using a transformer neural network. SCS was tested on two new subcellular spatial transcriptomics technologies and outperformed traditional image-based segmentation methods. SCS achieved better accuracy, identified more cells and provided more realistic cell size estimation. Subcellular analysis of RNAs using SCS spot assignments provides information on RNA localization and further supports the segmentation results.
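The assignment idea in this abstract, locating each spot relative to the center of its cell, can be sketched as follows. This is only an illustration of the final assignment step: the per-spot offsets would come from the trained transformer, and here they are supplied by hand. The function name, the `max_dist` cutoff for treating a spot as background, and the use of known nucleus centers are assumptions, not details taken from SCS.

```python
import math


def assign_spots(spots, offsets, nuclei, max_dist):
    """Assign each spot to the nucleus nearest to its predicted cell center.

    spots:    list of (x, y) spot coordinates
    offsets:  list of (dx, dy) vectors, each pointing from a spot toward the
              center of its cell (stand-in for the model's prediction)
    nuclei:   dict mapping cell id -> (x, y) nucleus center
    max_dist: spots whose predicted center is farther than this from every
              nucleus are left unassigned (None)
    """
    assignment = {}
    for idx, ((x, y), (dx, dy)) in enumerate(zip(spots, offsets)):
        px, py = x + dx, y + dy  # predicted center of the cell owning this spot
        cell_id, dist = min(
            ((cid, math.hypot(px - cx, py - cy)) for cid, (cx, cy) in nuclei.items()),
            key=lambda pair: pair[1],
        )
        assignment[idx] = cell_id if dist <= max_dist else None
    return assignment


result = assign_spots(
    spots=[(1, 1), (9, 9), (5, 5)],
    offsets=[(-1, -1), (1, 1), (20, 20)],
    nuclei={"cell_a": (0, 0), "cell_b": (10, 10)},
    max_dist=2.0,
)
```

Note that the third spot is equidistant from both nuclei, so plain nearest-nucleus assignment would be ambiguous; the predicted offset is what decides its fate.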


Subject(s)
Gene Expression Profiling, Transcriptome, Cell Communication, Cell Size, Learning
11.
Nat Aging ; 3(7): 776-790, 2023 07.
Article in English | MEDLINE | ID: mdl-37400722

ABSTRACT

Cellular senescence is a well-established driver of aging and age-related diseases. There are many challenges to mapping senescent cells in tissues such as the absence of specific markers and their relatively low abundance and vast heterogeneity. Single-cell technologies have allowed unprecedented characterization of senescence; however, many methodologies fail to provide spatial insights. The spatial component is essential, as senescent cells communicate with neighboring cells, impacting their function and the composition of extracellular space. The Cellular Senescence Network (SenNet), a National Institutes of Health (NIH) Common Fund initiative, aims to map senescent cells across the lifespan of humans and mice. Here, we provide a comprehensive review of the existing and emerging methodologies for spatial imaging and their application toward mapping senescent cells. Moreover, we discuss the limitations and challenges inherent to each technology. We argue that the development of spatially resolved methods is essential toward the goal of attaining an atlas of senescent cells.


Subject(s)
Aging, Cellular Senescence, United States, Humans, Animals, Mice, Longevity
12.
Bioinformatics ; 39(39 Suppl 1): i140-i148, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37387167

ABSTRACT

MOTIVATION: Spatial proteomics data have been used to map cell states and improve our understanding of tissue organization. More recently, these methods have been extended to study the impact of such organization on disease progression and patient survival. However, to date, the majority of supervised learning methods utilizing these data types did not take full advantage of the spatial information, impacting their performance and utilization. RESULTS: Taking inspiration from ecology and epidemiology, we developed novel spatial feature extraction methods for use with spatial proteomics data. We used these features to learn prediction models for cancer patient survival. As we show, using the spatial features led to consistent improvement over prior methods that used the spatial proteomics data for the same task. In addition, feature importance analysis revealed new insights about the cell interactions that contribute to patient survival. AVAILABILITY AND IMPLEMENTATION: The code for this work can be found at gitlab.com/enable-medicine-public/spatsurv.


Subject(s)
Neoplasms, Proteomics, Humans, Neoplasms/diagnostic imaging, Cell Communication, Disease Progression, Survival Analysis
13.
bioRxiv ; 2023 Mar 09.
Article in English | MEDLINE | ID: mdl-36945593

ABSTRACT

Cross-regulation between hormone signaling pathways is indispensable for plant growth and development. However, the molecular mechanisms by which multiple hormones interact and co-ordinate activity need to be understood. Here, we generated a cross-regulation network explaining how hormone signals are integrated from multiple pathways in etiolated Arabidopsis (Arabidopsis thaliana) seedlings. To do so, we comprehensively characterized transcription factor activity during plant hormone responses and reconstructed dynamic transcriptional regulatory models for six hormones: abscisic acid, brassinosteroid, ethylene, jasmonic acid, salicylic acid and strigolactone/karrikin. These models incorporated target data for hundreds of transcription factors and thousands of protein-protein interactions. Each hormone recruited different combinations of transcription factors, a subset of which were shared between hormones. Hub target genes existed within hormone transcriptional networks, exhibiting transcription factor activity themselves. In addition, a group of MITOGEN-ACTIVATED PROTEIN KINASES (MPKs) were identified as potential key points of cross-regulation between multiple hormones. Accordingly, the loss of function of one of these (MPK6) disrupted the global proteome, phosphoproteome and transcriptome during hormone responses. Lastly, we determined that all hormones drive substantial alternative splicing that has distinct effects on the transcriptome compared with differential gene expression, acting in early hormone responses. These results provide a comprehensive understanding of the common features of plant transcriptional regulatory pathways and how cross-regulation between hormones acts upon gene expression.

14.
Nucleic Acids Res ; 51(7): e38, 2023 04 24.
Article in English | MEDLINE | ID: mdl-36762475

ABSTRACT

Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data, which are not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods, leading to a more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas-scale single-cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.


Subject(s)
Computational Biology, Algorithms, Gene Regulatory Networks, Systems Biology, Single-Cell Analysis, Atlases as Topic
15.
bioRxiv ; 2023 Dec 19.
Article in English | MEDLINE | ID: mdl-38187629

ABSTRACT

Many popular spatial transcriptomics techniques lack single-cell resolution. Instead, these methods measure the collective gene expression for each location from a mixture of cells, potentially containing multiple cell types. Here, we developed scResolve, a method for recovering single-cell expression profiles from spatial transcriptomics measurements at multi-cellular resolution. scResolve accurately restores expression profiles of individual cells at their locations, which is unattainable from cell type deconvolution. Applications of scResolve on human breast cancer data and human lung disease data demonstrate that scResolve enables cell type-specific differential gene expression analysis between different tissue contexts and accurate identification of rare cell populations. The spatially resolved cellular-level expression profiles obtained through scResolve facilitate more flexible and precise spatial analysis that complements raw multi-cellular level analysis.

16.
Cell Rep Methods ; 2(11): 100332, 2022 11 21.
Article in English | MEDLINE | ID: mdl-36452867

ABSTRACT

Markers are increasingly being used for several high-throughput data analysis and experimental design tasks. Examples include the use of markers for assigning cell types in scRNA-seq studies, for deconvolving bulk gene expression data, and for selecting marker proteins in single-cell spatial proteomics studies. Most marker selection methods focus on differential expression (DE) analysis. Although such methods work well for data with a few non-overlapping marker sets, they are not appropriate for large atlas-size datasets where several cell types and tissues are considered. To address this, we define the phenotype cover (PC) problem for marker selection and present algorithms that can improve the discriminative power of marker sets. Analysis of these sets on several marker-selection tasks suggests that these methods can lead to solutions that accurately distinguish different phenotypes in the data.
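One way to picture a cover-style formulation of marker selection is as a set-cover problem over pairs of phenotypes. The sketch below uses the classic greedy set-cover heuristic and should be read as an assumption about the general shape of the problem, not as the algorithms presented in the paper. The marker names `g1` to `g3`, the phenotype labels, and the input table of which marker separates which pair are all invented for illustration.

```python
from itertools import combinations


def greedy_cover(distinguishes, phenotypes):
    """Greedily pick markers until every phenotype pair is separated by some marker.

    distinguishes: dict mapping marker -> set of frozenset({p, q}) pairs it separates
    phenotypes:    list of phenotype labels
    Returns the chosen markers in selection order; stops early if the remaining
    pairs cannot be separated by any marker.
    """
    uncovered = {frozenset(pair) for pair in combinations(phenotypes, 2)}
    chosen = []
    while uncovered:
        # Ties are broken alphabetically because max() keeps the first maximum.
        best = max(sorted(distinguishes), key=lambda m: len(distinguishes[m] & uncovered))
        gain = distinguishes[best] & uncovered
        if not gain:
            break
        chosen.append(best)
        uncovered -= gain
    return chosen


pair = lambda a, b: frozenset({a, b})
markers = greedy_cover(
    distinguishes={
        "g1": {pair("T", "B"), pair("T", "NK")},
        "g2": {pair("T", "B"), pair("B", "NK")},
        "g3": {pair("T", "NK"), pair("B", "NK")},
    },
    phenotypes=["T", "B", "NK"],
)
```

Unlike ranking genes by differential expression for each cell type separately, this formulation rewards a marker only for the pairs that are still unresolved, which is the intuition behind favoring cover over per-type DE on large atlases.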


Subject(s)
Gene Expression Profiling, Single-Cell Analysis, Gene Expression Profiling/methods, Single-Cell Analysis/methods, Cluster Analysis, Algorithms, Phenotype
18.
Nat Methods ; 19(10): 1306-1319, 2022 10.
Article in English | MEDLINE | ID: mdl-36064772

ABSTRACT

Hematopoietic humanized (hu) mice are powerful tools for modeling the action of the human immune system and are widely used for preclinical studies and drug discovery. However, generating a functional human T cell compartment in hu mice remains challenging, primarily due to the species-related differences between human and mouse thymus. While engrafting human fetal thymic tissues can support robust T cell development in hu mice, tissue scarcity and ethical concerns limit their wide use. Here, we describe the tissue engineering of human thymus organoids from inducible pluripotent stem cells (iPSC-thymus) that can support the de novo generation of a diverse population of functional human T cells. T cells of iPSC-thymus-engrafted hu mice could mediate both cellular and humoral immune responses, including mounting robust proinflammatory responses on T cell receptor engagement, inhibiting allogeneic tumor graft growth and facilitating efficient Ig class switching. Our findings indicate that hu mice engrafted with iPSC-thymus can serve as a new animal model to study human T cell-mediated immunity and accelerate the translation of findings from animal studies into the clinic.


Subject(s)
Hematopoietic Stem Cell Transplantation, Induced Pluripotent Stem Cells, Animals, Animal Disease Models, Humans, Mice, SCID Mice, Organoids, T-Lymphocytes, Thymus
19.
PLoS Comput Biol ; 18(9): e1010468, 2022 09.
Article in English | MEDLINE | ID: mdl-36095011

ABSTRACT

Studies comparing single cell RNA-Seq (scRNA-Seq) data between conditions mainly focus on differences in the proportion of cell types or on differentially expressed genes. In many cases these differences are driven by changes in cell interactions which are challenging to infer without spatial information. To determine cell-cell interactions that differ between conditions we developed the Cell Interaction Network Inference (CINS) pipeline. CINS combines Bayesian network analysis with regression-based modeling to identify differential cell type interactions and the proteins that underlie them. We tested CINS on a disease case control and on an aging mouse dataset. In both cases CINS correctly identifies cell type interactions and the ligands involved in these interactions improving on prior methods suggested for cell interaction predictions. We performed additional mouse aging scRNA-Seq experiments which further support the interactions identified by CINS.


Subject(s)
Gene Expression Profiling, Single-Cell Analysis, Animals, Bayes Theorem, Cell Communication, Gene Expression Profiling/methods, Ligands, Mice, RNA Sequence Analysis/methods, Single-Cell Analysis/methods
20.
J Comput Biol ; 29(11): 1229-1232, 2022 11.
Article in English | MEDLINE | ID: mdl-36036832

ABSTRACT

UNIFAN is an unsupervised cell type annotation tool for single-cell RNA sequencing data (scRNA-seq). Given single-cell expression data as input, UNIFAN outputs cell clusters as well as annotations for each cluster. The clustering process utilizes information on pathways and biological processes and these are also used to annotate the resulting clusters. In this software article, we focus on how to install UNIFAN and on the main steps involved in using UNIFAN for cell type annotations.


Subject(s)
Gene Expression Profiling, Single-Cell Analysis, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Gene Expression Profiling/methods, Cluster Analysis, Software