Results 1 - 11 of 11
1.
Bioinformatics ; 40(2), 2024 02 01.
Article in English | MEDLINE | ID: mdl-38243704

ABSTRACT

MOTIVATION: Spatially resolved transcriptomics (SRT) enables scientists to investigate spatial context of mRNA abundance, including identifying spatially variable genes (SVGs), i.e. genes whose expression varies across the tissue. Although several methods have been proposed for this task, native SVG tools cannot jointly model biological replicates, or identify the key areas of the tissue affected by spatial variability. RESULTS: Here, we introduce DESpace, a framework, based on an original application of existing methods, to discover SVGs. In particular, our approach inputs all types of SRT data, summarizes spatial information via spatial clusters, and identifies spatially variable genes by performing differential gene expression testing between clusters. Furthermore, our framework can identify (and test) the main cluster of the tissue affected by spatial variability; this allows scientists to investigate spatial expression changes in specific areas of interest. Additionally, DESpace enables joint modeling of multiple samples (i.e. biological replicates); compared to inference based on individual samples, this approach increases statistical power, and targets SVGs with consistent spatial patterns across replicates. Overall, in our benchmarks, DESpace displays good true positive rates, controls for false positive and false discovery rates, and is computationally efficient. AVAILABILITY AND IMPLEMENTATION: DESpace is freely distributed as a Bioconductor R package at https://bioconductor.org/packages/DESpace.
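
The idea of testing a gene for expression differences between spatial clusters can be illustrated with a small sketch. This is not the DESpace implementation: a plain permutation test on the between-cluster sum of squares is swapped in for the package's differential expression machinery, and all function names and toy data below are hypothetical.

```python
import random
from collections import defaultdict


def between_cluster_ss(values, clusters):
    """Between-cluster sum of squares of `values` grouped by `clusters`."""
    groups = defaultdict(list)
    for v, c in zip(values, clusters):
        groups[c].append(v)
    grand_mean = sum(values) / len(values)
    return sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )


def cluster_permutation_pvalue(values, clusters, n_perm=2000, seed=0):
    """Permutation p-value for 'mean expression differs between clusters'."""
    rng = random.Random(seed)
    observed = between_cluster_ss(values, clusters)
    shuffled = list(clusters)
    hits = 0
    for _ in range(n_perm):
        # Shuffling cluster labels breaks any link between space and expression.
        rng.shuffle(shuffled)
        if between_cluster_ss(values, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 3 spatial clusters with 20 spots each.
rng = random.Random(1)
clusters = [c for c in ("A", "B", "C") for _ in range(20)]
# First gene is elevated in cluster "B"; second gene has the same mean everywhere.
svg_like = [rng.gauss(3.0 if c == "B" else 1.0, 0.5) for c in clusters]
flat = [rng.gauss(1.0, 0.5) for _ in clusters]

p_svg = cluster_permutation_pvalue(svg_like, clusters)
p_flat = cluster_permutation_pvalue(flat, clusters)
print(f"cluster-dependent gene: p = {p_svg:.4f}")
print(f"flat gene:              p = {p_flat:.4f}")
```

A gene whose expression is concentrated in one cluster yields a small p-value, while a gene with no cluster structure has no systematic tendency toward small values. Testing one cluster against all others, as the abstract describes for identifying the main affected area, would amount to recoding `clusters` into two labels before calling the same function.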


Subject(s)
Gene Expression Profiling , Software , Gene Expression Profiling/methods , Benchmarking , Transcriptome
2.
bioRxiv ; 2023 Aug 17.
Article in English | MEDLINE | ID: mdl-37645841

ABSTRACT

Motivation: Although transcriptomics data is typically used to analyse mature spliced mRNA, recent attention has focused on jointly investigating spliced and unspliced (or precursor-) mRNA, which can be used to study gene regulation and changes in gene expression production. Nonetheless, most methods for spliced/unspliced inference (such as RNA velocity tools) focus on individual samples, and rarely allow comparisons between groups of samples (e.g., healthy vs. diseased). Furthermore, this kind of inference is challenging, because spliced and unspliced mRNA abundance is characterized by a high degree of quantification uncertainty, due to the prevalence of multi-mapping reads, i.e., reads compatible with multiple transcripts (or genes), and/or with both their spliced and unspliced versions. Results: Here, we present DifferentialRegulation, a Bayesian hierarchical method to discover changes between experimental conditions with respect to the relative abundance of unspliced mRNA (over the total mRNA). We model the quantification uncertainty via a latent variable approach, where reads are allocated to their gene/transcript of origin, and to the respective splice version. We designed several benchmarks where our approach shows good performance, in terms of sensitivity and error control, versus state-of-the-art competitors. Importantly, our tool is flexible, and works with both bulk and single-cell RNA-sequencing data. Availability and implementation: DifferentialRegulation is distributed as a Bioconductor R package.

3.
Methods Mol Biol ; 2584: 269-292, 2023.
Article in English | MEDLINE | ID: mdl-36495456

ABSTRACT

Technological developments have led to an explosion of high-throughput single-cell data, which are revealing unprecedented perspectives on cell identity. Recently, significant attention has focused on investigating, from single-cell RNA-sequencing (scRNA-seq) data, cellular dynamic processes, such as cell differentiation, cell cycle and cell (de)activation. In particular, trajectory inference methods, by ordering cells along a trajectory, allow estimating a differentiation tree of cells. While trajectory inference tools typically work with gene expression levels, common scRNA-seq protocols allow the identification and quantification of unspliced pre-mRNAs and mature spliced mRNAs for each gene. By exploiting the abundance of unspliced and spliced mRNA, one can infer the RNA velocity of individual cells, i.e., the time derivative of the gene expression state of cells. Whereas traditional trajectory inference methods reconstruct cellular dynamics given a population of cells of varying maturity, RNA velocity relies on a dynamical model describing splicing dynamics. Here, we initially discuss conceptual and theoretical aspects of both approaches, then illustrate how they can be combined together, and finally present an example use case on real data.
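
For reference, the phrase "time derivative of the gene expression state" can be made concrete with the splicing model that is conventional in the RNA velocity literature. The symbols below are standard notation and are not taken from this abstract; the chapter itself may use a different parameterization.

```latex
\begin{aligned}
\frac{du(t)}{dt} &= \alpha - \beta\, u(t), \\
\frac{ds(t)}{dt} &= \beta\, u(t) - \gamma\, s(t)
\end{aligned}
```

Here $u$ and $s$ are the unspliced and spliced abundances of a gene, $\alpha$ is the transcription rate, $\beta$ the splicing rate, and $\gamma$ the degradation rate. Under these assumptions the velocity of the gene is $ds/dt$: a positive value suggests the spliced abundance is increasing, a negative value that it is decreasing.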


Subject(s)
RNA , Single-Cell Analysis , RNA/genetics , Single-Cell Analysis/methods , RNA Splicing , Cell Differentiation/genetics , RNA, Messenger/genetics , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods
4.
Cancers (Basel) ; 14(10), 2022 May 14.
Article in English | MEDLINE | ID: mdl-35626040

ABSTRACT

Immune checkpoint inhibitors (ICIs) are largely used in the treatment of patients with advanced non-small-cell lung cancer (NSCLC). Novel biomarkers that provide biological information that could be useful for clinical management are needed. In this respect, extracellular vesicle (EV)-associated microRNAs (miRNAs), which are the principal vehicle of intercellular communication, may be important sources of biomarkers. We analyzed the levels of 799 EV-miRNAs in the pretreatment plasma of 88 advanced NSCLC patients who received anti-PD-1 therapy as a single agent. After data normalization, we used a two-step approach to identify candidate biomarkers associated with both objective response (OR) by RECIST and longer overall survival (OS). Univariate and multivariate analyses including known clinicopathologic variables and new findings were performed. In our cohort, 24/88 (27.3%) patients showed OR by RECIST. Median OS in the whole cohort was 11.5 months. In total, 196 EV-miRNAs out of 799 were selected as expressed above background. After multiplicity adjustment, abundance of EV-miR-625-5p was found to be correlated with PD-L1 expression and significantly associated with OR by RECIST (p = 0.0366) and OS (p = 0.0031). In multivariate analysis, PD-L1 staining and EV-miR-625-5p levels were consistently associated with OR and OS. Finally, we showed that EV-miR-625-5p levels could discriminate patients with longer survival, in particular in the class expressing PD-L1 ≥50%. EV-miRNAs represent a source of relevant biomarkers. EV-miR-625-5p is an independent biomarker of response and survival in ICI-treated NSCLC patients, in particular in patients with PD-L1 expression ≥50%.

5.
Genome Biol ; 23(1): 69, 2022 03 03.
Article in English | MEDLINE | ID: mdl-35241129

ABSTRACT

BACKGROUND: The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms. RESULTS: We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis. CONCLUSIONS: Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.


Subject(s)
Proteogenomics , Alternative Splicing , Humans , Protein Isoforms/genetics , Proteomics , Sequence Analysis, RNA/methods , Transcriptome
6.
BMC Biol ; 19(1): 177, 2021 08 28.
Article in English | MEDLINE | ID: mdl-34454477

ABSTRACT

BACKGROUND: Apomixis, the asexual reproduction through seeds, occurs in over 40 plant families and avoids the hidden cost of sex. Apomictic plants are thought to have an advantage in sparse populations and when colonizing new areas but may have a disadvantage in changing environments because they propagate via fixed genotypes. In this study, we separated the influences of different genetic backgrounds (potentially reflecting local adaptation) from those of the mode of reproduction, i.e., sexual vs. apomictic, on nine fitness-related traits in Hieracium pilosella L. We aimed to test whether apomixis per se may provide a fitness advantage in different competitive environments in a common garden setting. RESULTS: To separate the effects of genetic background from those of reproductive mode, we generated five families of apomictic and sexual full siblings by crossing two paternal with four maternal parents. Under competition, apomictic plants showed reproductive assurance (probability of seeding, fertility), while offspring of sexual plants with the same genetic background had a higher germination rate. Sexual plants grew better (biomass) than apomictic plants in the presence of grass as a competitor but apomictic plants spread further vegetatively (maximum stolon length) when their competitors were sexual plants of the same species. Furthermore, genetic background as represented by the five full-sibling families influenced maximum stolon length, the number of seeds, and total fitness. Under competition with grass, genetic background influenced fecundity, the number of seeds, and germination rate. CONCLUSIONS: Our results suggest that both the mode of reproduction as well as the genetic background affect the success of H. pilosella in competitive environments. Total fitness, the most relevant trait for adaptation, was only affected by the genetic background. 
However, we also show for the first time that apomixis per se has effects on fitness-related traits that are not confounded by, and thus independent of, the genetic background.


Subject(s)
Apomixis , Asteraceae , Apomixis/genetics , Asteraceae/genetics , Genetic Background , Phenotype , Reproduction, Asexual/genetics , Seeds/genetics
7.
Genome Biol ; 22(1): 56, 2021 02 04.
Article in English | MEDLINE | ID: mdl-33541397

ABSTRACT

BACKGROUND: Transcription in mammalian cells is a complex stochastic process involving shuttling of polymerase between genes and phase-separated liquid condensates. It occurs in bursts, which results in vastly different numbers of an mRNA species in isogenic cell populations. Several factors contributing to transcriptional bursting have been identified, usually classified as intrinsic, in other words local to single genes, or extrinsic, relating to the macroscopic state of the cell. However, some possible contributors have not been explored yet. Here, we focus on processes at the 3′ and 5′ ends of a gene that enable reinitiation of transcription upon termination. RESULTS: Using Bayesian methodology, we measure the transcriptional bursting in inducible transgenes, showing that perturbation of polymerase shuttling typically reduces burst size, increases burst frequency, and thus limits transcriptional noise. Analysis based on paired-end tag sequencing (PolII ChIA-PET) suggests that this effect is genome wide. The observed noise patterns are also reproduced by a generative model that captures major characteristics of the polymerase flux between the ends of a gene and a phase-separated compartment. CONCLUSIONS: Interactions between the 3′ and 5′ ends of a gene, which facilitate polymerase recycling, are major contributors to transcriptional noise.


Subject(s)
Cell Physiological Phenomena , Gene Expression , Models, Genetic , Transcription, Genetic , Animals , Bayes Theorem , HEK293 Cells , Humans , Models, Theoretical , RNA, Messenger , Stochastic Processes , beta-Globins/genetics
8.
Genome Biol ; 21(1): 69, 2020 03 16.
Article in English | MEDLINE | ID: mdl-32178699

ABSTRACT

Alternative splicing is a biological process during gene expression that allows a single gene to code for multiple proteins. However, splicing patterns can be altered in some conditions or diseases. Here, we present BANDITS, an R/Bioconductor package to perform differential splicing, at both gene and transcript level, based on RNA-seq data. BANDITS uses a Bayesian hierarchical structure to explicitly model the variability between samples and treats the transcript allocation of reads as latent variables. We perform an extensive benchmark across both simulated and experimental RNA-seq datasets, where BANDITS has extremely favourable performance with respect to the competitors considered.


Subject(s)
Alternative Splicing , Software , Bayes Theorem , Humans , RNA-Seq , Reproducibility of Results
9.
Bioinformatics ; 34(17): i647-i655, 2018 09 01.
Article in English | MEDLINE | ID: mdl-30423089

ABSTRACT

Motivation: Transcription in single cells is an inherently stochastic process as mRNA levels vary greatly between cells, even for genetically identical cells under the same experimental and environmental conditions. We present a stochastic two-state switch model for the population of mRNA molecules in single cells where genes stochastically alternate between a more active ON state and a less active OFF state. We prove that the stationary solution of such a model can be written as a mixture of a Poisson and a Poisson-beta probability distribution. This finding facilitates inference for single cell expression data, observed at a single time point, from flow cytometry experiments such as FACS or fluorescence in situ hybridization (FISH) as it allows one to sample directly from the equilibrium distribution of the mRNA population. We hence propose a Bayesian inferential methodology using a pseudo-marginal approach and a recent approximation to integrate over unobserved states associated with measurement error. Results: We provide a general inferential framework which can be widely used to study transcription in single cells from the kind of data arising in flow cytometry experiments. The approach allows us to separate between the intrinsic stochasticity of the molecular dynamics and the measurement noise. The methodology is tested in simulation studies and results are obtained for experimental multiple single cell expression data from FISH flow cytometry experiments. Availability and implementation: All analyses were implemented in R. Source code and the experimental data are available at https://github.com/SimoneTiberi/Bayesian-inference-on-stochastic-gene-transcription-from-flow-cytometry-data. Supplementary information: Supplementary data are available at Bioinformatics online.
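
To make the Poisson-beta component mentioned above more tangible, here is a minimal sampling sketch. It covers only a plain Poisson-beta draw; the abstract does not give the mixture weights or the parameterization used in the paper, so the parameter names (`max_rate`, `a`, `b`) and the values in the demo are assumptions chosen for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method."""
    # Adequate for small to moderate rates; exp(-rate) underflows for huge ones.
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_poisson_beta(max_rate, a, b, rng):
    """Draw X ~ Poisson(max_rate * p), where p ~ Beta(a, b)."""
    p = rng.betavariate(a, b)
    return sample_poisson(max_rate * p, rng)


# Hypothetical parameter values, used only to show the overdispersion.
rng = random.Random(42)
draws = [sample_poisson_beta(20.0, 2.0, 3.0, rng) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / (len(draws) - 1)
# Theory for these values: mean = 20 * 2/5 = 8, variance = 8 + 400 * 0.04 = 24.
print(f"sample mean = {mean:.2f}, sample variance = {var:.2f}")
```

The variance exceeds the mean, which a single Poisson distribution cannot reproduce. This is one way to see why a Poisson-beta component is useful for describing mRNA counts that vary strongly between genetically identical cells.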


Subject(s)
Bayes Theorem , Transcription, Genetic , Flow Cytometry , In Situ Hybridization, Fluorescence , Software , Stochastic Processes
10.
Stat Methods Med Res ; 27(11): 3386-3396, 2018 11.
Article in English | MEDLINE | ID: mdl-28395600

ABSTRACT

Couples with diseases associated with the sex chromosomes, as well as families in countries where the desire for a male is extreme, are interested in influencing the sex of the baby. We propose an original composite likelihood approach to analyse the relation between the sex of the newborn and the timing of the intercourse which leads to conception. Although there exist numerous works on this relation, only a few studies have been carried out on independent datasets to validate the existing theories. Since the sex of the newborn is only known in case of conception, the full likelihood of the data is not easily defined without strong assumptions. A composite likelihood is a pseudo likelihood defined as the product of likelihood functions relative to subsets of the data. In particular, we consider two such likelihoods, one modelling the day-specific probabilities of conception and the other modelling the sex of the newborn given a conception has occurred. The methodology is applied to a dataset from a European fecundability study. The results show no significant dependence of the sex of the newborn on the time of intercourse. The method developed may be applied to other situations when data are affected by selective sampling.
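
The definition of a composite likelihood given in this abstract can be written compactly as follows. The notation is an assumption made for illustration and is not taken from the paper; in particular, the abstract does not say whether the two components share parameters, so a single parameter vector $\theta$ is used.

```latex
L_C(\theta; y) = \prod_{k=1}^{K} L_k(\theta; y_{A_k}),
\qquad
L_C(\theta; y) = L_{\text{conc}}(\theta; y_{\text{conc}})
  \times L_{\text{sex}}(\theta; y_{\text{sex}} \mid \text{conception})
```

The left expression is the general form, with $y_{A_k}$ denoting the $k$-th subset of the data. The right expression is the case $K = 2$ described above: one factor for the day-specific probabilities of conception and one for the sex of the newborn given that a conception has occurred.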


Subject(s)
Forecasting , Likelihood Functions , Sex Ratio , Female , Fertility , Humans , Pregnancy
11.
Tumori ; 97(6): 749-55, 2011.
Article in English | MEDLINE | ID: mdl-22322842

ABSTRACT

AIMS AND BACKGROUND: Discordance of intraoperative analysis with definitive histology of the sentinel lymph node in breast cancer leads to completion axillary lymph node dissection, which only in 35-50% shows additional nodal metastases. The aim of the study was to identify individual patient risk for non-sentinel lymph node metastases by validating several statistical methods present in the recent literature and by developing a new tool with the final goal of avoiding unnecessary completion axillary lymph node dissection. METHODS: We retrospectively evaluated 593 primary breast cancer patients. Completion axillary lymph node dissection was performed in 139 with a positive sentinel lymph node. The predictive accuracy of five published nomograms (MSKCC, Tenon, Cambridge, Stanford and Gur) was measured by the area under the receiver operating characteristic curve. We then developed a new logistic regression model to compare performance. Our model was validated by the leave-one-out cross-validation method. RESULTS: In 53 cases (38%), we found at least one metastatic non-sentinel lymph node. All the selected nomograms showed values greater than the 0.70 threshold, and our model reported a value of 0.77 (confidence interval = 0.69-0.86 and error rate = 0.28) and 0.72 (confidence interval = 0.63-0.81, error rate = 0.28) after the validation. With a 5% cutoff value, sensitivity was 98% and specificity 9%, for a cutoff of 10%, 96% and 2%, respectively. CONCLUSIONS: All the nomograms were good discriminators, but the alternative developed model showed the best predictive accuracy in this Italian breast cancer sample. We still confirm that these models, very accurate in the institution of origin, require a new validation if used on other populations of patients.
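
The accuracy measures reported in this abstract (area under the ROC curve, and sensitivity and specificity at a risk cutoff) can be computed as in the sketch below. The predicted risks and outcomes are hypothetical and are not the study's data; the function names are likewise illustrative.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Call score >= cutoff positive; return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks; label 1 = metastatic non-sentinel lymph node.
scores = [0.03, 0.08, 0.12, 0.20, 0.35, 0.40, 0.60, 0.80]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

print("AUC:", auc(scores, labels))
for cutoff in (0.05, 0.10):
    sens, spec = sensitivity_specificity(scores, labels, cutoff)
    print(f"cutoff {cutoff:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With very low cutoffs almost every patient is classified as at risk, so sensitivity is high and specificity is low, which is the pattern the abstract reports for its 5% and 10% cutoffs.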


Subject(s)
Breast Neoplasms/pathology , Lymph Node Excision , Lymph Nodes/pathology , Models, Statistical , Nomograms , Sentinel Lymph Node Biopsy , Adult , Aged , Axilla , Breast Neoplasms/surgery , Female , Humans , Italy , Lymph Node Excision/methods , Lymphatic Metastasis/pathology , Middle Aged , Predictive Value of Tests , ROC Curve , Retrospective Studies , Sensitivity and Specificity , Unnecessary Procedures