Results 1 - 20 of 55
1.
Cancers (Basel) ; 16(7)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38611028

ABSTRACT

Topic modeling is a popular technique in machine learning and natural language processing, where a corpus of text documents is classified into themes or topics using word frequency analysis. This approach has proven successful in various biological data analysis applications, such as predicting cancer subtypes with high accuracy and identifying genes, enhancers, and stable cell types simultaneously from sparse single-cell epigenomics data. The advantage of using a topic model is that it not only serves as a clustering algorithm, but it can also explain clustering results by providing word probability distributions over topics. Our study proposes a novel topic modeling approach for clustering single cells and detecting topics (gene signatures) in single-cell datasets that measure multiple omics simultaneously. We applied this approach to examine the transcriptional heterogeneity of luminal and triple-negative breast cancer cells using patient-derived xenograft models with acquired resistance to chemotherapy and targeted therapy. Through this approach, we identified protein-coding genes and long non-coding RNAs (lncRNAs) that group thousands of cells into biologically similar clusters, accurately distinguishing drug-sensitive and -resistant breast cancer types. In comparison to standard state-of-the-art clustering analyses, our approach offers an optimal partitioning of genes into topics and cells into clusters simultaneously, producing easily interpretable clustering outcomes. Additionally, we demonstrate that an integrative clustering approach, which combines the information from mRNAs and lncRNAs treated as disjoint omics layers, enhances the accuracy of cell classification.

2.
Phys Biol ; 20(5)2023 08 10.
Article in English | MEDLINE | ID: mdl-37489881

ABSTRACT

Cell-to-cell variability in protein concentrations is strongly affected by extrinsic noise, especially for highly expressed genes. Extrinsic noise can be due to fluctuations of several possible cellular factors connected to cell physiology and to the level of key enzymes in the expression process. However, how to identify the predominant sources of extrinsic noise in a biological system is still an open question. This work considers a general stochastic model of gene expression with extrinsic noise represented as fluctuations of the different model rates, and focuses on the out-of-equilibrium expression dynamics. Combining analytical calculations with stochastic simulations, we characterize how extrinsic noise shapes the protein variability during gene activation or inactivation, depending on the prevailing source of extrinsic variability, on its intensity and timescale. In particular, we show that qualitatively different noise profiles can be identified depending on which are the fluctuating parameters. This indicates an experimentally accessible way to pinpoint the dominant sources of extrinsic noise using time-coarse experiments.
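
As a point of reference for how extrinsic noise enters protein variability, the following is the textbook steady-state decomposition for a simple birth-death process whose production rate $k$ varies from cell to cell while the degradation rate $\gamma$ is fixed. This is an illustrative assumption, not the out-of-equilibrium model analysed in the paper; the symbols $n$ (protein copy number), $k$ and $\gamma$ are introduced here.

```latex
% Conditional on k, the copy number n is Poisson with mean k/\gamma.
\begin{align}
\langle n \rangle &= \frac{\langle k \rangle}{\gamma}, \\
\operatorname{Var}(n) &= \underbrace{\mathbb{E}\big[\operatorname{Var}(n \mid k)\big]}_{\text{intrinsic}}
  + \underbrace{\operatorname{Var}\big(\mathbb{E}[n \mid k]\big)}_{\text{extrinsic}}
  = \frac{\langle k \rangle}{\gamma} + \frac{\operatorname{Var}(k)}{\gamma^{2}}, \\
\mathrm{CV}_n^{2} &= \frac{1}{\langle n \rangle} + \mathrm{CV}_k^{2}.
\end{align}
```

In this simplified setting the intrinsic term decays as $1/\langle n \rangle$, which is consistent with the abstract's remark that extrinsic noise matters most for highly expressed genes.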


Subjects
Cell Physiological Phenomena , Proteins , Gene Expression , Stochastic Processes , Models, Biological
3.
Phys Rev E ; 107(4-1): 044403, 2023 Apr.
Article in English | MEDLINE | ID: mdl-37198814

ABSTRACT

Large-scale data on single-cell gene expression have the potential to unravel the specific transcriptional programs of different cell types. The structure of these expression datasets suggests a similarity with several other complex systems that can be analogously described through the statistics of their basic building blocks. Transcriptomes of single cells are collections of messenger RNA abundances transcribed from a common set of genes just as books are different collections of words from a shared vocabulary, genomes of different species are specific compositions of genes belonging to evolutionary families, and ecological niches can be described by their species abundances. Following this analogy, we identify several emergent statistical laws in single-cell transcriptomic data closely similar to regularities found in linguistics, ecology, or genomics. A simple mathematical framework can be used to analyze the relations between different laws and the possible mechanisms behind their ubiquity. Importantly, treatable statistical models can be useful tools in transcriptomics to disentangle the actual biological variability from general statistical effects present in most component systems and from the consequences of the sampling process inherent to the experimental technique.
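
A well-known example of such a regularity in linguistics is a Zipf-like rank-abundance law. The sketch below shows how one might check for it in a single cell's expression counts; it assumes, without confirmation from the abstract, that a rank-abundance relation is among the laws considered. The helper names `rank_abundance` and `zipf_exponent` are hypothetical, and the log-log least-squares fit is only a rough estimator.

```python
import math

def rank_abundance(counts):
    """Sort positive counts in decreasing order and pair them with ranks 1..N."""
    ordered = sorted((c for c in counts if c > 0), reverse=True)
    return list(enumerate(ordered, start=1))

def zipf_exponent(counts):
    """Least-squares slope of log(abundance) versus log(rank)."""
    pts = [(math.log(r), math.log(c)) for r, c in rank_abundance(counts)]
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    return sxy / sxx

# Idealised counts proportional to 1/rank (made-up numbers, not real data);
# genes with zero counts are ignored by rank_abundance.
toy_counts = [300, 1200, 0, 600, 240, 400, 200]
slope = zipf_exponent(toy_counts)  # close to -1 for these idealised counts
```

On real single-cell data, sampling depth affects the tail of the rank-abundance curve, which is the kind of sampling effect the abstract suggests separating from biological variability.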


Subjects
Gene Expression Profiling , Transcriptome , Humans , Genomics/methods , Ecosystem , Ecology
4.
Sci Rep ; 13(1): 4618, 2023 Mar 21.
Article in English | MEDLINE | ID: mdl-36944670

ABSTRACT

The description of physical processes with many-particle systems is a key approach to the modeling of numerous physical systems. For example in storage rings, where ultrarelativistic particles are agglomerated in dense bunches, the modeling and measurement of their phase-space distribution is of paramount importance: at any time the phase-space distribution not only determines the complete space-time evolution but also provides fundamental performance characteristics for storage ring operation. Here, we demonstrate a non-destructive tomographic imaging technique for the 2D longitudinal phase-space distribution of ultrarelativistic electron bunches. For this purpose, we utilize a unique setup, which streams turn-by-turn near-field measurements of bunch profiles at MHz repetition rates. To demonstrate the feasibility of our method, we induce a non-equilibrium state and show that the phase-space distribution microstructuring as well as the phase-space distribution dynamics can be observed in great detail. Our approach offers a pathway to control ultrashort bunches and supports, as one example, the development of compact accelerators with low energy footprints.

5.
Cancers (Basel) ; 14(5)2022 Feb 23.
Article in English | MEDLINE | ID: mdl-35267458

ABSTRACT

The integration of transcriptional data with other layers of information, such as the post-transcriptional regulation mediated by microRNAs, can be crucial to identify the driver genes and the subtypes of complex and heterogeneous diseases such as cancer. This paper presents an approach based on topic modeling to accomplish this integration task. More specifically, we show how an algorithm based on a hierarchical version of stochastic block modeling can be naturally extended to integrate any combination of 'omics data. We test this approach on breast cancer samples from the TCGA database, integrating data on messenger RNA, microRNAs, and copy number variations. We show that the inclusion of the microRNA layer significantly improves the accuracy of subtype classification. Moreover, some of the hidden structures or "topics" that the algorithm extracts correspond to genes and microRNAs involved in breast cancer development and are associated with survival probability.

6.
Cell Rep ; 38(12): 110547, 2022 03 22.
Article in English | MEDLINE | ID: mdl-35320714

ABSTRACT

The sense of smell helps us navigate the environment, but its molecular architecture and underlying logic remain understudied. The spatial location of odorant receptor genes (Olfrs) in the nose is thought to be independent of the structural diversity of the odorants they detect. Using spatial transcriptomics, we create a genome-wide 3D atlas of the mouse olfactory mucosa (OM). Topographic maps of genes differentially expressed in space reveal that both Olfrs and non-Olfrs are distributed in a continuous and overlapping fashion over at least five broad zones in the OM. The spatial locations of Olfrs correlate with the mucus solubility of the odorants they recognize, providing direct evidence for the chromatographic theory of olfaction. This resource resolves the molecular architecture of the mouse OM and will inform future studies on mechanisms underlying Olfr gene choice, axonal pathfinding, patterning of the nervous system, and basic logic for the peripheral representation of smell.


Subjects
Receptors, Odorant , Smell , Animals , Logic , Mice , Odorants/analysis , Receptors, Odorant/genetics , Smell/genetics , Transcriptome/genetics
7.
PLoS Comput Biol ; 17(12): e1009638, 2021 12.
Article in English | MEDLINE | ID: mdl-34871317

ABSTRACT

This work studies the effects of the two rounds of Whole Genome Duplication (WGD) at the origin of the vertebrate lineage on the architecture of the human gene regulatory networks. We integrate information on transcriptional regulation, miRNA regulation, and protein-protein interactions to comparatively analyse the role of WGD and Small Scale Duplications (SSD) in the structural properties of the resulting multilayer network. We show that complex network motifs, such as combinations of feed-forward loops and bifan arrays, deriving from WGD events are specifically enriched in the network. Pairs of WGD-derived proteins display a strong tendency to interact both with each other and with common partners and WGD-derived transcription factors play a prominent role in the retention of a strong regulatory redundancy. Combinatorial regulation and synergy between different regulatory layers are in general enhanced by duplication events, but the two types of duplications contribute in different ways. Overall, our findings suggest that the two WGD events played a substantial role in increasing the multi-layer complexity of the vertebrate regulatory network by enhancing its combinatorial organization, with potential consequences on its overall robustness and ability to perform high-level functions like signal integration and noise control. Lastly, we discuss in detail the RAR/RXR pathway as an illustrative example of the evolutionary impact of WGD duplications in human.


Subjects
Evolution, Molecular , Gene Duplication/genetics , Gene Regulatory Networks/genetics , Genome, Human/genetics , Animals , Genomics , Humans , Models, Genetic , Vertebrates/genetics
8.
Life (Basel) ; 11(9)2021 Sep 14.
Article in English | MEDLINE | ID: mdl-34575116

ABSTRACT

FAD synthase is the last enzyme in the pathway that converts riboflavin into FAD. In Saccharomyces cerevisiae, the gene encoding FAD synthase is FAD1, from which a sole protein product (Fad1p) is expected to be generated. In this work, we showed that a natural Fad1p exists in yeast mitochondria and that, in its recombinant form, the protein is able, per se, both to enter mitochondria and to be destined to the cytosol. Thus, we propose that FAD1 generates two echoforms, that is, two identical proteins addressed to different subcellular compartments. To shed light on the mechanism underlying the subcellular destination of Fad1p, the 3' region of FAD1 mRNA was analyzed by 3'RACE experiments, which revealed the existence of (at least) two FAD1 transcripts with different 3'UTRs, the short one being 128 bp and the long one being 759 bp. Bioinformatic analysis of these 3'UTRs allowed us to predict the existence of a cis-acting mitochondrial localization motif, present in both transcripts and, presumably, involved in protein targeting based on the 3'UTR context. Here, we propose that the long FAD1 transcript might be responsible for the generation of the mitochondrial Fad1p echoform.

9.
Cancers (Basel) ; 12(12)2020 Dec 16.
Article in English | MEDLINE | ID: mdl-33339347

ABSTRACT

Topic modeling is a widely used technique to extract relevant information from large arrays of data. The problem of finding a topic structure in a dataset was recently recognized to be analogous to the community detection problem in network theory. Leveraging on this analogy, a new class of topic modeling strategies has been introduced to overcome some of the limitations of classical methods. This paper applies these recent ideas to TCGA transcriptomic data on breast and lung cancer. The established cancer subtype organization is well reconstructed in the inferred latent topic structure. Moreover, we identify specific topics that are enriched in genes known to play a role in the corresponding disease and are strongly related to the survival probability of patients. Finally, we show that a simple neural network classifier operating in the low dimensional topic space is able to predict with high accuracy the cancer subtype of a test expression sample.

10.
Genome Res ; 30(10): 1492-1507, 2020 10.
Article in English | MEDLINE | ID: mdl-32978246

ABSTRACT

The quantification of the kinetic rates of RNA synthesis, processing, and degradation is largely based on the integrative analysis of total and nascent transcription, the latter being quantified through RNA metabolic labeling. We developed INSPEcT-, a computational method based on the mathematical modeling of premature and mature RNA expression that is able to quantify kinetic rates from steady-state or time course total RNA-seq data without requiring any information on nascent transcripts. Our approach outperforms available solutions, closely recapitulates the kinetic rates obtained through RNA metabolic labeling, improves the ability to detect changes in transcript half-lives, reduces the cost and complexity of the experiments, and can be adopted to study experimental conditions in which nascent transcription cannot be readily profiled. Finally, we applied INSPEcT- to the characterization of post-transcriptional regulation landscapes in dozens of physiological and disease conditions. This approach was included in the INSPEcT Bioconductor package, which can now unveil RNA dynamics from steady-state or time course data, with or without the profiling of nascent RNA.
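
The abstract does not state the model equations. A common way to couple premature and mature RNA, assumed here purely for illustration, is a two-species linear system with synthesis rate $k_1$, processing rate $k_2$ and degradation rate $k_3$ (symbols introduced here, with $P$ and $M$ denoting premature and mature RNA abundance):

```latex
\begin{align}
\frac{dP}{dt} &= k_1 - k_2\,P, \\
\frac{dM}{dt} &= k_2\,P - k_3\,M, \\
\text{steady state:}\quad P^{*} &= \frac{k_1}{k_2}, \qquad M^{*} = \frac{k_1}{k_3}, \qquad \frac{P^{*}}{M^{*}} = \frac{k_3}{k_2}.
\end{align}
```

Under this assumed model, a single steady-state measurement of $P^{*}$ and $M^{*}$ constrains only the ratio $k_3/k_2$, which suggests why time course data or comparisons across conditions carry additional information about the individual rates.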


Subjects
RNA-Seq , RNA/metabolism , Computational Biology/methods , Disease/genetics , Gene Expression , Genome , Humans , Kinetics , RNA/biosynthesis , RNA Processing, Post-Transcriptional , RNA-Seq/methods , Thiouridine
11.
Genome Biol Evol ; 12(11): 2045-2059, 2020 11 03.
Article in English | MEDLINE | ID: mdl-32986810

ABSTRACT

Retrotransposons, DNA sequences capable of creating copies of themselves, compose about half of the human genome and played a central role in the evolution of mammals. Their current position in the host genome is the result of the retrotranscription process and of the following host genome evolution. We apply a model from statistical physics to show that the genomic distribution of the two most populated classes of retrotransposons in human deviates from random placement, and that this deviation increases with time. The time dependence suggests a major role of the host genome dynamics in shaping the current retrotransposon distributions. Focusing on a neutral scenario, we show that a simple model based on random placement followed by genome expansion and sequence duplications can reproduce the empirical retrotransposon distributions, even though more complex and possibly selective mechanisms can have contributed. Besides the inherent interest in understanding the origin of current retrotransposon distributions, this work sets a general analytical framework to analyze quantitatively the effects of genome evolutionary dynamics on the distribution of genomic elements.


Subjects
Alu Elements , Biological Evolution , Genome, Human , Long Interspersed Nucleotide Elements , Models, Genetic , Humans , Mutation
12.
Nucleic Acids Res ; 48(4): 1730-1747, 2020 02 28.
Article in English | MEDLINE | ID: mdl-31889184

ABSTRACT

Heterogeneity is a fundamental feature of complex phenotypes. So far, genomic screenings have profiled thousands of samples, providing insights into the transcriptome of the cell. However, disentangling the heterogeneity of these transcriptomic Big Data to identify defective biological processes remains challenging. Here we present GSECA, a method exploiting the bimodal behavior of RNA-sequencing gene expression profiles to identify altered gene sets in heterogeneous patient cohorts. Using simulated and experimental RNA-sequencing data sets, we show that GSECA outperforms other available algorithms in detecting truly altered biological processes in large cohorts. Applied to 5941 samples from 14 different cancer types, GSECA correctly identified the alteration of the PI3K/AKT signaling pathway driven by the somatic loss of PTEN and verified the emerging role of PTEN in modulating immune-related processes. In particular, we showed that, in prostate cancer, PTEN loss appears to establish an immunosuppressive tumor microenvironment through the activation of STAT3, and low PTEN expression levels have a detrimental impact on patient disease-free survival. GSECA is available at https://github.com/matteocereda/GSECA.


Subjects
Big Data , Exome Sequencing/statistics & numerical data , RNA/genetics , Transcriptome/genetics , Cell Line, Tumor , Disease-Free Survival , Gene Expression Regulation/genetics , Humans , Internet , PTEN Phosphohydrolase/genetics , STAT3 Transcription Factor/genetics , Sequence Analysis, RNA , Signal Transduction/genetics , Software , Tumor Microenvironment/genetics
13.
BMC Bioinformatics ; 20(Suppl 9): 562, 2019 Nov 22.
Article in English | MEDLINE | ID: mdl-31757202

ABSTRACT

This preface introduces the content of the BioMed Central Bioinformatics journal Supplement related to the 15th annual meeting of the Bioinformatics Italian Society, BITS2018. The Conference was held in Torino, Italy, from June 27th to 29th, 2018.


Subjects
Computational Biology , Algorithms , Animals , Caenorhabditis elegans/embryology , Caenorhabditis elegans/genetics , DNA Transposable Elements/genetics , Genomics , Humans , Italy , Software
14.
Epigenomics ; 11(14): 1581-1599, 2019 11.
Article in English | MEDLINE | ID: mdl-31693439

ABSTRACT

Aim: Growing evidence shows a strong interplay between post-transcriptional regulation, mediated by miRNAs (miRs) and epigenetic regulation. Nevertheless, the number of experimentally validated miRs (called epi-miRs) involved in these regulatory circuitries is still very small. Material & methods: We propose a pipeline to prioritize candidate epi-miRs and to identify potential epigenetic interactors of any given miR starting from miR transfection experiment datasets. Results & conclusion: We identified 34 candidate epi-miRs: 19 of them are known epi-miRs, while 15 are new. Moreover, using an in-house generated gene expression dataset, we experimentally proved that a component of the polycomb-repressive complex 2, the histone methyltransferase enhancer of zeste homolog 2 (EZH2), interacts with miR-214, a well-known prometastatic miR in melanoma and breast cancer, highlighting a miR-214-EZH2 regulatory axis potentially relevant in tumor progression.


Subjects
Epigenesis, Genetic/genetics , MicroRNAs/genetics , Breast Neoplasms/genetics , Cell Line, Tumor , Enhancer of Zeste Homolog 2 Protein/genetics , Female , Gene Expression Regulation, Neoplastic/genetics , Humans , Melanoma/genetics , Polycomb Repressive Complex 2/genetics , Transfection/methods
15.
J Synchrotron Radiat ; 26(Pt 5): 1514-1522, 2019 Sep 01.
Article in English | MEDLINE | ID: mdl-31490139

ABSTRACT

Free-electron lasers (FELs) based on superconducting accelerator technology and storage ring facilities operate with bunch repetition rates in the MHz range, and the need arises for bunch-by-bunch electron and photon diagnostics. For photon-pulse-resolved measurements of spectral distributions, fast one-dimensional profile monitors are required. The linear array detector KALYPSO (KArlsruhe Linear arraY detector for MHz-rePetition rate SpectrOscopy) has been developed for electron bunch or photon pulse synchronous read-out with frame rates of up to 2.7 MHz. At the FLASH facility at DESY, a current version of KALYPSO with 256 pixels has been installed at a grating spectrometer as online diagnostics to monitor the pulse-resolved spectra of the high-repetition-rate FEL pulses. Application-specific front-end electronics based on MicroTCA standard have been developed for data acquisition and processing. Continuous data read-out with low latency in the microsecond range enables the integration into fast feedback applications. In this paper, pulse-resolved FEL spectra recorded at 1.0 MHz repetition rate for various operation conditions at FLASH are presented, and the first application of an adaptive feedback for accelerator control based on photon beam diagnostics is demonstrated.


Subjects
Refractometry/instrumentation , Electrons , Equipment Design , Lasers , Photons , Scattering, Radiation , Synchrotrons
16.
Int J Mol Sci ; 20(13)2019 Jun 26.
Article in English | MEDLINE | ID: mdl-31247897

ABSTRACT

Matrix factorization (MF) is an established paradigm for large-scale biological data analysis with tremendous potential in computational biology. Here, we challenge MF in depicting the molecular bases of epidemiologically described disease-disease (DD) relationships. As a use case, we focus on the inverse comorbidity association between Alzheimer's disease (AD) and lung cancer (LC), described as a lower than expected probability of developing LC in AD patients. To this day, the molecular mechanisms underlying DD relationships remain poorly explained and their better characterization might offer unprecedented clinical opportunities. To this end, we extend our previously designed MF-based framework for the molecular characterization of DD relationships. Considering AD-LC inverse comorbidity as a case study, we highlight multiple molecular mechanisms, among which we confirm the involvement of processes related to the immune system and mitochondrial metabolism. We then distinguish mechanisms specific to LC from those shared with other cancers through a pan-cancer analysis. Additionally, new candidate molecular players, such as estrogen receptor (ER), cadherin 1 (CDH1) and histone deacetylase (HDAC), are pinpointed as factors that might underlie the inverse relationship, opening the way to new investigations. Finally, some lung cancer subtype-specific factors are detected, suggesting the existence of heterogeneity across patients in the context of inverse comorbidity.


Subjects
Alzheimer Disease/epidemiology , Computational Biology , Lung Neoplasms/epidemiology , Models, Biological , Algorithms , Alzheimer Disease/complications , Alzheimer Disease/etiology , Comorbidity , Computational Biology/methods , Humans , Lung Neoplasms/complications , Lung Neoplasms/etiology
17.
Sci Rep ; 9(1): 337, 2019 01 23.
Article in English | MEDLINE | ID: mdl-30674955

ABSTRACT

After its introduction in 1982, the Hopfield model has been extensively applied for classification and pattern recognition. Recently, its great potential in gene expression pattern retrieval has also been shown. Following this line, we develop Hope4Genes, a single-sample class prediction algorithm based on a Hopfield-like model. Unlike previous works, we tested the performance of the algorithm for class prediction, a task of fundamental importance for precision medicine and therapeutic decision-making. Hope4Genes outperformed state-of-the-art methodologies in the field independently of the size of the input dataset, its profiling platform, the number of classes and the typical class imbalance present in biological data. Our results provide encouraging evidence that the Hopfield model, together with the use of its energy for the estimation of false discoveries, is a particularly promising tool for precision medicine.
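
To make the idea of Hopfield-based class prediction concrete, here is a minimal sketch of a classical Hopfield network with Hebbian weights and asynchronous updates. It is not the Hope4Genes algorithm, whose details the abstract does not give; the function names, the binarised (+1/-1) "signatures" and the toy patterns are all assumptions made for illustration, and the energy-based estimation of false discoveries is not implemented.

```python
def train_hopfield(patterns):
    """Hebbian weight matrix for a list of +/-1 patterns (zero diagonal)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def energy(w, state):
    """Hopfield energy E = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n))

def recall(w, state, max_sweeps=10):
    """Asynchronous sign updates until no unit changes (a fixed point)."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(w[i][j] * s[j] for j in range(len(s)))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s

def classify(w, patterns, labels, sample):
    """Label of the stored pattern with the largest overlap after relaxation."""
    final = recall(w, sample)
    overlaps = [sum(a * b for a, b in zip(final, p)) for p in patterns]
    return labels[overlaps.index(max(overlaps))]

# Two toy class signatures (hypothetical, orthogonal by construction).
signatures = [[1, 1, 1, 1, -1, -1, -1, -1],
              [1, -1, 1, -1, 1, -1, 1, -1]]
class_labels = ["A", "B"]
weights = train_hopfield(signatures)

# A sample that matches class "A" except for its first feature.
noisy_sample = [-1, 1, 1, 1, -1, -1, -1, -1]
predicted = classify(weights, signatures, class_labels, noisy_sample)
```

With symmetric weights and a zero diagonal, each asynchronous update cannot increase the energy, so the relaxed state has energy no higher than the input; a quantity of this kind is what an energy-based confidence measure would plausibly build on.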


Subjects
Algorithms , Computational Biology/methods , Gene Expression Profiling/methods , Animals , Humans , Rats
18.
Nucleic Acids Res ; 47(5): 2205-2215, 2019 03 18.
Article in English | MEDLINE | ID: mdl-30657980

ABSTRACT

MicroRNAs play important roles in many biological processes. Their aberrant expression can have oncogenic or tumor suppressor function, participating directly in carcinogenesis, malignant transformation, invasiveness and metastasis. Indeed, miRNA profiles can not only distinguish between normal and cancerous tissue but also successfully classify different subtypes of a particular cancer. Here, we focus on a particular class of transcripts encoding polycistronic miRNA genes that yield multiple miRNA components. We describe 'clustered MiRNA Master Regulator Analysis (ClustMMRA)', a fully redesigned release of the MMRA computational pipeline (MiRNA Master Regulator Analysis), developed to search for clustered miRNAs potentially driving cancer molecular subtyping. Genomically clustered miRNAs are frequently co-expressed to target different components of pro-tumorigenic signaling pathways. By applying ClustMMRA to breast cancer patient data, we identified key miRNA clusters driving the phenotype of different tumor subgroups. The pipeline was applied to two independent breast cancer datasets, providing statistically concordant results between the two analyses. We validated in cell lines miR-199/miR-214 as a novel cluster of miRNAs promoting the triple negative breast cancer (TNBC) phenotype through its control of proliferation and EMT.


Subjects
Epithelial-Mesenchymal Transition/genetics , MicroRNAs/genetics , Multigene Family/genetics , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/pathology , Cell Line, Tumor , Cell Proliferation , Datasets as Topic , Gene Silencing , Humans , Neoplasm Invasiveness/genetics , Reproducibility of Results , Triple Negative Breast Neoplasms/classification
19.
Genes (Basel) ; 10(1)2019 01 08.
Article in English | MEDLINE | ID: mdl-30626100

ABSTRACT

N6-methyladenosine (m6A) is the most abundant RNA modification. It has been implicated in the regulation of RNA metabolism, including degradation and translation, in both physiological and disease conditions. A recent study showed that m6A-mediated degradation of key transcripts also plays a role in the control of T cell homeostasis and IL-7 induced differentiation. We re-analyzed the omics data from that study and, through the integrative analysis of total and nascent RNA-seq data, we were able to comprehensively quantify T cell RNA dynamics and how these are affected by m6A depletion. In addition to the expected impact on RNA degradation, we revealed a broader effect of m6A on RNA dynamics, which included the alteration of RNA synthesis and processing. Altogether, the combined action of m6A on all major steps of the RNA life cycle closely recapitulated the observed changes in the abundance of premature and mature RNA species. Ultimately, our re-analysis extended the findings of the initial study, which focused on RNA stability, and proposed a previously unappreciated role for m6A in RNA synthesis and processing dynamics.


Subjects
Adenosine/analogs & derivatives , Cell Differentiation , RNA Processing, Post-Transcriptional , T-Lymphocytes/metabolism , Adenosine/metabolism , Animals , Mice , Models, Theoretical , RNA Stability , T-Lymphocytes/cytology
20.
Cell Syst ; 7(1): 3-4, 2018 07 25.
Article in English | MEDLINE | ID: mdl-30048619

ABSTRACT

A new study coupling bioinformatic and experimental investigations highlights the importance of combinatorial microRNA targeting in human EMT, a phenotypic program underlying normal and pathological processes.


Subjects
Epithelial-Mesenchymal Transition , MicroRNAs , Computational Biology , Gene Expression Regulation , Humans