Results 1 - 20 of 46
1.
Neoplasia ; 51: 100987, 2024 05.
Article in English | MEDLINE | ID: mdl-38489912

ABSTRACT

Gene fusions are common in high-grade serous ovarian cancer (HGSC). Such genetic lesions may promote tumorigenesis, but the pathogenic mechanisms are currently poorly understood. Here, we investigated the role of a PIK3R1-CCDC178 fusion identified from a patient with advanced HGSC. We show that the fusion induces HGSC cell migration by regulating ERK1/2 and increases resistance to platinum treatment. Platinum resistance was associated with rod and ring-like cellular structure formation. These structures contained, in addition to the fusion protein, CIN85, a key regulator of PI3K-AKT-mTOR signaling. Our data suggest that the fusion-driven structure formation induces a previously unrecognized cell survival and resistance mechanism, which depends on ERK1/2-activation.


Subject(s)
Class Ia Phosphatidylinositol 3-Kinase , Drug Resistance, Neoplasm , MAP Kinase Signaling System , Oncogene Proteins, Fusion , Ovarian Neoplasms , Phosphatidylinositol 3-Kinases , Female , Humans , Class Ia Phosphatidylinositol 3-Kinase/genetics , Class Ia Phosphatidylinositol 3-Kinase/metabolism , Drug Resistance, Neoplasm/genetics , MAP Kinase Signaling System/genetics , Ovarian Neoplasms/drug therapy , Ovarian Neoplasms/genetics , Ovarian Neoplasms/metabolism , Phosphatidylinositol 3-Kinases/genetics , Phosphatidylinositol 3-Kinases/metabolism , Platinum , Oncogene Proteins, Fusion/genetics , Oncogene Proteins, Fusion/metabolism , Cytoskeletal Proteins/genetics , Cytoskeletal Proteins/metabolism
2.
PLoS One ; 19(3): e0289699, 2024.
Article in English | MEDLINE | ID: mdl-38512819

ABSTRACT

MicroRNAs (miRNAs) are small molecules that play an essential role in regulating gene expression by post-transcriptional gene silencing. Their study is crucial in revealing the fundamental processes underlying pathologies and, in particular, cancer. To date, most studies on miRNA regulation consider the effect of specific miRNAs on specific target mRNAs, providing wet-lab validation. However, few tools have been developed to explain miRNA-mediated regulation at the protein level. This paper presents MoPC, a computational tool that relies on the partial correlation between mRNAs and proteins, conditioned on miRNA expression, to predict miRNA-target interactions in multi-omic datasets. MoPC returns the list of significant miRNA-target interactions and plots the significant correlations on a heatmap in which the miRNAs and targets are ordered by chromosomal location. The software was applied to three TCGA/CPTAC datasets (breast, glioblastoma, and lung cancer), returning enriched results in three independent target databases.
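The partial correlation that the MoPC abstract refers to can be illustrated with a minimal sketch. This is not the MoPC implementation: the function name `partial_correlation`, the synthetic data, and the use of the first-order partial-correlation formula are assumptions made only for illustration.

```python
import numpy as np

def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Synthetic values (not real data): "mRNA" and "protein" are both driven
# only by the same "miRNA" signal plus independent noise.
rng = np.random.default_rng(0)
mirna = rng.normal(size=2000)
mrna = -2.0 * mirna + rng.normal(size=2000)
protein = -2.0 * mirna + rng.normal(size=2000)

raw = np.corrcoef(mrna, protein)[0, 1]                    # high: shared driver
conditioned = partial_correlation(mrna, protein, mirna)   # near zero
```

In this toy setting the raw mRNA-protein correlation is high, while the correlation conditioned on the miRNA is close to zero, which is the kind of contrast a partial-correlation approach can exploit.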


Subject(s)
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , MicroRNAs/metabolism , Proteome/genetics , Proteome/metabolism , Neoplasms/genetics , Software , RNA, Messenger/genetics , RNA, Messenger/metabolism , Computational Biology/methods , Gene Expression Profiling , Gene Expression Regulation, Neoplastic
3.
IEEE Trans Med Imaging ; 43(4): 1412-1421, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38015690

ABSTRACT

The use of Multi-Instance Learning (MIL) for classifying Whole Slide Images (WSIs) has recently increased. Due to their gigapixel size, the pixel-level annotation of such data is extremely expensive and time-consuming, and practically unfeasible. For this reason, multiple automatic approaches have been proposed in recent years to support clinical practice and diagnosis. Unfortunately, most state-of-the-art proposals apply attention mechanisms without considering the spatial instance correlation and usually work on a single-scale resolution. To leverage the full potential of pyramidal structured WSI, we propose a graph-based multi-scale MIL approach, DAS-MIL. Our model comprises three modules: i) a self-supervised feature extractor; ii) a graph-based architecture that precedes the MIL mechanism and aims at creating a more contextualized representation of the WSI structure by considering the mutual (spatial) instance correlation both inter- and intra-scale; and iii) a (self) distillation loss between resolutions, introduced to compensate for their informative gap and significantly improve the final prediction. The effectiveness of the proposed framework is demonstrated on two well-known datasets, where we outperform SOTA on WSI classification, gaining a +2.7% AUC and +3.7% accuracy on the popular Camelyon16 benchmark.

4.
BMC Bioinformatics ; 24(1): 443, 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37993778

ABSTRACT

Messenger RNA (mRNA) has an essential role in the protein production process. Predicting mRNA expression levels accurately is crucial for understanding gene regulation, and various models (statistical and neural network-based) have been developed for this purpose. A few models predict mRNA expression levels from the DNA sequence, exploiting the DNA sequence and gene features (e.g., number of exons/introns, gene length). Other models include information about long-range interaction molecules (i.e., enhancers/silencers) and transcriptional regulators as predictive features, such as transcription factors (TFs) and small RNAs (e.g., microRNAs - miRNAs). Recently, a convolutional neural network (CNN) model, called Xpresso, has been proposed for mRNA expression level prediction leveraging the promoter sequence and mRNAs' half-life features (gene features). To push forward the mRNA level prediction, we present miREx, a CNN-based tool that includes information about miRNA targets and expression levels in the model. Indeed, each miRNA can target specific genes, and the model exploits this information to guide the learning process. Not all miRNAs are included: only a selected subset with the highest impact on the model is used. miREx has been evaluated on four cancer primary sites from the Genomic Data Commons (GDC) database: lung, kidney, breast, and corpus uteri. Results show that mRNA level prediction benefits from selected miRNA targets and expression information. Future model developments could include other transcriptional regulators or be trained with proteomics data to infer protein levels.


Subject(s)
MicroRNAs , MicroRNAs/genetics , RNA, Messenger/genetics , Mirex , Gene Expression Regulation , Transcription Factors/genetics , Gene Expression Profiling
5.
Comput Methods Programs Biomed ; 234: 107504, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37004267

ABSTRACT

BACKGROUND AND OBJECTIVE: The functions of an organism and its biological processes result from the expression of genes and proteins. Therefore, quantifying and predicting mRNA and protein levels is a crucial aspect of scientific research. Concerning the prediction of mRNA levels, the available approaches use the sequence upstream and downstream of the Transcription Start Site (TSS) as input to neural networks. State-of-the-art models (e.g., Xpresso and Basenji) predict mRNA levels by exploiting Convolutional (CNN) or Long Short-Term Memory (LSTM) networks. However, CNN prediction depends on the convolutional kernel size, and LSTMs struggle to capture long-range dependencies in the sequence. Concerning the prediction of protein levels, as far as we know, there is no model for predicting protein levels by exploiting the gene or protein sequences. METHODS: Here, we exploit a new model type (called Perceiver) for mRNA and protein level prediction, exploiting a Transformer-based architecture with an attention module to attend to long-range interactions in the sequences. In addition, the Perceiver model overcomes the quadratic complexity of the standard Transformer architectures. This work's contributions are: 1. the DNAPerceiver model to predict mRNA levels from the sequence upstream and downstream of the TSS; 2. the ProteinPerceiver model to predict protein levels from the protein sequence; 3. the Protein&DNAPerceiver model to predict protein levels from TSS and protein sequences. RESULTS: The models are evaluated on cell lines, mice, glioblastoma, and lung cancer tissues. The results show the effectiveness of the Perceiver-type models in predicting mRNA and protein levels. CONCLUSIONS: This paper presents a Perceiver architecture for mRNA and protein level prediction. In the future, inserting regulatory and epigenetic information into the model could improve mRNA and protein level predictions. The source code is freely available at https://github.com/MatteoStefanini/DNAPerceiver.
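The remark about quadratic complexity can be made concrete with a small sketch of the general Perceiver idea: a short latent array attends to a long input, so the attention matrix has N × M entries rather than M × M. The code below is an illustration under stated assumptions, not the DNAPerceiver code: the function name `cross_attention`, the array sizes, and the omission of learned query/key/value projections are all simplifications.

```python
import numpy as np

def cross_attention(latents, inputs):
    """Single-head cross-attention: latents (N, d) query inputs (M, d)."""
    d = latents.shape[1]
    scores = latents @ inputs.T / np.sqrt(d)        # (N, M), not (M, M)
    scores -= scores.max(axis=1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over inputs
    return weights @ inputs, weights

rng = np.random.default_rng(0)
inputs = rng.normal(size=(10_000, 16))   # M = 10,000 sequence positions
latents = rng.normal(size=(64, 16))      # N = 64 latent vectors
out, weights = cross_attention(latents, inputs)
```

With these hypothetical sizes the attention matrix holds 64 × 10,000 weights, whereas self-attention over the same input would need 10,000 × 10,000.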


Subject(s)
DNA , Neural Networks, Computer , Animals , Mice , Algorithms , Proteins/genetics , RNA, Messenger/genetics
6.
Comput Methods Programs Biomed ; 225: 107035, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35970054

ABSTRACT

BACKGROUND AND OBJECTIVES: In recent years, the prediction of gene expression levels has been crucial due to its potential applications in the clinic. In this context, Xpresso and other methods based on Convolutional Neural Networks and Transformers were first proposed for this purpose. However, all these methods embed data with a standard one-hot encoding algorithm, resulting in very sparse matrices. In addition, post-transcriptional regulation processes, which are of utmost importance in the gene expression process, are not considered in the model. METHODS: This paper presents Transformer DeepLncLoc, a novel method to predict the abundance of the mRNA (i.e., gene expression levels) by processing gene promoter sequences, managing the problem as a regression task. The model exploits a transformer-based architecture, introducing the DeepLncLoc method to perform the data embedding. Because DeepLncLoc is based on the word2vec algorithm, it avoids the sparse matrices problem. RESULTS: Post-transcriptional information related to mRNA stability and transcription factors is included in the model, leading to significantly improved performance compared to state-of-the-art methods. Transformer DeepLncLoc reached an R2 of 0.76, compared to 0.74 for Xpresso. CONCLUSION: The multi-head attention mechanism that characterizes the transformer methodology is suitable for modeling the interactions between DNA locations, overcoming the limitations of recurrent models. Finally, the integration of the transcription factors data in the pipeline leads to impressive gains in predictive power.
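The sparsity that the abstract attributes to one-hot encoding is easy to see in a small sketch. The helper `one_hot` and the example sequence are hypothetical and are not taken from the paper's code.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(sequence):
    """Encode a DNA string as a (length, 4) matrix with a single 1 per row."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(sequence), len(ALPHABET)), dtype=np.int8)
    for position, base in enumerate(sequence):
        encoded[position, index[base]] = 1
    return encoded

encoded = one_hot("ACGTTGCA")
sparsity = 1.0 - encoded.mean()   # fraction of entries that are zero
```

For a four-letter alphabet three of every four entries are zero regardless of sequence length, which is the sparsity that dense learned embeddings such as word2vec-style vectors avoid.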


Subject(s)
DNA , Transcription Factors , Base Sequence , DNA/genetics , Gene Expression , RNA, Messenger/genetics , Transcription Factors/genetics
7.
BMC Bioinformatics ; 23(1): 295, 2022 Jul 24.
Article in English | MEDLINE | ID: mdl-35871688

ABSTRACT

MOTIVATION: Computer-aided analysis of biological images typically requires extensive training on large-scale annotated datasets, which is not viable in many situations. In this paper, we present Generative Adversarial Network Discriminator Learner (GAN-DL), a novel self-supervised learning paradigm based on the StyleGAN2 architecture, which we employ for self-supervised image representation learning in the case of fluorescent biological images. RESULTS: We show that Wasserstein Generative Adversarial Networks enable high-throughput compound screening based on raw images. We demonstrate this by classifying active and inactive compounds tested for the inhibition of SARS-CoV-2 infection in two different cell models: the primary human renal cortical epithelial cells (HRCE) and the African green monkey kidney epithelial cells (VERO). In contrast to previous methods, our deep learning-based approach does not require any annotation, and can also be used to solve subtle tasks it was not specifically trained on, in a self-supervised manner. For example, it can effectively derive a dose-response curve for the tested treatments. AVAILABILITY AND IMPLEMENTATION: Our code and embeddings are available at https://gitlab.com/AlesioRFM/gan-dl ; StyleGAN2 is available at https://github.com/NVlabs/stylegan2 .


Subject(s)
COVID-19 , Image Processing, Computer-Assisted , Animals , Cell Count , Chlorocebus aethiops , Humans , Image Processing, Computer-Assisted/methods , SARS-CoV-2 , Supervised Machine Learning
8.
J Biomed Inform ; 129: 104057, 2022 05.
Article in English | MEDLINE | ID: mdl-35339665

ABSTRACT

It is estimated that oncogenic gene fusions cause about 20% of human cancer morbidity. Identifying potentially oncogenic gene fusions may improve affected patients' diagnosis and treatment. Previous approaches to this issue included exploiting specific gene-related information, such as gene function and regulation. Here we propose a model that profits from the previous findings and includes the microRNAs in the oncogenic assessment. We present ChimerDriver, a tool to classify gene fusions as oncogenic or not oncogenic. ChimerDriver is based on a specifically designed neural network and trained on genetic and post-transcriptional information to obtain a reliable classification. The designed neural network integrates information related to transcription factors, gene ontologies, microRNAs and other detailed information related to the functions of the genes involved in the fusion and the gene fusion structure. As a result, performance on the test set reached an F1-score of 0.83 and a recall of 96%. The comparison with state-of-the-art tools returned comparable or higher results. Moreover, ChimerDriver performed well in a real-world case where 21 out of 24 validated gene fusion samples were detected by the gene fusion detection tool STAR-Fusion. ChimerDriver integrates transcriptional and post-transcriptional information in an ad-hoc designed neural network to effectively discriminate oncogenic gene fusions from passenger ones. ChimerDriver source code is freely available at https://github.com/martalovino/ChimerDriver.


Subject(s)
MicroRNAs , Gene Fusion , Humans , MicroRNAs/genetics , Neural Networks, Computer , Oncogene Fusion , Software
9.
BMC Bioinformatics ; 23(1): 18, 2022 Jan 06.
Article in English | MEDLINE | ID: mdl-34991448

ABSTRACT

BACKGROUND: The function of non-coding RNA sequences is largely determined by their spatial conformation, namely the secondary structure of the molecule, formed by Watson-Crick interactions between nucleotides. Hence, modern RNA alignment algorithms routinely take structural information into account. In order to discover yet unknown RNA families and infer their possible functions, the structural alignment of RNAs is an essential task. This task demands a lot of computational resources, especially for aligning many long sequences, and it therefore requires efficient algorithms that utilize modern hardware when available. A subset of the secondary structures contains overlapping interactions (called pseudoknots), which add additional complexity to the problem and are often ignored in available software. RESULTS: We present the SeqAn-based software LaRA 2 that is significantly faster than comparable software for accurate pairwise and multiple alignments of structured RNA sequences. In contrast to other programs our approach can handle arbitrary pseudoknots. As an improved re-implementation of the LaRA tool for structural alignments, LaRA 2 uses multi-threading and vectorization for parallel execution and a new heuristic for computing a lower boundary of the solution. Our algorithmic improvements yield a program that is up to 130 times faster than the previous version. CONCLUSIONS: With LaRA 2 we provide a tool to analyse large sets of RNA secondary structures in relatively short time, based on structural alignment. The produced alignments can be used to derive structural motifs for the search in genomic databases.


Subject(s)
RNA , Software , Algorithms , Base Sequence , Humans , Nucleic Acid Conformation , RNA/genetics , Sequence Alignment , Sequence Analysis, RNA
10.
BMC Bioinformatics ; 22(1): 360, 2021 Jul 03.
Article in English | MEDLINE | ID: mdl-34217219

ABSTRACT

BACKGROUND: Tumors are composed of a number of cancer cell subpopulations (subclones), characterized by a distinguishable set of mutations. This phenomenon, known as intra-tumor heterogeneity (ITH), may be studied using Copy Number Aberrations (CNAs). Nowadays ITH can be assessed at the highest possible resolution using single-cell DNA (scDNA) sequencing technology. Additionally, single-cell CNA (scCNA) profiles from multiple samples of the same tumor can in principle be exploited to study the spatial distribution of subclones within a tumor mass. However, since the technology required to generate large scDNA sequencing datasets is relatively recent, dedicated analytical approaches are still lacking. RESULTS: We present PhyliCS, the first tool which exploits scCNA data from multiple samples from the same tumor to estimate whether the different clones of a tumor are well mixed or spatially separated. Starting from the CNA data produced with third-party instruments, it computes a score, the Spatial Heterogeneity score, aimed at distinguishing spatially intermixed cell populations from spatially segregated ones. Additionally, it provides functionalities to facilitate scDNA analysis, such as feature selection and dimensionality reduction methods, visualization tools and a flexible clustering module. CONCLUSIONS: PhyliCS represents a valuable instrument to explore the extent of spatial heterogeneity in multi-regional tumour sampling, exploiting the potential of scCNA data.


Subject(s)
DNA Copy Number Variations , Neoplasms , Cluster Analysis , Genetic Heterogeneity , Humans , Sequence Analysis, DNA , Single-Cell Analysis
11.
Bioinformatics ; 37(19): 3353-3355, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33772596

ABSTRACT

MOTIVATION: Fusion genes are both useful cancer biomarkers and important drug targets. Finding relevant fusion genes is challenging due to genomic instability resulting in a high number of passenger events. To reveal and prioritize relevant gene fusion events we have developed FUsionN Gene Identification toolset (FUNGI) that uses an ensemble of fusion detection algorithms with prioritization and visualization modules. RESULTS: We applied FUNGI to an ovarian cancer dataset of 107 tumor samples from 36 patients. Ten out of 11 detected and prioritized fusion genes were validated. Many of the detected fusion genes affect the PI3K-AKT pathway, with a potential role in treatment resistance. AVAILABILITY AND IMPLEMENTATION: FUNGI and its documentation are available at https://bitbucket.org/alejandra_cervera/fungi as standalone or from Anduril at https://www.anduril.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

12.
J Anat ; 237(5): 988-997, 2020 11.
Article in English | MEDLINE | ID: mdl-32579747

ABSTRACT

Dorsal root ganglia (DRGs) host the somata of sensory neurons which convey information from the periphery to the central nervous system. These neurons have heterogeneous size and neurochemistry, and those of small-to-medium size, which play an important role in nociception, form two distinct subpopulations based on the presence (peptidergic) or absence (non-peptidergic) of transmitter neuropeptides. Few investigations have so far addressed the spatial relationship between neurochemically different subpopulations of DRG neurons and glia. We used a whole-mount mouse lumbar DRG preparation, confocal microscopy and computer-aided 3D analysis to unveil that IB4+ non-peptidergic neurons form small clusters of 4.7 ± 0.26 cells, differently from CGRP+ peptidergic neurons that are, for the most part, isolated (1.89 ± 0.11 cells). Both subpopulations of neurons are ensheathed by a thin layer of satellite glial cells (SGCs) that can be observed after immunolabeling with the specific marker glutamine synthetase (GS). Notably, at the ultrastructural level we observed that this glial layer was discontinuous, as there were patches of direct contact between the membranes of two adjacent IB4+ neurons. To test whether this cytoarchitectonic organization was modified in diabetic neuropathy, one of the most devastating sensory pathologies, mice were made diabetic by streptozotocin (STZ). In diabetic animals, cluster organization of the IB4+ non-peptidergic neurons was maintained, but the neuro-glial relationship was altered, as STZ treatment caused a statistically significant increase of GS staining around CGRP+ neurons but a reduction around IB4+ neurons. Ultrastructural analysis unveiled that SGC coverage was increased at the interface between IB4+ cluster-forming neurons in diabetic mice, with a 50% reduction in the points of direct contact between cells. These observations demonstrate the existence of a structural plasticity of the DRG cytoarchitecture in response to STZ.


Subject(s)
Diabetes Mellitus, Experimental/pathology , Ganglia, Spinal/ultrastructure , Neuroglia/ultrastructure , Animals , Calcitonin Gene-Related Peptide/metabolism , Ganglia, Spinal/metabolism , Glutamate-Ammonia Ligase/metabolism , Glycoproteins/metabolism , Male , Mice , Neuroglia/enzymology
13.
Bioinformatics ; 36(10): 3248-3250, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32016382

ABSTRACT

SUMMARY: In the last decade, increasing attention has been paid to the study of gene fusions. However, the problem of determining whether a gene fusion is a cancer driver or just a passenger mutation is still an open issue. Here we present DEEPrior, an inherently flexible deep learning tool with two modes (Inference and Retraining). Inference mode predicts the probability of a gene fusion being involved in an oncogenic process, by directly exploiting the amino acid sequence of the fused protein. Retraining mode allows users to obtain a custom prediction model including new data that they provide. AVAILABILITY AND IMPLEMENTATION: Both DEEPrior and the protein fusions dataset are freely available from GitHub at (https://github.com/bioinformatics-polito/DEEPrior). The tool was designed to operate in Python 3.7, with minimal additional libraries. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Deep Learning , Software , Gene Fusion , Probability , Proteins
14.
Bioinformatics ; 36(9): 2705-2711, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31999333

ABSTRACT

MOTIVATION: High-throughput next-generation sequencing can generate huge sequence files, whose analysis requires alignment algorithms that are typically very demanding in terms of memory and computational resources. This is a significant issue, especially for machines with limited hardware capabilities. As the redundancy of the sequences typically increases with coverage, collapsing such files into compact sets of non-redundant reads has the 2-fold advantage of reducing file size and speeding up the alignment, avoiding mapping the same sequence multiple times. METHOD: BioSeqZip generates compact and sorted lists of alignment-ready non-redundant sequences, keeping track of their occurrences in the raw files as well as of their quality score information. By exploiting a memory-constrained external sorting algorithm, it can be executed on either single- or multi-sample datasets even on computers with medium computational capabilities. On request, it can even re-expand the compacted files to their original state. RESULTS: Our extensive experiments on RNA-Seq data show that BioSeqZip considerably brings down the computational costs of a standard sequence analysis pipeline, with particular benefits for the alignment procedures that typically have the highest requirements in terms of memory and execution time. In our tests, BioSeqZip was able to compact 2.7 billion reads into 963 million unique tags, reducing the size of sequence files by up to 70% and speeding up the alignment by at least 50%. AVAILABILITY AND IMPLEMENTATION: BioSeqZip is available at https://github.com/bioinformatics-polito/BioSeqZip. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
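The collapsing step described in this abstract can be sketched in a few lines. This is a toy in-memory illustration, not BioSeqZip itself: the functions `collapse` and `expand` are hypothetical, quality scores are ignored, the external (disk-based) sorting mentioned in the abstract is not reproduced, and the original read order is not restored on expansion.

```python
from collections import Counter

def collapse(reads):
    """Return sorted (sequence, count) pairs, one per distinct read."""
    return sorted(Counter(reads).items())

def expand(tags):
    """Rebuild the multiset of reads; the original order is not preserved."""
    return [seq for seq, count in tags for _ in range(count)]

# Made-up reads for illustration only.
reads = ["ACGT", "TTGA", "ACGT", "ACGT", "GGCC", "TTGA"]
tags = collapse(reads)
```

An aligner would then map each distinct tag once, and the stored counts allow per-read abundances to be recovered afterwards.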


Subject(s)
High-Throughput Nucleotide Sequencing , Software , Algorithms , RNA-Seq , Sequence Analysis, DNA , Exome Sequencing
15.
Bioinformatics ; 36(3): 698-703, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31504201

ABSTRACT

MOTIVATION: MicroRNAs (miRNAs) are small RNA molecules (∼22 nucleotide long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies led to the discovery of isomiRs, which are miRNA sequence variants. While many miRNA-seq analysis tools exist, the diversity of output formats hinders accurate comparisons between tools and precludes data sharing and the development of common downstream analysis methods. RESULTS: To overcome this situation, we present here a community-based project, miRNA Transcriptomic Open Project (miRTOP) working towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream isomiR analysis tools that are compatible with existing detection and quantification tools. Based on the existing GFF3 format, we first created a new standard format, mirGFF3, for the output of miRNA/isomiR detection and quantification results from small RNA-seq data. Additionally, we developed a command line Python tool, mirtop, to create and manage the mirGFF3 format. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, isomiR-SEA, sRNAbench, Prost! as well as BAM files. Some tools have also incorporated the mirGFF3 format directly into their code, such as miRge2.0, IsoMIRmap and OptimiR. Its open architecture enables any tool or pipeline to output or convert results into mirGFF3. Collectively, this isomiR categorization system, along with the accompanying mirGFF3 and mirtop API, provide a comprehensive solution for the standardization of miRNA and isomiR annotation, enabling data sharing, reporting, comparative analyses and benchmarking, while promoting the development of common miRNA methods focusing on downstream steps of miRNA detection, annotation and quantification. AVAILABILITY AND IMPLEMENTATION: https://github.com/miRTop/mirGFF3/ and https://github.com/miRTop/mirtop. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
MicroRNAs , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA , Transcriptome
16.
Cancers (Basel) ; 11(12)2019 Dec 05.
Article in English | MEDLINE | ID: mdl-31817495

ABSTRACT

Approximately 18% of acute myeloid leukemia (AML) cases express a fusion transcript. However, few fusions are recurrent across AML and the identification of these rare chimeras is of interest to characterize AML patients. Here, we studied the transcriptome of 8 adult AML patients with poorly described chromosomal translocation(s), with the aim of identifying novel and rare fusion transcripts. We integrated RNA-sequencing data with multiple approaches including computational analysis, Sanger sequencing, fluorescence in situ hybridization and in vitro studies to assess the oncogenic potential of the ZEB2-BCL11B chimera. We detected 7 different fusions with partner genes involving transcription factors (OAZ-MAFK, ZEB2-BCL11B), tumor suppressors (SAV1-GYPB, PUF60-TYW1, CNOT2-WT1) and rearrangements associated with the loss of NF1 (CPD-PXT1, UTP6-CRLF3). Notably, ZEB2-BCL11B rearrangements co-occurred with FLT3 mutations and were associated with a poorly differentiated or mixed phenotype leukemia. Although the fusion alone did not transform murine c-Kit+ bone marrow cells, 45.4% of 14q32 non-rearranged AML cases were also BCL11B-positive, suggesting a more general and complex mechanism of leukemogenesis associated with BCL11B expression. Overall, by combining different approaches, we described rare fusion events contributing to the complexity of AML and we linked the expression of some chimeras to genomic alterations hitting known genes in AML.

17.
Int J Mol Sci ; 20(8)2019 Apr 25.
Article in English | MEDLINE | ID: mdl-31027180

ABSTRACT

The brain comprises a complex system of neurons interconnected by an intricate network of anatomical links. While recent studies demonstrated the correlation between anatomical connectivity patterns and gene expression of neurons, using transcriptomic information to automatically predict such patterns is still an open challenge. In this work, we present a completely data-driven approach relying on machine learning (i.e., neural networks) to learn the anatomical connection directly from a training set of gene expression data. To do so, we combined gene expression and connectivity data from the Allen Mouse Brain Atlas to generate thousands of gene expression profile pairs from different brain regions. To each pair, we assigned a label describing the physical connection between the corresponding brain regions. Then, we exploited these data to train neural networks, designed to predict brain area connectivity. We assessed our solution on two prediction problems (with three and two connectivity class categories) involving cortical and cerebellum regions. As demonstrated by our results, we distinguish between connected and unconnected regions with 85% prediction accuracy and good balance of precision and recall. In our future work we may extend the analysis to more complex brain structures and consider RNA-Seq data as additional input to our model.


Subject(s)
Brain/physiology , Gene Expression Profiling , Nerve Net/physiology , Algorithms , Animals , Automation , Gene Expression Regulation , Mice , Neural Networks, Computer , Organ Size , ROC Curve
18.
Int J Mol Sci ; 20(7)2019 Apr 02.
Article in English | MEDLINE | ID: mdl-30987060

ABSTRACT

Gene fusions have a very important role in the study of cancer development. In this regard, predicting the probability that a protein fusion transcript will develop into a cancer is a very challenging and not yet fully explored research problem. To date, all the available approaches in the literature try to explain the oncogenic potential of gene fusions based on protein domain analysis, which is cancer-specific and not easy to adapt to newly developed information. In our work, we choose the raw protein sequences as the input baseline, and propose the use of deep learning, and more specifically Convolutional Neural Networks, to infer the oncogenicity probability score of gene fusion transcripts and to group them into a number of categories (e.g., oncogenic/not oncogenic). This is an inherently flexible methodology that, unlike previous approaches, can be re-trained with very little effort on newly available data (for example, from a different cancer). Based on experimental results on a large dataset of pre-annotated gene fusions, our method is able to predict the oncogenicity potential of gene fusion transcripts with an accuracy of about 72%, which increases to 86% if we consider only the instances that are classified with a high confidence level.
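The last sentence of this abstract reports accuracy both over all instances and over only the high-confidence ones. A minimal sketch of that kind of evaluation follows. The function `accuracy`, the confidence definition max(p, 1 - p), the 0.8 threshold, and the scores and labels are all assumptions for illustration, not the paper's procedure or data.

```python
def accuracy(probabilities, labels, min_confidence=0.5):
    """Accuracy over predictions whose confidence max(p, 1 - p) is at least min_confidence.

    Returns (accuracy, number_of_instances_kept); accuracy is None if nothing is kept.
    """
    kept = [(p, y) for p, y in zip(probabilities, labels)
            if max(p, 1 - p) >= min_confidence]
    if not kept:
        return None, 0
    correct = sum((p >= 0.5) == bool(y) for p, y in kept)
    return correct / len(kept), len(kept)

# Made-up scores and labels, purely illustrative.
probs = [0.95, 0.10, 0.60, 0.45, 0.85, 0.55]
labels = [1, 0, 0, 1, 1, 1]
overall = accuracy(probs, labels)                        # all six instances
confident = accuracy(probs, labels, min_confidence=0.8)  # confident subset only
```

Restricting the evaluation to confident predictions trades coverage for accuracy, so the number of instances kept is worth reporting alongside the accuracy figure.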


Subject(s)
Deep Learning , Oncogene Fusion , Algorithms , Humans , Neural Networks, Computer , Probability
19.
Cancer ; 125(5): 712-725, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30480765

ABSTRACT

BACKGROUND: Aneuploidy occurs in more than 20% of acute myeloid leukemia (AML) cases and correlates with an adverse prognosis. METHODS: To understand the molecular bases of aneuploid acute myeloid leukemia (A-AML), this study examined the genomic profile in 42 A-AML cases and 35 euploid acute myeloid leukemia (E-AML) cases. RESULTS: A-AML was characterized by increased genomic complexity based on exonic variants (an average of 26 somatic mutations per sample vs 15 for E-AML). The integration of exome, copy number, and gene expression data revealed alterations in genes involved in DNA repair (eg, SLX4IP, RINT1, HINT1, and ATR) and the cell cycle (eg, MCM2, MCM4, MCM5, MCM7, MCM8, MCM10, UBE2C, USP37, CK2, CK3, CK4, BUB1B, NUSAP1, and E2F) in A-AML, which was associated with a 3-gene signature defined by PLK1 and CDC20 upregulation and RAD50 downregulation and with structural or functional silencing of the p53 transcriptional program. Moreover, A-AML was enriched for alterations in the protein ubiquitination and degradation pathway (eg, increased levels of UHRF1 and UBE2C and decreased UBA3 expression), response to reactive oxygen species, energy metabolism, and biosynthetic processes, which may help in facing the unbalanced protein load. E-AML was associated with BCOR/BCORL1 mutations and HOX gene overexpression. CONCLUSIONS: These findings indicate that aneuploidy-related and leukemia-specific alterations cooperate to tolerate an abnormal chromosome number in AML, and they point to the mitotic and protein degradation machineries as potential therapeutic targets.


Subject(s)
Gene Expression Profiling/methods , Gene Regulatory Networks , Genomics/methods , Leukemia, Myeloid, Acute/genetics , Adult , Aged , Aged, 80 and over , Aneuploidy , Cell Cycle , Chromosome Banding , Female , Gene Dosage , Gene Expression Regulation, Leukemic , Genetic Predisposition to Disease , Humans , Male , Middle Aged , Mutation , Proteolysis , Exome Sequencing , Young Adult