ABSTRACT
MOTIVATION: Fusion genes are both useful cancer biomarkers and important drug targets. Finding relevant fusion genes is challenging because genomic instability produces a high number of passenger events. To reveal and prioritize relevant gene fusion events we have developed the FUsioN Gene Identification toolset (FUNGI), which uses an ensemble of fusion detection algorithms with prioritization and visualization modules. RESULTS: We applied FUNGI to an ovarian cancer dataset of 107 tumor samples from 36 patients. Ten out of 11 detected and prioritized fusion genes were validated. Many of the detected fusion genes affect the PI3K-AKT pathway, with a potential role in treatment resistance. AVAILABILITY AND IMPLEMENTATION: FUNGI and its documentation are available at https://bitbucket.org/alejandra_cervera/fungi as standalone or from Anduril at https://www.anduril.org. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
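The ensemble idea in the abstract above can be illustrated with a minimal sketch. It assumes a simple voting rule (keep a fusion when at least `min_tools` callers report it); this rule, the function name and the tool and gene labels are hypothetical and are not taken from FUNGI itself.

```python
from collections import Counter

def rank_by_caller_support(calls_by_tool, min_tools=2):
    """Return fusions reported by at least `min_tools` callers, most-supported first."""
    support = Counter()
    for fusions in calls_by_tool.values():
        # Use a set so a tool that reports the same fusion twice is counted once.
        support.update(set(fusions))
    kept = [(fusion, n) for fusion, n in support.items() if n >= min_tools]
    # Sort by descending support, then by gene names for a stable order.
    return sorted(kept, key=lambda item: (-item[1], item[0]))

# Hypothetical tool names and calls, for illustration only.
calls_by_tool = {
    "caller_a": [("GENE1", "GENE2"), ("GENE3", "GENE4")],
    "caller_b": [("GENE1", "GENE2"), ("GENE5", "GENE6")],
    "caller_c": [("GENE1", "GENE2"), ("GENE3", "GENE4"), ("GENE3", "GENE4")],
}
ranked = rank_by_caller_support(calls_by_tool)
print(ranked)
```

Counting each tool once per fusion keeps a single verbose caller from dominating the ranking; a real pipeline would likely also weigh read support and annotation.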
ABSTRACT
Approximately 18% of acute myeloid leukemia (AML) cases express a fusion transcript. However, few fusions are recurrent across AML and the identification of these rare chimeras is of interest to characterize AML patients. Here, we studied the transcriptome of 8 adult AML patients with poorly described chromosomal translocation(s), with the aim of identifying novel and rare fusion transcripts. We integrated RNA-sequencing data with multiple approaches including computational analysis, Sanger sequencing, fluorescence in situ hybridization and in vitro studies to assess the oncogenic potential of the ZEB2-BCL11B chimera. We detected 7 different fusions with partner genes involving transcription factors (OAZ-MAFK, ZEB2-BCL11B), tumor suppressors (SAV1-GYPB, PUF60-TYW1, CNOT2-WT1) and rearrangements associated with the loss of NF1 (CPD-PXT1, UTP6-CRLF3). Notably, ZEB2-BCL11B rearrangements co-occurred with FLT3 mutations and were associated with a poorly differentiated or mixed phenotype leukemia. Although the fusion alone did not transform murine c-Kit+ bone marrow cells, 45.4% of 14q32 non-rearranged AML cases were also BCL11B-positive, suggesting a more general and complex mechanism of leukemogenesis associated with BCL11B expression. Overall, by combining different approaches, we described rare fusion events contributing to the complexity of AML and we linked the expression of some chimeras to genomic alterations hitting known genes in AML.
ABSTRACT
Transcriptional enhancers function as docking platforms for combinations of transcription factors (TFs) to control gene expression. How enhancer sequences determine nucleosome occupancy, TF recruitment and transcriptional activation in vivo remains unclear. Using ATAC-seq across a panel of Drosophila inbred strains, we found that SNPs affecting binding sites of the TF Grainy head (Grh) causally determine the accessibility of epithelial enhancers. We show that deletion and ectopic expression of Grh cause loss and gain of DNA accessibility, respectively. However, although Grh binding is necessary for enhancer accessibility, it is insufficient to activate enhancers. Finally, we show that human Grh homologs (GRHL1, GRHL2 and GRHL3) function similarly. We conclude that Grh binding is necessary and sufficient for the opening of epithelial enhancers but not for their activation. Our data support a model positing that complex spatiotemporal expression patterns are controlled by regulatory hierarchies in which pioneer factors, such as Grh, establish tissue-specific accessible chromatin landscapes upon which other factors can act.
Subjects
DNA-Binding Proteins/genetics , Drosophila Proteins/genetics , Nucleosomes/genetics , Transcription Factors/genetics , Animals , Genetically Modified Animals , Binding Sites , Tumor Cell Line , Chromatin/genetics , Drosophila melanogaster/genetics , Genetic Enhancer Elements , Epithelial Cells , Developmental Gene Expression Regulation , Humans , MCF-7 Cells , Single Nucleotide Polymorphism , Transcriptional Activation
ABSTRACT
BACKGROUND: The latest next-generation sequencing technologies have opened a new era of genomic studies, providing novel insights into multifactorial pathologies such as cancer. In particular, the detection and understanding of gene fusions have been greatly advanced by these methods. However, gene fusion identification with state-of-the-art algorithms remains challenging: they report huge numbers of poorly overlapping candidates, and validating every reported fusion in the lab would clearly overwhelm wet-lab capabilities. RESULTS: In this work we propose a novel methodological approach and tool, named FuGePrior, for the prioritization of gene fusions from paired-end RNA-Seq data. The proposed pipeline combines state-of-the-art tools for chimeric transcript discovery and prioritization, a series of filtering and processing steps designed in light of the recent literature on gene fusions, and an analysis of the functional reliability of the gene fusion structure. CONCLUSIONS: FuGePrior performance was assessed on two publicly available paired-end RNA-Seq datasets: the first, by Edgren and colleagues, includes four breast cancer cell lines and a normal breast sample, whereas the second, by Ren and colleagues, comprises fourteen primary prostate cancer samples and their paired normal counterparts. FuGePrior reduced the number of fusions output by the chimeric transcript discovery tools by 65 to 75%, depending on the breast cancer cell line, and by 37 to 65%, depending on the prostate cancer sample. Furthermore, since both datasets come with a partial validation, we were able to assess the performance of FuGePrior in correctly prioritizing real gene fusions. Specifically, 25 out of 26 validated fusions in the breast cancer dataset were correctly labelled as reliable and biologically significant. Similarly, 2 out of 5 validated fusions in the prostate dataset were recognized as priority by FuGePrior.
Subjects
Breast Neoplasms/genetics , Prostatic Neoplasms/genetics , Recombinant Fusion Proteins/genetics , RNA Sequence Analysis , Algorithms , Tumor Cell Line , Genetic Databases , Female , Genomics , Humans , MCF-7 Cells , Male , Recombinant Fusion Proteins/chemistry , Reproducibility of Results , Software
ABSTRACT
In this paper we present VDJSeq-Solver, a methodology and tool to identify clonal lymphocyte populations from paired-end RNA sequencing reads derived from sequencing the mRNA of neoplastic cells. The tool detects the main clone that characterises the tissue of interest by recognizing the most abundant V(D)J rearrangement among those present in the sample under study. The exact sequence reported for the identified clone accounts for the modifications introduced by the enzymatic processes. The proposed tool overcomes the limitations of currently available methods for recognizing lymphocyte rearrangements, which work on a single sequence at a time and are not applicable to high-throughput sequencing data. In this work, VDJSeq-Solver has been applied to correctly detect the main clone and identify its sequence on five Mantle Cell Lymphoma samples; then the tool has been tested on twelve Diffuse Large B-Cell Lymphoma samples. In order to comply with the privacy, ethics and intellectual property policies of the University Hospital and the University of Verona, data is available upon request to supporto.utenti@ateneo.univr.it after signing a mandatory Materials Transfer Agreement. VDJSeq-Solver JAVA/Perl/Bash software implementation is free and available at http://eda.polito.it/VDJSeq-Solver/.
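As a rough illustration of "recognizing the most abundant V(D)J rearrangement", the sketch below counts pre-assigned rearrangements and returns the dominant one. It assumes each read pair has already been assigned a (V, D, J) triple; the function name and segment labels are hypothetical, and this is not the VDJSeq-Solver implementation.

```python
from collections import Counter

def main_clone(rearrangements):
    """Return (rearrangement, fraction of assigned reads) for the most abundant one."""
    counts = Counter(rearrangements)
    if not counts:
        return None  # No read could be assigned to a rearrangement.
    top, n = counts.most_common(1)[0]
    return top, n / sum(counts.values())

# Hypothetical per-read assignments (placeholder segment labels).
assigned = [
    ("V1", "D1", "J1"),
    ("V1", "D1", "J1"),
    ("V2", "D3", "J4"),
    ("V1", "D1", "J1"),
    ("V5", "D2", "J2"),
    ("V1", "D1", "J1"),
]
clone, fraction = main_clone(assigned)
print(clone, round(fraction, 2))
```

Reporting the fraction alongside the clone gives a simple sense of how dominant it is in the sample.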
Subjects
Computer Simulation , Immunoglobulin Genes , Diffuse Large B-Cell Lymphoma/diagnosis , Mantle-Cell Lymphoma/diagnosis , Software , V(D)J Recombination/genetics , Algorithms , Clone Cells , High-Throughput Nucleotide Sequencing/methods , Humans , Diffuse Large B-Cell Lymphoma/genetics , Mantle-Cell Lymphoma/genetics , Polymerase Chain Reaction , RNA Sequence Analysis/methods
ABSTRACT
MOTIVATION: Next-generation sequencing technology allows the detection of genomic structural variations, novel genes and transcript isoforms from the analysis of high-throughput data. In this work, we propose a new framework for the detection of fusion transcripts from short paired-end reads. It integrates splicing-driven alignment and abundance estimation analysis, producing a more accurate set of reads supporting the junction discovery and also taking into account non-annotated transcripts. Bellerophontes selects putative junctions on the basis of a match to an accurate gene fusion model. RESULTS: We report the fusion genes discovered by the proposed framework on experimentally validated biological samples of chronic myelogenous leukemia (CML) and on public NCBI datasets, for which Bellerophontes is able to detect the exact junction sequence. Compared with state-of-the-art approaches, Bellerophontes detects the same experimentally validated fusions; however, it is more selective in the total number of detected fusions and provides a more accurate set of spanning reads supporting the junctions. Finally, we report the fusions involving non-annotated transcripts found in CML samples. AVAILABILITY AND IMPLEMENTATION: Bellerophontes JAVA/Perl/Bash software implementation is free and available at http://eda.polito.it/bellerophontes/.