ABSTRACT
Background: Spindle cell rhabdomyosarcoma (S-RMS) is a rare tumor that was previously considered an uncommon variant of embryonal RMS (ERMS) and has recently been reclassified as a distinct RMS subtype with NCOA2, NCOA1, and VGLL2 fusion genes. In this study, we established a cell line (S-RMS1) derived from a four-month-old boy with infantile spindle cell RMS harboring an SRF-NCOA2 gene fusion. Methods: Morphological and molecular characteristics of S-RMS1 were analyzed and compared with two RMS cell lines, RH30 and RD18. Whole genome sequencing of S-RMS1 and clinical exome sequencing of genomic DNA were performed. Results: S-RMS1 cells were small, with a fibroblast-like morphology, and were positive for MyoD-1, myogenin, desmin, and smooth muscle actin. The population doubling time was 3.7 days. Whole genome sequencing demonstrated that S-RMS1 retained the same genetic profile as the tumor at diagnosis. Western blot analysis showed downregulation of AKT-p and YAP-p, while RT-qPCR showed upregulation of endoglin and GATA6 as well as downregulation of TGFβR1 and Mef2C transcripts. Conclusion: This is the first report of the establishment of a cell line from an infantile spindle cell RMS with an SRF-NCOA2 gene fusion. S-RMS1 should represent a useful tool for the molecular characterization of this rare and almost unknown tumor.
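As a worked illustration of the doubling-time figure reported above, the sketch below applies the standard exponential-growth formula, doubling time = t · ln(2) / ln(N_end / N_start). The abstract does not say how the 3.7-day value was computed, so the formula choice, the function name and the cell counts are all assumptions made for illustration only.

```python
import math


def population_doubling_time(n_start, n_end, elapsed_days):
    """Doubling time under exponential growth: t * ln(2) / ln(n_end / n_start)."""
    if n_start <= 0 or n_end <= n_start or elapsed_days <= 0:
        raise ValueError("need 0 < n_start < n_end and elapsed_days > 0")
    return elapsed_days * math.log(2) / math.log(n_end / n_start)


# Hypothetical counts, not taken from the study: a culture that grows
# fourfold in 7.4 days has doubled twice, i.e. once every 3.7 days.
doubling = population_doubling_time(1.0e5, 4.0e5, 7.4)
print(round(doubling, 2))
```

The function rejects non-growing cultures because the formula is undefined when the final count does not exceed the starting count.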
Subject(s)
Gene Fusion/genetics , Nuclear Receptor Coactivator 2/genetics , Recombinant Fusion Proteins/genetics , Rhabdomyosarcoma/genetics , Serum Response Factor/genetics , Adult , Cell Line , Child , Child, Preschool , Down-Regulation/genetics , Exome/genetics , Female , Humans , Infant , Male , Myogenin/genetics , Nuclear Receptor Coactivator 1/genetics , Young Adult
ABSTRACT
Alzheimer's disease (AD) involves changes in both lipid and RNA metabolism, but it has remained unknown whether these differences are associated with cognition and/or post-mortem neuropathology indices in AD. Here, we report RNA-sequencing evidence of inter-related associations between lipid processing, cognition level, and AD neuropathology. In two unrelated cohorts, we identified pathway-enriched facilitation of lipid processing and alternative splicing genes, including the neuronal-enriched NOVA1 and hnRNPA1. Specifically, this association emerged in temporal lobe tissue samples from donors in whom postmortem evidence demonstrated AD neuropathology, but who presented normal cognition proximate to death. The observed changes were further associated with modified ATP synthesis and mitochondrial transcripts, indicating metabolic relevance; accordingly, mass-spectrometry-derived lipidomic profiles distinguished between individuals with and without cognitive impairment prior to death. In spite of the limited group sizes, tissues from persons with both cognitive impairment and AD pathology showed elevation in several drug-targeted genes of other brain, vascular and autoimmune disorders, accompanied by pathology-related increases in distinct lipid processing transcripts, and in the RNA metabolism genes hnRNPH2, TARDBP, CLP1 and EWSR1. To further detect 3'-polyadenylation variants, we employed multiple cDNA primer pairs. This identified variants that showed limited differences in scope and length between the tested cohorts, yet enabled superior clustering of demented and non-demented AD brains versus controls compared to total mRNA expression values. Our findings indicate inter-related cognition-associated differences in AD's lipid processing, alternative splicing and 3'-polyadenylation, calling for the pursuit of the underlying psychological and therapeutic implications.
Subject(s)
Alzheimer Disease/metabolism , Cognitive Dysfunction/metabolism , Lipid Metabolism/physiology , RNA/metabolism , Temporal Lobe/metabolism , Aged , Aged, 80 and over , Alternative Splicing , Alzheimer Disease/pathology , Cognition , Cognitive Dysfunction/pathology , Cohort Studies , Humans , Male , Sequence Analysis, RNA , Temporal Lobe/pathology
ABSTRACT
This preface introduces the content of the BioMed Central journal Supplements related to the BITS 2015 meeting, held in Milan, Italy, from the 3rd to the 5th of June, 2015.
Subject(s)
Computational Biology , Computational Biology/organization & administration , Humans , Italy , Peer Review , Publications
ABSTRACT
In recent years, considerable advances have been made in our understanding of the genetics of mammalian gonad development; however, the underlying genetic aetiology in the majority of patients with 46,XY disorders of sex development (DSD) still remains unknown. Based on mouse models, it has been hypothesized that haploinsufficiency of the Friend of GATA 2 (FOG2) gene could lead to 46,XY gonadal dysgenesis on specific inbred genetic backgrounds. Using whole exome sequencing, we identified independent missense mutations in FOG2 in two patients with 46,XY gonadal dysgenesis. One patient carried a non-synonymous heterozygous mutation (p.S402R), while the other patient carried a heterozygous p.R260Q mutation and a homozygous p.M544I mutation. Functional studies indicated that the failure of testis development in these cases could be explained by the impaired ability of the mutant FOG2 proteins to interact with a known regulator of early testis development, GATA4. This is the first example of mutations in the coding sequence of FOG2 associated with 46,XY DSD in humans and adds to the list of human genes known to be associated with DSD.
Subject(s)
DNA-Binding Proteins/genetics , DNA-Binding Proteins/metabolism , Disorder of Sex Development, 46,XY/genetics , Disorder of Sex Development, 46,XY/pathology , GATA4 Transcription Factor/metabolism , Testis/abnormalities , Transcription Factors/genetics , Transcription Factors/metabolism , Exome , Female , Genetic Association Studies , HEK293 Cells , Heterozygote , Homozygote , Humans , Male , Models, Molecular , Mutation, Missense , Pedigree , Sequence Analysis, DNA , Testis/metabolism
ABSTRACT
Cardiac hypertrophy, initially an adaptive response of the myocardium to stress, can progress to heart failure. The epigenetic signature underlying this phenomenon is poorly understood. Here, we report on the genome-wide distribution of seven histone modifications in adult mouse cardiomyocytes subjected to a prohypertrophy stimulus in vivo. We found a set of promoters with an epigenetic pattern that distinguishes specific functional classes of genes regulated in hypertrophy and identified 9,207 candidate active enhancers whose activity was modulated. We also analyzed the transcriptional network within which these genetic elements act to orchestrate hypertrophy gene expression, finding a role for myocyte enhancer factor (MEF)2C and MEF2A in regulating enhancers. We propose that the epigenetic landscape is a key determinant of gene expression reprogramming in cardiac hypertrophy and provide a basis for understanding the role of chromatin in regulating this phenomenon.
Subject(s)
Cardiomegaly/genetics , Epigenesis, Genetic/genetics , Gene Expression Regulation, Developmental/genetics , Histones/metabolism , Transcription Factors/metabolism , Acetylation , Animals , Cardiomegaly/metabolism , Enhancer Elements, Genetic/genetics , Methylation , Mice , Promoter Regions, Genetic/genetics , Transcription Factors/genetics
ABSTRACT
This Preface introduces the content of the BioMed Central journal Supplements related to the BITS2014 meeting, held in Rome, Italy, from the 26th to the 28th of February, 2014.
Subject(s)
Computational Biology , Humans , Societies, Scientific
ABSTRACT
The present study employed mass sequencing of small RNA libraries to identify the repertoire of small noncoding RNAs expressed in normal CD4(+) T cells compared to cells transformed with human T-cell leukemia virus type 1 (HTLV-1), the causative agent of adult T-cell leukemia/lymphoma (ATLL). The results revealed distinct patterns of microRNA expression in HTLV-1-infected CD4(+) T-cell lines with respect to their normal counterparts. In addition, a search for virus-encoded microRNAs yielded 2 sequences that originated from the plus strand of the HTLV-1 genome. Several sequences derived from tRNAs were expressed at substantial levels in both uninfected and infected cells. One of the most abundant tRNA fragments (tRF-3019) was derived from the 3' end of tRNA-proline. tRF-3019 exhibited perfect sequence complementarity to the primer binding site of HTLV-1. The results of an in vitro reverse transcriptase assay verified that tRF-3019 was capable of priming HTLV-1 reverse transcriptase. Both tRNA-proline and tRF-3019 were detected in virus particles isolated from HTLV-1-infected cells. These findings suggest that tRF-3019 may play an important role in priming HTLV-1 reverse transcription and could thus represent a novel target to control HTLV-1 infection. IMPORTANCE: Small noncoding RNAs, a growing family of regulatory RNAs that includes microRNAs and tRNA fragments, have recently emerged as key players in many biological processes, including viral infection and cancer. In the present study, we employed mass sequencing to identify the repertoire of small noncoding RNAs in normal T cells compared to T cells transformed with human T-cell leukemia virus type 1 (HTLV-1), a retrovirus that causes adult T-cell leukemia/lymphoma. 
The results revealed a distinct pattern of microRNA expression in HTLV-1-infected cells and a tRNA fragment (tRF-3019) that was packaged into virions and capable of priming HTLV-1 reverse transcription, a key event in the retroviral life cycle. These findings indicate tRF-3019 could represent a novel target for therapies aimed at controlling HTLV-1 infection.
Subject(s)
CD4-Positive T-Lymphocytes/virology , Cell Transformation, Viral , Human T-lymphotropic virus 1/physiology , RNA, Small Untranslated/metabolism , RNA, Transfer, Pro/metabolism , RNA-Directed DNA Polymerase/metabolism , Reverse Transcription , Cells, Cultured , Host-Pathogen Interactions , Humans , RNA-Directed DNA Polymerase/biosynthesis
ABSTRACT
The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means of identifying novel biomarkers and is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domain and microRNA binding annotations. We applied the novel workflow to RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post-Deep Brain Stimulation (DBS) treatment and compared them to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5'-end variations and mutually-exclusive exons. In the PD-regulated FUS and HNRNP A/B, the regulated regions included prion-like domains. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq data from the brains of PD patients and unaffected controls revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes, including U1, which showed both leukocyte and brain increases. 
Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia-nigra, compared to controls. This novel workflow allows deep multi-level inspection of RNA-Seq datasets and provides a comprehensive new resource for understanding disease transcriptome modifications in PD and other neurodegenerative diseases.
Subject(s)
Alternative Splicing , Leukocytes/metabolism , Parkinson Disease/blood , RNA, Long Noncoding , Sequence Analysis, RNA/methods , Amygdala/metabolism , Brain Mapping/methods , Deep Brain Stimulation , Female , Gene Expression Profiling , Humans , Male , MicroRNAs , Oligonucleotide Array Sequence Analysis , Substantia Nigra/metabolism
ABSTRACT
To understand the role of microRNAs (miRNAs) in vascular physiopathology, we took advantage of deep-sequencing techniques to accurately and comprehensively profile the entire miRNA population expressed by endothelial cells exposed to hypoxia. SOLiD sequencing of small RNAs derived from human umbilical vein endothelial cells (HUVECs) exposed to 1% O2 or normoxia for 24 h yielded more than 22 million reads per library. A customized bioinformatic pipeline identified more than 400 annotated microRNA/microRNA* species with a broad abundance range: miR-21 and miR-126 totaled almost 40% of all miRNAs. A complex repertoire of isomiRs was found, also displaying 5' variations that potentially affect target recognition. High-stringency bioinformatic analysis identified microRNA candidates whose predicted pre-miRNAs folded into a stable hairpin. Validation of a subset by qPCR identified 18 high-confidence novel miRNAs as detectable in independent HUVEC cultures and associated with the RISC complex. The expression of two novel miRNAs was significantly down-modulated by hypoxia, while miR-210 was significantly induced. Gene ontology analysis of their predicted targets revealed a significant association with hypoxia-inducible factor signaling, cardiovascular diseases, and cancer. Overexpression of the novel miRNAs in hypoxic endothelial cells affected cell growth and confirmed the biological relevance of their down-modulation. In conclusion, deep sequencing accurately profiled known, variant, and novel microRNAs expressed by endothelial cells in normoxia and hypoxia.
Subject(s)
Endothelial Cells/metabolism , High-Throughput Nucleotide Sequencing , MicroRNAs/analysis , MicroRNAs/chemistry , Carboxypeptidases/metabolism , Cell Hypoxia , Cell Proliferation , Gene Expression Profiling , Gene Expression Regulation , Gene Library , HEK293 Cells , Humans , MicroRNAs/metabolism , Molecular Sequence Annotation , Nucleic Acid Conformation , RNA, Double-Stranded , Sequence Analysis, RNA , Signal Transduction
ABSTRACT
BACKGROUND: Rhabdomyosarcoma (RMS) is a highly malignant tumour accounting for nearly half of soft tissue sarcomas in children. MicroRNAs (miRNAs) represent a class of short, non-coding, regulatory RNAs which play a critical role in different cellular processes. Altered miRNA levels have been reported in human cancers, including RMS. METHODS: Using deep sequencing technology, a total of 685 miRNAs were investigated in a group of alveolar RMSs (ARMSs), embryonal RMSs (ERMSs) as well as in normal skeletal muscle (NSM). Q-PCR, MTT, cytofluorimetry, migration assay, western blot and immunofluorescence experiments were carried out to determine the role of miR-378a-3p in cancer cell growth, apoptosis, migration and differentiation. Bioinformatics pipelines were used for miRNA target prediction and clustering analysis. RESULTS: Ninety-seven miRNAs were significantly deregulated in ARMS and ERMS when compared to NSM. MiR-378 family members were dramatically decreased in RMS tumour tissue and cell lines. Interestingly, members of the miR-378 family presented as a possible target the insulin-like growth factor receptor 1 (IGF1R), a key signalling molecule in RMS. MiR-378a-3p over-expression in an RMS-derived cell line suppressed IGF1R expression and affected phosphorylated-Akt protein levels. Ectopic expression of miR-378a-3p caused significant changes in apoptosis, cell migration, cytoskeleton organization as well as a modulation of the muscular markers MyoD1, MyoR, desmin and MyHC. In addition, DNA demethylation by 5-aza-2'-deoxycytidine (5-aza-dC) was able to up-regulate miR-378a-3p levels with a concomitant induction of apoptosis, decrease in cell viability and cell cycle arrest in G2-phase. Cells treated with 5-aza-dC clearly changed their morphology and expressed moderate levels of MyHC. CONCLUSIONS: MiR-378a-3p may function as a tumour suppressor in RMS and the restoration of its expression would be of therapeutic benefit in RMS. 
Furthermore, the role of epigenetic modifications in RMS deserves further investigation.
Subject(s)
MicroRNAs/analysis , MicroRNAs/genetics , Rhabdomyosarcoma/genetics , Rhabdomyosarcoma/metabolism , Apoptosis , Base Sequence , Cell Line, Tumor , DNA Methylation , Down-Regulation , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , MicroRNAs/metabolism , Molecular Sequence Data , Muscle Development , Receptor, IGF Type 1 , Receptors, Somatomedin/metabolism , Sequence Alignment , Sequence Analysis, RNA
ABSTRACT
The complete genomic sequence of the dairy Lactobacillus helveticus bacteriophage ΦAQ113 was determined. Phage ΦAQ113 is a Myoviridae bacteriophage with an isometric capsid and a contractile tail. The final assembled consensus sequence revealed a linear, circularly permuted, double-stranded DNA genome with a size of 36,566 bp and a G+C content of 37%. Fifty-six open reading frames (ORFs) were predicted, and a putative function was assigned to approximately 90% of them. The ΦAQ113 genome shows functionally related genes clustered together in a genome structure composed of modules for DNA replication/regulation, DNA packaging, head and tail morphogenesis, cell lysis, and lysogeny. The identification of genes involved in the establishment of lysogeny indicates that it may have originated as a temperate phage, even if it was isolated from natural cheese whey starters as a virulent phage, because it is able to propagate in a sensitive host strain. Additionally, we discovered that the ΦAQ113 phage genome is closely related to Lactobacillus gasseri phage KC5a and Lactobacillus johnsonii phage Lj771 genomes. The phylogenetic similarities between L. helveticus phage ΦAQ113 and two phages that belong to gut species confirm a possible common ancestral origin and support the increasing consideration of L. helveticus as a health-promoting organism.
Subject(s)
DNA, Viral/genetics , Genome, Viral , Lactobacillus helveticus/virology , Myoviridae/genetics , Base Composition , DNA, Viral/metabolism , Molecular Sequence Data , Myoviridae/classification , Myoviridae/ultrastructure , Open Reading Frames , Phylogeny , Polymerase Chain Reaction , Sequence Analysis, DNA , Sequence Homology , Spectrometry, Mass, Electrospray Ionization
ABSTRACT
Ontogenesis of T cells in the thymus is a complex process whose molecular control is poorly understood. The present study investigated microRNAs involved in human thymocyte differentiation by comparing the microRNA expression profiles of thymocytes at the double-positive, single-positive CD4(+) and single-positive CD8(+) maturation stages. Microarray analysis showed that each thymocyte population displays a distinct microRNA expression profile that reflects their developmental relationships. Moreover, analysis of small-RNA libraries generated from human unsorted and double-positive thymocytes and from mature peripheral CD4(+) and CD8(+) T lymphocytes, together with the microarray data, indicated a trend toward up-regulation of microRNA expression during T-cell maturation after the double-positive stage and revealed a group of microRNAs regulated during normal T-cell development, including miR-150, which is strongly up-regulated as maturation progresses. We showed that miR-150 targets NOTCH3, a member of the Notch receptor family that plays important roles both in T-cell differentiation and leukemogenesis. Forced expression of miR-150 reduces NOTCH3 levels in T-cell lines and has adverse effects on their proliferation and survival. Overall, these findings suggest that control of the Notch pathway through miR-150 may have an important impact on T-cell development and physiology.
Subject(s)
Cell Differentiation , Gene Expression Regulation , MicroRNAs/metabolism , Receptors, Notch/metabolism , T-Lymphocytes/cytology , T-Lymphocytes/metabolism , 3' Untranslated Regions , Adult , Apoptosis , Cell Line , Cell Line, Tumor , Cell Proliferation , Cells, Cultured , Child, Preschool , Gene Expression Profiling , Genes, Reporter , Humans , Infant , Infant, Newborn , MicroRNAs/antagonists & inhibitors , MicroRNAs/genetics , Precursor Cell Lymphoblastic Leukemia-Lymphoma/metabolism , RNA, Messenger/metabolism , Receptor, Notch3 , Receptors, Notch/antagonists & inhibitors , Receptors, Notch/genetics , T-Lymphocyte Subsets/cytology , T-Lymphocyte Subsets/metabolism , Thymus Gland/cytology , Thymus Gland/metabolism
ABSTRACT
Integration of retroviral vectors in the human genome follows nonrandom patterns that favor insertional deregulation of gene expression and increase the risk of their use in clinical gene therapy. The molecular basis of retroviral target site selection is still poorly understood. We used deep sequencing technology to build genomewide, high-definition maps of > 60 000 integration sites of Moloney murine leukemia virus (MLV)- and HIV-based retroviral vectors in the genome of human CD34(+) multipotent hematopoietic progenitor cells (HPCs) and used gene expression profiling, chromatin immunoprecipitation, and bioinformatics to associate integration to genetic and epigenetic features of the HPC genome. Clusters of recurrent MLV integrations identify regulatory elements (alternative promoters, enhancers, evolutionarily conserved noncoding regions) within or around protein-coding genes and microRNAs with crucial functions in HPC growth and differentiation, bearing epigenetic marks of active or poised transcription (H3K4me1, H3K4me2, H3K4me3, H3K9Ac, Pol II) and specialized chromatin configurations (H2A.Z). Overall, we mapped 3500 high-frequency integration clusters, which represent a new resource for the identification of transcriptionally active regulatory elements. High-definition MLV integration maps provide a rational basis for predicting genotoxic risks in gene therapy and a new tool for genomewide identification of promoters and regulatory elements controlling hematopoietic stem and progenitor cell functions.
Subject(s)
Genome, Human , Hematopoietic Stem Cells/physiology , Regulatory Elements, Transcriptional/genetics , Retroviridae/genetics , Virus Integration/genetics , Biomarkers/metabolism , Cells, Cultured , Chromatin/genetics , Chromatin Immunoprecipitation , Epigenomics , Fetal Blood/cytology , Gene Expression Profiling , HIV/genetics , High-Throughput Nucleotide Sequencing , Humans , Moloney murine leukemia virus/genetics , Oligonucleotide Array Sequence Analysis , Promoter Regions, Genetic/genetics
ABSTRACT
Recombination signal sequences (RSSs) flanking V, D and J gene segments are recognized and cut by the VDJ recombinase during development of B and T lymphocytes. All RSSs are composed of seven conserved nucleotides, followed by a spacer (containing either 12 ± 1 or 23 ± 1 poorly conserved nucleotides) and a conserved nonamer. Errors in V(D)J recombination, including cleavage of cryptic RSSs outside the immunoglobulin and T cell receptor loci, are associated with oncogenic translocations observed in some lymphoid malignancies. We present in this paper the RSSsite web server, which is available from the address http://www.itb.cnr.it/rss. RSSsite consists of a web-accessible database, RSSdb, for the identification of pre-computed potential RSSs, and of the related search tool, DnaGrab, which allows the scoring of potential RSSs in user-supplied sequences. The latter algorithm makes use of probability models, which can be recast as Bayesian networks and take into account correlations between groups of positions of a sequence, developed starting from specific reference sets of RSSs. In validation laboratory experiments, we selected 33 predicted cryptic RSSs (cRSSs) from 11 chromosomal regions outside the immunoglobulin and TCR loci for functional testing.
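The heptamer-spacer-nonamer layout described above can be illustrated with a minimal structural scan. This is only a sketch and is not the DnaGrab algorithm, which the abstract says relies on probability models. Here a simple mismatch count against the widely cited consensus heptamer CACAGTG and nonamer ACAAAAACC stands in for scoring. The function names and mismatch thresholds are assumptions chosen for illustration.

```python
HEPTAMER = "CACAGTG"      # widely cited consensus heptamer (assumed reference)
NONAMER = "ACAAAAACC"     # widely cited consensus nonamer (assumed reference)
SPACER_LENGTHS = (11, 12, 13, 22, 23, 24)  # 12 +/- 1 or 23 +/- 1 nucleotides


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_rss(seq, max_hept_mm=1, max_non_mm=2):
    """Return (start, spacer_len, heptamer_mm, nonamer_mm) for each candidate.

    The thresholds are hypothetical; a real tool would score candidates
    with a statistical model rather than a fixed mismatch cutoff.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - len(HEPTAMER) + 1):
        hept_mm = mismatches(seq[start:start + len(HEPTAMER)], HEPTAMER)
        if hept_mm > max_hept_mm:
            continue
        for spacer in SPACER_LENGTHS:
            non_start = start + len(HEPTAMER) + spacer
            nonamer = seq[non_start:non_start + len(NONAMER)]
            if len(nonamer) < len(NONAMER):
                continue  # nonamer window runs past the end of the sequence
            non_mm = mismatches(nonamer, NONAMER)
            if non_mm <= max_non_mm:
                hits.append((start, spacer, hept_mm, non_mm))
    return hits


# Synthetic test sequence: consensus heptamer, 12-nt spacer, consensus nonamer.
example = "TT" + HEPTAMER + "G" * 12 + NONAMER + "TT"
print(find_candidate_rss(example))
```

Only one strand is scanned here; searching the reverse complement as well would be needed to find signals in both orientations.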
Subject(s)
Gene Rearrangement, B-Lymphocyte , Gene Rearrangement, T-Lymphocyte , Genome, Human , Recombination, Genetic , Regulatory Sequences, Nucleic Acid , Software , Algorithms , Animals , Databases, Nucleic Acid , Genome , Genomics/methods , Humans , Internet , Mice
ABSTRACT
Non-protein coding RNAs (ncRNAs) have emerged as a vast and heterogeneous portion of eukaryotic transcriptomes. Several ncRNA families, either short (<200 nucleotides, nt) or long (>200 nt), have been described and implicated in a variety of biological processes, from translation to gene expression regulation and nuclear trafficking. Most probably, other families are still to be discovered. Computational methods for ncRNA research require different approaches from the ones normally used in the prediction of protein-coding genes. Indeed, primary sequence alone is often insufficient to infer ncRNA functionality, whereas secondary structure and local conservation of portions of the transcript could provide useful information for both the prediction and the functional annotation of ncRNAs. Here we present an overview of computational methods and bioinformatics resources currently available for studying ncRNA genes, introducing the common themes as well as the different approaches required for long and short ncRNA identification and annotation.
Subject(s)
Computational Biology/methods , Eukaryotic Cells , RNA, Untranslated , Animals , Base Sequence , Databases, Genetic , Genomics/methods , Nucleic Acid Conformation , RNA, Untranslated/chemistry , RNA, Untranslated/classification , RNA, Untranslated/genetics
ABSTRACT
BACKGROUND: The cancer transcriptome is difficult to explore due to the heterogeneity of quantitative and qualitative changes in gene expression linked to the disease status. An increasing number of "unconventional" transcripts, such as novel isoforms, non-coding RNAs, somatic gene fusions and deletions have been associated with the tumoral state. Massively parallel sequencing techniques provide a framework for exploring the transcriptional complexity inherent to cancer with limited laboratory and financial effort. We developed a deep sequencing and bioinformatics analysis protocol to investigate the molecular composition of a breast cancer poly(A)+ transcriptome. This method utilizes a cDNA library normalization step to diminish the representation of highly expressed transcripts and biology-oriented bioinformatic analyses to facilitate detection of rare and novel transcripts. RESULTS: We analyzed over 132,000 Roche 454 high-confidence deep sequencing reads from a primary human lobular breast cancer tissue specimen, and detected a range of unusual transcriptional events that were subsequently validated by RT-PCR in eight additional primary human breast cancer samples. We identified and validated one deletion, two novel ncRNAs (one intergenic and one intragenic), ten previously unknown or rare transcript isoforms and a novel gene fusion specific to a single primary tissue sample. We also explored the non-protein-coding portion of the breast cancer transcriptome, identifying thousands of novel non-coding transcripts and more than three hundred reads corresponding to the non-coding RNA MALAT1, which is highly expressed in many human carcinomas. 
CONCLUSION: Our results demonstrate that combining 454 deep sequencing with a normalization step and careful bioinformatic analysis facilitates the discovery and quantification of rare transcripts or ncRNAs, and can be used as a qualitative tool to characterize transcriptome complexity, revealing many hitherto unknown transcripts, splice isoforms, gene fusion events and ncRNAs, even at a relatively low sequence sampling.
Subject(s)
Breast Neoplasms/genetics , Gene Expression Profiling , Sequence Analysis, DNA , Transcription, Genetic , Amino Acid Sequence , Base Sequence , Breast Neoplasms/metabolism , Calmodulin-Binding Proteins/genetics , Computational Biology , Cytoskeletal Proteins/genetics , DNA, Complementary/chemistry , Databases, Genetic , Female , Gene Expression Regulation, Neoplastic , Histone-Lysine N-Methyltransferase/genetics , Humans , Molecular Sequence Data , Nuclear Proteins/genetics , RNA, Untranslated/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Sequence Alignment , Sequence Analysis, RNA , Ubiquitin-Protein Ligases
ABSTRACT
Antisense transcription has long been recognized as a mechanism involved in the regulation of gene expression. Therefore, several human diseases associated with abnormal patterns of gene expression might display antisense RNA-mediated pathogenetic mechanisms. This issue could be particularly relevant for cancer pathogenesis, since deregulated gene expression has long been established as a hallmark of cancer cells. Herein, we report on a bioinformatic search for antisense transcription in two cancer-associated regions of human chromosome 6 (6q21 and 6q27). Natural antisense transcripts (NATs) for several genes in both genomic regions were predicted in silico and subsequently validated by strand-specific RT-PCR. Detailed experimental validation by quantitative real-time RT-PCR of five putative cancer-related sense-antisense transcript pairs revealed a single candidate tumor suppressor gene (RPS6KA2) whose expression levels display marked cancer-related changes that are likely mediated by its antisense RNA in a breast cancer cell line model.
Subject(s)
Chromosomes, Human, Pair 6 , Neoplasms/genetics , RNA, Antisense/genetics , Ribosomal Protein S6 Kinases, 90-kDa/genetics , Computational Biology/methods , Genes, Tumor Suppressor , Genome, Human , Humans , Introns , Models, Biological , Models, Genetic , Neoplasms/metabolism , Oligonucleotide Array Sequence Analysis , Oligonucleotides, Antisense , RNA, Antisense/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Ribosomal Protein S6 Kinases, 90-kDa/biosynthesis
ABSTRACT
BACKGROUND: Although the overlap of transcriptional units occurs frequently in eukaryotic genomes, its evolutionary and biological significance remains largely unclear. Here we report a comparative analysis of overlaps between genes coding for well-annotated proteins in five metazoan genomes (human, mouse, zebrafish, fruit fly and worm). RESULTS: For all analyzed species the observed number of overlapping genes is always lower than expected assuming functional neutrality, suggesting that gene overlap is negatively selected. The comparison to the random distribution also shows that retained overlaps do not exhibit random features: antiparallel overlaps are significantly enriched, while overlaps lying on the same strand and those involving coding sequences are highly underrepresented. We confirm that overlap is mostly species-specific and provide evidence that it frequently originates through the acquisition of terminal, non-coding exons. Finally, we show that overlapping genes tend to be significantly co-expressed in a breast cancer cDNA library obtained by 454 deep sequencing, and that different overlap types display different patterns of reciprocal expression. CONCLUSION: Our data suggest that overlap between protein-coding genes is selected against in Metazoa. However, when retained it may be used as a species-specific mechanism for the reciprocal regulation of neighboring genes. The tendency of overlaps to involve non-coding regions of the genes leads to the speculation that the advantages achieved by an overlapping arrangement may be optimized by evolving regulatory non-coding transcripts.
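The distinction drawn above between antiparallel and same-strand overlaps can be made concrete with a small interval sketch. It assumes all genes lie on one chromosome and are given as 1-based inclusive coordinates with a strand sign. The `Gene` record, the function names and the example coordinates are hypothetical and are not taken from the study's pipeline.

```python
from collections import namedtuple

# Hypothetical record: 1-based inclusive coordinates, strand is "+" or "-".
Gene = namedtuple("Gene", "name start end strand")


def classify_overlap(a, b):
    """Return None if the genes do not overlap, else the overlap type."""
    if a.end < b.start or b.end < a.start:
        return None
    return "same-strand" if a.strand == b.strand else "antiparallel"


def overlapping_pairs(genes):
    """List (name_a, name_b, type) for every overlapping pair of genes."""
    ordered = sorted(genes, key=lambda g: (g.start, g.end))
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start > a.end:
                break  # sorted by start, so no later gene can overlap a
            pairs.append((a.name, b.name, classify_overlap(a, b)))
    return pairs


# Illustrative coordinates only.
genes = [
    Gene("A", 100, 500, "+"),
    Gene("B", 450, 900, "-"),
    Gene("C", 880, 1200, "-"),
    Gene("D", 2000, 2500, "+"),
]
print(overlapping_pairs(genes))
```

Comparing counts like these against a randomized placement of the same genes is one way to ask whether a given overlap type is enriched, although the abstract does not describe the null model in enough detail to reproduce it here.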
Subject(s)
Evolution, Molecular , Genes, Overlapping/genetics , Phylogeny , Animals , Breast Neoplasms/genetics , Caenorhabditis elegans/genetics , Conserved Sequence/genetics , Drosophila melanogaster/genetics , Gene Library , Humans , Mice , Models, Genetic , Zebrafish/genetics
ABSTRACT
BACKGROUND: Affymetrix technology is now a well-established method for the analysis of gene expression profiles in cancer research studies. However, changes in gene expression levels are not the only way to link genes and disease. Gene isoforms specifically linked with cancer or apoptosis are increasingly reported in the literature. Hence it is of great interest to associate the results of a gene expression study with up-to-date evidence on the transcript structure and its possible variants. RESULTS: We present here a web-based software tool, Splicy, whose primary task is to retrieve data on the mapping of Affymetrix probes to single exons of gene transcripts and to display this information graphically, projected on the physical structure of the gene. Starting from a list of Affymetrix probesets, the program produces a series of graphical displays, each relative to a transcript associated with the gene targeted by a given probe. The information on the transcript-by-transcript and exon-by-exon mapping of probe pairs can be retrieved both graphically and in the form of tab-separated files. The mapping of single probes to NCBI RefSeq or EMBL cDNAs is handled by the ISREC mapping tables used in the CleanEx Expression Reference Database Project. We currently maintain these mappings for the most popular human and mouse Affymetrix chips, and Splicy can be queried for matches with human and mouse NCBI RefSeq or EMBL cDNAs. CONCLUSION: Splicy generates probeset annotations and images describing the relation between the single probes and the intron/exon structure of the target transcript in all its known variants. We think that Splicy will be useful for giving researchers a clearer picture of the possible transcript variants linked with a given gene and an additional view on the interpretation of microarray experiment data. Splicy is publicly available and has been realized in the framework of a bioinformatics grant from the Italian Cancer Research Association.
Subject(s)
Alternative Splicing/genetics , DNA Probes/genetics , Databases, Genetic , Oligonucleotide Array Sequence Analysis/methods , RNA Splice Sites/genetics , Sequence Analysis, DNA/methods , Software , Chromosome Mapping/methods , Internet , Oligonucleotide Array Sequence Analysis/instrumentation
ABSTRACT
The vast amount of unstructured data emerging from the various genome projects has led to the development of a number of web-based tools designed to annotate genes with biological information. Here we discuss a selection of these tools with regard to their scope, limitations and ease of use.