2.
Ann Surg ; 279(5): 866-873, 2024 May 01.
Article in English | MEDLINE | ID: mdl-38073557

ABSTRACT

OBJECTIVE: We aim to determine whether incremental changes in genetic ancestry percentages influence molecular and clinical outcome characteristics of breast cancer in an admixed population. BACKGROUND: Patients with breast cancer are predominantly characterized as "Black" or "White" based on self-identified race/ethnicity or arbitrary genetic ancestry cutoffs. This limits scientific discovery in populations that are admixed or of mixed race/ethnicity as they cannot be classified based on historical race/ethnicity boxes or genetic ancestry cutoffs. METHODS: We used The Cancer Genome Atlas cohort and focused on genetically admixed patients that had less than 90% European, African, Asian, or Native American ancestry. RESULTS: Genetically admixed patients with breast cancer exhibited improved 10-year overall survival relative to those with >90% European ancestry. Within the luminal A subtype, patients with lower African ancestry had longer 10-year overall survival compared to those with higher African ancestry. The correlation of genetic ancestry with gene expression and DNA methylation in the admixed cohort revealed novel ancestry-specific intrinsic PAM50 subtype patterns. In luminal A tumors, genetic ancestry was correlated with both the expression and methylation of signaling genes, while in basal-like tumors, genetic ancestry was correlated with stemness genes. In addition, we took a machine-learning approach to estimate genetic ancestry from gene expression or DNA methylation and were able to accurately calculate ancestry values from a reduced set of 10 genes or 50 methylation sites that were specific for each molecular subtype. CONCLUSIONS: Our results suggest that incremental changes in genetic ancestry percentages result in ancestry-specific molecular differences even between well-established PAM50 subtypes which may influence disparities in breast cancer survival outcomes. 
Accounting for incremental changes in ancestry will be important in future research, prognostication, and risk stratification, particularly in ancestrally diverse populations.


Subject(s)
Breast Neoplasms , Female , Humans , Breast Neoplasms/ethnology , Breast Neoplasms/mortality , Ethnicity , Racial Groups
3.
Nat Commun ; 13(1): 6524, 2022 10 31.
Article in English | MEDLINE | ID: mdl-36316347

ABSTRACT

DNMT3A and IDH1/2 mutations combinatorically regulate the transcriptome and the epigenome in acute myeloid leukemia; yet the mechanisms of this interplay are unknown. Using a systems approach within topologically associating domains, we find that genes with significant expression-methylation correlations are enriched in signaling and metabolic pathways. The common denominator across these methylation-regulated genes is the density in MIR retrotransposons of their introns. Moreover, a discrete number of CpGs overlapping enhancers are responsible for regulating most of these genes. Established mouse models recapitulate the dependency of MIR-rich genes on the balanced expression of epigenetic modifiers, while projection of leukemic profiles onto normal hematopoiesis ones further consolidates the dependencies of methylation-regulated genes on MIRs. Collectively, MIR elements on genes and enhancers are susceptible to changes in DNA methylation activity and explain the cooperativity of proteins in this pathway in normal and malignant hematopoiesis.


Subject(s)
Epigenome , Leukemia, Myeloid, Acute , Mice , Animals , Retroelements/genetics , Transcriptome/genetics , Mutation , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/metabolism , DNA Methylation/genetics
4.
BMC Biol ; 19(1): 60, 2021 03 25.
Article in English | MEDLINE | ID: mdl-33765992

ABSTRACT

BACKGROUND: Extensive molecular differences exist between proliferative and differentiated cells. Here, we conduct a meta-analysis of publicly available transcriptomic datasets from preimplantation and differentiation stages examining the architectural properties and content of genes whose abundance changes significantly across developmental time points. RESULTS: Analysis of preimplantation embryos from human and mouse showed that short genes whose introns are enriched in Alu (human) and B (mouse) elements, respectively, have higher abundance in the blastocyst compared to the zygote. These highly expressed genes encode ribosomal proteins or metabolic enzymes. On the other hand, long genes whose introns are depleted in repetitive elements have lower abundance in the blastocyst and include genes from signaling pathways. Additionally, the sequences of the genes that are differentially expressed between the blastocyst and the zygote contain distinct collections of pyknon motifs that differ between up- and down-regulated genes. Further examination of the genes that participate in the stem cell-specific protein interaction network shows that their introns are short and enriched in Alu (human) and B (mouse) elements. As organogenesis progresses, in both human and mouse, we find that the primarily short and repeat-rich expressed genes make way for primarily longer, repeat-poor genes. With that in mind, we used a machine learning-based approach to identify gene signatures able to classify human adult tissues: we find that the most discriminatory genes comprising these signatures have long introns that are repeat-poor and include transcription factors and signaling-cascade genes. The introns of widely expressed genes across human tissues, on the other hand, are short and repeat-rich, and coincide with those with the highest expression at the blastocyst stage. 
CONCLUSIONS: Protein-coding genes that are characteristic of each trajectory, i.e., proliferation/pluripotency or differentiation, exhibit antithetical biases in their intronic and exonic lengths and in their repetitive-element content. While the respective human and mouse gene signatures are functionally and evolutionarily conserved, their introns and exons are enriched or depleted in organism-specific repetitive elements. We posit that these organism-specific repetitive sequences found in exons and introns are used to effect the corresponding genes' regulation.


Subject(s)
Cell Differentiation/genetics , Pluripotent Stem Cells , Animals , Blastocyst/cytology , Blastocyst/metabolism , Gene Expression Regulation, Developmental , Humans , Mice , Repetitive Sequences, Nucleic Acid
5.
Bioinformatics ; 37(13): 1828-1838, 2021 Jul 27.
Article in English | MEDLINE | ID: mdl-33471076

ABSTRACT

MOTIVATION: MicroRNA (miRNA) precursor arms give rise to multiple isoforms simultaneously called 'isomiRs.' IsomiRs from the same arm typically differ by a few nucleotides at either their 5' or 3' termini or both. In humans, the identities and abundances of isomiRs depend on a person's sex and genetic ancestry as well as on tissue type, tissue state and disease type/subtype. Moreover, nearly half of the time the most abundant isomiR differs from the miRNA sequence found in public databases. Accurate mining of isomiRs from deep sequencing data is thus important. RESULTS: We developed isoMiRmap, a fast, standalone, user-friendly mining tool that identifies and quantifies all isomiRs by directly processing short RNA-seq datasets. IsoMiRmap is a portable 'plug-and-play' tool, requires minimal setup, has modest computing and storage requirements, and can process an RNA-seq dataset with 50 million reads in just a few minutes on an average laptop. IsoMiRmap deterministically and exhaustively reports all isomiRs in a given deep sequencing dataset and quantifies them accurately (no double-counting). IsoMiRmap comprehensively reports all miRNA precursor locations from which an isomiR may be transcribed, tags as 'ambiguous' isomiRs whose sequences exist both inside and outside of the space of known miRNA sequences and reports the public identifiers of common single-nucleotide polymorphisms and documented somatic mutations that may be present in an isomiR. IsoMiRmap also identifies isomiRs with 3' non-templated post-transcriptional additions. Compared to similar tools, isoMiRmap is the fastest, reports more bona fide isomiRs, and provides the most comprehensive information related to an isomiR's transcriptional origin. AVAILABILITY AND IMPLEMENTATION: The codes for isoMiRmap are freely available at https://cm.jefferson.edu/isoMiRmap/ and https://github.com/TJU-CMC-Org/isoMiRmap/. 
IsomiR profiles for the datasets of the 1000 Genomes Project, spanning five population groups, and The Cancer Genome Atlas (TCGA), spanning 33 cancer studies, are also available at https://cm.jefferson.edu/isoMiRmap/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
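The statement that isomiRs from the same arm differ by a few nucleotides at their 5' or 3' termini can be illustrated with a small sketch. This is not isoMiRmap's code or output format: the sequences are synthetic, the function name is hypothetical, and the sign convention (negative means the endpoint lies upstream of the reference endpoint) is an assumption made for this example. Non-templated additions are not handled.

```python
def endpoint_offsets(precursor, reference, isomir):
    """Return (5' offset, 3' offset) of an isomiR relative to a reference
    mature sequence, both located by exact match inside the precursor.

    Assumed convention: a negative offset means the isomiR endpoint lies
    upstream of the reference endpoint, a positive one downstream.
    """
    ref_start = precursor.find(reference)
    iso_start = precursor.find(isomir)
    if ref_start < 0 or iso_start < 0:
        raise ValueError("sequence not found in precursor")
    ref_end = ref_start + len(reference)
    iso_end = iso_start + len(isomir)
    return iso_start - ref_start, iso_end - ref_end


# Synthetic toy sequences, not real miRNAs.
reference = "AUGCAUCGAUUCGGAACUAG"
precursor = "GGCC" + reference + "UUAA"
isomir = "C" + reference + "UU"  # one extra nt at the 5' end, two at the 3' end

offsets = endpoint_offsets(precursor, reference, isomir)
print(offsets)  # (-1, 2)
```

Under this convention the reference sequence itself has offsets (0, 0), so every isomiR of an arm can be named by a pair of small integers.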

6.
J Bone Miner Res ; 35(3): 550-570, 2020 03.
Article in English | MEDLINE | ID: mdl-31692093

ABSTRACT

Maintenance of glycolytic metabolism is postulated to be required for health of the spinal column. In the hypoxic tissues of the intervertebral disc and glycolytic cells of vertebral bone, glucose is metabolized into pyruvate for ATP generation and reduced to lactate to sustain redox balance. The rise in intracellular H+/lactate concentrations is balanced by plasma-membrane monocarboxylate transporters (MCTs). Using MCT4 null mice and human tissue samples, complemented with genetic and metabolic approaches, we determine that H+/lactate efflux is critical for maintenance of disc and vertebral bone health. Mechanistically, MCT4 maintains glycolytic and tricarboxylic acid (TCA) cycle flux and intracellular pH homeostasis in the nucleus pulposus compartment of the disc, where hypoxia-inducible factor 1α (HIF-1α) directly activates an intronic enhancer in SLC16A3. Ultimately, our results provide support for research into lactate as a diagnostic biomarker for chronic, painful, disc degeneration. © 2019 American Society for Bone and Mineral Research.


Subject(s)
Intervertebral Disc Degeneration , Intervertebral Disc , Nucleus Pulposus , Biological Transport , Humans , Hypoxia-Inducible Factor 1, alpha Subunit/metabolism , Intervertebral Disc/metabolism , Intervertebral Disc Degeneration/metabolism , Lactic Acid/metabolism
7.
Bioinformatics ; 36(3): 698-703, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31504201

ABSTRACT

MOTIVATION: MicroRNAs (miRNAs) are small RNA molecules (∼22 nucleotide long) involved in post-transcriptional gene regulation. Advances in high-throughput sequencing technologies led to the discovery of isomiRs, which are miRNA sequence variants. While many miRNA-seq analysis tools exist, the diversity of output formats hinders accurate comparisons between tools and precludes data sharing and the development of common downstream analysis methods. RESULTS: To overcome this situation, we present here a community-based project, miRNA Transcriptomic Open Project (miRTOP) working towards the optimization of miRNA analyses. The aim of miRTOP is to promote the development of downstream isomiR analysis tools that are compatible with existing detection and quantification tools. Based on the existing GFF3 format, we first created a new standard format, mirGFF3, for the output of miRNA/isomiR detection and quantification results from small RNA-seq data. Additionally, we developed a command line Python tool, mirtop, to create and manage the mirGFF3 format. Currently, mirtop can convert into mirGFF3 the outputs of commonly used pipelines, such as seqbuster, isomiR-SEA, sRNAbench, Prost! as well as BAM files. Some tools have also incorporated the mirGFF3 format directly into their code, such as, miRge2.0, IsoMIRmap and OptimiR. Its open architecture enables any tool or pipeline to output or convert results into mirGFF3. Collectively, this isomiR categorization system, along with the accompanying mirGFF3 and mirtop API, provide a comprehensive solution for the standardization of miRNA and isomiR annotation, enabling data sharing, reporting, comparative analyses and benchmarking, while promoting the development of common miRNA methods focusing on downstream steps of miRNA detection, annotation and quantification. AVAILABILITY AND IMPLEMENTATION: https://github.com/miRTop/mirGFF3/ and https://github.com/miRTop/mirtop. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
MicroRNAs , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Sequence Analysis, RNA , Transcriptome
8.
Cancer Res ; 79(12): 3034-3049, 2019 06 15.
Article in English | MEDLINE | ID: mdl-30996049

ABSTRACT

tRNA-derived fragments (tRF) are a class of potent regulatory RNAs. We mined the datasets from The Cancer Genome Atlas (TCGA) representing 32 cancer types with a deterministic and exhaustive pipeline for tRNA fragments. We found that mitochondrial tRNAs contribute disproportionally more tRFs than nuclear tRNAs. Through integrative analyses, we uncovered a multitude of statistically significant and context-dependent associations between the identified tRFs and mRNAs. In many of the 32 cancer types, these associations involve mRNAs from developmental processes, receptor tyrosine kinase signaling, the proteasome, and metabolic pathways that include glycolysis, oxidative phosphorylation, and ATP synthesis. Even though the pathways are common to multiple cancers, the association of specific mRNAs with tRFs depends on and differs from cancer to cancer. The associations between tRFs and mRNAs extend to genomic properties as well; specifically, tRFs are positively correlated with shorter genes that have a higher density in repeats, such as ALUs, MIRs, and ERVLs. Conversely, tRFs are negatively correlated with longer genes that have a lower repeat density, suggesting a possible dichotomy between cell proliferation and differentiation. Analyses of bladder, lung, and kidney cancer data indicate that the tRF-mRNA wiring can also depend on a patient's sex. Sex-dependent associations involve cyclin-dependent kinases in bladder cancer, the MAPK signaling pathway in lung cancer, and purine metabolism in kidney cancer. Taken together, these findings suggest diverse and wide-ranging roles for tRFs and highlight the extensive interconnections of tRFs with key cellular processes and human genomic architecture. SIGNIFICANCE: Across 32 TCGA cancer contexts, nuclear and mitochondrial tRNA fragments exhibit associations with mRNAs that belong to concrete pathways, encode proteins with particular destinations, have a biased repeat content, and are sex dependent.


Subject(s)
Gene Regulatory Networks , Genome, Human , Health Status Disparities , Neoplasms/genetics , Neoplasms/pathology , RNA, Messenger/genetics , RNA, Transfer/genetics , Cell Nucleus/genetics , Cell Nucleus/metabolism , Cell Proliferation , Humans , Mitochondria/genetics , Mitochondria/metabolism , Neoplasms/classification , Neoplasms/metabolism , RNA, Messenger/metabolism , RNA, Transfer/metabolism , Transcriptome
9.
Sci Rep ; 8(1): 5314, 2018 03 28.
Article in English | MEDLINE | ID: mdl-29593348

ABSTRACT

MicroRNA (miRNA) isoforms ("isomiRs") and tRNA-derived fragments ("tRFs") are powerful regulatory non-coding RNAs (ncRNAs). In human tissues, both types of molecules are abundant, with expression patterns that depend on a person's race, sex and population origin. Here, we present our analyses of the Prostate Cancer (PRAD) datasets of The Cancer Genome Atlas (TCGA) from the standpoint of isomiRs and tRFs. This study represents the first simultaneous examination of isomiRs and tRFs in a large cohort of PRAD patients. We find that isomiRs and tRFs have extensive correlations with messenger RNAs (mRNAs). These correlations are disrupted in PRAD, which suggests disruptions of the regulatory network in the disease state. Notably, we find that the profiles of isomiRs and tRFs differ in patients belonging to different races. We hope that the presented findings can lay the groundwork for future research efforts aimed at elucidating the functional roles of the numerous and distinct members of these two categories of ncRNAs that are present in PRAD.


Subject(s)
Gene Expression Profiling/methods , Prostatic Neoplasms/genetics , RNA Isoforms/genetics , Databases, Genetic , Humans , Male , MicroRNAs/genetics , RNA, Messenger/genetics , RNA, Transfer/genetics , Transcriptome/genetics
10.
Cancer Res ; 78(5): 1140-1154, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29229607

ABSTRACT

Triple-negative breast cancer (TNBC) is a breast cancer subtype characterized by marked differences between White and Black/African-American women. We performed a systems-level analysis on datasets from The Cancer Genome Atlas to elucidate how the expression patterns of mRNAs are shaped by regulatory noncoding RNAs (ncRNA). Specifically, we studied isomiRs, that is, isoforms of miRNAs, and tRNA-derived fragments (tRF). In normal breast tissue, we observed a marked cohesiveness in both the ncRNA and mRNA layers and the associations between them. This cohesiveness was widely disrupted in TNBC. Many mRNAs become either differentially expressed or differentially wired between normal breast and TNBC in tandem with isomiR or tRF dysregulation. The affected pathways included energy metabolism, cell signaling, and immune responses. Within TNBC, the wiring of the affected pathways with isomiRs and tRFs differed in each race. Multiple isomiRs and tRFs arising from specific miRNA loci (e.g., miR-200c, miR-21, the miR-17/92 cluster, the miR-183/96/182 cluster) and from specific tRNA loci (e.g., the nuclear tRNAGly and tRNALeu, the mitochondrial tRNAVal and tRNAPro) were strongly associated with the observed race disparities in TNBC. We highlight the race-specific aspects of transcriptome wiring by discussing in detail the metastasis-related MAPK and the Wnt/β-catenin signaling pathways, two of the many key pathways that were found differentially wired. In conclusion, by employing a data- and knowledge-driven approach, we comprehensively analyzed the normal and cancer transcriptomes to uncover novel key contributors to the race-based disparities of TNBC. SIGNIFICANCE: This big data-driven study comparing normal and cancer transcriptomes uncovers RNA expression differences between Caucasian and African-American patients with triple-negative breast cancer that might help explain disparities in incidence and aggressive character. Cancer Res; 78(5); 1140-54. ©2017 AACR.


Subject(s)
Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , MicroRNAs/genetics , RNA, Transfer/genetics , Racial Groups/genetics , Triple Negative Breast Neoplasms/ethnology , Triple Negative Breast Neoplasms/genetics , Biomarkers, Tumor , Female , Health Status Disparities , Humans , RNA, Long Noncoding , RNA, Messenger , Transcriptome
11.
Nucleic Acids Res ; 46(D1): D152-D159, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29186503

ABSTRACT

MINTbase is a repository that comprises nuclear and mitochondrial tRNA-derived fragments ('tRFs') found in multiple human tissues. The original version of MINTbase comprised tRFs obtained from 768 transcriptomic datasets. We used our deterministic and exhaustive tRF mining pipeline to process all of The Cancer Genome Atlas datasets (TCGA). We identified 23 413 tRFs with abundance of ≥ 1.0 reads-per-million (RPM). To facilitate further studies of tRFs by the community, we just released version 2.0 of MINTbase that contains information about 26 531 distinct human tRFs from 11 719 human datasets as of October 2017. Key new elements include: the ability to filter tRFs on-the-fly by minimum abundance thresholding; the ability to filter tRFs by tissue keywords; easy access to information about a tRF's maximum abundance and the datasets that contain it; the ability to generate relative abundance plots for tRFs across cancer types and convert them into embeddable figures; MODOMICS information about modifications of the parental tRNA, etc. Version 2.0 of MINTbase contains 15x more datasets and nearly 4x more distinct tRFs than the original version, yet continues to offer fast, interactive access to its contents. Version 2.0 is available freely at http://cm.jefferson.edu/MINTbase/.
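The abundance threshold mentioned above (≥ 1.0 reads-per-million) can be illustrated with a minimal sketch. The function names and toy counts are hypothetical, and the choice of denominator (here, the total number of reads in the dataset) is an assumption; the abstract does not specify how MINTbase normalizes.

```python
def reads_per_million(counts, total_reads):
    """Convert raw read counts to reads-per-million (RPM).

    `total_reads` is the normalization denominator; which reads it should
    include is an assumption left to the caller.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {seq: n * 1_000_000 / total_reads for seq, n in counts.items()}


def filter_by_min_rpm(rpm, min_rpm=1.0):
    """Keep only the entries whose RPM is at or above the threshold."""
    return {seq: value for seq, value in rpm.items() if value >= min_rpm}


# Toy counts for three hypothetical fragments in a 5-million-read dataset.
toy_counts = {"tRF_a": 250, "tRF_b": 3, "tRF_c": 40}
rpm = reads_per_million(toy_counts, total_reads=5_000_000)
kept = filter_by_min_rpm(rpm, min_rpm=1.0)
print(sorted(kept))  # ['tRF_a', 'tRF_c']
```

In this toy case tRF_b has 0.6 RPM and is dropped, while the other two fragments pass the threshold.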


Subject(s)
Databases, Nucleic Acid , Neoplasms/genetics , RNA, Transfer/genetics , Genome, Human , Humans , RNA, Mitochondrial/genetics , RNA, Neoplasm/genetics , RNA, Nuclear/genetics , User-Computer Interface
12.
Methods Mol Biol ; 1680: 237-255, 2018.
Article in English | MEDLINE | ID: mdl-29030853

ABSTRACT

There is an increasing interest within the scientific community in identifying tRNA-derived fragments (tRFs) and elucidating the roles they play in the cell. Such endeavors can be greatly facilitated by mining the numerous datasets from many cellular contexts that exist publicly. However, the standard mapping tools cannot be used for the purpose. Several factors complicate this endeavor including: the presence of multiple identical or nearly identical isodecoders at various genomic locations; the presence of identical sequence segments that are shared by isodecoders of the same or even different anticodons; the existence of numerous partial tRNA sequences across the genome; the existence of hundreds of "lookalike" sequences that resemble true tRNAs; and others. This is generating a need for specialized tools that can mine deep sequencing data to identify and quantify tRFs. We discuss the various complicating factors and their ramifications, and how to use and run MINTmap, a tool that addresses these considerations.


Subject(s)
Gene Expression Profiling , RNA, Transfer/genetics , Sequence Analysis, RNA , Algorithms , Chromosome Mapping , Computational Biology/methods , Databases, Nucleic Acid , Software , User-Computer Interface , Web Browser
13.
Noncoding RNA ; 3(2), 2017 Jun.
Article in English | MEDLINE | ID: mdl-28730153

ABSTRACT

We sought to determine whether commercial quantitative polymerase chain reaction (qPCR) methods are capable of distinguishing isomiRs: variants of mature microRNAs (miRNAs) with sequence endpoint differences. We used two commercially available miRNA qPCR methods to quantify miR-21-5p in both synthetic and real cell contexts. We find that although these miRNA qPCR methods possess high sensitivity for specific sequences, they also pick up background signals from closely related isomiRs, which influences the reliable quantification of individual isomiRs. We conclude that these methods do not possess the requisite specificity for reliable isomiR quantification.

14.
Genome Biol ; 18(1): 98, 2017 05 24.
Article in English | MEDLINE | ID: mdl-28535802

ABSTRACT

BACKGROUND: Non-coding RNAs have been drawing increasing attention in recent years as functional data suggest that they play important roles in key cellular processes. N-BLR is a primate-specific long non-coding RNA that modulates the epithelial-to-mesenchymal transition, facilitates cell migration, and increases colorectal cancer invasion. RESULTS: We performed multivariate analyses of data from two independent cohorts of colorectal cancer patients and show that the abundance of N-BLR is associated with tumor stage, invasion potential, and overall patient survival. Through in vitro and in vivo experiments we found that N-BLR facilitates migration primarily via crosstalk with E-cadherin and ZEB1. We showed that this crosstalk is mediated by a pyknon, a short ~20 nucleotide-long DNA motif contained in the N-BLR transcript and is targeted by members of the miR-200 family. In light of these findings, we used a microarray to investigate the expression patterns of other pyknon-containing genomic loci. We found multiple such loci that are differentially transcribed between healthy and diseased tissues in colorectal cancer and chronic lymphocytic leukemia. Moreover, we identified several new loci whose expression correlates with the colorectal cancer patients' overall survival. CONCLUSIONS: The primate-specific N-BLR is a novel molecular contributor to the complex mechanisms that underlie metastasis in colorectal cancer and a potential novel biomarker for this disease. The presence of a functional pyknon within N-BLR and the related finding that many more pyknon-containing genomic loci in the human genome exhibit tissue-specific and disease-specific expression suggests the possibility of an alternative class of biomarkers and therapeutic targets that are primate-specific.


Subject(s)
Colorectal Neoplasms/genetics , Epithelial-Mesenchymal Transition/genetics , Gene Expression Regulation, Neoplastic , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , RNA, Long Noncoding/genetics , Adult , Aged , Aged, 80 and over , Animals , Cadherins/genetics , Cadherins/metabolism , Cell Movement , Cell Proliferation , Cohort Studies , Colorectal Neoplasms/metabolism , Colorectal Neoplasms/mortality , Colorectal Neoplasms/pathology , Female , Genetic Loci , HCT116 Cells , Humans , Leukemia, Lymphocytic, Chronic, B-Cell/metabolism , Leukemia, Lymphocytic, Chronic, B-Cell/mortality , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Male , MicroRNAs/genetics , MicroRNAs/metabolism , Middle Aged , Neoplasm Invasiveness , Neoplasm Staging , Nucleotide Motifs , RNA, Long Noncoding/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Survival Analysis , Transcription, Genetic , Zinc Finger E-box-Binding Homeobox 1/genetics , Zinc Finger E-box-Binding Homeobox 1/metabolism
15.
Sci Rep ; 7: 41184, 2017 02 21.
Article in English | MEDLINE | ID: mdl-28220888

ABSTRACT

Transfer RNA fragments (tRFs) are an established class of constitutive regulatory molecules that arise from precursor and mature tRNAs. RNA deep sequencing (RNA-seq) has greatly facilitated the study of tRFs. However, the repeat nature of the tRNA templates and the idiosyncrasies of tRNA sequences necessitate the development and use of methodologies that differ markedly from those used to analyze RNA-seq data when studying microRNAs (miRNAs) or messenger RNAs (mRNAs). Here we present MINTmap (for MItochondrial and Nuclear TRF mapping), a method and a software package that was developed specifically for the quick, deterministic and exhaustive identification of tRFs in short RNA-seq datasets. In addition to identifying them, MINTmap is able to unambiguously calculate and report both raw and normalized abundances for the discovered tRFs. Furthermore, to ensure specificity, MINTmap identifies the subset of discovered tRFs that could be originating outside of tRNA space and flags them as candidate false positives. Our comparative analysis shows that MINTmap exhibits superior sensitivity and specificity to other available methods while also being exceptionally fast. The MINTmap codes are available through https://github.com/TJU-CMC-Org/MINTmap/ under an open source GNU GPL v3.0 license.
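The idea of flagging fragments that could also originate outside of tRNA space can be sketched as follows. This is a toy illustration by exact substring search over synthetic sequences, not MINTmap's actual algorithm or data structures; the function name and the "exclusive" / "not-exclusive" labels are assumptions made for this example.

```python
def classify_fragments(fragments, trna_seqs, other_seqs):
    """Label each fragment found in tRNA space by whether it also
    occurs in any sequence outside tRNA space.

    Fragments that match no tRNA sequence are skipped entirely.
    """
    labels = {}
    for frag in fragments:
        in_trna = any(frag in seq for seq in trna_seqs)
        if not in_trna:
            continue
        elsewhere = any(frag in seq for seq in other_seqs)
        labels[frag] = "not-exclusive" if elsewhere else "exclusive"
    return labels


# Synthetic toy sequences; "other" stands in for the rest of the genome.
shared = "GGUUCAGUGGUAGA"
trna_seqs = ["GCAUUGGU" + shared + "AUUCUCGCC"]
other_seqs = ["AAA" + shared + "AAA", "CCCCCCCC"]

labels = classify_fragments([shared, "GCAUUGGUGG", "CCCCCC"], trna_seqs, other_seqs)
print(labels)
```

Here the shared fragment is labeled "not-exclusive" (a candidate false positive in the abstract's terms), "GCAUUGGUGG" is labeled "exclusive", and "CCCCCC" is ignored because it does not occur in the toy tRNA sequence.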


Subject(s)
Cell Nucleus/genetics , High-Throughput Nucleotide Sequencing/methods , Mitochondria/genetics , RNA, Transfer/genetics , Sequence Analysis, RNA/methods , Animals , Humans , Male , Mice , Models, Molecular , Nucleic Acid Conformation , RNA, Transfer/chemistry , Software
16.
Nucleic Acids Res ; 45(6): 2973-2985, 2017 04 07.
Article in English | MEDLINE | ID: mdl-28206648

ABSTRACT

Isoforms of human miRNAs (isomiRs) are constitutively expressed with tissue- and disease-subtype-dependencies. We studied 10 271 tumor datasets from The Cancer Genome Atlas (TCGA) to evaluate whether isomiRs can distinguish amongst 32 TCGA cancers. Unlike previous approaches, we built a classifier that relied solely on 'binarized' isomiR profiles: each isomiR is simply labeled as 'present' or 'absent'. The resulting classifier successfully labeled tumor datasets with an average sensitivity of 90% and a false discovery rate (FDR) of 3%, surpassing the performance of expression-based classification. The classifier maintained its power even after a 15× reduction in the number of isomiRs that were used for training. Notably, the classifier could correctly predict the cancer type in non-TCGA datasets from diverse platforms. Our analysis revealed that the most discriminatory isomiRs happen to also be differentially expressed between normal tissue and cancer. Even so, we find that these highly discriminating isomiRs have not been attracting the most research attention in the literature. Given their ability to successfully classify datasets from 32 cancers, isomiRs and our resulting 'Pan-cancer Atlas' of isomiR expression could serve as a suitable framework to explore novel cancer biomarkers.
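The "binarized" representation described above, in which each isomiR is labeled only as present or absent, can be sketched in a few lines. The abundance threshold, the sample values, and the Jaccard comparison are assumptions added for illustration; the abstract does not state the presence criterion, and the similarity measure is not the classifier used in the study.

```python
def binarize(profile, threshold=0.0):
    """Return the set of isomiRs considered 'present' in a sample,
    i.e. those whose abundance exceeds the (assumed) threshold."""
    return {name for name, abundance in profile.items() if abundance > threshold}


def jaccard(a, b):
    """Jaccard similarity between two presence sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical abundance profiles for three samples.
sample_1 = {"isomiR_A": 120.0, "isomiR_B": 0.0, "isomiR_C": 3.5}
sample_2 = {"isomiR_A": 4.0, "isomiR_B": 0.0, "isomiR_C": 900.0}
sample_3 = {"isomiR_A": 0.0, "isomiR_B": 55.0, "isomiR_C": 0.0}

b1, b2, b3 = binarize(sample_1), binarize(sample_2), binarize(sample_3)
print(jaccard(b1, b2), jaccard(b1, b3))  # 1.0 0.0
```

Samples 1 and 2 have very different abundances but identical presence sets, which is the information a binarized profile keeps and an expression-based profile would weigh differently.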


Subject(s)
MicroRNAs/metabolism , Neoplasms/classification , Cluster Analysis , Datasets as Topic , Humans , Neoplasms/genetics , Neoplasms/metabolism , RNA Isoforms/metabolism
17.
Nucleic Acids Res ; 45(9): e70, 2017 May 19.
Article in English | MEDLINE | ID: mdl-28108659

ABSTRACT

Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has been presenting a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of a Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amounts of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides a high-throughput technique for identifying tRNA profiles and their regulations in various transcriptomes, which could play important regulatory roles in translation and other biological processes.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , RNA, Transfer/chemistry , Sequence Analysis, RNA/methods , Cell Line, Tumor , Computational Biology , DNA, Complementary , Humans , Reproducibility of Results , Sensitivity and Specificity
18.
Bioinformatics ; 32(16): 2481-9, 2016 08 15.
Article in English | MEDLINE | ID: mdl-27153631

ABSTRACT

MOTIVATION: It has been known that mature transfer RNAs (tRNAs) that are encoded in the nuclear genome give rise to short molecules, collectively known as tRNA fragments or tRFs. Recently, we reported that, in healthy individuals and in patients, tRFs are constitutive, arise from mitochondrial as well as from nuclear tRNAs, and have composition and abundances that depend on a person's sex, population origin and race as well as on tissue, disease and disease subtype. Our findings as well as similar work by other groups highlight the importance of tRFs and presage an increase in the community's interest in elucidating the roles of tRFs in health and disease. RESULTS: We created MINTbase, a web-based framework that serves the dual-purpose of being a content repository for tRFs and a tool for the interactive exploration of these newly discovered molecules. A key feature of MINTbase is that it deterministically and exhaustively enumerates all possible genomic locations where a sequence fragment can be found and indicates which fragments are exclusive to tRNA space, and thus can be considered as tRFs: this is a very important consideration given that the genomes of higher organisms are riddled with partial tRNA sequences and with tRNA-lookalikes whose aberrant transcripts can be mistaken for tRFs. MINTbase is extremely flexible and integrates and presents tRF information from multiple yet interconnected vantage points ('vistas'). Vistas permit the user to interactively personalize the information that is returned and the manner in which it is displayed. MINTbase can report comparative information on how a tRF is distributed across all anticodon/amino acid combinations, provides alignments between a tRNA and multiple tRFs with which the user can interact, provides details on published studies that reported a tRF as expressed, etc. 
Importantly, we designed MINTbase to contain all possible tRFs that could ever be produced by mature tRNAs: this allows us to report on their genomic distributions, anticodon/amino acid properties, alignments, etc. while giving users the ability to investigate candidate tRF molecules at will before embarking on focused experimental explorations. Lastly, we also introduce a new labeling scheme that is tRF-sequence-based and allows users to associate a tRF with a universally unique label ('tRF-license plate') that is independent of a genome assembly and does not require any brokering mechanism. AVAILABILITY AND IMPLEMENTATION: MINTbase is freely accessible at http://cm.jefferson.edu/MINTbase/. Dataset submissions to MINTbase can be initiated at http://cm.jefferson.edu/MINTsubmit/. CONTACT: isidore.rigoutsos@jefferson.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
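
The idea of exhaustively enumerating every genomic location of a fragment, and then asking whether all of those locations fall inside tRNA space, can be illustrated with a minimal sketch. This is not MINTbase's implementation: the function names (`find_all`, `is_exclusive_to_trna_space`), the toy genome, and the locus format `(chromosome, start, end, strand)` with 0-based half-open coordinates are all assumptions made for illustration, and a real genome would need an indexed search rather than a linear scan.

```python
from typing import Dict, List, Tuple

# Hypothetical types for this sketch (not taken from MINTbase).
Hit = Tuple[str, int, str]            # (chromosome, 0-based start, strand)
Locus = Tuple[str, int, int, str]     # (chromosome, start, end, strand), half-open

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(genome: Dict[str, str], fragment: str) -> List[Hit]:
    """Enumerate every exact occurrence of `fragment` on both strands."""
    hits: List[Hit] = []
    for strand, query in (("+", fragment), ("-", reverse_complement(fragment))):
        for chrom, seq in genome.items():
            start = seq.find(query)
            while start != -1:
                hits.append((chrom, start, strand))
                # Advance by one base so overlapping occurrences are not missed.
                start = seq.find(query, start + 1)
    return hits


def is_exclusive_to_trna_space(
    genome: Dict[str, str], trna_loci: List[Locus], fragment: str
) -> bool:
    """True only if the fragment occurs at least once and every occurrence
    lies entirely within an annotated tRNA locus on the same strand."""
    hits = find_all(genome, fragment)
    if not hits:
        return False
    size = len(fragment)
    return all(
        any(
            chrom == l_chrom
            and strand == l_strand
            and l_start <= start
            and start + size <= l_end
            for (l_chrom, l_start, l_end, l_strand) in trna_loci
        )
        for (chrom, start, strand) in hits
    )


# Toy data, invented for illustration only.
TOY_GENOME = {
    "chr1": "AAAA" + "GCATTGGTGGTTC" + "AAAAA",  # annotated tRNA-like locus at 4..17
    "chr2": "CCTGGTGGCC",                        # lookalike outside tRNA space
}
TOY_TRNA_LOCI = [("chr1", 4, 17, "+")]
```

With this toy data, `GCATTGG` occurs only inside the annotated locus, whereas `TGGTGG` also occurs on `chr2` and so would not count as exclusive to tRNA space under this sketch's definition.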


Subject(s)
Mitochondria , RNA, Transfer , Anticodon , Base Sequence , Humans , Internet , Sequence Alignment , Software
19.
BMC Bioinformatics ; 17: 123, 2016 Mar 10.
Article in English | MEDLINE | ID: mdl-26961774

ABSTRACT

We examine several of the choices that went into the design of tDRmapper, a recently reported tool for identifying transfer RNA (tRNA) fragments in deep sequencing data, evaluate them in the context of currently available knowledge, and discuss their potential impact on the output that the tool generates.


Subject(s)
Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing/methods , Molecular Sequence Annotation , RNA, Transfer/genetics , RNA, Transfer/chemistry
20.
Oncotarget ; 6(28): 24797-822, 2015 Sep 22.
Article in English | MEDLINE | ID: mdl-26325506

ABSTRACT

We analyzed transcriptomic data from 452 healthy men and women representing five different human populations and two races, and 311 breast cancer samples from The Cancer Genome Atlas. Our studies revealed numerous constitutive, distinct fragments with overlapping sequences and quantized lengths that persist across dozens of individuals and arise from the genomic loci of all nuclear and mitochondrial human transfer RNAs (tRNAs). Surprisingly, we discovered that the tRNA fragments' length, starting and ending points, and relative abundance depend on gender, population, race and also on amino acid identity, anticodon, genomic locus, tissue, disease, and disease subtype. Moreover, the length distribution of mitochondrially-encoded tRNAs differs from that of nuclearly-encoded tRNAs, and the specifics of these distributions depend on tissue. Notably, tRNA fragments from the same anticodon do not have correlated abundances. We also report on a novel category of tRNA fragments that significantly contribute to the differences we observe across tissues, genders, populations, and races: these fragments, referred to as i-tRFs, are abundant in human tissues, wholly internal to the respective mature tRNA, and can straddle the anticodon. HITS-CLIP data analysis revealed that tRNA fragments are loaded on Argonaute in a cell-dependent manner, suggesting cell-dependent functional roles through the RNA interference pathway. We experimentally validated two i-tRF molecules: the first was found in 21 of 22 tested breast tumor and adjacent normal samples and was differentially abundant between health and disease, whereas the second was found in all eight tested breast cancer cell lines.


Subject(s)
Gene Expression Regulation , RNA, Transfer/genetics , RNA/genetics , Transcriptome/genetics , Anticodon/genetics , Base Sequence , Breast Neoplasms/genetics , Cell Line , Cell Line, Tumor , Cluster Analysis , Female , Genetic Variation , Humans , MCF-7 Cells , Male , Models, Genetic , Molecular Sequence Data , Nucleic Acid Conformation , RNA/chemistry , RNA, Mitochondrial , RNA, Transfer/chemistry , RNA, Transfer/classification