ABSTRACT
Public databases contain a planetary collection of nucleic acid sequences, but their systematic exploration has been inhibited by a lack of efficient methods for searching this corpus, which (at the time of writing) exceeds 20 petabases and is growing exponentially¹. Here we developed a cloud computing infrastructure, Serratus, to enable ultra-high-throughput sequence alignment at the petabase scale. We searched 5.7 million biologically diverse samples (10.2 petabases) for the hallmark gene RNA-dependent RNA polymerase and identified well over 10⁵ novel RNA viruses, thereby expanding the number of known species by roughly an order of magnitude. We characterized novel viruses related to coronaviruses, hepatitis delta virus and huge phages, respectively, and analysed their environmental reservoirs. To catalyse the ongoing revolution of viral discovery, we established a free and comprehensive database of these data and tools. Expanding the known sequence diversity of viruses can reveal the evolutionary origins of emerging pathogens and improve pathogen surveillance for the anticipation and mitigation of future pandemics.
Subjects
Cloud Computing , Genetic Databases , RNA Viruses/genetics , RNA Viruses/isolation & purification , Sequence Alignment/methods , Virology/methods , Virome/genetics , Animals , Archives , Bacteriophages/enzymology , Bacteriophages/genetics , Biodiversity , Coronavirus/classification , Coronavirus/enzymology , Coronavirus/genetics , Molecular Evolution , Hepatitis Delta Virus/enzymology , Hepatitis Delta Virus/genetics , Humans , Molecular Models , RNA Viruses/classification , RNA Viruses/enzymology , RNA-Dependent RNA Polymerase/chemistry , RNA-Dependent RNA Polymerase/genetics , Software
ABSTRACT
Kolmioviridae is a family of negative-sense RNA viruses with circular, viroid-like genomes of about 1.5-1.7 kb that are maintained in mammals, amphibians, birds, fish, insects and reptiles. Deltaviruses, for instance, can cause severe hepatitis in humans. Kolmiovirids encode delta antigen (DAg) and replicate using host-cell DNA-directed RNA polymerase II and ribozymes encoded in their genome and antigenome. They require evolutionarily unrelated helper viruses to provide envelopes and incorporate helper virus proteins for infectious particle formation. This is a summary of the International Committee on Taxonomy of Viruses (ICTV) Report on the family Kolmioviridae, which is available at ictv.global/report/kolmioviridae.
Subjects
Helper Viruses , Viroids , Animals , Humans , Biological Evolution , Negative-Sense RNA Viruses , RNA Polymerase II , Mammals
ABSTRACT
Overcoming drug resistance and targeting cancer stem cells remain challenges for curative cancer treatment. To investigate the role of microRNAs (miRNAs) in regulating drug resistance and leukemic stem cell (LSC) fate, we performed global transcriptome profiling in treatment-naive chronic myeloid leukemia (CML) stem/progenitor cells and identified that miR-185 levels anticipate their response to ABL tyrosine kinase inhibitors (TKIs). miR-185 functions as a tumor suppressor: its restored expression impaired survival of drug-resistant cells, sensitized them to TKIs in vitro, and markedly eliminated long-term repopulating LSCs and infiltrating blast cells, conferring a survival advantage in preclinical xenotransplantation models. Integrative analysis with mRNA profiles uncovered PAK6 as a crucial target of miR-185, and pharmacological inhibition of PAK6 perturbed the RAS/MAPK pathway and mitochondrial activity, sensitizing therapy-resistant cells to TKIs. Thus, miR-185 presents as a potential predictive biomarker, and dual targeting of miR-185-mediated PAK6 activity and BCR-ABL1 may provide a valuable strategy for overcoming drug resistance in patients.
Subjects
Neoplasm Drug Resistance/genetics , BCR-ABL Positive Chronic Myelogenous Leukemia/genetics , MicroRNAs/genetics , Neoplastic Stem Cells/pathology , p21-Activated Kinases/genetics , Animals , Leukemic Gene Expression Regulation/genetics , Heterografts , Humans , BCR-ABL Positive Chronic Myelogenous Leukemia/drug therapy , BCR-ABL Positive Chronic Myelogenous Leukemia/metabolism , Mice , SCID Mice , MicroRNAs/metabolism , Neoplastic Stem Cells/metabolism , Protein Kinase Inhibitors/therapeutic use , Signal Transduction/physiology , p21-Activated Kinases/metabolism
ABSTRACT
SUMMARY: Transposable elements (TEs) influence the evolution of novel transcriptional networks yet the specific and meaningful interpretation of how TE-derived transcriptional initiation contributes to the transcriptome has been marred by computational and methodological deficiencies. We developed LIONS for the analysis of RNA-seq data to specifically detect and quantify TE-initiated transcripts. AVAILABILITY AND IMPLEMENTATION: Source code, container, test data and instruction manual are freely available at www.github.com/ababaian/LIONS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
DNA Transposable Elements , RNA-Seq , Software , Exome Sequencing
ABSTRACT
BACKGROUND: Computational biology requires the reading and comprehension of biological data files. Plain-text formats such as SAM, VCF, GTF, PDB and FASTA, often contain critical information which is obfuscated by the data structure complexity. RESULTS: bioSyntax ( https://biosyntax.org/ ) is a freely available suite of biological syntax highlighting packages for vim, gedit, Sublime, VSCode, and less. bioSyntax improves the legibility of low-level biological data in the bioinformatics workspace. CONCLUSION: bioSyntax supports computational scientists in parsing and comprehending their data efficiently and thus can accelerate research output.
Subjects
Computational Biology , Software , Information Storage and Retrieval , Nucleotides/genetics , Sequence Alignment
ABSTRACT
Remnants of ancient transposable elements (TEs) are abundant in mammalian genomes. These sequences harbor multiple regulatory motifs and hence are capable of influencing expression of host genes. In response to environmental changes, TEs are known to be released from epigenetic repression and to become transcriptionally active. Such activation could also lead to lineage-inappropriate activation of oncogenes, as one study described in Hodgkin lymphoma. However, little further evidence for this mechanism in other cancers has been reported. Here, we reanalyzed whole transcriptome data from a large cohort of patients with diffuse large B-cell lymphoma (DLBCL) compared with normal B-cell centroblasts to detect genes ectopically expressed through activation of TE promoters. We have identified 98 such TE-gene chimeric transcripts that were exclusively expressed in primary DLBCL cases and confirmed several in DLBCL-derived cell lines. We further characterized a TE-gene chimeric transcript involving a fatty acid-binding protein gene (LTR2-FABP7), normally expressed in brain, that was ectopically expressed in a subset of DLBCL patients through the use of an endogenous retroviral LTR promoter of the LTR2 family. The LTR2-FABP7 chimeric transcript encodes a novel chimeric isoform of the protein with characteristics distinct from native FABP7. In vitro studies reveal a dependency for DLBCL cell line proliferation and growth on LTR2-FABP7 chimeric protein expression. Taken together, these data demonstrate the significance of TEs as regulators of aberrant gene expression in cancer and suggest that LTR2-FABP7 may contribute to the pathogenesis of DLBCL in a subgroup of patients.
Subjects
Carrier Proteins/genetics , Carrier Proteins/metabolism , Diffuse Large B-Cell Lymphoma/genetics , Diffuse Large B-Cell Lymphoma/metabolism , Tumor Suppressor Proteins/genetics , Tumor Suppressor Proteins/metabolism , Tumor Cell Line , DNA Transposable Elements/genetics , Genetic Epigenesis , Fatty Acid-Binding Protein 7 , Fatty Acids/metabolism , Neoplastic Gene Expression Regulation , Genetic Testing , Humans , Diffuse Large B-Cell Lymphoma/etiology , Oncogene Fusion Proteins/genetics , Oncogene Fusion Proteins/metabolism , Genetic Promoter Regions , Protein Isoforms/genetics , Protein Isoforms/metabolism , Messenger RNA/genetics , Messenger RNA/metabolism , Neoplasm RNA/genetics , Neoplasm RNA/metabolism , Retroelements/genetics , Terminal Repeat Sequences , Tissue Array Analysis , Transcriptional Activation
ABSTRACT
[This corrects the article DOI: 10.7717/peerj.14055.].
ABSTRACT
We are entering a 'Platinum Age of Virus Discovery', an era marked by exponential growth in the discovery of virus biodiversity, and driven by advances in metagenomics and computational analysis. In the ecosystem of a human (or any animal) there are more species of viruses than simply those directly infecting the animal cells. Viruses can infect all organisms constituting the microbiome, including bacteria, fungi, and unicellular parasites. Thus the complexity of possible interactions between host, microbe, and viruses is unfathomable. To understand this interaction network we must employ computationally assisted virology as a means of analyzing and interpreting the millions of available samples to make inferences about the ways in which viruses may intersect human health. From a computational viral screen of human neuronal datasets, we identified a novel narnavirus Apocryptovirus odysseus (Ao) which likely infects the neurotropic parasite Toxoplasma gondii. Previously, several parasitic protozoan viruses (PPVs) have been mechanistically established as triggers of host innate responses, and here we present in silico evidence that Ao is a plausible pro-inflammatory factor in human and mouse cells infected by T. gondii. T. gondii infects billions of people worldwide, yet the prognosis of toxoplasmosis disease is highly variable, and PPVs like Ao could function as a hitherto undescribed hypervirulence factor. In a broader screen of over 7.6 million samples, we explored phylogenetically proximal viruses to Ao and discovered nineteen Apocryptovirus species, all found in libraries annotated as vertebrate transcriptome or metatranscriptomes. While samples containing this genus of narnaviruses are derived from sheep, goat, bat, rabbit, chicken, and pigeon samples, the presence of virus is strongly predictive of parasitic Apicomplexa nucleic acid co-occurrence, supporting the hypothesis that Apocryptovirus is a genus of parasite-infecting viruses.
This is a computational proof-of-concept study in which we rapidly analyze millions of datasets from which we distilled a mechanistically, ecologically, and phylogenetically refined hypothesis. We predict that this highly diverged Ao RNA virus is biologically a T. gondii infection, and that Ao, and other viruses like it, will modulate this disease which afflicts billions worldwide.
ABSTRACT
The recent CASP15 competition highlighted the critical role of multiple sequence alignments (MSAs) in protein structure prediction, as demonstrated by the success of the top AlphaFold2-based prediction methods. To push the boundaries of MSA utilization, we conducted a petabase-scale search of the Sequence Read Archive (SRA), resulting in gigabytes of aligned homologs for CASP15 targets. These were merged with default MSAs produced by ColabFold-search and provided to ColabFold-predict. By using SRA data, we achieved highly accurate predictions (GDT_TS > 70) for 66% of the non-easy targets, whereas using ColabFold-search default MSAs scored highly in only 52%. Next, we tested the effect of deep homology search and ColabFold's advanced features, such as more recycles, on prediction accuracy. While SRA homologs were most significant for improving ColabFold's CASP15 ranking from 11th to 3rd place, other strategies contributed too. We analyze these in the context of existing strategies to improve prediction.
Subjects
Computational Biology , Proteins , Computational Biology/methods , Proteins/chemistry , Sequence Alignment , Protein Conformation , Software , Algorithms , Protein Sequence Analysis/methods
ABSTRACT
Here, we describe the "Obelisks," a previously unrecognised class of viroid-like elements that we first identified in human gut metatranscriptomic data. "Obelisks" share several properties: (i) apparently circular RNA ~1kb genome assemblies, (ii) predicted rod-like secondary structures encompassing the entire genome, and (iii) open reading frames coding for a novel protein superfamily, which we call the "Oblins". We find that Obelisks form their own distinct phylogenetic group with no detectable sequence or structural similarity to known biological agents. Further, Obelisks are prevalent in tested human microbiome metatranscriptomes with representatives detected in ~7% of analysed stool metatranscriptomes (29/440) and in ~50% of analysed oral metatranscriptomes (17/32). Obelisk compositions appear to differ between the anatomic sites and are capable of persisting in individuals, with continued presence over >300 days observed in one case. Large scale searches identified 29,959 Obelisks (clustered at 90% nucleotide identity), with examples from all seven continents and in diverse ecological niches. From this search, a subset of Obelisks are identified to code for Obelisk-specific variants of the hammerhead type-III self-cleaving ribozyme. Lastly, we identified one case of a bacterial species (Streptococcus sanguinis) in which a subset of defined laboratory strains harboured a specific Obelisk RNA population. As such, Obelisks comprise a class of diverse RNAs that have colonised, and gone unnoticed in, human, and global microbiomes.
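As a quick arithmetic check on the prevalence figures quoted in the abstract above, the following minimal Python sketch recomputes the percentages from the stated counts (29/440 stool and 17/32 oral metatranscriptomes). The variable and function names are illustrative assumptions and do not come from the original study.

```python
# Counts quoted in the abstract above: (Obelisk-positive samples, samples analysed).
counts = {
    "stool": (29, 440),
    "oral": (17, 32),
}


def prevalence_percent(positive: int, total: int) -> float:
    """Return the share of positive samples as a percentage."""
    return 100.0 * positive / total


# Hypothetical helper structure: prevalence per anatomic site.
prevalence = {site: prevalence_percent(p, n) for site, (p, n) in counts.items()}

for site, pct in prevalence.items():
    print(f"{site}: {pct:.1f}%")
```

The computed values (about 6.6% and 53.1%) are consistent with the rounded "~7%" and "~50%" reported in the abstract.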
ABSTRACT
RNA viruses are ubiquitous components of the global virosphere, yet relatively little is known about their genetic diversity or the cellular mechanisms by which they exploit the biology of their diverse eukaryotic hosts. A hallmark of (+)ssRNA (positive single-stranded RNA) viruses is the ability to remodel host endomembranes for their own replication. However, the subcellular interplay between RNA viruses and host organelles that harbor gene expression systems, such as mitochondria, is complex and poorly understood. Here we report the discovery of 763 new virus sequences belonging to the family Mitoviridae by metatranscriptomic analysis, the identification of previously uncharacterized mitovirus clades, and a putative new viral class. With this expanded understanding of the diversity of mitovirus and encoded RNA-dependent RNA polymerases (RdRps), we annotate mitovirus-specific protein motifs and identify hallmarks of mitochondrial translation, including mitochondrion-specific codons. This study expands the known diversity of mitochondrial viruses and provides additional evidence that they co-opt mitochondrial biology for their survival. IMPORTANCE Metatranscriptomic studies have rapidly expanded the cadre of known RNA viruses, yet our understanding of how these viruses navigate the cytoplasmic milieu of their hosts to survive remains poorly characterized. In this study, we identify and assemble 763 new viral sequences belonging to the Mitoviridae, a family of (+)ssRNA viruses thought to interact with and remodel host mitochondria. We exploit this genetic diversity to identify new clades of Mitoviridae, annotate clade-specific sequence motifs that distinguish the mitoviral RdRp, and reveal patterns of RdRp codon usage consistent with translation on host cell mitoribosomes. These results serve as a foundation for understanding how mitoviruses co-opt mitochondrial biology for their proliferation.
Subjects
RNA Viruses , Viruses , Open Reading Frames , RNA Viruses/genetics , Viruses/genetics , Codon , RNA-Dependent RNA Polymerase/genetics
ABSTRACT
Earth's life may have originated as self-replicating RNA, and it has been argued that RNA viruses and viroid-like elements are remnants of such a pre-cellular RNA world. RNA viruses are defined by linear RNA genomes encoding an RNA-dependent RNA polymerase (RdRp), whereas viroid-like elements consist of small, single-stranded, circular RNA genomes that, in some cases, encode paired self-cleaving ribozymes. Here we show that the number of candidate viroid-like elements occurring in geographically and ecologically diverse niches is much higher than previously thought. We report that, amongst these circular genomes, fungal ambiviruses are viroid-like elements that undergo rolling circle replication and encode their own viral RdRp. Thus, ambiviruses are distinct infectious RNAs showing hybrid features of viroid-like RNAs and viruses. We also detected similar circular RNAs, containing active ribozymes and encoding RdRps, related to mitochondrial-like fungal viruses, highlighting fungi as an evolutionary hub for RNA viruses and viroid-like elements. Our findings point to a deep co-evolutionary history between RNA viruses and subviral elements and offer new perspectives on the origin and evolution of primordial infectious agents, and RNA life.
Subjects
RNA Viruses , Catalytic RNA , Viroids , Viroids/genetics , Catalytic RNA/genetics , Viral RNA/genetics , Virus Replication/genetics , RNA/genetics , RNA Viruses/genetics , RNA-Dependent RNA Polymerase/genetics , Fungi/genetics
ABSTRACT
The 2023 International Virus Bioinformatics Meeting was held in Valencia, Spain, from 24-26 May 2023, attracting approximately 180 participants worldwide. The primary objective of the conference was to establish a dynamic scientific environment conducive to discussion, collaboration, and the generation of novel research ideas. As the first in-person event following the SARS-CoV-2 pandemic, the meeting facilitated highly interactive exchanges among attendees. It served as a pivotal gathering for gaining insights into the current status of virus bioinformatics research and engaging with leading researchers and emerging scientists. The event comprised eight invited talks, 19 contributed talks, and 74 poster presentations across eleven sessions spanning three days. Topics covered included machine learning, bacteriophages, virus discovery, virus classification, virus visualization, viral infection, viromics, molecular epidemiology, phylodynamic analysis, RNA viruses, viral sequence analysis, viral surveillance, and metagenomics. This report provides rewritten abstracts of the presentations, a summary of the key research findings, and highlights shared during the meeting.
Subjects
Bacteriophages , RNA Viruses , Virus Diseases , Viruses , Humans , Computational Biology , Viruses/genetics
ABSTRACT
RNA viruses encoding a polymerase gene (riboviruses) dominate the known eukaryotic virome. High-throughput sequencing is revealing a wealth of new riboviruses known only from sequence, precluding classification by traditional taxonomic methods. Sequence classification is often based on polymerase sequences, but standardised methods to support this approach are currently lacking. To address this need, we describe the polymerase palmprint, a segment of the palm sub-domain robustly delineated by well-conserved catalytic motifs. We present an algorithm, Palmscan, which identifies palmprints in nucleotide and amino acid sequences; PALMdb, a collection of palmprints derived from public sequence databases; and palmID, a public website implementing palmprint identification, search, and annotation. Together, these methods demonstrate a proof-of-concept workflow for high-throughput characterisation of RNA viruses, paving the path for the continued rapid growth in RNA virus discovery anticipated in the coming decade.
Subjects
RNA Viruses , Amino Acid Sequence , Eukaryota , Nucleotidyltransferases , Algorithms
ABSTRACT
The ribosome is an RNA-protein complex that is essential for translation in all domains of life. The structural and catalytic core of the ribosome is its ribosomal RNA (rRNA). While mutations in ribosomal protein (RP) genes are known drivers of oncogenesis, oncogenic rRNA variants have remained elusive. We identify a cancer-specific single-nucleotide variation in 18S rRNA at nucleotide 1248.U in up to 45.9% of patients with colorectal carcinoma (CRC) and present across >22 cancer types. This is the site of a unique hyper-modified base, 1-methyl-3-α-amino-α-carboxyl-propyl pseudouridine (m1acp3Ψ), a >1-billion-years-conserved RNA modification at the peptidyl decoding site of the ribosome. A subset of CRC tumors we call hypo-m1acp3Ψ shows sub-stoichiometric m1acp3Ψ modification, unlike normal control tissues. An m1acp3Ψ knockout model and hypo-m1acp3Ψ patient tumors share a translational signature characterized by highly abundant ribosomal proteins. Thus, m1acp3Ψ-deficient rRNA forms an uncharacterized class of "onco-ribosome" which may serve as a chemotherapeutic target for treating cancer patients.
Subjects
Neoplasms/genetics , Oncogenes/genetics , Ribosomal RNA/metabolism , Ribosomal Proteins/metabolism , Ribosomes/metabolism , Base Sequence/genetics , Humans , Nucleic Acid Conformation , Pseudouridine/genetics
ABSTRACT
Patients with chronic myeloid leukemia (CML) often require lifelong therapy with ABL1 tyrosine kinase inhibitors (TKIs) due to a persisting TKI-resistant population of leukemic stem cells (LSCs). From transcriptome profiling, we show integrin-linked kinase (ILK), a key constituent of focal adhesions, is highly expressed in TKI-nonresponsive patient cells and their LSCs. Genetic and pharmacological inhibition of ILK impaired the survival of nonresponder patient cells, sensitizing them to TKIs, even in the presence of protective niche cells. Furthermore, ILK inhibition eliminated TKI-refractory LSCs from patients, but not normal HSCs, in vitro and in vivo. RNA-sequencing and functional validation studies implicated an important role of ILK in maintaining a requisite level of mitochondrial oxidative metabolism in highly purified, quiescent LSCs. Thus, these findings point to ILK as a critical survival mediator to TKIs and quiescent stem cells, offering an attractive therapeutic target and model for curative combination therapies in stem-cell-driven cancers.
Subjects
bcr-abl Fusion Proteins , BCR-ABL Positive Chronic Myelogenous Leukemia , Neoplasm Drug Resistance , Humans , BCR-ABL Positive Chronic Myelogenous Leukemia/drug therapy , Neoplastic Stem Cells , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/therapeutic use , Protein Serine-Threonine Kinases
ABSTRACT
Mechanistic studies in human cancer have relied heavily on cell lines and mouse models, but are limited by in vitro adaptation and species context issues, respectively. More recent efforts have utilized patient-derived xenografts; however, these are hampered by variable genetic background, inability to study early events, and practical issues with availability/reproducibility. We report here an efficient, reproducible model of T-cell leukemia in which lentiviral transduction of normal human cord blood yields aggressive leukemia that appears indistinguishable from natural disease. We utilize this synthetic model to uncover a role for oncogene-induced HOXB activation which is operative in leukemia cells-of-origin and persists in established tumors where it defines a novel subset of patients distinct from other known genetic subtypes and with poor clinical outcome. We show further that anterior HOXB genes are specifically activated in human T-ALL by an epigenetic mechanism and confer growth advantage in both pre-leukemia cells and established clones.
Subjects
Homeodomain Proteins/metabolism , Leukemia/metabolism , Multigene Family , Animals , Cell Proliferation , Genetic Epigenesis , Female , Heterografts , Homeodomain Proteins/genetics , Humans , Leukemia/genetics , Leukemia/physiopathology , Male , Mice , Inbred NOD Mice , Genetic Models , Oncogene Proteins/genetics , Oncogene Proteins/metabolism
ABSTRACT
Remnants of ancient transposable elements (TEs) are abundant in mammalian genomes. These sequences contain multiple regulatory motifs and hence are capable of influencing expression of host genes. TEs are known to be released from epigenetic repression and can become transcriptionally active in cancer. Such activation could also lead to lineage-inappropriate activation of oncogenes, as previously described in lymphomas. However, there are few reports of this mechanism occurring in non-blood cancers. Here, we re-analyzed whole transcriptome data from a large cohort of patients with colon cancer, compared to matched normal colon control samples, to detect genes or transcripts ectopically expressed through activation of TE promoters. Among many such transcripts, we identified six where the affected gene has a described role in cancer and where the TE-driven gene mRNA is expressed in primary colon cancer, but not normal matched tissue, and confirmed expression in colon cancer-derived cell lines. We further characterized a TE-gene chimeric transcript involving the Interleukin 33 (IL-33) gene (termed LTR-IL-33) that is ectopically expressed in a subset of colon cancer samples through the use of an endogenous retroviral long terminal repeat (LTR) promoter of the MSTD family. The LTR-IL-33 chimeric transcript encodes a novel shorter isoform of the protein, which is missing the initial N-terminus (including many conserved residues) of native IL-33. In vitro studies showed that LTR-IL-33 expression is required for optimal CRC cell line growth as 3D colonospheres. Taken together, these data demonstrate the significance of TEs as regulators of aberrant gene expression in colon cancer.
Subjects
Colorectal Neoplasms/pathology , DNA Transposable Elements/genetics , Interleukin-33/genetics , Amino Acid Sequence , Animals , Tumor Cell Line , Cell Proliferation , Neoplastic Gene Expression Regulation , Humans , Interleukin-33/chemistry , Genetic Promoter Regions/genetics , Protein Isoforms/chemistry , Protein Isoforms/genetics , Terminal Repeat Sequences/genetics
ABSTRACT
Cancer arises from a series of genetic and epigenetic changes, which result in abnormal expression or mutational activation of oncogenes, as well as suppression/inactivation of tumor suppressor genes. Aberrant expression of coding genes or long non-coding RNAs (lncRNAs) with oncogenic properties can be caused by translocations, gene amplifications, point mutations or other less characterized mechanisms. One such mechanism is the inappropriate usage of normally dormant, tissue-restricted or cryptic enhancers or promoters that serve to drive oncogenic gene expression. Dispersed across the human genome, endogenous retroviruses (ERVs) provide an enormous reservoir of autonomous gene regulatory modules, some of which have been co-opted by the host during evolution to play important roles in normal regulation of genes and gene networks. This review focuses on the "dark side" of such ERV regulatory capacity. Specifically, we discuss a growing number of examples of normally dormant or epigenetically repressed ERVs that have been harnessed to drive oncogenes in human cancer, a process we term onco-exaptation, and we propose potential mechanisms that may underlie this phenomenon.