ABSTRACT
L1 elements represent the only currently active, autonomous retrotransposon in the human genome, and they make major contributions to human genetic instability. The vast majority of the 500 000 L1 elements in the genome are defective, and only relatively few can contribute to the retrotransposition process. However, there is currently no comprehensive approach to identify the specific loci that are actively transcribed and to distinguish them from the excess of L1-related sequences that are co-transcribed within genes. We have developed RNA-Seq procedures, together with a 1200 bp 5′ RACE product coupled with PacBio sequencing, that can identify the specific L1 loci that contribute most of the L1-related RNA reads. At least 99% of L1-related sequences found in RNA do not arise from the L1 promoter; instead, they represent pieces of L1 incorporated in other cellular RNAs. In any given cell type, relatively few active L1 loci contribute to the 'authentic' L1 transcripts that arise from the L1 promoter, with significantly different loci expressed in different tissues.
Subject(s)
Chromosomes, Human/chemistry; Genetic Loci; Genome, Human; Long Interspersed Nucleotide Elements; RNA, Messenger/genetics; Transcription, Genetic; Animals; Chromosome Mapping; Chromosomes, Human/metabolism; DNA, Complementary/genetics; DNA, Complementary/metabolism; Genomic Instability; HeLa Cells; Humans; Mice; NIH 3T3 Cells; Nucleic Acid Amplification Techniques; Promoter Regions, Genetic; RNA, Messenger/metabolism; Sequence Analysis, RNA
ABSTRACT
BACKGROUND: Approximately 17% of the human genome consists of the Long INterspersed Element-1 (LINE-1 or L1) retrotransposon, the only currently active autonomous family of retroelements. Though L1 elements have helped to shape mammalian genome evolution over millions of years, L1 activity can also be mutagenic and result in human disease. L1 expression has the potential to contribute to genomic instability via retrotransposition and DNA double-strand breaks (DSBs). Additionally, L1 is responsible for structural genomic variations induced by other transposable elements such as Alu and SVA, which rely on the L1 ORF2 protein for their propagation. Most of the genomic damage associated with L1 activity originates with the endonuclease domain of the ORF2 protein, which nicks the DNA in preparation for target-primed reverse transcription. RESULTS: Bioinformatic analysis of full-length L1 loci residing in the human genome identified numerous mutations in the amino acid sequence of the ORF2 endonuclease domain. Some of these mutations were found in residues that were predicted to be phosphorylation sites for cellular kinases. We mutated several of these putative phosphorylation sites in the ORF2 endonuclease domain and investigated the effect of these mutations on the function of the full-length ORF2 protein (ORF2p) and of the endonuclease domain (ENp) alone. Most of the single and multiple point mutations that were tested did not significantly impact expression of the full-length ORF2p or alter its ability to drive Alu retrotransposition. Similarly, most of those same mutations did not significantly alter expression of ENp or impair its ability to induce DNA damage and cause toxicity. CONCLUSIONS: Overall, our data demonstrate that the full-length ORF2p or the ENp alone can tolerate several specific single and multiple point mutations in the endonuclease domain without significant impairment of their ability to support Alu mobilization or induce DNA damage, respectively.