Results 1 - 20 of 42
1.
Cell ; 173(6): 1356-1369.e22, 2018 05 31.
Article in English | MEDLINE | ID: mdl-29856954

ABSTRACT

Genetic changes causing brain size expansion in human evolution have remained elusive. Notch signaling is essential for radial glia stem cell proliferation and is a determinant of neuronal number in the mammalian cortex. We find that three paralogs of human-specific NOTCH2NL are highly expressed in radial glia. Functional analysis reveals that different alleles of NOTCH2NL have varying potencies to enhance Notch signaling by interacting directly with NOTCH receptors. Consistent with a role in Notch signaling, NOTCH2NL ectopic expression delays differentiation of neuronal progenitors, while deletion accelerates differentiation into cortical neurons. Furthermore, NOTCH2NL genes provide the breakpoints in 1q21.1 distal deletion/duplication syndrome, where duplications are associated with macrocephaly and autism and deletions with microcephaly and schizophrenia. Thus, the emergence of human-specific NOTCH2NL genes may have contributed to the rapid evolution of the larger human neocortex, accompanied by loss of genomic stability at the 1q21.1 locus and resulting recurrent neurodevelopmental disorders.


Subject(s)
Brain/embryology , Cerebral Cortex/physiology , Neurogenesis/physiology , Receptor, Notch2/metabolism , Signal Transduction , Animals , Cell Differentiation , Embryonic Stem Cells/metabolism , Female , Gene Deletion , Genes, Reporter , Gorilla gorilla , HEK293 Cells , Humans , Neocortex/cytology , Neural Stem Cells/metabolism , Neuroglia/metabolism , Neurons/metabolism , Pan troglodytes , Receptor, Notch2/genetics , Sequence Analysis, RNA
2.
Cell ; 161(2): 228-39, 2015 Apr 09.
Article in English | MEDLINE | ID: mdl-25860606

ABSTRACT

Somatic LINE-1 (L1) retrotransposition during neurogenesis is a potential source of genotypic variation among neurons. As a neurogenic niche, the hippocampus supports pronounced L1 activity. However, the basal parameters and biological impact of L1-driven mosaicism remain unclear. Here, we performed single-cell retrotransposon capture sequencing (RC-seq) on individual human hippocampal neurons and glia, as well as cortical neurons. An estimated 13.7 somatic L1 insertions occurred per hippocampal neuron and carried the sequence hallmarks of target-primed reverse transcription. Notably, hippocampal neuron L1 insertions were specifically enriched in transcribed neuronal stem cell enhancers and hippocampus genes, increasing their probability of functional relevance. In addition, bias against intronic L1 insertions sense oriented relative to their host gene was observed, perhaps indicating moderate selection against this configuration in vivo. These experiments demonstrate pervasive L1 mosaicism at genomic loci expressed in hippocampal neurons.


Subject(s)
Hippocampus/cytology , Long Interspersed Nucleotide Elements , Mosaicism , Neurons/cytology , Genetic Variation , Humans , Neurogenesis , Polymerase Chain Reaction , Tissue Banks
3.
Mol Cell ; 80(5): 915-928.e5, 2020 12 03.
Article in English | MEDLINE | ID: mdl-33186547

ABSTRACT

Transposable elements (TEs) drive genome evolution and are a notable source of pathogenesis, including cancer. While CpG methylation regulates TE activity, the locus-specific methylation landscape of mobile human TEs has to date proven largely inaccessible. Here, we apply new computational tools and long-read nanopore sequencing to directly infer CpG methylation of novel and extant TE insertions in hippocampus, heart, and liver, as well as paired tumor and non-tumor liver. As opposed to an indiscriminate stochastic process, we find pronounced demethylation of young long interspersed element 1 (LINE-1) retrotransposons in cancer, often distinct to the adjacent genome and other TEs. SINE-VNTR-Alu (SVA) retrotransposons, including their internal tandem repeat-associated CpG island, are near-universally methylated. We encounter allele-specific TE methylation and demethylation of aberrantly expressed young LINE-1s in normal tissues. Finally, we recover the complete sequences of tumor-specific LINE-1 insertions and their retrotransposition hallmarks, demonstrating how long-read sequencing can simultaneously survey the epigenome and detect somatic TE mobilization.


Subject(s)
DNA Methylation , DNA Transposable Elements , DNA, Neoplasm , Epigenesis, Genetic , Epigenome , Gene Expression Regulation, Neoplastic , Long Interspersed Nucleotide Elements , Nanopore Sequencing , Neoplasms , DNA, Neoplasm/genetics , DNA, Neoplasm/metabolism , Female , Gene Expression Profiling , Humans , Middle Aged , Neoplasms/genetics , Neoplasms/metabolism , Organ Specificity
4.
Mol Cell ; 75(3): 590-604.e12, 2019 08 08.
Article in English | MEDLINE | ID: mdl-31230816

ABSTRACT

Epigenetic silencing defends against LINE-1 (L1) retrotransposition in mammalian cells. However, the mechanisms that repress young L1 families and how L1 escapes to cause somatic genome mosaicism in the brain remain unclear. Here we report that a conserved Yin Yang 1 (YY1) transcription factor binding site mediates L1 promoter DNA methylation in pluripotent and differentiated cells. By analyzing 24 hippocampal neurons with three distinct single-cell genomic approaches, we characterized and validated a somatic L1 insertion bearing a 3' transduction. The source (donor) L1 for this insertion was slightly 5' truncated, lacked the YY1 binding site, and was highly mobile when tested in vitro. Locus-specific bisulfite sequencing revealed that the donor L1 and other young L1s with mutated YY1 binding sites were hypomethylated in embryonic stem cells, during neurodifferentiation, and in liver and brain tissue. These results explain how L1 can evade repression and retrotranspose in the human body.


Subject(s)
Epigenetic Repression/genetics , Long Interspersed Nucleotide Elements/genetics , Retroelements/genetics , YY1 Transcription Factor/genetics , Binding Sites/genetics , DNA Methylation/genetics , DNA-Binding Proteins/genetics , Genome, Human/genetics , Hippocampus/metabolism , Humans , Liver/metabolism , Neurons/metabolism , Single-Cell Analysis
5.
Genome Res ; 33(9): 1465-1481, 2023 09.
Article in English | MEDLINE | ID: mdl-37798118

ABSTRACT

Mice harbor ∼2800 intact copies of the retrotransposon Long Interspersed Element 1 (L1). The in vivo retrotransposition capacity of an L1 copy is defined by both its sequence integrity and epigenetic status, including DNA methylation of the monomeric units constituting young mouse L1 promoters. Locus-specific L1 methylation dynamics during development may therefore elucidate and explain spatiotemporal niches of endogenous retrotransposition but remain unresolved. Here, we interrogate the retrotransposition efficiency and epigenetic fate of source (donor) L1s, identified as mobile in vivo. We show that promoter monomer loss consistently attenuates the relative retrotransposition potential of their offspring (daughter) L1 insertions. We also observe that most donor/daughter L1 pairs are efficiently methylated upon differentiation in vivo and in vitro. We use Oxford Nanopore Technologies (ONT) long-read sequencing to resolve L1 methylation genome-wide and at individual L1 loci, revealing a distinctive "smile" pattern in methylation levels across the L1 promoter region. Using Pacific Biosciences (PacBio) SMRT sequencing of L1 5' RACE products, we then examine DNA methylation dynamics at the mouse L1 promoter in parallel with transcription start site (TSS) distribution at locus-specific resolution. Together, our results offer a novel perspective on the interplay between epigenetic repression, L1 evolution, and genome stability.


Subject(s)
Embryonic Development , Long Interspersed Nucleotide Elements , Mice , Animals , Retroelements/genetics , DNA Methylation , Promoter Regions, Genetic
6.
Genome Res ; 32(7): 1298-1314, 2022 07.
Article in English | MEDLINE | ID: mdl-35728967

ABSTRACT

The retrotransposon LINE-1 (L1) is central to the recent evolutionary history of the human genome and continues to drive genetic diversity and germline pathogenesis. However, the spatiotemporal extent and biological significance of somatic L1 activity are poorly defined and are virtually unexplored in other primates. From a single L1 lineage active at the divergence of apes and Old World monkeys, successive L1 subfamilies have emerged in each descendant primate germline. As revealed by case studies, the presently active human L1 subfamily can also mobilize during embryonic and brain development in vivo. It is unknown whether nonhuman primate L1s can similarly generate somatic insertions in the brain. Here we applied approximately 40× single-cell whole-genome sequencing (scWGS), as well as retrotransposon capture sequencing (RC-seq), to 20 hippocampal neurons from two rhesus macaques (Macaca mulatta). In one animal, we detected and PCR-validated a somatic L1 insertion that generated target site duplications, carried a short 5' transduction, and was present in ∼7% of hippocampal neurons but absent from cerebellum and nonbrain tissues. The corresponding donor L1 allele was exceptionally mobile in vitro and was embedded in PRDM4, a gene expressed throughout development and in neural stem cells. Nanopore long-read methylome and RNA-seq transcriptome analyses indicated young retrotransposon subfamily activation in the early embryo, followed by repression in adult tissues. These data highlight endogenous macaque L1 retrotransposition potential, provide prototypical evidence of L1-mediated somatic mosaicism in a nonhuman primate, and allude to L1 mobility in the brain over the past 30 million years of human evolution.


Subject(s)
Brain , Long Interspersed Nucleotide Elements , Retroelements , Animals , DNA-Binding Proteins/genetics , Macaca mulatta/genetics , Neurons , Retroelements/genetics , Transcription Factors/genetics
7.
Genome Res ; 32(4): 656-670, 2022 04.
Article in English | MEDLINE | ID: mdl-35332097

ABSTRACT

Genome-wide association studies (GWAS) have been highly informative in discovering disease-associated loci but are not designed to capture all structural variations in the human genome. Using long-read sequencing data, we discovered widespread structural variation within SINE-VNTR-Alu (SVA) elements, a class of great ape-specific transposable elements with gene-regulatory roles, which represents a major source of structural variability in the human population. We highlight the presence of structurally variable SVAs (SV-SVAs) in neurological disease-associated loci, and we further associate SV-SVAs to disease-associated SNPs and differential gene expression using luciferase assays and expression quantitative trait loci data. Finally, we genetically deleted SV-SVAs in the BIN1 and CD2AP Alzheimer's disease-associated risk loci and in the BCKDK Parkinson's disease-associated risk locus and assessed multiple aspects of their gene-regulatory influence in a human neuronal context. Together, this study reveals a novel layer of genetic variation in transposable elements that may contribute to identification of the structural variants that are the actual drivers of disease associations of GWAS loci.


Subject(s)
DNA Transposable Elements , Genome-Wide Association Study , Alu Elements , DNA Transposable Elements/genetics , Genetic Predisposition to Disease , Genetic Variation , Genome, Human , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci
8.
Mol Genet Metab ; 142(4): 108516, 2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38941880

ABSTRACT

Glutaric aciduria type II (GAII) is a heterogeneous genetic disorder affecting mitochondrial fatty acid, amino acid and choline oxidation. Clinical manifestations vary across the lifespan and onset may occur at any time from the early neonatal period to advanced adulthood. Historically, some patients, in particular those with late onset disease, have experienced significant benefit from riboflavin supplementation. GAII has been considered an autosomal recessive condition caused by pathogenic variants in the gene encoding electron-transfer flavoprotein ubiquinone-oxidoreductase (ETFDH) or in the genes encoding electron-transfer flavoprotein subunits A and B (ETFA and ETFB respectively). Variants in genes involved in riboflavin metabolism have also been reported. However, in some patients, molecular analysis has failed to reveal diagnostic molecular results. In this study, we report the outcome of molecular analysis in 28 Australian patients across the lifespan, 10 paediatric and 18 adult, who had a diagnosis of glutaric aciduria type II based on both clinical and biochemical parameters. Whole genome sequencing was performed on 26 of the patients and two neonatal onset patients had targeted sequencing of candidate genes. The two patients who had targeted sequencing had biallelic pathogenic variants (in ETFA and ETFDH). None of the 26 patients whose whole genome was sequenced had biallelic variants in any of the primary candidate genes. Interestingly, nine of these patients (34.6%) had a monoallelic pathogenic or likely pathogenic variant in a single primary candidate gene and one patient (3.9%) had a monoallelic pathogenic or likely pathogenic variant in two separate genes within the same pathway. The frequencies of the damaging variants within ETFDH and FAD transporter gene SLC25A32 were significantly higher than expected when compared to the corresponding allele frequencies in the general population. 
The remaining 16 patients (61.5%) had no pathogenic or likely pathogenic variants in the candidate genes. Ten (56%) of the 18 adult patients were taking the selective serotonin reuptake inhibitor antidepressant sertraline, which has been shown to produce a GAII phenotype, and another two adults (11%) were taking a serotonin-norepinephrine reuptake inhibitor antidepressant, venlafaxine or duloxetine, which have a mechanism of action overlapping that of sertraline. Riboflavin deficiency can also mimic both the clinical and biochemical phenotype of GAII. Several patients on these antidepressants showed an initial response to riboflavin but then that response waned. These results suggest that the GAII phenotype can result from a complex interaction between monoallelic variants and the cellular environment. Whole genome or targeted gene panel analysis may not provide a clear molecular diagnosis.

9.
Bioinformatics ; 38(11): 3109-3112, 2022 05 26.
Article in English | MEDLINE | ID: mdl-35482479

ABSTRACT

SUMMARY: Methylartist is a consolidated suite of tools for processing, visualizing and analysing nanopore-derived modified base calls. All detectable methylation types (e.g. 5mCpG, 5hmC, 6mA) are supported, enabling integrated study of base pairs when modified naturally or as part of an experimental protocol. AVAILABILITY AND IMPLEMENTATION: Methylartist is implemented in Python and is installable via PyPI and bioconda. Source code and test data are available at https://github.com/adamewing/methylartist. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Nanopores , Software
10.
Acta Haematol ; 146(2): 166-171, 2023.
Article in English | MEDLINE | ID: mdl-36273464

ABSTRACT

Here, we present a novel case of a patient with chronic lymphocytic leukemia (CLL) who received CTLA-4 and then PD-1 immune-checkpoint blockade (ICB) as treatment for concomitant metastatic melanoma. Whereas the metastatic melanoma was responsive to ICB, the CLL rapidly progressed (but responded to ICB cessation and ibrutinib). There were no new genetic mutational drivers to explain the altered clinical course. PD-1/PD-L1/PD-L2 and CTLA-4/CD80/CD86 expression was not increased in CLL B cells, CD8+ or CD4+ T-cell subsets, or monocytes. The patient's CLL B cells demonstrated strikingly prolonged in vitro survival during PD-1 blockade, which was not observed in samples taken before or after ICB, or with other patients. To our knowledge, a discordant clinical course to ICB coupled with these biological features has not been reported in a patient with dual malignancies.


Subject(s)
Antineoplastic Agents , Immune Checkpoint Inhibitors , Leukemia, Lymphocytic, Chronic, B-Cell , Melanoma , Programmed Cell Death 1 Receptor , Skin Neoplasms , Humans , CTLA-4 Antigen/antagonists & inhibitors , CTLA-4 Antigen/immunology , Disease Progression , Leukemia, Lymphocytic, Chronic, B-Cell/drug therapy , Leukemia, Lymphocytic, Chronic, B-Cell/immunology , Leukemia, Lymphocytic, Chronic, B-Cell/pathology , Melanoma/drug therapy , Melanoma/etiology , Melanoma/pathology , Programmed Cell Death 1 Receptor/antagonists & inhibitors , Programmed Cell Death 1 Receptor/immunology , Skin Neoplasms/drug therapy , Skin Neoplasms/etiology , Skin Neoplasms/pathology , B7-H1 Antigen , Immune Checkpoint Inhibitors/immunology , Immune Checkpoint Inhibitors/therapeutic use , Antineoplastic Agents/immunology , Antineoplastic Agents/therapeutic use
11.
Proc Natl Acad Sci U S A ; 117(43): 26822-26832, 2020 10 27.
Article in English | MEDLINE | ID: mdl-33033227

ABSTRACT

The mammary epithelium is indispensable for the continued survival of more than 5,000 mammalian species. For some, the volume of milk ejected in a single day exceeds their entire blood volume. Here, we unveil the spatiotemporal properties of physiological signals that orchestrate the ejection of milk from alveolar units and its passage along the mammary ductal network. Using quantitative, multidimensional imaging of mammary cell ensembles from GCaMP6 transgenic mice, we reveal how stimulus evoked Ca2+ oscillations couple to contractions in basal epithelial cells. Moreover, we show that Ca2+-dependent contractions generate the requisite force to physically deform the innermost layer of luminal cells, compelling them to discharge the fluid that they produced and housed. Through the collective action of thousands of these biological positive-displacement pumps, each linked to a contractile ductal network, milk begins its passage toward the dependent neonate, seconds after the command.


Subject(s)
Calcium Signaling , Mammary Glands, Animal/physiology , Milk Ejection , Animals , Epithelial Cells/physiology , Humans , Intravital Microscopy , Mammary Glands, Animal/cytology , Mammary Glands, Animal/diagnostic imaging , Mammary Glands, Human/metabolism , Mice , Mice, Transgenic , Myosin Light Chains/metabolism
12.
Genome Res ; 28(5): 639-653, 2018 05.
Article in English | MEDLINE | ID: mdl-29643204

ABSTRACT

The retrotransposon Long Interspersed Element 1 (LINE-1 or L1) is a continuing source of germline and somatic mutagenesis in mammals. Deregulated L1 activity is a hallmark of cancer, and L1 mutagenesis has been described in numerous human malignancies. We previously employed retrotransposon capture sequencing (RC-seq) to analyze hepatocellular carcinoma (HCC) samples from patients infected with hepatitis B or hepatitis C virus and identified L1 variants responsible for activating oncogenic pathways. Here, we have applied RC-seq and whole-genome sequencing (WGS) to an Abcb4 (Mdr2)-/- mouse model of hepatic carcinogenesis and demonstrated for the first time that L1 mobilization occurs in murine tumors. In 12 HCC nodules obtained from 10 animals, we validated four somatic L1 insertions by PCR and capillary sequencing, including TF subfamily elements, and one GF subfamily example. One of the TF insertions carried a 3' transduction, allowing us to identify its donor L1 and to demonstrate that this full-length TF element retained retrotransposition capacity in cultured cancer cells. Using RC-seq, we also identified eight tumor-specific L1 insertions from 25 HCC patients with a history of alcohol abuse. Finally, we used RC-seq and WGS to identify three tumor-specific L1 insertions among 10 intra-hepatic cholangiocarcinoma (ICC) patients, including one insertion traced to a donor L1 on Chromosome 22 known to be highly active in other cancers. This study reveals L1 mobilization as a common feature of hepatocarcinogenesis in mammals, demonstrating that the phenomenon is not restricted to human viral HCC etiologies and is encountered in murine liver tumors.


Subject(s)
Carcinoma, Hepatocellular/genetics , Liver Neoplasms/genetics , Long Interspersed Nucleotide Elements/genetics , Retroelements/genetics , ATP Binding Cassette Transporter, Subfamily B/genetics , Adult , Aged , Aged, 80 and over , Animals , Cell Transformation, Neoplastic/genetics , Female , Humans , Liver/metabolism , Liver/pathology , Male , Mammals/genetics , Mice, Knockout , Middle Aged , Mutagenesis, Insertional , ATP-Binding Cassette Sub-Family B Member 4
13.
Am J Med Genet A ; 185(7): 2070-2083, 2021 07.
Article in English | MEDLINE | ID: mdl-33960642

ABSTRACT

Basal cell nevus syndrome (also known as Gorlin Syndrome; MIM109400) is an autosomal dominant disorder characterized by recurrent pathological features such as basal cell carcinomas and odontogenic keratocysts as well as skeletal abnormalities. Most affected individuals have point mutations or small insertions or deletions within the PTCH1 gene on human chromosome 9, but there are some cases with more extensive deletion of the region, usually including the neighboring FANCC and/or ERCC6L2 genes. We report a 16-year-old patient with a deletion of approximately 400,000 bases which removes only PTCH1 and some non-coding RNA genes but leaves FANCC and ERCC6L2 intact. In spite of the small amount of DNA for which he is haploid, his phenotype is more extreme than many individuals with longer deletions in the region. This includes early presentation with a large number of basal cell nevi and other skin lesions, multiple jaw keratocysts, and macrosomia. We found that the deletion was in the paternal chromosome, in common with other macrosomia cases. Using public databases, we have examined possible interactions between sequences within and outside the deletion and speculate that a regulatory relationship exists with flanking genes, which is unbalanced by the deletion, resulting in abnormal activation or repression of the target genes and hence the severity of the phenotype.


Subject(s)
Basal Cell Nevus Syndrome/genetics , DNA Helicases/genetics , Fanconi Anemia Complementation Group C Protein/genetics , Patched-1 Receptor/genetics , Adolescent , Basal Cell Nevus Syndrome/epidemiology , Basal Cell Nevus Syndrome/pathology , Child , Child, Preschool , Chromosome Disorders/genetics , Chromosome Disorders/pathology , Chromosomes, Human, Pair 9/genetics , Genetic Predisposition to Disease , Humans , Infant , Infant, Newborn , Male , Neoplasm Recurrence, Local/epidemiology , Neoplasm Recurrence, Local/genetics , Neoplasm Recurrence, Local/pathology , Odontogenic Cysts/genetics , Odontogenic Cysts/pathology , Phenotype , Severity of Illness Index
14.
Genome Res ; 27(8): 1395-1405, 2017 08.
Article in English | MEDLINE | ID: mdl-28483779

ABSTRACT

LINE-1 (L1) retrotransposons are a noted source of genetic diversity and disease in mammals. To expand its genomic footprint, L1 must mobilize in cells that will contribute their genetic material to subsequent generations. Heritable L1 insertions may therefore arise in germ cells and in pluripotent embryonic cells, prior to germline specification, yet the frequency and predominant developmental timing of such events remain unclear. Here, we applied mouse retrotransposon capture sequencing (mRC-seq) and whole-genome sequencing (WGS) to pedigrees of C57BL/6J animals, and uncovered an L1 insertion rate of ≥1 event per eight births. We traced heritable L1 insertions to pluripotent embryonic cells and, strikingly, to early primordial germ cells (PGCs). New L1 insertions bore structural hallmarks of target-site primed reverse transcription (TPRT) and mobilized efficiently in a cultured cell retrotransposition assay. Together, our results highlight the rate and evolutionary impact of heritable L1 retrotransposition and reveal retrotransposition-mediated genomic diversification as a fundamental property of pluripotent embryonic cells in vivo.


Subject(s)
Embryo, Mammalian/metabolism , Long Interspersed Nucleotide Elements , Animals , Embryo, Mammalian/cytology , Female , Genomics/methods , Germ Cells , HeLa Cells , Humans , Male , Mice , Mice, Inbred C57BL , Mosaicism , Whole Genome Sequencing/methods
15.
Nature ; 516(7530): 242-5, 2014 Dec 11.
Article in English | MEDLINE | ID: mdl-25274305

ABSTRACT

Throughout evolution primate genomes have been modified by waves of retrotransposon insertions. For each wave, the host eventually finds a way to repress retrotransposon transcription and prevent further insertions. In mouse embryonic stem cells, transcriptional silencing of retrotransposons requires KAP1 (also known as TRIM28) and its repressive complex, which can be recruited to target sites by KRAB zinc-finger (KZNF) proteins such as murine-specific ZFP809 which binds to integrated murine leukaemia virus DNA elements and recruits KAP1 to repress them. KZNF genes are one of the fastest growing gene families in primates and this expansion is hypothesized to enable primates to respond to newly emerged retrotransposons. However, the identity of KZNF genes battling retrotransposons currently active in the human genome, such as SINE-VNTR-Alu (SVA) and long interspersed nuclear element 1 (L1), is unknown. Here we show that two primate-specific KZNF genes rapidly evolved to repress these two distinct retrotransposon families shortly after they began to spread in our ancestral genome. ZNF91 underwent a series of structural changes 8-12 million years ago that enabled it to repress SVA elements. ZNF93 evolved earlier to repress the primate L1 lineage until ∼12.5 million years ago when the L1PA3-subfamily of retrotransposons escaped ZNF93's restriction through the removal of the ZNF93-binding site. Our data support a model where KZNF gene expansion limits the activity of newly emerged retrotransposon classes, and this is followed by mutations in these retrotransposons to evade repression, a cycle of events that could explain the rapid expansion of lineage-specific KZNF genes.


Subject(s)
Evolution, Molecular , Kruppel-Like Transcription Factors/metabolism , Primates/genetics , Retroelements/genetics , Animals , Base Sequence , Embryonic Stem Cells/cytology , Embryonic Stem Cells/metabolism , Humans , Kruppel-Like Transcription Factors/genetics , Mice , Mutation/genetics , Zinc Fingers
16.
BMC Bioinformatics ; 19(1): 28, 2018 01 31.
Article in English | MEDLINE | ID: mdl-29385983

ABSTRACT

BACKGROUND: The clinical sequencing of cancer genomes to personalize therapy is becoming routine across the world. However, concerns over patient re-identification from these data lead to questions about how tightly access should be controlled. It is not thought to be possible to re-identify patients from somatic variant data. However, somatic variant detection pipelines can mistakenly identify germline variants as somatic ones, a process called "germline leakage". The rate of germline leakage across different somatic variant detection pipelines is not well-understood, and it is uncertain whether or not somatic variant calls should be considered re-identifiable. To fill this gap, we quantified germline leakage across 259 sets of whole-genome somatic single nucleotide variant (SNVs) predictions made by 21 teams as part of the ICGC-TCGA DREAM Somatic Mutation Calling Challenge. RESULTS: The median somatic SNV prediction set contained 4325 somatic SNVs and leaked one germline polymorphism. The level of germline leakage was inversely correlated with somatic SNV prediction accuracy and positively correlated with the amount of infiltrating normal cells. The specific germline variants leaked differed by tumour and algorithm. To aid in quantitation and correction of leakage, we created a tool, called GermlineFilter, for use in public-facing somatic SNV databases. CONCLUSIONS: The potential for patient re-identification from leaked germline variants in somatic SNV predictions has led to divergent open data access policies, based on different assessments of the risks. Indeed, a single, well-publicized re-identification event could reshape public perceptions of the values of genomic data sharing. We find that modern somatic SNV prediction pipelines have low germline-leakage rates, which can be further reduced, especially for cloud-sharing, using pre-filtering software.


Subject(s)
Genome, Human , Germ Cells/metabolism , Polymorphism, Single Nucleotide , Algorithms , Humans , Internet , Neoplasms/genetics , Neoplasms/pathology , User-Computer Interface , Whole Genome Sequencing
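
The "germline leakage" measured in record 16 can be illustrated with a small sketch: count how many predicted somatic SNVs also appear in the same patient's germline call set. This is not the GermlineFilter tool named in the abstract, whose interface is not described there. The function names, the (chromosome, position, ref, alt) variant key, and the toy data are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the GermlineFilter tool from the abstract.
# All names and data below are hypothetical.
from typing import Iterable, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def leaked_germline(somatic_calls: Iterable[Variant],
                    germline_calls: Iterable[Variant]) -> Set[Variant]:
    """Return predicted somatic SNVs that also occur in the germline call set."""
    return set(somatic_calls) & set(germline_calls)


def leakage_rate(somatic_calls: Iterable[Variant],
                 germline_calls: Iterable[Variant]) -> float:
    """Fraction of predicted somatic SNVs that are actually germline variants."""
    somatic = set(somatic_calls)
    if not somatic:
        return 0.0  # no predictions, so nothing can leak
    return len(leaked_germline(somatic, germline_calls)) / len(somatic)


# Toy data (hypothetical): four somatic predictions, one of which is germline.
somatic = [
    ("chr1", 1000, "A", "T"),
    ("chr2", 2000, "G", "C"),
    ("chr3", 3000, "C", "A"),
    ("chr4", 4000, "T", "G"),
]
germline = [
    ("chr2", 2000, "G", "C"),
    ("chr9", 9000, "A", "G"),
]

print(leaked_germline(somatic, germline))
print(leakage_rate(somatic, germline))
```

Exact matching on all four fields is the simplest possible design choice; a real pipeline would likely also need to normalize variant representation before comparing.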
17.
Genome Res ; 25(10): 1536-45, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26260970

ABSTRACT

Somatic L1 retrotransposition events have been shown to occur in epithelial cancers. Here, we attempted to determine how early somatic L1 insertions occurred during the development of gastrointestinal (GI) cancers. Using L1-targeted resequencing (L1-seq), we studied different stages of four colorectal cancers arising from colonic polyps, seven pancreatic carcinomas, as well as seven gastric cancers. Surprisingly, we found somatic L1 insertions not only in all cancer types and metastases but also in colonic adenomas, well-known cancer precursors. Some insertions were also present in low quantities in normal GI tissues, occasionally caught in the act of being clonally fixed in the adjacent tumors. Insertions in adenomas and cancers numbered in the hundreds, and many were present in multiple tumor sections, implying clonal distribution. Our results demonstrate that extensive somatic insertional mutagenesis occurs very early during the development of GI tumors, probably before dysplastic growth.


Subject(s)
Gastrointestinal Neoplasms/genetics , Long Interspersed Nucleotide Elements , Mutagenesis, Insertional , Disease Progression , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Neoplasm Proteins/biosynthesis , Neoplasm Proteins/genetics , Oligonucleotide Array Sequence Analysis , Time Factors
18.
Nat Methods ; 12(7): 623-30, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25984700

ABSTRACT

The detection of somatic mutations from cancer genome sequences is key to understanding the genetic basis of disease progression, patient survival and response to therapy. Benchmarking is needed for tool assessment and improvement but is complicated by a lack of gold standards, by extensive resource requirements and by difficulties in sharing personal genomic information. To resolve these issues, we launched the ICGC-TCGA DREAM Somatic Mutation Calling Challenge, a crowdsourced benchmark of somatic mutation detection algorithms. Here we report the BAMSurgeon tool for simulating cancer genomes and the results of 248 analyses of three in silico tumors created with it. Different algorithms exhibit characteristic error profiles, and, intriguingly, false positives show a trinucleotide profile very similar to one found in human tumors. Although the three simulated tumors differ in sequence contamination (deviation from normal cell sequence) and in subclonality, an ensemble of pipelines outperforms the best individual pipeline in all cases. BAMSurgeon is available at https://github.com/adamewing/bamsurgeon/.


Subject(s)
Benchmarking , Crowdsourcing , Genome , Neoplasms/genetics , Polymorphism, Single Nucleotide , Algorithms , Humans
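
Record 18 reports that an ensemble of pipelines outperformed the best individual pipeline, but the abstract does not say how the ensemble was built. As a minimal sketch, assuming a simple majority vote over call sets (an assumption, not the authors' method), the idea can be shown on toy data. All names and variants below are hypothetical.

```python
# Illustrative sketch only: majority voting is an ASSUMED ensemble rule,
# not the method used in the study. All names and data are hypothetical.
from collections import Counter
from typing import Iterable, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def majority_vote(call_sets: Iterable[Set[Variant]], min_votes: int) -> Set[Variant]:
    """Keep variants reported by at least `min_votes` pipelines."""
    votes: Counter = Counter()
    for calls in call_sets:
        votes.update(set(calls))  # each pipeline votes at most once per variant
    return {variant for variant, n in votes.items() if n >= min_votes}


def precision_recall(predicted: Set[Variant], truth: Set[Variant]) -> Tuple[float, float]:
    """Precision and recall of a predicted call set against a known truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy truth set, as would be known for a simulated tumor.
a = ("chr1", 100, "A", "T")
b = ("chr1", 200, "G", "C")
c = ("chr2", 300, "C", "A")
truth = {a, b, c}

# Three hypothetical pipelines, each missing one true call and adding one false call.
pipeline_1 = {a, b, ("chr5", 10, "T", "G")}
pipeline_2 = {a, c, ("chr6", 20, "A", "C")}
pipeline_3 = {b, c, ("chr7", 30, "G", "T")}

ensemble = majority_vote([pipeline_1, pipeline_2, pipeline_3], min_votes=2)
print(precision_recall(pipeline_1, truth))
print(precision_recall(ensemble, truth))
```

In this toy case each false positive is reported by a single pipeline, so voting removes it while every true variant keeps two votes. Real pipelines can share systematic errors, in which case voting helps less.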
20.
Hum Mol Genet ; 22(18): 3730-48, 2013 Sep 15.
Article in English | MEDLINE | ID: mdl-23696454

ABSTRACT

Long INterspersed Elements (LINE-1s, L1s) are responsible for over one million retrotransposon insertions and 8000 processed pseudogenes (PPs) in the human genome. An active L1 encodes two proteins (ORF1p and ORF2p) that bind with L1 RNA and form L1-ribonucleoprotein particles (RNPs). Although it is believed that the RNA-binding property of ORF1p is critical to recruit other mobile RNAs to the RNP, the identity of recruited RNAs is largely unknown. Here, we used crosslinking and immunoprecipitation followed by deep sequencing to identify RNA components of L1-RNPs. Our results show that in addition to retrotransposed RNAs [L1, Alu and SINE-VNTR-Alu (SVA)], L1-RNPs are enriched with cellular mRNAs, which have PPs in the human genome. Using purified L1-RNPs, we show that PP-source RNAs preferentially serve as ORF2p templates in a reverse transcriptase assay. In addition, we find that exogenous ORF2p binds endogenous ORF1p, allowing reverse transcription of the same PP-source RNAs. These data demonstrate that interaction of a cellular RNA with the L1-RNP is an inside track to PP formation.


Subject(s)
Long Interspersed Nucleotide Elements/genetics , Open Reading Frames , Pseudogenes , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , RNA/metabolism , Ribonucleoproteins/metabolism , Gene Expression , HEK293 Cells , High-Throughput Nucleotide Sequencing , Humans , RNA/genetics , RNA-Directed DNA Polymerase/genetics , RNA-Directed DNA Polymerase/metabolism , Retroelements , Ribonucleoproteins/genetics