ABSTRACT
Breast cancer (BC) is defined by distinct molecular subtypes with different cells of origin. The transcriptional networks that characterize the subtype-specific tumor-normal lineages are not established. In this work, we applied bulk, single-cell and single-nucleus multi-omic techniques as well as spatial transcriptomics and multiplex imaging on 61 samples from 37 patients with BC to show characteristic links in gene expression and chromatin accessibility between BC subtypes and their putative cells of origin. Regulatory network analysis of transcription factors underscored the importance of BHLHE40 in luminal BC and luminal mature cells and KLF5 in basal-like tumors and luminal progenitor cells. Furthermore, we identify key genes defining the basal-like (SOX6 and KCNQ3) and luminal A/B (FAM155A and LRP1B) lineages. Exhausted CTLA4-expressing CD8+ T cells were enriched in basal-like BC, suggesting an altered means of immune dysfunction. These findings demonstrate analysis of paired transcription and chromatin accessibility at the single-cell level is a powerful tool for investigating cancer lineage and highlight transcriptional networks that define basal and luminal BC lineages.
ABSTRACT
Breast cancer is a heterogeneous disease, and treatment is guided by biomarker profiles representing distinct molecular subtypes. Breast cancer arises from the breast ductal epithelium, and experimental data suggest that breast cancer subtypes have different cells of origin within that lineage. The precise cells of origin for each subtype and the transcriptional networks that characterize these tumor-normal lineages are not established. In this work, we applied bulk, single-cell (sc), and single-nucleus (sn) multi-omic techniques as well as spatial transcriptomics and multiplex imaging on 61 samples from 37 breast cancer patients to show characteristic links in gene expression and chromatin accessibility between breast cancer subtypes and their putative cells of origin. We applied the PAM50 subtyping algorithm in tandem with bulk RNA-seq and snRNA-seq to reliably subtype even low-purity tumor samples and confirm promoter accessibility using snATAC. Trajectory analysis of chromatin accessibility and differentially accessible motifs clearly connected progenitor populations with breast cancer subtypes, supporting the cell of origin for basal-like and luminal A and B tumors. Regulatory network analysis of transcription factors underscored the importance of BHLHE40 in luminal breast cancer and luminal mature cells, and KLF5 in basal-like tumors and luminal progenitor cells. Furthermore, we identify key genes defining the basal-like (PRKCA, SOX6, RGS6, KCNQ3) and luminal A/B (FAM155A, LRP1B) lineages, with expression in both precursor and cancer cells and further upregulation in tumors. Exhausted CTLA4-expressing CD8+ T cells were enriched in basal-like breast cancer, suggesting altered means of immune dysfunction among breast cancer subtypes. We used spatial transcriptomics and multiplex imaging to provide spatial detail for key markers of benign and malignant cell types and immune cell colocation. These findings demonstrate that analysis of paired transcription and chromatin accessibility at the single-cell level is a powerful tool for investigating breast cancer lineage development and highlight transcriptional networks that define basal and luminal breast cancer lineages.
ABSTRACT
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.
Subjects
Genome, Human; Genomics; Humans; Diploidy; Genome, Human/genetics; Haplotypes/genetics; Sequence Analysis, DNA; Genomics/standards; Reference Standards; Cohort Studies; Alleles; Genetic Variation
ABSTRACT
The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity [6]. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yields the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent-child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.
Subjects
Chromosome Mapping; Diploidy; Genome, Human; Genomics; Humans; Chromosome Mapping/standards; Genome, Human/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing/methods; High-Throughput Nucleotide Sequencing/standards; Sequence Analysis, DNA/methods; Sequence Analysis, DNA/standards; Reference Standards; Genomics/methods; Genomics/standards; Chromosomes, Human/genetics; Genetic Variation/genetics
ABSTRACT
The Sumatran orang-utan (Pongo abelii) reference genome was first published in 2011, in conjunction with ten re-sequenced genomes from unrelated wild-caught individuals. Together, these published data have been utilized in almost all great ape genomic studies, plus in much broader comparative genomic research. Here, we report that the original sequencing Consortium inadvertently switched nine of the ten samples and/or resulting re-sequenced genomes, erroneously attributing eight of these to the wrong source individuals. Among them is a genome from the recently identified Tapanuli (P. tapanuliensis) species: thus, this genome was sequenced and published a full six years prior to the species' description. Sex was wrongly assigned to five known individuals; the numbers in one sample identifier were swapped; and the identifier for another sample most closely resembles that of a sample from another individual entirely. These errors have been reproduced in countless subsequent manuscripts, with noted implications for studies reliant on data from known individuals.
ABSTRACT
Pancreatic ductal adenocarcinoma is a lethal disease with limited treatment options and poor survival. We studied 83 spatial samples from 31 patients (11 treatment-naïve and 20 treated) using single-cell/nucleus RNA sequencing, bulk-proteogenomics, spatial transcriptomics and cellular imaging. Subpopulations of tumor cells exhibited signatures of proliferation, KRAS signaling, cell stress and epithelial-to-mesenchymal transition. Mapping mutations and copy number events distinguished tumor populations from normal and transitional cells, including acinar-to-ductal metaplasia and pancreatic intraepithelial neoplasia. Pathology-assisted deconvolution of spatial transcriptomic data identified tumor and transitional subpopulations with distinct histological features. We showed coordinated expression of TIGIT in exhausted and regulatory T cells and Nectin in tumor cells. Chemo-resistant samples contain a threefold enrichment of inflammatory cancer-associated fibroblasts that upregulate metallothioneins. Our study reveals a deeper understanding of the intricate substructure of pancreatic ductal adenocarcinoma tumors that could help improve therapy for patients with this disease.
Subjects
Carcinoma, Pancreatic Ductal; Pancreatic Neoplasms; Carcinoma, Pancreatic Ductal/metabolism; Cell Transformation, Neoplastic/genetics; Humans; Pancreas/metabolism; Pancreatic Neoplasms/metabolism; Tumor Microenvironment/genetics
ABSTRACT
The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.
Subjects
Genome, Human; Genomics; Genome, Human/genetics; Haplotypes/genetics; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
ABSTRACT
While 9p deletion and duplication syndromes have been studied for several years, small sample sizes and minimal high-resolution data have limited a comprehensive delineation of genotypic and phenotypic characteristics. In this study, we examined genetic data from 719 individuals in the worldwide 9p Network Cohort: a cohort seven to nine times larger than any previous study of 9p. Most breakpoints occur in bands 9p22 and 9p24, accounting for 35% and 38% of all breakpoints, respectively. Bands 9p11 and 9p12 have the fewest breakpoints, with each accounting for 0.6% of all breakpoints. The most common phenotype in 9p deletion and duplication syndromes is developmental delay, and we identified eight known neurodevelopmental disorder genes in 9p22 and 9p24. Since it has been previously reported that some individuals have a secondary structural variant related to the 9p variant, we examined our cohort for these variants and found 97 events. The top secondary variant involved 9q in 14 individuals (1.9%), including ring chromosomes and inversions. We identified a gender bias with significant enrichment for females (p = 0.0006) that may arise from a sex reversal in some individuals with 9p deletions. Genes on 9p were characterized regarding function, constraint metrics, and protein-protein interactions, resulting in a prioritized set of genes for further study. Finally, we achieved precision genomics in one child with a complex 9p structural variation using modern genomic technologies, demonstrating that long-read sequencing will be integral for some cases. Our study is the largest ever on 9p-related syndromes and provides key insights into genetic factors involved in these syndromes.
ABSTRACT
Large-scale gene sequencing studies for complex traits have the potential to identify causal genes with therapeutic implications. We performed gene-based association testing of blood lipid levels with rare (minor allele frequency < 1%) predicted damaging coding variation by using sequence data from >170,000 individuals from multiple ancestries: 97,493 European, 30,025 South Asian, 16,507 African, 16,440 Hispanic/Latino, 10,420 East Asian, and 1,182 Samoan. We identified 35 genes associated with circulating lipid levels; some of these genes have not been previously associated with lipid levels when using rare coding variation from population-based samples. We prioritize 32 genes in array-based genome-wide association study (GWAS) loci based on aggregations of rare coding variants; three (EVI5, SH2B3, and PLIN1) had no prior association of rare coding variants with lipid levels. Most of our associated genes showed evidence of association among multiple ancestries. Finally, we observed an enrichment of gene-based associations for low-density lipoprotein cholesterol drug target genes and for genes closest to GWAS index single-nucleotide polymorphisms (SNPs). Our results demonstrate that gene-based associations can be beneficial for drug target development and provide evidence that the gene closest to the array-based GWAS index SNP is often the functional gene for blood lipid levels.
Subjects
Exome; Genetic Variation; Genome-Wide Association Study; Lipids/blood; Open Reading Frames; Alleles; Blood Glucose/genetics; Case-Control Studies; Computational Biology/methods; Databases, Genetic; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/metabolism; Genetic Predisposition to Disease; Genetics, Population; Genome-Wide Association Study/methods; Humans; Lipid Metabolism/genetics; Liver/metabolism; Liver/pathology; Molecular Sequence Annotation; Multifactorial Inheritance; Phenotype; Polymorphism, Single Nucleotide
ABSTRACT
PURPOSE: Approximately 10%-40% of patients with lung cancer report no history of tobacco smoking (never-smokers). We analyzed whole-exome and RNA-sequencing data of 160 tumor and normal lung adenocarcinoma (LUAD) samples from never-smokers to identify clinically actionable alterations and gain insight into the environmental and hereditary risk factors for LUAD among never-smokers. METHODS: We performed whole-exome and RNA-sequencing of 88 and 69 never-smoker LUADs. We analyzed these data in conjunction with data from 76 never-smoker and 299 smoker LUAD samples sequenced by The Cancer Genome Atlas and Clinical Proteomic Tumor Analysis Consortium. RESULTS: We observed a high prevalence of clinically actionable driver alterations in never-smoker LUADs compared with smoker LUADs (78%-92% v 49.5%; P < .0001). Although a subset of never-smoker samples demonstrated germline alterations in DNA repair genes, the frequency of samples showing germline variants in cancer predisposing genes was comparable between smokers and never-smokers (6.4% v 6.9%; P = .82). A subset of never-smoker samples (5.9%) showed mutation signatures that were suggestive of passive exposure to cigarette smoke. Finally, analysis of RNA-sequencing data showed distinct immune transcriptional subtypes of never-smoker LUADs that varied in their expression of clinically relevant immune checkpoint molecules and immune cell composition. CONCLUSION: In this comprehensive genomic and transcriptome analysis of never-smoker LUADs, we observed a potential role for germline variants in DNA repair genes and passive exposure to cigarette smoke in the pathogenesis of a subset of never-smoker LUADs. Our findings also show that clinically actionable driver alterations are highly prevalent in never-smoker LUADs, highlighting the need for obtaining biopsies with adequate cellularity for clinical genomic testing in these patients.
Subjects
Adenocarcinoma of Lung/pathology; Biomarkers, Tumor/genetics; Exome Sequencing/methods; Lung Neoplasms/pathology; Mutation; Smoking/trends; Adenocarcinoma of Lung/epidemiology; Adenocarcinoma of Lung/genetics; Aged; Female; Follow-Up Studies; Humans; Lung Neoplasms/epidemiology; Lung Neoplasms/genetics; Male; Prognosis; United States/epidemiology
ABSTRACT
Autosomal genetic analyses of blood lipids have yielded key insights for coronary heart disease (CHD). However, X chromosome genetic variation is understudied for blood lipids in large sample sizes. We now analyze genetic and blood lipid data in a high-coverage whole X chromosome sequencing study of 65,322 multi-ancestry participants and perform replication among 456,893 European participants. Common alleles on chromosome Xq23 are strongly associated with reduced total cholesterol, LDL cholesterol, and triglycerides (min P = 8.5 × 10⁻⁷²), with similar effects for males and females. Chromosome Xq23 lipid-lowering alleles are associated with reduced odds for CHD among 42,545 cases and 591,247 controls (P = 1.7 × 10⁻⁴), and reduced odds for diabetes mellitus type 2 among 54,095 cases and 573,885 controls (P = 1.4 × 10⁻⁵). Although we observe an association with increased BMI, waist-to-hip ratio adjusted for BMI is reduced, bioimpedance analyses indicate increased gluteofemoral fat, and abdominal MRI analyses indicate reduced visceral adiposity. Co-localization analyses strongly correlate increased CHRDL1 gene expression, particularly in adipose tissue, with reduced concentrations of blood lipids.
Subjects
Cardiometabolic Risk Factors; Chromosomes, Human, X/genetics; Lipids/blood; Eye Proteins/metabolism; Female; Gene Expression Regulation; Genetic Association Studies; Genetic Loci; Genetic Predisposition to Disease; Genotype; Humans; Male; Middle Aged; Nerve Tissue Proteins/metabolism; Phenomics; Polymorphism, Single Nucleotide/genetics; Subcutaneous Tissue/metabolism; Whole Genome Sequencing
ABSTRACT
Studies of Y Chromosome evolution have focused primarily on gene decay, a consequence of suppression of crossing-over with the X Chromosome. Here, we provide evidence that suppression of X-Y crossing-over unleashed a second dynamic: selfish X-Y arms races that reshaped the sex chromosomes in mammals as different as cattle, mice, and men. Using super-resolution sequencing, we explore the Y Chromosome of Bos taurus (bull) and find it to be dominated by massive, lineage-specific amplification of testis-expressed gene families, making it the most gene-dense Y Chromosome sequenced to date. As in mice, an X-linked homolog of a bull Y-amplified gene has become testis-specific and amplified. This evolutionary convergence implies that lineage-specific X-Y coevolution through gene amplification, and the selfish forces underlying this phenomenon, were dominatingly powerful among diverse mammalian lineages. Together with Y gene decay, X-Y arms races molded mammalian sex chromosomes and influenced the course of mammalian evolution.
Subjects
Sequence Analysis, DNA/veterinary; X Chromosome/genetics; Y Chromosome/genetics; Animals; Cattle; Cell Lineage; Crossing Over, Genetic; Evolution, Molecular; Female; Gene Amplification; Humans; Male; Mice; Organ Specificity; Testis/chemistry
ABSTRACT
A correction to this paper has been published and can be accessed via a link at the top of the paper.
ABSTRACT
The Alzheimer's Disease Sequencing Project (ADSP) undertook whole exome sequencing in 5,740 late-onset Alzheimer disease (AD) cases and 5,096 cognitively normal controls primarily of European ancestry (EA), among whom 218 cases and 177 controls were Caribbean Hispanic (CH). An age-, sex- and APOE-based risk score and family history were used to select cases most likely to harbor novel AD risk variants and controls least likely to develop AD by age 85 years. We tested ~1.5 million single nucleotide variants (SNVs) and 50,000 insertion-deletion polymorphisms (indels) for association with AD, using multiple models considering individual variants as well as gene-based tests aggregating rare, predicted functional, and loss of function variants. Sixteen single variants and 19 genes that met criteria for significant or suggestive associations after multiple-testing correction were evaluated for replication in four independent samples; three with whole exome sequencing (2,778 cases, 7,262 controls) and one with genome-wide genotyping imputed to the Haplotype Reference Consortium panel (9,343 cases, 11,527 controls). The top findings in the discovery sample were also followed up in the ADSP whole-genome sequenced family-based dataset (197 members of 42 EA families and 501 members of 157 CH families). We identified novel and predicted functional genetic variants in genes previously associated with AD. We also detected associations in three novel genes: IGHG3 (p = 9.8 × 10⁻⁷), an immunoglobulin gene whose antibodies interact with β-amyloid, a long non-coding RNA AC099552.4 (p = 1.2 × 10⁻⁷), and a zinc-finger protein ZNF655 (gene-based p = 5.0 × 10⁻⁶). The latter two suggest an important role for transcriptional regulation in AD pathogenesis.
Subjects
Alzheimer Disease/genetics; Alzheimer Disease/immunology; Exome Sequencing; Gene Expression Regulation/genetics; Immunity/genetics; Transcription, Genetic/genetics; Aged; Aged, 80 and over; Alzheimer Disease/pathology; Amyloid beta-Peptides/immunology; Apolipoproteins E/genetics; Female; Haplotypes/genetics; Humans; Immunoglobulin G; Kruppel-Like Transcription Factors/genetics; Male; Polymorphism, Genetic/genetics; RNA, Long Noncoding/genetics
ABSTRACT
This corrects the article DOI: 10.1038/ncomms15451.
ABSTRACT
Biomphalaria snails are instrumental in transmission of the human blood fluke Schistosoma mansoni. With the World Health Organization's goal to eliminate schistosomiasis as a global health problem by 2025, there is now renewed emphasis on snail control. Here, we characterize the genome of Biomphalaria glabrata, a lophotrochozoan protostome, and provide timely and important information on snail biology. We describe aspects of phero-perception, stress responses, immune function and regulation of gene expression that support the persistence of B. glabrata in the field and may define this species as a suitable snail host for S. mansoni. We identify several potential targets for developing novel control measures aimed at reducing snail-mediated transmission of schistosomiasis.
Subjects
Biomphalaria/genetics; Biomphalaria/parasitology; Genome; Schistosomiasis mansoni/transmission; Animal Communication; Animals; Biomphalaria/immunology; DNA Transposable Elements; Evolution, Molecular; Fresh Water; Gene Expression Regulation; Host-Parasite Interactions; Pheromones; Proteome; Schistosoma mansoni; Sequence Analysis, DNA; Stress, Physiological
ABSTRACT
OBJECTIVE: In many rheumatoid arthritis (RA) patients, disease is controlled with anti-tumor necrosis factor (anti-TNF) biologic therapies. However, in a significant number of patients, the disease fails to respond to anti-TNF therapy. We undertook the present study to examine the hypothesis that rare and low-frequency genetic variants might influence response to anti-TNF treatment. METHODS: We sequenced the coding region of 750 genes in 1,094 RA patients of European ancestry who were treated with anti-TNF. After quality control, 690 genes were included in the analysis. We applied single-variant association and gene-based association tests to identify variants associated with anti-TNF treatment response. In addition, given the key mechanistic role of TNF, we performed gene set analyses of 27 TNF pathway genes. RESULTS: We identified 14,420 functional variants, of which 6,934 were predicted as nonsynonymous, 2,136 of which were further predicted to be "damaging." Despite the fact that the study was well powered, no single variant or gene showed study-wide significant association with change in the outcome measures disease activity or European League Against Rheumatism response. Intriguingly, we observed 3 genes, of 27 with nominal signals of association (P < 0.05), that were involved in the TNF signaling pathway. However, when we performed a rigorous gene set enrichment analysis based on association P value ranking, we observed no evidence of enrichment of association at genes involved in the TNF pathway (P_enrichment = 0.15, based on phenotype permutations). CONCLUSION: Our findings suggest that rare and low-frequency protein-coding variants in TNF signaling pathway genes or other genes do not contribute substantially to anti-TNF treatment response in patients with RA.
Subjects
Arthritis, Rheumatoid/drug therapy; Arthritis, Rheumatoid/genetics; Tumor Necrosis Factor-alpha/antagonists & inhibitors; Female; Genetic Variation; Humans; Male; Middle Aged; Open Reading Frames; Treatment Outcome
ABSTRACT
Acute myeloid leukemia (AML) comprises a heterogeneous group of leukemias frequently defined by recurrent cytogenetic abnormalities, including rearrangements involving the core-binding factor (CBF) transcriptional complex. To better understand the genomic landscape of CBF-AMLs, we analyzed both pediatric (n = 87) and adult (n = 78) samples, including cases with RUNX1-RUNX1T1 (n = 85) or CBFB-MYH11 (n = 80) rearrangements, by whole-genome or whole-exome sequencing. In addition to known mutations in the Ras pathway, we identified recurrent stabilizing mutations in CCND2, suggesting a previously unappreciated cooperating pathway in CBF-AML. Outside of signaling alterations, RUNX1-RUNX1T1 and CBFB-MYH11 AMLs demonstrated remarkably different spectra of cooperating mutations, as RUNX1-RUNX1T1 cases harbored recurrent mutations in DHX15 and ZBTB7A, as well as an enrichment of mutations in epigenetic regulators, including ASXL2 and the cohesin complex. This detailed analysis provides insights into the pathogenesis and development of CBF-AML, while highlighting dramatic differences in the landscapes of cooperating mutations for these related AML subtypes.
Subjects
Biomarkers, Tumor/genetics; Core Binding Factors/genetics; Genomics/methods; Leukemia, Myeloid, Acute/genetics; Mutation/genetics; Oncogene Proteins, Fusion/genetics; Adult; Child; Humans
ABSTRACT
Chromosomal rearrangements deregulating hematopoietic transcription factors are common in acute lymphoblastic leukemia (ALL). Here we show that deregulation of the homeobox transcription factor gene DUX4 and the ETS transcription factor gene ERG is a hallmark of a subtype of B-progenitor ALL that comprises up to 7% of B-ALL. DUX4 rearrangement and overexpression was present in all cases and was accompanied by transcriptional deregulation of ERG, expression of a novel ERG isoform, ERGalt, and frequent ERG deletion. ERGalt uses a non-canonical first exon whose transcription was initiated by DUX4 binding. ERGalt retains the DNA-binding and transactivation domains of ERG, but it inhibits wild-type ERG transcriptional activity and is transforming. These results illustrate a unique paradigm of transcription factor deregulation in leukemia in which DUX4 deregulation results in loss of function of ERG, either by deletion or induced expression of an isoform that is a dominant-negative inhibitor of wild-type ERG function.