ABSTRACT
The genetic basis of brain tumor development is poorly understood. Here, leukocyte DNA of 21 patients from 15 families with ≥ 2 glioma cases each was analyzed by whole-genome or targeted sequencing. As a result, we identified two families with rare germline variants, p.(A592T) or p.(A817V), in the E-cadherin gene CDH1 that co-segregate with the tumor phenotype, consisting primarily of oligodendrogliomas, WHO grade II/III, IDH-mutant, 1p/19q-codeleted (ODs). Rare CDH1 variants, previously shown to predispose to gastric and breast cancer, were significantly overrepresented in these glioma families (13.3%) versus controls (1.7%). In 68 individuals from 28 gastric cancer families with pathogenic CDH1 germline variants, brain tumors, including a pituitary adenoma, were observed in three cases (4.4%), a significantly higher prevalence than in the general population (0.2%). Furthermore, rare CDH1 variants were identified in tumor DNA of 6/99 (6%) ODs. CDH1 expression was detected in undifferentiated and differentiating oligodendroglial cells isolated from rat brain. Functional studies using CRISPR/Cas9-mediated knock-in or stably transfected cell models demonstrated that the identified CDH1 germline variants affect cell membrane expression, cell migration and aggregation. E-cadherin ectodomain containing variant p.(A592T) had an increased intramolecular flexibility in a molecular dynamics simulation model. E-cadherin harboring intracellular variant p.(A817V) showed reduced β-catenin binding resulting in increased cytosolic and nuclear β-catenin levels reverted by treatment with the MAPK interacting serine/threonine kinase 1 inhibitor CGP 57380. Our data provide evidence for a role of deactivating CDH1 variants in the risk and tumorigenesis of neuroepithelial and epithelial brain tumors, particularly ODs, possibly via WNT/β-catenin signaling.
Subjects
CD Antigens/genetics , Brain Neoplasms/genetics , Cadherins/genetics , Carcinoma/genetics , Neuroepithelial Neoplasms/genetics , Adenoma/genetics , Adenoma/pathology , Aniline Compounds/therapeutic use , Animals , Antibody Diversity , Brain Neoplasms/drug therapy , Carcinoma/drug therapy , Neoplasm DNA/genetics , Gene Knock-In Techniques , Genetic Variation , HEK293 Cells , Humans , Neuroepithelial Neoplasms/drug therapy , Oligodendroglioma/genetics , Oligodendroglioma/pathology , Protein Kinase Inhibitors/therapeutic use , Purines/therapeutic use , Rats , Sprague-Dawley Rats , Whole Genome Sequencing
ABSTRACT
A core promoter is a stretch of DNA surrounding the transcription start site (TSS) that integrates regulatory inputs and recruits general transcription factors to initiate transcription. The nature and causative relationship of the DNA sequence and chromatin signals that govern the selection of most TSSs by RNA polymerase II remain unresolved. Maternal to zygotic transition represents the most marked change of the transcriptome repertoire in the vertebrate life cycle. Early embryonic development in zebrafish is characterized by a series of transcriptionally silent cell cycles regulated by inherited maternal gene products: zygotic genome activation commences at the tenth cell cycle, marking the mid-blastula transition. This transition provides a unique opportunity to study the rules of TSS selection and the hierarchy of events linking transcription initiation with key chromatin modifications. We analysed TSS usage during zebrafish early embryonic development at high resolution using cap analysis of gene expression, and determined the positions of H3K4me3-marked promoter-associated nucleosomes. Here we show that the transition from the maternal to zygotic transcriptome is characterized by a switch between two fundamentally different modes of defining transcription initiation, which drive the dynamic change of TSS usage and promoter shape. A maternal-specific TSS selection, which requires an A/T-rich (W-box) motif, is replaced with a zygotic TSS selection grammar characterized by broader patterns of dinucleotide enrichments, precisely aligned with the first downstream (+1) nucleosome. The developmental dynamics of the H3K4me3-marked nucleosomes reveal their DNA-sequence-associated positioning at promoters before zygotic transcription and subsequent transcription-independent adjustment to the final position downstream of the zygotic TSS. 
The two TSS-defining grammars coexist, often physically overlapping, in core promoters of constitutively expressed genes to enable their expression in the two regulatory environments. The dissection of overlapping core promoter determinants represents a framework for future studies of promoter structure and function across different regulatory contexts.
Subjects
Genetic Promoter Regions/genetics , Transcription Initiation Site , Zebrafish/genetics , Animals , Base Sequence , Nonmammalian Embryo/embryology , Nonmammalian Embryo/metabolism , Female , Developmental Gene Expression Regulation/genetics , Histones/metabolism , Methylation , Mothers , Nucleosomes/genetics , Genetic Transcription Initiation , Transcriptome/genetics , Zebrafish/embryology , Zygote/metabolism
ABSTRACT
Atlantic cod (Gadus morhua) is a large, cold-adapted teleost that sustains long-standing commercial fisheries and incipient aquaculture. Here we present the genome sequence of Atlantic cod, showing evidence for complex thermal adaptations in its haemoglobin gene cluster and an unusual immune architecture compared to other sequenced vertebrates. The genome assembly was obtained exclusively by 454 sequencing of shotgun and paired-end libraries, and automated annotation identified 22,154 genes. The major histocompatibility complex (MHC) II is a conserved feature of the adaptive immune system of jawed vertebrates, but we show that Atlantic cod has lost the genes for MHC II, CD4 and invariant chain (Ii) that are essential for the function of this pathway. Nevertheless, Atlantic cod is not exceptionally susceptible to disease under natural conditions. We find a highly expanded number of MHC I genes and a unique composition of its Toll-like receptor (TLR) families. This indicates how the Atlantic cod immune system has evolved compensatory mechanisms in both adaptive and innate immunity in the absence of MHC II. These observations affect fundamental assumptions about the evolution of the adaptive immune system and its components in vertebrates.
Subjects
Gadus morhua/genetics , Gadus morhua/immunology , Genome/genetics , Immune System/immunology , Immunity/genetics , Animals , Molecular Evolution , Genomics , Hemoglobins/genetics , Immunity/immunology , Major Histocompatibility Complex/genetics , Major Histocompatibility Complex/immunology , Male , Genetic Polymorphism/genetics , Synteny/genetics , Toll-Like Receptors/genetics
ABSTRACT
Spatiotemporal control of gene expression is central to animal development. Core promoters represent a previously unanticipated regulatory level by interacting with cis-regulatory elements and transcription initiation in different physiological and developmental contexts. Here, we provide a first and comprehensive description of the core promoter repertoire and its dynamic use during the development of a vertebrate embryo. By using cap analysis of gene expression (CAGE), we mapped transcription initiation events at single nucleotide resolution across 12 stages of zebrafish development. These CAGE-based transcriptome maps reveal genome-wide rules of core promoter usage, structure, and dynamics, key to understanding the control of gene regulation during vertebrate ontogeny. They revealed the existence of multiple classes of pervasive intra- and intergenic post-transcriptionally processed RNA products and their developmental dynamics. Among these RNAs, we report splice donor site-associated intronic RNA (sRNA) to be specific to genes of the splicing machinery. For the identification of conserved features, we compared the zebrafish data sets to the first CAGE promoter map of Tetraodon and the existing human CAGE data. We show that a number of features, such as promoter type, newly discovered promoter properties such as a specialized purine-rich initiator motif, as well as sRNAs and the genes in which they are detected, are conserved in mammalian and Tetraodon CAGE-defined promoter maps. The zebrafish developmental promoterome represents a powerful resource for studying developmental gene regulation and revealing promoter features shared across vertebrates.
Subjects
Embryonic Development/genetics , Developmental Gene Expression Regulation , Purines/metabolism , Transcription Initiation Site , Zebrafish/embryology , Zebrafish/genetics , Animals , Molecular Evolution , Gene Expression Profiling , Genes , Genome , Phylogeny , Genetic Promoter Regions , RNA/genetics , RNA/metabolism , RNA Caps/genetics , RNA Splicing , Transcriptome , Vertebrates/genetics
ABSTRACT
Catalytic RNAs are attractive objects for studying molecular evolution. To understand how RNA libraries can evolve from randomness toward highly active catalysts, we analyze the original samples that led to the discovery of Diels-Alderase ribozymes by next-generation sequencing. Known structure-activity relationships are used to correlate abundance with catalytic performance. We find that efficient catalysts arose not just from selection for reactivity among the members of the starting library, but from improvement of less potent precursors by mutations. We observe changes in the ribozyme population in response to increasing selection pressure. Surprisingly, even after many rounds of enrichment, the libraries are highly diverse, suggesting that potential catalysts are more abundant in random space than generally thought. To highlight the use of next-generation sequencing as a tool for in vitro selections, we also apply this technique to a recent, less characterized ribozyme selection. Making use of the correlation between sequence evolution and catalytic activity, we predict mutations that improve ribozyme activity and validate them biochemically. Our study reveals principles underlying ribozyme in vitro selections and provides guidelines to render future selections more efficient, as well as to predict the conservation of key structural elements, allowing the rational improvement of catalysts.
Subjects
Catalytic RNA/chemistry , Directed Molecular Evolution , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis
ABSTRACT
MicroRNAs (miRNAs) are key mediators of post-transcriptional gene regulation. The miRNA precursors are processed by the endonucleases Drosha and Dicer into a duplex, bound to an Argonaute protein and unwound into two single-stranded miRNAs. Although alternative ways to generate miRNAs have been discovered, e.g. pre-miRNA cleavage by Ago2 or cleavage products of snoRNAs or tRNAs, all known pathways converge on a double-stranded RNA duplex. Exogenous single-stranded siRNAs (ss-siRNAs) can elicit an effective RNA interference reaction; recent studies have identified chemical modifications increasing their stability and activity. Here, we provide the first evidence that endogenous, unmodified, single-stranded RNA sequences, the so-called loop-miRs, are generated from single-stranded loop regions of human pre-miRNA hairpins. Luciferase assays and immunoprecipitation validate loop-miR activity and incorporation into RNA-induced silencing complexes. This study identifies endogenous miRNAs that are generated from single-stranded regions; hence, it provides evidence that precursor-miRNAs can give rise to three distinct endogenous miRNAs: the guide strand, the passenger strand and the loop-miR.
Subjects
MicroRNAs/chemistry , RNA Precursors/chemistry , Argonaute Proteins/metabolism , Cell Line , Cytoplasm/metabolism , Humans , MicroRNAs/metabolism , Nucleic Acid Conformation , RNA Precursors/metabolism
ABSTRACT
The large diversity of central nervous system (CNS) tumor types in children and adolescents results in disparate patient outcomes and renders accurate diagnosis challenging. In this study, we prospectively integrated DNA methylation profiling and targeted gene panel sequencing with blinded neuropathological reference diagnostics for a population-based cohort of more than 1,200 newly diagnosed pediatric patients with CNS tumors, to assess their utility in routine neuropathology. We show that the multi-omic integration increased diagnostic accuracy in a substantial proportion of patients through annotation to a refining DNA methylation class (50%), detection of diagnostic or therapeutically relevant genetic alterations (47%) or identification of cancer predisposition syndromes (10%). Discrepant results by neuropathological WHO-based and DNA methylation-based classification (30%) were enriched in histological high-grade gliomas, implicating relevance for current clinical patient management in 5% of all patients. Follow-up (median 2.5 years) suggests improved survival for patients with histological high-grade gliomas displaying lower-grade molecular profiles. These results provide preliminary evidence of the utility of integrating multi-omics in neuropathology for pediatric neuro-oncology.
Subjects
Brain Neoplasms , Glioma , Adolescent , Humans , Child , Multiomics , Glioma/diagnosis , Glioma/genetics , Neuropathology , DNA Methylation/genetics , Mutation , Brain Neoplasms/diagnosis , Brain Neoplasms/genetics
ABSTRACT
The international precision oncology program INFORM enrolls relapsed/refractory pediatric cancer patients for comprehensive molecular analysis. We report a two-year pilot study implementing ex vivo drug sensitivity profiling (DSP) using a library of 75-78 clinically relevant drugs. We included 132 viable tumor samples from 35 pediatric oncology centers in seven countries. DSP was conducted on multicellular fresh tumor tissue spheroid cultures in 384-well plates with an overall mean processing time of three weeks. In 89 cases (67%), sufficient viable tissue was received; 69 (78%) passed internal quality controls. The DSP results matched the identified molecular targets, including BRAF, ALK, MET, and TP53 status. Drug vulnerabilities were identified in 80% of cases lacking actionable (very) high-evidence molecular events, adding value to the molecular data. Striking parallels between clinical courses and the DSP results were observed in selected patients. Overall, DSP in clinical real-time is feasible in international multicenter precision oncology programs.
ABSTRACT
BACKGROUND: Recent functional studies have demonstrated that many microRNAs (miRNAs) are expressed by RNA polymerase II in a specific spatiotemporal manner during the development of organisms and play a key role in cell-lineage decisions and morphogenesis. They are therefore functionally related to a number of key protein coding developmental genes, that form genomic regulatory blocks (GRBs) with arrays of highly conserved non-coding elements (HCNEs) functioning as long-range enhancers that collaboratively regulate the expression of their target genes. Given this functional similarity as well as recent zebrafish transgenesis assays showing that the miR-9 family is indeed regulated by HCNEs with enhancer activity, we hypothesized that this type of miRNA regulation is prevalent. In this paper, we therefore systematically investigate the regulatory landscape around conserved self-transcribed miRNAs (ST miRNAs), with their own known or computationally inferred promoters, by analyzing the hallmarks of GRB target genes. These include not only the density of HCNEs in their vicinity but also the presence of large CpG islands (CGIs) and distinct patterns of histone modification marks associated with developmental genes. RESULTS: Our results show that a subset of the conserved ST miRNAs we studied shares properties similar to those of protein-coding GRB target genes: they are located in regions of significantly higher HCNE/enhancer binding density and are more likely to be associated with CGIs. Furthermore, their putative promoters have both activating as well as silencing histone modification marks during development and differentiation. Based on these results we used both an elevated HCNE density in the genomic vicinity as well as the presence of a bivalent promoter to identify 29 putative GRB target miRNAs/miRNA clusters, over two-thirds of which are known to play a role during development and differentiation. 
Furthermore, these predictions include miRNAs of the miR-9 family, which are the only experimentally verified GRB target miRNAs. CONCLUSIONS: A subset of the conserved miRNA loci we investigated exhibits typical characteristics of GRB target genes, which may partially explain their complex expression profiles during development.
Subjects
Developmental Gene Expression Regulation , Genomics , MicroRNAs/genetics , Animals , CpG Islands/genetics , Genetic Enhancer Elements/genetics , Human Genome/genetics , Humans , Mice , Genetic Promoter Regions/genetics , Reproducibility of Results , Genetic Transcription/genetics
ABSTRACT
BACKGROUND: Unmethylated stretches of CpG dinucleotides (CpG islands) are an outstanding property of mammalian genomes. Conventionally, these regions are detected by sliding-window approaches using %G + C, CpG observed/expected ratio and length thresholds as main parameters. More recently, clustering methods have been introduced that directly detect clusters of CpG dinucleotides as a statistical property of the genome sequence. RESULTS: We compare sliding-window to clustering (i.e. CpGcluster) predictions by applying new ways to detect putative functionality of CpG islands. Analyzing the co-localization with several genomic regions as a function of window size vs. statistical significance (p-value), CpGcluster shows a higher overlap with promoter regions and highly conserved elements, while at the same time showing less overlap with Alu retrotransposons. The major difference in the prediction was found for short islands (CpG islets), often exclusively predicted by CpGcluster. Many of these islets seem to be functional, as they are unmethylated, highly conserved and/or located within the promoter region. Finally, we show that window-based islands can spuriously overlap several differentially regulated promoters as well as different methylation domains, which might indicate a wrong merge of several CpG islands into a single, very long island. The shorter CpGcluster islands seem to be much more specific with respect to the overlap with alternative transcription start sites or the detection of homogeneous methylation domains. CONCLUSIONS: The main difference between sliding-window approaches and clustering methods is the length of the predicted islands. Short islands, often differentially methylated, are almost exclusively predicted by CpGcluster. This suggests that CpGcluster may be the algorithm of choice to explore the function of these short, but putatively functional CpG islands.
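The sliding-window criteria named in this abstract (%G + C, CpG observed/expected ratio and length) can be illustrated with a minimal Python sketch. The abstract does not state any threshold values; the defaults below (length ≥ 200 bp, G + C ≥ 50%, observed/expected ≥ 0.6) are the classical Gardiner-Garden and Frommer values and are an assumption here, as are the function names, which are hypothetical and not taken from any published tool.

```python
def window_stats(seq):
    """Return (G+C fraction, CpG observed/expected ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected ratio: (number of CpG * N) / (number of C * number of G)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def passes_island_criteria(seq, min_len=200, min_gc=0.5, min_oe=0.6):
    """Check a sequence against the three window criteria.

    The default thresholds are assumed classical values, not values
    reported in the abstract.
    """
    gc_fraction, obs_exp = window_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_oe


def scan_windows(seq, window=200, step=100):
    """Return the start offsets of all windows that pass the criteria."""
    return [
        start
        for start in range(0, len(seq) - window + 1, step)
        if passes_island_criteria(seq[start:start + window], min_len=window)
    ]
```

A real sliding-window finder would additionally merge adjacent passing windows into one island, which is the step the abstract identifies as a possible source of spuriously long islands; that merging is left out of this sketch.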
Subjects
Algorithms , CpG Islands , Alu Elements/genetics , Cluster Analysis , Conserved Sequence/genetics , DNA Methylation/genetics , Molecular Evolution , Humans , Genetic Promoter Regions/genetics
ABSTRACT
BACKGROUND: The computational prediction of DNA methylation has become an important topic in recent years due to its role in the epigenetic control of normal and cancer-related processes. While previous prediction approaches focused merely on differences between methylated and unmethylated DNA sequences, recent experimental results have shown the presence of much more complex patterns of methylation across tissues and time in the human genome. These patterns are only partially described by a binary model of DNA methylation. In this work we propose a novel approach, based on profile analysis of tissue-specific methylation, that uncovers significant differences in the sequences of CpG islands (CGIs) that predispose them to a tissue-specific methylation pattern. RESULTS: We defined CGI methylation profiles that not only separate constitutively methylated from unmethylated CGIs, but also identify CGIs showing a differential degree of methylation across tissues and cell types or a lack of methylation exclusively in sperm. These profiles are clearly distinguished by a number of CGI attributes including their evolutionary conservation, their significance, as well as the evolutionary evidence of prior methylation. Additionally, we assess profile functionality with respect to the different compartments of protein-coding genes and their possible use in the prediction of DNA methylation. CONCLUSION: Our approach provides new insights into the biological features that determine if a CGI has a functional role in the epigenetic control of gene expression and the features associated with CGI methylation susceptibility. Moreover, we show that the ability to predict CGI methylation is based primarily on the quality of the biological information used and the relationships uncovered between different sources of knowledge. 
The strategy presented here is able to predict, besides the constitutively methylated and unmethylated classes, two more tissue-specific methylation classes, while conserving the accuracy provided by leading binary methylation classification methods.
Subjects
Computational Biology/methods , CpG Islands/genetics , DNA Methylation/genetics , Genetic Databases , Genetic Epigenesis , Human Genome , Humans , Organ Specificity , Tissue Distribution/genetics
ABSTRACT
Extracellular vesicles (EVs) are shed by many different cell types. Their nucleic acid content offers new opportunities for biomarker research in different solid tumors. The role of EV RNA in prostate cancer (PCa) is still largely unknown. EVs were isolated from different benign and malignant prostate cell lines and blood plasma from patients with PCa (n = 18) and controls with benign prostatic hyperplasia (BPH) (n = 7). Nanoparticle tracking analysis (NTA), Western blot, electron microscopy, and flow cytometry analysis were used for the characterization of EVs. Non-coding RNA expression profiling of PC3 metastatic PCa cells and their EVs was performed by next generation sequencing (NGS). miRNAs differentially expressed in PC3 EVs were validated with qRT-PCR in EVs derived from additional cell lines and patient plasma and from matched tissue samples. Ninety-two miRNAs were enriched and 48 miRNAs were depleted in PC3 EVs compared to PC3 cells, which could be confirmed by qRT-PCR. miR-99b-5p was expressed at significantly higher levels in malignant than in benign EVs. Furthermore, expression profiling showed miR-10a-5p (p = 0.018) and miR-29b-3p (p = 0.002), but not miR-99b-5p, to be overexpressed in plasma-derived EVs from patients with PCa compared with controls. In the corresponding tissue samples, no significant differences in miRNA expression could be observed. We thus propose that EV-associated miR-10a-5p and miR-29b-3p could serve as potential new PCa detection markers.
ABSTRACT
BACKGROUND: Despite their involvement in the regulation of gene expression and their importance as genomic markers for promoter prediction, no objective standard exists for defining CpG islands (CGIs), since all current approaches rely on a large parameter space formed by the thresholds of length, CpG fraction and G+C content. RESULTS: Given the higher frequency of CpG dinucleotides at CGIs, as compared to bulk DNA, the distance distributions between neighboring CpGs should differ for bulk and island CpGs. A new algorithm (CpGcluster) is presented, based on the physical distance between neighboring CpGs on the chromosome and able to predict directly clusters of CpGs, while not depending on the subjective criteria mentioned above. By assigning a p-value to each of these clusters, the most statistically significant ones can be predicted as CGIs. CpGcluster was benchmarked against five other CGI finders by using a test sequence set assembled from an experimental CGI library. CpGcluster reached the highest overall accuracy values, while showing the lowest rate of false-positive predictions. Since a minimum-length threshold is not required, CpGcluster can find short but fully functional CGIs usually missed by other algorithms. The CGIs predicted by CpGcluster present the lowest degree of overlap with Alu retrotransposons and, simultaneously, the highest overlap with vertebrate Phylogenetic Conserved Elements (PhastCons). CpGcluster's CGIs overlapping with the Transcription Start Site (TSS) show the highest statistical significance, as compared to the islands in other genome locations, thus qualifying CpGcluster as a valuable tool in discriminating functional CGIs from the remaining islands in the bulk genome. CONCLUSION: CpGcluster uses only integer arithmetic, thus being a fast and computationally efficient algorithm able to predict statistically significant clusters of CpG dinucleotides. 
Another outstanding feature is that all predicted CGIs start and end with a CpG dinucleotide, which should be appropriate for a genomic feature whose functionality is based precisely on CpG dinucleotides. The only search parameter in CpGcluster is the distance between two consecutive CpGs, in contrast to previous algorithms. Therefore, none of the main statistical properties of CpG islands (neither G+C content, CpG fraction nor length threshold) are needed as search parameters, which may lead to the high specificity and low overlap with spurious Alu elements observed for CpGcluster predictions.
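The distance-based clustering step described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract says the sole search parameter is the distance between two consecutive CpGs but does not say how that distance threshold is chosen, so `max_distance` is left to the caller, and the p-value assignment that ranks clusters is omitted. The function names and the `min_cpgs` parameter are hypothetical and are not the CpGcluster interface.

```python
def find_cpg_positions(seq):
    """Return the 0-based start offsets of every CG dinucleotide in seq."""
    seq = seq.upper()
    positions = []
    i = seq.find("CG")
    while i != -1:
        positions.append(i)
        i = seq.find("CG", i + 1)
    return positions


def cluster_cpgs(positions, max_distance, min_cpgs=2):
    """Group consecutive CpGs whose start offsets differ by <= max_distance.

    Returns (start, end, n_cpgs) tuples with half-open coordinates, so that
    seq[start:end] begins and ends with a CG dinucleotide. min_cpgs is a
    hypothetical convenience filter, not a parameter named in the abstract.
    """
    clusters = []
    if not positions:
        return clusters
    current = [positions[0]]
    for pos in positions[1:]:
        if pos - current[-1] <= max_distance:
            current.append(pos)
        else:
            if len(current) >= min_cpgs:
                clusters.append((current[0], current[-1] + 2, len(current)))
            current = [pos]
    if len(current) >= min_cpgs:
        clusters.append((current[0], current[-1] + 2, len(current)))
    return clusters
```

Because cluster boundaries are set by the CpG positions themselves, no length, G+C or CpG-fraction threshold enters the search, which matches the property the abstract emphasises. The difference is taken between CpG start offsets here; a different distance convention would shift the threshold by a constant.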
Subjects
Algorithms , CpG Islands/genetics , Animals , Genome/genetics , Humans , Mice
ABSTRACT
The 'Individualized Therapy for Relapsed Malignancies in Childhood' (INFORM) precision medicine study is a nationwide German program for children with high-risk relapsed/refractory malignancies, which aims to identify therapeutic targets on an individualised basis. In a pilot phase, reported here, we developed the logistical and analytical pipelines necessary for rapid and comprehensive molecular profiling in a clinical setting. Fifty-seven patients from 20 centers were prospectively recruited. Malignancies investigated included sarcomas (n = 25), brain tumours (n = 23), and others (n = 9). Whole-exome, low-coverage whole-genome, and RNA sequencing were complemented with methylation and expression microarray analyses. Alterations were assessed for potential targetability according to a customised prioritisation algorithm and subsequently discussed in an interdisciplinary molecular tumour board. Next-generation sequencing data were generated for 52 patients, with the full analysis possible in 46 of 52. Turnaround time from sample receipt until first report averaged 28 d. Twenty-six patients (50%) harbored a potentially druggable alteration with a prioritisation score of 'intermediate' or higher (level 4 of 7). Common targets included receptor tyrosine kinases, phosphoinositide 3-kinase-mammalian target of rapamycin pathway, mitogen-activated protein kinase pathway, and cell cycle control. Ten patients received a targeted therapy based on these findings, with responses observed in some previously treatment-refractory tumours. Comparative primary relapse analysis revealed substantial tumour evolution as well as one case of unsuspected secondary malignancy, highlighting the importance of re-biopsy at relapse. This study demonstrates the feasibility of comprehensive, real-time molecular profiling for high-risk paediatric cancer patients. 
This extended proof-of-concept, with examples of treatment consequences, expands upon previous personalised oncology endeavors, and presents a model with considerable interest and practical relevance in the burgeoning era of personalised medicine.
Subjects
Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Molecular Diagnostic Techniques , Molecular Targeted Therapy/methods , Neoplasms/drug therapy , Precision Medicine/methods , Adolescent , Adult , Child , Preschool Child , Female , Humans , Infant , Male , Neoplasms/genetics , Pilot Projects , Young Adult
ABSTRACT
As whole-genome sequencing for cancer genome analysis becomes a clinical tool, a full understanding of the variables affecting sequencing analysis output is required. Here, using tumour-normal sample pairs from two different types of cancer, chronic lymphocytic leukaemia and medulloblastoma, we conduct a benchmarking exercise within the context of the International Cancer Genome Consortium. We compare sequencing methods, analysis pipelines and validation methods. We show that using PCR-free methods and increasing sequencing depth to ~100× shows benefits, as long as the tumour:control coverage ratio remains balanced. We observe widely varying mutation call rates and low concordance among analysis pipelines, reflecting the artefact-prone nature of the raw data and lack of standards for dealing with the artefacts. However, we show that, using the benchmark mutation set we have created, many issues are in fact easy to remedy and have an immediate positive impact on mutation detection accuracy.