1.
Bioinformatics ; 33(2): 280-282, 2017 01 15.
Article in English | MEDLINE | ID: mdl-27605106

ABSTRACT

MOTIVATION: Large-scale rearrangements and copy number changes combined with different modes of clonal evolution create extensive somatic genome diversity, making it difficult to develop versatile and scalable variant calling tools and create well-calibrated benchmarks. RESULTS: We developed a new simulation framework tHapMix that enables the creation of tumour samples with different ploidy, purity and polyclonality features. It easily scales to simulation of hundreds of somatic genomes, while re-use of real read data preserves noise and biases present in sequencing platforms. We further demonstrate tHapMix utility by creating a simulated set of 140 somatic genomes and showing how it can be used in training and testing of somatic copy number variant calling tools. AVAILABILITY AND IMPLEMENTATION: tHapMix is distributed under an open source license and can be downloaded from https://github.com/Illumina/tHapMix CONTACT: sivakhno@illumina.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Copy Number Variations , Genomics/methods , Haplotypes , Neoplasms/genetics , Ploidies , Software , Computer Simulation , DNA, Neoplasm , Genome , Humans
2.
J Pathol ; 237(3): 296-306, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26096211

ABSTRACT

The study of the relationships between pre-cancer and cancer and identification of early driver mutations is becoming increasingly important as the value of molecular markers of early disease and personalised drug targets is recognized, especially now the extent of clonal heterogeneity in fully invasive disease is being realized. It has been assumed that pre-cancerous lesions exhibit a fairly passive progression to invasive disease; the degree to which they, too, are heterogeneous is unknown. We performed ultra-deep sequencing of thousands of selected mutations, together with copy number analysis, from multiple, matched pre-invasive lesions, primary tumours and metastases from five patients with oral cancer, some with multiple primary tumours presenting either synchronously or metachronously, totalling 75 samples. This allowed the clonal relationships between the samples to be observed for each patient. We expose for the first time the unexpected variety and complexity of the relationships between this group of oral dysplasias and their associated carcinomas and, ultimately, the diversity of processes by which tumours are initiated, spread and metastasize. Instead of a series of genomic precursors of their adjacent invasive disease, we have shown dysplasia to be a distinct dynamic entity, refuting the belief that pre-cancer and invasive tumours with a close spatial relationship always have linearly related genomes. We show that oral pre-cancer exhibits considerable subclonal heterogeneity in its own right, that mutational changes in pre-cancer do not predict the onset of invasion, and that the genomic pathway to invasion is neither unified nor predictable. Sequence data from this study have been deposited in the European Nucleotide Archive, Accession No. PRJEB6588.


Subject(s)
Biomarkers, Tumor/genetics , Carcinoma/genetics , Cell Lineage , Cell Transformation, Neoplastic/genetics , Clonal Evolution , High-Throughput Nucleotide Sequencing/methods , Mouth Neoplasms/genetics , Precancerous Conditions/genetics , Sequence Analysis, DNA/methods , Carcinoma/secondary , Cell Movement , Cell Proliferation , Cell Transformation, Neoplastic/pathology , Disease Progression , Gene Dosage , Genetic Predisposition to Disease , Humans , Mouth Neoplasms/pathology , Mutation , Neoplasm Invasiveness , Phenotype , Precancerous Conditions/pathology
3.
Int J Cancer ; 137(10): 2364-73, 2015 Nov 15.
Article in English | MEDLINE | ID: mdl-26014678

ABSTRACT

Verrucous carcinoma of the oral cavity (OVC) is considered a subtype of classical oral squamous cell carcinoma (OSCC). Diagnosis is problematic, and additional biomarkers are needed to better stratify patients. To investigate their molecular signature, we performed low-coverage copy number (CN) sequencing on 57 OVC and exome and RNA sequencing on a subset of these and compared the data to the same OSCC parameters. CN results showed that OVC lacked any of the classical OSCC patterns such as gain of 3q and loss of 3p and demonstrated considerably fewer genomic rearrangements compared to the OSCC cohort. OVC and OSCC samples could be clearly differentiated. Exome sequencing showed that OVC samples lacked mutations in genes commonly associated with OSCC (TP53, NOTCH1, NOTCH2, CDKN2A and FAT1). RNA sequencing identified genes that were differentially expressed between the groups. In silico functional analysis showed that the mutated and differentially expressed genes in OVC samples were involved in cell adhesion and keratinocyte proliferation, while those in the OSCC cohort were enriched for cell death and apoptosis pathways. This is the largest and most detailed genomic and transcriptomic analysis yet performed on this tumour type, which, as an example of non-metastatic cancer, may shed light on the nature of metastases. These three independent investigations consistently show substantial differences between the cohorts. Taken together, they lead to the conclusion that OVC is not a subtype of OSCC, but should be classified as a distinct entity.


Subject(s)
Carcinoma, Verrucous/genetics , Carcinoma, Verrucous/pathology , Genetic Variation , Mouth Neoplasms/genetics , Mouth Neoplasms/pathology , Chromosomes, Human, Pair 3/genetics , Computer Simulation , Exome , Gene Expression Regulation, Neoplastic , Humans , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods
4.
Oral Surg Oral Med Oral Pathol Oral Radiol ; 118(1): 117-125.e1, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24908602

ABSTRACT

OBJECTIVE: The etiology of oral verrucous carcinoma is unknown, and human papillomavirus 'involvement' remains contentious. The uncertainty can be attributed to varied detection procedures and difficulties in defining 'gold-standard' histologic criteria for diagnosing 'verrucous' lesions. Their paucity also hampers investigation. We aimed to analyze oral verrucous lesions for human papillomavirus (HPV) subtype genomes. STUDY DESIGN: We used next-generation sequencing for the detection of papillomavirus sequences, identifying subtypes and computing viral loads. We identified a total of 78 oral verrucous cases (62 carcinomas and 16 hyperplasias). DNA was extracted from all and sequenced at a coverage between 2.5% and 13%. RESULTS: An HPV-16 sequence was detected in 1 carcinoma and 1 hyperplasia, and an HPV-2 sequence was detected in 1 carcinoma out of the 78 cases, with viral loads of 2.24, 8.16, and 0.33 viral genomes per cell, respectively. CONCLUSIONS: Our results indicate no conclusive human papillomavirus involvement in oral verrucous carcinoma or hyperplasia.


Subject(s)
Carcinoma, Verrucous/virology , Mouth Neoplasms/virology , Papillomaviridae/isolation & purification , Papillomavirus Infections/virology , Sequence Analysis, DNA/methods , Adult , Aged , Aged, 80 and over , Carcinoma, Verrucous/genetics , Female , Herpesviridae Infections/genetics , Herpesviridae Infections/virology , Human papillomavirus 16/genetics , Human papillomavirus 16/isolation & purification , Humans , Male , Middle Aged , Mouth Neoplasms/genetics , Papillomaviridae/genetics , Papillomavirus Infections/genetics , Viral Load
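The viral loads quoted in the abstract above are expressed as viral genomes per cell. The abstract does not give the formula, so the following is only a minimal sketch of one common way to derive such a figure from read counts: compare read depth on the viral genome with read depth on the host genome. The function name, the default host genome size and the assumptions of a diploid host and equal-length reads are illustrative, not taken from the paper.

```python
def viral_genomes_per_cell(viral_reads, human_reads, viral_genome_bp,
                           human_haploid_bp=3.1e9, host_ploidy=2):
    """Estimate viral genome copies per host cell from mapped read counts.

    Assumptions (hypothetical, not from the paper): all reads have the same
    length, the host cell is diploid, and human_haploid_bp is an approximate
    haploid genome size.
    """
    if human_reads <= 0 or viral_genome_bp <= 0 or human_haploid_bp <= 0:
        raise ValueError("read counts and genome sizes must be positive")
    # Reads per base pair on each genome (proportional to coverage depth).
    viral_depth = viral_reads / viral_genome_bp
    human_depth = human_reads / human_haploid_bp
    # One haploid human genome equivalent corresponds to 1/host_ploidy cells.
    return host_ploidy * viral_depth / human_depth


# Hypothetical counts for illustration only.
load = viral_genomes_per_cell(viral_reads=10, human_reads=1_000_000,
                              viral_genome_bp=8000, human_haploid_bp=3.2e9)
print(f"{load:.2f} viral genomes per cell")
```

At very low coverage the viral read count is small, so an estimate of this kind carries wide uncertainty.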
5.
Bioinformatics ; 30(13): 1823-9, 2014 Jul 01.
Article in English | MEDLINE | ID: mdl-24603986

ABSTRACT

MOTIVATION: Current high-throughput sequencing has greatly transformed genome sequence analysis. In the context of very low-coverage sequencing (<0.1×), performing 'binning' or 'windowing' on mapped short sequences ('reads') is critical to extract genomic information of interest for further evaluation, such as copy-number alteration analysis. If the window size is too small, many windows will exhibit zero counts and almost no pattern can be observed. In contrast, if the window size is too wide, the patterns or genomic features will be 'smoothed out'. Our objective is to identify an optimal window size in between the two extremes. RESULTS: We assume the read density to be a step function. Given this model, we propose a data-based estimation of optimal window size based on Akaike's information criterion (AIC) and cross-validation (CV) log-likelihood. By plotting the AIC and CV log-likelihood curve as a function of window size, we are able to estimate the optimal window size that minimizes AIC or maximizes CV log-likelihood. The proposed methods are of general purpose and we illustrate their application using low-coverage next-generation sequence datasets from real tumour samples and simulated datasets. AVAILABILITY AND IMPLEMENTATION: An R package to estimate optimal window size is available at http://www1.maths.leeds.ac.uk/~arief/R/win/.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Genome, Human , Genomics/methods , Humans , Likelihood Functions , Lung Neoplasms/genetics
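The window-size selection described in the abstract above can be illustrated with a small sketch. It treats the read density as a histogram (a step function with equal-width windows), computes the histogram log-likelihood for each candidate window size, and picks the size with the lowest AIC. This is a generic textbook form of the histogram AIC, not the code of the authors' R package. The function names and the restriction to window sizes that divide the genome length evenly are assumptions made for simplicity.

```python
import math


def histogram_aic(positions, genome_length, window):
    """AIC of an equal-width histogram density fitted to read positions.

    Assumes `window` divides `genome_length` evenly, so every window has
    the same width (an illustrative simplification).
    """
    n_bins = math.ceil(genome_length / window)
    counts = [0] * n_bins
    for p in positions:
        counts[min(int(p // window), n_bins - 1)] += 1
    total = len(positions)
    # Log-likelihood of the step-function density: height n_k / (N * w).
    loglik = sum(c * math.log(c / (total * window)) for c in counts if c > 0)
    # n_bins - 1 free parameters, because the bin probabilities sum to 1.
    return -2.0 * loglik + 2.0 * (n_bins - 1)


def best_window(positions, genome_length, candidates):
    """Return the candidate window size with the smallest AIC."""
    return min(candidates,
               key=lambda w: histogram_aic(positions, genome_length, w))


# Toy data: a 1000 bp region whose second half has four times the read density.
positions = [i * 2.5 for i in range(200)] + [500 + i * 0.625 for i in range(800)]
print(best_window(positions, 1000, [1, 10, 100, 500, 1000]))
```

The cross-validated log-likelihood criterion mentioned in the abstract would replace the AIC penalty with a held-out likelihood; it is not shown here.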
6.
PLoS One ; 8(11): e78823, 2013.
Article in English | MEDLINE | ID: mdl-24244370

ABSTRACT

Squamous cell carcinoma (SCC) of the lung kills over 350,000 people annually worldwide, and is the main lung cancer histotype with no targeted treatments. High-coverage whole-genome sequencing of the other main subtypes, small-cell and adenocarcinoma, gave insights into carcinogenic mechanisms and disease etiology. The genomic complexity within the lung SCC subtype, as revealed by The Cancer Genome Atlas, means this subtype is likely to benefit from a more integrated approach in which the transcriptional consequences of somatic mutations are simultaneously inspected. Here we present such an approach: the integrated analysis of deep sequencing data from both the whole genome and whole transcriptome (coding and non-coding) of LUDLU-1, a SCC lung cell line. Our results show that LUDLU-1 lacks the mutational signature that has been previously associated with tobacco exposure in other lung cancer subtypes, and suggest that DNA-repair efficiency is adversely affected; LUDLU-1 contains somatic mutations in TP53 and BRCA2, allelic imbalance in the expression of two cancer-associated BRCA1 germline polymorphisms and reduced transcription of a potentially endogenous PARP2 inhibitor. Functional assays were performed and compared with a control lung cancer cell line. LUDLU-1 did not exhibit radiosensitisation or an increase in sensitivity to PARP inhibitors. However, LUDLU-1 did exhibit small but significant differences with respect to cisplatin sensitivity. Our research shows how integrated analyses of high-throughput data can generate hypotheses to be tested in the lab.


Subject(s)
Carcinoma, Squamous Cell/genetics , Gene Expression Regulation, Neoplastic , Mutation , Neoplasm Proteins/genetics , Transcription, Genetic , Carcinoma, Squamous Cell/metabolism , Carcinoma, Squamous Cell/pathology , Cell Line, Tumor , DNA Mutational Analysis , High-Throughput Nucleotide Sequencing , Humans , Lung Neoplasms , Neoplasm Proteins/biosynthesis
7.
Neoplasia ; 14(11): 1075-86, 2012 Nov.
Article in English | MEDLINE | ID: mdl-23226101

ABSTRACT

Lung cancer causes more deaths, worldwide, than any other cancer. Several histologic subtypes exist. Currently, there is a dearth of targeted therapies for treating one of the main subtypes: squamous cell carcinoma (SCC). As for many cancers, lung SCC karyotypes are often highly anomalous owing to large somatic structural variants, some of which are seen repeatedly in lung SCC, indicating a potential causal association for genes therein. We chose to characterize a lung SCC genome to unprecedented detail and integrate our findings with the concurrently characterized transcriptome. We aimed to ascertain how somatic structural changes affected gene expression within the cell in ways that could confer a pathogenic phenotype. We sequenced the genomes of a lung SCC cell line (LUDLU-1) and its matched lymphocyte cell line (AGLCL) to more than 50x coverage. We also sequenced the transcriptomes of LUDLU-1 and a normal bronchial epithelium cell line (LIMM-NBE1), resulting in more than 600 million aligned reads per sample, including both coding and non-coding RNA (ncRNA), in a strand-directional manner. We also captured small RNA (<30 bp). We discovered significant, but weak, correlations between copy number and expression for protein-coding genes, antisense transcripts, long intergenic ncRNA, and microRNA (miRNA). We found that miRNA undergo the largest change in overall expression pattern between the normal bronchial epithelium and the tumor cell line. We found evidence of transcription across the novel genomic sequence created from six somatic structural variants. For each part of our integrated analysis, we highlight candidate genes that have undergone the largest expression changes.


Subject(s)
Carcinoma, Squamous Cell/genetics , Gene Amplification , Gene Deletion , Gene Rearrangement , Lung Neoplasms/genetics , Transcription, Genetic , Cell Line, Tumor , DNA Copy Number Variations , Gene Dosage , Gene Expression Regulation, Neoplastic , Genome-Wide Association Study , Humans , Ploidies
8.
Article in English | MEDLINE | ID: mdl-22408616

ABSTRACT

Equipped with its 302-cell nervous system, the nematode Caenorhabditis elegans adapts its locomotion in different environments, exhibiting so-called swimming in liquids and crawling on dense gels. Recent experiments have demonstrated that the worm displays the full range of intermediate behaviors when placed in intermediate environments. The continuous nature of this transition strongly suggests that these behaviors all stem from modulation of a single underlying mechanism. We present a model of C. elegans forward locomotion that includes a neuromuscular control system that relies on a sensory feedback mechanism to generate undulations and is integrated with a physical model of the body and environment. We find that the model reproduces the entire swim-crawl transition, as well as locomotion in complex and heterogeneous environments. This is achieved with no modulatory mechanism, except via the proprioceptive response to the physical environment. Manipulations of the model are used to dissect the proposed pattern generation mechanism and its modulation. The model suggests a possible role for GABAergic D-class neurons in forward locomotion and makes a number of experimental predictions, in particular with respect to non-linearities in the model and to symmetry breaking between the neuromuscular systems on the ventral and dorsal sides of the body.

9.
J Mol Diagn ; 14(2): 104-11, 2012.
Article in English | MEDLINE | ID: mdl-22240447

ABSTRACT

Human papillomavirus (HPV) infection in cases of squamous cell carcinoma of the oropharynx is a powerful predictive and prognostic biomarker. We describe how the use of next-generation sequencing can provide a novel method for the detection of HPV in DNA isolated from formalin-fixed paraffin-embedded tissues. Using this methodology in a cohort of 44 head and neck tumors, we identified the samples that contained HPV sequences, the viral subtype involved, and a direct readout of viral load. Specificity of HPV detection by sequencing compared to traditional detection methods using either PCR or p16 immunohistochemistry was 100%. Sensitivity was 50% when compared to PCR [confidence interval (CI) = 29% to 71%] and 75% when compared to p16 (CI = 47% to 91%). In addition, we demonstrate the ability of next-generation sequencing to detect other HPV subtypes that would not have been detected by traditional methods, and we demonstrate the ability to apply this method to any tumor and any virus in a panel of eight human cancer cell lines. This methodology also provides a tumor genomic copy number karyogram, and in the samples analyzed here, a lower level of chromosome instability was detected in HPV-positive tumors compared to HPV-negative tumors, as observed in previous studies. Thus, the use of next-generation sequencing for the detection of HPV provides a multiplicity of data with clinical significance in a single test.


Subject(s)
Gene Dosage , Head and Neck Neoplasms/diagnosis , Papillomaviridae/classification , Papillomaviridae/genetics , Papillomavirus Infections/diagnosis , Tumor Virus Infections/diagnosis , Viral Load/genetics , Carcinoma, Squamous Cell/diagnosis , Carcinoma, Squamous Cell/genetics , Carcinoma, Squamous Cell/virology , Cyclin-Dependent Kinase Inhibitor p16 , DNA, Viral/genetics , Female , Head and Neck Neoplasms/genetics , Head and Neck Neoplasms/virology , High-Throughput Nucleotide Sequencing , Humans , Immunoenzyme Techniques , Middle Aged , Neoplasm Proteins/metabolism , Papillomavirus Infections/genetics , Papillomavirus Infections/virology , Polymerase Chain Reaction , Sequence Analysis, DNA , Tumor Virus Infections/genetics , Tumor Virus Infections/virology
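The sensitivity and specificity figures in the abstract above come from comparing one test against a reference method. As a minimal sketch of the arithmetic, the code below computes both measures from a 2x2 table and attaches a Wilson score interval to a proportion. The abstract does not say which interval method the authors used, so the Wilson interval is an assumption, and the counts in the example are hypothetical rather than the study's data.

```python
import math


def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from 2x2 confusion-table counts."""
    return tp / (tp + fn), tn / (tn + fp)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical counts: 5 true positives, 5 false negatives,
# 10 true negatives, 0 false positives.
sens, spec = sensitivity_specificity(tp=5, fp=0, tn=10, fn=5)
low, high = wilson_interval(5, 10)
print(f"sensitivity={sens:.2f} (CI {low:.2f} to {high:.2f}), specificity={spec:.2f}")
```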
10.
Bioinformatics ; 28(1): 40-7, 2012 Jan 01.
Article in English | MEDLINE | ID: mdl-22039209

ABSTRACT

MOTIVATION: Comparison of read depths from next-generation sequencing between cancer and normal cells makes the estimation of copy number alteration (CNA) possible, even at very low coverage. However, estimating CNA from patients' tumour samples poses considerable challenges due to infiltration with normal cells and aneuploid cancer genomes. Here we provide a method that corrects contamination with normal cells and adjusts for genomes of different sizes so that the actual copy number of each region can be estimated. RESULTS: The procedure consists of several steps. First, we identify the multi-modality of the distribution of smoothed ratios. Then we use the estimates of the mean (modes) to identify underlying ploidy and the contamination level, and finally we perform the correction. The results indicate that the method works properly to estimate genomic regions with gains and losses in a range of simulated data as well as in two datasets from lung cancer patients. It also proves a powerful tool when analysing publicly available data from two cell lines (HCC1143 and COLO829). AVAILABILITY: An R package, called CNAnorm, is available at http://www.precancer.leeds.ac.uk/cnanorm or from Bioconductor. CONTACT: a.gusnanto@leeds.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Copy Number Variations , Genome Size , High-Throughput Nucleotide Sequencing , Neoplasms/genetics , Software , Cell Line, Tumor , Computer Simulation , Humans , Lung Neoplasms/genetics , Sequence Analysis, DNA
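The correction for normal-cell contamination described in the abstract above can be illustrated with a deliberately simplified model. Assume the observed tumour/normal read-depth ratio has already been scaled so that a ratio of 1 corresponds to two copies, that normal cells carry two copies everywhere, and that the tumour fraction of the sample is known. The observed ratio is then a weighted mix of tumour and normal copy numbers, and it can be inverted to recover the tumour copy number. This sketch is not the CNAnorm algorithm, which also estimates ploidy and contamination from the data. The function name and parameters are hypothetical.

```python
def corrected_copy_number(observed_ratio, tumour_fraction, normal_copies=2):
    """Recover tumour copy number from a contaminated depth ratio.

    Simplified model (an assumption, not CNAnorm's procedure):
        observed_ratio * normal_copies
            = tumour_fraction * C + (1 - tumour_fraction) * normal_copies
    solved for the tumour copy number C.
    """
    if not 0 < tumour_fraction <= 1:
        raise ValueError("tumour_fraction must be in (0, 1]")
    mixed_copies = observed_ratio * normal_copies
    normal_part = (1 - tumour_fraction) * normal_copies
    return (mixed_copies - normal_part) / tumour_fraction


# A ratio of 1.25 in a sample that is half tumour, half normal cells.
print(corrected_copy_number(1.25, 0.5))
```

With 50% contamination, a modest observed ratio of 1.25 corresponds to a full single-copy gain in the tumour cells, which shows why uncorrected ratios understate copy number changes.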
11.
Genomics ; 99(1): 18-24, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22050995

ABSTRACT

Squamous cell carcinoma of the lung is remarkable for the extent to which the same chromosomal abnormalities are detected in individual tumours. We have used next generation sequencing at low coverage to produce high resolution copy number karyograms of a series of 89 non-small cell lung tumours specifically of the squamous cell subtype. Because this methodology is able to create karyograms from formalin-fixed paraffin-embedded material, we were able to use archival stored samples for which survival data were available and correlate frequently occurring copy number changes with disease outcome. No single region of genomic change showed significant correlation with survival. However, adopting a whole-genome approach, we devised an algorithm that relates to total genomic damage, specifically the relative ratios of copy number states across the genome. This algorithm generated a novel index, which is an independent prognostic indicator in early stage squamous cell carcinoma of the lung.


Subject(s)
Carcinoma, Non-Small-Cell Lung/genetics , Carcinoma, Non-Small-Cell Lung/mortality , Carcinoma, Squamous Cell/genetics , Carcinoma, Squamous Cell/mortality , Lung Neoplasms/genetics , Lung Neoplasms/mortality , Adult , Aged , Aged, 80 and over , Algorithms , Carcinoma, Non-Small-Cell Lung/surgery , Carcinoma, Squamous Cell/surgery , Female , Gene Dosage , Genome, Human , Humans , Lung Neoplasms/surgery , Male , Middle Aged , Models, Genetic , Prognosis , Sequence Analysis, DNA , Survival Analysis
13.
BMC Plant Biol ; 9: 120, 2009 Sep 22.
Article in English | MEDLINE | ID: mdl-19772648

ABSTRACT

BACKGROUND: The WRKY transcription factor gene family has a very ancient origin and has undergone extensive duplications in the plant kingdom. Several studies have pointed out their involvement in a range of biological processes, revealing that a large number of WRKY genes are transcriptionally regulated under conditions of biotic and/or abiotic stress. To investigate the existence of WRKY co-regulatory networks in plants, an expression study of the whole WRKY gene family was carried out in rice (Oryza sativa). This analysis was extended to Arabidopsis thaliana taking advantage of an extensive repository of gene expression data. RESULTS: The presented results suggested that 24 members of the rice WRKY gene family (22% of the total) were differentially-regulated in response to at least one of the stress conditions tested. We defined the existence of nine OsWRKY gene clusters comprising both phylogenetically related and unrelated genes that were significantly co-expressed, suggesting that specific sets of WRKY genes might act in co-regulatory networks. This hypothesis was tested by Pearson Correlation Coefficient analysis of the Arabidopsis WRKY gene family in a large set of Affymetrix microarray experiments. AtWRKYs were found to belong to two main co-regulatory networks (COR-A, COR-B) and two smaller ones (COR-C and COR-D), all including genes belonging to distinct phylogenetic groups. The COR-A network contained several AtWRKY genes known to be involved mostly in response to pathogens, whose physical and/or genetic interaction was experimentally proven. We also showed that specific co-regulatory networks were conserved between the two model species by identifying Arabidopsis orthologs of the co-expressed OsWRKY genes. CONCLUSION: In this work we identified sets of co-expressed WRKY genes in both rice and Arabidopsis that are likely to cooperate functionally in the same signal transduction pathways. We propose that, making use of data from co-regulatory networks, it is possible to highlight novel clusters of plant genes contributing to the same biological processes or signal transduction pathways. Our approach will help to unveil gene cooperation pathways not yet identified by classical genetic analyses. This information will open new routes contributing to the dissection of WRKY signal transduction pathways in plants.


Subject(s)
Arabidopsis/genetics , Gene Regulatory Networks , Multigene Family , Oryza/genetics , Plant Proteins/metabolism , Arabidopsis/metabolism , Cluster Analysis , DNA, Plant/genetics , Gene Expression Profiling , Gene Expression Regulation, Plant , Genes, Plant , Oligonucleotide Array Sequence Analysis , Oryza/metabolism , Phylogeny , Plant Proteins/genetics , Signal Transduction , Stress, Physiological , Transcription Factors/genetics , Transcription Factors/metabolism
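The abstract above tests co-regulation by Pearson correlation of expression profiles across many experiments. A minimal sketch of that idea follows: compute the Pearson coefficient for every pair of genes and keep the pairs above a cut-off. The gene names, expression values and the 0.9 threshold are hypothetical, and the abstract does not state the threshold or any further filtering the authors applied.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles.

    Raises ZeroDivisionError if either profile has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_pairs(expression, threshold=0.9):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    return [
        (g1, g2)
        for g1, g2 in combinations(sorted(expression), 2)
        if pearson(expression[g1], expression[g2]) >= threshold
    ]


# Hypothetical expression values across four experiments.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
print(coexpressed_pairs(expression))
```

Connected groups of such pairs would form candidate co-regulatory networks. Negative correlations are ignored here; a study of repression would need to keep them.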
14.
HFSP J ; 3(3): 186-93, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19639043

ABSTRACT

The ability of an animal to locomote through its environment depends crucially on the interplay between its active endogenous control and the physics of its interactions with the environment. The nematode worm Caenorhabditis elegans serves as an ideal model system for studying the respective roles of neural control and biomechanics, as well as the interaction between them. With only 302 neurons in a hard-wired neural circuit, the worm's apparent anatomical simplicity belies its behavioural complexity. Indeed, C. elegans exhibits a rich repertoire of complex behaviors, the majority of which are mediated by its adaptive undulatory locomotion. The conventional wisdom is that two kinematically distinct C. elegans locomotion behaviors (swimming in liquids and crawling on dense gel-like media) correspond to distinct locomotory gaits. Here we analyze the worm's motion through a series of different media and reveal a smooth transition from swimming to crawling, marked by a linear relationship between key locomotion metrics. These results point to a single locomotory gait, governed by the same underlying control mechanism. We further show that environmental forces play only a small role in determining the shape of the worm, placing conditions on the minimal pattern of internal forces driving locomotion.

15.
Plant Mol Biol ; 59(1): 99-110, 2005 Sep.
Article in English | MEDLINE | ID: mdl-16217605

ABSTRACT

A collection of 1373 unique flanking sequence tags (FSTs), generated from Ac/Ds and Ac transposon lines for reverse genetics studies, were produced in japonica and indica rice, respectively. The Ds and Ac FSTs together with the original T-DNAs were assigned a position in the rice genome sequence represented as assembled pseudomolecules, and found to be distributed evenly over the entire rice genome with a distinct bias for predicted gene-rich regions. The bias of the Ds and Ac transposon inserts for genes was exemplified by the presence of 59% of the inserts in genes annotated on the rice chromosomes and 41% present in genes transcribed as disclosed by their homology to cDNA clones. In a screen for inserts in a set of 75 well annotated transcription factors, including homeobox-containing genes, we found six Ac/Ds inserts. This high frequency of Ds and Ac inserts in genes suggests that saturated knockout mutagenesis in rice using this strategy will be efficient and possible with a lower number of inserts than expected. These FSTs and the corresponding plant lines are publicly available through the OrygenesDB database and from the EU consortium members.


Subject(s)
DNA Transposable Elements/genetics , DNA, Plant/genetics , Databases, Genetic , Genomics/methods , Mutation/genetics , Oryza/genetics , Binding Sites/genetics , Chromosome Mapping , Chromosomes, Plant/genetics , DNA, Bacterial/genetics , DNA, Plant/isolation & purification , Genome, Plant , Mutagenesis, Insertional/methods , Transcription Factors/genetics