ABSTRACT
Accurate measurement of clonal genotypes, mutational processes, and replication states from individual tumor-cell genomes will facilitate improved understanding of tumor evolution. We have developed DLP+, a scalable single-cell whole-genome sequencing platform implemented using commodity instruments, image-based object recognition, and open source computational methods. Using DLP+, we have generated a resource of 51,926 single-cell genomes and matched cell images from diverse cell types including cell lines, xenografts, and diagnostic samples with limited material. From this resource we have defined variation in mitotic mis-segregation rates across tissue types and genotypes. Analysis of matched genomic and image measurements revealed correlations between cellular morphology and genome ploidy states. Aggregation of cells sharing copy number profiles allowed for calculation of single-nucleotide resolution clonal genotypes and inference of clonal phylogenies and avoided the limitations of bulk deconvolution. Finally, joint analysis over the above features defined clone-specific chromosomal aneuploidy in polyclonal populations.
Subject(s)
DNA Replication/genetics , Genome, Human , High-Throughput Nucleotide Sequencing , Single-Cell Analysis , Aneuploidy , Animals , Cell Cycle/genetics , Cell Line, Tumor , Cell Shape , Cell Survival , Chromosomes, Human/genetics , Clone Cells , DNA Transposable Elements/genetics , Diploidy , Female , Genotype , Humans , Male , Mice , Mutation/genetics , Phylogeny , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Human cancers, including breast cancers, comprise clones differing in mutation content. Clones evolve dynamically in space and time following principles of Darwinian evolution, underpinning important emergent features such as drug resistance and metastasis. Human breast cancer xenoengraftment is used as a means of capturing and studying tumour biology, and breast tumour xenografts are generally assumed to be reasonable models of the originating tumours. However, the consequences and reproducibility of engraftment and propagation on the genomic clonal architecture of tumours have not been systematically examined at single-cell resolution. Here we show, using deep-genome and single-cell sequencing methods, the clonal dynamics of initial engraftment and subsequent serial propagation of primary and metastatic human breast cancers in immunodeficient mice. In all 15 cases examined, clonal selection on engraftment was observed in both primary and metastatic breast tumours, varying in degree from extreme selective engraftment of minor (<5% of starting population) clones to moderate, polyclonal engraftment. Furthermore, ongoing clonal dynamics during serial passaging is a feature of tumours experiencing modest initial selection. Through single-cell sequencing, we show that major mutation clusters estimated from tumour population sequencing relate predictably to the most abundant clonal genotypes, even in clonally complex and rapidly evolving cases. Finally, we show that similar clonal expansion patterns can emerge in independent grafts of the same starting tumour population, indicating that genomic aberrations can be reproducible determinants of evolutionary trajectories. Our results show that measurement of genomically defined clonal population dynamics will be highly informative for functional studies using patient-derived breast cancer xenoengraftment.
Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/pathology , Clone Cells/metabolism , Clone Cells/pathology , Genome, Human/genetics , Single-Cell Analysis , Xenograft Model Antitumor Assays , Animals , Breast Neoplasms/secondary , DNA Mutational Analysis , Genomics , Genotype , High-Throughput Nucleotide Sequencing , Humans , Mice , Neoplasm Transplantation , Time Factors , Transplantation, Heterologous , Xenograft Model Antitumor Assays/methods
ABSTRACT
Single-cell genomics is critical for understanding cellular heterogeneity in cancer, but existing library preparation methods are expensive, require sample preamplification and introduce coverage bias. Here we describe direct library preparation (DLP), a robust, scalable, and high-fidelity method that uses nanoliter-volume transposition reactions for single-cell whole-genome library preparation without preamplification. We examined 782 cells from cell lines and triple-negative breast xenograft tumors. Low-depth sequencing, compared with existing methods, revealed greater coverage uniformity and more reliable detection of copy-number alterations. Using phylogenetic analysis, we found minor xenograft subpopulations that were undetectable by bulk sequencing, as well as dynamic clonal expansion and diversification between passages. Merging single-cell genomes in silico, we generated 'bulk-equivalent' genomes with high depth and uniform coverage. Thus, low-depth sequencing of DLP libraries may provide an attractive replacement for conventional bulk sequencing methods, permitting analysis of copy number at the cell level and of other genomic variants at the population level.
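The in silico merging step can be illustrated with a minimal sketch. The abstract does not say how the 'bulk-equivalent' genomes were assembled, so this example only assumes the simplest case: each cell is represented by read counts per genomic bin, and the counts are summed across cells. The function name `merge_cells`, the bin keys, and the counts are all hypothetical.

```python
from collections import Counter


def merge_cells(per_cell_counts):
    """Sum per-bin read counts across cells into one pseudo-bulk profile.

    per_cell_counts: iterable of dicts mapping (chromosome, bin_start)
    to the number of reads a single cell contributed to that bin.
    """
    merged = Counter()
    for counts in per_cell_counts:
        # Counter.update adds counts rather than overwriting them.
        merged.update(counts)
    return dict(merged)


# Hypothetical low-depth read counts for three cells (illustrative values only).
cells = [
    {("chr1", 0): 3, ("chr1", 500_000): 1},
    {("chr1", 0): 2, ("chr1", 500_000): 4},
    {("chr1", 0): 1},  # this cell has no reads in the second bin
]

pseudo_bulk = merge_cells(cells)
```

Because each cell is sequenced at low depth, any single profile is sparse; summing across many cells is what yields the depth needed to call other variants at the population level.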
Subject(s)
Genomics/methods , Single-Cell Analysis/methods , Animals , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Cell Line, Tumor , Female , Gene Library , Humans , Lab-On-A-Chip Devices , Mice, SCID , Phylogeny , Single-Cell Analysis/instrumentation , Xenograft Model Antitumor Assays
ABSTRACT
The genomes of large numbers of single cells must be sequenced to further understanding of the biological significance of genomic heterogeneity in complex systems. Whole genome amplification (WGA) of single cells is generally the first step in such studies, but is prone to nonuniformity that can compromise genomic measurement accuracy. Despite recent advances, robust performance in high-throughput single-cell WGA remains elusive. Here, we introduce droplet multiple displacement amplification (MDA), a method that uses commercially available liquid dispensing to perform high-throughput single-cell MDA in nanoliter volumes. The performance of droplet MDA is characterized using a large dataset of 129 normal diploid cells, and is shown to exceed previously reported single-cell WGA methods in amplification uniformity, genome coverage, and/or robustness. We achieve up to 80% coverage of a single-cell genome at 5× sequencing depth, and demonstrate excellent single-nucleotide variant (SNV) detection using targeted sequencing of droplet MDA product to achieve a median allelic dropout of 15%, and using whole genome sequencing to achieve false and true positive rates of 9.66 × 10⁻⁶ and 68.8%, respectively, in a G1-phase cell. We further show that droplet MDA allows for the detection of copy number variants (CNVs) as small as 30 kb in single cells of an ovarian cancer cell line and as small as 9 Mb in two high-grade serous ovarian cancer samples using only 0.02× depth. Droplet MDA provides an accessible and scalable method for performing robust and accurate CNV and SNV measurements on large numbers of single cells.
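For readers unfamiliar with the allelic dropout metric, the following is a minimal sketch of one common way to compute it for a single cell. The abstract does not describe the authors' exact procedure, so the definition used here is an assumption: the fraction of covered, known-heterozygous sites at which only one of the two alleles is observed. The function name and the read counts are hypothetical.

```python
def allelic_dropout_rate(het_site_reads):
    """Fraction of covered heterozygous sites where one allele is missing.

    het_site_reads: list of (ref_reads, alt_reads) tuples at sites known
    to be heterozygous (for example, from matched bulk sequencing).
    Sites with no reads at all are excluded from the denominator.
    """
    covered = [(ref, alt) for ref, alt in het_site_reads if ref + alt > 0]
    if not covered:
        raise ValueError("no covered heterozygous sites")
    dropped = sum(1 for ref, alt in covered if ref == 0 or alt == 0)
    return dropped / len(covered)


# Hypothetical read counts at five heterozygous sites in one cell.
sites = [(10, 8), (12, 0), (0, 9), (7, 5), (0, 0)]
rate = allelic_dropout_rate(sites)
```

A median across cells, as reported in the abstract, would then be taken over the per-cell rates.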
Subject(s)
Genome, Human/genetics , Genomics/methods , Nucleic Acid Amplification Techniques/methods , Single-Cell Analysis/methods , Alleles , Cell Line , Cell Line, Tumor , DNA Copy Number Variations , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Single Nucleotide , Reproducibility of Results
Subject(s)
Evolution, Molecular , Neoplasms/genetics , Phylogeny , Single-Cell Analysis/methods , Humans , Software
ABSTRACT
Functional RNA structures tend to be conserved during evolution. This finding is exploited, for example, by comparative methods for RNA secondary structure prediction, which currently provide the state of the art in prediction accuracy. Here we provide strong evidence that homologous RNA genes not only fold into similar final RNA structures, but that their folding pathways also share common transient structural features that have been evolutionarily conserved. To this end, we compile and investigate a non-redundant data set of 32 sequences with known transient and final RNA secondary structures and devise a dedicated computational analysis pipeline.
Subject(s)
RNA Folding , RNA/chemistry , Computational Biology/methods , Evolution, Molecular , Nucleic Acid Conformation , Sequence Homology, Nucleic Acid , Software
ABSTRACT
Here we use single-cell RNA sequencing to compile a human breast cell atlas assembled from 55 donors that had undergone reduction mammoplasties or risk reduction mastectomies. From more than 800,000 cells we identified 41 cell subclusters across the epithelial, immune and stromal compartments. The contribution of these different clusters varied according to the natural history of the tissue. Age, parity and germline mutations, known to modulate the risk of developing breast cancer, affected the homeostatic cellular state of the breast in different ways. We found that immune cells from BRCA1 or BRCA2 carriers had a distinct gene expression signature indicative of potential immune exhaustion, which was validated by immunohistochemistry. This suggests that immune-escape mechanisms could manifest in non-cancerous tissues very early during tumor initiation. This atlas is a rich resource that can be used to inform novel approaches for early detection and prevention of breast cancer.
Subject(s)
BRCA1 Protein , Breast Neoplasms , Adult , Female , Pregnancy , Humans , BRCA1 Protein/genetics , Breast Neoplasms/genetics , Breast Neoplasms/pathology , BRCA2 Protein/genetics , Genes, BRCA2 , Germ-Line Mutation
ABSTRACT
The hok/sok toxin-antitoxin system of Escherichia coli plasmid R1 increases plasmid maintenance by killing plasmid-free daughter cells. The hok/sok locus specifies two RNAs: hok mRNA, which encodes a toxic transmembrane protein, and sok antisense RNA, which binds a complementary region in the hok mRNA and induces transcript degradation. During cell growth, the cis-encoded sok RNA inhibits expression of the Hok toxin. In plasmid-free segregants, the rapid decay of sok RNA relative to hok mRNA permits Hok translation, leading to cell death. This post-segregational killing mechanism relies upon the ability of the hok mRNA to adopt alternative structural configurations, which affect ease of translation and the susceptibility of the molecule to degradation. The full-length hok transcript is stable, highly structured and immune to ribosome and antisense RNA binding. Gradual 3' end processing produces dramatic structural rearrangements in the mRNA, which render the molecule translationally active and expose the sok RNA binding site. During transcription, premature ribosome and sok binding are prevented through the formation of transient metastable hairpins in the 5' end of the nascent transcript. Several hok mRNA paralogs have been identified in the genome of E. coli, and Hok protein orthologs found in the genomes of Enterobacteria. Using a combination of automated search and extensive manual editing, we compiled a multiple sequence alignment for the hok mRNA. All three experimentally validated hok mRNA structures are mapped onto this alignment, which has been submitted to the Rfam database for RNA families.
Subject(s)
Bacterial Toxins/metabolism , Escherichia coli Proteins/metabolism , Genome, Bacterial , RNA, Bacterial/metabolism , RNA, Messenger/metabolism , Bacterial Toxins/genetics , Base Sequence , Binding Sites , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli Proteins/genetics , Gene Expression Regulation, Bacterial , Nucleic Acid Conformation , Plasmids/genetics , Plasmids/metabolism , Protein Biosynthesis , RNA Stability , RNA, Bacterial/genetics , RNA, Messenger/genetics , Ribosomes/genetics , Ribosomes/metabolism , Sequence Alignment , Sequence Homology, Nucleic Acid , Transcription, Genetic
ABSTRACT
Measuring gene expression of tumor clones at single-cell resolution links functional consequences to somatic alterations. Without scalable methods to simultaneously assay DNA and RNA from the same single cell, parallel single-cell DNA and RNA measurements from independent cell populations must be mapped for genome-transcriptome association. We present clonealign, which assigns gene expression states to cancer clones using single-cell RNA and DNA sequencing independently sampled from a heterogeneous population. We apply clonealign to triple-negative breast cancer patient-derived xenografts and high-grade serous ovarian cancer cell lines and discover clone-specific dysregulated biological pathways not visible using either sequencing method alone.
Subject(s)
Biomarkers, Tumor/genetics , Cystadenocarcinoma, Serous/genetics , High-Throughput Nucleotide Sequencing/methods , Models, Statistical , Ovarian Neoplasms/genetics , Single-Cell Analysis/methods , Software , Triple Negative Breast Neoplasms/genetics , Animals , Clone Cells , Cystadenocarcinoma, Serous/pathology , Female , Humans , Mice, Inbred NOD , Mice, SCID , Ovarian Neoplasms/pathology , Triple Negative Breast Neoplasms/pathology , Tumor Cells, Cultured , Xenograft Model Antitumor Assays
ABSTRACT
Next-generation sequencing (NGS) of bulk tumour tissue can identify constituent cell populations in cancers and measure their abundance. This requires computational deconvolution of allelic counts from somatic mutations, which may be incapable of fully resolving the underlying population structure. Single cell sequencing (SCS) is a more direct method, although its replacement of NGS is impeded by technical noise and sampling limitations. We propose ddClone, which analytically integrates NGS and SCS data, leveraging their complementary attributes through joint statistical inference. We show on real and simulated datasets that ddClone produces more accurate results than can be achieved by either method alone.
Subject(s)
Clone Cells/metabolism , Computational Biology/methods , Models, Statistical , Neoplasms/genetics , Single-Cell Analysis , Alleles , Animals , Cluster Analysis , Computer Simulation , Disease Models, Animal , Female , Genotype , Heterografts , High-Throughput Nucleotide Sequencing , Humans , Mice , Mutation , Neoplasms/pathology , Reproducibility of Results , Sequence Analysis, DNA , Single-Cell Analysis/methods , Triple Negative Breast Neoplasms/genetics , Triple Negative Breast Neoplasms/pathology , Workflow
ABSTRACT
Somatic evolution of malignant cells produces tumors composed of multiple clonal populations, distinguished in part by rearrangements and copy number changes affecting chromosomal segments. Whole genome sequencing mixes the signals of sampled populations, diluting the signals of clone-specific aberrations, and complicating estimation of clone-specific genotypes. We introduce ReMixT, a method to unmix tumor and contaminating normal signals and jointly predict mixture proportions, clone-specific segment copy number, and clone specificity of breakpoints. ReMixT is free, open-source software and is available at http://bitbucket.org/dranew/remixt.